Genes of an otitis media isolate of nontypeable Haemophilus influenzae

Information

  • Patent Grant
  • 9034642
  • Patent Number
    9,034,642
  • Date Filed
    Thursday, January 9, 2014
  • Date Issued
    Tuesday, May 19, 2015
Abstract
The invention relates to the polynucleotide sequence of a nontypeable strain of Haemophilus influenzae (NTHi) and polypeptides encoded by the polynucleotides and uses thereof. The invention also relates to NTHi genes which are upregulated during or in response to NTHi infection of the middle ear and/or the nasopharynx.
Description
FIELD OF INVENTION

The invention relates to the polynucleotide sequence of a nontypeable strain of Haemophilus influenzae (NTHi) genome, NTHi genes contained within the genome and polypeptides encoded by the polynucleotides. The invention also relates to uses of these NTHi polynucleotides and NTHi polypeptides including vaccines and methods of treating and preventing NTHi related disorders. The invention also relates to NTHi genes which are upregulated during or in response to NTHi infection of the middle ear or nasopharynx.


BACKGROUND

Otitis media (OM) is a highly prevalent pediatric disease worldwide and is the primary cause for emergency room visits by children (Infante-Rivard and Fernandez, Epidemiol. Rev., 15: 444-465, 1993). Recent statistics indicate that 24.5 million physician office visits were made for OM in 1990, representing a greater than 200% increase over those reported in the 1980s. While rarely associated with mortality any longer, the morbidity associated with OM is significant. Hearing loss is a common problem associated with this disease, oftentimes affecting a child's behavior, education and development of language skills (Baldwin, Am. J. Otol., 14: 601-604, 1993; Hunter et al., Ann. Otol. Rhinol. Laryngol. Suppl., 163: 59-61, 1994; Teele et al., J. Infect. Dis., 162: 685-694, 1990). The socioeconomic impact of OM is also great, with direct and indirect costs of diagnosing and managing OM exceeding $5 billion annually in the U.S. alone (Kaplan et al., Pediatr. Infect. Dis. 16: S9-11, 1997).


Whereas antibiotic therapy is common and the surgical placement of tympanostomy tubes has been successful in terms of draining effusions, clearing infection and relieving pain associated with the accumulation of fluids in the middle ear, the emergence of multiple antibiotic-resistant bacteria and the invasive nature of tube placement have highlighted the need for more effective and accepted approaches to the management and, preferably, the prevention of OM. Surgical management of chronic OM involves the insertion of tympanostomy tubes through the tympanic membrane while a child is under general anesthesia. While this procedure is commonplace (prevalence rates are ~13%; Bright et al., Am. J. Public Health, 83(7): 1026-8, 1993) and is highly effective in terms of relieving painful symptoms by draining the middle ear of accumulated fluids, it too has met with criticism due to the invasive nature of the procedure and its incumbent risks (Berman et al., Pediatrics, 93(3):353-63, 1994; Bright et al., supra; Cimons, ASM News, 60: 527-528; Paap, Ann. Pharmacother., 30(11): 1291-7, 1996).


Progress in vaccine development is most advanced for Streptococcus pneumoniae, the primary causative agent of acute OM (AOM), as evidenced by the recent approval and release of a seven-valent capsular-conjugate vaccine, PREVNAR® (Eskola and Kilpi, Pediatr. Infect. Dis. J. 16: S72-78, 2000). While PREVNAR® has been highly efficacious for invasive pneumococcal disease, coverage for OM has been disappointing (6-8%) with reports of an increased number of OM cases due to serotypes not included in the vaccine (Black et al., Pediatr. Infect. Dis. J., 19: 187-195; Eskola et al., Pediatr. Infect. Dis. J., 19: S72-78, 2000; Eskola et al., N. Engl. J. Med. 344: 403-409, 2001; Snow et al., Otol. Neurotol., 23: 1-2, 2002). Less progress has been made for non-typeable Haemophilus influenzae (NTHi), the gram-negative pathogen that predominates in chronic OM with effusion (Klein, Pediatr. Infect. Dis. J., 16: S5-8, 1997; Spinola et al., J. Infect. Dis., 154: 100-109, 1986). Hampering development of effective vaccines against NTHi is the currently incomplete understanding of the pathogenesis of NTHi-induced middle ear disease. Contributing to this delay is a lack of understanding of the dynamic interplay between microbe-expressed virulence factors and the host's immune response as the disease progresses from one of host immunological tolerance of a benign nasopharyngeal commensal, to that of an active defensive reaction to an opportunistic invader of the normally sterile middle ear space.


Although strain Rd is the exemplar organism for the current small genome sequencing rationale, and an important model organism for studying H. influenzae biology, strain Rd is a poor model for the study of pathogenicity caused by members of the genus Haemophilus. Serotype b strains of H. influenzae cause invasive diseases, for example meningitis, and nontypeable H. influenzae (NTHi) strains principally have a role in localized respiratory disease, particularly in otitis media (OM), acute sinusitis and community acquired pneumonia, and have important consequences in patients with chronic obstructive pulmonary disease or cystic fibrosis (Kilpi et al., Pediatr Infect Dis J 20:654-62, 2001; Murphy, Curr Opin Infect Dis 16:129-34, 2003; Roman et al., J Clin Microbiol 42:1450-9, 2004; Sethi, Clin Microbiol Rev 14:336-63, 2001; St Geme, Vaccine 19 Suppl 1:S41-50, 2000). Strain Rd, however, is a derivative of a serotype d strain. Serotype d strains are rarely associated with disease (Daines et al., J Med Microbiol 52:277-82, 2003; Heath et al., Pediatr Infect Dis J 20:300-5, 2001; Rodriguez et al., Infect Immun 71:1635-42, 2003; Skoczynska et al., J Clin Microbiol 43:938-41, 2005). Because one of the most useful sets of data in the study of an organism's biology is its genomic sequence, a number of investigations have identified and characterized genes found in H. influenzae type b strains, H. influenzae Biogroup Aegyptius strains or in nontypeable strains that are not present in strain Rd (Bergman et al., Infect Immun 71:1098-108, 2003; Chang et al., Infect Immun 68:2630-7, 2000; Erdos et al., Int J Pediatr Otorhinolaryngol 67:749-55, 2003; Li et al., Mol Microbiol 47:1101-11, 2003; McMichael & Green, Curr Opin Investig Drugs 4:953-8, 2003; Pomposiello & Demple, 2001; Smoot et al., Infect Immun 70:2694-9, 2002).


Currently there is a poor understanding of how NTHi causes OM in children. The identification of putative virulence factors necessary for induction of OM will contribute significantly to the understanding of the host-pathogen interaction and, ultimately, the identification of potential vaccine candidates and targets of chemotherapy. There is a tremendous need to develop more effective and accepted approaches to the management and, preferably, the prevention of otitis media. Vaccine development is a very promising and cost-effective method to accomplish this goal (Giebink, Pediatr. Infect. Dis. J., 13(11): 1064-8, 1994; Karma et al., Int. J. Pediatr. Otorhinolaryngol., 32(Suppl.): S127-34, 1995).


SUMMARY OF INVENTION

The present invention provides for the identification and characterization of the genomic sequence of NTHi H. influenzae strain 86-028NP and the polypeptide sequences encoded thereby. The 3-fold analysis of the NTHi genomic sequence is set out in a series of contig sequences denoted as SEQ ID NOS: 1-576, and the subsequent 8-fold analysis of the genomic sequence is set out in a series of 11 contig sequences denoted as SEQ ID NOS: 675-685. These contigs are raw data and one of skill in the art may assemble these contigs by comparing overlapping sequences to construct the complete genome of the NTHi strain 86-028NP using routine methods.
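As an illustration only of what "comparing overlapping sequences" can mean in practice, the following minimal Python sketch joins two contigs when the end of one matches the start of the other. The function names, the `min_overlap` threshold and the toy sequences are assumptions made for this example; they are not taken from the specification, and routine assembly software additionally handles sequencing errors, reverse-complement orientation and repeats.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is also a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_contigs(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two contigs across an exact suffix/prefix overlap, or return None."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep the shared bases once: all of `left` plus the non-overlapping tail of `right`.
    return left + right[size:]


if __name__ == "__main__":
    # Hypothetical toy contigs sharing the four bases "GTAC".
    print(merge_contigs("ATGCGTAC", "GTACCTTA"))
```

Exact matching is used here to keep the sketch short; a practical assembler would tolerate mismatches within the overlap.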


The complete and annotated genome of NTHi strain 86-028NP is set out as SEQ ID NO: 772. The open reading frames are set out in Table 6 as SEQ ID NOS: 773-2593. The amino acid sequences of the resulting gene products are set out as SEQ ID NOS: 2594-4414. The genome is approximately 1.91 Mb in size, slightly larger than the strain Rd genome. A number of regions of gross genome rearrangement relative to the strain Rd genome have been identified as well as a number of genes unique to strain 86-028NP.


The present invention also provides for antibodies specific for the NTHi polypeptides of the invention. Methods of detecting NTHi bacteria in a human or in a sample, such as serum, sputum, ear fluid, blood, urine, lymphatic fluid and cerebrospinal fluid, are contemplated. These methods include detecting NTHi polynucleotides with specific polynucleotide probes or detecting NTHi polypeptides with specific antibodies. The invention also contemplates diagnostic kits which utilize these methods of detecting NTHi bacteria.


The present invention also contemplates methods of eliciting an immune response by administering a NTHi polypeptide of the invention or a NTHi peptide thereof. These methods include administering the NTHi polypeptide or NTHi peptide as a vaccine for treatment and/or prevention of diseases caused by NTHi infection, such as OM. The following NTHi genes are upregulated during or in response to middle ear and/or nasopharynx infections; and the polypeptides encoded by these genes and peptides thereof are contemplated as possible OM vaccine candidates and/or targets of chemotherapy: hisB, lppB, sapA, lolA, rbsC, purE, ribB, arcB, uxuA, dsbB, ureH, licC, HI1647, ispZ, radC, mukF, glpR, ihfB, argR, cspD, HI0094, HI1163, HI1063, HI0665, HI1292, HI1064. NTHi hisB gene is set out as nucleotide sequence SEQ ID NO: 615 and encodes the amino acid sequence set out as SEQ ID NO: 616. NTHi sapA gene is set out as nucleotide sequence SEQ ID NO: 617 and encodes the amino acid sequence set out as SEQ ID NO: 618. NTHi rbsC gene is set out as nucleotide sequence SEQ ID NO: 619 and encodes the amino acid sequence set out as SEQ ID NO: 620. NTHi purE gene is set out as nucleotide sequence SEQ ID NO: 621 and encodes the amino acid sequence set out as SEQ ID NO: 622. NTHi ribB gene is set out as nucleotide sequence SEQ ID NO: 623 and encodes the amino acid sequence set out as SEQ ID NO: 624. NTHi arcB gene is set out as nucleotide sequence SEQ ID NO: 625 and encodes the amino acid sequence set out as SEQ ID NO: 626. NTHi uxuA gene is set out as nucleotide sequence SEQ ID NO: 627 and encodes the amino acid sequence set out as SEQ ID NO: 628. NTHi dsbB gene is set out as nucleotide sequence SEQ ID NO: 629 and encodes the amino acid sequence set out as SEQ ID NO: 630. NTHi ureH gene is set out as nucleotide sequence SEQ ID NO: 631 and encodes the amino acid sequence set out as SEQ ID NO: 632. NTHi licC gene is set out as nucleotide sequence SEQ ID NO: 633 and encodes the amino acid sequence set out as SEQ ID NO: 634. NTHi HI1647 gene is set out as nucleotide sequence SEQ ID NO: 635 and encodes the amino acid sequence set out as SEQ ID NO: 636. NTHi ispZ gene is set out as nucleotide sequence SEQ ID NO: 637 and encodes the amino acid sequence set out as SEQ ID NO: 638. NTHi radC gene is set out as nucleotide sequence SEQ ID NO: 639 and encodes the amino acid sequence set out as SEQ ID NO: 640. NTHi mukF gene is set out as nucleotide sequence SEQ ID NO: 641 and encodes the amino acid sequence set out as SEQ ID NO: 642. NTHi glpR gene is set out as nucleotide sequence SEQ ID NO: 643 and encodes the amino acid sequence set out as SEQ ID NO: 644. NTHi ihfB gene is set out as nucleotide sequence SEQ ID NO: 645 and encodes the amino acid sequence set out as SEQ ID NO: 646. NTHi argR gene is set out as nucleotide sequence SEQ ID NO: 647 and encodes the amino acid sequence set out as SEQ ID NO: 648. NTHi cspD gene is set out as nucleotide sequence SEQ ID NO: 649 and encodes the amino acid sequence set out as SEQ ID NO: 650. NTHi HI1163 gene is set out as nucleotide sequence SEQ ID NO: 651 and encodes the amino acid sequence set out as SEQ ID NO: 652. NTHi HI1063 gene is set out as nucleotide sequence SEQ ID NO: 653 and encodes the amino acid sequence set out as SEQ ID NO: 654. NTHi HI0665 gene is set out as nucleotide sequence SEQ ID NO: 655 and encodes the amino acid sequence set out as SEQ ID NO: 656. NTHi HI1292 gene is set out as nucleotide sequence SEQ ID NO: 657 and encodes the amino acid sequence set out as SEQ ID NO: 658.


The novel NTHi genes included in the polynucleotide sequences presented as SEQ ID NOS: 1-576, SEQ ID NOS: 675-685 and the nucleotide sequences set out in Tables 4 and 4B are also up-regulated during infection of the middle ear and/or the nasopharynx, and therefore are contemplated to encode OM vaccine candidates and/or targets of chemotherapy. In addition, the following NTHi genes are contemplated to be virulence-associated genes and therefore are contemplated to encode possible OM vaccine candidates and/or targets of chemotherapy: HI1386, HI1462, HI1369, lav, HI1598. NTHi HI1386 gene sequence is set out as SEQ ID NO: 659 and encodes the amino acid sequence set out as SEQ ID NO: 660. NTHi HI1462 gene sequence is set out as SEQ ID NO: 661 and encodes the amino acid sequence set out as SEQ ID NO: 662. NTHi HI1369 gene sequence is set out as SEQ ID NO: 665 and encodes the amino acid sequence set out as SEQ ID NO: 666. NTHi lav gene sequence is set out as SEQ ID NO: 663 and encodes the amino acid sequence set out as SEQ ID NO: 664. NTHi HI1598 gene sequence is set out as SEQ ID NO: 669 and SEQ ID NO: 671 and encodes the amino acid sequence set out as SEQ ID NO: 670 and SEQ ID NO: 672. Additional NTHi genes associated with virulence include the polynucleotide sequences presented as SEQ ID NO: 667 and SEQ ID NO: 673.


As a method of treating or preventing NTHi infection, the present invention contemplates administering a molecule that inhibits expression or the activity of the NTHi polypeptides, which are upregulated or active during infection. In particular, the invention contemplates methods of treating or preventing NTHi infection comprising modulating NTHi protein expression by administering an antisense oligonucleotide that specifically binds to NTHi genes that are upregulated during NTHi infections; such genes include hisB, lppB, sapA, lolA, rbsC, purE, ribB, arcB, uxuA, dsbB, ureH, licC, HI1647, ispZ, radC, mukF, glpR, ihfB, argR, cspD, HI0094, HI1163, HI1063, HI0665, HI1292, HI1064. The invention also contemplates methods of treating or preventing NTHi infection comprising administering antibodies or small molecules that modulate the activity of the proteins encoded by these genes. The novel NTHi genes included in the polynucleotide sequences presented as SEQ ID NOS: 1-576, SEQ ID NOS: 675-685 and the nucleotide sequences set out in Tables 4 and 4B are also up-regulated during infection of the middle ear and/or the nasopharynx and therefore antisense oligonucleotides that specifically bind these polynucleotide sequences are also contemplated.


Polynucleotides and Polypeptides of the Invention


The present invention provides for the sequences of the NTHi strain 86-028NP genome. The 3-fold analysis of the genomic sequence is presented as a series of contig sequences denoted herein as “contigs 1-576”. Each contig is assigned a sequence identification number that correlates with its “contig number”. Therefore, the contigs of the present invention are set out as SEQ ID NOS: 1-576. These contig polynucleotide sequences may be assembled into the complete genome sequence of the NTHi strain 86-028NP using routine methods. Upon completion of 8-fold sequence analysis of the NTHi strain 86-028NP genome, the genomic sequence was assembled into 11 contigs which are denoted herein as SEQ ID NOS: 675-685. Finally, the complete genome is presented as one nucleic acid sequence denoted herein as SEQ ID NO: 772.


The present invention provides for the NTHi polynucleotide sequences and open reading frames contained within the contigs of SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5. The present invention also provides for the polypeptide sequences encoded by the NTHi polynucleotides of the present invention such as the amino acid sequences set out in SEQ ID NOS: 2594-4414, Table 3B, Table 4B and Table 5. The invention provides for polynucleotides that hybridize under stringent conditions to (a) the complement of the nucleotide sequences of SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593 and the nucleotide sequences set out in Table 3B, Table 4B and Table 5 herein; (b) a polynucleotide which is an allelic variant of any polynucleotides recited above; (c) a polynucleotide which encodes a species homolog of any of the proteins recited above; or (d) a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the NTHi polypeptides of the present invention.


The NTHi polynucleotides of the invention also include nucleotide sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides according to the invention can have, e.g., at least 65%, at least 70%, at least 75%, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more typically at least 90%, 91%, 92%, 93%, or 94% and even more typically at least 95%, 96%, 97%, 98% or 99% sequence identity to the NTHi polynucleotides recited above.
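To make the percent identity thresholds above concrete, the following Python sketch computes identity between two sequences that have already been aligned. It is an illustration under stated assumptions, not a method prescribed by the specification: the function name is hypothetical, gaps are written as "-", and identity is taken as matching columns divided by all alignment columns, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: a column containing a gap ("-") counts as a mismatch,
    and columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical 10-base alignment with a single mismatch.
    identity = percent_identity("ATGCATGCAT", "ATGCATGGAT")
    print(f"{identity:.1f}% identity; meets 90% threshold: {identity >= 90.0}")
```

The programs named elsewhere in this description (GAP, BLASTN, FASTA) produce the alignment itself and may report identity under a different denominator.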


Included within the scope of the nucleic acid sequences of the invention are nucleic acid sequence fragments that hybridize under stringent conditions to the NTHi nucleotide sequences of SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5 herein, or complements thereof, which fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can differentiate NTHi polynucleotide sequences of the invention from other polynucleotide sequences in the same family of genes or can differentiate NTHi genes from other bacterial genes, and are preferably based on unique nucleotide sequences.


The term “stringent” is used to refer to conditions that are commonly understood in the art as stringent. Hybridization stringency is principally determined by temperature, ionic strength, and the concentration of denaturing agents such as formamide. Examples of stringent conditions for hybridization and washing are 0.015 M sodium chloride, 0.0015 M sodium citrate at 65-68° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 50% formamide at 42° C. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, (Cold Spring Harbor, N.Y. 1989). More stringent conditions (such as higher temperature, lower ionic strength, higher formamide, or other denaturing agent) may also be used; however, the rate of hybridization will be affected. In instances wherein hybridization of deoxyoligonucleotides is concerned, additional exemplary stringent hybridization conditions include washing in 6×SSC, 0.05% sodium pyrophosphate at 37° C. (for 14-base oligos), 48° C. (for 17-base oligos), 55° C. (for 20-base oligos), and 60° C. (for 23-base oligos).
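The exemplary conditions above can be captured in a small lookup, sketched here in Python for illustration. The wash temperatures are those listed in the preceding paragraph; the conversion to an SSC dilution assumes the conventional definition of 1×SSC as 0.15 M sodium chloride and 0.015 M sodium citrate, which is not stated in this description. All names are hypothetical.

```python
from typing import Optional

# Exemplary oligonucleotide wash temperatures (degrees C) in 6xSSC, 0.05% sodium
# pyrophosphate, keyed by oligo length in bases, as listed in the text above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}

# Assumption: conventional 1xSSC composition (not stated in the specification).
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_fold(nacl_molar: float, citrate_molar: float) -> float:
    """Express a sodium chloride / sodium citrate buffer as a multiple of 1xSSC."""
    nacl_fold = nacl_molar / SSC_1X_NACL_M
    citrate_fold = citrate_molar / SSC_1X_CITRATE_M
    if abs(nacl_fold - citrate_fold) > 1e-9:
        raise ValueError("salt concentrations are not in the SSC ratio")
    return nacl_fold


def oligo_wash_temperature(length: int) -> Optional[int]:
    """Return the listed wash temperature for an oligo length, or None if not listed."""
    return OLIGO_WASH_TEMP_C.get(length)


if __name__ == "__main__":
    # 0.015 M NaCl with 0.0015 M sodium citrate corresponds to about 0.1xSSC.
    print(round(ssc_fold(0.015, 0.0015), 3))
    print(oligo_wash_temperature(17))
```

Lengths not listed in the text deliberately return `None` rather than an interpolated value, since the description gives no rule for intermediate lengths.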


Other agents may be included in the hybridization and washing buffers for the purpose of reducing non-specific and/or background hybridization. Examples are 0.1% bovine serum albumin, 0.1% polyvinyl-pyrrolidone, 0.1% sodium pyrophosphate, 0.1% sodium dodecyl sulfate (NaDodSO4 or SDS), ficoll, Denhardt's solution, sonicated salmon sperm DNA (or other non-complementary DNA), and dextran sulfate, although other suitable agents can also be used. The concentration and types of these additives can be changed without substantially affecting the stringency of the hybridization conditions. Hybridization experiments are usually carried out at pH 6.8-7.4; however, at typical ionic strength conditions, the rate of hybridization is nearly independent of pH. See Anderson et al., Nucleic Acid Hybridisation: A Practical Approach, Ch. 4, IRL Press Limited (Oxford, England). Hybridization conditions can be adjusted by one skilled in the art in order to accommodate these variables and allow DNAs of different sequence relatedness to form hybrids.


The sequences falling within the scope of the present invention are not limited to these specific sequences, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the sequences provided in SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5 herein, preferably the open reading frames therein, a representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95% identical, to the open reading frames within SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5 with a sequence from another isolate of the same species. Preferred computer program methods to determine identity and similarity between two sequences include, but are not limited to, the GCG program package, including GAP (Devereux et al., Nucl. Acid. Res., 12: 387, 1984; Genetics Computer Group, University of Wisconsin, Madison, Wis.), BLASTP, BLASTN, and FASTA (Altschul et al., J. Mol. Biol., 215: 403-410, 1990). The BLASTX program is publicly available from the National Center for Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul et al. NCBI/NLM/NIH Bethesda, Md. 20894; Altschul et al., supra). The well known Smith-Waterman algorithm may also be used to determine identity.
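For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the following is a minimal Python sketch of its scoring recurrence for local alignment. The scoring values (match +2, mismatch -1, linear gap -2) and the function name are assumptions chosen for the example; the specification does not fix any parameters, and production tools typically use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between `a` and `b` (linear gap penalty).

    Only two rows of the dynamic-programming matrix are kept, so the function
    returns the score but not the alignment itself.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                               # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Hypothetical sequences sharing the local segment "ACGT".
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Recovering the aligned segment, and from it an identity figure, would require keeping the full matrix and tracing back from the highest-scoring cell.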


Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific open reading frames (ORF) disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another codon that encodes the same amino acid is expressly contemplated.
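The point about codon variability can be illustrated with a short Python sketch that translates two different nucleotide sequences and checks that they encode the same amino acids. The example uses the standard genetic code; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids ("*" marks a stop)."""
    sequence = coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3))


def encode_same_protein(sequence_a: str, sequence_b: str) -> bool:
    """True when two coding sequences translate to the same amino acid sequence."""
    return translate(sequence_a) == translate(sequence_b)


if __name__ == "__main__":
    # GCT and GCC both encode alanine; AAA and AAG both encode lysine.
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(encode_same_protein("ATGGCTAAA", "ATGGCCAAG"))
```

Two sequences that pass this check differ at the nucleotide level but encode an identical polypeptide.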


The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising: the amino acid sequences encoded by the nucleotide sequences included within the polynucleotide sequences set out as SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5, or the corresponding full length or mature protein. The polypeptides of the invention include the amino acid sequences of SEQ ID NO: 616, SEQ ID NO: 618, SEQ ID NO: 620, SEQ ID NO: 622, SEQ ID NO: 624, SEQ ID NO: 626, SEQ ID NO: 628, SEQ ID NO: 630, SEQ ID NO: 632, SEQ ID NO: 634, SEQ ID NO: 636, SEQ ID NO: 638, SEQ ID NO: 640, SEQ ID NO: 642, SEQ ID NO: 644, SEQ ID NO: 646, SEQ ID NO: 648, SEQ ID NO: 650, SEQ ID NO: 652, SEQ ID NO: 654, SEQ ID NO: 656, SEQ ID NO: 658, SEQ ID NO: 660, SEQ ID NO: 662, SEQ ID NO: 664, SEQ ID NO: 666, SEQ ID NO: 668, SEQ ID NO: 670, SEQ ID NO: 672, SEQ ID NO: 674, SEQ ID NO: 687, SEQ ID NO: 689, SEQ ID NO: 691, SEQ ID NO: 693, SEQ ID NO: 695, SEQ ID NO: 697, SEQ ID NO: 699, SEQ ID NO: 701, SEQ ID NO: 703, SEQ ID NO: 705, SEQ ID NO: 707, SEQ ID NO: 709, SEQ ID NO: 711, SEQ ID NO: 713, SEQ ID NO: 715, SEQ ID NO: 717, SEQ ID NO: 719, SEQ ID NO: 721, SEQ ID NO: 723, SEQ ID NO: 725, SEQ ID NO: 727, SEQ ID NO: 729, SEQ ID NO: 731, SEQ ID NO: 733, SEQ ID NO: 735, SEQ ID NO: 737, SEQ ID NO: 739, SEQ ID NO: 741, SEQ ID NO: 743, SEQ ID NO: 745, SEQ ID NO: 747, SEQ ID NO: 749, SEQ ID NO: 751, SEQ ID NO: 753, SEQ ID NO: 755, SEQ ID NO: 757, SEQ ID NO: 759, SEQ ID NO: 761, SEQ ID NO: 763, SEQ ID NO: 765, SEQ ID NO: 767, SEQ ID NO: 769 or SEQ ID NO: 771, SEQ ID NOS: 2594-4414 which are set out in Table 3B, Table 4B and Table 5 herein.


Polypeptides of the invention also include polypeptides preferably with biological or immunogenic activity that are encoded by: (a) an open reading frame contained within the nucleotide sequences set forth as SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5, or (b) polynucleotides that hybridize to the complement of the polynucleotides of (a) under stringent hybridization conditions.


The invention also provides biologically active or immunologically active variants of the amino acid sequences of the present invention; and “substantial equivalents” thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about 99% amino acid identity) that retain biological and/or immunogenic activity. Polypeptides encoded by allelic variants may have a similar, increased, or decreased activity compared to polypeptides encoded by the polynucleotides included within the nucleotide sequences presented in SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences encoding the amino acid sequences set out as SEQ ID NOS: 2594-4414 and in Table 3B, Table 4B and Table 5 herein, and the polypeptides having an amino acid sequence set out in SEQ ID NOS: 2594-4414, Table 3B, Table 4B and Table 5 herein.


NTHi peptides refer to fragments of the NTHi polypeptides encoded by the nucleotide sequences presented in SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, or the nucleotide sequences set out in Table 3B, Table 4B and Table 5 herein, and the polypeptides having the amino acid sequences set out in SEQ ID NOS: 2594-4414, Table 3B, Table 4B and Table 5 herein. The preferred NTHi peptides are biologically and/or immunologically active.


The present invention further provides isolated NTHi polypeptides or NTHi peptides encoded by the NTHi nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. The term “degenerate variant” refers to nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical NTHi polypeptide sequence. Preferred nucleic acid fragments of the present invention are the ORFs that encode proteins.


The invention also provides for NTHi polypeptides with one or more conservative amino acid substitutions that do not affect the biological and/or immunogenic activity of the polypeptide. Alternatively, the NTHi polypeptides of the invention are contemplated to have conservative amino acid substitutions which may or may not alter biological activity. The term “conservative amino acid substitution” refers to a substitution of a native amino acid residue with a nonnative residue, including naturally occurring and nonnaturally occurring amino acids, such that there is little or no effect on the polarity or charge of the amino acid residue at that position. For example, a conservative substitution results from the replacement of a non-polar residue in a polypeptide with any other non-polar residue. Further, any native residue in the polypeptide may also be substituted with alanine, according to the methods of “alanine scanning mutagenesis”. Naturally occurring amino acids are characterized based on their side chains as follows: basic: arginine, lysine, histidine; acidic: glutamic acid, aspartic acid; uncharged polar: glutamine, asparagine, serine, threonine, tyrosine; and non-polar: phenylalanine, tryptophan, cysteine, glycine, alanine, valine, proline, methionine, leucine, norleucine, isoleucine. General rules for amino acid substitutions are set forth in Table 1 below.









TABLE 1
Amino Acid Substitutions

Original Residues    Exemplary Substitutions        Preferred Substitutions
Ala                  Val, Leu, Ile                  Val
Arg                  Lys, Gln, Asn                  Lys
Asn                  Gln                            Gln
Asp                  Glu                            Glu
Cys                  Ser, Ala                       Ser
Gln                  Asn                            Asn
Glu                  Asp                            Asn
Gly                  Pro, Ala                       Ala
His                  Asn, Gln, Lys, Arg             Arg
Ile                  Leu, Val, Met, Ala, Phe,       Leu
Leu                  Norleucine, Ile, Val, Met,     Leu
Lys                  Arg, 1,4 Diaminobutyric        Arg
Met                  Leu, Phe, Ile                  Leu
Phe                  Leu, Val, Ile, Ala, Tyr        Arg
Pro                  Ala                            Gly
Ser                  Thr, Ala, Cys                  Thr
Thr                  Ser                            Ser
Trp                  Tyr, Phe                       Tyr
Tyr                  Trp, Phe, Thr, Ser             Phe
Val                  Ile, Met, Leu, Phe, Ala,       Leu
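The side-chain classes described before Table 1 can be expressed as a simple membership check, sketched below in Python for illustration. The rule implemented here (a substitution is treated as conservative when both residues fall in the same class) follows the non-polar example given in the text; it is an assumption that this generalizes to every class, and it is narrower than Table 1, which also lists some substitutions across classes (for example Cys to Ser). The names and the abbreviation "Nle" for norleucine are choices made for this example.

```python
# Side-chain classes as listed in the description (three-letter codes;
# "Nle" is used here for norleucine).
SIDE_CHAIN_CLASSES = {
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Glu", "Asp"},
    "uncharged polar": {"Gln", "Asn", "Ser", "Thr", "Tyr"},
    "non-polar": {"Phe", "Trp", "Cys", "Gly", "Ala", "Val", "Pro",
                  "Met", "Leu", "Nle", "Ile"},
}


def side_chain_class(residue: str) -> str:
    """Return the name of the side-chain class containing `residue`."""
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return class_name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original: str, replacement: str) -> bool:
    """Assumed rule: conservative when both residues share a side-chain class."""
    return side_chain_class(original) == side_chain_class(replacement)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both non-polar
    print(is_conservative("Asp", "Lys"))  # acidic versus basic
```

A fuller treatment would consult the exemplary and preferred substitutions of Table 1 directly rather than class membership alone.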









Antisense polynucleotides complementary to the polynucleotides encoding the NTHi polypeptides are also provided.
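As a minimal illustration of complementarity, the following Python sketch derives the reverse complement of a DNA sequence, which is the sequence an antisense oligonucleotide would need in order to base-pair with the corresponding sense strand. The function name and the example sequence are hypothetical; design of a practical antisense molecule involves further considerations (length, chemistry, target accessibility) not addressed here.

```python
# Translation table for Watson-Crick complements of DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCC"  # hypothetical sense-strand fragment
    print(reverse_complement(sense))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.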


The invention contemplates that polynucleotides of the invention may be inserted in a vector for amplification or expression. For expression, the polynucleotides are operatively linked to appropriate expression control sequences such as a promoter and polyadenylation signal sequences. Further provided are cells comprising polynucleotides of the invention. Exemplary prokaryotic hosts include bacteria such as E. coli, Bacillus, Streptomyces, Pseudomonas, Salmonella and Serratia.


The term “isolated” refers to a substance removed from, and essentially free of, the other components of the environment in which it naturally exists. For example, a polypeptide is separated from other cellular proteins or a DNA is separated from other DNA flanking it in a genome in which it naturally occurs.


Antibodies and Methods for Eliciting an Immune Response


The invention provides antibodies which bind to antigenic epitopes unique to (i.e., are specific for) NTHi polypeptides. Also provided are antibodies which bind to antigenic epitopes common among multiple H. influenzae subtypes but unique with respect to any other antigenic epitopes. The antibodies may be polyclonal antibodies, monoclonal antibodies, antibody fragments which retain their ability to bind their unique epitope (e.g., Fv, Fab and F(ab)2 fragments), single chain antibodies and human or humanized antibodies. Antibodies may be generated by techniques standard in the art.


It is known in the art that antigens that mimic the capsular polysaccharide of H. influenzae will generate antibodies that exhibit the ability to kill bacteria in in vitro assays. These antibodies are also known to protect against challenge with H. influenzae in animal model systems. These studies indicate that antibodies to the capsular polysaccharides are likely to elicit a protective immune response in humans. The present invention provides for antibodies specific for the NTHi polypeptides of the present invention and fragments thereof, which exhibit the ability both to kill H. influenzae bacteria and to protect humans from NTHi infection. The present invention also provides for antibodies specific for the NTHi polypeptides of the invention which reduce the virulence, inhibit adherence, inhibit cell division, and/or inhibit penetration into the epithelium of H. influenzae bacteria or enhance phagocytosis of the H. influenzae bacteria.


In vitro complement mediated bactericidal assay systems (Musher et al., Infect. Immun. 39: 297-304, 1983; Anderson et al., J. Clin. Invest. 51: 31-38, 1972) may be used to measure the bactericidal activity of anti-NTHi antibodies. Further data on the ability of NTHi polypeptides and NTHi peptides to elicit a protective antibody response may be generated by using animal models of infection such as the chinchilla model system described herein.


It is also possible to confer short-term protection to a host by passive immunotherapy via the administration of pre-formed antibody against an epitope of NTHi, such as antibodies against NTHi OMP, LOS or noncapsular proteins. Thus, the contemplated vaccine formulations can be used to produce antibodies for use in passive immunotherapy. Human immunoglobulin is preferred in human medicine because a heterologous immunoglobulin may provoke an immune response to its foreign immunogenic components. Such passive immunization could be used on an emergency basis for immediate protection of unimmunized individuals exposed to special risks. Alternatively, these antibodies can be used in the production of anti-idiotypic antibody, which in turn can be used as an antigen to stimulate an immune response against NTHi epitopes.


The invention contemplates methods of eliciting an immune response to NTHi in an individual. These methods elicit immune responses which include one or more of killing NTHi bacteria, blocking NTHi attachment to cells and/or slowing NTHi replication. An “immunogenic dose” of a composition of the invention is one that generates, after administration, a detectable humoral and/or cellular immune response in comparison to the immune response detectable before administration or in comparison to a standard immune response before administration. The invention contemplates that the immune response resulting from the methods may be protective and/or therapeutic. In one embodiment, the methods comprise a step of administering an immunogenic dose of a composition comprising a NTHi protein or NTHi peptide of the invention. A NTHi protein or an antigenic peptide thereof may be fused with a co-protein which may not by itself produce antibodies, but is capable of stabilizing the first protein and producing a fused protein which will have immunogenic and protective properties. Thus, the fused recombinant protein preferably further comprises an antigenic co-protein, such as Glutathione-S-transferase (GST) or beta-galactosidase, relatively large co-proteins which solubilize the protein and facilitate production and purification thereof. Moreover, the co-protein may act as an adjuvant in the sense of providing a generalized stimulation of the immune system. The co-protein may be attached to either the amino or carboxy terminus of the first protein. Provided by this invention are compositions, particularly vaccine compositions, and methods comprising the NTHi polypeptides encoded by the polynucleotide of the invention or antigenic peptides thereof.


The invention also contemplates methods of eliciting an immune response to multiple H. influenzae subtypes in an individual. These methods elicit immune responses which include one or more of killing the H. influenzae bacteria, blocking H. influenzae attachment to cells and/or slowing H. influenzae replication. These methods comprise a step of administering an immunogenic dose of a composition comprising a NTHi protein or NTHi peptide of the invention which comprises an antigenic epitope that is common among multiple H. influenzae subtypes but unique with respect to any other antigenic epitopes.


In another embodiment, the methods comprise administering an immunogenic dose of a composition comprising a cell expressing a NTHi protein or NTHi peptide of the invention. In yet another embodiment, the methods comprise administering an immunogenic dose of a composition comprising a polynucleotide encoding a NTHi protein or NTHi peptide of the invention. The polynucleotide may be a naked polynucleotide not associated with any other nucleic acid or may be in a vector such as a plasmid or viral vector (e.g., adeno-associated virus vector or adenovirus vector). The compositions of the invention may comprise one or more NTHi proteins or NTHi peptides alone or in combination with other epitopes that elicit an immune response to NTHi bacteria. The compositions of the invention may also comprise one or more NTHi proteins or peptides in combination with epitopes that elicit an immune response to one or more H. influenzae subtypes. Administration of the compositions may be by routes standard in the art, for example, parenteral, intravenous, oral, buccal, nasal, pulmonary, rectal, or vaginal. The methods may be used in combination in a single individual. The methods may be used prior or subsequent to NTHi infection of an individual.


Genes that are up-regulated in NTHi infection of the middle ear and/or the nasopharynx and genes that are associated with NTHi virulence are described herein. The polypeptides and peptides thereof which are encoded by these NTHi genes are contemplated to be useful for eliciting an immune response for treating or preventing disorders associated with NTHi infection, such as OM. Some of the polypeptides encoded by these genes include: histidine biosynthesis protein, lipoprotein B, peptide ABC transporter, periplasmic SapA precursor, outer membrane lipoproteins carrier protein precursor, ribose transport system permease protein, phosphoribosylaminoimidazole carboxylase catalytic subunit, PurE, Phosphoribosylaminoimidazole carboxylase catalytic subunit, ornithine carbamoyltransferase, mannonate dehydratase, disulfide oxidoreductase, urease accessory protein, phosphocholine cytidylyltransferase, putative pyridoxine biosynthesis protein, singlet oxygen resistance protein, intracellular septation protein, DNA repair protein, MukF protein, glycerol-3-phosphate regulon repressor, integration host factor beta subunit, arginine repressor, cold shock like protein, stress response protein, LicA, MukF, RadA and those hypothetical proteins encoded by HI0094, HI1163, HI0665, HI1292, HI1064, HI1386, HI0352 genes. NTHi OMPs, LOS and noncapsular proteins are also contemplated to elicit an immune response for prevention and treatment of disorders associated with NTHi infection.


The invention includes methods of blocking binding of NTHi bacteria to host cells in an individual. The methods comprise administering antibodies or polypeptides of the invention that block binding of NTHi cellular attachment. Alternatively, administration of one or more small molecules that block binding of NTHi cell attachment is contemplated. In vitro assays may be used to demonstrate the ability of an antibody, polypeptide or small molecule of the invention to block NTHi cell attachment.


Pharmaceutical compositions comprising antibodies of the invention, polypeptides of the invention and/or small molecules of the invention that block NTHi cellular attachment are provided. The pharmaceutical compositions may consist of one of the foregoing active ingredients alone, may comprise combinations of the foregoing active ingredients or may comprise additional active ingredients used to treat bacterial infections. The pharmaceutical compositions may comprise one or more additional ingredients such as pharmaceutically effective carriers. Dosage and frequency of the administration of the pharmaceutical compositions are determined by standard techniques and depend, for example, on the weight and age of the individual, the route of administration, and the severity of symptoms. Administration of the pharmaceutical compositions may be by routes standard in the art, for example, parenteral, intravenous, oral, buccal, nasal, pulmonary, rectal, or vaginal.


Also provided by the invention are methods for detecting NTHi infection in an individual. In one embodiment, the methods comprise detecting NTHi polynucleotides of the invention in a sample using primers or probes that specifically bind to the polynucleotides. Detection of the polynucleotide may be accomplished by numerous techniques routine in the art involving, for example, hybridization and PCR.


The antibodies of the present invention may also be used to provide reagents for use in diagnostic assays for the detection of NTHi antigens (NTHi polypeptides and peptides thereof) in various body fluids of individuals suspected of H. influenzae infection. In another embodiment, the NTHi proteins and peptides of the present invention may be used as antigens in immunoassays for the detection of NTHi in various patient tissues and body fluids including, but not limited to: blood, serum, ear fluid, spinal fluid, sputum, urine, lymphatic fluid and cerebrospinal fluid. The antigens of the present invention may be used in any immunoassay system known in the art including, but not limited to: radioimmunoassays, ELISA assays, sandwich assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, fluorescent immunoassays, protein A immunoassays and immunoelectrophoresis assays.


Vaccines and Chemotherapeutic Targets


An aspect of the invention relates to a method for inducing an immunological response in an individual, particularly a mammal which comprises inoculating the individual with a NTHi antigen protein or an antigenic peptide thereof.


The present invention also provides for vaccine formulations which comprise an immunogenic recombinant NTHi protein or NTHi peptide of the invention together with a suitable carrier. The NTHi polypeptides and peptides thereof contemplated as vaccine candidates and/or targets of chemotherapy include, but are not limited to, histidine biosynthesis protein, lipoprotein B, peptide ABC transporter, periplasmic SapA precursor, outer membrane lipoproteins carrier protein precursor, ribose transport system permease protein, phosphoribosylaminoimidazole carboxylase catalytic subunit, PurE, 3,4-dihydroxy-2-butanone 4-phosphate synthase, ornithine carbamoyltransferase, mannonate dehydratase, disulfide oxidoreductase, urease accessory protein, phosphocholine cytidylyltransferase, putative pyridoxine biosynthesis protein, singlet oxygen resistance protein, intracellular septation protein, DNA repair protein, MukF protein, glycerol-3-phosphate regulon repressor, integration host factor beta subunit, arginine repressor, cold shock like protein, stress response protein, LicA, RadA and those hypothetical proteins encoded by HI0094, HI1163, HI0665, HI1292, HI1064, HI1386, HI0352 genes, NTHi OMPs, NTHi LOS and NTHi noncapsular proteins and polypeptides encoded by the novel NTHi polynucleotide sequences present in the nucleotide sequences set out as SEQ ID NOS: 1-576, SEQ ID NOS: 675-685, SEQ ID NOS: 773-2593, and the nucleotide sequences set out in Table 3B, Table 4B and Table 5 herein, and the polypeptides having the amino acid sequences set out in SEQ ID NOS: 2594-4414, Table 3B, Table 4B and Table 5 herein.


Since the protein may be broken down in the stomach, it is preferably administered parenterally, including, for example, administration that is subcutaneous, intramuscular, intravenous, or intradermal. Formulations suitable for parenteral administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the bodily fluid, preferably the blood, of the individual; and aqueous and non-aqueous sterile suspensions which may include suspending agents or thickening agents. The formulations may be presented in unit-dose or multi-dose containers, for example, sealed ampules and vials and may be stored in a freeze-dried condition requiring only the addition of the sterile liquid carrier immediately prior to use. The vaccine formulation may also include adjuvant systems for enhancing the immunogenicity of the formulation, such as oil-in water systems and other systems known in the art. The dosage will depend on the specific activity of the vaccine and can be readily determined by routine experimentation.


A. Peptide Vaccines


Peptide therapeutic agents, such as peptide vaccines, are well known in the art and are of increasing use in the pharmaceutical arts. Consistent drawbacks to the parenteral administration of such peptide compounds have been the rapidity of breakdown or denaturation. Infusion pumps, as well as wax or oil implants, have been employed for chronic administration of therapeutic agents in an effort to both prolong the presence of peptide-like therapeutic agents and preserve the integrity of such agents. Furthermore, the peptide-like agent should (with particular reference to each epitope of the peptide-like agent) ideally maintain native state configuration for an extended period of time and additionally be presented in a fashion suitable for triggering an immunogenic response in the challenged animal or immunized human.


The NTHi antigenic peptides of the invention can be prepared in a number of conventional ways. The short peptide sequences can be prepared by chemical synthesis using standard means. Particularly convenient are solid phase techniques (see, e.g., Erikson et al., The Proteins (1976) v. 2, Academic Press, New York, p. 255). Automated solid phase synthesizers are commercially available. In addition, modifications in the sequence are easily made by substitution, addition or omission of appropriate residues. For example, a cysteine residue may be added at the carboxy terminus to provide a sulfhydryl group for convenient linkage to a carrier protein, or spacer elements, such as an additional glycine residue, may be incorporated into the sequence between the linking amino acid at the C-terminus and the remainder of the peptide. The short NTHi peptides can also be produced by recombinant techniques. The coding sequence for peptides of this length can easily be synthesized by chemical techniques, e.g., the phosphotriester method described in Matteucci et al., J Am Chem Soc., 103: 3185 (1981).


Because some of the NTHi peptide sequences contemplated herein may be considered too small to be immunogenic, they may be linked to carrier substances in order to confer this property upon them. Any method of creating such linkages known in the art may be used. Linkages can be formed with heterobifunctional agents that generate a disulfide link at one functional group end and a peptide link at the other, such as a disulfide amide forming agent, e.g., N-succinimidyl-3-(2-pyridyldithio) propionate (SPDP) (See, e.g., Jansen et al., Immun. Rev. 62:185, 1982) and bifunctional coupling agents that form a thioether rather than a disulfide linkage such as reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid, 2-iodoacetic acid, 4-(N-maleimido-methyl) cyclohexane-1-carboxylic acid and the like, and coupling agents which activate carboxyl groups by combining them with succinimide or 1-hydroxy-2-nitro-4-sulfonic acid, for sodium salt such as succinimidyl 4-(N-maleimido-methyl) cyclohexane-1-carboxylate (SMCC).


B. Vaccine Compositions and Administration


A priming dose of the immunogen followed by one or more booster exposures to the immunogen may be necessary for a vaccine to be effective (Kramp et al., Infect. Immun., 25: 771-773, 1979; Davis et al., Immunology Letters, 14: 341-8, 1986/1987). Examples of proteins or polypeptides that could beneficially enhance the immune response if co-administered include cytokines (e.g., IL-2, IL-12, GM-CSF), cytokine-inducing molecules (e.g., LeIF) or costimulatory molecules. Helper (HTL) epitopes could be joined to intracellular targeting signals and expressed separately from the CTL epitopes. This would allow direction of the HTL epitopes to a cell compartment different than the CTL epitopes. If required, this could facilitate more efficient entry of HTL epitopes into the MHC class II pathway, thereby improving CTL induction. In contrast to CTL induction, specifically decreasing the immune response by co-expression of immunosuppressive molecules (e.g., TGF-β) may be beneficial in certain diseases.


Ideally, an immunogen will exhibit two properties: the capacity to stimulate the formation of the corresponding antibodies and the propensity to react specifically with these antibodies. Immunogens bear one or more epitopes, which are the smallest part of an immunogen recognizable by the combining site of an antibody. In particular instances, immunogens, fractions of immunogens or conditions under which the immunogen is presented are inadequate to precipitate the desired immunological response, resulting in insufficient immunity. This is often the case with peptides or other small molecules used as immunogens. Other substances such as immunomodulators (e.g., cytokines such as the interleukins) may be combined in vaccines as well.


The vaccine art recognizes the use of certain substances called adjuvants to potentiate an immune response when used in conjunction with an immunogen. Adjuvants are further used to elicit an immune response that is faster or greater than would be elicited without the use of the adjuvant. In addition, adjuvants may be used to create an immunological response using less immunogen than would be needed without the inclusion of adjuvant, to increase production of certain antibody subclasses that afford immunological protection or to enhance components of the immune response (e.g., humoral, cellular). Known adjuvants include emulsions such as Freund's Adjuvants and other oil emulsions, Bordetella pertussis, MF59, purified saponin from Quillaja saponaria (QS21), aluminum salts such as hydroxide, phosphate and alum, calcium phosphate (and other metal salts), gels such as aluminum hydroxide salts, mycobacterial products including muramyl dipeptides, solid materials, and particles such as liposomes and virosomes. Examples of natural and bacterial products known to be used as adjuvants include monophosphoryl lipid A (MPL), RC-529 (synthetic MPL-like acylated monosaccharide), OM-174 which is a lipid A derivative from E. coli, holotoxins such as cholera toxin (CT) or one of its derivatives, pertussis toxin (PT) and heat-labile toxin (LT) of E. coli or one of its derivatives, and CpG oligonucleotides. Adjuvant activity can be affected by a number of factors, such as carrier effect, depot formation, altered lymphocyte recirculation, stimulation of T-lymphocytes, direct stimulation of B-lymphocytes and stimulation of macrophages.


Vaccines are typically prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified. The active immunogenic ingredient is often mixed with excipients, which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, e.g., water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, or adjuvants, which enhance the effectiveness of the vaccine. The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1-2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, cellulose, magnesium carbonate and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10%-95% of active ingredient, preferably 25-70%.


Vaccines may also be administered through transdermal routes utilizing jet injectors, microneedles, electroporation, sonoporation, microencapsulation, polymers or liposomes, transmucosal routes and intranasal routes using nebulizers, aerosols and nasal sprays. Microencapsulation using natural or synthetic polymers such as starch, alginate and chitosan, D-poly L-lactate (PLA), D-poly DL-lactic-coglycolic microspheres, polycaprolactones, polyorthoesters, polyanhydrides and polyphosphazenes is useful for both transdermal and transmucosal administration. Polymeric complexes comprising synthetic poly-ornithine, poly-lysine and poly-arginine or amphipathic peptides are useful for transdermal delivery systems. In addition, due to their amphipathic nature, liposomes are contemplated for transdermal, transmucosal and intranasal vaccine delivery systems. Common lipids used for vaccine delivery include N-(1)2,3-(dioleyl-dihydroxypropyl)-N,N,N-trimethylammonium-methyl sulfate (DOTAP), dioleyloxy-propyl-trimethylammonium chloride (DOTMA), dimyristyloxypropyl-3-dimethyl-hydroxyethyl ammonium (DMRIE), dimethyldioctadecyl ammonium bromide (DDAB) and 3β-[N-(N′,N′-dimethylaminoethane)carbamoyl] cholesterol (DC-Chol). The combination of helper lipids and liposomes will enhance up-take of the liposomes through the skin. These helper lipids include dioleoyl phosphatidylethanolamine (DOPE), dilauroylphosphatidylethanolamine (DLPE), dimyristoyl phosphatidylethanolamine (DMPE) and dipalmitoylphosphatidylethanolamine (DPPE). In addition, triterpenoid glycosides or saponins derived from the Chilean soap tree bark (Quillaja saponaria) and chitosan (deacetylated chitin) have been contemplated as useful adjuvants for intranasal and transmucosal vaccine delivery.


The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include the acid addition salts (formed with the free amino groups of the peptide), which are formed with inorganic acids such as, e.g., hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric and mandelic. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, e.g., sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, and procaine.


The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be therapeutically effective and immunogenic. The quantity to be administered depends on the subject to be treated, capacity of the subject's immune system to synthesize antibodies, and the degree of protection desired. Precise amounts of active ingredient required to be administered depend on the judgment of the practitioner and are peculiar to each individual. However, suitable dosage ranges are of the order of several hundred micrograms active ingredient per individual. Suitable regimes for initial administration and booster shots are also variable, but are typified by an initial administration followed in one or three month intervals by a subsequent injection or other administration.


Upon immunization with a vaccine composition as described herein, the immune system of the host responds to the vaccine by producing large amounts of CTLs specific for the desired antigen, and the host becomes at least partially immune to later infection, or resistant to developing chronic infection. Vaccine compositions containing the NTHi polypeptide or NTHi peptides of the invention are administered to a patient susceptible to or otherwise at risk of bacterial infection to elicit an immune response against the antigen and thus enhance the patient's own immune response capabilities. In this use, the precise amounts again depend on the patient's state of health and weight, the mode of administration, the nature of the formulation, etc., but generally range from about 1.0 μg to about 5000 per 70 kilogram patient, more commonly from about 10 to about 500 mg per 70 kg of body weight. For therapeutic or immunization purposes, the NTHi polypeptide or NTHi peptides of the invention can also be expressed by attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus as a vector to express nucleotide sequences that encode the peptides of the invention. Upon introduction into an acutely or chronically infected host or into a noninfected host, the recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits a host CTL response.


Humoral immune response may be measured by many well known methods, such as Single Radial Immunodiffusion Assay (SRID), Enzyme Immunoassay (EIA) and Hemagglutination Inhibition Assay (HAI). In particular, SRID utilizes a layer of a gel, such as agarose, containing the immunogen being tested. A well is cut in the gel and the serum being tested is placed in the well. Diffusion of the antibody out into the gel leads to the formation of a precipitation ring whose area is proportional to the concentration of the antibody in the serum being tested. EIA, also known as ELISA (Enzyme-Linked Immunosorbent Assay), is used to determine total antibodies in the sample. The immunogen is adsorbed to the surface of a microtiter plate. The test serum is exposed to the plate followed by an enzyme-linked immunoglobulin, such as IgG. The enzyme activity adherent to the plate is quantified by any convenient means such as spectrophotometry and is proportional to the concentration of antibody directed against the immunogen present in the test sample. HAI utilizes the capability of an immunogen such as viral proteins to agglutinate chicken red blood cells (or the like). The assay detects neutralizing antibodies, i.e., those antibodies able to inhibit hemagglutination. Dilutions of the test serum are incubated with a standard concentration of immunogen, followed by the addition of the red blood cells. The presence of neutralizing antibodies will inhibit the agglutination of the red blood cells by the immunogen. Tests to measure cellular immune response include determination of delayed-type hypersensitivity or measuring the proliferative response of lymphocytes to target immunogen.
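Because the SRID precipitation ring area is described above as proportional to antibody concentration, a reading from a test serum can in principle be converted to a concentration with a straight-line calibration against reference sera. The following Python fragment is a minimal sketch of that arithmetic only. The function names, the reference concentrations and the ring areas are hypothetical values chosen for illustration; they are not data from this specification, and a working assay would use its own standards and units.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate concentration from ring area."""
    return (area - intercept) / slope


# Hypothetical reference sera: known antibody concentrations (arbitrary units)
# and the ring areas measured for them (arbitrary units).
standards_conc = [10.0, 20.0, 40.0, 80.0]
standards_area = [15.0, 25.0, 45.0, 85.0]

slope, intercept = fit_line(standards_conc, standards_area)

# Estimate the concentration of a test serum whose ring area was 55.0.
unknown_conc = concentration_from_area(55.0, slope, intercept)
```

The intercept term is retained because a measured ring area typically includes the area of the well itself; with real standards the fit would also need replicate measurements and a check that the test reading falls inside the calibrated range.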


Nontypeable Haemophilus influenzae (NTHi)



H. influenzae is a small, nonmotile gram negative bacterium. Unlike other H. influenzae strains, the nontypeable H. influenzae (NTHi) strains lack a polysaccharide capsule and are sometimes denoted as “nonencapsulated.” NTHi strains are genetically distinct from encapsulated strains and are more heterogenous than the type b H. influenzae isolates. NTHi presents a complex array of antigens to the human host. Possible antigens that may elicit protection include OMPs, lipopolysaccharides, lipoproteins, adhesion proteins and noncapsular proteins.


Humans are the only host for H. influenzae. NTHi strains commonly reside in the upper respiratory tract including the nasopharynx and the posterior oropharynx, the lower respiratory tract and the female genital tract. NTHi causes a broad spectrum of diseases in humans, including but not limited to, otitis media, pneumonia, sinusitis, septicemia, endocarditis, epiglottitis, septic arthritis, meningitis, postpartum and neonatal infections, postpartum and neonatal sepsis, acute and chronic salpingitis, pericarditis, cellulitis, osteomyelitis, cholecystitis, intraabdominal infections, urinary tract infection, mastoiditis, aortic graft infection, conjunctivitis, Brazilian purpuric fever, occult bacteremia and exacerbation of underlying lung diseases such as chronic bronchitis, bronchiectasis and cystic fibrosis.


Epidemiologic studies of NTHi have indicated that the strains are heterogeneous with respect to outer membrane protein profiles (Barenkamp et al., Infect. Immun., 36: 535-40, 1982), enzyme allotypes (Musser et al., Infect. Immun., 52: 183-191, 1986), and other commonly used epidemiologic tools. There have been several attempts to subtype NTHi, but none of the methodologies has been totally satisfactory. The outer-membrane protein composition of NTHi consists of approximately 20 proteins. All NTHi strains contain two common OMPs with molecular weights of 30,000 and 16,600 daltons. NTHi strains may be subtyped based on two OMPs within the 32,000-42,000 dalton range. The NTHi liposaccharide profile is fundamentally different than the enteric gram negative bacteria and separates into 1-4 distinct bands ranging from less than 20,000 daltons.


A prototype NTHi isolate is the low passage isolate 86-028NP which was recovered from a child with chronic otitis media. This strain has been well characterized in vitro (Bakaletz et al., Infect. Immun., 53: 331-5, 1988; Holmes et al., Microb. Pathog., 23: 157-66, 1997) as well as in the chinchilla OM model (described herein) (Bakaletz et al., Vaccine, 15: 955-61, 1997; Suzuki et al., Infect. Immun., 62: 1710-8, 1994; DeMaria et al., Infect. Immun., 64: 5187-92, 1996). The 86-028NP strain was used, as described herein, to identify genes that are up-regulated in expression in the chinchilla model of otitis media and genes that are necessary for NTHi survival in the chinchilla middle ear.


DFI Strategy


A differential fluorescence induction (DFI) strategy was used herein to identify NTHi genes induced during OM in a chinchilla animal model. Several methods have been developed to identify bacterial genes that contribute to the virulence of an organism during infection. Such methods include in vivo expression technology (IVET) in which bacterial promoters regulate the expression of gene(s) required for synthesis of essential nutrients required for survival in the host; signature-tagged mutagenesis (STM) enabling tag-specific identification of genes that alter the virulence properties of a microorganism when mutated; DNA microarray technology to globally screen for transcriptionally active genes, and DFI which uses FACS analysis to select for transcriptionally active promoters (Chiang et al., Annu. Rev. Microbiol., 53: 129-154, 1999). DFI is a high-throughput method that allows for the identification of differentially regulated genes regardless of the basal level of expression and does not exclude those that are essential for growth in vitro.
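The selection logic behind DFI, as summarized above, can be illustrated with a short Python sketch: a promoter-trap clone is kept when its reporter fluorescence passes a sort gate under the inducing condition and is raised relative to growth in vitro, so that the basal expression level alone does not decide the outcome. The `Clone` record, the `select_induced` function, the gate value and the fold-change cutoff are all hypothetical names and numbers introduced for illustration; the specification does not state numeric sorting thresholds, and an actual sort is performed on individual cells by the cytometer rather than on per-clone averages.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    gfp_in_vitro: float  # mean GFP signal during in vitro growth (arbitrary units)
    gfp_in_vivo: float   # mean GFP signal under the inducing condition (arbitrary units)


def select_induced(clones, gate, min_fold):
    """Return names of clones that pass the fluorescence gate under the
    inducing condition and are induced at least min_fold over in vitro growth."""
    selected = []
    for clone in clones:
        # Guard against division by zero for clones with no basal signal.
        fold = clone.gfp_in_vivo / max(clone.gfp_in_vitro, 1e-9)
        if clone.gfp_in_vivo >= gate and fold >= min_fold:
            selected.append(clone.name)
    return selected


# Hypothetical library of three promoter-trap clones.
library = [
    Clone("clone_A", 10.0, 200.0),   # dim in vitro, bright when induced
    Clone("clone_B", 180.0, 190.0),  # bright under both conditions
    Clone("clone_C", 5.0, 8.0),      # dim under both conditions
]

induced = select_induced(library, gate=100.0, min_fold=4.0)
```

In this toy library only `clone_A` is retained: `clone_B` is bright but not differentially regulated, and `clone_C` never reaches the gate.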


DFI has been successfully utilized in many microorganisms. For example, a GFP reporter system and flow cytometry were used to study mycobacterial gene expression upon interaction with macrophages (Dhandayuthapani et al., Mol. Microbiol., 17: 901-912, 1995). A promoter trap system was used to identify genes whose transcription was increased when Salmonellae were subjected to environments simulating in vivo growth and when internalized by cultured macrophage-like cells (Valdivia and Falkow, Mol. Microbiol., 22: 367-378, 1996; Valdivia and Falkow, Science, 277: 2007-2011, 1997; Valdivia and Falkow, Curr. Opin. Microbiol., 1: 359-363, 1998). In addition, DFI has been used to identify promoters expressed in S. pneumoniae and S. aureus when grown under varied in vitro conditions simulating infection (Marra et al., Microbiol., 148: 1483-1491, 2002; Schneider et al., Proc. Natl. Acad. Sci. U.S.A., 97: 1671-1676, 2000). DFI has also been utilized to study gene regulation in Bacillus cereus in response to environmental stimuli (Dunn and Handelsman, Gene, 226: 297-305, 1999), in S. pneumoniae in response to a competence stimulatory peptide (Bartilson et al., Mol. Microbiol., 39: 126-135, 2001), and upon interaction with and invasion of host cells in Bartonella henselae (Lee and Falkow, Infect. Immun., 66: 3964-3967, 1998), Listeria monocytogenes (Wilson et al., Infect. Immun., 69: 5016-5024, 2001), Brucella abortus (Eskra et al., Infect. Immun., 69: 7736-7742, 2001), and Escherichia coli (Badger et al., Mol. Microbiol., 36: 174-182, 2000).


Whereas DFI has been successfully used to identify promoters active in cell culture models of infection or in vitro conditions designed to simulate an in vivo environment, few have applied DFI to identify promoters regulated in a specific biological niche within the whole animal. This is likely due to the numerous challenges associated with sorting from an in vivo environment. The host inflammatory response, dissemination and/or clearance of bacterial cells from the site of infection, as well as adherence of bacteria to epithelial cells, possibly via biofilm formation, can make bacteria inaccessible for retrieval from the living animal. These factors, among others, contribute to the complexity of the microenvironment and the heterogeneity of gene expression as the bacteria sense and respond to these changes. Recently, DFI has been used to identify promoters expressed in S. pneumoniae when the bacteria were screened in a mouse model of respiratory tract infection and a gerbil infection model of OM (Marra et al., Infect. Immun. 70: 1422-33, 2002; Marra et al., Microbiol., 148: 1483-91, 2002).


Animal Model


The chinchilla model is a widely accepted experimental model for OM. In particular, a chinchilla model of NTHi-induced OM has been well characterized (Bakaletz et al., J. Infect. Dis., 168: 865-872, 1993; Bakaletz and Holmes, Clin. Diagn. Lab. Immunol., 4: 223-225, 1997; Suzuki and Bakaletz, Infect. Immun., 62: 1710-1718, 1994), and has been used to determine the protective efficacy of several NTHi outer membrane proteins, combinations of outer membrane proteins, chimeric synthetic peptide vaccine components, and adjuvant formulations as vaccinogens against OM (Bakaletz et al., Vaccine, 15: 955-961, 1997; Bakaletz et al., Infect. Immun., 67: 2746-2762, 1999; Kennedy et al., Infect. Immun., 68: 2756-2765, 2000).


In particular, there is a unique in vivo model wherein adenovirus predisposes chinchillas to H. influenzae-induced otitis media, which allowed for the establishment of relevant cell, tissue and organ culture systems for the biological assessment of NTHi (Bakaletz et al., J. Infect. Dis., 168: 865-72, 1993; Suzuki et al., Infect. Immunity 62: 1710-8, 1994). Adenovirus infection alone has been used to assess the transudation of induced serum antibodies into the tympanum (Bakaletz et al., Clin. Diagnostic Lab Immunol., 4(2): 223-5, 1997) and has been used as a co-pathogen with NTHi to determine the protective efficacy of several active and passive immunization regimens targeting various NTHi outer membrane proteins (OMPs), combinations of OMPs, chimeric synthetic peptide vaccine components, and adjuvant formulations as vaccinogens against otitis media (Bakaletz et al., Infect Immunity, 67(6): 2746-62, 1999; Kennedy et al., Infect Immun., 68(5): 2756-65, 2000; Novotny et al., Infect Immunity 68(4): 2119-28, 2000; Poolman et al., Vaccine 19 (Suppl. 1): S108-15, 2000).


Genes Upregulated In Vivo in Response to NTHi Infection of the Middle Ear


In order to identify differentially regulated promoters in response to NTHi infection of the middle ear, a promoter trap library was constructed and sorting parameters defined. A portion of the promoter trap library was inoculated directly into the chinchilla middle ear and OM development was monitored by video otoscopy and tympanometry at 24 and 48 hours. In addition, the middle ear fluids were recovered 24 and 48 hours after infection. Two-color FACS analysis was used to isolate bacteria that were expressing GFP from other cells and debris associated with the effusion. Following isolation, the DNA sequences of the Haemophilus inserts 5′ of the gfpmut3 gene were determined and analyzed. In this manner, we identified genes that are up-regulated as NTHi sense and respond to the environment of the chinchilla middle ear during AOM. The following genes were identified and, due to their up-regulation during NTHi infection, they may play a role in NTHi infection and virulence.


As described below in Example 7, following the DFI procedure described above and subsequent FACS analysis of gfp-expressing clones, 52 candidate clones containing potential in vivo-regulated promoters were isolated. The genes these clones control were categorized based upon general description and function within the cell and include general metabolic processes, environmental informational processing and membrane transport, membrane proteins and hypothetical proteins. Eight of these 52 clones contain sequences that are unique to NTHi strain 86-028NP. Importantly, 3 clones were isolated from independent screens in more than one animal thereby verifying the method of isolation.


In order to independently confirm the FACS data, we determined the relative expression of candidate genes by quantitative RT-PCR. The parent strain, 86-028NP, was used for these studies. Thus, wild-type gene expression was analyzed without the influence of plasmid copy number on gene regulation, allowing false-positive clones identified by FACS to be recognized. Of the 44 candidate clones containing sequence similar to that identified in H. influenzae strain Rd, quantitative comparison of gene expression in vitro and in vivo confirmed up-regulated gene expression for twenty-six genes (60%) when NTHi respond to environmental cues present in the chinchilla middle ear. This analysis identified in vivo-regulated promoters which drive expression of genes involved in membrane transport, environmental informational processing, cellular metabolism, gene regulation, as well as hypothetical proteins with unknown function. (See Table 4 in Example 6).
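The in vivo versus in vitro comparison described above can be illustrated with a minimal sketch. The text does not state how relative expression was calculated, so the 2^-ΔΔCt calculation, the reference-gene normalization, the function names and the 2-fold cutoff used here are all assumptions made for illustration only.

```python
def fold_change(ct_target_vivo, ct_ref_vivo, ct_target_vitro, ct_ref_vitro):
    """Relative expression in vivo vs. in vitro by the 2^-ddCt calculation.

    Assumption: Ct values are normalized to a reference gene measured in
    the same sample; the source text does not specify the calculation.
    """
    d_ct_vivo = ct_target_vivo - ct_ref_vivo      # normalized in vivo signal
    d_ct_vitro = ct_target_vitro - ct_ref_vitro   # normalized in vitro signal
    return 2.0 ** -(d_ct_vivo - d_ct_vitro)


def confirmed_genes(fold_changes, threshold=2.0):
    """Return genes at or above a hypothetical fold-change cutoff and
    the fraction of all tested genes that they represent."""
    confirmed = [gene for gene, fc in fold_changes.items() if fc >= threshold]
    return confirmed, len(confirmed) / len(fold_changes)
```

For example, a target gene whose normalized Ct drops by two cycles in vivo gives `fold_change(22.0, 15.0, 24.0, 15.0) == 4.0`, that is, a 4-fold increase under these assumptions.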


Quantitative RT-PCR demonstrated a two-fold increase in lolA expression, enabling lipoprotein transport from the inner membrane to the outer membrane. Bacteria grow rapidly in the middle ear environment, reaching 5.0×10^8 CFU NTHi/ml middle ear fluid within 48 hours. The bacteria sense and respond to the environment, acquiring or synthesizing the necessary nutrients for growth and survival. The gene encoding the membrane component in ribose sugar transport, rbsC (SEQ ID NO: 619), showed a 5-fold increase in expression in vivo compared to cells growing in vitro. In addition, many genes involved in metabolic processes show a dramatic increase in gene expression in vivo compared to cells growing in vitro. These include a riboflavin synthesis gene, ribB (SEQ ID NO: 623); a purine nucleotide biosynthetic gene, purE (SEQ ID NO: 621); ornithine carbamoyltransferase, arcB (SEQ ID NO: 625), involved in arginine degradation via the urea cycle; and uxuA (SEQ ID NO: 627), encoding mannonate hydrolase, required for the uptake of D-glucuronate and transformation into glyceraldehyde 3-phosphate. In addition, but to a lesser degree, genes for histidine biosynthesis (hisB; SEQ ID NO: 615), DNA repair (radC; SEQ ID NO: 639) and a putative intracellular septation transmembrane protein (ispZ; SEQ ID NO: 637) were up-regulated.


Disulfide bond formation is important for folding and assembly of many secreted proteins in bacteria. In prokaryotes, DsbA and DsbB make up the oxidative pathway responsible for the formation of disulfides. DsbB reoxidizes DsbA, which donates disulfide bonds directly to unfolded polypeptides, and DsbB has been demonstrated to generate disulfides de novo from oxidized quinones (Collet and Bardwell, Mol. Microbiol., 44: 1-8, 2002). In H. influenzae strain Rd, DsbA is required for competence for transformation (Tomb, Proc. Natl. Acad. Sci. U.S.A., 89: 10252-10256, 1992). Herein, an approximately 3-fold increase in dsbB gene (SEQ ID NO: 629) transcription was demonstrated, illuminating an important role for disulfide interchange for NTHi growing in the middle ear environment.


Bacterial colonization of the middle ear, a normally sterile environment, results in a host inflammatory response and subsequent neutrophil infiltration. Bacteria have evolved numerous strategies to combat this host response. NTHi increase gene expression (4-fold) of ureH (SEQ ID NO: 631), a homologue of a gene required for expression of active urease in Helicobacter, shown to be involved in acid tolerance (Young et al., J. Bacteriol., 178: 6487-6495, 1996). Recently, it has been reported that urease activity may play a role in chronic Actinobacillus pleuropneumoniae infection by counteracting the decrease in pH occurring upon infection (Baltes et al., Infect. Immun., 69: 472-478, 2000; Baltes et al., Infect. Immun., 69: 472-478, 2001; Bosse and MacInnes, Can. J. Vet. Res., 64: 145-150). A biotype analysis on NTHi isolates from middle ear effusions demonstrated that 87% are urease positive (DeMaria et al., J. Clin. Microbiol., 20: 1102-1104, 1984). However, the role of urease in NTHi virulence is unknown. Similarly, an increase was observed in expression of a gene whose product demonstrates 88% sequence identity to a pyridoxine biosynthesis protein in S. pneumoniae and 60% homology to a putative singlet oxygen resistance protein that may function as an antioxidant. Phosphorylcholine (ChoP) has been implicated in the pathogenesis of NTHi (Weiser et al., Infect. Immun., 65: 943-950, 1997). NTHi modulates ChoP expression by phase variation, decorating the LOS on the cell surface. ChoP may contribute to NTHi persistence in the respiratory tract via decreased susceptibility to antimicrobial peptides (Lysecko et al., Infect. Immun., 68: 1664-1671, 2000) and alter the sensitivity to serum killing mediated by C-reactive protein (CRP) (Weiser et al., J. Exp. Med., 187: 631-640, 1998). The microenvironment of the nasopharynx and middle ear cavity may select for the ChoP+ phenotype, as ChoP+ strains show greater colonization of the chinchilla nasopharynx (Tong et al., Infect. Immun., 68: 4593-4597, 2000). Expression of the licC gene (SEQ ID NO: 633) was also increased. The licC gene encodes a phosphorylcholine cytidylyltransferase that plays a role in the biosynthesis of phosphorylcholine-derivatized LOS (Rock et al., J. Bacteriol., 183: 4927-4931, 2001).


Also included among the in vivo-induced genes is a set whose products subsequently regulate gene expression or DNA replication. These genes include the glp repressor gene, glpR (SEQ ID NO: 643), which regulates transcription of glycerol metabolism genes, the arginine repressor gene, argR (SEQ ID NO: 647), and the integration host factor (IHF) beta subunit gene, ihfB (SEQ ID NO: 645). IHF is a histone-like protein that binds DNA at specific sequences, an accessory factor involved in replication, site-specific recombination and transcription, altering the activity of a large number of operons (Goosen and van de Putte, Mol. Microbiol. 16: 1-7, 1995). In addition, CspD inhibits DNA replication during stationary phase-induced stress response in E. coli (Yamanaka et al., Mol. Microbiol., 39: 1572-1584, 2001) and the mukF (SEQ ID NO: 641) gene protein homologue contributes to a remodeling of the nucleoid structure into a more compact form prior to cell segregation (Sawitzke and Austin, Proc. Natl. Acad. Sci. U.S.A., 62: 1710-1718, 2000). The DFI strategy described herein also identified promoters induced in vivo for genes of unknown function. The hypothetical protein, HI0094, demonstrated an 8-fold increase in gene expression during early OM but its role remains unknown. HI1163 (SEQ ID NO: 651) showed 58% amino acid identity with the hypothetical YdiJ proteins, a putative oxidase, of E. coli.


A high-density transposon mutagenesis strategy was used to identify H. influenzae genes essential for growth on rich medium (Akerley et al., Proc. Natl. Acad. Sci. U.S.A., 99: 966-971, 2002). Six genes identified in the screen described herein are included in the essential gene set described in Akerley et al., supra (hisB, lppB, lolA, ispZ, mukF and unknown HI0665). Recently, genes of non-typeable H. influenzae that are expressed upon interaction with two human respiratory tract-derived epithelial cell lines have been identified. These genes included those involved in metabolic processes, stress responses, gene expression, cell envelope biosynthesis, DNA-related processes, cell division and ORFs encoding proteins of unknown function (Ulsen et al., Mol. Microbiol., 45: 485-500, 2002). Similarly, the stress response gene, cspD (SEQ ID NO: 649), genes involved in purine and riboflavin biosynthesis, and a gene encoding a protein of unknown function, vapA, were identified in the screen described herein. Expression of vapA was detected in vitro, yet vapA gene expression increased two-fold in vivo. These unique approaches identified known genes that are upregulated in NTHi-induced OM and therefore are likely to play a role in NTHi infection and virulence, and may be potential candidates for vaccines, antisense therapies and other therapeutic methods of treatment of NTHi related disorders.


The DFI strategy resulted in the identification of promoters induced in vivo for genes of unknown function as well. The hypothetical protein, HI0094, demonstrated an 8-fold increase in gene expression during early OM but its role remains unknown. HI1163 (SEQ ID NO: 651) showed 58% amino acid identity with the hypothetical YdiJ proteins, a putative oxidase, of E. coli. Therefore, these hypothetical genes are likely to play a role in OM induced by NTHi infection.





BRIEF DESCRIPTION OF FIGURES


FIG. 1 depicts the LKP gene region in a panel of Haemophilus isolates. The strain 86-028NP sequence is identical in this region to the sequence in NTHi strain R3001. Both of these NTHi lack the hif gene cluster encoding the hemagglutinating pilus.



FIG. 2 depicts the rfaD region in a panel of Haemophilus isolates. The gene arrangement in the rfaD region of the strain 86-028NP genome is similar to that of the strain Rd genome but different than the arrangement of these genes seen in the genome of most NTHi examined.



FIGS. 3A-3M set out the nucleotide sequences (SEQ ID NOS: 589-614) described in Table 4, which were identified to be upregulated during OM infection (see Example 6). The nucleotides (nt.) which correspond to known genes and those nt. which correspond to the contig sequences set out as SEQ ID NO: 1-576 are also presented.





DETAILED DESCRIPTION

The following examples illustrate the invention wherein Example 1 describes the sequence of the NTHi genome, Example 2 describes the identified contigs and initial gene discovery, Example 3 describes construction of the NTHi promoter trap library, Example 4 describes the analyses of 86-028NP derivatives expressing GFP, Example 5 demonstrates direct labelling of bacteria from middle ear fluids, Example 6 describes identification of promoters induced in vivo in acute otitis media, Example 7 describes identification of virulence-associated genes, Example 8 describes identification of unique NTHi gene sequences, Example 9 describes the analysis of the complete NTHi 86-028NP genome, and Example 10 compares the genomic DNA sequences of NTHi, strain 86-028NP and H. influenzae serotype d, strain kw20.


Example 1
Sequence of a Non-Typeable Haemophilus influenzae Genome

NTHi strain 86-028NP is a minimally passaged clinical isolate obtained from a pediatric patient who underwent tympanostomy and tube insertion for chronic OM at Columbus Children's Hospital (Bakaletz et al., Infection and Immunity, 56(2): 331-335, 1988). The 86-028NP strain was deposited with the American Type Culture Collection (Manassas, Va. 20108 USA) on Oct. 16, 2002 and assigned accession no. PTA-4764.


In an effort to more broadly approach the identification of the virulence determinants in NTHi, the genome of the NTHi 86-028NP strain was sequenced to 3-fold coverage. Chromosomal DNA was prepared from strain 86-028NP using the Puregene protocol and sheared to 2-4 kb in size with a Hydroshear instrument (Gene Machines). The sheared DNA was ethanol-precipitated, end-repaired using a mixture of Klenow enzyme and T4 DNA polymerase, and size-selected by agarose gel electrophoresis to obtain 2-4 kb fragments as described in Chissoe et al. (Methods: a Companion to Methods of Enzymology 3: 55-65, 1991) and Sambrook et al. (Molecular Cloning: a Laboratory Manual, 2nd Ed. Cold Spring Harbor, N.Y., 1989). These fragments were cloned into vector pUC18 using the SmaI restriction site (phosphatase-treated) and transformed into E. coli XL-1 Blue, selecting for ampicillin resistance. Colonies that contained inserts were identified by blue/white screening on LB-Amp plates containing X-gal, and transferred into 96-deep well plates containing 1.5 ml of TB-Amp broth (TB=Terrific Broth). The deep-well plate cultures were grown overnight (18-22 hours) at 37° C. Template preparation, sequencing and contig assembly were then performed.


Automated template preparation was performed on the Beckman Biomek 2000 automated robotics workstation as described in Chissoe et al. (supra). Briefly, each 96-deep well plate, containing the clones prepared above, was centrifuged to pellet the cells, the supernatant decanted, and the cells frozen (if necessary) at −20° C. Four 96-deep well blocks were placed on the Biomek table, and the liquid handling robot was used to prepare the template using an automated version of a typical SDS-NaOH lysis protocol as described in Chissoe et al. (supra). The final ethanol-precipitated templates were each dissolved in 50 μl ddH2O, and used for DNA sequencing.


Sequencing reactions were run by re-arraying the templates (from 96-well plates) into 384-well plates, using the Robbins Hydra 96 robot. Cycle-sequencing reactions were run using PE Big-Dye™ terminators and universal primers (M13 forward and reverse), cleaned up over Sephadex G50 columns, and analyzed on a PE Biosystems 3700 capillary electrophoresis DNA sequencer according to the manufacturer's instructions. Sequencing reads (8219) were assembled into 576 contigs (SEQ ID NOS: 1-576 herein). The statistics for the 3-fold sequencing are shown in Table 2A. The total unique sequence in assembly 17 is 1.74 Mb.












TABLE 2A

Contig Size | Total Number | Total Length (bp) | % of Cumulative
0-1 kb | 65 | 55961 | 3.2%
1-2 kb | 228 | 333665 | 19.2%
2-3 kb | 101 | 243059 | 14.0%
3-4 kb | 49 | 172385 | 9.9%
4-5 kb | 45 | 196699 | 11.3%
5-10 kb | 74 | 515152 | 29.6%
10-20 kb | 11 | 144591 | 8.3%
20-30 kb | 3 | 77352 | 4.4%
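As a consistency check on Table 2A, the following minimal sketch recomputes the contig count, the total unique sequence and the percentage column from the per-bin values listed in the table. The variable and function names are illustrative only and are not part of the original assembly pipeline.

```python
# (size class, number of contigs, total length in bp), copied from Table 2A
TABLE_2A = [
    ("0-1 kb", 65, 55961),
    ("1-2 kb", 228, 333665),
    ("2-3 kb", 101, 243059),
    ("3-4 kb", 49, 172385),
    ("4-5 kb", 45, 196699),
    ("5-10 kb", 74, 515152),
    ("10-20 kb", 11, 144591),
    ("20-30 kb", 3, 77352),
]


def summarize(rows):
    """Return (contig count, total length, percent of total per size class)."""
    total_contigs = sum(count for _, count, _ in rows)
    total_length = sum(length for _, _, length in rows)
    percents = {
        label: round(100.0 * length / total_length, 1)
        for label, _, length in rows
    }
    return total_contigs, total_length, percents


contigs, length_bp, percents = summarize(TABLE_2A)
```

With the values above, the bins sum to 576 contigs and 1,738,864 bp, in agreement with the 576 contigs and the 1.74 Mb of total unique sequence reported for the 3-fold assembly.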









Subsequently, 8-fold sequencing analysis of the NTHi genome was carried out. The 8-fold sequencing assembled the NTHi genome into 11 contigs. Contigs 5, 8, 9, 10, 12-18 are denoted as SEQ ID NOS: 675-685 herein. The statistics for the 8-fold sequencing are shown in Table 2B.












TABLE 2B

Contig Size | Total Number | Total Length (bp) | % of Cumulative
0-1 kb | 5 | 3950 | 0.2%
1-2 kb | 3 | 4316 | 0.2%
2-3 kb | 0 | 0 | 0.0%
3-4 kb | 1 | 3964 | 0.2%
4-5 kb | 0 | 0 | 0.0%
5-10 kb | 0 | 0 | 0.0%
10-20 kb | 1 | 15147 | 0.8%
20-30 kb | 2 | 51888 | 2.7%
30-40 kb | 0 | 0 | 0.0%
40-50 kb | 0 | 0 | 0.0%
50-100 kb | 1 | 85814 | 4.5%
>100 kb | 5 | 1760339 | 91.4%









Example 2
Contig Description and Initial Gene Discovery

Seventy-five of the 88 contigs with length ≥5000 bp, identified with the 3-fold sequence analysis, show significant similarity via BLASTN to genes in H. influenzae strain Rd. To visualize the potential relationship between the gene order in H. influenzae strain 86-028NP and H. influenzae strain Rd, the 86-028NP three-fold contig set and the Rd gene set were bidirectionally compared using BLASTN. The results were plotted in gene-order versus contig space by sorting the contigs based on gene coordinates of the Rd genes hit, anchoring each contig at the smallest coordinate found as described in Ray et al. (Bioinformatics 17: 1105-12, 2001). Compared in this fashion, an incomplete assembly of a genome with identical gene order to a completely known genome would display a monotonically increasing stair-stepped form.
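The contig ordering described above can be sketched as follows. This is a simplified illustration, not the procedure of Ray et al.: the hit tuples, the function names and the assumption that every contig is already in the same orientation as the Rd genome are hypothetical.

```python
def order_contigs(hits):
    """Sort contigs by the smallest Rd gene coordinate they hit.

    hits: iterable of (contig_id, position_in_contig, rd_gene_coordinate).
    Each contig is anchored at the smallest Rd coordinate found for it.
    """
    anchors = {}
    for contig, _, rd_coord in hits:
        anchors[contig] = min(rd_coord, anchors.get(contig, rd_coord))
    return sorted(anchors, key=anchors.get)


def is_stair_stepped(hits):
    """True if Rd coordinates never decrease when the hits are read
    along the anchored contigs, i.e. the plot rises monotonically."""
    rank = {contig: i for i, contig in enumerate(order_contigs(hits))}
    ordered = sorted(hits, key=lambda h: (rank[h[0]], h[1]))
    coords = [rd_coord for _, _, rd_coord in ordered]
    return all(a <= b for a, b in zip(coords, coords[1:]))
```

Under these assumptions, a contig set whose gene order matches Rd returns `True`, while a contig that carries a gene from a distant Rd locus breaks the stair-stepped pattern and returns `False`.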


BLASTX was used to identify hits to sequences with homology to genes in the strain Rd genome as well as genes not found in H. influenzae strain Rd. Hits to strain Rd sequences were removed from the data set and the other hits are summarized in Table 3A. The data are presented as follows: contig # (SEQ ID NO: #), column 1; E score for each hit, column 2; the name of the protein that had homology to a portion of the amino acid translation of the cited contig, column 3; the organism producing the homologue, column 4; the Genbank protein identifier for each of the proteins cited in column 3, column 5; and the corresponding nucleotides within the contig (referenced by SEQ ID NO:), column 6. In most instances, several homologues were identified but for clarity, the protein of greatest homology is cited in Table 3A.
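The filtering and the location column of Table 3A can be illustrated with a minimal sketch. The `Hit` record, the function names and the organism label used to recognize strain Rd hits are hypothetical. Keeping the lowest E score per identical location string is a simplification of "the protein of greatest homology", since overlapping homologues may report slightly different coordinates.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    contig: int
    e_score: float
    protein: str
    organism: str
    location: str   # e.g. "complement of nt. 959-1162 of SEQ ID NO: 110"


# Accepts both "complement" and the "compliment" spelling found in some rows.
LOCATION = re.compile(r"(compl[ie]ment of )?nt\. (\d+)-(\d+) of SEQ ID NO: (\d+)")


def parse_location(text):
    """Return (strand, start, end, seq_id) for a Table 3A location entry."""
    match = LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return strand, int(match.group(2)), int(match.group(3)), int(match.group(4))


def drop_rd_hits(hits, rd_label="Haemophilus influenzae Rd"):
    """Remove hits to strain Rd (rd_label is a hypothetical organism label)."""
    return [hit for hit in hits if hit.organism != rd_label]


def best_hit_per_location(hits):
    """Keep the hit with the lowest E score for each contig location."""
    best = {}
    for hit in hits:
        key = (hit.contig, hit.location)
        if key not in best or hit.e_score < best[key].e_score:
            best[key] = hit
    return list(best.values())
```

For example, `parse_location("complement of nt. 959-1162 of SEQ ID NO: 110")` returns `("-", 959, 1162, 110)`, and `parse_location("nt. 204-659 of SEQ ID NO: 104")` returns `("+", 204, 659, 104)`.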


The sequences for some of the genes listed in Table 3A were identified within the 8-fold sequencing of the NTHi genome. Table 3B lists the location of these genes within the 11 contigs, the full length open reading frame sequence (identified by SEQ ID NO:), the derived amino acid sequence encoded by the open reading frame and the gene with high homology identified by BLASTX (as listed in Table 3A).


To examine the relative short range gene arrangements in NTHi and the Rd strain, the gene order in two well-described gene clusters was compared. First, the genes present in the hemagglutinating pilus (LKP) gene region were examined (Mhlanga-Mutangadura et al., J. Bacteriol. 180(17): 4693-703, 1998). The pilus gene cluster is located between the purE and pepN genes, only fragments of which are depicted in FIG. 1. The serotype b strain, Eagan, contains the hifABCDE gene cluster and produces hemagglutinating pili. Strain Rd lacks the hicAB genes as well as the hifABCDE gene cluster. In general, the nontypeable strains previously examined contained the hicAB genes but not the hif genes that encode the hemagglutinating pilus. The strain 86-028NP sequence (described herein) is identical in this region to the sequence in NTHi strain R3001 (FIG. 1). Second, the rfaD gene region was examined. The rfaD gene encodes an enzyme involved in the biosynthesis of endotoxin, and the rfaD gene from NTHi strain 2019 has been characterized by Nichols et al. (Infect Immunity 65(4): 1377-86, 1997). In strain 2019, the rfaD gene is immediately upstream of the rfaF gene that encodes another enzyme involved in endotoxin biosynthesis. The gene arrangement in strain Rd is different; the rfaD and rfaF genes are separated by approximately 11 kb of sequence. Most nontypeable strains examined contained the gene arrangement seen in strain 2019. In contrast, strain 86-028NP has a gene arrangement identical to that seen in strain Rd (FIG. 2).


A global analysis of the current assembly indicates that the gene content and order are similar to those in strain Rd. A more detailed analysis revealed that there are a substantial number of NTHi genes not previously seen in the Pasteurellaceae and some regions where the NTHi gene content and order are different from those seen in strain Rd. Thus, the current data suggest that the strain 86-028NP genome will contain a complex mosaic of Rd and non-Rd like features.


The DFI strategy also identified novel NTHi sequences that had increased gene expression. A list of these novel contig sequences that contain genes or gene fragments that have homology to ORFs in other organisms (primarily gram-negative bacteria) is set out in Table 3A. For example, nucleotides 1498-1845 of contig 442 (SEQ ID NO: 442) are highly homologous to the sequences encoding amino acids 1-116 of H. influenzae strain Rd lipoprotein B (LppB). The gene is positioned between the stationary phase survival gene, surE, and a gene encoding a 43 kD antigenic outer membrane lipoprotein that is highly homologous to the recently identified bacterial lipoprotein, LppB/NlpD, which has been associated with virulence (Padmalayam et al., Infect. Immun., 68: 4972-4979, 2000). Recently, Zhang and coworkers demonstrated that nlpD and surE gene expression was induced during stationary phase of bacterial growth in Thermotoga maritima (Zhang et al., Structure (Camb), 9: 1095-1106, 2001). Therefore, under stress-induced conditions in the middle ear, this NTHi lipoprotein may be expressed.














TABLE 3A









Genbank



Contig
E score
Hit Identity
Organism
Protein
SEQ ID NO:




















104
4.00E−59
CpdB

Pasteurella

NP_246953.1
nt. 204-659 of






multocida


SEQ ID NO: 104


106
9.00E−10
hypothetical protein

Pyrococcus

G71244
nt. 40-309 of




PH0217 -

horikoshii


SEQ ID NO: 106


106
1.00E−08
unknown

Pasteurella

NP_246871.1
nt. 605-694 of






multocida


SEQ ID NO: 106


106
2.00E−20
Orf122

Chlorobium

AAG12204.1
nt. 7-210 of






tepidum


SEQ ID NO: 106


110
3.00E−05
ArdC antirestriction
IncW plasmid pSa
AAD52160.1
compliment of




protein


nt. 959-1162 of







SEQ ID NO: 110


110
1.00E−33
hypothetical protein

Salmonella

NP_458676.1
compliment of






enterica subsp.


nt. nt. 181-825 of






enterica serovar


SEQ ID NO: 110






Typhi



111
5.00E−12
putative membrane

Salmonella

NP_458664.1
compliment of




protein

enterica subsp.


nt. 45-287 of






enterica serovar


SEQ ID NO: 111






Typhi



111
6.00E−41
hypothetical protein

Salmonella

NP_458658.1
compliment of






enterica subsp.


nt. 1091-1480 of






enterica serovar


SEQ ID NO: 111






Typhi



114
7.00E−80
unknown

Pasteurella

NP_245828.1
compliment of






multocida


nt. 118-696 of







SEQ ID NO: 114


115
2.00E−09
A111R

Paramecium

NP_048459.1
nt. 555-869 of






bursaria Chlorella


SEQ ID NO: 115





virus 1


118
5.00E−45
DNA methylase

Vibrio cholerae

NP_231404.1
nt. 44-439 of




HsdM, putative


SEQ ID NO: 118


122
2.00E−18
unknown

Pasteurella

NP_245314.1
nt. 865-1302 of






multocida


SEQ ID NO: 122


123
4.00E−99
RNA

Proteus mirabilis

P50509
nt. 351-782 of




POLYMERASE


SEQ ID NO: 123




SIGMA-32




FACTOR


124
9.00E−58
ACETOLACTATE

Spirulina platensis

P27868
nt. 603-1025 of




SYNTHASE


SEQ ID NO: 124




(ACETOHYDROXY-




ACID SYNTHASE)




(ALS)


130
0
restriction

Neisseria

CAA09003.1
nt. 495-1559 of




modification

meningitidis


SEQ ID NO: 130




system-R protein


131
6.00E−91
uronate isomerase

Salmonella

NP_457532.1
compliment of




(glucuronate

enterica subsp.


nt. 661-1380 of




isomerase)

enterica serovar


SEQ ID NO: 131






Typhi



133
3.00E−30
GyrA

Pasteurella

NP_245778.1
compliment of






multocida


nt. 1447-1626 of







SEQ ID NO: 133


133
1.00E−27
DNA GYRASE

Pectobacterium

P41513
compliment of




SUBUNIT A

carotovorum


nt. 1302-1442 of







SEQ ID NO: 133


138
7.00E−06
KicA

Pasteurella

NP_245545.1
compliment of






multocida


nt. 92-157 of







SEQ ID NO: 138


138
 1.00E−148
TYPE II

Haemophilus

O30869
compliment of




RESTRICTION

aegyptius


nt. 164-1045 of




ENZYME HAEII


SEQ ID NO: 138




(ENDONUCLEASE




HAEII) (R. HAEII)


143
4.00E−06
Gifsy-1 prophage

Salmonella

NP_461555.1
compliment of




protein

typhimurium LT2


nt. 228-632 of







SEQ ID NO: 143


143
1.00E−14
hypothetical protein
Bacteriophage
NP_050531.1
compliment of





VT2-Sa

nt. 778-1248 of







SEQ ID NO: 143


143
5.00E−09
hypothetical protein

Salmonella

CAD09979.1
compliment of






enterica subsp.


nt. 715-1026 of






enterica serovar


SEQ ID NO: 143






Typhi



143
6.00E−10
hypothetical 14.9 kd

Escherichia coli

NP_065324.1
nt. 3-173 of




protein


SEQ ID NO: 143


147
1.00E−38
GTP-binding

Escherichia coli

NP_289127.1
compliment of




elongation factor,
O157:H7 EDL933

nt. 172-342 of




may be inner


SEQ ID NO: 147




membrane protein


147
2.00E−14
GTP-binding

Borrelia

NP_212222.1
compliment of




membrane protein

burgdorferi


nt. 17-181 of




(lepA)


SEQ ID NO: 147


148
6.00E−17
galactokinase

Homo sapiens

AAC35849.1
compliment of







nt. 746-1246 of







SEQ ID NO: 148


148
7.00E−96
GALACTOKINASE

Actinobacillus

P94169
compliment of




(GALACTOSE

pleuropneumoniae


nt. 232-741 of




KINASE)


SEQ ID NO: 148


149
1.00E−92
GTP-binding

Buchnera sp.

NP_240245.1
compliment of




protein TypA/BipA
APS

nt. 265-1077 of







SEQ ID NO: 149


15
2.00E−21
ORF 1

Escherichia coli

CAA39631.1
nt: 665-850 of







SEQ ID NO: 15


150
6.00E−17
unknown

Pasteurella

NP_245919.1
nt. 171-665 of






multocida


SEQ ID NO: 150


153
7.00E−07
outer membrane

Rickettsia conorii

T30852
nt. 51-623 of




protein A


SEQ ID NO: 153


155
7.00E−40
cytochrome d

Vibrio cholerae

NP_233259.1
nt. 583-1002 of




ubiquinol oxidase,


SEQ ID NO: 155




subunit II


157
7.00E−13
unknown

Pasteurella

NP_245490.1
compliment of






multocida


nt. 1170-1367 of







SEQ ID NO: 157


157
2.00E−05
glycosyl

Neisseria

AAA68012.1
nt. 85-189 of




transferase

gonorrhoeae


SEQ ID NO: 157


158
 1.00E−152
MltC

Pasteurella

NP_246259.1
compliment of






multocida


nt. 36-530 of







SEQ ID NO: 158


161
3.00E−25
lipoprotein, putative

Vibrio cholerae

NP_230232.1
nt. 870-1439 of







SEQ ID NO: 161


163
9.00E−53
chorismate

Caulobacter

NP_421948.1
nt. 1283-2029 of




synthase

crescentus


SEQ ID NO: 163


168
3.00E−13
COPPER-

Mus musculus

Q64430
nt. 66-995 of




TRANSPORTING


SEQ ID NO: 168




ATPASE 1




(COPPER PUMP 1)


168
2.00E−22
Cu transporting

Homo sapiens

2001422A
nt. 135-989 of




ATPase P


SEQ ID NO: 168


174
8.00E−48
magnesium/cobalt

Mesorhizobium

NP_103977.1
nt. 918-1205 of




transport protein

loti


SEQ ID NO: 174


175
5.00E−26
vacB protein

Buchnera sp.

NP_240369.1
compliment of





APS

nt. 1-1587 of







SEQ ID NO: 175


176
3.00E−21
putative ABC

Campylobacter

NP_282774.1
compliment of




transport system

jejuni


nt. 259-1089 of




permease protein [


SEQ ID NO: 176


183
5.00E−29
PROBABLE ATP

Ralstonia

NP_521442.1
compliment of




SYNTHASE A

solanacearum


nt. 42-677 of




CHAIN


SEQ ID NO: 183




TRANSMEMBRANE




PROTEIN


185
6.00E−85
putative exported

Salmonella

NP_458655.1
compliment of




protein

enterica subsp.


nt. 162-1529 of






enterica serovar


SEQ ID NO: 185






Typhi



187
2.00E−05
transketolase

Homo sapiens

AAA61222.1
nt. 709-819 of







SEQ ID NO: 187


188
 1.00E−116
ribonuclease E

Xylella fastidiosa

NP_299884.1
compliment of





9a5c

nt. 280-1704 of







SEQ ID NO: 188


192
1.00E−38
ImpA

Pasteurella

NP_245829.1
nt. 35-448 of






multocida


SEQ ID NO: 192


193
3.00E−08
Orf80
Enterobacteria
NP_052285.1
nt. 1612-1818 of





phage 186

SEQ ID NO: 193


193
1.00E−06
holin

Haemophilus

AAC45168.1
nt. 370-576 of






somnus


SEQ ID NO: 193


193
   0.007
unknown
Enterobacteria
NP_052260.1
nt. 1376-1609 of





phage 186

SEQ ID NO: 193


193
2.00E−48
lysozyme

Haemophilus

AAC45169.1
nt. 608-1093 of






somnus


SEQ ID NO: 193


199
4.00E−21
unknown protein

Escherichia coli

NP_288675.1
nt. 398-778 of





O157:H7

SEQ ID NO: 199





EDL933,





prophage CP-





933V


199
2.00E−49
hypothetical protein
Bacteriophage
NP_049495.1
compliment of





933W

nt. 1907-2392 of







SEQ ID NO: 199


20
1.00E−62
RpL14

Pasteurella

NP_246344.1
compliment of






multocida


nt. 233-601 of







SEQ ID NO: 20


200
2.00E−62
hypothetical protein

Salmonella

NP_458658.1
compliment of






enterica subsp.


nt. 431-997 of






enterica serovar


SEQ ID NO: 200






Typhi



200
3.00E−16
hypothetical protein

Salmonella

NP_458657.1
compliment of






enterica subsp.


nt. 1028-1264 of






enterica serovar


SEQ ID NO: 200






Typhi



201
2.00E−26
TsaA

Pasteurella

NP_245732.1
compliment of






multocida


nt. 1618-1809 of







SEQ ID NO: 201


209
6.00E−16
TsaA

Pasteurella

NP_245732.1
compliment of






multocida


nt. 2-136 of







SEQ ID NO: 209


211
2.00E−15
unknown

Pasteurella

NP_245535.1
compliment of






multocida


nt. 23-211 of







SEQ ID NO: 211


211
1.00E−70
PUTATIVE

Ralstonia

NP_520082.1
compliment of




ATPASE PROTEIN

solanacearum


nt. 475-915 of







SEQ ID NO: 211


212
3.00E−18
hypothetical protein

Escherichia coli

NP_309775.1
compliment of





O157:H7

nt. 895-1035 of







SEQ ID NO: 212


216
 1.00E−173
unknown

Pasteurella

NP_245069.1
nt. 35-1543 of






multocida


SEQ ID NO: 216


217
9.00E−18
diacylglycerol

Vibrio cholerae

NP_233101.1
nt. 2083-2208 of




kinase


SEQ ID NO: 217


221
4.00E−34
Tail-Specific

Chlamydia

NP_219953.1
nt. 849-1421 of




Protease

trachomatis


SEQ ID NO: 221


222
4.00E−23
AGR_C_3689p

Agrobacterium

NP_355005.1
compliment of






tumefaciens str.


nt. 940-1305 of





C58 (Cereon)

SEQ ID NO: 222


224
9.00E−19
unknown

Pasteurella

NP_245536.1
nt. 15-308 of






multocida


SEQ ID NO: 224


225
1.00E−89
portal vector -like

Salmonella

NP_461651.1
nt. 31-750 of of




protein, in phage

typhimurium


SEQ ID NO: 225




P2 [Salmonella
LT2Fels-2





typhimurium LT2]

prophage


229
2.00E−25
anaerobic

Salmonella

CAB62266.1
nt. 1806-2108 of




ribonucleotide

typhimurium


SEQ ID NO: 229




reductase


234
3.00E−08
conserved

Xylella fastidiosa

NP_299850.1
nt. 1680-2048 of




hypothetical protein
9a5c

SEQ ID NO: 234


234
1.00E−42
Methionine

Clostridium

NP_348177.1
compliment of




sulfoxide reductase

acetobutylicum


nt. 415-654 of




C-terminal domain


SEQ ID NO: 234




related protein,




YPPQ ortholog


235
4.00E−16
phage-related tail
Wolbachia
AAK85310.1
compliment of




protein
endosymbiont of

nt. 931-1929 of






Drosophila


SEQ ID NO: 235






melanogaster



235
6.00E−56
similar to orfG

Salmonella

NP_461625.1
compliment of




protein in phage

typhimurium LT2,


nt. 313-1863 of




186, Salmonella
Fels-2 prophage

SEQ ID NO: 235





typhimurium LT2



236
6.00E−20
conserved

Pseudomonas

NP_252693.1
nt. 1572-1916 of




hypothetical protein

aeruginosa


SEQ ID NO: 236


240
5.00E−27
MODIFICATION

Brevibacterium

P10283
compliment of




METHYLASE BEPI

epidermidis


nt. 922-1305 of







SEQ ID NO: 240


241
2.00E−15
phage-related

Xylella fastidiosa

NP_299573.1
complement of




protein
9a5c

nt. 865-1305 of







SEQ ID NO: 241


241
4.00E−08
hypothetical protein
phage SPP1
T42296
nt. 73-636 of







SEQ ID NO: 241


241
4.00E−07
hypothetical protein

Salmonella

NP_458686.1
nt. 10-468 of






enterica subsp.


SEQ ID NO: 241






enterica serovar







Typhi



242
2.00E−29
translation
chloroplast -
S35701
complement of




elongation factor
soybean

nt. 18-1085 of




EF-G


SEQ ID NO: 242


247
3.00E−23
GTP

Synechococcus

Q54769
complement of




CYCLOHYDROLASE
sp. PCC 7942

nt. 1009-1257 of




I (GTP-CH-I)


SEQ ID NO: 247


248
6.00E−05
phospho-N-

Aquifex aeolicus

NP_213025.1
nt. 830-1747 of




acetylmuramoyl-


SEQ ID NO: 248




pentapeptide-




transferase


25
2.00E−86
PROBABLE

Ralstonia

NP_522358.1
complement of




TRANSPORT

solanacearum


nt. 309-854 of




TRANSMEMBRANE


SEQ ID NO: 25




PROTEIN


25
7.00E−06
major facilitator

Caulobacter

NP_419155.1
complement of




family transporter

crescentus


nt. 134-283 of







SEQ ID NO: 25


250
 1.00E−150
CpdB

Pasteurella

NP_246953.1
complement of






multocida


nt. 36-1016 of







SEQ ID NO: 250


252
3.00E−57
alanyl-tRNA

Vibrio cholerae

AAA99922.1
complement of




synthetase


nt. 1418-1951 of







SEQ ID NO: 252


253
 1.00E−108
similar to

Listeria

NP_464432.1
complement of




glutathione

monocytogenes


nt. 411-1358 of




Reductase
EGD-e

SEQ ID NO: 253


259
3.00E−39
hypothetical protein

Salmonella

NP_458654.1
complement of






enterica subsp.


nt. 342-1037 of






enterica serovar


SEQ ID NO: 259






Typhi



259
3.00E−17
possible exported

Salmonella

NP_458653.1
complement of




protein

enterica subsp.


nt. 1251-1607 of






enterica serovar


SEQ ID NO: 259






Typhi



261
5.00E−74
hypothetical protein

Haemophilus

S27582
complement of




6 - Haemophilus

influenzae


nt. 3-422 of





influenzae



SEQ ID NO: 261


263
1.00E−94
putative

Haemophilus

AAD01406.1
nt. 2142-2672 of




transposase

paragallinarum


SEQ ID NO: 263


264
 1.00E−126
unknown

Actinobacillus

NP_067554.1
nt. 40-714 of






actinomycetemco-


SEQ ID NO: 264






mitans



264
 1.00E−103
unknown

Actinobacillus

NP_067555.1
nt. 695-1309 of






actinomycetemco-


SEQ ID NO: 264






mitans



264
2.00E−21
unknown

Actinobacillus

NP_067556.1
nt. 1302-1448 of






actinomycetemco-


SEQ ID NO: 264






mitans



265
6.00E−27
Aminopeptidase 2
chloroplast
Q42876
nt. 556-1539 of







SEQ ID NO: 265


268
 1.00E−116
MutY

Pasteurella

NP_246257.1
nt. 1003-1581 of






multocida


SEQ ID NO: 268


272
1.00E−07
hypothetical protein
Bacteriophage
NP_049495.1
complement of





933W

nt. 77-232 of







SEQ ID NO: 272


274
3.00E−13
unknown

Pasteurella

NP_246952.1
complement of






multocida


nt. 1658-1975 of







SEQ ID NO: 274


275
3.00E−20
CafA

Neisseria

AAG24267.1
nt. 1299-1571 of






gonorrhoeae


SEQ ID NO: 275


276
1.00E−45
mukE protein

Vibrio cholerae

NP_231351.1
complement of







nt. 650-1390 of







SEQ ID NO: 276


276
1.00E−69
KicA

Pasteurella

NP_245545.1
complement of






multocida


nt. 647-1321 of







SEQ ID NO: 276


278
2.00E−56
3-oxoacyl-[acyl-

Salmonella

NP_455686.1
nt. 1366-1944 of




carrier-protein]

enterica subsp.


SEQ ID NO: 278




synthase III

enterica serovar







Typhi



281
5.00E−56
unknown

Pasteurella

NP_246261.1
complement of






multocida


nt. 31-678 of







SEQ ID NO: 281


282
3.00E−09
orf25; similar to T
bacteriophage phi
NP_490625.1
complement of




gene of P2
CTX

nt. 511-1032 of







SEQ ID NO: 282


282
1.00E−08
orf11; similar to

Haemophilus

AAC45165.1
complement of




phage P2 gene S-

somnus


nt. 1450-1584 of




like product, which


SEQ ID NO: 282




is involved in tail




synthesis,


282
9.00E−27
putative

Salmonella

NP_457167.1
complement of




bacteriophage tail

enterica subsp.


nt. 3-509 of




protein

enterica serovar


SEQ ID NO: 282






Typhi



286
5.00E−18
plasmid-related

Listeria innocua

NP_471066.1
complement of




protein
plasmid

nt. 887-1501 of







SEQ ID NO: 286


287
8.00E−20
GTP

Escherichia coli

NP_287920.1
nt. 2-145 of




cyclohydrolase II
O157:H7 EDL933

SEQ ID NO: 287


289
 1.00E−168
MODIFICATION

Haemophilus

O30868
complement of




METHYLASE

aegyptius


nt. 138-1091 of




HAEII


SEQ ID NO: 289


289
5.00E−11
TYPE II

Haemophilus

O30869
complement of




RESTRICTION

aegyptius


nt. 22-132 of




ENZYME HAEII


SEQ ID NO: 289


289
6.00E−47
mukF homolog

Haemophilus

AAB70828.1
complement of






influenzae biotype


nt. 1107-1385 of






aegyptius


SEQ ID NO: 289


294
 1.00E−171
LICA PROTEIN

Haemophilus

P14181
complement of






influenzae


nt. 677-1564 of





RM7004

SEQ ID NO: 294


297
 1.00E−158
DNA methylase

Vibrio cholerae

NP_231404.1
complement of




HsdM, putative


nt. 12-1136 of







SEQ ID NO: 297


302
0
HEME-BINDING

Haemophilus

P33950
nt. 3-1316 of




PROTEIN A

influenzae DL42


SEQ ID NO: 302


304
6.00E−19
hypothetical protein

Haemophilus

S27582
nt. 121-267 of




6

influenzae


SEQ ID NO: 304


305
6.00E−40
putative

Streptococcus

NP_269557.1
nt. 65-805 of




recombinase -

pyogenes M1


SEQ ID NO: 305




phage associated
GAS


305
7.00E−22
single stranded

Shewanella sp.

AAB57886.1
nt. 1607-2014 of




DNA-binding
F1A

SEQ ID NO: 305




protein


305
1.00E−43
phage-related

Bacillus

NP_244410.1
nt. 92-751 of




protein

halodurans


SEQ ID NO: 305


312
1.00E−28
PUTATIVE

Ralstonia

NP_518994.1
nt. 1819-2673 of




BACTERIOPHAGE-

solanacearum


SEQ ID NO: 312




RELATED




TRANSMEMBRANE




PROTEIN


312
9.00E−25
similar to

Homo sapiens

XP_068727.1
nt. 27-1001 of




BASEMENT


SEQ ID NO: 312




MEMBRANE-




SPECIFIC




HEPARAN




SULFATE




PROTEOGLYCAN




CORE PROTEIN




PRECURSOR




(HSPG)


315
2.00E−45
uracil permease

Deinococcus

NP_296001.1
complement of






radiodurans


nt. 525-1592 of







SEQ ID NO: 315


318
7.00E−15
CzcD

Pasteurella

NP_246276.1
complement of






multocida


nt. 3-227 of







SEQ ID NO: 318


320
2.00E−60
orf3; similar to

Haemophilus

AAC45159.1
complement of




endonuclease

somnus


nt. 606-1241 of




subunit of the


SEQ ID NO: 320




phage P2




terminase (gene M)


320
2.00E−09
orf4; similar to

Haemophilus

AAC45160.1
complement of




head

somnus


nt. 52-285 of




completion/stabili-


SEQ ID NO: 320




zation protein (gene




L) of phage P2


320
3.00E−35
orf2; similar to

Haemophilus

AAC45158.1
complement of




major capsid

somnus


nt. 1271-1624 of




protein precursor of


SEQ ID NO: 320




phage P2 (gene N)


323
4.00E−37
dedC protein

Escherichia coli

AAA23966.1
complement of







nt. 74-463 of







SEQ ID NO: 323


324
 1.00E−153
conserved

Neisseria

NP_274972.1
complement of




hypothetical protein

meningitidis


nt. 930-1943 of





MC58

SEQ ID NO: 324


326
5.00E−52
selenophosphate

Eubacterium

CAB53511.1
complement of




synthetase

acidaminophilum


nt. 1186-2292 of







SEQ ID NO: 326


328
 1.00E−129
secretion protein

Pseudomonas

NP_252510.1
complement of




SecD

aeruginosa


nt. 8-625 of







SEQ ID NO: 328


333
3.00E−08
unknown

Pasteurella

NP_245489.1
complement of






multocida


nt. 5-418 of







SEQ ID NO: 333


336
6.00E−38
probable methyl

Pseudomonas

NP_253353.1
complement of




transferase

aeruginosa


nt. 2547-2819 of







SEQ ID NO: 336


338
2.00E−98
Pmi

Pasteurella

NP_245766.1
nt. 144-842 of






multocida


SEQ ID NO: 338


339
2.00E−07
tRNA

Escherichia coli

QQECPE
nt. 2331-2540 of




nucleotidyltransferase


SEQ ID NO: 339


340
0
DNA gyrase,

Salmonella

NP_461214.1
complement of




subunit A, type II

typhimurium LT2


nt. 93-1799 of




topoisomerase


SEQ ID NO: 340


342
4.00E−12
tolA protein

Haemophilus

JC5212
nt. 980-1318 of






influenzae


SEQ ID NO: 342


344
1.00E−07
MODIFICATION

Haemophilus

P50192
complement of




METHYLASE

parahaemolyticus


nt. 849-1034 of




HPHIA


SEQ ID NO: 344


344
8.00E−05
ABC transporter

Leishmania major

AAF31030.1
complement of




protein 1


nt. 17-205 of







SEQ ID NO: 344


349
3.00E−44
conserved

Neisseria

NP_273467.1
complement of




hypothetical protein

meningitidis


nt. 1397-1903 of





MC58

SEQ ID NO: 349


349
8.00E−09
hypothetical protein

Pseudomonas

NP_252667.1
complement of






aeruginosa


nt. 795-1121 of







SEQ ID NO: 349


349
9.00E−10
conserved

Helicobacter

NP_207009.1
complement of




hypothetical

pylori 26695


nt. 1319-1816 of




secreted protein


SEQ ID NO: 349


349
2.00E−06
putative TPR

Salmonella

NP_463149.1
complement of




repeat protein

typhimurium LT2


nt. 2244-2558 of







SEQ ID NO: 349


35
1.00E−23
type I restriction-

Xylella fastidiosa

NP_300003.1
complement of




modification
9a5c

nt. 29-388 of




system specificity


SEQ ID NO: 35




determinant


352
 1.00E−116
putative peptidase

Escherichia coli

NP_416827.1
complement of





K12

nt. 951-1640 of







SEQ ID NO: 352


352
0
unknown

Pasteurella

NP_245275.1
complement of






multocida


nt. 86-946 of







SEQ ID NO: 352


354
5.00E−86
putative uronate

Salmonella

NP_462052.1
complement of




isomerase

typhimurium LT2


nt. 168-914 of







SEQ ID NO: 354


356
1.00E−07
isomerase-like

Escherichia coli

S57220
nt. 5-73 of




protein (DsbD) -


SEQ ID NO: 356


358
1.00E−07
USG protein

Pediococcus

CAC16793.1
nt. 534-1307 of






pentosaceus


SEQ ID NO: 358


358
   0.005
HsdS protein

Escherichia coli

CAA10700.1
nt. 26-205 of







SEQ ID NO: 358


361
 1.00E−152
maltodextrin

Escherichia coli

NP_289957.1
complement of




phosphorylase
O157:H7 EDL933

nt. 77-922 of







SEQ ID NO: 361


363
6.00E−06
BH2505~unknown

Bacillus

NP_243371.1
nt. 554-844 of




conserved protein

halodurans


SEQ ID NO: 363


368
1.00E−12
H02F09.3.p

Caenorhabditis

NP_508295.1
complement of






elegans


nt. 1069-1977 of







SEQ ID NO: 368


368
6.00E−27
hypothetical

Mesorhizobium

NP_102360.1
complement of




glycine-rich protein

loti


nt. 1201-1986 of







SEQ ID NO: 368


37
6.00E−09
putative ATP-

Escherichia coli

NP_415469.1
complement of




binding component
K12

nt. 455-691 of




of a transport


SEQ ID NO: 37




system


372
7.00E−18
conserved

Clostridium

BAB80319.1
complement of




hypothetical protein

perfringens


nt. 1763-1924 of







SEQ ID NO: 372


376
7.00E−24
putative

Salmonella

NP_456379.1
complement of




bacteriophage

enterica subsp.


nt. 158-808 of




protein

enterica serovar


SEQ ID NO: 376






Typhi



376
8.00E−10
hypothetical protein

Xylella fastidiosa

NP_298882.1
complement of





9a5c

nt. 1129-1671 of







SEQ ID NO: 376


376
9.00E−06
lin1713

Listeria innocua

NP_471049.1
complement of







nt. 913-1557 of







SEQ ID NO: 376


377
6.00E−05
Vng1732c

Halobacterium sp.

NP_260487.1
nt. 2378-2587 of





NRC-1

SEQ ID NO: 377


377
1.00E−11
INVASIN

Yersinia

P31489
complement of




PRECURSOR

enterocolitica


nt. 127-345 of




(OUTER


SEQ ID NO: 377




MEMBRANE




ADHESIN)


382
4.00E−16
unknown

Pasteurella

NP_246871.1
complement of






multocida


nt. 967-1068 of







SEQ ID NO: 382


383
4.00E−36
putative

Streptomyces

BAB69302.1
nt. 488-1162 of




transposase

avermitilis


SEQ ID NO: 383


383
3.00E−58
recombinase
IncN plasmid R46
NP_511241.1
complement of







nt. 1-393 of







SEQ ID NO: 383


383
4.00E−24
transposase

Escherichia coli

I69674
nt. 1294-1740 of







SEQ ID NO: 383


383
0
tnpA

Yersinia

CAA73750.1
nt. 1782-2834 of






enterocolitica


SEQ ID NO: 383


385
2.00E−31
unknown

Pasteurella

NP_246065.1
nt. 1515-1772 of






multocida


SEQ ID NO: 385


386
5.00E−65
cydC [

Escherichia coli

AAA66172.1
complement of







nt. 3438-4115 of







SEQ ID NO: 386


386
4.00E−33
ABC transporter,

Mesorhizobium

NP_105463.1
complement of




ATP-binding

loti


nt. 2569-3390 of




protein


SEQ ID NO: 386


388
3.00E−45
60 KDA INNER-

Coxiella burnetii

P45650
complement of




MEMBRANE


nt. 3211-3759 of




PROTEIN


SEQ ID NO: 388




HOMOLOG


390
4.00E−25
putative DNA-

Salmonella

NP_458175.1
nt. 1051-1416 of




binding protein

enterica subsp.


SEQ ID NO: 390






enterica serovar







Typhi



390
3.00E−13
transcriptional

Bacillus

NP_241773.1
complement of




regulator

halodurans


nt. 84-578 of







SEQ ID NO: 390


390
3.00E−06
DNA translocase

Staphylococcus

NP_372265.1
complement of




stage III sporulation

aureus subsp.


nt. 620-871 of




prot homolog

aureus Mu50


SEQ ID NO: 390


395
7.00E−31
ATPase, Cu++

Homo sapiens

NP_000044.1
complement of




transporting, beta


nt. 615-1406 of




polypeptide


SEQ ID NO: 395


397
3.00E−23
terminase large
Bacteriophage
NP_112076.1
complement of




subunit
HK620

nt. 2363-2725 of







SEQ ID NO: 397


397
3.00E−16
hypothetical protein

Xylella fastidiosa

NP_297824.1
complement of





9a5c

nt. 1517-1744 of







SEQ ID NO: 397


398
4.00E−67
orf32

Haemophilus

NP_536839.1
complement of





phage HP2

nt. 1288-1866 of







SEQ ID NO: 398


398
8.00E−24
putative

Salmonella

NP_463063.1
complement of




cytoplasmic protein

typhimurium LT2


nt. 798-1220 of







SEQ ID NO: 398


398
2.00E−83
orf31

Haemophilus

NP_043502.1
complement of





phage HP1

nt. 1881-2510 of







SEQ ID NO: 398


399
1.00E−94
HEME/HEMOPEXIN-

Haemophilus

P45355
nt. 88-774 of




BINDING

influenzae N182


SEQ ID NO: 399




PROTEIN


401
3.00E−63
Sty SBLI

Salmonella

CAA68058.1
nt. 1690-2742 of






enterica


SEQ ID NO: 401


401
3.00E−06
RESTRICTION-

Mycoplasma

NP_325912.1
nt. 79-489 of




MODIFICATION

pulmonis


SEQ ID NO: 401




ENZYME




SUBUNIT M3


402
2.00E−13
OPACITY

Neisseria

Q05033
complement of




PROTEIN OPA66

gonorrhoeae


nt. 2634-2915 of




PRECURSOR


SEQ ID NO: 402


406
8.00E−13
type I restriction

Neisseria

NP_273876.1
nt. 281-520 of




enzyme EcoR124IIR

meningitidis


SEQ ID NO: 406





MC58


407
6.00E−65
unknown

Pasteurella

NP_246237.1
nt. 938-2450 of






multocida


SEQ ID NO: 407


407
5.00E−99
PepE

Pasteurella

NP_245391.1
nt. 1216-1917 of






multocida


SEQ ID NO: 407


407
1.00E−16
Hemoglobin-

Haemophilus

Q48153
nt. 1-141 of




haptoglobin binding

influenzae Tn106


SEQ ID NO: 407




protein A


409
 1.00E−106
hypothetical protein

Haemophilus

S27577
complement of




1

influenzae


nt. 2524-3159 of







SEQ ID NO: 409


411
4.00E−29
heme-repressible

Haemophilus

AAB46794.1
nt. 391-615 of




hemoglobin-binding

influenzae, type b,


SEQ ID NO: 411




protein
strain HI689


411
0
Hemoglobin-

Haemophilus

Q48153
nt. 651-3263 of




haptoglobin binding

influenzae Tn106


SEQ ID NO: 411




protein A


412
2.00E−07
REGULATORY
bacteriophage
P03036
complement of




PROTEIN CRO
434

nt. 59-259 of




(ANTIREPRESSOR)


SEQ ID NO: 412


412
4.00E−06
hypothetical protein
Bacteriophage
CAC83535.1
nt. 1436-1654 of





P27

SEQ ID NO: 412


413
8.00E−07
hypothetical protein

Deinococcus

NP_294301.1
complement of






radiodurans


nt. 791-1012 of







SEQ ID NO: 413


414
9.00E−65
conserved

Vibrio cholerae

NP_230092.1
nt. 1696-2103 of




hypothetical protein


SEQ ID NO: 414


414
3.00E−93
unknown

Pasteurella

NP_246834.1
nt. 1777-2109 of






multocida


SEQ ID NO: 414


416
2.00E−17
unknown

Pasteurella

NP_246629.1
complement of






multocida


nt. 2565-2831 of







SEQ ID NO: 416


416
4.00E−26
hypothetical protein

Escherichia coli

S30728
complement of




o154


nt. 1928-2254 of







SEQ ID NO: 416


416
3.00E−37
transport protein

Pseudomonas

NP_253757.1
complement of




TatC

aeruginosa


nt. 1494-2018 of







SEQ ID NO: 416


417
1.00E−66
weakly similar to

Listeria innocua

NP_471073.1
complement of




methyltransferases


nt. 999-1928 of







SEQ ID NO: 417


417
5.00E−05
DNA-BINDING

Pectobacterium

Q47587
complement of




PROTEIN RDGA

carotovorum


nt. 3526-4212 of







SEQ ID NO: 417


417
2.00E−29
putative phage-

Yersinia pestis

NP_407132.1
complement of




related protein


nt. 2546-2938 of







SEQ ID NO: 417


417
3.00E−06
Adenine-specific

Thermoplasma

NP_393798.1
complement of




DNA methylase

acidophilum


nt. 826-1020 of







SEQ ID NO: 417


43
9.00E−16
PcnB

Pasteurella

NP_245801.1
nt. 511-870 of






multocida


SEQ ID NO: 43


434
2.00E−97
beta′ subunit of

Nephroselmis

NP_050840.1
complement of




RNA polymerase

olivacea


nt. 32-1534 of







SEQ ID NO: 434


435
4.00E−52
MODIFICATION

Brevibacterium

P10283
complement of




METHYLASE BEPI

epidermidis


nt. 11-565 of







SEQ ID NO: 435


435
4.00E−57
pentafunctional

Saccharomyces

NP_010412.1
complement of




arom polypeptide

cerevisiae


nt. 757-2064 of




(contains: 3-


SEQ ID NO: 435




dehydroquinate




synthase, 3-




dehydroquinate,




dehydratase (3-




dehydroquinase),




shikimate 5-




dehydrogenase,




shikimate kinase,




and epsp synthase)


437
5.00E−70
dihydrofolate

Haemophilus

S52336
nt. 2393-2767 of




reductase

influenzae


SEQ ID NO: 437





(clinical isolate





R1042)


438
 1.00E−106
polyA polymerase

Vibrio cholerae

NP_230244.1
nt. 3-1124 of







SEQ ID NO: 438


439
6.00E−60
Porphyrin

Salmonella

NP_457816.1
nt. 2343-2783 of




biosynthetic protein

enterica subsp.


SEQ ID NO: 439






enterica serovar







Typhi



441
5.00E−73
RimM

Pasteurella

NP_246234.1
complement of






multocida


nt. 151-441 of







SEQ ID NO: 441


442
9.00E−31
LIPOPROTEIN

Salmonella

P40827
complement of




NLPD

typhimurium


nt. 3362-3520 of







SEQ ID NO: 442


444
6.00E−24
glycine betaine

Staphylococcus

NP_371872.1
complement of




transporter

aureus subsp.


nt. 2242-2514 of






aureus Mu50


SEQ ID NO: 444


452
6.00E−28
unknown

Pasteurella

NP_245616.1
complement of






multocida


nt. 533-883 of







SEQ ID NO: 452


452
0
Type I restriction

Escherichia coli

Q47163
nt. 3291-4154 of




enzyme EcoprrI M


SEQ ID NO: 452




protein


452
2.00E−75
type I restriction

Ureaplasma

NP_077929.1
nt. 4156-4662 of




enzyme M protein

urealyticum


SEQ ID NO: 452


455
9.00E−56
PROBABLE

Ralstonia

NP_520059.1
nt. 2028-2774 of




BACTERIOPHAGE

solanacearum


SEQ ID NO: 455




PROTEIN


455
2.00E−55
orf2; similar to

Haemophilus

AAC45158.1
nt. 2864-3490 of




major capsid

somnus


SEQ ID NO: 455




protein precursor of




phage P2 (gene N),


455
 1.00E−175
gpP
Enterobacteria
NP_046758.1
complement of





phage P2

nt. 127-1812 of







SEQ ID NO: 455


456
1.00E−38
hypothetical protein

Pseudomonas

NP_542872.1
complement of






putida


nt. 1010-1282 of







SEQ ID NO: 456


456
 1.00E−172
hypothetical protein

Pseudomonas

NP_542873.1
complement of






putida


nt. 1443-2006 of







SEQ ID NO: 456


457
 1.00E−116
hypothetical protein

Haemophilus

S15287
complement of




(galE 5′ region) -

influenzae


nt. 62-961 of





Haemophilus



SEQ ID NO: 457





influenzae



457
 1.00E−134
dTDPglucose 4,6-

Actinobacillus

T00102
nt. 2637-3656 of




dehydratase

actinomycetemco-


SEQ ID NO: 457






mitans



459
2.00E−10
RNA polymerase

Synechocystis sp.

NP_441586.1
nt. 25-117 of




gamma-subunit
PCC 6803

SEQ ID NO: 459


461
9.00E−51
conserved

Staphylococcus

NP_370593.1
nt. 4124-4624 of




hypothetical protein

aureus subsp.


SEQ ID NO: 461






aureus Mu50



462
9.00E−06
NADH

Burkholderia

AAG01016.1
nt. 703-828 of




dehydrogenase

pseudomallei


SEQ ID NO: 462


465
3.00E−41
GTP-binding

Synechocystis sp.

NP_441951.1
complement of




protein Era
PCC 6803

nt. 2470-2787 of







SEQ ID NO: 465


466
1.00E−15
putative

Salmonella

NP_455548.1
nt. 837-1478 of




bacteriophage

enterica subsp.


SEQ ID NO: 466




protein

enterica serovar







Typhi



466
1.00E−90
orf31

Haemophilus

NP_043502.1
nt. 2396-3199 of





phage HP1

SEQ ID NO: 466


469
0
Hemoglobin and

Haemophilus

Q9X442
complement of




hemoglobin-

influenzae HI689


nt. 427-3459 of




haptoglobin binding


SEQ ID NO: 469




protein C precursor


471
8.00E−05
transposase,

Neisseria

NP_274608.1
nt. 2957-3217 of




putative

meningitidis


SEQ ID NO: 471





MC58


472
6.00E−08
hypothetical protein

Salmonella

NP_458660.1
complement of






enterica subsp.


nt. 2881-3270 of






enterica serovar


SEQ ID NO: 472






Typhi



472
5.00E−23
antirestriction

Mesorhizobium

NP_106707.1
nt. 4908-5324 of




protein

loti


SEQ ID NO: 472


472
1.00E−75
hypothetical protein

Salmonella

NP_458661.1
complement of






enterica subsp.


nt. 1931-2776 of






enterica serovar


SEQ ID NO: 472






Typhi



472
9.00E−72
hypothetical protein

Salmonella

NP_458662.1
complement of






enterica subsp.


nt. 544-1689 of






enterica serovar


SEQ ID NO: 472






Typhi



475
3.00E−25
unknown

Pasteurella

NP_244952.1
nt. 3207-3626 of






multocida


SEQ ID NO: 475


476
8.00E−73
putative DNA-

Salmonella

NP_458175.1
complement of




binding protein

enterica subsp.


nt. 3339-4310 of






enterica serovar


SEQ ID NO: 476






Typhi



476
6.00E−47
anticodon nuclease

Neisseria

NP_273873.1
complement of






meningitidis


nt. 4397-4885 of





MC58

SEQ ID NO: 476


478
3.00E−06
methionin

Arabidopsis

CAB38313.1
complement of




synthase-like

thaliana


nt. 3554-3679 of




enzyme


SEQ ID NO: 478


478
3.00E−05
unknown

Pasteurella

NP_245444.1
complement of






multocida


nt. 164-250 of







SEQ ID NO: 478


479
1.00E−18
conserved

Xylella fastidiosa

NP_298841.1
nt. 2302-2658 of




hypothetical protein
9a5c

SEQ ID NO: 479


48
3.00E−19
Dca

Neisseria

AAF12796.1
complement of






gonorrhoeae


nt. 225-746 of







SEQ ID NO: 48


482
1.00E−06
hypothetical protein

Neisseria

NP_275122.1
nt. 2055-2189 of






meningitidis


SEQ ID NO: 482





MC58


482
9.00E−28
conserved

Neisseria

NP_274383.1
nt. 1689-1898 of




hypothetical protein

meningitidis


SEQ ID NO: 482





MC58


487
5.00E−75
conserved

Neisseria

NP_284304.1
nt. 2541-2978 of




hypothetical protein

meningitidis


SEQ ID NO: 487





Z2491


488
2.00E−64
unknown

Pasteurella

NP_246617.1
nt. 2983-3540 of






multocida


SEQ ID NO: 488


488
8.00E−93
1-deoxy-D-xylulose

Zymomonas

AAD29659.1
nt. 1344-1880 of




5-phosphate

mobilis


SEQ ID NO: 488




reductoisomerase


491
5.00E−51
rubredoxin

Clostridium

AAB50346.1
complement of




oxidoreductase

acetobutylicum


nt. 1690-2439 of




homolog


SEQ ID NO: 491


492
1.00E−27
phosphotransferase

Staphylococcus

AAK83253.1
complement of




system enzyme

aureus


nt. 755-970 of




IIA-like protein


SEQ ID NO: 492


493
2.00E−84
unknown

Actinobacillus

AAC70895.1
nt. 3333-3935 of






actinomycetemco-


SEQ ID NO: 493






mitans



493
4.00E−49
unknown

Helicobacter

NP_223898.1
nt. 3345-4010 of






pylori J99


SEQ ID NO: 493


493
9.00E−31
transcriptional

Acinetobacter

AAF20290.1
nt. 1885-2793 of




factor MdcH

calcoaceticus


SEQ ID NO: 493


493
6.00E−30
HimA

Pasteurella

NP_245565.1
nt. 1129-1260 of






multocida


SEQ ID NO: 493


494
4.00E−85
putative prophage

Yersinia pestis

NP_404712.1
nt. 900-2099 of




integrase


SEQ ID NO: 494


494
4.00E−63
DNA

Xylella fastidiosa

NP_299063.1
complement of




methyltransferase
9a5c

nt. 5544-6170 of







SEQ ID NO: 494


494
6.00E−19
MODIFICATION

Lactococcus lactis

P34877
complement of




METHYLASE
subsp. cremoris

nt. 5019-6113 of




SCRFIA


SEQ ID NO: 494


497
0
transferrin-binding

Haemophilus

S70906
nt. 3251-4999 of




protein 1

influenzae (strain


SEQ ID NO: 497





PAK 12085)


50
5.00E−07
AcpP

Pasteurella

NP_246856.1
nt. 2-136 of






multocida


SEQ ID NO: 50


501
7.00E−50
conserved

Vibrio cholerae

NP_231403.1
complement of




hypothetical protein


nt. 3649-4872 of







SEQ ID NO: 501


501
0
type I restriction

Vibrio cholerae

NP_231400.1
complement of




enzyme HsdR,


nt. 1551-3440 of




putative


SEQ ID NO: 501


501
4.00E−13
ATP-dependent

Deinococcus

NP_295921.1
complement of




DNA helicase

radiodurans


nt. 5317-5844 of




RecG-related


SEQ ID NO: 501




protein


501
5.00E−11
conserved

Ureaplasma

NP_077868.1
complement of




hypothetical

urealyticum


nt. 5098-5769 of







SEQ ID NO: 501


504
2.00E−44
OUTER

Haemophilus

Q48218
complement of




MEMBRANE

influenzae


nt. 4681-5019 of




PROTEIN P2
AG30010

SEQ ID NO: 504




PRECURSOR




(OMP P2)


507
0
SpoT

Pasteurella

NP_245857.1
complement of






multocida


nt. 3685-5316 of







SEQ ID NO: 507


51
6.00E−87
glucosamine--

Vibrio cholerae

NP_230141.1
nt. 30-470 of




fructose-6-


SEQ ID NO: 51




phosphate




aminotransferase




(isomerizing)


512
2.00E−28
dipeptide transport

Yersinia pestis

NP_407439.1
complement of




system permease


nt. 1095-1580 of




protein


SEQ ID NO: 512


512
3.00E−82
SapC

Pasteurella

NP_245850.1
complement of






multocida


nt. 730-1095 of







SEQ ID NO: 512


514
9.00E−06
putative integral

Campylobacter

NP_281236.1
complement of




membrane protein

jejuni


nt. 577-684 of







SEQ ID NO: 514


514
3.00E−11
orf, hypothetical

Escherichia coli

NP_286004.1
complement of




protein
O157:H7 EDL933

nt. 449-568 of







SEQ ID NO: 514


518
0
putative inner

Neisseria

NP_284893.1
nt. 92-1927 of




membrane trans-

meningitidis


SEQ ID NO: 518




acylase protein
Z2491


519
4.00E−30
hypothetical protein

Mesorhizobium

NP_108196.1
complement of






loti


nt. 2221-3159 of







SEQ ID NO: 519


519
2.00E−12
conserved

Listeria innocua

NP_471067.1
complement of




hypothetical protein


nt. 3994-5241 of







SEQ ID NO: 519


519
6.00E−20
hypothetical protein

Mesorhizobium

NP_108198.1
complement of






loti


nt. 707-1552 of







SEQ ID NO: 519


519
4.00E−26
putative

Salmonella

NP_455526.1
complement of




bacteriophage

enterica subsp.


nt. 3982-5163 of




protein

enterica serovar


SEQ ID NO: 519






Typhi



52
3.00E−94
OUTER

Haemophilus

Q48218
nt. 45-788 of




MEMBRANE

influenzae


SEQ ID NO: 52




PROTEIN P2




PRECURSOR




(OMP P2)


520
0
excision nuclease

Escherichia coli

NP_418482.1
complement of




subunit A
K12

nt. 6309-7745 of







SEQ ID NO: 520


521
5.00E−08
zinc/manganese

Rickettsia conorii

NP_359651.1
nt. 2236-2652 of




ABC transporter


SEQ ID NO: 521




substrate binding




protein


521
 1.00E−140
unknown

Pasteurella

NP_245865.1
nt. 338-1390 of






multocida


SEQ ID NO: 521


521
1.00E−86
ORF_f432

Escherichia coli

AAB40463.1
nt. 203-1390 of







SEQ ID NO: 521


522
3.00E−22
unknown

Pasteurella

NP_246093.1
nt. 670-885 of






multocida


SEQ ID NO: 522


526
5.00E−33
exodeoxyribonuclease

Yersinia pestis

NP_404635.1
nt. 5582-6202 of




V alpha chain


SEQ ID NO: 526


526
1.00E−62
exodeoxyribonuclease

Vibrio cholerae

NP_231950.1
nt. 5675-6193 of




V, 67 kDa subunit


SEQ ID NO: 526


527
 1.00E−147
unknown

Pasteurella

NP_245980.1
nt. 4283-5203 of






multocida


SEQ ID NO: 527


527
0
Mfd

Pasteurella

NP_245978.1
nt. 7545-8759 of






multocida


SEQ ID NO: 527


527
0
transcription-repair

Salmonella

NP_455708.1
nt. 7611-8762 of




coupling factor

enterica subsp.


SEQ ID NO: 527




(TrcF)

enterica serovar







Typhi



527
0
PROBABLE

Ralstonia

NP_519763.1
nt. 7611-8870 of




TRANSCRIPTION-

solanacearum


SEQ ID NO: 527




REPAIR




COUPLING




FACTOR




PROTEIN


528
1.00E−48
undecaprenyl

Chlamydia

NP_297109.1
nt. 2918-3712 of




pyrophosphate

muridarum


SEQ ID NO: 528




synthetase


528
0
leucyl-tRNA

Vibrio cholerae

NP_230603.1
complement of




synthetase


nt. 180-2822 of







SEQ ID NO: 528


529
 1.00E−104
DNA PRIMASE

Legionella

P71481
complement of






pneumophila


nt. 3316-3960 of







SEQ ID NO: 529


534
9.00E−29
putative integrase

Salmonella

NP_461690.1
nt. 4668-5009 of






typhimurium LT2


SEQ ID NO: 534


534
6.00E−18
hypothetical protein

Neisseria

NP_283002.1
complement of




NMA0153

meningitidis


nt. 5933-6337 of





Z2491

SEQ ID NO: 534


534
2.00E−23
hypothetical protein

Deinococcus

NP_294868.1
nt. 6908-7654 of






radiodurans


SEQ ID NO: 534


534
1.00E−88
prophage CP4-57

Escherichia coli

NP_417111.1
nt. 5057-5875 of




integrase
K12

SEQ ID NO: 534


535
 1.00E−115
phosphate

Buchnera sp.

NP_240007.1
nt. 3385-4596 of




acetyltransferase
APS

SEQ ID NO: 535


536
3.00E−35
cobalt membrane

Actinobacillus

AAD49727.1
complement of




transport protein

pleuropneumoniae


nt. 3531-4136 of




CbiQ


SEQ ID NO: 536


536
6.00E−37
unknown

Pasteurella

NP_245305.1
complement of






multocida


nt. 6478-6921 of







SEQ ID NO: 536


539
2.00E−26
Orf122

Chlorobium

AAG12204.1
complement of






tepidum


nt. 1778-2008 of







SEQ ID NO: 539


540
1.00E−77
heat shock protein

Neisseria

NP_273864.1
complement of




HtpX

meningitidis


nt. 2567-3481 of





MC58

SEQ ID NO: 540


541
0
IleS

Pasteurella

NP_246601.1
nt. 3167-4549 of






multocida


SEQ ID NO: 541


545
2.00E−09
DNA-BINDING

Pectobacterium

Q47588
nt. 3816-3977 of




PROTEIN RDGB

carotovorum


SEQ ID NO: 545


545
2.00E−11
putative

Sinorhizobium

NP_437741.1
complement of




transposase

meliloti


nt. 2786-3019 of







SEQ ID NO: 545


545
2.00E−07
Hypothetical 42.5

Escherichia coli

BAA77933.1
complement of




kd protein in thrW-


nt. 2614-2811 of




argF intergenic


SEQ ID NO: 545




region


545
4.00E−18
putative IS element

Salmonella

NP_454711.1
nt. 1955-2230 of




transposase

enterica subsp.


SEQ ID NO: 545






enterica serovar







Typhi



546
0
HEME/HEMOPEXIN-

Haemophilus

P45354
nt. 5551-7809 of




BINDING

influenzae


SEQ ID NO: 546




PROTEIN


546
0
HEME/HEMOPEXIN

Haemophilus

P45356
nt. 3842-5536 of




UTILIZATION

influenzae


SEQ ID NO: 546




PROTEIN B


546
0
HEME/HEMOPEXIN

Haemophilus

P45357
nt. 1638-3176 of




UTILIZATION

influenzae


SEQ ID NO: 546




PROTEIN C


546
2.00E−12
HasR

Pasteurella

NP_246561.1
nt. 3149-3763 of






multocida


SEQ ID NO: 546


549
0
unknown

Pasteurella

NP_246821.1
nt. 2526-3512 of






multocida


SEQ ID NO: 549


549
 1.00E−121
putative membrane

Yersinia pestis

NP_404859.1
nt. 605-1108 of




protein


SEQ ID NO: 549


549
0
unknown

Pasteurella

NP_246822.1
nt. 1122-1664 of






multocida


SEQ ID NO: 549


551
 1.00E−157
type I restriction-

Xylella fastidiosa

NP_300016.1
complement of




modification
9a5c

nt. 7396-8322 of




system


SEQ ID NO: 551




endonuclease


552
 1.00E−100
valyl-tRNA

Deinococcus

NP_293872.1
complement of




synthetase

radiodurans


nt. 6691-8688 of







SEQ ID NO: 552


552
0
VALYL-TRNA

Haemophilus

P36432
complement of




SYNTHETASE

parainfluenzae


nt. 5850-6647 of







SEQ ID NO: 552


553
0
DNA-directed RNA

Vibrio cholerae

NP_229982.1
nt. 2668-6699 of




polymerase, beta


SEQ ID NO: 553




subunit


554
0
iron utilization

Haemophilus

T10887
nt. 991-2508 of




protein B

influenzae


SEQ ID NO: 554


559
 1.00E−100
PREPROTEIN

Bacillus firmus

P96313
nt. 3420-4472 of




TRANSLOCASE


SEQ ID NO: 559




SECA SUBUNIT


56
2.00E−23
RpL30

Pasteurella

NP_246336.1
complement of






multocida


nt. 656-832 of







SEQ ID NO: 56


56
9.00E−13
RpS5

Pasteurella

NP_246337.1
complement of






multocida


nt. 843-1064 of







SEQ ID NO: 56


560
 1.00E−157
Na+/H+ antiporter

Vibrio cholerae

NP_231535.1
2 complement of







nt. 279-2989 of







SEQ ID NO: 560


562
1.00E−72
putative biotin

Yersinia pestis

NP_404419.1
nt. 7862-8878 of




sulfoxide reductase


SEQ ID NO: 562




2


562
 1.00E−125
restriction

Neisseria

CAA09003.1
nt. 2-985 of




modification

meningitidis


SEQ ID NO: 562




system-R protein


563
0
IMMUNOGLOBULIN

Haemophilus

P45384
complement of




A1 PROTEASE

influenzae HK715


nt. 4127-9508 of







SEQ ID NO: 563


563
0
3-

Schizosaccharo-

O14289
nt. 1980-3983 of




ISOPROPYLMALATE

myces pombe


SEQ ID NO: 563




DEHYDRATASE




(IPMI)


564
2.00E−79
orf32

Haemophilus

NP_536839.1
nt. 6241-6831 of





phage HP2

SEQ ID NO: 564


564
7.00E−33
probable variable

Salmonella

NP_457882.1
nt. 3707-4177 of




tail fibre protein

enterica subsp.


SEQ ID NO: 564






enterica serovar







Typhi



564
2.00E−14
M protein
Enterobacteria
NP_052264.1
nt. 1905-2213 of





phage 186

SEQ ID NO: 564


564
4.00E−44
similar to tail fiber

Salmonella

NP_461635.1
nt. 3171-3692 of




protein (gpH) in

typhimurium LT2,


SEQ ID NO: 564




phage P2
Fels-2 prophage


564
2.00E−85
gpJ
Enterobacteria
NP_046773.1
nt. 2267-3166 of





phage P2

SEQ ID NO: 564


564
1.00E−24
hypothetical protein

Neisseria

NP_284534.1
nt. 6852-7334 of






meningitidis


SEQ ID NO: 564





Z2491


564
4.00E−26
gpV
Enterobacteria
NP_046771.1
nt. 1337-1912 of





phage P2

SEQ ID NO: 564


564
2.00E−47
similar to

Escherichia coli

BAA16182.1
nt. 11383-11961 of




[SwissProt P44255


SEQ ID NO: 564


564
2.00E−51
hypothetical protein

Neisseria

NP_284066.1
nt. 10452-11180 of




NMA1315

meningitidis


SEQ ID NO: 564





Z2491


564
0
orf31

Haemophilus

NP_043502.1
nt. 4160-6226 of





phage HP1

SEQ ID NO: 564


564
2.00E−09
rep

Haemophilus

NP_536816.1
compliment of





phage HP2

nt. 9986-10234 of







SEQ ID NO: 564


565
2.00E−57
resolvase/

Haemophilus

AAL47097.1
nt. 11885-12445 of




integrase-like

influenzae biotype


SEQ ID NO: 565




protein

aegyptius



565
1.00E−93
integrase

Actinobacillus

AAC70901.1
compliment of






actinomycetemco-


nt. 4118-4900 of






mitans


SEQ ID NO: 565


565
6.00E−35
probable phage

Salmonella

NP_458745.1
compliment of




integrase

enterica subsp.


nt. 4148-4990 of






enterica serovar


SEQ ID NO: 565






Typhi



565
 1.00E−107
hypothetical protein

Xylella fastidiosa

NP_299042.1
compliment of





9a5c

nt. 5066-6817 of







SEQ ID NO: 565


566
 1.00E−126
hypothetical protein

Haemophilus

S15287
compliment of




(galE 5′ region) -

influenzae


nt. 10726-11607 of







SEQ ID NO: 566


567
0
unknown

Pasteurella

NP_246387.1
nt. 5343-7688 of






multocida


SEQ ID NO: 567


568
 1.00E−151
multidrug

Escherichia coli

NP_311575.1
nt. 6-1403 of




resistance
O157:H7

SEQ ID NO: 568




membrane




translocase


568
 1.00E−141
YhbX/YhjW/YijP/Yj

Neisseria

|NP_275002.1
compliment of




dB family protein

meningitidis


nt. 11213-12634 of





MC58

SEQ ID NO: 568


570
 1.00E−180
hypothetical protein

Haemophilus

S71024
compliment of




3 (ksgA-lic2B

influenzae (strain


nt. 12845-13720 of




intergenic region)
RM7004)

SEQ ID NO: 570


571
0
glycerophospho-

Haemophilus

A43576
nt. 1656-2693 of




diester

influenzae (isolate


SEQ ID NO: 571




phosphodiesterase
772)


571
 1.00E−137
outer membrane

Haemophilus

A43604
nt. 6145-6909 of




protein P4

influenzae


SEQ ID NO: 571




precursor -





Haemophilus






influenzae



571
2.00E−72
CG8298 gene

Drosophila

AAF58597.1
nt. 3813-5339 of




product [alt 1]

melanogaster


SEQ ID NO: 571


572
1.00E−40
hypothetical protein

Chlamydia

G81737
nt. 3734-4099 of




TC0130

muridarum (strain


SEQ ID NO: 572





Nigg)


572
5.00E−10
hypothetical protein

Pyrococcus

NP_142215.1
nt. 4472-4888 of






horikoshii


SEQ ID NO: 572


572
3.00E−11
109aa long

Sulfolobus

NP_377117.1
nt. 7303-7470 of




hypothetical protein

tokodaii


SEQ ID NO: 572


572
8.00E−43
hypothetical protein

Chlamydophila

NP_445524.1
nt. 4289-4618 of






pneumoniae


SEQ ID NO: 572





AR39


572
9.00E−08
CDH1-D

Gallus gallus

AAL31950.1
nt. 7183-7521 of







SEQ ID NO: 572


575
 1.00E−173
topoisomerase B

Salmonella

NP_458624.1
nt. 18980-20923 of






enterica subsp.


SEQ ID NO: 575






enterica serovar







Typhi



575
 1.00E−100
DNA helicase

Salmonella

NP_458617.1
nt. 10399-11706 of






enterica subsp.


SEQ ID NO: 575






enterica serovar







Typhi



65
2.00E−53
Sufl

Pasteurella

NP_245041.1
nt. 3-821 of






multocida


SEQ ID NO: 65


67
4.00E−39
putative MFS

Salmonella

NP_462786.1
compliment of




family tranport

typhimurium LT2


nt. 125-1033 of




protein (1st mdule)


SEQ ID NO: 67


7
4.00E−29
putative membrane

Salmonella

NP_458664.1
compliment of




protein

enterica subsp.


nt. 2-559 of






enterica serovar


SEQ ID NO: 7






Typhi



72
2.00E−51
serine transporter

Vibrio cholerae

NP_230946.1
nt. 18-803 of







SEQ ID NO: 72


74
3.00E−90
hypothetical 21.8K

Haemophilus

JH0436
compliment of




protein (in locus

influenzae


nt. 248-766 of




involved in


SEQ ID NO: 74




transformation) -


77
2.00E−18
RecX protein

Legionella

CAC33485.1
nt. 480-920 of






pneumophila


SEQ ID NO: 77


82
4.00E−95
unknown

Pasteurella

NP_246414.1
nt. 128-955 of






multocida


SEQ ID NO: 82


83
2.00E−66
unknown

Pasteurella

NP_246777.1
nt. 5-556 of






multocida


SEQ ID NO: 83


83
6.00E−33
CTP SYNTHASE

Helicobacter

NP_223042.1
compliment of






pylori J99


nt. 1027-1338 of







SEQ ID NO: 83.


83
4.00E−34
CTP synthase

Campylobacter

NP_281249.1
compliment of






jejuni


nt. 1024-1275 of







SEQ ID NO: 83


84
1.00E−16
REPRESSOR
Bacteriophage
P14819
nt. 823-1233 of




PROTEIN CI
phi-80

SEQ ID NO: 84


84
2.00E−05
orf, hypothetical

Escherichia coli

NP_415875.1
compliment of




protein
K12

nt. 533-700 of







SEQ ID NO: 84


84
4.00E−11
orf33
bacteriophage phi
NP_490633.1
compliment of





CTX

nt. 32-466 of







SEQ ID NO: 84


85
3.00E−42
SpoT

Pasteurella

NP_245857.1
nt. 899-1261 of






multocida


SEQ ID NO: 85


90
 1.00E−103
putative methylase
Bacteriophage
NP_108695.1
compliment of





Tuc2009

nt. 478-1206 of







SEQ ID NO:90


90
4.00E−11
probable adenine

Thermoplasma

NP_394624.1
compliment of




specific DNA

acidophilum


nt. 397-1140 of




methyltransferase


SEQ ID NO: 90




















TABLE 3B






| Hit Identity | Full Length Nucleotide Sequence | Amino Acid Sequence | Location in Contig | Homology to Genbank Protein |
|---|---|---|---|---|
| CpdB | SEQ ID NO: 686 | SEQ ID NO: 687 | nt. 38041-36068 of SEQ ID NO: 681 (contig 14) | NP_246953.1 |
| putative membrane protein | SEQ ID NO: 688 | SEQ ID NO: 689 | nt. 906601-908094 of SEQ ID NO: 685 (contig 18) | NP_458664.1 |
| GTP-binding protein TypA/BipA | SEQ ID NO: 690 | SEQ ID NO: 691 | nt. 42557-40995 of SEQ ID NO: 683 (contig 16) | NP_240245.1 |
| outer membrane protein A | SEQ ID NO: 692 | SEQ ID NO: 693 | nt. 7000420-704187 of SEQ ID NO: 685 (contig 18) | T30852 |
| vacB protein | SEQ ID NO: 694 | SEQ ID NO: 695 | nt. 39184-36836 of SEQ ID NO: 683 (contig 16) | NP_240369.1 |
| putative ABC transport system permease protein [ | SEQ ID NO: 696 | SEQ ID NO: 697 | nt. 59155-58370 of SEQ ID NO: 685 (contig 18) | NP_282774.1 |
| putative exported protein | SEQ ID NO: 698 | SEQ ID NO: 699 | nt. 901142-902542 of SEQ ID NO: 685 (contig 18) | NP_458655.1 |
| ImpA | SEQ ID NO: 700 | SEQ ID NO: 701 | nt. 348187-347747 of SEQ ID NO: 685 (contig 18) | NP_245829.1 |
| TsaA | SEQ ID NO: 702 | SEQ ID NO: 703 | nt. 74941-75548 of SEQ ID NO: 684 (contig 17) | NP_245732.1 |
| PROBABLE TRANSPORT TRANSMEMBRANE PROTEIN | SEQ ID NO: 704; SEQ ID NO: 706 | SEQ ID NO: 705; SEQ ID NO: 707 | nt. 74436-75176 of SEQ ID NO: 685 (contig 18); nt. 75160-75660 of SEQ ID NO: 685 (contig 18) | NP_522358.1 |
| possible exported protein | SEQ ID NO: 708 | SEQ ID NO: 709 | nt. 899618-900262 of SEQ ID NO: 685 (contig 18) | NP_458653.1 |
| LICA PROTEIN | SEQ ID NO: 710 | SEQ ID NO: 711 | nt. 356917-355958 of SEQ ID NO: 685 (contig 18) | P14181 |
| HEME-BINDING PROTEIN A | SEQ ID NO: 712 | SEQ ID NO: 713 | nt. 26114-27739 of SEQ ID NO: 683 (contig 16) | P33950 |
| similar to BASEMENT MEMBRANE-SPECIFIC HEPARAN SULFATE PROTEOGLYCAN CORE PROTEIN PRECURSOR (HSPG) | SEQ ID NO: 714 | SEQ ID NO: 715 | nt. 311610-312683 of SEQ ID NO: 685 (contig 18) | XP_068727.1 |
| CzcD | SEQ ID NO: 716 | SEQ ID NO: 717 | nt. 34865-35542 of SEQ ID NO: 681 (contig 14) | NP_246276.1 |
| conserved hypothetical protein | SEQ ID NO: 718 | SEQ ID NO: 719 | nt. 194993-193977 of SEQ ID NO: 685 (contig 18) | NP_274972.1 |
| secretion protein SecD | SEQ ID NO: 720 | SEQ ID NO: 721 | nt. 203707-201857 of SEQ ID NO: 683 (contig 17) | NP_252510.1 |
| ABC transporter protein 1 | SEQ ID NO: 722 | SEQ ID NO: 723 | nt. 3943-5859 of SEQ ID NO: 681 (contig 14) | AAF31030.1 |
| conserved hypothetical protein | SEQ ID NO: 724; SEQ ID NO: 726; SEQ ID NO: 728 | SEQ ID NO: 725; SEQ ID NO: 727; SEQ ID NO: 729 | nt. 331090-331749 of SEQ ID NO: 685 (contig 18); nt. 331938-332492 of SEQ ID NO: 685 (contig 18); nt. 332681-33232 of SEQ ID NO: 685 (contig 18) | NP_273467.1 |
| INVASIN PRECURSOR (OUTER MEMBRANE ADHESIN) | SEQ ID NO: 730 | SEQ ID NO: 731 | nt. 416757-417020 of SEQ ID NO: 685 (contig 18) | P31489 |
| HEME/HEMOPEXIN-BINDING PROTEIN | SEQ ID NO: 732 | SEQ ID NO: 733 | nt. 229430-232195 of SEQ ID NO: 384 (contig 17) | P45355 |
| OPACITY PROTEIN OPA66 PRECURSOR | SEQ ID NO: 734 | SEQ ID NO: 735 | nt. 375592-375879 of SEQ ID NO: 384 (contig 17) | Q05033 |
| Hemoglobin-haptoglobin binding protein A | SEQ ID NO: 736 | SEQ ID NO: 737 | nt. 45709-42566 of SEQ ID NO: 681 (contig 14) | Q48153 |
| transport protein TatC | SEQ ID NO: 738 | SEQ ID NO: 739 | nt. 134452-135222 of SEQ ID NO: 384 (contig 17) | NP_253757.1 |
| LIPOPROTEIN NLPD | SEQ ID NO: 740 | SEQ ID NO: 741 | nt. 18895-20112 of SEQ ID NO: 682 (contig 15) | P40827 |
| Hemoglobin and hemoglobin-haptoglobin binding protein C precursor | SEQ ID NO: 742 | SEQ ID NO: 743 | nt. 34181-31041 of SEQ ID NO: 682 (contig 15) | Q9X442 |
| HimA | SEQ ID NO: 744 | SEQ ID NO: 745 | nt. 382795-383085 of SEQ ID NO: 685 (contig 18) | NP_245565.1 |
| transferrin-binding protein 1 | SEQ ID NO: 746 | SEQ ID NO: 747 | nt. 178537-175799 of SEQ ID NO: 683 (contig 16) | S70906 |
| SapC | SEQ ID NO: 748 | SEQ ID NO: 749 | nt. 197754-196867 of SEQ ID NO: 685 (contig 18) | NP_245850.1 |
| heat shock protein HtpX | SEQ ID NO: 750 | SEQ ID NO: 751 | nt. 40414-41265 of SEQ ID NO: 682 (contig 15) | NP_273864.1 |
| HEME/HEMOPEXIN-BINDING PROTEIN | SEQ ID NO: 752 | SEQ ID NO: 753 | nt. 229430-232195 of SEQ ID NO: 684 (contig 17) | P45354 |
| HEME/HEMOPEXIN UTILIZATION PROTEIN B | SEQ ID NO: 754 | SEQ ID NO: 755 | nt. 227721-229418 of SEQ ID NO: 684 (contig 17) | P45356 |
| HEME/HEMOPEXIN UTILIZATION PROTEIN C | SEQ ID NO: 756 | SEQ ID NO: 757 | nt. 225516-227645 of SEQ ID NO: 684 (contig 17) | P45357; NP_246561.1 |
| iron utilization protein B | SEQ ID NO: 758 | SEQ ID NO: 759 | nt. 32076-33611 of SEQ ID NO: 684 (contig 17) | T10887 |
| PREPROTEIN TRANSLOCASE SECA SUBUNIT | SEQ ID NO: 760 | SEQ ID NO: 761 | nt. 82314-84785 of SEQ ID NO: 683 (contig 16) | P96313 |
| IMMUNOGLOBULIN A1 PROTEASE | SEQ ID NO: 762 | SEQ ID NO: 763 | nt. 171647-166263 of SEQ ID NO: 683 (contig 16) | P45384 |
| multidrug resistance membrane translocase | SEQ ID NO: 764 | SEQ ID NO: 765 | nt. 74524-72992 of SEQ ID NO: 683 (contig 16) | NP_311575.1 |
| YhbX/YhjW/YijP/YjdB family protein | SEQ ID NO: 766 | SEQ ID NO: 767 | nt. 61734-63200 of SEQ ID NO: 683 (contig 16) | NP_275002.1 |
| putative membrane protein | SEQ ID NO: 768 | SEQ ID NO: 769 | nt. 906601-908094 of SEQ ID NO: 685 (contig 18) | NP_458664.1 |
| putative membrane protein | SEQ ID NO: 770 | SEQ ID NO: 771 | nt. 16185-17942 of SEQ ID NO: 683 (contig) | NP_404859.1 |









Example 3
Construction of the NTHi Promoter Trap Library

To identify potential virulence determinants of NTHi, bacterial gene expression was monitored by differential fluorescence induction (DFI) during early disease progression in one specific anatomical niche of a chinchilla model of NTHi-induced otitis media (OM). Genomic DNA fragments from NTHi strain 86-028NP were cloned upstream of the promoterless gfpmut3 gene to generate a promoter trap library. Plasmid pGZRS39A, a derivative of pGZRS-1 isolated from Actinobacillus pleuropneumoniae, is an A. pleuropneumoniae-Escherichia coli shuttle vector. This plasmid contains the origin of replication from A. pleuropneumoniae, the lacZα gene from pUC19 and the kanamycin resistance gene from Tn903 (West et al., Genes, 160: 81-86, 1995).


The promoter trap vector was constructed by cloning the GFP mutant gfpmut3 gene, as a BamHI to EcoRI fragment, into pGZRS39A to form pRSM2167. This mutant GFP gene contains two amino acid changes, S65G and S72A, that enhance fluorescence emission when excited at 488 nm. This mutant also has high solubility and fast kinetics of chromophore formation (Cormack et al., Gene, 173: 33-38, 1996). This plasmid was transformed by electroporation into NTHi strain 86-028NP, generating the parent-plasmid strain 86-028NP/pRSM2169.


Random genomic DNA fragments (described in Example 1) were prepared for ligation into the promoter probe vector. Genomic DNA was isolated from strain 86-028NP using the Puregene DNA isolation kit (Gentra Systems, Minneapolis, Minn.) according to the manufacturer's protocol. Due to restriction barriers, it was necessary to isolate the plasmid DNA and use this for the library generation. The isolated DNA was partially digested with Sau3AI (NEB, Beverly, Mass.; 0.25 units/μg DNA) for 1 hour at 37° C., separated by gel electrophoresis and DNA fragments 0.5-1.5 kb in size were recovered using the Qiagen gel extraction kit. For vector preparation, pRSM2167 was isolated from an overnight culture using the Wizard Plus Maxiprep DNA purification system (Promega, Madison Wis.) according to the manufacturer's protocol.


Plasmid DNA was linearized by BamHI digestion and 5′ phosphate groups were removed by treatment with calf intestinal alkaline phosphatase (CIAP; GibcoBRL Life Technologies). Genomic DNA fragments were ligated with the linearized, phosphatase-treated vector and electroporated into competent NTHi strain 86-028NP prepared for electroporation according to a modified protocol (Mitchell et al., Nucleic Acids Res., 19: 3625-3628, 1991). When plasmid DNA was electroporated back into NTHi strain 86-028NP, transformation efficiency was improved one thousand-fold. Briefly, cells were grown to an OD600 of 0.3 in sBHI (brain heart infusion) broth at 37° C., 220 rpm. Cells were chilled on ice for 30 minutes and subsequently washed with an equal volume of 0.5×SG (1×SG: 15% glycerol, 272 mM sucrose) at 4° C. Washes were repeated for a total of three times. Subsequently, the cells were diluted in 1×SG to a 100× concentrated volume. The cells were electroporated using the BioRad Gene Pulser II set at 200 ohms, 2.5 kV and 25 μF and then diluted in 1 ml prewarmed sBHI, incubated for 2 hours at 37° C., 5% CO2 and plated on chocolate agar for overnight growth of transformants.


Transformants were selected and frozen in pools of 1000 clones in skim milk containing 20% glycerol (vol/vol). A 68,000 member gfp promoter probe library was generated. Using the probability calculation of Clarke and Carbon (Cell, 9: 91-99, 1976), to achieve a 99% probability of having a given DNA sequence represented in a library of 300 bp fragments of strain 86-028NP DNA (1.8×10⁶ bp/genome), a library of 27,629 clones was needed. Therefore the present library represents 2.5-fold coverage of the 86-028NP genome.
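The library-size figure above can be checked with a short calculation. The following is a minimal sketch, assuming the usual form of the Clarke and Carbon relation, N = ln(1 − P) / ln(1 − f), where f is the fraction of the genome carried by one insert; the function name `clarke_carbon_clones` is hypothetical and chosen only for illustration.

```python
import math


def clarke_carbon_clones(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of clones needed so that a given sequence is represented
    in the library with the requested probability."""
    fraction = insert_bp / genome_bp  # fraction of the genome in one insert
    n = math.log(1.0 - probability) / math.log(1.0 - fraction)
    return math.ceil(n)


# Values stated in the text: 99% probability, 300 bp inserts, 1.8e6 bp genome.
needed = clarke_carbon_clones(0.99, 300, 1.8e6)

# Coverage of the 68,000 member library relative to the required size.
coverage = 68_000 / needed

print(needed, round(coverage, 1))  # 27629 2.5
```

With the stated inputs the sketch reproduces the 27,629 clones and the approximately 2.5-fold coverage reported above.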


In order to assess the quality of the library, fifty clones were selected at random, grown overnight on chocolate agar and the plasmids were isolated and insert DNA sequenced. A majority (64%) of the selected clones had insert sizes ranging between 200 and 500 bp while 32% exceeded 500 bp. The majority of inserts showed homology to unique H. influenzae strain Rd open reading frames (ORFs), and 15 clones had sequence unique to strain 86-028NP DNA. Of those clones with homology to strain Rd, 60% were in the correct orientation, 36% of which contained sequence upstream of an ORF. Although a majority of clones had an insert size less than 500 bp, no correlation was found between small insert size and increased GFP expression. In fact, four clones exhibited slight to moderate fluorescence in vitro, three of which had insert sizes between 200 and 500 base pairs and one of which had an insert greater than 700 base pairs.


A fraction of the library (approximately 1000 clones) was grown on chocolate agar, harvested in PBS and analyzed by flow cytometry for GFP fluorescence. Compared to strain 86-028NP/pRSM2169 that contains the promoter trap vector without insert DNA, the pool of library clones displays an increased fluorescence intensity. Thus, the library contains clones with promoters at varying levels of activity.


Example 4
Analysis of 86-028NP Derivatives Expressing GFP

In order to establish the FACS parameters necessary to identify and sort gfp-expressing bacteria, a panel of isolates demonstrating varying levels of gfp expression was utilized. Background fluorescence was assessed using strain 86-028NP/pRSM2169 (negative control); any fluorescence observed in this strain would be due to the lacZ promoter driving gfp expression. However, this strain does not produce detectable levels of GFP and, in fact, does not demonstrate increased fluorescence when compared to the parent strain 86-028NP. A high-level gfp-expressing isolate was generated by cloning a 500 bp fragment containing the strong promoter for outer membrane protein P2 expression into SalI-BamHI digested pRSM2167. This plasmid was transformed into 86-028NP by electroporation, generating the high-level gfp-expressing strain 86-028NP/pRSM2211 (highly fluorescent control). This strain demonstrated an approximately 100-fold increase in GFP fluorescence compared to strain 86-028NP/pRSM2169. An intermediate fluorescent derivative clone, 86-028NP/pKMM4B5 (intermediate fluorescent control), was isolated by FACS analysis and used both in preliminary experiments and as a control for cell sorting. The DNA fragment containing a promoter driving gfp expression in vitro is unique to strain 86-028NP, having no known homology to DNA of other organisms. This clone exhibits an approximately 10-fold increase in fluorescence compared to strain 86-028NP/pRSM2169.


The control strains were resuspended from growth on chocolate agar and labeled with cross-reactive Phycoprobe R-PE anti-human IgG (H+L) antibody (10 μg/ml in 100 μl PBS; Biomeda Corp) for 30 minutes at 4° C. Following three successive washes to remove unbound antibody, bacteria were resuspended in 300 μl DPBS for FACS analysis. These control preparations were used to set the appropriate size and fluorescence gates using a Coulter Epics Elite flow cytometer (Coulter Corp.) equipped with an argon laser emitting at 488 nm. Bacteria were gated for size based on log forward angle and side scatter detection and for sorting by FITC/PE labeling of bacteria. Sorted cells were collected into cold sBHI and plated on chocolate agar. After overnight growth, cells were collected for a secondary round of infection or were individually selected and grown overnight, screened by individual clone for fluorescence when grown in vitro, and frozen in skim milk containing 20% (vol/vol) glycerol prior to plasmid isolation and sequencing of insert DNA. Sorting efficiency of control strains was confirmed using a Coulter EPICS flow cytometer (Coulter Corp.).


Many plasmids were segregated rapidly in vitro in the absence of antibiotic selection. Thus, in order to assess whether the promoter trap vector used here was prone to this event, a single colony of strain 86-028NP/pRSM2211 (highly fluorescent control) was isolated on chocolate agar and passaged 20 times in the absence of antibiotic selection. No significant decrease in fluorescence intensity was observed when compared to bacteria grown in the presence of antibiotic. In addition, the plasmid is maintained in the absence of antibiotic selection in vivo. Similar bacterial counts were observed when bacteria-containing middle ear fluids collected from a chinchilla were plated on chocolate agar with or without kanamycin. These data demonstrate that the promoter trap vector was stably maintained in the absence of antibiotic selection.


In addition to problems with plasmid stability, early studies on the use of GFP as a reporter to study host-pathogen interactions demonstrated that GFP could be continuously synthesized as a cytoplasmic protein with low toxicity, having minimal effects on the bacterial cell-surface dynamics (Chalfie et al., Science, 263: 802-805, 1994). The construction of a high level gfp-expressing derivative allowed the assessment of the GFP toxicity on NTHi. Growth curves of both the wild-type strain (86-028NP) and the high GFP producing strain 86-028NP/pRSM2211 were compared when grown under similar conditions. The growth rates were similar, indicating that GFP expression was not toxic to the cells.


The 86-028NP gfp-expressing derivatives were used to define the parameters for efficient cell sorting. Strain 86-028NP/pRSM2169 was mixed with the intermediate gfp-expressing derivative, strain 86-028NP/pKMM4B5, at a 100:1 ratio, simulating the in vivo environment that is expected to contain a small percentage of gfp-expressing clones relative to the total bacterial population. This mixture was subjected to FACS analysis, collecting the 1.8% most fluorescent population and the 52% least fluorescent population. Flow cytometric analysis of the sorted populations revealed an enrichment of strain 86-028NP/pKMM4B5 to 65% of the bacterial population, a phenomenon that was not observed when sorting on the negative population. Subsequent rounds of sorting would be expected to further enrich for this intermediate fluorescent population. The inability to decrease the amount of fluorescent bacteria in the negative sort was attributed to the size of the gate set for negative sorting. GFP-negative cells were enriched by gating on the 10% least fluorescent population.
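The degree of enrichment achieved by the positive sort can be estimated from the figures above. This is a minimal sketch, assuming the 100:1 mixture means 100 parts of strain 86-028NP/pRSM2169 to 1 part of strain 86-028NP/pKMM4B5; the function name `fold_enrichment` is hypothetical.

```python
def fold_enrichment(negative_parts: float, positive_parts: float,
                    sorted_positive_fraction: float) -> float:
    """Ratio of the positive fraction after sorting to the positive
    fraction in the starting mixture."""
    starting_fraction = positive_parts / (negative_parts + positive_parts)
    return sorted_positive_fraction / starting_fraction


# 100:1 starting mixture; 65% of the sorted population was the
# intermediate gfp-expressing strain.
start_fraction = 1 / (100 + 1)
enrichment = fold_enrichment(100, 1, 0.65)

print(f"starting fraction: {start_fraction:.2%}")
print(f"fold enrichment after one sort: {enrichment:.1f}")
```

Under this assumption the fluorescent strain rises from roughly 1% of the mixture to 65% of the sorted population in a single round, an enrichment of about 65-fold.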


Example 5
Direct Labeling of Bacteria from Middle Ear Fluids

A similar strategy (as described in Example 4) was applied to sort fluorescent clones from effusions obtained from the chinchilla middle ear during AOM. Our ability to use differential fluorescence induction (DFI) in vivo was dependent upon our ability to sort gfp-expressing bacteria from non-fluorescent bacteria, fluorescent and non-fluorescent cellular debris, and eukaryotic cells.


Healthy adult chinchillas (Chinchilla lanigera) with no evidence of middle ear infection by either otoscopy or tympanometry were used to screen the library for promoter activity in vivo. Two pools of the NTHi/pRSM2169 library (1000 clones each) were grown overnight on chocolate agar containing kanamycin. The library was combined and diluted in cold 10 mM sterile PBS to 3.3×10⁶ CFU/ml and 300 μl (1.0×10⁶ CFU; 500 CFU/clone) was used to inoculate the left and the right chinchilla transbullar cavity (2000 clones/ear). OM development was monitored by video otoscopy and tympanometry at 24 and 48 hours. The bacteria multiplied in the middle ear cavity, reaching a concentration 500 times the inoculum dose by 48 hours as expected (Bakaletz et al., Infect. Immunity 67: 2746-62, 1999). This bacterial adaptation to the host environment results in an inflammatory response, indicated by erythema, vessel dilation and bulging of the tympanic membrane, infiltration of polymorphonuclear cells (PMN's), and accumulation of fluid in the middle ear cavity as observed by otoscopy and microscopic examination of recovered effusions. Twenty-four and 48 hours later, middle ear fluids were retrieved by epitympanic tap, and prepared for FACS.
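The inoculum figures above follow from simple arithmetic, sketched below. The variable names are hypothetical; the values are those stated in the text, and the 48 hour estimate assumes the 500-fold increase applies directly to the inoculated dose.

```python
# Values stated in the text.
concentration_cfu_per_ml = 3.3e6   # diluted library
inoculum_volume_ml = 0.300         # 300 microliters per ear
clones_per_ear = 2000              # two pools of 1000 clones each

# Dose delivered to each ear and the average representation of each clone.
dose_cfu = concentration_cfu_per_ml * inoculum_volume_ml
cfu_per_clone = dose_cfu / clones_per_ear

# Assumption: the reported 500-fold increase by 48 hours applies to the dose.
estimated_48h_cfu = 500 * dose_cfu

print(f"dose per ear: {dose_cfu:.2e} CFU")
print(f"CFU per clone: {cfu_per_clone:.0f}")
print(f"estimated load at 48 h: {estimated_48h_cfu:.2e} CFU")
```

The computed dose of about 9.9×10⁵ CFU and about 495 CFU per clone agree with the rounded values of 1.0×10⁶ CFU and 500 CFU/clone given above.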


It is important to note that this analysis was limited to those bacteria recoverable in the middle ear fluid. In some cases it was necessary to lavage the middle ear cavity to collect the bacteria for FACS analysis. Thus, this analysis includes genes up-regulated when NTHi are loosely adherent to mucosae. NTHi has been observed to form a biofilm in the middle ear cavity in a chinchilla model of OM (Ehrlich et al., JAMA, 287: 1710-5, 2002). Since the protocols described herein select for clones recovered from the planktonic population, it is not expected to recover those clones in which genes are up-regulated when the bacteria are associated with mucosal biofilms. Homogenization of middle ear mucosae and subsequent bacterial cell isolation, however, would enable us to recover these clones. It is also possible that some GFP-expressing clones were recovered in the effusion, yet were adherent to eukaryotic cells present in the effusion as exfoliated cells, or in aggregates. These bacteria are difficult to recover from the effusion without compromising the sorting efficiency. Therefore the middle ear fluids were treated with a mucolytic agent, then centrifuged to remove large aggregates and eukaryotic cells prior to labeling.


Chinchilla middle ear fluids were diluted, if necessary, to 250 μl with sterile saline. An equal volume of N-acetyl-L-cysteine (0.5%; w/v) in DPBS (pH 7.4) was added for 5 minutes at room temperature as a mucolytic agent (Miyamoto and Bakaletz, Microb. Pathog., 21: 343-356, 1996). Fluids were centrifuged (300×g, 5 min) to remove cellular debris, red blood cells and inflammatory cells, and supernatants containing bacteria were transferred to a fresh tube. Bacteria were incubated with chinchilla antiserum (1:50 dilution) directed against a whole OMP preparation, derived from NTHi strain 86-028NP, for 45 minutes at 4° C., pelleted by centrifugation (2000×g, 5 min) and washed twice with cold DPBS containing 0.05% bovine serum albumin. Bacteria were subsequently labeled with cross-reactive Phycoprobe R-PE anti-human IgG (H+L) antibody (10 μg/ml in 100 μl PBS; Biomeda Corp) for 30 minutes at 4° C. Following three successive washes to remove unbound antibody, cells were resuspended in 300 μl DPBS for FACS analysis.


Example 6
Identification of Promoters Induced In Vivo in Acute Otitis Media


H. influenzae 86-028NP transformed with the promoter trap library was grown overnight on chocolate agar. To select against those clones containing promoters that expressed gfp in vitro, the library was subjected to one round of FACS analysis (as described in Example 5), collecting only those clones expressing low-level amounts of GFP. These clones were pooled and used to inoculate the chinchilla middle ear transbullarly. Following 24 and 48 hours of infection, bacteria-containing effusions were removed by epitympanic tap. Bacteria were indirectly labeled with R-PE-labeled antibody and subjected to FACS analysis by gating on fluorescently tagged bacteria but sorting for those that were also expressing GFP. These clones were used to reinfect animals for further enrichment. Following the final round of sorting, single colony isolates were screened in vitro for lack of fluorescence.


Those clones isolated by FACS analysis (positive for GFP fluorescence in vivo), which did not emit fluorescence in vitro were prepared for plasmid isolation and identification of insert DNA sequence. These clones were grown overnight on chocolate agar plates containing kanamycin and prepared for plasmid isolation using the Qiaprep Miniprep Kit (Qiagen) according to the manufacturer's protocol. Plasmid insert DNA was sequenced using the primer 5′-TGCCCATTAACATCACCATCTA-3′ (SEQ ID NO: 588) that is complementary to the gfpmut3 gene and downstream of the insert DNA. Sequencing reactions were performed using the ABI prism BigDye® terminator cycle sequencing ready reaction kit (Applied Biosystems) according to manufacturer's protocol using a GeneAmp PCR System 9700 (Applied Biosystems). The sequences were then purified by passage through sephadex G-50 in a 96-well multiscreen HV plate (Millipore) and subsequently analyzed on an ABI Prism 3100 DNA analyzer (Applied Biosystems).


Insert sequences were compared to the complete annotated sequence of H. influenzae strain Rd. Those inserts with no nucleotide homology to strain Rd were subsequently analyzed using the BLASTN and BLASTX algorithms. Further sequence analysis was performed with DNASTAR (Madison, Wis.). Inserts in the correct orientation and containing sequence 5′ to a predicted ORF contained a putative promoter that was preferentially active when the NTHi bacteria were in the chinchilla middle ear.


Fifty-two clones with putative promoters that were regulated in vivo were isolated. Of the 44 candidate clones containing sequence similar to that identified in H. influenzae strain Rd, quantitative comparison of gene expression in vitro and in vivo confirmed up-regulated gene expression for twenty-six genes (60%) when NTHi respond to environmental cues present in the chinchilla middle ear. These genes are summarized in Table 4A below. The in vivo-regulated promoters drive expression of genes predicted to be involved in membrane transport, environmental information processing, cellular metabolism and gene regulation, as well as genes encoding hypothetical proteins with unknown function.


In order to confirm the induction of putative promoter candidates in vivo, the relative amount of messenger RNA expression was compared when NTHi strain 86-028NP was grown in vitro to mid-log phase or in vivo for 48 hours. The RNA was isolated using TRIzol LS reagent (Gibco Life Technologies) according to the manufacturer's protocol. DNA was removed from the RNA preparation using the DNA-free kit (Ambion) according to the manufacturer's protocol. DNase I treated RNA samples were purified by passage through a Qiagen RNeasy column. RNA purity and integrity were assessed by 260/280 nm spectrophotometer readings and on the Agilent 2100 Bioanalyzer (Agilent Technologies), respectively.


In order to independently confirm the FACS data, we determined the relative expression of candidate genes by quantitative RT-PCR. The parent strain 86-028NP was used for these studies. Transcription levels were assessed by real-time quantitative RT-PCR using the one-step QuantiTect SYBR Green RT-PCR kit (Qiagen) according to the manufacturer's instructions. Briefly, using primers generated to an open reading frame downstream of the putative in vivo-induced promoters identified by FACS analysis, gene-specific mRNA was reverse transcribed and amplified by RT-PCR on the ABI Prism 7700 sequence detection system (Applied Biosystems). The amount of product was calculated using a standard curve generated from known amounts of bacterial genomic DNA (10²-10⁷ genomic copies DNA) by amplifying a fragment of the gyrase (gyr) gene. Controls were analyzed in parallel to verify the absence of DNA in the RNA preparation (−RT control) as well as the absence of primer dimers in control samples lacking template RNA. In addition, RT-PCR products were analyzed by gel electrophoresis and, in all cases, a single product was observed at the appropriate base pair size. Amounts of bacterial RNA between samples were normalized relative to gyr expression, which was shown to be constitutively expressed under the various growth conditions that we tested in vitro. Relative gene expression in vivo was compared to that of gene expression in vitro and data expressed as fold-increase are summarized in Table 4A.
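The quantitation described above (standard curve, normalization to gyr, fold-increase) can be sketched as follows. This is an illustrative outline only, not the analysis software used in the study: the function names are hypothetical, all threshold-cycle (Ct) values are invented for the example, and it assumes a single genomic DNA standard curve is applied to both the target gene and gyr.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate a copy number from a measured Ct using the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def fold_induction(target_vivo, gyr_vivo, target_vitro, gyr_vitro):
    """Ratio of gyr-normalized expression in vivo to that in vitro."""
    return (target_vivo / gyr_vivo) / (target_vitro / gyr_vitro)


# Hypothetical standards: 10^2 to 10^7 genomic copies with invented Ct values.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [33.4, 30.1, 26.8, 23.5, 20.2, 16.9]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical sample Ct values for one candidate gene and for gyr.
target_vivo = copies_from_ct(23.5, slope, intercept)
gyr_vivo = copies_from_ct(20.2, slope, intercept)
target_vitro = copies_from_ct(26.8, slope, intercept)
gyr_vitro = copies_from_ct(20.2, slope, intercept)

fold = fold_induction(target_vivo, gyr_vivo, target_vitro, gyr_vitro)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, fold induction={fold:.1f}")
```

Normalizing each sample to gyr before taking the in vivo to in vitro ratio means that differences in total RNA input between samples cancel out, which is the purpose of the gyr normalization described above.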


The 8-fold sequencing of the NTHi genome identified the full length open reading frames for the majority of genes listed in Table 4A. Table 4B provides the full length nucleotide sequence within the NTHi genome and the corresponding amino acid sequence. The fold induction of the gene due to environmental cues present in the chinchilla middle ear and the product or function of the gene are repeated in Table 4B for convenience.














TABLE 4A

| Category | Gene or ORF | SEQ ID NO: | GenBank Protein ID | Fold Induction | Product or Function |
|---|---|---|---|---|---|
| Amino acid metabolism | hisB | 589 | NP_438632 | 2.9 | Histidine biosynthesis bifunctional protein |
| Lipoprotein | lppB | 590 | NP_438862.1 | 2.6 | Lipoprotein B homologue |
| Membrane transport | sapA | 591 | NP_439780.1 | 2.8 | Peptide ABC transporter; periplasmic SapA precursor |
| | lolA | 592 | NP_439736.1 | 2.4 | Outer membrane lipoproteins carrier protein precursor |
| | rbsC | 593 | NP_438661.1 | 5.1 | Ribose transport system permease protein |
| Purine synthesis | purE | 594 | NP_439757.1 | 51.7 | Phosphoribosylaminoimidazole carboxylase catalytic subunit; PurE |
| Biosynthetic and metabolic functions | ribB | 595 | NP_438923.1 | 8.3 | 3,4-dihydroxy-2-butanone 4-phosphate synthase; riboflavin biosynthesis |
| | arcB | 596 | NP_438753.1 | 10 | Ornithine carbamoyltransferase; arginine degradation |
| | uxuA | 597 | NP_438228.1 | 3.1 | Mannonate dehydratase; production of glyceraldehyde 3-phosphate |
| | dsbB | 598 | NP_438589.1 | 2.6 | Disulfide oxidoreductase; disulfide bond formation protein B |
| | ureH | 599 | NP_438693.1 | 3.9 | Urease accessory protein |
| | licC | 600 | NP_439688.1 | 2.3 | Phosphocholine (ChoP) cytidylyltransferase |
| | HI1647 | 601 | NP_439789.1 | 2.0 | Putative pyridoxin biosynthesis protein; singlet oxygen resistance protein |
| DNA replication, repair | ispZ | 602 | P43810 | 2.5 | Probable intracellular septation protein |
| | radC | 603 | NP_439113.1 | 2.1 | DNA repair protein |
| | mukF | 604 | P45185 | 2.0 | MukF protein homologue; remodeling of nucleoid structure |
| Gene regulation | glpR | 605 | NP_438777.1, NP_439170.1 | 2.8 | Glycerol-3-phosphate regulon repressor |
| | ihfB | 606 | P43724 | 2.5 | Integration host factor beta subunit |
| | argR | 607 | NP_439365.1 | 2.7 | Arginine repressor |
| | cspD | 608 | NP_439584.1 | 2.1 | Cold shock like protein; stress response protein |
| Hypothetical or unknown proteins | HI0094 | 609 | NP_438267.1 | 8.3 | Hypothetical protein |
| | HI1163 | 610 | NP_439321.1 | 2.3 | Conserved hypothetical protein; putative oxidase |
| | HI1063 | 611 | NP_439221.1 | 2.7 | Hypothetical protein |
| | HI0665 | 612 | NP_438824.1 | 2.8 | Hypothetical protein |
| | HI1292 | 613 | NP_439444.1 | 2.6 | Hypothetical protein |
| | HI1064 | 614 | NP_439222.1 | 2.6 | Hypothetical protein |






















TABLE 4B

| Category | Gene or ORF | Full Length Nucleotide Sequence | Amino Acid Sequence | Location in Contig | Fold Induction | Product or Function |
|---|---|---|---|---|---|---|
| Amino acid metabolism | hisB | SEQ ID NO: 615 | SEQ ID NO: 616 | nt. 68378-67290 of SEQ ID NO: 680 (contig 13) | 2.9 | Histidine biosynthesis bifunctional protein |
| Membrane transport | sapA | SEQ ID NO: 617 | SEQ ID NO: 618 | nt. 200403-198709 of SEQ ID NO: 685 (contig 18) | 2.8 | Peptide ABC transporter; periplasmic SapA precursor |
| | rbsC | SEQ ID NO: 619 | SEQ ID NO: 620 | nt. 42773-41802 of SEQ ID NO: 680 (contig 13) | 5.1 | Ribose transport system permease protein |
| Purine synthesis | purE | SEQ ID NO: 621 | SEQ ID NO: 622 | nt. 219625-219131 of SEQ ID NO: 685 (contig 18) | 51.7 | Phosphoribosylaminoimidazole carboxylase catalytic subunit; PurE |
| Biosynthetic and metabolic functions | ribB | SEQ ID NO: 623 | SEQ ID NO: 624 | nt. 131537-132184 of SEQ ID NO: 682 (contig 15) | 8.3 | 3,4-dihydroxy-2-butanone 4-phosphate synthase; riboflavin biosynthesis |
| | arcB | SEQ ID NO: 625 | SEQ ID NO: 626 | nt. 49710-48706 of SEQ ID NO: 681 (contig 14) | 10 | Ornithine carbamoyltransferase; arginine degradation |
| | uxuA | SEQ ID NO: 627 | SEQ ID NO: 628 | nt. 840671-841855 of SEQ ID NO: 685 (contig 18) | 3.1 | Mannonate dehydratase; production of glyceraldehyde 3-phosphate |
| | dsbB | SEQ ID NO: 629 | SEQ ID NO: 630 | nt. 388050-388583 of SEQ ID NO: 684 (contig 17) | 2.6 | Disulfide oxidoreductase; disulfide bond formation protein B |
| | ureH | SEQ ID NO: 631 | SEQ ID NO: 632 | nt. 4452-5267 of SEQ ID NO: 680 (contig 13) | 3.9 | Urease accessory protein |
| | licC | SEQ ID NO: 633 | SEQ ID NO: 634 | nt. 355083-354382 of SEQ ID NO: 685 (contig 18) | 2.3 | Phosphocholine (ChoP) cytidylyltransferase |
| | HI1647 | SEQ ID NO: 635 | SEQ ID NO: 636 | nt. 664017-664892 of SEQ ID NO: 685 (contig 18) | 2.0 | Putative pyridoxin biosynthesis protein; singlet oxygen resistance protein |
| DNA replication, repair | ispZ | SEQ ID NO: 637 | SEQ ID NO: 638 | nt. 4512-5069 of SEQ ID NO: 683 (contig 16) | 2.5 | Probable intracellular septation protein |
| | radC | SEQ ID NO: 639 | SEQ ID NO: 640 | nt. 132695-132030 of SEQ ID NO: 683 (contig 16) | 2.1 | DNA repair protein |
| | mukF | SEQ ID NO: 641 | SEQ ID NO: 642 | nt. 504549-503215 of SEQ ID NO: 685 (contig 18) | 2.0 | MukF protein homologue; remodeling of nucleoid structure |
| Gene regulation | glpR | SEQ ID NO: 643 | SEQ ID NO: 644 | nt. 72716-73483 of SEQ ID NO: 682 (contig 15) | 2.8 | Glycerol-3-phosphate regulon repressor |
| | ihfB | SEQ ID NO: 645 | SEQ ID NO: 646 | nt. 661004-660720 of SEQ ID NO: 685 (contig 18) | 2.5 | Integration host factor beta subunit |
| | argR | SEQ ID NO: 647 | SEQ ID NO: 648 | nt. 178540-178085 of SEQ ID NO: 685 (contig 18) | 2.7 | Arginine repressor |
| | cspD | SEQ ID NO: 649 | SEQ ID NO: 650 | nt. 435310-435528 of SEQ ID NO: 685 (contig 18) | 2.1 | Cold shock like protein; stress response protein |
| Hypothetical or unknown proteins | HI1163 | SEQ ID NO: 651 | SEQ ID NO: 652 | nt. 137202-134119 of SEQ ID NO: 685 (contig 18) | 2.3 | Conserved hypothetical protein; putative oxidase |
| | HI1063 | SEQ ID NO: 653 | SEQ ID NO: 654 | nt. 35158-34937 of SEQ ID NO: 685 (contig 18) | 2.7 | Hypothetical protein |
| | HI0665 | SEQ ID NO: 655 | SEQ ID NO: 656 | nt. 17949-18980 of SEQ ID NO: 679 (contig 12) | 2.8 | Hypothetical protein |
| | HI1292 | SEQ ID NO: 657 | SEQ ID NO: 658 | nt. 555002-555799 of SEQ ID NO: 685 (contig 18) | 2.6 | Hypothetical protein |









Example 7
Identification of Virulence-Associated Genes

In many bacterial species, a subset of virulence-associated genes is regulated by errors in replication of short repeats. These repeats may be 5′ to a gene or in the coding sequence, and their presence is an indication of controlled expression of the gene, which indicates association with virulence. Addition or deletion of a repeat results in the expression or lack of expression of the particular virulence determinant.


The NTHi strain 86-028NP contig set was queried for short oligonucleotide repeats. The region surrounding each repeat was analyzed to identify the gene(s) associated with the repeat. Table 5 lists the identified repeats and the ORF (identified by BLAST) associated with each repeat.


Further sequence analysis has identified the full length nucleotide sequences of the virulence-associated genes and the corresponding amino acid sequences encoded by the ORFs. The derived amino acid sequences are highly homologous to the listed GenBank sequences.
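A query of a contig for short tandem oligonucleotide repeats, as described above, can be sketched as follows. This is an illustration only and not the search tool actually used: the unit-length range, the minimum copy number, the function name and the example sequence are all assumptions.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=8, min_copies=4):
    """Return sorted (start, end, unit, copies) tuples, 1-based inclusive.

    A hit is a unit of min_unit..max_unit bases repeated head to tail
    at least min_copies times.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured unit followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter unit
            # (for example "ATAT" or "AA"); with min_unit >= 2 this also
            # means homopolymer runs are never reported.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            copies = (m.end() - m.start()) // unit_len
            hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)

# Hypothetical contig fragment carrying six copies of a tetranucleotide.
EXAMPLE = "GGCA" + "CAAT" * 6 + "GGA"
repeats = find_tandem_repeats(EXAMPLE)
```

Each reported interval can then be compared against ORF coordinates to decide whether the repeat lies 5′ to a gene or inside its coding sequence.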














TABLE 5

| Repeat | Location in 3-fold Contigs | Location in 8-fold Contigs | Full Length Nucleotide Sequence | Amino Acid Sequence | GenBank Accession No. |
|---|---|---|---|---|---|
| SEQ ID NO: 581 | 115; nt. 473-540 of SEQ ID NO: 115 | nt. 484533-483643 of SEQ ID NO: 685 (contig 18) | SEQ ID NO: 659 | SEQ ID NO: 660 | NP_439538.1 |
| SEQ ID NO: 582 | 377; nt. 546-597 of SEQ ID NO: 337 | nt. 416274-414910 of SEQ ID NO: 685 (contig 18) | SEQ ID NO: 661 | SEQ ID NO: 662 | P45217 |
| SEQ ID NO: 583 | 505; nt. 310-393 of SEQ ID NO: 505 | nt. 414500-416614 of SEQ ID NO: 684 (contig 17) | SEQ ID NO: 663 | SEQ ID NO: 664 | AAK76425 |
| SEQ ID NO: 584 | 508; nt. 2079-2120 of SEQ ID NO: 508 | nt. 506516-507913 of SEQ ID NO: 685 (contig 18) | SEQ ID NO: 665 | SEQ ID NO: 666 | NP_439520 |
| SEQ ID NO: 585 | 518; nt. 758-789 of SEQ ID NO: 518 | nt. 354274-352406 of SEQ ID NO: 684 (contig 17) | SEQ ID NO: 667 | SEQ ID NO: 668 | NP_284893 |
| SEQ ID NO: 586 | 543; nt. 1814-196 of SEQ ID NO: 543 | nt. 347864-243236 of SEQ ID NO: 685 (contig 18) | SEQ ID NO: 669 | SEQ ID NO: 670 | AAA20524 |
| SEQ ID NO: 586 | 543; nt. 1814-196 of SEQ ID NO: 543 | nt. 699709-704187 of SEQ ID NO: 685 (contig 18) | SEQ ID NO: 671 | SEQ ID NO: 672 | AAD56660 |
| SEQ ID NO: 587 | 567; nt. 13309-13320 of SEQ ID NO: 567 | nt. 85546-84689 of SEQ ID NO: 681 (contig 14) | SEQ ID NO: 673 | SEQ ID NO: 674 | ZP_00053190 |









Example 8
Identification of Unique NTHi Gene Sequences

Genes associated with NTHi virulence were also identified by comparing the level of expression of the gene when the NTHi bacterium was infecting a tissue versus the level of expression of the same gene when the NTHi was grown on artificial laboratory media. These novel genes were identified using the promoter trap techniques described above in Examples 4-6, and subsequent comparisons with the known Rd genome demonstrated that these genes are unique to NTHi strain 86-028NP.


The DNA sequences identified using this screening procedure are set forth as SEQ ID NOS: 577-580. These sequences did not contain genes or gene fragments that have homologues in the H. influenzae Rd genome sequence. Even though these are completely novel sequences, their expression level during NTHi infection in the chinchilla middle ear makes it likely that these genes are involved in NTHi virulence.


Example 9
Complete Sequence Analysis of NTHi Strain 86-028NP

Library Construction:


Chromosomal DNA was prepared from strain 86-028NP using Puregene reagents (Gentra Systems, Minneapolis, Minn.). For the initial shotgun sequencing of the genome, 1 to 2 kb and 2 to 4 kb libraries of genomic DNA were constructed in pUC18 as previously described (Munson et al., Infect Immun 72:3002-10, 2004). For the scaffolding library, genomic DNA was manually sheared into a mean fragment size of 40 kb using a Hamilton syringe. After end repair, fragments were fractionated using a 0.7% low melting temperature agarose gel. Fragments larger than 30 kb were excised and an in-gel ligation to pEpiFOS-5 was performed. The ligation mixture recovered from the gel was packaged into Lambda phage, in vitro, and used to transfect EPI100 cells (Epicentre, Madison, Wis.).


Sequencing:


For the shotgun portion of the sequencing, cycle-sequencing reactions were run using PE Big-Dye™ terminators and universal primers (M13 forward and reverse) as previously described (Munson et al., Infect Immun 72:3002-10, 2004). To end-sequence the scaffolding library, plasmid was first purified using a R.E.A.L. Prep 96 Plasmid Kit (QIAGEN Inc., Valencia, Calif.), then amplified using a TempliPhi™ DNA Amplification Kit (Amersham Biosciences Corp., Piscataway, N.J.) before running reactions using PE Big-Dye™ terminators and pEpiFOS-5 forward and reverse sequencing primers (Epicentre, Madison, Wis.). The reactions for the clean-up portions of the project were run using PE Big-Dye™ terminators and custom primers (Integrated DNA Technologies, Coralville, Iowa). Excess dye terminators were removed with Sephadex G50 columns in 96-well format and sequence determined on either an ABI 3700 or an ABI 3100 capillary electrophoresis DNA sequencer (Applied Biosystems, Foster City, Calif.).


Genome Closure:


Paired end-sequences from the scaffolding library and PCR were used to order the contigs and to add sequence in areas of low sequence coverage. Paired custom primers (Integrated DNA Technologies, Coralville, Iowa) were designed to bind at the ends of each contig as well as to regions flanking areas of low sequence coverage. The intervening regions were amplified using Taq polymerase (Roche Diagnostics, Indianapolis, Ind.) with a standard PCR protocol as described in Sambrook et al., Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Laboratory Press, 2001, and sequenced on both strands. Ribosomal RNA operons and the HMW gene clusters were completely sequenced using clones from the scaffolding library as templates.


Assembly:


Phred/Phrap was used for data assembly, employing the default assembly parameters (Ewing et al., Genome Res 8:186-94, 1998; Ewing et al., Genome Res 8:175-85, 1998; Gordon et al., Genome Res 8:195-202, 1998) as described in Munson et al., Infect Immun 72:3002-10, 2004. Assemblies were checked against the paired-end sequence data from 507 clones using the Seqman II program from the DNASTAR suite.
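The idea behind checking an assembly against paired-end data can be sketched as follows: the two end reads of one clone should land on the same assembled sequence, point toward each other, and span a distance compatible with the library insert size. This is a simplified illustration and not the Seqman II procedure; the record layout, the function name and the 30-50 kb window (chosen only because fragments larger than 30 kb were excised from a library sheared to a mean of 40 kb) are assumptions.

```python
from typing import NamedTuple

class EndRead(NamedTuple):
    contig: str   # name of the assembled sequence the read aligned to
    start: int    # 1-based leftmost aligned coordinate
    end: int      # 1-based rightmost aligned coordinate
    strand: str   # "+" or "-"

def pair_is_consistent(a, b, min_insert, max_insert):
    """True when two end reads of one clone agree with the assembly."""
    if a.contig != b.contig:
        # Pairs that bridge two contigs are not errors; they are the
        # links used to order contigs, and are handled separately.
        return False
    left, right = sorted((a, b), key=lambda r: r.start)
    # The reads must face each other: forward on the left, reverse on the right.
    if left.strand != "+" or right.strand != "-":
        return False
    span = right.end - left.start + 1
    return min_insert <= span <= max_insert
```

A clone whose reads fail this test within a single contig flags a region where the assembly may have collapsed, expanded or inverted sequence.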


Data Analyses:


Coding regions were identified using Glimmer2 (v2.13) trained on the set of 1178 longest ORFs identified by the Glimmer2 long-orfs program (Delcher et al., Nucleic Acids Res 27:4636-41, 1999). Automated annotation by similarity was done by searching the Glimmer ORF set against the strain Rd proteome, the SwissProt database, the NCBI COGs database, and the KEGG database. The strain Rd database was compared bi-directionally with the strain 86-028NP ORF set using tricross to determine high-confidence regions of similarity and to produce the dotplot comparison of genome organization (Ray et al., Bioinformatics 17:1105-12, 2001).
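One common way to turn a bi-directional comparison into high-confidence gene pairs is to keep only reciprocal best hits. The sketch below illustrates that general idea; it is not a description of how tricross works internally, and the score dictionaries, function names and ORF labels are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical scores between strain 86-028NP ORFs and strain Rd ORFs.
NP_VS_RD = {("np_orf1", "rd_orf1"): 950, ("np_orf1", "rd_orf2"): 400,
            ("np_orf2", "rd_orf2"): 300}
RD_VS_NP = {("rd_orf1", "np_orf1"): 950, ("rd_orf2", "np_orf1"): 400,
            ("rd_orf2", "np_orf2"): 300}
pairs = reciprocal_best_hits(NP_VS_RD, RD_VS_NP)
```

In this toy input `np_orf2` is left unpaired because its only candidate, `rd_orf2`, scores higher against `np_orf1`; ORFs left unpaired in this way are the ones that would call for manual review.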


The automatically predicted annotation information was further manually curated using Artemis (Rodriguez et al., Infect Immun 71:1635-42, 2003) for visualization and demarcation of genomic regions of interest. A custom FileMaker Pro database was generated and then used to apply manual revisions and to archive data related to the functional assignments. FASTA analyses were used for the primary automated comparisons. The strong synteny between the strain 86-028NP and strain Rd genomes allowed automatic assignment of a function to the majority of the genes, with similarity held to 90% or better at the amino acid level for matching. The near one-to-one mapping from the strain 86-028NP genome to the strain Rd genome was confirmed by assembly of the strain Rd ORFs onto the strain 86-028NP genome sequence, and the reverse assembly of the strain 86-028NP ORFs onto the Rd genome, using the SeqMan program with the assembly criterion of 80% identity at the nucleotide level.


Manual BLAST analyses were used to explore the potential function of ORFs that did not show strong similarity to known genes. Manual curation of the automatic assignments was carried out to conform annotations to the current literature and repair the few places where the automated algorithm was easily led astray (notably the HMW gene clusters, the hemoglobin-binding proteins and the hsd gene clusters, whose high family similarity confounds automated assignment).


The tRNA genes were identified by tRNAscan-SE v1.11 (Lowe et al., Nucleic Acids Res 25:955-64, 1997). The rRNA operons were identified based on 16S, 23S and 5S rRNA similarity with strain Rd, and on the CLUSTALW alignment of the neighborhoods containing these genes to determine the boundaries of the semi-conserved regions.


The complete genomic DNA sequence of H. influenzae strain 86-028NP is set out as SEQ ID NO: 772. The open reading frames within the genomic DNA are set out as SEQ ID NOS: 773-2593 and are described in Table 6. The resulting gene products are set out as SEQ ID NOS: 2581-4414 and are described in Table 7. The “c” preceding the nucleotides of the open reading frames indicates the open reading frame is complementary to the 5′ to 3′ sequence set out as SEQ ID NO: 772. The unique NTHi genes are marked in Table 6. The genome sequence will be available from GenBank as Accession number CP000057, which is incorporated by reference herein in its entirety.
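Reading an open reading frame out of the genome from a Table 6 coordinate entry can be sketched as follows: the coordinates are 1-based and inclusive, and a leading "C" means the reverse complement of that interval is the coding strand. The function names and the toy genome string are hypothetical; the two entry formats handled (`2-1021` and `10469 . . . 11377`) are the ones that appear in Table 6.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def parse_cds(entry):
    """Parse a CDS entry into (start, end, is_complement).

    Accepts 'C 3051-3839', '2-1021' and '10469 . . . 11377' style entries.
    """
    m = re.fullmatch(
        r"\s*([Cc])?\s*(\d+)\s*(?:-|(?:\.\s*){2,3})\s*(\d+)\s*", entry)
    if m is None:
        raise ValueError("unrecognized CDS entry: %r" % entry)
    return int(m.group(2)), int(m.group(3)), m.group(1) is not None

def extract_orf(genome, entry):
    """Return the coding-strand sequence for one CDS entry."""
    start, end, is_complement = parse_cds(entry)
    segment = genome[start - 1:end]  # convert 1-based inclusive to a slice
    return reverse_complement(segment) if is_complement else segment

# Hypothetical 13-base "genome" used only to show the coordinate handling.
TOY_GENOME = "GGATGAAATAGCC"
```

For example, `extract_orf(TOY_GENOME, "3-11")` returns the forward-strand bases 3 through 11, while `extract_orf(TOY_GENOME, "C 3-11")` returns the reverse complement of the same interval.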


The following open reading frames are defined as pseudogenes in Table 6 because these nucleotide sequences contain a frameshift or a stop codon within the sequence: SEQ ID NO: 822 (kdgK), SEQ ID NO: 928, SEQ ID NO: 994 (bolA), SEQ ID NO: 1014 (adhC), SEQ ID NO: 1102 (dusC), SEQ ID NO: 1121 (merP), SEQ ID NO: 1135, SEQ ID NO: 1236, SEQ ID NO: 1254, SEQ ID NO: 1376 (lic2C), SEQ ID NO: 1431 (hgpD), SEQ ID NO: 1502, SEQ ID NO: 1505 (ppx), SEQ ID NO: 1523 (hgpC), SEQ ID NO: 1585 (lex2A), SEQ ID NO: 1637, SEQ ID NO: 1713, SEQ ID NO: 1856 (mod), SEQ ID NO: 1899, SEQ ID NO: 2006, SEQ ID NO: 2080, SEQ ID NO: 2155, SEQ ID NO: 2202, SEQ ID NO: 2257, SEQ ID NO: 2331, SEQ ID NO: 2345, SEQ ID NO: 2365, SEQ ID NO: 2555 (metE) and SEQ ID NO: 2563 (pmi). The hgpD, hgpC, lex2A and NTHI1769 genes contain contingency repeats that cause the sequence to shift in and out of the reading frame, and the resulting amino acid sequences (SEQ ID NOS: 3242, 3332 and 4142) are partial translations. These contingency repeats regulate gene expression in H. influenzae, and therefore the genes encoding these proteins will be correctly translated after insertion or deletion of one or more repeats.
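The effect of a contingency repeat on the reading frame can be illustrated with a toy gene. Because a tetranucleotide unit is not a multiple of three bases, gaining or losing one copy shifts everything downstream out of frame, which typically exposes a premature stop codon. The sequence, the repeat unit and the function names below are invented for illustration and are not taken from any sequence in the listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_internal_stop(cds):
    """Return the 1-based index of the first in-frame stop codon that
    occurs before the final complete codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i + 1
    return None

def reads_through(cds):
    """True when the length is a multiple of three, the only stop codon
    in frame is the final one, and the sequence ends on that stop."""
    cds = cds.upper()
    return (len(cds) % 3 == 0
            and first_internal_stop(cds) is None
            and cds[-3:] in STOP_CODONS)

def toy_gene(n_repeats):
    """Hypothetical gene: start codon, n copies of a 4-base repeat,
    then a fixed tail that ends in a stop codon."""
    return "ATG" + "CAAT" * n_repeats + "CTGACCGTTTAA"
```

Here `toy_gene(3)` is in frame and reads through to its final stop, whereas `toy_gene(2)` and `toy_gene(4)` are frameshifted; in `toy_gene(2)` the shift brings a TGA into frame at the fifth codon.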









TABLE 6

NTHi Open Reading Frames and Gene Products

CDS | Gene Name | Unique to NTHi 86-028NP | Product Name | SEQ ID NO:














2-1021
gapA

glyceraldehyde 3-phosphate dehydrogenase
773


1191-3014


putative long-chain-fatty-acid--CoA ligase
774


C 3051-3839


conserved hypothetical protein
775


C 3855-4319


conserved hypothetical protein
776


C 4581-5393
fdhD

FdhD protein homolog
777


5664-8750
fdxG

formate dehydrogenase major subunit
778


8752-9690
fdxH

formate dehydrogenase, iron-sulfur subunit
779


9683-10399
fdxI

formate dehydrogenase, cytochrome B556
780





subunit


10469 . . . 11377
fdhE

FdhE homolog
781


C 11421 . . . 11861
rimI

ribosomal-protein-alanine acetyltransferase
782


C 11864 . . . 12268
holD

DNA polymerase III, psi subunit
783


12374 . . . 13366
rsmC

ribosomal RNA small subunit
784





methyltransferase C


C 13434 . . . 14342
era

GTP-binding protein era homolog
785


C 14339 . . . 15022
rnc

ribonuclease III
786


C 15024 . . . 16073
lepB

Signal peptidase I
787


C 16082 . . . 17878
lepA

GTP-binding protein LepA
788


C 18047 . . . 18430


conserved hypothetical acid-induced glycyl
789





radical enzyme


18688 . . . 19347
ung

uracil-DNA glycosylase
790


C 19417 . . . 20841


conserved hypothetical protein
791


C 21224 . . . 22663


conserved hypothetical protein
792


C 22665 . . . 24062
citG

CitXG
793


C 24243 . . . 25745
citF

citrate lyase alpha chain
794


C 25760 . . . 26635
citE

citrate lyase beta chain
795


C 26632 . . . 26919
citD

citrate lyase acyl carrier protein
796


C 26957 . . . 27964
citC

[citrate [pro-3S]-lyase] ligase
797


C 28215 . . . 29177
lipA

Lipoic acid synthetase
798


C 29231 . . . 29869
lipB

lipoate-protein ligase B
799


C 29871 . . . 30149


conserved hypothetical protein
800


C 30205 . . . 31386
dacA

penicillin-binding protein 5 precursor
801


C 31402 . . . 32265
rlpA

RlpA-like protein
802


C 32318 . . . 33433
mrdB

Rod shape-determining protein RodA
803


C 33423 . . . 35378
mrdA

penicillin-binding protein 2
804


C 35400 . . . 35867


conserved hypothetical protein
805


C 35921 . . . 36229


conserved hypothetical protein
806


C 36354 . . . 38009


conserved hypothetical membrane protein
807


38215 . . . 39993


conserved hypothetical ABC transporter ATP-
808





binding protein


40042 . . . 41178
mreB

rod shape-determining protein MreB
809


41258 . . . 42313
mreC

rod shape-determining protein MreC
810


42313 . . . 42801
mreD

rod shape-determining protein MreD
811


C 42852 . . . 43628


conserved hypothetical protein
812


C 43638 . . . 44441
xthA

exodeoxyribonuclease III
813


C 44489 . . . 45163
rluA2

conserved hypothetical pseudouridine synthase
814


C 45163 . . . 46227


conserved hypothetical membrane protein
815


C 46287 . . . 46949


conserved hypothetical FtsH-interacting
816





integral





membrane protein


C 47276 . . . 47830


conserved hypothetical protein
817


47897 . . . 48259
phnA

PhnA homolog
818


C 48312 . . . 48950
eda

KHG/KDPG aldolase
819


C 48959 . . . 50362
uxuC

uronate isomerase
820


C 50372 . . . 51229


putative oxidoreductase
821


C 51229 . . . 52172
kdgK

pseudogene for 2-dehydro-3-
822





deoxygluconokinase


C 52192 . . . 53463


putative TRAP-type C4-dicarboxylate transport
823





system, large permease component


C 53473 . . . 53970


putative TRAP-type C4-dicarboxylate transport
824





system, small permease component


C 54009 . . . 54995


putative TRAP-type C4-dicarboxylate transport
825





system, periplasmic component


C 55022 . . . 56050


conserved hypothetical zinc-type alcohol
826





dehydrogenase-like protein


56160 . . . 56960
uxuR

Uxu operon regulator
827


56988 . . . 58172
uxuA

mannonate dehydratase
828


C 58225 . . . 58938


putative membrane protein TerC
829


C 59036 . . . 60865
uvrC

UvrABC system protein C
830


C 60867 . . . 61631
kdsB

3-deoxy-D-manno-octulosonic acid
831





cytidylyltransferase


C 61702 . . . 62700
lpxK

tetraacyldisaccharide 4′-kinase
832


C 62773 . . . 64536
msbA

lipid A export ATP-binding protein MsbA
833


C 64577 . . . 66943
rec2

recombination protein 2
834


67202 . . . 67639
dksA

DnaK suppressor protein
835


67887 . . . 69245
pcnB

probable poly(A) polymerase
836


69254 . . . 69736
folK

2-amino-4-hydroxy-6-hydroxymethyl-
837





dihydropteridine pyrophosphokinase


69812 . . . 70288


conserved hypothetical protein
838


70296 . . . 71594
amiB

probable N-acetylmuramoyl-L-alanine amidase
839





AmiB precursor


71595 . . . 73484
mutL

DNA mismatch repair protein MutL
840


73492 . . . 74427
miaA

tRNA delta-isopentenylpyrophosphate
841





transferase


74433 . . . 77378
glnE

glutamate-ammonia-ligase adenylyltransferase
842


C 77463 . . . 79139
recN

DNA repair protein RecN
843


C 79251 . . . 80069
ppnK

probable inorganic polyphosphate/ATP-NAD
844





kinase


80133 . . . 80837
grpE

grpE
845


C 80850 . . . 81194


conserved hypothetical protein
846


C 81187 . . . 81627


conserved hypothetical protein
847


81908 . . . 84031
nrdD

anaerobic ribonucleoside-triphosphate
848





reductase


84150 . . . 85010
tesB

Acyl-CoA thioesterase II
849


C 85171 . . . 86550
cysS

cysteinyl-tRNA synthetase
850


86653 . . . 87162
ppiB

peptidyl-prolyl cis-trans isomerase B
851


87166 . . . 87597


conserved hypothetical protein
852


87739 . . . 88527


putative deoxyribonuclease
853


88589 . . . 88846


hypothetical protein
854


88857 . . . 89132


hypothetical protein
855


C 89182 . . . 89505
trxA

thioredoxin
856


C 89625 . . . 90620
ddh

2-hydroxyacid dehydrogenase homolog
857


C 90633 . . . 91778
metB

cystathionine gamma-synthase
858


92334 . . . 93170


predicted ATPase involved in chromosome
859





partitioning


93172 . . . 94527
dnaB2

replicative DNA helicase
860


94520 . . . 96220


conserved hypothetical protein
861


96220 . . . 96771


conserved hypothetical protein
862


96921 . . . 98144


conserved hypothetical protein
863


98441 . . . 99196


conserved hypothetical protein
864


99196 . . . 99684


conserved hypothetical protein
865


99930 . . . 100352
ssb2

Single-strand binding protein
866


100379 . . . 100897


conserved putative lipoprotein
867


C 101113 . . . 101670


conserved hypothetical protein
868


101746 . . . 103806
topB2

DNA topoisomerase III
869


104691 . . . 105119


conserved hypothetical protein
870


105131 . . . 105223


hypothetical protein
871


105234 . . . 105875


conserved hypothetical protein
872


105910 . . . 106317


hypothetical protein
873


106418 . . . 106888
radC2

putative DNA repair radC-like protein
874


106981 . . . 107577


conserved hypothetical protein
875


107635 . . . 108399


conserved hypothetical protein
876


108498 . . . 109127
pilL

conserved putative lipoprotein
877


109131 . . . 109871


conserved putative exported protein
878


109850 . . . 110608


conserved putative exported protein
879


110624 . . . 111130


conserved putative exported protein
880


111127 . . . 113376


conserved hypothetical protein
881


113695 . . . 114381


conserved putative membrane protein
882


114512 . . . 114862


conserved putative membrane protein
883


115131 . . . 115532


conserved putative membrane protein
884


115552 . . . 115923


conserved putative membrane protein
885


115935 . . . 116579


conserved putative exported protein
886


116579 . . . 117448


conserved hypothetical protein
887


117459 . . . 118859


conserved putative exported protein
888


118869 . . . 119276


conserved putative lipoprotein
889


119292 . . . 122162


conserved hypothetical protein
890


122170 . . . 122577


conserved hypothetical protein
891


122587 . . . 122910


conserved hypothetical membrane protein
892


122919 . . . 124412


conserved putative membrane protein
893


C 124468 . . . 124797


conserved hypothetical membrane protein
894


C 124955 . . . 127960
tnpA

transposon Tn3 transposase
895


126925 . . . 127041


hypothetical protein
896


128088 . . . 128726
tnpR

transposon Tn3 resolvase
897


128751 . . . 128930


hypothetical protein
898


C 128975 . . . 129382


hypothetical exported protein
899


C 129398 . . . 131398


conserved putative exported protein
900


C 131413 . . . 132354


conserved putative exported protein
901


C 132351 . . . 132794


conserved hypothetical protein
902


133106 . . . 133525


conserved hypothetical membrane protein
903


133610 . . . 133843


conserved hypothetical protein
904


133846 . . . 134094


hypothetical protein
905


134098 . . . 134256


hypothetical protein
906


134387 . . . 135340
traC

conserved putative antirestriction protein
907


135444 . . . 136205


possible type I restriction enzyme M subunit
908


136431 . . . 136997


hypothetical protein
909


137037 . . . 137411


conserved hypothetical protein
910


137401 . . . 138078


conserved hypothetical protein
911


138056 . . . 138673


conserved hypothetical protein
912


C 138937 . . . 139536


resolvase/integrase-like protein
913


C 139551 . . . 139742


hypothetical protein
914


C 140211 . . . 140465


hypothetical protein
915


141656 . . . 142369


conserved hypothetical protein
916


143089 . . . 144342


hypothetical protein
917


144485 . . . 146401


conserved hypothetical protein
918


146462 . . . 147304


putative site-specific recombinase
919


C 147954 . . . 149231
thrC

threonine synthase
920


C 149274 . . . 150218
thrB

homoserine kinase
921


C 150231 . . . 152678
thrA

aspartokinase/homoserine dehydrogenase
922


153002 . . . 153715


conserved hypothetical protein
923


153755 . . . 154891
grk

glycerate kinase
924


C 154900 . . . 156159


conserved hypothetical protein
925


C 156281 . . . 157387


conserved hypothetical protein
926


C 157589 . . . 157909


conserved hypothetical protein
927


157962 . . . 158171


pseudogene for glycerol-3-phosphate regulon
928





repressor


C 158260 . . . 159015


conserved hypothetical protein
929


159216 . . . 159497


conserved hypothetical protein
930


159649 . . . 160647
hitA

iron-utilization periplasmic protein hFbpA
931


160765 . . . 162285
hitB

iron(III)-transport system permease protein
932





hFbpB


162287 . . . 163342
hitC

iron-utilization ATP-binding protein hFbpC
933


C 163377 . . . 164057


putative D-alanyl-D-alanine carboxypeptidase
934


C 164059 . . . 165192
dapE

succinyl-diaminopimelate desuccinylase
935


C 165220 . . . 165564


conserved hypothetical protein
936


C 165643 . . . 167538


chaperone protein HtpG
937


167750 . . . 168589


conserved hypothetical NIF3-like protein
938


168763 . . . 171135
hsdM1

putative type I restriction-modification system,
939





methyltransferase subunit


171318 . . . 172727


predicted transcriptional regulator containing
940





an HTH domain


172899 . . . 174149
hsdS1

putative type I site-specific restriction-
941





modification system, S subunit


174249 . . . 177275
hsdR1

putative type I site-specific restriction-
942





modification system, R subunit


C 177500 . . . 178879
ffh

Signal recognition particle protein
943


179130 . . . 180422
corB

putative Mg2+ and Co2+ transporter
944


180443 . . . 181321


conserved hypothetical protein
945


181323 . . . 181802


conserved hypothetical protein
946


181892 . . . 182233


conserved hypothetical protein
947


182400 . . . 183689
serS

Seryl-tRNA synthetase
948


184024 . . . 184653
gst

glutathione S-transferase
949


C 184665 . . . 184961


conserved hypothetical protein
950


185014 . . . 187251
hemR

hemin receptor
951


C 194299 . . . 196434


possible SAM-dependent methyltransferase
952


196827 . . . 197936
mltA

membrane-bound lytic murein transglycosylase A
953





precursor


197936 . . . 198706


conserved hypothetical protein
954


198884 . . . 199885
znuA

high-affinity zinc uptake system protein ZnuA
955


199935 . . . 200441


conserved hypothetical protein
956


C 200554 . . . 201912
mpl

UDP-N-acetylmuramate:L-alanyl-gamma-D-
957





glutamyl-meso-diaminopimelate ligase


C 202297 . . . 203484
metC

cystathionine beta-lyase
958


203615 . . . 204223
tsaA

probable peroxiredoxin
959


204324 . . . 204881
pgsA

CDP-diacylglycerol--glycerol-3-phosphate 3-
960





phosphatidyltransferase


C 205445 . . . 205975
ppa

inorganic pyrophosphatase
961


206209 . . . 207525


conserved hypothetical protein
962


207748 . . . 208389
udk

uridine kinase
963


208398 . . . 208985
dcd

deoxycytidine triphosphate deaminase
964


208989 . . . 210164


conserved hypothetical protein
965


210164 . . . 211372


probable sugar efflux transporter
966


C 211433 . . . 212947
engA

GTP-binding protein EngA
967


C 213629 . . . 214396
dnaQ

DNA polymerase III, epsilon chain
968


214457 . . . 214921
rnhA

ribonuclease HI
969


215187 . . . 216284
ompP2

Outer membrane protein P2 precursor
970


C 216481 . . . 217626
nagA

N-acetylglucosamine-6-phosphate deacetylase
971


C 217763 . . . 218575
nagB

glucosamine-6-phosphate deaminase
972


C 218911 . . . 219792
nanA

N-acetylneuraminate lyase
973


C 219803 . . . 220669


putative HTH-type transcriptional regulator
974


C 220662 . . . 221564
nanK

putative N-acetylmannosamine kinase
975


C 221609 . . . 222295
nanE

putative N-acetylmannosamine-6-phosphate
976





2-epimerase


222645 . . . 223634


putative sialic acid transporter, TRAP-type
977





C4-dicarboxylate transport system, periplasmic





component


223699 . . . 225549
siaT

putative sialic acid transporter, TRAP-type
978





C4-dicarboxylate transport system, large





permease component


225713 . . . 226852


conserved hypothetical protein
979


227014 . . . 227448


putative protein-S-isoprenylcysteine
980





methyltransferase


C 227490 . . . 228377
hflC

HflC
981


C 228377 . . . 229609
hflK

HflK
982


C 229719 . . . 230426


putative 4′-phosphopantetheinyl transferase
983


C 230486 . . . 231817
dcuB2

anaerobic C4-dicarboxylate transporter DcuB
984


C 232021 . . . 232251
acpP

acyl carrier protein
985


C 232511 . . . 233239
fabG

3-oxoacyl-[acyl-carrier protein] reductase
986


C 233256 . . . 234194
fabD

malonyl CoA-acyl carrier protein transacylase
987


C 234694 . . . 235644
fabH

3-oxoacyl-[acyl-carrier-protein] synthase III
988


C 235830 . . . 236000
rpmF

50S ribosomal protein L32
989


C 236023 . . . 236547


conserved hypothetical protein
990


236704 . . . 237576
psd

phosphatidylserine decarboxylase proenzyme
991


C 237685 . . . 239055
gor

glutathione reductase
992


C 239271 . . . 239870


conserved hypothetical lipoprotein
993


239964 . . . 240275
bolA

Pseudogene for BolA homolog
994


240566 . . . 241909
nqrA

Na(+)-translocating NADH-quinone reductase
995





subunit A


241912 . . . 243147
nqrB

Na(+)-translocating NADH-quinone reductase
996





subunit B


243140 . . . 243874
nqrC

Na(+)-translocating NADH-quinone reductase
997





subunit C


243874 . . . 244500
nqrD

Na(+)-translocating NADH-quinone reductase
998





subunit D


244504 . . . 245100
nqrE

Na(+)-translocating NADH-quinone reductase
999





subunit E


245113 . . . 246348
nqrF

Na(+)-translocating NADH-quinone reductase
1000





subunit F


246492 . . . 247532
apbE

thiamine biosynthesis lipoprotein ApbE
1001


247535 . . . 247795


conserved hypothetical protein
1002


247870 . . . 249063
trmU

probable tRNA (5-methylaminomethyl-2-
1003





thiouridylate)-methyltransferase


C 249107 . . . 249841


conserved hypothetical protein
1004


C 249843 . . . 250817
rluD

ribosomal large subunit pseudouridine synthase D
1005


250925 . . . 251713


conserved hypothetical lipoprotein
1006


C 251792 . . . 252274


conserved hypothetical protein
1007


C 252306 . . . 253046
pflA

pyruvate formate-lyase 1 activating enzyme
1008


253171 . . . 255489
pflB

formate acetyltransferase
1009


C 255521 . . . 256375
focA

probable formate transporter
1010


256880 . . . 257794


conserved hypothetical protein
1011


257973 . . . 259418


putative Na+/alanine symporter
1012


C 259603 . . . 260430


conserved hypothetical protein
1013


C 260439 . . . 261575
adhC

pseudogene for putative alcohol dehydrogenase
1014





class III


261698 . . . 262105


putative HTH-type transcriptional regulator
1015


262263 . . . 262550
tatA

Sec-independent protein translocase protein
1016





TatA/E


262554 . . . 263114
tatB

Sec-independent protein translocase protein
1017





TatB


263124 . . . 263894
tatC

Sec-independent protein translocase protein
1018





TatC


264255 . . . 265604
gdhA

NADP-specific glutamate dehydrogenase
1019


C 265712 . . . 266152
fur

Ferric uptake regulation protein
1020


C 266181 . . . 266705
fldA

flavodoxin
1021


C 267655 . . . 268260
seqA

seqA
1022


266802 . . . 267665


putative esterase/lipase
1023


268263 . . . 269621
menE

O-succinylbenzoate--CoA ligase
1024


269649 . . . 272987


putative small-conductance mechanosensitive
1025





channel


272997 . . . 274070
aroC

chorismate synthase
1026


274243 . . . 275103
mepA

penicillin-insensitive murein endopeptidase
1027


275121 . . . 275888


conserved hypothetical protein
1028


275935 . . . 276891
msbB

lipid A biosynthesis (KDO)2-(lauroyl)-lipid
1029





IVA





acyltransferase


277018 . . . 277989
selD

selenide, water dikinase
1030


C 278098 . . . 278448
rplS

50S ribosomal protein L19
1031


C 278485 . . . 279225
trmD

tRNA (guanine-N(1)-)-methyltransferase
1032


C 279263 . . . 279790
rimM

16S rRNA processing protein RimM
1033


C 279835 . . . 280083
rpsP

30S ribosomal protein S16
1034


280363 . . . 281148


conserved hypothetical protein
1035


281171 . . . 282982
nadN

NAD nucleotidase
1036


283267 . . . 283809
aroK

shikimate kinase
1037


283829 . . . 284917
aroB

3-dehydroquinate synthase
1038


284919 . . . 285779
dam

DNA adenine methylase
1039


286744 . . . 286851


conserved hypothetical protein
1040


C 286870 . . . 287595
pgpB

phosphatidylglycerophosphatase B
1041


287641 . . . 288315
ribA

GTP cyclohydrolase II
1042


C 288631 . . . 290175


putative ABC-type oligopeptide transport
1043





system,





periplasmic component


C 290168 . . . 290530


conserved hypothetical protein
1044


290649 . . . 292694
prlc

oligopeptidase A
1045


292926 . . . 294470
Hsdm2

putative type I restriction-modification system
1046





methyltransferase subunit


294555 . . . 295727
Hsds2

putative type I restriction-modification system
1047





specificity protein


295729 . . . 296883
prrC

putative anticodon nuclease
1048


296886 . . . 297884


conserved hypothetical DNA binding protein
1049


297895 . . . 300978
Hsdr2

putative type I restriction-modification system
1050


C 301125 . . . 301754


conserved hypothetical protein
1051


C 301968 . . . 303809
arcb

aerobic respiration control sensor protein ArcB
1052


C 303878 . . . 304522


predicted uracil-DNA glycosylase
1053


C 310574 . . . 311482
bira

BirA bifunctional protein
1054


311609 . . . 313075
guab

inosine-5′-monophosphate dehydrogenase
1055


313185 . . . 314756
guaa

glutamine-hydrolyzing GMP synthase
1056


C 314818 . . . 315720


conserved hypothetical protein
1057


315783 . . . 316289


putative transcriptional regulator
1058


316328 . . . 317530
nhaa

Na(+)/H(+) antiporter 1
1059


317718 . . . 319028
brnq

branched-chain amino acid transport system
1060





carrier protein


C 319136 . . . 319603


conserved hypothetical protein
1061


C 319630 . . . 320076


conserved hypothetical protein
1062


320349 . . . 322478
pnp

polyribonucleotide nucleotidyltransferase
1063


322560 . . . 323504
nlpI

lipoprotein NlpI
1064


323624 . . . 325465
deaD

Cold-shock DEAD-box protein A homolog
1065


325643 . . . 327031


predicted soluble lytic transglycosylase fused
1066





to an ABC-type amino acid-binding protein


327036 . . . 327188


conserved hypothetical protein
1067


C 327322 . . . 327618


conserved hypothetical protein
1068


C 327675 . . . 327884


conserved hypothetical protein
1069


C 327952 . . . 328302
arsC

putative arsenate reductase
1070


328381 . . . 329430
perM

putative permease PerM homolog
1071


C 329544 . . . 330521
secF

protein-export membrane protein SecF
1072


C 330529 . . . 332379
secD

protein-export membrane protein SecD
1073


C 332449 . . . 332742


conserved hypothetical preprotein translocase
1074





subunit YajC


C 332850 . . . 333071


predicted redox protein, regulator of disulfide
1075





bond formation


C 333068 . . . 333586


conserved hypothetical protein
1076


C 333651 . . . 334799
tgt

queuine tRNA-ribosyltransferase
1077


C 335144 . . . 336235
quea

S-adenosylmethionine:tRNA ribosyltransferase-
1078





isomerase


C 336359 . . . 336805


conserved hypothetical protein
1079


C 336948 . . . 341126
hap

adhesion and penetration protein Hap
1080


C 341444 . . . 344275
uvra

UvrABC system protein A
1081


344428 . . . 344934
ssb

Single-strand binding protein
1082


C 345112 . . . 345906
tonb

TonB
1083


345916 . . . 346359
exbD

transport protein ExbD
1084


346363 . . . 346815
exbB

transport protein ExbB
1085


346984 . . . 347451
bcp

bacterioferritin comigratory protein
1086


347552 . . . 348448
dapA

dihydrodipicolinate synthase
1087


348559 . . . 349188


conserved hypothetical lipoprotein
1088


349378 . . . 349701


conserved hypothetical protein
1089


C 350415 . . . 351353
lgtC

UDP-galactose--lipooligosaccharide
1090





galactosyltransferase


C 351421 . . . 352008
orfM

predicted xanthosine triphosphate
1091





pyrophosphatase


C 352020 . . . 352745
kdkA

3-deoxy-D-manno-octulosonic acid kinase
1092


352822 . . . 353865
opsX

ADP-heptose--lipooligosaccharide
1093





heptosyltransferase I


354187 . . . 356316
hxuC

heme/hemopexin-binding protein C
1094


356392 . . . 358089
hxuB

heme/hemopexin-binding protein B
1095


358101 . . . 360866
hxuA

heme/hemopexin-binding protein A
1096


C 360962 . . . 361318
folB

dihydroneopterin aldolase
1097


361401 . . . 362000


conserved hypothetical protein
1098


362042 . . . 363745
narQ

sensor protein NarQ
1099


363755 . . . 364780
murB

UDP-N-acetylenolpyruvoylglucosamine
1100





reductase


364906 . . . 365751
rpoH

RNA polymerase sigma-32 factor
1101


C 365791 . . . 366218
dusC

pseudogene for tRNA-dihydrouridine synthase
1102





C-


C 366218 . . . 367084
djlA

DnaJ-like protein DjlA
1103


C 367163 . . . 367804
pyrE

orotate phosphoribosyltransferase
1104


C 367828 . . . 368544
rph

ribonuclease PH
1105


C 368845 . . . 370287
gltX

glutamyl-tRNA synthetase
1106


371037 . . . 372692
Lpt6

PE-tn-6--lipooligosaccharide
1107





phosphorylethanolamine transferase


C 372710 . . . 373519
rbn

tRNA processing ribonuclease BN
1108


C 373516 . . . 374001


conserved hypothetical protein
1109


C 374004 . . . 374666


conserved hypothetical protein
1110


374881 . . . 375660
udp

uridine phosphorylase
1111


C 375886 . . . 377202


conserved hypothetical metabolite transport
1112





protein


C 377258 . . . 378001


conserved hypothetical protein
1113


C 378062 . . . 379768
mend

menaquinone biosynthesis protein MenD
1114


C 379784 . . . 381055
menf

menaquinone-specific isochorismate synthase
1115


381212 . . . 382426
Aspc3

probable aspartate aminotransferase
1116


382543 . . . 383799
mtr

tryptophan-specific transport protein
1117


383908 . . . 385275
sdaa

L-serine dehydratase
1118


C 385310 . . . 386548
sdac

Serine transporter
1119


C 386771 . . . 388939


probable cation-transporting ATPase
1120


C 388914 . . . 389702
merP

pseudogene for probable inorganic ion transport
1121





protein


389779 . . . 390165


probable heavy metal dependent transcriptional
1122





regulator


C 390179 . . . 390496
metJ

Met repressor
1123


390740 . . . 392002
rho

transcription termination factor rho
1124


C 392056 . . . 392748
pilD

putative type 4 prepilin-like protein specific
1125





leader peptidase


C 392745 . . . 393965
pilC

putative type IV pilin secretion protein
1126


C 393962 . . . 395356
pilB

putative type IV pilin secretion protein
1127


C 395353 . . . 395802
pilA

Type IV pilin subunit protein
1128


395917 . . . 396477
ampD

ampD
1129


397099 . . . 397998
corC

magnesium and cobalt efflux protein CorC
1130


397982 . . . 399550
cutE

apolipoprotein N-acyltransferase
1131


399600 . . . 400337


conserved hypothetical RNA methyltransferase
1132


400334 . . . 400894


conserved hypothetical protein
1133


400894 . . . 401313
ruvX

putative holliday junction resolvase
1134


C 401363 . . . 402288


pseudogene for recombination associated
1135





protein RdgC,


402483 . . . 403694


prophage CP4-57-like integrase
1136


403731 . . . 404165


hypothetical protein
1137


404278 . . . 404700


hypothetical protein
1138


404709 . . . 405482


hypothetical protein
1139


406017 . . . 406208


hypothetical protein
1140


406257 . . . 406409


hypothetical protein
1141


C 406576 . . . 407466


hypothetical protein
1142


408127 . . . 408942
proC

pyrroline-5-carboxylate reductase
1143


408942 . . . 410108
hcaT

probable 3-phenylpropionic acid transporter
1144


410139 . . . 411032
xerD

Site-specific recombinase XerD
1145


411205 . . . 411492


conserved hypothetical protein
1146


C 411544 . . . 412551
ruvB

holliday junction DNA helicase RuvB
1147


C 412559-413173
ruvA

holliday junction DNA helicase RuvA
1148


C 413236-413808
ruvC

holliday junction DNA helicase RuvC
1149


C 413885-414595


conserved hypothetical protein
1150


C 414607-415080
ntpA

dATP pyrophosphohydrolase
1151


C 415102-416868
aspS

Aspartyl-tRNA synthetase
1152


417087-417605


conserved hypothetical protein
1153


417658-418383


conserved hypothetical protein
1154


418476-418712


conserved hypothetical protein
1155


418709-419113


Predicted nucleic acid-binding domain,
1156





contains PIN domain


419180-419587
gloA

Lactoylglutathione lyase
1157


419661-420350
Rnt

Ribonuclease T
1158


420664-422016


conserved hypothetical protein
1159


422049-422636


Predicted primosomal replication protein N
1160


C 422987-423553
Efp

Elongation factor P
1161


423591-424607


Predicted lysine 2,3-aminomutase
1162


424706-426001
oapA

Opacity associated protein OapA
1163


426060-426464
oapB

Opacity associated protein OapB
1164


426467-427177
recO

DNA repair protein RecO
1165


427177-428493
rumA

23S rRNA (uracil-5-)-methyltransferase RumA
1166


428571-430802
relA

GTP pyrophosphokinase
1167


430817-431173
dgkA

Diacylglycerol kinase
1168


431256-431849
Mog

Molybdopterin biosynthesis mog protein
1169


431851-432189
glnB

Nitrogen regulatory protein P-II
1170


432189-433235


conserved hypothetical protein
1171


C 433277-435469
priA

Primosomal protein N
1172


435554-436294
trmB

tRNA (guanine-N(7)-)-methyltransferase
1173


436379-436723


conserved hypothetical protein
1174


436929-437459
napF

Ferredoxin-type protein NapF
1175


437452-437733
napD

NapD
1176


437755-440253
napA

Periplasmic nitrate reductase
1177


440307-441146
napG

Ferredoxin-type protein NapG
1178


441146-442009
napH

Ferredoxin-type protein NapH
1179


442006-442458
napB

Diheme cytochrome C NapB
1180


442473-443075
napC

Cytochrome C-type protein NapC
1181


C 443235-443879
Adk

Adenylate kinase
1182


C 443964-445241
Lic3c

Putative integral membrane signal transducer
1183





protein


C 445383-446399
galE

UDP-glucose 4-epimerase
1184


C 446572-447534
Lic3A

CMP-Neu5Ac--lipooligosaccharide alpha 2-3
1185





sialyltransferase


447933-448655


Putative ABC-type
1186





nitrate/sulfonate/bicarbonate transport system,





ATPase component


448652-449389


Putative ABC-type
1187





nitrate/sulfonate/bicarbonate transport system,





permease component


449411-450355


Putative ABC-type
1188





nitrate/sulfonate/bicarbonate transport system,





periplasmic component


450365-451012


Putative transcription activator
1189


C 451107-451922
hfeD

Putative ABC-type chelated iron transport
1190





system, permease component


C 451915-452763
hfeC

Putative ABC-type chelated iron transport
1191





system, permease component


C 452767-453687
hfeB

Putative ABC-type chelated iron transport
1192





system, ATPase component


C 453687-454568
hfeA

Putative periplasmic chelated iron binding
1193





protein


454874-455338


Hypothetical protein
1194


C 455430-456308
pbpG

Penicillin-binding protein 7 homolog precursor
1195


456533-457705


Predicted Fe—S-cluster redox enzyme
1196


457805-458344


Conserved hypothetical protein
1197


458416-459327


Conserved hypothetical transcriptional
1198





regulator with an N-terminal xre-type HTH





domain


459336-460442
gcpE

4-hydroxy-2-methyl-2-(E)-butenyl 4-
1199





diphosphate synthase


460452-461723
hisS

Histidyl-tRNA synthetase
1200


461741-462355


Conserved hypothetical protein
1201


C 462406 . . . 462600


conserved hypothetical protein
1202


C 462600-462941
Fdx

Ferredoxin, 2Fe-S
1203


C 462981-464840
hscA

Chaperone protein HscA
1204


C 464859-465545


Conserved hypothetical protein
1205


C 465596-466120
hscB

Co-chaperone protein HscB
1206


C 466133-466456


Conserved hypothetical protein
1207


C 466514-466894
nifU

NifU-like protein
1208


C 466954-468174
nifS2

Cysteine desulfurase
1209


C 468235-468687


Predicted transcriptional regulator
1210


C 468740-469465


Hypothetical tRNA/rRNA methyltransferase
1211


C 469905-470366
Pal

Outer membrane protein P6 precursor
1212


C 470390-471673
tolB

TolB
1213


C 471715-472941
tolA

TolA
1214


C 472957-473376
tolR

TolR
1215


C 473447-474133
tolQ

TolQ
1216


C 474150-474560


Predicted thioesterase
1217


474885-476807
dinG

Probable ATP-dependent helicase
1218


476820-477530


Possible inactive homolog of metal-dependent
1219





proteases, putative molecular chaperone


477555 . . . 478106


possible starvation-inducible outer membrane
1220





lipoprotein


478148-479836
lcfA

Long chain fatty acid CoA ligase
1221


479892-481034
rnD

Ribonuclease D
1222


C 481077-482945


Conserved hypothetical acyltransferase
1223


C 483019-484110


Predicted GTPase, probable translation factor
1224


C 484158-484742
Pth

Peptidyl-tRNA hydrolase
1225


484925-485233


Conserved hypothetical protein
1226


485236-486450


Conserved hypothetical cupin superfamily
1227





metalloenzyme


486452-487771
xseA

Exodeoxyribonuclease VII large subunit
1228


487948-488601
adpP

ADP-ribose pyrophosphatase
1229


488611-489435
Icc

Predicted phosphohydrolase
1230


C 489533-489820


Conserved hypothetical protein
1231


490017-491393
ompP1

Outer membrane protein P1 precursor
1232


491439-492008
Ogt

Methylated-DNA-protein-cysteine
1233


492005-492676
mutH

DNA mismatch repair protein MutH
1234


C 492684-493976
mesJ

Putative cell cycle protein MesJ
1235


C 493976-494080


Pseudogene for pyridoxamine kinase
1236


C 494153-495100
accA

Acetyl-coenzyme A carboxylase carboxyl
1237





transferase subunit alpha


C 495189-495974
znuB

High affinity zinc uptake system membrane
1238





protein ZnuB


C 495974-496780
znuC

High-affinity zinc uptake system ATP binding
1239





protein ZnuC


495652-498379


Conserved hypothetical metalloprotease
1240


498494-499450
tyrR

Transcriptional regulatory protein TyrR
1241


C 499546-499821
hfq

host factor-I protein Hfq
1242


C 499912-500907
rluC

ribosomal large subunit pseudouridine synthase C
1243


501253-504108
rne

ribonuclease E
1244


504340-504552


Conserved hypothetical protein
1245


504876-505673
thiM

hydroxyethylthiazole kinase
1246


505666-506475
thiD

phosphomethylpyrimidine kinase
1247


506486-507166
thiE

thiamine-phosphate pyrophosphorylase
1248


507150-508454


Conserved hypothetical metabolite transport
1249


C 508466-508567


hypothetical protein
1250


508623-510002


putative protease
1251


C 510036-511355
srmB

ATP-dependent RNA helicase SrmB
1252


511401-512123


predicted O-methyltransferase
1253


C 512154-512759


pseudogene for conserved hypothetical
1254





tRNA/rRNA methyltransferase


512906-514273
pssA

CDP-diacylglycerol--serine O-
1255





phosphatidyltransferase


C 514319-515044
fadR

fatty acid metabolism regulator protein
1256


515145-516713
nhaB

Na(+)/H(+) antiporter 2
1257


516723-517256
dsbB

disulfide bond formation protein B
1258


C 517311-519143
glmS

glucosamine--fructose-6-phosphate
1259





aminotransferase


C 519256-519666
hupA

DNA-binding protein HU
1260


C 519668-520258


conserved hypothetical protein
1261


C 520294-521088
nudC

NADH pyrophosphatase
1262


C 521155-521751
orfG

conserved hypothetical 21.9 KD protein in
1263





locus involved in transformation


C 521827-522513
comF

competence protein F
1264


C 522526-523863
comE

competence protein E
1265


C 523873-524283
comD

competence protein D
1266


C 524283-524804
comC

competence protein C
1267


C 524801-525307
comB

competence protein B
1268


C 525308-526105
comA

competence protein A
1269


526204-528798
mrcA

penicillin-binding protein 1A
1270


528871-529716


conserved hypothetical protein
1271


529849-530214


conserved hypothetical protein
1272


530345-530947
recR

recombination protein RecR
1273


530963-532918
topB

DNA topoisomerase III
1274


533027-533368
secG

protein-export membrane protein SecG
1275


C 533913-535583
fruA

PTS system, fructose-specific IIBC component
1276


C 535585-536526
fruK

1-phosphofructokinase
1277


C 536528-538027
fruB

PTS system, fructose-specific IIA/FPr
1278





component


C 538099-538635


conserved hypothetical protein
1279


C 538722-539000
vapD

virulence-associated protein D
1280


C 539009-539200
vapX

VapX
1281


C 539272-540570


conserved hypothetical protein
1282


C 540621-541145


conserved hypothetical protein
1283


C 541172-541954


putative deoxyribonuclease
1284


C 542009-542992
holB

DNA polymerase III, delta subunit
1285


C 542989-543117


hypothetical protein
1286


543192-545315
lav

autotransported protein Lav
1287


C 546221-546880
tmk

thymidylate kinase
1288


C 546870-547913


predicted periplasmic solute-binding protein
1289


C 547983-548924
surA

survival protein SurA homolog
1290


C 548994-549533
pyrR

PyrR bifunctional protein
1291


549665-550456
mazG

predicted pyrophosphatase MazG
1292


C 550506-551381


conserved hypothetical protein
1293


551574-553985
lon

ATP-dependent protease La
1294


554102-555253


predicted Fe—S oxidoreductase
1295


555387-556046
rpiA

ribose 5-phosphate isomerase A
1296


556076-557308
serA

D-3-phosphoglycerate dehydrogenase
1297


C 557351-558193


predicted aminomethyltransferase related to
1298





GcvT


C 558203-559066


conserved hypothetical stress-induced protein
1299


559515-560426
hisG

ATP phosphoribosyltransferase
1300


560519-561802
hisD

histidinol dehydrogenase
1301


561871-562974
hisC

histidinol-phosphate aminotransferase 1
1302


563086-564174
hisB

histidine biosynthesis bifunctional protein HisB
1303


564240-564839
hisH

imidazole glycerol phosphate synthase subunit
1304





HisH


564875-565624
hisA

1-5-[methylideneamino] imidazole-4-
1305





carboxamide isomerase


565606-566382
hisF

imidazole glycerol phosphate synthase subunit
1306





HisF


566382-567047
hisI

histidine biosynthesis bifunctional protein hisIE
1307


567129-567281


conserved hypothetical protein
1308


567311-568513
tyrP

tyrosine-specific transport protein 1
1309


C 568699-569127
atpC

ATP synthase epsilon chain
1310


C 569157-570530
atpD

ATP synthase beta chain
1311


C 570547-571416
atpG

ATP synthase gamma chain
1312


C 571432-572973
atpA

ATP synthase alpha chain
1313


C 572986-573519
atpH

ATP synthase delta chain
1314


C 573532-574002
atpF

ATP synthase B chain
1315


C 574052-574306
atpE

ATP synthase C chain
1316


C 574362-575150
atpB

ATP synthase A chain
1317


C 575187-575561


predicted F0F1-type ATP synthase subunit I
1318


C 575687-576298
gidB

methyltransferase GidB
1319


C 576438-576815


conserved hypothetical protein
1320


576931-577533


predicted phosphatase/phosphohexomutase
1321


577533-578006


predicted membrane protein
1322


578191-578694
luxS

S-ribosylhomocysteinase
1323


C 579638-580345
aphA

Class B acid phosphatase
1324


580565-581092
hslV

ATP-dependent protease HslV
1325


581103-582437
hslU

HslU, ATP-dependent chaperone of the HslUV
1326





protease


C 582507-583559
potD2

spermidine/putrescine-binding periplasmic
1327





protein 2 precursor


583699-584664
ordL

probable oxidoreductase OrdL
1328


585158-586600
rmuC

DNA recombination protein RmuC
1329


586764-587183
rbsD

ribose transport permease protein
1330


587197-588678
rbsA

ribose transport ATP-binding protein
1331


588691-589662
rbsC

ribose transport permease protein
1332


589682-590560
rbsB

ribose-binding periplasmic protein
1333


590658-591578
rbsK

ribokinase
1334


591606-592604
rbsR

ribose operon repressor
1335


592710-593261


predicted membrane protein
1336


C 593401-593889
menG

S-adenosylmethionine:2-demethylmenaquinone
1337





methyltransferase


C 593491-594873
menA

1,4-dihydroxy-2-naphthoate
1338





octaprenyltransferase


594920-595639


conserved hypothetical protein
1339


C 595692-596678
tehA

tellurite resistance protein
1340


C 596830-601080
rpoC

DNA-directed RNA polymerase beta′ chain
1341


C 601287-605318
rpoB

DNA-directed RNA polymerase beta chain
1342


C 605702-606391
rplA

50S ribosomal protein L1
1343


C 606396-606824
rplK

50S ribosomal protein L11
1344


C 606997-607713
deoD

purine nucleoside phosphorylase
1345


C 607797-609050


predicted nucleoside permease
1346


C 609149-609937


predicted pyruvate-formate lyase-activating
1347





enzyme


C 609946-611490


conserved hypothetical glycyl radical protein
1348


611703-612359


predicted membrane protein
1349


C 612315-613385
waaQ

ADP-heptose--lipooligosaccharide
1350





heptosyltransferase III


C 613423-614502
fba

fructose-bisphosphate aldolase
1351


C 614613-615773
pgk

phosphoglycerate kinase
1352


C 615869-616663


probable ribonuclease I
1353


616741-617001


putative ferredoxin-like protein
1354


C 617135-618355
tyrQ

tyrosine-specific transport protein 2
1355


C 618392-618973
tdk

thymidine kinase
1356


C 618982-620010
gcp

probable O-sialoglycoprotein endopeptidase
1357


620242-620457
rpsU

30S ribosomal protein S21
1358


620591-622372


DNA primase
1359


622443-624314
rpoD

RNA polymerase sigma factor RpoD
1360


C 624624-626051
aspA

aspartate ammonia-lyase
1361


C 626198-627013
ureH

urease accessory protein UreH
1362


C 627088-627765
ureG

urease accessory protein UreG
1363


C 627868-628575
ureF

urease accessory protein UreF
1364


C 628560-629117
ureE

urease accessory protein UreE
1365


C 629245-630963
ureC

urease alpha subunit
1366


C 630975-631280
ureB

urease beta subunit
1367


C 631357-631659
ureA

urease gamma subunit
1368


631822-632112
groES

10 kDa chaperonin
1369


632135-633772
groEL

60 kDa chaperonin
1370


C 633946-634395
rplI

50S ribosomal protein L9
1371


C 634412-634639
rpsR

30S ribosomal protein S18
1372


C 634652-634798
priB

primosomal replication protein N
1373


C 634965-635342
rpsF

30S ribosomal protein S6
1374


C 635534-635788
infA

translation initiation factor IF-1
1375


635963-637037
lic2C

pseudogene for UDP-glucose--
1376





lipooligosaccharide glucosyltransferase


637121-637987
ksgA

dimethyladenosine transferase
1377


C 638024-638962
lic2A

UDP-Gal--lipooligosaccharide
1378





galactosyltransferase


639015-639842
apaH

bis-tetraphosphatase, symmetrical
1379


639852-640475


conserved hypothetical protein
1380


C 640601-642055
gnd

6-phosphogluconate dehydrogenase
1381





decarboxylating


C 642171-642713


conserved hypothetical protein
1382


C 642796-642984


conserved hypothetical protein
1383


C 643054-643791
devB

6-phosphogluconolactonase
1384


C 643895-645379
zwf

glucose-6-phosphate 1-dehydrogenase
1385


C 645455-646264
cysQ

cysQ
1386


C 646269-646685


conserved hypothetical protein
1387


C 646742-648724


predicted membrane protein
1388


648953-649348
hslR

heat shock protein 15 homolog
1389


C 649374-649826
asnC

regulatory protein AsnC
1390


649980-650972
asnA

aspartate--ammonia ligase
1391


651035-651388


conserved hypothetical transposase-like protein
1392


652001-652858


hypothetical protein
1393


C 653015-653131


hypothetical protein
1394


653423-653563


hypothetical protein
1395


C 653743-654417
gph

phosphoglycolate phosphatase
1396


C 654463-655167
rpe

ribulose-phosphate 3-epimerase
1397


C 655289-657709
gyrB

DNA gyrase subunit B
1398


C 657835-660147


predicted transcriptional accessory protein
1399


660243-660719
greB

transcription elongation factor GreB
1400


C 660789-661406


conserved hypothetical transcriptional regulator
1401


C 661406-662311
oxyR

hydrogen peroxide-inducible genes activator
1402


662434-663159
pdgC

peroxiredoxin•glutaredoxin
1403


C 663251-663472
slyX

slyX
1404


663570-664295
fkby

probable FKBP-type peptidyl-prolyl cis-trans
1405





isomerase


664385-665050


conserved hypothetical protein
1406


665050-665430


uncharacterized conserved protein involved in
1407





intracellular sulfur reduction


665427-665786


uncharacterized conserved protein involved in
1408





oxidation of intracellular sulfur


665795-666082


uncharacterized conserved protein involved in
1409





oxidation of intracellular sulfur


C 666213-667397
tufB

elongation factor Tu
1410


668103-669026
coaA

pantothenate kinase
1411


C 669117-670064
rseB

sigma-E factor regulatory protein RseB
1412


C 670144-670719
rseA

sigma-E factor negative regulator protein
1413





homolog


C 670744-671313
rpoE

RNA polymerase sigma-E factor
1414


C 671422-671679


conserved hypothetical protein
1415


C 671769-672155
mscL

large-conductance mechanosensitive channel
1416


C 672228-673604
trkA

Trk system potassium uptake protein TrkA
1417


C 673617-674969
sun

SUN protein
1418


C 674969-675925
fmt

methionyl-tRNA formyltransferase
1419


C 676026-676535
def

peptide deformylase
1420


676725-677162


hypothetical protein
1421


677140 . . . 677325


hypothetical protein
1422


C 683606-684259
sxy

DNA transformation protein TfoX
1423


684631-685695
recA

RecA
1424


685774-686232
recX

regulatory protein RecX
1425


C 686229-686615
crcB

CrcB
1426


C 686615-687433


predicted hydrolase of the HAD superfamily
1427


687836-688840
argF

ornithine carbamoyltransferase, catabolic
1428


688850-689782
arcC

carbamate kinase
1429


689887-691416


predicted membrane protein
1430


C 691826-694980
hgpD

pseudogene for hemoglobin-haptoglobin
1431





binding protein D


696052-696756
pepE

peptidase E
1432


696774-698069


predicted C4-dicarboxylate transporter
1433


698141-699412
abgA

aminobenzoyl-glutamate utilization protein A
1434


699505-701478
cpdB

2′,3′-cyclic-nucleotide 2′-phosphodiesterase
1435


C 701587-701994


HTH-type transcriptional regulator
1436


C 702004-702681


zinc transporter ZitB
1437


703189-705078
gidA

glucose inhibited division protein A
1438


705260-705634
rpsL

30S ribosomal protein S12
1439


705791-706261
rpsG

30S ribosomal protein S7
1440


706345-708447
fusA

elongation factor G
1441


708512-709696
tufB2

elongation factor Tu
1442


710156-710500


predicted chloride channel protein
1443


710491-710673


predicted chloride channel protein
1444


710676-711659
dusA

tRNA-dihydrouridine synthase A
1445


C 711707-711997


conserved hypothetical protein
1446


C 712020-713030
trpS

tryptophanyl-tRNA synthetase
1447


713153-713770


predicted protein involved in purine metabolism
1448


713793-715163
purB

adenylosuccinate lyase
1449


715435-715926
rplJ

50S ribosomal protein L10
1450


715981-716532
rplL

50S ribosomal protein L7/L12
1451


716511-717881
glmU

bifunctional GlmU protein
1452


C 717916-718227


hypothetical protein
1453


C 718766-719701
pldB

probable lysophospholipase L2
1454


C 719755-720820
asd

aspartate-semialdehyde dehydrogenase
1455


C 721036-721752


conserved hypothetical protein
1456


721944-722600


predicted 2-methylthioadenine synthetase
1457


C 722664-723242
mdaB

putative NADPH-quinone reductase, modulator
1458





of drug activity B


723521-725533
rep

ATP-dependent DNA helicase rep
1459


C 725539-725751


predicted periplasmic lipoprotein
1460


C 725748-726218
kdtB

phosphopantetheine adenylyltransferase
1461


C 726215-727498
kdtA

3-deoxy-D-manno-octulosonic acid transferase
1462


727561-728325
lgtF

UDP-glucose--lipooligosaccharide
1463





glucosyltransferase


C 728322-728885
tag

DNA-3-methyladenine glycosylase
1464


C 729000-729752


hypothetical protein
1465


C 729764-730570
aroE

shikimate 5-dehydrogenase
1466


C 730574-731125


predicted translation factor SUA5
1467


C 731141-731677


Zn-finger domain associated with
1468





topoisomerase type I


C 731687-733603


probable ABC transporter, ATP binding protein
1469


C 733788-734084


putative HTH-type transcriptional regulator
1470


C 734077-734259


conserved hypothetical protein
1471


C 734529-737507
hgpB

hemoglobin-haptoglobin binding protein B
1472


738238-739983


hypothetical ABC transporter, ATP-binding
1473





protein


739976-741634


probable ABC transporter, ATP binding protein
1474


C 741718-742749


conserved hypothetical protein
1475


C 742746-743066


conserved hypothetical protein
1476


C 743054-743362


putative HTH-type transcriptional regulator
1477


C 743567-744568
glpX

fructose-1,6-bisphosphatase class II GlpX
1478


744730-744948


conserved hypothetical protein
1479


745047-745487
mioC

MioC
1480


745484-745918
dtd

D-tyrosyl-tRNA (Tyr) deacylase
1481


C 746225-746701
ispF

2-C-methyl-D-erythritol 2,4-cyclodiphosphate
1482





synthase


C 746698-747375
ispD

2-C-methyl-D-erythritol 4-phosphate
1483





cytidylyltransferase


C 747375-747653
ftsB

cell division protein FtsB
1484


C 747776-748243
gpt

xanthine-guanine phosphoribosyltransferase
1485


748354-749808
pepD

aminoacyl-histidine dipeptidase
1486


C 749861-750748
xerC

site-specific recombinase XerC
1487


C 750742-751182


conserved hypothetical protein
1488


751348-752138
tpiA

triosephosphate isomerase
1489


C 752221-752538
glpE

thiosulfate sulfurtransferase GlpE
1490


C 752539-753417


conserved hypothetical protein
1491


C 753449-754327
ilvY

HTH-type transcriptional activator IlvY
1492


755011-756489
ilvC

ketol-acid reductoisomerase
1493


C 756600-757880
glpC

anaerobic glycerol-3-phosphate dehydrogenase
1494





subunit C


C 757891-759189
glpB

anaerobic glycerol-3-phosphate dehydrogenase
1495





subunit B


C 759179-760870
glpA

anaerobic glycerol-3-phosphate dehydrogenase
1496





subunit A


761163-762605
glpT

glycerol-3-phosphate transporter
1497


762709-763803
glpQ

glycerophosphoryl diester phosphodiesterase
1498





precursor


764033-764827
glpF

glycerol uptake facilitator protein
1499


764848-766359
glpK

glycerol kinase
1500


C 766443-766910
gpt2

xanthine-guanine phosphoribosyltransferase
1501


766932-767069


pseudogene for glycerophosphoryl diester
1502





phosphodiesterase precursor


767195-768019
hel

outer membrane protein P4, NADP phosphatase
1503


768186-768869
rluE

ribosomal large subunit pseudouridine synthase E
1504


C 768998-770002
ppx

pseudogene for putative exopolyphosphatase
1505


C 769974-773858


conserved hypothetical protein
1506


C 773868-775604


conserved hypothetical protein
1507


C 775755-776327
slyD

FKBP-type peptidyl-prolyl cis-trans isomerase
1508





SlyD


C 776405-776827


conserved hypothetical protein
1509


776936-777955
truD

tRNA pseudouridine synthase D
1510


777965-778714
surE

acid phosphatase surE
1511


778723-779301


conserved hypothetical protein
1512


779310-779495


conserved hypothetical protein
1513


779530-780747
lppB

outer membrane antigenic lipoprotein B
1514


781115-782533
tnaA

tryptophanase
1515


782617-783849
tnaB

tryptophan-specific transport protein
1516


783990-786575
mutS

DNA mismatch repair protein MutS
1517


786781-788166
selA

L-seryl-tRNA selenium transferase
1518


788163-790022
selB

selenocysteine-specific elongation factor
1519


790041-790895


hypothetical protein
1520


791015-791311


conserved hypothetical protein
1521


791311-791619


conserved hypothetical protein
1522


C 791676-794868
hgpC

pseudogene for hemoglobin-haptoglobin
1523





utilization protein C


795165-796463
tig

trigger factor
1524


796586-797167
clpP

ATP-dependent Clp protease proteolytic subunit
1525


797177-798412
clpX

ATP-dependent Clp protease ATP-binding
1526





subunit ClpX


798557-798973
secE

preprotein translocase SecE
1527


798975-799532
nusG

transcription antitermination protein NusG
1528


799683-800435
vacJ

VacJ lipoprotein
1529


800489-800881


putative translation initiation inhibitor YjgF
1530





family


801049-801900
htpX

probable protease HtpX
1531


C 801942-802223
sirA

SirA
1532


802297-802917


conserved hypothetical protein
1533


802290-804383
trkH

Trk system potassium uptake protein TrkH
1534


810818-811237
psiE

PsiE
1535


C 811301-812587
hemY

HemY
1536


C 812598-813809
hemX

putative uroporphyrin-III C-methyltransferase
1537


814115-816646
cya

adenylate cyclase
1538


816720-817727
gpsA

glycerol-3-phosphate dehydrogenase
1539


817744-818547
cysE

serine acetyltransferase
1540


818557-819372


conserved hypothetical shikimate 5-
1541





dehydrogenase-like protein


819510-820895


possible di- and tricarboxylate transporter
1542


C 821033-821881
folD

FolD bifunctional protein
1543


C 822409-823695
fucP

L-fucose permease
1544


C 823734-824384
fucA

L-fuculose phosphate aldolase
1545


C 824404-824838
fucU

fucose operon protein FucU
1546


C 824852-826264
fucK

L-fuculokinase
1547


C 826337-828151
fucI

L-fucose isomerase
1548


C 828336-829085
fucR

L-fucose operon activator
1549


829284-832055
hepA

RNA polymerase associated protein homolog
1550


832058-832717
rluA

ribosomal large subunit pseudouridine synthase A
1551


832744-833322
glpG

GlpG
1552


833352-834119
glpR

glycerol-3-phosphate regulon repressor
1553


C 834421-835242
metQ

probable D-methionine-binding lipoprotein
1554





MetQ


C 835281-835970
metI

probable D-methionine transport system
1555





permease protein


C 835960-836997
metN

probable D-methionine transport ATP-binding
1556





protein


837173-837727


conserved hypothetical protein
1557


843979-844539


conserved hypothetical protein
1558


844543-844983


conserved hypothetical protein
1559


C 845045-845671
narP

nitrate/nitrite response regulator protein
1560


C 845877-847124
lysA

diaminopimelate decarboxylase
1561


C 847172-847303


hypothetical protein
1562


847386-847691
cyaY

CyaY
1563


847693-849552
recQ

ATP-dependent DNA helicase RecQ
1564


849637-851355
proS

prolyl-tRNA synthetase
1565


C 851443-853791
ostA

organic solvent tolerance protein
1566


C 853914-855323
sufI

SufI
1567


C 855325-856047
plsC

1-acyl-sn-glycerol-3-phosphate acyltransferase
1568


856133-856846
lpxH

UDP-2,3-diacylglucosamine hydrolase
1569


C 857076-858602


conserved hypothetical sodium dependent
1570





transporter


858849-859475
ilvG

acetolactate synthase isozyme II large subunit
1571


859517-861355
ilvD

dihydroxy-acid dehydratase
1572


861435-862976
thdI

threonine dehydratase biosynthetic
1573


863014-866493
dnaE

DNA polymerase III alpha subunit
1574


866732-868375
pgmB

phosphoglucomutase
1575


C 868542-869051
secB

protein-export protein SecB
1576


C 869065-869511


predicted rhodanese-related sulfurtransferase
1577


869759-871081
dcuB

anaerobic C4-dicarboxylate transporter DcuB
1578


871285-872619
ndhA

NADH dehydrogenase
1579


C 872699-875131
plsB

glycerol-3-phosphate acyltransferase
1580


875380-876006
lexA

LexA repressor
1581


876152-876976
dapF

diaminopimelate epimerase
1582


877196-877693
tpx

probable thiol peroxidase
1583


878256-882218
purL

phosphoribosylformylglycinamidine synthase
1584


882414-882652
lex2A

pseudogene for Lex2A homolog
1585


882724-883467
lex2B

UDP-glucose--lipooligosaccharide
1586





glucosyltransferase


C 883528-884370


conserved hypothetical protein
1587


C 884370-885602


predicted membrane-bound metallopeptidase
1588


885778-886461
gpmA

2,3-bisphosphoglycerate-dependent
1589





phosphoglycerate mutase


C 886539-886751
rpL31

50S ribosomal protein L31
1590


886928-888064
mutY

A/G-specific adenine glycosylase
1591


888042-888314


conserved hypothetical protein
1592


888329-889402
mltC

membrane-bound lytic murein transglycosylase C
1593





precursor


C 890013-890693


predicted diadenosine tetraphosphatase and
1594





related serine/threonine protein phosphatase


C 890690-891955
nadR

bifunctional protein NadR
1595


892174-892821
ribB

3,4-dihydroxy-2-butanone 4-phosphate
1596





synthase


892836-893684
lpsA

lipooligosaccharide glycosyl transferase
1597


C 893733-894215


conserved hypothetical tRNA/rRNA
1598





methyltransferase


C 894225-894806


predicted N6-adenine-specific methylase
1599


894861-896105
ftsY

cell division protein FtsY
1600


896124-896780
ftsE

cell division ATP-binding protein FtsE
1601


896790-897722
ftsX

cell division protein FtsX
1602


C 897817-898998
atoB

acetyl-CoA acetyltransferase
1603


C 899013-900356
atoE

short chain fatty acids transporter
1604


C 900359-901024
atoA

acetate CoA-transferase beta subunit
1605


C 901035-901688
atoD

acetate CoA-transferase alpha subunit
1606


901894-902841


putative HTH-type transcriptional regulator
1607


903078-903389
rpsJ

30S ribosomal protein S10
1608


903406-904032
rplC

50S ribosomal protein L3
1609


904048-904650
rplD

50S ribosomal protein L4
1610


904647-904946
rplW

50S ribosomal protein L23
1611


904964-905785
rplB

50S ribosomal protein L2
1612


905811-906086
rpsS

30S ribosomal protein S19
1613


906098-906430
rplV

50S ribosomal protein L22
1614


906448-907155
rpsC

30S ribosomal protein S3
1615


907169-907579
rplP

50S ribosomal protein L16
1616


907579-907770
rpmC

50S ribosomal protein L29
1617


907770-908027
rpsQ

30S ribosomal protein S17
1618


908417-909040


conserved hypothetical protein
1619


909321-909692
rplN

50S ribosomal protein L14
1620


909703-910014
rplX

50S ribosomal protein L24
1621


910032-910571
rplE

50S ribosomal protein L5
1622


910583-910888
rpsN

30S ribosomal protein S14
1623


910925-911317
rpsH

30S ribosomal protein S8
1624


911333-911866
rplF

50S ribosomal protein L6
1625


911880-912233
rplR

50S ribosomal protein L18
1626


912248-912748
rpsE

30S ribosomal protein S5
1627


912755-912934
rpmD

50S ribosomal protein L30
1628


912938-913372
rplO

50S ribosomal protein L15
1629


913380-914705
secY

preprotein translocase SecY subunit
1630


914987-915343
rpsM

30S ribosomal protein S13
1631


915359-915748
rpsK

30S ribosomal protein S11
1632


915776-916396
rpsD

30S ribosomal protein S4
1633


916428-917414
rpoA

DNA-directed RNA polymerase alpha chain
1634


917455-917841
rplQ

50S ribosomal protein L17
1635


C 917968-918564


predicted cAMP-binding protein - catabolite
1636





gene activator and regulatory subunit of cAMP-





dependent protein kinase


C 918683-919383


pseudogene for conserved hypothetical protein,
1637





predicted arylsulfatase regulator, Fe—S





oxidoreductase


919587-920326


predicted permease
1638


C 920358-921551
dxr

1-deoxy-D-xylulose 5-phosphate
1639





reductoisomerase


C 921661-922218
frr

ribosome recycling factor
1640


C 922370-923986
pckA

phosphoenolpyruvate carboxykinase
1641


C 924159-925040
hslO

33 kDa chaperonin
1642


C 925185-926558
argH

argininosuccinate lyase
1643


C 926693-927580
galU

UTP-glucose-1-phosphate uridylyltransferase
1644


C 927602-927793
csrA

carbon storage regulator homolog
1645


C 927871-930495
alaS

alanyl-tRNA synthetase
1646


C 930682-931107
uspA

universal stress protein A
1647


C 931209-932501
pepP

Xaa-Pro aminopeptidase
1648


C 932513-933061


conserved hypothetical protein
1649


C 933232-934257
galM

aldose 1-epimerase
1650


C 934264-935460
galK

galactokinase
1651


C 935495-936544
galT

galactose-1-phosphate uridylyltransferase
1652


936752-937750
galR

HTH-type transcriptional regulator GalR
1653


937841-938890
mglB

D-galactose-binding periplasmic protein
1654





precursor


938956-940476
mglA

galactoside transport ATP-binding protein
1655





MglA


940493-941503
mglC

galactoside transport system permease protein
1656





MglC


941649-942383


conserved hypothetical protein
1657


942389-942946
ispZ

probable intracellular septation protein A
1658


942946-943410


putative acyl-CoA thioester hydrolase
1659


943427-943723


conserved hypothetical protein
1660


943739-945520
slt

putative soluble lytic murein transglycosylase
1661


945558-945863
trpR

Trp operon repressor
1662


946006-946590
mtgA

monofunctional biosynthetic peptidoglycan
1663





transglycosylase


C 946696-947040
frdD

fumarate reductase subunit D
1664


C 947053-947463
frdC

fumarate reductase subunit C
1665


C 947474-948244
frdB

fumarate reductase iron-sulfur protein
1666


C 948327-950036
frdA

fumarate reductase flavoprotein subunit
1667


950217-951218
genX

putative lysyl-tRNA synthetase
1668


C 951484-952167
cpxR

transcriptional regulatory protein CpxR
1669


C 952221-952634
smpA

small protein A
1670


C 952700-953725
ndpA

nucleoid-associated protein NdpA
1671


953842-954060


conserved hypothetical protein
1672


954062-955819


predicted hydrolase of alkaline phosphatase
1673





superfamily


956017-957009


conserved hypothetical protein
1674


C 957148-957726
mobA

probable molybdopterin-guanine dinucleotide
1675





biosynthesis protein A


957791-958057


conserved hypothetical protein
1676


958068-958685
dsbA

thiol:disulfide interchange protein DsbA
1677


958747-959085


conserved hypothetical protein
1678


959226-960317
trmA

tRNA (Uracil-5)-methyltransferase
1679


960506-961072


conserved hypothetical protein
1680


961066-961500


predicted positive regulator of Sigma E
1681


961901-962437
mobB

molybdopterin-guanine dinucleotide
1682





biosynthesis protein B


962434-963825


conserved hypothetical protein
1683


963966-965615
hbpA

heme-binding protein A
1684


965791-966552


putative heme iron utilization protein
1685


C 966689-967036


conserved hypothetical protein
1686


C 967051-969843
polA

DNA polymerase I
1687


969992-970294


conserved hypothetical protein
1688


970617-971180


predicted 5-formyltetrahydrofolate cyclo-ligase
1689


971308-973878
clpB

ClpB
1690


C 973916-974656


probable tRNA/rRNA methyltransferase
1691


C 974712-977060
vacB

ribonuclease R
1692


C 977252-977959


conserved hypothetical protein
1693


978095-978784
pdxH

pyridoxamine 5′-phosphate oxidase
1694


C 978871-980721
typA

GTP-binding protein TypA/BipA
1695


C 980848-981828
lic3A2

CMP-neu5Ac--lipooligosaccharide alpha 2-3
1696





sialyltransferase


981985-983397
glnA

glutamine synthetase
1697


983484-984527
rmlB

dTDP-glucose 4,6-dehydratase
1698


984604-985908
pepB

peptidase B
1699


985915-986340
ndk

nucleoside diphosphate kinase
1700


C 986410-987582


conserved hypothetical GTP-binding protein
1701


C 987610-988530


conserved hypothetical transport protein
1702


C 988610-988867
rpmA

50S ribosomal protein L27
1703


C 988888-989199
rplU

50S ribosomal protein L21
1704


989412-990401
ispB

octaprenyl-diphosphate synthase
1705


990452-991189


conserved hypothetical protein
1706


C 991249-992619


predicted Na+/alanine symporter
1707


C 993019-993729
arcA

aerobic respiration control protein ArcA
1708


993921-995660
dsbD

thiol:disulfide interchange protein DsbD
1709


995777-996181


predicted membrane protein
1710


996370-997968
purH

bifunctional purine biosynthesis protein PurH
1711


998064-999353
purD

phosphoribosylamine--glycine ligase
1712


999509-1001076


pseudogene for predicted membrane-associated,
1713





metal-dependent hydrolase


1001199-1002464
glyA

serine hydroxymethyltransferase
1714


1002523-1003155
coaE

dephospho-CoA kinase
1715


1003148-1003354


conserved hypothetical zinc-binding protein
1716


1003463-1004719
rhlB

ATP-dependent RNA helicase RhlB
1717


1004961-1005524


hypothetical transcriptional regulator
1718


1005556-1006752


predicted membrane-fusion protein
1719


1006752-1009850


predicted cation/multidrug efflux pump
1720


1010385-1010855


predicted cell division protein
1721


C 1010868-1012400
emrB

multidrug resistance protein
1722


C 1012410-1013582
emrA

multidrug resistance protein A
1723


C 1013755-1014237
folA

dihydrofolate reductase
1724


1014339-1015445
proB

glutamate 5-kinase
1725


1015526-1016116
nudH

probable nucleoside polyphosphate hydrolase
1726


1016116-1016910


predicted permease
1727


1016919-1017725
lgt

prolipoprotein diacylglyceryl transferase
1728


1017735-1018586
thyA

thymidylate synthase
1729


1018586-1019107


conserved hypothetical protein
1730


C 1019135-1019449


conserved hypothetical protein
1731


1019520-1019831


conserved hypothetical protein
1732


1019956-1022661
secA

preprotein translocase SecA subunit
1733


1022726-1023136
mutT

mutator protein MutT
1734


1023245-1025101
kefB

glutathione-regulated potassium-efflux system
1735





protein


1025202-1025966


conserved hypothetical SAM-dependent
1736





methyltransferase


1026111-1026866
rpsB

30S ribosomal protein S2
1737


1027000-1027851
tsf

elongation factor Ts
1738


C 1028031-1029056
lpxD

UDP-3-O-[3-hydroxymyristoyl] glucosamine
1739





N-acyltransferase


C 1029069-1029662
omp26

outer membrane protein 26
1740


C 1029770-1032172


protective surface antigen D15
1741


C 1032192-1033523


predicted membrane bound zinc
1742





metalloprotease with PDZ domain


C 1033533-1034399
cdsA

phosphatidate cytidylyltransferase
1743


C 1034417-1035136
uppS

undecaprenyl pyrophosphate synthetase
1744


1035385-1037970
leuS

leucyl-tRNA synthetase
1745


1038020-1038568


conserved predicted lipoprotein
1746


1038568-1039602
holA

DNA polymerase III, delta subunit
1747


C 1039836-1042004
glyS

glycyl-tRNA synthetase beta chain
1748


C 1042050-1042466


hypothetical protein
1749


C 1042606-1042971


conserved hypothetical protein
1750


C 1043001-1043261


glutaredoxin-related protein
1751


C 1043320-1044228
glyQ

glycyl-tRNA synthetase alpha chain
1752


1044491-1046017
hktE

catalase
1753


C 1046249-1047430


predicted glutathionylspermidine synthase
1754


C 1047431-1048012


conserved hypothetical protein
1755


C 1048025-1048477


conserved hypothetical protein
1756


C 1048528-1049838
eno

enolase
1757


C 1049959-1051164


conserved hypothetical protein
1758


C 1051161-1052303
nrfF

formate-dependent nitrite reductase complex
1759





nrfFG subunit


C 1052300-1052830
dsbE2

probable thiol:disulfide interchange protein
1760





DsbE


C 1052830-1054737
nrfE

cytochrome c-type biogenesis protein NrfE
1761


C 1054846-1055658
suhB

inositol-1-monophosphatase
1762


1055858-1056370


conserved hypothetical protein
1763


1056370-1057089


predicted Type II secretory pathway, PulJ-like
1764





protein


1057086-1057769


conserved hypothetical protein
1765


1057774-1058049


conserved hypothetical protein
1766


1058095-1061460


exodeoxyribonuclease V gamma chain
1767


1061515-1061964


predicted transcriptional regulator
1768


1061967-1063085
ribD

riboflavin biosynthesis protein RibD
1769


1063086-1064108
degS

protease DegS
1770


C 1064182-1064997
mutM

formamidopyrimidine-DNA glycosylase
1771


C 1065229-1066764
ddc

L-2,4-diaminobutyrate decarboxylase
1772


C 1066784-1067182


predicted nucleic acid-binding protein, contains
1773





PIN domain


C 1067182-1067415


conserved hypothetical protein
1774


C 1067562-1068926
dat

diaminobutyrate--2-oxoglutarate
1775





aminotransferase


C 1069273-1069443
rpmG

50S ribosomal protein L33
1776


C 1069455-1069691
rpmB

50S ribosomal protein L28
1777


C 1069905-1070609
radC

DNA repair protein RadC homolog
1778


1070733-1071935
dfp

phosphopantothenoylcysteine
1779





synthetase/decarboxylase


1071983-1072438
dut

deoxyuridine 5′-triphosphate
1780





nucleotidohydrolase


1072442-1073098
ttk

ttk
1781


1073120-1073371


hypothetical protein
1782


1073356-1074030
crp

catabolite gene activator
1783


C 1074395-1075573


hypothetical RNA methyltransferase
1784


C 1075566-1076621
nagZ

beta-hexosaminidase
1785


C 1076625-1076975


predicted periplasmic lipoprotein
1786


C 1076975-1077367


HIT-like protein
1787


C 1077442-1080267
ileS

isoleucyl-tRNA synthetase
1788


C 1080296-1081234
ribF

riboflavin biosynthesis protein RibF
1789


C 1081269-1082801
mviN

putative virulence factor MviN
1790


1083104-1083373
rpsT

30S ribosomal protein S20
1791


C 1083445-1084011


conserved hypothetical protein
1792


1084157-1085014
menB

naphthoate synthase
1793


1085184-1086173
menC

O-succinylbenzoate synthase
1794


1086230-1086679
aroQ

3-dehydroquinate dehydratase
1795


1086833-1087300
accB

biotin carboxyl carrier protein of acetyl-CoA
1796





carboxylase


1087477-1088823
accC

biotin carboxylase
1797


1089028-1089285


conserved hypothetical membrane protein
1798


1089282-1090736
panF

sodium/pantothenate symporter
1799


1090857-1091729


conserved hypothetical protein
1800


1091764-1092651
prmA

ribosomal protein L11 methyltransferase
1801


1092798-1093778
dusB

tRNA-dihydrouridine synthase B
1802


1093772-1094071
fis

DNA-binding protein fis
1803


C 1094172-1094657
smpB

SsrA-binding protein
1804


C 1094896-1095861
pfkA

6-phosphofructokinase
1805


C 1095957-1096538


conserved hypothetical protein
1806


C 1096531-1097382


conserved hypothetical protein
1807


C 1097360-1098481
smf

smf
1808


1099025-1100572
leuA

2-isopropylmalate synthase
1809


1100735-1101811
leuB

3-isopropylmalate dehydrogenase
1810


1101988-1103397
leuC

3-isopropylmalate dehydratase large subunit
1811


1103422-1104024
leuD

3-isopropylmalate dehydratase small subunit
1812


C 1104138-1109522
Iga1

IgA-specific serine endopeptidase
1813


C 1109749-1110828
recF

DNA replication and repair protein RecF
1814


C 1110830-1111930
dnaN

DNA polymerase III, beta chain
1815


C 1112123-1113487
dnaA

chromosomal replication initiator protein DnaA
1816


C 1113674-1116412
tbp1

transferrin-binding protein 1
1817


C 1116457-1118349
tbp2

transferrin-binding protein 2
1818


1118446-1119906


conserved hypothetical protein
1819


1120082-1120216
rpmH

50S ribosomal protein L34
1820


1120229-1120588
rnpA

ribonuclease P protein component
1821


1120552-1120812


conserved hypothetical protein
1822


1120812-1122437
yidC

preprotein translocase subunit YidC
1823


1123308-1124666
trmE

probable tRNA modification GTPase TrmE
1824


1124855-1126723
ppiD

peptidyl-prolyl cis-trans isomerase D
1825


1126811-1128370


predicted PR--lipooligosaccharide
1826





phosphorylethanolamine transferase


1128440-1128955
lspA

lipoprotein signal peptidase
1827


1128952-1129896
ispH

4-hydroxy-3-methylbut-2-enyl diphosphate
1828





reductase


1130065-1130412


conserved hypothetical protein
1829


1130659-1131657
tbpA

thiamine-binding periplasmic protein
1830


1131662-1133278
thiP

thiamine transport system permease protein
1831


1133256-1133909
thiQ

thiamine transport ATP-binding protein
1832


1134022-1135023
bioB

biotin synthase
1833


C 1135148-1137145
tktA

transketolase
1834


1137526-1138470
serB

phosphoserine phosphatase
1835


1138488-1138979


conserved hypothetical protein
1836


1139275-1140222
corA

magnesium and cobalt transport protein CorA
1837


1140213-1140767


predicted integral membrane protein
1838


C 1140947-1141831


predicted glutamine amidotransferase
1839


C 1141988-1142515


hypothetical protein
1840


1142645-1143847


predicted ATPase
1841


C 1144037-1145386


hypothetical protein
1842


C 1145854-1146354


predicted ferredoxin
1843


C 1146354-1146964


conserved hypothetical protein
1844


C 1147077-1147916
dmsC

anaerobic dimethyl sulfoxide reductase chain C
1845


C 1147918-1148535
dmsB

anaerobic dimethyl sulfoxide reductase chain B
1846


C 1148546-1150966
dmsA

anaerobic dimethyl sulfoxide reductase chain A
1847


1151219-1152328


conserved hypothetical protein
1848


1152372-1152665


putative mercuric transport MerT homolog
1849


1152674-1152952


predicted copper chaperone MerP homolog
1850


C 1153077-1154921


conserved hypothetical ABC transporter
1851


C 1155000-1155896


conserved hypothetical transcriptional regulator
1852


1156008-1156349


conserved putative gamma-
1853





carboxymuconolactone decarboxylase subunit


C 1156460-1157056


conserved hypothetical protein
1854


C 1157062-1159227
res

putative type III restriction-modification system
1855





HindVIP enzyme res


C 1159217-1161318
mod

pseudogene for putative type III restriction-
1856





modification system HindVIP enzyme mod


C 1161534-1162127
rnhB

ribonuclease HII
1857


C 1162120-1163292
lpxB

lipid-A-disaccharide synthase
1858


C 1163359-1164147
lpxA

acyl-[acyl-carrier-protein]--UDP-N-
1859





acetylglucosamine O-acyltransferase


C 1164161-1164607
fabZ

(3R)-hydroxymyristoyl-[acyl carrier protein]
1860





dehydratase


C 1164683-1166290


predicted PR--lipooligosaccharide
1861





phosphorylethanolamine transferase


1166383-1167096
pyrH

uridylate kinase
1862


C 1167335-1168300
nrfD

NrfD, formate-dependent nitrite reductase,
1863





membrane component


C 1168297-1168974
nrfC

NrfC, Fe—S-cluster-containing hydrogenase
1864





component 1


1168971-1169654
nrfB

NrfB, cytochrome C-type protein
1865


C 1169698-1171314
nrfA

cytochrome c552
1866


C 1171476-1175384
hrpA

ATP-dependent helicase HrpA homolog
1867


C 1175381-1175755


conserved putative small membrane protein
1868


C 1175756-1176208


conserved putative membrane protein
1869


C 1176474-1177610
cyoB

probable cytochrome oxidase subunit II
1870


C 1177625-1179190
cyoA

probable cytochrome oxidase subunit I
1871


1179843-1181480
pyrG

CTP synthase
1872


C 1181625-1182305
pnuC

nicotinamide riboside transporter
1873


C 1182487-1183260


probable amino-acid ABC transporter ATP-
1874





binding protein


C 1183264-1183941


probable amino-acid ABC transporter permease
1875





protein


C 1183964-1184737


probable amino-acid ABC transporter binding
1876





protein


C 1185038-1186312
murA

UDP-N-acetylglucosamine 1-
1877





carboxyvinyltransferase


C 1186323-1186580


predicted transcriptional regulator, BolA
1878





superfamily


C 1186601-1186918


predicted NTP binding protein, contains STAS
1879





domain


C 1186929-1187573


conserved ABC-type transport system protein
1880


C 1187606-1188106


conserved ABC-type transport system protein,
1881





periplasmic component


C 1188116-1188901


conserved ABC-type transport system protein,
1882





permease component


C 1188898-1189692


conserved ABC-type transport system protein,
1883





ATPase component


C 1190098-1190745
sodA

superoxide dismutase [Mn]
1884


1190994-1191632
ccmA

heme exporter protein A
1885


1191637-1192302
ccmB

heme exporter protein B
1886


1192363-1193103
ccmC

heme exporter protein C
1887


1193146-1193349
ccmD

heme exporter protein D
1888


1193346-1193867
ccmE

cytochrome c-type biogenesis protein CcmE
1889


1193864-1195813
ccmF

cytochrome c-type biogenesis protein CcmF
1890


1195953-1196498
dsbE

thiol:disulfide interchange protein DsbE
1891


1196498-1197877
ccmH

pseudogene for cytochrome c-type biogenesis
1892





protein CcmH precursor


1197998-1198489


hypothetical protein
1893


1198491-1198799


conserved hypothetical protein
1894


C 1198804-1200843
ligN

DNA ligase
1895


C 1200928-1201914
zipA

cell division protein ZipA
1896


1202067-1202885
cysZ

CysZ
1897


1202894-1203934
cysK

cysteine synthase
1898


1204182-1205406


pseudogene for conserved hypothetical
1899





metabolite transport protein


C 1205515-1206555
rfaF

ADP-heptose--lipooligosaccharide
1900





heptosyltransferase II


C 1206631-1207794
xylR

xylose operon regulatory protein
1901


C 1207906-1209312


conserved hypothetical Na(+)/H(+) antiporter
1902


C 1209314-1210486
aspC2

putative aspartate aminotransferase
1903


1210699-1212018
xylA

xylose isomerase
1904


1212021-1213556
xylB

xylulose kinase
1905


C 1213607-1214533
rfaD

ADP-L-glycero-D-manno-heptose-6-epimerase
1906


C 1214586-1215089


thioredoxin-like protein
1907


1215200-1215871
deoC

deoxyribose-phosphate aldolase
1908


C 1215905-1217434
comM

competence protein ComM
1909


C 1217549-1218166
engB

Probable GTP-binding protein EngB
1910


1218259-1219137


D-xylose transport permease protein
1911


C 1219177-1220175
oppF

oligopeptide transport ATP-binding protein
1912


C 1220172-1221143
oppD

oligopeptide transport ATP-binding protein
1913


C 1221153-1222088
oppC

oligopeptide transport system permease protein
1914


C 1222098-1223018
oppB

oligopeptide transport system permease protein
1915


C 1223099-1224724
oppA

periplasmic oligopeptide-binding protein
1916


1225019-1225972
talB

Transaldolase
1917


1227033-1228634


carbon starvation protein, predicted membrane
1918





protein


1228836-1229291
mraZ

MraZ
1919


1229322-1230287
mraW

predicted S-adenosylmethionine-dependent
1920





methyltransferase involved in cell envelope





biogenesis


1230290-1230613
ftsL

cell division protein FtsL
1921


1230626-1232458
ftsI

peptidoglycan synthetase FtsI
1922


1232468-1233934
murE

UDP-N-acetylmuramoylalanyl-D-glutamate--
1923





2,6-diaminopimelate ligase


1233948-1235321
murF

UDP-N-acetylmuramoyl-tripeptide--D-alanyl-
1924





D-alanine ligase


1235315-1236397
mraY

phospho-N-acetylmuramoyl-pentapeptide-
1925





transferase


1236509-1237822
murD

UDP-N-acetylmuramoylalanine--D-glutamate
1926





ligase


1237845-1239029
ftsW

cell division protein FtsW
1927


1239041-1240096
murG

UDP-N-acetylglucosamine--N-
1928





acetylmuramylpyrophosphoryl-undecaprenol





N-acetylglucosamine transferase


1240234-1241661
murC

UDP-N-acetylmuramate--L-alanine ligase
1929


1241730-1242650
ddlB

D-alanine--D-alanine ligase
1930


1242650-1243414
ftsQ

cell division protein FtsQ
1931


1243433-1244710
ftsA

cell division protein FtsA
1932


1244794-1246059
ftsZ

cell division protein FtsZ
1933


1246098-1247015
lpxC

UDP-3-O-[3-hydroxymyristoyl] N-
1934





acetylglucosamine deacetylase


1247142-1248299
pheA

P-protein
1935


C 1248344-1249201


predicted P-loop-containing kinase
1936


C 1249219-1249713
ptsN

nitrogen regulatory IIA protein
1937


C 1249716-1250441


probable ABC transporter ATP-binding protein
1938


C 1250445-1250963


conserved hypothetical protein
1939


C 1250944-1251552


conserved hypothetical protein
1940


C 1251602-1252138


conserved hypothetical protein
1941


1252227-1253582
pmbA

pmbA
1942


1253957-1254496
hpt

hypoxanthine phosphoribosyltransferase
1943


1254802-1256124


predicted Na+/dicarboxylate symporter
1944


C 1256224-1256691
nrdG

anaerobic ribonucleoside-triphosphate
1945





reductase activating protein


C 1256939-1258669
cydC

transport ATP-binding protein CydC
1946


C 1258669-1260429
cydD

transport ATP binding protein CydD
1947


C 1260501-1261457
trxB

thioredoxin reductase
1948


C 1261531-1262388


thioredoxin domain-containing protein
1949


C 1262467-1263438
hemH

ferrochelatase
1950


C 1263435-1263851


conserved hypothetical protein
1951


C 1263865-1266948


conserved FAD/FMN-containing
1952





dehydrogenase


1267323-1268402
ompP5

outer membrane protein P5
1953


1268514-1268876


conserved glutaredoxin-related protein
1954


C 1268953-1270053


histidinol-phosphate aminotransferase 2
1955


C 1270141-1271229
serC

phosphoserine aminotransferase
1956


1271420-1271764


conserved hypothetical protein
1957


C 1271804-1272370


conserved hypothetical protein
1958


C 1272370-1273356


conserved hypothetical protein
1959


1273398-1273979
trpG2

putative anthranilate synthase component II
1960


1274064-1275218
metK

S-adenosylmethionine synthetase
1961


1275513-1276028
sprT

SprT
1962


1276079-1276591
opa

opacity protein
1963


C 1276627-1276932


conserved hypothetical protein
1964


C 1277139-1277822
artM

arginine transport system permease protein
1965


C 1277822-1278487
artQ

arginine transport system permease protein
1966


C 1278491-1279210
artI

arginine-binding periplasmic protein
1967


C 1279228-1279959
artP

arginine transport ATP-binding protein
1968


C 1280085-1280669
gmhA

phosphoheptose isomerase
1969


1280769-1281683
ligA

DNA ligase
1970


C 1281712-1282695
dppF

dipeptide transport ATP binding protein
1971


C 1282698-1283690
dppD

dipeptide transport ATP binding protein
1972


C 1283700-1284587
dppC

dipeptide transport system permease protein
1973


C 1284602-1285603
dppB

dipeptide transport system permease protein
1974


C 1285693-1287876
uvrD

DNA helicase II
1975


C 1288483-1289118


predicted organic radical activating enzyme
1976


C 1289119-1289544


predicted 6-pyruvoyl-tetrahydropterin synthase
1977


C 1289537-1290220


predicted PP-loop superfamily ATPase
1978


1290376-1290519


conserved hypothetical protein
1979


C 1290570-1291601
ilvE

branched chain amino acid amino transferase
1980


1292076-1292969
gcvA

glycine cleavage system transcriptional
1981





activator


1292962-1294053


predicted SAM-dependent methyltransferase
1982


1294089-1295258
sucC

succinyl-CoA synthetase beta chain
1983


1295258-1296139
sucD

succinyl-CoA synthetase alpha chain
1984


1296260-1296883


putative translation factor, Sua5
1985


1296931-1298004
rluB

ribosomal large subunit pseudouridine synthase B
1986


1298015-1298986
cysB

HTH-type transcriptional regulator CysB
1987


C 1299045-1299989


conserved hypothetical adenine-specific
1988





methylase


1300120-1300623


conserved hypothetical protein
1989


C 1300701-1302836
pta

phosphate acetyltransferase
1990


C 1302904-1304109
ackA

acetate kinase
1991


1304326-1304769


conserved hypothetical protein
1992


1304856-1305347
cvpA

colicin V production protein
1993


1305358-1306875


amidophosphoribosyltransferase
1994


C 1306940-1307830
sulA

cell division inhibitor SulA
1995


C 1307830-1308285
argR

arginine repressor
1996


1308492-1309427
mdh

malate dehydrogenase
1997


C 1309514-1311022
lysS

lysyl-tRNA synthetase
1998


C 1311134-1311985
prfB

peptide chain release factor 2
1999


1312365-1313054
dsbC

thiol:disulfide interchange protein DsbC
2000


1313342-1315069
recJ

single stranded DNA specific exonuclease RecJ
2001


1315062-1315769


conserved hypothetical protein
2002


1315766-1316458
mtnA

MTA/SAH nucleosidase
2003


C 1316532-1319279
hup

heme utilization protein
2004


C 1319579-1321177


putative L-Lactate permease
2005


1321367-1321546


pseudogene for cytidylate kinase
2006


C 1321581-1322585
fbp

fructose-1,6-bisphosphatase
2007


1322820-1323629
truA

tRNA pseudouridine synthase A
2008


C 1323722-1324738
sapZ

Predicted membrane protein
2009


C 1324750-1325559
sapF

ABC-type transport system, ATPase component
2010





involved in antimicrobial peptide resistance


C 1325559-1326608
sapD

ABC-type transport system, ATP binding
2011





component, involved in antimicrobial peptide





resistance


C 1326612-1327499
sapC

ABC-type transport system, permease protein,
2012





involved in antimicrobial peptide resistance


C 1327489-1328454
sapB

ABC-type transport system, permease protein,
2013





involved in antimicrobial peptide resistance


C 1328454-1330148
sapA

ABC-type transport system, periplasmic
2014





component, involved in antimicrobial peptide





resistance


1330218-1331630


Predicted ATPase
2015


C 1331667-1334306
ppc

Phosphoenolpyruvate carboxylase
2016


C 1334489-1335499
purR

HTH-type transcriptional repressor PurR
2017


1335796-1336689
dapD

2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-
2018





succinyltransferase


C 1336746-1338044
purA

adenylosuccinate synthetase
2019


C 1338173-1338538


predicted aspartokinase
2020


C 1339386-1339673
rplY

50S ribosomal protein L25
2021


1339841-1340497


uncharacterized membrane-associated protein
2022


C 1340544-1340960


conserved hypothetical protein
2023


C 1340991-1341341


putative translation initiation inhibitor, YjgF
2024





family


1341453-1342169


conserved hypothetical protein
2025


1342170-1342712


conserved hypothetical protein
2026


C 1342727-1343185


HTH-type transcriptional regulator
2027


1343224-1343907


putative ABC-type Co2+ transport system,
2028





periplasmic component


1343920-1344408


conserved hypothetical protein
2029


1344408-1345028
cbiM

predicted ABC-type cobalt transport system,
2030





permease component


1345028-1345660


predicted cobalt transport protein
2031


1345662-1346288
cbiO

predicted ABC-type cobalt transport system,
2032





ATPase component


C 1346395-1347585
aspC

aspartate aminotransferase
2033


1347725-1348813
purK

phosphoribosylaminoimidazole carboxylase
2034





ATPase subunit


C 1348883-1349377
purE

phosphoribosylaminoimidazole carboxylase
2035





catalytic subunit


1349548-1349796
hicA

HicA
2036


1349793-1350137
hicB

HicB
2037


1350181-1352790
pepN

aminopeptidase N
2038


C 1352862-1353476
ribE

riboflavin synthase alpha chain
2039


1353520-1354914
norM

probable multidrug resistance protein NorM
2040


1354923-1355639
sfsA

sugar fermentation stimulation protein
2041


1355839-1357026
tyrS

tyrosyl-tRNA synthetase
2042


C 1357090-1358037
prsA

ribose phosphate pyrophosphokinase
2043


C 1358039-1358992
ispE

4-diphosphocytidyl-2-C-methyl-D-erythritol
2044





kinase


C 1358992-1359621
lolB

outer-membrane lipoprotein LolB
2045


C 1359684-1360934
cca

tRNA nucleotidyltransferase
2046


C 1360941-1361552


conserved hypothetical protein
2047


C 1361618-1362880


putative phosphate permease
2048


C 1362905-1363585


conserved hypothetical phosphate transport
2049





regulator


1363793-1364248


predicted membrane protein
2050


1364287-1364574


conserved hypothetical protein
2051


1364604-1365557


conserved hypothetical protein
2052


1365547-1366263


conserved hypothetical protein
2053


1367371-1367538


hypothetical protein
2054


1367936-1368052


hypothetical protein
2055


1368502-1368666


hypothetical protein
2056


C 1368856-1370808
hmw2C

HMW2C, putative glycosyltransferase involved
2057





in glycosylation of HMW1A and HMW2A


C 1371200-1372837
hmw2B

HMW2B, OMP-85-like protein required for
2058





HMW1A and HMW2A secretion


C 1372988-1377616
hmw2A

HMW2A, high molecular weight adhesin 2
2059


1378041-1379087


conserved hypothetical protein
2060


1379320-1380696
radA

DNA repair protein RadA homolog
2061


1380755-1381255
lrp

leucine-responsive regulatory protein
2062


1381257-1384013
ftsK

DNA translocase FtsK
2063


1384031-1384648
lolA

outer-membrane lipoproteins carrier protein
2064





precursor


1384709-1386049


predicted ATPase related to the helicase subunit
2065





of the Holliday junction resolvase


C 1386137-1387279


hypothetical protein
2066


C 1387289-1388512


modification methylase BepI-like
2067


C 1388767-1390065
aroA

3-phosphoshikimate 1-carboxyvinyltransferase
2068


C 1390065-1390901
purU

formyltetrahydrofolate deformylase
2069


C 1390971-1391375
hns

DNA-binding protein H-NS homolog
2070


1391779-1393338


predicted Na+/H+ antiporter
2071


1393406-1393555


hypothetical protein
2072


1393605-1395326
ilvI

acetolactate synthase large subunit
2073


1395326-1395817
ilvH

acetolactate synthase small subunit
2074


C 1396111-1397844
argS

arginyl-tRNA synthetase
2075


1397924-1398493


conserved hypothetical protein
2076


1398572-1398793


hypothetical lipoprotein
2077


1398816-1399280
pcp

outer-membrane lipoprotein PCP precursor
2078


C 1399464-1400435
lgtD

UDP-glcNAc-lipooligosaccharide N-
2079





acetylglucosamine glycosyltransferase


1400502-1400999


pseudogene for IS1016-V6 protein homolog,
2080





degenerate


C 1401073-1402764
pgi

glucose-6-phosphate isomerase
2081


C 1402819-1403901
alr

alanine racemase
2082


C 1403911-1405425
dnaB

replicative DNA helicase
2083


C 1405459-1406895
pykA

pyruvate kinase
2084


1407432-1408691


prophage CP4-57-like integrase
2085


C 1409305-1409574


hypothetical protein
2086


C 1409679-1409849


hypothetical protein
2087


C 1409839-1410372


hypothetical protein
2088


C 1410444-1410860


hypothetical protein
2089


C 1410864-1411541


hypothetical protein
2090


C 1411560-1412723


modification methylase Bsp6I-like
2091


C 1413329-1414225
rdgC

recombination associated protein
2092


C 1414350-1415696


hypothetical protein
2093


C 1415758-1416192
ssb3

single strand binding protein
2094


C 1416192-1416836


hypothetical protein
2095


C 1416817-1417737


predicted recombinational DNA repair protein,
2096





RecE pathway


C 1417750-1418043


hypothetical protein
2097


C 1418040-1418384


hypothetical protein
2098


C 1418398-1419135


modification methylase DpnIIB-like
2099


C 1419226-1420257


hypothetical protein
2100


C 1420247-1420498


hypothetical protein
2101


C 1420617-1420928


hypothetical protein
2102


C 1421314-1421499


hypothetical protein
2103


C 1421660-1422808


hypothetical protein
2104


C 1422819-1423679


hypothetical protein
2105


C 1423742-1424107


hypothetical protein
2106


C 1424109-1424696


hypothetical protein
2107


1424852-1425037


hypothetical protein
2108


1425068-1425496


hypothetical protein
2109


C 1425497-1426006


hypothetical protein
2110


C 1426011-1426601


hypothetical protein
2111


C 1426598-1427233


hypothetical protein
2112


C 1427218-1428000


hypothetical protein
2113


C 1428002-1428235


hypothetical protein
2114


C 1428281-1428733


hypothetical protein
2115


C 1428777-1428983


hypothetical protein
2116


1429117-1430085


hypothetical protein
2117


1430279-1430605


hypothetical protein
2118


1430592-1430996


hypothetical protein
2119


1430993-1431382


hypothetical protein
2120


1431357-1431587


hypothetical protein
2121


1431596-1432555


hypothetical protein
2122


1432534-1432953


predicted DNA modification methylase
2123


C 1432950-1433171


hypothetical protein
2124


1433229-1433597


hypothetical protein
2125


1433611-1434927


predicted phage terminase large subunit
2126


1434929-1436245


hypothetical protein
2127


1436175-1436987


uncharacterized protein, homolog of phage Mu
2128





protein gp30


1436993-1438066


hypothetical protein
2129


1438080-1438499


hypothetical protein
2130


1438507-1439508


hypothetical protein
2131


1439511-1439699


hypothetical protein
2132


1439703-1440053


hypothetical protein
2133


1440046-1440504


hypothetical protein
2134


1440504-1440863


hypothetical protein
2135


1440865-1441374


hypothetical protein
2136


1441361-1442434


hypothetical protein
2137


1442480-1442905


hypothetical protein
2138


1442905-1443387


hypothetical protein
2139


1443423-1443575


hypothetical protein
2140


1443572-1445989


predicted phage-related minor tail protein
2141


C 1446064-1446636


hypothetical protein
2142


1446944-1447525


hypothetical protein
2143


1447522-1447836


hypothetical protein
2144


1447833-1448789


hypothetical protein
2145


1448792-1449463


hypothetical protein
2146


1449460-1449825


hypothetical protein
2147


1449818-1451254


hypothetical protein
2148


1451263-1451877


hypothetical protein
2149


1451887-1454235


probable tail fiber protein
2150


1454247-1454849


hypothetical protein
2151


1454846-1455346


hypothetical protein
2152


1455412-1455861


conserved hypothetical protein
2153


C 1455900-1456163


hypothetical protein
2154


1456754-1457884


pseudogene for putative integrase/recombinase,
2155





degenerate


C 1457947-1458228


hypothetical protein
2156


1458276-1458461


mu-like prophage protein gp29
2157


1458521-1459189


conserved hypothetical protein
2158


1459291-1459437


hypothetical protein
2159


1459470-1460006


conserved hypothetical protein
2160


1460207-1460653


hypothetical protein
2161


1460653-1460829


hypothetical protein
2162


1460841-1461500


hypothetical protein
2163


1461500-1461676


hypothetical protein
2164


1461689-1462246


hypothetical protein
2165


1462243-1462419


hypothetical protein
2166


1462432-1462983


hypothetical protein
2167


1463259-1464344
prfA

peptide chain release factor 1
2168


1464367-1464837


conserved hypothetical protein
2169


1464837-1465715
hemK

HemK
2170


1465715-1466518


conserved hypothetical protein
2171


1466533-1467387
kdsA

2-dehydro-3-deoxyphosphooctonate aldolase
2172


1467447-1468394


putative 2-hydroxyacid dehydrogenase
2173


1468394-1469575
lolC

lipoprotein releasing system transmembrane
2174





protein


1469587-1470879
bioA

adenosylmethionine-8-amino-7-oxononanoate
2175





aminotransferase


1470888-1472027
bioF

8-amino-7-oxononanoate synthase
2176


1472037-1472684


conserved hypothetical protein
2177


1472672-1473454
bioC

putative biotin synthesis protein BioC
2178


1473464-1474105
bioD-B

probable dethiobiotin synthetase 2
2179


1474187-1474870
lolD

lipoprotein releasing system ATP-binding
2180





protein LolD


1474870-1476120
lolE

ABC-type transport system, involved in
2181





lipoprotein release, permease component


1476339-1477427
aroG

phospho-2-dehydro-3-deoxyheptonate aldolase
2182


C 1477497-1477937
impA

impA
2183


1478222-1479466


predicted Na+/serine symporter
2184


1479588-1480196


putative NAD(P)H oxidoreductase
2185


1480196-1480750


predicted component of anaerobic
2186





dehydrogenases


C 1480797-1481351


predicted nitroreductase
2187


1481449-1483296
sppA

protease IV
2188


C 1483334-1484131
licD

phosphorylcholine transferase
2189


C 1484131-1484832
licC

LicC
2190


C 1484829-1485707
licB

LicB
2191


C 1485707-1486666
licA

LicA
2192


C 1486754-1488766


predicted glycine/D-amino acid oxidases,
2193





deaminating


1488913-1490133
fabB

3-oxoacyl-[acyl-carrier-protein] synthase I
2194


C 1490196-1490459
grxA

glutaredoxin
2195


1490576-1491484
rimK

probable ribosomal protein S6 modification
2196





protein


C 1491522-1492736
gltS

sodium/glutamate symport carrier protein
2197


C 1492778-1495021
parC

topoisomerase IV subunit A
2198


C 1495088-1496986
parE

topoisomerase IV subunit B
2199


C 1497060-1497995
htrB

lipid A biosynthesis lauroyl acyltransferase
2200


1498092-1499522
rfaE

ADP-heptose synthase
2201


C 1499886-1499981


pseudogene for Type I restriction enzyme R
2202





protein


C 1500193-1500549


hypothetical protein
2203


1500660-1501064
uupA2

ABC transporter ATP-binding protein
2204


1501061-1501621
uupB

ABC transporter ATP-binding protein
2205


C 1501649-1502338


putative carbonic anhydrase
2206


C 1502414-1503922
asnS

asparaginyl-tRNA synthetase
2207


1504044-1504517
ribH

6,7-dimethyl-8-ribityllumazine synthase
2208


1504521-1504955
nusB

N utilization substance protein B
2209


1505022-1506008
thiL

thiamine-monophosphate kinase
2210


1506005-1506496
pgpA

phosphatidylglycerophosphatase A
2211


1506496-1507125


predicted threonine efflux protein
2212


1507142-1507954
dapB

dihydrodipicolinate reductase
2213


C 1507949-1508197


conserved hypothetical ferredoxin-like protein
2214


1508243-1508983


conserved hypothetical protein
2215


1509132-1510121
pheS

phenylalanyl-tRNA synthetase alpha chain
2216


1510155-1512542
pheT

phenylalanyl-tRNA synthetase beta chain
2217


1512544-1512834
himA

integration host factor alpha-subunit
2218


1512887-1513372


conserved hypothetical lipoprotein
2219


1513381-1514391


hypothetical protein
2220


1514450-1514905


putative 5′(3′)-deoxyribonucleotidase
2221


1514889-1515599


NAD-dependent deacetylase
2222


C 1515648-1516670


hypothetical protein
2223


1516865-1517065
ftsK2

DNA translocase ftsK
2224


1517062-1517907


NAD-dependent deacetylase sirtuin 5
2225


1517907-1518668


hypothetical protein
2226


1518746-1519720


hypothetical protein
2227


1519797-1519964


hypothetical protein
2228


1520134-1521849


predicted serine/threonine protein phosphatase
2229





family protein


1521869-1522306


predicted arylsulfatase A-like enzyme
2230


1522388-1523200


predicted enzyme related to aldose 1-epimerase
2231


1523468-1523875
infC

translation initiation factor IF-3
2232


1524014-1524283
rpmI

50S ribosomal protein L35
2233


1524349-1524702
rplT

50S ribosomal protein L20
2234


1524815-1528450
recB

exodeoxyribonuclease V beta chain
2235


1528450-1530372
recD

exodeoxyribonuclease V alpha chain
2236


C 1530441-1530887


conserved hypothetical protein
2237


1530984-1532789
lonB

putative protease La homolog, predicted ATP-
2238





dependent protease


1532958-1533491
fabA

3-hydroxydecanoyl-[acyl-carrier-protein]
2239





dehydratase


1533727-1534455


conserved hypothetical protein
2240


1534452-1534589


conserved hypothetical protein
2241


1534725-1534994
rpsO

30S ribosomal protein S15
2242


C 1535066-1535437


conserved hypothetical transposase-like protein
2243


C 1535434-1535550


conserved hypothetical protein
2244


C 1535766-1537205
dacB

penicillin-binding protein 4 precursor
2245


1537369-1537845
greA

transcription elongation factor GreA
2246


C 1537884-1538183


predicted RNA-binding protein containing KH
2247





domain, possible ribosomal protein


1538310-1538939
ftsJ

ribosomal RNA large subunit methyltransferase J
2248


1539030-1540937
ftsH

cell division protein FtsH homolog 1
2249


1541049-1541879
folP

dihydropteroate synthase
2250


1541912-1543249
mrsA

predicted phosphomannomutase
2251


1543308-1543802
sixA

phosphohistidine phosphatase SixA homolog
2252


1544004-1544579


conserved hypothetical protein
2253


C 1544657-1546021


conserved hypothetical protein
2254


1546248-1546388


hypothetical protein
2255


1546382-1546534


hypothetical protein
2256


1546609-1546767


pseudogene for surface protein autotransporter
2257





domain


1546843-1547439


possible RNA polymerase sigma factor 24
2258


1547449-1547622


hypothetical protein
2259


C 1547717-1548619


hypothetical protein
2260


1548755-1549816
msaB

peptide methionine sulfoxide reductase
2261





MsrA/MsrB


1549828-1550469


conserved hypothetical cytochrome c-type
2262





biogenesis protein


1550514-1550984


conserved hypothetical protein
2263


C 1551166-1551489


conserved hypothetical protein
2264


C 1551556-1552287
moeB

molybdopterin biosynthesis protein MoeB
2265


C 1552291-1553505
moeA

molybdopterin biosynthesis protein MoeA
2266


1553643-1554299
folE

GTP cyclohydrolase I
2267


1554303-1554626


conserved hypothetical protein
2268


1554729-1555457
bioD-A

probable dethiobiotin synthetase 1
2269


C 1555546-1556424
metF

5,10-methylenetetrahydrofolate reductase
2270


1556887-1557315
rplM

50S ribosomal protein L13
2271


1557332-1557724
rpsI

30S ribosomal protein S9
2272


1557905-1558543
sspA

stringent starvation protein A
2273


1558543-1558995
sspB

stringent starvation protein B
2274


C 1559031-1560908
dxs

1-deoxy-D-xylulose 5-phosphate synthase
2275


C 1560950-1561840
ispA

geranyltranstransferase
2276


C 1561840-1562094
xseB

exodeoxyribonuclease VII small subunit
2277


1562253-1563710
thiI

predicted thiamine biosynthesis ATP
2278





pyrophosphatase


1563725-1564045


conserved hypothetical protein
2279


1564039-1564758
truC

tRNA pseudouridine synthase C
2280


1564751-1564909


conserved hypothetical protein
2281


1565058-1565276
cspD

cold shock-like protein CspD
2282


C 1565415-1565891


conserved hypothetical protein
2283


1565950-1566903
usg

predicted aspartate-semialdehyde
2284





dehydrogenase


C 1567047-1567853
trpA

tryptophan synthase alpha chain
2285


C 1567853-1569046
trpB

tryptophan synthase beta chain
2286


C 1569063-1569821


conserved hypothetical oxidoreductase
2287


1570001-1571035
purM

phosphoribosylformylglycinamidine cyclo-
2288





ligase


1571090-1571728
purN

phosphoribosylglycinamide formyltransferase
2289


C 1571812-1572648


predicted ABC-type transport system protein,
2290





periplasmic component


C 1572846-1573868
uspE

universal stress protein E
2291


C 1573897-1574670
fnr

fumarate and nitrate reduction regulatory
2292





protein


1575288-1575644


putative integrase/recombinase
2293


C 1575529-1575918


conserved hypothetical protein
2294


C 1576223-1576474


predicted phage anti-repressor protein
2295


C 1576812-1578083


hypothetical protein
2296


1578684-1578887


hypothetical protein
2297


C 1578865-1579023


hypothetical protein
2298


C 1579351-1579572


hypothetical protein
2299


C 1580499-1580834


hypothetical protein
2300


C 1580837-1581916


hypothetical protein
2301


C 1581968-1582666


predicted transcriptional regulator
2302


1582777-1582965


hypothetical protein
2303


1583014-1583454


hypothetical protein
2304


1583503-1584177


hypothetical protein
2305


1584174-1584932


hypothetical protein
2306


1584917-1585558


hypothetical protein
2307


1585555-1585779


hypothetical protein
2308


1585816-1586232
ninB

putative recombination protein NinB
2309


1586500-1586889
ninG

putative recombination protein NinG homolog
2310


1586909-1587538


hypothetical protein
2311


1587662-1587940


conserved hypothetical protein
2312


C 1587990-1588967


hypothetical protein
2313


1589196-1589981


conserved hypothetical protein
2314


1590012-1590194


hypothetical protein
2315


1590318-1590674


conserved hypothetical protein
2316


1590754-1591245


conserved hypothetical protein
2317


1591238-1591561


hypothetical protein
2318


1591473-1591754


conserved hypothetical protein
2319


C 1591756-1592034


hypothetical protein
2320


1592079-1592591


conserved hypothetical protein
2321


1592578-1593921


predicted phage terminase large subunit
2322


1593923-1595257


conserved hypothetical protein
2323


1595286-1596467


uncharacterized protein, homolog of phage Mu
2324





protein gp30


1596613-1596789


conserved hypothetical protein
2325


1596953-1597972
pyrD

dihydroorotate dehydrogenase
2326


1597972-1598796
trpH

TrpH
2327


C 1598895-1599503


conserved hypothetical protein
2328


C 1599631-1601025
fumC

fumarate hydratase class II
2329


C 1601175-1602113


putative glycosyl transferase, glycosyl
2330





transferase family 8 protein


1602322-1602666


pseudogene for usg protein homolog
2331


1602769-1603203
holC

DNA polymerase III, chi subunit
2332


1603288-1603458


conserved hypothetical protein
2333


1603471-1603845


conserved hypothetical protein
2334


1603873-1604040


hypothetical protein
2335


1604067-1604381


conserved hypothetical protein
2336


1604425-1604826


hypothetical protein
2337


1604878-1607742
valS

valyl-tRNA synthetase
2338


C 1607777-1607884


hypothetical protein
2339


C 1608172-1609605
trpC

tryptophan biosynthesis protein trpCF
2340


C 1609643-1610644
trpD

anthranilate phosphoribosyltransferase
2341


C 1610697-1611083


conserved hypothetical protein
2342


C 1611132-1611719
trpG

anthranilate synthase component II
2343


C 1611732-1613288
trpE

anthranilate synthase component I
2344


C 1613390-1614300


pseudogene for conserved hypothetical
2345





glycosyltransferase


C 1614423-1614920
ftnB

ferritin-like protein 2
2346


C 1614936-1615484
ftnA

ferritin-like protein 1
2347


1615927-1616946
pstS

phosphate-binding periplasmic protein
2348





precursor PstS


1617038-1617985
pstC

phosphate transport system permease protein
2349





PstC


1617987-1618835
pstA

phosphate transport system permease protein
2350





PstA


1618845-1619612
pstB

phosphate import ATP-binding protein PstB
2351


1619709-1620404
phoB

phosphate regulon transcriptional regulatory
2352





protein PhoB


1620401-1621678
phoR

phosphate regulon sensor protein PhoR
2353


C 1622370-1623791
sbcB

exodeoxyribonuclease I
2354


C 1623804-1624679


conserved hypothetical protein
2355


C 1624683-1625591


conserved hypothetical protein
2356


C 1625661-1630193


cell division protein MukB
2357


C 1630193-1630927


cell division protein MukE
2358


C 1630931-1631989
haeIIR

type II restriction enzyme HaeII
2359


C 1631992-1632948
haeIIM

modification methylase HaeII
2360


C 1632961-1634388
mukF

MukF homolog
2361


1634445-1635377


predicted ATPase of the PP-loop superfamily
2362





implicated in cell cycle control


1635379-1635708


predicted dissimilatory sulfite reductase,
2363





desulfoviridin, gamma subunit


1635816-1636025
mop

probable molybdenum-pterin binding protein
2364


1636209-1638621


pseudogene for conserved hypothetical protein,
2365





degenerate


1638725-1641505
pqqL

probable zinc protease
2366


C 1641628-1643559
thrS

threonyl-tRNA synthetase
2367


1643843-1644427
acpD

probable acyl carrier protein phosphodiesterase
2368


C 1644504-1647110
topA

DNA topoisomerase I
2369


1647206-1648123


putative HTH-type transcriptional regulator
2370


C 1648187-1649611
pntB

NAD(P) transhydrogenase subunit beta
2371


C 1649622-1651160
pntA

NAD(P) transhydrogenase subunit alpha
2372


C 1651421-1653886
glgP

glycogen phosphorylase
2373


C 1654133-1655563
glgA

glycogen synthase
2374


C 1655671-1657005
glgC

glucose-1-phosphate adenylyltransferase
2375


C 1656955-1658971
glgX

glycogen operon protein GlgX
2376


C 1659070-1661262
glgB

1,4-alpha-glucan branching enzyme
2377


C 1661272-1663371
malQ

4-alpha-glucanotransferase
2378


C 1663433-1663897


conserved hypothetical protein
2379


C 1663963-1665636
glnS

glutaminyl-tRNA synthetase
2380


C 1666056-1667531
cafA

ribonuclease G
2381


1667647-1669161
putP

sodium/proline symporter
2382


1669158-1670123


conserved hypothetical protein
2383


C 1670166-1671044
cdd

cytidine deaminase
2384


1671318-1671800


conserved hypothetical DNA-binding ferritin-
2385





like protein


C 1672372-1673610
pepT

peptidase T
2386


1673880-1675025
potA

spermidine/putrescine transport ATP-binding
2387





protein


1675009-1675869
potB

spermidine/putrescine transport permease
2388





protein


1675869-1676639
potC

spermidine/putrescine transport system
2389





permease protein


1676686-1677852
potD1

spermidine/putrescine-binding periplasmic
2390





protein 1 precursor


C 1677933-1679876
uupA

ABC transporter ATP-binding protein
2391


C 1679970-1681355


deoxyguanosinetriphosphate
2392





triphosphohydrolase-like protein


C 1681448-1682143


putative effector of murein hydrolase
2393


C 1682162-1682584


putative effector of murein hydrolase
2394


1682699-1683181


predicted micrococcal nuclease-like protein
2395


1683183-1684382
nifS

predicted selenocysteine lyase
2396


1684379-1684759


predicted SufE protein probably involved in Fe-S
2397





center assembly


1684750-1685547


Zn-ribbon-containing, possible nucleic-acid
2398





binding protein


1685591-1686430


predicted enzyme related to GTP
2399





cyclohydrolase I


C 1686463-1687587
tyrA

T-protein
2400


C 1687677-1688597
truB

tRNA pseudouridine synthase B
2401


C 1688597-1688983
rbfA

ribosome-binding factor A
2402


1689148-1690818
hsdM3

putative type I restriction enzyme HindVIIP M
2403





protein


1690811-1692076
hsdS3

putative type I restriction enzyme HindVIIP
2404





specificity protein


1692069-1693115


hypothetical protein
2405


1693117-1696284
hsdR3

putative type I restriction enzyme HindVIIP R
2406





protein


C 1696363-1698897
infB

translation initiation factor IF-2
2407


C 1698909-1700396
nusA

transcriptional elongation protein NusA
2408


C 1700413-1700868


conserved hypothetical protein
2409


C 1702529-1703293


hypothetical protein
2410


C 1703297-1703824


hypothetical protein
2411


C 1703834-1704973


hypothetical protein
2412


C 1704990-1705358


hypothetical protein
2413


C 1705372-1706193


hypothetical protein
2414


C 1706250-1706753


hypothetical protein
2415


C 1706750-1707352


hypothetical protein
2416


C 1707364-1709886


probable tail fiber protein
2417


C 1709895-1710431


predicted bacteriophage P2-related tail
2418





formation protein gpI


C 1710421-1711335


predicted phage-related baseplate assembly
2419





protein


C 1711332-1711670


predicted baseplate assembly protein W
2420


C 1711672-1712271


predicted phage P2-like baseplate assembly
2421





protein


C 1712375-1712803


hypothetical protein
2422


C 1712812-1713369


hypothetical protein
2423


C 1713467-1713919


hypothetical protein
2424


C 1713897-1714097


hypothetical protein
2425


C 1714141-1716105


predicted phage-related tail protein
2426


C 1716148-1716915


hypothetical protein
2427


1716960-1717298


hypothetical protein
2428


C 1717309-1717491


hypothetical protein
2429


1717595-1717942


hypothetical protein
2430


C 1717943-1718404


probable bacteriophage tail completion protein
2431





gpS homolog


C 1718404-1718790


probable bacteriophage tail completion protein
2432





gpR homolog


C 1718840-1718947


hypothetical protein
2433


C 1718980-1719102


hypothetical protein
2434


C 1719099-1719314


DnaK suppressor protein, bacteriophage PSP3
2435





gp34 homolog


C 1719489-1719839


hypothetical protein
2436


C 1719824-1720342


predicted phage-related lysozyme
2437


1720335-1720556


hypothetical protein
2438


C 1720558-1720767


hypothetical protein
2439


C 1720767-1721273


hypothetical protein
2440


C 1721567-1722217


predicted terminase, endonuclease subunit
2441


C 1722229-1723278


predicted major capsid protein
2442


C 1723299-1724114


predicted capsid scaffolding protein
2443


1724279-1726060


terminase, ATPase subunit
2444


1726070-1727080


predicted portal vertex protein
2445


C 1727788-1728462
siaB

acylneuraminate cytidylyltransferase
2446


1728576-1729238


putative NAD(P)H nitroreductase
2447


C 1729285-1730445
mrp

Mrp
2448


1730548-1732596
metG

methionyl-tRNA synthetase
2449


1732707-1733567
tehB

tellurite resistance protein
2450


C 1733611-1734312
gloB

probable hydroxyacylglutathione hydrolase
2451


C 1734364-1735170


conserved hypothetical protein
2452


1735406-1735882


conserved hypothetical protein
2453


1736019-1737782


conserved hypothetical protein
2454


1738382-1741024
gyrA

DNA gyrase subunit A
2455


C 1741079-1742155
metX

homoserine O-acetyltransferase
2456


1742259-1742993
sanA

SanA
2457


C 1743281-1744594
folC

folylpolyglutamate synthase
2458


C 1744587-1745477
accD

acetyl-coenzyme A carboxylase carboxyl
2459





transferase subunit beta


1745671-1747062
htoA

probable periplasmic serine protease do/HhoA-
2460





like precursor


C 1747115-1750555
mfd

transcription-repair coupling factor
2461


C 1750657-1750830


conserved hypothetical protein
2462


C 1750827-1752797


predicted P-loop ATPase fused to an
2463





acetyltransferase


C 1752802-1753143


conserved hypothetical protein
2464


C 1753205-1754875


ABC transporter ATP-binding protein
2465


C 1755038-1755361


predicted plasmid maintenance system antidote
2466





protein


C 1755372-1755677


predicted plasmid maintenance system killer
2467





protein


1756002-1756622


predicted ABC-type transport system,
2468





periplasmic component


1756625-1757593


predicted ABC-type transport system, permease
2469





component


C 1758207-1760246
uvrB

UvrABC system protein B
2470


C 1760276-1762201


predicted phosphoglycerol transferase-like
2471





protein


C 1762442-1764712
mao2

NADP-dependent malic enzyme
2472


1764910-1765530


possible polysaccharide biosynthesis protein
2473


1765621-1766319
rsuA

ribosomal small subunit pseudouridine synthase A
2474


1766321-1767517
bcr

bicyclomycin resistance protein
2475


C 1767549-1768349


conserved hypothetical protein
2476


1768447-1769109


predicted membrane protein
2477


1769130-1770383
proA

gamma-glutamyl phosphate reductase
2478


C 1770462-1771607
dnaJ

chaperone protein DnaJ
2479


C 1771694-1773601
dnaK

chaperone protein DnaK
2480


1773868-1775310


conserved hypothetical protein
2481


C 1775467-1775925
mgsA

methylglyoxal synthase
2482


1776225-1778885
aceE

pyruvate dehydrogenase E1 component
2483


1778948-1780645
aceF

dihydrolipoamide acetyltransferase component
2484





of pyruvate dehydrogenase complex


1780759-1782183
lpdA

dihydrolipoamide dehydrogenase
2485


C 1782230-1782337


hypothetical protein
2486


1782343-1782885
apt

adenine phosphoribosyltransferase
2487


1782897-1784963
dnaX

DNA polymerase III subunit gamma/tau
2488


1785103-1785729
upp

uracil phosphoribosyltransferase
2489


1785833-1787077
uraA

probable uracil permease
2490


1787143-1787838


predicted ATPase involved in DNA replication
2491





initiation


C 1787905-1788186


predicted translation initiation factor 1-like
2492





protein


C 1788193-1788885
pyrF

orotidine 5′-phosphate decarboxylase
2493


C 1788909-1790099


predicted N-acetylglucosaminyl transferase
2494


C 1790099-1790392


predicted membrane protein
2495


C 1790467-1790751
ihfB

integration host factor beta subunit
2496


C 1790874-1792523
rpsA

30S ribosomal protein S1
2497


C 1792626-1793420
cmk

cytidylate kinase
2498


1793766-1794641


conserved hypothetical pyridoxine biosynthesis
2499





enzyme


1794642-1795220


predicted glutamine amidotransferase involved
2500





in pyridoxine biosynthesis


C 1795769-1797463
dld

D-lactate dehydrogenase
2501


1797723-1797833


conserved hypothetical protein
2502


C 1798368-1798919
nlpC

probable lipoprotein NlpC
2503


C 1799032-1800480
tldD

TldD
2504


C 1800582-1801433


predicted methyltransferase
2505


1801505-1803232


putative lipoprotein
2506


1803232-1803591


predicted endonuclease distantly related to
2507





archaeal Holliday junction resolvase


1803604-1804188


predicted phosphoheptose isomerase
2508


1804245-1804826


predicted periplasmic or secreted lipoprotein
2509


1805081-1807351
nrdA

ribonucleoside-diphosphate reductase alpha
2510





chain


1807493-1808623
nrdB

ribonucleoside-diphosphate reductase beta
2511





chain


C 1808739-1809968
sucB

dihydrolipoamide succinyltransferase
2512





component of 2-oxoglutarate dehydrogenase





complex


C 1810071-1812923
sucA

2-oxoglutarate dehydrogenase E1 component
2513


C 1813081-1813719


predicted Zn-dependent hydrolase-like protein,
2514





including glyoxylases


C 1813729-1814484


conserved putative deoxyribonuclease
2515


C 1814536-1815096


conserved hypothetical protein
2516


C 1815112-1816581


conserved hypothetical protein
2517


C 1816655-1818742
prc

tail-specific protease precursor
2518


C 1818764-1819402
proQ

predicted activator of osmoprotectant
2519





transporter ProP


1819587-1820837


paraquat-inducible protein A-like protein
2520


1820821-1823466


paraquat-inducible protein B-like protein
2521


C 1823521-1823973
moaE

molybdopterin converting factor subunit 2
2522


C 1823974-1824219
moaD

molybdopterin converting factor subunit 1
2523


C 1824221-1824703
moaC

molybdenum cofactor biosynthesis protein C
2524


C 1824795-1825808
moaA

molybdenum cofactor biosynthesis protein A
2525


1826253-1826726


predicted regulator of cell morphogenesis and
2526





NO signaling


1826726-1826923


predicted regulator of cell morphogenesis and
2527





NO signaling


1827460-1828473


probable phosphosugar isomerase Hi1678
2528


1828483-1829025
yrbI

3-deoxy-D-manno-octulosonate 8-phosphate
2529





phosphatase


1829456-1833934
hmw1A

HMW1A, high molecular weight adhesin 1
2530


1834085-1835722
hmw1B

HMW1B, OMP-85-like protein required for
2531





secretion of HMW1A and HMW2A


1835916-1837868
hmw1C

HMW1C, putative glycosyltransferase involved
2532





in glycosylation of HMW1A and HMW2A


C 1837945-1840101


predicted membrane protein
2533


C 1840169-1840834


conserved hypothetical protein
2534


1841032-1842093
sohB

possible protease SohB
2535


1842377-1842955
rnfA

predicted NADH:ubiquinone oxidoreductase,
2536





subunit RnfA


1843043-1843624
rnfB

predicted NADH:ubiquinone oxidoreductase,
2537





subunit RnfB


1843625-1845622
rnfC

predicted NADH:ubiquinone oxidoreductase,
2538





subunit RnfC


1845752-1846828
rnfD

predicted NADH:ubiquinone oxidoreductase,
2539





subunit RnfD


1846828-1847451
rnfG

predicted NADH:ubiquinone oxidoreductase,
2540





subunit RnfG


1847453-1848160


predicted NADH:ubiquinone oxidoreductase,
2541





subunit RnfE


1848282-1848917
nth

endonuclease III
2542


1848976-1850349


predicted Na+-dependent transporters of the
2543





SNF family


C 1850373-1851428
modC

molybdenum import ATP-binding protein
2544


C 1851415-1852104
modB

molybdenum transport system permease protein
2545


C 1852278-1853042
modA

molybdate-binding periplasmic protein
2546


1853180-1853947
modE

Transcriptional regulator ModE
2547


1854057-1854860
lsgF

Putative UDP-galactose-lipooligosaccharide
2548





galactosyltransferase


C 1854862-1855746
lsgE

Putative UDP-galactose-lipooligosaccharide
2549





galactosyltransferase


C 1855758-1856531
lsgD

Putative UDP-glcNAc-lipooligosaccharide
2550





N-acetylglucosaminyl glycosyltransferase


C 1856543-1857604
lsgC

Putative UDP-galactose--lipooligosaccharide
2551





galactosyltransferase


C 1857606-1858520
lsgB

CMP-N-acetylneuraminate-beta-galactosamide-
2552





alpha-2,3-sialyltransferase


C 1858517-1859722
lsgA

putative lipooligosaccharide flippase
2553


1859857-1860600


conserved hypothetical protein
2554


C 1860823-1863092
metE

pseudogene for 5-
2555





methyltetrahydropteroyltriglutamate-





homocysteine methyltransferase


C 1863286-1864362


predicted permease
2556


C 1864359-1865462


predicted permease
2557


1865570-1867045
pepA

cytosol aminopeptidase
2558


C 1867081-1869090


predicted choline-glycine betaine transporter
2559


C 1869385-1870740
qseC

sensor protein QseC
2560


C 1870737-1871402
qseB

transcriptional regulatory protein QseB
2561


C 1871466-1871831


conserved hypothetical protein
2562


C 1871946-1872966
pmi

pseudogene for mannose-6-phosphate
2563





isomerase


C 1873336-1873836
crr

PTS system, glucose-specific IIA component
2564


C 1873896-1875623
ptsI

Phosphoenolpyruvate-protein
2565





phosphotransferase


C 1875703-1875960
ptsH

Phosphocarrier protein HPr
2566


C 1876119-1877159


probable GTPase
2567


1877230-1877778
orn

oligoribonuclease
2568


C 1878178-1879245
wecA

undecaprenyl phosphate
2569


C 1879320-1881911
glnD

[protein-PII] uridylyltransferase
2570


C 1882005-1882811
map

methionine aminopeptidase
2571


1882947-1883291


conserved hypothetical protein
2572


1883293-1883646


conserved hypothetical protein
2573


1883653-1885998
mrcB

penicillin-binding protein 1B
2574


1886123-1887043
purC

phosphoribosylaminoimidazole-
2575





succinocarboxamide synthase


1887199-1888533
argG

argininosuccinate synthase
2576


C 1888604-1889797


Mn2+ and Fe2+ transporter of the NRAMP
2577





family


C 1889841-1890578


conserved hypothetical protein
2578


C 1890565-1891494


predicted allophanate hydrolase subunit 2
2579


C 1891491-1892132


predicted allophanate hydrolase subunit 1
2580


C 1892394-1894373
rnb

exoribonuclease II
2581


C 1894471-1895358
fabI

enoyl-[acyl-carrier-protein] reductase [NADH]
2582


1895438-1897021
prfC

peptide chain release factor 3
2583


C 1897097-1897330


conserved hypothetical protein
2584


C 1897433-1897762


predicted branched chain amino acid permease
2585


C 1897759-1898493


predicted branched chain amino acid permease
2586


C 1898503-1899432
metR

HTH-type transcriptional regulator MetR
2587


C 1905763-1906908
lldD

L-lactate dehydrogenase
2588


C 1907120-1907929
murI

glutamate racemase
2589


C 1907959-1910040
recG

ATP-dependent DNA helicase
2590


C 1910037-1912151
spoT

guanosine-3′,5′-bis(diphosphate) 3′-pyrophosphohydrolase
2591


C 1912224-1912490
rpoZ

DNA directed RNA polymerase omega chain
2592


C 1912553-1913179
gmk

guanylate kinase
2593
















TABLE 7







Gene Products of the NTHi Genome











SEQ


Gene

ID


Name
Product Name
NO:





gapA
glyceraldehyde 3-phosphate dehydrogenase
2594



putative long-chain-fatty-acid--CoA ligase
2595



conserved hypothetical protein
2596



conserved hypothetical protein
2597


fdhD
FdhD protein homolog
2598


fdxG
formate dehydrogenase major subunit
2599


fdxH
formate dehydrogenase, iron-sulfur subunit
2600


fdxI
formate dehydrogenase, cytochrome B556 subunit
2601


fdhE
FdhE homolog
2602


rimI
ribosomal-protein-alanine acetyltransferase
2603


holD
DNA polymerase III, psi subunit
2604


rsmC
ribosomal RNA small subunit methyltransferase C
2605


era
GTP-binding protein era homolog
2606


rnc
ribonuclease III
2607


lepB
Signal peptidase I
2608


lepA
GTP-binding protein LepA
2609



conserved hypothetical acid-induced glycyl
2610



radical enzyme


ung
uracil-DNA glycosylase
2611



conserved hypothetical protein
2612



conserved hypothetical protein
2613


citG
CitXG
2614


citF
citrate lyase alpha chain
2615


citE
citrate lyase beta chain
2616


citD
citrate lyase acyl carrier protein
2617


citC
[citrate [pro-3S]-lyase] ligase
2618


lipA
Lipoic acid synthetase
2619


lipB
lipoate-protein ligase B
2620



conserved hypothetical protein
2621


dacA
penicillin-binding protein 5 precursor
2622


rlpA
RlpA-like protein
2623


mrdB
Rod shape-determining protein RodA
2624


mrdA
penicillin-binding protein 2
2625



conserved hypothetical protein
2626



conserved hypothetical protein
2627



conserved hypothetical membrane protein
2628



conserved hypothetical ABC transporter ATP-binding
2629



protein


mreB
rod shape-determining protein MreB
2630


mreC
rod shape-determining protein MreC
2631


mreD
rod shape-determining protein MreD
2632



conserved hypothetical protein
2633


xthA
exodeoxyribonuclease III
2634


rluA2
conserved hypothetical pseudouridine synthase
2635



conserved hypothetical membrane protein
2636



conserved hypothetical FtsH-interacting integral
2637



membrane protein



conserved hypothetical protein
2638


phnA
PhnA homolog
2639


eda
KHG/KDPG aldolase
2640


uxuC
uronate isomerase
2641



putative oxidoreductase
2642



putative TRAP-type C4-dicarboxylate transport
2643



system, large permease component



putative TRAP-type C4-dicarboxylate transport
2644



system, small permease component



putative TRAP-type C4-dicarboxylate transport
2645



system, periplasmic component



conserved hypothetical zinc-type alcohol
2646



dehydrogenase-like protein


uxuR
Uxu operon regulator
2647


uxuA
mannonate dehydratase
2648



putative membrane protein TerC
2649


uvrC
UvrABC system protein C
2650


kdsB
3-deoxy-D-manno-octulosonic acid
2651



cytidylyltransferase


lpxK
tetraacyldisaccharide 4′-kinase
2652


msbA
lipid A export ATP-binding protein MsbA
2653


rec2
recombination protein 2
2654


dksA
DnaK suppressor protein
2655


pcnB
probable poly(A) polymerase
2656


folK
2-amino-4-hydroxy-6-hydroxymethyl-
2657



dihydropteridine pyrophosphokinase



conserved hypothetical protein
2658


amiB
probable N-acetylmuramoyl-L-alanine amidase AmiB
2659



precursor


mutL
DNA mismatch repair protein MutL
2660


miaA
tRNA delta-isopentenylpyrophosphate transferase
2661


glnE
glutamate-ammonia-ligase adenylyltransferase
2662


recN
DNA repair protein RecN
2663


ppnK
probable inorganic polyphosphate/ATP-NAD kinase
2664


grpE
grpE
2665



conserved hypothetical protein
2666



conserved hypothetical protein
2667


nrdD
anaerobic ribonucleoside-triphosphate reductase
2668


tesB
Acyl-CoA thioesterase II
2669


cysS
cysteinyl-tRNA synthetase
2670


ppiB
peptidyl-prolyl cis-trans isomerase B
2671



conserved hypothetical protein
2672



putative deoxyribonuclease
2673



hypothetical protein
2674



hypothetical protein
2675


trxA
thioredoxin
2676


ddh
2-hydroxyacid dehydrogenase homolog
2677


metB
cystathionine gamma-synthase
2678



predicted ATPase involved in chromosome
2679



partitioning


dnaB2
replicative DNA helicase
2680



conserved hypothetical protein
2681



conserved hypothetical protein
2682



conserved hypothetical protein
2683



conserved hypothetical protein
2684



conserved hypothetical protein
2685


ssb2
Single-strand binding protein
2686



conserved putative lipoprotein
2687



conserved hypothetical protein
2688


topB2
DNA topoisomerase III
2689



conserved hypothetical protein
2690



hypothetical protein
2691



conserved hypothetical protein
2692



hypothetical protein
2693


radC2
putative DNA repair radC-like protein
2694



conserved hypothetical protein
2695



conserved hypothetical protein
2696


pilL
conserved putative lipoprotein
2697



conserved putative exported protein
2698



conserved putative exported protein
2699



conserved putative exported protein
2700



conserved hypothetical protein
2701



conserved putative membrane protein
2702



conserved putative membrane protein
2703



conserved putative membrane protein
2704



conserved putative membrane protein
2705



conserved putative exported protein
2706



conserved hypothetical protein
2707



conserved putative exported protein
2708



conserved putative lipoprotein
2709



conserved hypothetical protein
2710



conserved hypothetical protein
2711



conserved hypothetical membrane protein
2712



conserved putative membrane protein
2713



conserved hypothetical membrane protein
2714


tnpA
transposon Tn3 transposase
2715



hypothetical protein
2716


tnpR
transposon Tn3 resolvase
2717



hypothetical protein
2718



hypothetical exported protein
2719



conserved putative exported protein
2720



conserved putative exported protein
2721



conserved hypothetical protein
2722



conserved hypothetical membrane protein
2723



conserved hypothetical protein
2724



hypothetical protein
2725



hypothetical protein
2726


traC
conserved putative antirestriction protein
2727



possible type I restriction enzyme M subunit
2728



hypothetical protein
2729



conserved hypothetical protein
2730



conserved hypothetical protein
2731



conserved hypothetical protein
2732



resolvase/integrase-like protein
2733



hypothetical protein
2734



hypothetical protein
2735



conserved hypothetical protein
2736



hypothetical protein
2737



conserved hypothetical protein
2738



putative site-specific recombinase
2739


thrC
threonine synthase
2740


thrB
homoserine kinase
2741


thrA
aspartokinase/homoserine dehydrogenase
2742



conserved hypothetical protein
2743


grk
glycerate kinase
2744



conserved hypothetical protein
2745



conserved hypothetical protein
2746



conserved hypothetical protein
2747



conserved hypothetical protein
2748



conserved hypothetical protein
2749


hitA
iron-utilization periplasmic protein hFbpA
2750


hitB
iron(III)-transport system permease protein
2751



hFbpB


hitC
iron-utilization ATP-binding protein hFbpC
2752



putative D-alanyl-D-alanine carboxypeptidase
2753


dapE
succinyl-diaminopimelate desuccinylase
2754



conserved hypothetical protein
2755



chaperone protein HtpG
2756



conserved hypothetical NIF3-like protein
2757


hsdM1
putative type I restriction-modification system,
2758



methyltransferase subunit



predicted transcriptional regulator containing
2759



an HTH domain


hsdS1
putative type I site-specific restriction-modification
2760



system, S subunit


hsdR1
putative type I site-specific restriction-modification
2761



system, R subunit


ffh
Signal recognition particle protein
2762


corB
putative Mg2+ and Co2+ transporter
2763



conserved hypothetical protein
2764



conserved hypothetical protein
2765



conserved hypothetical protein
2766


serS
Seryl-tRNA synthetase
2767


gst
glutathione S-transferase
2768



conserved hypothetical protein
2769


hemR
hemin receptor
2770



possible SAM-dependent methyltransferase
2771


mltA
membrane-bound lytic murein transglycosylase A
2772



precursor



conserved hypothetical protein
2773


znuA
high-affinity zinc uptake system protein ZnuA
2774



conserved hypothetical protein
2775


mpl
UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-
2776



meso-diaminopimelate ligase


metC
cystathionine beta-lyase
2777


tsaA
probable peroxiredoxin
2778


pgsA
CDP-diacylglycerol--glycerol-3-phosphate 3-
2779



phosphatidyltransferase


ppa
inorganic pyrophosphatase
2780



conserved hypothetical protein
2781


udk
uridine kinase
2782


dcd
deoxycytidine triphosphate deaminase
2783



conserved hypothetical protein
2784



probable sugar efflux transporter
2785


engA
GTP-binding protein EngA
2786


dnaQ
DNA polymerase III, epsilon chain
2787


rnhA
ribonuclease HI
2788


ompP2
Outer membrane protein P2 precursor
2789


nagA
N-acetylglucosamine-6-phosphate deacetylase
2790


nagB
glucosamine-6-phosphate deaminase
2791


nanA
N-acetylneuraminate lyase
2792



putative HTH-type transcriptional regulator
2793


nanK
putative N-acetylmannosamine kinase
2794


nanE
putative N-acetylmannosamine-6-phosphate
2795



2-epimerase



putative sialic acid transporter, TRAP-type
2796



C4-dicarboxylate transport system, periplasmic



component


siaT
putative sialic acid transporter, TRAP-type
2797



C4-dicarboxylate transport system, large permease



component



conserved hypothetical protein
2798



putative protein-S-isoprenylcysteine
2799



methyltransferase


hflC
HflC
2800


hflK
HflK
2801



putative 4′-phosphopantetheinyl transferase
2802


dcuB2
anaerobic C4-dicarboxylate transporter DcuB
2803


acpP
acyl carrier protein
2804


fabG
3-oxoacyl-[acyl-carrier protein] reductase
2805


fabD
malonyl CoA-acyl carrier protein transacylase
2806


fabH
3-oxoacyl-[acyl-carrier-protein] synthase III
2807


rpmF
50S ribosomal protein L32
2808



conserved hypothetical protein
2809


psd
phosphatidylserine decarboxylase proenzyme
2810


gor
glutathione reductase
2811



conserved hypothetical lipoprotein
2812


nqrA
Na(+)-translocating NADH-quinone reductase
2813



subunit A


nqrB
Na(+)-translocating NADH-quinone reductase
2814



subunit B


nqrC
Na(+)-translocating NADH-quinone reductase
2815



subunit C


nqrD
Na(+)-translocating NADH-quinone reductase
2816



subunit D


nqrE
Na(+)-translocating NADH-quinone reductase
2817



subunit E


nqrF
Na(+)-translocating NADH-quinone reductase
2818



subunit F


apbE
thiamine biosynthesis lipoprotein ApbE
2819



conserved hypothetical protein
2820


trmU
probable tRNA (5-methylaminomethyl-2-
2821



thiouridylate)-methyltransferase



conserved hypothetical protein
2822


rluD
ribosomal large subunit pseudouridine synthase D
2823



conserved hypothetical lipoprotein
2824



conserved hypothetical protein
2825


pflA
pyruvate formate-lyase 1 activating enzyme
2826


pflB
formate acetyltransferase
2827


focA
probable formate transporter
2828



conserved hypothetical protein
2829



putative Na+/alanine symporter
2830



conserved hypothetical protein
2831



putative HTH-type transcriptional regulator
2832


tatA
Sec-independent protein translocase protein
2833



TatA/E


tatB
Sec-independent protein translocase protein
2834



TatB


tatC
Sec-independent protein translocase protein
2835



TatC


gdhA
NADP-specific glutamate dehydrogenase
2836


fur
Ferric uptake regulation protein
2837


fldA
flavodoxin
2838


seqA
seqA
2839



putative esterase/lipase
2840


menE
O-succinylbenzoate--CoA ligase
2841



putative small-conductance mechanosensitive
2842



channel


aroC
chorismate synthase
2843


mepA
penicillin-insensitive murein endopeptidase
2844



conserved hypothetical protein
2845


msbB
lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA
2846



acyltransferase


selD
selenide, water dikinase
2847


rplS
50S ribosomal protein L19
2848


trmD
tRNA (guanine-N(1)-)-methyltransferase
2849


rimM
16S rRNA processing protein RimM
2850


rpsP
30S ribosomal protein S16
2851



conserved hypothetical protein
2852


nadN
NAD nucleotidase
2853


aroK
shikimate kinase
2854


aroB
3-dehydroquinate synthase
2855


dam
DNA adenine methylase
2856



conserved hypothetical protein
2857


pgpB
phosphatidylglycerophosphatase B
2858


ribA
GTP cyclohydrolase II
2859



putative ABC-type oligopeptide transport system,
2860



periplasmic component



conserved hypothetical protein
2861


prlC
oligopeptidase A
2862


hsdM2
putative type I restriction-modification system
2863



methyltransferase subunit


hsdS2
putative type I restriction-modification system
2864



specificity protein


prrC
putative anticodon nuclease
2865



conserved hypothetical DNA binding protein
2866


hsdR2
putative type I restriction-modification system
2867



conserved hypothetical protein
2868


arcB
aerobic respiration control sensor protein ArcB
2869



predicted uracil-DNA glycosylase
2870


birA
BirA bifunctional protein
2871


guaB
inosine-5′-monophosphate dehydrogenase
2872


guaA
glutamine-hydrolyzing GMP synthase
2873



conserved hypothetical protein
2874



putative transcriptional regulator
2875


nhaA
Na(+)/H(+) antiporter 1
2876


brnQ
branched-chain amino acid transport system
2877



carrier protein



conserved hypothetical protein
2878



conserved hypothetical protein
2879


pnp
polyribonucleotide nucleotidyltransferase
2880


nlpI
lipoprotein NlpI
2881


deaD
Cold-shock DEAD-box protein A homolog
2882



predicted soluble lytic transglycosylase fused
2883



to an ABC-type amino acid-binding protein



conserved hypothetical protein
2884



conserved hypothetical protein
2885



conserved hypothetical protein
2886


arsC
putative arsenate reductase
2887


perM
putative permease PerM homolog
2888


secF
protein-export membrane protein SecF
2889


secD
protein-export membrane protein SecD
2890



conserved hypothetical preprotein translocase
2891



subunit YajC



predicted redox protein, regulator of disulfide
2892



bond formation



conserved hypothetical protein
2893


tgt
queuine tRNA-ribosyltransferase
2894


queA
S-adenosylmethionine:tRNA ribosyltransferase-
2895



isomerase



conserved hypothetical protein
2896


hap
adhesion and penetration protein Hap
2897


uvrA
UvrABC system protein A
2898


ssb
Single-strand binding protein
2899


tonB
TonB
2900


exbD
transport protein ExbD
2901


exbB
transport protein ExbB
2902


bcp
bacterioferritin comigratory protein
2903


dapA
dihydrodipicolinate synthase
2904



conserved hypothetical lipoprotein
2905



conserved hypothetical protein
2906


lgtC
UDP-galactose--lipooligosaccharide
2907



galactosyltransferase


orfM
predicted xanthosine triphosphate pyrophosphatase
2908


kdkA
3-deoxy-D-manno-octulosonic acid kinase
2909


opsX
ADP-heptose--lipooligosaccharide
2910



heptosyltransferase I


hxuC
heme/hemopexin-binding protein C
2911


hxuB
heme/hemopexin-binding protein B
2912


hxuA
heme/hemopexin-binding protein A
2913


folB
dihydroneopterin aldolase
2914



conserved hypothetical protein
2915


narQ
sensor protein NarQ
2916


murB
UDP-N-acetylenolpyruvoylglucosamine reductase
2917


rpoH
RNA polymerase sigma-32 factor
2918


djlA
DnaJ-like protein DjlA
2919


pyrE
orotate phosphoribosyltransferase
2920


rph
ribonuclease PH
2921


gltX
glutamyl-tRNA synthetase
2922


lpt6
PE-tn-6--lipooligosaccharide
2923



phosphorylethanolamine transferase


rbn
tRNA processing ribonuclease BN
2924



conserved hypothetical protein
2925



conserved hypothetical protein
2926


udp
uridine phosphorylase
2927



conserved hypothetical metabolite transport
2928



protein



conserved hypothetical protein
2929


menD
menaquinone biosynthesis protein MenD
2930


menF
menaquinone-specific isochorismate synthase
2931


aspC3
probable aspartate aminotransferase
2932


mtr
tryptophan-specific transport protein
2933


sdaA
L-serine dehydratase
2934


sdaC
Serine transporter
2935



probable cation-transporting ATPase
2936



probable heavy metal dependent transcriptional
2937



regulator


metJ
Met repressor
2938


rho
transcription termination factor rho
2939


pilD
putative type 4 prepilin-like protein specific
2940



leader peptidase


pilC
putative type IV pilin secretion protein
2941


pilB
putative type IV pilin secretion protein
2942


pilA
Type IV pilin subunit protein
2943


ampD
ampD
2944


corC
magnesium and cobalt efflux protein CorC
2945


cutE
apolipoprotein N-acyltransferase
2946



conserved hypothetical RNA methyltransferase
2947



conserved hypothetical protein
2948


ruvX
putative holliday junction resolvase
2949



prophage CP4-57-like integrase
2950



hypothetical protein
2951



hypothetical protein
2952



hypothetical protein
2953



hypothetical protein
2954



hypothetical protein
2955



hypothetical protein
2956


proC
pyrroline-5-carboxylate reductase
2957


hcaT
probable 3-phenylpropionic acid transporter
2958


xerD
Site-specific recombinase XerD
2959



conserved hypothetical protein
2960


ruvB
holliday junction DNA helicase RuvB
2961


ruvA
holliday junction DNA helicase RuvA
2962


ruvC
holliday junction DNA helicase RuvC
2963



conserved hypothetical protein
2964


ntpA
dATP pyrophosphohydrolase
2965


aspS
Aspartyl-tRNA synthetase
2966



conserved hypothetical protein
2967



conserved hypothetical protein
2968



conserved hypothetical protein
2969



Predicted nucleic acid-binding domain, contains PIN
2970



domain


gloA
Lactoylglutathione lyase
2971


rnt
Ribonuclease T
2972



conserved hypothetical protein
2973



Predicted primosomal replication protein N
2974


efp
Elongation factor P
2975



Predicted lysine 2,3-aminomutase
2976


oapA
Opacity associated protein OapA
2977


oapB
Opacity associated protein OapB
2978


recO
DNA repair protein RecO
2979


rumA
23S rRNA (uracil-5-)-methyltransferase RumA
2980


relA
GTP pyrophosphokinase
2981


dgkA
Diacylglycerol kinase
2982


mog
Molybdopterin biosynthesis mog protein
2983


glnB
Nitrogen regulatory protein P-II
2984



conserved hypothetical protein
2985


priA
Primosomal protein N
2986


trmB
tRNA (guanine-N(7)-)-methyltransferase
2987



conserved hypothetical protein
2988


napF
Ferredoxin-type protein NapF
2989


napD
NapD
2990


napA
Periplasmic nitrate reductase
2991


napG
Ferredoxin-type protein NapG
2992


napH
Ferredoxin-type protein NapH
2993


napB
Diheme cytochrome C NapB
2994


napC
Cytochrome C-type protein NapC
2995


adk
Adenylate kinase
2996


lic3C
Putative integral membrane signal transducer protein
2997


galE
UDP-glucose 4-epimerase
2998


lic3A
CMP-Neu5Ac--lipooligosaccharide alpha 2-3
2999



sialyltransferase



Putative ABC-type nitrate/sulfonate/bicarbonate
3000



transport system, ATPase component



Putative ABC-type nitrate/sulfonate/bicarbonate
3001



transport system, permease component



Putative ABC-type nitrate/sulfonate/bicarbonate
3002



transport system, periplasmic component



Putative transcription activator
3003


hfeD
Putative ABC-type chelated iron transport system,
3004



permease component


hfeC
Putative ABC-type chelated iron transport system,
3005



permease component


hfeB
Putative ABC-type chelated iron transport system,
3006



ATPase component


hfeA
Putative periplasmic chelated iron binding protein
3007



Hypothetical protein
3008


pbpG
Penicillin-binding protein 7 homolog precursor
3009



Predicted Fe-S-cluster redox enzyme
3010



Conserved hypothetical protein
3011



Conserved hypothetical transcriptional regulator with
3012



an N-terminal xre-type HTH domain


gcpE
4-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate
3013



synthase


hisS
Histidyl-tRNA synthetase
3014



Conserved hypothetical protein
3015



conserved hypothetical protein
3016


fdx
Ferredoxin, 2Fe-2S
3017


hscA
Chaperone protein HscA
3018



Conserved hypothetical protein
3019


hscB
Co-chaperone protein HscB
3020



Conserved hypothetical protein
3021


nifU
NifU-like protein
3022


nifS2
Cysteine desulfurase
3023



Predicted transcriptional regulator
3024



Hypothetical tRNA/rRNA methyltransferase
3025


pal
Outer membrane protein P6 precursor
3026


tolB
TolB
3027


tolA
TolA
3028


tolR
TolR
3029


tolQ
TolQ
3030



Predicted thioesterase
3031


dinG
Probable ATP-dependent helicase
3032



Possible inactive homolog of metal-dependent
3033



proteases, putative molecular chaperone



possible starvation-inducible outer membrane
3034



lipoprotein


lcfA
Long chain fatty acid CoA ligase
3035


rnd
Ribonuclease D
3036



Conserved hypothetical acyltransferase
3037



Predicted GTPase, probable translation factor
3038


pth
Peptidyl-tRNA hydrolase
3039



Conserved hypothetical protein
3040



Conserved hypothetical cupin superfamily
3041



metalloenzyme


xseA
Exodeoxyribonuclease VII large subunit
3042


adpP
ADP-ribose pyrophosphatase
3043


icc
Predicted phosphohydrolase
3044



Conserved hypothetical protein
3045


ompP1
Outer membrane protein P1 precursor
3046


ogt
Methylated-DNA-protein-cysteine
3047


mutH
DNA mismatch repair protein MutH
3048


mesJ
Putative cell cycle protein MesJ
3049


accA
Acetyl-coenzyme A carboxylase carboxyl transferase
3050



subunit alpha


znuB
High affinity zinc uptake system membrane protein
3051



ZnuB


znuC
High-affinity zinc uptake system ATP binding protein
3052



ZnuC



Conserved hypothetical metalloprotease
3053


tyrR
Transcriptional regulatory protein TyrR
3054


hfq
host factor-I protein Hfq
3055


rluC
ribosomal large subunit pseudouridine synthase C
3056


rne
ribonuclease E
3057



Conserved hypothetical protein
3058


thiM
hydroxyethylthiazole kinase
3059


thiD
phosphomethylpyrimidine kinase
3060


thiE
thiamine-phosphate pyrophosphorylase
3061



Conserved hypothetical metabolite transport
3062



hypothetical protein
3063



putative protease
3064


srmB
ATP-dependent RNA helicase SrmB
3065



predicted O-methyltransferase
3066


pssA
CDP-diacylglycerol--serine O-
3067



phosphatidyltransferase


fadR
fatty acid metabolism regulator protein
3068


nhaB
Na(+)/H(+) antiporter 2
3069


dsbB
disulfide bond formation protein B
3070


glmS
glucosamine--fructose-6-phosphate aminotransferase
3071


hupA
DNA-binding protein HU
3072



conserved hypothetical protein
3073


nudC
NADH pyrophosphatase
3074


orfG
conserved hypothetical 21.9 KD protein in locus
3075



involved in transformation


comF
competence protein F
3076


comE
competence protein E
3077


comD
competence protein D
3078


comC
competence protein C
3079


comB
competence protein B
3080


comA
competence protein A
3081


mrcA
penicillin-binding protein 1A
3082



conserved hypothetical protein
3083



conserved hypothetical protein
3084


recR
recombination protein RecR
3085


topB
DNA topoisomerase III
3086


secG
protein-export membrane protein SecG
3087


fruA
PTS system, fructose-specific IIBC component
3088


fruK
1-phosphofructokinase
3089


fruB
PTS system, fructose-specific IIA/FPr component
3090



conserved hypothetical protein
3091


vapD
virulence-associated protein D
3092


vapX
VapX
3093



conserved hypothetical protein
3094



conserved hypothetical protein
3095



putative deoxyribonuclease
3096


holB
DNA polymerase III, delta subunit
3097



hypothetical protein
3098


lav
autotransported protein Lav
3099


tmk
thymidylate kinase
3100



predicted periplasmic solute-binding protein
3101


surA
survival protein SurA homolog
3102


pyrR
PyrR bifunctional protein
3103


mazG
predicted pyrophosphatase MazG
3104



conserved hypothetical protein
3105


lon
ATP-dependent protease La
3106



predicted Fe-S oxidoreductase
3107


rpiA
ribose 5-phosphate isomerase A
3108


serA
D-3-phosphoglycerate dehydrogenase
3109



predicted aminomethyltransferase related to GcvT
3110



conserved hypothetical stress-induced protein
3111


hisG
ATP phosphoribosyltransferase
3112


hisD
histidinol dehydrogenase
3113


hisC
histidinol-phosphate aminotransferase 1
3114


hisB
histidine biosynthesis bifunctional protein HisB
3115


hisH
imidazole glycerol phosphate synthase subunit HisH
3116


hisA
1-5-[methylideneamino] imidazole-4-carboxamide
3117



isomerase


hisF
imidazole glycerol phosphate synthase subunit HisF
3118


hisI
histidine biosynthesis bifunctional protein hisIE
3119



conserved hypothetical protein
3120


tyrP
tyrosine-specific transport protein 1
3121


atpC
ATP synthase epsilon chain
3122


atpD
ATP synthase beta chain
3123


atpG
ATP synthase gamma chain
3124


atpA
ATP synthase alpha chain
3125


atpH
ATP synthase delta chain
3126


atpF
ATP synthase B chain
3127


atpE
ATP synthase C chain
3128


atpB
ATP synthase A chain
3129



predicted F0F1-type ATP synthase subunit I
3130


gidB
methyltransferase GidB
3131



conserved hypothetical protein
3132



predicted phosphatase/phosphohexomutase
3133



predicted membrane protein
3134


luxS
S-ribosylhomocysteinase
3135


aphA
Class B acid phosphatase
3136


hslV
ATP-dependent protease HslV
3137


hslU
HslU, ATP-dependent chaperone of the HslUV
3138



protease


potD2
spermidine/putrescine-binding periplasmic protein 2
3139



precursor


ordL
probable oxidoreductase OrdL
3140


rmuC
DNA recombination protein RmuC
3141


rbsD
ribose transport permease protein
3142


rbsA
ribose transport ATP-binding protein
3143


rbsC
ribose transport permease protein
3144


rbsB
ribose-binding periplasmic protein
3145


rbsK
ribokinase
3146


rbsR
ribose operon repressor
3147



predicted membrane protein
3148


menG
S-adenosylmethionine:2-demethylmenaquinone
3149



methyltransferase


menA
1,4-dihydroxy-2-naphthoate octaprenyltransferase
3150



conserved hypothetical protein
3151


tehA
tellurite resistance protein
3152


rpoC
DNA-directed RNA polymerase beta′ chain
3153


rpoB
DNA-directed RNA polymerase beta chain
3154


rplA
50S ribosomal protein L1
3155


rplK
50S ribosomal protein L11
3156


deoD
purine nucleoside phosphorylase
3157



predicted nucleoside permease
3158



predicted pyruvate-formate lyase-activating enzyme
3159



conserved hypothetical glycyl radical protein
3160



predicted membrane protein
3161


waaQ
ADP-heptose--lipooligosaccharide heptosyltransferase
3162



III


fba
fructose-bisphosphate aldolase
3163


pgk
phosphoglycerate kinase
3164



probable ribonuclease I
3165



putative ferredoxin-like protein
3166


tyrQ
tyrosine-specific transport protein 2
3167


tdk
thymidine kinase
3168


gcp
probable O-sialoglycoprotein endopeptidase
3169


rpsU
30S ribosomal protein S21
3170



DNA primase
3171


rpoD
RNA polymerase sigma factor RpoD
3172


aspA
aspartate ammonia-lyase
3173


ureH
urease accessory protein UreH
3174


ureG
urease accessory protein UreG
3175


ureF
urease accessory protein UreF
3176


ureE
urease accessory protein UreE
3177


ureC
urease alpha subunit
3178


ureB
urease beta subunit
3179


ureA
urease gamma subunit
3180


groES
10 kDa chaperonin
3181


groEL
60 kDa chaperonin
3182


rplI
50S ribosomal protein L9
3183


rpsR
30S ribosomal protein S18
3184


priB
primosomal replication protein N
3185


rpsF
30S ribosomal protein S6
3186


infA
translation initiation factor IF-1
3187


ksgA
dimethyladenosine transferase
3188


lic2A
UDP-Gal--lipooligosaccharide galactosyltransferase
3189


apaH
bis-tetraphosphatase, symmetrical
3190



conserved hypothetical protein
3191


gnd
6-phosphogluconate dehydrogenase decarboxylating
3192



conserved hypothetical protein
3193



conserved hypothetical protein
3194


devB
6-phosphogluconolactonase
3195


zwf
glucose-6-phosphate 1-dehydrogenase
3196


cysQ
cysQ
3197



conserved hypothetical protein
3198



predicted membrane protein
3199


hslR
heat shock protein 15 homolog
3200


asnC
regulatory protein AsnC
3201


asnA
aspartate--ammonia ligase
3202



conserved hypothetical transposase-like protein
3203



hypothetical protein
3204



hypothetical protein
3205



hypothetical protein
3206


gph
phosphoglycolate phosphatase
3207


rpe
ribulose-phosphate 3-epimerase
3208


gyrB
DNA gyrase subunit B
3209



predicted transcriptional accessory protein
3210


greB
transcription elongation factor GreB
3211



conserved hypothetical transcriptional regulator
3212


oxyR
hydrogen peroxide-inducible genes activator
3213


pdgC
peroxiredoxin•glutaredoxin
3214


slyX
slyX
3215


fkby
probable FKBP-type peptidyl-prolyl cis-trans
3216



isomerase



conserved hypothetical protein
3217



uncharacterized conserved protein involved in
3218



intracellular sulfur reduction



uncharacterized conserved protein involved in
3219



oxidation of intracellular sulfur



uncharacterized conserved protein involved in
3220



oxidation of intracellular sulfur


tufB
elongation factor Tu
3221


coaA
pantothenate kinase
3222


rseB
sigma-E factor regulatory protein RseB
3223


rseA
sigma-E factor negative regulator protein homolog
3224


rpoE
RNA polymerase sigma-E factor
3225



conserved hypothetical protein
3226


mscL
large-conductance mechanosensitive channel
3227


trkA
Trk system potassium uptake protein TrkA
3228


sun
SUN protein
3229


fmt
methionyl-tRNA formyltransferase
3230


def
peptide deformylase
3231



hypothetical protein
3232



hypothetical protein
3233


sxy
DNA transformation protein TfoX
3234


recA
RecA
3235


recX
regulatory protein RecX
3236


crcB
CrcB
3237



predicted hydrolase of the HAD superfamily
3238


argF
ornithine carbamoyltransferase, catabolic
3239


arcC
carbamate kinase
3240



predicted membrane protein
3241


hgpD
pseudogene for hemoglobin-haptoglobin binding
3242



protein D


pepE
peptidase E
3243



predicted C4-dicarboxylate transporter
3244


abgA
aminobenzoyl-glutamate utilization protein A
3245


cpdB
2′,3′-cyclic-nucleotide 2′-phosphodiesterase
3246



HTH-type transcriptional regulator
3247



zinc transporter ZitB
3248


gidA
glucose inhibited division protein A
3249


rpsL
30S ribosomal protein S12
3250


rpsG
30S ribosomal protein S7
3251


fusA
elongation factor G
3252


tufB2
elongation factor Tu
3253



predicted chloride channel protein
3254



predicted chloride channel protein
3255


dusA
tRNA-dihydrouridine synthase A
3256



conserved hypothetical protein
3257


trpS
tryptophanyl-tRNA synthetase
3258



predicted protein involved in purine metabolism
3259


purB
adenylosuccinate lyase
3260


rplJ
50S ribosomal protein L10
3261


rplL
50S ribosomal protein L7/L12
3262


glmU
bifunctional GlmU protein
3263



hypothetical protein
3264


pldB
probable lysophospholipase L2
3265


asd
aspartate-semialdehyde dehydrogenase
3266



conserved hypothetical protein
3267



predicted 2-methylthioadenine synthetase
3268


mdaB
putative NADPH-quinone reductase, modulator of
3269



drug activity B


rep
ATP-dependent DNA helicase rep
3270



predicted periplasmic lipoprotein
3271


kdtB
phosphopantetheine adenylyltransferase
3272


kdtA
3-deoxy-D-manno-octulosonic acid transferase
3273


lgtF
UDP-glucose--lipooligosaccharide glucosyltransferase
3274


tag
DNA-3-methyladenine glycosylase
3275



hypothetical protein
3276


aroE
shikimate 5-dehydrogenase
3277



predicted translation factor SUA5
3278



Zn-finger domain associated with topoisomerase type I
3279



probable ABC transporter, ATP binding protein
3280



putative HTH-type transcriptional regulator
3281



conserved hypothetical protein
3282


hgpB
hemoglobin-haptoglobin binding protein B
3283



hypothetical ABC transporter, ATP-binding protein
3284



probable ABC transporter, ATP binding protein
3285



conserved hypothetical protein
3286



conserved hypothetical protein
3287



putative HTH-type transcriptional regulator
3288


glpX
fructose-1,6-bisphosphatase class II GlpX
3289



conserved hypothetical protein
3290


mioC
MioC
3291


dtd
D-tyrosyl-tRNA (Tyr) deacylase
3292


ispF
2-C-methyl-D-erythritol 2,4-cyclodiphosphate
3293



synthase


ispD
2-C-methyl-D-erythritol 4-phosphate
3294



cytidylyltransferase


ftsB
cell division protein FtsB
3295


gpt
xanthine-guanine phosphoribosyltransferase
3296


pepD
aminoacyl-histidine dipeptidase
3297


xerC
site-specific recombinase XerC
3298



conserved hypothetical protein
3299


tpiA
triosephosphate isomerase
3300


glpE
thiosulfate sulfurtransferase GlpE
3301



conserved hypothetical protein
3302


ilvY
HTH-type transcriptional activator IlvY
3303


ilvC
ketol-acid reductoisomerase
3304


glpC
anaerobic glycerol-3-phosphate dehydrogenase
3305



subunit C


glpB
anaerobic glycerol-3-phosphate dehydrogenase
3306



subunit B


glpA
anaerobic glycerol-3-phosphate dehydrogenase
3307



subunit A


glpT
glycerol-3-phosphate transporter
3308


glpQ
glycerophosphoryl diester phosphodiesterase precursor
3309


glpF
glycerol uptake facilitator protein
3310


glpK
glycerol kinase
3311


gpt2
xanthine-guanine phosphoribosyltransferase
3312


hel
outer membrane protein P4, NADP phosphatase
3313


rluE
ribosomal large subunit pseudouridine synthase E
3314



conserved hypothetical protein
3315



conserved hypothetical protein
3316


slyD
FKBP-type peptidyl-prolyl cis-trans isomerase SlyD
3317



conserved hypothetical protein
3318


truD
tRNA pseudouridine synthase D
3319


surE
acid phosphatase surE
3320



conserved hypothetical protein
3321



conserved hypothetical protein
3322


lppB
outer membrane antigenic lipoprotein B
3323


tnaA
tryptophanase
3324


tnaB
tryptophan-specific transport protein
3325


mutS
DNA mismatch repair protein MutS
3326


selA
L-seryl-tRNA selenium transferase
3327


selB
selenocysteine-specific elongation factor
3328



hypothetical protein
3329



conserved hypothetical protein
3330



conserved hypothetical protein
3331


hgpC
pseudogene for hemoglobin-haptoglobin utilization
3332



protein C


tig
trigger factor
3333


clpP
ATP-dependent Clp protease proteolytic subunit
3334


clpX
ATP-dependent Clp protease ATP-binding subunit
3335



ClpX


secE
preprotein translocase SecE
3336


nusG
transcription antitermination protein NusG
3337


vacJ
VacJ lipoprotein
3338



putative translation initiation inhibitor YjgF family
3339


htpX
probable protease HtpX
3340


sirA
SirA
3341



conserved hypothetical protein
3342


trkH
Trk system potassium uptake protein TrkH
3343


psiE
PsiE
3344


hemY
HemY
3345


hemX
putative uroporphyrin-III C-methyltransferase
3346


cya
adenylate cyclase
3347


gpsA
glycerol-3-phosphate dehydrogenase
3348


cysE
serine acetyltransferase
3349



conserved hypothetical shikimate 5-dehydrogenase-
3350



like protein



possible di- and tricarboxylate transporter
3351


folD
FolD bifunctional protein
3352


fucP
L-fucose permease
3353


fucA
L-fuculose phosphate aldolase
3354


fucU
fucose operon protein FucU
3355


fucK
L-fuculokinase
3356


fucI
L-fucose isomerase
3357


fucR
L-fucose operon activator
3358


hepA
RNA polymerase associated protein homolog
3359


rluA
ribosomal large subunit pseudouridine synthase A
3360


glpG
GlpG
3361


glpR
glycerol-3-phosphate regulon repressor
3362


metQ
probable D-methionine-binding lipoprotein MetQ
3363


metI
probable D-methionine transport system permease
3364



protein


metN
probable D-methionine transport ATP-binding protein
3365



conserved hypothetical protein
3366



conserved hypothetical protein
3367



conserved hypothetical protein
3368


narP
nitrate/nitrite response regulator protein
3369


lysA
diaminopimelate decarboxylase
3370



hypothetical protein
3371


cyaY
CyaY
3372


recQ
ATP-dependent DNA helicase RecQ
3373


proS
prolyl-tRNA synthetase
3374


ostA
organic solvent tolerance protein
3375


sufI
SufI
3376


plsC
1-acyl-sn-glycerol-3-phosphate acyltransferase
3377


lpxH
UDP-2,3-diacylglucosamine hydrolase
3378



conserved hypothetical sodium dependent transporter
3379


ilvG
acetolactate synthase isozyme II large subunit
3380


ilvD
dihydroxy-acid dehydratase
3381


thdI
threonine dehydratase biosynthetic
3382


dnaE
DNA polymerase III alpha subunit
3383


pgmB
phosphoglucomutase
3384


secB
protein-export protein SecB
3385



predicted rhodanese-related sulfurtransferase
3386


dcuB
anaerobic C4-dicarboxylate transporter DcuB
3387


ndhA
NADH dehydrogenase
3388


plsB
glycerol-3-phosphate acyltransferase
3389


lexA
LexA repressor
3390


dapF
diaminopimelate epimerase
3391


tpx
probable thiol peroxidase
3392


purL
phosphoribosylformylglycinamidine synthase
3393


lex2B
UDP-glucose--lipooligosaccharide glucosyltransferase
3394



conserved hypothetical protein
3395



predicted membrane-bound metallopeptidase
3396


gpmA
2,3-bisphosphoglycerate-dependent phosphoglycerate
3397



mutase


rpL31
50S ribosomal protein L31
3398


mutY
A/G-specific adenine glycosylase
3399



conserved hypothetical protein
3400


mltC
membrane-bound lytic murein transglycosylase C
3401



precursor



predicted diadenosine tetraphosphatase and related
3402



serine/threonine protein phosphatase


nadR
bifunctional protein NadR
3403


ribB
3,4-dihydroxy-2-butanone 4-phosphate synthase
3404


lpsA
lipooligosaccharide glycosyl transferase
3405



conserved hypothetical tRNA/rRNA methyltransferase
3406



predicted N6-adenine-specific methylase
3407


ftsY
cell division protein FtsY
3408


ftsE
cell division ATP-binding protein FtsE
3409


ftsX
cell division protein FtsX
3410


atoB
acetyl-CoA acetyltransferase
3411


atoE
short chain fatty acids transporter
3412


atoA
acetate CoA-transferase beta subunit
3413


atoD
acetate CoA-transferase alpha subunit
3414



putative HTH-type transcriptional regulator
3415


rpsJ
30S ribosomal protein S10
3416


rplC
50S ribosomal protein L3
3417


rplD
50S ribosomal protein L4
3418


rplW
50S ribosomal protein L23
3419


rplB
50S ribosomal protein L2
3420


rpsS
30S ribosomal protein S19
3421


rplV
50S ribosomal protein L22
3422


rpsC
30S ribosomal protein S3
3423


rplP
50S ribosomal protein L16
3424


rpmC
50S ribosomal protein L29
3425


rpsQ
30S ribosomal protein S17
3426



conserved hypothetical protein
3427


rplN
50S ribosomal protein L14
3428


rplX
50S ribosomal protein L24
3429


rplE
50S ribosomal protein L5
3430


rpsN
30S ribosomal protein S14
3431


rpsH
30S ribosomal protein S8
3432


rplF
50S ribosomal protein L6
3433


rplR
50S ribosomal protein L18
3434


rpsE
30S ribosomal protein S5
3435


rpmD
50S ribosomal protein L30
3436


rplO
50S ribosomal protein L15
3437


secY
preprotein translocase SecY subunit
3438


rpsM
30S ribosomal protein S13
3439


rpsK
30S ribosomal protein S11
3440


rpsD
30S ribosomal protein S4
3441


rpoA
DNA-directed RNA polymerase alpha chain
3442


rplQ
50S ribosomal protein L17
3443



predicted cAMP-binding protein - catabolite gene
3444



activator and regulatory subunit of cAMP-dependent



protein kinase



predicted permease
3445


dxr
1-deoxy-D-xylulose 5-phosphate reductoisomerase
3446


frr
ribosome recycling factor
3447


pckA
phosphoenolpyruvate carboxykinase
3448


hslO
33 kDa chaperonin
3449


argH
argininosuccinate lyase
3450


galU
UTP-glucose-1-phosphate uridylyltransferase
3451


csrA
carbon storage regulator homolog
3452


alaS
alanyl-tRNA synthetase
3453


uspA
universal stress protein A
3454


pepP
Xaa-Pro aminopeptidase
3455



conserved hypothetical protein
3456


galM
aldose 1-epimerase
3457


galK
galactokinase
3458


galT
galactose-1-phosphate uridylyltransferase
3459


galR
HTH-type transcriptional regulator GalR
3460


mglB
D-galactose-binding periplasmic protein precursor
3461


mglA
galactoside transport ATP-binding protein MglA
3462


mglC
galactoside transport system permease protein MglC
3463



conserved hypothetical protein
3464


ispZ
probable intracellular septation protein A
3465



putative acyl-CoA thioester hydrolase
3466



conserved hypothetical protein
3467


slt
putative soluble lytic murein transglycosylase
3468


trpR
Trp operon repressor
3469


mtgA
monofunctional biosynthetic peptidoglycan
3470



transglycosylase


frdD
fumarate reductase subunit D
3471


frdC
fumarate reductase subunit C
3472


frdB
fumarate reductase iron-sulfur protein
3473


frdA
fumarate reductase flavoprotein subunit
3474


genX
putative lysyl-tRNA synthetase
3475


cpxR
transcriptional regulatory protein CpxR
3476


smpA
small protein A
3477


ndpA
nucleoid-associated protein NdpA
3478



conserved hypothetical protein
3479



predicted hydrolase of alkaline phosphatase
3480



superfamily



conserved hypothetical protein
3481


mobA
probable molybdopterin-guanine dinucleotide
3482



biosynthesis protein A



conserved hypothetical protein
3483


dsbA
thiol:disulfide interchange protein DsbA
3484



conserved hypothetical protein
3485


trmA
tRNA (Uracil-5)-methyltransferase
3486



conserved hypothetical protein
3487



predicted positive regulator of Sigma E
3488


mobB
molybdopterin-guanine dinucleotide biosynthesis
3489



protein B



conserved hypothetical protein
3490


hbpA
heme-binding protein A
3491



putative heme iron utilization protein
3492



conserved hypothetical protein
3493


polA
DNA polymerase I
3494



conserved hypothetical protein
3495



predicted 5-formyltetrahydrofolate cyclo-ligase
3496


clpB
ClpB
3497



probable tRNA/rRNA methyltransferase
3498


vacB
ribonuclease R
3499



conserved hypothetical protein
3500


pdxH
pyridoxamine 5′-phosphate oxidase
3501


typA
GTP-binding protein TypA/BipA
3502


lic3A2
CMP-Neu5Ac--lipooligosaccharide alpha 2-3
3503



sialyltransferase


glnA
glutamine synthetase
3504


rmlB
dTDP-glucose 4,6-dehydratase
3505


pepB
peptidase B
3506


ndk
nucleoside diphosphate kinase
3507



conserved hypothetical GTP-binding protein
3508



conserved hypothetical transport protein
3509


rpmA
50S ribosomal protein L27
3510


rplU
50S ribosomal protein L21
3511


ispB
octaprenyl-diphosphate synthase
3512



conserved hypothetical protein
3513



predicted Na+/alanine symporter
3514


arcA
aerobic respiration control protein ArcA
3515


dsbD
thiol:disulfide interchange protein DsbD
3516



predicted membrane protein
3517


purH
bifunctional purine biosynthesis protein PurH
3518


purD
phosphoribosylamine--glycine ligase
3519


glyA
serine hydroxymethyltransferase
3520


coaE
dephospho-CoA kinase
3521



conserved hypothetical zinc-binding protein
3522


rhlB
ATP-dependent RNA helicase RhlB
3523



hypothetical transcriptional regulator
3524



predicted membrane-fusion protein
3525



predicted cation/multidrug efflux pump
3526



predicted cell division protein
3527


emrB
multidrug resistance protein
3528


emrA
multidrug resistance protein A
3529


folA
dihydrofolate reductase
3530


proB
glutamate 5-kinase
3531


nudH
probable nucleoside polyphosphate hydrolase
3532



predicted permease
3533


lgt
prolipoprotein diacylglyceryl transferase
3534


thyA
thymidylate synthase
3535



conserved hypothetical protein
3536



conserved hypothetical protein
3537



conserved hypothetical protein
3538


secA
preprotein translocase SecA subunit
3539


mutT
mutator protein MutT
3540


kefB
glutathione-regulated potassium-efflux system protein
3541



conserved hypothetical SAM-dependent
3542



methyltransferase


rpsB
30S ribosomal protein S2
3543


tsf
elongation factor Ts
3544


lpxD
UDP-3-O-[3-hydroxymyristoyl] glucosamine N-
3545



acyltransferase


omp26
outer membrane protein 26
3546



protective surface antigen D15
3547



predicted membrane bound zinc metalloprotease with
3548



PDZ domain


cdsA
phosphatidate cytidylyltransferase
3549


uppS
undecaprenyl pyrophosphate synthetase
3550


leuS
leucyl-tRNA synthetase
3551



conserved predicted lipoprotein
3552


holA
DNA polymerase III, delta subunit
3553


glyS
glycyl-tRNA synthetase beta chain
3554



hypothetical protein
3555



conserved hypothetical protein
3556



glutaredoxin-related protein
3557


glyQ
glycyl-tRNA synthetase alpha chain
3558


hktE
catalase
3559



predicted glutathionylspermidine synthase
3560



conserved hypothetical protein
3561



conserved hypothetical protein
3562


eno
enolase
3563



conserved hypothetical protein
3564


nrfF
formate-dependent nitrite reductase complex nrfFG
3565



subunit


dsbE2
probable thiol:disulfide interchange protein DsbE
3566


nrfE
cytochrome c-type biogenesis protein NrfE
3567


suhB
inositol-1-monophosphatase
3568



conserved hypothetical protein
3569



predicted Type II secretory pathway, PulJ-like protein
3570



conserved hypothetical protein
3571



conserved hypothetical protein
3572



exodeoxyribonuclease V gamma chain
3573



predicted transcriptional regulator
3574


ribD
riboflavin biosynthesis protein RibD
3575


degS
protease DegS
3576


mutM
formamidopyrimidine-DNA glycosylase
3577


ddc
L-2,4-diaminobutyrate decarboxylase
3578



predicted nucleic acid-binding protein, contains PIN
3579



domain



conserved hypothetical protein
3580


dat
diaminobutyrate--2-oxoglutarate aminotransferase
3581


rpmG
50S ribosomal protein L33
3582


rpmB
50S ribosomal protein L28
3583


radC
DNA repair protein RadC homolog
3584


dfp
phosphopantothenoylcysteine
3585



synthetase/decarboxylase


dut
deoxyuridine 5′-triphosphate nucleotidohydrolase
3586


ttk
ttk
3587



hypothetical protein
3588


crp
catabolite gene activator
3589



hypothetical RNA methyltransferase
3590


nagZ
beta-hexosaminidase
3591



predicted periplasmic lipoprotein
3592



HIT-like protein
3593


ileS
isoleucyl-tRNA synthetase
3594


ribF
riboflavin biosynthesis protein RibF
3595


mviN
putative virulence factor MviN
3596


rpsT
30S ribosomal protein S20
3597



conserved hypothetical protein
3598


menB
naphthoate synthase
3599


menC
O-succinylbenzoate synthase
3600


aroQ
3-dehydroquinate dehydratase
3601


accB
biotin carboxyl carrier protein of acetyl-CoA
3602



carboxylase


accC
biotin carboxylase
3603



conserved hypothetical membrane protein
3604


panF
sodium/pantothenate symporter
3605



conserved hypothetical protein
3606


prmA
ribosomal protein L11 methyltransferase
3607


dusB
tRNA-dihydrouridine synthase B
3608


fis
DNA-binding protein fis
3609


smpB
SsrA-binding protein
3610


pfkA
6-phosphofructokinase
3611



conserved hypothetical protein
3612



conserved hypothetical protein
3613


smf
smf
3614


leuA
2-isopropylmalate synthase
3615


leuB
3-isopropylmalate dehydrogenase
3616


leuC
3-isopropylmalate dehydratase large subunit
3617


leuD
3-isopropylmalate dehydratase small subunit
3618


igal
IgA-specific serine endopeptidase
3619


recF
DNA replication and repair protein RecF
3620


dnaN
DNA polymerase III, beta chain
3621


dnaA
chromosomal replication initiator protein DnaA
3622


tbp1
transferrin-binding protein 1
3623


tbp2
transferrin-binding protein 2
3624



conserved hypothetical protein
3625


rpmH
50S ribosomal protein L34
3626


rnpA
ribonuclease P protein component
3627



conserved hypothetical protein
3628


yidC
preprotein translocase subunit YidC
3629


trmE
probable tRNA modification GTPase TrmE
3630


ppiD
peptidyl-prolyl cis-trans isomerase D
3631



predicted PR--lipooligosaccharide
3632



phosphorylethanolamine transferase


lspA
lipoprotein signal peptidase
3633


ispH
4-hydroxy-3-methylbut-2-enyl diphosphate reductase
3634



conserved hypothetical protein
3635


tbpA
thiamine-binding periplasmic protein
3636


thiP
thiamine transport system permease protein
3637


thiQ
thiamine transport ATP-binding protein
3638


bioB
biotin synthase
3639


tktA
transketolase
3640


serB
phosphoserine phosphatase
3641



conserved hypothetical protein
3642


corA
magnesium and cobalt transport protein CorA
3643



predicted integral membrane protein
3644



predicted glutamine amidotransferase
3645



hypothetical protein
3646



predicted ATPase
3647



hypothetical protein
3648



predicted ferredoxin
3649



conserved hypothetical protein
3650


dmsC
anaerobic dimethyl sulfoxide reductase chain C
3651


dmsB
anaerobic dimethyl sulfoxide reductase chain B
3652


dmsA
anaerobic dimethyl sulfoxide reductase chain A
3653



conserved hypothetical protein
3654



putative mercuric transport MerT homolog
3655



predicted copper chaperone MerP homolog
3656



conserved hypothetical ABC transporter
3657



conserved hypothetical transcriptional regulator
3658



conserved putative gamma-carboxymuconolactone
3659



decarboxylase subunit



conserved hypothetical protein
3660


res
putative type III restriction-modification system
3661



HindVIP enzyme res


rnhB
ribonuclease HII
3662


lpxB
lipid-A-disaccharide synthase
3663


lpxA
acyl-[acyl-carrier-protein]--UDP-N-
3664



acetylglucosamine O-acyltransferase


fabZ
(3R)-hydroxymyristoyl-[acyl carrier protein]
3665



dehydratase



predicted PR--lipooligosaccharide
3666



phosphorylethanolamine transferase


pyrH
uridylate kinase
3667


nrfD
NrfD, formate-dependent nitrite reductase, membrane
3668



component


nrfC
NrfC, Fe-S-cluster-containing hydrogenase
3669



component 1


nrfB
NrfB, cytochrome C-type protein
3670


nrfA
cytochrome c552
3671


hrpA
ATP-dependent helicase HrpA homolog
3672



conserved putative small membrane protein
3673



conserved putative membrane protein
3674


cyoB
probable cytochrome oxidase subunit II
3675


cyoA
probable cytochrome oxidase subunit I
3676


pyrG
CTP synthase
3677


pnuC
nicotinamide riboside transporter
3678



probable amino-acid ABC transporter ATP-binding
3679



protein



probable amino-acid ABC transporter permease
3680



protein



probable amino-acid ABC transporter binding protein
3681


murA
UDP-N-acetylglucosamine 1-carboxyvinyltransferase
3682



predicted transcriptional regulator, BolA superfamily
3683



predicted NTP binding protein, contains STAS domain
3684



conserved ABC-type transport system protein
3685



conserved ABC-type transport system protein,
3686



periplasmic component



conserved ABC-type transport system protein,
3687



permease component



conserved ABC-type transport system protein, ATPase
3688



component


sodA
superoxide dismutase [Mn]
3689


ccmA
heme exporter protein A
3690


ccmB
heme exporter protein B
3691


ccmC
heme exporter protein C
3692


ccmD
heme exporter protein D
3693


ccmE
cytochrome c-type biogenesis protein CcmE
3694


ccmF
cytochrome c-type biogenesis protein CcmF
3695


dsbE
thiol:disulfide interchange protein DsbE
3696



hypothetical protein
3697



conserved hypothetical protein
3698


ligN
DNA ligase
3699


zipA
cell division protein ZipA
3700


cysZ
CysZ
3701


cysK
cysteine synthase
3702


rfaF
ADP-heptose--lipooligosaccharide heptosyltransferase
3703



II


xylR
xylose operon regulatory protein
3704



conserved hypothetical Na(+)/H(+) antiporter
3705


aspC2
putative aspartate aminotransferase
3706


xylA
xylose isomerase
3707


xylB
xylulose kinase
3708


rfaD
ADP-L-glycero-D-manno-heptose-6-epimerase
3709



thioredoxin-like protein
3710


deoC
deoxyribose-phosphate aldolase
3711


comM
competence protein ComM
3712


engB
Probable GTP-binding protein EngB
3713



D-xylose transport permease protein
3714


oppF
oligopeptide transport ATP-binding protein
3715


oppD
oligopeptide transport ATP-binding protein
3716


oppC
oligopeptide transport system permease protein
3717


oppB
oligopeptide transport system permease protein
3718


oppA
periplasmic oligopeptide-binding protein
3719


talB
Transaldolase
3720



carbon starvation protein, predicted membrane protein
3721


mraZ
MraZ
3722


mraW
predicted S-adenosylmethionine-dependent
3723



methyltransferase involved in cell envelope biogenesis


ftsL
cell division protein FtsL
3724


ftsI
peptidoglycan synthetase FtsI
3725


murE
UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-
3726



diaminopimelate ligase


murF
UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-
3727



alanine ligase


mraY
phospho-N-acetylmuramoyl-pentapeptide-transferase
3728


murD
UDP-N-acetylmuramoylalanine--D-glutamate ligase
3729


ftsW
cell division protein FtsW
3730


murG
UDP-N-acetylglucosamine--N-
3731



acetylmuramylpyrophosphoryl-undecaprenol N-



acetylglucosamine transferase


murC
UDP-N-acetylmuramate--L-alanine ligase
3732


ddlB
D-alanine--D-alanine ligase
3733


ftsQ
cell division protein FtsQ
3734


ftsA
cell division protein FtsA
3735


ftsZ
cell division protein FtsZ
3736


lpxC
UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine
3737



deacetylase


pheA
P-protein
3738



predicted P-loop-containing kinase
3739


ptsN
nitrogen regulatory IIA protein
3740



probable ABC transporter ATP-binding protein
3741



conserved hypothetical protein
3742



conserved hypothetical protein
3743



conserved hypothetical protein
3744


pmbA
pmbA
3745


hpt
hypoxanthine phosphoribosyltransferase
3746



predicted Na+/dicarboxylate symporter
3747


nrdG
anaerobic ribonucleoside-triphosphate reductase
3748



activating protein


cydC
transport ATP-binding protein CydC
3749


cydD
transport ATP binding protein CydD
3750


trxB
thioredoxin reductase
3751



thioredoxin domain-containing protein
3752


hemH
ferrochelatase
3753



conserved hypothetical protein
3754



conserved FAD/FMN-containing dehydrogenase
3755


ompP5
outer membrane protein P5
3756



conserved glutaredoxin-related protein
3757



histidinol-phosphate aminotransferase 2
3758


serC
phosphoserine aminotransferase
3759



conserved hypothetical protein
3760



conserved hypothetical protein
3761



conserved hypothetical protein
3762


trpG2
putative anthranilate synthase component II
3763


metK
S-adenosylmethionine synthetase
3764


sprT
SprT
3765


opa
opacity protein
3766



conserved hypothetical protein
3767


artM
arginine transport system permease protein
3768


artQ
arginine transport system permease protein
3769


artI
arginine-binding periplasmic protein
3770


artP
arginine transport ATP-binding protein
3771


gmhA
phosphoheptose isomerase
3772


ligA
DNA ligase
3773


dppF
dipeptide transport ATP binding protein
3774


dppD
dipeptide transport ATP binding protein
3775


dppC
dipeptide transport system permease protein
3776


dppB
dipeptide transport system permease protein
3777


uvrD
DNA helicase II
3778



predicted organic radical activating enzyme
3779



predicted 6-pyruvoyl-tetrahydropterin synthase
3780



predicted PP-loop superfamily ATPase
3781



conserved hypothetical protein
3782


ilvE
branched chain amino acid aminotransferase
3783


gcvA
glycine cleavage system transcriptional activator
3784



predicted SAM-dependent methyltransferase
3785


sucC
succinyl-CoA synthetase beta chain
3786


sucD
succinyl-CoA synthetase alpha chain
3787



putative translation factor, Sua5
3788


rluB
ribosomal large subunit pseudouridine synthase B
3789


cysB
HTH-type transcriptional regulator CysB
3790



conserved hypothetical adenine-specific methylase
3791



conserved hypothetical protein
3792


pta
phosphate acetyltransferase
3793


ackA
acetate kinase
3794



conserved hypothetical protein
3795


cvpA
colicin V production protein
3796



amidophosphoribosyltransferase
3797


sulA
cell division inhibitor SulA
3798


argR
arginine repressor
3799


mdh
malate dehydrogenase
3800


lysS
lysyl-tRNA synthetase
3801


prfB
peptide chain release factor 2
3802


dsbC
thiol:disulfide interchange protein DsbC
3803


recJ
single stranded DNA specific exonuclease RecJ
3804



conserved hypothetical protein
3805


mtnA
MTA/SAH nucleosidase
3806


hup
heme utilization protein
3807



putative L-Lactate permease
3808


fbp
fructose-1,6-bisphosphatase
3809


truA
tRNA pseudouridine synthase A
3810


sapZ
Predicted membrane protein
3811


sapF
ABC-type transport system, ATPase component
3812



involved in antimicrobial peptide resistance


sapD
ABC-type transport system, ATP binding component,
3813



involved in antimicrobial peptide resistance


sapC
ABC-type transport system, permease protein,
3814



involved in antimicrobial peptide resistance


sapB
ABC-type transport system, permease protein,
3815



involved in antimicrobial peptide resistance


sapA
ABC-type transport system, periplasmic component,
3816



involved in antimicrobial peptide resistance



Predicted ATPase
3817


ppc
Phosphoenolpyruvate carboxylase
3818


purR
HTH-type transcriptional repressor PurR
3819


dapD
2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-
3820



succinyltransferase


purA
adenylosuccinate synthetase
3821



predicted aspartokinase
3822


rplY
50S ribosomal protein L25
3823



uncharacterized membrane-associated protein
3824



conserved hypothetical protein
3825



putative translation initiation inhibitor, YjgF family
3826



conserved hypothetical protein
3827



conserved hypothetical protein
3828



HTH-type transcriptional regulator
3829



putative ABC-type Co2+ transport system,
3830



periplasmic component



conserved hypothetical protein
3831


cbiM
predicted ABC-type cobalt transport system, permease
3832



component



predicted cobalt transport protein
3833


cbiO
predicted ABC-type cobalt transport system, ATPase
3834



component


aspC
aspartate aminotransferase
3835


purK
phosphoribosylaminoimidazole carboxylase ATPase
3836



subunit


purE
phosphoribosylaminoimidazole carboxylase catalytic
3837



subunit


hicA
HicA
3838


hicB
HicB
3839


pepN
aminopeptidase N
3840


ribE
riboflavin synthase alpha chain
3841


norM
probable multidrug resistance protein NorM
3842


sfsA
sugar fermentation stimulation protein
3843


tyrS
tyrosyl-tRNA synthetase
3844


prsA
ribose phosphate pyrophosphokinase
3845


ispE
4-diphosphocytidyl-2-C-methyl-D-erythritol kinase
3846


lolB
outer-membrane lipoprotein LolB
3847


cca
tRNA nucleotidyltransferase
3848



conserved hypothetical protein
3849



putative phosphate permease
3850



conserved hypothetical phosphate transport regulator
3851



predicted membrane protein
3852



conserved hypothetical protein
3853



conserved hypothetical protein
3854



conserved hypothetical protein
3855



hypothetical protein
3856



hypothetical protein
3857



hypothetical protein
3858


hmw2C
HMW2C, putative glycosyltransferase involved in
3859



glycosylation of HMW1A and HMW2A


hmw2B
HMW2B, OMP-85-like protein required for HMW1A
3860



and HMW2A secretion


hmw2A
HMW2A, high molecular weight adhesin 2
3861



conserved hypothetical protein
3862


radA
DNA repair protein RadA homolog
3863


lrp
leucine-responsive regulatory protein
3864


ftsK
DNA translocase FtsK
3865


lolA
outer-membrane lipoproteins carrier protein precursor
3866



predicted ATPase related to the helicase subunit of the
3867



Holliday junction resolvase



hypothetical protein
3868



modification methylase BepI-like
3869


aroA
3-phosphoshikimate 1-carboxyvinyltransferase
3870


purU
formyltetrahydrofolate deformylase
3871


hns
DNA-binding protein H-NS homolog
3872



predicted Na+/H+ antiporter
3873



hypothetical protein
3874


ilvI
acetolactate synthase large subunit
3875


ilvH
acetolactate synthase small subunit
3876


argS
arginyl-tRNA synthetase
3877



conserved hypothetical protein
3878



hypothetical lipoprotein
3879


pcp
outer-membrane lipoprotein PCP precursor
3880


lgtD
UDP-GlcNAc--lipooligosaccharide N-
3881



acetylglucosamine glycosyltransferase


pgi
glucose-6-phosphate isomerase
3882


alr
alanine racemase
3883


dnaB
replicative DNA helicase
3884


pykA
pyruvate kinase
3885



prophage CP4-57-like integrase
3886



hypothetical protein
3887



hypothetical protein
3888



hypothetical protein
3889



hypothetical protein
3890



hypothetical protein
3891



modification methylase Bsp6I-like
3892


rdgC
recombination associated protein
3893



hypothetical protein
3894


ssb3
single strand binding protein
3895



hypothetical protein
3896



predicted recombinational DNA repair protein, RecE
3897



pathway



hypothetical protein
3898



hypothetical protein
3899



modification methylase DpnIIB-like
3900



hypothetical protein
3901



hypothetical protein
3902



hypothetical protein
3903



hypothetical protein
3904



hypothetical protein
3905



hypothetical protein
3906



hypothetical protein
3907



hypothetical protein
3908



hypothetical protein
3909



hypothetical protein
3910



hypothetical protein
3911



hypothetical protein
3912



hypothetical protein
3913



hypothetical protein
3914



hypothetical protein
3915



hypothetical protein
3916



hypothetical protein
3917



hypothetical protein
3918



hypothetical protein
3919



hypothetical protein
3920



hypothetical protein
3921



hypothetical protein
3922



hypothetical protein
3923



predicted DNA modification methylase
3924



hypothetical protein
3925



hypothetical protein
3926



predicted phage terminase large subunit
3927



hypothetical protein
3928



uncharacterized protein, homolog of phage Mu protein
3929



gp30



hypothetical protein
3930



hypothetical protein
3931



hypothetical protein
3932



hypothetical protein
3933



hypothetical protein
3934



hypothetical protein
3935



hypothetical protein
3936



hypothetical protein
3937



hypothetical protein
3938



hypothetical protein
3939



hypothetical protein
3940



hypothetical protein
3941



predicted phage-related minor tail protein
3942



hypothetical protein
3943



hypothetical protein
3944



hypothetical protein
3945



hypothetical protein
3946



hypothetical protein
3947



hypothetical protein
3948



hypothetical protein
3949



hypothetical protein
3950



probable tail fiber protein
3951



hypothetical protein
3952



hypothetical protein
3953



conserved hypothetical protein
3954



hypothetical protein
3955



hypothetical protein
3956



mu-like prophage protein gp29
3957



conserved hypothetical protein
3958



hypothetical protein
3959



conserved hypothetical protein
3960



hypothetical protein
3961



hypothetical protein
3962



hypothetical protein
3963



hypothetical protein
3964



hypothetical protein
3965



hypothetical protein
3966



hypothetical protein
3967


prfA
peptide chain release factor 1
3968



conserved hypothetical protein
3969


hemK
HemK
3970



conserved hypothetical protein
3971


kdsA
2-dehydro-3-deoxyphosphooctonate aldolase
3972



putative 2-hydroxyacid dehydrogenase
3973


lolC
lipoprotein releasing system transmembrane protein
3974


bioA
adenosylmethionine-8-amino-7-oxononanoate
3975



aminotransferase


bioF
8-amino-7-oxononanoate synthase
3976



conserved hypothetical protein
3977


bioC
putative biotin synthesis protein BioC
3978


bioD-B
probable dethiobiotin synthetase 2
3979


lolD
lipoprotein releasing system ATP-binding protein
3980



LolD


lolE
ABC-type transport system, involved in lipoprotein
3981



release, permease component


aroG
phospho-2-dehydro-3-deoxyheptonate aldolase
3982


impA
impA
3983



predicted Na+/serine symporter
3984



putative NAD(P)H oxidoreductase
3985



predicted component of anaerobic dehydrogenases
3986



predicted nitroreductase
3987


sppA
protease IV
3988


licD
phosphorylcholine transferase
3989


licC
LicC
3990


licB
LicB
3991


licA
LicA
3992



predicted glycine/D-amino acid oxidases, deaminating
3993


fabB
3-oxoacyl-[acyl-carrier-protein] synthase I
3994


rrxA
glutaredoxin
3995


rimK
probable ribosomal protein S6 modification protein
3996


gltS
sodium/glutamate symport carrier protein
3997


parC
topoisomerase IV subunit A
3998


parE
topoisomerase IV subunit B
3999


htrB
lipid A biosynthesis lauroyl acyltransferase
4000


rfaE
ADP-heptose synthase
4001



hypothetical protein
4002


uupA2
ABC transporter ATP-binding protein
4003


uppB
ABC transporter ATP-binding protein
4004



putative carbonic anhydrase
4005


asnS
asparaginyl-tRNA synthetase
4006


ribH
6,7-dimethyl-8-ribityllumazine synthase
4007


nusB
N utilization substance protein B
4008


thiL
thiamine-monophosphate kinase
4009


pgpA
phosphatidylglycerophosphatase A
4010



predicted threonine efflux protein
4011


dapB
dihydrodipicolinate reductase
4012



conserved hypothetical ferredoxin-like protein
4013



conserved hypothetical protein
4014


pheS
phenylalanyl-tRNA synthetase alpha chain
4015


pheT
phenylalanyl-tRNA synthetase beta chain
4016


himA
integration host factor alpha-subunit
4017



conserved hypothetical lipoprotein
4018



hypothetical protein
4019



putative 5′(3′)-deoxyribonucleotidase
4020



NAD-dependent deacetylase
4021



hypothetical protein
4022


ftsK2
DNA translocase ftsK
4023



NAD-dependent deacetylase sirtuin 5
4024



hypothetical protein
4025



hypothetical protein
4026



hypothetical protein
4027



predicted serine/threonine protein phosphatase family
4028



protein



predicted arylsulfatase A-like enzyme
4029



predicted enzyme related to aldose 1-epimerase
4030


infC
translation initiation factor IF-3
4031


rpmI
50S ribosomal protein L35
4032


rplT
50S ribosomal protein L20
4033


recB
exodeoxyribonuclease V beta chain
4034


recD
exodeoxyribonuclease V alpha chain
4035



conserved hypothetical protein
4036


lonB
putative protease La homolog, predicted ATP-
4037



dependent protease


fabA
3-hydroxydecanoyl-[acyl-carrier-protein] dehydratase
4038



conserved hypothetical protein
4039



conserved hypothetical protein
4040


rpsO
30S ribosomal protein S15
4041



conserved hypothetical transposase-like protein
4042



conserved hypothetical protein
4043


dacB
penicillin-binding protein 4 precursor
4044


greA
transcription elongation factor GreA
4045



predicted RNA-binding protein containing KH
4046



domain, possible ribosomal protein


ftsJ
ribosomal RNA large subunit methyltransferase J
4047


ftsH
cell division protein FtsH homolog 1
4048


folP
dihydropteroate synthase
4049


mrsA
predicted phosphomannomutase
4050


sixA
phosphohistidine phosphatase SixA homolog
4051



conserved hypothetical protein
4052



conserved hypothetical protein
4053



hypothetical protein
4054



hypothetical protein
4055



possible RNA polymerase sigma factor 24
4056



hypothetical protein
4057



hypothetical protein
4058


msaB
peptide methionine sulfoxide reductase MsrA/MsrB
4059



conserved hypothetical cytochrome c-type biogenesis
4060



protein



conserved hypothetical protein
4061



conserved hypothetical protein
4062


moeB
molybdopterin biosynthesis protein MoeB
4063


moeA
molybdopterin biosynthesis protein MoeA
4064


folE
GTP cyclohydrolase I
4065



conserved hypothetical protein
4066


bioD-A
probable dethiobiotin synthetase 1
4067


metF
5,10-methylenetetrahydrofolate reductase
4068


rplM
50S ribosomal protein L13
4069


rpsI
30S ribosomal protein S9
4070


sspA
stringent starvation protein A
4071


sspB
stringent starvation protein B
4072


dxs
1-deoxy-D-xylulose 5-phosphate synthase
4073


ispA
geranyltranstransferase
4074


xseB
exodeoxyribonuclease VII small subunit
4075


thiI
predicted thiamine biosynthesis ATP pyrophosphatase
4076



conserved hypothetical protein
4077


truC
tRNA pseudouridine synthase C
4078



conserved hypothetical protein
4079


cspD
cold shock-like protein CspD
4080



conserved hypothetical protein
4081


usg
predicted aspartate-semialdehyde dehydrogenase
4082


trpA
tryptophan synthase alpha chain
4083


trpB
tryptophan synthase beta chain
4084



conserved hypothetical oxidoreductase
4085


purM
phosphoribosylformylglycinamidine cyclo-ligase
4086


purN
phosphoribosylglycinamide formyltransferase
4087



predicted ABC-type transport system protein,
4088



periplasmic component


uspE
universal stress protein E
4089


fnr
fumarate and nitrate reduction regulatory protein
4090



putative integrase/recombinase
4091



conserved hypothetical protein
4092



predicted phage anti-repressor protein
4093



hypothetical protein
4094



hypothetical protein
4095



hypothetical protein
4096



hypothetical protein
4097



hypothetical protein
4098



hypothetical protein
4099



predicted transcriptional regulator
4100



hypothetical protein
4101



hypothetical protein
4102



hypothetical protein
4103



hypothetical protein
4104



hypothetical protein
4105



hypothetical protein
4106


ninB
putative recombination protein NinB
4107


ninG
putative recombination protein NinG homolog
4108



hypothetical protein
4109



conserved hypothetical protein
4110



hypothetical protein
4111



conserved hypothetical protein
4112



hypothetical protein
4113



conserved hypothetical protein
4114



conserved hypothetical protein
4115



hypothetical protein
4116



conserved hypothetical protein
4117



hypothetical protein
4118



conserved hypothetical protein
4119



predicted phage terminase large subunit
4120



conserved hypothetical protein
4121



uncharacterized protein, homolog of phage Mu protein
4122



gp30



conserved hypothetical protein
4123


pyrD
dihydroorotate dehydrogenase
4124


trpH
TrpH
4125



conserved hypothetical protein
4126


fumC
fumarate hydratase class II
4127



putative glycosyl transferase, glycosyl transferase
4128



family 8 protein


holC
DNA polymerase III, chi subunit
4129



conserved hypothetical protein
4130



conserved hypothetical protein
4131



hypothetical protein
4132



conserved hypothetical protein
4133



hypothetical protein
4134


valS
valyl-tRNA synthetase
4135



hypothetical protein
4136


trpC
tryptophan biosynthesis protein trpCF
4137


trpD
anthranilate phosphoribosyltransferase
4138



conserved hypothetical protein
4139


trpG
anthranilate synthase component II
4140


trpE
anthranilate synthase component I
4141



pseudogene for conserved hypothetical
4142



glycosyltransferase


ftnB
ferritin-like protein 2
4143


ftnA
ferritin-like protein 1
4144


pstS
phosphate-binding periplasmic protein precursor PstS
4145


pstC
phosphate transport system permease protein PstC
4146


pstA
phosphate transport system permease protein PstA
4147


pstB
phosphate import ATP-binding protein PstB
4148


phoB
phosphate regulon transcriptional regulatory protein
4149



PhoB


phoR
phosphate regulon sensor protein PhoR
4150


sbcB
exodeoxyribonuclease I
4151



conserved hypothetical protein
4152



conserved hypothetical protein
4153



cell division protein MukB
4154



cell division protein MukE
4155


haeIIR
type II restriction enzyme HaeII
4156


haeIIM
modification methylase HaeII
4157


mukF
MukF homolog
4158



predicted ATPase of the PP-loop superfamily
4159



implicated in cell cycle control



predicted dissimilatory sulfite reductase,
4160



desulfoviridin, gamma subunit


mop
probable molybdenum-pterin binding protein
4161


pqqL
probable zinc protease
4162


thrS
threonyl-tRNA synthetase
4163


acpD
probable acyl carrier protein phosphodiesterase
4164


topA
DNA topoisomerase I
4165



putative HTH-type transcriptional regulator
4166


pntB
NAD(P) transhydrogenase subunit beta
4167


pntA
NAD(P) transhydrogenase subunit alpha
4168


glgP
glycogen phosphorylase
4169


glgA
glycogen synthase
4170


glgC
glucose-1-phosphate adenylyltransferase
4171


glgX
glycogen operon protein GlgX
4172


glgB
1,4-alpha-glucan branching enzyme
4173


malQ
4-alpha-glucanotransferase
4174



conserved hypothetical protein
4175


glnS
glutaminyl-tRNA synthetase
4176


cafA
ribonuclease G
4177


putP
sodium/proline symporter
4178



conserved hypothetical protein
4179


cdd
cytidine deaminase
4180



conserved hypothetical DNA-binding ferritin-like
4181



protein


pepT
peptidase T
4182


potA
spermidine/putrescine transport ATP-binding protein
4183


potB
spermidine/putrescine transport permease protein
4184


potC
spermidine/putrescine transport system permease
4185



protein


potD1
spermidine/putrescine-binding periplasmic protein 1
4186



precursor


uupA
ABC transporter ATP-binding protein
4187



deoxyguanosinetriphosphate triphosphohydrolase-like
4188



protein



putative effector of murein hydrolase
4189



putative effector of murein hydrolase
4190



predicted micrococcal nuclease-like protein
4191


nifS
predicted selenocysteine lyase
4192



predicted SufE protein probably involved in Fe-S
4193



center assembly



Zn-ribbon-containing, possible nucleic-acid binding
4194



protein



predicted enzyme related to GTP cyclohydrolase I
4195


tyrA
T-protein
4196


truB
tRNA pseudouridine synthase B
4197


rbfA
ribosome-binding factor A
4198


hsdM3
putative type I restriction enzyme HindVIIP M protein
4199


hsdS3
putative type I restriction enzyme HindVIIP
4200



specificity protein



hypothetical protein
4201


hsdR3
putative type I restriction enzyme HindVIIP R protein
4202


infB
translation initiation factor IF-2
4203


nusA
transcriptional elongation protein NusA
4204



conserved hypothetical protein
4205



hypothetical protein
4206



hypothetical protein
4207



hypothetical protein
4208



hypothetical protein
4209



hypothetical protein
4210



hypothetical protein
4211



hypothetical protein
4212



probable tail fiber protein
4213



predicted bacteriophage P2-related tail formation
4214



protein gpI



predicted phage-related baseplate assembly protein
4215



predicted baseplate assembly protein W
4216



predicted phage P2-like baseplate assembly protein
4217



hypothetical protein
4218



hypothetical protein
4219



hypothetical protein
4220



hypothetical protein
4221



predicted phage-related tail protein
4222



hypothetical protein
4223



hypothetical protein
4224



hypothetical protein
4225



hypothetical protein
4226



probable bacteriophage tail completion protein gpS
4227



homolog



probable bacteriophage tail completion protein gpR
4228



homolog



hypothetical protein
4229



hypothetical protein
4230



DnaK suppressor protein, bacteriophage PSP3 gp34
4231



homolog



hypothetical protein
4232



predicted phage-related lysozyme
4233



hypothetical protein
4234



hypothetical protein
4235



hypothetical protein
4236



predicted terminase, endonuclease subunit
4237



predicted major capsid protein
4238



predicted capsid scaffolding protein
4239



terminase, ATPase subunit
4240



predicted portal vertex protein
4241


siaB
acylneuraminate cytidylyltransferase
4242



putative NAD(P)H nitroreductase
4243


mrp
Mrp
4244


metG
methionyl-tRNA synthetase
4245


tehB
tellurite resistance protein
4246


gloB
probable hydroxyacylglutathione hydrolase
4247



conserved hypothetical protein
4248



conserved hypothetical protein
4249



conserved hypothetical protein
4250


gyrA
DNA gyrase subunit A
4251


metX
homoserine O-acetyltransferase
4252


sanA
SanA
4253


folC
folylpolyglutamate synthase
4254


accD
acetyl-coenzyme A carboxylase carboxyl transferase
4255



subunit beta


htoA
probable periplasmic serine protease do/HhoA-like
4256



precursor


mfd
transcription-repair coupling factor
4257



conserved hypothetical protein
4258



predicted P-loop ATPase fused to an acetyltransferase
4259



conserved hypothetical protein
4260



ABC transporter ATP-binding protein
4261



predicted plasmid maintenance system antidote protein
4262



predicted plasmid maintenance system killer protein
4263



predicted ABC-type transport system, periplasmic
4264



component



predicted ABC-type transport system, permease
4265



component


uvrB
UvrABC system protein B
4266



predicted phosphoglycerol transferase-like protein
4267


mao2
NADP-dependent malic enzyme
4268



possible polysaccharide biosynthesis protein
4269


rsuA
ribosomal small subunit pseudouridine synthase A
4270


bcr
bicyclomycin resistance protein
4271



conserved hypothetical protein
4272



predicted membrane protein
4273


proA
gamma-glutamyl phosphate reductase
4274


dnaJ
chaperone protein DnaJ
4275


dnaK
chaperone protein DnaK
4276



conserved hypothetical protein
4277


mgsA
methylglyoxal synthase
4278


aceE
pyruvate dehydrogenase E1 component
4279


aceF
dihydrolipoamide acetyltransferase component of
4280



pyruvate dehydrogenase complex


lpdA
dihydrolipoamide dehydrogenase
4281



hypothetical protein
4282


apt
adenine phosphoribosyltransferase
4283


dnaX
DNA polymerase III subunit gamma/tau
4284


upp
uracil phosphoribosyltransferase
4285


uraA
probable uracil permease
4286



predicted ATPase involved in DNA replication
4287



initiation



predicted translation initiation factor 1-like protein
4288


pyrF
orotidine 5′-phosphate decarboxylase
4289



predicted N-acetylglucosaminyl transferase
4290



predicted membrane protein
4291


ihfB
integration host factor beta subunit
4292


rpsA
30S ribosomal protein S1
4293


cmk
cytidylate kinase
4294



conserved hypothetical pyridoxine biosynthesis
4295



enzyme



predicted glutamine amidotransferase involved in
4296



pyridoxine biosynthesis


dld
D-lactate dehydrogenase
4297



conserved hypothetical protein
4298


nlpC
probable lipoprotein NlpC
4299


tldD
TldD
4300



predicted methyltransferase
4301



putative lipoprotein
4302



predicted endonuclease distantly related to archaeal
4303



Holliday junction resolvase



predicted phosphoheptose isomerase
4304



predicted periplasmic or secreted lipoprotein
4305


nrdA
ribonucleoside-diphosphate reductase alpha chain
4306


nrdB
ribonucleoside-diphosphate reductase beta chain
4307


sucB
dihydrolipoamide succinyltransferase component of
4308



2-oxoglutarate dehydrogenase complex


sucA
2-oxoglutarate dehydrogenase E1 component
4309



predicted Zn-dependent hydrolase-like protein,
4310



including glyoxylases



conserved putative deoxyribonuclease
4311



conserved hypothetical protein
4312



conserved hypothetical protein
4313


prc
tail-specific protease precursor
4314


proQ
predicted activator of osmoprotectant transporter
4315



ProP



paraquat-inducible protein A-like protein
4316



paraquat-inducible protein B-like protein
4317


moaE
molybdopterin converting factor subunit 2
4318


moaD
molybdopterin converting factor subunit 1
4319


moaC
molybdenum cofactor biosynthesis protein C
4320


moaA
molybdenum cofactor biosynthesis protein A
4321



predicted regulator of cell morphogenesis and NO
4322



signaling



predicted regulator of cell morphogenesis and NO
4323



signaling



probable phosphosugar isomerase Hi1678
4324


yrbI
3-deoxy-D-manno-octulosonate 8-phosphate
4325



phosphatase


hmw1A
HMW1A, high molecular weight adhesin 1
4326


hmw1B
HMW1B, OMP-85-like protein required for secretion
4327



of HMW1A and HMW2A


hmw1C
HMW1C, putative glycosyltransferase involved in
4328



glycosylation of HMW1A and HMW2A



predicted membrane protein
4329



conserved hypothetical protein
4330


sohB
possible protease SohB
4331


rnfA
predicted NADH:ubiquinone oxidoreductase, subunit
4332



RnfA


rnfB
predicted NADH:ubiquinone oxidoreductase, subunit
4333



RnfB


rnfC
predicted NADH:ubiquinone oxidoreductase, subunit
4334



RnfC


rnfD
predicted NADH:ubiquinone oxidoreductase, subunit
4335



RnfD


rnfG
predicted NADH:ubiquinone oxidoreductase, subunit
4336



RnfG



predicted NADH:ubiquinone oxidoreductase, subunit
4337



RnfE


nth
endonuclease III
4338



predicted Na+-dependent transporters of the SNF
4339



family


modC
molybdenum import ATP-binding protein
4340


modB
molybdenum transport system permease protein
4341


modA
molybdate-binding periplasmic protein
4342


modE
Transcriptional regulator ModE
4343


lsgF
Putative UDP-galactose-lipooligosaccharide
4344



galactosyltransferase


lsgE
Putative UDP-galactose-lipooligosaccharide
4345



galactosyltransferase


lsgD
Putative UDP-glcNAc-lipooligosaccharide N-
4346



acetylglucosaminyl glycosyltransferase


lsgC
Putative UDP-galactose-lipooligosaccharide
4347



galactosyltransferase


lsgB
CMP-N-acetylneuraminate-beta-galactosamide-alpha-
4348



2,3-sialyltransferase


lsgA
putative lipooligosaccharide flippase
4349



conserved hypothetical protein
4350



predicted permease
4351



predicted permease
4352


pepA
cytosol aminopeptidase
4353



predicted choline-glycine betaine transporter
4354


qseC
sensor protein QseC
4355


qseB
transcriptional regulatory protein QseB
4356



conserved hypothetical protein
4357


crr
PTS system, glucose-specific IIA component
4358


ptsI
Phosphoenolpyruvate-protein phosphotransferase
4359


ptsH
Phosphocarrier protein HPr
4360



probable GTPase
4361


orn
oligoribonuclease
4362


wecA
undecaprenyl phosphate
4363


glnD
[protein-PII] uridylyltransferase
4364


map
methionine aminopeptidase
4365



conserved hypothetical protein
4366



conserved hypothetical protein
4367


mrcB
penicillin-binding protein 1B
4368


purC
phosphoribosylaminoimidazole-succinocarboxamide
4369



synthase


argG
argininosuccinate synthase
4370



Mn2+ and Fe2+ transporter of the NRAMP family
4371



conserved hypothetical protein
4372



predicted allophanate hydrolase subunit 2
4373



predicted allophanate hydrolase subunit 1
4374


rnb
exoribonuclease II
4375


fabI
enoyl-[acyl-carrier-protein] reductase [NADH]
4376


prfC
peptide chain release factor 3
4377



conserved hypothetical protein
4378



predicted branched chain amino acid permease
4379



predicted branched chain amino acid permease
4380


metR
HTH-type transcriptional regulator MetR
4381


lldD
L-lactate dehydrogenase
4382


murI
glutamate racemase
4383


recG
ATP-dependent DNA helicase
4384


spoT
guanosine-3′,5′-bis(diphosphate) 3′-pyrophosphohydrolase
4385


rpoZ
DNA directed RNA polymerase omega chain
4386


gmk
guanylate kinase
4387









Example 10
Comparison of the Genomes of NTHi, Strain 86-028NP and H. influenzae, Strain Rd

The genomic sequence of strain 86-028NP contains 1,913,428 bp. This is approximately 4 percent larger than the strain Rd genome (1,830,137 bp) (Fleischmann et al., Science 269:496-512, 1995). Strain 86-028NP also has a larger number of genes: 1942 compared to 1743 in strain Rd. The gene complement was compared to that of strain Rd using the Seqman program in the DNASTAR suite. With 80% identity at the nucleotide level as a cutoff value, 285 ORFs were identified in the 86-028NP genome that were absent from the strain Rd genome and 167 ORFs were identified in the strain Rd genome that were absent from the strain 86-028NP genome.
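The size comparison can be checked with a few lines of arithmetic. The following is only an illustrative sketch using the base-pair and gene counts quoted above; the function name `percent_larger` is a hypothetical helper and not part of any tool named in this document. The computed genome-size difference comes to roughly 4.5 percent, which agrees with the rounded figure given in the text.

```python
# Genome sizes (bp) and gene counts as quoted in the paragraph above.
NTHI_86_028NP_BP = 1_913_428
RD_BP = 1_830_137
NTHI_86_028NP_GENES = 1942
RD_GENES = 1743


def percent_larger(a: int, b: int) -> float:
    """Return how much larger a is than b, expressed as a percentage of b."""
    return (a - b) / b * 100.0


size_diff = percent_larger(NTHI_86_028NP_BP, RD_BP)
gene_diff = percent_larger(NTHI_86_028NP_GENES, RD_GENES)

print(f"genome size: {size_diff:.2f}% larger than strain Rd")
print(f"gene count:  {gene_diff:.2f}% larger than strain Rd")
```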


Strain 86-028NP, like strain Rd, has six ribosomal operons. Using tRNAscan-SE v1.11, 58 tRNA genes were identified in the strain 86-028NP genome, representing the 20 common amino acids. The tRNA-Glu, tRNA-Ala and tRNA-Ile genes were located in spacer regions between the 16S and 23S ribosomal RNA genes. A tRNA gene containing the UCA anticodon was also identified. This anticodon corresponds to an opal stop codon and is typically associated with an opal-suppressing tRNA that incorporates selenocysteine. The tRNA is adjacent to two genes encoding selB (NTHI0836), a Sec-tRNA-specific elongation factor, and selA (NTHI0835), the enzyme that converts serine to dehydroalanine preparatory to forming selenocysteine by incorporation of selenium (Forchhammer et al., Nature 342:453-6, 1989). The selD gene (NTHI0297), encoding selenophosphate synthetase, was also identified. The importance of this selenocysteine system is evidenced by the coding sequence for the alpha subunit of formate dehydrogenase (NTHI0007) containing an in-frame TGA stop codon that is presumably read as a selenocysteine codon. The in-frame TGA stop codon was previously noted in the current annotation of the strain Rd formate dehydrogenase gene (GenPept Accession P46448).
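To illustrate what it means for an in-frame TGA to be read as selenocysteine rather than as a stop, the sketch below translates a short coding sequence with the standard genetic code and optionally recodes TGA at chosen codon positions as "U". The sequence `toy_cds` is a made-up four-codon example, not the formate dehydrogenase gene, and the `translate` function and its `sec_codons` parameter are hypothetical names introduced only for this illustration.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str, sec_codons: frozenset = frozenset()) -> str:
    """Translate a coding sequence codon by codon.

    sec_codons holds zero-based codon indices at which an in-frame TGA
    is read as selenocysteine ('U') instead of a stop ('*').
    """
    cds = cds.upper()
    protein = []
    for idx in range(len(cds) // 3):
        codon = cds[3 * idx: 3 * idx + 3]
        if codon == "TGA" and idx in sec_codons:
            protein.append("U")
        else:
            protein.append(CODON_TABLE[codon])
    return "".join(protein)


toy_cds = "ATGTGAGGCTAA"  # hypothetical sequence: ATG TGA GGC TAA

print(translate(toy_cds))                   # TGA treated as a stop
print(translate(toy_cds, frozenset({1})))   # TGA at codon 1 read as Sec
```

With conventional translation the toy sequence appears to terminate at the second codon; with recoding, translation continues through it to the final stop.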


A gross comparison between the genomes involving analysis of the gene order of strain 86-028NP and that of strain Rd reveals a single major rearrangement in the form of a large inversion. This 471 kb inversion represents almost 25% of the strain 86-028NP genome and is bounded by NTHI1391 and NTHI1394 (homologues of HI1218 and HI1645 respectively) and by NTHI1949 and NTHI1950 (homologues of HI1219 and HI1647 respectively). HI1219 and HI1646 are partially duplicated genes in strain Rd annotated as cmkA and cmkB (cytidylate kinases). One cmk gene (NTHI1949) is present in strain 86-028NP with a small cmk-like fragment between NTHI1391 and NTHI1394. Several clones from the scaffolding library overlap each end of the inversion in the 86-028NP genome, validating our assembly. Within this large inversion are several insertions, the largest of which are approximately 13 kb, 27 kb and 51 kb in size. These regions contain predominantly hypothetical and conserved hypothetical genes as well as a number of homologues of phage genes. For example, the 27 kb insertion contains remnants of HP1- and HP2-like phage genes. The largest insert is bounded by homologues of integrase genes. In strain Rd, a mu-like phage is localized to this region (Morgan et al., J Mol Biol 317:337-59, 2002). This phage is not present in the strain 86-028NP genome. Also within the large inverted region is a 21 kb inversion that restores synteny with the Rd genome.


In addition to the large inversion, strain 86-028NP has other regions of divergence from co-linearity with the strain Rd genome. These include 9 regions greater than 5 kb, which contain sequences with no apparent homology to DNA that is present in strain Rd. Two of these regions contain the HMW adhesins that are discussed below. Hypothetical genes predominate in six of the unique regions. The ninth region is approximately 56 kb in size. It lies between NTHI0100 and NTHI0165. BLASTn analysis indicated that genes in this region, designated ICEHin86-028NP, have high homology to genes in the H. influenzae type b plasmid, ICEHin1056 (Mohd-Zain et al., J Bacteriol 186:8114-22, 2004). ICEHin1056 is a member of an extended family of genomic islands that are defined by a series of common core genes (Mohd-Zain et al., J Bacteriol 186:8114-22, 2004). ICEHin86-028NP possesses homologues of 45 ICEHin1056 ORFs. These include ORFs near the 5′ end of ICEHin86-028NP, including the defined core genes, that primarily encode proteins with putative roles in plasmid replication and conjugation and ORFs near the 3′ end that primarily encode conserved hypothetical proteins with motifs that suggest that they may be either membrane associated or exported. Notably, ICEHin86-028NP lacks the genes encoding proteins involved in tetracycline, chloramphenicol and β-lactam resistance found in ICEHin1056. Scattered within ICEHin86-028NP are a transposase, resolvases, and a putative integrase regulator suggesting that ICEHin86-028NP is a composite element derived from several mobile genetic elements.


ICEHin1056 has a sequence designated as an attP site 5′ to the first gene. In strain 86-028NP, a perfect copy of this attP site is present 5′ to NTHI0101 and a copy of this attP site, with a single nucleotide change, is present 3′ of NTHI0164. The attP sites are implicated in the incorporation of mobile genetic elements into bacterial chromosomes to form genomic islands, possibly suggesting a mechanism by which this large section of genetic material became integrated into the strain 86-028NP genome (Dimopoulou et al., Antimicrob Agents Chemother 46:1602-3, 2002). ICEHin86-028NP has a G+C content of 39%, lower than any of the other related genomic islands and close to strain 86-028NP's overall genome G+C content of 38%. This implies a long-term genomic association for this element. The presence of this element with its complement of genes homologous to those in ICEHin1056 (Dimopoulou et al., Antimicrob Agents Chemother 46:1602-3, 2002), which are thought to encode membrane-associated and secreted proteins, may have important implications for the virulence of strain 86-028NP.
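The G+C comparison above rests on a simple base-composition calculation. A minimal sketch follows, assuming a plain nucleotide string as input; `gc_content` is a hypothetical helper and `toy_island` is an invented sequence, not data from ICEHin86-028NP.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / total


toy_island = "ATGCATTTAGCAATTGCAAA"  # hypothetical 20-base sequence

print(f"G+C content: {gc_content(toy_island):.0%}")
```

Ambiguity codes such as N are ignored in this sketch rather than counted toward the total, which is one of several reasonable conventions.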


Several members of the Pasteurellaceae including Haemophilus ducreyi, Pasteurella multocida and Actinobacillus actinomycetemcomitans produce well characterized protein toxins. In contrast, H. influenzae does not appear to produce protein toxins and genes encoding putative protein toxins were not identified in the strain 86-028NP genome. In H. influenzae, the genes encoding glycosyltransferases responsible for endotoxin biosynthesis and genes encoding proteins that give the bacteria enhanced “fitness” during the process of infection have generally been considered virulence determinants. These genes include those that encode adhesins, the heme and haemoglobin binding proteins as well as the genes that encode proteins that protect against oxidative stress.


Contingency Genes



H. influenzae has a limited number of two-component regulatory systems and other global regulators. Moxon and co-workers have argued that loci termed “simple contingency loci” provide an alternative mechanism for regulating gene expression, thus increasing the fitness of an organism by contributing to that organism's ability to rapidly respond to changing environmental conditions. These loci contain short tandem sequence repeats either within, or 5′ to, a coding region. During DNA replication, addition or loss of a repeat within a reading frame results in an alteration in the reading frame. When localized 5′ to a coding region, addition or loss of a repeat results in a change in promoter activity (Bayliss et al., J Clin Invest 107:657-62, 2001). Loci containing simple sequence repeats have been studied extensively in H. influenzae (see, for example, Hood et al., Proc Natl Acad Sci USA 93:11121-5, 1996). Several of the loci described in the following sections as phase variable contain simple sequence repeats.
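The effect of repeat gain or loss on a reading frame can be sketched numerically: a repeat unit whose length is not a multiple of three shifts the downstream frame whenever one unit is added or removed. The example below is an illustration under stated assumptions; `frame_offset` is a hypothetical helper, and the four-base unit "CAAT" and the repeat counts are chosen only for demonstration.

```python
def frame_offset(repeat_unit: str, n_repeats: int) -> int:
    """Reading-frame offset (0, 1 or 2) contributed by a tandem repeat tract.

    An offset of 0 means the tract keeps downstream codons in frame;
    1 or 2 means the downstream sequence is read out of frame.
    """
    return (len(repeat_unit) * n_repeats) % 3


# Gaining or losing one 4-base unit moves the downstream frame by one base.
for n in range(20, 25):
    state = "in frame" if frame_offset("CAAT", n) == 0 else "out of frame"
    print(f"{n} repeats: offset {frame_offset('CAAT', n)} ({state})")
```

In this toy case only every third repeat count leaves the downstream sequence in frame, which is the basis for the on/off switching described above.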


Adhesins


Strain 86-028NP possesses a number of genes which encode products that primarily function in adherence to host cells (Table 8). One of these, the outer membrane protein P5, has previously been identified and its function carefully dissected (Jiang et al., Infect Immun 67:187-92, 1999; Kennedy et al., Infect Immun 68:2756-65, 2000; Novotny et al., J Immunol 171:1978-83, 2003; Novotny et al., Infect Immun 68:2119-28, 2000; Novotny et al., Vaccine 20:3590-7, 2002; Sirakova et al., Infect Immun 62:2002-20, 1994). Strain 86-028NP possesses a gene cluster containing four genes that are homologues of pilABCD from strain Rd, Actinobacillus pleuropneumoniae and P. multocida (Bakaletz et al., Infect Immun 73:1635-4, 2005; Doughty et al., Vet Microbiol 72:79-90, 2000; Ruffolo et al., Infect Immun 65:339-43, 1997; Stevenson et al., Vet Microbiol 92:121-34, 2003). These genes, together with the comE gene and genes yet to be identified, encode a type IV pilus that has a role in adherence of strain 86-028NP to nasopharyngeal tissues (Kennedy et al., Infect. Immun., 68: 2756-2765, 2000).


Strain 86-028NP possesses two high molecular weight (HMW) adhesin gene clusters that are absent in strain Rd. The high molecular weight adhesins were first characterized in NTHi, strain 12, which has two HMW gene clusters, each encoding three proteins (HMWA, HMWB and HMWC). HMWA is the structural component of the adhesin, HMWB has a role in trans-membrane translocation, while HMWC is required for glycosylation of HMWA (Barenkamp et al., Infect Immun 60:1302-13, 1992; Barenkamp et al., Infect Immun 62:3320-8, 1994; Grass et al., Mol Microbiol 48:737-51, 2003; St Geme et al., Mol Microbiol 27:617-30, 1998). Similarly, strain 86-028NP's two HMW gene clusters contain homologues of the hmwA, B and C genes in the same gene context as in strain 12 (Buscher et al., J Bacteriol 186:4209-17, 2004). The HMW1A and HMW2A proteins from strain 86-028NP are 72% identical, with the major area of divergence, including a 41 amino acid insertion in HMW2A, toward the C-termini. The paired HMWB and HMWC proteins from strain 86-028NP are 99% identical, respectively. The sequence ATCTTTC is repeated 17 times upstream of hmw1A and 23 times upstream of hmw2A. In strain 12, 16 repeats of this sequence are found 5′ of each hmw gene cluster (Barenkamp et al., Infect Immun 60:1302-13, 1992).
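Repeat counts such as the 17 and 23 copies of ATCTTTC noted above can be obtained by scanning an upstream region for the longest uninterrupted run of the motif. The sketch below shows one way to do this with a regular expression; `longest_tandem_run` is a hypothetical helper, and the two test sequences are synthetic constructs with invented flanking bases, not the actual hmw1A or hmw2A upstream regions.

```python
import re


def longest_tandem_run(seq: str, unit: str) -> int:
    """Length, in repeat units, of the longest uninterrupted run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


MOTIF = "ATCTTTC"

# Synthetic stand-ins: arbitrary flanks around the stated number of repeats.
upstream_hmw1A = "GGTA" + MOTIF * 17 + "ATGAAC"
upstream_hmw2A = "GGTA" + MOTIF * 23 + "ATGAAC"

print("hmw1A-like tract:", longest_tandem_run(upstream_hmw1A, MOTIF))
print("hmw2A-like tract:", longest_tandem_run(upstream_hmw2A, MOTIF))
```

This approach counts only perfect, contiguous copies; tracts interrupted by a point mutation would be reported as separate shorter runs.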


Hap is an autotransported protein with a domain homologous to the catalytic domain of IgA1 proteases. The NTHI0354 gene encodes a protein with 83% identity to Hap from the NTHi strain N187 (St Geme et al., Mol Microbiol 14:217-3, 1994). Strain 86-028NP, along with other NTHi strains that possess HMW1 and HMW2, lacks the gene encoding Hia, another Haemophilus adhesin (Barenkamp et al., Mol Microbiol 19:1215-23, 1996). Strain 86-028NP also lacks the hif gene cluster encoding the hemagglutinating pilus, as we previously reported (Munson et al., Infect Immun 72:3002-10, 2004).









TABLE 8

NTHi genes that encode proteins that primarily function in adherence to host cells

| NTHI# | HI# | Gene name | SEQ ID NO: | Function | Contingency Repeats |
|---|---|---|---|---|---|
| 354 | | hap | 1080 | Adhesion and penetration protein Hap | |
| 406 | 296 | pilD | 1125 | Putative type 4 prepilin-like protein specific leader peptidase (EC 3.4.23.43) | |
| 407 | 297 | pilC | 1126 | Putative type IV pilin secretion protein | |
| 408 | 298 | pilB | 1127 | Putative type IV pilin secretion protein | |
| 409 | 299 | pilA | 1128 | Type IV pilin subunit protein | |
| 1332 | 1164 | ompP5 | 1953 | Outer membrane protein P5 (OMP P5-homologous adhesin) | |
| 1448 | | hmw2C | 2057 | HMW2C, putative glycosyltransferase involved in glycosylation of HMW1A and HMW2A | |
| 1449 | | hmw2B | 2058 | HMW2B, OMP-85-like protein required for HMW1A and HMW2A secretion | |
| 1450 | | hmw2A | 2059 | HMW2A, high molecular weight adhesin 2 | ATCTTTC repeated 23 times, 5′ of gene |
| 1983 | | hmw1A | 2530 | HMW1A, high molecular weight adhesin 1 | ATCTTTC repeated 17 times, 5′ of gene |
| 1984 | | hmw1B | 2531 | HMW1B, OMP-85-like protein required for secretion of HMW1A and HMW2A | |
| 1985 | | hmw1C | 2532 | HMW1C, putative glycosyltransferase involved in glycosylation of HMW1A and HMW1B | |









In Tables 8, 9 and 10, the “NTHI number” refers to the locus tag number within the NTHi strain 86-028NP genome as indicated at the Microbial-Pathogenesis H. influenzae 86028 NP web site and in Genbank Accession No. CP000057. The “HI number” refers to the corresponding locus tag number in the TIGR (The Institute for Genomic Research) ...


Lipooligosaccharide Synthesis


The structure, biosynthesis and role in virulence of H. influenzae lipooligosaccharide (LOS) have been studied extensively. Table 9 contains a list of genes involved in lipooligosaccharide biosynthesis. Strain 86-028NP has the full complement of genes required to synthesize the heptose-Kdo-Lipid A portion of LOS. The lgtF and lpsA genes encode glycosyltransferases that add glucose, and glucose or galactose, to heptose residues 1 and 3, respectively. Both of these genes are present in the strain 86-028NP genome; it is therefore likely that carbohydrate chains can be extended from the heptose 1 and heptose 3 residues of the strain 86-028NP LOS (Hood et al., Microbiology 150:2089-97, 2004). In the serotype b strain RM153, the lic2C gene encodes a glucosyltransferase that adds glucose to heptose 2 (Hood et al., Microbiology 150:2089-97, 2004). In the strain 86-028NP genome, this gene contains a frame shift. The phase variable lic2A and licA genes, encoding a galactosyltransferase and choline kinase, respectively, are present in the strain 86-028NP genome (High et al., Mol Microbiol 9:1275-82, 1993; Hood et al., Glycobiology 11:957-67, 2001; Weiser et al., Infect Immun 65:943-50, 1997). The lex2B gene, which encodes a glucosyltransferase in the serotype b strain DL42 as well as in a number of other serotypeable strains, is present in the strain 86-028NP genome (Griffin et al., Microbiology 149:3165-75, 2003; Jarosik et al., Infect Immun 62:4861-7, 1994). Five-prime to the lex2B gene in strain DL42 is the short phase variable lex2A gene. In strain 86-028NP, this gene is out-of-frame compared to the DL42 sequence (Genbank Accession U05670), due to the loss of one tetranucleotide repeat and a 5 bp deletion. Recently, Hood and co-workers described a locus in strain Rd, designated hmg, that contains HI0866 through HI0874 (Hood et al., J Bacteriol 186:7429-39, 2004). With the exception of a homologue of rmlB, these genes are absent from the strain 86-028NP genome. The absent genes include the siaA gene, which encodes a sialyltransferase recently shown to be important in biofilm formation in NTHi strain 2019 (Greiner et al., Infect Immun 72:4249-60, 2004; Jones et al., J Biol Chem 277:14598-611, 2002). Two copies of a homologue of the lic3A gene, encoding an alternative sialyltransferase, were identified in the strain 86-028NP genome (Hood et al., Mol Microbiol 39:341-50, 2001; Jones et al., J Biol Chem 277:14598-611, 2002), as well as a copy of the lsgB gene that encodes another sialyltransferase (Jones et al., J Biol Chem 277:14598-611, 2002).









TABLE 9

NTHi genes involved in lipooligosaccharide biosynthesis

| NTHI# | HI# | Gene name | SEQ ID NO: | Function | Contingency Repeats |
|---|---|---|---|---|---|
| 68 | 58 | kdsB | 831 | 3-deoxy-D-manno-octulosonic acid cytidylyltransferase | |
| 69 | 59 | lpxK | 832 | Tetraacyldisaccharide 4′-kinase | |
| 72 | 60 | msbA | 833 | Lipid A export ATP-binding protein msbA | |
| 296 | 199 | msbB | 1029 | Lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase | |
| 365 | 258 | lgtC | 1090 | UDP-galactose--lipooligosaccharide galactosyltransferase | GACA repeated 10 times, in frame |
| 366 | 260 | orfM | 1091 | Xanthosine triphosphate pyrophosphatase | |
| 367 | 260.1 | kdkA | 1092 | 3-deoxy-D-manno-octulosonic acid kinase | |
| 368 | 261 | opsX | 1093 | ADP-heptose--lipooligosaccharide heptosyltransferase I | |
| 383 | 275 | lpt6 | 1107 | PE-tn-6--lipooligosaccharide phosphorylethanolamine transferase | |
| 471 | 351 | galE | 1184 | UDP-glucose 4-epimerase | |
| 472 | 352 | lic3A | 1185 | CMP-neu5Ac--lipooligosaccharide alpha 2-3 sialyltransferase | CAAT repeated 18 times, in frame |
| 512 | 391 | | | Predicted acyltransferase | AGCA repeated 8 times, in frame |
| 649 | 523 | waaQ | 1350 | ADP-heptose--lipooligosaccharide heptosyltransferase III | |
| 677 | 550 | lic2A | 1378 | UDP-galactose--lipooligosaccharide galactosyltransferase | CAAT repeated 14 times, in frame |
| 772 | 652 | kdtA | 1462 | 3-deoxy-D-manno-octulosonic acid transferase | |
| 773 | 653 | lgtF | 1463 | UDP-glucose--lipooligosaccharide glucosyltransferase | |
| 892 | 735 | lpxH | 1569 | UDP-2,3-diacylglucosamine hydrolase | |
| 899 | 740 | pgmB | 1575 | Phosphoglucomutase | |
| 913 | | lex2B | 1586 | UDP-glucose--lipooligosaccharide glucosyltransferase | |
| 926 | 765 | lpsA | 1597 | Lipooligosaccharide glycosyltransferase | |
| 976 | 812 | galU | 1644 | UTP--glucose-1-phosphate uridylyltransferase | |
| 1034 | | lic3A2 | 1696 | CMP-neu5Ac--lipooligosaccharide alpha 2-3 sialyltransferase | CAAT repeated 18 times, in frame |
| 1037 | 873 | rmlB | 1698 | dTDP-glucose 4,6-dehydratase | |
| 1082 | 915 | lpxC | 1739 | UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase | |
| 1180 | 1005 | | | Predicted PE--lipooligosaccharide phosphorylethanolamine transferase | |
| 1220 | 1060 | lpxB | 1858 | Lipid-A-disaccharide synthase | |
| 1222 | 1061 | lpxA | 1859 | Acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase | |
| 1224 | 1064 | | | Predicted PE--lipooligosaccharide phosphorylethanolamine transferase | |
| 1272 | 1105 | rfaF | 1900 | ADP-heptose--lipooligosaccharide heptosyltransferase II | |
| 1278 | 1114 | rfaD | 1906 | ADP-L-glycero-D-manno-heptose-6-epimerase | |
| 1312 | 1144 | gmbA | 1934 | UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase | |
| 1350 | 1181 | gmhA | 1969 | Phosphoheptose isomerase | |
| 1474 | 1578 | lgtD | 2079 | Putative UDP-glcNAc--lipooligosaccharide N-acetylglucosamine glycosyltransferase | |
| 1576 | 1557 | kdsA | 2172 | Phospho-2-dehydro-3-deoxyoctonate aldolase and 3-deoxy-D-manno-octulosonic acid 8-phosphate synthetase | |
| 1594 | 1540 | licD | 2189 | Phosphorylcholine transferase | |
| 1595 | 1539 | licC | 2190 | Protein licC, CTP--phosphocholine cytidylyltransferase | |
| 1596 | 1538 | licB | 2191 | Protein licB, putative choline uptake protein | |
| 1597 | 1537 | licA | 2192 | Protein licA, choline kinase | CAAT repeated 15 times, in frame |
| 1606 | 1527 | htrB | 2200 | Lipid A biosynthesis lauroyl acyltransferase | |
| 1607 | 1526 | rfaE | 2201 | ADP-heptose synthase | |
| 1664 | 1337 | mrsA | 2251 | Predicted phosphomannomutase | |
| 1750 | | | | Putative glycosyltransferase, glycosyl transferase family 8 protein | GACA repeated 14 times, in frame |
| 1769 | | | | Putative glycosyltransferase | CCAA repeated 17 times, out of frame |
| 1891 | 1279 | siaB | 2433 | CMP-neu5Ac synthetase | |
| 1921 | 1244 | | | Possible polysaccharide biosynthesis protein | |
| 2002 | 1695 | lsgF | 2548 | Putative UDP-galactose--lipooligosaccharide galactosyltransferase | |
| 2003 | 1696 | lsgE | 2549 | Putative UDP-galactose--lipooligosaccharide galactosyltransferase | |
| 2004 | 1697 | lsgD | 2550 | Putative UDP-glcNAc--lipooligosaccharide N-acetylglucosaminyl glycosyltransferase | |
| 2005 | 1698 | lsgC | 2551 | Putative UDP-galactose--lipooligosaccharide galactosyltransferase | |
| 2006 | 1699 | lsgB | 2552 | CMP-N-acetylneuraminate-beta-galactosamide-alpha-2,3-sialyltransferase | |
| 2007 | 1700 | lsgA | 2553 | Putative lipooligosaccharide flippase | |
| 2025 | 1716 | wecA | 2569 | Undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase | |










Iron Acquisition



H. influenzae strains have an absolute requirement for either heme or iron together with protoporphyrin IX (PPIX), the immediate precursor of heme (Evans et al., J Med Microbiol 7:359-65, 1974; White et al., J Bacteriol 85:842-50, 1963). Table 10 contains a list of genes involved in iron acquisition. Three hemoglobin and hemoglobin-haptoglobin binding proteins, HgpA, HgpB and HgpC, were identified in H. influenzae type b strain HI689 (Jin et al., Microbiology 145 (Pt 4):905-14, 1999; Morton et al., Infect Immun 67:2729-39, 1999; Ren et al., Infect Immun 66:4733-41, 1998). In strain HI689, these genes have CCAA tetranucleotide repeats and are known to be regulated by slip-strand mispairing. Two of these genes are present in strain 86-028NP. Both contain CCAA repeats; the hgpB gene is in-frame while the hgpC gene is out-of-frame. The derived amino acid sequence of a third gene that contains CCAA repeats is 45% identical to HgpA. We have designated this gene hgpD; it is out-of-frame. Homologues of the hxuABC genes of H. influenzae type b, which encode proteins involved in the utilization of heme and heme-hemopexin complexes (Cope et al., Infect Immun 69:2353-63, 2001; Cope et al., Infect Immun 66:4511-6, 1998; Cope et al., J Bacteriol 177:2644-53, 1995), as well as a homologue of the hemR receptor, were identified. Strain 86-028NP also has the gene encoding the heme-binding lipoprotein HbpA (Heath et al., Pediatr Infect Dis J 20:300-5, 2001). Downstream of hbpA is NTHI1022, a hypothetical gene whose product is a member of COG0748, a cluster that includes putative heme utilization proteins. A homologue of the hup gene, recently identified in H. influenzae type b, that encodes a general heme utilization protein, was also identified (Morton et al., Microbiology 150:3923-33, 2004).
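The effect of slip-strand mispairing on the reading frame can be illustrated with a short sketch. Gaining or losing one tetranucleotide unit changes the tract length by 4 nt, which shifts everything downstream by one position relative to the codon grid; only a net change of three units restores the original frame. The construct below is purely hypothetical (an invented start codon, CCAA tract and 12-nt tail), and the helper names are assumptions made for illustration. In a real gene, which repeat numbers are in frame depends on the flanking sequence, so no rule about specific counts should be read into it.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_position(cds: str):
    """Return the 0-based nucleotide index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


def is_in_frame(cds: str) -> bool:
    """True when the first in-frame stop codon is the final codon of the sequence."""
    stop = first_stop_position(cds)
    return stop is not None and stop == len(cds) - 3


def toy_gene(n_repeats: int, motif: str = "CCAA") -> str:
    # Hypothetical construct: start codon, repeat tract, short tail ending in TAA.
    # The tail is designed so that shifted frames hit an early stop codon.
    return "ATG" + motif * n_repeats + "GTAACTGACTAA"


for n in (11, 12, 13, 14, 15):
    state = "in frame" if is_in_frame(toy_gene(n)) else "out of frame"
    print(n, "repeats:", state)
```

In this toy construct the intended stop codon is reached only when the repeat number is a multiple of three; every other repeat number terminates translation prematurely, which is the on/off behavior that the "in frame" and "out of frame" annotations in the Contingency Repeats column describe.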


In addition to the heme transport systems, iron transport systems were also identified. The hitABC genes encode the FbpABC proteins, respectively, members of a highly specific ferric iron ABC transport system that was elegantly characterized by complementing a siderophore-deficient E. coli strain with the hitABC genes cloned from an H. influenzae type b strain (Anderson et al., J Bacteriol 186:6220-9, 2004). Genes encoding transferrin-binding proteins 1 and 2 (tbpAB) (Gray-Owen et al., Infect Immun 63:1201-10, 1995; Gray-Owen et al., Infect Immun 63:3809-15, 1995), as well as genes designated hfeABCD that are homologues of an ABC transport system involved in iron uptake, originally characterized in Yersinia pestis (Bearden et al., J Bacteriol 180:1135-47, 1998), were identified. This latter gene cluster is also present in strain Rd. NTHI2035 encodes a putative homologue of the NRAMP family of Mn2+ and Fe2+ transporters (Richer et al., J Mol Evol 57:363-7, 2003).


As noted above, H. influenzae can use iron, together with PPIX, as a source of heme for growth in vitro. The hemH gene encoding ferrochelatase, which catalyzes the incorporation of iron into PPIX (Schlor et al., Infect Immun 68:3007-9, 2000), was identified. The gene encoding the global regulator, Fur, was also identified (Andrews et al., FEMS Microbiol Rev 27:215-37, 2003; Smoot et al., J Med Microbiol 48:629-3, 1999).









TABLE 10

NTHi Genes Involved in Iron Acquisition

| NTHI# | HI# | Gene name | SEQ ID NO: | Function | Contingency Repeats |
|---|---|---|---|---|---|
| 177 | 97 | hitA | 931 | hFbpA, Iron-utilization periplasmic protein | |
| 179 | 98 | hitB | 932 | hFbpB, Iron(III)-transport system permease protein | |
| 180 | 99 | hitC | 933 | hFbpC, Iron-utilization ATP-binding protein | |
| 202 | 113 | hemR | 951 | Hemin receptor | |
| 284 | 190 | fur | 1020 | Ferric uptake regulation protein | |
| 369 | 262 | hxuC | 1094 | Heme/hemopexin-binding protein C (Heme: hemopexin utilization protein C) | |
| 370 | 263 | hxuB | 1095 | Heme/hemopexin-binding protein B (Heme: hemopexin utilization protein B) | |
| 371 | 264 | hxuA | 1096 | Heme/hemopexin-binding protein A (Heme: hemopexin utilization protein A) | |
| 477 | 359 | hfeD | 1090 | Putative ABC-type chelated iron transport system, permease component | |
| 478 | 360 | hfeC | 1191 | Putative ABC-type chelated iron transport system, permease component | |
| 479 | 361 | hfeB | 1192 | Putative ABC-type chelated iron transport system, ATPase component | |
| 481 | 362 | hfeA | 1193 | Putative periplasmic chelated iron binding protein | |
| 736 | | hgpD | 1431 | Hemoglobin-haptoglobin binding protein D (Hemoglobin-haptoglobin utilization protein D) | CCAA repeated 17 times, out of frame |
| 782 | 661 | hgpB | 1472 | Hemoglobin-haptoglobin binding protein B (Hemoglobin-haptoglobin utilization protein B) | CCAA repeated 12 times, in frame |
| 840 | 712 | hgpC | 1523 | Hemoglobin-haptoglobin binding protein C (Hemoglobin-haptoglobin utilization protein C) | CCAA repeated 20 times, out of frame |
| 1021 | 853 | hbpA | 1684 | Heme-binding protein A (Hemin-binding lipoprotein) | |
| 1168 | 994 | tbp1 | 1817 | Transferrin-binding protein 1 | |
| 1169 | 995 | tbp2 | 1818 | Transferrin-binding protein 2 | |
| 1329 | 1160 | hemH | 1950 | Ferrochelatase | |
| 1390 | 1217 | hup | 2004 | Heme utilization protein | |
| 2035 | 1728 | | 2577 | Mn2+ and Fe2+ transporter of the NRAMP family | |










Oxidative Stress


Although necessary for growth, the active acquisition of iron can have deleterious effects on bacterial cells. Through the Fenton reaction, iron can react with hydrogen peroxide and generate highly reactive hydroxyl radicals. These products have profound effects, including lipid peroxidation and damage to both iron-containing enzymes and DNA (Imlay, Annu Rev Microbiol 57:395-418, 2003). The best-known defense system against hydroxyl radicals consists of superoxide dismutase A and B, which convert highly reactive superoxide to hydrogen peroxide, which is then converted by catalase into water and oxygen (Demple, Annu Rev Genet 25:315-37, 1991). Strains 86-028NP and Rd contain the sodA gene (NTHI1251), but lack the sodB gene. Both strains also possess a catalase gene, hktE (NTHI1099) (Bishai et al., J Bacteriol 176:2914-21, 1994), the oxyR gene (NTHI0704) encoding a primary regulator of genes involved in protection against oxidative stress (Maciver & Hansen, Infect Immun 64:4618-29, 1996; Pomposiello et al., Trends Biotechnol 19:109-14, 2001) and the gene encoding a chimeric peroxidase termed Prx/Grx that has a glutathione-dependent role in protection against small alkyl hydroperoxides (Pauwels et al., J Biol Chem 278:16658-66, 2003; Vergauwen et al., J Bacteriol 185:5555-62, 2003; Vergauwen et al., J Bacteriol 185:1572-81, 2003). We previously identified NTHI0212, a gene encoding a homologue of the P. multocida peroxiredoxin TsaA, that is absent in strain Rd (Munson et al., Infect Immun 72:3002-10, 2004). Strain 86-028NP, however, lacks AhpF, a dedicated alkyl hydroperoxide reductase known to be involved in the reduction of TsaA in Salmonella (Poole et al., Biochemistry 39:6602-15, 2000). Further protection against oxidative stress may be afforded by the ferritin-like proteins encoded by the ftnA and ftnB genes (NTHI1773 and NTHI1772, respectively). Over-expression of these proteins was shown to protect an iron-overloaded E. coli fur mutant against oxidative damage (Touati et al., J Bacteriol 177:2305-14, 1995). A conserved hypothetical gene, NTHI1817, encodes a protein with homology to a DNA-binding ferritin-like protein. This is a member of the Dps family of non-specific DNA binding proteins, which in S. enterica have roles in protection against oxidative stress, both in the presence of iron and during phagocytosis, and are important for virulence in a murine model of Salmonella infection (Halsey et al., Infect Immun 72:1155-8, 2004). In E. coli, Dps was shown to preferentially bind iron that had been oxidized by hydrogen peroxide, thus having an important role in abrogating the production of hydroxyl radicals generated via the Fenton reaction (Zhao et al., J Biol Chem 277:27689-96, 2002).
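For orientation, the reactions referred to in this passage can be written out explicitly. The equations below give the commonly cited textbook stoichiometry for superoxide dismutation, the catalase reaction and the Fenton reaction; they are supplied as general chemistry background and are not taken from the cited references.

```latex
\begin{align*}
2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+}
  &\xrightarrow{\text{superoxide dismutase}} \mathrm{H_2O_2} + \mathrm{O_2} \\
2\,\mathrm{H_2O_2}
  &\xrightarrow{\text{catalase}} 2\,\mathrm{H_2O} + \mathrm{O_2} \\
\mathrm{Fe^{2+}} + \mathrm{H_2O_2}
  &\longrightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
\end{align*}
```

Written this way, it is apparent why removing hydrogen peroxide (catalase, peroxidases) and sequestering iron (ferritin-like proteins, Dps) both limit hydroxyl radical formation: each withdraws one of the two reactants of the Fenton reaction.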


Secretion


In addition to the Sec system, strain 86-028NP has genes that encode the TatA, B and C proteins, cytoplasmic membrane-associated proteins that are involved in Sec-independent transport of proteins with twin arginines in their signal peptides (NTHI0279, NTHI0280 and NTHI0282) (Bolhuis et al., J Biol Chem 276:20213-9, 2001; Yen et al., Arch Microbiol 177:441-50, 2002). As previously reported, strain 86-028NP possesses NTHI0585, the gene encoding the autotransported protein Lav (Munson et al., Infect Immun 72:3002-10, 2004). This protein is absent in strain Rd, present in Neisseria and appears, within Haemophilus, to be restricted to pathogenic strains (Davis et al., J Bacteriol 183:4626-35, 2001). Strain 86-028NP also has the gene encoding an IgA protease (NTHI1164) (Poulsen et al., J Bacteriol 174:2913-21, 1992) and, as noted above, the gene encoding the Hap adhesin. Both are proteins of the autotransporter class. As described above, the HMW adhesins are members of the two-partner secretion pathway group of proteins.


Outer Membrane Proteins


A number of outer membrane protein (OMP) encoding genes have been identified by homology to those in other Haemophilus isolates. These include the major OMPs that were all originally identified in H. influenzae type b: the surface-expressed P1 (NTHI0522), the porin P2 (NTHI0225), the phosphomonoesterase and heme transporter P4 (NTHI0816), the adhesin P5 (NTHI1332) and the lipoprotein P6 (NTHI0501). Strain 86-028NP also shares a number of minor OMPs with other Haemophilus strains. These include D15 and the transferrin binding proteins from H. influenzae type b, as well as a homologue of OMP26, which was identified in NTHi strain 289 (Munson et al., Infect Immun 56:2235-42, 1988; Munson et al., Infect Immun 49:544-9, 1985; Munson et al., J Clin Invest 72:677-84, 1983; Reidl et al., J Exp Med 183:621-9; Reilly et al., J Bacteriol 181:6797-805, 1999; Reilly et al., FEBS Lett 494:19-23, 2001). All have subsequently been characterized in NTHi strains and analyzed as potential vaccine candidates (Poolman et al., Vaccine 19 Suppl 1:S108-15, 2000; Murphy et al., Curr Opin Infect Dis 16:129-34, 2003; McMichael et al., Curr Opin Investig Drugs 4:953-8, 2003; Cripps et al., Immunol Cell Biol 81:46-51, 2003; Bakaletz et al., Ann Otol Rhinol Laryngol Suppl 188:82-94, 2002).


Restriction Enzyme Systems


Strain 86-028NP lacks the HindII and HindIII type II restriction systems (Fleischmann et al., Science 269:496-512, 1995; Nwankwo et al., Gene 150:75-80, 1994; Smith & Marley, Methods Enzymol 65:104-8, 1980). In contrast, genes encoding the HaeII system that was originally identified in H. aegyptius (Slatko et al., Gene 74:45-50, 1988) are present in the strain 86-028NP genome but absent in strain Rd. Both strain 86-028NP and strain Rd have Hsd type restriction systems encoding a methyltransferase (HsdM), a sequence recognition protein (HsdS) and a restriction enzyme (HsdR) (Roberts et al., Nucleic Acids Res 31:1805-12, 2003). These genes are adjacent in the strain Rd genome (HI1285-HI1287). The 86-028NP genome contains three hsd-like loci, each of which contains four genes. One hsd system is encoded by NTHI1838-NTHI1843. In this gene cluster, NTHI1841 encodes a hypothetical protein. A second hsd-like locus is encoded by NTHI0314-NTHI0318. In this gene cluster, NTHI0316 encodes a putative anticodon nuclease. This hsd-like system may be similar to the prr system in E. coli (Tyndall et al., J Mol Biol 237:266-74, 1994). A third hsd-like locus is encoded by NTHI0188-NTHI0193. In this gene cluster, NTHI0190 encodes a predicted transcriptional regulator with a helix-turn-helix domain.

Claims
  • 1. A viral vector comprising an isolated polynucleotide comprising the nucleotide sequence of SEQ ID NO: 1498.
  • 2. A vector comprising an isolated polynucleotide comprising the nucleotide sequence of SEQ ID NO: 1498 wherein the vector is operatively linked to an expression control sequence.
  • 3. An isolated host cell comprising the vector of claim 1 or 2.
  • 4. A method for detecting NTHi bacteria in a biological sample comprising: (a) contacting an isolated polynucleotide comprising the nucleotide sequence set forth in SEQ ID NO: 1498 with a biological sample under stringent hybridization conditions, wherein the stringent hybridization conditions comprise washing with 0.015 M sodium chloride, 0.0015 M sodium citrate at 65-68° C., and (b) detecting hybridization of the polynucleotide within the sample, wherein hybridization indicates the presence of NTHi bacteria in the biological sample.
  • 5. The method of claim 4 wherein the biological sample is selected from the group consisting of serum, sputum, ear fluid, blood, urine, lymphatic fluid, and cerebrospinal fluid.
Parent Case Info

This application is a divisional of U.S. application Ser. No. 13/612,176 (now U.S. Pat. No. 8,652,773) filed Sep. 12, 2012, which is a divisional of U.S. application Ser. No. 11/917,368 (now U.S. Pat. No. 8,283,114) filed Jan. 20, 2011, which is a national phase filing under 35 U.S.C. §371 of International Application No. PCT/US2006/023428, which claims priority to U.S. Provisional Application No. 60/691,214, filed Jun. 16, 2005 and U.S. patent application Ser. No. 10/795,159 filed Mar. 5, 2004 (now U.S. Pat. No. 7,241,867), which claims priority to U.S. Provisional Application No. 60/453,134 filed Mar. 6, 2003, all of which are incorporated by reference herein in their entirety.

US Referenced Citations (1)
Number Name Date Kind
6797274 Ruelle Sep 2004 B1
Foreign Referenced Citations (4)
Number Date Country
WO-9902457 May 1999 WO
WO-0234768 May 2002 WO
WO-2004078949 Sep 2004 WO
WO-2005111066 Nov 2005 WO
Non-Patent Literature Citations (180)
Entry
Ackerley et al., A genome-scale analysis for identification of genes required for growth or survival of Haemophilus influenzae, Proc. Natl. Acad. Sci USA. 99: 966-71 (2002).
Altschul et al., Basic local alignment search tool, J. Mol. Biol. 215: 403-10 (1990).
Anderson et al., Human Serum Activities against Hemophilus influenzae, Type B, J. Clin. Invest. 51: 31-8 (1972).
Anderson et al., Quantitative Filter Hybridisation, Chapter 4: 73-111, Nucleic Acid Hybridisation: A Practical Approach, IRL Press Limited, Oxford, England (1987).
Anderson et al., The hFbpABC transporter from Haemophilus influenzae functions as a binding-protein-dependent ABC transporter with high specificity and affinity for ferric iron, J. Bacteriol. 186: 6220-9 (2004).
Andrews et al., Bacterial iron homeostasis, FEMS Microbiol. Rev. 27:215-37 (2003).
Badger et al., Identification of Escherichia coli K1 genes contributing to human brain microvascular endothelial cell invasion by differential fluorescence induction, Mol. Microbiol., 36: 174-82 (2000).
Bakaletz et al., Recent advances in otitis media. 7. Vaccine, Ann. Otol. Rhinol. Laryngol. Suppl. 188: 82-94 (2002).
Bakaletz et al., Demonstration of type IV pilus expression and a twitching phenotype by Haemophilus influenzae, Infect. Immun. 73:1635-4 (2005).
Bakaletz et al., Evidence for transudation of specific antibody into the middle ears of parenterally immunized chinchillas after an upper respiratory tract infection with adenovirus, Clin. Diagnostic Lab. Immunol. 4: 223-5 (1997).
Bakaletz et al., Frequency of fimbriation of nontypable Haemophilus influenzae and its ability to adhere to chinchilla and human respiratory epithelium, Infect. Immun., 53: 331-5 (1988).
Bakaletz et al., Modeling adenovirus type 1-induced otitis media in the chinchilla: effect on ciliary activity and fluid transport function of eustachian tube mucosal epithelium, J. Infect. Dis. 168: 865-72 (1993).
Bakaletz et al., Protection against development of otitis media induced by nontypeable Haemophilus influenzae by both active and passive immunization in a chinchilla model of virus-bacterium superinfection, Infect. Immun. 67: 2746-62 (1999).
Bakaletz et al., Relative immunogenicity and efficacy of two synthetic chimeric peptides of fimbrin as vaccinogens against nasopharyngeal colonization by nontypeable Haemophilus influenzae in the chinchilla, Vaccine. 15: 955-61 (1997).
Baldwin, Effects of otitis media on child development, Am. J. Otol. 14: 601-4 (1993).
Baltes et al., Actinobacillus pleuropneumoniae iron transport and urease activity: Effects on bacterial virulence and host immune response, Infect. Immun. 69: 472-478 (2001).
Barenkamp et al., Cloning, expression, and DNA sequence analysis of genes encoding nontypeable Haemophilus influenzae high-molecular-weight surface-exposed proteins related to filamentous hemagglutinin of Bordetella pertussis, Infect. Immun. 60:1302-13 (1992).
Barenkamp et al., Genes encoding high-molecular-weight adhesion proteins of nontypeable Haemophilus influenzae are part of gene clusters, Infect. Immun. 62: 3320-8 (1994).
Barenkamp et al., Identification of a second family of high-molecular-weight adhesion proteins expressed by non-typable Haemophilus influenzae, Mol. Microbiol. 19:1215-23 (1996).
Barenkamp et al., Outer membrane protein and biotype analysis of pathogenic nontypable Haemophilus influenzae. Infect. Immun. 36: 535-40 (1982).
Bartilson et al., Differential fluorescence induction reveals Streptococcus pneumoniae loci regulated by competence stimulatory peptide, Mol. Microbiol. 39: 126-35 (2001).
Bayliss et al., The simple sequence contingency loci of Haemophilus influenzae and Neisseria meningitidis, J. Clin. Invest. 107:657-62 (2001).
Bearden et al., An ABC transporter system of Yersinia pestis allows utilization of chelated iron by Escherichia coli SAB11. J. Bacteriol. 180:1135-47 (1998).
Bergman et al., Position-based scanning for comparative genomics and identification of genetic islands in Haemophilus influenzae type b, Infect. Immun. 71:1098-108 (2003).
Berman et al., Theoretical cost effectiveness of management options for children with persisting middle ear effusions, Pediatrics. 93:353-63 (1994).
Bishai et al., A peroxide/ascorbate-inducible catalase from Haemophilus influenzae is homologous to the Escherichia coli katE gene product, J Bacteriol 176: 2914-21 (1994).
Black et al., Efficacy, safety and immunogenicity of heptavalent pneumococcal conjugate vaccine in children, Pediatr. Infect. Dis. J. 19: 187-195 (2000).
Bolhuis et al., TatB and TatC form a functional and structural unit of the twin-arginine translocase from Escherichia coli, J. Biol. Chem. 276: 20213-9 (2001).
Bosse et al., Urease activity may contribute to the ability of Actinobacillus pleuropneumoniae to establish infection, Can. J. Vet. Res. 64: 145-50 (2001).
Bright et al., The Prevalence of Tympanostomy Tubes in Children in the United States, 1988, Am. J. Public Health. 83: 1026-8 (1993).
Buscher et al., Evolutionary and functional relationships among the nontypeable Haemophilus influenzae HMW family of adhesins. J. Bacteriol. 186: 4209-17 (2004).
Cardillo et al., Synthesis of the phenylserine-leucine dipeptide fragment present in the antibiotic lysobactin from an aziridine-2-imide precursor. Eur. J. Org. Chem. 2000: 2489-94 (2000).
Chang et al., Identification and genetic characterization of Haemophilus influenzae genetic island 1. Infect. Immun. 68:2630-7 (2000).
Chiang et al., In vivo Genetic Analysis of Bacterial Virulence. Annu. Rev. Microbiol. 53: 129-54 (1999).
Chissoe et al., Strategies for Rapid and Accurate DNA Sequencing, Methods. 3: 55-65 (1991).
Cimons, Lurid Reports Obscure Reality of Strep A Outbreak, ASM News. 60: 527-8 (1994).
Collet et al., Oxidative protein folding in bacteria, Mol. Microbiol. 44:1-8 (2002).
Cope et al., A gene cluster involved in the utilization of both free heme and heme:hemopexin by Haemophilus influenzae type b, J Bacteriol. 177:2644-53 (1995).
Cope et al., Binding of heme-hemopexin complexes by soluble HxuA protein allows utilization of this complexed heme by Haemophilus influenzae, Infect. Immun. 66: 4511-6 (1998).
Cope et al., Involvement of HxuC outer membrane protein in utilization of hemoglobin by Haemophilus influenzae, Infect. Immun. 69: 2353-63 (2001).
Cormack et al., FACS-optimized mutants of the green fluorescent protein (GFP), Gene. 173: 33-8 (1996).
Cripps et al., Bacterial otitis media: current vaccine development strategies, Immunol. Cell. Biol. 81: 46-51 (2003).
Daines et al., Haemophilus influenzae Rd KW20 has virulence properties. J. Med. Microbiol. 52:277-82 (2003).
Davis et al., Evolution of an autotransporter: domain shuffling and lateral transfer from pathogenic Haemophilus to Neisseria, J. Bacteriol. 183: 4626-35 (2001).
Davis et al., Liposomes as adjuvants with immunopurified tetanus toxoid: the immune response, Immunol. Lett. 14: 341-8 (1987).
DeMaria et al., Biotypes of serologically nontypable Haemophilus influenzae isolated from the middle ears and nasopharynges of patients with otitis media with effusion, J. Clin. Microbiol., 20: 1102-4 (1984).
DeMaria et al., Immunization with outer membrane protein P6 from nontypeable Haemophilus influenzae induces bactericidal antibody and affords protection in the chinchilla model of otitis media. Infect. Immun. 64: 5187-92 (1996).
Devereux et al., A comprehensive set of sequence analysis programs for the VAX, Nucleic Acids Res. 12: 387-95 (1984).
Dhandayuthapani et al., Green fluorescent protein as a marker for gene expression and cell biology of mycobacterial interactions with macrophages, Mol. Microbiol. 17: 901-12 (1995).
Dimopoulou et al., Site-specific recombination with the chromosomal tRNA(Leu) gene by the large conjugative Haemophilus resistance plasmid, Antimicrob Agents Chemother 46:1602-3 (2002).
Dunn et al., A vector for promoter trapping in Bacillus cereus. Gene. 226: 297-305 (1999).
EBI Accession No. P44079, Full uncharacterized protein HI0938, Nov. 1, 1995.
Ehrlich et al., Mucosal biofilm formation on middle-ear mucosa in the chinchilla model of otitis media, JAMA. 287: 1710-5 (2002).
Erdos et al., Development and characterization of a pooled Haemophilus influenzae genomic library for the evaluation of gene expression changes associated with mucosal biofilm formation in otitis media. Int. J. Pediatr. Otorhinolaryngol. 67:749-55 (2003).
Erickson et al. Chapter 3: Solid-Phase Peptide Synthesis. The Proteins. v.2, Academic Press, New York: 255-527 (1976).
Eskola et al., Efficacy of a Pneumococcal Conjugate Vaccine Against Acute Otitis Media, N. Engl. J. Med. 344: 403-9 (2001).
Eskola et al., Potential of bacterial vaccines in the prevention of acute otitis media, Pediatr. Infect. Dis. J. 19: 72-8 (2000).
Eskra et al., Brucella abortus Genes Identified following Constitutive Growth and Macrophage Infection, Infect. Immun. 69: 7736-42 (2001).
Evans et al., Haemin and nicotinamide adenine dinucleotide requirements of Haemophilus influenzae and Haemophilus parainfluenzae, J. Med. Microbiol. 7: 359-65 (1974).
Fleischmann et al., Haemophilis influenzae Rd KW20 section 45 of 163 of the complete genome. EMBL H132730, Database accession No. U37230, L42023 (1995).
Fleischmann et al., Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496-512, 1995).
Fleischmann et al., Histidine biosynthesis bifunctional protein hisB. UNIPROT: HIS7_HAEIN, Database accession No. P44327 (1995).
Forchhammer et al., Identification of a novel translation factor necessary for the incorporation of selenocysteine into protein, Nature. 342: 453-6 (1989).
Genbank Accession No. CP000057, Haemophilus influenzae 86-028NP, complete genome. Aug. 30, 2007.
Genbank Accession U05670, Haemophilus influenzae Lex2A and Lex2B genes, complete cds. Oct. 12, 2005.
Giebink, Immunology: promise of new vaccines, Pediatr. Infect. Dis. J. 13: 1064-8 (1994).
Goosen et al., The regulation of transcription initiation by integration host factor, Mol. Microbiol. 16:1-7 (1995).
Grass et al., The Haemophilus influenzae HMW1 adhesin is glycosylated in a process that requires HMW1C and phosphoglucomutase, an enzyme involved in lipooligosaccharide biosynthesis, Mol. Microbiol. 48: 737-51 (2003).
Gray-Owen et al., Characterization of transferrin binding proteins 1 and 2 in invasive type b and nontypeable strains of Haemophilus influenzae. Infect. Immun. 63: 3809-15 (1995).
Gray-Owen et al., Identification and characterization of genes encoding the human transferrin-binding proteins from Haemophilus influenzae, Infect. Immun. 63: 1201-10 (1995).
Green et al., Certain site-directed, nonenzymatically active mutants of the Haemophilus influenzae P4 lipoprotein are able to elicit bactericidal antibodies. Infect. Immun. 73: 4454-57 (2002).
Greiner et al., Nontypeable Haemophilus influenzae strain 2019 produces a biofilm containing N-acetylneuraminic acid that may mimic sialylated O-linked glycans, Infect. Immun. 72: 4249-60 (2004).
Griffin et al., The role of lex2 in lipopolysaccharide biosynthesis in Haemophilus influenzae strains RM7004 and RM153, Microbiology. 149: 3165-75 (2003).
Gritsun et al., Analysis of flavivirus envelope proteins reveals variable domains that reflect their antigenicity and may determine their pathogenesis. Virus Res. 35: 307-321 (1995).
Halsey et al., The ferritin-like Dps protein is required for Salmonella enterica serovar Typhimurium oxidative stress resistance and virulence, Infect. Immun. 72: 1155-8 (2004).
Harrison et al., Genomic sequence of an otitis media isolate of nontypeable Haemophilus influenzae: comparative study with H. influenzae serotype d, strain KW20. J. Bacteriology. 187(13):4627-36 (2005).
Heath et al., Non-type b Haemophilus influenzae disease: clinical and epidemiologic characteristics in the Haemophilus influenzae type b vaccine era. Pediatr. Infect. Dis. J. 20: 300-5 (2001).
Henriksen et al., Negative regulation of STAT92E by an n-terminally truncated STAT protein derived from an alternative promoter site. Genes Dev. 16: 2379-89 (2002).
High et al., The role of a repetitive DNA motif (5′-CAAT-3′) in the variable expression of the Haemophilus influenzae lipopolysaccharide epitope alpha Gal(1-4)beta Gal, Mol. Microbiol. 9: 1275-82 (1993).
Holmes et al., Adherence of non-typeable Haemophilus influenzae promotes reorganization of the actin cytoskeleton in human or chinchilla epithelial cells in vitro, Microb. Pathog. 23: 157-66 (1997).
Hood et al., Biosynthesis of cryptic lipopolysaccharide glycoforms in Haemophilus influenzae involves a mechanism similar to that required for O-antigen synthesis, J. Bacteriol. 186: 7429-39 (2004).
Hood et al., DNA repeats identify novel virulence genes in Haemophilus influenzae, Proc. Natl. Acad. Sci. USA. 93: 11121-5 (1996).
Hood et al., Genetic basis for expression of the major globotetraose-containing lipopolysaccharide from H. influenzae strain Rd (RM118), Glycobiology. 11: 957-67 (2001).
Hood et al., Identification of a lipopolysaccharide alpha-2,3-sialyltransferase from Haemophilus influenzae, Mol. Microbiol. 39: 341-50 (2001).
Hood et al., Three genes, lgtF, lic2C and lpsA, have a primary role in determining the pattern of oligosaccharide extension from the inner core of Haemophilus influenzae LPS. Microbiology. 150: 2089-97 (2004).
Imlay, Pathways of oxidative damage. Annu. Rev. Microbiol. 57: 395-418 (2003).
Infante-Rivand et al., Otitis Media in Children: Frequency, Risk Factors, and Research Avenues, Epidemiol. Rev. 15: 444-65 (1993).
International Search Report, European Patent Office, PCT/US2004/007001 dated Feb. 17, 2005.
International Search Report, European Patent Office, PCT/US2006/023428 dated Nov. 10, 2006.
Jansen et al., Immunotoxins: Hybrid molecules combining high specificity and potent cytotoxicity, Immun. Rev. 62:185-216 (1982).
Jarosik et al., Identification of a new locus involved in expression of Haemophilus influenzae type b lipooligosaccharide, Infect. Immun. 62: 4861-7 (1994).
Jiang et al., Fimbria-mediated enhanced attachment of nontypeable Haemophilus influenzae to respiratory syncytial virus-infected respiratory epithelial cells, Infect. Immun. 67: 187-92 (1999).
Jin et al., Characterization of hgpA, a gene encoding a haemoglobin/haemoglobin-haptoglobin-binding protein of Haemophilus influenzae, Microbiology. 145 (Pt 4): 905-14, (1999).
Jones et al., Haemophilus influenzae type b strain A2 has multiple sialyltransferases involved in lipooligosaccharide sialylation, J. Biol. Chem. 277: 14598-611 (2002).
Kaplan et al., Overall cost in the treatment of otitis media, Pediatr. Infect. Dis. J. 16: S9-11 (1997).
Karma et al., Immunological aspects of otitis media: present views on possibilities of immunoprophylaxis of acute otitis media in infants and children, Int. J. Pediatr. Otorhinolaryngol. 32(Suppl.):S127-S134 (1995).
Kennedy et al., Passive transfer of antiserum specific for immunogens derived from a nontypeable Haemophilus influenzae adhesin and lipoprotein D prevents otitis media after heterologous challenge, Infect. Immun. 68: 2756-65 (2000).
Kilpi et al., Bacteriology of acute otitis media in a cohort of Finnish children followed for the first two years of life, Pediatr. Infect. Dis. J. 20: 654-62 (2001).
Klein, Role of nontypeable Haemophilus influenzae in pediatric respiratory tract infections. Pediatr. Infect. Dis. J. 16: S5-8 (1997).
Kramp et al., Liposomal enhancement of the immunogenicity of adenovirus type 5 hexon and fiber vaccines, Infect. Immun. 25: 771-3 (1979).
Lee et al., Constitutive and inducible green fluorescent protein expression in Bartonella henselae, Infect. Immun. 66: 3964-7 (1998).
Li et al., Identification and characterization of genomic loci unique to the Brazilian purpuric fever clonal group of H. influenzae biogroup aegyptius: functionality explored using meningococcal homology, Mol. Microbiol. 47:1101-11 (2003).
Lowe et al., tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence, Nucleic Acids Res. 25: 955-64 (1997).
Lysenko et al., Bacterial Phosphorylcholine decreases susceptibility to the antimicrobial peptide LL-37/hCAP18 expressed in the upper respiratory tract, Infect. Immun. 68:1664-71 (2000).
Maciver et al., Lack of expression of the global regulator OxyR in Haemophilus influenzae has a profound effect on growth phenotype. Infect. Immun. 64: 4618-29 (1996).
Margolis et al., Identification of hearing loss in children with otitis media, Ann. Otol. Rhinol. Laryngol. 103: 59-61 (1994).
Marra et al., Differential fluorescence induction analysis of Streptococcus pneumoniae identified genes involved in pathogenesis, Infect. Immun. 70:1422-33 (2002).
Marra et al., In vivo characterization of the psa genes from Streptococcus pneumoniae in multiple models of infection, Microbiology. 148: 1483-91 (2002).
Matteucci et al., Synthesis of deoxyoligonucleotides on a polymer support, J. Am. Chem. Soc. 103(11): 3185-3191 (1981).
Mhlanga-Mutangadura et al., Evolution of the major pilus gene cluster of Haemophilus influenzae, J. Bacteriol. 180: 4693-703 (1998).
Mitchell et al., Electroporation of Haemophilus influenzae is effective for transformation of plasmid but not chromosomal DNA, Nucleic Acids Res. 19: 3625-8 (1991).
Miyamoto et al., Selective adherence of non-typeable Haemophilus influenzae (NTHi) to mucus or epithelial cells in the chinchilla Eustachian tube and middle ear, Microbial Pathogenesis. 21: 343-56 (1996).
Mohd-Zain et al., Transferable antibiotic resistance elements in Haemophilus influenzae share a common evolutionary origin with a diverse family of syntenic genomic islands. J. Bacteriol. 186: 8114-22 (2004).
Morgan et al., Bacteriophage Mu genome sequence: analysis and comparison with Mu-like prophages in Haemophilus, Neisseria and Deinococcus, J. Mol. Biol. 317: 337-59 (2002).
Morton et al., Effect of multiple mutations in the hemoglobin- and hemoglobin-haptoglobin-binding proteins, HgpA, HgpB, and HgpC, of Haemophilus influenzae type b. Infect. Immun. 67: 2729-39 (1999).
Morton et al., Identification of a haem-utilization protein (Hup) in Haemophilus influenzae, Microbiology. 150: 3923-33 (2004).
Munson et al., Partial analysis of the genomes of two nontypeable Haemophilus influenzae otitis media isolates, Infect. Immun. 72: 3002-10 (2004).
Munson et al., Purification and comparison of outer membrane protein P2 from Haemophilus influenzae type b isolates, J. Clin. Invest. 72: 677-84 (1983).
Munson et al., Purification and partial characterization of outer membrane proteins P5 and P6 from Haemophilus influenzae type b. Infect. Immun. 49: 544-9 (1985).
Munson et al., Purification, cloning, and sequence of outer membrane protein P1 of Haemophilus influenzae type b, Infect. Immun. 56: 2235-42 (1988).
Musher et al., Opsonizing and bactericidal effects of normal human serum on nontypable Haemophilus influenzae, Infect. Immun. 39: 297-304 (1983).
Musser et al., Genetic relationships of serologically nontypable and serotype B strains of Haemophilus influenzae, Infect. Immun. 52:183-91 (1986).
Nichols et al., Identification of the ADP-L-glycerol-D-manno-heptose-6-epimerase (rfaD) and heptosyltransferase II (rfaF) biosynthesis genes from nontypeable Haemophilus influenzae 2019, Infect. Immun. 65:1377-86 (1997).
Novotny et al., Detection and characterization of pediatric serum antibody to the OMP P5-homologous adhesin of nontypeable Haemophilus influenzae during acute otitis media. Vaccine. 20: 3590-7 (2002).
Novotny et al., Epitope mapping of the outer membrane protein P5-homologous fimbrin adhesin of nontypeable Haemophilus influenzae, Infect. Immun. 68: 2119-28 (2000).
Novotny et al., The fourth surface-exposed region of the outer membrane protein P5-homologous adhesin of nontypable Haemophilus influenzae is an immunodominant but nonprotective decoying epitope, J. Immunol. 171: 1978-83 (2003).
Padmalayam et al., Molecular cloning, sequencing, expression and characterization of an immunogenic 43-kilodalton lipoprotein of Bartonella bacilliformis that has homology to NlpD/LppB, Infect. Immun. 68: 4972-9 (2000).
Papp, Management of otitis media with effusion in young children, Ann. Pharmacother. 30: 1291-7 (1996).
Partial European Search Report, EP 08 02 2435, European Patent Office dated Apr. 23, 2009.
Pauwels et al., Purification and characterization of a chimeric enzyme from Haemophilus influenzae Rd that exhibits glutathione-dependent peroxidase activity, J. Biol. Chem. 278:16658-66 (2003).
Pomposiello et al., Redox-operated genetic switches: the SoxR and OxyR transcription factors, Trends Biotechnol. 19: 109-14 (2001).
Poole et al., AhpF can be dissected into two functional units: tandem repeats of two thioredoxin-like folds in the N-terminus mediate electron transfer from the thioredoxin reductase-like C-terminus to AhpC, Biochemistry. 39: 6602-15 (2000).
Poolman et al., Developing a nontypeable Haemophilus influenzae (NTHi) vaccine, Vaccine. 19: S109-15 (2001).
Poulsen et al., A comparative genetic study of serologically distinct Haemophilus influenzae type 1 immunoglobulin A1 proteases, J. Bacteriol. 174: 2913-21 (1992).
Ray et al., Tricross: using dot-plots in sequence-id space to detect uncataloged intergenic features, Bioinformatics. 17: 1105-12 (2001).
Reidl et al., Lipoprotein e(P4) is essential for hemin uptake by Haemophilus influenzae, J. Exp. Med. 183: 621-9 (1996).
Reilly et al., Contribution of the DDDD motif of H. influenzae e (P4) to phosphomonoesterase activity and heme transport, FEBS. Lett. 494:19-23 (2001).
Reilly et al., Outer membrane lipoprotein e (P4) of Haemophilus influenzae is a novel phosphomonoesterase, J. Bacteriol. 181: 6797-805 (1999).
Ren et al., hgpB, a gene encoding a second Haemophilus influenzae hemoglobin- and hemoglobin-haptoglobin-binding protein, Infect. Immun. 66: 4733-41 (1998).
Richer et al., Horizontal gene transfer of “prototype” Nramp in bacteria, J. Mol. Evol. 57: 363-7 (2003).
Roberts et al., A nomenclature for restriction enzymes, DNA methyltransferases, homing endonucleases and their genes, Nucleic Acids Res. 31: 1805-12 (2003).
Rock et al., The licC gene of Streptococcus pneumoniae Encodes a CTP: Phosphocholine Cytidylyltransferase, J. Bacteriol. 183: 4927-31 (2001).
Rodriguez et al., Prevalence and distribution of adhesins in invasive non-type b encapsulated Haemophilus influenzae, Infect. Immun. 71: 1635-42 (2003).
Roman et al., Dynamics of long-term colonization of respiratory tract by Haemophilus influenzae in cystic fibrosis patients shows a marked increase in hypermutable strains, J. Clin. Microbiol. 42:1450-9 (2004).
Ruffolo et al., Identification, purification, and characterization of the type 4 fimbriae of Pasteurella multocida. Infect. Immun. 65: 339-43 (1997).
Sawitzke et al., Suppression of chromosome segregation defects of Escherichia coli muk mutants by mutations in topoisomerase I, Proc. Natl. Acad. Sci. USA. 97: 1671-6 (2000).
Schlor et al., Characterization of ferrochelatase (hemH) mutations in Haemophilus influenzae, Infect. Immun. 68: 3007-9 (2000).
Schneider et al., Virulence gene identification by differential fluorescence induction Analysis of Staphylococcus aureus gene expression during infection-simulating culture, Infect. Immun. 70: 1326-33 (2002).
Sethi et al., Bacterial infection in chronic obstructive pulmonary disease in 2000: a state-of-the-art review, Clin. Microbiol. Rev. 14: 336-63 (2001).
Shen et al., Molecular determinants of disease and resistance in interactions of Xanthomonas oryzae pv. oryzae and rice. Microbes. Infect. 4: 1361-7 (2002).
Sirakova et al., Role of fimbriae expressed by nontypeable Haemophilus influenzae in pathogenesis of and protection against otitis media and relatedness of the fimbrin subunit to outer membrane protein A, Infect. Immun. 62: 2002-20 (1994).
Skoczynska et al., Prevalence and serotype distribution of encapsulated Haemophilus influenzae isolates from patients with lower respiratory tract infections in Poland, J. Clin. Microbiol. 43: 938-41 (2005).
Smoot et al., Fur and iron transport proteins in the Brazilian purpuric fever clone of Haemophilus influenzae biogroup aegyptius, J. Med. Microbiol. 48: 629-3 (1999).
Smoot et al., Genomic analysis of the F3031 Brazilian purpuric fever clone of Haemophilus influenzae biogroup aegyptius by PCR-based subtractive hybridization, Infect. Immun. 70: 2694-9 (2002).
Snow, Progress in the prevention of otitis media through immunization, Otology Neurol. 23: 1-2 (2002).
Spinola et al., Epidemiology of colonization by nontypable Haemophilus influenzae in children: A longitudinal study, J. Infect. Dis. 154: 100-9 (1986).
St Geme et al., A Haemophilus influenzae IgA protease-like protein promotes intimate interaction with human epithelial cells, Mol. Microbiol. 14: 217-3 (1994).
St Geme et al., Secretion of the Haemophilus influenzae HMW1 and HMW2 adhesins involves a periplasmic intermediate and requires the HMWB and HMWC proteins, Mol. Microbiol. 27: 617-30 (1998).
St Geme, The pathogenesis of nontypable Haemophilus influenzae otitis media. Vaccine. 19: Suppl 1:S41-50 (2000).
Suzuki et al., Synergistic effect of adenovirus type 1 and nontypeable Haemophilus influenzae in a chinchilla model of experimental otitis media, Infect. Immun. 62: 1710-8 (1994).
Teele et al., Otitis media in infancy and intellectual ability, school achievement, speech, and language at age 7 years, J. Infect. Dis. 162: 685-94 (1990).
Tomb, A periplasmic protein disulfide oxidoreductase is required for transformation of Haemophilus influenzae Rd, Proc. Natl. Acad. Sci. USA. 89: 10252-6 (1992).
Tong et al., Evaluation of phase variation of nontypeable Haemophilus influenzae lipooligosaccharide during nasopharyngeal colonization and development of otitis media in the chinchilla model, Infect. Immun. 68: 4593-4597 (2000).
Touati et al., Lethal oxidative damage and mutagenesis are generated by iron in delta fur mutants of Escherichia coli: protective role of superoxide dismutase, J. Bacteriol. 177: 2305-14 (1995).
Valdivia et al., Bacterial genetics by flow cytometry: rapid isolation of Salmonella typhimurium acid-inducible promoters by differential fluorescence induction, Mol. Microbiol. 22: 367-78 (1996).
Valdivia et al., Flow cytometry and bacterial pathogenesis, Curr. Opin. Microbiol. 1: 359-63 (1998).
Valdivia et al., Fluorescence-Based Isolation of Bacterial Genes Expressed Within Host Cells, Science. 277: 2007-11 (1997).
Van Ulsen et al., Genes of non-typeable Haemophilus influenzae expressed during interaction with human epithelial cell lines, Mol. Microbiol. 45: 485-500 (2002).
Vergauwen et al., Exogenous glutathione completes the defense against oxidative stress in Haemophilus influenzae, J. Bacteriol. 185: 1572-81 (2003).
Vergauwen et al., Glutathione and catalase provide overlapping defenses for protection against respiration-generated hydrogen peroxide in Haemophilus influenzae, J. Bacteriol. 185: 5555-62 (2003).
Weiser et al., Decoration of lipopolysaccharide with phosphorylcholine: a phase-variable characteristic of Haemophilus influenzae, Infect. Immun. 65: 943-50 (1997).
Weiser et al., Phosphorylcholine on the lipopolysaccharide of Haemophilus influenzae contributes to persistence in the respiratory tract and sensitivity to serum killing mediated by C-reactive protein, J. Exp. Med. 187: 631-40 (1998).
White et al., Hemin biosynthesis in Haemophilus. J. Bacteriol. 85: 842-50 (1963).
Wilson et al., Identification of Listeria monocytogenes In Vivo-Induced Genes by Fluorescence-Activated Cell Sorting, Infect. Immun. 69: 5016-24 (2001).
Yamanaka et al., CspD, a novel DNA replication inhibitor induced during the stationary phase in Escherichia coli, Mol. Microbiol. 39: 1572-84 (2001).
Young et al., A bifunctional urease enhances survival of pathogenic Yersinia enterocolitica and Morganella morganii at low pH, J. Bacteriol. 178: 6487-95 (1996).
Zhang et al., Structure of Thermotoga maritima stationary phase survival protein SurE: a novel acid phosphatase, Structure. 9:1095-106 (2001).
Zhao et al., Iron and hydrogen peroxide detoxification properties of DNA-binding protein from starved cells. A ferritin-like DNA-binding protein of Escherichia coli, J. Biol. Chem. 277: 27689-96 (2002).
Related Publications (1)
Number Date Country
20140127221 A1 May 2014 US
Provisional Applications (1)
Number Date Country
60691214 Jun 2005 US
Divisions (2)
Number Date Country
Parent 13612176 Sep 2012 US
Child 14151227 US
Parent 11917368 US
Child 13612176 US