Xylose isomerases and their uses

Information

  • Patent Grant
  • 9982249
  • Patent Number
    9,982,249
  • Date Filed
    Tuesday, July 23, 2013
  • Date Issued
    Tuesday, May 29, 2018
Abstract
This disclosure relates to novel xylose isomerases and their uses, particularly in fermentation processes that employ xylose-containing media.
Description
1. BACKGROUND

The efficient, commercial production of biofuels from plant material, such as sugarcane, requires the fermentation of pentoses, such as xylose. Xylose in plant material typically comes from lignocellulose, which is a matrix composed of cellulose, hemicelluloses, and lignin. Lignocellulose is broken down either by acid hydrolysis or enzymatic reaction, yielding xylose in addition to other monosaccharides, such as glucose (Maki et al., 2009, Int. J. Biol. Sci. 5:500-516).


Fungi, especially Saccharomyces cerevisiae, are commercially relevant microorganisms that ferment sugars into biofuels such as ethanol. However, S. cerevisiae does not endogenously metabolize xylose, requiring genetic modifications that allow it to convert xylose into xylulose. Other organisms, whose usefulness in ethanol production is limited, are able to metabolize xylose (Nevoigt, 2008, Microbiol. Mol. Biol. Rev. 72:379-412).


Two pathways have been identified for the metabolism of xylose to xylulose in microorganisms: the xylose reductase (XR, EC 1.1.1.307)/xylitol dehydrogenase (XDH, EC 1.1.1.9, 1.1.1.10 and 1.1.1.B19) pathway and the xylose isomerase (XI, EC 5.3.1.5) pathway. Use of the XR/XDH pathway for xylose metabolism creates an imbalance of cofactors (excess NADH and NADP+), limiting the potential output of this pathway for the production of ethanol. The XI pathway, on the other hand, converts xylose to xylulose in a single step and does not create a cofactor imbalance (Young et al., 2010, Biotechnol. Biofuels 3:24-36).


Because S. cerevisiae does not possess a native XI, it has been desirable to search for an XI in another organism to insert into S. cerevisiae for the purpose of biofuels production. Several XI genes have been discovered, although little or no enzymatic activity upon expression in S. cerevisiae has been a common problem. The XI from Piromyces sp. E2 was the first heterologously expressed XI in S. cerevisiae whose enzymatic activity could be observed (WO 03/062430).


2. SUMMARY

Due to the physiology of S. cerevisiae and the process of commercial biofuel production, there are other characteristics besides activity that are valuable in a commercially useful XI. During fermentation, the pH of the yeast cell and its environment can become more acidic (Rosa and Sa-Correia, 1991, Appl. Environ. Microbiol. 57:830-835). The ability of the XI to function in an acidic environment is therefore highly desirable. Accordingly, there is still a need in the art for XI enzymes with enhanced activity to convert xylose to xylulose for biofuels production under a broader range of commercially relevant conditions.


The present disclosure relates to novel xylose isomerases. The xylose isomerases have desirable characteristics for xylose fermentation, such as high activity, tolerance to acidic conditions (i.e., pH levels below 7, e.g., pH 6.5 or pH 6), or both.


The present disclosure has multiple aspects. In one aspect, the disclosure is directed to XI polypeptides. The polypeptides of the disclosure typically comprise amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to any of the XI polypeptides of Table 1, or the catalytic domain or dimerization domain thereof, or are encoded by nucleic acid sequences comprising nucleotide sequences having at least 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to any of the nucleic acids of Table 1:














TABLE 1

SEQ ID NO: | Clone No. | Organism Classification | Type of Sequence | Catalytic Domain | Dimerization Domain
1 | 1754MI2_001 | Bacteroidales | DNA | |
2 | 1754MI2_001 | Bacteroidales | Amino Acid | 2-376 | 377-437
3 | 5586MI6_004 | Bacteroidales | DNA | |
4 | 5586MI6_004 | Bacteroidales | Amino Acid | 2-376 | 377-437
5 | 5749MI1_003 | Bacteroidales | DNA | |
6 | 5749MI1_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
7 | 5750MI1_003 | Bacteroidales | DNA | |
8 | 5750MI1_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
9 | 5750MI2_003 | Bacteroidales | DNA | |
10 | 5750MI2_003 | Bacteroidales | Amino Acid | 2-381 | 382-442
11 | 5586MI5_004 | Bacteroides | DNA | |
12 | 5586MI5_004 | Bacteroides | Amino Acid | 2-375 | 376-435
13 | 5586MI202_004 | Bacteroides | DNA | |
14 | 5586MI202_004 | Bacteroides | Amino Acid | 2-377 | 378-438
15 | 5586MI211_003 | Bacteroides | DNA | |
16 | 5586MI211_003 | Bacteroides | Amino Acid | 2-376 | 377-437
17 | 5606MI1_005 | Bacteroides | DNA | |
18 | 5606MI1_005 | Bacteroides | Amino Acid | 2-377 | 378-438
19 | 5606MI2_003 | Bacteroides | DNA | |
20 | 5606MI2_003 | Bacteroides | Amino Acid | 2-378 | 379-439
21 | 5610MI3_003 | Bacteroides | DNA | |
22 | 5610MI3_003 | Bacteroides | Amino Acid | 2-377 | 378-439
23 | 5749MI2_004 | Bacteroides | DNA | |
24 | 5749MI2_004 | Bacteroides | Amino Acid | 2-377 | 378-438
25 | 5750MI3_003 | Bacteroides | DNA | |
26 | 5750MI3_003 | Bacteroides | Amino Acid | 2-377 | 378-438
27 | 5750MI4_003 | Bacteroides | DNA | |
28 | 5750MI4_003 | Bacteroides | Amino Acid | 2-377 | 378-438
29 | 5751MI4_002 | Bacteroides | DNA | |
30 | 5751MI4_002 | Bacteroides | Amino Acid | 2-376 | 377-437
31 | 5751MI5_003 | Bacteroides | DNA | |
32 | 5751MI5_003 | Bacteroides | Amino Acid | 2-377 | 378-438
33 | 5751MI6_004 | Bacteroides | DNA | |
34 | 5751MI6_004 | Bacteroides | Amino Acid | 2-377 | 378-438
35 | 5586MI22_003 | Clostridiales | DNA | |
36 | 5586MI22_003 | Clostridiales | Amino Acid | 2-375 | 376-439
37 | 1753MI4_001 | Firmicutes | DNA | |
38 | 1753MI4_001 | Firmicutes | Amino Acid | 2-374 | 375-440
39 | 1753MI6_001 | Firmicutes | DNA | |
40 | 1753MI6_001 | Firmicutes | Amino Acid | 2-374 | 375-440
41 | 1753MI35_004 | Firmicutes | DNA | |
42 | 1753MI35_004 | Firmicutes | Amino Acid | 2-375 | 376-441
43 | 1754MI9_004 | Firmicutes | DNA | |
44 | 1754MI9_004 | Firmicutes | Amino Acid | 2-375 | 376-440
45 | 1754MI22_004 | Firmicutes | DNA | |
46 | 1754MI22_004 | Firmicutes | Amino Acid | 2-375 | 376-440
47 | 727MI1_002 | Firmicutes | DNA | |
48 | 727MI1_002 | Firmicutes | Amino Acid | 2-372 | 373-436
49 | 727MI9_005 | Firmicutes | DNA | |
50 | 727MI9_005 | Firmicutes | Amino Acid | 2-374 | 375-438
51 | 727MI27_002 | Firmicutes | DNA | |
52 | 727MI27_002 | Firmicutes | Amino Acid | 2-374 | 375-439
53 | 1753MI2_006 | Neocallimastigales | DNA | |
54 | 1753MI2_006 | Neocallimastigales | Amino Acid | 2-376 | 377-437
55 | 5586MI3_005 | Neocallimastigales | DNA | |
56 | 5586MI3_005 | Neocallimastigales | Amino Acid | 2-376 | 377-437
57 | 5586MI91_002 | Neocallimastigales | DNA | |
58 | 5586MI91_002 | Neocallimastigales | Amino Acid | 2-376 | 377-437
59 | 5586MI194_003 | Neocallimastigales | DNA | |
60 | 5586MI194_003 | Neocallimastigales | Amino Acid | 2-376 | 377-438
61 | 5586MI198_003 | Neocallimastigales | DNA | |
62 | 5586MI198_003 | Neocallimastigales | Amino Acid | 2-375 | 376-437
63 | 5586MI201_003 | Neocallimastigales | DNA | |
64 | 5586MI201_003 | Neocallimastigales | Amino Acid | 2-376 | 377-438
65 | 5586MI204_002 | Neocallimastigales | DNA | |
66 | 5586MI204_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
67 | 5586MI207_002 | Neocallimastigales | DNA | |
68 | 5586MI207_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
69 | 5586MI209_003 | Neocallimastigales | DNA | |
70 | 5586MI209_003 | Neocallimastigales | Amino Acid | 2-375 | 376-437
71 | 5586MI214_002 | Neocallimastigales | DNA | |
72 | 5586MI214_002 | Neocallimastigales | Amino Acid | 2-375 | 376-437
73 | 5751MI3_001 | Neocallimastigales | DNA | |
74 | 5751MI3_001 | Neocallimastigales | Amino Acid | 2-375 | 376-437
75 | 5753MI3_002 | Prevotella | DNA | |
76 | 5753MI3_002 | Prevotella | Amino Acid | 2-376 | 377-439
77 | 1754MI1_001 | Prevotella | DNA | |
78 | 1754MI1_001 | Prevotella | Amino Acid | 2-377 | 378-439
79 | 1754MI3_007 | Prevotella | DNA | |
80 | 1754MI3_007 | Prevotella | Amino Acid | 2-377 | 378-439
81 | 1754MI5_009 | Prevotella | DNA | |
82 | 1754MI5_009 | Prevotella | Amino Acid | 2-375 | 376-437
83 | 5586MI1_003 | Prevotella | DNA | |
84 | 5586MI1_003 | Prevotella | Amino Acid | 2-377 | 378-439
85 | 5586MI2_006 | Prevotella | DNA | |
86 | 5586MI2_006 | Prevotella | Amino Acid | 2-377 | 378-439
87 | 5586MI8_003 | Prevotella | DNA | |
88 | 5586MI8_003 | Prevotella | Amino Acid | 2-377 | 378-439
89 | 5586MI14_003 | Prevotella | DNA | |
90 | 5586MI14_003 | Prevotella | Amino Acid | 2-377 | 378-439
91 | 5586MI26_003 | Prevotella | DNA | |
92 | 5586MI26_003 | Prevotella | Amino Acid | 2-377 | 378-439
93 | 5586MI86_001 | Prevotella | DNA | |
94 | 5586MI86_001 | Prevotella | Amino Acid | 2-376 | 377-438
95 | 5586MI108_002 | Prevotella | DNA | |
96 | 5586MI108_002 | Prevotella | Amino Acid | 2-377 | 378-439
97 | 5586MI182_004 | Prevotella | DNA | |
98 | 5586MI182_004 | Prevotella | Amino Acid | 2-377 | 378-439
99 | 5586MI193_004 | Prevotella | DNA | |
100 | 5586MI193_004 | Prevotella | Amino Acid | 2-376 | 377-438
101 | 5586MI195_003 | Prevotella | DNA | |
102 | 5586MI195_003 | Prevotella | Amino Acid | 2-376 | 377-438
103 | 5586MI216_003 | Prevotella | DNA | |
104 | 5586MI216_003 | Prevotella | Amino Acid | 2-376 | 377-438
105 | 5586MI197_003 | Prevotella | DNA | |
106 | 5586MI197_003 | Prevotella | Amino Acid | 2-376 | 377-438
107 | 5586MI199_003 | Prevotella | DNA | |
108 | 5586MI199_003 | Prevotella | Amino Acid | 2-376 | 377-438
109 | 5586MI200_003 | Prevotella | DNA | |
110 | 5586MI200_003 | Prevotella | Amino Acid | 2-376 | 377-438
111 | 5586MI203_003 | Prevotella | DNA | |
112 | 5586MI203_003 | Prevotella | Amino Acid | 2-376 | 377-438
113 | 5586MI205_004 | Prevotella | DNA | |
114 | 5586MI205_004 | Prevotella | Amino Acid | 2-376 | 377-438
115 | 5586MI206_004 | Prevotella | DNA | |
116 | 5586MI206_004 | Prevotella | Amino Acid | 2-376 | 377-438
117 | 5586MI208_003 | Prevotella | DNA | |
118 | 5586MI208_003 | Prevotella | Amino Acid | 2-376 | 377-438
119 | 5586MI210_002 | Prevotella | DNA | |
120 | 5586MI210_002 | Prevotella | Amino Acid | 2-374 | 375-437
121 | 5586MI212_002 | Prevotella | DNA | |
122 | 5586MI212_002 | Prevotella | Amino Acid | 2-376 | 377-438
123 | 5586MI213_003 | Prevotella | DNA | |
124 | 5586MI213_003 | Prevotella | Amino Acid | 2-376 | 377-438
125 | 5586MI215_003 | Prevotella | DNA | |
126 | 5586MI215_003 | Prevotella | Amino Acid | 2-376 | 377-438
127 | 5607MI1_003 | Prevotella | DNA | |
128 | 5607MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
129 | 5607MI2_003 | Prevotella | DNA | |
130 | 5607MI2_003 | Prevotella | Amino Acid | 2-376 | 377-442
131 | 5607MI3_003 | Prevotella | DNA | |
132 | 5607MI3_003 | Prevotella | Amino Acid | 2-376 | 377-438
133 | 5607MI4_005 | Prevotella | DNA | |
134 | 5607MI4_005 | Prevotella | Amino Acid | 2-376 | 377-438
135 | 5607MI5_002 | Prevotella | DNA | |
136 | 5607MI5_002 | Prevotella | Amino Acid | 2-376 | 377-439
137 | 5607MI6_002 | Prevotella | DNA | |
138 | 5607MI6_002 | Prevotella | Amino Acid | 2-376 | 377-438
139 | 5607MI7_002 | Prevotella | DNA | |
140 | 5607MI7_002 | Prevotella | Amino Acid | 2-376 | 377-438
141 | 5608MI1_004 | Prevotella | DNA | |
142 | 5608MI1_004 | Prevotella | Amino Acid | 2-376 | 377-438
143 | 5608MI2_002 | Prevotella | DNA | |
144 | 5608MI2_002 | Prevotella | Amino Acid | 2-375 | 376-437
145 | 5608MI3_004 | Prevotella | DNA | |
146 | 5608MI3_004 | Prevotella | Amino Acid | 2-376 | 377-438
147 | 5609MI1_005 | Prevotella | DNA | |
148 | 5609MI1_005 | Prevotella | Amino Acid | 2-376 | 377-438
149 | 5610MI1_003 | Prevotella | DNA | |
150 | 5610MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
151 | 5610MI2_004 | Prevotella | DNA | |
152 | 5610MI2_004 | Prevotella | Amino Acid | 2-376 | 377-438
153 | 5751MI1_003 | Prevotella | DNA | |
154 | 5751MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
155 | 5751MI2_003 | Prevotella | DNA | |
156 | 5751MI2_003 | Prevotella | Amino Acid | 2-376 | 377-438
157 | 5752MI1_003 | Prevotella | DNA | |
158 | 5752MI1_003 | Prevotella | Amino Acid | 2-376 | 377-438
159 | 5752MI2_003 | Prevotella | DNA | |
160 | 5752MI2_003 | Prevotella | Amino Acid | 2-376 | 377-438
161 | 5752MI3_002 | Prevotella | DNA | |
162 | 5752MI3_002 | Prevotella | Amino Acid | 2-376 | 377-438
163 | 5752MI5_003 | Prevotella | DNA | |
164 | 5752MI5_003 | Prevotella | Amino Acid | 2-376 | 377-438
165 | 5752MI6_004 | Prevotella | DNA | |
166 | 5752MI6_004 | Prevotella | Amino Acid | 2-376 | 377-438
167 | 5753MI1_002 | Prevotella | DNA | |
168 | 5753MI1_002 | Prevotella | Amino Acid | 2-376 | 377-438
169 | 5753MI2_002 | Prevotella | DNA | |
170 | 5753MI2_002 | Prevotella | Amino Acid | 2-376 | 377-438
171 | 5753MI4_002 | Prevotella | DNA | |
172 | 5753MI4_002 | Prevotella | Amino Acid | 2-376 | 377-438
173 | 5752MI4_004 | Prevotella | DNA | |
174 | 5752MI4_004 | Prevotella | Amino Acid | 2-376 | 377-438
175 | 727MI4_006 | Rhizobiales | DNA | |
176 | 727MI4_006 | Rhizobiales | Amino Acid | 2-373 | 374-435









In specific embodiments, a polypeptide of the disclosure comprises an amino acid sequence having:

    • (1) (a) at least 97% or 98% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (2) (a) at least 95%, 97% or 98% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (3) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:38 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:38), and optionally further comprises one, two, three, four or all five of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:208; (iii) SEQ ID NO:209; (iv) SEQ ID NO:210; and (v) SEQ ID NO:211;
    • (4) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:2 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:2);
    • (5) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:58 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:58);
    • (6) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:42 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:42), and optionally further comprises one, two or all three of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:210; and (iii) SEQ ID NO:211;
    • (7) (a) at least 97% or 98% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84), and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (8) (a) at least 97% or 98% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (9) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:54 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:54);
    • (10) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:46 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:46), and optionally further comprises SEQ ID NO:206 or SEQ ID NO:207;
    • (11) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:16 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:16);
    • (12) at least 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:82 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:82); and/or
    • (13) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:32 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:32).


The XIs of the disclosure can be characterized in terms of their activity. In some embodiments, a XI of the disclosure has at least 1.3 times the activity of the Orpinomyces sp. XI assigned Genbank Accession No. 169733248 (“Op-XI”) at pH 7.5, for example using the assay described in any of Examples 4, 6 and 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.25 to 3.0 times, from 1.5 to 3 times, from 1.5 to 2.25 times, or from 1.75 to 3 times the activity of Op-XI at pH 7.5.


The XIs of the disclosure can also be characterized in terms of their tolerance to acidic environments (e.g., at a pH of 6.5 or 6). In some embodiments, a XI of the disclosure has at least 1.9 times the activity of the Op-XI at pH 6, for example using the assay described in Example 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.9 to 4.1 times, from 2.4 to 4.1 times, from 2.4 to 3.9 times, or 2.4 to 4.1 times the activity of Op-XI at pH 6.


Tolerance to acidic environments can also be characterized as a ratio of activity at pH 6 to activity at pH 7.5 (“a pH 6 to pH 7.5 activity ratio”), for example as measured using the assay of Example 7. In some embodiments, the pH 6 to pH 7.5 activity ratio is at least 0.5 or at least 0.6. In various embodiments, the pH 6 to pH 7.5 activity ratio is 0.5-0.9 or 0.6-0.9.
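The two ways of characterizing an XI described above (activity relative to Op-XI, and the pH 6 to pH 7.5 activity ratio) come down to simple quotients. The following is a minimal sketch in Python, assuming all four activities were measured in the same units under the same assay. The class and function names are hypothetical and are not part of the disclosure; the threshold values (1.3, 1.9 and 0.5) are the "at least" figures quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityReadings:
    """Hypothetical container: XI and reference (e.g. Op-XI) activities, same units."""
    xi_ph6: float
    xi_ph75: float
    ref_ph6: float
    ref_ph75: float


def fold_over_reference(sample: float, reference: float) -> float:
    """Activity of the sample expressed as a multiple of the reference activity."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return sample / reference


def ph_activity_ratio(readings: ActivityReadings) -> float:
    """Ratio of activity at pH 6 to activity at pH 7.5 for the same enzyme."""
    if readings.xi_ph75 <= 0:
        raise ValueError("activity at pH 7.5 must be positive")
    return readings.xi_ph6 / readings.xi_ph75


def meets_summary_thresholds(readings: ActivityReadings) -> dict:
    """Check the 'at least' thresholds quoted in the surrounding text."""
    return {
        "fold_ph75_at_least_1.3": fold_over_reference(readings.xi_ph75, readings.ref_ph75) >= 1.3,
        "fold_ph6_at_least_1.9": fold_over_reference(readings.xi_ph6, readings.ref_ph6) >= 1.9,
        "ratio_at_least_0.5": ph_activity_ratio(readings) >= 0.5,
    }
```

Any numbers passed to these functions would come from an assay such as that of Example 7; none are supplied here.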


In another aspect, the disclosure is directed to a nucleic acid which encodes a XI polypeptide of the disclosure. In various embodiments, the nucleic acid comprises a nucleotide sequence with at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 93%, 95%, 96%, 98%, 99% or 100% sequence identity to the nucleotide sequence of any one of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, and 175, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.


The nucleic acids of the disclosure can be codon optimized, e.g., for expression in eukaryotic organisms such as yeast or filamentous fungi. Exemplary codon optimized open reading frames for expression in S. cerevisiae are SEQ ID NO:238 (encoding a XI of SEQ ID NO:54), SEQ ID NO:239 (encoding a XI of SEQ ID NO:58), SEQ ID NO:244 (encoding a XI of SEQ ID NO:78), SEQ ID NO:245 (encoding a XI of SEQ ID NO:96), SEQ ID NO:246 (encoding a XI of SEQ ID NO:38), SEQ ID NO:247 (encoding a XI of SEQ ID NO:78), SEQ ID NO:248 (encoding a XI of SEQ ID NO:96), and SEQ ID NO:249 (encoding a XI of SEQ ID NO:38). In various embodiments, the disclosure provides nucleic acids comprising nucleotide sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of any one of SEQ ID NOs:238, 239, 244, 245, 246, 247, 248 and 249, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.
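As a rough illustration of what codon optimization involves, the sketch below back-translates a protein sequence using one preferred codon per amino acid and then checks the result by translating it with the standard genetic code. This is an assumption-laden simplification: the disclosure does not say how SEQ ID NOs:238-249 were designed, the PREFERRED_CODON table is an illustrative choice rather than data from the disclosure, and practical optimization typically also weighs factors such as GC content, repeats and restriction sites. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative one-codon-per-residue table (an assumption, not from the disclosure).
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TAA",
}


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return an open reading frame using one preferred codon per residue."""
    orf = "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    return orf + PREFERRED_CODON["*"] if add_stop else orf


def translate(dna: str) -> str:
    """Translate a DNA open reading frame with the standard genetic code."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

Round-tripping a sequence through back_translate and translate is a quick sanity check that the optimized reading frame still encodes the intended polypeptide.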


In other aspects, the disclosure is directed to a vector comprising a XI-encoding nucleotide sequence, for example a vector having an origin of replication and/or a promoter sequence operably linked to the XI-encoding nucleotide sequence. The promoter sequence can be one that is operable in a eukaryotic cell, for example in a fungal cell. In some embodiments, the promoter is operable in yeast (e.g., S. cerevisiae) or filamentous fungi.


In yet another aspect, the disclosure is directed to a recombinant cell comprising a nucleic acid that encodes a XI polypeptide. Particularly, the cell is engineered to express any of the XI polypeptides described herein. The recombinant cell may be of any species, and is preferably a eukaryotic cell, for example a yeast cell. Suitable genera of yeast include Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Issatchenkia and Yarrowia. In specific embodiments, the recombinant cell is a S. cerevisiae, S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, K. lactis, I. orientalis, K. marxianus or K. fragilis. Suitable genera of filamentous fungi include Aspergillus, Penicillium, Rhizopus, Chrysosporium, Myceliophthora, Trichoderma, Humicola, Acremonium and Fusarium. In specific embodiments, the recombinant cell is an Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Penicillium chrysogenum, Myceliophthora thermophila, or Rhizopus oryzae.


The recombinant cell may also be mutagenized or engineered to include modifications other than the recombinant expression of XI, particularly those that make the cell more suited to utilize xylose in a fermentation pathway. Exemplary additional modifications create one, two, three, four, five or even more of the following phenotypes: (a) increase in xylose transport into the cell; (b) increase in aerobic growth rate on xylose; (c) increase in xylulose kinase activity; (d) increase in flux through the pentose phosphate pathway into glycolysis; (e) decrease in aldose reductase activity; (f) decrease in sensitivity to catabolite repression; (g) increase in tolerance to biofuels, e.g., ethanol; (h) increase in tolerance to intermediate production (e.g., xylitol); (i) increase in temperature tolerance; (j) osmolarity of organic acids; and (k) a reduced production of byproducts.


Increases in activity can be achieved by increasing expression levels, for example by increasing the expression of a hexose or pentose (e.g., xylose) transporter, a xylulose kinase, a glycolytic enzyme, or an ethanologenic enzyme. The increased expression levels are achieved by overexpressing an endogenous protein or by expressing a heterologous protein.


Other modifications to the recombinant cell that are part of the disclosure are modifications that decrease the activity of genes or pathways in the recombinant cell. Preferably, the expression levels of one, two, three or more of the genes for hexose kinase, MIG-1, MIG-2, XR, aldose reductase, and XDH are reduced. Reducing gene activity can be achieved by a targeted deletion or disruption of the gene (and optionally reintroducing the gene under the control of a different promoter that drives lower levels of expression or inducible expression).


In yet other aspects, the disclosure is directed to methods of producing fermentation products, for example one or more of ethanol, butanol, diesel, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, itaconic acid, an amino acid, 1,3-propane-diol, ethylene, glycerol, a β-lactam antibiotic and a cephalosporin. Typically, a cell that recombinantly expresses a XI of the disclosure is cultured in a xylose-containing medium, for example a medium supplemented with a lignocellulosic hydrolysate. The media may also contain glucose, arabinose, or other sugars, particularly those derived from lignocellulose. The media may be of any pH, particularly a pH between 3.0 and 9.0, preferably between 4.0 and 8.0, more preferably between 5.0 and 8.0, even more preferably between 6.0 and 7.5. The cells may be cultured under anaerobic or aerobic conditions, preferably under anaerobic conditions for production of the compounds mentioned above and under aerobic conditions for biomass/cellular production. Optionally, the methods further comprise recovering the fermentation product produced by the recombinant cell.





3. BRIEF DESCRIPTION OF THE FIGURES


FIGS. 1A-1B are maps for the vector pMEV-ΔxylA (MEV3 xylA del) and PCR-BluntII-TOPO-xylA, respectively, used in the activity-based screen for XIs.



FIG. 2 illustrates the experimental strategy for the two-step marker exchange approach.



FIG. 3 is a map of the vector p426PGK1 for expressing XI in yeast strain, Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108).



FIG. 4 shows the growth rates on xylose containing media of selected clones expressed in yeast strain, Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108).



FIGS. 5A-5D are maps for the vectors pYDAB-006, pYDURA01, pYDPt-005 and pYDAB-0006, respectively, all used in creating strains of industrial S. cerevisiae strain yBPA130 with a single genomic copy of select XI clones.



FIG. 6 is a map of vector YDAB008-rDNA for multiple XI integration into S. cerevisiae strain yBPB007 and yBPB008.



FIGS. 7A-7D show monosaccharide (including xylose) utilization and ethanol production by strains of industrial S. cerevisiae with multiple copies of XI clones integrated into ribosomal DNA loci.



FIG. 8: Production of ethanol from glycolytic and pentose phosphate (“PPP”) pathways. Not all steps are shown. For example, glyceraldehyde-3-phosphate is converted to pyruvate via a series of glycolytic steps: (1) glyceraldehyde-3-phosphate to 3-phospho-D-glyceroyl-phosphate catalyzed by glyceraldehyde-3-phosphate dehydrogenase (TDH1-3); (2) 3-phospho-D-glyceroyl-phosphate to 3-phosphoglycerate catalyzed by 3-phosphoglycerate kinase (PGK1); (3) 3-phosphoglycerate to 2-phosphoglycerate catalyzed by phosphoglycerate mutase (GPM1); (4) 2-phosphoglycerate to phosphoenolpyruvate catalyzed by enolase (ENO1; ENO2); and (5) phosphoenolpyruvate to pyruvate catalyzed by pyruvate kinase (PYK2; CDC19). Other abbreviations: DHAP=dihydroxy-acetone-phosphate; GPD=glycerol-3-phosphate dehydrogenase; RHR2/HOR2=DL-glycerol-3-phosphatase; XI=xylose isomerase; GRE=xylose reductase/aldose reductase; XYL=xylitol dehydrogenase; XKS=xylulokinase; PDC=pyruvate decarboxylase; ADH=alcohol dehydrogenase; ALD=aldehyde dehydrogenase; HXK=hexokinase; PGI=phosphoglucose isomerase; PFK=phosphofructokinase; FBA=aldolase; TPI=triosephosphate isomerase; ZWF=glucose-6-phosphate dehydrogenase; SOL=6-phosphogluconolactonase; GND=6-phosphogluconate dehydrogenase; RPE=D-ribulose-5-phosphate 3-epimerase; RKI=ribose-5-phosphate ketol-isomerase; TKL=transketolase; TAL=transaldolase. Heavy dashed arrows indicate reactions and corresponding enzymes that can be reduced or eliminated to increase xylose utilization, particularly in the production of ethanol, and heavy solid arrows indicate reactions and corresponding enzymes that can be increased to increase xylose utilization, particularly in the production of ethanol. The enzymes shown in FIG. 8 are encoded by S. cerevisiae genes. The S. cerevisiae genes are used for exemplification purposes. Analogous enzymes and modifications in other organisms are within the scope of the present disclosure.





4. DETAILED DESCRIPTION
4.1 Xylose Isomerase Polypeptides

A “xylose isomerase” or “XI” is an enzyme that catalyzes the direct isomerisation of D-xylose into D-xylulose and/or vice versa. This class of enzymes is also known as D-xylose ketoisomerases. A xylose isomerase herein may also be capable of catalyzing the conversion between D-glucose and D-fructose (and accordingly may therefore be referred to as a glucose isomerase).


A “XI polypeptide of the disclosure” or a “XI of the disclosure” is a xylose isomerase having an amino acid sequence that is related to any one of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176. In some embodiments, the xylose isomerase of the disclosure has an amino acid sequence with at least about 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity thereto, or to a catalytic or dimerization domain thereof. The xylose isomerase of the disclosure can also have 100% sequence identity to one of the foregoing sequences.


The disclosure provides isolated, synthetic or recombinant XI polypeptides comprising an amino acid sequence having at least about 80%, e.g., at least about 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or complete (100%) sequence identity to a polypeptide of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176, over a region of at least about 10, e.g., at least about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, or 350 residues, or over the full length of the polypeptide, over the length of catalytic domain, or over the length of the dimerization domain.
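Percent identity "over a region" or over a domain can be pictured as follows. This is a minimal sketch in Python, assuming the two sequences have already been aligned (gaps written as "-") and assuming identity is counted as identical columns divided by aligned columns; other conventions for the denominator exist, and the disclosure does not fix one here. The function names are hypothetical. Domain coordinates are taken to be 1-based and inclusive, as in Table 1 (e.g., amino acids 2-377).

```python
def slice_domain(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq, using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("domain coordinates fall outside the sequence")
    return seq[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Columns that are gaps in both sequences are skipped; a column with a gap
    in only one sequence counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

To compare only catalytic domains, each sequence would first be trimmed with slice_domain and the trimmed sequences then aligned before calling percent_identity.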


The XI polypeptides of the disclosure can be encoded by a nucleic acid sequence having at least about 80%, about 85%, about 86%, about 87%, about 88%, about 89%, or about 90% sequence identity to SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or by a nucleic acid sequence capable of hybridizing under high stringency conditions to a complement of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or to a fragment thereof. Exemplary nucleic acids of the disclosure are described in Section 4.2 below.


In specific embodiments, a polypeptide of the disclosure comprises an amino acid sequence having:

    • (1) (a) at least 97% or 98% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:78 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:78) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (2) (a) at least 95%, 97% or 98% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:96 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:96) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (3) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:38 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:38), and optionally further comprises one, two, three, four or all five of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:208; (iii) SEQ ID NO:209; (iv) SEQ ID NO:210; and (v) SEQ ID NO:211;
    • (4) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:2 or the catalytic domain thereof (amino acids 2-374 of SEQ ID NO:2);
    • (5) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:58 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:58);
    • (6) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:42 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:42), and optionally further comprises one, two or all three of (i) SEQ ID NO:206 or SEQ ID NO:207; (ii) SEQ ID NO:210; and (iii) SEQ ID NO:211;
    • (7) (a) at least 97% or 98% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84), and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:84 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:84) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (8) (a) at least 97% or 98% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and/or (b) at least 80%, 85%, 90%, 93% or 95% sequence identity to SEQ ID NO:80 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:80) and further comprises (i) SEQ ID NO:212 or SEQ ID NO:213 and/or (ii) SEQ ID NO:214;
    • (9) at least 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:54 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:54);
    • (10) at least 80%, 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:46 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:46), and optionally further comprises SEQ ID NO:206 or SEQ ID NO:207;
    • (11) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:16 or the catalytic domain thereof (amino acids 2-376 of SEQ ID NO:16);
    • (12) at least 85%, 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:82 or the catalytic domain thereof (amino acids 2-375 of SEQ ID NO:82); and/or
    • (13) at least 90%, 93%, 95%, 97% or 98% sequence identity to SEQ ID NO:32 or the catalytic domain thereof (amino acids 2-377 of SEQ ID NO:32).


An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990, J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. These initial neighborhood word hits act as starting points to find longer HSPs containing them. The word hits are expanded in both directions along each of the two sequences being compared for as far as the cumulative alignment score can be increased. Extension of the word hits is stopped when: the cumulative alignment score falls off by the quantity X from a maximum achieved value; the cumulative score goes to zero or below; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1992, Proc. Nat'l. Acad. Sci. USA 89:10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
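The seed-and-extend procedure described above can be illustrated with a short sketch. This is not the NCBI implementation: it assumes exact word matches only (no threshold score T or neighborhood words), ungapped extension in one direction, and a simple match/mismatch scoring scheme using the reward and penalty values quoted above. The function names `find_word_hits` and `extend_right` and the default `x_drop` value are hypothetical and chosen only for illustration.

```python
def find_word_hits(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    # Index every word of length w in the subject sequence.
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    # Look up each query word in the index.
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def extend_right(query, subject, q_start, s_start, match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension from a seed position.

    Stops when the running score falls x_drop below the best score seen,
    when the running score reaches zero or below, or when either sequence ends.
    Returns (best_score, length_of_best_extension).
    """
    score = 0
    best_score = 0
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        elif best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_len
```

A full implementation would also extend leftward from each hit and would score protein alignments with a substitution matrix such as BLOSUM62 rather than a fixed match/mismatch pair.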


Any of the amino acid sequences described herein can be produced together or in conjunction with at least 1, e.g., at least (or up to) 2, 3, 5, 10, or 20 heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence, and/or deletions of at least 1, e.g., at least (or up to) 2, 3, 5, 10, or 20 amino acids from the C- and/or N-terminal ends of a XI of the disclosure.


The XIs of the disclosure can be characterized in terms of their activity. In some embodiments, a XI of the disclosure has at least 1.3 times the activity of the Orpinomyces sp. XI assigned Genbank Accession No. 169733248 (“Op-XI”) at pH 7.5, for example using the assay described in any of Examples 4, 6 and 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.25 to 3.0 times, from 1.5 to 3 times, from 1.5 to 2.25 times, or from 1.75 to 3 times the activity of Op-XI at pH 7.5.


The XIs of the disclosure can also be characterized in terms of their tolerance to acidic environments (e.g., at a pH of 6.5 or 6). In some embodiments, a XI of the disclosure has at least 1.9 times the activity of the Op-XI at pH 6, for example using the assay described in Example 7. In certain specific embodiments, a XI of the disclosure has an activity ranging from 1.9 to 4.1 times, from 2.4 to 4.1 times, from 2.4 to 3.9 times, or 2.4 to 4.1 times the activity of Op-XI at pH 6.


Tolerance to acidic environments can also be characterized as a ratio of activity at pH 6 to activity at pH 7.5 (“a pH 6 to pH 7.5 activity ratio”), for example as measured using the assay of Example 7. In some embodiments, the pH 6 to pH 7.5 activity ratio is at least 0.5 or at least 0.6. In various embodiments, the pH 6 to pH 7.5 activity ratio is 0.5-0.9 or 0.6-0.9.
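The two activity measures used above (activity relative to Op-XI, and the pH 6 to pH 7.5 activity ratio) are simple quotients. The following sketch shows the arithmetic under the assumption that activities have already been measured in consistent units; the function names and the numeric values in the usage lines are hypothetical and are not data from the Examples.

```python
def relative_activity(candidate_activity, reference_activity):
    """Activity of a candidate XI expressed as a multiple of a reference (e.g., Op-XI)."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return candidate_activity / reference_activity


def ph_activity_ratio(activity_ph6, activity_ph75):
    """Ratio of activity at pH 6 to activity at pH 7.5 for the same enzyme."""
    if activity_ph75 <= 0:
        raise ValueError("activity at pH 7.5 must be positive")
    return activity_ph6 / activity_ph75


def meets_minimum(value, minimum):
    """True if a ratio or fold-activity is at least the stated minimum."""
    return value >= minimum


# Illustrative (made-up) measurements in arbitrary units.
ratio = ph_activity_ratio(activity_ph6=3.0, activity_ph75=5.0)
fold = relative_activity(candidate_activity=6.0, reference_activity=4.0)
```

With these made-up inputs the ratio would satisfy an "at least 0.5" criterion and the fold-activity an "at least 1.3 times" criterion.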


The xylose isomerases of the disclosure can have one or more (e.g., up to 2, 3, 5, 10, or 20) conservative amino acid substitutions relative to the polypeptide of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176 or to the portion thereof discussed above. The conservative substitutions can be chosen from among a group having a similar side chain to the reference amino acid. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulphur-containing side chains is cysteine and methionine. Accordingly, exemplary conservative substitutions for each of the naturally occurring amino acids are as follows: ala to ser; arg to lys; asn to gln or his; asp to glu; cys to ser or ala; gln to asn; glu to asp; gly to pro; his to asn or gln; ile to leu or val; leu to ile or val; lys to arg, gln or glu; met to leu or ile; phe to met, leu or tyr; ser to thr; thr to ser; trp to tyr; tyr to trp or phe; and, val to ile or leu.
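The side-chain groups listed above lend themselves to a simple lookup. The sketch below encodes only those six groups, using one-letter amino acid codes, and treats a substitution as conservative when both residues fall in the same group. It deliberately does not encode the separate list of exemplary pairs (so, for example, asp to glu is not recognized here). The names `SIDE_CHAIN_GROUPS`, `is_conservative` and `count_substitutions` are hypothetical, and the sequences are assumed to be pre-aligned and of equal length.

```python
# Side-chain groups as listed in the text, in one-letter code.
SIDE_CHAIN_GROUPS = {
    "aliphatic": {"G", "A", "V", "L", "I"},
    "aliphatic-hydroxyl": {"S", "T"},
    "amide-containing": {"N", "Q"},
    "aromatic": {"F", "Y", "W"},
    "basic": {"K", "R", "H"},
    "sulphur-containing": {"C", "M"},
}


def is_conservative(ref, alt):
    """True if ref and alt differ but share one of the side-chain groups above."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return False
    return any(ref in group and alt in group for group in SIDE_CHAIN_GROUPS.values())


def count_substitutions(reference, variant):
    """Count (conservative, other) substitutions between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    conservative = 0
    other = 0
    for r, v in zip(reference, variant):
        if r == v:
            continue
        if is_conservative(r, v):
            conservative += 1
        else:
            other += 1
    return conservative, other
```

For example, comparing the short made-up peptides `MKLS` and `MRLT` would count two conservative substitutions (lys to arg, ser to thr) and no others.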


The present disclosure also provides a fusion protein that includes at least a portion (e.g., a fragment or domain) of a XI polypeptide of the disclosure attached to one or more fusion segments, which are typically heterologous to the XI polypeptide. Suitable fusion segments include, without limitation, segments that can provide other desirable biological activity or facilitate purification of the XI polypeptide (e.g., by affinity chromatography). Fusion segments can be joined to the amino or carboxy terminus of a XI polypeptide. The fusion segments can be susceptible to cleavage.


4.2 Xylose Isomerase Nucleic Acids


A “XI nucleic acid of the disclosure” is a nucleic acid encoding a xylose isomerase of the disclosure. In certain embodiments, the xylose isomerase nucleic acid of the disclosure is encoded by a nucleotide sequence of any one of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, or a sequence having at least about 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity thereto. The xylose isomerase nucleic acid of the disclosure can also have 100% sequence identity to one of the foregoing sequences.


The present disclosure provides nucleic acids encoding a polypeptide of the disclosure, for example one described in Section 4.1 above. The disclosure provides isolated, synthetic or recombinant nucleic acids comprising a nucleic acid sequence having at least about 70%, e.g., at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, or complete (100%) sequence identity to a nucleic acid of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, or 175, over a region of at least about 10, e.g., at least about 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 1950, or 2000 nucleotides.


Nucleic acids of the disclosure also include isolated, synthetic or recombinant nucleic acids encoding a XI polypeptide having the sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, or 176, and subsequences thereof (e.g., a conserved domain or a catalytic domain), and variants thereof.


To increase the likelihood that a XI polypeptide is recombinantly expressed, a XI nucleic acid may be adapted to optimize its codon usage to that of the chosen cell. Several methods for codon optimization are known in the art. For expression in yeast, an exemplary method to optimize codon usage of the nucleotide sequences to that of the yeast is a codon pair optimization technology as disclosed in WO 2006/077258 and/or WO 2008/000632. WO 2008/000632 addresses codon-pair optimization. Codon-pair optimization is a method wherein the nucleotide sequences encoding a polypeptide are modified with respect to their codon-usage, in particular the codon-pairs that are used, to obtain improved expression of the nucleotide sequence encoding the polypeptide and/or improved production of the encoded polypeptide. Codon pairs are defined as a set of two subsequent triplets (codons) in a coding sequence. Boles codon optimization (see Table 2 of Wiedemann and Boles, 2008, Appl. Environ. Microbiol. 74:2043-2050) can also be used to optimize expression and activity of XIs in yeast. Alternatively, the XI sequence can be optimized using commercially available software, such as Gene Designer (DNA2.0). Preferably, codon optimized sequences avoid nucleotide repeats and restriction sites that are utilized in cloning the XI nucleic acids, by adjusting the settings in commercial software or by manually altering the sequences to substitute codons that introduce undesired sequences, for example with highly utilized codons in the organism of interest. Exemplary codon optimized open reading frames for expression in S. cerevisiae are SEQ ID NO:238 (encoding a XI of SEQ ID NO:54), SEQ ID NO:239 (encoding a XI of SEQ ID NO:58), SEQ ID NO:244 (encoding a XI of SEQ ID NO:78), SEQ ID NO:245 (encoding a XI of SEQ ID NO:96), SEQ ID NO:246 (encoding a XI of SEQ ID NO:38), SEQ ID NO:247 (encoding a XI of SEQ ID NO:78), SEQ ID NO:248 (encoding a XI of SEQ ID NO:96), and SEQ ID NO:249 (encoding a XI of SEQ ID NO:38). In various embodiments, the disclosure provides nucleic acids comprising nucleotide sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of any one of SEQ ID NOs:238, 239, 244, 245, 246, 247, 248 and 249, or the portion of any of the foregoing sequences encoding a XI catalytic domain or dimerization domain.
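Two of the notions used above, codon pairs (two subsequent triplets in a coding sequence) and screening a candidate sequence for restriction sites used in cloning, can be sketched as follows. This is only an illustration and not the method of the cited publications or of any commercial software. The function names are hypothetical, and the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences in the usage lines are arbitrary examples rather than sites named in this disclosure.

```python
def codon_pairs(cds):
    """Split an in-frame coding sequence into its consecutive codon pairs."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Pair each codon with the one that follows it.
    return list(zip(codons, codons[1:]))


def find_sites(cds, sites):
    """Return {site: [0-based start positions]} for each site present in cds."""
    found = {}
    for site in sites:
        positions = []
        start = cds.find(site)
        while start != -1:
            positions.append(start)
            start = cds.find(site, start + 1)
        if positions:
            found[site] = positions
    return found


# Made-up toy sequences, not taken from the sequence listing.
pairs = codon_pairs("ATGGCTTAA")
unwanted = find_sites("ATGGAATTCTAA", ["GAATTC", "GGATCC"])
```

A sequence flagged by `find_sites` could then be revised by substituting a synonymous codon at the reported position, as the paragraph above describes.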


4.3 Host Cells and Recombinant Expression


The disclosure also provides host cells transformed with a XI nucleic acid and recombinant host cells engineered to express XI polypeptides. The XI nucleic acid construct may be extrachromosomal, on a plasmid, which can be a low copy plasmid or a high copy plasmid. The nucleic acid construct may be maintained episomally and thus comprise a sequence for autonomous replication, such as an autonomously replicating sequence. Alternatively, a XI nucleic acid may be integrated in one or more copies into the genome of the cell. Integration into the cell's genome may occur at random by non-homologous recombination but preferably, the nucleic acid construct may be integrated into the cell's genome by homologous recombination as is well known in the art. In certain embodiments, the host cell is bacterial or fungal (e.g., a yeast or a filamentous fungus).


Suitable host cells of the bacterial genera include, but are not limited to, cells of Escherichia, Bacillus, Lactobacillus, Pseudomonas, and Streptomyces. Suitable cells of bacterial species include, but are not limited to, cells of Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Lactobacillus brevis, Pseudomonas aeruginosa, and Streptomyces lividans.


Suitable host cells of the genera of yeast include, but are not limited to, cells of Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Phaffia, Issatchenkia and Yarrowia. In specific embodiments, the recombinant cell is a S. cerevisiae, C. albicans, S. pombe, S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, H. polymorpha, K. lactis, I. orientalis, K. marxianus, K. fragilis, P. pastoris, P. canadensis or P. rhodozyma. Exemplary yeast strains that are suitable for recombinant XI expression include, but are not limited to, Lallemand LYCC 6391, Lallemand LYCC 6939, Lallemand LYCC 6469 (all from Lallemand, Inc., Montreal, Canada); NRRL YB-1952 (ARS (NRRL) Collection, U.S. Department of Agriculture); and BY4741.


Suitable host cells of filamentous fungi include all filamentous forms of the subdivision Eumycotina. Suitable cells of filamentous fungal genera include, but are not limited to, cells of Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Corynascus, Chaetomium, Cryptococcus, Filobasidium, Fusarium, Gibberella, Humicola, Hypocrea, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Scytalidium, Schizophyllum, Sporotrichum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, and Trichoderma. In certain aspects, the recombinant cell is a Trichoderma sp. (e.g., Trichoderma reesei), Penicillium sp., Humicola sp. (e.g., Humicola insolens), Aspergillus sp. (e.g., Aspergillus niger), Chrysosporium sp., Fusarium sp., or Hypocrea sp. Suitable cells can also include cells of various anamorph and teleomorph forms of these filamentous fungal genera.


Suitable cells of filamentous fungal species include, but are not limited to, cells of Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium lucknowense, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Coprinus cinereus, Coriolus hirsutus, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Penicillium purpurogenum, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces flavus, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, and Trichoderma viride.


Typically, for recombinant expression, the XI nucleic acid will be operably linked to one or more nucleic acid sequences capable of providing for or aiding the transcription and/or translation of the XI sequence, for example a promoter operable in the organism in which the XI is to be expressed. The promoters can be homologous or heterologous, and constitutive or inducible.


Preferably, the XI polypeptide is expressed in the cytosol and therefore lacks a mitochondrial or peroxisomal targeting signal.


Where recombinant expression in a filamentous fungal host is desired, the promoter can be a fungal promoter (including but not limited to a filamentous fungal promoter), a promoter operable in plant cells, or a promoter operable in mammalian cells.


As described in U.S. provisional application No. 61/553,901, filed Oct. 31, 2011, the contents of which are hereby incorporated by reference in their entireties, promoters that are constitutively active in mammalian cells (which can be derived from a mammalian genome or the genome of a mammalian virus) are capable of eliciting high expression levels in filamentous fungi such as Trichoderma reesei. An exemplary promoter is the cytomegalovirus (“CMV”) promoter.


As described in U.S. provisional application No. 61/553,897, filed Oct. 31, 2011, the contents of which are hereby incorporated by reference in their entireties, promoters that are constitutively active in plant cells (which can be derived from a plant genome or the genome of a plant virus) are capable of eliciting high expression levels in filamentous fungi such as Trichoderma reesei. Exemplary promoters are the cauliflower mosaic virus (“CaMV”) 35S promoter or the Commelina yellow mottle virus (“CoYMV”) promoter.


Mammalian, mammalian viral, plant and plant viral promoters can drive particularly high expression when the 5′ UTR sequence (i.e., the sequence which begins at the transcription start site and ends one nucleotide (nt) before the start codon) normally associated with the mammalian or mammalian viral promoter is replaced by a fungal 5′ UTR sequence.


The source of the 5′ UTR can vary provided it is operable in the filamentous fungal cell. In various embodiments, the 5′ UTR can be derived from a yeast gene or a filamentous fungal gene. The 5′ UTR can be from the same species as one other component in the expression cassette (e.g., the promoter or the XI coding sequence), or from a different species. The 5′ UTR can be from the same species as the filamentous fungal cell that the expression construct is intended to operate in. In an exemplary embodiment, the 5′ UTR comprises a sequence corresponding to a fragment of a 5′ UTR from a T. reesei glyceraldehyde-3-phosphate dehydrogenase (gpd). In a specific embodiment, the 5′ UTR is not naturally associated with the CMV promoter.


Examples of other promoters that can be used include, but are not limited to, a cellulase promoter, a xylanase promoter, and the 1818 promoter (previously identified as a highly expressed protein by EST mapping in Trichoderma). For example, the promoter can suitably be a cellobiohydrolase, endoglucanase, or β-glucosidase promoter. A particularly suitable promoter can be, for example, a T. reesei cellobiohydrolase, endoglucanase, or β-glucosidase promoter. Non-limiting examples of promoters include a cbh1, cbh2, egl1, egl2, egl3, egl4, egl5, pki1, gpd1, xyn1, or xyn2 promoter.


For recombinant expression in yeast, suitable promoters for S. cerevisiae include the MFα1 promoter, galactose inducible promoters such as the GAL1, GAL7 and GAL10 promoters, glycolytic enzyme promoters including the TPI and PGK promoters, the TDH3 promoter, the TEF1 promoter, the TRP1 promoter, the CYC1 promoter, the CUP1 promoter, the PHO5 promoter, the ADH1 promoter, and the HSP promoter. Promoters that are active at different stages of growth or production (e.g., idiophase or trophophase) can also be used (see, e.g., Puig et al., 1996, Biotechnology Letters 18(8):887-892; Puig and Pérez-Ortin, 2000, Systematic and Applied Microbiology 23(2): 300-303; Simon et al., 2001, Cell 106:697-708; Wittenberg and Reed, 2005, Oncogene 24:2746-2755). A suitable promoter in the genus Pichia is the AOX1 (methanol utilization) promoter.


The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the nucleic acid sequence encoding the XI polypeptide. Culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those skilled in the art. As noted, many references are available for the culture and production of many cells, including cells of bacterial and fungal origin. Cell culture media in general are set forth in Atlas and Parks (eds.), 1993, The Handbook of Microbiological Media, CRC Press, Boca Raton, Fla., which is incorporated herein by reference. For recombinant expression in filamentous fungal cells, the cells are cultured in a standard medium containing physiological salts and nutrients, such as described in Pourquie et al., 1988, Biochemistry and Genetics of Cellulose Degradation, eds. Aubert, et al., Academic Press, pp. 71-86; and Ilmen et al., 1997, Appl. Environ. Microbiol. 63:1298-1306. Culture conditions are also standard, e.g., cultures are incubated at 30° C. in shaker cultures or fermenters until desired levels of XI expression are achieved. Preferred culture conditions for a given filamentous fungus may be found in the scientific literature and/or from the source of the fungi such as the American Type Culture Collection (ATCC). After fungal growth has been established, the cells are exposed to conditions effective to cause or permit the expression of a XI.


In cases where a XI coding sequence is under the control of an inducible promoter, the inducing agent, e.g., a sugar, metal salt or antibiotic, is added to the medium at a concentration effective to induce XI expression.


In addition to recombinant expression of a XI polypeptide, a host cell of the disclosure may further include one or more genetic modifications that increase the cell's ability to utilize xylose as a substrate in a fermentation process. Exemplary additional modifications create one, two, three, four, five or even more of the following phenotypes: (a) increase in xylose transport into the cell; (b) increase in aerobic growth rate on xylose; (c) increase in xylulose kinase activity; (d) increase in flux through the pentose phosphate pathway into glycolysis; (e) modulation of aldose reductase activity; (f) decrease in sensitivity to catabolite repression; (g) increase in tolerance to biofuels, e.g., ethanol; (h) increase in tolerance to intermediate production (for example xylitol); (i) increase in temperature tolerance; (j) osmolarity of organic acids; and (k) a reduced production of byproducts.


As illustrated below, a modification that results in one or more of the foregoing phenotypes can be a result of increasing or decreasing expression of an endogenous protein (e.g., by at least a factor of about 1.1, about 1.2, about 1.5, about 2, about 5, about 10 or about 20) or a result of introducing expression of a heterologous polypeptide. For avoidance of doubt, “decreasing” or “reducing” gene expression encompasses eliminating expression. Decreasing (or reducing) the expression of an endogenous protein can be accomplished by inactivating one or more (or all) endogenous copies of a gene in a cell. A gene can be inactivated by deletion of at least part of the gene or by disruption of the gene. This can be achieved by deleting some or all of a gene coding sequence or regulatory sequence whose deletion results in a reduction of gene expression in the cell. Examples of modifications that increase xylose utilization or yield of fermentation product are described below.


Increasing Xylose Transport:


Xylose transport can be increased directly or indirectly. For example, a recombinant cell may include one or more genetic modifications that result in expression of a xylose transporter. Exemplary transporters include, but are not limited to, GXF1, SUT1 and At5g59250 from Candida intermedia, Pichia stipitis (now renamed Scheffersomyces stipitis; the terms are used interchangeably herein) and Arabidopsis thaliana, respectively (Runquist et al., 2010, Biotechnol. Biofuels 3:5), as well as HXT4, HXT5, HXT7, GAL2, AGT1, and GXF2 (see, e.g., Matsushika et al., 2009, Appl. Microbiol. Biotechnol. 84:37-53). Other transporters include PsAraT, SUT2-4 and XUT1-5 from P. stipitis; GXS1 from Candida intermedia; XylHP and DEHAOD02167 from Debaryomyces hansenii; and YALI0C06424 from Yarrowia lipolytica (see, e.g., Young et al., 2011, Appl. Environ. Microbiol. 77:3311-3319). Xylose transport can also be increased by (over-) expression of low-affinity hexose transporters, which are capable of non-selectively transporting sugars, including xylose, into the cell once glucose levels are low (e.g., 0.2-1.0 g/l); such transporters include CgHXT1-CgHXT5 from Colletotrichum graminicola. The foregoing modifications can be made singly or in combinations of two, three or more modifications.


Increasing Xylulose Kinase Activity:


Xylulose kinase activity can be increased by overexpression of a xylulose kinase, e.g., xylulose kinase (XKS1; Saccharomyces genome database (“SGD”) accession no. YGR194C) of S. cerevisiae, particularly where the recombinant cell is a yeast cell. In one embodiment, a S. cerevisiae cell is engineered to include at least 2 additional copies of xylulose kinase under the control of a strong constitutive promoter such as TDH3, TEF1 or PGK1. In another embodiment, an endogenous xylulose kinase is overexpressed, the xylulose kinase having improved kinetic activities through the use of protein engineering techniques known to those skilled in the art.


Increasing Flux Through the Pentose Phosphate Pathway:


This can be achieved by increasing expression of one or more genes in the pentose phosphate pathway, for example S. cerevisiae transaldolase TAL1 (SGD accession no. YLR354C), transketolase TKL1 (SGD accession no. YPR074C), ribulose 5-phosphate epimerase RPE1 (SGD accession no. YJL121C) and ribose-5-phosphate ketoisomerase RKI1 (SGD accession no. YOR095C) and/or one or more genes to increase glycolytic flux, for example S. cerevisiae pyruvate kinase PYK1/CDC19 (SGD accession no. YAL038W), pyruvate decarboxylase PDC1 (SGD accession no. YLR044C), pyruvate decarboxylase PDC5 (SGD accession no. YLR134W), pyruvate decarboxylase PDC6 (SGD accession no. YGR087C), the alcohol dehydrogenases ADH1-5 (SGD accession nos. YOL086C, YMR303C, YMR083W, YGL256W, and YBR145W, respectively), and the hexokinases HXK1-2 (SGD accession nos. YFR053C and YGL253W, respectively). In one embodiment, the yeast cell has one additional copy each of TAL1, TKL1, RPE1 and RKI1 from S. cerevisiae under the control of strong constitutive promoters (e.g., PGK1, TDH3, TEF1); and may also include improvements to glycolytic flux (e.g., increased copies of genes such as PYK1, PDC1, PDC5, PDC6, ADH1-5) and glucose-6-phosphate and hexokinase. The foregoing modifications can be made singly or in combinations of two, three or more modifications.


Modulating Aldose Reductase Activity:


A recombinant cell can include one or more genetic modifications that increase or reduce (unspecific) aldose reductase (sometimes called aldo-keto reductase) activity. Aldose reductase activity can be reduced by one or more genetic modifications that reduce the expression of or inactivate a gene encoding an aldose reductase, for example S. cerevisiae GRE3 (SGD accession no. YHR104W).


In certain embodiments, GRE3 expression is reduced. In one aspect, the recombinant cell is a yeast cell in which the GRE3 gene is deleted. Deletion of GRE3 decreased xylitol yield by 49% and biomass production by 31%, but increased ethanol yield by 19% (Traff-Bjerre et al., 2004, Yeast 21:141-150). In another aspect, the recombinant cell is a yeast cell which has a reduction in expression of GRE3. Reducing GRE3 expression has been shown to result in a two-fold decrease in by-product (i.e., xylitol) formation and an associated improvement in ethanol yield (Traff et al., 2001, Appl. Environ. Microbiol. 67:5668-5674).


In another embodiment, the recombinant cell is a cell (optionally but not necessarily a yeast cell) in which GRE3 is overexpressed. In a study analyzing the effect of GRE3 overexpression in S. cerevisiae to investigate the effect on xylose utilization, an increase of about 30% in xylose consumption and about 120% in ethanol production was noted (Traff-Bjerre et al., 2004, Yeast 21:141-150).


Decreasing Xylose Reductase Activity:


A recombinant cell may include one or more genetic modifications that reduce xylose reductase activity. Xylose reductase activity can be reduced by one or more genetic modifications that reduce the expression of or inactivate a gene encoding a xylose reductase.


Decreasing Sensitivity to Catabolite Repression:


Glucose and other sugars, such as galactose or maltose, are able to cause carbon catabolite repression in Crabtree-positive yeast, such as S. cerevisiae. In one study, xylose was found to decrease the derepression of various enzymes of an engineered S. cerevisiae strain capable of xylose utilization by at least 10-fold in the presence of ethanol. Xylose also impaired the derepression of galactokinase and invertase (Belinchon & Gancedo, 2003, Arch. Microbiol. 180:293-297). In certain embodiments, in order to reduce catabolite sensitivity, yeast can include one or more genetic modifications that reduce expression of one or more of GRR1 (SGD accession no. YJR090C), the gene assigned SGD accession no. YLR042C, GAT1 (SGD accession no. YKR067W) and/or one or more genetic modifications that decrease expression of one or more of SNF1 (SGD accession no. YDR477W), SNF4 (SGD accession no. YGL115W), MIG1 (SGD accession no. YGL035C) and CRE1 (SGD accession no. YJL127C). In further embodiments, yeast can include one or more genetic modifications that result in overexpression of the pentose phosphate pathway enzymes. In yet further embodiments, yeast can include one or more genetic modifications that reduce expression of hexo-/glucokinase. In yet a further embodiment, yeast can include one or more genetic modifications that modulate the activity of one or more GATA factors, for example GAT1, DAL80 (SGD accession no. YKR034W), GZF3 (SGD accession no. YJL110C) and GLN3 (SGD accession no. YER040W). The foregoing modifications can be made singly or in combinations of two, three or more modifications.


Increasing Tolerance to Biofuels (e.g., Ethanol), Pathway Intermediates (e.g., Xylitol), Organic Acids and Temperature:


For efficient bioethanol production from lignocellulosic biomass, it is useful to improve cellular tolerance to toxic compounds released during the pretreatment of biomass. In one study, the gene encoding PHO13 (SGD accession no. YDL236W), a protein with alkaline phosphatase activity, was disrupted. This resulted in improved ethanol production from xylose in the presence of three major inhibitors (i.e., acetic acid, formic acid and furfural). Further, the specific ethanol productivity of the mutant in the presence of 90 mM furfural was four-fold higher (Fujitomi et al., 2012, Biores. Tech., 111:161-166). Thus, in one embodiment, yeast has one or more genetic modifications that reduce PHO13 expression. In other embodiments, yeast, bacterial and fungal cells are evolved under selective conditions to identify strains that can withstand higher temperatures, higher levels of intermediates, higher levels of organic acids and/or higher levels of biofuels (e.g., ethanol). In yet other embodiments, yeast are engineered to reduce expression of FPS1 (SGD accession no. YLL043W); overexpress unsaturated lipid and ergosterol biosynthetic pathways; reduce expression of PHO13 and/or SSK2 (SGD accession no. YNR031C); modulate global transcription factor cAMP receptor protein, through increasing or decreasing expression; increase expression of MSN2 (SGD accession no. YMR037C), RCN1 (SGD accession no. YKL159C), RSA3 (SGD accession no. YLR221C), CDC19 and/or ADH1; or increase expression of rice ASR1. The foregoing modifications can be made singly or in combinations of two, three or more modifications.


Reducing Production of Byproducts:


Glycerol is one of the main byproducts in C6 ethanol production. Reducing glycerol is desirable for increasing xylose utilization by yeast. Production of glycerol can be reduced by deleting the gene encoding the FPS1 channel protein, which mediates glycerol export, and GPD2 (SGD accession no. YOL059W), which encodes glycerol-3-phosphate dehydrogenase; optionally along with overexpression of GLT1 (SGD accession no. YDL171C) and GLN1 (SGD accession no. YPR035W). In one study, FPS1 and GPD2 were knocked out in one S. cerevisiae strain, and in another were replaced by overexpression of GLT1 and GLN1, which encode glutamate synthase and glutamine synthetase, respectively. When grown under microaerobic conditions, these strains showed ethanol yield improvements of 13.17% and 6.66%, respectively. Conversely, glycerol, acetic acid and pyruvic acid were all found to decrease, with glycerol down 37.4% and 41.7%, respectively (Zhang and Chen, 2008, Chinese J. Chem. Eng. 16:620-625).


Production of glycerol can also be reduced by deleting the NADH-dependent glycerol-3-phosphate dehydrogenase 1 (GPD1; SGD accession no. YDL022W) and/or the NADPH-dependent glutamate dehydrogenase 1 (GDH1; SGD accession no. YOR375C). Sole deletion of GPD1 or GDH1 reduces glycerol production, and double deletion results in a 46.4% reduction of glycerol production as compared to wild-type S. cerevisiae (Kim et al., 2012, Bioproc. Biosys. Eng. 35:49-54). Deleting FPS1 can decrease production of glycerol for osmoregulatory reasons.


Reducing production of acetate can also increase xylose utilization. Deleting ALD6 (SGD accession no. YPL061W) can decrease production of acetate.


ADH2 can also be deleted to reduce or eliminate acetaldehyde formation from ethanol and thereby increase ethanol yield.


The foregoing modifications to reduce byproduct formation can be made singly or in combinations of two, three or more modifications.


In addition to ethanol production, a recombinant XI-expressing cell of the disclosure can be suitable for the production of non-ethanolic fermentation products. Such non-ethanolic fermentation products include in principle any bulk or fine chemical that is producible by a eukaryotic microorganism such as a yeast or a filamentous fungus. Such fermentation products may be, for example, butanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, itaconic acid, an amino acid, 1,3-propane-diol, ethylene, glycerol, a β-lactam antibiotic or a cephalosporin. A preferred modified host cell of the disclosure for production of non-ethanolic fermentation products is a host cell that contains a genetic modification that results in decreased alcohol dehydrogenase activity.


Cells expressing the XI polypeptides of the disclosure can be grown under batch, fed-batch or continuous fermentation conditions. Classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. A variation of the batch system is a fed-batch fermentation in which the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. Continuous fermentation systems strive to maintain steady state growth conditions. Methods for modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology.


4.4 Fermentation Methods


A further aspect of the disclosure relates to fermentation processes in which the recombinant XI-expressing cells are used for the fermentation of a carbon source comprising a source of xylose. Thus, in certain embodiments, the disclosure provides a process for producing a fermentation product by (a) fermenting a medium containing a source of xylose with a recombinant XI-expressing cell as defined herein above, under conditions in which the cell ferments xylose to the fermentation product, and optionally, (b) recovery of the fermentation product. In some embodiments, the fermentation product is an alcohol (e.g., ethanol, butanol, etc.), a fatty alcohol (e.g., a C8-C20 fatty alcohol), a fatty acid (e.g., a C8-C20 fatty acid), lactic acid, 3-hydroxypropionic acid, acrylic acid, acetic acid, succinic acid, citric acid, malic acid, fumaric acid, an amino acid, 1,3-propanediol, itaconic acid, ethylene, glycerol, or a β-lactam antibiotic such as Penicillin G or Penicillin V and fermentative derivatives thereof and cephalosporins. The fermentation process may be an aerobic or an anaerobic fermentation process.


In addition to a source of xylose, the carbon source in the fermentation medium may also comprise a source of glucose. The source of xylose or glucose may be xylose or glucose as such or may be any carbohydrate oligo- or polymer comprising xylose or glucose units, such as lignocellulose, xylans, cellulose, starch and the like. Most microorganisms possess carbon catabolite repression that results in sequential consumption of mixed sugars derived from the lignocellulose, reducing the efficacy of the overall process. To increase the efficiency of fermentation, microorganisms that are capable of simultaneous consumption of mixed sugars (e.g., glucose and xylose) have been developed, for example by rendering them less sensitive to glucose repression (see, e.g., Kim et al., 2010, Appl. Microbiol. Biotechnol. 88:1077-85 and Ho et al., 1999, Adv. Biochem. Eng. Biotechnol. 65:163-92). Such cells can be used for recombinant XI expression and in the fermentation methods of the disclosure.


The fermentation process is preferably run at a temperature that is optimal for the recombinant XI-expressing cells. Thus, for most yeasts or fungal host cells, the fermentation process is performed at a temperature which is less than 38° C., unless temperature tolerant mutant strains are used, in which case the temperature may be higher. For most yeast or filamentous fungal host cells, the fermentation process is suitably performed at a temperature which is lower than 35° C., 33° C., 30° C. or 28° C. Optionally, the temperature is higher than 20° C., 22° C., or 25° C.


An exemplary process is a process for the production of ethanol, whereby the process comprises the steps of: (a) fermenting a medium containing a source of xylose with a transformed host cell as defined above, whereby the host cell ferments xylose to ethanol; and optionally, (b) recovery of the ethanol. The fermentation medium can also comprise a source of glucose that is also fermented to ethanol. The source of xylose can be sugars produced from biomass or agricultural wastes. Many processes for the production of monomeric sugars such as glucose generated from lignocellulose are well known, and are suitable for use herein. In brief, the cellulolytic material may be enzymatically, chemically, and/or physically hydrolyzed to a glucose and xylose containing fraction. Alternatively, the recombinant XI-expressing cells of the disclosure can be further transformed with one or more genes encoding enzymes effective for hydrolysis of complex substrates such as lignocellulose, and include but are not limited to cellulases, hemicellulases, peroxidases, laccases, chitinases, proteases, and pectinases. The recombinant cells of the disclosure can then be fermented under anaerobic conditions in the presence of glucose and xylose. Where the recombinant cell is a yeast cell, the fermentation techniques and conditions described, for example, by Wyman (1994, Biores. Technol. 50:3-16) and Olsson and Hahn-Hagerdal (1996, Enzyme Microb. Technol. 18:312-331) can be used. After completion of the fermentation, the ethanol may be recovered and optionally purified or distilled. Solid residue containing lignin may be discarded or burned as a fuel.


The fermentation process may be run under aerobic or anaerobic conditions. In some embodiments, the process is carried out under microaerobic or oxygen limited conditions. Fermentation can be carried out in a batch, fed-batch, or continuous configuration within (bio)reactors.


5. EXAMPLES
5.1 Materials and Methods

5.1.1 Yeast Culture


Unless stated otherwise for a particular example, yeast transformants were grown in SC-ura media with about 2% glucose at 30° C. for about 24 hours. The media contains approx. 20 g agar, approx. 134 g BD Difco™ Yeast Nitrogen Base without amino acids (BD, Franklin Lakes, N.J.), and approx. 2 g SC amino-acid mix containing about 85 mg of each of the following amino acids unless noted (quantity listed in parentheses): L-Adenine (21.0), L-Alanine, L-Arginine, L-Asparagine, L-Aspartic Acid, L-Cysteine, Glutamine, L-Glutamic Acid, Glycine, L-Histidine, Myo-Inositol, L-Isoleucine, L-Leucine (173.4), L-Lysine, L-Methionine, p-Aminobenzoic Acid (8.6), L-Phenylalanine, L-Proline, L-Serine, L-Threonine, L-Tryptophan, L-Tyrosine, L-Valine.
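As a quick arithmetic check on the amino-acid mix described above, the following Python sketch sums the listed components. It assumes that "about 85 mg" applies to each component individually and that the parenthetical quantities are also in mg; both are readings of the text, not statements taken from it. The names DEFAULT_MG, NOTED_MG and mix_total_mg are illustrative only.

```python
# Components of the SC amino-acid mix, in the order listed in the text.
COMPONENTS = [
    "L-Adenine", "L-Alanine", "L-Arginine", "L-Asparagine", "L-Aspartic Acid",
    "L-Cysteine", "Glutamine", "L-Glutamic Acid", "Glycine", "L-Histidine",
    "Myo-Inositol", "L-Isoleucine", "L-Leucine", "L-Lysine", "L-Methionine",
    "p-Aminobenzoic Acid", "L-Phenylalanine", "L-Proline", "L-Serine",
    "L-Threonine", "L-Tryptophan", "L-Tyrosine", "L-Valine",
]

# Assumption: every component is about 85 mg unless a quantity is noted.
DEFAULT_MG = 85.0
NOTED_MG = {"L-Adenine": 21.0, "L-Leucine": 173.4, "p-Aminobenzoic Acid": 8.6}


def mix_total_mg(components, noted, default_mg):
    """Sum the mass of all components, using the noted quantity where given."""
    return sum(noted.get(name, default_mg) for name in components)


total_mg = mix_total_mg(COMPONENTS, NOTED_MG, DEFAULT_MG)
print(f"{len(COMPONENTS)} components, total about {total_mg / 1000:.2f} g")
```

Under these assumptions the listed components sum to roughly 1.9 g, which is consistent with the "approx. 2 g" stated for the mix.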


5.1.2 Xylose Isomerase Activity


XI activity in cell lysates was determined using a method based on that of Kersters-Hilderson et al., 1986, Enzyme Microb. Technol. 9:145-148, in which enzymatic conversion of xylose to xylulose by the XI is coupled with the enzymatic conversion of the product (xylulose) to xylitol via the enzyme sorbitol dehydrogenase (SDH). SDH activity requires the oxidation of NADH to NAD+. The rate of oxidation of NADH is directly proportional to the rate of SDH conversion of D-xylulose to D-xylitol and is measured by the decrease in absorbance at 340 nm. One unit of enzyme activity as measured by this assay is a decrease of 1 μmole of NADH per minute under assay conditions. All reactions, solutions, plates, and spectrophotometer were equilibrated to about 35° C. prior to use. Assays were performed either on fresh lysates immediately after preparation or lysates that had been frozen at −20° C. immediately after preparation. Assays were performed using a BioTek Model: Synergy H1 Hybrid Reader spectrophotometer and 96-well plates (Corning, Model #Costar® #3598). All spectrophotometric readings were performed at 340 nm. A standard curve of NADH was generated with each assay with concentrations ranging from 0 to about 0.6 mM.
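The NADH standard curve mentioned above can be reduced to a slope by an ordinary least-squares fit. The Python sketch below is illustrative only: the absorbance readings are hypothetical placeholder values (not data from this disclosure), and fit_line is a helper written for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# NADH standards spanning 0 to about 0.6 mM, as described in the text.
nadh_mM = [0.0, 0.15, 0.30, 0.45, 0.60]
# Hypothetical A340 readings for those standards (placeholder values only).
a340 = [0.05, 0.35, 0.65, 0.95, 1.25]

slope_A_per_mM, intercept_A = fit_line(nadh_mM, a340)
print(f"slope = {slope_A_per_mM:.3f} A340 per mM NADH, blank = {intercept_A:.3f}")
```

The fitted slope (absorbance per mM NADH in the plate format used) is the quantity needed to convert a rate of absorbance change into a rate of NADH consumption.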


The reaction buffer used for experiments at pH 7.5 was about 100 mM Tris-HCl (pH 7.5). The assay mix was prepared as follows: reaction buffer to which was added about 10 mM MgCl2, 0.15 mM NADH and 0.05 mg/ml SDH (Roche, catalog #50-720-3313). For experiments where activity was also measured at pH 6, the buffer was changed to about 100 mM sodium phosphate, pH 6. The assay mix for the entire experiment was then prepared as follows: about 10 mM MgCl2, 1.2 mM NADH and 0.02 mg/ml SDH.


Any sample dilutions were performed using the reaction buffer as diluent. Reactions were set up by aliquoting about 90 μl of assay mix into each well of the plates. About 10 μl of each XI sample was added to the wells. The reactions were started by the addition of about 100 μl substrate solution (about 1 M D-xylose). Reactions were mixed and read immediately using kinetic assay mode for about 10 minutes. Volumetric activity (VA) units are in milli-absorbance (mA) units per minute per ml of lysate added to the reactions (mA/min/ml). Background VA rates of negative control wells (no enzyme added) were subtracted from VA of samples. Determination of fold improvement over positive control (FIOPC) was obtained by dividing the VA of the XI-samples by the VA observed for a control (Orpinomyces xylose isomerase, NCBI:169733248 (Op-XI)) expressed using the same host and expression vector. In some characterizations, the slope of an NADH standard curve was used to convert VA (mA/min) to μmole-NADH/min (or Units). If protein quantitation was performed, specific activities (SA) were calculated where the units for SA are μmole NADH/min/mg (U/mg lysate protein). All activities listed (VA or SA) account for any dilutions, volumes of lysate added, and protein concentrations for the lysates assayed.
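The calculations described above (background subtraction, VA, FIOPC, conversion to Units and SA) can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used: the function names are invented for this example, all numeric inputs are hypothetical, and the 0.2 ml reaction volume is simply the sum of the approximate volumes given in the text (90 μl + 10 μl + 100 μl).

```python
def volumetric_activity(rate_mA_min, background_mA_min, lysate_ml, dilution=1.0):
    """Background-subtracted rate per ml of undiluted lysate (mA/min/ml)."""
    return (rate_mA_min - background_mA_min) * dilution / lysate_ml


def fiopc(sample_va, control_va):
    """Fold improvement over the positive control (same host and vector)."""
    return sample_va / control_va


def units_per_ml(va_mA_min_ml, slope_mA_per_mM, reaction_ml):
    """Convert VA to umol NADH/min per ml lysate using the standard-curve slope.

    mA/min divided by mA/mM gives mM/min; 1 mM equals 1 umol/ml, so
    multiplying by the reaction volume in ml gives umol/min.
    """
    return va_mA_min_ml / slope_mA_per_mM * reaction_ml


def specific_activity(u_per_ml, protein_mg_per_ml):
    """Units per mg of lysate protein (U/mg)."""
    return u_per_ml / protein_mg_per_ml


# Hypothetical inputs, for illustration only.
REACTION_ML = 0.2          # about 90 ul mix + 10 ul sample + 100 ul substrate
LYSATE_ML = 0.010          # about 10 ul of lysate per well
sample_va = volumetric_activity(25.0, 1.0, LYSATE_ML)
control_va = volumetric_activity(13.0, 1.0, LYSATE_ML)
sample_u = units_per_ml(sample_va, slope_mA_per_mM=2000.0, reaction_ml=REACTION_ML)

print(f"VA = {sample_va:.0f} mA/min/ml, FIOPC = {fiopc(sample_va, control_va):.2f}")
print(f"SA = {specific_activity(sample_u, protein_mg_per_ml=2.0):.3f} U/mg")
```

Keeping dilution, lysate volume and protein concentration as explicit arguments mirrors the statement that all reported activities account for those factors.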


5.2 Example 1: Activity-Based Discovery Screen for Xylose Isomerases

Libraries used for the activity-based discovery (“ABD”) screen were in the format of excised phagemids. These libraries were constructed as described in U.S. Pat. No. 6,280,926. Sources for these libraries were environmental rumen samples collected from the foregut of deceased herbivores.


An Escherichia coli screening strain was constructed to identify genes from the environmental libraries encoding xylose isomerase activity. Specifically, E. coli strain SEL700, a MG1655 derivative that is recA, phage lambda resistant and contains an F′ plasmid, was complemented with plasmid pJC859, a derivative of pBR322 containing the E. coli recA gene (Kokjohn et al., 1987, J. Bacteriol. 169:1499-1508) to generate a wild-type recA phenotype.


A two-step marker exchange procedure was then used to delete the entire coding sequence of the endogenous xylA xylose isomerase gene. Briefly, pMEV3, a plasmid with a pir-dependent replicon (oriR6K) encoding kanamycin-resistance and the sacB levansucrase, was used as a vector for construction of the xylA deletion plasmid. A fragment of DNA containing the flanking regions of the xylA gene (0.7 kb of sequence 5′ and 0.9 kb of sequence 3′ of xylA) and containing BsaI restriction sites was generated by overlap extension PCR using primers, ligated to pMEV3 digested with BbsI, and transformed into E. coli by electroporation. Clones were confirmed by sequencing, resulting in plasmid pMEV3-ΔxylA (FIG. 1A).


The pMEV3-ΔxylA plasmid was then transformed into E. coli strain SEL700 (MG1655 Δr, Δ(recA-srl)306,srl-301::Tn10-84(Tets), [F′ proAB, lacIq, ZΔM15, Tn10 (Tetr)] pJC859). Single-crossover events were selected for by plating on LB agar plates containing kanamycin (final concentration, about 50 μg/ml). After confirmation of integration of pMEV3-ΔxylA on the chromosome, a second crossover event was selected for by growth on LB agar media containing sucrose (FIG. 2). Colonies displaying resistance to kanamycin and the ability to grow on sucrose were screened both by PCR characterization with primers flanking the xylA gene to confirm gene deletion and by growth on a modified MacConkey media (ABD media), comprised of: MacConkey Agar Base (Difco™ #281810) (approximate formula per liter: Pancreatic Digest of Gelatin (17.0 g), Peptones (meat and casein) (3.0 g), Bile Salts No. 3 (1.5 g), Sodium Chloride (5.0 g), Agar (13.5 g), Neutral Red (0.03 g), Crystal Violet (1.0 g)), Xylose (30.0 g) and Kanamycin (50 mg). The ABD media contained neutral red, a pH indicator that turns red at a pH <6.8. Colonies of mutants lacking xylA appeared white on this media while colonies with restored xylose metabolism ability appeared red in color due to the fermentation of xylose to xylulose, which lowered the pH of the media surrounding those colonies.


Following the successful deletion of xylA, the resulting strain was cured of pJC859 by the following method: The xylA deletion strain was grown for about 24 hours in LB media containing tetracycline at a final concentration of about 20 μg/ml, at around 37° C. The next day the cells were subcultured (1:100 dilution) into LB tetracycline (at the same concentration) media and incubated at three different temperatures (about 30, 37, and 42° C.). Cells were passaged the same way as above for about two more days. Dilutions of the resulting cultures were plated on LB plates to isolate single colonies. Colonies were replica plated onto LB agar plates with and without carbenicillin (at about 100 μg/ml, final concentration). Carbenicillin resistant colonies were deemed to still contain vector pJC859 whereas carbenicillin sensitive colonies were cured of pJC859, restoring the recA genotype of strain SEL700. This strain, SEL700 ΔxylA, was used for the ABD screening.


The ABD screening method was verified by creating a positive control strain by PCR amplification of the xylA gene from E. coli K12 and cloning into the PCR-BluntII TOPO vector (Invitrogen, Carlsbad, Calif.) using standard procedures. This vector (PCR-BluntII-TOPO-xylA, FIG. 1B) was then transformed into the screening strain (SEL700 ΔxylA). Complementation of the xylose phenotype was verified by growth of transformants on ABD media and appearance of red halos indicating xylose utilization.


The libraries were screened for XI activity by infecting strain SEL700 ΔxylA with the excised phagemid libraries. Infected cells were plated onto ABD media and only colonies with red “halos” (indicating xylose fermentation) were carried forward. Positives were purified to single colonies and regrown on ABD media to confirm phenotype.


5.3 Example 2: Sequence-Based Discovery for Xylose Isomerases

Libraries used for sequence-based discovery (“SBD”) were in the format of genomic DNA (gDNA) extractions. These libraries were constructed as described in U.S. Pat. No. 6,280,926. Sources for these libraries were samples collected from the guts of deceased herbivores.


XI genes often exist in conserved gene clusters (Dodd et al., 2011, Molecular Microbiol. 79:292-304). In order to obtain full length XI gene sequences from metagenomic samples, primers were designed to both upstream and downstream conserved DNA sequences found in several Bacteroides species, typically xylulose kinase and xylose permease, respectively. These flanking DNA sequences were obtained from public databases. Sample genomic DNA was extracted from eleven different animal rumen samples. Left flanking consensus primer has the sequence 5′-GCIGCICARGARGGNATYGTVTT-3′ (SEQ ID NO:177) (this primer codes for the amino acid motif AAQEGIV(F) (SEQ ID NO:178)). Right flanking consensus primer has the sequence 5′-GCDATYTCNGCRATRTACATSGG-3′ (SEQ ID NO:179) (this primer codes for the amino acid motif PMYIAEIA (SEQ ID NO:180)). PCR reactions were carried out using touchdown cycling conditions, and hot start Platinum® Taq DNA polymerase (Invitrogen, Carlsbad, Calif.). PCR products of expected size were purified and subcloned into pCR4-TOPO vector system (Invitrogen, Carlsbad, Calif.). Positive colonies from the TOPO-based PCR libraries were transformed into TOP10 (Invitrogen, Carlsbad, Calif.) and the transformants grown on LB agar plates with kanamycin (about 25 μg/ml final concentration). Resistant colonies were picked and inoculated into 2 columns each of a 96-deep well plate in about 1.2 ml LB kanamycin (25 μg/ml final concentration) media per well. Cultures were grown overnight at about 30° C. The next day plasmids were purified and inserts sequenced. Sequence analysis revealed multiple full length XI genes. Identification of putative ORFs was done by identifying start and stop codons for the longest protein coding region, and subsequent manual curation based on homology to published xylose isomerase DNA sequences.
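The relationship between the two degenerate consensus primers and the amino acid motifs they are stated to encode can be checked with a short Python sketch. It expands each IUPAC ambiguity code, translates every complete codon with the standard genetic code, and reports a residue only when all expansions of a codon agree. Treating inosine ("I") as able to pair with any base is an assumption made for this illustration, and the helper names are not taken from the disclosure.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes; "I" (inosine) is treated like N (assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT", "I": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R", "S": "S",
    "W": "W", "K": "M", "M": "K", "B": "V", "V": "B", "D": "H", "H": "D",
    "N": "N", "I": "N",
}

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate_degenerate(seq):
    """Translate complete codons; 'X' marks codons encoding more than one residue."""
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        expansions = product(*(IUPAC[base] for base in seq[i:i + 3]))
        encoded = {CODON_TABLE["".join(codon)] for codon in expansions}
        residues.append(encoded.pop() if len(encoded) == 1 else "X")
    return "".join(residues)


left_primer = "GCIGCICARGARGGNATYGTVTT"    # SEQ ID NO:177
right_primer = "GCDATYTCNGCRATRTACATSGG"   # SEQ ID NO:179 (reverse strand)

left_motif = translate_degenerate(left_primer)
right_motif = translate_degenerate(reverse_complement(right_primer))
print(left_motif, right_motif)
```

Read this way, the complete codons of the left primer give AAQEGIV (the trailing TT being the start of a phenylalanine codon) and the reverse complement of the right primer gives PMYIAEI followed by a partial alanine codon, in agreement with the motifs stated for SEQ ID NO:178 and SEQ ID NO:180.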


5.4 Example 3: XI Sequence Analysis

Plasmids from both ABD and SBD screens were purified and vector inserts were sequenced using an ABI 3730xl DNA Analyzer and ABI BigDye® v3.1 cycle sequencing chemistry. Identification of putative ORFs was done by identifying start and stop codons for the longest protein coding region, and subsequent manual curation based on homology to published xylose isomerase DNA sequences. The XI ORFs identified are set forth in Table 2 below, which indicates the sequences and source organism classification for each XI determined from either the ABD or SBD libraries as well as their assigned sequence identifiers. The putative catalytic domains (based on sequence alignments with other XIs) are underlined.
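The first step of the ORF identification described above (finding the longest region bounded by a start and a stop codon) can be sketched in Python as follows. This is a simplified illustration, not the tool actually used: it considers only ATG as a start codon, scans all six reading frames, expects unambiguous upper-case A/C/G/T input, and does not attempt the manual homology-based curation. The toy input is built from the first nine codons of SEQ ID NO:1 with added flanking bases and a stop codon.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of an unambiguous DNA sequence."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def longest_orf(dna):
    """Return (strand, start, end, protein) for the longest ATG-to-stop ORF.

    Coordinates are 0-based on the scanned strand; end includes the stop codon.
    Returns None if no complete ORF is found.
    """
    best = None
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if best is None or (i + 3 - start) > (best[2] - best[1]):
                        best = (strand, start, i + 3, translate(seq[start:i]))
                    start = None
    return best


# Toy insert: 2 flanking bases + first 9 codons of SEQ ID NO:1 + TAA + 2 bases.
toy_insert = "CC" + "ATGGCAGTTAAAGAATATTTCCCGGAG" + "TAA" + "GG"
orf = longest_orf(toy_insert)
print(orf)
```

Translating a candidate ORF in this way also gives a simple consistency check between a DNA entry and its paired amino acid entry in a sequence table.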













TABLE 2

Clone No.; Class of organism; Type of Sequence; SEQ ID NO:; Sequence







1754MI2_001; Bacteroidales; DNA; SEQ ID NO: 1
ATGGCAGTTAAAGAATATTTCCCGGAGATAGGCAAGATCGCCTTTGAAGGAAAGGAGTCC



AAGAACCCTATGGCATTCCACTACTACAATCCAGAGCAGGTAGTAGCCGGAAAGAAAATG






AAAGATTGGTTCAAGTTCGCTATGGCATGGTGGCACACCCTCTGCGCTGAAGGTGGCGAC






CAGTTCGGTCCTGGTACCAAGAAATTCCCTTGGAACACAGGTGCAACTGCACTCGAAAGA






GCAAAGAACAAAATGGACGCAGGTTTCGAGATCATGAGCAAGCTCGGTATCGAGTATTTC






TGCTTCCACGATGTTGACCTTATCGACGAGGCTGACACTGTTGAAGAGTACGAGGCTAAC






ATGAAGGCTATCACAGCTTACGCAAAGGAGAAAATGGCCGCTACTGGCATCAAACTCCTC






TGGGGAACAGCCAATGTATTCGGCAACAAGAGATATATGAACGGCGCTTCTACCAACCCT






GACTTCAACGTGGCTGCACGCGCTATGCTCCAGATCAAGAACGCTATCGACGCAACTATC






GCTCTCGGTGGTGACTGCTATGTATTCTGGGGCGGCCGTGAGGGTTACATGAGCCTTCTC






AACACCGATATGAAGAGAGAGAAAGAGCACATGGCTACCATGCTTACCATGGCACGCGAC






TATGCTCGTTCTAAGGGCTTCAAGGGTACCTTCCTTATCGAGCCTAAGCCAATGGAGCCG






ATGAAGCACCAGTACGATGTCGATACTGAGACTGTCGTAGGTTTCCTCCGCGCCCATGGT






CTTGACAAGGACTTCAAGGTAAACATCGAGGTTAACCACGCTACTCTCGCAGGCCACACC






TTCGAGCACGAGCTCCAGTGCGCCGTTGACGCAGGCATGCTCGGAAGCATCGACGCCAAC






CGTGGTGACTACCAGAACGGCTGGGATACCGACCAGTTCCCTATCGACCTCTATGAGCTC






GTACAGGCTATGATGGTTATCATCAAGGGCGGCGGTCTCGTCGGCGGTACCAACTTCGAC






GCCAAGACCCGTCGTAACTCAACAGACCTCGAGGATATCTTCATCGCTCATGTATCCGGC






ATGGATGTCATGGCACGCGCTCTCCTCATCGCTGCTGACCTTCTCGAGAAATCTCCTATT






CCTGCAATGGTCAAGGAGCGTTACGCTTCCTACGACTCAGGCATGGGCAAGGACTTCGAG






AACGGCAAGCTTACTCTCGAGCAGGTTGTCGATTTCGCAAGAAAGAACGGCGAGCCTAAG






AGCACCAGCGGAAAGCAGGAGCTCTACGAGTCTATCGTCAATCTCTACATCTAA





1754MI2_001; Bacteroidales; Amino Acid; SEQ ID NO: 2
MAVKEYFPEIGKIAFEGKESKNPMAFHYYNPEQVVAGKKMKDWFKFAMAWWHTLCAEGGD


QFGPGTKKFPWNTGATALERAKNKMDAGFEIMSKLGIEYFCFHDVDLIDEADTVEEYEAN








MKAITAYAKEKMAATGIKLLWGTANVFGNKRYMNGASTNPDFNVAARAMLQIKNAIDATI








ALGGDCYVFWGGREGYMSLLNTDMKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEP








MKHQYDVDTETVVGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELQCAVDAGMLGSIDAN








RGDYQNGWDTDQFPIDLYELVQAMMVIIKGGGLVGGTNFDAKTRRNSTDLEDIFIAHVSG








MDVMARALLIAADLLEKSPIPAMVKERYASYDSGMGKDFENGKLTLEQVVDFARKNGEPK







STSGKQELYESIVNLYI





5586MI6_004; Bacteroidales; DNA; SEQ ID NO: 3
ATGGCAAACAAAGAGTACTTCCCGGAGATCGGGAAAATCAAATTCGAAGGCAAGGATTCC



AAGAACCCGCTTGCATTCCATTATTACAATCCTGAGCAGGTCGTCTGCGGCAAGCCGATG






AAGGACTGGCTCAAGTTCGCTATGGCATGGTGGCACACCCTCTGCGCAGAGGGTAGCGAC






CAGTTCGGCGGACCCACCAAGTCATTCCCTTGGAACAAAGCTTCGGATCCCATCGCAAAG






GCCAAGCAGAAAGTCGACGCCGGTTTCGAGATCATGCAGAAGCTCGGTATCGGATACTAT






TGCTTCCACGATGTAGACCTCATCGACGAGCCCGCCACCATCGAGGAGTATGAGGCCGAT






CTCAAGGAGATCGTCGCTTACCTCAAGGAGAAGCAGGCCCAGACCGGCATCAAGCTCCTT






TGGGGCACCGCCAACGTCTTCGGTCACAAGCGGTACATGAACGGCGCCTCCACCAACCCT






GATTTCGACGTCGCAGCCCGCGCCATGGTCCAGATCAAGAACGCCATGGACGCCACCATC






GAGCTCGGCGGCGAGTGCTATGTCTTCTGGGGCGGCCGCGAGGGCTACATGAGCCTCCTC






AACACCGACATGAAGCGTGAGAAGCAGCATATGGCCACCATGCTCGGCATGGCCCGCGAC






TATGCACGCGGCAAGGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCG






ACCAAGCACCAGTATGACGTCGACACCGAGACCGTCATCGGTTTCCTCCGTGCCAACGGT






CTTGACAAGGACTTCAAGGTCAACATCGAGGTCAATCACGCCACCCTCGCCGGCCACACC






TTCGAGCATGAGCTCCAGTGCGCCGCCGATGCCGGTCTCCTCGGATCCATCGACGCCAAC






CGCGGCGACTATCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACCTCTATGAGCTC






ACCCAGGCCATGATGGTCATCCTCAAGAATGGCGGCCTCGTCGGCGGTACCAACTTCGAC






GCCAAGACCCGTCGCAACTCCACCGACCTGGACGACATCATCATCGCCCACGTCAGCGGT






ATGGACATCATGGCACGCGCACTCCTCGTCGCTGCCGACGTCCTCACCAAGTCCGAGCTT






CCCAAGATGCTCAAGGAGCGTTACGCTTCCTTCGACTCCGGCAAGGGCAAGGAGTTCGAA






GAGGGCAAGCTCACTCTCGAGCAGGTCGTAGAGTACGCCAAGACCAAGGGCGAGCCCAAG






GCCACCAGCGGCAAGCAGGAGCTCTACGAGACCATCGTCAACATGTACATCTAA





5586MI6_004; Bacteroidales; Amino Acid; SEQ ID NO: 4
MANKEYFPEIGKIKFEGKDSKNPLAFHYYNPEQVVCGKPMKDWLKFAMAWWHTLCAEGSD


QFGGPTKSFPWNKASDPIAKAKQKVDAGFEIMQKLGIGYYCFHDVDLIDEPATIEEYEAD








LKEIVAYLKEKQAQTGIKLLWGTANVFGHKRYMNGASTNPDFDVAARAMVQIKNAMDATI








ELGGECYVFWGGREGYMSLLNTDMKREKQHMATMLGMARDYARGKGFKGTFLIEPKPMEP








TKHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELQCAADAGLLGSIDAN








RGDYQNGWDTDQFPIDLYELTQAMMVILKNGGLVGGTNFDAKTRRNSTDLDDIIIAHVSG








MDIMARALLVAADVLTKSELPKMLKERYASFDSGKGKEFEEGKLTLEQVVEYAKTKGEPK







ATSGKQELYETIVNMYI





5749MI1_003; Bacteroidales; DNA; SEQ ID NO: 5
ATGAATTTTTATAAAGGCGAAAAAGAATTCTTCCCCGGAATAGGAAAGATTCAGTTTGAA



GGACGCGAGTCAAAGAACCCGATGGCGTTTCATTATTATGACGAAAACAAGGTGGTGATG






GGTAAAACACTGAAGGATCATCTTCGTTTTGCAATGGCTTACTGGCATACGCTTTGTGCC






GAAGGGGGCGACCAGTTTGGCGGTGGTACGAAAACATTCCCCTGGAATGCTGCTGCCGAC






CCGATCAGCCGTGCCAAATATAAGATGGATGCAGCGTTCGAGTTTATGACAAAATGCAGC






ATCCCTTATTACTGTTTCCATGATGTGGACGTGGTGGACGAAGCTCCCACGCTGGCTCAG






TTTGAAAAAGACCTTCATACGATGGTAGGCCATGCCAAAGGGCTTCAGCAGGCAACCGGA






AAAAAACTGTTATGGTCTACTGCCAACGTGTTCAGCAACAAACGCTATATGAACGGGGCT






GCCACTAATCCTGACTTCTCGGCCGTGGCTTGTGCCGGTACGCAGATCAAGAATGCGATC






GATGCCTGTATCGCGCTGGACGGTGAAAACTATGTGTTCTGGGGCGGACGTGAAGGATAT






ATGGGCTTGCTCAATACCGATATGAAACGCGAAAAAGACCATCTGGCCATGATGCTGACG






ATGGCACGCGACTATGGCCGCAAGAACGGTTTCAAAGGTACTTTCCTGATCGAGCCGAAA






CCGATGGAACCGACCAAGCATCAATATGATGTCGACTCGGAAACTGTAATCGGCTTCCTA






CGTCATTATGGCCTGGATAAAGACTTCGCCCTGAATATCGAAGTAAATCATGCAACCCTG






GCCGGACATACGTTCGAGCACGAATTGCAGGCTGCTGTCGATGCCGGTATGCTGTGCAGT






ATCGATGCCAACCGTGGTGACTACCAGAATGGCTGGGATACCGACCAATTCCCGATGGAC






ATCTACGAACTGACTCAGGCTTGGCTGGTCATTCTGCAAGGTGGTGGTCTGACAACCGGC






GGAACGAACTTCGATGCCAAGACCCGCCGCAACTCGACCGACCTGGACGATATCTTCCTG






GCTCATATAGGTGGTATGGATGCGTTTGCCCGTGCCCTGATCACGGCTGCTGCCATCCTT






GAAAACTCCGATTACACGAAGATGCGTGCCGAACGTTACACCAGCTTCGATGGTGGCGAA






GGCAAAGCGTTTGAAGACGGTAAACTTTCTCTGGAAGACCTGCGTACGATCGCTCTCCGC






GACGGAGAACCGAAGATGGTCAGCGGCAAACAGGAATTATATGAGATGATTCTCAATTTA






TACATATAA





5749MI1_003; Bacteroidales; Amino Acid; SEQ ID NO: 6
MNFYKGEKEFFPGIGKIQFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA


EGGDQFGGGTKTFPWNAAADPISRAKYKMDAAFEFMTKCSIPYYCFHDVDVVDEAPTLAQ








FEKDLHTMVGHAKGLQQATGKKLLWSTANVFSNKRYMNGAATNPDFSAVACAGTQIKNAI








DACIALDGENYVFWGGREGYMGLLNTDMKREKDHLAMMLTMARDYGRKNGFKGTFLIEPK








PMEPTKHQYDVDSETVIGFLRHYGLDKDFALNIEVNHATLAGHTFEHELQAAVDAGMLCS








IDANRGDYQNGWDTDQFPMDIYELTQAWLVILQGGGLTTGGTNFDAKTRRNSTDLDDIFL








AHIGGMDAFARALITAAAILENSDYTKMRAERYTSFDGGEGKAFEDGKLSLEDLRTIALR







DGEPKMVSGKQELYEMILNLYI





5750MI1_003; Bacteroidales; DNA; SEQ ID NO: 7
ATGAATTACTTTAAAGGTGAGAAAGAGTTCTTCCCGGGAATCGGGAAAATAGAGTTTGAA



GGACGTGAATCGAAGAATCCGATGGCTTTTCATTACTATGACGAGAACAAGGTTGTCATG






GGGAAGACCTTGAAGGACCATCTGCGTTTTGCGATGGCTTATTGGCATACGCTGTGTGCG






GAAGGCGCCGACCAGTTCGGCGGCGGGACGAAGGCATTTCCCTGGAATACCGGGGCGGAT






CGTATTTCCCGTGCCAAGTATAAGATGGATGCTGCTTTTGAGTTTATGACGAAATGTAAC






ATCCCGTACTATTGTTTCCATGATGTGGATGTGGTGGATGAAGCTCCGACACTGGCCGAA






TTTGAAAAAGACTTGCATACGATGGTCGAATATGCCAAGCAGCATCAGGAGGCAACCGGG






AAAAAACTGTTGTGGTCTACCGCCAATGTGTTCAGCAATAAACGTTATATGAACGGGGCT






GCCACAAATCCGTATTTCCCTGCTGTCGCTTGTGCGGGTACGCAGATCAAGAATGCTATC






GACGCTTGTATTGCCCTGGGCGGCGAAAACTATGTGTTCTGGGGCGGTCGTGAAGGGTAT






ATGAGCTTGTTGAACACCAATATGAAACGCGAAAAGGAACATCTCGCCATGATGTTGACG






ATGGCTCGCGATTATGCGCGTAAGAACGGCTTCAAAGGTACTTTCCTGGTAGAGCCTAAA






CCGATGGAACCGACCAAACATCAGTATGATGTGGACACAGAAACTGTTATCGGCTTCCTG






CGTCATTACGGCCTTGACAAGGACTTTGCCATCAACATCGAAGTGAATCATGCTACATTG






GCTGGACATACATTCGAACATGAGCTTCAGGCGGCTGCCGATGCCGGTATGCTGTGCAGC






ATCGACGCCAACCGCGGCGATTACCAGAATGGTTGGGACACGGATCAGTTCCCGGTCGAC






ATCTACGAACTGACACAGGCGTGGCTGGTTATCCTCGAAGCGGGTGGCCTGACTACCGGT






GGTACGAACTTCGACGCCAAGACGCGCCGCAACTCGACTGACCTGGACGATATCTTCCTG






GCACACATCGGTGGTATGGATTCGTTTGCCCGTGCTTTGATGGCGGCTGCCGATATATTG






GAACACTCCGATTACAAAAAGATGCGTGCCGAACGTTATGCCAGCTTCGATCAAGGCGAC






GGCAAGAAGTTCGAAGATGGTAAACTCCTTCTCGAGGACCTCCGCACCATCGCTCTTGCC






TCCGGCGAACCGAAGCAAATCAGCGGGAAACAGGAATTGTATGAAATGATTATCAACCAG






TACATTTAA





5750MI1_003; Bacteroidales; Amino Acid; SEQ ID NO: 8
MNYFKGEKEFFPGIGKIEFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA


EGADQFGGGTKAFPWNTGADRISRAKYKMDAAFEFMTKCNIPYYCFHDVDVVDEAPTLAE








FEKDLHTMVEYAKQHQEATGKKLLWSTANVFSNKRYMNGAATNPYFPAVACAGTQIKNAI








DACIALGGENYVFWGGREGYMSLLNTNMKREKEHLAMMLTMARDYARKNGFKGTFLVEPK








PMEPTKHQYDVDTETVIGFLRHYGLDKDFAINIEVNHATLAGHTFEHELQAAADAGMLCS








IDANRGDYQNGWDTDQFPVDIYELTQAWLVILEAGGLTTGGTNFDAKTRRNSTDLDDIFL








AHIGGMDSFARALMAAADILEHSDYKKMRAERYASFDQGDGKKFEDGKLLLEDLRTIALA







SGEPKQISGKQELYEMIINQYI





5750MI2_003; Bacteroidales; DNA; SEQ ID NO: 9
ATGAATTATTTTAAAGGTGAAAAAGAGTTTTTCCCTGGAATCGGGAAAATAGAGTTTGAA



GGACGTGAGTCGAAGAATCCGATGGCTTTTCATTATTATGATGAAAACAAGGTCGTAATG






GGCAAGACCTTGAAAGATCACCTCCGCTTTGCAATGGCTTACTGGCATACGTTGTGCGCG






GAAGGCGCAGACCAGTTTGGCGGTGGCACAAAATCATTCCCCTGGAATACCGCAGCGGAT






CGTATTTCCCGCGCTAAATATAAAATGGATGCTGCTTTCGAGTTTATGACCAAGTGCAGT






ATCCCGTACTATTGTTTCCATGATGTGGACGTGGTGGACGAAGCTCCGGCACTGGCCGAA






TTTGAAAAGGACCTGCATACGATGGTGGGATTCGCCAAACAACACCAGGAAGCAACCGGA






AAGAAACTGTTGTGGTCTACAGCCAATGTATTCGGGCATAAACGTTATATGAACGGAGCG






GCTACCAATCarTATTTCCCGGCTGTCGCTTGTGCCGGTACGCAGATCAAGAATGCAATC






GACGCCTGTATCGAGCTGGGTGGAGAGAACTATGTATTCTGGGGCGGACGCGAAGGCTAC






ATGAGCCTGCTGAACACCAATATGAAACGTGAAAAGGATCATTTGGCCATGATGCTGACA






ATGGCACGCGATTATGCCCGCAAGAATGGTTTCAAGGGTACTTTCCTGGTGGAATCTAAG






CCGATGGAACCGACCAAACATCAGTATGACGCAGATACGGAAACCGTGATCGGCTTCCTG






CGCCACTATGGCCTCGACAAGGATTTCGCTATCAACATTGAAGTGAACCATGCTACATTG






GCCGGCCATACATTCGAACATGAACTTCAGGCTGCTGCCGATGCCGGTATGCTGTGCAGC






ATCGATGCAAATAGAGGCGACTATCAGAATGGTTGGGATACGGATCAGTTCCCCGTAGAC






ATTTACGAACTGACACAGGCCTGGCTGGTTATCCTGGAAGCGGGCGGACTGACAACCGGA






GGTACGAACTTCGATGCGAAGACCCGTCGTAACTCGACTGACCTCGACGATATCTTCCTG






GCCCATATCGGCGGTATGGATTCGTTTGCACGTGCCTTGATGGCAGCTGCCGATATCCTG






GAACATTCTGATTACAAGAAGATGCGTGCCGAACGTTACGCCAGCTTCGACCAGGGCGAC






GGCAAGAAGTTCGAAGACGGCAAACTCCTTCTCGAAGACCTGCGCACAATTGCCCTTGCC






GGCGACGAACCGAAGCAGATCAGCGGCAAGCAGGAGTTGTATGAGATGATTATCAATCAG






TATATTTAA





5750MI2_003; Bacteroidales; Amino Acid; SEQ ID NO: 10
MNYFKGEKEFFPGIGKIEFEGRESKNPMAFHYYDENKVVMGKTLKDHLRFAMAYWHTLCA


EGADQFGGGTKSFPWNTAADRISRAKYKMDAAFEFMTKCSIPYYCFHDVDVVDEAPALAE








FEKDLHTMVGFAKQHQEATGKKLLWSTANVFGHKRYMNGAATNPYFPAVACAGTQIKNAI








DACIELGGENYVFWGGREGYMSLLNTNMKREKDHLAMMLTMARDYARKNGFKGTFLVESK








PMEPTKHQYDADTETVIGFLRHYGLDKDFAINIEVNHATLAGHTFEHELQAAADAGMLCS








IDANRGDYQNGWDTDQFPVDIYELTQAWLVILEAGGLTTGGTNFDAKTRRNSTDLDDIFL








AHIGGMDSFARALMAAADILEHSDYKKMRAERYASFDQGDGKKFEDGKLLLEDLRTIALA







GDEPKQISGKQELYEMIINQYI





5586MI5_004; Bacteroides; DNA; SEQ ID NO: 11
ATGAAACAGTATTTCCCGAACATCTCCGCCATCAAGTTTGAGGGCGTCGAGAGCAAGAAT



CCCCTGGCTTACCGCTACTACGACCGCGACCGCGTCGTCATGGGTAAGAAGATGAGCGAA






TGGTTTAAGTTCGCTATGTGCTGGTGGCACACCCTCTGCGCCGAGGGCTCCGATCAGTTC






GGTCCCGGCACAAAGACCTTCCCCTGGAACGCCGCCGCCGACCCCGTGCAGGCTGCCAAG






GACAAGGCCGACGCTGGCTTCGAGATCATGCAGAAACTCGGCATCGAGTACTACTGCTTC






CACGACGTTGACCTCGTGGCCGAGGCTCCCGACGTGGAGACCTACGAGAAGAACCTCAAG






GAGATCGTGGCTTATCTCAAGCAGAAACAGGCTGAGACGGGCATCAAGCTGCTCTGGGGC






ACTGCCAACGTCTTCGGACACAAGCGCTACATGAACGGAGCCTCCACGAACCCCGACTTC






GATGTCGTGGCACGCGCTATCGTGCAGATCAAGAACGCCATCGATGCTACCATCGAGCTG






GGCGGCACCAACTACGTCTTCTGGGGCGGTCGCGAAGGCTACATGAGCCTGCTCAACACC






GATATGAAGCGCGAGAAGGAGCACATGGCTACGATGTTGACGATGGCACGCGACTATGCC






CGTTCTAAGGGATTCAAGGGCACGTTCCTCATCGAACCCAAACCCATGGAACCCACGAAG






CATCAGTACGATGCGGACACCGAGACGGTCATCGGATTCCTCCGTGCTCATGGTCTCGAC






AAGGATTTCAAGGTCAACATCGAGGTCAACCACGCCACGCTGGCCGGACACACGTTCGAG






CATGAGCTGGCCTGCGCCGTAGACGCCGATATGCTCGGCAGCATCGATGCCAATCGCGGC






GACTATCAGAACGGATGGGACACCGACCAGTTCCCCATCGACCACTACGAACTCACGCAG






GCTATGCTGCAGATCATCCGCAACGGAGGTTTCAAGGACGGTGGCACCAATTTTGACGCT






AAGACGCGCCGCAACAGCACCGACCTCGAGGATATCTTCATCGCTCACGTAGCAGCCATG






GACGCCATGGCCCACGCCCTGTTGTCGGCTGCCGATATCATCGAGAAGTCGCCCATCTGC






ACGATGGTCAAGGAGCGTTACGCCAGCTTCGATGCCGGCGAAGGCAAGCGCTTCGAAGAA






GGCAAGATGACCCTCGAGGAAGCCTACGAGTATGGCAAGAAGGTCGGGGAGCCCAAGCAG






ACCAGCGGAAAGCAGGAGCTCTACGAAGCCATTGTCAATATGTATTGA





5586MI5_004, Bacteroides, Amino Acid, SEQ ID NO: 12
MKQYFPNISAIKFEGVESKNPLAYRYYDRDRVVMGKKMSEWFKFAMCWWHTLCAEGSDQF


GPGTKTFPWNAAADPVQAAKDKADAGFEIMQKLGIEYYCFHDVDLVAEAPDVETYEKNLK








EIVAYLKQKQAETGIKLLWGTANVFGHKRYMNGASTNPDFDVVARAIVQIKNAIDATIEL








GGTNYVFWGGREGYMSLLNTDMKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPTK








HQYDADTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDADMLGSIDANRG








DYQNGWDTDQFPIDHYELTQAMLQIIRNGGFKDGGTNFDAKTRRNSTDLEDIFIAHVAAM








DAMAHALLSAADIIEKSPICTMVKERYASFDAGEGKRFEEGKMTLEEAYEYGKKVGEPKQ







TSGKQELYEAIVNMY





5586MI202_004, Bacteroides, DNA, SEQ ID NO: 13
ATGGCAACAAAAGAGTATTTTCCCGGAATAGGAAAGATTAAATTCGAAGGTAAAGAGAGT



ATGAACCCGATGGCATATCGTTACTACGATGCTGAGAAGGTAATCATGGGTAAGAAGATG






AAAGATTGGTTGAAGTTTGCTATGGCTTGGTGGCACACTCTCTGCGCAGAAGGTGGTGAC






CAATTCGGTGGCGGAACGAAACAATTCCCTTGGAATGGTGACTCTGACGCTTTGCAAGCA






GCTAAAAATAAATTGGATGCAGGTTTCGAATTCATGCAGAAGATGGGTATCGAATACTAT






TGCTTCCACGATGTAGACCTGATTTCTGAAGGTGCAAGCATCGAAGAATACGAAGCTAAC






TTGAAAGCTATCGTAGCTTATGCAAAAGAAAAACAGGCTGAAACTGGTATCAAGCTGTTG






TGGGGTACTGCTAACGTATTCGGTCATGCACGTTATATGAACGGTGCTGCTACCAATCCT






GATTTCGACGTTGTAGCACGCGCTGCTGTTCAGATCAAGAACGCTATTGACGCTACTATC






GAACTGGGTGGTTCAAACTATGTATTCTGGGGCGGTCGCGAAGGTTACATGTCTTTGCTG






AACACTGACCAGAAACGTGAAAAAGAACACCTTGCAAAGATGTTGACTATCGCTCGTGAC






TATGCACGTGCTCGTGGCTTCAAAGGTACTTTCCTGATTGAGCCGAAACCGATGGAACCG






ACAAAACATCAGTATGATGTAGATACTGAAACAGTTATCGGCTTCCTGAAAGCTCACGGT






TTGGATAAGGATTTCAAAGTAAACATCGAGGTTAATCACGCAACTTTGGCTGGCCATACT






TTCGAACACGAACTGGCTGTAGCTGTTGACAACGGCATGTTAGGTTCTATCGACGCTAAC






CGTGGTGACTACCAGAACGGTTGGGATACTGACCAATTCCCTATCGATAACTACGAACTG






ACTCAAGCTATGATGCAGATCATCCGCAACGGTGGTTTGGGTAATGGCGGTACTAACTTC






GACGCTAAGACCCGTCGTAACTCTACCGACCTGGAAGATATCTTCATCGCTCACATTGCA






GGTATGGATGCTATGGCACGTGCTCTGGAAAGTGCAGCTAAATTACTGGAAGAATCTCCT






TATAAGAAAATGTTGGCTGATCGTTACGCATCATTCGACGGTGGCAAGGGTAAGGAATTC






GAAGAAGGCAAATTGTCTTTGGAAGATGTTGTAGCTTATGCGAAAGCTAACGGCGAACCG






AAGCAAACCAGCGGCAAGCAAGAATTGTATGAAGCAATCGTGAATATGTATTGCTAA





5586MI202_004, Bacteroides, Amino Acid, SEQ ID NO: 14
MATKEYFPGIGKIKFEGKESMNPMAYRYYDAEKVIMGKKMKDWLKFAMAWWHTLCAEGGD


QFGGGTKQFPWNGDSDALQAAKNKLDAGFEFMQKMGIEYYCFHDVDLISEGASIEEYEAN








LKAIVAYAKEKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKNAIDATI








ELGGSNYVFWGGREGYMSLLNTDQKREKEHLAKMLTIARDYARARGFKGTFLIEPKPMEP








TKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN








RGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIA








GMDAMARALESAAKLLEESPYKKMLADRYASFDGGKGKEFEEGKLSLEDVVAYAKANGEP







KQTSGKQELYEAIVNMYC





5586MI211_003, Bacteroides, DNA, SEQ ID NO: 15
ATGGCAAAAGAGTATTTTCCTGGCGTGAAAAAAATCCAGTTCGAGGGTAAGGACAGTAAG



AATCCAATGGCTTACCGTTATTATGATGCAGAGAAGGTCATCATGGGTAAGAAGATGAAG






GATTGGTTGAAGTTCGCTATGGCTTGGTGGCACACTTTGTGCGCTGAGGGCGCAGACCAG






TTCGGTGGCGGTACTAAGACTTTCCCTTGGAACGAAGGTGCAAACGCTTTGGAAGTTGCT






AAGAATAAGGCTGATGCTGGTTTCGAGATTATGGAGAAGCTTGGCATCGAGTACTACTGT






TTCCACGATGTAGACCTCGTTGAGGAGGCTGCAACTATCGAGGAGTATGAGGCTAACATG






AAGGCTATCGTTGCTTATCTTAAGGAGAAGCAGGCTGCTACTGGCAAGAAGCTTCTTTGG






GGTACTGCTAACGTATTCGGCAACAAGCGCTATATGAACGGTGCTTCTACAAACCCTGAC






TTCGACGTTGTTGCTCGCGCTTGTGTTCAGATTAAGAACGCTATCGACGCTACTATCGAA






CTTGGTGGTACAAACTACGTATTCTGGGGTGGCCGCGAGGGTTATATGAGCCTTCTTAAC






ACAGATATGAAGCGTGAGAAGGAGCACATGGCAACTATGCTTACTAAGGCTCGCGACTAC






GCTCGTTCAAAGGGCTTTACTGGTACATTCCTTATCGAGCCAAAGCCAATGGAACCATCA






AAGCATCAGTATGATGTTGATACTGAGACTGTTTGTGGTTTCTTGAGGGCTCACGGTCTT






GACAAGGACTTCAAGGTAAACATCGAGGTTAACCACGCTACTTTGGCTGGTCACACATTC






GAGCACGAGTTGGCTGCTGCTGTTGATAACGGTATGCTTGGCTCTATCGACGCTAACCGC






GGTGACTACCAGAACGGTTGGGATACTGACCAGTTCCCTATCGACAACTTCGAGCTTATT






CAGGCTATGATGCAGATTATCCGCAACGGTGGTCTTGGCAACGGTGGTACAAACTTCGAC






GCTAAGACTCGTCGTAACTCAACTGACCTTGAGGATATCTTCATCGCACACATCGCTGGT






ATGGATGCAATGGCTCGCGCTCTTGAGAACGCAGCAGACCTTTTGGAGAACTCTCCAATC






AAGAAGATGGTTGCTGAGCGTTACGCTTCATTCGACAGCGGCAAGGGTAAGGAGTTCGAG






GAAGGCAAGTTGAGCCTTGGGGACATCGTTGCTTATGCTAAGCAGAACGGTGAGCCTAAG






CAGACAAGCGGTAAGCAGGAGCTTTACGAGGCTATCGTAAACATGTACTGCTAA





5586MI211_003, Bacteroides, Amino Acid, SEQ ID NO: 16
MAKEYFPGVKKIQFEGKDSKNPMAYRYYDAEKVIMGKKMKDWLKFAMAWWHTLCAEGADQ


FGGGTKTFPWNEGANALEVAKNKADAGFEIMEKLGIEYYCFHDVDLVEEAATIEEYEANM








KAIVAYLKEKQAATGKKLLWGTANVFGNKRYMNGASTNPDFDVVARACVQIKNAIDATIE








LGGTNYVFWGGREGYMSLLNTDMKREKEHMATMLTKARDYARSKGFTGTFLIEPKPMEPS








KHQYDVDTETVCGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELAAAVDNGMLGSIDANR








GDYQNGWDTDQFPIDNFELIQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIAG








MDAMARALENAADLLENSPIKKMVAERYASFDSGKGKEFEEGKLSLGDIVAYAKQNGEPK







QTSGKQELYEAIVNMYC





5606MI1_005, Bacteroides, DNA, SEQ ID NO: 17
ATGGCGACAAAAGAATACTTTCCCGGAATAGGGAAAATCAAGTTTGAGGGTGTGAATAGC



TATAATCCGCTGGCATACAGATATTACGATGCCGAGCGCATAGTCCTTGGCAAGCCGATG






AAGGAGTGGCTCAAGTTTGCCATGGCATGGTGGCACACACTCTGCGCAGAGGGTGGCGAC






CAGTTTGGCGGCGGTACGAAGAATTTTCCCTGGAATGGAGATCCCGATCCGGTACAGGCC






GCAAAAAACAAAGTAGACGCCGGCTTCGAATTCATGACCAAGATGGGAATAGAGTATTTC






TGTTTCCACGACGTGGATCTCGTCAGCGAGGCAGCAACCATCGAGGAGTATGAGGCCAAC






CTGAAGGAAGTGGTGGGCTACATCAAGGAAAAGCAGGCCGAGACGGGGATCAAAAACCTC






TGGGGCACTGCCAACGTGTTCAGCCACGCGCGCTACATGAACGGAGCCGCCACCAACCCC






GACTTCGATGTAGTGGCCCGCGCAGCCGTGCAGATCAAGAATGCTATCGACGCCACGATA






GCCTTAGGTGGCACCAACTACGTGTTCTGGGGTGGCCGTGAAGGTTACATGAGCCTGCTC






AACACCGACCAGAAGCGCGAGAAGGAGCATCTGGCAATGATGCTCCGCATGGCCCGCGAC






TATGCGCGTGCAAAAGGCTTCACCGGCACCTTCCTTATCGAGCCCAAGCCGATGGAGCCC






ACCAAGCACCAGTATGATGTAGACACCGAGACTGTGATAGGCTTCCTCCGTGCCCACGGC






CTCGACAAGGACTTCAAGGTCAACATAGAGGTGAACCACGCCACCCTGGCCGGCCATACC






TTCGAGCATGAGCTGGCAGTGGCCGTGGACAACGGTATGCTCGGCAGCATCGACGCCAAC






CGCGGTGACTACCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTG






ACCCAGGCCATGATGCAGATAATACGCAACGGCGGCTTCGGCAACGGCGGATGCAACTTC






GACGCCAAGACACGCCGCAACTCCACCGACCTGGAGGATATCTTCATAGCCCACATAGCA






GGCATGGACGCCATGGCCCGCGCCCTGCTCAGCGCAGCAGAAGTGCTGGAGAAATCGCCC






TACAGGAAGATGCTCGCCGAGCGCTACGCACCGTTTGATGCCGGCCAGGGAAAGGCATTT






GAAGAGGGCGCAATGTCGCTCACCGACCTTGTGGAGTATGCCAAGGAGCATGGCGAGCCC






ACACAGACTTCCGGCAAGCAGGAACTCTATGAGGCAATCGTCAATATGTATTGCTAA





5606MI1_005, Bacteroides, Amino Acid, SEQ ID NO: 18
MATKEYFPGIGKIKFEGVNSYNPLAYRYYDAERIVLGKPMKEWLKFAMAWWHTLCAEGGD


QFGGGTKNFPWNGDPDPVQAAKNKVDAGFEFMTKMGIEYFCFHDVDLVSEAATIEEYEAN








LKEVVGYIKEKQAETGIKNLWGTANVFSHARYMNGAATNPDFDVVARAAVQIKNAIDATI








ALGGTNYVFWGGREGYMSLLNTDQKREKEHLAMMLRMARDYARAKGFTGTFLIEPKPMEP








TKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN








RGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGFGNGGCNFDAKTRRNSTDLEDIFIAHIA








GMDAMARALLSAAEVLEKSPYRKMLAERYAPFDAGQGKAFEEGAMSLTDLVEYAKEHGEP







TQTSGKQELYEAIVNMYC





5606MI2_003, Bacteroides, DNA, SEQ ID NO: 19
ATGGCAACAAAGGAATATTTTCCCCATATAGGGAAGATCCAGTTCAAAGGCACGGAATCG



TACGATCCGATGTCGTATCGTTACTATGACGCCGAGCGCGTAGTTCTGGGCAAGCCCATG






AAGGAATGGCTGAAATTCGCCATGGCATGGTGGCACACATTGTGCGCCGAGGGCGGCGAC






CAGTTCGGCGGCGGAACGAAGAAGTTCCCCTGGAACGAGGGCGAGGACGCCATGACCATC






GCCAAGCAGAAGGCTGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATCGAGTATTTC






TGCTTCCACGACATCGACCTGATCGGCGACCTGGGCGACGACATCGAGGACTATGAGAAC






CGTATGCACGAAATCACCGCACACCTGAAGGAGAAGATGGCCGCCACGGGCATCAAGAAC






CTGTGGGGCACTGCCAACGTGTTCGGCCACGCACGCTATATGAACGGCGCCGCCACCAAC






CCCGACTTCGACGTTGTGGCACGCGCATGTGTGCAGATCAAGAACGCCATCGACGCCACC






ATCGCTCTAGGCGGTACAAACTATGTATTCTGGGGCGGCCGCGAGGGCTACATGAGCCTG






CTGAACACCGACCAGAAGCGCGAGAAAGAGCACTTGGCTACCATGCTGACCATGGCACGC






GACTATGCCCGCGCCAATGGCTTCACCGGAACGTTCCTGATCGAGCCCAAACCCATGGAG






CCCAGCAAGCATCAGTATGATGTGGATACCGAGACCGTAATCGGCTTCCTGAAGGCCCAC






AACCTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCATGCCACTCTGGCCGGCCAC






ACATTCGAGCATGAGCTGGCAGTAGCCGTGGACAACGGCATGCTGGGCAGCATCGACGCC






AACCGCGGCGACTATCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTATGAG






CTGACCCAGGCCATGATGCAGATAATCCGCAACGGTGGCCTCGGCAACGGCGGTACCAAC






TTCGACGCCAAGACACGTCGCAACTCCACCGACCTGGACGACATCTTCATCGCTCACATC






GCCGGTATGGACGCTATGGCCCGCGCTCCGCTCAGCGCAGCCGACGTGCTTGAGAAGTCG






CCTTACAAGAAGATGCTGGCCGACCGCTACGCTTCATTCGACAGCGGCGAGGGCAAGAAG






TTCGAGGAAGGCAAGATGACTCTGGAGGATGTCGTGGCCTACGCCAAGAAGAATCCCGAA






CCCGCTCAGACCAGCGGCAAGCAGGAACTCTACGAGGCCATCATCAACATGTACGCCTGA





5606MI2_003, Bacteroides, Amino Acid, SEQ ID NO: 20
MATKEYFPHIGKIQFKGTESYDPMSYRYYDAERVVLGKPMKEWLKFAMAWWHTLCAEGGD


QFGGGTKKFPWNEGEDAMTIAKQKADAGFEIMQKLGIEYFCFHDIDLIGDLGDDIEDYEN








RMHEITAHLKEKMAATGIKNLWGTANVFGHARYMNGAATNPDFDVVARACVQIKNAIDAT








IALGGTNYVFWGGREGYMSLLNTDQKREKEHLATMLTMARDYARANGFTGTFLIEPKPME








PSKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDA








NRGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLDDIFIAHI








AGMDAMARAPLSAADVLEKSPYKKMLADRYASFDSGEGKKFEEGKMTLEDVVAYAKKNPE







PAQTSGKQELYEAIINMYA





5610MI3_003, Bacteroides, DNA, SEQ ID NO: 21
ATGGCAACAAAAGAATTTTTTCCCGAGATTGGTAAAATCAAGTTTGAGGGCCGCGAAAGC



CGCAATCCCCTCGCATTCCGCTACTACGGCCCCGAGAAAGTCGTTCTTGGCAAGAAGATG






AAAGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACACTGTGCGCCCAGGGCACCGAC






CAGTTTGGTGGCGACACCAAGCAGTTTCCGTGGAACACTGCCAGTGACCCCATGCAGGCC






GCCAAGGATAAGGTGGATGCCGGATTTGAATTCATGACCAAGATGGGCATTGAGTACTTC






TGCTTCCACGATGTGGATCTCGTCGCCGAGGCCGCCACTGTCGAGGAGTATGAGGCTAAC






CTCAAGACCATCGTCGCCTACATCAAAGAGAAACAAGCCGAGACCGGCATCAAGAACCTG






TGGGGCACAGCCAACGTATTCGGACACAAACGCTACATGAACGGTGCCGCCACCAACCCC






GACTTTGATGTCGTGGCACGCGCCATCGTGCAAATCAAGAACGCCATCGACGCCACCATC






GAGTTGGGCGGCACGAGTTACGTCTTTTGGGGCGGCCGCGAGGGCCACATGAGCCTGCTC






AACACCGACCAGAAGCGCGAGAAGGAGCACCTTGCACGCATGCTGACCATGGCACGCGAC






TATGCCCGCGCACGTGGTTTCAACGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCG






ACCAAGCACCAATATGATGTGGACACCGAGACCGTCATCGGTTTCCTGCGTGCCCATGGT






CTGGACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCTACACTGGCCGGACACACC






TTCGAGCGCGAACTGGCAGTGGCCGTCGACAACGGTCTACTCGGCTCAATCGACGCCAAC






CGTGGTGACTATCAGAATGGTTGGGACACCGATCAGTTCCCCATCGACCACTATGAGTTG






GTTCAGGGCATGTTGCAGATTATCCGCAATGGTGGTTTCACCGACGGTGGCACCAACTTC






GATGCCAAGACCCGCCGCAACTCGACCGACCTCGAGGACATCTTCATCGCCCACATCGCC






GCGATGGATGCCATGGCTCATGCGCTGGAGAGTGCTGCCTCCATCATCGAGGAGTCGCCC






TACTGCCAGATGGTCAAGGATCGCTATGCCTCATTTGACTCCGGCATCGGCAAGGACTTT






GAGGACGGCAAGTTGACACTGGAACAAGCCTACGAGTACGGTAAGCAAGTGGGCGAACCC






AAGCAGACCAGTGGCAAGCAAGAACTGTACGAGTCAATCATCAATATGTATTCCATTTAA





5610MI3_003, Bacteroides, Amino Acid, SEQ ID NO: 22
MATKEFFPEIGKIKFEGRESRNPLAFRYYGPEKVVLGKKMKDWFKFAMAWWHTLCAQGTD


QFGGDTKQFPWNTASDPMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEAATVEEYEAN








LKTIVAYIKEKQAETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAIVQIKNAIDATI








ELGGTSYVFWGGREGHMSLLNTDQKREKEHLARMLTMARDYARARGFNGTFLIEPKPMEP








TKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFERELAVAVDNGLLGSIDAN








RGDYQNGWDTDQFPIDHYELVQGMLQIIRNGGFTDGGTNFDAKTRRNSTDLEDIFIAHIA








AMDAMAHALESAASIIEESPYCQMVKDRYASFDSGIGKDFEDGKLTLEQAYEYGKQVGEP







KQTSGKQELYESIINMYSI





5749MI2_004, Bacteroides, DNA, SEQ ID NO: 23
ATGGCAACAAAAGAGTATTTTCCTGGTATAGGAAAGATTAAATTTGAAGGTAAAGAGAGT



AAGAATCCGATGGCATTCCGCTATTATGATGCCAATAAAGTAATCATGGGCAAGAAGATG






AGCGAGTGGCTGAAGTTTGCCATGGCTTGGTGGCACACATTGTGCGCCGAAGGTGGTGAC






CAGTTTGGTGGTGGAACAAAGACTTTCCCGTGGAACGATTCGGACAACGCCGTAGAAGCA






GCCAACCATAAAGTAGATGCCGGTTTTGAATTTATGCAGAAAATGGGCATCGAATACTAT






TGCTTCCATGATGTAGACCTCTGCACTGAAGCTGCTACCATTGAAGAATATGAAGCCAAT






CTGAAGGAAATAGTAGCCTATCCGAAACAGAAACAGGCTGAAACAGGTATCAAACTTCTG






TGGGGTACGGCAAATGTATTTGGTCACAAACGCTATATGAATGGTGCTGCTACCAATCCG






GATTTTGATGTAGTGGCTCGTGCTGCTGTACAGATTAAGAATGCGATAGACGCTACAATT






GAACTCGGTGGTAGCAACTACGTGTTCTGGGGCGGCCGTGAAGGTTATATGAGCTTGCTC






AATACAGACCAGAAACGTGAGAAAGAGCATTTGGCACAAATGTTGACCATGGCTCGTGAC






TATGCTCGTGCCAAAGGATTCAAGGGTACCTTCCTGGTTGAACCCAAACCGATGGAACCA






ACTAAACACCAGTATGATGTAGATACGGAAACTGTAATCGGCTTCCTCAAGGCTCATAAT






TTGGATAAGGATTTCAAGGTAAATATTGAAGTAAACCATGCTACATTGGCCGGTCATACT






TTTGAACACGAATTGGCTGTTGCCGTAGACAACGATATGCTTGGCTCTATCGATGCCAAC






CGCGGTGACTATCAGAACGGTTGGGATACTGACCAGTTCCCCATTGACAACTTCGAGCTT






ATCCAAGCCATGATGCAGATTATTCGCGGTGGTGGCTTCAAAGATGGTGGTACAAACTTC






GACGCTAAGACTCGTCGTAACTCTACCGACCTGGAAGATATTTTCATTGCACACATCGCT






GGTATGGATGCTATGGCACGTGCTTTGGAAAGTGCAGCCAAGTTGCTTGAGGAATCTCCT






TATAAGAAAATGTTGGCTGACCGCTATGCATCGTTCGATAGTGGCAAAGGTAAGGAGTTT






GAAGAAGGCAAGCTGACATTGGAAGACGTTGTAGTTTATGCCAAGCAGAATGGCGAGCCT






AAACAGACCAGCGGTAAGCAGGAATTGTATGAGGCAATTGTAAATATGTATGCCTGA





5749MI2_004, Bacteroides, Amino Acid, SEQ ID NO: 24
MATKEYFPGIGKIKFEGKESKNPMAFRYYDANKVIMGKKMSEWLKFAMAWWHTLCAEGGD


QFGGGTKTFPWNDSDNAVEAANHKVDAGFEFMQKMGIEYYCFHDVDLCTEAATIEEYEAN








LKEIVAYPKQKQAETGIKLLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNAIDATI








ELGGSNYVFWGGREGYMSLLNTDQKREKEHLAQMLTMARDYARAKGFKGTFLVEPKPMEP








TKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNDMLGSIDAN








RGDYQNGWDTDQFPIDNFELIQAMMQIIRGGGFKDGGTNFDAKTRRNSTDLEDIFIAHIA








GMDAMARALESAAKLLEESPYKKMLADRYASFDSGKGKEFEEGKLTLEDVVVYAKQNGEP







KQTSGKQELYEAIVNMYA





5750MI3_003, Bacteroides, DNA, SEQ ID NO: 25
ATGGCAACAAAAGAGTATTTTCCTGGAATAGGAAAGATTAAATTTGAAGGAAAAGAGAGT



AAGAACCCGATGGCATTCCGTTGCTACGATGCAGAAAAAGTTATCATGGGTAAGAGAATG






AAAGATTGGTTGAAGTTTGCAATGGCGTGGTGGCATACACTTTGTGCAGAAGGCGGTGAC






CAATTCGGTGGCGGTACAAAGAGTTTCCCCCGGAACGACTATACTGATAAAATTCAGGCT






GCTAAAAACAAGATGGATGCCGGTTTTGAGTTTATGCAGAAGATGGGGATCGAATACTAT






TGTTTTCACGATGTAGACCTCTGCACGGAAGCTGATACCATTGAAGAATACGAAGCTAAT






TTGAAAGAAATCGTAGTTTACGCAAAGCAAAAGCAGGTAGAAACAGGTATCAAATTATTG






TGGGGTACTGCCAATGTATTCGGTCATGAACGCTATATGAATGGTGCGGCTACCAACCCA






GATTTTGATGTTGTAGCCCGTGCTGCTGTTCAGATTAAGAATGCAATTGATGCTACCATT






GAACTAGGTGGCTTAAACTATGTGTTCTGGGGTGGACGCGAAGGTTATATGTCTTTGCTG






AACACTGATCAGAAACGTGAGAAAGAACATCTTGCACAAATGCTGACCATTGCCCGTGAC






TATGCCCGTGCCCGTGGCTTCAAAGGTACATTCTTGGTTGAACCGAAACCGATGGAACCA






ACCAAACATCAATATGACGTAGATACAGAAACAGTTATCGGTTTTTTGAAAGCTCATGCT






TTGGATAAAGACTTTAAAGTAAATATTGAAGTAAATCATGCAACATTAGCCGGTCATACA






TTTGAACACGAACTGGCAGTGGCTGTCGACAACGGTATGCTGGGTTCTATTGACGCTAAT






CGTGGTGATTGTCAAAACGGTTGGGATACAGACCAATTTCCCATTGATAACTATGAACTG






ACTCAAGCCATGATGCAGATTATTCGTAACGGTGGTTTGGGCAATGGTGGTACGAATTTT






GACGCTAAAACTCGCCGTAATTCTACTGATCTTGGAGATATCTTCATTGCTCACATCGCA






GGTATGGATGCTATGGCACGTGCATTGGAAAGTGCGGCCAAGTTGTTGGAAGAATCTCCC






TATAAGAAGATGCTGGCAGAACGTTATGCATCCTTTGACAGCGGTAAGGGTAAAGAGTTT






GAAGAGGGTAAGTTGACCTTGGAGGATCTTGTTGCTTATGCAAAAGTCAATGGCGAACCG






AAACAAATCAGTGGTAAACAAGAATTGTATGAGGCAATTGTGAATATGTATTGCTAA





5750MI3_003, Bacteroides, Amino Acid, SEQ ID NO: 26
MATKEYFPGIGKIKFEGKESKNPMAFRCYDAEKVIMGKRMKDWLKFAMAWWHTLCAEGGD


QFGGGTKSFPRNDYTDKIQAAKNKMDAGFEFMQKMGIEYYCFHDVDLCTEADTIEEYEAN








LKEIVVYAKQKQVETGIKLLWGTANVFGHERYMNGAATNPDFDVVARAAVQIKNAIDATI








ELGGLNYVFWGGREGYMSLLNTDQKREKEHLAQMLTIARDYARARGFKGTFLVEPKPMEP








TKHQYDVDTETVIGFLKAHALDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN








RGDCQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLGDIFIAHIA








GMDAMARALESAAKLLEESPYKKMLAERYASFDSGKGKEFEEGKLTLEDLVAYAKVNGEP







KQISGKQELYEAIVNMYC





5750MI4_003, Bacteroides, DNA, SEQ ID NO: 27
ATGGCAACAAAAGAGTATTTTCCCGGAATAGGAAAGATTAAATTCGAAGGTAAAGAGAGC



AAGAACCCGATGGCATTCCGTTATTACGATGCCGATAAAGTAATCATGGGTAAGAAAATG






AGCGAATGGCTGAAGTTCGCCATGGCATGGTGGCACACTCTTTGCGCAGAAGGTGGTGAC






CAGTTCGGTGGCGGAACAAAGAAATTCCCCTGGAACGGTGAGGCTGACAAGGTTCAGGCT






GCCAAGAACAAAATGGACGCCGGCTTTGAATTCATGCAGAAAATGGGTATCGAATACTAC






TGCTTCCACGATGTAGACCTCTGCGAAGAAGCCGAGACCATTGAAGAATACGAAGCCAAC






TTGAAGGAAATCGTAGCGTATGCCAAGCAGAAACAAGCAGAAACCGGCATCAAGCTGTTG






TGGGGTACTGCCAACGTATTCGGCCATGCCCGCTACATGAATGGTGCAGCCACCAACCCC






GATTTCGATGTTGTGGCACGTGCAGCCGTCCAAATCAAAAGCGCCATCGACGCTACTATC






GAGCTGGGAGGTTCGAACTATGTGTTCTGGGGCGGTCGCGAAGGCTACATGTCATTGCTG






AATACAGACCAGAAGCGTGAGAAAGAGCACCTCGCACAGATGTTGACCATCGCCCGCGAC






TATGCCCGTGCCCGTGGCTTCAAAGGTACCTTCCTGATTGAACCGAAACCGATGGAACCT






ACAAAACACCAGTATGATGTAGACACCGAAACCGTTATCGGCTTCTTGAAGGCCCACAAT






CTGGACAAAGATTTCAAGGTAAACATCGAAGTGAACCACGCTACTTTGGCGGGCCACACC






TTCGAGCACGAACTCGCAGTAGCCGTAGACAACGGTATGCTCGGCTCCATCGATGCCAAC






CGTGGTGACTACCAGAACGGCTGGGATACAGACCAGTTCCCCATTGACAACTTCGAACTG






ACCCAGGCAATGATGCAAATCATCCGTAACGGCGGCTTTGGCAATGGCGGTACAAACTTC






GATGCCAAGACCCGTCGTAACTCCACCGACCTGGAAGACATCTTCATTGCCCACATCGCC






GGTATGGACGTGATGGCACGTGCACTGGAAAGTGCAGCCAAATTGCTTGAAGAGTCTCCT






TACAAGAAGATGCTTGCCGACCGCTATGCTTCCTTCGACAGTGGTAAAGGCAAGGAATTC






GAAGACGGCAAGCTGACACTGGAGGATTTGGCAGCTTACGCAAAAGCCAACGGTGAGCCG






AAACAGACCAGCGGCAAGCAGGGATTGTATGAGGCAATCGTAAATATGTACTGCTGA





5750MI4_003, Bacteroides, Amino Acid, SEQ ID NO: 28
MATKEYFPGIGKIKFEGKESKNPMAFRYYDADKVIMGKKMSEWLKFAMAWWHTLCAEGGD


QFGGGTKKFPWNGEADKVQAAKNKMDAGFEFMQKMGIEYYCFHDVDLCEEAETIEEYEAN








LKEIVAYAKQKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKSAIDATI








ELGGSNYVFWGGREGYMSLLNTDQKREKEHLAQMLTIARDYARARGFKGTFLIEPKPMEP








TKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN








RGDYQNGWDTDQFPIDNFELTQAMMQIIRNGGFGNGGTNFDAKTRRNSTDLEDIFIAHIA








GMDVMARALESAAKLLEESPYKKMLADRYASFDSGKGKEFEDGKLTLEDLAAYAKANGEP







KQTSGKQGLYEAIVNMYC





5751MI4_002, Bacteroides, DNA, SEQ ID NO: 29
ATGACAAAAGAGTATTTTCCAACCATTGGTAAAATTCAGTTTGAAGGTAAAGAGAGTAAG



AATCCATTAGCATATCGTTATTACGATGCTAACAAAGTAATAATGGGTAAAAAGATGAGC






GAATGGCTCAAGTTTGCAATGGCATGGTGGCACACTTTGTGTGCTGAGGGTAGCGACCAG






TTTGGTCCTGGCACCAAGTCATTCCCATGGAACGCATCAACCGACCGTATGCAGGCTGCA






AAAGATAAGGCTGACGCAGGCTTCGAAATCATGCAAAAACTGGGCATCGAATACTACTGT






TTCCATGATGTTGACCTCATCGACCCAGCAGACGATATTCCAACATACGAAAAGAATCTC






AAGGAAATCGTTGCATACCTCAAGCAAAAACAGGCCGAGACAGGTATCAAATTGCTATGG






GGTACAGCTAACGTATTTGGCCACAAGCGTTATATGAACGGTGCATCTACCAATCCTGAC






TTTGACGTTGTTGCACGAGCTATCGTGCAAATCAAGAATGCTATCGATGCAACAATCGAA






CTGGGCGGCACGAACTACGTATTCTGGGGTGGTCGCGAAGGTTACATGTCACTGCTCAAC






ACCGACCAAAAGCGCGAGAAAGAGCACATGGCTACCATGTTAGGAATGGCACGTGACTAT






GCACGTTCTAAAGGCTTTACTGGTACTCTCCTTATCGAGCCAAAGCCTATGGAACCAACT






AAGCATCAATACGACGTCGATACAGAAACTGTTATTGGTTTCCTCAAAGCTCACGGATTA






GACAAGGACTTCAAGGTAAATATCGAAGTGAACCACGCTACATTGGCTGGCCATACCTTC






GAACATGAATTAGCATGTGCTGTTGATGCAGGTATGCTTGGTTCCATCGATGCTAACCGT






GGTGATATGCAGAATGGCTGGGATACAGATCAGTTCCCTATCAACAATTACGAGCTCGTT






CAGGCCATGATGCAGATTATCCGCAATGGTGGTTTCGGTAACGGTGGTACAAACTTCGAC






GCTAAGACACGTCGTAATTCAACCGATTTGGAAGACATCATCATTGCTCACGTTTCAGCT






ATGGATGCTATGGCACGTGCTCTTGAATGTGCTGCAGACATTCTTCAAAACTCACCTATT






CCACAGATGGTGGCCAACCGTTATGCAAGTTTTGACAAGGGTATAGGTAAAGATTTCGAA






GACGGCAAGCTCACCCTCGAGCAAGTATACGAATATGGTAAGACCGTCGGCGAACCAGCT






ATTACAAGCGGCAAACAGGAGCTCTACGAAGCTATCGTTAATATGTATTGCTGA





5751MI4_002, Bacteroides, Amino Acid, SEQ ID NO: 30
MTKEYFPTIGKIQFEGKESKNPLAYRYYDANKVIMGKKMSEWLKFAMAWWHTLCAEGSDQ


FGPGTKSFPWNASTDRMQAAKDKADAGFEIMQKLGIEYYCFHDVDLIDPADDIPTYEKNL








KEIVAYLKQKQAETGIKLLWGTANVFGHKRYMNGASTNPDFDVVARAIVQIKNAIDATIE








LGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLGMARDYARSKGFTGTLLIEPKPMEPT








KHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANR








GDMQNGWDTDQFPINNYELVQAMMQIIRNGGFGNGGTNFDAKTRRNSTDLEDIIIAHVSA








MDAMARALECAADILQNSPIPQMVANRYASFDKGIGKDFEDGKLTLEQVYEYGKTVGEPA







ITSGKQELYEAIVNMYC





5751MI5_003, Bacteroides, DNA, SEQ ID NO: 31
ATGGCTAACAAAGAATTTTTCCCCGGTATTGGTAAAATCAAATTCGAAGGTAAAGAGAGC



AAGAACCCCATGGCATATCGTTACTACGATGCTGAGAAGGTAGTCCTTGGCAAGAATATG






AAAGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACATTGTGCGCCGAGGGTAGCGAC






CAGTTTGGTCCCGGCACTAAGTCTTTCCCCTGGAACACCGCAGAGTGCCCCATGCAGGCA






GCTAAGGACAAGGTTGACGCTGGCTTCGAGTTCATGACCAAGATGGGTATTGAATACTTC






TGCTTCCACGATGTAGACCTCGTTGCCGAGGCCGACACTGTTGAGGAGTACGAGGCTCGC






ATGAAGGAAATCGTTGCTTACATCAAGGAGAAGGTGGCCGAGACTGGCATCAAGAACCTG






TGGGGTACAGCTAACGTATTTGGCAACAAGCGCTACATGAACGGTGCTGCTACTAACCCC






GACTTTGACGTTGTGGCTCGCGCTATCGTTCAAATCAAGAACGCTATCGACGCTACTATC






GAGCTCGGTGGTACGTCATACGTATTCTGGGGCGGCCGCGAGGGTTACATGAGCCTCTTG






AACACCGACCAGAAGCGTGAGAAAGAGCACCTGGCTACTATGCTCACTATGGCACGCGAC






TACGCTCGCGCTAAGGGTTTCAAGGGTACATTCCTCATCGAGCCCAAGCCCATGGAGCCC






ACAAAGCACCAGTACGATGTTGACACTGAGACTGTAATCGGCTTCCTTAAGGCACACAAC






CTTGACAAGGACTTCAAGGTTAACATTGAGGTTAACCACGCAACTCTCGCTGGTCACACA






TTTGAGCACGAGCTCGCTTGTGCTGTTGACGCTGGCATGCTTGGCAGCATCGACGCTAAC






CGCGGTGACTACCAGAACGGCTGGGATACTGACCAATTCCCCATCGACAACTTCGACCTC






ACTCAAGCTATGCTCGAGATCATCCGCAACGATGGTTTCAAGGATGGTGGTACAAACTTC






GACGCTAAGACTCGCCGCAACAGCACCGACCTCGAGGATATCTTCATCGCACACATCGCT






GCTATGGACGCTATGGCACGTGCTCTCGAGAGCGCTGCTGCAGTACTCGAGGAGTCAGCT






CTGCCCCAAATGAAGAAGGACCGCTATGCATCGTTCGACGCTGGCATGGGTAAGGACTTC






GAGGACGGCAAGCTCACCCTGGAGCAAGTTTACGAGTATGGTAAGAAGGTGGGCGAGCCC






AAGCAGACTAGCGGCAAGCAAGAGCTGTATGAGGCTATCCTCAACATGTACGTATAA





5751MI5_003, Bacteroides, Amino Acid, SEQ ID NO: 32
MANKEFFPGIGKIKFEGKESKNPMAYRYYDAEKVVLGKNMKDWFKFAMAWWHTLCAEGSD


QFGPGTKSFPWNTAECPMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEADTVEEYEAR








MKEIVAYIKEKVAETGIKNLWGTANVFGNKRYMNGAATNPDFDVVARAIVQIKNAIDATI








ELGGTSYVFWGGREGYMSLLNTDQKREKEHLATMLTMARDYARAKGFKGTFLIEPKPMEP








TKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDYQNGWDTDQFPIDNFDLTQAMLEIIRNDGFKDGGTNFDAKTRRNSTDLEDIFIAHIA








AMDAMARALESAAAVLEESALPQMKKDRYASFDAGMGKDFEDGKLTLEQVYEYGKKVGEP







KQTSGKQELYEAILNMYV





5751MI6_004, Bacteroides, DNA, SEQ ID NO: 33
ATGGCTAACAAAGAATTTTTCCCAGGTATTGGTAAAATCAAATTCGAAGGCAAAGAAAGC



AAGAACCCCATGGCATATCGTCACTACGATGCCGAGAAGGTAGTCCTTGGTAAGAAGATG






AAGGACTGGTTCAAGTTTGCGATGGCTTGGTGGCACACTCTGTGCGCCGAGGGTAGCGAC






CAGTTCGGCCCCGTGACCAAGTCTTTCCCCTGGAACCAGGCCGAGTGCCCCATGCAGGCT






GCTAAGGACAAGGTTGACGCCGGCTTCGAGTTCATGACCAAGATGGGTATCGAATACTTC






TGTTTCCACGATGTAGACCTCGTTGCCGAGGCCGACACCGTTGAGGAGTACGAAGCTCGC






ATGAAGGAAATCGTGGCTTACATCAAGGAGAAGATGGCCGAGACCGGCATCAAGAACCTG






TGGGGTACAGCCAACGTATTCGGCAACAAGCGCTACATGAACGGTGCTGCCACCAACCCC






GACTTTGACGTTGTGGCTCGCGCAATCGTTCAGATCAAGAACGCCATCGACGCTACTATC






GAGCTCGGCGGTACCTCTTACGTGTTCTGGGGCGGCCGCGAGGGTTACATGACTCTCTTG






AACACCGACCAGAAGCGCGAGAAGGAGCACCTGGCTACCATGCTCACCATGGCTCGCGAC






TATGCTCGCGCTAAGGGCTTCAAGGGTACATTCCTTATCGAGCCCAAGCCCATGGAGCCC






ACCAAGCACCAGTATGACGTGGATACCGAGACCGTTATCGGCTTCCTCAAGGCTCACGGC






CTGGACAAGGACTTCAAGGTGAACATCGAGGTTAACCATGCAACTCTCGCCGGCCACACA






TTCGAGCACGAACTCGCTTGCGCTGTTGACGCTGGCATGCTGGGCAGCATCGACGCTAAC






CGCGGCGACTACCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTTCGACCTC






ACTCAGGCTATGCTCGAGATCATCCGCAACGGTGGTTTCAAGGACGGTGGTACAAACTTC






GACGCTAAGACCCGTCGCAACAGCACCGATCTTGAGGACATCTTCATCGCTCACATCGCT






GCTATGGACGCAATGGCACGCGCGCTCGAGAGCGCTGCCGCTGTGCTCGAGCAGAGCCCC






CTTCCCCAGATGAAGAAAGACCGCTACGCATCGTTCGATGCCGGCATGGGCAAGGACTTC






GAGGACGGCAAGCTCACTCTGGAGCAGGTTTACGAGTATGGTAAGAAGGTAGGCGAGCCC






AAGCAGACCAGCGGCAAGCAGGAACTGTACGAGGCTATCCTCAACATGTATGTATAA





5751MI6_004, Bacteroides, Amino Acid, SEQ ID NO: 34
MANKEFFPGIGKIKFEGKESKNPMAYRHYDAEKVVLGKKMKDWFKFAMAWWHTLCAEGSD


QFGPVTKSFPWNQAECPMQAAKDKVDAGFEFMTKMGIEYFCFHDVDLVAEADTVEEYEAR








MKEIVAYIKEKMAETGIKNLWGTANVFGNKRYMNGAATNPDFDVVARAIVQIKNAIDATI








ELGGTSYVFWGGREGYMTLLNTDQKREKEHLATMLTMARDYARAKGFKGTFLIEPKPMEP








TKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDYQNGWDTDQFPIDNFDLTQAMLEIIRNGGFKDGGTNFDAKTRRNSTDLEDIFIAHIA








AMDAMARALESAAAVLEQSPLPQMKKDRYASFDAGMGKDFEDGKLTLEQVYEYGKKVGEP







KQTSGKQELYEAILNMYV





5586MI22_003, Clostridiales, DNA, SEQ ID NO: 35
ATGAAAGAATATTTTCCTATGACAAAAAAAGTTGAATATGAGGGCGCAGCATCTAAAAAT



CCATTTGCGTTTAAATACTATGATGCCGAAAGAATTATAGCAGGCAAGCCTATGAAAGAA






CATCTTAAATTTGCTATGAGTTGGTGGCATACACTTTGTGCGGGCGGTGCAGACCCATTT






GGCACAACAACTATGGACAGAACATACGGCGGACTTACCGACCCAATGGAAATTGCAAAG






GCAAAAGTAGATGCAGGCTTTGAGTTTATGCAAAAACTCGGTATAGAGTATTTTTGTTTT






CACGATGCGGATATTGCACCGGAAGGAAGCAGTTTTGTTGAAACAAAGAAAAACTTTTGG






GAAATAGTAGATTATATACAGCAAAAGATGAATGAAACAGGCATAAAGTTGCTTTGGGGT






ACTGCAAACTGCTTTAATGCTCCACGTTATATGCACGGTGCAGGAACATCATGCAATGCG






CACAGTTTTGCATATGCAGCCGCACAGATAAAAAATGCAATTGAAGCTACCGTTAAACTG






GGTGGAAAAGGCTATGTTTTCTGGGGCGGAAGAGAGGGTTATGAAACACTTCTCAATACG






GATATGGCACTTGAACTTGACAATATGGCAAGACTTATGCATATGGCAGTTGATTATGGC






AGAAGCATTGGTTTTGACGGTGATTTTTATATCGAACCAAAGCCAAAGGAACCAACAAAA






CATCAATATGACTTTGACTCGGCAACTGTTTTGGGATTTTTGAGAAAGTACGGTTTAGAT






AAGGATTTTAAACTTAATATAGAGGCAAATCATGCGACACTTGCAGGTCATACATTTGAA






CATGAATTGACTGTAGCGCGTATAAACGGTGCATTTGGCAGCATAGATGCAAATAGCGGC






GATCCCAATCTTGGCTGGGATACCGACCAATTCCCAACAGATGTTTATTCGGCAACCCTT






TGTATGCTTGAAGTGATAAGAGCAGGCGGCTTTACAAACGGAGGTCTTAATTTTGATGCA






AAGGTCAGAAGAGGCTCATTTACGTTTGATGACATTGTTTATGCATATATCAGCGGTATG






GACACTTTTGCGCTGGGTTTTATAAAGGCATATGAAATAATTGAGGACGGCAGAATAGAT






GAATTTGTAAAAGAAAGATACGCAAGCTATAATACAGGCATAGGCAAAGATATTATAGAT






GGAAAGGCAAGCCTTGAAAGTTTGGAAGAATATATTCTTTCAAATGATAATGTTGTAATG






CAAAGCGGCAGACAGGAATATCTTGAAACAGTTTTGAATAATATTTTGTTTAAAGCATAA





5586MI22_003, Clostridiales, Amino Acid, SEQ ID NO: 36
MKEYFPMTKKVEYEGAASKNPFAFKYYDAERIIAGKPMKEHLKFAMSWWHTLCAGGADPF


GTTTMDRTYGGLTDPMEIAKAKVDAGFEFMQKLGIEYFCFHDADIAPEGSSFVETKKNFW








EIVDYIQQKMNETGIKLLWGTANCFNAPRYMHGAGTSCNAHSFAYAAAQIKNAIEATVKL








GGKGYVFWGGREGYETLLNTDMALELDNMARLMHMAVDYGRSIGFDGDFYIEPKPKEPTK








HQYDFDSATVLGFLRKYGLDKDFKLNIEANHATLAGHTFEHELTVARINGAFGSIDANSG








DPNLGWDTDQFPTDVYSATLCMLEVIRAGGFTNGGLNFDAKVRRGSFTFDDIVYAYISGM








DTFALGFIKAYEIIEDGRIDEFVKERYASYNTGIGKDIIDGKASLESLEEYILSNDNVVM







QSGRQEYLETVLNNILFKA





1753MI4_001, Firmicutes, DNA, SEQ ID NO: 37
ATGAAAGAAATTTTCCCAAATATTCCTGAGATTAAATTCGAAGGAAAAGACAGCAAAAAT



CCTTTTGCTTTCCATTACTACAACCCAGACCAAATCATCTTAGGCAAACCAATGAAAGAA






CACCTCCCATTCGCTATGGCTTGGTGGCACAATCTTGGTGCAACAGGTGTTGATATGTTT






GGCGCTGGCCCAGCTGATAAGAGTTTCGGTGCTAAAGTTGGCACAATGGAACACGCTAAG






GCCAAAGTCGATGCCGGTTTCGAATTCATGAAGAAACTCGGTATCAGATATTTCTGCTTC






CATGATGTTGACTTAGTTCCAGAATGTGCAGATATCAAAGATACAAACAAAGAATTAGAT






GAAATCAGTGACTACATCTTAGAAAAGATGAAAGGCACAGATATTAAGTGTTTATGGGGC






ACCGCCAATATGTTCTCTAACCCACGCTTCTGCAATGGTGCGGGTTCCACAAACAGTGCG






GATGTCTTCGCTTTCGCCGCTGCTCAAGTTAAGAAAGCCTTAGATATCACCGTTAAATTA






GGTGGTAGGGGTTACGTCTTCTGGGGTGGTCGTGAAGGTTACGAAACATTACTCAATACA






GACGTTAAATTCGAACAAGAAAACATTGCTCGTTTAATGAAGATGGCTGTTGAATATGGC






CGTTCCATCGGTTTCAAAGGCGATTTCTATATCGAACCAAAACCAAAAGAACCAATGAAA






CACCAATATGACTTCGACGCCGCTACAGCTATTGGCTTCTTAAGAGCCCACGGCTTAGAC






AAAGACTTCAAGTTGAACATCGAAGCTAACCACGCTACATTAGCGGGTCATACATTCCAA






CACGATTTAAGAATCTCCGCCATTAATGGTATGTTAGGTTCTATCGATGCTAACCAAGGC






GATATGCTCTTAGGTTGGGATACAGACGAATTCCCATTTGATGTCTACAGTGCGACACAA






TGTATGTACGAAGTCTTAAAGAATGGTGGTCTTACAGGTGGTTTCAACTTTGACTCCAAA






ACACGTCGTCCATCCTACACAATGGAAGATATGTTCTTAGCCTATATCTTAGGTATGGAT






ACATTCGCTTTAGGTTTAATCAAAGCTGCTCAAATCATCGAAGATGGCCGTATTGATCAA






TTCATCGAAAAGAAATATTCTTCCTTCCGTGAAACAGAAATCGGTCAAAAGATCTTAAAC






AACAAGACAAGCTTAAAAGAATTATCCGATTACGCTTGCAAGATGGGTGCTCCAGAACTT






CCAGGTAGTGGTCGTCAAGAAATGCTCGAAGCCATCGTTAACGATGTCTTATTCGGCAAG






TAA





1753MI4_001, Firmicutes, Amino Acid, SEQ ID NO: 38
MKEIFPNIPEIKFEGKDSKNPFAFHYYNPDQIILGKPMKEHLPFAMAWWHNLGATGVDMF


GAGPADKSFGAKVGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDLVPECADIKDTNKELD








EISDYILEKMKGTDIKCLWGTANMFSNPRFCNGAGSTNSADVFAFAAAQVKKALDITVKL








GGRGYVFWGGREGYETLLNTDVKFEQENIARLMKMAVEYGRSIGFKGDFYIEPKPKEPMK








HQYDFDAATAIGFLRAHGLDKDFKLNIEANHATLAGHTFQHDLRISAINGMLGSIDANQG








DMLLGWDTDEFPFDVYSATQCMYEVLKNGGLTGGFNFDSKTRRPSYTMEDMFLAYILGMD








TFALGLIKAAQIIEDGRIDQFIEKKYSSFRETEIGQKILNNKTSLKELSDYACKMGAPEL







PGSGRQEMLEAIVNDVLFGK





1753MI6_001, Firmicutes, DNA, SEQ ID NO: 39
ATGAAAGAAATTTTCCCAAATATTCCTGAGATTAAATTCGAAGGAAAAGACAGCAAAAAT



CCTTTTGCTTTCCATTACTACAACCCAGACCAAATCATCTTAGGTAAACCAATGAAAGAA






CACCTCCCATTCGCTATGGCTTGGTGGCACAATCTTGGTGCAACAGGTGTTGATATGTTT






GGCGCTGGCCCAGCTGATAAGAGTTTCGGTGCTAAAGTTGGCACAATGGAACACGCTAAG






GCCAAAGTCGATGCCGGTTTCGAATTCATGAAGAAACTTGGTATCAGATATTTCTGCTTC






CATGATGTTGACTTAGTTCCAGAATGTGCAGATATCAAAGATACAAACAAAGAATTAGAT






GAAATCAGTGACTACATCTTAGAAAAGATGAAAGGCACAGATATCAAGTGTTTATGGGGC






ACCGCCAATATGTTCTCTAACCCACGTTTCTGCAATGGTGCGGGTTCCACAAACAGTGCG






GATGTCTTCGCTTTCGCCGCTGCTCAAGTTAAGAAAGCCTTAGATATCACCGTTAAATTA






GGTGGTAGGGGTTACGTCTTCTGGGGTGGTCGTGAAGGTTACGAAACATTACTCAATACA






GACGTTAAATTCGAACAAGAAAACATTGCTCGTTTAATGAAGATGGCTGTTGAATATGGC






CGTTCCATCGGTTTCAAAGGCGATTTCTATATCGAACCAAAACCAAAAGAACCAATGAAA






CACCAATATGACTTCGACGCCGCTACAGCTATTGGCTTCTTAAGAGCCCACGGCTTAGAC






AAAGACTTCAAGTTGAACATCGAAGCTAACCACGCTACATTAGCGGGTCATACATTCCAA






CACGATTTAAGAATCTCCGCCATTAATGGTATGTTAGGTTCTATCGATGCTAACCAAGGC






GATATGCTCTTAGGTTGGGATACAGACGAATTCCCATTTGATGTCTACAGTGCGACACAA






TGTATGTACGAAGTCTTAAAGAATGGTGGTCTTACAGGTGGTTTCAACTTTGACTCCAAA






ACACGTCGTCCATCCTACACAATGGAAGATATGTTCTTAGCCTATATCTTAGGTATGGAT






ACATTCGCTTTAGGTTTAATCAAAGCTGCTCAAATCATCGAAGATGGCCGTATTGATCAA






TTCATCGAAAAGAAATATTCTTCCTTCCGTGAAACAGAAATCGGTCAAAAGATCTTAAAC






AACAAGACAAGCTTAAAAGAATTATCCGATTACGCTTGCAAGATGGGTGCTCCAGAACTT






CCAGGTAGTGGTCGTCAAGAAATGCTCGAAGCCATCGTTAACGATGTCTTATTCGGCAAG






TAA





1753MI6_001, Firmicutes, Amino Acid, SEQ ID NO: 40
MKEIFPNIPEIKFEGKDSKNPFAFHYYNPDQIILGKPMKEHLPFAMAWWHNLGATGVDMF


GAGPADKSFGAKVGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDLVPECADIKDTNKELD








EISDYILEKMKGTDIKCLWGTANMFSNPRFCNGAGSTNSADVFAFAAAQVKKALDITVKL








GGRGYVFWGGREGYETLLNTDVKFEQENIARLMKMAVEYGRSIGFKGDFYIEPKPKEPMK








HQYDFDAATAIGFLRAHGLDKDFKLNIEANHATLAGHTFQHDLRISAINGMLGSIDANQG








DMLLGWDTDEFPFDVYSATQCMYEVLKNGGLTGGFNFDSKTRRPSYTMEDMFLAYILGMD








TFALGLIKAAQIIEDGRIDQFIEKKYSSFRETEIGQKILNNKTSLKELSDYACKMGAPEL







PGSGRQEMLEAIVNDVLFGK





1753MI35_004, Firmicutes, DNA, SEQ ID NO: 41
ATGGAATATTTCCCTTTCGTCAAATCGGTCCAATACAAGGGACCAACCTCAACTGAACCA



TTCGCTTTCAAGTACTACGATGCCAACCGTGTCGTTCTTGGAAAACCAATGAAAGAATGG






ATGCCATTCGCTATGGCTTGGTGGCACAACCTCGGCGCTGCCGGTACCGACATGTTCGGC






GGCAACACCATGGACAAGTCCTGGGGAGTCGATAAAGAAAAAGACCCAATGGGCTATGCC






AAAGCCAAAGTTGATGCCGGCTTCGAATTCATGCAGAAGATGGGCATCGAATACTACTGC






TTCCACGATGTCGACCTCGTCCCAGAGTGCGACGACATCACCGTTATGTACCAGAGACTC






GATGAGATCGGTGATTACCTTCTCAAGAAACAGAAGGAAACCGGTATCAAGCTTCTTTGG






TCAACCGCCAATGCCTTCGGACACCGCCGTTTCATGAACGGTGCTGGTTCCAGCAACTCC






GCCGAAGTCTATTGCTTCGCCGCCGCCCAGATCAAGAAAGCTCTTGAGCTCTGCGTCAAA






CTCGGTGGCAAAGGCTATGTCTTCTGGGGTGGACGTGAAGGCTACGAAACCCTTCTCAAC






ACCGACATGAAGTTCGAACAAGAGAACATCGCCAACCTTATGAGATGCGCCCGTGACTAC






GGCCGCAAGATCGGTTTCAAAGGCGACTTCTACATCGAACCAAAACCAAAAGAGCCAACA






AAGCATCAGTATGACTTCGACGCCGCTACCGCCATCGGATTCCTCCGTCAGTACGGTCTC






GACAAAGACTTCAAGATGAACATCGAAGCCAACCACGCTACCTTAGCTGGCCACACCTTC






GAACACGAACTCCGCGTCTCCGCCATGAACGGCATGCTCGGTTCCATCGACGCCAACGAA






GGCGATATGCTCCTCGGATGGGATGTCGACCGTTTCCCAGCCAACGTCTATAGCGCCACC






TTCGCCATGCTCGAAGTCATCAAAGCCGGTGGACTTACCGGTGGCTTCAACTTCGACGCC






AAGACCCGCCGCGCTTCCAACACCTATGAAGATATGTTCAAGGCTTTCGTCCTTGGTATG






GATACCTTCGCTTTAGGTCTTCTCAATGCCGAAGCCATCATCAAAGACGGCCGCATCGAC






AAGTTCGTCGAGGATAGATATGCCAGCTTCAAGACCGGCATCGGTGCTAAGGTCCGCGAT






CACTCCGCTACCCTTGAGGATTTAGCTGCCCACGCCCTTGAGACCAAGGTTTGCCCAGAT






CCAGGCAGCGGCGACGAGGAAGAACTCCAGGAAATCCTCAACCAGTTAATGTTCGGTAAG






AAATAA





1753MI35_004, Firmicutes, Amino Acid, SEQ ID NO: 42
MEYFPFVKSVQYKGPTSTEPFAFKYYDANRVVLGKPMKEWMPFAMAWWHNLGAAGTDMFG


GNTMDKSWGVDKEKDPMGYAKAKVDAGFEFMQKMGIEYYCFHDVDLVPECDDITVMYQRL








DEIGDYLLKKQKETGIKLLWSTANAFGHRRFMNGAGSSNSAEVYCFAAAQIKKALELCVK








LGGKGYVFWGGREGYETLLNTDMKFEQENIANLMRCARDYGRKIGFKGDFYIEPKPKEPT








KHQYDFDAATAIGFLRQYGLDKDFKMNIEANHATLAGHTFEHELRVSAMNGMLGSIDANE








GDMLLGWDVDRFPANVYSATFAMLEVIKAGGLTGGFNFDAKTRRASNTYEDMFKAFVLGM








DTFALGLLNAEAIIKDGRIDKFVEDRYASFKTGIGAKVRDHSATLEDLAAHALETKVCPD







PGSGDEEELQEILNQLMFGKK





1754MI9_004, Firmicutes, DNA, SEQ ID NO: 43
ATGAGCGAATTTTTTAAGAATATTCCAGAGATTAAATTCGAAGGAAAAGATAGTAAAAAT



CCATGGGCATTCAAGTATTACAATCCTGAATTGACCATTATGGGTAAAAAAATGTCTGAA






CATCTTCCTTTTGCAATGGCCTGGTGGCATAACCTTGGCGCAAATGGAGTTGATATGTTC






GGTTCGGGAACCGCCGATAAATCTTTCGGTCAGGCTCCGGGAACTATGGAGCACGCAAAG






GCTAAGGTAGATGCAGGTATCGAGTTTATGAAGAAACTCGGAATCAAGTACTACTGCTGG






CATGATGTAGACCTTGTTCCTGAAGATCCAAACGATATCAACGTAACAAACAAGCGCCTT






GATGAGATTTCAGATTATATCCTTGAAAAAACAAAGGGAACTGACATCAAGTGTCTCTGG






GGAACTGCTAACATGTTCAGTAATCCCCGCTTTATGAACGGGGCAGGCTCAACAAACTCT






GCTGACGTTTACTGCTTTGCAGCTGCCCAGGTTAAAAAGGCTCTTGAGATTACCGTAAAG






CTTGGTGGCCGCGGTTATGTATTCTGGGGTGGACGCGAAGGTTATGAAACTCTTCTTAAT






ACAGATGTAAAGCTTGAACAGGAAAATATTGCAAACCTTATGCACATGGCAGTTGATTAT






GGCCGTTCAATCGGTTTCAAGGGAGACTTCTACATCGAGCCTAAGCCAAAGGAGCCGATG






AGTCATCAGTATGATTTTGATGCCGCAACTGCAATCGGCTTCCTCCGCCAGTATGGCCTC






GACAAAGACTTTAAGATGAACATTGAGGCTAACCACGCTTCTCTTGCAAATCATACCTTC






CAGCATGAGCTTTATATCAGCCGCATTAACGGAATGCTTGGTTCTGTAGATGCTAACCAG






GGAAATCCAATTCTCGGCTGGGATACAGATAACTTCCCTTGGAATGTCTACGACGCAACT






CTTGCAATGTACGAAGTACTCAAGGCTGGTGGACTTACAGGTGGCTTCAACTTTGACTCA






AAGAACCGCCGCCCATCAAATACATTTGAAGATATGTTCCACGCTTACATCATGGGAATG






GACACTTTTGCTCTTGGTCTTATTAAGGCTGCAGAAATTATTGAAGACGGAAGAATCGAT






GGCTTCATTAAAGAAAAGTATTCAAGCTACGAAAGTGGAATTGGTAAGAAGATCCGCGAC






AAGCAGACAACTTTGGAAGAGCTTGCTGCCCGTGCCGCAGAAATGAAAAAGCCATCTGAT






CCAGGTTCAGGCCGCGAGGAATATCTGGAAGGAGTTGTTAACAATATCCTCTTTCGCGGA






TAA





1754MI9_004, Firmicutes, Amino Acid, SEQ ID NO: 44
MSEFFKNIPEIKFEGKDSKNPWAFKYYNPELTIMGKKMSEHLPFAMAWWHNLGANGVDMF


GSGTADKSFGQAPGTMEHAKAKVDAGIEFMKKLGIKYYCWHDVDLVPEDPNDINVTNKRL








DEISDYILEKTKGTDIKCLWGTANMFSNPRFMNGAGSTNSADVYCFAAAQVKKALEITVK








LGGRGYVFWGGREGYETLLNTDVKLEQENIANLMHMAVDYGRSIGFKGDFYIEPKPKEPM








SHQYDFDAATAIGFLRQYGLDKDFKMNIEANHASLANHTFQHELYISRINGMLGSVDANQ







GNPILGWDTDNFPWNVYDATLAMYEVLKAGGLTGGFNFDSKNRRPSNTFEDMFHAYIMGM







DTFALGLIKAAEIIEDGRIDGFIKEKYSSYESGIGKKIRDKQTTLEELAARAAEMKKPSD







PGSGREEYLEGVVNNILFRG





1754MI22_004, Firmicutes, DNA, SEQ ID NO: 45
ATGAGCGAGTTTTTTAAGAATATTCCTCAAATAAAATACGAAGGAAAAGATAGCAAAAAT



CCCTGGGCATTCAAGTATTACAATCCTGAATTGACAATCATGGGTAAAAAGATGAGCGAA






CATCTTCCATTCGCAATGGCATGGTGGCATAACCTTGGCGCAAACGGCGTTGATATGTTT






GGTCAGGGAACAGCAGACAAGTCTTTCGGACAGATTCCTGGAACTATGGAGCATGCAAAG






GCTAAGGTTGATGCTGGTATAGAGTTTATGAAGAAGCTCGGAATCAAATATTACTGCTGG






CACGATGTTGACCTTGTTCCTGAGGATCCAAACGATATCAACGTAACTAACAAACGTCTG






GACGAAATTTCAGATTACATCCTTGAAAAGACAAAAGGAACAGACATTAAGTGTCTCTGG






GGAACTGCAAACATGTTCGGTAACCCTCGCTTTATGAACGGTGCAGGCTCTACAAACTCT






GCTGACGTTTACTGTTTTGCTGCCGCTCAGGTAAAAAAGGCTCTTGAGATTACTGTAAAG






CTTGGTGGCCGAGGTTATGTTTTCTGGGGTGGCCGCGAAGGTTACGAAACTCTTCTCAAT






ACAGACGTAAAACTTGAACAGGAAAATATCGCAAACCTCATGCATATGGCTGTTGATTAT






GGCCGCTCAATCGGTTTCAAGGGAGACTTCTACATCGAGCCTAAGCCAAAGGAGCCAATG






AGCCATCAGTATGATTTTGATGCTGCAACAGCAATCGGCTTCCTCCGCCAGTATGGCCTC






GACAAAGATTTTAAGATGAACATCGAAGCTAACCATGCCTCACTTGCAAATCACACCTTC






CAGCACGAGCTTTGTATCAGCCGCATAAACGGAATGCTTGGTTCTGTAGATGCAAATCAG






GGAAATCCAATTCTTGGCTGGGATACAGATAACTTCCCATGGAATGTTTACGATGCAACT






CTGGCAATGTACGAAGTTCTCAAGGCTGGCGGTCTAACAGGTGGCTTCAACTTTGACTCA






AAGAACCGTCGCCCATCAAATACTTTTGAAGATATGTTCCACGCTTATATCATGGGTATG






GATACTTTTGCCCTTGGCCTTATTAAGGCTGCAGAAATTATTGAAGACGGCAGAATTGAC






GGCTTCATCAAAGAAAAGTATTCAAGCTTTGAAAGTGGAATTGGTAAGAAGATTCGTGAC






AAGCAGACAAGTTTGGAAGAGCTTGCAGCTCGTGCCGCTGAAATGAAAAAGCCATCTGAT






CCAGGTTCAGGCCGCGAGGAATACCTCGAAGGAGTTGTTAACAACATCCTCTTTCGCGGA






TAA





1754MI22_004, Firmicutes, Amino Acid, SEQ ID NO: 46
MSEFFKNIPQIKYEGKDSKNPWAFKYYNPELTIMGKKMSEHLPFAMAWWHNLGANGVDMF


GQGTADKSFGQIPGTMEHAKAKVDAGIEFMKKLGIKYYCWHDVDLVPEDPNDINVTNKRL








DEISDYILEKTKGTDIKCLWGTANMFGNPRFMNGAGSTNSADVYCFAAAQVKKALEITVK








LGGRGYVFWGGREGYETLLNTDVKLEQENIANLMHMAVDYGRSIGFKGDFYIEPKPKEPM








SHQYDFDAATAIGFLRQYGLDKDFKMNIEANHASLANHTFQHELCISRINGMLGSVDANQ








GNPILGWDTDNFPWNVYDATLAMYEVLKAGGLTGGFNFDSKNRRPSNTFEDMFHAYIMGM








DTFALGLIKAAEIIEDGRIDGFIKEKYSSFESGIGKKIRDKQTSLEELAARAAEMKKPSD







PGSGREEYLEGVVNNILFRG





727MI1_002, Firmicutes, DNA, SEQ ID NO: 47
ATGATATTTGAAAATATTCCCGCAATTCCTTATGAGGGTCCGAAGAGCACAAATCCGCTG



GCGTTTAAATTCTATGATCCGGACAAGATCGTTATGGGAAAGCCCATGAAGGAGCATCTG






CCCTTTGCAATGGCCTGGTGGCACAACCTTGGCGCGGCCGGAACCGATATGTTCGGGCGC






GATACCGCCGACAAATCCTTCGGTGCGGTAAAAGGCACAATGGAGCATGCCAAAGCGAAA






GTCGATGCCGGCTTTGAGTTCATGCAGAAGCTGGGGATCCGCTATTTCTGCTTCCATGAT






GTGGATCTTGTTCCGGAGGCGGATGATATAAAGGAGACCAACCGCCGTCTGGACGAGATC






AGCGATTACATCCTTGAAAAGATGAAGGGCACCGATATCAAGTGCCTTTGGGGCACGGCC






AATATGTTCTCAAATCCGCGCTTTATGAACGGCGCAGGCTCCTCCAATTCTGCCGATGTA






TTCGCTTTTGCGGCAGCACAGGCCAAGAAGGCCTTGGATCTGACCGTCAAACTCGGCGGG






CGCGGCTATGTCTTCTGGGGCGGACGTGAGGGCTATGAGACACTTCTCAATACCGACATG






AAGTTCGAGCAGGAGAATATCGCGAAGCTCATGCATATGGCTGTCGATTACGGCCGCAGC






ATAGGCTTTACCGGTGATTTCTATATCGAGCCCAAACCGAAAGAGCCGATGAAACACCAG






TATGATTTCGATGCAGCCACTGCGATAGGCTTCCTCCGCCAGTACGGACTCGATAAGGAC






TTCAAGCTCAACATCGAGGCAAACCACGCCACACTGGCAGGTCACACTTTCCAGCACGAT






CTGCGTGTTTCCGCAATAAACGGAATGCTGGGCAGCATTGACGCCAACCAGGGCGATATG






CTCCTCGGCTGGGATACCGACGAGTTCCCGTTCAATGTATATGATGCGACCATGTGCATG






TATGAGGTGCTCAAGTCAGACGGGCTCACCGGCGGCTTTAACTTCGACTCCAAATCACGC






CGCCCGAGCTATACGGTCGAGGATATGTTTACAAGCTATATCCTCGGCATGGACACTTTT






GCCCTCGGCCTTCTGAAAGCGGCCGAGCTTATCGAAGACGGAAGGCTTGACGCCTTCGTC






AAAGAACGCTATTCAAGCTATGAGAGCGGCATCGGCGCAAAGATCCGCAGCGGAGAAACC






GATTTGAAGGAATTGGCGGAATATGCGGACTCCCTCGGAGCCCCCGAACTTCCGGGCAGC






GGAAAACAGGAACAGCTCGAGAGCATAGTAAATCAGATACTTTTCGGATAA





727MI1_002, Firmicutes, Amino Acid, SEQ ID NO: 48
MIFENIPAIPYEGPKSTNPLAFKFYDPDKIVMGKPMKEHLPFAMAWWHNLGAAGTDMFGR


DTADKSFGAVKGTMEHAKAKVDAGFEFMQKLGIRYFCFHDVDLVPEADDIKETNRRLDEI








SDYILEKMKGTDIKCLWGTANMFSNPRFMNGAGSSNSADVFAFAAAQAKKALDLTVKLGG








RGYVFWGGREGYETLLNTDMKFEQENIAKLMHMAVDYGRSIGFTGDFYIEPKPKEPMKHQ








YDFDAATAIGFLRQYGLDKDFKLNIEANHATLAGHTFQHDLRVSAINGMLGSIDANQGDM








LLGWDTDEFPFNVYDATMCMYEVLKSDGLTGGFNFDSKSRRPSYTVEDMFTSYILGMDTF








ALGLLKAAELIEDGRLDAFVKERYSSYESGIGAKIRSGETDLKELAEYADSLGAPELPGS







GKQEQLESIVNQILFG





727MI9_005, Firmicutes, DNA, SEQ ID NO: 49
ATGAGCGAGTTTTTTGCCAGCATTCCCAAAATTCCCTTTGAAGGCAAGGACAGCGCCAAT



CCCCTGGCGTTCAAATACTACGACGCCGACAGGATGATACTGGGCAAGCCCATGAAGGAG






CACCTTCCCTTCGCCATGGCCTGGTGGCACAACCTGTGCGCCGCGGGCACCGATATGTTT






GGCCGGGACACCGCCGACAAGTCCTTCGGCCAGGTCAAGGGCACCATGGAACACGCCAAG






GCCAAGGTGGACGCGGGCTTTGAGTTCATGAAGAAGCTGGGCATCCGCTACTTCTGCTTC






CACGACGTGGACATCGTGCCCGAAGCCGACGACATCAAGGAAACCAACCGCCGTCTGGAC






GAGATCTCCGACTATATCCTGGAGAAAATGAAAGGCACCGACATCCAGTGCCTGTGGGGC






ACCGCCAACATGTTCGGCAACCCCCGCTATATGAACGGCGCGGGCAGCTCCAACTCCGCC






GACGTATACTGCTTCGCCGCGGCCCAGATCAAAAAGGCCCTGGACATCACCGTGAAGCTG






GGCGGCAAGGGCTACGTGTTCTGGGGCGGCCGCGAGGGCTACGAGACCCTGCTGAACACC






GATATGAAGTTCGAGCAGGAGAACATCGCCCGCCTGATGCACATGGCCGTGGACTACGGC






CGCAGCATCGGCTTCACCGGCGATTTCTACATCGAGCCCAAGCCCAAGGAGCCCATGAAG






CACCAGTACGACTTCGACGCCGCCACCGCCATAGGCTTTTTGCGCCAGTACGGCCTGGAC






AAGGATTTCAAGCTGAACATCGAGTCCAACCACGCCACCCTGGCGGGCCATACCTTCCAG






CACGACCTGCGCGTTTCCGCCATCAACGGCATGCTGGGCTCCATCGACGCCAACCAGGGC






GACTACCTGCTGGGCTGGGATACCGACGAGTTCCCCTACAGCGTATACGAGACCACCATG






TGCATGTACGAGGTGCTCAAGGCCGGAGGTCTCACCGGCGGCTTCAATTTCGACGCCAAG






AACCGCCGTCCCAGCTACACCCCCGAGGATATGTTCCACGCCTACATCCTTGGGATGGAC






AGCTTCGCCCTGGGCCTGATCAAGGCCGCCGAGCTCATCGAGGACGGTCGCCTGGACGCC






TTCGTCCGGGACCGCTACCAGAGCTGGGAGACCGGCATCGGCGATAAGATCCGCAAGGGC






GAGACCACACTGGCCGAGCTGGCCGAGTACGCCGCCCGGATGGGCGCGCCCGCGCTGCCC






GGCAGCGGCCGCCAGGAATACCTGGAGGGCGTGGTCAACAATATCCTGTTCAAATAA





727MI9_005, Firmicutes, Amino Acid, SEQ ID NO: 50
MSEFFASIPKIPFEGKDSANPLAFKYYDADRMILGKPMKEHLPFAMAWWHNLCAAGTDMF


GRDTADKSFGQVKGTMEHAKAKVDAGFEFMKKLGIRYFCFHDVDIVPEADDIKETNRRLD








EISDYILEKMKGTDIQCLWGTANMFGNPRYMNGAGSSNSADVYCFAAAQIKKALDITVKL








GGKGYVFWGGREGYETLLNTDMKFEQENIARLMHMAVDYGRSIGFTGDFYIEPKPKEPMK








HQYDFDAATAIGFLRQYGLDKDFKLNIESNHATLAGHTFQHDLRVSAINGMLGSIDANQG








DYLLGWDTDEFPYSVYETTMCMYEVLKAGGLTGGFNFDAKNRRPSYTPEDMFHAYILGMD








SFALGLIKAAELIEDGRLDAFVRDRYQSWETGIGDKIRKGETTLAELAEYAARMGAPALP







GSGRQEYLEGVVNNILFK





727MI27_002, Firmicutes, DNA, SEQ ID NO: 51
ATGAAGACCTATTTCAAAAAAATCCCCGTGATCCCCTACGAGGGACCGAAGTCCCAGAAT



CCGCTGTCGTTCAAATTCTATGACGCGGACCGCATCGTTCTCGGCAAGCCCATGAAGGAG






CATCTGCCCTTCGCCATGGCCTGGTGGCACAATCTGGGTGCTGCCGGAACGGACATGTTC






GGCCGCGATACCGCCGACAAGTCCTTCGGAGCGGAGAAGGGCACCATGGAGCATGCCAAG






GCCAAGGTGGACGCTGGCTTCGAGTTTATGAAGAAGGTGGGCATCCGGTATTTCTGCTTC






CATGACGTGGATCTGGTCCCGGAAGCGGACGACATCAAGGAGACCAACCGCCGTCTCGAT






GAGATCAGCGACTACATCCTCAAGAAGATGAAGGGCACGGATATCAAGTGCCTCTGGGGC






ACCGCCAACATGTTCGGCAATCCCCGGTTCATGAACGGCGCGGGCAGCTCCAACAGCGCG






GACGTGTTCTGCTTTGCCGCGGCCCAGGTGAAGAAGGCCTTGGACATCACCGTCAAGCTG






GGCGGCCGGGGCTATGTGTTCTGGGGCGGCCGTGAGGGGTATGAGTCCCTGCTGAACACG






GACGTGAAGTTTGAGCAGGAGAACATCGCCAAGCTCATGCACCTTGCCGTGGACTACGGC






CGCAGCATCGGCTTCACCGGCGATTTCTACATCGAGCCCAAGCCCAAGGAGCCCATGAAG






CACCAGTACGACTTCGATGCCGCCACCGCCATCGGCTTCCTCAGGCAGTACGGCCTCGAT






AAGGACTTCAAGATGAACATTGAAGCCAACCACGCGACCCTGGCCGGCCACACCTTCCAG






CACGACCTCAGGATCAGCGCCATCAACGGGATGCTGGGCTCCATCGACGCCAACCAGGGC






GACCTCCTGCTGGGATGGGACACCGACGAATTCCCCTTCAACGTCTATGAGGCCACCATG






TGCATGTACGAGGTCCTCAAGGCCGGCGGCCTCACCGGCGGCTTCAACTTCGACTCAAAG






AACCGCCGTCCCTCCTACACCATGGAGGATATGTTCCACGCCTACATCCTGGGCATGGAC






ACCTTCGCCCTGGGTCTTCTCAAGGCCGCGGAGCTCATCGAGGACGGTCGGATCGACAAA






TTCGTGGAGGAGCGCTACGCCAGCTACAAGACCGGCATCGGCGCCAAGATCCGTTCCGGC






GAGACCACGCTTCAGGAGCTGGCCGCCTATGCCGACAAGTTGGGCGCGCCTGCCCTTCCC






GGCAGCGGCCGTCAGGAGTACCTGGAGAGCATCGTCAACCAGGTGCTCTTCGGGATGTGA





727MI27_002, Firmicutes, Amino Acid, SEQ ID NO: 52
MKTYFKKIPVIPYEGPKSQNPLSFKFYDADRIVLGKPMKEHLPFAMAWWHNLGAAGTDMF


GRDTADKSFGAEKGTMEHAKAKVDAGFEFMKKVGIRYFCFHDVDLVPEADDIKETNRRLD








EISDYILKKMKGTDIKCLWGTANMFGNPRFMNGAGSSNSADVFCFAAAQVKKALDITVKL








GGRGYVFWGGREGYESLLNTDVKFEQENIAKLMHLAVDYGRSIGFTGDFYIEPKPKEPMK








HQYDFDAATAIGFLRQYGLDKDFKMNIEANHATLAGHTFQHDLRISAINGMLGSIDANQG








DLLLGWDTDEFPFNVYEATMCMYEVLKAGGLTGGFNFDSKNRRPSYTMEDMFHAYILGMD








TFALGLLKAAELIEDGRIDKFVEERYASYKTGIGAKIRSGETTLQELAAYADKLGAPALP







GSGRQEYLESIVNQVLFGM





1753MI2_006, Neocallimastigales, DNA, SEQ ID NO: 53
ATGGCTAAAGAGTATTTTCCAGAGATTGGCAAAATCAAGTTTGAAGGCAAGGACAGCAAA



AACCCAATGGCTTTCCACTACTATGACCCCGAGAAGGTGATCATGGGCAAGCCTATGAAA






GACTGGCTCCGCTTCGCTATGGCATGGTGGCACACCCTCTGCGCAGAAGGTGGCGACCAG






TTCGGTGGCGGCACTAAGAAGTTCCCTTGGAACAACGGCGCTGACGCTGTAGAAATCGCA






AAACAGAAGGCTGACGCAGGTTTCGAAATCATGCAGAAGCTCGGCATCCCATATTTCTGC






TTCCACGACGTGGACCTCGTGTCTGAGGGCGCATCTGTAGAAGAGTATGAGGCTAACCTC






AAGGCTATCACAGACTACCTCGCTGTGAAGATGAAGGAAACAGGCATCAAGCTCCTGTGG






TCTACTGCCAACGTATTCGGCAACGGCCGCTACATGAACGGTGCTTCTACCAACCCTGAC






TTCGACGTCGTTGCTCGCGCTATCGTGCAGATTAAGAACGCTATCGACGCTGGTATCAAG






CTCGGCGCTGAGAACTACGTGTTCTGGGGCGGACGCGAAGGCTACATGAGCCTCCTCAAC






ACCGACCAGAAGCGTGAGAAGGAGCACATGGCCACTATGCTCACTATGGCTCGCGACTAC






GCTCGCGCTAAGGGCTTCAAGGGCACATTCCTCATCGAGCCTAAGCCAATGGAGCCTTCT






AAGCACCAGTATGACGTTGACACTGAGACTGTCATCGGCTTCCTCAAGGCACACAACCTC






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCAACTCTCGCTGGCCACACCTTC






GAGCACGAGCTCGCAGTGGCAGTGGACAACAACATGCTCGGCTCTATCGACGCTAACCGT






GGTGACTACCAGAATGGCTGGGATACTGACCAGTTCCCAATCGACCAGTACGAACTCGTT






CAGGCTTGGATGGAAATCATCCGTGGCGGCGGTCTCGGCACTGGCGGCACGAACTTCGAC






GCTAAGACTCGTCGTAACTCTACCGACCTCGAAGACATCTTCATCGCACACATCGCAGGC






ATGGACGCTATGGCACGCGCACTCGAATCAGCTGCTAAGCTCCTCGAAGAGTCTCCATAC






AAGGCAATGAAGGCAGCTCGCTACGCTTCATTCGACAACGGTATCGGTAAGGACTTCGAA






GATGGCAAGCTCACTCTCGAGCAGGCTTACGAATACGGTAAGAAGGTTGGTGAGCCTAAG






CAGACTTCTGGCAAGCAGGAGCTCTACGAAGCCATCGTTGCAATGTACGCTTAA





1753MI2_006, Neocallimastigales, Amino Acid, SEQ ID NO: 54
MAKEYFPEIGKIKFEGKDSKNPMAFHYYDPEKVIMGKPMKDWLRFAMAWWHTLCAEGGDQ


FGGGTKKFPWNNGADAVEIAKQKADAGFEIMQKLGIPYFCFHDVDLVSEGASVEEYEANL








KAITDYLAVKMKETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIK








LGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEPS








KHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNNMLGSIDANR








GDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHIAG








MDAMARALESAAKLLEESPYKAMKAARYASFDNGIGKDFEDGKLTLEQAYEYGKKVGEPK







QTSGKQELYEAIVAMYA





5586MI3_005, Neocallimastigales, DNA, SEQ ID NO: 55
ATGGCTAAAGAATTTTTCCCAGAGATTGGTAAAATCAAGTTCGAAGGCAAGGATTCAAAG



AATCCAATGGCTTTCCATTACTATGATGCAGAGAAGGTAATCATGGGCAAACCCATGAAG






GACTGGCTCCGTTTCGCTATGGCATGGTGGCACACACTCTGTGCAGAGGGCGGCGACCAG






TTCGGTGGCGGTACGAAGAAGTTCCCTTGGAACGAGGGTGCTAATGCTGTCGAGATTGCT






AAGCAGAAGGCTGACGCTGGTTTCGAAATCATGCAGAAGCTTGGCATTCCTTACTTCTGC






TTCCACGATGTTGACCTCGTTTCTGAAGGCGCATCTGTTGAGGAGTATGAGGCCAACCTC






AAGGCTATCACTGACTATCTCGCGGTGAAGATGAAGGAGACTGGCATTAAGCTCCTGTGG






TCTACTGCCAACGTGTTCGGCAATGGCCGTTACATGAATGGTGCTTCCACCAACCCTGAC






TTCGACGTTGTTGCTCGCGCCATCGTTCAGATTAAGAACGCTATCGATGCAGGTATCAAG






CTCGGTGCTGAGAACTATGTGTTCTGGGGCGGTCGTGAAGGTTACATGAGCCTCCTGAAC






ACAGACCAGAAGCGTGAGAAGGAGCACATGGCTACTATGCTCACTATGGCTCGCGACTAC






GCTCGCAGCAAGGGCTTCAAGGGTACTTTCCTCATCGAGCCTAAGCCAATGGAGCCATCT






AAGCACCAGTACGACGTTGACACAGAGACTGTTATCGGCTTCCTGAAGGCACACAACCTT






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCAACACTCGCTGGTCACACCTTC






GAGCACGAGCTCGCTGTGGCTGTCGACAACAATATGCTTGGTTCTATCGATGCTAACCGC






GGTGACTACCAGAATGGTTGGGATACGGACCAGTTCCCAATTGACCAGTACGAGCTCGTT






CAGGCTTGGATGGAGATCATCCGTGGTGGCGGTCTCGGCACAGGTGGTACAAACTTCGAC






GCTAAGACTCGTCGTAACTCTACCGACCTCGAGGACATTTTCATTGCTCACATCGCTGGT






ATGGACGCTATGGCTCGCGCTCTTGAGTCAGCAGCTAAGCTCCTTGAGGAGTCTCCATAC






AAGAAGATGAAGGCTGCCCGTTATGCTTCTTTCGACAGCGGCATGGGTAAGGACTTTGAG






AACGGCAAGCTCACACTCGAACAGGTTTATGAGTATGGTAAGAAGGTAGGTGAGCCCAAG






CAGACTTCTGGCAAGCAGGAGCTCTTCGAGGCAATCGTGGCCATGTACGCATAA





5586MI3_005, Neocallimastigales, Amino Acid, SEQ ID NO: 56
MAKEFFPEIGKIKFEGKDSKNPMAFHYYDAEKVIMGKPMKDWLRFAMAWWHTLCAEGGDQ


FGGGTKKFPWNEGANAVEIAKQKADAGFEIMQKLGIPYFCFHDVDLVSEGASVEEYEANL








KAITDYLAVKMKETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIK








LGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPS








KHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNNMLGSIDANR








GDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHIAG








MDAMARALESAAKLLEESPYKKMKAARYASFDSGMGKDFENGKLTLEQVYEYGKKVGEPK







QTSGKQELFEAIVAMYA





5586MI91_002, Neocallimastigales, DNA, SEQ ID NO: 57
ATGGCTAAAGAGTATTTTCCAGAGATTGGTAAAATCAAGTTTGAAGGCAAGGATTCCAAG



AATCCAATGGCATTCCACTATTATGATGCAGAGAAAGTGATTATGGGTAAGCCTATGAAG






GAGTGGCTCCGCTTTGCAATGGCATGGTGGCACACACTCTGTGCAGAGGGTGGCGACCAG






TTTGGTGGTGGCACTAAGAAATTCCCATGGAACGAGGGCACTGACGCTGTGACGATTGCT






AAGCAGAAGGCTGATGCAGGTTTCGAAATCATGCAGAAACTCGGTTTCCCATATTTTTGC






TTCCACGACATTGACCTCGTTTCCGAAGGCAACAGCATTGAAGAGTATGAGGCTAACCTC






CAGGCAATCACTGATTATCTGAAAGTGAAGATGGAAGAGACAGGCATCAAACTCTTGTGG






TCAACTGCCAACGTATTCGGCAATGGTCGCTACATGAATGGTGCTTCCACAAACCCAGAC






TTTGACGTGGTGGCTCGTGCCATCGTTCAGATTAAGAACGCAATTGACGCTGGTATCAAA






CTCGGTGCTGAGAACTATGTATTCTGGGGCGGTCGCGAAGGCTACATGAGCCTTCTGAAC






ACTGACCAGAAGCGTGAGAAGGAGCACATGGCAACCATGCTCACTATGGCTCGCGACTAC






GCTCGCAGCAAGGGTTTCAAGGGCACTTTCCTCATTGAGCCAAAGCCAATGGAGCCATCT






AAGCACCAGTATGACGTTGACACGGAGACTGTCATCGGCTTCCTCAAGGCACACAACCTC






GACAAGGATTTCAAGGTGAACATCGAAGTGAACCACGCTACACTTGCAGGTCATACTTTC






GAGCACGAACTTGCTGTGGCTGTTGACAATGGCATGCTCGGTTCTATCGACGCTAACCGT






GGTGACTATCAGAACGGTTGGGACACTGACCAGTTCCCAATCGACCAGTACGAACTCGTT






CAGGCTTGGATGGAAATCATCCGTGGTGGTGGTCTCGGCACAGGTGGTACTAACTTCGAT






GCTAAGACTCGTCGTAACTCAACTGACCTCGAGGACATCTTCATCGCACACATCTCTGGT






ATGGATGCAATGGCACGTGCTCTCGAATCGGCGGCTAAACTTCTTGAGGAGTCTCCATAC






TGCGCTATGAAGAAGGCTCGTTACGCTTCCTTCGACAGCGGCATCGGTAAGGACTTCGAG






GACGGCAAACTCACGCTCGAGCAGGCTTACGAGTACGGCAAGAAAGTCGGCGAACCCAAG






CAGACTTCTGGCAAGCAGGAACTCTACGAGGCAATCGTTGCCATGTACGCATAA





5586MI91_002, Neocallimastigales, Amino Acid, SEQ ID NO: 58
MAKEYFPEIGKIKFEGKDSKNPMAFHYYDAEKVIMGKPMKEWLRFAMAWWHTLCAEGGDQ


FGGGTKKFPWNEGTDAVTIAKQKADAGFEIMQKLGFPYFCFHDIDLVSEGNSIEEYEANL








QAITDYLKVKMEETGIKLLWSTANVFGNGRYMNGASTNPDFDVVARAIVQIKNAIDAGIK








LGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPS








KHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANR








GDYQNGWDTDQFPIDQYELVQAWMEIIRGGGLGTGGTNFDAKTRRNSTDLEDIFIAHISG








MDAMARALESAAKLLEESPYCAMKKARYASFDSGIGKDFEDGKLTLEQAYEYGKKVGEPK







QTSGKQELYEAIVAMYA





5586MI194_003, Neocallimastigales, DNA, SEQ ID NO: 59
ATGGCAAAAGAGTATTTCCCTACGATCGGTAAGATCGTTTATGAAGGACCGGAGTCCAAG



AACCCTATGGCATTTCATTACTATGACGCAGAGCGCGTAGTAGCTGGTAAAAAAATGAAA






GATTGGATGCGTTTCGCTATGGCATGGTGGCACACCCTCTGTGCAGAAGGTGCAGACCAG






TTCGGTGGAGGCACCAAACACTTCCCGTGGAGTGAAGGTCCCGATGCCGTAACCATCGCC






AAGCAGAAAGCAGACGCAGGTTTTGAGATCATGCAGAAACTCGGCTTCCCGTATTTCTGT






TTCCATGACGTGGATCTGGTCAGCGAAGGCAGCAGCGTAGAAGAGTACGAGGCGAACCTC






GCAGCCATCACCGATTATCTCAAGCAGAAAATGGACGAGTCGGGTATCAAACTCCTTTGG






TCCACTGCTAACGTATTCGGTCACGCCCGTTACATGAACGGTGCCAGCACCAATCCTGAC






TTTGATGTCGTTGCCCGTGCGATTGTGCAGATCAAGAATGCTATCGACGCAGGTATCAAA






CTCGGCGCAGAGAACTACGTCTTCTGGGGCGGTCGTGAAGGTTATATGAGCCTGCTCAAT






ACCGACCAGAAACGCGAGAAAGAGCATACGGCAATGATGCTGCGTATGGCGCGTGACTAT






GCCCGCAGCAAAGGTTTCAAAGGTACCTTCCTCATCGAACCCAAACCCATGGAGCCGTCC






AAGCACCAGTATGACGTAGATACCGAGACGGTGATAGGTTTCCTCAAAGCACACGGTTTG






GAGAAAGACTTTAAGGTAAACATCGAAGTGAACCACGCTACCCTCGCCGGTCACACTTTC






GAGCACGAACTGGCAGTAGCCGTAGATAACGGCATGCTCGGTTCGATCGATGCCAACCGC






GGTGACTATCAGAACGGATGGGATACCGACCAGTTCCCCATCGATAACTTCGAACTGACC






CAAGCATGGATGCAGATCGTACGTAACGGTGGTCTCGGCACAGGCGGAACGAACTTCGAC






TCCAAGACCCGTCGTAACTCCACCGATCTCGAGGATATCTTCATCGCTCACATCAGTGGT






ATGGACGCTTGTGCCCGTGCCCTATTGAATGCCGTAGAGATCATGGAGAAATCACCGATC






CCTGCTATGCTCAAAGAGCGTTACGCTTCCTTCGATAGCGGTCTGGGTAAAGATTTCGAG






GACGGCAAACTGACCCTTGAGCAAGTCTATGAGTACGGTAAGAAAGTAGGCGAACCCAAA






CAAACCAGCGGCAAACAAGAACTCTATGAGGCTATCGTTGCCCTCTACGCTAAATAA





5586MI194_003, Neocallimastigales, Amino Acid, SEQ ID NO: 60
MAKEYFPTIGKIVYEGPESKNPMAFHYYDAERVVAGKKMKDWMRFAMAWWHTLCAEGADQ


FGGGTKHFPWSEGPDAVTIAKQKADAGFEIMQKLGFPYFCFHDVDLVSEGSSVEEYEANL








AAITDYLKQKMDESGIKLLWSTANVFGHARYMNGASTNPDFDVVARAIVQIKNAIDAGIK








LGAENYVFWGGREGYMSLLNTDQKREKEHTAMMLRMARDYARSKGFKGTFLIEPKPMEPS








KHQYDVDTETVIGFLKAHGLEKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANR








GDYQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDSKTRRNSTDLEDIFIAHISG








MDACARALLNAVEIMEKSPIPAMLKERYASFDSGLGKDFEDGKLTLEQVYEYGKKVGEPK







QTSGKQELYEAIVALYAK





5586MI198_003, Neocallimastigales, DNA, SEQ ID NO: 61
ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC



CCGATGGCATTTCACTACTATGACGCAGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAG






TGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCGGAAGGCGGCGACCAGTTC






GGAGGCGGAACGAAGAAGTTCCCCTGGAACGAAGGCGCTAACGCTTTGGAGATCGCCAAG






CACAAAGCCGATGCGGGATTTGAGATCATGCAGAAACTCGGCATCCCTTATTTCTGTTTC






CATGACGTGGATCTCATCGCCGAGGGCGGTTCGGTAGAAGAGTACGAAGCCAACCTCGCT






GCCATCACCGATTACCTCAAACAGAAAATGGACGAGACTGGCATCAAACTGCTGTGGTCC






ACGGCGAACGTCTTCAGCAACCCCCGTTATATGAACGGCGCCAGCACGAACCCCGATTTC






GATGTAGTAGCGCGTGCCATCGTCCAGATCAAGAACGCTATCGACGCCGGTATCAAACTC






GGAGCAGAGAACTATGTCTTCTGGGGTGGTCGCGAGGGCTATATGAGCCTCCTCAACACT






GACCAGCGCCGAGAGAAAGAGCATATGGCTACCATGCTCCGTATGGCGCGTGACTACGCG






CGTGCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCATGTGAGCCGTCCAAA






CATCAGTATGATGTCGATACCGAGACCGTCATCGGTTTCCTCAAAGCGCATGGACTCGAC






AAGGATTTCAAAGTCAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAA






CACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGT






GACGCCCAGAACGGATGGGACACCGACCAGTTCCCTATTGATAACTTCGAACTCACACAG






GCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGYACGAACTTCGACGCC






AAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATG






GACGCTTGCGCACGTGCGTTACTCAATGCTGTCGAAATCCTCGAGAAGAGCCCGATTCCG






GCGATGCTCAAAGAGCGTTATGCTTCCTTTGACGGCGGCATCGGAAAGGACTTCGAGGAG






GGAAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAAGTCGGCGAACCCAAACAG






ACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG





5586MI198_003, Neocallimastigales, Amino Acid, SEQ ID NO: 62
MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF


GGGTKKFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYEANLA








AITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSK








HQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACARALLNAVEILEKSPIPAMLKERYASFDGGIGKDFEEGKLTFEQVYEYGKKVGEPKQ







TSGKQELYETIVALYAK





5586MI201_003, Neocallimastigales, DNA, SEQ ID NO: 63
ATGGCAAAAGAGTATTTCCCTACGATCGGTAAGATCGTTTATGAAGGACCGGAATCCAAG



AACCCTATGGCATTTCATTACTATGACGCAGAGCGCGTAGTAGCTGGTAAAAAAATGAAA






GATTGGATGCGTTTCGCTATGGCATGGTGGCACACCCYCTGTGCAGAAGGTGCAGACCAG






TTCGGTGGAGGCACCAAACACTTCCCGTGGAATGAAGGTCCCGATGCCGTAACCATCGCC






AAGCAGAAAGCAGACGCAGGTTTTGAGATCATGCAGAAACTCGGCTTCCCGTATTTCTGT






TTCCATGACGTGGATCTGGTCGGCGAAGGCAGCAGCGTAGAAGAGTACGAGGCGAACCTC






GCAGCCATCACCGATTATCTCAAGCAGAAAATGGACGAGTCGGGTATCAAACTCCTTTGG






TCCACTGCTAACGTATTCGGTCACGCCCGTTACATGAACGGTGCCAGCACCAATCCTGAC






TTTGATGTCGTTGCCCGTGCGATTGTGCAGATCAAGAATGCTATCGACGCAGGTATCAAA






CTCGGCGCAGAGAACTACGTCTTCTGGGGCGGTCGTGAAGGTTATATGAGCCTGCTCAAC






ACCGACCAGAAACGCGAGAAAGAGCATACGGCAATGATGCTGCGTATGGCGCGTGACTAT






GCCCGCAGCAAAGGTTTCAAAGGTACCTTCCTCATCGAACCCAAACCCATGGAGCCGTCC






AAGCACCAGTATGACGTAGATACCGAGACGGTGATAGGTTTCCTCAAAGCACACGGTTTG






GAGAAAGACTTTAAGGTAAACATCGAAGTGAACCACGCTACCCTCGCCGGTCACACTTTC






GAGCACGAACTGGCAGTAGCCGTAGATAACGGCATGCTCGGTTCGATCGATGCCAACCGC






GGTGACTATCAGAACGGATGGGATACCGACCAGTTCCCCATCGATAACTTCGAACTGACC






CAAGCATGGATGCAGATCGTACGTAACGGTGGTCTCGGCACAGGCGGAACGAACTTCGAC






TCCAAGACCCGTCGTAACTCCACCGATCTCGAGGATATCTTCATCGCTCACATCAGTGGT






ATGGACGCTTGTGCCCGTGCCCTATTGAATGCCGTAGAGATCATGGAGAAATCACCGATC






CCTGCTATGCTCAAAGAGCGTTACGCTTCCTTCGATAGCGGTCTGGGTAAAGATTTCGAG






GACGGCAAACTGACCCTTGAGCAAGTCTATGAGTACGGTAAGAAAGTAGGCGAACCCAAA






CAAACCAGCGGCAAACAAGAACTCTATGAGGCTATCGTTGCCCTCTACGCTAAATAA





5586MI201_003, Neocallimastigales, Amino Acid, SEQ ID NO: 64
MAKEYFPTIGKIVYEGPESKNPMAFHYYDAERVVAGKKMKDWMRFAMAWWHTLCAEGADQ


FGGGTKHFPWNEGPDAVTIAKQKADAGFEIMQKLGFPYFCFHDVDLVGEGSSVEEYEANL








AAITDYLKQKMDESGIKLLWSTANVFGHARYMNGASTNPDFDVVARAIVQIKNAIDAGIK








LGAENYVFWGGREGYMSLLNTDQKREKEHTAMMLRMARDYARSKGFKGTFLIEPKPMEPS








KHQYDVDTETVIGFLKAHGLEKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDANR








GDYQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDSKTRRNSTDLEDIFIAHISG








MDACARALLNAVEIMEKSPIPAMLKERYASFDSGLGKDFEDGKLTLEQVYEYGKKVGEPK







QTSGKQELYEAIVALYAK





5586MI204_002, Neocallimastigales, DNA, SEQ ID NO: 65
ATGAAAGAGTATTTCCCTGAGGTCGGTAAGATCCAATTTGAAGGCCCGGAGTCTAAGAAC



CCGATGGCATTTCACTACTATGACGCAGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAG






TGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCAGAAGGCGGCGACCAGTTC






GGAGGCGGAACGAAGCATTTCCCGTGGAATGAAGGCGCTAACGCTTTGGAGATCGCCAAA






CACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTC






CATGACGTGGATCTCATCGCCGAGGGCGGTTCGGTAGAAGAGTACGAAACCAACCTCGCT






GCTATCACCGACTACCTCAAGCAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCC






ACGGCGAACGTGTTCAGCAACCCCCGTTATATGAACGGCGCGAGCACGAACCCCGATTTC






GATGTAGTAGCGCGTGCCATCGTGCAGATCAAGAATGCCATCGACGCCGGCATCAAACTG






GGCGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTACATGAGCCTGCTCAACACC






GACCAGCGCCGCGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCG






CGTGCCAAAGGATTCAAGGGCACCTTTCTCATCGAACCCAAACCGTGTGAGCCGTCCAAA






CATCAGTATGATGTCGATACCGAGACCGTCATCGGTTTCCTCAAAGCGCATGGACTCGAC






AAGGATTTCAAGGTTAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAA






CACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGT






GACGCCCAGAACGGATGGGACACCGACCAGTTCCCTATTGATAACTTCGAACTCACACAG






GCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCC






AAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATG






GACGCTTGCGCACGTGCGTTGCTCAACGCCATCGAAATCCTCGAGAAGAGCCCGATCCCG






GCTATGCTCAAAGACCGTTATGCCTCCTTTGATGGCGGCATCGGAAAGGACTTTGAGGAG






GGCAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAG






ACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG





5586MI204_002, Neocallimastigales, Amino Acid, SEQ ID NO: 66
MKEYFPEVGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF


GGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYETNLA








AITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSK








HQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACARALLNAIEILEKSPIPAMLKDRYASFDGGIGKDFEEGKLTFEQVYEYGKKVGEPKQ







TSGKQELYETIVALYAK





5586MI207_002, Neocallimastigales, DNA, SEQ ID NO: 67
ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC



CCGATGGCGTTTCACTACTATGACGCTGAGCGCGTCGTAGCCGGTAAAACAATGAAAGAG






TGGATGCGTTTCGCTATGGCTTGGTGGCACACCCTCTGTGCGGAAGGCGGCGACCAGTTC






GGAGGAGGAACGAAGAAATTCCCCTGGAACGAAGGGGCAAACGCTTTGGAGATCGCCAAG






CACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCTTATTTCTGTTTC






CATGACGTGGATCTCATCGCCGAGGGCGAATCGGTAGAAGAGTACGAAGCCAACCTCGCT






GCCATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCC






ACGGCGAACGTGTTCAGCAACCCCCGTTATATGAACGGCGCCAGCACGAACCCCGATTTC






GATGTAGTGGCACGCGCTATCGTACAAATCAAGAACGCTATCGACGCCGGTATCAAACTC






GGAGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTATATGTCGCTCCTCAACACC






GACCAGCGCCGAGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCG






CGTTCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCGTGTGAGCCGTCCAAA






CATCAGTACGATGTGGACACAGAGACCGTCATCGGTTTCCTTAAAGCGCATGGACTCGAC






AAGGATTTCAAAGTCAATATCGAGGTCAACCACGCCACCCTCGCAGGCCACACGTTCGAA






CACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGT






GACGCCCAGAACGGATGGGACACCGACCAATTCCCTATTGATAACTTCGAACTCACTCAG






GCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCC






AAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATG






GACGCTTGCGCTCGTGCGTTGCTCAATGCTGTCGAAATCCTCGAGAAGAGCCCGATCCCG






GCTATGCTCAAAGAGCGTTATGCTTCCTTTGACGGCGGCATCGGAAAGGACTTTGAGGAG






GGCAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAG






ACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATGA





5586MI207_002, Neocallimastigales, Amino Acid, SEQ ID NO: 68
MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF


GGGTKKFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGESVEEYEANLA








AITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARSKGFKGTFLIEPKPCEPSK








HQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACARALLNAVEILEKSPIPAMLKERYASFDGGIGKDFEEGKLTFEQVYEYGKKVGEPKQ







TSGKQELYETIVALYAK





5586MI209_003, Neocallimastigales, DNA, SEQ ID NO: 69
ATGAAAGAGTATTTCCCTGAGATCGGTAAGATCCAATTTGAAGGCCCGGAGTCCAAGAAC


CCGATGGCGTTTCACTACTATGACGCAGAGCGCGTAGTAGCCGGTAAAACAATGAAAGAA






TGGATGCGTTTCGCCATGGCATGGTGGCACACCCTCTGTGCAGAAGGCGGCGACCAGTTC






GGAGGAGGAACGAAGCATTTCCCGTGGAATGAAGGCGCTAACGCTTTGGAGATCGCCAAA






CACAAAGCCGATGCGGGATTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTC






CATGACGTGGATCTCATCGCCGAGGGCGATTCGGTGGAGGAGTACGAAGCTAACCCCGCT






GCCATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCC






ACGGCGAACGTCTTCAGCAACCCCCGTTACATGAACGGTGCGAGCACGAACCCGGATTTC






GATGTAGTGGCACGCGCTATCGTACAAATCAAGAACGCTATCGACGCCGGTATCAAACTC






GGAGCAGAGAACTATGTCTTCTGGGGCGGTCGCGAGGGCTATATGTCGCTCCTCAACACC






GACCAGCGTCGCGAGAAAGAGCATATGGCTACTATGCTCCGTATGGCGCGTGACTACGCG






CGTGCCAAAGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCATGTGAGCCGTCCAAA






CATCAGTACGATGTGGACACAGAGACTGTCATCGGTTTCCTCAAAGCGCATGGACTCGAC






AAGGATTTCAAAGTCAACATCGAGGTCAACCACGCCACCCTCGCAGGTCACACGTTCGAA






CACGAACTGGCTTGCGCTGTAGATGCCGGCATGCTCGGTTCGATTGACGCCAACCGCGGT






GACGCCCAGAACGGATGGGACACTGACCAGTTCCCTATTGATAACTTCGAACTCACACAG






GCTTTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCC






AAGACACGCCGTAACTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATG






GACGCTTGTGTCCGTGCGTTGCTCAACGCCATCGAAATCCTCGAGAAGAGCCCGATCCCG






GCTATGCTCAAAGAGCGTTACGCTTCCTTTGACGGCGGCATCGGAAAGGACTTTGAGGAT






GGTAAACTGACTTTCGAGCAGGTCTATGAGTACGGCAAGAAGGTCGGAGAACCCAAACAG






ACCAGCGGCAAACAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAGTAA





5586MI209_003, Neocallimastigales, Amino Acid, SEQ ID NO: 70
MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGGDQF


GGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGDSVEEYEANPA








AITDYLKQKMDETGIKLLWSTANVFSNPRYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQRREKEHMATMLRMARDYARAKGFKGTFLIEPKPCEPSK








HQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACVRALLNAIEILEKSPIPAMLKERYASFDGGIGKDFEDGKLTFEQVYEYGKKVGEPKQ







TSGKQELYETIVALYAK





5586MI214_002, Neocallimastigales, DNA, SEQ ID NO: 71
ATGAAAGAGTATTTCCCTGAGATCGGAAAGATCCAATTCGAAGGCCCGGAGTCCAAGAAT



CCTATGGCATTTCACTACTATGACGCAGAGCGTGTAGTAGCCGGTAAAACAATGAAAGAG






TGGATGCGTTTCGCTTTGGCATGGTGGCACACGCTCTGCGCAGAAGGCGGCGACCAGTTC






GGAGGCGGCACGAAGCATTTCCCTTGGAATGAAGGTGCAAACGCTTTGGAGATCGCCAAG






CACAAAGCCGATGCAGGCTTCGAGATCATGCAGAAACTCGGCATCCCCTATTTCTGTTTC






CATGACGTGGATCTGATCGCCGAGGGCGGTTCGGTAGAAGAGTATGAAGCTAATTTAACG






GCTATCACCGATTACCTCAAACAGAAAATGGACGAGACCGGCATCAAACTGCTGTGGTCC






ACTGCGAACGTGTTCGGTAACGCACGTTATATGAACGGCGCGAGCACGAACCCCGATTTC






GATGTAGTGGCACGCGCTATCGTGCAGATCAAGAACGCTATCGACGCCGGCATCAAACTG






GGCGCAGAGAACTACGTCTTCTGGGGCGGTCGCGAGGGATATATGTCGCTCCTGAACACC






GACCAGAAGCGTGAGAAAGAGCATATGGCTACCATGCTCCGTATGGCGCGTGACTACGCG






CGTTCCAAAGGATTCAAAGGTACGTTCCTCATCGAGCCCAAACCGTGTGAGCCGTCCAAA






CATCAGTACGACGTGGACACTGAGACCGTCATCGGTTTCCTCAAAGCCCATGGTCTCGGC






AAGGATTTCAAAGTGAACATCGAGGTGAATCACGCCACCCTCGCAGGGCACACGTTCGAA






CACGAACTGGCTTGCGCCGTAGATGCCGGCATGCTCGGTTCGATCGACGCCAACCGCGGT






GACGCACAAAACGGATGGGACACCGACCAGTTCCCTATTGATAATTTCGAACTCACCCAG






GCATTCATGCAGATCGTCCGCAACGGCGGTTTCGGAACAGGCGGTACGAACTTCGACGCC






AAGACACGCCGTAATTCCACCGACTTGGAGGACATCTTCATCGCCCATATCAGCGGCATG






GACGCTTGTGCCCGTGCGTTGCTCAATGCTGTCGAAATCCTTGAAAAGAGCCCGATCCCG






GCGATGCTCAAAGAGCGTTACGCCTCCTTTGACAGCGGTATGGGTAAGGACTTTGAGGAG






GGCAAGCTGACCTTCGAGCAGGTCTATGAGTACGGCAAACAGGTCGGCGAACCCAAACAG






ACCAGCGGCAAGCAGGAGCTCTACGAAACCATCGTCGCCCTCTATGCCAAATAG





5586MI214_002, Neocallimastigales, Amino Acid, SEQ ID NO: 72
MKEYFPEIGKIQFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFALAWWHTLCAEGGDQF


GGGTKHFPWNEGANALEIAKHKADAGFEIMQKLGIPYFCFHDVDLIAEGGSVEEYEANLT








AITDYLKQKMDETGIKLLWSTANVFGNARYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQKREKEHMATMLRMARDYARSKGFKGTFLIEPKPCEPSK








HQYDVDTETVIGFLKAHGLGKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAFMQIVRNGGFGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACARALLNAVEILEKSPIPAMLKERYASFDSGMGKDFEEGKLTFEQVYEYGKQVGEPKQ







TSGKQELYETIVALYAK





5751MI3_001, Neocallimastigales, DNA, SEQ ID NO: 73
ATGAAAGAGTATTTTCCACAAATCGGCAAGATCCCATTTGAGGGACCAGAGTCAAAGAAC



CCAATGGCATTCCACTACTATGACGCAGAGCGCGTAGTTGCCGGTAAGACAATGAAGGAA






TGGATGCGTTTCGCTATGGCCTGGTGGCACACTCTCTGTGCTGAGGGTAGCGATCAGTTC






GGCCCTGGTACAAAGAAGTTCCCTTGGAACGAGGGCGAGACAGCCCTTGAGCGCGCTAAG






CACAAGGCAGATGCTGGCTTCGAGGTTATGCAGAAGCTCGGCATCCCATATTTCTGCTTC






CACGATGTAGACCTTATCGACGAGGGTGCTAACGTGGCTGAGTATGAGGCAAACCTCGCT






GCTATCACTGACTACCTGAAGGAGAAGATGGAGGAGACTGGCGTAAAGCTCCTCTGGTCT






ACAGCCAACGTGTTCGGTAACGCTCGCTATATGAACGGTGCTTCTACAAATCCTGACTTC






GACGTTGTGGCTCGTGCCATCGTACAGATTAAGAACGCTATCGACGCTGGTATCAAGCTT






GGTGCTGAGAACTACGTGTTCTGGGGCGGCCGCGAGGGCTACATGAGCCTTCTGAACACT






GACCAGAAGCGCGAGAAGGAGCACATGGCAACTATGCTCGGCATGGCTCGCGACTATGCC






CGCGCTAAGGGATTCACCGGTACCTTCCTCATTGAGCCAAAGCCAATGGAGCCAACAAAG






CATCAGTATGATGTTGACACAGAGACCGTTATCGGTTTCCTCAAGGCTCACGGTCTGGAC






AAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACTCTCGCCGGTCACACCTTCGAG






CACGAGCTCGCTTGCGCTGTTGACGCTGGTATGCTCGGTTCTATCGACGCTAACCGCGGT






GACGCTCAGAACGGATGGGATACCGACCAGTTCCCAATCGACAACTTCGAGCTGACACAG






GCTTGGATGCAGATTGTTCGCAATGGCGGTCTTGGCACAGGTGGTACCAACTTCGACGCA






AAGACCCGTCGTAACTCTACCGACCTCGAGGACATCTTCATCGCTCACATCTCCGGTATG






GACGCTTGTGCACGCGCTCTCCTCAACGCAGTAGAGATACTCGAGAACTCTCCAATCCCA






ACAATGCTGAAGGACCGCTATGCAAGCTTCGACTCAGGTATGGGTAAGGACTTCGAGGAC






GGCAAGCTCACACTTGAGCAGGTTTATGAGTATGGTAAGAAGGTCGACGAGCCAAAGCAG






ACCTCTGGTAAGCAGGAACTCTATGAGACCATCGTTGCTCTCTATGCAAAATAA





5751MI3_001, Neocallimastigales, Amino Acid, SEQ ID NO: 74
MKEYFPQIGKIPFEGPESKNPMAFHYYDAERVVAGKTMKEWMRFAMAWWHTLCAEGSDQF


GPGTKKFPWNEGETALERAKHKADAGFEVMQKLGIPYFCFHDVDLIDEGANVAEYEANLA








AITDYLKEKMEETGVKLLWSTANVFGNARYMNGASTNPDFDVVARAIVQIKNAIDAGIKL








GAENYVFWGGREGYMSLLNTDQKREKEHMATMLGMARDYARAKGFTGTFLIEPKPMEPTK








HQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANRG








DAQNGWDTDQFPIDNFELTQAWMQIVRNGGLGTGGTNFDAKTRRNSTDLEDIFIAHISGM








DACARALLNAVEILENSPIPTMLKDRYASFDSGMGKDFEDGKLTLEQVYEYGKKVDEPKQ







TSGKQELYETIVALYAK





5753MI3_002, Prevotella, DNA, SEQ ID NO: 75
ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGGCGACAGCAAA



AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAG






GAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCC






TTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCC






AAAGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGTATCGAATACTTCTGC






TTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAG






TTCGATGTGGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATTACACCCTCCTCAAC






ACCCAGATGCAGCGGGAAAAAGACCACCTGGCCAAGTTGCTGACGGCCGCCCGCGACTAT






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAACCCACC






AAGCACCAGTACGACGTGGATACGGAGACGGTCATCGGCTTCCTCCGTGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTC






GAGCATGAGCTCACCGTGGCCCGCGAGAACGGTTTCCTGGGCTCCATCGGTGCCAACCGC






GGCGACGCCCAGAACGGCTGGGACACGGACCAGTTCCCTGTGGACCCGTACGATCTTACC






CAGGCCATGATGCAGGTGCTGCTGAACGGCGGCTTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTCCGCCGCTCCTCCACCGACCCTGAGGACATCTTCATCGCCCATATTTCCGCC






ATGGATGCCATGGCCCACGCTTTGCTTAACGCAGCTGCCGTGCTGGAAGAGAGCCCCCTG






TGCCAGATGGTCAAGGAGCGTTATGCCAGCTTCGACGGCGGCCTCGGCAAACAGTTCGAG






GAAGGCAAGGCTACCCTGGAAGACCTGTACGAATACGCCAAGGTCCAGGGTGAACCCGTT






GTCGCCTCCGGCAAGCAGGAGCTTTACGAGACTCTCCTGAACCTGTATGCCGTCAAGTAA





5753MI3_002, Prevotella, Amino Acid, SEQ ID NO: 76
MAKEYFPSIGKIPFEGGDSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIGANR








GDAQNGWDTDQFPVDPYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLCQMVKERYASFDGGLGKQFEEGKATLEDLYEYAKVQGEPV







VASGKQELYETLLNLYAVK





1754MI1_001, Prevotella, DNA, SEQ ID NO: 77
ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAAGGAAAGGACAGTAAG



AATGTAATGGCTTTCCACTACTACGAGCCTGAGAAGGTCGTGATGGGAAAGAAGATGAAG






GACTGGCTGAAGTTCGCTATGGCTTGGTGGCATACACTGGGTGGCGCTTCTGCTGACCAG






TTTGGTGGTCAGACTCGTTCATACGAGTGGGACAAGGCTGGTGACGCTGTTCAGCGCGCT






AAGGATAAGATGGACGCTGGCTTCGAGATCATGGACAAGCTGGGCATCGAGTACTTCTGC






TTCCACGATGTTGACCTCGTTGAAGAGGGTGACACCATCGAGGAGTATGAGGCTCGCATG






AAGGCCATCACCGACTACGCTCAGGAGAAGATGAAGCAGTTCCCCAACATCAAGCTGCTC






TGGGGTACCGCAAACGTATTCGGTAACAAGCGCTATGCTAACGGTGCTTCTACCAACCCC






GACTTCGACGTAGTGGCTCGCGCCATCGTTCAGATCAAGAACGCTATTGATGCTACCATC






AAGCTGGGTGGTACCAACTATGTGTTCTGGGGTGGTCGTGAGGGCTATATGAGTCTGCTG






AACACCGACCAGAAGCGTGAGAAGGAGCACATGGCTACTATGCTGACCATGGCTCGCGAC






TATGCTCGCGCCAAGGGATTCAAGGGTACATTCCTCATTGAGCCGAAGCCCATGGAGCCC






AGCAAGCACCAGTATGATGTGGATACAGAGACCGTTATCGGCTTCCTGAAGGCACACAAC






CTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACACTCGCTGGTCATACC






TTCGAGCACGAGCTGGCTTGCGCTGTTGACGCTGGTATGCTTGGTTCTATCGACGCTAAC






CGTGGTGATGCTCAGAACGGTTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTG






ACACAGGCTATGCTCGAGATCATCCGCAATGGTGGTCTGGGCAATGGTGGTACCAACTTC






GATGCTAAGATCCGTCGTAACAGCACCGACCTCGAGGATCTCTTCATCGCTCACATCAGT






GGTATGGATGCTATGGCACGCGCTCTGATGAACGCTGCTGACATCCTTGAGAACTCTGAG






CTGCCCGCAATGAAGAAGGCTCGCTACGCAAGCTTCGACCAGGGTGTTGGTAAGGACTTC






GAAGATGGCAAGCTGACCCTTGAGCAGGTTTACGAGTATGGTAAGAAGGTGGGTGAGCCC






AAGCAGACTTCTGGTAAGCAGGAGAAGTACGAGACCATCGTTGCTCTCTATGCAAAATAA





1754MI1_001, Prevotella, Amino Acid, SEQ ID NO: 78
MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAGDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGDTIEEYEARM








KAITDYAQEKMKQFPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALMNAADILENSELPAMKKARYASFDQGVGKDFEDGKLTLEQVYEYGKKVGEP







KQTSGKQEKYETIVALYAK





1754MI3_007, Prevotella, DNA, SEQ ID NO: 79
ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAAGGAAAAGAGAGCAAG



AACGTAATGGCTTTCCATTACTATGAGCCTGAAAAGGTGGTCATGGGCAAGAAAATGAAG






GATTGGCTGAAATTCGCCATGGCTTGGTGGCACACCCTCGGTGGAGCCAGCGCCGACCAG






TTCGGTGGACAGACCCGCAGCTATGAGTGGGACAAGGCCGAGGATGCCGTACAGCGTGCT






AAGGACAAGATGGACGCCGGCTTCGAGATCATGGACAAACTGGGCATCGAGTATTTCTGC






TTCCACGATGTCGACCTCGTCGACGAGGGTGCTACCGTTGAGGAGTATGAGGCTCGCATG






AAAGCCATCACCGACTATGCCCAGGTCAAGATGAAGGAATATCCCAACATCAAACTGCTC






TGGGGCACCGCCAACGTGTTCGGCAACAAGCGTTATGCCAACGGCGCTTCCACCAACCCC






GACTTCGACGTGGTGGCACGCGCTATCGTTCAGATCAAGAATGCCATCGACGCTACCATC






AAGCTCGGCGGTCAGAACTACGTGTTCTGGGGCGGACGCGAGGGCTACATGAGCCTGCTC






AATACCGATCAGAAACGTGAGAAGGAACACATGGCCACCATGCTCACCATGGCGCGCGAC






TATGCTCGCAGCAAGGGATTCAAGGGCACCTTCCTCATCGAACCCAAACCCATGGAGCCT






TCCAAGCACCAGTATGATGTCGACACCGAGACGGTCATCGGCTTCCTCCGCGCCCACAAC






CTCGACAAGGACTTCAAGGTGAACATCGAGGTCAACCACGCCACGCTCGCCGGCCACACC






TTCGAGCACGAACTGGCTTGCGCCGTCGACGCCGGCATGCTCGGCAGCATCGACGCCAAC






CGCGGCGACGCACAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAACTG






ACACAGGCCATGCTGGAGATCATCCGCAATGGCGGCCTCGGCAATGGTGGTACCAACTTC






GACGCCAAGATCCGTCGTAACAGCACCGACCTCGAAGATCTCTTCATCGCTCACATCAGC






GGTATGGATGCCATGGCTCGCGCGCTGCTCAACGCCGCCGCCATCCTCGAGGAGAGCGAA






CTGCCCGCCATGAAGAAGGCCCGCTACGCTTCCTTCGACGAAGGTATCGGCAAGGACTTC






GAAGACGGCAAACTCACCCTCGAGCAGGTTTACGAGTACGGCAAGAAGGTAGGCGAGCCC






AAGCAGACCTCCGGCAAGCAAGAGAAGTACGAGACCATCGTGGCTCTCTACAGCAAATAA





1754MI3_007, Prevotella, Amino Acid, SEQ ID NO: 80
MAKEYFPFTGKIPFEGKESKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAEDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVDEGATVEEYEARM








KAITDYAQVKMKEYPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLRAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAAAILEESELPAMKKARYASFDEGIGKDFEDGKLTLEQVYEYGKKVGEP







KQTSGKQEKYETIVALYSK





1754MI5_009, Prevotella, DNA, SEQ ID NO: 81
ATGAAAGAGTATTTCCCGCAAATTGGAAAGATTCCCTTCGAGGGACCAGAGAGCAAGAGT



CCATTGGCGTTCCATTATTATGAGCCGGATCGCATGGTGCTCGGAAAGAGGATGGAGGAT






TGGCTGAAATTCGCCATGGCATGGTGGCACACCCTTGGCCAGGCCAGCGGCGACCAGTTC






GGCGGACAGACACGTGAGTACGAGTGGGATAAGGCTGGAGATCCGATACAAAGGGCAAAG






GATAAGATGGACGCCGGATTCGAGATCATGGAGAAATTGGGTATCAAGTACTTCTGCTTC






CATGATGTGGATCTCGTCGAGGAAGCTCCCACCATCGCCGAATATGAGGAGCGTATGAGG






ATCATCACCGACTATGCGCTCGAGAAGATGAAAGCCACTGGCATCAAACTCCTTTGGGGT






ACAGCCAATGTTTTCGGACATAAGAGATATATGAATGGGGCCGCCACCAACCCGGAGTTC






GGTGTTGTCGCCAGGGCTGCTGTCCAGATCAAGAACGCGATCGACGCCACCATCAAGCTG






GGAGGAACAAACTATGTGTTCTGGGGTGGCCGCGAGGGCTACATGAGCCTGCTCAACACC






CAGATGCAGAGGGAGAAGGACCATCTCGCCAATATGCTCAAGGCTGCTCGTGACTATGCT






CGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCGAAGCCGATGGAACCTACTAAG






CATCAGTACGATGTCGACACTGAGACCGTGATCGGCTTCCTCCGCGCAAACGGTCTTGAC






AAGGATTTCAAGGTCAACATCGAGGTCAATCACGCCACTCTTGCGGGTCACACTTTCGAG






CATGAGCTCGCCGTGGCTGTCGACAATGGTCTCCTTGGCTCAATCGATGCGAACAGGGGA






GATTATCAGAACGGTTGGGACACCGACCAGTTCCCTGTTGATCTCTTTGATTTGACCCAG






GCCATGCTCCAGATCATCCGTAACGGAGGCCTCGGTAATGGTGGATCCAACTTCGACGCC






AAGCTTCGCCGTAACTCCACTGATCCTGAGGATATATTCATTGCCCATATTTGCGGTATG






GACGCTATGGCCAGGGCTCTCCTTGCCGCCGCCGCGATCGTGGAGGAGTCTCCTATCCCG






GCTATGGTCAAAGAGCGTTACGCATCCTTCGACGAAGGTGAGGGCAAGAGATTCGAGGAT






GGTAAGATGAGTCTGGAGGAACTTGTTGATTACGCGAAGACTCACGGAGAGCCCGCCCAG






AAGAGTGGCAAACAGGAGCTCTACGAAACCCTTGTCAACATGTACATCAAATAA





1754MI5_009, Prevotella, Amino Acid, SEQ ID NO: 82
MKEYFPQIGKIPFEGPESKSPLAFHYYEPDRMVLGKRMEDWLKFAMAWWHTLGQASGDQF


GGQTREYEWDKAGDPIQRAKDKMDAGFEIMEKLGIKYFCFHDVDLVEEAPTIAEYEERMR








IITDYALEKMKATGIKLLWGTANVFGHKRYMNGAATNPEFGVVARAAVQIKNAIDATIKL








GGTNYVFWGGREGYMSLLNTQMQREKDHLANMLKAARDYARAKGFKGTFLIEPKPMEPTK








HQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGLLGSIDANRG








DYQNGWDTDQFPVDLFDLTQAMLQIIRNGGLGNGGSNFDAKLRRNSTDPEDIFIAHICGM








DAMARALLAAAAIVEESPIPAMVKERYASFDEGEGKRFEDGKMSLEELVDYAKTHGEPAQ







KSGKQELYETLVNMYIK





5586MI1_003, Prevotella, DNA, SEQ ID NO: 83
ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAGATTCCTTTCGAGGGAAAGGACAGTAAG



AATGTAATGGCGTTCCACTACTACGAGCCCGAGCGCGTGGTAATGGGCAAGAAGATGAAG






GAGTGGCTGAAGTTTGCCATGGCCTGGTGGCACACGCTGGGTGGAGCCAGTGCCGACCAG






TTTGGCGGACAGACCCGCAGCTACGAGTGGGACAAGGCTGAAGACGCCGTGCAGCGTGCC






AAGGACAAGATGGATGCCGGCTTCGAGATCATGGACAAGCTGGGCATCGAGTATTTCTGC






TTCCATGATGTCGATCTCGTTGACGAGGGTGCCACTGTCGAGGAGTATGAGGCTCGCATG






CAGGCCATCACCGACTATGCGCAGGAGAAGATGAAGCAGTATCCTGCCATCAAGCTGCTG






TGGGGTACGGCCAATGTCTTTGGCAACAAGCGTTATGCCAACGGTGCCTCTACCAATCCC






GACTTCGATGTGGTGGCCCGCGCCATCGTGCAGATTAAGAATGCCATTGATGCCACCATC






AAGCTGGGCGGCAGCAACTATGTGTTCTGGGGCGGTCGCGAGGGCTACATGTCGCTGCTC






AACACCGACCAGAAGCGTGAGAAGGAACACATGGCCCGGATGCTGACCATGGCCCGCGAC






TATGCCCGCTCGAAGGGCTTCAAGGGCAACTTCCTGATTGAGCCCAAGCCCATGGAGCCG






TCGAAGCATCAGTACGACGTGGACACCGAGACGGTTATCGGATTCCTCCGCGCACATGGC






CTTGACAAGGACTTCAAGGTGAACATCGAGGTGAACCATGCCACGCTGGCCGGTCATACC






TTCGAGCACGAACTGGCTTGCGCCGTAGATGCCGGCATGCTGGGCAGCATTGATGCCAAC






CGCGGCGACGCACAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTATGAGTTG






ACACAGGCCATGATGGAGATTATCCGCAATGGCGGTCTGGGTCTTGGCGGTACCAATTTC






GATGCCAAGATTCGCCGTAACTCCACCGACCTGGAAGACCTCTTCATCGCCCACATCAGT






GGCATGGACGCCATGGCTCGTGCGCTCCTTAATGCTGCCGACATTCTGGAGAACAGCGAA






CTGCCCGCCATGAAGAAAGCGCGCTACGCCTCGTTCGACAGTGGCATGGGCAAGGACTTC






GAGGACGGCAAACTGACCCTTGAGCAGGTTTACGAATACGGCAAAAAAGTCGGCGAACCT






AAGCAGACCTCCGGCAAGCAGGAGAAGTACGAGACCATCGTGGCTCTCTATGCCAAGTAA





5586MI1_003, Prevotella, Amino Acid, SEQ ID NO: 84
MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPERVVMGKKMKEWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAEDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVDEGATVEEYEARM








QAITDYAQEKMKQYPAIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGSNYVFWGGREGYMSLLNTDQKREKEHMARMLTMARDYARSKGFKGNFLIEPKPMEP








SKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGLGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAADILENSELPAMKKARYASFDSGMGKDFEDGKLTLEQVYEYGKKVGEP







KQTSGKQEKYETIVALYAK





5586MI2_006, Prevotella, DNA, SEQ ID NO: 85
ATGGCAAAAGAGTATTTTCCGTTTACAGGTAAAATTCCTTTCGAAGGAAAGGACAGTAAG



AACGTAATGGCTTTCCACTACTACGAGCCCGAAAAGGTCGTGATGGGAAAGAAAATGAAA






GACTGGCTGAAGTTCGCCATGGCCTGGTGGCACACACTGGGTGGCGCCAGCGCCGACCAG






TTTGGCGGCCAGACACGCAGCTATGAGTGGGACAAGGCTGCCGATGCCGTGCAGCGCGCA






AAGGACAAGATGGACGCCGGCTTCGAAATCATGGACAAGCTGGGCATCGAGTATTTCTGC






TTCCACGACGTGGACCTCGTTGAGGAGGGAGCCACCATCGAGGAGTATGAGGCCCGCATG






AAGGCTATCACCGACTATGCCCAGGAGAAGATGAAACAGTATCCCAGCATCAAGCTGCTC






TGGGGCACCGCCAATGTGTTTGGCAACAAGCGCTACGCCAACGGCGCCAGCACCAACCCC






GACTTCGACGTCGTGGCCCGTGCCATCGTGCAGATCAAGAACGCCATCGATGCCACCATC






AAGCTGGGCGGCACCAACTACGTGTTCTGGGGCGGACGCGAGGGCTACATGAGCCTGCTC






AACACCGACCAGAAGCGCGAGAAGGAGCACATGGCCACCATGCTCACCATGGCCCGCGAC






TACGCCCGCGCAAAGGGATTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCG






TCGAAGCACCAGTACGACGTGGACACCGAGACCGTCATCGGTTTCCTGAAGGCCCACGGT






CTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTGGCCGGCCACACC






TTCGAGCATGAGCTGGCCTGCGCCGTCGACGCCGGTATGCTGGGCAGCATCGATGCCAAC






CGCGGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTC






ACCCAGGCCATGATGGAAATTATCCGCAACGGCGGCCTCGGCAACGGCGGCACCAACTTC






GACGCTAAGATCCGCCGCAACTCCACCGACCTCGAGGACCTCTTCATCGCCCACATCAGC






GGCATGGACGCCATGGCCCGCGCACTGATGAACGCTGCCGACATTATGGAGAACAGCGAG






CTGCCCGCCATGAAGAAGGCACGCTACGCCAGCTTCGACGCCGGCATCGGCAAGGACTTT






GAGGATGGCAAGCTCTCGCTGGAGCAGGTCTACGAGTATGGCAAGAAGGTGGAAGAGCCC






AAGCAGACCAGCGGCAAGCAGGAGAAGTACGAGACCATCGTCGCCCTCTATGCCAAGTAA





5586MI2_006, Prevotella, Amino Acid, SEQ ID NO: 86
MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGATIEEYEARM








KAITDYAQEKMKQYPSIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNFELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALMNAADIMENSELPAMKKARYASFDAGIGKDFEDGKLSLEQVYEYGKKVEEP







KQTSGKQEKYETIVALYAK





5586MI8_003, Prevotella, DNA, SEQ ID NO: 87
ATGGCAAAAGAGTATTTCGCCTTTACAGGCAAGATTCCTTTCGAGGGAAAAGACAGTAAG



AACGTGATGGCTTTCCACTACTACGAGCCGGAGCGTGTGGTGATGGGCAAGAAGATGAAG






GAGTGGCTGAAGTTCGCCATGGCCTGGTGGCACACACTGGGTGGCGCATCGGCCGACCAG






TTCGGAGGCCAGACACGCAGCTACGAGTGGGACAAGGCCGCCGACGCCGTGCAGCGCGCC






AAGGACAAGATGGACGCCGGCTTCGAGATTATGGACAAGCTGGGCATCGAGTACTTCTGC






TTCCACGATGTAGACCTCGTTGAGGAGGGTGAGACCATAGCCGAGTACGAGCGCCGCATG






AAGGAAATCACCGACTACGCACAGGAGAAGATGAAGCAGTTCCCCAACATCAAGCTGCTC






TGGGGCACAGCCAACGTGTTCGGCAACAAGCGCTACGCCAACGGCGCATCGACCAACCCC






GACTTCGACGTTGTGGCACGCGCCATCGTGCAGATCAAGAACGCCATCGACGCCACCATC






AAGCTCGGCGGCTCCAACTATGTGTTCTGGGGCGGACGCGAGGGCTATATGAGCCTGCTC






AACACCGACCAGAAGCGCGAGAAGGAGCACATGGCCACCATGCTCACCATGGCCCGCGAC






TATGCACGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCGAAGCCCATGGAGCCC






TCGAAGCACCAGTACGACGTAGACACAGAGACCGTCATCGGCTTCCTCCGTGCACACGGG






CTGGACAAGGACTTCAAGGTGAACATCGAGGTAAACCACGCCACACTGGCCGGCCACACC






TTCGAGCACGAGCTGGCTTGCGCCGTCGACGCTGGCATGCTGGGCAGCATCGACGCCAAC






CGTGGCGACGCACAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTC






ACACAGGCCATGATGGAAATCATCCGCAATGGCGGACTGGGCAATGGCGGCACCAACTTC






GACGCCAAGATCCGTCGTAACAGCACCGACCTCGAAGACCTCTTCATCGCCCACATCAGC






GGCATGGACGCCATGGCACGCGCACTGCTCAACGCTGCCGACATCCTGGAGCACAGCGAG






CTGCCCAAGATGAAGAAGGAGCGCTACGCCAGCTTCGACGCAGGCATCGGCAAGGACTTC






GAAGACGGCAAGCTCACACTCGAGCAGGTCTACGAGTACGGCAAGAAGGTCGAAGAGCCC






CGTCAGACCAGCGGCAAGCAGGAGAAGTACGAGACCATCGTCGCCCTCTATGCCAAGTAA





5586MI8_003, Prevotella, Amino Acid, SEQ ID NO: 88
MAKEYFAFTGKIPFEGKDSKNVMAFHYYEPERVVMGKKMKEWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGETIAEYERRM








KEITDYAQEKMKQFPNIKLLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGSNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNFELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAADILEHSELPKMKKERYASFDAGIGKDFEDGKLTLEQVYEYGKKVEEP







RQTSGKQEKYETIVALYAK





5586MI14_003, Prevotella, DNA, SEQ ID NO: 89
ATGGCAAAAGAGTATTTTCCGTTTACTGGTAAGATTCCTTTCGAGGGAAAGGATAGTAAG



AATGTAATGGCTTTCCACTATTACGAGCCCGAGAAAGTCGTGATGGGAAAGAAGATGAAG






GACTGGCTGAAGTTCGCAATGGCTTGGTGGCATACACTGGGTGGTGCATCTGCAGACCAG






TTCGGTGGAGAGACCCGCAGCTACGAGTGGAGCAAGGCTGCTGATCCCGTTCAGCGCGCC






AAGGACAAGATGGACGCCGGCTTTGAGATTATGGATAAGCTGGGCATCGAGTACTTCTGT






TTCCACGATATAGACCTCGTTCAGGAGGCAGATACCATTGCAGAATATGAGGAGCGCATG






AAGGCAATTACCGACTATGCTCTGGAGAAGATGAAGCAGTTCCCCAACATCAAGTTGCTC






TGGGGTACCGCTAACGTATTTAGCAACAAGCGCTATATGAACGGTGCTTCTACCAATCCC






GACTTCGACGTGGTGGCCCGTGCCATCGTTCAGATCAAGAACGCTATTGATGCAACCATC






AAACTCGGTGGTACCAACTATGTATTCTGGGGTGGTCGTGAGGGTTACATGAGCCTATTG






AATACCGACCAGAAGCGTGAAAAGGAGCACATGGCAATGATGCTCGGTATGGCTCGCGAC






TATGCCCGCAGCAAGGGATTCAAGGGTACGTTCCTCATCGAGCCGAAGCCGATGGAGCCC






TCTAAGCATCAGTATGATGTCGATACGGAGACTGTGATTGGTTTCCTGAAGGCACACGGT






CTGGACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCTACACTGGCTGGTCATACC






TTCGAGCATGAGCTGGCTTGCGCTGTTGACGCAGGTATGCTGGGCTCTATCGACGCTAAC






CGCGGTGATGCCCAGAACGGCTGGGATACCGACCAGTTCCCCATCGACAACTACGAGCTG






ACACAGGCTATGATGGAAATCATCCGCAACGGTGGTCTGGGCAATGGTGGTACCAACTTC






GACGCTAAGATCCGCCGTAACTCTACCGACCTCGAGGATCTGTTCATCGCTCATATCAGT






GGTATGGATGCTATGGCCCGTGCTTTGTTGAATGCTGCCGACATTCTGGAGAACTCTGAA






CTGCCCGCTATGAAGAAGGCCCGCTACGCCAGCTTCGACAACGGTATCGGTAAGGACTTC






GAGGATGGCAAGCTGACCTTCGAGCAGGTTTACGAATATGGTAAGAAAGTTGAAGAGCCG






AAGCAGACCTCTGGCAAGCAGGAGAAATACGAGACCATCGTTGCTCTGTATGCTAAATAA





5586MI14_003, Prevotella, Amino Acid, SEQ ID NO: 90
MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ


FGGETRSYEWSKAADPVQRAKDKMDAGFEIMDKLGIEYFCFHDIDLVQEADTIAEYEERM








KAITDYALEKMKQFPNIKLLWGTANVFSNKRYMNGASTNPDFDVVARAIVQIKNAIDATI








KLGGTNYVFWGGREGYMSLLNTDQKREKEHMAMMLGMARDYARSKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAADILENSELPAMKKARYASFDNGIGKDFEDGKLTFEQVYEYGKKVEEP







KQTSGKQEKYETIVALYAK





5586MI26_003, Prevotella, DNA, SEQ ID NO: 91
ATGGCAAAAGAGTATTTTCCGTTTACCGGTAAAATTCCTTTCGAGGGAAAGGACAGTAAG



AATGTAATGGCTTTCCACTACTACGAGCCTGAGCGCGTAGTGATGGGAAAGAAGATGAAG






GATTGGTTGCGATTTGCAATGGCTTGGTGGCACACACTGGGTGGCGCTTCTGCCGACCAG






TTTGGTGGTCAGACCCGCAGTTACGAATGGGACAAGGCTGCTGATGCTGTTCAGCGTGCT






AAGGACAAGATGGATGCCGGCTTCGAGATTATGGATAAGCTGGGAATCGAGTTCTTCTGC






TGGCACGATATCGACCTCGTTGAAGAGGGTGAGACCATTGAAGAGTATGAGCGCCGCATG






AAGGCTATCACCGACTATGCTCTTGAGAAGATGCAGCAGTATCCCAACATCAAGAACCTC






TGGGGAACAGCCAATGTGTTTGGCAACAAGCGTTATGCCAACGGTGCCAGCACAAACCCA






GACTTTGACGTCGTTGCTCGTGCTATCGTACAGATTAAGAATGCTATCGACGCTACTATC






AAGTTGGGTGGTCAGAATTATGTGTTCTGGGGTGGCCGTGAGGGCTACATGAGCCTGCTC






AATACTGACCAGAAGCGTGAGAAGGAGCACATGGCTACAATGCTGACCATGGCACGCGAC






TATGCCCGCAGCAAGGGATTCAAGGGTAACTTCCTCATTGAGCCCAAGCCCATGGAGCCG






TCAAAGCACCAGTATGATGTTGACACCGAGACCGTATGCGGTTTCCTGCGTGCCCACAAC






CTTGACAAGGATTTCAAGGTAAATATCGAGGTTAACCATGCTACTCTGGCTGGTCATACT






TTCGAGCACGAACTGGCATGCGCTGTTGACGCTGGTATGCTTGGTTCTATCGATGCTAAC






CGTGGTGATGCCCAGAATGGCTGGGATACCGACCAGTTCCCCATCAACAACTATGAACTC






ACTCAGGCTATGCTTGAGATCATCCGTAATGGTGGTCTGGGTCTTGGCGGCACAAACTTC






GATGCCAAGATTCGTCGTAACTCAACAGATCTTGAGGATCTCTTCATCGCTCACATCAGT






GGTATGGATGCCATGGCCCGTGCTCTGCTGAATGCTGCTGCTATTCTGGAGGAGAGCGAG






CTGCCTAAGATGAAGAAGGAGCGTTATGCTTCTTTCGATGCCGGTATCGGTAAGGACTTC






GAGGATGGCAAGCTTACCCTTGAGCAGGCTTACGAGTATGGTAAGAAGGTTGAGGAGCCC






AAGCAGACTTCAGGCAAGCAGGAGAAGTACGAGACCATCGTTGCTCTGTATGCAAAATAA





5586MI26_003, Prevotella, Amino Acid, SEQ ID NO: 92
MAKEYFPFTGKIPFEGKDSKNVMAFHYYEPERVVMGKKMKDWLRFAMAWWHTLGGASADQ


FGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEFFCWHDIDLVEEGETIEEYERRM








KAITDYALEKMQQYPNIKNLWGTANVFGNKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGNFLIEPKPMEP








SKHQYDVDTETVCGFLRAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPINNYELTQAMLEIIRNGGLGLGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAAAILEESELPKMKKERYASFDAGIGKDFEDGKLTLEQAYEYGKKVEEP







KQTSGKQEKYETIVALYAK





5586MI86_001, Prevotella, DNA, SEQ ID NO: 93
ATGAAACAGTATTTTCCCCAGATTGGAAAGATACCCTTCGAGGGTGTAGAGAGCAAGAAT



GTGATGGCTTTCCACTATTATGAGCCAGAAAGAGTAGTCATGGGCAAGCCTATGAAAGAA






TGGCTGCGCTTCGCTATGGCGTGGTGGCACACGCTGGGGCAGGCGAGCGGCGACCCCTTC






GGCGGACAGACCCGCAGCTACGAGTGGGACCGTGCGGCCGACGCGCTACAGCGCGCCAAG






GACAAGATGGATGCGGGCTTCGAGCTGATGGAGAAGCTTGGCATTGAGTACTTCTGCTTC






CACGACGTGGACCTCGTAGAAGAGGGCGCCACGGTGGAGGAATACGAGCGGCGGATGGCT






GCCATCACCGACTACGCGGTAGAGAAGATGCGCGAGCATCCCGAGATACACTGCCTGTGG






GGCACGGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGAGCCGCCACCAACCCCGAC






TTCGACGTGGTGGCGCGTGCGGTGGTGCAGATAAAGAACAGCATCGACGCCACGATCAAG






CTGGGCGGCGAGAACTATGTGTTCTGGGGCGGACGCGAGGGATATATGAGCCTGCTCAAC






ACCGACCAGCGCCGCGAGAAGGAGCACCTGGCCATGATGCTTGCGAAGGCCCGCGACTAT






GGCCGCGCCCACGGCTTCAAGGGCACCTTCCTGATAGAGCCCAAGCCGATGGAGCCCATG






AAGCACCAGTACGACGTGGACACCGAGACGGTGATAGGTTTCCTGCGTGCCCACGGACTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGTTGGCGGGCCACACGTTC






GAGCACGAGCTGGCCTGTGCCGTCGATGCCGGCATGCTGGGCAGCATCGACGCCAACCGT






GGCGACGCGCAGAACGGATGGGATACGGACCAGTTCCCCATAGACTGCTACGAGCTCACG






CAGGCGTGGATGGAGATCATTCGTGGCGGCGGCTTCACCACCGGCGGCACCAACTTCGAC






GCTAAGCTGCGCCGCAACTCGACCGACCCCGAGGATATCTTCATAGCTCACATCAGCGGC






ATGGATGCTATGGCCCGCGCCCTGCTCTGCGCCGCCGACATCTTGGAGCACAGCGAGCTG






CCGGAGATGAAGCGGAAGCGCTATGCCTCGTTCGACAGCGGCATGGGCAAGGAGTTCGAA






GAGGGCAATCTCAGCTTCGAGCAAATCTATGCCTACGGCAAGCAGGCGGGCGAACCGGCC






ACGACCAGCGGCAAGCAGGAGAAATACGAAGCCATTGTTTCACTTTATACCCGATGA





5586MI86_001, Prevotella, Amino Acid, SEQ ID NO: 94
MKQYFPQIGKIPFEGVESKNVMAFHYYEPERVVMGKPMKEWLRFAMAWWHTLGQASGDPF


GGQTRSYEWDRAADALQRAKDKMDAGFELMEKLGIEYFCFHDVDLVEEGATVEEYERRMA








AITDYAVEKMREHPEIHCLWGTANVFGHKRYMNGAATNPDFDVVARAVVQIKNSIDATIK








LGGENYVFWGGREGYMSLLNTDQRREKEHLAMMLAKARDYGRAHGFKGTFLIEPKPMEPM








KHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANR








GDAQNGWDTDQFPIDCYELTQAWMEIIRGGGFTTGGTNFDAKLRRNSTDPEDIFIAHISG








MDAMARALLCAADILEHSELPEMKRKRYASFDSGMGKEFEEGNLSFEQIYAYGKQAGEPA







TTSGKQEKYEAIVSLYTR





5586MI108_002, Prevotella, DNA, SEQ ID NO: 95
ATGGCAAAAGAGTATTTTCCGTTTATCGGTAAGGTTCCTTTCGAAGGAACAGAGAGCAAG



AACGTGATGGCATTCCACTACTATGAGCCCGAAAAGGTGGTCATGGGTAAGAAAATGAAG






GACTGGCTGAAGTTCGCTATGGCTTGGTGGCACACACTGGGTGGTGCCAGCGCCGACCAG






TTTGGTGGTCAGACTCGCAGCTACGAGTGGGACAAGGCTGCTGATGCCGTTCAGCGCGCC






AAGGACAAGATGGATGCTGGCTTCGAGATCATGGATAAGCTCGGCATTGAGTACTTCTGC






TTCCATGACGTAGACCTCGTTGAGGAGGGTGAAACCGTCGCTGAGTATGAGGCTCGCATG






AAGGTCATCACCGACTATGCCCTGGAGAAGATGCAGCAGTTCCCCAACATCAAACTGCTC






TGGGGTACTGCTAACGTGTTCGGCCACAAGCGCTATGCCAACGGTGCCAGCACCAATCCC






GACTTCGACGTCGTGGCCCGTGCTATCGTTCAGATCAAGAATGCCATCGATGCTACCATT






AAGCTCGGCGGTACGAACTATGTGTTCTGGGGTGGTCGTGAGGGCTACATGAGCCTTCTC






AACACCGACCAGAAGCGCGAGAAGGAGCACATGGCAACGATGCTGACCATGGCTCGCGAC






TATGCCCGCGCCAAGGGATTCAAGGGCACGTTCCTCATCGAGCCGAAGCCCATGGAGCCC






TCGAAGCATCAGTACGACGTCGACACCGAGACCGTCATCGGCTTCCTCCGTGCCCACGGT






CTGGATAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTGGCCGGTCATACC






TTCGAGCACGAACTGGCTTGCGCCGTTGATGCCGGCATGCTCGGCTCTATCGATGCCAAC






CGCGGCGACGCTCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTACGAGCTC






ACTCAGGCCATGATGGAAATCATCCGTAATGGCGGTCTGGGCAACGGCGGCACGAACTTC






GATGCCAAGATCCGTCGTAACAGCACCGACCTCGAGGACCTCTTCATCGCTCACATCAGC






GGCATGGATGCCATGGCACGCGCTCTGATGAACGCTGCTGCCATCCTCGAAGAGAGCGAG






CTGCCCGCCATGAAGAAGGCCCGCTATGCTTCGTTCGACGAGGGTATCGGCAAGGACTTC






GAGGACGGCAAGTTGTCACTTGAGCAGGTCTACGAATATGGTAAGAAGGTTGAGGAGCCC






AAGCAGACCTCGGGCAAGCAGGAGAAGTACGAGACCATCGTGGCCCTCTATGCCAAGTAA





5586MI108_002, Prevotella, Amino Acid, SEQ ID NO: 96
MAKEYFPFIGKVPFEGTESKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGGASADQ


FGGQTRSYEWDKAADAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGETVAEYEARM








KVITDYALEKMQQFPNIKLLWGTANVFGHKRYANGASTNPDFDVVARAIVQIKNAIDATI








KLGGTNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARAKGFKGTFLIEPKPMEP








SKHQYDVDTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDAN








RGDAQNGWDTDQFPIDNYELTQAMMEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALMNAAAILEESELPAMKKARYASFDEGIGKDFEDGKLSLEQVYEYGKKVEEP







KQTSGKQEKYETIVALYAK





5586MI182_004, Prevotella, DNA, SEQ ID NO: 97
ATGGCAAAAGAGTATTTTCCGTTTGTTGGTAAGATTCCTTTCGAGGGAAAGGATAGTAAG



AATGTAATGGCTTTCCACTATTACGAACCAGAGAAGGTCGTGATGGGAAAGAAGATGAAG






GACTGGCTGAAGTTCGCCATGGCATGGTGGCACACACTGGGACAGGCCAGTGCCGACCCG






TTTGGAGGTCAGACCCGCAGCTACGAGTGGGACAAGGCTGACGATGCTGTGCAGCGCGCA






AAGGACAAGATGGATGCCGGATTTGAGATCATGGACAAGCTGGGCATCGAGTACTTCTGC






TTCCACGATGTAGACCTCGTTGAGGAGGGAGCAACTGTTGAGGAGTACGAGGCTCGCATG






AAGGCCATCACCGACTATGCATTGGAGAAGATGAAAGAGTATCCCAACATCAAGAACCTC






TGGGGTACAGCCAATGTATTCAGCAACAAGCGCTATATGAACGGTGCCAGCACCAACCCC






GACTTCGACGTTGTTGCACGTGCCATCGTACAGATAAAGAACGCCATTGACGCTACCATC






AAGCTCGGCGGTCAGAACTACGTGTTCTGGGGCGGACGTGAGGGATACATGAGCCTGCTC






AACACCGACCAGAAGCGCGAGAAGGAGCACATGGCAACCATGCTGACCATGGCTCGCGAC






TACGCTCGCAAGAACGGTTTCAAGGGCACATTCCTCATCGAGCCTAAGCCCATGGAACCC






TCAAAGCACCAGTACGACGTAGACACAGAGACCGTATGCGGTTTCCTCCGCGCCCATGGT






CTTGACAAGGATTTCAAGGTGAACATTGAGGTGAACCACGCTACCCTCGCCGGCCACACC






TTTGAGCATGAACTGGCTTGCGCCGTCGACAACGGCATGCTCGGCAGCATCGATGCCAAC






CGCGGCGACGTTCAGAACGGCTGGGACACCGACCAGTTCCCCATCGACAACTACGAGCTG






ACTCAGGCCATGCTCGAAATCATCCGCAACGGTGGTCTGGGCAACGGCGGTACCAACTTC






GACGCCAAGATCCGTCGTAACTCTACCGACCTCGAGGATCTGTTCATCGCCCACATCAGC






GGTATGGACGCCATGGCACGTGCACTGCTCAATGCAGCAGCCATACTGGAGGAGAGCGAG






CTGCCTGCCATGAAGAAGGAGCGTTACGCCAGCTTCGACAGCGGCATCGGCAAGGACTTC






GAGGACGGCAAGCTCACACTTGAGCAGGCCTATGAGTATGGTAAGAAGGTTGAGGAGCCA






AAGCAGACCTCTGGCAAGCAGGAGAAGTATGAGACTATAGTAGCCCTCTACGCTAAGTAG





5586MI182_004, Prevotella, Amino Acid, SEQ ID NO: 98
MAKEYFPFVGKIPFEGKDSKNVMAFHYYEPEKVVMGKKMKDWLKFAMAWWHTLGQASADP


FGGQTRSYEWDKADDAVQRAKDKMDAGFEIMDKLGIEYFCFHDVDLVEEGATVEEYEARM








KAITDYALEKMKEYPNIKNLWGTANVFSNKRYMNGASTNPDFDVVARAIVQIKNAIDATI








KLGGQNYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARKNGFKGTFLIEPKPMEP








SKHQYDVDTETVCGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELACAVDNGMLGSIDAN








RGDVQNGWDTDQFPIDNYELTQAMLEIIRNGGLGNGGTNFDAKIRRNSTDLEDLFIAHIS








GMDAMARALLNAAAILEESELPAMKKERYASFDSGIGKDFEDGKLTLEQAYEYGKKVEEP







KQTSGKQEKYETIVALYAK





5586MI193_004, Prevotella, DNA, SEQ ID NO: 99
ATGACTAAAGAGTATTTCCCTACCATTGGCAAGATTCCCTTTGAGGGACCTGAAAGCAAG



AACCCGCTTGCATTCCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAA






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAG






TTCGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCC






AAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGC






TTCCACGACATCGACCTGGTCGAGGATGCCGATGAAATCGCCGAGTACGAGGCCCGGATG






AAGGACATCACCGACTATCTGCTCGTCAAGATGAAAGAGACCGGCATCAAGAACCTTTGG






GGAACGGCCAACGTATTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCCGAT






TTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






TTGGGCGGTCAGAACTATGTGTTCTGGGGCGGCCGTGAAGGCTACCAGACCCTGCTCAAT






ACCCAGATGCAGCGCGAGAAGGAACACATGGGCCGTATGTTGGCACTGGCCCGCGACTAT






GGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACC






AAGCACCAGTACGATCAGGATACGGAAACCGTCATCGGCTTCCTGCGCCGCCATGGCCTC






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTC






GAGCACGAGCTGGCTTGCGCCGTCGACCACGGCATGCTGGGCAGCATCGACGCCAACCGG






GGTGATGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACG






CTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGAT






GCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCC






ATGGATGCCATGGCCCGCGCCCTGGTCAATGCTGTCGCCATTCTCGAGGAATCGCCCATC






CCGGCCATGGTCAGGGAACGTTACGCCTCDTTCGACAGCGGAAAGGGCAGGGAATATGAG






GAAGGCAGGCTGTCTCTCGAAGACATCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG





5586MI193_004, Prevotella, Amino Acid, SEQ ID NO: 100
MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLVEDADEIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIK








LGGQNYVFWGGREGYQTLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAVAILEESPIPAMVRERYASFDSGKGREYEEGRLSLEDIVAYAKAHGEPK







QISGKQELYETIVALYCK





5586MI195_003, Prevotella, DNA, SEQ ID NO: 101
ATGGCAAAAGAGTATTTCCCGCAGATCGGAAAGATCGGCTTTGAGGGTCCTGCAAGCAAG



AACCCGCTGGCATTCCATTATTATGACGCCGAGCGCGTGGTGATGGGTAAACCCATGAAA






GACTGGTTTAAATTCGCCCTCGCGTGGTGGCACAGCCTCGGCCAGGCCTCCGGCGACCCG






TTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAATGCCCCTACTGCCGCGCC






CGCGCCAAGGCGGACGCCGGCTTCGAGATCATGCAAAAGCTCGGCATCGGCTATTTCTGC






TTCCACGACGTCGACCTCATCGAAGACACGGACGACATCGCCGAATATGAGGCCCGCCTC






AAGGACATCACGGACTACCTGCTCGAAAGGATGCAGGAAACCGGCATCAAGAACCTCTGG






GGCACGGCCAATGTCTTCGGTCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAG






TTCGACATCGTCGCCCGCGCTGCCGTCCAGATCAAGAACGCCCTCGACGCCACCATCAAG






CTCGGTGGCTCGAACTACGTCTTCTGGGGCGGCCGCGAAGGTTATTACACGCTGCTCAAC






ACCCAGATGCAGCGCGAGAAAGACCACCTCGCCAAGCTCCTCACCGCCGCCCGCGACTAT






GCCCGCGCCAAGGGCTTCCAGGGCACCTTCCTGATCGAGCCCAAGCCGATGGAGCCGACC






AAGCACCAGTACGATGTCGACACGGAGACTGTAATCGGATTCCTCCGCGCCAACGGACTG






GACAAGGACTTCAAGGTCAACATCGAGGTCAACCACGCCACCCTCGCCGGCCATACCTTC






GAGCATGAGCTGACCGTCGCCCGCGAGAACGGATTCCTCGGCAGCATCGACGCCAACCGC






GGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTGGACGCCTACGACCTCACC






CAGGCCATGATGCAGGTGCTCCTGAACGGCGGTTTCGGCAACGGCGGCACCAATTTCGAC






GCCAAGCTCCGTCGCAGCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCGGCCATTCTCGAGGAGAGCCCGCTG






CCCGCGATGGTCAAGGAGCGTTACGCCTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAG






GAGGGAAAGGCCACGCTGGAGGACCTCTACGACTACGCCAAGGCCCATGGCGAGCCCGTC






GCCGCCTCCGGCAAGCAGGAACTGTGTGAAACTTACCTGAATCTGTATGCAAAGTAA





5586MI195_003, Prevotella, Amino Acid, SEQ ID NO: 102
MAKEYFPQIGKIGFEGPASKNPLAFHYYDAERVVMGKPMKDWFKFALAWWHSLGQASGDP


FGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIGYFCFHDVDLIEDTDDIAEYEARL








KDITDYLLERMQETGIKNLWGTANVFGHKRYMNGAGTNPQFDIVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFQGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPLPAMVKERYASFDSGLGKQFEEGKATLEDLYDYAKAHGEPV







AASGKQELCETYLNLYAK





5586MI196_003, Prevotella, DNA, SEQ ID NO: 103
ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA



AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAA






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAG






TTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCC






AAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGC






TTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATG






AAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGG






GGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGAT






TTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAAT






ACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTAT






GGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACC






AAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTC






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTC






GAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGC






GGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACG






CTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGAT






GCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCC






ATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATT






CCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAA






GAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG





5586MI196_003, Prevotella, Amino Acid, SEQ ID NO: 104
MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIK








LGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPK







QISGKQELYETIVALYCK





5586MI197_003, Prevotella, DNA, SEQ ID NO: 105
ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA



AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAA






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAG






TTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCC






AAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGC






TTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATG






AAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGG






GGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGAT






TTCGACGTGCTGGCCCGTGCCGCCGCCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAAT






ACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTAT






GGCCGTGCACACGGTTTCAAGGGCACGCTCCTCATCGAGCCCAAACCGATGGAGCCGACC






AAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTC






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTC






GAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGC






GGTGACGCCCAGGACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACG






CTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGAT






GCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATATCTTCATCGCGCACATCAGCGCC






ATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATT






CCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAA






GAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG





5586MI197_003

Prevotella

Amino Acid
106
MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAAQIKNAIDATIK








LGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTLLIEPKPMEPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANR








GDAQDGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPK







QISGKQELYETIVALYCK





5586MI199_003

Prevotella

DNA
107
ATGACAAAAGAGTATTTCCCTACCATCGGCAAGATCCCCTTTGAGGGACCCGAGAGCAAA



AACCCCCTCGCTTTTCATTACTATGAGCCCGACCGCCTGGTCATGGGCAAGAAGATGAAA






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCCGGCGACCAG






TTTGGCGGCCAGACCCGCCACTATGCCTGGGATGATCCGGATTGCCCGTATGCACGTGCC






AAAGCCAAGGCCGACGCCGGTTTCGAAATCATGCAGAAACTGGGCATTGAATTCTTCTGC






TTCCACGACATCGACCTGATCGAGGATACCGATGACATCGTCGAGTATGAGGCCCGGATG






AAGGACATCACCGACTATCTGCTGGTCAAGATGAAAGAGACCGGCATCAAGAATCTCTGG






GGAACGGCCAACGTATTCGGGCACAAGCGCTATATGAACGGCGCTGCCACCAACCCCGAT






TTCGACGTGCTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGCGGCCAGAATTATGTGTTCTGGGGCGGGCGTGAAGGCTACCAGAGCCTGCTCAAT






ACCCAGATGCAGCGCGAAAAGGAACACATGGGCCGTATGTTGGCACTAGCCCGCGACTAT






GGCCGTGCACACGGTTTCAAGGGCACGTTCCTCATCGAGCCCAAACCGATGGAGCCGACC






AAGCACCAGTACGATCAGGATACGGAGACCGTCATCGGTTTTCTGCGCCGCCATGGCCTC






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCATGCTACCCTGGCGGGCCACACCTTC






GAGCACGAGCTGGCCTGCGCCGTCGACCACGGCATGCTGGGCAGTATTGACGCCAACCGC






GGTGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGATAACTATGAGCTGACG






CTGGCCATGCTCCAGATCATCCGCAACGGCGGCCTGGCACCCGGCGGCTCGAACTTCGAT






GCGAAGCTGCGTCGCAACTCCACCGATCCGGAAGATGTCTTCATCGCGCACATCAGCGCC






ATGGATGCCATGGCCCGCGCCCTGGTCAACGCTGTCGCCATTCTTGAGGAATCGCCCATT






CCGGACATGGTCAAGGAGCGCTACGCTTCGTTCGACAGCGGAAAAGGCAGGGAGTACGAA






GAGGGGAAACTTTCCTTCGAGGACCTCGTGGCCTATGCCAAAGCCCACGGCGAACCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACCATCGTGGCTCTCTATTGCAAGTAG





5586MI199_003

Prevotella

Amino Acid
108
MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRLVMGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRHYAWDDPDCPYARAKAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIVEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVLARAAVQIKNAIDATIK








LGGQNYVFWGGREGYQSLLNTQMQREKEHMGRMLALARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDVFIAHISA








MDAMARALVNAVAILEESPIPDMVKERYASFDSGKGREYEEGKLSFEDLVAYAKAHGEPK







QISGKQELYETIVALYCK





5586MI200_003

Prevotella

DNA
109
ATGGCAAAAGAGTATTTCCCGACAATCGGAAAGATCCCCTTCGAGGGCGTTGAGAGCAAG



AATCCCCTTGCTTTCCATTATTATGACGCCGAGCGCGTGGTCATGGGCAAGCCCATGAAG






GACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCGGACCCG






TTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCC






CGCGCCAAGGCTGACGCCGGCTTCGAGATCATGCAGAAGCTCGGAATCGGCTACTATTGC






TTCCACGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAATACGAGGCCCGCATG






AAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACCGCGAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACATCGTCGCCCGCGCGGCCCTGCAGATCAAGAACGCGATCGATGCCACCATCAAG






CTCGGCGGCACCGGCTACGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTGCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCAACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACC






AAGCACCAATACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAATGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTTC






GAGCACGAGCTCACCGTGGCCGTTGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGATCTCACC






CAGGCGATGATCCAGATCATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGAC






GCCAGGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTC






TGCGAGATGGTCGCCAAGCGTTACGCTTCCTTCGACAGCGGCCTCGGCAAAAAGTTCGAG






GAAGGCAAGGCCACCCTCGAGGAACTCTACGAGTATGCCAAGGCGAACGGTGAGGTCAAG






GCCGAATCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAATAG





5586MI200_003

Prevotella

Amino Acid
110
MAKEYFPTIGKIPFEGVESKNPLAFHYYDAERVVMGKPMKDWFKFAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARM








KDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDIVARAALQIKNAIDATIK








LGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARANGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDARLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVIEESPLCEMVAKRYASFDSGLGKKFEEGKATLEELYEYAKANGEVK







AESGKQELYETLLNLYAK





5586MI203_003

Prevotella

DNA
111
ATGGCACAAGCGTATTTTCCTACCATCGGGAAAATCCCCTTCGAGGGACCCGAAAGCAAG



AATCCCCTGGCATTCCATTATTATGAGCCCGACCGCCTGGTCCTGGGCAAGAAGATGAAG






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCTTCCGGCGACCAG






TTCGGCGGCCAGACCCGCCACTACGCCTGGGACGAGCCCGCCACGCCCCTGGAACGGGCC






AAGGCCAAGGCGGATGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGAATTCTTCTGC






TTCCACGATGTGGACCTCATCGAAGAGGGCGCCACGATCGAGGAATACGAGCAGCGGATG






CAGCAGATCACGGATTATCTGCTGGTCAAGATGAAAGAGACCGGCATCCGCAACCTCTGG






GGTACGGCCAACGTGTTCGGACACGAGCGCTACATGAACGGCGCGGCCACGAACCCCGAT






TTCGATGTCGTGGCCCGCGCGGCCGTGCAGATCAAGACGGCCATCGACGCCACCATCAAG






TTGGGCGGCGAGAACTATGTGTTCTGGGGCGGCCGGGAAGGCTATATGAGCCTGCTCAAT






ACGCAGATGCACCGCGAGAAGCTGCATCTGGGCAAGATGCTCGCCGCGGCCCGCGACTAC






GGACGCGCCCACGGCTTCAAGGGGACCTTCCTCATCGAACCCAAGCCGATGGAACCCACC






AAGCATCAGTATGACCAGGATACGGAGACGGTCATCGGTTTCCTGCGCCGCTACGGCCTG






GACGAAGACTTCAAGGTGAACATCGAGGTCAACCACGCTACGCTGGCCGGCCATACCTTC






GAACACGAACTGGCCACGGCGGTCGATGCCGGCCTGCTGGGCAGCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTACGAACTGACC






CTGGCGATGCTGCAGGTCATCCGCAACGGCGGTCTGGCCCCGGGCGGCTCGAATTTCGAT






GCCAAGCTCCGCCGGAACTCCACCGATCCGGAAGACATCTTCATTGCCCACATCAGCGCG






ATGGATGCGATGGCGCGGGCCCTGCTCAATGCGGCCGCCCTCTGCGAGACGTCCCCGATT






CCGGCGATGGTCAAGGCGCGTTACGCTTCGTTCGACAGCGGCGCCGGCAAGGATTTCGAA






GAGGGAAGGATGACGCTGGAAGACCTCGTGGCCTATGCCAGGACCCACGGCGAGCCGAAG






CGGACCTCGGGCAAGCAGGAACTCTATGAGACCCTCGTGGCGCTTTATTGCAAATAG





5586MI203_003

Prevotella

Amino Acid
112
MAQAYFPTIGKIPFEGPESKNPLAFHYYEPDRLVLGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRHYAWDEPATPLERAKAKADAGFEIMQKLGIEFFCFHDVDLIEEGATIEEYEQRM








QQITDYLLVKMKETGIRNLWGTANVFGHERYMNGAATNPDFDVVARAAVQIKTAIDATIK








LGGENYVFWGGREGYMSLLNTQMHREKLHLGKMLAAARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRRYGLDEDFKVNIEVNHATLAGHTFEHELATAVDAGLLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQVIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALLNAAALCETSPIPAMVKARYASFDSGAGKDFEEGRMTLEDLVAYARTHGEPK







RTSGKQELYETLVALYCK





5586MI205_004

Prevotella

DNA
113
ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG



AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAG






GAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCC






CGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGC






TTCCACGATGTGAATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATG






AAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGG






GGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAA






TTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAAC






ACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTAT






GCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCTG






GACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTC






GAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACC






CAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATG






CCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAG






GAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTG






GCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA





5586MI205_004

Prevotella

Amino Acid
114
MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVNIIEDCEDIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPV







AASGKQELYETLLNLYAK





5586MI206_004

Prevotella

DNA
115
ATGGCAAAAGAGTATTTCCCGACTATCGGCAAGATTCCCTTCGAGGGCGTCGAATCCAAG



AACCCGATGGCATTCCACTATTATGACGCGAAACGCGTCGTGATGGGCAAGCCCATGAAG






GACTGGCTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGACAGGCTTCCGGCGACCCG






TTCGGCGGCCAGACCCGTTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCC






AAGGCCAAGGCCGACGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGAGTACTACTGC






TTCCATGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATG






AAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGTATCAAGAACCTCTGG






GGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACGTCGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAA






CTCGGCGGCACCTCTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCCACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTCAATATCGAAGTGAACCACGCCACCCTCGCCGGCCACACCTTC






GAGCATGAGCTCACCGTGGCGGTCGATAACGGCTTCCTCGGCTCCATCGACGCCAACCGT






GGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGACCTCACC






CAGGCCATGATGCAGATCATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGAC






GCCAAACTCCGCCGCTCCTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCGCTCCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTC






TGCAAGATGGTCGAGGAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAG






GAAGGCAAGGCCACCCTTGAGGACCTCTACGAGTATGCCAAGAAGAACGGCGAGCCCGTC






GTCGCTTCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAGTAG





5586MI206_004

Prevotella

Amino Acid
116
MAKEYFPTIGKIPFEGVESKNPMAFHYYDAKRVVMGKPMKDWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYYCFHDIDLVEDTEDIAEYEARM








KDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIK








LGGTSYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAHGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVIEESPLCKMVEERYASFDSGLGKQFEEGKATLEDLYEYAKKNGEPV







VASGKQELYETLLNLYAK





5586MI208_003

Prevotella

DNA
117
ATGTCAACTGAGTATTTCCCTACAATCGGCAAGATTCCCTTCGAGGGACCCGAGAGCAAG



AACCCCATGGCCTTCCACTACTATGAACCCGAAAAGTTGGTGATGGGCAAGAAGATGAAG






GACTGGCTGCGTTTCGCAATGGCCTGGTGGCACACCCTTGGAGCCGCATCCGGCGACCAG






TTCGGCGGACAGACCCGCAGTTACGCCTGGGACAAGGGCGACTGCCCTTACAGCCGCGCC






CGCGCCAAGGTCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATAGAGTTCTTCTGC






TTCCATGACATCGACCTGGTCGAGGATACCGACGACATCGCCGAGTATGAAGCCCGGATG






AAAGACATCACGGACTATCTGCTGGAAAAGATGGAGGCTACCGGCATCAAGAACCTCTGG






GGCACGGCCAATGTCTTCGGTCACAAGCGTTATATGAACGGTGCAGCCACAAACCCCGAT






TTCGCAGTGGTCGCAAGGGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGTGGTGAGAACTATGTGTTCTGGGGTGGACGCGAGGGTTATATGAGCCTGCTCAAC






ACCCAGATGCAGAGGGAGAAGGAACACCTTGCCAAGATGCTCACCGCCGCACGTGACTAT






GCACGCGCCAAAGGTTTCAAGGGCACGTTCCTCATCGAACCCAAGCCGATGGAACCCACC






AAGCACCAGTATGACCAGGATACCGAGACCGTTATCGGATTCCTCCGCAGCCACGGCCTG






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCCACCCTGGCGGGCCATACCTTC






GAGCACGAACTGGCCACCGCCGTCGACAACGGCATGCTCGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGACAACTTCGAGCTCACG






CTTGCCATGATGCAGATAATCCGCAACGGCGGCCTGGCACCGGGCGGTTCGAACTTCGAC






GCAAAGCTGCGCCGCAATTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCGCGCCCTCGTCAACGCCGCCGCCATCCTCGGCGAGTCGCCCGTT






CCGGCTATGGTCAAGGACCGCTATGCTTCGTTCGACTGCGGCAAGGGCAAGGACTTCGAA






GACGGCAAACTGACTCTCGAAGACATCGTCGCCTACGCCAGGGAGAATGGCGAGCCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACTATCGTCGCTCTTTACTGCAAGTAA





5586MI208_003

Prevotella

Amino Acid
118
MSTEYFPTIGKIPFEGPESKNPMAFHYYEPEKLVMGKKMKDWLRFAMAWWHTLGAASGDQ


FGGQTRSYAWDKGDCPYSRARAKVDAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARM








KDITDYLLEKMEATGIKNLWGTANVFGHKRYMNGAATNPDFAVVARAAVQIKNAIDATIK








LGGENYVFWGGREGYMSLLNTQMQREKEHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRSHGLDKDFKVNIEVNHATLAGHTFEHELATAVDNGMLGSIDANR








GDAQNGWDTDQFPIDNFELTLAMMQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAAAILGESPVPAMVKDRYASFDCGKGKDFEDGKLTLEDIVAYARENGEPK







QISGKQELYETIVALYCK





5586MI210_002

Prevotella

DNA
119
ATGTCATATTTTCCTACTATCGGTAACATCCCCTTTGAGGGTGTAGAGAGCAAGAATCCC



CTTGCCTTCCATTATTATGACGCTTCCCGCGTAGTTATGGGCAAGCCCATGAAGGAGTGG






CTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGTCAGGCATCGGCCGACCCTTTCGGC






GGACAAACCCGCAGCTATGCCTGGGACAAAGGCGAGTGCCCCTACTGCCGTGCCCGTGCC






AAGGCCGACGCCGGCTTCGAGCTCATGCAGAAACTGGGCATCGAGTATTTCTGCTCCCAC






GACATTGACCTCATCGAGGACTGCGACGACATTGCAGAGTACGAGGCCCGTCTGAAGGAC






ATTACGGACTACCTCCTGGAGAAGATGAAGAAGACCGGTATCAAGAACCTGTGGGGTACG






GCCAATGTGTTCGGTAACAAGCGTTACATGAACGGTGCTGCTACCAACCCTCAGTTTGAC






GTTGTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATTGACGCTACCATCAAGCTGGGC






GGTTCCAACTATGTGTTCTGGGGTGGCCGTGAGGGTTACTACACGCTTCTGAACACCCAG






ATGCAGCGTGAGAAGAATCACCTGGCTGCCATGCTCAAGGCTGCCCGCGACTATGCCCGC






GCCAACGGTTTCAAGGGCACCTTCCTCATTGAGCCCAAGCCCATGGAGCCCACCAAGCAC






CAGTACGACGTAGACACGGAGACCGTGATTGGATTCCTCCGCGCCAACGGTCTGGAGAAG






GACTTCAAGGTGAACATTGAGGTGAACCACGCTACTCTTGCCGGTCACACCTTCGAGCAC






GAGCTCACCGTGGCCCGTGAGAACGGCTTCCTGGGTTCCATTGACGCCAACCGCGGAGAT






GCCCAGAACGGCTGGGACACCGACCAGTTCCCGGTAGATGCCTTTGACCTCACCCAGGCC






ATGATGCAGATTCTCCTCAACGGAGGCTCCGGCAATGGCGGTACCAACTTTGACGCCAAG






CTGCGCCGTTCCTCCACCGACCCCGAGGACATCTTCATCGCGCACATCAGCGCCATGGAT






GCCATGGCTCACGCCCTGCTCAATGCAGCTGCCGTGCTGGAGGAGAGCCCGCTTTGCAAG






ATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTTGGCAAGCAGTTCGAGGAAGGA






AAGGCTACGCTGGAAGATCTGTATGCCTATGCCGTCAAGAACGGTGAGCCCGTGGTGGCT






TCCGGCAAGCAGGAACTGTACGAAACCTTCCTGAACCTCTATGCAAAATGGTAA





5586MI210_002

Prevotella

Amino Acid
120
MSYFPTIGNIPFEGVESKNPLAFHYYDASRVVMGKPMKEWLKFAMAWWHTLGQASADPFG


GQTRSYAWDKGECPYCRARAKADAGFELMQKLGIEYFCSHDIDLIEDCDDIAEYEARLKD








ITDYLLEKMKKTGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIKLG








GSNYVFWGGREGYYTLLNTQMQREKNHLAAMLKAARDYARANGFKGTFLIEPKPMEPTKH








QYDVDTETVIGFLRANGLEKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANRGD








AQNGWDTDQFPVDAFDLTQAMMQILLNGGSGNGGTNFDAKLRRSSTDPEDIFIAHISAMD








AMAHALLNAAAVLEESPLCKMVKERYASFDSGLGKQFEEGKATLEDLYAYAVKNGEPVVA







SGKQELYETFLNLYAKW





5586MI212_002

Prevotella

DNA
121
ATGTCAACTGAGTATTTCCCTACAATCGGCAAGATTCCCTTCGAGGGACCCGAGAGCAAG



AACCCCATGGCCTTCCACTACTATGAACCCGAAAAGTTGGTGATGGGCAAGAAGATGAAG






GACTGGCTGCGTTTCGCAATGGCCTGGTGGCACACCCTTGGAGCCGCATCCGGCGACCAG






TTCGGCGGACAGACCCGCAGTTACGCCTGGGACAAGGGCGACTGCCCTTACAGCCGCGCC






CGCGCCAAGGTCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGCATAGAGTTCTTCTGC






TTCCATGACATCGACCTGGTCGAGGATACCGACGACATCGCCGAGTATGAAGCCCGGATG






AAAGACATCACGGACTATCTGCTGGAAAAGATGGAGGTTACCGGCATCAAGAACCTCTGG






GGCACGGCCAATGTCTTCGGTCACAAGCGTTATATGAACGATGCAGCCACAAACCCCGAT






TTCGCAGTGGTCGCAAGGGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGTGGTGAGAACTATGTGTTCTGGGGTGGACGCGAGGGTTATATGAGCCTGCTCAAC






ACCCAGATGCAGAGGGAGAAGGAACACCTTGCCAAGATGCTCACCGCCGCACGTGACTAT






GCACGCGCCAAAGGTTTCAAGGGCACGTTCCTCATCGAACCCGAGCCGATGGAACCCACC






AAGCACCAGTATGACCAGGATACCGAGACCGTTATCGGATTCCTCCGCAGCCACGGCCTG






GACAAGGACTTCAAGGTCAACATCGAGGTGAACCACGCCACCCTGGCGGGCCATACCTTC






GAGCACGAACTGGCCACCGCCGTCGACAACGGCATGCTCGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGATCGACAACTTCGAGCTCACG






CTTGCCATGATGCAGATAATCCGCAACGGCGGCCTGGCACCGGGCGGTTCGAACTTCGAC






GCAAAGCTGCGCCGCAATTCCACCGATCCCGAGGACATCATCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCGCGCCCTCGTCAACGCCGCCGCCATCCTCGGCGAGTCGCCCGTT






CCGGCTATGGTCAAGGACCGCTATGCTTCGTTCGACTGCGGCAAGGGCAAGGACTTCGAA






GACGGCAAACTGACTCTCGAAGACATCGTCGCCTACGCCAGGGAGAATGGCGAGCCGAAA






CAGATTTCCGGCAAGCAGGAACTCTACGAAACTATCGTCGCTCTTTACTGCAAGTAA





5586MI212_002

Prevotella

Amino Acid
122
MSTEYFPTIGKIPFEGPESKNPMAFHYYEPEKLVMGKKMKDWLRFAMAWWHTLGAASGDQ


FGGQTRSYAWDKGDCPYSRARAKVDAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARM








KDITDYLLEKMEVTGIKNLWGTANVFGHKRYMNDAATNPDFAVVARAAVQIKNAIDATIK








LGGENYVFWGGREGYMSLLNTQMQREKEHLAKMLTAARDYARAKGFKGTFLIEPEPMEPT








KHQYDQDTETVIGFLRSHGLDKDFKVNIEVNHATLAGHTFEHELATAVDNGMLGSIDANR








GDAQNGWDTDQFPIDNFELTLAMMQIIRNGGLAPGGSNFDAKLRRNSTDPEDIIIAHISA








MDAMARALVNAAAILGESPVPAMVKDRYASFDCGKGKDFEDGKLTLEDIVAYARENGEPK







QISGKQELYETIVALYCK





5586MI213_003

Prevotella

DNA
123
ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG



AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAG






GAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCC






CGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGC






TTCCACGATGTGGATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATG






AAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGG






GGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAA






TTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAAC






ACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTAT






GCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCTG






GACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTC






GAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACC






CAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATG






CCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAG






GAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTG






GCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA





5586MI213_003

Prevotella

Amino Acid
124
MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPV







AASGKQELYETLLNLYAK





5586MI215_003

Prevotella

DNA
125
ATGGCAAAAGAGTATTTCCCGCAGATCGGAAAGATCGGCTTTGAGGGTCTTGAGAGCAAG



AACCCGATGGCATTCCATTATTATGACGCCGAGCGTGTCGTGCTCGGAAAGAAGATGAAG






GACTGGCTGAAGTTCGCGATGGCCTGGTGGCATACGCTCGGACAGGCTTCCGGCGACCCA






TTCGGCGGCCAGACTCGCAGCTATGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGTGCC






CGCGCCAAGGCCGACGCCGGCTTCGAGCTCATGCAGAAGCTCGGCATCGAGTACTTCTGC






TTCCACGACATCGACCTCATCGAGGACTGCGACGACATCGACGAGTACGAGGCCCGGATG






AAGGACATCACCGACTACCTGCTGGAGAAGATGAAGGAGACCGGAATCAAGAATCTCTGG






GGAACGGCCAACGTCTTCGGTCACAAGCGCTACATGAACGGCGCCGCTACCAATCCGCAG






TTTGAAATCGTCGCCCGCGCTGCCGTCCAGATCAAGAACGCGCTCGACGCCACCATCAAG






CTCGGCGGCTCCAACTACGTCTTCTGGGGCGGCCGCGAGGGCTATTACACGCTGCTGAAT






ACCCAGATGCAGCGCGAGAAGGACCATCTCGCCAGGCTCCTTACCGCCGCCCGCGACTAT






GCGCGCGCCAAGGGGTTCAAGGGGACCTTCCCCATCGAGCCGAAGCCGATGGAGCCGACC






AAGCACCAGTATGACGTCGACACGGAGACCGTCATCGGTTTCCTCCGCCAGAATGGCCTC






GACAAGGACTTCAAGGTCAATATCGAGGTGAACCACGCCACCCTCGCCGGCCATACCTTC






GAGCACGAGCTGACCGCGGCCCGGGAGAACGGCTTCCTCGGCAGCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCGGTGGACGCCTTCGATCTCACG






CGGGCCATGATGCAGATCCTGCTCAATGGCGGTTTCGGCAACGGCGGCACCAACTTCGAC






GCCAAGCTGCGCCGCAGCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAATGCGGCCGCCATCCTCGAGGAAAGCCCGCTG






CCGGCCCTGGTCAAGCAGCGCTATGCGTCCTTCGACAGCGGTCTCGGCAAGCAGTTCGAG






GAGGGTAAGGCCACGCTCGAGGACCTGTACGCATACGCGAAGGAGCACGGCGAGCCCGTC






GCGGCCTCCGGCAAGCAGGAGCTCTGCGAGACCTATCTCAACCTCTACGCGAAATAA





5586MI215_003

Prevotella

Amino Acid
126
MAKEYFPQIGKIGFEGLESKNPMAFHYYDAERVVLGKKMKDWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRARAKADAGFELMQKLGIEYFCFHDIDLIEDCDDIDEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFEIVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLARLLTAARDYARAKGFKGTFPIEPKPMEPT








KHQYDVDTETVIGFLRQNGLDKDFKVNIEVNHATLAGHTFEHELTAARENGFLGSIDANR








GDAQNGWDTDQFPVDAFDLTRAMMQILLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPLPALVKQRYASFDSGLGKQFEEGKATLEDLYAYAKEHGEPV







AASGKQELCETYLNLYAK





5607MI1_003

Prevotella

DNA
127
ATGAGTAAAGAGTATTTTCCTGGGATTGGCAAAATCCCGTATGAGGGAGCCGAGAGCAAG



AATGTGATGGCATTCCACTATTATGATCCCGAACGCGTGGTCATGGGCAAGAAAATGAAA






GACTGGTTCAAGTTCGCTATTGCCTGGTGGCATACCCTGGGGCAGGCCAGTGCTGACCAG






TTTGGCGGACAGACCCGTTTCTATGAATGGGACAAAGCCGAGGACCCCTTGCAGCGTGCC






AAGGACAAGATGGATGCCGGTTTTGAAATCATGCAGAAGCTGGGCATCGAGTATTTCTGT






TTCCATGATGTGGACCTCATCGAGGAGGCCGATACCATCGAGGAATATGAAGCCCGCATG






CAGGCGATTACCGACTACGCGCTGGAGAAGATGAAGGCAACGGGTATCAAGTTGCTGTGG






GGCACTGCCAACGTGTTCGGCCACAAGCGTTACATGAACGGCGCCGCCACCAATCCCGAC






TTCAATGTCGTGGCACGTGCAGCCGTGCAGATCAAGAACGCCCTCGATGCTACCATCAAG






TTGGGCGGAACGAGCTACGTCTTCTGGGGCGGTCGTGAAGGCTATCAGAGCCTGCTCAAC






ACCCAGATGCAGCGCGAGAAGAACCACCTGGCCAAGATGCTCACGGCAGCCCGTGACTAT






GCCCGTGCTAAGGGCTTCAAGGGCACCTTCCTGATTGAGCCCAAGCCGATGGAACCCACC






AAGCACCAGTATGACCAGGACACCGAGACCGTTATCGGCTTCTTGCGTGCCAATGGCCTT






GACAAGGACTTTAAGGTCAACATTGAGGTCAACCATGCCACGCTGGCTGGCCACACCTTT






GCACATGAGTTGGCAGTGGCTGTGGATAACGGTATGCTGGGCAGCATCGATGCTAACCGT






GGTGACCACCAGAACGGCTGGGATACAGACCAGTTCCCCATCAACAGTTATGAACTCACC






AATGCTATGCTGCAGATCATGCACGGCGGCGGTTTCAAGGACGGCGGTACCAACTTTGAC






GCCAAGCTGCGCCGCAACAGTACCGACCCCGAGGACATCTTTACCGCTCACATCAGTGGT






ATGGACGCTCTGGCCCGTGCCCTGTTGAGTGCTGCCGATATCCTTGAGAAGAGCGAGTTG






CCTGAAATGCTCAAGGAACGCTATGCCAGCTTTGACGCGGGTGAAGGCAAGCGCTTTGAG






GATGGCCAGATGACTCTTGAGGAACTGGTTGCCTATGCCAAGTCCCATGGCGAGCCTGCT






ACCATCAGTGGCAAGCAGGAAAAATATGAAGCCATCGTGGCTTTGCACGTCAAGTAA





5607MI1_003

Prevotella

Amino Acid
128
MSKEYFPGIGKIPYEGAESKNVMAFHYYDPERVVMGKKMKDWFKFAIAWWHTLGQASADQ


FGGQTRFYEWDKAEDPLQRAKDKMDAGFEIMQKLGIEYFCFHDVDLIEEADTIEEYEARM








QAITDYALEKMKATGIKLLWGTANVFGHKRYMNGAATNPDFNVVARAAVQIKNALDATIK








LGGTSYVFWGGREGYQSLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFAHELAVAVDNGMLGSIDANR








GDHQNGWDTDQFPINSYELTNAMLQIMHGGGFKDGGTNFDAKLRRNSTDPEDIFTAHISG








MDALARALLSAADILEKSELPEMLKERYASFDAGEGKRFEDGQMTLEELVAYAKSHGEPA







TISGKQEKYEAIVALHVK





5607MI2_003

Prevotella

DNA
129
ATGAGTAAAGAGTATTATCCTGAGATTGGCAAAATCCCGTTTGAGGGTCCCGAGAGCAAG



AATGTGATGGCGTTCCATTACTATGAACCCGAACGCGTCGTCATGGGTAAGAAGATGAAA






GACTGGCTCAAGTTTGCCATGTGCTGGTGGCACAGCCTGGGTCAGGCCAGTGCCGACCAG






TTCGGCGGACAGACACGTTTCTACGAGTGGGACAAGGCCGATACCCCCCTGCAGCGTGCC






AAGGACAAAATGGATGCCGGATTTGAAATCATGCAGAAGTTGGGCATCGAGTACTTCTGC






TTCCACGATGTGGACCTCATCGAGGAGGCCGATACCATCGAGGAATACGAGGCCCGCATG






AAGGCCATTACCGACTATGCGCTGGAGAAGATGCAGGCCACCGGCATCAAGTTGCTGTGG






GGCACTGCCAATGTGTTCGGCCACAAGCGCTACATGAACGGCGCCGCCACCAATCCCGAT






TTCAATGTCGTGGCACGTGCCGCCGTCCAAATCAAGAATGCCATCGATGCCACCATCAAG






CTGGGCGGCACGAGTTACGTCTTCTGGGGTGGTCGTGAGGGCTATCAGAGTCTGCTCAAC






ACGCAGATGCAGCGCGAGAAGGACCATCTGGCCCGCATGCTGGCGGCAGCCCGCGACTAT






GGCCGTGCCCATGGCTTCAAGGGCACTTTCCTGATCGAGCCCAAACCCATGGAGCCCACC






AAGCACCAGTATGATGTGGACACCGAGACCGTGCTCGGCTTCCTGCGTGCCCACGGCCTG






GACAAGGACTTCAAGGTTAACATCGAGGTCAATCATGCTACGCTGGCGGGACACACTTTC






AGCCACGAACTGGCTGTGGCCGTGGACAACGGTATGCTGGGCAGCATCGACGCCAACCGC






GGCGATTATCAGAATGGCTGGGACACCGACCAGTTCCCCATCGACAGCTTCGAGCTCACC






CAGGCCATGCTGCAGATCATGCGCGGCGGCGGCTTCAAGGACGGAGGTACCAACTTCGAT






GCCAAGCTGCGTCGCAACAGTACCGACCCTGAGGACATCTTCATCGCCCACATCAGCGGT






ATGGATGCCATGGCACGCGGCCTGTTGAGCGCTGCCGCTATCCTCGAGGATGGCGAGTTG






CCCGCGATGCTCAAGGCACGTTATGCCAGCTTTGACCAGGGCGAGGGTAAGCGCTTTGAG






GACGGCGAGATGACGCTCGAGCAGCTGGTGGATTATGCAAAGGATTATGCCAAATCGCAC






GGCGAGCCTGATGTCATCAGCGGCAAGCAGGAGAAGTTTGAAACCATCGTGGCCCTTTAC






GCCAAGTAA





5607MI2_003

Prevotella

Amino Acid
130
MSKEYYPEIGKIPFEGPESKNVMAFHYYEPERVVMGKKMKDWLKFAMCWWHSLGQASADQ


FGGQTRFYEWDKADTPLQRAKDKMDAGFEIMQKLGIEYFCFHDVDLIEEADTIEEYEARM








KAITDYALEKMQATGIKLLWGTANVFGHKRYMNGAATNPDFNVVARAAVQIKNAIDATIK








LGGTSYVFWGGREGYQSLLNTQMQREKDHLARMLAAARDYGRAHGFKGTFLIEPKPMEPT








KHQYDVDTETVLGFLRAHGLDKDFKVNIEVNHATLAGHTFSHELAVAVDNGMLGSIDANR








GDYQNGWDTDQFPIDSFELTQAMLQIMRGGGFKDGGTNFDAKLRRNSTDPEDIFIAHISG








MDAMARGLLSAAAILEDGELPAMLKARYASFDQGEGKRFEDGEMTLEQLVDYAKDYAKSH







GEPDVISGKQEKFETIVALYAK





5607MI3_003

Prevotella

DNA
131
ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG



AATCCCCTGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAG






GAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCC






CGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGAATCGGCTATTTCTGC






TTCCACGATGTGGATATCATCGAGGACTGCGAGGACATTGCCGAGTATGAGGCCCGTATG






AAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAATCTGTGG






GGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCGCAA






TTCGACGTGGTAGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCAGCAATTATGTGTTCTGGGGCGGCCGGGAAGGCTACTACACCCTTTTGAAC






ACGCAGATGCAGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCGGCCCGCGACTAT






GCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGTTTCCTGCGCGCCAACGGGCCG






GACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCATACCTTC






GAGCACGAGCTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGATACAGACCAGTTCCCCGTGGACGCCTTTGACCTCACC






CAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATG






CCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAG






GAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTG






GCCGCTTCCGGAAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA





5607MI3_003

Prevotella

Amino Acid
132
MTNEYFPGIGVIPFEGQESKNPLAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGPDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPV







AASGKQELYETLLNLYAK





5607MI4_005

Prevotella

DNA
133
ATGACTAAAGAGTATTTCCCTTCCGTCGGCAAGATTGCCTTTGAAGGACCCGAAAGCAAG



AACCCTATGGCCTTCCATTATTATGACGCCAATCGCGTGGTAATGGGAAAGCCGATGAAA






GAATGGCTTAAATTTGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCC






TTCGGCGGTCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCC






AAGGCCAAGGCCGATGCCGGCTTTGAACTGATGCAGAAACTGGGCATCGAGTATTTCTGC






TTCCACGATATAGACCTGGTGGAAGACTGCGATGATATCGCCGAATACGAGGCCCGCATG






AAGGACATCACGGACTATCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGG






GGAACCGCCAACGTGTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAACCCTCAG






TTCGACATCGTGGCCCGTGCCGCTGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAG






CTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGTGAGGGCTACTATACCCTCCTGAAC






ACCCAGATGCAGAGAGAGAAGGACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTAT






GCCCGTGCCAAGGGCTTCAAGGGCACCTTCCTCATCGAACCCAAGCCGATGGAGCCCACC






AAGCACCAGTACGACGTAGATACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAATATTGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTC






GAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTGGATGCCTTCGACCTCACC






CAGGCTATGATGCAGATCCTTCTGAACGGAGGCTTCGGCAACGGCGGTACCAACTTCGAC






GCCAAACTGCGCCGCTCCTCCACGGACCCCGAGGACATCTTCATCGCCCACATCAGCGCT






ATGGATGCCATGGCCCACGCCCTGCTGAATGCAGCCGCCATCCTGGAGGAAAGCCCGCTT






CCGAAGATGCTGAAAGAGCGTTATGCCAGCTTTGACGGCGGTCTGGGCAAGAAGTTCGAA






GAAGGCAAGGCCTCTCTGGAAGAACTCTACGAGTATGCCAAGAGCAACGGAGAGCCCGTG






GCCGCTTCCGGCAAGCAGGAGCTCTGCGAAACGTACCTGAACCTCTACGCTAAGTAA





5607MI4_005

Prevotella

Amino Acid
134
MTKEYFPSVGKIAFEGPESKNPMAFHYYDANRVVMGKPMKEWLKFAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRAKAKADAGFELMQKLGIEYFCFHDIDLVEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAFDLTQAMMQILLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPLPKMLKERYASFDGGLGKKFEEGKASLEELYEYAKSNGEPV







AASGKQELCETYLNLYAK





5607MI5_002

Prevotella

DNA
135
ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGCCGACAGCAAA



AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAG






GAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCC






TTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCC






AAGGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGCATCGAATACTTCTGC






TTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAG






TTCGATGTGGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCTCCAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATTACACCCTCCTCAAC






ACACAGATGCAGCGGGAAAAAGACCACCTGGCCAAGTTGCTGACGGCCGCCCGCGACTAT






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAACCCACC






AAGCACCAGTACGACGTGGATACGGAGACGGTCATCGGCTTCCTCCGTGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTC






GAGCATGAGCTCACCGTGGCCCGCGAGAACGGTTTCCTGGGCTCCATCGATGCCAACCGC






GGCGACGCCCAGAACGGCTGGGACACGGACCAGTTCCCTGTGGACCCGTACGATCTTACC






CAGGCCATGATGCAGGTGCTGCTGAACGGCGGCTTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTCCGCCGCTCCTCCACCGACCCTGAGGACATCTTCATCGCCCATATTTCCGCC






ATGGATGCCATGGCCCACGCTTTGCTTAACGCAGCTGCCGTGCTGGAAGAGAGCCCCCTG






TGCCAGATGGTCAAGGAGCGTTATGCCAGCTTCGACGATGGCCTCGGCAAACAGTTCGAG






GAAGGCAAGGCTACCCTGGAAGACCTGTACGAATACGCCAAGGCCCAGGGTGAACCCGTT






GTCGCCTCCGGCAAGCAGGAGCTTTACGAGACTCTCCTGAACCTGTATGCCGTCAAGTAA





5607MI5_002

Prevotella

Amino Acid
136
MAKEYFPSIGKIPFEGADSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLCQMVKERYASFDDGLGKQFEEGKATLEDLYEYAKAQGEPV







VASGKQELYETLLNLYAVK





5607MI6_002

Prevotella

DNA
137
ATGACCAAAGAATATTTCCCTACCGTCGGGAAGATCCCCTTCGAGGGCCCCGAAAGCAAG



AACCCTATGGCGTTCCATTACTATGACCCCAACCGTCTGGTGATGGGCAAGAAGATGAAA






GACTGGCTGCGTTTCGCCATGGCCTGGTGGCACACCCTCGGCCAGGCGTCGGGCGACCAG






TTCGGCGGCCAGACCCGCAGTTATGCGTGGGACGAGGGAGAATGCCCGTACGAGCGCGCC






CGTGCCAAGGCTGACGCCGGCTTCGAGATCATGCAGAAACTCGGTATCGAGTTCTTCTGC






TTCCACGACATCGACCTGATCGAGGATACCGACGACATCGCCGAGTATGAGGCCCGCCTG






AAAGACATCACGGACTATCTGCTCGAGAAGATGAAAGCCACTGGCATCAAAAATCTCTGG






GGAACGGCCAACGTGTTCGGCCACAAGCGTTGCATGAACGGCGCCGCCACCAACCCGGAC






TTCGCCGTGCTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGCGGCGAGAACTATGTGTTCTGGGGTGGCCGCGAAGGCTACACGAGCCTGCTCAAC






ACCCAGATGCAGCGTGAGAAAGAGCACCTGGGCCGCCTGCTGTCCCTGGCCCGCGACTAT






GGCCGCGCCCACGGCTTCAAGGGTACCTTCCTGATCGAGCCCAAGCCGATGGGACCGACG






AAACACCAGTACGACCAGGATACGGAAACTGTCATCGGTTTCCTGCGCCGCCACGGTCTA






GACAAGGACTTCAAGGTCAATATCGAGGTGAACCATGCCACGCTGGCGGGCCACACCTTC






GAACACGAACTGGCCTGCGCCGTGGATCACGGTATGCTGGGCAGCATCGACGCCAACCGC






GGTGACGCACAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTTCGAGCTGACC






CTTTCCATGCTCCAGATCATCCGCAACGGTGGCCTGGCACCCGGCGGCTCGAATTTCGAT






GCCAAGCTGCGCCGCAACTCCACCGATCCCGAAGACATTTTCATCGCGCACATCAGCGCC






ATGGACGCCATGGCCCGCGCATTGGTCAATGCGGCCGCCATCCTGGAGGAGAGCGCTATT






CCGAAGATGGTCAAGGAGCGTTACGCTTCGTTCGACAGCGGCAAAGGCAAGGAATACGAG






GAAGGCAAGCTGACGCTCGAAGACATCGTGGCCTATGCCAAGGCGAACGGAGAACCGAAG






CAGATTTCCGGCAAACAGGAACTCTACGAGACGCTTGTCGCACTCTATAGCAAATAA





5607MI6_002

Prevotella

Amino Acid
138
MTKEYFPTVGKIPFEGPESKNPMAFHYYDPNRLVMGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRSYAWDEGECPYERARAKADAGFEIMQKLGIEFFCFHDIDLIEDTDDIAEYEARL








KDITDYLLEKMKATGIKNLWGTANVFGHKRCMNGAATNPDFAVLARAAVQIKNAIDATIK








LGGENYVFWGGREGYTSLLNTQMQREKEHLGRLLSLARDYGRAHGFKGTFLIEPKPMGPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELACAVDHGMLGSIDANR








GDAQNGWDTDQFPIDNFELTLSMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAAAILEESAIPKMVKERYASFDSGKGKEYEEGKLTLEDIVAYAKANGEPK







QISGKQELYETLVALYSK





5607MI7_002

Prevotella

DNA
139
ATGACCAAAGGGTATTTCCCTACCATCGGCAGGATTCCCTTCGAGGGAACTGAAAGCAAG



AATCCCCTCGCATTCCATTACTATGAGCCCGACCGGCTCGTACTGGGCAAGAAAATGAAA






GACTGGCTGCGTTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCGTCCGGCGACCAG






TTCGGCGGCCAGACCCGCAGCTATGCCTGGGACAAGGCCGAGTGCCCCTATGAGCGCGCC






AAGGCCAAAGCCGACGCCGGCTTCGAGATCATGCAGAAACTCGGCATCGAGTTCTTCTGT






TTCCACGACATTGACCTCGTTGAGGATACCGACGACATCGCCGAGTATGAGGCCCGGATG






AAGGACATTACCGACTATCTCCTGGTCAAGATGAAGGAGACCGGAATCAAGAACCTCTGG






GGTACGGCCAATGTCTTCGGCCACAAGCGCTATATGAACGGCGCCGCCACCAATCCCGAC






TTCGACGTGGTGGCCCGCGCCGCCGTCCAGATCAAGAACGCCCTCGATGCCACCATCAAG






CTGGGCGGTGAAAACTATGTGTTCTGGGGCGGCCGCGAAGGCTATATGAGCCTGCTCAAC






ACGCAGATGCAGCGTGAGAAGGAGCACCTGGGCCGGATGCTGGTCGCCGCCCGCGACTAC






GCCCGCGCCCACGGCTTCAAGGGTACCTTCCTCATCGAGCCCAAACCGATGGAACCGACC






AAGCACCAGTACGACCAGGATACGGAAACCGTGATCGGCTTCCTTCGCCGCCACGGCCTG






GACAAGGATTTCAAGGTGAACATCGAAGTGAACCACGCCACGCTGGCCGGCCACACCTTC






GAGCACGAACTGGCCACCGCCGTCGACTGCGGCCTGCTGGGCAGCATCGACGCCAATCGC






GGCGACGCTCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTTCGAACTCACG






CTGGCCATGCTGCAGATTATCCGCAACGGCGGTCTGGCACCCGGCGGCTCGAACTTCGAC






GCCAAACTGCGCCGTAACTCCACCGATCCGGAAGATATCTTCATCGCCCACATCAGTGCG






ATGGACGCGATGGCCCGTGCGCTGGTCAACGCCGCCGCAATCTGGGAAGAGTCTCCCATC






CCGCAGATGAAGAAAGAACGCTACGCGTCGTTCGACAGCGGCAAGGGCAAGGAATTCGAA






GAGGGCAAGCTCTGCCTCGAAGACCTCGTGGCCTATGCCAAGGCGAACGGAGAACCGAAA






CAGATCTCCGGCAGGCAGGAACTATATGAGACCATCGTCGCCCTTTATTGCAAATAG





5607MI7_002

Prevotella

Amino Acid
140
MTKGYFPTIGRIPFEGTESKNPLAFHYYEPDRLVLGKKMKDWLRFAMAWWHTLGQASGDQ


FGGQTRSYAWDKAECPYERAKAKADAGFEIMQKLGIEFFCFHDIDLVEDTDDIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNALDATIK








LGGENYVFWGGREGYMSLLNTQMQREKEHLGRMLVAARDYARAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRRHGLDKDFKVNIEVNHATLAGHTFEHELATAVDCGLLGSIDANR








GDAQNGWDTDQFPIDNFELTLAMLQIIRNGGLAPGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALVNAAAIWEESPIPQMKKERYASFDSGKGKEFEEGKLCLEDLVAYAKANGEPK







QISGRQELYETIVALYCK





5608MI1_004

Prevotella

DNA
141
ATGACCAACGAGTATTTTCCCGGAATCGGTGTGATTCCGTTTGAAGGACAGGAAAGCAAG



AATCCCATGGCTTTCCATTATTATGACGCCAACCGCGTAGTGATGGGCAAACCCATGAAG






GAATGGTTCAAATTTGCCATGGCCTGGTGGCATACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGACAGACCCGCTCCTACGCATGGGACAAGGGCGAGTGCCCTTACTGCCGTGCC






CGCCAGAAGGCCGACGCCGGCTTTGAACTGATGCAGAAGCTGGGTATCGGCTATTTCTGC






TTCCACGATGTGGATATCATCGAGGACTGCGAAGACATTGCCGAGTATGAGGCCCGTATG






AAGGACATCACGGACTATCTGCTGGTGAAGATGAAGGAAACGGGCATCAAGAACCTGTGG






GGCACGGCCAACGTCTTCGGCCACAAGCGCTATATGAACGGCGCTGCCACCAACCCGCAG






TTCGACGTGGTGGCCCGCGCTGCGGTCCAGATCAAGAACGCCCTGGACGCCACCATCAAG






CTGGGCGGCAGCAATTACGTGTTCTGGGGCGGCCGCGAAGGCTATTATACCCTTTGGAAC






ACGCAGATGCGGCGGGAGAAGGACCACCTGGCCCAGATGCTCAAGGCAGCCCGTGACTAT






GCCCGCGGCAAGGGATTCAAGGGCACGTTCCTCATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTAGATACGGAGACCGTGATTGGCTTCCTGCGCGCAAACGGACTG






GACAAGGACTTCAAGGTGAATATCGAAGTGAACCACGCCACCCTGGCCGGCCACACCTTC






GAGCACGAACTCACCGTGGCCCGCGAAAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGTTGGGATACAGACCAGTTCCCCATAGATGCCTTTGACCTCACC






CAGGCCATGATGCAGGTCCTGCTCAACGGCGGATTCGGCAACGGCGGCACCAACTTCGAC






GCCAAACTGCGCCGTTCCTCCACGGATCCCGAGGACATCTTCATCGCCCACATCGGCGCC






ATGGACGCCATGGCCCACGCCCTCCTGAACGCCGCCGCCATCCTGGAAGAGAGCCCCATG






CCGGGCATGGTGAAGGAGCGCTACGCTTCCTTCGACAATGGCCTTGGCAAGAAGTTCGAG






GAAGGAAAGGCCACGCTGGAAGAGCTGTACGACTATGCCAAGAAGAACGGCGAGCCTGTG






GCCGCTTCCGGCAAGCAGGAACTGTACGAAACGCTGCTGAACCTGTACGCCAAGTAA





5608MI1_004

Prevotella

Amino Acid
142
MTNEYFPGIGVIPFEGQESKNPMAFHYYDANRVVMGKPMKEWFKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARQKADAGFELMQKLGIGYFCFHDVDIIEDCEDIAEYEARM








KDITDYLLVKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNALDATIK








LGGSNYVFWGGREGYYTLWNTQMRREKDHLAQMLKAARDYARGKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPIDAFDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHIGA








MDAMAHALLNAAAILEESPMPGMVKERYASFDNGLGKKFEEGKATLEELYDYAKKNGEPV







AASGKQELYETLLNLYAK





5608MI2_002

Prevotella

DNA
143
ATGAAAGAATACTTCCCTACCATCGGAAAAATCCCTTTCGAGGGCCCTCAGAGCAAGAAT



CCGCTCGCATTCCATTACTATGACGCCAACCGCGTTGTCGCCGGCAAACCCATGAAGGAC






TGGCTCAAGTTCGCCATGGCTTGGTGGCACACCCTGGGCGCAGCATCGGCAGACCCCTTC






GGCGGCCAGACCCGCAGCTACGAGTGGGACAAAGCCGAGTGCCCTTACTGCCGTGCCCGT






GAAAAGGCCGACGCCGGCTTCGAGATCATGCAGAAACTTGGAATCGAGTACTTCTGCTTC






CATGACATCGACCTTGTGGAAGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATGAAG






GACATCACGGACTACCTCCTGGAGAAGATGAAGGCCACCGGCATCAAGAACCTGTGGGGC






ACCGCCAACGTCTTTGGCAACAAGCGCTACATGAACGGCGCAGCCACCAACCCTCAGTTC






GACATCGTTGCCCGTGCAGCTGTCCAGATCAAGAACGCCATCGACGCAACAATCAAGCTG






GGCGGTACCGGTTACGTATTCTGGGGCGGCCGCGAGGGCTACTACACCCTCCTGAACACC






CAGATGCAGCGCGAGAAGGACCACCTTGCCAAGATGCTCACCGCAGCCCGCGACTACGCC






CGCGCCAAGGGATTCAAGGGCACATTCCTCATCGAGCCCAAGCCCATGGAGCCCACCAAG






CACCAGTACGATGTTGACACGGAAACCGTCATCGGCTTCCTCCGCGCCAACGGCCTGGAC






AAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTCGAG






CACGAGCTCACCGTGGCCGTGGACAACGGCTTCCTGGGCAGCATCGACGCAAACCGCGGC






GACGCCCAGAACGGCTGGGACACTGACCAGTTCCCTGTGGATCCTTACGACCTCACCCAG






GCAATGATGCAGATTATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGACGCC






AAACTCCGCCGCAGCTCCACGGACCCCGAGGACATCTTCATCGCCCACATCAGCGCAATG






GATGCAATGGCACACGCCCTCATCAACGCTGCTGCAGTGCTTGAGGAAAGCCCTCTGTGC






GAGATGGTTGCAAAGCGCTACGCCAGCTTTGACAGCGGTCTTGGCAAGAAGTTCGAGGAA






GGCAAAGCCACTCTCGAGGAGATCTACGAGTATGCCAAGAAGGCCCCGGCACCCGTCGCC






GCCTCCGGCAAGCAGGAGCTCTACGAGACACTGCTCAATCTGTACGCTAAATAA





5608MI2_002

Prevotella

Amino Acid
144
MKEYFPTIGKIPFEGPQSKNPLAFHYYDANRVVAGKPMKDWLKFAMAWWHTLGAASADPF


GGQTRSYEWDKAECPYCRAREKADAGFEIMQKLGIEYFCFHDIDLVEDCEDIAEYEARMK








DITDYLLEKMKATGIKNLWGTANVFGNKRYMNGAATNPQFDIVARAAVQIKNAIDATIKL








GGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPTK








HQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANRG








DAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISAM








DAMAHALINAAAVLEESPLCEMVAKRYASFDSGLGKKFEEGKATLEEIYEYAKKAPAPVA







ASGKQELYETLLNLYAK





5608MI3_004

Prevotella

DNA
145
ATGACCAAAGAGTATTTCCCTACAATCGGAAAGATTCCCTTCGAAGGCCCGGAGAGCAAG



AATCCGCTGGCATTCCATTACTATGAACCCGACAGAATCATCCTCGGCAGGAAGATGAAG






GACTGGCTGCGCTTCGCCGTGGCCTGGTGGCACACCCTCGGCCAGGCGTCCGGCGACCAG






TTCGGAGGCCAGACCCGCAACTATGCGTGGGACGAGCCCGAATGCCCGGTAGAGCGCGCG






AAAGCCAAGGCCGACGCCGGCTTCGAGCTGATGCAGAAGCTGGGCATCGAGTATTTCTGC






TTCCACGACGTAGACCTCATAGAGGAGGCCGCAACCATCGAAGAATATGAGGAGCGCATG






GGCATCATAACCGACTACCTGCTCGGGAAGATGAAGGAGACAGGTATCAAGAACCTCTGG






GGCACCGCCAACGTGTTCGGCCACAAGCGTTACATGAACGGAGCCGCCACCAACCCCGAC






TTCGACGTGGTGGCCCGTGCGGCCGTGCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTGGGCGGCGAGAATTACGTATTCTGGGGCGGACGCGAGGGCTATGCAAGCCTGCTCAAC






ACTCAGATGCAGCGCGAGAAAGACCACCTGGGACGCATGCTGGCTGCAGCCCGCGACTAT






GGCCGCGCCCACGGATTCAAGGGCACTTTCCTCATCGAGCCCAAACCCATGGAGCCTACC






AAGCACCAGTACGACCAGGATACCGAGACCGTTATCGCCTTCCTGCGCAGGAACGGCCTC






GACAAGGATTTCAAGGTAAACATCGAGGTGAACCACGCCACCCTGGCGGGCCACACCTTC






GAGCACGAACTGGCGGTGGCAGTGGACAACGGCCTGCTTGGCAGCATCGACGCCAACCGC






GGCGACGCGCAGAACGGATGGGACACCGACCAGTTCCCCATCGACAACTTCGAGCTCACC






CAGGCCATGCTGCAGATAATCCGCAACGGCGGACTGGGAACCGGCGGATCGAACTTCGAC






GCCAAGCTGCGCCGCAATTCCACCGACCCTGAGGATATCTTCATCGCCCACATCAGTGCG






ATGGACGCCATGGCACGCGCGCTGGCAAACGCCGCCGCAATCATCGAAGAGAGCCCCATC






CCCGCAATGCTGAAGGAGCGCTACGCATCGTTCGACAGCGGCAAGGGCAAGGAGTTCGAG






GACGGCAAACTGAGCCTCGAAGAACTGGTAGCCTACGCCAAGGCGAACGGCGAGCCGAAG






CAGATTTCCGGCAAGCAGGAACTCTACGAAACCATAGTGGCCCTCTATTGCAAGTAA





5608MI3_004

Prevotella

Amino Acid
146
MTKEYFPTIGKIPFEGPESKNPLAFHYYEPDRIILGRKMKDWLRFAVAWWHTLGQASGDQ


FGGQTRNYAWDEPECPVERAKAKADAGFELMQKLGIEYFCFHDVDLIEEAATIEEYEERM








GIITDYLLGKMKETGIKNLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIK








LGGENYVFWGGREGYASLLNTQMQREKDHLGRMLAAARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIAFLRRNGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGLLGSIDANR








GDAQNGWDTDQFPIDNFELTQAMLQIIRNGGLGTGGSNFDAKLRRNSTDPEDIFIAHISA








MDAMARALANAAAIIEESPIPAMLKERYASFDSGKGKEFEDGKLSLEELVAYAKANGEPK







QISGKQELYETIVALYCK





5609MI1_005

Prevotella

DNA
147
ATGGCACAAGAATACTTCCCTACCATTGGGAAAATCCCCTTCGAGGGCACTGAGAGCAAG



AATCCCCTTGCTTTCCATTACTATGAGCCGGAGCGCATTGTCTGCGGCAAACCCATGAAA






GAATGGCTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGCCAAACCCGCAGCTATGCCTGGGATAAGGGCGAATGCCCCTACTGCCGTGCC






CGCGCCAAGGCGGACGCCGGCTTCGAGATTATGCAAAAGCTGGGCATCGAGTACTTCTGC






TTCCACGATATCGACCTGGTAGAAGACTGTGACGATATTGCGGAATACGAAGCCCGCATG






AAGGACATCACGGACTACCTCCTGGAGAAGATGAAGGAAACCGGTATCAAGAACCTCTGG






GGCACCGCCAATGTGTTTGGTCACAAGCGCTACATGAACGGCGCCGCCACCAACCCGCAG






TTTGACGTAGTGGCCCGTGCCGCTGTTCAGATTAAGAACGCCATTGACGCCACCATCAAG






TTGGGCGGTGCCAATTACGTGTTCTGGGGCGGCCGCGAGGGCTATTACAGCCTCCTGAAC






ACCCAGATGCAGCGGGAGAAGGACCACCTGGCCAAGCTGCTCACGGCAGCCCGCGACTAT






GCCCGCGCCAACGGCTTCAAGGGAACCTTCCTGATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTGGATACGGAGACGGTCATTGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAATATCGAGGTGAACCACGCCACGTTGGCCGGCCACACCTTT






GAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGCGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACGCTTATGAGCTCACC






CAGGCCATGATGCAGGTGCTCCTGAACGGAGGCTTCGGCAACGGCGGCACCAACTTCGAC






GCCAAGCTGCGCCGCTCCTCCACGGACCCGGAGGACATCTTCATCGCCCATATCAGTGCG






ATGGATGCCATGGCCCACGCCCTGCTCAACGCCGCCGCCGTGCTGGAGGAAAGCCCCCTG






TGCCAGATGGTGAAGGAGCGCTACGCCAGCTTTGACAGCGGTCCGGGCAAGCAGTTCGAG






GAAGGAAAGGCCACCCTGGAGGACCTGTACAACTACGCCAAAGCCACCGGTGAACCCGTG






GTTGCCTCCGGCAAGCAGGAACTTTACGAGACCCTCCTGAACCTCTATGCAAAGTAG





5609MI1_005

Prevotella

Amino Acid
148
MAQEYFPTIGKIPFEGTESKNPLAFHYYEPERIVCGKPMKEWLKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARAKADAGFEIMQKLGIEYFCFHDIDLVEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNAIDATIK








LGGANYVFWGGREGYYSLLNTQMQREKDHLAKLLTAARDYARANGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAYELTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLCQMVKERYASFDSGPGKQFEEGKATLEDLYNYAKATGEPV







VASGKQELYETLLNLYAK





5610MI1_003

Prevotella

DNA
149
ATGGCACAAGAATACTTCCCTACCATTGGGAAAATCCCCTTCGAGGGCACTGAGAGCAAG



AATCCCCTTGCTTTCCATTACTATGAGCCGGAGCGCATTGTCTGCGGCAAACCCATGAAA






GAATGGCTCAAGTTTGCCATGGCCTGGTGGCACACGCTGGGGCAGGCATCGGCCGATCCC






TTCGGCGGCCAAACCCGCAGCTATGCCTGGGATAAGGGCGAATGCCCCTACTGCCGTGCC






CGTGCCAAGGCGGACGCCGGTTTTGAGATTATGCAAAAGCTGGGCATCGAGTACTTCTGC






TTCCACGATATCGACCTGGTAGAAGACTGTGACGATATTGCGGAATACGAAGCCCGCATG






AAGGACATCACGGACTACCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGG






GGCACCGCCAATGTGTTTGGTCACAAGCGCTACATGAACGGCGCCGGCACCAATCCGCAG






TTTGACGTGGTGGCCCGTGCTGCCGTGCAAATCAAGAACGCCATTGACGCCACCATCAAG






TTGGGCGGTGCCAATTACGTGTTCTGGGGCGGCCGCGAGGGCTATTACAGCCTCCTGAAC






ACCCAGATGCAGCGGGAGAAGGACCACCTGGCCAAGCTGCTCACGGCAGCCCGCGACTAT






GCCCGCGCCAACGGCTTCAAGGGAACCTTCCTGATTGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTGGATACGGAGACGGTCATTGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAATATCGAGGTGAACCACGCCACGCTGGCCGGCCACACCTTT






GAGCACGAACTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCAGCATCGACGCCAACCGC






GGCGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACGCTTATGAGCTCACC






CAGGCCATGATGCAGGTGCTCCTGAACGGAGGCTTCGGCAACGGCGGCACCAACTTCGAC






GCCAAGCTGCGCCGCTCCTCCACGGACCTGGAGGACATCTTCATCGCCCATATCAGTGCG






ATGGATGCCATGGCCCACGCCCTGCTCAACGCCGCCGCCGTGCTGGAGGAAAGCCCCCTG






TGCCAGATGGTGAAGGAGCGCTACGCCAGCTTTGACAGCGGTCCGGGCAAGCAGTTCGAG






GAAGGAAAGGCCACCCTGGAGGACCTGTACAACTACGCCAAAGCCAACGGTGAACCCGTG






GTTGCCTCCGGCAAGCAGGAACTTTACGAGACCCTCCTGAACCTCTATGCAAAGTAG





5610MI1_003

Prevotella

Amino Acid
150
MAQEYFPTIGKIPFEGTESKNPLAFHYYEPERIVCGKPMKEWLKFAMAWWHTLGQASADP


FGGQTRSYAWDKGECPYCRARAKADAGFEIMQKLGIEYFCFHDIDLVEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDVVARAAVQIKNAIDATIK








LGGANYVFWGGREGYYSLLNTQMQREKDHLAKLLTAARDYARANGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAYELTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDLEDIFIAHISA








MDAMAHALLNAAAVLEESPLCQMVKERYASFDSGPGKQFEEGKATLEDLYNYAKANGEPV







VASGKQELYETLLNLYAK





5610MI2_004

Prevotella

DNA
151
ATGGCAAAAGAATATTTCCCTACCATCGGCAAGATTCCTTTTGAAGGAACCGACAGCAAG



AGTCCCCTCGCCTTCCATTACTATGACGCCCAGCGCGTTGTGATGGGCAAACCCATGAAG






GAATGGCTCAAGTTCGCCATGGCCTGGTGGCACACCCTGGGCCAGGCATCGGCCGACCCC






TTCGGCGGTCAGACCCGCCACTATGCCTGGGATGAAGGCGAATGCCCCTACTGCCGCGCC






AAAGCCAAGGCCGACGCCGGCTTCGAGATCATGCAGAAACTGGGCATCGAGTACTTCTGC






TTCCACGATGTGGACCTGGTGGAAGACTGCGACGACATCGCCGAGTACGAAGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGG






GGCACGGCCAATGTGTTCGGCCACAAGCGTTACATGAACGGCGCCGGGACCAACCCGCAG






TTTGACATTGTGGCCCGCGCTGCCGTCCAGATCAAAAACGCCCTGGACGCCACCATCAAG






CTGGGCGGTTCCAACTACGTGTTCTGGGGCAGCCGCGAAGGCTACTACACCCTCCTGAAC






ACCCAGATGCAGCGGGAGAAAGACCACCTGGCCAAGCTCCTGACCGCCGCCCGCGACTAC






GCCCGCGCCAAAGGCTTCAAGGGAACCTTCCTCATCGAGCCCAAACCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACCGAGACCGTAATCGGCTTCCTGCGTGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCTGGCCACACCTTC






GAGCACGAACTCACCGTCGCCCGTGAAAACGGCTTCCTCGGATCGATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTAGACGCCTATGACCTCACC






CAGGCCATGATGCAGGTGCTGCTGAACGGCGGTTTCGGCAATGGCGGTACCAACTTCGAC






GCCAAGCTCCGCCGCTCCTCCACGGATCCGGAAGACATCTTCATCGCCCACATCAGCGCC






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTGCTGGAAGAAAGCCCGCTT






CCCGCCATGGCGAAAGAGCGCTACGCCTCCTTTGACAGCGGACTTGGCAAGAAGTTCGAA






GAGGGAAAGGCCACCCTCGAAGAGCTGTACGACTATGCCAAGGCTAACGACGCCCCTGTC






GCCGCCTCCGGCAAGCAGGAACTTTACGAAACCTTCTTGAACCTCTATGCAAAATAG





5610MI2_004

Prevotella

Amino Acid
152
MAKEYFPTIGKIPFEGTDSKSPLAFHYYDAQRVVMGKPMKEWLKFAMAWWHTLGQASADP


FGGQTRHYAWDEGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLVEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAGTNPQFDIVARAAVQIKNALDATIK








LGGSNYVFWGSREGYYTLLNTQMQREKDHLAKLLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPVDAYDLTQAMMQVLLNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLPAMAKERYASFDSGLGKKFEEGKATLEELYDYAKANDAPV







AASGKQELYETFLNLYAK





5751MI1_003

Prevotella

DNA
153
ATGGCAAAACAGTATTTTCCGCAAATCGGAAAGATTAAATTCGAAGGAACAGAGAGCAAG



AATCCGCTTGCGTTCCATTATTATGACGCAAACAGGGTAGTCCTCGGAAAGGCAATGGAG






GAGTGGCTCAAGTTCGCAATGGCTTGGTGGCATACTCTCGGACAGGCTTCCGGAGACCAG






TTCGGCGGCCAGACCCGCAGCTACGAGTGGGATCTTGCAGCCACCCCCGAGCAGCGCGCA






AAGGACAAGCTCGACGCCGGCTTCGAAATAATGGAGAAACTTGGAATCAAGTATTTCTGT






TTCCACGATGTTGACCTTATCGAAGACAGCGACGATATTGCGACATATGAGGCTCGTCTC






AAGGACCTTACAGACTACGCTGCAGAGCAGATGAAGCTCCACGACATCAAGCTCCTCTGG






GGTACAGCGAATGTATTCGGCAACAAGCGCTACATGAACGGTGCGGCTACAAACCCTGAT






TTCGATGTAGTTGCCCGCGCAGCCGTTCAGATTAAGAACGCTATCGACGCGACCATCAAG






CTCGGTGGTACCAGCTATGTATTCTGGGGCGGTCGTGAGGGATATCAGAGCCTGCTCAAC






ACTCAGATGCAGCGTGAGAAGGACCACCTCGCAACCATGCTTACAATCGCTCGCGACTAT






GCTCGCAGCAAGGGCTTTACCGGAACCTTCCTTATCGAGCCTAAGCCGATGGAGCCTACA






AAACACCAGTACGACGTAGATACAGAGACTGTTGTCGGCTTCCTCAAGGCACACGGCCTG






GACAAGGACTTCAAGGTAAATATCGAGGTTAACCACGCAACTCTCGCAGGCCACACCTTC






GAGCACGAACTCACCGTTGCTGTGGATAACGGAATGCTCGGTTCTATCGACGCTAACCGC






GGTGATGCACAGAACGGCTGGGATACAGACCAGTTCCCTGTAAGCGCTGAGGAGCTTACC






CTCGCTATGATGCAGATTATCCGTAATGGTGGCCTTGGCAACGGAGGATCCAACTTCGAC






GCAAAGCTTCGCCGCAACTCTACCGATCCTGAAGACATCTTCATCGCACACATCTGCGGT






ATGGATGCAATGGCACACGCTCTCCTCAATGCAGCTGCAATTATCGAGGAGTCTCCTATC






CCTACAATGGTTAAGGAGCGTTACGCTTCCTTCGACAGCGGTATGGGTAAGGACTTCGAG






GATGGAAAGCTTACCCTCGAGGATCTCTACAGCTACGGCGTGAAGAACGGAGAGCCAAAG






CAGACCAGCGCAAAGCAGGAGCTCTATGAGACTCTCATGAATATCTATTGCAAGTAA





5751MI1_003

Prevotella

Amino Acid
154
MAKQYFPQIGKIKFEGTESKNPLAFHYYDANRVVLGKAMEEWLKFAMAWWHTLGQASGDQ


FGGQTRSYEWDLAATPEQRAKDKLDAGFEIMEKLGIKYFCFHDVDLIEDSDDIATYEARL








KDLTDYAAEQMKLHDIKLLWGTANVFGNKRYMNGAATNPDFDVVARAAVQIKNAIDATIK








LGGTSYVFWGGREGYQSLLNTQMQREKDHLATMLTIARDYARSKGFTGTFLIEPKPMEPT








KHQYDVDTETVVGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGMLGSIDANR








GDAQNGWDTDQFPVSAEELTLAMMQIIRNGGLGNGGSNFDAKLRRNSTDPEDIFIAHICG








MDAMAHALLNAAAIIEESPIPTMVKERYASFDSGMGKDFEDGKLTLEDLYSYGVKNGEPK







QTSAKQELYETLMNIYCK





5751MI2_003

Prevotella

DNA
155
ATGGCAAAAGAATTTTTTCCACAAGTAGGCAAGATTCCATTTGAGGGTCCTGAAAGTACT



AACGTACTCGCATTCCACTACTATGATCCAGAACGCGAAGTTCTTGGTAAGAAAATGAAA






GATTGGCTGAAGTATGCTATGGCTTGGTGGCACACACTCGGTCAGGCAAGTGGCGACCAA






TTCGGTCTTCAAACTCGTTCGTATGAATGGGATGAAGCCGACGATGTTCTTCAACGCGCA






AAGGATAAAATGGATGCTGGTTTTGAATTGATGACCAAACTTGGCATTGAATACTACTGC






TTCCATGATGTCGACCTTATTGAAGAAGGTGCAACAATTGAAGAATATGAAGCTCGTATG






CAAGCTATCACCGACTACGCATTAGAAAAACAAAAAGAAACCGGCATTAAGCTCCTTTGG






GGTACTGCTAATGTGTTTGGTCATAAGCGTTATATGAATGGTGCGGCAACAAACCCTGAC






TTTGATGTAGTGGCTCGCGCTGCTGTACAAATCAAGAACGCTATCGATGCAACTATCAAG






CTTGGTGGTCAAAACTATGTATTCTGGGGTGGCCGCGAAGGTTATATGAGTTTGCTCAAC






ACTCAAATGCAACGCGAAAAAGACCACTTGGCAAAGATGCTTACCGCAGCTCGCGACTAT






GCTCGTGCTAAGGGCTTCAAGGGTACATTCCTCGTTGAACCTAAGCCTATGGAACCAACT






AAGCATCAATATGATACCGATACAGAAACTGTGATTGGTTTCCTCCGTGCAAATGGTCTT






GAAAAAGACTTCAAGGTGAACATTGAAGTGAACCATGCTACTCTCGCTCAGCACACTTTC






GAACACGAACTCGCTGTGGCTGTCGACAATGGCATGCTCGGTTCTATCGACGCTAACCGT






GGCGATGCTCAAAATGGCTGGGATACCGACCAATTCCCAATCGACAACTACGAACTCACC






CTCGCTATGCTCCAAATCATTCGCAATGGTGGTCTTGGCAATGGCGGTAGCAACCTCGAC






GCTAAGATTCGTCGTAATAGCACCGACCTTGAAGACCTCTTTATCGCTCACATCAGTGGT






ATGGATGCTATGGCTCGTGCACTTCTCAATGCTGCTGCAATCGTTGAAAAGAGCGAAATT






CCTGCTATGTTGAAGCAGCGTTATGCAAGCTCTGATGCAGGTATGGGTAAGGACTTCGAA






GAAGGAAAACTCACTCTCGAACAACTCGTAGACTATGCTAAGGCTAACGGCGAACCTGCT






ACAGTAAGCGGCAAGCAAGAAAAGTATGAAACTCTCGTTGCTCTCTACGCTAAGTAA





5751MI2_003

Prevotella

Amino Acid
156
MAKEFFPQVGKIPFEGPESTNVLAFHYYDPEREVLGKKMKDWLKYAMAWWHTLGQASGDQ


FGGQTRSYEWDEADDVLQRAKDKMDAGFELMTKLGIEYYCFHDVDLIEEGATIEEYEARM








QAITDYALEKQKETGIKLLWGTANVFGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIK








LGGQNYVFWGGREGYMSLLNTQMQREKDHLAKMLTAARDYARAKGFKGTFLVEPKPMEPT








KHQYDTDTETVIGFLRANGLEKDFKVNIEVNHATLAQHTFEHELAVAVDNGMLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLGNGGSNLDAKIRRNSTDLEDLFIAHISG








MDAMARALLNAAAIVEKSEIPAMLKQRYASSDAGMGKDFEEGKLTLEQLVDYAKANGEPA







TVSGKQEKYETLVALYAK





5752MI1_003

Prevotella

DNA
157
ATGACTAAAGAGTATTTCCCGGGAATCGGAAAGATTCCGTTTGAAGGAACCAAGAGCAAG



AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGCGATGGGCAAGCCCATGAAG






GACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCC






TTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCTTATTGCCGCGCC






AAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGC






TTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAG






TTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAG






CTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAAC






ACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTT






GAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGC






GGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACC






CAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGAC






GCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTG






CCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAA






GAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTG






GCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG





5752MI1_003

Prevotella

Amino Acid
158
MTKEYFPGIGKIPFEGTKSKNPLAFHYYNASQVAMGKPMKDWLKYAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCFHDVDIIEDCEDIAEYEARM








KDITDYLLEKQKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIK








LGGTNYVFWGGREGYYTLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFIRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPIDALDLTQAMMQVILNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYEYAKKNGEPV







AASGKQELCETYLNLYAK





5752MI2_003

Prevotella

DNA
159
ATGACTAAAGAGTATTTCCCGGGAATCGGAAAGATTCCGTTTGAAGGAACCAAGAGCAAG



AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGTGATGGGCAAGCCCATGAAG






GACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCC






TTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCGTACTGCCGCGCC






AAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGC






TTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAG






TTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCACCATCAAA






CTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAAC






ACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTT






GAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGC






GGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACC






CAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGAC






GCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTG






CCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAA






GAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTG






GCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG





5752MI2_003

Prevotella

Amino Acid
160
MTKEYFPGIGKIPFEGTKSKNPLAFHYYNASQVVMGKPMKDWLKYAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCFHDVDIIEDCEDIAEYEARM








KDITDYLLEKQKETGIKNLWGTANVFGHKRYMNGAATNPQFDIVARAAVQIKNALDATIK








LGGTNYVFWGGREGYYTLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFIRANGLDKDFKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPIDALDLTQAMMQVILNGGFGNGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAILEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYEYAKKNGEPV







AASGKQELCETYLNLYAK





5752MI3_002

Prevotella

DNA
161
ATGGCAAAAGAGTATTTCCCGACTATCGGCAAGATTCCCTTCGAGGGCGTCGAATCCAAG



AACCCGATGGCATTCCACTACTATGACGCGAACCGCGTCGTGATGGGCAAGCCCATGAAG






GACTGGCTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGACAGGCTTCCGGCGACCCG






TTCGGCGGCCAGACCCGTTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCC






AAGGCCAAGGCCGACGCCGGCTTCGAGATCATGCAGAAGCTCGGTATCGAGTACTACTGC






TTCCATGACATCGACCTCGTGGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATG






AAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAAACCGGCATCAAGAACCTCTGG






GGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACGTCGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTCGGCGGTACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCCACGGCTTCCAGGGCACCTTCCTCATCGAGCCCAAGCCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTGCGCGCCAACGGTCTG






GACAAGGACTTCAAGGTCAATATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTTC






GAGCACGAGCTCACCGTGGCTGTCGATAACGGCTTCCTCGGCTCCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGACACCGACCAGTTCCCCGTGGACCCGTACGACCTCACC






CAGGCCATGATGCAGATCATCCGCAACGGCGGTTTCAAGGACGGCGGCACCAACTTCGAC






GCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTC






TGCAAGATGGTCGAGGAGCGCTACGCTTCCTTCGACAGCGGCCTCGGCAAGCAGTTCGAG






GAAGGCAAGGCCACCCTCGAGGACCTCTACGAGTATGCCAAGAAGAATGGCGAGCCCGTC






GTCGCCTCCGGCAAGCAGGAGCTCTACGAGACGCTGCTGAACCTTTACGCGAAGTAG





5752MI3_002

Prevotella

Amino Acid
162
MAKEYFPTIGKIPFEGVESKNPMAFHYYDANRVVMGKPMKDWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYYCFHDIDLVEDTEDIAEYEARM








KDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDVVARAAVQIKNAIDATIK








LGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAHGFQGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVIEESPLCKMVEERYASFDSGLGKQFEEGKATLEDLYEYAKKNGEPV







VASGKQELYETLLNLYAK





5752MI5_003

Prevotella

DNA
163
ATGGCAAAAGAGTATTTCCCGACAATCGGTAAGATCCCCTTCGAGGGACCCGAGTCCAAG



AACCCGATGGCATTCCACTACTATGACGCGGAGCGCGTGGTGATGGGCAAGAAGATGAAG






GACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCCGACCCG






TTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAAGGCCCCTGCTCCCGCGCC






CGCGCCAAGGCTGACGCCGGTTTCGAGATCATGCAGAAACTGGGCATCGGCTACTACTGC






TTCCACGACATCGACCTGGTGGAGGACACCGAGGACATCGCCGAGTATGAAGCCCGCATG






AAGGACATCACCGACTACCTCGTGGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACGGCCAACGTATTCGGCAACAAGCCCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACATCGCCGCCCGCGCGGCCCTGCAGACCAAGAACGCCATCGATGCCACCATCAAG






CTGGGCGGCACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGCTACTACACCCTCCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTTGCCAAGATGCTCACCGCGGCTCGCGACTAT






GCCCGCGCCCACGGCTTCAAGGGCACCTTCTTCATCGAGCCGAAACCGATGGAGCCCACC






AAGCACCAGTACGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAAGTGAACCACGCCACCCTCGCCGGCCACACCTTC






GAGCACGGGCTCACCGTGGCCGTTGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGC






GGAGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGACCTCACC






CAGGCGATGATCCAGATCATCCGCAATGGCGGCTTCAAGGACGGCGGTACCAACTTCGAC






GCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTGCTCGAGGAGAGCCCGCTC






TGCGAGATGGTTGCAAAGCGTTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAG






GAAGGCAACGCCACCCTCGAGGAACTCTACGAGTACGCCAAGGCGAAGGGCGAGGTCGTT






GCCGAATCCGGCAAGCAGGAACTCTACGAGACCCTGCTGAACCTCTACGCGAAGTAG





5752MI5_003

Prevotella

Amino Acid
164
MAKEYFPTIGKIPFEGPESKNPMAFHYYDAERVVMGKKMKDWFKFAMAWWHTLGQASADP


FGGQTRSYEWDKGEGPCSRARAKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARM








KDITDYLVEKQKETGIKNLWGTANVFGNKPYMNGAATNPQFDIAARAALQTKNAIDATIK








LGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARAHGFKGTFFIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHGLTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLCEMVAKRYASFDSGLGKKFEEGNATLEELYEYAKAKGEVV







AESGKQELYETLLNLYAK





5752MI6_004

Prevotella

DNA
165
ATGGCAAAAGAGTATTTCCCGACAATCGGAAAGATCCCCTTCGAGGGCGCTGAGAGCAAG



AATCCCCTTGCTTTCCACTATTATGACGCCGAGCGTGTGGTCATGGGCAAGCCCATGAAG






GACTGGTTCAAGTTCGCGATGGCCTGGTGGCACACCCTGGGCCAGGCTTCCGCCGACCCG






TTCGGCGGCCAGACCCGCTCCTACGAGTGGGACAAGGGCGAGTGCCCCTACTGCCGCGCC






CGCCAGAAGGCTGACGCCGGTTTCGAGATCATGCAGAAGCTCGGCATCGGCTACTACTGC






TTCCACGACATCGACCTGGTCGAGGACACCGAGGACATCGCCGAGTACGAGGCCCGCATG






AAGGACATCACCGACTACCTCGTCGAGAAGCAGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACGGCCAACGTGTTCGGCAACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACATCGTCGCCCACGCGGCCCTGCAGATCAAGAACGCGATCGGCGCCACCATCAAG






CTCGGCGGCACCGGTTACGTGTTCTGGGGCGGCCGTGAAGGTTACTACACCCTCCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTCGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCAACGGCTTCAAGGGCACCTTCCTCATCGAGCCGAAGCCGATGGAGCCCACC






AAGCACCAGTATGACGTGGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTCGCCGGCCACACCTTC






GAGCACGAGCTCACCGTGGCGGTCGACAACGGCTTCCTCGGCAGCATCGACGCCAACCGC






GGTGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGGTGGATCCGTACGATCTCACC






CAGGCGATGATCCAGATCATCCGCAACGGCGGCTTCAAGGATGGCGGCACCAACTTCGAC






GCCAAGCTCCGCCGCTCTTCCACCGACCCGGAGGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCCATGGCCCACGCCCTGCTGAACGCCGCCGCCGTCATCGAGGAGAGCCCGCTC






TGCGAGATGGTCGCCAAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAG






GAAGGCAACGCCACCCTCGAGGAACTCTACGAGTACGCCAAGGCGAACGGTGAGGTCAAG






GCCGAATCCGGCAAGCAGGAGCTCTACGAGACCCTTCTGAACCTCTACGCGAAATAG





5752MI6_004

Prevotella

Amino Acid
166
MAKEYFPTIGKIPFEGAESKNPLAFHYYDAERVVMGKPMKDWFKFAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRARQKADAGFEIMQKLGIGYYCFHDIDLVEDTEDIAEYEARM








KDITDYLVEKQKETGIKNLWGTANVFGNKRYMNGAATNPQFDIVAHAALQIKNAIGATIK








LGGTGYVFWGGREGYYTLLNTQMQREKDHLAKMLTAARDYARANGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMIQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVIEESPLCEMVAKRYASFDSGLGKKFEEGNATLEELYEYAKANGEVK







AESGKQELYETLLNLYAK





5753MI1_002

Prevotella

DNA
167
ATGGCAAAAGAGTATTTCCCCACTATCGGGAAGATTCCTTTCGAAGGAGTCGAGAGCAAG



AACCCCCTTGCATTCCATTATTATGACGCAAACCGCATGGTCATGGGCAAGCCCATGAAG






GACTGGTTCAAGTTCGCCATGGCATGGTGGCACACCCTGGGACAGGCCTCCGCAGACCCG






TTCGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCC






AGGGCAAAGGCCGATGCCGGCTTCGAGATCATGCAGAAACTGGGTATCGAGTATTTCTGC






TTCCATGACATCGACCTGGTAGAGGACTGCGACGACATCGCCGAGTACGAGGCCCGCATG






AAGGACATCACGGACTATCTCCTGGAGAAGATGAAGGAAACCGGCATCAAGAACCTCTGG






GGCACCGCCAACGTGTTCGGCAACAAGCGTTACATGAACGGCGCCGGCACCAATCCGCAG






TTCGACGTAGTGGCCCGCGCTGCCGTCCAGATCAAGAACGCCATCGACGCCACCATCAAG






CTCGGCGGTTCCAACTATGTGTTCTGGGGCGGCCGTGAAGGATACTACACCCTGCTGAAC






ACCCAGATGCAGCGCGAGAAGGACCACCTCGGCAAACTGCTCACCGCCGCCCGCGACTAT






GCCCGCAAGAACGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACC






AAGCACCAGTACGACGTAGACACGGAGACCGTGATCGGCTTCCTCCGCGCCAACGGCCTG






GAGAAAGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCATACCTTC






GAGCATGAACTCACCGTGGCCTTGGACAACGGCTTCCTGGGATCCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACCCGTACGACCTCACC






CAGGCCATGATGCAGATCATCCGCAACGGCGGCCTCGGCAACGGCGGTACCAACTTCGAC






GCCAAACTGCGCCGTTCCTCCACCGATCCTGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGACGCCATGGCCCACGCCCTGCTCAACGCAGCCGCCGTGCTGGAAGAAAGTCCGCTC






TGTGAGATGGTCAAGGAGCGCTACGCTTCCTTCGACAGCGGTCTCGGCAAGAAGTTCGAA






GAGGGCAAGGCTACCCTGGAAGAAATCTACGAGTATGCCAAGAAGAGCGGCGAACCCGTG






GTCGCTTCCGGCAAGCAGGAGCTCTACGAAACCCTGCTGAACCTCTACGCCAAGTAG





5753MI1_002    Prevotella    Amino Acid    SEQ ID NO:168
MAKEYFPTIGKIPFEGVESKNPLAFHYYDANRMVMGKPMKDWFKFAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRARAKADAGFEIMQKLGIEYFCPHDIDLVEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGNKRYMNGAGTNPQFDVVARAAVQIKNATDATIK








LGGSNYVFWGGREGYYTLLNTQMQREKDHLGKLLTAARDYARKNGFKGTFLIEPKPMEPT








KHQYEVETETVIGFLRANGLEKDEKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGLGNGGTNFDAKLRRSSTDPEDIFTAHISA








MDAMAHALLNAAAVLEESPLCEMVKERYASFDSGLGKKFEEGKATLEEIYEYAKKSGEPV







VASGKQELYETLLNLYAK





5753MI2_002    Prevotella    DNA    SEQ ID NO:169
ATGGCTAAAGAATACTTCCCCTCCATCGGCAAAATCCCTTTTGAAGGAGGCGACAGCAAA



AATCCCCTCGCTTTCCATTATTATGACGCCGGACGCGTGGTTATGGGCAAGCCCATGAAG






GAATGGCTTAAATTCGCCATGGCCTGGTGGCACACGCTGGGCCAGGCCTCCGGAGACCCC






TTCGGCGGCCAGACCCGCAGCTACGAATGGGACAAGGGCGAATGCCCCTACTGCCGCGCC






AAAGCCAAGGCCGACGCCGGTTTTGAAATCATGCAAAAGCTGGGTATCGAATACTTCTGC






TTCCACGATGTGGACCTTATCGAGGATTGCGATGACATTGCCGAATACGAAGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAAAAGATGAAGGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAATGTCTTCGGCCACAAGCGCTACATGAACGGCGCCGCCACGAACCCGCAG






TTCGACGTGGTCGCCCGCGCCGCCGTCCAGATCAAGAACGCGATTGACGCCACCATCAAG






CTCGGCGGTACCAGTTATGTATTCTGGGGCGGCCGCGAGGGCTACTACACCCTCCTGAAC






ACCCAGATGCAGCGTGAGAAAGACCACCTGGCCAAGATGCTCACCGCAGCCCGCGACTAC






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATCGAGCCCAAGCCGATGGAGCCCACC






AAGCACCAGTACGACGTTGACACGGAGACCGTGATCGGCTCCCTGCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTGAACATCGAGGTGAACCACGCCACCCTGGCCGGCCACACCTTC






GAGCACGAACTCACCGTGGCTGTTGACAACGGCTTCCTGGGCTCCATCGACGCCAACCGC






GGCGACGCCCAGAACGGCTGGGATACGGACCAGTTCCCGGTAGACCCGTACGACCTCACC






CAGGCCATGATGCAGATTATCCGCAACGGCGGCTTCAAGGACGGCGGCACCAACTTCGAT






GCCAAACTGCGCCGCTCTTCCACCGATCCGGAAGACATCTTCATCGCCCACATCAGCGCT






ATGGATGCCATGGCACACGCCCTGCTCAACGCCGCCGCCGTGCTGGAAGAGAGCCCGCTG






TGCAACATGGTCAAGGAGCGTTACGCCGGCTTCGACAGCGGCCTTGGCAAGAAGTTCGAG






GAAGGGAAGGCAACGCTGGAGGAAATCTATGACTATGCCAAGAAGAGCGGCGAACCCGTC






GTGGCTTCCGGCAAGCAGGAACTCTACGAAACCATCCTGAACCTCTATGCCAAGTAG





5753MI2_002    Prevotella    Amino Acid    SEQ ID NO:170
MAKEYFPSIGKIPFEGGDSKNPLAFHYYDAGRVVMGKPMKEWLKFAMAWWHTLGQASGDP


FGGQTRSYEWDKGECPYCRAKAKADAGFEIMQKLGIEYFCFHDVDLIEDCDDIAEYEARM








KDITDYLLEKMKETGIKNLWGTANVFGHKRYMNGAATNPQFDVVARAAVQIKNAIDATIK








LGGTSYVFWGGREGYYTLLNIQMQREKDHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGSLRANGLDKDFKVNIEVNHATLAGHTFEHELTVAVDNGFLGSIDANR








GDAQNGWDTDQFPVDPYDLTQAMMQIIRNGGFKDGGTNFDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAAVLEESPLCNMVKERYAGFDSGLGKKFEEGKATLEEIYDYAKKSGEPV







VASGKQELYETILNLYAK





5753MI4_002    Prevotella    DNA    SEQ ID NO:171
ATGTCAAAAGAGTATTTCCCTACAATCGGCAGGGTCCCCTTCGAGGGACCTGAGAGCAAG



AATCCGCTGGCGTTCCACTATTACGAGCCGGACCGGCTCGTCCTGGGCAGGAAAATGAAG






GACTGGCTGCGCTTCGCAATGGCCTGGTGGCATACGCTCGGGCAGGCTTCCGGCGACCAG






TTCGGCGGACAGACCTGCACATACGCCTGGGATGAAGGCGAGTGTCCCGTCTGCCGGGCA






AAGGCCAAGGCTGACGCCGGCTTTGAACTGATGCAGAAACTGGGCATCGGGTATTTCTGC






TTCCACGACGTGGACCTGGTCGAGGAGGCCGACACCATTGAAGAATACGAGGAGCGGATG






CGGATCATCACCGACTACCTGCTCGAGAAGATGGAAGAGACCGGCATCCGCAATCTCTGG






GGAACCGCCAATGTCTTCGGACACAAGCGCTATATGAACGGCGCCGCCACCAATCCCGAC






TTCGACGTCGTGGCCCGTGCCGCGGTCCAGATCAAGAATGCCATCGATGCCACCATCAAA






CTGGGTGGTGAGAACTATGTGTTCTGGGGTGGCCGCGAGGGCTATACGAGCCTGCTCAAC






ACGCAGATGCACCGGGAAAAACACCACCTCGGAAATATGCTCAGGGCAGCCCGCGACTAT






GGCCGTGCCCACGGTTTCAAGGGAACGTTCCTGATCGAGCCCAAGCCGATGGAGCCGACC






AAGCATCAGTACGACCAGGATACGGAGACGGTCATCGGTTTCCTGCGCTGTCACGGCCTG






GACAAGGATTTCAAGGTGAACATCGAGGTGAACCACGCCACGCTCGCCGGACACACCTTC






GAGCACGAACTGGCCACTGCGGTCGATGCCGGCCTGCTGGGCAGCATCGATGCCAACCGC






GGCGACGCCCAGAACGGCTGGGATACCGACCAGTTCCCGATCGACAACTACGAACTCACG






CTGGCGATGCTGCAGATCATCCGCAATGGCGGACTCGCACCCGGCGGATCGAACTTCGAT






GCCAAGTTGCGCCGCAATTCCACCGATCCGGAAGACATCTTCATCGCCCACATCAGCGCG






ATGGACGCGATGGCCCGTGCCCTGCTCAATGCGGCGGCCATCTGGACCGAATCGCCGATT






CAGGATATGGTCAGGGACCGCTATGCTTCCTTCGACAGCGGAAAGGGCAGGGAGTTCGAG






GAAGGCAGACTCAGTCTGGAAGACCTCGTGGCCTATGCGAAGGAGCACGGTGAGCCGCGC






CAGATCTCCGGCAGGCAGGAACTTTATGAAACCATCGTAGCGCTTTACTGCAGGTAA





5753MI4_002    Prevotella    Amino Acid    SEQ ID NO:172
MSKEYFPTIGRVPFEGPESKNPLAFHYYEPDRLVLGRKMKEWLRFAMAWWHTLGQASGDQ


FGGQTCTYAWDEGECPVCRAKAKADAGFELMQKLGIGYFCFHDVDLVEEADTIEEYEERM








RIITDYLLEKMEETGIRNLWGTANVEGHKRYMNGAATNPDFDVVARAAVQIKNAIDATIK








LGGENYVFWGGREGYTSLLNTQMHREKHHLGNMLRAARDYGRAHGFKGTFLIEPKPMEPT








KHQYDQDTETVIGFLRCHGLDKDFKVNIEVNHATLAGHTFEHELATAVDAGLLGSIDANR








GDAQNGWDTDQFPIDNYELTLAMLQIIRNGGLAPGGSNEDAKLRRNSTDPEDTFIANISA








MDAMARALLNAAAIWTESPIQDMVRDRYASFDSGKGREFEEGRLSLEDLVAYAKEHGEPR







QISGRQELYETIVALYCR





5752MI4_004    Prevotella    DNA    SEQ ID NO:173
ATGACTAAAGAGTATTTCCCGGGAATCGGAACGATTCCGTTTGAAGGAACCAAGAGCAAG



AACCCCCTGGCCTTCCATTATTATAACGCCTCCCAGGTAGTGATGGGCAAGCCCATGAAG






GACTGGCTCAAGTATGCCATGGCCTGGTGGCACACCCTGGGCCAGGCCTCTGCAGACCCC






TTTGGCGGCCAGACCCGCTCCTACGAATGGGACAAGGGCGAGTGCCCGTACTGCCGCGCC






AAGCAGAAGGCCGATGCCGGCTTTGAGCTCATGCAGAAGCTGGGCATCGAGTACTACTGC






TTCCACGACGTGGACATCATCGAGGACTGCGAGGACATTGCCGAGTACGAGGCCCGCATG






AAGGACATCACGGACTACCTGCTGGAGAAGCAGAAAGAGACCGGCATCAAGAACCTCTGG






GGCACCGCCAACGTGTTTGGCCACAAGCGCTACATGAACGGCGCCGCCACCAACCCTCAG






TTTGACATTGTGGCCCGTGCCGCCGTCCAGATCAAGAACGCCCTGGATGCCGCCATCAAA






CTGGGTGGTACCAACTACGTGTTCTGGGGTGGCCGCGAAGGCTACTACACGCTGCTCAAC






ACCCAGATGCAGCGGGAGAAGAACCACCTGGCCAAGATGCTCACCGCCGCCCGCGACTAC






GCCCGCGCCAAGGGCTTCAAGGGCACCTTCCTCATTGAGCCCAAACCCATGGAGCCCACC






AAGCACCAGTACGACGTGGACACCGAGACCGTGATTGGTTTCATCCGCGCCAACGGCCTG






GACAAGGACTTCAAGGTAAACATTGAGGTAAACCACGCCACCCTGGCCGGCCACACCTTT






GAGCACGAGCTCACCGTGGCCCGCGAGAACGGCTTCCTGGGCTCCATCGACGCCAACCGC






GGAGATGCCCAGAACGGCTGGGATACGGACCAGTTCCCCATCGACGCCCTGGATCTCACC






CAGGCTATGATGCAGGTCATCCTCAACGGTGGCTTCGGCAATGGCGGCACCAACTTTGAC






GCCAAGCTCCGCCGCTCCTCCACCGATCCCGAGGACATCTTCATCGCCCACATCAGCGCC






ATGGATGCCATGGCACACGCCCTCCTGAACGCAGCCGCCATCCTGGAAGAGAGCCCCCTG






CCCGCCATGGTCAAGGAGCGTTACGCTTCCTTCGACAGCGGTCTGGGCAAGAAGTTCGAA






GAAGGCAAGGCCTCCCTGGAAGAACTTTACGAATATGCCAAGAAGAATGGAGAGCCCGTG






GCCGCTTCCGGCAAACAGGAGCTCTGCGAAACTTACTTGAACCTCTATGCAAAGTAG





5752MI4_004    Prevotella    Amino Acid    SEQ ID NO:174
MTKEYFPGIGTIPFEGTKSKNPLAFHYYNASQVVMGKPMKDWLKYAMAWWHTLGQASADP


FGGQTRSYEWDKGECPYCRAKQKADAGFELMQKLGIEYYCFHDVDIIEDCEDIAEYEARM








KDITDYLLEKQKETGIKNLWGTANVEGHKRYMNGAATNPQFDIVARAAVQIKNALDAAIK








LGGTNYVFWGGREGYYTLLNTQMQREKNHLAKMLTAARDYARAKGFKGTFLIEPKPMEPT








KHQYDVDTETVIGFIRANGLEKDEKVNIEVNHATLAGHTFEHELTVARENGFLGSIDANR








GDAQNGWDTDQFPIDALDLTQAMMQVILNGGEGNGGTNEDAKLRRSSTDPEDIFIAHISA








MDAMAHALLNAAATLEESPLPAMVKERYASFDSGLGKKFEEGKASLEELYEYAKKNGEPV







AASGKQELCETYLNLYAK





727MI4_006    Rhizobiales    DNA    SEQ ID NO:175
GTGACTGATTTCTTCAAGGGCATCGCGCCCGTCAAGTTTGAGGGGCCGCAGAGCTCCAAT



CCGCTGGCCTATCGCCACTATAACAAGGACGAAATCGTCCTCGGCAAGCGGATGGAAGAC






CATATCCGTCCCGGCGTTGCCTATTGGCACACCTTCGCCTATGAGGGCGGCGATCCGTTT






GGCGGCCGCACCTTCGATCGCCCCTGGTTCGACAAGGGTATGGACGGCGCCCGCCTCAAG






GCCGACGTGGCCTTCGAACTGTTCGACCTGCTCGACGTTCCTTTCTTCTGTTTCCACGAT






GCTGATATCGCTCCCGAAGGCGCAACGCTGGCCGAGAGCAACCGCAATGTGCGCGAGATT






GGCGAGATCTTCGCTCGCAAGATGGAAACCAGCCGCACCAAGCTGCTCTGGGGTACGGCA






AACCTGTTCTCCAATCGCCGCTACATGGCCGGCGCCGCCACCAACCCGGACCCGGAAATC






TTCGCCTATGCCGCTGGGCAGGTGAAGAACGTGCTGGAACTGACCCACGAACTGGGCGGC






GCCAACTATGTGCTGTGGGGCGGTCGCGAGGGTTATGAAACCCTGCTCAACACCAAGATC






GGCCAGGAAATGGACCAGATGGGCCGTTTTCTGTCGATGGTCGTCGAGCATGCCGAAAAG






ATCGGCTTCAAGGGCCAGATCCTGATCGAGCCCAAGCCGCAGGAGCCGAGCAAGCACCAG






TATGACTTCGACGTTGCAACCGTTTACGGCTTCCTCAAGAAGTATGGTCTCGAAACCAAG






GTGAAGTGCAATATCGAGGTCGGCCATGCCTTCCTCGCCAATCACTCCTTCGAGCATGAA






CTGGCTTTGGCCGCATCGCTGGGCATTCTCGGCTCGGTCGACGCCAATCGCAACGATCTA






CAGTCCGGCTGGGATACCGACCAGTTCCCCAATAATGTCCCCGAAACCGCACTCGCCTTC






TATCAGATTCTCAAGGCGGGCGGACTGGGCAATGGCGGCTGGAACTTCGACGCCCGCGTG






CGCCGCCAGTCACTTGATCCGGCCGACCTGCTGCACGGCCATATCGGCGGCCTCGACGTG






CTGGCGCGCGGCCTCAAGGCCGCCGCGGCGCTGATCGAGGACGGCACCTATGACAAGGTC






GTCGACGCCCGCTATGCCGGCTGGAACCAGGGCCTGGGCAAGGATATCCTTGGTGGCAAG






CTGAACCTTGCCGACCTGGCTGCCAAGGTCGACGCCGAAAACCTCAACCCGCAGCCTAGG






TCCGGCCAGCAGGAATATCTCGAAAACCTGATCAACCGGTTCGTTTAG





727MI4_006    Rhizobiales    Amino Acid    SEQ ID NO:176
MTDFFKGIAPVKFEGPQSSNPLAYRHYNKDEIVLGKRMEDHIRPGVAYWHTFAYEGGDPF


GGRTFDRPWFDKGMDGARLKADVAFELFDLLDVPFFCFHDADIAPEGATLAESNRNVREI








GEIFARKMETSRTKLLWGTANLFSNRRYMAGAATNPDPEIFAYAAGQVKNVLELTHELGG








ANYVLWGGREGYETLLNTKIGQEMPQMGRFLSMVVEHAEKIGFKGQILIEPKPQEPSKHQ








YDFDVATVYGFLKKYGLETKVKCNIEVGHAFLANHSFEHELALAASLGILGSVDANRNDL








QSGWDTDQFPNNVPETALAFYQILKAGGLGNGGWNFDARVRRQSLDPADLLHGHIGGLDV








LARGLKAAAALIEDGTYDKVVDARYAGWNQGLGKDILGGKLNLADLAAKVDAENLNPQPR







SGQQEYLENLINRFV









5.5 Example 4: Quantification of XI Enzyme Activity

The clones identified in the ABD and SBD screens (see Table 2) were subcloned into vector p426PGK1 (FIG. 3), a modified version of p426GPD (ATCC accession number 87361) in which the GPD promoter was replaced with the PGK1 promoter from Saccharomyces cerevisiae (ATCC accession number 204501) gDNA. The clones were then transformed into yeast strain MYA1108.


Cells were grown as described in the materials and methods. Cell pellets were resuspended in about 300 μl of lysis buffer (approximate concentrations: 50 mM NaH2PO4 (pH 8.0), 300 mM NaCl, and 10 mM imidazole (Sigma, #I5513), to which were added about 2 μl/ml beta-mercaptoethanol (BME) and protease inhibitor cocktail tablets (Roche, 11836170001; 1 tablet per about 10 ml of cell extract)). The cell suspension was added to a 2 ml screw-cap microcentrifuge tube that had been pre-aliquoted with about 0.5 ml of acid-washed glass beads (425-600 μm). Cells were lysed using a FastPrep-24 (MP Biomedicals, Solon, Ohio) at an amplitude setting of about 6 for about 3 repetitions of about 1 minute each. Cells were chilled on ice for about 5 minutes between repetitions. Samples were centrifuged at about 10,000×g for about 10 minutes at 4° C. Recovered supernatants were used in the XI enzyme activity assay, which was performed as described in the materials and methods. Results are shown in Table 3.









TABLE 3
XI activity at pH 7.5

SEQ ID NO:     Volumetric Activity   FIOPC
2              −60.73                2.58
4              −21.84                0.93
6              0.86                  −0.05
8              −2.14                 0.12
10             −2.38                 0.13
12             −12.82                0.54
14             −26.97                1.45
16             −76.50                4.12
18             −15.32                0.83
20             −5.33                 0.29
22             0.48                  −0.03
24             0.36                  −0.02
26             0.81                  −0.04
28             −6.65                 0.36
30             −9.10                 0.49
32             −38.10                2.05
34             −21.76                1.17
36             −13.82                0.59
38             −17.58                0.75
40             −12.34                0.52
42             −74.88                3.18
44             −37.10                1.57
46             −35.57                1.51
48             −24.69                1.05
50             −32.23                1.37
52             −26.72                1.13
54             −90.79                3.85
56             −39.89                1.69
58             −74.26                3.15
60             −11.91                0.64
62             −15.43                0.83
64             −12.98                0.70
66             −27.45                1.48
68             −29.43                1.59
70             −4.54                 0.24
72             −8.93                 0.48
74             −0.20                 0.01
76             −0.33                 0.02
78             −50.55                2.15
80             −57.13                2.42
82             −58.09                2.47
84             −46.42                1.97
86             −35.95                1.53
88             −2.16                 0.09
90             −32.77                1.39
92             −30.82                1.31
94             −8.16                 0.35
96             −46.18                1.96
98             −30.05                1.28
100            −8.40                 0.45
102            −8.34                 0.45
104            −3.80                 0.20
106            −4.81                 0.26
108            −12.06                0.65
110            −6.10                 0.33
112            −7.71                 0.42
114            −4.17                 0.22
116            −7.07                 0.38
118            −13.50                0.73
120            −1.15                 0.06
122            0.03                  0.00
124            −4.41                 0.24
126            −0.85                 0.05
128            −14.60                0.79
130            −17.26                0.93
132            −0.75                 0.04
134            −11.55                0.62
136            −7.20                 0.39
138            0.16                  −0.01
140            −3.63                 0.20
142            −3.63                 0.20
144            −1.20                 0.06
146            −16.77                0.90
148            −2.00                 0.11
150            −1.40                 0.08
152            −3.63                 0.20
154            −7.09                 0.38
156            −0.96                 0.05
158            −2.79                 0.15
160            −3.23                 0.17
162            −10.17                0.55
164            −0.51                 0.03
166            −3.43                 0.19
168            −5.65                 0.30
170            −2.35                 0.13
172            −1.20                 0.06
174            −2.29                 0.12
176            −1.92                 0.08
Op-XI (ABD)    −23.56                NA
Op-XI (SBD)    −18.55                NA
Vo-ctrl        −1.74                 NA









5.6 Example 5: Growth of Yeast Containing XI Clones on Xylose

A subset of the XI genes from Example 3 were expressed in Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108) and assayed for the ability to confer growth on xylose. The assay was carried out as follows: colonies were isolated on SC-ura+2% glucose agar plates and inoculated into about 3 ml “pre-cultures” of both SC-ura 2% glycerol and SC-ura 2% xylose media, which were incubated overnight at about 30° C. and about 220 rpm. Cells were harvested by centrifugation (about 100×g, 5 minutes), the supernatant was discarded, and the cells were washed twice and resuspended in about 1 ml of SC-ura 2% xylose. Cells were inoculated into BioLector plates containing SC-ura 2% xylose, and inocula were normalized to two different starting optical densities of about OD600 0.2 and 0.4. Plates were covered with gas-permeable seals and incubated in a BioLector microfermentation device (m2p-labs, Model G-BL100) at about 30° C. for about 4 days at 800 rpm and 90% humidity. Growth readings from the BioLector were acquired for 60-100 hours according to the manufacturer's recommendations. Results are shown in FIG. 4.


5.7 Example 6: Ethanol Production Under Anaerobic Conditions

A subset of the XI-expressing yeast clones in strain Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108) was assayed for the ability to ferment xylose to ethanol (EtOH). In brief, single colonies were inoculated into about 25 ml of SC-ura medium supplemented with about 0.1% glucose and about 3% xylose. Cultures were incubated under microaerobic conditions at about 30° C. and about 200 rpm. Samples were harvested at about 0, 24, 48, and 72 h, and the ethanol concentration was determined by standard HPLC assays. Ethanol productivity was calculated in units of grams of EtOH per liter per hour, and FIOPC was calculated by comparing the productivity of each clone with that of the control Op-XI. Results are shown in Table 4.









TABLE 4
Anaerobic EtOH Production

                        Time (h)
SEQ ID NO:    0         24        48        72        EtOH (g/L/h)   FIOPC
6             0.28      0         0         0         −0.004         −0.5
8             0         0         0         0         0.000          0.0
10            0         0         0         0         0.000          0.0
14            0.37      0.28      0.71      1.24      0.013          1.7
16            0.33      0.275     0.72      1.06      0.011          1.4
18            0.29      0.135     0.31      0.595     0.005          0.6
20            0.33      0         0         0         −0.004         −0.5
22            0.32      0         0         0         −0.004         −0.5
24            0.28      0         0         0         −0.004         −0.5
26            0.26      0         0         0         −0.003         −0.4
28            0.23      0.385     1.015     1.54      0.019          2.5
30            0.27      0         0         0.07      −0.003         −0.3
32            0         0.165     0.48      0.815     0.012          1.5
34            0         0.125     0.33      0.615     0.009          1.1
36            0         0         0         0         0.000          0.0
46            0         0.285     0.905     1.625     0.023          3.0
60            0.45      0.35      0.87      1.39      0.014          1.8
62            0         0         0         0.065     0.001          0.1
64            0.38      0.275     0.735     1.18      0.012          1.6
66            0         0         0.12      0.22      0.003          0.4
68            0         0.05      0.275     0.5       0.007          0.9
70            0         0         0         0         0.000          0.0
72            0.119     0         0.054     0.1685    0.001          0.1
74            0.21      0.11      0.275     0.57      0.005          0.7
76            0.28      0         0         0         −0.004         −0.5
90            0         0.24      0.69      1.09      0.016          2.0
100           0.104     0.642     0.141     0.366     0.001          0.2
102           0.185     0         0         0.054     −0.002         −0.2
104           0.235     0.536     0         0         −0.005         −0.7
106           0.188     0.4835    0         0         −0.004         −0.6
108           0.19      0.5855    0.1455    0.313     0.000          0.0
110           0.3       0         0         0.05      −0.003         −0.4
112           0.19      0.5535    0.106     0.1135    −0.003         −0.4
114           0.174     0         0         0         −0.002         −0.3
116           0.15      0         0.0515    0.211     0.001          0.1
118           0.177     0.7075    0.5065    0.941     0.009          1.1
120           0.153     0         0         0         −0.002         −0.2
122           0.169     0.553     0         0.074     −0.003         −0.5
124           0.125     0         0         0         −0.002         −0.2
126           0.32      0         0         0         −0.004         −0.5
128           0         0         0         0         0.000          0.0
130           0         0         0         0         0.000          0.0
132           0.121     0         0         0         −0.002         −0.2
134           0.118     0         0         0.1105    0.000          0.0
136           0.108     0         0         0         −0.001         −0.2
138           0.172     0.513     0         0         −0.004         −0.6
140           0.17      0.542     0         0.3135    0.000          −0.1
142           0.102     0         0         0         −0.001         −0.2
144           0.28      0         0         0         −0.004         −0.5
146           0.103     0.635     0.263     0.563     0.004          0.5
150           0.27      0         0         0         −0.003         −0.4
149           0.27      0         0         0         −0.003         −0.4
152           0.17      0         0         0         −0.002         −0.3
154           0.23      0         0         0         −0.003         −0.4
156           0.23      0         0         0         −0.003         −0.4
158           0.4       0         0.105     0.23      −0.002         −0.2
160           0.38      0         0         0         −0.005         −0.6
162           0.36      0.055     0.23      0.41      0.001          0.2
164           0.32      0         0         0         −0.004         −0.5
166           0.31      0         0         0         −0.004         −0.5
168           0.32      0         0.295     0.6       0.005          0.6
170           0.164     0.4995    0         0         −0.004         −0.5
172           0.27      0         0         0         −0.003         −0.4
174           0.3       0         0.17      0.345     0.001          0.2
OP-XI (pos)   0.2385    0.5875    0.6965    0.81508   0.008          NA
Host-(neg)    0.23625   0.088125  0         0         −0.003         NA
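
The text does not state how the productivity column of Table 4 was derived from the four time points. As an illustration only, the sketch below assumes a least-squares slope of ethanol concentration against time and takes FIOPC as the ratio of a clone's slope to that of the Op-XI control. The function names are hypothetical. Under these assumptions the rounded results for SEQ ID NO:14 agree with the tabulated 0.013 g/L/h and 1.7.

```python
def productivity(times_h, etoh_g_per_l):
    """Least-squares slope of ethanol concentration vs. time, in g/L/h.

    Assumption: the patent does not name the fitting method; a simple
    linear regression is used here for illustration.
    """
    n = len(times_h)
    t_mean = sum(times_h) / n
    c_mean = sum(etoh_g_per_l) / n
    sxy = sum((t - t_mean) * (c - c_mean)
              for t, c in zip(times_h, etoh_g_per_l))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return sxy / sxx


def fiopc(sample_rate, control_rate):
    """Ratio of a clone's productivity to the positive control's."""
    return sample_rate / control_rate


TIMES_H = [0, 24, 48, 72]
# Ethanol concentrations taken from Table 4.
op_xi = productivity(TIMES_H, [0.2385, 0.5875, 0.6965, 0.81508])
seq14 = productivity(TIMES_H, [0.37, 0.28, 0.71, 1.24])

print(round(seq14, 3), round(fiopc(seq14, op_xi), 1))
```

A negative slope, as for clones whose only ethanol reading is at 0 h, yields a negative FIOPC under this reading, which matches the sign pattern in the table.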









5.8 Example 7: Impact of pH on XI Activity

Extracts from strain Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108), expressing XI gene candidates in vector p426PGK1, were prepared as described in the Materials and Methods and assayed for XI activity at pH 7.5 and pH 6.0. The percent activity listed was calculated by dividing the volumetric activity (VA) at pH 6 by the VA at pH 7.5 and multiplying by 100. Results are listed in Table 5.









TABLE 5
XI activity at pH 6 and pH 7.5

              Organism             VA, pH 6   VA, pH 7.5   Percent activity
SEQ ID NO:    Classification       (U/ml)     (U/ml)       (pH 6)
2             Bacteroidales        1.92       2.59         74%
14            Bacteroides          0.32       0.98         32%
16            Bacteroides          1.16       2.40         48%
32            Bacteroides          1.17       2.21         53%
38            Firmicutes           2.46       2.77         89%
42            Firmicutes           1.71       2.18         79%
44            Firmicutes           0.19       0.25         76%
46            Firmicutes           1.49       1.95         76%
50            Firmicutes           0.81       0.95         86%
52            Firmicutes           0.02       0.08         26%
54            Neocallimastigales   1.46       2.90         51%
58            Neocallimastigales   1.89       3.05         62%
68            Neocallimastigales   1.50       1.97         76%
72            Neocallimastigales   0.57       1.04         55%
78            Prevotella           2.40       3.61         67%
80            Prevotella           1.52       2.29         66%
82            Prevotella           1.48       1.65         89%
84            Prevotella           1.79       2.96         61%
96            Prevotella           2.13       3.56         60%
116           Prevotella           0.06       0.13         47%
Host-neg                           0.04       0.02         NA
Op-XI                              0.61       1.25         49%
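
The percent-activity calculation stated above (VA at pH 6 divided by VA at pH 7.5, times 100) can be sketched as follows; the function name is hypothetical. Because the tabulated activities are rounded to two decimals, recomputing from them can differ by a point from the printed percentage (for example, SEQ ID NO:14 gives about 32.7 rather than 32), presumably because the original values were calculated before rounding.

```python
def percent_activity(va_ph6, va_ph75):
    """Activity at pH 6 expressed as a percentage of activity at pH 7.5."""
    if va_ph75 <= 0:
        raise ValueError("VA at pH 7.5 must be positive")
    return 100.0 * va_ph6 / va_ph75


# SEQ ID NO:2 from Table 5: 1.92 U/ml at pH 6, 2.59 U/ml at pH 7.5.
print(f"{percent_activity(1.92, 2.59):.0f}%")
```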









5.9 Example 8: Km for Selected XI Clones

The Km and Vmax at pH 6 were determined for a subset of the XI clones, expressed from the p426PGK1 vector in Saccharomyces cerevisiae CEN.PK2-1Ca (ATCC: MYA1108), using the XI activity assay described in the Materials and Methods and varying the concentration of xylose from about 40 to about 600 mM. Results were calculated using the Hanes plot, which rearranges the Michaelis-Menten equation (v=Vmax[S]/(Km+[S])) as [S]/v=Km/Vmax+[S]/Vmax. Plotting [S]/v against [S] gives a straight line in which the y intercept=Km/Vmax, the slope=1/Vmax, and the x intercept=−Km. Results are listed in Table 6.









TABLE 6
Km determination for 3 XIs

SEQ ID NO:    Km      Vmax
78            35.2    27.6
96            33.7    28.0
38            28.8    28.6
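
The Hanes plot analysis described above can be sketched as a straight-line fit of [S]/v against [S]. The raw rate measurements are not given in the text, so the data below are hypothetical, noise-free rates generated from the Km and Vmax reported for SEQ ID NO:78; the function name and the chosen xylose concentrations are likewise assumptions for illustration.

```python
def hanes_fit(substrate, rates):
    """Estimate (Km, Vmax) from a Hanes plot.

    Fits [S]/v = Km/Vmax + [S]/Vmax by ordinary least squares, so that
    slope = 1/Vmax and y intercept = Km/Vmax.
    """
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
             / sum((a - x_mean) ** 2 for a in x))
    intercept = y_mean - slope * x_mean
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Hypothetical data: Michaelis-Menten rates computed from the Table 6
# values for SEQ ID NO:78, at xylose concentrations within 40-600 mM.
TRUE_KM, TRUE_VMAX = 35.2, 27.6
xylose_mM = [40, 100, 200, 400, 600]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in xylose_mM]

km, vmax = hanes_fit(xylose_mM, rates)
print(round(km, 1), round(vmax, 1))
```

With real, noisy measurements the fit would return estimates rather than exact values; the x intercept of the fitted line, −Km, gives a quick visual check.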









5.10 Example 9: Quantification of XI Activity Expressed from Single Genomic Integration Locus

A vector named pYDAB006 (FIG. 5A) for integration into locus YER131.5 (between YER131W and YER132C) in the S. cerevisiae genome was constructed using conventional cloning methods. The vector backbone, with a PacI site at each end, was derived from pBluescript II SK (+) (Agilent Technologies, Inc., Santa Clara, Calif.) by standard PCR techniques and contained only the pUC origin of replication and the bla gene, which encodes an ampicillin resistance protein, as a selectable marker. Two 300-base pair segments named YER131.5-A and YER131.5-B were amplified from yeast genomic DNA by standard PCR techniques and connected with a multiple cloning site (MCS 1: 5′-GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC-3′ (SEQ ID NO:181)) using the overlapping PCR technique. The PCR primers used in the overlapping PCR are shown in Table 7 below:









TABLE 7
Primers Used in pYDAB006 Construction

Primer     SEQ ID NO:   Sequence (PacI site is underlined)
131.5AF    182          caccattaattaaAGCTTTGTAAATATGATGAGAGAATAATATAAATCAAACG
131.5AR    183          GGCGCGCCTCTAGAAAGCTTAATCGACAAGAACACTTCTATTTATATAGGTATGAAA
131.5BF    184          GCAGGGATATCGGTACCCACCAGCGGCCGCTGAAGAAGGTTTATTTCGTTTCGCTGT
131.5BR    185          caccattaattaaCCCAGGTGAGACTGGATGCTCCATA
ABMCSF     186          GCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCCACCAGCGGCCGC
ABMCSR     187          CGCTGGTGGGTACCGATATCCCTGCAGGGAGCTCACGCGTAAGCTTTCTAGAGGCGCGCC









The overlapping PCR product was then ligated with the vector backbone resulting in plasmid pYDAB006.


A vector named pYDURA01 (FIG. 5B) for generating a selectable and recyclable yeast marker was constructed using a method similar to that used for pYDAB006. The URA3 expression cassette was amplified from yeast genomic DNA by standard PCR techniques. The 200 base pair fingerprint sequence (named R88: TGCGTGTGCCGCGAGTCCACGTCTACTCGCGAACCGAGTGCAGGCGGGTCTTCGGCCAGGACGGCCGTGCGTGACCCCGGCCGCCAGACGAAACGGACCGCGCTCGCCAGACGCTACCCAGCCCGTTCATGCCGGCCGCGAGCCGACCTGTCTCGGTCGCTTCGACGCACGCGCGGTCCTTTCGGGTACTCGCCTAAGAC (SEQ ID NO:188)) placed on both sides of the URA3 cassette was amplified by standard PCR techniques from the genomic DNA of yBPA317, a diploid strain having the genotype MATa/MATalpha; URA3/ura3; YDL074.5::P(TDH3)-CBT1-T(CYC1)-R88 YLR388.5::P(TDH3)-StBGL-T(CYC1)-R88/YLR388.5::P(TDH3)-StBGL-T(CYC1)-R88. The primers used in the amplification are described in Table 8 below:









TABLE 8
Primers Used in pYDURA01 Construction

Primer             SEQ ID NO:   Sequence (KpnI and NotI sites are underlined)
NotI-KpnI-R88-F    189          caatagcggccgcggtaccTGCGTGTGCCGCGAGTCCAC
R88-BamHI-R        190          TGTTAGGATCCGTCTTAGGCGAGTACCCGAAAGG
BamHI-ura-F        191          caataggatccAGGCATATTTATGGTGAAGAATAAGT
ura-Xho-R          192          TGTTACTCGAGAAATCATTACGACCGAGATTCCCG
XhoI-R88-F         193          caatactcgagTGCGTGTGCCGCGAGTCCAC
R88-NotI-R         194          TGTTAGCGGCCGCGTCTTAGGCGAGTACCCGAAAGG









An expression cassette was generated for the XI genes by cloning into a vector named pYDPt005 (FIG. 5C). pYDPt005 was generated using a method similar to that used for pYDAB006. It contained a TDH3 promoter and a PGK1 terminator flanking a multiple cloning site (MCS 2: 5′-ACTAGTGGATCCCTCGAGGTCGACGTTTAAAC-3′ (SEQ ID NO:195), where ACTAGT is the SpeI site, CTCGAG is the XhoI site, and GTTTAAAC is the PmeI site). The promoter and the terminator were amplified from S. cerevisiae genomic DNA; an AscI site was added to the 5′ end of the TDH3 promoter and a KpnI site was added to the 3′ end of the PGK1 terminator during amplification. Primers used in the amplification are described in Table 9.









TABLE 9
Primers Used in pYDPt005 Construction

Primer           SEQ ID NO:   Sequence (AscI and KpnI sites are underlined)
TDH-F            196          CACCAGGCGCGCCTCTAGAAAGCTTACGCGTAGTTTATCATTATCAATACTGCCATTTCAAAGA
overlap-TDH-R    197          AACGTCGACCTCGAGGGATCCACTAGTTCGAAACTAAGTTCTTGGTGTTTTAAAACT
overlap-PGK-F    198          GTGGATCCCTCGAGGTCGACGTTTAAACATTGAATTGAATTGAAATCGATAGATCAAT
PGK-R            199          CACCAGCGGCCGCGGTACCGATATCCCTGCAGGGAGCTCGAAATATCGAATGGGAAAAAAAAACTGGAT









An Orpinomyces sp. XI gene (NCBI:169733248) was cloned into this vector between the SpeI and XhoI sites. The Orpinomyces sp. XI expression cassette and the R88-Ura-R88 fragment were then cloned into vector pYDAB006 using the AscI, KpnI and NotI sites; the resulting plasmid was named pYDABF006 (FIG. 5D). Subsequently, the Orpinomyces sp. XI gene in pYDABF006 was replaced with a subset of the XI genes of Table 2 by digesting pYDABF006 with SpeI and PmeI and ligating it to a DNA fragment encoding the appropriate XI sequence, which had been amplified from p426PGK1-XI constructs. During amplification, a SpeI site followed by a Kozak-like sequence (6 consecutive adenines) was added immediately in front of the start codon of the XI genes, and a PmeI site was added to the 3′ end of the XI genes.
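
The primer layout described above (SpeI site, six adenines, then the start of the gene; PmeI site appended after the 3′ end) can be sketched as follows. The actual primers used are not given in the text, so the annealing length of 20 nucleotides, the absence of extra 5′ padding bases, and the function names are assumptions for illustration. The toy open reading frame joins the first and last bases of SEQ ID NO:167 and is not a complete gene.

```python
# Recognition sequences of the two enzymes named in the text.
SPEI_SITE = "ACTAGT"
PMEI_SITE = "GTTTAAAC"
KOZAK_LIKE = "A" * 6  # "6 consecutive adenines"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers, both written 5' to 3'.

    anneal_len is a hypothetical choice; the patent does not state it.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    # Forward: SpeI site, Kozak-like adenines, then the start of the ORF.
    forward = SPEI_SITE + KOZAK_LIKE + orf[:anneal_len]
    # Reverse: complement of the ORF 3' end followed by the PmeI site.
    reverse = reverse_complement(orf[-anneal_len:] + PMEI_SITE)
    return forward, reverse


# Toy ORF: first 24 and last 21 nucleotides of SEQ ID NO:167 (truncated).
toy_orf = "ATGGCAAAAGAGTATTTCCCCACT" + "CTGAACCTCTACGCCAAGTAG"
fwd, rev = design_primers(toy_orf)
print(fwd)
print(rev)
```

Amplifying with such a pair would yield a fragment that carries the SpeI site upstream of the start codon and the PmeI site downstream of the stop codon, which is what the SpeI/PmeI replacement step requires.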


XI gene integration cassettes were excised by PacI digestion and used to transform yeast strain yBPA130 using standard techniques. Transformants were selected for growth on SC-Ura (Synthetic Complete, Ura dropout) agar plates. The integration position and the presence of the XI cassette in transformants were confirmed by PCR using the primers shown in Table 10.









TABLE 10
Primers Used in Integration Verification

Primer               SEQ ID NO:   Sequence
5′ of integration    200          ACAGGGATAACAAAGTTTCTCCAGC
3′ of integration    201          CATACCAAGTCATGCGTTACCAGAG
5′ of R88-ura-R88    202          TTTCCCATTCGATATTTCGAGCTCC
3′ of integration    203          CATACCAAGTCATGCGTTACCAGAG









Confirmed clones were then grown for about 18 hours in liquid YPD to allow looping out of the URA3 marker and were selected for growth on SC+5-FOA agar plates. The absence of the URA3 marker was confirmed by PCR.


Strains containing the confirmed XI expression cassettes were inoculated into about 3 ml of modified YP Media (YP+0.1% Glucose+3.0% Xylose) and incubated overnight at about 30° C. and about 220 rpm. These overnight cultures were subcultured into about 25 ml of the same media to about OD600=0.2. Samples were incubated overnight at about 30° C. and about 220 rpm. Cultures were harvested when OD600 was between about 3 and 4. Pellets were collected by centrifugation for about 5 minutes at about 4000 rpm. The supernatant was discarded and pellets washed with about 25 ml of distilled-deionized water and centrifuged again using the same conditions. Supernatant was discarded and the pellet frozen at about −20° C. until lysis and characterization.


Cell pellets were thawed and about 200 mg of each pellet sample was weighed out into 2 ml microcentrifuge tubes. About 50 μl of Complete®, EDTA-free Protease Inhibitor cocktail (Roche Part#11873 580 001) at 5 times the concentration stated in the manufacturer's protocol was added to each sample, followed by about 0.5 ml of Y-PER Plus® Dialyzable Yeast Protein Extraction Reagent (Thermo Scientific Part#78999) (YP+). Samples were incubated at about 25° C. for about 4 hours on a rotating mixer. Sample supernatants were collected after centrifugation at about 10,000×g for about 10 minutes for characterization.


Total protein concentrations of the XI sample extracts prepared above were determined using Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad, cat#500-0006, Hercules, Calif.), which is a modified version of the Bradford method (Bradford).


Yeast physiological pH is known to range from about pH 6 to about pH 7.5 (Pena, Ramirez et al., 1995, J. Bacteriology 4:1017-1022). XI activity at yeast physiological pH was ranked using the assay conditions at pH 7.5 and modified for pH 6.0 as described in the materials and methods. The specific activities (SA) of 20 XIs, each expressed from a single copy integrated into the yeast YER131.5 locus, were evaluated. The results are listed in Table 11.









TABLE 11
SA of XI Expressed in an Industrial S. cerevisiae

                 Organism             SA, pH 6   SA, pH 7.5
SEQ ID NO:       Classification       (U/mg)     (U/mg)
2                Bacteroidales        0.86       1.08
14               Bacteroides          0.33       1.07
16               Bacteroides          0.57       1.05
32               Bacteroides          0.53       1.00
38               Firmicutes           1.00       0.94
42               Firmicutes           0.79       0.82
44               Firmicutes           0.08       0.10
46               Firmicutes           0.62       0.69
50               Firmicutes           0.35       0.41
52               Firmicutes           0.01       0.03
54               Neocallimastigales   0.64       1.17
58               Neocallimastigales   0.79       1.10
68               Neocallimastigales   0.01       0.02
72               Neocallimastigales   0.22       0.40
78               Prevotella           1.10       1.45
80               Prevotella           0.74       1.11
82               Prevotella           0.54       0.60
84               Prevotella           0.76       1.06
96               Prevotella           1.10       1.62
116              Prevotella           0.03       0.06
Host neg ctrl                         0.00       0.02










5.11 Example 10: Identification of Sequence Motifs in Acid Tolerant XIs

The proposed mechanism of xylose isomerases can be summarized as follows: (i) binding of xylose to xylose isomerase, so that O3 and O4 are coordinated by metal ion I; (ii) enzyme-catalyzed ring opening (the identity of the ring-opening group remains a subject for further investigation; ring opening may be the rate limiting step in the overall isomerization process); (iii) chain extension (sugar binds in a linear extended form) in which O2 and O4 now coordinate metal ion I; (iv) O2 becomes deprotonated causing a shift of metal ion II from position 1 to an adjacent position 2 in which it coordinates O1 and O2 of the sugar together with metal ion I; (v) isomerization via an anionic transition state arises by a hydride shift promoted by electrophilic catalysis provided by both metal ions; (vi) collapse of transition state by return of metal ion II to position 1; (vii) chain contraction to a pseudo-cyclic position with ligands to metal ion I changing from O2/O4 back to O3/O4; (viii) enzyme-catalyzed ring closure; (ix) dissociation of xylulose from xylose isomerase (Lavie et al., 1994, Biochemistry 33(18), 5469-5480).


Many XIs identified contained one or both of two signature sequences characteristic of XIs, [LI]EPKP.{2}P (SEQ ID NO:204) and [FL]HD[^K]D[LIV].[PD].[GDE] (SEQ ID NO:205). Additional sequence motifs present in the top performing Firmicutes and Prevotella XIs were identified. The motifs are located near the active site including residues in direct contact with the D-xylose and/or the metal ions. The motifs are shown in Table 12 below:









TABLE 12
XI Sequence Motifs

XI Source     Motif   Sequence                                          SEQ ID NO:
Firmicutes    1A      P[FY][AST][MLVI][AS][WYFL]W[HT]N[LFMG]GA          206
Firmicutes    1B      P[FY][AS].{2}[WYFL]W[HT][^TV].GA                  207
Firmicutes    2       [GSN][IVA]R[YFHG][FYLIV]C[FW]HD.D                 208
Firmicutes    3       T[ASTC][NK][^L]F.[NDH][PRKAG][RVA][FY]C           209
Firmicutes    4       [WFY]D[TQVI]D.[FY][PF][^T].{2,4}[YFH]S[ATL]T      210
Firmicutes    5       GF[NH]FD[SA]KTR                                   211
Prevotella    1A      FG.QT[RK].{2}E[WYF][DNG].{2,3}[DNEGT][AT]         212
Prevotella    1B      FG.QT[RK].{2}E[WYF][DNG].{3}[^C][^P]              213
Prevotella    2       [FW]HD.D[LVI].[DE]EG[^P][TSD][IV][EA]E            214
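
The two signature sequences named in this example (SEQ ID NO:204 and SEQ ID NO:205) are written in regular-expression notation, so a sequence can be screened for them directly. The sketch below is illustrative only; the function name is hypothetical, and the test strings are short excerpts of SEQ ID NO:176 and SEQ ID NO:166 as they appear in the sequence listing, not full-length proteins.

```python
import re

# Signature patterns copied from the text of Example 10.
SIGNATURES = {
    "SEQ ID NO:204": re.compile(r"[LI]EPKP.{2}P"),
    "SEQ ID NO:205": re.compile(r"[FL]HD[^K]D[LIV].[PD].[GDE]"),
}


def find_signatures(protein: str) -> dict:
    """Return the first match of each signature, or None if absent."""
    hits = {}
    for name, pattern in SIGNATURES.items():
        match = pattern.search(protein.upper())
        hits[name] = match.group(0) if match else None
    return hits


# Excerpts from the sequence listing (fragments, not whole sequences).
fragments = {
    "SEQ ID NO:176, excerpt a": "VPFFCFHDADIAPEGATLAESN",
    "SEQ ID NO:176, excerpt b": "GFKGQILIEPKPQEPSKHQ",
    "SEQ ID NO:166, excerpt": "GYYCFHDIDLVEDTEDIAEY",
}

for label, fragment in fragments.items():
    print(label, find_signatures(fragment))
```

The third excerpt contains an FHD tripeptide but does not satisfy the full SEQ ID NO:205 pattern, which is consistent with the statement that XIs may contain one or both signatures.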









5.12 Example 11: In Vivo Evaluation of Xylose Isomerase

Haploid S. cerevisiae strains yBPA130 (MATa::ura3) and yBPA136 (MATalpha::ura3) were genetically modified to enhance C5 xylose utilization during fermentation. The modifications included the following: the native glucose-repressible alcohol dehydrogenase II gene ADH2 was disrupted by inserting an expression cassette of the endogenous transaldolase gene TAL1 (SEQ ID NO:215) and xylulokinase gene XKS1 (SEQ ID NO:216). PHO13, the native gene encoding an alkaline phosphatase specific for p-nitrophenyl phosphate, was disrupted by inserting the native transketolase-1 gene TKL1 (SEQ ID NO:217). The native aldose reductase gene GRE3 was disrupted by inserting the native D-ribulose-5-phosphate 3-epimerase gene RPE1 (SEQ ID NO:218) and the ribose-5-phosphate ketol-isomerase gene RKI1 (SEQ ID NO:219). In addition, one expression cassette of the native galactose permease gene GAL2 (SEQ ID NO:220) was integrated into the S. cerevisiae strain, resulting in haploid strains pBPB007 (MATa::ura3) and pBPB008 (MATalpha::ura3). The genotype of pBPB007 and pBPB008 is adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2. The sequences are shown in Table 13, below:












TABLE 13







Sequence Name    Type of sequence    SEQ ID NO:    Sequence







TAL1 (S. cerevisiae)    DNA    215
ATGTCTGAACCAGCTCAAAAGAAACAAAAGGTTGCTAACAACTCT



CTAGAACAATTGAAAGCCTCCGGCACTGTCGTTGTTGCCGACACT





GGTGATTTCGGCTCTATTGCCAAGTTTCAACCTCAAGACTCCACA





ACTAACCCATCATTGATCTTGGCTGCTGCCAAGCAACCAACTTAC





GCCAAGTTGATCGATGTTGCCGTGGAATACGGTAAGAAGCATGGT





AAGACCACCGAAGAACAAGTCGAAAATGCTGTGGACAGATTGTTA





GTCGAATTCGGTAAGGAGATCTTAAAGATTGTTCCAGGCAGAGTC





TCCACCGAAGTTGATGCTAGATTGTCTTTTGACACTCAAGCTACC





ATTGAAAAGGCTAGACATATCATTAAATTGTTTGAACAAGAAGGT





GTCTCCAAGGAAAGAGTCCTTATTAAAATTGCTTCCACTTGGGAA





GGTATTCAAGCTGCCAAAGAATTGGAAGAAAAGGACGGTATCCAC





TGTAATTTGACTCTATTATTCTCCTTCGTTCAAGCAGTTGCCTGT





GCCGAGGCCCAAGTTACTTTGATTTCCCCATTTGTTGGTAGAATT





CTAGACTGGTACAAATCCAGCACTGGTAAAGATTACAAGGGTGAA





GCCGACCCAGGTGTTATTTCCGTCAAGAAAATCTACAACTACTAC





AAGAAGTACGGTTACAAGACTATTGTTATGGGTGCTTCTTTCAGA





AGCACTGACGAAATCAAAAACTTGGCTGGTGTTGACTATCTAACA





ATTTCTCCAGCTTTATTGGACAAGTTGATGAACAGTACTGAACCT





TTCCCAAGAGTTTTGGACCCTGTCTCCGCTAAGAAGGAAGCCGGC





GACAAGATTTCTTACATCAGCGACGAATCTAAATTCAGATTCGAC





TTGAATGAAGACGCTATGGCCACTGAAAAATTGTCCGAAGGTATC





AGAAAATTCTCTGCCGATATTGTTACTCTATTCGACTTGATTGAA





AAGAAAGTTACCGCTTAA





XKS1 (S.
DNA
216
ATGTTGTGTTCAGTAATTCAGAGACAGACAAGAGAGGTTTCCAAC



cerevisiae)



ACAATGTCTTTAGACTCATACTATCTTGGGTTTGATCTTTCGACC





CAACAACTGAAATGTCTCGCCATTAACCAGGACCTAAAAATTGTC





CATTCAGAAACAGTGGAATTTGAAAAGGATCTTCCGCATTATCAC





ACAAAGAAGGGTGTCTATATACACGGCGACACTATCGAATGTCCC





GTAGCCATGTGGTTAGAGGCTCTAGATCTGGTTCTCTCGAAATAT





CGCGAGGCTAAATTTCCATTGAACAAAGTTATGGCCGTCTCAGGG





TCCTGCCAGCAGCACGGGTCTGTCTACTGGTCCTCCCAAGCCGAA





TCTCTGTTAGAGCAATTGAATAAGAAACCGGAAAAAGATTTATTG





CACTACGTGAGCTCTGTAGCATTTGCAAGGCAAACCGCCCCCAAT





TGGCAAGACCACAGTACTGCAAAGCAATGTCAAGAGTTTGAAGAG





TGCATAGGTGGGCCTGAAAAAATGGCTCAATTAACAGGGTCCAGA





GCCCATTTTAGATTTACTGGTCCTCAAATTCTGAAAATTGCACAA





TTAGAACCAGAAGCTTACGAAAAAACAAAGACCATTTCTTTAGTG





TCTAATTTTTTGACTTCTATCTTAGTGGGCCATCTTGTTGAATTA





GAGGAGGCAGATGCCTGTGGTATGAACCTTTATGATATACGTGAA





AGAAAATTCAGTGATGAGCTACTACATCTAATTGATAGTTCTTCT





AAGGATAAAACTATCAGACAAAAATTAATGAGAGCACCCATGAAA





AATTTGATAGCGGGTACCATCTGTAAATATTTTATTGAGAAGTAC





GGTTTCAATACAAACTGCAAGGTCTCTCCCATGACTGGGGATAAT





TTAGCCACTATATGTTCTTTACCCCTGCGGAAGAATGACGTTCTC





GTTTCCCTAGGAACAAGTACTACAGTTCTTCTGGTCACCGATAAG





TATCACCCCTCTCCGAACTATCATCTTTTC





ATTCATCCAACTCTGCCAAACCATTATATGGGTATGATTTGTTAT





TGTAATGGTTCTTTGGCAAGGGAGAGGATAAGAGACGAGTTAAAC





AAAGAACGGGAAAATAATTATGAGAAGACTAACGATTGGACTCTT





TTTAATCAAGCTGTGCTAGATGACTCAGAAAGTAGTGAAAATGAA





TTAGGTGTATATTTTCCTCTGGGGGAGATCGTTCCTAGCGTAAAA





GCCATAAACAAAAGGGTTATCTTCAATCCAAAAACGGGTATGATT





GAAAGAGAGGTGGCCAAGTTCAAAGACAAGAGGCACGATGCCAAA





AATATTGTAGAATCACAGGCTTTAAGTTGCAGGGTAAGAATATCT





CCCCTGCTTTCGGATTCAAACGCAAGCTCACAACAGAGACTGAAC





GAAGATACAATCGTGAAGTTTGATTACGATGAATCTCCGCTGCGG





GACTACCTAAATAAAAGGCCAGAAAGGACTTTTTTTGTAGGTGGG





GCTTCTAAAAACGATGCTATTGTGAAGAAGTTTGCTCAAGTCATT





GGTGCTACAAAGGGTAATTTTAGGCTAGAAACACCAAACTCATGT





GCCCTTGGTGGTTGTTATAAGGCCATGTGGTCATTGTTATATGAC





TCTAATAAAATTGCAGTTCCTTTTGATAAATTTCTGAATGACAAT





TTTCCATGGCATGTAATGGAAAGCATATCCGATGTGGATAATGAA





AATTGGGATCGCTATAATTCCAAGATTGTCCCCTTAAGCGAACTG





GAAAAGACTCTCATCTAA





TKL1 (S.
DNA
217
ATGACTCAATTCACTGACATTGATAAGCTAGCCGTCTCCACCATA



cerevisiae)



AGAATTTTGGCTGTGGACACCGTATCCAAGGCCAACTCAGGTCAC





CCAGGTGCTCCATTGGGTATGGCACCAGCTGCACACGTTCTATGG





AGTCAAATGCGCATGAACCCAACCAACCCAGACTGGATCAACAGA





GATAGATTTGTCTTGTCTAACGGTCACGCGGTCGCTTTGTTGTAT





TCTATGCTACATTTGACTGGTTACGATCTGTCTATTGAAGACTTG





AAACAGTTCAGACAGTTGGGTTCCAGAACACCAGGTCATCCTGAA





TTTGAGTTGCCAGGTGTTGAAGTTACTACCGGTCCATTAGGTCAA





GGTATCTCCAACGCTGTTGGTATGGCCATGGCTCAAGCTAACCTG





GCTGCCACTTACAACAAGCCGGGCTTTACCTTGTCTGACAACTAC





ACCTATGTTTTCTTGGGTGACGGTTGTTTGCAAGAAGGTATTTCT





TCAGAAGCTTCCTCCTTGGCTGGTCATTTGAAATTGGGTAACTTG





ATTGCCATCTACGATGACAACAAGATCACTATCGATGGTGCTACC





AGTATCTCATTCGATGAAGATGTTGCTAAGAGATACGAAGCCTAC





GGTTGGGAAGTTTTGTACGTAGAAAATGGTAACGAAGATCTAGCC





GGTATTGCCAAGGCTATTGCTCAAGCTAAGTTATCCAAGGACAAA





CCAACTTTGATCAAAATGACCACAACCATTGGTTACGGTTCCTTG





CATGCCGGCTCTCACTCTGTGCACGGTGCCCCATTGAAAGCAGAT





GATGTTAAACAACTAAAGAGCAAATTCGGTTTCAACCCAGACAAG





TCCTTTGTTGTTCCACAAGAAGTTTACGACCACTACCAAAAGACA





ATTTTAAAGCCAGGTGTCGAAGCCAACAACAAGTGGAACAAGTTG





TTCAGCGAATACCAAAAGAAATTCCCAGAATTAGGTGCTGAATTG





GCTAGAAGATTGAGCGGCCAACTACCCGCA





AATTGGGAATCTAAGTTGCCAACTTACACCGCCAAGGACTCTGCC





GTGGCCACTAGAAAATTATCAGAAACTGTTCTTGAGGATGTTTAC





AATCAATTGCCAGAGTTGATTGGTGGTTCTGCCGATTTAACACCT





TCTAACTTGACCAGATGGAAGGAAGCCCTTGACTTCCAACCTCCT





TCTTCCGGTTCAGGTAACTACTCTGGTAGATACATTAGGTACGGT





ATTAGAGAACACGCTATGGGTGCCATAATGAACGGTATTTCAGCT





TTCGGTGCCAACTACAAACCATACGGTGGTACTTTCTTGAACTtC





GTTTCTTATGCTGCTGGTGCCGTTAGATTGTCCGCTTTGTCTGGC





CACCCAGTTATTTGGGTTGCTACACATGACTCTATCGGTGTCGGT





GAAGATGGTCCAACACATCAACCTATTGAAACTTTAGCACACTTC





AGATCCCTACCAAACATTCAAGTTTGGAGACCAGCTGATGGTAAC





GAAGTTTCTGCCGCCTACAAGAACTCTTTAGAATCCAAGCATACT





CCAAGTATCATTGCTTTGTCCAGACAAAACTTGCCACAATTGGAA





GGTAGCTCTATTGAAAGCGCTTCTAAGGGTGGTTACGTACTACAA





GATGTTGCTAACCCAGATATTATTTTAGTGGCTACTGGTTCCGAA





GTGTCTTTGAGTGTTGAAGCTGCTAAGACTTTGGCCGCAAAGAAC





ATCAAGGCTCGTGTTGTTTCTCTACCAGATTTCTTCACTTTTGAC





AAACAACCCCTAGAATACAGACTATCAGTCTTACCAGACAACGTT





CCAATCATGTCTGTTGAAGTTTTGGCTACCACATGTTGGGGCAAA





TACGCTCATCAATCCTTCGGTATTGACAGATTTGGTGCCTCCGGT





AAGGCACCAGAAGTCTTCAAGTTCTTCGGTTTCACCCCAGAAGGT





GTTGCTGAAAGAGCTCAAAAGACCATTGCATTCTATAAGGGTGAC





AAGCTAATTTCTCCTTTGAAAAAAGCTTTCTAA





RPE1 (S.
DNA
218
ATGGTCAAACCAATTATAGCTCCCAGTATCCTTGCTTCTGACTTC



cerevisiae)



GCCAACTTGGGTTGCGAATGTCATAAGGTCATCAACGCCGGCGCA





GATTGGTTACATATCGATGTCATGGACGGCCATTTTGTTCCAAAC





ATTACTCTGGGCCAACCAATTGTTACCTCCCTACGTCGTTCTGTG





CCACGCCCTGGCGATGCTAGCAACACAGAAAAGAAGCCCACTGCG





TTCTTCGATTGTCACATGATGGTTGAAAATCCTGAAAAATGGGTC





GACGATTTTGCTAAATGTGGTGCTGACCAATTTACGTTCCACTAC





GAGGCCACACAAGACCCTTTGCATTTAGTTAAGTTGATTAAGTCT





AAGGGCATCAAAGCTGCATGCGCCATCAAACCTGGTACTTCTGTT





GACGTTTTATTTGAACTAGCTCCTCATTTGGATATGGCTCTTGTT





ATGACTGTGGAACCTGGGTTTGGAGGCCAAAAATTCATGGAAGAC





ATGATGCCAAAAGTGGAAACTTTGAGAGCCAAGTTCCCCCATTTG





AATATCCAAGTCGATGGTGGTTTGGGCAAGGAGACCATCCCGAAA





GCCGCCAAAGCCGGTGCCAACGTTATTGTCGCTGGTACCAGTGTT





TTCACTGCAGCTGACCCGCACGATGTTATCTCCTTCATGAAAGAA





GAAGTCTCGAAGGAATTGCGTTCTAGAGATTTGCTAGATTAG





RKI1 (S.
DNA
219
ATGGCTGCCGGTGTCCCAAAAATTGATGCGTTAGAATCTTTGGGC



cerevisiae)



AATCCTTTGGAGGATGCCAAGAGAGCTGCAGCATACAGAGCAGTT





GATGAAAATTTAAAATTTGATGATCACAAAATTATTGGAATTGGT





AGTGGTAGCACAGTGGTTTATGTTGCCGAAAGAATTGGACAATAT





TTGCATGACCCTAAATTTTATGAAGTAGCGTCTAAATTCATTTGC





ATTCCAACAGGATTCCAATCAAGAAACTTGATTTTGGATAACAAG





TTGCAATTAGGCTCCATTGAACAGTATCCTCGCATTGATATAGCG





TTTGACGGTGCTGATGAAGTGGATGAGAATTTACAATTAATTAAA





GGTGGTGGTGCTTGTCTATTTCAAGAAAAATTGGTTAGTACTAGT





GCTAAAACCTTCATTGTCGTTGCTGATTCAAGAAAAAAGTCACCA





AAACATTTAGGTAAGAACTGGAGGCAAGGTGTTCCCATTGAAATT





GTACCTTCCTCATACGTGAGGGTCAAGAATGATCTATTAGAACAA





TTGCATGCTGAAAAAGTTGACATCAGACAAGGAGGTTCTGCTAAA





GCAGGTCCTGTTGTAACTGACAATAATAACTTCATTATCGATGCG





GATTTCGGTGAAATTTCCGATCCAAGAAAATTGCATAGAGAAATC





AAACTGTTAGTGGGCGTGGTGGAAACAGGTTTATTCATCGACAAC





GCTTCAAAAGCCTACTTCGGTAATTCTGACGGTAGTGTTGAAGTT





ACCGAAAAGTGA





GAL2 (S.
DNA
220
ATGGCAGTTGAGGAGAACAATATGCCTGTTGTTTCACAGCAACCC



cerevisiae)



CAAGCTGGTGAAGACGTGATCTCTTCACTCAGTAAAGATTCCCAT





TTAAGCGCACAATCTCAAAAGTATTCTAATGATGAATTGAAAGCC





GGTGAGTCAGGGTCTGAAGGCTCCCAAAGTGTTCCTATAGAGATA





CCCAAGAAGCCCATGTCTGAATATGTTACCGTTTCCTTGCTTTGT





TTGTGTGTTGCCTTCGGCGGCTTCATGTTTGGCTGGGATACCGGT





ACTATTTCTGGGTTTGTTGTCCAAACAGACTTTTTGAGAAGGTTT





GGTATGAAACATAAGGATGGTACCCACTATTTGTCAAACGTCAGA





ACAGGTTTAATCGTCGCCATTTTCAATATTGGCTGTGCCTTTGGT





GGTATTATACTTTCCAAAGGTGGAGATATGTATGGCCGTAAAAAG





GGTCTTTCGATTGTCGTCTCGGTTTATATAGTTGGTATTATCATT





CAAATTGCCTCTATCAACAAGTGGTACCAATATTTCATTGGTAGA





ATCATATCTGGTTTGGGTGTCGGCGGCATCGCCGTCTTATGTCCT





ATGTTGATCTCTGAAATTGCTCCAAAGCACTTGAGAGGCACACTA





GTTTCTTGTTATCAGCTGATGATTACTGCAGGTATCTTTTTGGGC





TACTGTACTAATTACGGTACAAAGAGCTATTCGAACTCAGTTCAA





TGGAGAGTTCCATTAGGGCTATGTTTCGCTTGGTCATTATTTATG





ATTGGCGCTTTGACGTTAGTTCCTGAATCCCCACGTTATTTATGT





GAGGTGAATAAGGTAGAAGACGCCAAGCGTTCCATTGCTAAGTCT





AACAAGGTGTCACCAGAGGATCCTGCCGTCCAGGCAGAGTTAGAT





CTGATCATGGCCGGTATAGAAGCTGAAAAACTGGCTGGCAATGCG





TCCTGGGGGGAATTATTTTCCACCAAGACCAAAGTATTTCAACGT





TTGTTGATGGGTGTGTTTGTTCAAATGTTC





CAACAATTAACCGGTAACAATTATTTTTTCTACTACGGTACCGTT





ATTTTCAAGTCAGTTGGCCTGGATGATTCCTTTGAAACATCCATT





GTCATTGGTGTAGTCAACTTTGCCTCCACTTTCTTTAGTTTGTGG





ACTGTCGAAAACTTGGGACATCGTAAATGTTTACTTTTGGGCGCT





GCCACTATGATGGCTTGTATGGTCATCTACGCCTCTGTTGGTGTT





ACTAGATTATATCCTCACGGTAAAAGCCAGCCATCTTCTAAAGGT





GCCGGTAACTGTATGATTGTCTTTACCTGTTTTTATATTTTCTGT





TATGCCACAACCTGGGCGCCAGTTGCCTGGGTCATCACAGCAGAA





TCATTCCCACTGAGAGTCAAGTCGAAATGTATGGCGTTGGCCTCT





GCTTCCAATTGGGTATGGGGGTTCTTGATTGCATTTTTCACCCCA





TTCATCACATCTGCCATTAACTTCTACTACGGTTATGTCTTCATG





GGCTGTTTGGTTGCCATGTTTTTTTATGTCTTTTTCTTTGTTCCA





GAAACTAAAGGCCTATCGTTAGAAGAAATTCAAGAATTATGGGAA





GAAGGTGTTTTACCTTGGAAATCTGAAGGCTGGATTCCTTCATCC





AGAAGAGGTAATAATTACGATTTAGAGGATTTACAACATGACGAC





AAACCGTGGTACAAGGCCATGCTAGAATAA









A vector named pYDAB008 rDNA (FIG. 6), for integrating xylose isomerase genes into the ribosomal DNA loci of the S. cerevisiae genome, was constructed using conventional cloning methods. This vector can confer high copy number integration of genes, resulting in high-level expression of proteins. The vector was derived from pBluescript II SK (+) (Agilent Technologies, Inc., Santa Clara, Calif.). The pUC origin of replication and the bla gene encoding ampicillin resistance, which serves as a selectable marker for cloning, were amplified with specific primers. A 741 base-pair R1 region, a 253 base-pair R3 region and an 874 base-pair R2 region were amplified from yeast genomic DNA by PCR. A multiple cloning site of SEQ ID NO:181 (5′-GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC-3′) was inserted between the R1 and R3/R2 regions by assembly using overlapping PCR. All primers used in the above reactions are shown in Table 14. The overlapping PCR products were then ligated in one reaction, resulting in the rDNA integration plasmid named pYDAB008 rDNA (FIG. 6).
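The layout of the multiple cloning site of SEQ ID NO:181 can be inspected with a short script. The following is a minimal sketch, not part of the original disclosure: it scans the MCS for the recognition sequences of several common restriction enzymes. The enzyme list is an assumption chosen for illustration (the text itself names only Asc I, Kpn I and Pac I), and the helper name `map_sites` is hypothetical.

```python
# Recognition sequences of some common restriction enzymes.
# The choice of enzymes is an illustrative assumption; only Asc I and Kpn I
# are named in the text as being used for cloning into this site.
SITES = {
    "AscI": "GGCGCGCC",
    "XbaI": "TCTAGA",
    "HindIII": "AAGCTT",
    "MluI": "ACGCGT",
    "SacI": "GAGCTC",
    "SbfI": "CCTGCAGG",
    "EcoRV": "GATATC",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
}

# Multiple cloning site of SEQ ID NO:181, copied from the text.
MCS = "GGCGCGCCTCTAGAAAGCTTACGCGTGAGCTCCCTGCAGGGATATCGGTACCGCGGCCGC"


def map_sites(seq, sites):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


site_map = map_sites(MCS, SITES)
# Print the enzymes in the order their sites occur along the MCS.
for enzyme, positions in sorted(site_map.items(), key=lambda kv: kv[1][0]):
    print(enzyme, positions)
```

Under these assumptions the Asc I site opens the MCS and the Kpn I site lies near its 3′ end, which is consistent with an Asc I/Kpn I fragment being cloned directionally into the vector.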









TABLE 14
Primers Used in pYDAB008 rDNA vector construction

Primer | SEQ ID NO: | Sequence (PacI restriction site is underlined)
PacI-rDNA(R1)-R | 221 | CACCATTAATTAACCCGGGGCACCTGTCACTTTGGAA
rDNA (R1)-over-R | 222 | CGCGTAAGCTTTCTAGAGGCGCGCCAAGCTTTTACACTCTTGACCAGCGCA
AB vector-MCS-R | 223 | CCGCTGGTGGGTACCGATATCCCTGCAGGGAGCTCACGCGTAAGCTTTCTAGAGGCG
rDNA(R3)-over-R | 224 | CTGCAGGGATATCGGTACCCACCAGCGGCCGCAGGCCTTGGGTGCTTGCTGGCGAA
rDNA(R3)-over-R | 225 | ACCTCTGCATGCGAATTCTTAAGACAAATAAAATTTATAGAGACTTGT
rDNA(R2)-over-R | 226 | GTCTTAAGAATTCGCATGCAGAGGTAGTTTCAAGGT
PacI-rDNA(R2)-R | 227 | CACCATTAATTAATACGTATTTCTCGCCGAGAAAAACTT









pYDABF-0015 (a plasmid comprising a Boles codon optimized nucleic acid of SEQ ID NO:244, encoding a xylose isomerase of SEQ ID NO:78) and pYDABF-0026 (a plasmid comprising a Boles codon optimized nucleic acid of SEQ ID NO:245, encoding a xylose isomerase of SEQ ID NO:96) (both described in Example 10) were digested with Asc I and Kpn I restriction enzymes (New England Biolabs Inc., MA, USA), and the XI-coding inserts were ligated into the pYDAB008 rDNA integration vector described above (FIG. 6). The resulting plasmids were named pYDABF-0033 (SEQ ID NO:78) and pYDABF-0036 (SEQ ID NO:96). Additionally, Boles codon optimized nucleic acids encoding the xylose isomerases of SEQ ID NO:54 and SEQ ID NO:58 (SEQ ID NO:238 and SEQ ID NO:239, respectively) were ordered from Genewiz (Genewiz Inc., NJ, USA), digested with Asc I and Kpn I restriction enzymes (New England Biolabs Inc., MA, USA) and ligated into the pYDAB008 rDNA integration vector (FIG. 6). The codon-optimized sequences are set forth in Table 15, below:











TABLE 15






SEQ



Sequence
ID



Description
NO:
Sequence







Codon optimized DNA
238
ATGGCTAAGGAATACTTCCCAGAAATTGGTAAGATTAAGTTCGAA


encoding XI of SEQ

GGTAAGGACTCTAAGAACCCAATGGCTTTCCACTACTACGACCCA


ID NO: 54

GAAAAGGTTATTATGGGTAAGCCAATGAAGGACTGGTTGAGATTC




GCTATGGCTTGGTGGCACACCTTGTGTGCTGAAGGTGGTGACCAA




TTCGGTGGTGGTACTAAGAAGTTCCCATGGAACAACGGTGCTGAC




GCTGTTGAAATTGCTAAGCAAAAGGCTGACGCTGGTTTCGAAATT




ATGCAAAAGTTGGGTATTCCATACTTCTGTTTCCACGACGTTGAC




TTGGTTTCTGAAGGTGCTTCTGTTGAAGAATACGAAGCTAACTTG




AAGGCTATTACCGACTACTTGGCTGTTAAGATGAAGGAAACCGGA




ATTAAGTTGTTGTGGTCTACCGCTAACGTTTTCGGTAACGGTAGA




TACATGAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCT




AGAGCTATTGTTCAAATTAAGAACGCTATTGACGCTGGTATTAAG




TTGGGTGCTGAAAACTACGTTTTCTGGGGTGGTAGAGAAGGTTAC




ATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATG




GCTACCATGTTGACCATGGCTAGAGACTACGCTAGAGCTAAGGGT




TTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCT




AAGCACCAATACGACGTTGACACCGAAACCGTTATTGGTTTCTTG




AAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTT




AACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCT




GTTGCTGTTGACAACAACATGTTGGGTTCTATTGACGCTAACAGA




GGTGACTACCAAAACGGTTGGGACACCGACCAATTCCCAATTGAC




CAATACGAATTGGTTCAAGCTTGGATGGAAATTATTAGAGGTGGT




GGTTTGGGTACAGGTGGTACTAACTTCGACGCTAAGACCAGAAGA




AACTCTACCGACTTGGAAGACATTTTCATTGCTCACATTGCTGGT




ATGGACGCTATGGCTAGAGCTTTGGAATCTGCTGCTAAGTTGTTG




GAAGAATCTCCATACAAGGCTATGAAGGCTGCTAGATACGCTTCT




TTCGACAACGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGACC




TTGGAACAAGCTTACGAATACGGTAAGAAGGTTGGTGAACCAAAG




CAAACCTCTGGTAAGCAAGAATTGTACGAAGCTATTGTTGCTATG




TACGCTTAA





Codon optimized DNA
239
ATGGCTAAGGAATACTTCCCAGAAATTGGTAAGATTAAGTTCGAA


encoding XI of SEQ

GGTAAGGACTCTAAGAACCCAATGGCTTTCCACTACTACGACGCT


ID NO: 58

GAAAAGGTTATTATGGGTAAGCCAATGAAGGAATGGTTGAGATTC




GCTATGGCTTGGTGGCACACCTTGTGTGCTGAAGGTGGTGACCAA




TTCGGTGGTGGTACTAAGAAGTTCCCATGGAACGAAGGTACTGAC




GCTGTTACCATTGCTAAGCAAAAGGCTGACGCTGGTTTCGAAATT




ATGCAAAAGTTGGGTTTCCCATACTTCTGTTTCCACGACATTGAC




TTGGTTTCTGAAGGTAACTCTATTGAAGAATACGAAGCTAACTTG




CAAGCTATTACCGACTACTTGAAGGTTAAGATGGAAGAAACCGGA




ATTAAGTTGTTGTGGTCTACCGCTAACGTTTTCGGTAACGGTAGA




TACATGAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTTGCT




AGAGCTATTGTTCAAATTAAGAACGCTATTGACGCTGGTATTAAG




TTGGGTGCTGAAAACTACGTTTTCTGGGGTGGTAGAGAAGGTTAC




ATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACACATG




GCTACCATGTTGACCATGGCTAGAGACTACGCTAGATCTAAGGGT




TTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCATCT




AAGCACCAATACGACGTTGACACCGAAACCGTTATTGGTTTCTTG




AAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAAGTT




AACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTGGCT




GTTGCTGTTGACAACGGTATGTTGGGTTCTATTGACGCTAACAGA




GGTGACTACCAAAACGGTTGGGACACCGACCAATTCCCAATTGAC




CAATACGAATTGGTTCAAGCTTGGATGGAAATTATTAGAGGTGGT




GGTTTGGGTACTGGTGGTACAAACTTCGACGCTAAGACCAGAAGA




AACTCTACCGACTTGGAAGACATTTTCATTGCTCACATTTCTGGT




ATGGACGCTATGGCTAGAGCTTTGGAATCTGCTGCTAAGTTGTTG




GAAGAATCTCCATACTGTGCTATGAAGAAGGCTAGATACGCTTCT




TTCGACTCTGGTATTGGTAAGGACTTCGAAGACGGTAAGTTGACC




TTGGAACAAGCTTACGAATACGGTAAGAAGGTTGGTGAACCAAAG




CAAACCTCTGGTAAGCAAGAATTGTACGAAGCTATTGTTGCTATG




TACGCTTAA





Codon optimized DNA
244
ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA


encoding XI of SEQ

GGTAAGGACTCTAAGAACGTTATGGCTTTCCACTATTATGAACCA


ID NO: 78

GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTC




GCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAA




TTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGGTGAC




GCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATT




ATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGAC




TTGGTTGAAGAAGGTGACACCATTGAAGAATATGAAGCTAGAATG




AAGGCTATTACCGACTATGCTCAAGAAAAGATGAAGCAATTCCCA




AACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTAACAAG




AGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTT




GCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATT




AAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGT




TATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACAC




ATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAG




GGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCA




TCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTC




TTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAA




GTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTG




GCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAAC




AGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATT




GACAACTATGAATTGACCCAAGCTATGTTGGAAATTATTAGAAAC




GGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGA




AGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCT




GGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGACATT




TTGGAAAACTCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCT




TCTTTCGACCAAGGTGTTGGTAAGGACTTCGAAGACGGTAAGTTG




ACCTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGGTGAACCA




AAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT




TTGTATGCTAAGTAA





Codon optimized DNA
245
ATGGCTAAGGAATATTTCCCATTCATTGGTAAGGTTCCATTCGAA


encoding XI of SEQ

GGTACTGAATCTAAGAACGTTATGGCTTTCCACTATTATGAACCA


ID NO: 96

GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTC




GCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAA




TTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGCTGAC




GCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATT




ATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGAC




TTGGTTGAAGAAGGTGAAACCGTTGCTGAATATGAAGCTAGAATG




AAGGTTATTACCGACTATGCTTTGGAAAAGATGCAACAATTCCCA




AACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTCACAAG




AGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTT




GCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATT




AAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGT




TATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACAC




ATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAG




GGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCA




TCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTC




TTGAGGGCTCACGGTTTGGACAAGGACTTCAAGGTTAACATTGAA




GTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTG




GCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAAC




AGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATT




GACAACTATGAATTGACCCAAGCTATGATGGAAATTATTAGAAAC




GGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGA




AGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCT




GGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGCTATT




TTGGAAGAATCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCT




TCTTTCGACGAAGGTATTGGTAAGGACTTCGAAGACGGTAAGTTG




TCTTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGAAGAACCA




AAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT




TTGTATGCTAAGTAA
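Because codon optimization changes the nucleotide sequence but should leave the encoded protein unchanged, a codon-optimized open reading frame is commonly checked by translating it. The following is a minimal sketch of such a check, assuming the standard genetic code; it translates only the first 45 nucleotides of SEQ ID NO:244 and SEQ ID NO:245 as printed in Table 15 and lists the positions at which the two encoded N-termini differ. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 45 nt of each sequence, copied from Table 15.
seq244_start = "ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA"
seq245_start = "ATGGCTAAGGAATATTTCCCATTCATTGGTAAGGTTCCATTCGAA"

pep244 = translate(seq244_start)
pep245 = translate(seq245_start)

# 1-based positions at which the two translated N-termini differ.
diffs = [(i + 1, a, b) for i, (a, b) in enumerate(zip(pep244, pep245)) if a != b]
print(pep244, pep245, diffs)
```

In practice the full-length translation would be compared against the intended protein sequence (for example SEQ ID NO:78 or SEQ ID NO:96); only short fragments are used here to keep the sketch compact.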









The resulting plasmids were named pYDABF-0033 (SEQ ID NO:78), pYDABF-0036 (SEQ ID NO:96), pYDABF-0231 (SEQ ID NO:54) and pYDABF-0232 (SEQ ID NO:58).


The rDNA integration cassette was linearized by Pac I restriction enzyme digestion (New England Biolabs Inc., MA, USA) and purified with a DNA column purification kit (Zymo Research, Irvine, Calif., USA). The integration cassette was transformed into the modified haploid S. cerevisiae strains pBPB007 (MATa::ura3) and pBPB008 (MATalpha::ura3) using the standard protocol described in previous examples. Transformants were plated on SC-xylose (SC complete+2% xylose) agar plates and incubated for about 2-3 days at about 30° C. Colonies that grew on the SC-xylose agar plates were then checked by colony PCR with the primer sets shown in Table 16 (SEQ ID NOs:228, 229, 230, 231) to confirm the presence of the xylose isomerase gene in the genome.









TABLE 16
Primers Used in Integration Verification

Primer | SEQ ID NO: | Sequence
N16PCR_F | 228 | CCCCATCGACAACTACGAGCTCACT
N16PCR_R | 229 | CAACTTGCCGTCCTCGAAGTCCTTG
N05PCR_F | 230 | CGAGCCTGAGAAGGTCGTGATGGGA
N05PCR_R | 231 | TACGTCGAAGTCGGGGTTGGTAGAA
N08PCR_F | 240 | TACTTGGCTGTTAAGATGAAG
N08PCR_R | 241 | ATCTAGCAGCCTTCATAGCCTT
N17PCR_F | 242 | CGAAGGTACTGACGCTGTTACC
N17PCR_R | 243 | CGAAAGAAGCGTATCTAGCCTT
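A verification primer pair can be checked in silico before colony PCR by searching for the forward primer, and for the reverse complement of the reverse primer, on the coding strand of the integrated gene. The following is a minimal sketch of that check. As an illustration it tests the N08PCR primers against two 45-nucleotide fragments copied from SEQ ID NO:238 in Table 15; the pairing of these primers with that sequence is inferred here from the sequences themselves and is not stated in the text. The function names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_hit(template, primer, is_reverse):
    """Return the 0-based index of the primer's binding site on the given
    (top) strand, or -1 if there is no exact match."""
    query = reverse_complement(primer) if is_reverse else primer.upper()
    return template.upper().find(query)


# Primers from Table 16.
N08PCR_F = "TACTTGGCTGTTAAGATGAAG"
N08PCR_R = "ATCTAGCAGCCTTCATAGCCTT"

# Two fragments of SEQ ID NO:238 as printed in Table 15 (illustrative choice).
fwd_fragment = "AAGGCTATTACCGACTACTTGGCTGTTAAGATGAAGGAAACCGGA"
rev_fragment = "GAAGAATCTCCATACAAGGCTATGAAGGCTGCTAGATACGCTTCT"

fwd_index = primer_hit(fwd_fragment, N08PCR_F, is_reverse=False)
rev_index = primer_hit(rev_fragment, N08PCR_R, is_reverse=True)
print(fwd_index, rev_index)
```

With the full-length template in place of the fragments, the distance between the two hits plus the length of the reverse primer would give the expected amplicon size.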










Confirmed haploid strains were BD31328 (MATa) and BD31336 (MATalpha), BD31526 (MATa) and BD31527 (MATalpha), BD34364 (MATa) and BD34365 (MATalpha), and BD34366 (MATa) and BD34367 (MATalpha). Diploid strains BD31378 (expressing a xylose isomerase of SEQ ID NO:96), BD31365 (expressing a xylose isomerase of SEQ ID NO:78), BD34369 (expressing a xylose isomerase of SEQ ID NO:54) and BD34377 (expressing a xylose isomerase of SEQ ID NO:58) were generated by conventional plate mating on YPXylose (YP+2% xylose) agar plates for about 2 days at about 30° C. Colony PCR with specific primers for checking mating type (shown in Table 17) was performed, and single colonies having both MATa and MATalpha were picked as diploid strains BD31378 (SEQ ID NO:96), BD31365 (SEQ ID NO:78), BD34369 (SEQ ID NO:54) and BD34377 (SEQ ID NO:58).


A linear fragment encoding the URA3 sequence (SEQ ID NO:237; TTAATTAAGTTAATTACCTTTTTTGCGAGGCATATTTATGGTGAAGAATAAGTTTTGACCATC AAAGAAGGTTAATGTGGCTGTGGTTTCAGGGTCCATAAAGCTTTTCAATTCATCATTTTTTTT TTATTCTTTTTTTTGATTCCGGTTTCCTTGAAATTTTTTTGATTCGGTAATCTCCGAACAGAAG GAAGAACGAAGGAAGGAGCACAGACTTAGATTGGTATATATACGCATATGTAGTGTTGAAG AAACATGAAATTGCCCAGTATTCTTAACCCAACTGCACAGAACAAAAACCTGCAGGAAACG AAGATAAATCATGTCGAAAGCTACATATAAGGAACGTGCTGCTACTCATCCTAGTCCTGTTG CTGCCAAGCTATTTAATATCATGCACGAAAAGCAAACAAACTTGTGTGCTTCATTGGATGTT CGTACCACCAAGGAATTACTGGAGTTAGTTGAAGCATTAGGTCCCAAAATTTGTTTACTAAA AACACATGTGGATATCTTGACTGATTTTTCCATGGAGGGCACAGTTAAGCCGCTAAAGGCAT TATCCGCCAAGTACAATTTTTTACTCTTCGAAGACAGAAAATTTGCTGACATTGGTAATACA GTCAAATTGCAGTACTCTGCGGGTGTATACAGAATAGCAGAATGGGCAGACATTACGAATG CACACGGTGTGGTGGGCCCAGGTATTGTTAGCGGTTTGAAGCAGGCGGCAGAAGAAGTAAC AAAGGAACCTAGAGGCCTTTTGATGTTAGCAGAATTGTCATGCAAGGGCTCCCTAGCTACTG GAGAATATACTAAGGGTACTGTTGACATTGCGAAGAGCGACAAAGATTTTGTTATCGGCTTT ATTGCTCAAAGAGACATGGGTGGAAGAGATGAAGGTTACGATTGGTTGATTATGACACCCG GTGTGGGTTTAGATGACAAGGGAGACGCATTGGGTCAACAGTATAGAACCGTGGATGATGT GGTCTCTACAGGATCTGACATTATTATTGTTGGAAGAGGACTATTTGCAAAGGGAAGGGATG CTAAGGTAGAGGGTGAACGTTACAGAAAAGCAGGCTGGGAAGCATATTTGAGAAGATGCGG CCAGCAAAACTAAAAAACTGTATTATAAGTAAATGCATGTATACTAAACTCACAAATTAGA GCTTCAATTTAATTATATCAGTTATTACCCGGGAATCTCGGTCGTAATGATTTTTATAATGAC GAAAAAAAAAAAATTGGAAAGAAAAAGCTTCATGGCCTTTATAAAAAGGAACCATCCAATA CCTCGCCAGAACCAAGTAACAGTATTTTACGGTTAATTAA) was transformed into BD31378 (SEQ ID NO:96), BD31365 (SEQ ID NO:78), BD34369 (SEQ ID NO:54) and BD34377 (SEQ ID NO:58) by a conventional transformation protocol, and transformants were plated on SCXylose-URA (Synthetic Complete, Uracil dropout) for selection. Colonies were checked by PCR with the primers shown in Table 17 (SEQ ID NO:235 and SEQ ID NO:236). The confirmed strains were BD31446 (SEQ ID NO:78), BD31448 (SEQ ID NO:96), BD34373 (SEQ ID NO:54) and BD34378 (SEQ ID NO:58).









TABLE 17
Primers Used in Mating Type Verification

Primer | SEQ ID NO: | Sequence
1-mating type-R | 232 | AGTCACATCAAGATCGTTTAT
2-mating type alpha-F | 233 | GCACGGAATATGGGACTACTT
3-mating type a-F | 234 | ACTCCACTTCAAGTAAGAGTT
Ura fix-F | 235 | GAACAAAAACCTGCAGGAAACGAAGAT
Ura fix-R | 236 | GCTCTAATTTGTGAGTTTAGTATACATGCAT









Table 18 below shows the genotypes of the resulting yeast strains:









TABLE 18
Strain Construction

Name | Parent Strain | Description
pBPB007 | yBPA130 | MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2
pBPB008 | yBPA136 | MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2
BD31328 | pBPB007 | MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31336 | pBPB008 | MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31526 | pBPB007 | MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD31527 | pBPB008 | MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD34364 | pBPB007 | MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34365 | pBPB008 | MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34366 | pBPB007 | MATa, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD34367 | pBPB008 | MATalpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD31378 | BD31328, BD31336 | MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31365 | BD31526, BD31527 | MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
BD34369 | BD34364, BD34365 | MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 238)
BD34377 | BD34366, BD34367 | MATa/alpha, ura3, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 239)
BD31448 | BD31378 | MATa/alpha, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 96)
BD31446 | BD31365 | MATa/alpha, adh2::TAL1-XKS1, pho13::TKL1-XKS1, gre3::RPE1-RKI1 and YLR388.5::GAL2, rDNA::XI (SEQ ID NO: 78)
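The parent/child relationships listed in Table 18 can be recorded as a small lookup structure, which makes it straightforward to trace any strain back to its starting haploids. The following is a minimal sketch; the dictionary holds only the Name and Parent Strain columns of Table 18, and the names `PARENTS` and `ancestors` are hypothetical.

```python
# Parent strains as listed in Table 18 (diploids have two haploid parents).
PARENTS = {
    "pBPB007": ["yBPA130"],
    "pBPB008": ["yBPA136"],
    "BD31328": ["pBPB007"],
    "BD31336": ["pBPB008"],
    "BD31526": ["pBPB007"],
    "BD31527": ["pBPB008"],
    "BD34364": ["pBPB007"],
    "BD34365": ["pBPB008"],
    "BD34366": ["pBPB007"],
    "BD34367": ["pBPB008"],
    "BD31378": ["BD31328", "BD31336"],
    "BD31365": ["BD31526", "BD31527"],
    "BD34369": ["BD34364", "BD34365"],
    "BD34377": ["BD34366", "BD34367"],
    "BD31448": ["BD31378"],
    "BD31446": ["BD31365"],
}


def ancestors(strain, parents=PARENTS):
    """Return the set of all strains from which `strain` was derived."""
    seen = set()
    stack = list(parents.get(strain, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("BD31448")))
```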









5.13 Example 12: Fermentation Performance of Yeast Strain Expressing Different Xylose Isomerases

Fermentation performances of two different XI-expressing yeast strains were evaluated using the DasGip fermentation systems (Eppendorf, Inc.). DasGip fermenters allow close control over agitation, pH, and temperature, ensuring consistency of the environment during fermentation. DasGip fermenters were used to test the performance of the yeast strains expressing the XI genes on hydrolysate (Hz) (neutralized with magnesium bases) as a primary carbon source. Prior to the start of fermentation, strains were subjected to propagation testing consisting of two steps, as described below.


Seed 1:


About 1 ml of strain glycerol stock was inoculated into about 100 ml of YP (Yeast extract, Peptone) medium containing about 2% glucose and about 1% xylose in a 250 ml Bellco baffled flask (Bellco, Inc.). Strains were cultivated at about 30° C. with about 200 rpm agitation for at least 18 hours, until full saturation. Optical density was assessed by measuring light absorbance at a wavelength of 600 nm.


Seed 2:


About 20 ml of saturated SEED 1 (see preceding paragraph) was inoculated into a 3 L Bioflo unit (New Brunswick, Inc.) containing about 2.1 L of basal medium at pH 6.0 (1% v/v inoculation). Cultivation was conducted at about 30° C. in fed-batch mode with a constant air flow of about 2 L/min. The agitation ramp was about 200-626 rpm over about 15 hours, starting at about 5 hours of elapsed fermentation time (EFT). The feeding profile was about 0-4.8 ml/min over 20 hours. The basal medium contained (per 1 L): about 20% of neutralized hydrolysate (Hz), about 20 g/L sucrose (from cane juice), about 35 ml of nutrient mixture (Table 19), about 1 ml of vitamin mixture (Table 20), about 0.4 ml of antifoam 1410 (Dow Corning, Inc.) and water. The feed medium contained (per 1 L): about 20% neutralized hydrolysate (Hz), about 110 g/L sucrose (from cane juice), about 35 ml of nutrient mixture, about 1 ml of vitamin mixture, about 0.4 ml of antifoam 1410 (Dow Corning, Inc.) and water.
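The agitation and feed profiles for SEED 2 can be expressed as setpoint functions of elapsed fermentation time. The following is a minimal sketch that assumes both profiles are linear ramps, which the text does not state explicitly. It also assumes that feeding begins at 0 hours EFT; the feed start time is not given, so it is exposed as the hypothetical parameter `feed_start_h`.

```python
def linear_ramp(t_h, start_h, duration_h, low, high):
    """Linearly interpolate from `low` to `high` between start_h and
    start_h + duration_h; hold the end values outside that window."""
    if t_h <= start_h:
        return low
    if t_h >= start_h + duration_h:
        return high
    return low + (high - low) * (t_h - start_h) / duration_h


def agitation_rpm(eft_h):
    """Agitation setpoint: about 200-626 rpm over about 15 h, from about 5 h EFT."""
    return linear_ramp(eft_h, start_h=5.0, duration_h=15.0, low=200.0, high=626.0)


def feed_rate_ml_min(eft_h, feed_start_h=0.0):
    """Feed setpoint: about 0-4.8 ml/min over 20 h (start time is an assumption)."""
    return linear_ramp(eft_h, start_h=feed_start_h, duration_h=20.0, low=0.0, high=4.8)


# Setpoints at a few example time points (hours of EFT).
for t in (0.0, 5.0, 12.5, 20.0):
    print(t, agitation_rpm(t), feed_rate_ml_min(t))
```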









TABLE 19
Nutrients mixture

Component | FW (g/mol) | Conc.
KH2PO4 H2O | 154.1 | 99.1 g/L
Urea | 60.06 | 65.6 g/L
MgSO4·7H2O | 192.4 | 14.6 g/L
DI Water | NA | To 1.0 L

















TABLE 20
Vitamin mixture (1000×)

Component | Conc. (mM)
ZnSO4 | 100
H3BO3 | 24
KI | 1.8
MnSO4 | 20
CuSO4 | 10
Na2MoO4 | 1.5
CoCl2 | 1.5
FeCl3 | 1.23
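The final concentration of each stock component in the basal or feed medium follows from a simple dilution: stock concentration multiplied by the stock volume added per liter of medium. The following is a minimal sketch using the stock concentrations of Tables 19 and 20 and the dosing stated for SEED 2 (about 35 ml of nutrient mixture and about 1 ml of vitamin mixture per 1 L). The variable and function names are hypothetical, and contributions from the hydrolysate or cane juice are ignored.

```python
# Stock concentrations from Table 19 (g/L) and Table 20 (mM).
NUTRIENT_STOCK_G_PER_L = {"KH2PO4 H2O": 99.1, "Urea": 65.6, "MgSO4 7H2O": 14.6}
VITAMIN_STOCK_MM = {
    "ZnSO4": 100, "H3BO3": 24, "KI": 1.8, "MnSO4": 20,
    "CuSO4": 10, "Na2MoO4": 1.5, "CoCl2": 1.5, "FeCl3": 1.23,
}


def final_concentration(stock_conc, stock_ml_per_l):
    """Concentration in the medium after adding stock_ml_per_l of stock per liter."""
    return stock_conc * stock_ml_per_l / 1000.0


# About 35 ml/L of nutrient mixture and about 1 ml/L of the 1000x vitamin mixture.
nutrients_final = {k: final_concentration(v, 35.0) for k, v in NUTRIENT_STOCK_G_PER_L.items()}
vitamins_final = {k: final_concentration(v, 1.0) for k, v in VITAMIN_STOCK_MM.items()}

for name, conc in nutrients_final.items():
    print(f"{name}: {conc:.3f} g/L")
for name, conc in vitamins_final.items():
    print(f"{name}: {conc:.5f} mM")
```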










DasGip Fermentation:


Strains were tested in small scale fermentation using the DasGip system in the industrially relevant medium containing detoxified hydrolysate and sucrose. Strains were propagated as described above; DasGip inoculation was performed using the following protocol:


Cell dry weight of SEED 2 was assessed based on the final optical density. The correlation between cell dry weight and optical density (600 nm) was used to estimate the volume of the SEED 2 culture needed for fermentation. The targeted inoculation level was about 7% v/v (about 1.5 g/L cell dry weight). The appropriate volume of SEED 2 culture was harvested by centrifugation (about 5000 rpm for 10 min) to pellet the cells, which were resuspended in about 17.5 ml of PBS. The resuspended cell solution was used to inoculate a 500 ml DasGip unit containing about 250 ml of detoxified hydrolysate and nutrient solution (about 3.5 ml/100 ml of medium). Fermentation was performed at about 32° C. at pH 6.3 with about 200 rpm agitation. The duration of fermentation was about 92 hours with regular sampling. Sampling was conducted with a 25 ml serological pipette through the port in the head plate of the DasGip unit. About 3 ml of culture was taken out and centrifuged (about 5000 rpm for 10 min) to pellet the cells, and the supernatant was submitted for analysis. Standard analytical techniques such as high-pressure liquid chromatography (HPLC) were used to determine the concentration of sugars and ethanol in the medium. Fermentation performances for yeast strains BD31378 (expressing a xylose isomerase of SEQ ID NO:96) and BD31365 (expressing a xylose isomerase of SEQ ID NO:78) are presented in FIG. 7A and FIG. 7B, respectively.
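The volume of SEED 2 culture to harvest can be estimated from its optical density once an OD600-to-cell-dry-weight correlation is available. The following is a minimal sketch of that arithmetic. The correlation factor and the example OD600 are hypothetical values chosen for illustration, and the final volume of 267.5 ml assumes that the 17.5 ml of resuspended cells is added to about 250 ml of medium.

```python
def seed_volume_ml(target_cdw_g_per_l, final_volume_ml, seed_od600, cdw_per_od_g_per_l):
    """Volume of seed culture (ml) whose cells give the target cell dry weight
    in the fermenter, assuming CDW = OD600 * cdw_per_od_g_per_l."""
    seed_cdw_g_per_l = seed_od600 * cdw_per_od_g_per_l
    return target_cdw_g_per_l * final_volume_ml / seed_cdw_g_per_l


# Target of about 1.5 g/L from the text; the other inputs are assumptions.
volume = seed_volume_ml(
    target_cdw_g_per_l=1.5,
    final_volume_ml=250.0 + 17.5,   # medium plus resuspended cells (assumed)
    seed_od600=40.0,                # hypothetical SEED 2 optical density
    cdw_per_od_g_per_l=0.5,         # hypothetical correlation factor
)
print(f"harvest about {volume:.1f} ml of SEED 2")
```

The correlation factor depends on the strain and the spectrophotometer, so it would need to be determined experimentally before using a calculation of this kind.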


Serum Bottle Fermentation:


Fermentation performances of BD34373 (SEQ ID NO:54) and BD34378 (SEQ ID NO:58) were evaluated using the serum bottle fermentation system. Individual 125 mL Anaerobic Media bottles (VWR Inc., PA, USA) closed with new Wheaton Thin-Flng Lyp Stoppers (VWR Inc., PA, USA) allowed close control over agitation, pH, and temperature, ensuring consistency of the environment during fermentation. Serum bottle fermentations were used to test the performance of the yeast strains expressing XI genes on clean sugar media (see Table 21, below), supplemented with nutrients (Table 19, above) and vitamins (Table 20, above).









TABLE 21
Clean sugar

Component | Conc.
Glucose | 80 g/L
Xylose | 80 g/L
Arabinose | 5.0 g/L
Acetic acid | 8.0 g/L










Yeast cells were inoculated into about 200 ml of YP (Yeast extract, Peptone) medium containing about 0.5% glucose and about 3% xylose in a 500 ml Bellco baffled flask (Bellco, Inc.). Strains were cultivated at about 30° C. with about 200 rpm agitation for at least 18 hours, until full saturation. Optical density was assessed by measuring light absorbance at a wavelength of 600 nm. The targeted inoculation level was about 1.5 g/L cell dry weight. The appropriate volume of SEED culture was harvested by centrifugation (about 3500 rpm for 10 min) to pellet the cells, which were resuspended in about 20 ml of media. The resuspended cell solution was used to inoculate a 150 ml serum bottle containing about 90 ml of fermentation media. An autoclaved stopper was placed into each serum bottle, and the serum bottles were then clamped with aluminium seals (Bellco Inc., USA) using a seal crimper (Bellco Inc., USA). The aluminium seal cap was peeled off and a needle (Fisher, USA) was inserted. Fermentation was performed at about 35° C. and pH 5.5 with about 200 rpm agitation. The duration of fermentation was about 44 hours. Fermentation performances for yeast strains BD34373 (SEQ ID NO:54) and BD34378 (SEQ ID NO:58) are presented in FIG. 7C and FIG. 7D, respectively.


5.14 Example 13: Comparative Activity of XI's Encoded by Codon Optimized vs. Non-Optimized Open Reading Frames

The coding sequences for the XIs of SEQ ID NOs:38, 78 and 96 were subjected to codon optimization using two approaches: the Boles codon optimization method and the DNA 2.0 Gene Designer software. The codon optimized sequences are set forth in Table 22 below:











TABLE 22






SEQ



Sequence
ID



Description
NO:
Sequence







Boles codon optimized
244
ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA


DNA encoding XI of

GGTAAGGACTCTAAGAACGTTATGGCTTTCCACTATTATGAACCA


SEQ ID NO: 78

GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTC




GCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAA




TTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGGTGAC




GCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATT




ATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGAC




TTGGTTGAAGAAGGTGACACCATTGAAGAATATGAAGCTAGAATG




AAGGCTATTACCGACTATGCTCAAGAAAAGATGAAGCAATTCCCA




AACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTAACAAG




AGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTT




GCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATT




AAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGT




TATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACAC




ATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAG




GGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCA




TCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTC




TTGAAGGCTCACAACTTGGACAAGGACTTCAAGGTTAACATTGAA




GTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTG




GCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAAC




AGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATT




GACAACTATGAATTGACCCAAGCTATGTTGGAAATTATTAGAAAC




GGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGA




AGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCT




GGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGACATT




TTGGAAAACTCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCT




TCTTTCGACCAAGGTGTTGGTAAGGACTTCGAAGACGGTAAGTTG




ACCTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGGTGAACCA




AAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT




TTGTATGCTAAGTAA





Boles codon optimized DNA encoding XI of SEQ ID NO: 96 (SEQ ID NO: 245):

ATGGCTAAGGAATATTTCCCATTCATTGGTAAGGTTCCATTCGAA

GGTACTGAATCTAAGAACGTTATGGCTTTCCACTATTATGAACCA

GAAAAGGTTGTTATGGGTAAGAAGATGAAGGACTGGTTGAAGTTC




GCTATGGCTTGGTGGCACACCTTGGGTGGTGCTTCTGCTGACCAA




TTCGGTGGTCAAACCAGATCTTATGAATGGGACAAGGCTGCTGAC




GCTGTTCAAAGAGCTAAGGACAAGATGGACGCTGGTTTCGAAATT




ATGGACAAGTTGGGTATTGAATATTTCTGTTTCCACGACGTTGAC




TTGGTTGAAGAAGGTGAAACCGTTGCTGAATATGAAGCTAGAATG




AAGGTTATTACCGACTATGCTTTGGAAAAGATGCAACAATTCCCA




AACATTAAGTTGTTGTGGGGTACTGCTAACGTTTTCGGTCACAAG




AGATATGCTAACGGTGCTTCTACCAACCCAGACTTCGACGTTGTT




GCTAGAGCTATTGTTCAAATTAAGAACGCTATTGATGCTACCATT




AAGTTGGGTGGTACTAACTATGTTTTCTGGGGTGGTAGAGAAGGT




TATATGTCTTTGTTGAACACCGACCAAAAGAGAGAAAAGGAACAC




ATGGCTACCATGTTGACCATGGCTAGAGACTATGCTAGAGCTAAG




GGTTTCAAGGGTACTTTCTTGATTGAACCAAAGCCAATGGAACCA




TCTAAGCACCAATATGACGTTGACACCGAAACCGTTATTGGTTTC




TTGAGGGCTCACGGTTTGGACAAGGACTTCAAGGTTAACATTGAA




GTTAACCACGCTACCTTGGCTGGTCACACCTTCGAACACGAATTG




GCTTGTGCTGTTGACGCTGGTATGTTGGGTTCTATTGACGCTAAC




AGAGGTGACGCTCAAAACGGTTGGGACACCGACCAATTCCCAATT




GACAACTATGAATTGACCCAAGCTATGATGGAAATTATTAGAAAC




GGTGGTTTGGGTAACGGTGGAACCAACTTCGACGCTAAGATTAGA




AGAAACTCTACCGACTTGGAAGACTTGTTCATTGCTCACATTTCT




GGTATGGACGCTATGGCTAGAGCTTTGATGAACGCTGCTGCTATT




TTGGAAGAATCTGAATTGCCAGCTATGAAGAAGGCTAGATATGCT




TCTTTCGACGAAGGTATTGGTAAGGACTTCGAAGACGGTAAGTTG




TCTTTGGAACAAGTTTATGAATATGGTAAGAAGGTTGAAGAACCA




AAGCAAACCTCTGGTAAGCAAGAAAAGTATGAAACCATTGTTGCT




TTGTATGCTAAGTAA





Boles codon optimized DNA encoding XI of SEQ ID NO: 38 (SEQ ID NO: 246):

ATGAAGGAAATTTTCCCAAACATTCCAGAAATTAAGTTCGAAGGT

AAGGACTCTAAGAACCCATTCGCTTTCCACTATTATAACCCAGAC

CAAATTATTTTGGGTAAGCCAATGAAGGAACACTTGCCATTCGCT




ATGGCTTGGTGGCACAACTTGGGTGCTACCGGTGTTGACATGTTC




GGTGCTGGTCCAGCTGACAAGTCTTTCGGTGCTAAGGTTGGTACT




ATGGAACACGCTAAGGCTAAGGTTGACGCTGGTTTCGAATTCATG




AAGAAGTTGGGTATTAGATATTTCTGTTTCCACGACGTTGACTTG




GTTCCAGAATGTGCTGACATTAAGGACACCAACAAGGAATTGGAC




GAAATTTCTGACTATATTTTGGAAAAGATGAAGGGTACTGACATT




AAGTGTTTGTGGGGTACTGCTAACATGTTCTCTAACCCAAGATTC




TGTAACGGTGCTGGTTCTACCAACTCTGCTGACGTTTTCGCTTTC




GCTGCTGCTCAAGTTAAGAAGGCTTTGGACATTACCGTTAAGTTG




GGTGGTAGAGGTTATGTTTTCTGGGGTGGTAGAGAAGGTTATGAA




ACCTTGTTGAACACCGACGTTAAGTTCGAACAAGAAAACATTGCT




AGATTGATGAAGATGGCTGTTGAATATGGTAGATCTATTGGTTTC




AAGGGTGACTTCTATATTGAACCAAAGCCAAAGGAACCAATGAAG




CACCAATATGACTTCGACGCTGCTACCGCTATTGGTTTCTTGAGG




GCTCACGGTTTGGACAAGGACTTCAAGTTGAACATTGAAGCTAAC




CACGCTACCTTGGCTGGTCACACCTTCCAACACGACTTGAGAATT




TCTGCTATTAACGGTATGTTGGGTTCTATTGACGCTAACCAAGGT




GACATGTTGTTGGGTTGGGACACCGACGAATTCCCATTCGACGTT




TATTCTGCTACCCAATGTATGTATGAAGTTTTGAAGAACGGTGGT




TTGACCGGTGGTTTCAACTTCGACTCTAAGACCAGAAGACCATCT




TATACCATGGAAGACATGTTCTTGGCTTATATTTTGGGTATGGAC




ACCTTCGCTTTGGGTTTGATTAAGGCTGCTCAAATTATTGAAGAC




GGTAGAATTGACCAATTCATTGAAAAGAAGTATTCTTCTTTCAGA




GAAACCGAAATTGGTCAAAAGATTTTGAACAACAAGACCTCTTTG




AAGGAATTGTCTGACTATGCTTGTAAGATGGGTGCTCCAGAATTG




CCAGGTTCTGGTAGACAAGAAATGTTGGAAGCTATTGTTAACGAC




GTTTTGTTCGGTAAGTAA





DNA 2.0 codon optimized DNA encoding XI of SEQ ID NO: 78 (SEQ ID NO: 247):

ATGGCTAAGGAATACTTTCCATTCACCGGAAAGATACCATTTGAA

GGTAAAGATTCTAAAAACGTAATGGCTTTTCATTATTACGAACCA

GAAAAAGTTGTTATGGGCAAAAAGATGAAAGATTGGTTGAAATTT

GCGATGGCTTGGTGGCATACACTCGGGGGAGCTTCCGCTGATCAA




TTTGGCGGACAAACCAGATCATACGAATGGGATAAAGCAGGCGAT




GCCGTGCAGAGAGCAAAGGATAAAATGGATGCTGGTTTCGAAATT




ATGGATAAGCTAGGTATCGAATACTTCTGCTTCCATGACGTCGAT




TTGGTTGAAGAGGGCGATACTATCGAGGAATACGAGGCGAGAATG




AAGGCTATAACAGACTACGCCCAGGAGAAAATGAAACAATTTCCT




AACATCAAATTACTCTGGGGTACTGCCAATGTGTTTGGTAACAAA




AGATACGCAAACGGGGCTTCAACTAATCCTGACTTCGATGTTGTT




GCAAGAGCCATTGTTCAAATCAAAAACGCGATAGACGCTACTATT




AAACTAGGTGGCACGAATTACGTCTTTTGGGGTGGAAGGGAAGGT




TACATGTCTCTGCTTAATACAGATCAGAAGAGAGAGAAGGAACAC




ATGGCAACAATGCTCACTATGGCCCGTGACTACGCAAGAGCAAAA




GGTTTTAAGGGCACTTTCCTTATCGAACCAAAGCCTATGGAACCA




TCAAAACACCAATATGATGTTGACACAGAAACTGTGATCGGCTTT




TTGAAAGCTCATAACTTGGACAAGGATTTCAAAGTAAACATTGAA




GTTAATCATGCTACACTAGCAGGACACACATTTGAACACGAACTG




GCCTGTGCGGTAGATGCAGGGATGCTGGGTTCTATCGACGCTAAT




AGAGGGGATGCTCAAAATGGTTGGGATACCGATCAATTTCCAATC




GACAATTACGAATTAACACAAGCTATGTTGGAGATTATTAGAAAT




GGAGGTTTGGGTAATGGGGGTACAAACTTCGATGCTAAGATTCGT




CGAAATTCCACAGACTTAGAAGATTTGTTCATTGCGCATATATCT




GGTATGGATGCTATGGCCAGAGCATTAATGAATGCCGCTGACATC




TTAGAAAACAGTGAACTTCCAGCAATGAAAAAGGCCAGATATGCC




TCTTTCGATCAAGGTGTAGGAAAAGATTTTGAGGACGGCAAGTTG




ACTTTAGAACAAGTCTATGAATACGGTAAAAAGGTCGGCGAACCT




AAGCAAACCAGCGGAAAGCAAGAGAAATACGAGACTATCGTGGCT




CTTTATGCAAAATAA





DNA 2.0 codon optimized DNA encoding XI of SEQ ID NO: 96 (SEQ ID NO: 248):

ATGGCCAAGGAGTACTTCCCTTTTATCGGCAAGGTCCCATTTGAA

GGGACAGAATCCAAAAACGTCATGGCTTTTCACTACTATGAACCT

GAGAAGGTAGTTATGGGTAAAAAGATGAAAGATTGGTTGAAGTTT

GCAATGGCATGGTGGCATACCTTGGGTGGGGCCTCTGCTGATCAA




TTTGGAGGACAAACTAGATCATACGAATGGGATAAAGCAGCTGAT




GCCGTTCAAAGAGCCAAAGATAAAATGGATGCCGGGTTCGAAATC




ATGGACAAATTGGGTATCGAATATTTCTGCTTCCATGATGTAGAC




CTTGTTGAGGAGGGTGAAACCGTCGCTGAATATGAGGCGAGAATG




AAGGTTATTACGGATTACGCACTAGAAAAGATGCAGCAGTTTCCA




AACATAAAACTATTGTGGGGTACTGCTAATGTTTTCGGACATAAA




CGTTACGCTAACGGAGCTTCCACTAATCCAGACTTTGATGTTGTC




GCGAGAGCTATCGTTCAAATCAAAAATGCAATCGATGCTACAATT




AAGTTAGGAGGGACAAATTACGTGTTCTGGGGTGGTAGAGAAGGT




TACATGAGCCTGCTTAATACAGATCAAAAGAGAGAAAAGGAGCAC




ATGGCAACAATGCTAACAATGGCTAGAGATTATGCCCGAGCTAAG




GGCTTCAAAGGCACTTTTCTGATAGAACCTAAACCAATGGAACCA




TCTAAACACCAATACGATGTAGACACCGAAACTGTAATAGGCTTC




CTTCGTGCACATGGTTTGGATAAAGATTTTAAGGTGAACATTGAA




GTGAATCATGCTACTTTAGCCGGTCACACTTTTGAACATGAATTA




GCATGTGCTGTTGATGCGGGAATGTTGGGTTCTATCGATGCCAAC




AGAGGCGACGCCCAAAATGGTTGGGACACAGACCAGTTTCCTATT




GACAATTACGAACTCACCCAAGCTATGATGGAAATTATCAGGAAT




GGGGGACTGGGAAATGGTGGTACGAACTTTGATGCGAAGATAAGG




AGAAACTCTACTGACTTAGAAGATTTGTTTATAGCACATATTTCA




GGTATGGACGCTATGGCAAGAGCTTTAATGAATGCCGCAGCAATC




TTGGAGGAAAGTGAACTCCCAGCTATGAAAAAGGCAAGATACGCA




AGTTTTGATGAGGGTATTGGCAAAGACTTCGAAGATGGTAAACTA




TCTTTAGAACAAGTGTACGAGTATGGCAAAAAGGTAGAGGAACCA




AAACAAACATCAGGCAAACAAGAGAAATATGAAACAATTGTCGCT




CTTTACGCGAAGTAA





DNA 2.0 codon optimized DNA encoding XI of SEQ ID NO: 38 (SEQ ID NO: 249):

ATGAAGGAAATCTTCCCTAACATCCCAGAGATCAAATTCGAAGGC

AAAGACTCTAAAAATCCATTTGCCTTCCACTATTACAACCCAGAC

CAGATCATTTTAGGTAAACCAATGAAGGAGCACTTGCCATTTGCT

ATGGCTTGGTGGCATAATCTAGGCGCCACTGGTGTTGATATGTTT




GGTGCAGGCCCTGCGGACAAATCTTTCGGAGCTAAAGTAGGAACT




ATGGAACATGCAAAAGCGAAAGTTGATGCTGGGTTTGAGTTCATG




AAGAAATTAGGAATCAGATATTTCTGCTTTCATGATGTTGACTTG




GTTCCTGAGTGTGCTGACATTAAGGATACAAACAAGGAACTTGAT




GAAATCTCTGACTACATTTTGGAAAAGATGAAAGGTACTGACATA




AAGTGTTTGTGGGGCACGGCTAATATGTTTTCCAATCCAAGATTT




TGTAACGGCGCTGGCTCAACTAATTCAGCAGATGTCTTTGCATTC




GCTGCTGCACAAGTCAAGAAAGCACTTGACATTACAGTCAAACTG




GGTGGGAGAGGATACGTTTTCTGGGGTGGTAGAGAAGGCTACGAA




ACATTGTTGAATACAGACGTTAAGTTTGAACAAGAGAATATTGCA




AGGTTAATGAAAATGGCAGTGGAATATGGGCGTTCTATAGGTTTT




AAAGGTGATTTCTACATTGAGCCAAAACCAAAGGAACCTATGAAA




CATCAATACGATTTCGATGCCGCAACAGCAATAGGTTTCCTTAGA




GCCCACGGGTTGGATAAAGACTTTAAGCTCAATATCGAAGCCAAC




CACGCAACACTTGCAGGCCATACATTTCAACATGATCTTAGAATA




TCTGCTATTAACGGAATGCTCGGCTCAATTGATGCCAATCAGGGT




GATATGCTACTAGGTTGGGATACTGATGAGTTTCCATTTGATGTA




TACTCCGCTACACAATGCATGTATGAGGTGTTGAAAAATGGTGGT




CTGACCGGTGGCTTCAACTTCGATAGTAAGACCAGACGTCCTTCA




TACACTATGGAAGATATGTTTCTGGCGTATATCTTAGGTATGGAC




ACATTTGCTTTAGGTCTAATCAAAGCCGCTCAAATCATTGAAGAT




GGCAGAATTGACCAGTTTATAGAAAAGAAATACTCCAGTTTTCGA




GAAACCGAAATCGGACAAAAGATTCTCAATAACAAAACTTCATTG




AAGGAATTATCTGATTACGCCTGTAAGATGGGTGCGCCAGAATTA




CCTGGAAGCGGTAGACAAGAGATGCTTGAAGCTATCGTGAATGAT




GTATTGTTTGGAAAATAA
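Codon optimization changes the nucleotide sequence while leaving the encoded polypeptide unchanged, and this can be checked by translating two versions of the same open reading frame. The following is an illustrative sketch using the standard genetic code; the helper names are hypothetical, and only the first 45 nucleotides of SEQ ID NO: 244 and SEQ ID NO: 247 (both encoding the XI of SEQ ID NO: 78) are used as input.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(dna):
    """Split a DNA string into codons; the length must be a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate DNA to a one-letter protein string ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[codon] for codon in codons(dna))


def differing_codons(dna_a, dna_b):
    """Count codon positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(codons(dna_a), codons(dna_b)) if x != y)


# First 45 nucleotides of SEQ ID NO: 244 (Boles) and SEQ ID NO: 247 (DNA 2.0).
BOLES_244_START = "ATGGCTAAGGAATATTTCCCATTCACCGGTAAGATTCCATTCGAA"
DNA20_247_START = "ATGGCTAAGGAATACTTTCCATTCACCGGAAAGATACCATTTGAA"

print(translate(BOLES_244_START))
print(translate(DNA20_247_START))
print(differing_codons(BOLES_244_START, DNA20_247_START))
```

Both fragments translate to the same 15 residues even though 5 of the 15 codons differ, which is the expected behaviour for two codon optimizations of the same coding sequence.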









The codon optimized DNA sequences were synthesized and incorporated into expression cassettes substantially as described in Example 12. Yeast strains were generated that included single copies of individual XI open reading frames integrated into the yeast YER131.5 locus. Strains confirmed to contain the XI expression cassettes were inoculated into about 3 ml of modified YP Media (YP+0.1% Glucose+3.0% Xylose) and incubated overnight at about 30° C. and about 220 rpm. These overnight cultures were subcultured into about 25 ml of the same media to about OD600=0.2. Samples were incubated overnight at about 30° C. and about 220 rpm. Cultures were harvested when OD600 was between about 3 and 4. Pellets were collected by centrifugation for about 5 minutes at about 4000 rpm. The supernatants were discarded and pellets washed with about 25 ml of distilled-deionized water and centrifuged again using the same conditions. Supernatants were discarded and the pellet frozen at about −20° C. until lysis and characterization.
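The subculturing step to a starting OD600 of about 0.2 amounts to a dilution calculation. A minimal sketch follows, assuming that OD600 scales linearly with cell density over the range used; the overnight OD600 value in the call is a hypothetical illustration value.

```python
def inoculum_volume_ml(overnight_od600, target_od600, final_volume_ml):
    """Volume of overnight culture (ml) needed to reach target_od600 in final_volume_ml."""
    if overnight_od600 <= 0:
        raise ValueError("overnight OD600 must be positive")
    if target_od600 > overnight_od600:
        raise ValueError("target OD600 exceeds the overnight culture OD600")
    # C1 * V1 = C2 * V2, solved for V1.
    return target_od600 * final_volume_ml / overnight_od600


# Hypothetical overnight culture at OD600 = 5.0, diluted into 25 ml at OD600 = 0.2.
print(inoculum_volume_ml(5.0, 0.2, 25.0))
```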


Cell pellets were thawed and about 200 mg of each pellet sample was weighed out into 2 ml microcentrifuge tubes. About 50 μl of Complete®, EDTA-free Protease Inhibitor cocktail (Roche Part#11873 580 001) at 5 times the concentration stated in the manufacturer's protocol was added to each sample, followed by about 0.5 ml of Y-PER Plus® Dialyzable Yeast Protein Extraction Reagent (Thermo Scientific Part#78999) (YP+). Samples were incubated at about 25° C. for about 4 hours on a rotating mixer. Sample supernatants were collected for characterization after centrifugation at about 10,000×g for about 10 minutes.


Total protein concentrations of the XI sample extracts prepared above were determined using Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad, cat#500-0006, Hercules Calif.), which is a modified version of the Bradford method (Bradford). In this assay, optical density readings were taken on a spectrophotometer set to 595 nm, and the standard curve was plotted as a linear regression line.
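The linear standard curve can be illustrated with an ordinary least-squares fit of absorbance at 595 nm against the concentration of a protein standard, followed by interpolation of the unknown. This is a sketch only: the standard concentrations and absorbance readings below are hypothetical placeholder values, and the function names are not taken from any assay software.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def protein_mg_per_ml(a595, slope, intercept, dilution_factor=1.0):
    """Interpolate a protein concentration from an A595 reading on the standard curve."""
    return (a595 - intercept) / slope * dilution_factor


# Hypothetical standards (mg/ml) and their A595 readings.
standards = [0.0, 0.2, 0.4, 0.6, 0.8]
readings = [0.05, 0.25, 0.45, 0.65, 0.85]

slope, intercept = fit_line(standards, readings)
print(protein_mg_per_ml(0.55, slope, intercept))
```

Readings outside the range of the standards should be diluted and remeasured instead of extrapolated, since the dye response is only approximately linear over a limited range.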


XI activity was determined using assay conditions at pH 7.5 as described in Section 5.1.2. The specific activities of the codon optimized XIs are shown in Table 23.












TABLE 23

Protein SEQ ID NO. | Nucleic Acid SEQ ID NO: | Codon optimization method | SA, pH 7.5 (U/mg)
--- | --- | --- | ---
78 | 77 | Native | 0.43
78 | 247 | DNA 2.0 codon optimization | 0.14
78 | 244 | Boles codon optimization | 0.53
96 | 95 | Native | 0.42
96 | 248 | DNA 2.0 codon optimization | 0.14
96 | 245 | Boles codon optimization | 0.40
38 | 37 | Native | 0.25
38 | 249 | DNA 2.0 codon optimization | 0.27
38 | 246 | Boles codon optimization | 0.40









Because these specific activity data are determined on the basis of the total cellular protein mass, any variations in specific activity for any given XI are due to expression levels. These data demonstrate that the Boles codon optimization approach improves the expressibility of bacterial XIs in S. cerevisiae.
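The comparison in Table 23 can be expressed as the ratio of each codon optimized specific activity to that of the corresponding native open reading frame. The sketch below uses the rounded specific activities reported in Table 23; the dictionary layout and function name are illustrative choices only.

```python
# Specific activities (U/mg, pH 7.5) from Table 23, keyed by protein SEQ ID NO.
TABLE_23 = {
    78: {"Native": 0.43, "DNA 2.0": 0.14, "Boles": 0.53},
    96: {"Native": 0.42, "DNA 2.0": 0.14, "Boles": 0.40},
    38: {"Native": 0.25, "DNA 2.0": 0.27, "Boles": 0.40},
}


def fold_change_vs_native(table):
    """Return, per protein, the ratio of each optimized SA to the native SA."""
    return {
        protein_id: {
            method: sa / rows["Native"]
            for method, sa in rows.items()
            if method != "Native"
        }
        for protein_id, rows in table.items()
    }


fold = fold_change_vs_native(TABLE_23)
for protein_id, ratios in fold.items():
    for method, ratio in ratios.items():
        print(f"SEQ ID NO: {protein_id}, {method}: {ratio:.2f}x native")
```

On these rounded figures the Boles ratio is about 1.2 for SEQ ID NO: 78, about 1.6 for SEQ ID NO: 38 and about 0.95 for SEQ ID NO: 96, while the DNA 2.0 ratio is about 0.33 for SEQ ID NOs: 78 and 96 and about 1.1 for SEQ ID NO: 38.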


5.15 Example 14: Comparative Activity of XIs of the Disclosure vs. Orpinomyces sp. XI

The specific activities of exemplary XIs of the disclosure were compared to the specific activity of a known XI, Orpinomyces sp. XI assigned Genbank Accession No. 169733248. The XIs were incorporated into expression cassettes substantially as described in Example 12. Yeast strains were generated that included single copies of individual XI open reading frames integrated into the yeast YER131.5 locus. Activity of the individual clones was measured at pH 7.5 using a similar approach to that used in Example 12, except that the total protein concentrations were based on optical density readings at 450 nm and 595 nm, with the standard curve plotted as a parametric fit. Results are shown in Table 24, below.













TABLE 24

Protein SEQ ID NO. | Nucleic Acid SEQ ID NO: | Codon optimization method | SA, pH 7.5 (U/mg) | % activity as compared to Orpinomyces XI
--- | --- | --- | --- | ---
NA | NA | Host negative control (no recombinant XI) | 0.005 | 7%
Genbank Accession No. 169733248 | | Orpinomyces sp. XI (Native) | 0.071 | 100%
96 | 95 | Native | 0.137 | 193%
54 | 238 | Boles codon optimization | 0.193 | 272%
58 | 239 | Boles codon optimization | 0.205 | 288%
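The percentage column of Table 24 is the specific activity of each sample divided by that of the Orpinomyces sp. XI reference. A short sketch follows, using the rounded specific activities from Table 24; the labels and function name are illustrative only.

```python
# Reference specific activity (U/mg, pH 7.5) of the Orpinomyces sp. XI from Table 24.
ORPINOMYCES_SA = 0.071

# Rounded specific activities (U/mg, pH 7.5) of the other Table 24 entries.
TABLE_24_SA = {
    "Host negative control": 0.005,
    "SEQ ID NO: 96 (native)": 0.137,
    "SEQ ID NO: 54 (Boles)": 0.193,
    "SEQ ID NO: 58 (Boles)": 0.205,
}


def percent_of_reference(sa, reference_sa):
    """Express a specific activity as a percentage of the reference specific activity."""
    if reference_sa <= 0:
        raise ValueError("reference specific activity must be positive")
    return 100.0 * sa / reference_sa


for label, sa in TABLE_24_SA.items():
    print(f"{label}: {percent_of_reference(sa, ORPINOMYCES_SA):.1f}%")
```

Because the inputs here are the rounded values printed in the table, a recomputed percentage can differ from the tabulated figure by about one percentage point, presumably because the tabulated percentages were derived from unrounded measurements.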









While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention(s).

Claims
  • 1. An isolated nucleic acid sequence comprising a nucleotide sequence having at least 90%, at least 93%, at least 95%, at least 96%, at least 98%, or at least 99% sequence identity, or having 100% sequence identity, to the nucleotide sequence of SEQ ID NO: 245 or to a portion thereof encoding a xylose isomerase catalytic or dimerization domain, wherein the nucleic acid encodes a polypeptide with at least one substitution relative to the polypeptide having the sequence of SEQ ID NO: 96 or at least one heterologous amino acid flanking the N-terminal or the C-terminal.
  • 2. The nucleic acid sequence of claim 1, which is codon optimized for expression in a eukaryotic cell.
  • 3. The nucleic acid sequence of claim 1, which is codon optimized for expression in yeast.
  • 4. A vector comprising the nucleic acid sequence of claim 1.
  • 5. The vector of claim 4, which further comprises an origin of replication.
  • 6. The vector of claim 4, which further comprises a promoter sequence operably linked to said nucleic acid sequence.
  • 7. The vector of claim 6, wherein the promoter sequence is operable in yeast.
  • 8. The vector of claim 6, wherein the promoter sequence is operable in filamentous fungi.
  • 9. A recombinant cell engineered to express a polypeptide encoded by the nucleic acid sequence of claim 1.
  • 10. The recombinant cell of claim 9, wherein the cell is a eukaryotic cell.
  • 11. The recombinant cell of claim 9, wherein the cell is a yeast cell.
  • 12. The recombinant cell of claim 11, wherein the yeast cell is a yeast cell of the genus Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Issatchenkia or Yarrowia.
  • 13. The recombinant cell of claim 11, wherein the yeast cell is of the species S. cerevisiae, S. bulderi, S. barnettii, S. exiguus, S. uvarum, S. diastaticus, K. lactis, K. marxianus or K. fragilis, or Issatchenkia orientalis.
  • 14. The recombinant cell of claim 13, wherein the cell is a S. cerevisiae cell.
  • 15. The recombinant cell of claim 14, comprising one or more genetic modifications resulting in at least one, any two, any three, any four or all of the following phenotypes: (a) an increase in transport of xylose into the cell; (b) an increase in xylulose kinase activity; (c) an increase in aerobic growth rate on xylose; (d) an increase in flux through the pentose phosphate pathway into glycolysis; (e) a decrease in aldose reductase activity; (f) a decrease in sensitivity to catabolite repression; (g) an increase in tolerance to ethanol, intermediates, osmolality or organic acids; and (h) a reduced production of byproducts.
  • 16. The recombinant cell of claim 15, wherein one or more genetic modifications result in increased expression levels of one or more of a hexose or pentose transporter, a xylulose kinase, an enzyme from the pentose phosphate pathway, a glycolytic enzyme and an ethanologenic enzyme.
  • 17. A host cell transformed with the vector of claim 4.
  • 18. The host cell of claim 17 which is a prokaryotic cell.
  • 19. The host cell of claim 18 which is a bacterial cell.
  • 20. The host cell of claim 17 which is a eukaryotic cell.
PCT Information
Filing Document Filing Date Country Kind
PCT/US2013/051718 7/23/2013 WO 00
Publishing Document Publishing Date Country Kind
WO2014/018552 1/30/2014 WO A
US Referenced Citations (2)
Number Name Date Kind
20120142067 Desfougeres et al. Jun 2012 A1
20120225452 Bao et al. Sep 2012 A1
Foreign Referenced Citations (8)
Number Date Country
102 174 549 Sep 2011 CN
10 2008 031350 Jan 2010 DE
WO 2003062430 Jul 2003 WO
WO2008000632 Jan 2008 WO
WO 2009109634 Sep 2009 WO
WO2010039692 Apr 2010 WO
WO 2010074577 Jul 2010 WO
WO 2011090731 Jul 2011 WO
Non-Patent Literature Citations (22)
Entry
Chica et al. Curr Opin Biotechnol. Aug. 2005;16(4):378-84.
Sen et al. Appl Biochem Biotechnol. Dec. 2007;143(3):212-23.
Accession AJ249909. Apr. 15, 2005.
Accession D3I1K1. Mar. 23, 2010.
Database EMBL HD069893, Database accession No. HD069893 sequence. Sequence 18 from Patent WO2010074577, XP002714607. Aug. 3, 2010.
Database EMBL HD069894, Database accession No. HD069894 sequence. Sequence 19 from Patent WO2010074577, XP002714608. Aug. 3, 2010.
Database EMBL HD069895, Database accession No. HD069895 sequence. Sequence 20 from Patent WO2010074577, XP002714609. Aug. 3, 2010.
Database UniProt XP002714610, retrieved from EBI accession No. UNIPROT: D3I1K1. “RecName: Full=Xylose isomerase; EC=5.3.1.5” Mar. 23, 2010.
Database UniProt XP002714611, retrieved from EBI accession No. UNIPROT: F9D530. “RecName: Full=Xylose isomerase; EC=5.3.1.5” Oct. 19, 2011.
Ethanol production related xylose isomerase (XI), Seq ID 2, XP002714606, retrieved from EBI accession No. GSP:AZM80432. Database accession No. AZM80432 sequence.
Harhangi, H. R. et al: “Xylose metabolism in the anaerobic fungus Piromyces sp. Strain E2 follows the bacterial pathway,” Archives of Microbiology, Springer, DE, vol. 180, No. 2, Jun. 13, 2003.
Hector, Ronald E. et al: “Growth and fermentation of D-xylose by Saccharomyces cerevisiae expressing a novel D-xylose isomerase originating from the bacterium Prevotella ruminicola TC2-24,” Biotechnology for Biofuels, Biomed Central Ltd., GB, vol. 6, No. 1, May 30, 2013.
Kuyper, M. et al: “High-level functional expression of a fungal xylose isomerase: The key to efficient ethanolic fermentation of xylose by Saccharomyces cerevisiae?” FEMS Yeast Research, Wiley-Blackwell Publishing Ltd. GB, NL, vol. 4, No. 1, Oct. 1, 2003.
Accession AEL74969, Mar. 3, 2011.
Chica et al.: “Semi-rational approaches to engineering enzyme activity: combining the benefits of directed evolution and rational design”; Current Opinion in Biotechnology 2005, 16:378-384.
Sen et al.: “Developments in Directed Evolution for Improving Enzyme Functions”; Appl Biochem Biotechnol (2007) 143:212-223.
Brat, Dawid et al.: “Functional Expression of a Bacterial Xylose Isomerase in Saccharomyces cerevisiae”; Applied and Environmental Microbiology, American Society for Microbiology, US, vol. 75, No. 8, Apr. 1, 2009, pp. 2304-2311.
European Search Report dated Aug. 24, 2016 regarding EP 13744924.
Jeffries et al.: “Engineering yeasts for xylose metabolism”; Current Opinion in Biotechnology, vol. 17, No. 3, Jun. 1, 2006, pp. 320-326.
Madhavan, Anjali et al: “Xylose isomerase from polycentric fungus Orpinomyces: gene sequencing, cloning, and expression in Saccharomyces cerevisiae for bioconversion of xylose to ethanol”; Applied Microbiology and Biotechnology, vol. 82, No. 6, Dec. 3, 2008, pp. 1067-1078.
Van Maris, A. J. A. et al: “Development of Efficient Xylose Fermentation in Saccharomyces cerevisiae: Xylose Isomerase as a Key Component”; Advances in Biochemical Engineering, Biotechnology, vol. 108, Jan. 1, 2007, pp. 179-204.
XP002714606: “Ethanol production related xylose isomerase (XI), SEQ ID 2.”; Jan. 19, 2012, Database accession No. AZM80432; retrieved from EBI accession No. GSP:AZM80432.
Related Publications (1)
Number Date Country
20150203835 A1 Jul 2015 US
Provisional Applications (1)
Number Date Country
61675241 Jul 2012 US