VARIANT CBH I POLYPEPTIDES WITH REDUCED PRODUCT INHIBITION

Information

  • Patent Application
  • 20150329880
  • Publication Number
    20150329880
  • Date Filed
    August 03, 2015
  • Date Published
    November 19, 2015
Abstract
The present disclosure relates to variant CBH I polypeptides that have reduced product inhibition, and compositions, e.g., cellulase compositions, comprising variant CBH I polypeptides. The variant CBH I polypeptides and related compositions can be used in a variety of agricultural and industrial applications. The present disclosure further relates to nucleic acids encoding variant CBH I polypeptides and host cells that recombinantly express the variant CBH I polypeptides.
Description
BACKGROUND OF THE INVENTION

Cellulose is an unbranched polymer of glucose linked by β(1→4)-glycosidic bonds. Cellulose chains can interact with each other via hydrogen bonding to form a crystalline solid of high mechanical strength and chemical stability. The cellulose chains are depolymerized into glucose and short oligosaccharides before organisms, such as the fermenting microbes used in ethanol production, can use them as metabolic fuel. Cellulase enzymes catalyze the hydrolysis of the cellulose (hydrolysis of β-1,4-D-glucan linkages) in the biomass into products such as glucose, cellobiose, and other cellooligosaccharides. Cellulase is a generic term denoting a multienzyme mixture comprising exo-acting cellobiohydrolases (CBHs), endoglucanases (EGs) and β-glucosidases (BGs) that can be produced by a number of plants and microorganisms. Enzymes in the cellulase of Trichoderma reesei include CBH I (more generally, Cel7A), CBH2 (Cel6A), EG1 (Cel7B), EG2 (Cel5), EG3 (Cel12), EG4 (Cel61A), EG5 (Cel45A), EG6 (Cel74A), Cip1, Cip2, β-glucosidases (including, e.g., Cel3A), acetyl xylan esterase, β-mannanase, and swollenin.


Cellulase enzymes work synergistically to hydrolyze cellulose to glucose. CBH I and CBH II act on opposing ends of cellulose chains (Barr et al., 1996, Biochemistry 35:586-92), while the endoglucanases act at internal locations in the cellulose. The primary product of these enzymes is cellobiose, which is further hydrolyzed to glucose by one or more β-glucosidases.


The cellobiohydrolases are subject to inhibition by their direct product, cellobiose, which slows saccharification reactions as product accumulates. There is a need for new and improved cellobiohydrolases with improved productivity that maintain their reaction rates during the course of a saccharification reaction, for use in the conversion of cellulose into fermentable sugars and for related fields of cellulosic material processing such as pulp and paper, textiles and animal feeds.


SUMMARY OF THE INVENTION

The present disclosure relates to variant CBH I polypeptides. Most naturally occurring CBH I polypeptides have arginines at positions corresponding to R268 and R411 of T. reesei CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present disclosure include a substitution at either or both positions resulting in a reduction or decrease in product (e.g., cellobiose) inhibition. Such variants are sometimes referred to herein as “product tolerant.”


The variant CBH I polypeptides of the disclosure minimally contain at least a CBH I catalytic domain, comprising (a) a substitution at the amino acid position corresponding to R268 of T. reesei CBH I (“R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (“R411 substitution”); or (c) both an R268 substitution and an R411 substitution. The amino acid positions of exemplary CBH I polypeptides into which R268 and/or R411 substitutions can be introduced are shown in Table 1, and the amino acid positions corresponding to R268 and/or R411 in these exemplary CBH I polypeptides are shown in Table 2.


R268 and/or R411 substituents can include lysines and/or alanines. Accordingly, the present disclosure provides a variant CBH I polypeptide comprising a CBH I catalytic domain with one of the following amino acid substitutions or pairs of R268 and/or R411 substitutions: (a) R268K and R411K; (b) R268K and R411A; (c) R268A and R411K; (d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A; and (h) R411K. In some embodiments, however, the amino acid sequence of the variant CBH I polypeptide does not comprise or consist of SEQ ID NO:299, SEQ ID NO:300, SEQ ID NO:301, or SEQ ID NO:302.


The variant CBH I polypeptides of the disclosure typically include a CD comprising an amino acid sequence having at least 50% sequence identity to a CD of a reference CBH I exemplified in Table 1. The CD portions of the CBH I polypeptides exemplified in Table 1 are delineated in Table 3. The variant CBH I polypeptides can have a cellulose binding domain (“CBD”) sequence in addition to the catalytic domain (“CD”) sequence. The CBD can be N- or C-terminal to the CD, and the CBD and CD are optionally connected via a linker sequence.


The variant CBH I polypeptides can be mature polypeptides or they may further comprise a signal sequence.


Additional embodiments of the variant CBH I polypeptides are provided in Section 0.


The variant CBH I polypeptides of the disclosure typically exhibit reduced product inhibition by cellobiose. In certain embodiments, the IC50 of cellobiose towards a variant CBH I polypeptide of the disclosure is at least 1.2-fold, at least 1.5-fold, or at least 2-fold the IC50 of cellobiose towards a reference CBH I lacking the R268 substitution and/or R411 substitution present in the variant. Additional embodiments of the product inhibition characteristics of the variant CBH I polypeptides are provided in Section 0.


The variant CBH I polypeptides of the disclosure typically retain some cellobiohydrolase activity. In certain embodiments, a variant CBH I polypeptide retains at least 50% the CBH I activity of a reference CBH I lacking the R268 substitution and/or R411 substitution present in the variant. Additional embodiments of cellobiohydrolase activity of the variant CBH I polypeptides are provided in Section 0.


The present disclosure further provides compositions (including cellulase compositions, e.g., whole cellulase compositions, and fermentation broths) comprising variant CBH I polypeptides. Additional embodiments of compositions comprising variant CBH I polypeptides are provided in Section 0. The variant CBH I polypeptides and compositions comprising them can be used, inter alia, in processes for saccharifying biomass. Additional details of saccharification reactions, and additional applications of the variant CBH I polypeptides, are provided in Section 0.


The present disclosure further provides nucleic acids (e.g., vectors) comprising nucleotide sequences encoding variant CBH I polypeptides as described herein, and recombinant cells engineered to express the variant CBH I polypeptides. The recombinant cell can be a prokaryotic (e.g., bacterial) or eukaryotic (e.g., yeast or filamentous fungal) cell. Further provided are methods of producing and optionally recovering the variant CBH I polypeptides. Additional embodiments of the recombinant expression system suitable for expression and production of the variant CBH I polypeptides are provided in Section 0.





BRIEF DESCRIPTION OF THE DRAWINGS AND TABLES


FIGS. 1A-1B: Cellobiose dose-response curves using a 4-MUL assay for a wild-type CBH I (BD29555; FIG. 1A) and an R268K/R411K variant CBH I (BD29555 with the substitutions R273K/R422K; FIG. 1B).



FIGS. 2A-2B: The effect of cellobiose accumulation on the activity of wild-type CBH I and a R268K/R411K variant CBH I, based on percent conversion of glucan after 72 hours in the bagasse assay. FIG. 2A shows relative activity in the presence (+) and absence (−) of β-glucosidase (BG), where relative activity is normalized to wild type activity with BG (WT+=1). FIG. 2B shows tolerance to cellobiose as a function of the ratio of activity in the absence vs. presence of β-glucosidase (activity ratio=Activity −BG/Activity +BG).



FIG. 3: Cellobiose dose-response curves using a PASC assay for an R268K/R411K variant CBH I polypeptide as compared to two wild-type CBH I polypeptides.



FIG. 4: The effect of cellobiose accumulation on the activity of a wild-type CBH I and an R268K/R411K variant CBH I, based on percent conversion of glucan after 72 hours in the bagasse assay in the presence (+) and absence (−) of β-glucosidase (BG). Activity is normalized to wild-type activity with BG (WT+=1).



FIG. 5: Characterization of cellobiose product tolerance of variant CBH I polypeptides, based on percent conversion of glucan after 72 hours in the absence and presence of β-glucosidase (BG) in the bagasse assay; tolerance is evaluated as a function of the ratio of activity in the absence vs. presence of β-glucosidase.





TABLE 1: Amino acid sequences of exemplary “reference” CBH I polypeptides that can be modified at positions corresponding to R268 and/or R411 in T. reesei CBH I (SEQ ID NO:2). The database accession numbers are indicated in the second column. Unless indicated otherwise, the accession numbers refer to the Genbank database. “#” indicates that the CBH I has no signal peptide; “&” indicates that the sequence is from the PDB database and represents the catalytic domain only without signal sequence; “*” indicates a nonpublic database. These amino acid sequences are mostly wild type, with the exception of some sequences from the PDB database which contain mutations to facilitate protein crystallization.


TABLE 2: Amino acid positions in the exemplary reference CBH I polypeptides that correspond to R268 and R411 in T. reesei CBH I. Database descriptors are as for Table 1.


TABLE 3: Approximate amino acid positions of CBH I polypeptide domains. Abbreviations used: SS is signal sequence; CD is catalytic domain; and CBD is cellulose binding domain. Database descriptors are as for Table 1.


TABLE 4: Table 4 shows a segment within the catalytic domain of each exemplary reference CBH I polypeptide containing the active site loop (shown in bold, underlined text) and the catalytic residues (glutamates in most CBH I polypeptides) (shown in bold, double underlined text). Database descriptors are as for Table 1.


TABLE 5: MUL and bagasse assay results for variants of BD29555. ND means not determined. ±% Activity (+/−cellobiose)=[(Activity with cellobiose)/(Activity without cellobiose)]*100. ¥ % Activity (−/+BG)=[(Activity without BG)/(Activity with BG)]*100.


TABLE 6: MUL and bagasse assay results for variants of T. reesei CBH I. ND means not determined. ±% Activity (+/−cellobiose)=[(Activity with cellobiose)/(Activity without cellobiose)]*100. ¥ % Activity (−/+BG)=[(Activity without BG)/(Activity with BG)]*100.


TABLE 7: Informal sequence listing. SEQ ID NO:1-149 correspond to the exemplary reference CBH I polypeptides. SEQ ID NO:299 corresponds to mature T. reesei CBH I (amino acids 26-529 of SEQ ID NO:2) with an R268A substitution. SEQ ID NO:300 corresponds to mature T. reesei CBH I (amino acids 26-529 of SEQ ID NO:2) with an R411A substitution. SEQ ID NO:301 corresponds to full length BD29555 with both an R268K substitution and an R411K substitution. SEQ ID NO:302 corresponds to mature BD29555 with both an R268K substitution and an R411K substitution.
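
The two percent-activity definitions given in the captions of Tables 5 and 6 can be written out as a short calculation. The following is a minimal sketch in Python; the function names and the numeric inputs in the usage note are hypothetical illustrations and are not data from the tables.

```python
def percent_activity(numerator: float, denominator: float) -> float:
    """Return (numerator / denominator) * 100, as in the table captions."""
    if denominator == 0:
        raise ValueError("denominator activity must be non-zero")
    return numerator / denominator * 100


def percent_activity_cellobiose(activity_with_cellobiose: float,
                                activity_without_cellobiose: float) -> float:
    """% Activity (+/- cellobiose): activity with cellobiose over activity without."""
    return percent_activity(activity_with_cellobiose, activity_without_cellobiose)


def percent_activity_bg(activity_without_bg: float,
                        activity_with_bg: float) -> float:
    """% Activity (-/+ BG): activity without beta-glucosidase over activity with it."""
    return percent_activity(activity_without_bg, activity_with_bg)
```

For example, with hypothetical activities of 30 (with cellobiose) and 60 (without cellobiose), `percent_activity_cellobiose(30.0, 60.0)` evaluates the first ratio. A higher percentage in either measure indicates greater tolerance to cellobiose.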


DETAILED DESCRIPTION OF THE INVENTION

The present disclosure relates to variant CBH I polypeptides. Most naturally occurring CBH I polypeptides have arginines at positions corresponding to R268 and R411 of T. reesei CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present disclosure include a substitution at either or both positions resulting in a reduction of product (e.g., cellobiose) inhibition. The following subsections describe in greater detail the variant CBH I polypeptides and exemplary methods of their production, exemplary cellulase compositions comprising them, and some industrial applications of the polypeptides and cellulase compositions.


Variant CBH I Polypeptides

The present disclosure provides variant CBH I polypeptides comprising at least one amino acid substitution that results in reduced product inhibition. “Variant” means a polypeptide which differs in sequence from a reference polypeptide by substitution of one or more amino acids at one or a number of different sites in the amino acid sequence. Exemplary reference CBH I polypeptides are shown in Table 1.


The variant CBH I polypeptides of the disclosure have (a) a substitution at the amino acid position corresponding to R268 of T. reesei CBH I (SEQ ID NO:2) (an “R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (an “R411 substitution”); or (c) both an R268 substitution and an R411 substitution, as compared to a reference CBH I polypeptide. It is noted that the R268 and R411 numbering is made by reference to the full length T. reesei CBH I, which includes a signal sequence that is generally absent from the mature enzyme. The corresponding positions in the mature T. reesei CBH I (see, e.g., SEQ ID NO:4) are R251 and R394, respectively.
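
The relationship between full-length and mature numbering is a fixed offset equal to the length of the signal sequence. The following is a minimal sketch in Python; the offset of 17 residues is an assumption inferred from the R411 and R394 pair given above, and the function and constant names are hypothetical.

```python
# Assumption: the T. reesei CBH I signal sequence is 17 residues long,
# inferred from the stated correspondence R411 (full length) = R394 (mature).
SIGNAL_SEQUENCE_LENGTH = 17


def full_length_to_mature(position: int,
                          signal_length: int = SIGNAL_SEQUENCE_LENGTH) -> int:
    """Convert a 1-based full-length position to mature-enzyme numbering."""
    if position <= signal_length:
        raise ValueError("position lies within the signal sequence")
    return position - signal_length


def mature_to_full_length(position: int,
                          signal_length: int = SIGNAL_SEQUENCE_LENGTH) -> int:
    """Convert a 1-based mature-enzyme position to full-length numbering."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return position + signal_length
```

Under this assumption, `full_length_to_mature(411)` gives 394 and `full_length_to_mature(268)` gives 251.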


Accordingly, the present disclosure provides variant CBH I polypeptides in which at least one of the amino acid positions corresponding to R268 and R411 of T. reesei CBH I, and optionally both the amino acid positions corresponding to R268 and R411 of T. reesei CBH I, is not an arginine.


The amino acid positions in the reference polypeptides of Table 1 that correspond to R268 and R411 in T. reesei CBH I are shown in Table 2. Amino acid positions in other CBH I polypeptides that correspond to R268 and R411 can be identified through alignment of their sequences with T. reesei CBH I using a sequence comparison algorithm. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, 1981, Adv. Appl. Math. 2:482-89; by the homology alignment algorithm of Needleman & Wunsch, 1970, J. Mol. Biol. 48:443-53; by the search for similarity method of Pearson & Lipman, 1988, Proc. Nat'l Acad. Sci. USA 85:2444-48, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection.
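
Once a pairwise alignment is available, finding the residue that corresponds to a given reference position amounts to walking the alignment columns. The following is a minimal sketch in Python, assuming the alignment has already been produced by any of the tools named above and is supplied as two gapped strings of equal length. The function name and the toy sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
from typing import Optional


def map_position(aligned_ref: str, aligned_query: str, ref_pos: int,
                 gap: str = "-") -> Optional[int]:
    """Return the 1-based query position aligned to the 1-based reference
    position ref_pos, or None if that reference residue faces a gap."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_index = 0    # residues of the reference seen so far
    query_index = 0  # residues of the query seen so far
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_index += 1
        if query_char != gap:
            query_index += 1
        if ref_char != gap and ref_index == ref_pos:
            return query_index if query_char != gap else None
    raise ValueError("ref_pos exceeds the reference sequence length")
```

With the toy alignment `MK-RTA` (reference) over `MKQR-A` (query), reference position 3 maps to query position 4, and reference position 4 faces a gap. In practice `ref_pos` would be 268 or 411 and the reference would be the aligned T. reesei CBH I sequence.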


The R268 and/or R411 substitutions are preferably selected from (a) R268K and R411K; (b) R268K and R411A; (c) R268A and R411K; (d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A; and (h) R411K.
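
The substitution notation used above (wild-type residue, position, replacement residue) can be applied to a sequence string mechanically. The following is a minimal sketch in Python; the function names are hypothetical, positions are 1-based in the numbering of whatever sequence is passed in, and the short sequence in the usage note is a toy example rather than a CBH I sequence.

```python
import re


def apply_substitution(sequence: str, code: str) -> str:
    """Apply one substitution such as 'R268K' to a 1-based sequence.

    Raises ValueError if the code is malformed, the position is out of
    range, or the residue at that position is not the stated wild type.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"malformed substitution code: {code!r}")
    wild_type, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3))
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}")
    return sequence[:position - 1] + replacement + sequence[position:]


def apply_substitutions(sequence: str, codes: list[str]) -> str:
    """Apply several substitutions in turn; lengths never change."""
    for code in codes:
        sequence = apply_substitution(sequence, code)
    return sequence
```

For instance, `apply_substitutions("ACDRER", ["R4K", "R6A"])` replaces both arginines of the toy sequence. For a reference polypeptide other than T. reesei CBH I, the positions would first need to be translated to that polypeptide's own numbering.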


CBH I polypeptides belong to the glycosyl hydrolase family 7 (“GH7”). The glycosyl hydrolases of this family include endoglucanases and cellobiohydrolases (exoglucanases). The cellobiohydrolases act processively from the reducing ends of cellulose chains to generate cellobiose. Cellulases of bacterial and fungal origin characteristically have a small cellulose-binding domain (“CBD”) connected to either the N or the C terminus of the catalytic domain (“CD”) via a linker peptide (see Suurnäkki et al., 2000, Cellulose 7:189-209). The CD contains the active site whereas the CBD interacts with cellulose by binding the enzyme to it (van Tilbeurgh et al., 1986, FEBS Lett. 204(2):223-227; Tomme et al., 1988, Eur. J. Biochem. 170:575-581). The three-dimensional structure of the catalytic domain of T. reesei CBH I has been solved (Divne et al., 1994, Science 265:524-528). The CD consists of two β-sheets that pack face-to-face to form a β-sandwich. Most of the remaining amino acids in the CD are loops connecting the β-sheets. Some loops are elongated and bend around the active site, forming a cellulose-binding tunnel (~50 Å). In contrast, endoglucanases have an open substrate binding cleft/groove rather than a tunnel. Typically, the catalytic residues are glutamic acids corresponding to E229 and E234 of T. reesei CBH I.


The loops characteristic of the active sites (“the active site loops”) of reference CBH I polypeptides, which are absent from GH7 family endoglucanases, as well as catalytic glutamate residues of the reference CBH I polypeptides, are shown in Table 4. The variant CBH I polypeptides of the disclosure preferably retain the catalytic glutamate residues or may include a glutamine instead at the position corresponding to E234, as for SEQ ID NO:4. In some embodiments, the variant CBH I polypeptides contain no substitutions or only conservative substitutions in the active site loops relative to the reference CBH I polypeptides from which the variants are derived.


Many CBH I polypeptides do not have a CBD, and most studies concerning the activity of cellulase domains on different substrates have been carried out with only the catalytic domains of CBH I polypeptides. Because CDs with cellobiohydrolase activity can be generated by limited proteolysis of mature CBH I by papain (see, e.g., Chen et al., 1993, Biochem. Mol. Biol. Int. 30(5):901-10), they are often referred to as “core” domains. Accordingly, a variant CBH I can include only the CD “core” of CBH I. Exemplary reference CDs comprise amino acid sequences corresponding to positions 26 to 455 of SEQ ID NO:1, positions 18 to 444 of SEQ ID NO:2, positions 26 to 455 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions 24 to 457 of SEQ ID NO:5, positions 18 to 448 of SEQ ID NO:6, positions 27 to 460 of SEQ ID NO:7, positions 27 to 460 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18 to 434 of SEQ ID NO:12, positions 18 to 445 of SEQ ID NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 443 of SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of SEQ ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions 23 to 446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20, positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to 447 of SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to 447 of SEQ ID NO:26, positions 19 to 442 of SEQ ID NO:27, positions 18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29, positions 18 to 444 of SEQ ID NO:30, positions 18 to 451 of SEQ ID NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to 459 of SEQ ID NO:35, positions 19 to 450 of SEQ ID NO:36, positions 19 to 453 of SEQ ID NO:37, positions 18 to 448 of SEQ ID NO:38, positions 19 to 443 of SEQ ID 
NO:39, positions 19 to 442 of SEQ ID NO:40, positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of SEQ ID NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to 453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45, positions 19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47, positions 18 to 448 of SEQ ID NO:48, positions 19 to 451 of SEQ ID NO:49, positions 18 to 444 of SEQ ID NO:50, positions 2 to 419 of SEQ ID NO:51, positions 27 to 461 of SEQ ID NO:52, positions 21 to 445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions 19 to 448 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions 20 to 443 of SEQ ID NO:57, positions 18 to 448 of SEQ ID NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 455 of SEQ ID NO:60, positions 19 to 449 of SEQ ID NO:61, positions 19 to 449 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to 448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65, positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ ID NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to 443 of SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to 448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions 18 to 444 of SEQ ID NO:73, positions 23 to 458 of SEQ ID NO:74, positions 20 to 452 of SEQ ID NO:75, positions 18 to 435 of SEQ ID NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 457 of SEQ ID NO:78, positions 18 to 448 of SEQ ID NO:79, positions 1 to 431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to 440 of SEQ ID NO:82, positions 19 to 442 of SEQ ID NO:83, positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85, positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to 451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions 18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ ID NO:92, positions 20 to 436 of SEQ ID 
NO:93, positions 18 to 450 of SEQ ID NO:94, positions 22 to 453 of SEQ ID NO:95, positions 16 to 472 of SEQ ID NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to 447 of SEQ ID NO:98, positions 19 to 450 of SEQ ID NO:99, positions 19 to 451 of SEQ ID NO:100, positions 18 to 448 of SEQ ID NO:101, positions 19 to 442 of SEQ ID NO:102, positions 20 to 457 of SEQ ID NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 440 of SEQ ID NO:105, positions 18 to 439 of SEQ ID NO:106, positions 27 to 460 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108, positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID NO:110, positions 19 to 447 of SEQ ID NO:111, positions 18 to 449 of SEQ ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions 18 to 445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115, positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ ID NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435 of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions 21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122, positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437 of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions 23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129, positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435 of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions 23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136, positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437 of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions 20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143, positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID NO:145, positions 24 to 443 of 
SEQ ID NO:146, positions 20 to 445 of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and positions 20 to 437 of SEQ ID NO:149.


The CBDs are particularly involved in the hydrolysis of crystalline cellulose. It has been shown that the ability of cellobiohydrolases to degrade crystalline cellulose decreases when the CBD is absent (Linder and Teeri, 1997, Journal of Biotechnol. 57:15-28). The variant CBH I polypeptides of the disclosure can further include a CBD. Exemplary CBDs comprise amino acid sequences corresponding to positions 494 to 529 of SEQ ID NO:1, positions 480 to 514 of SEQ ID NO:2, positions 494 to 529 of SEQ ID NO:3, positions 491 to 526 of SEQ ID NO:5, positions 477 to 512 of SEQ ID NO:6, positions 497 to 532 of SEQ ID NO:7, positions 504 to 539 of SEQ ID NO:8, positions 486 to 521 of SEQ ID NO:13, positions 556 to 596 of SEQ ID NO:15, positions 490 to 525 of SEQ ID NO:18, positions 495 to 530 of SEQ ID NO:20, positions 471 to 506 of SEQ ID NO:23, positions 481 to 516 of SEQ ID NO:27, positions 480 to 514 of SEQ ID NO:30, positions 495 to 529 of SEQ ID NO:35, positions 493 to 528 of SEQ ID NO:36, positions 477 to 512 of SEQ ID NO:38, positions 547 to 586 of SEQ ID NO:39, positions 475 to 510 of SEQ ID NO:40, positions 479 to 513 of SEQ ID NO:41, positions 506 to 541 of SEQ ID NO:42, positions 481 to 516 of SEQ ID NO:43, positions 503 to 537 of SEQ ID NO:45, positions 488 to 523 of SEQ ID NO:46, positions 476 to 511 of SEQ ID NO:48, positions 488 to 523 of SEQ ID NO:49, positions 479 to 513 of SEQ ID NO:50, positions 500 to 535 of SEQ ID NO:52, positions 493 to 528 of SEQ ID NO:55, positions 479 to 514 of SEQ ID NO:58, positions 494 to 529 of SEQ ID NO:60, positions 490 to 525 of SEQ ID NO:61, positions 497 to 532 of SEQ ID NO:62, positions 475 to 510 of SEQ ID NO:64, positions 477 to 512 of SEQ ID NO:65, positions 486 to 521 of SEQ ID NO:66, positions 470 to 505 of SEQ ID NO:67, positions 491 to 526 of SEQ ID NO:68, positions 476 to 511 of SEQ ID NO:69, positions 480 to 514 of SEQ ID NO:73, positions 506 to 540 of SEQ ID NO:74, positions 471 to 504 of SEQ ID NO:76, positions 
501 to 536 of SEQ ID NO:78, positions 473 to 508 of SEQ ID NO:79, positions 481 to 516 of SEQ ID NO:83, positions 488 to 523 of SEQ ID NO:86, positions 475 to 510 of SEQ ID NO:92, positions 468 to 504 of SEQ ID NO:93, positions 501 to 536 of SEQ ID NO:96, positions 482 to 517 of SEQ ID NO:98, positions 481 to 516 of SEQ ID NO:99, positions 488 to 523 of SEQ ID NO:100, positions 472 to 507 of SEQ ID NO:101, positions 481 to 516 of SEQ ID NO:102, positions 471 to 505 of SEQ ID NO:105, positions 481 to 516 of SEQ ID NO:106, positions 495 to 530 of SEQ ID NO:107, positions 488 to 523 of SEQ ID NO:111, positions 478 to 513 of SEQ ID NO:112, positions 501 to 536 of SEQ ID NO:113, positions 491 to 526 of SEQ ID NO:115, and positions 503 to 538 of SEQ ID NO:116.


The CD and CBD are often connected via a linker. Exemplary linker sequences correspond to positions 456 to 493 of SEQ ID NO:1, positions 445 to 479 of SEQ ID NO:2, positions 456 to 493 of SEQ ID NO:3, positions 458 to 490 of SEQ ID NO:5, positions 449 to 476 of SEQ ID NO:6, positions 461 to 496 of SEQ ID NO:7, positions 461 to 503 of SEQ ID NO:8, positions 446 to 485 of SEQ ID NO:13, positions 444 to 555 of SEQ ID NO:15, positions 450 to 489 of SEQ ID NO:18, positions 450 to 494 of SEQ ID NO:20, positions 448 to 470 of SEQ ID NO:23, positions 443 to 480 of SEQ ID NO:27, positions 445 to 479 of SEQ ID NO:30, positions 460 to 494 of SEQ ID NO:35, positions 451 to 492 of SEQ ID NO:36, positions 449 to 476 of SEQ ID NO:38, positions 444 to 546 of SEQ ID NO:39, positions 443 to 474 of SEQ ID NO:40, positions 445 to 478 of SEQ ID NO:41, positions 458 to 505 of SEQ ID NO:42, positions 450 to 480 of SEQ ID NO:43, positions 457 to 502 of SEQ ID NO:45, positions 452 to 487 of SEQ ID NO:46, positions 449 to 475 of SEQ ID NO:48, positions 452 to 487 of SEQ ID NO:49, positions 445 to 478 of SEQ ID NO:50, positions 462 to 499 of SEQ ID NO:52, positions 449 to 492 of SEQ ID NO:55, positions 449 to 478 of SEQ ID NO:58, positions 456 to 493 of SEQ ID NO:60, positions 450 to 489 of SEQ ID NO:61, positions 450 to 496 of SEQ ID NO:62, positions 449 to 474 of SEQ ID NO:64, positions 452 to 476 of SEQ ID NO:65, positions 448 to 485 of SEQ ID NO:66, positions 425 to 469 of SEQ ID NO:67, positions 449 to 490 of SEQ ID NO:68, positions 444 to 475 of SEQ ID NO:69, positions 445 to 479 of SEQ ID NO:73, positions 459 to 505 of SEQ ID NO:74, positions 436 to 470 of SEQ ID NO:76, positions 458 to 500 of SEQ ID NO:78, positions 449 to 472 of SEQ ID NO:79, positions 443 to 480 of SEQ ID NO:83, positions 448 to 487 of SEQ ID NO:86, positions 443 to 474 of SEQ ID NO:92, positions 437 to 467 of SEQ ID NO:93, positions 473 to 500 of SEQ ID NO:96, positions 448 to 481 of SEQ ID NO:98, positions 451 to 
480 of SEQ ID NO:99, positions 452 to 487 of SEQ ID NO:100, positions 449 to 471 of SEQ ID NO:101, positions 443 to 480 of SEQ ID NO:102, positions 441 to 470 of SEQ ID NO:105, positions 440 to 480 of SEQ ID NO:106, positions 461 to 494 of SEQ ID NO:107, positions 448 to 487 of SEQ ID NO:111, positions 450 to 478 of SEQ ID NO:112, positions 458 to 500 of SEQ ID NO:113, positions 449 to 490 of SEQ ID NO:115, and positions 449 to 502 of SEQ ID NO:116.


Because CBH I polypeptides are modular, the CBDs, CDs and linkers of different CBH I polypeptides, such as the exemplary CBH I polypeptides of Table 1, can be used interchangeably. However, in a preferred embodiment, the CBDs, CDs and linkers of a variant CBH I of the disclosure originate from the same polypeptide.
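
The modular arrangement described above can be illustrated by slicing domains out of a sequence by their 1-based, inclusive boundaries and joining them in either orientation. The following is a minimal sketch in Python; the names `Span`, `extract_domain` and `assemble` are hypothetical, and the sequence in the usage note is a toy string rather than a CBH I sequence.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """1-based, inclusive domain boundaries, as listed in the text."""
    start: int
    end: int


def extract_domain(sequence: str, span: Span) -> str:
    """Return the residues from span.start to span.end inclusive."""
    if not 1 <= span.start <= span.end <= len(sequence):
        raise ValueError(f"span {span} does not fit the sequence")
    return sequence[span.start - 1:span.end]


def assemble(cd: str, linker: str = "", cbd: str = "",
             cbd_n_terminal: bool = False) -> str:
    """Join a CD with an optional linker and CBD.

    The CBD is placed C-terminal to the CD unless cbd_n_terminal is True.
    """
    if cbd_n_terminal:
        return cbd + linker + cd
    return cd + linker + cbd
```

As a usage example, the boundaries given above for SEQ ID NO:2 would be expressed as `Span(18, 444)` for the CD, `Span(445, 479)` for the linker and `Span(480, 514)` for the CBD. On a toy string such as `"SSCCCCLLBB"`, extracting `Span(3, 6)`, `Span(7, 8)` and `Span(9, 10)` and passing the pieces to `assemble` reproduces the CD, linker and CBD in order.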


The variant CBH I polypeptides of the disclosure preferably have at least a two-fold reduction of product inhibition, such that cellobiose has an IC50 towards the variant CBH I that is at least 2-fold the IC50 of the corresponding reference CBH I, e.g., CBH I lacking the R268 substitution and/or R411 substitution. More preferably the IC50 of cellobiose towards the variant CBH I is at least 3-fold, at least 5-fold, at least 8-fold, at least 10-fold, at least 12-fold or at least 15-fold the IC50 of the corresponding reference CBH I. In specific embodiments the IC50 of cellobiose towards the variant CBH I ranges from 2-fold to 15-fold, from 2-fold to 10-fold, from 3-fold to 10-fold, from 5-fold to 12-fold, from 4-fold to 12-fold, from 5-fold to 10-fold, from 2-fold to 8-fold, or from 8-fold to 20-fold the IC50 of the corresponding reference CBH I. The IC50 can be determined in a phosphoric acid swollen cellulose (“PASC”) assay (Du et al., 2010, Applied Biochemistry and Biotechnology 161:313-317) or a methylumbelliferyl lactoside (“MUL”) assay (van Tilbeurgh and Claeyssens, 1985, FEBS Letts. 187(2):283-288), as exemplified in the Examples below.
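
The fold-change comparison described above can be sketched as follows. This Python example is illustrative only: it estimates IC50 by simple linear interpolation between the two dose-response points that bracket 50% activity, which is a simplification chosen for brevity and not necessarily the fitting procedure used in the Examples. The function names and the dose-response values in the usage note are hypothetical.

```python
from typing import Sequence


def estimate_ic50(concentrations: Sequence[float],
                  percent_activity: Sequence[float]) -> float:
    """Estimate the inhibitor concentration giving 50% activity.

    Assumes concentrations are sorted in ascending order and activity
    decreases with concentration. Uses linear interpolation between the
    two points that bracket 50% (a simplifying assumption).
    """
    points = list(zip(concentrations, percent_activity))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 >= 50.0 >= a1 and a0 != a1:
            return c0 + (a0 - 50.0) * (c1 - c0) / (a0 - a1)
    raise ValueError("activity does not cross 50% in the tested range")


def ic50_fold_change(ic50_variant: float, ic50_reference: float) -> float:
    """Return the IC50 of the variant as a multiple of the reference IC50."""
    if ic50_reference <= 0:
        raise ValueError("reference IC50 must be positive")
    return ic50_variant / ic50_reference
```

As a usage example with invented numbers, a reference curve of 100, 75, 50 and 25% activity at cellobiose concentrations of 0, 1, 2 and 4 (arbitrary units) and a variant curve of 100, 80, 60 and 40% at 0, 4, 8 and 16 would each be passed to `estimate_ic50`, and the two results compared with `ic50_fold_change` against the 2-fold threshold.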


The variant CBH I polypeptides of the disclosure preferably have a cellobiohydrolase activity that is at least 30% the cellobiohydrolase activity of the corresponding reference CBH I, e.g., CBH I lacking the R268 substitution and/or R411 substitution. More preferably, the cellobiohydrolase activity of the variant CBH I is at least 40%, at least 50%, at least 60% or at least 70% the cellobiohydrolase activity of the corresponding reference CBH I. In specific embodiments the cellobiohydrolase activity of the variant CBH I ranges from 30% to 80%, from 40% to 70%, from 30% to 60%, from 50% to 80% or from 60% to 80% of the cellobiohydrolase activity of the corresponding reference CBH I. Assays for cellobiohydrolase activity are described, for example, in Becker et al., 2001, Biochem J. 356:19-30 and Mitsuishi et al., 1990, FEBS Letts. 275:135-138, each of which is expressly incorporated by reference herein. The ability of CBH I to hydrolyze isolated soluble and insoluble substrates can also be measured using assays described in Srisodsuk et al., 1997, J. Biotech. 57:49-57 and Nidetzky and Claeyssens, 1994, Biotech. Bioeng. 44:961-966. Substrates useful for assaying cellobiohydrolase activity include crystalline cellulose, filter paper, phosphoric acid swollen cellulose, cellooligosaccharides, methylumbelliferyl lactoside, methylumbelliferyl cellobioside, orthonitrophenyl lactoside, paranitrophenyl lactoside, orthonitrophenyl cellobioside, and paranitrophenyl cellobioside. Cellobiohydrolase activity can be measured in an assay utilizing PASC as the substrate and a calcofluor white detection method (Du et al., 2010, Applied Biochemistry and Biotechnology 161:313-317). PASC can be prepared as described by Walseth, 1952, TAPPI 35:228-235 and Wood, 1971, Biochem. J. 121:353-362.


Other than said R268 and/or R411 substitution, the variant CBH I polypeptides of the disclosure preferably:

    • comprise an amino acid sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to a CD of a reference CBH I exemplified in Table 1 (i.e., a CD comprising an amino acid sequence corresponding to positions 26 to 455 of SEQ ID NO:1, positions 18 to 444 of SEQ ID NO:2, positions 26 to 455 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions 24 to 457 of SEQ ID NO:5, positions 18 to 448 of SEQ ID NO:6, positions 27 to 460 of SEQ ID NO:7, positions 27 to 460 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18 to 434 of SEQ ID NO:12, positions 18 to 445 of SEQ ID NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 443 of SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of SEQ ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions 23 to 446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20, positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to 447 of SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to 447 of SEQ ID NO:26, positions 19 to 442 of SEQ ID NO:27, positions 18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29, positions 18 to 444 of SEQ ID NO:30, positions 18 to 451 of SEQ ID NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to 459 of SEQ ID NO:35, positions 19 to 450 of SEQ ID NO:36, positions 19 to 453 of SEQ ID NO:37, positions 18 to 448 of SEQ ID NO:38, positions 19 to 443 of SEQ ID NO:39, positions 19 to 442 of SEQ ID NO:40, positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of SEQ ID 
NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to 453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45, positions 19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47, positions 18 to 448 of SEQ ID NO:48, positions 19 to 451 of SEQ ID NO:49, positions 18 to 444 of SEQ ID NO:50, positions 2 to 419 of SEQ ID NO:51, positions 27 to 461 of SEQ ID NO:52, positions 21 to 445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions 19 to 448 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions 20 to 443 of SEQ ID NO:57, positions 18 to 448 of SEQ ID NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 455 of SEQ ID NO:60, positions 19 to 449 of SEQ ID NO:61, positions 19 to 449 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to 448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65, positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ ID NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to 443 of SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to 448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions 18 to 444 of SEQ ID NO:73, positions 23 to 458 of SEQ ID NO:74, positions 20 to 452 of SEQ ID NO:75, positions 18 to 435 of SEQ ID NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 457 of SEQ ID NO:78, positions 18 to 448 of SEQ ID NO:79, positions 1 to 431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to 440 of SEQ ID NO:82, positions 19 to 442 of SEQ ID NO:83, positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85, positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to 451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions 18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ ID NO:92, positions 20 to 436 of SEQ ID NO:93, positions 18 to 450 of SEQ ID NO:94, positions 22 to 453 of SEQ ID NO:95, positions 16 to 472 of SEQ ID 
NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to 447 of SEQ ID NO:98, positions 19 to 450 of SEQ ID NO:99, positions 19 to 451 of SEQ ID NO:100, positions 18 to 448 of SEQ ID NO:101, positions 19 to 442 of SEQ ID NO:102, positions 20 to 457 of SEQ ID NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 440 of SEQ ID NO:105, positions 18 to 439 of SEQ ID NO:106, positions 27 to 460 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108, positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID NO:110, positions 19 to 447 of SEQ ID NO:111, positions 18 to 449 of SEQ ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions 18 to 445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115, positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ ID NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435 of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions 21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122, positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437 of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions 23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129, positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435 of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions 23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136, positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437 of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions 20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143, positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID NO:145, positions 24 to 443 of SEQ ID NO:146, positions 20 to 445 of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and positions 20 to 
437 of SEQ ID NO:149 (preferably the CD corresponding to positions 26-455 of SEQ ID NO:1 or 18-444 of SEQ ID NO:2); and/or
    • comprise an amino acid sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to a mature polypeptide of a reference CBH I exemplified in Table 1 (i.e., a mature protein comprising an amino acid sequence corresponding to positions 26 to 529 of SEQ ID NO:1, positions 18 to 514 of SEQ ID NO:2, positions 26 to 529 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions 24 to 526 of SEQ ID NO:5, positions 18 to 512 of SEQ ID NO:6, positions 27 to 532 of SEQ ID NO:7, positions 27 to 539 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18 to 434 of SEQ ID NO:12, positions 18 to 521 of SEQ ID NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 596 of SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of SEQ ID NO:17, positions 19 to 525 of SEQ ID NO:18, positions 23 to 446 of SEQ ID NO:19, positions 19 to 530 of SEQ ID NO:20, positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID NO:22, positions 19 to 506 of SEQ ID NO:23, positions 19 to 447 of SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to 447 of SEQ ID NO:26, positions 19 to 516 of SEQ ID NO:27, positions 18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29, positions 18 to 514 of SEQ ID NO:30, positions 18 to 451 of SEQ ID NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to 529 of SEQ ID NO:35, positions 19 to 528 of SEQ ID NO:36, positions 19 to 453 of SEQ ID NO:37, positions 18 to 512 of SEQ ID NO:38, positions 19 to 586 of SEQ ID NO:39, positions 19 to 510 of SEQ ID NO:40, positions 18 to 513 of SEQ ID NO:41, 
positions 24 to 541 of SEQ ID NO:42, positions 18 to 516 of SEQ ID NO:43, positions 19 to 453 of SEQ ID NO:44, positions 26 to 537 of SEQ ID NO:45, positions 19 to 523 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47, positions 18 to 511 of SEQ ID NO:48, positions 19 to 523 of SEQ ID NO:49, positions 18 to 513 of SEQ ID NO:50, positions 2 to 419 of SEQ ID NO:51, positions 27 to 535 of SEQ ID NO:52, positions 21 to 445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions 19 to 528 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions 20 to 443 of SEQ ID NO:57, positions 18 to 514 of SEQ ID NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 529 of SEQ ID NO:60, positions 19 to 525 of SEQ ID NO:61, positions 19 to 532 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to 510 of SEQ ID NO:64, positions 19 to 512 of SEQ ID NO:65, positions 19 to 521 of SEQ ID NO:66, positions 1 to 505 of SEQ ID NO:67, positions 19 to 526 of SEQ ID NO:68, positions 19 to 511 of SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to 448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions 18 to 514 of SEQ ID NO:73, positions 23 to 540 of SEQ ID NO:74, positions 20 to 452 of SEQ ID NO:75, positions 18 to 504 of SEQ ID NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 536 of SEQ ID NO:78, positions 18 to 508 of SEQ ID NO:79, positions 1 to 431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to 440 of SEQ ID NO:82, positions 19 to 516 of SEQ ID NO:83, positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85, positions 18 to 523 of SEQ ID NO:86, positions 18 to 443 of SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to 451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions 18 to 444 of SEQ ID NO:91, positions 19 to 510 of SEQ ID NO:92, positions 20 to 504 of SEQ ID NO:93, positions 18 to 450 of SEQ ID NO:94, positions 22 to 453 of SEQ ID NO:95, 
positions 16 to 536 of SEQ ID NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to 517 of SEQ ID NO:98, positions 19 to 516 of SEQ ID NO:99, positions 19 to 523 of SEQ ID NO:100, positions 18 to 507 of SEQ ID NO:101, positions 19 to 516 of SEQ ID NO:102, positions 20 to 457 of SEQ ID NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 505 of SEQ ID NO:105, positions 18 to 516 of SEQ ID NO:106, positions 27 to 530 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108, positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID NO:110, positions 19 to 523 of SEQ ID NO:111, positions 18 to 513 of SEQ ID NO:112, positions 22 to 536 of SEQ ID NO:113, positions 18 to 445 of SEQ ID NO:114, positions 18 to 526 of SEQ ID NO:115, positions 18 to 538 of SEQ ID NO:116, positions 23 to 435 of SEQ ID NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435 of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions 21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122, positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437 of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions 23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129, positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435 of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions 23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136, positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID NO:138, positions 20 to 445, of SEQ ID NO:139, positions 20 to 437 of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions 20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143, positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID NO:145, positions 24 to 443 of SEQ ID NO:146, positions 20 to 445 of SEQ ID NO:147, positions 21 to 441 of SEQ 
ID NO:148, and positions 20 to 437 of SEQ ID NO:149, preferably the mature polypeptide corresponding to positions 26-529 of SEQ ID NO:1 or 18-514 of SEQ ID NO:2).
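By way of illustration only, a percent-identity threshold of the kind recited above could be checked as in the following Python sketch. It extracts a domain by 1-based inclusive positions and computes identity over a pre-computed pairwise alignment. The function names, the toy sequences in the usage note, and the choice of denominator (alignment columns in which neither sequence has a gap) are assumptions made for this example; the disclosure does not prescribe a particular identity calculation.

```python
def extract_region(sequence, start, end):
    """Return residues start..end using 1-based inclusive positions,
    the convention used in phrases such as 'positions 26 to 455'."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("region out of range")
    return sequence[start - 1:end]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions exist and would give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, with the toy aligned strings "ACDEFG" and "ACDQFG", five of six compared columns match, so the pair would satisfy an 80% threshold but not a 90% threshold.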


An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990, J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. These initial neighborhood word hits act as starting points to find longer HSPs containing them. The word hits are expanded in both directions along each of the two sequences being compared for as far as the cumulative alignment score can be increased. Extension of the word hits is stopped when: the cumulative alignment score falls off by the quantity X from a maximum achieved value; the cumulative score goes to zero or below; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1992, Proc. Nat'l. Acad. Sci. USA 89:10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
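The seed-and-extend procedure described above can be sketched in Python as follows. This is a simplified illustration, not the BLAST program itself: seeds are exact word matches (no neighborhood words scored against a threshold T), the extension is ungapped, scoring uses a fixed match reward and mismatch penalty rather than a substitution matrix, and the function names, the short word length in the usage note, and the X value of 20 are assumptions chosen for the example.

```python
def find_seeds(query, subject, w):
    """Return (i, j) pairs where query[i:i+w] exactly equals subject[j:j+w]."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def _extend(query, subject, i, j, step, start_score, match, mismatch, x_drop):
    """Extend one way from (i, j); return (best score, length at best score).

    Extension stops when the score falls off by x_drop from the maximum,
    when the cumulative score reaches zero or below, or at a sequence end.
    """
    score = best = start_score
    best_len = 0
    length = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        if best - score >= x_drop or score <= 0:
            break
        i += step
        j += step
    return best, best_len


def extend_seed(query, subject, i, j, w, match=5, mismatch=-4, x_drop=20):
    """Extend a word hit in both directions.

    Returns (query_start, subject_start, length, score) with 0-based starts.
    """
    seed_score = w * match
    right_score, right_len = _extend(query, subject, i + w, j + w, 1,
                                     seed_score, match, mismatch, x_drop)
    total, left_len = _extend(query, subject, i - 1, j - 1, -1,
                              right_score, match, mismatch, x_drop)
    return (i - left_len, j - left_len, left_len + w + right_len, total)


def best_hsp(query, subject, w=4, **scoring):
    """Return the highest-scoring ungapped segment pair, or None if no seed."""
    hsps = [extend_seed(query, subject, i, j, w, **scoring)
            for i, j in find_seeds(query, subject, w)]
    return max(hsps, key=lambda h: h[3]) if hsps else None
```

Calling `best_hsp("TTTACGTACGTAAA", "GGGACGTACGTCCC", w=4)` locates the shared eight-residue segment "ACGTACGT", which starts at 0-based position 3 in both toy sequences.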


Most CBH I polypeptides are secreted and are therefore expressed with a signal sequence that is cleaved upon secretion of the polypeptide from the cell. Accordingly, in certain aspects, the variant CBH I polypeptides of the disclosure further include a signal sequence. Exemplary signal sequences comprise amino acid sequences corresponding to positions 1 to 25 of SEQ ID NO:1, positions 1 to 17 of SEQ ID NO:2, positions 1 to 25 of SEQ ID NO:3, positions 1 to 23 of SEQ ID NO:5, positions 1 to 17 of SEQ ID NO:6, positions 1 to 26 of SEQ ID NO:7, positions 1 to 27 of SEQ ID NO:8, positions 1 to 19 of SEQ ID NO:9, positions 1 to 17 of SEQ ID NO:11, positions 1 to 17 of SEQ ID NO:12, positions 1 to 17 of SEQ ID NO:13, positions 1 to 18 of SEQ ID NO:14, positions 1 to 18 of SEQ ID NO:15, positions 1 to 22 of SEQ ID NO:17, positions 1 to 18 of SEQ ID NO:18, positions 1 to 22 of SEQ ID NO:19, positions 1 to 18 of SEQ ID NO:20, positions 1 to 18 of SEQ ID NO:22, positions 1 to 18 of SEQ ID NO:23, positions 1 to 18 of SEQ ID NO:24, positions 1 to 19 of SEQ ID NO:25, positions 1 to 17 of SEQ ID NO:26, positions 1 to 18 of SEQ ID NO:27, positions 1 to 17 of SEQ ID NO:28, positions 1 to 22 of SEQ ID NO:29, positions 1 to 18 of SEQ ID NO:30, positions 1 to 17 of SEQ ID NO:31, positions 1 to 17 of SEQ ID NO:32, positions 1 to 18 of SEQ ID NO:33, positions 1 to 17 of SEQ ID NO:34, positions 1 to 25 of SEQ ID NO:35, positions 1 to 18 of SEQ ID NO:36, positions 1 to 18 of SEQ ID NO:37, positions 1 to 17 of SEQ ID NO:38, positions 1 to 18 of SEQ ID NO:39, positions 1 to 18 of SEQ ID NO:40, positions 1 to 17 of SEQ ID NO:41, positions 1 to 23 of SEQ ID NO:42, positions 1 to 17 of SEQ ID NO:43, positions 1 to 18 of SEQ ID NO:44, positions 1 to 25 of SEQ ID NO:45, positions 1 to 18 of SEQ ID NO:46, positions 1 to 17 of SEQ ID NO:47, positions 1 to 17 of SEQ ID NO:48, positions 1 to 18 of SEQ ID NO:49, positions 1 to 17 of SEQ ID NO:50, positions 1 to 26 of SEQ ID NO:52, positions 1 to 20 
of SEQ ID NO:53, positions 1 to 18 of SEQ ID NO:54, positions 1 to 18 of SEQ ID NO:55, positions 1 to 17 of SEQ ID NO:56, positions 1 to 19 of SEQ ID NO:57, positions 1 to 17 of SEQ ID NO:58, positions 1 to 17 of SEQ ID NO:59, positions 1 to 25 of SEQ ID NO:60, positions 1 to 18 of SEQ ID NO:61, positions 1 to 18 of SEQ ID NO:62, positions 1 to 25 of SEQ ID NO:63, positions 1 to 17 of SEQ ID NO:64, positions 1 to 18 of SEQ ID NO:65, positions 1 to 18 of SEQ ID NO:66, positions 1 to 18 of SEQ ID NO:68, positions 1 to 18 of SEQ ID NO:69, positions 1 to 23 of SEQ ID NO:70, positions 1 to 17 of SEQ ID NO:71, positions 1 to 18 of SEQ ID NO:72, positions 1 to 17 of SEQ ID NO:73, positions 1 to 22 of SEQ ID NO:74, positions 1 to 19 of SEQ ID NO:75, positions 1 to 17 of SEQ ID NO:76, positions 1 to 17 of SEQ ID NO:77, positions 1 to 21 of SEQ ID NO:78, positions 1 to 18 of SEQ ID NO:79, positions 1 to 18 of SEQ ID NO:81, positions 1 to 20 of SEQ ID NO:82, positions 1 to 18 of SEQ ID NO:83, positions 1 to 17 of SEQ ID NO:84, positions 1 to 16 of SEQ ID NO:85, positions 1 to 17 of SEQ ID NO:86, positions 1 to 17 of SEQ ID NO:87, positions 1 to 22 of SEQ ID NO:88, positions 1 to 17 of SEQ ID NO:89, positions 1 to 20 of SEQ ID NO:90, positions 1 to 17 of SEQ ID NO:91, positions 1 to 18 of SEQ ID NO:92, positions 1 to 19 of SEQ ID NO:93, positions 1 to 17 of SEQ ID NO:94, positions 1 to 21 of SEQ ID NO:95, positions 1 to 15 of SEQ ID NO:96, positions 1 to 20 of SEQ ID NO:97, positions 1 to 18 of SEQ ID NO:98, positions 1 to 18 of SEQ ID NO:99, positions 1 to 18 of SEQ ID NO:100, positions 1 to 17 of SEQ ID NO:101, positions 1 to 18 of SEQ ID NO:102, positions 1 to 19 of SEQ ID NO:103, positions 1 to 18 of SEQ ID NO:104, positions 1 to 17 of SEQ ID NO:105, positions 1 to 17 of SEQ ID NO:106, positions 1 to 26 of SEQ ID NO:107, positions 1 to 22 of SEQ ID NO:108, positions 1 to 16 of SEQ ID NO:109, positions 1 to 20 of SEQ ID NO:110, positions 1 to 18 of SEQ ID NO:111, positions 
1 to 17 of SEQ ID NO:112, positions 1 to 21 of SEQ ID NO:113, positions 1 to 17 of SEQ ID NO:114, positions 1 to 17 of SEQ ID NO:115, positions 1 to 18 of SEQ ID NO:116, positions 1 to 22 of SEQ ID NO:117, positions 1 to 20 of SEQ ID NO:118, positions 1 to 22 of SEQ ID NO:119, positions 1 to 19 of SEQ ID NO:120, positions 1 to 20 of SEQ ID NO:121, positions 1 to 19 of SEQ ID NO:122, positions 1 to 22 of SEQ ID NO:123, positions 1 to 19 of SEQ ID NO:124, positions 1 to 20 of SEQ ID NO:125, positions 1 to 19 of SEQ ID NO:126, positions 1 to 21 of SEQ ID NO:127, positions 1 to 22 of SEQ ID NO:128, positions 1 to 19 of SEQ ID NO:129, positions 1 to 20 of SEQ ID NO:130, positions 1 to 19 of SEQ ID NO:131, positions 1 to 20 of SEQ ID NO:132, positions 1 to 20 of SEQ ID NO:133, positions 1 to 21 of SEQ ID NO:134, positions 1 to 22 of SEQ ID NO:135, positions 1 to 22 of SEQ ID NO:136, positions 1 to 22 of SEQ ID NO:137, positions 1 to 22 of SEQ ID NO:138, positions 1 to 19 of SEQ ID NO:139, positions 1 to 19 of SEQ ID NO:140, positions 1 to 20 of SEQ ID NO:141, positions 1 to 19 of SEQ ID NO:142, positions 1 to 20 of SEQ ID NO:143, positions 1 to 25 of SEQ ID NO:144, positions 1 to 22 of SEQ ID NO:145, positions 1 to 23 of SEQ ID NO:146, positions 1 to 19 of SEQ ID NO:147, positions 1 to 20 of SEQ ID NO:148, and positions 1 to 19 of SEQ ID NO:149.
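The relationship between the signal sequence, catalytic domain (CD), and mature polypeptide coordinates recited above can be illustrated with a short Python sketch. The coordinates for SEQ ID NO:1 and SEQ ID NO:2 are taken from the text; the dictionary layout, the function names, and the placeholder precursor string in the usage note are assumptions for illustration and are not actual sequences from the disclosure.

```python
# 1-based inclusive coordinates as recited in the text for two reference sequences.
REGIONS = {
    1: {"signal": (1, 25), "cd": (26, 455), "mature": (26, 529)},
    2: {"signal": (1, 17), "cd": (18, 444), "mature": (18, 514)},
}


def is_consistent(regions):
    """Check that the mature polypeptide and CD begin right after the signal sequence."""
    signal, cd, mature = regions["signal"], regions["cd"], regions["mature"]
    return (signal[0] == 1
            and cd[0] == signal[1] + 1
            and mature[0] == cd[0]
            and cd[1] <= mature[1])


def split_precursor(precursor, regions):
    """Slice a precursor sequence into its named regions (1-based inclusive)."""
    parts = {}
    for name, (start, end) in regions.items():
        if end > len(precursor):
            raise ValueError("region extends past the end of the sequence")
        parts[name] = precursor[start - 1:end]
    return parts
```

Applied to a placeholder string of 529 characters with the SEQ ID NO:1 coordinates, `split_precursor` would return a 25-residue signal sequence, a 430-residue CD, and a 504-residue mature polypeptide.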


Recombinant Expression of Variant CBH I Polypeptides
Cell Culture Systems

The disclosure also provides recombinant cells engineered to express variant CBH I polypeptides. Suitably, the variant CBH I polypeptide is encoded by a nucleic acid operably linked to a promoter.


Where recombinant expression in a filamentous fungal host is desired, the promoter can be a filamentous fungal promoter. The nucleic acids can be, for example, under the control of heterologous promoters. The variant CBH I polypeptides can also be expressed under the control of constitutive or inducible promoters. Examples of promoters that can be used include, but are not limited to, a cellulase promoter, a xylanase promoter, and the 1818 promoter (previously identified as a highly expressed protein by EST mapping in Trichoderma). For example, the promoter can suitably be a cellobiohydrolase, endoglucanase, or β-glucosidase promoter. A particularly suitable promoter can be, for example, a T. reesei cellobiohydrolase, endoglucanase, or β-glucosidase promoter. Non-limiting examples of promoters include a cbh1, cbh2, egl1, egl2, egl3, egl4, egl5, pki1, gpd1, xyn1, or xyn2 promoter.


Suitable host cells include cells of any microorganism (e.g., cells of a bacterium, a protist, an alga, a fungus (e.g., a yeast or filamentous fungus), or other microbe), and are preferably cells of a bacterium, a yeast, or a filamentous fungus.


Suitable host cells of the bacterial genera include, but are not limited to, cells of Escherichia, Bacillus, Lactobacillus, Pseudomonas, and Streptomyces. Suitable cells of bacterial species include, but are not limited to, cells of Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Lactobacillus brevis, Pseudomonas aeruginosa, and Streptomyces lividans.


Suitable host cells of the genera of yeast include, but are not limited to, cells of Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia. Suitable cells of yeast species include, but are not limited to, cells of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus, and Phaffia rhodozyma.


Suitable host cells of filamentous fungi include all filamentous forms of the subdivision Eumycotina. Suitable cells of filamentous fungal genera include, but are not limited to, cells of Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Corynascus, Chaetomium, Cryptococcus, Filobasidium, Fusarium, Gibberella, Humicola, Hypocrea, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Scytalidium, Schizophyllum, Sporotrichum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, and Trichoderma. More preferably, the recombinant cell is a Trichoderma sp. (e.g., Trichoderma reesei), Penicillium sp., Humicola sp. (e.g., Humicola insolens), Aspergillus sp. (e.g., Aspergillus niger), Chrysosporium sp., Fusarium sp., or Hypocrea sp. Suitable cells can also include cells of various anamorph and teleomorph forms of these filamentous fungal genera.


Suitable cells of filamentous fungal species include, but are not limited to, cells of Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium lucknowense, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Coprinus cinereus, Coriolus hirsutus, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Penicillium purpurogenum, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces flavus, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, and Trichoderma viride.


The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the nucleic acid sequence encoding the variant CBH I polypeptide. Culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those skilled in the art. As noted, many references are available for the culture and production of many cells, including cells of bacterial and fungal origin. Cell culture media in general are set forth in Atlas and Parks (eds.), 1993, The Handbook of Microbiological Media, CRC Press, Boca Raton, Fla., which is incorporated herein by reference. For recombinant expression in filamentous fungal cells, the cells are cultured in a standard medium containing physiological salts and nutrients, such as described in Pourquie et al., 1988, Biochemistry and Genetics of Cellulose Degradation, eds. Aubert, et al., Academic Press, pp. 71-86; and Ilmen et al., 1997, Appl. Environ. Microbiol. 63:1298-1306. Culture conditions are also standard, e.g., cultures are incubated at 28° C. in shaker cultures or fermenters until desired levels of variant CBH I expression are achieved. Preferred culture conditions for a given filamentous fungus may be found in the scientific literature and/or from the source of the fungi such as the American Type Culture Collection (ATCC). After fungal growth has been established, the cells are exposed to conditions effective to cause or permit the expression of a variant CBH I.


In cases where a variant CBH I coding sequence is under the control of an inducible promoter, the inducing agent, e.g., a sugar, metal salt or antibiotic, is added to the medium at a concentration effective to induce variant CBH I expression.


In one embodiment, the recombinant cell is an Aspergillus niger, which is a useful strain for obtaining overexpressed polypeptide. For example, A. niger var. awamori dgr246 is known to produce elevated amounts of secreted cellulases (Goedegebuur et al., 2002, Curr. Genet. 41:89-98). Other strains of Aspergillus niger var. awamori such as GCDAP3, GCDAP4 and GAP3-4 are known (Ward et al., 1993, Appl. Microbiol. Biotechnol. 39:738-743).


In another embodiment, the recombinant cell is a Trichoderma reesei, which is a useful strain for obtaining overexpressed polypeptide. For example, RL-P37, described by Sheir-Neiss et al., 1984, Appl. Microbiol. Biotechnol. 20:46-53, is known to secrete elevated amounts of cellulase enzymes. Functional equivalents of RL-P37 include Trichoderma reesei strain RUT-C30 (ATCC No. 56765) and strain QM9414 (ATCC No. 26921). It is contemplated that these strains would also be useful in overexpressing variant CBH I polypeptides.


Cells expressing the variant CBH I polypeptides of the disclosure can be grown under batch, fed-batch or continuous fermentation conditions. Classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. A variation of the batch system is a fed-batch fermentation in which the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. Continuous fermentation systems strive to maintain steady state growth conditions. Methods for modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology.
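The steady state that a continuous fermentation strives to maintain can be illustrated with the textbook Monod chemostat model, sketched below in Python. This model, its parameter names, and the numerical values in the usage note are assumptions introduced for illustration only; the disclosure does not specify a kinetic model, and real filamentous fungal cultures may deviate from it substantially.

```python
def chemostat_steady_state(dilution_rate, mu_max, k_s, yield_coeff, feed_substrate):
    """Steady-state residual substrate and biomass for an ideal Monod chemostat.

    dilution_rate   D, flow rate divided by culture volume (1/h)
    mu_max          maximum specific growth rate (1/h)
    k_s             half-saturation constant (g/L)
    yield_coeff     biomass formed per substrate consumed (g/g)
    feed_substrate  substrate concentration in the feed (g/L)

    Returns (substrate, biomass) in g/L. If the dilution rate is too high
    for growth to keep up, the culture washes out and biomass is 0.
    """
    if dilution_rate <= 0:
        raise ValueError("dilution rate must be positive")
    if dilution_rate >= mu_max:
        return feed_substrate, 0.0  # washout: cells are removed faster than they grow
    # At steady state the specific growth rate equals the dilution rate.
    substrate = k_s * dilution_rate / (mu_max - dilution_rate)
    if substrate >= feed_substrate:
        return feed_substrate, 0.0  # feed too dilute to sustain growth at this rate
    biomass = yield_coeff * (feed_substrate - substrate)
    return substrate, biomass
```

With hypothetical values D = 0.1/h, mu_max = 0.3/h, K_s = 0.5 g/L, a yield of 0.5 g/g, and 10 g/L substrate in the feed, the sketch gives a residual substrate of about 0.25 g/L and a biomass of about 4.9 g/L; raising D to 0.4/h exceeds mu_max and results in washout.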


Recombinant Expression in Plants

The disclosure provides transgenic plants and seeds that recombinantly express a variant CBH I polypeptide. The disclosure also provides plant products, e.g., oils, seeds, leaves, extracts and the like, comprising a variant CBH I polypeptide.


The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a monocot). The disclosure also provides methods of making and using these transgenic plants and seeds. The transgenic plant or plant cell expressing a variant CBH I can be constructed in accordance with any method known in the art. See, for example, U.S. Pat. No. 6,309,872. T. reesei CBH I has been successfully expressed in transgenic tobacco (Nicotiana tabacum) and potato (Solanum tuberosum). See Hooker et al., 2000, in Glycosyl Hydrolases for Biomass Conversion, ACS Symposium Series, Vol. 769, Chapter 4, pp. 55-90.


In a particular aspect, the present disclosure provides for the expression of CBH I variants in transgenic plants or plant organs and methods for the production thereof. DNA expression constructs are provided for the transformation of plants with a nucleic acid encoding the variant CBH I polypeptide, preferably under the control of regulatory sequences which are capable of directing expression of the variant CBH I polypeptide. These regulatory sequences include sequences capable of directing transcription in plants, either constitutively, or in stage and/or tissue specific manners.


The expression of variant CBH I polypeptides in plants can be achieved by a variety of means. Specifically, for example, technologies are available for transforming a large number of plant species, including dicotyledonous species (e.g., tobacco, potato, tomato, Petunia, Brassica) and monocotyledonous species. Additionally, for example, strategies for the expression of foreign genes in plants are available. Additionally still, regulatory sequences from plant genes have been identified that are serviceable for the construction of chimeric genes that can be functionally expressed in plants and in plant cells (e.g., Klee, 1987, Ann. Rev. of Plant Phys. 38:467-486; Clark et al., 1990, Virology 179(2):640-7; Smith et al., 1990, Mol. Gen. Genet. 224(3):477-81).


The introduction of nucleic acids into plants can be achieved using several technologies including transformation with Agrobacterium tumefaciens or Agrobacterium rhizogenes. Non-limiting examples of plant tissues that can be transformed include protoplasts, microspores or pollen, and explants such as leaves, stems, roots, hypocotyls, and cotyls. Furthermore, DNA encoding a variant CBH I can be introduced directly into protoplasts and plant cells or tissues by microinjection, electroporation, particle bombardment, and direct DNA uptake.


Variant CBH I polypeptides can be produced in plants by a variety of expression systems. For instance, the use of a constitutive promoter such as the 35S promoter of Cauliflower Mosaic Virus (Guilley et al., 1982, Cell 30:763-73) is serviceable for the accumulation of the expressed protein in virtually all organs of the transgenic plant. Alternatively, promoters that are tissue-specific and/or stage-specific can be used (Higgins, 1984, Annu. Rev. Plant Physiol. 35:191-221; Shotwell and Larkins, 1989, In: The Biochemistry of Plants Vol. 15 (Academic Press, San Diego: Stumpf and Conn, eds.), p. 297) to permit expression of variant CBH I polypeptides in a target tissue and/or during a desired stage of development.


Compositions of Variant CBH I Polypeptides

In general, a variant CBH I polypeptide produced in cell culture is secreted into the medium and may be purified or isolated, e.g., by removing unwanted components from the cell culture medium. However, in some cases, a variant CBH I polypeptide may be produced in a cellular form necessitating recovery from a cell lysate. In such cases the variant CBH I polypeptide is purified from the cells in which it was produced using techniques routinely employed by those of skill in the art. Examples include, but are not limited to, affinity chromatography (Van Tilbeurgh et al., 1984, FEBS Lett. 169(2):215-218), ion-exchange chromatographic methods (Goyal et al., 1991, Bioresource Technology, 36:37-50; Fliess et al., 1983, Eur. J. Appl. Microbiol. Biotechnol. 17:314-318; Bhikhabhai et al., 1984, J. Appl. Biochem. 6:336-345; Ellouz et al., 1987, Journal of Chromatography, 396:307-317), including ion-exchange using materials with high resolution power (Medve et al., 1998, J. Chromatography A, 808:153-165), hydrophobic interaction chromatography (Tomaz and Queiroz, 1999, J. Chromatography A, 865:123-128), and two-phase partitioning (Brumbauer et al., 1999, Bioseparation 7:287-295).


The variant CBH I polypeptides of the disclosure are suitably used in cellulase compositions. Cellulases are known in the art as enzymes that hydrolyze cellulose (beta-1,4-glucan or beta D-glucosidic linkages) resulting in the formation of glucose, cellobiose, cellooligosaccharides, and the like. Cellulase enzymes have been traditionally divided into three major classes: endoglucanases (“EG”), exoglucanases or cellobiohydrolases (EC 3.2.1.91) (“CBH”) and beta-glucosidases (EC 3.2.1.21) (“BG”) (Knowles et al., 1987, TIBTECH 5:255-261; Schulein, 1988, Methods in Enzymology 160(25):234-243).


Certain fungi produce complete cellulase systems which include exo-cellobiohydrolases or CBH-type cellulases, endoglucanases or EG-type cellulases and β-glucosidases or BG-type cellulases (Schulein, 1988, Methods in Enzymology 160(25):234-243). Such cellulase compositions are referred to herein as “whole” cellulases. However, these systems sometimes lack CBH-type cellulases, and bacterial cellulases also typically include few or no CBH-type cellulases. In addition, it has been shown that the EG components and CBH components synergistically interact to more efficiently degrade cellulose. See, e.g., Wood, 1985, Biochemical Society Transactions 13(2):407-410.


The cellulase compositions of the disclosure typically include, in addition to a variant CBH I polypeptide, one or more cellobiohydrolases, endoglucanases and/or β-glucosidases. In their crudest form, cellulase compositions contain the microorganism culture that produced the enzyme components. “Cellulase compositions” also refers to a crude fermentation product of the microorganisms. A crude fermentation product is preferably a fermentation broth that has been separated from the microorganism cells and/or cellular debris (e.g., by centrifugation and/or filtration). In some cases, the enzymes in the broth can be optionally diluted, concentrated, partially purified or purified and/or dried. The variant CBH I polypeptide can be co-expressed with one or more of the other components of the cellulase composition or it can be expressed separately, optionally purified and combined with a composition comprising one or more of the other cellulase components.


When employed in cellulase compositions, the variant CBH I is generally present in an amount sufficient to allow release of soluble sugars from the biomass. The amount of variant CBH I enzyme added depends upon the type of biomass to be saccharified, which can be readily determined by the skilled artisan. In certain embodiments, the weight percent of variant CBH I polypeptide is suitably at least 1, at least 5, at least 10, or at least 20 weight percent of the total polypeptides in a cellulase composition. Exemplary cellulase compositions include a variant CBH I of the disclosure in an amount ranging from about 1 to about 20 weight percent, from about 1 to about 25 weight percent, from about 5 to about 20 weight percent, from about 5 to about 25 weight percent, from about 5 to about 30 weight percent, from about 5 to about 35 weight percent, from about 5 to about 40 weight percent, from about 5 to about 45 weight percent, from about 5 to about 50 weight percent, from about 10 to about 20 weight percent, from about 10 to about 25 weight percent, from about 10 to about 30 weight percent, from about 10 to about 35 weight percent, from about 10 to about 40 weight percent, from about 10 to about 45 weight percent, from about 10 to about 50 weight percent, from about 15 to about 20 weight percent, from about 15 to about 25 weight percent, from about 15 to about 30 weight percent, from about 15 to about 35 weight percent, from about 15 to about 40 weight percent, from about 15 to about 45 weight percent, or from about 15 to about 50 weight percent of the total polypeptides in the composition.


Utility of Variant CBH I Polypeptides

It can be appreciated that the variant CBH I polypeptides of the disclosure and compositions comprising the variant CBH I polypeptides find utility in a wide variety of applications, for example detergent compositions that exhibit enhanced cleaning ability, function as a softening agent and/or improve the feel of cotton fabrics (e.g., “stone washing” or “biopolishing”), or in cellulase compositions for degrading wood pulp into sugars (e.g., for bio-ethanol production). Other applications include the treatment of mechanical pulp (Pere et al., 1996, Tappi Pulping Conference, pp. 693-696 (Nashville, Tenn., Oct. 27-31, 1996)), use as a feed additive (see, e.g., WO 91/04673) and grain wet milling.


Saccharification Reactions

Ethanol can be produced via saccharification and fermentation processes from cellulosic biomass such as trees, herbaceous plants, municipal solid waste and agricultural and forestry residues. However, the ratio of individual cellulase enzymes within a naturally occurring cellulase mixture produced by a microbe may not be the most efficient for rapid conversion of cellulose in biomass to glucose. It is known that endoglucanases act to produce new cellulose chain ends which themselves are substrates for the action of cellobiohydrolases and thereby improve the efficiency of hydrolysis of the entire cellulase system. The use of optimized cellobiohydrolase activity may greatly enhance the production of ethanol.


Cellulase compositions comprising one or more of the variant CBH I polypeptides of the disclosure can be used in saccharification reactions to produce simple sugars for fermentation. Accordingly, the present disclosure provides methods for saccharification comprising contacting biomass with a cellulase composition comprising a variant CBH I polypeptide of the disclosure and, optionally, subjecting the resulting sugars to fermentation by a microorganism.


The term “biomass,” as used herein, refers to any composition comprising cellulose (optionally also hemicellulose and/or lignin). As used herein, biomass includes, without limitation, seeds, grains, tubers, plant waste or byproducts of food processing or industrial processing (e.g., stalks), corn (including, e.g., cobs, stover, and the like), grasses (including, e.g., Indian grass, such as Sorghastrum nutans; or, switchgrass, e.g., Panicum species, such as Panicum virgatum), wood (including, e.g., wood chips, processing waste), paper, pulp, and recycled paper (including, e.g., newspaper, printer paper, and the like). Other biomass materials include, without limitation, potatoes, soybean, rapeseed, barley, rye, oats, wheat, beets, and sugar cane bagasse.


The saccharified biomass (e.g., lignocellulosic material processed by enzymes of the disclosure) can be made into a number of bio-based products, via processes such as, e.g., microbial fermentation and/or chemical synthesis. As used herein, “microbial fermentation” refers to a process of growing and harvesting fermenting microorganisms under suitable conditions. The fermenting microorganism can be any microorganism suitable for use in a desired fermentation process for the production of bio-based products. Suitable fermenting microorganisms include, without limitation, filamentous fungi, yeast, and bacteria. The saccharified biomass can, for example, be made into a fuel (e.g., a biofuel such as a bioethanol, biobutanol, biomethanol, a biopropanol, a biodiesel, a jet fuel, or the like) via fermentation and/or chemical synthesis. The saccharified biomass can, for example, also be made into a commodity chemical (e.g., ascorbic acid, isoprene, 1,3-propanediol), lipids, amino acids, polypeptides, and enzymes, via fermentation and/or chemical synthesis.


Thus, in certain aspects, the variant CBH I polypeptides of the disclosure find utility in the generation of ethanol from biomass in either separate or simultaneous saccharification and fermentation processes. Separate saccharification and fermentation is a process whereby cellulose present in biomass is saccharified into simple sugars (e.g., glucose) and the simple sugars subsequently fermented by microorganisms (e.g., yeast) into ethanol. Simultaneous saccharification and fermentation is a process whereby cellulose present in biomass is saccharified into simple sugars (e.g., glucose) and, at the same time and in the same reactor, microorganisms (e.g., yeast) ferment the simple sugars into ethanol.


Prior to saccharification, biomass is preferably subjected to one or more pretreatment step(s) in order to render cellulose material more accessible or susceptible to enzymes and thus more amenable to hydrolysis by the variant CBH I polypeptides of the disclosure.


In an exemplary embodiment, the pretreatment entails subjecting biomass material to a catalyst comprising a dilute solution of a strong acid and a metal salt in a reactor. The biomass material can, e.g., be a raw material or a dried material. This pretreatment can lower the activation energy, or the temperature, of cellulose hydrolysis, ultimately allowing higher yields of fermentable sugars. See, e.g., U.S. Pat. Nos. 6,660,506; 6,423,145.


Another exemplary pretreatment method entails hydrolyzing biomass by subjecting the biomass material to a first hydrolysis step in an aqueous medium at a temperature and a pressure chosen to effectuate primarily depolymerization of hemicellulose without achieving significant depolymerization of cellulose into glucose. This step yields a slurry in which the liquid aqueous phase contains dissolved monosaccharides resulting from depolymerization of hemicellulose, and a solid phase containing cellulose and lignin. The slurry is then subject to a second hydrolysis step under conditions that allow a major portion of the cellulose to be depolymerized, yielding a liquid aqueous phase containing dissolved/soluble depolymerization products of cellulose. See, e.g., U.S. Pat. No. 5,536,325.


A further exemplary method involves processing a biomass material by one or more stages of dilute acid hydrolysis using about 0.4% to about 2% of a strong acid; followed by treating the unreacted solid lignocellulosic component of the acid hydrolyzed material with alkaline delignification. See, e.g., U.S. Pat. No. 6,409,841. Another exemplary pretreatment method comprises prehydrolyzing biomass (e.g., lignocellulosic materials) in a prehydrolysis reactor; adding an acidic liquid to the solid lignocellulosic material to make a mixture; heating the mixture to reaction temperature; maintaining reaction temperature for a period of time sufficient to fractionate the lignocellulosic material into a solubilized portion containing at least about 20% of the lignin from the lignocellulosic material, and a solid fraction containing cellulose; separating the solubilized portion from the solid fraction, and removing the solubilized portion while at or near reaction temperature; and recovering the solubilized portion. The cellulose in the solid fraction is rendered more amenable to enzymatic digestion. See, e.g., U.S. Pat. No. 5,705,369. Further pretreatment methods can involve the use of hydrogen peroxide (H2O2). See Gould, 1984, Biotech. and Bioengr. 26:46-52.


Pretreatment can also comprise contacting a biomass material with stoichiometric amounts of sodium hydroxide and ammonium hydroxide at a very low concentration. See Teixeira et al., 1999, Appl. Biochem. and Biotech. 77-79:19-34. Pretreatment can also comprise contacting a lignocellulose with a chemical (e.g., a base, such as sodium carbonate or potassium hydroxide) at a pH of about 9 to about 14 under moderate temperature and pressure. See PCT Publication WO2004/081185.


Ammonia pretreatment can also be used. Such a pretreatment method comprises subjecting a biomass material to low ammonia concentration under conditions of high solids. See, e.g., U.S. Patent Publication No. 20070031918 and PCT publication WO 06/110901.


Detergent Compositions Comprising Variant CBH I Proteins

The present disclosure also provides detergent compositions comprising a variant CBH I polypeptide of the disclosure. The detergent compositions may employ, besides the variant CBH I polypeptide, one or more of a surfactant, including anionic, non-ionic and ampholytic surfactants; a hydrolase; a bleaching agent; a bluing agent; a caking inhibitor; a solubilizer; and a cationic surfactant. All of these components are known in the detergent art.


The variant CBH I polypeptide is preferably provided as part of a cellulase composition. The cellulase composition can be employed from about 0.00005 weight percent to about 5 weight percent or from about 0.0002 weight percent to about 2 weight percent of the total detergent composition. The cellulase composition can be in the form of a liquid diluent, granule, emulsion, gel, paste, and the like. Such forms are known to the skilled artisan. When a solid detergent composition is employed, the cellulase composition is preferably formulated as granules.


Examples
Materials and Methods
Preparation of CBH I Polypeptides for Biochemical Characterization

Protein expression was carried out in an Aspergillus niger host strain that had been transformed using PEG-mediated transformation with expression constructs for CBH I that included the hygromycin resistance gene as a selectable marker, in which the full length CBH I sequences (signal sequence, catalytic domain, linker and cellulose binding domain) were under the control of the glyceraldehyde-3-phosphate dehydrogenase (gpd) promoter. Transformants were selected on the regeneration medium based on resistance to hygromycin. The selected transformants were cultured in Aspergillus salts medium (pH 6.2, supplemented with the antibiotics penicillin, streptomycin, and hygromycin, and 80 g/L glycerol, 20 g/L soytone, 10 mM uridine, 20 g/L MES) in baffled shake flasks at 30° C., 170 rpm. After five days of incubation, the total secreted protein supernatant was recovered, and then subjected to hollow fiber filtration to concentrate and exchange the sample into acetate buffer (50 mM NaAc, pH 5). CBH I protein represented over 90% of the total protein in these samples. Protein purity was analyzed by SDS-PAGE. Protein concentration was determined by gel densitometry and/or HPLC analysis. All CBH I protein concentrations were normalized before assay and concentrated to 1-2.5 mg/ml.


CBH I Activity Assays

4-Methylumbelliferyl Lactoside (4-MUL) Assay:


This assay measures the activity of CBH I on the fluorogenic substrate 4-MUL (also known as MUL). Assays were run in a Costar 96-well black bottom plate, where reactions were initiated by the addition of 4-MUL to enzyme in buffer (2 mM 4-MUL in 200 mM MES pH 6). Enzymatic rates were monitored by fluorescent readouts over five minutes on a SPECTRAMAX™ plate reader (ex/em 365/450 nm). Data in the linear range were used to calculate initial rates (Vo).
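
As a hedged illustration of how an initial rate could be extracted from the linear range of a fluorescence time course, the following Python sketch fits a least-squares line to a set of readings. The function name `initial_rate` and the readings are hypothetical and are not taken from the assay data; the disclosure does not state how the linear range was selected or fitted.

```python
def initial_rate(times_s, fluorescence):
    """Return the least-squares slope of fluorescence vs. time (RFU per second).

    Assumes the caller has already restricted the data to the linear range.
    """
    n = len(times_s)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


# Hypothetical readings over five minutes (seconds, relative fluorescence units).
times = [0, 60, 120, 180, 240, 300]
rfu = [100, 160, 220, 280, 340, 400]
v0 = initial_rate(times, rfu)
print(f"Initial rate: {v0:.3f} RFU/s")
```

The slope is in instrument units; converting it to a molar rate would require a 4-methylumbelliferone standard curve, which is not described here.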


Phosphoric Acid Swollen Cellulose (PASC) Assay:


This assay measures the activity of CBH I using PASC as the substrate. During the assay, the concentration of PASC is monitored by a fluorescent signal derived from calcofluor binding to PASC (ex/em 365/440 nm). The assay is initiated by mixing enzyme (15 μl) and reaction buffer (85 μl of 0.2% PASC, 200 mM MES, pH 6), and then incubating at 35° C. while shaking at 225 RPM. After 2 hours, one reaction volume of calcofluor stop solution (100 μg/ml in 500 mM glycine pH 10) is added and fluorescence read-outs obtained (ex/em 365/440 nm).
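
The final concentrations in the PASC reaction follow from the stated volumes. The sketch below works through that arithmetic in Python. It assumes that the enzyme solution contributes no PASC or MES and that "one reaction volume" of stop solution means 100 μl (15 μl + 85 μl); the helper name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration of a component after dilution into total_volume_ul."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


ENZYME_UL = 15.0   # enzyme volume from the assay description
BUFFER_UL = 85.0   # reaction buffer volume (0.2% PASC, 200 mM MES)
reaction_ul = ENZYME_UL + BUFFER_UL

# Concentrations during the 2 hour incubation.
pasc_percent = final_concentration(0.2, BUFFER_UL, reaction_ul)
mes_mm = final_concentration(200.0, BUFFER_UL, reaction_ul)

# Assumption: one reaction volume of stop solution equals reaction_ul.
stop_ul = reaction_ul
stopped_ul = reaction_ul + stop_ul
calcofluor_ug_per_ml = final_concentration(100.0, stop_ul, stopped_ul)
glycine_mm = final_concentration(500.0, stop_ul, stopped_ul)

print(f"PASC in reaction: {pasc_percent:.2f}%")
print(f"MES in reaction: {mes_mm:.0f} mM")
print(f"Calcofluor at read-out: {calcofluor_ug_per_ml:.0f} ug/ml")
print(f"Glycine at read-out: {glycine_mm:.0f} mM")
```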


Bagasse Assay:


This assay measures the activity of CBH I on bagasse, a lignocellulosic substrate. Reactions were run in 10 ml vials with 5% dilute acid pretreated bagasse (250 mg solids per 5 ml reaction). Each reaction contained 4 mg CBH I enzyme/g solids, 200 mM MES pH 6, kanamycin, and chloramphenicol. Reactions were incubated at 35° C. in hybridization incubators (Robbins Scientific), rotating at 20 RPM. Time points were taken by transferring a sample of homogeneous slurry (150 μl) into a 96-well deep well plate and quenching the reaction with stop buffer (450 μl of 500 mM sodium carbonate, pH 10). Time point measurements were taken every 24 hours for 72 hours.
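
The loading figures above can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only. It treats the 5% solids figure as weight per volume, which is an assumption, and all variable names are hypothetical.

```python
SOLIDS_MG = 250.0            # bagasse solids per reaction
REACTION_ML = 5.0            # reaction volume
CBH1_MG_PER_G_SOLIDS = 4.0   # enzyme loading
SAMPLE_UL = 150.0            # slurry removed per time point
STOP_UL = 450.0              # sodium carbonate stop buffer per sample

solids_g = SOLIDS_MG / 1000.0

# Assumption: percent solids is expressed as grams per 100 ml.
solids_percent_w_v = 100.0 * solids_g / REACTION_ML

# Enzyme mass per reaction and its nominal concentration.
cbh1_mg = CBH1_MG_PER_G_SOLIDS * solids_g
cbh1_mg_per_ml = cbh1_mg / REACTION_ML

# Dilution factor introduced by quenching, needed to back-calculate
# concentrations measured in the quenched sample.
quench_dilution = (SAMPLE_UL + STOP_UL) / SAMPLE_UL

print(f"Solids loading: {solids_percent_w_v:.1f}% (w/v)")
print(f"CBH I per reaction: {cbh1_mg:.2f} mg ({cbh1_mg_per_ml:.2f} mg/ml)")
print(f"Quench dilution factor: {quench_dilution:.1f}x")
```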


Cellobiose Tolerance Assays (or Cellobiose Inhibition Assays):


Tolerance to cellobiose (or inhibition caused by cellobiose) was tested in two ways in the CBH I assays. A direct-dose tolerance method can be applied to all of the CBH I assays (i.e., 4-MUL, PASC, and/or bagasse assays), and entails the exogenous addition of a known amount of cellobiose into assay mixtures. A different indirect method entails the addition of an excess amount of β-glucosidase (BG) to PASC and bagasse assays (typically, 1 mg β-glucosidase/g solids loaded). BG will enzymatically hydrolyze the cellobiose generated during these assays; therefore, CBH I activity in the presence of BG can be taken as a measure of activity in the absence of cellobiose. Furthermore, when activity in the presence and absence of BG are similar, this indicates tolerance to cellobiose. Notably, in cases where BG activity is undesired, but may be present in crude CBH I enzyme preparations, the BG inhibitor gluconolactone can be added into CBH I assays to prevent cellobiose breakdown.


Library Screening Assays

The wild type CBH I polypeptide BD29555 was mutagenized to identify variants with improved product tolerance. A small (60-member) library of BD29555 variants was designed to identify variant CBH I polypeptides with reduced product inhibition. This product-release-site library was designed based on residues directly interacting with the cellobiose product in an attempt to identify variants with weakened interactions with cellobiose from which the product would be released more readily than from the wild type enzyme. The 60-member evolution library contained wild-type residues and mutations at positions R273, W405, and R422 of BD29555 (SEQ ID NO:1), and included the following substitutions: R273 (WT), R273Q, R273K, R273A, W405 (WT), W405Q, W405H, R422 (WT), R422Q, R422K, R422L, and R422E (4 variants at position 273 × 3 variants at position 405 × 5 variants at position 422 equals 60 variants in total). All members of the library were screened using the 4-MUL assay in the presence and absence of 250 mg/L cellobiose and using gluconolactone to inhibit any BG activity. The R273A, R273Q, and R273K/R422K variants showed enhanced product tolerance. The R273K/R422K variant showed greatest activity among the variants and cellobiose tolerance at 250 mg/L. Due to low expression, the R273K variant was not tested for product inhibition.
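
The size of the combinatorial library follows from the number of residues allowed at each of the three positions. The Python sketch below enumerates the combinations to confirm the count of 60. The naming scheme (substitutions joined by "/", with "WT" for the unmodified enzyme) mirrors the notation used in this section, while the data structure and function name are hypothetical.

```python
from itertools import product

# Position -> (wild-type residue, residues allowed in the library).
LIBRARY = {
    273: ("R", ["R", "Q", "K", "A"]),
    405: ("W", ["W", "Q", "H"]),
    422: ("R", ["R", "Q", "K", "L", "E"]),
}


def variant_name(residues):
    """Name a library member by its substitutions relative to wild type."""
    subs = [
        f"{wt}{pos}{aa}"
        for (pos, (wt, _allowed)), aa in zip(LIBRARY.items(), residues)
        if aa != wt
    ]
    return "/".join(subs) if subs else "WT"


variants = [
    variant_name(combo)
    for combo in product(*(allowed for _wt, allowed in LIBRARY.values()))
]

print(len(variants))   # total number of library members
print(variants[:5])    # first few members, starting with wild type
```

Enumerating the full product makes it straightforward to check that a named variant such as R273K/R422K is a member of the designed library.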


Characterization of Product Tolerant Variants of BD29555

The R273K/R422K substitutions were characterized in both a wild type BD29555 background and also in combination with the substitutions Y274Q, D281K, Y410H, P411G, which were identified in a screen of an expanded product release site evolution library.


The wild type, the R273K/R422K variant and the R273K/Y274Q/D281K/Y410H/P411G/R422K variant were tested for activity on 4-MUL in the presence and absence of 250 mg/L cellobiose, and the R273K/R422K variant was also tested in the bagasse assay in the presence and absence of BG. The results are summarized in Table 5.


The results from these activity assays were converted into the percentage of activity remaining with and without cellobiose present, where values close to 100% indicated cellobiose tolerance. The percent of activity remaining in the MUL assay in the presence of cellobiose versus in the absence of cellobiose shows that the R273K/R422K variant was the most tolerant, followed by the R273K/Y274Q/D281K/Y410H/P411G/R422K variant, and then wild-type, at 95%, 78%, and 25% activity, respectively.
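
The conversion to percentage of activity remaining is a simple ratio of rates. A minimal Python sketch is given below. The enzyme labels and rate values are hypothetical placeholders chosen for illustration and are not the measured data.

```python
def percent_activity_remaining(rate_with_cellobiose, rate_without_cellobiose):
    """Percentage of the uninhibited rate retained in the presence of cellobiose."""
    if rate_without_cellobiose <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * rate_with_cellobiose / rate_without_cellobiose


# Hypothetical initial rates (arbitrary fluorescence units per second):
# (rate without cellobiose, rate with cellobiose).
rates = {
    "enzyme A": (200.0, 50.0),
    "enzyme B": (180.0, 171.0),
}

remaining = {
    name: percent_activity_remaining(with_cb, without_cb)
    for name, (without_cb, with_cb) in rates.items()
}

for name, pct in remaining.items():
    print(f"{name}: {pct:.0f}% activity remaining")
```

Values near 100% correspond to cellobiose tolerance, as described above.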


Cellobiose dose response curves of the wild-type and R273K/R422K variant of BD29555 were obtained during the 4-MUL assay. Enzyme rates (Vo) were measured in the presence of different concentrations of cellobiose (200 mM MES pH 6, 25° C.). Rates were measured in quadruplicate. The results are shown in FIG. 1A-1B. FIG. 1A shows that wild type BD29555 is inhibited by cellobiose, with a half maximal inhibitory concentration (IC50 value) of 60 mg/L. FIG. 1B shows that the R273K/R422K variant is tolerant to cellobiose up to 250 mg/L.
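
To illustrate what an IC50 value implies for a dose response curve, the Python sketch below evaluates a simple hyperbolic inhibition model, v/v0 = 1 / (1 + [cellobiose]/IC50). This model, with a Hill coefficient of 1, is an assumption made for illustration; the disclosure does not state which model was used to derive the IC50, and the function name is hypothetical. Only the 60 mg/L value is taken from the text.

```python
def fraction_active(cellobiose_mg_per_l, ic50_mg_per_l):
    """Fraction of uninhibited rate under a simple hyperbolic inhibition model.

    Assumed model: v / v0 = 1 / (1 + [I] / IC50).
    """
    if ic50_mg_per_l <= 0:
        raise ValueError("IC50 must be positive")
    if cellobiose_mg_per_l < 0:
        raise ValueError("concentration must be non-negative")
    return 1.0 / (1.0 + cellobiose_mg_per_l / ic50_mg_per_l)


WT_IC50_MG_PER_L = 60.0  # IC50 reported for wild type BD29555 in the 4-MUL assay

for conc in (0.0, 60.0, 125.0, 250.0):
    frac = fraction_active(conc, WT_IC50_MG_PER_L)
    print(f"{conc:6.1f} mg/L cellobiose -> {100.0 * frac:5.1f}% of uninhibited rate")
```

By construction the model returns 50% activity at the IC50 concentration; the values at other concentrations are model predictions, not measurements.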


The bagasse assay results shown in Table 5, which lists the percentage of activity remaining in the absence vs. presence of BG, also demonstrate that the percentage activity of the wild type BD29555 is lower than the percentage activity of the R273K/R422K variant, indicating that the R273K/R422K variant is less sensitive to the presence of cellobiose than the wild type. FIG. 2A-2B shows bar graph data for the bagasse assay of BD29555 vs. the R273K/R422K variant. In FIG. 2A, bars represent relative activity, which has been normalized to wild type activity in the absence of cellobiose (WT+BG=uninhibited activity=1). In FIG. 2B, bars indicate tolerance to cellobiose, as represented by the ratio of activity in the presence of cellobiose (−BG) to that of activity in the absence of cellobiose (+BG); ratios close to 1 indicate greater tolerance to cellobiose. These data again demonstrate that the R273K/R422K variant of BD29555 is more tolerant to cellobiose than the wild type BD29555.
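
The two quantities plotted in FIG. 2A and FIG. 2B can be computed from raw activities as sketched below in Python. The activity values are hypothetical placeholders (for example, sugar released in arbitrary units) and the function name is an illustrative assumption; only the normalization (WT+BG=1) and the tolerance ratio (−BG divided by +BG) follow the description above.

```python
def summarize(activity_plus_bg, activity_minus_bg, wt_activity_plus_bg):
    """Return (relative +BG activity, relative -BG activity, tolerance ratio)."""
    if wt_activity_plus_bg <= 0 or activity_plus_bg <= 0:
        raise ValueError("reference activities must be positive")
    relative_plus = activity_plus_bg / wt_activity_plus_bg
    relative_minus = activity_minus_bg / wt_activity_plus_bg
    tolerance = activity_minus_bg / activity_plus_bg
    return relative_plus, relative_minus, tolerance


# Hypothetical raw activities: (+BG, -BG).
raw = {
    "wild type": (10.0, 4.0),
    "variant": (8.0, 7.2),
}

wt_plus_bg = raw["wild type"][0]
summary = {
    name: summarize(plus_bg, minus_bg, wt_plus_bg)
    for name, (plus_bg, minus_bg) in raw.items()
}

for name, (rel_plus, rel_minus, tol) in summary.items():
    print(f"{name}: +BG {rel_plus:.2f}, -BG {rel_minus:.2f}, tolerance {tol:.2f}")
```

Separating the normalized activity from the tolerance ratio keeps a variant with lower absolute activity but higher tolerance from being misread as simply less active.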


The wild type and R273K/R422K variant were also characterized in the PASC assay. Results are shown in FIG. 3. The activities of both wild type BD29555 (SEQ ID NO:1) and wild type T. reesei CBH I (SEQ ID NO:2) were inhibited by cellobiose concentrations starting around 1 g/L (with IC50 values of 2.2 and 3 g/L, respectively), whereas the R273K/R422K variant showed little inhibition in the presence of 10 g/L cellobiose.


Characterization of Product Tolerant Variants of T. reesei CBH I


Cellobiose product tolerant substitutions were introduced into T. reesei CBH I (SEQ ID NO:2). A panel of variants with single and double alanine and lysine substitutions at R268 and R411 was expressed and analyzed. The variants were tested for activity on 4-MUL in the presence and absence of 250 mg/L cellobiose and also in the bagasse assay in the absence and presence of BG. The results from these assays were converted into the percentage activity remaining in the presence and absence of cellobiose and BG, respectively. Values are summarized in Table 6.


The 4-MUL assay results shown in Table 6 demonstrate that the activity of the wild type T. reesei CBH I was reduced to 23% in the presence of cellobiose, whereas the double mutants at R268 and R411 retained more than 90% of their activity under the same conditions.


The bagasse assay results shown in Table 6 demonstrate that the activity of the wild type T. reesei CBH I is more significantly impacted by the presence of BG than is the activity of the single or double substitution variants, indicating that the variants are less sensitive to the accumulation of cellobiose than the wild type. FIGS. 4 and 5 show bar graph data for the bagasse assay of wild type T. reesei CBH I vs. the variants. In FIG. 4, bars represent relative activity, normalized to wild type activity in the absence of cellobiose (WT+BG=1). In FIG. 5, bars represent tolerance to cellobiose, as represented by the ratio of activity in the presence of accumulating cellobiose (−BG) to that of activity in the absence of cellobiose (+BG); ratios close to 1 indicate greater tolerance to cellobiose.


Specific Embodiments and Incorporation by Reference

All publications, patents, patent applications and other documents cited in this application are hereby incorporated by reference in their entireties for all purposes to the same extent as if each individual publication, patent, patent application or other document were individually indicated to be incorporated by reference for all purposes.


While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention(s).












TABLE 1

Columns: Sequence Identifier (SEQ ID NO:); Database Accession Number; Species of Origin; Amino Acid Sequence








BD29555*
Unknown
MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN





TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ





IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN





VEGWTPSSNN ANTGLGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP





DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS





GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA





ARGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA SSTSTSSTST





GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL






340514556

Trichoderma reesei

MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG





NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL





GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE





PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW





NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA





EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV





PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG





YSGPTVCASG TTCQVLNPYY SQCL






51243029

Penicillium occitanis

MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN





TSTNCYTGNT WNSAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ





IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN





VEGWTPSANN ANTGIGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP





DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT STGSLSEIRR YYVQNGVVIP QPSSKISGIS





GNVINSDYCA AEISTFGGTA SFNKHGGLTN MAAGMEAGMV LVMSLWDDYA VNMLWLDSTY PTNATGTPGA





ARGTCATTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSTTT TASRTTTTSA SSTSTSSTST





GTGVAGHWGQ CGGQGWTGPT TCVSGTTCTV VNPYYSQCL






7cel (PDB) &

Trichoderma reesei

ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN





CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL





NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC





SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF





TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT





QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN





IKFGPIGSTG NPSG






67516425

Aspergillus nidulans FGSC A4

MASSFQLYKA LLFFSSLLSA VQAQKVGTQQ AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY

TNCYTGNEWD TSICTSNEVC AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR VYLMDDEDTY





TMFYLLNKEF TFDVDVSELP CGLNGAVYFV SMDADGGKSR YATNEAGAKY GTGYCDSQCP RDLKFINGVA





NVEGWESSDT NPNGGVGNHG SCCAEMDIWE ANSISTAFTP HPCDTPGQTL CTGDSCGGTY SNDRYGGTCD





PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT VVTQFLTDDN TDTGTLSEIK RFYVQNGVVI PNSESTYPAN





PGNSITTEFC ESQKELFGDV DVFSAHGGMA GMGAALEQGM VLVLSLWDDN YSNMLWLDSN YPTDADPTQP





GIARGTCPTD SGVPSEVEAQ YPNAYVVYSN IKFGPIGSTF GNGGGSGPTT TVTTSTATST TSSATSTATG





QAQHWEQCGG NGWTGPTVCA SPWACTVVNS WYSQCL






46107376

Gibberella zeae PH-1

MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN





KWDTSVCTSG KVCAEKCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL





GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ





PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF





NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA





DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST





SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS





GPTACKSGFT CKKINDFYSQ CQ






70992391

Aspergillus fumigatus Af293

MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV

GDYTNCYTGN TWDTTICPDD ATCASNCALE GANYESTYGV TASGNSLRLN FVTTSQQKNI GSRLYMMKDD





STYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN





GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG





TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTSSGTLK EIKRFYVQNG KVIPNSESTW





TGVSGNSITT EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW DDHSANMLWL DSNYPTTASS





TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG STFNSGGSNP GGGTTTTTTT QPTTTTTTAG





NPGGTGVAQH YGQCGGIGWT GPTTCASPYT CQKLNDYYSQ CL






121699984

Aspergillus clavatus NRRL 1

MLPSTISYRI YKNALFFAAL FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI DANWRWVHDV

KGYTNCYTGN TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK FVTQSQQKNI GSRLYMMKDD





NTYETFKLLN QEFTFDVDVS NLPCGLNGAL YFVSMDADGG LSRYTGNEAG AKYGTGYCDS QCPRDLKFIN





GLANVEGWTP SSSDANAGNG GHGSCCAEMD IWEANSISTA YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG





TCDPDGCDFN SYRQGNKSFY GPGMTVDTKK KMTVVTQFLT NDGTATGTLS EIKRFYVQDG KVIANSESTW





PNLGGNSLTN DFCKAQKTVF GDMDTFSKHG GMEGMGAALA EGMVLVMSLW DDHNSNMLWL DSNSPTTGTS





TTPGVARGSC DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA TSTTTTKATT





TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL






1906845

Claviceps purpurea

MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR SGCRSVQGAV TVDANWLWTT VDGSQNCYTG





NRWDTSICSS EKTCSESCCI DGADYAGTYG VTTTGDALSL KFVQQGPYSK NVGSRLYLMK DESRYEMFTL





LGNEFTFDVD VSKLGCGLNG ALYFVSMDED GGMKRFPMNK AGAKFGTGYC DSQCPRDVKF INGMANSKDW





IPSKSDANAG IGSLGACCRE MDIWEANNIA SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD





FNSYRLGNTT FYGPGPKFTI DTTRKISVVT QFLKGRDGSL REIKRFYVQN GKVIPNSVSR VRGVPGNSIT





QGFCNAQKKM FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW LDSTYPTNSR QRGSKRGSCP





ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK






1gpi (PDB) &

Phanerochaete chrysosporium

ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN

CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL





NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC





SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF





TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT





QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN





IKFGPIGSTG NPSG






119468034

Neosartorya fischeri NRRL 181

MHQRALLFSA LAVAANAQQV GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG

NTWNTELCPD NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNNE





FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWKPSS





NDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSATRYAGTC DPDGCDFNPF





RMGNESFYGP GKIVDTKSEM TVVTQFITAD GTDTGALSEI KRLYVQNGKV IANSVSNVAD VSGNSISSDF





CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG





AGDPEKVESQ HPDASVTFSN IKFGPIGSTY KA






7804883

Leptosphaeria maculans

MYRSLIFATS LLSLAKGQLV GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS

CATNCAIDGA DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST YQLFKFTGSQ EFTFDVDLSN





LPCGLNGALY FVSMDADGGL KKYPTNKAGA KYGTGYCDAQ CPRDLKFING EGNVEGWQPS KNDQNAGVGG





HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ SRCDGDGCGG TYSADRYAGV CDPDGCDFNS YRMGVKDFYG





KGKTVDTSKK FTVVTQFIGS GDAMEIKRFY VQNGKTIPQP DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF





KDKGGMANMP STCNGMVLVM SLWDDHYSNM LWLDSTYPTD KNPDTDAGSG RGECAITSGV PADVESQHPD





ASVIYSNIKF GPINTTFG






85108032

Neurospora crassa N150 (OR74A)

MLAKFAALAA LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT SGSTNCYSGN

EWDTSLCSTN TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN IGSRTYLMNG ADAYQGFELL





GNEFTFDVDV SGTGCGLNGA LYFVSMDLDG GKAKYTNNKA GAKYGTGYCD AQCPRDLKYI NGIANVEGWT





PSTNDANAGI GDHGTCCSEM DIWEANKVST AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF





NSYRMGNTTF YGEGKTVDTS SKFTVVTQFI KDSAGDLAEI KRFYVQNGKV IENSQSNVDG VSGNSITQSF





CNAQKTAFGD IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS TYPVEGGPGA YRGECPTTSG





VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS NPSGTGAAHW





AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V






169859458

Coprinopsis cinerea okayama

MFKKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY

TGNSWNSTVC SDPTTCAQRC ALEGANYQQT YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN





LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSAD





WTPSETDPNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD





YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD





SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA





RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY






154292161

Botryotinia fuckeliana B05-10

MYSAAVLATF SFLLGAGAQQ VGTSTAETHP ALTVQKCAAG GTCTDESDSI VLDANWRWLH STSGSTNCYT

GNTWDTTLCP DAATCTTNCA LDGADYEGTY GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN





EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG TANVEGWVPD





SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHVCTVDSQ TACTGDDCAS NTGVCDGDGC DFNPYRMGNT





TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPS SDISGVSGNS ITDDFCAAQK





TAFGDTDYFT QNGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATDSGVP





ATVEAASGSA YVTFSSIKYG PIGSTFNAPA DSSSSVSASS SPAPIASSSS SASIAPVSSV VAAIVSSSAQ





AISSAAPVVS SSAQAISSAA PVVSSVVSSA APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS





AASSTGSVPL YGNCTGGKTC SEGTCVVQND YYSQCVASS






169615761 #

Phaeosphaeria nodorum SN15

MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF DGNEWNKTAC PSNAACTKNC AIEGSDYRGT

YGITTSGNSL TLKFITKGQY STNVGSRTYL MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP





EKGQGTPGAK YGTGKLSQCS VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV





WEANSMSTAL TPHSCQPEGY AVCEESNCGG TYSLDRYAGT CDANGCDFNP YRVGNKDFYG KGKTVDTSKK





MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG NSITQKWCDT QKEVFKEEVY PFNQWGGMAS





MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG AARGECAITS GAPAEVEANN PDASVMFSNI





KFGPIGSTFQ QPA






4883502

Humicola grisea

MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC





YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF





TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE





GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN





GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC





DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS





GVPAEVEAQY PNAQVVWSNI RFGPIGSTVN V






950686

Humicola grisea

MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT





GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE





LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG





WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC





DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ





DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP





TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK





AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW YSQCL






124491660

Chaetomium thermophilum

MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN

CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM





FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI





EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN





GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC





TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YANMLWLDSV YPPEKAGQPG AERGPCAPTS





GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V






58045187

Chaetomium thermophilum

MMYKKFAALA ALVAGAAAQQ ACSLTTETHP RLTWKRCTSG GNCSTVNGAV TIDANWRWTH TVSGSTNCYT

GNEWDTSICS DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE





LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANIEN





WTPSTNDANA GFGRYGSCCS EMDIWDANNM ATAFTPHPCT IIGQSRCEGN SCGGTYSSER YAGVCDPDGC





DFNAYRQGDK TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ





EWCDAQKVAF GDIDDFNRKG GMAQMSKALE GPMVLVMSVW DDHYANMLWL DSTYPIDKAG TPGAERGACP





TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSTP SNPTATVAPP TSTTTSVRSS TTQISTPTSQ





PGGCTTQKWG QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL






169601100 #

Phaeosphaeria nodorum SN15

MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA DYKGTYGITA

SGNSLQLKFI TKGSYSTNIG SRTYLMASDT AYQMFKFDGN KEFTFDVDLS GLPCGFNGAL YFVSMDEDGG





LKKYSGNKAG AKYGTGYCDA QCPRDLKFIN GEGNVEGWKP SDNDANAGVG GHGSCCAEMD IWEANSISTA





VTPHACSTIE QTRCDGDGCG GTYSADRYAG VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK KFTVVTQFIG





TGDAMEIKRF YVQGGKTIEQ PASTIPGVEG NSITTKFCDQ QKQVFGDRYT YKEKGGTANM AKALAQGMVL





VMSLWDDHYS NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS PDATVIYSNI KFGPLNSTY






169870197

Coprinopsis cinerea okayama

MLGKIAIASL SFLAIAKGQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY

TGNSWNSSVC SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG TNVGGRVYLM ENENRYQMFN





LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSSQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSVG





WEPSETDSNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD





YNPFRMGNKD FYGPGKTIDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD





SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA





RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY






3913806

Agaricus bisporus

MFPRSILLAL SLTAVALGQQ VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR INDFTNCYTG





NEWDTSICPD GVTCAENCAL DGADYAGTYG VTSSGTALTL KFVTESQQKN IGSRLYLMAD DSNYEIFNLL





NKEFTFDVDV SKLPCGLNGA LYFSEMAADG GMSSTNTAGA KYGTGYCDSQ CPRDIKFIDG EANSEGWEGS





PNDVNAGTGN FGACCGEMDI WEANSISSAY TPHPCREPGL QRCEGNTCSV NDRYATECDP DGCDFNSFRM





GDKSFYGPGM TVDTNQPITV VTQFITDNGS DNGNLQEIRR IYVQNGQVIQ NSNVNIPGID SGNSISAEFC





DQAKEAFGDE RSFQDRGGLS GMGSALDRGM VLVLSIWDDH AVNMLWLDSD YPLDASPSQP GISRGTCSRD





SGKPEDVEAN AGGVQVVYSN IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG QGWTGPTACQ





SPSTCHVIND FYSQCF






169611094

Phaeosphaeria nodorum SN15

MYRNLALASL SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG KITLDANWRW THVTTGYTNC

YDGNSWNTTA CPDGATCTKN CAVDGADYSG TYGITTSSNS LSIKFVTKGS NSANIGSRTY LMESDTKYQM





FNLIGQEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE





GWNPSDADPN AGSGKIGACC PEMDIWEANS ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC





DFNSYRMGVK DFYGPGATLD TTKKMTVVTQ FLGSGSTLSE IKRFYVQNGK VFKNSDSAIE GVTGNSITES





FCAAQKTAFG DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD STYPTNSTKL GAQRGTCAID





SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS






3131

Phanerochaete chrysosporium

MVDIQIATFL LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL

TANGWDPTLC PDGITCANYC ALDGVSYSST YGITTSGSAL RLQFVTGTNI GSRVFLMADD THYRTFQLLN





QELAFDVDVS KLPCGLNGAL YFVAMDADGG KSKYPGNRAG AKYGTGYCDS QCPRDVQFIN GQANVQGWNA





TSATTGTGSY GSCCTELDIW EANSNAAALT PHTCTNNAQT RCSGSNCTSN TGFCDADGCD FNSFRLGNTT





FLGAGMSVDT TKTFTVVTQF ITSDNTSTGN LTEIRRFYVQ NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK





KAFIETNYFA QHGGLAQLGQ ALRTGMVLAF SISDDPANHM LWLDSNFPPS ANPAVPGVAR GMCSITSGNP





ADVGILNPSP YVSFLNIKFG SIGTTFRPA






70991503

Aspergillus fumigatus Af293

MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG

NTWNTELCPD NESCAQNCAL DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE





FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS





SDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF





RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI KRLYVQNGKV IANSVSNVAG VSGNSITSDF





CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG





AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG






294196

Phanerochaete chrysosporium

MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT

GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT





SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF





LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT





AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA





QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG





IGYTGSTTCA SPYTCHVLNP YYSQCY






18997123

Thermoascus aurantiacus

MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG

NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL





GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ





PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF





NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT





TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT





CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN






4204214

Humicola grisea var. thermoidea

MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC

YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF





TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE





GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN





GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC





DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS





GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V






34582632

Trichoderma viride (also known as Hypocrea rufa)

MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG

NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL

GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE





PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW





DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA





EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV





PAQVESQSPN AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG





YSGPTVCASG TTCQVLNPYY SQCL






156712284

Thermoascus aurantiacus

MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG

NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL





GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ





PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF





NPYRQGNHSF YGPGQIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT





TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT





CPTTSGVPAD VESQYPNSYV IYSNIKVGPI NSTFTAN






39977899

Magnaporthe grisea (oryzae) 70-15

MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT IDANWRWTHT TSGYTNCYTG

NKWDTSICST NADCASKCCV DGANYQQTYG ASTSGNALSL QYVTQSSGKN VGSRLYLLES ENKYQMFNLL





GNEFTFDVDA SKLGCGLNGA VYFVSMDADG GQSKYSGNKA GAKYGTGYCD SQCPRDLKYI NGAANVEGWQ





PSSGDANSGV GNMGSCCAEM DIWEANSIST AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA GDCDPDGCDF





NSYRQGNRTF YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD IKRFYVQNGK VIPNSQSTIT GVTGNSVTQD





YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD DHHSQMLWLD STYPTTSTAP GAARGSCSTS





SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF KSS






20986705

Talaromyces emersonii

MLRRALLLSS SAILAVKAQQ AGTATAENHP PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT





GNTWDPTYCP DDETCAQNCA LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS TYQIFKLLNR





EFSFDVDVSN LPCGLNGALY FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ CPRDLKFIDG EANVEGWQPS





SNNANTGIGD HGSCCAEMDV WEANSISNAV TPHPCDTPGQ TMCSGDDCGG TYSNDRYAGT CDPDGCDFNP





YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD DGTDTGTLSE IKRFYIQNSN VIPQPNSDIS GVTGNSITTE





FCTAQKQAFG DTDDFSQHGG LAKMGAAMQQ GMVLVMSLWD DYAAQMLWLD SDYPTDADPT TPGIARGTCP





TDSGVPSDVE SQSPNSYVTY SNIKFGPINS TFTAS






22138843

Aspergillus oryzae

MHQRALLFSA FWTAVQAQQA GTLTAETHPS LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG





NTWDATLCPD NESCASNCAL DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN





EFTFDVDASK LPCGLNGAVY FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING QVRKGWEPSD





SDKNAGVGGH GSCCPQMDIW EANSISTAYT PHPCDDTAQT MCEGDTCGGT YSSERYAGTC DPDGCDFNAY





RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI KRFYVQGGKV IANAASNVDG VTGNSITADF





CTAQKKAFGD DDIFAQHGGL QGMGNALSSM VLTLSIWDDH HSSMMWLDSS YPEDADATAP GVARGTCEPH





AGDPEKVESQ SGSATVTYSN IKYGPIGSTF DAPA






55775695

Penicillium chrysogenum

MASTLSFKIY KNALLLAAFL GAAQAQQVGT STAEVHPSLT WQKCTAGGSC TSQSGKVVID SNWRWVHNTG

GYTNCYTGND WDRTLCPDDV TCATNCALDG ADYKGTYGVT ASGSSLRLNF VTQASQKNIG SRLYLMADDS





KYEMFQLLNQ EFTFDVDVSN LPCGLNGALY FVAMDEDGGM ARYPTNKAGA KYGTGYCDAQ CPRDLKFING





QANVEGWEPS SSDVNGGTGN YGSCCAEMDI WEANSISTAF TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT





CDPDGCDFNP YRMGNQSFYG PSKIVDTESP FTVVTQFITN DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS





AVTGNSITDS FCSAQKTAFK DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD DHAANMLWLD STYPTSASST





TPGAARGSCD ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG TTTTKVTTTT ATKTTTTTGP





STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL






171676762

Podospora anserina

MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV TLDSNWRWTH TLQGSTNCYS





GNEWDTSICT TGTKCAQNCC VEGAEYAATY GITTSGNQLN LKFVTEGKYS TNVGSRTYLM ENATKYQGFN





LLGNEFTFDV DVSNIGCGLN GALYFVSMDL DGGLAKYSGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG





WNPSTNDVNA GAGRYGTCCS EMDIWEANNM ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC





DFNSYRMGNK EFYGKGKTVD TTKKMTVVTQ FLKNAAGELS EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT





QDWCDAQKIA FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW LDSTYPVDAA GRPGAERGAC





PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT PTSTRVSSTS TAAQTPTSAP





GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL






146350520

Pleurotus sp. Florida

MFPYIALVSF SFLSVVLAQQ VGTLTAETHP QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT





GNTWDTSLCP DAATCSRNCA LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL ADDSHYQMFN





LKNKEFTFDV DVSQLPCGLN GALYFSQMDA DGGTGRFPNN KAGAKYGTGY CDSQCPHDIK FINGEANVQG





WQPSPNDSNA GKGQYGSCCA EMDIWEANSM ASAYTPHPCT VTTPTRCQGN DCGDGDNRYG GVCDKDGCDF





NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL TSDNTTSGTL SEIRRLYVQN GRVIQNSKVN IPGMASTLDS





ITESFCSTQK TVFGDTNSFA SKGGLRAMGN AFDKGMVLVL SIWDDHEAKM LWLDSNYPLD KSASAPGVAR





GTCATTSGEP KDVESQSPNA QVIFSNIKYG DIGSTYSN






37732123

Gibberella zeae

MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN





KWDTSVCTSG KVCAERCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL





GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ





PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF





NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA





DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST





SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS





GPTACKSGFT CKKINDFYSQ CQ






156055188

Sclerotinia sclerotiorum 1980

MYSAAVLATF SFLLGAGAQQ VGTLKTESHP PLTIQKCAAG GTCTDEADSV VLDANWRWLH STSGSTNCYT

GNTWDTTLCP DAATCTANCA FDGADYEGTY GITSSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN





EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVSG GANNEGWVPD





SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHTCTVDGQ TACTGDDCAG NTGVCDADGC DFNPYRMGNT





TFYGSGKTID TTKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPN SDISGVSGNS ITDDFCTAQK





TAFGDTDYFS QKGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATTSGVP





ATVEAASGSA YVTFSSIKYG PIGSTFKAPA DSSSPVVASS SPAAVAAVVS TSSAQAVPSH PAVSSSQAAV





STPEAVSSAP EVPASSSAAQ SVAPTSTKPK CSKVSQSSTL ATSVAAPATT ATSAAVAATS AASSSGSVPL





YGNCTGGKTC SEGTCVVQNP WYSQCVASS






453224

Phanerochaete chrysosporium

MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT

GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET





GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF





LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT





AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS





DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS





TTCASPYTCH VLNPYYSQCY






50402144

Trichoderma reesei

MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG





NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL





GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE





PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW





NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA





EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV





PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNRG TTTTRRPATT TGSSPGPTQS HYGQCGGIGY





SGPTVCASGT TCQVLNPYYS QCL






115397177

Aspergillus terreus NIH2624

MPSTYDIYKK LLLLASFLSA SQAQQVGTSK AEVHPSLTWQ TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY

NNCYTGNTWD TTLCPDDETC ASNCALEGAD YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY





KMFKLLNQEF TFDVDVSNLP CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP RDLKFINGMA





NVEGWEPSAN DANAGTGNHG SCCAEMDIWE ANSISTAYTP HPCDTPGQVM CTGDSCGGTY SSDRYGGTCD





PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG TASGTLSEIK RFYVQNGKVI PNSESTWSGV





SGNSITTAYC NAQKTLFGDT DVFTKHGGME GMGAALAEGM VLVLSLWDDH NSNMLWLDSN YPTDKPSTTP





GVARGSCDIS SGDPKDVEAN DANAYVVYSN IKVGPIGSTF SGSTGGGSSS STTATSKTTT TSATKTTTTT





TKTTTTTSAS STSTGGAQHW AQCGGIGWTG PTTCVAPYTC QKQNDYYSQC L






154312003

Botryotinia fuckeliana B05-10

MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG CTTSAQSIVV DANWRWLHST TGSTNCYTGN

TWDKTLCPDG ATCAANCALD GADYSGVYGI TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL





LNQEFTFDVD VSNLPCGLNG ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW





TPSSNDVNAG AGQYGSCCSE MDIWEANKIS AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS





YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE IRRFYVQNGV VIPNSQSTIA GVPGNSITDS





FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD APYPATKSPS APGVTRGSCS





ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS VPGTTTVSSA PASTATSGAG GVAKYAQCGG





SGYSGATACV SGSTCVALNP YYSQCQ






49333365

Volvariella volvacea

MFPAATLFAF SLFAAVYGQQ VGTQLAETHP RLTWQKCTRS GGCQTQSNGA IVLDANWRWV HNVGGYTNCY





TGNTWNTSLC PDGATCAKNC ALDGANYQST YGITTSGNAL TLKFVTQSEQ KNIGSRVYLL ESDTKYQLFN





PLNQEFTFDV DVSQLPCGLN GAVYFSAMDA DGGMSKFPNN AAGAKYGTGY CDSQCPRDIK FINGEANVQG





WQPSPNDTNA GTGNYGACCN EMDVWEANSI STAYTPHPCT QQGLVRCSGT ACGGGSNRYG SICDPDGCDF





NSFRMGDKSF YGPGLTVNTQ QKFTVVTQFL TNNNSSSGTL REIRRLYVQN GRVIQNSKVN IPGMPSTMDS





VTTEFCNAQK TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL SIWDDHAANM LWLDSNYPTD RPASQPGVAR





GTCPTSSGKP SDVENSTANS QVIYSNIKFG DIGSTYSA






729650

Penicillium janthinellum

MKGSISYQIY KGALLLSALL NSVSAQQVGT LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG

STNCYTGNTW DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY





QKFNLLNQEF TFDVDVSNLP CGLNGALYFV DMDADGGMAK YPTNKAGAKY GTGYCDSQCP RDLKFINGQA





NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE ANSISNAVTP HPCDTPSQTM CTGQRCGGTY STDRYGGTCD





PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV





SGNSITDSWC NAQKSAFGDT NEFSKHGGMA GMGAGLADGM VLVMSLWDDH ASDMLWLDST YPTNATSTTP





GAKRGTCDIS RRPNTVESTY PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS SSSKTTTTVT





TTTTSSGSSG TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL






146424871

Pleurotus sp. Florida

MFRTAALTAF TLAAVVLGQQ VGTLTAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS LPVHTNCYTG





NAWDASLCPD PTTCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGPYSK NIGSRVYLLD DADHYKMFDL





KNQEFTFDVD MSGLPCGLNG ALYFSEMPAD GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW INGEANILDW





SASATDANAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN





SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTSSGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS





DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA





CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP VTSTTSSGPT TPTGPTGTVP





KWGQCGGNGY SGPTTCVAGS TCTYSNDWYS QCL






67538012

Aspergillus nidulans FGSC A4

MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG

NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE





FTFDVDVSNL PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD





SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY





RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ





KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP





DVVESEHADA SVTFSNIKFG PIGSTF






62006162

Fusarium poae

MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG CSNVQGSVTI DANWRWTHQV SGSTNCHTGN





KWDTSVCTSG KVCAEKCCVD GADYASTYGI TSSGNQLSLS FVTKGSYGTN IGSRTYLMED ENTYQMFQLL





GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWE





PSKSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF





NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS





DFCTTQKKVF GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTA LGSQRGSCST





SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP TNPTNPNPTN PTNPGTVDQW GQCGGTNYSG





PTACKSPFTC KKINDFYSQC Q






146424873

Pleurotus sp. Florida

MFRTAALTAF TLAAVVLGQQ VGTLAAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS TAGATNCYTG





NAWDSSLCPN PTTCATNCAI DGADYSGTYG ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL





KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW





SGSATDPNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN





SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS





DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA





CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSVPP VTSTTSSGPT TPTGPTGTVP





KWGQCGGIGY SGPTSCVAGS TCTYSNEWYS QCL






295937

Trichoderma viride

MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG





NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL





GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE





PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDSC GGTYSGDRYG GTCDPDGCDW





NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA





EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT DETSSTPGAV RGSSSTSSGV





PAQLESNSPN AKVVYSNIKF GPIGSTGNPS GGNPPGGNPP GTTTPRPATS TGSSPGPTQT HYGQCGGIGY





IGPTVCASGS TCQVLNPYYS QCL






6179889 #

Alternaria alternata

MTWQSCTAKG SCTNKNGKIV IDANWRWLHK KEGYDNCYTG NEWDATACPD NKACAANCAV DGADYSGTYG





ITAGSNSLKL KFITKGSYST NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV DVSNLPCGFN GALYFVSMDA





DGGLKKYSTN KAGAKYGTGY CDAQCPRDLK FINGEGNVEG WKPSSNDANA GVGGHGSCCA EMDIWEANSV





STAVTPHSCS TIEQSRCDGD GCGGTYSADR YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD TSKKFTVVTQ





FIGTGDAMEI KRFYVQNGKT IAQPASAVPG VEGNSITTKF CDQQKAVFGD TYTFKDKGGM ANMAKALANG





MVLVMSLWDD HYSNMLWLDS TYPTDKNPDT DLGTGRGECE TSSGVPADVE SQHADATVVY SNIKFGPLNS





TFG






119483864

Neosartorya fischeri NRRL 181

MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS CATNQGSVVM DANWRWVHQV

GSTTNCYTGN TWDTSICDTD ETCATECAVD GADYESTYGV TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD





NTHYQMFKLL NQEFTFDVDV SNLPCGLNGA LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI





QGQANVEGWT PSSNNENTGL GNYGSCCAEL DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA





GTCDPDGCDF NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI TDDGTDTGTL SEIRRYYVQN GVTYAQPDSD





ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL WDDYYADMLW LDSTYPTNAS





SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS GSSGSGSSGS ASTSTTSTKT





TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY YSQCL






85083281

Neurospora crassa OR74A

MKAYFEYLVA ALPLLGLATA QQVGKQTTET HPKLSWKKCT GKANCNTVNA EVVIDSNWRW LHDSSGKNCY

DGNKWTSACS SATDCASKCQ LDGANYGTTY GASTSGDALT LKFVTKHEYG TNIGSRFYLM NGASKYQMFT





LMNNEFAFDV DLSTVECGLN AALYFVAMEE DGGMASYSSN KAGAKYGTGY CDAQCARDLK FVGGKANIEG





WTPSTNDANA GVGPYGGCCA EIDVWESNAH SFAFTPHACK TNKYHVCERD NCGGTYSEDR FAGLCDANGC





DYNPYRMGNT DFYGKGKTVD TSKKFTVVSR FEENKLTQFF VQNGQKIEIP GPKWDGIPSD NANITPEFCS





AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV LVMSIWDDHY ANMLWLDSVY PPEKEGQPGA ARGDCPQSSG





VPAEVESQYA NSKVVYSNIR FGPVGSTVNV






3913803

Cryphonectria parasitica

MFSKFALTGS LLAGAVNAQG VGTQQTETHP QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT

GNTWNTTLCP DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL





IGNEFAFDVD LSKLPCGLNG ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF INGQGNVEGW





TPSTNDANAG VGGLGSCCSE MDVWEANSMD MAYTPHPCET AAQHSCNADE CGGTYSSSRY AGDCDPDGCD





WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI SQYYIQGGTK IQQPNSTWPT LTGYNSITDD





FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD GMVLVMSLWD DHYANMLWLD STYPVDADAS SPGKQRGTCA





TTSGVPADVE SSDASATVIY SNIKFGPIGA TY






60729633

Corticium rolfsii

MFPAAALLSF TLLAVASAQQ IGTNTAEVHP SLTVSQCTTS GGCTSSTQSI VLDANWRWLH STSGYTNCYT





GNQWNSDLCP DPDTCATNCA LDGASYESTY GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL





LNKEFSFDVD ASNIGCGING AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF INGEANLLDW





NATSANSGTG SYGSCCPEMD IWEANKYAAA YTPHPCSVSG QTRCTGTSCG AGSERYDGYC DKDGCDFNSW





RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI RRLYVQGGTV IQNSVANQPN IPKVNSITDS





FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD MVLVMSIWDD YDAEMLWLDS NYPTSGSAST PGISRGPCSA





TSGLPATVES QQASASVTYS NIKWGDIGST YSGSGSSGSS SSSSSSAASA STSTHTSAAA TATSSAAAAT





GSPVPAYGQC GGQSYTGSTT CASPYVCKVS NAYYSQCLPA






39971383

Magnaporthe grisea 70-15

MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV IDGNWRWIHN IGGYENCYSG

NKWTSVCSTN ADCATKCAME GAKYQETYGV STSGDALTLK FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK





NNEFAFDVDL SSVECGMNSA LYFVPMKEDG GMSTEPNNKA GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ





PSSTDSSAGI GAQGACCAEI DIWESNKNAF AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY





NPYRMGNPDF YGPGKTIDTN RKFTVISRFE NNRNYQILMQ DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ





FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP EKAGQPGSAR GPCPADGGDP





NGVVNQYPNA KVIWSNVRFG PIGSTYQVD






39973029

Magnaporthe grisea 70-15

MQLTKAGVFL GALMGGAAAQ QVGTQTAENH PKMTWKKCTG KASCTTVNGE VVIDANWRWL HDASSKNCYD

GNRWTDSCRT ASDCAAKCSL EGADYAKTYG ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL





LGNEFAFDVD LSTIECGLNS ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF VGGKANIEGW





KPSSNDANAG VGPYGACCAE IDVWESNAHA FAFTPHPCTD NKYHVCQDSN CGGTYSDDRF AGKCDANGCD





INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV QNNKRIDMPS PALEGLPATG AITAEYCTNV





FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV MSIWDDHYSN MLWLDSVYPP DKEGSPGAAR GDCPQDSGVP





SEVESQIPGA TVVWSNIRFG PVGSTVNV






1170141

Fusarium oxysporum

MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG CSDVKGSVVI DANWRWTHQT SGSTNCYTGN





KWDTSICTDG KTCAEKCCLD GADYSGTYGI TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL





GNEFTFDVDV SGIGCGLNGA PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI NGVANSEGWK





PSDSDVNAGV GNLGTCCPEM DIWEANSIST AFTPHPCTKL TQHSCTGDSC GGTYSSDRYG GTCDADGCDF





NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS





DFCSKQKSVF GDIDDFSKKG GWNGMSDALS APMVLVMSLW HDHHSNMLWL DSTYPTDSTK VGSQRGSCAT





TSGKPSDLER DVPNSKVSFS NIKFGPIGST YKSDGTTPNP PASSSTTGSS TPTNPPAGSV DQWGQCGGQN





YSGPTTCKSP FTCKKINDFY SQCQ






121710012

Aspergillus clavatus NRRL 1

MYQRALLFSA LATAVSAQQV GTQKAEVHPA LTWQKCTAAG SCTDQKGSVV IDANWRWLHS TEDTTNCYTG

NEWNAELCPD NEACAKNCAL DGADYSGTYG VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE





FTFDVDVSNL PCGLNGALYF VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE ANVEGWKPSD





NDKNAGVGGY GSCCPEMDIW EANSISTAYT PHPCDGMEQT RCDGNDCGGT YSSTRYAGTC DPDGCDFNSF





RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI RRVYVQGGKV IGNSASNVAG VEGDSITSDF





CTAQKKAFGD EDIFSKHGGL EGMGKALNKM ALIVSIWDDH ASSMMWLDST YPVDADASTP GVARGTCEHG





LGDPETVESQ HPDASVTFSN IKFGPIGSTY KSV






17902580

Penicillium funiculosum

MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN

TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ





IFDLLNQEFT FTVDVSNLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN





VEGWTPSTNN SNTGIGNHGS CCAELDIWEA NSISEALTPH PCDTPGLTVC TADDCGGTYS SNRYAGTCDP





DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTDDGT SSGSLSEIRR YYVQNGVVIP QPSSKISGIS





GNVINSDFCA AELSAFGETA SFTNHGGLKN MGSALEAGMV LVMSLWDDYS VNMLWLDSTY PANETGTPGA





ARGSCPTTSG NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA STTSTSSTST





GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL






1346226

Humicola grisea var. thermoidea

MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT

GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQHS TNVGSRTYLM DGEDKYQTFE





LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG





WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC





DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ





DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP





TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK





AGRWQQCGGI GFTGPTQCEE PYICTKLNDW YSQCL






156712282

Chaetomium thermophilum

MMYKKFAALA ALVAGASAQQ ACSLTAENHP SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT

GNQWDTSLCT DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE





LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANVGN





WTPSTNDANA GFGRYGSCCS EMDVWEANNM ATAFTPHPCT TVGQSRCEAD TCGGTYSSDR YAGVCDPDGC





DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ





EYCDAQKVAF SNTDDFNRKG GMAQMSKALA GPMVLVMSVW DDHYANMLWL DSTYPIDQAG APGAERGACP





TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS STSSPVSTPT





GQPGGCTTQK WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL






169768818

Aspergillus oryzae RIB40

MASLSLSKIC RNALILSSVL STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD ANWRWVHQTG

SSSNCYTGNK WDTSYCSTND ACAQKCALDG ADYSNTYGIT TSGSEVRLNF VTSNSNGKNV GSRVYMMADD





THYEVYKLLN QEFTFDVDVS KLPCGLNGAL YFVVMDADGG VSKYPNNKAG AKYGTGYCDS QCPRDLKFIQ





GQANVEGWVS STNNANTGTG NHGSCCAELD IWESNSISQA LTPHPCDTPT NTLCTGDACG GTYSSDRYSG





TCDPDGCDFN PYRVGNTTFY GPGKTIDTNK PITVVTQFIT DDGTSSGTLS EIKRFYVQDG VTYPQPSADV





SGLSGNTINS EYCTAENTLF EGSGSFAKHG GLAGMGEAMS TGMVLVMSLW DDYYANMLWL DSNYPTNEST





SKPGVARGTC STSSGVPSEV EASNPSAYVA YSNIKVGPIG STFKS






46241270

Gibberella pulicaris

MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN





KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL





GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ





PSKSDVNAGI GNMGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF





NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP





EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST





SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA QCGGTNYSGP





TACKSPFTCK KINDFYSQCQ






49333363

Volvariella volvacea

MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP SLNWARCTSS GCTNVAGSVT LDANWRWLHT TSGYTNCYTG





NSWNTTLCPD GATCAQNCAL DGANYQSTCG ITTSGNALTL KFVTQGEQKN IGSRVYLMAS ESRYEMFGLL





NKEFTFDVDV SNLPCGLNGA LYFSSMDADG GMAKNPGNKA GAKYGTGYCD SQCPRDIKFI NGEANVAGWN





GSPNDTNAGT GNWGACCNEM DIWEANSISA AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC DPDGCDFNSY





RMGDKTYYGP GGTGVDTRSK FTVVTQFLTN NNSSSGTLSE IRRLYVQNGR VVQNSKVNIP GMSNTLDSIT





TGFCDSQKTA FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV WDDHAANMLW LDSNYPVDAD PSKPGIARGT





CSTTSGKPTD VEQSAANSSV TFSNIKFGDI GTTYTGGSVT TTPGNPGTTT STAPGAVQTK WGQCGGQGWT





GPTRCESGST CTVVNQWYSQ CI






46395332

Irpex lacteus

MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG





QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE





FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVAGWTGSS





SDPNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQD ANRYKGVCDP DGCDFNSFRM





GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR FYVQDGKVIP NSKVNIAGCD AVNSITDKFC





TQQKTAFGDT NRFADQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASKP GVARGTCPNT





SGVPKDVESQ SGSATVTYSN IKWGDLNSTF SGTASNPTGP SSSPSGPSSS SSSTAGSQPT QPSSGSVAQW





GQCGGIGYSG ATGCVSPYTC HVVNPYYSQC Y






50844407 #

Chaetomium thermophilum var. thermophilum

TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS TNCYTGNEWD TSICSDGKSC AQTCCVDGAD

YSSTYGITTS GDSLNLKFVT KHQHGTNVGS RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF

VSMDADGGMS KYSGNKAGAK YGTGYCDAQC PRDLKFINGE ANIENWTPST NDANAGFGRY GSCCSEMDIW





EANNMATAFT PHPCTIIGQS RCEGNSCGGT YSSERYAGVC DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM





TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN AESKIPGNPG NSITQEWCDA QKVAFGDIDD FNRKGGMAQM





SKALEGPMVL VMSVWDDHYA NMLWLDSTYP IDKAGTPGAE RGACPTTSGV PAEIEAQVPN SNVIFSNIRF





GPIGSTVPGL DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS TPTSQPGGCT TQKWGQCGGI GYTGCTNCVA





GTTCTELNPW YSQCL






4586347

Irpex lacteus

MFHKAVLVAF SLVTIVHGQQ AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT





GNTWDASICS DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET NYQMFKLMNQ





EFTFDVDVSN LPCGLNGAVY FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ CPQDIKFING EANIVDWTAS





AGDANSGTGS FGTCCQEMDI WEANSISAAY TPHPCTVTEQ TRCSGSDCGQ GSDRFNGICD PDGCDFNSFR





MGNTEFYGKG LTVDTSQKFT IVTQFISDDG TADGNLAEIR RFYVQNGKVI PNSVVQITGI DPVNSITEDF





CTQQKTVFGD TNNFAAKGGL KQMGEAVKNG MVLALSLWDD YAAQMLWLDS DYPTTADPSQ PGVARGTCPT





TSGVPSQVEG QEGSSSVIYS NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS QPAQPTQPAG





TAAQWAQCGG MGFTGPTVCA SPFTCHVLNP YYSQCY






3980202

Phanerochaete chrysosporium

MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT

GNEWNTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET





GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDHGDGCD FNSFRMGDKT





FLGKGMTVDT SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK





TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL SIWDDHAANM LWLDSDYPTD KDPSAPGVAR GTCATTSGVP





SDVESQVPNS QVVFSNIKFG DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP PPTGPTVPQW GQCGGIGYSG





STTCASPYTC HVLNPYYSQC Y






27125837

Melanocarpus albomyces

MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV NAEVVIDANW RWLHDDNMQN

CYDGNQWTNA CSTATDCAEK CMIEGAGDYL GTYGASTSGD ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ





MFNLMGNELA FDVDLSTVEC GINSALYFVA MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN





IEGWKSSTSD PNAGVGPYGS CCAEIDVWES NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA





NGCDYNPYRM GNPDFYGKGK TLDTSRKFTV VSRFEENKLS QYFIQDGRKI EIPPPTWEGM PNSSEITPEL





CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS IYPPEKEGQP GAARGDCPTD





SGVPAEVEAQ FPDAQVVWSN IRFGPIGSTY DF






171696102

Podospora anserina

MYRSATFLTF ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL DANWRWTHIT NGYTNCYTGN





EWNATACPDG ATCAKNCAVD GADYSGTYGI TTPSSGALRL QFVKKNDNGQ NVGSRVYLMA SSDKYKLFNL





LNKEFTFDVD VSKLPCGLNG AVYFSEMLED GGLKSFSGNK AGAKYGTGYC DSQCPQDIKF INGEANVEGW





GGADGNSGTG KYGICCAEMD IWEANSDATA YTPHVCSVNE QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY





RLGNREFYGP GKTVDTTRPF TIVTQFVTDD GTDSGNLKSI HRYYVQDGNV IPNSVTEVAG VDQTNFISEG





FCEQQKSAFG DNNYFGQLGG MRAMGESLKK MVLVLSIWDD HAVNMNWLDS IFPNDADPEQ PGVARGRCDP





ADGVPATIEA AHPDAYVIYS NIKFGAINST FTAN






3913802

Cochliobolus carbonum

MYRTLAFASL SLYGAARAQQ VGTSTAENHP KLTWQTCTGT GGTNCSNKSG SVVLDSNWRW AHNVGGYTNC

YTGNSWSTQY CPDGDSCTKN CAIDGADYSG TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM





FNLINKEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE





GWNPSDADPN GGAGKIGACC PEMDIWEANS ISTAYTPHPC RGVGLQECSD AASCGDGSNR YDGQCDKDGC





DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE IKRFYVQNGK VYKNSQSAVA GVTGNSITES





FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG DHAVNMLWLD STYPTDADPS KPGAARGTCP





TTSGKPEDVE KNSPDATVVF SNIKFGPIGS TFAQPA






50403723

Trichoderma viride

MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG





NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL





GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE





PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICDGDSC GGTYSGDRYG GTCDPDGCDW





NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA





EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV





PAQLESNSPN AKVVYSNIKF GPIGSTGNSS GGNPPGGNPP GTTTTRRPAT STGSSPGPTQ THYGQCGGIG





YSGPTVCASG STCQVLNPYY SQCL






3913798

Aspergillus aculeatus

MVDSFSIYKT ALLLSMLATS NAQQVGTYTA ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT





NCYSGNTWDS SICSTDTTCA SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL YLLADDSTYE





TFKLFNREFT FDVDVSNLPC GLNGALYFVS MDADGGVSRF PTNKAGAKYG TGYCDSQCPR DLKFIDGQAN





IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA NSISSAFTAH PCDSVQQTMC TGDTCGGTYS DTTDRYSGTC





DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF TVVTQFITHD GTDTGTLTEI RRLYVQNGVV IGNGPSTYTA





ASGNSITESF CKAEKTLFGD TNVFETHGGL SAMGDALGDG MVLVLSLWDD HAADMLWLDS DYPTTSCASS





PGVARGTCPT TTGNATYVEA NYPNSYVTYS NIKFGTLNST YSGTSSGGSS SSSTTLTTKA STSTTSSKTT





TTTSKTSTTS SSSTNVAQLY GQCGGQGWTG PTTCASGTCT KQNDYYSQCL






66828465

Dictyostelium discoideum

MYRILKSFIL LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT

GNTWNPTICP DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK NIGSRLFLMS NESNYQLFHV





LGQEFTFDVD VSNLDCGLNG ALYLVSMDSD GGSARFPTNE AGAKYGTGYC DAQCPRDLKF ISGSANVDGW





IPSTNNPNTG YGNLGSCCAE MDLWEANNMA TAVTPHPCDT SSQSVCKSDS CGGAASSNRY GGICDPDGCD





YNPYRMGNTS FFGPNKMIDT NSVITVVTQF ITDDGSSDGK LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV





NEEFCTNQKK VFGDEDSFTK HGGLAKMGEA LKDGMVLVLS LWDDYQANML WLDSSYPTTS SPTDPGVARG





SCPTTSGVPS KVEQNYPNAY VVYSNIKVGP IDSTYKK






156060391

Sclerotinia sclerotiorum 1980

MISRVLAISS LLAAARAQQI GTNTAEVHPA LTSIVIDANW RWLHTTSGYT NCYTGNSWDA TLCPDAVTCA

ANCALDGADY SGTYGITTSG NSLKLNFVTK GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL





PCGLNGALYF AEMDADGGVS RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY





GSCCSEMDIW EANKISAAYT PHPCSVDGQT RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG DTGFYGAGLT





VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN SQSKVTGVSG NSITDSFCAA QKTAFGDTNE





FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP ASKSPSAAGV SRGSCSASSG VPADVEANSP





GASVTYSNIK WGPINSTYSA GTGSNTGSGS GSTTTLVSSV PSSTPTSTTG VPKYGQCGGS GYTGPTNCIG





STCVSMGQYY SQCQ






116181754

Chaetomium globosum CBS 148-51

MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG SCTKEDTTVV LDANWRWTHV TDGYTNCYTG

NAWNETACPD GKTCAANCAI DGAEYEKTYG ITTPEEGALR LNFVTESNVG SRVYLMAGED KYRLFNLLNK





EFTMDVDVSN LPCGLNGAVY FSEMDEDGGM SRFEGNKAGA KYGTGYCDSQ CPRDIKFING EANSEGWGGE





DGNSGTGKYG TCCAEMDIWE ANLDATAYTP HPCKVTEQTR CEDDTECGAG DARYEGLCDR DGCDFNSFRL





GNKEFYGPEK TVDTSKPFTL VTQFVTADGT DTGALQSIRR FYVQDGTVIP NSETVVEGVD PTNEITDDFC





AQQKTAFGDN NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA VYMNWLDSNY PTDADPTKPG VARGRCDPEA





GVPETVEAAH PDAYVIYSNI KIGALNSTFA AA






145230535

Aspergillus niger

MSSFQVYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN





CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL





FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC





DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD





PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV





NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP





GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS





TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL






46241266

Nectria haematococca mpVI

MYRAIATASA LLATARAQQV CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST SSSTNCYTGN

TWDKTLCPDG KTCADKCCLD GADYSGTYGV TSSGNQLNLK FVTVGPYSTN VGSRLYLMED ENNYQMFDLL





GNEFTFDVDV NNIGCGLNGA LYFVSMDKDG GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI NGVANSDEWK





PSDSDKNAGV GKYGTCCPEM DIWEANKIST AYTPHPCKSL TQQSCEGDAC GGTYSATRYA GTCDPDGCDF





NPYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD





SFCKAQKVAF NDPDDFNKKG GWSGMSDALA KPMVLVMSLW HDHYANMLWL DSTYPKGSKT PGSARGSCPE





DSGDPDTLEK EVPNSGVSFS NIKFGPIGST YTGTGGSNPD PEEPEEPEEP VGTVPQYGQC GGINYSGPTA





CVSPYKCNKI NDFYSQCQ






1q9h (PDB) #

Talaromyces emersonii

EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC YTGNTWDPTY CPDDETCAQN





CALDGADYEG TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD DSTYQIFKLL NREFSFDVDV SNLPCGLNGA





LYFVAMDADG GVSKYPNNKA GAKYGTGYCD SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM





DVWEANSISN AVTPHPCDTP GQTMCSGDDC GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT





KPFTVVTQFL TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH





GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT CPTDSGVPSD VESQSPNSYV





TYSNIKFGPI NSTFTAS






157362170

Polyporus arcularius

MFPTLALVSL SFLAIAYGQQ VGTLTAETHP KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH DVGGSTNCYT





GNTWDDSLCP DPTTCAANCA LDGADYSGTY GITTSGNALS LKFVTQGPYS TNIGSRVYLL SEDDSTYEMF





NLKNQEFTFD VDMSALPCGL NGALYFVEMD KDGGSGRFPT NKAGSKYGTG YCDTQCPHDI KFINGEANVL





DWAGSSNDPN AGTGHYGTCC NEMDIWEANS MGAAVTPHVC TVQGQTRCEG TDCGDGDERY DGICDKDGCD





FNSWRMGDQT FLGPGKTVDT SSKFTVVTQF ITADNTTSGD LSEIRRLYVQ NGKVIANSKT QIAGMDAYDS





ITDDFCNAQK TTFGDTNTFE QMGGLATMGD AFETGMVLVM SIWDDHEAKM LWLDSDYPTD ADASAPGVSR





GPCPTTSGDP TDVESQSPGA TVIFSNIKTG PIGSTFTS






7804885

Leptosphaeria maculans

MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV NGEVVIDANW RWLAHRSGYT

NCYTGSEWNQ SACPNNEACT KNCAIEGSDY AGTYGITTSG NQMNIKFITK RPYSTNIGAR TYLMKDEQNY





EMFQLIGNEF TFDVDLSQRC GMNGALYFVS MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK





SASDPNSGVG KKGACCAQMD VWEANSAATA LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN





PFRVGVKDFY GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ





EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP AKPGVARRDC PTSGGKPSEV





EAANPNAQVM FSNIKFGPIG STFAHAA






121852

Phanerochaete chrysosporium

MFRTATLLAF TMAAMVFGQQ VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT

GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT





SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF





LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT





AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA





QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG





IGYTGSTTCA SPYTCHVLNP YYSQCY






126013214

Penicillium decumbens

MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG CTSKDGSVVI DANWRWVHSV DGYKNCYTGN





EWDSTLCPDD ATCATNCAVD GADYAGTYGA TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF





TFDVDVSTLP CGLNGALYFV SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN





DKNAGVGPHG SCCAEMDIWE ANSISTALTP HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR





MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK RVYVQNGKVI ANSASDVSGI TGNSITSDFC





TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE KYPTDAAASK AGVSRGTCST





DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA






156048578

Sclerotinia sclerotiorum 1980

MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS CTTQSGSIVL DGNWRWTHST TSSTNCYTGN

TWDATLCPDD ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV GSRVYLLADN THYKTFNLLN





QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNKAGAQ YGTGYCDSQC PRDGKFINGK ANVDGWVPSS





NNPNTGVGNY GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDNCGGT YSTTRYAGTC DPDGCDFNPY





RQGNESFYGP GKTVDTNSVF TIVTQFLTTD GTSSGTLNEI KRFYVQNGKV IPNSESTISG VTGNSITTPF





CTAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPTTKTGAG GPRGTCSTSS





GVPASVEASS PNAYVVYSNI KVGAINSTFG






156712278

Acremonium thermophilum

MYTKFAALAA LVATVRGQAA CSLTAETHPS LQWQKCTAPG SCTTVSGQVT IDANWRWLHQ TNSSTNCYTG

NEWDTSICSS DTDCATKCCL DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL





LGQEFTFDVD VSNLGCGLNG ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF INGQANIDGW





QPSSNDANAG LGNHGSCCSE MDIWEANKVS AAYTPHPCTT IGQTMCTGDD CGGTYSSDRY AGICDPDGCD





FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE IKRFYVQNGK VIPNSESKIA GVSGNSITTD





FCTAQKTAFG DTNVFEERGG LAQMGKALAE PMVLVLSVWD DHAVNMLWLD STYPTDSTKP GAARGDCPIT





SGVPADVESQ APNSNVIYSN IRFGPINSTY TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT TTNPSGPQQT





HWGQCGGQGW TGPTVCQSPY TCKYSNDWYS QCL






21449327

Aspergillus nidulans (also known as Emericella nidulans)

MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG

NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE

FTFDVDVSNF PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD





SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY





RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ





KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP





DVVESEHADA SVTFSNIKFG PIGSTF






171683762

Podospora anserina (S mat+)

MMMKQYLQYL AAGSLMTGLV AGQGVGTQQT ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN

CYDGNAWNTA ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR FYLMNGASKY





QMFTLMNNEF TFDVDLSTVE CGLNSALYFV AMEEDGGMRS YPTNKAGAKY GTGYCDAQCA RDLKFVGGKA





NIEGWRESSN DENAGVGPYG GCCAEIDVWE SNAHAYAFTP HACENNNYHV CERDTCGGTY SEDRFAGGCD





ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT VVTRFQDDNL EQFFVQNGQK ILAPAPTFDG IPASPNLTPE





FCSTQFDVFT DRNRFREVGD FPQLNAALRI PMVLVMSIWA DHYANMLWLD SVYPPEKEGE PGAARGPCAQ





DSGVPSEVKA NYPNAKVVWS NIRFGPIGST VNV






56718412

Thermoascus aurantiacus var levisporus

MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG

NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL

GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ





PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF





NPYRQGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT





TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT





CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN






15824273

Pseudotrichonympha grassii

MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS

LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT





FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND





ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM





GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC





NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS





SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY






115390801

Aspergillus terreus NIH2624

MHQRALLFSA LVGAVRAQQA GTLTEEVHPP LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG

NTWDESLCPD NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN





EFTFDVDVSN LPCGLNGALY FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING MANVEGWTPS





DNDKNAGVGG HGSCCPELDI WEANSISSAF TPHPCDDLGQ TMCSGDDCGG TYSETRYAGT CDPDGCDFNA





YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL YVQNGKVIAN AQSNVDGVTG NSITSDFCTA





QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI LSIWDDHNSS MMWLDSTYPE DADASEPGVA RGTCEHGVGD





PETVESQHPG ATVTFSKIKF GPIGSTYSSN STA






453223

Phanerochaete chrysosporium

MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT

GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET





GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF





LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT





AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS





DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS





TTCASPYTCH VLNPCESILS LQRSSNADQY LQTTRSATKR RLDTALQPRK






3132

Phanerochaete chrysosporium

MRTALALILA LAAFSAVSAQ QAGTITAETH PTLTIQQCTQ SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY

SGNTWDAILC PDPVTCAANC ALDGADYTGT FGILPSGTSV TLRPVDGLGL RLFLLADDSH YQMFQLLNKE





FTFDVEMPNM RCGSSGAIHL TAMDADGGLA KYPGNQAGAK YGTGFCSAQC PKGVKFINGQ ANVEGWLGTT





ATTGTGFFGS CCTDIALWEA NDNSASFAPH PCTTNSQTRC SGSDCTADSG LCDADGCNFN SFRMGNTTFF





GAGMSVDTTK LFTVVTQFIT SDNTSMGALV EIHRLYIQNG QVIQNSVVNI PGINPATSIT DDLCAQENAA





FGGTSSFAQH GGLAQVGEAL RSGMVLALSI VNSAADTLWL DSNYPADADP SAPGVARGTC PQDSASIPEA





PTPSVVFSNI KLGDIGTTFG AGSALFSGRS PPGPVPGSAP ASSATATAPP FGSQCGGLGY AGPTGVCPSP





YTCQALNIYY SQCI






16304152

Thermoascus aurantiacus

MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG

NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL





GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ





PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF





NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT





TEYCTAQKAA FDNTGFFTHG GLQKISQALA QGMVLVMSLW DDHAANMLWL DSTYPTDADP DTPGVARGTC





PTTSGVPADV ESQNPNSYVI YSNIKVGPIN STFTAN






156712280

Acremonium thermophilum

MHKRAATLSA LVVAAAGFAR GQGVGTQQTE THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN

CYTGNEWNTT ICADAASCAS NCVVDGADYQ GTYGASTSGN ALTLKFVTKG SYATNIGSRM YLMASPTKYA





MFTLLGHEFA FDVDLSKLPC GLNGAVYFVS MDEDGGTSKY PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN





SASWQPSSND QNAGVGGMGS CCAEMDIWEA NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS ATRFAGDCDP





DGCDWNAYRM GVHDFYGNGK TVDTGKKFSI VTQFKGSGST LTEIKQFYVQ DGRKIENPNA TWPGLEPFNS





ITPDFCKAQK QVFGDPDRFN DMGGFTNMAK ALANPMVLVL SLWDDHYSNM LWLDSTYPTD ADPSAPGKGR





GTCDTSSGVP SDVESKNGDA TVIYSNIKFG PLDSTYTAS






5231154

Volvariella volvacea

MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA NWRWTHSTSG STNCYTGNTW





QATLCPDGKT CAANCALDGA DYTGTYGVTT SGNSLTLQFV TQSNVGARLG YLMADDTTYQ MFNLLNQEFW





FDVDMSNLPC GLNGALYFSA MARTAAWMPM VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR





DIKFINGEAN VQGWQPSPND TNAGTGNYGA CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN





RYGSICDHDG LGFQNLFGMG RTRVRARVGR VKQFNRSSRV VEPISWTKQT TLHLGNLPWK SADCNVQNGR





VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL RRGMVLVLSI WDDHAANMLW





LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF TNIKFGPFNS TGTTYTTGSV PTTSTSTGTT





GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA SPTTCHVLNP YYSQCY






116200349

Chaetomium globosum CBS 148-51

MKQYLQYLAA ALPLMSLVSA QGVGTSTSET HPKITWKKCS SGGSCSTVNA EVVIDANWRW LHNADSKNCY

DGNEWTDACT SSDDCTSKCV LEGAEYGKTY GASTSGDSLS LKFLTKHEYG TNIGSRFYLM NGASKYQMFT





LMNNEFAFDV DLSTVECGLN SALYFVAMEE DGGMASYSTN KAGAKYGTGY CDAQCARDLK FVGGKANYDG





WTPSSNDANA GVGALGGCCA EIDVWESNAH AFAFTPHACE NNNYHVCEDT TCGGTYSEDR FAGDCDANGC





DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ FQENKLTQFF VQNGKKIEIP GPKHEGLPTE SSDITPELCS





AMPEVFGDRD RFAEVGGFDA LNKALAVPMV LVMSIWDDHY ANMLWLDSSY PPEKAGTPGG DRGPCAQDSG





VPSEVESQYP DATVVWSNIR FGPIGSTVQV






4586343

Irpex lacteus

MFPKASLIAL SFIAAVYGQQ VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG





NSWDATLCPD ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA YQMFQLLNQE





FTFDVDMSNL PCGLNGALYL SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC PHDIKFINGM ANVAGWAGSA





SDPNAGSGTL GTCCSEMDIW EANNDAAAFT PHPCSVDGQT QCSGTQCGDD DERYSGLCDK DGCDFNSFRM





GDKSFLGKGM TVDTSRKFTV VTQFVTTDGT TNGDLHEIRR LYVQDGKVIQ NSVVSIPGID AVDSITDNFC





AQQKSVFGDT NYFATLGGLK KMGAALKSGM VLAMSVWDDH AASMQWLDSN YPADGDATKP GVARGTCSAD





SGLPTNVESQ SASASVTFSN IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ GTVAQWGQCG





GTGFTGPTVC ASPFTCHVVN PYYSQCY






15321718

Lentinula edodes

MFRTAALLSF AYLAVVYGQQ AGTSTAETHP PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT





GNEWNTTVCP DGTTCAANCA LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF





NPLNQEFTFD VDVSALPCGL NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI KFIEGKANVE





GWTPSSTSPN AGTGGTGICC NEMDIWEANS ISEALTPHPC TAQGGTACTG DSCSSPNSTA GICDQAGCDF





NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL TAIRRIYVQN GQVIQNSMSN IAGVTPTNEI





TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA FSRGMVLVLS IWDDDAAEML WLDSTYPVGK TGPGAARGTC





ATTSGQPDQV ETQSPNAQVV FSNIKFGAIG STFSSTGTGT GTGTGTGTGT GTTTSSAPAA TQTKYGQCGG





QGWTGATVCA SGSTCTSSGP YYSQCL






146424875

Pleurotus sp Florida

MFRTAALTAF TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV LDSNWRWTHS TAGATNCYTG





NAWDPALCPD PATCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGQYSQ NIGSRVYLLD DADHYKLFDL





KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHAGNN AGAKYGTGYC DAQCPHDIKW INGEANVLDW





SASATDDNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN





SYRMGDKNFL GRGKTIDTTK KVTVVTQFIT DNNTPTGNLV EIRRVYVQNG VVYQNSFSTF PSLSQYNSIS





DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA





CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP VSSGGSSVPP VTSTTSSGTT TPTGPTGTVP





KWGQCGGIGY SGPTACVAGS TCTYSNDWYS QCL






62006158

Fusarium venenatum

MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN





KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGTYGTN IGSRTYLMED ENTYQMFQLL





GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ





PSKSDVNGGI GNLGTCCPEM DIWEANSIST AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG GTCDADGCDF





NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP





EFCTAQKKVF GDIDDFEKKG AWGGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST





SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG GTNYSGPTAC





KSPFTCKKIN DFYSQCQ






296027

Phanerochaete chrysosporium

MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT

GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ





EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT





SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF





LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDLVNSI TDNFCSQQKT





AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA





QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG





IGYTGSTTCA SPYTCHVLNP YYSQCY






154449709

Fusicoccum sp BCC4124

MYQTSLLASL SFLLATSQAQ QVGTQTAETH PKLTTQKCTT AGGCTDQSTS IVLDANWRWL HTVDGYTNCY

TGQEWDTSIC TDGKTCAEKC ALDGADYEST YGISTSGNAL TMNFVTKSSQ TNIGGRVYLL AADSDDTYEL





FKLKNQEFTF DVDVSNLPCG LNGALYFSEM DSDGGLSKYT TNKAGAKYGT GYCDTQCPHD IKFINGEANV





QNWTASSTDK NAGTGHYGSC CNEMDIWEAN SQATAFTPHV CEAKVEGQYR CEGTECGDGD NRYGGVCDKD





GCDFNSYRMG NETFYGSNGS TIDTTKKFTV VTQFITADNT ATGALTEIRR KYVQNDVVIE NSYADYETLS





KFNSITDDFC AAQKTLSGDT NDFKTKGGIA RMGESFERGM VLVMSVWDDH AANALWLDSS YPTDADASKP





GVKRGPCSTS SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA






169859460

Coprinopsis cinerea okayama

MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY

TGNAWNSSVC SDGATCAQRC ALEGANYQQT YGITTSGDAL TIKFLTRSEQ TNIGARVYLM ENEDRYQMFN





LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGLSSQPNN RAGAKYGTGY CDSQCPRDIK FINGEANSVG





WEPSETDPNA GKGQYGICCA EMDIWEANSI SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD





YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD





SITQEFCDDA KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH MLWLDSNYPT DADPNKPGIA





RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN






50400675

Trichoderma harzianum (anamorph of Hypocrea lixii)

MYRKLAVISA FLAAARAQQV CTQQAETHPP LTWQKCTASG CTPQQGSVVL DANWRWTHDT KSTTNCYDGN

TWSSTLCPDD ATCAKNCCLD GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF

SFDVDVSQLP CGLNGALYFV SMDADGGQSK YPGNAAGAKY GTGYCDSQCP RDLKFINGQA NVEGWEPSSN





NANTGVGGHG SCCSEMDIWE ANSISEALTP HPCETVGQTM CSGDSCGGTY SNDRYGGTCD PDGCDWNPYR





LGNTSFYGPG SSFALDTTKK LTVVTQFATD GSISRYYVQN GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA





FGGTSFTDKG GLAQINKAFQ GGMVLVMSLW DDYAVNMLWL DSTYPTNATA STPGAKRGSC STSSGVPAQV





EAQSPNSKVI YSNIRFGPIG STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT GWTGPTRCAS





GYTCQVLNPF YSQCL






729649

Neurospora crassa (OR74A)

MRASLLAFSL AAAVAGGQQA GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL DANWRWTHAT SGSTKCYTGN

KWQATLCPDG KSCAANCALD GADYTGTYGI TGSGWSLTLQ FVTDNVGARA YLMADDTQYQ MLELLNQELW





FDVDMSNIPC GLNGALYLSA MDADGGMRKY PTNKAGAKYA TGYCDAQCPR DLKYINGIAN VEGWTPSTND





ANGIGDHGSC CSEMDIWEAN KVSTAFTPHP CTTIEQHMCE GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG





NTTFYGEGKT VDTSSKFTVV TQFIKDSAGD LAEIKAFYVQ NGKVIENSQS NVDGVSGNSI TQSFCKSQKT





AFGDIDDFNK KGGLKQMGKA LAQAMVLVMS IWDDHAANML WLDSTYPVPK VPGAYRGSGP TTSGVPAEVD





ANAPNSKVAF SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS TSTASNPSGT GAAHWAQCGG





IGFSGPTTCP EPYTCAKDHD IYSQCV






119472134

Neosartorya fischeri NRRL 181

MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV

GDYTNCYTGN TWDKTLCPDD ATCASNCALE GANYQSTYGA TTSGDSLRLN FVTTSQQKNI GSRLYMMKDD





TTYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN





GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG





TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTASGTLK EIKRFYVQNG KVIPNSESTW





SGVGGNSITN DYCTAQKSLF KDQNVFAKHG GMEGMGAALA QGMVLVMSLW DDHAANMLWL DSNYPTTASS





STPGVARGTC DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP GGGTTTTAKP TTTTTTAGSP





GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL






117935080

Chaetomium thermophilum

MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW

RWLHDSNYQN CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE





YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS





SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN





AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT





VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR





TINEALRIPM VLVMSIWDGH YASMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF





PNAQVIWSNI RFGPIGSTYQ V






154300584

Botryotinia fuckeliana B05-10

MTSRIALVSL FAAVYGQQVG TYQTETHPSL TWQSCTAKGS CTTNTGSIVL DGNWRWTHGV

GTSTNCYTGN TWDATLCPDD ATCAQNCALE GADYSGTYGI TTSGNSLRLN FVTQSANKNI





GSRVYLMADT THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNTAGAE





YGTGYCDSQC PRDMKFIKGQ ANVDGWVPSS NNANTGVGNH GSCCAEMDIW EANSISTAVT





PHSCDTVTQT VCTGDDCGGT YSSSRYAGTC DPDGCDFNSY RMGDETFYGP GKTVDTNSVF





TVVTQFLTTD GTASGTLNEI KRFYVQDGKV IPNSYSTISG VSGNSITTPF CDAQKTAFGD PTSFSDHGGL





ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPVGKTSAG GPRGTCDTSS GVPASVEASS





PNAYVVYSNI KVGAINSTYG






15824271

Pseudotrichonympha grassii

MFVFVLLWLT QSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN

CYDGNEWSSS LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV





YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG





TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH





ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV





VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT





NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS





SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY






4586345

Irpex lacteus

MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQKCTAS GCTTSSTSVV LDANWRWVHT





TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS





RVYLLQDASN YQMFQLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK





YGTGYCDSQC PRDIKFINGE ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW EANSVAAAYT





PHPCSVNQQT RCTGADCGQG DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI





VTQFISDDGT TSGNLAEIRR FYVQDGNVIP NSKVSIAGID AVNSITDDFC TQQKTAFGDT





NRFAAQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASNP GVARGTCPTT





SGFPRDVESQ SGSATVTYSN IKWGDLNSTF TGTLTTPSGS SSPSSPASTS GSSTSASSSA SVPTQSGTVA





QWAQCGGIGY SGATTCVSPY TCHVVNAYYS QCY






46241268

Gibberella avenacea

MYRAIATASA LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT





SSSTNCYTGN KWDTSVCTSG ETCAQKCCLD GADYAGTYGI TSSGNQLSLG FVTKGSFSTN





IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKARYPANKA





GAKYGTGYCD AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST





AFTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD





TTKKVTVVTQ FKKGSNGRLS EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF





GDIDDFSKKG GWSGMSDALE SPPMVLVMSL WHDHHSNMLW LDSTYPTDST KLGAQRGSCA





TTSGVPSDLE RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG QYGQCGGQTY





TGPKDCKSPY TCKKINDFYS QCQ






6164684

Aspergillus niger

MSSFQIYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR





WVHSTSSATN CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG





SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS





GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN





SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG





LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE





NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD





SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS





AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL






6164682

Aspergillus niger

MHQRALLFSA LLTAVRAQQA GTLTEEVHPS LTWQKCTSEG SCTEQSGSVV IDSNWRWTHS





VNDSTNCYTG NTWDATLCPD DETCAANCAL DGADYESTYG VTTDGDSLTL KFVTGSNVGS





RLYLMDTSDE GYQTFNLLDA EFTFDVDVSN LPCGLNGALY FTAMDADGGV SKYPANKAGA





KYGTGYCDSQ CPRDLKFIDG QANVDGWEPS SNNDNTGIGN HGSCCPEMDI WEANKISTAL





TPHPCDSSEQ TMCEGNDCGG TYSDDRYGGT CDPDGCDFNP YRMGNDSFYG PGKTIDTGSK





MTVVTQFITD GSGSLSEIKR YYVQNGNVIA NADSNISGVT GNSITTDFCT AQKKAFGDED





IFAEHNGLAG ISDAMSSMVL ILSLWDDYYA SMEWLDSDYP ENATATDPGV ARGTCDSESG





VPATVEGAHP DSSVTFSNIK FGPINSTFSA SA






33733371

Chrysosporium lucknowense U.S. Pat. No. 6,573,086-10

MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG SCTSVQGSIT IDANWRWTHR TDSATNCYEG

NKWDTSYCSD GPSCASKCCI DGADYSSTYG ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL

LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DSQCPRDLKF INGEANVENW





QSSTNDANAG TGKYGSCCSE MDVWEANNMA AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD





FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYVQNGK VIPNSESTIP GVEGNSITQD





WCDRQKAAFG DVTDXQDKGG MVQMGKALAG PMVLVMSIWD DHAVNMLWLD STWPIDGAGK PGAERGACPT





TSGVPAEVEA EAPNSNVIFS NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS GSSGPTGGTG





VAKHYEQCGG IGFTGPTQCE SPYTCTKLND WYSQCL






29160311

Thielavia australiensis

MYAKFATLAA LVAGASAQAV CSLTAETHPS LTWQKCTAPG SCTNVAGSIT IDANWRWTHQ TSSATNCYSG





SKWDSSICTT GTDCASKCCI DGAEYSSTYG ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL





LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW





ESSTNDANAG SGKYGSCCTE MDVWEANNMA TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY AGVCDPDGCD





FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYAQDGK VIPNSESTIA GIPGNSITKA





YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD DHAVNMLWLD STYPTDQVGV AGAERGACPT





TSGVPSDVEA NAPNSNVIFS NIRFGPIGST VQGLPSSGGT SSSSSAAPQS TSTKASTTTS AVRTTSTATT





KTTSSAPAQG TNTAKHWQQC GGNGWTGPTV CESPYKCTKQ NDWYSQCL






146197087
uncultured symbiotic protist of Reticulitermes speratus

MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP

SSDTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV

DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS

GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT





NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSSDSTA QRGPCPTSSG VPKDVESQHG





DATVVFSDIK FGAINSTFKY N






146197237
uncultured symbiotic protist of Neotermes koshunensis

MLAAALFTFA CSVGVGTKTP ENHPKLNWQN CASKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS

LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY VFKLNNKEFT

FSVDVSKLPC GLNGALYFVE MDADGGKAKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD





KNAGNGKYGS CCSEMDVWES NSQATALTPH VCKTTGQQRC SGKSECGGQD GQDRFAGLCD EDGCDFNNWR





MGDKTFFGPG LIVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENSKSN IPGIDATAAI SDHFCEQQKK





AFGDTNDFKN KGGFAKLGQV FDRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSQPGVDRG PCPTSSGKPD





DVESASADAT VVYGNIKFGA LDSTY






146197067
uncultured symbiotic protist of Reticulitermes speratus

MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP

SSNTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV

DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS

GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT





NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQYG





DATVIYSDIK FGAINSTFKW N






146197407
uncultured symbiotic protist of Cryptocercus punctulatus

MILALLSLAK SLGIATNQAE THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC

PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT

VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD





KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGITAG NSVTDTFCNE





QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG





APSDVESQSP DATVTFSDIK FGPIDSTY






146197157
uncultured symbiotic protist of Hodotermopsis sjoestedti

MLVIALILRG LSVGTGTQQS ETHPSLSWQQ TSKGGSGQSV SGSVVLDSNW RWTHTTDGTT NCYDGNEWSS

DLCPDASTCS SNCVLEGADY SGTYGITGSG SSLKLGFVTK GSYSTNIGSR VYLLGDESHY KLFKLENNEF

TFTVDDSNLE CGLNGALYFV AMDEDGGASK YSGAKPGAKY GMGYCDAQCP HDMKFINGDA NVEGWKPSDN

DENAGTGKWG ACCTEMDIWE ANKYATAYTP HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR





MGNQSFWGPG LIIDTGKPVT VVTQFLADGG SLSEIRRKYV QGGKVIENTV TKISGMDEFD SITDEFCNQQ





KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT DSGSKAGADR GPCATSSGVP





KDVESNYASA SVTFSDIKFG PIDSTY






146197403
uncultured symbiotic protist of Cryptocercus punctulatus

MLLALFAFGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC

PDPTTCSNNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT

VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGIRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD





KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGMAAG NSITDTFCNE





QKKAFGDNND FEKKGGLGAL SKQLDSGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG





APSDVESQSP DATVTFSDIK FGPIDSTY






146197081
uncultured symbiotic
MLASVVYLVS LVVSLEIGTQ QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL HDSGTTNCYD GNLWSDDLCP




protist of
NADTCSSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV





Reticulitermes

DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS





speratus

GDGKLGTCCS EMDIWEGNAK SQAYTVHACS KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTVD TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKDATGDT





NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV QAVDRVLCRR VFQRMLKASM





VMLQSRTRTL SLELSTRPLV GISPAGRLFF F






146197413
uncultured symbiotic
MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC




protist of Cryptocercus
PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT





punctulatus

VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD





KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE





QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG





KPSDVESQSP DATVTFSDIK FGPIDSTY






146197309
uncultured symbiotic
MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD




protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD





darwiniensis

VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG





NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT





FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT





DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA





DANVIYSDIR FGAIDSTYK






146197227
uncultured symbiotic
MLGALVALAS CIGVGTNTPE KHPDLKWTNG GSSVSGSIVV DSNWRWTHIK GETKNCYDGN LWSDKYCPDA




protist of Neotermes
ATCGKNCVLE GADYSGTYGV TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV





koshunensis

SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN





GKYGSCCSEM DIWEANSMAT AYTPHVCDKL EQTRCSGSAC GQNGGGDRFS SSCDPDGCDF NSWRMGNKTF





WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI DNSMTNIAAM SKQYNSVSDE FCQAQKKAFG





DNDSFTKHGG FRQLGATLSK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPSDVESQ





NADSTVKYSD IRFGAIDSTY SK






146197253
uncultured symbiotic
MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ CACKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS




protist of Neotermes
LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL YLLKDKSTYY VFQLNNKEFT





koshunensis

FSVDVSKLPC GLNGALYFVE MDADGGKSKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD





KNAGNGKYGS CCSEMDVWES NSMATALTPH VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR





MGDKTFFGPG LTVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENAKSN IPGIDATNAI SDTFCEQQKK





AFGDTNDFKN KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSVPGVDRG PCPTSSGKPD





DVESASGDAT VVYGNIKFGA LDSTY






146197099
uncultured symbiotic
MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP




protist of
DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYPMFK LKNKEFTFTV





Reticulitermes

DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA





speratus

GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF





FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN





DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PTDSTAIGAS RGPCATSSGD PKDVESASAN





ASVKFSDIKF GALDSTY






146197409
uncultured symbiotic
MLASLLPLSN SLGTASNQAE THPKLTWTQY TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC




protist of Cryptocercus
PDPTTCSNNC NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF KLKNKEFTFT





punctulatus

VDDSKLPCGL NGAVYFVAMD EDGGTAKHSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRWGARC TEMDIWEANS RATAYTPHIC TKTGLYRCEG TECGDSDTNR YGGVCDKDGC DFNSYRMGDK





SFFGQGKTVD SSKPVTVVTQ FITDNNQDSG KLTEIRRKYV QGGKVIDNSK VNIAGITAGN PITDTFCDEA





KKAFGDNNDF EKKGGLSALG TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT NASPGALGVE RGDCAITSGV





PADVESQSAD ASVTFSDIKF GPIDSTY






146197315
uncultured symbiotic
MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD




protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD





darwiniensis

VSNLPCGLSG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG





NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGAA CGENGGGSRF GSSCDPDGCD FNSWGMGNKT





FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT





DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA





DANVIYSDIR FGAIDSTYK






146197411
uncultured symbiotic
MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC




protist of Cryptocercus
PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT





punctulatus

VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD





KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GILSETRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE





QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG





KPSDVESQSP DATVTFSDIK FGPIDSTY






146197161
uncultured symbiotic
MIGIVLIQTV FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW RWTHIPDGTT NCYDGNEWSS




protist of
DLCPDPTTCS NNCVLEGADY SGTYGISTSG SSAKLGFVTK GSYSTNIGSR VYLLGDESHY KIFDLKNKEF





Hodotermopsis

TFTVDDSNLE CGLNGALYFV AMDEDGGASR FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN





sjoestedti

DDNAGTGHYG ACCTEMDIWE ANKYATAYTP HICTENGEYR CEGKSCGDSS DDRYGGVCDK DGCDFNSWRL





GNQSFWGPGL IIDTGKPVTV VTQFVTKDGT DSGALSEIRR KYVQGGKTIE NTVVKISGID EVDSITDEFC





NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH DVNMLWLDSV YPTNPAGKAG ADRGPCATSS





GDPKEVEDKY ASASVTFSDI KFGPIDSTY






146197323
uncultured symbiotic
MLVFGIVSFV YSIGVGTNTA ETHPKLTWKN GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD




protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD





darwiniensis

VSQLPCGLNG ALYFVCMDQD GGMSRYPDNQ AGAKYGTGYC DAQCPTDLKF INGLPNSDGW KPQSNDKNSG





NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSICDPDGCD FNSWRMGNKT





FWGPGLIIDT KKPVTVVTQF IGSPVTEIKR EYVQGGKVIE NSYTNIEGMD KFNSISDKFC TAQKKAFGDN





DSFTKHGGFS KLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKLGS DRGPCPTSSG VPADVESKNA





DSSVKYSDIR FGSIDSTYK






146197077
uncultured symbiotic
MLSFVFLLGF GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT NCYDGNEWSS




protist of
DLCPDPETCS KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR VYLKKDTNTY QIFKLKNHEF





Reticulitermes

TFTVDVSNLP CGLNGALYFV EMEADGGKGK YPLAKPGAQY GMGYCDAQCP HDMKFINGNA NVLDWKPQET





speratus

DENSGNGRYG TCCTEMDIWE ANSQATAYTP HICTKDGQYQ CEGTECGDSD ANQRYNGVCD KDGCDFNSYR





LGNKTFFGPG LIVDSKKPVT VVTQFITSNG QDSGDLTEIR RIYVQGGKTI QNSFTNIAGL TSVDSITEAF





CDESKDLFGD TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD HSVNMLWLDS TYPTDAAAGA LGTQRGPCAT





SSGAPSDVES QSPDASVTFS DIKFGPLDST Y






146197089
uncultured symbiotic
MLTLVVYLLS LVVSLEIGTQ QSESHPALTW QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP




protist of
SSDTCTQKCY IEGADYSGTY GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV





Reticulitermes

DDSKLDCGLN GALYFVAMDA DGGKQKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVED WKPQDNDENS





speratus

GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY ASYRWGDHSF





YGEGKTVDTK QPITVVTQFI GDPLTEIRRL YIQGGKVINN SKTQNLASVY DSITDAFCDA TKAASGDTND





FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP TDSRDATAER GPCATSSGVP KDVESNQADA





SVVFSDIKFG AINSTYSYN






146197091
uncultured symbiotic
MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP




protist of
DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYQMFK LKNKEFTFTV





Reticulitermes

DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA





speratus

GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF





FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN





DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PSNSTAIGAT RGPCATSSGD PKNVESASAN





ASVKFSDIKF GAFDSTY






146197097
uncultured symbiotic
MLALVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP




protist of
SSDTCTSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV





Reticulitermes

DDSQLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS





speratus

GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT





NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVTSG VPKDVESQYG





SAQVVYSDIK FGAINSTY






146197095
uncultured symbiotic
MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCG




protist of
SSDTCSSKCY IEGADYSGTY GISASGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKGKEFTFTV





Reticulitermes

DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS





speratus

GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTID TKQPVTVVTQ FIGDPLTEIR RVYVQGGKVI NNSKTSNLAN VYDSITDKFC DDTKDATGDT





NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVLSG VPKNVESQHG





DATVIYSDIK FGAINSTFSY N






146197401
uncultured symbiotic
MFLALFVLGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC




protist of Cryptocercus
PDPQTCSSNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT





punctulatus

VDDSKLPCGL NGALYFVAME EDGGVAKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC IEMDIWEANS MATAYTPHVC TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD





KSFFGVGKTV DSSKPVTVVT QFVTSNGQDG GTLSEIKRKY VQGGKVIENS KVNIAGITAV NSITDTFCNE





QKKAFGDNND FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP TDAAAGALGT ERGACATSSG





KPSDVESQSP DASVTFSDIK FGPIDSTY






146197225
uncultured symbiotic
MLLCLLSIAN SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA




protist of Neotermes
ATCGKNCVIE GADYQGTYGV SSSGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN GKEFTFTVDV





koshunensis

SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN





GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSSC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF





WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM SKQYNSVSDD FCQAQKKAFG





DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ





AASSSVKYSD IRFGAIDSTY K






146197317
uncultured symbiotic
MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD




protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD





darwiniensis

VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG





NGKYGSCCSE MDIWEANSIC SAVTPHVCDT LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT





FYGPGLIVDT KSKFTVVTQF VGSPVTEIKR KYVQNGKVIE NSFSNIEGMD KFNSISDKFC TAQKKAFGDT





DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA





NANVIYSDIR FGAIDSTYK






146197251
uncultured symbiotic
MLLCLLGIAS SLDAGTNTAE NHPQLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA




protist of Neotermes
ATCGQNCVIE GADYQGTYGV SASGNALTLT FVTHGQYSTN VGSRLYLLKD EKTYQIFNLI GKEFTFTVDV





koshunensis

SNLPCGLNGA LYFVQMDADG GTAKYSDNKA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN





GRYGSCCSEM DVWEANSLAT AYTPHVCDKL EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF NSWRLGNKTF





WGPGLIVDTK QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI DNSFTKLDSL TKQYNSVSDE FCVAQKKAFG





DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPADVESQ





AASSSVKYSD IRFGAIDSTY K






146197319
uncultured symbiotic
MLGIGFVCIV YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG NLWSKDLCPD




protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQIFNL NGKEFTFTVD





darwiniensis

VSNLPCGLNG ALYFVNMDAD GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG





NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSSCDPDGCD FNSWRLGNKT





FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR KYVQGGKVIE NSYTNIEGLD KFNSISDKFC TAQKKAFGDN





DSFIKHGGFR QLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKPGA DRGPCPTSSG VPADVESKNA





GSSVKYSDIR FGSIDSTYK






146197071
uncultured symbiotic
MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ TGSIVIDSNW RWLHDSGTTN CYDGNEWSSD




protist of
LCPDPEKCSQ NCYLEGADYS GTYGISSSGN SLQLGFVTKG SYSTNIGSRV YLLKDENTYA TFKLKNKEFT





Reticulitermes

FTADVSNLPC GLNGALYFVA MPADGGKSKY PLAKPGAKYG MGYCDAQCPH DMKFINGEAN ILDWKPSSND





speratus

ENAGAGRYGT CCTEMDIWEA NSQATAYTVH ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT





FFGPNLIVDS SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ NSFTNISGVA SVDSITDAFC NENKVATGDT





NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA SRGPCAITSG EPKDVESASA





NASVKFSDIK FGAIDSTY






146197075
uncultured symbiotic
MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP




protist of
SSDTCTSKCY IEGADYSGTY GITSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV





Reticulitermes

DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS





speratus

GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ





SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT





NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQHG





DATVIYSDIK FGAINSTFKW N






146197159
uncultured symbiotic
MLSLVSIFLV GLGFSLGVGT QQSESHPSLS WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW




protist of
STDLCPDAST CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES HYQLFKLKNH





Hodotermopsis

EFTFTVDDSQ LPCGLNGALY FVEMAEDGGA KPGAQYGMGY CDAQCPHDMK FITGEANVKD WKPQETDENA





sjoestedti

GNGHYGACCT EMDIWEANSQ ATAYTPHICS KTGIYRCEGT ECGDNDANQR YNGVCDKDGC DFNSYRLGNK





TFWGPGLTVD SNKAMIVVTQ FTTSNNQDSG ELSEIRRIYV QGGKTIQNSD TNVQGITTTN KITQAFCDET





KVTFGDTNDF KAKGGFSGLS KSLESGAVLV LSLWDDHSVN MLWLDSTYPT DSAGKPGADR GPCAITSGDP





KDVESQSPNA SVTFSDIKFG PIDSTY






146197405
uncultured symbiotic
MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC




protist of Cryptocercus
PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT





punctulatus

VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN





SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD





KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE





QKKAFGDNND FEKKGGFGAL SKQLVAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG





KPSDVESQSP DATVTFSDIK FGPIDSTY






146197327
uncultured symbiotic
MLCVGLFGLV YSIGVGTNTQ ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS




protist of Mastotermes
TLCPDGTTCS KNCVLEGADY SGTYGITSSG DSLTLKFVTH GSYSTNVGSR LYLLKDDNNY QIFNLAGKEF





darwiniensis

TFTVDVSNLP CGLNGALYFV EMDQDGGKGK HKENEAGAKY GTGYCDAQCP TDLKFIDGIA NSDGWKPQDN





DENSGNGKYG SCCSEMDIWE ANSLATAYTP HVCDTKGQKR CQGTACGENG GGDRFGSECD PDGCDFNSWR





QGNKSFWGPG LIIDTKKSVQ VVTQFIGSGS SVTEIRRKYV QNGKVIENSY STISGTEKYN SISDDYCNAQ





KKAFGDTNSF ENHGGFKRFS QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN SNKPGADRGP CETSSGVPAD





VESKSASASV KYSDIRFGPI DSTYK






146197261
uncultured symbiotic
MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA




protist of Neotermes
ATCGKNCVIE GADYQGTYGV SASGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV





koshunensis

SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN





GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSAC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF





WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM TKQYNSVSDD FCQAQKKAFG





DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ





AASSSVKYSD IRFGAIDSTY K




















TABLE 2






| Sequence Identifier (SEQ ID NO:) | Database Accession Number | Species of Origin | Position Corresponding to Position 268 | Position Corresponding to Position 411 |
|---|---|---|---|---|
|  | BD29555* | Unknown | 273 | 422 |
|  | 340514556 | Trichoderma reesei | 268 | 411 |
|  | 51243029 | Penicillium occitanis | 273 | 422 |
|  | 7cel (PDB) & | Trichoderma reesei | 251 | 394 |
|  | 67516425 | Aspergillus nidulans FGSC A4 | 274 | 424 |
|  | 46107376 | Gibberella zeae PH-1 | 268 | 415 |
|  | 70992391 | Aspergillus fumigatus Af293 | 277 | 427 |
|  | 121699984 | Aspergillus clavatus NRRL 1 | 277 | 427 |
|  | 1906845 | Claviceps purpurea | 269 | 416 |
|  | 1gpi (PDB) & | Phanerochaete chrysosporium | 240 | 391 |
|  | 119468034 | Neosartorya fischeri NRRL 181 | 265 | 414 |
|  | 7804883 | Leptosphaeria maculans | 256 | 401 |
|  | 85108032 | Neurospora crassa N150 | 268 | 412 |
|  | 169859458 | Coprinopsis cinerea okayama | 270 | 421 |
|  | 154292161 | Botryotinia fuckeliana B05-10 |  | 410 |
|  | 169615761 # | Phaeosphaeria nodorum SN15 | 246 | 393 |
|  | 4883502 | Humicola grisea | 272 | 413 |
|  | 950686 | Humicola grisea | 270 | 416 |
|  | 124491660 | Chaetomium thermophilum | 272 | 413 |
|  | 58045187 | Chaetomium thermophilum | 270 | 416 |
|  | 169601100 # | Phaeosphaeria nodorum SN15 | 237 | 383 |
|  | 169870197 | Coprinopsis cinerea okayama | 269 | 421 |
|  | 3913806 | Agaricus bisporus | 263 | 414 |
|  | 169611094 | Phaeosphaeria nodorum SN15 | 270 | 414 |
|  | 3131 | Phanerochaete chrysosporium |  | 410 |
|  | 70991503 | Aspergillus fumigatus Af293 | 265 | 414 |
|  | 294196 | Phanerochaete chrysosporium | 258 | 409 |
|  | 18997123 | Thermoascus aurantiacus | 268 | 418 |
|  | 4204214 | Humicola grisea var thermoidea | 272 | 413 |
|  | 34582632 | Trichoderma viride (also known as Hypocrea rufa) | 268 | 411 |
|  | 156712284 | Thermoascus aurantiacus | 268 | 418 |
|  | 39977899 | Magnaporthe grisea (oryzae) 70-15 | 268 | 414 |
|  | 20986705 | Talaromyces emersonii | 266 | 416 |
|  | 22138843 | Aspergillus oryzae | 265 | 414 |
|  | 55775695 | Penicillium chrysogenum | 276 | 426 |
|  | 171676762 | Podospora anserina | 270 | 417 |
|  | 146350520 | Pleurotus sp Florida | 268 | 420 |
|  | 37732123 | Gibberella zeae | 268 | 415 |
|  | 156055188 | Sclerotinia sclerotiorum 1980 |  | 410 |
|  | 453224 | Phanerochaete chrysosporium | 258 | 409 |
|  | 50402144 | Trichoderma reesei | 268 | 411 |
|  | 115397177 | Aspergillus terreus NIH2624 | 274 | 424 |
|  | 154312003 | Botryotinia fuckeliana B05-10 | 266 | 416 |
|  | 49333365 | Volvariella volvacea | 268 | 420 |
|  | 729650 | Penicillium janthinellum | 274 | 424 |
|  | 146424871 | Pleurotus sp Florida | 267 | 418 |
|  | 67538012 | Aspergillus nidulans FGSC A4 | 265 | 410 |
|  | 62006162 | Fusarium poae | 268 | 415 |
|  | 146424873 | Pleurotus sp Florida | 267 | 418 |
|  | 295937 | Trichoderma viride | 268 | 411 |
|  | 6179889 # | Alternaria alternata | 240 | 386 |
|  | 119483864 | Neosartorya fischeri NRRL 181 | 278 | 428 |
|  | 85083281 | Neurospora crassa OR74A | 270 | 412 |
|  | 3913803 | Cryphonectria parasitica | 269 | 416 |
|  | 60729633 | Corticium rolfsii | 265 | 415 |
|  | 39971383 | Magnaporthe grisea 70-15 | 268 | 410 |
|  | 39973029 | Magnaporthe grisea 70-15 | 269 | 410 |
|  | 1170141 | Fusarium oxysporum | 268 | 415 |
|  | 121710012 | Aspergillus clavatus NRRL 1 | 265 | 414 |
|  | 17902580 | Penicillium funiculosum | 273 | 422 |
|  | 1346226 | Humicola grisea var thermoidea | 270 | 416 |
|  | 156712282 | Chaetomium thermophilum | 270 | 416 |
|  | 169768818 | Aspergillus oryzae RIB40 | 277 | 427 |
|  | 46241270 | Gibberella pulicaris | 268 | 415 |
|  | 49333363 | Volvariella volvacea | 265 | 418 |
|  | 46395332 | Irpex lacteus | 263 | 414 |
|  | 50844407 # | Chaetomium thermophilum var thermophilum | 245 | 391 |
|  | 4586347 | Irpex lacteus | 264 | 415 |
|  | 3980202 | Phanerochaete chrysosporium | 258 | 410 |
|  | 27125837 | Melanocarpus albomyces | 273 | 414 |
|  | 171696102 | Podospora anserina | 265 | 415 |
|  | 3913802 | Cochliobolus carbonum | 270 | 416 |
|  | 50403723 | Trichoderma viride | 268 | 411 |
|  | 3913798 | Aspergillus aculeatus | 275 | 425 |
|  | 66828465 | Dictyostelium discoideum | 269 | 419 |
|  | 156060391 | Sclerotinia sclerotiorum 1980 | 252 | 402 |
|  | 116181754 | Chaetomium globosum CBS 148-51 | 263 | 413 |
|  | 145230535 | Aspergillus niger | 274 | 424 |
|  | 46241266 | Nectria haematococca mpVI | 268 | 415 |
|  | 1q9h (PDB) # | Talaromyces emersonii | 248 | 398 |
|  | 157362170 | Polyporus arcularius | 269 | 420 |
|  | 7804885 | Leptosphaeria maculans | 267 | 407 |
|  | 121852 | Phanerochaete chrysosporium | 258 | 409 |
|  | 126013214 | Penicillium decumbens | 264 | 415 |
|  | 156048578 | Sclerotinia sclerotiorum 1980 | 265 | 413 |
|  | 156712278 | Acremonium thermophilum | 269 | 414 |
|  | 21449327 | Aspergillus nidulans | 265 | 410 |
|  | 171683762 | Podospora anserina | 274 | 415 |
|  | 56718412 | Thermoascus aurantiacus var levisporus | 268 | 418 |
|  | 15824273 | Pseudotrichonympha grassii | 263 | 414 |
|  | 115390801 | Aspergillus terreus NIH2624 | 266 | 411 |
|  | 453223 | Phanerochaete chrysosporium | 258 | 409 |
|  | 3132 | Phanerochaete chrysosporium |  | 407 |
|  | 16304152 | Thermoascus aurantiacus | 268 | 417 |
|  | 156712280 | Acremonium thermophilum | 273 | 420 |
|  | 5231154 | Volvariella volvacea | 281 | 438 |
|  | 116200349 | Chaetomium globosum CBS 148-51 | 270 | 412 |
|  | 4586343 | Irpex lacteus | 263 | 414 |
|  | 15321718 | Lentinula edodes |  | 417 |
|  | 146424875 | Pleurotus sp Florida | 267 | 418 |
|  | 62006158 | Fusarium venenatum | 268 | 415 |
|  | 296027 | Phanerochaete chrysosporium | 258 | 409 |
|  | 154449709 | Fusicoccum sp BCC4124 | 272 | 424 |
|  | 169859460 | Coprinopsis cinerea okayama | 269 | 421 |
|  | 50400675 | Trichoderma harzianum | 264 | 407 |
|  | 729649 | Neurospora crassa | 262 | 406 |
|  | 119472134 | Neosartorya fischeri NRRL 181 | 277 | 427 |
|  | 117935080 | Chaetomium thermophilum | 272 | 413 |
|  | 154300584 | Botryotinia fuckeliana B05-10 | 265 | 413 |
|  | 15824271 | Pseudotrichonympha grassii | 263 | 414 |
|  | 4586345 | Irpex lacteus | 263 | 414 |
|  | 46241268 | Gibberella avenacea | 268 | 416 |
|  | 6164684 | Aspergillus niger | 274 | 424 |
|  | 6164682 | Aspergillus niger | 266 | 412 |
|  | 33733371 US6573086-10 | Chrysosporium lucknowense | 269 | 415 |
|  | 29160311 | Thielavia australiensis | 269 | 415 |
|  | 146197087 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402 |
|  | 146197237 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409 |
|  | 146197067 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402 |
|  | 146197407 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197157 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 264 | 410 |
|  | 146197403 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197081 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 410 |
|  | 146197413 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197309 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402 |
|  | 146197227 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404 |
|  | 146197253 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409 |
|  | 146197099 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401 |
|  | 146197409 | uncultured symbiotic protist of Cryptocercus punctulatus | 260 | 411 |
|  | 146197315 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402 |
|  | 146197411 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197161 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 263 | 413 |
|  | 146197323 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402 |
|  | 146197077 | uncultured symbiotic protist of Reticulitermes speratus | 264 | 415 |
|  | 146197089 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 400 |
|  | 146197091 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401 |
|  | 146197097 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402 |
|  | 146197095 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402 |
|  | 146197401 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197225 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404 |
|  | 146197317 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402 |
|  | 146197251 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404 |
|  | 146197319 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402 |
|  | 146197071 | uncultured symbiotic protist of Reticulitermes speratus | 259 | 402 |
|  | 146197075 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402 |
|  | 146197159 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 260 | 410 |
|  | 146197405 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412 |
|  | 146197327 | uncultured symbiotic protist of Mastotermes darwiniensis | 264 | 408 |
|  | 146197261 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404 |























TABLE 3








SEQ ID NO:
Database Accession Number
Species of Origin
Signal Sequence (SS) Start and End Position
Catalytic Domain (CD) Start and End Position
Linker Start and End Position
Cellulose Binding Domain (CBD) Start and End








BD29555*
Unknown
1-25
26-455
456-493
494-529



340514556

Trichoderma reesei

1-17
18-444
445-479
480-514



51243029

Penicillium occitanis

1-25
26-455
456-493
494-529



7cel (PDB) &

Trichoderma reesei

N/A
 1-427
N/A
N/A



67516425

Aspergillus nidulans

1-23
24-457
458-490
491-526




FGSC A4







46107376

Gibberella zeae PH-1

1-17
18-448
449-476
477-512



70992391

Aspergillus fumigatus

1-26
27-460
461-496
497-532




Af293







121699984

Aspergillus clavatus

1-27
27-460
461-503
504-539




NRRL 1







1906845

Claviceps purpurea

1-19
20-449
N/A
N/A



1gpi (PDB) &

Phanerochaete

N/A
 1-424
N/A
N/A





chrysosporium








119468034

Neosartorya fischeri

1-17
18-447
N/A
N/A




NRRL 181







7804883

Leptosphaeria

1-17
18-434
N/A
N/A





maculans








85108032

Neurospora crassa

1-17
18-445
446-485
486-521




N150







169859458

Coprinopsis cinerea

1-18
19-454
N/A
N/A





okayama








154292161

Botryotinia fuckeliana

1-18
19-443
444-555
556-596




B05-10







169615761 #

Phaeosphaeria

1
 2-426
N/A
N/A





nodorum SN15








4883502

Humicola grisea

1-22
23-446
N/A
N/A



950686

Humicola grisea

1-18
19-449
450-489
490-525



124491660

Chaetomium

1-22
23-446
N/A
N/A





thermophilum








58045187

Chaetomium

1-18
19-449
450-494
495-530





thermophilum








169601100 #

Phaeosphaeria

1
2-416
N/A
N/A





nodorum SN15








169870197

Coprinopsis cinerea

1-18
19-454
N/A
N/A





okayama








3913806

Agaricus bisporus

1-18
19-447
448-470
471-506



169611094

Phaeosphaeria

1-18
19-447
N/A
N/A





nodorum SN15








3131

Phanerochaete

1-19
20-443
N/A
N/A





chrysosporium








70991503

Aspergillus fumigatus

1-17
18-447
N/A
N/A




Af293







294196

Phanerochaete

1-18
19-442
443-480
481-516





chrysosporium








18997123

Thermoascus

1-17
18-451
N/A
N/A





aurantiacus








4204214

Humicola grisea var

1-22
23-446
N/A
N/A





thermoidea








34582632

Trichoderma viride

1-18
18-444
445-479
480-514




(also known as









Hypocrea rufa)








156712284

Thermoascus

1-17
18-451
N/A
N/A





aurantiacus








39977899

Magnaporthe grisea

1-17
18-447
N/A
N/A




(oryzae) 70-15







20986705

Talaromyces emersonii

1-18
19-449
N/A
N/A



22138843

Aspergillus oryzae

1-17
18-447
N/A
N/A



55775695

Penicillium

1-25
26-459
460-494
495-529





chrysogenum








171676762

Podospora anserina

1-18
19-450
451-492
493-528



146350520

Pleurotus sp Florida

1-18
19-453
N/A
N/A



37732123

Gibberella zeae

1-17
18-448
449-476
477-512



156055188

Sclerotinia

1-18
19-443
444-546
547-586





sclerotiorum 1980








453224

Phanerochaete

1-18
19-442
443-474
475-510





chrysosporium








50402144

Trichoderma reesei

1-17
18-444
445-478
479-513



115397177

Aspergillus terreus

1-23
24-457
458-505
506-541




NIH2624







154312003

Botryotinia fuckeliana

1-17
18-449
450-480
481-516




B05-10







49333365

Volvariella volvacea

1-18
19-453
N/A
N/A



729650

Penicillium

1-25
26-456
457-502
503-537





janthinellum








146424871

Pleurotus sp Florida

1-18
19-451
452-487
488-523



67538012

Aspergillus nidulans

1-17
18-443
N/A
N/A




FGSC A4







62006162

Fusarium poae

1-17
18-448
449-475
476-511



146424873

Pleurotus sp Florida

1-18
19-451
452-487
488-523



295937

Trichoderma viride

1-17
18-444
445-478
479-513



 6179889 #

Alternaria alternata

1
2-419
N/A
N/A



119483864

Neosartorya fischeri

1-26
27-461
462-499
500-535




NRRL 181







85083281

Neurospora crassa

1-20
21-445
N/A
N/A




OR74A







3913803

Cryphonectria

1-18
19-449
N/A
N/A





parasitica








60729633

Corticium rolfsii

1-18
19-448
449-492
493-528



39971383

Magnaporthe grisea

1-17
18-443
N/A
N/A




70-15







39973029

Magnaporthe grisea

1-19
20-443
N/A
N/A




70-15







1170141

Fusarium oxysporum

1-17
18-448
449-478
479-514



121710012

Aspergillus clavatus

1-17
18-447
N/A
N/A




NRRL 1







17902580

Penicillium

1-25
26-455
456-493
494-529





funiculosum








1346226

Humicola grisea var

1-18
19-449
450-489
490-525





thermoidea








156712282

Chaetomium

1-18
19-449
450-496
497-532





thermophilum








169768818

Aspergillus oryzae

1-25
26-460
N/A
N/A




RIB40







46241270

Gibberella pulicaris

1-17
18-448
449-474
475-510



49333363

Volvariella volvacea

1-18
19-451
452-476
477-512



46395332

Irpex lacteus

1-18
19-447
448-485
486-521



50844407 #

Chaetomium

N/A
 1-424
425-469
470-505





thermophilum var










thermophilum








4586347

Irpex lacteus

1-18
19-448
449-490
491-526



3980202

Phanerochaete

1-18
19-443
444-475
476-511





chrysosporium








27125837

Melanocarpus

1-23
23-447
N/A
N/A





albomyces








171696102

Podospora anserina

1-17
17-448
N/A
N/A



3913802

Cochliobolus

1-18
19-449
N/A
N/A





carbonum








50403723

Trichoderma viride

1-17
18-444
445-479
480-514



3913798

Aspergillus aculeatus

1-22
23-458
459-505
506-540



66828465

Dictyostelium

1-19
20-452
N/A
N/A





discoideum








156060391

Sclerotinia

1-17
18-435
436-470
471-504





sclerotiorum 1980








116181754

Chaetomium globosum

1-17
18-446
N/A
N/A




CBS 148-51







145230535

Aspergillus niger

1-21
22-457
458-500
501-536



46241266

Nectria haematococca

1-18
18-448
449-472
473-508




mpVI







1q9h (PDB) #

Talaromyces emersonii

N/A
 1-431
N/A
N/A



157362170

Polyporus arcularius

1-18
19-453
N/A
N/A



7804885

Leptosphaeria

1-20
21-440
N/A
N/A





maculans








121852

Phanerochaete

1-18
19-442
443-480
481-516





chrysosporium








126013214

Penicillium decumbens

1-17
18-448
N/A
N/A



156048578

Sclerotinia

1-16
17-446
N/A
N/A





sclerotiorum 1980








156712278

Acremonium

1-17
18-447
448-487
488-523





thermophilum








21449327

Aspergillus nidulans

1-17
18-443
N/A
N/A



171683762

Podospora anserina

1-22
23-448
N/A
N/A



56718412

Thermoascus

1-17
18-451
N/A
N/A





aurantiacus var










levisporus








15824273

Pseudotrichonympha

1-20
21-447
N/A
N/A





grassii








115390801

Aspergillus terreus

1-17
18-444
N/A
N/A




NIH2624







453223

Phanerochaete

1-18
19-442
443-474
475-510





chrysosporium








3132

Phanerochaete

1-19
20-436
437-467
468-504





chrysosporium








16304152

Thermoascus

1-17
18-450
N/A
N/A





aurantiacus








156712280

Acremonium

1-21
22-453
N/A
N/A





thermophilum








5231154

Volvariella volvacea

1-15
16-472
473-500
501-536



116200349

Chaetomium globosum

1-20
21-445
N/A
N/A




CBS 148-51







4586343

Irpex lacteus

1-18
19-447
448-481
482-517



15321718

Lentinula edodes

1-18
19-450
451-480
481-516



146424875

Pleurotus sp Florida

1-18
19-451
452-487
488-523



62006158

Fusarium venenatum

1-17
18-448
449-471
472-507



296027

Phanerochaete

1-18
19-442
443-480
481-516





chrysosporium








154449709

Fusicoccum sp

1-19
20-457
N/A
N/A




BCC4124







169859460

Coprinopsis cinerea

1-18
19-454
N/A
N/A





okayama








50400675

Trichoderma

1-17
18-440
441-470
471-505





harzianum








729649

Neurospora crassa

1-17
18-439
440-480
481-516



119472134

Neosartorya fischeri

1-26
27-460
461-494
495-530




NRRL 181







117935080

Chaetomium

1-22
23-446
N/A
N/A





thermophilum








154300584

Botryotinia fuckeliana

1-16
17-446
N/A
N/A




B05-10







15824271

Pseudotrichonympha

1-20
21-447
N/A
N/A





grassii








4586345

Irpex lacteus

1-18
19-447
448-487
488-523



46241268

Gibberella avenacea

1-17
18-449
450-478
478-513



6164684

Aspergillus niger

1-21
22-457
458-500
501-536



6164682

Aspergillus niger

1-17
18-445
N/A
N/A



33733371

Chrysosporium

1-17
18-448
449-490
491-526





lucknowense









U.S. Pat. No. 6,573,086-10







29160311

Thielavia australiensis

1-18
18-448
449-502
503-538



146197087
uncultured symbiotic
1-22
23-435
N/A
N/A




protist of









Reticulitermes speratus








146197237
uncultured symbiotic
1-20
21-442
N/A
N/A




protist of Neotermes









koshunensis








146197067
uncultured symbiotic
1-22
23-435
N/A
N/A




protist of









Reticulitermes speratus








146197407
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197157
uncultured symbiotic
1-20
21-443
N/A
N/A




protist of









Hodotermopsis










sjoestedti








146197403
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197081
uncultured symbiotic
1-22
23-443
N/A
N/A




protist of









Reticulitermes speratus








146197413
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197309
uncultured symbiotic
1-20
21-435
N/A
N/A




protist of Mastotermes









darwiniensis








146197227
uncultured symbiotic
1-19
20-437
N/A
N/A




protist of Neotermes









koshunensis








146197253
uncultured symbiotic
1-21
21-442
N/A
N/A




protist of Neotermes









koshunensis








146197099
uncultured symbiotic
1-22
23-434
N/A
N/A




protist of









Reticulitermes speratus








146197409
uncultured symbiotic
1-19
20-444
N/A
N/A




protist of Cryptocercus









punctulatus








146197315
uncultured symbiotic
1-20
21-435
N/A
N/A




protist of Mastotermes









darwiniensis








146197411
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197161
uncultured symbiotic
1-20
21-446
N/A
N/A




protist of









Hodotermopsis










sjoestedti








146197323
uncultured symbiotic
1-20
21-435
N/A
N/A




protist of Mastotermes









darwiniensis








146197077
uncultured symbiotic
1-21
22-448
N/A
N/A




protist of









Reticulitermes speratus








146197089
uncultured symbiotic
1-22
23-433
N/A
N/A




protist of









Reticulitermes speratus








146197091
uncultured symbiotic
1-22
23-434
N/A
N/A




protist of









Reticulitermes speratus








146197097
uncultured symbiotic
1-22
23-435
N/A
N/A




protist of









Reticulitermes speratus








146197095
uncultured symbiotic
1-22
23-435
N/A
N/A




protist of









Reticulitermes speratus








146197401
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197225
uncultured symbiotic
1-19
20-437
N/A
N/A




protist of Neotermes









koshunensis








146197317
uncultured symbiotic
1-20
21-435
N/A
N/A




protist of Mastotermes









darwiniensis








146197251
uncultured symbiotic
1-19
20-437
N/A
N/A




protist of Neotermes









koshunensis








146197319
uncultured symbiotic
1-20
21-435
N/A
N/A




protist of Mastotermes









darwiniensis








146197071
uncultured symbiotic
1-25
26-435
N/A
N/A




protist of









Reticulitermes speratus








146197075
uncultured symbiotic
1-22
23-435
N/A
N/A




protist of









Reticulitermes speratus








146197159
uncultured symbiotic
1-23
24-443
N/A
N/A




protist of









Hodotermopsis










sjoestedti








146197405
uncultured symbiotic
1-19
20-445
N/A
N/A




protist of Cryptocercus









punctulatus








146197327
uncultured symbiotic
1-20
21-441
N/A
N/A




protist of Mastotermes









darwiniensis








146197261
uncultured symbiotic
1-19
20-437
N/A
N/A




protist of Neotermes









koshunensis
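The range entries in the rows above can be checked mechanically. The following is only an illustrative sketch: it assumes that the four range entries in each row are, in order, signal sequence, catalytic domain, linker and cellulose binding domain (the column header lies outside this excerpt), and the helper names `parse_range` and `check_row` are hypothetical. It parses entries such as `23-458` or `N/A` and reports any pair of consecutive ranges that does not abut, as in the Nectria haematococca row (1-18 followed by 18-448).

```python
from typing import List, Optional, Tuple

Range = Optional[Tuple[int, int]]


def parse_range(cell: str) -> Range:
    """Parse a cell such as '23-458' into (23, 458); 'N/A' becomes None."""
    cell = cell.strip()
    if cell.upper() == "N/A":
        return None
    start, end = cell.split("-")
    return int(start), int(end)


def check_row(cells: List[str]) -> List[str]:
    """Return human-readable problems found in one row of range cells."""
    problems = []
    # Domains marked N/A are skipped; only the ranges present are compared.
    ranges = [r for r in map(parse_range, cells) if r is not None]
    for start, end in ranges:
        if start > end:
            problems.append(f"reversed range {start}-{end}")
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if next_start != prev_end + 1:
            problems.append(
                f"{prev_end} is followed by {next_start}, expected {prev_end + 1}"
            )
    return problems


# Rows copied from the table above.
print(check_row(["1-22", "23-458", "459-505", "506-540"]))  # Aspergillus aculeatus
print(check_row(["1-18", "18-448", "449-472", "473-508"]))  # Nectria haematococca mpVI
print(check_row(["N/A", " 1-431", "N/A", "N/A"]))           # Talaromyces emersonii, 1q9h
```

A flagged row is not necessarily wrong in the underlying database record; the check only points to entries worth comparing against the accession itself.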























TABLE 4









Sequence Identifier (SEQ ID NO:) | Database Accession Number | Species of Origin | Amino Acid Sequence of Fragment of Catalytic Domain Including Loop and Catalytic Residue | Amino Acid Positions of Fragment in Sequence Identifier | Amino Acid Positions of Active Site Loop in Sequence Identifier | Position of Catalytic Residues in Sequence Identifier








BD29555*
Unknown
NVEGWTPSSNNANTGLGNHGACCAELDIWEANS
210-242
214-226
234, 239






340514556

Trichoderma reesei

NVEGWTPSANNANTGIGNHGACCAELDIWEANS
205-237
209-221
229, 234






51243029

Penicillium occitanis

NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS
210-242
214-226
234, 239






7cel (PDB) &

Trichoderma reesei

NVEGWEPSSNNANTGIGGHGSCCSEMDIWQANS
188-220
192-204
212, 217






67516425

Aspergillus nidulans

NVEGWESSDTNPNGGVGNHGSCCAEMDIWEANS
211-243
215-227
235, 240




FGSC A4






46107376

Gibberella zeae PH-1

NSDGWQPSDSDVNGGIGNLGTCCPEMDIWEANS
205-237
209-221
229, 234






70992391

Aspergillus fumigatus

NVEGWQPSSNDANAGTGNHGSCCAEMDIWEANS
214-246
218-230
238, 243




Af293






121699984

Aspergillus clavatus

NVEGWTPSSSDANAGNGGHGSCCAEMDIWEANS
214-246
218-230
238, 243




NRRL 1






1906845

Claviceps purpurea

NSKDWIPSKSDANAGIGSLGACCREMDIWEANN
206-238
210-222
230, 235






1gpi (PDB) &

Phanerochaete

NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN
185-215
189-199
207, 212





chrysosporium







119468034

Neosartorya fischeri

NVEGWKPSSNDKNAGVGGHGSCCPEMDIWEANS
202-234
206-218
226, 231




NRRL 181






7804883

Leptosphaeria

NVEGWQPSKNDQNAGVGGHGSCCAEMDIWEANS
193-225
197-209
217, 222





maculans







85108032

Neurospora crassa

NVEGWTPSTNDANAGIGDHGTCCSEMDIWEANK
205-237
209-221
229, 234




N150 (OR74A)






169859458

Coprinopsis cinerea

NSADWTPSETDPNAGRGRYGICCAEMDIWEANS
207-239
211-223
231, 236





okayama







154292161

Botryotinia

NVEGWVPDSNSANSGTGNIGSCCSEFDVWEANS
203-235
207-219
227, 232





fuckeliana B05-10







169615761 #

Phaeosphaeria

NADGWQASTSDPNAGVGKKGACCAEMDVWEANS
183-215
187-199
207, 212





nodorum SN15







4883502

Humicola grisea

NIEGWRPSTNDPNAGVGPMGACCAEIDVWESNA
208-240
212-224
232, 237






950686

Humicola grisea

NIEGWTGSTNDPNAGAGRYGTCCSEMDIWEANN
207-239
211-223
231, 236






124491660

Chaetomium

NIEGWRPSTNDANAGVGPYGACCAEIDVWESNA
209-241
213-225
233, 238





thermophilum







58045187

Chaetomium

NIENWTPSTNDANAGFGRYGSCCSEMDIWEANN
207-239
211-223
231, 236





thermophilum







169601100 #

Phaeosphaeria

NVEGWKPSDNDANAGVGGHGSCCAEMDIWEANS
174-206
178-190
198, 203





nodorum SN15







169870197

Coprinopsis cinerea

NSVGWEPSETDSNAGRGRYGICCAEMDIWEANS
207-239
211-223
231, 236




okayama






3913806

Agaricus bisporus

NSEGWEGSPNDVNAGTGNFGACCGEMDIWEANS
203-235
207-219
227, 232






169611094

Phaeosphaeria

NVEGWNPSDADPNAGSGKIGACCPEMDIWEANS
208-240
212-224
232, 237





nodorum SN15







3131

Phanerochaete

NVQGWNATS--ATTGTGSYGSCCTELDIWEANS
204-234
208-218
226, 231





chrysosporium







70991503

Aspergillus fumigatus

NVEGWEPSSSDKNAGVGGHGSCCPEMDIWEANS
202-234
206-218
226, 231




Af293






294196

Phanerochaete

NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN
203-233
207-217
225, 230





chrysosporium







18997123

Thermoascus

NVEGWQPSANDPNAGVGNHGSSCAEMDVWEANS
205-237
209-221
229, 234





aurantiacus







4204214

Humicola grisea var

NIEGWRPSTNDPNAGVGPMGACCAEIDVWESNA
208-240
212-224
232, 237





thermoidea







34582632

Trichoderma viride

NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS
205-237
209-221
229, 234




(also known as





Hypocrea rufa)







156712284

Thermoascus

NVEGWQPSANDPNAGVGNHGSCCAEMDVWEANS
205-237
209-221
229, 234





aurantiacus







39977899

Magnaporthe grisea

NVEGWQPSSGDANSGVGNMGSCCAEMDIWEANS
205-237
209-221
229, 234




(oryzae) 70-15






20986705

Talaromyces

NVEGWQPSSNNANTGIGDHGSCCAEMDVWEANS
203-235
207-219
227, 232





emersonii







22138843

Aspergillus oryzae

R-KGWEPSDSDKNAGVGGHGSCCPQMDIWEANS
203-234
206-218
226, 231






55775695

Penicillium

NVEGWEPSSSDVNGGTGNYGSCCAEMDIWEANS
213-245
217-229
237, 242





chrysogenum







171676762

Podospora anserina

NIEGWNPSTNDVNAGAGRYGTCCSEMDIWEANN
207-239
211-223
231, 236






146350520

Pleurotus sp Florida

NVQGWQPSPNDSNAGKGQYGSCCAEMDIWEANS
207-239
211-223
231, 236






37732123

Gibberella zeae

NSDGWQPSDSDVNGGIGNLGTCCPEMDIWEANS
205-237
209-221
229, 234






156055188

Sclerotinia

NNEGWVPDSNSANSGTGNIGSCCSEFDVWEANS
203-235
207-219
227, 232





sclerotiorum 1980







453224

Phanerochaete

NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN
203-233
207-217
225, 230





chrysosporium







50402144

Trichoderma reesei

NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS
205-237
209-221
229, 234






115397177

Aspergillus terreus

NVEGWEPSANDANAGTGNHGSCCAEMDIWEANS
211-243
215-227
235, 240




NIH2624






154312003

Botryotinia

NSVGWTPSSNDVNAGAGQYGSCCSEMDIWEANK
206-238
210-222
230, 235





fuckeliana B05-10







49333365

Volvariella volvacea

NVQGWQPSPNDTNAGTGNYGACCNEMDVWEANS
207-239
211-223
231, 236






729650

Penicillium

NVDGWTPSKNDVNSGIGNHGSCCAEMDIWEANS
211-243
215-227
235, 240





janthinellum







146424871

Pleurotus sp Florida

NILDWSASATDANAGNGRYGACCAEMDIWEANS
206-238
210-222
230, 235






67538012

Aspergillus nidulans

NVEGWEPSDSDANAGVGGMGTCCPEMDIWEANS
202-234
206-218
226, 231




FGSC A4






62006162

Fusarium poae

NSDGWEPSKSDVNGGIGNLGTCCPEMDIWEANS
205-237
209-221
229, 234






146424873

Pleurotus sp Florida

NILDWSGSATDPNAGNGRYGACCAEMDIWEANS
206-238
210-222
230, 235






295937

Trichoderma viride

NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS
205-237
209-221
229, 234






6179889 #

Alternaria alternata

NVEGWKPSSNDANAGVGGHGSCCAEMDIWEANS
177-209
181-193
201, 206






119483864

Neosartorya fischeri

NVEGWTPSSNNENTGLGNYGSCCAELDIWESNS
215-247
219-231
239, 244




NRRL 181






85083281

Neurospora crassa

NIEGWTPSTNDANAGVGPYGGCCAEIDVWESNA
207-239
211-223
231, 236




OR74A






3913803

Cryphonectria

NVEGWTPSTNDANAGVGGLGSCCSEMDVWEANS
206-238
210-222
230, 235





parasitica







60729633

Corticium rolfsii

NLLDWNATS--ANSGTGSYGSCCPEMDIWEANK
206-236
210-220
228, 233






39971383

Magnaporthe grisea

NIEGWQPSSTDSSAGIGAQGACCAEIDIWESNK
205-237
209-221
229, 234




70-15






39973029

Magnaporthe grisea

NIEGWKPSSNDANAGVGPYGACCAEIDVWESNA
206-238
210-222
230, 235




70-15






1170141

Fusarium oxysporum

NSEGWKPSDSDVNAGVGNLGTCCPEMDIWEANS
205-237
209-221
229, 234






121710012

Aspergillus clavatus

NVEGWKPSDNDKNAGVGGYGSCCPEMDIWEANS
202-234
206-218
226, 231




NRRL 1






17902580

Penicillium

NVEGWTPSTNNSNTGIGNHGSCCAELDIWEANS
210-242
214-226
234, 239





funiculosum







1346226

Humicola grisea var

NIEGWTGSTNDPNAGAGRYGTCCSEMDIWEANN
207-239
211-223
231, 236





thermoidea







156712282

Chaetomium

NVGNWTPSTNDANAGFGRYGSCCSEMDVWEANN
207-239
211-223
231, 236





thermophilum







169768818

Aspergillus oryzae

NVEGWVSSTNNANTGTGNHGSCCAELDIWESNS
214-246
218-230
238, 243




RIB40






46241270

Gibberella pulicaris

NSDGWQPSKSDVNAGIGNMGTCCPEMDIWEANS
205-237
209-221
229, 234






49333363

Volvariella volvacea

NVAGWNGSPNDTNAGTGNWGACCNEMDIWEANS
205-237
209-221
229, 234






46395332

Irpex lacteus

NVAGWTGSSSDPNSGTGNYGTCCSEMDIWEANS
202-234
206-218
226, 231






50844407 #

Chaetomium

NIENWTPSTNDANAGFGRYGSCCSEMDIWEANN
182-214
186-198
206, 211





thermophilum var






thermophilum







4586347

Irpex lacteus

NIVDWTASAGDANSGTGSFGTCCQEMDIWEANS
203-235
207-219
227, 232






3980202

Phanerochaete

NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN
203-233
207-217
225, 230





chrysosporium







27125837

Melanocarpus

NIEGWKSSTSDPNAGVGPYGSCCAEIDVWESNA
210-242
214-226
234, 239





albomyces







171696102

Podospora anserina

NVEGWGGAD--GNSGTGKYGICCAEMDIWEANS
206-236
210-220
228, 233






3913802

Cochliobolus

NVEGWNPSDADPNGGAGKIGACCPEMDIWEANS
208-240
212-224
232, 237





carbonum







50403723

Trichoderma viride

NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS
205-237
209-221
229, 234






3913798

Aspergillus aculeatus

NIEGWEPSSTDVNAGTGNHGSCCPEMDIWEANS
210-242
214-226
234, 239






66828465

Dictyostelium

NVDGWIPSTNNPNTGYGNLGSCCAEMDLWEANN
206-238
210-222
230, 235





discoideum







156060391

Sclerotinia

NSVGWTPSSNDVNTGTGQYGSCCSEMDIWEANK
192-224
196-208
216, 221





sclerotiorum 1980







116181754

Chaetomium

NSEGWGGED--GNSGTGKYGTCCAEMDIWEANL
203-233
207-217
225, 230





globosum CBS 148-





51






145230535

Aspergillus niger

NCDGWEPSSNNVNTGVGDHGSCCAEMDVWEANS
209-241
213-225
233, 238






46241266

Nectria

NSDEWKPSDSDKNAGVGKYGTCCPEMDIWEANK
205-237
209-221
229, 234





haematococca mpVI







1q9h (PDB) #

Talaromyces

NVEGWQPSSNNANTGIGDHGSCCAEMDVWEANS
185-217
189-201
209, 214





emersonii







157362170

Polyporus arcularius

NVLDWAGSSNDPNAGTGHYGTCCNEMDIWEANS
208-240
212-224
232, 237






7804885

Leptosphaeria

NAEGWTKSASDPNSGVGKKGACCAQMDVWEANS
204-236
208-220
228, 233





maculans







121852

Phanerochaete

NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN
203-233
207-217
225, 230





chrysosporium







126013214

Penicillium

NVEGWKPSANDKNAGVGPHGSCCAEMDIWEANS
201-233
205-217
225, 230





decumbens







156048578

Sclerotinia

NVDGWVPSSNNPNTGVGNYGSCCAEMDIWEANS
202-234
206-218
226, 231





sclerotiorum 1980







156712278

Acremonium

NIDGWQPSSNDANAGLGNHGSCCSEMDIWEANK
206-238
210-222
230, 235





thermophilum







21449327

Aspergillus nidulans

NVEGWEPSDSDANAGVGGMGTCCPEMDIWEANS
202-234
206-218
226, 231




(also known as





Emericella nidulans)







171683762

Podospora anserina

NIEGWRESSNDENAGVGPYGGCCAEIDVWESNA
211-243
215-227
235, 240




(S mat+)






56718412

Thermoascus

NVEGWQPSANDPNAGVGNHGSCCAEMDVWEANS
205-237
209-221
229, 234





aurantiacus var






levisporus







15824273

Pseudotrichonympha

NVENWKPQTNDENAGNGRYGACCTEMDIWEANK
200-232
204-216
224, 229





grassii







115390801

Aspergillus terreus

NVEGWTPSDNDKNAGVGGHGSCCPELDIWEANS
203-235
207-219
227, 232




NIH2624






453223

Phanerochaete

NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN
203-233
207-217
225, 230





chrysosporium







3132

Phanerochaete

NVEGWLGTT--ATTGTGFFGSCCTDIALWEAND
202-232
206-216
224, 229





chrysosporium







16304152

Thermoascus

NVEGWQPSANDPNAGVGNHGSSCAEMDVWEANS
205-237
209-221
229, 234





aurantiacus







156712280

Acremonium

NSASWQPSSNDQNAGVGGMGSCCAEMDIWEANS
210-242
214-226
234, 239





thermophilum







5231154

Volvariella volvacea

NVQGWQPSPNDTNAGTGNYGACCNKMDVWEANS
220-252
224-236
244, 249






116200349

Chaetomium

NYDGWTPSSNDANAGVGALGGCCAEIDVWESNA
207-239
211-223
231, 236





globosum CBS 148-





51






4586343

Irpex lacteus

NVAGWAGSASDPNAGSGTLGTCCSEMDIWEANN
202-234
206-218
226, 231






15321718

Lentinula edodes

NVEGWTPSSTSPNAGTGGTGICCNEMDIWEANS
208-240
212-224
232, 237






146424875

Pleurotus sp Florida

NVLDWSASATDDNAGNGRYGACCAEMDIWEANS
206-238
210-222
230, 235






62006158

Fusarium venenatum

NSDGWQPSKSDVNGGIGNLGTCCPEMDIWEANS
205-237
209-221
229, 234






296027

Phanerochaete

NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN
203-233
207-217
225, 230





chrysosporium







154449709

Fusicoccum sp

NVQNWTASSTDKNAGTGHYGSCCNEMDIWEANS
209-241
213-225
233, 238




BCC4124






169859460

Coprinopsis cinerea

NSVGWEPSETDPNAGKGQYGICCAEMDIWEANS
207-239
211-223
231, 236




okayama






50400675

Trichoderma

NVEGWEPSSNNANTGVGGHGSCCSEMDIWEANS
201-233
205-217
225, 230





harzianum





(anamorph of





Hypocrea lixii)







729649

Neurospora crassa

NVEGWTPSTNDAN-GIGDHGSCCSEMDIWEANK
200-231
204-215
223, 228




(OR74A)






119472134

Neosartorya fischeri

NVEGWQPSSNDANAGTGNHGSCCAEMDIWEANS
214-246
218-230
238, 243




NRRL 181






117935080

Chaetomium

NIEGWRPSTNDANAGVGPYGACCAEIDVWESNA
209-241
213-225
233, 238





thermophilum







154300584

Botryotinia

NVDGWVPSSNNANTGVGNHGSCCAEMDIWEANS
202-234
206-218
226, 231





fuckeliana B05-10







15824271

Pseudotrichonympha

NVENWKPQTNDENAGNGRYGACCTEMDIWEANK
200-232
204-216
224, 229





grassii







4586345

Irpex lacteus

NVEGWTGSSTDSNSGTGNYGTCCSEMDIWEANS
202-234
206-218
226, 231






46241268

Gibberella avenacea

NSDGWKPSDSDINAGIGNMGTCCPEMDIWEANS
205-237
209-221
229, 234






6164684

Aspergillus niger

NCDGWEPSSNNVNTGVGDHGSCCAEMDVWEANS
209-241
213-225
233, 238






6164682

Aspergillus niger

NVDGWEPSSNNDNTGIGNHGSCCPEMDIWEANK
203-235
207-219
227, 232






33733371

Chrysosporium

NVENWQSSTNDANAGTGKYGSCCSEMDVWEANN
206-238
210-222
230, 235





lucknowense





U.S. Pat. No. 6,573,086-10






29160311

Thielavia

NVEGWESSTNDANAGSGKYGSCCTEMDVWEANN
206-238
210-222
230, 235





australiensis







146197087
uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNM
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197237
uncultured symbiotic
NSEGWKPQSGDKNAGNGKYGSCCSEMDVWESNS
200-232
204-216
224, 229




protist of Neotermes





koshunensis







146197067
uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNM
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197407
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197157
uncultured symbiotic
NVEGWKPSDNDENAGTGKWGACCTEMDIWEANK
201-233
205-217
225, 230




protist of





Hodotermopsis






sjoestedti







146197403
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197081
uncultured symbiotic
NVDDWKPQDNDENSGDGKLGTCCSEMDIWEGNA
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197413
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197309
uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS
196-228
200-212
220, 225




protist of





Mastotermes






darwiniensis







146197227
uncultured symbiotic
NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS
195-227
199-211
219, 224




protist of Neotermes





koshunensis







146197253
uncultured symbiotic
NSEGWKPQSGDKNAGNGKYGSCCSEMDVWESNS
200-232
204-216
224, 229




protist of Neotermes





koshunensis







146197099
uncultured symbiotic
NVLDWKPQSNDENAGTGRYGTCCTEMDIWEANS
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197409
uncultured symbiotic
NVLDWKPQSNDENSGNGRWGARCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197315
uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS
196-228
200-212
220, 225




protist of





Mastotermes






darwiniensis







146197411
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197161
uncultured symbiotic
NVQDWKPSDNDDNAGTGHYGACCTEMDIWEANK
201-233
205-217
225, 230




protist of





Hodotermopsis






sjoestedti







146197323
uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS
196-228
200-212
220, 225




protist of





Mastotermes






darwiniensis







146197077
uncultured symbiotic
NVLDWKPQETDENSGNGRYGTCCTEMDIWEANS
201-233
205-217
225, 230




protist of





Reticulitermes






speratus







146197089
uncultured symbiotic
NVEDWKPQDNDENSGNGKLGTCCSEMDIWEGNA
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197091
uncultured symbiotic
NVLDWKPQSNDENAGTGRYGTCCTEMDIWEANS
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197097
uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197095
uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197401
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCIEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197225
uncultured symbiotic
NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS
195-227
199-211
219, 224




protist of Neotermes





koshunensis







146197317
uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS
196-228
200-212
220, 225




protist of





Mastotermes






darwiniensis







146197251
uncultured symbiotic
NSDGWKPQKNDKNSGNGRYGSCCSEMDVWEANS
195-227
199-211
219, 224




protist of Neotermes





koshunensis







146197319
uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS
196-228
200-212
220, 225




protist of





Mastotermes






darwiniensis







146197071
uncultured symbiotic
NILDWKPSSNDENAGAGRYGTCCTEMDIWEANS
200-232
204-216
224, 229




protist of





Reticulitermes






speratus







146197075
uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA
197-229
201-213
221, 226




protist of





Reticulitermes






speratus







146197159
uncultured symbiotic
NVKDWKPQETDENAGNGHYGACCTEMDIWEANS
197-229
201-213
221, 226




protist of





Hodotermopsis






sjoestedti







146197405
uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS
198-230
202-214
222, 227




protist of





Cryptocercus






punctulatus







146197327
uncultured symbiotic
NSDGWKPQDNDENSGNGKYGSCCSEMDIWEANS
201-233
205-217
225, 230




protist of





Mastotermes






darwiniensis







146197261
uncultured symbiotic
NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS
195-227
199-211
219, 224




protist of Neotermes





koshunensis
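The three position columns of Table 4 follow from the aligned fragment and its start position. In the rows above, the active site loop occupies alignment columns 5 to 17 of the 33-column fragment, and the two listed catalytic residues sit at columns 25 and 30; a gap character (`-`) takes up a column but no residue number. The sketch below makes that bookkeeping explicit. The column indices are read off the table rows rather than stated in the text, so they are an assumption, and the function names are hypothetical.

```python
from typing import Dict, Tuple

# 1-based alignment columns inferred from the rows of Table 4 (an assumption).
LOOP_COLUMNS = (5, 17)
CATALYTIC_COLUMNS = (25, 30)


def column_to_position(aligned: str, start: int, column: int) -> int:
    """Residue number of the residue at a 1-based alignment column.

    Gaps ('-') occupy a column but are not numbered.
    """
    if aligned[column - 1] == "-":
        raise ValueError(f"column {column} is a gap")
    residues_so_far = sum(1 for ch in aligned[:column] if ch != "-")
    return start + residues_so_far - 1


def fragment_positions(aligned: str, start: int) -> Dict[str, Tuple[int, int]]:
    """Fragment span, loop span and catalytic positions for one table row."""
    n_residues = sum(1 for ch in aligned if ch != "-")
    return {
        "fragment": (start, start + n_residues - 1),
        "loop": tuple(column_to_position(aligned, start, c) for c in LOOP_COLUMNS),
        "catalytic": tuple(
            column_to_position(aligned, start, c) for c in CATALYTIC_COLUMNS
        ),
    }


# BD29555 row: fragment 210-242, loop 214-226, catalytic residues 234 and 239.
print(fragment_positions("NVEGWTPSSNNANTGLGNHGACCAELDIWEANS", 210))
# Phanerochaete chrysosporium row (accession 453224), with a two-column gap:
# fragment 203-233, loop 207-217, catalytic residues 225 and 230.
print(fragment_positions("NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN", 203))
```

Counting residues rather than columns is what shortens the loop to 11 positions in the gapped rows while the ungapped rows list 13.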



















TABLE 5






Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/−Cellobiose)± | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (−/+BG)¥
None | 25% | 60%
R273K/R422K | 95% | 84%
R273K/Y274Q/D281K/Y410H/P411G/R422K | 78% | ND


















TABLE 6






Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/−Cellobiose)± | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (−/+BG)¥
None | 23% | 74%
R268K/R411K | 92% | 94%
R268A/R411A | 92% | 95%
R268A/R411K | 97% | 94%
R268K/R411A | 97% | 102%
R268K | ND | 92%
R268A | ND | 86%
R411K | ND | 89%
R411A | ND | 94%
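The substitution labels in Tables 5 and 6 use the usual one-letter convention: original residue, position, replacement residue, with multiple substitutions joined by `/`. As an illustration only (the names `Substitution` and `parse_substitutions` are hypothetical), such a label can be split into its parts as follows.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the residue being replaced
    position: int     # residue number in the reference sequence
    replacement: str  # one-letter code of the new residue


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(label: str) -> List[Substitution]:
    """Split a label such as 'R268K/R411A' into Substitution records."""
    if label.strip().lower() == "none":
        return []
    result = []
    for part in label.split("/"):
        match = _PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {part!r}")
        original, position, replacement = match.groups()
        result.append(Substitution(original, int(position), replacement))
    return result


for label in ["None", "R268K/R411A", "R273K/Y274Q/D281K/Y410H/P411G/R422K"]:
    print(label, "->", parse_substitutions(label))
```

Keeping the position as an integer makes it straightforward to compare a substituted position against the residue ranges listed for the corresponding reference sequence.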

















TABLE 7





SEQ ID NO.
Amino Acid Sequence








MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT



WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC



GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA



NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT



STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS



VNMLWLDSTY PTNATGTPGA ARGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA



SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL






MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD



NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA



LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE



ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY



YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT



NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ



SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL






MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT



WNSAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC



GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSANN ANTGIGNHGA CCAELDIWEA



NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT



STGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDYCA AEISTFGGTA SFNKHGGLTN MAAGMEAGMV LVMSLWDDYA



VNMLWLDSTY PTNATGTPGA ARGTCATTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSTTT TASRTTTTSA



SSTSTSSTST GTGVAGHWGQ CGGQGWTGPT TCVSGTTCTV VNPYYSQCL






ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS



TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT



NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG



DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS



YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS



SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG






MASSFQLYKA LLFFSSLLSA VQAQKVGTQQ AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY TNCYTGNEWD



TSICTSNEVC AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR VYLMDDEDTY TMFYLLNKEF TFDVDVSELP



CGLNGAVYFV SMDADGGKSR YATNEAGAKY GTGYCDSQCP RDLKFINGVA NVEGWESSDT NPNGGVGNHG SCCAEMDIWE



ANSISTAFTP HPCDTPGQTL CTGDSCGGTY SNDRYGGTCD PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT VVTQFLTDDN



TDTGTLSEIK RFYVQNGVVI PNSESTYPAN PGNSITTEFC ESQKELFGDV DVFSAHGGMA GMGAALEQGM VLVLSLWDDN



YSNMLWLDSN YPTDADPTQP GIARGTCPTD SGVPSEVEAQ YPNAYVVYSN IKFGPIGSTF GNGGGSGPTT TVTTSTATST



TSSATSTATG QAQHWEQCGG NGWTGPTVCA SPWACTVVNS WYSQCL






MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG



KVCAEKCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST



AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL



DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ



WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ






MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV GDYTNCYTGN



TWDTTICPDD ATCASNCALE GANYESTYGV TASGNSLRLN FVTTSQQKNI GSRLYMMKDD STYEMFKLLN QEFTFDVDVS



NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD



IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT



DDGTSSGTLK EIKRFYVQNG KVIPNSESTW TGVSGNSITT EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW



DDHSANMLWL DSNYPTTASS TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG STFNSGGSNP GGGTTTTTTT



QPTTTTTTAG NPGGTGVAQH YGQCGGIGWT GPTTCASPYT CQKLNDYYSQ CL






MLPSTISYRI YKNALFFAAL FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI DANWRWVHDV KGYTNCYTGN



TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK FVTQSQQKNI GSRLYMMKDD NTYETFKLLN QEFTFDVDVS



NLPCGLNGAL YFVSMDADGG LSRYTGNEAG AKYGTGYCDS QCPRDLKFIN GLANVEGWTP SSSDANAGNG GHGSCCAEMD



IWEANSISTA YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG TCDPDGCDFN SYRQGNKSFY GPGMTVDTKK KMTVVTQFLT



NDGTATGTLS EIKRFYVQDG KVIANSESTW PNLGGNSLTN DFCKAQKTVF GDMDTFSKHG GMEGMGAALA EGMVLVMSLW



DDHNSNMLWL DSNSPTTGTS TTPGVARGSC DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA



TSTTTTKATT TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL






MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR SGCRSVQGAV TVDANWLWTT VDGSQNCYTG NRWDTSICSS



EKTCSESCCI DGADYAGTYG VTTTGDALSL KFVQQGPYSK NVGSRLYLMK DESRYEMFTL LGNEFTFDVD VSKLGCGLNG



ALYFVSMDED GGMKRFPMNK AGAKFGTGYC DSQCPRDVKF INGMANSKDW IPSKSDANAG IGSLGACCRE MDIWEANNIA



SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD FNSYRLGNTT FYGPGPKFTI DTTRKISVVT QFLKGRDGSL



REIKRFYVQN GKVIPNSVSR VRGVPGNSIT QGFCNAQKKM FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW



LDSTYPTNSR QRGSKRGSCP ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK






ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS



TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT



NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG



DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS



YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS



SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG






MHQRALLFSA LAVAANAQQV GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NTWNTELCPD



NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNNE FTFDVDVSNL PCGLNGALYF



VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWKPSS NDKNAGVGGH GSCCPEMDIW EANSISTAVT



PHPCDDVSQT MCSGDACGGT YSATRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSEM TVVTQFITAD GTDTGALSEI



KRLYVQNGKV IANSVSNVAD VSGNSISSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST



YPTDADPSKP GVARGTCEHG AGDPEKVESQ HPDASVTFSN IKFGPIGSTY KA






MYRSLIFATS LLSLAKGQLV GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS CATNCAIDGA



DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST YQLFKFTGSQ EFTFDVDLSN LPCGLNGALY FVSMDADGGL



KKYPTNKAGA KYGTGYCDAQ CPRDLKFING EGNVEGWQPS KNDQNAGVGG HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ



SRCDGDGCGG TYSADRYAGV CDPDGCDFNS YRMGVKDFYG KGKTVDTSKK FTVVTQFIGS GDAMEIKRFY VQNGKTIPQP



DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF KDKGGMANMP STCNGMVLVM SLWDDHYSNM LWLDSTYPTD KNPDTDAGSG



RGECAITSGV PADVESQHPD ASVIYSNIKF GPINTTFG






MLAKFAALAA LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT SGSTNCYSGN EWDTSLCSTN



TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN IGSRTYLMNG ADAYQGFELL GNEFTFDVDV SGTGCGLNGA



LYFVSMDLDG GKAKYTNNKA GAKYGTGYCD AQCPRDLKYI NGIANVEGWT PSTNDANAGI GDHGTCCSEM DIWEANKVST



AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF NSYRMGNTTF YGEGKTVDTS SKFTVVTQFI KDSAGDLAEI



KRFYVQNGKV IENSQSNVDG VSGNSITQSF CNAQKTAFGD IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS



TYPVEGGPGA YRGECPTTSG VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS



NPSGTGAAHW AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V






MFKKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNSWNSTVC



SDPTTCAQRC ALEGANYQQT YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN



GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSAD WTPSETDPNA GRGRYGICCA EMDIWEANSI



SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT



LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH



MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY






MYSAAVLATF SFLLGAGAQQ VGTSTAETHP ALTVQKCAAG GTCTDESDSI VLDANWRWLH STSGSTNCYT GNTWDTTLCP



DAATCTTNCA LDGADYEGTY GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY



FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG TANVEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL



TPHVCTVDSQ TACTGDDCAS NTGVCDGDGC DFNPYRMGNT TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV



QDDVVYEQPS SDISGVSGNS ITDDFCAAQK TAFGDTDYFT QNGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT



KDASTPGVSR GSCATDSGVP ATVEAASGSA YVTFSSIKYG PIGSTFNAPA DSSSSVSASS SPAPIASSSS SASIAPVSSV



VAAIVSSSAQ AISSAAPVVS SSAQAISSAA PVVSSVVSSA APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS



AASSTGSVPL YGNCTGGKTC SEGTCVVQND YYSQCVASS






MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF DGNEWNKTAC PSNAACTKNC AIEGSDYRGT YGITTSGNSL



TLKFITKGQY STNVGSRTYL MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP EKGQGTPGAK YGTGKLSQCS



VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV WEANSMSTAL TPHSCQPEGY AVCEESNCGG



TYSLDRYAGT CDANGCDFNP YRVGNKDFYG KGKTVDTSKK MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG



NSITQKWCDT QKEVFKEEVY PFNQWGGMAS MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG AARGECAITS



GAPAEVEANN PDASVMFSNI KFGPIGSTFQ QPA






MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC



SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI



NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA



YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ



FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS



YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PNAQVVWSNI RFGPIGSTVN V






MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT GNKWDTSICT



DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN



GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM



ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG



EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL



DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA



TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW YSQCL






MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA



CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG



LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN



AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ



FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YANMLWLDSV



YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V






MMYKKFAALA ALVAGAAAQQ ACSLTTETHP RLTWKRCTSG GNCSTVNGAV TIDANWRWTH TVSGSTNCYT GNEWDTSICS



DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN



GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANIEN WTPSTNDANA GFGRYGSCCS EMDIWDANNM



ATAFTPHPCT IIGQSRCEGN SCGGTYSSER YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS



EIKRFYVQDG KIIANAESKI PGNPGNSITQ EWCDAQKVAF GDIDDFNRKG GMAQMSKALE GPMVLVMSVW DDHYANMLWL



DSTYPIDKAG TPGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSTP SNPTATVAPP TSTTTSVRSS



TTQISTPTSQ PGGCTTQKWG QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL






MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA DYKGTYGITA SGNSLQLKFI



TKGSYSTNIG SRTYLMASDT AYQMFKFDGN KEFTFDVDLS GLPCGFNGAL YFVSMDEDGG LKKYSGNKAG AKYGTGYCDA



QCPRDLKFIN GEGNVEGWKP SDNDANAGVG GHGSCCAEMD IWEANSISTA VTPHACSTIE QTRCDGDGCG GTYSADRYAG



VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK KFTVVTQFIG TGDAMEIKRF YVQGGKTIEQ PASTIPGVEG NSITTKFCDQ



QKQVFGDRYT YKEKGGTANM AKALAQGMVL VMSLWDDHYS NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS



PDATVIYSNI KFGPLNSTY






MLGKIAIASL SFLAIAKGQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNSWNSSVC



SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG TNVGGRVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN



GALYFIQMDA DGGMSSQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSVG WEPSETDSNA GRGRYGICCA EMDIWEANSI



SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTIDT NRKMTVVTQF ITHDNTDTGT



LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH



MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY






MFPRSILLAL SLTAVALGQQ VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR INDFTNCYTG NEWDTSICPD



GVTCAENCAL DGADYAGTYG VTSSGTALTL KFVTESQQKN IGSRLYLMAD DSNYEIFNLL NKEFTFDVDV SKLPCGLNGA



LYFSEMAADG GMSSTNTAGA KYGTGYCDSQ CPRDIKFIDG EANSEGWEGS PNDVNAGTGN FGACCGEMDI WEANSISSAY



TPHPCREPGL QRCEGNTCSV NDRYATECDP DGCDFNSFRM GDKSFYGPGM TVDTNQPITV VTQFITDNGS DNGNLQEIRR



IYVQNGQVIQ NSNVNIPGID SGNSISAEFC DQAKEAFGDE RSFQDRGGLS GMGSALDRGM VLVLSIWDDH AVNMLWLDSD



YPLDASPSQP GISRGTCSRD SGKPEDVEAN AGGVQVVYSN IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG



QGWTGPTACQ SPSTCHVIND FYSQCF






MYRNLALASL SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG KITLDANWRW THVTTGYTNC YDGNSWNTTA



CPDGATCTKN CAVDGADYSG TYGITTSSNS LSIKFVTKGS NSANIGSRTY LMESDTKYQM FNLIGQEFTF DVDVSKLPCG



LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN AGSGKIGACC PEMDIWEANS



ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVVTQ FLGSGSTLSE



IKRFYVQNGK VFKNSDSAIE GVTGNSITES FCAAQKTAFG DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD



STYPTNSTKL GAQRGTCAID SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS






MVDIQIATFL LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL TANGWDPTLC



PDGITCANYC ALDGVSYSST YGITTSGSAL RLQFVTGTNI GSRVFLMADD THYRTFQLLN QELAFDVDVS KLPCGLNGAL



YFVAMDADGG KSKYPGNRAG AKYGTGYCDS QCPRDVQFIN GQANVQGWNA TSATTGTGSY GSCCTELDIW EANSNAAALT



PHTCTNNAQT RCSGSNCTSN TGFCDADGCD FNSFRLGNTT FLGAGMSVDT TKTFTVVTQF ITSDNTSTGN LTEIRRFYVQ



NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK KAFIETNYFA QHGGLAQLGQ ALRTGMVLAF SISDDPANHM LWLDSNFPPS



ANPAVPGVAR GMCSITSGNP ADVGILNPSP YVSFLNIKFG SIGTTFRPA






MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NTWNTELCPD



NESCAQNCAL DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE FTFDVDVSNL PCGLNGALYF



VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS SDKNAGVGGH GSCCPEMDIW EANSISTAVT



PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI



KRLYVQNGKV IANSVSNVAG VSGNSITSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST



YPTDADPSKP GVARGTCEHG AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG






MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP



DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP



HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN



GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK



DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV



TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY






MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD



DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA



LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST



AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL



TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW



LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN






MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC



SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI



NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA



YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ



FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS



YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V






MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD



NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA



LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE



ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY



YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT



NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ



SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL






MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD



DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA



LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST



AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGQIVDTS SKFTVVTQFI TDDGTPSGTL



TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW



LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQYPNSYV IYSNIKVGPI NSTFTAN






MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT IDANWRWTHT TSGYTNCYTG NKWDTSICST



NADCASKCCV DGANYQQTYG ASTSGNALSL QYVTQSSGKN VGSRLYLLES ENKYQMFNLL GNEFTFDVDA SKLGCGLNGA



VYFVSMDADG GQSKYSGNKA GAKYGTGYCD SQCPRDLKYI NGAANVEGWQ PSSGDANSGV GNMGSCCAEM DIWEANSIST



AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA GDCDPDGCDF NSYRQGNRTF YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD



IKRFYVQNGK VIPNSQSTIT GVTGNSVTQD YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD DHHSQMLWLD



STYPTTSTAP GAARGSCSTS SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF KSS






MLRRALLLSS SAILAVKAQQ AGTATAENHP PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT GNTWDPTYCP



DDETCAQNCA LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS TYQIFKLLNR EFSFDVDVSN LPCGLNGALY



FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ CPRDLKFIDG EANVEGWQPS SNNANTGIGD HGSCCAEMDV WEANSISNAV



TPHPCDTPGQ TMCSGDDCGG TYSNDRYAGT CDPDGCDFNP YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD DGTDTGTLSE



IKRFYIQNSN VIPQPNSDIS GVTGNSITTE FCTAQKQAFG DTDDFSQHGG LAKMGAAMQQ GMVLVMSLWD DYAAQMLWLD



SDYPTDADPT TPGIARGTCP TDSGVPSDVE SQSPNSYVTY SNIKFGPINS TFTAS






MHQRALLFSA FWTAVQAQQA GTLTAETHPS LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG NTWDATLCPD



NESCASNCAL DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN EFTFDVDASK LPCGLNGAVY



FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING QVRKGWEPSD SDKNAGVGGH GSCCPQMDIW EANSISTAYT



PHPCDDTAQT MCEGDTCGGT YSSERYAGTC DPDGCDFNAY RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI



KRFYVQGGKV IANAASNVDG VTGNSITADF CTAQKKAFGD DDIFAQHGGL QGMGNALSSM VLTLSIWDDH HSSMMWLDSS



YPEDADATAP GVARGTCEPH AGDPEKVESQ SGSATVTYSN IKYGPIGSTF DAPA






MASTLSFKIY KNALLLAAFL GAAQAQQVGT STAEVHPSLT WQKCTAGGSC TSQSGKVVID SNWRWVHNTG GYTNCYTGND



WDRTLCPDDV TCATNCALDG ADYKGTYGVT ASGSSLRLNF VTQASQKNIG SRLYLMADDS KYEMFQLLNQ EFTFDVDVSN



LPCGLNGALY FVAMDEDGGM ARYPTNKAGA KYGTGYCDAQ CPRDLKFING QANVEGWEPS SSDVNGGTGN YGSCCAEMDI



WEANSISTAF TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT CDPDGCDFNP YRMGNQSFYG PSKIVDTESP FTVVTQFITN



DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS AVTGNSITDS FCSAQKTAFK DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD



DHAANMLWLD STYPTSASST TPGAARGSCD ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG TTTTKVTTTT



ATKTTTTTGP STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL






MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV TLDSNWRWTH TLQGSTNCYS GNEWDTSICT



TGTKCAQNCC VEGAEYAATY GITTSGNQLN LKFVTEGKYS TNVGSRTYLM ENATKYQGFN LLGNEFTFDV DVSNIGCGLN



GALYFVSMDL DGGLAKYSGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WNPSTNDVNA GAGRYGTCCS EMDIWEANNM



ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC DFNSYRMGNK EFYGKGKTVD TTKKMTVVTQ FLKNAAGELS



EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT QDWCDAQKIA FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW



LDSTYPVDAA GRPGAERGAC PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT PTSTRVSSTS



TAAQTPTSAP GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL






MFPYIALVSF SFLSVVLAQQ VGTLTAETHP QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT GNTWDTSLCP



DAATCSRNCA LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL ADDSHYQMFN LKNKEFTFDV DVSQLPCGLN



GALYFSQMDA DGGTGRFPNN KAGAKYGTGY CDSQCPHDIK FINGEANVQG WQPSPNDSNA GKGQYGSCCA EMDIWEANSM



ASAYTPHPCT VTTPTRCQGN DCGDGDNRYG GVCDKDGCDF NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL TSDNTTSGTL



SEIRRLYVQN GRVIQNSKVN IPGMASTLDS ITESFCSTQK TVFGDTNSFA SKGGLRAMGN AFDKGMVLVL SIWDDHEAKM



LWLDSNYPLD KSASAPGVAR GTCATTSGEP KDVESQSPNA QVIFSNIKYG DIGSTYSN






MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG



KVCAERCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST



AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL



DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ



WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ






MYSAAVLATF SFLLGAGAQQ VGTLKTESHP PLTIQKCAAG GTCTDEADSV VLDANWRWLH STSGSTNCYT GNTWDTTLCP



DAATCTANCA FDGADYEGTY GITSSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY



FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVSG GANNEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL



TPHTCTVDGQ TACTGDDCAG NTGVCDADGC DFNPYRMGNT TFYGSGKTID TTKPFSVVTQ FITDDGTETG TLTEIKRFYV



QDDVVYEQPN SDISGVSGNS ITDDFCTAQK TAFGDTDYFS QKGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT



KDASTPGVSR GSCATTSGVP ATVEAASGSA YVTFSSIKYG PIGSTFKAPA DSSSPVVASS SPAAVAAVVS TSSAQAVPSH



PAVSSSQAAV STPEAVSSAP EVPASSSAAQ SVAPTSTKPK CSKVSQSSTL ATSVAAPATT ATSAAVAATS AASSSGSVPL



YGNCTGGKTC SEGTCVVQNP WYSQCVASS






MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWDTSLCP



DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP



HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN



GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK



DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG



QCGGIGYSGS TTCASPYTCH VLNPYYSQCY






MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD



NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA



LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE



ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY



YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT



NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNRG TTTTRRPATT TGSSPGPTQS



HYGQCGGIGY SGPTVCASGT TCQVLNPYYS QCL






MPSTYDIYKK LLLLASFLSA SQAQQVGTSK AEVHPSLTWQ TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY NNCYTGNTWD



TTLCPDDETC ASNCALEGAD YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY KMFKLLNQEF TFDVDVSNLP



CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP RDLKFINGMA NVEGWEPSAN DANAGTGNHG SCCAEMDIWE



ANSISTAYTP HPCDTPGQVM CTGDSCGGTY SSDRYGGTCD PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG



TASGTLSEIK RFYVQNGKVI PNSESTWSGV SGNSITTAYC NAQKTLFGDT DVFTKHGGME GMGAALAEGM VLVLSLWDDH



NSNMLWLDSN YPTDKPSTTP GVARGSCDIS SGDPKDVEAN DANAYVVYSN IKVGPIGSTF SGSTGGGSSS STTATSKTTT



TSATKTTTTT TKTTTTTSAS STSTGGAQHW AQCGGIGWTG PTTCVAPYTC QKQNDYYSQC L






MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG CTTSAQSIVV DANWRWLHST TGSTNCYTGN TWDKTLCPDG



ATCAANCALD GADYSGVYGI TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL LNQEFTFDVD VSNLPCGLNG



ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW TPSSNDVNAG AGQYGSCCSE MDIWEANKIS



AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE



IRRFYVQNGV VIPNSQSTIA GVPGNSITDS FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD



APYPATKSPS APGVTRGSCS ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS VPGTTTVSSA PASTATSGAG



GVAKYAQCGG SGYSGATACV SGSTCVALNP YYSQCQ






MFPAATLFAF SLFAAVYGQQ VGTQLAETHP RLTWQKCTRS GGCQTQSNGA IVLDANWRWV HNVGGYTNCY TGNTWNTSLC



PDGATCAKNC ALDGANYQST YGITTSGNAL TLKFVTQSEQ KNIGSRVYLL ESDTKYQLFN PLNQEFTFDV DVSQLPCGLN



GAVYFSAMDA DGGMSKFPNN AAGAKYGTGY CDSQCPRDIK FINGEANVQG WQPSPNDTNA GTGNYGACCN EMDVWEANSI



STAYTPHPCT QQGLVRCSGT ACGGGSNRYG SICDPDGCDF NSFRMGDKSF YGPGLTVNTQ QKFTVVTQFL TNNNSSSGTL



REIRRLYVQN GRVIQNSKVN IPGMPSTMDS VTTEFCNAQK TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL SIWDDHAANM



LWLDSNYPTD RPASQPGVAR GTCPTSSGKP SDVENSTANS QVIYSNIKFG DIGSTYSA






MKGSISYQIY KGALLLSALL NSVSAQQVGT LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG STNCYTGNTW



DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY QKFNLLNQEF TFDVDVSNLP



CGLNGALYFV DMDADGGMAK YPTNKAGAKY GTGYCDSQCP RDLKFINGQA NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE



ANSISNAVTP HPCDTPSQTM CTGQRCGGTY STDRYGGTCD PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG



TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV SGNSITDSWC NAQKSAFGDT NEFSKHGGMA GMGAGLADGM VLVMSLWDDH



ASDMLWLDST YPTNATSTTP GAKRGTCDIS RRPNTVESTY PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS



SSSKTTTTVT TTTTSSGSSG TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL






MFRTAALTAF TLAAVVLGQQ VGTLTAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS LPVHTNCYTG NAWDASLCPD



PTTCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGPYSK NIGSRVYLLD DADHYKMFDL KNQEFTFDVD MSGLPCGLNG



ALYFSEMPAD GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW INGEANILDW SASATDANAG NGRYGACCAE MDIWEANSEA



TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTSSGNLV



EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM ANGMVLIMSL WSDHAAHMLW



LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP VTSTTSSGPT



TPTGPTGTVP KWGQCGGNGY SGPTTCVAGS TCTYSNDWYS QCL






MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD



NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNL PCGLNGALYF



TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT



PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY



VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD



ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF






MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG CSNVQGSVTI DANWRWTHQV SGSTNCHTGN KWDTSVCTSG



KVCAEKCCVD GADYASTYGI TSSGNQLSLS FVTKGSYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWE PSKSDVNGGI GNLGTCCPEM DIWEANSIST



AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCTTQKKVF GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL



DSTYPTDSTA LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP TNPTNPNPTN PTNPGTVDQW



GQCGGTNYSG PTACKSPFTC KKINDFYSQC Q






MFRTAALTAF TLAAVVLGQQ VGTLAAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDSSLCPN



PTTCATNCAI DGADYSGTYG ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL KNQEFTFDVD MSGLPCGLNG



ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW SGSATDPNAG NGRYGACCAE MDIWEANSEA



TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV



EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL WSDHAAHMLW



LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSVPP VTSTTSSGPT



TPTGPTGTVP KWGQCGGIGY SGPTSCVAGS TCTYSNEWYS QCL






MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD



NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA



LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE



ALTPHPCTTV GQEICEGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY



YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT



DETSSTPGAV RGSSSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNPS GGNPPGGNPP GTTTPRPATS TGSSPGPTQT



HYGQCGGIGY IGPTVCASGS TCQVLNPYYS QCL






MTWQSCTAKG SCTNKNGKIV IDANWRWLHK KEGYDNCYTG NEWDATACPD NKACAANCAV DGADYSGTYG ITAGSNSLKL



KFITKGSYST NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV DVSNLPCGFN GALYFVSMDA DGGLKKYSTN KAGAKYGTGY



CDAQCPRDLK FINGEGNVEG WKPSSNDANA GVGGHGSCCA EMDIWEANSV STAVTPHSCS TIEQSRCDGD GCGGTYSADR



YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD TSKKFTVVTQ FIGTGDAMEI KRFYVQNGKT IAQPASAVPG VEGNSITTKF



CDQQKAVFGD TYTFKDKGGM ANMAKALANG MVLVMSLWDD HYSNMLWLDS TYPTDKNPDT DLGTGRGECE TSSGVPADVE



SQHADATVVY SNIKFGPLNS TFG






MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS CATNQGSVVM DANWRWVHQV GSTTNCYTGN



TWDTSICDTD ETCATECAVD GADYESTYGV TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD NTHYQMFKLL NQEFTFDVDV



SNLPCGLNGA LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI QGQANVEGWT PSSNNENTGL GNYGSCCAEL



DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA GTCDPDGCDF NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI



TDDGTDTGTL SEIRRYYVQN GVTYAQPDSD ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL



WDDYYADMLW LDSTYPTNAS SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS GSSGSGSSGS



ASTSTTSTKT TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY YSQCL






MKAYFEYLVA ALPLLGLATA QQVGKQTTET HPKLSWKKCT GKANCNTVNA EVVIDSNWRW LHDSSGKNCY DGNKWTSACS



SATDCASKCQ LDGANYGTTY GASTSGDALT LKFVTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN



AALYFVAMEE DGGMASYSSN KAGAKYGTGY CDAQCARDLK FVGGKANIEG WTPSTNDANA GVGPYGGCCA EIDVWESNAH



SFAFTPHACK TNKYHVCERD NCGGTYSEDR FAGLCDANGC DYNPYRMGNT DFYGKGKTVD TSKKFTVVSR FEENKLTQFF



VQNGQKIEIP GPKWDGIPSD NANITPEFCS AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV LVMSIWDDHY ANMLWLDSVY



PPEKEGQPGA ARGDCPQSSG VPAEVESQYA NSKVVYSNIR FGPVGSTVNV






MFSKFALTGS LLAGAVNAQG VGTQQTETHP QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT GNTWNTTLCP



DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL IGNEFAFDVD LSKLPCGLNG



ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF INGQGNVEGW TPSTNDANAG VGGLGSCCSE MDVWEANSMD



MAYTPHPCET AAQHSCNADE CGGTYSSSRY AGDCDPDGCD WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI



SQYYIQGGTK IQQPNSTWPT LTGYNSITDD FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD GMVLVMSLWD DHYANMLWLD



STYPVDADAS SPGKQRGTCA TTSGVPADVE SSDASATVIY SNIKFGPIGA TY






MFPAAALLSF TLLAVASAQQ IGTNTAEVHP SLTVSQCTTS GGCTSSTQSI VLDANWRWLH STSGYTNCYT GNQWNSDLCP



DPDTCATNCA LDGASYESTY GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL LNKEFSFDVD ASNIGCGING



AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF INGEANLLDW NATSANSGTG SYGSCCPEMD IWEANKYAAA



YTPHPCSVSG QTRCTGTSCG AGSERYDGYC DKDGCDFNSW RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI



RRLYVQGGTV IQNSVANQPN IPKVNSITDS FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD MVLVMSIWDD YDAEMLWLDS



NYPTSGSAST PGISRGPCSA TSGLPATVES QQASASVTYS NIKWGDIGST YSGSGSSGSS SSSSSSAASA STSTHTSAAA



TATSSAAAAT GSPVPAYGQC GGQSYTGSTT CASPYVCKVS NAYYSQCLPA






MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV IDGNWRWIHN IGGYENCYSG NKWTSVCSTN



ADCATKCAME GAKYQETYGV STSGDALTLK FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK NNEFAFDVDL SSVECGMNSA



LYFVPMKEDG GMSTEPNNKA GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ PSSTDSSAGI GAQGACCAEI DIWESNKNAF



AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY NPYRMGNPDF YGPGKTIDTN RKFTVISRFE NNRNYQILMQ



DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP



EKAGQPGSAR GPCPADGGDP NGVVNQYPNA KVIWSNVRFG PIGSTYQVD






MQLTKAGVFL GALMGGAAAQ QVGTQTAENH PKMTWKKCTG KASCTTVNGE VVIDANWRWL HDASSKNCYD GNRWTDSCRT



ASDCAAKCSL EGADYAKTYG ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL LGNEFAFDVD LSTIECGLNS



ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF VGGKANIEGW KPSSNDANAG VGPYGACCAE IDVWESNAHA



FAFTPHPCTD NKYHVCQDSN CGGTYSDDRF AGKCDANGCD INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV



QNNKRIDMPS PALEGLPATG AITAEYCTNV FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV MSIWDDHYSN MLWLDSVYPP



DKEGSPGAAR GDCPQDSGVP SEVESQIPGA TVVWSNIRFG PVGSTVNV






MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG CSDVKGSVVI DANWRWTHQT SGSTNCYTGN KWDTSICTDG



KTCAEKCCLD GADYSGTYGI TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SGIGCGLNGA



PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI NGVANSEGWK PSDSDVNAGV GNLGTCCPEM DIWEANSIST



AFTPHPCTKL TQHSCTGDSC GGTYSSDRYG GTCDADGCDF NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCSKQKSVF GDIDDFSKKG GWNGMSDALS APMVLVMSLW HDHHSNMLWL



DSTYPTDSTK VGSQRGSCAT TSGKPSDLER DVPNSKVSFS NIKFGPIGST YKSDGTTPNP PASSSTTGSS TPTNPPAGSV



DQWGQCGGQN YSGPTTCKSP FTCKKINDFY SQCQ






MYQRALLFSA LATAVSAQQV GTQKAEVHPA LTWQKCTAAG SCTDQKGSVV IDANWRWLHS TEDTTNCYTG NEWNAELCPD



NEACAKNCAL DGADYSGTYG VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE FTFDVDVSNL PCGLNGALYF



VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE ANVEGWKPSD NDKNAGVGGY GSCCPEMDIW EANSISTAYT



PHPCDGMEQT RCDGNDCGGT YSSTRYAGTC DPDGCDFNSF RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI



RRVYVQGGKV IGNSASNVAG VEGDSITSDF CTAQKKAFGD EDIFSKHGGL EGMGKALNKM ALIVSIWDDH ASSMMWLDST



YPVDADASTP GVARGTCEHG LGDPETVESQ HPDASVTFSN IKFGPIGSTY KSV






MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT



WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSNLPC



GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSTNN SNTGIGNHGS CCAELDIWEA



NSISEALTPH PCDTPGLTVC TADDCGGTYS SNRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTDDGT



SSGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDFCA AELSAFGETA SFTNHGGLKN MGSALEAGMV LVMSLWDDYS



VNMLWLDSTY PANETGTPGA ARGSCPTTSG NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA



STTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL






MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT GNKWDTSICT



DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQHS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN



GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM



ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG



EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL



DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA



TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYICTKLNDW YSQCL






MMYKKFAALA ALVAGASAQQ ACSLTAENHP SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT GNQWDTSLCT



DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN



GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANVGN WTPSTNDANA GFGRYGSCCS EMDVWEANNM



ATAFTPHPCT TVGQSRCEAD TCGGTYSSDR YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS



EIKRFYVQDG KIIANAESKI PGNPGNSITQ EYCDAQKVAF SNTDDFNRKG GMAQMSKALA GPMVLVMSVW DDHYANMLWL



DSTYPIDQAG APGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS



STSSPVSTPT GQPGGCTTQK WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL






MASLSLSKIC RNALILSSVL STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD ANWRWVHQTG SSSNCYTGNK



WDTSYCSTND ACAQKCALDG ADYSNTYGIT TSGSEVRLNF VTSNSNGKNV GSRVYMMADD THYEVYKLLN QEFTFDVDVS



KLPCGLNGAL YFVVMDADGG VSKYPNNKAG AKYGTGYCDS QCPRDLKFIQ GQANVEGWVS STNNANTGTG NHGSCCAELD



IWESNSISQA LTPHPCDTPT NTLCTGDACG GTYSSDRYSG TCDPDGCDFN PYRVGNTTFY GPGKTIDTNK PITVVTQFIT



DDGTSSGTLS EIKRFYVQDG VTYPQPSADV SGLSGNTINS EYCTAENTLF EGSGSFAKHG GLAGMGEAMS TGMVLVMSLW



DDYYANMLWL DSNYPTNEST SKPGVARGTC STSSGVPSEV EASNPSAYVA YSNIKVGPIG STFKS






MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG



KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNAGI GNMGTCCPEM DIWEANSIST



AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL



DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA



QCGGTNYSGP TACKSPFTCK KINDFYSQCQ






MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP SLNWARCTSS GCTNVAGSVT LDANWRWLHT TSGYTNCYTG NSWNTTLCPD



GATCAQNCAL DGANYQSTCG ITTSGNALTL KFVTQGEQKN IGSRVYLMAS ESRYEMFGLL NKEFTFDVDV SNLPCGLNGA



LYFSSMDADG GMAKNPGNKA GAKYGTGYCD SQCPRDIKFI NGEANVAGWN GSPNDTNAGT GNWGACCNEM DIWEANSISA



AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC DPDGCDFNSY RMGDKTYYGP GGTGVDTRSK FTVVTQFLTN NNSSSGTLSE



IRRLYVQNGR VVQNSKVNIP GMSNTLDSIT TGFCDSQKTA FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV WDDHAANMLW



LDSNYPVDAD PSKPGIARGT CSTTSGKPTD VEQSAANSSV TFSNIKFGDI GTTYTGGSVT TTPGNPGTTT STAPGAVQTK



WGQCGGQGWT GPTRCESGST CTVVNQWYSQ CI






MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD



GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE FTFDVDMSNL PCGLNGAVYL



SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVAGWTGSS SDPNSGTGNY GTCCSEMDIW EANSVAAAYT



PHPCSVNQQT RCTGADCGQD ANRYKGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR



FYVQDGKVIP NSKVNIAGCD AVNSITDKFC TQQKTAFGDT NRFADQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD



YPTTADASKP GVARGTCPNT SGVPKDVESQ SGSATVTYSN IKWGDLNSTF SGTASNPTGP SSSPSGPSSS SSSTAGSQPT



QPSSGSVAQW GQCGGIGYSG ATGCVSPYTC HVVNPYYSQC Y






TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS TNCYTGNEWD TSICSDGKSC AQTCCVDGAD YSSTYGITTS



GDSLNLKFVT KHQHGTNVGS RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF VSMDADGGMS KYSGNKAGAK



YGTGYCDAQC PRDLKFINGE ANIENWTPST NDANAGFGRY GSCCSEMDIW EANNMATAFT PHPCTIIGQS RCEGNSCGGT



YSSERYAGVC DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN AESKIPGNPG



NSITQEWCDA QKVAFGDIDD FNRKGGMAQM SKALEGPMVL VMSVWDDHYA NMLWLDSTYP IDKAGTPGAE RGACPTTSGV



PAEIEAQVPN SNVIFSNIRF GPIGSTVPGL DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS TPTSQPGGCT TQKWGQCGGI



GYTGCTNCVA GTTCTELNPW YSQCL






MFHKAVLVAF SLVTIVHGQQ AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT GNTWDASICS



DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET NYQMFKLMNQ EFTFDVDVSN LPCGLNGAVY



FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ CPQDIKFING EANIVDWTAS AGDANSGTGS FGTCCQEMDI WEANSISAAY



TPHPCTVTEQ TRCSGSDCGQ GSDRFNGICD PDGCDFNSFR MGNTEFYGKG LTVDTSQKFT IVTQFISDDG TADGNLAEIR



RFYVQNGKVI PNSVVQITGI DPVNSITEDF CTQQKTVFGD TNNFAAKGGL KQMGEAVKNG MVLALSLWDD YAAQMLWLDS



DYPTTADPSQ PGVARGTCPT TSGVPSQVEG QEGSSSVIYS NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS



QPAQPTQPAG TAAQWAQCGG MGFTGPTVCA SPFTCHVLNP YYSQCY






MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWNTSLCP



DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP



HPCTTTGQTR CSGDDCARNT GLCDHGDGCD FNSFRMGDKT FLGKGMTVDT SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ



NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL SIWDDHAANM LWLDSDYPTD



KDPSAPGVAR GTCATTSGVP SDVESQVPNS QVVFSNIKFG DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP PPTGPTVPQW



GQCGGIGYSG STTCASPYTC HVLNPYYSQC Y






MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV NAEVVIDANW RWLHDDNMQN CYDGNQWTNA



CSTATDCAEK CMIEGAGDYL GTYGASTSGD ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ MFNLMGNELA FDVDLSTVEC



GINSALYFVA MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN IEGWKSSTSD PNAGVGPYGS CCAEIDVWES



NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA NGCDYNPYRM GNPDFYGKGK TLDTSRKFTV VSRFEENKLS



QYFIQDGRKI EIPPPTWEGM PNSSEITPEL CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS



IYPPEKEGQP GAARGDCPTD SGVPAEVEAQ FPDAQVVWSN IRFGPIGSTY DF






MYRSATFLTF ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL DANWRWTHIT NGYTNCYTGN EWNATACPDG



ATCAKNCAVD GADYSGTYGI TTPSSGALRL QFVKKNDNGQ NVGSRVYLMA SSDKYKLFNL LNKEFTFDVD VSKLPCGLNG



AVYFSEMLED GGLKSFSGNK AGAKYGTGYC DSQCPQDIKF INGEANVEGW GGADGNSGTG KYGICCAEMD IWEANSDATA



YTPHVCSVNE QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY RLGNREFYGP GKTVDTTRPF TIVTQFVTDD GTDSGNLKSI



HRYYVQDGNV IPNSVTEVAG VDQTNFISEG FCEQQKSAFG DNNYFGQLGG MRAMGESLKK MVLVLSIWDD HAVNMNWLDS



IFPNDADPEQ PGVARGRCDP ADGVPATIEA AHPDAYVIYS NIKFGAINST FTAN






MYRTLAFASL SLYGAARAQQ VGTSTAENHP KLTWQTCTGT GGTNCSNKSG SVVLDSNWRW AHNVGGYTNC YTGNSWSTQY



CPDGDSCTKN CAIDGADYSG TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM FNLINKEFTF DVDVSKLPCG



LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN GGAGKIGACC PEMDIWEANS



ISTAYTPHPC RGVGLQECSD AASCGDGSNR YDGQCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE



IKRFYVQNGK VYKNSQSAVA GVTGNSITES FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG DHAVNMLWLD



STYPTDADPS KPGAARGTCP TTSGKPEDVE KNSPDATVVF SNIKFGPIGS TFAQPA






MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD



NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA



LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE



ALTPHPCTTV GQEICDGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY



YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT



NETSSTPGAV RGSCSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNSS GGNPPGGNPP GTTTTRRPAT STGSSPGPTQ



THYGQCGGIG YSGPTVCASG STCQVLNPYY SQCL






MVDSFSIYKT ALLLSMLATS NAQQVGTYTA ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT NCYSGNTWDS



SICSTDTTCA SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL YLLADDSTYE TFKLFNREFT FDVDVSNLPC



GLNGALYFVS MDADGGVSRF PTNKAGAKYG TGYCDSQCPR DLKFIDGQAN IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA



NSISSAFTAH PCDSVQQTMC TGDTCGGTYS DTTDRYSGTC DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF TVVTQFITHD



GTDTGTLTEI RRLYVQNGVV IGNGPSTYTA ASGNSITESF CKAEKTLFGD TNVFETHGGL SAMGDALGDG MVLVLSLWDD



HAADMLWLDS DYPTTSCASS PGVARGTCPT TTGNATYVEA NYPNSYVTYS NIKFGTLNST YSGTSSGGSS SSSTTLTTKA



STSTTSSKTT TTTSKTSTTS SSSTNVAQLY GQCGGQGWTG PTTCASGTCT KQNDYYSQCL






MYRILKSFIL LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT GNTWNPTICP



DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK NIGSRLFLMS NESNYQLFHV LGQEFTFDVD VSNLDCGLNG



ALYLVSMDSD GGSARFPTNE AGAKYGTGYC DAQCPRDLKF ISGSANVDGW IPSTNNPNTG YGNLGSCCAE MDLWEANNMA



TAVTPHPCDT SSQSVCKSDS CGGAASSNRY GGICDPDGCD YNPYRMGNTS FFGPNKMIDT NSVITVVTQF ITDDGSSDGK



LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV NEEFCTNQKK VFGDEDSFTK HGGLAKMGEA LKDGMVLVLS LWDDYQANML



WLDSSYPTTS SPTDPGVARG SCPTTSGVPS KVEQNYPNAY VVYSNIKVGP IDSTYKK






MISRVLAISS LLAAARAQQI GTNTAEVHPA LTSIVIDANW RWLHTTSGYT NCYTGNSWDA TLCPDAVTCA ANCALDGADY



SGTYGITTSG NSLKLNFVTK GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL PCGLNGALYF AEMDADGGVS



RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY GSCCSEMDIW EANKISAAYT PHPCSVDGQT



RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG DTGFYGAGLT VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN



SQSKVTGVSG NSITDSFCAA QKTAFGDTNE FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP ASKSPSAAGV



SRGSCSASSG VPADVEANSP GASVTYSNIK WGPINSTYSA GTGSNTGSGS GSTTTLVSSV PSSTPTSTTG VPKYGQCGGS



GYTGPTNCIG STCVSMGQYY SQCQ






MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG SCTKEDTTVV LDANWRWTHV TDGYTNCYTG NAWNETACPD



GKTCAANCAI DGAEYEKTYG ITTPEEGALR LNFVTESNVG SRVYLMAGED KYRLFNLLNK EFTMDVDVSN LPCGLNGAVY



FSEMDEDGGM SRFEGNKAGA KYGTGYCDSQ CPRDIKFING EANSEGWGGE DGNSGTGKYG TCCAEMDIWE ANLDATAYTP



HPCKVTEQTR CEDDTECGAG DARYEGLCDR DGCDFNSFRL GNKEFYGPEK TVDTSKPFTL VTQFVTADGT DTGALQSIRR



FYVQDGTVIP NSETVVEGVD PTNEITDDFC AQQKTAFGDN NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA VYMNWLDSNY



PTDADPTKPG VARGRCDPEA GVPETVEAAH PDAYVIYSNI KIGALNSTFA AA






MSSFQVYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS



ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG



LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN



SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG



TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY



AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK



ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL






MYRAIATASA LLATARAQQV CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST SSSTNCYTGN TWDKTLCPDG



KTCADKCCLD GADYSGTYGV TSSGNQLNLK FVTVGPYSTN VGSRLYLMED ENNYQMFDLL GNEFTFDVDV NNIGCGLNGA



LYFVSMDKDG GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI NGVANSDEWK PSDSDKNAGV GKYGTCCPEM DIWEANKIST



AYTPHPCKSL TQQSCEGDAC GGTYSATRYA GTCDPDGCDF NPYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS



EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD SFCKAQKVAF NDPDDFNKKG GWSGMSDALA KPMVLVMSLW HDHYANMLWL



DSTYPKGSKT PGSARGSCPE DSGDPDTLEK EVPNSGVSFS NIKFGPIGST YTGTGGSNPD PEEPEEPEEP VGTVPQYGQC



GGINYSGPTA CVSPYKCNKI NDFYSQCQ






EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC YTGNTWDPTY CPDDETCAQN CALDGADYEG



TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD DSTYQIFKLL NREFSFDVDV SNLPCGLNGA LYFVAMDADG GVSKYPNNKA



GAKYGTGYCD SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM DVWEANSISN AVTPHPCDTP GQTMCSGDDC



GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT KPFTVVTQFL TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD



ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT



CPTDSGVPSD VESQSPNSYV TYSNIKFGPI NSTFTAS






MFPTLALVSL SFLAIAYGQQ VGTLTAETHP KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH DVGGSTNCYT GNTWDDSLCP



DPTTCAANCA LDGADYSGTY GITTSGNALS LKFVTQGPYS TNIGSRVYLL SEDDSTYEMF NLKNQEFTFD VDMSALPCGL



NGALYFVEMD KDGGSGRFPT NKAGSKYGTG YCDTQCPHDI KFINGEANVL DWAGSSNDPN AGTGHYGTCC NEMDIWEANS



MGAAVTPHVC TVQGQTRCEG TDCGDGDERY DGICDKDGCD FNSWRMGDQT FLGPGKTVDT SSKFTVVTQF ITADNTTSGD



LSEIRRLYVQ NGKVIANSKT QIAGMDAYDS ITDDFCNAQK TTFGDTNTFE QMGGLATMGD AFETGMVLVM SIWDDHEAKM



LWLDSDYPTD ADASAPGVSR GPCPTTSGDP TDVESQSPGA TVIFSNIKTG PIGSTFTS






MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV NGEVVIDANW RWLAHRSGYT NCYTGSEWNQ



SACPNNEACT KNCAIEGSDY AGTYGITTSG NQMNIKFITK RPYSTNIGAR TYLMKDEQNY EMFQLIGNEF TFDVDLSQRC



GMNGALYFVS MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK SASDPNSGVG KKGACCAQMD VWEANSAATA



LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN PFRVGVKDFY GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR



FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP



AKPGVARRDC PTSGGKPSEV EAANPNAQVM FSNIKFGPIG STFAHAA






MFRTATLLAF TMAAMVFGQQ VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP



DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP



HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN



GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK



DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV



TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY






MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG CTSKDGSVVI DANWRWVHSV DGYKNCYTGN EWDSTLCPDD



ATCATNCAVD GADYAGTYGA TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF TFDVDVSTLP CGLNGALYFV



SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN DKNAGVGPHG SCCAEMDIWE ANSISTALTP



HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK



RVYVQNGKVI ANSASDVSGI TGNSITSDFC TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE



KYPTDAAASK AGVSRGTCST DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA






MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS CTTQSGSIVL DGNWRWTHST TSSTNCYTGN TWDATLCPDD



ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV GSRVYLLADN THYKTFNLLN QEFTFDVDVS NLPCGLNGAV



YFANLPADGG ISSTNKAGAQ YGTGYCDSQC PRDGKFINGK ANVDGWVPSS NNPNTGVGNY GSCCAEMDIW EANSISTAVT



PHSCDTVTQT VCTGDNCGGT YSTTRYAGTC DPDGCDFNPY RQGNESFYGP GKTVDTNSVF TIVTQFLTTD GTSSGTLNEI



KRFYVQNGKV IPNSESTISG VTGNSITTPF CTAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS



TYPTTKTGAG GPRGTCSTSS GVPASVEASS PNAYVVYSNI KVGAINSTFG






MYTKFAALAA LVATVRGQAA CSLTAETHPS LQWQKCTAPG SCTTVSGQVT IDANWRWLHQ TNSSTNCYTG NEWDTSICSS



DTDCATKCCL DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL LGQEFTFDVD VSNLGCGLNG



ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF INGQANIDGW QPSSNDANAG LGNHGSCCSE MDIWEANKVS



AAYTPHPCTT IGQTMCTGDD CGGTYSSDRY AGICDPDGCD FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE



IKRFYVQNGK VIPNSESKIA GVSGNSITTD FCTAQKTAFG DTNVFEERGG LAQMGKALAE PMVLVLSVWD DHAVNMLWLD



STYPTDSTKP GAARGDCPIT SGVPADVESQ APNSNVIYSN IRFGPINSTY TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT



TTNPSGPQQT HWGQCGGQGW TGPTVCQSPY TCKYSNDWYS QCL






MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD



NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNF PCGLNGALYF



TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT



PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY



VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD



ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF






MMMKQYLQYL AAGSLMTGLV AGQGVGTQQT ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN CYDGNAWNTA



ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR FYLMNGASKY QMFTLMNNEF TFDVDLSTVE



CGLNSALYFV AMEEDGGMRS YPTNKAGAKY GTGYCDAQCA RDLKFVGGKA NIEGWRESSN DENAGVGPYG GCCAEIDVWE



SNAHAYAFTP HACENNNYHV CERDTCGGTY SEDRFAGGCD ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT VVTRFQDDNL



EQFFVQNGQK ILAPAPTFDG IPASPNLTPE FCSTQFDVFT DRNRFREVGD FPQLNAALRI PMVLVMSIWA DHYANMLWLD



SVYPPEKEGE PGAARGPCAQ DSGVPSEVKA NYPNAKVVWS NIRFGPIGST VNV






MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD



DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA



LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST



AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL



TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW



LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN






MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD



NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS



MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH



ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR



KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI



YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY






MHQRALLFSA LVGAVRAQQA GTLTEEVHPP LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG NTWDESLCPD



NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN EFTFDVDVSN LPCGLNGALY



FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING MANVEGWTPS DNDKNAGVGG HGSCCPELDI WEANSISSAF



TPHPCDDLGQ TMCSGDDCGG TYSETRYAGT CDPDGCDFNA YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL



YVQNGKVIAN AQSNVDGVTG NSITSDFCTA QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI LSIWDDHNSS MMWLDSTYPE



DADASEPGVA RGTCEHGVGD PETVESQHPG ATVTFSKIKF GPIGSTYSSN STA






MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWDTSLCP



DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP



HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN



GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK



DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG



QCGGIGYSGS TTCASPYTCH VLNPCESILS LQRSSNADQY LQTTRSATKR RLDTALQPRK






MRTALALILA LAAFSAVSAQ QAGTITAETH PTLTIQQCTQ SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY SGNTWDAILC



PDPVTCAANC ALDGADYTGT FGILPSGTSV TLRPVDGLGL RLFLLADDSH YQMFQLLNKE FTFDVEMPNM RCGSSGAIHL



TAMDADGGLA KYPGNQAGAK YGTGFCSAQC PKGVKFINGQ ANVEGWLGTT ATTGTGFFGS CCTDIALWEA NDNSASFAPH



PCTTNSQTRC SGSDCTADSG LCDADGCNFN SFRMGNTTFF GAGMSVDTTK LFTVVTQFIT SDNTSMGALV EIHRLYIQNG



QVIQNSVVNI PGINPATSIT DDLCAQENAA FGGTSSFAQH GGLAQVGEAL RSGMVLALSI VNSAADTLWL DSNYPADADP



SAPGVARGTC PQDSASIPEA PTPSVVFSNI KLGDIGTTFG AGSALFSGRS PPGPVPGSAP ASSATATAPP FGSQCGGLGY



AGPTGVCPSP YTCQALNIYY SQCI






MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD



DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA



LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST



AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL



TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FDNTGFFTHG GLQKISQALA QGMVLVMSLW DDHAANMLWL



DSTYPTDADP DTPGVARGTC PTTSGVPADV ESQNPNSYVI YSNIKVGPIN STFTAN






MHKRAATLSA LVVAAAGFAR GQGVGTQQTE THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN CYTGNEWNTT



ICADAASCAS NCVVDGADYQ GTYGASTSGN ALTLKFVTKG SYATNIGSRM YLMASPTKYA MFTLLGHEFA FDVDLSKLPC



GLNGAVYFVS MDEDGGTSKY PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN SASWQPSSND QNAGVGGMGS CCAEMDIWEA



NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS ATRFAGDCDP DGCDWNAYRM GVHDFYGNGK TVDTGKKFSI VTQFKGSGST



LTEIKQFYVQ DGRKIENPNA TWPGLEPFNS ITPDFCKAQK QVFGDPDRFN DMGGFTNMAK ALANPMVLVL SLWDDHYSNM



LWLDSTYPTD ADPSAPGKGR GTCDTSSGVP SDVESKNGDA TVIYSNIKFG PLDSTYTAS






MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA NWRWTHSTSG STNCYTGNTW QATLCPDGKT



CAANCALDGA DYTGTYGVTT SGNSLTLQFV TQSNVGARLG YLMADDTTYQ MFNLLNQEFW FDVDMSNLPC GLNGALYFSA



MARTAAWMPM VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR DIKFINGEAN VQGWQPSPND TNAGTGNYGA



CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN RYGSICDHDG LGFQNLFGMG RTRVRARVGR VKQFNRSSRV



VEPISWTKQT TLHLGNLPWK SADCNVQNGR VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL



RRGMVLVLSI WDDHAANMLW LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF TNIKFGPFNS TGTTYTTGSV



PTTSTSTGTT GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA SPTTCHVLNP YYSQCY






MKQYLQYLAA ALPLMSLVSA QGVGTSTSET HPKITWKKCS SGGSCSTVNA EVVIDANWRW LHNADSKNCY DGNEWTDACT



SSDDCTSKCV LEGAEYGKTY GASTSGDSLS LKFLTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN



SALYFVAMEE DGGMASYSTN KAGAKYGTGY CDAQCARDLK FVGGKANYDG WTPSSNDANA GVGALGGCCA EIDVWESNAH



AFAFTPHACE NNNYHVCEDT TCGGTYSEDR FAGDCDANGC DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ FQENKLTQFF



VQNGKKIEIP GPKHEGLPTE SSDITPELCS AMPEVFGDRD RFAEVGGFDA LNKALAVPMV LVMSIWDDHY ANMLWLDSSY



PPEKAGTPGG DRGPCAQDSG VPSEVESQYP DATVVWSNIR FGPIGSTVQV






MFPKASLIAL SFIAAVYGQQ VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG NSWDATLCPD



ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA YQMFQLLNQE FTFDVDMSNL PCGLNGALYL



SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC PHDIKFINGM ANVAGWAGSA SDPNAGSGTL GTCCSEMDIW EANNDAAAFT



PHPCSVDGQT QCSGTQCGDD DERYSGLCDK DGCDFNSFRM GDKSFLGKGM TVDTSRKFTV VTQFVTTDGT TNGDLHEIRR



LYVQDGKVIQ NSVVSIPGID AVDSITDNFC AQQKSVFGDT NYFATLGGLK KMGAALKSGM VLAMSVWDDH AASMQWLDSN



YPADGDATKP GVARGTCSAD SGLPTNVESQ SASASVTFSN IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ



GTVAQWGQCG GTGFTGPTVC ASPFTCHVVN PYYSQCY






MFRTAALLSF AYLAVVYGQQ AGTSTAETHP PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT GNEWNTTVCP



DGTTCAANCA LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF NPLNQEFTFD VDVSALPCGL



NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI KFIEGKANVE GWTPSSTSPN AGTGGTGICC NEMDIWEANS



ISEALTPHPC TAQGGTACTG DSCSSPNSTA GICDQAGCDF NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL



TAIRRIYVQN GQVIQNSMSN IAGVTPTNEI TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA FSRGMVLVLS IWDDDAAEML



WLDSTYPVGK TGPGAARGTC ATTSGQPDQV ETQSPNAQVV FSNIKFGAIG STFSSTGTGT GTGTGTGTGT GTTTSSAPAA



TQTKYGQCGG QGWTGATVCA SGSTCTSSGP YYSQCL






MFRTAALTAF TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDPALCPD



PATCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGQYSQ NIGSRVYLLD DADHYKLFDL KNQEFTFDVD MSGLPCGLNG



ALYFSEMAAD GGKAAHAGNN AGAKYGTGYC DAQCPHDIKW INGEANVLDW SASATDDNAG NGRYGACCAE MDIWEANSEA



TAYTPHVCRD EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KVTVVTQFIT DNNTPTGNLV



EIRRVYVQNG VVYQNSFSTF PSLSQYNSIS DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW



LDSDYPLDKS PSEPGVSRGA CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP VSSGGSSVPP VTSTTSSGTT



TPTGPTGTVP KWGQCGGIGY SGPTACVAGS TCTYSNDWYS QCL






MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG



KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGTYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNGGI GNLGTCCPEM DIWEANSIST



AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS



EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDIDDFEKKG AWGGMSDALE APMVLVMSLW HDHHSNMLWL



DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG



GTNYSGPTAC KSPFTCKKIN DFYSQCQ






MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP



DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY



LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP



HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN



GKVIQNSSVK IPGIDLVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK



DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV



TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY






MYQTSLLASL SFLLATSQAQ QVGTQTAETH PKLTTQKCTT AGGCTDQSTS IVLDANWRWL HTVDGYTNCY TGQEWDTSIC



TDGKTCAEKC ALDGADYEST YGISTSGNAL TMNFVTKSSQ TNIGGRVYLL AADSDDTYEL FKLKNQEFTF DVDVSNLPCG



LNGALYFSEM DSDGGLSKYT TNKAGAKYGT GYCDTQCPHD IKFINGEANV QNWTASSTDK NAGTGHYGSC CNEMDIWEAN



SQATAFTPHV CEAKVEGQYR CEGTECGDGD NRYGGVCDKD GCDFNSYRMG NETFYGSNGS TIDTTKKFTV VTQFITADNT



ATGALTEIRR KYVQNDVVIE NSYADYETLS KFNSITDDFC AAQKTLSGDT NDFKTKGGIA RMGESFERGM VLVMSVWDDH



AANALWLDSS YPTDADASKP GVKRGPCSTS SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA






MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNAWNSSVC



SDGATCAQRC ALEGANYQQT YGITTSGDAL TIKFLTRSEQ TNIGARVYLM ENEDRYQMFN LLNKEFTFDV DVSKVPCGIN



GALYFIQMDA DGGLSSQPNN RAGAKYGTGY CDSQCPRDIK FINGEANSVG WEPSETDPNA GKGQYGICCA EMDIWEANSI



SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT



LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITQEFCDDA KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH



MLWLDSNYPT DADPNKPGIA RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN






MYRKLAVISA FLAAARAQQV CTQQAETHPP LTWQKCTASG CTPQQGSVVL DANWRWTHDT KSTTNCYDGN TWSSTLCPDD



ATCAKNCCLD GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF SFDVDVSQLP CGLNGALYFV



SMDADGGQSK YPGNAAGAKY GTGYCDSQCP RDLKFINGQA NVEGWEPSSN NANTGVGGHG SCCSEMDIWE ANSISEALTP



HPCETVGQTM CSGDSCGGTY SNDRYGGTCD PDGCDWNPYR LGNTSFYGPG SSFALDTTKK LTVVTQFATD GSISRYYVQN



GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA FGGTSFTDKG GLAQINKAFQ GGMVLVMSLW DDYAVNMLWL DSTYPTNATA



STPGAKRGSC STSSGVPAQV EAQSPNSKVI YSNIRFGPIG STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT



GWTGPTRCAS GYTCQVLNPF YSQCL






MRASLLAFSL AAAVAGGQQA GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL DANWRWTHAT SGSTKCYTGN KWQATLCPDG



KSCAANCALD GADYTGTYGI TGSGWSLTLQ FVTDNVGARA YLMADDTQYQ MLELLNQELW FDVDMSNIPC GLNGALYLSA



MDADGGMRKY PTNKAGAKYA TGYCDAQCPR DLKYINGIAN VEGWTPSTND ANGIGDHGSC CSEMDIWEAN KVSTAFTPHP



CTTIEQHMCE GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG NTTFYGEGKT VDTSSKFTVV TQFIKDSAGD LAEIKAFYVQ



NGKVIENSQS NVDGVSGNSI TQSFCKSQKT AFGDIDDFNK KGGLKQMGKA LAQAMVLVMS IWDDHAANML WLDSTYPVPK



VPGAYRGSGP TTSGVPAEVD ANAPNSKVAF SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS TSTASNPSGT



GAAHWAQCGG IGFSGPTTCP EPYTCAKDHD IYSQCV






MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV GDYTNCYTGN



TWDKTLCPDD ATCASNCALE GANYQSTYGA TTSGDSLRLN FVTTSQQKNI GSRLYMMKDD TTYEMFKLLN QEFTFDVDVS



NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD



IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT



DDGTASGTLK EIKRFYVQNG KVIPNSESTW SGVGGNSITN DYCTAQKSLF KDQNVFAKHG GMEGMGAALA QGMVLVMSLW



DDHAANMLWL DSNYPTTASS STPGVARGTC DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP GGGTTTTAKP



TTTTTTAGSP GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL






MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA



CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG



LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN



AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ



FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YASMLWLDSV



YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V






MTSRIALVSL FAAVYGQQVG TYQTETHPSL TWQSCTAKGS CTTNTGSIVL DGNWRWTHGV GTSTNCYTGN TWDATLCPDD



ATCAQNCALE GADYSGTYGI TTSGNSLRLN FVTQSANKNI GSRVYLMADT THYKTFNLLN QEFTFDVDVS NLPCGLNGAV



YFANLPADGG ISSTNTAGAE YGTGYCDSQC PRDMKFIKGQ ANVDGWVPSS NNANTGVGNH GSCCAEMDIW EANSISTAVT



PHSCDTVTQT VCTGDDCGGT YSSSRYAGTC DPDGCDFNSY RMGDETFYGP GKTVDTNSVF TVVTQFLTTD GTASGTLNEI



KRFYVQDGKV IPNSYSTISG VSGNSITTPF CDAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS



TYPVGKTSAG GPRGTCDTSS GVPASVEASS PNAYVVYSNI KVGAINSTYG






MFVFVLLWLT QSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD



NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS



MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH



ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR



KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI



YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY






MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQKCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD



GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQMFQLINQE FTFDVDMSNL PCGLNGAVYL



SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW EANSVAAAYT



PHPCSVNQQT RCTGADCGQG DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT TSGNLAEIRR



FYVQDGNVIP NSKVSIAGID AVNSITDDFC TQQKTAFGDT NRFAAQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD



YPTTADASNP GVARGTCPTT SGFPRDVESQ SGSATVTYSN IKWGDLNSTF TGTLTTPSGS SSPSSPASTS GSSTSASSSA



SVPTQSGTVA QWAQCGGIGY SGATTCVSPY TCHVVNAYYS QCY






MYRAIATASA LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT SSSTNCYTGN KWDTSVCTSG



ETCAQKCCLD GADYAGTYGI TSSGNQLSLG FVTKGSFSTN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA



LYFVSMDADG GKARYPANKA GAKYGTGYCD AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST



AFTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD TTKKVTVVTQ FKKGSNGRLS



EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF GDIDDFSKKG GWSGMSDALE SPPMVLVMSL WHDHHSNMLW



LDSTYPTDST KLGAQRGSCA TTSGVPSDLE RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG



QYGQCGGQTY TGPKDCKSPY TCKKINDFYS QCQ






MSSFQIYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS



ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG



LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN



SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG



TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY



AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK



ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL






MHQRALLFSA LLTAVRAQQA GTLTEEVHPS LTWQKCTSEG SCTEQSGSVV IDSNWRWTHS VNDSTNCYTG NTWDATLCPD



DETCAANCAL DGADYESTYG VTTDGDSLTL KFVTGSNVGS RLYLMDTSDE GYQTFNLLDA EFTFDVDVSN LPCGLNGALY



FTAMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFIDG QANVDGWEPS SNNDNTGIGN HGSCCPEMDI WEANKISTAL



TPHPCDSSEQ TMCEGNDCGG TYSDDRYGGT CDPDGCDFNP YRMGNDSFYG PGKTIDTGSK MTVVTQFITD GSGSLSEIKR



YYVQNGNVIA NADSNISGVT GNSITTDFCT AQKKAFGDED IFAEHNGLAG ISDAMSSMVL ILSLWDDYYA SMEWLDSDYP



ENATATDPGV ARGTCDSESG VPATVEGAHP DSSVTFSNIK FGPINSTFSA SA






MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG SCTSVQGSIT IDANWRWTHR TDSATNCYEG NKWDTSYCSD



GPSCASKCCI DGADYSSTYG ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL LGNEFTFDVD VSNLGCGLNG



ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DSQCPRDLKF INGEANVENW QSSTNDANAG TGKYGSCCSE MDVWEANNMA



AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE



IKRFYVQNGK VIPNSESTIP GVEGNSITQD WCDRQKAAFG DVTDXQDKGG MVQMGKALAG PMVLVMSIWD DHAVNMLWLD



STWPIDGAGK PGAERGACPT TSGVPAEVEA EAPNSNVIFS NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS



GSSGPTGGTG VAKHYEQCGG IGFTGPTQCE SPYTCTKLND WYSQCL






MYAKFATLAA LVAGASAQAV CSLTAETHPS LTWQKCTAPG SCTNVAGSIT IDANWRWTHQ TSSATNCYSG SKWDSSICTT



GTDCASKCCI DGAEYSSTYG ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL LGNEFTFDVD VSNLGCGLNG



ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW ESSTNDANAG SGKYGSCCTE MDVWEANNMA



TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY AGVCDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE



IKRFYAQDGK VIPNSESTIA GIPGNSITKA YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD DHAVNMLWLD



STYPTDQVGV AGAERGACPT TSGVPSDVEA NAPNSNVIFS NIRFGPIGST VQGLPSSGGT SSSSSAAPQS TSTKASTTTS



AVRTTSTATT KTTSSAPAQG TNTAKHWQQC GGNGWTGPTV CESPYKCTKQ NDWYSQCL






MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSDTCSQKCY



IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNMK SQAYTVHACT



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI



NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSSDSTA



QRGPCPTSSG VPKDVESQHG DATVVFSDIK FGAINSTFKY N






MLAAALFTFA CSVGVGTKTP ENHPKLNWQN CASKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS LCPDDKTCSD



KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY VFKLNNKEFT FSVDVSKLPC GLNGALYFVE



MDADGGKAKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSQATALTPH



VCKTTGQQRC SGKSECGGQD GQDRFAGLCD EDGCDFNNWR MGDKTFFGPG LIVDTKSPFV VVTQFYGSPV TEIRRKYVQN



GKVIENSKSN IPGIDATAAI SDHFCEQQKK AFGDTNDFKN KGGFAKLGQV FDRGMVLVLS LWDDHQVAML WLDSTYPTNK



DKSQPGVDRG PCPTSSGKPD DVESASADAT VVYGNIKFGA LDSTY






MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSNTCSQKCY



IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNMK SQAYTVHACT



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI



NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA



SRGPCAVSSG VPKDVESQYG DATVIYSDIK FGAINSTFKW N






MILALLSLAK SLGIATNQAE THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC



DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD



EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC



TVTGLRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY



VQGGKVIENS KVNIAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP



TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY






MLVIALILRG LSVGTGTQQS ETHPSLSWQQ TSKGGSGQSV SGSVVLDSNW RWTHTTDGTT NCYDGNEWSS DLCPDASTCS



SNCVLEGADY SGTYGITGSG SSLKLGFVTK GSYSTNIGSR VYLLGDESHY KLFKLENNEF TFTVDDSNLE CGLNGALYFV



AMDEDGGASK YSGAKPGAKY GMGYCDAQCP HDMKFINGDA NVEGWKPSDN DENAGTGKWG ACCTEMDIWE ANKYATAYTP



HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR MGNQSFWGPG LIIDTGKPVT VVTQFLADGG SLSEIRRKYV



QGGKVIENTV TKISGMDEFD SITDEFCNQQ KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT



DSGSKAGADR GPCATSSGVP KDVESNYASA SVTFSDIKFG PIDSTY






MLLALFAFGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC



DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD



EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC



TVTGIRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY



VQGGKVIENS KVNIAGMAAG NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDSGMVL VLSLWDDHSV NMLWLDSTYP



TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY






MLASVVYLVS LVVSLEIGTQ QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL HDSGTTNCYD GNLWSDDLCP NADTCSSKCY



IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GDGKLGTCCS EMDIWEGNAK SQAYTVHACS



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI



NNSKTSNLAD TYDSITDKFC DATKDATGDT NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV



QAVDRVLCRR VFQRMLKASM VMLQSRTRTL SLELSTRPLV GISPAGRLFF F






MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC



DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD



EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC



TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY



VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP



TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY






MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL



EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED



GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN



LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE



NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS



DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK






MLGALVALAS CIGVGTNTPE KHPDLKWTNG GSSVSGSIVV DSNWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVLE



GADYSGTYGV TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG



GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSMAT AYTPHVCDKL



EQTRCSGSAC GQNGGGDRFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI



DNSMTNIAAM SKQYNSVSDE FCQAQKKAFG DNDSFTKHGG FRQLGATLSK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP



GADRGPCKTS SGVPSDVESQ NADSTVKYSD IRFGAIDSTY SK






MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ CACKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS LCPDDKTCSD



KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL YLLKDKSTYY VFQLNNKEFT FSVDVSKLPC GLNGALYFVE



MDADGGKSKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSMATALTPH



VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR MGDKTFFGPG LTVDTKSPFV VVTQFYGSPV TEIRRKYVQN



GKVIENAKSN IPGIDATNAI SDTFCEQQKK AFGDTNDFKN KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK



DKSVPGVDRG PCPTSSGKPD DVESASGDAT VVYGNIKFGA LDSTY






MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP DPEKCSQNCY



LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYPMFK LKNKEFTFTV DVSNLPCGLN GALYFVAMPS



DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA GTGRYGTCCT EMDIWEANSQ ATAYTVHACS



KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN



SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PTDSTAIGAS



RGPCATSSGD PKDVESASAN ASVKFSDIKF GALDSTY






MLASLLPLSN SLGTASNQAE THPKLTWTQY TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC PDPTTCSNNC



NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF KLKNKEFTFT VDDSKLPCGL NGAVYFVAMD



EDGGTAKHSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRWGARC TEMDIWEANS RATAYTPHIC



TKTGLYRCEG TECGDSDTNR YGGVCDKDGC DFNSYRMGDK SFFGQGKTVD SSKPVTVVTQ FITDNNQDSG KLTEIRRKYV



QGGKVIDNSK VNIAGITAGN PITDTFCDEA KKAFGDNNDF EKKGGLSALG TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT



NASPGALGVE RGDCAITSGV PADVESQSAD ASVTFSDIKF GPIDSTY






MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL



EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLSG ALYHVNMDED



GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN



LQQTRCQGAA CGENGGGSRF GSSCDPDGCD FNSWGMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE



NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS



DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK






MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC



DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD



EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC



TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GILSETRRKY



VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP



TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY






MIGIVLIQTV FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW RWTHIPDGTT NCYDGNEWSS DLCPDPTTCS



NNCVLEGADY SGTYGISTSG SSAKLGFVTK GSYSTNIGSR VYLLGDESHY KIFDLKNKEF TFTVDDSNLE CGLNGALYFV



AMDEDGGASR FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN DDNAGTGHYG ACCTEMDIWE ANKYATAYTP



HICTENGEYR CEGKSCGDSS DDRYGGVCDK DGCDFNSWRL GNQSFWGPGL IIDTGKPVTV VTQFVTKDGT DSGALSEIRR



KYVQGGKTIE NTVVKISGID EVDSITDEFC NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH DVNMLWLDSV



YPTNPAGKAG ADRGPCATSS GDPKEVEDKY ASASVTFSDI KFGPIDSTY






MLVFGIVSFV YSIGVGTNTA ETHPKLTWKN GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL



EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSQLPCGLNG ALYFVCMDQD



GGMSRYPDNQ AGAKYGTGYC DAQCPTDLKF INGLPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ



VGQTRCEGRA CGENGGGDRF GSICDPDGCD FNSWRMGNKT FWGPGLIIDT KKPVTVVTQF IGSPVTEIKR EYVQGGKVIE



NSYTNIEGMD KFNSISDKFC TAQKKAFGDN DSFTKHGGFS KLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKLGS



DRGPCPTSSG VPADVESKNA DSSVKYSDIR FGSIDSTYK






MLSFVFLLGF GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT NCYDGNEWSS DLCPDPETCS



KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR VYLKKDTNTY QIFKLKNHEF TFTVDVSNLP CGLNGALYFV



EMEADGGKGK YPLAKPGAQY GMGYCDAQCP HDMKFINGNA NVLDWKPQET DENSGNGRYG TCCTEMDIWE ANSQATAYTP



HICTKDGQYQ CEGTECGDSD ANQRYNGVCD KDGCDFNSYR LGNKTFFGPG LIVDSKKPVT VVTQFITSNG QDSGDLTEIR



RIYVQGGKTI QNSFTNIAGL TSVDSITEAF CDESKDLFGD TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD HSVNMLWLDS



TYPTDAAAGA LGTQRGPCAT SSGAPSDVES QSPDASVTFS DIKFGPLDST Y






MLTLVVYLLS LVVSLEIGTQ QSESHPALTW QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP SSDTCTQKCY



IEGADYSGTY GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA



DGGKQKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVED WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT



KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY ASYRWGDHSF YGEGKTVDTK QPITVVTQFI GDPLTEIRRL YIQGGKVINN



SKTQNLASVY DSITDAFCDA TKAASGDTND FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP TDSRDATAER



GPCATSSGVP KDVESNQADA SVVFSDIKFG AINSTYSYN






MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP DPEKCSQNCY



LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYQMFK LKNKEFTFTV DVSNLPCGLN GALYFVAMPS



DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA GTGRYGTCCT EMDIWEANSQ ATAYTVHACS



KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN



SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PSNSTAIGAT



RGPCATSSGD PKNVESASAN ASVKFSDIKF GAFDSTY






MLALVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY



IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV DDSQLNCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI



NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH TANMLWLDST YPTDSTKTGA



SRGPCAVTSG VPKDVESQYG SAQVVYSDIK FGAINSTY






MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCG SSDTCSSKCY



IEGADYSGTY GISASGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKGKEFTFTV DDSKLDCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTID TKQPVTVVTQ FIGDPLTEIR RVYVQGGKVI



NNSKTSNLAN VYDSITDKFC DDTKDATGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA



SRGPCAVLSG VPKNVESQHG DATVIYSDIK FGAINSTFSY N






MFLALFVLGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPQTCSSNC



DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAME



EDGGVAKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC IEMDIWEANS MATAYTPHVC



TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDG GTLSEIKRKY



VQGGKVIENS KVNIAGITAV NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP



TDAAAGALGT ERGACATSSG KPSDVESQSP DASVTFSDIK FGPIDSTY






MLLCLLSIAN SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVIE



GADYQGTYGV SSSGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG



GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL



EQTRCSGSSC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI



DNSMSNIAGM SKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP



GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K






MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL



EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED



GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDT



LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGSPVTEIKR KYVQNGKVIE



NSFSNIEGMD KFNSISDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS



DRGPCPTTSG VPADVESKSA NANVIYSDIR FGAIDSTYK






MLLCLLGIAS SLDAGTNTAE NHPQLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGQNCVIE



GADYQGTYGV SASGNALTLT FVTHGQYSTN VGSRLYLLKD EKTYQIFNLI GKEFTFTVDV SNLPCGLNGA LYFVQMDADG



GTAKYSDNKA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GRYGSCCSEM DVWEANSLAT AYTPHVCDKL



EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF NSWRLGNKTF WGPGLIVDTK QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI



DNSFTKLDSL TKQYNSVSDE FCVAQKKAFG DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP



GADRGPCKTS SGVPADVESQ AASSSVKYSD IRFGAIDSTY K






MLGIGFVCIV YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG NLWSKDLCPD AATCGKNCVL



EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQIFNL NGKEFTFTVD VSNLPCGLNG ALYFVNMDAD



GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ



VGQTRCEGRA CGENGGGDRF GSSCDPDGCD FNSWRLGNKT FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR KYVQGGKVIE



NSYTNIEGLD KFNSISDKFC TAQKKAFGDN DSFIKHGGFR QLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKPG



DRGPCPTSSG VPADVESKNA GSSVKYSDIR FGSIDSTYK






MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ TGSIVIDSNW RWLHDSGTTN CYDGNEWSSD LCPDPEKCSQ



NCYLEGADYS GTYGISSSGN SLQLGFVTKG SYSTNIGSRV YLLKDENTYA TFKLKNKEFT FTADVSNLPC GLNGALYFVA



MPADGGKSKY PLAKPGAKYG MGYCDAQCPH DMKFINGEAN ILDWKPSSND ENAGAGRYGT CCTEMDIWEA NSQATAYTVH



ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT FFGPNLIVDS SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ



NSFTNISGVA SVDSITDAFC NENKVATGDT NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA



SRGPCAITSG EPKDVESASA NASVKFSDIK FGAIDSTY






MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY



IEGADYSGTY GITSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA



DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT



KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI



NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA



SRGPCAVSSG VPKDVESQHG DATVIYSDIK FGAINSTFKW N






MLSLVSIFLV GLGFSLGVGT QQSESHPSLS WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW STDLCPDAST



CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES HYQLFKLKNH EFTFTVDDSQ LPCGLNGALY



FVEMAEDGGA KPGAQYGMGY CDAQCPHDMK FITGEANVKD WKPQETDENA GNGHYGACCT EMDIWEANSQ ATAYTPHICS



KTGIYRCEGT ECGDNDANQR YNGVCDKDGC DFNSYRLGNK TFWGPGLTVD SNKAMIVVTQ FTTSNNQDSG ELSEIRRIYV



QGGKTIQNSD TNVQGITTTN KITQAFCDET KVTFGDTNDF KAKGGFSGLS KSLESGAVLV LSLWDDHSVN MLWLDSTYPT



DSAGKPGADR GPCAITSGDP KDVESQSPNA SVTFSDIKFG PIDSTY






MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC



DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD



EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC



TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY



VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGFGAL SKQLVAGMVL VLSLWDDHSV NMLWLDSTYP



TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY






MLCVGLFGLV YSIGVGTNTQ ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS TLCPDGTTCS



KNCVLEGADY SGTYGITSSG DSLTLKFVTH GSYSTNVGSR LYLLKDDNNY QIFNLAGKEF TFTVDVSNLP CGLNGALYFV



EMDQDGGKGK HKENEAGAKY GTGYCDAQCP TDLKFIDGIA NSDGWKPQDN DENSGNGKYG SCCSEMDIWE ANSLATAYTP



HVCDTKGQKR CQGTACGENG GGDRFGSECD PDGCDFNSWR QGNKSFWGPG LIIDTKKSVQ VVTQFIGSGS SVTEIRRKYV



QNGKVIENSY STISGTEKYN SISDDYCNAQ KKAFGDTNSF ENHGGFKRFS QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN



SNKPGADRGP CETSSGVPAD VESKSASASV KYSDIRFGPI DSTYK






MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVIE



GADYQGTYGV SASGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG



GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL



EQTRCSGSAC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI



DNSMSNIAGM TKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP



GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K





SEQ ID NO: 299
QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS



TYGVTTSGNS LSIGFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT



NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG



DGCGGTYSDN AYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS



YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS



SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR PATTTGSSPG PTQSHYGQCG GIGYSGPTVC



ASGTTCQVLN PYYSQCL





SEQ ID NO: 300
QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS



TYGVTTSGNS LSIGFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT



NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG



DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS



YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVAGSCSTS



SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR PATTTGSSPG PTQSHYGQCG GIGYSGPTVC



ASGTTCQVLN PYYSQCL





SEQ ID NO: 301
MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT



WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC



GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA



NSISEALTPH PCDTPGLSVC TTDACGGTYS SDKYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT



STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS



VNMLWLDSTY PTNATGTPGA AKGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA



SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL





SEQ ID NO: 302
QQIGTYTAET HPSLSWSTCK SGGSCTTNSG AITLDANWRW VHGVNTSTNC YTGNTWNTAI CDTDASCAQD CALDGADYSG



TYGITTSGNS LRLNFVTGSN VGSRTYLMAD NTHYQIFDLL NQEFTFTVDV SHLPCGLNGA LYFVTMDADG GVSKYPNNKA



GAQYGVGYCD SQCPRDLKFI AGQANVEGWT PSSNNANTGL GNHGACCAEL DIWEANSISE ALTPHPCDTP GLSVCTTDAC



GGTYSSDKYA GTCDPDGCDF NPYRLGVTDF YGSGKTVDTT KPITVVTQFV TDDGTSTGTL SEIRRYYVQN GVVIPQPSSK



ISGVSGNVIN SDFCDAEIST FGETASFSKH GGLAKMGAGM EAGMVLVMSL WDDYSVNMLW LDSTYPTNAT GTPGAAKGSC



PTTSGDPKTV ESQSGSSYVT FSDIRVGPFN STFSGGSSTG GSSTTTASGT TTTKASSTST SSTSTGTGVA AHWGQCGGQG



WTGPTTCASG TTCTVVNPYY SQCL








Claims
  • 1. A polypeptide comprising a variant cellobiohydrolase I (“CBH I”) catalytic domain as compared to a reference CBH I catalytic domain, comprising: (a) a substitution at the amino acid position corresponding to R268 of T. reesei CBH I (“R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (“R411 substitution”); or (c) both an R268 substitution and an R411 substitution,
  • 2. A method for producing ethanol, comprising: (a) treating biomass with a composition according to any one of claims 37 to 43 or with a fermentation broth according to claim 1, thereby producing monosaccharides; and (b) culturing a fermenting microorganism in the presence of the monosaccharides produced in step (a) under fermentation conditions, thereby producing ethanol.
  • 3. The method of claim 2, further comprising, prior to step (a), pretreating the biomass.
  • 4. The method of claim 2, wherein said fermenting microorganism is a bacterium or a yeast.
  • 5. The method of claim 4, wherein said fermenting microorganism is a bacterium selected from Zymomonas mobilis, Escherichia coli and Klebsiella oxytoca.
  • 6. The method of claim 4, wherein said fermenting microorganism is a yeast selected from Saccharomyces cerevisiae, Saccharomyces uvarum, Kluyveromyces fragilis, Kluyveromyces lactis, Candida pseudotropicalis, and Pachysolen tannophilus.
  • 7. The method of claim 2, wherein said biomass is corn stover, bagasse, sorghum, giant reed, elephant grass, miscanthus, Japanese cedar, wheat straw, switchgrass, hardwood pulp, softwood pulp, crushed sugar cane, energy cane, or Napier grass.
  • 8. A method for generating a nucleic acid that encodes a product tolerant variant CBH I polypeptide, comprising modifying the nucleotide sequence of a CBH I-encoding nucleic acid so that the nucleic acid encodes a variant CBH I polypeptide, wherein said variant CBH I polypeptide comprises: (i) an R268 substitution; (ii) an R411 substitution; or (iii) both an R268 substitution and an R411 substitution, thereby generating a nucleic acid that encodes a product tolerant variant CBH I polypeptide.
  • 9. The method of claim 8, wherein the modification is by site directed mutagenesis.
  • 10. The method of claim 8, wherein the variant CBH I polypeptide comprises an R268 substitution.
  • 11. The method of claim 10, wherein the R268 substituent is a lysine.
  • 12. The method of claim 10, wherein the R268 substituent is an alanine.
  • 13. The method of claim 8, which comprises an R411 substitution.
  • 14. The method of claim 13, wherein the R411 substituent is a lysine.
  • 15. The method of claim 13, wherein the R411 substituent is an alanine.
  • 16. A method for producing ethanol, comprising: (a) treating biomass with a fermentation broth according to claim 1, thereby producing monosaccharides; and (b) culturing a fermenting microorganism in the presence of the monosaccharides produced in step (a) under fermentation conditions, thereby producing ethanol.
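
As a minimal illustration of the kind of single-residue substitution recited in claims 8 to 15, the following Python sketch replaces one amino acid in a protein sequence after checking that the expected residue is present. It works on the encoded amino acid sequence rather than on the nucleotide sequence, and it is not the method of the disclosure. The function name `substitute`, its parameters, and the `offset` convention are hypothetical. The fragment is copied from the fourth row of SEQ ID NO: 300 as listed above (residues 241 to 260, counting from the first residue of that listing). Whether listing position 251 corresponds to the claimed R268 depends on the numbering of the reference sequence, which is treated here as an assumption.

```python
def substitute(seq: str, position: int, expected: str,
               replacement: str, offset: int = 0) -> str:
    """Return `seq` with the residue at 1-based `position` replaced.

    `offset` is the number of residues that precede seq[0] in the
    numbering scheme, so that a fragment can be addressed with
    full-length positions. All names here are hypothetical.
    """
    index = position - offset - 1
    if not 0 <= index < len(seq):
        raise IndexError(f"position {position} lies outside the fragment")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


# Residues 241-260 of SEQ ID NO: 300 as listed (assumed numbering:
# position 1 is the first residue of the listing).
fragment = "DGCGGTYSDNRYGGTCDPDG"

# Arginine-to-alanine substitution at listing position 251.
variant = substitute(fragment, 251, "R", "A", offset=240)
print(variant)
```

The resulting fragment matches the corresponding residues of SEQ ID NO: 299 as listed. Checking the expected residue before substituting guards against applying a position from one numbering scheme to a sequence numbered differently.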
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional application of U.S. application Ser. No. 13/824,317 filed Dec. 18, 2013, now issued as U.S. Pat. No. 9,096,871; which is a 35 USC §371 National Stage application of International Application No. PCT/US2011/055181 filed Oct. 6, 2011, now expired; which claims the benefit under 35 USC §119(e) to U.S. Application Ser. No. 61/390,392 filed Oct. 6, 2010, now expired. The disclosure of each of the prior applications is considered part of and is incorporated by reference in the disclosure of this application.

Provisional Applications (1)
Number Date Country
61390392 Oct 2010 US
Divisions (1)
Number Date Country
Parent 13824317 Dec 2013 US
Child 14816992 US