PRODUCTION OF OLIGOSACCHARIDES

REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB

The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Sep. 23, 2020, is named G091970034WO00-SEQ-FL and is 276 kilobytes in size.

FIELD OF INVENTION

The disclosure relates to enzymes, nucleic acids, and cells useful for the conversion of sucrose to fructans.

BACKGROUND

Polyfructans are oligosaccharides that comprise fructose monomers. These oligosaccharides generally further comprise glucose. Polyfructans have numerous uses including as prebiotics, fat replacers, sugar replacers, texture modifiers, and in industrial processes. Polyfructans may comprise β(2,6) linkages and/or β(2,1) linkages, with the type of polyfructan depending on the linkage position of the fructose residues. For example, graminans are complex mixtures of branched polyfructan oligosaccharides with a β(2,1)-linked-D-fructosyl backbone and β(2,6)-linked-D-fructosyl side chains with different degrees of polymerization. Three distinct classes of enzymes can be used to produce polyfructans: sucrose:sucrose 1-fructosyltransferase (1-SST) enzymes, which generate branched polyfructans by introduction of β(2,1) linkages in saccharides; fructan:fructan 1-fructosyltransferase (1-FFT) enzymes, which promote polymerization of fructose monomers on saccharides through the formation of β(2,1) linkages; and sucrose:fructan-6-fructosyltransferase (6-SFT) enzymes, which catalyze the addition of fructose monomers through β(2,6) linkages to produce polyfructans.

SUMMARY

This disclosure relates, at least in part, to generation of engineered cells containing enzymes for producing polyfructan oligosaccharides, for example, by converting sucrose to polyfructans. These engineered cells are useful for producing complex and branched polyfructans.

Aspects of the disclosure relate to host cells that comprise one or more heterologous polynucleotides encoding: a sucrose:sucrose 1-fructosyltransferase (1-SST) enzyme; a fructan:fructan 1-fructosyltransferase (1-FFT); and a sucrose:fructan-6-fructosyltransferase (6-SFT) enzyme.

In some embodiments, the 1-SST enzyme comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 or SEQ ID NO: 24.

In some embodiments, the 1-FFT enzyme comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 7 or SEQ ID NO: 31.

In some embodiments, the 6-SFT enzyme comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 13 or SEQ ID NO: 38.

In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding two or more of a 1-SST enzyme; a 1-FFT enzyme; and a 6-SFT enzyme.

In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding a 1-SST enzyme; a 1-FFT enzyme; and a 6-SFT enzyme.

In some embodiments, at least two of the 1-SST enzyme, the 1-FFT enzyme, and the 6-SFT enzyme are expressed on the same heterologous polynucleotide.

In some embodiments, the host cell is a plant cell, an algal cell, a yeast cell, a bacterial cell, or an animal cell.

In some embodiments, the yeast cell is a Saccharomyces cell, a Yarrowia cell or a Pichia cell. In some embodiments, the host cell is a Pichia pastoris cell.

In some embodiments, the 1-SST enzyme comprises the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 24.

In some embodiments, the 1-FFT enzyme comprises the amino acid sequence of SEQ ID NO: 7 or SEQ ID NO: 31.

In some embodiments, the 6-SFT enzyme comprises the amino acid sequence of SEQ ID NO: 13 or SEQ ID NO: 38.

In some embodiments, one or more of the 1-SST enzyme, the 1-FFT enzyme, and the 6-SFT enzyme is secreted from the host cell.

Further aspects of the disclosure provide methods comprising culturing any of the host cells disclosed in this application.

In some embodiments, the methods further comprise purifying one or more of the 1-SST enzyme, 1-FFT enzyme, and 6-SFT enzyme from the host cell.

Further aspects of the disclosure provide methods of producing a fructan. In some embodiments, the method comprises contacting sucrose with one or more of (a) a 1-SST enzyme comprising an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 or SEQ ID NO: 24; (b) a 1-FFT enzyme comprising an amino acid sequence that is at least 90% identical to SEQ ID NO: 7 or SEQ ID NO: 31; and (c) a 6-SFT enzyme comprising an amino acid sequence that is at least 90% identical to SEQ ID NO: 13 or SEQ ID NO: 38.

In some embodiments, the sucrose is contacted with two or more of a 1-SST enzyme, a 1-FFT enzyme, and a 6-SFT enzyme.

In some embodiments, the sucrose is contacted with a 1-SST enzyme, a 1-FFT enzyme, and a 6-SFT enzyme.

In some embodiments, the fructan comprises a β(2,1) linkage, a β(2,6) linkage, or a combination thereof.

In some embodiments, the fructan is a kestose, an inulin and/or a graminan.

In some embodiments, the fructan has a degree of polymerization of at least 3.

In some embodiments, the method further comprises purifying the fructan.

In some embodiments, the 1-SST enzyme, 1-FFT enzyme, and/or 6-SFT enzyme are secreted from one or more host cells.

In some embodiments, the one or more host cells are cultured in media containing sucrose, wherein the sucrose is contacted with the 1-SST enzyme, 1-FFT enzyme, and/or 6-SFT enzyme in the media.

In some embodiments, the fructan is purified from the media.

In some embodiments, the 1-SST enzyme, 1-FFT enzyme, and/or 6-SFT enzyme is a purified enzyme.

In some embodiments, the kestose is 6-kestose.

In some embodiments, the kestose is 1-kestose.

In some embodiments, the fructan comprises a levan.

Aspects of the disclosure provide methods of producing a fructan, comprising (a) contacting sucrose with a 1-SST enzyme to produce kestose; and (b) contacting the kestose with a 1-FFT enzyme and/or a 6-SFT enzyme to produce the fructan.

In some embodiments, the kestose produced in a) is purified and the purified kestose is contacted with the 1-FFT enzyme and/or 6-SFT enzyme in b).

In some embodiments, the method further comprises purifying the fructan produced in b).

In some embodiments, the 1-SST enzyme, 1-FFT enzyme, and/or 6-SFT enzyme is secreted from one or more host cells. In some embodiments, the one or more host cells is cultured in media containing sucrose, wherein the sucrose is contacted with the 1-SST enzyme in the media. In some embodiments, the 1-SST enzyme, 1-FFT enzyme, and/or 6-SFT enzyme is a purified enzyme. In some embodiments, the fructan produced in b) is an inulin. In some embodiments, the fructan produced in b) is a branched inulin. In some embodiments, the fructan produced in b) is a graminan.

Aspects of the disclosure provide host cells that comprise one or more heterologous polynucleotides encoding one or more of (a) a 1-SST enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 1-4 and 24-28; (b) a 1-FFT enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 7-10 and 31-35; and (c) a 6-SFT enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 13-21 and 38-52.

Aspects of the disclosure provide methods of producing a fructan, comprising contacting sucrose with one or more of: (a) a 1-SST enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 1-4 and 24-28; (b) a 1-FFT enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 7-10 and 31-35; and (c) a 6-SFT enzyme comprising an amino acid sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 13-21 and 38-52.

Each of the limitations of the invention can encompass various embodiments of the invention. It is, therefore, anticipated that each of the limitations of the invention involving any one element or combinations of elements can be included in each aspect of the invention. This invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used in this disclosure is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

FIG. 1 depicts schematics showing chemical structures of selected fructans (inulins, levans, and graminans).

FIG. 2 depicts a schematic showing an example of biosynthetic conversion and relevant enzymes involved in the production of fructans in Agave tequilana.

FIGS. 3A-3B depict graphs showing data from screening of a library of enzymes. FIG. 3A shows a graph displaying individual enzymes and the resultant products (β(2,6) fructans (labeled ‘2→6’ on y-axis) or β(2,1) fructans (labeled ‘kestose’ on x-axis)) formed by incubation with sucrose. Based on product formation, individual enzymes were classified as: inactive; having invertase activity; having kestose transferase (1-SST) activity; or having β(2,6) branching (6-SFT) activity. FIG. 3B shows a graph displaying individual enzymes and the resultant products (β(2,1) inulins (labeled ‘Nystose’ on y-axis) or β(2,1) fructans (labeled ‘kestose’ on x-axis)) formed by incubation with kestose. Based on product formation, individual enzymes were classified as: inactive; having kestase activity; or having 1-FFT activity. All reaction products in FIGS. 3A-3B were analyzed by HPLC and quantified using peak integration.

FIG. 4 depicts schematics showing representative HPLC-RID traces of fructans. An example of an enzymatic bioconversion reaction (individual enzyme incubated with sucrose) is shown in the top panel. An example of a preparation of commercially-available standards of nystose (A), 1-kestose (B), sucrose (C), glucose (D), and fructose (E) is shown in the bottom panel.

FIG. 5 depicts a schematic showing synthesis of branched inulins. Starting from sucrose (dimer of glucose and fructose), kestose (comprising a β(2,1) linkage) is enzymatically formed using 1-SST activity. 1-FFT activity catalyzes formation of a linear inulin, which can be reacted with an enzyme having 6-SFT activity to provide β(2,6)-branched inulins. (G=glucose; F=fructose.)

FIGS. 6A-6D show confirmation of branched inulin formation by bioconversion. FIG. 6A shows an HPLC-RID trace of a bioconversion reaction showing that branched inulins have been produced and can be distinguished from starting material (sucrose) and by-products (glucose). FIG. 6B shows a schematic depicting fragmentation products that are generated when branched inulins are subjected to analysis by GC/MS. These fragmentation products provide a unique mass spectrometry signature that indicates the presence of β(2,6) branching. FIG. 6C shows an example of GC/MS spectral analysis of: a bioconversion sample; linear sugars (Chicory; Nicie); and a known branched sugar (‘Test Ground’). FIG. 6D is a magnification of the GC/MS analysis in FIG. 6C between 28.0-29.6 min.

FIG. 7 is a non-limiting example of sequence identity analysis of SEQ ID NOs: 2-4, 6, 8-10, 12, 14-21, and 63. The percent sequence identity between indicated SEQ ID NOs is shown. SEQ ID NO: 6 is Festuca arundinacea 1-SST. SEQ ID NO: 12 is Echinops ritro 1-FFT. SEQ ID NO: 63 corresponds to residues 60 through 623 of Phleum pratense 6-SFT (SEQ ID NO: 23). Multiple Sequence Comparison by Log-Expectation (MUSCLE) was used for the sequence identity analysis.

DETAILED DESCRIPTION OF THE INVENTION

The disclosure provides, in some aspects, cells and enzymes that are engineered for production of polyfructans from sucrose. These enzymes include 1-SST enzymes, 1-FFT enzymes, and 6-SFT enzymes. Enzymes disclosed in this application, and host cells comprising such enzymes, may be used to promote production of fructans, including branched fructans, such as branched inulins. In some embodiments, a fructan comprises a β(2,1) linkage, a β(2,6) linkage, or a combination thereof.

Fructans

As used in this application, a “fructan,” which may also be referred to as a “polyfructan” or a “fructooligosaccharide,” refers to an oligosaccharide that comprises fructose monomers. Fructans generally further comprise glucose. In some embodiments, a fructan comprises at least one β(2,1) linkage, at least one β(2,6) linkage, or a combination thereof. In some embodiments, a fructan is a kestose (e.g., 1-kestose or 6-kestose), an inulin and/or a graminan. In some embodiments, a fructan has a degree of polymerization (DP) of at least 3 (e.g., at least 3, at least 4, at least 5, at least 6), wherein the degree of polymerization refers to the total number of monosaccharide units (e.g., fructose units) in a fructan or the average number of monosaccharide units in a mixture of fructans. In some embodiments, a fructan comprises a levan (e.g., a linear levan or a branched levan, e.g., comprising at least one β(2,1) linkage and/or at least one β(2,6) linkage). In some embodiments, a fructan is an inulin. In some embodiments, an inulin is a linear inulin or a branched inulin (e.g., comprising at least one β(2,1) linkage and/or at least one β(2,6) linkage). In some embodiments, a fructan is a graminan.
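As a non-limiting illustration, the degree of polymerization defined above may be computed as in the following sketch. The representation of a fructan as a list of monosaccharide labels, and the function names, are hypothetical conveniences and are not part of the disclosure.

```python
from statistics import mean

def degree_of_polymerization(units):
    """Total number of monosaccharide units in a single fructan."""
    return len(units)

def average_dp(fructans):
    """Average number of monosaccharide units across a mixture of fructans."""
    return mean(degree_of_polymerization(f) for f in fructans)

# Illustrative labels only: G = glucose unit, F = fructose unit.
kestose = ["G", "F", "F"]        # trisaccharide
nystose = ["G", "F", "F", "F"]   # tetrasaccharide

print(degree_of_polymerization(kestose))   # DP of a single fructan
print(average_dp([kestose, nystose]))      # average DP of a two-member mixture
```

In this sketch a kestose has a DP of 3, and an equimolar mixture of a kestose and nystose has an average DP of 3.5.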

Formula 1 is an example of a fructan comprising a β(2,1) linkage:

embedded image

Formula 2 is an example of a fructan comprising a β(2,6) linkage:

embedded image

Formula 3 shows 1-kestose:

embedded image

Formula 4 shows 6-kestose:

embedded image

Formula 5 shows nystose:

embedded image

Formula 6 shows an inulin, in which n is any integer.

embedded image

Formula 7 shows an example of a graminan, in which n1 is any integer.

embedded image

Formula 8 shows an example of a graminan, in which n1 and n2 independently may be any integer.

embedded image

As one of ordinary skill in the art would appreciate, any of the fructans produced using the methods described in this application may have numerous applications, including industrial uses. As a non-limiting example, long chain fructans (e.g., levans) may be used in fermentation processes and in the production of vinegar. See also, e.g., Niness, J Nutr. 1999 Jul; 129(7 Suppl):1402S-6S; Kolida et al., Br J Nutr. 2002; Koga et al., Pediatr Res. 2016 Dec; 80(6):844-851; Roberfroid, J Nutr. 2007 Nov; 137(11 Suppl):2493S-2502S; Suzuki et al., Bioscience Microflora Vol. 25(3), 109-116, 2006; Lopez and Urias-Silvas, Recent Advances in Fructooligosaccharides Research (pp. 297-310), 2007; and Vijn and Smeekens, Plant Physiology, June 1999, Vol. 120, pp. 351-359.

Sucrose:sucrose 1-fructosyltransferase (1-SST)

As used in this application, “sucrose:sucrose 1-fructosyltransferase (1-SST)” refers to an enzyme that generates branched polyfructans by introduction of β(2,1) linkages in saccharides (e.g., formation of 1-kestose from sucrose). A 1-SST enzyme may use sucrose as a substrate. In some embodiments, 1-SST exhibits specificity for sucrose compared to other saccharides. In some embodiments, 1-SST produces 1-kestose from sucrose. In some embodiments, a 1-SST can use levan as a substrate to produce a branched levan with β(2,6) linkages and β(2,1) linkages.

A host cell described in this application can comprise a 1-SST enzyme and/or a heterologous polynucleotide encoding such an enzyme. In some embodiments, a host cell comprises a heterologous polynucleotide encoding a 1-SST enzyme comprising an amino acid sequence that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NOs: 1-4, 6, and 24-28; a 1-SST enzyme in Table 2; or a 1-SST enzyme otherwise described in this application. In some embodiments, a host cell comprises a heterologous polynucleotide that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NOs: 5, 29-30, and 62; a polynucleotide encoding a 1-SST enzyme in Table 2; or a polynucleotide encoding a 1-SST enzyme otherwise described in this application.

In some embodiments, a host cell does not comprise a 1-SST derived from Festuca arundinacea. In some embodiments, a host cell does not comprise a 1-SST corresponding to SEQ ID NO: 6.

In some embodiments, a host cell that expresses a heterologous polynucleotide encoding a 1-SST enzyme may increase conversion of sucrose to 1-kestose, and/or increase introduction of β(2,1) linkages in oligosaccharides, by 0.5-fold, 1-fold, 1.5-fold, 2-fold, 2.5-fold, 3-fold, 3.5-fold, 4-fold, 4.5-fold, 5-fold, 5.5-fold, or 6-fold more (e.g., 2-fold to 6-fold more) relative to a control. In some embodiments, the control is a host cell that expresses a heterologous polynucleotide encoding SEQ ID NO: 6. In some embodiments, the control is a Pichia pastoris strain that expresses a heterologous polynucleotide encoding SEQ ID NO: 6, such as is described in and incorporated by reference from Lüscher, M. et al., “Cloning and Functional Analysis of Sucrose:Sucrose 1-Fructosyltransferase from Tall Fescue,” Plant Physiology, 124:1217-1227 (2000).

In some embodiments, a host cell that expresses a heterologous polynucleotide encoding a 1-SST enzyme may exhibit at least 0.5-fold, 1-fold, 1.5-fold, 2-fold, 2.5-fold, 3-fold, 3.5-fold, 4-fold, 4.5-fold, 5-fold, 5.5-fold, or 6-fold more (e.g., 2-fold to 6-fold more) activity in the presence of sucrose relative to other saccharides. In some embodiments, activity corresponds to conversion of sucrose to 1-kestose, and/or introduction of β(2,1) linkages in oligosaccharides.

In some embodiments, a 1-SST comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to any one of SEQ ID NOs: 1-4, 6, and 24-28.

Fructan:fructan 1-fructosyltransferase (1-FFT)

As used in this application, “fructan:fructan 1-fructosyltransferase (1-FFT)” refers to an enzyme that catalyzes the conversion of oligosaccharides comprising β(2,1) linkages (e.g., 1-kestose) into longer polymer chains of oligosaccharides (e.g., conversion of 1-kestose to inulins). A 1-FFT enzyme may use 1-kestose, sucrose, and/or fructose as a substrate. In some embodiments, a 1-FFT enzyme can use bifurcose or neokestose as a substrate. In some embodiments, 1-FFT produces inulins (e.g., branched inulins) from 1-kestose.

A host cell described in this application can comprise a 1-FFT enzyme and/or a heterologous polynucleotide encoding such an enzyme. In some embodiments, a host cell comprises a heterologous polynucleotide encoding a 1-FFT enzyme comprising an amino acid sequence that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NO: 7-10, 12, and 31-35; a 1-FFT enzyme in Table 2; or a 1-FFT enzyme otherwise described in this application. In some embodiments, a host cell comprises a heterologous polynucleotide that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NO: 11, 36, and 37; a polynucleotide encoding a 1-FFT enzyme in Table 2; or a polynucleotide encoding a 1-FFT enzyme otherwise described in this application.

In some embodiments, a host cell does not comprise a 1-FFT enzyme derived from Echinops ritro. In some embodiments, a host cell does not comprise a 1-FFT enzyme corresponding to SEQ ID NO: 12.

In some embodiments, a host cell that expresses a heterologous polynucleotide encoding a 1-FFT enzyme may increase conversion of 1-kestose to inulins, and/or increase conversion of oligosaccharides comprising β(2,1) linkages into longer polymer chains of oligosaccharides, by 0.5-fold, 1-fold, 1.5-fold, 2-fold, 2.5-fold, 3-fold, 3.5-fold, 4-fold, 4.5-fold, 5-fold, 5.5-fold, or 6-fold more (e.g., 2-fold to 6-fold more) relative to a control. In some embodiments, a control is a host cell that expresses a heterologous polynucleotide encoding SEQ ID NO: 12. In some embodiments, the control is a Pichia pastoris strain that expresses a heterologous polynucleotide encoding SEQ ID NO: 12, such as is described in and incorporated by reference from Van den Ende, W. et al., “Cloning and Functional Analysis of a High DP Fructan:Fructan 1-Fructosyl transferase from Echinops ritro (Asteraceae): Comparison of the native and recombinant enzymes,” Journal of Experimental Botany, 57(4):775-789 (2006).

In some embodiments, a 1-FFT enzyme comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to any one of SEQ ID NO: 7-10, 12, and 31-35.

Sucrose:fructan-6-fructosyltransferase (6-SFT)

As used in this application, “sucrose:fructan-6-fructosyltransferase (6-SFT)” refers to an enzyme that generates fructans by introducing β(2,6) linkages in saccharides (e.g., production of 6-kestose from sucrose) or generates more complex fructans by introducing β(2,6) linkages in precursor fructans (e.g., production of bifurcose from 1-kestose). A 6-SFT may use sucrose, 6-kestose, 1-kestose, bifurcose, and/or neokestose as a substrate. In some embodiments, 6-SFT produces 6-kestose from sucrose. In some embodiments, 6-SFT produces bifurcose from 1-kestose. In some embodiments, 6-SFT produces graminans from bifurcose.

A host cell described in this application can comprise a 6-SFT enzyme and/or a heterologous polynucleotide encoding such an enzyme. In some embodiments, a host cell comprises a heterologous polynucleotide encoding a 6-SFT enzyme comprising an amino acid sequence that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NO: 13-21, 23, and 38-52; a 6-SFT enzyme in Table 2; or a 6-SFT enzyme otherwise described in this application. In some embodiments, a host cell comprises a heterologous polynucleotide that is at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical, or is 100% identical, including all values in between, to any of: SEQ ID NOs: 22 and 53-59; a polynucleotide encoding a 6-SFT enzyme in Table 2; or a polynucleotide encoding a 6-SFT enzyme otherwise described in this application.

In some embodiments, the host cell does not comprise a 6-SFT enzyme derived from Phleum pratense. In some embodiments, the host cell does not comprise a 6-SFT enzyme corresponding to SEQ ID NO: 23. In some embodiments, the host cell does not comprise a 6-SFT enzyme corresponding to SEQ ID NO: 63.

In some embodiments, a host cell that expresses a heterologous polynucleotide encoding a 6-SFT enzyme may increase conversion of sucrose to 1-kestose, increase conversion of 1-kestose to bifurcose, increase conversion of bifurcose to graminans, and/or increase introduction of β(2,6) linkages into fructans by 0.5-fold, 1-fold, 1.5-fold, 2-fold, 2.5-fold, 3-fold, 3.5-fold, 4-fold, 4.5-fold, 5-fold, 5.5-fold, or 6-fold more (e.g., 2-fold to 6-fold more) relative to a control. In some embodiments, a control is a host cell that expresses a heterologous polynucleotide encoding SEQ ID NO: 23. In some embodiments, the control is a Pichia pastoris strain that expresses a heterologous polynucleotide encoding SEQ ID NO: 23, such as is described in and incorporated by reference from Tamura, K. I., et al., “Cloning and Functional Analysis of a Fructosyltransferase cDNA for Synthesis of Highly Polymerized Levans in Timothy (Phleum pratense L.),” Journal of Experimental Botany, 60(3), 893-905 (2009). In some embodiments, a control is a host cell that expresses a heterologous polynucleotide encoding SEQ ID NO: 63. In some embodiments, the control is a Pichia pastoris strain that expresses a heterologous polynucleotide encoding SEQ ID NO: 63.

In some embodiments, a 6-SFT comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to any one of SEQ ID NOs: 13-21, 23, and 38-52.

Variants

Variants of enzymes and proteins described in this application (e.g., 1-SST, 1-FFT, or 6-SFT), including variants to nucleic acid and amino acid sequences, are also encompassed by the present disclosure. A variant may share at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with a reference sequence, including all values in between.

Unless otherwise noted, the term “sequence identity,” as known in the art, refers to a relationship between the sequences of two polypeptides or polynucleotides, as determined by sequence comparison (alignment). In some embodiments, sequence identity is determined across the entire length of a sequence, such as a reference sequence, while in other embodiments, sequence identity is determined over a region of a sequence. In some embodiments, sequence identity is determined over a region (e.g., a stretch of amino acids or nucleic acids, e.g., the sequence spanning an active site) of a sequence (e.g., 1-SST, 1-FFT, or 6-SFT sequence). For example, in some embodiments, sequence identity is determined over a region corresponding to at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence.

Identity measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model, algorithm, or computer program.

Identity of related polypeptides or nucleic acid sequences can be readily calculated by any of the methods known to one of ordinary skill in the art. The percent identity of two sequences (e.g., nucleic acid or amino acid sequences) may, for example, be determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. Such an algorithm is incorporated into the NBLAST® and XBLAST® programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST® protein searches can be performed, for example, with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins described in this application. Where gaps exist between two sequences, Gapped BLAST® can be utilized, for example, as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When utilizing BLAST® and Gapped BLAST® programs, the default parameters of the respective programs (e.g., XBLAST® and NBLAST®) can be used, or the parameters can be adjusted appropriately as would be understood by one of ordinary skill in the art.

Another local alignment technique which may be used, for example, is based on the Smith-Waterman algorithm (Smith, T. F. & Waterman, M. S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197). A general global alignment technique which may be used, for example, is the Needleman-Wunsch algorithm (Needleman, S. B. & Wunsch, C. D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453), which is based on dynamic programming.
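As a non-limiting illustration, a global alignment by dynamic programming in the manner of the Needleman-Wunsch algorithm may be sketched as follows. The scoring values (match of +1, mismatch of -1, and a linear gap penalty of -1) and the function name are assumptions chosen for illustration; practical tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

print(needleman_wunsch("ACGT", "ACT"))
```

Under these assumed scores, aligning "ACGT" with "ACT" yields three matched positions and one gap.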

More recently, a Fast Optimal Global Sequence Alignment Algorithm (FOGSAA) was developed that purportedly produces global alignment of nucleic acid and amino acid sequences faster than other optimal global alignment methods, including the Needleman-Wunsch algorithm. In some embodiments, the identity of two polypeptides is determined by aligning the two amino acid sequences, calculating the number of identical amino acids, and dividing by the length of one of the amino acid sequences. In some embodiments, the identity of two nucleic acids is determined by aligning the two nucleotide sequences, calculating the number of identical nucleotides, and dividing by the length of one of the nucleic acids.
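As a non-limiting illustration, the identity calculation described above (counting identical aligned residues and dividing by the length of one of the sequences) may be sketched as follows. The function name, the use of "-" as the gap character, and the `denominator` parameter are hypothetical; the sketch assumes the two sequences have already been aligned by any suitable method.

```python
def percent_identity(aligned_a, aligned_b, denominator="shorter"):
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    denominator selects which ungapped length to divide by:
    "shorter" (the smaller sequence), "a", or "b".
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    len_a = len(aligned_a.replace("-", ""))
    len_b = len(aligned_b.replace("-", ""))
    lengths = {"shorter": min(len_a, len_b), "a": len_a, "b": len_b}
    if denominator not in lengths:
        raise ValueError("denominator must be 'shorter', 'a', or 'b'")
    if lengths[denominator] == 0:
        raise ValueError("cannot compute identity against an empty sequence")
    return 100.0 * matches / lengths[denominator]

# Four identical columns; ungapped lengths are 5 and 6.
print(percent_identity("MKT-LV", "MKTALI"))
print(percent_identity("MKT-LV", "MKTALI", denominator="b"))
```

The choice of denominator changes the result (80% against the shorter sequence in this example, about 66.7% against the longer one), which is why the sequence whose length is used should be stated when a percent identity is reported.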

For multiple sequence alignments, computer programs including Clustal Omega (Sievers et al., Mol Syst Biol. 2011 Oct 11;7:539) may be used.

In preferred embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993 (e.g., BLAST®, NBLAST®, XBLAST® or Gapped BLAST® programs, using default parameters of the respective programs).

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the Smith-Waterman algorithm (Smith, T. F. & Waterman, M. S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197) or the Needleman-Wunsch algorithm (Needleman, S. B. & Wunsch, C. D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453) using default parameters.

As used in this application, a residue (such as a nucleic acid residue or an amino acid residue) in sequence “X” is referred to as corresponding to a position or residue (such as a nucleic acid residue or an amino acid residue) “n” in a different sequence “Y” when the residue in sequence “X” is at the counterpart position of “n” in sequence “Y” when sequences X and Y are aligned using nucleic acid or amino acid sequence alignment tools known in the art.
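As a non-limiting illustration of how corresponding positions may be read from an alignment, the following Python sketch maps each position “n” in sequence Y to its counterpart position in sequence X. The function name, the 1-based numbering, the "-" gap character, and the example sequences are assumptions made for illustration.

```python
def corresponding_positions(aligned_x, aligned_y):
    """Map 1-based positions in Y to their counterpart 1-based positions in X.

    A position in Y that sits opposite a gap in X maps to None.
    """
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    pos_x = 0
    pos_y = 0
    for res_x, res_y in zip(aligned_x, aligned_y):
        if res_x != "-":
            pos_x += 1  # advance the ungapped coordinate in X
        if res_y != "-":
            pos_y += 1  # advance the ungapped coordinate in Y
            mapping[pos_y] = pos_x if res_x != "-" else None
    return mapping


# Hypothetical alignment: X lacks a residue opposite position 4 of Y.
mapping = corresponding_positions("MKT-AYI", "MKTQAYL")
print(mapping)
```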

Variant sequences may be homologous sequences. As used in this application, homologous sequences are sequences, including nucleic acid or amino acid sequences, that share a certain percent identity (e.g., at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity, including all values in between). Homologous sequences include but are not limited to paralogous sequences, orthologous sequences, or sequences arising from convergent evolution. Paralogous sequences arise from duplication of a gene within a genome of a species, while orthologous sequences diverge after a speciation event. Two different species may have evolved independently but may each comprise a sequence that shares a certain percent identity with a sequence from the other species as a result of convergent evolution.

In some embodiments, a polypeptide variant, such as a 1-SST, 1-FFT, or 6-SFT enzyme variant, comprises a domain that shares a secondary structure (e.g., alpha helix, beta sheet) with a reference polypeptide (e.g., a reference 1-SST, 1-FFT, or 6-SFT enzyme). In some embodiments, a polypeptide variant, such as a 1-SST, 1-FFT, or 6-SFT enzyme variant, shares a tertiary structure with a reference polypeptide (e.g., a reference 1-SST, 1-FFT, or 6-SFT enzyme). As a non-limiting example, a variant polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme variant) may have low primary sequence identity (e.g., less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 15%, less than 10%, or less than 5% sequence identity) compared to a reference polypeptide, but share one or more secondary structures (e.g., including but not limited to loops, alpha helices, or beta sheets), or have the same or similar tertiary structure as a reference polypeptide. For example, a loop may be located between a beta sheet and an alpha helix, between two alpha helices, or between two beta sheets. Homology modeling may be used to compare two or more tertiary structures.

Mutations can be made in a nucleotide sequence by a variety of methods known to one of ordinary skill in the art. For example, mutations can be made by PCR-directed mutation, site-directed mutagenesis according to the method of Kunkel (Kunkel, Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985), by chemical synthesis of a gene encoding a polypeptide, by gene editing, or by insertions, such as insertion of a tag (e.g., a HIS tag or a GFP tag). Mutations can include, for example, substitutions, deletions, and translocations, generated by any method known in the art. Methods for producing mutations may be found in references such as Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Fourth Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2012, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York, 2010.

In some embodiments, methods for producing variants include circular permutation (Yu and Lutz, Trends Biotechnol. 2011 Jan;29(1):18-25). In circular permutation, the linear primary sequence of a polypeptide can be circularized (e.g., by joining the N-terminal and C-terminal ends of the sequence) and the polypeptide can be severed (“broken”) at a different location. Thus, the linear primary sequence of the new polypeptide may have low sequence identity (e.g., less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 15%, less than 10%, or less than 5%, including all values in between) as determined by linear sequence alignment methods (e.g., Clustal Omega or BLAST). Topological analysis of the two proteins, however, may reveal that the tertiary structure of the two polypeptides is similar or dissimilar. Without being bound by a particular theory, a variant polypeptide created through circular permutation of a reference polypeptide and with a similar tertiary structure as the reference polypeptide can share similar functional characteristics (e.g., enzymatic activity, enzyme kinetics, substrate specificity or product specificity). In some instances, circular permutation may alter the secondary structure, tertiary structure or quaternary structure and produce an enzyme with different functional characteristics (e.g., increased or decreased enzymatic activity, different substrate specificity, or different product specificity). See, e.g., Yu and Lutz, Trends Biotechnol. 2011 Jan;29(1):18-25.

It should be appreciated that in a protein that has undergone circular permutation, the linear amino acid sequence of the protein would differ from a reference protein that has not undergone circular permutation. However, one of ordinary skill in the art would be able to readily determine which residues in the protein that has undergone circular permutation correspond to residues in the reference protein that has not undergone circular permutation by, for example, aligning the sequences and detecting conserved motifs, and/or by comparing the structures or predicted structures of the proteins, e.g., by homology modeling.

In some embodiments, an algorithm that determines the percent identity between a sequence of interest and a reference sequence described in this application accounts for the presence of circular permutation between the sequences. The presence of circular permutation may be detected using any method known in the art, including, for example, RASPODOM (Weiner et al., Bioinformatics. 2005 Apr 1;21(7):932-7). In some embodiments, the presence of circular permutation is corrected for (e.g., the domains in at least one sequence are rearranged) prior to calculation of the percent identity between a sequence of interest and a sequence described in this application. The claims of this application should be understood to encompass sequences for which percent identity to a reference sequence is calculated after taking into account potential circular permutation of the sequence.
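As a non-limiting and deliberately simplified illustration of correcting for circular permutation before calculating percent identity, the Python sketch below tries every rotation of a query sequence and keeps the rotation that gives the most identical positions against the reference. This is not the RASPODOM method; it is an assumed toy procedure that requires sequences of equal length and performs no gapped alignment. The function name and example sequences are hypothetical.

```python
def best_rotation(query, reference):
    """Return (offset, rotated_query, percent_identity) for the rotation of
    `query` that matches `reference` at the most positions.

    Simplifying assumptions: equal lengths, ungapped position-by-position
    comparison. On ties, the smallest offset is kept.
    """
    if len(query) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    best_offset, best_matches = 0, -1
    for offset in range(len(query)):
        # Rotating is equivalent to re-opening the circularized sequence
        # at a different position.
        rotated = query[offset:] + query[:offset]
        matches = sum(x == y for x, y in zip(rotated, reference))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    rotated = query[best_offset:] + query[:best_offset]
    return best_offset, rotated, 100.0 * best_matches / len(reference)


# Hypothetical reference and a circular permutant of it.
reference = "MKTAYIAKQR"
permutant = reference[6:] + reference[:6]
print(best_rotation(permutant, reference))
```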

Functional variants of the recombinant 1-SST, 1-FFT, or 6-SFT enzyme disclosed in this application are also encompassed by the present disclosure. For example, functional variants may bind one or more of the same substrates or produce one or more of the same products. Functional variants may be identified using any method known in the art. For example, the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990 described above may be used to identify homologous proteins with known functions.

Putative functional variants may also be identified by searching for polypeptides with functionally annotated domains. Databases including Pfam (Sonnhammer et al., Proteins. 1997 Jul;28(3):405-20) may be used to identify polypeptides with a particular domain.

Homology modeling may also be used to identify amino acid residues that are amenable to mutation without affecting function. A non-limiting example of such a method may include use of position-specific scoring matrix (PSSM) and an energy minimization protocol.

Position-specific scoring matrix (PSSM) uses a position weight matrix to identify consensus sequences (e.g., motifs). PSSM can be conducted on nucleic acid or amino acid sequences. The method uses aligned sequences and takes into account the observed frequency of a particular residue (e.g., an amino acid or a nucleotide) at a particular position and the number of sequences analyzed. See, e.g., Stormo et al., Nucleic Acids Res. 1982 May 11;10(9):2997-3011. The likelihood of observing a particular residue at a given position can be calculated. Without being bound by a particular theory, positions in sequences with high variability may be amenable to mutation (e.g., PSSM score ≥0) to produce functional homologs.
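As a non-limiting illustration, a position-specific scoring matrix may be computed from aligned sequences as sketched below in Python. The function name, the uniform background frequency, the pseudocount of 1, the base-2 log-odds score, and the example sequences are all assumptions made for illustration; published implementations differ in how they weight sequences and estimate background frequencies.

```python
import math
from collections import Counter


def build_pssm(aligned_seqs, alphabet="ACGT", pseudocount=1.0):
    """Return a list with one dict per alignment column, mapping each residue
    to a log2 odds score: log2(observed frequency / background frequency).

    Assumes a uniform background and an additive pseudocount so that
    unobserved residues receive a finite score.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have equal length")
    background = 1.0 / len(alphabet)
    total = len(aligned_seqs) + pseudocount * len(alphabet)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        column_scores = {
            res: math.log2(((counts[res] + pseudocount) / total) / background)
            for res in alphabet
        }
        pssm.append(column_scores)
    return pssm


# Hypothetical aligned nucleotide sequences.
pssm = build_pssm(["ACGT", "ACGA", "ACGT", "TCGT"])
for position, scores in enumerate(pssm, start=1):
    print(position, {res: round(val, 2) for res, val in scores.items()})
```

Under these assumptions, a score of 0 or greater at a position indicates a residue observed at least as often as expected from the background.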

PSSM may be paired with calculation of a Rosetta energy function, which determines the difference between the wild-type and the single-point mutant. The Rosetta energy function calculates this difference as ΔΔG_calc. With the Rosetta function, the bonding interactions between a mutated residue and the surrounding atoms are used to determine whether a mutation increases or decreases protein stability. For example, a mutation that is designated as favorable by the PSSM score (e.g., PSSM score ≥0) can then be analyzed using the Rosetta energy function to determine the potential impact of the mutation on protein stability. Without being bound by a particular theory, potentially stabilizing mutations are desirable for protein engineering (e.g., production of functional homologs). In some embodiments, a potentially stabilizing mutation has a ΔΔG_calc value of less than −0.1 (e.g., less than −0.2, less than −0.3, less than −0.35, less than −0.4, less than −0.45, less than −0.5, less than −0.55, less than −0.6, less than −0.65, less than −0.7, less than −0.75, less than −0.8, less than −0.85, less than −0.9, less than −0.95, or less than −1.0) Rosetta energy units (R.e.u.). See, e.g., Goldenzweig et al., Mol Cell. 2016 Jul 21;63(2):337-346. doi: 10.1016/j.molcel.2016.06.012.
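As a non-limiting illustration of the two-step selection described above, the Python sketch below keeps only candidate mutations with a PSSM score of at least 0 and a ΔΔG_calc value below a chosen cutoff. The class name, function name, mutation labels, and numerical values are hypothetical and are not taken from any experiment; ΔΔG_calc values are assumed to have been computed separately (e.g., with Rosetta).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    mutation: str      # hypothetical label, e.g. "A10V"
    pssm_score: float  # position-specific score for the mutant residue
    ddg_calc: float    # calculated stability change, in R.e.u.


def select_stabilizing(candidates, min_pssm=0.0, max_ddg=-0.1):
    """Keep candidates with pssm_score >= min_pssm and ddg_calc < max_ddg."""
    return [
        c for c in candidates
        if c.pssm_score >= min_pssm and c.ddg_calc < max_ddg
    ]


# Hypothetical values, for illustration only.
candidates = [
    Candidate("A10V", 1.2, -0.8),
    Candidate("G25S", -0.5, -1.4),   # rejected: PSSM score below 0
    Candidate("L40I", 0.0, -0.05),   # rejected: not below the -0.1 cutoff
    Candidate("T77N", 0.3, 0.6),     # rejected: predicted destabilizing
]
print([c.mutation for c in select_stabilizing(candidates)])
```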

In some embodiments, a 1-SST, 1-FFT, or 6-SFT enzyme coding sequence comprises a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more than 100 positions corresponding to a reference (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) coding sequence. In some embodiments, the 1-SST, 1-FFT, or 6-SFT enzyme coding sequence comprises a mutation in 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more codons of the coding sequence relative to a reference (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) coding sequence. As will be understood by one of ordinary skill in the art, a mutation within a codon may or may not change the amino acid that is encoded by the codon due to degeneracy of the genetic code. In some embodiments, the one or more mutations in the coding sequence do not alter the amino acid sequence encoded by the coding sequence (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme).
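As a non-limiting illustration of how degeneracy of the genetic code affects the outcome of a codon mutation, the Python sketch below compares two coding sequences codon by codon and labels each changed codon as synonymous or nonsynonymous. The standard genetic code is assumed; the function name and the example sequences are hypothetical, and the sketch assumes in-frame sequences of equal length containing only A, C, G, and T.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_codon_changes(reference, variant):
    """List changed codons as tuples of
    (codon number, reference codon, variant codon,
     reference amino acid, variant amino acid, kind)."""
    if len(reference) != len(variant) or len(reference) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    changes = []
    for start in range(0, len(reference), 3):
        ref_codon = reference[start:start + 3]
        var_codon = variant[start:start + 3]
        if ref_codon == var_codon:
            continue
        ref_aa = CODON_TABLE[ref_codon]
        var_aa = CODON_TABLE[var_codon]
        kind = "synonymous" if ref_aa == var_aa else "nonsynonymous"
        changes.append(
            (start // 3 + 1, ref_codon, var_codon, ref_aa, var_aa, kind)
        )
    return changes


# Hypothetical 3-codon example: codon 2 changes silently, codon 3 does not.
for change in classify_codon_changes("ATGGCTGAA", "ATGGCCGAT"):
    print(change)
```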

In some embodiments, the one or more mutations in a recombinant 1-SST, 1-FFT, or 6-SFT enzyme sequence alter the amino acid sequence of the polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme). In some embodiments, the one or more mutations alter the amino acid sequence of the recombinant polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) and alter (enhance or reduce) an activity of the polypeptide relative to the reference polypeptide.

The activity, including specific activity, of any of the recombinant polypeptides described in this application (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) may be measured using routine methods. As a non-limiting example, a recombinant polypeptide's activity may be determined by measuring its substrate specificity, product(s) produced, the concentration of product(s) produced, or any combination thereof. As used in this application, “specific activity” of a recombinant polypeptide refers to the amount (e.g., concentration) of a particular product produced for a given amount (e.g., concentration) of the recombinant polypeptide per unit time.
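As a non-limiting illustration of the definition of specific activity given above, the Python sketch below divides the amount of product by the amount of recombinant polypeptide and by the elapsed time. The function name, the units, and the numerical values are assumptions chosen for illustration and do not represent measured data.

```python
def specific_activity(product_amount, enzyme_amount, elapsed_time):
    """Amount of product formed per amount of enzyme per unit time.

    Units follow whatever units the caller supplies (for example, mM of
    product, mg/mL of enzyme, and minutes).
    """
    if enzyme_amount <= 0 or elapsed_time <= 0:
        raise ValueError("enzyme amount and elapsed time must be positive")
    return product_amount / (enzyme_amount * elapsed_time)


# Hypothetical values: 12.0 mM product, 0.5 mg/mL enzyme, 30 minutes.
print(specific_activity(12.0, 0.5, 30.0))
```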

The skilled artisan will also realize that mutations in a recombinant polypeptide (e.g., 1-SST, 1-FFT, or 6-SFT enzyme) coding sequence may result in conservative amino acid substitutions that provide functionally equivalent variants of the foregoing polypeptides, e.g., variants that retain the activities of the polypeptides. As used in this application, a “conservative amino acid substitution” refers to an amino acid substitution that does not alter the relative charge or size characteristics or functional activity of the protein in which the amino acid substitution is made.

In some instances, an amino acid is characterized by its R group (see, e.g., Table 1). For example, an amino acid may comprise a nonpolar aliphatic R group, a positively charged R group, a negatively charged R group, a nonpolar aromatic R group, or a polar uncharged R group. Non-limiting examples of an amino acid comprising a nonpolar aliphatic R group include alanine, glycine, valine, leucine, methionine, and isoleucine. Non-limiting examples of an amino acid comprising a positively charged R group include lysine, arginine, and histidine. Non-limiting examples of an amino acid comprising a negatively charged R group include aspartate and glutamate. Non-limiting examples of an amino acid comprising a nonpolar aromatic R group include phenylalanine, tyrosine, and tryptophan. Non-limiting examples of an amino acid comprising a polar uncharged R group include serine, threonine, cysteine, proline, asparagine, and glutamine.

Non-limiting examples of functionally equivalent variants of polypeptides may include conservative amino acid substitutions in the amino acid sequences of proteins disclosed in this application. As used in this application, “conservative substitution” is used interchangeably with “conservative amino acid substitution” and refers to any one of the amino acid substitutions provided in Table 1.

In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more than 20 residues can be changed when preparing variant polypeptides. In some embodiments, amino acids are replaced by conservative amino acid substitutions.

TABLE 1

Conservative Amino Acid Substitutions.

Original Residue    R Group Type                    Conservative Amino Acid Substitutions

Ala                 nonpolar aliphatic R group      Cys, Gly, Ser
Arg                 positively charged R group      His, Lys
Asn                 polar uncharged R group         Asp, Gln, Glu
Asp                 negatively charged R group      Asn, Gln, Glu
Cys                 polar uncharged R group         Ala, Ser
Gln                 polar uncharged R group         Asn, Asp, Glu
Glu                 negatively charged R group      Asn, Asp, Gln
Gly                 nonpolar aliphatic R group      Ala, Ser
His                 positively charged R group      Arg, Tyr, Trp
Ile                 nonpolar aliphatic R group      Leu, Met, Val
Leu                 nonpolar aliphatic R group      Ile, Met, Val
Lys                 positively charged R group      Arg, His
Met                 nonpolar aliphatic R group      Ile, Leu, Phe, Val
Pro                 polar uncharged R group
Phe                 nonpolar aromatic R group       Met, Trp, Tyr
Ser                 polar uncharged R group         Ala, Gly, Thr
Thr                 polar uncharged R group         Ala, Asn, Ser
Trp                 nonpolar aromatic R group       His, Phe, Tyr, Met
Tyr                 nonpolar aromatic R group       His, Phe, Trp
Val                 nonpolar aliphatic R group      Ile, Leu, Met, Thr
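As a non-limiting illustration, the substitutions listed in Table 1 may be encoded as a lookup so that a proposed substitution can be checked against the table. The Python sketch below transcribes Table 1 as written, including the absence of any listed substitution for Pro; the dictionary and function names are assumptions made for illustration.

```python
# Transcribed from Table 1: original residue -> listed conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Cys", "Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "Glu"},
    "Asp": {"Asn", "Gln", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Asp", "Glu"},
    "Glu": {"Asn", "Asp", "Gln"},
    "Gly": {"Ala", "Ser"},
    "His": {"Arg", "Tyr", "Trp"},
    "Ile": {"Leu", "Met", "Val"},
    "Leu": {"Ile", "Met", "Val"},
    "Lys": {"Arg", "His"},
    "Met": {"Ile", "Leu", "Phe", "Val"},
    "Pro": set(),  # Table 1 lists no substitution for Pro
    "Phe": {"Met", "Trp", "Tyr"},
    "Ser": {"Ala", "Gly", "Thr"},
    "Thr": {"Ala", "Asn", "Ser"},
    "Trp": {"His", "Phe", "Tyr", "Met"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Met", "Thr"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists `replacement` as a substitution for `original`."""
    if original not in CONSERVATIVE_SUBSTITUTIONS:
        raise KeyError(f"unknown residue: {original}")
    return replacement in CONSERVATIVE_SUBSTITUTIONS[original]


print(is_conservative("Thr", "Asn"))
print(is_conservative("Asn", "Thr"))
```

The lookup is directional because Table 1 is not symmetric: for example, Asn is listed as a substitution for Thr, whereas Thr is not listed as a substitution for Asn.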

Amino acid substitutions in the amino acid sequence of a polypeptide to produce a recombinant polypeptide variant having a desired property and/or activity can be made by alteration of the coding sequence of the polypeptide. Similarly, conservative amino acid substitutions in the amino acid sequence of a polypeptide to produce functionally equivalent variants of the polypeptide typically are made by alteration of the coding sequence of the recombinant polypeptide.

A sequence encoding an enzyme of the present disclosure may further encode a secretion signal. As a non-limiting example, a secretion signal may be selected based on the host cell of interest. In some embodiments, a secretion signal may be a yeast, plant, or bacterial secretion signal.

In some embodiments, a secretion signal comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical to:

(SEQ ID NO: 60)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDV

AVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA.

In some embodiments, a nucleic acid sequence encoding a secretion signal comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical to:

(SEQ ID NO: 61)

ATGAGATTTCCTTCAATTTTTACTGCTGTTTTATTCGCAGCATCCTCCGC

ATTAGCTGCTCCAGTCAACACTACAACAGAAGATGAAACGGCACAAATTC

CGGCTGAAGCTGTCATCGGTTACTCAGATTTAGAAGGGGATTTCGATGTT

GCTGTTTTGCCATTTTCCAACAGCACAAATAACGGGTTATTGTTTATAAA

TACTACTATTGCCAGCATTGCTGCTAAAGAAGAAGGGGTATCTCTCGAGA

AAAGAGAGGCTGAAGCT.

It should be appreciated that other secretion signals known to one of ordinary skill in the art would also be compatible with aspects of the disclosure.
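As a non-limiting illustration, the relationship between the nucleotide sequence of SEQ ID NO: 61 and the amino acid sequence of SEQ ID NO: 60 set out above can be checked by translating the former with the standard genetic code, as sketched below in Python. The variable and function names are assumptions made for illustration; the sequences are copied from the text above, and the Sequence Listing remains the authoritative source.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

SEQ_ID_NO_60 = (
    "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDV"
    "AVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA"
)

SEQ_ID_NO_61 = (
    "ATGAGATTTCCTTCAATTTTTACTGCTGTTTTATTCGCAGCATCCTCCGC"
    "ATTAGCTGCTCCAGTCAACACTACAACAGAAGATGAAACGGCACAAATTC"
    "CGGCTGAAGCTGTCATCGGTTACTCAGATTTAGAAGGGGATTTCGATGTT"
    "GCTGTTTTGCCATTTTCCAACAGCACAAATAACGGGTTATTGTTTATAAA"
    "TACTACTATTGCCAGCATTGCTGCTAAAGAAGAAGGGGTATCTCTCGAGA"
    "AAAGAGAGGCTGAAGCT"
)


def translate(coding_sequence):
    """Translate an in-frame DNA coding sequence with the standard code."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(
        CODON_TABLE[coding_sequence[i:i + 3]]
        for i in range(0, len(coding_sequence), 3)
    )


# Compare the translation of SEQ ID NO: 61 with SEQ ID NO: 60.
print(translate(SEQ_ID_NO_61) == SEQ_ID_NO_60)
```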

Nucleic Acids Encoding Enzymes of the Disclosure

Aspects of the present disclosure relate to recombinant enzymes, functional modifications and variants thereof, as well as uses relating thereto. For example, the enzymes and cells described in this application may be used to promote production of fructans, e.g., branched fructans, e.g., branched inulins. The methods may comprise using a host cell comprising one or more enzymes disclosed in this application, a cell lysate, isolated enzymes, or any combination thereof. Methods comprising recombinant expression of polynucleotides encoding an enzyme disclosed in this application in a host cell are encompassed by the present disclosure. In vitro methods comprising reacting one or more enzymes for the production of polyfructans in a reaction mixture with a BCAA pathway enzyme disclosed in this application are also encompassed by the disclosure. In some embodiments, the BCAA pathway enzyme is a 1-SST, 1-FFT, or 6-SFT enzyme, or a combination thereof.

A nucleic acid encoding any one or more of the recombinant polypeptides 1-SST, 1-FFT, and/or 6-SFT is encompassed by the disclosure and may be comprised within a host cell. In some embodiments, the nucleic acid is in the form of an operon. In some embodiments, at least one ribosome binding site is present between one or more of the coding sequences present in the nucleic acid.

In some embodiments, a nucleic acid provided in this application is a nucleic acid that hybridizes under high or medium stringency conditions to a nucleic acid encoding a 1-SST, 1-FFT, and/or 6-SFT, and that is biologically active. For example, high stringency conditions can include 0.2 to 1×SSC at 65° C. followed by a wash at 0.2×SSC at 65° C. In some embodiments, a nucleic acid provided in this application is a nucleic acid that hybridizes under low stringency conditions to a nucleic acid encoding a 1-SST, 1-FFT, and/or 6-SFT, and that is biologically active. For example, low stringency conditions can include 6×SSC at room temperature followed by a wash at 2×SSC at room temperature. Other hybridization conditions include 3×SSC at 40° C. or 50° C., followed by a wash in 1 or 2×SSC at 20° C., 30° C., 40° C., 50° C., 60° C., or 65° C.

Hybridizations can be conducted in the presence of formamide, e.g., 10%, 20%, 30%, 40% or 50%, which further increases the stringency of hybridization. Theory and practice of nucleic acid hybridization is described, e.g., in S. Agrawal (ed.) Methods in Molecular Biology, volume 20; and Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, e.g., part I chapter 2 “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, New York. Exemplary proteins may have at least about 50%, 70%, 80%, 90%, 95%, 98% or 99% homology or identity with a 1-SST, 1-FFT, or 6-SFT protein or a domain thereof, e.g., a catalytic domain. Other exemplary proteins may be encoded by a nucleic acid that has at least about 50%, 70%, 80%, 90%, 95%, 98% or 99% homology or identity with a nucleic acid encoding a 1-SST, 1-FFT, or 6-SFT protein or a domain thereof, e.g., a catalytic domain.

A nucleic acid encoding any one or more of the recombinant polypeptides described in this application may be incorporated into any appropriate vector through any method known in the art. For example, the vector may be an expression vector, including but not limited to a viral vector (e.g., a lentiviral, retroviral, adenoviral, or adeno-associated viral vector), any vector suitable for transient expression, any vector suitable for constitutive expression, or any vector suitable for inducible expression (e.g., a galactose-inducible or doxycycline-inducible vector).

In some embodiments, a vector replicates autonomously in the cell. In some embodiments, a vector integrates into a chromosome within a cell. A vector can contain one or more endonuclease restriction sites that are cut by a restriction endonuclease to insert and ligate a nucleic acid containing a gene described in this application to produce a recombinant vector that is able to replicate in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Cloning vectors include, but are not limited to: plasmids, fosmids, phagemids, virus genomes and artificial chromosomes. As used in this application, the terms “expression vector” or “expression construct” refer to a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell (e.g., microbe), such as a yeast cell. In some embodiments, the nucleic acid sequence of a gene described in this application is inserted into a cloning vector such that it is operably joined to regulatory sequences and, in some embodiments, expressed as an RNA transcript. In some embodiments, the vector contains one or more markers, such as a selectable marker, to identify cells transformed or transfected with the recombinant vector. In some embodiments, the nucleic acid sequence of a gene described in this application is recoded. Recoding may increase production of the gene product (e.g., by at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 100%, including all values in between) relative to a reference sequence that is not recoded.

A coding sequence and a regulatory sequence are said to be “operably joined” or “operably linked” when the coding sequence and the regulatory sequence are covalently linked and the expression or transcription of the coding sequence is under the influence or control of the regulatory sequence. If the coding sequence is to be translated into a functional protein, the coding sequence and the regulatory sequence are said to be operably joined if induction of a promoter in the 5′ regulatory sequence permits the coding sequence to be transcribed and if the nature of the linkage between the coding sequence and the regulatory sequence does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequence, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein.

In some embodiments, the nucleic acid encoding any one or more of the proteins described in this application is under the control of regulatory sequences (e.g., enhancer sequences). In some embodiments, a nucleic acid is expressed under the control of a promoter. The promoter can be a native promoter, e.g., the promoter of the gene in its endogenous context, which provides normal regulation of expression of the gene. Alternatively, a promoter can be a promoter that is different from the native promoter of the gene, e.g., the promoter is different from the promoter of the gene in its endogenous context.

Enzymes disclosed herein can be encoded by the same heterologous polynucleotide or by different heterologous polynucleotides. For example, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 enzymes can be encoded by the same heterologous polynucleotide or can be encoded by one or more different heterologous polynucleotides.

In some embodiments, a heterologous polynucleotide encoding a 1-SST enzyme also encodes a 1-FFT and/or a 6-SFT enzyme; a heterologous polynucleotide encoding a 1-FFT enzyme also encodes a 1-SST enzyme and/or a 6-SFT enzyme; or a heterologous polynucleotide encoding a 6-SFT enzyme also encodes a 1-SST enzyme and/or a 1-FFT enzyme.

In some embodiments, a heterologous polynucleotide comprises a single promoter operably linked to a polynucleotide encoding at least one enzyme. For example, a single nucleic acid encoding at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 enzymes may be operably linked to a single promoter. Expression of enzymes within a single heterologous polynucleotide may be controlled by any method known in the art, including, for example, by internal ribosome entry sites (IRES) or polypeptide cleavage signals such as 2A sequences.

In some instances, a heterologous polynucleotide comprises more than one promoter. In some instances, separate promoters are operably linked to at least two polynucleotide sequences that each encode an enzyme used to produce a polyfructan. In some instances, separate promoters are operably linked to each polynucleotide sequence encoding an enzyme used to produce a polyfructan.

In some embodiments, the promoter is a eukaryotic promoter. Non-limiting examples of eukaryotic promoters include TDH3, PGK1, PKC1, PDC1, TEF1, TEF2, RPL18B, SSA1, TDH2, PYK1, TPI1, GAL1, GAL10, GAL7, GAL3, GAL2, MET3, MET25, HXT3, HXT7, ACT1, 6-SFT1, 6-SFT2, CUP1-1, ENO2, pAOX1, pGAP1, and SOD1, as would be known to one of ordinary skill in the art (see, e.g., Addgene website: blog.addgene.org/plasmids-101-the-promoter-region). In some embodiments, the promoter is a prokaryotic promoter (e.g., bacteriophage or bacterial promoter). Non-limiting examples of bacteriophage promoters include Pls1con, T3, T7, SP6, and PL. Non-limiting examples of bacterial promoters include Pbad, PmgrB, Ptrc2, Plac/ara, Ptac, and Pm.

In some embodiments, the promoter is an inducible promoter. As used in this application, an “inducible promoter” is a promoter controlled by the presence or absence of a molecule. This may be used, for example, to controllably induce the expression of an enzyme. In some instances, an inducible promoter is used to controllably repress expression of an enzyme. Non-limiting examples of inducible promoters include chemically regulated promoters and physically regulated promoters. For chemically regulated promoters, the transcriptional activity can be regulated by one or more compounds, such as alcohol, tetracycline, galactose, a steroid, a metal, or other compounds. For physically regulated promoters, transcriptional activity can be regulated by a phenomenon such as light or temperature. Non-limiting examples of tetracycline-regulated promoters include anhydrotetracycline (aTc)-responsive promoters and other tetracycline-responsive promoter systems (e.g., a tetracycline repressor protein (tetR), a tetracycline operator sequence (tetO) and a tetracycline transactivator fusion protein (tTA)). Non-limiting examples of steroid-regulated promoters include promoters based on the rat glucocorticoid receptor, human estrogen receptor, moth ecdysone receptors, and promoters from the steroid/retinoid/thyroid receptor superfamily. Non-limiting examples of metal-regulated promoters include promoters derived from metallothionein (proteins that bind and sequester metal ions) genes. Non-limiting examples of pathogenesis-regulated promoters include promoters induced by salicylic acid, ethylene or benzothiadiazole (BTH). Non-limiting examples of temperature/heat-inducible promoters include heat shock promoters. Non-limiting examples of light-regulated promoters include light responsive promoters from plant cells. In certain embodiments, the inducible promoter is a galactose-inducible promoter. 
In some embodiments, the inducible promoter is induced by one or more physiological conditions (e.g., pH, temperature, radiation, osmotic pressure, saline gradients, cell surface binding, or concentration of one or more extrinsic or intrinsic inducing agents). Non-limiting examples of an extrinsic inducer or inducing agent include amino acids and amino acid analogs, saccharides and polysaccharides, nucleic acids, protein transcriptional activators and repressors, cytokines, toxins, petroleum-based compounds, metal containing compounds, salts, ions, enzyme substrate analogs, hormones or any combination thereof. In some embodiments, an inducible promoter is the pAOX1 promoter. In some embodiments, an inducible promoter is used to drive expression in a eukaryotic cell. In some embodiments, a eukaryotic cell is a yeast cell. In some embodiments, a yeast cell is a Pichia cell. In some embodiments, a yeast cell is a Saccharomyces cell.

In some embodiments, the promoter is a constitutive promoter. As used in this application, a “constitutive promoter” refers to an unregulated promoter that allows continuous transcription of a gene. Non-limiting examples of a constitutive promoter include TDH3, PGK1, PKC1, PDC1, TEF1, TEF2, RPL18B, SSA1, TDH2, PYK1, TPI1, HXT3, HXT7, ACT1, 6-SFT1, 6-SFT2, ENO2, pGAP1, and SOD1. In some embodiments, a constitutive promoter is used to drive expression in a eukaryotic cell. In some embodiments, a eukaryotic cell is a yeast cell. In some embodiments, a yeast cell is a Pichia cell. In some embodiments, a yeast cell is a Saccharomyces cell.

Other inducible promoters or constitutive promoters known to one of ordinary skill in the art are also compatible with aspects of the disclosure.

The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but generally can include, as necessary, 5′ non-transcribed and 5′ non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5′ non-transcribed regulatory sequences can include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences may also include enhancer sequences or upstream activator sequences. The vectors disclosed in this application may include 5′ leader or signal sequences. The regulatory sequence may also include a terminator sequence. In some embodiments, a terminator sequence marks the end of a gene in DNA during transcription. The choice and design of one or more appropriate vectors suitable for inducing expression of one or more genes described in this application in a heterologous organism is within the ability and discretion of one of ordinary skill in the art.

Expression vectors containing necessary elements for expression are commercially available and known to one of ordinary skill in the art (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press, 2012).

Host Cells

Any of the proteins or enzymes of the disclosure may be expressed in a host cell. The term “host cell” refers to a cell that can be used to express a polynucleotide, such as a polynucleotide that encodes an enzyme used in production of oligosaccharides.

The disclosed methods, compositions, and host cells are exemplified with Pichia pastoris cells, but are also applicable to other host cells. In this application, the term “Pichia pastoris” is used interchangeably with the term “Komagataella phaffii.”

Suitable host cells include, but are not limited to: yeast cells, bacterial cells, algal cells, plant cells, fungal cells, insect cells, and animal cells, including mammalian cells. In one illustrative embodiment, suitable host cells include Pichia pastoris.

Suitable yeast host cells include, but are not limited to: Candida, Hansenula, Saccharomyces, Schizosaccharomyces, Pichia, Kluyveromyces, and Yarrowia. In some embodiments, the yeast cell is Hansenula polymorpha, Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Saccharomyces diastaticus, Saccharomyces norbensis, Saccharomyces kluyveri, Schizosaccharomyces pombe, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia angusta, Kluyveromyces lactis, Candida albicans, or Yarrowia lipolytica.

In some embodiments, the yeast strain is an industrial polyploid yeast strain. Other non-limiting examples of fungal cells include cells obtained from Aspergillus spp., Penicillium spp., Fusarium spp., Rhizopus spp., Acremonium spp., Neurospora spp., Sordaria spp., Magnaporthe spp., Allomyces spp., Ustilago spp., Botrytis spp., and Trichoderma spp.

In certain embodiments, the host cell is an algal cell such as Chlamydomonas (e.g., C. reinhardtii) and Phormidium (P. sp. ATCC29409).

In other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include gram positive, gram negative, and gram-variable bacterial cells. The host cell may be a species of, but not limited to: Agrobacterium, Alicyclobacillus, Anabaena, Anacystis, Acinetobacter, Acidothermus, Arthrobacter, Azotobacter, Bacillus, Bifidobacterium, Brevibacterium, Butyrivibrio, Buchnera, Campestris, Campylobacter, Clostridium, Corynebacterium, Chromatium, Coprococcus, Escherichia, Enterococcus, Enterobacter, Erwinia, Fusobacterium, Faecalibacterium, Francisella, Flavobacterium, Geobacillus, Haemophilus, Helicobacter, Klebsiella, Lactobacillus, Lactococcus, Ilyobacter, Micrococcus, Microbacterium, Mesorhizobium, Methylobacterium, Mycobacterium, Neisseria, Pantoea, Pseudomonas, Prochlorococcus, Rhodobacter, Rhodopseudomonas, Roseburia, Rhodospirillum, Rhodococcus, Scenedesmus, Streptomyces, Streptococcus, Synechococcus, Saccharomonospora, Saccharopolyspora, Staphylococcus, Serratia, Salmonella, Shigella, Thermoanaerobacterium, Tropheryma, Tularensis, Temecula, Thermosynechococcus, Thermococcus, Ureaplasma, Xanthomonas, Xylella, Yersinia, and Zymomonas.

In some embodiments, the bacterial host strain is an industrial strain. Numerous bacterial industrial strains are known and suitable for the methods and compositions described in this application.

In some embodiments, the bacterial host cell is of the Agrobacterium species (e.g., A. radiobacter, A. rhizogenes, A. rubi), the Arthrobacter species (e.g., A. aurescens, A. citreus, A. globiformis, A. hydrocarboglutamicus, A. mysorens, A. nicotianae, A. paraffineus, A. protophormiae, A. roseoparaffinus, A. sulfureus, A. ureafaciens), or the Bacillus species (e.g., B. thuringiensis, B. anthracis, B. megaterium, B. subtilis, B. lentus, B. circulans, B. pumilus, B. lautus, B. coagulans, B. brevis, B. firmus, B. alkalophilus, B. licheniformis, B. clausii, B. stearothermophilus, B. halodurans and B. amyloliquefaciens). In particular embodiments, the host cell is an industrial Bacillus strain including but not limited to B. subtilis, B. pumilus, B. licheniformis, B. megaterium, B. clausii, B. stearothermophilus and B. amyloliquefaciens. In some embodiments, the host cell is an industrial Clostridium species (e.g., C. acetobutylicum, C. tetani E88, C. lituseburense, C. saccharobutylicum, C. perfringens, C. beijerinckii). In some embodiments, the host cell is an industrial Corynebacterium species (e.g., C. glutamicum, C. acetoacidophilum). In some embodiments, the host cell is an industrial Escherichia species (e.g., E. coli). In some embodiments, the host cell is an industrial Erwinia species (e.g., E. uredovora, E. carotovora, E. ananas, E. herbicola, E. punctata, E. terreus). In some embodiments, the host cell is an industrial Pantoea species (e.g., P. citrea, P. agglomerans). In some embodiments, the host cell is an industrial Pseudomonas species (e.g., P. putida, P. aeruginosa, P. mevalonii). In some embodiments, the host cell is an industrial Streptococcus species (e.g., S. equisimilis, S. pyogenes, S. uberis). In some embodiments, the host cell is an industrial Streptomyces species (e.g., S. ambofaciens, S. achromogenes, S. avermitilis, S. coelicolor, S. aureofaciens, S. aureus, S. fungicidicus, S. griseus, S. lividans). In some embodiments, the host cell is an industrial Zymomonas species (e.g., Z. mobilis, Z. lipolytica).

The present disclosure is also suitable for use with a variety of animal cell types, including mammalian cells, for example, human (including 293, HeLa, WI38, PER.C6 and Bowes melanoma cells), mouse (including 3T3, NSO, NS1, Sp2/0), hamster (CHO, BHK), monkey (COS, FRhL, Vero), and hybridoma cell lines.

In various embodiments, strains that may be used in the practice of the disclosure, including both prokaryotic and eukaryotic strains, are readily accessible to the public from a number of culture collections such as American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL). The present disclosure is also suitable for use with a variety of plant cell types.

The term “cell,” as used in this application, may refer to a single cell or a population of cells, such as a population of cells belonging to the same cell line or strain. Use of the singular term “cell” should not be construed to refer explicitly to a single cell rather than a population of cells. The host cell may comprise genetic modifications relative to a wild-type counterpart.

A vector encoding any one or more of the recombinant polypeptides (e.g., 1-SST, 1-FFT, and/or 6-SFT) described in this application may be introduced into a suitable host cell using any method known in the art. Host cells may be cultured under any suitable conditions, as would be understood by one of ordinary skill in the art. For example, any media, temperature, and incubation conditions known in the art may be used. For host cells carrying an inducible vector, cells may be cultured with an appropriate inducing agent to promote expression.

Any of the cells disclosed in this application can be cultured in media of any type (rich or minimal) and any composition prior to, during, and/or after contact and/or integration of a nucleic acid. The conditions of the culture or culturing process can be optimized through routine experimentation as would be understood by one of ordinary skill in the art. In some embodiments, the selected media is supplemented with various components. In some embodiments, the concentration and amount of a supplemental component is optimized. In some embodiments, other aspects of the media and growth conditions (e.g., pH, temperature, etc.) are optimized through routine experimentation. In some embodiments, the frequency that the media is supplemented with one or more supplemental components, and the amount of time that the cell is cultured, is optimized.

Culturing of the cells described in this application can be performed in culture vessels known and used in the art. In some embodiments, an aerated reaction vessel (e.g., a stirred tank reactor) is used to culture the cells. In some embodiments, a bioreactor or fermentor is used to culture the cell. Thus, in some embodiments, the cells are used in fermentation. The terms “bioreactor” and “fermentor” are interchangeably used in this application and refer to an enclosure, or partial enclosure, in which a biological, biochemical and/or chemical reaction takes place, involving a living organism or part of a living organism, including one or more secreted enzymes. A “large-scale bioreactor” or “industrial-scale bioreactor” is a bioreactor that is used to generate a product on a commercial or quasi-commercial scale. Large scale bioreactors typically have volumes in the range of liters, hundreds of liters, thousands of liters, or more.

In some embodiments, methods of culturing cell(s) of the present disclosure comprise overexpression of an enzyme described in this application. In some embodiments, methods of culturing cell(s) further comprise isolating or purifying enzymes expressed from the cell(s) (e.g., isolating enzymes following secretion of the enzymes by the cells).

Non-limiting examples of bioreactors include: stirred tank fermentors, bioreactors agitated by rotating mixing devices, chemostats, bioreactors agitated by shaking devices, airlift fermentors, packed-bed reactors, fixed-bed reactors, fluidized bed bioreactors, bioreactors employing wave induced agitation, centrifugal bioreactors, roller bottles, and hollow fiber bioreactors, roller apparatuses (for example benchtop, cart-mounted, and/or automated varieties), vertically-stacked plates, spinner flasks, stirring or rocking flasks, shaken multi-well plates, MD bottles, T-flasks, Roux bottles, multiple-surface tissue culture propagators, modified fermentors, and coated beads (e.g., beads coated with serum proteins, nitrocellulose, or carboxymethyl cellulose to prevent cell attachment).

In some embodiments, the bioreactor includes a cell culture system where the cell (e.g., bacterial cell) is in contact with moving liquids and/or gas bubbles. In some embodiments, the cell or cell culture is grown in suspension. In other embodiments, the cell or cell culture is attached to a solid phase carrier. Non-limiting examples of a carrier system include microcarriers (e.g., polymer spheres, microbeads, and microdisks that can be porous or non-porous), cross-linked beads (e.g., dextran) charged with specific chemical groups (e.g., tertiary amine groups), 2D microcarriers including cells trapped in nonporous polymer fibers, 3D carriers (e.g., carrier fibers, hollow fibers, multicartridge reactors, and semi-permeable membranes that can comprise porous fibers), microcarriers having reduced ion exchange capacity, encapsulation cells, capillaries, and aggregates. In some embodiments, carriers are fabricated from materials such as dextran, gelatin, glass, or cellulose.

In some embodiments, industrial-scale processes are operated in continuous, semi-continuous or non-continuous modes. Non-limiting examples of operation modes are batch, fed batch, extended batch, repetitive batch, draw/fill, rotating-wall, spinning flask, and/or perfusion mode of operation. In some embodiments, a bioreactor allows continuous or semi-continuous replenishment of the substrate stock, for example a carbohydrate source and/or continuous or semi-continuous separation of the product, from the bioreactor.

In some embodiments, the bioreactor or fermentor includes a sensor and/or a control system to measure and/or adjust reaction parameters. Non-limiting examples of reaction parameters include biological parameters (e.g., growth rate, cell size, cell number, cell density, cell type, or cell state, etc.), chemical parameters (e.g., pH, redox-potential, concentration of reaction substrate and/or product, concentration of dissolved gases, such as oxygen concentration and CO₂ concentration, nutrient concentrations, metabolite concentrations, concentration of an oligopeptide, concentration of an amino acid, concentration of a vitamin, concentration of a hormone, concentration of an additive, serum concentration, ionic strength, concentration of an ion, relative humidity, molarity, osmolarity, concentration of other chemicals, for example buffering agents, adjuvants, or reaction by-products), physical/mechanical parameters (e.g., density, conductivity, degree of agitation, pressure, and flow rate, shear stress, shear rate, viscosity, color, turbidity, light absorption, mixing rate, conversion rate, as well as thermodynamic parameters, such as temperature, light intensity/quality, etc.). Sensors to measure the parameters described in this application are well known to one of ordinary skill in the relevant mechanical and electronic arts. Control systems to adjust the parameters in a bioreactor based on the inputs from a sensor described in this application are well known to one of ordinary skill in the art of bioreactor engineering.

In some embodiments, methods involve batch fermentation (e.g., shake flask fermentation). General considerations for batch fermentation (e.g., shake flask fermentation) include the level of oxygen and glucose. For example, batch fermentation (e.g., shake flask fermentation) may be oxygen and glucose limited, so in some embodiments, the capability of a strain to perform in a well-designed fed-batch fermentation is underestimated. Also, the final product may display some differences from the substrate in terms of solubility, toxicity, cellular accumulation and secretion and in some embodiments can have different fermentation kinetics.

In some embodiments, the cells of the present disclosure are adapted to consume sucrose and produce fructans in vivo. In some embodiments, the cells are adapted to produce one or more enzymes for sucrose consumption via conversion to 1-kestose, 6-kestose, and/or inulin (e.g., 1-SST, 1-FFT, and/or 6-SFT). In such embodiments, the enzyme can catalyze reactions for the consumption of sucrose by bioconversion in an in vitro process.

In some embodiments, the cell(s) (e.g., host cell(s)) of the present disclosure comprise one or more heterologous polynucleotides encoding a 1-SST enzyme; a 1-FFT enzyme; and/or a 6-SFT enzyme. In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding a 1-SST enzyme and a 1-FFT enzyme. In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding a 1-SST enzyme and a 6-SFT enzyme. In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding a 1-FFT enzyme and a 6-SFT enzyme. In some embodiments, a host cell comprises one or more heterologous polynucleotides encoding a 1-SST enzyme, a 1-FFT enzyme, and a 6-SFT enzyme.

The term “heterologous” with respect to a polynucleotide, such as a polynucleotide comprising a gene, is used interchangeably with the term “exogenous” and the term “recombinant” and refers to: a polynucleotide that has been artificially supplied to a biological system; a polynucleotide that has been modified within a biological system, or a polynucleotide whose expression or regulation has been manipulated within a biological system. A heterologous polynucleotide that is introduced into or expressed in a host cell may be a polynucleotide that comes from a different organism or species from the host cell, or may be a synthetic polynucleotide, or may be a polynucleotide that is also endogenously expressed in the same organism or species as the host cell. For example, a polynucleotide that is endogenously expressed in a host cell may be considered heterologous when it is situated non-naturally in the host cell; expressed recombinantly in the host cell, either stably or transiently; modified within the host cell; selectively edited within the host cell; expressed in a copy number that differs from the naturally occurring copy number within the host cell; or expressed in a non-natural way within the host cell, such as by manipulating regulatory regions that control expression of the polynucleotide. In some embodiments, a heterologous polynucleotide is a polynucleotide that is endogenously expressed in a host cell but whose expression is driven by a promoter that does not naturally regulate expression of the polynucleotide. In other embodiments, a heterologous polynucleotide is a polynucleotide that is endogenously expressed in a host cell and whose expression is driven by a promoter that does naturally regulate expression of the polynucleotide, but the promoter or another regulatory region is modified. In some embodiments, the promoter is recombinantly activated or repressed. 
For example, gene-editing based techniques may be used to regulate expression of a polynucleotide, including an endogenous polynucleotide, from a promoter, including an endogenous promoter. See, e.g., Chavez et al., Nat Methods. 2016 Jul; 13(7): 563-567. A heterologous polynucleotide may comprise a wild-type sequence or a mutant sequence as compared with a reference polynucleotide sequence.

Methods

In some aspects, the disclosure provides methods comprising culturing a host cell described in this application (e.g., a host cell comprising a heterologous polynucleotide encoding at least one enzyme selected from the group consisting of 1-SST, 1-FFT, and 6-SFT). In some embodiments, the disclosure provides a method of producing fructans, e.g., inulins, from sucrose comprising culturing a host cell described in this application (e.g., a host cell comprising a heterologous polynucleotide encoding 1-SST, 1-FFT, and/or 6-SFT). In some embodiments, the production and culturing occurs in vivo. In some embodiments, production of one or more products occurs in vitro. In some embodiments, methods of producing fructans using host cells comprise secretion of expressed enzymes (e.g., 1-SST, 1-FFT, and/or 6-SFT) from the cells. Methods involving secreted enzymes may comprise contacting the secreted enzymes with sucrose in the media or in solution surrounding the host cells.

In some aspects, the disclosure provides methods of using isolated or purified enzymes. Non-limiting methods for protein purification may be found, e.g., in Janson, Protein purification: principles, high resolution methods, and applications, Third Edition (2011). In some embodiments, the disclosure provides a method comprising contacting (or incubating) saccharides with one or more enzymes described in this application to produce fructans. In some embodiments, methods of producing fructans comprise contacting saccharides (e.g., sucrose) with one or more of: a 1-SST enzyme; a 1-FFT enzyme; and a 6-SFT enzyme. In some embodiments, methods of producing fructans comprise contacting or incubating saccharides (e.g., sucrose) with a 1-SST enzyme and a 1-FFT enzyme. In some embodiments, methods of producing fructans comprise contacting or incubating saccharides (e.g., sucrose) with a 1-SST enzyme and a 6-SFT enzyme. In some embodiments, methods of producing fructans comprise contacting or incubating saccharides (e.g., sucrose) with a 1-FFT enzyme and a 6-SFT enzyme. In some embodiments, methods of producing fructans comprise contacting or incubating saccharides (e.g., sucrose) with a 1-SST enzyme, a 1-FFT enzyme, and a 6-SFT enzyme.

Production of a fructan may be carried out in a method whereby all the reactions take place in one reactor, such as a bioreactor, which can be referred to as a “one-pot bioconversion.” In some embodiments, at least two enzymes are used in a single reactor. In some embodiments, at least three enzymes are used in a single reactor.

As a non-limiting example of a one-pot bioconversion, in some embodiments, a single strain can be used to secrete multiple enzymes into media containing sucrose to produce a polyfructan. In other embodiments, multiple strains, each encoding one or more enzymes, can be combined into a single fermentation wherein they will each secrete enzymes into media. The secreted enzymes can convert sucrose into branched inulins. Without being bound by a particular theory, glucose and sucrose released from this process can be used to develop increased biomass of the strains and provide additional substrate for the formation of branched inulin. In some instances, a one-pot bioconversion comprises incubation of one or more purified enzymes with a substrate in a single reactor to produce a polyfructan.

In some instances, multiple reactors are used to produce polyfructans. Use of more than one reactor may be referred to as multiple pot bioconversion. In some instances, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 reactors are used. As a non-limiting example, a multiple pot bioconversion can comprise incubating isolated 1-SST with sucrose to form kestose. The kestose produced can then be isolated and incubated with 1-FFT and 6-SFT to convert the kestose into branched inulins. The resulting sucrose and glucose can also be isolated and used for host-cell biomass accumulation, for bioconversion, or for alternative processes. In some embodiments, multiple pot bioconversion comprises purification of a product of interest from one reactor and subsequent introduction of the purified product of interest as a substrate in a second reactor.

In some instances, one or more enzymes selected from 1-SST, 1-FFT, and 6-SFT do not comprise a secretion signal. In some instances, the one or more enzymes (e.g., two or more or three or more enzymes) catalyze production of a fructan within a cell by fermentation. For example, a fructan may be produced within a cell and subsequently secreted from the cell, isolated from the cell, or purified from the cell. In some instances, the secreted fructan is the substrate for another reaction. In some instances, the secreted fructan is imported by a cell as a substrate for another reaction. In some instances, a fructan is produced within a cell and subsequently isolated or purified from a cell. The isolated or purified fructan may be used as the substrate for another reaction.

In some aspects, the disclosure provides methods of producing a fructan, comprising first contacting sucrose with a 1-SST enzyme to produce kestose (e.g., 1-kestose); and subsequently contacting kestose (e.g., 1-kestose) with a 1-FFT enzyme and/or a 6-SFT enzyme to produce the fructan. In some embodiments, such a two-step method comprises the use of host cells (e.g., comprising 1-SST, 1-FFT, and/or 6-SFT) and/or the use of isolated enzymes (e.g., 1-SST, 1-FFT, and/or 6-SFT). In some embodiments, kestose produced by contacting sucrose with a 1-SST enzyme is purified prior to being contacted with a 1-FFT enzyme and/or 6-SFT enzyme.

Methods of producing fructans may comprise isolating or purifying said fructans away from host cells and/or enzymes, in accordance with any isolation or purification technique known in the art.

The present invention is further illustrated by the following Examples, which should not be construed as limiting. The entire contents of all of the references (including literature references, issued patents, published patent applications, and co-pending patent applications) cited throughout this application are hereby expressly incorporated by reference. If a reference incorporated in this application contains a term whose definition is incongruous or incompatible with the definition of same term as defined in the present disclosure, the meaning ascribed to the term in this disclosure shall govern. Mention of any reference, article, publication, patent, patent publication, and patent application cited in this application is not, and should not be taken as, an acknowledgment or suggestion that they constitute valid prior art or form part of the common general knowledge of a skilled artisan.

EXAMPLES

In order that the invention described in this application may be more fully understood, the following examples are set forth. The examples described in this application are offered to illustrate the systems and methods provided in this application and are not to be construed as limiting their scope.

Example 1: Enzyme Library Design and Screening

Enzyme discovery

Machine-learning-based bioinformatics tools were used to identify enzyme candidates for each of the three desired enzymatic activities (1-SST, 1-FFT, and 6-SFT) in public sequence databases (SwissProt and TrEMBL, together known as UniProt). A single library of 152 enzymes was tested for each of the activities.

Library synthesis

DNA sequences for all 1-SST, 1-FFT, and 6-SFT enzymes were coded for expression in Pichia pastoris. Coding sequences were synthesized in an inducible Pichia pastoris expression vector under the control of the pAOX1 promoter.

Cell growth and enzyme preparation

Library plasmids were transformed into Pichia pastoris expression host cells. Enzymes were secreted into the media, separated from the cells, and concentrated.

Enzyme screening

Bioconversion reactions involved incubating individual enzymes with either sucrose or 1-kestose for 96 hours. The reactions were subsequently stopped by boiling. Samples were subjected to high-performance liquid chromatography and analyzed by a refractive index detector (HPLC-RID).

As shown in FIG. 3A, reactions involving incubation of individual enzymes with sucrose provided resultant product mixtures that could be quantified for their concentrations of fructans comprising β(2,6) linkages and fructans comprising β(2,1) linkages (corresponding to 1-kestose). Incubation with sucrose identified enzymes with either 6-SFT or 1-SST activities. 1-SST enzymes produced high levels of 3-sugar oligosaccharides that co-migrated with kestose on HPLC. Incubations with 1-SST did not produce longer sugar polymers. 6-SFT enzymes produced high levels of higher molecular-weight oligosaccharides comprising β(2,6) linkages. Some enzymes that showed minimal activity in polymerizing sucrose demonstrated invertase activity and produced high levels of glucose and fructose.

As shown in FIG. 3B, reactions involving incubation of individual enzymes with 1-kestose provided resultant product mixtures that could be quantified for their concentrations of inulins comprising β(2,1) linkages (labeled ‘Nystose’) and higher-order kestose molecules. Incubation with kestose identified enzymes with 1-FFT activity. Reactions were assayed for high levels of oligosaccharides containing four or more sugars, the formation of which yields sucrose as a by-product. Many enzymes generated these high molecular-weight species. Another class of enzymes, kestases, formed sucrose but did not show any activity in polymerizing high molecular-weight oligosaccharides.

Polyfructans produced were quantified by calculating the area under the curve of the HPLC chromatogram. An example of an HPLC chromatogram of a bioconversion reaction (an individual enzyme incubated with sucrose) is shown in FIG. 4 (top panel). An HPLC chromatogram of a preparation of commercially-available standards is also shown in FIG. 4 (bottom panel).
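As a non-limiting illustration of the area-under-the-curve quantification described above, the following Python sketch integrates a detector trace over a retention-time window using the trapezoidal rule. The function name `peak_area`, the flat `baseline` parameter, and the example trace are hypothetical and are not taken from the experiments described in this application; the integration software actually used is not specified here.

```python
def peak_area(times, signal, start, end, baseline=0.0):
    """Trapezoidal area of (signal - baseline) for start <= t <= end.

    times  -- retention times in minutes, in ascending order
    signal -- detector response at each time point
    """
    # Keep only the points inside the integration window.
    points = [(t, s - baseline) for t, s in zip(times, signal) if start <= t <= end]
    area = 0.0
    # Sum the trapezoids between consecutive points.
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area


# Hypothetical trace: a triangular peak from 4.0 to 6.0 min with height 10.
times = [3.0, 4.0, 5.0, 6.0, 7.0]
signal = [0.0, 0.0, 10.0, 0.0, 0.0]
area = peak_area(times, signal, 4.0, 6.0)  # 10.0
```

A flat baseline is assumed for simplicity; a real chromatogram may call for a sloped or fitted baseline.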

Example 2: Characterization of high-performing enzymes

Top-performing enzymes were selected for further development. Individual enzymes that showed 6-SFT, 1-SST, or 1-FFT activity in Example 1 were re-expressed, isolated, and assayed for ability to produce fructans. Enzyme preparations were incubated with either sucrose or 1-kestose before bioconversion reactions were analyzed by HPLC-RID and compared to saccharide standards. Peaks were identified by HPLC retention time, and the conversion of sucrose to other sugars was quantified by the relative peak areas from HPLC integrations. Enzymes provided in Table 2 represent the most active of each of the three classes of enzymes (6-SFT, 1-SST, and 1-FFT). “High activity” refers to the highest activity of the proteins that were tested. All proteins were tested for functionality and rank-ordered according to their activity in polymerizing sugars. SEQ ID NOs: 3-4 were modified to include a secretion signal for Pichia pastoris, and the modified constructs (SEQ ID NOs: 25 and 27, respectively) were also identified as having 1-SST activity. SEQ ID NOs: 9-10 were also modified to include a secretion signal for Pichia pastoris, and the modified constructs (SEQ ID NOs: 32 and 34, respectively) were identified as having 1-FFT activity. SEQ ID NOs: 15-21 were also modified to include a secretion signal for Pichia pastoris, and the modified constructs (SEQ ID NOs: 39, 41, 43, 45, 47, 49, and 51, respectively) were identified as having 6-SFT activity.

TABLE 2

Top-Performing Enzymes

Enzyme    SEQ ID NO (Amino Acid)    SEQ ID NO (DNA)
1-SST     1                         5
1-FFT     7                         11
6-SFT     13                        22
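
As a non-limiting illustration of the relative-peak-area quantification described in Example 2, the following Python sketch normalizes integrated peak areas and estimates the fraction of sucrose converted. The function names and the example areas are hypothetical. The sketch assumes that all saccharides give a comparable detector response per unit mass, which may not hold for a given detector or column.

```python
def relative_peak_areas(areas):
    """Return each peak's share of the total integrated area."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: value / total for name, value in areas.items()}


def sucrose_conversion(areas, substrate="sucrose"):
    """Fraction of the total area no longer present as substrate."""
    return 1.0 - relative_peak_areas(areas)[substrate]


# Hypothetical integrated areas (arbitrary units), not measured data.
areas = {"sucrose": 25.0, "glucose": 25.0, "1-kestose": 50.0}
fractions = relative_peak_areas(areas)   # sucrose 0.25, glucose 0.25, 1-kestose 0.5
conversion = sucrose_conversion(areas)   # 0.75
```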

Example 3: Bioconversion of sucrose to branched inulin — “One Pot” Bioconversion

Using the enzymes described in Table 2, a bioconversion of sucrose to branched inulin was performed. As shown in FIG. 5, sucrose (dimer of glucose and fructose) can be converted to 1-kestose (comprising β(2,1) linkage) using a 1-SST enzyme. A 1-FFT enzyme then catalyzes formation of a linear inulin, which itself can be reacted with a 6-SFT enzyme to provide β(2,6) branched inulins.
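
As a non-limiting illustration of the reaction scheme above, the following Python sketch computes an idealized mass balance for building one fructan molecule from sucrose. It assumes that every fructosyl unit ultimately derives from one sucrose molecule, with one glucose released per fructosyl unit added to the starting sucrose, and that no hydrolysis (invertase-type) side reactions occur. The function name `ideal_balance` is hypothetical; the molar masses are standard values for sucrose, glucose, and an anhydrofructose residue.

```python
SUCROSE = 342.30    # g/mol
GLUCOSE = 180.16    # g/mol
FRUCTOSYL = 162.14  # g/mol, fructose residue within a chain (fructose minus water)


def ideal_balance(dp):
    """Idealized balance for one fructan of degree of polymerization dp.

    A fructan of DP dp has one glucose and (dp - 1) fructose units, so it
    consumes (dp - 1) sucrose molecules and releases (dp - 2) glucose.
    Returns (sucrose_in, glucose_out, fructan_molar_mass, mass_yield).
    """
    if dp < 3:
        raise ValueError("dp must be at least 3 (kestose)")
    sucrose_in = dp - 1
    glucose_out = dp - 2
    fructan_mass = SUCROSE + (dp - 2) * FRUCTOSYL
    mass_yield = fructan_mass / (sucrose_in * SUCROSE)
    return sucrose_in, glucose_out, fructan_mass, mass_yield


# Kestose (DP3): two sucrose in, one glucose out.
kestose = ideal_balance(3)
```

Under these assumptions the count is the same whether the added fructosyl units are joined through β(2,1) or β(2,6) linkages, and glucose accumulates as the degree of polymerization increases.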

The three enzymes (1-SST, 1-FFT, and 6-SFT) were combined in a single reaction and incubated with sucrose for 96 hours. After 96 hours, the reaction was stopped by boiling.

Bioconversion to branched inulin was assayed by HPLC-RID and gas chromatography/mass spectrometry (GC/MS). Saccharides were identified based on HPLC elution time. As shown in FIG. 6A, higher molecular-weight saccharides (n=3 to n=6) were identified as HPLC peaks that eluted before sucrose. This one-pot conversion reaction showed an increase in glucose formation as well as the formation of early-eluting high-molecular weight material, consistent with the hypothesis that this peak represents branched inulin. Comparison of this material with standards indicated that it consisted of material with a degree of polymerization greater than 3 (DP3). Glucose did not co-elute with inulin (branched or otherwise). An HPLC assay of reactions showed a high release of glucose as a later-eluting peak in samples where branched inulin was being produced (as an early-eluting peak) (see, e.g., FIG. 6A).

GC/MS was then used to identify the presence of both β(2,1) and β(2,6) linkages in this bioconversion product mixture. Derivatization before GC/MS analysis was performed using a 4-step method that consisted of: 1) methylating free alcoholic -OH groups; 2) hydrolyzing the saccharide linkages; 3) reducing ketone and aldehyde groups; and 4) acylating the alcoholic -OH groups formed during step 3. Following this protocol, the samples were analyzed by GC/MS, which showed a series of products with a well-established elution order and characteristic fragmentation patterns (FIG. 6C-6D). GC/MS of the bioconversion sample resulted in a signature indicative of β(2,6) branched inulin. The bioconversion sample comprised a peak at 28.71 minutes, a peak that is characteristic of a known branched sugar (‘Best Ground’). Notably, this characteristic peak is not found in GC/MS analysis of linear saccharides (Chicory; Nicie).

Example 4: Bioconversion of sucrose to branched inulin—“Two Pot” Bioconversion

An isolated 1-SST enzyme is incubated with sucrose to form kestose. The kestose is isolated and then incubated with 1-FFT and 6-SFT enzymes, which convert the kestose into branched inulins.

The resulting sucrose and glucose can be isolated and used for host-cell biomass accumulation, as material for bioconversion, or in alternative processes.

Sequences

Non-limiting examples of 1-SST sequences

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEANLMRLRENDYPWTNDMLRWQRTGFHFQPGKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDY

TISWGHAVSKDLLHWNYLPMALRPDHWYDRKGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLL

EWKKSHVNPILVPPPGIEDHDFRDPFPVWYNESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQ

PVGMLECVDLFPVATTDSRANQALDMTTMRPGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGIG

LRYDWGKFYASKTFYDQEKQRRVLWGYVGEVDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRT

INKNFNSIPLYPGSTYQLDVGEATQLDIVAEFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQE

LSEQTATYFYVSRGIDGNLRTHFCQDELRSSKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVA

TSRVYPTEAIYSSARVFLFNNATDAIVTAKTVNVWHINSTYNHVFPGLVAP (SEQ ID NO: 1; secretion

signal is underlined)

MASSTKDVEAPPTLDAPLLGPAAPRSRLRVAPVSLSVMAFLLVAIAAAVLYYNPGGVASNLMRLRENDYPWTNDM

LRWQRTGFHFQPGKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDYTISWGHAVSKDLLHWNYLPMALRPDHWYDR

KGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLLEWKKSHVNPILVPPPGIEDHDFRDPFPVWY

NESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQPVGMLECVDLFPVATTDSRANQALDMTTMR

PGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGIGLRYDWGKFYASKTFYDQEKQRRVLWGYVGE

VDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRTINKNFNSIPLYPGSTYQLDVGEATQLDIVA

EFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQELSEQTATYFYVSRGIDGNLRTHFCQDELRS

SKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVATSRVYPTEAIYSSARVFLFNNATDAIVTAK

TVNVWHINSTYNHVFPGLVAP (SEQ ID NO: 2; secretion signal is underlined)

NLMRLRENDYPWTNDMLRWQRTGFHFQPGKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDYTISWGHAVSKDLLH

WNYLPMALRPDHWYDRKGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLLEWKKSHVNPILVPP

PGIEDHDFRDPFPVWYNESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQPVGMLECVDLFPVA

TTDSRANQALDMTTMRPGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGIGLRYDWGKFYASKTF

YDQEKQRRVLWGYVGEVDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRTINKNFNSIPLYPGS

TYQLDVGEATQLDIVAEFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQELSEQTATYFYVSRG

IDGNLRTHFCQDELRSSKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVATSRVYPTEAIYSSA

RVFLFNNATDAIVTAKTVNVWHINSTYNHVFPGLVAP (SEQ ID NO: 24)

MAKLNRSNIGLSLLLSMFLANFITDLEASSHQDLNQPYRTGYHFQPLKNWMNGPMIYKGIYHLFYQYNPYGAVWD

VRIVWGHSTSVDLVNWISQPPAFNPSQPSDINGCWSGSVTILPNGKPVILYTGIDQNKGQVQNVAVPVNISDPYL

REWSKPPQNPLMTTNAVNGINPDRFRDPTTAWLGRDGEWRVIVGSSTDDRRGLAILYKSRDFFNWTQSMKPLHYE

DLTGMWECPDFFPVSITGSDGVETSSVGENGIKHVLKVSLIETLHDYYTIGSYDREKDVYVPDLGFVQNESAPRL

DYGKYYASKTFYDDVKKRRILWGWVNESSPAKDDIEKGWSGLQSFPRKIWLDESGKELLQWPIEEIETLRGQQVN

WQKKVLKAGSTLQVHGVTAAQADVEVSFKVKELEKADVIEPSWTDPQKICSQGDLSVMSGLGPFGLMVLASNDME

EYTSVYFRIFKSNDDTNKKTKYVVLMCSDQSRSSLNDENDKSTFGAFVAIDPSHQTISLRTLIDHSIVESYGGGG

RTCITSRVYPKLAIGENANLFVFNKGTQSVDILTLSAWSLKSAQINGDLMSPFIEREESRSPNHQF (SEQ ID

NO: 3; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEADLNQPYRTGYHFQPLKNWMNGPMIYKGIYHLFYQYNPYGAVWDVRIVWGHSTSVDLVNWIS

QPPAFNPSQPSDINGCWSGSVTILPNGKPVILYTGIDQNKGQVQNVAVPVNISDPYLREWSKPPQNPLMTTNAVN

GINPDRFRDPTTAWLGRDGEWRVIVGSSTDDRRGLAILYKSRDFFNWTQSMKPLHYEDLTGMWECPDFFPVSITG

SDGVETSSVGENGIKHVLKVSLIETLHDYYTIGSYDREKDVYVPDLGFVQNESAPRLDYGKYYASKTFYDDVKKR

RILWGWVNESSPAKDDIEKGWSGLQSFPRKIWLDESGKELLQWPIEEIETLRGQQVNWQKKVLKAGSTLQVHGVT

AAQADVEVSFKVKELEKADVIEPSWTDPQKICSQGDLSVMSGLGPFGLMVLASNDMEEYTSVYFRIFKSNDDTNK

KTKYVVLMCSDQSRSSLNDENDKSTFGAFVAIDPSHQTISLRTLIDHSIVESYGGGGRTCITSRVYPKLAIGENA

NLFVFNKGTQSVDILTLSAWSLKSAQINGDLMSPFIEREESRSPNHQF (SEQ ID NO: 25; secretion

signal is underlined)

DLNQPYRTGYHFQPLKNWMNGPMIYKGIYHLFYQYNPYGAVWDVRIVWGHSTSVDLVNWISQPPAFNPSQPSDIN

GCWSGSVTILPNGKPVILYTGIDQNKGQVQNVAVPVNISDPYLREWSKPPQNPLMTTNAVNGINPDRFRDPTTAW

LGRDGEWRVIVGSSTDDRRGLAILYKSRDFFNWTQSMKPLHYEDLTGMWECPDFFPVSITGSDGVETSSVGENGI

KHVLKVSLIETLHDYYTIGSYDREKDVYVPDLGFVQNESAPRLDYGKYYASKTFYDDVKKRRILWGWVNESSPAK

DDIEKGWSGLQSFPRKIWLDESGKELLQWPIEEIETLRGQQVNWQKKVLKAGSTLQVHGVTAAQADVEVSFKVKE

LEKADVIEPSWTDPQKICSQGDLSVMSGLGPFGLMVLASNDMEEYTSVYFRIFKSNDDTNKKTKYVVLMCSDQSR

SSLNDENDKSTFGAFVAIDPSHQTISLRTLIDHSIVESYGGGGRTCITSRVYPKLAIGENANLFVFNKGTQSVDI

LTLSAWSLKSAQINGDLMSPFIEREESRSPNHQF (SEQ ID NO: 26)
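The relationship between a modified construct and its signal-free counterpart can be checked with a short Python sketch. SIGNAL below is the N-terminal segment shared by SEQ ID NOs: 1 and 25 as printed here. Treating that segment as the secretion signal is an assumption, because the underlining that marks the signal is not preserved in this text. Only the first lines of SEQ ID NOs: 25 and 26 are used, and the function name strip_signal is hypothetical.

```python
# Assumed secretion signal: the N-terminal segment shared by SEQ ID NOs: 1 and 25.
SIGNAL = (
    "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA"
    "KEEGVSLEKREAEA"
)

# First two printed lines of SEQ ID NO: 25 (modified 1-SST construct).
SEQ25_START = (
    "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA"
    "KEEGVSLEKREAEADLNQPYRTGYHFQPLKNWMNGPMIYKGIYHLFYQYNPYGAVWDVRIVWGHSTSVDLVNWIS"
)

# First printed line of SEQ ID NO: 26 (same enzyme without the signal).
SEQ26_START = (
    "DLNQPYRTGYHFQPLKNWMNGPMIYKGIYHLFYQYNPYGAVWDVRIVWGHSTSVDLVNWISQPPAFNPSQPSDIN"
)


def strip_signal(construct, signal):
    """Return the part of a construct that follows the given N-terminal signal."""
    if not construct.startswith(signal):
        raise ValueError("construct does not begin with the expected signal")
    return construct[len(signal):]


mature_start = strip_signal(SEQ25_START, SIGNAL)
# True when the signal-stripped construct matches the start of SEQ ID NO: 26.
print(SEQ26_START.startswith(mature_start))
```

The same check could be applied to the other modified constructs by substituting their printed sequences.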

MASPSDLESPPTLSAQLLESRPPRSKLRLVALTLTAAAFLVALALFLADGSASRFVSGLARKLRSDPIKEHDYPW

TNEMLTWQRSGFHFQPAKNFQSDPNAAMYYKGWYHFFYQYNPTGTAWDYTISWGHAVSRDLIHWLHLPMAMVPDH

WYDAKGVWSGYSTLLPDGRVIVLYTGGTPELVQVQNLAVPADASDPLLLKWKKSSVNPILVPPPGIGTSDFRDPF

PIWYNETDSNWHVLIGSKDSNHHGIVLLYKTKDFFNFTLLPSLLHTSTQSVGMFECVDLYPVATGGPLSNRGLEM

SVDLSNGGIKHVLKASMDEERHDYYAIGTFDLDSFKWTPDDPSIDVGVGLRYDWGKFYASKTFFDTEKQRRILWG

YVGEVDSKDDDKMKGWATLQNIPRTILLDTKTQSNLIIWPVEEVEDLRTDGNIFNDIKIGAGSSVQLDIGAASQL

DIEAEFELDNSALDGAIEADVTYNCSTSGGAANRGLLGPFGLLVLANQDLTEQTATYFYVSRGTDGDLRTHFCQD

ELRSSKAGDIVKRVVGSVVPVLHGETWSLRILVDHSIIESFAQRGRAVATSRVYPTEAIYNKARLFLFNNATDAK

VTAKSVKIWHMNSTHNHPFPGLESLFES (SEQ ID NO: 4; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEARSDPIKEHDYPWTNEMLTWQRSGFHFQPAKNFQSDPNAAMYYKGWYHFFYQYNPTGTAWDY

TISWGHAVSRDLIHWLHLPMAMVPDHWYDAKGVWSGYSTLLPDGRVIVLYTGGTPELVQVQNLAVPADASDPLLL

KWKKSSVNPILVPPPGIGTSDFRDPFPIWYNETDSNWHVLIGSKDSNHHGIVLLYKTKDFFNFTLLPSLLHTSTQ

SVGMFECVDLYPVATGGPLSNRGLEMSVDLSNGGIKHVLKASMDEERHDYYAIGTFDLDSFKWTPDDPSIDVGVG

LRYDWGKFYASKTFFDTEKQRRILWGYVGEVDSKDDDKMKGWATLQNIPRTILLDTKTQSNLIIWPVEEVEDLRT

DGNIFNDIKIGAGSSVQLDIGAASQLDIEAEFELDNSALDGAIEADVTYNCSTSGGAANRGLLGPFGLLVLANQD

LTEQTATYFYVSRGTDGDLRTHFCQDELRSSKAGDIVKRVVGSVVPVLHGETWSLRILVDHSIIESFAQRGRAVA

TSRVYPTEAIYNKARLFLFNNATDAKVTAKSVKIWHMNSTHNHPFPGLESLFES (SEQ ID NO: 27;

secretion signal is underlined)

RSDPIKEHDYPWTNEMLTWQRSGFHFQPAKNFQSDPNAAMYYKGWYHFFYQYNPTGTAWDYTISWGHAVSRDLIH

WLHLPMAMVPDHWYDAKGVWSGYSTLLPDGRVIVLYTGGTPELVQVQNLAVPADASDPLLLKWKKSSVNPILVPP

PGIGTSDFRDPFPIWYNETDSNWHVLIGSKDSNHHGIVLLYKTKDFFNFTLLPSLLHTSTQSVGMFECVDLYPVA

TGGPLSNRGLEMSVDLSNGGIKHVLKASMDEERHDYYAIGTFDLDSFKWTPDDPSIDVGVGLRYDWGKFYASKTF

FDTEKQRRILWGYVGEVDSKDDDKMKGWATLQNIPRTILLDTKTQSNLIIWPVEEVEDLRTDGNIFNDIKIGAGS

SVQLDIGAASQLDIEAEFELDNSALDGAIEADVTYNCSTSGGAANRGLLGPFGLLVLANQDLTEQTATYFYVSRG

TDGDLRTHFCQDELRSSKAGDIVKRVVGSVVPVLHGETWSLRILVDHSIIESFAQRGRAVATSRVYPTEAIYNKA

RLFLFNNATDAKVTAKSVKIWHMNSTHNHPFPGLESLFES (SEQ ID NO: 28)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctaacttgatgcgtttaagagagaatgattatccc

tggactaacgacatgctaagatggcaacgcacgggatttcacttccagcctggtaaaaacttccaagccgaccca

aatgcagctatgttttacaagggctggtaccatttcttttatcaatacaacccgaccggtgtggcttgggattac

acaatctcctggggtcacgctgtcagtaaggatttgctgcattggaattatcttccaatggccttgaggcctgac

cactggtacgatagaaaaggtgtttggagcggttactctactttattgccagacggtagaattgttgtcttgtac

accggtggaactaaggaattagttcaagtccaaaacttggctgtcccagtaaacctttctgacccattgctattg

gaatggaagaagtcacacgttaacccaatactcgttccacctccggggatcgaagatcatgatttccgagatcca

ttcccagtgtggtataatgaatctgactcgcggtggcacgttgtaattggttccaaagatccagagcactatggt

attgtcttgatctacactaccaaggacttcgttaactttacgttattaccaaacatattgcattccaccaagcag

ccggttggtatgctggaatgtgtagacttgttcccagttgctacaactgattctcgtgcaaatcaagctttggat

atgactaccatgaggcccggtcctgggctcaaatatgtgttaaaggcgagtatggatgacgaaagacacgattac

tacgccctaggtagctttgacttggactcgttcacttttacaccagatgatgaaaccattgacgtcggtattggt

cttagatacgactggggcaagttctacgcgtccaagactttttacgaccaagaaaaacaaagaagagttttgtgg

ggatacgtcggtgaagttgactcgaagcgtgatgatgctctgaaaggttgggcttctttgcaaaatatcccacgt

acaatcttgttcgacaccaaaaccaagtccaacctaattttgtggccagttgaagaagtcgagtctttaagaact

attaacaagaatttcaattcaatccctttgtatcctggttctacttaccagcttgatgtgggtgaagctacccaa

ttggatattgtggccgagttcgaagtcgatgaaaaggctattgaagctactgccgaagctgatgttacatataac

tgctccacctccggtggtgcagctaatagaggggttttgggtccattcggtttgttagttttagctaaccaagag

ttgtctgaacaaactgctacttacttctatgtctctcgcggcatagatggtaacttaagaacacatttttgtcaa

gacgaactgcgatcttccaaggctggtgccatcactaagcgggtagttggttctaccgtcccagttctacatggc

gaaacctgggccttgagaattttggtcgatcactcaatcgtagagtcttttgcacagagaggtagagctgttgcc

acgagtagagtctatcctacagaagcaatttatagctcagctagagtctttctattcaacaatgccactgacgct

attgttaccgctaagacagtaaacgtttggcacatcaactccacctacaatcatgtttttccgggtctggtcgct

cca (SEQ ID NO: 5)
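Table 2 pairs the 1-SST amino acid sequence (SEQ ID NO: 1) with the DNA sequence above (SEQ ID NO: 5). A minimal Python sketch of how that correspondence could be spot-checked is given below. It translates the first printed line of SEQ ID NO: 5 using the standard genetic code. The use of the standard code for this construct and the function name translate are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First printed line of SEQ ID NO: 5 (75 nucleotides, 25 codons).
seq5_first_line = (
    "atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca"
)
print(translate(seq5_first_line))  # MRFPSIFTAVLFAASSALAAPVNTT
```

The printed peptide matches the first 25 residues of SEQ ID NO: 1 as listed in this section.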

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctaacttgatgcgtttaagagagaatgattatccc

tggactaacgacatgctaagatggcaacgcacgggatttcacttccagcctggtaaaaacttccaagccgaccca

aatgcagctatgttttacaagggctggtaccatttcttttatcaatacaacccgaccggtgtggcttgggattac

acaatctcctggggtcacgctgtcagtaaggatttgctgcattggaattatcttccaatggccttgaggcctgac

cactggtacgatagaaaaggtgtttggagcggttactctactttattgccagacggtagaattgttgtcttgtac

accggtggaactaaggaattagttcaagtccaaaacttggctgtcccagtaaacctttctgacccattgctattg

gaatggaagaagtcacacgttaacccaatactcgttccacctccggggatcgaagatcatgatttccgagatcca

ttcccagtgtggtataatgaatctgactcgcggtggcacgttgtaattggttccaaagatccagagcactatggt

attgtcttgatctacactaccaaggacttcgttaactttacgttattaccaaacatattgcattccaccaagcag

ccggttggtatgctggaatgtgtagacttgttcccagttgctacaactgattctcgtgcaaatcaagctttggat

atgactaccatgaggcccggtcctgggctcaaatatgtgttaaaggcgagtatggatgacgaaagacacgattac

tacgccctaggtagctttgacttggactcgttcacttttacaccagatgatgaaaccattgacgtcggtattggt

cttagatacgactggggcaagttctacgcgtccaagactttttacgaccaagaaaaacaaagaagagttttgtgg

ggatacgtcggtgaagttgactcgaagcgtgatgatgctctgaaaggttgggcttctttgcaaaatatcccacgt

acaatcttgttcgacaccaaaaccaagtccaacctaattttgtggccagttgaagaagtcgagtctttaagaact

attaacaagaatttcaattcaatccctttgtatcctggttctacttaccagcttgatgtgggtgaagctacccaa

ttggatattgtggccgagttcgaagtcgatgaaaaggctattgaagctactgccgaagctgatgttacatataac

tgctccacctccggtggtgcagctaatagaggggttttgggtccattcggtttgttagttttagctaaccaagag

ttgtctgaacaaactgctacttacttctatgtctctcgcggcatagatggtaacttaagaacacatttttgtcaa

gacgaactgcgatcttccaaggctggtgccatcactaagcgggtagttggttctaccgtcccagttctacatggc

gaaacctgggccttgagaattttggtcgatcactcaatcgtagagtcttttgcacagagaggtagagctgttgcc

acgagtagagtctatcctacagaagcaatttatagctcagctagagtctttctattcaacaatgccactgacgct

attgttaccgctaagacagtaaacgtttggcacatcaactccacctacaatcatgtttttccgggtctggtcgct

ccataa (SEQ ID NO: 62)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgacttgaatcaaccttatagaaccggttaccac

ttccagccattaaaaaactggatgaacggcccaatgatttacaagggaatctatcatctgttttaccaatacaac

ccatacggtgccgtgtgggatgtaaggattgtctggggtcacagtacttccgtcgatttggttaattggataagc

caacccccggcattcaacccatcacaaccatctgacatcaacggttgttggtcgggttctgttacgattctacct

aatgggaagccagttatcctttatacaggtattgatcaaaacaagggtcaagttcagaatgtcgcggttccagtc

aatatctctgacccatatttgcgtgaatggtccaaaccacctcaaaacccattgatgactaccaacgctgttaac

ggtatcaaccctgatagatttagagatccaactacagcttggctaggaagagatggtgagtggagagtcattgtg

ggttcatctaccgacgaccgccggggtttggccatattatacaagtcccgcgatttctttaattggactcaatct

atgaaaccgttgcattacgaagatttgaccggaatgtgggaatgcccagacttcttcccagtttcaattacgggg

agtgatggtgtggaaacttcttccgtaggtgaaaacggtataaagcacgttctcaaggtcagcttaatcgaaact

ttgcatgactactataccattggttcgtatgacagagagaaggatgtctacgttcctgacttaggtttcgtccaa

aatgaatccgctccacgtttggattacgggaaatactacgcctctaagacattttatgacgacgtcaaaaagcgg

agaattttatggggttgggttaacgaatcttcgccagctaaggacgatattgaaaagggctggtctggtttgcag

tcatttccaagaaagatttggttggacgagagcggtaaagaattgctgcaatggccaatcgaagaaatagaaact

ctacgtggccaacaagttaactggcaaaagaaggttttgaaggctggttctaccttacaagtccacggtgttact

gctgctcaagcggatgtagaggtttccttcaaagtcaaggaattggaaaaagcagacgtcatcgaaccctcctgg

accgatccccaaaaaatatgttcgcagggtgacttgtctgttatgtctggtttaggtccgttcggtcttatggtt

cttgcttctaatgatatggaagaatacacttccgtttacttcagaatcttcaagagtaacgatgatactaataaa

aagaccaagtatgttgtgctcatgtgttccgatcaatcaagaagttctttgaacgatgagaacgataagtcaacc

tttggggcctttgttgctattgatccatctcatcagaccatctctctccgaacattgattgaccactccatagtc

gaatcatacggtggtggtggcagaacttgtatcacgagtagagtatatccaaagttggccatcggtgaaaatgca

aatttattcgtctttaacaagggtactcaatctgttgacattctgactttaagcgcttggtcccttaagagtgct

caaattaacggagacttgatgtctcctttcatcgagagagaagaaagtagatcacccaaccatcaattctaa

(SEQ ID NO: 29)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctagatcagatcctattaaagagcatgactatcca

tggactaatgaaatgttgacatggcaacgtagtggatttcacttccagcccgctaagaacttccaatccgaccca

aacgcagccatgtactacaagggctggtatcacttcttttaccaatacaatccgaccggtactgcttgggattac

acgatctcttggggtcatgctgtctcgcgggacttaatacactggcttcatctgccaatggctatggtaccagat

cactggtatgatgcgaagggtgtgtggtccggttactctaccctattgccagatggtagagttattgtcttatat

actggtggtaccccagaattggttcaagttcaaaacttggccgttcctgctgacgcctctgatccactgttgttg

aaatggaagaagtcctcagtcaaccccatccttgttccgccaccagggattggaactagcgacttcagggatcca

tttcctatctggtacaatgaaacagactccaactggcacgtcttgataggttctaaagactccaaccaccatggt

attgtattattgtataagactaaggacttctttaacttcacattgcttccatctttattgcacaccagtacccag

agcgttggtatgttcgaatgcgtggatctctacccagtcgctactggtgggccactatctaatagaggtttggaa

atgagcgttgatctctcaaatggtggtatcaaacatgttttgaaggcttctatggatgaggaaagacatgactac

tatgcgattggcacctttgacttagattctttcaaatggacgcccgacgatccaagtatcgacgttggtgtcggt

ctaagatacgattggggtaagttctacgcttctaagaccttttttgatactgaaaagcaacgccgaattttatgg

ggctatgtcggtgaagttgactccaaggatgatgacaagatgaaaggttgggcaaccttacaaaatatacctaga

actatcttgcttgacacgaaaactcaatctaacttgattatctggccagtcgaggaagttgaagatttgagaact

gacggcaacattttcaacgatataaaaattggtgctggttcttcagtacaattggatattggtgccgcttcgcag

ttggacatcgaagccgaatttgaactagataacagtgctttggacggcgctattgaagctgatgtcacttacaat

tgttcaacttcgggtggtgccgcaaatagaggtttgctggggcctttcggtttacttgttttagctaaccaagac

ttgacagaacaaaccgctacatacttctacgtgtccagaggtaccgatggtgatttgagaacccacttctgtcaa

gacgaattacgttcctccaaggcaggagacattgtcaagcgcgttgttggttctgtggtgccagttctacatggt

gaaacttggtccttgagaattttggttgaccactctatcatcgaaagctttgcacaaagaggacgggctgttgct

acctctagggtctacccaactgaggcaatctacaacaaagccagactgtttttgttcaacaatgctacagacgct

aaggttactgccaagagtgttaaaatatggcatatgaactctacacacaaccatccattccctggtttagaatcg

ctattcgaatcataa (SEQ ID NO: 30)

1-SST from Festuca arundinacea:

MESSAVVPGTTAPLLPYAYAPLPSSADDARENQSSGGVRWRVCAAVLAASALAVLIVVGLLAGGRVDRGPAGGDV

ASAAVPAVPMEIPRSRGKDFGVSEKASGAYSADGGFPWSNAMLQWQRTGFHFQPEKHYMNDPNGPVYYGGWYHLF

YQYNPKGDSWGNIAWAHAVSKDMVNWRHLPLAMVPDQWYDSNGVLTGSITVLPDGQVILLYTGNTDTLAQVQCLA

TPADPSDPLLREWIKHPANPILYPPPGIGLKDFRDPLTAWFDHSDNTWRTVIGSKDDDGHAGIILSYKTKDFVNY

ELMPGNMHRGPDGTGMYECIDLYPVGGNSSEMLGGDDSPDVLFVLKESSDDERHDYYALGRFDAAANIWTPIDQE

LDLGIGLRYDWGKYYASKSFYDQKKNRRIVWAYIGETDSEQADITKGWANLMTIPRTVELDKKTRTNLIQWPVEE

LDTLRRNSTDLSGITVDAGSVIRLPLHQGAQIDIEASFQLNSSDVDALTEADVSYNCSTSGAAVRGALGPFGLLV

LANGRTEQTAVYFYVSKGVDGALQTHFCHDESRSTQAKDVVNRMIGSIVPVLDGETFSVRVLVDHSIVQSFAMGG

RITATSRAYPTEAIYAAAGVYLFNNATGATVTAERLVVYEMASADNHIFTNDDL (SEQ ID NO: 6)

Non-limiting examples of 1-FFT sequences

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEASSVQPSAAERLTWERTAFHFQPAKNFIYDPNGPLFHMGWHHLFYQYNPYAPVWGNMSWGHA

VSKDMINWFELPVALVPTEWYDIEGVLSGSTTALPNGQIFALYTGNANDFSQLQCKAVPVDVSDPLLVKWVKYDG

NPILYTPPGIGLKDYRDPSTVWTGPDGKHRMIMGTKRGTTGLVLVYHTTDFTNYVMLDEPLHSVPNTDMWECVDL

FPVSTTNDSALDIAAYGSGIKHVLKESWEGHAMDFYSIGTYDAINDKWTPDNPELDVGIGLRCDYGRFFASKSLY

DPLKKRRVTWGYVAESDSADQDVSRGWATIYNVARTIVLDRKTGTHLLQWPVEELESLRSNVREFKEMTLEPGSI

VPLDIGSATQLDIIATFEVDQEALKATSDANDEYACTTSSGAAERGSFGPFGIAVLADGTLSELTPVYFYIAKNT

KGGVDTHFCTDKLRSSLDYDSEKVVYGSTIPVLDGEQITMRVLVDHSVVEGFAQGGRTVITSRVYPTKAIYEGAK

LFVFNNATTTNVKATLNVWQMSHALIQPYPF (SEQ ID NO: 7; secretion signal is

underlined)

MKTTEPLTDLEHAPNHTPLLDHPQPPPATVSKRLLIRVLSSITFVSLFFVSAFLLILLNQHESSYTDDNLAPLDR

SSVQPSAAERLTWERTAFHFQPAKNFIYDPNGPLFHMGWHHLFYQYNPYAPVWGNMSWGHAVSKDMINWFELPVA

LVPTEWYDIEGVLSGSTTALPNGQIFALYTGNANDFSQLQCKAVPVDVSDPLLVKWVKYDGNPILYTPPGIGLKD

YRDPSTVWTGPDGKHRMIMGTKRGTTGLVLVYHTTDFTNYVMLDEPLHSVPNTDMWECVDLFPVSTTNDSALDIA

AYGSGIKHVLKESWEGHAMDFYSIGTYDAINDKWTPDNPELDVGIGLRCDYGRFFASKSLYDPLKKRRVTWGYVA

ESDSADQDVSRGWATIYNVARTIVLDRKTGTHLLQWPVEELESLRSNVREFKEMTLEPGSIVPLDIGSATQLDII

ATFEVDQEALKATSDANDEYACTTSSGAAERGSFGPFGIAVLADGTLSELTPVYFYIAKNTKGGVDTHFCTDKLR

SSLDYDSEKVVYGSTIPVLDGEQITMRVLVDHSVVEGFAQGGRTVITSRVYPTKAIYEGAKLFVFNNATTTNVKA

TLNVWQMSHALIQPYPF (SEQ ID NO: 8; secretion signal is underlined)

SSVQPSAAERLTWERTAFHFQPAKNFIYDPNGPLFHMGWHHLFYQYNPYAPVWGNMSWGHAVSKDMINWFELPVA

LVPTEWYDIEGVLSGSTTALPNGQIFALYTGNANDFSQLQCKAVPVDVSDPLLVKWVKYDGNPILYTPPGIGLKD

YRDPSTVWTGPDGKHRMIMGTKRGTTGLVLVYHTTDFTNYVMLDEPLHSVPNTDMWECVDLFPVSTTNDSALDIA

AYGSGIKHVLKESWEGHAMDFYSIGTYDAINDKWTPDNPELDVGIGLRCDYGRFFASKSLYDPLKKRRVTWGYVA

ESDSADQDVSRGWATIYNVARTIVLDRKTGTHLLQWPVEELESLRSNVREFKEMTLEPGSIVPLDIGSATQLDII

ATFEVDQEALKATSDANDEYACTTSSGAAERGSFGPFGIAVLADGTLSELTPVYFYIAKNTKGGVDTHFCTDKLR

SSLDYDSEKVVYGSTIPVLDGEQITMRVLVDHSVVEGFAQGGRTVITSRVYPTKAIYEGAKLFVFNNATTTNVKA

TLNVWQMSHALIQPYPF (SEQ ID NO: 31)

MKTIEPFSDVENAPNSTPLLNHPEPPRAAVRKQSFVRVLSSITLVSLFFVLAFVLIVLNQQDSTTTVANSAPPGA

TVPEKSSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHAVSKDMIHWF

ELPVAIVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVEWVKYEDNPILYIPPG

IGPKDYRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDFYPVSTINDS

ALDIAAYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPEFDVGIGLRVDYGRFFASKSLYDPLKKRRVT

WGYVAESDSSDQDLNRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSIIPLDIGMAT

QLDIVATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADVALSELTPVYFYIAKNIDGGLVTHFC

TDKLRSSLDYDGERVVYGSTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVMTSRVYPTNAIYEEAKIFLFNNATG

ASVKASLKIWQMGSASIQAYPF (SEQ ID NO: 9; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEASSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHA

VSKDMIHWFELPVAIVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVEWVKYED

NPILYIPPGIGPKDYRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDF

YPVSTINDSALDIAAYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPEFDVGIGLRVDYGRFFASKSLY

DPLKKRRVTWGYVAESDSSDQDLNRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSI

IPLDIGMATQLDIVATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADVALSELTPVYFYIAKNI

DGGLVTHFCTDKLRSSLDYDGERVVYGSTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVMTSRVYPTNAIYEEAK

IFLFNNATGASVKASLKIWQMGSASIQAYPF (SEQ ID NO: 32; secretion signal is

underlined)

SSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHAVSKDMIHWFELPVA

IVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVEWVKYEDNPILYIPPGIGPKD

YRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDFYPVSTINDSALDIA

AYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPEFDVGIGLRVDYGRFFASKSLYDPLKKRRVTWGYVA

ESDSSDQDLNRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSIIPLDIGMATQLDIV

ATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADVALSELTPVYFYIAKNIDGGLVTHFCTDKLR

SSLDYDGERVVYGSTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVMTSRVYPTNAIYEEAKIFLFNNATGASVKA

SLKIWQMGSASIQAYPF (SEQ ID NO: 33)

MKTIEPFSDVENAPNSTPLLNHPEPSRAAVRKQSFVRVLSSITLVSLFFVLAFVLIVLNQQDSTNTVANSAPPGA

TVPEKSSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHAVSKDMIHWF

ELPVAMVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVDWVKYEDNPILYIPPG

IGPKDYRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDFYPVSTINDS

ALDIAAYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPELDVGIGLRVDYGRLFASKSLYDPLKKRRVT

WGYVGESDSPDQDINRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSIIPLDIGMAT

QLDIVATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADLALSELTPLYFYIAKNTDGGLVTHFC

TDKLRSSLDYDGERVVYGGTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVITSRVYPTNAIYEEAKIFLFNNATG

ASVKASLKIWQMGSASIQAYPF (SEQ ID NO: 10; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEASSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHA

VSKDMIHWFELPVAMVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVDWVKYED

NPILYIPPGIGPKDYRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDF

YPVSTINDSALDIAAYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPELDVGIGLRVDYGRLFASKSLY

DPLKKRRVTWGYVGESDSPDQDINRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSI

IPLDIGMATQLDIVATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADLALSELTPLYFYIAKNT

DGGLVTHFCTDKLRSSLDYDGERVVYGGTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVITSRVYPTNAIYEEAK

IFLFNNATGASVKASLKIWQMGSASIQAYPF (SEQ ID NO: 34; secretion signal is

underlined)

SSVKHSQSDRLRWERTAYHFQPAKNFIYDPNGPLFHMGWYHLFYQYNPYAPIWGNMSWGHAVSKDMIHWFELPVA

MVPTEWYDIEGVLSGSTTALPNGQIFALYTGNAKDFSQLQCKAVPLNASDPLLVDWVKYEDNPILYIPPGIGPKD

YRDPSTVWTGPDGKHRMIMGTKQNGTGMVHVYHTTDFINYVLLDEPLHSVPNTDMWECVDFYPVSTINDSALDIA

AYGSDIKHVIKESWEGHGMDLYSIGTYDAYKDKWTPDNPELDVGIGLRVDYGRLFASKSLYDPLKKRRVTWGYVG

ESDSPDQDINRGWATIYNVGRTVVLDRKTGTHLLHWPVEEIESLRSNVREFNEIELVPGSIIPLDIGMATQLDIV

ATFKVDPEALMAKSDINSEYGCTTSSGATQRGSLGPFGIVVLADLALSELTPLYFYIAKNTDGGLVTHFCTDKLR

SSLDYDGERVVYGGTVPVLDGEELTMRLLVDHSVVEGFAQGGRTVITSRVYPTNAIYEEAKIFLFNNATGASVKA

SLKIWQMGSASIQAYPF (SEQ ID NO: 35)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctagttccgttcaaccttctgccgctgaacgttta

acctgggagagaactgcattccattttcagccagctaaaaatttcatttatgatccaaacggaccgctgtttcac

atgggctggcaccatcttttctaccaatacaacccctacgctccagtctggggtaatatgagctggggtcacgcg

gtgtcaaaggacatgataaactggttcgaattgccagtagccttagttccaacggaatggtatgatattgaaggt

gttctatctggttctactacagctttgcctaatgggcaaatctttgctttgtacaccggtaacgccaacgacttc

tcccaattgcaatgtaaggctgtcccagttgacgtgtcggatccattattggtcaaatgggttaagtatgacggt

aatccgatcttgtacactccacctggaatcggtctgaaggattatagagatccatctaccgtctggactggtcca

gacggtaagcataggatgattatgggtacaaagagaggtaccactggcttggttttagtttaccacacaacggat

ttcactaactacgtcatgttggacgaaccactccactcagtaccaaacactgacatgtgggaatgcgttgatctt

tttccggtcagcaccaccaatgatagtgctttggacatcgcggcttatggttccggtattaaacatgttttgaaa

gagtcttgggaaggtcacgcaatggatttctactccattgggacttacgatgctataaacgacaagtggactcct

gacaacccagaactagacgtcggtattggtttgagatgtgattacggtagatttttcgcatctaagtccctatac

gatcctttaaagaaacggagagttacctggggatatgtcgccgaatctgattcagccgaccaagacgtgtctcgc

ggttgggctacaatctataatgttgcaaggactattgttttagaccgtaagaccggcactcatctgcttcagtgg

ccagtcgaagaattggagtcccttagatcgaacgtgagagaatttaaggaaatgaccttggaaccaggttccatc

gttccattggatataggttctgctactcaattggatattatcgctacgttcgaagttgaccaagaagctttgaaa

gctacctctgacgctaacgacgaatacgcctgtacaacatcttcaggtgctgcggagcgtggttcgttcggtccc

ttcggtatcgctgtcctcgccgatggtaccttgtccgaactgactccagtatacttctacattgctaaaaatact

aagggcggggtcgatacgcacttttgtactgataagttgagaagctctttagactatgacagtgaaaaggttgtc

tacgggagtaccattccagttttagatggtgaacaaatcactatgagagttctcgtcgatcattccgttgtggaa

ggttttgcccagggtggtagaactgtaattaccagtagagtttaccctaccaaggctatatacgaaggtgccaag

ttgtttgtattcaataacgctacaactacaaatgttaaggcaacgttgaatgtatggcaaatgtcacacgccctc

atccaaccatacccattctaa (SEQ ID NO: 11)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctagttccgttaaacattctcagtcagatcgattg

aggtgggaacgtactgcctaccactttcaaccagcaaagaacttcatatatgaccctaatggtccacttttccac

atgggatggtaccatctattttaccaatataacccgtatgctccaatttggggcaatatgtcttggggtcacgct

gtgtccaaggacatgatccattggttcgagctgcccgtcgctatcgttccaacggaatggtacgatattgaaggt

gtattaagcggttcgacaactgcgttgccaaacggtcaaattttcgccttgtacaccggtaatgctaaggatttt

tctcaattacaatgcaaagctgtccctttgaacgcttccgacccattgttggttgaatgggttaagtacgaagat

aaccctatcctatatattccaccaggcatcggtcctaaggactacagagatccatctaccgtgtggacaggtcca

gatggtaaacacagaatgattatgggaaccaagcaaaacggtactgggatggttcatgtctaccacaccactgac

tttataaattatgtcttattagacgagccgttgcactccgtcccaaacaccgatatgtgggaatgtgtggacttc

tacccagtatctactatcaatgacagcgcgttggatattgcagcctacggttcagacatcaagcatgttataaaa

gaatcttgggaaggtcatggtatggatttatactctattggtacttatgacgcttacaaggataagtggacgcca

gataaccccgagttcgatgttgggattggtctgagagttgattacggcagattctttgcttccaagagcttgtac

gacccgttgaagaagagaagagtcacatggggttatgttgctgaaagtgattcttccgaccaagacctcaataga

ggttgggccacaatctataacgttggtagaactgtcgtcttggaccggaaaaccggtacacacctattacattgg

ccagtggaggaaattgaatctctgcgttcgaacgtcagagaatttaatgaaattgaattggttccaggatcgatc

ataccattggatattggtatggctactcaattggacatcgttgccaccttcaaagtagacccagaagctcttatg

gctaagtccgatattaactctgaatacggttgtaccacttcctcaggtgctactcagcgtgggtctttaggccct

tttggtatcgttgttttggctgacgtagctctatcggagttaaccccagtttacttctatatcgcaaagaatatc

gatggtggtctggtcactcacttctgtaccgataaattgcgctctagtttggactacgatggagaaagagttgtt

tacggttcaactgttccagtcttggacggtgaagaattaaccatgagattgctggtggatcatagtgtagtcgaa

ggtttcgctcaaggtggtagaactgttatgacctccagagtctaccccactaacgccatctatgaagaggcgaag

atttttcttttcaataacgcgactggcgctagtgttaaagcatctttgaagatttggcaaatgggttctgcctct

attcaggcttatcccttctaa (SEQ ID NO: 36)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctagttccgttaaacattctcagtcagatcgattg

aggtgggaacgtactgcctaccactttcaaccagcaaagaacttcatatatgaccctaatggtccacttttccac

atgggatggtaccatctattttaccaatataacccgtatgctccaatttggggcaatatgtcttggggtcacgct

gtgtccaaggacatgatccattggttcgagctgcccgtcgctatggttccaacggaatggtacgatattgaaggt

gtcttgtctgggagcaccacagctttgcctaacggtcaaatcttcgccttatacactggtaatgcgaaagatttt

tcccaattacaatgcaaggctgttccattgaacgcctcggacccattgctcgtagattgggtcaagtacgaagat

aacccaattttgtatatccccccaggtattggaccaaaggactacagagatccgagtaccgtgtggactggtcct

gacggtaaacacagaatgatcatgggtaccaagcaaaacggcactggtatggttcacgtataccatacaaccgac

tttattaattatgttttattggacgaaccattgcactctgttccaaatactgatatgtgggagtgtgtcgatttc

tacccagtctctacgataaacgacagcgcactcgatatagctgcttatggtagtgatattaagcacgttattaaa

gaatcttgggaaggtcatggtatggacttgtactccatcggtacttacgatgcttacaaggataagtggacccca

gacaaccctgaattagacgttggtatcgggctaagagtggactatggtagattgttcgcatcgaaaagcctttac

gatccactgaagaaaagaagagtcacttggggttacgttggcgagtctgattctccagatcaggacattaacaga

ggttgggcgaccatctataatgttggacgtaccgtcgttttggatagaaagactggtactcatctactgcactgg

cctgtcgaagaaatcgaatcattaagaagtaatgttagagaatttaacgaaattgagttggtaccaggttctata

attcctttggacattggtatggccacacaattggacatcgttgctacattcaaggttgatccagaagctttaatg

gctaagtctgacataaactccgaatacggttgtaccacttcctccggtgcgactcaaagaggttcgttgggtcca

ttcggtatcgtcgttctagccgatttggctctctctgaattgactccattatacttttatatcgctaagaacacc

gatgggggcttggtaacacacttctgtactgataaattaagatcaagtttggattacgacggtgaacgcgtcgta

tacggtggtacggttcccgtgttagacggggaagaactcaccatgaggctattggtcgatcattctgttgttgag

ggttttgctcaaggtggaagaaccgttattactagccgtgtctatcccacaaatgctatttatgaagaagccaag

attttcctttttaacaacgctaccggtgcatccgttaaggcttctttgaagatatggcaaatgggtagcgcttct

atccaagcctacccattctaa (SEQ ID NO: 37)

1-FFT from Echinops ritro:

EPFSDLEHAPNHTPLLDRPKTPPAAVSHRLLIRVLSTITVVSLFFVAAFLLVLNQQDSGNNPLPQDPPPQPSAAD

RLRWERTAYHYQPAKNFMYDPNGPIFHMGWYHLFYQYNPYSVFWGNMTWGHAVSKDMINWFELPVALAPVEWYDI

EGVLSGSTTVLPTGEIFALYTGNANDFSQLQCKAVPVNTSDPLLIDWVRYEGNPILYTPPGVGLTDYRDPSTVWT

GPDNIHRMIIGTRRNNTGLVLVYHTKDFINYELLDEPLHSVPDSGMWECVDLYPVSTMNDTALDVAAYGSGIKHV

LKESWEGHAKDFYSIGTYDAINDKWWPDNPELDLGMGWRCDYGRFFASKTLYDPLKKRRVTWGYVAESDSGDQDR

SRGWSNIYNVARTVMLDRKTGTNLLQWPVEEIESLRSKVHEFNEIELQPGSIIPLEVGSTTQLDIVATFEVNKDA

FEETNVNYNEYGCTSSKGASQRGRLGPFGIIVLADGNLLELTPVYFYIAKNNDGSLTTHFCTDKLRSSFDYDDEK

VVYGSTVPVLEGEKLTIRLMVDHSIIEGFAQGGRTVITSRVYPTKAIYDTAKLFLFNNATDITVKASLKVWHMAS

ANIQMYPF (SEQ ID NO: 12)

Non-limiting examples of 6-SFT sequences

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEAVPGKLESNADVEWQRSAYHFQPDKNFISDPDGPMYHMGWYHLFYQYNPESAIWGNITWGHS

VSRDMINWFHLPFAMVPDHWYDIEGVMTGSATVLPNGQIIMLYTGNAYDLSQLQCLAYAVNSSDPLLLEWKKYEG

NPILFPPPGVGYKDFRDPSTLWMGPDGEWRMVMGSKHNETIGCALVYRTTNFTHFELNEEVLHAVPHTGMWECVD

LYPVSTTHTNGLEMKDNGPNVKYILKQSGDEDRHDWYAIGTFDPEKDKWYPDDPENDVGIGLRYDYGKFYASKTF

YDQHKKRRVLWGYVGETDPPKSDLLKGWANILNIPRSVVLDTQTETNLIQWPIEEVEKLRSKKYDEFKDVELRPG

SLIPLEIGTATQLDISATFEIDEKKLESTLEADVLFNCTTSEGSVGRGVLGPFGIVVLADANRSEQLPVYFYIAK

DTDGTSRTYFCADESRSSKDKDVGKWVYGSSVPVLEGENYNMRLLVDHSIVEGFAQGGRTVVTSRVYPTMAIYGA

AKIFLFNNATGISVKASLKIWKMAEAQLDPFPLSGWSS (SEQ ID NO: 13; secretion signal is

underlined)

MASSTTATTPLILRDETQIRPQLAGSSVGRRLSMAKILSGILVFVLVICALVAVIHDQSQQTMATNNHQGGDKPT

SAATFTAPLPQVGLKRVPGKLESNADVEWQRSAYHFQPDKNFISDPDGPMYHMGWYHLFYQYNPESAIWGNITWG

HSVSRDMINWFHLPFAMVPDHWYDIEGVMTGSATVLPNGQIIMLYTGNAYDLSQLQCLAYAVNSSDPLLLEWKKY

EGNPILFPPPGVGYKDFRDPSTLWMGPDGEWRMVMGSKHNETIGCALVYRTTNFTHFELNEEVLHAVPHTGMWEC

VDLYPVSTTHTNGLEMKDNGPNVKYILKQSGDEDRHDWYAIGTFDPEKDKWYPDDPENDVGIGLRYDYGKFYASK

TFYDQHKKRRVLWGYVGETDPPKSDLLKGWANILNIPRSVVLDTQTETNLIQWPIEEVEKLRSKKYDEFKDVELR

PGSLIPLEIGTATQLDISATFEIDEKKLESTLEADVLFNCTTSEGSVGRGVLGPFGIVVLADANRSEQLPVYFYI

AKDTDGTSRTYFCADESRSSKDKDVGKWVYGSSVPVLEGENYNMRLLVDHSIVEGFAQGGRTVVTSRVYPTMAIY

GAAKIFLFNNATGISVKASLKIWKMAEAQLDPFPLSGWSS (SEQ ID NO: 14; secretion signal

is underlined)

VPGKLESNADVEWQRSAYHFQPDKNFISDPDGPMYHMGWYHLFYQYNPESAIWGNITWGHSVSRDMINWFHLPFA

MVPDHWYDIEGVMTGSATVLPNGQIIMLYTGNAYDLSQLQCLAYAVNSSDPLLLEWKKYEGNPILFPPPGVGYKD

FRDPSTLWMGPDGEWRMVMGSKHNETIGCALVYRTTNFTHFELNEEVLHAVPHTGMWECVDLYPVSTTHTNGLEM

KDNGPNVKYILKQSGDEDRHDWYAIGTFDPEKDKWYPDDPENDVGIGLRYDYGKFYASKTFYDQHKKRRVLWGYV

GETDPPKSDLLKGWANILNIPRSVVLDTQTETNLIQWPIEEVEKLRSKKYDEFKDVELRPGSLIPLEIGTATQLD

ISATFEIDEKKLESTLEADVLFNCTTSEGSVGRGVLGPFGIVVLADANRSEQLPVYFYIAKDTDGTSRTYFCADE

SRSSKDKDVGKWVYGSSVPVLEGENYNMRLLVDHSIVEGFAQGGRTVVTSRVYPTMAIYGAAKIFLFNNATGISV

KASLKIWKMAEAQLDPFPLSGWSS (SEQ ID NO: 38)

MGSHGKPPLPYAYKPLPSDADGERTGCTRWRVCATALTASAMVVVVVGATLLAGFRVDQAVDEEAAGGFPWSNEM

LQWQRSGYHFQTAKNYMSDPNGLMYYRGWYHMFFQYNPVGTDWDDGMEWGHAVSRNLVQWRTLPIAMVADQWYDI

LGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPNDPLLRRWTKHPANPVIWSPPGVGTKDFRDSMTAW

YDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRVERTGEWECIDFYPVGHRTSDNSSEMLHV

LKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYASTSFYDPAKKRRVLMGYVGEVDSKRADV

VKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATQLSDVTLNTGSVIHIPLRQGTQLDIEATFHLDASA

VAALNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYFYVSRGLDGGLHTSFCQDELRSSRAKDVT

KRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEAYQEAKVYLFNNATGASVTAERLVVHDMD

SAHNQLSNMDDYSYVQ (SEQ ID NO: 15; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEADEEAAGGFPWSNEMLQWQRSGYHFQTAKNYMSDPNGLMYYRGWYHMFFQYNPVGTDWDDGM

EWGHAVSRNLVQWRTLPIAMVADQWYDILGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPNDPLLRR

WTKHPANPVIWSPPGVGTKDFRDSMTAWYDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRV

ERTGEWECIDFYPVGHRTSDNSSEMLHVLKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYA

STSFYDPAKKRRVLMGYVGEVDSKRADVVKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATQLSDVTL

NTGSVIHIPLRQGTQLDIEATFHLDASAVAALNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYF

YVSRGLDGGLHTSFCQDELRSSRAKDVTKRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEA

YQEAKVYLFNNATGASVTAERLVVHDMDSAHNQLSNMDDYSYVQ (SEQ ID NO: 39; secretion signal is underlined)

DEEAAGGFPWSNEMLQWQRSGYHFQTAKNYMSDPNGLMYYRGWYHMFFQYNPVGTDWDDGMEWGHAVSRNLVQWR

TLPIAMVADQWYDILGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPNDPLLRRWTKHPANPVIWSPP

GVGTKDFRDSMTAWYDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRVERTGEWECIDFYPV

GHRTSDNSSEMLHVLKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYASTSFYDPAKKRRVL

MGYVGEVDSKRADVVKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATQLSDVTLNTGSVIHIPLRQGT

QLDIEATFHLDASAVAALNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYFYVSRGLDGGLHTSF

CQDELRSSRAKDVTKRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEAYQEAKVYLFNNATG

ASVTAERLVVHDMDSAHNQLSNMDDYSYVQ (SEQ ID NO: 40)

MGSHGKPPLPYAYKPLPSDADGERTGCTRWRVCAVALTASAMVVVVVGATLLAGFRVDQAVDEEAAGGFPWSNEM

LQWQRSGYHFQTAKNYMSDPNGLMYYRGWNHMFFQYNPVGTDWDDGMEWGHAVSRNLVQWRTLPIAMVADQWYDI

LGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPTDPLLRRWTKHPANPVIWSPPGVGTKDFRDPMTAW

YDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRVERTGEWECIDFYPVGRRTSDNSSEMLHV

LKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYASTSFYDPAKKRRVLMGYVGEVDSKRADV

VKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATELSDVTLNTGSVIHIPLRQGTQLDIEATFHLDASA

VAAFNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYFYVSRGLDGGLHTSFCQDELRSSRAKDVT

KRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEAYQEAKVYLFNNATGASVTAERLVVHEMD

SAHNQLSNMDDHSYVQ (SEQ ID NO: 16; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEADEEAAGGFPWSNEMLQWQRSGYHFQTAKNYMSDPNGLMYYRGWNHMFFQYNPVGTDWDDGM

EWGHAVSRNLVQWRTLPIAMVADQWYDILGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPTDPLLRR

WTKHPANPVIWSPPGVGTKDFRDPMTAWYDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRV

ERTGEWECIDFYPVGRRTSDNSSEMLHVLKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYA

STSFYDPAKKRRVLMGYVGEVDSKRADVVKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATELSDVTL

NTGSVIHIPLRQGTQLDIEATFHLDASAVAAFNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYF

YVSRGLDGGLHTSFCQDELRSSRAKDVTKRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEA

YQEAKVYLFNNATGASVTAERLVVHEMDSAHNQLSNMDDHSYVQ (SEQ ID NO: 41; secretion signal is underlined)

DEEAAGGFPWSNEMLQWQRSGYHFQTAKNYMSDPNGLMYYRGWNHMFFQYNPVGTDWDDGMEWGHAVSRNLVQWR

TLPIAMVADQWYDILGVLSGSMTVLPNGTVIMIYTGATNASAVEVQCIATPADPTDPLLRRWTKHPANPVIWSPP

GVGTKDFRDPMTAWYDESDDTWRTLLGSKDDNNGHHDGIAMMYKTKDFLNYELIPGILHRVERTGEWECIDFYPV

GRRTSDNSSEMLHVLKASMDDERHDYYSLGTYDSAANRWTPIDPELDLGIGLRYDWGKFYASTSFYDPAKKRRVL

MGYVGEVDSKRADVVKGWASIQSVPRTIALDEKTRTNLLLWPVEEIETLRLNATELSDVTLNTGSVIHIPLRQGT

QLDIEATFHLDASAVAAFNEADVGYNCSSSGGAVNRGALGPFGLLVLAAGDRRGEQTAVYFYVSRGLDGGLHTSF

CQDELRSSRAKDVTKRVIGSTVPVLDGEAFSMRVLVDHSIVQGFAMGGRTTMTSRVYPMEAYQEAKVYLFNNATG

ASVTAERLVVHEMDSAHNQLSNMDDHSYVQ (SEQ ID NO: 42)

MESSRGILIPGTPPLPYAYEPLPSSLTDANGQEDRRITGGVRWRAWAAVLAVGALVVAAAVFGASRVDRDAVASS

VPATAEHGVLEKASGPYSASGGFPWSNAMLQWQRTGYHFQPEKNYQNDPNGPVYYKGWYHFFYQHNPGGTGWGNI

SWGHAVSRDMVHWRHLPLAMVPEHWYDIEGVLTGSITVLPDGRVILLYTGNTETFAQVTCLAEAADPSDPLLREW

AKHPANPVVYPPPGIGMKDYRDPTTAWFDNSDNTWRIIIGSKNDTDHSGIVFTYKTKDFVSYELIPGYLYRGPAG

TGMYECIDLFAVGGGRAASDMYNSTAEDVLYVLKESSDDDRRDYYALGRFDAAANTWTPIDTERELGVALRYDYG

RYDTSKSFYDPVKQRRIVWGYVVETDSWSADAAKGWANLQSIPRTVELDEKTRTNLVQWPVGELNTLRINTTDLS

DITVGAGSVDSLPLHQTSQLDIEASFRINASTIEALNEVDVGYNCTMTSGAATRGALGPFGILVLANVALTEQTA

VYFYVSKGLDGGLRTHFCHDELRSTHATDVAKEVVGSTVPVLDGEDFSVRVLVDHSIVQSFVMGGRMTATSRAYP

TEAIYAAAGVYLFNNATGASITAEKLVVHDMDSSYNRIFTDEDLLVLD (SEQ ID NO: 17; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEASGPYSASGGFPWSNAMLQWQRTGYHFQPEKNYQNDPNGPVYYKGWYHFFYQHNPGGTGWGN

ISWGHAVSRDMVHWRHLPLAMVPEHWYDIEGVLTGSITVLPDGRVILLYTGNTETFAQVTCLAEAADPSDPLLRE

WAKHPANPVVYPPPGIGMKDYRDPTTAWFDNSDNTWRIIIGSKNDTDHSGIVFTYKTKDFVSYELIPGYLYRGPA

GTGMYECIDLFAVGGGRAASDMYNSTAEDVLYVLKESSDDDRRDYYALGRFDAAANTWTPIDTERELGVALRYDY

GRYDTSKSFYDPVKQRRIVWGYVVETDSWSADAAKGWANLQSIPRTVELDEKTRTNLVQWPVGELNTLRINTTDL

SDITVGAGSVDSLPLHQTSQLDIEASFRINASTIEALNEVDVGYNCTMTSGAATRGALGPFGILVLANVALTEQT

AVYFYVSKGLDGGLRTHFCHDELRSTHATDVAKEVVGSTVPVLDGEDFSVRVLVDHSIVQSFVMGGRMTATSRAY

PTEAIYAAAGVYLFNNATGASITAEKLVVHDMDSSYNRIFTDEDLLVLD (SEQ ID NO: 43; secretion signal is underlined)

SGPYSASGGFPWSNAMLQWQRTGYHFQPEKNYQNDPNGPVYYKGWYHFFYQHNPGGTGWGNISWGHAVSRDMVHW

RHLPLAMVPEHWYDIEGVLTGSITVLPDGRVILLYTGNTETFAQVTCLAEAADPSDPLLREWAKHPANPVVYPPP

GIGMKDYRDPTTAWFDNSDNTWRIIIGSKNDTDHSGIVFTYKTKDFVSYELIPGYLYRGPAGTGMYECIDLFAVG

GGRAASDMYNSTAEDVLYVLKESSDDDRRDYYALGRFDAAANTWTPIDTERELGVALRYDYGRYDTSKSFYDPVK

QRRIVWGYVVETDSWSADAAKGWANLQSIPRTVELDEKTRTNLVQWPVGELNTLRINTTDLSDITVGAGSVDSLP

LHQTSQLDIEASFRINASTIEALNEVDVGYNCTMTSGAATRGALGPFGILVLANVALTEQTAVYFYVSKGLDGGL

RTHFCHDELRSTHATDVAKEVVGSTVPVLDGEDFSVRVLVDHSIVQSFVMGGRMTATSRAYPTEAIYAAAGVYLF

NNATGASITAEKLVVHDMDSSYNRIFTDEDLLVLD (SEQ ID NO: 44)

MANAFPWSNAMLQWQRTGFHFQPDKYYQNDPNGPVYYGGWYHFFYQYNPSGSVWEPQIVWGHAVSKDLIHWRHLP

PALVPDQWYDIKGVLTGSITVLPDGKVILLYTGNTETFAQVTCLAEPADPSDPLLREWVKHPANPVVFPPPGIGM

KDFRDPTTAWYDESDGTWRTIIGSKNDSDHSGIVFSYKTKDFISYELMPGYMYRGPKGTGEYECIDLYAVGGGRK

ASDMYNSTAEDVLYVLKESSDDDRHDWYSLGRFDAAANKWTPIDTELELGVGLRYDWGKYYASKSFYDPVKKRRV

VWAYVGETDSERADITKGWANLQSIPRTVELDEKTRTNLIQWPVEELNTLRINTTDLSGITVGAGSVAFLPLHQT

AQLDIEATFRIDASAIEALNEADVSYNCTTSRGAATRGALGPFGLLVLANHALTEQTGVYFYVSKGLDGGLRTHF

CHDELRSSHASDVVKRVVGSTVPVLDGEDFSVRVLVDHSIVQSFAMGGRLTATSRAYPTEAIYAAAGVYMFNNAT

GTSVTAEKLVVHDMDSSYNHIYTDGDLVVVD (SEQ ID NO: 18)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEAANAFPWSNAMLQWQRTGFHFQPDKYYQNDPNGPVYYGGWYHFFYQYNPSGSVWEPQIVWGH

AVSKDLIHWRHLPPALVPDQWYDIKGVLTGSITVLPDGKVILLYTGNTETFAQVTCLAEPADPSDPLLREWVKHP

ANPVVFPPPGIGMKDFRDPTTAWYDESDGTWRTIIGSKNDSDHSGIVFSYKTKDFISYELMPGYMYRGPKGTGEY

ECIDLYAVGGGRKASDMYNSTAEDVLYVLKESSDDDRHDWYSLGRFDAAANKWTPIDTELELGVGLRYDWGKYYA

SKSFYDPVKKRRVVWAYVGETDSERADITKGWANLQSIPRTVELDEKTRTNLIQWPVEELNTLRINTTDLSGITV

GAGSVAFLPLHQTAQLDIEATFRIDASAIEALNEADVSYNCTTSRGAATRGALGPFGLLVLANHALTEQTGVYFY

VSKGLDGGLRTHFCHDELRSSHASDVVKRVVGSTVPVLDGEDFSVRVLVDHSIVQSFAMGGRLTATSRAYPTEAI

YAAAGVYMFNNATGTSVTAEKLVVHDMDSSYNHIYTDGDLVVVD (SEQ ID NO: 45; secretion signal is underlined)

ANAFPWSNAMLQWQRTGFHFQPDKYYQNDPNGPVYYGGWYHFFYQYNPSGSVWEPQIVWGHAVSKDLIHWRHLPP

ALVPDQWYDIKGVLTGSITVLPDGKVILLYTGNTETFAQVTCLAEPADPSDPLLREWVKHPANPVVFPPPGIGMK

DFRDPTTAWYDESDGTWRTIIGSKNDSDHSGIVFSYKTKDFISYELMPGYMYRGPKGTGEYECIDLYAVGGGRKA

SDMYNSTAEDVLYVLKESSDDDRHDWYSLGRFDAAANKWTPIDTELELGVGLRYDWGKYYASKSFYDPVKKRRVV

WAYVGETDSERADITKGWANLQSIPRTVELDEKTRTNLIQWPVEELNTLRINTTDLSGITVGAGSVAFLPLHQTA

QLDIEATFRIDASAIEALNEADVSYNCTTSRGAATRGALGPFGLLVLANHALTEQTGVYFYVSKGLDGGLRTHFC

HDELRSSHASDVVKRVVGSTVPVLDGEDFSVRVLVDHSIVQSFAMGGRLTATSRAYPTEAIYAAAGVYMFNNATG

TSVTAEKLVVHDMDSSYNHIYTDGDLVVVD (SEQ ID NO: 46)

MESRDIESSPALNAPLLQASPPIKSSKLKVALLATSTSVLLLIAAFFAVKYSVFDSGSGLLKDDPPSDSEDYPWT

NEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDYSISWGHAVSKDMIHWLHLPVAMVPDHW

YDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLIEWKKSNGNPILMPPPGVGPHDFRDPFP

VWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKDSVGMLECVDLYPVATTGNQIGNGLEMK

GGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGLRYDYGKFYASKTFYDQEKKRRILWGYV

GEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTSGKEFNGVVVEPGSTYHLDVGTATQLDI

EAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKMTEKTATYFYVSRNVDGGLQTHFCQDEL

RSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVATSRVYPTEAIYDSTRVFLFNNATSATVT

AKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 19; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEADDPPSDSEDYPWTNEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDY

SISWGHAVSKDMIHWLHLPVAMVPDHWYDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLI

EWKKSNGNPILMPPPGVGPHDFRDPFPVWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKD

SVGMLECVDLYPVATTGNQIGNGLEMKGGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGL

RYDYGKFYASKTFYDQEKKRRILWGYVGEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTS

GKEFNGVVVEPGSTYHLDVGTATQLDIEAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKM

TEKTATYFYVSRNVDGGLQTHFCQDELRSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVAT

SRVYPTEAIYDSTRVFLFNNATSATVTAKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 47; secretion signal is underlined)

DDPPSDSEDYPWTNEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDYSISWGHAVSKDMIH

WLHLPVAMVPDHWYDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLIEWKKSNGNPILMPP

PGVGPHDFRDPFPVWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKDSVGMLECVDLYPVA

TTGNQIGNGLEMKGGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGLRYDYGKFYASKTFY

DQEKKRRILWGYVGEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTSGKEFNGVVVEPGST

YHLDVGTATQLDIEAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKMTEKTATYFYVSRNV

DGGLQTHFCQDELRSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVATSRVYPTEAIYDSTR

VFLFNNATSATVTAKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 48)

MASSTKDVEAPPTLDAPLLGSAAPRSRLRVAAVSLSVMAFLLVAIAAAVLYYNPGGVASNLMRLRENDYPWTNDM

LRWQRTGFHFQPEKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDYTISWGHAVSKDLLHWNYLPMALRPDHWYDR

KGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLLEWKKSHVNPILVPPPGIEDHDFRDPFPVWY

NESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQPVGMLECVDLFPVATTDSRANQALDMTTMR

PGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGVGLRYDWGKFYASKTFYDQEKHRRVLWGYVGE

VDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRTINKNFNSIPLYPGSTYQLDVGEATQLDIVA

EFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQELSEQTATYFYVSRGIDGNLRTHFCQDELRS

SKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVATSRVYPTEAIYSSARVFLFNNATDAIVTAK

TVNVWHMNSTYNHVFPGLVAP (SEQ ID NO: 20; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEANLMRLRENDYPWTNDMLRWQRTGFHFQPEKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDY

TISWGHAVSKDLLHWNYLPMALRPDHWYDRKGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLL

EWKKSHVNPILVPPPGIEDHDFRDPFPVWYNESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQ

PVGMLECVDLFPVATTDSRANQALDMTTMRPGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGVG

LRYDWGKFYASKTFYDQEKHRRVLWGYVGEVDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRT

INKNFNSIPLYPGSTYQLDVGEATQLDIVAEFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQE

LSEQTATYFYVSRGIDGNLRTHFCQDELRSSKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVA

TSRVYPTEAIYSSARVFLFNNATDAIVTAKTVNVWHMNSTYNHVFPGLVAP (SEQ ID NO: 49)

NLMRLRENDYPWTNDMLRWQRTGFHFQPEKNFQADPNAAMFYKGWYHFFYQYNPTGVAWDYTISWGHAVSKDLLH

WNYLPMALRPDHWYDRKGVWSGYSTLLPDGRIVVLYTGGTKELVQVQNLAVPVNLSDPLLLEWKKSHVNPILVPP

PGIEDHDFRDPFPVWYNESDSRWHVVIGSKDPEHYGIVLIYTTKDFVNFTLLPNILHSTKQPVGMLECVDLFPVA

TTDSRANQALDMTTMRPGPGLKYVLKASMDDERHDYYALGSFDLDSFTFTPDDETIDVGVGLRYDWGKFYASKTF

YDQEKHRRVLWGYVGEVDSKRDDALKGWASLQNIPRTILFDTKTKSNLILWPVEEVESLRTINKNFNSIPLYPGS

TYQLDVGEATQLDIVAEFEVDEKAIEATAEADVTYNCSTSGGAANRGVLGPFGLLVLANQELSEQTATYFYVSRG

IDGNLRTHFCQDELRSSKAGAITKRVVGSTVPVLHGETWALRILVDHSIVESFAQRGRAVATSRVYPTEAIYSSA

RVFLFNNATDAIVTAKTVNVWHMNSTYNHVFPGLVAP (SEQ ID NO: 50)

MESRDIESSPALNAPLLQTSPPIKSSKLKVALLATSTSVLLLIAAFFAVKYSVFDSGSGLLKDDPPSDSEDYPWT

NEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDYSISWGHAVSKDMIHWLHLPVAMVPDHW

YDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLIEWKKSNGNPILMPPPGVGPHDFRDPFP

VWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKDSVGMLECVDLYPVATTGNQIGNGLEMK

GGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGLRYDYGKFYASKTFYDQEKKRRILWGYV

GEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTSGKEFNGVVVEPGSTYHLDVGTATQLDI

EAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKMTEKTATYFYVSRNADGGLQTHFCQDEL

RSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVATSRVYPTEAIYDSTRVFLFNNATSATVT

AKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 21; secretion signal is underlined)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAA

KEEGVSLEKREAEADDPPSDSEDYPWTNEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDY

SISWGHAVSKDMIHWLHLPVAMVPDHWYDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLI

EWKKSNGNPILMPPPGVGPHDFRDPFPVWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKD

SVGMLECVDLYPVATTGNQIGNGLEMKGGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGL

RYDYGKFYASKTFYDQEKKRRILWGYVGEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTS

GKEFNGVVVEPGSTYHLDVGTATQLDIEAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKM

TEKTATYFYVSRNADGGLQTHFCQDELRSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVAT

SRVYPTEAIYDSTRVFLFNNATSATVTAKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 51; secretion signal is underlined)

DDPPSDSEDYPWTNEMLKWQRTGYHFQPPNHFMADPNAAMYYKGWYHFFYQYNPNGSAWDYSISWGHAVSKDMIH

WLHLPVAMVPDHWYDSKGVWSGYATTLPDGRIIVLYTGGTDQLVQVQNLAEPADPSDPLLIEWKKSNGNPILMPP

PGVGPHDFRDPFPVWYNESDSTWHMLIGSKDDNHYGTVLIYTTKDFETYTLLPDILHKTKDSVGMLECVDLYPVA

TTGNQIGNGLEMKGGSGKGIKHVLKASMDDERHDYYAIGTFDLESFSWVPDDDTIDVGVGLRYDYGKFYASKTFY

DQEKKRRILWGYVGEVDSKADDILKGWASVQNIARTILFDAKTRSNLLVWPVEELDALRTSGKEFNGVVVEPGST

YHLDVGTATQLDIEAEFEINKEAVDAVVEADVTYNCSTSDGAAHRGLLGPFGLLVLANEKMTEKTATYFYVSRNA

DGGLQTHFCQDELRSSKANDITKRVVGHTVPVLHGETFSLRILVDHSIVESFAQKGRAVATSRVYPTEAIYDSTR

VFLFNNATSATVTAKSVKIWHMNSTHNHPFPGFPAP (SEQ ID NO: 52)

GARVGLGGIYDDADAFAWNNSMLQWQRAGFHFQTEKNFMSDPNGPVYYRGYYHLFYQYNMKGVVWDDGIVWGHVV

SRDLVHWRHLPIAMVPDHWYDSMGVLSGSITVLQNGSLVMIYTGVFSKTTDRSGMMEVQCLAVPADPNDPLLRSW

TKHPANPVLVHPPGIKDMDFRDPTTAWFDESDSTYRTVIGTKDDHHGSHAGFAMVYKTKDFLSFQRIPGILHSVE

HTGMWECMDFYPVGGGDNSSSEVLYVIKASMDDERHDYYALGMYDAAANTWTPLDQELDLGIGLRYDWGKLYAST

TFYDPAKRRRVMLGYVGETDSRRSDEAKGWASIQSIPRTVALDEKTRTNLLLWPVEEIETLRLNATEFNDINIDT

GSVFHLPIRQGNQLDIEASFRLDASAVAAINEADVGYNCSSSGGAATRGALGPFGLLVLAAEGIGEQTAVYFYVS

RGLDGGLRTSFCNDELRSSWARDVTKRVVGSTVPVLNGETLSMRVLVDHSIVQSFAMGGRVTATSRVYPTEAIYA

AAGVYLFNNATNASVTAERIIVHEMDSIDNNQIFLIDDL (SEQ ID NO: 63)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgtacccggtaaattagaatcgaatgccgatgtc

gagtggcaacgttctgcataccattttcagccagacaagaacttcatatccgatcctgacggcccaatgtatcac

atgggatggtaccacctattctaccaatataacccggaatcagctatttgggggaatatcacttggggtcatagt

gtgtctagggacatgattaactggtttcacttgccattcgctatggttccagatcattggtacgacatcgaaggt

gttatgaccggtagcgctacggttcttcctaacggtcaaatcattatgttgtatactggtaatgcgtacgatttg

tctcaattgcaatgcttagcttatgccgtcaactcctcagatccactactcttggaatggaagaagtacgaaggt

aatccaatattgttcccaccacccggtgtcggttacaaagactttagagatccttccaccttatggatgggccca

gacggcgaatggagaatggttatgggtagtaagcacaacgagacaatcggatgtgctttggtctatcgaactacc

aatttcactcactttgaacttaacgaagaagttttacatgctgtaccacacacaggaatgtgggaatgtgtggat

ctctacccggtcagcacgacccatactaacgggttggaaatgaaggacaatggtccaaacgttaaatatatttta

aagcaatctggtgatgaggatagacacgactggtacgccattggtacattcgatccagaaaaggacaaatggtac

cctgatgacccagagaatgacgttggtatcggtttgagatacgactatgggaagttctatgccagtaagactttt

tacgatcaacataaaaagcggagagtattgtggggttacgttggtgaaactgatccaccaaagtcggatctattg

aaaggttgggctaacattctcaacatccctagatcagtcgttttggatacccagacagagactaatttgattcaa

tggccaatcgaagaagttgaaaaacttagatccaagaagtacgacgaatttaaggacgtcgaactgcgtcctggt

tctttgattccattggaaatcggtaccgctacccaattggatatatctgcaactttcgaaattgatgaaaagaaa

ctggagtctactttagaagctgacgttttattcaactgtacaacttcagaaggttccgtcggtagaggtgttcta

ggccctttcggtatcgttgtcttggctgatgctaacagatccgaacaattgccagtttacttctacattgcaaag

gacaccgatggtacttctcgcacctatttctgtgctgacgaatctcgttcttcgaaggataaggatgtgggtaag

tgggtttacggatcttccgtaccagtcctggagggtgaaaactataatatgagattgctcgtcgatcattcgatt

gtagaaggttttgcccaagggggtagaaccgttgtcacctctcgcgtttatccaacgatggcaatctacggtgcc

gctaagatatttttgttcaacaatgctaccggtatttcagtgaaggctagtttaaaaatctggaagatggctgag

gcccaattggaccccttcccactttccggttggagcagttaa (SEQ ID NO: 22)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgacgaagaggctgccggtggatttccctggtca

aacgaaatgttacaatggcagagatccggttaccacttccaaacagcaaaaaattatatgtctgatcctaacggc

ctaatgtactataggggttggtaccatatgttcttccaatacaacccagtcgggactgattgggacgacggtatg

gaatggggtcacgctgtgtcgcgtaatttggtacaatggagaacgttgccaatagctatggttgccgatcaatgg

tatgatattctgggtgttctttctggttctatgaccgtcttgccaaacggtactgttatcatgatctacaccggt

gctactaatgcgagcgctgtcgaagttcaatgtattgcaaccccagccgatccgaacgaccctttgttaagaaga

tggactaagcatccagctaaccctgtgatctggagtccaccaggtgtagggacaaaggattttcgagactccatg

accgcttggtacgacgagtcagatgacacttggagaaccttgttgggctccaaggacgataacaatggtcaccat

gatggtattgctatgatgtataaaactaaggatttcctaaattacgaacttatcccaggcatactgcaccgtgtc

gaaaggacaggtgaatgggaatgcatcgacttttacccggttggtcatagaacgtctgataactctagcgaaatg

ttgcacgttttgaaagcctctatggatgacgaacggcacgattattactccttaggtacttacgatagtgctgcc

aacagatggaccccaattgaccccgaactagacttgggtattggattgagatatgattggggtaagttttacgct

agcacttcattctacgatccagcaaagaaacgtcgagtcttaatgggatatgttggtgaggttgactccaagaga

gctgacgtcgtgaagggttgggcttctatccaatctgttccaagaacaattgcattggacgaaaagactagaacc

aacctgctgttatggcccgttgaggaaatcgaaacattgagactaaatgctacccaactctcggatgtcaccttg

aatactggttctgtcattcatattcctttgagacaaggtacccagttggatatagaagctacattccaccttgat

gcctccgctgttgccgctttaaacgaagcggacgtcggttacaactgttcctcttctggtggtgctgtgaataga

ggagctttgggtccattcggtttgttagttctcgcggctggagacagacgtggtgagcaaactgctgtttacttt

tatgttagtagaggtttggacggcggtttgcatacctccttctgtcaagatgaactcagaagttcccgcgcgaag

gatgttactaaaagagtcatcggttcgactgtcccggttcttgacggcgaagcattctctatgagggttttagtt

gatcattcgattgtccaaggttttgcaatgggtggtagaactacgatgacatctcgggtctatccaatggaagct

taccaggaggccaaggtttacctctttaacaacgctaccggagcatccgttaccgctgaaagacttgtagttcac

gatatggactcagcccataatcaattgtctaacatggacgactactcatatgtacagtaa (SEQ ID NO: 53)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgacgaagaggctgccggtggatttccctggtca

aacgaaatgttacaatggcagagatccggttaccacttccaaacagcaaaaaattatatgtctgatcctaacggc

ctaatgtactataggggttggaaccatatgttcttccaatacaatccagtcgggactgattgggacgacggtatg

gaatggggtcacgctgtgtcgcgtaacttggtacaatggagaacgttgccaatagctatggttgccgatcaatgg

tacgatattctgggtgttctttctggttctatgaccgtcttgccaaatggtactgttatcatgatctataccggt

gctactaacgcgagcgctgtcgaagttcaatgtattgcaaccccagccgatccgacggaccctttgttaagaaga

tggactaagcatccagctaaccctgtgatctggagtccaccaggtgtagggacaaaggattttcgagatccaatg

accgcttggtacgacgaatcagacgatacttggagaacgctattgggctctaaggatgacaataatggtcaccac

gacggtattgctatgatgtacaaaactaaggatttcttgaactacgagctgattcctggtatcctccatagagtt

gaaagaacaggagaatgggaatgcatagacttttatccggtcggtcgtagaacctctgataactcgtccgaaatg

ttgcatgttttaaaggcttccatggatgacgagagacacgactactactctctaggtacttatgatagtgccgcc

aataggtggactccaattgacccagaattggatttgggtattggtttgagatatgactgggggaaattctacgct

tccaccagcttctatgatcccgcaaagaagagaagagttttgatgggttacgtcggtgaagtggactctaaacgc

gctgacgttgttaagggttgggcctctatccaaagtgtcccacgcaccattgctctggacgaaaaaactcgtaca

aaccttttattgtggccagtagaagaaatcgaaaccttaagattgaacgctactgagttgtccgacgttacttta

aacactggttccgtcatccacattccattgagacagggaacccaattggatattgaagcaacctttcatctcgat

gcgagtgctgttgcagctttcaatgaagctgatgtcggttacaattgttcatcttcgggtggtgctgttaataga

ggtgctctagggcctttcggcctcttagtcttggctgccggtgatagaagaggtgaacaaaccgctgtttacttt

tacgtatctcgtggtttggacggcggtctacacacctctttttgtcaggatgagttaagatcctcaagggctaag

gacgttactaagagagtcataggatcaactgtgcccgttttggatggtgaagccttttctatgcgtgtacttgtt

gatcattccatagtccaaggtttcgcaatgggtggtagaacaactatgacgagcagagtttatccaatggaagcg

taccaagaagctaaggtttatcttttcaacaacgcaacaggtgcctctgttacagccgagagattggtcgtacac

gaaatggactccgcccacaaccaattgtcgaacatggacgaccactcgtatgttcaataa (SEQ ID NO: 54)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctagtggcccttattctgcttcgggtggttttcca

tggtctaatgccatgttgcagtggcaacgtacaggataccacttccaacccgaaaaaaactaccaaaacgaccca

aacggtccagtctactataagggttggtatcatttcttttaccaacataatccaggtggtaccgggtggggtaac

atctcatggggtcacgcagtttccagagatatggtacactggaggcatttaccactagctatggttcctgagcat

tggtacgatatagaaggtgttttgactggaagcattactgtccttccagacggtagagtcattttgttatatacc

ggcaatactgaaacgttcgctcaagtgacctgtttggcggaggctgccgacccttccgatccactgttgagagaa

tgggctaagcacccggccaacccagtagtttacccgccaccaggtatcggtatgaaagactacagagatccaact

acagcttggttcgataactcagacaatacctggagaataatcattggttctaagaatgatactgatcactctggt

atcgtttttacttacaagaccaaggacttcgtcagctacgaactgattcctggatacctatatagaggtccagcc

gggacgggtatgtacgaatgcattgatttgttcgctgttggtggtgggcgtgctgcatcagatatgtataactct

accgctgaagatgtcttatacgttttgaaagaatcctccgacgacgacagacgggattactatgccttagggcga

tttgacgctgccgctaatacttggacacccatagatacagaaagagagttgggtgtcgcactcagatatgattac

ggtagatacgatacttctaagtctttctacgacccagttaagcaaaggagaattgtctggggttacgttgtcgaa

accgacagttggtccgctgacgctgcaaaaggttgggctaacctgcaatctatccctagaactgttgaattggat

gaaaagactcgaacaaaccttgtacagtggccagtgggtgagttgaacaccctacgtatcaataccactgatttg

agtgacattaccgttggtgctggctcggtcgattctttacccttgcaccaaacttcccaactagacatcgaagcg

tcatttagaattaatgcctctactatagaagccttgaacgaagttgatgtaggttataactgtactatgacgtct

ggtgctgctactagaggtgctttgggtccattcggaattttagtcttggctaacgtggccttgacagaacagacc

gctgtttatttttatgtttccaagggtttagacggtggtttacgaacccacttctgtcatgacgaattgaggtct

acacacgctaccgacgtcgccaaggaggttgttgggtctactgttccagttctcgatggtgaagattttagcgtc

agagttttggtcgatcactcaatcgtacaatctttcgtcatgggtggcagaatgacagcaacttccagagcttac

ccgactgaagcaatctatgctgccgctggcgtttacctcttcaacaatgctacaggtgcttccattaccgcagaa

aaattggtggtacatgacatggattcctcctacaacagaatctttactgacgaggatttattggtgcttgactaa

(SEQ ID NO: 55)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgcaaatgcttttccttggtcgaacgctatgttg

cagtggcaacgtactggcttccatttccaaccagacaaatactatcaaaacgatccaaacggtcccgtctactac

ggaggttggtatcactttttctaccaatataatccgtctggtagtgtttgggagccacaaattgtatggggtcac

gccgtttccaaggacctgatccattggcggcacttaccaccagctttggtcccagatcaatggtacgacataaag

ggtgttctaaccgggtcaattacggtccttcctgatggtaaggtgatcttgttatatactggtaatacagaaacc

ttcgctcaagttacttgcttggccgaacccgcagatccaagcgatccattgctcagagaatgggtaaagcatcct

gctaacccagttgtctttccaccacccggtattggtatgaaagacttcagagatccaaccactgcttggtacgac

gaatctgacggcacatggagaaccatcattggatctaaaaacgactccgaccactctggtatcgttttttcctac

aagactaaggatttcattagttatgagttgatgccgggttacatgtacagaggcccaaaggggaccggtgaatac

gaatgtatagatttatacgcggtgggtggtggtaggaaggcttctgatatgtataactccactgcggaagatgtc

ctatatgttttaaaagaatcatctgacgatgatagacatgactggtactcattgggtagatttgacgccgctgct

aataagtggacacctatagatactgagcttgaacttggcgttggtttgcgatatgactggggtaagtactacgcc

agcaagtctttctacgacccagttaaaaaaagacgtgtcgtgtgggcttatgtcggtgaaaccgattccgaaaga

gccgacatcaccaagggttgggcaaatttgcagtctatcccacgcactgttgaattggacgaaaaaactagaacg

aacttaattcaatggccggttgaggaactaaatacactgcgtattaacactacagatttgtcgggaatcaccgta

ggtgctggtagtgtcgctttcttgccattgcaccaaactgcccagctcgacattgaagctacttttagaattgat

gcttctgcgatagaagctctaaacgaagctgatgtttcctacaattgtaccacatcgcgaggagctgctaccaga

ggtgccttaggtccattcggtttgttggtattagccaaccatgccttgaccgaacaaactggtgtttacttttac

gtgtctaagggtttggacggtggtttaagaactcacttctgtcacgatgaactaagatcctctcatgcttcagat

gtcgttaagagagtcgtgggtagtacggttcctgttttggatggggaggactttagcgttcgtgtcttggttgac

cactctattgtccaaagtttcgccatgggtggtaggttgacagctacctccagagcttatccaactgaagcaatc

tacgctgcggcaggcgtatacatgttcaacaacgctacaggtacttccgttacggctgaaaagcttgttgtccac

gatatggattcttcctacaaccacatctataccgacggtgacctggtggtagttgattaa (SEQ ID NO: 56)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgacgatcctccatctgatagtgaagattaccca

tggaccaatgagatgcttaaatggcaaaggacgggttatcacttccagcccccaaaccattttatggcagaccca

aacgccgctatgtactacaaggggtggtatcacttcttttaccaatataaccctaatggttcagcttgggactac

tccatctcgtggggtcatgctgtatctaaggacatgattcactggctgcatttaccagtcgccatggttccagat

cattggtacgatagcaaaggagtttggtccggctacgctactactttgccagatggtagaataattgtcttgtat

accggtggtacagaccaattggttcaagtgcaaaatttagccgaaccagcggacccttctgatccactattgatc

gaatggaagaagtcaaacggaaacccaattttgatgcctccgccgggtgtaggtccacacgatttcagagatcca

ttcccagtttggtacaacgaatctgactccacatggcacatgttgatcggttctaaagatgacaatcactacggt

accgttctaatttatactactaaggattttgagacatacactttattgccagacatcctacataagaccaaggac

tcggttggtatgttggaatgtgtcgatctttatccagtggctactaccgggaatcaaattggtaacggtttagaa

atgaaaggtggttccggcaagggtatcaagcacgtcctgaaggcttctatggacgatgaacgtcacgattattac

gccataggtacgttcgacttggaatcctttagttgggttccggacgacgataccatagatgtcggcgtcggcttg

cgctatgactacggtaagttctacgcttcaaaaactttctatgatcaggaaaagaagagaagaattttgtgggga

tacgttggtgaagtagactctaaggctgacgacatcttaaaaggttgggcgagcgttcaaaatattgcaagaact

atcctatttgatgcaaaaactagaagtaacttgctcgtctggcccgtcgaggaattggacgctttgcgaacctct

ggtaaggaatttaacggtgtggttgttgaacctggttctacttaccatttagacgtaggtaccgccacccaattg

gatattgaagctgaatttgagatcaataaggaagctgttgacgctgttgtcgaagccgatgttacatacaactgc

tccacatctgatggtgctgctcacagaggtttgttgggaccattcggtcttttggttttagctaatgaaaagatg

acagaaaaaaccgccacttatttctacgtcagtcgtaacgttgatgggggtctacaaactcatttctgtcaagac

gagcttagaagctctaaagctaacgatattaccaaacgtgtcgttggccacactgttccagttctgcatggtgaa

accttctccttgagaattttagtagaccactcgatcgttgaatcgtttgcgcagaagggtagagcagtcgctacg

tctagggtgtatccaactgaagctatctacgattctacaagagttttcctcttcaacaacgccacttcagctacg

gtcactgccaagtccgtaaagatatggcatatgaacagtacccataaccacccttttccaggtttccccgcacca

taa (SEQ ID NO: 57)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctaacttgatgcgtttaagagagaatgattatccc

tggactaacgacatgctaagatggcaacgcacgggatttcacttccagcctgaaaaaaacttccaagccgaccca

aatgcagctatgttttacaagggctggtaccatttcttttatcaatacaacccgaccggtgtggcttgggattac

acaatctcctggggtcacgctgtcagtaaggatttgctgcattggaattatcttccaatggccttgaggcctgac

cactggtacgatagaaaaggtgtttggagcggttactctactttattgccagacggtagaattgttgtcttgtac

accggtggaactaaggaattagttcaagtccaaaacttggctgtcccagtaaacctttctgacccattgctattg

gaatggaagaagtcacacgttaacccaatactcgttccacctccggggatcgaggatcatgatttccgagatcca

ttcccagtgtggtataatgaatctgactcgcggtggcacgttgtaattggttccaaagatccagaacactatggt

attgtcttgatctacactaccaaggacttcgttaactttacgttattaccaaacatattgcattccaccaagcag

ccggttggtatgctggaatgtgtagacttgttcccagttgctacaactgattctcgtgcaaatcaagctttggat

atgactaccatgaggcccggtcctggcctcaaatatgtgttaaaggcgagtatggatgacgaaagacacgattac

tacgccctaggtagctttgacttggactcgttcacttttacaccagatgatgaaaccattgacgtcggtgtcggt

ttgagatacgactggggtaagttctatgcttcaaaaactttctatgaccaagaaaagcatagaagagttttatgg

ggttacgtgggggaagttgattctaagagagatgacgcgttaaaaggctgggcttccttgcaaaacatcccaaga

acaattttgttcgataccaaaactaagtctaatctaatcttgtggccagttgaagaggtcgaatcattgagaact

attaacaagaattttaactctataccactttacccaggttccacttaccaattggatgttggggaagccacccaa

ctggatattgtcgctgaatttgaagtcgatgagaaggctattgaagcaactgctgaagctgacgttacatataac

tgctctaccagcggtggtgccgctaacagaggtgttttgggtcctttcggtctattggttctagccaatcaagaa

ctttccgaacagactgccacttacttctatgtatcgcgtggtatcgacggcaacctgagaacccacttttgtcaa

gacgaattgagatcctccaaagccggtgctatcaccaagagggtcgtaggttctacagttcctgttttgcatggt

gaaacgtgggctttacgtatcctagttgaccactctattgtcgagtcttttgcacaacggggacgcgccgtcgct

accagtagagtatacccaactgaggctatatactcttcggctagagtctttctcttcaataacgcaaccgatgcc

attgttacagctaaaacggtcaacgtttggcatatgaatagcacttacaaccacgtctttcctggtttggttgct

ccataa (SEQ ID NO: 58)

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactaca

acagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgtt

gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct

aaagaagaaggggtatctctcgagaaaagagaggctgaagctgacgatcctccatctgatagtgaagattaccca

tggaccaatgagatgcttaaatggcaaaggacgggttatcacttccagcccccaaaccattttatggcagaccca

aacgccgctatgtactacaaggggtggtatcacttcttttaccaatataaccctaatggttcagcttgggactac

tccatctcgtggggtcatgctgtatctaaggacatgattcactggctgcatttaccagtcgccatggttccagat

cattggtacgatagcaaaggagtttggtccggctacgctactactttgccagatggtagaataattgtcttgtat

accggtggtacagaccaattggttcaagtgcaaaatttagccgaaccagcggacccttctgatccactattgatc

gaatggaagaagtcaaacggaaacccaattttgatgcctccgccgggtgtaggtccacacgatttcagagatcca

ttcccagtttggtacaacgaatctgactccacatggcacatgttgatcggttctaaagatgacaatcactacggt

accgttctaatttatactactaaggattttgagacatacactttattgccagacatcctacataagaccaaggac

tcggttggtatgttggaatgtgtcgatctttatccagtggctactaccgggaatcaaattggtaacggtttagaa

atgaaaggtggttccggcaagggtatcaagcacgtcctgaaggcttctatggacgatgaacgtcacgattattac

gccataggtacgttcgacttggaatcctttagttgggttccggacgacgataccatagatgtcggcgtcggcttg

cgctatgactacggtaagttctacgcttcaaaaactttctatgatcaggaaaagaagagaagaattttgtgggga

tacgttggtgaagtagactctaaggctgacgacatcttaaaaggttgggcgagcgttcaaaatattgcaagaact

atcctatttgatgcaaaaactagaagtaacttgctcgtctggcccgtcgaggaattggacgctttgcgaacctct

ggtaaggaatttaacggtgtggttgttgaacctggttctacttaccatttagacgtaggtaccgccacccaattg

gatattgaagctgaatttgagatcaataaggaagctgttgacgctgttgtcgaagccgatgttacatacaactgc

tccacatctgatggtgctgctcacagaggtttgttgggaccattcggtcttttggttttagctaatgaaaagatg

acagaaaaaaccgccacttatttctacgtcagtcgtaacgctgatgggggtctacaaactcatttctgtcaagac

gagcttagaagctctaaagctaacgatattaccaaacgtgtcgttggccacactgttccagttctgcatggtgaa

accttctccttgagaattttagtcgatcactcaattgtcgagtccttcgcgcaaaagggtagggctgttgcaacc

tctcgggtgtatccaactgaagccatctacgattctacgagagtttttctcttcaacaacgctacttcggcaacg

gtaactgctaagtccgtaaagatatggcatatgaacagtacccataaccacccttttccaggtttccccgcgcca

taa (SEQ ID NO: 59)

6-SFT from Phleum pratense:

MAPPQAIANGAPAPLPYAYARLPSSGDEKQDQSKSGGARYCRACVAGVAALLIVAGALAGARVGLGGIYDDADAF

AWNNSMLQWQRAGFHFQTEKNFMSDPNGPVYYRGYYHLFYQYNMKGVVWDDGIVWGHVVSRDLVHWRHLPIAMVP

DHWYDSMGVLSGSITVLQNGSLVMIYTGVFSKTTDRSGMMEVQCLAVPADPNDPLLRSWTKHPANPVLVHPPGIK

DMDFRDPTTAWFDESDSTYRTVIGTKDDHHGSHAGFAMVYKTKDFLSFQRIPGILHSVEHTGMWECMDFYPVGGG

DNSSSEVLYVIKASMDDERHDYYALGMYDAAANTWTPLDQELDLGIGLRYDWGKLYASTTFYDPAKRRRVMLGYV

GETDSRRSDEAKGWASIQSIPRTVALDEKTRTNLLLWPVEEIETLRLNATEFNDINIDTGSVFHLPIRQGNQLDI

EASFRLDASAVAAINEADVGYNCSSSGGAATRGALGPFGLLVLAAEGIGEQTAVYFYVSRGLDGGLRTSFCNDEL

RSSWARDVTKRVVGSTVPVLNGETLSMRVLVDHSIVQSFAMGGRVTATSRVYPTEAIYAAAGVYLFNNATNASVT

AERIIVHEMDSIDNNQIFLIDDL (SEQ ID NO: 23)

EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described in this application. Such equivalents are intended to be encompassed by the following claims.

All references, including patent documents, disclosed in this application are incorporated by reference in their entirety, particularly for the disclosure referenced in this application.

It should be appreciated that sequences disclosed in this application may or may not contain secretion signals. The sequences disclosed in this application encompass versions with or without secretion signals. It should also be understood that protein sequences disclosed in this application may be depicted with or without a start codon (M). The sequences disclosed in this application encompass versions with or without start codons. Accordingly, in some instances amino acid numbering may correspond to protein sequences containing a start codon, while in other instances, amino acid numbering may correspond to protein sequences that do not contain a start codon. It should also be understood that sequences disclosed in this application may be depicted with or without a stop codon. The sequences disclosed in this application encompass versions with or without stop codons. Aspects of the disclosure encompass host cells comprising any of the sequences described in this application and fragments thereof.
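As an illustration only, the relationship between these sequence versions can be sketched in Python. The helper names below (`strip_start_met`, `strip_stop_codon`, `renumber`) are hypothetical and are not part of this disclosure. The sketch assumes 1-based residue numbering and assumes that the trimmed version differs from the full version only by a removed N-terminal prefix, such as a start methionine or a secretion signal.

```python
def strip_start_met(protein: str) -> str:
    """Return the protein sequence without a leading start methionine (M), if present."""
    return protein[1:] if protein.startswith("M") else protein


def strip_stop_codon(cds: str) -> str:
    """Return a coding sequence without a trailing stop codon (taa, tag, tga), if present."""
    stops = ("taa", "tag", "tga")
    return cds[:-3] if cds.lower().endswith(stops) else cds


def renumber(position: int, removed_prefix_len: int) -> int:
    """Map a 1-based residue position in the full sequence to the trimmed sequence.

    removed_prefix_len is the number of N-terminal residues removed
    (1 for a start methionine, or the length of a secretion signal).
    """
    new_position = position - removed_prefix_len
    if new_position < 1:
        raise ValueError("position lies within the removed prefix")
    return new_position


# The first residues of SEQ ID NO: 18 (depicted with a start M) and
# SEQ ID NO: 46 (depicted without it) differ only by that methionine.
with_met = "MANAFPWSNAMLQWQ"
without_met = strip_start_met(with_met)

# Residue 10 of the M-containing version is residue 9 of the trimmed version.
shifted = renumber(10, 1)
```

Under these assumptions, a residue position quoted against one depiction of a sequence can be converted to the other by subtracting the length of the removed prefix, and a coding sequence shown with a stop codon can be compared with one shown without it after trimming the final three nucleotides.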
