Alpha (1,2) Fucosyltransferase Syngenes For Use in the Production of Fucosylated Oligosaccharides

Abstract
The invention provides compositions and methods for engineering E. coli or other host production bacterial strains to produce fucosylated oligosaccharides, and the use thereof in the prevention or treatment of infection.
Description
FIELD OF THE INVENTION

The invention provides compositions and methods for producing purified oligosaccharides, in particular certain fucosylated oligosaccharides that are typically found in human milk.


BACKGROUND OF THE INVENTION

Human milk contains a diverse and abundant set of neutral and acidic oligosaccharides. More than 130 different complex oligosaccharides have been identified in human milk, and their structural diversity and abundance are unique to humans. Although these molecules may not be utilized directly by infants for nutrition, they nevertheless serve critical roles in the establishment of a healthy gut microbiome, in the prevention of disease, and in immune function. Prior to the invention described herein, the ability to produce human milk oligosaccharides (HMOS) inexpensively was problematic. For example, their production through chemical synthesis was limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost. As such, there is a pressing need for new strategies to inexpensively manufacture large quantities of HMOS.


SUMMARY OF THE INVENTION

The invention features an efficient and economical method for producing fucosylated oligosaccharides. Such production of a fucosylated oligosaccharide is accomplished using an isolated nucleic acid comprising a sequence encoding a lactose-utilizing α (1,2) fucosyltransferase gene product (e.g., polypeptide or protein), which is operably linked to one or more heterologous control sequences that direct the production of the recombinant fucosyltransferase gene product in a host production bacterium such as Escherichia coli (E. coli).


The present disclosure provides novel α (1,2) fucosyltransferases (also referred to herein as α(1,2) FTs) that utilize lactose and catalyze the transfer of an L-fucose sugar from a GDP-fucose donor substrate to an acceptor substrate in an alpha-1,2-linkage. In a preferred embodiment, the acceptor substrate is an oligosaccharide. The α(1,2) fucosyltransferases identified and described herein are useful for expression in a host bacterium for the production of human milk oligosaccharides (HMOS), such as fucosylated oligosaccharides. Exemplary fucosylated oligosaccharides produced by the methods described herein include 2′-fucosyllactose (2′FL), lactodifucotetraose (LDFT), lacto-N-fucopentaose I (LNF I), or lacto-N-difucohexaose I (LDFH I). The “α(1,2) fucosyltransferases” disclosed herein encompass the amino acid sequences of the α(1,2) fucosyltransferases and the nucleic acid sequences that encode the α(1,2) fucosyltransferases, as well as variants and fragments thereof that exhibit α(1,2) fucosyltransferase activity. Also within the invention is a nucleic acid construct comprising an isolated nucleic acid encoding a lactose-accepting α (1,2) fucosyltransferase enzyme, said nucleic acid being optionally operably linked to one or more heterologous control sequences that direct the production of the enzyme in a host bacterial production strain.


The amino acid sequence of the lactose-accepting α(1,2) fucosyltransferases described herein has at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to Helicobacter pylori 26695 alpha-(1,2) fucosyltransferase (futC or SEQ ID NO: 1). Preferably, the amino acid sequence of the lactose-accepting α(1,2) fucosyltransferases described herein is at least 22% identical to H. pylori FutC, or SEQ ID NO: 1.
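
By way of illustration only, an identity threshold of this kind could be checked along the lines of the following minimal Python sketch. It assumes the two sequences have already been aligned by some external tool; the function names, the gap convention (every alignment column counts toward the denominator), and the toy sequences are assumptions made for this example. The toy strings are not SEQ ID NO: 1 or any other sequence disclosed herein, and the specification does not prescribe a particular alignment program or identity definition.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumptions (illustrative only): both strings have equal length,
    '-' marks a gap, and the denominator is the number of alignment
    columns, excluding columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold_percent: float = 22.0) -> bool:
    """True if the alignment reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical toy alignment (not a disclosed sequence).
TOY_REFERENCE = "MKT-AYIAKQ"
TOY_CANDIDATE = "MRTLAY-GKQ"

if __name__ == "__main__":
    print(percent_identity(TOY_REFERENCE, TOY_CANDIDATE))
    print(meets_threshold(TOY_REFERENCE, TOY_CANDIDATE, 22.0))
```

Other identity conventions (for example, dividing by the length of the shorter sequence) would give different values for the same pair, so the convention should be stated alongside any threshold.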


In another aspect, the amino acid sequence of the lactose-accepting α(1,2) fucosyltransferases described herein has at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to Bacteroides vulgatus alpha-(1,2) fucosyltransferase (FutN or SEQ ID NO: 3). Preferably, the amino acid sequence of the lactose-accepting α(1,2) fucosyltransferases described herein is at least 25% identical to B. vulgatus FutN, or SEQ ID NO: 3.


Alternatively, the exogenous α (1,2) fucosyltransferase preferably comprises at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to any one of the novel α (1,2) fucosyltransferases disclosed herein, for example, to the amino acid sequences in Table 1.


Exemplary α(1,2) fucosyltransferases include, but are not limited to, Prevotella melaninogenica FutO, Clostridium bolteae FutP, Clostridium bolteae+13 FutP, Lachnospiraceae sp. FutQ, Methanosphaerula palustris FutR, Tannerella sp. FutS, Bacteroides caccae FutU, Butyrivibrio FutV, Prevotella sp. FutW, Parabacteroides johnsonii FutX, Akkermansia muciniphila FutY, Salmonella enterica FutZ, and Bacteroides sp. FutZA. For example, the α(1,2) fucosyltransferases comprise the amino acid sequences comprising any one of the following: Prevotella melaninogenica FutO (SEQ ID NO: 10), Clostridium bolteae FutP (SEQ ID NO: 11), Clostridium bolteae+13 FutP (SEQ ID NO: 292), Lachnospiraceae sp. FutQ (SEQ ID NO: 12), Methanosphaerula palustris FutR (SEQ ID NO: 13), Tannerella sp. FutS (SEQ ID NO: 14), Bacteroides caccae FutU (SEQ ID NO: 15), Butyrivibrio FutV (SEQ ID NO: 16), Prevotella sp. FutW (SEQ ID NO: 17), Parabacteroides johnsonii FutX (SEQ ID NO: 18), Akkermansia muciniphila FutY (SEQ ID NO: 19), Salmonella enterica FutZ (SEQ ID NO: 20), and Bacteroides sp. FutZA (SEQ ID NO: 21), or a functional variant or fragment thereof. Other exemplary α(1,2) fucosyltransferases include any of the enzymes listed in Table 1, or functional variants or fragments thereof.


The present invention features a method for producing a fucosylated oligosaccharide in a bacterium by providing a bacterium that expresses at least one exogenous lactose-utilizing α(1,2) fucosyltransferase. The amino acid sequence of the exogenous lactose-utilizing α(1,2) fucosyltransferase is preferably at least 22% identical to H. pylori FutC or at least 25% identical to B. vulgatus FutN. In one aspect, the bacterium also expresses one or more exogenous lactose-utilizing α(1,3) fucosyltransferase enzymes and/or one or more exogenous lactose-utilizing α(1,4) fucosyltransferase enzymes. The combination of fucosyltransferases expressed in the production bacterium is dependent upon the desired fucosylated oligosaccharide product. The method disclosed herein further includes retrieving the fucosylated oligosaccharide from said bacterium or from a culture supernatant of said bacterium.


Examples of suitable α(1,3) fucosyltransferase enzymes include, but are not limited to, Helicobacter pylori 26695 futA gene (GenBank Accession Number HV532291 (GI:365791177), incorporated herein by reference), H. hepaticus Hh0072, H. pylori 11639 FucT, and H. pylori UA948 FucTa (e.g., GenBank Accession Number AF194963 (GI:28436396), incorporated herein by reference) (Rasko, D. A., Wang, G., Palcic, M. M. & Taylor, D. E. J Biol Chem 275, 4988-4994 (2000)). Examples of suitable α(1,4) fucosyltransferase enzymes include, but are not limited to, H. pylori UA948 FucTa (which has relaxed acceptor specificity and is able to generate both α(1,3)- and α(1,4)-fucosyl linkages). An example of an enzyme possessing only α(1,4) fucosyltransferase activity is given by the FucT III enzyme from Helicobacter pylori strain DMS6709 (e.g., GenBank Accession Number AY450598.1 (GI:40646733), incorporated herein by reference) (S. Rabbani, V. Miksa, B. Wipf, B. Ernst, Glycobiology 15, 1076-83 (2005)).


The invention also features a nucleic acid construct or a vector comprising a nucleic acid encoding at least one α (1,2) fucosyltransferase or variant, or fragment thereof, as described herein. The vector can further include one or more regulatory elements, e.g., a heterologous promoter. By “heterologous” is meant that the control sequence and protein-encoding sequence originate from different bacterial strains. The regulatory elements can be operably linked to a gene encoding a protein, a gene construct encoding a fusion protein gene, or a series of genes linked in an operon in order to express the fusion protein. In yet another aspect, the invention comprises an isolated recombinant cell, e.g., a bacterial cell containing an aforementioned nucleic acid molecule or vector. The nucleic acid is optionally integrated into the genome of the host bacterium. In some embodiments, the nucleic acid construct further comprises one or more α(1,3) fucosyltransferases and/or α(1,4) fucosyltransferases. Alternatively, the α (1,2) fucosyltransferase also exhibits α(1,3) fucosyltransferase and/or α(1,4) fucosyltransferase activity.


The bacterium utilized in the production methods described herein is genetically engineered to increase the efficiency and yield of fucosylated oligosaccharide products. For example, the host production bacterium is characterized as having a reduced level of β-galactosidase activity, a defective colanic acid synthesis pathway, an inactivated ATP-dependent intracellular protease, an inactivated lacA, or a combination thereof. In one embodiment, the bacterium is characterized as having a reduced level of β-galactosidase activity, a defective colanic acid synthesis pathway, an inactivated ATP-dependent intracellular protease, and an inactivated lacA.


As used herein, an “inactivated” or “inactivation of a” gene, encoded gene product (i.e., polypeptide), or pathway refers to reducing or eliminating the expression (i.e., transcription or translation), protein level (i.e., translation, rate of degradation), or enzymatic activity of the gene, gene product, or pathway. In the instance where a pathway is inactivated, preferably one enzyme or polypeptide in the pathway exhibits reduced or negligible activity. For example, the enzyme in the pathway is altered, deleted or mutated such that the product of the pathway is produced at low levels compared to a wild-type bacterium or an intact pathway. Alternatively, the product of the pathway is not produced. Inactivation of a gene is achieved by deletion or mutation of the gene or regulatory elements of the gene such that the gene is no longer transcribed or translated. Inactivation of a polypeptide can be achieved by deletion or mutation of the gene that encodes the gene product or mutation of the polypeptide to disrupt its activity. Inactivating mutations include additions, deletions or substitutions of one or more nucleotides or amino acids of a nucleic acid or amino acid sequence that results in the reduction or elimination of the expression or activity of the gene or polypeptide. In other embodiments, inactivation of a polypeptide is achieved through the addition of exogenous sequences (i.e., tags) to the N or C-terminus of the polypeptide such that the activity of the polypeptide is reduced or eliminated (i.e., by steric hindrance).


A host bacterium suitable for the production systems described herein exhibits an enhanced or increased cytoplasmic or intracellular pool of lactose and/or GDP-fucose. For example, the bacterium is E. coli and endogenous E. coli metabolic pathways and genes are manipulated in ways that result in the generation of increased cytoplasmic concentrations of lactose and/or GDP-fucose, as compared to levels found in wild type E. coli. Preferably, the bacterium accumulates an increased intracellular lactose pool and an increased intracellular GDP-fucose pool. For example, the bacteria contain at least 10%, 20%, 50%, or 2×, 5×, 10× or more of the levels of intracellular lactose and/or intracellular GDP-fucose compared to a corresponding wild type bacterium that lacks the genetic modifications described herein.
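
The comparison between an engineered strain and its wild type counterpart can be expressed as a simple ratio. The sketch below is illustrative only: the function names are hypothetical, the inputs are assumed to be intracellular concentrations measured in the same units under the same conditions, and reading the percentage figures as increases over the wild type level is an assumption of this example rather than a statement of the specification.

```python
def pool_increase(engineered: float, wild_type: float):
    """Return (fold change, percent increase) of an intracellular pool.

    Assumption: both values are concentrations in identical units,
    measured under comparable culture conditions.
    """
    if wild_type <= 0:
        raise ValueError("wild type level must be positive")
    fold = engineered / wild_type
    percent_increase = (fold - 1.0) * 100.0
    return fold, percent_increase


def meets_fold(engineered: float, wild_type: float, min_fold: float) -> bool:
    """True if the engineered pool is at least min_fold times wild type."""
    fold, _ = pool_increase(engineered, wild_type)
    return fold >= min_fold


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    print(pool_increase(3.0, 1.5))
    print(meets_fold(3.0, 1.5, 2.0))
```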


Increased intracellular concentration of lactose in the host bacterium compared to wild-type bacterium is achieved by manipulation of genes and pathways involved in lactose import, export and catabolism. In particular, described herein are methods of increasing intracellular lactose levels in E. coli genetically engineered to produce a human milk oligosaccharide by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI). During construction of this deletion, the lacIq promoter is placed immediately upstream of (contiguous with) the lactose permease gene, lacY, i.e., the sequence of the lacIq promoter is directly upstream and adjacent to the start of the sequence encoding the lacY gene, such that the lacY gene is under transcriptional regulation by the lacIq promoter. The modified strain maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type chromosomal copy of the lacZ (encoding β-galactosidase) gene responsible for lactose catabolism. Thus, an intracellular lactose pool is created when the modified strain is cultured in the presence of exogenous lactose.


Another method for increasing the intracellular concentration of lactose in E. coli involves inactivation of the lacA gene. An inactivating mutation, null mutation, or deletion of lacA prevents the formation of intracellular acetyl-lactose, which not only removes this molecule as a contaminant from subsequent purifications, but also eliminates E. coli's ability to export excess lactose from its cytoplasm (Danchin A. Cells need safety valves. Bioessays 2009, July; 31(7):769-73.), thus greatly facilitating purposeful manipulations of the E. coli intracellular lactose pool.


The invention also provides methods for increasing intracellular levels of GDP-fucose in a bacterium by manipulating the organism's endogenous colanic acid biosynthesis pathway. This increase is achieved through a number of genetic modifications of endogenous E. coli genes involved either directly in colanic acid precursor biosynthesis, or in overall control of the colanic acid synthetic regulon. Particularly preferred is inactivation of the genes or encoded polypeptides that act in the colanic acid synthesis pathway after the production of GDP-fucose (the donor substrate) and before the generation of colanic acid. Exemplary colanic acid synthesis genes include, but are not limited to: a wcaJ gene (e.g., GenBank Accession Number (amino acid) BAA15900 (GI:1736749), incorporated herein by reference), a wcaA gene (e.g., GenBank Accession Number (amino acid) BAA15912.1 (GI:1736762), incorporated herein by reference), a wcaC gene (e.g., GenBank Accession Number (amino acid) BAE76574.1 (GI:85675203), incorporated herein by reference), a wcaE gene (e.g., GenBank Accession Number (amino acid) BAE76572.1 (GI:85675201), incorporated herein by reference), a wcaI gene (e.g., GenBank Accession Number (amino acid) BAA15906.1 (GI:1736756), incorporated herein by reference), a wcaL gene (e.g., GenBank Accession Number (amino acid) BAA15898.1 (GI:1736747), incorporated herein by reference), a wcaB gene (e.g., GenBank Accession Number (amino acid) BAA15911.1 (GI:1736761), incorporated herein by reference), a wcaF gene (e.g., GenBank Accession Number (amino acid) BAA15910.1 (GI:1736760), incorporated herein by reference), a wzxE gene (e.g., GenBank Accession Number (amino acid) BAE77506.1 (GI:85676256), incorporated herein by reference), a wzxC gene (e.g., GenBank Accession Number (amino acid) BAA15899 (GI:1736748), incorporated herein by reference), a wcaD gene (e.g., GenBank Accession Number (amino acid) BAE76573 (GI:85675202), incorporated herein by reference), a wza gene (e.g., GenBank Accession Number (amino acid) BAE76576 (GI:85675205), incorporated herein by reference), a wzb gene (e.g., GenBank Accession Number (amino acid) BAE76575 (GI:85675204), incorporated herein by reference), and a wzc gene (e.g., GenBank Accession Number (amino acid) BAA15913 (GI:1736763), incorporated herein by reference).


Preferably, a host bacterium, such as E. coli, is genetically engineered to produce a human milk oligosaccharide by the inactivation of the wcaJ gene, which encodes the UDP-glucose lipid carrier transferase. The inactivation of the wcaJ gene can be by deletion of the gene, a null mutation, or inactivating mutation of the wcaJ gene, such that the activity of the encoded wcaJ is reduced or eliminated compared to wild-type E. coli. In a wcaJ null background, GDP-fucose accumulates in the E. coli cytoplasm.


Over-expression of a positive regulator protein, RcsA (e.g., GenBank Accession Number M58003 (GI:1103316), incorporated herein by reference), in the colanic acid synthesis pathway results in an increase in intracellular GDP-fucose levels. Over-expression of an additional positive regulator of colanic acid biosynthesis, namely RcsB (e.g., GenBank Accession Number E04821 (GI:2173017), incorporated herein by reference), is also utilized, either instead of or in addition to over-expression of RcsA, to increase intracellular GDP-fucose levels.


Alternatively, colanic acid biosynthesis is increased following the introduction of a mutation into the E. coli lon gene (e.g., GenBank Accession Number L20572 (GI:304907), incorporated herein by reference). Lon is an adenosine-5′-triphosphate (ATP)-dependent intracellular protease that is responsible for degrading RcsA, mentioned above as a positive transcriptional regulator of colanic acid biosynthesis in E. coli. In a lon null background, RcsA is stabilized, RcsA levels increase, the genes responsible for GDP-fucose synthesis in E. coli are up-regulated, and intracellular GDP-fucose concentrations are enhanced. Mutations in lon suitable for use with the methods presented herein include null mutations or insertions that disrupt the expression or function of lon.


A functional lactose permease gene is also present in the bacterium. The lactose permease gene is an endogenous lactose permease gene or an exogenous lactose permease gene. For example, the lactose permease gene comprises an E. coli lacY gene (e.g., GenBank Accession Number V00295 (GI:41897), incorporated herein by reference). Many bacteria possess the inherent ability to transport lactose from the growth medium into the cell, by utilizing a transport protein that is either a homolog of the E. coli lactose permease (e.g., as found in Bacillus licheniformis), or a transporter that is a member of the ubiquitous PTS sugar transport family (e.g., as found in Lactobacillus casei and Lactobacillus rhamnosus). For bacteria lacking an inherent ability to transport extracellular lactose into the cell cytoplasm, this ability is conferred by an exogenous lactose transporter gene (e.g., E. coli lacY) provided on recombinant DNA constructs, and supplied either on a plasmid expression vector or as exogenous genes integrated into the host chromosome.


As described herein, in some embodiments, the host bacterium preferably has a reduced level of β-galactosidase activity. In the embodiment in which the bacterium is characterized by the deletion of the endogenous β-galactosidase gene, an exogenous β-galactosidase gene is introduced to the bacterium. For example, a plasmid expressing an exogenous β-galactosidase gene is introduced to the bacterium, or recombined or integrated into the host genome. For example, the exogenous β-galactosidase gene is inserted into a gene that is inactivated in the host bacterium, such as the lon gene.


The exogenous β-galactosidase gene is a functional β-galactosidase gene characterized by a reduced or low level of β-galactosidase activity compared to β-galactosidase activity in wild-type bacteria lacking any genetic manipulation. Exemplary β-galactosidase genes include E. coli lacZ and β-galactosidase genes from any of a number of other organisms (e.g., the lac4 gene of Kluyveromyces lactis (e.g., GenBank Accession Number M84410 (GI:173304), incorporated herein by reference)) that catalyze the hydrolysis of β-galactosides into monosaccharides. The level of β-galactosidase activity in wild-type E. coli bacteria is, for example, 6,000 units. Thus, the reduced β-galactosidase activity level encompassed by the engineered host bacterium of the present invention includes less than 6,000 units, less than 5,000 units, less than 4,000 units, less than 3,000 units, less than 2,000 units, less than 1,000 units, less than 900 units, less than 800 units, less than 700 units, less than 600 units, less than 500 units, less than 400 units, less than 300 units, less than 200 units, less than 100 units, or less than 50 units. Low, functional levels of β-galactosidase include β-galactosidase activity levels of between 0.05 and 1,000 units, e.g., between 0.05 and 750 units, between 0.05 and 500 units, between 0.05 and 400 units, between 0.05 and 300 units, between 0.05 and 200 units, between 0.05 and 100 units, between 0.05 and 50 units, between 0.05 and 10 units, between 0.05 and 5 units, between 0.05 and 4 units, between 0.05 and 3 units, or between 0.05 and 2 units of β-galactosidase activity. For unit definition and assays for determining β-galactosidase activity, see Miller J H, Laboratory CSH. Experiments in molecular genetics. Cold Spring Harbor Laboratory Cold Spring Harbor, N.Y.; 1972; (incorporated herein by reference). This low level of cytoplasmic β-galactosidase activity is not high enough to significantly diminish the intracellular lactose pool. The low level of β-galactosidase activity is very useful for the facile removal of undesired residual lactose at the end of fermentations.
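
For orientation, the unit referred to above is commonly computed from absorbance readings of an ONPG-based assay as 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600), with t the reaction time in minutes and v the culture volume assayed in mL. The Python sketch below applies that commonly quoted form and checks a value against the "low, functional" range of 0.05 to 1,000 units recited above. The function names and the example readings are hypothetical and are not measurements from this disclosure; the cited Miller reference remains the authority for the assay itself.

```python
def miller_units(od420: float, od550: float, od600: float,
                 minutes: float, volume_ml: float) -> float:
    """Beta-galactosidase activity in Miller units (commonly quoted form).

    od420: absorbance of the reaction mix at 420 nm
    od550: absorbance at 550 nm (light-scattering correction)
    od600: culture density at 600 nm
    minutes: reaction time in minutes
    volume_ml: culture volume assayed, in mL
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def is_low_functional(units: float, lower: float = 0.05,
                      upper: float = 1000.0) -> bool:
    """True if activity lies in the low but functional range (inclusive)."""
    return lower <= units <= upper


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    units = miller_units(od420=0.4, od550=0.0, od600=0.5,
                         minutes=20.0, volume_ml=0.1)
    print(units, is_low_functional(units))
```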


Optionally, the bacterium has an inactivated thyA gene. Preferably, a mutation in a thyA gene in the host bacterium allows for the maintenance of plasmids that carry thyA as a selectable marker gene. Exemplary alternative selectable markers include antibiotic resistance genes such as BLA (beta-lactamase), or proBA genes (to complement a proAB host strain proline auxotrophy) or purA (to complement a purA host strain adenine auxotrophy).


In one aspect, the E. coli bacterium comprises the genotype ΔampC::PtrpBcI, Δ(lacI-lacZ)::FRT, PlacIqlacY+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ+), ΔlacA, and also comprises any one of the exogenous α(1,2) fucosyltransferases described herein.


The bacterium comprising these characteristics is cultured in the presence of lactose. In some cases, the method further comprises culturing the bacterium in the presence of tryptophan and in the absence of thymidine. The fucosylated oligosaccharide is retrieved from the bacterium (i.e., a cell lysate) or from a culture supernatant of the bacterium.


The invention provides a purified fucosylated oligosaccharide produced by the methods described herein. The fucosylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacterium is used directly in such products. The fucosylated oligosaccharide produced by the engineered bacterium is 2′-fucosyllactose (2′-FL) or lactodifucotetraose (LDFT). The new alpha 1,2-fucosyltransferases are also useful to synthesize HMOS of larger molecular weight bearing alpha 1,2 fucose moieties, e.g., lacto-N-fucopentaose I (LNF I) and lacto-N-difucohexaose I (LDFH I). For example, to produce LDFT, the host bacterium is engineered to express an exogenous α (1,2) fucosyltransferase that also possesses α (1,3) fucosyltransferase activity, or an exogenous α (1,2) fucosyltransferase and an exogenous α (1,3) fucosyltransferase. For the production of LNF I and LDFH I, the host bacterium is engineered to express an exogenous α (1,2) fucosyltransferase that also possesses α (1,3) fucosyltransferase activity and/or α (1,4) fucosyltransferase activity, or an exogenous α (1,2) fucosyltransferase, an exogenous α (1,3) fucosyltransferase, and an exogenous α (1,4) fucosyltransferase.


A purified fucosylated oligosaccharide produced by the methods described above is also within the invention. The purified oligosaccharide (2′-FL) obtained at the end of the process is a white/slightly off-white, crystalline, sweet powder. For example, an engineered bacterium, bacterial culture supernatant, or bacterial cell lysate according to the invention comprises 2′-FL, LDFT, LNF I or LDFH I produced by the methods described herein, and does not substantially comprise other fucosylated oligosaccharides prior to purification of the fucosylated oligosaccharide products from the cell, culture supernatant, or lysate. As a general matter, the fucosylated oligosaccharide produced by the methods contains a negligible amount of 3-FL in a 2′-FL-containing cell, cell lysate or culture, or supernatant, e.g., less than 1% of the level of 2′-FL or 0.5% of the level of 2′-FL. Moreover, the fucosylated oligosaccharide produced by the methods described herein also has a minimal amount of contaminating lactose, which can often be co-purified with the fucosylated oligosaccharide product, such as 2′FL. This reduction in contaminating lactose results from the reduced level of β-galactosidase activity present in the engineered host bacterium.


A purified oligosaccharide, e.g., 2′-FL, LDFT, LNF I, or LDFH I, is one that is at least 90%, 95%, 98%, 99%, or 100% (w/w) of the desired oligosaccharide. Purity is assessed by any known method, e.g., thin layer chromatography or other chromatographic techniques known in the art. The invention includes a method of purifying a fucosylated oligosaccharide produced by the genetically engineered bacterium described above, which method comprises separating the desired fucosylated oligosaccharide (e.g., 2′-FL) from contaminants in a bacterial cell lysate or bacterial cell culture supernatant of the bacterium.
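
As a minimal illustration of the weight-based purity figure, the following Python sketch computes purity (w/w) from the mass of the desired oligosaccharide and the total dry mass of the preparation, and compares it with a chosen threshold. The function names and the example masses are hypothetical; how the component masses are determined (for example, by a chromatographic method) is outside the scope of the sketch.

```python
def purity_percent_ww(target_mass_g: float, total_mass_g: float) -> float:
    """Purity of a preparation as percent weight/weight.

    Assumption: both masses are dry weights in the same units, and the
    target mass is part of (not in addition to) the total mass.
    """
    if total_mass_g <= 0:
        raise ValueError("total mass must be positive")
    if not 0 <= target_mass_g <= total_mass_g:
        raise ValueError("target mass must lie between 0 and total mass")
    return 100.0 * target_mass_g / total_mass_g


def is_purified(target_mass_g: float, total_mass_g: float,
                threshold_percent: float = 90.0) -> bool:
    """True if purity reaches the chosen threshold (90% by default)."""
    return purity_percent_ww(target_mass_g, total_mass_g) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical masses, for illustration only.
    print(purity_percent_ww(9.5, 10.0))
    print(is_purified(9.5, 10.0, threshold_percent=98.0))
```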


The oligosaccharides are purified and used in a number of products for consumption by humans as well as animals, such as companion animals (dogs, cats) as well as livestock (bovine, equine, ovine, caprine, or porcine animals, as well as poultry). For example, a pharmaceutical composition comprises purified 2′-FL and a pharmaceutically-acceptable excipient that is suitable for oral administration. Large quantities of 2′-FL are produced in bacterial hosts, e.g., an E. coli bacterium comprising an exogenous α (1,2) fucosyltransferase gene.


A method of producing a pharmaceutical composition comprising a purified human milk oligosaccharide (HMOS) is carried out by culturing the bacterium described above, purifying the HMOS produced by the bacterium, and combining the HMOS with an excipient or carrier to yield a dietary supplement for oral administration. These compositions are useful in methods of preventing or treating enteric and/or respiratory diseases in infants and adults. Accordingly, the compositions are administered to a subject suffering from or at risk of developing such a disease.


The invention also provides methods of identifying an α (1,2) fucosyltransferase gene capable of synthesizing fucosylated oligosaccharides in a host bacterium, i.e., 2′-fucosyllactose (2′-FL) in E. coli. The method of identifying a novel lactose-utilizing α(1,2)fucosyltransferase enzyme comprises the following steps:


1) performing a computational search of sequence databases to define a broad group of simple sequence homologs of any known, lactose-utilizing α(1,2)fucosyltransferase;


2) using the list from step (1), deriving a search profile containing common sequence and/or structural motifs shared by the members of the list;


3) searching sequence databases, using a derived search profile based on the common sequence or structural motif from step (2) as query, and identifying candidate sequences, wherein the sequence homology of each candidate sequence to a reference lactose-utilizing α(1,2)fucosyltransferase meets a predetermined percentage threshold;


4) compiling a list of candidate organisms, said organisms being characterized as expressing α(1,2)fucosyl-glycans in a naturally-occurring state;


5) selecting candidate sequences that are derived from candidate organisms to generate a list of candidate lactose-utilizing enzymes;


6) expressing the candidate lactose-utilizing enzyme in a host organism; and


7) testing for lactose-utilizing α(1,2)fucosyltransferase activity, wherein detection of the desired fucosylated oligosaccharide product in said organism indicates that the candidate sequence comprises a novel lactose-utilizing α(1,2)fucosyltransferase. In another embodiment, the search profile is generated from a multiple sequence alignment of the amino acid sequences of more than one enzyme with known α(1,2)fucosyltransferase activity. The database search can then be designed to refine and iteratively search for novel α(1,2)fucosyltransferases with significant sequence similarity to the multiple sequence alignment query.
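
The selection logic of steps (3) through (5) can be sketched as a simple filter, shown below in Python. This is a minimal illustration under stated assumptions: the database searches of steps (1) to (3) are assumed to have been run elsewhere and reduced to a list of hits, the `Hit` record and `select_candidates` function are hypothetical names, and the accessions, organism labels, and identity values are placeholders rather than results from this disclosure. Steps (6) and (7), expression and activity testing, are laboratory steps and are not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Hit:
    """One database hit from the profile search (hypothetical record)."""
    accession: str           # placeholder identifier
    organism: str            # source organism of the hit
    percent_identity: float  # identity to the reference enzyme


def select_candidates(hits: Iterable[Hit],
                      candidate_organisms: Set[str],
                      min_identity: float) -> List[Hit]:
    """Keep hits that meet the identity threshold (step 3) and come from
    an organism on the candidate list (steps 4 and 5), best match first."""
    selected = [h for h in hits
                if h.percent_identity >= min_identity
                and h.organism in candidate_organisms]
    return sorted(selected, key=lambda h: h.percent_identity, reverse=True)


# Placeholder data, for illustration only.
EXAMPLE_HITS = [
    Hit("HYPO_0001", "Organism A", 31.0),
    Hit("HYPO_0002", "Organism B", 18.5),
    Hit("HYPO_0003", "Organism C", 40.2),
    Hit("HYPO_0004", "Organism A", 24.0),
]
EXAMPLE_ORGANISMS = {"Organism A", "Organism B"}

if __name__ == "__main__":
    for hit in select_candidates(EXAMPLE_HITS, EXAMPLE_ORGANISMS, 22.0):
        print(hit.accession, hit.organism, hit.percent_identity)
```

Sorting by identity is a convenience of the sketch only; the method as described does not rank candidates, and a low-identity hit from a suitable organism may be as worthwhile to test as a high-identity one.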


The invention provides a method of treating, preventing, or reducing the risk of infection in a subject comprising administering to said subject a composition comprising a purified recombinant human milk oligosaccharide, wherein the HMOS binds to a pathogen and wherein the subject is infected with or at risk of infection with the pathogen. In one aspect, the infection is caused by a Norwalk-like virus or Campylobacter jejuni. The subject is preferably a mammal in need of such treatment. The mammal is, e.g., any mammal, e.g., a human, a primate, a mouse, a rat, a dog, a cat, a cow, a horse, or a pig. In a preferred embodiment, the mammal is a human. For example, the compositions are formulated into animal feed (e.g., pellets, kibble, mash) or animal food supplements for companion animals, e.g., dogs or cats, as well as livestock or animals grown for food consumption, e.g., cattle, sheep, pigs, chickens, and goats. Preferably, the purified HMOS is formulated into a powder (e.g., infant formula powder or adult nutritional supplement powder, each of which is mixed with a liquid such as water or juice prior to consumption) or in the form of tablets, capsules or pastes or is incorporated as a component in dairy products such as milk, cream, cheese, yogurt or kefir, or as a component in any beverage, or combined in a preparation containing live microbial cultures intended to serve as probiotics, or in prebiotic preparations to enhance the growth of beneficial microorganisms either in vitro or in vivo.


Polynucleotides, polypeptides, and oligosaccharides of the invention are purified and/or isolated. Purified defines a degree of sterility that is safe for administration to a human subject, e.g., lacking infectious or toxic agents. Specifically, as used herein, an “isolated” or “purified” nucleic acid molecule, polynucleotide, polypeptide, protein or oligosaccharide, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. For example, purified HMOS compositions are at least 60% by weight (dry weight) the compound of interest. Preferably, the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight the compound of interest. Purity is measured by any appropriate standard method, for example, by column chromatography, thin layer chromatography, or high-performance liquid chromatography (HPLC) analysis. For example, a “purified protein” refers to a protein that has been separated from other proteins, lipids, and nucleic acids with which it is naturally associated. Preferably, the protein constitutes at least 10, 20, 50, 70, 80, 90, 95, 99-100% by dry weight of the purified preparation.


Similarly, by “substantially pure” is meant an oligosaccharide that has been separated from the components that naturally accompany it. Typically, the oligosaccharide is substantially pure when it is at least 60%, 70%, 80%, 90%, 95%, or even 99%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated.


By “isolated nucleic acid” is meant a nucleic acid that is free of the genes which, in the naturally-occurring genome of the organism from which the DNA of the invention is derived, flank the gene. The term covers, for example: (a) a DNA which is part of a naturally occurring genomic DNA molecule, but is not flanked by both of the nucleic acid sequences that flank that part of the molecule in the genome of the organism in which it naturally occurs; (b) a nucleic acid incorporated into a vector or into the genomic DNA of a prokaryote or eukaryote in a manner, such that the resulting molecule is not identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), or a restriction fragment; and (d) a recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a fusion protein. Isolated nucleic acid molecules according to the present invention further include molecules produced synthetically, as well as any nucleic acids that have been altered chemically and/or that have modified backbones.


A “heterologous promoter” is a promoter which is different from the promoter to which a gene or nucleic acid sequence is operably linked in nature.


The term “overexpress” or “overexpression” refers to a situation in which more factor is expressed by a genetically-altered cell than would be, under the same conditions, by a wild type cell. Similarly, if an unaltered cell does not express a factor that it is genetically altered to produce, the term “express” (as distinguished from “overexpress”) is used indicating the wild type cell did not express the factor at all prior to genetic manipulation.


The terms “treating” and “treatment” as used herein refer to the administration of an agent or formulation to a clinically symptomatic individual afflicted with an adverse condition, disorder, or disease, so as to effect a reduction in severity and/or frequency of symptoms, eliminate the symptoms and/or their underlying cause, and/or facilitate improvement or remediation of damage. The terms “preventing” and “prevention” refer to the administration of an agent or composition to a clinically asymptomatic individual who is susceptible to a particular adverse condition, disorder, or disease, and thus relates to the prevention of the occurrence of symptoms and/or their underlying cause.


By the terms “effective amount” and “therapeutically effective amount” of a formulation or formulation component is meant a nontoxic but sufficient amount of the formulation or component to provide the desired effect.


The transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. By contrast, the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. The transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.


The host organism used to express the lactose-accepting fucosyltransferase gene is typically the enterobacterium Escherichia coli K-12 (E. coli). E. coli K-12 is not considered a human or animal pathogen nor is it toxigenic. E. coli K-12 is a standard production strain of bacteria and is noted for its safety due to its poor ability to colonize the colon and establish infections (see, e.g., epa.gov/oppt/biotech/pubs/fra/fra004.htm). However, a variety of bacterial species may be used in the oligosaccharide biosynthesis methods, e.g., Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris. Bacteria of the genus Bacillus may also be used, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans. Similarly, bacteria of the genera Lactobacillus and Lactococcus may be modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis. Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein. Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa).
Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and a fucosylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. The fucosylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products. A suitable production host bacterial strain is one that is not the same bacterial strain as the source bacterial strain from which the fucosyltransferase-encoding nucleic acid sequence was identified.


Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof, and from the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All published foreign patents and patent applications cited herein are incorporated herein by reference. Genbank and NCBI submissions indicated by accession number cited herein are incorporated herein by reference. All other published references, documents, manuscripts and scientific literature cited herein are incorporated herein by reference. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a schematic illustration showing the synthetic pathway of the major neutral fucosyl-oligosaccharides found in human milk.



FIG. 2 is a schematic demonstrating metabolic pathways and the changes introduced into them to engineer 2′-fucosyllactose (2′-FL) synthesis in Escherichia coli (E. coli). Specifically, the lactose synthesis pathway and the GDP-fucose synthesis pathway are illustrated. In the GDP-fucose synthesis pathway: manA=phosphomannose isomerase (PMI), manB=phosphomannomutase (PMM), manC=mannose-1-phosphate guanylyltransferase (GMP), gmd=GDP-mannose-4,6-dehydratase, fcl=GDP-fucose synthase (GFS), and ΔwcaJ=mutated UDP-glucose lipid carrier transferase.



FIG. 3A and FIG. 3B show the sequence identity and a multiple sequence alignment of 4 previously known lactose-utilizing α(1,2)-fucosyltransferase protein sequences. FIG. 3A is a table showing the sequence identity between the 4 known lactose-utilizing α(1,2)-fucosyltransferases: H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). FIG. 3B shows a multiple sequence alignment of the 4 known α(1,2)-fucosyltransferases. The ovals highlight regions of particularly high sequence conservation between the four enzymes in the alignment.



FIG. 4 shows the sequence alignment of the 12 identified α(1,2)-fucosyltransferase syngenes, along with the 4 previously known lactose-utilizing α(1,2)-fucosyltransferase protein sequences. The 4 known lactose-utilizing α(1,2)-fucosyltransferases are boxed and include H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). The 12 identified α(1,2)-fucosyltransferases are as follows: Prevotella melaninogenica FutO (SEQ ID NO: 10), Clostridium bolteae+13 FutP (SEQ ID NO: 292), Lachnospiraceae sp. FutQ (SEQ ID NO: 12), Methanosphaerula palustris FutR (SEQ ID NO: 13), Tannerella sp. FutS (SEQ ID NO: 14), Bacteroides caccae FutU (SEQ ID NO: 15), Butyrivibrio FutV (SEQ ID NO: 16), Prevotella sp. FutW (SEQ ID NO: 17), Parabacteroides johnsonii FutX (SEQ ID NO: 18), Akkermansia muciniphila FutY (SEQ ID NO: 19), Salmonella enterica FutZ (SEQ ID NO: 20), and Bacteroides sp. FutZA (SEQ ID NO: 21). The sequence for Clostridium bolteae FutP (without the 13 additional amino acids in the N-terminus) (SEQ ID NO: 11) is also shown in the alignment.



FIG. 5A and FIG. 5B are two pictures of gels showing the construction of the syngenes for each of the 12 novel α(1,2)-fucosyltransferases. FIG. 5A shows post-Gibson assembly PCR. FIG. 5B shows gel-purified EcoRI/XhoI syngene fragments.



FIG. 6A and FIG. 6B are two photographs showing thin layer chromatograms of fucosylated oligosaccharide products produced in E. coli cultures using the 12 novel α(1,2)-fucosyltransferase syngenes. FIG. 6A shows fucosylated oligosaccharide products from 2 μl of culture supernatant. FIG. 6B shows fucosylated oligosaccharide products from 0.2 OD600 cell equivalents of whole cell heat extracts.



FIG. 7 is a graph showing the growth curve of the host bacterium expressing plasmids containing the α(1,2) fucosyltransferase genes WbgL, FutN, FutO, FutQ, and FutX after tryptophan induction in the presence of lactose in the culture medium (i.e. lac+trp).



FIG. 8 is a photograph of an SDS-PAGE gel showing the proteins produced from host bacteria expressing α(1,2) fucosyltransferase genes WbgL, FutN, FutO, FutQ, and FutX after induction.



FIG. 9A and FIG. 9B are two photographs of thin layer chromatograms showing the production of fucosylated oligosaccharide products from E. coli cultures expressing select α(1,2)-fucosyltransferase syngenes WbgL, FutN, FutO, FutQ, and FutX at 7 hours or 24 hours after induction. FIG. 9A shows fucosylated oligosaccharide products from 2 μl of culture supernatant. FIG. 9B shows fucosylated oligosaccharide products from 0.2 OD600 cell equivalents of whole cell heat extracts.



FIG. 10A and FIG. 10B are two photographs of thin layer chromatograms showing the fucosylated oligosaccharide products after two different 1.5 L fermentation runs from E. coli expressing FutN: run 36B (FIG. 10A) and run 37A (FIG. 10B). The culture yield for run 36B was 33 g/L, while the yield for run 37A was 36.3 g/L.



FIG. 11 is a plasmid map of pG217 carrying the B. vulgatus FutN gene.



FIG. 12 is a schematic diagram showing the insertion of the LacIq promoter, the functional lacY gene, and the deletion of lacA.



FIG. 13 is a schematic diagram showing the deletion of the endogenous wcaJ gene using FRT recombination.



FIG. 14 is a schematic diagram of the E. coli W3110 chromosome, showing the insertion of a DNA fragment carrying kanamycin resistance gene (derived from transposon Tn5) and wild-type lacZ into the lon gene.





DETAILED DESCRIPTION OF THE INVENTION

While some studies suggest that human milk glycans could be used as antimicrobial anti-adhesion agents, the difficulty and expense of producing adequate quantities of these agents of a quality suitable for human consumption has limited their full-scale testing and perceived utility. What has been needed is a suitable method for producing the appropriate glycans in sufficient quantities at reasonable cost. Prior to the invention described herein, there were attempts to use several distinct synthetic approaches for glycan synthesis. Some chemical approaches can synthesize oligosaccharides (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003)), but reactants for these methods are expensive and potentially toxic (Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). Enzymes expressed from engineered organisms (Albermann, C., Piepersberg, W. & Wehmeier, U. F. Carbohydr Res 334, 97-103 (2001); Bettler, E., Samain, E., Chazalet, V., Bosso, C., et al. Glycoconj J 16, 205-212 (1999); Johnson, K. F. Glycoconj J 16, 141-146 (1999); Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Wymer, N. & Toone, E. J. Curr Opin Chem Biol 4, 110-119 (2000)) provide a precise and efficient synthesis (Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Crout, D. H. & Vic, G. Curr Opin Chem Biol 2, 98-111 (1998)), but the high cost of the reactants, especially the sugar nucleotides, limits their utility for low-cost, large-scale production. Microbes have been genetically engineered to express the glycosyltransferases needed to synthesize oligosaccharides from the bacteria's innate pool of nucleotide sugars (Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 330, 439-443 (2001); Endo, T., Koizumi, S., Tabata, K. & Ozaki, A. Appl Microbiol Biotechnol 53, 257-261 (2000); Endo, T. & Koizumi, S. Curr Opin Struct Biol 10, 536-541 (2000); Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 316, 179-183 (1999); Koizumi, S., Endo, T., Tabata, K. & Ozaki, A. Nat Biotechnol 16, 847-850 (1998)). However, prior to the invention described herein, there was a growing need to identify and characterize additional glycosyltransferases that are useful for the synthesis of HMOS in metabolically engineered bacterial hosts.


Human Milk Glycans

Human milk contains a diverse and abundant set of neutral and acidic oligosaccharides (Kunz, C., Rudloff, S., Baier, W., Klein, N., and Strobel, S. (2000). Annu Rev Nutr 20, 699-722; Bode, L. (2006). J Nutr 136, 2127-130). More than 130 different complex oligosaccharides have been identified in human milk, and their structural diversity and abundance is unique to humans. Although these molecules may not be utilized directly by infants for nutrition, they nevertheless serve critical roles in the establishment of a healthy gut microbiome (Marcobal, A., Barboza, M., Froehlich, J. W., Block, D. E., et al, J Agric Food Chem 58, 5334-5340 (2010)), in the prevention of disease (Newburg, D. S., Ruiz-Palacios, G. M. & Morrow, A. L. Annu Rev Nutr 25, 37-58 (2005)), and in immune function (Newburg, D. S. & Walker, W. A. Pediatr Res 61, 2-8 (2007)). Despite millions of years of exposure to human milk oligosaccharides (HMOS), pathogens have yet to develop ways to circumvent the ability of HMOS to prevent adhesion to target cells and to inhibit infection. The ability to utilize HMOS as pathogen adherence inhibitors promises to address the current crisis of burgeoning antibiotic resistance. Human milk oligosaccharides produced by biosynthesis represent the lead compounds of a novel class of therapeutics against some of the most intractable scourges of society.


One alternative strategy for efficient, industrial-scale synthesis of HMOS is the metabolic engineering of bacteria. This approach involves the construction of microbial strains overexpressing heterologous glycosyltransferases, membrane transporters for the import of precursor sugars into the bacterial cytosol, and possessing enhanced pools of regenerating nucleotide sugars for use as biosynthetic precursors (Dumon, C., Samain, E., and Priem, B. (2004). Biotechnol Prog 20, 412-19; Ruffing, A., and Chen, R. R. (2006). Microb Cell Fact 5, 25). A key aspect of this approach is the heterologous glycosyltransferase selected for overexpression in the microbial host. The choice of glycosyltransferase can significantly affect the final yield of the desired synthesized oligosaccharide, given that enzymes can vary greatly in terms of kinetics, substrate specificity, affinity for donor and acceptor molecules, stability and solubility. A few glycosyltransferases derived from different bacterial species have been identified and characterized in terms of their ability to catalyze the biosynthesis of HMOS in E. coli host strains (Dumon, C., Bosso, C., Utille, Heyraud, A., and Samain, E. (2006). Chembiochem 7, 359-365; Dumon, C., Samain, E., and Priem, B. (2004). Biotechnol Prog 20, 412-19; Li, M., Liu, X. W., Shao, J., Shen, J., Jia, Q., Yi, W., Song, J. K., Woodward, R., Chow, C. S., and Wang, P. G. (2008), Biochemistry 47, 378-387). The identification of additional glycosyltransferases with faster kinetics, greater affinity for nucleotide sugar donors and/or acceptor molecules, or greater stability within the bacterial host significantly improves the yields of therapeutically useful HMOS. Prior to the invention described herein, chemical syntheses of HMOS were possible, but were limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003); Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). The invention overcomes the shortcomings of these previous attempts by providing new strategies to inexpensively manufacture large quantities of human milk oligosaccharides (HMOS) for use as dietary supplements. Advantages include efficient expression of the enzyme, improved stability and/or solubility of the fucosylated oligosaccharide product (2′-FL, LDFT, LNF I, and LDFH I) and reduced toxicity to the host organism. The present invention features novel α(1,2) FTs suitable for expression in production strains for increased efficacy and yield of fucosylated HMOS compared to α(1,2) FTs currently utilized in the field.


As described in detail below, E. coli (or other bacteria) is engineered to produce selected fucosylated oligosaccharides (i.e., 2′-FL, LDFT, LDFH I, or LNF I) at commercially viable levels. For example, yields are >5 grams/liter in a bacterial fermentation process. In other embodiments, the yields are greater than 10 grams/liter, greater than 15 grams/liter, greater than 20 grams/liter, greater than 25 grams/liter, greater than 30 grams/liter, greater than 35 grams/liter, greater than 40 grams/liter, greater than 45 grams/liter, greater than 50 grams/liter, greater than 55 grams/liter, greater than 60 grams/liter, greater than 65 grams/liter, greater than 70 grams/liter, or greater than 75 grams/liter of fucosylated oligosaccharide products, such as 2′-FL, LDFT, LDFH I, and LNF I.


Role of Human Milk Glycans in Infectious Disease

Human milk glycans, which comprise both unbound oligosaccharides and their glycoconjugates, play a significant role in the protection and development of the infant gastrointestinal (GI) tract. Neutral fucosylated oligosaccharides, including 2′-fucosyllactose (2′-FL), protect infants against several important pathogens. Milk oligosaccharides found in various mammals differ greatly, and the composition in humans is unique (Hamosh M., 2001 Pediatr Clin North Am, 48:69-86; Newburg D. S., 2001 Adv Exp Med Biol, 501:3-10). Moreover, glycan levels in human milk change throughout lactation and also vary widely among individuals (Morrow A. L. et al., 2004 J Pediatr, 145:297-303; Chaturvedi P et al., 2001 Glycobiology, 11:365-372). Approximately 200 distinct human milk oligosaccharides have been identified, and combinations of simple epitopes are responsible for this diversity (Newburg D. S., 1999 Curr Med Chem, 6:117-127; Ninonuevo M. et al., 2006 J Agric Food Chem, 54:7471-7480).


Human milk oligosaccharides are composed of 5 monosaccharides: D-glucose (Glc), D-galactose (Gal), N-acetylglucosamine (GlcNAc), L-fucose (Fuc), and sialic acid (N-acetyl neuraminic acid, Neu5Ac, NANA). Human milk oligosaccharides are usually divided into two groups according to their chemical structures: neutral compounds containing Glc, Gal, GlcNAc, and Fuc, linked to a lactose (Galβ1-4Glc) core, and acidic compounds including the same sugars, and often the same core structures, plus NANA (Charlwood J. et al., 1999 Anal Biochem, 273:261-277; Martin-Sosa et al., 2003 J Dairy Sci, 86:52-59; Parkkinen J, and Finne J., 1987 Methods Enzymol, 138:289-300; Shen Z. et al., 2001 J Chromatogr A, 921:315-321).


Approximately 70-80% of oligosaccharides in human milk are fucosylated, and their synthetic pathways are believed to proceed as shown in FIG. 1. A smaller proportion of the oligosaccharides are sialylated or both fucosylated and sialylated, but their synthetic pathways are not fully defined. Understanding of the acidic (sialylated) oligosaccharides is limited in part by the ability to measure these compounds. Sensitive and reproducible methods for the analysis of both neutral and acidic oligosaccharides have been designed. Human milk oligosaccharides as a class survive transit through the intestine of infants very efficiently, being essentially indigestible (Chaturvedi, P., Warren, C. D., Buescher, C. R., Pickering, L. K. & Newburg, D. S. Adv Exp Med Biol 501, 315-323 (2001)).


Human Milk Glycans Inhibit Binding of Enteropathogens to their Receptors


Human milk glycans have structural homology to cell receptors for enteropathogens and function as receptor decoys. For example, pathogenic strains of Campylobacter bind specifically to glycans containing H-2, i.e., 2′-fucosyl-N-acetyllactosamine or 2′-fucosyllactose (2′-FL); Campylobacter binding and infectivity are inhibited by 2′-FL and other glycans containing this H-2 epitope. Similarly, some diarrheagenic E. coli pathogens are strongly inhibited in vivo by human milk oligosaccharides containing 2-linked fucose moieties. Several major strains of human caliciviruses, especially the noroviruses, also bind to 2-linked fucosylated glycans, and this binding is inhibited by human milk 2-linked fucosylated glycans. Consumption of human milk that has high levels of these 2-linked fucosyloligosaccharides was associated with lower risk of norovirus, Campylobacter, ST of E. coli-associated diarrhea, and moderate-to-severe diarrhea of all causes in a Mexican cohort of breastfeeding children (Newburg D. S. et al., 2004 Glycobiology, 14:253-263; Newburg D. S. et al., 1998 Lancet, 351:1160-1164). Several pathogens utilize sialylated glycans as their host receptors, such as influenza (Couceiro, J. N., Paulson, J. C. & Baum, L. G. Virus Res 29, 155-165 (1993)), parainfluenza (Amonsen, M., Smith, D. F., Cummings, R. D. & Air, G. M. J Virol 81, 8341-8345 (2007)), and rotaviruses (Kuhlenschmidt, T. B., Hanafin, W. P., Gelberg, H. B. & Kuhlenschmidt, M. S. Adv Exp Med Biol 473, 309-317 (1999)). The sialyl-Lewis X epitope is used by Helicobacter pylori (Mahdavi, J., Sondén, B., Hurtig, M., Olfat, F. O., et al. Science 297, 573-578 (2002)), Pseudomonas aeruginosa (Scharfman, A., Delmotte, P., Beau, J., Lamblin, G., et al. Glycoconj J 17, 735-740 (2000)), and some strains of noroviruses (Rydell, G. E., Nilsson, J., Rodriguez-Diaz, J., Ruvoën-Clouet, N., et al. Glycobiology 19, 309-320 (2009)).


Identification of Novel α(1,2) Fucosyltransferases

The present invention provides novel α(1,2) fucosyltransferase enzymes. The present invention also provides nucleic acid constructs (i.e., a plasmid or vector) carrying the nucleic acid sequence of a novel α(1,2) fucosyltransferase for the expression of the novel α(1,2) fucosyltransferase in a host bacterium. The present invention also provides methods for producing fucosylated oligosaccharides by expressing the novel α(1,2) fucosyltransferases in a suitable host production bacterium, as further described herein.


Not all α(1,2)fucosyltransferases can utilize lactose as an acceptor substrate. An acceptor substrate includes, for example, a carbohydrate, an oligosaccharide, a protein or glycoprotein, a lipid or glycolipid, e.g., N-acetylglucosamine, N-acetyllactosamine, galactose, fucose, sialic acid, glucose, lactose, or any combination thereof. A preferred alpha (1,2) fucosyltransferase of the present invention utilizes GDP-fucose as a donor, and lactose is the acceptor for that donor.


A method of identifying novel α(1,2)fucosyltransferase enzymes capable of utilizing lactose as an acceptor was previously carried out (as described in PCT/US2013/051777, hereby incorporated by reference in its entirety) using the following steps: 1) performing a computational search of sequence databases to define a broad group of simple sequence homologs of any known, lactose-utilizing α(1,2)fucosyltransferase; 2) using the list of homologs from step 1 to derive a search profile containing common sequence and/or structural motifs shared by the members of the broad group, e.g., by using computer programs such as MEME (Multiple Em for Motif Elicitation available at http://meme.sdsc.edu/meme/cgi-bin/meme.cgi) or PSI-BLAST (Position-Specific Iterated BLAST available at ncbi.nlm.nih.gov/blast with additional information at cnx.org/content/m11040/latest/); 3) searching sequence databases (e.g., using computer programs such as PSI-BLAST, or MAST (Motif Alignment Search Tool available at http://meme.sdsc.edu/meme/cgi-bin/mast.cgi)), using this derived search profile as query, and identifying “candidate sequences” whose simple sequence homology to the original lactose-accepting α(1,2)fucosyltransferase is 40% or less; 4) scanning the scientific literature and developing a list of “candidate organisms” known to express α(1,2)fucosyl-glycans; 5) selecting only those “candidate sequences” that are derived from “candidate organisms” to generate a list of “candidate lactose-utilizing enzymes”; and 6) expressing each “candidate lactose-utilizing enzyme” and testing for lactose-utilizing α(1,2)fucosyltransferase activity.
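The selection logic of steps 3 and 5 above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in PCT/US2013/051777: the `Hit` record, the field names, and the example accessions and organisms are hypothetical placeholders, and the search and expression steps (1, 2, 4, and 6) are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One database hit (hypothetical record layout, for illustration only)."""
    accession: str
    organism: str
    identity_to_query: float  # percent identity to the known lactose-utilizing enzyme


def select_candidates(hits, candidate_organisms, max_identity=40.0):
    """Keep hits at or below the identity ceiling (step 3) that also come
    from an organism on the literature-derived list (step 5)."""
    return [
        h for h in hits
        if h.identity_to_query <= max_identity and h.organism in candidate_organisms
    ]


# Placeholder data; none of these values come from the specification.
hits = [
    Hit("hypothetical_A", "Organism one", 72.5),
    Hit("hypothetical_B", "Organism two", 31.0),
    Hit("hypothetical_C", "Organism three", 24.8),
]
candidates = select_candidates(hits, {"Organism one", "Organism two"})
print([h.accession for h in candidates])
```

Each surviving entry would then go forward to step 6, expression and activity testing, which remains a wet-lab step.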


The MEME suite of sequence analysis tools (meme.sdsc.edu/meme/cgi-bin/meme.cgi) can also be used as an alternative to PSI-BLAST. Sequence motifs are discovered using the program “MEME”. These motifs can then be used to search sequence databases using the program “MAST”. The BLAST and PSI-BLAST search algorithms are other well-known alternatives.


To identify additional novel α(1,2)fucosyltransferases, a multiple sequence alignment query was generated using four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences: H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). This sequence alignment and the percentage of sequence identity are shown in FIG. 3. An iterative PSI-BLAST was performed, using the FASTA-formatted multiple sequence alignment as the query, and the NCBI PSI-BLAST program run on a local copy of NCBI BLAST+ version 2.2.29. An initial position-specific scoring matrix file (.pssm) was generated by PSI-BLAST, which the program then used to adjust the score of iterative homology search runs. The process was iterated to generate an even larger group of candidates, and the results of each run were used to further refine the matrix.
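As a rough sketch of how such a search might be launched with a local BLAST+ installation, the Python snippet below assembles a `psiblast` argument list that takes a multiple sequence alignment as the query and writes out the resulting scoring matrix. The file names, the database name, and the iteration count are assumptions made for illustration; the specification does not state the parameters actually used.

```python
import shlex


def build_psiblast_command(msa_fasta, database, pssm_out, hits_out, iterations=3):
    """Return an argv list for an MSA-seeded PSI-BLAST run (nothing is executed here)."""
    return [
        "psiblast",
        "-in_msa", msa_fasta,                # FASTA-formatted alignment used as the query
        "-db", database,                     # local protein database to search
        "-num_iterations", str(iterations),  # assumed value, not given in the text
        "-out_pssm", pssm_out,               # position-specific scoring matrix file
        "-outfmt", "6",                      # tabular hit list
        "-out", hits_out,
    ]


# Hypothetical file and database names.
cmd = build_psiblast_command(
    "known_ft_alignment.fasta", "nr", "ft_profile.pssm", "ft_hits.tsv"
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and the chosen database are installed.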


This PSI-BLAST search resulted in an initial 2515 hits. There were 787 hits with greater than 22% sequence identity to FutC. 396 hits were of greater than 275 amino acids in length. Additional analysis of the hits was performed, including sorting by percentage identity to FutC, comparing the sequences by BLAST to the existing α(1,2) fucosyltransferase inventory (of known α(1,2) fucosyltransferases), and manual annotation of hit sequences to identify those originating from bacteria that naturally exist in the gastrointestinal tract. An annotated list of the novel α(1,2) fucosyltransferases identified by this screen is provided in Table 1. Table 1 provides the bacterial species from which the candidate enzyme is found, the GenBank Accession Number, GI Identification Number, amino acid sequence, and % sequence identity to FutC.
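The identity and length cut-offs described above amount to a simple filter followed by a sort. The sketch below assumes the two thresholds are applied together and that each hit is held as a dictionary with hypothetical keys `identity_to_futc` and `length`; the example records are invented and are not hits from the actual search.

```python
def filter_hits(hits, min_identity=22.0, min_length=275):
    """Keep hits above both thresholds, sorted by identity to FutC (highest first)."""
    kept = [
        h for h in hits
        if h["identity_to_futc"] > min_identity and h["length"] > min_length
    ]
    return sorted(kept, key=lambda h: h["identity_to_futc"], reverse=True)


# Invented example records (percent identity, length in amino acids).
example_hits = [
    {"id": "hit_1", "identity_to_futc": 27.7, "length": 301},
    {"id": "hit_2", "identity_to_futc": 19.4, "length": 310},
    {"id": "hit_3", "identity_to_futc": 24.1, "length": 188},
    {"id": "hit_4", "identity_to_futc": 31.2, "length": 290},
]
shortlist = filter_hits(example_hits)
print([h["id"] for h in shortlist])
```

The later steps (comparison against the known enzyme inventory and manual annotation of gut-associated source organisms) rely on curation and are not captured by this filter.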


Of the identified hits, 12 novel α(1,2) fucosyltransferases were further analyzed for their functional capacity: Prevotella melaninogenica FutO, Clostridium bolteae FutP, Clostridium bolteae+13 FutP, Lachnospiraceae sp. FutQ, Methanosphaerula palustris FutR, Tannerella sp. FutS, Bacteroides caccae FutU, Butyrivibrio FutV, Prevotella sp. FutW, Parabacteroides johnsonii FutX, Akkermansia muciniphila FutY, Salmonella enterica FutZ, and Bacteroides sp. FutZA. For Clostridium bolteae FutP, the annotation named the wrong initiation methionine codon. Thus, the present invention includes FutP with an additional 13 amino acids at the N-terminus of the annotated FutP (derived in-frame from the natural upstream DNA sequence), which is designated herein as Clostridium bolteae+13 FutP. The sequence identity between the 12 novel α(1,2) fucosyltransferases identified and the 4 previously identified α(1,2) fucosyltransferases is shown in Table 2 below.









TABLE 2

Sequence Identity (%)

Enzyme                                          No.      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16
H. pylori futC                                    1      -  70.10  21.99  20.82  27.68  27.36  23.56  23.26  23.62  25.75  23.72  24.05  12.29  24.19  22.92  22.29
H. mustelae futL                                  2  70.10      -  23.87  19.88  26.38  28.21  24.30  23.38  24.62  25.31  25.31  24.47  23.56  25.15  23.55  23.26
Bacteroides vulgatus futN                         3  21.99  23.87      -  25.16  32.05  28.71  28.94  25.79  37.46  32.27  26.11  61.27  71.63  27.67  25.15  84.75
E. coli O126 wbgL                                 4  20.82  19.88  25.16      -  24.25  22.73  22.32  26.04  25.45  24.77  21.49  73.29  26.71  24.63  21.45  25.16
Prevotella melaninogenica FutO YP_003814512.1     5  27.68  26.36  32.05  24.25      -  36.96  31.63  35.74  35.16  55.74  30.28  30.03  32.80  30.09  26.28  31.83
Clostridium bolteae+13 FutP WP_002570768.1        6  27.36  28.21  28.71  22.73  36.96      -  37.87  35.10  33.77  36.91  35.74  29.53  31.39  27.67  26.33  29.13
Lachnospiraceae sp. FutQ WP_009251343.1           7  23.56  24.30  28.94  22.32  31.63  37.87      -  29.87  29.17  32.90  51.02  28.53  30.00  27.69  24.00  27.74
Methanosphaerula palustris FutR YP_002467213.1    8  23.28  23.38  25.79  26.04  35.74  35.10  29.87      -  18.71  38.24  31.41  25.39  28.08  30.65  23.93  25.55
Tannerella sp. FutS WP_021929367.1                9  23.62  24.62  37.46  25.45  35.16  33.77  29.17  28.71      -  34.41  30.03  35.71  36.27  26.48  21.75  36.60
Bacteroides caccae FutU WP_005675707.1           10  25.75  25.31  32.27  24.77  55.74  36.91  32.90  38.24  34.41      -  31.21  29.94  33.33  29.28  24.40  33.01
Butyrivibrio FutV WP_022772718.1                 11  23.72  25.31  25.11  21.49  30.28  35.74  51.02  31.41  30.03  31.21      -  27.62  26.20  26.46  22.15  26.52
Prevotella sp. FutW WP_022481266.1               12  24.05  24.47  61.27  23.29  30.03  29.58  28.53  25.39  15.71  29.94  27.62      -  57.60  25.79  22.15  59.01
Parabacteroides johnsonii FutX WP_008155883.1    13  22.29  23.56  71.63  26.71  32.80  31.39  30.00  28.08  36.27  33.33  26.20  57.60      -  28.71  24.00  74.02
Akkermansia muciniphila FutY YP_00187755.5       14  24.19  25.15  27.67  24.63  30.09  27.67  27.69  30.65  26.41  29.28  26.46  25.79  28.71      -  21.45  28.08
Salmonella enterica FutZ WP_023214330            15  21.92  23.55  25.15  21.45  26.24  26.33  14.00  23.93  21.75  24.46  22.13  22.15  24.00  21.45      -  74.62
Bacteroides sp. FutZA WP_022161880.11            16  22.29  23.26  84.75  25.16  31.83  29.13  27.74  25.55  35.50  33.01  26.52  59.01  74.02  28.08  24.62      -
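Pairwise values of the kind tabulated in Table 2 are commonly obtained by counting identical residues across an alignment. The function below is one possible definition, offered as an assumption rather than the calculation used to build Table 2: it takes two rows of an existing alignment, skips columns where either row has a gap, and reports matches as a percentage of the remaining columns. The short peptide strings are invented examples.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented aligned fragments, not sequences from the specification.
value = percent_identity("MKIV-KILGG", "MVIIKKMMGG")
print(round(value, 2))
```

Under this definition the result is the same whichever sequence is given first, so a matrix built from it would be expected to mirror across its diagonal. Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different numbers.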









Based on the amino acid sequences of the identified α(1,2) fucosyltransferases (i.e., in Table 1), syngenes can be readily designed and constructed by the skilled artisan using standard methods known in the art. For example, the syngenes include a ribosomal binding site, are codon-optimized for expression in a host bacterial production strain (i.e., E. coli), and have common 6-cutter restriction sites or sites recognized by endogenous restriction enzymes present in the host strain (i.e., EcoK restriction sites) removed to ease cloning and expression in the E. coli host strain. In a preferred embodiment, the syngenes are constructed with the following configuration: EcoRI site—T7g10 RBS—α(1,2) FT syngene—XhoI site. The nucleic acid sequences of sample syngenes for the 12 identified α(1,2) fucosyltransferases are shown in Table 3. (the initiating methionine ATG codon is bolded)
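The construct layout described above can be illustrated with a short Python sketch that wraps a coding sequence in the flanking elements and rejects any coding sequence still containing an internal EcoRI or XhoI site. The flanking pads and the spacer between the EcoRI site and the ATG are read off the FutO entry in Table 3 and are assumed here to be shared by all syngenes; the toy coding sequence is invented, and codon optimization and EcoK site removal are not modeled.

```python
ECORI = "GAATTC"  # EcoRI recognition site
XHOI = "CTCGAG"   # XhoI recognition site
RBS_SPACER = "AAGAAGGAGATATACAT"  # sequence between the EcoRI site and ATG in Table 3 (FutO)


def assemble_syngene(cds, five_prime_pad="CAGTCAGTCA", three_prime_pad="TGACTGACTG"):
    """Return pad + EcoRI + RBS + CDS + XhoI + pad, refusing a CDS that
    contains either cloning site internally."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with the ATG start codon")
    internal = [name for name, site in (("EcoRI", ECORI), ("XhoI", XHOI)) if site in cds]
    if internal:
        raise ValueError("internal restriction site(s) present: " + ", ".join(internal))
    return five_prime_pad + ECORI + RBS_SPACER + cds + XHOI + three_prime_pad


# Toy coding sequence for illustration only.
construct = assemble_syngene("ATGAAAATCTAG")
print(construct)
```

This check looks only inside the coding sequence. A fuller version would also scan the junctions, where the end of the coding sequence and the adjacent element could together form a new site.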









TABLE 3







Nucleic acid sequences of 12 novel α(1,2) fucosyltransferase syngenes









Bacteria/Gene name    Sequence    SEQ ID NO:





FutO
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGAAAATCGTCAAAATCCTGGGCGGT
276



CTGGGCAATCAGATGTTCCAGTATGCTCTGTACCTGAGCCTGCAAGAAAGTTTTCCAAAA




GAACGTGTGGCCCTGGACCTGTCCTCCTTCCACGGCTATCACCTGCATAATGGCTTTGAG




CTGGAGAACATTTTCTCCGTTACCGCTCAGAAAGCATCCGCCGCAGATATCATGCGTATT




GCTTATTACTACCCGAACTATCTGCTGTGGCGCATTGGCAAACGTTTTCTGCCGCGTCGT




AAAGGTATGTGCCTGGAATCTAGCTCCCTGCGTTTCGATGAAAGCGTTCTGCGTCAGGAA




GGTAACCGTTATTTTGACGGTTACTGGCAAGACGAACGCTACTTCGCAGCCTATCGTGAA




AAAGTGCTGAAGGCTTTCACCTTTCCTGCATTCAAACGCGCAGAAAACCTGAGCCTGCTG




GAAAAACTGGACGAAAACAGCATTGCTCTGCATGTTCGTCGCGGTGATTACGTAGGTAAT




AACCTGTACCAAGGCATCTGTGACCTGGACTACTACCGTACCGCTATCGAGAAAATGTGT




GCACACGTTACTCCGTCTCTGTTTTGTATCTTTTCCAACGACATCACGTGGTGCCAGCAG




CACCTGCAACCGTACCTGAAGGCCCCTGTGGTGTACGTTACTTGGAACACCGGTGTTGAA




TCCTACCGCGATATGCAGCTGATGTCCTGCTGCGCACATAACATCATCGCGAATAGCTCC




TTCTCTTGGTGGGGTGCTTGGCTGAATCAGAACCGTGAAAAAGTTGTTATCGCCCCGAAA




AAATGGCTGAACATGGAAGAATGTCACTTCACGCTGCCGGCAAGCTGGATCAAAATTTAG




CTCGAGTGACTGACTG






FutP
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTGATTATCAAAATGATGGGTGGT
277



CTGGGCAACCAGATGTTCCAGTACGCACTGTACAAAGCATTCGAGCAGAAGCACATCGAT




GTGTATGCAGACCTGGCATGGTACAAAAACAAATCCGTGAAATTTGAACTGTACAACTTC




GGCATTAAAATCAACGTAGCATCCGAGAAAGACATCAACCGTCTGAGCGATTGCCAGGCG




GACTTTGTTTCCCGCATCCGCCGTAAAATCTTTGGTAAAAAAAAGAGCTTCGTATCTGAA




AAAAATGACTCCTGCTATGAAAACGACATCCTGCGTATGGACAACGTTTATCTGAGCGGT




TATTGGCAGACCGAAAAATACTTCTCTAACACGCGTGAGAAGCTGCTGGAGGATTATTCC




TTCGCTCTGGTAAACTCTCAGGTGTCCGAATGGGAAGACTCCATTCGCAACAAAAACAGC




GTTAGCATCCATATCCGTCGTGGTGATTATCTACAGGGCGAACTGTATGGTGGTATTTGC




ACCTCTCTGTACTACGCCGAAGCAATCGAGTACATTAAAATGCGTGTTCCGAACGCAAAA




TTCTTCGTTTTCTCTGATGACGTTGAATGGGTTAAACAGCAAGAAGACTTCAAAGGCTTC




GTAATCGTTGATCGCAACGAGTATTCTAGCGCTCTGTCCGATATGTACCTGATGTCCCTG




TGCAAGCATAACATTATTGCTAACTCCTCTTTCAGCTGGTGGGCAGCTTGGCTGAACCGT




AACGAAGAAAAAATTGTAATCGCGCCGCGCCGTTGGCTGAACGGCAAGTGCACCCCAGAT




ATCTGGTGTAAAAAATGGATTCGTATCTAGCTCGAGTGACTGACTG






FutQ
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTGATCGTACAGCTGAGCGGCGGT
278



CTGGGCAACCAGATGTTCGAATACGCGCTGTACCTGAGCCTGAAAGCAAAAGGCAAAGAA




GTGAAAATTGACGATGTTACGTGTTACGAGGGCCCTGGCACCCGTCCGCGTCAACTGGAT




GTTTTTGGTATCACGTACGATCGCGCGTCTCGTGAGGAGCTGACTGAGATGACGGACGCG




AGCATGGATGCGCTGTCTCGTGTTCGTCGCAAACTGACCGGTCGCCGCACTAAAGCGTAC




CGCGAACGCGACATCAACTTCGATCCACTGGTTATGGAAAAAGACCCGGCACTGCTGGAA




GGCTGTTTCCAGTCTGACAAATACTTTCGTGATTGCGAAGGCCGCGTGCGCGAACCGTAT




CGTTTCCGCGGCATTGAATCCGGCGCGTTCCCGCTGCCGGAAGACTATCTGCGCCTGGAA




AAGCAGATCGAAGATTGTCAGTCCGTATCCGTACACATCCGTCGTGGCGACTACCTGGAC




GAATCTCATGGTGGTCTGTACACCGGCATTTGTACTGAGGCGTACTATAAAGAGGCTTTT




GCTCGCATGGAACGTCTGGTTCCGGGCGCACGTTTCTTCCTGTTCTCTAACGATCCAGAA




TGGACTCGTGAGCACTTTGAGAGCAAGAACTGCGTTCTGGTTGAAGGTAGCACCGAAGAC




ACGGGTTACATGGACCTGTACCTGATGAGCCGCTGCCGCCACAATATTATTGCCAACTCT




TCTTTCAGCTGGTGGGGCGCTTGGCTGAATGAGAACCCTGAGAAAAAAGTCATCGCACCG




GCTAAATGGCTGAACGGTCGTGAGTGCCGTGATATCTATACCGAACGCATGATTCGTCTG




TAGCTCGAGTGACTGACTG






FutR
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGATCATTGTTCGTCTGAAAGGCGGT
279



CTGGGCAACCAACTGTCTCAGTATGCACTGGGCCGTAAGATCGCGCATCTGCACAATACC




GAACTGAAACTGGACACCACTTGGTTCACCACTATCTCCTCCGACACTCCACGTACCTAC




CGTCTGAACAATTATAACATCATCGGCACTATTGCATCCGCAAAGGAAATCCAGCTGATC




GAACGTGGTCGCGCGCAAGGCCGTGGCTACCTGCTGTCTAAAATTTCTGATCTGCTGACT




CCGATGTACCGTCGTACCTACGTCCGTGAACGTATGCATACCTTCGATAAAGCTATCCTG




ACCGTTCCGGACAACGTGTACCTGGATGGTTACTGGCAGACCGAAAAGTACTTCAAAGAC




ATCGAAGAAATCCTGCGCCGTGAGGTTACGCTGAAAGATGAACCGGATAGCATCAACCTG




GAAATGGCTGAACGTATTCAGGCTTGCCACAGCGTTTCCCTGCACGTGCGTCGTGGCGAC




TACGTTTCCAACCCGACCACTCAACAATTCCACGGCTGTTGCTCCATTGACTACTACAAC




CGCGCTATCTCTCTGATTGAAGAAAAAGTGGATGACCCGTCTTTCTTTATTTTTTCTGAC




GATCTGCCGTGGGCTAAAGAAAACCTGGACATCCCTGGCGAAAAAACCTTCGTTGCGCAT




AACGGCCCGGAAAAAGAGTATTGCGATCTGTGGCTGATGTCTCTGTGCCAGCACCATATC




ATCGCAAACTCTTCTTTCAGCTGGTGGGGTGCCTGGCTGGGTCAAGACGCCGAAAAGATG




GTGATCGCGCCGCGTCGCTGGGCCCTGTCCGAGAGCTTTGACACTTCTGACATCATTCCG




GACTCTTGGATTACTATCTAGCTCGAGTGACTGACTG






FutS
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTACGCATTGTGGAAATCATCGGC
280



GGTCTGGGTAACCAGATGTTCCAGTACGCATTCTCCCTGTACCTGAAAAACAAATCTCAC




ATCTGGGACCGTCTGTATGTGGACATCGAGGCGATGAAAACCTACGATCGTCACTATGGT




CTGGAACTGGAGAAAGTTTTCAATCTGAGCCTGTGTCCAATCTCTAACCGTCTGCACCGC




AACCTGCAAAAACGCTCCTTCGCAAAACACTTTGTAAAGAGCCTGTACGAGCACTCTGAA




TGCGAGTTCGACGAACCGGTGTACCGTGGCCTGCGTCCTTATCGCTATTATCGCGGCTAC




TGGCAAAACGAAGGTTACTTCGTTGATATTGAACCGATGATCCGTGAGGCTTTTCAGTTC




AACGTTAACATCCTGAGCAAAAAGACTAAAGCGATCGCATCCAAAATGCGCCGTGAACTG




TCCGTATCTATCCATGTTCGCCGTGGTGATTACGAAAACCTGCCGGAAGCGAAAGCGATG




CATGGCGGTATTTGTTCTCTGGACTATTACCACAAAGCGATCGACTTCATCCGCCAGCGT




CTGGATAATAACATCTGTTTCTATCTGTTCTCCGACGATATCAATTGGGTAGAAGAAAAC




CTGCAACTGGAAAACCGTTGCATCATCGACTGGAACCAGGGCGAAGATAGCTGGCAGGAC




ATGTACCTGATGAGCTGCTGCCGCCACCACATTATCGCAAACAGCTCTTTCTCCTGGTGG




GCGGCATGGCTGAATCCAAACAAGAACAAAATCGTACTGACCCCGAACAAATGGTTCAAC




CATACTGACGCAGTGGGTATCGTCCCAAAGTCCTGGATTAAAATTCCTGTGTTTTAGCTC




GAGTGACTGACTG






FutU
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGAAAATCGTTAAAATCCTGGGCGGC
281



CTGGGTAACCAGATGTTTCAGTACGCCCTGTTCCTGTCTCTGAAAGAACGCTTCCCGCAT




GAACAGGTGATGATTGACACCAGCTGCTTCCGCAATTACCCACTGCACAACGGTTTCGAA




GTGGATCGTATCTTCGCCCAGAAAGCACCGGTTGCCTCTTGGCGTAACATCCTGAAGGTT




GCCTACCCGTACCCGAACTACCGTTTCTGGAAAATCGGTAAATACATCCTGCCTAAACGT




AAAACCATGTGTGTAGAGCGTAAAAACTTCAGCTTTGACGCCGCAGTCCTGACCCGTAAA




GGCGATTGCTACTATGATGGCTACTGGCAGCATGAGGAATATTTCTGTGATATGAAAGAA




ACGATTTGGGAGGCTTTCTCCTTCCCTGAGCCGGTTGATGGTCGTAACAAGGAGATCGGT




GCCCTGCTACAGGCATCTGATAGCGCTTCCCTGCACGTTCGTCGCGGTGACTACGTGAAC




CACCCACTGTTTCGTGGTATTTGTGACCTGGACTATTATAAACGTGCCATCCACTACATG




GAAGAACGCGTCAACCCACAGCTGTACTGCGTTTTCAGCAACGATATGGCCTGGTGCGAG




TCCCACCTGCGTGCACTGCTGCCAGGCAAAGAAGTAGTTTATGTTGACTGGAACAAGGGT




GCGGAATCTTACGTTGATATGCGTCTGATGAGCCTGTGCCGTCACAACATCATCGCTAAC




TCTTCTTTCAGCTGGTGGGGCGCATGGCTGAACCGTAACCCGCAGAAAGTGGTGGTAGCG




CCGGAACGTTGGATGAACAGCCCGATTGAAGACCCAGTGAGCGACAAATGGATTAAACTG




TAGCTCGAGTGACTGACTG






FutV
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGATCATCATCCAGCTGAAAGGTGGC
282



CTGGGCAACCAAATGTTCCAGTACGCGCTGTACAAATCCCTGAAAAAACGTGGTAAAGAA




GTTAAAATTGATGACAAAACTGGCTTCGTGAACGACAAACTGCGTATCCCGGTACTGTCC




CGTTGGGGTGTTGAGTACGATCGTGCAACCGACGAAGAGATTATTAACCTGACCGACTCC




AAAATGGACCTGTTCTCTCGCATCCGCCGTAAACTGACTGGCCGCAAAACGTTCCGTATC




GACGAAGAATCCGGTAAATTCAACCCGGAAATCCTGGAAAAAGAGAACGCTTATCTGGTG




GGTTATTGGCAGTGCGACAAGTACTTCGACGACAAAGATGTGGTTCGCGAAATTCGTGAA




GCGTTCGAGAAAAAACCGCAGGAGCTGATGACCGACGCCAGCTCTTGGTCTACTCTACAG




CAGATTGAATGCTGCGAGTCCGTATCCCTGCACGTACGTCGTACTGATTACGTGGACGAG




GAACATATTCATATCCATAACATCTGTACGGAAAAATACTATAAAAACGCCATTGATCGT




GTGCGTAAACAGTACCCGAGCGCAGTGTTCTTCATCTTCACCGATGATAAAGAATGGTGC




CGCGACCACTTTAAAGGTCCGAACTTCATCGTAGTCGAACTGGAAGAAGGCGACGGTACC




GACATCGCTGAAATGACTCTGATGTCCCGCTGTAAACATCACATCATCGCTAATTCTAGC




TTTAGCTGGTGGGCGGCGTGGCTGAACGACTCCCCGGAAAAAATCGTGATCGCTCCTCAG




AAATGGATTAACAACCGCGACATGGACGATATTTACACCGAGCGTATGACTAAAATCGCA




CTGTAGCTCGAGTGACTGACTG






FutW
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGCCTGGTTAAAATGATCGGCGGT
283



CTGGGTAATCAGATGTTCATCTACGCGTTTTACCTACAGATGCGTAAGCGTTTCTCCAAC




GTTCGTATCGACCTGACCGATATGATGCACTACAACGTACACTATGGCTACGAACTGCAC




AAAGTTTTCGGTCTGCCGCGCACCGAGTTCTGTATGAACCAGCCTCTGAAAAAGGTTCTG




GAGTTCCTGTTCTTCCGTACCATTGTTGAACGTAAACAGCACGGTCGTATGGAGCCGTAT




ACTTGCCAGTATGTTTGGCCGCTGGTTTACTTTAAGGGCTTCTATCAGTCCGAACGTTAC




TTCTCCGAAGTTAAGGACGAAGTTCGTGAGTGTTTCACCTTCAATCCGGCACTGGCGAAT




CGTTCTTCCCAACAGATGATGGAACAGATCCAGAATGATCCTCAGGCTGTCTCTATCCAC




ATCCGTCGTGGCGACTATCTGAATCCGAAGCACTACGACACTATCGGTTGTATCTGTCAG




CTGCCGTATTACAAGCACGCCGTTTCCGAAATTAAAAAGTACGTTTCTAACCCTCACTTT




TACGTTTTCTCCGAAGACCTGGATTGGGTCAAAGCAAACCTGCCGCTGGAAAACGCACAG




TACATCGATTGGAACAAAGGCGCAGATAGCTGGCAGGATATGATGCTGATGAGCTGTTGC




AAACACCACATTATCTGTAACTCCACCTTTAGCTGGTGGGCGGCGTGGCTGAACCCATCT




GTCGAAAAAACCGTGATCATGCCGGAACAGTGGACGTCTCGTCAAGATTCCGTGGACTTT




GTGGCTAGCTGTGGCCGTTGGGTCCGTGTTAAAACGGAGTAGCTCGAGTGACTGACTG






FutX
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGTCTGATCAAGATGATCGGCGGC
284



CTGGGTAACCAGATGTTTATCTACGCGTTCTACCTGAAAATGAAACACCATTACCCGGAT




ACGAACATCGATCTGTCTGACATGGTTCATTATAAAGTTCACAACGGTTATGAGATGAAC




CGTATCTTTGACCTGAGCCAGACTGAATTTTGCATCAACCGTACCCTGAAAAAAATCCTG




GAGTTCCTGTTCTTCAAAAAAATCTACGAACGTCGCCAGGACCCGTCTACTCTGTATCCA




TACGAAAAACGTTATTTTTGGCCGCTGCTGTACTTTAAAGGTTTCTACCAGTCTGAACGC




TTCTTCTTCGATATCAAAGACGACGTTCGTAAAGCCTTCTCTTTTAACCTGAACATCGCT




AACCCGGAAAGCCTGGAACTGCTGAAACAGATCGAAGTTGACGACCAAGCTGTTTCTATC




CACATCCGCCGTGGTGACTACCTGCTGCCGCGTCACTGGGCAAACACGGGTTCCGTGTGC




CAGCTGCCGTATTACAAGAACGCGATCGCGGAAATGGAGAACCGTATTACTGGCCCGAGC




TACTACGTGTTCTCTGATGATATCTCTTGGGTTAAAGAAAACATCCCGCTGAAGAAAGCG




GTCTACGTGACGTGGAACAAGGGCGAAGACAGCTGGCAGGATATGATGCTGATGAGCCAC




TGTCGTCACCACATTATCTGTAATTCTACGTTCTCCTGGTGGGGTGCTTGGCTGAACCCA




CGTAAAGAGAAAATCGTCATCGCGCCGTGTCGCTGGTTCCAGCATAAAGAAACCCCGGAC




ATGTACCCGAAAGAATGGATCAAAGTACCGATTAACTAGCTCGAGTGACTGACTG






FutZ
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGTATTCTTGCCTGTCTGGTGGCCTG
285



GGTAACCAAATGTTTCAATACGCAGCAGCGTATATCCTGAAGCAGTATTTTCAGTCTACC




ACTCTGGTCCTGGATGATAGCTATTACTATTCCCAGCCGAAACGTGATACCGTTCGTAGC




CTGGAACTGAATCAGTTCAACATCTCTTATGATCGTTTTAGCTTCGCGGATGAAAAAGAG




AAGATCAAACTGCTGCGCAAATTCAAACGTAACCCGTTCCCTAAACAGATTTCCGAGATC




CTGTCTATTGCGCTGTTCGGCAAATACGCGCTGTCCGACCGTGCATTTTACACCTTCGAA




ACTATCAAAAACATCGACAAAGCGTGCCTGTTCTCCTTTTACCAGGACGCCGATCTGCTG




AATAAATATAAGCAGCTGATCCTGCCGCTGTTCGAACTGCGCGATGACCTGCTGGATATC




TGCAAGAACCTGGAACTGTATTCCCTGATCCAACGCAGCAACAATACCACTGCACTGCAT




ATCCGCCGTGGCGACTACGTGACCAACCAGCACGCCGCGAAATACCACGGCGTGCTGGAC




ATCAGCTACTATAACCACGCAATGGAATACGTGGAACGTGAACGCGGCAAACAGAACTTC




ATTATCTTCAGCGATGATGTACGTTGGGCACAGAAAGCGTTTCTGGAGAACGATAATTGC




TACGTGATTAACAACTCCGACTACGATTTCTCTGCGATCGATATGTATCTGATGTCTCTG




TGCAAAAACAACATCATCGCAAATTCCACCTACTCCTGGTGGGGTGCGTGGCTGAACAAA




TACGAGGACAAACTGGTTATCTCTCCGAAACAATGGTTTCTGGGTAACAACGAAACCTCT




CTGCGTAACGCGTCTTGGATCACCCTGTAGCTCGAGTGACTGACTG






FutZA
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGTCTGATCAAGATGACCGGTGGC
286



CTGGGTAACCAGATGTTCATCTACGCGTTTTATCTGCGTATGAAAAAACGTTATCCGAAA




GTTCGTATTGATCTGTCTGATATGGTTCATTATCACGTTCACCACGGCTATGAAATGCAC




CGTGTTTTCAATCTGCCGCACACCGAATTTTGCATCAACCAGCCGCTGAAAAAAGTGATC




GAGTTCCTGTTTTTCAAAAAGATTTACGAACGTAAACAGGACCCTAATTCTCTGCGTGCA




TTCGAGAAGAAGTATCTGTGGCCGCTGCTGTACTTCAAAGGTTTCTATCAATCTGAGCGC




TTCTTTGCTGACATCAAAGACGAGGTTCGTAAAGCATTCACCTTTGACTCTTCTAAAGTG




AACGCTCGCTCTGCCGAACTGCTGCGTCGCCTGGATGCCGATGCTAACGCGGTTAGCCTG




CACATTCGTCGCGGTGACTATCTACAGCCGCAGCATTGGGCTACCACTGGTTCTGTCTGC




CAGCTGCCGTACTACCAGAACGCGATCGCTGAAATGAACCGTCGCGTTGCTGCCCCGAGC




TACTACGTTTTCAGCGATGACATCGCGTGGGTGAAGGAAAACATCCCTCTACAGAACGCA




GTGTACATCGACTGGAATAAAGGCGAAGAAAGCTGGCAGGATATGATGCTGATGAGCCAC




TGCCGCCACCACATTATCTGTAACAGCACCTTCTCTTGGTGGGGCGCGTGGCTGGACCCG




CACGAGGACAAAATTGTAATCGTTCCGAATCGTTGGTTCCAGCATTGCGAAACTCCTAAC




ATCTATCCGGCAGGCTGGGTGAAAGTTGCGATTAATTAGCTCGAGTGACTGACTG









In any of the methods described herein, the α(1,2) fucosyltransferase genes or gene products may be variants or functional fragments thereof. A variant of any of the genes or gene products disclosed herein may have 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the nucleic acid or amino acid sequences described herein.


Variants as disclosed herein also include homologs, orthologs, or paralogs of the genes or gene products described herein that retain the same biological function as the genes or gene products specified herein. These variants can be used interchangeably with the genes recited in these methods. Such variants may demonstrate a percentage of homology or identity, for example, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity, in conserved domains important for biological function, preferably in a functional domain, e.g., a catalytic domain.


The term “% identity,” in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. For example, % identity is relative to the entire length of the coding regions of the sequences being compared, or the length of a particular fragment or functional domain thereof.


For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
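As a hedged illustration of the calculation described above, the sketch below computes percent identity from two sequences that have already been aligned (gaps written as "-"). The choice of denominator is an assumption: here it is the ungapped length of the reference sequence, which is one reading of "relative to the entire length" of the sequence being compared. The function name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(aligned_ref, aligned_test, gap="-"):
    """Percent of reference residues matched by the test sequence.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and gap characters mark insertions or deletions.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    ref_length = sum(1 for residue in aligned_ref if residue != gap)
    if ref_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1
        for ref_residue, test_residue in zip(aligned_ref, aligned_test)
        if ref_residue == test_residue and ref_residue != gap
    )
    return 100.0 * matches / ref_length


# Toy example: 8 of 10 reference positions are identical.
print(percent_identity("MKIVKILGGL", "MKIVKMMGGL"))
```

Other denominators (alignment length, shorter sequence length) give different values for the same alignment, so the convention should be stated whenever a percentage is reported.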


Percent identity is determined using search algorithms such as BLAST and PSI-BLAST (Altschul et al., 1990, J Mol Biol 215:3, 403-410; Altschul et al., 1997, Nucleic Acids Res 25:17, 3389-402). For the PSI-BLAST search, the following exemplary parameters are employed: (1) an Expect threshold of 10; (2) gap costs of Existence: 11 and Extension: 1; (3) the BLOSUM62 matrix; and (4) the filter for low-complexity regions set to “on”.
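For readers who run such searches locally, the exemplary parameters map onto command-line options of the NCBI BLAST+ psiblast program. The sketch below only builds the argument list and does not run the search. The query file name and database name are hypothetical placeholders, and using the standalone BLAST+ program (rather than the web interface) is an assumption not stated in the text.

```python
def psiblast_command(query_fasta, database):
    """Build a psiblast argument list using the exemplary parameters."""
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-evalue", "10",        # (1) Expect threshold
        "-gapopen", "11",       # (2) gap existence cost
        "-gapextend", "1",      # (2) gap extension cost
        "-matrix", "BLOSUM62",  # (3) scoring matrix
        "-seg", "yes",          # (4) low-complexity filter on
    ]


# Hypothetical file and database names, for illustration only.
print(" ".join(psiblast_command("candidate_ft.fasta", "nr")))
```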


Changes can be introduced by mutation into the nucleic acid sequence or amino acid sequence of any of the genes or gene products described herein, leading to changes in the amino acid sequence of the encoded protein or enzyme, without altering the functional ability of the protein or enzyme. For example, nucleotide substitutions leading to amino acid substitutions at “non-essential” amino acid residues can be made in any of the sequences expressly disclosed herein. A “non-essential” amino acid residue is a residue at a position in the sequence that can be altered from the wild-type sequence of the polypeptide without altering the biological activity, whereas an “essential” amino acid residue is a residue at a position that is required for biological activity. For example, amino acid residues that are conserved among members of a family of proteins are not likely to be amenable to mutation. Other amino acid residues, however (e.g., those that are poorly conserved among members of the protein family), may not be as essential for activity and thus are more likely to be amenable to alteration. Thus, another aspect of the invention pertains to nucleic acid molecules encoding the proteins or enzymes disclosed herein that contain changes in amino acid residues relative to the amino acid sequences disclosed herein that are not essential for activity (i.e., fucosyltransferase activity).


An isolated nucleic acid molecule encoding a protein essentially retaining the functional capability compared to any of the genes described herein can be created by introducing one or more nucleotide substitutions, additions or deletions into the corresponding nucleotide sequence, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.


Mutations can be introduced into a nucleic acid sequence by standard techniques such that the encoded amino acid sequence is altered, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. Certain amino acids have side chains with more than one classifiable characteristic. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, tryptophan, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tyrosine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a given polypeptide is replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a given coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for given polypeptide biological activity to identify mutants that retain activity. Conversely, the invention also provides for variants with mutations that enhance or increase the endogenous biological activity. Following mutagenesis of the nucleic acid sequence, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. 
An increase, decrease, or elimination of a given biological activity of the variants disclosed herein can be readily measured by the ordinary person skilled in the art, i.e., by measuring the capability for mediating oligosaccharide modification, synthesis, or degradation (via detection of the products).
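The side-chain families listed above can be encoded directly to check whether a proposed substitution is conservative in the sense used here. This is a minimal sketch: the one-letter codes are the standard abbreviations for the residues named in the text, a substitution is treated as conservative if the two residues share at least one family (an interpretation of "from the same side chain family"), and the name is_conservative is hypothetical.

```python
# Families transcribed from the text; some residues appear in more than one
# family because their side chains have more than one classifiable feature.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYWC"),
    "nonpolar": set("AVLIPFMYW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a, residue_b):
    """Return the names of the families that contain both residues."""
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if residue_a in members and residue_b in members
    ]


def is_conservative(residue_a, residue_b):
    """True if the two residues share at least one side-chain family."""
    return bool(shared_families(residue_a, residue_b))


print(is_conservative("K", "R"), shared_families("K", "R"))
print(is_conservative("D", "K"), shared_families("D", "K"))
```

A conservative substitution by this test is only a prediction that activity is retained; the activity of the resulting variant still has to be measured as described above.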


The present invention also provides for functional fragments of the genes or gene products described herein. A fragment, in the case of these sequences and all others provided herein, is defined as a part of the whole that is less than the whole. Moreover, a fragment ranges in size from a single nucleotide or amino acid within a polynucleotide or polypeptide sequence to one fewer nucleotide or amino acid than the entire polynucleotide or polypeptide sequence. Finally, a fragment is defined as any portion of a complete polynucleotide or polypeptide sequence that is intermediate between the extremes defined above.


For example, fragments of any of the proteins or enzymes disclosed herein or encoded by any of the genes disclosed herein can be 10 to 20 amino acids, 10 to 30 amino acids, 10 to 40 amino acids, 10 to 50 amino acids, 10 to 60 amino acids, 10 to 70 amino acids, 10 to 80 amino acids, 10 to 90 amino acids, 10 to 100 amino acids, 50 to 100 amino acids, 75 to 125 amino acids, 100 to 150 amino acids, 150 to 200 amino acids, 200 to 250 amino acids, 250 to 300 amino acids, 300 to 350 amino acids, 350 to 400 amino acids, 400 to 450 amino acids, or 450 to 500 amino acids. The fragments encompassed in the present invention comprise fragments that retain functional activity. As such, the fragments preferably retain the catalytic domains that are required or are important for functional activity. Fragments can be determined or generated by using the sequence information herein, and the fragments can be tested for functional activity using standard methods known in the art. For example, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. The biological function of said fragment can be measured by measuring its ability to synthesize or modify a substrate oligosaccharide, or conversely, to catabolize an oligosaccharide substrate.


Within the context of the invention, “functionally equivalent”, as used herein, refers to a gene or the resulting encoded protein variant or fragment thereof capable of exhibiting a substantially similar activity as the wild-type fucosyltransferase. Specifically, the fucosyltransferase activity refers to the ability to transfer a fucose sugar to an acceptor substrate via an alpha-(1,2)-linkage. As used herein, “substantially similar activity” refers to an activity level within 5%, 10%, 20%, 30%, 40%, or 50% of the wild-type fucosyltransferase.
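The "substantially similar activity" criterion is a simple relative-difference test. The sketch below assumes that "within X% of the wild-type" means the absolute difference between the two activities, divided by the wild-type activity, is at most X; the text does not spell out the formula, so this reading, the function name, and the example values are illustrative only.

```python
def is_substantially_similar(variant_activity, wild_type_activity, tolerance=0.50):
    """True if the variant activity lies within `tolerance` of wild-type.

    `tolerance` is a fraction: 0.05, 0.10, 0.20, 0.30, 0.40 or 0.50
    correspond to the percentages recited in the text.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    relative_difference = abs(variant_activity - wild_type_activity) / wild_type_activity
    return relative_difference <= tolerance


# Hypothetical activities in arbitrary units.
print(is_substantially_similar(90.0, 100.0, tolerance=0.20))
print(is_substantially_similar(40.0, 100.0, tolerance=0.50))
```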


To test for lactose-utilizing fucosyltransferase activity, the production of fucosylated oligosaccharides (e.g., 2′-FL) is evaluated in a host organism that expresses the candidate enzyme (or syngene) and which contains both cytoplasmic GDP-fucose and lactose pools. The production of fucosylated oligosaccharides indicates that the candidate enzyme-encoding sequence functions as a lactose-utilizing α(1,2) fucosyltransferase.


Engineering of E. coli to Produce Human Milk Oligosaccharide 2′-FL


Described herein is a gene screening approach, which was used to validate the novel α (1,2) fucosyltransferases (α (1,2) FTs) for the synthesis of fucosyl-linked oligosaccharides in metabolically engineered E. coli. Of particular interest are α (1,2) FTs that are capable of the synthesis of the HMOS 2′-fucosyllactose (2′-FL). 2′-FL is the most abundant fucosylated oligosaccharide present in human milk, and this oligosaccharide provides protection to newborn infants against infectious diarrhea caused by bacterial pathogens such as Campylobacter jejuni (Ruiz-Palacios, G. M., et al. (2003). J Biol Chem 278, 14112-120; Morrow, A. L. et al. (2004). J Pediatr 145, 297-303; Newburg, D. S. et al. (2004). Glycobiology 14, 253-263). Other α (1,2) FTs of interest are those capable of synthesis of the HMOS lactodifucotetraose (LDFT), lacto-N-fucopentaose I (LNF I), or lacto-N-difucohexaose I (LDFH I).


The synthetic pathway of fucosyl oligosaccharides of human milk is illustrated in FIG. 1. Structurally, 2′-FL consists of a fucose molecule α1,2-linked to the galactose portion of lactose (Fucα1-2Galβ1-4Glc). An α (1,2) FT from H. pylori strain 26695 termed FutC has been utilized to catalyze the synthesis of 2′-FL in metabolically engineered E. coli (Drouillard, S. et al. (2006). Angew Chem Int Ed Engl 45, 1778-780).
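The composition of 2′-FL follows from this structure: one fucose residue joined to lactose through a glycosidic bond, with the net loss of one water. The sketch below does only that bookkeeping. The molecular formulas of fucose, lactose, and water are standard chemistry supplied here rather than taken from this text, and the function name is hypothetical. In the cell the donor is GDP-fucose and GDP is released, but the net composition of the product is the same.

```python
from collections import Counter

FUCOSE = Counter({"C": 6, "H": 12, "O": 5})     # C6H12O5
LACTOSE = Counter({"C": 12, "H": 22, "O": 11})  # C12H22O11 (Galβ1-4Glc)
WATER = Counter({"H": 2, "O": 1})


def glycosidic_product(donor_sugar, acceptor):
    """Elemental composition after joining two sugars with loss of water."""
    product = donor_sugar + acceptor  # new Counter; inputs are not modified
    product.subtract(WATER)
    return dict(product)


# Composition of Fucα1-2Galβ1-4Glc (2'-FL).
print(glycosidic_product(FUCOSE, LACTOSE))
```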


Candidate α(1,2) FTs (i.e., syngenes) were cloned by standard molecular biological techniques into an expression plasmid. This plasmid utilizes the strong leftwards promoter of bacteriophage λ (termed PL) to direct expression of the candidate genes (Sanger, F. et al. (1982), J Mol Biol 162, 729-773). The promoter is controllable, e.g., a trp-cI construct is stably integrated into the E. coli host's genome (at the ampC locus), and control is implemented by adding tryptophan to the growth media. Gradual induction of protein expression is accomplished using a temperature sensitive cI repressor. Another similar control strategy (temperature independent expression system) has been described (Mieschendahl et al., 1986, Bio/Technology 4:802-808). The plasmid also carries the E. coli rcsA gene to up-regulate the synthesis of GDP-fucose, a critical precursor for the synthesis of fucosyl-linked oligosaccharides. In addition, the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains by ampicillin selection (for convenience in the laboratory) and a native thyA (thymidylate synthase) gene as an alternative means of selection in thyA-deficient hosts. Alternative selectable markers include the proBA genes to complement proline auxotrophy (Stein et al. (1984), J Bacteriol 158:2, 696-700) or purA to complement adenine auxotrophy (S. A. Wolfe, J. M. Smith, J Biol Chem 263, 19147-53 (1988)). To act as plasmid selectable markers, each of these genes is first inactivated in the host cell chromosome, and wild-type copies of the genes are then provided on the plasmid. Alternatively, a drug resistance gene may be used on the plasmid, e.g., beta-lactamase (this gene is already on the expression plasmid described above, thereby permitting selection with ampicillin). Ampicillin selection is well known in the art and described in standard manuals such as Maniatis et al., (1982) Molecular cloning, a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.


The nucleic acid sequence of such an expression plasmid, pEC2-(T7)FutX-rcsA-thyA (pG401) is provided below. The underlined sequence represents the FutX syngene, which can be readily replaced with any of the novel α(1,2) FTs described herein using standard recombinant DNA techniques.









(SEQ ID NO: 287)


TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCG





GAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCG





TCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATG





CGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAA





TACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAAC





CTGTATATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCT





GATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCCGCCAGCCCG





ACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATT





CTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCATA





TCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATT





TAGAACTGATGCAAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGT





ACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAGATGCGTTTTAACCT





GCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCA





TCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTAT





CTACACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGG





CGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAG





ATGGTCGTCATATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAAC





GACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCGAACT





GGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGG





CAGACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTC





CTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTATTGGTGCATATGAT





GGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACCGGTGGCG





ACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGC





CGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATC





CATCTTCGACTACCGTTTCGAAGACTTTGAGATTGAAGGCTACGATCCGC





ATCCGGGCATTAAAGCGCCGGTGGCTATCTAATTACGAAACATCCTGCCA





GAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAATTCTTCG





AGACGCCTTCCCGAAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGG





GAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGG





GGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGT





CACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTTCTTTAATGAAGCAG





GGCATCAGGACGGTATCTTTGTGGAGAAAGCAGAGTAATCTTATTCAGCC





TGACTGGTGGGAAACCACCAGTCAGAATGTGTTAGCGCATGTTGACAAAA





ATACCATTAGTCACATTATCCGTCAGTCGGACGACATGGTAGATAACCTG





TTTATTATGCGTTTTGATCTTACGTTTAATATTACCTTTATGCGATGAAA





CGGTCTTGGCTTTGATATTCATTTGGTCAGAGATTTGAATGGTTCCCTGA





CCTGCCATCCACATTCGCAACATACTCGATTCGGTTCGGCTCAATGATAA





CGTCGGCATATTTAAAAACGAGGTTATCGTTGTCTCTTTTTTCAGAATAT





CGCCAAGGATATCGTCGAGAGATTCCGGTTTAATCGATTTAGAACTGATC





AATAAATTTTTTCTGACCAATAGATATTCATCAAAATGAACATTGGCAAT





TGCCATAAAAACGATAAATAACGTATTGGGATGTTGATTAATGATGAGCT





TGATACGCTGACTGTTAGAAGCATCGTGGATGAAACAGTCCTCATTAATA





AACACCACTGAAGGGCGCTGTGAATCACAAGCTATGGCAAGGTCATCAAC





GGTTTCAATGTCGTTGATTTCTCTTTTTTTAACCCCTCTACTCAACAGAT





ACCCGGTTAAACCTAGTCGGGTGTAACTACATAAATCCATAATAATCGTT





GACATGGCATACCCTCACTCAATGCGTAACGATAATTCCCCTTACCTGAA





TATTTCATCATGACTAAACGGAACAACATGGGTCACCTAATGCGCCACTC





TCGCGATTTTTCAGGCGGACTTACTATCCCGTAAAGTGTTGTATAATTTG





CCTGGAATTGTCTTAAAGTAAAGTAAATGTTGCGATATGTGAGTGAGCTT





AAAACAAATATTTCGCTGCAGGAGTATCCTGGAAGATGTTCGTAGAAGCT





TACTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTG





TACTACCCTGTACGATTACTGCAGCTCGAGCTAGTTAATCGGTACTTTGA





TCCATTCTTTCGGGTACATGTCCGGGGTTTCTTTATCCTGGAACCAGCGA





CACGGCGCGATGACGATTTTCTCTTTACGTGGGTTCAGCCAAGCACCCCA





CCAGGAGAACGTAGAATTACAGATAATGTGGTGACGACAGTGGCTCATCA





GCATCATATCCTGCCAGCTGTCTTCGCCCTTGTTCCACGTCACGTAGACC





GCTTTCTTCAGCGGGATGTTTTCTTTAACCCAAGAGATATCATCAGAGAA





CACGTAGTAGCTCGGGCCAGTAATACGGTTCTCCATTTCCGCGATCGCGT





TCTTGTAATACGGCAGCTGGCACACGGAACCCGTGTTTGCCCAGTGACGC





GGCACCAGGTAGTCACCACGGCGGATGTGGATAGAAACAGCTTCGTCGTC





AACTTCGATCTGTTTCAGCAGTTCCAGGCTTTCCGGGTTAGCGATGTTCA





GGTTAAAAGAGAAGGCTTTACGAACGTCGTCTTTGATATCGAAGAAGAAG





CGTTCAGACTGGTAGAAACCTTTAAAGTACAGCAGCGGCCAAAAATAACG





TTTTTCGTATGGATACAGAGTAGACGCGTCCTGGCCACGTTCGTAGATTT





TTTTGAAGAACAGGAACTCCAGGATTTTTTTCAGGGTACGGTTGATGCAA





AATTCAGTCTGGCTCAGGTCAAAGATACGGTTCATCTCATAACCGTTGTG





AACTTTATAATGAACCATGTCAGACAGATCGATGTTCGTATCCGGGTAAT





GGTGTTTCATTTTCAGGTAGAACGCGTAGATAAACATCTGGTTACCCAGG





CCGCCGATCATCTTGATCAGACGCATATCTATATCTCCTTCTTGAATTCT





AAAAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTT





TAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTT





ATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAA





ACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTG





CGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTC





TCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCT





GTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCC





CCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAAT





GCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCC





TTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCG





TCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCG





CCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGA





ATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTG





CATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCG





CTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGC





GGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAA





TCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGC





CAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCC





CCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAAC





CCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGT





GCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTC





TCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTC





AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTCCACGAACCCCC





CGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCA





ACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGG





ATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTG





GCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGC





TGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAA





CAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTAC





GCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGT





CTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGA





TTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTT





TAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAAT





GCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCC





ATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTT





ACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGG





CTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGA





AGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCG





GGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTG





CCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCA





TTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTT





GTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTA





AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCT





CTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTC





AACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCC





CGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTG





CTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACC





GCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTT





CAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGG





CAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACT





CATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTC





TCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGG





GTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCAT





TATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTC





GTC
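In the listing above, an XhoI site (CTCGAG) precedes an EcoRI site (GAATTC) around the FutX region, which suggests that the syngene reads on the complementary strand of the sequence as written. Under that reading, exchanging FutX for another syngene can be modeled in silico as replacing the XhoI-to-EcoRI fragment with the reverse complement of the new EcoRI-to-XhoI cassette. The sketch below is a simplified model of that swap on a toy sequence, not a cloning protocol. It assumes each site occurs exactly once in the plasmid, which has not been verified for SEQ ID NO: 287, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

XHOI = "CTCGAG"
ECORI = "GAATTC"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def swap_cassette(plasmid, new_cassette):
    """Replace the XhoI...EcoRI fragment with a new cassette.

    `new_cassette` is written 5'->3' in the syngene orientation, starting
    with the EcoRI site and ending with the XhoI site. It is inserted as its
    reverse complement because the insert reads on the other strand.
    """
    if not (new_cassette.startswith(ECORI) and new_cassette.endswith(XHOI)):
        raise ValueError("cassette must start with EcoRI and end with XhoI")
    if plasmid.count(XHOI) != 1 or plasmid.count(ECORI) != 1:
        raise ValueError("each site must occur exactly once in the plasmid")
    start = plasmid.index(XHOI)
    end = plasmid.index(ECORI) + len(ECORI)
    if start >= end:
        raise ValueError("expected XhoI upstream of EcoRI on the listed strand")
    return plasmid[:start] + reverse_complement(new_cassette) + plasmid[end:]


# Toy plasmid carrying a reversed ATGAAATAG insert, swapped for ATGCCCTAG.
toy_plasmid = "AAAA" + XHOI + "CTATTTCAT" + ECORI + "TTTT"
print(swap_cassette(toy_plasmid, ECORI + "ATGCCCTAG" + XHOI))
```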






The expression constructs were transformed into a host strain useful for the production of 2′-FL. Biosynthesis of 2′-FL requires the generation of an enhanced cellular pool of both lactose and GDP-fucose (FIG. 2). The wild-type Escherichia coli K12 prototrophic strain W3110 was selected as the parent background to test the ability of the candidates to catalyze 2′-FL production (Bachmann, B. J. (1972). Bacteriol Rev 36, 525-557). The particular W3110 derivative employed was one that previously had been modified by the introduction (at the ampC locus) of a tryptophan-inducible PtrpB cI+ repressor cassette, generating an E. coli strain known as GI724 (LaVallie, E. R. et al. (2000). Methods Enzymol 326, 322-340). Other features of GI724 include lacIq and lacPL8 promoter mutations. E. coli strain GI724 affords economical production of recombinant proteins from the phage λ PL promoter following induction with low levels of exogenous tryptophan (LaVallie, E. R. et al. (1993). Biotechnology (N Y) 11, 187-193; Mieschendahl, et al. (1986). Bio/Technology 4, 802-08). Additional genetic alterations were made to this strain to promote the biosynthesis of 2′-FL. This was achieved in strain GI724 through several manipulations of the chromosome using λ Red recombineering (Court, D. L. et al. (2002). Annu Rev Genet 36, 361-388) and generalized P1 phage transduction.


First, the ability of the E. coli host strain to accumulate intracellular lactose was engineered by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI). During construction of this deletion, the lacIq promoter was placed immediately upstream of the lactose permease gene, lacY. The modified strain maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type copy of the lacZ (β-galactosidase) gene responsible for lactose catabolism. Therefore, an intracellular lactose pool is created when the modified strain is cultured in the presence of exogenous lactose. A schematic of the PlacIq lacY+ chromosomal construct is shown in FIG. 12.


Genomic DNA sequence of the PlacIq lacY+ chromosomal construct is set forth below (SEQ ID NO: 288):









CACCATCGAATGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCG





GAAGAGAGTCAAGTGTAGGCTGGAGCTGCTTCGAAGTTCCTATACTTTC





TAGAGAATAGGAACTTCGGAATAGGAACTTCGGAATAGGAACTAAGGAG





GATATTCATATGTACTATTTAAAAAACACAAACTTTTGGATGTTCGGTT





TATTCTTTTTCTTTTACTTTTTTATCATGGGAGCCTACTTCCCGTTTTT





CCCGATTTGGCTACATGACATCAACCATATCAGCAAAAGTGATACGGGT





ATTATTTTTGCCGCTATTTCTCTGTTCTCGCTATTATTCCAACCGCTGT





TTGGTCTGCTTTCTGACAAACTCGGGCTGCGCAAATACCTGCTGTGGAT





TATTACCGGCATGTTAGTGATGTTTGCGCCGTTCTTTATTTTTATCTTC





GGGCCACTGTTACAATACAACATTTTAGTAGGATCGATTGTTGGTGGTA





TTTATCTAGGCTTTTGTTTTAACGCCGGTGCGCCAGCAGTAGAGGCATT





TATTGAGAAAGTCAGCCGTCGCAGTAATTTCGAATTTGGTCGCGCGCGG





ATGTTTGGCTGTGTTGGCTGGGCGCTGTGTGCCTCGATTGTCGGCATCA





TGTTCACCATCAATAATCAGTTTGTTTTCTGGCTGGGCTCTGGCTGTGC





ACTCATCCTCGCCGTTTTACTCTTTTTCGCCAAAACGGATGCGCCCTCT





TCTGCCACGGTTGCCAATGCGGTAGGTGCCAACCATTCGGCATTTAGCC





TTAAGCTGGCACTGGAACTGTTCAGACAGCCAAAACTGTGGTTTTTGTC





ACTGTATGTTATTGGCGTTTCCTGCACCTACGATGTTTTTGACCAACAG





TTTGCTAATTTCTTTACTTCGTTCTTTGCTACCGGTGAACAGGGTACGC





GGGTATTTGGCTACGTAACGACAATGGGCGAATTACTTAACGCCTCGAT





TATGTTCTTTGCGCCACTGATCATTAATCGCATCGGTGGGAAAAACGCC





CTGCTGCTGGCTGGCACTATTATGTCTGTACGTATTATTGGCTCATCGT





TCGCCACCTCAGCGCTGGAAGTGGTTATTCTGAAAACGCTGCATATGTT





TGAAGTACCGTTCCTGCTGGTGGGCTGCTTTAAATATATTACCAGCCAG





TTTGAAGTGCGTTTTTCAGCGACGATTTATCTGGTCTGTTTCTGCTTCT





TTAAGCAACTGGCGATGATTTTTATGTCTGTACTGGCGGGCAATATGTA





TGAAAGCATCGGTTTCCAGGGCGCTTATCTGGTGCTGGGTCTGGTGGCG





CTGGGCTTCACCTTAATTTCCGTGTTCACGCTTAGCGGCCCCGGCCCGC





TTTCCCTGCTGCGTCGTCAGGTGAATGAAGTCGCTTAAGCAATCAATGT





CGGATGCGGCGCGAGCGCCTTATCCGACCAACATATCATAACGGAGTGA





TCGCATTGTAAATTATAAAAATTGCCTGATACGCTGCGCTTATCAGGCC





TACAAGTTCAGCGATCTACATTAGCCGCATCCGGCATGAACAAAGCGCA





GGAACAAGCGTCGCA
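As a sanity check on a sequence listing such as SEQ ID NO: 288, the beginning of the lacY reading frame can be translated in silico. The following Python sketch is illustrative only and is not part of the described method: the helper name `translate` is hypothetical, the 30-nucleotide fragment is copied from the listing above (starting at the ATG within the CATATG site), and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Fragment copied from the listing above (assumed start of the lacY frame).
LACY_START = "ATGTACTATTTAAAAAACACAAACTTTTGG"

if __name__ == "__main__":
    print(translate(LACY_START))
```

Translating the full open reading frame would require first joining the listing lines into one contiguous string; the sketch deliberately uses a short fragment so that it can be checked by eye.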






Second, the ability of the host E. coli strain to synthesize colanic acid, an extracellular capsular polysaccharide, was eliminated by the deletion of the wcaJ gene, encoding the UDP-glucose lipid carrier transferase (Stevenson, G. et al. (1996). J Bacteriol 178, 4885-893). In a wcaJ null background GDP-fucose accumulates in the E. coli cytoplasm (Dumon, C. et al. (2001). Glycoconj J 18, 465-474). A schematic of the chromosomal deletion of wcaJ is shown in FIG. 13.


The sequence of the chromosomal region of E. coli bearing the ΔwcaJ::FRT mutation is set forth below (SEQ ID NO: 289):









GTTCGGTTATATCAATGTCAAAAACCTCACGCCGCTCAAGCTGGTGATCA





ACTCCGGGAACGGCGCAGCGGGTCCGGTGGTGGACGCCATTGAAGCCCGC





TTTAAAGCCCTCGGCGCGCCCGTGGAATTAATCAAAGTGCACAACACGCC





GGACGGCAATTTCCCCAACGGTATTCCTAACCCACTACTGCCGGAATGCC





GCGACGACACCCGCAATGCGGTCATCAAACACGGCGCGGATATGGGCATT





GCTTTTGATGGCGATTTTGACCGCTGTTTCCTGTTTGACGAAAAAGGGCA





GTTTATTGAGGGCTACTACATTGTCGGCCTGTTGGCAGAAGCATTCCTCG





AAAAAAATCCCGGCGCGAAGATCATCCACGATCCACGTCTCTCCTGGAAC





ACCGTTGATGTGGTGACTGCCGCAGGTGGCACGCCGGTAATGTCGAAAAC





CGGACACGCCTTTATTAAAGAACGTATGCGCAAGGAAGACGCCATCTATG





GTGGCGAAATGAGCGCCCACCATTACTTCCGTGATTTCGCTTACTGCGAC





AGCGGCATGATCCCGTGGCTGCTGGTCGCCGAACTGGTGTGCCTGAAAGA





TAAAACGCTGGGCGAACTGGTACGCGACCGGATGGCGGCGTTTCCGGCAA





GCGGTGAGATCAACAGCAAACTGGCGCAACCCGTTGAGGCGATTAACCGC





GTGGAACAGCATTTTAGCCGTGAGGCGCTGGCGGTGGATCGCACCGATGG





CATCAGCATGACCTTTGCCGACTGGCGCTTTAACCTGCGCACCTCCAATA





CCGAACCGGTGGTGCGCCTGAATGTGGAATCGCGCGGTGATGTGCCGCTG





ATGGAAGCGCGAACGCGAACTCTGCTGACGTTGCTGAACGAGTAATGTCG





GATCTTCCCTTACCCCACTGCGGGTAAGGGGCTAATAACAGGAACAACGA





TGATTCCGGGGATCCGTCGACCTGCAGTTCGAAGTTCCTATTCTCTAGAA





AGTATAGGAACTTCGAAGCAGCTCCAGCCTACAGTTAACAAAGCGGCATA





TTGATATGAGCTTACGTGAAAAAACCATCAGCGGCGCGAAGTGGTCGGCG





ATTGCCACGGTGATCATCATCGGCCTCGGGCTGGTGCAGATGACCGTGCT





GGCGCGGATTATCGACAACCACCAGTTCGGCCTGCTTACCGTGTCGCTGG





TGATTATCGCGCTGGCAGATACGCTTTCTGACTTCGGTATCGCTAACTCG





ATTATTCAGCGAAAAGAAATCAGTCACCTTGAACTCACCACGTTGTACTG





GCTGAACGTCGGGCTGGGGATCGTGGTGTGCGTGGCGGTGTTTTTGTTGA





GTGATCTCATCGGCGACGTGCTGAATAACCCGGACCTGGCACCGTTGATT





AAAACATTATCGCTGGCGTTTGTGGTAATCCCCCACGGGCAACAGTTCCG





CGCGTTGATGCAAAAAGAGCTGGAGTTCAACAAAATCGGCATGATCGAAA





CCAGCGCGGTGCTGGCGGGCTTCACTTGTACGGTGGTTAGCGCCCATTTC





TGGCCGCTGGCGATGACCGCGATCCTCGGTTATCTGGTCAATAGTGCGGT





GAGAACGCTGCTGTTTGGCTACTTTGGCCGCAAAATTTATCGCCCCGGTC





TGCATTTCTCGCTGGCGTCGGTGGCACCGAACTTACGCTTTGGTGCCTGG





CTGACGGCGGACAGCATCATCAACTATCTCAATACCAACCTTTCAACGCT





CGTGCTGGCGCGTATTCTCGGCGCGGGCGTGGCAGGGGGATACAACCTGG





CGTACAACGTGGCCGTTGTGCCACCGATGAAGCTGAACCCAATCATCACC





CGCGTGTTGTTTCCGGCATTCGCCAAAATTCAGGACGATACCGAAAAGCT





GCGTGTTAACTTCTACAAGCTGCTGTCGGTAGTGGGGATTATCAACTTTC





CGGCGCTGCTCGGGCTAATGGTGGTGTCGAATAACTTTGTACCGCTGGTC





TTTGGTGAGAAGTGGAACAGCATTATTCCGGTGCTGCAATTGCTGTGTGT





GGTGGGTCTGCTGCGCTCCG
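The FRT scar left at a deletion junction such as ΔwcaJ::FRT can be located by a simple string search on both strands. The Python sketch below is a hedged illustration: the function names are hypothetical, the 34-bp site used is the commonly cited minimal FRT sequence (an assumption, although it is consistent with the listing above), and the test fragment consists of two consecutive 50-nucleotide lines copied from that listing.

```python
# Commonly cited 34-bp minimal FRT site (assumption; matches the listing above).
FRT_SITE = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_frt(dna: str) -> list[tuple[int, str]]:
    """Return (0-based position, strand) for every FRT site found in dna."""
    dna = dna.upper()
    hits = []
    for strand, site in (("+", FRT_SITE), ("-", reverse_complement(FRT_SITE))):
        start = dna.find(site)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(site, start + 1)
    return sorted(hits)


# Two consecutive lines copied from the ΔwcaJ::FRT listing above.
JUNCTION = (
    "TGATTCCGGGGATCCGTCGACCTGCAGTTCGAAGTTCCTATTCTCTAGAA"
    "AGTATAGGAACTTCGAAGCAGCTCCAGCCTACAGTTAACAAAGCGGCATA"
)

if __name__ == "__main__":
    for position, strand in find_frt(JUNCTION):
        print(f"FRT site at position {position} on the {strand} strand")
```

Because the FRT spacer is asymmetric, the site and its reverse complement differ, so the strand reported by the search also indicates the orientation of the scar.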






Third, the magnitude of the cytoplasmic GDP-fucose pool was enhanced by the introduction of a null mutation into the lon gene. Lon is an ATP-dependent intracellular protease that is responsible for degrading RcsA, which is a positive transcriptional regulator of colanic acid biosynthesis in E. coli (Gottesman, S. & Stout, V. Mol Microbiol 5, 1599-1606 (1991)). In a lon null background, RcsA is stabilized, RcsA levels increase, the genes responsible for GDP-fucose synthesis in E. coli are up-regulated, and intracellular GDP-fucose concentrations are enhanced. The lon gene was almost entirely deleted and replaced by an inserted functional, wild-type, but promoter-less E. coli lacZ+ gene (Δlon::(kan, lacZ+)). λ Red recombineering was used to perform the construction. A schematic of the kan, lacZ+ insertion into the lon locus is shown in FIG. 14.


Genomic DNA sequence surrounding the lacZ+ insertion into the lon region in the E. coli strain is set forth below (SEQ ID NO: 290):









GTGGATGGAAGAGGTGGAAAAAGTGGTTATGGAGGAGTGGGTAATTGATG





GTGAAAGGAAAGGGTTGGTGATTTATGGGAAGGGGGAAGGGGAAGAGGGA





TGTGGTGAATAATTAAGGATTGGGATAGAATTAGTTAAGGAAAAAGGGGG





GATTTTATGTGGGGTTTAATTTTTGGTGTATTGTGGGGGTTGAATGTGGG





GGAAAGATGGGGATATAGTGAGGTAGATGTTAATAGATGGGGTGAAGGAG





AGTGGTGTGATGTGATTAGGTGGGGGAAATTAAAGTAAGAGAGAGGTGTA





TGATTGGGGGGATGGGTGGAGGTGGAGTTGGAAGTTGGTATTGTGTAGAA





AGTATAGGAAGTTGAGAGGGGTTTTGAAGGTGAGGGTGGGGGAAGGAGTG





AGGGGGGAAGGGGTGGTAAAGGAAGGGGAAGAGGTAGAAAGGGAGTGGGG





AGAAAGGGTGGTGAGGGGGGATGAATGTGAGGTAGTGGGGTATGTGGAGA





AGGGAAAAGGGAAGGGGAAAGAGAAAGGAGGTAGGTTGGAGTGGGGTTAG





ATGGGGATAGGTAGAGTGGGGGGTTTTATGGAGAGGAAGGGAAGGGGAAT





TGGGAGGTGGGGGGGGGTGTGGTAAGGTTGGGAAGGGGTGGAAAGTAAAG





TGGATGGGTTTGTTGGGGGGAAGGATGTGATGGGGGAGGGGATGAAGATG





TGATGAAGAGAGAGGATGAGGATGGTTTGGGATGATTGAAGAAGATGGAT





TGGAGGGAGGTTGTGGGGGGGGTTGGGTGGAGAGGGTATTGGGGTATGAG





TGGGGAGAAGAGAGAATGGGGTGGTGTGATGGGGGGGTGTTGGGGGTGTG





AGGGGAGGGGGGGGGGGTTGTTTTTGTGAAGAGGGAGGTGTGGGGTGGGG





TGAATGAAGTGGAGGAGGAGGGAGGGGGGGTATGGTGGGTGGGGAGGAGG





GGGGTTGGTTGGGGAGGTGTGGTGGAGGTTGTGAGTGAAGGGGGAAGGGA





GTGGGTGGTATTGGGGGAAGTGGGGGGGGAGGATGTGGTGTGATGTGAGG





TTGGTGGTGGGGAGAAAGTATGGATGATGGGTGATGGAATGGGGGGGGTG





GATAGGGTTGATGGGGGTAGGTGGGGATTGGAGGAGGAAGGGAAAGATGG





GATGGAGGGAGGAGGTAGTGGGATGGAAGGGGGTGTTGTGGATGAGGATG





ATGTGGAGGAAGAGGATGAGGGGGTGGGGGGAGGGGAAGTGTTGGGGAGG





GTGAAGGGGGGATGGGGGAGGGGGAGGATGTGGTGGTGAGGGATGGGGAT





GGGTGGTTGGGGAATATGATGGTGGAAAATGGGGGGTTTTGTGGATTGAT





GGAGTGTGGGGGGGTGGGTGTGGGGGAGGGGTATGAGGAGATAGGGTTGG





GTAGGGGTGATATTGGTGAAGAGGTTGGGGGGGAATGGGGTGAGGGGTTG





GTGGTGGTTTAGGGTATGGGGGGTGGGGATTGGGAGGGGATGGGGTTGTA





TGGGGTTGTTGAGGAGTTGTTGTAATAAGGGGATGTTGAAGTTGGTATTG





GGAAGTTGGTATTGTGTAGAAAGTATAGGAAGTTGGAAGGAGGTGGAGGG





TAGATAAAGGGGGGGGTTATTTTTGAGAGGAGAGGAAGTGGTAATGGTAG





GGAGGGGGGGTGAGGTGGAATTGGGGGGATAGTGAGGGGGTGGAGGAGTG





GTGGGGAGGAATGGGGATATGGAAAGGGTGGATATTGAGGGATGTGGGTT





GTTGGGGGTGGAGGAGATGGGGATGGGTGGTTTGGATGAGTTGGTGTTGA





GTGTAGGGGGTGATGTTGAAGTGGAAGTGGGGGGGGGAGTGGTGTGGGGG





ATAATTGAATTGGGGGGTGGGGGAGGGGAGAGGGTTTTGGGTGGGGAAGA





GGTAGGGGGTATAGATGTTGAGAATGGGAGATGGGAGGGGTGAAAAGAGG





GGGGAGTAAGGGGGTGGGGATAGTTTTGTTGGGGGGGTAATGGGAGGGAG





TTTAGGGGGTGTGGTAGGTGGGGGAGGTGGGAGTTGAGGGGAATGGGGGG





GGGATGGGGTGTATGGGTGGGGAGTTGAAGATGAAGGGTAATGGGGATTT





GAGGAGTAGGATGAATGGGGTAGGTTTTGGGGGTGATAAATAAGGTTTTG





GGGTGATGGTGGGAGGGGTGAGGGGTGGTAATGAGGAGGGGATGAGGAAG





TGTATGTGGGGTGGAGTGGAAGAAGGGTGGTTGGGGGTGGTAATGGGGGG





GGGGGTTGGAGGGTTGGAGGGAGGGGTTAGGGTGAATGGGGGTGGGTTGA





GTTAGGGGAATGTGGTTATGGAGGGGTGGAGGGGTGAAGTGATGGGGGAG





GGGGGTGAGGAGTTGTTTTTTATGGGGAATGGAGATGTGTGAAAGAAAGG





GTGAGTGGGGGTTAAATTGGGAAGGGTTATTAGGGAGGTGGATGGAAAAA





TGGATTTGGGTGGTGGTGAGATGGGGGATGGGGTGGGAGGGGGGGGGGAG





GGTGAGAGTGAGGTTTTGGGGGAGAGGGGAGTGGTGGGAGGGGGTGATGT





GGGGGGGTTGTGAGGATGGGGTGGGGTTGGGTTGGAGTAGGGGTAGTGTG





AGGGAGAGTTGGGGGGGGGTGTGGGGGTGGGGTAGTTGAGGGAGTTGAAT





GAAGTGTTTAGGTTGTGGAGGGAGATGGAGAGGGAGTTGAGGGGTTGGGA





GGGGGTTAGGATGGAGGGGGAGGATGGAGTGGAGGAGGTGGTTATGGGTA





TGAGGGAAGAGGTATTGGGTGGTGAGTTGGATGGTTTGGGGGGATAAAGG





GAAGTGGAAAAAGTGGTGGTGGTGTTTTGGTTGGGTGAGGGGTGGATGGG





GGGTGGGGTGGGGAAAGAGGAGAGGGTTGATAGAGAAGTGGGGATGGTTG





GGGGTATGGGGAAAATGAGGGGGGTAAGGGGAGGAGGGGTTGGGGTTTTG





ATGATATTTAATGAGGGAGTGATGGAGGGAGTGGGAGAGGAAGGGGGGGT





GTAAAGGGGGATAGTGAGGAAAGGGGTGGGAGTATTTAGGGAAAGGGGGA





AGAGTGTTAGGGATGGGGTGGGGGTATTGGGAAAGGATGAGGGGGGGGGT





GTGTGGAGGTAGGGAAAGGGATTTTTTGATGGAGGATTTGGGGAGAGGGG





GGAAGGGGTGGTGTTGATGGAGGGGGGGGTAGATGGGGGAAATAATATGG





GTGGGGGTGGTGTGGGGTGGGGGGGGTTGATAGTGGAGGGGGGGGGAAGG





ATGGAGAGATTTGATGGAGGGATAGAGGGGGTGGTGATTAGGGGGGTGGG





GTGATTGATTGGGGAGGGAGGAGATGATGAGAGTGGGGTGATTAGGATGG





GGGTGGAGGATTGGGGTTAGGGGTTGGGTGATGGGGGGTAGGGAGGGGGG





ATGATGGGTGAGAGGATTGATTGGGAGGATGGGGTGGGTTTGAATATTGG





GTTGATGGAGGAGATAGAGGGGGTAGGGGTGGGAGAGGGTGTAGGAGAGG





GGATGGTTGGGATAATGGGAAGAGGGGAGGGGGTTAAAGTTGTTGTGGTT





GATGAGGAGGATATGGTGGAGGATGGTGTGGTGATGGATGAGGTGAGGAT





GGAGAGGATGATGGTGGTGAGGGTTAAGGGGTGGAATGAGGAAGGGGTTG





GGGTTGAGGAGGAGGAGAGGATTTTGAATGGGGAGGTGGGGGAAAGGGAG





ATGGGAGGGTTGTGGTTGAATGAGGGTGGGGTGGGGGGTGTGGAGTTGAA





GGAGGGGAGGATAGAGATTGGGGATTTGGGGGGTGGAGAGTTTGGGGTTT





TGGAGGTTGAGAGGTAGTGTGAGGGGATGGGGATAAGGAGGAGGGTGATG





GATAATTTGAGGGGGGAAAGGGGGGGTGGGGGTGGGGAGGTGGGTTTGAG





GGTGGGATAAAGAAAGTGTTAGGGGTAGGTAGTGAGGGAAGTGGGGGGAG





ATGTGAAGTTGAGGGTGGAGTAGAGGGGGGGTGAAATGATGATTAAAGGG





AGTGGGAAGATGGAAATGGGTGATTTGTGTAGTGGGTTTATGGAGGAAGG





AGAGGTGAGGGAAAATGGGGGTGATGGGGGAGATATGGTGATGTTGGAGA





TAAGTGGGGTGAGTGGAGGGGAGGAGGATGAGGGGGAGGGGGTTTTGTGG





GGGGGGTAAAAATGGGGTGAGGTGAAATTGAGAGGGGAAAGGAGTGTGGT





GGGGGTAAGGGAGGGAGGGGGGGTTGGAGGAGAGATGAAAGGGGGAGTTA





AGGGGATGAAAAATAATTGGGGTGTGGGGTTGGTGTAGGGAGGTTTGATG





AAGATTAAATGTGAGGGAGTAAGAAGGGGTGGGATTGTGGGTGGGAAGAA





AGGGGGGATTGAGGGTAATGGGATAGGTGAGGTTGGTGTAGATGGGGGGA





TGGTAAGGGTGGATGTGGGAGTTTGAGGGGAGGAGGAGAGTATGGGGGTG





AGGAAGATGGGAGGGAGGGAGGTTTGGGGGAGGGGTTGTGGTGGGGGAAA





GGAGGGAAAGGGGGATTGGGGATTGAGGGTGGGGAAGTGTTGGGAAGGGG





GATGGGTGGGGGGGTGTTGGGTATTAGGGGAGGTGGGGAAAGGGGGATGT





GGTGGAAGGGGATTAAGTTGGGTAAGGGGAGGGTTTTGGGAGTGAGGAGG





TTGTAAAAGGAGGGGGAGTGAATGGGTAATGATGGTGATAGTAGGTTTGG





TGAGGTTGTGAGTGGAAAATAGTGAGGTGGGGGAAAATGGAGTAATAAAA





AGAGGGGTGGGAGGGTAATTGGGGGTTGGGAGGGTTTTTTTGTGTGGGTA





AGTTAGATGGGGGATGGGGGTTGGGGTTATTAAGGGGTGTTGTAAGGGGA





TGGGTGGGGTGATATAAGTGGTGGGGGTTGGTAGGTTGAAGGATTGAAGT





GGGATATAAATTATAAAGAGGAAGAGAAGAGTGAATAAATGTGAATTGAT





GGAGAAGATTGGTGGAGGGGGTGATATGTGTAAAGGTGGGGGTGGGGGTG





GGTTAGATGGTATTATTGGTTGGGTAAGTGAATGTGTGAAAGAAGG






Fourth, a thyA (thymidylate synthase) mutation was introduced into the strain by P1 transduction. In the absence of exogenous thymidine, thyA strains are unable to make DNA and die. The defect can be complemented in trans by supplying a wild-type thyA gene on a multicopy plasmid (Belfort, M., Maley, G. F., and Maley, F. (1983). Proc Natl Acad Sci USA 80, 1858-861). This complementation was used here as a means of plasmid maintenance.


An additional modification that is useful for increasing the cytoplasmic pool of free lactose (and hence the final yield of 2′-FL) is the incorporation of a lacA mutation. LacA is a lactose acetyltransferase that is only active when high levels of lactose accumulate in the E. coli cytoplasm. High intracellular osmolarity (e.g., caused by a high intracellular lactose pool) can inhibit bacterial growth, and E. coli has evolved a mechanism for protecting itself from high intracellular osmolarity caused by lactose by “tagging” excess intracellular lactose with an acetyl group using LacA, and then actively expelling the acetyl-lactose from the cell (Danchin, A. Bioessays 31, 769-773 (2009)). Production of acetyl-lactose in E. coli engineered to produce 2′-FL or other human milk oligosaccharides is therefore undesirable: it reduces overall yield. Moreover, acetyl-lactose is a side product that complicates oligosaccharide purification schemes. The incorporation of a lacA mutation resolves these problems. Sub-optimal production of fucosylated oligosaccharides occurs in strains lacking either or both of the mutations in the colanic acid pathway and the lon protease. Diversion of lactose into a side product (acetyl-lactose) occurs in strains that do not contain the lacA mutation. A schematic of the lacA deletion and corresponding genomic sequence is provided above (SEQ ID NO: 288).


The strain used to test the different α(1,2) FT candidates incorporates all the above genetic modifications and has the following genotype:


ΔampC::PtrpBcI, Δ(lacI-lacZ)::FRT,PlacIqlacY+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ+), ΔlacA


The E. coli strains harboring the different α(1,2) FT candidate expression plasmids were analyzed. Strains were grown in selective media (lacking thymidine) to early exponential phase. Lactose was then added to a final concentration of 0.5%, and tryptophan (200 μM) was added to induce expression of each candidate α(1,2) FT from the PL promoter. At the end of the induction period (~24 h), equivalent OD600 units of each strain and the culture supernatant were harvested. Lysates were prepared and analyzed for the presence of 2′-FL by thin layer chromatography (TLC).
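The induction additions described above reduce to simple arithmetic for a given culture volume. The sketch below is illustrative and rests on stated assumptions: the 0.5% lactose figure is interpreted as weight per volume, the molar mass of L-tryptophan is taken as 204.23 g/mol, the volume change caused by the additions is neglected, and the function name is hypothetical.

```python
TRYPTOPHAN_G_PER_MOL = 204.23  # molar mass of L-tryptophan


def induction_additions(culture_volume_ml,
                        lactose_percent_wv=0.5,
                        tryptophan_um=200.0):
    """Return (lactose in g, tryptophan in mg) for the stated final concentrations.

    Assumes % means g per 100 ml and ignores the volume of the additions.
    """
    lactose_g = lactose_percent_wv / 100.0 * culture_volume_ml
    volume_l = culture_volume_ml / 1000.0
    tryptophan_mol = tryptophan_um * 1e-6 * volume_l
    tryptophan_mg = tryptophan_mol * TRYPTOPHAN_G_PER_MOL * 1000.0
    return lactose_g, tryptophan_mg


if __name__ == "__main__":
    lactose_g, tryptophan_mg = induction_additions(1000.0)
    print(f"lactose: {lactose_g:.2f} g, tryptophan: {tryptophan_mg:.1f} mg")
```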


A map of plasmid pG217, which carries the B. vulgatus FutN, is shown in FIG. 11. The sequence of plasmid pG217 is set forth below (SEQ ID NO: 291):









TCTAGAATTCTAAAAATTGATTGAATGTATGCAAATAAATGCATACA





CCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTG





TAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGG





GCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAA





AAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGAT





TTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGG





CGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGG





TGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACA





TTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGT





ATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGG





GCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGC





TGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCA





GAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGA





ATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATT





CTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTAT





TGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCG





TTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGG





TTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAA





AGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGT





TTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGC





TCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGC





GTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGC





CGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCG





CTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGT





TCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACC





GCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGA





CACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAG





AGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTA





ACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTG





AAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAA





ACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGA





TTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCT





ACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTT





GGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATT





AAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGG





TGTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGAT





CTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGA





TAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATG





ATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAA





CCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTAT





CCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGT





AGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGG





CATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCG





GTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAA





AAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTT





GGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTC





TTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTAC





TCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTC





TTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTT





TAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCA





AGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC





ACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGT





GAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCG





ACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTG





AAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAAT





GTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGA





AAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAAC





CTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTCGCGCGTTTCG





GTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTC





ACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGG





CGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGG





CATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAA





TACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTC





AACCTGTATATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTT





GTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCC





GCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGC





ATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCAT





GATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAG





GAACCATGAAACAGTATTTAGAACTGATGCAAAAAGTGCTCGACGAA





GGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTT





TGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGA





CAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAACTGCTGTGG





TTTCTGCAGGGCGACACTAACATTGCTTATCTACACGAAAACAATGT





CACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAG





TGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCAT





ATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAAcGACCCGGA





TTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCGAACTGGATA





AAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCA





GACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTT





CCTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTATTGGTGCATA





TGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACC





GGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCT





GCAATTAAGCCGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAAC





GTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGACTTTGAGATT





GAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCTA





ATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTT





TTACCCTCCGTTAAATTCTTCGAGACGCCTTCCCGAAGGCGCCATTC





GCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCT





CTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGA





TTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAAC





GACGGCCAGTGCCAAGCTTTCTTTAATGAAGCAGGGCATCAGGACGG





TATCTTTGTGGAGAAAGCAGAGTAATCTTATTCAGCCTGACTGGTGG





GAAACCACCAGTCAGAATGTGTTAGCGCATGTTGACAAAAATACCAT





TAGTCACATTATCCGTCAGTCGGACGACATGGTAGATAACCTGTTTA





TTATGCGTTTTGATCTTACGTTTAATATTACCTTTATGCGATGAAAC





GGTCTTGGCTTTGATATTCATTTGGTCAGAGATTTGAATGGTTCCCT





GACCTGCCATCCACATTCGCAACATACTCGATTCGGTTCGGCTCAAT





GATAACGTCGGCATATTTAAAAACGAGGTTATCGTTGTCTCTTTTTT





CAGAATATCGCCAAGGATATCGTcGAGAGATTCCGGTTTAATCGATT





TAGAACTGATCAATAAATTTTTTCTGACCAATAGATATTCATCAAAA





TGAACATTGGCAATTGCCATAAAAACGATAAATAACGTATTGGGATG





TTGATTAATGATGAGCTTGATACGCTGACTGTTAGAAGCATCGTGGA





TGAAACAGTCCTCATTAATAAACACCACTGAAGGGCGCTGTGAATCA





CAAGCTATGGCAAGGTCATCAACGGTTTCAATGTCGTTGATTTCTCT





TTTTTTAACCCCTCTACTCAACAGATACCCGGTTAAACCTAGTCGGG





TGTAACTACATAAATCCATAATAATCGTTGACATGGCATACCCTCAC





TCAATGCGTAACGATAATTCCCCTTACCTGAATATTTCATCATGACT





AAACGGAACAACATGGGTCACCTAATGCGCCACTCTCGCGATTTTTC





AGGCGGACTTACTATCCCGTAAAGTGTTGTATAATTTGCCTGGAATT





GTCTTAAAGTAAAGTAAATGTTGCGATATGTGAGTGAGCTTAAAACA





AATATTTCGCTGCAGGAGTATCCTGGAAGATGTTCGTAGAAGCTTAC





TGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTG





TACTACCCTGTACGATTACTGCAGCTCGAGTTAGGATACCGGCACTT





TGATCCAACCAGTCGGGTAGATATCCGGTGCTTCGGAGTGCTGGAAC





CAACGGCTCGGCACAATAACAGTCTTATCCATATTAGGGTTCAGCCA





GGCACCCCACCAAGAAAACGTGCTGTTAcAAATGATGTGATGTTTGC





AATGAGACATCAGCATCATATCCTGCCAGGAGTCTTCATCAGTGTTC





CAGTCAATATAAACCGCATTCTGCAGTGGCAGATTTTCTTTAACCCA





CGCGATATCGTCGGAGAAGATATAGTAAGATGGGCTAGCAACACGAC





GGGACATTTCCGCGATAGCATTCTGGTAATACGGCAGCTGGCACACG





GAACCGGTAGTAGCCCAGTGTTTCGGCTGCAGATAGTCACCACGACG





AATGTGCAGGGAAACCGCGTTTTCATCTTTGTCCAGGATTTCCAGCA





TGTTCAGGCTGCGGGAATTTGCTTTGTTCTTATCAAAGGTGAAGGAT





TCACGCACTTCGTCTTTGATATCAGCGAAGAAACGCTCGCTCTGATA





GAAACCTTTAAAGTACAGCAGCGGCCAGAAATACTTCTTCTCGAACG





CACGCAGAGAGTTCGGCGCCTGCTTGCGTTCGTAGATTTTTTTAAAA





AACAGGAATTCGATAACTTTTTTCAGCGGTTGGTTGATGCAGAATTC





GGTGTGCGGCAGGTTGAACACGCGGTGCATTTCGTAACCGTAATGGA





CTTTGTAATGCATCATGTCGCTCAGGTCGATACGGACCTTCGGGTAA





TACTTTTTCATACGCAGATAGAAAGCATAGATAAACATCTGGTTGCC





CAGACCGCCAGTCACTTTGATCAGACGCATTATATCTCCTTCTTG






Fucosylated oligosaccharides produced by metabolically engineered E. coli cells are purified from culture broth post-fermentation. An exemplary procedure comprises five steps. (1) Clarification: Fermentation broth is harvested and cells removed by sedimentation in a preparative centrifuge at 6000×g for 30 min. Each bioreactor run yields about 5-7 L of partially clarified supernatant. (2) Product capture on coarse carbon: A column packed with coarse carbon (Calgon 12×40 TR) of ~1000 ml volume (dimension 5 cm diameter×60 cm length) is equilibrated with 1 column volume (CV) of water and loaded with clarified culture supernatant at a flow rate of 40 ml/min. This column has a total capacity of about 120 g of sugar. Following loading and sugar capture, the column is washed with 1.5 CV of water, then eluted with 2.5 CV of 50% ethanol or 25% isopropanol (lower concentrations of ethanol at this step (25-30%) may be sufficient for product elution). This solvent elution step releases about 95% of the total bound sugars on the column and a small portion of the color bodies. In this first step, capture of the maximal amount of sugar is the primary objective. Resolution of contaminants is not an objective. (3) Evaporation: A volume of 2.5 L of ethanol or isopropanol eluate from the capture column is rotary-evaporated at 56° C. and a sugar syrup in water is generated. Alternative methods that could be used for this step include lyophilization or spray-drying. (4) Flash chromatography on fine carbon and ion exchange media: A column (GE Healthcare HiScale50/40, 5×40 cm, max pressure 20 bar) connected to a Biotage Isolera One FLASH Chromatography System is packed with 750 ml of a Darco Activated Carbon G60 (100-mesh): Celite 535 (coarse) 1:1 mixture (both column packings were obtained from Sigma). 
The column is equilibrated with 5 CV of water and loaded with sugar from step 3 (10-50 g, depending on the ratio of 2′-FL to contaminating lactose), using either a celite loading cartridge or direct injection. The column is connected to an evaporative light scattering (ELSD) detector to detect peaks of eluting sugars during the chromatography. A four-step gradient of isopropanol, ethanol or methanol is run in order to separate 2′-FL from monosaccharides (if present), lactose and color bodies. Fractions corresponding to sugar peaks are collected automatically in 120-ml bottles, pooled and directed to step 5. In certain purification runs from longer-than-normal fermentations, passage of the 2′-FL-containing fraction through anion-exchange and cation exchange columns can remove excess protein/DNA/caramel body contaminants. Resins tested successfully for this purpose are Dowex 22.
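The column-volume figures given for the coarse carbon capture step can be converted into liquid volumes and approximate run times. The following sketch is for illustration only: the function names are hypothetical, the nominal column volume of 1000 ml is taken from the text, and applying the stated loading flow rate of 40 ml/min to the equilibration, wash, and elution steps is an assumption.

```python
import math

COLUMN_VOLUME_ML = 1000.0  # nominal column volume stated in the text
FLOW_RATE_ML_MIN = 40.0    # loading flow rate stated in the text

# Column volumes (CV) per step of the coarse carbon capture, as stated.
STEPS_CV = {"equilibration": 1.0, "wash": 1.5, "elution": 2.5}


def cylinder_volume_ml(diameter_cm, length_cm):
    """Geometric volume of a cylindrical column (1 cm^3 equals 1 ml)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * length_cm


def step_volume_ml(column_volumes, cv_ml=COLUMN_VOLUME_ML):
    """Liquid volume corresponding to a number of column volumes."""
    return column_volumes * cv_ml


def minutes_at_flow(volume_ml, flow_ml_min=FLOW_RATE_ML_MIN):
    """Time needed to pass a volume through the column at a fixed flow rate."""
    return volume_ml / flow_ml_min


if __name__ == "__main__":
    print(f"geometric volume of a 5 x 60 cm bed: {cylinder_volume_ml(5, 60):.0f} ml")
    for name, cv in STEPS_CV.items():
        volume = step_volume_ml(cv)
        print(f"{name}: {volume:.0f} ml, about {minutes_at_flow(volume):.1f} min")
    for load_l in (5, 7):
        print(f"loading {load_l} L: about {minutes_at_flow(load_l * 1000):.0f} min")
```

The 2.5 CV elution corresponds to 2.5 L, which agrees with the eluate volume carried into the evaporation step.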


The gene screening approach described herein was successfully utilized to identify new α(1,2) FTs for the efficient biosynthesis of 2′-FL in metabolically engineered E. coli host strains. The results of the screen are summarized in Table 1.


Production Host Strains


E. coli K-12 is a well-studied bacterium which has been the subject of extensive research in microbial physiology and genetics and commercially exploited for a variety of industrial uses. The natural habitat of the parent species, E. coli, is the large bowel of mammals. E. coli K-12 has a history of safe use, and its derivatives are used in a large number of industrial applications, including the production of chemicals and drugs for human administration and consumption. E. coli K-12 was originally isolated from a convalescent diphtheria patient in 1922. Because it lacks virulence characteristics, grows readily on common laboratory media, and has been used extensively for microbial physiology and genetics research, it has become the standard bacteriological strain used in microbiological research, teaching, and production of products for industry and medicine. E. coli K-12 is now considered an enfeebled organism as a result of being maintained in the laboratory environment for over 70 years. As a result, K-12 strains are unable to colonize the intestines of humans and other animals under normal conditions. Additional information on this well known strain is available at http://epa.gov/oppt/biotech/pubs/fra/fra004.htm. In addition to E. coli K12, other bacterial strains are used as production host strains; a variety of bacterial species may be used in the oligosaccharide biosynthesis methods, e.g., Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris. Bacteria of the genus Bacillus may also be used, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans. 
Similarly, bacteria of the genera Lactobacillus and Lactococcus may be modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis. Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein. Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa).


Suitable host strains are amenable to genetic manipulation, e.g., they maintain expression constructs, accumulate precursors of the desired end product, e.g., they maintain pools of lactose and GDP-fucose, and accumulate endproduct, e.g., 2′-FL. Such strains grow well on defined minimal media that contain simple salts and generally a single carbon source. The strains engineered as described above to produce the desired fucosylated oligosaccharide(s) are grown in minimal media. An exemplary minimal medium used in a bioreactor, minimal “FERM” medium, is detailed below.


Ferm (10 liters): Minimal medium comprising:


40 g (NH4)2HPO4


100 g KH2PO4


10 g MgSO4.7H2O


40 g NaOH


1× Trace elements:


1.3 g NTA (nitrilotriacetic acid)


0.5 g FeSO4.7H2O


0.09 g MnCl2.4H2O


0.09 g ZnSO4.7H2O


0.01 g CoCl2.6H2O


0.01 g CuCl2.2H2O


0.02 g H3BO3


0.01 g Na2MoO4.2H2O (pH 6.8)


Water to 10 liters


DF204 antifoam (0.1 ml/L)


150 g glycerol (initial batch growth), followed by fed batch mode with a 90% glycerol-1% MgSO4-1× trace elements feed, at various rates for various times.
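The FERM recipe above is stated for a 10-liter batch and can be scaled linearly to other volumes. The sketch below is a minimal illustration under stated assumptions: every listed mass, including the trace elements, is taken to be per 10 liters of final medium, and the names `FERM_G_PER_10_L`, `scale_recipe`, and `antifoam_ml` are hypothetical.

```python
# Grams per 10 liters, transcribed from the recipe above
# (assumption: the trace element masses are also per 10 liters).
FERM_G_PER_10_L = {
    "(NH4)2HPO4": 40.0,
    "KH2PO4": 100.0,
    "MgSO4.7H2O": 10.0,
    "NaOH": 40.0,
    "NTA": 1.3,
    "FeSO4.7H2O": 0.5,
    "MnCl2.4H2O": 0.09,
    "ZnSO4.7H2O": 0.09,
    "CoCl2.6H2O": 0.01,
    "CuCl2.2H2O": 0.01,
    "H3BO3": 0.02,
    "Na2MoO4.2H2O": 0.01,
    "glycerol (initial batch)": 150.0,
}

ANTIFOAM_ML_PER_L = 0.1  # DF204 antifoam, as stated in the recipe


def scale_recipe(batch_liters, recipe=FERM_G_PER_10_L, reference_liters=10.0):
    """Scale each component mass linearly from the reference volume."""
    factor = batch_liters / reference_liters
    return {name: grams * factor for name, grams in recipe.items()}


def antifoam_ml(batch_liters):
    """Volume of antifoam for a batch at 0.1 ml per liter."""
    return ANTIFOAM_ML_PER_L * batch_liters


if __name__ == "__main__":
    for name, grams in scale_recipe(2.0).items():
        print(f"{name}: {grams:.3f} g")
    print(f"DF204 antifoam: {antifoam_ml(2.0):.2f} ml")
```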


A suitable production host strain is one that is not the same bacterial strain as the source bacterial strain from which the fucosyltransferase-encoding nucleic acid sequence was identified.


Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and a fucosylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. The fucosylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products.


EXAMPLES
Example 1
Identification of Novel α(1,2) Fucosyltransferases

To identify additional novel α(1,2)fucosyltransferases, a multiple sequence alignment query was generated using the alignment algorithm of the CLCbio Main Workbench package, version 6.9 (CLCbio, 10 Rogers Street #101, Cambridge, Mass. 02142, USA) using four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences: H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). This sequence alignment and the percentages of sequence identity between the four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences are shown in FIG. 3. An iterative PSI-BLAST was performed, using the FASTA-formatted multiple sequence alignment as the query, with the NCBI PSI-BLAST program run on a local copy of NCBI BLAST+ version 2.2.29. An initial position-specific scoring matrix file (.pssm) was generated by PSI-BLAST, which was then used to adjust the score of iterative homology search runs. The process was iterated to generate an even larger group of candidates, and the results of each run were used to further refine the matrix.


A portion of the initial position-specific scoring matrix file used is shown below:












Last position-specific scoring matrix computed

       A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
 1 M  −1  −1  −2  −3  −2   0  −2  −3  −2   1   2  −1   6   0  −3  −2  −1  −2  −1   1
 2 A   2  −2   0   4  −2  −1   1  −1  −1  −2  −3  −1  −2  −3  −1   1  −1  −3  −3  −1
 3 F  −2  −3  −3  −4  −3  −3  −3  −3  −1   0   0  −3   0   7  −4  −3  −2   1   3  −1
 4 K   0   3   0  −1  −2   1   0  −1  −1  −3  −3   3  −2  −3  −1   2   0  −3  −2  −2
 5 V  −1  −3  −3  −4  −1  −3  −3  −4  −3   4   2  −3   1   0  −3  −2  −1  −3  −1   3
 6 V  −1  −3  −3  −3  −1  −3  −3  −4  −3   4   1  −3   1  −1  −3  −2   0  −3  −1   3
 7 Q  −1   4   0  −1  −3   4   1  −2   0  −3  −2   3  −1  −3  −2   0  −1  −3  −2  −3
 8 I  −1  −3  −3  −4  −1  −2  −3  −4  −3   3   2  −3   1   0  −3  −2  −1  −3  −1   3
 9 C  −1  −1   0  −1   5   3   0  −2   4  −2  −2   0  −1  −2  −2   0   2  −2  −1  −1
10 G   0  −3  −1  −1  −3  −2  −2   6  −2  −4  −4  −2  −3  −3  −2   0  −2  −3  −3  −3
11 G   0  −3  −1  −1  −3  −2  −2   6  −2  −4  −4  −2  −3  −3  −2   0  −2  −3  −3  −3
12 L  −2  −2  −4  −4  −1  −2  −3  −4  −3   2   4  −3   2   0  −3  −3  −1  −2  −1   1
13 G   0  −3  −1  −1  −3  −2  −2   6  −2  −4  −4  −2  −3  −3  −2   0  −2  −3  −3  −3
14 N  −2  −1   6   1  −3   0   0  −1   1  −4  −4   0  −2  −3  −2   1   0  −4  −2  −3
15 Q  −1   1   0   0  −3   6   2  −2   0  −3  −2   1  −1  −3  −1   0  −1  −2  −2  −2
16 M  −1  −2  −3  −4  −2  −1  −2  −3  −2   1   3  −2   5   0  −3  −2  −1  −2  −1   1
17 F  −2  −3  −3  −4  −3  −3  −4  −3  −1   0   0  −3   0   7  −4  −3  −2   1   3  −1
18 Q  −1   0  −1  −1  −3   5   1  −2   0   1  −1   1   0  −2  −2  −1  −1  −2  −2   0
19 Y  −2  −2  −3  −3  −3  −2  −3  −3   1  −1  −1  −2  −1   5  −3  −2  −2   2   6  −1
20 A   4  −1  −1  −1  −1  −1  −1   0  −2  −2  −2  −1  −1  −2  −1   2   0  −3  −2  −1
21 F  −2  −3  −3  −4  −3  −3  −4  −3  −1   0   0  −3   0   7  −4  −3  −2   1   3  −1
22 A   3  −2  −1  −2  −1  −1  −1   4  −2  −2  −2  −1  −2  −3  −1   1  −1  −3  −2  −1
23 K  −1   0  −1  −2  −3   0  −1  −3   1  −2  −2   3  −1   2  −2  −1  −1   1   6  −2
24 S   2  −1  −1  −2  −1  −1  −1  −1  −2  −1   1  −1   0  −1  −1   3   0  −3  −2   0
25 L  −2   3  −2  −3  −2  −1  −2  −3  −2   1   3   0   1   0  −3  −2  −1  −2  −1   0
26 Q   0   0   0  −1  −2   4   1  −2  −1  −1   0   0   3  −2  −2   2   0  −2  −2  −1
27 K  −1   2   0  −1  −3   1   0  −2  −1  −2  −2   4  −1  −3  −1   0   2  −3  −2  −2
28 H  −1   0   0  −2  −3   0   0  −2   6   1  −1   2  −1  −1  −2  −1  −1  −3   0   0
29 S  −1  −1   3  −1  −2  −1  −1  −2   0  −1   1  −1   0   1  −2   1   0   0   4  −1
30 H  −1  −1   4   0  −3  −1  −1   3   0  −3  −3  −1  −2   0  −3   0  −1  −1   4  −3
31 T  −1  −2  −1  −2  −2  −1  −2  −2  −2   1  −1  −1  −1  −2   5   0   3  −3  −2   0
32 P  −1   0  −2  −1  −3   0  −1  −2  −2  −3  −3   2  −2  −4   7  −1  −1  −4  −3  −3
33 V  −1  −3  −3  −4  −1  −2  −3  −4  −3   2   2  −3   1  −1  −3  −2   0  −3  −1   4
34 L  −2   3  −2  −3  −2  −1  −2  −3   0   0   2   0   1   1  −3  −2  −1   0   4  −1
35 L  −2  −3  −4  −4  −2  −3  −3  −4  −3   3   3  −3   1   3  −3  −3  −1  −1   1   1
36 D  −2  −2   1   6  −4   0   1  −2  −1  −4  −4  −1  −3  −4  −2   0  −1  −5  −3  −4








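The way a position-specific scoring matrix adjusts the score of a candidate can be illustrated with a minimal sketch. It uses only rows 1-5 of the matrix excerpt above and scores a short peptide placed, without gaps, at position 1. The names `AMINO_ACIDS`, `PSSM_ROWS`, and `pssm_score` are illustrative assumptions and are not part of PSI-BLAST; the actual program also handles gaps, extension, and statistics that are not modeled here.

```python
# Column order of the ASCII PSSM, as in the excerpt above.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Rows 1-5 of the excerpt: (query residue, scores in AMINO_ACIDS order).
PSSM_ROWS = [
    ("M", [-1, -1, -2, -3, -2, 0, -2, -3, -2, 1, 2, -1, 6, 0, -3, -2, -1, -2, -1, 1]),
    ("A", [2, -2, 0, 4, -2, -1, 1, -1, -1, -2, -3, -1, -2, -3, -1, 1, -1, -3, -3, -1]),
    ("F", [-2, -3, -3, -4, -3, -3, -3, -3, -1, 0, 0, -3, 0, 7, -4, -3, -2, 1, 3, -1]),
    ("K", [0, 3, 0, -1, -2, 1, 0, -1, -1, -3, -3, 3, -2, -3, -1, 2, 0, -3, -2, -2]),
    ("V", [-1, -3, -3, -4, -1, -3, -3, -4, -3, 4, 2, -3, 1, 0, -3, -2, -1, -3, -1, 3]),
]


def pssm_score(peptide, rows=PSSM_ROWS):
    """Sum position-specific scores for an ungapped peptide starting at row 1."""
    if len(peptide) > len(rows):
        raise ValueError("peptide is longer than the PSSM excerpt")
    total = 0
    for residue, (_query_residue, scores) in zip(peptide, rows):
        total += scores[AMINO_ACIDS.index(residue)]
    return total


if __name__ == "__main__":
    print(pssm_score("MAFKV"))  # first five residues of FutC (SEQ ID NO: 1)
    print(pssm_score("MDFKI"))  # first five residues of FutL (SEQ ID NO: 2)
```

With these five rows, the FutC prefix MAFKV scores 21 and the FutL prefix MDFKI scores 24, which shows that the matrix rewards residues seen across the aligned family rather than only the residues of one query sequence.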

The command line of PSI-BLAST that was used is as follows: psiblast -db <LOCAL NR database name> -max_target_seqs 2500 -in_msa <MSA file in FASTA format> -out <results output file> -outfmt "7 sskingdoms sscinames scomnames sseqid stitle evalue length pident" -out_pssm <PSSM file output> -out_ascii_pssm <PSSM (ascii) output> -num_iterations 6 -num_threads 8
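For scripted use, the same invocation could be assembled as an argument list. The sketch below only builds the list; the function name `build_psiblast_command` and the file and database names in the usage example are hypothetical placeholders, while the flags and values are those given in the command line above.

```python
def build_psiblast_command(db, msa_fasta, results_out, pssm_out, ascii_pssm_out):
    """Return the psiblast argument list with the options stated in the text."""
    return [
        "psiblast",
        "-db", db,
        "-max_target_seqs", "2500",
        "-in_msa", msa_fasta,
        "-out", results_out,
        "-outfmt", "7 sskingdoms sscinames scomnames sseqid stitle evalue length pident",
        "-out_pssm", pssm_out,
        "-out_ascii_pssm", ascii_pssm_out,
        "-num_iterations", "6",
        "-num_threads", "8",
    ]


if __name__ == "__main__":
    # All names below are placeholders, not the files used in the original work.
    cmd = build_psiblast_command(
        "local_nr", "ft_query_msa.fasta", "hits.tsv", "query.pssm", "query_ascii.pssm"
    )
    print(" ".join(cmd))
    # To execute (requires a local BLAST+ installation and database):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the output format as a single list element keeps the quoted specifier string intact without relying on shell quoting.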


This PSI-BLAST search resulted in an initial 2515 hits. Of these, 787 hits had greater than 22% sequence identity to FutC, and 396 hits were greater than 275 amino acids in length. Additional analysis of the hits was performed, including sorting by percentage identity to FutC, comparing the sequences by BLAST to an existing α(1,2) fucosyltransferase inventory (of known α(1,2) fucosyltransferases, to eliminate known lactose-utilizing enzymes and duplicate hits), and manual annotation of hits to identify those originating from bacteria that naturally exist in the gastrointestinal tract. An annotated list of the novel α(1,2) fucosyltransferases identified by this screen is provided in Table 1. Table 1 provides the bacterial species in which the enzyme is found, the GenBank Accession Number, GI Identification Number, amino acid sequence, and % sequence identity to FutC.
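The identity and length cut-offs described above could be applied to the tabular PSI-BLAST output along the following lines. This is a sketch under stated assumptions: the results are tab-separated with the eight columns in the order given in the command line, comment lines begin with "#", the `pident` column is treated as the identity value of interest, and `length` is treated as the length criterion (in BLAST tabular output this field is the alignment length, so the original analysis may have measured length differently). The function names and thresholds as parameters are illustrative.

```python
COLUMNS = ["sskingdoms", "sscinames", "scomnames", "sseqid",
           "stitle", "evalue", "length", "pident"]


def parse_hits(text):
    """Parse tab-separated hit lines, skipping blank and '#' comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            continue  # ignore lines that do not match the assumed layout
        record = dict(zip(COLUMNS, fields))
        record["evalue"] = float(record["evalue"])
        record["length"] = int(record["length"])
        record["pident"] = float(record["pident"])
        hits.append(record)
    return hits


def filter_hits(hits, min_pident=22.0, min_length=275):
    """Keep hits strictly above both the identity and the length thresholds."""
    return [h for h in hits
            if h["pident"] > min_pident and h["length"] > min_length]


def sort_by_identity(hits):
    """Sort hits from highest to lowest percent identity."""
    return sorted(hits, key=lambda h: h["pident"], reverse=True)
```

Whether the two cut-offs were applied jointly or separately is not stated in the text; calling `filter_hits` with only one threshold raised above zero reproduces either reading.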


Multiple sequence alignment of the 4 known α(1,2) FTs used for the PSI-BLAST query and 12 newly identified α(1,2) FTs is shown in FIG. 4.


Example 2
Validation of Novel α(1,2) FTs

To test for lactose-utilizing fucosyltransferase activity, the production of fucosylated oligosaccharides (i.e., 2′-FL) is evaluated in a host organism that expresses the candidate enzyme (i.e., syngene) and which contains both cytoplasmic GDP-fucose and lactose pools. The production of fucosylated oligosaccharides indicates that the candidate enzyme-encoding sequence functions as a lactose-utilizing α(1,2)fucosyltransferase. Of the identified hits, 12 novel α(1,2) fucosyltransferases were further analyzed for their functional capacity to produce 2′-fucosyllactose: Prevotella melaninogenica FutO, Clostridium bolteae FutP, Clostridium bolteae+13 FutP, Lachnospiraceae sp. FutQ, Methanosphaerula palustris FutR, Tannerella sp. FutS, Bacteroides caccae FutU, Butyrivibrio FutV, Prevotella sp. FutW, Parabacteroides johnsonii FutX, Akkermansia muciniphila FutY, Salmonella enterica FutZ, and Bacteroides sp. FutZA.


Syngenes were constructed comprising the 12 novel α(1,2) FTs in the configuration as follows: EcoRI—T7g10 RBS—syngene—XhoI. FIG. 5A and FIG. 5B show the syngene fragments after PCR assembly and gel-purification.


The candidate α(1,2) FTs (i.e., syngenes) were cloned by standard molecular biological techniques into an exemplary expression plasmid pEC2-(T7)-Fut syngene-rcsA-thyA. This plasmid utilizes the strong leftwards promoter of bacteriophage λ (termed PL) to direct expression of the candidate genes (Sanger, F. et al. (1982). J Mol Biol 162, 729-773). The promoter is controllable, e.g., a trp-cI construct is stably integrated into the E. coli host's genome (at the ampC locus), and control is implemented by adding tryptophan to the growth media. Gradual induction of protein expression is accomplished using a temperature sensitive cI repressor. Another similar control strategy (temperature independent expression system) has been described (Mieschendahl et al., 1986, Bio/Technology 4:802-808). The plasmid also carries the E. coli rcsA gene to up-regulate the synthesis of GDP-fucose, a critical precursor for the synthesis of fucosyl-linked oligosaccharides. In addition, the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains by ampicillin selection (for convenience in the laboratory) and a native thyA (thymidylate synthase) gene as an alternative means of selection in thyA hosts.


The expression constructs were transformed into a host strain useful for the production of 2′-FL. The host strain used to test the different α(1,2) FT candidates incorporates all the genetic modifications described above and has the following genotype:


ΔampC::PtrpBcI, Δ(lacI-lacZ)::FRT, PlacIqlacY+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3,lacZ+), ΔlacA


The E. coli strains harboring the different α(1,2) FT candidate expression plasmids were analyzed. Strains were grown in selective media (lacking thymidine) to early exponential phase. Lactose was then added to a final concentration of 0.5%, and tryptophan (200 μM) was added to induce expression of each candidate α(1,2) FT from the PL promoter. At the end of the induction period (~24 h), the culture supernatants and cells were harvested. Heat extracts were prepared from whole cells, and the equivalent of 0.2 OD600 units of each strain was analyzed for the presence of 2′-FL by thin layer chromatography (TLC), along with 2 μl of the corresponding clarified culture supernatant for each strain.



FIG. 6 shows the oligosaccharides produced by the α(1,2) FT-expressing bacteria, as determined by TLC analysis of the culture supernatant and extracts from the bacterial cells. 2′FL was produced by exogenous expression of WbgL (used as control), FutO, FutP, FutQ, FutR, FutS, FutU, FutW, FutX, FutZ, and FutZA.


Table 4 summarizes the fucosyltransferase activity for each candidate syngene as determined by the 2′FL synthesis screen described above. 11 of the 12 candidate α(1,2) FTs were found to have lactose-utilizing fucosyltransferase activity.









TABLE 4
2′FL synthesis screen results

Organism | syngene | 24 h OD (induced) | 2′-FL culture medium | 2′-FL cell extract | Plasmid |
Escherichia coli | WbgL | 9.58 | 5 | 5 | pG204 pEC2-WbgL-rcsA-thyA | E640
Prevotella melaninogenica | FutO | 12.2 | 3 | 2 | pG393 pEC2-(T7)FutO-rcsA-thyA | E985
Clostridium bolteae | FutP | 10.4 | 1 | 2 | pG394 pEC2-(T7)FutP-rcsA-thyA | E986
Lachnospiraceae sp. | FutQ | 10.6 | 3 | 4 | pG395 pEC2-(T7)FutQ-rcsA-thyA | E987
Methanosphaerula palustris | FutR | 11.9 | 0 | 1 | pG396 pEC2-(T7)FutR-rcsA-thyA | E988
Tannerella sp. | FutS | 11.3 | 2 | 3 | pG397 pEC2-(T7)FutS-rcsA-thyA | E989
Bacteroides caccae | FutU | 12.1 | 0 | 2 | pG398 pEC2-(T7)FutU-rcsA-thyA | E990
Butyrivibrio | FutV | 11.3 | 0 | 1 | pG399 pEC2-(T7)FutV-rcsA-thyA | E991
Prevotella sp. | FutW | 10.5 | 3 | 3 | pG400 pEC2-(T7)FutW-rcsA-thyA | E992
Parabacteroides johnsonii | FutX | 10.7 | 3 | 5 | pG401 pEC2-(T7)FutX-rcsA-thyA | E993
Akkermansia muciniphila | FutY | 9.1 | 0 | 0 | pG402 pEC2-(T7)FutY-rcsA-thyA | E994
Salmonella enterica | FutZ | 11.0 | 0 | 3 | pG403 pEC2-(T7)FutZ-rcsA-thyA | E995
Bacteroides sp. | FutZA | 9.9 | 3 | 3 | pG404 pEC2-(T7)FutZA-rcsA-thyA | E996








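The tally of active candidates can be re-derived from the Table 4 scores with a short sketch. It assumes that any nonzero 2′-FL score, in either the culture medium or the cell extract, counts as lactose-utilizing fucosyltransferase activity; the scoring scale itself is not defined in the text, and the names `SCREEN` and `active_candidates` are illustrative.

```python
# Table 4 scores for the 12 candidate syngenes:
# syngene -> (2'-FL culture medium score, 2'-FL cell extract score)
SCREEN = {
    "FutO": (3, 2), "FutP": (1, 2), "FutQ": (3, 4), "FutR": (0, 1),
    "FutS": (2, 3), "FutU": (0, 2), "FutV": (0, 1), "FutW": (3, 3),
    "FutX": (3, 5), "FutY": (0, 0), "FutZ": (0, 3), "FutZA": (3, 3),
}


def active_candidates(screen):
    """Return syngenes with a nonzero score in medium or cell extract (assumed criterion)."""
    return [name for name, (medium, extract) in screen.items()
            if medium > 0 or extract > 0]


if __name__ == "__main__":
    active = active_candidates(SCREEN)
    inactive = [name for name in SCREEN if name not in active]
    print(len(active), "of", len(SCREEN), "active; inactive:", inactive)
```

Under this criterion 11 of the 12 candidates are scored as active, and FutY is the only candidate with no detectable 2′-FL in either fraction.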

Example 3
Characterization of Cultures Expressing Novel α(1,2) FTs

Further characterization of the bacteria expressing the novel α(1,2) FTs FutO, FutQ, and FutX was performed. Specifically, proliferation rate and exogenous α(1,2) FT expression were examined.


Expression plasmids containing fucosyltransferases WbgL (plasmid pG204), FutN (plasmid pG217), and novel α(1,2) FTs FutO (plasmid pG393), FutQ (plasmid pG395), and FutX (plasmid pG401) were introduced into host bacterial strains. For example, the host strain utilized has the following genotype: ΔampC::PtrpBcI, Δ(lacI-lacZ)::FRT, PlacIqlacY+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ+), ΔlacA


Bacterial cultures expressing each exogenous fucosyltransferase were induced by addition of tryptophan (to induce expression of the exogenous fucosyltransferases) in the presence of lactose. Growth of the cultures was monitored by spectrophotometric readings at A600 at the following timepoints: 4 hours and 1 hour before induction, at the time of induction (time 0), and 3 hours, 7 hours, and 24 hours after induction. The results are shown in FIG. 7, and indicate that expression of the exogenous fucosyltransferase did not prevent cell proliferation. Furthermore, the growth curves for the bacterial cultures expressing the novel α(1,2) fucosyltransferases FutO, FutQ, and FutX are similar to those of cultures expressing the known α(1,2) FT enzymes WbgL and FutN.


Protein expression was also assessed for the bacterial cultures expressing each fucosyltransferase after induction. Cultures were induced as described previously, and protein lysates were prepared from the bacterial cultures at the time of induction (0 hours) and at 3 hours, 7 hours, and 24 hours after induction. The protein lysates were run on an SDS-PAGE gel and stained to examine the distribution of proteins at each time point. As shown in FIG. 8, lysates taken at 7 hours and 24 hours after induction showed increases in a protein band at around 20-28 kDa for bacterial cultures expressing exogenous FutN, FutO, and FutX. These results indicate that induction results in significant expression of the exogenous fucosyltransferases.


Finally, additional TLC analysis was performed to assess the efficiency and yield of 2′FL production in bacterial cultures expressing the novel α(1,2) FTs FutO, FutQ, and FutX compared to the known fucosyltransferases WbgL and FutN. Cultures were sampled at 7 hours and 24 hours after induction, and the samples were run out on TLC. FIG. 9A shows the level of 2′FL in the cell supernatant. The level of 2′FL found in the bacterial cells was also examined. As shown in FIG. 9B, 2′FL was produced in cell lysates from bacteria expressing the novel α(1,2) FTs FutO, FutQ, and FutX at 7 hours and 24 hours after induction.


Example 4
FutN Exhibits Increased Efficiency for Production of 2′FL

Fucosylated oligosaccharides produced by E. coli cells metabolically engineered to express B. vulgatus FutN were purified from culture broth post-fermentation.


Fermentation broth was harvested and cells were removed by sedimentation in a preparative centrifuge at 6000×g for 30 min. Each bioreactor run yields about 5-7 L of partially clarified supernatant. A column packed with coarse carbon (Calgon 12×40 TR) of ~1000 ml volume (dimension 5 cm diameter×60 cm length) was equilibrated with 1 column volume (CV) of water and loaded with clarified culture supernatant at a flow rate of 40 ml/min. This column had a total capacity of about 120 g of sugar. Following loading and sugar capture, the column was washed with 1.5 CV of water and then eluted with 2.5 CV of 50% ethanol or 25% isopropanol (lower concentrations of ethanol at this step (25-30%) may be sufficient for product elution). This solvent elution step released about 95% of the total bound sugars on the column and a small portion of color bodies (caramelized sugars). A volume of 2.5 L of ethanol or isopropanol eluate from the capture column was rotary-evaporated at 56° C. to generate a sugar syrup in water. A column (GE Healthcare HiScale50/40, 5×40 cm, max pressure 20 bar) connected to a Biotage Isolera One FLASH Chromatography System was packed with 750 ml of a Darco Activated Carbon G60 (100-mesh): Celite 535 (coarse) 1:1 mixture (both column packings were obtained from Sigma). The column was equilibrated with 5 CV of water and loaded with sugar from step 3 (10-50 g, depending on the ratio of 2′-FL to contaminating lactose), using either a celite loading cartridge or direct injection. The column was connected to an evaporative light scattering (ELSD) detector to detect peaks of eluting sugars during the chromatography. A four-step gradient of isopropanol, ethanol or methanol was run in order to separate 2′-FL from monosaccharides (if present), lactose and color bodies. Fractions corresponding to sugar peaks were collected automatically in 120-ml bottles and pooled.
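The volumes and loading times implied by the capture-column figures above can be checked with a small calculation. The sketch uses only numbers stated in the paragraph (column dimensions, nominal column volume, flow rate, CV multiples, and supernatant volume); the function names are illustrative, and the loading times are simple volume/flow estimates rather than reported values.

```python
import math


def cylinder_volume_ml(diameter_cm, length_cm):
    """Geometric volume of a cylindrical column in ml (1 cm^3 = 1 ml)."""
    return math.pi * (diameter_cm / 2) ** 2 * length_cm


def loading_time_min(volume_l, flow_ml_per_min):
    """Time needed to load a given volume at a constant flow rate."""
    return volume_l * 1000 / flow_ml_per_min


if __name__ == "__main__":
    nominal_cv_ml = 1000  # "~1000 ml" column volume stated in the text
    print(round(cylinder_volume_ml(5, 60)))      # geometric volume of a 5 x 60 cm cylinder
    print(1.5 * nominal_cv_ml)                   # water wash volume, ml
    print(2.5 * nominal_cv_ml)                   # solvent elution volume, ml
    print(loading_time_min(5, 40), loading_time_min(7, 40))  # minutes for 5 L and 7 L
```

The 2.5 CV elution corresponds to the 2.5 L of eluate that is subsequently rotary-evaporated, and loading 5-7 L at 40 ml/min takes roughly 125-175 minutes. The geometric volume of a 5 cm × 60 cm cylinder is somewhat larger than the stated ~1000 ml, which is consistent with the stated figure being approximate.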


The results from two fermentation runs are shown in FIG. 10A and FIG. 10B. The cultures were grown for 136 hours (run 36B) or 112 hours (run 37A), and the levels of 2′-FL produced were analyzed by TLC. As shown in both FIG. 10A and FIG. 10B, 2′-fucosyllactose was produced at 40 hours of culture, and production continued to increase until the end point of the fermentation process. The yield of 2′-FL produced from run 36B was 33 grams per liter. The yield of 2′-FL produced from run 37A was 36.3 grams per liter. These results indicate that expression of exogenous FutN is suitable for high yield of 2′-fucosyllactose product.
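For comparison between the two runs, the reported yields and run times can be converted into an average volumetric productivity. This is a derived figure, not one reported in the text, and it assumes the stated yield applies to the end point of each run; the names `RUNS` and `average_productivity` are illustrative.

```python
# Reported end-point yields and run durations for the two fermentation runs.
RUNS = {
    "36B": {"hours": 136, "yield_g_per_l": 33.0},
    "37A": {"hours": 112, "yield_g_per_l": 36.3},
}


def average_productivity(yield_g_per_l, hours):
    """Average volumetric productivity over the whole run, in g/L/h."""
    return yield_g_per_l / hours


if __name__ == "__main__":
    for name, run in RUNS.items():
        rate = average_productivity(run["yield_g_per_l"], run["hours"])
        print(name, round(rate, 3), "g/L/h")
```

This gives about 0.243 g/L/h for run 36B and about 0.324 g/L/h for run 37A. Because the average is taken over the whole run, including the period before 2′-FL was detected, it understates the rate during the production phase.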
















TABLE 1

Bacterium names | Accession No. | GI No. | Gene name [bacterium] | % identity FutC | Alias | SEQUENCE | SEQ ID NO
Helicobacter pylori | AAD29869.1 | 4808599 | alpha-1,2-fucosyltransferase [Helicobacter pylori] | 98 | FutC | MAFKVVQICGGLGNQMFQYAFAKSLQKHSNTPVLLDITSEDWSDRKMQLELFPINLPYASAKEIAIAKMQHLPKLVRDALKCMGFDRVSQEIVFEYEPELLKPSRLTYFYGYFQDPRYFDAISPLIKQTFTLPPPPENNKNNNKKEEEYHRKLSLILAAKNSVFVHIRRGDYVGIGCQLGIDYQKKALEYMAKRVPNMELFVFCEDLEFTQNLDLGYPFMDMITRNKEEEAYWDMLLMQSCQHGBANSTYSWWAAYLIENPEKIIIGPKHWLFGHENILCKEWVKIESHFEVKSQKYNA | 1
Helicobacter mustelae; Helicobacter mustelae 12198 | YP_003517185.1 | 291277413 | alpha-1,2-fucosyltransferase [Helicobacter mustelae 12198] | 70.85 | FutL | MDFKIVQVHGGLGNQMFQYAFAKSLQTHLNIPVLLDTTWFDYGNRELGLHLFPIDLCICASAQQIAAAHMQNLPRLVRGALRRMGLGRVSKEIVFEYMPELFEPSRIAYFHGYEQDPRYFEDISPLIKQTFTLPHPTEHAEQYSRKLSQILAAKNSVFVHIRRGDYMRLGWQLDISYQLRAIAYMAKRVQNLELFLECEDLEFVQNLDLGYPFVDMTTRDGAAHWDMMLMQSCKHGIITNSTYSWWAAYLIKNPEKIIIGPSHWIYGNENILCKDWVKIESQFETKS | 2
Bacteroides; Bacteroides vulgatus ATCC 8482; Bacteroides sp. 4_3_47FAA; Bacteroides sp. 3_1_40A; Bacteroides vulgatus PC510; Bacteroides vulgatus CL09T03C04; Bacteroides vulgatus dnLKV7; Bacteroides vulgatus CAG:6 | YP_001300461.1 | 150005717 | glycosyl transferase family protein [Bacteroides vulgatus ATCC 8482] | 24.83 | FutN | MRLIKVTGGLGNQMFIYAFYLRMKKYYPKVRIDLSDMMHYKVHYGYEMHRVFNLPHTEFCINQPLKKVIEFLFFKKIYERKQAPNSLRAFEKKYFWPLLYEKGFYQSERFFADIKDEVRESFTFDKNKANSRSLNMLEILDKDENAVSLHIRRGDYLQPKHWATTGSVCQLPYYCINAIAEMSRRVASPSYYIFSDDIAWVKENLPLQNAVYIDWNTDEDSWQDMMLMSHCKHHIICNSTFSWWGAWLNPNIVIDKTVIVPSRWFIDHSEAPDIYPTGWIKVPVS | 3
Escherichia coli; Escherichia coli UMEA 3065-1 | WP_021554465.1 | 545259828 | protein [Escherichia coli] | 23.13 | WbgL | MSIIRLQGGLGNQLFQFSFGYALSKINGTPLYFDISHYAENDDHGGYRLNNLQIPEEYLQYYTPKINNIYKLLVRGSRLYPDIFLELGFCNEFHAYGYDFEYIAQKWKSKKYIGYVVQSEHFFHKHILDLKEFFIPKNVSEQANLLAAKILESQSSLSIHIRRGDYIKNKTATLTHGVCSLEYYKKALNKIRDLAMIRDVFIFSDDIFWCKENIETLLSKKYNIYYSEDLSQEEDLWLMSLANHHIIANSSFSWWGAYLGSSASQIVIYPTPWYDITPKNTYIPIVNHWINVDKHSSC | 4
Helicobacter bilis; Helicobacter bilis ATCC 43879 | WP_005219731.1 | 491361813 | predicted protein [Helicobacter bilis] | 36.79 | FutD | MGDYKIVELTCGLGNQMFQYAFAKALQKHLQVPVLLDKIWYDTQDNSTQFSLDIENVDLEYATNTQIEKAKARVSKLPGLLIIKMFGLKKHNIAYSQSFDFHDEYLLPNDFTYFSGFFQNAKYLKGLEQELKSIFYYDSNNESNEGKQRLELILQAKNSIFINIRRGDYCKIGWELGMDYYKRAIQVIMDRVEEPKFFIFGATDMSFTEQFQKNLGLNENNSANLSEKTITQDNQHEDMFLIVICYCKHAILANSSYSEWSAYLNNDANNIVIAPTPWLLDNDNIICDDWIKISSK | 5
Escherichia coli | AAO37698.1 | 37788088 | fucosyltransferase [Escherichia coli] | 25.94 | Wbs.1 | MEVKIIGGLGNQMFQYATAFAIAKRTHQNLTVDISDAVKYKTHPLRLVELSCSSEEVKKAWPFEKYLFSEKIPHFMKKGMERKHYVEKSLEYDPDIDTKSINKKIVGYFQTEKYFKEFRHELIKEFQPKTKFNSYQNELLNLIKENDTCSLHIRRGDYVSSKIANETHGTCSEKYFERAIDYLMNKGVINKKTLLFIFSDDIKWCRENIFFNNQICFVQGDAYHVELDMLITVISKCKNNIISNSSFSWWAAWLNENKNKTVIAPSKWEKKDIKHDIIPESWVKL | 6
Vibrio cholerae | BAA33632.1 | 3721682 | probable beta-D-galactoside 2-alpha-L-fucosyltransferase [Vibrio cholerae] | 25.94 | WbIA | MIVMKISGGLGNQLFQYAVGRAIAIQYGVPLKLDVSAYKNYKLHNGYRLDQFNINADIANEDEIFHLKGSSNRLSRILRRLGWLKKNTYYAEKQRTIYDVSVEMQAPRYLDGYWQNEQYFSQIRAVLLQELWPNQPLSINAQAHQIKIQQTHAVSIHVRRGDYLNHPEIGVLDIDYYKRAVDYIKEKIEATVEFVF5NDVAWCKDNENFIDSPVFIEDTQTEIDDLMLMCQCQHNIVANSSFSWWAAWLNSNVDKIVIAPKTWMAENPKGYKWVPDSWREI | 7
Bacteroides fragilis; Bacteroides fragilis NCTC 9343; Bacteroides fragilis YCH46; Bacteroides fragilis HMW 615 | YP_099118.1 | 53713126 | alpha-1,2-fucosyltransferase [Bacteroides fragilis YCH46] | 24.58 | Bft2 | MIVSSLRGGLGNQMFIYAMVKAMALRNNVPFAFNETTDEANDEVYKRKLLLSYFALDLPENKKLTFDFSYGNYYRRLSRNLGCHILHPSYRYICEERPPHEESRLISSKITNAFLEGYWQSEKYFLDYKQEIKEDPVIQKKLEYTSYLELEEIKLLDKNAIMIGVRRYQESDVAPGGVLEDDYYKCAMDIMASKVTSPVFFCFSQDLEWVEKHLAGKYPVRLISKKEDDSGTIDDMELMMHERNYIISNSSFYWWGAWLSKYDDKLVIAPGNFINKDSVPESWFKLNVR | 8
Escherichia coli; Escherichia coli KTE84 | WP_001592236.1 | 486356116 | protein [Escherichia coli] | 24.25 | WbgN | MSIVVARLAGGLGNQMFQYAKGYAESVERNSYLKLDLRGYKNYTLHGGFRLDKLNIDNTFVMSKKEMCIFPNFIVRAINKFPKLSLCSKRFESEQYSKKINGSMKGSVEFIGEWQNERYFLEHKEKLREIFTPININLDAKELSDVIRCTNSVSVHIRRGDYVSNVEALKIHGLCTERYYIDSIRYLKERFNNLVFFVESDDIEWCKKYKNEIESRSDDVKFIEGNTQEVDMWLMSNAKYHIIANSSFSWWGAWLKNYDLGITIAPTPWEEREELNSFDPCPEKWVRIEK | 9
Prevotella melaninogenica; Prevotella melaninogenica ATCC 25845 | YP_003814512.1 | 302346214 | glycosyltransferase family 11 [Prevotella melaninogenica ATCC 25845] | 31.1 | FutO | MKIVKILGGLGNQMFQYALYLSLQESFPKERVALDLSSFHGYHLHNGFELENIFSVTAQKASAADIMRIAYYYPNYLLWRIGKRFLPRRKGMCLESSSLREDESVLRQEGNRYFDGYVVQDERYFAAYREKVLKAFTEPAFKRAENLSLLEKLDENSIALHVRRGDYVGNNLYQGICDLDYYRTAIEKMCAHVTPSLFCIFSNDITWCQQHLQPYLKAPVVYVTWNTGVESYRDMQLMSCCAHNIIANSSFSWWGAWLNQNREKVVIAPKKWLNMEECHFTLPASWIKI | 10
Clostridium bolteae; Clostridium bolteae 90A9; Clostridium bolteae 90133; Clostridium bolteae 90B8 | WP_002570768.1 | 488634090 | protein [Clostridium bolteae] | 29.86 | FutP | MFQYALYKAFEQKHIDVYADLAWYKNKSVKFELYNFGIKINVASEKDINRLSDCQADEVSRIRRKIEGKKKSEVSEKNDSCYENDILRMDNVYLSGYWQTEKYFSNTREKLLEDYSFALVNSQVSEWEDSIRNKNSVSIHIRRGDYLQGELYGGICTSLYYAEAIEYIKNARVPNAKFFVFSDDVEWVKQQEDFKGFVIVDRNEYSSALSDMYLMSLCKHNIIANSSFSWWAAWLNRNEEKIVIAPRRWLNGKCTPDIWCKKWIRI | 11
Lachnospiraceae bacterium 3_1_57FAA_CT1 | WP_009251343.1 | 496545268 | protein [Lachnospiraceae bacterium 3_1_57FAA_CT1] | 29.25 | FutQ | MVIVQLSGGLGNQMFEYALYLSLKAKGKEVKIDDVTCYEGPGTRPRQLDVEGITYDRASREELTEMTDASMDALSRVRRKLTGRRTKAYRERDINFDPLVMEKDPALLEGCFQSDKYFRDCEGRVREAYRERGIESGAFPLPEDYLRLEKQIEDCQSVSVHIRRGDYLDESHGGLYTGICTEAYYKEAFARMERLVPGARFFLFSNDPEVVTREHFESKNCVLVEGSTEDTGYMDLYLMSRCRHNIIANSSFSWWGAWLNENPEKKVIAPAKWLNGRECRDIYTERMIRL | 12
Methanosphaerula palustris; Methanosphaerula palustris E1-9c | YP_002467213.1 | 219852781 | glycosyl transferase family protein [Methanosphaerula palustris E1-9c] | 28.52 | FutR | MIIVRLKGGLGNQLSQYALGRKIAHLHNTELKLDTTWFTTISSDTPRTYRLNNYNIIGTIASAKEIGLIERGRAQGRGYLLSKISDLLTPMYRRTYVRERMHTFDKAILTVPDNVYLDGYWQTEKYFKDIEEILRREVTLKDEPDSINLEMAERIQACHSVSLHVRRGDYVSNPTTQQFHGCCSIDYYNRAISLIEEKVDDPSFFIFSDDLPWAKENLDIPGEKTFVAHNGPEKEYCDLWLMSLCQHHIIANSSFSWWGAWLGQDAEKMVIAPRRWALSESFDTSDIIPDSWITI | 13
Tannerella sp. CAG:118 | WP_021929367.1 | 547187521 | glycosyl transferase family 11 [Tannerella sp. CAG:118] | 28.38 | FutS | MVRIVEIIGGLGNQMFWQYAFSLYLKNKSHIWDRLYVDIEAMKTYDRHGLELEKVFNLSLCPISNRLHRNLQKRSFAKHFVKSLYEHSECEFDEPVYRGLRPYRYYRGYWQNEGYFVDIEMIREAFQFNVNILSKKTKAIASKMRRELSVSIHVRRGDYENLPEAKAMHGGICSLDYYHGKAIDFIRQRLDNNICFYLFSDDINWVEENLQLENRCIIDWNQGESDSWQDMYLMSCCRHHIIANSSFSWWAAWLNPNKWFNHTDAVGIVPGSWIKIPVF | 14
Bacteroides caccae; Bacteroides caccae ATCC 43185 | WP_005675707.1 | 491925845 | protein [Bacteroides caccae] | 28.09 | FutU | MKIVKILGGLGNQMFQYALFLSLKERFPHEQVMIDTSCFRNYPLHNGFEVDRIFAQKAPVASWRNILKVAYPYPNYRFWKIGKYILPKRKTMCVERKNFSFDAAVLTRKGDCYYDGYWQHEEYFCDMKETIWEARSFPEPVDGRNKEIGALLQASDSASLHVRRGDYVNHPLFRGICDLDYYKRAIHYMEERVNPQLYCFSNDMAWCESHLRALLPGKEVVYVDWNKGAESYVDMRLMSLCRHNIIANSSFSWWGAWLNRNPQKVVVAPERWMNSPIEDPVSDKWIKL | 15
Butyrivibrio sp. AE2015 | WP_022772718.1 | 551028636 | protein [Butyrivibrio sp. AE2015] | 27.8 | FutV | VIIQLKGGLGNQMFQYALKSLKKRGKEVKIDDKTGFVNDKLRIPVLSRWGVEYDRATDEEIINLTDSKMDLFSRIRRKLTGRKTFRIDEESGKFNPELEKENAYLVGYWQCDKYFDDKDVVREIREAFEKKPQELMTDASSWSTLQQIECCESVSLHVRRTDYVDEEHIHIHNICTEDYYKNAIDRVKQYPSAVFFIFTDDKEWCRDHFKGPNFIVVELEEGDGTDIAEMTLMSRCKHHIIANSSFSWWAAWLNDSPEKIVIAPQKWINNRDMDDIYTERMTKIAL | 16
Prevotella sp. CAG:891 | WP_022481266.1 | 548264264 | uncharacterized protein [Prevotella sp. CAG:891] | 27.4 | FutW | MRLVKMIGGLGNQMFIYAFYLQMRKRFSNVRIDLDMMHYNVHYGYELHKVFGLPRTEFCMNQPLKKVLEFLFFRTIVERKQHGRMEPYTCQVYWPLVYFKGFYQSERYFSEVKDEVRECFTFNPALANRSSQQMMEQIQNDPQAVSIHIRRGDYLNPKHYDTIGCICQLPYYKHAVSEIKKYVSNPHFYVFSEDLDWVKANLPLENAQYIDWNKGADSWQDMMLMSCCKHHIICNSTFSWWAAWLNPSVEKTVIMPEQWTSRQDSVDFVASCGRWVRVKTE | 17
Parabacteroides johnsonii; Parabacteroides johnsonii CL02T1C29 | WP_008155883.1 | 495431188 | glycosyl transferase [Parabacteroides johnsonii] | 26.69 | FutX | MRLIKMIGGLGNQMFIYAFYLKMKHHYPDTNIDLSMNVHYKVHNGYEMNRIFDLSQTEFCINRTLKKILEFLFFKKIYERRQDPSTLYPYEKRYFWPLLYFKGFYQSERFFFDIKDDVRKAFSFNLNIANPESELLKQIEVDDQAVSIHIRRGDYLLPRHWANTGSVCQLPYYKNAIAEMENRITGPSYYVFSDDISWVKENIPLKKAVYVTWNKGEDSWQDMMLMSHCRHHIICNSTFSWWGAWLNPREKIVIAPCRWFQHKEPPDMYPKEWIKVPIN | 18
Akkermansia muciniphila; Akkermansia muciniphila ATCC BAA-835 | YP_001877555.1 | 187735443 | glycosyl transferase family protein [Akkermansia muciniphila ATCC BAA-835] | 25.67 | FutY | MRLFGGLGNQLFQYAFLFALSRQGGKARLETSSYEHDDKRVCELHHFRVSLPIEGGPPPWAFRKSRIPACLRSLFAAPKYPHFREEKRHGFDPGLAAPPRRHTYFKGYFQTEQYFLHCREQLCREFRLKTPLTPENARILEDIRSCCSISLHIRRTDYLSNPYLSPPPLEYYLRSMAEMEGRLAAGAPQESLRYFIFSDDIEWARQNLRPALPHVHVDINDGGTGYFDLELMRNCRHHIIANSTFSWWAAWLNEHAEKIVIAPRIWFNREEGDRYHTDDALIPGSWLRI | 19
Salmonella enterica; Salmonella enterica subsp. enterica serovar Poona str. ATCC BAA-1673 | WP_023214330.1 | 555221695 | fucosyltransferase [Salmonella enterica] | 25.99 | FutZ | MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFADEKEKIKLLRKFKRNPFPKQISEILSIALFGKYALSDRAFYTFETIKNIDKACLFSFYQDADLLNKYKQLILPLFELRDDLLDICKNLELYSLIQRSNNTTALHIRRGDYVTNQHAAKYHGVLDISYYNHAMEYVERERFKQNFIIFSDDVRWAQKAFLENDNCYVINNSDYDFSAIDMYLMSLCKNNIIANSTYSWWGAWLNKYEDKLVISPKQWFLGNNETSLRNASWITL | 20
Bacteroides sp. CAG:633 | WP_022161880.1 | 547748823 | glycosyltransferase family 11 [Bacteroides sp. CAG:633] | 26.01 | FutZA | MRLIKMTGGLGNQMFIYAFYLRMKKRYPKVRIDLSDMNHYHVHHGYEMHRVFNLPHYEFCINQPLKKVIEFLFFKKIYERKQDPNSLRAFEDDYLWPLLFKGFYQSERFFADIKDEVRKAFTFDSSVKVARSAELLRRLDADANAVSLHIRRDGYLQPQHWATTGSVCQLPYYQNAIAEMNRRVAASPYYVFSDDIAWVKENIPLQNAVYIDWNKGEESWQDMMLMSHCRHHIICNSTFSWWGAWLDPHEDKIVIVPNRWFQHCETPNIYPAGWVKVAIN | 21
Clostridium sp. CAG:306 | WP_022247142.1 | 547839506 | alpha-1-2-fucosyltransferase [Clostridium sp. CAG:306] | 34.28 | | MEKIKIVKLQGGMGNQMFQYAFGKGLESKFGCKVLFDKINYDELQKTIINNTGKNAEGICVRKYELGIFNLNIDFATAEQIQECIGEKLNKACYLPGFIRKIFNLSKNKTVSNRIFEKKYGEYDEEILKDYSLAYYDGYFQNPKYFEDISDKIKKEFTLPEIKNHDIYNKKLLEKITQFENSVFIHVRRDDYLNINCEIDLDYYQKAVKYILKHIENPKFFVCAEDPDYIKNHFDIGYDFELVGENNKTQDTYYENMRLMMACKHAIIANSSYSWWAAWLSDYDNKIVIAPTPWLPGISNEIICKNWIQIKRGISNE | 22
Prevotella sp. oral taxon 306; Prevotella sp. oral taxon 306 str. F0472 | WP_009434595.1 | 497004957 | protein [Prevotella sp. oral taxon 306] | 32.11 | | MKIVKILGGLGNQMFQYALYLSLQESFPKERVALDLSCFNGYHLHNGFELERIFSLTAQKASAATIMRIAYYYPNYLLWRIGKRLLPRRKTMCLESSTFRYDESVLTREGNFRYFDGYWQDERYFVACREKVLKAFTFPAFKRTENLSLLRKLDKNSVAIHVRRGDYIGNQLYQGICDLDYYRAAIDKISTYVTPSVFCIFSNDIAWCQTHLQPYLKAPVVYVTWNTGTESYRDMQLMSCCAHNIIANSSFSWWGAWLNQNNEKVVIAKRWLNMDDCQFPLPASWVKI | 23
Brachyspira sp. CAG:484 | WP_021917109.1 | 547139308 | glycosyl transferase family 11 [Brachyspira sp. CAG:484] | 30.14 | | MQLVKLMGGLGNQMFQYAFAKALGDKNILFYDGYKKHSLRKVELNRFKCKAVYIPRELFKYLKFVFTKFDKIEYMRSDIYVPEYLNRDGNHIYIGFWQTEKYFKQIRPRLLKDFTPRKKLDRENAGIISKMQQINSVSVHIRRTDYVDESHIYDGTNLDYYKRAIEYSSKIENPEFFFFSDDMAYVKEKFAGLKFPHSFIDINSGNNSYKDLILMKNCKHNIIANSTFSWWGAWLNENEEKIVIAPAKWFVTGENKDKIVPDEWIKL | 24
Thalassospira profundimaris; Thalassospira profundimaris WP0211 | WP_008889330.1 | 496164823 | glycosyl transferase family 11 [Thalassospira profundimaris] | 30. | | MVIVKLLGGLGNQMFQYATGRAVASRLDVELLLDVSAFAHYDLRRYELDDWNITARLATSEELARSGVTAAPPSFFDRIAFLRIDLPVNCFREASFTYDPRILEVSSPVYLDGYWQSERYFLDIEKKLRQEFQLKASIDANNHSFKKKIDGLGKQAVSLHVRRGDYVTNPQTASYHGVCSLDYYRAAVDYIAEHVSDPCFFVFSDDLEWVQTNLNIKQPIVLVDANGPDNGAADMALMMACRHHIIANSSFSWWGSWLNPLNDKIIVAPKKWFGRANHDTTDLVPDSWVRL | 25
Acetobacter sp. CAG:267 | WP_022078656.1 | 547459369 | alpha-1 2-fucosyltransferase [Acetobacter sp. CAG:267] | 29.99 | | MAVSPQESKYSAHVSPDKPLRIVRLGGGLGNQMFQYAFGLAAGDVLWDNTSFLTNHYRSFDLGLYNISGDFASNEQIKKCKNEIRFKNILPRSIRKKFNLGKFIYLKTNRVCERQINRYEPELLSKDGDVYYDGVFQTEKYFKPLRERLLHDFTLTKPLDAANLDMLAKIRAADAVAVHIRRGDYLNPRSPFTYLDKDYFLNAMDYIGKRVDKPHFFIFSSDTDWVRTNIQTAYPQTIVEINDEKHGYFDLELMRNCRHNIIANSTFSWWGAWLNTNPDKIVVAPKQWFRPDAAEYSGDIVPNKDWIKL | 26
Dysgonomonas mossii; Dysgonomonas mossii DSM 22836 | WP_006842165.1 | 493896281 | protein [Dysgonomonas mossii] | 29.9 | | MVTVLLSGGLGNQMFQYAAAKSLAIRLNTALSVDLYTFSKKTQATVRPYELGIGNIEDVVETSSLKAKAVIKARPFIQRHRSFFQRFGVFTDTYAILYQPTFEALTFFVIMSGYFQNESYFKNISELLRKDFSFKYPLIGENKDVAGQISENQSVAVHIRRGDYLNKNSQSNFAILEKDYYEKAINYISAHVKNPEFYVFSEDFDWIKDNLNFKEFPVTFIDWNKGKDSYIDMQLMSLCKHNIIANSSFSWWSAWLNNSEERKIVAPERWFVDEQKNELLDCFYPQGWIKI | 27
Clostridium sp. KLE 1755 | WP_021636924.1 | 545396671 | glycosyltransferase, family 11 [Clostridium sp. KLE 1755] | 29.83 | | MIIEISGGLGNQMFQYALGQKFISMGKEVKYDLSFYNDRVQTLRQFELDIFDLDCPVASNSELSGNSLKSRLKQKLGWDKEKIYEENLDLGYQPRIFELDDIYSGYWQDSELYFKDIREQILRLYTFPIQLDYMNGVFLRKIENSNSVSIHIRRGDYLNENNLKIYGNICTLNYYNKALQIIAKKITNPIIFVFTNDIEWVRKELEIPNMVIVDCNSGKLSYWDMYLMSKCKANIVANSSFSWWGAWLNKNENRIIISPKRWLNNHEQTSTLCDNWIRCGDD | 28
Gillisia limnaea; Gillisia limnaea DSM 15749 | WP_006988068.1 | 494045950 | alpha-1,2-fucosyltransferase [Gillisia limnaea] | 29.28 | | MFISKNTVIIKLVGGLGNQMFQFAIAKIIAEKEKSEVLVDITFYTELTENTKKFPRHFSLGIFNSSFAIASKKEIDYFTKLSNFNKFKKKLGLNYPTIFHESSFNFKAQVLELKAPIYLNGYFQSFRYFLGKEYVIRKIFKFPDEALDKDNDNIKRKIIGKTSVSLHIRRGDYVNNKKTQQFHGNCTIDYYQSAIAYLSSKLTDFNLIFFSDDIHWVRQQFKNISNQKIYVSGNLNHNSWKDMYLMSLCDHNIIANSSFSWWGAWLNKNPEKIIIAPKRWFADTEQDKNSIDLIPSEWYRI | 29
Methylotenera mobilis; Methylotenera mobilis JLW8 | YP_003048467.1 | 253996403 | glycosyl transferase family protein [Methylotenera mobilis JLW8] | 29.19 | | MLVSRIIGGLGNQMFEYAAARAASLRISVQLKLDLSGFETYDLHAYGLNNFNIVEDVAKKDDYFIGAPESLLKKIKKYLRGLIQLESFRESDLSFDSKVLELNDNTYLDGYWQCERYFIDFDKQIRQDFSFKFAPDALNQRYLELIDSVNAVSVHIRRGDYVSNSTTNEIHGVCDLDYYQRAAEFMRARIGPENLHFFVFSDDTDWVKENISFGSDTTFISHNDAAKNYEDMRLMSACKHHIIANSSFSWWAAWLNPSKQKVVIAPRQWFKSTLLNSDDIVPASWVRL | 30
Runella slithyformis; Runella slithyformis DSM 19594 | YP_004658567.1 | 338214504 | glycosyl transferase family protein [Runella slithyformis DSM 19594] | 29.14 | | WMMVKLSGGLGNQLFQYAFGRHLATVNQKELKLDTSALTKTSDWTNRSYALDAFNIRAQEATPEEIKALAGKPNRLLQRVGRKVGITPIQYFQEPHFHFYSSALSIKSSHYLEGYWQSEKYFEAITPILREEFAFTISPSTHAQTIKEKISNGTSVSIHLRRGDYVKTSKANRYLRPLTMDYYQKAIDYINQRVKNPNFFLFSDDIKWAKSQVTFPPTTHFSTGTSAHEDLWLMTHCRHHIIANSTFSWWGAWLNQQPDKIVIAPQKWFSTERFDTKDLLPEPWIQL | 31
Pseudoalteromonas haloplanktis; Pseudoalteromonas haloplanktis ANT/505 | WP_002958454.1 | 489048235 | alpha-1,2-fucosyltransferase [Pseudoalteromonas haloplanktis] | 29.1 | | MIKVKAIGGLGNQLFQYATARAIAEKRGDGNNNDMSDFSSYKTHPFCLNKFRCKATYESKPKLINKLLSNEIKRNLLQKLGFIKKYFETQLPFNEDVLLNNSINYLTGYFQSEKYFLSIRECLLDELTLIEDLNIAETAVSKAIKNAKNSISIHIRRGDYVSNEGANKTHGVCDSDYFKKALNYFSERKLLDEHTELFIFSDDIEWCRNNLSFDYKMNFVDGSSERPEVDMVLMSQCKHQVISNSTFSWWGAWLNKDEKVVVAPKEWFKSTDLDSTDIVPNQWIKL | 32
uncultured bacterium | EKE06679.1 | 406985989 | glycosyl transferase family 11 [uncultured bacterium] | 28.67 | | MLTLKLKGGLGNQMFQYAASHNLAKNKKTKINFDLSFFSDIEVRDIKRDYLLDKFNISADISFDQKNISGFRKFLVKVISKFFGEVYFYYRLKFLSSKYLDGYFQSEKYFKNVEEDIRKDFTLDKDEMGVEAKKIEQQIVNSKNSVLHIRRGDYVDDLKTNIYHGVCNLDYYKRSIKYLKENFGEINIFVFSDDIAWVKENLAFENLQFVSRPDIKDYEELMLMSKCEHNIIANSSFSWWGAWLNENKNKIIIAPKEWFQKFNINEKHIVPKSWIRL | 33
Clostridium sp. KLE 1755 | WP_021636949.1 | 545396696 | glycosyltransferase, family 11 [Clostridium sp. KLE 1755] | 28.57 | | MVIVQLSGGLGNQMFEYALYLSLKAKGKVVKIDDITCYEGPGTRPKQLDVFGVSYERATKQELTEMTDSSLDPVSRIRRKLTGRKTKAYREKDINFDPQNMERDPALLEGCFQSEKYFQDCREQVREAYRRGIESGAYPLPEAYRRLEKEIADCKSVSVHIRRDGYLEESHGGLYTGICTEQYYQEAFARMEKEVPGAKFFLFSNDPDWTREHFKGENRILVEGSTEDTGYLDLYLMSKCKHNIIANSSFSWWGAWLNDNPEKKVTAPWLNGRECRDIYTERMIRI | 34
Francisella philomiragia; Francisella philomiragia subsp. philomiragia ATCC 25015 | WP_004287502.1 | 490414974 | alpha-1,2-fucosyltransferase [Francisella philomiragia] | 28.57 | | MKIIKIQGGLGNQMFQYAFYKSLKNNCIDCYVDIKNYDTYKLHYGFELNRIFKNIDLSFARKYHKKEVLGKLFSIIPSKFIVKFNKNYILQKNFAFDKAYFEIDNCYLDGYWQSEKTFKKITKDIYDAFTFEPLDSINFEFLKNIQDYNLVSIHVRRGDYVNHPLHGGICDLEYYNKAISFIRSKVANVHFLVFSNDILWCKDNLKLDRVTYIDHNRWMDSYKDMHLMSLCKHNIIANSSFSWWGAWLNQNDDKIVIAPSKWFNDDKNIQKDICPSNWVRI | 35
Pseudomonas fluorescens; Pseudomonas fluorescens NCIMB 11764 | WP_017337316.1 | 515906733 | protein [Pseudomonas fluorescens] | 28.52 | | MVIAHLIGGLGNQMFQYAAARALSSAKKEPLLLDTSSFESYTLHQGFELSKLFAGEMCIARDKDINHVLSWQAFPRIRNFLHRPKLAFLRKASLIIEPSFHYWNGIQKAPADCYLMGYWQSERYFQDAAEEIRKDFTFKLNMSPQNIATADQILNTNAISLHVRRGDYVNNSVYAACTVEYYQAAIQLLSKRVDAPTFFVFSDDIDWVKNNLNIGFPHCYVNHNKGSESYNDMRLMSMCQHNIIANSSFSWWGAWLNSNADKIVVAPKQWFINNTNVNDLFPPAWVTL | 36
Herbaspirillum sp. YR522 | WP_008117381.1 | 495392680 | glycosyl transferase family 11 [Herbaspirillum sp. YR522] | 28.48 | | MIATRLIGGLGNQMFQYAAGRALALRVGSPLLLDVSGFANYELRRYELDGFRIDATAASAQQLARLGVNATPGTSLLARVLRKVWPAPADRILREASFTYDARIEQASAPVYLDGYWQSERYFARIRQHLLDEFTLKGDWGSDNAAMAAQIATAGAGAVSLHVRRGDYSNAHTAQYHGVCSLDYYRDAVAHIGGRVEAPHFFVFSDDHEWVRENLQIGHPATFVQINSADGIYDMMLMKSCRHHIIANSSFSWWGAWLNGPAEDKIVVAPQRWFKDATNDTRDLIPAAWVRL | 37
Prevotella histicola; Prevotella histicola F0411 | WP_008822166.1 | 496097659 | protein [Prevotella histicola] | 28.43 | | MKIVKILGGLGNQMFQYALYLSLKETFPQENVTVDLSCFHGYHLHNGFEIARIFSLHPDKATVMEILRIAYYYPNYFFWQIGKRVLPQRKTMCTESTKLLFDKSVLQREGDRYDGYWQDERYFIDCRRTILNTFKFPPFTDDNNLALLKKMDTNSVSIHVRRGDYMGNKLYQGICKLNYYREAIMKISSYISPSMFCVFSNDIEWCRDNLESFIKAPIYYVDWNSGTESYRDMQLMSCCGHNIIANSSFSWWGAWLNQNSSKIVIAPKRWINLKNCGFMLPSRWVKI | 38
Flavobacterium sp. WG21 | WP_017494954.1 | 516064371 | protein [Flavobacterium sp. WG21] | 28.42 | | MIVVQLIGGLGNQLFQYAAAKALALQTKQKFSLDVSQFESYKLHNYALNHFNVISKNYKKPNRYLRKIKSFYQKNVFYKEVDFGYNPDLIHLKGGIIFLEGYFQSEKYFIKYEKEIREDFELRTPLKKETKAAIAKIESVNSVSIHIRRGDYINNPLHNTSKEEYYNKALEIVENKINNPVYFVSDDMEWVKANFSTKQETIFIDFNDASTNFEDLKLMTSCKHNIIANSSFSWWGGWLNKNPDKIVIAPKRWFNDDSINTNDIIPTNWKVI | 39
Polaribacter franzmannii | WP_018944517.1 | 517774309 | protein [Polaribacter franzmannii] | 28.42 | | MIIVRIVGGLGNQMFQYAYAKALQQKGYQVKIDITKFKKYNLHGGYQLDQFKIDLETSSPIANVLCRIGLRRSVKEKSLLFDEKFLEIPQREYIKGYFQTEKYFSSITPILRKQFIVQKELCNTTLRYLKEITIQKNACSLHIRRGDYISDEKANSVHGTCDLPYYKKSIKRIQFYKDAHFFIFSDDISWAKKNLLITNATYIDHNVIPHEDMYLMTLCHNIANSSFSWWGAWLNQHENKTVIAPKNWFVNRENEVACANWIQL | 40
Polaribacter sp. MED152 | WP_07670847.1 | 472321325 | glycosyl transferase family 11 [Polaribacter sp. MED152] | 28.42 | | MVVVRILGGLGNQMFQYAYAKSLAEKGYEVQIDISKFKSYKLHGGYHLDKFRIDLETANSSSAFLSKIGLKKTIKEPNLLFHKDLLKVNNNAFIKGYFQAEQYFSDIREILINQFKIKKELAKSTLAIKNQIELLKTTCSHLVRRDGYISDKKANKVHGTCDLDYYSSAIEDHISKQNSNVHFFVFSDDIAWVKDNLNITNARYDHNVIPHEDMYLMTLCNHNITANSSFSWWGAWLNQNPDKIVIPKNWFVDKENEVACKSWITL | 41
Methanococcus maripaludis; Methanococcus maripaludis C7 | YP_001329558.1 | 150402264 | glycosyl transferase family protein [Methanococcus maripaludis C7] | 28.19 | | MKIIQLKGGLGNQMFQYALYKSLKKRGQEVLLDISWYLKNNAHNGYELEWVFGLSPEYASIRQCFKLGDIPINLIYNVKRKVFPKKTHFFEKSNFNYDNNVFEVTNGYFEGWQNENYFKNFRSEILNDFSFKNIDKRNAEFSEYLKSINSVSVHVRRGDYVTNQKALNVHGNICNLEYYNKAINLANNNLKNPKFVIFSDDITWCKSNLGIDDPVYVDWNTGPYSYQDMYLMSNCKNNIIANSSFSWWGAWLNQNTEKKVFSPKKWVNDRNNVNIVPNGWIKIK | 42
Gallionella sp. SCGC AAA018-N21 | WP_018293379.1 | 517104561 | protein [Gallionella sp. SCGC AAA018-N21] | 28.15 | | MIIAHIIGGLGNQMFQYAAGRALSLARGVPFKLDISGFEGYDLHQGFELQRVFNCAAGIASEAEVRDSLGWQFSSPIRRIVARPSLAVLRRSTFVVEPHFHYWAGIKQVPDNCYLAGYWQSEQFQSHAAVIRTDFAFKPPLSGQNSKLAMQIAQGNAVSLHIRRGDYANNPKTTATHGLCSLDYYRAAIQHIAERVQSPHFFIFSDDIAWVKSNLAINFPHQVYGHNQGTESYNDMRLMSLCQHNIIANSSFSWWGAWLNTNAHKIVIAPKQWFANTTHVADLI | 43
Azospira oryzae; Dechlorosoma suillum PS | YP_005026324.1 | 372486759 | Glycosyl transferase family 11 [Dechlorosoma suillum PS] | 28.04 | | MQSPACIAGARAWWVGYGMAEAMQPVVVGLSGGLGNQMFQYAAGRALAHRLGHPLSLDLSWFQGRGDRHFALAPFHIAASLERAWPRLPPAMQAQLSRLSRRWAPRIMGAPVFREPHFGYVPAGAALAAPVFLEGYWQSERYFRELREPLLQDFSLRQPPLPASCQPILAAIGNSDAICVHVRRGDYLSNPVAAKVHGVCPVDYYQQGVAELSASLARPHCFVFSDDPEWVRGSLAFPCPMTVVDVNGPAEAHFDLALMAACQHFVIANSSLSWWGAWLGQAAGKRVIAPSRWFLTSDKDARDLLPPSWERR | 44
Prevotella paludivivens | WP_018463017.1 | 517274199 | protein [Prevotella paludivivens] | 28 | | MKIVKIIGGLGNQMFQYALAMALNKNFTKEEVKLDIHCFNGYTKHQGFEIDRVFGNEFELASYRDVAKVAYPYFNFQLWRIGSRIFPDRRHMISEDTSFKIMPEVITSHNYKYYDGYWQHEEYFKNIHDEILDAFKPKFQDERNKALAERLSDSNSISIHIRRGDYLNDELFKGTCGIEYYKKAIEEINERTVPTLFCVFSNDIHWCKENIEPLLNGKETIYVDWNTGSDNYRDMQLMTKCKHNIIANSSFSWWGAWLNNTKDKIVIAPRIWYNTKEKVSPVASSWIKL | 45
Gramella forsetii; Gramella forsetii KT0803 | YP_860609.1 | 120434923 | alpha-1,2-fucosyltransferase [Gramella forsetii KT0803] | 27.96 | | MSNKNPVIVEIMGGLGNQMFQFAVAKLLAEKNSSVLLVDTNFYKEISQNLKDFPRYFSLGIFDISYKMGTENGMVNFKNSLFKNRVSRKLGLNYPKIFKEKSYRFADLFNKKTPIYLKGYFQSYKFIGVESKIRQWFEFPYENLGVGNEEIKSKILEKTSVSVHIRRGDYVENKKTKEFHGNCSLEYYKNAITYFLDIVKEFNIVFFSDDISWVRDEFKDLPNEKVFVTGNLHENSWKDMYLMSLCDHNIIANSSFSWWAAWLNNNSEKNVIAPKKWFADIDQEQKSLDLLPPSWIRM | 46
Mariprofundus ferrooxydans; Mariprofundus ferrooxydans PV-1 | WP_009849029.1 | 497634831 | alpha-1,2-fucosyltransferase [Mariprofundus ferrooxydans] | 27.92 | | MIIVQFTGGLGNQMFQYALGRRLSLLHDVELKFDLSFYQHDILRDFMLDRFQVNGQVATEKEIEAYTNTPIFALDRPLLDRLVRWGLYRGIVSVSDEPPGKQALMNYNSRVLQAPRNTYVQGYWQSEKYFMPIRQKLLDDFSLVDKADQANGAMLEKIRQCHSVSLHVRRGDYVSNPLTNHSHGTCGLEYYEKAIALIGSKVDDPHFFVFSDDPEWTRDHLKCRFPMTVTCNSADSCEWDMELMRHCRDHIIANSSFSWWGAWLNMNPDKVVVAPAAWFNNFSADTSDLIPDSWVRI | 47
Bacillus cereus; Bacillus cereus VD107 | WP_002174293.1 | 488102896 | protein [Bacillus cereus] | 27.91 | | MKIIQVSSGLGNQMFQYALYKKISLNDNDVFLDSSTSYMMYKNQHNGYELERIFHIKPRHAGKEIIKNLSDLDSELISRIRRKLFGAKKSMYVELKEFEYDPIIFEKKETYFKGYWQNYNYFKDIEQELRKDFVFTEKLDKRNEKLANEIRNKNSVSIHIRRGDYYLNKYEEKFGNIANLEYYLKAINLVKKKIEDPKFYFSDDIDWAQKNINLTNDVVYISHNQGNESYKDMQLMSLCKHNIIANSTFSWWGAFLNNNDDKIVVAPKKWINIKGLEKVELFPENWITY | 48
Firmicutes bacterium CAG:534 | WP_022352106.1 | 547951299 | protein [Firmicutes bacterium CAG:534] | 27.81 | | MIIIRMTGGLGNQMFQYALYLKLRAMGKEVKMDDFTEYEGREARPLSLWAFGIEYDRASREELCRMTDGFLDPVSRIRRKLFGRKSLEYMEKDCNFDPEILNRDPAYLTGYFQSEKYFADIEEEVRQAFRFSERIWEGIPSQLLERIRSYEQQIKTTMAVSVHIRRGDYLQNEEAYGGICTERYYKTAIEYVKKRQQDASFFVFTNDPDYAGEWILKNFGQEKERFVLIEFTQEENGYLDLYLMSLCRHHILANSSFSWWGAYLNPSREKMVIVPHKWFGNQECRDIYMENMIRIAKEQS | 49
Sideroxydans lithotrophicus; Sideroxydans lithotrophicus ES-1 | YP_003525501.1 | 291615344 | glycosyl transferase family 11 [Sideroxydans lithotrophicus ES-1] | 27.81 | | MVISNIIGGLGNQMFQYAAARALSLKLEVPLKLDISGFTNYALHQGFELDFRIGCKIEIASEADVHEILGWQSASGIRRVVSRPGMSIFRRKGFVVEPHFSYWNGIRKITGDCYLAGYWQSEKYFLDAAVEIRKDFSFKLPLDSHNAELAEKIDQENAVSLHIRRGDYANNPLTAATHGLCSLDYYRKSIKHIAGQVRNPYFFVFSDDIAWVKDNLEIEFPSQYVDYNHGSMSFNDMRLMSLCKHHIIANSSFSWWGAWLNPNPEKVVIAPERWFANRTDVQDLLPPGWVKL | 50
zeta proteobacterium SCGC AB-137-C09 | WP_018281578.1 | 517092760 | protein [zeta proteobacterium SCGC AB-137-C09] | 27.81 | | MIVSQIIGGLGNQMFQYATGRALSHRLHDTFFLDLDGFSGYQLHQGFELSNVFQCEVNVATRSQMQALLGWRSFSSVRRLLMKRSLKWARGHRVMIEPHFHYWSRFAEINEGCYLSGYWQSERYFKPIENIIRQDFKFNHLLKGVNLDLAQQMETVNSNSLHVRRGDYASDANTNHTHGLCPLDYYRDAILYIAQNTVAPSFFIFSDDIEWCREHLKLSFPATYIDHNKGSNSYCDMQLMSLCHHHIIANSSFSWWGAWLNTRLDKVIAPKQWFANGNRTDDLIPAEWLVM | 51
Pedobacter heparinus; Pedobacter heparinus DSM 2366 | YP_003090434.1 | 255530062 | glycosyl transferase family protein [Pedobacter heparinus DSM 2366] | 27.8 | | MKIIRFLGGLGNQMFQYAFYKSLQHRFPHVKADLQGYQEYTLHNGFELEHIFNIKVNSVSSFTSDLFYNKKWLYRKLRRILNKLRNTYIEEKKLFSFDPSLLNNPKSAYYWGYWQNFQYFEHIADDLRKDFQFRAPLSAQNQEVLDQTKLSNSISLHIRRGDYIKDPLLGGLCGPEYYQTAINYITSKVNAARFFIFSDDIDWCIANLKLQDCSFISWNKGTSSYIDMQLMSSCKHHIVANSSFSWWAWLNPNPDKIVIAPEKWTNDKDINVRMSFPQGWISL | 52
Methylophilus methylotrophus | WP_018985060.1 | 517814852 | protein [Methylophilus methylotrophus] | 27.78 | | MFQYAMGLSLAENNQTPLKLDLSQFTDYKLHNGFELSKVFNCSAETASVTQIETLLGICKYSFIRRILKNTYLKNLRPAQYVVEPFFGYWDGVNFLGDNVYLEGYWQSQKYFIDYESTIRTHFTFKNILSGENLKLSDRIKGSNSVSLHIRRGDYVTNKNNAFIGTCSLIYYQNAIEYFSTKIADPIFFIFSDDITWAKSNLRLANEHYFVGHNQGEDSHFDMQLMSLCKHHIIANSSFSWWGAWLNPSKDKIIIAPKKWFASGLNDQDLVPKDWLRI | 53
Rhodobacterales bacterium HTCC2255 | WP_008033953.1 | 495309205 | alpha-1,2-fucosyltransferase [Rhodobacterales bacterium HTCC2255] | 27.7 | | MIYTRIRGGLGNQLFQYSAARSLADYLNVSLGLDTREFDENSPYKMSLNHGNIRADLNPPDLIKHKKDGKIAYIIDHIKGNQKKVYKEPFLSFDKNLFSNVDGTYLKGYWQSEKYFLRNRKNILSDINLIKKTDKFNTINLKEIKKSTSISLHIRRGDYLSNESYNETHGICSLSYYTDAVEYIKNRLGENIKVFAFSDDPDWVLENLKLSVDIKIINNNTSANSFEDLRLMLNCDHNIIANSSFSWWGAWLNQNPEKIVISPKKWYNKKQLQNADIVPSSWLKV | 54
Spirulina subsalsa | WP_017302658.1 | 515872075 | protein [Spirulina subsalsa] | 27.69 | | MAKIIARIGGIGNQIFIYAAARRLELINNAELVLDSVSGFVHDLQYRQHYQLDHFHIPCRKATPAERFEPFSRVRRYLKRQLNQRLPFEQRRYVIQESIDFDPRLIEFKPRGTVHLEGYWQSEDFYKDIEATIRQDLQIQPPTDPTNLAIVQHIHQHTSVAVHIRFFDQPNADTMNNAPSDYYHRAVEAMETFVPGAHYYLFSDQPEAAKSRIPLPDERVTLVNHNRGNKLAYADLWLMTQCQHFIIANSTFSWWGAWLAENQKKQVIAPGFEKREGVSWWGFKGLLPKQWIKL | 55
Vibrio cyclitrophicus | WP_010433911.1 | 498119755 | glycosyl transferase family 11 [Vibrio cyclitrophicus] | 27.67 | | MVIVKITGGLGNQLFQYATGSALANKLSCELVLDLSFYPTQTLRKYELAKFNINARVATDREIFLAGGGNDFFSKALKKLGLTSIIFPEYIKEQESIKYVGKIDLCKSGAYLDGYWQNPLYFSQNKIELTREFLPRAQLSPASALAWKDHISQASNSVSLHVRRGDYVENAHTNNIHGTCSLEYYQHAIEKIRSEVHNPVFFVFSDDIEWCKLNLSSLAEVEFVDNTTSAIDDLMLMRQCKHSIIANSTFSWWGAWLKLDGLVIAPRNWFSSASRNLKGIYPKEWHIL | 56
Lachnospiraceae bacterium NK4A179 | WP_022783177.1 | 551039510 | protein [Lachnospiraceae bacterium NK4A179] | 27.65 | | MRSVVDIKGGYGNQLFCYSFGYAVSKETGSELIIDTSMILDMNNVKDRNYQLGVLGITYDSHISYKYGKDFLSRKTGLNRLRKKSAIGFGTVVFKEKEQYVYDPSVFEIKRDTYFDGFWQSSRYFEKYSDDLRKMLKPKKISNAAEKLAEDARDCLSVSVHIRRGDYVSLGTWLKDDYYIKALDIIKERYGSEPVFFVFSDNKKYADDFFSAAGLKYRLMDYETDDAVRDDMFLMSRCSHNIMANSSYSWWGAFLNDNKDKTVICPETGVWGGDFYPEGWMKVTASSGK | 57
uncultured bacterium | EKE02186.1 | 406980610 | glycosyl transferase family protein [uncultured bacterium] | 27.57 | | MIIVNLYGGLGNQMFQYALGRHLAEKNNTELKLDISAFESYKLRKYELGNLNIIEKFALPEEISRLSTLPTGKIERFIRKTLRKPVKKPESYKENITGGFNPKILDLQNNIYLEGYWQSEKYFIEIEDIIRKEPSFKPATGKKNEILENILNINSVSLHIRRGDYVTNPEVNQNHGVCSLDYYKSCVDFIEKKLESPYFYIFSDDIEWVKNNLQIQSQVYYVDHNTVDNAIEDMRLMFSCHNKILANSSFSWWGAWLNSNPDKMVITPRKWFNTTYDSNDLIPERWIKL | 58
Bacteroides fragilis; Bacteroides fragilis HMW 616 | WP_05822375.1 | 492366053 | protein [Bacteroides fragilis] | 27.46 | | MKIGIIYIVTGPYIKRWNEFYSSSQLYFCVEAEKNYEVFTDSSELASQRLPNVHMHLIEDKGWIVNVSSKSVFICEIRNQLTSYDYIFYLNGNFKFISPIYCDEILPQAEHNYLTALSFSHYLTIHPDHYPYDRNKNCNAFIFYGQGKYYFQGGFYGGRTQEVLSLSEWCRDAIEADFNKKVIARFHDESYINRYLLTQHPKVLNKKYAFQDIWPYEGEYKAIVLNKEEVPEDNNLQEMKQNYIDPSLSFLLNDELKFIPISIVQLYGGLGNQMFGYAFYLYIRHISTQERKLLIDPAPCKRYGNHNGYELPSIFSKICQHIHISDETKNNIRKLRKGTSLSIEEVRASMPQSFKEKKQPIIFYSGCWQCVTYVETVKDEIKKDFIFDESKLNEPSAQMLRIIRRSNSVNVHIRRNDYLIGNNEFLYGGICTKSYYEKAISQMYTLLKDEPIFIYFTDDPEWVRSNFALDKSYLVDWNKNKDNWQDMYLMSACRHHIIANSSFSWWAAWLGGFPEKKVIAPSTWLNGMQTPDILPTEWIKIPITPDKKILDRICNHLILHSSYMKQLGLNSGKMGVVIFFFHYARYTQNPLYENYAGDLFDELYEEIHKGISFSFLDGLCGIAWAVEYLVHEQFIEGNTDDSLAEIDFKVMQIDPRRFTDYSFETGLEGIACYVLSRLLSPRVCSSSLTLDSVYLKDTLEACRKVPVDKANYTRLFLNYIESKEVGYSFKDVLMQVLNHSEKAFGSDGLTWQTGLTMIMR | 59
Butyrivibrio sp. AE3009 | WP_022778576.1 | 551034739 | protein [Butyrivibrio sp. AE3009] | 27.46 | | MIIIQLKGGLGNQMFQYALYKELRSRGKEVKIDDVTGFVDDELRTPVLQRFGIEYDRATREEVVKLTDSKMDIFSRIRRKLTGRKTCRIDEESGTFNPDILELDEAYLVGYWQSDKYFRNEDVIAQLRQEFQKRPQEIMTDSASWATLQQIECCQSVSLHIRRTDYIDEEHNHIHNLCTEKYYKGAIDRIRSQYPSAVFFIFTDDKEWCRNHFRGPNFFVVELAEKENTDIAEMLLMSSCKHHICANSSFSWWSAWLNDSPEKMVIVPNKWINNRDMDDIYTDRMTKMAI | 60
Bacteroides ovatus; Bacteroides ovatus CL02T12C04 | WP_004317929.1 | 490447027 | protein [Bacteroides ovatus] | 27.43 | | MKQTIILSGGLGNQMFQYAFFLSMKAKGKSCSLDTTLFQTNKMHNGFELKSVFDIPDSPNQASALHSLLIKMLRRYKPKSILTIDEPYTFCPDALESKKSFLMGDWLSPKYFESIKDVVVNAYRFHNIGNKNVDTANEMHGNNSVSIHIRRGDYLKLPYYCVCNENYYRQAEIEQIKDRVDNPIFYVFSNEPSWCDSFMKEFRVNFKIVNWNQGKDSYQDMYLMTQCKHNIIANSTFSWWGAWLNNNTDKIIVAPSKWFKNSEHNINCKEWLLIDTSK | 61
Desulfospira joergensenii | WP_022664368.1 | 550911345 | protein [Desulfospira joergensenii] | 27.42 | | MGKKYVETVVNGGLGNQIFQFSAGFALSKRLNLDLVLNISTFDSCQKRNFELYTFPKIKNSFACIKDDDPGVFSRLRIPFLNFKEKIKQFHESHFFFDPAFFDIREPVRIEGYFQSYKYFEKYSDQLKDILLDIPLTSRLKTVLKVISSKESVSVHIRRGDYISDQGINEVHGTLNEAYYLNSIKLMEKMFPESFFFLFTDDPHYVEENFKFLEDTSCIISDNDCLPYEDMYLMANCHHNIIANSSFSWWGAWLNQNPEKIVIAPRKWFSRKILMEKPVMDLLPDDWILL | 62
Lachnospiraceae bacterium 10-1 | EOS74299.1 | 507817890 | protein C819_03052 [Lachnospiraceae bacterium 10-1] | 27.39 | | MNIIRMTGGLGNQMFQYALFLRLKAQGKEVKFDDRTEYKGEEARPILLWAFGIDYPAAGEEEVNELTDGVMKFSHRLRRKLFGRKSKEYREKSCNFDQQILEKEPAYFTGYFQSERYFEEVKEQVRKAFQFSGKIWGSVSKELEERIREYQTKIENKSQMPVSVHIRRGDYLENDEAYGGICTDAYYRKAIEMMEEKFPNTVFYIFSNDTGWAKQWIDHFYKEKSRFIVIEGTTEDTGYLDLFLMSKCRAHIIANSSFSWWGAWLDPDQEKIVIAPSKWVNNQDMKDIYTREMIKISPKGEVR | 63
Bacteroides dorei; Bacteroides dorei DSM 17855 | WP_007832461.1 | 495107639 | protein [Bacteroides dorei] | 27.33 | | MVVVYVGAGLARNMFQYAFALSLREKGLDVFIDEDSFIPRFDFERTKLDSVFVNVNIQRCDKNSFPLVLREDRFYKLLKRISEYMSDNRYIERWNLDYLPIHKKASTNCIFIGFWISYKYFQSSEDAVRKAFTFKPLDSIRNVELATKLVTENSAVHFRKNIDYLKNLPNTCPPSYYYEAINYIKKVVPNPKFYFFSDNWDWVRENIRGVEFTAVDWNPSSGIHSHCDMQLMSLCKHNIIANSTYSWWSAYLNENNNKIVVCPKDWYGGMVKKLDTIIPESWIING | 64
Firmicutes bacterium CAG:24 | WP_021916201.1 | 547127421 | protein [Firmicutes bacterium CAG:24] | 27.32 | | MVKVKMSGGLGNQMFQYALYRKIQQTGKDVKLDLFSFQDKNAFRRFSLDIFPIEYQTANLEECRKLGECSYRPVDKIRRKMFGLKESYYQEDLDKGYQPEILEMNPVYLDGYWQCERYFQDIREKILEDYTFPKKISIESSRLQERIKNTESVSIHIRRGDYLDAANYKIYGNICTIEYYQSAISRMRKLCEKPNFYLFSNDPEWAKEIFGDTEDTIVEEDKERPDYEDMFLMSRCKHNIIANSSFSWWAAWLNQNENKRVIAPVKWFNNHSVTDVICDDWIRIDGDHKGA | 65
Clostridium hathewayi CAG:224 | WP_022031822.1 | 547299420 | epsH [Clostridium hathewayi CAG:224] | 27.3 | | MIYVNIRGRLGNQLFIYAFARALQKSTNQQITLNYTSFRKHYNNTAMDLEQFNIPEDIMFENSKELPWFANTDGKVIRILRHYFPKLIRSILQKMNVLMWLGDEYVEVKVNKRRDIYIDGFWQSSRYFKSVYKELKNELIPKMEMSKEIKTMGDLINQKESVCVSVRRGDYVTVKKNRDYYICDEKYLNTSIMRMVELPNVTWFIFSDDADWVKDNIVFPGEVFYQPPRVTPLETLYLMKACKHFIISNSSFSWWGQYLSNNDNKIVIGPAKWYVDGRKTDIIEEEWIKIEV | 66
Syntrophus aciditrophicus SB; Syntrophus aciditrophicus | YP_462663.1 | 85860461 | alpha-1,2-fucosyltransferase [Syntrophus aciditrophicus SB] | 27.3 | | MVIVRLTGGIGNQMFQYAAARRVSLVNNAPLFLDLGWFQETGSWTPRKYELDAFRIAGESASVGDIKDFKSRRQNAFFRRLPLFLKKRIFHTRQTHIIEKSYNFDPEILNLQGNVYLDGYWQSEKYFSDVDSEIRREFSFQTDPAERNRKILERIASCESVSIHIRRDGYVTLPDANAFHGLCTPAYYRLAVEQISRKVVEPVFFVFSDDIAWARGNLKLGFETCFMDQNGPDRGDEDLRLMIACRHHIIANSSFSWWGAWLCSNPEKIVYAPRKWFNNGLDTPDNIPASWIRI | 67
Bacteroides caccae; Bacteroides caccae ATCC 43185 | WP_005678148.1 | 491931393 | protein [Bacteroides caccae] | 27.27 | | MKIVKIIGLGNQMFQYALYSLKKKYPKEKIKIDISMFETYGLHNGFELKRIFDIDAEYASREEIRELSFYIKIYKLQRIFRKIFPVRKTECVEKYDFKFMSEVWSNCDRYYEGYWQNWEYFIEAQTEVRSTFTFKKELVGRNAKVIREIQYAKMPVSLHIRRGDYLHHKLFGGLCDLNYYKKAIDYVLNNYDTPQFVLFSNDIEWCKTYILPLVQGPFILVDWNSGVESYIDMQLMSCCRINIIANSSFSWWAAWLNDSSEKIVIAPKLWAHSPYGKEIQLKSWLLF | 68
Butyrivibrio fibrisolvens | WP_022756304.1 | 551011888 | protein [Butyrivibrio fibrisolvens] | 27.24 | | MIIIEMSGGLGNQMFQYALYKSMLHKGLDVTIDKSIYRDVDHKEQVDLDRFPNVSYIEADRKLSSTLRGYGYNDSIIDKIRNKLNKSKRNLYHEDLDKGYQPEIFEFDNYLNGYWQCERYFKDIKNEIKKDFIFPCTQSGDDKIKALTIEMESCNSVSLHVRRGDYLKPGLIEIYGNICTEEYYKKSIEYIKERVDNPVFYIFSNDMAWVRDNFKSDDFRYVNEDGAFDGMTDMYLMTRCRHHIVANSSFSWWGAWLNKHDDNIVICPNRWVNTHTVTDIICEDWIRIDV | 69
Parabacteroides distasonis; Parabacteroides distasonis CL03T12C09 | WP_005857874.1 | 492476819 | protein [Parabacteroides distasonis] | 27.24 | | MIVGGNDYCKVKVVNIIGGLGNQMFQYAFALSLKEHFPKEEIRIDISHFNYLFVNKVGAANLHNGYELDKIFFNIELKKANAWQLMKLTWFIPNYLISRIARKILPVRNSEYIQNSSDCFFYDPMVYNKQGSCYYEGYWQAIGYYESMRDKLCKIFQHPSPEGKNKQYIENMESSNSVGIHIRRGDYLLSDNFRGICEVDYYKRAIDKILQDGEKHVFYLFSNDQKWCEEYILPLLGNYEIIFVTGNIGRDSCWDMFLMTHCKDLIIANSSFSWWGAFLNKRGGRVVTPKRWMNRNIRYDLWMPEWIRI | 70
Geobacter uraniireducens; Geobacter uraniireducens Rf4 | YP_001230447.1 | 148263741 | glycosyl transferase family protein [Geobacter uraniireducens Rf4] | 27.21 | | MIIARLQGGLGNQMFQYAVGLHLALTHNVELKIDITMFSDYKWHTYSLRPFNIRESIATEEEIKALTDVKMDRPYKKIDNFLCRLLRKSQKISATHVKEKHFHYDPDILKLPDNVYLDGYWQSEKYFKEIENIIRQTFIIKNPQLGRDKELACKILSTESVCLHIRRGNYVTDKTTNSVLGPCDLSYYSNCIKSLAGNNKDPHFFVFSNDHEWVSKNLKLDYPTIYVDHNNEDKDYEDLRLMSQCKHHIIANSTFSWWSAWLCSNPDKVIYAPQKWFRVDEYNTKDLLPSNWLIL | 71
Lachnospiraceae bacterium A4 | WP_01280341.1 | 511026085 | protein [Lachnospiraceae bacterium A4] | 27.21 | | MIIVKIYEGLGNQLFQYAFARSIQVNGKKVFLDTSGYTDQLFPLCRTSTRRRYQLNCFNIRIKEVEKKNIEKYSFLIQEDMFGKLISKLAKLHLWMYKVTIQQNAQEYKESYLNTRGNVYYKGWFQNPKYFSSIRRLLLKEITPKYKIRIPAELRELLQEDNIVAVHCRRGDYQYIRNCLPVNYYKKAMAYMKEKKLGVPRYLFFSDDLSWVKRQFGNKDN | 72
Colwellia psychrerythraea; Colwellia psychrerythraea 34H | YP_270849.1 | 71282201 | alpha-1,2-fucosyltransferase [Colwellia psychrerythraea 34H] | 27.15 | | MKVVRVCGGFGNQLFQYAFYLAVKHKFNETTKLDIHDMASYELHNGYELERIGNLNENYCSAEEKLAVQSTKNIFTKLLKEIKKYTPFIPRTYIKEKKHLHFSYQEVDLGTKDTSIYYRGSWQNPQYFNSIASEIREKLTFPEFTEPKSLALHQEISEHETVAVHIRRGDYLKHKALGGICDLPYYQNAIKEIEGLVEKPLFVIFSDDITWCRANINVEKVRFVDWNSGEQSFQDMHLMSLCTHNIIANSSFSWWGAWLNANPNKIVISPNKWIHYTDSMGIVPSEWIKVETSI | 73
Roseobacter sp. MED193 | WP_009810150.1 | 497495952 | alpha-1,2-fucosyltransferase [Roseobacter sp. MED193] | 26.96 | | MITSRLHGRLGNQMFQYAAARALAHRLGCGVALDGRGAELRGEGVLTRVFDLPLSAAPKLPPLKQHAPLRYGLWRGLGLAPRFRRERGLGYNTAFETWEDGCYLHGYWQSERYFEEISDLIRADFTFPDFSNRQNAEMAARIMEDNAISLHVRRGDYVALSAHVLCDQAYYEAALTRLLEGLSQDAPTVYVFSDDPDWAKANLPLPCKKVVVDFNGPETDFEDMRLMSLCKHNIIGNSSFSWWAAWLNANPQKRVAGPANWFGDPKLSNPDILPSQWLKVAP | 74
Cesiribacter andamanensis; Cesiribacter andamanensis AMV16 | WP_009197396.1 | 496488826 | Glycosyl transferase family 11 [Cesiribacter andamanensis] | 26.89 | | MMIVRLCGGLGNQLFQYAVGKQLSVKNNIPLKIDDSWLRLPDARKYRLQFFQIEEPLASPQEVERFVGPYESQSLYARLYRKVQNMLPRHRRRYFQESGFWAYEPELMRIRSQVFLEGFWQHHAYFTRLHPQVLEALQLREEYRQEPYAVLDQIREDAASVSLHIRRGDYVSDPYNLQFFGVMPLSYYQQVAYMQEQLHAPTFYIFSDDLDWARAHLKLQAPMVFVDIEGGRKEYLELEAMRLCRHNILANSSFSWWGAYLNTNPHKRVIAPRQWVADPELKDKVQIQMPDWILL | 75
Rhodopirellula sallentina; Rhodopirellula sallentina SM41 | WP_008679055.1 | 495954476 | glycosyl transferase family 11 [Rhodopirellula sallentina] | 26.89 | | MIATRLIGGLGNQMFQYAYGFSLARRRSERLVLDVSAFESYDLHALAIDQFDISAARMTQAEFARIPGRYRGKSRWAERVANFAGGLQSCDKRPLRLRREKPFGFAEKYLAEGSDLYLDGYWQSERYFPGLQAELKKEFQLKRGLSDESSRVLDEIQSSMSVAMHVRRGDYVTNAETLRIYRRLDAEYYRKCLNDLRQRFSNLNVFVFSNDIQWCQDHLDVGLKQRPVTHNDATTAIEDMFLMSQCDHSIIANSSFSWWAAFLGRSDAQRRVYYPDPWFNPGTLNGDSLGCANWVSESSISVSRPSRAA | 76
Butyrivibrio sp. AD3002 | WP_022762282.1 | 551018054 | protein [Butyrivibrio sp. AD3002] | 26.85 | | MIIIRMMGGLGNQMFQYALYLQLKALGKEVKIDDVYGFRDDPQRDPVLEKMYGITYTKASDAEVVDITDSHLDIFSRIRRKLFGRKSHEYIEETGLFDPKVFEFETAYLNGYFQSDKYFPDKEVLAQLRREFVIKPDDVFTSADSEELYRQIRETESVSIHVRRGDYLLPGTVETFGGICDNDYYKRAIDRMVSEHPDAIFFVFTSDKEWCEQNVSGKKFRIVDTKEENDDAADLLLMSLCKHHILANSSYSWWSAWMNDSPEKTVIVPSKWLNTKPMDDIYTSRMTKI | 77
Segetibacter koreensis | WP_018611017.1 | 517440157 | protein [Segetibacter koreensis] | 26.78 | | MVVVKLIGGMGNQMFQYAIGRHLAIKNKCPLYFDHIELENKNTANTPRNYELDIFNVQYQKNPFLQSNRFVAKVYHKLFSVQRIKEPDFTFHPHILNVQGNIHLNGYWQNENFKEIEEIIRQDFTFKTPANEKIESILQQIAATNSVSLHVRRGDYITLTEANQFHGVCSDTYYQKAIAKIKEAIPAPHLFVFSDDIHWVKQNMPFTEEHTFVDGNTGKNSFEDLRLMAACRHNILANSSFSWWAGWLNKNPEKMVIAPEDWFRAVHTDIVPPSWIKM | 78
Amphritea japonica | WP_019621022.1 | 518450815 | protein [Amphritea japonica] | 26.76 | | MVIVRLIGGLGNQLFQYAYALSLLEQGYDVKLDASAFESYTLHGGFGLGEYAERLEVATTEEVDMVSRVGRISTLLRKLQGKKSRRVIKESNFSYDEKMLTPEDSHYLVGYFQSELYFNKIRGELLSALDLKHKLSPYTEASYLAIADASVSVSMHIRRGDYVSDKAAHNTHGVCSLDYYYAAVTFFEERYPDVDFYIFSDDIEWVKENLNVQRAHYISSEEKRFAGEDIYLMSQCDHNIVANSSFSWWGAWLNANEDKIVVAPRQWYADSNMQRLSKTLVPDTWIRL | 79
Desulfovibrio vulgaris; Desulfovibrio vulgaris str. ′Miyazaki F′ | YP_002437106.1 | 218887785 | glycosyl transferase family protein [Desulfovibrio vulgaris str. ′Miyazaki F′] | 26.76 | | MRPVVVDIFGGLGNQMFQYAAAKSLAERLGVRLELDVSMFSGDPLRAFSLGEFAITDHVRGKSRSSLLVRFARSLGFGSSSKCVEPFFHYWEGINEIEAPVHMHGYWQSEKYFKAYEDLIRRTFSFSACEGVASSGKYAGVSSPMSVSVHLRRGDYKEQKNVVVHGILGREYYDAAYSIIKQGCPSACFFVFTDAINEAVDFFSHWNDVLFVDGNNQYQDMYLMSQCRHHIIANSSYSWWGAWLGAFSDGMTVAPKMWFAYDVLKEKSIKDLFPEDWIVL | 80
Spirosoma spitsbergense | WP_020606886.1 | 522095677 | protein [Spirosoma spitsbergense] | 26.76 | | MIISRITSGLGNQLFQYAVARHLSLKNKTSLYVDLSYYLYQYHDDTSRNFKLGNFSVPYHTLQQSPVEYVSKATKLLPNRSLRPFFLFQKERQFHFDEQILQSRAGCVILEGFWQSEAYFRDNADTIRRDLQLSGTPSPEFNQYRELIRETPMSVSIHVRRSDYVNHPEFSQTFGFVGIDYYKRAIELARKELANPRFFVFSDDKEWSKTNLPLGEDSVFVQNTGLNGDVADLVLMSHCQHHIIANSSFSWWGAWLNPNAGKLVITPKNWYKNKPAWNTKDLLPPTWLSI | 81
Lachnospiraceae bacterium 28-4 | WP_016292012.1 | 511037988 | protein [Lachnospiraceae bacterium 28-4] | 26.73 | | MNIIRMSGGIGNQMFQYALYLKLVSLGKEVKFDDVTEYELDNARPIMLSVFGIDYPKASREELVELTDASMDFLSRVRRKIFGRKSGEYHEASADYDETVLEKEHAYLCGCFQSERYFKDIEYEVREAYRFRNVVVPEEIRGGIETYERQIGESLSVSIHIRRGDYLDAADVYGGICTDAYYNQAIRYMIKKYENPSFFVFTNDTFWAEKWCEVRERETGKRFTVIKGTDEETGYIDLMLMSRCKAHIIANSSFSWWGAWLDASPDKCVVAPVKWINTRECRDIYTEDMVRIGSNGKISFSNCSSL | 82
Lachnospiraceae bacterium COE1 | WP_016302211.1 | 511048325 | protein [Lachnospiraceae bacterium COE1] | 26.71 | | MVVVRIWEGLGNQLFQYAYARALSLRTKDRVYLDISEYEMSPKPVRKYELCHFKIKQPVINCGRIFPFVNKDSFYTKNNQYLRYFPAGLIKEEDCYFKRDFCELKGLYLKGWFQSEKYFKEFESHIREEIYPRNKIKITRGLRKILNSDNTVSVHIRRGDFGKDHNILPIEYYENSKRVILERVDNPYFIIFDDILWVKENMNFGLNCFRYMDKEYSYKDYEELMIMSRCKNIIANSTFSWWGAWLNPSKDKIVIAPKKWFLYNPKKDFDIVPNDWIRV | 83
Parabacteroides; Parabacteroides sp. 20_3; Parabacteroides distasonis CL09T03C24 | WP_005867692.1 | 492502331 | alpha-1,2-fucosyltransferase [Parabacteroides] | 26.69 | FutZB | MKIVNIIGGLGNQMFQYAFAVALKAKYPNEEVFIDTQHYKNAFIKVYHGNNFYHNGYEIDKVFPNATLEPARPKDLMKVSFYIPNQVLARAVRRIFPKRKTEFVTDQQPYVFIPEALSVIDDCFYDGYWMTPLYFDKYRDRILKEFTFRPFDTKENLELEPLLKQDNSVTVHIRRGDYVGSSSFGGICTLDYYRNAIREAYNLITSPEFFIFSNDQKWCMENMRNEFGDAKVHFIAHNRGADSYRDMQLLSIARCNILANSSFSWWGAYLNQRKNCFIICPHKWHNTLEYSDLYLPTWIKI | 84
Bacteroides sp. HPS0048 | WP_002561428.1 | 488624717 | protein [Bacteroides sp. HPS0048] | 26.62 | | MFVIRLIGGVGNQLFQYTFGQFLRHKFGVEVCYDIVAFDTVDKGRNLELQLLDESLPLFETSNFFFSKYKSWKKRLFLYGFLLKKNNKYYTKYAPEEISLFTEKGLSYFDGWWQYPALLRDTINNMEDFFIPKQPIPVQIQKYYNEILLNNFAVALHVRRGDYFTSKYAKTYAVCNVEYYTSAVNLMCEKLRSCKFYVFSDDLDWVKSNLILPSNTVYVKNYDINSYWYIYLMSLCHHIIISNSSFSWWGATLNRNFHKIVIAPKYWSTKKNNTLCDNSWIKI | 85
Bacteroides thetaiotaomicron; Bacteroides thetaiotaomicron dnLKV9 | WP_016267863.1 | 511013468 | protein [Bacteroides thetaiotaomicron] | 26.58 | | MKIINILGGLGNQMFEYAMYLALKNAHSEEEILCSTRSFCGYGLHNGYELGRIFGIQVKEASLLQLTKLAYPFFNYKSWQVMRHWLPVRKTMTRGAINIPFDYSQVMREDSVYYDGYQWNEKNFLHIREEILTAYTFPKFDDEKNQELADIIVKSNAVSCHIRRGDYLKEINMCVCTSSYYAHAISYMNEEINPNLYCVFSDDIEWCRNNICELMGEDKKIIFIDWNKGEKSFRDMQLMSLCKHNIIANSSFSWWGAWLNRNDKKIVVAPTRWIASEVKNDPLCDSWKRIE | 86
Desulfovibrio alaskensis; Desulfovibrio alaskensis G20 | YP_389367.1 | 78357918 | glycosyl transferase [Desulfovibrio alaskensis G20] | 26.56 | | MKFVGVWILGGLGNQMFQFAAAYALAKRMGGELRLDLSGFKKYPLRSYSLDLFTVDTPLWHGLPMSQRRFRIPMDAWTRGSRLPLVPSPPFVMAKEKNFAFSPIVYELQQSCYLYGYWQSYRYFQDVEDDIRTLFSLSRFATLELAPVVQLNEVESVAVHLRRGDYITDAASNAVHGVCGIDYYQRSMSLVRRSTTKPIFYIFSDEPEVAKKLFATEDDVVVMPSRRQEEDLLLMSRCKHHIIANSSFSWWAAWLGKRASGLCIAPRYWFARPKLESTYFDLIPDEWLLL | 87
Prevotella oralis CC98A | ETD21592.1 | 564721540 | protein HMPREF1199_00667 [Prevotella oralis CC98A] | 26.56 | | MDIVVIFNGLGNQMSQYAFYLAKRKSGSRCHCIFHNVSTGFHNGSELDKVFGIKYEKGIFSKLLSKIYDIFDGIPKLRKKLNSLGIHIIREPRNYDYTASLLPRVSRWGLNYFVGGWHSEKYYTEILQEIKNTFSFKIDDEIKDIDFYEFYSLIHNDINSVSLHIRRGDYVGANEYSYFQFGGVATLEYYHKAIDEIYQRIENPTFYVFSDDIGWCKTTFLKNNFIFVDCNCGEKSWRDMFLISQCKHHIIANSTFSWWGAWLSIFHNSITICPKEFIKGVVIRDVYPDTWIKLSS | 88
Comamonadaceae bacterium CR | YP_008680725.1 | 550990115 | glycosyltransferase [Comamonadaceae bacterium CR] | 26.54 | | MASKISKIIPRIFGGLGNQLFIYAAARRLALVNGAELALDDVSGFVRDHEYNRHYQLDHFNIPCRKATAAERLEPFARVRRYLKRKWNQRLPFEQRKYLVQESVDFDERLLTFKPRGTVYLEGYWQSEDYFKDIEPQIRADLRIHPPTDTVNQQMAERIRATNAVAVHVRFFDAPAQSALGVGGNNAPGDYYQRAIKVMQEQAPDAQYYIFSDQPQAARARIPLRDDHVTLVNHNQCDAVAYADLWLISQCQHFIIANSTFSWWGAWLGKTPESIVIAPGFEKREGAMFWGFRGLLPDRWVKL | 89
Vibrio nigripulchritudo; Vibrio nigripulchritudo AM115; Vibrio nigripulchritudo Pon4; Vibrio nigripulchritudo SO65 | WP_022596860.1 | 550250577 | WblA protein [Vibrio nigripulchritudo] | 26.51 | | MKDSRIVKLNGGLGNQMFQFALAFALKKKLNVAVKFDTELLDTNRTEFKSLERFGLIVDKLTITEKFKYKGLESCKYRKICNWISNFTTINIHKGYYKEKERGVYDRGIFDSNVKYIDGYWQNQEYFNDFRSELLNKFNLNGKVSNHAIQYLKEITSVQNSVSIHVRRGDYLLLDVYRNLTLDYYSEAIKLVRITNPDSKFFIFSNDINWCKSNFKSVDNAIFVDSTVDEFDDMFLMSKCKTNIIANSTFSWWAAWLNNSGKIVYCPKKWRNDTTENHKGLPEGWNIIDK | 90
Sulfurospirillum deleyianum; Sulfurospirillum deleyianum DSM 6946 | YP_003304837.1 | 268680406 | glycosyltransferase family protein [Sulfurospirillum deleyianum DSM 6946] | 26.48 | | MIIIKIMGGLTSQMHKYALGRVLSLKYNVPLKLDLTWFDNPKSDTPWEYQLDFNINATIATVSEIKKLKGNNLFNRIARKIEKFFSIRIYKKSYINKSFISISDFHKLKSDIYLDGEWNGFKYFEDYQDTIKNELTLKRGSSINIQNTIELKSSDNSVFLHIRRGDYLSNKNAAAFHAKCSLDYYYKAIQIVKEKIDNPIFYISDDILWVKKNFVINESCRFMEKNQNFEDLLLMSYCKHGITANSGFSLMAGWLNQNKDKMIIVPQTWVNDDRININILNSLEQDNFTIIR | 91
Escherichia coli; Escherichia coli Jurua 18/11; Escherichia coli 180600; Escherichia coli P0304777.1; Escherichia coli P0304777.2; Escherichia coli P0304777.3; Escherichia coli P0304777.4; Escherichia coli P0304777.7; Escherichia coli P0304777.9; Escherichia coli P0304777.10; Escherichia coli P0304777.11; Escherichia coli P0304777.12; Escherichia coli P0304777.13; Escherichia coli P0304777.14; Escherichia coli P0304777.15 | WP_001581194.1 | 486318742 | glycosyl transferase 11 family protein [Escherichia coli] | 26.47 | | MTFIVRLTGGLGNQMFQYALARSLAKKYNARLKLDISYYHNQPHKDTPRTFELNQLCIVDNILNSSSFSEKFLYIYDKLRVKLSKKISLPYFRNIVTPVNFNCIDFAEDKDYYFLGHFQELSNIYSIDESLRSEFKPNQEIMNLAHQSKIYELIKQSRGSVALHIRRGDYVTNKNAAEHHGVIGLSYYVNALSYLENVSEFFDVFVFSDDPEWARKNIKNSRNLFFCDEGNCRYSKKYSTIDMYLMSQCDHFIIANSTYSWWAAWLGNYPSKHVVAPARWNANNSPYPILQNWKAIHE | 92
Firmicutes bacterium CAG:24 | WP_021914998.1 | 547109632 | protein [Firmicutes bacterium CAG:24] | 26.44 | | MVGVQLSGGLGNQMFEYALYLKLKSMGKDVRIDDVTCYGAQEKQRVNQLSVFGVSYEHMTKQEYEQITDSSMSPLHRARRLLCGRKDLSYREASCNYDPEILRREPALLLGYFQTERYFADIKDQVREAFTFRNLTLTKESAAMEQQMKECESVSVHIRRGDYLTPANQALFGGICDLDYYHRAVAEIRKRKPDVKFFLFSNDMEWTKEHFCGSEFVPVEGNSEQAGEQDLYLMSCCKNHILANSSFSWWGAWLDNGKDKLVIAPEKWMNGRGCCDIYTDEMIRV | 93
Amphritea japonica | WP_019622926.1 | 518452719 | protein [Amphritea japonica] | 26.42 | | MVKIKIIGGLGNQMFQYAAAKSLAVLNNTRVSANVSVFSNYKTHPLRLNKLNCDCEFDFTRDFRLVLSGFPLLGSAFSKKSMLLNHYVEKDLLFDSSFFLDLDNVLLSGYFQSEKYFSNIRELLIQEFSLDDRLTEAELAINNKIESCNSIAIHIRRGDYITDLSANNIHGICSEEYFEKALNYLDSINVLSDPTTTLFIFSDDILWCKDNLAFKYRTVFVEGSVDRPEVDIHLMSKCKHQVISNSTFSWWGAWLNTNLDKCVIAPLKWFNSLHDSTDIVPKQWMRL | 94
Bacteroides salyersiae; Bacteroides salyersiae WAL10018 = DSM 18765 = JCM 12988 | WP_005923045.1 | 492689153 | protein [Bacteroides salyersiae] | 26.41 | | MKQTIIMSGGLGNQMFQYALYCSMREKGIRVKIDISLYEFNRMHNGYMLDYAFGLNISHNKINKYSVLWTRLIRSNRAPFLLFREDESRFCDDVFTTYKPYIDGCWIDERYFFNIKKKIISQFSFHNIDQKNLMVANMMKVCNSVSLHIRRGDYLSQSMYNICNESYYKSAIEYIISRVEDSKFFIFSDDPEWCKYFMEKFNVDYEIIQHNFGKDSYKDMYLMTQCKHNIIANSTFSWWGAWLNNNAGKNVVCPSVWINGRDFNPCLEEWYHI | 95
Bacteroides fragilis; Bacteroides fragilis CL03T00C08; Bacteroides fragilis CL03T12C07 | WP_005786334.1 | 492241663 | protein [Bacteroides fragilis] | 26.38 | | MDIILLHNGLGNQMSQYAFYLSKKKNGIHTSYICLSNDHNGIELDKVFGVECQMGCKKIFLLFILRLLMSNRTGFLIRKVNLLFSKIKIKLITENLDYSFHPSFLSASPYCLAFWVGGWHHPQYYSEISSQIKEAFTFKRSLLDERNICIEKRMREPNSVCLHIRRDGYLTGINYELFGKVCNEQYYQKAIDYIEGKLSDICYYVFSNDMEWAKKILLGKNAVFVDWNRGEESWKDMYLMSKCNLIIPSNSTFSWWAWLCEHPVNIVCPKLFVYGDEQSDIYLDNWHKIE | 96
Bacteroides nordii; Bacteroides nordii CL02T12C05 | WP_007486843.1 | 494751435 | protein [Bacteroides nordii] | 26.37 | | MMGIEKTNMVIVRLWGGIGNQLFQYSFGEFLREDYQVDVIYDIASFGKSDKLRKLELSVVVPGIPVTTDISFSKYVGTKNRLLRFIYGLKNSFIEEKYFSDEQLFKYLSKRGDVYLQGYWQKTIYAETLRRKGSFFLSQEEPIVLHTIKAKIQEAEGAIALHVRRGDYFSSKHINTFGVCDAHYYEKAVDIMRGRVSNAMIFVFSDDLDWVRRYVNLPTNVIYVPNYDIPQYWYIYLMSLCRHNIISNSSFSWWGAFLNMNTNKIVVSPSKWTLNSDKTIALDEWFKI | 97
Butyrivibrio proteoclasticus; Butyrivibrio proteoclasticus B316 | YP_003829743.1 | 302669783 | glycosyl transferase 11 [Butyrivibrio proteoclasticus B316] | 26.37 | | MECSMIIIKFCGALGNQLFQYALYEKMRILGKDVKADISAFGDGNEKRFFYLDELGIEFNIASADEIAEYLNRKTIRFVPGFLQHRHYYFEKKPYVYNKKILSYDDCYLEGYWQNYRYFDDIKDELLKHMKFPCLPLEQKKLAEKMENENSVAVHVRMGDYLNLQDLYGGICDADYYDRAFSYIEGNISNPVYYGFSDDVDKASALLAKHKINWIDYNSEKGAIYDLILMSKCKNNIIANSSFSWWGAYLEYNNGKVVVSPNRWMNCFENSNIAYWGWISL | 98
Prevotella ruminicola; Prevotella ruminicola 23 | YP_003574648.1 | 294674032 | family 11 glycosyl transferase [Prevotella ruminicola 23] | 26.33 | | MRIVKVLGGLGNQMFQFALYKALQKQYFEERVLLDLHCFNGYHKHRGFEIDSVFGVTYEKATLKEVASLAYPYPNYQCWRIGSRILPVRKTMLKEEPNFTLEPSALSLPDSTYYDGYWQHEEYFMHIREEILSTYAFPAFDDERNKTTAQLAASTNSCSIHIRRGDYLTDPLRKGTTNGNYVIAAIKEMQQEVKPEKWLVFSDDIAWCQQHLASTLDATNTIYIDWNTGANSIHDMHLMALCRHHIIANSSFWWGAWLSQQDGITIAPSNWMNLKDVCSPVPDNWIKI | 99
Prevotella salivae; Prevotella salivae DSM 15606 | WP_007135533.1 | 494223898 | protein [Prevotella salivae] | 26.33 | | MKIIKIIGGLGNQMFQYALAIALQQQYKDEEIRLDLNCFRGYNKHQGYLLDEIFGRRFRAASLQEVARLAWPYPHYQLWRVGSRVLPRRQTMVCEPADGSFSPDVLTLEGNRYYDGYWQDERYFKAYRKEIIEAFKFSPFVGDGNRHVENMLRNERFASLHVRRGDYLNDALYQNTCGIDYYQRAISQMNAMNANPSCYFIFSDDIAWCKTHIEPLCEGHRPYYIDWNKGKEAYRDMQLMALCKYHIIANSSFSWWGAWLNDAEDGITIAPQQWYSHGNKPSPASESWIKV | 100
Lachnospiraceae bacterium COE1 | WP_016299568.1 | 511045640 | protein [Lachnospiraceae bacterium COE1] | 26.3 | | MNIVRISDGLGNQMFQYAYARKISILSRQRTYLDIRFINNEDLVKKGNHVQFRKKLGHRKYGLSHFNVSLQIADLKMLSHWEYLIQSNCMQQLIYSLSMQDKWIWRYRHEEVNYDGMLSKVELLFPTYYQGYFFALKYYDDIKHILQHDFSLKDKMKLLPEDRDALYNRNTISLHVRRGDFLEINRDISGSEYYEKAVQMIGSKVESPIFLIFSDDIEWVHEHIRIPNDKIYVSGIGYEDYEELTIMKHCKHNIIANSTFSYWAAYLNSNKDKIVICPKHWRERIIPKDWICI | 101
Bacteroides dorei; Bacteroides dorei 5_1_36/D4 | WP_007842931.1 | 495118115 | alpha-1,2-fucosyltransferase [Bacteroides dorei] | 26.28 | | MIVVNVNAGLANQMFHYAFGRGLEAKGWNIFYDQTNFKPRKEWSFENVQLQDAFPNLGLKMMPEGKFKWICNVVTNKLSKGLHLAMINLHNLIGDEKYEFETTYGYDPDIEKEITKNCILKGFWQSEKYFAHCKDDIRKQFSFLPFDEEKNIVIMNKMVKENSVAIHLRKGADYLDSELMGKGLCGVEYYIKAIEYIKKNIDNPVFYVFTDNPVWVKNNLPKFDYILVDWNEVAGKKNFRDMQLMSCAKHNIIANSTYSWWGAWLNPNPNKIVIGPAKFFNPINNFFSSSDIMCEDWVKI | 102
Roseobacter sp. SK209-2-6 | WP_008210047.1 | 495485361 | alpha-1,2-fucosyltransferase [Roseobacter sp. SK209-2-6] | 26.28 | | MLSKDPGMITTRLHGRLGNQMFQYAAGRALAARLGVPLADSRGAKLRGEGVLTRVFDLPLAQPLSLPPLKQDAPLRYAAWRLTGRPFRFRREQGLGYNPAFETWGDDSYLHGYWQSEAYFDSIADQIRQDFTFPEFSNSQNREMAQRIAGSTAISLHVRRDGYVALAAHVLCDQAYYEAALTRILEGVEGSPTVYVFSDDPNWAKENLPLPCEKVVVDFNGPDTDFEDMRLMSLCQHNIIGNSSFSWWAAWLNTHNEKRVAGPAHWFGNPKLQNPDILPESWLKISV | 103
alpha proteobacterium SCGC AAA076-C03 | WP_020056701.1 | 518900826 | protein [alpha proteobacterium SCGC AAA076-C03] | 26.26 | | MIYSRIRGGLGNQLFQYCVARSLADNLGTSLGLDVRDFNENSPYLMGLKHFNIRADFNPPGMIEHKKNGYFRYLIDVVNGKQKFVYKEPHLNFDKNIFSLPSNNYLKGYWQTEKYFIKNKVNILNDLKIISHQSDKNKTISSKIANNTSVSLHIRRGDYISNSAYNSTHGTCSLAYYTNAVNFLVNKIGGNFKVFAFSDDPEWVSSNLKLPVDICFVKNNSSEYJYEDLRLMSECNHNIIANSSFSWWGAWLNTNHNKTVITPCKWYADNSTKNADITPSNWIKI | 104
Helicobacter bilis; Helicobacter bilis WiWa | WP_004087499.1 | 490188900 | protein [Helicobacter bilis] | 26.26 | | MGGGGQDLRLFELMLYNISLPLCFDYKTLVKYFYSNDKSLKYNFPLQYIRATRSKYHKLYWLALKHYKYFYDEDPQGDNIVKMYLNNSLEKHAYPFGYFQNLIYFDEIDSIIREEFCLKIPLKPHNQALKEKIEKTENSVFLHVRLGDYLKMEATDGGYVRLGKEYYQSALEILKTRLGQPHIFIFSNDIEWCEKNLCNLLDFTGCHIEFVKANGEGNAAEEMELMRACKHAVIANSTFSWWASYLIDNPDKQIIMPTQVFNDTRRIPKSNMLAKKGYILIDPFWGMHSIV | 105
Ralstonia sp.

WP_
498513378
glycosyl
26.26

MIVTRVIGGLGNQMFQYAAGRALARRLGVPLKIDSSGFADYPLHNYGLHHFALKAVQAGDREIPSGRAENR
106


GA3-3
010813809.1

transferase


WAKALRRFGLGTELRVFRERGFAVDPEVMKLPDGTYLDGYWQSESYFAEMTQELRRDFQIATPPTSENAE






family protein


WLARIGGDEGAVSIHVRRGDYVTNASANANHGICSLDYYMRAARYVAENIGVKPTFYVFSDDPDWVAGNL






[Ralstonia


HLGHETRYVRHNDSARNYEDLRLMSACRHHIIANSTFSWWGAWLNASEKKVVIAPAQWFRDEKYDTRDLL






sp. GA3-3]


PPTWTKL







Bacteroides

WP_
490431888
protein
26.25

MVVVYIAAGLANKMFQYAFSRGLMSHGLDVFLDQTSFQPEWSFEDIALEEVFPNIEIKNAPNNMFSLAYKK
107



ovatus;

004303999.1

[Bacteroides


DLLSRIYRRMSAFFPNNRYLMERPFIYDELIYKKATNNCIFCGLWQTELYFNFCERDVRRNFVFTPFQDDQNI




Bacteroides




ovatus]



KLAEKMKNENSVAIHIRKGADYLKRNIWDGTCSVEYYNQAINYLKEHVSNPVFYLFTDNPEWVEENLKNIDY




ovatus 3_8_






KLVDWNPVSGKQSYRDMQLMSCAKHNIIANSTYSWWGAWLNNNPQKIVVAPKIWFNPKIEKAPYIIPDR



47FAA





WIRL







Loktanella

WP_
518799952
protein
26.23

MIITKLIGGLGNQMFQYAAGRSLAMRHGVPLLLDITELRSYPKHQGYQFEDVFAGRFEIAGLIPLIRVLGRKAR
108



vestfoldensis

019955906.1

[Loktanella


KVPKTVAVVSPKWPPMGDHVWVRQRTHDYDAAFESIGADCYLSGFWQSEKYFATIAPQIRESFRFKEALTG







vestfoldensis]



ANAAIASRMKEAPSAAIHIRRGDYVTDKGAHAFHGLCAWDYYDAAIDHISRHEPDARFFVFSDDVVAAQER









FANRQRAEVVAVNSGRHSYRDMMLMAQCKHQIIANSTFSWWAAWLNQNPDKIVVAPGTWFSGNDGQI









KDIYCKDWIVI







Flavobacter-

WP_
515558304
protein
26.14

MDVVIIFNGLGNQMSQYAFYSQKKKINNSTYFVPFCKDHNGLELETVFSLNTKETLIQKSYILFRILLTDRLKIV
109



ium sp.

016991189.2

[Flavobacter-


SDPLKWILNLFKCKIVKESFNYNYNPEYLKPSKGITFYYGGWAEKYFAKENQQIKSVFEFTGDLGKINKEHVK



ACAM 123



ium sp.



DIASTNAVSLHVRRGDFMNEANIGLFGGVSTKAYFEGAIKLIATKVDHPHFFVFSNDMDWVKENLSMDTVT






ACAM 123]


YVTCNSGKDSWKDMCLMSLCQHNIIPNSTFSWWGAWLNKNPHKIVVCPSRFLNNDTYTDIYPDSWVKISD









Y







Bacteroides

WP_
492219620
glycosyl
26.1

MMKLVRMTGGLGNQMFIYAFYIQMKTIFPELRIDMSEMKKYKLHNGYELEDVFSIRPQTISAHKWLKRVIV
110



fragilis;

005779407.1

transferase


YAFFSIIREKSEEELSIHKYTQHKRWPLVYYKGFFQSELFFKESSDTIRDIFSFNTENANFRTKEWAKIIKEQRSSV




Bacteroides



family 11


SIHIRRGDYTSAKNKIKYGNICTEEYYQKAISIILKKEPKAFFHIFSDDVEWTKAHLKIHHLPHQYISWNKGPDS




fragilis



[Bacteroides


WQDMMLMSLCRHNIIANSSFSWWGAWLNAYKDKTVIAPSRWSNVKKTPHILPESWISISI



3_1_12



fragilis]











Spirosoma

WP_
522086793
protein
26.09

MIISRVTSGLGNQLFQYAAARSLSLRNKTAFYVDLSYYLYEYPDDTSRSFKLGFFSVPYRILQESPVEYLSKSTKL
111



panaciterrae

020598002.1

[Spirosoma


FPNRSLRPFFLFLKEKQFHFDPTILQAHAGCVIMEGFWQSECYFRDHAEIIRRELQLSKSPSSEFEGYHQQIQA







panaciterrae]



TPVPVSVHVRRGDYVNHPEFSKTFGFIGLDYYKTAIRHLTKTIKNPHFYVFSDDKEWARANLPLPTDSVFVTN









TGPSGDVADLVLMSTCHHHIIANSSFSWWGAWLNPNPDKLVITPKLWYKNQPTWNTKDLLPPTWSVSL






uncultured
EKE06672.1
406985982
glycosyl
26.09

MIITKLTGGLGNQLFQYAIGRNLIYINGSDLKLDVSEYDVSNKGNFRHYALDKFNTIQNFASKKETNNFKFGVF
112


bacterium


transferase


KKWLYKSGIVKNKNYFLEKKFNFDKEILKIKDNAFLQGYWQSEKYFIGIRDILLQEFSLKENIELKFGEILKEINES






family 11


NSVSIHVRRGDYVKNPKNLSFHGVCSPKYYSESTSKIASLIEKPVFFVFSDDIEWVKENLNITFPVVYLSGIKNIK






[uncultured


SYEELVLMSKCKHNIIANSSFSWWGAWLNTNQKKIVIAPKRWFNDVKLDTTDLIPENWIRI






bacterium]










Thermo-

NP_
22298537
alpha-1,2-
26.07

MIIVHLCGGLGNQMFQYAAGLAAAHRIGSEVKFDTHWFDATCLHQGLELRRVFGLELPEPSSKDLRKVLGA
113



synechococcus

681784.1

fucosyl-


CVHPAVRRLLAGHFLHGLRPKSLVIQPHFHYWTGFEHLPDNVYLEGYWQSERYFSNIADIIRQQFRFVEPLDP




elongatus;



transferase


HNAALMDEMQSGVSVSLHIRRGDYFNNPQMRRVHGVDLSEYYPAAVATMIEKTNAERFYVFSDDPQWVL




Thermo-



[Thermo-


EHLKLPVSYTVVDHNRGAASYRDMQLMSACRHHIIANSTFSWWGAWLNPRPDKVVIAPRHWFNVDVFD




synechococcus




synechococcus



TRDLYCPGWIVL




elongatus




elongatus







BP-1


BP-1]










Colwellia

WP_
517858213
protein
26.03

MKIVKIAGGFGNQLFQYAFYLALDKKYAEQVCLDSLDMAKYRLHNGYELEGIFKLDARYCTEEQRIIVRKDNN
114



piezophila

019028421.1

[Colwellia


IFTKLLSSLKKKLGNNKNYILEPKQEHFTFHEKSFGQANTPTYYKGYWQDVKYLENIEEELKSSLVFPEFELGKNI






piezophila]


ELANFISSNSSVSLHVRRGDYVQHKAFGGICDLSYYQRAVEQINTLVKDPIFIVFSDDIQSCKDNLNLEKAKFV









DWNIGENSFRDMQLMTLCKHNIIANSSFSWWGAWLNANDDKNVICPDKWVHYTSATGVLPSEWIKIKAS









V







Prevotella

WP_
518810840
protein


MKIVKIIGGLGNQMFQYALAIALQERWKDEEIKLDLHGFNGYHKHQGYQLDMLFGHRFEAATLTDVAQLA
115



maculosa

019966794.1

[Prevotella


WPYPHYQLWRVGSRLLPKRRSMLCEPSKGLLPSDVLKQKGSLYYDGYWQDERYFRAIRPQIMAAFKFPDFT






maculosa]


DRRNLETEKRLKASEAVSIHVRRGDYLDDVLFQGTCNIAYYQRAIARLCQLKTPVFCIFSNDMAWCKVHIEPL









LHGKEILYVDWNRGKESYRDLQLMTLCRHHIIANSSFSWWGAWLSKAEDGITIAPRHWYAHDAKPSPAAE









RWIKV







Salmonella

YP_
525860034
fucosyl-
25.99

MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFADEKEKIKLL
116



enterica;

008261369.1

transferase


RKFKRNPFPKQISEILSIALFGKYALSDRAFYTFETIKNIDKACLFSFYQDADLLNKHKQLILPLFELRDDLLDICKN




Salmonella



[Salmonella


LELYSLIQRSNNTTALHIRRGDYVTNQHAAKYHGVLDISYYNHAMEYVEREGKQNFIIFSDDVRWAWKAFL




enterica




enterica



ENDNCYVINNSDYDFSAIDMYLMSLCKNNIIANSTYSWWGAWLNKYEDKLVISPKQWFLGNNETSLRNAS



subsp.


subsp.


WITL




enterica




enterica








serovar




serovar








Worthington




Cubana str.







str. ATCC 


CFSAN002050]






9607; 










Salmonella











enterica










subsp.










enterica











serovar











Cubana str.










CFSAN001083;










Salmonella











enterica










subsp.










enterica











serovar











Cubana str.










CFSAN002050;










Salmonella











enterica










subsp.










enterica











serovar











Cubana str.










CVM42234













Bacteroides

WP_
495935021
protein
25.94

MKKVIFSGGLGNQMFQYAFYLFLKKKGIKAVIDNSLYSEFKMHNGFELIKVFDIKESIYRTYFLKVHLIFIKLLMK
117


sp. 3_2_5
008659600.1

[Bacteroides


IPPVRKLSCKDDVIPIGDHEFDPPYARFYLGYWQSKKIVNYVIEELRAQFIFRNIPQMTIEKGDFLSSINSVSIHIR






sp. 3_2_5]


RGDYMGIPAYQGICNEIYYERAISFMKEHFLNPRFYVFSNDSIWAKLFLEKFDIDMEIIVTPPIYSYWDMYLMS









RCRNHIIANSTFSWWAAVLNINKDKIVISPTIFKKDECIDIIFDDWVKISNI







Clostridium

WP_
547662453
protein
25.86

MIMLQMTGGMGNQMFTYALYRSLRQKGKEVCIEDFTHYDTPEKNCLQTVFHLDYRKADREVYQRLTDSEP
118


CAG:510
022124550.1

[Clostridium


DFLHKVKRKLTGRKEKIYQEKDAIIFEPEVFQTDDVYMIGYFQSGRYFEKAVFDLRKDFTFAWNTFPEKAKKLR






sp. CAG:510]


EQMQAESSVSLHIRRGDYMNGKFASIYGNICTDAYYEAARRYMKEHFGDCRFYLFTDDAEWGRQQESEDT









VYVDASEGAGAYVDMALMSCCRHHIIANSSFSWWGAWLDENPDKTVIAPAKWLNISEGKDIYAGLCNCLI









DANGSVQGE







Rhodopirellula

WP_
495940880
glycosyl
25.86

MIVTRLIGGLGNQLFQYAFGHSLARSTYQTLLIDDSAFIDYRLHPLAIDHFTISASRLSDADRSRVPGKFLRTPV
119



europaea;

008665459.1

transferase


GRALDKVSRFVPGYQGVLPVRREKPFGFRESLLARESDLYLDGYWQSEKFFPGLRGSLREEFQLREQPSETTR




Rhodopirellula



family protein


RLSAQMKSENSVAIHVRRGDYVTSAKAKQIYRTLDADYYRRCLLDLAAHETDLKLYLFSNDVPWCESNLDVGI




europaea



[Rhodo-


PFTPVQHTDGATAHEDLHLIAQCRHVVIANSTFSWWGAYLGQLHPTRRVYYPEPWFHPGTLDGSAMGCD



SH398



pirellula



DWISEASLEEQSSLKSSRRAA







europaea]










uncultured
EKD23702.1
406873590
glycosyl
25.82

MIIVKLKGGMGNQMFQYAIGRNLATKLGTQLRLDLTFLLDRSPRKDFVFRDYDLDIFALDVAFAGPTDLKPFT
120


bacterium


transferase


QFRISHLTKIYNIFPRLLGRPYVISEPHFHFSEAILKSSDNVYLDGYWQSEKYFKEIENSIRDDFKFRQPLEGRAA






family protein


EMAAQIKNEDRAVCLNVRRADFVTSKKAQEFHGFIGLDYYQKAVDLLVSKVGPLHLFIFSDDVDWCAANLK






[uncultured


FNYPTTFVTKDYSGKKYEAYLQLMTLCRHYIIPNSTFAWWGAWLNSDPNKIVIAPKQWFKEASISTTDIIPST






bacterium]


WIRL







Bacillus

WP_
446510160
protein
25.74

MIIVKLKGGLGNQMFQYALGKSLAYYDKPLKIDADYIKNNEGYVPRDFSLSKFNIELDLYQEADKERVGFILK
121



cereus;

000587678.1

[Bacillus


NNFLAKKLRNYFLKKGKYKGKYIIENPDNLGLFKKELFENHNESMYIDGYWQSYLFNNIRECLIKEFNLKPEYT




Bacillus




cereus]



KEMTEIMQRINETNSVAVHIRRGDYVKLGWTLDTTYYKKAIAEIVKNVDNPKFYVFSDDTDWVRSNLQELD




cereus AH1271






NAVFIGECNLFDYQELWLMSTCKHHNIISNSTFSWWGAWLNQNDHQVVVSPSAWINGMSVETTSLIPDSW









KRV






Firmicutes
WP_
548309386
protein
25.74

MDIIRMEGGLGNQLFQYALYRQLQFMGRTVKMDVTTEYGREHDRQQMLWAFDVHYEEATQEEINRLTD
122


bacterium
022499937.1

[Firmicutes


GFMDLPSRIRRKLTGRRTKKYAEADSNFDPQVLLKTPVYLTGYFQSEKYFKDVEGILHTELGFSDRIYDGISEVF



CAG:95


bacterium


ADQIRNYQKQIRETESVSLHVRRGDYLEHPEIYGMSCTMEYYQAGVRYIRERHPDAEIFVFTNDPVFTEKWL






CAG:95]


QENFLGDFTLIQGTSEETGYLDLMLMSQCKHQIMANSSFSWWGAWLNPNKDKIVVAPEPWFGDRNFHDI









YTEEMIRSPRGEVKKHG







Prevotella

WP_
490508875
alpha-1,2-
25.74

MIAATLFGGLGNQMFIYATVKALSLHYQVPMAFNLNHGFANDYKYHRKLELCKFNCQLPTAKWITFDYRGE
123



oris;

004374901.1

fucosyl-


LNIKRISRRIGRNLLCPNYQFVIEEEPFHYEKRLFEFTNKNIFLEGYWQSPCYFENYSKEIRADFQLKVPLSKEML




Prevotella



transferase


EEIYALKATGKTLVMLGIRRYQEVEGRDICTYKLCDEKEYYIKAITYIQERIPNALFVVFTQDKEWATTHLPKGAE




oris F0302



[Prevotella


FYFVKDKQDEYATVADMFLMTQCTHAIISNSTFYWWGAWLQCTTKNHIVIAPDSFINSDCVCKEWIILKRNS







oris]



LC







Escherichia

AAO37719.1
37528734
fucosyl-
25.73

MYSCLSGGLGNQMFQYAAAYILQRKLKQRSLVLDDYFLDCSNRDTRRRFELNQFNICYDRLTTSKEKKEISII
124



coli



transferase


RHVNRYRLPLFVTNSIFGVLLKKNYLPEAKFYEFLNNCKLQVKNGYCLFSYFQDATLIDSHRDMILPLFQINEDL






[Escherichia


LHLCNDLHIYKKVICENANTTSLHIRRGDYITNPHASKFHGVLPMDYYEKAIRYIEDVQGEQVIIVFSDDVKWA







coli]



ENTFANQPNYYVVNNSECEYSAIDMFLMSKCKNNIIANSTYSWWGAWLNTFEDKIVVSPRKWFAGNNKSK









LTMDSWINL







Leeia

WP_
516890767
protein
25.71

MIIVKIIGGLGNQMMQYAFAHACAKRLGVPFKLDTAFESYKLWPYGIHNFHTAPIASIEEIEHAKSMGVITE
125



oryzae

018150480.1

[Leeia


TSFRRDDSLVSAVKDGMYIQGYWADYRYSESVWGELKPVFTLMDPLTPEQQALAMNISAPNAVAIMVKR







oryzae]



GDYVRNPNCFLLPQQYYRDAIKLVLDQQPDAVIYCFSQDDWVIAMLIIPAPKWVRGQGIDNGFVDMIL









MSKARHRIVANSTRSIWASRWDQDGLTIVPSQFRRKDDPWLLQVYGPVLQPCYPPQWRWDVTGDGKKE









AEMTSTALLQIAGGDVRGRKLRIGVWGFYEEFYQNNYIFLNKNAPIGHELLKPFNQLYQYGQAHNLEFVTLDL









VADLSTLDAVLFFDAPNMRSPLVSSVMQLDIKKYLCLLECELIKPDNWQQSLHELFTRIFTWHDGLVDNMYI









KYXQLMPWIESAOSLTAPFTETAKKGYLQKKLICWISGNKLVSHPFELYSKRIEVIRWRESHHPEHFDLYG









MGWSASQYPSYKGKIDDKIVLXGYRFSLCYFNAKFLPGYITEKIIDCFKAGWPVYBCGAPNIAQWIPUNCFID









SGKDTDALYTYLISMTEEVHADYLENIRQRRLGGKAYPFSADAFINMTRTIVQDCLFPHERTDVSWVPNY









NHGNFWSAITSALNQNVSVELLVLDNASTDDSWSQLQFFADYPQVRLIRNRWNIGVQHNWNHATWLAT









GRYVVMLSADDLLLPGHLEQAVKHLDHNPASSIYYTPCLWINEHDQPLGTLNHPGHLESDYVGGRDCISDLL









KRDSYTPSAAVIRRETLNRIGSMNLHLKGAIDWDLWIRIAEISPAFIFRKQPGVCYRQHSGNNSVDFYASTAP









LEDHIRIVESIIDRKVAVKYLLXAKEEIIAHLDNRMSYPENQIQHLLSRINNIKDYLRKGAGPVISVIIPTICNRPGI









IAIMAIFSITYTTFKDFTVVHNDGGCDIGGIVDRFSDQLQISYVRSSQSGGAAASRNRALKLAKGRIIAYLDDD









DVYLDSHLEKLVDAYKKKSFKHYINTYLIQERKEGRUELGRERRYAGISYSRAALLVSNRIPTPTWSHTKCLI









DTIGDFDESLEILEDWDFLLRASKVTEFYQVNAIIVIVKSDKSRDUHTIRANADKLLAYHQKIYAKHPVINISI









LANRQSLINSLSNRQDNPKNENSYQGWVNARQPNELAVQIIJIFRMMLOWSQYQFMIVMVVKQSQQ









NLIANTIDSFCQQLYSGWLIVISDFSAPDESFINNEVLGWLTLETVEDENLLTQAFNGVLAEVPSQWVILPV









GTRLTSTALLKVGDRIILNGGACVIYTDHDYVSDUGMIKDPVLKPAFNLDMLRSQDYIGSSIFFRTDSIAAVG









GFASFPGARTYEACFRMLGNYGPQTIEHLPEPVMTrPENQPENSLRVAAMQLALEEHLHRNNISASIEEGYV









TGTFLVQYHHSEQPRVSMIPNKDKHEFLAPCIETLMKVTQYPARCVIIVDNQSTDPDTLIYYEMESRFANNVK









VIQYDNPFNFSAQCNLGAESATQDRILRLNNDTEIVCIILERMMQMAQRNDVGVVGARLVFPETVTIQH









AGIVLGGKYPDEVFQFPYMNFPVDKDVSLNRTKVVQNYSAVRGACLLVRKSLYQQVGGMNEQNLAVLYGD









VDLCLRIRQLHKSWWTPPSTLVHHTGKRLNSNSQHHKHLMMVIQTRQEREYMLSHWLDIIANDPYYHRLL









DKSECNGTIDCTHTPLWDDIPSARPRLQGMALVGGSGPYRVNIVIPFHILERSAIAIMSNFCRRSKARLPSITEL









ARNAPQVFWQNALADEFIRMLCMYKKYLPSVFRIQMLDDLLTElPDASSFKRHRQKNWRDAKARLRKSLKF









CDRLIVSTFPLRTFAEDMIDDIIVVPNMLERSVWGDLVSKRRAGKKPRVGWVGAQQHAGDLALMTDWKA









TGHEVDWVFCKGMCPDIRPYVAFVNTRWLTYDKYPQGIAALNLDLAIAPLEINAFNEAKSNLRLLEYGALG









WPVICTDIYPYQTNNAPVCRVPNDASAWIEARSHIADUGATAUXILRQWVHDHYMIFDHAQFWLSA









LTRPAGK







Desulfovibrio

WP_
492830219
Glycosyl
25.68

MFQYAAARALSLRHSASLAADLTWFSQQFDVQTTPREYALPAFRLNLPEADKRIVATFRLNPTELRIVSFLRH
126



africanus;

005984173.1

transferase


RICFPSRFLPRHITELSFDYWDGFRDILPPAYLDGYWQSERYFSDYPDIIRADFSMLSISEQAAWMSAKIASVQ




Desulfovibrio



family 11


DSISLHIRRGDYVNSLATRKAHGIDTERYYAKALEWIADRIGAATIFAFSDDPRWVRANFDFGKHKGIVVDGS




africanus



[Desulfovibrio


WTAHEDMHLMSLCSHHIIANSSFSWWGAWLSTSQGITIAPKSWFSNPHIWTPDVCPATWERIPC



PCS



africanus]











Akkermansia

WP_
547786341
Glycosyl
25.66

MAKGKIIVMRLFGGLGNQLFQYAFLFALSRQGGKARLETSSYEHDDKRVCELHHFRVSLPIEGGPPPWAFRK
127



muciniphila

022196965.1

transferase


SRIPACLRSLFAAPKYPHFREEKRHGFDPGLAAPPRRHTYFKGYFQTEQYFLHCREQLCREFRLKTPLTPENARI



CAG:154


family 11


LEDIRSCCSISLHIRRTDYLSNPYLSPPPLEYYLRSMAEMEGRLRAADAPQESLRYFIFSDDIEWARQNLRPALP






[Akkermansia


HVHVDINDGGTGYFDLELMRNCRHHIIANSTFSWWAAWLNEHAEKIVIAPRIWFNREEGDRYHTDDALIP







muciniphila



GSWLRI






CAG:154]










Dysgonomonas

WP_
493897667
protein
25.66

MKIVKLQGGLGNQMFQYAIARTLETNKKKDIFLDLSFLRMNNVSTDCFTARDFELSIFPHLRAKKLNSLQEKF
128



mosii;

006843524.1

[Dysgonomonas


LLSDRVRYKFIRKIANINFHKINQLENEIVGIPFGIKNVYLDGFFQSESYFKHIRFDLIKDFEFPELDTRNEALKKTI




Dysgonomonas




mosii]



VNNNSVSIHIRRGDYVHLKNANTYHGVLSLEYYLNCIKRIGEETKEQLSFFIFSDDPEYASKSLSFLPNMQIVD




mosii DSM






WNLGKNSWKDMALMLACKHHIIANSSFSWWGAWLSERGITYAPVKWFNNESQYNINNIIPSDWVII



22836













Prevotella

WP_
490506359
glycosyl
25.66

MDIVLIFNGLGNQMSQYAFYMSKKKFVPQSKCMYYKGASNNHNGSELDKLFDIKYSETFFCKLILLLFKLYENI
129



oris;

004372410.1

transferase


PRLRKYFHILGINIVSEPQNYDYNESILKKKTRFGITLYKGGWHSEKYFLANKQDVLNTFSFKIAKEDKNFIDLAK




Prevotella



family 11


SIEEDTNSVSLHVRRGDYLNISPTDHYQFGGVATTNYYKNAVSYMLKRNKQAHFYIFSDDITWCKAEYKDLM




oris F0302



[Prevotella


PTFIECNKKNKSWRDMLLMSLCTNHINANSTFSWWGAWLSTKNGITICPTEFIHNVVTRDIYPETWVQL







oris]











Pseudo-

WP_
496239055
glycosyl
25.66

MIIVRLMGGMGNQLFQYATAFALSKRKSEPLVLDTRFFDHYTLHGGYKLDHFNISARILSKEEESLYPNQWA
130



gulbenkiania

496239055

transferase


NLLLRYPIIDRAFKKWHVERQFTYQDRIYRMKRGQALLGYWQSELYFQEYRKEISAEFTLKEQSSVTAQQISV




ferrooxidans



family 11


AMQGGNSVAVHIRRGDYLSNPSALRTHGICLSGYYNHAMSLLNERINDAQFYIFSDDIAWAKENIKIGKTSK



2002;


[Pseudo-


NLIFIEGESVETDFWLMTQSKHHIIANSTFSWWGAWLANNTDEQLVICPSPWFDDKNLSETDLIPKSWIRLN




Pseudo-




gulbenkiania



KDLPV




gulbenkiania




ferrooxidans]








ferrooxidans














Salmonella

WP_
446208786
protein
25.66

MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFTDEKEKIKLL
131



enterica

000286641.1

[Salmonella


RKFKRNPFPKKISEILSIALFGKYALSDSAFYAVETIKNIDKACLFSFYQDADLLNKHKQLILPLFELRDDLLDICKN







enterica]



LDVYPLILRNNNTTALHIRRGDYLTNQHAAKYHGVLDTSYYNNAMEYVEREGKQNFIIFSDDVKWAQKAFL









GNENCYIVNNGDYDYSAIDMYLMSLCKNNIIANSTYSWWGAWLNKSEDKLVISPKQWFLGNNETSLRNAS









WIIL







Carno-

YP_
554649642
glycosyl
25.59

MLIVKVYGGIGNQMFQYSFYKYLQKNNDDVFLDISDYKVHNHHNGFELIDVFNIEVKQADMSKFKGHVSSK
132



bacterium sp.

008718688.1

transferase


NSIFYRLTSKLFKRNILGYSEFMDSNGISIVRNEKILTDHYFIGFWQDVLYLQSVEEEIKEAFNFKNVAIGKQNLE



WN1359


family 11


LISLSESVESVSVHIRKGDYANNSDLSDICDLEYYEEAMKIIDSKVSEPLYFIFSDDIEWCKQKFGKRDNLIYVD






[Carno-


WNIAKKSYIDMLLMSKCKHNIIANSTFSWWGAWLNNNSKKIVICPKTWDRKKNENHLLLNDWIAI







bacterium sp.










WN1359]










Prevotella

WP_
547227670
protein
25.58

MMKIIVNMACGLANRMFQYSYYLFLMHKGYNVKVDFYNSAKLAHEKVAWNDIFPKARIEQASFSDILKSG
133


sp. CAG:1185
021964668.1

[Prevotella


GGSDVISKIRRKYLPFLSSVVNMPTAFDANLPVENKKLQYIIGVFQNANMVEAVEEDVKRCFKFQPFTDERNL






sp. CAG:1185]


KLQNEMQSCESVAIHVRKGKDYAQRIWYQNTCPIEYYQNAIRLISEKVNNPKLYVFTDNPEWVKEHFKDFPY









TLVEGNPASGWGSHFDMQLMSVCKHNIISNSTYSWWSAFLNVHNEKIVIGPKVWFNPDSCSEFTSERILCK









DWIAV







Selenomonas

WP_
497331130
glycosyl-
25.58

MFQYAMASSVARRAGEILKLDLSWIRQMEKKLSADDIYGLGIFSFDEKFSTSNEVQKFLPSGKFSAKIYRAVN
134


sp. CM52
009645343.1

transferase,


RRMPFSWRRVLEEGGMGWHPQIMEIRRSVYFYMGYWQSEKYFSDFIQEIFRKDFTFREEVRQSIEERRPIVE






family 11


KIRKSDAVSLHIRRGDYAQNPALGEIFLSFTMQYYIDAARYISERVKTPVFFIFSDDIPWAKENLPLPYEVCYIDD






[Selenomonas


NIQTNEREIGHKSKGYEDMYLMTQCQHNIIANSSFSWWGAWLNHNPNKIVVAPKKWCNGSFNYADIVPE






sp. CM52]


QWVKL







Bacteroides

WP_
494751213
protein
25.57

MEIVFIFNGLGNQMSQYALYSKRNLGCKVRYAYNIRSLSDHNGFELDRVFGITYPNNLFNKCINIIYRLLFAN
135



nordii;

007486621.1

[Bacteroides


KYLFLVQKMIYMLRQMNVYSIKEKDNYDYDYKILTRHKGIVLYYGGWHSEKYFLSNADIIKDKFRFNISKLNSES




Bacteroides




nordii]



LVLYHRLSSLNAVALHVRRGDYMAPEHYNVFGCVCGIEYYKAAIQYIQSQILNPVFIVFSNDIEWVKENITGIQ




nordii






MIFVDFNKKENSWMDMCLMSCCEHNIISNSTFSWWGAWLNNNKNKIVVCPKYFMSNIDTKDIYPESWIKI



CL09T00C40













Para-

WP_
491855386
protein
25.54

MKKKDIILRVWGGVGNQLFIYAFAKVLSLITDCKVTLDIRTGFANDGYKRVYRLGDFSISLLPALRFTLLSFAQ
136



bacteroides

005635503.1

[Para-


RKMPYIRHLLAYKFDFFEEDQKYPLETLDSFFKIYSDKNLYLQGYWQYFDFSSYRDVLLKDLRFEVEINNTYLYY




merdae;




bacteroides



SDLIEKSNAVAIHFRRIQYEPVISIDYYKKAIKYISENVENPTFFIFSDDINWCRENLSINGICFFVENFKDELYELK




Para-




merdae]



LMSQCNHFIIANSTFSWWGAWLSVNADKKVIMPDGYTDVSMNGSIVHI




bacteroides











merdae ATCC










43184;










Para-











bacteroides











merdae










CL09T00C40













Butyrivibrio

WP_
551024004
protein
25.51

MIIIQLKGGLGNQMFQYALYKELKHRGRDVDKIDDESGFIGDKLRVPVLDRFGVEYDRATKDEVIALTDSKMDI
137


sp. NC2007
022768139.1

[Butyrivibrio


FSRIRRKLTGRKTFRIDEMEGIFDPKILETENAYLVGYQWSEKYFTSPEVIEQIQEAFGKRPQEIMHDSVSWST






sp. NC2007]


LQQIECCESVSIHVRRTDYMDAEHIKIHNLCSEKYYKNAISKIREEHPNAVFFIFTDDKEWCKEHFKGPKFITVE









LQEGEFTDVADMLLMSRCKHHIIANSSFSWWSAWLNDSPEKIVIAPSKWINNKKMDDIYTERMTKVAI







Bacteroides

WP_
490430100
protein
25.5

MIVVYSNAGLANRMFHYALYKALEVKGIDVYFDEKSYVPEWSFETTTLMDVFPNIQYRESLQFKRASKKTFL
138



ovatus;

004302233.1

[Bacteroides


DKIVIHCSNLFGGRYYVNYRFKYDDKLFTKLETNQDLCLIGLWQSEKYFMDVRQEIQKCFQYRSFVDDKNVKT




Bacteroides




ovatus]



AQQMLSENSVAIHVRKGADYQQNRIWKNTCTIDYYRLAIDIRMHVQNPVFYVFTDNKDWVIENFTDLDY




ovatus ATCC






TLCDWNPTSGKQNYLDMQLMSCAKHNVIANSTYSWWGAWLNENSDKIVIAPKRWFNKIVTPDILPEQWI



8483;





KI




Bacteroides











ovatus










CL02T12C04













Mesotoga

YP_
389844033
glycosyl
25.5

MRVVWFGGGLGNQMFQYGLYCFLKKNNQEVKADCTQYSTTPMNNGFELERLFNLDIAHANLDVISKLTG
139



prima

006346113.1

transferase


GNRLSPRKVIWKLFRKPKVYFEEKIPFSFDPDVLKGNNRYLKGYWQNMNYLEPCAKELRDVFTFPAFSSDNN



MesG1.Ag.4.2;


family protein


KRLADEIAKVEAVGHVFRRGDFLKSSNLGLFGGICSDQYYLRAIQTMENTVVEPVFYVFCDDPQWAKNSFSD




Mesotoga 



[Mesotoga


ARFTVIDWNIGSNSYRDMQLMSLCKHNIIANSTFSWWAAWLNRNPNRTVIAPERMVNRDLDFSGIFPND




prima




prima



WIRLQG






MesG1.Ag.4.2]










Clostridium

WP_
545399562
glycosyl-
25.49

MIVLKLQGGLGNQMFEYAFARTIQEQKKDKKLILDTSDFQYDKQREYSLGHFILNENIEIDSSGKFNLWYDQR
140


sp. KLE
021639228.1

transferase,


KNPLLKVGFKFWPKFQFQTLKFGIYVWDYAKYIPVDVSKKHKNILLHGLWQSDKYFSQISEIIRKEFAVKDEP



1755


family 11


SQGNKAWLERISSANAVCVHIRRGDFLAKGSVLLTCSNSYYLKAMEIISKKVNEPEFFIFSDDIEDVKKIFEFPG






[Clostridium


YQITLVNQSNPDYEELRLMSKCKHFIIANSTFSWWSSLLSENEDKVIVAPRLWYSDGRDTSALMRDEWIIIDN






sp. KLE 1755]


E







Bacteroides

WP_
547321746
glycosyl-
25.42

MDLVTLSGGLGNQMFQFAFYWALKKRGKKVFLYKNKLAAKEHNGYETQTLFGVEEKCVDGLWMTRLLGC
141



plebeius

022052991.1

transferase,


PLLGKILKHILFPHKIRERVLYNYSIYLPLFERNGLHWVGYWQSEKYFQDVADDIRRIFCFDHLSLNPATSAALK



CAG:211


family 11


CMSEQVAVSVHIRRGDYYLPCNVATYGGLCTVEYYENAIRYVKEYPQAVFYVFSDDLDWVRENIPSAGKM






[Bacteroides


VFVDWNRGKDSWQDMFLMSKCHHNILANSSFSWWGAWLNTHPEKLVIAPERWANCPAPDALPDGWV







plebeius



RIEGVSRR






CAG:211]










Treponema

WP_
545448980
glycosyl-
25.4

MAIKIVKISGGLGNQMFCYAFACALQKCGHKVYVDTSLYRKATVHSGIDFCHNGLETERLFGIKFDEADTAD
142



lecithinoly-

021686002.1

transferase,


VRRLSTSAEGLLNRIRRKYFTKKTHYIDTVFKYPELLSDKNDCYLEGYWQTEKYFLPIEKDIRRLFTFRPTLSEKS




ticum;



family 11


AAVQSALQAQQAAVLSASIHVRRGDFLNTKTLNVCTETYYNNAIKYAVKKHAVSRFIFSDDIPWCREHLCFC




Treponema



[Treponema


NAHAVFIDWNTGNDSWQDMALMSMCRCNIIANSSFSWWAAWLNNASDKTVLAPAIWNRRQLEYVDRY




lecithinoly-




lecithinoly-



YGYDYSDIVPESWIRIPID




ticum ATCC




ticum]







700332













Bacteroides

WP_
490419682
glycosyl-
25.34

MRLIKMTGGLGNQMFIYAFYLRMKKRHTNTRIDLSDMMHYNVHHGYEMHRVFNLPKTEFCINQPLKKVIE
143



eggerthii;

004291980.1

transferase


FLFFKKIYERKQDPSSLLPFDKKYLWPLLYFKGFYQSWERFFADMENDIRIAFTFNSDLFNEKTQAMLTQIKHNE




Bacteroides



[Bacteroides


HAVSLHIRRGDYLEPKHWKTTGSVCQLPYYLNAITEMNKRIEQPSYYVFSDDIAWVKENLPLPQAVFIDWNK




eggerthii




eggerthii]



GAESWQDMMLMSHCRHHIICNSTFSWWGAWLNPRENKTVIMPERWFQHCDTPNIYPDGWIKVPVN



DSM 20697













Bacteroides

WP_
491891563
glycosyl-
25.34

MRFIKMTGGLGNQMFIYAFYMRMKKHYSTRIDLSDMVHYKAHNGYEMHRVFNLPPIEFRINQPLKKVIEF
144



stercoris;

005656005.1

transferase


LFFKKIYERKQVPSSLVPYDKKYFWPLLFKGFYQSERFFADMADDIRKAFTFNPRLSNRKTKEMSEQIDHDE




Bacteroides



[Bacteroides


NAVSIHVRRGDYLEPKYWKTTGCVCQLPYYLNAIAEMNKRISQPSYYVFSDDIAWVKENPLPKAFFIDWNK




stercoris




stercoris]



GAESWQDMMLMSRCRHHIICNSTFSWWGAWLNPRENKTVIMPERWFRHCETPDICPDKWIKVPINQPD



ATCC 43183





SIG







Butyrivibrio

YP_
302671882
glycosyl
25.34

MIIIQLKGGMGNQMFQYALYRQLKKLGREKIDDETGFVDDELRIPVLQRFGISYDKATREEIVKLTDSKMDI
145



proteo-

003831842.1

transferase


FSRIRRKLTGRKTFRIDEESGIFDPRILEVEDAYLVGYWQSDKYFANEEVEKEIREAFEKRPQEVMQDSVSWTI




clasticus;



11 


LQQIECCESVSLHIRRTDYIDEEHIHIHNICTEKYYKSAIDEVRNQYPSAVFFIFTDDKDWCRQHRFGPNFFVVD




Butyrivibrio



[Butyrivibrio


LDEDTNTDIAEMTLMSRCKHHILANSSFSWWAAWLNDNPGKIVIAPSKWINNRKMDDIYTARMKKIAI




proteo-




proteo-








clasticus




clasticus







B316


B316]










Roseobacter

WP_
495504071
alpha-1,2-
25.34

MSPIVHFPSDRLLRYEHLSLWKTAMIYTRLLARLGNQMFQYAAGRGLAARLGVDFTDSRRAVHKGDGV
146


sp. GAI101
008228724.1

fucosyl-


LTRVFDLDWAAPENMPPAQHERPLAYYAWRGLRRDPKIYRENGLGYNAAFETLPDNTYLGHYWQCERYFA






transferase


HIADDIRAAFVPRHPMSAWNADMARRIASGPSVSLHVRRGDYLTVGAHGICDQTYYDAALAAVMQGLPSP






[Roseobacter


TVYVFSDDPQWAKDNLPLTFEVVVDFNGPDSDYEDMRLMSLCQHNVIANSSFSWWGAWLNANPQKRV






sp. GAI101]


AGPANWFSNPKLSNPDILPSRWIRI







Thalassobacter

WP_
544666256
alpha-1,2-
25.34

MGQDMIYSRIFGGLGNQLFQYATARAVSLRQGVELVLDTRLAPPGSHWAFGLDHFNISARIAEPSELPPSKD
147



arenae;

021099615.1

fucosyl-


NFFKYVMWRAFGHDPAFMRERGLGYQSRIAQAPDGTYLHGYFQSERYFADVLDHLENELRIVTPPDTRNA




Thalassobacter



transferase


EYADRIASAGHTVSLHVRRGDYVETSKSNSTHATCDEAYYLRALARLSEGKSDLKVFVFSDDPEWVRDNLKLP




arenae



[Thalasso-


YDTTPVGHNGPDKPHEDLRLMSCCSDHVIANSTFSWWGGWLDRRPEARVVGPAKWFNNPKLVNPDILPE



DSM 19593



bacter











arenae]











Prevotella

WP_
490511493
protein
25.33

MKIIKIIGGLGNQMFQYALAVALQKKWKDEEIKLDLHGFNGYHKHQGYQLDEIFGHRFKAASLKEVAQLAW
148



oris;

00437401.1

[Prevotella


PYPHYQLWRVGSRLLPKRKTMVCESADCRFQSDLLNLEGSLYYDFYWQDERYFKAFRTEIIEAFKFTPLVGDS




Prevotella




oris]



NRKVENMLKEGRFASLHVRRGDYLKEPLFQSTCDIAYYQRAISRLNQMADYPYCYLIFSNDIAWCKTHIEPLCD




oris C735






GRRTHYVDWNHGKESYRDMQLMTFCKHHIIANSSFSWWGAWLSTANDGITIAPHQWYANDRKPSPAAE









AWLKL







Prevotella

WP_
490514606
protein
25.33

MKIVRIIGGLGNQMFQYALALALKQQQENEEVKLDLSAFRGYKKHGGFQLVQCFGTTLPAATWQEVAQLA
149



culorum;

004380180.1

[Prevotella


WYYPHYQLWRLGHRVLPVCKTMLKEPDNGAFLPEVLQRKGDAYYEGCWQDERYFSHYRPAILQAFTFPTF




Prevotella




oulorum]



TNPRNLAMQQQINTTESVAIHVRRGDYLHDALFRNTCGLAYFQRAITCILQHVAHPVFYVFSDDMAWCRQ




oulorum






HIQPLLQTNEAVFVDWNHGKASICDLHLMTLCRHHIIANSSFSWWGAWLSPHQAGWIIAPKQWYAHEEK



F0390





MSPAAERWLKL







Spirosoma

WP_
522084965
protein
25.33

MNRRVAVQLKGGLGNQLFQYALGRRLSLQLEAELLFDCSVLENRIPVTNFTFRSFDLDMFRIAGRVATPSDL
150



panaciterrae

020596174.1

[Spirosoma


PLFPKSASIRSPWPHLVQLARLWKQGYSVYERGFAYNPKMLRQLSDRVYLNGYWQSYRYFEDIAATLRAD







panaciterrae]



CSFPDLPDSAVGLAGQINATNSICLHIRRTDFLQVPLHQVSNADYVGRAIAYMAERVNDPHFFVFSDDIAW









CQTNLRLSYPVVFVPNELAGPKNSLHFRLMRYCKHFITANSTFSWWAWWLSEPSDGKVIVTPQTWFSDSRSI









DDLIPANWIRL







Butyrivibrio

YP_
302669866
glycosyl-
25.26

MNYVEVKGGLGNQLFQYTFYKYLEKKSGHVLLHTDFFKNIDSFEEATKRKLGLDRFDCDFVAVSFFISCEKL
151



proteo-

003829826.1

transferase 11


VKESDYKDSMLSQDEVFYSGYWQNKRFFLEVMDDIRKDLLLKDENIQDEVKELAKELRAVDSVAIHRFFGDY




clasticus;



[Butyrivibrio


LSEQNKKIFTSLSVDYYQKAIAQLAERNGADLKGYIFTDEPEYVSGIIDQLGSIDIKLMPVREDYEDLYLMSCAR




Butyrivibrio




proteo-



HHIIANSSFSWWGAALGDTESGITIAPAKWYVDGRTPDLYLRNSWISI




proteo-




clasticus








clasticus



B316]






B316













Butyrivibrio

WP_
551021623
protein
25.26

MIIIQLKGGLGNQMFQYALYKELKHRGREVKIDDVSGFVNDKLRVPVLDRFGVEYERATREEVVELTDSRMD
152


sp. XPD2006
022765786.1

[Butyrivibrio


IFSRIRRKLTGRKTYRIDEMEGIFDPAILETENAYLVGYWQSEKYFTSPEVIEQIQEAFGKRPQEIMHDSVSWST






sp. XPD2006]


LQQIECCESVSIHVRRTDYVDAEHIKIHNLCSEKYYKNAIGKIREDHPNAVFFIFTDDKEWCKDHFKGPNFITVE









LQEGEFTDVADMLLMSRCKHHIIANSSFSWWSAWLNDSPEKMVIAPSKWINNKKMDDIYTERMTRVAI







Bacteroides

WP_
496041586
protein
25.24

MKIVNITGGLGNQMFQYAFAMALKYRNPQEEVFDIQHYNTIFFKKFKGINLHNGYEIDKVFPKAKLPVAGV
153


sp. 1_1_6
008766093.1

[Bacteroides


RQLMKFSYWIPNYISLRLGRKFLPIRKKEYIPPSYMNYSYDEKALNWKGDGYFEGYWQSYNHFGDIKEELQK






sp. 1_1_6]


VYAHPKPNQYNAALISNLESCNSVGIHVRRGDYLAEFEFRGICGLDYYEKGIKEILSDEKKYVFFIFNDMQWC









QENIAPLVGDNRIVFISGNKGKDSCWDMFLMTHCKDLIIANSSFSWWGAFLNKKVDRVICPKPWLNRDCNI









DIYNPSWILCPCYSEDW







Bacteroides

YP_
53713865
alpha-1,2-
25.17

MKIVTFQGGLGNQLFQYVFYWLDMRCDKDNIYGYYPKKGLRAHNGLEIEKVFEVKLPNSSLSTDLIVKSIKLI
154



fragilis;

099857.1

fucosyl-


NKIFKNRQYISTDGRLDVNGVLFEGFWQDKYFWEDVDILNFRWPLKLDVTNSFIMTKIQANNSISIHIRRG




Bacteroides



transferase


DYLLPKYRNIYGDICNEEYYQKAIEYILKCVDDPFFFVFSDDIDWAKSIINVSNVTFVNNNKGKDSYIDMFLMS




fragilis



[Bacteroides


LCHHNIIANSTFSWWAAQLNKHSDKIMIAPIRWFKSLFKDPNIFTESWIRI



YCH46



fragilis










YCH46]










Bacteroides

WP_
548260617
alpha-1,2-
25.17

MIKIVSFSGGLGNQLFQYLLVVYLRECGHQVYGYYNRKWLIGHNGLEVNNVFDIYLPKTNFIVNALVKVIRVL
155


sp. 9_1_42FAA
022477844.1

fucosyl-


RCLGFKKYVATDTYNNPIAIFYDGYWQDQKYFNIIDSKLSFKKFDLSAENSILSKIKSNISVALHIRCGDYLSSS






transferase


NVEIYGGVCTKEYYEKALELVCKIKNVMFFVFSDDIEYAKLLLNLPNAIYVNANVGNSSFIDMYLMANCKVNV






[Bacteroides


IANSTFSYWAARLNQDNILTIYPKKWYNSKYAVPDIFPSEWVGV






sp. 9_1_42FAA]









Coralio-
WP_
548260617
glycosyl-
25.17

MIIVKVQGGLGNQMFQYAFGRALSEKHSQDLYLDCSEYLRPSCKREYGLDHFNIRAKKASCDVKSMVTPH
156



margarita sp.

022477844.1

transferase


FALRKKLKKIFAVPYSLSPTHILERNFNFQPSILEFNCGYFDGFWQTQKYFSGISDIVRKDTFKDAVKYSGGET



CAG:312


family 11


FAKITSLNSVSLHIRRGDYYKVKRTRKRFSVIRAGYFKRAVEYMRSKLDTPHFFIFTDDPKWVSENFPAGEDYT






[Coralio-


LVSSSGMYEDLFLMAQCRHNIIFNSSFSWWGAWLNGNPGKIVVAPDMWFTPHYKLDYSDVVPEEWIKLN







margarita sp.



TGYFESKEF






CAG:312]










Pseudor-

WP_
550957292
alpha-1,2-
25.17

MIVMQIKGGLGNQMFQYAAGRALSLQTGMPLHLDLRYYRREREHGYGLGAFNIEASPLDESLLPPLPRESPL
157



hodobacter

022705649.1

fucosyl-


AWLIWRLGRRGPNLVRENGMGFNPTLSNVTKPAWITGYFQSERYFAAHAATIRAELTPVAAPDLVNARWL




ferrugineus



transferase


AEIAAEPRAVSLHVRRGDYVRDAKAAAKHGSCTPAYYERALAHITARMGTAPVVYAFSDDPAWVRENLRLP






[Pseudor-


AEIRVPGHNDTAGNVEDLRLMSACRHHIVANSSFSWWGAWLNPRADKIVASPARWFADPAFTNPDIWPE







hodobacter



AWARIEG







ferrugineus]











Escherichia

YP_
215487252
fucosyl-
25.16

MMYCCLSGGLGNQMFQYAAAYILKQHFPDTILVLDDSYYFNQPQKDTIRHLELDQFKIIFDRFSSKDEKVKIN
158



coli;

002329683.1

transferase


RLRKHKKIPLLNSFLQFTAIKLCNKYLSNDSAYYNPESIKNIDVACLFSFYQDSKLLNEHRDLILPLFEIRDDLRVL




Escherichia



[Escherichia


CHNLQIYSLITDSKNITSIHVRRGDYVNNKHAAKFHGTLSMDYYISAMEYIESECCGSQTFIIFTDDVIWAKEKFS




coli O127:H6




coli O127:H6



KYSNCLVADADENKFSVIDMYLMSLCNNNIIANSTYSWWGAWLNRSEDKLVIAPKQWYISGNECKLKNEN



str. E2348/69


str. E2348/69


WIAM







Lachno-

WP_
511537894
protein
25.16

MIIIKVMGGLGNQMQQYALYEKFKSIGKNKVKLDISWFEDSSVQEKVFARRSLELRQFKDLQFDTCSAEEKEA
159



spiraceae

016359991.1

[Lachno-


LLGKSGILGKLERKLIPARNKHFYESDIYHSEVFNMSDAYLEGHWACEKYYHDIMPLLQEKIQFPESANSQNIT




bacterium




spiraceae



VKKRMKAENSVSIHIRRGDYLDPENEAMFGGICTNSYYKAAEEYIKSRVPDTHFYLFSDDTAYLRENYHGDEY



3_1_57FAA_



bacterium



TIVDWNKGEDSFYDMELMSCCRHNICANSTFSFWGARLNRTPDKIVIPAKHKNSQEIEPQLLHELWDNW



CT1


3_1_57FAA_


VIIDGDGRIV






CT1]










Butyrivibrio

WP_
551010878
glycosyl-
25.09

MKPLVSLIVPVLNVEKYLEQCLTSISSQTYDNFEVILVVGKCIDNSENICKKWCEKDHRFRIEPQLKSCLGYARN
160



fibrisolvens

022755397.1

transferase


VGIDAAKGEYIAFCDSDDCITSDFLSCFVDTALKNSSDIVETQFTLCDQNLSPIYDYDRNILGHILGHGFLEYTSA






[Butyrivibrio


PSVWKYFVKRDIFTSNNLHYPEIRFGEDISMYSLLFSYCNKIDYVEKPTYLYRQVPSSLMNNPQGKRKRYESLF







fibrisolvens]



DIIDFVTNEFKTRLLFQKSWLKLLFQLEMHSASIISDSATSDDEAISMRQEISGYLKKVFPVKNTIFEVTALGWG









GEIVSSIASKFNTLHGVSSSNMFNFYFFELLEDSTRKKLEEMIINFSPDIFLIDLISEADYLSSYKGNLGTFVKNW









KIGFSIGMKMIQTHSNNSSIFLLENYMQQAPDHVDNTNEILKMLYDDIKINHPDIICISPAPDILNRSSEPELPCI









YQLKLVSDKLHTMYSPVINCVETKGGLGNQIFQYVSKYIEKMTGYRPLLHIGFFDYVKAIPGGTKRIFSLDKLF









PDIETTSGKIPCSHVVEEKSFISNPGSDIFYRGYWQDIRYFSDVKDEVLESFNVDTSSMSKDVIDFADTIRNANS









IAMHIRRQDYLNENNVSLFEQLSIDYYKSAVDMIRKEYADDLVLFIFSDDPEYANSIADSFDIEGFVMPLHKDY









EDLYLITLAHHHIIANSTFSLWGALLSARKDGIRIAPRNWFKGTPATNLYPDKWLIL







Anaeromusa

WP_
517532751
protein
25.08

MFCVRIYGGLGNQMFQYALGRAMAKHYSETAAFDLSWYEQKIKPGFEASVCQYNIELSRKDRPKAWYEPIL
161



acida-

018702959.1

[Anaeromusa


KRISRHTDKLEMWFGLFFFEKKYHTDSTVFERGLCKKNITLDGYWQSYKYFSAIEDDLRRELTIPKEREELIAISRS




minophila




acida-



LPENSVSIHVRRGDYVSNPKANAMHGTCSWEYYQAAIEKMTGLVKEPQYWFSDDITWTKENLPLPNAMYI







minophila]



GRELGLFDYEELILMSRCKHNIMANSTFSWWGASLNSNPNKVVIAPRKWFRHKKIKVNDLFPSSWVVL







Bacteroides

WP_
496044479
glycosyl
25.08

MDIVVIFNGLGNQMSQYAFYLAKKKDNLNCHVIFDPKSTNVHNGAELKRVFGIELNRNYLDKIISYFYGYIFN
162


sp. 2_1_16
008768986.1

transferase


KRIVNKLFSLVGIRMIYEPKNYDYFEELLKPSSNFISFYWGGWHSEKYFKDIELEVKKVFKFPEVTNSPYFTEWF






family 11


NKIFLDNNSVSIHIRRGDYLDKPSDPYYQFNGVCTIDYYEKAILYLKERILEPNFYIFSNDINWCMKTFGTENMY






[Bacteroides


YVDCNKGKDSWRDMYLMSECRHHINANSTFSWWAAWLSPYSNGIVLHPKYFIKDIETKDYYPQKWIMIE






sp. 2_1_16]










Chlorobium

YP_
189500849
glycosyl
25.08

MDKVVVHLTGGLGNQMFQYALGRSISINRNCPLLLNTSFYDTYDKFSCGLSRYNVKAEFIKKNSYYNNKYYRY
163



phaeo-

001960319.1

transferase


VIRLLSRYGVACYFGSYYEKKIFSYDEKVYKRSCVSYYGTWQSYGYFDSIRDILLRDYEMVGCLEEEVEKYVSDIK




bacteroides;



family protein


RVDSVSLHIRRGDYFDNKRLQSIHGILTMEYYYKAMSLFPDSSVFYVFSDDIEWVRENLITNTNIVYVVLESDN




Chlorobium



[Chlorobium


PENEIYLMSLCKNNIISNSTFSWWGAWLNKNKYKKVIAPRMWYKDNQSSSDLMPSDWCU




phaeo-




phaeo-








bacteroides




bacteroides







BS1


BS1]










Treponema

WP_
551312724
protein
25.08

MIVISMGGGLGNQMFEYAFYTQLKHLYPKSEIKVDTKYAFPYSHNGIEVFKIFGLNPPEANWKEVHSLVKTYP
164



bryantii

022932606.1

[Treponema


IEGNKAHFIKFFLYRILRKANLVEREPTSFCKQKDFTEFYNSFFELPQNNKSFYLYGPFVNYNYFAAIHNEIMDLYT







bryantii]



FPEITDVTNIEYKRKIESSHSISIHIRRGDYITEGVPLVPDAYYREALVYINKKIEDPHFFVFTDDKDYCKSLFSDN









QNFTIVEGNTGANSFRDMQLMSLCKHNIIANSTFSFWGAFLNKNSEKIVIAPNIAFKDCSCPYICPDWIIL







Bacteroides

YP_
675358171
LPS
25

MVIAKLFGGLGNQMFIYAAAKGIAQISNQKLTFDIYTGFEDDSRFRRVYELKQFNLSVQESRRWMSFRYPLG
165



fragilis;

005110943.1

biosynthesis


RILRKISRKIGFCIPLVNFKFIVEKKPYHFQNEIMRIASFSSIYLEGYFQSYKYFSKIEAQIREDFKFTKEVIGSVEK




Bacteroides



alpha-1,2-


ASFITNSRYTPVAIGVRRYSEMKGEFGELAVVEHDYYDAAIKYIANKVPNLIFIVFSEDIDWVKKNLKLDYPVYF




fragilis



fucosyl-


VTSKKGELAAIQDMYLMSLCNHHIISNSSFYWWGAYLASTNNHIVIAPSVFLNKDCTPIDWVII



638R


transferase









[Bacteroides










fragilis










638R]









Firmicutes
WP_
547951298
protein
25

MSGGLGNQMFYALYMKLTAMGREVKFDDINEYRGEKAWPIMLAVFGIEYPRATWDEIVAFTDGSMDFS
166


bacterium
022352105.1

[Firmicutes


KRLKRLFRGRHPIEYVEQGFYDPKVLSFENMYLKGSFQSQRYFEDILEEVQETFRFPELKDMNLPAPLYETTEK



CAG:534


bacterium


YLLRIEGCNAVGLHMYRGDSRSNEELYDGICTEKYYEGAVRFIQDKCPDAKFFIFSNEPKWVKGWVISLMKS






CAG:534]


QIREDMSREEIRALEDHFVLIENNTEYTGYLDMFLMSRCRHNIISNSSFSWWAFINENPDKLVTAPSRWVN









GVPSEDVYVKGMTLIDEKGRVERTIKE






Firmicutes
WP_
547971670
glycosyl-
25

MVIVKIGDGLGNQMFNYVCGYSVAKHDNDTLLLDTSDVDNSTLRTYDLDKFNIDFTDRESFTNKGFFHKVYK
167


bacterium
022368748.1

transferase


RLRRSLKYNVIYESRTENCPCVLDVYRRKFIRDGYLHGYFQNLCYFKTCKEDIMRQFTPKEPFSAKADELIHRFA



CAG:882


family 11


TENTCSVHVRGGDIKPLSIKYYKDALDKIGEAKKDMRFIVFSNVRNLAEEYIKELGVDAEFIWDLGEFTDIEELF






[Firmicutes


LMKACRRHILSDSTFSRWAALLDEKSEEVFVPFSPDADKIYMPEWIMEEYDGNEEKR






bacterium









CAG:882]










Vibrio

WP_
491639353
glycosyl-
25

MVIVKVSGGLGNQLFQYAIGCAISNRLSCELLLDTSFYPKQSLRKYELDKFNIKAKVATQKEVFSCGGGDDLLS
168



parahae-

005496882.1

transferase


RFLRKLNLSSLFFPNYIKEKESLVYLAEISHCKSGSFLDGYWQNPQYFSDIKDELVKQIVPIMPLSSPALEWQNII




molyticus;



family 11


INTKNCVSLHVRRGDYVNNAHTNSVHGVCDLSYYREAITNIHETVEKPKFFVFSDDISWCKDNLGSLGHFTYV




Vibrio



[Vibrio


DNTLSAIDDLMLMSFCEHHIIANSTFSWWGAWLNDHGITIAPKRWFSSVERNNKDLFPEKWLIL




parahae-




parahae-








molyticus




molyticus]







10329;










Vibrio











parahae-











molyticus;










10296;










Vibrio











parahae-











molyticus;










12310;










Vibrio











parahae-











molyticus;










10290













Herba-

WP_
493509348
glycosyl-
24.92

MIVSRLIGGLGNQMFQYAAGRALALRRGVPFAIDSRAFADYKTHAFGMQCFCADQTEAPSRLLPNPPAEGR
169



spirillum

006463714.1

transferase


LQRLLRRFLPNPLRVYTEKTFTFDEAVLSLPDGIYLDGYWQSEKYFADFADDIRKDFAVKAAPSAPNQAWLEL




frisingense;



family protein


IGRTHSVSLHIRRGDYVSNAAAAANHGTCDLGYYERAVAHLHQVTGQAPELFVFSDDLDWVATNLQLPYT




Herba-



[Herba-


MHLVRDNDAATNFEDLRLMTACRHHIVANSSFSWWGAWLDGRSESITIAPARWFVADTPDARDLVPQR




spirillum




spirillum



WVRL




frisingense;




frisingense]







GSF30













Rhizobium

WP_
495034125
Glycosyl-
24.92

MIITRILGGLGNQMFQYAAGRALAIANEAELKLDLIEMGAYKLRPFALDQFNIKAAIAQPDEVPAKPKRGLLR
170


sp. CF080
007759661.1

transferase


KFTSAFKPDRSSCERIVENGLTFDSRVPALRGSLHLSGYWQSEQYFASSADAIRSDFSLKSPLGPARQDVLARI






family 11


GAATTPVSIHVRRGDYVTNPSANAVHGTCEPPWYHEAMRRMLDRAGDASFFVFSDEPQWARDNLQSSRP






[Rhizobium


MVFIEPQNNGRDGEDMHLMAACHAHIIANSSFSWWGAWLNPRPNKHVIAPRQWFRAPDKDDRDIVPA






sp. CF080]


TWERL







Verrucomicro-

WP_
497645196
glycosyl-
24.85

MVISHISEGLGNQMFQYAAGRRLSYHLGTTLKLDDYHYRLHPFRSFQLDRFLITSPIATDAEISHLCPLEGLAR
171



bium

009959380.1

transferase


AIRARLPGKLRGATLRLLGNLGLGSPYQPRLHSFKEETPKQPLLIGKVVSERHFHFDPDVLECPDNVCLVGYW




spinosum



family 11


QDERYFGEIRDILLRELTLKSPPAGATKAVLERIQRSSSVSLHVRRGDKTKSSSYHCTSLEYCLAAMSEMRARL






[Verrucomicro-


QAPTFFVFSDDWDWVREQIPCSSSVIHVDHNRAEDVSEDFRLMKSCDHHIIASSSLSWWAAWLGTNENSF







bium



VSPPADRWLNFSNHFTADVLPPHWIQLDGSSLLPAQ







spinosum]











Fibrella

YP_
436833833
glycosyl-
24.83

MTANRVLVNSPMVIAKITSGLGNQLFQYALGRHLALQGNTSLWFDLRFHQEYATDTPRKFKLDRFNVRYN
172



aestuarina;

007319049.1

transferase


LLDSSPWLYASKATRLLPGRSLRPLIDTRFEADFHFDPTVIRPAAPLTILWGFWQSEDKYFAQSTPQIRQELTFN




Fibrella



family 11


RPLSDTFVGYQQQIEQAEVPISVHVRRGDYVTHPEFSQSFGFVGLAYYQKALAHLQDLFPNATLFFFSDDPD




aestuarina



[Fibrella


WVRANIVTEQPHVFVQNSGPDADVDDLQLMSLCHHHVIANSSFSWWGAWLNPRPDKVVIGPQRWFAN



BUZ 2



aestuarina



KPWDTKDLLPSGWLRL






BUZ 2]









Rhodobacter
WP_
563380195
alpha-1,2-
24.83

MIHMRLVGGLGNQLFQYACGRAVALRHGETLVLDTRELSRGAAHAVFGLDHFAIRARMGASADLPPPRSR
173


sp. CACIA14H1
023665745.1

fucosyl-


VLAYGLWRAGFMAPRFLRERGLGVNPAVLAAGDGTYLHGYFQSEAYFRDVVPQIRPELEIVTPPSDDNLRW






transferase


ASRIAGDDRAVSLHVRRGDYVASAKGQQVHGTCDADYYARAVAAIRARAGIDPRLYVFSDDPHWARDNLA






[Rhodobacter


LDAETVVLDHNPPGAAVEDMRLMGVCRHHIIANSSFSWWGAWRNPSAGKVVVAPVRWFADPKLHNPDI






sp. CACIA14H1]


CPPEWLRV







Rhodo-

NP_
32475785
fucosyl-
24.83

MATSAHLHLSDEKQTLDSKASDRDCATTEASASDKTCTISISGGLGNQMLQYAAGRALSIHHDCLSQLDLKF
174



pirellula

868779.1

transferase


YSSKRHRSYELDAFPIQAHRSIKPSFFSQILSKIQSESKHVPTYQEQSKRFDPAFFNTEPPVKIRGYFFSEKYFSPY




baltica SH 1;



[Rhodo-


ADQIRTELTPPIPPDQPADRMAIRLKECVSTSLHVRRGDYVTNANARQRFWCCTSEYFEAAIERLPTDSTVFV




Rhodo-




pirellula



FSDDIEWAKQNIRSSRTTVYVNDELKKAGSPETGLRDLWLMTHAKSHIIANSSFSWWGAWLANSEANLTIA




pirellula




baltica SH 1]



PKKWFNDPEIDDSDIVPSSWHRI




baltica














Spirosoma

WP_
522092845
protein
24.76

MVVVELMGGLGNQMFQYAFGMQLAHQRQDTLTVSTFLLSNKLLANLRNYTYRPFELCIFGIDKPKASPFNL
175



spitsbergense

020604054.1

[Spirosoma


LRALLPFDLNTSLLRETDDPEAVIPAASARIVCVGYWQSEHYFEEVTVHVREKFIFRQPFNSFTSRLANNLNGI







spitsbergense]



PNSVFVHIRRGDYVTNKGANAHHGLCDRTYYERAVTFMREHLENPLFFIFSDDLEWVSQELGPILEPATYVG









GNQKNDSWQDMYLMSLCRHAIVANSSFSWWGAWLSPHASKIVVAPKEWFGKPLLPVKTNDLIPSNWIRI






uncultured
EKD71402.1
406938106
protein
24.76

MNAIIPRITGGIGNQLFIYAAARRMAIANSMNLVIDDTSGFKYDVLYKRFYQLEKFNITSRMATPTERLEPFSK
176


bacterium


ACD_


IRRYLKRKINKTYPFAQRAYITQEKSGFDPRLLVFRPKGNVYLDGYWQSENYFKDIEGIIRQDLIIKSPSDSLNIA






46C00193


TAERIKNTLAIAVHVRFFDMVDISDSSNCQSNYYHTAIAKMEEKIPNAHYFIFSDKPVLARLAMPLPDDRITIID






G0003


HNIGDMNAYADLWLMSLCKHFVIANSTFSWWGAWLSDNKEKIVIAPDITITSGVTQWGFDGLIPDEWIKL






[uncultured









bacterium]










Prevotella

WP_
494008437
protein
24.75

MDVIVIFNGLGNQMSQYAFYLEKRLRNRQTTYFVLNPRSTYELERLFGIPYRSNLMCRMIYKLLDKAYFSNHI
177



micans;

006950883.1

[Prevotella


RLKKILRTALNAVGIRLIVEPITRNYSLSNFTHHPGLTFYRGGWHSELNFTSVVTELRRKFIFPPSDDEEFKRISAL




Prevotella




micans]



IIRTQSISLHIRRGDYLDYSEYQGVCTEEYYERAIEYIRSHVENPVFFVSDDKEYAINKFSGDDSFRIVDFNTGE




micans F0438






NSWRDMQLMSLCRHHILANSTFSWWGAWLDSAPEKIVLHPIYHMRDVPTRDFYPHNWIGISGE







Thermo-

AHB87954.1
564737556
alpha-1,2-
24.75

MIIVRLYGGLGNQMFQYAAGLALSLRHAVPLRFDLDWFDGVRLHQGLELHRVFDLDLPRAAPSEMRQVLG
178



synechococcus



fucosyl-


SFSHPLVRRLLVRRRLRWLLPQGYALEPHFHYWPGFEALGPKAYLDGYWQSERYFSEYQDAVRAAFRFAQP



sp. NK55


transferase


LDERNRQIVEEMAACESVSLHVRRGDFVQDPVVRRVHGVDLSAYYPRAVALLMERMREPRFYVFSDDPD






[Thermo-


WVRANLKLPAPMIVIDHNRGEHSFRDMQLMSACRHHILANSSFSWWGAWLNSQPHKLVIAPKRWFNVD







synechococcus










sp. NK55]










Coleo-

WP_
493031416
Glycosyl-
24.73

MLSLNKNFLFVHIPKSCILKEVYIYMISFPNLGKGVRLGNQMFQYAFLRSTARRLGVKFYCPAWSGDSLFTLN
179



fasciculus

006100814.1

transferase


DQEERVSQPEGITKQYRQGLNPGFSENALSIQDGTEISGYFQSDKYYDNPDLVRQWFSLKEEKIASIRDRFSRL




chthono-



family 11


NFANSVGMHLRFGDVVGQLKRPPMRRSYYKKALSYIPNQELILVFSDEPERTKKMLDGLSGNFLFLSGHKNY




plastes;



[Coleo-


EDLYLMTKCQHFICSYSTFSWWGAWLGGERERTVIYPKEGQYRPGYGRKAEGVSCESWIEVQSLRGFLDDY




Coleo-




fasciculus



RLVSRLEKRLPKSLMNFFY




fasciculus




chthono-








chthono-




plastes]








plastes PCC










7420













Bacteroides

WP_
517496220
glycosyl-
24.66

MRLIKMTGGLGNQMFIYAFYLRMKKRHTNTRIDLSDMMHYNVHHGYEMHHVFNLPKTEFCINQPLKKVIE
180



gallinarum

018666797.1

transferase


FLFFKKIYERKQDSSNLLPFDKKYFWPLLYFKGFYQSERFFADMENDIRKAFTFNSGLFNEKTQTMLKQIEHNE






[Bacteroides


HAVSLHVRRGDYLEPKHWKTTGSVCQLPYYINAIAEMNRRIEQPFYYVFSDDIAWVKENLPLPQAVFIDWN







gallinarum]



KGVESWQDMMLMSHCRHHIICNSTFTWWGAWLNPKENKTVIMPERFQHCETPNIYPAGWIKVPIN






Firmicutes
WP_
547967507
glycosyl-
24.66

MNNVEIMGGLGKQLFQYAFSRYLQKLGVKNVVLRKDFFTIQFPENNGITKREVFLDKYNTRYVAAAGEKTYR
181


bacterium
022367483.1

transferase


DYCDENDYRDDYAIGSDEVLYEGYWQNIDFYNVVRKEMQEELKLKPEFIDNSMAAVEKDMSSCNSVALHIR



CAG:882


GT11 family


RSDYLTQVNAQIFEQLTQDYYASAVSIIEQYTHEKPVLYIFSDDPEYAAENMKDFMGCRTVIMPPCEPYQDM






[Firmicutes


YLMTRAKHNIIANSTFSWWGATLNANPDNITVAPSRWMKGRTVNLYHKDWITL






bacterium









CAG:882]










Bacteroides

WP_
494296741
protein
24.6

MIAVNVAGLANQMFHYAFGFGLMAKGLDVCFDQSNFKPRSQWAFELVRLQDAFPSIDIKVMPEGHFK
182



xylanisolvens;

008021494.1

[Bacteroides


WVFPSLPRNGLERRFQEFMKKWHNFIGDEVYIDEPMYGYVPDMEKCATRNCIYKGFWQSEKYFRHCEDDI




Bacteroides




xylanisolvens]



RKTFTFPLFDELKNIEVAAKMSQENSVAIHLRKGDDYMQSELMGKGLCTVDYYMKAIDYMRKHINNPHFY




xylanisolvens






VFTDNPCWVKDNLPEFEYILVDWNEVSGKRNFRDMQLMSCAKHNIIGNSTYSWWAWLNANQDKIVVG



CL03T12C04





PKRFFNPINSFFSTCDIMCEDWISL







Geobacter

YP_
322418503
glycosyl-
24.58

MIGMVIFRAYNGLGNQMFQYALGRHLALLNEAELKIDTTAFADDPLREYELHRLKVQGSIATPDEIAFFREM
183


sp. M18
004197726.1

transferase


ENTHPQAYLRLTQKSRLFDPAILSARGNIYLHGFWQTEKYFADIREILLDEFEPIVPAGEDSIKVLSHMKATNA






family protein


VALHVRRSDYVSNPMTLRHHGVLPLDYYREAVRRIAGMVPDPVFFIFSDDPQWAKDNIRLEYPAFCVDAHD






[Geobacter


ASNGHEDLRLMRNCKHFIIANSSFSWWGAWLSQNTGKKVVAPLKWFAKPEIDTRDIVPLQWIRI






sp. M18]










Ruegeria

YP_
56698215
alpha-1,2-
24.57

MITTRLHGRLGNQMFQYAAARGLAARLGTQVALDTRLAESRGEGVLTRVFDLDLAQPDQLPPLKGDGLLR
184



pomeroyi

168587.1

fucosyl-


HGAWRLLGLAPRFRREHGLGYNAAIETWDDGTYLHGYWQSERYFAHIAARIRADFAFPAFSNSQNAEMAA



DSS-3


transferase


RIGDTDAISLHVRRGDYVALAAHTLCDQRYYAAALTRLLEGVAGDPVVYLFSDDPAWARDNLALPVQKVVV






[Ruegeria


DFNGPETDFEDMRLMSLCRHNIIGNSSFSWWAAWLNAHPGKRIAAPASWFGDAKLHNPDLLPPDWLKIE







pomeroyi



V






DSS-3]










Lachno-

WP_
511037973
protein
24.52

MIIIQLAGGLGNQMQQYAMYQKLLSLGKKVKLDISWFEEKNRQKNVYARRELELNYFKKAEYEACTEEERKA
185



spiraceae

016291997.1

[Lachno-


LVGEGGFAGKIKGKLFPGTRKIFRETEMYHPEIFDFEDRYLYGYFACEKYYADIMEILQEQFVFPPSGNPENQK




bacterium




spiraceae



MAERIADGESVSLHIRRGDYLDAENMAMFGNICTEEYYAGAIREMKKIYPSAHFFVFSDDIPYAKETYSGEEF



28-4



bacterium



TVVDINRGKDSFFDIWLMSGCRHNICANSTFSFWGARLNRNKGKVVMRPFIHKNSQKFEPELMHELWKG






28-4]


WVFIDNRGNIC







Prevotella

WP_
547254188
glycosyl-
24.49

MRILVFTGGLGNQMFEYAFYKHLKSCFPKESFYGHYGVKLKEHYGLEINKWFDVTLPPAKWWTLPPVVGLFYL
186


sp. CAG:1092
021989703.1

transferase


YKKLVPNSKWLDLFQREWKHKDAKVFFFPFKFTKQYFPKENGWLKWKVDEASLCEKNKKLLQVIHDEETDCFV






family 11


HVRRGDYLASNFKSIFEGCCTLDYYKRALEYMNKNNPKVRFICFSDDLEWMRKNLPMDDSAIYVDWNTGT






[Prevotella


DSPLDMYMMSQCDNGIIANSSFSYWGAYLGGKKTTVIYPQKWWNMEGGNPNIFMDEWLGM






sp. CAG:1092]










Spirosoma

WP_
517447743
protein
24.41

MVISVLSGGLGNQLFQYAFGLKLAALQLQTELRLERHLLESKAIARLRQYTPRTYELDTFGVEAPAASLMDTVS
187



luteum

018618567.1

[Spirosoma


CLSRVASLDKTALLLRESTLTPNAINNLNNRVRDVVCLGYWQSEEYFRPATEQLRKHLVFRKNPAQSRSMAD







luteum]



TILSCQNAAFVHIRRGDYVTNTHANQHHGLCDVSYYRRACEYVKECIPDVQFFVFSDDPDWAKRELGIHLQP









ARFIDHNRGADSWQDMYLMSLCRHAIVANSSFSWWGAWLNPVAERLVVAPGQWFVNQPVLSQQIIPPH









WHCL







Marinomonas

YP_
333906886
glycosyl-
24.34

MIIVDLSGGLGNQMFQYACARSLSIELNLPLKVVYGSLASQTVHNGYELNRVFGLDLEFATENDMQKNLGFF
188



posidonica

004480472.1

transferase


LSKPILRKIFSKKPLNNLKFQNFFPENSFNYSSLFSYIKDSGFLQGYWQTEKYFLNHKSQILKDFCFVNMDDE



IVIA-Po-


family protein


TNISIANDIQSGHSISIHVRRGDYLTNLKAKAIHGHCSLDYYLKAIEFLQEKIGESRLFIFSDDPEWVSENIATRFS



181;


[Marinomonas


DVSVIQHNRGVKSFNDMRLMSMCDHHIIANSSFSWWGAWLNPSQNKKIIAPKNWFVTDKMNTIDLIPSS




Marinomonas




posidonica



WILK




posidonica



IVIA-Po-181]










Bacteroides;

WP_
492425792
glycosyl-
24.32

MKIVVFKGGLGNQLFQYAFYKYLSRKDETFYFYNDAWYNVSHNGFELDKYFKTDDLKKCSRFWIILFKTILSKI
189



Bacteroides

005839979.1

transferase


YHWKIYVVGSVEYQYPNHLFQAGYFLDKKYYDENTIDFKHLLLSEKNQSLLKDIQNSNSVGVHIRRGDYMTK



sp. 4_3_47FAA;


family 11


QNLVIFGNICTQKYYHDAIRIITEKVNDAVFYVFSDDISWVQTHLDIPNAVYVNWNTGESSIYDMYLMSSCKY




Bacteroides



[Bacteroides


NIIANSTFSYWAARLNKKTNMVIYPSKWYNTFTPDIFPESWCGI



sp. 3_1_40A;










Bacteroides











dorei










5_1_36/D4;










Bacteroides











vulgatus










PC510;










Bacteroides











dorei










CL03T12C01;










Bacteroides











vulgatus










dnLKV7













Candidatus

WP_
519013556
protein
24.32

MTIRIKLTGGLGNQMFQFATGFAIAKKKNVRSLSDLKYINKRKLFNGFELQKIFNIYSKVSFLNKTLSFKSINFTE
190



Pelagibacter

020169431.1

[Candidatus


ILNRIDTTFYNFKEPHFHYTSNILNLPKHSFLDGYWQSELYFNEFATEIKRIFNFSGKLDKSVLLVADDINRNNSI




ubique




Pelagibacter



SIHIRRGDFLLKQNNNHHTDLKEYYLKAINETSKIFKNPKYFIFSDDTSWTVDNFVIDHPYIIVDINFGARSFLD







ubique]



MYLMSLCKSNIIANSSFSWWSAWLNNNKDKIIYAPKNWFNDKSICTDDLIPESWNIIL







Bacteroides

WP_
519013556
protein
24.29

MSVIINMACGLANRMFQYAFYLYLQKEGYDAYVDYFTRADLVHENVDWLRIFPEATFRRATARDIRKMGG
191


sp. CAG:875
020169431.1

[Bacteroides


GHDCFSRLRRKLLPMTTKVLETSGAFEIILPPKNRDSYLLGAFQSAKMVESVDAEVRRIFTFPEFESGKNQYFQ






sp. CAG:875]


TRLAQENSVGLHIRKGKDYQERIWYKNTCGVEYYRKAVDLMKEKVDSPSFYVFTDNPAWVKENLSWLEYKL









VDGNPGSGWGSHCDMQLMSLCKHNIISNSSYSWWGAYLNNTLNKIVVCPRIWFNPESTKDFSSNPLLAEG









WISL







Butyrivibrio

WP_
551011911
protein
24.29

MIIIKLQGGLGNQLFLYGLYKNLKHLKRDVKMDIESGFEGDELRKPCLDCMNLEYAIATRDEVTDIRDSYMDI
192



fibrisolvens

022756327.1

[Butyrivibrio


FSRIRRKITGRKTFDYYEPEDGNYDPKVLEMTKAYLNGYFQSEKYFGDEESVKALKDELTKGKEDILTSTDLITKI







fibrisolvens]



YHDIKNSESVSLHIRRGDYLTPGIIETYGGICTDEYYDKAIAMIRETFPEARFFIFSNDIEWCKEKFAGDKNILFV









NTIGINLDSEDNIKIGKSDKDISEYRDLAELYLMSACKHHILANSSFSWWGAWLSDHEGMTIAPSKWLNNKN









MTDIYTKDMLLI







Roseburia

YP_
347532692
glycosyl-
24.22

MVTVKIGDGMGNQMYNYACGYAAAKRSGEKLRLDISECDNSTLRDYELDHFRVVYDEKESFPNRTFWQKL
193



hominis;

004839455.1

transferase


YKRLRRDIRYHVIRERDMYAVDARVFVPARRGRYLGHYWQCLGYFEEYLDDLREMFTPAYEQTDAVRELM




Roseburia



family protein


QQFTQTPTCALHVRGGDLGGPNRAYFQQAIARMQKEKPDVTFIVFTNDPKAKECLDDGEARMYRIAEFGE




hominis A2-



[Roseburia


ALSDIDEFFLMSACQNQIISNSTYSTWAAYLNTLPGRIVIGPKFHGVEQMALPDWIVLDGGACQKGEIDAV



183



hominis A2-










183]










Rhodo-

WP_
495934621
alpha-1,2-
24.16

MATSVHPHLSDGKQALDSKAAQQVCSTQAASASDRACTISIGGLGNQMLQYAAGRALSIHHDCPLQLDLK
194



pirellula;

008659200.1

fucosyl-


FYSSKRHRSYELDAFPIQAQRWIKPSFFSQVLDKIQGESKSAPTYEEQSKRFDRAFFDIELPARIRGYFFSEKYFL




Rhodo-



transferase


PYADQIRTELTPPVPLDQPARDMAQRLSEGMSTSLHVRRGDYVSNANARQRFWSCTSEYFEAAIEQMPAD




pirellula



[Rhodo-


STVFVFSDDIEWAKQNIRSSRPTVYVNDELKLAGSPETGLRDLWLMTHASKSHIIANSSFSWWGAWLSGSEA




europaea 6C




pirellula



NLTIAPKKWFNDPEIDDSDIVPTSWRRI







europaea]











Rudanella

WP_
518832653

protein

24.16

MVIAKITSGLGNQLFQYALGRHLAIQNQTRLWFDLRYYHRTYETDTPRQFKLDRFSIDYDLLDYSPWLYVSKA
195



lutea

019988573.1

[Rudanella


TRLLPGRSLRPLFDTRKEPHFHLDPAVPNAKGAFITLDGFWQSEGYFASNAATIRRELTFTRQPGPMYARYR







lutea]



QQIEQTQTPVSVHIRRGDYVSHPEFSQSFGALDDTYYQTALAQINGQFPDATLLVFSDDPEWVRQHMFRER









PHVLVENTGPDADVDDLQLMSLCHHHIIANSSFSWWGAWLNPRPDKRVIAPKQWFRNKPWNTADLIPAG









WVRL







Bacter-

WP_
495893515
glycosyl-
24.15

MRLIKMTGGLGNQMFIYAMYLKMKTIFPDVRIDLSDMVHYQVHYGYEMNKVFHLPRTEFCINRSLKKIIEFL
196



oidetes;

008618094.1

transferase


LFKTILERKQGGSLVPYTRKYHWPWIYFKGFYQSEKYFAGIEKEVREAFVFDIRRASRRSLRAMQEIKADPHA




Capno 



[Bacter-


VSIHVRRGDYLLEKHWKALGCICQSSYYLNALAELEKRVKHPHYYVFSEDLNWVRQNLPLIKAEFIDWNKGE



cytophaga sp.



oidetes]



DSWQDMMLMSHCRHHIICNSTFSWWGAWLNGPLDKIVIAPERWTQTTDSADVVPESWLKVSIG



oral taxon









329 str.









F0087;










Para-











prevotella











clara YIT










11840













Smarag-

WP_
516906936
protein
24.08

MADVVVTLAGGLGNQLFQTAYAKNLEARGHRVTLDGTVVRWTRGLHIDPQICGLKILNATPPAPVPGRLAA
197



dicoccus

018159152.1

[Smarag-


TVLRRALATRLRFGPDGRIVTQRTLEFDEQYLNLNSPGRYRVEGYWQCERYFSDVGQTVRKVFLDMLGRH




niigatensis




dicoccus



VSYNGLSRLPAMADPSSISLHVRRGDYVTANFIDPLALEYYERALEELAVPSPRIFVFSDDLDWATRELGRICD







niigatensis]



VIPVEPDWTSHPGGEIFLMSQCSHHIIANSSFSWWGAWLDGRTSSRVVAPRQWFLETYSARDIVPDRWT









KV







Bacteroides

WP_
547279005
family 11
24.05

MIHLILGGGLGNQMFQYAFARLSALQYNENISFNTILYKELKNEERSFSLGHLNINTMCIVETPDENKRIWELF
198



fragilis

022012576.1

glycosyl-


NKQIFHQKIARKILPASIRWWWMSNRNIYANVCGPYKYYHPRHRSQNTTIIHGGFQSWKYFKEHQSMIKAE



CAG:558


transferase


LKVITPISEPNKKILKEIQNSNSICVHIRRGDFLSAQFSPHLEVCNKDYYEKAIKMISSQIENPTFFIFSNTHEDLV






[Bacteroides


WIRKNYNIPQNSVYVDLNNPDYEELRLMYNCKHFILSNSSFSWWAQYLSESKNKIIIAPKWDKRKGIDFSDIY







fragilis 



MPEWIIIK






CAG:558]










Desulfo-

WP_
550904402
protein
24.05

MSFSIDVAAIQRMALVKVDGGLGSQMWQYALSLAVGKKSSSFTVKHDLSWFRHYAKDIRGIENRFFILNSVFT
199



vibrio

022657592.1

[Desulfo-


NINLRLASENERLFFHIALNRYPDSICNFDPDILALKQPTYLGGYYVNAQYVTSAEKEIREAYVFAPAVEESNQA







vibrio]



MLQTIHAAPMPVAVHVRRGDYIGSMHELVTPRYFERAFKILAAALQPKPTFFVFSNGMEWTKKAFAGLPYD









FVYVDANDNDNVAGDLFLMTQCKHFTISNSLSWWGAWLSQRAENKTVIMPSKWRGGKSPIPGECMRV









EHWHMCPVE







Hoeflea

WP_
494373839
alpha-1,2-
24.05

MHGGLGNQLFQYAVGRAVALRTGSELLLDTREFTSSNPFQYDLGHFSIQAKVANSSELPPGKNRPLAYAW
200



photo-

007199917.1

fucosyl-


WRKFGRSPRFVREQDLGYNARIETIEADCYLHGYFQSQKYFEDIASILWKDLSFRQAISGENASMAERIQSAP




trophica;



transferase


SVSMHIRRGDYLTSAKARSTHGAPDLGYYGRALGEIRARSGSDPVVYLFSDDPDWVRNNMRMDANLVTVA




Hoeflea



[Hoeflea


INDGKTAFEDLRLMSLCDHNIIVNSTFSWWGAWLNPSLDKIVVAPKRWFADPKLSNPDITPPGWLRLGD




photo-




photo-








trophica DFL-




trophica]







43













Vibrio

WP_
487957217
glycosyl
24.04

MKIISFSGGLGNQLFQYAFYLKDNSDFGNIFLDFSFYESQNKRDAVIRNFYGVDSLDIIKQSSYVRGKFLILKL
201



cholerae;

002030616.1

transferase,


INKFRFFNNLLEFVDKENGLDETLLSTNKVFFDGYWQSYRYVKDYKSNIKELFSFYDFKGNILEVRKKICQSNSV




Vibrio



family 11


CMHVRRGDYVAEKNTKLVHGVCSLQYYRDALNNIKNVDNSIDHIFIFSDDIDWVKNNISFDIPVTVVDFVGQ




cholerae O1



[Vibrio


SVPDYAEMLLFSCGKHKVIANSTFSWWGAFLSDRNGVIVSPKKWFAKEEKNYDEIFIEGSLRL



str. 87395



cholerae]










Lachno-
WP_
551041074
protein
24.03

MIIVRFRGGMGNQMFQYAFLRYLEMKGATLKADLSEFKCMKTHAGYELDKAFDLHPAEASYKEIRAVADYI
202


spiraceae
022784718.1

[Lachno-


PVMHRFPFSRKVFEILYKKEKRVEAEGPKKSHISEEKYFDMSEDERLHLASSSEDLYMDGFWIKPDMYDDE



bacterium


spiraceae


VLKCFTFSKTLDEKYKGTIEDIHSCSVHVRCGDYTGTGLDILGKEYEKAAEKILSEDADVKFYVFSDDREKAEK



NK4A179


bacterium


LLSPFMKKMVFCDTPASHAYDDMYLMSRCRHHIIANSTFSFWGARLSADKSGITICPKYEDKNNTANRLVHE






NK4A179]


GWQML







Cecembia

WP_
496476931
Glycosyl-
24.01

MIIMKFMGGLGNQIYQYALGRKLSELHNSFLASHIHIYKNDPDREFVLDKRNIKNKHLPWKVIKLLNSDYALKF
203



lonarensis;

009185692.1

transferase


DKVFHTEFYHELVLEKALESKDIPRKNNLYLRGSWGNRKYYEDYIKDISDEITLKEKFKTKDFNTVNKKVKNSDS




Cecembia



family 11


VGIHIRRGDYEKVAHFKNFYGLLPPSYYSAAVDFIGNRIEKSNFFIFSDDTDWVKENLPFLKDSFFVSDIIGSVD




lonarensis



[Cecembia


YLEFELLKNCKHQIIANSTFSWWAARLNSNPAKIVIPKRWFADDRQQAVYEIEDSYYIKEAIKL



LW9



lonarensis]











Bacteroides

WP_
490423336
protein
24

MKIVNILGGLGNQMFVYAMYLALKEAHPEEEILLCRRSYKGYPLHNGYELERIFGVEAPEAALSQLARVAYPF
204



ovatus;

004295547.1

[Bacteroides


FNYKSWQLMRHFLPLRKSMASGTTQIPFDYSEVTRNDNVYYDGYWQNEKNFLSIRDKVIKAFTFPEFRDEK




Bacteroides




ovatus]



NKALSDKLKSVKTASCHIRRGDYLKDPIYGVCNSDYYTRAITELNQSVNPDMYCIFSDDIGWCKENFKFLIGDK




ovatus ATCC






EVVFVDWNKGQESFYDMQLMSLCHYNIIANSSFSWWGAWLNNNDDKVVVAPERWMNKTLENDPICDN



8483





WKRIKVE







Bacteroides

WP_
547668508
glycosyl-
23.99

MRLIKMTGGLGNQMFIYAFYLKMKKLFPHTKIDLSDMMHYHVHHGYEMNRVFALPHTEFCIBNRTLKKLME
205



coprocola

022125287.1

transferase


FLLCKVVYERKQKNGSMEAFEKKYAWPLIYFKGFYQSERFFADIEDDVRKTFCFNMELINSRSREMMKIIDAD



CAG:162


family 11


EHAVSIHIRRGDYLLPKFWANAGCVCQLPYYKNAITELEKHESTPSFYVFSDDIEWVKQNLSLPNAHYIDWN






[Bacteroides


QGNDSWQDMMLMSHCRNHIICNSTFSWWGAWLNPRKNKTVIVPSRWFMKEETPIYPVSWIKVPIN







coprocola










CAG:162]










Bacteroides

WP_
495110765
glycosyl-
23.99

MRLIKVTGGLGNQMFIYAFYLRMKKYYPKVRIDLSDMMHYKVHYGYEMHRVFKLPHTEFCINQPLKKIIEFLF
206



dorei;

007835585.1

transferase


FKKIYERKQAPNSLRAFEKKYFWPLLYFKGFYQSERFFADIKDEVREAFTFDRSKANSRSLDMLDILDKDENAV




Bacteroides



[Bacteroides


SLHIRRGDYLQPKHWATTGSVCQLPYYQNAIAEMSKRVTSPSYYIFSDDIVWVRENLPLQNAVYIDWNTGE




dorei DSM




dorei]



DSWQDMMLMSHCKHHIICNSTFSWWGAWLNPSIDKTVIVPSRWFQYSETPDIYPTGWIKPVD



17855;










Bacteroides











dorei










CL02T12C01













Bacteroides;

WP_
494936920
protein
23.97

MIIVRLWGGLGNQLFQYSFGQYLEIETDKKVFYDVASFTSDQLRKLELCSFIPDIPLYNAYFTRYTGVKNRLF
207



Bacteroides

007662951.1

[Bacteroides]


KALFQWSNTYLSESMFDICLLEKARGKIFLQGYWQEEKYATYFPMQKVLSEWKNPNVLSEIEENIRSAKISVS




intestinalis






LHVRRGDYFSPKNINVYGVCTEKYYEQAIDRANSEIEEDGQFFVFSDDILWVKNHVSLPESTVFVPNHEISQFA



DSM 17393;





YIYLMSLCKVNIISNSTFSWWGAYLNQHKNQLVIAPSRWTFTSNKTLALDSWTKI




Bacteroides











intestinalis










CAG:564













Lachno-

WP_
511028838
protein
23.95

MIVIHVMGGLGNQLYQYALYEKLRALGREVKLDVYAYRQAEGAEREWRALELEWLEGIRYEVCTAAERQQL
208



spiraceae

016283022.1

[Lachno-


LDNSMRLADRVRRRLTGRRDKTVRECAAYMPEIFEMDDVYLYGFFWGCEKYYEDIIPLLQEKIVFPESSNPKN




bacterium A4




spiraceae



ADVLRAMAGENAVSVHIRRKDYLTVADGKRYMGICTDAYYKGARFYITERVERPVFYIFSDDPAFAKTQFCE







bacterium A4]



ENMHVVDWNTGRESLQDMALMSRCRHNICANSTFSIWGARLNRHPDKIMIRPLHHDNYEALDARTVHEY









WKGWVLIDADGKV







Phaeobacter

YP_
399994425
protein
23.91

MIITRLHGRLGNQMFQYAAGRALADRLGVSVALDSRGAELRGEGVLTRVFDLDLATPDILPPLRQRAPLGYA
209



gallaeciensis;

006574665.1

PGA1_


LWRGLGQHLGTGPKLRREVGLGYNPDFVDWSDNSYLHGYWQSERYFAWSAERIRRDFTFPEYSNQQNAE




Phaeobacter



c33070


MAARIGETNAISLHVRRDGYLTLAAHVKCDQAYYEAALAQVLDGLEGQPTVYVFSDDPQWAKENLPLCDK




gallaeciensis



[Phaeobacter


VVVDFNGADTDYEDMRLMSLCKHNIIGNSSFSWWAAWLNGTPDRRVAGPTKWFGDPKLNNPDILPPDW



DSM 17395 =



gallaeciensis



LRISV



CIP 105210


DSM 17395 =









CIP 105210]









Firmicutes
WP_
546362318
protein
23.88

MSGGLGNQMFQYALYLKLRSLGREVCFDDKSQYDEETFRNSSQKRRPKHLDIFGITYPSAGKEELEKLTDGA
210


bacterium
021849028.1

[Firmicutes


MDLPSRIRRKILGRKSLEKNDRDFMFDPSFLEETEGYFCGGFQSPRYFAGAEEEVRKAFTFPEELLCPKEGCSR



CAG:791


bacterium


EQEKMLEQSASYAERIRKANCEAADRGVPGGGSASIHLRFGDYVDKGDIYGGICTDAYYDTAIRCLKERDPG






CAG:791]


MIFFVFSNDEEKAGEWIRYQAERSENLGRGHFVLVKGCDEDHGYLDLYLMTLCRNHVIANSSFSWWASFM









CDAPDKMVFAPSIWNNQKDGSELARTDIYADFMQRISPRGTRLSDRPLISVIVTAYNVAPYIGRALDSVCGQ









TWKNLEIIAVDDGSSDETGAILDRYAAGDSRIQVVHTENRGVSAARNEGIAHARGEYIGFVDGDDRAHPAM









YEAMIRGILSSGADMAVVRYREVSAEETLTDAEEQVASFDPVLRASVLLQQRDAVQCFIRAGMAEEEGKIVL









RSAVWNKLFHRRLLRDNRFPEGTSAEDIPFTTRALCLSKKVLCVPEILYDYVVNRQESIMNTGRAERTLTQEIP









AWRTHLELLKESGLSDLAEESEYWFYRRMSLYEEEYRRCSETAKEAKELQERILKHRDRILELAEEHSFGRRGD









RERLKLYVNSPRQYFLLSDLYEKTVVNWKNRPDKT







Butyrivibrio

WP_
302669773
glycosyl-
23.84

MRKRIIALNGGLGNQMFQYAFARMLEDRKHCLIEFDTGFYSTVNDRKLAIQNYNIHKYDFCNHEYYNKIRLLF
211



proteo-

003829733.1

transferase


QKIPFVAWLAGTYKEYSEYQLDPRVFLFNYRFYYGYWQNKQYFENISNDIRNELSYIGNVSEKENALLNMLEA




clasticus;



11


HNAIAIHVRRGDYTQEGYNKIYISLSEKEYYKRAVSIACKELGDNNIPLYVFSDDIDWCKANLADIGNVTFVDNT




Butyrivibrio



[Butyrivibrio


ISSSADIDMLMMKKSRCLITANSTFSWWSAWLSDRDDKIVLVPDKWLQDEEKNTKLMKAFICDKWKIVPV




proteo-




proteo-








clasticus




clasticus







B316


B316]










Bacteroides

WP_
496043738
protein
23.81

>gi|496043738|ref|WP_008768245.1|protein[Bacteroides sp.
212


sp.
008768245.1

[Bacteroides


2_1_16]MQVVARIIGGLGNQMFIYATARALALRIDADLILDTQSGYKNDLFKRNFLLDSFCISYRKANCFQKY



2_1_16


sp.


DYYLGEKVKSLGKKTHFSVIPFMKYISENTSCDFVDGLLKKHILSVYLDGYWQNEAYFKDYASIIKKDFQFCQV






2_1_16]


NDLRTLSEAEIIKKSITPVAIGNRRYQELNSHQNTKVTDLDFYQKAINYIESKVDMPTFFIFSEDQEWVKNNLEQ









KSNFIMISPKEGNYSALNDMYLISLCKHHIVSNSSFYWWGAWLANNKNKIVVASDCFLNPQSIPDSWIKF







Desulfo-

YP_
256830317
glycosyl-
23.76

>gi|256830317|ref|YP_003159045.1|glycosyl transferase family protein
213



microbium

003159045.1

transferase


[Desulfomicrobium baculatum DSM]




baculatum;



family protein


MAKIVTRIMGGIGNQLFCYAAARLALVNHAELVIDDVTGFSRDRVYRRRYMLDHFNISARKATNYE




Desulfo-



[Desulfo-


RMEPFERYRRGLAKYISKKLPFFEREYIEQERIEFDPRFLEYRTYNNIYIDGLWQSENYFKDVEDIIRDDLKIIPPT




microbium




microbium



DLENINIAKKIKNIQNTIAMHVRWFDLPGINLGNNVSTYYYHRAIAMMEQRINAPHYFLFSDNLEAVHSKLD




baculatum




baculatum



LPEGRVTFVSNNDGDDNAYADLWLMSQCKHFITANSTFSWWGAWLGESRDSVVLVPRFSPDGGVTSWC



DSM 4028


DSM 4028]


FTGLIPERWEQVSSIR







Prevotella

WP_
545304945
galactoside
23.76

MDIVLIFNGLGNQMSQYAFYLAKRQRNNHTVYCHFGPRTQYSLDKLFDIPYRHNAVLVLLYRALDHAHFSN
214



pleuritidis;

021584236.1

2-alpha-L-


HRWLRRLLRPTLQLLGVKMIVERPSRDFDMRHFTHQKGIVFYRGGWHSELNFTAVADAVKRRFRFPEIQDA




Prevotella



fucosyl-


AVLAVIDRIKSCQSVSLHLRRGDYLSLSEFQFVCTEAYYEHAIAYFESQIESPEYFVFSDDPTYAREQFGADPNF




pleuritidis



transferase


HIIDLNHGEDAWCDLLMMTQCRYNIIASNTFSWWGAWLNDNPSKIVVHPRYHLNGVETRDFYPRNWICIE



F0068


[Prevotella










pleuritidis]











Bacteroides

WP_
496038684
glycosyl-
23.75

MKVIWFNGNLGNQVFYCKYKEFLHNKYPNETIKYYSNSRSPKICVEQYFRLSPDRIDSFKVRFVFEFLGKFFR
215


sp. 1_1_14
008763191.1

transferase,


RIPLKVPKWYCTRKSLNYEASYFEHYLQDKSFFEKEDSSWLKAKKPDNFSEKYLIFENLICNTNSVANHIRRGD






family 11


YIKPGSDYEDLSATDYYEQAIKKATEVYLDSQFFFFSDDLEFVKNNFKGDNIYYVDCNRGADYLDILLMSQAK






[Bacteroides


INIIANSTFSYWGAYMNHEKKKVMYSDLWFRNESGRQPNIMLDSWICIETKRK






sp. 1_1_14]










Agromyces

WP_
551273588
protein
23.65

MVGRVGIARRQAADVSCTDGDGLVAWRIRTGEIVLGLQGGIGNQLFEWAFAMALRSIGRRVLFDAVRCRG
216



subbeticus

022893737.1

[Agromyces


DRPLMIGPLLPASDWLAAPVGLALAGATKAGLLSDRSWPRLVRQRRSGYDPSVLERLGGTSYLLGTFQSARY







subbeticus]



FDGVEHEVRAAVRALLEGMLTPSGRRFADELRADPHRVAVHVRRGDYVSDPNAAVRHGVLGAGYYDQAL









EHAAALGHVRRVWFSDDLDWVREHLARDDDLLCPADATRHDGGEIALIASCATRIIANSSFSWWGGWLG









APSSPAHPVIAPSTWFADGHSDAAELVPRDWVRL







Prevotella

WP_
494220705
alpha-1,2-
23.59

MIATTLFGGLGNQMFIYATAKALSLHYRTPMAFNLRQGFEQDYKYQRHLELNHFKCQLPTAKWITFNYKGE
217



salivae;

007133870.1

fucosyl-


LNIKRISRRIGRNLLCPHYQFIKEKEPFHYEKRLFEFTNKKIFLEGYWQSPRYFENYSDEIRRDFQLKSILPHTITD




Prevotella



transferase


ELQMLKGTGKPLVMLGIRRYQEVKDKKDSPYPLCNKDYYAKAISHVQEQLPAPLFVVFTQEQAWAMNNLP




salivae



[Prevotella


TNANLYFVKEKDNAWATIADMYLMTQCQHAIISNSTFYWWGAWLQHPIENHIVVAPNNFINRDCVCDN



DSM 15606



salivae]



WIILD







Carno-

YP_
554649641
glycosyl-
23.57

MIFVDLSEGLGNQMFQYAYSRYLQELYGGTLYNTSSFKRKNSTRSYSLNNFYLYENVKLPSKFRRVIYNFYSK
218



bacterium sp.

008718687.1

transferase


TIRMFIKKVIRMNPYSDKYYFSMIPYGFYVSSQVFKYLTVPTTKRHNIFMGTWQTNKYFQSINDKIKDELKV



WN1359


[Carno-


KTEPNELNKKLITEINSNQSVCVHIRLGDYTNPEFDYLVCTSDYYLKGMDYIVSKVKEPNFYIFSNSSSDIEWI







bacterium sp.



KNNFYFKYKVKYIDLNNPDFEDFRLMYNCHHFIISNSTFSWWAQFLSNNDKKIIVAPSKWQKSNENEAKDIY






WN1359]


LDHWKLIEIE







Butyrivibrio

WP_
551018062
glycosyl-
23.55

MLIIQIGGLGNQMQQYAYLEKFKALGKETRLDTSWFDNASMQENVLARRSLELRFFDNLTYEACTPQERA
219


sp. AD3002
022762290.1

transferase


RFTDSVARVVEKLVPGMGSRFTESCMYHPEIFELKDKYIEGYFACQKYYDDIMGELQELFVFPTHPDEEINI






[Butyrivibrio


KNMNLMNEMEMVPSVSVHIRRGDYLDPENAALFGNIATDAYYDSAMEYFKAIDPDTHFYIFTNDPEYAREK






sp. AD3002]


YADPGRYTIVDHNTGKYSLLDIQLMSHCRGNICANSTFSFWGARLNRRKDKIPVRTLVMRNNQPVTPELMH









EYWPGWVLVDKDGKVR







Clostridium

WP_
454396682
glycosyl-
23.55

MIVIRVMGGLGNQMQQYALYEKFKALGKETRLDTSWFDNASMQENVLARRSLELRFFDNLTYEACTPQER
220


sp. KLE 1755
021636935.1

transferase,


EALLGKEGFFNKLERKLFPSKNKHFYESEMFHPEIFKLDNVYLEGHWACEKYYHDIMPLLQSKIIFPKTDNIQN






family 11


NMLKNKMNSENSVSIHIRRGDYLDPENAAMFGGICTDSYYKSAEGYIRNRVTNPHFYLFSDDPAYLREHYKG






[Clostridium


EEYTVVDWNHGADSFYDMELEMSCCKHNVCANSTFSFWGARLRTEKKIVIRPAKHKNSQQAEPERMHEL






sp. KLE 1755]


WENWVIIDEEGRIV







Bacteroides;

YP_
150005950
glycosyl-
23.47

MKFFVFGGGLGNQLFQYSYYRYLKKKYPSERILGIYPDSLKAHNGIEIDKWFDIELPPTSYLYNKLGILLYRVNRF
221



Bacteroides

001300694.1

transferase


LYNGHYRLLFCNRVYPQSMKHFFQWGDWQDYSIIKQINIFEFRSELPIGKENMEFLKKMETCNSISVHIRRG




vulgatus ATCC 



family protein


DYLKTDLIHIYGGICTSKYYREAIKFMEQEVEEPFFFFFSDDCLYVETEFADIRNKIIISHNRDDRSFFDMYLMAH



8482;


[Bacteroides


AKNMILANSTFSCWAAYLNRTAKIIITPDRWVNTDFSKLEALPNEWIKIRV




Bacteroides




vulgatus








dorei DSM



ATCC 8482]






17855;










Bacteroides











massiliensis










dnLKV3













Para-

WP_
495902050
glycosyl-
23.47

MRLIKMTGGLGNQMFIYAMYLKMRAVFPDTRIDLSDMVHYRVHYGYEMNKVFNLPRTEFRINRSLKKIIEFL
222



prevotella

008626629.1

transferase


LFKTILERKQGGSLVPYIRKYHWPWIYFKGFYQSEEYFAGVEKEVREAFVFDVRRVNRKSLCAMQEIMADPD




xylaniphila;



[Para-


AVSIHVRRGDYLQGKHWKSLGCICQRSYYLNALSELEKRIVHPHYYVFSEDLDWVRQYLPLENAVFIDWNKG




Para-




prevotella



EDSWQDMMLMSHCRHHIICNSTFSWWGAWLNPSPDKIVIAPERWTQQTTNSADVVPESWLKVSIG




prevotella




xylaniphila]








xylaniphila










YIT 11841













Thauera

WP_
489020296
glycosyl-
23.47

MTDRALIAIVKGGLGNQLFIYAAARAMALRTGRQLYLDAVRGYLADDYGRSFRLNRFPIEAELMPEQWRVA
223


sp. 28
002930798.1

transferase


STLRHPRAKLVRALNKYLPEAWRFYVAEFGTRPGALWNHGRNVKRVTLMGYWQDEAYFLDYAELLRREL






family


GPPMPDAPEVRARGEFFAGTESVFLHVRRCRYSPLLDAGYYQKAVDLACAELNKPVFMIFGDDIEWVVNNI






protein


DFRGAGYERQDYDESDELADLWLMTRCRHAIIANSSFSWWAAWLGGAAGSGRHVWAPGQSGLALKCAK






[Thauera


SWEAVDAQPE






sp. 28]










Subdo-

WP_
494107522
alpha-1,2-
23.44

MIYAELAGGLGNQMFIYAFARALGLRCGEAVTLLDRQDWRDGAPAHTACALEGLNLVPEVKILAEPGFAKR
224



ligranulum

007048308.1

fucosyl-


HLPRQNTAKALMIKYEQRQGLMARDWHDWERRCAPVLNLLGLHFATDGYTPVRRGPARDFLAWGYFQS




variabile;



transferase


EAYFADFAPTIRAELRAKQAPAGVWAEKIRAAACPVALHLRRGDYCRPENEILQVCSPAYYARAAAAAAAAY




Subdo-



[Subdo-


PEATLFVFSDDIDWAKEHLDTAGLPAVWMPRGDAVGDLNLMALCRGFILSNSTYSWWAQYLAGEGRTV




ligranulum




ligranulum



WAPDRWFAHTKQTALYQPGWHLIETR




variabile




variabile]







DSM 15176












Firmicutes
WP_
547127527
protein
23.4

MIIVEVMGGLGNQMQQYALYRKLESLGKDARLDVSWFLDKERQTKVLASRKLELSWFENLPAKYCTQEEK
225


bacterium
021916223.1

[Firmicutes


QAILGKNNLIGKLKKKLLGGSNRHFTESDMYHPEIFDLEDAYLSGFWACEAYYADILPMLRSQIHFPDPEKGE



CAG:24


bacterium


GWDLEAAAKNKETMERMKQETSVSIHIRRGDYLDAKNAEMFGGICTDAYYEAAISYIKEQTPDAHFYVFSD






CAG:24]


DSAYVKNAYPGKEFTVVDWNTGKNSLFDMQLMSCCNHNICANSTFSWGARLNPSPDKVMIRPSKHKNS









QNIVPEEMKRLWDGWVLIDGKGRII







Prevotella

WP_
547906803
glycosyl-
23.39

MIITKLNGGLGNQLFEYACARNLQLKYNDVLYLDIEGFKRSPRHYSLEKFKLSSDVRMLPEKDSKSLILLQAISK
226


sp. CAG:474
022310139.1

transferase


LNRNLAFKLGPLFGTYIWKSSNYRPLKIKNTRGKKLYLYGYWQSYEYFKENEAIIKQELNVKETIPIECSELLKEIN






family 11


KPHSICVHVRRGDYVSCGFLHCDEAYYNRGINHIFDKHPDSNVVVFSDDIKWVKANMNFDHPVAYVEVDV






[Prevotella


PDYETLRLMYMCKHFVMSNSSFSWWASYLSDNKEKIVVAPSYWLPANKDNKSMYLDNWTIL






sp. CAG:474]










Roseburia

WP_
493910390
gylcosyl-
23.38

intestinalis]MRGNRGMIAVKIGDGMGNQLFNYACGYAQARRDGDSLVLDISECDNSTLRDFELDKFHLKY
227



intestinalis;

006855899.1

transferase


DKKESFPNRNLGQKIYKNLRRALKYHVIKEREVYHNRDHRYDVNDIDPRVYKKKGLRNKYLYGYWQHLAYFE




Roseburia



family 11


DYLDEITAMMTPAYEQSETVKKLQEEFKKTPTCAVHVRGGDIMGPAGAYFKHAMERMEQEKPGVRYIVFT




intestinalis;



[Roseburia


NDMERAEEALAPVLESQKKDAVGQAENRLEFVSEMGEFSDVDEFFLMAACQNQILSNSTFSTWAAYLNQN



L1-82



intestinalis]



PDKTVIMPDDLLSERMRQKNWIILK







Bacteroides

WP_
490424433
protein
23.29

MKIVLFTPGLGNQMFQYLFYLYRDNYPNQNIYGYYNRNILNKHNGLEVDKVFDIQLPPHTVISDASAFFIRA
228



ovatus;

004296622.1

[Bacteroides


LGGLGLKYFIGKDQLSPWKVYFDGYWQNKEYFQNNVDKMRFREGFLNKKNDDILSLIRNTNSVSNHVRRG




Bacteroides




ovatus]



DYCDSCRKDLFLQACTPQYYESAISVMKEKFQKPVFFVFSDDIPWVKVNLNIPNAYYIDWNKKENSYLDMYL




ovatus ATCC






MSLCTASIIANSTFSFWGAMLGNKKELVIKPKKWIGDEIPEIFPPSWLSL



8483













Butyrivibrio

WP_
551035785
gylcosyl-
23.25

MLIIQIAGGLGNQMQQYALYRKLLKYHPDGVRLDLSWFDSEVQKNMLAKREFELALFKGLPYIECKPEERAA
229


sp. AE3009
022779599.1

transferase


FLDRNAAQKLSGKVLKKLGLRDNANPNVFEESRMFHPEIFELDNKYIIGYFACQKYYDDIMGDLCNLFEFPEH






[Butyrivibrio


LDPELEKKNLELISKMEKENSVSVHIRRGDYLDPENFKILGNIATDEYYESAMKYFEDRYEKVHFYIFTSDHEYA






sp. AE3009]


REHFADESKYTIVDWNTGKDSLQDVRLMNHCKGNICANSTFSWGARLNQRQDKVMIRTYKMRNNQPV









DPDTMHDYWKGWILIDETGREV







Butyrivibrio

WP_
302669752
glycosyl-
23.23

MTKNEKKLIVKFQGGLGNQLYEYAFCEWLRQQYSDYEVLADLSYYKIRSAHGELGIWNIFPNINIEVASNWDI
230



proteo-

003829712.1

transferase 11


IKYSDQIPIMYGGKGADRLNSVRTNVDRFFSKRKHSYYTEISNTDVSEVINALNNGIRYFDGYWQNIDYFKG




clasticus;



[Butyrivibrio


NIEDLRNKLKFSEKCDKYITDEMLRDNAVSLHVRRGDYVGSEYEKEVGLSYYKKAVEYVLDRVDQAKFFIFSD




Butyrivibrio




proteo-



DKYYAETAFEWIDNKTVVAGYDNELAHVDMLLMSRMKNNIIANSTFSLWAAYLNDSMNPLIVYPDVESLD




proteo-




clasticus



KKTFSDWNGIIK




clasticus



B316]






B316













Prevotella

WP_
517173838
protein
23.23

MKSQFLKHIKLSGGFGNQLFQYFFGEYLKEKYNCSISFFSEPALDINQLQIHRFFPALRISHNTELRPYHYSFTQ
231



nanceiensis

018362656.1

[Prevotella


QLAYRCMRKLLLLFPFLNRVKIENGSNYQNQSFNDTYCFFDGYWQSYRYLSAFTPSLQFEDQLINDISADYIN







nanceiensis]



AIEQSEAVFLHIRRGDYLNKENQKVFACECPLNYFENAANRIKEDIKNVHFFVFSNDIQWVKSHLKLNDNEVTF









IQNEGNSCDLKDFYLMTRCKHAIISNSTFSWWAAYLINNSDKKVIAPKHWYNDISMNNAATKDLIPPTWIRL







Ruegeria sp.

WP_
495838392
alpha-1,2-
23.23

MIITRLHGRLGNQMFQYAAGRALADRAGVPLALDSRGAILRGEGVLTRVFDLELADPVHLPPLKQTNPLRYA
232


R11
008562971.1

fucosyl-


IWRGIGQKVGAKPYFRRERGLGYNPAFEDWGDSNYLHGYWQSQKYFQNSAERIRSDFTFPAFSNQQNAE






transferase


MAARIAESTAISLVHVRRGDYLTFAAHVLCDQAYYDAALAKVLDGLQGDPIVYVFSDDPQWAKDNLSLPCEK






[Ruegeria sp.


VVVDFNGPETDFEDMRLMSLCQHNIIGNSSFSWWAAWLNQTPGRRVAGPAKWFGDPKLSNPDIFPHDW






R11]


LRISV







Winograd-

WP_
527072096
alpha-1,2-
23.21

MGNQLYEYATAKAMAVALNKKLVIDPRPILKEAPQRHYDLGLFNIQDEDFGSPFVQWLVRWVASVRLGKFF
233



skyella

020895733.1

fucosyl-


KTIMPFAWSYQMIRDKEEGFDESLLQQKSRNIVIEGYWQSFKYFESIRPTLLKELSFKDKPNAINQKYLDEISV




psychro-



transferase


NAVANHIRRGDYVANPVANAVHGLCDMDYYKKAIAIIKDKVENPYFFIFTDDPDWAEKNFKISEHQKIIKHNI




tolerans RS-



[Winograd-


GKQDHEDFRLLTNCKYFIIANSSFSWWGAWLSDYKNKIVISPNKWFNVDAVPITERIPESWIRV



3; Winograd-



skyella








skyella




psychro-








psychro-




tolerans]








tolerans 














Lachno-

WP_
551041720
protein
23.2

MITVRIDGGFGNQMFQYAFFLHLKKTITDNKISVDLNCYNPHGSGDIFTRFKLAPEQAAPSEIKRFHRNSIYHL
234



spiraceae

022785341.1

[Lachno-


LRPLDSAGITTNPYYREEDIDDLNSVLNKKRVYLRGYWQDKRYPFSVKDQLIDCFDLGKMDMTGASAENNVI




bacterium




spiraceae



LEQIASEESRSVGBHLRGGDYIGDPVYSGICTPEYYEAAFKHVSEKIKDPVFHIFTNDISMIEKCGLSGKYDLKIT



NK4A179



bacterium



DINDEAHGWADLKLMSACRHHIISNSSFSWWAAFLGEATTEASADVINVIPEYMRQGVSAETLRCPCWTT






NK4A179]


VTSDGRVYPS







Prevotella

WP_
496522549
alpha-1,2-
23.13

MKIVCIKGGLGNQLFEYCRYRSLHRHDNRGVYLHYDRRRTKQHGGVWLDKAFHITLPNEPLRVKLLVMVLK
235


sp. oral taxon
009230832.1

fucosyl-


TLRRLHLFKRLYREEDPRAVLIDDYSQHKQYITNAAEILNFRPFEQLDYAEEIQTTPFAVSVHVRRGDYLLLANK



317 str.


transferase


SNFGVCSVHYYLSAAVAVRERHPESRFFVFSDDMEWAKENLNLPNCVFVEHAQAQPDHADLYLMSLCKGH



F0108;


[Prevotella


IIANSTFSFWGAYLSKGSSAIAIYPKQWFAEFTWNVPDIFPAHWMAL




Prevotella



sp. oral taxon






sp. oral taxon


317]






317













Butyrivibrio

WP_
551021633
glycosyl-
23.1

MLIIQIAGGLGNQMQQYAVYTKLRGMGKDVRLDLSWFDPSVQKNMLAPREFELSMFEGVDYTECTAEER
236


sp. XPD2006
022765796.1

transferase


DSFLKQGMIANVTGKMLKKLGLRDEANPKVFSEKEMYHPEIFELEDRYIKGYFACQKYYDDIMGELWEKYTF






[Butyrivibrio


PAHSDPDLHTRNMALVERMEKETSVSVHIRRGDYLDPSNVEILGNIATEEYYQGAMDYFSVKDPDTHFYIFT






sp. XPD2006]


SDHEYAREKFSDESKYTIVDWNSGRNSVQDLMLMSHCKGNICANSTFSFWGARLNRRPDKTVIRTYKMRN









NQPVNPDIMHDYWKGWILMDEGSII







Butyrivibrio

WP_
551008140
protein
23.08

MIIIKLQGGLGNQLFLYGLYKNLKHLKRDVKMDIESGFEEDKLRVPCLKSMGLDYEVATRDEIVAIRDSYMDIF
237



fibrisolvens

022752717.1

[Butyrivibrio


SRIRRKITGRKTFDYYEPEDGNFDPRVLEQTHAYLDGYFQSEKYFGDSDDRKKLKDELLKEKIRVLDSSDTLKDL







fibrisolvens]



YNMMSSGSSVSLHIRRGDYLTPGIMETYGGICTDEYYDIAMNRIKNEYPDSKFFIFSNDIDWCKEKYGSRDDV









IFVDSCDEHEGLTNVSGDQDDIQVQGDIKEHGNNSLRDAAELYLMSACKHHILANSSFSWWGAWLSDHEG









MTIAPSKWLNNKNMTDIYTKDMLLI







Cylindros-

WP_
493321658
Glycosyl-
23.05

MKKTVVLLKGGLGNQMFQYAFARSISLKNSSKLVIDNWSGFTFDYKYHRQYELGTFSIVGRPANLTEKFPFW
238



permopsis

006278973.1

transferase


FYELKSKFFPRLPKVFQQQFYGLLINEVGGEYIPEIEETKISQNCWLNGYWQSPLYFQKHSDSITRELMPPEPM




raciborskii;



family 11


EKHFLELGKLLRETESVALGIRLYEESKNPGSHSSSGELKSHFEINQAILKLRELCNGAKFFVFCTHRSPLLQELAL




Cylindros-



[Cylindros-


PENTIFVTHDDGYVGSMERMWLLTQCKHHIFTNSTFYWWGAWLSQKFYIQGSQIVFAADNFINSDAIPKH




permopsis




permopsis



WKPF




raciborskii




raciborskii]







C5-505













Prevotella

WP_
494609908
alpha-1,2-
23.05

MKIVNFQGGLGNQMFIYAFSRYLSRLYPQEKIYGSYWSRSLYVHSAFQLDRIFSLQLPPHNLFTDCISKLARFF
239



multiformis;

007368154.1

fucosyl-


ERLRLVPVEETPGSMFYNGYWLDKKYWEGIDLSEMFCFRNPDLSAEAGAVLSMIERSNAVNVHIRRGDYQS




Prevotella



transferase


EEHIEKFGRFCPPDYYRIATERIRQREDDPLFFVFSDDMMWVKSNMDVPNAVYVDCHHGDDSWKDMFL




multiformis



[Prevotella


MAKCRHNIIANSTFSFWAAMLNANPDKVVVYPQRWFCWPSPDIFPEMWLPVTEKEIKSSF



DSM 16608



multiformis]











Bacteroides

WP_
548151455
protein
23

MIIVNMACGLANRMFQYAFYLSLKERGYNVKVDFYKSATLPHENVPWNDIFPYAEIDQVSNFRVLILGGGA
240


sp. CAG:462
022384635.1

[Bacteroides


NLLSKLRRKYLPSLTNVITMSTAFDTDLQIDDDRKDKYIIGVFQSAAMVEGVCKKVKQCFSFLPFTDLRHLQLE






sp. CAG:462]


KEMQECESVAIHVRKGNDYQQRIWYQNTCTMDYYRKAIAEIKGKVKDPRFYVFTDNADWVRRNFTDFDYK









MVEGNPVYGWGSHFDMQLMSRCKYNIISNSTYSWWGAYLNANRNKIVICPNIWFNPESCNEYTSCKLLCK









GWIAL







Desulfo-

WP_
492830222
Glycosyl-
23

MRIGILYICTGKYTVFWNHFFTSCEQHFLREHEKHYYIFTDGEIAHLNCNRVHRIEQQHLGWPDSTLKRFHM
241



vibrio

005984176.1

transferase


FERIADTLRQNSDFIVFFNANMVFLRDVGKEFLPTREQALVFHRHPGLFRRPAWLPYERRPESTAYIPYGSG




africans;



family 11/


SIYVCGGVNGGYTQPYLDFVAMLRRNIDIDVERGIIARWHDESHINRFVIGRHYKIGHPGYVYPDRRNLPFPR




Desulfo-



Glycosyl-


IIRVIDKASVGGHTFLRGQTPEPAPEEQSKTVAKKLRSQLKRPCMPRAAQDEPIILARMMGGLGNQMFIYAA



vibrio


transferase


ARVLAERQGAQLHLDTGKLSGDSIRQYDLPAFSIDAPLWHIPCGCDRIVQAWFALRHVAAGCGMPKPTMQ



africans PCS


family 6


VLRSGFHLDQRFFSIRHSAYLIGYWQSPHYWRGHEDRVRSSFDLTRFERPHLREALAAVSQPNTISVHLRRG






[Desulfo-


DFRAPKNSDKHLLIDGSYYERARKLLLEMTPQSHFYIFSDEPEEAQRLFAHWENTSFQPRRSQEEDLLLMSRC







vibrio



SASIIANSSFSWWGAWLGRPKQHVIAPRMWFTRDVLMHTYTLDLFPEKWILL







africans]











Roseburia sp.

WP_
548374190
protein
22.98

MILIHVMGGLGNQLYQYALYEKMKSLGKKVKLDTYAYNDAAGEDKEWRSLELDRFPAIEYDKATSEDRTKLL
242


CAG:100
022518697.1

[Roseburia sp.


DNSGLLTAKIRRKLLGRKDKTIRESKEYMPEIFHMDDVYLYGFWNCERYYEDIIPLLQDKLQFPISNNPRNQQ






CAG:100]


CIEQMQKENAVSIHIRRTDYLTVADGARYMGICTEDYYKGAMAYIEERVSNPVYYIFSDDVEYAKQHYHQD









NMHVVDWNSKADSIYDMQLMSKCKHNICANSTFSMWAARLNQNEKIMIRPLHHDNYETTTATQVKQN









WKNWILLDQNGQVCE







Lachno-

WP_
550997676
protein
22.96

MTMNIIRMSGGLGSQMFQYALYLKLKSMGKEVFDDINEYRGEKARPIMLAVFGIEYPRATWDEITSFTDG
243



spiraceae

022742385.1

[Lachno-


SMDLLKRLRRKIFGRKAIEYEEQGFYDPNVLNFDSMYLRGNFQSEKYFQDIKEEVRKLYRFSTLEDMRLPERLY




bacterium




spiraceae



KATKACLDGIESSESVGLHMYRSDSRVDGELYDGICTGNYYKGAVRFIQDKVPDAKFYIFSNEPKWVRGWVV



10-1



bacterium



DLIQSQIQEGMSPSQVKEMEKRFVMVEANTEYTGYLDMMLMSKCKHNIISNSSFSWWSAWMNDHPEKV






10-1]


WWAPDRWSSDKEGNEIYTTGMTLVNEKGRVNYIHENSTVK







Prevotella

WP_
490496500
protein
22.96

MILSYITGRLGNQLFEYAYARSLLLKRGKNEELILNFSLVRAAGKEIEGFDDNLRYFNVYSYTELDKDIVLSKGDL
244



nigrescens;

004362670.1

[Prevotella


LQLFIYILFKLDQKLFRIIKEKWFSFFRRFGIIFQDYLDNISNLIIPRTKNVFCYGKYENPKYFDDIRSILLKEFTP




Prevotella




nigrescens]



PPLKNNDQLYSVIESTNSVCISIRRDFLCDKFKDRFLVCDKEYFLEAMEEAKKRISNSTFIFFSDDIEWVRENIH




nigrescens






SDVPCYYESGKDPVWEKLRLMYSCKHFIISNSTFSWWAQLSRNEEKVVIAPDRWSNVPGEKSFLLSNSFIKI



F0103





PIGILP







Bacteroides

WP_
547952493
fucosyl-
22.95

MIYVEINGRLGNNMFEIAAAKSLTDEVTLWCKGDWQLNCIKMYSDTLFKNYPIVKSLPNNIRIYEEPEFTFHPI
245


sp. CAG:875
022353235.1

transferase


PYKENQDLLIKGYFQSYKYLDREKVLKLYPCPMPVKLDIEKRFGDILSQYTVVSINVRRGDYLNLPHRHPFVGK






[Bacteroides


KFLERAMLWFGDKVHYIISSDDIEWCKAHFKQFDNVHYLTNSYPLLDLYIQTACHHNIISNSSFSWWGAYLN






sp. CAG:875]


NHPQKIVIAPHRWFGMSTNINTQDLLPPEWMIEQCVYEPKVFLKALPLHAKYLLKRVLK







Prevotella

YP_
532354444
protein
22.9

MDSQLLKHIKLSGGFGNQLFQYFFGEYLKEKYNCSISFFSEPALDINQLQIHRFFPTLRISHNTELRRFHYAFTQ
246


sp. oral
008444280.1

HMPREF0669_


QLAYRCMRKLLLLFPFLNRKVKIENGSNYQNQSFNDTYCFDGYWQSYRYLSAFTPSLQFEDQLINDISADYIN



taxon 299 str.


00176


AIEQSEAVFLHIRRGDYLNKENQKVFAECPLNYFENAVNKIKEGNKTYHFFVFSNDIEWVKCHLKLNNNEVTF



F0039;


[Prevotella


IQNEGSSCDLKDFYLMTRCKHAIISNSTFSWWAAYLINNNDKKVIAPKRWYNDLSMNNATKDLIPPTWIRL




Prevotella



sp. oral






sp. oral


taxon 299






taxon 299


str. F0039]










Para-

WP_
495904204
alpha-1,2-
22.87

MKIVCLKGGLGNQMFEYCRFRDLMDSGNGKVYLFYDRRRLKQHDGLRLSDCFELELPSCPWGIRLVVWGL
247



prevotella

008628783.1

fucosyl-


KICRAIGVLKRLYDDEKPDAVLIDDYSQHRRFIPNARRYFSFRQFLAELQSGFVQMIRAVDYPVSVHVRRGDY




xylaniphila;



transferase


LHPSNNSFVLCGVDYFRQAIAYVRKKRPDARFFFFSDDMEWVRENLWMEDAVYVEHTELMPDYMDLYLM




Para-



[Para-


TLCRGHIISNSTFSFWGAYLAVDGNGMKIYPRRWFRDPTWITPPIFSEEWVGL




prevotella




prevotella








xylaniphila




xylaniphila]







YIT 11841













Dethio-

WP_
491897177
glycosyl-
22.84

MFQYAFGRALALDLGLDLKLDISNFGSDSRPFSLGIYSLTKNIPFGCYLSTSTRLKVMTKKLRRWGVWGMD
248



sulfovibrio

005658864.1

transferase


KNMPGLVEPFPPVLVSLDEVLSEKLSHLFVDGYWQSEKYFSRYSDVIRSDFRVIEESSAFLAWKKRMLSEPG




peptido-



family 11


GSISVHVRRGDYVTDSSANRVHGVLPIEYYLRAKEILNTISDGLVFYVFTDDPVWARNNLCLGDKTIYVSGEDL




vorans;



[Dethio-


KDYEELALMSCCDHHVVANSSFSWWGAWLGQDTSTVTIAPGRWFRKMDSSFVIPDNWIKIWT




Dethio-




sulfovibrio








sulfovibrio




peptido-








peptido-




vorans]








vorans DSM










11002













Lachno-

WP_
510896192
protein
22.83

MIIIQVMGGLGNQLQQYALYRKFVRMGKEQRLDISWFLDKEKRGEVLAERELELDYFDRLIYETCTPEEKEQLI
249



spiraceae

016229292.1

[Lachno-


GSEGVAGKLKRKFLPGRIRWFHESKIYHPELLQMENMYLSGYFACEKYYADILYDLREKIQFPVNDHPKNIKM




bacterium




spiraceae



AQEMQERESVSVHLRRGDYLDEKNTAMFGNICTDAYYCKAIEYMKTLCSKPHFYIFSDDIPYVRQRFTGEEYT



10-1



bacterium



VVDINHGRDSFFDMWLMSRCRHNICANSTFSFWGARLNSNDNKIMIRPTIHKNSQVFVKEEMEQLWPG






10-1]


WKFISPDGGIK







Treponema

WP_
513872223
protein
22.82

MFCAAFVEALKHAGQKVFVDTSLYNKGTVRSGIDFCHNGLETEHLFGIKFDEADKADVHRLSTSAEGLLNRIR
250



maltophilum;

016525279.1

[Treponema


RKYFTKKTHYIDTVFRYTPEVLSDKSDRYLEGFWQTEKYFLPIESDIRTLFRFRQPLSEKSAAVQSALQAQEPAS




Treponema




maltophilum]



LSASIHVRRGDFLHTKTLNVCTETYYNNAIEYAAKKYAVSAFYVFSDDIQWCREHLNFFGARSVFIDWNIGAD




maltophilum






SWQDMVLMSMCRCNIIANSSFSWWAAWLNAASDKIVLAPAIWNRRQLEYADRYYGYDSDIPETWIRIP



ATCC 51939





I







Bacteroides

WP_
511022363
protein
22.79

MKLVSFTAGLGNQLFQYCFYRYLLNKFPNEKIYGYYNKKWLKKHGGIIIEHFFDVKLPRSTRWINLYGQYLRIIY
251



massiliensis;

016276676.1

[Bacteroides


KCFSCGVSKDDDFEMNRTMFVGYWQDQCFFSGINISYKKNLVISEKNTWLLGEILKCNSVAIHFRRGDYMLP




Bacteroides




massiliensis]



QFKKIFGEVCTVKYYLKSIRKVEEKISEPVFFVFSDDIDWVKQNFTFNKVYFVDWNKGQNSFWDMYLMSQC




massiliensis






SANIIANSTFSFWGAYLNKNNPFVIYPQKWVRTNLKQPNIFTKTWMAL



dnLKV3













Enterococcus

YP_
389869137
family 11
22.71

MTVLTLGGGLGNQMFQYGYARYIQKIHREKFIYINDSEVIKEADRFNSLGNLNTVNIKVLPRIISKPLNETERLV
252



faecium;

006376560.1

glycosyl-


RKIMVRLFGVAGFNESAIFQSLNKFGIYYHPSVYKFYESLKTGFPIKIIEGGFQSWKYLETCPEIKQELRVKYEP




Enterococcus



transferase


MGENLRLLNLISQSESVCVHIRRGDYLSPKYKHLNVCDYQYYFESMNYIISKLNNPTFFIFSNTSDDLDWIKEN




faecium DO;



[Enterococcus


YSLPGKIVYVKNDNPDYEELRLMYSCKHFILLSNSTFSWWAQYLSNNSGIVIAPEIWNRLNHDGIADLYMPNW




Enterococcus




faecium DO]



ITMKVNR




faecium










EnGen0035













Bacteroides;

WP_
490442319
glycosyl-
22.67

MDVVVIFNGLGNQMSQYAYYLAKKKVNPNTKVIFDIMSKHNHGYDLERAFGIEVNKTLLIKVLQIIYVLSRK
253



Bacteroides

004313284.1

transferase


FRLFKSVGVRTIYEPLNYDYTPLLMQKGPWGINYYVGGWHSEKNFMNVPDEVKKAFMFREQPNEDRFNE



sp. 2_1_22;


family 11


WLQVIRGDNSSVSVHIRRGDYMNIEPTGYYQLNGVATLDYYHEAIDYIRQYVTPHFYVFSNDLDWCKEQF




Bacteroides



[Bacteroides]


GVENFFYIECNQGVNSWRDMYLMSECHYHINANSTFSWWGAWLCKFEDSITVCPERFIRNVVTKDFYPER



sp. 2_2_4;





WHKIKSC




Bacteroides










sp. D1;










Bacteroides











xylanisolvens










SD CC 2a;










Bacteroides











xylanisolvens










SD CC 1b;










Bacteroides











ovatus CAG:22














Synecho-

YP_
326781960
glycosyl-
22.6

MIGFNALGRMGRLANQMFQYASLKGIARNTGVDFCVPYHEEAVNDGIGNMLRTEIFDSFDLQVNVGLLNK
254



coccus phage

004322362.1

transferase


GHAPVVQERFFHFDEELFRMCPDHVDIRGYFQTEKYFKHIEDEIREDFTFKDEILNPCKEMIAGVDNPLALHV



S-SM2


family 11


RRTDYVTNSANHPPCTLEYYEALLKHFDDDRNVIVFSDDPAWCKEQELFSDDRFMISENEDNRIDLCLMSLC






[Synecho-


DDFIIANSTYSWWGAWLSANKDKKVIAPVQWFGTGYTKDHKTSKLIPDGWTRIATA







coccus phage










S-SM2]










Geobacter

YP_
404496189
glycosyl-
22.58

MDIHVLSYGLGNQLSYAQFFINRRQLMQRAYAFYAFKQHNGYELDRIFGLKEGLPWYLQFVRVVFRLGISRR
255



metalli-

006720295.1

transferase


FYSKRTADFVLSLFRIKVIDEAYNYEFDPSLLKPWFGIRILYGGWHDSRYFHPSEAAVRTAFSFPPLDDVNDAIL




reducens;



[Geobacter


QQIDAVYGVSIHVRRGDYLKGINSNLFGGIATLEYYRNAIGWAITYCKHRSLEIKFYVFSDDIDWCKQNLGLR




Geobacter




metalli-



DAVYVSGNSKTDSEKDILLMSHCRANIIANSTFSWWAAWLNQQPNKVVICPTKFINTDSPNQTIYPAAWH




metalli-




reducens 



QIEG




reducens GS-



GS-15]






15; Geobacter










metalli-











reducens RCH3














Lachno-

WP_
551037245
protein
22.58

MIIVRFHGGLGNQMFEYAFYRYMTNKYGADNVIGDMTWFDRNYSEHQGYELKKVFDIDIPAIDYKTLAKIH
256



spiraceae

022780989.1

[Lachno-


EYYPRYHRFAGLRYLSRMYAKYKNKHLKPTGEYIMDFGPSQYIHNDAFDKLDTNKDYYIEGVFCSDAYIKYYE




bacterium




spiraceae



NQIKKDLTFKPNYSQHTKDMLPKIEETNSVAIHVRRGDYVGNVFDIVTPDYYRQAVNYIRERVENPVFFVSD



NK4A136



bacterium



DMDYIKANFDFLGDFVPVHNCGKDSFQDMYLISRCRHMIIANSSFSYFGALLGEKDSTIVIAPKKYKADEDLA






NK4A136]


LARENWVLL







Bacteroides

WP_
495419937
protein
22.56

MGFIVNMACGLANRMFQYSYYLFLKKQGYKVTVDFYRSAKLAHEKVAWNSIFPYAEIKQASRLKVFLWGGG
257



coprophilus;

008144634.1

[Bacteroides


SDLCSKVRRRYFPSSTNVRTTTGAFDASLPANTARNEYIIGVFLNASIVEAVDDEIKKCFTFLPFTDEMNLRLKK




Bacteroides




coprophilus]



EIEECESVAIHVRKGKDYQSRIWYQNTSCMEYYRKAILQMKEKLQHSKFYVFTDNVDWVKENFQEIDYTLVE




coprophilus






GNPADGYGSHFDMQLMSLCKHNIISNSTYSWWSAFLNRNPEKVVIAPEIWFNPDSCDEFRSDRALCKGWI



DSM 18228 =





VL



JCM 13818













Bacteroidetes;

WP_
495895157
alpha-1,2-
22.53

MKIVCLKGGLGNQMFEYCRFRDLMESGHDEVYLFYDHRRLKQHNGLRLSDCFELELPSCPWGIKLVVWGLK
258



Capnocytophaga

008619736.1

fucosyl-


ICRAVGVLKRLYDDEKPEAVLIDDYSQHRRFIPNARRFFFRQFLAELQSGFVQMIRAVDYPVSVHVRRGDYL



sp. oral


transferase


HPSNSSFGLCGVDYFQQAIAYVRKKRPDARFFFFSDDMEWVRENLWMEDAVYVEHTELLPDYVDLYLMTL



taxon 329 str.


[Bacter-


CRGHHIISNSTFSFWGAYLAVDGNGMKIYPRRWFRDPTWTSPPIFSEEWVGL



F0087



oidetes]








Para-











prevotella











clara YIT










11840













Butyrivibrio

WP_
551026242
glycosyl-
22.47

MLIIQIAGGLGNQMQQYAVYTKLREMGDKVKLDLSWFDPQVQKNMLAPREFELPIFGGTDYEECSDAYERD
259


sp. NC2007
022770361.1

transferase


ALLKQGAFAAIAGKVLKKLGLRDEANPKVFSEKEMYHPEVFELEDKYIKGYFACQKYYGDIMDKLQEKFIFPE






[Butyrivibrio


HSDPDLHARNMALVERMEREPSVSVHIRRGDYLDPSNVEILGNIATEQYYQGAMDYFTVKEPDTHFYIFTSD






sp. NC2007]


HEYAREKFSDESKYTIVDWNNGKNSVQDLMLMSHCKHNICANSTFSFWGARLNKRPDKTVIRTYKMRNN









QPVNPQIMHDYWKGWILMDEKGSII







Para-

WP_
495903957
glycosyl-
22.45

MKIILVFTGGLGNQMFAYAFYLYLKRLFPQERFYGLYGKKLSEHYGLEIDKWFKVSLPRQPWWVLPVTGLFYL
260



prevotella

008628536.1

transferase


YKQCVPNSKWLDLNQEICKNPRAVIFFPFKFTKKYIPDDNIWLEWKVDESGLSEKNRLLLSEIRSSDCCFVHVR




xylaniphila;



[Para-


RGDYLSPTFKSLFEGCCTLSYYQRALKSMKEISPFVKVFCFSDDIQWVKQNLELGNRAVFVWNSGTDSPLD




Para-




prevotella



MYLMSQCRYGIMANSTFSYWGARLGRKKKRIYYPQKWWNHGTGLPDIFPNTWVKI




prevotella




xylaniphila]








xylaniphila










YIT 11841













Blautia

WP_
492742598
protein
22.44

HEIHVYLTGRLGNQLFQYAFARHLQKEYGGKIICNIYELEHRSEKAAWVPGKFNYEMSNYKLNDSILIEDIKLP
261



hydro-

00594476.1

[Blautia]


WFADFSNPIIRIVKKVIPRIYFNLMASKGYLLWQKNSYINIPAIRNNEIIVNGWWQDVRFFHDVEAELSNEIVP




genotrophica






TTKPISENEYLYNIAERENSVCVSIRGGNYLVPKVKKKLFVCDKEYFYNAIELLIKSKVRNAIFVFSDDLEWVKSYI



DSM 10507;





KLEEKFPECKFYYESGKDTVEEKLRMMTKCKHFIISNSSFSWWAQYLAKNENFIVIAPDAWFTNGDKNGLYI




Blautia;






DDWILIPTQTKKM




Blautia 











hydro-











genotrophica










CAG:147













Geobacter

YP_
189425804
glycoside
22.44

MITVLLNGGLGNQLFQYAAGRALAEKHDVELLLDLSRLQHPKPGDTPRCFELAPFNIKASLLAEEGRQPLGGSY
262



lovleyi;

001952981.1

hydrolase


QACMHRLLLKASIPLWGSIILKEQGCGFDPLIFRAPSCILDGFWQDECYFKQITSLLQQELSLKAPSPALRKAS




Geobacter



family


SVLSDATVAVHVRRGDYVTNPAAASFHGICSQDYYQAAVANILTSYPDSQFLVFSDDPAWCQEHLDLGQPF




lovleyi SZ



protein


RLAADFGLNGSAEELVLISRCAHQIIANSSFSWWGAWLNPSPHKLVVAPCRWFTPAITTNDLLPETWVRLP






[Geobacter










lovleyi SZ]











Lachno-

WP_
551037435
protein
22.41

MVISHLSGGFGNQLYSYAFAYAVAKARKEELWIDTAIQDAPWFFRNPDILNLNIKYDKRVSYKIGEKKIDKIFN
263



spiraceae

022781176.1

[Lachno-


RINFRNAIGWNTKIINESDMPNIDDWFDTCVNQKGNIYIKGNWSYEKLFISVKQEIIDMFTFKNELSKEANDI




bacterium




spiraceae



AQDINSQETSVHIHYRLGDYVKIGINIVPDYFIASAMTSMVEKYGNPVFYSFSEDNDWVKKQFEGLPYNIKYVE



NK4A136



bacterium



YSSDDKGLEDFRLYSMCKHQIASNSSYSWWGAYLNNNPNKYIIAPTDYNGGWKSEIYPKHWDVRPFEFLK






NK4A136]










Bacteroides

WP_
492426440
glycosyl-
22.37

MFHYKFLLFGGGLGNQIFEYYFYLWLRKKYPNIVFLGCYRKASFKAHNGLEISDVFDVDLPNDGGLSGRFISYV
264



vulgatus;

005840359.1

transferase


LSVLSRIIPSLSMKANTEYSSKYLLINAYQPNLLFYLNEEKIKFRPFKLDEVNRRLLNSIKMESSVSIHVRRGDYLF




Bacteroides



family 11


GQYRDIYSNICTLAYYQKAVDKCKGILESPRFFVFSDDIEWARDVFVGREYEFVSNNIGKNSFIDMFLMSNCKI




vulgatus



[Bacteroides


QIIANSTFSYWAAYLSNSLVKIYPAKWINGIERPNIFPDNWIGL



PC510



vulgatus]











Planctomyces

YP_
325110698
glycosyl-
22.37

MIIARIENGLGNQLFKYAAGRALSKHRTSLYTIPGSVRKPHETFILSKYFNVQAKSVSPFLLQTGFRLRLLKGYE
265



brasiliensis;

004271766.1

transferase


NHSFGFDPRFETTRNNTVVSGNFQSARYFLPFFDQINRELTLKPEVVDGLESVYPHVLESLRTPNSVVNHIRLG




Planctomyces



family protein


DYVSSGYDICGPEYYAKAISRLQQLGHELRAFVFSDTPQAASRFLPADIDAQIMSEFPEVRDAARSLTVERSTI




brasiliensis



[Planctomyces


RDYFLMQQCRHFVIPSNNFSYWAALLSSSDGDVIYPNRWYIDIDTSPRDLGLAPAEWTPIPLT



DSM 5305



brasiliensis










DSM 5305]










Butyrivibrio

WP_
497499658
alpha-1,2-
22.36

MIILQIAGGLGNQMQQYALRKLLKCGKTVKLDLSWFGPEIQKNMLAPFEFELVLFDLPFEICTKEEFDALIK
266


sp. AE2015
009813856.1

fucosyl-


QNLFQKIAGKVSQKLGKSASSNAKVFVETMYHEEIFDLDDVYITGYFACQYYYDDVMAELQDLFVFPSHSIP






transferase


ELDQRNAVLASKMEKENSVSVHIRRGDYLSPENVGILGNIASDKYYESAMNYFLEKDENTHFYIFTNDHEYAR






[Butyrivibrio


EHYSDESRYTIIDWNTGKNSLQDLMLMSHCKGNICANSTFSWGARLNKRPDRELVRTLKMRNNQEAQPEI






sp. AE2015]


MHEYWKNWILIDENGVIV







Roseovarius

WP_
497499658
alpha-1,2-
22.34

MTDTPPPSQVITSRLFGGAGNQLFQYAAGRALADRLGCDLMIDARYVAGSRDGDCFTHFAKARLRRDVA
267



nubinhibens

009813856.1

fucosyl-


LPPAKSDGPLYRALWRKFGRSPRFHRERGLGVDPEFFNLPRGTYHGYWQSEQYFGPDTDALRRDLTLTTAL



ISM;


transferase


DAPNAAMAAQIDAAPCPVSFHVRRGDYIAAGAYAACTPDYYRAAADHLATTLGKPLTCFIFSNDPAWARD




Roseovarius



[Roseovarius


NLDLGQDQVIVDLNDEATGHFDMALMARCAHHVIANSTFSWWGAWLNPDPDKLVVAPRNWFATQALH




nubinhibens




nubinhibens]



NPDLIPEQWHRL







Eubacterium

WP_
548315094
protein
22.33

MIEVNIVGQLGNQMFEYACARQLQKKYGGEIVLNTYEMRKETPNKFLSILDYKSENDVKIISDKPLSSANANN
268


sp. CAG:581
022505071.1

[Eubacterium


YLVKIMRQYFPNWYFNFMAKRGTFVWKSARKYKELPELNEQLSKHIVLNGYWQCDKYFNDVVDTIREDFTP






sp. CAG:581]


KYPLKAENEQQLEKIKSTESVSVTIRRGDFMNEKNKDTFYICDDDYFNKALSKIKELCPDCTFGFSDDVWEIKK









NVNFPGEVYFESGNDPVWEKLRLMSACKHFVLSNSSFSWWAQYLSDNNNKIVVAPDIWYKTDGPKKTALY









QDGWNLIHIGD







Providencia

AFH02807.1
383289327
glycosyl-
22.26

MKINGKESSMKIKQKKIISHLIGGLGNQLFQYATSYALAKENNAKIVIDDRLFKKYKHGGYRLDKLNIIGEKISS
269



alcalifaciens



transferase


IDKLLFPLILCKLSQKENFIFKSTKKFILEKKTSSFKYLTFSDKEHTKMLIGYWQNAIYFQKYFSELKEMFVPLDIS






[Providencia


QEQLDLSIQHAQQSVALHVRRGDYISNKNALAMHGICSIDYYKNSIQHINAKLEKPFFYIFSNDKLWCEENLT







alcalifaciens]



PLFDGNFHIVENNSQEIDLWLISQCQHHIIANSTFSWWGAWLANSDSQIVITPDPWFNKEIPIPSPVLSHWL









KLKK







Salmonella

AFW04804.1
411146173
glycosyl-
22.26

MFSCLSGGLGNQMFQYSAAYILKKNICHAQLIIDDSYFYCQPQKDTPRNFEINQFNIVFDRVTTDEEKRAISKL
270



enterica



transferase


RKFKKIPLPLFKSNVITEFLFGKSLLTDEDFYKVLKKNQFTVKMNACLFSLYQDSSLINKYRDLILPLFTINDELLQ






[Salmonella


VCQQLDSYGFICEHTNTTSLHIRRGDYVTNPHAAKGHGTLSMNYYSQAMNYVDHKLGKQLFIIFSDDVQWA







enterica]



AEKFGGRSDCYIVNNVNCQFSAIDMYLMSLCNNNIIANSTYSWWGAWLNKSEEVLVIAPRKWFAEDKESLL









AVNDWISI







Sulfuro-

YP_
268680398
protein
22.18

MIIIKIMGGLASQLHKYSVGRALSLKYNTELKLDIFWFDNISGSGTIREYHLDKYNVVAKIATEQEIKQFKPNKY
271



spirillum

003304829.1

Sdel_1779


LLKINNLFQKFTNWKINYRNYCNESFISLENFNLLPDNIYVEGEWSDRYFSHIKEILQKELTLKSEYMDSTNHF




deleyianum;



[Sulfuro-


LAKQSSDFAHDDNASKLHCTCSLEYYKKALQYISKNLLKMKLLIFSDDLDWLKPNFNFLDNVEFEFVEGFQDY




Sulfuro-




spirillum



EEFHLMTLSKHNIIANSGFSLFFAWLNINHNKIIISLSEWVFEEKLNKYIIDNIKDKNILFLENLE




spirillum




deleyianum








deleyianum



DSM 6946]






DSM 6946













Pseudo-

YP_
374329930
alpha-1,2-
22.15

MSVASQVRISGAARRRKLKPTLIVRIRGGIGNQLFQYALGRKIALETGMKLRFDRSEYDQYFNRSYCLNLFKT
272



vibrio sp.

005080114.1

fucosyl-


QGLSATESEMSAVLWPAQSFGQTVKLCRKFYPFYQRRYIREDELLQDSETPVLKQSAYLDGYWQTWEIPFSI



FO-BEG1


transferase


MEQLRDEITLKKPMVLERLKLLQRIKSGPSAALHVRYGDYSQAHNLQNFGLCSAGYYKGAMDFLTERVPGLT






[Pseudo-


FYVFSDSPERAREVVPQQENVYFSDPMQDGKDHEDLMVMSSCDHIVTANSTFSWWAAFLNGNEDKHVIA







vibrio sp.



PLKWFKNPNLDDLIVPPHWQRL






FO-BEG1]










Prevotella

WP_
496529942
alpha-1,2-
22.11

MKIVCIKGGLGNQLFEYCRYHGLLRQHNNHGVYLHYDRRRTKQHGGVWLDKAFLITLPTEPWRVKLMVM
273


sp. oral
009236633.1

fucosyl-


ALKMLRKLHLFKRLYREDDPRAVLIDDYSQHKQFITNAAEILNFRPFAQLDYVDEITSEPFAVSVHVRRGDYLL



taxon 472 str.


transferase


PANKANFGVCSVHYYLSAAVAVRERHPDARFFVFSDDIEWAKMNLNLNLPNCVFVEHAQPQPDHADLYLMSL



F0295;


[Prevotella


CKGHIIANSTFSFWGAYLSMGSSAIAIYPKQWFAEFTWNAPDIFLGHWIAL




Prevotella



sp. oral






sp. oral


taxon 472]






taxon 472 













Butyrivibrio

WP_
551008155
glycosyl-
22.08

MLIIRVAGGLGNQMQQYAMYRKLKSLGKEVKLDLSWFDVENQEGQLAPRKCELKYFDGVDFEECTDAERA
274



fibrisolvens

022752732.1

transferase


YFTKRSILTKALNKVFPATCKIFEETEMFHPEIYSFKDKYLEFYFLCNKYYDDILPFIQNEIVPKHSDPKMQKRN






[Butyrivibrio


EELMERMDGWHTASIHLRRGDYITEPQNEALFGNIATDAYYDAAIRYVLDKDYQTHFYIFSNDPEYAREHYS







fibrisolvens]



DESRYTIVTGNDGDNSLLDMELMSHCRYNICANSTFSFWGARLNKRSDKEMIRTFKMRNNQEVTAREMTD









YWKDWILIDEKGNRIF







Lewinella

WP_
522059857
protein
22.04

MVISRLHSGLGNQMFQYAFARRIQLQLNVKLRIDLSILLDSRPPDGYIKREYDLDIFKLSPAYHCNPTSLRILYA
275



persica

020571066.1

[Lewinella


PGKYRWSQVVRDLARKGYPVYMEKSFSVDNTLLDSPPDNVIYQGYWQSERYFSEVANTIRKDFAFQHSIQP







persica]



QSESLAREIRKEDSVCLNIRRKDYLASPTHNVTDETYYENCIQQMRERFSGARFFLFSDDLVWCREFFAHFHD









VVIVGHDHAGPKFGNYLQLMAQCHHYIIPNSTFAWWAAWLGERTGSVIMAPERWFGTDEFGYRDVVPER









WLKVPN









Other Embodiments

While the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.


The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference. All published foreign patents and patent applications cited herein are hereby incorporated by reference. GenBank and NCBI submissions indicated by accession number cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.


While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims
  • 1. A method for producing a fucosylated oligosaccharide in a bacterium comprising providing a bacterium comprising an exogenous lactose-utilizing α(1,2) fucosyltransferase enzyme, wherein the amino acid sequence of said enzyme comprises at least 22% but less than 40% identity to FutC (SEQ ID NO: 1).
  • 2. A method for producing a fucosylated oligosaccharide in a bacterium comprising providing a bacterium comprising an exogenous lactose-utilizing α(1,2) fucosyltransferase enzyme, wherein the amino acid sequence of said enzyme comprises at least 25% identity to FutN (SEQ ID NO: 3).
  • 3. The method of claim 1 or 2, wherein said α(1,2) fucosyltransferase enzyme comprises Lachnospiraceae sp. FutQ, Tannerella sp. FutS, Clostridium bolteae+13 FutP, Methanosphaerula palustris FutR, Akkermansia muciniphila FutY, Clostridium bolteae FutP, or a functional variant or fragment thereof.
  • 4. The method of claim 1, wherein said α(1,2) fucosyltransferase enzyme comprises any one of amino acid sequences SEQ ID NO: 10-21 and 292, or a functional variant or fragment thereof.
  • 5. The method of claim 1, further comprising retrieving the fucosylated oligosaccharide from said bacterium or from a culture supernatant of said bacterium.
  • 6. The method of claim 1, wherein said fucosylated oligosaccharide comprises 2′-fucosyllactose (2′-FL), lactodifucotetraose (LDFT), Lacto-N-fucopentaose I (LNF I), or lacto-N-difucohexaose I (LDFH I).
  • 7. The method of claim 1, wherein the bacterium further comprises an exogenous lactose-utilizing α(1,3) fucosyltransferase enzyme and/or an exogenous lactose-utilizing α(1,4) fucosyltransferase enzyme.
  • 8. The method of claim 7, wherein the exogenous lactose-utilizing α(1,3) fucosyltransferase enzyme comprises a Helicobacter pylori 26695 futA gene.
  • 9. The method of claim 7, wherein the exogenous lactose-utilizing α(1,4) fucosyltransferase enzyme comprises a Helicobacter pylori UA948 FucTa gene or a Helicobacter pylori strain DMS6709 FucT III gene.
  • 10. The method of claim 1, wherein said bacterium further comprises a reduced level of β-galactosidase activity, a defective colanic acid synthesis pathway, an inactivated adenosine-5′-triphosphate (ATP)-dependent intracellular protease, or an inactivated endogenous lacA gene, or any combination thereof.
  • 11. The method of claim 10, wherein said method further comprises culturing said bacterium in the presence of tryptophan and in the absence of thymidine.
  • 12. The method of claim 10, wherein said reduced level of β-galactosidase activity comprises a deleted or inactivated endogenous lacZ gene and/or a deleted or inactivated endogenous lacI gene of said bacterium.
  • 13. The method of claim 12, wherein said reduced level of β-galactosidase activity further comprises an exogenous lacZ gene or variant thereof, wherein said exogenous lacZ gene or variant thereof comprises a β-galactosidase activity level less than that of a wild-type bacterium.
  • 14. The method of claim 10, wherein said reduced level of β-galactosidase activity comprises an activity level less than that of a wild-type bacterium.
  • 15. The method of claim 14, wherein said reduced level of β-galactosidase activity comprises less than 6,000 units of β-galactosidase activity.
  • 16. The method of claim 14, wherein said reduced level of β-galactosidase activity comprises less than 1,000 units of β-galactosidase activity.
  • 17. The method of claim 10, wherein said bacterium comprises a lacIq gene promoter immediately upstream of a lacY gene.
  • 18. The method of claim 10, wherein said defective colanic acid synthesis pathway comprises the inactivation or deletion of the wcaJ gene of said bacterium.
  • 19. The method of claim 10, wherein said inactivated ATP-dependent intracellular protease is a null mutation, inactivating mutation, or deletion of an endogenous lon gene.
  • 20. The method of claim 19, wherein said inactivating mutation of an endogenous lon gene comprises the insertion of a functional E. coli lacZ+ gene.
  • 21. The method of claim 10, wherein said bacterium further comprises a functional lactose permease gene.
  • 22. The method of claim 21, wherein said bacterium comprises E. coli lacY.
  • 23. The method of claim 10, wherein said bacterium further comprises an exogenous E. coli rcsA or E. coli rcsB gene.
  • 24. The method of claim 10, wherein said bacterium further comprises a mutation in the thyA gene.
  • 25. The method of claim 10, wherein said bacterium accumulates intracellular lactose in the presence of exogenous lactose.
  • 26. The method of claim 10, wherein said bacterium accumulates intracellular GDP-fucose.
  • 27. The method of claim 1, wherein said bacterium is E. coli.
  • 28. The method of claim 1, wherein said production strain is a member of the Bacillus, Pantoea, Lactobacillus, Lactococcus, Streptococcus, Propionibacterium, Enterococcus, Bifidobacterium, Sporolactobacillus, Micromonospora, Micrococcus, Rhodococcus, or Pseudomonas genus.
  • 29. The method of claim 1, wherein said production strain is selected from the group consisting of Bacillus licheniformis, Bacillus subtilis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, Bacillus circulans, Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, Xanthomonas campestris, Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, Lactococcus lactis, Streptococcus thermophilus, Propionibacterium freudenreichii, Enterococcus faecium, Enterococcus thermophilus, Bifidobacterium longum, Bifidobacterium infantis, Bifidobacterium bifidum, Pseudomonas fluorescens, and Pseudomonas aeruginosa.
  • 30. The method of claim 1, wherein said bacterium comprises a nucleic acid construct comprising an isolated nucleic acid encoding said α(1,2) fucosyltransferase enzyme.
  • 31. The method of claim 30, wherein said nucleic acid is operably linked to one or more heterologous control sequences that direct the production of the enzyme in the bacterium.
  • 32. The method of claim 31, wherein said heterologous control sequence comprises a bacterial promoter and operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, or a plasmid selectable marker.
  • 33. A purified fucosylated oligosaccharide, wherein: (a) at least 99% of said purified fucosylated oligosaccharide comprises 2′-fucosyllactose (2′-FL), lactodifucotetraose (LDFT), Lacto-N-fucopentaose I (LNF I), and/or lacto-N-difucohexaose I (LDFH I) by weight; and (b) said purified fucosylated oligosaccharide lacks 3-fucosyllactose (3-FL) or contains an amount of 3-FL that is less than 0.5% the level of 2′-FL.
  • 34. A nucleic acid construct comprising an isolated nucleic acid encoding a lactose-utilizing α(1,2) fucosyltransferase enzyme for the production of said enzyme in a host bacterial production strain, wherein the amino acid sequence of said enzyme comprises at least 22% identity to FutC (SEQ ID NO: 1).
  • 35. (canceled)
  • 36. The construct of claim 34, wherein said α(1,2) fucosyltransferase enzyme comprises Lachnospiraceae sp. FutQ, Tannerella sp. FutS, Clostridium bolteae+13 FutP, Salmonella enterica FutZ, Methanosphaerula palustris FutR, Akkermansia muciniphila FutY, Clostridium bolteae FutP, or a functional variant or fragment thereof.
  • 37. The construct of claim 34, wherein said α(1,2) fucosyltransferase enzyme comprises any one of amino acid sequences SEQ ID NO: 10-21 and 292, or a functional variant or fragment thereof.
  • 38. The construct of claim 34, wherein said nucleic acid is operably linked to one or more heterologous control sequences that direct the production of the enzyme in the bacterium.
  • 39. The construct of claim 38, wherein said heterologous control sequence comprises a bacterial promoter and operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, a plasmid selectable marker, and/or an origin of replication.
  • 40. An isolated bacterium comprising an isolated nucleic acid encoding a lactose-accepting α (1,2) fucosyltransferase enzyme, wherein the amino acid sequence of said enzyme encoded by said nucleic acid comprises at least 22% identity to FutC (SEQ ID NO: 1).
  • 41. An isolated bacterium comprising an isolated nucleic acid encoding a lactose-accepting α (1,2) fucosyltransferase enzyme, wherein the amino acid sequence of said enzyme encoded by said nucleic acid comprises at least 25% identity to FutN (SEQ ID NO: 3).
  • 42. The isolated bacterium of claim 40 or 41, wherein said α(1,2) fucosyltransferase enzyme comprises Lachnospiraceae sp. FutQ, Tannerella sp. FutS, Clostridium bolteae+13 FutP, Salmonella enterica FutZ, Methanosphaerula palustris FutR, Akkermansia muciniphila FutY, Clostridium bolteae FutP, or a functional variant or fragment thereof.
  • 43. The isolated bacterium of claim 40, wherein said α(1,2) fucosyltransferase enzyme comprises any one of amino acid sequences SEQ ID NO: 10-21 and 292, or a functional variant or fragment thereof.
  • 44. The isolated bacterium of claim 40, further comprising an α(1,3) fucosyltransferase enzyme and/or an α(1,4) fucosyltransferase enzyme.
  • 45. The isolated bacterium of claim 40, wherein said bacterium is Escherichia coli.
  • 46. The isolated bacterium of claim 40, wherein said bacterium further comprises a reduced level of β-galactosidase activity, a defective colanic acid synthesis pathway, an inactivated adenosine-5′-triphosphate (ATP)-dependent intracellular protease, an inactivated endogenous lacA gene, or any combination thereof.
  • 47. The isolated bacterium of claim 46, wherein said bacterium comprises the genotype ΔampC::PtrpBcI, Δ(lacI-lacZ)::FRT, PlacIqLacY+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ+).
  • 48. The purified fucosylated oligosaccharide of claim 33, wherein said purified fucosylated oligosaccharide does not comprise acetyl-lactose that is detectable by thin layer chromatography.
  • 49. The purified fucosylated oligosaccharide of claim 33, which is in the form of a crystalline powder.
  • 50. The method of claim 1, wherein the amino acid sequence of said enzyme comprises (i) the amino acid sequence of FutP (SEQ ID NO: 11); or (ii) at least 25% identity to FutP (SEQ ID NO: 11).
  • 51. The method of claim 50, wherein the amino acid sequence of said enzyme comprises (i) the amino acid sequence of FutQ (SEQ ID NO: 12); or (ii) at least 25% identity to FutQ (SEQ ID NO: 12).
  • 52. The method of claim 51, wherein the amino acid sequence of said enzyme comprises (i) the amino acid sequence of FutR (SEQ ID NO: 13); or (ii) at least 25% identity to FutR (SEQ ID NO: 13).
  • 53. The method of claim 52, wherein the amino acid sequence of said enzyme comprises (i) the amino acid sequence of FutS (SEQ ID NO: 14); or (ii) at least 25% identity to FutS (SEQ ID NO: 14).
  • 54. The method of claim 53, wherein the amino acid sequence of said enzyme comprises (i) the amino acid sequence of FutY (SEQ ID NO: 19); or (ii) at least 25% identity to FutY (SEQ ID NO: 19).
  • 55. The method of claim 54, wherein said enzyme is not a mutant enzyme.
  • 56. The method of claim 55, wherein said enzyme naturally occurs in a bacterium that naturally exists in the gastrointestinal tract.
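Several of the claims above recite a minimum percent identity to a reference sequence (for example, at least 22% identity to FutC or at least 25% identity to FutN). As an illustration only, the following minimal Python sketch shows one common way such a figure can be computed from two sequences that have already been aligned. It is not the method defined by this document: the function names `percent_identity` and `meets_threshold`, the choice to ignore gapped columns in the denominator, and the toy strings in the usage note are all assumptions made for this example, and other conventions (such as dividing by the full alignment length or the reference length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumption for this sketch: columns where either sequence has a gap
    are skipped, so the denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a.upper() == residue_b.upper():
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_query: str, aligned_reference: str,
                    threshold_percent: float) -> bool:
    """Check whether the aligned query reaches a minimum percent identity."""
    return percent_identity(aligned_query, aligned_reference) >= threshold_percent
```

For instance, with the hypothetical toy strings `"MKT-AYL"` and `"MRTGAYI"` (not real enzyme sequences), six columns are compared and four match, so `meets_threshold("MKT-AYL", "MRTGAYI", 22.0)` returns `True`. Producing the alignment itself is outside the scope of this sketch.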
PCT Information
Filing Document | Filing Date | Country | Kind
PCT/US15/30823 | 5/14/2015 | WO | 00
Provisional Applications (1)
Number | Date | Country
61993742 | May 2014 | US