METHODS AND COMPOSITIONS FOR 3-HYDROXYPROPIONATE PRODUCTION

Abstract
Provided herein, inter alia, are methods, host cells, and vectors for producing 3-hydroxypropionate (3-HP). In some embodiments, the host cells include a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH). In some embodiments, the methods include culturing said host cell(s) in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP. Expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH.
Description
SUBMISSION OF SEQUENCE LISTING ON ASCII TEXT FILE

The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 220032001640SEQLIST.TXT, date recorded: May 11, 2018, size: 484 KB).


FIELD

The present disclosure relates, inter alia, to methods, host cells, and vectors for producing 3-hydroxypropionate (3-HP) using an oxaloacetate decarboxylase (OAADC) and a 3-hydroxypropionate dehydrogenase (3-HPDH).


BACKGROUND

Acrylate is an important industrial building block for polymers utilized in diapers, plastic additives, surface coatings, water treatment, adhesives, textiles, surfactants, and other products. The market for acrylate is estimated to expand to 8.2 MMT ($20 billion) by 2020. 3-hydroxypropionate (3-HP) was identified as one of the top 12 value-added chemicals from biomass in 2004 (Werpy, T. et al. “Top Value Added Chemicals from Biomass.” US Department of Energy Report, Vol. 1. 2004), because 3-HP can be converted into acrylic acid, and several other commodity chemicals, in one step (FIG. 1).


There are more than 7 metabolic pathways proposed for 3-HP production (Kumar, V. et al. (2013) Biotech. Adv. 31:945-961; FIG. 2A); however, none of them is efficient enough for industrial-scale production. 3-HP could in theory be produced by a simplified metabolic pathway from glucose using an oxaloacetate decarboxylase to convert oxaloacetate into 3-oxopropanoate (FIG. 2B) with extremely high efficiency (e.g., 100% wt. 3-HP/wt. glucose); however, an enzyme that efficiently catalyzes this reaction has not been found (see U.S. Pat. Nos. 8,048,624 and 8,809,027).
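The 100% wt./wt. figure can be checked with a short mass balance. The sketch below assumes a net conversion of one glucose (C6H12O6) into two 3-HP (C3H6O3); this stoichiometry is an interpretation of the simplified pathway of FIG. 2B and is not stated explicitly in the text. Atomic masses are standard rounded values, and all function and variable names are illustrative.

```python
# Standard atomic masses in g/mol (rounded values)
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def molar_mass(formula):
    """Return the molar mass (g/mol) of a compound given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

GLUCOSE = {"C": 6, "H": 12, "O": 6}   # C6H12O6
THREE_HP = {"C": 3, "H": 6, "O": 3}   # C3H6O3 (3-hydroxypropionic acid)

# Assumption: net stoichiometry of 1 glucose -> 2 3-HP
MOL_3HP_PER_MOL_GLUCOSE = 2

def theoretical_mass_yield():
    """Grams of 3-HP per gram of glucose under the assumed stoichiometry."""
    return MOL_3HP_PER_MOL_GLUCOSE * molar_mass(THREE_HP) / molar_mass(GLUCOSE)

print(f"Theoretical yield: {theoretical_mass_yield():.1%} wt. 3-HP/wt. glucose")
```

Because two molecules of C3H6O3 contain exactly the atoms of one C6H12O6, the mass ratio under this assumption is 1, which is consistent with the 100% wt./wt. value quoted above.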


Therefore, a need exists for methods, host cells, and vectors that allow for the efficient production of 3-HP, e.g., on an industrial scale. The use of an oxaloacetate decarboxylase would result in reduced costs and optimized processes as compared to existing methods.


SUMMARY

To meet these and other demands, provided herein are methods, host cells, and vectors for producing 3-hydroxypropionate (3-HP), e.g., using an oxaloacetate decarboxylase (OAADC) and a 3-hydroxypropionate dehydrogenase (3-HPDH).


Accordingly, certain aspects of the present disclosure relate to a method for producing 3-hydroxypropionate (3-HP), the method comprising: providing a recombinant host cell, wherein the recombinant host cell comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), and wherein the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1; and culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP, wherein expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH. Other aspects of the present disclosure relate to a method for producing 3-hydroxypropionate (3-HP), the method comprising: providing a recombinant host cell, wherein the recombinant host cell comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), wherein the OAADC has a specific activity of at least 0.1 μmol/min/mg against oxaloacetate; and culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP, wherein expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH.
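The two OAADC criteria recited above (a pyruvate-to-oxaloacetate activity ratio of at most about 5:1, and a specific activity against oxaloacetate of at least 0.1 μmol/min/mg) can be illustrated with a minimal sketch. The class name, function names, and assay values below are hypothetical and are not data from this disclosure; "about 5:1" is treated here as a hard cutoff purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class EnzymeAssay:
    name: str
    oaa_activity: float       # specific activity against oxaloacetate, umol/min/mg
    pyruvate_activity: float  # specific activity against pyruvate, umol/min/mg

MAX_PYR_TO_OAA_RATIO = 5.0  # "less than or equal to about 5:1"
MIN_OAA_ACTIVITY = 0.1      # "at least 0.1 umol/min/mg against oxaloacetate"

def meets_ratio_criterion(assay):
    """True if activity(pyruvate) / activity(oxaloacetate) <= 5."""
    if assay.oaa_activity <= 0:
        return False  # no measurable oxaloacetate activity, ratio undefined
    return assay.pyruvate_activity / assay.oaa_activity <= MAX_PYR_TO_OAA_RATIO

def meets_activity_criterion(assay):
    """True if specific activity against oxaloacetate is at least 0.1 umol/min/mg."""
    return assay.oaa_activity >= MIN_OAA_ACTIVITY

# Hypothetical assay values, for illustration only
candidates = [
    EnzymeAssay("candidate_A", oaa_activity=2.0, pyruvate_activity=4.0),
    EnzymeAssay("candidate_B", oaa_activity=0.05, pyruvate_activity=1.0),
]

for c in candidates:
    print(c.name, meets_ratio_criterion(c), meets_activity_criterion(c))
```

The two criteria are kept as separate functions because the aspects above recite them as alternatives rather than as a combined requirement.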


In some embodiments, the recombinant host cell is a recombinant prokaryotic cell. In some embodiments, the prokaryotic cell is an Escherichia coli cell. In some embodiments, the host cell is selected from the group consisting of Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Sclerotina libertina, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Vibrio alginolyticus, Xanthomonas, Zymomonas, and Zymomonas mobilis. In some embodiments, the recombinant host cell is a recombinant fungal cell.


Other aspects of the present disclosure relate to a method for producing 3-hydroxypropionate (3-HP), the method comprising: providing a recombinant host cell, wherein the recombinant host cell comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), and wherein the recombinant host cell is a recombinant fungal cell; and culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP, wherein expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH. In some embodiments, the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. In some embodiments, the OAADC has a specific activity of at least 0.1 μmol/min/mg against oxaloacetate.


In some embodiments of any of the above embodiments, the OAADC has a specific activity of at least 10 μmol/min/mg against oxaloacetate. In some embodiments, the OAADC has a specific activity of at least 100 μmol/min/mg against oxaloacetate. In some embodiments of any of the above embodiments, the OAADC has a catalytic efficiency (kcat/KM) for oxaloacetate that is greater than about 2000 M−1 s−1. In some embodiments, the recombinant host cell (e.g., a fungal host cell) is capable of producing 3-HP at a pH lower than 6. In some embodiments, the recombinant host cell is capable of producing 3-HP below the pKa of 3-HP. In some embodiments, the fungal cell is a yeast cell. In some embodiments, the fungal cell is of a genus or species selected from the group consisting of Aspergillus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus terreus, Aspergillus pseudoterreus, Aspergillus usamii, Candida rugosa, Issatchenkia orientalis, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kluyveromyces marxianus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Rhodosporidium toruloides, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Yarrowia lipolytica, and Zygosaccharomyces rouxii.
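As a hedged illustration of the catalytic efficiency threshold, the sketch below computes kcat/KM in M−1 s−1 from a turnover number and a Michaelis constant and compares it with the value of about 2000 M−1 s−1 recited above. The kinetic constants used are hypothetical and are not measurements from this disclosure; the function names are illustrative.

```python
MIN_CATALYTIC_EFFICIENCY = 2000.0  # M^-1 s^-1, "greater than about 2000"

def catalytic_efficiency(kcat_per_s, km_millimolar):
    """Return kcat/KM in M^-1 s^-1, given kcat in s^-1 and KM in mM."""
    if km_millimolar <= 0:
        raise ValueError("KM must be positive")
    km_molar = km_millimolar / 1000.0  # convert mM to M
    return kcat_per_s / km_molar

def exceeds_threshold(kcat_per_s, km_millimolar):
    """True if kcat/KM is greater than the threshold."""
    return catalytic_efficiency(kcat_per_s, km_millimolar) > MIN_CATALYTIC_EFFICIENCY

# Hypothetical constants: kcat = 10 s^-1, KM = 1 mM
print(catalytic_efficiency(10.0, 1.0), exceeds_threshold(10.0, 1.0))
```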


In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence shown in Table 2 or Table 5A. In some embodiments of any of the above embodiments, the OAADC comprises the amino acid sequence of a polypeptide selected from the group consisting of 4COK (SEQ ID NO:1), A0A0F6SDN1_9DELT (SEQ ID NO:3), 4K9Q (SEQ ID NO:5), 1JSC (SEQ ID NO:15), 3L84_3M34 (SEQ ID NO:19), A0A0F2PQV5_9FIRM (SEQ ID NO:25), A0A0R2PY37_9ACTN (SEQ ID NO:41), X1WK73_ACYPI (SEQ ID NO:43), F4RJP4_MELLP (SEQ ID NO:51), A0A081BQW3_9BACT (SEQ ID NO:53), CAK95977 (SEQ ID NO:55), YP_831380 (SEQ ID NO:57), ZP_06846103 (SEQ ID NO:61), ZP_08570611 (SEQ ID NO:65), WP_010764607.1 (SEQ ID NO:77), YP_005756646.1 (SEQ ID NO:81), WP_018535238.1 (SEQ ID NO:85), YP_006485164.1 (SEQ ID NO:112), YP_005461458.1 (SEQ ID NO:113), YP_006991301.1 (SEQ ID NO:114), WP_003075272.1 (SEQ ID NO:115), WP_020634527.1 (SEQ ID NO:116), 1OVM (SEQ ID NO:117), 2Q5Q (SEQ ID NO:118), 2VBG (SEQ ID NO:119), 2VBI (SEQ ID NO:120), and 3FZN (SEQ ID NO:121). In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence at least 80% identical to SEQ ID NO:1. In some embodiments, the OAADC comprises the amino acid sequence of SEQ ID NO:1. In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence at least 80% identical to a sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166. In some embodiments, the OAADC comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166.
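For illustration only, the "at least 80% identical" language can be related to a simple percent-identity calculation over two sequences that have already been aligned. The sketch below is one common convention (matches divided by alignment columns) and is an assumption, not the definition of identity used in this disclosure; alignment tools differ in how they count gaps. The toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences have a gap are ignored; a column counts
    as a match only if both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)

# Hypothetical toy alignment (10 columns, 8 identical residues)
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQ-"
identity = percent_identity(reference, candidate)
print(identity, identity >= 80.0)
```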


In some embodiments of any of the above embodiments, the recombinant polynucleotide is stably integrated into a chromosome of the recombinant host cell. In some embodiments of any of the above embodiments, the recombinant polynucleotide is maintained in the recombinant host cell on an extra-chromosomal plasmid. In some embodiments of any of the above embodiments, the polynucleotide encoding the 3-HPDH is an endogenous polynucleotide. In some embodiments of any of the above embodiments, the polynucleotide encoding the 3-HPDH is a recombinant polynucleotide. In some embodiments of any of the above embodiments, the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130. In some embodiments of any of the above embodiments, the 3-HPDH comprises the amino acid sequence of SEQ ID NO:154 or 159. In some embodiments of any of the above embodiments, the recombinant host cell is cultured under anaerobic conditions suitable for the recombinant host cell to convert the substrate to 3-HP. In some embodiments of any of the above embodiments, the substrate comprises glucose. In some embodiments, at least 95% of the glucose metabolized by the recombinant host cell is converted to 3-HP. In some embodiments, 100% of the glucose metabolized by the recombinant host cell is converted to 3-HP. In some embodiments of any of the above embodiments, the substrate is selected from the group consisting of sucrose, fructose, xylose, arabinose, cellobiose, cellulose, alginate, mannitol, laminarin, galactose, and galactan. In some embodiments of any of the above embodiments, the recombinant host cell further comprises a recombinant polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK). In some embodiments, the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163. 
In some embodiments of any of the above embodiments, the recombinant host cell further comprises a modification resulting in decreased production of pyruvate from phosphoenolpyruvate, as compared to a host cell lacking the modification. In some embodiments, the modification results in decreased pyruvate kinase (PK) activity, as compared to a host cell lacking the modification. In some embodiments, the modification results in decreased pyruvate kinase (PK) expression, as compared to a host cell lacking the modification. In some embodiments, the modification comprises an exogenous promoter in operable linkage with an endogenous pyruvate kinase (PK) coding sequence, wherein the exogenous promoter results in decreased endogenous PK coding sequence expression, as compared to expression of the endogenous PK coding sequence in operable linkage with an endogenous PK promoter. In some embodiments, the exogenous promoter is a MET3, CTR1, or CTR3 promoter. In some embodiments, the exogenous promoter comprises a polynucleotide sequence selected from the group consisting of SEQ ID NOs:131-133. In some embodiments, the recombinant host cell further comprises a second modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK), as compared to a host cell lacking the second modification. In some embodiments of any of the above embodiments, the method further comprises substantially purifying the 3-HP. In some embodiments of any of the above embodiments, the method further comprises converting the 3-HP to acrylic acid.


Other aspects of the present disclosure relate to a recombinant host cell comprising a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC), wherein the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. Other aspects of the present disclosure relate to a recombinant host cell comprising a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC), wherein the OAADC has a specific activity of at least 0.1 μmol/min/mg against oxaloacetate. In some embodiments, the recombinant host cell is a recombinant prokaryotic cell. In some embodiments, the prokaryotic cell is an Escherichia coli cell. In some embodiments, the host cell is selected from the group consisting of Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Sclerotina libertina, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Vibrio alginolyticus, Xanthomonas, Zymomonas, and Zymomonas mobilis. In some embodiments, the recombinant host cell is a recombinant fungal host cell.


Other aspects of the present disclosure relate to a recombinant fungal host cell comprising a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC). In some embodiments, the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. In some embodiments, the OAADC has a specific activity of at least 0.1 μmol/min/mg against oxaloacetate.


In some embodiments of any of the above embodiments, the OAADC has a specific activity of at least 10 μmol/min/mg against oxaloacetate. In some embodiments, the OAADC has a specific activity of at least 100 μmol/min/mg against oxaloacetate. In some embodiments of any of the above embodiments, the OAADC has a catalytic efficiency (kcat/KM) for oxaloacetate that is greater than about 2000 M−1 s−1. In some embodiments of any of the above embodiments, the host cell further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH). In some embodiments, the polynucleotide encoding the 3-HPDH is an endogenous polynucleotide. In some embodiments, the polynucleotide encoding the 3-HPDH is a recombinant polynucleotide. In some embodiments, the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130. In some embodiments, the 3-HPDH comprises the amino acid sequence of SEQ ID NO:154 or 159.


In some embodiments of any of the above embodiments, the recombinant fungal host cell is capable of producing 3-HP at a pH lower than 6. In some embodiments, the recombinant host cell is capable of producing 3-HP below the pKa of 3-HP. In some embodiments, the fungal cell is a yeast cell. In some embodiments, the fungal cell is of a genus or species selected from the group consisting of Aspergillus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus terreus, Aspergillus pseudoterreus, Aspergillus usamii, Candida rugosa, Issatchenkia orientalis, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kluyveromyces marxianus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Rhodosporidium toruloides, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Yarrowia lipolytica, and Zygosaccharomyces rouxii.


In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence shown in Table 2 or Table 5A. In some embodiments of any of the above embodiments, the OAADC comprises the amino acid sequence of a polypeptide selected from the group consisting of 4COK (SEQ ID NO:1), A0A0F6SDN1_9DELT (SEQ ID NO:3), 4K9Q (SEQ ID NO:5), 1JSC (SEQ ID NO:15), 3L84_3M34 (SEQ ID NO:19), A0A0F2PQV5_9FIRM (SEQ ID NO:25), A0A0R2PY37_9ACTN (SEQ ID NO:41), X1WK73_ACYPI (SEQ ID NO:43), F4RJP4_MELLP (SEQ ID NO:51), A0A081BQW3_9BACT (SEQ ID NO:53), CAK95977 (SEQ ID NO:55), YP_831380 (SEQ ID NO:57), ZP_06846103 (SEQ ID NO:61), ZP_08570611 (SEQ ID NO:65), WP_010764607.1 (SEQ ID NO:77), YP_005756646.1 (SEQ ID NO:81), WP_018535238.1 (SEQ ID NO:85), YP_006485164.1 (SEQ ID NO:112), YP_005461458.1 (SEQ ID NO:113), YP_006991301.1 (SEQ ID NO:114), WP_003075272.1 (SEQ ID NO:115), WP_020634527.1 (SEQ ID NO:116), 1OVM (SEQ ID NO:117), 2Q5Q (SEQ ID NO:118), 2VBG (SEQ ID NO:119), 2VBI (SEQ ID NO:120), and 3FZN (SEQ ID NO:121). In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence at least 80% identical to SEQ ID NO:1. In some embodiments of any of the above embodiments, the OAADC comprises the amino acid sequence of SEQ ID NO:1. In some embodiments of any of the above embodiments, the OAADC comprises an amino acid sequence at least 80% identical to a sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166. In some embodiments, the OAADC comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166.


In some embodiments of any of the above embodiments, the recombinant polynucleotide is stably integrated into a chromosome of the recombinant host cell. In some embodiments of any of the above embodiments, the recombinant polynucleotide is maintained in the recombinant host cell on an extra-chromosomal plasmid. In some embodiments of any of the above embodiments, the recombinant host cell is capable of producing 3-HP under anaerobic conditions. In some embodiments of any of the above embodiments, the recombinant host cell further comprises a recombinant polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK). In some embodiments, the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163. In some embodiments of any of the above embodiments, the recombinant host cell further comprises a modification resulting in decreased production of pyruvate from phosphoenolpyruvate, as compared to a host cell lacking the modification. In some embodiments, the modification results in decreased pyruvate kinase (PK) activity, as compared to a host cell lacking the modification. In some embodiments, the modification results in decreased pyruvate kinase (PK) expression, as compared to a host cell lacking the modification. In some embodiments, the modification comprises an exogenous promoter in operable linkage with an endogenous pyruvate kinase (PK) coding sequence, wherein the exogenous promoter results in decreased endogenous PK coding sequence expression, as compared to expression of the endogenous PK coding sequence in operable linkage with an endogenous PK promoter. In some embodiments, the exogenous promoter is a MET3, CTR1, or CTR3 promoter. In some embodiments, the exogenous promoter comprises a polynucleotide sequence selected from the group consisting of SEQ ID NOs:131-133. 
In some embodiments, the recombinant host cell further comprises a second modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK), as compared to a host cell lacking the second modification.


Other aspects of the present disclosure relate to a vector comprising a polynucleotide that encodes an amino acid sequence at least 80% identical to a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166. In some embodiments, the polynucleotide encodes the amino acid sequence of SEQ ID NO:1. In some embodiments, the polynucleotide comprises the polynucleotide sequence of SEQ ID NO:2. In some embodiments, the polynucleotide encodes an amino acid sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166. In some embodiments, the vector further comprises a promoter operably linked to the polynucleotide. In some embodiments, the promoter is exogenous with respect to the polynucleotide that encodes the amino acid sequence at least 80% identical to SEQ ID NO:1. In some embodiments, the promoter is a T7 promoter. In some embodiments, the promoter is a TDH or FBA promoter. In some embodiments, the promoter comprises the polynucleotide sequence of SEQ ID NO:135 or 136. In some embodiments, the vector further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH). In some embodiments, the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130. In some embodiments, the 3-HPDH comprises the amino acid sequence of SEQ ID NO:154 or 159.


In some embodiments, the polynucleotide that encodes the sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166 and the polynucleotide encoding the 3-hydroxypropionate dehydrogenase (3-HPDH) are arranged in an operon operably linked to the same promoter. In some embodiments, the promoter is a T7 or phage promoter. In some embodiments, an operon of the present disclosure comprises (a) a polynucleotide that encodes an amino acid sequence at least 80% identical to SEQ ID NO:1 (e.g., SEQ ID NO:2), (b) a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH) (e.g., a polynucleotide encoding a 3-HPDH listed in Table 1 or Table 7A) or a polynucleotide encoding an alcohol dehydrogenase (e.g., comprising the sequence of NCBI GenBank Ref. No. ABX13006 or a polynucleotide encoding an alcohol dehydrogenase listed in Table 7A), and (c) a polynucleotide encoding a phosphoenolpyruvate carboxykinase (e.g., comprising a polynucleotide encoding a phosphoenolpyruvate carboxykinase listed in Table 9A). In some embodiments, the phosphoenolpyruvate carboxykinase is selected from the group consisting of E. coli Pck, NCBI Ref. Seq. No. WP_011201442, NCBI Ref. Seq. No. WP_011978877, NCBI Ref. Seq. No. WP_027939345, NCBI Ref. Seq. No. WP_074832324, and NCBI Ref. Seq. No. WP_074838421. In some embodiments, the 3-HPDH comprises the amino acid sequence of SEQ ID NO: 154 or 159. In some embodiments, the vector further comprises a polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK). In some embodiments, the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163. 
In some embodiments, the polynucleotide that encodes the sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166; the polynucleotide encoding the 3-hydroxypropionate dehydrogenase (3-HPDH); and the polynucleotide encoding the phosphoenolpyruvate carboxykinase (PEPCK) are arranged in an operon operably linked to the same promoter (e.g., a T7 or phage promoter).


It is to be understood that one, some, or all of the properties of the various embodiments described above and herein may be combined to form other embodiments of the present invention. These and other aspects of the present disclosure will become apparent to one of skill in the art. These and other embodiments of the present disclosure are further described by the detailed description that follows.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows the chemical structure of 3-Hydroxypropionic acid (3-HP) and commodity/specialty chemicals that can be derived from 3-HP. The dehydration reaction of 3-HP into acrylic acid is indicated by a box. Adapted from Werpy, T. et al. “Top Value Added Chemicals from Biomass.” US Department of Energy Report, Vol 1, 2004.



FIG. 2A shows the seven known, complex synthesis pathways involving combinations of 19 different metabolic enzymes for the production of 3-HP from glucose. Adapted from Kumar, V. et al. (2013) Biotech. Adv. 31:945-961.



FIG. 2B shows a simplified metabolic pathway for the production of 3-HP from glucose using a 3-oxopropanoate intermediate produced directly from oxaloacetate. The oval indicates a novel enzyme capable of efficiently catalyzing the decarboxylation of oxaloacetate to 3-oxopropanoate.



FIG. 3 depicts the scheme for genomic enzyme mining to identify active oxaloacetate decarboxylases.



FIG. 4 shows log specific activity towards oxaloacetate for 56 candidate enzymes identified by genomic enzyme mining.



FIG. 5 shows the kinetic characterization of the top candidate enzyme identified by genomic enzyme mining, 4COK, on substrates pyruvate (squares) and oxaloacetate (diamonds).



FIG. 6 shows the results of a second round of genomic mining centered around the sequence space of 4COK to identify other candidate OAADCs. A phylogenetic tree of candidate enzymes is shown, along with the corresponding OAADC activity measured for each enzyme (log scale). A clade containing enzymes with the highest measured OAADC activity is indicated.



FIG. 7 shows the activity of candidate 3-hydroxypropionate dehydrogenase (3-HPDH) enzymes towards 3-HP using either NAD+ or NADP+ as a co-factor.



FIG. 8A shows the activity of the candidate 3-HPDH enzyme 2CVZ towards 3-HP using either NAD+ or NADP+ as a co-factor.



FIG. 8B shows the activity of the candidate 3-HPDH enzyme A4YI81 towards 3-HP using either NAD+ or NADP+ as a co-factor.



FIG. 9 shows the activities of the candidate 3-HPDH enzymes 2CVZ and A4YI81 towards 3-HP using NAD+ as a co-factor.



FIG. 10 shows the activities of candidate phosphoenolpyruvate carboxykinase (PEPCK) enzymes from E. coli and A. succinogenes towards PEP.





DETAILED DESCRIPTION

The present disclosure relates generally to methods, host cells, and vectors for producing 3-hydroxypropionate (3-HP). In some embodiments, the methods, host cells, and vectors comprise a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH). Without wishing to be bound by theory, it is thought that a simplified metabolic pathway using an OAADC to convert oxaloacetate into 3-oxopropanoate and a 3-HPDH to convert 3-oxopropanoate into 3-HP (FIG. 2B) would allow for more efficient production of 3-HP than existing pathways (FIG. 2A). For example, it is thought that utilizing this simplified metabolic pathway can result in approximately 100% conversion of glucose into 3-HP. Moreover, this metabolic pathway is active under anaerobic conditions such that host cells can grow and produce 3-HP without aeration, enabling an increased yield and increased scale of production (e.g., larger fermenter size) with lower operating costs (e.g., by eliminating the need for aeration). Finally, this pathway can be carried out using fungal cells, which are typically more tolerant of low pH than bacterial cells. For example, it is thought that using E. coli for large-scale production of 3-HP would lead to acidification of the culture medium, thereby requiring more complicated purification and pH neutralization processes to maintain the pH of the culture within a viable range for E. coli (which can also lead to undesirable waste products, such as gypsum, that raise environmental concerns).
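The relevance of culture pH to the form in which 3-HP is present can be illustrated with the Henderson-Hasselbalch relationship. The sketch below computes the fraction of 3-HP present as the undissociated acid at a given pH. The pKa of about 4.5 is an approximate literature value assumed here and is not a figure stated in this disclosure; the function name is illustrative.

```python
def protonated_fraction(ph, pka):
    """Fraction of a monoprotic acid in its undissociated (protonated) form.

    Derived from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

PKA_3HP = 4.5  # assumption: approximate pKa of 3-hydroxypropionic acid

for ph in (3.0, 4.5, 6.0, 7.0):
    fraction = protonated_fraction(ph, PKA_3HP)
    print(f"pH {ph}: {fraction:.1%} of 3-HP as free acid")
```

Under this assumption, half of the 3-HP is in the free acid form when the pH equals the pKa, and the free acid predominates below it, whereas at near-neutral pH almost all of the product is present as the salt.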


In particular, the present disclosure is based, at least in part, on the demonstration described herein of a method for identifying enzymes with OAADC activity. As one example, 4COK from Gluconacetobacter diazotrophicus was found to have efficient OAADC activity with a particularly strong specific activity using oxaloacetate as a substrate (e.g., as compared to pyruvate and/or 2-ketoisovalerate). Additional enzymes having OAADC activity similar to that of 4COK were also identified, such as A0A0J7KM68_LASNI (SEQ ID NO:145), 5EUJ (SEQ ID NO:146), C7JF72_ACEP3 (SEQ ID NO: 148), and A0A0D6NFJ6_9PROT (SEQ ID NO:166). Moreover, enzymes particularly suitable for catalyzing the other steps of the 3-HP biosynthesis pathway (e.g., PEPCK and 3-HPDH) were also characterized, such as the 3-HPDHs A4YI81 (SEQ ID NO: 154) and 2CVZ (SEQ ID NO: 159) and the PEPCKs from E. coli (SEQ ID NO:162) and A. succinogenes (SEQ ID NO:163).


Methods and Host Cells for Producing 3-hydroxypropionate (3-HP)


Certain aspects of the present disclosure relate to methods of producing 3-HP. In some embodiments, the methods comprise providing a recombinant host cell that comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), wherein the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1; and culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP. In some embodiments, the methods comprise providing a recombinant host cell that comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), wherein the OAADC has a specific activity of at least 0.1 μmol/min/mg against oxaloacetate; and culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP. In some embodiments, the methods comprise providing a recombinant fungal host cell that comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH); and culturing the recombinant fungal host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP. Expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH.


As used herein, “recombinant” or “exogenous” refer to a polynucleotide wherein the exact nucleotide sequence of the polynucleotide is not naturally found in a given host cell, e.g., as the host cell is found in nature. These terms may also refer to a polynucleotide sequence that may be naturally found in (e.g., “endogenous” with respect to) a given host, but in an unnatural (e.g., greater than or less than expected) amount, or additionally if the sequence of a polynucleotide comprises two or more subsequences that are not found in the same relationship to each other in nature. For example, regarding the latter, a recombinant polynucleotide can have two or more sequences from unrelated polynucleotides or from homologous nucleotides arranged to make a new polynucleotide, or a promoter sequence in operable linkage with a coding sequence in an unnatural combination. Specifically, the present disclosure describes the introduction of a recombinant vector into a host cell, wherein the vector contains a polynucleotide coding for a polypeptide that is not normally found in the host cell or contains a foreign polynucleotide coding for a substantially homologous polypeptide that is normally found in the host cell. With reference to the host cell's genome, the polynucleotide sequence that encodes the polypeptide is recombinant or exogenous. “Recombinant” may also be used to refer to a host cell that contains one or more exogenous or recombinant polynucleotides.


The terms “derived from” or “from” when used in reference to a polynucleotide or polypeptide indicate that its sequence is identical or substantially identical to that of an organism of interest. For instance, a 3-HPDH from Saccharomyces cerevisiae refers to a 3-HPDH enzyme having a sequence identical or substantially identical to a native 3-HPDH of Saccharomyces cerevisiae. The terms “derived from” and “from” when used in reference to a polynucleotide or polypeptide do not indicate that the polynucleotide or polypeptide in question was necessarily directly purified, isolated, or otherwise obtained from an organism of interest. By way of example, an isolated polynucleotide containing a 3-HPDH coding sequence of Saccharomyces cerevisiae need not be obtained directly from a Saccharomyces cerevisiae cell. Instead, the isolated polynucleotide may be prepared synthetically using methods known to one of skill in the art, including but not limited to polymerase chain reaction (PCR) and/or standard recombinant cloning techniques.


“Percent (%) amino acid sequence identity” with respect to a reference polypeptide sequence refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the reference polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. When comparing two sequences for identity, it is not necessary that the sequences be contiguous, but any gap would carry with it a penalty that would reduce the overall percent identity. For blastn, the default parameters are Gap opening penalty=5 and Gap extension penalty=2. For blastp, the default parameters are Gap opening penalty=11 and Gap extension penalty=1. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, by the local homology algorithm of Smith and Waterman, Adv Appl Math, 2:482, 1981; by the homology alignment algorithm of Needleman and Wunsch, J Mol Biol, 48:443, 1970; by the search for similarity method of Pearson and Lipman, Proc Natl Acad Sci USA, 85:2444, 1988; by computerized implementations of these algorithms, such as FASTDB (Intelligenetics), the BLAST or BLAST 2.0 algorithms (Altschul et al., J Mol Biol, 215:403-410, 1990; and Altschul et al., Nuc Acids Res, 25:3389-3402, 1997, respectively), GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package (Genetics Computer Group, Madison, Wis.), PILEUP (Feng and Doolittle, J Mol Evol, 35:351-360, 1987), and the CLUSTALW program (Thompson et al., Nucl Acids Res, 22:4673-4680, 1994); or by manual alignment and visual inspection. Suitable parameters for any of these exemplary algorithms, such as gap open and gap extension penalties, scoring matrices (see, e.g., the BLOSUM62 scoring matrix of Henikoff and Henikoff, Proc Natl Acad Sci USA, 89:10915, 1992), and the like can be selected by one of ordinary skill in the art.
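
By way of non-limiting illustration, percent identity can be computed from a pair of sequences that have already been aligned by any of the tools listed above. The following is a minimal sketch only: the function names, the use of "-" as the gap character, and the choice of the number of aligned columns as the denominator (so that gaps lower the result) are assumptions made for this example, and the sequence fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column is identical only if both residues match and neither is a gap,
    # so every gap column counts against the overall identity.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical fragments: 9 of 10 aligned columns are identical.
reference = "ACDEFGHIKL"
candidate = "ACDEFG-IKL"
print(percent_identity(reference, candidate))          # 90.0
print(meets_identity_threshold(reference, candidate))  # True
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), so a reported value should always state which convention and alignment parameters were used.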


The terms “coding sequence” and “open reading frame (ORF)” refer to a sequence of codons extending from an initiator codon (ATG) to a terminator codon (TAG, TAA, or TGA), which can be translated into a polypeptide.
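
By way of non-limiting illustration, the definition above can be expressed as a simple check on a DNA string. This is a minimal sketch under stated assumptions: the function name is hypothetical, the input is taken to be the sense strand read in frame from its first base, and only the standard initiator and terminator codons named above are considered.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}


def is_open_reading_frame(dna: str) -> bool:
    """Check that `dna` runs in frame from ATG to the first stop codon."""
    seq = dna.upper()
    # Need at least a start and a stop codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # No terminator codon may occur before the final codon.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(is_open_reading_frame("ATGAAATAA"))     # True
print(is_open_reading_frame("ATGTAAAAATAA"))  # False: internal stop codon
```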


The terms “decrease,” “reduce” and “reduction” as used in reference to biological function (e.g., enzymatic activity, production of compound, expression of a protein, etc.) refer to a measurable lessening in the function by at least 10%, at least 50%, at least 75%, or at least 90%. Depending upon the function, the reduction may be from 10% to 100%. The term “substantial reduction” and the like refer to a reduction of at least 50%, 75%, 90%, 95%, or 100%.


The terms “increase,” “elevate” and “enhance” as used in reference to biological function (e.g., enzymatic activity, production of compound, expression of a protein, etc.) refer to a measurable augmentation in the function by at least 10%, at least 50%, at least 75%, or at least 90%. Depending upon the function, the elevation may be from 10% to 100%; or at least 10-fold, 100-fold, or 1000-fold up to 100-fold, 1000-fold or 10,000-fold or more. The term “substantial elevation” and the like refer to an elevation of at least 50%, 75%, 90%, 95%, or 100%.
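
By way of non-limiting illustration, the percentage thresholds used in the definitions of “decrease” and “increase” can be applied to a pair of measurements as sketched below. The function names are hypothetical, the 10% and 50% cut-offs are simply the lowest values recited for a measurable and a substantial change, respectively, and the input values are invented for the example.

```python
def percent_change(baseline: float, observed: float) -> float:
    """Signed change of `observed` relative to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (observed - baseline) / baseline


def classify_change(baseline: float, observed: float,
                    minimum: float = 10.0, substantial: float = 50.0) -> str:
    """Label a change using assumed 10% (measurable) and 50% (substantial) cut-offs."""
    change = percent_change(baseline, observed)
    magnitude = abs(change)
    if magnitude < minimum:
        return "below threshold"
    direction = "increase" if change > 0 else "decrease"
    return f"substantial {direction}" if magnitude >= substantial else direction


print(classify_change(100.0, 40.0))   # substantial decrease (60% lower)
print(classify_change(100.0, 120.0))  # increase (20% higher)
```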


Oxaloacetate Decarboxylases


Certain aspects of the present disclosure relate to oxaloacetate decarboxylase (OAADC) enzymes and recombinant polynucleotides related thereto. As used herein, an oxaloacetate decarboxylase (OAADC) is capable of catalyzing the reaction converting oxaloacetate to 3-oxopropanoate (also known as malonate semialdehyde). The discovery of enzymes capable of catalyzing this reaction with sufficient efficiency for enabling large-scale processes (e.g., production of 3-HP) is described and demonstrated herein.


In some embodiments, the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. In some embodiments, the OAADC has at least about 20% activity using oxaloacetate as a substrate as compared to its activity using pyruvate as a substrate. Exemplary assays for determining enzymatic activity against pyruvate or oxaloacetate (e.g., using pyruvate or oxaloacetate as a substrate) are described in greater detail in Examples 1 and 2 below.
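
By way of non-limiting illustration, the two formulations above describe the same limit: a pyruvate-to-oxaloacetate activity ratio of 5:1 corresponds to an oxaloacetate activity that is 20% of the pyruvate activity. The sketch below makes this explicit. The function names are hypothetical, both activities are assumed to be measured in the same units under the same assay conditions, the activity values are invented, and the qualifier “about” is not modeled.

```python
def pyruvate_to_oxaloacetate_ratio(activity_pyruvate: float,
                                   activity_oxaloacetate: float) -> float:
    """Ratio of activity against pyruvate to activity against oxaloacetate."""
    if activity_oxaloacetate <= 0:
        raise ValueError("oxaloacetate activity must be positive")
    return activity_pyruvate / activity_oxaloacetate


def relative_oxaloacetate_activity(activity_pyruvate: float,
                                   activity_oxaloacetate: float) -> float:
    """Oxaloacetate activity as a percentage of pyruvate activity."""
    if activity_pyruvate <= 0:
        raise ValueError("pyruvate activity must be positive")
    return 100.0 * activity_oxaloacetate / activity_pyruvate


def satisfies_selectivity(activity_pyruvate: float,
                          activity_oxaloacetate: float,
                          max_ratio: float = 5.0) -> bool:
    """True if the pyruvate:oxaloacetate ratio is at most `max_ratio`:1."""
    ratio = pyruvate_to_oxaloacetate_ratio(activity_pyruvate, activity_oxaloacetate)
    return ratio <= max_ratio


# Invented activities in arbitrary but identical units.
print(pyruvate_to_oxaloacetate_ratio(50.0, 10.0))   # 5.0
print(relative_oxaloacetate_activity(50.0, 10.0))   # 20.0
print(satisfies_selectivity(50.0, 10.0))            # True
```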


In some embodiments, an OAADC of the present disclosure has a ratio of activity against oxaloacetate to activity against 2-ketoisovalerate that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350. For example, as described herein, 4COK from Gluconacetobacter diazotrophicus was demonstrated to possess approximately 390-fold greater activity towards oxaloacetate than 2-ketoisovalerate. Additional OAADCs with similar enzymatic activity to that of 4COK were also identified, such as A0A0J7KM68_LASNI (SEQ ID NO:145), 5EUJ (SEQ ID NO:146), C7JF72_ACEP3 (SEQ ID NO:148), and A0A0D6NFJ6_9PROT (SEQ ID NO:166), as described in greater detail in Example 2 below. In some embodiments, an OAADC of the present disclosure has a ratio of activity against oxaloacetate to activity against 2-ketoisovalerate that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350 and a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. Exemplary assays for determining enzymatic activity against pyruvate, 2-ketoisovalerate, or oxaloacetate (e.g., using pyruvate, 2-ketoisovalerate, or oxaloacetate as a substrate) are described in greater detail in Examples 1 and 2 below.


In some embodiments, an OAADC of the present disclosure has a ratio of activity against oxaloacetate to activity against 4-methyl-2-oxovaleric acid that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350. In some embodiments, an OAADC of the present disclosure has a ratio of activity against oxaloacetate to activity against 4-methyl-2-oxovaleric acid that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350 and a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. The exemplary assays for determining enzymatic activity against pyruvate, 2-ketoisovalerate, or oxaloacetate (e.g., using pyruvate, 2-ketoisovalerate, or oxaloacetate as a substrate) described in Example 1 below can readily be modified to measure activity against 4-methyl-2-oxovaleric acid by one of skill in the art.


In some embodiments, an OAADC of the present disclosure has a specific activity of at least 0.1 μmol/min/mg, at least 10 μmol/min/mg, or at least 100 μmol/min/mg against oxaloacetate. In some embodiments, an OAADC of the present disclosure has a specific activity against oxaloacetate of at least about 0.1, at least about 0.5, at least about 1, at least about 5, at least about 10, at least about 25, at least about 50, at least about 75, at least about 100, at least about 200, at least about 300, at least about 400, at least about 500, at least about 600, at least about 700, at least about 800, at least about 900, at least about 1000, at least about 2000, at least about 3000, at least about 4000, or at least about 5000 μmol/min/mg. For example, as described herein, 4COK from Gluconacetobacter diazotrophicus was demonstrated to possess a specific activity against oxaloacetate of approximately 5500 μmol/min/mg. Additional OAADCs with similar enzymatic activity to that of 4COK were also identified, such as A0A0J7KM68_LASNI (SEQ ID NO:145), 5EUJ (SEQ ID NO:146), C7JF72_ACEP3 (SEQ ID NO:148), and A0A0D6NFJ6_9PROT (SEQ ID NO:166), as described in greater detail in Example 2 below. In some embodiments, an OAADC of the present disclosure has a specific activity of at least 0.1 μmol/min/mg, at least 10 μmol/min/mg, or at least 100 μmol/min/mg against oxaloacetate and a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1. In some embodiments, an OAADC of the present disclosure has a specific activity of at least 0.1 μmol/min/mg, at least 10 μmol/min/mg, or at least 100 μmol/min/mg against oxaloacetate and a ratio of activity against oxaloacetate to activity against 2-ketoisovalerate that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350.
In some embodiments, an OAADC of the present disclosure has a specific activity of at least 0.1 μmol/min/mg, at least 10 μmol/min/mg, or at least 100 μmol/min/mg against oxaloacetate, a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1, and a ratio of activity against oxaloacetate to activity against 2-ketoisovalerate that is greater than or equal to about 5, about 10, about 25, about 50, about 75, about 100, about 150, about 200, about 250, about 300, or about 350. Exemplary assays for determining specific activity against oxaloacetate (e.g., using oxaloacetate as a substrate) are described in greater detail in Example 1 below. In some embodiments, specific activity refers to enzymatic conversion of oxaloacetate into 3-oxopropanoate.
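
By way of non-limiting illustration, specific activity in μmol/min/mg can be obtained from three assay quantities: the amount of substrate converted, the reaction time, and the mass of enzyme present. The sketch below is not the assay of Example 1; the function names and the numerical inputs are hypothetical, and the threshold list simply repeats the 0.1, 10, and 100 μmol/min/mg values recited above.

```python
THRESHOLDS_UMOL_MIN_MG = (0.1, 10.0, 100.0)


def specific_activity(converted_umol: float, minutes: float,
                      protein_mg: float) -> float:
    """Specific activity in umol of substrate converted per minute per mg protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("time and protein mass must be positive")
    return converted_umol / minutes / protein_mg


def thresholds_met(activity: float) -> list:
    """Return the recited thresholds that `activity` meets or exceeds."""
    return [t for t in THRESHOLDS_UMOL_MIN_MG if activity >= t]


# Invented assay: 30 umol oxaloacetate converted in 10 min by 0.25 mg enzyme.
activity = specific_activity(30.0, 10.0, 0.25)
print(activity)                  # 12.0
print(thresholds_met(activity))  # [0.1, 10.0]
```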


In some embodiments, an OAADC of the present disclosure is expressed in a host cell at up to 1% of total protein. In some embodiments, an OAADC and a 3-HPDH of the present disclosure have a combined expression in a host cell of up to 1% of total protein.


In some embodiments, an OAADC of the present disclosure has a catalytic efficiency (kcat/KM) for oxaloacetate that is greater than about 500, 1000, or 2000 M−1 s−1. For example, as described herein, 4COK from Gluconacetobacter diazotrophicus was demonstrated to possess a catalytic efficiency for oxaloacetate of approximately 2296.4 M−1 s−1. Exemplary assays for determining catalytic efficiency and other rate constants using oxaloacetate as a substrate are described in greater detail in Example 1 below. Additional OAADCs with similar enzymatic activity to that of 4COK were also identified, such as A0A0J7KM68_LASNI (SEQ ID NO:145), 5EUJ (SEQ ID NO:146), C7JF72_ACEP3 (SEQ ID NO:148), and A0A0D6NFJ6_9PROT (SEQ ID NO:166), as described in greater detail in Example 2 below.
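
By way of non-limiting illustration, a catalytic efficiency in M−1 s−1 can be derived from a maximal specific activity, the molecular weight of the enzyme, and KM. The sketch below assumes a pure enzyme preparation with one active site per polypeptide of the stated molecular weight; the function names and all numerical inputs are hypothetical and are not measurements from the Examples.

```python
def kcat_per_second(vmax_umol_min_mg: float, molecular_weight_g_mol: float) -> float:
    """Turnover number (s^-1) from Vmax in umol/min/mg and enzyme MW in g/mol."""
    # 1 mg of enzyme contains 1000 / MW umol of enzyme, so
    # kcat (min^-1) = Vmax * MW / 1000; divide by 60 to convert to s^-1.
    return vmax_umol_min_mg * molecular_weight_g_mol / 1000.0 / 60.0


def catalytic_efficiency(kcat_s: float, km_molar: float) -> float:
    """kcat/KM in M^-1 s^-1, with KM given in mol/L."""
    if km_molar <= 0:
        raise ValueError("KM must be positive")
    return kcat_s / km_molar


def thresholds_exceeded(efficiency: float) -> list:
    """Return the recited kcat/KM thresholds that `efficiency` exceeds."""
    return [t for t in (500.0, 1000.0, 2000.0) if efficiency > t]


# Invented inputs: Vmax = 60 umol/min/mg, MW = 60,000 g/mol, KM = 20 mM.
kcat = kcat_per_second(60.0, 60000.0)
efficiency = catalytic_efficiency(kcat, 0.02)
print(kcat)                             # 60.0
print(thresholds_exceeded(efficiency))  # [500.0, 1000.0, 2000.0]
```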


In some embodiments, an OAADC of the present disclosure comprises an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to an amino acid sequence shown in Table 2. In some embodiments, an OAADC of the present disclosure is encoded by a polynucleotide sequence shown in Table 2.


In some embodiments, an OAADC of the present disclosure comprises an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQLLLNTDMQQIYCSNELNCGFSAEGYARANGAAAAIVTFSVGALSAFNALGGAYAENLPVILISGAPNANDHGTGHILHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKIDHVIRTALREKKPAYLEIACNVAGAPCVRPGGIDALLSPPAPDEASLKAAVDAALAFIEQRGSVTMLVGSRIRAAGAQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGHYWGEVSSPGAQQAVEGADGVICLAPVFNDYATVGWSAWPKGDNVMLVERHAVTVGGVAYAGIDMRDFLTRLAAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMARQIGALLTPRTTLTAETGDSWFNAVRMKLPHGARVELEMQWGHIGWSVPAAFGNALAAPERQHVLMVGDGSFQLTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGPYNNVKNWDYAGLMEVFNAGEGNGLGLRARTGGELAAAIEQARANRNGPTLIECTLDRDDCTQELVTWGKRVAAANARPPRAG (SEQ ID NO:1). In some embodiments, an OAADC of the present disclosure comprises the amino acid sequence MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQLLLNTDMQQIYCSNELNCGFSAEGYARANGAAAAIVTFSVGALSAFNALGGAYAENLPVILISGAPNANDHGTGHILHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKIDHVIRTALREKKPAYLEIACNVAGAPCVRPGGIDALLSPPAPDEASLKAAVDAALAFIEQRGSVTMLVGSRIRAAGAQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGHYWGEVSSPGAQQAVEGADGVICLAPVFNDYATVGWSAWPKGDNVMLVERHAVTVGGVAYAGIDMRDFLTRLAAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMARQIGALLTPRTTLTAETGDSWFNAVRMKLPHGARVELEMQWGHIGWSVPAAFGNALAAPERQHVLMVGDGSFQLTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGPYNNVKNWDYAGLMEVFNAGEGNGLGLRARTGGELAAAIEQARANRNGPTLIECTLDRDDCTQELVTWGKRVAAANARPPRAG (SEQ ID NO:1).
In some embodiments, an OAADC of the present disclosure comprises an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence of GenBank/NCBI RefSeq Accession Nos. AIG13066, WP_012554212, and/or WP_012222411.


In some embodiments, an OAADC of the present disclosure is encoded by the polynucleotide sequence of SEQ ID NO:2.


In some embodiments, an OAADC of the present disclosure has a specific activity against oxaloacetate of at least about 10 μmol/min/mg. In some embodiments, an OAADC of the present disclosure comprises the amino acid sequence of a polypeptide selected from the group consisting of 4COK (SEQ ID NO:1), A0A0F6SDN1_9DELT (SEQ ID NO:3), 4K9Q (SEQ ID NO:5), 1JSC (SEQ ID NO:15), 3L84_3M34 (SEQ ID NO:19), A0A0F2PQV5_9FIRM (SEQ ID NO:25), A0A0R2PY37_9ACTN (SEQ ID NO:41), X1WK73_ACYPI (SEQ ID NO:43), F4RJP4_MELLP (SEQ ID NO:51), A0A081BQW3_9BACT (SEQ ID NO:53), CAK95977 (SEQ ID NO:55), YP_831380 (SEQ ID NO:57), ZP_06846103 (SEQ ID NO:61), ZP_08570611 (SEQ ID NO:65), WP_010764607.1 (SEQ ID NO:77), YP_005756646.1 (SEQ ID NO:81), WP_018535238.1 (SEQ ID NO:85), YP_006485164.1 (SEQ ID NO:112), YP_005461458.1 (SEQ ID NO:113), YP_006991301.1 (SEQ ID NO:114), WP_003075272.1 (SEQ ID NO:115), WP_020634527.1 (SEQ ID NO:116), 1OVM (SEQ ID NO:117), 2Q5Q (SEQ ID NO:118), 2VBG (SEQ ID NO:119), 2VBI (SEQ ID NO:120), and 3FZN (SEQ ID NO:121). Additional OAADCs with similar enzymatic activity to that of 4COK were also identified, such as A0A0J7KM68_LASNI (SEQ ID NO:145), 5EUJ (SEQ ID NO:146), C7JF72_ACEP3 (SEQ ID NO:148), and A0A0D6NFJ6_9PROT (SEQ ID NO:166).


In some embodiments, an OAADC of the present disclosure comprises a sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence of A0A0J7KM68_LASNI, 5EUJ, or C7JF72_ACEP3 (see Table 5A). In some embodiments, an OAADC of the present disclosure comprises a sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166. In some embodiments, an OAADC of the present disclosure comprises a sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a sequence selected from the group consisting of SEQ ID NOs: 145, 146, 148, and 166. In some embodiments, an OAADC of the present disclosure comprises the sequence of A0A0J7KM68_LASNI, 5EUJ, C7JF72_ACEP3, or A0A0D6NFJ6_9PROT (see Table 5A). In some embodiments, an OAADC of the present disclosure comprises a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166. In some embodiments, an OAADC of the present disclosure comprises a sequence selected from the group consisting of SEQ ID NOs: 145, 146, 148, and 166.


In some embodiments, an OAADC of the present disclosure has a sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a sequence shown in Table 5A.









TABLE 5A

Candidate OAADC sequences.

Enzyme name
Amino acid sequence

G6EYP0_9PROT
MEYTVGQYLATRLAQLGLNHVFAVAGDYNLTLLDEMAKAKDLEQVYCCNEL
NCGFAGEGYARARIMGASVVTFSVGAFSAFNAVGGAFAENLPLLLISGAPNNN
DYGSGHILHHTMGYSDYRYQMEMAKKITCEAVSVAHADEAPCLIDHAIRSAIR
NRKPAYIEISCNVANQPCTEPGPISSITNSLISDDESLKAAAKACVEALEKAKNPV
VIIGGKIRSAGCAVSKQVAELTKKLGCAVATMAQAKGLSPEEEAEYVGTFWGD
ISSPGVEDLVRDSDCRIYIGAVFNDYSTVGWTCKLVSDNDILISSHHTRVGKKEF
SGVYLKDFIPVLASSVKKNTTSLEQFKAKKLPAKETPVADGNAALTTVELCRQI
QGAINKDTTLFLETGDSWFHGMHFNLPNGARVESEMQWGHIGWSIPSMFGYAV
SEPNRRNIIMVGDGSFQLTAQEVCQMIRRNMPVIIILINNSGYTIEVKIHDGPYNRI
KNWDYAGLIDVFNAEDGKGLGLKAKNGAELEKAMKTALAHKDGPTLIEVDID
AQDCSPDLVVWGKKVAKANGRAPRKAGGSG (SEQ ID NO: 137)

W7DU13_9PROT
MKYTVGQYLATRLAQLGLNHVFAVAGDYNLTLLDEMAKVEDLEQVYCCNEL
NCGFAGEGYARSRVMGASVVTFSVGAFSAFNAVGGAFAENLPLLLISGAPNNN
DYGSGHILHHTMGYSDYRYQMDMAKQITCEAVSVAHADEAPCLIDHAIRSALR
NRKPAYIEISCNVANQPCTEPGPISSITNSLISDDESLKAAAKACLDALEKAKSPV
VIIGGKIRSAGCAVSKKVAELTKKLGCAVATMAQAKGLSPEEEAEYVGTFWGEI
SSPGVEELVRESDCRIYIGAVFNDYSTVGWTCKLVGENDILISSHHTRVGHKEFS
GVYLKDFIPVTTSCVKKNTTSLDQFKAKKIPVKQVPVADGKAPLTTVELCRQIQ
GAINKDTTIYLETGDSWFHGMHFKLPNGARVESEMQWGHIGWSIPSMFGYAVS
EPNRRNIIMVGDGSFQLTAQEVCQMIRRNIPIIIILINNSGYTIEVKIHDGPYNRIKN
WDYAGLINVFNAEDGKGLGLKAKNGAELEKAMQTALAHKDGPTLIEVDIDAQ
DCSPDLVVWGKKVAKANGRAPRKFQTFGGSG (SEQ ID NO: 138)

I4H6Y9_MICAE_1
MSNYNVGTYLAERLVQIGVKHHFVVPGDYNLVLLDQFLKNQNLLQVGCCNEL
NCGFAAEGYARANGLGVAVVTYSVGALSALNAIGGAYAENLPVILVSGAPNTN
DYSTGHLLHHTMGTQDLTYVLEIARKLTCAAVSITSAEDAPEQIDHVIRTALREQ
KPAYIEIACNIAAAPCASPGPVSAIINEVPSDAETLAAAVSAAAEFLDSKQKPVLL
IGSQLRAAKAEQEAIELAEALGCSVAVMAAAKSFFPEEHPQYVGTYWGEISSPG
TSAIVDWSDAVVCLGAVFNDYSTVGWTAMPSGPTVLNANKDSVKFDGYHFSGI
HLRDFLSCLARKVEKRDATMAEFARFRSTSVPVEPARSEAKLSRIEMLRQIGPLV
TAKTTVFAETGDSWFNGMKLQLPTGARFEIEMQWGHIGWSIPAAFGYALGAPE
RQIICMIGDGSFQLTAQEVAQMIRQKLPIIIFLVNNHGYTIEVEIHDGPYNNIKNW
DYAGLIKVFNAEDGAGQGLLATTAGELAQAIEVALENREGPTLIECVIDRDDAT
ADLISWGRAVAVANARPHRGGSG (SEQ ID NO: 139)

A0A094IGF4_9PEZI
MATFTVGDYLAERLAQIGIRHHFVVPGDYNLILLDKLQSHPDLSELGCANELNC
SLAAEGYARAQGVAACIVTYSVGAFSAFNGTGSAYAENLPLILVSGSPNTNDSA
KFHLLHHTLGTNDFTYQFEMAKKITCCAVAVGRAQDAPRLIDQAIRAALLAKK
PAYIEIPTNLSGAMCVRPGPISAVVEPVLSDKASLTAAVDRAVQYLCGKQKPAIL
VGPKLRRAGAEMALLQVAEAIGCAVAVQPAAKGFFPEDHKQFAGVFWGQVST
LAADSILNWADTILCVGTIFTDYSTVGWTALPNVPLMIAEMDHYMFPGATFGR
VRLNDFLSGLAKTVGRNESTMVEYGYIRPDPPLVHAAAPDELLNRKETARQVQ
MLLTPETTVFVDTGDSWFNGIRMKLPRGASFEIEMQWGHIGWSIPAAFGYAMG
KPERKVITMVGDGSFQMTAQEVSQMVRYKVPIIIFLINNKGYTIEVEIHDGLYNR
IKNWDYALLVRAFNSNDGQAIGFRASTGRELAEAIEKAKAHKDGPTLIECVIDQ
DDCSRELITWGHYVAAANARPPVQTGGSG (SEQ ID NO: 140)

A0A0D2CX28_9EURO
MSWTVGSYLAERLAQIGIEHHFVVPGDYNLVLLDKLQAHPKLSEIGCANELNCS
FAAEGYARAKGVAAAVVTFSVGAFSAFNGVGGAYAENLPVILISGAPNTSDSG
AFHLLHHTLGTHDFGYQLEMAKKITCAAVAIRRAQDAPRLIDHAIRSAMSAKKP
AYIEIPTNLSIANCPAPGPISAVIAPERSDEITLAMAVNAALDWLKSKQKPVLLAG
PKLRAAGAEAAFLQLADALGCAVAVLPGAKSFFPEDHKQFVGVYWGQVSTMG
ADAIVDWSDGIFGAGVVFTDYSTVGWTALPPDSITLTADLDHMSFTGAEFNRV
QLAELLSALAERATRNSSTMVEYAHLRPDVLFPHIEEPKLPLHRNEIARQIQQLL
QPKTTLFVETGDSWFNGVQMRLPRSCRFEIEMQWGHIGWSVPASFGYAVGSPE
RQIILMVGDGSFQMTVQEVSQMVRARLPIIIFLMNNRGYTIEVEIHDGLYNRIKN
WNYASLIEAFNAEDGHAKGIKASNPEQLAQAIKLATSNSDGPTLIECVIDQDDCT
RELITWGHYVASANARPPAHKGGSG (SEQ ID NO: 141)

H6C7K9_EXODN
MRCMSVPSMTFSRHTLRSCATSSDRMTGAPRKPFITSIKRQHQQPWHSICPNVTI
IMSWTVGSYLAERLSQIGIEHHFVVPGDYNLVLLDQLQAHPKLSEIGCANELNC
SFAAEGYARAKGVAAAVVTFSVGAFSAFNGLGGAYAENLPVILISGSPNTNDAG
AFHLLHHTLGTHDFEYQRQIAEKITCAAVAVRRAQDAPRLIDHATRSALLAKKP
SYIEIPTNLSNVTCPAPGPISAVIAPEPSDEPTLAAAVHAATNWLKAKQKPILLAG
PKLRAAGGEAGFLQLAEAIGCAVAVMPGAKSFFPEDHKQFVGVYWGQASTMG
ADAIVDWADGIFGAGLVFTDYSTVGWTAIPSESITLNADLDNMSFPGATFNRVR
LADLLSALAKEATPNPSTMVEYARLRPDILPPHHEQPKLPLHRVEIARQIQELLH
PKTTLFAETGDSWFNAMQMNLPRDCRFEIEMQWGHIGWSVPASFGYAVGAPE
RQVLLMIGDGSFQMTAQEVSQMVRSKVPIIIFLMNNGGYTIEVEIHDGLYNRIKN
WNYAAMMEVFNAGDGHAKGIKASNPEQLAQAIKLAKSNSEGPTLIECIIDQDD
CTKELITWGHYVATANGRPPAHTGGSG (SEQ ID NO: 142)

PDC2_SCHPO
MTKDAESTMTVGTYLAQRLVEIGIKNHFVVPGDYNLRLLDFLEYYPGLSEIGCC
NELNCAFAAEGYARSNGIACAVVTYSVGALTAFDGIGGAYAENLPVILVSGSPN
TNDLSSGHLLHHTLGTHDFEYQMEIAKKLTCAAVAIKRAEDAPVMIDHAIRQAI
LQHKPVYIETPTNMANQPCPVPGPISAVISPEISDKESLEKATDIAAELTSKKEKPIL
LAGPKLRAAGAESAFVKLAEALNCAAFIMPAAKGFYSEEHKNYAGVYWGEVS
SSETTKAVYESSDLVIGAGVLFNDYSTVGWRAAPNPNILLNSDYTSVSIPGYVFS
RVYMAEFLELLAKKVSKKPATLEAYNKARPQTVVPKAAEPKAALNRVEVMRQ
IQGLVDSNTTLYAETGDSWFNGLQMKLPAGAKFEVEMQWGHIGWSVPSAMGY
AVAAPERRTIVMVGDGSFQLTGQEISQMIRHKLPVLIFLLNNRGYTIEIQIHDGPY
NRIQNWDFAAFCESLNGETGKAKGLHAKTGEELTSAIKVALQNKEGPTLIECAI
DTDDCTQELVDWGKAVRSANARPPTADNGGSG (SEQ ID NO: 143)

1ZPD
MSYTVGTYLAERLVQIGLKHHFAVAGDYNLVLLDNLLLNKNMEQVYCCNELN
CGFSAEGYARAKGAAAAVVTYSVGALSAFDAIGGAYAENLPVILISGAPNNND
HAAGHVLHHALGKTDYHYQLEMAKNITAAAEAIYTPEEAPAKIDHVIKTALRE
KKPVYLEIACNIASMPCAAPGPASALFNDEASDEASLNAAVDETLKFIANRDKV
AVLVGSKLRAAGAEEAAVKFTDALGGAVATMAAAKSFFPEENALYIGTSWGE
VSYPGVEKTMKEADAVIALAPVFNDYSTTGWTDIPDPKKLVLAEPRSVVVNGIR
FPSVHLKDYLTRLAQKVSKKTGSLDFFKSLNAGELKKAAPADPSAPLVNAEIAR
QVEALLTPNTTVIAETGDSWFNAQRMKLPNGARVEYEMQWGHIGWSVPAAFG
YAVGAPERRNILMVGDGSFQLTAQEVAQMVRLKLPVIIFLINNYGYTIEVMIHD
GPYNNIKNWDYAGLMEVFNGNGGYDSGAAKGLKAKTGGELAEAIKVALANT
DGPTLIECFIGREDCTEELVKWGKRVAAANSRKPVNKVV (SEQ ID NO: 144)

4COK
MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQLLLNTDMQQIYCSNELN
CGFSAEGYARANGAAAAIVTFSVGALSAFNALGGAYAENLPVILISGAPNANDH
GTGHILHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKIDHVIRTALREKK
PAYLEIACNVAGAPCVRPGGIDALLSPPAPDEASLKAAVDAALAFIEQRGSVTM
LVGSRIRAAGAQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGHYWGEVS
SPGAQQAVEGADGVICLAPVFNDYATVGWSAWPKGDNVMLVERHAVTVGGV
AYAGIDMRDFLTRLAAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMARQI
GALLTPRTTLTAETGDSWFNAVRMKLPHGARVELEMQWGHIGWSVPAAFGNA
LAAPERQHVLMVGDGSFQLTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGP
YNNVKNWDYAGLMEVFNAGEGNGLGLRARTGGELAAAIEQARANRNGPTLIE
CTLDRDDCTQELVTWGKRVAAANARPPRAG (SEQ ID NO: 1)

A0A0J7KM68_LASNI
MSYTVGQYLADRLVQIGLKDHFAIAGDYNLVLLDQFLKNKNWNQIYDCNELN
CGFAAEGYARANGAAACVVTYTVGAISAMNSALAGAYAENLPVLCISGAPNC
NDYGSGRILHHTIGKPEFTQQLDMVKHVTCAAESVVQASEAPAKIDHVIRTMLL
EQRPAYIDIACNISGLECPRPGPIEDLLPQYAADNKSLTSAIDAIAKKIEASQKVTL
YVGPKVRPGKAKEASVKLADALGCAVTVGPASMSFFPAKHPGFRGTYWGIVST
GDANKVVEEAETLIVLGPNWNDYATVGWKAWPKGPRVVTIDEKAAQVDGQV
FSGLSMKALVEGLAKKVSKKPATAEGTKAPHFEYPVAKPDAKLTNAEMARQIN
AILDDNTTLHAETGDSWFNVKNMNWPNGLRIESEMQYGHIGWSIPSGFGGAIGS
PERKHIIMCGDGSFQLTCQEVSQMIRYKLPVTIFLIDNHGYGIEIAIHDGPYNYIQ
NWNFTKLMEVFNGEGEECPYSHNKNGKSGLGLKATTPAELADAIKQAEANKE
GPTLIQVVIDQDDCTKDLLTWGKEVAKTNARSPVVTDKAGGSG (SEQ ID NO: 145)

5EUJ
MYTVGMYLAERLAQIGLKHHFAVAGDYNLVLLDQLLLNKDMEQVYCCNELN
CGFSAEGYARARGAAAAIVTFSVGAISAMNAIGGAYAENLPVILISGSPNTNDY
GTGHILHHTIGTTDYNYQLEMVKHVTCAAESIVSAEEAPAKIDHVIRTALRERKP
AYLEIACNVAGAECVRPGPINSLLRELEVDQTSVTAAVDAAVEWLQDRQNVV
MLVGSKLRAAAAEKQAVALADRLGCAVTIMAAAKGFFPEDHPNFRGLYWGEV
SSEGAQELVENADAILCLAPVFNDYATVGWNSWPKGDNVMVMDTDRYTFAG
QSFEGLSLSTFAAALAEKAPSRPATTQGTQAPVLGIEAAEPNAPLTNDEMTRQIQ
SLITSDTTLTAETGDSWFNASRMPIPGGARVELEMQWGHIGWSVPSAFGNAVGS
PERRHIMMVGDGSFQLTAQEVAQMIRYEIPVIIFLINNRGYVIEIAIHDGPYNYIK
NWNYAGLIDVFNDEDGHGLGLKASTGAELEGAIKKALDNRRGPTLIECNIAQD
DCTETLIAWGKRVAATNSRKPQAGGSG (SEQ ID NO: 146)

2584327140_EU61DRAFT
MAYTVGMYLAERLAQIGLKHHFAVAGDYNLVLLDQLLLNKDMEQIYCCNELN
CGFSAEGYARAHGAAAAVVTFSVGAISAMNAIGGAYAENLPVILISGSPNSNDY
GSGHILHHTLGTTDYGYQLEMARHVTCAAESITDAASAPAKIDHVIRTALRERK
PAYLEIACNVSSAECPRPGPVSSLLAEPATDPVSLKAALEASLSALNKAERVVML
VGSKIRAADAQAQAVELADRLGCAVTIMSAAKGFFPEDHPGFRGLYWGEVSSP
GAQELVENADAVLCLAPVFNDYSTVGWNAWPKGDKVLLAEPNRVTVGGQSFE
GFALRDFLKGLTDRAPSKPATAQGTHAPKLEIKPAARDARLTNDEMARQINAM
LTPNTTLAAETGDSWFNAMRMNLPGGARVEVEMQWGHIGWSVPSTFGNAMG
SKDRQHIMMVGDGSFQLTAQEVAQMVRYELPVIIFLVNNKGYVIEIAIHDGPYN
YIKNWDYAGLMEVFNAGEGHGIGLHAKTAGELEDAIKKAQANKRGPTIIECSLE
RTDCTETLIKWGKRVAAANSRKPQAVGGSG (SEQ ID NO: 147)

C7JF72_ACEP3
MTYTVGMYLAERLSQIGLKHHFAVAGDFNLVLLDQLLVNKEMEQVYCCNELN
CGFSAEGYARAHGAAAAVVTFSVGAISAMNAIAGAYAENLPVILISGSPNSNDY
GTGHILHHTLGTNDYTYQLEMMRHVTCAAESITDAASAPAKIDHVIRTALRERK
PAYVEIACNVSDAECVRPGPVSSLLAELRADDVSLKAAVEASLALLEKSQRVTM
IVGSKVRAAHAQTQTEHLADKLGCAVTIMAAAKSFFPEDHKGFRGLYWGDVSS
PGAQELVEKSDALICVAPVFNDYSTVGWTAWPKGDNVLLAEPNRVTVGGKTY
EGFTLREFLEELAKKAPSRPLTAQESKKHTPVIEASKGDARLTNDEMTRQINAM
LTSDTTLVAETGDSWFNATRMDLPRGARVELEMQWGHIGWSVPSAFGNAMGS
QERQHILMVGDGSFQLTAQEMAQMVRYKLPVIIFLVNNRGYVIEIAIHDGPYNY
IKNWDYAGLMEVFNAEDGHGLGLKATTAGELEEAIKKAKTNREGPTIIECQIER
SDCTKTLVEWGKKVAAANSRKPQVSGGSG (SEQ ID NO: 148)

A0A0D6NFJ6_9PROT
MTYTVGMYLADRLAQIGLKHHFAVAGDYNLVLLDQLLTNKDMQQIYCCNELN
CGFSAEGYARAHGAAAAVVTFSVGAISAMNAIGGAYAENLPVILISGSPNSNDY
GSGHILHHTIGSTDYGYQMEMVKHVTCAAESITDAASAPAKIDHVIRTALRESK
PAYLEIACNVSAQECPRPGPVSSLLSEPAPDKTSLDAAVAAAVKLIEGAENTVTIL
VGSKLRAARAQAEAEKLADKLECAVTIMAAAKGFFPEDHAGFRGLYWGEVSS
PGTQELVEKADAIICLAPVFNDYSTVGWTAWPKGDKVLLAEPNRVTIKGQTFEG
FALRDFLTALAAKAPARPASAKASSHTPTAFPKADAKAPLTNDEMARQINAML
TSDTTLVAETGDSWFNAMRMTLPRGARVELEMQWGHIGWSVPSSFGNAMGSQ
DRQHVVMVGDGSFQLTAQEVAQMVRYELPVIIFLVNNRGYVIEIAIHDGPYNYI
KNWDYAGLMEVFNAGEGHGLGLHATTAEELEDAIKKAQANRRGPTIIECKIDR
QDCTDTLVQWGKKVASANSRKPQAVGGSG (SEQ ID NO: 166)









3-hydroxypropionate Dehydrogenases

Certain aspects of the present disclosure relate to 3-hydroxypropionate dehydrogenase (3-HPDH) enzymes and polynucleotides related thereto. In some embodiments, a 3-HPDH of the present disclosure refers to an enzyme that catalyzes the conversion of 3-oxopropanoate into 3-HP. Any enzyme capable of catalyzing the conversion of 3-oxopropanoate into 3-HP, e.g., known or predicted to have the enzymatic activity described by EC 1.1.1.59 and/or Gene Ontology (GO) ID 0047565, can be suitably used in the methods and host cells of the present disclosure.
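
By way of non-limiting illustration, the two enzymatic steps discussed herein can be written as the reaction scheme below. The reduced cofactor is written generically as NAD(P)H, which is an assumption made for this sketch; the activity classified as EC 1.1.1.59 is defined with NAD+/NADH, and the cofactor preference of any particular 3-HPDH should be confirmed experimentally.

```latex
\begin{align*}
\text{oxaloacetate}
  &\xrightarrow{\ \text{OAADC}\ }
  \text{3-oxopropanoate} + \mathrm{CO_2} \\
\text{3-oxopropanoate} + \mathrm{NAD(P)H} + \mathrm{H^+}
  &\xrightarrow{\ \text{3-HPDH}\ }
  \text{3-HP} + \mathrm{NAD(P)^+}
\end{align*}
```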


In some embodiments, a 3-HPDH of the present disclosure refers to a polypeptide having the enzymatic activity of a polypeptide shown in Table 1 below. In some embodiments, a 3-HPDH of the present disclosure refers to a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 1 below. In some embodiments, a 3-HPDH of the present disclosure is derived from a source organism shown in Table 1 below. In some embodiments, a 3-HPDH of the present disclosure comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130.


In some embodiments, a 3-HPDH of the present disclosure refers to a polypeptide having the enzymatic activity of a polypeptide shown in Table 7A below. In some embodiments, a 3-HPDH of the present disclosure refers to a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 7A below. In some embodiments, a 3-HPDH of the present disclosure comprises a polypeptide sequence that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence of SEQ ID NO:154 or 159. In some embodiments, a 3-HPDH of the present disclosure comprises the amino acid sequence of SEQ ID NO:154 or 159.


In some embodiments, a 3-HPDH of the present disclosure is an endogenous 3-HPDH. A variety of host cells contemplated for use herein include endogenous genes encoding 3-HPDH enzymes; see, e.g., Table 1 below. In some embodiments, a 3-HPDH of the present disclosure is a recombinant 3-HPDH. For example, a polynucleotide encoding a 3-HPDH of the present disclosure can be introduced into a host cell that lacks endogenous 3-HPDH activity, or a polynucleotide encoding a 3-HPDH of the present disclosure can be introduced into a host cell with endogenous 3-HPDH activity in order to supplement, enhance, or supply said activity under different regulation than the endogenous activity.









TABLE 1
Exemplary 3-HPDH polypeptides.

Sequence Name
Amino Acid Sequence
Source Organism





A4YI81_METS5
MTEKVSVVGAGVIGVGWATLFASKGYSVSLYTEKKETL

Metallosphaera sedula




DKGIEKLRNYVQVMKNNSQITEDVNTVISRVSPTTNLDE




AVRGANFVIEAVIEDYDAKKKIFGYLDSVLDKEVILASST




SGLLITEVQKAMSKHPERAVIAHPWNPPHLLPLVEIVPGE




KTSMEVVERTKSLMEKLDRIVVVLKKEIPGFIGNRLAFAL




FREAVYLVDEGVATVEDIDKVMTAAIGLRWAFMGPFLT




YHLGGGEGGLEYFFNRGFGYGANEWMHTLAKYDKFPY




TGVTKAIQQMKEYSFIKGKTFQEISKWRDEKLLKVYKLV




WEK (SEQ ID NO: 122)






Q819E3_BACCR
MEHKTLSIGFIGIGVMGKSMVYHLMQDGHKVYVYNRTK

Bacillus cereus




AKTDSLVQDGANWCNTPKELVKQVDIVMTMVGYPHDV




EEVYFGIEGIIEHAKEGTIAIDFTTSTPTLAKRINEVAKRK




NIYTLDAPVSGGDVGAKEAKLAIMVGGEKEIYDRCLPLL




EKLGTNIQLQGPAGSGQHTKMCNQIAIASNMIGVCEAVA




YAKKAGLNPDKVLESISTGAAGSWSLSNLAPRMLKGDF




EPGFYVKHFMKDMKIALEEAERLQLPVPGLSLAKELYEE




LIKDGEENSGTQVLYKKYIRG (SEQ ID NO: 123)






5JE8
MKKIGFIGLGNMGLPMSKNLVKSGYTVYGVDLNKEAEA

Bacillus cereus




SFEKEGGIIGLSISKLAETCDVVFTSLPSPRAVEAVYTGAE




GLFENGHSNVVFIDTSTVSPQLNKQLEEAAKEKKVDFLA




APVSGGVIGAENRTLTFMVGGSKDVYEKTESIMGVLGA




NIFHVSEQIDSGTTVKLINNLLIGFYTAGVSEALTLAKKN




NMDLDKMFDILNVSYGQSRIYERNYKSFIAPENYEPGFT




VNLLKKDLGFAVDLAKESELHLPVSEMLLNVYDEASQA




GYGENDMAALYKKVSEQLISNQK (SEQ ID NO: 124)






SERDH_PSEAE
MKQIAFIGLGHMGAPMATNLLKAGYLLNVFDLVQSAVD

Pseudomonas aeruginosa




GLVAAGASAARSARDAVQGADVVISMLPASQHVEGLYL




DDDGLLAHIAPGTLVLECSTIAPTSARKIHAAARERGLA




MLDAPVSGGTAGAAAGTLTFMVGGDAEALEKARPLFEA




MGRNIFHAGPDGAGQVAKVCNNQLLAVLMIGTAEAMA




LGVANGLEAKVLAEIMRRSSGGNWALEVYNPWPGVME




NAPASKDYSGGFMAQLMAKDLGLAQEAAQASASSTPM




GSLALSLYRLLLKQGYAERDFSVVQKLFDPTQGQ (SEQ ID NO: 125)






E7KSY9_YEASL
MSQGRKAAERLAKKTVLITGASAGIGKATALEYLEASNG

Saccharomyces cerevisiae




DMKLILAARRLEKLEELKKTIDQEFPNAKVHVAQLDITQ




AEKIKPFIENLPQEFKDIDILVNNAGKALGSDRVGQIATE




DIQDVFDTNVTALINITQAVLPIFQAKNSGDIVNLGSIAGR




DAYPTGSIYCASKFAVGAFTDSLRKELINTKIRVILIAPGL




VETEFSLVRYRGNEEQAKNVYKDTTPLMADDVADLIVY




ATSRKQNTVIADTLIFPTNQASPHHIFRG (SEQ ID NO: 126)






Q5FQ06_GLUOX
MSSPKIGFIGYGAMAQRMGANLRKAGYPVVAYAPSGGK

Gluconobacter oxydans




DETEMLPSPRAIAEAAEIIIFCVPNDAAENESLHGENGAL




AALTPGKLVLDTSTVSPDQADAFASLAVEHGFSLLDAPM




SGSTPEAETGDLVMLVGGDEAVVKRAQPVLDVIGKLTIH




AGPAGSAARLKLVVNGVMGATLNVIAEGVSYGLAAGL




DRDVVFDTLQQVAVVSPHHKRKLKMGQNREFPSQFPTR




LMSKDMGLLLDAGRKVGAFMPGMAVADQALALSNRLH




ANEDYSALIGAMEHSVANLPHK (SEQ ID NO: 127)






A9A4M8_NITMS
MHTVRIPKVINFGEDALGQTEYPKNALVVTTVPPELSDK

Nitrosopumilus maritimus




WLAKMGIQDYMLYDKVKPEPSIDDVNTLISEFKEKKPSV




LIGLGGGSSMDVVKYAAQDFGVEKILIPTTFGTGAEMTT




YCVLKFDGKKKLLREDRFLADMAVVDSYFMDGTPEQVI




KNSVCDACAQATEGYDSKLGNDLTRTLCKQAFEILYDAI




MNDKPENYPYGSMLSGMGFGNCSTTLGHALSYVFSNEG




VPHGYSLSSCTTVAHKHNKSIFYDRFKEAMDKLGFDKLE




LKADVSEAADVVMTDKGHLDPNPIPISKDDVVKCLEDIK




AGNL (SEQ ID NO: 128)






YDFG_ECOLI
MIVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQEL

Escherichia coli




KDELGDNLYIAQLDVRNRAAIEEMLASLPAEWCNIDILV




NNAGLALGMEPAHKASVEDWETMIDTNNKGLVYMTRA




VLPGMVERNHGHIINIGSTAGSWPYAGGNVYGATKAFV




RQFSLNLRTDLHGTAVRVTDIEPGLVGGTEFSNVRFKGD




DGKAEKTYQNTVALTPEDVSEAVWWVSTLPAHVNINTL




EMMPYTQSYAGLNVHRQ (SEQ ID NO: 129)






Q5SLQ6_THET8
MEKVAFIGLGAMGYPMAGHLARRFPTLVWNRTFEKALR

Thermus thermophilus




HQEEFGSEAVPLERVAEARVIFTCLPTTREVYEVAEALYP




YLREGTYWVDATSGEPEASRRLAERLREKGVTYLDAPV




SGGTSGAEAGTLTVMLGGPEEAVERVRPFLAYAKKVVH




VGPVGAGHAVKAINNALLAVNLWAAGEGLLALVKQGV




SAEKALEVINASSGRSNATENLIPQRVLTRAFPKTFALGL




LVKDLGIAMGVLDGEKAPSPLLRLAREVYEMAKRELGP




DADHVEALRLLERWGGVEIR (SEQ ID NO: 130)
















TABLE 7A
Candidate 3-HPDH sequences.

Enzyme name
Amino acid sequence





ADH6_YEAST
MSYPEKFEGIAIQSHEDWKNPKKTKYDPKPFYDHDIDIKIEACGVCGSDIHCAAG



HWGNMKMPLVVGHEIVGKVVKLGPKSNSGLKVGQRVGVGAQVFSCLECDRCK



NDNEPYCTKFVTTYSQPYEDGYVSQGGYANYVRVHEHFVVPIPENIPSHLAAPLL



CGGLTVYSPLVRNGCGPGKKVGIVGLGGIGSMGTLISKAMGAETYVISRSSRKRE



DAMKMGADHYIATLEEGDWGEKYFDTFDLIVVCASSLTDIDFNIMPKAMKVGG



RIVSISIPEQHEMLSLKPYGLKAVSISYSALGSIKELNQLLKLVSEKDIKIWVETLPV



GEAGVHEAFERMEKGDVRYRFTLVGYDKEFSD (SEQ ID NO: 149)





YQHD_ECOLI
MNNFNLHTPTRILFGKGAIAGLREQIPHDARVLITYGGGSVKKTGVLDQVLDALK



GMDVLEFGGIEPNPAYETLMNAVKLVREQKVTFLLAVGGGSVLDGTKFTAAAA



NYPENIDPWHILQTGGKEIKSAIPMGCVLTLPATGSESNAGAVISRKTTGDKQAF



HSAHVQPVFAVLDPVYTYTLPPRQVANGVVDAFVHTVEQYVTKPVDAKIQDRF



AEGILLTLIEDGPKALKEPENYDVRANVMWAATQALNGLIGAGVPQDWATHML



GHELTAMHGLDHAQTLAIVLPALWNEKRDTKRAKLLQYAERVWNITEGSDDER



IDAAIAATRNFFEQLGVPTHLSDYGLDGSSIPALLKKLEEHGMTQLGENHDITLD



VSRRIYEAAR (SEQ ID NO: 150)





ADH2_YEAST_Alcohol_dehydrogenase_2
MSIPETQKAIIFYESNGKLEHKDIPVPKPKPNELLINVKYSGVCHTDLHAWHGDW



PLPTKLPLVGGHEGAGVVVGMGENVKGWKIGDYAGIKWLNGSCMACEYCELG



NESNCPHADLSGYTHDGSFQEYATADAVQAAHIPQGTDLAEVAPILCAGITVYK



ALKSANLRAGHWAAISGAAGGLGSLAVQYAKAMGYRVLGIDGGPGKEELFTSL



GGEVFIDFTKEKDIVSAVVKATNGGAHGIINVSVSEAAIEASTRYCRANGTVVLV



GLPAGAKCSSDVFNHVVKSISIVGSYVGNRADTREALDFFARGLVKSPIKVVGLS



SLPEIYEKMEKGQIAGRYVVDTSK (SEQ ID NO: 151)





YdfG
MIVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQLDV



RNRAAIEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNK



GLVYMTRAVLPGMVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNL



RTDLHGTAVRVTDIEPGLVGGTEFSNVRFKGDDGKAEKTYQNTVALTPEDVSEA



VWWVSTLPAHVNINTLEMMPVTQSYAGLNVHRQ (SEQ ID NO: 152)





A9A4M8
MHTVRIPKVINFGEDALGQTEYPKNALVVTTVPPELSDKWLAKMGIQDYMLYD



KVKPEPSIDDVNTLISEFKEKKPSVLIGLGGGSSMDVVKYAAQDFGVEKILIPTTF



GTGAEMTTYCVLKFDGKKKLLREDRFLADMAVVDSYFMDGTPEQVIKNSVCDA



CAQATEGYDSKLGNDLTRTLCKQAFEILYDAIMNDKPENYPYGSMLSGMGFGN



CSTTLGHALSYVFSNEGVPHGYSLSSCTTVAHKHNKSIFYDRFKEAMDKLGFDK



LELKADVSEAADVVMTDKGHLDPNPIPISKDDVVKCLEDIKAGNL (SEQ ID NO: 153)





A4YI81
MTEKVSVVGAGVIGVGWATLFASKGYSVSLYTEKKETLDKGIEKLRNYVQVMK



NNSQITEDVNTVISRVSPTTNLDEAVRGANFVIEAVIEDYDAKKKIFGYLDSVLDK



EVILASSTSGLLITEVQKAMSKHPERAVIAHPWNPPHLLPLVEIVPGEKTSMEVVE



RTKSLMEKLDRIVVVLKKEIPGFIGNRLAFALFREAVYLVDEGVATVEDIDKVMT



AAIGLRWAFMGPFLTYHLGGGEGGLEYFFNRGFGYGANEWMHTLAKYDKFPYT



GVTKAIQQMKEYSFIKGKTFQEISKWRDEKLLKVYKLVWEK (SEQ ID NO: 154)





3OBB
MKQIAFIGLGHMGAPMATNLLKAGYLLNVFDLVQSAVDGLVAAGASAARSARD



AVQGADVVISMLPASQHVEGLYLDDDGLLAHIAPGTLVLECSTIAPTSARKIHAA



ARERGLAMLDAPVSGGTAGAAAGTLTFMVGGDAEALEKARPLFEAMGRNIFHA



GPDGAGQVAKVCNNQLLAVLMIGTAEAMALGVANGLEAKVLAEIMRRSSGGN



WALEVYNPWPGVMENAPASRDYSGGFMAQLMAKDLGLAQEAAQASASSTPM



GSLALSLYRLLLKQGYAERDFSVVQKLFDPTQGQ (SEQ ID NO: 155)





5JE8
MKKIGFIGLGNMGLPMSKNLVKSGYTVYGVDLNKEAEASFEKEGGIIGLSISKLA



ETCDVVFTSLPSPRAVEAVYFGAEGLFENGHSNVVFIDTSTVSPQLNKQLEEAAK



EKKVDFLAAPVSGGVIGAENRTLTFMVGGSKDVYEKTESIMGVLGANIFHVSEQI



DSGTTVKLINNLLIGFYTAGVSEALTLAKKNNMDLDKMFDILNVSYGQSRIYERN



YKSFIAPENYEPGFTVNLLKKDLGFAVDLAKESELHLPVSEMLLNVYDEASQAG



YGENDMAALYKKVSEQLISNQK (SEQ ID NO: 156)





Q819E3
MEHKTLSIGFIGIGVMGKSMVYHLMQDGHKVYVYNRTKAKTDSLVQDGANWC



NTPKELVKQVDIVMTMVGYPHDVEEVYFCIEGIIEHAKEGTIAIDFTTSTPTLAKR



INEVAKRKNIYTLDAPVSGGDVGAKEAKLAIMVGGEKEIYDRCLPLLEKLGTNIQ



LQGPAGSGQHTKMCNQIAIASNMIGVCEAVAYAKKAGLNPDKVLESISTGAAGS



WSLSNLAPRMLKGDFEPGFYVKHFMKDMKIALEEAERLQLPVGLSLAKELYEE



LIKDGEENSGTQVLYKKYIRG (SEQ ID NO: 157)





Q5FQ06
MSSPKIGFIGYGAMAQRMGANLRKAGYPVVAYAPSGGKDETEMLPSPRAIAEAA



EIIIFCVPNDAAENESLHGENGALAALTPGKLVLDTSTVSPDQADAFASLAVEHGF



SLLDAPMSGSTPEAETGDLVMLVGGDEAVVKRAQPVLDVIGKLTIHAGPAGSAA



RLKLVVNGVMGATLNVIAEGVSYGLAAGLDRDVVFDTLQQVAVVSPHHKRKL



KMGQNREFPSQFPTRLMSKDMGLLLDAGRKVGAFMPGMAVADQALALSNRLH



ANEDYSALIGAMEHSVANLPHK (SEQ ID NO: 158)





2CVZ
MEKVAFIGLGAMGYPMAGHLARRFPTLVWNRTFEKALRHQEEFGSEAVPLERV



AEARVIFTCLPTTREVYEVAEALYPYLREGTYWVDATSGEPEASRRLAERLREKG



VTYLDAPVSGGTSGAEAGTLTVMLGGPEEAVERVRPFLAYAKKVVHVGPVGAG



HAVKAINNALLAVNLWAAGEGLLALVKQGVSAEKALEVINASSGRSNATENLIP



QRVLTRAFPKTFALGLLVKDLGIAMGVLDGEKAPSPLLRLAREVYEMAKRELGP



DADHVEALRLLERWGGVEIR (SEQ ID NO: 159)





Q05016
MSQGRKAAERLAKKTVLITGASAGIGKATALEYLEASNGDMKLILAARRLEKLE



ELKKTIDQEFPNAKVHVAQLDITQAEKIKPFIENLPQEFKDIDILVNNAGKALGSD



RVGQIATEDIQDVFDTNVTALINITQAVLPIFQAKNSGDIVNLGSIAGRDAYPTGSI



YCASKFAVGAFTDSLRKELINTKIRVILIAPGLVETEFSLVRYRGNEEQAKNVYKD



TTPLMADDVADLIVYATSRKQNTVIADTLIFPTNQASPHHIFRG (SEQ ID NO: 160)









3-hydroxypropionate Metabolic Pathways

In some embodiments, a host cell of the present disclosure comprises one or more additional polynucleotides (e.g., encoding one or more additional polypeptides) whose activity promotes the synthesis or uptake of oxaloacetate into the host cell. As is known in the art, host cells are able to convert glucose into phosphoenolpyruvate through a series of metabolic reactions known as glycolysis. See, e.g., Alberts, B., Johnson, A., and Lewis, J. et al. Molecular Biology of the Cell. 4th ed. New York: Garland Science; 2002. In some embodiments, a host cell of the present disclosure comprises polynucleotides encoding the following metabolic enzymes: hexokinase, phosphoglucose isomerase, phosphofructokinase, aldolase, triose phosphate isomerase, glyceraldehyde 3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, and enolase. Suitable enzymes from a variety of host cells are well known in the art. In some embodiments, a host cell of the present disclosure comprises polynucleotides encoding one or more polypeptides active in the oxidative pentose phosphate or Entner-Doudoroff pathway. These pathways are also known to break down sugars (e.g., into glyceraldehyde-3-phosphate); see, e.g., Chen, X. et al. (2016) Proc. Natl. Acad. Sci. 113:5441-5446. The metabolic enzymes catalyzing steps in these pathways are known in the art.


Metabolic pathways that produce oxaloacetate are known, such as the tricarboxylic acid (TCA) cycle. Phosphoenolpyruvate (e.g., originating from the breakdown of glucose as described above) can be converted into oxaloacetate through multiple chemical reactions. See Sauer, U. and Eikmanns, B. J. (2005) FEMS Microbiol. Rev. 29:765-794. In some embodiments, a host cell of the present disclosure comprises a polynucleotide encoding a phosphoenolpyruvate carboxylase. In some embodiments, a phosphoenolpyruvate carboxylase refers to an enzyme that catalyzes the conversion of phosphoenolpyruvate into oxaloacetate. Any enzyme capable of catalyzing the conversion of phosphoenolpyruvate into oxaloacetate, e.g., known or predicted to have the enzymatic activity described by EC 4.1.1.31 and/or Gene Ontology (GO) ID 0008964, can be suitably used in the methods and host cells of the present disclosure. In some embodiments, the phosphoenolpyruvate carboxylase is an endogenous phosphoenolpyruvate carboxylase. In some embodiments, the phosphoenolpyruvate carboxylase is a recombinant phosphoenolpyruvate carboxylase. Phosphoenolpyruvate carboxylases are known in the art and include, without limitation, NP_312912, NP_252377, NP_232274, WP_001393487, WP_001863724, and WP_002230956 (see www.genome.jp/dbget-bin/get_linkdb?-t+refpep+ec:4.1.1.31 for additional enzymes).


In some embodiments, a host cell of the present disclosure comprises polynucleotides encoding a pyruvate kinase and a pyruvate carboxylase. In some embodiments, a pyruvate kinase refers to an enzyme that catalyzes the conversion of phosphoenolpyruvate into pyruvate. Any enzyme capable of catalyzing the conversion of phosphoenolpyruvate into pyruvate, e.g., known or predicted to have the enzymatic activity described by EC 2.7.1.40 and/or Gene Ontology (GO) ID 0004743, can be suitably used in the methods and host cells of the present disclosure. In some embodiments, the pyruvate kinase is an endogenous pyruvate kinase. In some embodiments, the pyruvate kinase is a recombinant pyruvate kinase. Pyruvate kinases are known in the art and include, without limitation, S. cerevisiae Pyk1 and Pyk2, NP_014992, NP_250189, NP_310410, NP_358391, NP_390796, and NP_465095 (see www.genome.jp/dbget-bin/get_linkdb?-t+refpep+ec:2.7.1.40 for additional enzymes). In some embodiments, a pyruvate carboxylase refers to an enzyme that catalyzes the conversion of pyruvate into oxaloacetate. Any enzyme capable of catalyzing the conversion of pyruvate into oxaloacetate, e.g., known or predicted to have the enzymatic activity described by EC 6.4.1.1 and/or Gene Ontology (GO) ID 0071734, can be suitably used in the methods and host cells of the present disclosure. In some embodiments, the pyruvate carboxylase is an endogenous pyruvate carboxylase. In some embodiments, the pyruvate carboxylase is a recombinant pyruvate carboxylase. Pyruvate carboxylases are known in the art and include, without limitation, NP_009777, NP_011453, NP_266825, NP_349267, and NP_464597 (see www.genome.jp/dbget-bin/get_linkdb?-t+refpep+ec:6.4.1.1 for additional enzymes).


In some embodiments, a host cell of the present disclosure comprises one or more modifications resulting in decreased production of pyruvate from phosphoenolpyruvate, e.g., as compared to a host cell (e.g., of the same species and grown under similar conditions) lacking the modification. Without wishing to be bound to theory, it is thought that decreasing production of pyruvate from phosphoenolpyruvate may favor the conversion of phosphoenolpyruvate into oxaloacetate, e.g., using a phosphoenolpyruvate carboxylase of the present disclosure.


In some embodiments, a host cell of the present disclosure comprises a polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK). In some embodiments, a host cell of the present disclosure comprises a polynucleotide encoding a recombinant phosphoenolpyruvate carboxykinase (PEPCK). In some embodiments, a PEPCK of the present disclosure refers to a polypeptide having the enzymatic activity of a polypeptide shown in Table 9A below. In some embodiments, a PEPCK of the present disclosure comprises a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 9A below. In some embodiments, a PEPCK of the present disclosure comprises a polypeptide sequence that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence of SEQ ID NO:162 or 163. In some embodiments, a PEPCK of the present disclosure comprises the amino acid sequence of SEQ ID NO:162 or 163.









TABLE 9A
Candidate PEPCK sequences.

Enzyme name
Amino acid sequence





Q7XAU8
MASPNGLAKIDTQGKTEVYDGDTAAPVRAQTIDELHLLQRKRSA



PTTPIKDGATSAFAAAISEEDRSQQQLQSISASLTSLARETGPKLVK



GDPSDPAPHKHYQPAAPTIVATDSSLKFTHVLYNLSPAELYEQAF



GQKKSSFITSTGALATLSGAKTGRSPRDKRVVKDEATAQELWWG



KGSFNIEMDERQFVINRERALDYLNSLDKVYVNDQFLNWDPENRI



KVRIITSRAYHALFMHNMCIRPTDEELESFGTPDFTIYNAGEFPAN



RYANYMTSSTSINISLARREMVILGTQYAGEMKKGLFGVMHYLM



PKRGILSLHSGCNMGKDGDVALFFGLSGTGKTTLSTDHNRLLIGD



DEHCWSDNGVSNIEGGCYAKCIDLSQEKEPDIWNAIKFGTVLENV



VFNERTREVDYSDKSITENTRAAYPIEFIPNAKIPCVGPHPKNVILL



ACDAFGVLPPVSKLNLAQTMYHFISGYTALVAGTVDGITEPTATF



SACFGAAFIMYHPTKYAAMLAEKMQKYGATGWLVNTGWSGGR



YGVGKRIRLPHTRKIIDAIHSGELLTANYKKTEVFGLEIPTEINGVP



SEILDPINTWTDKAAYKENLLNLAGLFKKNFEVFASYKIGDDSSLT



DEILAAGPNF (SEQ ID NO: 161)





PCKA_Ecoli
MRVNNGLTPQELEAYGISDVHDIVYNPSYDLLYQEELDPSLTGYE



RGVLTNLGAVAVDTGIFTGRSPKDKYIVRDDTTRDTFWWADKGK



GKNDNKPLSPETWQHLKGLVTRQLSGKRLFVVDAFCGANPDTRL



SVRFITEVAWQAHFVKNMFIRPSDEELAGFKPDFIVMNGAKCTNP



QWKEQGLNSENFVAFNLTERMQLIGGTWYGGEMKKGMFSMMN



YLLPLKGIASMHCSANVGEKGDVAVFFGLSGTGKTTLSTDPKRRL



IGDDEHGWDDDGVFNFEGGCYAKTIKLSKEAEPEIYNAIRRDALL



ENVTVREDGTIDFDDGSKTENTRVSYPIYHIDNIVKPVSKAGHATK



VIFLTADAFGVLPPVSRLTADQTQYHFLSGFTARLAGTERGITEPT



PTFSACFGAAFLSLHPTQYAEVLVKRMQAAGAQAYLVNTGWNG



TGKRISIKDTRAIIDAILNGSLDNAETFTLPMFNLAIPTELPGVDTKI



LDPRNTYASPEQWQEKAETLAKLFIDNFDKYTDTPAGAALVAAG



PKL (SEQ ID NO: 162)





PCK from Actinobacillus_succinogenes
MTDLNKLVKELNDLGLTDVKEIVYNPSYEQLFEEETKPGLEGFDK



GTLTTLGAVAVDTGIFTGRSPKDKYIVCDETTKDTVWWNSEAAK



NDNKPMTQETWKSLRELVAKQLSGKRLFVVEGYCGASEKHRIGV



RMVTEVAWQAHFVKNMFIRPTDEELKNFKADFTVLNGAKCTNP



NWKEQGLNSENFVAFNITEGIQLIGGTWYGGEMKKGMFSMMNY



FLPLKGVASMHCSANVGKDGDVAIFFGLSGTGKTTLSTDPKRQLI



GDDEHGWDESGVFNFEGGCYAKTINLSQENEPDIYGAIRRDALLE



NVVVRADGSVDFDDGSKTENTRVSYPIYHIDNIVRPVSKAGHATK



VIFLTADAFGVLPPVSKLTPEQTEYYFLSGFTAKLAGTERGVTEPT



PTFSACFGAAFLSLHPIQYADVLVERMKASGAEAYLVNTGWNGT



GKRISIKDTRGIIDAILDGSIEKAEMGELPIFNLAIPKALPGVDPAIL



DPRDTYADKAQWQVKAEDLANRFVKNFVKYTANPEAAKLVGA



GPKA (SEQ ID NO: 163)





1J3B
MQRLEALGIHPKKRVFWNTVSPVLVEHTLLRGEGLLAHHGPLVV



DTTPYTGRSPKDKFVVREPEVEGEIWWGEVNQPFAPEAFEALYQR



VVQYLSERDLYVQDLYAGADRRYRLAVRVVTESPWHALFARNM



FILPRRFGNDDEVEAFVPGFTVVHAPYFQAVPERDGTRSEVFVGIS



FQRRLVLIVGTKYAGEIKKSIFTVMNYLMPKRGVFPMHASANVG



KEGDVAVFFGLSGTGKTTLSTDPERPLIGDDEHGWSEDGVFNFEG



GCYAKVIRLSPEHEPLIYKASNQFEAILENVVVNPESRRVQWDDD



SKTENTRSSYPIAHLENVVESGVAGHPRAIFFLSADAYGVLPPIAR



LSPEEAMYYFLSGYTARVAGTERGVTEPRATFSACFGAPFLPMHP



GVYARMLGEKIRKHAPRVYLVNTGWTGGPYGVGYRFPLPVTRA



LLKAALSGALENVPYRRDPVFGFEVPLEAPGVPQELLNPRETWAD



KEAYDQQARKEARLFQENFQKYASGVAKEVAEAGPRTE (SEQ ID NO: 164)





1YTM
MSLSESLAKYGITGATNIVHNPSHEELFAAETQASLEGFEKGTVTE



MGAVNVMTGVYTGRSPKDKFIVKNEASKEIWWTSDEFKNDNKP



VTEEAWAQLKALAGKELSNKPLYYVVDLFCGANENTRLKIRFVME



VAWQAHFVTNMFIRPTEEELKGFEPDFVVLNASKAKVENFKELG



LNSETAVVFNLAEKMQIILNTWYGGEMKKGMFSMMNFYLPLQGI



AAMHCSANTDLEGKNTAIFFGLSGTGKTTLSTDPKRLLIGDDEHG



WDDDGVFNFEGGCYAKVINLSKENEPDIWGAIKRNALLENVTVD



ANGKVDFADKSVTENTRVSYPIFHIKNIVKPVSKAPAAKRVIFLSA



DAFGVLPPVSILSKEQTKYYFLSGFTAKLAGTERGITEPTPTFSSCF



GAAFLTLPPTKYAEVLVKRMEASGAKAYLVNTGWNGTGKRISIK



DTRGIIDAILDGSIDTANTATIPYFNFTVPTELKGVDTKILDPRNTY



ADASEWEVKAKDLAERFQKNFKKFESLGGDLVKAGPQL (SEQ ID NO: 165)









In some embodiments, the modification results in decreased pyruvate kinase (PK) activity, e.g., as compared to a host cell (e.g., of the same species and grown under similar conditions) lacking the modification. For example, the host cell may comprise one or more mutations in an endogenous PK enzyme, resulting in decreased PK activity.


In some embodiments, the modification results in decreased pyruvate kinase (PK) expression, e.g., as compared to a host cell (e.g., of the same species and grown under similar conditions) lacking the modification. Various methods for decreasing gene expression may be used and include, without limitation, homologous recombination or other mutagenesis techniques (e.g., transposon-mediated mutagenesis) to remove and/or replace part or all of the coding sequence or regulatory sequence(s); CRISPR/Cas9-mediated gene editing; CRISPR interference (CRISPRi; see Qi, L. S. et al. (2013) Cell 152:1173-1183); heterochromatin formation; RNA interference (RNAi), morpholinos, or other antisense nucleic acids; and the like.


As one example, PK expression can be decreased by placing a PK coding sequence (e.g., an endogenous PK coding sequence) under the control of a promoter (e.g., an exogenous promoter) that results in decreased PK coding sequence expression. For example, an endogenous PK coding sequence can be operably linked to an exogenous promoter that results in decreased expression of the endogenous PK coding sequence, e.g., as compared to endogenous PK expression (e.g., of the same species and grown under similar conditions).


In some embodiments, a PK coding sequence (e.g., an endogenous PK coding sequence) of the present disclosure is operably linked to an inducible promoter, such as a MET3, CTR1, or CTR3 promoter. The MET3 promoter is an inducible promoter commonly used in the art to regulate gene transcription in response to methionine levels, e.g., in the cell culture medium. See, e.g., Mao, X. et al. (2002) Curr. Microbiol. 45:37-40 and Asadollahi, M. A. et al. (2008) Biotechnol. Bioeng. 99:666-677. The CTR1 and CTR3 promoters are copper-repressible promoters commonly used in the art to regulate gene transcription in response to copper levels, e.g., in the cell culture medium. See, e.g., Labbe, S. et al. (1997) J. Biol. Chem. 272:15951-15958.


In some embodiments, a PK coding sequence (e.g., an endogenous PK coding sequence) of the present disclosure is operably linked to a promoter (e.g., a MET promoter) comprising the polynucleotide sequence of TGTGAAGATGAATGTATTGAATATAAAATTATTTCTTGATATCCATATATCCCA TAAACAAGAAATTACTACTTCCGGAAAAACGTAAACACAGTGGAAAATTACG ATACCAATCACGTGATCAAATTACAAGGAAAGCACGTGACTTAAGGCTTCCTA AACTAGAAATTGTGGCTGTCAGGATCAATTGAAAATGGCGCCACACTTTCTTCT CTTATGGTTAGGAGTAGACCCCGAAGACAGAGGATTCCGGCAATCGGAGCACA GTACAACTTTATACTTTCGTTCACTGCATGGAGAGTGAAATTTCAAGCTGAT GCAATTGATATAAATATAACCCATTTACAGGATATGTCCCTCCAAAGGTTGATC CGTTTATTGCTATAATGAATATTGGTTCACTATITTATGCCTCTTGATTTGTAAT CCGGGCCTTTGCTITIGTACTTGACCTTAGACCTTAATCCACCCCAATAGTAAC TAATCAGAACACAAA (SEQ ID NO:131). In some embodiments, a PK coding sequence (e.g., an endogenous PK coding sequence) of the present disclosure is operably linked to a promoter (e.g., a CTR3 promoter) comprising the polynucleotide sequence of ATTCAACTAGAAAGTTGCAAGTAAAGCAACTAACTGCGGGACCAAACAAATIT AAACAAACCCGTGAATATTGTTCTACCTTATCCTATTGCTTCGAAAAAATGAGC AAATATTAACGACAGTTTACTACTGTCGTAGCTITTACTTCAAATAGAAGGAAA ACTGATGAATTGCATACATGAGCAATITFTATTAGAAATTATTACCTAAAAAGG CAAGAAAGCAGAGATAATTTCTCATGCCCCCAACTACTTACTTATATCTACAA TTAAAACTTAATAATATGCTCTTIUGCAGTATGAACCTITTCTTTAAATAACAG AGTACTGCCGCTTCAAACGATGTATCTACATTGACTAAACGAAAATACTACAA GCTGTCTTACTTTTTAAACAAAC (SEQ ID NO:132). In some embodiments, a PK coding sequence (e.g., an endogenous PK coding sequence) of the present disclosure is operably linked to a promoter (e.g., a CTR1 promoter) comprising the polynucleotide sequence of









(SEQ ID NO: 133)


TTGCGTAAGATAGATTCAAACCAAGTGATGGACCTGTCACTGCTTAGTGTT





GATGAACAAACATATCTTCGAGGCCATTCCGCAATGAAAAATCAATTTCTG





ACTAGCTTGCTTGGAGAGGAGCCATCGATACCAGAGTCAGATCCTGACAAC





GAATCGTGTCACATTTTTGTCCGTGCCCAAGCACCGTTTCCCTTCCGAGAT





GAAGATACCATGCAAGTAGGTGATGTTCGTGTTGCTAAATGGAAAGACGTG





GCGCATGGTGTAGCAGAGGGAGCTTTACACGTGATATAAACAGCATGCGCC





TCATTGAGCAAATTAACTACTAACGGTTTCCGAAATAGGTAATTGAGCAAA





TAAGAATTTCAGCACTTTATGAAGAAGGGTCAAGCGTATATAAAGGACACC





TCTTACTTTGAGGTTGTAAGTTTGTCTCTAGCCTTATCAATGGTCTTTATT





TTTTCTGCTACCTTGATTGGGAAATAATCCAATCTTCAATA.






In some embodiments, a host cell of the present disclosure comprises a modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK), e.g., as compared to a host cell (e.g., of the same species and grown under similar conditions) lacking the modification. As one example, an exogenous PEPCK coding sequence can be introduced into a host cell (e.g., operably linked to a constitutive or inducible promoter as described herein), or an endogenous PEPCK coding sequence can be operably linked to an exogenous promoter (e.g., a constitutive or inducible promoter as described herein). In some embodiments, a host cell of the present disclosure comprises a modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK) and a modification resulting in decreased pyruvate kinase (PK) expression and/or activity. In some embodiments, a PEPCK refers to an enzyme that catalyzes the conversion of phosphoenolpyruvate into oxaloacetate. Any enzyme capable of catalyzing the conversion of phosphoenolpyruvate into oxaloacetate, e.g., known or predicted to have the enzymatic activity described by EC 4.1.1.49 and/or Gene Ontology (GO) ID 0004611, can be suitably used in the methods and host cells of the present disclosure. Exemplary PEPCKs are also described supra and in Example 2 below.
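The routes from phosphoenolpyruvate to 3-HP described above can be summarized as a small reaction graph. The following is an illustrative sketch only: the data structure, the function name `routes`, and the short metabolite labels are hypothetical, and the EC numbers are those cited in this section (no EC number is recited here for the OAADC, so that entry is left as `None`). It enumerates the enzyme routes from phosphoenolpyruvate to 3-HP and shows which routes remain when pyruvate kinase is removed from the network.

```python
# Each reaction: (substrate, product, enzyme, EC number cited in the text or None).
REACTIONS = [
    ("phosphoenolpyruvate", "oxaloacetate", "phosphoenolpyruvate carboxylase", "4.1.1.31"),
    ("phosphoenolpyruvate", "oxaloacetate", "PEPCK", "4.1.1.49"),
    ("phosphoenolpyruvate", "pyruvate", "pyruvate kinase", "2.7.1.40"),
    ("pyruvate", "oxaloacetate", "pyruvate carboxylase", "6.4.1.1"),
    ("oxaloacetate", "3-oxopropanoate", "OAADC", None),
    ("3-oxopropanoate", "3-HP", "3-HPDH", "1.1.1.59"),
]


def routes(start, goal, reactions=REACTIONS):
    """Enumerate every route from start to goal as a tuple of enzyme names.

    A depth-first search that never revisits a metabolite within one route.
    """
    found = []
    stack = [(start, (), frozenset([start]))]
    while stack:
        metabolite, enzymes, seen = stack.pop()
        if metabolite == goal:
            found.append(enzymes)
            continue
        for substrate, product, enzyme, _ec in reactions:
            if substrate == metabolite and product not in seen:
                stack.append((product, enzymes + (enzyme,), seen | {product}))
    return found


def without_enzyme(enzyme_name, reactions=REACTIONS):
    """Return the reaction list with one enzyme removed (a crude model of a knockout)."""
    return [r for r in reactions if r[2] != enzyme_name]


if __name__ == "__main__":
    for route in sorted(routes("phosphoenolpyruvate", "3-HP")):
        print(" -> ".join(route))
```

Removing an enzyme from the list is only a topological stand-in for decreased PK expression or activity; it says nothing about flux, regulation, or energetics.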


Host Cells


Certain aspects of the present disclosure relate to recombinant host cells. In some embodiments, a recombinant host cell of the present disclosure comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) of the present disclosure. For example, in some embodiments, the OAADC has a ratio of activity against pyruvate to activity against oxaloacetate that is less than or equal to about 5:1 and/or a specific activity of at least 0.1 μmol/min/mg against oxaloacetate. In some embodiments, the recombinant host cell further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH) of the present disclosure. A host cell of the present disclosure can comprise one or more of the genetic modifications described supra in any number or combination.
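The two OAADC criteria recited above (a pyruvate-to-oxaloacetate activity ratio of at most about 5:1, and a specific activity of at least 0.1 μmol/min/mg against oxaloacetate) can be expressed as simple checks. The sketch below is hypothetical throughout: the class name `EnzymeAssay`, its fields, and the example values are invented for illustration, and the word "about" in the text is treated as a hard cutoff, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class EnzymeAssay:
    """Hypothetical assay record; activities in umol/min/mg."""
    name: str
    activity_oxaloacetate: float
    activity_pyruvate: float


def meets_ratio(assay, max_ratio=5.0):
    """True if activity(pyruvate) / activity(oxaloacetate) <= max_ratio."""
    if assay.activity_oxaloacetate <= 0:
        return False  # the ratio is undefined without oxaloacetate activity
    return assay.activity_pyruvate / assay.activity_oxaloacetate <= max_ratio


def meets_specific_activity(assay, min_activity=0.1):
    """True if specific activity against oxaloacetate is at least min_activity."""
    return assay.activity_oxaloacetate >= min_activity


if __name__ == "__main__":
    # Invented example values, not data from the disclosure.
    candidate = EnzymeAssay("candidate_A", activity_oxaloacetate=0.2, activity_pyruvate=0.5)
    print(meets_ratio(candidate), meets_specific_activity(candidate))
```

Because the text joins the two criteria with "and/or", the checks are kept separate so that either one, or both, can be applied.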


Any microorganism may be utilized according to the present disclosure by one of ordinary skill in the art. In certain aspects, the microorganism is a prokaryotic microorganism, e.g., a recombinant prokaryotic host cell. In certain aspects, a microorganism is a bacterium, such as gram-positive bacteria or gram-negative bacteria. E. coli offers a rapid growth rate, well-understood genetics, a variety of available genetic tools, and the capability to produce heterologous proteins; accordingly, in some embodiments, a host cell of the present disclosure is an E. coli cell (e.g., a recombinant E. coli cell).


Other microorganisms may be used according to the present disclosure, e.g., based at least in part on the compatibility of enzymes and metabolites to host organisms. For example, other suitable organisms can include, without limitation: Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Vibrio alginolyticus, Xanthomonas, Zymomonas, and Zymomonas mobilis. Any of these cells may suitably be selected by one of ordinary skill in the art as a recombinant host cell based on the present disclosure, e.g., for use in any of the methods of the present disclosure.


In some embodiments, a host cell of the present disclosure is a fungal host cell. In some embodiments, a recombinant fungal host cell of the present disclosure comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC). In some embodiments, the recombinant fungal host cell further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH). In some embodiments, the recombinant fungal host cell further comprises a polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK). Without wishing to be bound to theory, it is thought that fungal host cells are particularly advantageous for production of 3-HP, which can lead to acidification of a cell culture medium, since they can be more acid-tolerant than certain bacterial host cells. In some embodiments, a host cell of the present disclosure is a non-human host cell. In some embodiments, a host cell of the present disclosure is a yeast host cell.


A variety of fungal host cells are known in the art and contemplated for use as a host cell of the present disclosure. Non-limiting examples of fungal cells are any host cells (e.g., recombinant host cells) of a genus or species selected from Aspergillus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus terreus, Aspergillus pseudoterreus, Aspergillus usamii, Candida rugosa, Issatchenkia orientalis, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kluyveromyces marxianus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Rhodosporidium toruloides, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Yarrowia lipolytica, and Zygosaccharomyces rouxii.


Without wishing to be bound to theory, it is thought that the ability to tolerate and grow (e.g., be cultured in a culture medium/conditions characterized by) acidic pH is particularly advantageous for the methods described herein, since 3-HP production acidifies cell culture media. In some embodiments, a host cell of the present disclosure is capable of producing 3-HP at a pH (e.g., in a cell culture having a pH) lower than 4, lower than 4.5, lower than 5, lower than 5.5, lower than 6, or lower than 6.5. In some embodiments, a host cell of the present disclosure is capable of producing 3-HP at a pH (e.g., in a cell culture having a pH) lower than the pKa of 3-HP, i.e., 4.5 (e.g., at a temperature between about 20° C. and about 37° C. such as 20° C., 25° C., 30° C., or 37° C.).
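The relationship between culture pH and the pKa of 3-HP noted above can be made concrete with the Henderson-Hasselbalch equation. The sketch below is illustrative only: it uses the pKa value of 4.5 stated in the text, treats 3-HP as a simple monoprotic acid in ideal dilute solution, and the function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(ph, pka=4.5):
    """Fraction of a monoprotic acid present in its protonated (free acid) form.

    From Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]),
    so [HA] / ([HA] + [A-]) = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    for ph in (3.5, 4.5, 5.5, 6.5):
        print(f"pH {ph}: {fraction_protonated(ph):.3f} protonated")
```

Under these assumptions, half of the 3-HP is in the free acid form at a pH equal to the pKa, and the free acid fraction rises as the pH falls below it.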


Recombinant Techniques


Many recombinant techniques commonly known in the art may be used to introduce one or more genes of the present disclosure (e.g., an OAADC, 3-HPDH, and/or PEPCK of the present disclosure) into a host cell, including without limitation protoplast fusion, transfection, transformation, conjugation, and transduction.


Unless otherwise indicated, the practice of the present disclosure employs conventional molecular biology techniques (e.g., recombinant techniques), microbiology, cell biology, and biochemistry, which are within the skill of the art. Such techniques are well known in the art; see, e.g., Molecular Cloning: A Laboratory Manual, second edition (Sambrook et al., 1989); Oligonucleotide Synthesis (Gait, ed., 1984); Animal Cell Culture (Freshney, ed., 1987); Gene Transfer Vectors for Mammalian Cells (Miller & Calos, eds., 1987); Current Protocols in Molecular Biology (Ausubel et al., eds., 1987); PCR: The Polymerase Chain Reaction, (Mullis et al., eds., 1994); and Current Protocols in Immunology (Coligan et al., eds., 1991).


In some embodiments, one or more recombinant polynucleotides are stably integrated into a host cell chromosome. In some embodiments, one or more recombinant polynucleotides are stably integrated into a host cell chromosome using homologous recombination, transposition-based chromosomal integration, recombinase-mediated cassette exchange (RMCE; e.g., using a Cre-lox system), or an integrating plasmid (e.g., a yeast integrating plasmid). A variety of integration techniques suitable for a range of host cells are known in the art (see, e.g., US PG Pub No. US20120329115; Daly, R. and Hearn, M. T. (2005) J. Mol. Recognit. 18:119-138; and Griffiths, A. J. F., Miller, J. H., Suzuki, D. T. et al. An Introduction to Genetic Analysis. 7th ed. New York: W.H. Freeman; 2000). See also PCT/US2017/014788, which is incorporated by reference in its entirety.


In some embodiments, one or more recombinant polynucleotides are maintained in a recombinant host cell of the present disclosure on an extra-chromosomal plasmid (e.g., an expression plasmid or vector). A variety of extra-chromosomal plasmids suitable for a range of host cells are known in the art, including without limitation replicating plasmids (e.g., yeast replicating plasmids that include an autonomously replicating sequence, ARS), centromere plasmids (e.g., yeast centromere plasmids that include a centromere sequence, CEN), episomal plasmids (e.g., 2-μm plasmids), and/or artificial chromosomes (e.g., yeast artificial chromosomes, YACs, or bacterial artificial chromosomes, BACs). See, e.g., Actis, L. A. et al. (1999) Front. Biosci. 4:D43-62; and Gunge, N. (1983) Annu. Rev. Microbiol. 37:253-276.


Vectors


Certain aspects of the present disclosure relate to vectors comprising polynucleotide(s) encoding an OAADC of the present disclosure, a 3-HPDH of the present disclosure, and/or a PEPCK of the present disclosure.


As used herein, the term “vector” refers to a polynucleotide construct designed to introduce nucleic acids into one or more host cell(s). Vectors include cloning vectors, expression vectors, shuttle vectors, plasmids, cassettes, and the like. As used herein, the term “plasmid” refers to a circular double-stranded DNA construct used as a cloning and/or expression vector. Some plasmids take the form of an extrachromosomal self-replicating genetic element (episomal plasmid) when introduced into a host cell. Other plasmids integrate into a host cell chromosome when introduced into the host cell. Certain vectors are capable of directing the expression of coding regions to which they are operatively linked, e.g., “expression vectors.” Thus expression vectors cause host cells to express polynucleotides and/or polypeptides other than those native to the host cells, or in a non-naturally occurring manner in the host cells. Some vectors may result in the integration of one or more polynucleotides (e.g., recombinant polynucleotides) into the genome of a host cell.


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure. For example, in some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQLLLNTDMQQIYCSNELNCG FSAEGYARANGAAAAIVTFSVGALSAFNALGGAYAENLPVILISGAPNANDHGTGH ILHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKIDHVIRTALREKKPAYLEIA CNVAGAPCVRPGGIDALLSPPAPDEASLKAAVDAALAFIEQRGSVTMLVGSRIRAA GAQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGHYWGEVSSPGAQQAVEG ADGVICLAPVFNDYATVGWSAWPKGDNVMLVERHAVTVGGVAYAGIDMRDFLT RIAAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMARQIGATILTPRTLTAET GDSWFNAVRMKLPHGARVELEMQWGHIGWSVPAAFGNALAAPERQHVLMVGD GSFQLTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGPYNNVKNWDYAGLMEVF NAGEGNGLGLRARTGGEILAAATEQARANRNGPTLIECTLDRDDCTQELVTWGKRV AAANARPPRAG (SEQ ID NO: 1). In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises the polynucleotide sequence of SEQ ID NO:2. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166. 
In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an amino acid sequence that is at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a sequence selected from the group consisting of SEQ ID NOs: 145, 146, 148, and 166. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a sequence selected from the group consisting of SEQ ID NOs: 145, 146, 148, and 166.
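
The percent identity thresholds recited above can be illustrated with a minimal sketch. It assumes that the two amino acid sequences have already been aligned (gaps written as "-") and uses one common convention, identical positions divided by aligned columns; other conventions exist, and this sketch is not presented as the definition used in the present disclosure. The function names and the variant sequence are hypothetical; the reference string is the first 20 residues of SEQ ID NO: 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the alignment reaches the given percent identity (default: at least 80%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


reference = "MTYTVGRYLADRLAQIGLKH"  # first 20 residues of SEQ ID NO: 1
variant = "MTYTVGRYLSDRLAQLGLKH"    # hypothetical variant with two substitutions

if __name__ == "__main__":
    print(f"{percent_identity(reference, variant):.1f}% identical")
    print("meets 80% threshold:", meets_threshold(reference, variant))
```

In practice the alignment itself would be produced by an alignment tool; this sketch covers only the counting step applied to its output.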


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a 3-HPDH of the present disclosure. For example, in some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 1. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 7A. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes the amino acid sequence of SEQ ID NO: 154 or 159.


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure (e.g., as described supra) and a polynucleotide sequence that encodes a 3-HPDH of the present disclosure (e.g., as described supra).


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a PEPCK of the present disclosure. For example, in some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes a polypeptide that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a polypeptide shown in Table 9A. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes the amino acid sequence of SEQ ID NO:162 or 163.


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure (e.g., as described supra), a polynucleotide sequence that encodes a 3-HPDH of the present disclosure (e.g., as described supra), and a polynucleotide sequence that encodes a PEPCK of the present disclosure (e.g., as described supra).


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises one or more of the promoters described infra, e.g., in operable linkage with a coding sequence or polynucleotide described herein. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure operably linked to a promoter, where the promoter is not an endogenous OAADC promoter (e.g., the promoter is not operably linked to the polynucleotide as the polynucleotide is found in nature). In some embodiments, the vector is a bacterial or prokaryotic expression vector. In some embodiments, the vector is a yeast or fungal cell expression vector.


Promoters


In some embodiments, a coding sequence of interest is placed under control of one or more promoters. “Under the control” refers to a recombinant nucleic acid that is operably linked to a control sequence, enhancer, or promoter. The term “operably linked” as used herein refers to a configuration in which a control sequence, enhancer, or promoter is placed at an appropriate position relative to the coding sequence of the nucleic acid sequence such that the control sequence, enhancer, or promoter directs the expression of a polypeptide.


“Promoter” is used herein to refer to any nucleic acid sequence that regulates the initiation of transcription for a particular coding sequence under its control. A promoter does not typically include nucleic acids that are transcribed, but it rather serves to coordinate the assembly of components that initiate the transcription of other nucleic acid sequences under its control. A promoter may further serve to limit this assembly and subsequent transcription to specific prerequisite conditions. Prerequisite conditions may include expression in response to one or more environmental, temporal, or developmental cues; these cues may be from outside stimuli or internal functions of the cell. Bacterial and fungal cells possess a multitude of proteins that sense external or internal conditions and initiate signaling cascades ending in the binding of proteins to specific promoters and subsequent initiation of transcription of nucleic acid(s) under the control of the promoters. When transcription of a nucleic acid(s) is actively occurring downstream of a promoter, the promoter can be said to “drive” expression of the nucleic acid(s). A promoter minimally includes the genetic elements necessary for the initiation of transcription, and may further include one or more genetic elements that serve to specify the prerequisite conditions for transcriptional initiation. A promoter may be encoded by the endogenous genome of a host cell, or it may be introduced as part of a recombinant, engineered polynucleotide. A promoter sequence may be taken from one host species and used to drive expression of a gene in a host cell of a different species. A promoter sequence may also be artificially designed for a particular mode of expression in a particular species, through random mutation or rational design. 
In recombinant engineering applications, specific promoters are used to express a recombinant gene under a desired set of physiological or temporal conditions or to modulate the amount of expression of a recombinant nucleic acid. In some embodiments, the promoters described herein are functional in a wide range of host cells.


In some embodiments, one or more genes of the present disclosure (e.g., polynucleotides encoding an OAADC, 3-HPDH, pyruvate kinase, phosphoenolpyruvate carboxylase, or pyruvate carboxylase) is operably linked to a promoter, e.g., a constitutive or inducible promoter. In some embodiments, the promoter is exogenous with respect to the polynucleotide that encodes the OAADC. For example, in some embodiments, the promoter is derived from a different source organism than the polynucleotide that encodes the OAADC and/or is not naturally found in operable linkage with the polynucleotide that encodes the OAADC (e.g., in the source organism of the OAADC).


Various promoters suitable for prokaryotic and/or yeast/fungal host cells are known. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure, and a polynucleotide sequence that encodes a 3-HPDH of the present disclosure and/or a polynucleotide sequence that encodes a PEPCK of the present disclosure in a single operon. In some embodiments, the operon is operably linked to a T7 or phage promoter. In some embodiments, the T7 promoter comprises the polynucleotide sequence TAATACGACTCACTATAGGGAGA (SEQ ID NO:134). In some embodiments, an operon of the present disclosure comprises (a) a polynucleotide that encodes an amino acid sequence at least 80% identical to SEQ ID NO:1 (e.g., SEQ ID NO:2), (b) a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH) (e.g., a polynucleotide encoding a 3-HPDH listed in Table 1 or Table 7A) or a polynucleotide encoding an alcohol dehydrogenase (e.g., comprising the sequence of NCBI GenBank Ref. No. ABX13006 or a polynucleotide encoding an alcohol dehydrogenase listed in Table 7A), and (c) a polynucleotide encoding a phosphoenolpyruvate carboxykinase (e.g., comprising a polynucleotide encoding a phosphoenolpyruvate carboxykinase listed in Table 9A). In some embodiments, the phosphoenolpyruvate carboxykinase is selected from the group consisting of E. coli Pck, NCBI Ref. Seq. No. WP_011201442, NCBI Ref. Seq. No. WP_011978877, NCBI Ref. Seq. No. WP_027939345, NCBI Ref. Seq. No. WP_074832324, and NCBI Ref. Seq. No. WP_074838421. In some embodiments, the 3-HPDH comprises the amino acid sequence of SEQ ID NO: 154 or 159. In some embodiments, the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163. In some embodiments, the OAADC comprises a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166.
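
As a simple illustration of the operon arrangement described above, the following sketch checks whether a promoter sequence occurs upstream of the start of a coding sequence within a construct given as a plain string. The T7 promoter sequence is SEQ ID NO:134 as recited above, and the coding start is the first 18 nucleotides of the gene sequence listed for 4COK in Table 2. The flanking and spacer sequences, the function name, and the plain string search (same strand only, no annotation) are assumptions made for illustration.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGGAGA"  # SEQ ID NO:134
OAADC_START = "ATGACGTATACCGTGGGC"       # first 18 nt of the 4COK gene sequence (Table 2)


def promoter_precedes(construct: str, promoter: str, coding_start: str) -> bool:
    """True if `promoter` occurs in `construct` and `coding_start` occurs downstream of it.

    Only the given strand is searched; reverse-complement matches are ignored.
    """
    sequence = construct.upper()
    promoter_pos = sequence.find(promoter.upper())
    if promoter_pos < 0:
        return False
    # Search for the coding start only after the end of the promoter.
    downstream_from = promoter_pos + len(promoter)
    return sequence.find(coding_start.upper(), downstream_from) >= 0


# Hypothetical construct: arbitrary 5' flank, promoter, arbitrary spacer, coding start.
construct = "GGATCC" + T7_PROMOTER + "AAGGAGATATACAT" + OAADC_START

if __name__ == "__main__":
    print(promoter_precedes(construct, T7_PROMOTER, OAADC_START))
```

A check of this kind confirms only the relative order of the two elements; it says nothing about whether the promoter is functional in a given host cell.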


In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure and a polynucleotide sequence that encodes a 3-HPDH of the present disclosure, both operably linked to the same promoter. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure, a polynucleotide sequence that encodes a 3-HPDH of the present disclosure, and a polynucleotide sequence that encodes a PEPCK of the present disclosure, all operably linked to the same promoter. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure and a polynucleotide sequence that encodes a 3-HPDH of the present disclosure operably linked to different promoters. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure, a polynucleotide sequence that encodes a 3-HPDH of the present disclosure, and a polynucleotide sequence that encodes a PEPCK of the present disclosure operably linked to different promoters. In some embodiments, a vector of the present disclosure (e.g., an expression vector) comprises a polynucleotide sequence that encodes an OAADC of the present disclosure, a polynucleotide sequence that encodes a 3-HPDH of the present disclosure, and/or a polynucleotide sequence that encodes a PEPCK of the present disclosure operably linked to a TDH promoter or an FBA promoter. 
In some embodiments, the TDH promoter comprises the polynucleotide sequence TTGATITAACCTGATCCAAAAGGGGTATGTCTATTTTTTAGAGAGTGTITTGTG TCAAATTATGGTAGAATGTGTAAAGTAGTATAAACTTCCTCTCAAATGACGAG GTTTAAAACACCCCCCGGGTGAGCCGAGCCGAGAATGGGGCAATTGTTCAATG TGAAATAGAAGTATCGAGIGAGAAACTTGGGTGTTGGCCAGCCAAGGGGGGGG GGGGGAAGGAAAATGGCGCGAATGCTCAGGTGAGATTGTITTGGAATTGGGTG AAGCGAGGAAATGAGCGACCCGGAGGTTGTGACTTTAGTGGCGGAGGAGGAC GGAGGAAAAGCCAAGAGGGAAGTGTATATAAGGGGAGCAATITGCCACCAGG ATAGAATGGATGAGTTATAATTCTACTGTATITATTGTATAATITATITCTCCT TITGTATCAAACACATTACAAAACACACAAAACACACAAACAAACACAATTAC AAAAA (SEQ ID NO: 135). In some embodiments, the FBA promoter comprises the polynucleotide sequence
TATCGTATTTATTAATCCCCTTCCCCCCAGCGCAGATCGTCCCGTCGATTT
CTATTGTTTGGGCATTATCAGCGACGCGACGGCGACGCGACGGCGATAATG
GGCGACGGTCACAAGATGGAACGAGAAAACAGTTTTTTTCGGATAGGACTC
ATTTTCCAGGTGAGAATGGGGTGACCCCGGGGAGAAACCTTCCGCGAGTGG
AGTGCGAGTGGAGTGGGAAATGTGGCCCCCCCCCCCCTTGTGGGCCATGAG
GTTGACAAATACCGTGTGGCCCGGTGATGGAGTGAGAAAGAGAGGGAAATG
ATAATGGGAAAACAAGGAGAGGCCCGTTTCCCGGGATTTATATAAAGAGGT
GTCTCTATCCCAGTTGAAGTAGAGATTTGTTGATGTAGTTGTTCCTTCCAA
TAAATTTGTTCAATCAGTACACAGCTAATACTATTATTACAGCTACTACTA
ATACTACTACTACTATTACTACCACCCCCAACACAAACACA (SEQ ID NO: 136).


In some embodiments, a constitutive promoter is defined herein as a promoter that drives the expression of nucleic acid(s) continuously and without interruption in response to internal or external cues. Constitutive promoters are commonly used in recombinant engineering to ensure continuous expression of desired recombinant nucleic acid(s). Constitutive promoters often result in a robust amount of nucleic acid expression, and, as such, are used in many recombinant engineering applications to achieve a high level of recombinant protein and enzymatic activity.


Many constitutive promoters are known and characterized in the art. Exemplary bacterial constitutive promoters include without limitation the E. coli promoters Pspc, Pbla, PRNAI, PRNAII, P1 and P2 from rrnB, and the lambda phage promoter PL (Liang, S. T. et al. J. Mol. Biol. 292(1):19-37 (1999)). In some embodiments, the constitutive promoter is functional in a wide range of host cells.


An inducible promoter is defined herein as a promoter that drives the expression of nucleic acid(s) selectively and reliably in response to a specific stimulus. An ideal inducible promoter will drive no nucleic acid expression in the absence of its specific stimulus but drive robust nucleic acid expression rapidly upon exposure to its specific stimulus. Additionally, some inducible promoters induce a graded level of expression that is tightly correlated with the amount of stimulus received. Stimuli for known inducible promoters include, for example, heat shock, exogenous compounds or a lack thereof (e.g., a sugar, metal, drug, or phosphate), salts or osmotic shock, oxygen, and biological stimuli (e.g., a growth factor or pheromone).


Inducible promoters are often used in recombinant engineering applications to limit the expression of recombinant nucleic acid(s) to desired circumstances. For example, since high levels of recombinant protein expression may sometimes slow the growth of a host cell, the host cell may be grown in the absence of recombinant nucleic acid expression, and then the promoter may be induced when the host cells have reached a desired density. Many inducible promoters are known and characterized in the art. Exemplary bacterial inducible promoters include without limitation the E. coli promoters Plac, Ptrp, Ptac, PT7, PBAD, and PlacUV5 (Nocadello, S. and Swennen, E. F. Microb Cell Fact, 11:3 (2012)). In some preferred embodiments, the inducible promoter is a promoter that functions in a wide range of host cells. Inducible promoters that are functional in a wide variety of host bacterial and yeast cells are well known in the art.


Genetic Markers

Certain aspects of the present invention relate to genetic markers that allow selection of host cells that have one or more desired polynucleotides. In some embodiments, the genetic marker is a positive selection marker that confers a selective advantage to the host organisms. Examples of positive markers are genes that complement a metabolic defect (auxotrophic markers) and antibiotic resistance markers.


In some embodiments, the genetic marker is an antibiotic resistance marker such as Apramycin resistance, Ampicillin resistance, Kanamycin resistance, Spectinomycin resistance, Tetracycline resistance, Neomycin resistance, Chloramphenicol resistance, Gentamycin resistance, Erythromycin resistance, Carbenicillin resistance, Actinomycin D resistance, Polymyxin resistance, Zeocin resistance, or Streptomycin resistance. In some embodiments, the genetic marker includes a coding sequence of an antibiotic resistance protein (e.g., a beta-lactamase for certain Ampicillin resistance markers) and a promoter or enhancer element that drives expression of the coding sequence in a host cell of the present disclosure. In some embodiments, a host cell of the present disclosure is grown under conditions in which an antibiotic resistance marker is expressed and confers resistance to the host cell, thereby selecting for the host cell with a successful integration of the marker. Exemplary culture conditions and media are described herein.


In some embodiments, the genetic marker is an auxotrophic marker, such that the marker complements a nutritional mutation in the host cell. In some embodiments, the auxotrophic marker is a gene involved in vitamin, amino acid, or fatty acid synthesis, or in carbohydrate metabolism; suitable auxotrophic markers for these nutrients are well known in the art. In some embodiments, the auxotrophic marker is a gene for synthesizing an amino acid. In some embodiments, the amino acid is any of the 20 standard amino acids. In some embodiments, the auxotrophic marker is a gene for synthesizing glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, tyrosine, tryptophan, serine, threonine, cysteine, methionine, asparagine, glutamine, lysine, arginine, histidine, aspartate or glutamate. In some embodiments, the auxotrophic marker is a gene for synthesizing adenosine, biotin, thiamine, leucine, glucose, lactose, or maltose. In some embodiments, a host cell of the present disclosure is grown under conditions in which an auxotrophic marker is expressed in an environment or medium lacking the corresponding nutrient and confers growth to the host cell (lacking an endogenous ability to produce the nutrient), thereby selecting for the host cell with a successful integration of the marker. Exemplary culture conditions and media are described herein.


Cell Culture Media and Methods

Certain aspects of the present disclosure relate to methods of culturing a cell. As used herein, “culturing” a cell refers to introducing an appropriate culture medium, under appropriate conditions, to promote the growth of a cell. Methods of culturing various types of cells are known in the art. Culturing may be performed using a liquid or solid growth medium. Culturing may be performed under aerobic, anoxic, or anaerobic conditions, with the preferred conditions depending on the requirements of the microorganism and the desired metabolic state of the microorganism. In addition to oxygen levels, other important conditions may include, without limitation, temperature, pressure, light, pH, and cell density.


In some embodiments, a culture medium is provided. A “culture medium” or “growth medium” as used herein refers to a mixture of components that supports the growth of cells. In some embodiments, the culture medium may exist in a liquid or solid phase. A culture medium of the present disclosure can contain any nutrients required for growth of microorganisms. In certain embodiments, the culture medium may further include any compound used to reduce the growth rate of, kill, or otherwise inhibit additional contaminating microorganisms, preferably without limiting the growth of a host cell of the present disclosure (e.g., an antibiotic, in the case of a host cell bearing an antibiotic resistance marker of the present disclosure). The growth medium may also contain any compound used to modulate the expression of a nucleic acid, such as one operably linked to an inducible promoter (for example, when using a yeast cell, galactose may be added into the growth medium to activate expression of a recombinant nucleic acid operably linked to a GAL1 or GAL10 promoter). In further embodiments, the culture medium may lack specific nutrients or components to limit the growth of contaminants, select for microorganisms with a particular auxotrophic marker, or induce or repress expression of a nucleic acid responsive to levels of a particular component.


In some embodiments, the methods of the present disclosure may include culturing a host cell under conditions sufficient for the production of a product, e.g., 3-HP. In certain embodiments, culturing a host cell under conditions sufficient for the production of a product entails culturing the cells in a suitable culture medium. Suitable culture media may differ among different microorganisms depending upon the biology of each microorganism. Selection of a culture medium, as well as selection of other parameters required for growth (e.g., temperature, oxygen levels, pressure, etc.), suitable for a given microorganism based on the biology of the microorganism is well known in the art. Examples of suitable culture media may include, without limitation, common commercially prepared media, such as Luria Bertani (LB) broth, Sabouraud Dextrose (SD) broth, or Yeast medium (YM, YPD, YPG, YPAD, etc.) broth. In other embodiments, alternative defined or synthetic culture media may also be used.


Certain aspects of the present disclosure relate to culturing a recombinant host cell of the present disclosure in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP. A variety of substrates are contemplated for use herein. In some embodiments, the substrate is a compound described herein that can be used as a metabolic precursor to generate oxaloacetate.


In some embodiments, the substrate comprises glucose. In some embodiments, the substrate is glucose. In some embodiments, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or 100% of the glucose metabolized by the recombinant host cell is converted to 3-HP.
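
The conversion percentages above can be related to measured quantities with a short calculation. The sketch below assumes a stoichiometry of 2 mol of 3-HP per mol of glucose, which is consistent with the 100% wt. 3-HP/wt. glucose figure given in the Background but is stated here as an assumption; the molar masses are standard values for C6H12O6 and C3H6O3, and the function names and example quantities are hypothetical.

```python
GLUCOSE_G_PER_MOL = 180.156   # C6H12O6
THREE_HP_G_PER_MOL = 90.078   # C3H6O3


def max_mass_yield(mol_3hp_per_mol_glucose: float = 2.0) -> float:
    """Theoretical g of 3-HP per g of glucose for an assumed molar stoichiometry."""
    return mol_3hp_per_mol_glucose * THREE_HP_G_PER_MOL / GLUCOSE_G_PER_MOL


def percent_of_theoretical(g_3hp_produced: float, g_glucose_consumed: float) -> float:
    """Observed mass yield expressed as a percentage of the theoretical maximum."""
    if g_glucose_consumed <= 0:
        raise ValueError("glucose consumed must be positive")
    observed_yield = g_3hp_produced / g_glucose_consumed
    return 100.0 * observed_yield / max_mass_yield()


if __name__ == "__main__":
    print(f"theoretical mass yield: {max_mass_yield():.3f} g 3-HP per g glucose")
    # Hypothetical measurement: 8 g of 3-HP formed from 10 g of glucose consumed.
    print(f"{percent_of_theoretical(8.0, 10.0):.1f}% of theoretical")
```

Because glucose and 3-HP share the same empirical formula, the assumed 2:1 stoichiometry gives a theoretical mass yield of 1 g of 3-HP per g of glucose, so under this assumption the mass yield and the fraction of glucose converted coincide numerically.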


Other substrates contemplated for use herein include, without limitation, sucrose, fructose, xylose, arabinose, cellobiose, cellulose, alginate, mannitol, laminarin, galactose, and galactan. In some embodiments, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or 100% of the substrate metabolized by the recombinant host cell is converted to 3-HP. A variety of techniques suitable for engineering a recombinant host cell able to metabolize these and other substrates have been described. See, e.g., Enquist-Newman, M. et al. (2014) Nature 505:239-43 (describing S. cerevisiae host cells capable of metabolizing 4-deoxy-L-erythro-5-hexoseulose uronate or mannitol); Wargacki, A. J. et al. (2012) Science 335:308-313 (describing E. coli host cells capable of metabolizing alginate, mannitol, and glucose); and Turner, T. L. et al. (2016) Biotechnol. Bioeng. 113:1075-1083 (describing S. cerevisiae host cells capable of metabolizing cellobiose and xylose).


In some embodiments, a recombinant host cell of the present disclosure is cultured under semiaerobic or anaerobic conditions (e.g., semiaerobic/anaerobic conditions suitable for the host cell to produce 3-HP). As described herein, production of 3-HP using a recombinant host cell of the present disclosure is thought to be advantageous, e.g., for increasing scale of production, yield, and/or cost efficacy. In some embodiments, anaerobic conditions may refer to conditions in which average oxygen concentration is 20% or less than the average oxygen concentration of tap water or of an average aqueous environment.


Purification of Products from Host Cells

In some embodiments, the methods of the present disclosure further comprise substantially purifying 3-HP produced by a host cell of the present disclosure, e.g., from a cell culture or cell culture medium.


A variety of methods known in the art may be used to purify a product from a host cell or host cell culture. In some embodiments, one or more products may be purified continuously, e.g., from a continuous culture. In other embodiments, one or more products may be purified separately from fermentation, e.g., from a batch or fed-batch culture. One of skill in the art will appreciate that the specific purification method(s) used may depend upon, inter alia, the host cell, culture conditions, and/or particular product(s).


In some embodiments, purifying 3-HP comprises: separating or filtering the host cells from a cell culture medium, separating the 3-HP from the culture medium (e.g., by solvent extraction), concentrating the 3-HP by removal of water (e.g., by evaporation), and crystallizing the 3-HP. Techniques for purifying 3-HP are known in the art; see, e.g., U.S. Pat. Nos. 7,279,598 and 6,852,517; U.S. PG Pub. Nos. US20100021978, US2009032548, and US20110244575; and International Pub. Nos. WO2010011874, WO2013192450, and WO2013192451. In some embodiments, the solvent is an organic solvent, including without limitation alcohols, aldehydes, ethers, and ketones. For descriptions of exemplary purification schemes, see, e.g., WO2013192450.


In some embodiments, the methods of the present disclosure further comprise converting 3-HP (e.g., substantially purified 3-HP) into acrylic acid. Techniques for converting 3-HP into acrylic acid are known; see, e.g., WO2013192451 and WO2013185009. In some embodiments, 3-HP is converted into acrylic acid via a catalyst and heat. In some embodiments, 3-HP is converted into acrylic acid by vaporizing 3-HP in aqueous solution and contacting the vapor with a catalyst or inert surface area. In some embodiments, the aqueous solution containing the 3-HP is obtained from a cell culture medium, e.g., by concentrating the medium (e.g., by removal of water).


EXAMPLES

The present disclosure will be more fully understood by reference to the following examples. They should not, however, be construed as limiting the scope of the present disclosure. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.


Example 1: Identification of Novel Oxaloacetate Decarboxylases

This study describes the identification of candidate enzymes capable of directly catalyzing the decarboxylation of oxaloacetate to 3-oxopropanoate using a genomic mining method. Purified candidate enzymes were characterized in functional assays to assess catalytic activity and substrate preference for oxaloacetate compared to pyruvate.


Materials and Methods


Genomic Enzyme Mining



FIG. 3 depicts an overview of the genomic enzyme mining scheme employed to identify candidate oxaloacetate decarboxylase enzymes. Briefly, branched-chain ketoacid decarboxylase from Lactococcus lactis (crystal structure PDB code: 2VBG) was identified to have a relatively broad substrate spectrum (Smit, B. A. et al. (2005) Appl. Environ. Microbiol. 71:303-311). Therefore, its sequence was used as the input to perform genomic database searching via HMMER (Finn, R. D. et al. (2011) Nucleic Acids Res. 39:W29-W37). The target database was set to 15 representative proteomes, and the significance level for E-values was set at 1e-50.


The search resulted in 1,732 significant hits, and the resulting sequences were subsequently filtered using the CD-HIT online server with a 90% identity cutoff. A set of 1,303 homologous gene sequences was then generated. Sequences derived from bacteria were preferred due to the increased likelihood of producing soluble proteins in E. coli. Enzymes with a sequence length less than 200 amino acids or more than 700 amino acids were removed since the average sequence length of ketoacid decarboxylases is about 500 amino acids. To select enzymes for characterization studies, protein sequences that were experimentally validated and annotated as TPP binding proteins were prioritized. For the purpose of diversifying enzyme candidates, the selected sequences broadly covered the entire enzyme family.
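
The E-value and length filters described above can be sketched as follows. This is a minimal illustration operating on hypothetical hit records, not the workflow actually used in this example: it does not call HMMER or CD-HIT, it omits the 90% identity redundancy filtering, and the record fields, accession names, and the choice to rank bacterial hits first are assumptions. The thresholds (E-value of 1e-50, length of 200 to 700 amino acids) are those stated in the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    accession: str   # hypothetical identifier
    evalue: float    # E-value reported by the homology search
    length: int      # sequence length in amino acids
    bacterial: bool  # True if the source organism is a bacterium


def select_candidates(hits: Iterable[Hit],
                      max_evalue: float = 1e-50,
                      min_len: int = 200,
                      max_len: int = 700) -> List[Hit]:
    """Keep significant hits of plausible length, bacterial sequences ranked first."""
    kept = [h for h in hits
            if h.evalue <= max_evalue and min_len <= h.length <= max_len]
    # Sort so that bacterial hits come first, then by increasing E-value.
    return sorted(kept, key=lambda h: (not h.bacterial, h.evalue))


hits = [
    Hit("hypothetical_A", 1e-120, 548, True),
    Hit("hypothetical_B", 1e-30, 560, True),    # fails the E-value cutoff
    Hit("hypothetical_C", 1e-80, 150, True),    # shorter than 200 amino acids
    Hit("hypothetical_D", 1e-200, 610, False),  # passes, but non-bacterial
    Hit("hypothetical_E", 1e-60, 700, True),
]

if __name__ == "__main__":
    for hit in select_candidates(hits):
        print(hit.accession, hit.evalue, hit.length)
```

Treating the length bounds as inclusive follows the wording that sequences shorter than 200 or longer than 700 amino acids were removed.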


Table 2 shows the final sequence library containing 56 sequences with an average of 15% sequence identity, which were verified by phylogenetic analysis. These candidates were subsequently characterized for activity towards oxaloacetate.
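For illustration, an average pairwise sequence identity of the kind reported above could be computed along the following lines. This is a sketch under stated assumptions, not the method used in the study: it assumes the sequences have already been aligned to equal length with "-" as the gap character, and it defines identity as matches divided by the number of columns in which neither sequence has a gap. Other definitions of percent identity exist and give different values. The function names and toy alignment are hypothetical.

```python
from itertools import combinations
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned: List[str]) -> float:
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment for illustration only; not sequences from Table 2.
toy_alignment = ["MKT-AL", "MRTGAL", "MKSGA-"]
toy_mean = mean_pairwise_identity(toy_alignment)
```

A low average identity across the library indicates that the selected candidates are spread widely across the enzyme family rather than clustered around the query sequence.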









TABLE 2







Protein and gene sequences of candidate oxaloacetate decarboxylase enzymes.










Enzyme name or UniProt/GenBank ID
Species
Protein sequence
Gene sequence





4COK

Gluconacetobacter

MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQL
ATGACGTATACCGTGGGCCGCTATCTGGCTGACCGTTTAG




diazotrophicus

LLNTDMQQIYCSNELNCGFSAEGYARANGAAAAIVTF
CCCAAATTGGTCTTAAACATCACTTTGCCGTGGCAGGCGA




SVGALSAFNALGGAYAENLPVILISGAPNANDHGTGHI
CTACAACTTGGTTCTGTTAGACCAGCTGCTGCTGAATACC




LHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKID
GACATGCAACAGATTTACTGCAGTAATGAACTTAACTGTG




HVIRTALREKKPAYLEIACNVAGAPCVRPGGIDALLSP
GGTTCAGTGCCGAAGGCTATGCGCGCGCCAACGGCGCGG




PAPDEASLKAAVDAALAFIEQRGSVTMLVGSRIRAAG
CTGCAGCCATTGTCACCTTTTCCGTCGGCGCTCTGAGCGC




AQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGH
CTTCAACGCCTTGGGCGGCGCATACGCGGAAAACTTGCC




YWGEVSSPGAQQAVEGADGVICLAPVFNDYATVGWS
GGTCATCCTGATCTCTGGCGCACCGAACGCGAATGACCAC




AWPKGDNVMLVERHAVTVGGVAYAGIDMRDFLTRL
GGGACCGGCCATATCTTGCACCATACGCTGGGCACCACA




AAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMA
GATTATGGCTACCAACTGGAAATGGCACGCCATATTACAT




RQIGALLTPRTTLTAETGDSWFNAVRMKLPHGARVEL
GTGCGGCGGAATCAATTGTCGCTGCAGAGGATGCGCCAG




EMQWGHIGWSVPAAFGNALAAPERQHVLMVGDGSFQ
CGAAAATTGATCACGTGATTCGCACCGCGCTGCGCGAAA




LTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGPYNN
AAAAACCAGCATACCTGGAAATTGCGTGTAATGTGGCTG




VKNWDYAGLMEVFNAGEGNGLGLRARTGGELAAAIE
GCGCTCCATGCGTTCGCCCGGGCGGTATTGATGCATTCT




QARANRNGPTLIECTLDRDDCTQELVTWGKRVAAAN
GTCGCCGCCCGCCCCGGATGAAGCCAGCCTGAAGGCGGC




ARPPRAG
CGTTGACGCCGCCCTGGCCTTCATTGAACAACGCGGCTCA




(SEQ ID NO: 1)
GTGACGATGCTCGTTGGTAGTCGTATCCGTGCAGCCGGAG





CCCAGGCTCAGGCGGTCGCCCTCGCGGATGCTCTGGGCTG





CGCGGTGACGACGATGGCGGCAGCGAAATCTTTTTTTCCA





GAAGATCATCCGGGTTATCGTGGTCACTACTGGGGTGAG





GTGTCATCCCCGGGTGCCCAACAGGCCGTGGAGGGCGCT





GACGGTGTGATTTGTTTGGCCCCGGTTTTCAATGACTATG





CCACTGTGGGCTGGAGCGCGTGGCCGAAAGGGGATAACG





TCATGCTTGTGGAACGTCACGCGGTTACCGTAGGTGGTGT





TGCGTATGCCGGCATCGATATGCGAGACTTTCTGACACGT





CTGGCGGCTCACACCGTACGCCGTGATGCCACCGCACGC





GGCGGGGCATATGTAACCCCGCAGACGCCGGCAGCGGCT





CCGACTGCCCCTCTGAACAACGCGGAGATGGCGCGCCAG





ATCGGCGCGCTACTGACGCCGCGGACAACTTTGACCGCG





GAAACCGGCGACAGCTGGTTCAATGCGGTCCGTATGAAA





CTGCCGCACGGCGCGCGGGTCGAACTGGAAATGCAATGG





GGGCACATCGGTTGGAGCGTGCCGGCGGCGTTTGGTAAC





GCGCTGGCGGCGCCGGAACGCCAGCACGTCCTGATGGTG





GGTGACGGCTCATTTCAGCTGACTGCACAGGAAGTGGCC





CAGATGATTCGTCATGACTTACCGGTGATAATCTTTCTGA





TCAACAACCACGGCTATACTATACAAGTGATGATCCATG





ACGGGCCGTATAACAACGTGAAGAACTGGGATTACGCGG





GCCTGATGGAAGTCTTCAATGCGGGGGAAGGTAACGGCC





TCGGTCTTCGTGCCCGCACTGGGGGCGAACTGGCGGCGG





CTATTGAACAGGCCCGCGCCAACCGTAACGGCCCGACCC





TGATCGAATGTACCCTGGACCGCGATGACTGCACGCAGG





AACTGGTGACCTGGGGCAAACGTGTTGCAGCTGCCAACG





CGCGCCCTCCTCGTGCAGGA





(SEQ ID NO: 2)





A0A0F6SDN1_9DELT

Sandaracinus

MADLLAIHRHAVRARLLDERLTQLARAGRIGFHPDAR
ATGGCCGATCTGCTGGCGATTCACCGACATGCCGTGCGTG




amylolyticus

GFEPAIAAAVLAMRAEDAIFPSARDHAAFLVRGLPISR
CCCGTCTGCTGGATGAGCGTTTAACGCAACTTGCCCGCGC




YVAHAFGSVEDPMRGHAAPGHLASRSELRIAAASGLVS
TGGCCGCATCGGGTTCCACCCTGATGCACGTGGTTTCGAG




NHMTHAAGYAWAAKLRGETCAVLTMFADTAADAGD
CCGGCTATTGCGGCTGCCGTACTGGCTATGCGCGCGGAAG




FHSAVNFAGATKAPVIFFCRTDRTRSAHPPTPIDRVAD
ATGCTATTTTCCCGTCCGCGCGAGATCACGCAGCGTTTCTT




KGIAYGVESLVCSADDAGAVASAMAQAHQRALAGEG
GGTTCGCGGATTGCCGATTAGCCGGTATGTGGCCCATGCG




PTLVEAIRESKSDPIEALEARLSSEGHWDAHRALELRRE
TTTGGCAGTGTTGAGGATCCTATGCGTGGCCACGCTGCCC




LMTEIESAVAHAQQVGAPPREAVFEDVYATLPRHLED
CCGGGCACTTAGCGTCACGCGAACTGCGCATTGCCGCGG




QRTTLLATANHEDR
CCAGCGGTCTGGTCAGCAACCATATGACTCACGCCGCCG




(SEQ ID NO: 3)
GTTACGCGTGGGCAGCTAAACTTCGCGGGGAAACGTGCG





CGGTTTTGACCATGTTTGCAGACACCGCTGCGGACGCTGG





TGACTTTCATTCAGCGGTAAACTTTGCGGGTGCCACCAAG





GCGCCGGTTATCTTTTTTTGCCGTACAGATCGGACCCGTA





GTGCACATCCGCCGACGCCGATTGACCGTGTGGCCGATA





AGGGCATTGCATACGGTGTGGAGAGCTTGGTTTGTTCGGC





CGATGATGCCGGTGCGGTGGCTAGCGCCATGGCACAGGC





ACACCAGCGCGCTCTGGCCGGCGAAGGTCCTACGCTGGT





GGAAGCGATTCGTGAATCCAAAAGCGATCCCATCGAGGC





CCTGGAGGCTCGCCTGTCTAGCGAAGGTCACTGGGATGC





GCACCGTGCGCTGGAACTGCGCCGCGAGCTGATGACTGA





GATCGAGTCTGCCGTGGCGCATGCCCAGCAGGTTGGTGCT





CCCCCACGCGAAGCCGTGTTCGAAGATGTCTATGCAACCT





TGCCGCGTCACCTGGAAGACCAGCGTACGACATTACTGG





CCACCGCCAACCACGAAGATCGG





(SEQ ID NO: 4)





4K9Q

Polynucleobacter

MRTVKEITFDLLRKLQVTTVVGNPGSTEETFLKDFPSD
ATGCGCACCGTTAAAGAGATCACATTCGATCTGTTGCGGA




necessarius subsp.

FNYVLALQEASVVAIADGLSQSLRKPVIVNIHTGAGLG
AACTGCAAGTTACCACCGTGGTGGGCAACCCAGGCTCCA




asymbioticus

NAMGCLLTAYQNKTPLIITAGQQTREMLLNEPLLTNIE
CCGAGGAAACGTTTCTGAAAGATTTTCCGTCGGACTTTAA




AINMPKPWVKWSYEPARPEDVPGAFMRAYATAMQQP
CTATGTACTGGCCCTCCAGGAAGCGAGCGTCGTCGCGATC




QGPVFLSLPLDDWEKLIPEVDVARTVSTRQGPDPDKV
GCGGACGGCTTATCCCAGAGTCTTCGTAAGCCCGTGATCG




KEFAQRITASKNPLLIYGSDIARSQAWSDGIAFAERLNA
TTAACATTCACACGGGGGCAGGCTTGGGCAATGCTATGG




PVWAAPFAERTPFPEDHPLFQGALTSGIGSLEKQIQGH
GGTGCTTGTTGACAGCCTATCAGAATAAAACCCCCCTTAT




DLIVVIGAPVFRYYPWIAGQFIPEGSTLLQVSDDPNMTS
TATAACCGCGGGGCAACAAACCCGCGAAATGCTGCTCAA




KAVVGDSLVSDSKLFLIEALKLIDQREKNNTPQRSPMT
AAACCGTGGGTGAAGTGGAGCTATGAACCGGCACGGCC




KEDRTAMPLRPHAVLEVLKENSPKEIVLVEECPSIVPL
GAAACCGTGGGTGAAGTGGAGCTATGAACCGGCACGGCC




MQDVFRINQPDTFYTFASGGLGWDLPAAVGLALGEEV
GGAGGACGTCCCGGGCGCATTCATGCGCGCGTATGCGAC




SGRNRPVVTLMGDGSFQYSVQGIYTGVQQKTHVIYVV
GGCTATGCAACAGCCCCAGGGTCCGGTTTTTCTGAGCCTT




FQNEEYGILKQFAELEQTPNVPGLDLPGLDIVAQGKAY
CCGCTTGACGATTGGGAAAAACTTATCCCTGAAGTAGATG




GAKSLKVETLDELKTAYLEALSFKGTSVIVVPITKELKP
TCGCCCGCACAGTGTCTACCCGTCAAGGTCCGGATCCGGA




LFG
CAAGGTCAAAGAATTTGCGCAACGCATTACCGCATCAAA




(SEQ ID NO: 5)
AAATCCGCTGCTCATTTATGGCAGCGATATTGCGCGCTCG





CAAGCGTGGAGCGATGGTATCGCATTCGCAGAACGCCTA





AACGCACCGGTCTGGGCGGCTCCCTTCGCGGAACGGACC





CCATTTCCTGAAGATCATCCCCTTTTTCAGGGTGCCCTGA





CCTCGGGTATCGGAAGCCTGGAAAAGCAAATCCAGGGTC





ATGATTTAATCGTGGTCATCGGTGCCCCGGTGTTTCGCTA





CTACCCTTGGATCGCGGGGCAATTTATTCCGGAGGGCTCA





ACCCTCCTTCAGGTGTCGGATGATCCTAATATGACCAGCA





AAGCGGTAGTTGGTGATTCCTTGGTTAGCGATTCGAAATT





GTTCCTGATCGAAGCACTTAAACTGATCGATCAGCGCGAA





AAAAACAATACGCCACAGCGCAGCCCGATGACCAAAGAG





GACCGTACCGCCATGCCACTCCGTCCCCATGCTGTTCTCG





AAGTGCTGAAAGAAAATTCACCGAAAGAGATAGTACTGG





TCGAAGAGTGTCCATCCATCGTTCCTCTGATGCAGGACGT





TTTCCGCATTAACCAACCGGATACCTTCTACACCTTTGCA





AGTGGCGGCTTGGGTTGGGACCTGCCGGCCGCAGTAGGG





CTGGCCCTGGGCGAGGAAGTTAGCGGCCGCAACCGGCCT





GTGGTTACGCTTATGGGCGATGGATCCTTCCAATATAGCG





TTCAAGGTATTTACACGGGAGTGCAGCAAAAAACCCATG





TAATTTACGTGGTGTTCCAGAACGAAGAATATGGGATCTT





AAAGCAGTTTGCAGAACTTGAACAGACTCCGAACGTGCC





CGGACTGGATCTGCCGGGGCTGGACATTGTGGCTCAGGG





TAAAGCGTATGGCGCAAAAAGCCTTAAAGTGGAAACACT





TGATGAATTAAAAACCGCCTATCTGGAAGCGCTGAGCTTT





AAGGGTACGTCTGTCATTGTCGTGCCGATCACCAAGGAAT





TAAAACCACTTTTCGGA





(SEQ ID NO: 6)





D6ZJY9_MOBCV

Mobiluncus curtisii

MLKQIEGSQAIARAVAACQPNVVAAYPISPQTHIVEAL
ATGCTGAAACAGATTGAAGGCTCTCAGGCAATAGCACGT




SALVKSGQLEHCEYVNVESEFAAMSACIGSSAVGARS
GCCGTTGCTGCGTGCCAGCCAAACGTGGTCGCAGCCTATC




YTATASQGLLMVEAVYNAAGLGFPIVMTVANRAIG
CGATCTCACCGCAGACCCATATTGTGAAGCACTTTCTGC




APINIWNDHSDSMSQRDSGWLQLFAENNQEAADLHV
GCTGGTAAAAAGTGGCCAGCTGGAACACTGCGAGTACGT




QAFRIAEELSVPVMCMDGFILTHAVEQVDLPESEQVK
GAACGTAGAATCCGAATTCGCAGCCATGTCTGCCTGCATT




QFLPPYEPRQVLDPDDPLSIGAMVGPEAFTEVRYIAHH
GGCTCGTCCGCAGTTGGCGCGCGCTCATATACTGCGACGG




KMLQALDLIPQVQSEFKSIFGRDSGGLLHTYRCEDAETI
CATCACAGGGCTTGCTGTATATGGTTGAAGCGGTCTACAA




IVALGSVVGTLKDVVDQRRENGEKIGIMSLVSFRPFPF
CGCCGCTGGCCTGGGCTTCCCGATTGTCATGACGGTGGCG




AAIREVLQSAKRVVCLEKAFQLGIGGIVSSELRAAMRG
AACCGTGCAATTGGAGCTCCGATCAATATCTGGAATGACC




LPFTCYEVIAGLGGRNITKNSLHAMLDQAVADTIEPLT
ACAGTGATTCGATGTCGCAGCGCGACTCTGGCTGGCTGCA




FMDLDMELVQGELEREAATRRSGAFATNLQRERVLRA
GCTGTTCGCCGAGAACAACCAGGAAGCCGCAGACTTACA




NAKIAEAGPKPKADKVGNPRVASPSIKQDAVPVVPDQ
TGTGCAGGCATTTCGTATCGCTGAGGAGTTGAGCGTCCCG




AE
GTTATGGTGTGCATGGATGGTTTCATTCTAACGCATGCCG




(SEQ ID NO: 7)
TTGAACAGGTCGACCTCCCGGAATCTGAACAAGTGAAAC





AGTTTCTCCCTCCCTACGAACCACGTCAAGTTCTGGACCC





GGACGATCCGTTATCTATTGGCGCTATGGTTGGTCCGGAA





GCGTTTACCGAGGTGCGCTATATTGCTCATCATAAAATGC





TGCAGGCTCTGGATCTGATCCCACAAGTGCAGTCCGAATT





TAAATCAATATTTGGCCGGGACTCTGGGGGACTGCTGCAT





ACGTATCGGTGCGAAGATGCGGAAACTATTATTGTGGCCC





TGGGTTCCGTTGTAGGTACCCTGAAAGATGTCGTGGACCA





ACGTCGCGAGAATGGCGAGAAAATCGGCATCATGAGCTT





AGTGAGCTTCCGCCCCTTCCCATTTGCTGCCATCCGCGAG





GTCCTGCAGTCAGCGAAACGCGTGGTTTGCCTGGAGAAA





GCGTTTCAATTGGGTATTGGGGGGATTGTATCTTCTGAGC





TGCGGGCGGCCATGCGTGGTTTGCCGTTCACTTGTTACGA





AGTAATCGCCGGTTTGGGTGGCCGCAACATTACTAAAAA





CAGTCTACATGCTATGCTTGATCAGGCCGTCGCTGATACG





ATCGAGCCGCTAACCTTTATGGATCTGGATATGGAGCTGG





TGCAGGGCGAGCTCGAACGGGAAGCAGCGACGAGACGCT





CTGGCGCTTTCGCCACCAACCTGCAACGCGAACGTGTCCT





GCGTGCGAACGCTAAAATTGCAGAAGCAGGTCCGAAACC





AAAAGCAGATAAAGTAGGTAACCCGCGGGTTGCGTCTCC





GTCAATCAAGCAGGATGCGGTGCCTGTAGTCCCTGACCA





GGCTGAA





(SEQ ID NO: 8)





Q1LMD8_CUPMC

Cupriavidus metallidurans

MIEAVQFVEAARERGFEWYAGVPCSYLTPFINYVVQD
ATGATTGAGGCTGTTCAGTTTGTCGAGGCGGCACGGGAA




PSLHYVSAANEGDAVAFIAGVTQGARNGVRGITMMQ
CGTGGCTTTGAATGGTACGCGGGGGTTCCCTGCAGTTATT




NSGLGNAVSKTSLTWTERLPQLLIVTWRGQPGGASDE
TGACTCCGTTCATTAATTATGTAGTTCAGGATCCGTCGCT




PQHALMGPVTPAMLDTMEIPWELFPTEPDAVGPALDR
GCACTACGTCAGTGCCGCGAACGAGGGAGATGCTGTTGC




AIAHMDATGRPYALIMQKGSVAPYPLKTQFPPVARAK
ATTCATCGCGGGCGTCACCCAAGGTGCTCGCAACGGCGTC




ATPQVSRSGATPLPSRQEALQRVIAHTPADSTVVLAST
CGTGGTATCACCATGATGCAAAATTCCGGTCTGGGTAACG




GFCGRELYALDDRPNQLYMVGSMGCLTPFALGLAMA
CCGTGTCCCCGCTGACCAGCCTGACCTGGACCTTCCGCCT




RPDLKVVAVDGDGAALMRMGVFATLGAYGPANLTH
GCCGCAGCTGTTGATAGTAACGTGGCGTGGTCAGCCGGG




VLLDNNAHDSTGGQATVSHNVSFAGVAAACGYASAIE
CGGCGCCTCAGACGAACCACAACATGCGCTGATGGGCCC




GDDLDMLDRVLASAATATSGPNFVCLQTRAGTPDGLP
TGTGACCCCGGCGATGCTGGACACCATGGAGATCCCGTG




RPSVTPVEVKTRLGRQIGADQGHAGEKHAAA
GGAACTGTTTCCGACAGAACCGGATGCAGTGGGGCCAGC




(SEQ ID NO: 9)
CCTCGATCGCGCCATCGCACACATGGACGCCACGGGCCG





TCCTTACGCGCTGATCATGCAGAAGGGCTCGGTGGCTCCA





TACCCGCTGAAGACACAGACTCCGCCGGTTGCACGCGCG





AAGGCGACCCCACAGGTTAGTCGCTCAGGTGCCACGCCA





TTACCATCGCGTCAAGAAGCCCTTCAGCGGGTTATCGCCC





ATACCCCGGCTGATTCAACTGTGGTTCTGGCATCTACTGG





CTTTTGCGGTCGAGAACTGTATGCGTTGGATGACCGCCCG





AACCAATTATATATGGTGGGTTCCATGGGTTGTCTGACGC





CATTCGCACTGGGGTTGGCAATGGCGCGTCCGGATCTCAA





AGTGGTTGCAGTAGATGGCGATGGCGCGGCCCTAATGCG





CATGGGGGTGTTCGCGACTCTGGGGGCGTATGGGCCGGC





TAACCTCACCCACGTTTTATTAGACAACAACGCACACGAT





TCAACCGGCGGCCAGGCCACCGTAAGCCATAATGTTTCTT





TTGCGGGGGTCGCAGCGGCGTGCGGCTACGCCTCTGCAAT





CGAAGGTGACGACTTGGATATGCTGGACCGTGTGTTAGC





GTCCGCCGCAACAGCGACTTCCGGGCCGAACTTCGTGTGC





TTACAAACTCGTGCAGGTACGCCGGACGGCTTACCACGA





CCATCTGTGACCCCGTTGAAGTGAAAACGCGCCTTGGTC





GGCAAATTGGCGCCGACCAGGGCCACGCAGGCGAAAAAC





ACGCCGCGGCC





(SEQ ID NO: 10)





Q9F768

Bacteroides fragilis

MNTLTSQIEQLQSLAHELLYLGVDGAPIYTDHFRQLNK
ATGAATACCCTGACCTCTCAGATTGAACAACTGCAAAGCC




EVLEQSDALYPQRGATPEEEANICLALLMGYNATIYNQ
TGGCCCACGAACTGCTGTATCTGGGTGTGGACGGTGCCCC




GDKEEKKQVVLNRCWDVLDQLPATLLKCQLLTYCYG
TATCTATACCGACCATTTTCGTCAGCTGAACAAGGAAGTC




EVFEEELAKEAHTIIESWSNRELLKAEKEIAESLNNLEA
CTGGAACAAAGCGATGCGCTCTATCCACAGAGGGGCGCT




NPYPYSELHE
ACCCCGGAAGAAGAGGCCAACATTTGCCTGGCACTGCTT




(SEQ ID NO: 11)
ATGGGTTATAATGCAACGATTTACAATCAGGGCGATAAG





GAAGAGAAAAAACAAGTGGTCCTGAATCGCTGTTGGGAT





GTGCTGGATCAGCTCCCGGCAACCCTCCTGAAGTGTCAGC





TTCTCACGTACTGCTATGGCGAAGTTTTTGAAGAAGAGTT





AGCGAAAGAAGCCCACACAATCATAGAGTCATGGAGTAA





CCGCGAACTGCTGAAAGCAGAAAAAGAAATCGCGGAATC





GCTGAATAACCTCGAGGCGAATCCGTACCCGTATTCCGAA





CTGCACGAA





(SEQ ID NO: 12)





I3BXS7_9GAMM

Thiothrix nivea

MQIQVSELIVKFLQKLGVDTIFGMPGAHILPVYDELYD
ATGCAAATCCAGGTTAGCGAGCTGATTGTAAAGTTCTTGC



DSM 5205
SGIKTVLYKHEQGAAFMAGGYARVSGRIGACITTAGP
AGAAATTAGGTGTCGATACAATTTTTGGCATGCCAGGCGC




GASNLITGIANAYADKLPMIVITGEAPTHIFGRGGLQES
CCACATCCTGCCCGTGTATGATGAATTATACGACAGCGGC




SGEGGSIDQTALFSGVTRYHKLIERTDYITNVLSQAAR
ATAAAAACCGTTCTCGTTAAGCACGAACAGGGCGCCGCG




QLVADVPGPVVLSIPVNVQKELVDASILENLPTLKPLP
TTCATGGCGGGTGGCTACGCCCGGGTTTCTGGTCGAATTG




KLQIAPPVLEQCADMIRKARCPVILAGYGCLQSVRARL
GTGCGTGTATCACTACCGCTGGCCCGGGGGCCTCGAATCT




ELRKFSEHLNIPVATSLKGKGAIDERSALSLGSLGVTSS
AATCACCGGTATCGCTAACGCGTATGCGGATAAATTGCCG




GHAMHYFMQEADLIILLGAGFNERTSYVWKLADLTQER
ATGATTGTTATCACCGGCGAGGCCCCTACCCACATTTTCG




KIIQVDRNVAQLEKVVKADLAIQSDLGDFLHALNTCC
GCCGAGGCGGCTTACAGGAATCTTCCGGTGAAGGTGGCT




VPQGIEPKSCPDLAAFKQKVDQQAAQSGQVIFNQKLFD
CAATCGACCAAACCGCACTCTTCAGCGGGGTGACCCGAT




LVKSLFARLEPHFAEGIVLVDDNIIYAQNFYRVKDGL
ACCACAAACTGATTGAACGTACCGATTACATTACCAATGT




FVPNTGVSSLGHAIPAAIGARFVLDKPMFAILGDGGFQ
CCTCTCCCAGGCCGCCCGGCAGCTTGTAGCCGATGTACCA




MCCMEIMTAVNYNIPLNIVLFNNQTLGLIRKNQHQQY
GGACCCGTTGTCCTCTCGATTCCAGTTAACGTGCAAAAAG




EQRFLDCDFQNPDYALLAQSFGINHFHVGNNADLQRV
AGCTTGTCGACGCAAGTATTTTAGAAAACTTACCTACGCT




FDTADFHHAINLIELMVDREAYPNYSSRR
TAAACCGCTGCCGAAACTGCAGATCGCGCCGCCGGTGCT




(SEQ ID NO: 13)
GGAGCAGTGTGCGGATATGATCCGCAAGGCTCGTTGTCC





AGTCATCCTGGCGGGGTATGGCTGTCTGCAGTCGGTGCGC





GCTAGATTAGAGCTGCGTAAATTCAGCGAACACCTGAAT





ATTCCAGTGGCGACGAGTCTTAAAGGGAAGGGAGCGATT





GATGAACGTTCGGCACTCAGCCTGGGGTCGCTGGGCGTG





ACGAGTAGCGGACATGCTATGCACTATTTTATGCAAGAG





GCGGATCTCATCATTCTGCTAGGGGCGGGCTTTAATGAAC





GTACGTCTTATGTTTGGAAGGCAGACTTAACCCAAGAGCG





TAAAATCATTCAGGTCGATCGTAATGTTGCTCAGCTAGAA





AAAGTGGTTAAGGCCGATTTGGCAATTCAGTCTGATCTGG





GCGATTTTTTACACGCGCTGAACACCTGTTGTGTGCCCCA





GGGTATTGAACCGAAATCATGTCCGGATCTGGCAGCCTTT





AAACAGAAAGTGGATCAGCAGGCGGCCCAGAGTGGCCAG





GTGATCTTCAACCAGAAATTTGATTTAGTTAAGTCGTTGT





TTGCACGACTGGAACCTCATTTTGCCGAAGGTATCGTATT





GGTGGATGACAATATCATCTATGCGCAAAACTTCTACCGC





GTGAAAGACGGGGACCTGTTTGTACCGAACACTGGGGTG





AGCAGCCTGGGACATGCGATTCCCGCCGCCATTGGTGCGC





GCTTCGTCTTGGATAAACCGATGTTTGCGATTCTTGGCGA





TGGTGGCTTCCAAATGTGTTGTATGGAAATAATGACCGCT





GTGAATTATAATATTCCGCTCAACATCGTGCTCTTTAACA





ATCAGACCCTGGGACTGATACGTAAAAACCAACATCAAC





AGTATGAACAGCGTTTCCTGGATTGTGATTTCCAGAACCC





AGACTATGCCCTACTGGCGCAAAGCTTTGGCATTAACCAC





TTTCATGTGGGTAACAACGCCGATCTGCAGCGCGTTTTTG





ACACGGCGGATTTTCATCATGCTATCAACCTGATTGAGCT





CATGGTTGATCGCGAAGCTTATCCAAACTATTCAAGCCGT





CGC





(SEQ ID NO: 14)





1JSC

Saccharomyces

MIRQSTLKNFAIKRCFQHIAYRNTPAMRSVALAQRFYS
ATGATCCGTCAGTCTACCCTGAAAAACTTTGCTATCAAAC




cerevisiae

SSSRYYSASPLPASKRPEPAPSFNVDPLEQPAEPSKLAK
GCTGCTTTCAGCATATTGCCTATCGTAACACTCCGGCCAT




KLRAEPDMDTSFVGLTGGQIFNEMMSRQNVDTVFGYP
GCGTTCGGTAGCGCTAGCACAGCGCTTCTATTCCTCTTCT




GGAILPVYDAIHNSDKFNFVLPKHEQGAGHMAEGYAR
AGCAGATACTATTCGGCATCTCCGCTGCCGGCCAGTAAAC




ASGKPGVVLVTSGPGATNVVTPMADAFADGIPMVVFT
GCCCCGAACCAGCTCCGTCGTTCAACGTTGATCCACTGGA




GQVPTSAIGTDAFQEADVVGISRSCTKWNVMVKSVEE
ACAGCCAGCGGAACCTTCTAAGCTGGCGAAAAAACTTCG




LPLRINEAFEIATSGRPGPVLVDLPKDVTAAILRNPIPTK
CGCGGAACCGGATATGGATACTTCATTCGTAGGTCTGACA




TTLPSNALNQLTSRAQDEFVMQSINKAADLINLAKKPV
GGAGGCCAGATCTTTAATGAGATGATGAGTCGTCAAAAC




LYVGAGILNHADGPRLLKELSDRAQIPVTTTLQGLGSF
GTCGACACGGTATTCGGCTACCCGGGCGGAGCCATCCTGC




DQEDPKSLDMLGMHGCATANLAVQNADLIIAVGARF
CGGTATATGATGCGATTCATAACTCGGATAAATTCAACTT




DDRVTGNISKFAPEARRAAAEGRGGHHFEVSPKNINK
TGTGTTGCCGAAACATGAACAGGGCGCGGGCCACATGGC




VVQTQIAVEGDATTNLGKMMSKIFPVKERSEWFAQIN
AGAGGGATATGCGCGTGCAAGCGGCAAACCGGGTGTCGT




KWKKEYPYAYMEETPGSKIKPQTVIKKLSKVANDTGR
GCTGGTAACATCAGGCCCGGGTGCAACAAATGTTGTCAC




HVIVTTGVGQHQMWAAQHWTWRNPHTFITSGGLGTM
ACCTATGGCGGATGCTTTTGCCGACGGTATCCCGATGGTA




GYGLPAAIGAQVAKPESLVIDIDGDASFNMTLTELSSA
GTGTTCACCGGCCAAGTGCCAACCAGCGCGATTGGAACA




VQAGTPVKILILNNEEQGMVTQWQSLFYEHRYSHTHQ
GACGCTTTCCAGGAAGCTGATGTGGTCGGCATCTCCCGCA




LNPDFIKLAEAMGLKGLRVKKQEELDAKLKEFVSTKG
GTTGTACAAAGTGGAACGTGATGGTGAAGAGCGTAGAAG




PVLLEVEVDKKVPVLPMVAGGSGLDEFINFDPEVERQ
AGTTGCCTCTGCGTATCAACGAAGCGTTCGAGATTGCGAC




QTELRHKRTGGKH
CAGTGGGCGCCCGGGGCCCGTCTTAGTCGACTTACCTAAG




(SEQ ID NO: 15)
GACGTAACCGCCGCGATCCTGCGCAATCCTATTCCGACCA





AAACTACGTTACCCAGTAACGCGCTGAACCAGCTTACCA





GCCGCGCTCAGGACGAATTCGTCATGCAGTCCATCAATAA





AGCTGCGGACCTTATTAACCTGGCTAAAAAGCCTGTGCTC





TATGTTGGTGCCGGTATTCTCAATCACGCCGATGGACCGC





GTCTGCTGAAAGAGCTGAGCGACCGCGCTCAGATCCCCG





TGACCACTACGCTTCAAGGCCTTGGCTCCTTTGATCAGGA





AGATCCTAAAAGCTTAGATATGTTAGGAATGCACGGATG





CGCCACGGCGAACCTGGCGGTGCAGAATGCGGATCTGAT





TATTGCCGTCGGCGCCCGTTTTGACGACCGTGTGACCGGC





AACATTAGCAAATTTGCTCCTGAAGCTCGTCGTGCTGCTG





CGGAAGGACGTGGAGGAATTATTCATTTTGAAGTAAGTC





CAAAAAATATTAACAAAGTCGTACAGACCCAGATTGCGG





TCGAGGGTGATGCGACCACCAATCTGGGGAAGATGATGA





GCAAAATCTTCCCTGTAAAAGAACGTAGTGAGTGGTTCGC





CCAGATAAATAAGTGGAAAAAAGAATATCCATATGCCTA





TATGGAGGAAACGCCAGGTAGTAAAATTAAACCGCAAAC





TGTGATCAAAAAACTGTCAAAAGTCGCAAACGATACGGG





TCGTCATGTAATCGTAACTACGGGCGTGGGTCAGCATCAG





ATGTGGGCGGCGCAGCATTGGACCTGGCGTAACCCGCAT





ACCTTTATTACGAGCGGCGGATTGGGGACCATGGGCTATG





GGTTGCCGGCGGCGATTGGCGCCCAGGTGGCCAAGCCAG





AGTCACTGGTCATCGATATTGACGGTGACGCGAGCTTCAA





CATGACGCTGACGGAGTTGTCCTCAGCGGTTCAGGCCGGT





ACTCCGGTGAAAATCCTGATTCTGAACAATGAGGAACAG





GGTATGGTTACGCAGTGGCAAAGCTTATTCTACGAGCACC





GATATTCCCACACGCATCAGCTGAACCCTGACTTCATTAA





ACTTGCTGAACCAATGGGGCTGAAGGGCCTGCCCGTGAA





AAAGCAGGAAGAACTTGATGCTAAACTGAAAGAATTCGT





CTCGACGAAGGGACCACTACTTTTAGAAGTGGAGGTGGA





TAAAAAAGTTCCAGTCTTACCTATGGTCGCTGGCGGTAGC





GGCCTGGATGAATTTATTAATTTCGATCCGGAGGTCGAAC





GTCAGCAAACTGAATTGCGCCATAAACGGACAGGAGGTA





AACAC





(SEQ ID NO: 16)





O86938|PPD_STRVT

Streptomyces viridochromogenes

MIGAADLVAGLTGLGVTTVAGVPCSYLTPLINRVISDP
ATGATTGGGGCTGCCGATCTGGTCGCTGGTCTGACCGGTC




ATRYLTVTQEGEAAAVAAGAWLGGGLGCAITQNSGL
TGGGTGTGACCACAGTGGCCGGTGTACCGTGCAGTTATTT




GNMTNPLTSLLHPARIPAVVITTWRGRPGEKDEPQHHL
AACTCCGTTAATCAACCGAGTAATCAGTGACCCGGCAAC




MGRITGDLLDLCDMEWSLIPDTTDELHTAFAACRASL
GAGATATTTGACGGTGACGCAGGAAGGAGAAGCAGCGGC




AHRELPYGFLLPQGVVADEPLNETAPRSATGQVVRYA
AGTTGCAGCAGGGGCCTGGTTGGGTGGTGGTCTGGGCTG




RPGRSAARPTRIAALERLLAELPRDAAVVSTTGKSSRE
CGCGATTACCCAAAACAGCGGTCTTGGCAACATGACCAA




LYTLDDRDQHFYMVGAMGSAATVGLGVALHTPRPVV
CCCTCTCACCTCTTTACTTCACCCTGCCCGTATCCCGGCGG




VVDGDGSVLMRLGSLATVGAHAPGNLVHLVLDNGVH
TAGTTATCACCACCTGGCGCGGCCGCCCGGGTGAGAAAG




DSTGGQRTLSSAVDLPAVAAACGYRAVHACTSLDDLS
ATGAGCCCCAGCACCACCTAATGGGCCGCATTACTGGTG




DALATALATDGPTLVHLAIRPGSLDGLGRPKVTPAEVA
ATCTCCTGGACCTGTGTGATATGGAGTGGTCGCTGATTCC




RRFRAFVTTPPAGTATPVHAGGVTAR
GGATACGACCGACGAACTGCACACAGCGTTTGCTGCTTGC




(SEQ ID NO: 17)
CGTGCTTCCCTGGCGCACCGTGAGCTGCaTATGGTTTTCT





GCTTCCGCAGGGTGTGGTGGCCGATGAGCCACTGAACGA





AACGGCTCCGCGTTCGGCCACCGGGCAGGTCGTCCGCTAT





GCGCGTCCAGGCCGGTCTGCTGCCCGGCCTACGCGCATTG





CCGCCCTGGAACGCCTACTCGCCGACTTTACCGCGTGACGC





AGCAGTGGTATCTACCACCGGCAAAAGCTCCCGAGAGCT





GTACACTTTGGACGATCGTGATCAACATTTCTATATGGTC





GGTGCGATGGGCTCTGCCGCGACCGTTGGACTGGGAGTC





GCGTTGCATACCCCCCGTCCGGTCGTTGTTGTTGATGGTG





ACGGCTCCGTCTTGATGCGCCTCGGTTCGCTGGCAACCGT





GGGGGCCCATGCCCCCGGCAACCTGGTGCATCTTGTGCTG





GATAACGGTGTCCACGATAGCACGGGTGGCCAACGCACG





TTGAGCAGCGCGGTGGATCTCCCAGCTGTCGCCGCCGCGT





GCGGCTATCGCGCTGTGCACGCCTGCACCTCTCTGGATGA





TCTCAGTGATGCATTGGCGACCGCGTTAGCGACGGATGGT





CCGACCTTAGTGCACCTGGCGATTCGCCCGGGAAGCCTGG





ATGGTCTGGGCCGCCCGAAAGTCACGCCCGCTGAAGTGG





CCCGTCGTTTTCGTGCGTTCGTGACCACCCCCCCAGCCGG





TACAGCTACGCCTGTTCACGCTGGTGGTGTGACAGCCCGG





(SEQ ID NO: 18)





3L84_3M34

Campylobacter

MNIQILQEQANTLRFLSADMVQKANSGHPGAPLGLAD
ATGAACATTCAAATTTTGCAAGAACAAGCGAACACTCTG




jejuni

ILSVLSYHLKHNPKNPTWLNRDRLWSGQHASALLYSF
CGTTTCTTGAGTGCGGACATGGTCCAGAAAGCCAATAGC




LHLSGYDLSLEDLKNFRQLHSKTPGHPEISTLGVEIATG
GGCCACCCTGGCGCACCCCTGGGCCTGGCGGATATCCTCT




PLGQGVANAVGFAMAAKKAQNLLGSDLIDHKIYCLC
CTGTGCTCAGTTATCATCTTAAACACAACCCAAAAAACCC




GDGDLQEGISYEACSLAGLHKLDNFILIYDSNNISIEGD
GACCTGGCTTAACCGCCGACCGCTTAGTGTTTTCCGGCGGT




VGLAFNENVKMRFEAQGFEVLSINGHDYEEINKALEQ
CACGCCTCCGCACTGTTGTATTCTTTCCTTCATCTGAGCGG




AKKSTKPCLIIAKTTIAKGAGELEGSHKSHGAPLGEEVI
CTACGACTTAAGTCTGGAAGACCTCAAGAACTTCCGCCAG




KKAKEQAGFDPNISFHIPQASKIRFESAVELGDLEEAK
CTGCACTCGAAGACCCCGGGGCACCCCGAAATTTCCACCC




WKDKLEKSAKKELLERLLNPDFNKIAYPDFKGKDLAT
TGGGCGTAGAAATTGCCACGGGTCCTCTGGGCCAGGGGG




RDSNGEILNVLAKNLEGFLGGSADLGPSNKTELHSMG
TGGCGAATGCAGTGGGATTTGCGATGGCGGCAAAAAAAG




DFVEGKNIHFGIREHAMAAINNAFARYGIFLPFSATFFIF
CGCAAAATCTGCTGGGCAGTGACCTGATTGATCACAAAA




SEYLKPAARIAALMKIKHFFIFTHDSIGVGEDGPTHQPI
TCTACTGTCTGTGCGGTGACGGCGATCTGCAGGAGGGTAT




EQLSTFRAMPNFLTFRPADGVENVKAWQIALNADIPSA
TTCATATGAGGCGTGTTCTCTGGCGGGCCTGCACAAATTA




FVLSRQKLKALNEPVFGDVKNGAYLLKESKEAKFTLL
GATAATTTTATCCTGATATATGATAGTAACAACATTAGCA




ASGSEVWLCLESANELEKQGFACNVVSMPCFELFEKQ
TTGAGGGTGACGTCGGTCTGGCGTTCAATGAAAACGTTAA




DKAYQERLLKCEVIGVEAAHSNELYKFCHKVYGIESF
GATGCGTTTTGAAGCGCAGGGGTTCGAAGTGCTGAGCATT




GESGKDKDVFERFGFSVSKLVNFILSK
AATGGTCACGATTATGAAGAAATTAACAAAGCCCTGGAA




(SEQ ID NO: 19)
CAGGCCAAGAAATCTACCAAACCATGCTTGATTATCGCA





AAAACAACCATTGCGAAAGGCGCGGGTGAACTTGAAGGT





AGCCACAAAAGCCACGGCGCCCCACTGGGTGAAGAAGTG





ATCAAAAAAGCGAAAGAACAGGCTGGCTTTGATCCCAAC





ATCTCTTTTCATATTCCGCAGGCTTCGAAAATCCGCTTTGA





AAGCGCCGTTGAACTGGGGGACCTGGAAGAAGCGAAATG





GAAGGACAAACTTGAAAAATCCGGAAAAAAAGAACTGCT





CGAACGCCTGCTCAACCCAGATTTTAACAAGATTGCGTAT





CCCGATTTCAAAGGCAAAGACCTGGCCACGCGAGACAGT





AACGGGGAGATTTTAAATGTTCTGGCCAAAAATCTGGAG





GGTTTCCTGGGCGGCTCCGCTGACCTGGGTCCTTCGAACA





AGACGGACCTACACTCAATGGGTGACTTTGTTGAGGGCA





AGAACATTCACTTTGGTATTCGTGAACATGCCATGGCGGC





TATTAACAATGCCTTTGCGCGCTATGGAATCTTTCTGCCCT





TTTCAGCGACGTTCTTCATCTTCAGCGAATATCTTAAACC





GGCGGCGCGCATCGCCGCGCTGATGAAGATCAAACATTT





TTTCATTTTTACGCACGACAGCATCGGAGTAGGAGAAGAC





GGCCCGACGCACCAGCCTATAGAACAATTAAGTACCTTTC





GCGCCATGCCGAATTTCCTCACTTTTCGTCCGGCGGATGG





GGTAGAAAACGTAAAAGCTTGGCAGATTGCACTCAATGC





CGACATTCCATCTGCGTTCGTCCTCTCACGTCAGAAGCTG





AAGGCCTTGAACGAGCCTGTTTTTGGTGACGTGAAGAAC





GGAGCATACCTGCTGAAAGAATCTAAAGAAGCCAAGTTT





ACCCTGCTTGCTTCTGGCTCGGAGGTGTGGCTGTGCTTAG





AAAGCGCAAACGAACTTGAAAAACAAGGCTTTGCCTGCA





ACGTCGTGAGTATGCCGTGTTTTGAGCTGTTCGAAAAGCA





GGATAAAGCTTACCAGGAACGCCTGCTTAAAGGAGAAGT





AATTGGCGTGGAGGCGGCACACTCTAATGAACTGTACAA





ATTTTGCCATAAAGTGTATGGGATCGAAAGCTTTGGCGAG





AGTGGCAAAGACAAAGACGTTTTTGAACGTTTCGGCTTTT





CGGTGTCCAAACTrGTGAATTrTATTCTGTCCAAA





(SEQ ID NO: 20)





1UPA_A

Streptomyces

MSRVSTAPSGKPTAAHALLSRLRDHGYGKVFGVVGRE
ATGAGCCGTGTCTCTACAGCGCCTTCGGGTAAACCTACGG




clavuligerus

AASILFDEVEGIDFVLTRHEFTAGVAAPVLARITGRPQ
CAGCTCACGCACTTTTAAGTCGCCTGCGTGACCATGGGGT




ACWATLGPGMTNLSTGIATSVLDRSPVIALAAQSESHD
AGGCAAGGTTTTCGGTGTGGTGGGCCGTGAAGCCGCCTC




IFPNDTHQCLDSVAIVAPMSKYAVELQRPHEITDLVDS
GATCCTGTTCGATGAAGTCGAAGGTATCGATTTCGTCCTG




AVNAAMTEPVGSFISLPVDLLGSSEGIDTTVPNPPANT
ACCCGCCATGAGTTTACCGCAGGCGTAGCCGCGGACGTG




PAKPVGVVADGWQKAADQAAALLAEAKHPVLVVGA
TTAGCACGTATCACCGGGCGTCCACAAGCCTGCTGGGCTA




AAIRSGAVPAIRALAERLNIPVITTYIAKGVLPVGHELN
CCCTGGGACCGGGAATGACCAATCTGAGCACCGGGATTG




VYGAVTGYMDGILNFPALQTMFAPVDLVLTVGYDYAE
CAACGTCAGTATTAGACCGTTCGCCGGTTATTGCGCTCGC




DLRPSMWQKGIEKKTVRISPTVNPIPRVYRPDVDVVTD
AGCTCAGAGTGAATCACACGATATTTTCCCAAACGACACC




VLAFVEHFETATASFGAKQRHDIEPLRARIAEFLADPET
CACCAATGTTTAGACTCAGTGGCGATTGTGGCACCGATGA




YEDGMRVHQYIDSMNTVMEEAAEPGEGTIVSDIGFFR
GCAAATATGCGGTTGAGCTGCAGCGCCCACACGAAATTA




HYGVLFARADQPFGFLTSAGCSSFGYGIPAAIGAQMAR
CGGATTTGGTCGATAGTGCCGTTAATGCCGCGATGACTGA




PDQPTFLIAGDGGFHSNSSDLETIARLNLPIVTVVVNND
ACCCGTGGGCCCCAGCTTTATTAGCCTACCAGTCGATCTG




TNGLIELYQNIGHHRSHDPAVKFGGVDFVALAEANGV
CTGGGGTCGAGCGAAGGGATTGACACAACAGTGCCGAAC




DATRATNREELLAALRKGAELGRPFLIEVPVNYDFQPG
CCGCCGGCGAATACCCCGGCTAAACCGGTGGGCGTGGTA




GFGALSI
GCTGATGGCTGGCAGAAAGCGGCAGATCAAGCTGCTGCG




(SEQ ID NO: 21)
CTTTTGGCAGAGGCCAAACATCCAGTATTAGTGGTGGGTG





CAGCGGCGATCCGTAGCGGAGCTGTTCCTGCAATTAGAG





CTTTGGCAGAACGTTTGAACATCCCCGTCATCACCACCTA





TATCGCTAAAGGTGTCCTGCCGGTTGGTCATGAACTGAAT





TACGGTGCTGTCACCGGCTATATGGATGGCATCCTGAACT





TCCCAGCGCTGCAAACCATGTTTGCTCCGGTGGATTTAGT





ACTGACCGTGGCTTATGATTATGCaGAAGATCTGCGACCT





TCGATGTGGCAAAAAGGTATCGAAAAAAAGACAGTTCGA





ATTTCGCCGACTGTGAACCCCATCCCTCGGGTCTATCGTC





CGGACGTGGACGTCGTGACCGACGTGCTGGCTTTTGTGGA





ACACTTTGAAACCGCGACCGCGTCCTTCGGTGCGAAACA





GCGACACGACATCGAACCCTTGCGTGCACGTATTGCAGA





ATTCTTGGCGGACCCGGAAACCTATGAGGATGGAATGCG





AGTCCATCAGGTAATCGATTCTATGAACACCGTCATGGAA





GAGGCGGCAGAGCCAGGCGAAGGCACCATTGTTAGTGAT





ATTGGGTTCTTCCGCCACTATGGTGTCTTGTTTGCTCGTGC





GGACCAACCCTTTGGGTTCCTGACCTCTGCGGGTTGTTCA





TCTTTTGGATACGGTATTCCAGCGGCTATCGGAGCACAGA





TGGCCCGTCCGGATCAACCTACATTTTTAATTGCAGGCGA





TGGCGGTTTTCACTCTAATTCGAGCGACCTGGAAACCATT





GCTCGCCTTAACCTGCCGATCGTGACGGTTGTCGTGAACA





ATGACACGAACGGCCTGATTGAACTGTACCAGAATATCG





GTCATCATCGCAGTCATGATCCAGCCGTAAAGTTCGGGGG





TGTCGATTTTGTGGCGCTGGCGGAAGCAAACGGCGTTGAT





GCGACCCGGGCAACCAATCGTGAGGAGCTGCTTGCGGCG





TTGCGTAAAGGCGCAGAACTGGGTCGTCCGTTCCTGATCG





AAGTACCGGTAAACTATGACTTTCAGCCGGGTGGCTTTGG





CGCTCTGTCTATT





(SEQ ID NO: 22)





A0A016CS86_BACFG

Fibrobacter

MLSPKFFVETLQTYSMDFFTGVPDSLLKNMCAYITDHI
ATGCTGAGCCCCAAATTCTTTGTCGAAACCCTGCAAACCT




succinogenes

ESQNNIIAVNEGTALGLAAGYYIATGCIPIVYMQNSGIG
ATTCCATGGACTTTTTTACGGGCGTGCCCGATTCGCTGTT




NTVNPLLSLTDKVVYNIPVLLLIGWRGEPGIKDEPQHIK
GAAAAACATGTGCGCCTATATAACTGATCATATTGAATCA




QGMITIPLLDTLGIKNQILNKDPNMAKSQINDAIEYMR
CAGAACAACATTATCGCAGTTAATGAAGGCACTGCGCTT




MTKEAFAFVIQKDTFEEYKLQNTEDSKFDLDREEAIKI
GGGCTGGCGGCGGGTTACTACATCGCAACCGGTTGCATCC




VCNSLDKGSVIVSTTGMISRELFEYRESIDANHETDFLT
CGATTGTATATATGCAGAACAGTGGGATTGGTAACACTGT




VGSMGHASQIALGIALRRKNKKVYCFDGDGAVLMHM
AAATCCTCTTTTGAGTTTGACGGACAAAGTTGTGTACAAC




GALTTIGTSRAVNYIHIVFNNGAHDSVGGQPTVGLKVN
ATCCCGGTGCTTCTCCTTATTGGCTGGCGCGGCGAGCCGG




LSKIASACGYNNVISVDSKATLKESLDRFKSINGPVLLE
GCATTAGGATGAACCGCAGCATATCAAACAGGGGATGA




VKVRKGARKDLGRPTLTPVNKELLMNFLEEADESDK
TCACCATCCCGTTGCTGGATACACTAGGCATTAAAAACCA




SDNVFK
AATTCTCAATAAGGACCCAAACATGGCCAAATCACAAAT




(SEQ ID NO: 23)
TAACGATGCCATCGAGTACATGCGGATGACGAAAGAGGC





ATTCGCCTTTGTAATTCAGAAAGACACTTTCGAGGAATAC





AAACTGCAAAACACCGAAGACAGCAAGTTCGACCTGGAC





CGCGAAGAGGCGATTAAAATCGTGTGTAATTCCTTAGAC





AAAGGCTCCGTGATTGTGAGTACGACCGGCATGATCTCGC





GTGAATTATTCGAGTACCGCGAAAGCATCGATGCTAACC





ATGAAACTGACTTCCTCACAGTCGGTTCCATGGGTCACGC





CAGTCAAATCGCTCTGGGCATCGCACTGCGCCGTAAAAA





CAAAAAAGTCTACTGTTTCGATGGCGATGGAGCCGTCTTA





ATGCATATGGGCGCCTTAACGACAATTGGCACGAGCCGC





GCTGTCAACTACATCCACATTGTGTTCAACAATGGGGCAC





ACGATAGCGTAGGGGGCCAGCCGACGGTTGGCCTCAAAG





TAAACCTGACTAAAATTGCAAGCGCGTGCGGTTACAACA





ATGTAATCTCCGTGGATTCTAAGGCAACATTGAAAGAAA





GCCTCGATCGTTTTAAATCAATAAAIGGTCCGGTATTGCT





CGAAGTTAAGGTACGCAAAGGCGCGCGTAAAGACCTGGG





TCGCCCGACCTTAACACCGGTTAAAAACAAGGAACTGCT





GATGAACTTTCTGGAAGAAGCTGATGAAAGCGATAAAAG





CGATAATGTTTTCAAA





(SEQ ID NO: 24)


A0A0F2PQV5_9FIRM

Peptococcaceae

MISTKRFGEELKKLGFDFYSGVPCSFLKNLINYTTNHC
ATGATTAGCACTAAACGCTTTGGTGAAGAACTAAAAAAA




bacterium

NYLAATNEGEAVAVAAGAFLAGKKPVVLMQNSGLTN
CTGGGCTTTGATTTCTATTCCGGCGTTCCTTGCAGCTTCCT



BRH_c4b
AVSPLVSLNYLFRLPVLGFVSLRGEPGIPDEPQHQLMG
GAAAAACCTAATCAATTACACCACGAATCACTGTAACTA




RITTQMLDLVEIQWEYLSTDFDEVKKQLLQAYSCIESN
CCTGGCCGCTACCAACGAGGGAGAGGCAGTCGCGGTTGC




QPFFFVVKKDTFEKEQLTDSQKRLSKNMFKSERTKAD
CGCGGGTGCGTTCCTGGCCGGCAAAAAACCGGTTGTGCT




QVPKRFETLRLINSLKDVKTVQLTTTGITGRELYEIEDV
GATGCAAAACTCCGGGTTGACGAATGCCGTCTCTCCCCTT




SNNLYMVGSMGCVSSLGLGLALTKKDKDYVVIEGDG
GTAAGCCTGAACTATCTCTTCCGCTTACCGGTGCTGGGTT




ALLMRMGNLATNGYYGPPNMLHILLDNNMHESTGGQ
TTGTCTCCCTTCGCGGTGAACCTGGTATCCCAGACGAGCC




STVSYNINFYDIAAACGYTKSIYVHNLVELESHIKDWK
GCAACACCAGCTCATGGGCCGTATTACCACCCAAATGCTT




REKNLTFLYLKIAKGSIEGLGRPKMKPHEVKERLKVFL
GATCTGGTTGAAATTCAGTGGGAGTATCTCTCCACAGATT




DG
TTGATGAGGTGAAAAAACAGCTGTTACAGGCATACAGCT




(SEQ ID NO: 25)
GTATTGAATCAAATCAACCGTTCTTTTTCGTGGTAAAAAA





AGATACCTTTGAAAAAGAACAGTTAACCGACTCTCAGAA





ACGTCTGAGCAAAAACATGTTTAAATCGGAACGCACCAA





AGCGGATCAGGTGCCCAAAAGATTTGAAACCCTGCGGCT





AATAAACTCCCTGAAAGATGTGAAGACCGTGCAGCTCAC





TACGACGGGCATTACCGGCCGTGAACTATACGAAATTGA





AGATGTCAGCAATAACCTATATATGGTAGGTAGTATGGG





CTGTGTCAGTTCGCTGGGCCTGGGACTGGCGCTGACTAAA





AAAGACAAAGATGTGGTTGTTATCGAAGGTGATGGCGCC





CTGCTGATGCGGATGGGTAACCTTGCGACGAACGGTTACT





ACGGTCCGCCGAATATGCTGCACATTTTGCTGGATAATAA





TATGCATGAATCCACTGGAGGTCAGAGTACCGTTAGCTAC





AACATCAATTTCGTTGACATTGCTGCCGCGTGCGGTTATA





CTAAATCCATCTATGTGCATAACCTGGTGGAACTCGAGTC





GCATATCAAAGATTGGAAACGGGAGAAAAATCTCACGTT





TCTCTATCTGAAAATCGCCAAGGGTAGCATTGAAGGACTG





GGCCGTCCAAAAATGAAACCTCACGAGGTGAAAGAACGT





TTAAAAGTATTCTTGGATGGT





(SEQ ID NO: 26)





D7DTG5_METV3

Methanococcus

MKTIVILLDGVADRPSKELNYKTPLQYANIPNLDEFAK
ATGAAAACCATCGTTATTTTGCTCGATGGGGTTGCGGATC




voltae

SSLTGLMCPQKIGVTLGTEVAHFLLWGYDISQFPGRGV
GTCCTTCCAAAGAACTGAATTATAAAACTCCGCTTCAATA




IEALGEGIDLKKDSIYLRATLGHVNYNQKENNFLVLDR
CGCGAACATCCCGAATCTCGACGAATTCGCTAAGTCTTCC




RTKDINNQEISELLNKISNINIDGYLFTIHHMQGIHSILEI
TTAACGGGCCTCATGTGTCCCCAGAAAATTGGGGTTCCAC




SKLENDGNLKTEPNLKKNNLKKNGFELTYEEFCNEKNI
TGGGCACGGAAGTCGCTCATTTCTTGCTGTGGGGCTACGA




LKYGNINNINNCISNKISDSDPFYKDRHVIMVKPVIKLI
TATTAGTCAGTTCCCCGGACGGGGGGTGATCGAAGCGCT




GTYEEYLNALNVSNALNKYLTTCNTLLENDSINISRKN
GGGTGAAGGCATTGACCTGAAAAAAGATTCGATTTACCT




ENKSLANFLLTKWAGSYKKLPSFKQKWGLNGVIIANS
GCGCGCTACCCTCGGTCATGTGAACTATAATCAGAAGGA




SLFRGLAKLLKMDYYEVKEFDKAIELGLKFKNDNTNN
GAACAACTTCCTTGTGTTGGATCGTCGGACCAAAGACATT




NNNSNNNNNNNQNNNINNKKIYDFIHIHTKEPDEAGH
AACAATCAAGAGATCTCAGAGCTGCTCAACAAAATTTCC




TKNPINKVRVLEKLDKNLKVVIDEIDKEKENGDENLYII
AACATTAACATTGATGGTTATCTGTTTACCATTCATCACA




TGDHATPSTGGLIHSGELVPIAICGKNVGKDSTKAFNE
TGCAGGGTATCCACAGTATTCTGGAAATTTCTAAGCTGGA




MDVLNGYYRINSTDIMNLVLNYTDKALLYGLRPNGDL
GAATGACGGTAATCTGAAAACCGAACCGAACTTGAAGAA




KKYIPEDNELEFLKKDN
AAACAATCTGAAAAAAAATGGCTTCGAACTGACCTATGA




(SEQ ID NO: 27)
AGAATTTTGCAACGAGAAAAATATTCTGAACTATGGCAA





TATTAACAACATCAATAATTGCATCTCTAACAAAATTTCG





GATTCAGACCCGTTTTACAAGGATCGCCACGTGATAATGG





TTAAACCAGTAATTAAACTGATTGGTACCTACGAAGAATA





TCTGAACGCCCTGAATGTAAGCAACGCGCTGAATAAATA





TCTGACAACGTGTAACACCCTGCTGGAAAATGACAGCAT





CAATATTTCACGTAAAAATGAGAATAAATCTCTGGCAAAT





TTTCTGCTGACTAAATGGGCGGGCAGCTATAAAAAGCTGC





CTAGCTTTAAACAGAAATGGGGCTTAAATGGTGTGATTAT





TGCTAACAGTTCTCTGTTCCGTGGTCTGGCCAAACTCCTC





AAAATGGACTATTATGAGGTGAAAGAGTTCGACAAGGCA





ATTGAACTGGGGCTGAAGTTCAAGAACGATAACACGAAC





AATAATAACAACTCCAACAATAACAACAACAACAATCAG





AACAACAATATCAACAATAAGAAGATCTACGACTTTATC





CATATCCATACGAAAGAACCTGATGAGGCCGGGCATACC





AAGAATCCGATCAACAAGGTACGCGTGCTGGAAAAACTC





GATAAAAATTTAAAAGTAGTTATTGATGAGATCGATAAA





GAGAAGGAAAACGGCGATGAAAACCTTTACATTATTACC





GGTGACCACGCGACACCATCGACGGGCGGTCTGATCCAT





TCGGGCGAACTGGTTCCAATTGCAATTTGTGGCAAGAACG





TTGGTAAAGACTCTACGAAGGCGTTTAACGAAATGGACG





TACTGAACGGCTATTACCGGATCAATTCAACCGATATCAT





GAACCTGGTGCTTAACTATACGGATAAAGCCCTCCTGTAT





GGACTCCGTCCAAACGGGGATCTTAAGAAATATATTCCTG





AAGACAATGAACTGGAATTCCTCAAAAAAGATAAC





(SEQ ID NO: 28)





3E9Y

Arabidopsis

MAAATTTTTTSSSISFSTKPSPSSSKSPLPISRFSLPFSLNP
ATGGCGGCTGCTACCACCACTACCACAACATCTTCGTCTA




thaliana

NKSSSSSRRRGIKSSSPSSISAVLNTTTNVTTTPSPTKPT
TATCCTTTTCTACTAAACCGAGCCCTTCTTCTTCCAAAAGT




KPETFISRFAPDQPRKGADILVEALERQGVETVFAYPG
CCACTGCCCATTTCACGCTTCTCCTTACCGTTTAGCCTGAA




GASMEIHQALTRSSSIRNVLPRHEQGGVFAAEGYARSS
CCCCAACAAGAGCTCGAGCAGCTCACGCCGCCGCGGTAT




GKPGICIATSGPGATNLVSGLADALLDSVPLVAITGQVP
TAAATCATCGAGCCCGTCTAGCATATCCGCGGTTCTCAAC




RRMIGTDAFQETPIVEVTRSITKHNYLVMDVEDIPRIIEE
ACCACTACCAACGTTACGACCACTCCTAGCCCGACCAAAC




AFFLATSGRPGPVLVDVPKDIQQQLAIPNWEQAMRLP
CCACTAAACCGGAAACCTTTATTTCGCGATTCGCTCCGGA




GYMSRMPKPPEDSHLEQIVRLISESKKPVLYVGGGCLN
CCAGCCTCGTAAAGGTGCGGATATTCTTGTGGAAGCGCTG




SSDELGRFVELTGIPVASTLMGLGSYPCDDELSLHMLG
GAACGCCAGGGCGTGGAAACCGTGTTTGCTTACCCGGGT




MHGTVYANYAVEHSDLLLAFGVRFDDRVTGKLEAFA
GGCGCTTCCATGGAGATACATCAGGCCTTGACACGGAGTT




SRAKIVHIDIDSAEIGKNKTPHVSVCGDVKLALQGMNK
CATCTATCCGAAATGTTCTGCCGCGTCATGAACAGGGCGG




VLENRAEELKLDFGVWRNELNVQKQKFPLSFKTFGEA
TGTATTTGCAGCGGAAGGGTACGCGCGCTCCTCTGGCAAA




IPPQYAIKVTDELTDGKAIISTGVGQHQMWAAQFYNY
CCAGGCATCTGCATTGCGACCTCAGGCCCCGGTGCTACCA




KKPRQWLSSGGLGAMGFGLPAAIGASVANPDAIVVDI
ATCTCGTTAGCGGCCTGGCAGATGCGTTACTGGATAGCGT




DGDGSFIMNVQELATIRVENLPVKVLLLNNQHLGMVM
GCCGTTAGTCGCGATTACCGGTCAGGTGCCACGTCGTATG




QWEDRFYKANRAHTFLGDPAQEDEIFPNMLLFAAACG
ATCGGCACTGATGCGTTCCAGGAAACACCTATAGTAGAG




IPAARVTKKADLREAIQTMLDTPGPYLLDVICPHQEHV
GTGACCCGTTCAATCACGAAACATAACTATTTGGTGATGG




LPMIPSGGTFNDVITEGDGRIKY
ATGTAGAGGACATCCCGCGCATTATTGAAGAAGCGTTTTT




(SEQ ID NO: 29)
TCTAGCCACTTCTGGTCGCCCAGGCCCGGTCCTGGTAGAT





GTGCCCAAAGATATCCAACAGCAGCTGGCGATCCCGAAT





TGGGAGCAGGCAATGCGCCTCCCCGGGTACATGTCGCGA





ATGCCGAAACCGCCGGAAGATTCTCATTTAGAACAGATT





GTGCGTTTAATTTCGGAATCGAAAAAACCGGTTCTGTATG





TTGGCGGTGGCTGCTTGAATTCATCAGATGAACTGGGTCG





TTTCGTAGAACTCACCGGCATTCCGGTAGCGTCAACCCTG





ATGGGCCTGGGTTCCTATCCGTGCGATGACGAGCTCTCGC





TGCATATGCTCGGAATGCACGGTACCGTGTACGCCAATTA





CGCTGTGGAACACAGTGACCTTCTGCTGGCGTTTGGTGTA





CGTTTTGATGATCGTGTCACCGGCAAGCTGGAGGCGTTCG





CGTCGCGCGCGAAAATTGTCCACATTGATATTGATTCTGC





GGAGATTGGGAAAAACAAAACCCCGCACGTCTCCGTGTG





CGGGGACGTTAAGCTCGCACTTCAGGGCATGAATAAAGT





TCTGGAAAACCGTGCAGAAGAACTGAAACTGGATTTCGG





CGTGTGGCGTAACGAACTTAATGTACAGAAGCAGAAATT





TCCGCTGTCTTTTAAAACGTTTGGTGAAGCAATCCCGCCC





CAGTACGCCATCAAAGTCCTTGACGAATTAACCGACGGT





AAGGCAATCATAAGCACCGGTGTGGGTCAACATCAGATG





TGGGCGGCTCAATTTTATAATTATAAAAAACCTAGACAGT





GGCTCTCGTCAGGCGGCCTGGGTGCCATGGGCTTTGGACT





GCCTGCCGCAATCGGCGCAAGTGTAGCGAACCCGGACGC





TATCGTGGTGGATATCGACGGCGATGGTAGTTTTATTATG





AACGTCCAGGAGCTGGCCACCATCCGCGTAGAGAACCTG





CCCGTAAAAGTTTTATTGTTAAACAACCAGCATTTAGGTA





TGGTGATGCAATGGGAAGATCGTTTCTACAAGGCCAATC





GCGCGCACACCTTTTTAGGCGATCCTGCGCAGGAAGATG





AGATTTTTCCTAACATGCTGCTTTTCGCCGCAGCTGCGG





CATCCCCGCCGCGCGAGTAACCAAAAAGGAGATCTCCG





TGAAGCCATCCAGACTATGCTCGATACCCCCGGTCCGTAT





CTGCTTGACGTGATTTGTCCGCATCAAGAACACGTTCTTC





CGATGATTCCGAGCGGCGGCACCTTTAATGATGTGATCAC





GGAAGGGGACGGTCGCATTAAATAT





(SEQ ID NO: 30)





2ZKT

Pyrococcus

MVLKRKGLLIILDGLGDRPIKELNGLTPLEYANTPNMD
ATGGTTCTGAAACGTAAAGGGCTGCTGATTATCTTGGATG




furiosus

KLAEIGILGQQDPIKPGQPAGSDTAHLSIFGYDPYETYR
GTCTGGGTGATCGTCCGATCAAAGAATTAAACGGCTTAAC




GRGFFEALGVGLDLSKDDLAFRVNFATLENGIITDRRA
TCCGTTGGAATATGCCAACACCCCAAATATGGATAAACTG




GRISTEEAHELARAIQEEVDIGVDFIFKGATGHRAVLVL
GCGGAAATCGGCATTCTAGGCCAGCAGGATCCGATCAAA




KGMSRGYKVGDNDPHEAGKPPLKFSYEDEDSKKVAEI
CCAGGCCAGCCGGCCGGCTCTGACACTGCGCACCTGTCA




LEEFVKKAQEVLEKHPINERRRKEGKPIANYLLIRGAG
ATCTTTGGCTATGATCCCTATGAAACTTACCGTGGGCGGG




TYPNIPMKFTEQWKVKAAGVIAVALVKGVARAVGFP
GCTTTTTTGAAGCATTAGGGGTGGGCCTTGATCTGAGTAA




VYTPEGATGEYNTNEMAKAKKAVELLKDYDFVFLHF
AGACGATCTGGCCTTTCGTGTGAATTTTGCCACGCTCGAA




KPTDAAGHDNKPKLKAELIERADRMIGYILDHVDLEE
AATGGGATTATTACGGATCGTCGCGCAGGCCGTATTAGCA




VVIAITGDHSTPCEVMNHSGDPVPLLIAGGGVRTDDTK
CAGAGGAAGCGCACGAACTGGCGCGGGCGATTCAGGAGG




RFGEREAMKGGLGRIRGHDIVPIMMDLMNRSEKFGA
AAGTGGACATTGGGGTTGACTTCATTTTCAAAGGCGCGAC




(SEQ ID NO: 31)
CGGCCATCGTGCAGTGCTCGTTTTAAAAGGTATGTCTCGT





GGTTATAAAGTGGGTGATAACGATCCGCATGAAGCTGGT





AAACCGCCGTTAAAGTTTTCATATGAAGACGAGGATTCA





AAGAAAGTAGCCGAAATTCTCGAAGAATTCGTGAAAAAA





GCGCAGGAAGTTCTTGAAAAACACCCAATTAATGAAAGA





CGCCGCAAGGAGGGCAAACCGATCGCGAACTATTTGCTG





ATTCGCGGGGCTGGGACGTATCCGAACATACCGATGAAA





TTCACCGAGCAGTGGAAAGTGAAGGCGGCCGGCGTAATT





GCAGTGGCGCTGGTTAAAGGCGTAGCACGTGCAGTCGGC





TTCGACGTATATACCCCTGAAGGGGCGACCGGAGAGTAC





AACACGAACGAAATGGCCAAAGCAAAAAAAGCAGTAGA





ACTGCTAAAAGATTATGATTTTGTGTTCTTACACTTCAAA





CCGACTGATGCCGCGGGGCACGACAACAAACCGAAGCTG





AAAGCGGAATTGATTGAACGCGCCGATCGCATGATTGGG





TATATCTTGGATCATGTTGACTTAGAAGAAGTTGTAATCG





CTATCACCGGCGATCATTCGACGCCATGCGAGGTAATGA





ATCATAGCGGGGACCCTGTCCCACTTTTGATTGCGGGTGG





CGGCGTGCGCACGGACGATACCAAACGTTTCGGCGAGCG





CGAGGCAATGAAAGGCGGCCTTGGCCGCATCCGTGGCCA





CGATATTGTTCCTATCATGATGGATCTAATGAATCGTTCG





GAAAAATTTGGTGCG





(SEQ ID NO: 32)





A0A124FLS8_9FIRM

Clostridia

MLLVVLDGLGGLPVPELNGRTELEAAATPNLDALAKR
ATGCTGCTGGTTGTTCTGGATGGTCTGGGCGGCCTTCCGG




bacterium 62_21

SSLGLAHPVLPGIAPGSSAGHLALFGYDPLRYVIGRGV
TGCCTGAACTGAATGGGCGTACGGAACTTGAGGCGGCCG




LEALGIGFDLHPGDVAVRANFATVQDTRNGPVVTDRR
CGACACCGAACTTAGATGCGCTGGCGAAGCGCTCTTCCCT




AGRPPTEHTRSICRRLQDAIPEIDGVRVFIEPVKEHRFVI
GGGCCTGGCACATCCGGTGCTGCCGGGCATAGCGCCTGG




VLRGEGLDDRVADTDPQREGMPPLQPQPLAEEARRTA
TTCTTCTGCTGGGCATCTGGCTCTTTTCGGTTACGATCCGT




MLAGTLVQRIAELVRDEPRTNFALLRGFSRRPRLDPFP
TGCGTTATGTCATTGGCCGCGGCGTCCTGGAGGCCCTGGG




ERYRARAGAVAVYPMYRGLASLVGMDLLPVAGDTLA
CATTGGTTTCGACCTCCATCCCGGTGATGTGGCCGTCCGT




DEIASLKENWPEYDYFFLHVKGTDSRGEDGDWAGKIK
GCTAATTTCGCAACCGTCCAAGACACGCGGAACGGTCCA




IIEEFDAQLPAILDLNPDALVITGDHSTPATYAAHSWHP
GTCGTGACGGATCGACGTGCGGGCCGTCCGCCGACGGAA




VPFLLYSRWVLPDRDAPGFGEHACARGVLGGFPLLYT
CATACTCGTAGTATCTGTCGTCGCCTGCAGGACGCAATTC




MNLLLANAGRLGKFSA
CGGAGATTGACGGTGTACGTGTCTTCATTGAGCCGGTTAA




(SEQ ID NO: 33)
AGAACATAGATTCGTGATTGTGCTGCGAGGCGAAGGTCT





GGATGATCGCGTCGCCGACACGGATCCCCAACGTGAAGG





GATGCCTCCGTTACAACCGCAACCGCTTGCTGAAGAAGCT





CGTCGCACAGCGATGCTGGCGGGAACCCTGGTGCAACGG





ATTGCTGAGTTAGTCCGCGATGAGCCTCGTACTAATTTTG





CTCTGCTGCGCGGGTTCTCTCGCCGTCCTCGCCTGGACCC





GTTCCCAGAACGTTATCGTGCCCGCGCAGGAGCAGTGGC





AGTCTATCCGATGTATCGCGGTCTGGCATCCCTGGTCGGT





ATGGATCTGCTGCCAGTCGCCGGGGATACGCTTGCCGACG





AAATTGCGAGCCTCAAGGAAAACTGGCCTGAGTATGATT





ACTTCTTTCTGCACGTTAAAGGCACGGACAGTCGCGGTGA





AGATGGTGATTGGGCAGGCAAAATCAAGATTATTGAGGA





ATTTGACGCCCAGCTGCCTGCAATTCTAGATTTAAATCCC





GATGCGTTGGTGATTACAGGCGATCACAGTACGCCTGCTA





CGTACGCGGCCCATAGCTGGCATCCTGTGCCTTTTCTGTT





GTACAGCCGCTGGGTCCTGCCGGATCGCGATGCGCCAGG





TTTCGGCGAACACGCATGCGCCCGTGGAGTGCTGGGTCA





GTTCCCGCTGTTGTATACGATGAATCTTTTGTTGGCCAAT





GCTGGGCGTCTCGGCAAATTCAGCGCC





(SEQ ID NO: 34)





4YVBX

Pyrococcus

MNKRFPFPVGEPDFIQGDEAIARAAILAGCRFYAGYPIT
ATGAATAAACGGTTTCCGTTCCCGGTGGGAGAACCTGATT




furiosus

PASEIFEAMALYMPLVDGVVIQMEDEIASIAAAIGASW
TTATTCAGGGTGATGAGGCTATCGCTCGTGCAGCCATTTT




AGAKAMTATSGPGFSLMQENIGYAVMTETPVVIVDVQ
AGCCGGATGTCGTTTTTATGCGGGATACCCGATCAGCCC




RSGPSTGQPTLPAQGDIMQAIWGTHGDHSLIVLSPSTV
GCGTCGGAAATCTTCGAAGCGATGGCACTATATATGCCGC




QEAFDFTIRAFNLSEKYRTPVILLTDAEVGHMRERVYIP
TGGTCGATGGCGTAGTTATCCAGATGGAAGATGAGATTGC




NPDEIEIINRKLPRNEEEAKLPFGDPHGDGVPPMPIFGK
CGTCGATCGCGGCCGCCATCGGGGCAAGTTGGGCTGGTG




GYRTYVTGLTHDEKGRPRTVDREVHERLIKRIVEKIEK
CTAAGGCGATGACCGCTACCTCTGGGCCCGGATTCAGCCT




NKKDIFTYETYELEDAEIGVVATGIVARSALRAVKMLR
GATGCAAGAAAACATTGGTTACGCGGTTATGACAGAAAC




EEGIKAGLLKIETIWPFDFELIERIAERVDKLYVPEMNL
GCCTGTGGTTATAGTCGACGTGCAGCGTAGCGGTCCAAGC




GQLYHLIKEGANGKAEVKLISKIGGEVHTPMEIFEFIRR
ACGGGACAACCGACCCTGCCTGCGCAAGGCGATATTATG




EFK
CAGGCGATTTGGGGCACGCATGGCGACCACAGCCTGATA




(SEQ ID NO: 35)
GTTCTGTCACCGTCGACGGTCCAGGAGGCGTTCGATTTTA





CGATTCGTGCGTTCAACCTGTCCGAAAAGTACCGTACCCC





GGTCATCCTGCTCACCGATGCCGAAGTGGGACATATGCG





GGAACGTGTTTATATCCCGAACCCAGATGAAATCGAAATT





ATTAATCGTAAGCTGCCGCGCAACGAAGAGGAAGCAAAA





TTACCGTTCGGTGATCCGCACGGCGATGGGGTTCCCCCCA





TGCCTATTTTCGGGAAAGGTTACAGGACGTATGTGACCGG





CCTGACCCATGATGAAAAAGGTCGCCCACGCACAGTCGA





TCGTGAAGTGCATGAACGCCTGATTAAACGTATAGTTGAA





AAAATAGAAAAGAACAAGAAAGATATCTTTACGTACGAA





ACGTATGAGCTGGAAGATGCCGAAATTGGAGTGGTTGCA





ACGGGTATTGTGGCCCGTTCGGCCTTACGTGCTGTCAAAA





TGCTGCGCGAAGAGGGCATCAAAGCGGGCCTGTTGAAAA





TTGAAACTATTTGGCCGTTTGACTTCGAATTAATCGAGCG





TATTGCGGAACGCGTGGATAAACTGTATGTACCGGAAAT





GAACTTAGGGCAGCTGTATCACCTGATTAAGGAAGGCGC





GAACGGCAAAGCGGAAGTTAAATTAATCAGCAAGATCGG





TGGAGAAGTGCATACCCCGATGGAGATCTTTGAATTTATT





CGTCGCGAATTCAAA





(SEQ ID NO: 36)





C4L9G3_TOLAT

Tolumonas auensis

MTEQWQSLDSLNALWSALLIEELARLGIRDICIAPGSRS
ATGACCGAACAGTGGCAGTCCCTCGATTCTCTGAATGCCT




TPLTLAAAANPAISTHLHFDERGLGFLALGLAQGSQRP
TGTGGTCTGCGCTGTTGATTGAAGAGCTCGCACGCCTGGG




VAVIVTSGSAVANLLPAVVEARQSGIPLWLLTADRPADE
GATTCGGGATATTTGTATTGCCCCAGGCAGCCGCTCAACC




LLGCGANQAITQANIFANYPVVYQQLFPAPDHDITPSWL
CCTCTTACTCTGGCCGCCGCTGCTAACCCGGCGATCTCAA




LASVDQAAFQQQQTPGPVHLNCPFREPLYPVAGQQIPG
CTCATTTGCATTTTGACGAACGCGGGTTAGGTTTTCTTGCC




NALRGLTHWLRSAQPWTQYHAVQPICQTHPLWAEVR
CTGGGGTTGGCGCAGGGGAGCCAGCGTCCGGTCGCGGTT




QSKGIIIAGRLSRQQDTGAILKLAQQTGWPLLADIQSQL
ATCGTGACGTCTGGAAGCGCGGTCGCAAACCTGCTGCCC




RFHPQAMTYADLALHHPAFREELAQAETLLLFGGRLT
GCTGTCGTCGAAGCACGCCAGAGTGGCATTCCGCTTTGGT




SKRLQQFADGHNWQHCWQIDACSERLDSGLAVQQRF
TACTGACGGCGGATCGCCCAGCAGAATTGCTCGGTTGCG




VTSPELWCQAHQCEPHRIPWHQLPRWDKLAGLITQQ
GCGCCAATCAGGCGATCACGCAGGCAAACATATTTGCGA




LPEWGEITLCHQLNSQLQGQLFIGNSMPIRLLDMLGTS
ACTATCCAGTGTATCAGCAACTGTTTCCTGCTCCGGATCA




GAQPSHIYTNRGASGIDGLIATAAGIARANTSQPTTLLL
TGATATTACTCCTAGCTGGCTGCTGGCGAGTGTGGACCAG




GDSSALYDLNSLALLRELTAPFVLIIINNDGGNIFHMLP
GCAGCTTTCCAGCAGCAACAGACGCCGGGACCCGTACAT




VPEQNQIRERFYQLPHGLDFRASAEQFRLAYAAPTGAI
CTGAACTGTCCGTTCCGAGAACCACTGTACCCGGTCGCGG




SFRQAYQQALSHPGATLLECKVATGEAADWLKNFAL
GCCAGCAGATTCCGGGTAATGCACTGCGCGGTCTGACCC




QVRSLPA
ACTGGTTACGCTCTGCGCAACCGTGGACACAGTATCATGC




(SEQ ID NO: 37)
GGTCCAACCTATCTGCCAAACCCACCCGCTTTGGGCAGAA





GTGCGCCAGAGCAAAGGCATTATTATTGCGGGCCGACTG





TCACGTCAGCAAGATACCGGTGCCATCCTGAAACTGGCTC





AACAGACCGGCTGGCCGCTGTTGGCTGATATTCAGTCGCA





GCTGCGTTTTCATCCGCAGGCCATGACGTACGCGGATCTG





GCACTCCATCATCCGGCGTTTCGTGAAGAACTAGCGCAGG





CAGAAACCCTCTTACTGTTTGGTGGTCGACTGACTTCGAA





ACGCCTGCAACAATTTGCAGATGGCCACAATTGGCAGCA





TTGCTGGCAGATTGACGCCGGGTCAGAGCGGCTGGACTC





GGGTCTTGCGGTCCAACAGCGTTTTGTGACTTCTCCAGAA





CTGTGGTGCCAGGCGCATCAGTGTGAGCCGCATCGTATCC





CGTGGCACCAACTGCCACGGTGGGACGGTAAACTGGCAG





GTCTGATTACCCAGCAGCTGCCGGAGTGGGGTGAGATTA





CACTATGCCATCAGCTGAACTCACAGTTACAAGGCCAGTT





ATTCATCGGGAATTCGATGCCAATCCGCCTGCTGGATATG





CTCGGCACCAGCGGCGCGCAGCCATCGCATATTTACACTA





ACCGGGGCGCAAGTGGCATTGACGGGCTAATCGCCACGG





CCGCGGGTATCGCCCGTGCGAATACAAGCCAGCCGACGA





CCCTGCTTCTGGGGGACAGCAGCGCCCTGTACGACTTGAA





CAGCCTGGCACTATTACGCGAACTGACCGCTCCGTTCGTA





CTGATCATAATCAATAATGACGGCGGCAATATCTTTCATA





TGCTGCCGGTTCCAGAGCAGAATCAGATTCGCGAACGGTT





CTATCAGCTGCCGCATGGCCTGGACTTTCGCGCTAGTGCC





GAACAATTCCGATTAGCGTATGCCGCGCCCACCGGAGCC





ATCTCCTTTCGTCAAGCGTACCAACAAGCCCTGAGCCATC





CGGGGGCGACACTGCTGGAGTGCAAAGTTGCCACGGGCG





AAGCCGCAGATTGGCTCAAAAATTTTGCGCTCCAAGTCCG





CAGTCTTCCGGCG





(SEQ ID NO: 38)





A0A0K1FGX4_9FIRM

Selenomonas noxia

MNANDLIAALGAEFFTGVPDSKLRPLVDCLMDTYGAN
ATGAATGCTAACGATCTCATTGCGGCACTGGGTGCCGAAT



ATCC 43541
SPSHIIAANEGNAAALAAGYHLAAGKVPLVYLQNSGL
TCTTCACTGGCGTTCCCGATTCTAAATTGCGCCCGTTGGTT




GNIVNPLLSLLHAEVYGIPCIFVIGWRGEPDLHDEPQHL
GATTGCCTGATGGATACCTATGGCGCTAATTCACCAAGCC




VQGRLTLPLLETIGVKTMVLTEASQPEDVSAWMEQIRP
ACATCATTGCGGCCAACGAGGGGAATGCCGCGGCTCTGG




HLAAGGQCALLVRKGALTHPKHKYANENPLRREDAIA
CCGCTGGCTACCACTTAGCTGCAGGTAAAGTTCCTCTGGT




RILDAAQGAVVVATTGKTGRELFELRAARGEDHAHDF
TTACCTGCAGAACAGTGGGTTGGGTAATATCGTCAATCCG




LTVGSMGHAGAIALGIALHRPSQRVFLLDGDGAALMH
TTGTTATCATTACTGCATGCGGAAGTATATGGCATTCCGT




MGAMATIGAAAPANIVHVLLNNEAHESVGGAPTAAH
GCATCTTCGTGATTGGTTGGCGCGGTGAACCTGACTTACA




TVDFPAVARAVGYRLVQTAADAAELAQILPAVGRSDA
TGACGAACCGCAACACCTGGTCCAGGGTCGTTTGACCCTT




LTFLEVRTAIGSRADLGRPTTTPTENKEALMRTLRE
CCGTTACTGGAAACCATTGGCGTGAAAACAATGGTACTG




(SEQ ID NO: 39)
ACCGAAGCGAGCCAGCCGGAAGATGTCTCCGCCTGGATG





GAACAAATTCGTCCGCATCTGGCAGCGGGGGGCCAGTGC





GCCTTGCTGGTGCGCAAGGGCGCGCTGACTCATCCGAAA





CACAAATATGCAAACGAAAACCCCCTGCGTCGCGAGGAT





GCAATCGCACGGATCCTCGATGCAGCGCAGGGCGCTGTT





GTTGTGGCCACCACCGGCAAAACCGGTCGTGAACTGTTTG





AACTGCGCGCCGCCCGCGGCGAAGACCATGCCCATGATT





TCCTGACCGTGGGTAGTATGGGTCACGCCGGTGCAATCGC





ACTGGGTATTGCCCTGCACCGGCCGTCCCAACGCGTATTT





TTACTGGATGGGGATGGCGCGGCCCTGATGCATATGGGT





GCGATGGCAACCATTGGTGCAGCGGCACCCGCCAACATC





GTGCACGTCCTGCTGAATAACGAAGCGCATGAATCTGTG





GGCGGCGCACCAACCGCAGCTCACACCGTCGATTTTCCGG





CGGTAGCCCGCGCCGTGGGCTACCGTTTAGTACAGACTGC





GGCGGATGCCGCAGAACTGGCGCAGATTCTGCCAGCAGT





GGGCCGCAGCGACGCCCTGACGTTCTTGGAAGTTCGTACT





GCTATTGGTTCACGCGCAGACCTGGGTCGTCCTACTACTA





CCCCAACCGAAAACAAAGAGGCACTTATGCGTACGCTGC





GCGAA





(SEQ ID NO: 40)





A0A0R2PY37_9ACTN

Acidimicrobium sp.

MASSEKMRVGEAIIDLLVREYELDTVFGIPGVHNIELFR
ATGGCGAGCTCTGAGAAAATGCGCGTAGGCGAAGCGATT



BACL17
GLHSSGVRVVAPRHEQGAGFMADGWSIATGKPGVCA
ATAGATCTGCTGGTGCGCGAATATGAACTAGATACCGTGT




LISGPGLTNAITPIAQAYHDSRAMLVLASTTPTHSLGKK
TCGGGATTCCCGGAGTGCACAACATTGAGCTGTTTAGAGG




FGPLHDLDDQSAVVRTVTAFSETVTDPTQFPQLIERAW
CTTACATAGCTCTGGTGTGCGCGTCGTTGCGCCTCGCCAT




NVFTSSRPRPVHIAIPTDVLEQFVDPFTRVTTDISKPVA
GAACAAGGTGCAGGCTTTATGGCGGACGGCTGGAGCATT




QDSDIQRAAQLLAAAKRPMIIAGGGALGTGALISNIAT
GCTACAGGCAAACCTGGTGTCTGCGCCTTGATAAGTGGGC




AIDSPIVLTGNAKGEVPSTHPLCVGSAMVIPRVQEEIEQ
CGGGCTTAACCAATGCAATAACCCCGATAGCGCAAGCGT




SDVVLVIGSEISDADLYNGGRAQGFSGSVIRIDIDTEQIS
ACCACGATAGTCGCGCGATGTTAGTCCTGGCGAGTACTAC




RRVAPHVSLVADAADSLSRISAELTKAGVALTNSGSAR
GCCGACGCACAGCCTGGGCAAAAAATTTGGCCCATTACA




ATNLRMAARSGVRQDLLPWIDAIEQSVPDNTLVAVDS
CGATCTTGACGATCAGTCCGCCGTGGTGCGTACCGTGACT




TQLAYAAHTVMSCNSPRSWLAPFGPGTLGCALPMAIG
GCTTTTTCAGAGACTGTTACAGATCCTACGCAGTTCCCAC




AAIADTTRPVLAIAGDGGWLFTLAEMAAAIDEGIDMV
AGCTGATTGAACGGGCGTGGAATGTTTTCACATCATCTCG




LVLWDNRGYGQIRESFDDVRAPRMGVDVSSHDPSAIA
TCCGCGTCCAGTTCATATCGCAATCCCGACCGACGTGCTG




NGFGWNAIDVTTIEAFRIVLSEAFENRGAHFIRISVS
GAGCAGTTTGTGGATCCGTTTACGCGAGTGACCACCGATA




(SEQ ID NO: 41)
TTTCGAAACCAGTGGCCCAGGACTCCGATATTCAAAGAG





CGGCGCAGCTCCTAGCAGCGGCCAAACGTCCCATGATCA





TTGCGGGCGGAGGCGCTCTGGGCACAGGTGCATTGATCTC





GAACATTGCCACAGCTATTGATAGCCCGATCGTGTTGACC





GGTAATGCGAAGGGTGAGGTACCGAGTACCCACCCGTTA





TGTGTCGGCTCTGCTATGGTTATTCCACGCGTGCAGGAAG





AAATCGAACAAAGTGATGTCGTTTTGGTGATTGGCAGCG





AAATCTCTGATGCAGACCTGTACAACGGTGGTCGCGCCCA





GGGATTTTCTGGTAGCGTTATCCGCATCGACATTGATACC





GAGCAGATTAGTCGTCGAGTGGCCCCGCACGTCAGCCTG





GTGGCTGATGCGGCGGATTCCTTGTCACGTATTTCTGCCG





AACTGACAAAGGCCGGTGTGGCGCTGACGAATTCTGGCA





GCGCACGTGCGACGAATTTACGTATGGCAGCCCGTAGCG





GCGTGCGACAAGACCTGCTGGCGTGGATCGATGCCATTG





AACAATCCGTGCCGGACAACACGCTGGTGGCGGTAGATT





CAACCCAGCTGGCGTATGCGGCGCATACAGTCATGAGTT





GTAATTCTCCGCGTTCTTGGTTAGCGCCATTCGGCTTTGGT





ACGCTTGGTTGTGCCCTTCCAATGGCGATCGGCGCCGCAA





TCGCGGATACGACCCGTCCAGTCCTGGCCATTGCGGGCGA





TGGTGGTTGGCTGTTTACCTTAGCCGAAATGGCGGCAGCA





ATCGACGAAGGCATTGATATGGTTCTTGTACTGTGGGATA





ATCGCGGCTATGGACAAATCCGTGAAAGCTTCGACGATG





TGCGAGCACCCCGTATGGGTGTAGATGTTTCAAGCCATGA





CCCTTCCGCAATAGCCAACGGCTTCGGTTGGAACGCGATT





GACGTGACCACCATTGAGGCGTTCCGAATTGTTCTGTCGG





AAGCGTTTGAGAACCGTGGTGCTCACTTTATTCGTATTTC





CGTGAGC





(SEQ ID NO: 42)





X1WK73_ACYPI

Acyrthosiphon pisum

MQEADFEVNHARNADIPIVGDAKQTLSQMLELLAQSD
ATGCAGGAAGCGGATTTTGAAGTGAATCATGCGCGTAAC




AKQELDSLRDWQTIDGWRSRKCLEFDRTSDKIKPQA
GCGGACATTCCGATCGTCGGAGACGCGAAACAGACTCTG




VIETIWRLTKGDAYVTSDVGQHQMFAALYYQFDKPRR
TCGCAGATGCTGGAACTCCTGGCGCAATCAGACGCTAAA




WINSGGLGTMGFGLPAALGVKMALPDETVICVTGDGS
CAGGAGCTTGACTCCCTGCGCGACTGGTGGCAGACCATTG




IQMNIQELSTALQYDLPVLVLNLNNGFLGMVKQWQD
ATGGATGGCGGAGTCGCAAATGCCTGGAATTTGATCGTA




MIYSGRHSQSYMQSLPDFVRLAEAYGHVGISIAHPAEL
CGTCAGATAAGATCAAACCACAAGCGGTTATTGAGACGA




EEKLQLALDTLAKGRLVFVDVNIDGSEHVYPMQIRGG
TTTGGCGCCTGACCAAAGGCGATGCCTACGTGACTTCCGA




VIVKLDEIARLAGVSRTTASYVINGKARQYRVSDKTVE
TGTCGGCCAACACCAGATGTTCGCGGCACTGTACTACCAG




KVMAVVREHNYHPNAVAAGLRAGRTRSIGLVIPDLEN
TTTGATAAGCCGAGACGTTGGATTAACAGTGGTGGCCTTG




TSYTRIANYLERQARQRGYQLLIACSEQQPDNEMRCIE
GCACGATGGGTTTTGGGCTCCCGGCGGCGCTGGGTGTTAA




HLLQRQVDAIIVSTSLPPEHPFYQRWINDPLPIIALDRAL
AATGGCACTTCCCGATGAGACAGTAATCTGCGTTACGGGC




DREHFTSVVGADQDDAHALAAELRQLPVKNVLFLGA
GACGGTTCGATTCAGATGAATATCCAGGAACTGTCTACTG




LPELSVSFLREMGFRDAWKDDERMVDYLYCNSFDRT
CGTTACAGTACGATTTGCCGGTACTGGTGCTGAACTTGAA




AAATLFEKYLEDHPMPDALFTTSFGLLQGVMDITLKR
CAACGGTTTTCTTGGCATGGTTAAACAATGGCAGGATATG




DGRLPTDLAIATPGDHELLDFLECPVLAVGQRHRDVA
ATCTATAGCGGCCGCCATAGCCAGAGCTACATGCAATCCC




ERVLELVLASLDEPRKPKPGLTRIRRNLFRRGQLSRRT
TTCCGGATTTCGTACGCCTGGCAGAAGCGTACGGGCATGT




K
CGGGATAAGCATCGCGCACCCGGCTGAACTGGAAGAAAA




(SEQ ID NO: 43)
ATTACAGCTGGCCTTAGATACGCTGGCAAAGGGGCGCCTT





GTGTTTGTTGATGTCAATATTGACGGGAGTGAACATGTAT





ATCCCATGCAAATCCGTGGTGGTGTTATTGTGAAGCTCGA





TGAGATCGCACGCCTGGCAGGAGTATCTCGTACCACAGC





CTCGTACGTCATTAATGGAAAGGCACGTCAGTACCGAGTC





TCCGATAAAACGGTCGAAAAGGTGATGGCGGTGGTGCGC





GAACATAACTATCATCCTAATGCTGTGGCTGCTGGTTTGC





GGGCAGGACGTACTCGTAGCATTGGATTAGTAATCCCGG





ATCTGGAAAACACATCATACACGCGCATTGCGAACTATCT





GGAACGCCAGGCGCGCCAGCGCGGCTATCAGCTGTTAAT





CGCTTGCAGCGAGGACCAGCCAGATAATGAAATGCGCTG





CATCGAACACTTGCTGCAACGACAGGTGGACGCCATTATT





GTCTCTACTTCCCTGCCCCCGGAACATCCGTTCTACCAAC





GCTGGATCAACGATCCACTCCCGATCATCGCGCTGGATCG





TGCGCTGGACCGCGAGCATTTTACGAGCGTAGTAGGGGC





CGATCAGGACGATGCCCATGCCCTAGCCGCCGAACTTCGT





CAGCTTCCGGTCAAAAACGTGCTGTTTCTGGGCGCCCTGC





CGGAACTGAGCGTGTCGTTTTTGCGTGAAATGGGCTTCCG





TGACGCCTGGAAAGATGATGAACGAATGGTCGATTACCT





GTATTGTAACAGCTTCGATCGTACGGCCGCAGCTACCCTG





TTTGAGAAATATCTCGAAGATCACCTGATGCCGGATGCGT





TGTTCACTACCTCCTTCGGTTTGCTGCAGGGTGTGATGGA





TATTACACTAAAACGCGACGGCCGCTTGCCGACCGATCTG





GCGATCGCGACCTTTGGGGACCATGAATTATTGGACTTCT





TGGAATGTCCGGTCCTGGCTGTGGGCCAACGCCACCGGG





ATGTGGCGGAACGCGTCCTGGAACTGGTGCTGGCCAGCC





TGGATGAACCGCGCAAACCGAAACCAGGTCTGACGCGCA





TCCGTCGCAACCTGTTTCGGCGCGGCCAGCTTAGCCGTCG





GACCAAA





(SEQ ID NO: 44)





B1HLR4_BURPE

Burkholderia

MKTEDLIGILTDAGVDLAVGVPDSLLKSFCGRLNDPDC
ATGAAAACCGAAGACCTGATAGGCATCCTGACGGATGCT




pseudomallei

PLRHLVASSEGGAVGIAIGHHLATGGLAAVYMQNSGI
GGTGTAGATCTCGCAGTCGGAGTCCCGGACAGCTTACTGA




GNAINPLVSLADRAVYGIPLVLIVGWRAEISASGAQVH
AAAGTTTTGTGGTCGTCTGAATGACCCGGACTGCCCGCT




DEPQHVTQGRITLPLLDALSIRHLVLERAGGENDALAP
ACGGCACCTGGTAGCATCATCAGAGGGTGGTGCCGTAGG




SIARLIAGARQTSQPVALVVRKDAFDDASASRPGAAAP
GATTGCGATTGGTCACCATCTCGCCACCGGGGGCCTGGCC




HAGRMTREQAIALIVEHADAGTAIVSTTGVASRELYEL
GCGGTATATATGCAAAACTCAGGTATCGGTAACGCCATC




RDRLGHSHARDFLTVGGMGHASQIAVGIALARPAQKV
AACCCTCTTGTTTCGCTGGCAGACCGCGCTGTGTACGGCA




ICIDGDGALLMHMGGLAYCAGAPNLTHVVINNGVHDS
TTCCGCTGGTTCTTATCGTGGGATGGCGTGCGGAAATCTC




VGGQPTLAAHLRLSHIAASCGYAFSRSVATPIELESALH
TGCCAGTGGCGCACAGGTACACGACGAGCCACAACACGT




HASRLDGSAFIEVTCRPGYRSDLGRPRTSPAENKRHFM
GACGCAGGGACGCATTACCTTACCGCTGCTGGACGCGCT




AFLSRNGATHERDDHAQESGIQDAVQCARH
GTCGATTCGCCACTTGGTTCTGGAACGCGCGGGAGGCGA




(SEQ ID NO: 45)
AAATGACGCTCTGGCCCCCTCTATTGCGCGCTTGATTGCG





GGCGCGCGTCAAACTAGCCAGCCGGTTGCTCTGGTGGTGC





GTAAGGATGCGTTCGATGATGCTTCTGCAAGTCGTCCTGG





CGCCGCTGCTCCACACGCAGGTCGCATGACCCGTGAACA





AGCGATTGCCCTGATTGTTGAGCATGCGGACGCAGGTACC





GCCATTGTAAGTACCACTGGCGTGGCATCGCGCGAACTTT





ACGAATTACGCGACCGTTTAGGTCATTCCCATGCCCGCGA





TTTTCTGACCGTCGGCGGCATGGGTCATGCCTCTCAGATC





GCAGTGGGAATTGCGCTGGCACGCCCCGCGCAGAAAGTC





ATTTGCATTGATGGTGATGGCGCACTGTTGATGACATGG





GTGGTCTGGCATATTGTGCGGGCGCCCCAAACCTGACACA





CGTGGTGATTAATAACGGAGTTCATGATAGTGTCGGAGG





CCAGCCGACCCTGGCTGCCCATTTGCGCCTGTCACACATC





GCGGCAAGCTGCGGCTACGCATTTTCACGCAGCGTAGCA





ACGCCTATAGAACTTGAATCAGCGCTGCACCACGCTAGC





AGACTGGATGGCTCAGCGTTCATTGAAGTGACCTGTCGTC





CGGGCTATCGCAGCGATCTGGGCCGTCCTCGTACGTCCCC





GGCCGAAAATAAACGCCACTTTATGGCGTTCTTAAGCCGC





AACGGGGCCACCCATGAGCGTGATGACCACGCACAGGAA





TCGGGTATTCAAGACGCAGTGCAGTGCGCACGTCAT





(SEQ ID NO: 46)





X8CA07_MYCXE

Mycobacterium

MLAKHEFSAATMADGYSRCGQKLGVVAATSGGAALN
ATGCTGGCGAAACATGAGTTCTCCGCAGCGACCATGGCG




xenopi 3993

LVPGLGESLASRVPVLALVGQPATTMDGRGSFQDTSG
GATGGTTACAGCCGTTGCGGTCAAAAACTGGGCGTAGTT




RNGSLDAEALFSAVSVFCRRVLKPADIITALPAAVAAA
GCGGCGACGAGCGGCGGTGCGGCACTGAACTTGGTCCCA




QTGGPAVLLLPKDIQQTQVGINGYAEHGVAPSRSVGD
GGCTTAGGTGAAAGCTTAGCGTCACGAGTGCCGGTGTTG




PHSIVRALRQVTGPVTIIAGEQVARDDARAELEWLRAV
GCGCTGGTGGGCCAGCCGGCGACCACCATGGATGGGAGA




LRARVACVPDAKDVAGTPGFGSSSALGVTGVMGHPG
GGCTCCTTCCAGGACACGAGTGGCCGCAATGGCAGCTTG




VADALAKSALCLVVGTRLSVTARTGLDDALAAVRVV
GACGCTGAAGCATTGTTCTCTGCCGTGTCCGTGTTTTGCC




SIGSAPPYVCTHVHTDDLRASLRLLTAALSGRGRPTG
GTCGTGTACTTAAACCAGCTGACATTATTACTGCATTACC




VRVPDAVVRTELTPRRSTVPACAIATR
AGCAGCAGTTGCTGCGGCCCAGACCGGTGGTCCTGCAGT




(SEQ ID NO: 47)
CCTGCTGCTTCCGAAAGACATTCAACAGACTCAAGTGGGC





ATCAACGGTTACGCAGAACATGGCGTCGCCTCCGAGTCGC





TCAGTAGGCGATCCGCATTCAATTGTGCGTGCCCTTCCTTC





AGGTGACTGGGCCGGTGACTATAATTGCCGGGGAACAAG





TGGCCCGTGATGATGCGCGCGCGGAACTTGAATGGTTGC





GAGCTGTATTAAGAGCACGTGTTGCTTGTGTACCTGATGC





AAAAGATGTTGCGGGGACGCCAGGCTTCGGTTCCTCTTCC





GCGCTGGGCGTCACTGGTGTGATGGGTCATCCGGGCGTG





GCTGACGCGCTGGCTAAAAGCGCCCTGTGTTTAGTTGTCG





GTACGCGTTTGTCGGTCACAGCACGTACGGGCCTGGATGA





TGCGCTGGCCGCTGTCCGCGTTGTGAGCATCGGTTCCGCG





CCGCCGTACGTGCCATGTACGCATGTGCATACTGATGACC





TGCGTGCTTCCTTACGACTGCTCACCGCGGCGTTATCAGG





TCGCGGTCGTCCGACCGGGGTACGTGTTCCTGATGCGGTG





GTGCGCACGGAACTGACTCCTCGTCGTAGCACCGTTCCGG





CATGTGCCATTGCGACGCGT





(SEQ ID NO: 48)





D1Y3P7_9BACT

Pyramidobacter

MQISSFIAQLQRIASSHFLGVPDSQLKALCNYLYKNCGI
ATGCAGATTTCGTCCTTCATTGCGCAGTTACAGCGCATCG




piscolens W5455

SSDHIIAANEGNCTALAAGYYLATGKVVVYMQNSGL
CAAGCTCACATTTTTTAGGAGTGCCGGACAGCCAGCTCAA




GNVVNPVASLLNDKVYGIPCVFVIGWRGEPGLKDEPQ
AGCTTTGTGTAATTATCTGTACAAAAACTGTGGCATCTCA




HIFQGAVTLDLLKVMDIASFVVRKDTTEQELAAQMAE
AGTGACCACATCATTGCCGCGAACGAAGGCAACTGTACT




FQPLLAAGKSVAFVIAKEALTVDEKVSFKNTDFTMTREE
GCGCTGGCTGCGGGGTATTACCTGGCTACGGGCAAGGTG




VIRHITAFSGEDPIVSTTGKASRELFEIRVRNGQPHKYD
CCGGTTGTTTACATGCAGAACAGCGGGTTAGGGAATGTTG




FLTVGSMGHSSSIALGIALKPHTKIWCIDGDGAALMH
TGAATCCGGTTGCGTCCTTGCTGAATGACAAAGTGTACGG




MGALAVIGSQRPRNLVHIVINNGAHESVGGLPTVARSA
GATCCCGTGTGTGTTTGTCATTGGCTGGCGGGGCGAGCCC




SLAKVAEACGYVNVKTVGTFAELDAALKDARNADEL
GGCCTCAAGGACGAACCTCAACACATCTTCCAGGGCGCG




TFIEAKTAIGARADLGRPTTSAMENRDGFMAYLKELR
GTGACTCTGGATCTGCTTAAAGTAATGGATATCGCGAGCT




(SEQ ID NO: 49)
TCGTTGTCCGTAAAGATACCACGGAACAGGAATTAGCGG





CCCAGATGGCTGAGTTTCAACCGCTGCTGGCGGCCGGCA





AATCGGTTGCCTTCGTCATTGCAAAAGAAGCCCTGACGTA





CGATGAGAAAGTAAGTTTTAAAAACGACTTCACTATGACT





CGCGAAGAAGTGATTCGTCATATCACAGCGTTTTCCGGCG





AAGACCCTATCGTGAGCACCACCGGAAAAGCTAGCCGCG





AATTATTCGAAATTCGAGTCCGTAACGGTCAGCCCCACAA





ATACGATTTCCTGACTGTGGGCTCTATGGGCCATAGCAGT





TCTATTGCGCTGGGTATTGCACTATCGAAGCCCCACACGA





AAATATGGTGTATCGATGGCGACGGTGCCGCCCTGATGC





ATATGGGGGCCCTGGCGGTGATTGGTAGCCAACGTCCGC





GCAATTTAGTCCATATTGTTATTAATAATGGTGCCCATGA





GAGCGTTGGTGGTCTTCCGACCGTGGCACGGTCTGCGAGT





CTGGCGAAAGTCGCAGAAGCCTGTGGTTATGTTAACGTA





AAAACGGTGGGTACCTTTGCAGAGTTAGATGCAGCTTTAA





AAGACGCCCGTAACGCCGATGAACTGACTTTTATAGAAG





CCAAAACCGCGATCGGAGCCCGCGCGGATCTCGGTCGCC





CAACCACCTCCGCTATGGAAAACCGTGACGGATTTATGGC





CTATCTGAAGGAGCTGCGT





(SEQ ID NO: 50)





F4RJP4_MELLP

Melampsora larici-

MPAFSLVEIEAKMSFFSDFLNQVTCTPSVASKQIYVSKV
ATGCCGGCATTCTCCCTGGTAGAGATAGAAGCGAAAATG




populina

LIQITNFDQLDFDFQIKILNQVTLHPSQPKLTQEEKSKLL
TCCTTTTTTTCTGATTTTCTGAATCAAGTCAAGACGCCGAG




NNTSILRDSIVFFTDTGAARGVGGHAGGPFDTVREVVL
TGTCGCCTCAAAGCAAATTTATGTTAGCAAAGTGCTTATT




LLASFASGSDSKIFDHTVSDEAGHRAQSKLPGHPQLGL
CAGATTACTAACTTTGATCAGCTGGATTTTGACTTTCAAA




TPGVKFSSVVVDWATCGLFSRVSHSPTETVFCFCSDGS
TCAAGATCCTCAACCAGGTTACTCTGCATCCATCCCAGCC




QHEGSDAEAARLARAQKLNIKLLIDNNNVTISGHTSGY
AAAATTGACCCAGGAGGAAAAATCAAAACTCTTGAACAA




LKGYKVGKTLEAHALKIVRAEGEKYTGCNDVKSKVIR
CACGAGTATCCTGCGCGATAGTATCGTCTTCTTCACGGAT




INFDLKGSTGFEAIHQSRPGIFIPSVIVEHGNFCAAAGFG
ACGGGTGCAGCACGTGGTGTAGGTGGTCACGCGGGCGGA




FEKGKEKMRKLDAVISFGEIVHRALDAGDQLGIEGFDV
CCATTTGATACCGTACGCGAGGTTGTGCTCCTGTTGGCTA




GLVNKSTLNVIDEKPWMNMDIRNLF
GCTTTGCCAGTGGGAGCGACAGCAAAATCTTTGATCATAC




(SEQ ID NO: 51)
TGTGTCAGATGAAGCGGGCCATCGTGCCCAATCAAAGCT





GCCGGGTCATCCGCAACTGGGTCTTACGCCGGGCGTGAA





ATTCAGCAGCGTGGTCGTAGATTGGGCGACCTGCGGTCTG





TTCAGCCGTGTGTCACACAGCCCAACGGAAACCGTGTTTT





GCTTTTGCAGCGATGGTAGTCAGCACGAAGGCAGCGATG





CGGAAGCCGCAAGACTGGCCCGTGCGCAGAAGCTTAACA





TTAAATTATTGATCGATAACAACAATGTAACTATCTCTGG





GCACACCAGCGGTTACCTTAAAGGATACAAAGTCGGTAA





AACGCTGGAAGCACATGCCTTAAAAATAGTACGTGCAGA





AGGTGAAAAATATACCGGCTGCAACGATGTGAAATCTAA





GGTGATACGGATCAACTTTGACCTCAAAGGTTCTACCGGC





TTCGAGGCGATTCATCAGTCCCGCCCGGGTATTTTCATTC





CGTCGGTAATCGTGGAACATGGCAATTTTGCGCAGCAGC





GGGTTTCGGATTTGAAAAAGGCAAAGAAAAGATGCGTAA





GCTGGCGCTGTTATTTCTTTTGGCGAGATTGTTCATCGTG





CCTTGGACGCCGGCGATCAACTGGGCATAGAGGGGTTTG





ATGTCGGCCTCGTAAACAAAAGTACCCTGAATGTGATTGA





TGAAAAGCCGTGGATGAACATGGATATCCGCAACCTGTT





(SEQ ID NO: 52)





A0A081BQW3_9BACT

Candidatus

MTTLGNSRVAPRDALMELAERDPRYVLVCSDSGLVIK
ATGACCACGCTGGGAAACTCCCGCGTGGCGTTTCGCGATG




Moduliflexus

AQPFIEKFPQRFFDVGIAEQNAVGVAAGLASSGLVPFF
CCTTAATGGAGCTGGCAGAACGCGACCCGCGGTACGTAC




flocculans

ATYAGFITMRACEQVRTFVAYPGLNVKLVGANGGMA
TGGTGTGTTCGGATTCTGGCCTGGTGATTAAGGCCCAACC




SGEREGVTHQFFEDVGILRAIPGITVVVPADADQVVAA
TTTCATCGAGAAATTCCCCCAGCGCTTTTTGATGTTGGA




TKAVALKDGPAYIRIGSGRDPMVEGETPPFELGKVRIL
ATCGCGGAGCAGAACGCGGTTGGCGTGGCCGCGGGTCTG




KTYGHDVAIFAMGFIMNRALEAAAQLNSEGIRAVVVD
GCATCCAGCGGGTTGGTACCTTTTTTTGCGACCTACGCCG




VHTLKPLDVEAITAILQKTSAAVTVEDHNIIGGLGSAIA
GTTTTATCACGATGCGTGCTTGTGAACAGGTACGCACCTT




EVSAEEMPTPLRRIGLRDVYPESGHPEPLLDKYHLGVS
CGTCGCTTATCCGGGTCTGAACGTCAAACTGGTCGGCGCC




DIISAAKTVLKKKNHPPRRIAFSTRENAEEGFSNGNMG
AACGGCGGCATGGCGTCTGGGGAACGCGAAGGGGTCACG




EEIYE
CACCAGTTTTTCGAGGATGTCGGTATACTGCGTGCAATTC




(SEQ ID NO: 53)
CTGGCATTACAGTCGTCGTACCTGCCGATGCCGATCAGGT





AGTAGCGGCAACCAAAGCGGTAGCATTAAAAGATGGCCC





GGCCTATATACGTATCGGAAGCGGGCGTGACCCGATGGT





TGAGGGGGAAACCCCGCCTTTTGAACTTGGCAAAGTTCGT





ATTCTGAAAACCTACGGGCATGACGTAGCTATCTTCGCCA





TGGGTTTTATAATGAACCGCGCGCTTGAGGCAGCGGCGC





AACTGAACAGTGAAGGCATTCGGGCAGTTGTAGTAGACG





TGCACACCCTGAAACCCCTGGATGTGGAGGCAATTACCG





CGATCCTCCAGAAAACTTCTGCAGCGGTAACCGTGGAGG





ATCATAACATCATTGGCGGCCTCGGGAGCGCGATAGCCG





AGGTGTCGGCGGAGGAAATGCCGACCCCCCTGCGCCGTA





TTGGTCTGCGCGATGTTTATCCGGAAAGTGGTCACCCGGA





GCCTCTGCTGGATAAATACCACTTGGGCGTTAGCGACATC





ATCAGCGCCGCCAAGACGGTGCTGAAAAAAAAGAATCAC





CCGCCCCGCCGTATCGCCTTCAGCACCCGGGAAAATGCCG





AGGAGGGTTTCAGTAACGGCAATATGGGCGAGGAAATTT





ATGAAG





(SEQ ID NO: 54)





CAK95977

Pseudomonas

MKTVHGATYDILRQHGLTTIFGNPGSNELPFLKGFPED
ATGAAGACGGTCCACGGTGCAACCTACGACATCCTGCGC




fluorescens

FRYILGLHEGAVVGMADGYALASGQPTFVNLHAAAG
CAGCATGGTCTGACGACGATTTTTGGTAATCCGGGTGATA




TGNGMGALTNAWYSHSPLVITAGQQVRSMIGVEAML
ACGAACTGCCGTTTCTGAAAGGTTTCCCGGAAGACTTTCG




ANVDAAQLPKPLVKWSHEPATAQDVPRALSQAIHTAN
TTATATTCTGGGCCTGCATGAAGGTGCCGTGGTTGGCATG




LPPRGPVYVSIPYDDWACEAPSGVEHLARRQVSSAGLP
GCAGATGGTTACGCGCTGGCCAGTGGTCAGCCGACCTTTG




SPAQLQHLCERLAAARNPVLVLCPDVDGSAANGLAV
TGAACCTGCATGCGGCGGCGGGCACCGGTAACGGCATGG




QLAEKLRMPAWVAPSASRCPFPTRHACFRGVLPAAIA
GTGCACTGACGAATGCTTGGTATAGTCACTCCCCGCTGGT




GISHNLAGHDLILVVGAPVFRYHQFAPGNYLPAGCELL
TATTACGGCGGGTCAGCAAGTCCGCTCTATGATCGGCGTG




HLTCDPGEAARAPMGDALVGDIALTLEAVLDGVPQSV
GAAGCTATGCTGGCGAACGTGGACGGTGCACAGCTGCCG




RQMPTALPAAEPVADDGGLLRPETVFDLLNALAPKDA
AAACCGCTGGTTAAGTGGTCACATGAACCGGCAACCGCT




IYVKESTSTVGAFWRRVEMREPGSYFFPAAGGLGFGLP
CAGGATGTGCCGCGTGCGCTGTCGCAAGCCATTCACACG




AAVGVQLASPGRQVIGVIGDGSANYGITALWTAAQYN
GCAAATCTGCCGCCGCGCGGTCCGGTGTATGTTTCAATCC




IPVVFIILKNGTYGALRWFADVLDVNDAPGLDVPGLDF
CGTACGATGACTGGGCCTGCGAAGGACCGTCGGGTGTTG




CAIARGYGVQAVHAATGSAFAQALREALESDRPVLIE
AACATCTGGCGCGTCGCCAGGTCAGCTCTGCCGGCCTGCC




VPTQTIEP
GAGCCCGGCACAGCTGCAACACCTGTGTGAACGTCTGGC




(SEQ ID NO: 55)
CGCAGCTCGTAACCCGGTCCTGGTGCTGGGTCCGGATGTG





GATGGTTCTGCGGCCAATGGCCTGGCTGTTCAGCTGGCGG





AAAAGCTGCGTATGCCGGCTTGGGTGGCACCGTCAGCCTC





GCGCTGCCCGTTCCCGACCCGTCACGCCTGTTTTCGCGGT





GTTCTGCCGGCAGCTATTGCCGGTATCAGCCATAACCTGG





CAGGCCACGATCTGATTCTGGTCGTGGGTGCGCCGGTGTT





CCGTTATCATCAGTTTGCGCCGGGTAATTACCTGCCGGCG





GGTTGCGAACTGCTGCACCTGACCTGTGATCCGGGTGAAG





CAGCCCGCGCTCCGATGGGTGACGCGCTGGTTGGCGATAT





CGCCCTGACCCTGGAAGCAGTGCTGGATGGCGTTCCGCA





GAGCGTCCGTCAAATGCCGACGGCACTGCCGGCAGCTGA





ACCGGTGGCAGATGACGGTGGTCTGCTGCGTCCGGAAAC





CGTTTTCGACCTGCTGAACGCGCTGGCCCCGAAAGATGCC





ATTTATGTTAAGGAAAGCACCTCTACGGTCGGTGCATTCT





GGCGTCGCGTGGAAATGCGTGAACCGGGCTCCTACTTTTT





CCCGGCGGCCGGCGGTCTGGGTTTTGGTCTGCCGGCAGCT





GTTGGTGTCCAGCTGGCCAGTCCGGGTCGCCAAGTGATTG





GCGTTATCGGCGATGGTTCCGCTAACTATGGTATTACCGC





ACTGTGGACGGCGGCCCAGTACAACATCCCGGTTGTCTTC





ATTATCCTGAAAAATGGCACCTATGGTGCTCTGCGTTGGT





TTGCGGATGTCCTGGACGTGAATGATGCGCCGGGTCTGGA





CGTGCCGGGCCTGGATTTCTGCGCAATCGCTCGCGGCTAC





GGTGTTCAGGCAGTCCATGCAGCTACCGGCAGCGCATTTG





CCCAAGCACTGCGTGAAGCGCTGGAATCTGATCGCCCGG





TGCTGATTGAAGTTCCGACCCAGACGATCGAACCG





(SEQ ID NO: 56)





YP_831380

Arthrobacter sp.

MTTVHAAAYELLRSNRLTTIFGNPGDNELPFLDAMPA
ATGACGACGGTCCATGCCGCCGCCTATGAACTGCTGCGTA




DFRYILGLHEGVVVGMADGFAQASGQAAFVNLHAAS
GCAATCGCCTGACGACGATCTTTGGTAATCCGGGTGATAA




GTGNAMGALTNAWYTSHTPLVITAGQQVRPMIGLEAM
TGAACTGCCGTTTCTGGATGCAATGCCGGCTGACTTCCGG




LSNVDAASLPRPLVKWSAEPAQAPDVPRALSQAIHTAT
TATATTCTGGGCCTGCATGAGGGTGTGGTTGTCGGCATGG




SDPKGPVYLSIPYDDWNQDTGNLSEHLSSRSVSRAGNP
CGGATGGTTTTGCGCAGGCCAGCGGTCAAGCGGCCTTCGT




SAEQLDDILSALREAANPALVFGPDVDAARANHHAVR
TAACCTGCATGCAGCTTCTGGCACCGGTAACGCGATGGGC




LAEKLAAPVWIAPAAPRCPFPTRHPNFRGVLPASIAGIS
GCCCTGACGAATGCATGGTACAGTCACACCCCGCTGGTG




ALLNGHDLIVVIGAPVFRYHQYQPGSYLPENSRLIHITC
ATTACGGCGGGCCAGCAAGTTCGTCCGATGATCGGTCTGG




DAGEAARAPMGDALVADIGQTLRALADIIPQSRRPPLR
AAGCGATGCTGAGCAATGTTGATGCAGCCTCTCTGCCGCG




PRVIPPVPDSQDDLLAPDAVFEVMNEVAPEDVVYVNE
CCCGCTGGTCAAATGGTCTGCCGAACCGGCACAGGCTCC




SVSTVTALWERVELKHPGSYYFPASGGLGFGMPAAVG
GGATGTTCCGCGTGCGCTGAGCCAAGCCATTCATACCGCA




VQLANDRRRVIAVIGDGSANYGITALWTAAQEKIPVVF
ACGTCTGACCCGAAGGGTCCGGTGTATCTGAGTATCCCGT




IILNNGTYGALRAFAKLLNAENAAGLDVPGICFCAIAE
ACGATGACTGGAACCAGGATACCGGTAATCTGTCCGAAC




GYGVEAHRITSLENFKDKLSAALQSDTPTLLEVPTSTTS
ACCTGAGCAGCCGTAGCGTGAGCCGTGCGGGTAACCCGT




PF
CAGCTGAACAACTGGATGACATTCTGTCGGCACTGCGTGA




(SEQ ID NO: 57)
AGCAGCTAACCCGGCGCTGGTTTTTGGTCCGGATGTGGAT





GCGGCCCGCGCTAATCATCACGCGGTGCGTCTGGCCGAA





AAACTGGCAGCTCCGGTTTGGATCGCACCGGCGGCACCG





CGTTGCCCGTTTCCGACCCGCCATCCGAACTTCCGTGGCG





TTCTGCCGGCAAGTATTGCTGGCATCTCCGCCCTGCTGAA





TGGTCATGATCTGATTGTGGTTATCGGTGCACCGGTGTTC





CGTTATCACCAGTACCAACCGGGCAGTTATCTGCCGGAAA





ATTCCCGCCTGATTCACATCACCTGTGATGCAGGTGAAGC





AGCTCGTGCCCCGATGGGTGATGCGCTGGTTGCCGACATT





GGTCAGACGCTGCGCGCGCTGGCCGACATTATCCCGCAA





AGCAAACGTCCGCCGCTGCGCCCGCGTGTCATCCCGCCGG





TGCCGGATTCACAGGATGACCTGCTGGCACCGGACGCTGT





CTTTGAAGTGATGAACGAAGTCGCGCCGGAAGATGTCGT





GTATGTGAATGAATCAGTTTCGACCGTCACGGCCCTGTGG





GAACGTGTGGAACTGAAGCATCCGGGTTCATATTACTTTC





CGGCGTCGGGCGGTCTGGGTTTCGGTATGCCGGCGGCCGT





GGGTGTTCAGCTGGCCAACGATCGTCGCCGTGTGATTGCA





GTTATCGGCGACGGTAGCGCAAATTATGGCATTACCGCTC





TGTGGACGGCAGCTCAGGAAAAAATCCCGGTTGTCTTTAT





TATCCTGAACAATGGCACCTACCGTCCCCTGCGCGCATTC





GCTAAGCTGCTGAACGCCGAAAATGCGGCCGGCCTGGAT





GTGCCGGGCATTTGCTTTTGTGCGATCGCCGAAGGCTATG





GTGTGGAAGCGCACCGTATTACCAGCCTGGAAAACTTCA





AAGATAAGCTGTCAGCAGCTCTGCAATCGGACACCCCGA





CGCTGCTGGAAGTGCCGACCAGCACCACGTCTCCGTTT





(SEQ ID NO: 58)





ZP_06547677

Pseudomonas

MKTIHSAAYALLRRHGMTTIFGNPGSNELPFLKSFPED
ATGAAGACCATCCACTCTGCCGCCTATGCCCTGCTGCGTC




putida CSV86

FQYVLGLHEGAVVGMADGYALASGKPAFVNLHAAA
GCCACGGTATGACCACCATTTTCGGTAATCCGGGTAGCAA




GTGNGMGALTNSWYSHSPLVITAGQQVRPMIGVEAM
TGAACTGCCGTTTCTGAAAAGTTTCCCGGAAGACTTTCAG




LANVDATQLPKPLVKWSYEPANAQDVPRALSQAIHYA
TATGTTCTGGGCCTGCATGAAGGTGCCGTGGTTGGCATGG




NTTPKAPVYLSIPYDDWDQPSGPGVEHLIERDVQTAGT
CAGATGGTTACGCGCTGGCAAGCGGCAAGCCGGCATTCG




PDARQLQYLVQQVQDARNPYLYLGPDVDATLSNDHA
TGAACCTGCATGCGGCGGCGGGCACCGGTAACGGCATGG




VALADKLRMPVWIAPAASRCPFPTRHPSFRGVLPAAIA
GTGCCCTGACCAATTCTTGGTATAGCCACTCTCCGCTGGT




GISKTLQGHDLIIVVGAPVFRYLQFAPGDYLPVGAQLL
GATTACGGCAGGCCAGCAAGTTCGTCCGATGATCGGTGTC




HITSDPLEATRAPMGHALVGDIRETLRVLAEEVVQQSR
GAAGCGATGCTGGCCAATGTGGACGCGACCCAGCTGCCG




PYPEALAAPECVTDEPHHLHPETLFDVLDAVAPHDAIY
AAACCGCTGGTTAAGTGGAGCTATGAACCGGCTAACGCG




VKESTSTVTAFWQRMNLRHPGSYYFPAAGGLGFGLPA
CAGGATGTTCCGCGCGCACTGTCGCAAGCTATTCATTACG




AVGVQLAQPQRRVVALIGDGSANYGITALWTAAQYRI
CGAATACCACGCCGAAAGCCCCGGTGTATCTGAGCATCC




PVVFIILKNGTYGALRWFAGVLKAEDSPGLDVPGLDFC
CGTACGATGACTGGGATCAGCCGTCTGGTCCGGGCGTCG




AIAKGYGVKAVHTDTRDSFEAALRTALDANEPTVIEVP
AACACCTGATTGAACGTGACGTGCAAACGGCTGGCACCC




TLTIQPH
CGGATGCACGTCAGCTGCAAGTTCTGGTCCAGCAAGTTCA




(SEQ ID NO: 59)
GGATGCACGTAACCCGGTGCTGGTTCTGGGTCCGGATGTG





GATGCGACCCTGAGCAATGACCATGCCGTGCCACTGGCT





GATAAACTGCGTATGCCGGTTTGGATCGCACCGGCTGCGA





GTCGCTGCCCGTTCCCGACGCGTCATCCGTCCTTTCGTGG





TGTGCTGCCGGCCGCAATTGCAGGTATCAGCAAGACCCTG





CAAGGTCACGATCTGATTATCGTCGTGGGTGCGCCGGTTT





TCCGTTATCTGCAATTTGCGCCGGGTGACTACCTGCCGGT





GGGTGCACAACTGCTGCATATTACGTCAGATCCGCTGGAA





GCAACCCGTGCTCCGATGGGCCACGCCCTGGTTGGTGATA





TCCGTGAAACCCTGCGCGTCCTGGCAGAAGAAGTTGTCCA





GCAATCGCGCCCGTATCCGGAAGCGCTGGCTGCACCGGA





ATGTGTGACGGACGAACCGCATCACCTGCATCCGGAAAC





CCTGTTCGATGTCCTGGACGCAGTGGCACCGCACGATGCT





ATTTACGTGAAAGAAAGTACCTCCACGGTTACCGCCTTTT





GGCAGCGTATGAACCTGCGCCATCCGGGCAGCTATTACTT





CCCGGCCGCAGGCGGTCTGCGTTTTGGTCTGCCGGCTGCG





GTCGGTGTGCAGCTGGCACAGCCGCAACGTCGCGTGGTT





GCTCTGATTGGCGATGGTTCTGCGAACTATGGTATCACGG





CACTGTGGACCGCCGCACAGTACCGTATTCCGGTCGTGTT





CATTATCCTGAAAAATGGCACCTATGGTGCCCTGCGCTGG





TTTGCAGGTGTCCTGAAGGCTGAAGATAGTCCGGGCCTGG





ACGTGCCGGGTCTGGATTTCTGCGCAATCGCTAAAGGCTA





CGGTGTTAAGGCGGTCCATACGGATACCCGTGACTCCTTT





GAAGCTGCACTGCGTACGGCGCTGGATGCAAACGAACCG





ACCGTGATTGAAGTTCCGACGCTGACCATCCAGCCGCAC





(SEQ ID NO: 60)





ZP_06846103

Halotalea

MTSRSSFSPPSASEQRGADIFAEVLQCEGVRYIFGNPGT
ATGACCAGCCGTAGCTCGTTTAGCCCGCCGTCAGCGTCAG




alkalilenta

TELPLLDALTDITGIHYVLGLHEASVVAMADGYAQAS
AACAGCGTGGTGCGGATATTTTTGCCGAAGTCCTGCAATG




GKPGFVNLHTAGGLGNAMGAILNAKMANTPLVVTAG
TGAAGGTGTCCGCTATATTTTTGGCAATCCGGGCACCACG




QQDTRHGVTDPLLHGDLTGIARPNVKWAEEIHHPEHIP
GAACTGCCGCTGCTGGATGCACTGACCGACATTACGGGT




MLLRRALQDCRTGPAGPVFLSLPIDTMERCTSVGAGE
ATCCATTATGTGCTGGGCCTGCACGAAGCGTCAGTGGTTG




ASRIERASVANMLHALATALAEVTAGHIALVAGEEVF
CGATGGCCGATGGTTACGCACAGGCTTCGGGCAAACCGG




TANASVEAVALAEALGAPVFGASWPGHIPFPTAHPQW
GTTTCGTTAACCTGCATACCGCCGGCGGTCTGGGTAATGC




QGTLPPKASDIRETLGPFDAVLILGGHSLISYPYSEGPAI
GATGGGTGCCATTCTGAACGCAAAGATGGCTAATACCCC




PPHCRLFQLTGDGHQIGRVHETTLGLVGDLQLSLRALL
GCTGGTCGTGACGGCGGGTCAGCAAGATACCCGTCATGG




PLLARKLQPQNGAVARLRQVATLKRDARRTEAAERSA
CGTTACCGATCCGCTGCTGCACGGCGACCTGACCGGTATC




REFDASATTPFVAAFETIRAIGPDVPIVDEAPVTIPHVRA
GCACGTCCGAATGTCAAATGGGCCGAAGAAATTCATCAC




CLDSASARQYLFTRSAILGWGMPAAVGVSLGLDRSPV
CCGGAACATATCCCGATGCTGCTGCGTCGTGCGCTGCAAG




VCLVGDGSAMYSPQALWTAAHERLPVTFVVFNNGEY
ATTGCCGCACGGGTCCGGCTGGTCCGGTGTTTCTGAGTCT




NILKNYARAQTNYRSARANRFIGLDISDPAIDFPALASS
GCCGATTGACACGATGGAACGTTGTACGTCCGTGGGTGC




LGVPARRVERAGDIAIAVEDGIRSGRPNLIDVLISSSS
AGGTGAAGCCAGCCGTATCGAACGCGCGAGCGTGGCTAA




(SEQ ID NO: 61)
CATGCTGCATGCGCTGGCCACCGCACTGGCTGAAGTGAC





GGCCGGTCACATTGCGCTGGTCGCCGGTGAAGAAGTGTTC





ACCGCGAATGCCAGTGTTGAAGCAGTCGCTCTGGCGGAA





GCACTGGGCGCACCGGTTTTTGGTGCTTCCTGGCCGGGTC





ATATTCCGTTCCCGACCGCACACCCGCAGTGGCAGGGTAC





GCTGCCGCCGAAGGCGAGCGATATCCGTGAAACCCTGGG





CCCGTTTGACGCCGTGCTGATTCTGGGCGGTCATAGTCTG





ATCTCCTATCCGTACTCAGAAGGTCCGGCAATTCCGCCGC





ACTGCCGCCTGTTCCAGCTGACCGGCGATGGTCATCAAAT





CGGCCGTGTTCACGAAACCACGCTGGGCCTGGTGGGCGA





TCTGCAACTGAGTCTGCGCGCGCTGCTGCCGCTGCTGGCC





CGTAAACTGCAACCGCAAAACGGTGCAGTCGCTCGTCTG





CGCCAAGTGGCAACCCTGAAGCGTGATGCTCGTCGCACG





GAAGCGGCCGAACGTTCAGCCCGGGAATTTGACGCGTCG





GCCACCACGCCGTTTGTTGCAGCTTTCGAAACCATTCGCG





CAATCGGCCCGGATGTGCCGATTGTTGACGAAGCGCCGG





TTACGATCCCGCATGTCCGTGCCTGCCTGGATAGCGCATC





TGCTCGCCAGTACCTGTTTACCCGTTCTGCAATTCTGGGTT





GGGGTATGCCGGCGGCCGTCGGTGTGAGTCTGGGTCTGG





ATCGTTCCCCGGTTGTCTGTCTGGTGGGCGACGGTTCAGC





GATGTACTCGCCGCAGGCACTGTGGACCGCAGCTCACGA





ACGCCTGCCGGTTACGTTTGTGGTTTTCAACAATGGTGAA





TATAACGCCCTGAAAAATTTTGCGCGTGCCCAAACCACT





ACCGTAGCGCACGCGCTAATCGTTTTATTGGCCTGGATAT





CTCTGACCCGGCGATTGATTTCCCGGCGCTGGCCAGCTCT





CTGGGTGTGCCGGCACGTCGCGTTGAACGTGCTGGTGATA





TTGCAATCGCTGTCGAAGACGGCATCCGCAGCGGTCGTCC





GAACCTGATTGATGTGCTGATCAGTTCCTCATCG





(SEQ ID NO: 62)





ZP_07290467

Streptomyces sp.

MRTVRESALDVLRARGMTTVFGNPGSTELPMLKQFPD
ATGCGTACGGTGCGTGAATCGGCTCTGGACGTGCTGCGTG




DFRYVLGLQEAVVVGMADGFALASGTTGLVNLHTGP
CGCGTGGTATGACGACGGTTTTTGGTAATCCGGGCTCAAC




GTGNAMGAILNARANRTPMVVTAGQQVRAMLTMEA
GGAACTGCCGATGCTGAAACAGTTTCCGGATGACTTCCGC




LLTMPQSTLLPQPAVKWAYEPPRAADVAPALARAVQV
TATGTTCTGGGTCTGCAAGAAGCTGTGGTTGTCGGTATGG




AETPPQGPVFVSLPMDDFDVVLGEDEDRAAQRAAART
CAGATGGCTTTGCCCTGGCAAGTGGCACCACGGGTCTGGT




VTHAAAPSAEVVRRLAARLSGARSAVLVAGNDVDAS
GAATCTGCATACCGGTCCGGGCACGGGTAACGCGATGGG




GAWDAVVELAERTGLPVWSAPTEGRVAFPKSHPQYR
CGCAATTCTGAACGCTCGTGCGAATCGTACCCCGATGGTG




GMLPPAIAPLSRCLEGHDLVLVIGAPVFCYYPYVPGAH
GTTACGGCGGGCCAGCAAGTGCGTGCCATGCTGACGATG




LPENTELVHLTRDADEAARAPVGDAVVADLALTVRAL
GAAGCACTGCTGACCAATCCGCAGAGTACGCTGCTGCCG




LAELPAREAAAPAARTARAESTAEVDGVLTPLAAMTA
CAACCGGCTGTCAAGTGGGCGTACGAACCGCCGCGCGCG




IAQGAPANTLWVNESPSNLGQFHDATRIDTPGSFLFTA
GCCGATGTGGCACCGGCACTGGCTCGTGCGGTCCAGGTG




GGGLGFGLAAAVGAQLGAPDRPVVCVIGDGSTHYAV
GCAGAAACCCCGCCGCAAGGTCCGGTTTTTGTCTCCCTGC




QALWTAAAYKVPVTFVVLSNQRYAILQWFAQVEGAQ
CGATGGATGACTTCGATGTCGTGCTGGGCGAAGATGAAG




GAPGLDIPGLDIAAVATGYGVRAHRATGFGELSKLYR
ACCGTGCAGCTCAGCGTGCGGCGGCACGTACCGTTACGC




ESALQQDGPVLIDVPVTTELPTL
ACGCTGCGGCCCCGAGCGCGGAAGTTGTCCGTCGCCTGG




(SEQ ID NO: 63)
CAGCTCGTCTGAGTGGTGCTCGTTCCGCGGTGCTGGTTGC





GGGTAATGATGTGGACGCCTCTGGCGCATGGGATGCTGT





GGTTGAACTGGCCGAACGTACCGGTCTGCCGGTCTGGAGT





GCACCGACGGAAGGTCGTGTGGCATTTCCGAAATCCCATC





CGCAGTATCGTGGTATGCTGCCGCCGGCAATTGCACCGCT





GAGCCGTTGCCTOGAAGGTCACGATCTGGTCCTGGTGATC





GGTGCGCCGGTGTTCTGTTATTACCCGTACGTTCCGGGTG





CCCATCTGCCGGAAAACACCGAACTGGTTCACCTGACGC





GCGATGCAGACGAAGCAGCCCGTGCCCCGGTTGGTGATG





CAGTCGTGGCCGACCTGGCACTGACCGTGCGCGCTCTGCT





GGCGGAACTGCCGGCGCGTGAAGCAGCTGCGCCGGCCGC





ACGTACCGCTCGCGCGGAATCTACGGCCGAAGTCGATGG





TGTGCTGACCCCGCTGGCTGCAATGACGGCAATTGCACAG





GGCGCTCCGGCAAACACCCTGTGGGTTAATGAAAGCCCG





TCTAACCTGGGTCAATTTCATGATGCAACCCGTATCGACA





CGCCGGGCAGCTTTCTGTTCACCGCCGGCGGTGGCCTGGG





TTTCGGTCTGGCCGCAGCTGTGGGTGCCCAGCTGGGCGCA





CCGGATCGTCCGGTTGTCTGCGTTATTGGCGACGGTTCAA





CCCACTATGCAGTCCAGGCACTGTGGACCGCGGCGGCGT





ACAAAGTTCCGGTCACCTTTGTGGTTCTGTCGAATCAGCG





CTATGCAATCCTGCAATGGTTCGCGCAAGTGGAAGGCGCT





CAAGGTGCGCCGGGCCTGGATATTCCGGGTCTGGACATC





GCTGCGGTTGCAACGGGTTACGGTGTCCGTGCCCATCGTG





CAACCGGCTTTGGTGAACTGTCAAAGCTGGTGCGTGAATC





GGCGCTGCAACAAGATGGCCCGGTTCTGATCGACGTGCC





GGTTACCACGGAACTGCCGACCCTG





(SEQ ID NO: 64)





ZP_08570611

Rheinheimera sp.

MSSINSFTVADYLLTRLHQLGLRKVFQVPGDYVANFM
ATGTCATCAATCAACTCGTTCACCGTCGCCGACTACCTGC



A13L
DALEQFNGIEAVGDLTELGAGYAADGYARLTGIGAVS
TGACCCGTCTGCATCAACTGGGCCTGCGTAAGGTTTTTCA




VQFGVGTFSVLNAIAGSYVERNPVVVITASPSTGNRKTI
AGTGCCGGGCGATTATGTCGCTAACTTTATGGACGCGCTG




KETGVLFHHSTGDLLADSKVFANVTVAAEVLSDPSDA
GAACAGTTCAATGGCATTGAAGCCGTGGGTGATCTGACC




RQKIDKALTLAITFRRPIYLEAWQDVWGLACEKPEGEL
GAACTGGGTGCAGGTTATGCGGCCGACGGTTACGCACGT




KALPLISEEGALKAMLADSLKLLNSARQPLVLLGVEIN
CTGACCGGTATCGGTGCAGTGTCTGTTCAGTTTGGCGTGG




RFGLQDAVLDLLKASGLPYSTTSLAKTVISENEGIFVGT
GTACGTTTTCTGTTCTGAACGCAATTGCTGGCAGTTACGT




YADGASFPATVEYIEKADCVLALGVIFTDDYLTMLSK
TGAACGTAATCCGGTGGTTGTCATCACCGCGTCGCCGAGC




QFDQMIVVNNDETSRLGHAYYHQLYLADFILQLTDEIK
ACGGGTAACCGCAAAACCATTAAGGAAACGGGCGTGCTG




KSSLYPRQNSALPLLPPQPQITPALLQQQLSYQNFFDLF
TTTCATCACTCCACCGGTGATCTGCTGGCTGACTCAAAAG




YGYLLQHQLQDNISLILGESSSLYMSARLYGLPQDSFIA
TGTTCGCGAATGTCACGGTGGCAGCTGAAGTTCTGTCTGA




DAAWGSLGHETGCVTGIAYASDKRAMAIAGDGGFMM
TCCGAGTGACGCGCGCCAGAAAATTGATAAGGCCCTGAC




MCQCLSTISRHQLNSVVFVISNKVYAIEQSFVDICAFAK
CCTGGCAATTACGTTTCGTCGCCCGATCTATCTGGAAGCC




GGHFAPFDLLPTWDYLSLAKAFSVEGYRVQNGEELLQ
TGGCAGGATGTTTGGGGCCTGGCATGCGAAAAACCGGAA




ALEHIMTQKDKPALVEVVIQSQDLAPAMAGLVKSITG
GGTGAACTGAAGGCCCTGCCGCTGATCAGCGAAGAAGGC




HTVEQCAIPT
GCGCTGAAAGCCATGCTGGCAGATTCTCTGAAGCTGCTGA




(SEQ ID NO: 65)
ACAGTGCACGTCAGCCGCTGGTTCTGCTGGGTGTCGAAAT





TAATCGCTTCGGTCTGCAAGATGCTGTTCTGGACCTGCTG





AAAGCGTCTGGTCTGCCGTATTCCACCACGTCACTGGCCA





AGACCGTTATTAGTGAAAACGAAGGCATCTTTGTCGGCAC





CTATGCGGATGGTGCGTCCTTCCCGGCAACGGTGGAATAC





ATCGAAAAAGCCGATTGTGTCCTGGCACTGGGTGTGATTT





TTACCCATGACTACCTGACGATGCTGTCAAAACAGTTCGA





TCAAATGATCGTGGTTAACAATGACGAAACCTCGCGTCTG





GGCCATGCTTATTACCACCAGCTGTATCTGGCGGATTTTA





TTCTGCAACTGACGGACGAAATTAAAAAATCTAGCCTGTA





CCCGCGTCAGAACAGCGCACTGCCGCTGCTGCCGCCGCA





ACCGCAGATTACCCCGGCGCTGCTGCAACAACAGCTGAG





TTATCAGAACTTTTTCGACCTGTTTTATGGTTACCTGCTGC





AACATCAGCTGCAAGACAATATTTCCCTGATCCTGGGCGA





AAGTTCCTCACTGTATATGTCAGCTCGTCTGTACGGTCTG





CCGCAGGATTCTTTCATCGCAGACGCAGCATGGGGCAGTC





TGGGTCACGAAACCGGCTGCGTTACGGGTATCGCGTATGC





CAGCGATAAACGTGCAATGGCTATTGCGGGTGACGGCGG





TTTTATGATGATGTGCCAGTGTCTGAGCACCATTAGCCGC





CATCAACTGAACTCCGTCGTGTTCGTTATTTCAAATAAAG





TCTACGCCATCGAACAGTCCTTTGTGGATATTTGTGCCTTC





GCAAAGGGCGGTCACTTTGCGCCGTTCGATCTGCTGCCGA





CCTGGGACTATCTGTCGCTGGCTAAAGCGTTTAGCGTGGA





AGGCTACCGCGTTCAGAACGGTGAAGAACTGCTGCAAGC





GCTGGAACATATCATGACCCAGAAAGATAAGCCGGCCCT





GGTGGAAGTTGTCATTCAGTCGCAGGATCTGGCACCGGC





AATGGCTGGCXTGGTCAAAAGCATCACCGGTCACACGGT





GGAACAGTGCGCCATTCCGACC





(SEQ ID NO: 66)





YP_001240047

Bradyrhizobium sp.

MHPDACSIACAAMPTNWGPRTVTKLPLPDPQSRATTH




STM 3843
HRTAHYFLEALIDLGVEYIFANLGTDHVSLIEEIARWDS





EGRRHPEVILCPHEVVAYHMAMGYAMTTGRGQAVFV





HVDAGTANACMAIQNAFRYRLPVLLIAGRAPFAIHGEL





PGGRDTYVHFVQDSFDQGSIVRPYVKWEYTLPSGVVV





KEALTRAAAFMHSDPPGPVSMMLPREVLAEAWDDDA





MPAYPPARYGSVRAGGVDPERAQAIADALMTAENPIA





LTAYLGRSAEAVSVLDRLALVCGIRVVEFNPITMNICQ





DSPCFAGSDPAALVADADLGLLIDIDVPFIPQLLKSADR





LRWIQIDIDALKADIPMWGFATDLRIQGDSAVILRQVL





EIVIARGNDSYMRKVRDRIASWRPAREAAQAKRMAA





AANKGSPGAINPAYLFARLQALLSEQDIVVNEAVRNAP





VLQQQLRRTKPMTYVGLAGGGLGFSGGMALGLKLAN





PSHRVVQIVGDGAFHFAAPDSVYAVSQQYRLPIFSVIL





DNKGWQAVKASVQRVYPDGVAQQTDSFLSRLATGRQ





DEQRRLVDIARAFGAHGERVDDPDELDAAIRSCLAAL





DDGRAAVLHVNITPL





(SEQ ID NO: 67)






YP_001279645

Psychrobacter sp.

MQHDSITPLSKKTSMLDTTAESVVSQTVQQVVFELMR





TLNMTTVFGNPGSTELNFLTNFPEDFSYVLGLHEASVV





GMADGYAQATGNAAFVNLHSAAGVGNALGNIFTAYR





NHTPLVITAGQQARSLLPFAPYLGAEQAAQFPQPYIKW





SIEPARAEDVPLAIAQAYLIAMQHPQGPTFVSIPSDDWD





KPAVLPLLSQSCGHSIPSPDALAELVEVMSTSQNMALV





VGSDVDRQGGFELAVSVAEACQAPVWEAPNSSRASFP





ENHPLFAGFLPAIPEKLSEKLLGYDTIVVIGAPAFTLHV





AGTLSLKKSKIYQLTDDPQYAAQSVATKTLSGNIRDSL





QALLDKLPTSMTPRSGLDLPVRKPAAEVQGSNPISIEY





VMATLAKYCPEDVVIVEEAPSHRPAIQRYLPITQPKSFY





TMASGGLGYGLPAAVGVALGTQRRTLCLIGDGSSMYS





IQAIWTAVQHNLPVTVIVLNNTGYGAMRSFSKIMGSTQ





VPGLDLPNINFVQLAQSMGCQAQKVTDYSVLDKVFAD





TMQAAGSYLLEIMVDANTGAVY





(SEQ ID NO: 69)






ZP_01901192

Roseobacter sp. AzwK-3b

MKMTTEEAFVKTLQRHGIEHAFCIIGSAMMPISDLFPQ





AGITFWDCAHEGSAGMMSDGYTRATGKMSMMIAQN





GPGITNFVTAVKTAYWNHTPLLLVTPQAANKTIGQGG





FQEVEQMKLFEDMVAYQEEVRDPSRMAEVLARVISK





AKNLSGPAQIMPRDYWTQVIDIELPDPIEFERSPGGENS





VAEAARLISEARNPVILNGAGVVLSEGGIAASQALAER





LDAPVCVGYQHNDAFPGSHPLFAGPLGYNGSKAAME





LIKDADWLCLGTRLNPFSTLPGYGMDYWPKDAKIIQ





VDINPDRIGLTKKVSVGIIGDAAKVARGILGQLSDSAG





DEGRDARRARIAETKSKWAQQLSSMDHEDDDPGTSW





NERAREAKPDWMSPRMAWRAIQSALPREAIISSDIGNN





CAIGNAYPSFEEGRKYLAPGLFGPCGYGLPAIVGAKIG





RPDVPWGFAGDGAFGIAVNELTAIGRSEWPGITQIVF





RNYQWGAEKRNSTLWFDDNFVGTELDDDVSYAGIAK





ACGLKGVVARTMDELTDALNQAIKDQMENGTTTLIEA





MINQELGEPFRRDAMKKPVAVAGISPDDMRPQKVA





(SEQ ID NO: 71)






ZP_06549025

Serratia

MSNAITKVQNANARRGGDVLLEVLESEGVEYVFGNPG





marcescens FGI94

TTELPFMDALLRKPSIQYVEALQEASAVAMADGYAQA





AKKPGFLNLHTAGGLGHGMGNLLNAKCSQTPLVVTA





GQQDSRHTTTDPLLLGDLVGMGKTFAKWSQEVTHVD





QLPVLVRRAFHDSDAAPKGSVFLSLPMDVMEAMSAIG





IGAPSTIDRNAVAGSLPLLASKLAAFTPGNVALIAGDEI





YQSEAANEVVALAEMLAADVYGSTWPNRIPYPTAHPL





WRGNLSTKATEINRALSQYDAIFALGGKSLITILYTEGQ





AVPEQGCKVFQLSADAGDLGRTYSSELSWGDIKSSLKV





LLPELEKATANHRRDYQRRFEKAINEFKLSKESLLGQV





QEQQSATVITPLVAAFEAARAIGPDVAIVDEAIATSGSL





RKSLNSHRADQYAFLRGGGLGWGMPAAVGYSLGLGK





APVVCFVGDGAAMYSPQALWTAAHEKLPVTFIVMNN





TEYNVLKNFMRSQADYTSAQTDRFIAMDLVNPSYDYQ





ALGASMGLETRKVIRAGDIAPAVEAALASGKPNVIEIII





SKS (SEQ ID NO: 73)






ZP_07033476

Granulicella

MNIAYETRENKVASGRECLLEILRDEGVTHVFGNPGTT





mallensis ATCC

ELALIDALAGDDDFHFILGLQEAAVVGMADGYAQATG




BAA-1857
RPSFVNLHTTAGLGNGMGNLTNAFATNVPMVVTAGQ





QDIRHLAYDPLLSGDLVGLARATVKWAHEVRSLQELP





IILRRAFRDANTEPRGPVFVSLPMNIIDEIGTVSIPPRSTI





VQAESGDISQLVRLLVESAGNLCLVVGDEVGRYGATE





AAVRVAELLGAPYYGSPFHSNVPFPTDHPLWRFTLPPN





TGEMRKVLGGYDRILLIGDRAFMSYTYSDELPLSPKTQ





LLQIAVDRHSLGRCHAVELGLYGDPLSLLAAVGDALS





QERALAPSRDSRLAIARDWRASWEQDLKDECERLAPS





RPLYPLVAADAVLRGYTPGTVIVDECLATNKVRQLY





PVRKPGEYYYFRGAGLGWGMPAAVGVSLGLERQQRV





VCLLGDGAAMYSPQALWSAAHESLPITFVVFNNSEYNI





LKNFMRSRPGYNAQSGRFVGMEINQPSIDFCALARSM





GYDAVRLTEPDDITAYMIAAGDREGPSLLEIPIAATAS





(SEQ ID NO: 75)






WP_010764607.1

Enterococcus

MYTVADYLLDRLKELGIDEVFGVPGDYNLQFLDHITA





haemoperoxidus

RKDLEWIGNANELNAAYMADGYARTKGISALVTTFG




ATCC BAA-382
VGELSAINGLAGSYAESIPVIEIVGSPTTTVQQNKKLVH





HTLGDGDFLRFERIHEEVSAAIAHLSTENAPSEIDRVLT





VAMTEKRPVYINLPIDIAEMKASAPTTPLNHTTDQLTT





VETAILTKVEDALKQSKNVVIAGHEILSYHIENQLEQF





IQKFNLPITVLPFGKGAFNEEDAHYLGTYTGSTTDESM





KNRVDHADLVLLLGAKLTDSATSGFSFGFTEKQMISIG





STEVLFYGEKQETVQLDRFVSALSTLSFSRFTDEMPSV





KRLATPKVRDEKLTQKQFWQMVESFLLQGDTVVGEQ





GTSFFGLTNVPLKKDMHFIGQPLWGSIGYTFPSALGSQI





ANKESRHLLFIGDGSLQLTVQELGTAIREKLTPIVFVIN





NNGYTVEREIHGATEQYNDIPMWDYQKLPFVFGGTDQ





TVATYKVSTEIELDNAMTRARTDVDRLQWIEVVMDQ





NDAPVLLKKLAKIFAKQNS





(SEQ ID NO: 77)






WP_002115026.1

Acinetobacter

MELLSGGEMLVRALADEGVEHVFGYPGGAVLHIYDA





baumannii

LFQQDKINHYLVRHEQAAGHMADAYSRATGKTGVVL





VTSGPGATNTVTPIATAYMDSIPMVILSGQVASHLIGED





AFQETDMVGISRPIVKHSFQVRHASEIPAIIKKAFYIAAS





GRPGPVVVDIPKDATNPAEKFAYEYPEKVKMRSYQPP





SRGHSGQIRKAIDELLSAKRPVIYTGGGVVQGNASALL





TELAMLLGYPVTNTLMGLGGFPGDDPQFVGMLGMHG





TYEANMAMHNADVILAIGARFDDRVTNNPAKFCVNA





KVIHIDIDPASISKTIMAHIPIVGAVEPVLQEMLTQLKQL





NVSKPNPEAIAAWWDQINEWRKVHGLKFETPTDGTM





KPQQVVEALYKATNGDAIITSDVGQHQMFGALYYKY





KRPRQWINSGGLGTMGVGLPYAMAAKLAFPDQQVVC





ITGEASIQMCIQELSTCKQYGMNVKILCLNNRALGMV





KQWQDMNYEGRHSSSYVESLPDFGKLMEAYGHVGIQI





DHADELESKLAEAMAINDKCVFINVMVDRTEHVYPM





LIAGQSMKDMWLGKGERT (SEQ ID NO: 79)






YP_005756646.1

Staphylococcus

MKQRIGAYLIDAIHRAGVDKIFGVPGDFNLAFLDDIISN





aureus

PNVDWVGNTNELNASYAADGYARLNGLAALVTTFGV





GELSAVNGIAGSYAERIPVIAITGAPTRAVEHAGKYVH





HSLGEGTFDDYRKMFAHITVAQGYITPENATTEIPRLIN





TAIAERRPVHLHLPIDVAISEIEIPTPFEVTAAKDTDAST





YIELLTSKLHQSKQPIIITGHEINSFHLHQELEDFVNQTQ





IPVAQLSLGKGAFNEENPYYMGIYDGKIAEDKIRDYVD





NSDLILNIGAKLTDSATAGFSYQFNIDDVVMLNHHNIKI





DDVTNDEISLPSLLKQLSNISHTNNATFPAYHRPTSPDY





TVGTEPLTQQTYFKMMQNFLKPNDVIIADQGTSFFGA





YDLALYKNNTFIGQPLWGSIGYTLPATLGSQLADKDR





RNLLLIGDGSLQLTVQAISTMIRQHIKPVLFVINNDGYT





VERLIHGMYEPYNEIHMWDYKALPAVFGGKNVEIHDV





ESSKDLQDTFNAINGHPDVMHFVEVKMSVEDAPKKLI





DIAKAFSQQNK





(SEQ ID NO: 81)






WP_008347133.1

Bacillus pumilus

MPQRTAGKEVTALLEEWGVKHIYGMPGDSINELIEELR




SFR-032
HESSKIQFIQTRHEEVAALSAAADAKLTGKLGVCLSIA





GPGAVHLLNGLYDAKADGAPVLAIAGQVASTEVGRD





AFQEIKLERMFDDVAVFNQQVQTAEALPDLLNQAIKA





AYTHKGVAVLTVSDDLFSQKIKRSPVYTSPLYVEGDV





RPKKDQLLKAAQLINNAKKPVILAGKGLRNAKEELLSF





AEKAAAPIVITLPAKGVVPDRHAYFLGYLGQIGTKPAY





EAMEECDLLIMLGTSFPYRDYLPEDTPAIQLDIKPDQIG





KRYPVEVGIVSDSKTGLHELTSYIEYKEORGFLEACTE





HMMKWREEMDKEKSIATSPLKPQQVIARLEEAVDDD





AILSVDVGNVTVWMARHFEMKQQDFIISSWLATMGC





GLPGAISAKLNEPNRQAIAVCGDGGFTMVMQDFVTAV





KYKLPIVVVILNNNNLGMIEYEQQVKGNINYGIELEDI





DFAKFAEACGGKGISVSSHEELAPAFDQALQADKPVII





DVAVTNEPPLPGKITYTQAAGFSKYLLKKFFEKGELDI





PPLKKSLKRFF





(SEQ ID NO: 83)






WP_018535238.1

Streptomyces

MVSRPARVAILEQLRADGVRYMFGNPGTVEQGFLDEL





glaucescens

RNFPDIEYILALQEAGVVGLADGYARATRTPAVLQLHT





GVGVGNAVGMLYQAKRGHAPLVAIAGEAGLRYDAM





EAQMAVDLVAMAEPVTKWATRVVDPESTLRVLRRA





MKVAATPPYGPVLVVLPADVMDRDTSEAAVPTSYVD





FAATPDPQVLDRAAELLAGAERPIVIAGDGVHFAGAQ





EELGRLAQTWGAEVWGADWAEVNLSVEHPAYAGQL





GHMFGDSSRRVTGAADAVLLYGTYALPEVYPALDGV





FADGAPVVHIDLDTDAIAKNFPVDLGLAADPRRALDG





LARALERRMSPESRARAGEWFTGRSAQRSYEIAAARE





QDEAALAPDALPVTAFLQELARQLPEDAVVFDEALTA





SPDVTRHLPPTRPGHWHQTRGGSLGVGIPGAIAAQLAH





PDRTVVGFTCDGGSLYTIQALWTAARYDIGATFVICNN





SSYKLLELNIEEYWKSVDVAAHEQPEMFDLARPAIDFV





ALSRSLGVPAVRVEKPDQAKAAVEQALGTPGPFLIDLV





TGRGRED





(SEQ ID NO: 85)






YP_0064855164.1
Pseudomonas
MKTVHSASYEILRRHGLTTVFGKPGSNELPFLKDFPED





aeruginosa

FRYILGLHEGAVVGMADGFALASGRPAFVNLHAAAGT





GNGMGALTNAWYSHSPLVITAGQQVRSMIGVEAMLA





NVDAGQLPKPLVKWSHEPACAQDVPRALSQAIQTASL





PPRAPVYLSIPYDDWAQPAPAGVEHLAARQVSGAALP





APALLAELGERLSRSRNPVLVLGPDVDGANANGLAVE





LAEKLRMPAWGAPSASRCPFPTRHACFRGVLPAAIAGI





SRLLDGHDLILVVGAPVFRYHQFAPGDYLPAGAELVQ





VTCDPGEAARAPMGDALVGDIALTLEALLEQVRPSAR





PLPEALPRPPALAEEGGPLRPETVFDVIDALAPRDAIFV





KESTSTVTAFWQRVEMREPGSYFFPAAGGLGFGLPAA





VGAQLAQPRRQVIGIIGDGSANYGITALWSAAQYRVP





AVFIILKNGTYGALRWFAGVLEVPDAPGLDVPGLDFC





AIARGYGVEALHAATREELEGALKHALAADRPVLIEV





PTQTIEP





(SEQ ID NO: 87)






YP_005461458.1
Actinoplanes
MIDLDGTVTVAEYLGLRLRHAGVEHLFGVPGDFNLNL





missouriensis

LDGLAFVEGLRWVGSPNELGAGYAADAYARRRGLSA





LFTTYGVGELSAINAVAGSAAEDSPVVHVVGSPRTTTV





AGGALVHHTIADGDFRHFARAYAEVTVAQAMVTATD





AGAQIDRVLLAALTHRKPVYLSIPQDLALHRIPAAPLR





EPLTPASDPAAVERFRTAVRDLLTPAVRPIMLVGQLVS





RYGLSTLVTDMTTRSGIPVAAQLSAKGVIDESVEGNLG





LYAGSMLDGPAASLIDSADVVLHLGTALTAELTGFFTH





RRPDARTVQLLSTAALVGTTRFDNVLFPDAMTTLAEV





LTTFPAPARLAAPTTRAEPTGLAASITPPAPSAVDLTAS





TATDLTAPTAGDISEMSRVLTQDAFWAGMQAWLPAG





HALVADTGTSYWGALALRLPGDTVFLGQPIWNSIGWA





LPAVLGQGLADPDRRPVLVIGDGAAQMTIQELSTIVAA





GLRPIILLLNNRGYTIERALQSPNAGYNDVADWNWRA





VVAAFAGPDTDYHHAATGTELAKALTAASESNRPVFI





EVELDAFDTPPLLRRLAERATAPS





(SEQ ID NO: 89)






YP_006991301.1

Carnobacterium

MYTVGNYLLDRLTELGIRDIFGVPGDYNLKFLDHVMT





maltaromaticum

HKELNWIGNANELNAAYAADGYARTKGIAALVTTFG




LMA28
VGELSAANGTAGSYAEKVPVVQIVGTPTTAVQNSHKL





VHHTLGDGRFDHFEKMQTEINGAIAHLTADNALAEID





RVLRIAVTERCPVYINLAIDYAEVVAEKPLKPLMEESK





KVEEETTLVLNKIEKALQDSKNPVVLIGNEIASFHLESA





LADFVKKFNLPVTVLPFGKGGFDEEDAHFIGVYTGAPT





AESIKERVEKADLILIIGAKLTDSATAGFSYDFEDRQVIS





WSDEVSFYGEIMKPVAFAQFVNGLNSLNYLGYTGEIK





QVERVADIEAKASNLTQNNFWKFVEKYLSNGDTLVAE





QGTSFFGASLVPLKSKMKFIGQPLWGSIGYTFPAMLGS





QIANPASRHLLFIGDGSLQLTIQELGMTFREKLTPIVFVI





NNDGYTVEREIHGPNELYNDIPMWDYQNLPYVFGGN





KGNVATYKVTTEEELVAAMSQARQDTTRLQWIEVVM





GKQPSPDLLVQLGKVFAKQNS (SEQ ID NO: 91)






NP_594083.1

Schizosaccharomyces

MSSEKVLVGEYLFTRLLQLGIKSILGVPGDFYLALLDLI





pombe

EKVGDETFRWVGNENELNGAYAADAYARVKGISAIV





TTFGVGELSALNGFAGAYSERIPVVHIVGYPNTKAQAT





RPLLHHTLGNGDFKVFQRMSSELSADVAFLDSGDSAG





RLIDNLLETCVRTSRPVYLAVPSDAGYFYTDASPLKTP





LVFPVPENNKEIEHEVVSEILELIEKSKNPSILVDACVSR





FHIQQETQDFIDATHFPTYVTPMGKTAINESSPYFDGVY





IGSLTEPSIKERAESTDLLLIIGGLRSDFNSGTFTYATPAS





QTIEFHSDYTKIRSGVYEGISMKHLLPKLTAAIDKKSVQ





AKARPVHFEPPKAVAAEGYAEGTITHKWFWPTFASFL





RESDVVTTETGTSNFGILDCIFPKGCQNLSQVLWGSIG





WSVGAMFGATLGIKDSDAPHRRSILIVGDGSLHLTVQE





ISATIRNGLTPIIFVINNKGYTIERLIHGLHAVYNDINTE





WDYQNLLKGYGAKNSRSYNIHSEKELLDLFKDEEFGK





ADVIQLVEVHMPVLDAPRVLIEQAKLTASLNKQ





(SEQ ID NO: 93)






WP_003075272.1

Comamonas

MPANTAPNAQAAEVFTVRHAVINMLRELGMTRIFGNP





testosteroni

GSTELPLFRDYPEDFSYILGLQETVVVGMADGYAQAT





RNASFVNLHSAAGVGHAMANIFTAFKNRTPMVITAGQ





QTRSLLQFDPFLHSNQAAELPKPYVKWSCEPARAEDV





PQALARAYYIAMQEPRGPVFVSIPADDWDVPCEPITLR





KVGFETRPDPRLLDSIGQALEGARAPAFVVGAAVDRS





QAFEAVQALAERHQARVYVAPMSGRCGFPEDHALFG





GFLPAMRERIVDRLSGHDVVFVIGAPAFTYHVEGHGPF





IAEGTQLFQLIEDPAIAAWAPVGDAAVGNIRMGVQELL





ARPLTHPRPALQPRPAIPAPAAPEPGRLMTDAFLMHTL





AQVRSRDSIIVEEAPGSRSIIQAHLPIYAAETFFTMCSGG





LGHSLPASVGIALARPDKKVIGVIGDGSAMYAIQALWS





AAHLKLPVTYIIVKNRRYAALQDFSRVFGYREGEKVE





GTDLPDIDFVALAKGQGCDGVRVTDAAQLSQVLRDAL





RSPRATLVEVEVA





(SEQ ID NO: 95)






WP_020634527.1

Amycolatopsis

MNVAELVGRTLAELGVGAAFGVVGSGNFVVTNGLRA





orientalis

GGVRFVAARHEGGAASMADAYARMSGRVSVLSLHQ




HCCB10007
GCGLTNALTGITEAAKSRTPMIVLTGDTAASAVLSNFR





IGQDALATAVGAVPERVHSAPTAVADTVRAYRTAVQ





QRRTVLLNLPLDVQAQEAPEAVEIPKVRGPAPIRPDAG





MVAKLADLLAEARRPVFIAGRGARASAVPLRELAEISG





ALLATSAVAHGLFHDDPFSLGISGGFSSPRTADLIVDAD





LVIGWGCALNMWTTRHGTLLGPAARLVQVDVEQAAL





GAHRPIDLGVVGDVAGTAVDVHAELDKRGHQRSREA





PTGTRWNDVPYNDLSGDGRIDPRTLSRRLDEILPAERM





VSIDSGNFMGYPSAYLSVPDENGFCFTQAFQSIGLGLG





TAIGAALARPDRLPVLGVGDGGFHMAVSELETAVRLR





IPLVIVVYNDAAYGAEIHHFGDADMTTVRFPDTDIAAI





GRGFGCDGVTVRSVGDLAAVKEWLGGPRDAPLVIDA





KIADDGGSWWLAEAFRH (SEQ ID NO: 97)






1OVM

Enterobacter sp.

MRTPYCVADYLLDRLTDCGADHLFGVPGDYNLQFLD
ATGCGTACCCCGTACTGCGTTGCTGACTACCTGCTGGACC




HVIDSPDICWVGCANELNASYAADGYARCKGFAALLT
GTCTGACCGATTGCGGCGCGGACCACCTGTTTGGCGTGCC




TFGVGELSAMNGIAGSYAEHVPVLHIVGAPGTAAQQR
GGGCGACTACAACCTGCAATTTCTGGACCATGTCATTGAT




GELLHHTLGDGEFRHFYHMSEPITVAQAVLTEQNACY
TCTCCGGACATCTGCTGGGTGGGCTGTGCCAACGAACTGA




EIDRVLTTMLRERRPGYLMLPADVAKKAATPPVNALT
ATGCAAGTTATGCGGCCGATGGCTACGCACGTTGCAAAG




HKQAHADSACLKAFRDAAENKLAMSKRTALLADFLV
GTTTTGCAGCTCTGCTGACCACGTTCGGCGTGGGTGAACT




LRHGLKHALQKWVKEVPMAHATMLMGKGIFDERQA
GTCCGCGATGAATGGCATTGCCGGCAGCTATGCGGAACA




GFYGTYSGSASTGAVKEAIEGADTVLCVGTRFTDTLTA
TGTGCCGGTTCTGCACATCGTTGGCGCGCCGGGCACCGCG




GFTHQLTPAQTIEVQPHAARVGDVWFTGIPMNQAIETL
GCGCAGCAACGTGGTGAACTGCTGCATCACACGCTGGGC




VELCKQHVHAGLMSSSSGAIPFPQPDGSLTQENFWRTL
GATGGTGAATTTCGCCATTTCTACCACATGTCCGAACCGA




QTFIRPGDIILADQGTSAFGAIDLRLPADVNFIVQPLWG
TTACCGTTGCCCAAGCAGTCCTGACGGAACAGAACGCCT




SIGYTLAAAFGAQTACPNRRVIVLTGDGAAQLTIQELG
GCTATGAAATCGACCGTGTGCTGACCACGATGCTGCGCG




SMLRDKQHPIILVLNNEGYTVERAIHGAEQRYNDIALW
AACGTCGTCCGGGCTATCTGATGCTGCCGGCTGATGTTGC




NWTHIPQALSLDPQSECWRVSEAEQLADVLEKVAHHE
GAAAAAGGCAGCTACCCCGCCGGTCAACGCACTGACGCA




RLSLIEVMLPKADIPPLLGALTKALEACNNA
TAAACAGGCTCACGCGGATTCCGCTTGTCTGAAGGCGTTT




(SEQ ID NO: 99)
CGTGACGCGGCCGAAAATAAACTGGCCATGTCAAAGCGT





ACCGCCCTGCTGGCAGACTTCCTGGTGCTGCGTCATGGCC





TGAAACACGCGCTGCAAAAATGGGTTAAGGAAGTCCCGA





TGGCCCATGCAACCATGCTGATGGGCAAGGGTATTTTTGA





TGAACGCCAGGCCGGCTTCTATGGCACCTACTCAGGCTCG





GCCAGCACGGGTGCAGTGAAAGAAGCTATCGAAGGCGCG





GATACCGTGCTGTGCGTTGGTACGCGTTTTACCGACACGC





TGACCGCCGGTTTCACGCATCAGCTGACCCCGGCACAAAC





GATTGAAGTTCAGCCGCACGCAGCTCGCGTCGGTGATGTG





TGGTTTACCGGTATTCCGATGAACCAAGCGATCGAAACGC





TGGTTGAACTGTGTAAACAGCATGTCCACGCTGGCCTGAT





GAGCAGCAGCAGCGGTGCCATTCCGTTCCCGCAACCGGA





TGGCTCTCTGACCCAGGAAAATTTTTGGCGTACGCTGCAA





ACCTTCATTCGTCCGGGCGATATTATCCTGGCGGACCAGG





GCACCTCTGCTTTTGGTGCGATCGATCTGCGTCTGCCGGC





CGACGTGAACTTCATTGTTCAACCGCTGTGGGGCAGTATC





GGTTATACCCTGGCGGCGGCGTTTGGCGCCCAGACGGCAT





GTCCGAATCGTCGCGTCATTGTGCTGACCGGCGATGGTGC





TGGGCAGCTGACGATCCAAGAACTGGGTAGCATGCTGCG





CGACAAACAACATCCGATTATCCTGGTGCTGAACAATGA





AGGGTATACCGTTGAACGTGCCATTCATGGTGCAGAACA





GCGCTACAACGATATTGCACTGTGGAATTGGACCCACATC





CCGCAAGCGCTGTCTCTGGACCCGCAGAGTGAATGCTGG





CGTGTGTCGGAAGCTGAACAGCTGGCGGATGTCCTGGAA





AAAGTGGCGCATCACGAACGCCTGAGCCTGATTGAAGTT





ATGCTGCCGAAAGCTGATATCCCGCCGCTGCTGGGTGCGC





TGACCAAGGCTCTGGAAGCGTGTAACAATGCC





(SEQ ID NO: 100)





2Q5Q

Azospirillum

MKLAEALLRALKDRGAQAMFGIPGDFALPFFKVAEET





brasilense Sp24

QILPLHTLSHEPAVGFAADAAARYSSTLGVAAVTYGA





GAPNMVNAVAGAYAEKSPVVVISGAPGTTEGNAGLLL





HHQGRTLDTQFQVFKEITVAQARLDDPAKAPAEIARV





LGAARAQSRPVYLEIPRNMVNAEVEPVGDDPAWPVD





RDALAACADEVLAAMRSATSPVLMVCVEVRRYGLEA





KVAELAQRLGVPVVTTFMGRGLLADAPTPPLGTYIGV





AGDAEITRLVEESDGLFLLGAILSDTNFAVSQRKIDLRK





TIHAFDRAVTLGYHTYADIPLAGLVDALLERLPPSDRT





TRGKEPHAYPTGLQADGEPIAPMDIARAVNDRVRAGQ





EPLLIAADMGDCLFTAMDMIDAGLMAPGYYAGMGFG





VPAGIGAQCVSGGKRILTVVGDGAFQMTGWELGNCR





RLGIDPIVILFNNASWEMLRTFQPESAFNDLDDWRFAD





MAAGMGGDGVRVRTRAELKAALDKAFATRGRFQLIE





AMIPRGVLSDTLARFYQGQKRLHAAPRE (SEQ ID NO: 101)






2VBG

Lactococcus lactis

MYTVGDYLLDRLHELGIEEIFGVPGDYNLQFLDQIISRE
ATGTACACCGTTGGCGACTACCTGCTGGACCGTCTGCATG




DMKWIGNANELNASYMADGYARTKKAAAFLTTFGV
AACTGGGCATCGAAGAAATCTTTGGCGTGCCGGGTGACT




GELSAINGLAGSYAENLPVVEIVGSPTSKVQNDGKFVH
ATAACCTGCAATTTCTGGATCAGATTATCAGCCGTGAAGA




HTLADGDFKHFMKMHEPVTAARTLLTAENATYEIDRV
CATGAAATGGATTGGTAACGCTAATGAACTGAACGCATC




LSQLLKERKPVYINLPVDVAAAKAEKPALSLEKESSTT
TTATATGGCTGATGGTTACGCACGTACCAAAAAGGCGGCC




NTTEQVILSKIEESLKNAQKPVVIAGHEVISFGLEKTVT
GGCGTTTCTGACCACGTTCGGCGTTGGTGAACTGAGCGCA




QFVSETKLPITTLNFGKSAVDESLPSFLGIYNGKLSEISL
ATTAACGGCCTGGCCGGTTCTTATGCAGAAAATCTGCCGG




KNFVESADFILMLGVKLTDSSTGAFTHHLDENKMISLN
TGGTTGAAATCGTTGGCTCACCGACGTCGAAAGTCCAGA




IDEGIIFNKVVEDFDFRAVVSSLSELKGIEYEGQYIDKQ
ATGATGGCAAGTTTGTGCATCACACCCTGGCCGATGGCGA




YEEFIPSSAPLSQDRLWQAVESLTQSNETIVAEQGTSFF
CTTTAAACATTTCATGAAGATGCACGAACCGGTGACGGCT




GASTIFLKSNSRFIGQPLWGSIGYTFPAALGSQIADKES
GCGCGTACCCTGCTGACGGCGGAAAACGCCACCTATGAA




RHLLFIGDGSLQLTVQELGLSIREKLKPICFIINNDGYTV
ATTGATCGTGTGCTGAGCCAGCTGCTGAAAGAACGCAAG




EREIHGPTQSYNDIPMWNYSKLPETFGATEDRVVSKIV
CCGGTTTACATCAATCTGCCGGTTGATGTCGCCGCAGCTA




RTENEFVSVMKEAQADVNRMYWIELVLEKEDAPKLL
AAGCTGAAAAGCCGGCGCTGTCTCTGGAAAAAGAAAGCT




KKMGKLFAEQNK
CTACCACGAACACCACGGAACAGGTTATTCTGAGCAAAA




(SEQ ID NO: 103)
TCGAAGAATCTCTGAAAAATGCCCAAAAGCCGGTCGTGA





TTGCAGGCCATGAAGTGATCTCATTTGGTCTGGAAAAAAC





CGTCACGCAGTTCGTGTCGGAAACCAAGCTGCCGATTACC





ACGCTGAACTTTGGTAAAAGTGCCGTGGATGAAAGCCTG





CCGTCTTTCCTGGGCATTTATAACGGTAAACTGAGTGAAA





TCTCCCTGAAGAATTTTGTCGAAAGCGCCGATTTCATTCT





GATGCTGGGCGTGAAACTGACCGACAGTTCCACGGGTGC





ATTTACCCATCACCTGGATGAAAACAAGATGATCAGTCTG





AACATCGACGAAGGCATCATCTTCAACAAGGTTGTCGAA





GATTTCGACTTCCGTGCGGTGGTTTCATCGCTGTCCGAAC





TGAAGGGCATTGAATATGAAGGCCAGTACATCGATAAGC





AATACGAAGAATTTATCCCGAGCAGCGCACCGCTGAGCC





AGGACCGTCTGTGGCAAGCAGTTGAATCACTGACGCAGT





CGAACGAAACCATTGTCGCTGAACAAGGCACCAGCTTTTT





CGGTGCGTCCACCATCTTTCTGAAAAGTAATTCCCGTTTC





ATTGGTCAGCCGCTGTGGGGCAGCATCGGTTATACCTTTC





CGGCGGCACTGGGCTCACAAATTGCGGATAAAGAATCGC





GCCATCTGCTGTTCATCGGCGACGGTAGCCTGCAACTGAC





CGTTCAAGAACTGGGTCTGTCTATTCGTGAAAAACTGAAC





CCGATCTGCTTTATTATCAACAATGATGGCTACACGGTGG





AACGCGAAATTCACGGTCCGACCCAGTCATATAACGACA





TCCCGATGTGGAATTACTCGAAACTGCCGGAAACGTTTGG





CGCCACCGAAGATCGTGTCGTGAGTAAGATTGTGCGCAC





CGAAAACGAATTTGTGTCCGTTATGAAAGAAGCACAGGC





TGATGTTAATCGCATGTATTGGATCGAACTGGTCCTGGAA





AAAGAAGACGCTCCGAAGCTGCTGAAAAAGATGGGCAAA





CTGTTTGCGGAACAGAACAAG





(SEQ ID NO: 104)





2VBI

Acetobacter syzygii

MTYTVGMYLAERLVQIGLKHHFAVAGDYNLVLLDQL
ATGACCTATACGGTGGGCATGTACCTGGCTGAACGCCTGG



9H-2
LLNKDMKQIYCCNELNCGFSAEGYARSNGAAAAVVT
TGCAGATTGGCCTGAAACATCACTTTCCGGTGGCTGGCGA




FSVGAISAMNALGGAYAENLPVILISGAPNSNDQGTGH
TTACAACCTGGTGCTGCTGGATCAACTGCTGCTGAACAAA




ILHHTIGKTDYSYQLEMARQVTCAAESITDAHSAPAKI
GACATGAAACAGATTTATTGCTGTAACGAACTGAATTGCG




DHVIRTALRERKPAYLDIACNIASEPCVRPGPVSSLLSE
GCTTTAGCGCAGAAGGTTACGCTCGCTCTAATGGTGCGGC




PEIDHTSLKAAVDATVALLEKSASPVMLLGSKLRAAN
GGCGGCAGTGGTTACCTTCAGTGTGGGTGCCATTTCCGCA




ALAATETLADKLQCAVTIMAAAKGFFEDHAGFRGLY
ATGAACGCTCTGGGCGGTGCTTACGCGGAAAATCTGCCG




WGEVSNPGVQELVETSDALLCIAPVFNDYSTVGWSAW
GTTATTCTGATCTCAGGCGCGCCGAACTCGAATGATCAGG




PKGPNVILAEPDRVTVDGRAYDGFTLRAFLQALAEKA
GCACGGGTCATATCCTGGATCACACCATTGGTAAAACGG




PARPASAQKSSVPTCSLTATSDEAGLTNDEIVRHINALL
ATTATAGCTACCAACTGGAAATGGCACGTCAGGTCACCTG




TSNTTLVAETGDSWFNAMRMTLPRGARVELEMQWGH
TGCGGCCGAATCAATCACGGATGCGCATTCGGCCCCGGC




IGWSVPSAFGNAMGSQDRQHVVMVGDGSFQLTAQEV
AAAAATCGACCACGTTATTCGTACCGCACTGCGTGAACGT




AQMVRYELPVIIFLINNRGYVIEIAIHDGPYNYIKNWDY
AAACCGGCATATCTGGATATCGCGTGCAACATTGCAAGC




AGLMEVFNAGEGHGLGLKATTPKELTEAIARAKANTR
GAACCGTGTGTGCGTCCGGGTCCGGTTAGCTCTCTGCTGA




GPTLIECQIDRTDCTDMLVQWGRKVASTNARKTTLA
GTGAACCGGAAATTGATCATACCTCCCTGAAAGCAGCTGT




(SEQ ID NO: 105)
GGAGGCGACGGTTGGCCTGCTGGAAAAATCAGCCTCGCC





GGTGATGCTGCTGGGCTCAAAACTGCGTGCAGCAAACGC





ACTGGCAGCTACCGAAACGCTGGCAGATAAACTGCAGTG





CGCTGTGACCATGATGGCGGCGGCAAAAGGCTTTTTCCCG





GAAGATCACGCCGGCTTCCGTGGTCTGTATTGGGGCGAA





GTTTCAAATCCGGGTGTCCAGGAACTGGTGGAAACCTCG





GATGGACTGGTGTGTATGGCTCCGGTTTTTAACGACTACA





GCACGGTCGGCTGGTCTGCGTGGCCGAAAGGTCCGAATG





TGATTCTGGCCGAACCGGACCGTGTTACCGTCGATGGTCG





TGCGTATGATGGTTTTACGCTGCGTGCTTTCCTGCAAGCT





CTGGCAGAAAAAGCACCGGCACGTCCGGCTAGTGCACAG





AAAAGTTCCGTTCCGACCTGCAGTCTGACCGCGACGTCCG





ATGAAGCCGGCCTGACGAACGACGAAATCGTTCGGCACA





TTAACGCGCTGCTGACCAGCAATACCACGCTGGTGGCGG





AAACGGGCGATTCTTGGTTCAATGCCATGCGTATGACCCT





GCCGCGTGGTGGACGCGTCGAACTGGAAATGCAGTGGGG





CCATATTGGTTGGAGCGTGCCGTCTGCATTTGGCAATGCT





ATGGGTAGTCAGGATCGTCAACACGTCGTGATGGTGGGC





GACGGTTCCTTCCAGCTGACCGCGCAAGAAGTTGCCCAG





ATGGTCCGTTATCAACTGCCGGTGATTATCTTTCTGATCA





ACAATCGCGGCTACGTTATTGAAATCGCCATTCATGATGG





TCCGTACAACTACATCAAAAACTGGGACTATGCCGGTCTG





ATGGAAGTTTTTAACGCAGGCGAAGGTCACGGCCTGGGT





CTGAAAGCGACCACGCCGAAAGAACTGACCGAAGCCATT





GCACGTGCTAAAGCGAATACCCGCGGCCCGACGCTGATC





GAATGCCAAATTGATCGTACCGACTGTACGGATATGCTGG





TCCAGTGGGGTCGCAAAGTGGCGTCTACCAACGCACGCA





AAACGACGCTGGCG (SEQ ID NO: 106)





3FZN
Agrobacterium
MASVHGTTYELLRRQGIDTVFGNPGSNELPFLKDFPED
ATGGCGAGCGTGCATGGCACCACGTATGAACTGCTGCGT




radiobacter

FRYILALQEACVVGIADGYAQASRKPAFINLHSAAGTG
CGCCAGGGTATCGATACCGTGTTCGGCAACCCGGGTTCAA




NAMGALSNAWNSHSPLIVTAGQQTRAMIGVEALLTNV
ATGAACTGCCGTTTCTGAAAGATTTCCCGGAAGACTTTCG




DAANLPRPLVKWSYEPASAAEVPHAMSRAIHMASMA
TTATATCCTGGCACTGCAAGAAGCGTGCGTGGTTGGCATT




PQGPVYLSVPYDDWDKDADPQSHHLFDRHVSSSVRLN
GCAGACGGTTACGCGCAAGCCTCGCGCAAACCGGCGTTT




DQDLDILVKALNSASNPAIVLGPDVDAANANADCVML
ATTAACCTCCATAGCGCGGCCGGCACCGGTAATGCAATG




AERLKAPVWVAPSAPRCPFPTRHPCFRGLMPAGIAAIS
GGCGCTCTGAGCAACGCGTGGAACAGCCACAGCCCGCTG




QLLEGHDVVLVIGAPVFRYHQYDPGQYLKPGTRLISVT
ATCGTGACCGCGGGCCAGCAAACGCGTGCCATGATTGGT




CDPLEAARAPMGDAIVADIGAMASALANLVEESSRQL
GTGGAAGCACTGCTGACGAACGTTGATGCAGCTAATCTG




PTAAPEPAKVDQDAGRLHPETVFDTLNDMAPENAIYL
CCGCGCCCGCTGGTCAAATGGTCCTATGAACCGGCATCAG




NESTSTTAQMWQRLNMRNPGSYYFCAAGGLGFALPA
CGGCCGAAGTCCCCCATGCAATCTCTCGTGCCATCCACAT




AIGVQLAEPERQVIAVIGDGSAVYSISALWTAAQYNIPT
GGCAAGTATGGCCCCGCAGGGTCCGGTCTATCTGTCTGTG




IFVIMNNGTYGALRWFAGVLEAENVPGLDVPGIDFRA
CCGTACGATGACTGGGATAAAGACGCCGATCCGCAGAGT




LAKGYGVQALKADNLEQLKGSLQEALSAKGPVLIEVS
CATCACCTGTTTGATCGTCATGTTAGCTCTAGTGTCCGCCT




TVSPVK
GAACGACCAGGATCTGGATATCCTGGTTAAAGCACTGAA




(SEQ ID NO: 107)
CTCTGCTAGTAATCCGGCGATTGTGCTGGGTCCGGATGTT





GACGCAGCTAACGCAAATGCTGATTGCGTGATGCTGGCT





GAACGTCTGAAAGCGCCGGTTTGGGTCGCACCGTCGGCTC





CGCGTTGCCCGTTCCCGACCCGTCACCCGTGTTTTCGTGG





TCTGATGCCGGCCGGTATTGCAGCAATCAGCCAGCTGCTG





GAAGGCCATGATGTCGTGCTGGTCATCGGTGCACCGGTGT





TCCGCTATCACCAGTACGACCCGGGCCAATATCTGAAACC





GGGTACCCGTCTGATTTCTGTTACGTGTGATCCGCTGGAA





GCAGCTCGCGCGCCGATGGGCGATGCAATCGTGGCAGAC





ATTGGTGCGATGGCCAGTGCACTGGCTAACCTGGTTGAAG





AATCCTCACGTCAGCTGCCGACCGCGGCCCCGGAACCGG





CTAAAGTTGATCAAGACGCAGGTCGTCTGCACCCGGAAA





CCGTCTTTGATACGCTGAATGACATGGCCCCGGAAAACGC





AATTTACCTGAATGAATCCACGTCAACCACGGCCCAGATG





TGGCAACGTCTGAACATGCGCAATCCGGGTTCTTATTACT





TCTGTGCAGCTGGCGGTCTGGGTTTTGCACTGCCGGCGGC





AATCGGTGTGCAGCTGGCGGAACCGGAACGTCAAGTGAT





TGCCGTTATCGGCGATGGTAGCGCCAACTATTCGATTAGC





GCACTGTGGACCGCAGCTCAGTACAATATTCCGACGATCT





TCGTTATTATGAACAATGGCACCTATGGTGCCCTGCGTTG





GTTTGCAGGTGTGCTGGAAGCTGAAAACGTTCCGGGCCTG





GATGTCCCGGGTATCGACTTCCGTGCACTGGCAAAAGGCT





ACGGTGTTCAGGCACTGAAAGCTGATAATCTGGAACAGC





TGAAAGGCTCGCTGCAAGAAGCGCTGAGCGCCAAAGGTC





CGGTGCTGATTGAAGTCTCTACCGTGAGTCCGGTTAAA





(SEQ ID NO: 108)





1ZPD

Zymomonas

MSYTVGTYLAERLVQIGLKHHFAVAGDYNLVLLDNLL
ATGAGCTATACCGTGGGCACGTACCTGGCTGAACGTCTGG




mobilis subsp.

LNKNMEQVYCCNELNCGFSAEGYARAKGAAAAVVT
TTCAAATTGGCCTGAAACATCACTTTGCCGTGGCCGGTGA




mobilis

YSVGALSAFDAIGGAYAENLPVILISGAPNNNDHAAGH
TTATAATCTGGTTCTGCTGGACAACCTGCTGCTGAATAAA




VLHRALGKTDYHYQLEMAKNITAAAEAIYTPEEAPAK
AACATGGAACAGGTGTACTGCTGTAATGAACTGAACTGC




IDHVIKTALREKKPVYLEIACNIASMPCAAPGPASALFN
GGCTTCAGTGCGGAAGGTTATGCTCGCGCGAAGGGTGCG




DEASDEASLNAAVDETLKFIANRDKVAVLVGSKLRAA
GCGGCGGCGGTGGTTACCTACAGTGTTGGTGCCCTGTCCG




GAEEAAVKFTDALGGAVATMAAAKSFFPEENALYIGT
CATTTGATGCTATCGGCGGTGCCTATGCAGAAAATCTGCC




SWGEVSYPGVEKTMKEADAVIALAPVFNDYSTTGWT
GGTTATTCTGATCTCCGGCGCCCCGAACAATAACGATCAT




DIPDPKKLVLAEPRSVVVNGIRFPSVIILKDYLTRLAQK
GCGGCCCGTCATGTCCTGCATCACGCACTGGGTAAAACC




VSKKTGSLDFFKSLNAGELKKAAPADPSAPLVNAEIAR
GACTATCATTACCAGCTGGAAATGGCAAAAAACATTACC




QVEALLTPNTTVIAETGDSWFNAQRMKLPNGARVEYE
GCAGCTGCGGAAGCGATCTATACGCCGGAAGAAGCTCCG




MQWGHIGWSVPAAFGYAVGAPERRNILMVGDGSFQL
GCGAAAATTGATCACGTTATCAAAACCGCGCTGCGTGAG




TAQEVAQMVRLKLPVIIFLINNYGYTIEVMIHDGPYNNI
AAAAAACCGGTCTACCTGGAAATTGCGTGCAATATCGCCT




KNWDYAGLMEVFNGNGGYDSGAAKGLKAKTGGELA
CAATGCCGTGTGCAGCACCGGGTCCGGCATCGGCACTGTT




EAIKVALANTDGPTLIECFIGREDCTEELVKWGKRVAA
TAATGATGAAGCAAGCGACGAAGCTTCTCTGAACGCTGC




ANSRKPVNKVV
GGTGGATGAAACCCTGAAATTCATTGCGAACCGTGACAA




(SEQ ID NO: 109)
AGTTGCAGTCCTGGTGGGCAGCAAACTGCGTGCCGCAGG





TGCAGAAGAAGCTGCGGTCAAATTTACCGATGCACTGGG





CGGTGCTGTGGCAACGATGGCCGCAGCTAAAAGCTTTTTC





CCGGAAGAAAATGCCCTGTATATCGGCACCTCATGGGGT





GAAGTGTCGTACCCGGGTGTTGAAAAAACGATGAAAGAA





GCCGATGCAGTCATTGCTCTGGCGCCGGTGTTCAATGACT





ATAGCACCACGGGCTGGACCGATATCCCGGACCCGAAAA





AACTGGTTCTGGCGGAACCGCGTAGCGTCGTGGTTAACG





GTATTCGCTTTCCGTCTGTGCATCTGAAAGATTACCTGAC





CCGTCTGGCCCAAAAAGTTAGCAAGAAAACCGGCTCTCT





GGACTTTTTCAAAAGTCTGAATGCGGGTGAACTGAAAAA





AGCAGCACCGGCCGATCCGTCCGCACCGCTGGTCAATGC





GGAAATTGCACGTCAGGTGGAAGCACTGCTGACCCCGAA





CACCACGGTGATCGCCGAAACGGGCGACTCTTGGTTCAAT





GCACAACGTATGAAACTGCCGAACGGTGCGCGCGTTGAA





TATGAAATGCAGTGGGGCCATATTGGTTGGAGCGTTCCGG





CAGCTTTTGGCTACGCAGTCGGTGCTCCGGAACGTCGCAA





CATCCTGATGGTGGGCGATGGTTCGTTCCAGCTGACCGCA





CAAGAAGTTGCTCAGATGGTCCGTCTGAAACTGCCGGTCA





TCATCTTTCTGATCAACAACTACGGCTACACGATTGAAGT





GATGATCCACGATGGTCCGTATAATAACATCAAAAATTG





GGACTACGCCGGCCTGATGGAAGTGTTTAATGGTAACGG





CGGTTATGATAGTGGCGCGGCCAAAGGTCTGAAAGCGAA





AACCGGCGGTGAACTGGCCGAAGCAATTAAAGTTGCTCT





GGCGAACACCGATGGCCCGACGCTGATTGAATGCTTCATC





GGTCGCGAAGACTGTACCGAAGAACTGGTTAAATGGGGC





AAACGTGTCGCAGCTGCGAATAGCCGCAAACCGGTGAAC





AAAGTCGTG (SEQ ID NO: 110)





1OZF

Klebsiella

MDKQYPVRQWAHGADLVVSQLEAQGVRQVFGIPGAK





pneumoniae subsp.

IDKVFDSLLDSSIRIIPVRHEANAAFMAAAVGRITGKAG





pneumoniae

VALVTSGPGCSNLITGMATANSEGDPVVALGGAVKRA





DKAKQVHQSMDTVAMFSPVTKYAIEVTAPDALAEVV





SNAFRAAEQGRPGSAFVSLPQDVVDGPVSGKVLPASG





APQMGAAPDDAIDQVAKLIAQAKNPIFLLGLMASQPE





NSKALRRLLETSHIPVTSTYQAAGAVNQDNFSRFAGRV





GLFNNQAGDRLLQLADLVICIGYSPVEYEPAMWNSGN





ATLVHIDVLPAYEERNYTPDVELVGDIAGTLNKLAQNI





DHRLVLSPQAAEILRDRQHQRELGDRRGAQLNQFALH





PLRIVRAMQDIVNSDNVTLTVDMGSFHIWIARYLYTFRA





RQVMISNGQQTMGVALPWAIGAWLVNPERKVVSVSG





DGGFLQSSMELETAVRLKANVLHLIWVDNGYNMVAI





QEEKKYQRLSGVEFGPMDFKAYAESFGAKGFAVESAE





ALEPTLRAAMDVDGPAVVAIPVDYRDNPLLMGQLHLS





QIL





(SEQ ID NO: 111)






YP_006485164.1

Pseudomonas

MKTVHSASYEILRSHGLTTVFGNPGSNELPFLKDFPED





aeruginosa

FRYILGLHEGAVVGMADGFALASGRPAFVNLHAAAGT





GNGMGALTNAWYSHSPLVITAGQQVRSMIGVEAMLA





NVDAGQLPKPLVKWSHEPACAQDVPRALSQAIQTASL





PPRAPVYLSIPYDDWAQPAPAGVEHLAARQVSGAALP





APALLAELGERLSKSRNPVLVLGPDVDGANANGLAVE





LAEKLRMPAWGAPSASRCPFPTRHACFRGVLPAAIAGI





SRLLDGHDLILVVGAPVFRYHQFAPGDYLPAGAELVQ





VTCDPGEAARAPMGDALVGDIALTLEALLEQVRPSAR





PLPEALPRPPALAEEGGPLRPETVFDVIDALAPRDAIFV





KESTSTVTAFWQRVEMREPGSYFFPAAGGLGFGLPAA





VGAQLAQPRRQVIGIIGDGSANYGITALWSAAQYRVP





AVFILKNGTYGALRWFAGVLEVPDAPGLDVPGLDFC





AIARGYGVEALHAATREELEGALKHALAADRPVLIEV





PTQTIEP (SEQ ID NO: 112)






YP_005461458.1

Actinoplanes

MIDLDGTVTVAEYLGLRLRHAGVEHLFGVPGDFNLNL





missouriensis

LDGLAFVEGLRWVGSPNELGAGYAADAYARRRGLSA





LFTTYGVGELSAINAVAGSAAEDSPVVHVVGSPRTTTV





AGGALVHHTIADGDFRHFARAYAEVTVAQAMVTATD





AGAQIDRVLLAALTHRKPVYLSIPQDLALHRIPAAPLR





EPLTPASDPAAVERFRTAVRDLLTPAVRPIMLVGQLVS





RYGLSTLVTDMTTRSGIPVAAQLSAKGVIDESVEGNLG





LYAGSMLDGPAASLIDSADVVLHLGTALTAELTGFFTH





RRPDARTVQLLSTAALVGTTRFDNVLFPDAMTTLAEV





LTTFPAPARLAAPTTRAEPTGLAASITPPAPSAVDLTAS





TATDLTAPTAGDISEMSRVLTQDAFWAGMQAWLPAG





HALVADTGTSYWGALALRLPGDTVFLGQPIWNSIGWA





LPAVLGQGLADPDRRPVLVIGDGAAQMTIQELSTIVAA





GLRPIILLLNNRGYTIERALQSPNAGYNDVADWNWRA





VVAAFAGPDTDYHHAATGTELAKALTAASESNRPVFI





EVELDAFDTPPLLRRLAERATAPS (SEQ ID NO: 113)






YP_006991301.1

Carnobacterium

MYTVGNYLLDRLTELGIRDIFGVPGDYNLKFLDHVMT





maltaromaticum

HKELNWIGNANELNAAYAADGYARTKGIAALVTTFG




LMA28
VGELSAANGTAGSYAEKVPVVQIVGTPTTAVQNSHKL





VHHTLGDGRFDHFEKMQTEINGAIAHLTADNALAEID





RVLRIAVTERCPVYINLAIDVAEVVAEKPLKPLMEESK





KVEEETTLVLNKIEKALQDSKNPVVLIGNEIASFHLESA





LADFVKKFNLPVTVLPFGKGGFDEEDAHFIGVYTGAPT





AESIKERVEKADLILIIGAKLTDSATAGFSYDFEDRQVIS





VGSDEVSFYGEIMKPVAFAQFVNGLNSLNYLGYTGEIK





QVERVADIEAKASNLTQNNFWKFVEKYLSNGDTLVAE





QGTSFFGASLVPLKSKMKFIGQPLWGSIGYTFPAMLGS





QIANPASRHLLFIGDGSLQLTIQELGMTFREKLTPIVFVI





NNDGYTVEREIHGPNELYNDIPMWDYQNLPYVFGGN





KGNVATYKVTTEEELVAAMSQARQDTTRLQWIEVVM





GKQDSPDLLVQLGKVFAKQNS (SEQ ID NO: 114)






WP_003075272.1

Comamonas

MPANTAPNAQAAEVFTVRHAVINMLRELGMTRIFGNP





testosteroni

GSTELPLFRDYPEDFSYILGLQETVVVGMADGYAQAT





RNASFVNLHSAAGVGHAMANIFTAFKNRTPMVITAGQ





QTRSLLQFDPFLHSNQAAELPKPYVKWSCEPARAEDV





PQALARAYYIAMQEPRGPVFVSIPADDWDVPCEPITLR





KVGFETRPDPRLLDSIGQALEGARAPAFVVGAAVDRS





QAFEAVQALAERHQARVYVAPMSGRCGFPEDHALFG





GFLPAMRERIVDRLSGHDVVFVIGAPAFTYHVEGHGPF





IAEGTQLFQLIEDPALAAWAPVGDAAVGNIRMGVQELL





ARPLTHPRPALQPRPAIPAPAAPEPGRLMTDAFLMHTL





AQVSRDSIIVEEAPGSRSIIQAHLPIYAAETFFTMCSGG





LGHSLPASVGIALARPDKKVIGVIGDGSAMYAIQALWS





AAHLKLPVTYIIVKNRRYAALQDFSRVFGYREGEKVE





GTDLPDIDFVALAKGQGCDGVRVTDAAQLSQVLRDAL





RSPRATLVEVEVA (SEQ ID NO: 115)






WP_020634527.1

Amycolatopsis

MNVAELVGRTLAELGVGAAFGVVGSGNFVVTNGLRA





orientalis

GGVRFVAARHEGGAASMADAYARMSGRVSVLSLHQ





HCCB10007

GCGLTNALTGITEAAKSRTPMIVLTGDTAASAVLSNFR





IGQDALATAVGAVPERVHSAPTAVADTVRAYRTAVQ





QRRTVLLNLPLDVQAQEAPEAVEIPKVRGPAPIRPDAG





MVAKLADLLAEARRPVFIAGRGARASAVPLRELAEISG





ALLATSAVAHGLFHDDPFSLGISGGFSSPRTADLIVDAD





LVIGWGCALNMWTTRHGTLLGPAARLVQVDVEQAAL





GAHRPIDLGVVGDVAGTAVDVHAELDKRGHQRSREA





PTGTRWNDVPYNDLSGDGRIDPRTLSRRLDEILPAERM





VSIDSGNFMGYPSAYLSVPDENGFCFTQAFQSIGLGLG





TAIGAALARPDRLPVLGVGDGGFHMAVSELETAVRLR





IPLVIVVYNDAAYGAEIHHFGDADMTTVRFPDTDIAAI





GRGFGCDGVTVRSVGDLAAVKEWLGGPRDAPLVIDA





KIADDGGSWWLAEAFRH (SEQ ID NO: 116)






1OVM

Enterobacter sp.

MRTPYCVADYLLDRLTDCGADHLFGVPGDYNLQFLD





HVIDSPDICWVGCANELNASYAADGYARCKGFAALLT





TFGVGELSAMNGIAGSYAEHVPVLHIVGAPGTAAQQR





GELLHHTLGDGEFRHFYHMSEPITVAQAVLTEQNACY





EIDRVLTTMLRERRPGYLMLPADVAKKAATPPVNALT





HKQAHADSACLKAFRDAAENKLAMSKRTALLADFLV





LRHGLKHALQKWVKEVPMAHATMLMGKGIFDERQA





GFYGTYSGSASTGAVKEAIEGADTVLCVGTRFTDTLTA





GFTHQLTPAQTIEVQPHAARVGDVWFTGIPMNQAIETL





VELCKQHVHAGLMSSSSGAIPFPQPDGSLTQENFWRTL





QTFIRPGDIILADQGTSAFGAIDLRLPADVNFIVQPLWG





SIGYTLAAAFGAQTACPNRRVIVLTGDGAAQLTIQELG





SMLRDKQHPIILVLNNEGYTVERAIHGAEQRYNDIALW





NWTHIPQALSLDPQSECWRVSEAEQLADVLEKVAHHE





RLSLIEVMLPKADIPPLLGALTKALEACNNA (SEQ ID NO: 117)






2Q5Q

Azospirillum

MKLAEALLRALKDRGAQAMFGIPGDFALPFFKVAEET





brasilense Sp24

QILPLHTLSHEPAVGFAADAAARYSSTLGVAAVTYGA





GAFNMVNAVAGAYAEKSPVVVISGAPGTTEGNAGLLL





HHQGRTLDTQFQVFKEITVAQARLDDPAKAPAEIARV





LGAARAQSRPVYLEIPRNMVNAEVEPVGDDPAWPVD





RDALAACADEVLAAMRSATSPVLMVCVEVRRYGLEA





KVAELAQRLGVPVVTTFMGRGLLADAPTPPLGTYIGV





AGDAEITRLVEESDGLFLLGAILSDTNFAVSQRKIDLRK





TIHAFDRAVTLGYHTYADIPLAGLVDALLERLPPSDRT





TRGKEPHAYPTGLQADGEPIAPMDIARAVNDRVRAGQ





EPLLIAADMGDCLFTAMDMIDAGLMAPGYYAGMGFG





VPAGIGAQCVSCGKRILTVVGDGAFQMTGWELGNCR





RLGIDPIVILFNNASWEMLRTFQPESAFNDLDDWRFAD





MAAGMGGDGVRVRTRAELKAALDKAFATRGRFQLIE





AMIPRGVLSDTLARFVQGQKRLHAAPRE (SEQ ID NO: 118)






2VBG

Lactococcus lactis

MNVAELVGRTLAELGVGAAFGVVGSGNFVVTNGLRA





GGVRFVAARHEGGAASMADAYARMSGRVSVLSLHQ





GCGLTNALTGITEAAKSRTPMIVLTGDTAASAVLSNFR





IGQDALATAVGAVPERVHSAPTAVADTVRAYRTAVQ





QRRTVLLNLPLDVQAQEAPEAVEIPKVRGPAPIRPDAG





MVAKLADLLAEARRPVFIAGRGARASAVPLRELAEISG





ALLATSAVAHGLFHDDPFSLGISGGFSSPRTADLIVDAD





LVIGWGCALNMWTTRHGTLLGPAARLVQVDVEQAAL





GAHRPIDLGVVGDVAGTAVDVHAELDKRGHQRSREA





PTGTRWNDVPYNDLSGDGRIDPRTLSRRLDEILPAERM





VSIDSGNFMGYPSAYLSVPDENGFCFTQAFQSIGLGLG





TAIGAALARPDRLPVLGVGDGGFHMAVSELETAVRLR





IPLVIVVYNDAAYGAEIHHFGDADMTTVRFPDTDIAAI





GRGFGCDGVTVRSVGDLAAVKEWLGGPRDAPLVIDA





KIADDGGSWWLAEAFRH (SEQ ID NO: 119)






2VBI

Acetobacter syzygii

MTYTVGMYLAERLVQIGLKHHFAVAGDYNLVLLDQL




9H-2
LLNKDMKQIYCCNELNCGFSAEGYARSNGAAAAVVT





FSVGAISAMNALGGAYAENLPVILISGAPNSNDQGTGH





ILHHTIGKTDYSYQLEMARQVTCAAESITDAHSAPAKI





DHVIRTALRERKPAYLDIACNIASEPCVRPGPVSSLLSE





PEIDHTSLKAAVDATVALLEKSASPVMLLGSKLRAAN





ALAATETLADKLQCAVTIMAAAKGFFPEDHAGFRGLY





WGEVSNPGVQELVETSDALLCIAPVFNDYSTVGWSAW





PKGPNVILAEPDRVTVDGRAYDGFTLRAFLQALAEKA





PARPASAQKSSVPTCSLTATSDEAGLTNDEIVRHINALL





TSNTTLVAETGDSWFNAMRMTLPRGARVELEMQWGH





IGWSVPSAFGNAMGSQDRQHVVMVGDGSFQLTAQEV





AQMVRYELPVIIFLINNRGYVIEIAIHDGPYNYIKNWDY





AGLMEVFNAGEGHGLGLKATTPKELTEAIARAKANTR





GPTLIECQIDRTDCTDMLVQWGRKVASTNARKTTLAL





E (SEQ ID NO: 120)






3FZN

Agrobacterium

MASVHGTTYELLRRQGIDTVFGNPGSNELPFLKDFPED





radiobacter

FRYILALQEACVVGIADGYAQASRKPAFINLHSAAGTG





NAMGALSNAWNSHSPLIVTAGQQTRAMIGVEALLTNV





DAANLPRPLVKWSYEPASAAEVPHAMSRAIHMASMA





PQGPVYLSVPYDDWDKDADPQSHHLFDRHVSSSVRLN





DQDLDILVKALNSASNPAIVLGPDVDAANANADCVML





AERLKAPVWVAPSAPRCPFPTRHPCFRGLMPAGIAAIS





QLLEGHDVVLVIGAPVFRYHQYDPGQYLKPGTRLISVT





CDPLEAARAPMGDAIVADIGAMASALANLVEESSRQL





PTAAPEPAKVDQDAGRLHPETVFDTLNDMAPENAIYL





NESTSTTAQMWQRLNMRNPGSYYFCAAGGLGFALPA





AIGVQLAEPERQVLAVIGDGSANYSISALWTAAQYNIPT





IFVIMNNGTYGALRWFAGVLEAENVPGLDVPGIDFRA





LAKGYGVQALKADNLEQLKGSLQEALSAKGPVLIEVS





TVSPVKHHHHHH (SEQ ID NO: 121)









Protein Production and Enzyme Purification


Overnight cultures of BLR cells suspended in a 2 mL volume were transformed with a pET29b+ plasmid (encoding polypeptides of interest with a C-terminal His-tag) and grown in Terrific Broth with 50 μg/mL kanamycin. Cultures were diluted 1:1,000 in 500 mL of Terrific Broth with 1 mM MgSO4, 1% glucose and 50 μg/mL antibiotic and then grown at 37° C. for 24 hours. Cultures were pelleted at 4,700 RPM for 10 minutes and resuspended in auto-induction media (LB broth, 1 mM MgSO4, 0.1 mM TPP, 1×NPS and 1×5052) for induction at 18° C. for 20 hours. At the end of induction, cells were centrifuged, the supernatant was removed, and the cells were resuspended in 40 mL lysis buffer (100 mM HEPES, pH 7.5, 100 mM NaCl, 10% glycerol, 0.1 mM TPP, 1 mM MgSO4, 10 mM imidazole, 1 mM TCEP) and 1 mM phenylmethylsulphonyl fluoride. The cell lysate suspension was sonicated for 2 min, followed by centrifugation at 4,700 RPM. The supernatant was loaded onto a gravity flow column with 500 μL cobalt beads and was washed five times with 15 mL of wash buffer. Proteins were eluted with 1,000 mL of elution buffer (100 mM HEPES, pH 7.5, 100 mM NaCl, 10% glycerol, 0.1 mM TPP, 1 mM MgSO4, 200 mM imidazole and 1 mM TCEP). Protein concentrations were determined using a Synergy H1 spectrophotometer (Biotek) by measuring absorbance at 280 nm using calculated extinction coefficients.
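By way of non-limiting illustration, the conversion from absorbance at 280 nm to protein concentration can be sketched with the Beer-Lambert relation (A = ε · c · l). The function name, the extinction coefficient, the molecular weight, and the 1 cm path length below are hypothetical values chosen for the example and are not taken from the present disclosure.

```python
def protein_concentration_mg_per_ml(a280, ext_coeff_m_cm, mol_weight_da, path_cm=1.0):
    """Beer-Lambert: A = epsilon * c * l, so c (mol/L) = A / (epsilon * l).

    ext_coeff_m_cm is the calculated molar extinction coefficient (M^-1 cm^-1);
    mol_weight_da is the protein molecular weight (g/mol).
    """
    if ext_coeff_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_m_cm * path_cm)  # mol/L
    return molar * mol_weight_da               # g/L, numerically equal to mg/mL


# Hypothetical inputs for illustration only (not measured values).
conc = protein_concentration_mg_per_ml(a280=0.50, ext_coeff_m_cm=50_000, mol_weight_da=60_000)
print(f"{conc:.2f} mg/mL")
```

In a microplate the optical path length is typically shorter than 1 cm and depends on the well volume, so path_cm would need to be set or corrected accordingly.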


Enzyme Activity Assay and Kinetic Characterization


All substrates were dissolved in MilliQ H2O and the pH was adjusted to 7.2 as necessary. Activity for oxaloacetate, pyruvate, and 2-ketoisovalerate was measured at a 1 mM substrate concentration. The assay was performed in a 96-well half-area plate. Each reaction contained reaction buffer (100 mM HEPES, 100 mM NaCl, 10% glycerol, pH 7.2), ADH (Sigma-Aldrich, A7011, 100 U/mL for pyruvate, 600 U/mL for oxaloacetate, and 600 U/mL for 2-ketoisovalerate), and a final concentration of 0.5 mM NADPH, 0.1 mM TPP, and 1 mM MgSO4. A range of substrate concentrations (0.1 mM-5 mM) was used to perform steady-state kinetics measurements over a period of one hour. Absorbance readings were taken at one-minute intervals at 340 nm at 21° C. for 60 minutes using the Synergy H1 spectrophotometer (Biotek). Kinetic parameters (kcat and KM) were determined by fitting initial velocity versus substrate concentration data to the Michaelis-Menten equation.
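As a non-limiting sketch of how such a coupled assay readout may be converted into a specific activity, the following example fits a straight line to the early absorbance readings at 340 nm and converts the slope into μmol of cofactor consumed per mg of enzyme per minute. The molar extinction coefficient of NAD(P)H at 340 nm (6220 M−1 cm−1) is a standard literature value; the function names, reaction volume, enzyme amount, and 1 cm path length are assumptions made for the example and are not stated in the present disclosure.

```python
NADPH_EXT_340 = 6220.0  # M^-1 cm^-1, standard value for NAD(P)H at 340 nm


def slope_per_min(times_min, absorbances):
    """Ordinary least-squares slope of absorbance versus time (A340 per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


def specific_activity(times_min, absorbances, volume_l, enzyme_mg, path_cm):
    """Specific activity in umol cofactor consumed per mg enzyme per minute."""
    # A340 falls as NAD(P)H is oxidized, so the consumption rate is the negated slope.
    rate_molar = -slope_per_min(times_min, absorbances) / (NADPH_EXT_340 * path_cm)
    umol_per_min = rate_molar * volume_l * 1e6
    return umol_per_min / enzyme_mg


# Hypothetical readings: A340 falls by 0.0622 per minute over the first five minutes.
times = [0, 1, 2, 3, 4]
a340 = [1.0 - 0.0622 * t for t in times]
sa = specific_activity(times, a340, volume_l=100e-6, enzyme_mg=0.001, path_cm=1.0)
print(f"{sa:.3f} umol/mg/min")
```

Only the initial, approximately linear portion of each progress curve would normally be passed to the slope calculation.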


Results



FIG. 4 and Table 3 show the activity of 56 candidate oxaloacetate decarboxylases towards the substrates oxaloacetate, pyruvate, and 2-ketoisovalerate.









TABLE 3

Activity of oxaloacetate decarboxylases

Activity (μmol · mg−1 · min−1)

Enzyme name or UniProt/Genbank ID | Species | Oxaloacetate | 2-ketoisovalerate | Pyruvate
--- | --- | --- | --- | ---
4COK | Gluconacetobacter diazotrophicus | 5533.300 | 14.118 | 19333.333
A0A0F6SDN1_9DELT | Sandaracinus amylolyticus | 12.307 | 15.578 | 490.212
4K9Q | Polynucleobacter necessarius subsp. asymbioticus | 10.981 | 55.816 | 0.000
D6ZJY9_MOBCV | Mobiluncus curtisii | 0.000 | 15.337 | 32.277
Q1LMD8_CUPMC | Cupriavidus metallidurans | 4.712 | 6.326 | 0.000
Q9F768 | Bacteroides fragilis | 4.259 | 0.000 | 0.000
I3BXS7_9GAMM | Thiothrix nivea DSM 5205 | 8.059 | 21.794 | 0.000
1JSC | Saccharomyces cerevisiae | 21.015 | 22.577 | 0.000
O86938|PPD_STRVT | Streptomyces viridochromogenes | 0.000 | 3.627 | 0.000
3L84_3M34 | Campylobacter jejuni | 14.554 | 0.000 | 30.758
1upa_A | Streptomyces clavuligerus | 1.733 | 17.287 | 1.499
A0A016CS86_BACFG | Fibrobacter succinogenes | 0.000 | 14.840 | 0.000
A0A0F2PQV5_9FIRM | Peptococcaceae bacterium BRH_c4b | 26.972 | 0.000 | 24.122
D7DTG5_METV3 | Methanococcus voltae | 3.983 | 9.969 | 27.183
3E9Y | Arabidopsis thaliana | 2.499 | 0.000 | 0.000
2ZKT | Pyrococcus furiosus | 2.385 | 5.429 | 18.603
A0A124FLS8_9FIRM | Clostridia bacterium 62_21 | 6.465 | 57.886 | 79.706
4WBX | Pyrococcus furiosus | 0.000 | 2424.874 | 69.184
C4L9G3_TOLAT | Tolumonas auensis | 4.623 | 15.720 | 72.346
A0A0K1FGX4_9FIRM | Selenomonas noxia ATCC 43541 | 4.326 | 8.736 | 154.754
A0A0R2PY37_9ACTN | Acidimicrobium sp. BACL17 | 34.977 | 23.241 | 617.232
X1WK73_ACYPI | Acyrthosiphon pisum | 23.275 | 61.946 | 1162.672
B1HLR4_BURPE | Burkholderia pseudomallei | 0.000 | 13.333 | 13.333
X8CA07_MYCXE | Mycobacterium xenopi 3993 | 0.000 | 33.333 | 26.600
D1Y3P7_9BACT | Pyramidobacter piscolens W5455 | 0.000 | 0.000 | 26.700
F4RJP4_MELLP | Melampsora laricipopulina | 13.333 | 24.444 | 26.600
A0A081BQW3_9BACT | Candidatus Moduliflexus flocculans | 13.333 | 42.222 | 66.667
CAK95977 | Pseudomonas fluorescens | 10.22193433 | 0 | 0
YP_831380 | Arthrobacter sp. | 15.81263828 | 0 | 0
ZP_06547677 | Pseudomonas putida CSV86 | 2.636659175 | 708.837523* | 1648.5245*
ZP_06846103 | Halotalea alkalilenta | 42.16910984 | 17.5671744* | 1195.18032*
ZP_07290467 | Streptomyces sp. | 0 | 83.3824552* | 267.885245*
ZP_08570611 | Rheinheimera sp. A13L | 39.1977264 | 0 | 0
YP_001240047 | Bradyrhizobium sp. STM 3843 | 0 | 0 | 0
YP_001279645 | Psychrobacter sp. | 3.556735997 | 0 | 0
ZP_01901192 | Roseobacter sp. AzwK-3b | 0 | 0 | 0
ZP_06549025 | Serratia marcescens FGI94 | 7.392211819 | 139902.1428 | 9.954203568
ZP_07033476 | Granulicella mallensis ATCC BAA-1857 | 7.065903742 | 811.4324283 | 1174.57377
WP_010764607.1 | Enterococcus haemoperoxidus ATCC BAA-382 | 48.42956916 | 63422.30474 | 1689.737705
WP_002115026.1 | Acinetobacter baumannii | 2.410507246 | 0 | 30.67169555
YP_005756646.1 | Staphylococcus aureus | 13.01208771 | 792778.8092 | 15900.58689
WP_008347133.1 | Bacillus pumilus SAFR-032 | 1.544738956 | 0 | 0
WP_018535238.1 | Streptomyces glaucescens | 11.67518701 | 93.58311535 | 35.54345178
YP_006485164.1 | Pseudomonas aeruginosa | 44.89076789 | 242.8363761 | 113.7848268
YP_005461458.1 | Actinoplanes missouriensis | 47.6189372 | 70.38233411 | 370.9180328
YP_006991301.1 | Carnobacterium maltaromaticum LMA28 | 52.96875 | 195862.9999 | 2055.147506
NP_594083.1 | Schizosaccharomyces pombe | 1.312105291 | 0 | 8424.567708
WP_003075272.1 | Comamonas testosteroni | 24.95980669 | 623.2146098 | 147.6722275
WP_020634527.1 | Amycolatopsis orientalis HCCB10007 | 20.61304942 | 4.067348776 | 11.61476828
1OVM | Enterobacter sp. | 18.7477487 | 8954.54365* | 158.667580*
2Q5Q | Azospirillum brasilense Sp24 | 10.86768802 | 0 | 23.95798121
2VBG | Lactococcus lactis | 35.41517071 | 67191.9 | 1257
2VBI | Acetobacter syzygii 9H-2 | 16.99543089 | 36.2215268* | 201944.262*
3FZN | Agrobacterium radiobacter | 27 | 1987.26023* | 370.918032*
1ZPD | Zymomonas mobilis subsp. mobilis | 0 | 18.1191493* | 453344.262*
1OZF | Klebsiella pneumoniae subsp. pneumoniae | 4.537374205 | 419.706428* | 391.524590*

*Indicates values calculated based on published data (Mak, W. S. et al. (2015) Nat. Commun. 6: 10005).






Functional characterization indicated that 45 of the 56 diverse enzyme candidates identified from the genomic database described earlier showed activity towards oxaloacetate. Among these active homologues, pyruvate decarboxylase from Gluconacetobacter diazotrophicus (PDB code: 4COK; see van Zyl, L. J. et al. (2014) BMC Struct. Biol. 14:21) was found to be the most active. As shown in Table 3, 4COK exhibited more than 100-fold higher activity towards oxaloacetate than any other decarboxylase tested.
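The fold difference stated above can be checked directly against the oxaloacetate column of Table 3. The following is a minimal sketch using the value for 4COK and the five next most active candidates; the dictionary and function names are illustrative only.

```python
# Oxaloacetate activities (umol/mg/min) copied from Table 3 for 4COK
# and the five next most active candidates.
oxaloacetate_activity = {
    "4COK": 5533.300,
    "YP_006991301.1": 52.96875,
    "WP_010764607.1": 48.42956916,
    "YP_005461458.1": 47.6189372,
    "YP_006485164.1": 44.89076789,
    "ZP_06846103": 42.16910984,
}


def fold_over_runner_up(activities, top="4COK"):
    """Ratio of the top enzyme's activity to the best of the remaining enzymes."""
    runner_up = max(value for name, value in activities.items() if name != top)
    return activities[top] / runner_up


fold = fold_over_runner_up(oxaloacetate_activity)
print(f"4COK is {fold:.1f}-fold more active than the next best candidate")
```

Because the comparison is made against the single most active remaining enzyme, the ratio is a lower bound on the fold difference relative to every other candidate in the table.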


As shown in Table 4 and FIG. 5, 4COK exhibited a catalytic efficiency (kcat/KM) of approximately 2296.4 M−1 s−1 for oxaloacetate and approximately 5532.1 M−1 s−1 for pyruvate.









TABLE 4

Kinetic constants of 4COK for pyruvate and oxaloacetate

Parameter | Pyruvate | Oxaloacetate
--- | --- | ---
kcat (s−1) | 8.254 ± 1.87 | n.d.
KM (mM) | 1.49 ± 0.43 | n.d.
kcat/KM (M−1 s−1) | 5532.1 ± 39.4 | 2296.4 ± 116
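For illustration, the catalytic efficiency for pyruvate can be approximately reproduced from the individual constants in Table 4 by dividing kcat by KM after converting KM from mM to M. The function name below is hypothetical. The result (about 5540 M−1 s−1) is close to, but not identical with, the tabulated 5532.1 M−1 s−1, presumably because the tabulated constants are rounded or were fitted separately; the present disclosure does not state which.

```python
def catalytic_efficiency(kcat_per_s, km_mM):
    """Return kcat/KM in M^-1 s^-1, converting KM from mM to M."""
    if km_mM <= 0:
        raise ValueError("KM must be positive")
    return kcat_per_s / (km_mM * 1e-3)


# kcat and KM for pyruvate as listed in Table 4.
pyruvate_eff = catalytic_efficiency(8.254, 1.49)
print(f"{pyruvate_eff:.1f} M^-1 s^-1")
```

For oxaloacetate, kcat and KM are listed as not determined. When saturation is not reached, the Michaelis-Menten equation reduces at low substrate concentration to v ≈ (kcat/KM)[E][S], so kcat/KM can in general be estimated from the slope of initial velocity versus substrate concentration; whether that approach was used here is an assumption.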










These findings indicated that pyruvate decarboxylase from Gluconacetobacter diazotrophicus catalyzed the decarboxylation of oxaloacetate to 3-oxopropanoate, acting as an efficient oxaloacetate decarboxylase (OAADC). The direct conversion of oxaloacetate to 3-oxopropanoate using an OAADC enables a novel and advantageous metabolic pathway to produce 3-HP.


Example 2: Identification of Additional Oxaloacetate Decarboxylases, Alcohol Dehydrogenases, and Phosphoenolpyruvate Carboxykinases

Materials and Methods


Genome Mining


A second round of genome mining was conducted as described in Example 1, except using the 4COK sequence as the input. Genes encoding candidate OAADCs were synthesized and expressed in E. coli for further characterization. OAADC activity was assayed as described in Example 1.


Alcohol Dehydrogenase (ADH) Activity


Candidate ADHs were expressed in E. coli, and soluble expression levels were analyzed. The 3-HP dehydrogenase (3-HPDH) activity of each was tested based on the reverse reaction, from 3-HP to 3-oxopropanoate. The assay was performed in a 96-well half-area plate. Each reaction contained a final concentration of 1 mM NADP+/NAD+ in reaction buffer (100 mM HEPES, 100 mM NaCl, 10% glycerol, pH 7.2) and ADHs. A range of substrate concentrations (0.1 mM-5 mM) was used to perform steady-state kinetics measurements over a period of one hour. Absorbance readings were taken every 1 min at 340 nm at 21° C. for 60 min using the Synergy™ H1 Hybrid Multi-Mode Microplate Reader (Biotek). Kinetic parameters (kcat and KM) were determined by fitting initial velocity versus substrate concentration data to the Michaelis-Menten equation.
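A minimal, dependency-free sketch of estimating Michaelis-Menten parameters from initial velocity data is given below. It uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + KM/Vmax) rather than a direct nonlinear fit, which is a substitution made to keep the example short; the present disclosure does not specify the fitting procedure or software used. The function name and the synthetic data are hypothetical.

```python
def fit_michaelis_menten(substrate_mM, velocity):
    """Estimate (Vmax, KM) by linear least squares on the Hanes-Woolf form.

    [S]/v = [S]/Vmax + KM/Vmax, so the slope is 1/Vmax and the intercept is KM/Vmax.
    """
    xs = list(substrate_mM)
    ys = [s / v for s, v in zip(substrate_mM, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data generated from Vmax = 10 and KM = 1.5 mM (illustration only),
# spanning the 0.1 mM-5 mM substrate range used in the assay.
substrates = [0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
velocities = [10.0 * s / (1.5 + s) for s in substrates]
vmax, km = fit_michaelis_menten(substrates, velocities)
print(f"Vmax = {vmax:.3f}, KM = {km:.3f} mM")
```

kcat would then follow as Vmax divided by the molar enzyme concentration in the reaction. With noisy data, a weighted or direct nonlinear regression is generally preferred over linearized forms, which distort the error structure.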


Phosphoenolpyruvate Carboxykinase (PEPCK) Activity


Five genes encoding candidate PEPCKs were synthesized and cloned into expression vectors. Solubly expressed proteins were then used for activity characterization. Each enzyme was assayed in the phosphoenolpyruvate carboxylation direction in a solution containing 100 mM PBS buffer (pH 6.5), 0.20 mM NADH, 1.25 mM ADP, 2.5 mM PEP, 50 mM KHCO3, 2 mM MnCl2, and 4 units malate dehydrogenase.


Results


A second round of genome mining was performed to explore the sequence space around the enzyme 4COK, which was found to be highly active in the first round of mining described in Example 1. These analyses identified many proteins with measurable OAADC activity. In particular, a highly active enzyme cluster was identified, including the most active, newly identified OAADCs A0A0J7KM68, C7JF72_ACEP3, 5EUJ, and A0A0D6NFJ6_9PROT (FIG. 6). The sequences of the enzymes in the clade highlighted in FIG. 6 are provided in Table 5.









TABLE 5 







Candidate sequences in clade with highest OAADC specific activity.








Enzyme name
Amino acid sequence





G6EYP0_9PROT
MEYTVGQYLATRLAQLGLNHVFAVAGDYNLTLLDEMAKAKDLEQVYCCNEL



NCGFAGEGYARARIMGASVVTFSVGAFSAFNAVGGAFAENLPLLLISGAPNNN



DYGSGHILHHTMGYSDYRYQMEMAKKITCEAVSVAHADEAPCLIDHAIRSAIR



NRKPAYIEISCNVANQPCTEPGPISSITNSLISDDESLKAAAKACVEALEKAKNPV



VIIGGKIRSAGCAVSKQVAELTKKLGCAVATMAQAKGLSPEEEAEYVGTFWGD



ISSPGVEDLVRDSDCRIYIGAVFNDYSTVGWTCKLVSDNDILISSHHTRVGKKEF



SGVYLKDFIPVLASSVKKNTTSLEQFKAKKLPAKETPVADGNAALTTVELCRQI



QGAINKDTTLFLETGDSWFHGMHFNLPNGARVESEMQWGHIGWSIPSMFGYAV



SEPNRRNIIMVGDGSFQLTAQEVCQMIRRNMPVIIILINNSGYTIEVKIHDGPYNRI



KNWDYAGLIDVFNAEDGKGLGLKAKNGAELEKAMKTALAHKDGPTLIEVDID



AQDCSPDLVVWGKKVAKANGRAPRKAGGSG (SEQ ID NO: 137)





W7DU13_9PROT
MKYTVGQYLATRLAQLGLNHVFAVAGDYNLTLLDEMAKVEDLEQVYCCNEL



NCGFAGEGYARSRVMGASVVTFSVGAFSAFNAVGGAFAENLPLLLISGAPNNN



DYGSGHILHHTMGYSDYRYQMDMAKQITCEAVSVAHADEAPCLIDHAIRSALR



NRKPAYIEISCNVANQPCTEPGPISSITNSLISDDESLKAAAKACLDALEKAKSPV



VIIGGKIRSAGCAVSKKVAELTKKLGCAVATMAQAKGLSPEEEAEYVGTFWGEI



SSPGVEELNRESDCRIYIGAVFNDYSTVGWTVKLVGENDILISSHHTRVGHKEFS



GVYLKDFIPVLTSCVKKNTTSLDQFKAKKIPVKQVPVADGKAPLTTVELCRQIQ



GAINKDTTIYETGDSWFHGMHFKLPNGARVESEMQWGHIGWSIPSMFGYAVS



EPNRRNIIMVGDGSFQLTAQEVCQMIRRNIPIIIILINNSGYTIEVKIHDGPYNRIKN



WDYAGLINVFNAEDGKGLGLKAKNGAELEKAMQTALAHKDGPTLIEVDIDAQ



DCSPDLVVWGKKVAKANGRAPRKFQTFGGSG (SEQ ID NO: 138)





I4H6Y9_MICAE_1
MSNYNVGTYLAERLVQIGVKHHFVVPGDYNLVLLDQFLKNQNLLQVGCCNEL



NCGFAAEGYARANGLGVAVVTYSVGALSALNAIGGAYAENLPVILVSGAPNTN



DYSTGHLLHHTMGTQDLTYVLEIARKLTCAAVSITSAEDAPEQIDHVIRTALREQ



KPAYIEIACNIAAAPCASPGPVSAIINEVPSDAETLAAAVSAAAEFLDSKQKPVLL



IGSQLRAAKAEQEAIELAEALGCSVAVMAAAKSFFPEEHPQYVGTYWGEISSPG



TSAIVDWSDAVVCLGAVFNDYSTVGWTAMPSGPTVLNANKDSVKFDGYHFSGI



HLRDFLSCLARKVEKRDATMAEFARFRSTSVPVEPARSEAKLSRIEMLRQIGPLV



TAKTTVFAETGDSWFNGMKLQLPTGARFEIEMQWGHIGWSIPAAFGYALGAPE



RQIICMIGDGSFQLTAQEVAQMIRQKLPIIIFLVNNHGYTIEVEIHDGPYNNIKNW



DYAGLIKVFNAEDGAGQGLLATTAGELAQAIEVALENREGPTLIECVIDRDDAT



ADLISWGRAVAVANARPHRGGSG (SEQ ID NO: 139)





A0A094IGF4_9PEZI
MATFTVGDYLAERLAQIGIRHHFVVPGDYNLILLDKLQSHPDLSELGCANELNC



SLAAEGYARAQGVAACIVTYSVGAFSAFNGTGSAYAENLPLILVSGSPNTNDSA



KFHLLHHTLGTNDFTYQFEMAKKITCCAVAVGRAQDAPRLIDQAIRAALLAKK



PAYIEIPTNLSGAMCVRPGPISAVVEPVLSDKASLTAAVDRAVQYLCGKQKPAIL



VGPKLRRAGAEMALLQVAEAIGCAVAVQPAAKGFFPEDHKQFAGVFWGQVST



LAADSILNWADTILCVGTIFTDYSTVGWTALPNVPLMIAEMDHVMFPGATFGR



VRLNDFLSGLAKTVGRNESTMVEYGYIRPDPPLVHAAAPDELLNRKETARQVQ



MLLTPETTVFVDTGDSWFNGIRMKLPRGASFEIEMQWGHIGWSIPAAFGYAMG



KPERKVITMVGDGSFQMTAQEVSQMVRYKVPIIIFLINNKGYTIEVEIHDGLYNR



IKNWDYALLVRAFNSNDGQAIGFRASTGRELAEAIEKAKAHKDGPTLIECVIDQ



DDCSRELITWGHYVAAANARPPVQTGGSG (SEQ ID NO: 140)





A0A0D2CX28_9EURO
MSWTVGSYLAERLAQIGIEHHFVVPGDYNLVLLDKLQAHPKLSEIGCANELNCS



FAAEGYARAKGVAAAVVTFSVGAFSAFNGVGGAYAENLPVILISGAPNTSDSG



AFHLLHHTLGTHDFGYQLEMAKKITCAAVAIRRAQDAPRLIDHAIRSAMSAKKP



AYIEIPTNLSIANCPAPGPISAVIAPERSDEITLAMAVNAALDWLKSKQKPVLLAG



PKLRAAGAEAAFLQLADALGCAVAVLPGAKSFFPEDHKQFVGVYWGQVSTMG



ADAIVDWSDGIFGAGVVFTDYSTVGWTALPPDSITLTADLDHMSFTGAEFNRV



QLAELLSALAERATRNSSTMVEYAHLRPDVLFPHIEEPKLPLHRNEIARQIQQLL



QPKTTLFVETGDSWFNGVQMRLPRSCRFEIEMQWGHIGWSVPASFGYAVGSPE



RQIILMVGDGSFQMTVQEVSQMVRARLPIIIFLMNNRGYTIEVEIHDGLYNRIKN



WNYASLIEAFNAEDGHAKGIKASNPEQLAQAIKLATSNSDGPTLIECVIDQDDCT



RELITWGHYVASANARPPAHKGGSG (SEQ ID NO: 141)





H6C7K9_EXODN
MRCMSVPSMTFSRHTLRSCATSSDRMTGAPRKPFITSIKRQHQQPWHSICPNVTI



IMSWTVGSYLAERLSQIGIEHHFVVPGDYNLVLLDQLQAHPKLSEIGCANELNC



SFAAEGYARAKGVAAAVVTFSVGAFSAFNGLGGAYAENLPVILISGSPNTNDAG



AFHLLHHTLGTHDFEYQRQIAEKITCAAVAVRRAQDAPRLIDHAIRSALLAKKP



SYIEIPTNLSNVTCPAPGPISAVIAPEPSDEPTLAAAVHAATNWLKAKQKPILLAG



PKLRAAGGEAGFLQLAEAIGCAVAVMPGAKSFEFPEDHKQFVGVYWGQASTMG



ADAIVDWADGIFGAGLVFTDYSTVGWTAIPSESITLNADLDNMSFPGATFNRVR



LADLLSALAKEATPNPSTMVEYARLRPDILPPHHEQPKLPLHRVEIARQIQELLH



PKTTLFAETGDSWFNAMQMNLPRDCRFEIEMQWGHTGWSVPASFGYAVGAPE



RQVLLMIGDGSFQMTAQEVSQMVRSKVPIIIFLMNNGGYTIEVEIHDGLYNRIKN



WNYAAMMEVFNAGDGHAKGIKASNPEQLAQAIKLAKSNSEGPTLIECIIDQDD



CTKELITWGHYVATANGRPPAHTGGSG (SEQ ID NO: 142)





PDC2_SCHPO
MTKDAESTMTVGTYLAQRLVEIGIKNHFVVPGDYNLRLLDFLEYYPGLSEIGCC



NELNCAFAAEGYARSNGIACAVVTYSVGALTAFDGIGGAYAENLPVILVSGSPN



TNDLSSGHLLHHTLGTHDFEYQMEIAKKLTCAAVAIKRAEDAPVMIDHAIRQAI



LQHKPVYIEIPTNMANQPCPVPGPISAVISPEISDKESLEKATDIAAELISKKEKPIL



LAGPKLRAAGAESAFVKLAEALNCAAFIMPAAKGFYSEEHKNYAGVYWGEVS



SSETTKAVYESSDLVIGAGVLFNDYSTVGWRAAPNPNILLNSDYTSVSIPGYVFS



RVYMAEFLELLAKKVSKKPATLEAYNKARPQTVVPKAAEPKAALNRVEVMRQ



IQGLVDSNTTLYAETGDSWFNGLQMKLPAGAKFEVEMQWGHIGWSVPSAMGY



AVAAPERRTIVMVGDGSFQLTGQEISQMIRHKLPVLIFLLNNRGYTIEIQIHDGPY



NRIQNWDFAAFCESLNGETGKAKGLHAKTGEELTSAIKVALQNKEGPTLIECAI



DTDDCTQELVDWGKAVRSANARPPTADNGGSG (SEQ ID NO: 143)





1ZPD
MSYTVGTYLAERLVQIGLKHHFAVAGDYNLVLLDNLLLNKNMEQNYCCNELN



CGFSAEGYARAKGAAAAVVTYSVGALSAFDAIGGAYAENLPVILISGAPNNND



HAAGHVLHHALGKTDYHYQLEMAKNITAAAEAIYTPEEAPAKIDHVIKTALRE



KKPVYLEIACNIASMPCAAPGPASALFNDEASDEASLNAAVDETLKFIANRDKV



AVLVGSKLRAAGAEEAAVKFTDALGGAVATMAAAKSFFPEENALYIGTSWGE



VSYPGVEKTMKEADAVIALAPVFNDYSTTGWTDIPDPKKLVLAEPRSVVVNGIR



FPSVHLKDYLTRLAQKVSKKTGSLDFFKSLNAGELKKAAPADPSAPLVNAEIAR



QVEALLTPNTTVIAETGDSWFNAQRMKLPNGARVEYEMQWGHIGWSVPAAFG



YAVGAPERRNIEVILMVGDGSFQLTAQEVAQMVRLKLPVIIFLINNYGYTIEVMIHD



GPYNNIKNWDYAGLMEVFNGNGGYDSGAAKGLKAKTGGELAEAIKVALANT



DGPTLIECFIGREDCTEELVKWGKRVAAANSRKPVNKVV (SEQ ID NO: 144)





4COK
MTYTVGRYLADRLAQIGLKHHFAVAGDYNLVLLDQLLLNTDMQQIYCSNELN



CGFSAEGYARANGAAAAIVTFSVGALSAFNALGGAYAENLPVILISGAPNANDH



GTGHILHHTLGTTDYGYQLEMARHITCAAESIVAAEDAPAKIDHVIRTALREKK



PAYLEIACNVAGAPCNTRPGGIDALLSPPAPDEASLKAAVDAALAFTEQRGSVTM



LVGSRIRAAGAQAQAVALADALGCAVTTMAAAKSFFPEDHPGYRGHYWGEVS



SPGAQQAVEGADGVICLAPVFNDYATVGNVSAWPKGDNVMLVERHAVTVGGV



AYAGIDMRDFLTRLAAHTVRRDATARGGAYVTPQTPAAAPTAPLNNAEMARQI



GALLTPRTTLTAETGDSWFNAVRMKLPHGARVELEMQWGHIGWSVPAAFGNA



LAAPERQHVLMVGDGSFQLTAQEVAQMIRHDLPVIIFLINNHGYTIEVMIHDGP



YNNVKNWDYAGLMEVFNAGEGNGLGLRARTGGELAAAIEQARANRNGPTLIE



CTLDRDDCTQELVTWGKRVAAANARPPRAG (SEQ ID NO: 1)





A0A0J7KM68_LASNI
MSYTVGQYLADRLVQIGLKDHFAIAGDYNLVLLDQFLKNKNWNQIYDCNELN



CGFAAEGYARANGAAACVVTYTVGAISAMNSALAGAYAENLPVLCISGAPNC



NDYGSGRILHHTIGKPEFTQQLDMVKHVTCAAESVVQASEAPAKIDHVIRTMLL



EQRPAYIDIACNISGLECPRPGPIEDLLPQYAADNKSLTSAIDAIAKKIEASQKVTL



YVGPKVRPGKAKEASVKLADALGCAVTVGPASMSFFPAKHPGFRGTYWGIVST



GDANKWEEAETLIVLGPNWNDYATVGWKAWPKGPRVVTIDEKAAQVDGQV



FSGLSMKALVEGLAKKVSKKPATAEGTKAPHFEYPVAKPDAKLTNAEMARQIN



AILDDNTTLHAETGDSWFNVKNMNWPNGLRIESEMQYGHIGWSIPSGFGGAIGS



PERKHIIMCGDGSFQLTCQEVSQMIRYKLPVTIFLIDNHGYGIEIAIHDGPYNYIQ



NWNFTKLMEVFNGEGEECPYSHNKNGKSGLGLKATTPAELADAIKQAEANKE



GPTLIQVVIDQDDCTKDLLTWGKEVAKTNARSPVVTDKAGGSG (SEQ ID NO: 145)





5EUJ
MYTVGMYLAERLAQIGLKHHFAVAGDYNLVLLDQLLLNKDMEQVYCCNELN



CGFSAEGYARARGAAAAIVTFSVGAISAMNAIGGAYAENLPVILISGSPNTNDY



GTGHILHHTIGTTDVNYQLEMVKHVTCAAESIVSAEEAPAKIDHVIRTALRERKP



AYLEIACNVAGAECVRPGPINSLLRELEVDQTSVTAAVDAAVEWLQDRQNVV



MLVGSKLRAAAAEKQAVALADRLGCAVTIMAAAKGFFPEDHPNFRGLYWGEV



SSEGAQELVENADAAILCLAPVFNDYATVGWNSWPKGDNVMVMDTDRVTFAG



QSFEGLSLSTFAAALAEKAPSRPATTQGTQAPVLGIEAAEPNAPLTNDEMTRQIQ



SLITSDTTLTAETGDSWFNASRMPIPGGARVELEMQWGHIGWSVPSAFGNAVGS



PERRHIMMVGDGSFQLTAQEVAQMIRYIEIPVIIFLINNRGYVIEIAIHDGPYNYIK



NWNYAGLIDVFNDEDGHGLGLKASTGAELEGAIKKALDNRRGPTLIECNIAQD



DCTETLIAWGKRVAATNSRKPQAGGSG (SEQ ID NO: 146)





2584327140 EU61DRAFT
MAYTVGMYLAERLAQIGLKHHFAVAGDYNLVLLDQLLLNKDMEQIYCCNELN



CGFSAEGYARAHGAAAAVVTFSVGAISAMNAIGGAYAENLPVILISGSPNSNDY



GSGHILHHTLGTTDYGYQLEMARHVTCAAESITDAASAPAKIDHVIRTALRERK



PAYLEIACNVSSAECPRPGPVSSLLAEPATDPVSLKAALEASLSALNKAERVVML



VGSKIRAADAQAQAVELADRLGCAVTIMSAAKGFFPEDHPGFRGLYWGEVSSP



GAQELVENADAVLCLAPVFNDYSTVGWNAWPKGDKVLLAEPNRVTVGGQSFE



GFALRDFLKGLTDRAPSKPATAQGTHAPKLEIKPAARDARLTNDEMARQINAM



LTPNTTLAAETGDSWFNAMRMNLPGGARVEVEMQWGHIGWSVPSTFGNAMG



SKDRQHIMMVGDGSFQLTAQEVAQMVRYELPVIIFLVNNKGYVIEIAIHDGPYN



YIKNWDYAGLMEVFNAGEGHGIGLHAKTAGELEDAIKKAQANKRGPTIIECSLE



RTDCTETLIKWGKRVAAANSRKPQAVGGSG (SEQ ID NO: 147)





C7JF72_ACEP3
MTYTVGMYLAERLSQIGLKHHFAVAGDFNLVLLDQLLVNKEMEQVYCCNELN



CGFSAEGYARAHGAAAAVVTFSVGAISAMNAIAGAYAENLPVILISGSPNSNDY



GTGHILHHTLGTNDYTYQLEMMRHVTCAAESITDAASARAKTDHVIRTALRERK



PAYVEIACNVSDAECVRPGPVSSLLAELRADDVSLKAAVEASLALLEKSQRVTM



IVGSKVRAAHAQTQTEHLADKLGCAVTIMAAAKSFFPEDHKGFRGLYWGDVSS



PGAQELVEKSDALICVAPVFNDYSTVGWTAWPKGDNVLLAEPNRVTVGGKTY



EGFTLREFLEELAKKAPSRPLTAQESKKHTPVIEASKGDARLTNDEMTRQINAM



LTSDTTLVAETGDSWFNATRMDLPRGARVELEMQWGHIGWSVPSAFGNAMGS



QERQHILMVGDGSFQLTAQEMAQMVRYKLPVIIFLVNNRGYVIEIAIHDGPYNY



IKNWDYAGLMEVFNAEDGHGLGLKATTAGELEEAIKKAKTNREGPTIIECQIER



SDCTKTLVEWGKKVAAANSRKPQVSGGSG (SEQ ID NO: 148)





A0A0D6NFJ6_9PROT
MTYTVGMYLADRLAQIGLKHHFAVAGDYNLVLLDQLLTNKDMQQIYCCNELN



CGFSAEGYARAHGAAAAVVTSVGAISAMNAIGGAYAENLPVILISGSPNSNDY



GSGHILHHITIGSTDYGYQMEMVKHVTCAAESITDAASAPAKIDHVIRTALRESK



PAYLLEIACNVSAQECPRPGPVSSLLSEPAPDKTSLDAAVAAAVKLIEGAENTVIL



VGSKLRAARAQAEAEKLADKLECAVTIMAAAKGFFPEDHAGFRGLYWGVSS



PGTQELVEKADAIICLAPVFNDYSTVGWTAWPKGDKVLLAEPNRVTIKGQTFEG



FALRDFLTALAAKAPARPASAKASSHTPTAFPKADAKAPLTNDEMARQINAML



TSDTTLVAETGDSWFNAMRMTLPRGARVELEMQWGHIGWSVPSSFGNAMGSQ



DRQHVVMVGDGSFQLTAQEVAQMVRYELPVIIFLVNNRGYVIEIAIHDGPYNYI



KNWDYAGLMEVFNAGEGHGLGLHATTAEELEDAIKKAQANRRGPTIIECKIDR



QDCTDTLVQWGKKVASANSRKPQAVGGSG (SEQ ID NO: 166)









The kinetics of these enzymes were characterized and compared with that of 4COK. As shown in Table 6, four of these enzymes displayed high levels of OAADC activity, similar to or greater than that of 4COK.









TABLE 6

Kinetics of highly active OAADCs.

Parameter | A0A0J7KM68 | C7JF72_ACEP3 | 5EUJ | A0A0D6NFJ6_9PROT | 4COK
--- | --- | --- | --- | --- | ---
kcat (s−1) | 6.248 | 55.45 | 28.79 | >121 | >55
Km (mM) | 2.389 | 15.53 | 6.667 | >20 | >20
kcat/Km (M−1 s−1) | 2615.3 ± 224.2 | 3570.5 ± 252.5 | 4318.3 ± 320.7 | 6045.2 ± 452.5 | 2296.4 ± 116.0









To engineer a novel pathway to produce 3-HP, 3-hydroxypropionate dehydrogenase (3-HPDH) and phosphoenolpyruvate carboxykinase (PEPCK) candidates suitable for the pathway were also investigated. As shown in FIG. 2B, the final step in the conversion of sugars into 3-HP is the formation of 3-HP from 3-oxopropanoate, which can be catalyzed by a 3-HPDH. Twelve candidate ADHs were expressed in E. coli and tested for solubility and 3-HPDH activity. The sequences of the enzymes tested are provided in Table 7.









TABLE 7 







Candidate 3-HPDH sequences.








Enzyme name
Amino acid sequence





ADH6_YEAST
MSYPEKFEGIAIQSHEDWKNPKKTKYDPKPFYDHDIDIKIEACGVCGSDIHCAAG



HWGNMKMPLVVGHEIVGKVVKLGPKSNSGLKVGQRVGVGAQVFSCLECDRCK



NDNEPYCTKFVTTYSQPYEDGYVSQGGYANYVRVHEHFVVPIPENIPSHLAAPLL



CGGLTVYSPLVRNGCGPGKKVGIVGLGGIGSMGTLISKAMGAETYVISRSSRKRE



DAMKMGADHYIATLEEGDWGEKYFDTFDLIVVCASSLTDIDFNIMPKAMKVGG



RIVSISIPEQHEMLSLKPYGLKAVSISYSALGSIKELNQLLKLVSEKDIKIWVETLPV



GEAGVHEAFERMEKGDVRYRFTLVGYDKEFSD (SEQ ID NO: 149)





YQHD_ECOLI
MNNFNLHTPTRILFGKGAIAGLREQIPHDARVLITYGGGSVKKTGVLDQVLDALK



GMDVLEFGGIEPNPAYETLMNAVKLVREQKVTFLLAVGGGSVLDGTKFIAAAA



NYPENIDPWHILQTGGKEIKSAIPMGCVLTLPATGSESNAGAVISRKTTGDKQAF



HSAHVQPVFAVLDPVYTYTLPPRQVANGVVDAFVHTVEQYVTKPVDAKIQDRF



AEGILLTLIEDGPKALKEPENYDVRANVMWAATQALNGLIGAGVPQDWATHML



GHELTAMHGLDHAQTLAIVLPALWNEKRDTKRAKLLQYAERVWNITEGSDDER



IDAAIAATRNFFEQLGVPTHLSDYGLDGSSIPALLKKLEEHGVMTQLGENHDITLD



VSRRIYEAAR (SEQ ID NO: 150)





ADH2_YEAST_Alcohol_dehydrogenase_2
MSIPETQKAIIFYESNGKLEHKDIPVPKPKPNELLINVKYSGVCHTDLHAVTHGDW



PLPTKLPLVGGHEGAGVVVGMGENVKGWKIGDYAGIKWLNGSCMACEYCELG



NESNCPHADLSGYTHDGSFQEYATADAVQAAHIPQGTDLAEVAPILCAGITVYK



ALKSANLRAGHWAAISGAAGGLGSLAVQYAKAMGYRVLGIDGGPGKEELFTSL



GGEVFIDFTKEKDIVSAVVKATNGGAHGIINVSVSEAAIEASTRYCRANGTVVLV



GLPAGAKCSSDVFNHVVKSISIVGSYVGNRADTREALDFFARGLVKSPIKVVGLS



SLPEIYEKMEKGQIAGRYVVDTSK (SEQ ID NO: 151)





YdfG
MIVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQLDV



RNRAAIEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNK



GLVYMTRAVLPGMVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNL



RTDLHGTAVRVTDIEPGLVGGTEFSNVRFKGDDGKAEKTYQNTVALTPEDVSEA



VWWVSTLPAHVNINTLEMMPVTQSYAGLNVHRQ (SEQ ID NO: 152)





A9A4M8
MHTVRIPKVINFGEDALGQTEYPKNALVVTTVPPELSDKWLAKMGIQDYMLND



KVKPEPSIDDVNTLISEFKEKKPSVLIGLGGGSSMDVVKYAAQDFGVEKILIPTTF



GTGAEMTTYCVLKFDGKKKLLREDRFLADMAVVDSYFMDGTPEQVIKNSVCDA



CAQATEGYDSKLGNDLTRTLCKQAFEILYDAIMNDKPENYPYGSMLSGMGFGN



CSTTLGHALSYVFSNEGVPHGYSLSSCTTVAHKNKSIFYDRFKEAMDKLGFDK



LELKADVSEAADVVMTDKGHLDPNPIPISKDDVVKCLEDIKAGNL (SEQ ID NO: 153)





A4YI81
MTEKVSVVGAGVIVGWATLFASKGYSVSLYTEKKETLDKGIEKLRNYVQVMK



NNSQITEDVNTVISRVSPTTNLDEAVRGANFVIEAVIEDYDAKKKIFGYLDSVLDK



EVILASSTSGLLITEVQKAMSKHPERAVIAHPWNPPHLLPLVEIVPGEKTSMEVVE



RTKSLMERLDRIVVVLKKEIPGFIGNRLAFALFREAVYLVDEGVATVEDIDKVMT



AAIGLRWAFMGFFLTYRLGGGEGGLEYFFNRGFGYGANEWMHTLAKYDKFPYT



GVTKAIQQMKEYSFIKGKTFQEISKWRDEKLLKVYKLVWEK (SEQ ID NO: 154)





3OBB
MKQIAFIGLGHMGAPMATNLLKAGYLLNVFDLVQSAVDGLVAAGASAARSARD



AVQGADVVISMLPASQHVEGLYLDDDGLLAHIAPGTLVLECSTIAPTSARKIHAA



ARERGLAMLDAPVSGTAGAAAGTLTFMVGGDAEALEKARPLFEAMGRNIFHA



GPDGAGQVAKVCNNQLLAVLMIGTAEAMALGVANGLEAKVLAEDARRSSGGN



WALEVYNPWPGVMENAPASRDYSGGFMAQLMAKDLGLAQEAAQASASSTPM



GSLALSLYRLLLKQGYAERDFSVVQKLFDPTQGQ (SEQ ID NO: 155)





5JE8
MKKIGFIGLGNMGLPMSKNLVKSGYTVYGVDLNKEAEASFEKEGGIGLSISKLA



ETCDVVFTSLPSPRAVEAVYFGAEGLFENGHSNVVFIDTSTVSPQLNKQLEEAAK



EKKVDFLAAPVSGGVIGAENRTLTFMVGGSKDVYEKTESIMGVLGANIFHVSEQI



DSGTTVKLINNLLIGFYTAGVSEALTLAKKNNMDLDKMFDILNVSYGQSRIYERN



YKSFIAPENYEPGFTVNLLKKDLGFAVDLAKESELHLPVSEMLLNVYDEASQAG



YGENDMAALYKKVSEQLISNQK (SEQ ID NO: 156)





Q819E3
MEHKTLSIGFIGIGVMGKSMVYHLMQDGHKVYVYNRTKAKTDSLVQDGANWC



NTPKELVKQVDIVMTMVGYPHDVEEVYFGIEGIIEHAKEGTIAIDFTTSTPTLAKR



INEVAKRKNIYTLDAPVSGGDVGAKEAKLAIMVGGEKEIYDRCLPLLEKLGTNIQ



LQGPAGSGQHTKMCNQIAIASNMIGVCEAVAYAKKAGLNPDKVLESISTGAAGS



WSLSNLAPRMLKGDFEPGFYVKHFMKDMKIALEEAERLQLPVPGLSLAKELYEE



LIKDGEENSGTQVLYKKYIRG (SEQ ID NO: 157)





Q5FQ06
MSSPKIGFIGYGAMAQRMGANLRKAGYPVVAYAPSGGKDETEMLPSPRAIAEAA



EIIIFCVPNDAAENESLHGENGALAALTPGKLVLDTSTVSPDQADAFASLAVEHGF



SLLDAPMSGSTPEAETGDLVMLVGGDEAVVKRAQPVLDVIGKLTIHAGPAGSAA



RLKLVVNGVMGATLNVIAEGVSYGLAAGLDRDVVFDTLQQVAVVSPHHKRKL



KMGQNREFPSQFPTRLMSKDMGLLLDAGRKVGAFMPGMAVADQALALSNRLH



ANEDYSALIGAMEHSVANLPRK (SEQ ID NO:158)





2CVZ
MEKVAFIGLGAMGYPMAGHLARRFPTLVWNRTFEKALRHQEEFGSEAVPLERV



AEARVIFTCLPTTREVYEVAEALYPYLREGTYWVDATSGEPEASRRLAERLREKG



VTYLDAPVSGGTSGAEAGTLTVMLGGPEEAVERVRPFLAYAKKVVHVGPVGAG



HAVKAINNALLAVNLWAAGEGLLALVKQGVSAEKALEVINASSGRSNATENLIP



QRVLTRAFPKTFALGLLVKDLGIAMGVLDGEKAPSPLLRLAREVYEMAKRELGP



DADHVEALRLLERWGGVEIR (SEQ ID NO: 159)





Q05016
MSQGRKAAERLAKKTVLITGASAGIGKATALEYLEASNGDMKLILAARRLEKLE



ELKKTIDQEFPNAKVHVAQLDITQAEKIKPFIENLPQEFKDIDILVNNAGKALGSD



RVGQIATEDIQDVFDTNVTALINITQAVLPIFQAKNSCDIVNLGSIAGRDAYPTGSI



YCASKFAVGAFTDSLRKELINTKIRVILIAPGLVETEFSLVRYRGNEEQAKNVYKD



TTPLMADDVADLIVYATSRKQNTVIADTLIFPTNQASPHHIFRG (SEQ ID NO: 160)









Table 8 shows that 9 out of the 12 candidate 3-HPDHs were expressed in soluble form in E. coli.









TABLE 8

Expression of candidate 3-HPDHs

ADH        Soluble expression
YdfG       No
YMR226C    Yes
2CVZ       Yes
Q5FQ06     Yes
Q819E3     Yes
5JE8       Yes
3OBB       Yes
A4YI81     Yes
A9A4M8     Yes
ADH2_Y     No
ADH6_Y     Yes
YqhD       No









The nine 3-HPDHs from Table 6 that were expressed in soluble form were next characterized for their activity towards 3-HP. As shown in FIG. 7, of these enzymes, 2CVZ and A4YI81 both preferred NAD+ as the cofactor and had the highest activity against 3-HP. Activity data for these enzymes using NAD+ or NADP+ as a cofactor are shown in FIGS. 8A & 8B. The enzymatic activities of these enzymes using NAD+ are also shown in FIG. 9, demonstrating a Km for NAD+ of 0.42 mM for 2CVZ and 0.65 mM for A4YI81.
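To illustrate what the reported Km values for NAD+ imply, the following minimal sketch evaluates the Michaelis-Menten saturation term v/Vmax = [S]/(Km + [S]) for 2CVZ and A4YI81. It assumes simple Michaelis-Menten behavior with respect to NAD+, which is an assumption made for illustration. The NAD+ concentrations and the function name `fractional_rate` are hypothetical and do not come from the experiments described above.

```python
# Km values for NAD+ reported above, in mM.
KM_NAD_MM = {"2CVZ": 0.42, "A4YI81": 0.65}


def fractional_rate(substrate_mM: float, km_mM: float) -> float:
    """Return v/Vmax = [S] / (Km + [S]) for a Michaelis-Menten enzyme."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


# Hypothetical NAD+ concentrations (mM), chosen only for illustration.
for nad_mM in (0.1, 0.5, 2.0):
    for enzyme, km_mM in KM_NAD_MM.items():
        rate = fractional_rate(nad_mM, km_mM)
        print(f"{enzyme}: [NAD+] = {nad_mM} mM -> v/Vmax = {rate:.2f}")
```

By definition, each enzyme reaches half of its maximal rate when the NAD+ concentration equals its Km, so under this model the lower Km of 2CVZ corresponds to a higher fractional rate at any given sub-saturating NAD+ concentration.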


The synthetic pathway shown in FIG. 2B also uses a PEPCK to provide the oxaloacetate substrate for the OAADC. To identify active PEPCKs for the conversion of phosphoenolpyruvate to oxaloacetate, five PEPCK candidates were synthesized and cloned into an expression vector. The sequences of the enzymes tested are provided in Table 9.









TABLE 9 







Candidate PEPCK sequences








Enzyme name
Amino acid sequence





Q7XAU8
MASPNGLAKIDTQGKTEVYDGDTAAPVRAQTIDELHLLQRKRSA



PTTPIKDGATSAFAAAISEEDRSQQQLQSISASLTSLARETGPKLVK



GDPSDPAPHKHYQPAAPTIVATDSSLKFTHVLYNLSPAELYEQAF



GQKKSSFITSTGALATLSGAKTGRSPRDKRVVKDEATAQELWWG



KGSPNIEMDERQFVINRERALDYLNSLDKVYVNDQFLNWDPENRI



KVRIITSRAYHALFMHNMCIRPTDEELESFGTPDFITYNAGEFPAN



RYANMTSSTSINISLARREMVTLGTQYAGEMKKGLFGVMHYLM



PKRGILSLHSGCNMGKDGDVALFFGLSGTGKTTLSTDHNRLLIGD



DEHCWSDNGVSNIEGGCYAKCIDLSQEKEPDIWNAIKFGTVLENV



VFNERTREVDYSDKSITENTRAAYPIEFIPNAKIPCVGPHPKNVILL



ACDAFGVLPPVSKLNLAQTMYHFISGYTALVAGTVDGITEPTATF



SACFGAAFIMYHPTKYAAMLAEKMQKYGATGWLVNTGWSGGR



YGVGKRIRLPHTRKIIDAIHSGELLTANYKKTEVFGLEIPTEINGVP



SEILDPINTWTDKAAYKENLLNLAGLFKKNFEVFASYKIGDDSSLT



DEILAAGPNF (SEQ ID NO: 161)





PCKA_Ecoli
MRVNNGLTPQELEAYGISDVHDIVYNPSYDLLYQEELDPSLTGYE



RGVLTNLGAVAVDTGIFTGRSPKDKYIVRDDTTRDTFWWADKGK



GKNDNKPLSPETWQHLKGLVTRQLSGKRLFVVDAFCGANPDTRL



SVRFITEVAWQAHFVKNMFIRPSDEELAGFKPDFIVMNGAKCTNP



QWKEQGLNSENFVAFNLTERMQLIGGTWYGGEMKKGMFSMMN



YLLPLKGIASMHCSANVGEKGDVAVFFGLSGTGKTTLSTDPKRRL



IGDDEHGWDDDGVFNFEGGCYAKTIKLSKEAEPEIYNAIRRDALL



ENVTVREDGTIDFDDGSKTENTRVSYPIYHIDNTVKPVSKAGHATK



VIFLTADAFGVLPPVSRLTADQTQYHFLSGFTAKLAGTERGITEPT



PTFSACFGAAFLSLHPTQYAEVLVKRMQAAGAQAYLVNTGWNG



TGKRISIKDTRAIIDAILNGSLDNAETFTLPMFNLAIPTELPGVDTKI



LDPRNTYASPEQWQEKAETLAKLFIDNFDKYTDTPAGAALVAAG



PKL (SEQ ID NO: 162)





PCK from Actinobacillus succinogenes
MTDLNKLVKELNDLGLTDVKEIVYNPSYEQLFEEETKPGLEGFDK



GTLTTLGAVAVDTGIFGRSPKDKYIVCDETTKDTVWWNSEAAK



NDNKPMTQETWKSLRELVAKQLSGKREFVVEGYCGASEKHRIGV



RMVTEVAWQAHFVKNMFIRPTDEELKNFKADFTVLNGAKCTNP



NWKEQGLNSENFVAFNITEGIQLIGGTWYGGEMKKGMFSMMNY



FLPLKGVASMHCSANVGKDGDVAIFFGLSGTGKTELSTDPKRQLI



GDDEHGWDESGVFNFEGGCYAKTINLSQENEPDIYGAIRRDALLE



NVVVRADGSVDFDDGSKTENTRVSYPIYHIDNIVRPVSKAGHATK



VIFLTADAFGVLPPVSKLTPEQTEYYFLSGFTAKLAGTERGVTEPT



PTFSACFGAAFLSLHPIQYADVLVERMKASGAEAYLVNTGWNGT



GKRISIKDTRGIIDAILDGSIEKAEMGELPIFNLAIPKALPGVDPAIL



DPRDTYADKAQWQVKAEDLANRFVKNFVKYTANPEAAKLVGA



GPKA (SEQ ID NO: 163)





1J3B
MQRLEALGIHPKKRVFWNTVSPVLVHTLLRGEGLLAHHGPLVV



DTTPYTGRSPKDKFVVREPEVEGEIWWGEVNQPFAPEAFEALYQR



VVQYLSERDLYVQDLYAGADRRYRLAVRVVTESPWHALFARNM



FILPRRFGNDDEVEAFVPGFTVVHAPYFQAVPERDGTRSEVFVGIS



FQRRLVLIVGTKYAGEIKKSIFTVMNYLMPKRGVFPMHASANVG



KEGDVAVFFGLSGTGKTTLSTDPERPLIGDDEHGWSEDGVFNFEG



GCYAKVIRLSPEHEPLIYKASNQFEAILENVVVNPESRRVQWDDD



SKTENTRSSYPIAHLENVVESGVAGHPRAIFFLSADAYGVLPPIAR



LSPEEAMYYFLSGYTARVAGTERGVTEPRATFSACFGAPFLPMHP



GVYARMLGEKIRKHAPRVYLVNTGWTGGPYGVGYRFPLPVTRA



LLKAALSGALENVPYRRDPVFGFEVPLEAPGVPQELLNPRETWAD



KEAYDQQARKLARLFQENFQKYASGVAKEVAEAGPRTE (SEQ ID



NO: 164)





1YTM
MSLSESLAKYGITGATNIVHNPSHEELFAAETQASEEGFEKGTVTE



MGAVNVMTGVYTFGRSPKDKFIVKNEASKEIWWTSDEFKNDNKP



VTEEAWAQLKALAGKELSNKPLYVVDLFCGANENTRLKIRFVME



VAWQAHFVTNMFIRPTEEELKGEEPDFVVLNASKAKVENFKELG



LNSETAVVFNLAEKMQIILNTWYGGEMKKGMFSMMNFYLPLQGI



AAMHCSANTDLEGKNTAIFFGLSGTGKTTLSTDPKRLLIGDDEHG



WDDDGVFNFEGGCYAKVINLSKENPDIWGAIKRNALLENVTVD



ANGKVDFADKSVTENTRVSYPIFHIKNIVKPVSKAPAAKRVIFLSA



DAFGVLPPVSILSKEQTKYYFLSGFTAKLAGTERGITEPTPTFSSCF



GAAFLTLPPTKYAEVLVKRMEASGAKAYLVNTGWNGTGKRISIK



DTRGIIDAILDGSIDTANTATIPYFNFTVPTELKGVDTKILDPRNTY



ADASEWEVKAKDLAERFQKNFKKFESLGGDLVKAGPQL (SEQ ID



NO: 165)









Two highly active PEPCKs were identified, one from E. coli and one from A. succinogenes. The activities of these enzymes using phosphoenolpyruvate (PEP) as a substrate are shown in FIG. 10 and Table 10.









TABLE 10

Kinetics of PEPCK enzymes against PEP.

                     Actinobacillus succinogenes PCK    E. coli PCK
kcat (s−1)           2.875                              3.423
Km (mM)              0.1692                             0.1905
kcat/Km (M−1 s−1)    16991.72577                        17968.50394
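The catalytic efficiencies in Table 10 follow directly from the kcat and Km values once Km is converted from mM to M. The short sketch below reproduces that arithmetic. The function name `catalytic_efficiency` and the dictionary layout are illustrative choices, not part of the original disclosure.

```python
def catalytic_efficiency(kcat_per_s: float, km_mM: float) -> float:
    """Return kcat/Km in M^-1 s^-1, converting Km from mM to M."""
    if km_mM <= 0:
        raise ValueError("Km must be positive")
    return kcat_per_s / (km_mM * 1e-3)


# (kcat in s^-1, Km in mM) as listed in Table 10.
table_10 = {
    "Actinobacillus succinogenes PCK": (2.875, 0.1692),
    "E. coli PCK": (3.423, 0.1905),
}

efficiencies = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in table_10.items()
}

for name, value in efficiencies.items():
    print(f"{name}: kcat/Km = {value:.1f} M^-1 s^-1")
```

Both enzymes come out at roughly 1.7 to 1.8 × 10^4 M−1 s−1, matching the tabulated values to rounding.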









In summary, these data demonstrate the identification of multiple PEPCK, OAADC, and 3-HPDH enzymes suitable for catalyzing each step of a novel and advantageous metabolic pathway to produce 3-HP.

Claims
  • 1. A method for producing 3-hydroxypropionate (3-HP), the method comprising: (a) providing a recombinant host cell, wherein the recombinant host cell comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), wherein the OAADC comprises the amino acid sequence of SEQ ID NO:1; and (b) culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP, wherein expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH.
  • 2. (canceled)
  • 3. The method of claim 1, wherein the recombinant host cell is a recombinant prokaryotic cell.
  • 4. The method of claim 3, wherein the prokaryotic cell is an Escherichia coli cell.
  • 5. The method of claim 1, wherein the host cell is selected from the group consisting of Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Sclerotina libertina, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Vibrio alginolyticus, Xanthomonas, Zymomonas, and Zymomonas mobilis.
  • 6. (canceled)
  • 7. A method for producing 3-hydroxypropionate (3-HP), the method comprising: (a) providing a recombinant host cell, wherein the recombinant host cell comprises a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC) and a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH), wherein the OAADC comprises the amino acid sequence of SEQ ID NO:1, and wherein the recombinant host cell is a recombinant fungal cell; and (b) culturing the recombinant host cell in a culture medium comprising a substrate under conditions suitable for the recombinant host cell to convert the substrate to 3-HP, wherein expression of the OAADC and the 3-HPDH results in increased production of 3-HP, as compared to production by a host cell lacking expression of the OAADC and the 3-HPDH.
  • 8-12. (canceled)
  • 13. The method of claim 7, wherein the recombinant host cell is capable of producing 3-HP at a pH lower than 6.
  • 14. The method of claim 13, wherein the recombinant host cell is capable of producing 3-HP below the pKa of 3-HP.
  • 15. The method of claim 7, wherein the fungal cell is a yeast cell.
  • 16. The method of claim 7, wherein the fungal cell is of a genus or species selected from the group consisting of Aspergillus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus terreus, Aspergillus pseudoterreus, Aspergillus usamii, Candida rugosa, Issatchenkia orientalis, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kluyveromyces marxianus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Rhodosporidium toruloides, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Yarrowia lipolytica, and Zygosaccharomyces rouxii.
  • 17-20. (canceled)
  • 21. The method of claim 1, wherein the recombinant polynucleotide is stably integrated into a chromosome of the recombinant host cell.
  • 22. The method of claim 1, wherein the recombinant polynucleotide is maintained in the recombinant host cell on an extra-chromosomal plasmid.
  • 23. The method of claim 1, wherein the polynucleotide encoding the 3-HPDH is an endogenous polynucleotide.
  • 24. The method of claim 1, wherein the polynucleotide encoding the 3-HPDH is a recombinant polynucleotide.
  • 25. The method of claim 1, wherein the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130, 154, and 159.
  • 26. (canceled)
  • 27. The method of claim 1, wherein the recombinant host cell is cultured under anaerobic conditions suitable for the recombinant host cell to convert the substrate to 3-HP.
  • 28. The method of claim 1, wherein the substrate comprises glucose, sucrose, fructose, xylose, arabinose, cellobiose, cellulose, alginate, mannitol, laminarin, galactose, or galactan.
  • 29-31. (canceled)
  • 32. The method of claim 1, wherein the recombinant host cell further comprises a recombinant polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK).
  • 33. The method of claim 32, wherein the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163.
  • 34. The method of claim 1, wherein the recombinant host cell further comprises a modification resulting in decreased production of pyruvate from phosphoenolpyruvate, as compared to a host cell lacking the modification.
  • 35. The method of claim 34, wherein the modification results in decreased pyruvate kinase (PK) activity or expression, as compared to a host cell lacking the modification.
  • 36-39. (canceled)
  • 40. The method of claim 34, wherein the recombinant host cell further comprises a second modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK), as compared to a host cell lacking the second modification.
  • 41. The method of claim 1, further comprising: (c) substantially purifying the 3-HP.
  • 42. The method of claim 1, further comprising: (d) converting the 3-HP to acrylic acid.
  • 43. A recombinant host cell comprising a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC), wherein the OAADC comprises the amino acid sequence of SEQ ID NO:1.
  • 44. (canceled)
  • 45. The host cell of claim 43, wherein the recombinant host cell is a recombinant prokaryotic cell.
  • 46. The host cell of claim 45, wherein the prokaryotic cell is an Escherichia coli cell.
  • 47. The host cell of claim 43, wherein the host cell is selected from the group consisting of Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Sclerotina libertina, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Vibrio alginolyticus, Xanthomonas, Zymomonas, and Zymomonas mobilis.
  • 48. (canceled)
  • 49. A recombinant fungal host cell comprising a recombinant polynucleotide encoding an oxaloacetate decarboxylase (OAADC), wherein the OAADC comprises the amino acid sequence of SEQ ID NO:1.
  • 50-54. (canceled)
  • 55. The host cell of claim 43, wherein the host cell further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH).
  • 56. The host cell of claim 55, wherein the polynucleotide encoding the 3-HPDH is an endogenous polynucleotide.
  • 57. The host cell of claim 55, wherein the polynucleotide encoding the 3-HPDH is a recombinant polynucleotide.
  • 58. The host cell of claim 55, wherein the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130, 154, and 159.
  • 59. (canceled)
  • 60. The host cell of claim 49, wherein the recombinant fungal host cell is capable of producing 3-HP at a pH lower than 6.
  • 61. The host cell of claim 60, wherein the recombinant host cell is capable of producing 3-HP below the pKa of 3-HP.
  • 62. The host cell of claim 49, wherein the fungal cell is a yeast cell.
  • 63. The host cell of claim 49, wherein the fungal cell is of a genus or species selected from the group consisting of Aspergillus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus terreus, Aspergillus pseudoterreus, Aspergillus usamii, Candida rugosa, Issatchenkia orientalis, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kluyveromyces marxianus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Rhodosporidium toruloides, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Yarrowia lipolytica, and Zygosaccharomyces rouxii.
  • 64-67. (canceled)
  • 68. The host cell of claim 43, wherein the recombinant polynucleotide is stably integrated into a chromosome of the recombinant host cell.
  • 69. The host cell of claim 43, wherein the recombinant polynucleotide is maintained in the recombinant host cell on an extra-chromosomal plasmid.
  • 70. The host cell of claim 43, wherein the recombinant host cell is capable of producing 3-HP under anaerobic conditions.
  • 71. The host cell of claim 43, wherein the recombinant host cell further comprises a recombinant polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK).
  • 72. The host cell of claim 71, wherein the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163.
  • 73. The host cell of claim 43, wherein the recombinant host cell further comprises a modification resulting in decreased production of pyruvate from phosphoenolpyruvate, as compared to a host cell lacking the modification.
  • 74. The host cell of claim 73, wherein the modification results in decreased pyruvate kinase (PK) activity or expression, as compared to a host cell lacking the modification.
  • 75. (canceled)
  • 76. The host cell of claim 74, wherein the modification comprises an exogenous promoter in operable linkage with an endogenous pyruvate kinase (PK) coding sequence, wherein the exogenous promoter results in decreased endogenous PK coding sequence expression, as compared to expression of the endogenous PK coding sequence in operable linkage with an endogenous PK promoter.
  • 77. The host cell of claim 76, wherein the exogenous promoter is a MET3, CTR1, or CTR3 promoter.
  • 78. The host cell of claim 77, wherein the exogenous promoter comprises a polynucleotide sequence selected from the group consisting of SEQ ID NOs:131-133.
  • 79. The host cell of claim 71, wherein the recombinant host cell further comprises a second modification resulting in increased expression or activity of phosphoenolpyruvate carboxykinase (PEPCK), as compared to a host cell lacking the second modification.
  • 80. A vector comprising: (a) a polynucleotide that encodes an amino acid sequence at least 80% identical to a sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166; and (b) a promoter operably linked to the polynucleotide, wherein the promoter is exogenous with respect to the polynucleotide.
  • 81. The vector of claim 80, wherein the polynucleotide encodes the amino acid sequence of SEQ ID NO:1.
  • 82. The vector of claim 80, wherein the polynucleotide comprises the polynucleotide sequence of SEQ ID NO:2.
  • 83. The vector of claim 80, wherein the polynucleotide encodes an amino acid sequence selected from the group consisting of SEQ ID NOs:145, 146, 148, and 166.
  • 84-85. (canceled)
  • 86. The vector of claim 80, wherein the promoter is a T7 promoter, a TDH promoter, or an FBA promoter.
  • 87. (canceled)
  • 88. The vector of claim 86, wherein the promoter comprises the polynucleotide sequence of SEQ ID NO:135 or 136.
  • 89. The vector of claim 80, wherein the vector further comprises a polynucleotide encoding a 3-hydroxypropionate dehydrogenase (3-HPDH).
  • 90. The vector of claim 89, wherein the 3-HPDH comprises an amino acid sequence selected from the group consisting of SEQ ID NOs:122-130, 154, and 159.
  • 91. (canceled)
  • 92. The vector of claim 89, wherein the polynucleotide that encodes the sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166 and the polynucleotide encoding the 3-hydroxypropionate dehydrogenase (3-HPDH) are arranged in an operon operably linked to the same promoter.
  • 93. The vector of claim 92, wherein the promoter is a T7 or phage promoter.
  • 94. The vector of claim 80, wherein the vector further comprises a polynucleotide encoding a phosphoenolpyruvate carboxykinase (PEPCK).
  • 95. The vector of claim 94, wherein the PEPCK comprises the amino acid sequence of SEQ ID NO:162 or 163.
  • 96. The vector of claim 94, wherein the polynucleotide that encodes the sequence selected from the group consisting of SEQ ID NOs:1, 145, 146, 148, and 166; the polynucleotide encoding the 3-hydroxypropionate dehydrogenase (3-HPDH); and the polynucleotide encoding the phosphoenolpyruvate carboxykinase (PEPCK) are arranged in an operon operably linked to the same promoter.
  • 97. The vector of claim 96, wherein the promoter is a T7 or phage promoter.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the priority benefit of U.S. Provisional Application Ser. No. 62/507,019, filed May 16, 2017, which is incorporated herein by reference in its entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with Government support under Grant No. DE-AC02-05CH11231 awarded by the Department of Energy. The Government has certain rights in this invention.

Provisional Applications (1)
Number      Date        Country
62507019    May 2017    US
Continuations (1)
          Number      Date        Country
Parent    16612304    Nov 2019    US
Child     17683101                US