The present invention provides enzymes that have been optimized by implementation of Protein Repair One Stop Shop (PROSS), an algorithm that generates protein designs for enhanced stability without changing either enzymatic properties or enzyme active site conformation of the respective enzyme. The protein designs generated by PROSS introduce mutations to the amino acid sequence of a wild-type protein, resulting in a mutated amino acid sequence that encodes a variant of the wild-type enzyme, i.e., “an enzyme variant,” which has an enhanced stability, core packing, surface polarity and backbone rigidity, a higher functional expression, or a combination thereof, compared to the stability, core packing, surface polarity and backbone rigidity, functional expression or a combination thereof of the wild-type enzyme.
Enzymes have been used in various manufacturing processes, although indirectly, since the dawn of man. Isolated enzymes were first used in 1914, and their large-scale microbial production started in the 1960s. The industrial enzyme business is steadily growing: in 2004, the market cap for industrial enzymes was $2.5 billion with a predicted annual growth rate of 5%-10%, according to BioSpectrum, Industrial enzyme: High on radar, Mar. 15, 2006. Indeed, 11 years later it was valued at $5 billion, as reported by Global Industrial Enzymes Market-Segmented by Type, Application, and Geography-Trends and Forecasts (2017-2022), Mordor Intelligence, 2017.
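As a rough consistency check on these figures, the compound annual growth rate implied by a rise from $2.5 billion to $5 billion over 11 years can be worked out as below. This is only a sketch: it assumes that both reports measure the same market and that growth compounds once per year, neither of which is stated in the cited sources.

```latex
\[
r = \left(\frac{5.0}{2.5}\right)^{1/11} - 1 = 2^{1/11} - 1 \approx 0.065
\]
```

Under those assumptions the implied rate is about 6.5% per year, which falls inside the predicted 5%-10% range.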
In 2017, major applications of industrial enzymes are in medicine, the textile industry, the food and beverage industry and, in recent times, as analytical aids. Detergents (17%), food and feed (17%), leather and paper (17%), textiles (8%) and pharmaceutical (41%) are the major industries that use industrially produced enzymes, as described by Chandel A K, et al., Industrial enzymes in bioindustrial sector development: An Indian perspective. J Comm Biotechnol 2007; 13:283-91.
As enzymes are used for more applications, there is a desire to use them in extreme environmental conditions, such as temperature and pH extremes, and in the presence of salts, alkalis and surfactants. Major applications of enzymes are at high temperature (e.g., washing 60-70° C., starch gelatinization 100° C., textile desizing 80-90° C., etc.), under high salt concentrations (food industry), under alkaline conditions and in the presence of surfactants (e.g., in detergents and in several biotransformation reaction systems among others). Thus, there is a need to engineer enzymes to be extremely stable, far beyond their wild-type environment.
Approaches to stabilize enzymes, and proteins in general, are usually based on phylogenetic reconstruction of ancestral sequences (see, e.g., Lehmann M., et al., The consensus concept for thermostability engineering of proteins. Biochim. Biophys. Acta. 2000; 1543:408-415), or on structure-based rational or computational design (see, e.g., Borgo B., et al., Automated selection of stabilizing mutations in designed and natural proteins, Proc. Natl. Acad. Sci. USA. 2012; 109:1494-1499, and Jacak R., et al., Computational protein design with explicit consideration of surface hydrophobic patches, Proteins. 2012; 80:825-838), and have had some success in engineering proteins with improved stability and higher functional expression, per Jacak.
Recently, Goldenzweig et al. developed PROSS—an automated method for protein stabilization—and demonstrated its robustness on 5 proteins from a wide spectrum of organisms, as described by Goldenzweig A, et al., Automated Structure- and Sequence-Based Design of Proteins for High Bacterial Expression and Stability. Mol Cell. 2016; 63(2):337-46, and by Campeotto I, et al., One-step design of a stable variant of the malaria invasion protein RH5 for use as a vaccine immunogen. PNAS 2017 January 17. One target was human acetylcholinesterase (hAChE), an enzyme mediating synaptic transmission, which was designed to have 2000-fold higher expression rate in E. coli compared to the wild-type and exhibited 20° C. higher thermostability with no change in enzymatic properties or in active site conformation as was determined by X-Ray crystallography.
In light of its extreme robustness, the inventors applied PROSS to a number of industry relevant enzymes. Each PROSS iteration resulted in an amino acid sequence of an enzyme with an improved stability.
Provided herein are pyruvate transaminase variants of a wild-type pyruvate transaminase from Vibrio fluvialis of SEQ ID NO: 39 comprising one or more mutations selected from A74E, Y76M, V103M, S108M, F130Y, Q196E, Q210L, A214P, G219A, L243Q, S304A, K305Q, T309E, A313K or A313R, E315G, E316T, A323Y, K335E, N342D, N405E; T406A, T408E, T408F or T408Q, and E436D.
In one embodiment, provided herein is a pyruvate transaminase variant of a wild-type pyruvate transaminase from Vibrio fluvialis of SEQ ID NO: 39 comprising mutations:
Also provided herein are triacylglycerol lipase variants of a wild-type triacylglycerol lipase from Burkholderia glumae of SEQ ID NO: 40 comprising one or more mutations selected from A63G, N64G, S74E, S78R, A106E, S132A, F161Y, T168Y, T176P, N183W, D196N, R204K, Q210G, T211A, Y214F, R216Q, A234P, S241N, Q242T, A297G, Q300P, F312Y, S319N, V334I, N338F, T348N or T348Q, L355N, and V358L.
In one embodiment, provided herein is a triacylglycerol lipase variant of a wild-type triacylglycerol lipase from Burkholderia glumae of SEQ ID NO: 40 comprising mutations:
Provided herein are halohydrin dehalogenase variants of a wild-type halohydrin dehalogenase from Rhizobium radiobacter of SEQ ID NO: 41 comprising one or more mutations selected from T3V, K10R, F12G, S17A, S22A, E33P, Q37D, K38P, L41R, K52H, E56A, I63V, A65R, T67I; S68K or S68E; S78A, D80A, A83P, Q87R; K91Q or K91E; A93D or A93S; V94P; G99A or G99K; Q105L, R107G, A110R, N113R or N113Q, S117P, F136L, W139L, K140P, E141G, S147A; C153I or C153N; T154A, N157K, S160A, V205Q, T206V, K215E and M245V.
In one embodiment, provided is a halohydrin dehalogenase variant of a wild-type halohydrin dehalogenase from Rhizobium radiobacter of SEQ ID NO: 41 comprising mutations:
Also provided herein are aminotransferase variants of a wild-type aminotransferase from Chromobacterium violaceum of SEQ ID NO: 42 comprising one or more mutations selected from M36L, G39A, D70E, F71L, R76Y, K90Q, V97I, S101E or S101K, A109P, D112N, R113H, S123A, V124N, T145V, L146I, G152A, S162A, V202L; V203H or V203K; A217P, D315E, A337R, V236I, A239D, H276Y; A286I or A286V; K304D, A312E, H333L; A337K or A337R; Q346E, K358Q, E368P; V379L; V379I or V379M; Q380A, N387D, K390T, D396N, F397E, T402W, R410N, S424A, T452E; and A455E or A455K.
In one embodiment, provided herein is an aminotransferase variant of a wild-type aminotransferase from Chromobacterium violaceum of SEQ ID NO: 42 comprising mutations:
Also provided herein are putative secreted lipase variants of a wild-type putative secreted lipase from Clostridium botulinum of SEQ ID NO: 43 comprising one or more mutations selected from D69N, E91Q, I101T, K137D, A140R, S141N, H143Y, V168K, K183G, K196P, N200D, K246H, S253V, S265A, T277M, K289N, N308D, E339Y, T346K, A375P, S395T, Q398K, G437T, M469I, N477Q, and S479P.
In one embodiment, also provided is a putative secreted lipase variant of a wild-type putative secreted lipase from Clostridium botulinum of SEQ ID NO: 43 comprising mutations:
Further provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 44 comprising one or more mutations selected from K10G; K11E or K11Q; Q14K or Q14V; V15I, E19P, K20V or K20I, K22T, S24G, V66I, I67V, E68V, E69A, Y89W, Q102W, R108H, Q110K, A112T, Y120F, A127D, K133H, F147L, T153V, V159E or V159M; V179M, L196I, G197D; E199D, K205R, Q206E or Q206K; K216L, H217K, D218E, A220P, Q222K, W223F, I224M, E226K, T234A, A242P, E245Q, S246Q, K249R or K249N; I251L, A256T, C257V or C257L; E265G, I267M or I267F, I269V or I269L; V279I, V28I, T306P or T306V; V308I, Q311R; N315K or N315D; D318E, L324R, N329D or N329K, K335D, V336L, and D337E.
Also provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 44 comprising mutations:
Provided herein are histidine acid phosphatase variants of a wild-type histidine acid phosphatase from Yersinia mollaretii of SEQ ID NO: 45 comprising one or more mutations selected from S42A, Q46P or Q46F, D52N, D56H or D56Q, K57P, A65P, T77R or T77K, G81Q or G81R, F82Y, Y83F, N89Q, T102E, Q106W, I109S or I109T, L115A, A136P, V141P or V141M, E149K, Q154T or Q154R, S157P, T158K or T158E, H161R or H161K, R162A, E165M, Q167R, A170G, S173D, E174A, K181D, P182D or P182S, Q185L, G187Q or G187E, E188Q, I189V, T193C or T193K, A194N, S200A or S200K, S207T, N221S, Q222K, Q223E, S230E, S236A, G240A, N247Y, S248A, H257G, A262E, V266R, S267E, L269M, Q275K or Q275Y, K289Q, V298L, K306D, G324A, G334A, Q342T, Q345G, G355A, N363D, D365K, M380L, S386A, K392T, S393N, H394N, I409T, T411D, Q416P, and A425K.
In one embodiment, provided herein is a histidine acid phosphatase variant of a wild-type histidine acid phosphatase from Yersinia mollaretii of SEQ ID NO: 45 comprising mutations:
Provided herein are chanoclavine-I aldehyde reductase easA variants of a wild-type chanoclavine-I aldehyde reductase easA from Neosartorya fumigata of SEQ ID NO: 46 comprising one or more mutations selected from A8S, K12Q, H20N, I26V, T30M, G37D, Q38N, G50A, D66F, T68S, M72G, P84E, R86I, E91K, S94D, R95A, I126V, C151Y or C151I, A153E, N164E, T184I, K186D, Q192T, T210V, V217I, E240D, A250E, M252L, D256K, L268Q, E270N, E304R, K305W, Q309W, Y368W, L371R, and H372N or H372Q.
In one embodiment, provided herein is a chanoclavine-I aldehyde reductase easA variant of a wild-type chanoclavine-I aldehyde reductase easA from Neosartorya fumigata of SEQ ID NO: 46 comprising mutations:
Also provided herein are alanine dehydrogenase variants of a wild-type alanine dehydrogenase from Archaeoglobus fulgidus of SEQ ID NO: 47 comprising one or more mutations selected from E2K, Q8R, S13Q, N22E, L33E, G72Q, S89D, T104I, S106A, S125A, F128V, T135E, K235R, D277P, V300A, S310E, S315T, K316F and I317V.
In one embodiment, provided herein is an alanine dehydrogenase variant of a wild-type alanine dehydrogenase from Archaeoglobus fulgidus of SEQ ID NO: 47 comprising mutations:
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Shimwellia blattae of SEQ ID NO: 48 comprising one or more mutations selected from T28A, E41Q, N44D; N62A, N62L or N62Y; L72A; L81Q, A83K or A83Q; E84A, S90P, G91D, G92D, A94P, N95K; G99P or G99E; S103M, E107P, K108E, A112W, L113I, H114Y or H114W; I121F, S170A; I171F, I171M, I171L or I171A; T175F or T175V; V178I, Q185E or Q185D, N188D, L197Y, A215G, T225A, Q234A, Q235D, and Q237A.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Shimwellia blattae of SEQ ID NO: 48 comprising mutations:
Further provided herein are major phosphate-irrepressible acid phosphatase variants of a wild-type major phosphate-irrepressible acid phosphatase from Morganella morganii of SEQ ID NO: 49 comprising one or more mutations selected from E40D, K47A, E54T, Q59D, L61K, N62R, M72A, A89S, A90K, G91D or G91E; T95K, G99E, E107K, M137Q, E148T, N151T, K153D, Q155E, S170A, T175V, V182I, A185D or A185E; N186R, Q187A or Q187R; L197Y, D213E, A215G, T225A, D229N, Q233R, Q235D, K246L, S247L or S247R; Q248R, and K249Q.
In one embodiment, provided herein is a major phosphate-irrepressible acid phosphatase variant of a wild-type major phosphate-irrepressible acid phosphatase from Morganella morganii of SEQ ID NO: 49 comprising mutations:
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Prevotella intermedia of SEQ ID NO: 50 comprising one or more mutations selected from N33H, D41E, G42S or G42E; Q44A or Q44S; T45V or T45I, S46D, T53P, L63K or L63E; Y64N, E66M or E66R; A67N, M74A, D82K, V85R, V90L, N97Q, E101C, I105M, K109P, T111K, V119L, L120R or L120Q; G127N, A145M, Y147F, M150S, N153T, E155K, Q157E, Q158E, S161R, T162K, T171A, I173L, T177V, V180I, S182A, I186P, Q189A, N190D, E194K, Y197R, Q198E, M199Y, S221A, D231N, Q237D and Q249R.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Prevotella intermedia of SEQ ID NO: 50 comprising mutations:
Provided herein are tryptophan dimethylallyltransferase variants of a wild-type tryptophan dimethylallyltransferase from Neosartorya fumigata of SEQ ID NO: 51 comprising one or more mutations selected from K2Q, A3T, A9N, R17K, A18Y, L28R, T43D or T43M, T47S, T48L, C50Q or C50E; T57F, C61H, C68P, A74R, N96H, S97N, K114N, S125A, H128R or H128Q, L130K, K134P, H146N, S152R, S155A, F157Y, H160K, G167D, A185T or A185V; T188V or T188A; T202S, V207I, G209E, V211I, R213K, V216Q, N226D, S237P, A254D, Q265R, M266N or M266Q, A285P, S292E, S304P, V319Q, I320V, Q335P, N336G, V339Y, E341Q, S370T, T378E, T379A, A387Q, D390S, D405N, R406N, T407K or T407Q; and V453K.
In another embodiment, provided herein is a tryptophan dimethylallyltransferase variant of a wild-type tryptophan dimethylallyltransferase from Neosartorya fumigata of SEQ ID NO: 51 comprising mutations:
Further provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 52 comprising one or more mutations selected from N7R or N7T; K10G, A12P, E19P, R20I, K22T, L23I, E24G, E25P or E25R, V28I, I51V, E69A, A71G, K72P, K75T, I77V, Y89W, E94H or E94R; L99R, Q102W or Q102R; L110Q, G112T, Y120F, P127H, A132V, V141L, V143A, T153V, V159E, A162T, I179M, L181V, S197D, S201L, D206K, A212T, I213V, G215A, K222Q, H225Q, D226K, V228I, S236V, K241P, Q249R, K252R, L257I or L257V; V259L, N264P, A265G, L267M, V279I, S280T, V281I, K282R, K290Q, M292L, A311R, E312R or E312K; E318D, N329D, and K335D.
In a further embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 52 comprising mutations:
Also provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Rhodococcus jostii of SEQ ID NO: 53 comprising one or more mutations selected from V36A, K44E or K44D, A46L, D70P, S73T, I77V, V83A or V83C, T91R, S97T, Q99M, S100Q, L105A, N108G, V110A, T111A, E113C or E113A; L114Q or L114M; N116D, I128V, A132C, A133M, T136A, Q139E, F140Y, S141A or S141T; T154D, W159K, V160A, S161A, T164G or T164S; S173A, A177T, G178A, E179G, K181Q, A182P, A187V or A187I; V199I, R200Q, A201G, V203R, N206G, G208R, V210I, I213V, S218E or S218D, L220F, F222W, F222M or F222Q; Q225K, S226F, D229T, F230H, Y232F, T233A, A235M, H239A, D241L, Y247W, V249Q, T254V, V255I, V261G, F270M, and E271R.
In one embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Rhodococcus jostii of SEQ ID NO: 53 comprising mutations:
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Shigella flexneri of SEQ ID NO: 54 comprising one or more mutations selected from D40S, N41Q, A42S, I55P, G56D, A59D, L61K, L72A or L72K; L81Q, N95K, K108E, S110A, T118R or T118K, N119R, N151T, D156K or D156Q; R160K or R160T; S170A, T175I or T175M; K192Q, Q235D, and N246L.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Shigella flexneri of SEQ ID NO: 54 comprising mutations: (a) N41Q, I55P, G56D, L72A, N95K, S110A, N119R, R160K, T175I, and Q235D;
Also provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Aromatoleum aromaticum of SEQ ID NO: 55 comprising one or more mutations selected from K2R, S8A, R14N, E16T, L45Y, G46L, S49T, A51G, V54T, A55E, N57G, I71V, D76T, A79K, A83R, H91D, L93I, E96P, A97P, I99M or I99A; A100E, V110I, C115A or C115R, L117Y, R120A or R120F, T123D, S128A, Q136E, T142C, I152V, Y153S, Y153A or Y153G; N154E, N154H or N154D; A159P, S162T, T185R, A188V, H198R, R200K, S205D, C211Q, S212R or S212T or S212Q, Q213N or Q213D; Q215D, A217V or A217R, S218K, C227R, V266E, K274M, R275K, Q277A, Q289M, N291E, T300H, F303K or F303R, S309V, T311S or T311P, L312F, T314Q or T314E, L321H, Q322L, K323E or K323Q, A324S and S332C.
In another embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Aromatoleum aromaticum of SEQ ID NO: 55 comprising mutations:
Also provided herein are arylesterase variants of a wild-type arylesterase from Pseudomonas pseudoalcaligenes of SEQ ID NO: 56 comprising one or more mutations selected from A34G, F35Y, K49Q, S52K, E54H, E57D, S78P, E83R, E87D, Q102L, V115I, S118A, L128V, V138Q or V138R; T142Q or T142E, T149K or T149E; E153K, S158P, G168A, V170N, G172E, M173W or M173L; A182S, E183A, E187P or E187A, I188Q, L189I, and T196Q.
In another embodiment, provided herein is an arylesterase variant of a wild-type arylesterase from Pseudomonas pseudoalcaligenes of SEQ ID NO: 56 comprising mutations:
Also provided herein are putrescine aminotransferase variants of a wild-type putrescine aminotransferase from Escherichia coli of SEQ ID NO: 57 comprising one or more mutations selected from A12E, A15Q, H16R, R24D, H28E, A33Q, R36K, K43R, E44K, F50Y, A59K, F84Y, V102K, S105K, N109E, A112D, Q114L, Q119N, L121F, D123N, K131H, A135E, T137A, K140D, S144V, S153A, K162R, A175T, A192S, S194E, T195E, F196Y, N214D, M218L, T220K, N223E, L248I, T255R, F264Y or F264H; M268L, K282R, N290G, Q292V, A299G, L322F, L344I, T346A, N348H, Q353E, A357E, Q361E, M365Y or M365R, V381I, Q382R, M389L, N398E or N398D; G401A, S406A, R412G, T418S, A422S, T424V, I435D, C438I, E439D, L440K or L440Q; M452I, and S455Q.
In one embodiment, provided herein is a putrescine aminotransferase variant of a wild-type putrescine aminotransferase from Escherichia coli of SEQ ID NO: 57 comprising mutations:
Provided herein are green to red photoconvertible GFP-like protein EosFP variants of a wild-type green to red photoconvertible GFP-like protein EosFP from Lobophyllia hemprichii of SEQ ID NO: 58 comprising one or more mutations selected from N11K, D26E, D28E, K32N, F34Y, M40I, L93M, T94I, I100V or I102T; T113C, N116H, K117E, R119K or R119E, A127P, L137I, K145M, K145S or K145Y; T154V, I157C or I157V, A160F, N166G, A167G, Y169L, F173C, F173V or F173I; E181K, L186M, F191L, V192I, and C195R.
In one embodiment, provided herein is a green to red photoconvertible GFP-like protein EosFP variant of a wild-type green to red photoconvertible GFP-like protein EosFP from Lobophyllia hemprichii of SEQ ID NO: 58 comprising mutations:
Also provided herein are red fluorescent protein drFP583 variants of a wild-type red fluorescent protein drFP583 from Discosoma sp. of SEQ ID NO: 59 comprising one or more mutations selected from M1A, N6S, R17H, T21S, V22M, H41T, N42Q, V44A, A57S, K70R, V71A, Y72F, V73T, K83Y, L85Q, N98V, S111T, Q114E, D115G, C117T, F118L, K123W, F124L, I125R, V127T, S131P, D132N, A145P, R153E, E160D, H162K, K163M, K166R, H172R, L174R or L174T; V175A, E176D or E176Q; S179T, I180T, M182K, L189M, Y192A or Y192S; Y194F, S197R, D200E, N205D, I210V, T217S or T217A; G219A, H222S, L223T, and F224G.
In another embodiment, provided herein is a red fluorescent protein drFP583 variant of a wild-type red fluorescent protein drFP583 from Discosoma sp. of SEQ ID NO: 59 comprising mutations:
Further provided herein are green fluorescent protein variants of a wild-type green fluorescent protein from Aequorea victoria of SEQ ID NO: 60 comprising one or more mutations selected from S28K, S30R, F64L, S72A, N105C, N146I, H148G, M153T, K158N, V163A, N164E, I167M, N170P, S175G, N198Y, S202H, Q204R or Q204H; A206V, and V224L.
In another embodiment, provided herein is a green fluorescent protein variant of a wild-type green fluorescent protein from Aequorea victoria of SEQ ID NO: 60 comprising mutations:
Also provided herein are luciferin 4-monooxygenase variants of a wild-type luciferin 4-monooxygenase from Photinus pyralis of SEQ ID NO: 61 comprising one or more mutations selected from A4G, K8I, A12P, F14R, D19P, Q25L, A29L, V36H, T39R, I47T, E48G, V49Q, Y56L, E58Q, M59W, S60A, V61N or V61C; L63M or L63V; M67L, T74K, N75G, R77V, Q87E, M90V, A101T or A101V; A103H or A103T; I108N or I108T, N110T, R112D, L114I, N116H, S117Q, M118L, T124K, V128T, K130P or K130S; G132A, Q134P, Q147K, T156Q, Q159N, V168M, T169K, H171Y, G175N, V182K, K190Q, T191Q, T214N, A215I, A222C, R223S, D234G, G246A, T251C or T251H; T252V, Y255H, I257R or I257Y; C258Q, L264V, Y266R or Y266K; S276A, L277V or L277I; K281R, Q283T, S284V, T290S or T290P, F292M, S293V, N308S, H310T, G316A, H332N or H332G; I349V, A361S, F368G, V384P, M396N, S399K, N409K, L411T, S420T, W426F, E430G, S440E, S456A, I457V, Q460S, N463D or N463K; F465A, L472Y, E488K, H489P, V499I, T507S, T508P, A509H or A509Y; V522I, G525T, G525N or G525S; L526P, L530I, and A532R or A532K.
In one embodiment, provided herein is a luciferin 4-monooxygenase variant of a wild-type luciferin 4-monooxygenase from Photinus pyralis of SEQ ID NO: 61 comprising mutations:
Provided herein are mOrange variants of a wild-type mOrange from Discosoma sp. of SEQ ID NO: 62 comprising one or more mutations selected from N8D, N9P, R22H, S26C, F46T, A49M, K50R, F88L, D120G, E122C, S136P, D137N, A150P, S152T or S152C, A161T, K167T, T179R, S180C, A197N, I199F, G201D or G201N, K203R; D205Y, D205E or D205V; N210D, Q218E, and G224A.
In one embodiment, provided herein is a mOrange variant of a wild-type mOrange from Discosoma sp. of SEQ ID NO: 62 comprising mutations:
Also provided herein are sandercyanin fluorescent protein variants of a wild-type sandercyanin fluorescent Protein from Sander vitreus of SEQ ID NO: 63 comprising one or more mutations selected from M1L, K4R, A12P, E15P, D16N, A19P, A20S, A42I, V25T, D28E; K35T, K35P or K35Q; A42I, I114T, N115D, A118D, S119F, A121V, A122E, V127L or V127M; G143E, T144I, M145L, G151P, T154E or T154K; L156I, A162N, and A163C.
In another embodiment, provided herein is a sandercyanin fluorescent protein variant of a wild-type sandercyanin fluorescent Protein from Sander vitreus of SEQ ID NO: 63 comprising mutations:
Also provided herein are GFP-like fluorescent chromoprotein FP538 variants of a wild-type GFP-like fluorescent chromoprotein FP538 from Zoanthus sp. of SEQ ID NO: 64 comprising one or more mutations selected from K9T, T30E, I34R or I34E; K39E, I49V, E50K, G94E, S96T, I106T, C107A, N108R, N108S or N108T; I125E or I125K; N127H, M129T, D134N, K139Q, M141R or M141K; T143V, T143I or T143L; N144G, A147P, M153Y, K157N, K162R, S166T, L170K, V184T, S189K, S192K, E196G, Q201E, L205E, T220Y, T220Q, T220E or T220H, I224V, F226H, and A229P.
In a further embodiment, provided herein is a GFP-like fluorescent chromoprotein FP538 variant of a wild-type GFP-like fluorescent chromoprotein FP538 from Zoanthus sp. of SEQ ID NO: 64 comprising mutations:
Provided herein are mNeonGreen variants of a wild-type mNeonGreen from Branchiostoma lanceolatum of SEQ ID NO: 65 comprising one or more mutations selected from V52H, D53E, N65Y, E69M or E69Q, A110Q, S131I or S131C, T139R, S143D, K152H, A158P, T164Q, S166K, A169G, W172P, S175E, K177M, T178L, T188D or T188E, K190T, T194K, G196K or G196E, N197D, S203C, T204Q, T208H, N218D, and Y226F.
In a further embodiment, provided herein is a mNeonGreen variant of a wild-type mNeonGreen from Branchiostoma lanceolatum of SEQ ID NO: 65 comprising mutations:
Also provided herein are green fluorescent protein blFP-Y3 variants of a wild-type green fluorescent protein blFP-Y3 from Branchiostoma lanceolatum of SEQ ID NO: 66 comprising one or more mutations selected from M1A, A5T, V18H, D19E, E35Q, S49N, A76K, K79C, S97I, S100A, N101H, Y102F, T105R, S109N, K112I, I118H, S132K, W138P, T141E, S153G, T154E, T160K, G162K, T170E, N174T, N184D, and K213W.
In a further embodiment, provided herein is a green fluorescent protein blFP-Y3 variant of a wild-type green fluorescent protein blFP-Y3 from Branchiostoma lanceolatum of SEQ ID NO: 66 comprising mutations:
Provided herein are far-red fluorescent protein eqFP650 variants of a wild-type far-red fluorescent protein eqFP650 from Entacmaea quadricolor of SEQ ID NO: 67 comprising one or more mutations selected from G13N, T19E, S20G, K25N, A33M, K34R, A51M, I85V, T86M, Y88F, L94C, T100I, N104D, L107F, K112E, N114R, S120P, A134P, A142R or A142C; S144G, R147E, H149R, Q151N or Q151D; V157K, Y161H, H163I, S165H, F184V, F186Y, R189H or R189Y; K190R, E196H, K199D, T201N, and M208V.
In another embodiment, provided herein is a far-red fluorescent protein eqFP650 variant of a wild-type far-red fluorescent protein eqFP650 from Entacmaea quadricolor of SEQ ID NO: 67 comprising mutations:
Also provided herein are fluorescent protein cyOFP variants of a wild-type fluorescent protein cyOFP from Escherichia coli of SEQ ID NO: 68 comprising one or more mutations selected from L8V, E11P, H14K, S15T, L19M, H32G, K37N, N45M, R46K, V50T, A58S, A63M, M65L, K71E, L106C, T112I or T112V; L114I, A132P, L151M; A154R, A154K or A154N; K164M, V169K, N177H, K179R, T189K, N211D, V220W, V220H or V220Q, and D227H.
In one embodiment, provided herein is a fluorescent protein cyOFP variant of a wild-type fluorescent protein cyOFP from Escherichia coli of SEQ ID NO: 68 comprising mutations:
Each of the enzyme variants provided herein retains at least 95% activity, at least 90% activity, at least 85% activity, at least 80% activity, or at least 75% activity, after 4 days at 20° C. and at a pH in the range of 3.5-8, such as a pH in the range of 5-7.
Other features and advantages of this invention will become apparent from the following detailed description, and examples. It should be understood, however, that the detailed description and the specific examples while indicating preferred embodiments of the invention are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that this invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure this invention.
Pyruvate transaminase is an enzyme in the class-III pyridoxal-phosphate-dependent aminotransferase (“ATIII”) family having transaminase activity and pyridoxal phosphate binding activity. Pyridoxal phosphate is the active form of vitamin B6, and is a coenzyme in transamination reactions, as well as in numerous other enzymatic reactions. Pyruvate transaminase has been purified from Vibrio fluvialis JS17 and has been characterized as having activity toward chiral amines and a lack of activity toward beta-alanine, an optimal pH of 9.2 and an optimal temperature of 37° C. The aminotransferases of the ATIII family also are important in plant growth and development, and in plant responses under abiotic stresses, i.e., environmental conditions that reduce growth, including but not limited to water deficit or excess, extreme temperatures, soil salinity and/or acidity, and mineral deficiency or toxicity.
Provided herein are pyruvate transaminase variants of a wild-type pyruvate transaminase from Vibrio fluvialis of SEQ ID NO: 39 comprising one or more mutations selected from A74E, Y76M, V103M, S108M, F130Y, Q196E, Q210L, A214P, G219A, L243Q, S304A, K305Q, T309E, A313K or A313R, E315G, E316T, A323Y, K335E, N342D, N405E; T406A, T408E, T408F or T408Q, and E436D.
In one embodiment, provided herein is a pyruvate transaminase variant of a wild-type pyruvate transaminase from Vibrio fluvialis of SEQ ID NO: 39 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described pyruvate transaminase variant of SEQ ID NO: 39. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the pyruvate transaminase variant of SEQ ID NO: 39. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the pyruvate transaminase variant of SEQ ID NO: 39. In a further embodiment, provided herein are methods for producing a pyruvate transaminase variant, said method comprising expressing the pyruvate transaminase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the pyruvate transaminase variant of SEQ ID NO: 39.
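The substitution labels used throughout (e.g., A74E, meaning that alanine at position 74 of the wild-type sequence is replaced by glutamate) can be applied to a sequence mechanically. The following is a minimal illustrative sketch, not the method used to generate the variants described herein. The function names and the 10-residue toy sequence are hypothetical, the toy sequence is not SEQ ID NO: 39, and positions are assumed to be 1-based.

```python
import re

# One substitution label: wild-type residue, 1-based position, replacement
# residue, each residue given as one of the 20 standard one-letter codes.
_MUTATION = re.compile(
    r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$"
)


def parse_mutation(label):
    """Split a label such as 'A74E' into (wild_type, position, replacement)."""
    match = _MUTATION.match(label.strip())
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, labels):
    """Return a copy of `sequence` with every substitution in `labels` applied.

    A label is rejected if its position lies outside the sequence or if its
    wild-type residue does not match the residue found at that position.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        found = sequence[position - 1]
        if found != wild_type:
            raise ValueError(f"{label}: expected {wild_type}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence for illustration only (NOT SEQ ID NO: 39).
toy_wild_type = "MAYVSFQGLK"
toy_variant = apply_mutations(toy_wild_type, ["A2E", "Y3M", "F6Y"])
print(toy_variant)  # MEMVSYQGLK
```

Checking the wild-type residue before substituting is a deliberate choice in this sketch: it catches a label that is applied to the wrong sequence or that uses a different numbering offset.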
Triacylglycerol lipase is an enzyme that hydrolyzes triacylglycerol to diacylglycerol and a carboxylate. Pancreatic lipase, also called pancreatic triacylglycerol lipase, is an enzyme secreted into the duodenum by the pancreas that breaks down dietary fat (triglyceride) to monoglycerides and free fatty acids. Normal levels of pancreatic lipase are low, but during pancreatitis, an inflammation of the pancreas, or in pancreatic adenocarcinoma, the pancreas secretes three or more times as much pancreatic lipase. In certain diseases, such as celiac disease, cystic fibrosis, and Crohn disease, not enough lipase is produced; patients may be prescribed a pancreatic enzyme therapy of protease, amylase and lipase, or enteric-coated lipase supplements.
Also provided herein are triacylglycerol lipase variants of a wild-type triacylglycerol lipase from Burkholderia glumae of SEQ ID NO: 40 comprising one or more mutations selected from A63G, N64G, S74E, S78R, A106E, S132A, F161Y, T168Y, T176P, N183W, D196N, R204K, Q210G, T211A, Y214F, R216Q, A234P, S241N, Q242T, A297G, Q300P, F312Y, S319N, V334I, N338F, T348N or T348Q, L355N, and V358L.
In one embodiment, provided herein is a triacylglycerol lipase variant of a wild-type triacylglycerol lipase from Burkholderia glumae of SEQ ID NO: 40 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described triacylglycerol lipase variant of SEQ ID NO: 40. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the triacylglycerol lipase variant of SEQ ID NO: 40. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the triacylglycerol lipase variant of SEQ ID NO: 40. In a further embodiment, provided herein are methods for producing a triacylglycerol lipase variant, said method comprising expressing the triacylglycerol lipase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the triacylglycerol lipase variant of SEQ ID NO: 40.
Halohydrin dehalogenase, also called haloalcohol dehalogenase or halohydrin hydrogen-halide lyase, is an enzyme in certain bacteria that cleaves the carbon-halogen bond in organic compounds to yield epoxides from a vicinal hydroxyl group, which is a reversible reaction. There are three different isoform classes of halohydrin dehalogenase, the A, B and C groups. Halogenated aliphatics are environmental pollutants; thus, the ability of halohydrin dehalogenase to dehalogenate these compounds renders the bacteria encoding this enzyme, as well as the enzyme itself, useful as bioremediation tools to degrade halogenated compounds in polluted soil and in contaminated groundwater and wastewater. Halohydrin dehalogenases also have industrial application because the dehalogenation is reversible: in the reverse reaction, negatively charged nucleophiles other than halides, such as azide, cyanide, or nitrite, are accepted to open the epoxide ring. Novel C-N, C-C, or C-O bonds are thereby formed, making halohydrin dehalogenase an attractive biocatalyst for the production of various β-substituted alcohols through epoxide ring opening. Isoform HheC from the gram-negative bacterium Agrobacterium radiobacter strain AD1 has been studied the most because its encoded halohydrin dehalogenase is enantio-selective.
Provided herein are halohydrin dehalogenase variants of a wild-type halohydrin dehalogenase from Rhizobium radiobacter of SEQ ID NO: 41 comprising one or more mutations selected from T3V, K10R, F12G, S17A, S22A, E33P, Q37D, K38P, L41R, K52H, E56A, I63V, A65R, T67I; S68K or S68E; S78A, D80A, A83P, Q87R; K91Q or K91E; A93D or A93S; V94P; G99A or G99K; Q105L, R107G, A110R, N113R or N113Q, S117P, F136L, W139L, K140P, E141G, S147A; C153I or C153N; T154A, N157K, S160A, V205Q, T206V, K215E and M245V.
In one embodiment, provided is a halohydrin dehalogenase variant of a wild-type halohydrin dehalogenase from Rhizobium radiobacter of SEQ ID NO: 41 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described halohydrin dehalogenase variant of SEQ ID NO: 41. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the halohydrin dehalogenase variant of SEQ ID NO: 41. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the halohydrin dehalogenase variant of SEQ ID NO: 41. In a further embodiment, provided herein are methods for producing a halohydrin dehalogenase variant, said method comprising expressing the halohydrin dehalogenase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the halohydrin dehalogenase variant of SEQ ID NO: 41.
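The substitution notation used in the lists above can be read as wild-type residue, 1-based position, replacement residue; for example, T3V denotes threonine at position 3 replaced by valine. The following is a minimal Python sketch of that reading, offered only as an illustration. The function names and the ten-residue toy sequence are hypothetical and are not taken from SEQ ID NO: 41 or from any tool named in this document.

```python
import re

# One substitution such as "T3V": wild-type residue, 1-based position, new residue.
# Only the 20 standard one-letter amino acid codes are accepted.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_SUB = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_substitution(token):
    """Split a token like 'T3V' into ('T', 3, 'V')."""
    match = _SUB.match(token.strip())
    if match is None:
        raise ValueError(f"not a substitution: {token!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(wild_type, tokens):
    """Return the variant sequence after applying each substitution in turn.

    Raises ValueError if a position is out of range or if the residue found
    at that position does not match the wild-type residue in the token.
    """
    residues = list(wild_type)
    for token in tokens:
        old, pos, new = parse_substitution(token)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"{token}: position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"{token}: expected {old} at position {pos}, "
                f"found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical toy sequence for illustration only (NOT SEQ ID NO: 41).
toy_wild_type = "MSTAIVTNVK"
toy_variant = apply_substitutions(toy_wild_type, ["T3V", "K10R"])
print(toy_variant)
```

Because each token is checked against the residue currently at that position, two alternatives for the same position (e.g., S68K and S68E) cannot both be applied to one sequence in this sketch; the second would be rejected.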
Aminotransferases, also called transaminases, are enzymes that catalyze the amino-transfer reaction between an amino acid and a 2-keto acid, i.e., an alpha-keto acid, in which the amino group of the amino acid and keto group of a keto acid are exchanged: the amino acid becomes a keto acid and the keto acid becomes an amino acid; this reaction is reversible. Transaminases are used to produce amino acids and chiral amines. Alanine transaminase, previously called serum glutamate-pyruvate transaminase, is found in plasma and body tissues, primarily in the liver. Alanine transaminase catalyzes the transfer of an amino group from L-alanine to α-ketoglutarate, reversibly producing pyruvate and L-glutamate.
Also provided herein are aminotransferase variants of a wild-type aminotransferase from Chromobacterium violaceum of SEQ ID NO: 42 comprising one or more mutations selected from M36L, G39A, D70E, F71L, R76Y, K90Q, V97I, S101E or S101K, A109P, D112N, R113H, S123A, V124N, T145V, L146I, G152A, S162A, V202L; V203H or V203K; A217P, D315E, A337R, V236I, A239D, H276Y; A286I or A286V; K304D, A312E, H333L; A337K or A337R; Q346E, K358Q, E368P, V379L, V379I or V379M; Q380A, N387D, K390T, D396N, F397E, T402W, R410N, S424A, T452E, and A455E or A455K.
In one embodiment, provided herein is an aminotransferase variant of a wild-type aminotransferase from Chromobacterium violaceum of SEQ ID NO: 42 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described aminotransferase variant of SEQ ID NO: 42. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the aminotransferase variant of SEQ ID NO: 42. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the aminotransferase variant of SEQ ID NO: 42. In a further embodiment, provided herein are methods for producing an aminotransferase variant, said method comprising expressing the aminotransferase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the aminotransferase variant of SEQ ID NO: 42.
Lipases are enzymes that catalyze the hydrolysis of fats, oils and triglycerides. In humans, the pancreas secretes pancreatic lipase, the main enzyme to break down dietary fats and triglycerides in digested oils to monoglycerides and two fatty acids. Humans also produce a lingual lipase, a bile salt-dependent lipase, hepatic lipase, lipoprotein lipase, as well as gastric lipase in the infant. Bacteria and fungi secrete lipase to absorb nutrients from the external medium. Pathogenic organisms express and secrete lipases to promote host invasion. Lipases also catalyze transesterification, aminolysis and acidolysis reactions. Bacterial and fungal lipases are used as industrial catalysts in the food, detergent and pharmaceutical industries.
Clostridium botulinum is the bacterium that produces botulinum toxin. C. botulinum has two genes encoding two putative secreted lipases, putative secreted lipase CB00863 and CB02061. Identification of lipase-positive C. botulinum is made by its growth at a pH of between 4.8 and 7.0 and an inability to use lactose as a primary source of carbon. Lipase activity by this Gram-positive anaerobe produces a thin pearly layer around colonies grown on egg yolk-containing medium, which property is utilized as a diagnostic test for C. botulinum.
Also provided herein are putative secreted lipase variants of a wild-type putative secreted lipase from Clostridium botulinum of SEQ ID NO: 43 comprising one or more mutations selected from D69N, E91Q, I101T, K137D, A140R, S141N, H143Y, V168K, K183G, K196P, N200D, K246H, S253V, S265A, T277M, K289N, N308D, E339Y, T346K, A375P, S395T, Q398K, G437T, M469I, N477Q, and S479P.
In one embodiment, also provided is a putative secreted lipase variant of a wild-type putative secreted lipase from Clostridium botulinum of SEQ ID NO: 43 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described putative secreted lipase variant of SEQ ID NO: 43. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the putative secreted lipase variant of SEQ ID NO: 43. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the putative secreted lipase variant of SEQ ID NO: 43. In a further embodiment, provided herein are methods for producing a putative secreted lipase variant, said method comprising expressing the putative secreted lipase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the putative secreted lipase variant of SEQ ID NO: 43.
Alcohol dehydrogenases (ADH) are a group of enzymes that catalyze both the oxidation of primary and secondary alcohols to aldehydes and ketones, respectively, and the reverse reaction. In humans, there are five classes of alcohol dehydrogenases, of which the primary ADH, a class 1 form in the liver, catalyzes the oxidation of ethanol to acetaldehyde using the coenzyme nicotinamide adenine dinucleotide (NAD+), which is reduced to NADH. In yeast and bacteria, alcohol dehydrogenase ferments glucose to ethanol and carbon dioxide. Geobacillus stearothermophilus, an obligate anaerobe, encodes an alcohol dehydrogenase that is thermophilic and NAD+-dependent and that catalyzes the oxidation of an alcohol, e.g., ethanol and methanol, with NAD+ to an aldehyde or ketone and NADH. The alcohol dehydrogenase from G. stearothermophilus is utilized in the biocatalytic synthesis of ω-oxo lauric acid methyl ester, which is a key intermediate for bio-based polyamide 12 production, from the corresponding long-chain fatty alcohol by oxidation.
Further provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 44 comprising one or more mutations selected from K10G; K11E or K11Q; Q14K or Q14V; V15I, E19P, K20V or K20I, K22T, S24G, V66I, I67V, E68V, E69A, Y89W, Q102W, R108H, Q110K, A112T, Y120F, A127D, K133H, F147L, T153V, V159E or V159M; V179M, L196I, G197D; E199D, K205R, Q206E or Q206K; K216L, H217K, D218E, A220P, Q222K, W223F, I224M, E226K, T234A, A242P, E245Q, S246Q, K249R or K249N; I251L, A256T, C257V or C257L; E265G, I267M or I267F, I269V or I269L; V279I, V28I, T306P or T306V; V308I, Q311R; N315K or N315D; D318E, L324R, N329D or N329K, K335D, V336L, D337E, K335D, V336L, and D337E.
Also provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 44 comprising mutations:
Further provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 52 comprising one or more mutations selected from N7R or N7T; K10G, A12P, E19P, R20I, K22T, L23I, E24G, E25P or E25R, V28I, I51V, E69A, A71G, K72P, K75T, I77V, Y89W, E94H or E94R; L99R, Q102W or Q102R; L110Q, G112T, Y120F, P127H, A132V, V141L, V143A, T153V, V159E, A162T, I179M, L181V, S197D, S201L, D206K, A212T, I213V, G215A, K222Q, H225Q, D226K, V228I, S236V, K241P, Q249R, K252R, L257I or L257V; V259L, N264P, A265G, L267M, V279I, S280T, V281I, K282R, K290Q, M292L, A311R, E312R or E312K; E318D, N329D, and K335D.
In another embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Geobacillus stearothermophilus of SEQ ID NO: 52 comprising mutations:
Also provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Rhodococcus jostii of SEQ ID NO: 53 comprising one or more mutations selected from V36A, K44E or K44D, A46L, D70P, S73T, I77V, V83A or V83C, T91R, S97T, Q99M, S100Q, L105A, N108G, V110A, T111A, E113C or E113A; L114Q or L114M; N116D, I128V, A132C, A133M, T136A, Q139E, F140Y, S141A or S141T; T154D, W159K, V160A, S161A, T164G or T164S; S173A, A177T, G178A, E179G, K181Q, A182P, A187V or A187I; V199I, R200Q, A201G, V203R, N206G, G208R, V210I, I213V, S218E or S218D, L220F, F222W, F222M or F222Q; Q225K, S226F, D229T, F230H, Y232F, T233A, A235M, H239A, D241L, Y247W, V249Q, T254V, V255I, V261G, F270M, and E271R.
In one embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Rhodococcus jostii of SEQ ID NO: 53 comprising mutations: (a) V36A, K44E, A46L, D70P, S73T, I77V, V83A, T91R, S97T, Q99M, L105A, N108G, V110A, T111A, E113C, L114Q, N116D, A132C, A133M, T136A, Q139E, F140Y, S141A, T154D, V160A, S161A, T164G, E179G, A182P, A187V, R200Q, A201G, N206G, G208R, V210I, A211V, F222W, S226F, F230H, T233A, H239A, D241L, Y247W, T254V, V255I, V261G, F270M, and E271R;
Also provided herein are alcohol dehydrogenase variants of a wild-type alcohol dehydrogenase from Aromatoleum aromaticum of SEQ ID NO: 55 comprising one or more mutations selected from K2R, S8A, R14N, E16T, L45Y, G46L, S49T, A51G, V54T, A55E, N57G, I71V, D76T, A79K, A83R, H91D, L93I, E96P, A97P, I99M or I99A; A100E, V110I, C115A or C115R, L117Y, R120A or R120F, T123D, S128A, Q136E, T142C, I152V, Y153S, Y153A or Y153G; N154E, N154H or N154D; A159P, S162T, T185R, A188V, H198R, R200K, S205D, C211Q, S212R or S212T or S212Q, Q213N or Q213D; Q215D, A217V or A217R, S218K, C227R, V266E, K274M, R275K, Q277A, Q289M, N291E, T300H, F303K or F303R, S309V, T311S or T311P, L312F, T314Q or T314E, L321H, Q322L, K323E or K323Q, A324S and S332C.
In another embodiment, provided herein is an alcohol dehydrogenase variant of a wild-type alcohol dehydrogenase from Aromatoleum aromaticum of SEQ ID NO: 55 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described alcohol dehydrogenase variant of any one of SEQ ID NO: 44, SEQ ID NO: 52, SEQ ID NO: 53 or SEQ ID NO: 55. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the alcohol dehydrogenase variant of any one of SEQ ID NO: 44, SEQ ID NO: 52, SEQ ID NO: 53 or SEQ ID NO: 55. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the alcohol dehydrogenase variant of any one of SEQ ID NO: 44, SEQ ID NO: 52, SEQ ID NO: 53 or SEQ ID NO: 55. In a further embodiment, provided herein are methods for producing an alcohol dehydrogenase variant, said method comprising expressing the alcohol dehydrogenase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the alcohol dehydrogenase variant of any one of SEQ ID NO: 44, SEQ ID NO: 52, SEQ ID NO: 53 or SEQ ID NO: 55.
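Several entries in the lists above give alternatives for a single position (e.g., "K11E or K11Q"). As an illustrative aid only, the following Python sketch groups the substitutions in a list by position and flags positions where the same entry appears more than once or where the stated wild-type residues disagree. The function names are hypothetical, and the excerpt is a shortened example written in the style of the lists above rather than a complete list.

```python
import re
from collections import defaultdict

# Matches tokens such as "K11E"; separators (commas, semicolons, "or", "and")
# are simply skipped because they do not fit the pattern.
TOKEN = re.compile(r"\b([A-Z])(\d+)([A-Z])\b")


def group_by_position(list_text):
    """Map each position to the (wild-type, replacement) pairs listed for it."""
    groups = defaultdict(list)
    for old, pos, new in TOKEN.findall(list_text):
        groups[int(pos)].append((old, new))
    return dict(groups)


def find_problems(groups):
    """Report positions with inconsistent wild-type residues or repeated entries."""
    problems = []
    for pos, subs in sorted(groups.items()):
        if len({old for old, _ in subs}) > 1:
            problems.append((pos, "conflicting wild-type residues"))
        if len(set(subs)) < len(subs):
            problems.append((pos, "repeated entry"))
    return problems


# Shortened, illustrative excerpt (an assumption for this sketch).
excerpt = "K10G; K11E or K11Q; Q14K or Q14V; V15I, K335D, V336L, K335D"
groups = group_by_position(excerpt)
print(groups[11])
print(find_problems(groups))
```

Grouping by position makes the alternatives at each site explicit, so that a variant is built by choosing at most one replacement per position.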
Phytase is a class of phosphatase enzyme that hydrolyzes phytic acid to release a useable form of inorganic phosphorus. Phytic acid is a storage form of phosphorus found in plants and seeds, such as oil seeds, cereals and grains, which is not bioavailable to nonruminants, since these animals lack the phytase enzyme. Phytases are found in plants, fungi, bacteria and ruminant animals, such as cattle and sheep. Nonruminants, such as humans, dogs, and birds, do not produce phytase. Animal feed may be supplemented with phytase to release phytate-bound nutrients, such as calcium, phosphorus, other minerals, such as zinc, iron and magnesium, carbohydrates, and proteins. Phytase may be added to grains, legumes, seeds and corn to release the above-enumerated minerals, thereby producing functional food. A phytase encoded by the Gram-negative bacterial species Yersinia mollaretii has been characterized as having a specific activity of 1,073 U/mg, which is about 10 times higher than that of widely used fungal phytases.
The phytase (histidine acid phosphatase) variants provided herein may be used as described above, e.g., to supplement feed and hydrolyze indigestible phytate, i.e., myo-inositol 1,2,3,4,5,6-hexakis dihydrogen phosphate, to increase absorption in nonruminants of inorganic phosphates, minerals, as well as trace elements, from plant-based food, seeds, and grains. The provided phytase variants release from two to ten times more inorganic phosphorus, are thermostable at high temperature, e.g., 55° C. to 90° C., are active at body temperature of humans and animals (30° C.−40° C.) and at a pH range from 1.5 to 6.0, thereby withstanding the acidic environment of the stomach without a loss of specific activity.
Provided herein are histidine acid phosphatase variants of a wild-type histidine acid phosphatase from Yersinia mollaretii of SEQ ID NO: 45 comprising one or more mutations selected from S42A, Q46P or Q46F, D52N, D56H or D56Q, K57P, A65P, T77R or T77K, G81Q or G81R, F82Y, Y83F, N89Q, T102E, Q106W, I109S or I109T, L115A, A136P, V141P or V141M, E149K, Q154T or Q154R, S157P, T158K or T158E, H161R or H161K, R162A, E165M, Q167R, A170G, S173D, E174A, K181D, P182D or P182S, Q185L, G187Q or G187E, E188Q, I189V, T193C or T193K, A194N, S200A or S200K, S207T, N221S, Q222K, Q223E, S230E, S236A, G240A, N247Y, S248A, H257G, A262E, V266R, S267E, L269M, Q275K or Q275Y, K289Q, V298L, K306D, G324A, G334A, Q342T, Q345G, G355A, N363D, D365K, M380L, S386A, K392T, S393N, H394N, I409T, T411D, Q416P, and A425K.
In one embodiment, provided herein is a histidine acid phosphatase variant of a wild-type histidine acid phosphatase from Yersinia mollaretii of SEQ ID NO: 45 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described histidine acid phosphatase variant of SEQ ID NO: 45. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the histidine acid phosphatase variant of SEQ ID NO: 45. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the histidine acid phosphatase variant of SEQ ID NO: 45. In a further embodiment, provided herein are methods for producing a histidine acid phosphatase variant, said method comprising expressing the histidine acid phosphatase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the histidine acid phosphatase variant of SEQ ID NO: 45.
Chanoclavine-I aldehyde reductase easA is an aldehyde reductase involved in the biosynthesis of fumigaclavine C, an ergot alkaloid (a mycotoxin) produced by some fungi, including the Neosartorya, Claviceps, Aspergillus and Penicillium genera. EasA catalyzes the reduction of chanoclavine-I aldehyde to dihydrochanoclavine-I aldehyde that spontaneously dehydrates to form 6,8-dimethyl-6,7-didehydroergoline, which is converted to festuclavine by festuclavine dehydrogenase (EasG). Festuclavine is hydrolyzed by easM, leading to the formation of fumigaclavine B, which, in turn, is acetylated by easN to fumigaclavine A. Next, easL catalyzes the conversion of fumigaclavine A into fumigaclavine C by attaching a dimethylallyl moiety to C-2 of the indole nucleus. Ergot alkaloids are used in therapeutic drugs, such as those for migraine reduction and blood pressure regulation. The herein provided chanoclavine-I aldehyde reductase easA variants may be used to produce improved ergot alkaloids in vitro via biotechnological processes.
Provided herein are chanoclavine-I aldehyde reductase easA variants of a wild-type chanoclavine-I aldehyde reductase easA from Neosartorya fumigata of SEQ ID NO: 46 comprising one or more mutations selected from A8S, K12Q, H20N, I26V, T30M, G37D, Q38N, G50A, D66F, T68S, M72G, P84E, R86I, E91K, S94D, R95A, I126V, C151Y or C151I, A153E, N164E, T184I, K186D, Q192T, T210V, V217I, E240D, A250E, M252L, D256K, L268Q, E270N, E304R, K305W, Q309W, Y368W, L371R, and H372N or H372Q.
In one embodiment, provided herein is a chanoclavine-I aldehyde reductase easA variant of a wild-type chanoclavine-I aldehyde reductase easA from Neosartorya fumigata of SEQ ID NO: 46 comprising mutations:
Y368W; or
In one embodiment, provided herein are nucleic acids encoding an above-described chanoclavine-I aldehyde reductase easA variant of SEQ ID NO: 46. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the chanoclavine-I aldehyde reductase easA variant of SEQ ID NO: 46. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the chanoclavine-I aldehyde reductase easA variant of SEQ ID NO: 46. In a further embodiment, provided herein are methods for producing a chanoclavine-I aldehyde reductase easA variant, said method comprising expressing the chanoclavine-I aldehyde reductase easA variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the chanoclavine-I aldehyde reductase easA variant of SEQ ID NO: 46.
Alanine dehydrogenase is an enzyme that catalyzes the NAD+-dependent oxidative deamination of L-alanine to pyruvate and ammonium, and the reverse reaction, i.e., the reductive amination of pyruvate to L-alanine. Alanine dehydrogenase is required for growth when alanine is the sole carbon or nitrogen source. The amino acid L-alanine is used in various industries, including food, pharmaceutical, veterinary and production of engineered thermoplastics. The alanine dehydrogenase variants provided herein are hyperthermostable, maintaining structural stability and function at temperatures equal to or greater than 60° C., and at high acidity and/or high radiation levels, enabling faster and improved production of L-alanine in bioreactors at high temperatures from 33° C. to 42° C.
Also provided herein are alanine dehydrogenase variants of a wild-type alanine dehydrogenase from Archaeoglobus fulgidus of SEQ ID NO: 47 comprising one or more mutations selected from E2K, Q8R, S13Q, N22E, L33E, G72Q, S89D, T104I, S106A, S125A, F128V, T135E, K235R, D277P, V300A, S310E, S315T, K316F and I317V.
In one embodiment, provided herein is an alanine dehydrogenase variant of a wild-type alanine dehydrogenase from Archaeoglobus fulgidus of SEQ ID NO: 47 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described alanine dehydrogenase variant of SEQ ID NO: 47. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the alanine dehydrogenase variant of SEQ ID NO: 47. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the alanine dehydrogenase variant of SEQ ID NO: 47. In a further embodiment, provided herein are methods for producing an alanine dehydrogenase variant, said method comprising expressing the alanine dehydrogenase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the alanine dehydrogenase variant of SEQ ID NO: 47.
Acid phosphatase is an enzyme that cleaves phosphoryl groups from an orthophosphoric monoester during digestion. This phosphatase has an optimum activity in the pH range of 4-6. Soil microorganisms use acid phosphatase to obtain phosphate nutrients; measuring the rate of activity of acid phosphatases determines the need for phosphates in soil. The herein provided acid phosphatase variants have improved stability at an acidic pH.
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Shimwellia blattae of SEQ ID NO: 48 comprising one or more mutations selected from T28A, E41Q, N44D; N62A, N62L or N62Y; L72A; L81Q, A83K or A83Q; E84A, S90P, G91D, G92D, A94P, N95K; G99P or G99E; S103M, E107P, K108E, A112W, L113I, H114Y or H114W; I121F, S170A; I171F, I171M, I171L or I171A; T175F or T175V; V178I, Q185E or Q185D, N188D, L197Y, A215G, T225A, Q234A, Q235D, and Q237A.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Shimwellia blattae of SEQ ID NO: 48 comprising mutations:
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Prevotella intermedia of SEQ ID NO: 50 comprising one or more mutations selected from N33H, D41E, G42S or G42E; Q44A or Q44S; T45V or T45I, S46D, T53P, L63K or L63E; Y64N, E66M or E66R; A67N, M74A, D82K, V85R, V90L, N97Q, E101C, I105M, K109P, T111K, V119L, L120R or L120Q; G127N, A145M, Y147F, M150S, N153T, E155K, Q157E, Q158E, S161R, T162K, T171A, I173L, T177V, V180I, S182A, I186P, Q189A, N190D, E194K, Y197R, Q198E, M199Y, S221A, D231N, Q237D and Q249R.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Prevotella intermedia of SEQ ID NO: 50 comprising mutations:
Provided herein are acid phosphatase variants of a wild-type acid phosphatase from Shigella flexneri of SEQ ID NO: 54 comprising one or more mutations selected from D40S, N41Q, A42S, I55P, G56D, A59D, L61K, L72A or L72K; L81Q, N95K, K108E, S110A, T118R or T118K, N119R, N151T, D156K or D156Q; R160K or R160T; S170A, T175I or T175M; K192Q, Q235D, and N246L.
In one embodiment, provided herein is an acid phosphatase variant of a wild-type acid phosphatase from Shigella flexneri of SEQ ID NO: 54 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described acid phosphatase variant of any one of SEQ ID NO: 48, SEQ ID NO: 50, or SEQ ID NO: 54. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the acid phosphatase variant of any one of SEQ ID NO: 48, SEQ ID NO: 50, or SEQ ID NO: 54. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the acid phosphatase variant of any one of SEQ ID NO: 48, SEQ ID NO: 50, or SEQ ID NO: 54. In a further embodiment, provided herein are methods for producing an acid phosphatase variant, said method comprising expressing the acid phosphatase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the acid phosphatase variant of any one of SEQ ID NO: 48, SEQ ID NO: 50, or SEQ ID NO: 54.
Morganella morganii is one of the few enterobacterial species producing high-level phosphate-irrepressible acid phosphatase (“PhoC”) activity. PhoC activity has been shown to prevent induction of alkaline phosphatase when a PhoC-hydrolysable organic phosphate ester, such as glycerol 2-phosphate, was the sole phosphate source.
Further provided herein are major phosphate-irrepressible acid phosphatase variants of a wild-type major phosphate-irrepressible acid phosphatase from Morganella morganii of SEQ ID NO: 49 comprising one or more mutations selected from E40D, K47A, E54T, Q59D, L61K, N62R, M72A, A89S, A90K, G91D or G91E; T95K, G99E, E107K, M137Q, E148T, N151T, K153D, Q155E, S170A, T175V, V182I, A185D or A185E; N186R, Q187A or Q187R; L197Y, D213E, A215G, T225A, D229N, Q233R, Q235D, K246L, S247L or S247R; Q248R, and K249Q.
In one embodiment, provided herein is a major phosphate-irrepressible acid phosphatase variant of a wild-type major phosphate-irrepressible acid phosphatase from Morganella morganii of SEQ ID NO: 49 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described major phosphate-irrepressible acid phosphatase variant of SEQ ID NO: 49. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the major phosphate-irrepressible acid phosphatase variant of SEQ ID NO: 49. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the major phosphate-irrepressible acid phosphatase variant of SEQ ID NO: 49. In a further embodiment, provided herein are methods for producing a major phosphate-irrepressible acid phosphatase variant, said method comprising expressing the major phosphate-irrepressible acid phosphatase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the major phosphate-irrepressible acid phosphatase variant of SEQ ID NO: 49.
Tryptophan Dimethylallyltransferase Variants
Tryptophan dimethylallyltransferase catalyzes the reaction of dimethylallyl diphosphate and L-tryptophan to produce diphosphate and 4-(3-methylbut-2-enyl)-L-tryptophan. This enzyme is a member of the family of transferases, specifically those that transfer aryl or alkyl groups other than methyl groups. This enzyme catalyzes the first reaction in a biosynthetic pathway of the ergot alkaloids in Claviceps sp.
Provided herein are tryptophan dimethylallyltransferase variants of a wild-type tryptophan dimethylallyltransferase from Neosartorya fumigata of SEQ ID NO: 51 comprising one or more mutations selected from K2Q, A3T, A9N, R17K, A18Y, L28R, T43D or T43M, T47S, T48L, C50Q or C50E; T57F, C61H, C68P, A74R, N96H, S97N, K114N, S125A, H128R or H128Q, L130K, K134P, H146N, S152R, S155A, F157Y, H160K, G167D, A185T or A185V; T188V or T188A; T202S, V207I, G209E, V211I, R213K, V216Q, N226D, S237P, A254D, Q265R, M266N or M266Q, A285P, S292E, S304P, V319Q, I320V, Q335P, N336G, V339Y, E341Q, S370T, T378E, T379A, A387Q, D390S, D405N, R406N, T407K or T407Q; and V453K.
In another embodiment, provided herein is a tryptophan dimethylallyltransferase variant of a wild-type tryptophan dimethylallyltransferase from Neosartorya fumigata of SEQ ID NO: 51 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described tryptophan dimethylallyltransferase variant of SEQ ID NO: 51. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the tryptophan dimethylallyltransferase variant of SEQ ID NO: 51. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the tryptophan dimethylallyltransferase variant of SEQ ID NO: 51. In a further embodiment, provided herein are methods for producing a tryptophan dimethylallyltransferase variant, said method comprising expressing the tryptophan dimethylallyltransferase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the tryptophan dimethylallyltransferase variant of SEQ ID NO: 51.
Arylesterase is an enzyme of the hydrolase family that catalyzes the reaction of phenyl acetate and H2O to phenol and acetate. Arylesterase takes part in bisphenol A degradation. The serum enzymes paraoxonases/arylesterases, found in the liver and blood, hydrolyze a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters, e.g., phenyl acetate, and thus confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation, making it a useful therapeutic.
Also provided herein are arylesterase variants of a wild-type arylesterase from Pseudomonas pseudoalcaligenes of SEQ ID NO: 56 comprising one or more mutations selected from A34G, F35Y, K49Q, S52K, E54H, E57D, S78P, E83R, E87D, Q102L, V115I, S118A, L128V, V138Q or V138R; T142Q or T142E, T149K or T149E; E153K, S158P, G168A, V170N, G172E, M173W or M173L; A182S, E183A, E187P or E187A, I188Q, L189I, and T196Q.
In another embodiment, provided herein is an arylesterase variant of a wild-type arylesterase from Pseudomonas pseudoalcaligenes of SEQ ID NO: 56 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described arylesterase variant of SEQ ID NO: 56. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the arylesterase variant of SEQ ID NO: 56. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the arylesterase variant of SEQ ID NO: 56. In a further embodiment, provided herein are methods for producing an arylesterase variant, said method comprising expressing the arylesterase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the arylesterase variant of SEQ ID NO: 56.
Putrescine aminotransferase is an enzyme that reversibly catalyzes the aminotransferase reaction between putrescine and 2-oxoglutarate to produce 1-pyrroline, L-glutamate and water. This enzyme has an optimum pH of 9.0 and is active at an alkaline pH, and is highly active at 20° C.-80° C., with an optimum temperature of 60° C. Putrescine aminotransferase has been found to catalyze the transamination of 12-aminododecanoic acid (Nylon 12) rendering it a valuable industrial biocatalyst in the production of polyamide.
Also provided herein are putrescine aminotransferase variants of a wild-type putrescine aminotransferase from Escherichia coli of SEQ ID NO: 57 comprising one or more mutations selected from A12E, A15Q, H16R, R24D, H28E, A33Q, R36K, K43R, E44K, F50Y, A59K, F84Y, V102K, S105K, N109E, A112D, Q114L, Q119N, L121F, D123N, K131H, A135E, T137A, K140D, S144V, S153A, K162R, A175T, A192S, S194E, T195E, F196Y, N214D, M218L, T220K, N223E, L248I, T255R, F264Y or F264H; M268L, K282R, N290G, Q292V, A299G, L322F, L344I, T346A, N348H, Q353E, A357E, Q361E, M365Y or M365R, V381I, Q382R, M389L, N398E or N398D; G401A, S406A, R412G, T418S, A422S, T424V, I435D, C438I, E439D, L440K or L440Q; M452I, and S455Q.
In one embodiment, provided herein is a putrescine aminotransferase variant of a wild-type putrescine aminotransferase from Escherichia coli of SEQ ID NO: 57 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described putrescine aminotransferase variant of SEQ ID NO: 57. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the putrescine aminotransferase variant of SEQ ID NO: 57. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the putrescine aminotransferase variant of SEQ ID NO: 57. In a further embodiment, provided herein are methods for producing a putrescine aminotransferase variant, said method comprising expressing the putrescine aminotransferase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the putrescine aminotransferase variant of SEQ ID NO: 57.
Green to Red Photoconvertible GFP-like Protein EosFP (“EosFP”) was isolated from the stony coral Lobophyllia hemprichii. This protein matures in a green fluorescent state with an emission maximum at 516 nm. Upon irradiation with violet-blue light the chromophore undergoes an irreversible photoconversion to a red state emitting at 581 nm. The wavelengths required for photoconversion and detection of the green and red fluorescent states can be easily separated, rendering EosFP an excellent choice for regional optical marking. EosFP is used as a marker for tracking cells, including cell fate mapping and tracking of metastases, compartments and proteins in live cells, based on the principle that the marker is regionally photoconverted from green to red. Subsequently, the red fluorescent fraction can be tracked independently. EosFP is introduced into cells by transfection of expression vectors or microinjection of in vitro transcribed mRNA or purified recombinant protein. Subcellular components also can be tracked with EosFP. To track a protein of interest, EosFP is fused thereto; after photoconversion, the movement of the marked fraction of the protein is followed by red fluorescence. Likewise, EosFP is fused to a protein of interest as a nanoscopy marker to determine its subcellular localization using photoactivated localization microscopy (PALM).
Provided herein are green to red photoconvertible GFP-like protein EosFP variants of a wild-type green to red photoconvertible GFP-like protein EosFP from Lobophyllia hemprichii of SEQ ID NO: 58 comprising one or more mutations selected from N11K, D26E, D28E, K32N, F34Y, M40I, L93M, T94I, I100V or I102T; T113C, N116H, K117E, R119K or R119E, A127P, L137I, K145M, K145S or K145Y; T154V, I157C or I157V, A160F, N166G, A167G, Y169L, F173C, F173V or F173I; E181K, L186M, F191L, V192I, and C195R.
In one embodiment, provided herein is a green to red photoconvertible GFP-like protein EosFP variant of a wild-type green to red photoconvertible GFP-like protein EosFP from Lobophyllia hemprichii of SEQ ID NO: 58 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described green to red photoconvertible GFP-like protein EosFP variant of SEQ ID NO: 58. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the green to red photoconvertible GFP-like protein EosFP variant of SEQ ID NO: 58. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the green to red photoconvertible GFP-like protein EosFP variant of SEQ ID NO: 58. In a further embodiment, provided herein are methods for producing a green to red photoconvertible GFP-like protein EosFP variant, said method comprising expressing the green to red photoconvertible GFP-like protein EosFP variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the green to red photoconvertible GFP-like protein EosFP variant of SEQ ID NO: 58.
Red Fluorescent Protein drFP583 is found in sea anemone of the Discosoma sp. This protein is theorized to have a role in protecting the photosystems of the coral's resident symbiotic microalgae from photoinhibition caused by the high light levels found near the surface of coral reefs. In deeper water, the fluorescence may serve to convert blue light into longer wavelengths more suitable for use in photosynthesis by the microalgal symbionts. Fluorescent proteins such as red fluorescent protein drFP583 are a useful and ubiquitous tool for making chimeric proteins, where they function as a fluorescent protein tag. These proteins have been expressed in most known cell types and are used as a noninvasive fluorescent marker in living cells and organisms. They enable a wide range of applications where they have functioned as a cell lineage tracer, reporter of gene expression, or as a measure of protein-protein interactions.
Also provided herein are red fluorescent protein drFP583 variants of a wild-type red fluorescent protein drFP583 from Discosoma sp. of SEQ ID NO: 59 comprising one or more mutations selected from M1A, N6S, R17H, T21S, V22M, H41T, N42Q, V44A, A57S, K70R, V71A, Y72F, V73T, K83Y, L85Q, N98V, S111T, Q114E, D115G, C117T, F118L, K123W, F124L, I125R, V127T, S131P, D132N, A145P, R153E, E160D, H162K, K163M, K166R, H172R, L174R or L174T; V175A, E176D or E176Q; S179T, I180T, M182K, L189M, Y192A or Y192S; Y194F, S197R, D200E, N205D, I210V, T217S or T217A; G219A, H222S, L223T, and F224G.
In another embodiment, provided herein is a red fluorescent protein drFP583 variant of a wild-type red fluorescent protein drFP583 from Discosoma sp. of SEQ ID NO: 59 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described red fluorescent protein drFP583 variant of SEQ ID NO: 59. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the red fluorescent protein drFP583 variant of SEQ ID NO: 59. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the red fluorescent protein drFP583 variant of SEQ ID NO: 59. In a further embodiment, provided herein are methods for producing a red fluorescent protein drFP583 variant, said method comprising expressing the red fluorescent protein drFP583 variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the red fluorescent protein drFP583 variant of SEQ ID NO: 59.
The green fluorescent protein (GFP) is a protein composed of 238 amino acid residues (26.9 kDa) that exhibits bright green fluorescence when exposed to light in the blue to ultraviolet range. Traditionally, GFP refers to the protein first isolated from the jellyfish Aequorea victoria, which has a major excitation peak at a wavelength of 395 nm and a minor one at 475 nm; its emission peak is at 509 nm, which is in the lower green portion of the visible spectrum. The GFP gene is frequently used as a reporter of expression and also has been used in modified forms to make biosensors. A modified GFP was made by a single point mutation (S65T), which improved its spectral characteristics, resulting in increased fluorescence. Since then, many mutants, including color mutants, have been made, including blue fluorescent protein (EBFP, EBFP2, Azurite, mKalama1), cyan fluorescent protein (ECFP, Cerulean, CyPet, mTurquoise2), and yellow fluorescent protein derivatives (YFP, Citrine, Venus, YPet). GFP is used as a reporter gene, in fluorescence microscopy for cell biology, such as tracking cancer cells and metastases, and in tracking biological processes, including spread of virus infections.
Further provided herein are green fluorescent protein variants of a wild-type green fluorescent protein from Aequorea victoria of SEQ ID NO: 60 comprising one or more mutations selected from S28K, S30R, F64L, S72A, N105C, N146I, H148G, M153T, K158N, V163A, N164E, I167M, N170P, S175G, N198Y, S202H, Q204R or Q204H; A206V, and V224L.
In another embodiment, provided herein is a green fluorescent protein variant of a wild-type green fluorescent protein from Aequorea victoria of SEQ ID NO: 60 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described green fluorescent protein variant of SEQ ID NO: 60. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the green fluorescent protein variant of SEQ ID NO: 60. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the green fluorescent protein variant of SEQ ID NO: 60. In a further embodiment, provided herein are methods for producing a green fluorescent protein variant, said method comprising expressing the green fluorescent protein variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the green fluorescent protein variant of SEQ ID NO: 60.
Firefly luciferin 4-monooxygenase, also known as luciferase, is found in Photinus pyralis, the common eastern firefly. Luciferase is a general term for the class of oxidative enzymes used in bioluminescence and is distinct from a photoprotein. Luciferase catalyzes a bioluminescent reaction that involves the substrate luciferin as well as Mg2+ and ATP and produces green light with a wavelength of 562 nm. Luciferase from firefly is broadly used as a reporter for studying gene regulation and function, and for pharmaceutical screening.
Also provided herein are luciferin 4-monooxygenase variants of a wild-type luciferin 4-monooxygenase from Photinus pyralis of SEQ ID NO: 61 comprising one or more mutations selected from A4G, K8I, A12P, F14R, D19P, Q25L, A29L, V36H, T39R, I47T, E48G, V49Q, Y56L, E58Q, M59W, S60A, V61N or V61C; L63M or L63V; M67L, T74K, N75G, R77V, Q87E, M90V, A101T or A101V; A103H or A103T; I108N or I108T, N110T, R112D, L114I, N116H, S117Q, M118L, T124K, V128T, K130P or K130S; G132A, Q134P, Q147K, T156Q, Q159N, V168M, T169K, H171Y, G175N, V182K, K190Q, T191Q, T214N, A215I, A222C, R223S, D234G, G246A, T251C or T251H; T252V, Y255H, I257R or I257Y; C258Q, L264V, Y266R or Y266K; S276A, L277V or L277I; K281R, Q283T, S284V, T290S or T290P, F292M, S293V, N308S, H310T, G316A, H332N or H332G; I349V, A361S, F368G, V384P, M396N, S399K, N409K, L411T, S420T, W426F, E430G, S440E, S456A, I457V, Q460S, N463D or N463K; F465A, L472Y, E488K, H489P, V499I, T507S, T508P, A509H or A509Y; V522I, G525T, G525N or G525S; L526P, L530I, and A532R or A532K.
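Several positions in a list such as the one above carry alternative substitutions (for example, V61N or V61C). The following minimal sketch groups a list by position and counts how many distinct variants a list could describe, assuming that a variant carries at most one substitution per position and at least one substitution overall. The function names `group_by_position` and `count_variants` are hypothetical, and the malformed token `Q12` in the usage example is invented to show how unparseable entries are reported.

```python
import math
import re
from collections import defaultdict

_VALID = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$")


def group_by_position(list_text: str):
    """Group substitutions by position; return (alternatives, rejected tokens)."""
    alternatives = defaultdict(list)
    rejected = []
    # Tokens are separated by commas, semicolons or whitespace;
    # the connecting words "or" and "and" are skipped.
    for token in re.split(r"[,;\s]+", list_text.strip()):
        if token in ("", "or", "and"):
            continue
        match = _VALID.match(token)
        if match is None:
            rejected.append(token)
            continue
        position = int(match.group(2))
        alternatives[position].append((match.group(1), match.group(3)))
    return dict(alternatives), rejected


def count_variants(alternatives: dict) -> int:
    """Count non-empty combinations with at most one substitution per position.

    Each position contributes (number of alternatives + 1) choices, the
    extra one being "leave the wild-type residue"; the all-wild-type
    combination is then subtracted.
    """
    return math.prod(len(options) + 1 for options in alternatives.values()) - 1


excerpt = "V61N or V61C; L63M or L63V; M67L, T74K, Q12, and A532R or A532K"
grouped, rejected = group_by_position(excerpt)
print(grouped)
print(rejected)
print(count_variants(grouped))
```

For the five positions in this excerpt the count is 3 × 3 × 2 × 2 × 3 − 1 = 107, which shows how quickly the number of possible combinations grows with the length of a list.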
In one embodiment, provided herein is a luciferin 4-monooxygenase variant of a wild-type luciferin 4-monooxygenase from Photinus pyralis of SEQ ID NO: 61 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described luciferin 4-monooxygenase variant of SEQ ID NO: 61. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the luciferin 4-monooxygenase variant of SEQ ID NO: 61. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the luciferin 4-monooxygenase variant of SEQ ID NO: 61. In a further embodiment, provided herein are methods for producing a luciferin 4-monooxygenase variant, said method comprising expressing the luciferin 4-monooxygenase variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the luciferin 4-monooxygenase variant of SEQ ID NO: 61.
mOrange and mOrange2, called “living colors,” are extremely bright orange fluorescent protein monomers which can be used as tags or reporters. Both mOrange fluorescent proteins are mutants derived from mRFP1, a monomeric mutant of DsRed, which was developed in Dr. Roger Tsien's lab by directed mutagenesis. mOrange excitation and emission maxima are 548 and 562 nm, respectively. mOrange2 excitation and emission maxima are 549 and 565 nm, respectively. The mOrange variant fluorescent proteins provided herein are variants of mOrange, which is designated as the “wild-type mOrange” from Discosoma sp.
Provided herein are mOrange variants of a wild-type mOrange from Discosoma sp. of SEQ ID NO: 62 comprising one or more mutations selected from N8D, N9P, R22H, S26C, F46T, A49M, K50R, F88L, D120G, E122C, S136P, D137N, A150P, S152T or S152C, A161T, K167T, T179R, S180C, A197N, I199F, G201D or G201N, K203R, D205Y, D205E or D205V; N210D, Q218E, and G224A.
In one embodiment, provided herein is a mOrange variant of a wild-type mOrange from Discosoma sp. of SEQ ID NO: 62 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described mOrange variant of SEQ ID NO: 62. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the mOrange variant of SEQ ID NO: 62. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the mOrange variant of SEQ ID NO: 62. In a further embodiment, provided herein are methods for producing a mOrange variant, said method comprising expressing the mOrange variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the mOrange variant of SEQ ID NO: 62.
Sandercyanin fluorescent protein is a blue protein found in the mucus coating blue forms of walleye fish, Sander vitreus. Solutions of Sandercyanin fluorescent protein are deep blue in color and show absorbance maxima at 383 and 633 nm. The naturally occurring Sandercyanin fluorescent protein is a homotetramer of four identical subunits that are associated but not covalently bound. Monomeric variants of Sandercyanin fluorescent protein are one-quarter the size of the tetrameric protein form, i.e., about 18.6 kDa, and have a large Stokes shift (375 nm/675 nm), like the tetrameric form, and fluoresce in the far-red or near infrared region, which is useful for a variety of applications, including studying protein-protein interactions, spatial and temporal gene expression, assessing cell biology distribution and mobility, studying protein activity and protein interactions in vivo, as well as cancer research, immunology, and stem cell research and sub-cellular localization. The far-red fluorescence of monomeric variants of Sandercyanin fluorescent protein also is useful for in vivo deep-tissue imaging. The Sandercyanin fluorescent protein variants provided herein are improved in their brightness of fluorescence and stability compared to the wild-type Sandercyanin fluorescent protein and to previously described monomeric variants of Sandercyanin fluorescent protein, and thus are better fluorescent probes than the naturally occurring blue protein.
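The figures quoted above can be checked with a short calculation. This sketch assumes that the pair "375 nm/675 nm" denotes excitation and emission wavelengths, respectively, which the text does not state explicitly; the function names are hypothetical.

```python
def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Stokes shift expressed as a wavelength difference in nanometres."""
    return emission_nm - excitation_nm


def stokes_shift_wavenumber(excitation_nm: float, emission_nm: float) -> float:
    """Stokes shift in wavenumbers (cm^-1); 1 cm equals 1e7 nm."""
    return 1e7 / excitation_nm - 1e7 / emission_nm


# Assumed reading of "375 nm/675 nm": excitation at 375 nm, emission at 675 nm.
print(stokes_shift_nm(375, 675))
print(round(stokes_shift_wavenumber(375, 675), 2))

# A monomer of about 18.6 kDa that is one-quarter the size of the tetramer
# implies a tetramer of about 4 * 18.6 kDa.
monomer_kda = 18.6
print(round(4 * monomer_kda, 1))
```

Under that reading the shift is 300 nm, and the tetramer mass implied by the monomer figure is about 74.4 kDa.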
Also provided herein are sandercyanin fluorescent protein variants of a wild-type sandercyanin fluorescent protein from Sander vitreus of SEQ ID NO: 63 comprising one or more mutations selected from M1L, K4R, A12P, E15P, D16N, A19P, A20S, A42I, V25T, D28E; K35T, K35P or K35Q; A42I, I114T, N115D, A118D, S119F, A121V, A122E, V127L or V127M; G143E, T144I, M145L, G151P, T154E or T154K; L156I, A162N, and A163C.
In another embodiment, provided herein is a sandercyanin fluorescent protein variant of a wild-type sandercyanin fluorescent protein from Sander vitreus of SEQ ID NO: 63 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described sandercyanin fluorescent protein variant of SEQ ID NO: 63. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the sandercyanin fluorescent protein variant of SEQ ID NO: 63. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the sandercyanin fluorescent protein variant of SEQ ID NO: 63. In a further embodiment, provided herein are methods for producing a sandercyanin fluorescent protein variant, said method comprising expressing the sandercyanin fluorescent protein variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the sandercyanin fluorescent protein variant of SEQ ID NO: 63.
GFP-like fluorescent chromoprotein FP538 is a yellow pigment protein found in Zoanthus sp., which is present in soft coral and in neon green sea mats and polyps. Previously described GFP-like fluorescent chromoprotein FP538 mutants are the Glu-66 mutant, having fluorescence excitation at 493 nm and 550 nm with intense green emission at 405 nm and a weak red emission at 576 nm, and an Asp-66 mutant, having fluorescence emission at 524 nm and 552 nm with a broad red emission shoulder extending from 650 nm. GFP-like fluorescent chromoprotein FP538 is used in a variety of applications, including as a fluorescent protein tag in chimeric proteins, a fluorescent marker in living cells and organisms, a tracer of cell lineage, a reporter of gene expression, and a measure of protein-protein interactions. The herein provided GFP-like fluorescent chromoprotein FP538 variants have improved fluorescence and stability compared to the above-described mutants.
Also provided herein are GFP-like fluorescent chromoprotein FP538 variants of a wild-type GFP-like fluorescent chromoprotein FP538 from Zoanthus sp. of SEQ ID NO: 64 comprising one or more mutations selected from K9T, T30E, I34R or I34E; K39E, I49V, E50K, G94E, S96T, I106T, C107A, N108R, N108S or N108T; I125E or I125K; N127H, M129T, D134N, K139Q, M141R or M141K; T143V, T143I or T143L; N144G, A147P, M153Y, K157N, K162R, S166T, L170K, V184T, S189K, S192K, E196G, Q201E, L205E, T220Y, T220Q, T220E or T220H, I224V, F226H, and A229P.
In a further embodiment, provided herein is a GFP-like fluorescent chromoprotein FP538 variant of a wild-type GFP-like fluorescent chromoprotein FP538 from Zoanthus sp. of SEQ ID NO: 64 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described GFP-like fluorescent chromoprotein FP538 variant of SEQ ID NO: 64. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the GFP-like fluorescent chromoprotein FP538 variant of SEQ ID NO: 64. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the GFP-like fluorescent chromoprotein FP538 variant of SEQ ID NO: 64. In a further embodiment, provided herein are methods for producing a GFP-like fluorescent chromoprotein FP538 variant, said method comprising expressing the GFP-like fluorescent chromoprotein FP538 variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the GFP-like fluorescent chromoprotein FP538 variant of SEQ ID NO: 64.
mNeonGreen is a monomeric yellow-green fluorescent protein derived from a tetrameric fluorescent protein from the cephalochordate Branchiostoma lanceolatum and is described as the “brightest monomeric green or yellow fluorescent protein.” The previously developed mNeonGreen monomeric protein is called here the wild-type mNeonGreen. The herein provided mNeonGreen variants are improved compared to the wild-type protein in their photostability and fluorescence, and may be used as fusion tags for general and single-molecule super-resolution imaging, and as fluorescent probes.
Provided herein are mNeonGreen variants of a wild-type mNeonGreen from Branchiostoma lanceolatum of SEQ ID NO: 65 comprising one or more mutations selected from V52H, D53E, N65Y, E69M or E69Q, A110Q, S131I or S131C, T139R, S143D, K152H, A158P, T164Q, S166K, A169G, W172P, S175E, K177M, T178L, T188D or T188E, K190T, T194K, G196K or G196E, N197D, S203C, T204Q, T208H, N218D, and Y226F.
In a further embodiment, provided herein is a mNeonGreen variant of a wild-type mNeonGreen from Branchiostoma lanceolatum of SEQ ID NO: 65 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described mNeonGreen variant of SEQ ID NO: 65. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the mNeonGreen variant of SEQ ID NO: 65. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the mNeonGreen variant of SEQ ID NO: 65. In a further embodiment, provided herein are methods for producing a mNeonGreen variant, said method comprising expressing the mNeonGreen variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the mNeonGreen variant of SEQ ID NO: 65.
Green fluorescent protein blFP-Y3 is derived from the cephalochordate Branchiostoma lanceolatum (Common lancelet). This fluorescent protein also is called the brightest monomeric green or yellow fluorescent protein to date and is an excellent fusion tag for traditional imaging as well as stochastic single-molecule super resolution imaging. The presently provided Green Fluorescent Protein blFP-Y3 variants have an improved fluorescence brightness compared to wild-type Green fluorescent protein blFP-Y3.
Also provided herein are green fluorescent protein blFP-Y3 variants of a wild-type green fluorescent protein blFP-Y3 from Branchiostoma lanceolatum of SEQ ID NO: 66 comprising one or more mutations selected from M1A, A5T, V18H, D19E, E35Q, S49N, A76K, K79C, S97I, S100A, N101H, Y102F, T105R, S109N, K112I, I118H, S132K, W138P, T141E, S153G, T154E, T160K, G162K, T170E, N174T, N184D, and K213W.
In a further embodiment, provided herein is a green fluorescent protein blFP-Y3 variant of a wild-type green fluorescent protein blFP-Y3 from Branchiostoma lanceolatum of SEQ ID NO: 66 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described green fluorescent protein blFP-Y3 variant of SEQ ID NO: 66. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the green fluorescent protein blFP-Y3 variant of SEQ ID NO: 66. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the green fluorescent protein blFP-Y3 variant of SEQ ID NO: 66. In a further embodiment, provided herein are methods for producing a green fluorescent protein blFP-Y3 variant, said method comprising expressing the green fluorescent protein blFP-Y3 variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the green fluorescent protein blFP-Y3 variant of SEQ ID NO: 66.
Far-red Fluorescent Protein eqFP650 is found in Entacmaea quadricolor, a bubble-tip anemone. The far-red fluorescent protein eqFP650 variants provided herein are enhanced in brightness over the wild-type far-red fluorescent protein eqFP650 and may be used for whole-body fluorescence imaging in animal models for, among others, the study of tumorigenesis, embryogenesis, and inflammation.
Provided herein are far-red fluorescent protein eqFP650 variants of a wild-type far-red fluorescent protein eqFP650 from Entacmaea quadricolor of SEQ ID NO: 67 comprising one or more mutations selected from G13N, T19E, S20G, K25N, A33M, K34R, A51M, I85V, T86M, Y88F, L94C, T100I, N104D, L107F, K112E, N114R, S120P, A134P, A142R or A142C; S144G, R147E, H149R, Q151N or Q151D; V157K, Y161H, H163I, S165H, F184V, F186Y, R189H or R189Y; K190R, E196H, K199D, T201N, and M208V.
In another embodiment, provided herein is a far-red fluorescent protein eqFP650 variant of a wild-type far-red fluorescent protein eqFP650 from Entacmaea quadricolor of SEQ ID NO: 67 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described far-red fluorescent protein eqFP650 variant of SEQ ID NO: 67. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the far-red fluorescent protein eqFP650 variant of SEQ ID NO: 67. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the far-red fluorescent protein eqFP650 variant of SEQ ID NO: 67. In a further embodiment, provided herein are methods for producing a far-red fluorescent protein eqFP650 variant, said method comprising expressing the far-red fluorescent protein eqFP650 variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the far-red fluorescent protein eqFP650 variant of SEQ ID NO: 67.
Fluorescent protein cyOFP is a tetrameric cyan-excitable orange-red fluorescent protein that has been genetically engineered and expressed as a 242 amino acid fluorescent protein in Escherichia coli; the previously described fluorescent protein cyOFP is herein called the wild-type fluorescent protein cyOFP. The herein provided fluorescent protein cyOFP variants have enhanced properties compared to the wild-type fluorescent protein cyOFP.
Also provided herein are fluorescent protein cyOFP variants of a wild-type fluorescent protein cyOFP from Escherichia coli of SEQ ID NO: 68 comprising one or more mutations selected from L8V, E11P, H14K, S15T, L19M, H32G, K37N, N45M, R46K, V50T, A58S, A63M, M65L, K71E, L106C, T112I or T112V; L114I, A132P, L151M; A154R, A154K or A154N; K164M, V169K, N177H, K179R, T189K, N211D, V220W, V220H or V220Q and D227H.
In one embodiment, provided herein is a fluorescent protein cyOFP variant of a wild-type fluorescent protein cyOFP from Escherichia coli of SEQ ID NO: 68 comprising mutations:
In one embodiment, provided herein are nucleic acids encoding an above-described fluorescent protein cyOFP variant of SEQ ID NO: 68. In another embodiment, provided herein are expression vectors comprising the nucleic acid encoding the fluorescent protein cyOFP variant of SEQ ID NO: 68. In another embodiment, provided herein are host cells transformed with the expression vector comprising the nucleic acid encoding the fluorescent protein cyOFP variant of SEQ ID NO: 68. In a further embodiment, provided herein are methods for producing a fluorescent protein cyOFP variant, said method comprising expressing the fluorescent protein cyOFP variant in the transformed host cell, wherein the host cell is transformed with the expression vector comprising the nucleic acid encoding the fluorescent protein cyOFP variant of SEQ ID NO: 68.
Table 1 lists the enzymes used in the PROSS algorithm, together with the Uniprot Accession number, host organism, and wild-type enzyme sequence length of each.
Host organism |
---|
Vibrio fluvialis |
Burkholderia glumae |
Rhizobium radiobacter |
Chromobacterium violaceum |
Clostridium botulinum |
Geobacillus stearothermophilus |
Yersinia mollaretii ATCC |
Neosartorya fumigata (strain |
Archaeoglobus fulgidus (strain |
Shimwellia blattae (Escherichia blattae) |
Morganella morganii (Proteus morganii) |
Prevotella intermedia |
Neosartorya fumigata (strain |
Geobacillus stearothermophilus |
Rhodococcus jostii (strain |
Shigella flexneri 5a |
Aromatoleum aromaticum |
Pseudomonas pseudoalcaligenes (strain |
Escherichia coli (strain K12) |
Lobophyllia hemprichii (Lobed |
Discosoma sp. (Sea anemone) |
Aequorea victoria (Jellyfish) |
Photinus pyralis (Common pyralis) |
Discosoma sp. (Sea anemone) |
Sander vitreus (Walleye) |
Zoanthus sp. (Green polyp) |
Branchiostoma lanceolatum |
Branchiostoma lanceolatum |
Entacmaea quadricolor |
Escherichia coli |
Table 2 enumerates the wild-type amino acid sequence for each of the enzymes listed in Table 1; each enzyme is identified by its respective Uniprot Accession number.
Table 3 lists the stabilizing mutations introduced by PROSS to the respective wild-type amino acid sequence enumerated in Table 2 for each enzyme that is listed in Table 1; each enzyme is identified by its respective Uniprot Accession number and the number of enzyme variants for each respective wild-type enzyme is indicated as “variant number”.
Any patent, patent application publication, or scientific publication, cited herein, is incorporated by reference herein in its entirety. The examples are presented to more fully illustrate embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention. While certain features of the invention have been described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.
This application claims priority of U.S. Provisional Application No. 62/557,160, filed Sep. 12, 2017, which is hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US18/50641 | 9/12/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62557160 | Sep 2017 | US |