NOVEL GENETICALLY ENGINEERED MICROORGANISM CAPABLE OF GROWING ON FORMATE, METHANOL, METHANE OR CO2

The present invention relates to a genetically engineered microorganism expressing (i) formate tetrahydrofolate (THF) ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase, (ii) the enzymes of the glycine cleavage system (GCS), (iii) serine deaminase and serine hydroxymethyltransferase (SHMT), (iv) an enzyme increasing the availability of NADPH, and (v) optionally formate dehydrogenase (FDH), and wherein the genetically engineered microorganism has been genetically engineered to express at least one of the enzymes of (i) to (v), wherein said enzyme is not expressed by the corresponding microorganism that has been used to prepare the genetically engineered microorganism, and wherein the enzymes of (i) to (v) are genomically expressed.

In this specification, a number of documents including patent applications and manufacturer's manuals are cited. The disclosure of these documents, while not considered relevant for the patentability of this invention, is herewith incorporated by reference in its entirety. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

Carbon dioxide is the focal point of many of our societal challenges and opportunities. The anthropogenic release of CO₂ threatens the balance of the planetary climate and could lead to a calamitous increase in global temperatures. On the other hand, CO₂ has the potential to replace fossil carbons as the primary feedstock for production of carbon-based value-added chemicals, including fuels, plastics, solvents, feed, and food. Yet, valorization of carbon dioxide remains an open challenge. Biological fixation of CO₂ by plants and algae takes place naturally on a massive scale. However, photosynthetic carbon fixation is challenging to harness due to multiple constraints, including competition for agricultural resources which erodes food security, land use which jeopardizes biodiversity, difficult processing of lignocellulosic biomass, and, most fundamentally, the low efficiency by which phototrophs use sunlight¹. Alternatively, CO₂ can be upgraded by purely chemical means, e.g., generating syngas^2,3, which can be used to produce complex hydrocarbons⁴. However, such processes rely on extreme conditions and suffer from limited operational flexibility, narrow product spectrum, and low product selectivity.

An emerging solution is to integrate abiotic and biotic processes, in order to harness their respective advantages while avoiding their specific drawbacks. Physicochemical methods excel in both capturing renewable energy and using it to activate CO₂ into energized small molecules. Specifically, one-carbon (C₁) compounds can be derived from CO₂ and renewable energy with high efficiency⁵. Biochemical processes can then convert these C₁ compounds into a wide array of chemicals with high specificity under ambient conditions⁶. Of the possible C₁ molecules, formate and methanol are especially interesting, as, unlike gases such as carbon monoxide and methane, they are miscible in water, thus avoiding mass transfer limitations. Formate can be produced by the direct electrochemical reduction of CO₂ with an energetic efficiency of >40%⁵. Methanol can be produced in a two-step process, where electrolysis first generates hydrogen which is then reacted with CO₂; the overall energetic efficiency of this process was demonstrated to be >50%⁷.

While anaerobic acetogens and methanogens can consume formate or methanol at very high efficiency, their product spectrum is very limited⁸. Aerobic cultivation, while associated with lower bioconversion efficiency, is generally much more flexible in terms of production capability. Despite considerable progress in developing better genetic tools for engineering natural aerobic formatotrophs and methylotrophs, their biotechnological application is still limited. This is in part due to unfavorable cultivation parameters (e.g., cell concentration and growth rate) and low efficiency of the relevant metabolic pathways⁹.

Adapting a microorganism for growth on formate or methanol has therefore been a key goal of the synthetic biology community in the last decade^10-21. However, so far, the success of these efforts has been limited. This could be partially explained by the complexity of the natural pathways— the Calvin Cycle, the Serine Cycle, and the Ribulose Monophosphate Cycle²²—the cyclic activity of which strongly overlaps with central metabolism and requires complex regulation of the fluxes that converge into and diverge from the pathway.

An example of a failure to generate a microorganism that can grow solely on formate is described in the article Yishai et al. (2018), ACS Synth. Biol, 7:2023-2028. Here, a genetically engineered E. coli was produced expressing (i) formate tetrahydrofolate (THF) ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase, and (ii) the enzymes of the glycine cleavage system (GCS) with the aim of generating E. coli which can grow on formate as the sole carbon source. However, it was found that this E. coli is still unable to grow on 30 mM formate and requires the addition of glucose as the main source of carbon. Essentially the same results were obtained when generating genetically engineered yeast expressing the orthologous genes. This yeast likewise cannot grow on formate as the sole carbon source without supplementing glycine (de la Cruz et al. (2019), ACS Synth. Biol, 8:911-917). It has been suggested in Yishai et al. (2018), ACS Synth. Biol, 7:2023-2028 to further modify the E. coli to additionally express serine deaminase and formate dehydrogenase, in order to obtain an E. coli capable of growing on formate as the sole carbon source. However, this has not been put into practice.

The unmet need for a genetically engineered microorganism that grows, in particular under aerobic conditions, on a one-carbon (C₁) compound, such as formate or methanol, as the sole carbon source is therefore addressed by the present invention.

Hence, the present invention relates in a first aspect to a genetically engineered microorganism expressing (i) formate tetrahydrofolate (THF) ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase, (ii) the enzymes of the glycine cleavage system (GCS), (iii) serine deaminase and serine hydroxymethyltransferase (SHMT), (iv) optionally formate dehydrogenase (FDH), and (v) an enzyme increasing the availability of NADPH.

Most key biotechnological microorganisms, including E. coli, cannot naturally grow on C₁ feedstocks. As mentioned, a number of prior art attempts to genetically engineer a microorganism to grow solely on a C₁ compound, in particular formate or methanol, have already failed.

The claimed genetically engineered microorganism of the invention is capable of growing efficiently on a C₁ feedstock, such as formate or methanol, as the sole carbon source, in particular under aerobic conditions. This is achieved by genetically engineering a microorganism, so that it expresses the enzymes as characterized by the above items (i) to (v), noting that the expression of the enzyme of item (iv) is optional for the reasons that will be provided herein below.

The above enzymes (i) to (iv) are enzymes of the novel so-called reductive glycine pathway (rGlyP), which is illustrated in FIG. 1, right side. The rGlyP was compiled by the inventors with the aim of designing and engineering a simple, linear synthetic pathway which could support growth of a microorganism on formate or methanol as the sole carbon source. The rGlyP was designed on the basis of the anaerobic reductive acetyl-CoA pathway (rAcCoAP)²³, which assimilates C₁ compounds efficiently. The rGlyP, as shown in FIG. 1, right side, was designed to be the aerobic twin of the anaerobic rAcCoAP²⁴, as shown in FIG. 1, left side. Both pathways are linear routes with limited overlap with central metabolism, minimizing the need for regulatory optimization. Both pathways start with the ligation of formate and tetrahydrofolate (THF) and proceed via reduction into a C₁-THF intermediate, which is then condensed, within an enzyme complex, with CO₂ to generate a C₂ compound (acetyl-CoA or glycine). The C₂ compound is finally condensed with another C₁ moiety and metabolized to generate pyruvate as biomass precursor. Importantly, both the rAcCoAP and the rGlyP are characterized by a 'flat' thermodynamic profile^24,25, that is, both are mostly reversible such that the direction of the metabolic flux they carry is determined mainly by the concentrations of their substrates and products. This thermodynamic profile, while constraining the driving force of the pathway reactions²⁶, indicates very high energetic efficiency, where no energetic input, e.g., in the form of ATP hydrolysis, is wasted. Indeed, both pathways are associated with a very low ATP cost: only 1-2 ATP molecules are invested in the metabolism of formate to pyruvate²⁴. Yet, unlike the rAcCoAP, the key enzymatic components of which are highly oxygen sensitive, the rGlyP can operate under full aerobic conditions. Due to this oxygen sensitivity of the enzymatic components of the rAcCoAP, it is not feasible to use the enzymatic components of this pathway to produce genetically engineered microorganisms for commercial purposes. It would be necessary to grow them under anaerobic conditions, which is labor- and cost-intensive for commercial large-scale uses. The rGlyP overcomes this drawback of the rAcCoAP by implementing only enzymatic components that allow growth under aerobic conditions. Hence, to the best knowledge of the inventors the rGlyP represents the most efficient route (in terms of energy utilization, resource consumption, and biomass yield) to assimilate formate in the presence of oxygen²⁴.

A recent study suggests that the complete rGlyP might be naturally operating in a phosphite-oxidizing microbe²⁷. Moreover, the key enzymatic conversion of the rGlyP, catalyzed by the glycine cleavage system (GCS), was shown to be fully reversible in many organisms^28-30. Previous studies demonstrated that the GCS can support glycine and serine biosynthesis from formate in an engineered E. coli strain at elevated CO₂ concentration^31-33. However, growth of the microorganism on formate (and CO₂) as a sole carbon source has not yet been demonstrated and remained an open challenge before the present invention was made.

As discussed above, the rGlyP comprises the four enzymatic modules corresponding to items (i) to (iv), noting that the enzymatic modules (i) to (iii) are designated C1 to C3 in FIG. 1.

According to item (i) THF ligase, methenyl-THF cyclohydrolase, and methylene-THF dehydrogenase are expressed. These enzymes act together to convert the sole C₁ carbon source formate into methylene-THF. In more detail, methylene-THF is generated in three steps. As will be further detailed herein below, formate is converted into formyl-THF by the THF ligase, formyl-THF is converted into methenyl-THF by the methenyl-THF cyclohydrolase and methenyl-THF is converted into methylene-THF by the methylene-THF dehydrogenase.

According to item (ii) the enzymes of the glycine cleavage system (GCS) are expressed. These enzymes condense methylene-THF with CO₂ and ammonia to glycine. According to item (iii) serine hydroxymethyltransferase (SHMT) and serine deaminase are expressed. These enzymes together condense glycine with another methylene-THF to serine and finally convert serine into pyruvate. The pyruvate metabolism supplies energy to the microorganism when oxygen is present. Hence, once pyruvate is available, the microorganism can be maintained and grown. According to item (iv) formate dehydrogenase (FDH) is expressed. FDH generates reducing power and energy from the C₁ feedstock formate. This reducing power aids in the discussed enzymatic conversions. As will be further explained herein below, under certain conditions the reducing power and energy may also be sufficient to generate enough pyruvate in the absence of the expression of FDH. For this reason the expression of FDH is optional.

In the following further details on all individual enzymes according to items (i) to (iv) are provided.

Enzymes of Item (i):

Formate tetrahydrofolate (THF) ligase (EC 6.3.4.3) is an enzyme that catalyzes the chemical reaction ATP+formate+tetrahydrofolate⇄ADP+phosphate+10-formyltetrahydrofolate. The chemical reaction being catalyzed by the THF ligase is reversible and in connection with the present invention the forward reaction occurs. Hence, the 3 substrates of this enzyme are ATP, formate, and tetrahydrofolate, whereas its 3 products are ADP, phosphate, and 10-formyltetrahydrofolate. This enzyme belongs to the family of ligases, specifically those forming generic carbon-nitrogen bonds. This enzyme participates in glyoxylate and dicarboxylate metabolism and one carbon pool by folate. In the examples herein below the formate THF ligase having the amino acid sequence of SEQ ID NO: 1 is used which is encoded by the nucleotide sequence of SEQ ID NO: 2. It is therefore preferred that the formate THF ligase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 1 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 2.

Herein above and also herein below sequence identities of at least 80% are envisioned. For each occurrence individually the at least 80% identity is with increasing preference at least 90%, at least 95%, at least 98%, and at least 99% identity. Means and methods for determining sequence identity are known in the art. Preferably, the BLAST (Basic Local Alignment Search Tool) program is used for determining the sequence identities as referred to herein.
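The graded identity thresholds above can be illustrated with a minimal sketch. It assumes that a pairwise alignment has already been computed by some other tool and simply counts identical columns; it is not the BLAST algorithm preferred in this specification, whose reported identity depends on the local alignment and scoring parameters. The function names, the gap symbol and the toy sequences are hypothetical and serve illustration only.

```python
# Illustrative only: percent identity over a pre-computed pairwise alignment.
# Function names, the gap symbol and the toy sequences are assumptions and are
# not taken from the specification or from the BLAST program.

IDENTITY_TIERS = (99.0, 98.0, 95.0, 90.0, 80.0)  # thresholds named in the text


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical (non-gap) columns divided by alignment length, in percent."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(identity: float):
    """Return the strictest threshold that is satisfied, or None below 80%."""
    for tier in IDENTITY_TIERS:
        if identity >= tier:
            return tier
    return None


if __name__ == "__main__":
    reference = "MKTAYIAKQRQISFVKSHFS"  # toy 20-residue sequence
    variant = "MKTAYIGKQRQISFVKAHFS"    # two substitutions
    identity = percent_identity(reference, variant)
    print(identity, highest_tier_met(identity))
```

With the toy sequences shown, 18 of 20 positions are identical, so the variant would meet the "at least 90%" tier but not the "at least 95%" tier.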

Methenyl-THF cyclohydrolase (EC 3.5.4.9) is an enzyme that catalyzes the chemical reaction 5,10-methenyltetrahydrofolate+H₂O⇄10-formyltetrahydrofolate. Thus, the two substrates of this enzyme are 5,10-methenyltetrahydrofolate and H₂O, whereas its product is 10-formyltetrahydrofolate. The chemical reaction being catalyzed by the methenyl-THF cyclohydrolase is likewise reversible. In connection with the present invention the reverse reaction occurs. This enzyme belongs to the family of hydrolases, in particular those acting on carbon-nitrogen bonds other than peptide bonds, specifically in cyclic amidines. This enzyme participates in glyoxylate and dicarboxylate metabolism and one carbon pool by folate. In the examples herein below the methenyl-THF cyclohydrolase having the amino acid sequence of SEQ ID NO: 3 is used which is encoded by the nucleotide sequence of SEQ ID NO: 4. It is therefore preferred that the methenyl-THF cyclohydrolase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 3 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 4.

Methylene-THF dehydrogenase (EC 1.5.1.5) is an enzyme that catalyzes the chemical reaction 5,10-methylenetetrahydrofolate+NADP⁺⇄5,10-methenyltetrahydrofolate+NADPH+H⁺. The two substrates of this enzyme are therefore 5,10-methylenetetrahydrofolate and NADP⁺, whereas its 3 products are 5,10-methenyltetrahydrofolate, NADPH, and H⁺. The chemical reaction which is catalyzed by the methylene-THF dehydrogenase is also reversible and in connection with the present invention the reverse reaction occurs. This enzyme belongs to the family of oxidoreductases, specifically those acting on the CH-NH group of donors with NAD⁺ or NADP⁺ as acceptor. In the examples herein below the methylene-THF dehydrogenase having the amino acid sequence of SEQ ID NO: 5 is used which is encoded by the nucleotide sequence of SEQ ID NO: 6. It is therefore preferred that the methylene-THF dehydrogenase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 5 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 6.
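The net conversion carried out by the three enzymes of item (i) can be derived by summing the reactions as written above, taking the THF ligase reaction in the forward direction and the cyclohydrolase and dehydrogenase reactions in the reverse direction. The following sketch does this bookkeeping; the dictionary encoding (negative coefficients for the left-hand side, positive for the right-hand side), the abbreviated species names and the helper function are assumptions made for illustration, and the proton balance simply follows the equations as stated in this specification.

```python
from collections import Counter

# Reactions as written in the text: left-hand side negative, right-hand side
# positive. Species names are abbreviated for readability (an assumption).
THF_LIGASE = {
    "ATP": -1, "formate": -1, "THF": -1,
    "ADP": 1, "phosphate": 1, "10-formyl-THF": 1,
}
CYCLOHYDROLASE = {
    "5,10-methenyl-THF": -1, "H2O": -1,
    "10-formyl-THF": 1,
}
DEHYDROGENASE = {
    "5,10-methylene-THF": -1, "NADP+": -1,
    "5,10-methenyl-THF": 1, "NADPH": 1, "H+": 1,
}


def net_reaction(steps):
    """Sum (reaction, direction) pairs; direction +1 = forward, -1 = reverse."""
    total = Counter()
    for reaction, direction in steps:
        for species, coefficient in reaction.items():
            total[species] += direction * coefficient
    # Intermediates cancel to zero and are dropped from the net equation.
    return {species: c for species, c in total.items() if c != 0}


# Directions as stated in the text for the rGlyP.
MODULE_I_NET = net_reaction(
    [(THF_LIGASE, +1), (CYCLOHYDROLASE, -1), (DEHYDROGENASE, -1)]
)

if __name__ == "__main__":
    consumed = sorted(s for s, c in MODULE_I_NET.items() if c < 0)
    produced = sorted(s for s, c in MODULE_I_NET.items() if c > 0)
    print(" + ".join(consumed), "->", " + ".join(produced))
```

Under these assumptions the intermediates 10-formyl-THF and 5,10-methenyl-THF cancel, and one ATP and one NADPH are consumed per formate converted into 5,10-methylene-THF.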

SEQ ID NOs: 1 to 6 are sequences from the microorganism Methylobacterium extorquens. Methylobacterium extorquens (strain ATCC 14718 / DSM 1338 / AM1) is a pink-pigmented, facultatively methylotrophic, Gram-negative bacterium isolated in 1960 as an airborne contaminant growing on methylamine. It was used as a workhorse to characterize the serine cycle for assimilation of the C₁ unit of methylene tetrahydrofolate, a central intermediate in methylotrophic metabolism, and more recently the ethylmalonyl-CoA pathway for glyoxylate regeneration. The common trait of all Methylobacterium species is the ability to grow on one or several reduced one-carbon (C₁) compounds other than methane, most prominently methanol. The genetically engineered microorganism of the invention is illustrated in the appended examples by a genetically engineered E. coli. E. coli does not naturally express the three enzymes THF ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase. It is therefore required to genetically engineer E. coli, so that it expresses the three enzymes.

Enzymes of Item (ii):

The enzymes of the glycine cleavage system (GCS) react 5,10-methylene-THF, the final product of the action of the THF ligase, methenyl-THF cyclohydrolase, and methylene-THF dehydrogenase, with CO₂, ammonia, and NADH to generate the C₂ amino acid glycine. Finally, with the help of the enzymes of item (iii), glycine is condensed with another 5,10-methylene-THF molecule to produce the C₃ amino acid serine. All reactions from formate to serine are fully reversible, and the overall thermodynamics of the pathway favor the reductive direction with ΔrG′ᵐ ≈ -6 kJ/mol (change in Gibbs energy at pH 7.5, ionic strength of 0.25 M and reactant concentrations of 1 mM). Hence, from a thermodynamic perspective, net production of serine is largely a matter of keeping the pathway substrates, formate and CO₂, at sufficiently high concentrations (Yishai et al. (2018), ACS Synth. Biol., 7, 2023-2028).

The GCS is made up of four enzymes which are called protein T, protein H, protein P, and protein L. The GCS can be found in a wide variety of bacteria, including aerobic and anaerobic bacteria (Kikuchi et al. (2008), Proc Jpn Acad Ser B Phys Biol Sci., 84(7)). GcvT (glycine cleavage system protein T, EC 2.1.2.10) is an aminomethyltransferase. GcvH (glycine cleavage system protein H, no EC number) interacts with all other components of the GCS in a cycle of reductive methylamination (catalysed by the P-protein), methylamine transfer (catalysed by the T-protein) and electron transfer (catalysed by the L-protein, lipoamide dehydrogenase, EC 1.8.1.4). GcvP (glycine cleavage system protein P, EC 1.4.4.2) is a glycine dehydrogenase.

While GcvTHP were overexpressed in the examples herein below, it is not required to also overexpress lipoamide dehydrogenase (Lpd, the L-protein, also referred to as GcvL). This is because wild-type expression of Lpd in E. coli in combination with GcvTHP overexpression resulted in slightly better growth of the E. coli as compared to overexpression of both GcvTHP and Lpd. Without wishing to be bound by this theory, this indicates that i) the natural expression level of GcvL in E. coli is enough to support the reductive glycine pathway, and ii) the overexpression of GcvL might slightly impair other metabolic steps in E. coli.

Hence, in case the microorganism to be used to produce the genetically engineered microorganism endogenously expresses the enzymes of the GCS, it is preferred to overexpress the three enzymes GcvTHP but not the enzyme Lpd. In this connection it is of note that in the GCS complex the ratio of the four enzymes is 1:1:1:1, i.e. for each GcvT/GcvH/GcvP protein there should be one Lpd present. As Lpd is used for other complexes in the cell as well, it is assumed that the endogenous expression of Lpd is higher than that of GcvT/GcvH/GcvP. Based on the expression yields of the four enzymes of the GCS in E. coli as described in Li et al. (2014), Cell, 157(3):624-635, the expression level of Lpd is about 10-fold higher than the expression of GcvT, GcvH and GcvP. Hence, in case GcvT, GcvH and GcvP are overexpressed it is preferred that GcvT, GcvH and GcvP are overexpressed at least 5-fold, more preferably at least 7.5-fold and most preferably about 10-fold. The term “about” is preferably ±20% and more preferably ±10%.

In the examples herein below GcvT, GcvH and GcvP having the amino acid sequences of SEQ ID NOs: 7, 9 and 11 are used which are encoded by the nucleotide sequences of SEQ ID NOs: 8, 10 and 12, respectively. It is therefore preferred that the GcvT, GcvH and GcvP used herein are at least 80% identical to the amino acid sequences of SEQ ID NOs: 7, 9 and 11, respectively, or are encoded by nucleotide sequences being at least 80% identical to SEQ ID NOs: 8, 10 and 12, respectively.

SEQ ID NOs: 7 to 12 are sequences from the microorganism E. coli. Since the genetically engineered microorganism of the invention is illustrated in the appended examples by a genetically engineered E. coli, it is noted that E. coli also endogenously expresses the enzymes of the GCS. Hence, while GcvT, GcvH and GcvP can be and preferably are overexpressed in E. coli by genetically engineering E. coli, naturally occurring E. coli also expresses GcvT, GcvH and GcvP.

The Lpd of E. coli has the amino acid sequence of SEQ ID NO: 59 and is encoded by the nucleotide sequence of SEQ ID NO: 60. It is therefore preferred that the Lpd as used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 59 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 60.

Enzymes of Item (iii):

Serine deaminase (or L-serine dehydratase, EC 4.3.1.17) catalyzes the reversible reaction L-serine⇄pyruvate+NH₃. The reaction involves the initial elimination of water to form an enamine intermediate followed by tautomerization to an imine form and hydrolysis of the C-N bond. In the examples herein below the serine deaminase having the amino acid sequence of SEQ ID NO: 13 is used which is encoded by the nucleotide sequence of SEQ ID NO: 14. It is therefore preferred that the serine deaminase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 13 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 14.

Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) (Vitamin B6) dependent enzyme (EC 2.1.2.1) which plays an important role in cellular one-carbon pathways by catalyzing the reversible, simultaneous conversions of L-serine to glycine and tetrahydrofolate (THF) to 5,10-methylenetetrahydrofolate (5,10-CH₂-THF) (5,10-methylenetetrahydrofolate+glycine+H₂O⇄tetrahydrofolate+L-serine). In the examples herein below the serine hydroxymethyltransferase having the amino acid sequence of SEQ ID NO: 15 is used which is encoded by the nucleotide sequence of SEQ ID NO: 16. It is therefore preferred that the serine hydroxymethyltransferase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 15 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 16.

SEQ ID NOs: 13 to 16 are also sequences from the microorganism E. coli. E. coli endogenously expresses serine deaminase and SHMT. Serine deaminase and SHMT can be overexpressed in E. coli by genetically engineering E. coli, but naturally occurring E. coli also expresses serine deaminase and SHMT.

Enzyme of Item (iv):

Although its expression is optional, the formate dehydrogenase (FDH, EC 1.17.1.9) is preferably expressed. FDH is an NAD-dependent formate dehydrogenase. An NAD-dependent formate dehydrogenase catalyzes the reaction formate+NAD⁺⇄CO₂+NADH+H⁺. NAD-dependent formate dehydrogenases are important in methylotrophic yeast and bacteria and are vital in the catabolism of C₁ compounds, such as formate, methanol, methane and CO₂. In the examples herein below the formate dehydrogenase having the amino acid sequence of SEQ ID NO: 17 is used which is encoded by the nucleotide sequence of SEQ ID NO: 18. It is therefore preferred that the formate dehydrogenase used herein is at least 80% identical to the amino acid sequence of SEQ ID NO: 17 or is encoded by a nucleotide sequence being at least 80% identical to SEQ ID NO: 18.

SEQ ID NOs: 17 and 18 are sequences from the microorganism Pseudomonas sp. Pseudomonas is a genus of Gram-negative, Gammaproteobacteria, belonging to the family of Pseudomonadaceae. The members of the genus demonstrate metabolic diversity and consequently are able to colonize a wide range of niches. E. coli does not naturally express FDH, so that E. coli has to be genetically engineered to express this enzyme.

The expression of all the above-discussed enzymes according to items (i) to (iv) with the exception of serine hydroxymethyltransferase (SHMT) in E. coli is suggested in Yishai et al. (2018), ACS Synth. Biol, 7:2023-2028. It is shown in the examples herein below that a genetically engineered microorganism expressing (i) formate tetrahydrofolate (THF) ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase, (ii) the enzymes of the glycine cleavage system (GCS), (iii) serine deaminase and serine hydroxymethyltransferase (SHMT), and (iv) formate dehydrogenase (FDH) grows on formate as the sole carbon source. An E. coli expressing all these enzymes has a doubling time when grown on formate of about 70 h and a growth yield of about 1.5 gCWD/mol-formate.

For commercial uses it was desirable to further improve this performance, in particular the quite long doubling time of about 70 h. Since it was entirely unclear which further modification would be required to improve the performance, the inventors applied a short-term evolution approach. Thereby an E. coli was obtained that grows on formate with a doubling time of only about 8 h (nearly 9 times faster), and the growth yield of about 2.3 gCWD/mol-formate is also improved. This performance is suitable for commercial uses. The E. coli as obtained by the short-term evolution approach was analyzed and it was found that it overexpresses (13-fold) the gene pntAB.
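The practical meaning of the two doubling times can be made concrete with a short calculation. The sketch below assumes ideal exponential growth, so that the specific growth rate is ln(2) divided by the doubling time; the function names and the 1000-fold expansion used as an example are illustrative assumptions and are not taken from the appended examples.

```python
import math


def specific_growth_rate(doubling_time_h: float) -> float:
    """Specific growth rate mu = ln(2) / t_d in 1/h (exponential growth)."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


def fold_speedup(td_before_h: float, td_after_h: float) -> float:
    """Ratio of doubling times, i.e. how many times faster the culture doubles."""
    return td_before_h / td_after_h


def hours_to_expand(fold_increase: float, doubling_time_h: float) -> float:
    """Time needed for biomass to increase by the given factor."""
    return doubling_time_h * math.log2(fold_increase)


if __name__ == "__main__":
    before_h, after_h = 70.0, 8.0  # doubling times stated in the text
    print(f"speed-up: {fold_speedup(before_h, after_h):.2f}-fold")
    print(f"mu before: {specific_growth_rate(before_h):.4f} 1/h")
    print(f"mu after:  {specific_growth_rate(after_h):.4f} 1/h")
    # Hypothetical 1000-fold biomass expansion, for illustration only.
    print(f"1000-fold at 70 h: {hours_to_expand(1000, before_h):.0f} h")
    print(f"1000-fold at 8 h:  {hours_to_expand(1000, after_h):.0f} h")
```

The ratio 70 h / 8 h equals 8.75, consistent with the statement that growth is nearly 9 times faster.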

The gene pntAB encodes the membrane transhydrogenase PntAB, which is an enzyme increasing the availability of NADPH. For this reason the genetically engineered microorganism of the invention comprises the fifth enzymatic module according to item (v). According to item (v) an enzyme increasing the availability of NADPH is expressed.

An enzyme increasing the availability of NADPH can be any enzyme being involved in the regeneration of NADPH. Insufficient rate of NADPH regeneration may limit the activity of biosynthetic pathways. In such cases the expression of NADPH-regenerating enzymes can be used to address this problem and increase cofactor availability (Lindner et al. (2018), ACS Synth. Biol., 7, 2742-2749). Examples of suitable NADPH-regenerating enzymes will be discussed herein below.

Hence, the microorganism of the present invention is further characterized by having an increased availability of nicotinamide adenine dinucleotide phosphate (NADPH). It was surprisingly found that the relatively slow doubling time on formate of 70 h of a microorganism expressing the enzymes of items (i) to (iv) is due to a lack of NADPH. NADPH is an essential electron donor in all organisms. It provides the reducing power that drives numerous anabolic reactions, including those responsible for the biosynthesis of all major cell components. The efficient synthesis of many of these components, however, is limited by the rate of NADPH regeneration. It is of note that NADPH is also a key cofactor of the rGlyP, where it is consumed by methylene-THF dehydrogenase. Without wishing to be bound by this theory, it may be the methylene-THF dehydrogenase of the genetically engineered microorganism of the invention which consumes the additional NADPH.

It is accordingly preferred with increasing preference that the genetically engineered microorganism has a doubling time of less than 20 h, less than 15 h, less than 10 h and about 8 h when grown on formate as the sole carbon source. The term “about” is preferably ±10% and more preferably ±5%.

As discussed herein above, Yishai et al. (2018), ACS Synth. Biol, 7:2023-2028 and de la Cruz et al. (2019), ACS Synth. Biol, 8:911-917 obtained an E. coli and a yeast, respectively, that can only grow on formate as carbon source if supplemented with glycine. In E. coli and yeast the orthologous genes were expressed to obtain these microorganisms. The fact that essentially the same results were obtained in E. coli and yeast in the prior art shows that the present invention is also applicable to microorganisms in general and not only to the E. coli used in the appended illustrative examples.

Taken together, the present invention provides the first genetically engineered microorganism that can efficiently grow solely on C₁ compounds, in particular under aerobic conditions. As is illustrated by the appended examples, this was achieved by the expression of the enzymatic modules (i) to (v) in a microorganism. Establishing synthetic formatotrophy and methylotrophy, as demonstrated herein, paves the way for sustainable bioproduction rooted in CO₂ and renewable energy.

In accordance with a preferred embodiment of the first aspect of the invention the enzymes of (i) to (v) are genomically expressed, preferably from genomic safe spots.

In accordance with a more preferred embodiment the first aspect of the invention relates to a genetically engineered microorganism expressing (i) formate tetrahydrofolate (THF) ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase, (ii) the enzymes of the glycine cleavage system (GCS), (iii) serine deaminase and serine hydroxymethyltransferase (SHMT), (iv) an enzyme increasing the availability of NADPH, and (v) optionally formate dehydrogenase (FDH), and wherein the genetically engineered microorganism has been genetically engineered to express at least one of the enzymes of (i) to (v), wherein said enzyme is not expressed by the corresponding microorganism that has been used to prepare the genetically engineered microorganism, and wherein the enzymes of (i) to (v) are genomically expressed, preferably from genomic safe spots.

In the examples herein below the genes as defined in connection with the first aspect of the invention were expressed extrachromosomally from plasmids as well as intrachromosomally by introducing them into the genome of E. coli. While expression from plasmids and genomic expression both work, genomic expression of the enzymes is preferred since it was found to support more robust growth as compared to expression from plasmids.

Genomic safe spots are genomic loci where genes or other genetic elements can be safely inserted and expressed. Eight genomic safe spots of E. coli are described in Bassalo et al. (2016), ACS Synthetic Biology 5, 561-568, doi:10.1021/acssynbio.5b00187. These safe spots are used in the appended examples for transgene insertion and this strategy may further support robust growth. The names and map positions of the eight genomic safe spots in the E. coli genome are SS2: 787571; SS3: 1308935; SS5: 2083959; SS6: 2580897; SS7: 3099988; SS8: 3533732; SS9: 3979535; and SS10: 4411972.
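For planning insertions, the eight safe spots listed above can be kept in a simple lookup table. The following sketch is illustrative only: the helper names are hypothetical, distances are computed linearly along the map coordinates (the circular topology of the chromosome is ignored as a simplification), and no statement is made about which enzymatic module is inserted at which safe spot in the appended examples.

```python
# Names and map positions of the safe spots as listed in the text.
SAFE_SPOTS = {
    "SS2": 787571,
    "SS3": 1308935,
    "SS5": 2083959,
    "SS6": 2580897,
    "SS7": 3099988,
    "SS8": 3533732,
    "SS9": 3979535,
    "SS10": 4411972,
}


def nearest_safe_spot(position: int) -> str:
    """Safe spot closest to a map position (linear distance, an assumption)."""
    return min(SAFE_SPOTS, key=lambda name: abs(SAFE_SPOTS[name] - position))


def spacing_between_neighbours() -> dict:
    """Distance in map units between consecutive safe spots in map order."""
    ordered = sorted(SAFE_SPOTS.items(), key=lambda item: item[1])
    return {
        f"{a}->{b}": pos_b - pos_a
        for (a, pos_a), (b, pos_b) in zip(ordered, ordered[1:])
    }


if __name__ == "__main__":
    print(nearest_safe_spot(800_000))
    for pair, distance in spacing_between_neighbours().items():
        print(pair, distance)
```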

In accordance with a further preferred embodiment of the first aspect of the invention the enzymes of (i) to (v) are expressed under the control of a strong constitutive promoter and/or a modified ribosome binding site.

Placing a gene under the control of a strong constitutively active promoter leads to the robust overexpression of the gene. Examples of strong constitutive promoters are the PGI promoter (preferably the PGI-20 promoter as used in the appended examples) and the T7 promoter. The PGI promoters are a family of synthetic promoters derived from the native promoter of glucose-6-phosphate isomerase (pgi) of E. coli (Braatsch et al. (2008), Biotechniques, 45(3), pp.335-337).

In bacteria, ribosome binding sites (RBSs) are effective control elements for translation initiation. By modifying an RBS the translation initiation rate can be increased. This in turn can increase protein abundance by several orders of magnitude (Salis et al., Nat Biotechnol. 2009 October; 27(10):946-50 and Zelcbuch et al., Nucleic Acids Research, 41(9), 1 May 2013, Page e98). Hence, the modified ribosome binding site to be used in connection with the present invention increases protein abundance as compared to the corresponding wild-type RBS. The translation initiation rate of a modified RBS is determined by the combined effect of multiple molecular interactions, including the hybridization of the 16S rRNA to the RBS sequence, the binding of tRNAfMET to the start codon, the distance between the 16S rRNA binding site and the start codon, and the presence of RNA secondary structures that occlude either the 16S rRNA binding site or the standby site. A modified RBS can be a mutated RBS or a synthetic RBS. A set of RBS sequences can also be used as a modified RBS. The set comprises at least one mutated and/or synthetic RBS sequence and can also comprise native RBS sequences.

It is preferred to use one or more of the following synthetic RBS sequences: (RBS-A; SEQ ID NO: 43): AGGAGGTTTGGA, (RBS-B, SEQ ID NO: 44): AACAAAATGAGGAGGTACTGAG, (RBS-C, SEQ ID NO: 45): AAGTTAAGAGGCAAGA, (RBS-D, SEQ ID NO: 46): TTCGCAGGGGGAAG, (RBS-E, SEQ ID NO: 47): TAAGCAGGACCGGCGGCG, and (RBS-F, SEQ ID NO: 48): CACCATACACTG which are known from Zelcbuch et al., Nucleic Acids Research, 41(9), 1 May 2013, Page e98. Among this list the use of RBS-A and RBS-C is preferred (FIG. 6). For instance, a set of two or three RBS-C sequences may be used or an RBS-C sequence along with one or two native RBS sequences.
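The six synthetic RBS sequences listed above can be kept in a simple table and combined into sets. The following Python sketch is illustrative only: the sequences are copied from the preceding paragraph, while the helper names `is_valid_dna` and `rbs_set` are hypothetical and are not part of any library or of the cited publication.

```python
# Synthetic RBS sequences copied from the text above
# (SEQ ID NOs: 43 to 48; Zelcbuch et al., 2013).
SYNTHETIC_RBS = {
    "RBS-A": "AGGAGGTTTGGA",
    "RBS-B": "AACAAAATGAGGAGGTACTGAG",
    "RBS-C": "AAGTTAAGAGGCAAGA",
    "RBS-D": "TTCGCAGGGGGAAG",
    "RBS-E": "TAAGCAGGACCGGCGGCG",
    "RBS-F": "CACCATACACTG",
}


def is_valid_dna(sequence: str) -> bool:
    """Check that a sequence contains only the letters A, C, G and T."""
    return len(sequence) > 0 and set(sequence) <= set("ACGT")


def rbs_set(names):
    """Hypothetical helper: return the sequences for a list of RBS names.

    A name may be repeated, e.g. to model a set of two RBS-C sequences.
    """
    return [SYNTHETIC_RBS[name] for name in names]


if __name__ == "__main__":
    for name, seq in SYNTHETIC_RBS.items():
        print(f"{name}\t{len(seq)} nt\t{seq}")
    # Example: a set of two RBS-C sequences, as mentioned in the text.
    print(rbs_set(["RBS-C", "RBS-C"]))
```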

It is also possible and preferred to use a strong constitutive promoter and a modified ribosome binding site in combination as is illustrated in the examples herein below and as discussed in Wenk et al., Methods Enzymol., 608:329-367.

In accordance with another preferred embodiment of the first aspect of the invention the enzyme of (iv) is at least 2-fold, preferably at least 3-fold, more preferably at least 4-fold and most preferably at least 5-fold more highly expressed than the enzymes of (i) to (iii).

In the above-discussed E. coli having a doubling time of only 8 h on formate, a further modification was found, namely a mutation in the FDH gene which increases its expression (2.5-fold), which in turn increases the formate oxidation activity 7.4-fold. Hence, the increased expression of the FDH gene is presumably the reason why the E. coli not only has a superior doubling time of only 8 h but also displays an improved growth yield of about 2.3 gCDW/mol-formate. It is assumed that it is not the higher absolute expression of the FDH gene that is responsible but rather its higher expression as compared to the enzymes of items (i) to (iii).

In accordance with a yet further preferred embodiment of the first aspect of the invention the enzyme increasing the availability of NADPH is membrane transhydrogenase (PntAB), glucose 6-phosphate dehydrogenase (Zwf), 6-phosphogluconate dehydrogenase (Gnd), malic B enzyme (MaeB), or isocitrate dehydrogenase (Icd), and is preferably PntAB.

E. coli harbors five enzymes that regenerate NADPH (Lindner et al. (2018), ACS Synth. Biol., 7, 2742-2749). These five enzymes are glucose 6-phosphate dehydrogenase (Zwf, EC 1.1.1.49; SEQ ID NOs 29 and 30) and 6-phosphogluconate dehydrogenase (Gnd, EC 1.1.1.44; SEQ ID NOs 23 and 24) of the oxidative pentose phosphate pathway, a malic enzyme (MaeB, EC 1.1.1.38-40; SEQ ID NOs 27 and 28), isocitrate dehydrogenase (Icd, EC 1.1.1.41-42; SEQ ID NOs 25 and 26) of the TCA cycle, and a membrane-bound proton-translocating transhydrogenase (PntAB, EC 7.1.1.1; SEQ ID NOs 19 to 22). These enzymes can be found in other microorganisms as well.

The five enzymes in E. coli have the amino acid sequences of SEQ ID NOs: 19, 21, 23, 25, 27 and 29, respectively, and are encoded by the nucleotide sequences of SEQ ID NOs: 20, 22, 24, 26, 28 and 30, respectively. It is therefore preferred that the five enzymes used herein are at least 80% identical to any of the amino acid sequences of SEQ ID NOs: 19, 21, 23, 25, 27 and 29, respectively, or are encoded by a nucleotide sequence being at least 80% identical to any one of SEQ ID NOs: 20, 22, 24, 26, 28 and 30. It is of note in this respect that PntAB is composed of the A unit and the B unit of SEQ ID NOs: 19 and 21 and is encoded by SEQ ID NOs: 20 and 22.

Among these five enzymes PntAB is preferred since this enzyme was found to be overexpressed in the discussed E. coli having a doubling time of only 8 h on formate. Since the overexpression of PntAB increases the availability of NADPH, the same effect as by PntAB can be achieved by any other enzyme increasing the availability of NADPH, as well.

In accordance with a still further preferred embodiment of the first aspect of the invention an overexpression of PntAB is achieved by introducing a mutation into the promoter region of pntAB, wherein the mutation of pntAB is preferably a single-base pair substitution in the promoter region of pntAB.

As discussed, the endogenous PntAB was found to be overexpressed in the E. coli having a doubling time of only 8 h on formate as shown in the appended examples. The overexpressed pntAB gene was analyzed and a single-base pair substitution in the promoter region of pntAB was found. Hence, the mutation results in a stronger promoter activity and therefore also in higher expression levels.

The nucleotide sequence of the promoter is shown in SEQ ID NO: 31. SEQ ID NO: 31 can be distinguished from the wild-type promoter by a T in nucleotide position 6 as compared to a C in the wild-type sequence. While it is not considered to be of particular relevance how the overexpression of PntAB is achieved, it is preferred that it is achieved by placing the pntAB gene under the control of a strong promoter, preferably the promoter of SEQ ID NO: 31.

PntAB was found to be overexpressed 13-fold as compared to the expression of PntAB under the control of its wild-type promoter. It is therefore preferred with increasing preference herein that PntAB is overexpressed in the genetically engineered microorganism at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, and at least 13-fold as compared to the expression of PntAB under the control of its wild-type promoter. This preference applies mutatis mutandis to all other enzymes increasing the availability of NADPH that can be used in place of PntAB and in particular to glucose 6-phosphate dehydrogenase (Zwf), 6-phosphogluconate dehydrogenase (Gnd), malic B enzyme (MaeB), and isocitrate dehydrogenase (Icd).

In accordance with a preferred embodiment of the first aspect of the invention an overexpression of FDH is at least partly achieved by introducing a mutation into the 5′ untranslated region of FDH, wherein the mutation of FDH is preferably a single-base pair substitution in the 5′ untranslated region of FDH.

The overexpression of FDH is achieved in the examples herein below by placing the fdh gene under the control of a strong constitutive promoter and a modified ribosome binding site. As discussed above, a mutation in the FDH gene was found to further increase the expression (2.5-fold). The overexpressed fdh gene was analyzed and a single-base pair substitution was found in the 5′ untranslated region (UTR) of the FDH gene. The 5′UTR with the single-base pair substitution is shown in SEQ ID NO: 32.

It is preferred that the FDH gene encoding the overexpressed FDH is characterized by the 5′UTR of SEQ ID NO: 32. SEQ ID NO: 32 can be distinguished from the wild-type 5′UTR by an A in nucleotide position 10 as compared to a C in the wild-type sequence.
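A single-base pair substitution of the kind described above can be located by a position-wise comparison of the wild-type and the mutated sequence. The following Python sketch is illustrative only: the two sequences are short toy placeholders and are not SEQ ID NO: 32 or its wild-type counterpart, and the function name `find_substitutions` is hypothetical. Positions are reported 1-based, matching the numbering convention used in the text.

```python
def find_substitutions(wild_type: str, mutant: str):
    """Return (position, wild-type base, mutant base) for every mismatch.

    Positions are 1-based. Only substitutions are handled: both sequences
    must have the same length (no insertions or deletions).
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [
        (index + 1, wt_base, mut_base)
        for index, (wt_base, mut_base) in enumerate(zip(wild_type, mutant))
        if wt_base != mut_base
    ]


if __name__ == "__main__":
    # Toy placeholder sequences (NOT the sequences of the sequence listing),
    # built so that position 10 carries a C -> A substitution.
    toy_wild_type = "ATGGTTAAGCTTGA"
    toy_mutant = "ATGGTTAAGATTGA"
    print(find_substitutions(toy_wild_type, toy_mutant))
```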

In accordance with a further preferred embodiment of the first aspect of the invention the microorganism is auxotrophic for serine, glycine and C₁ moieties.

Auxotrophy is the inability of an organism to synthesize a particular organic compound required for its growth. Hence, this genetically engineered microorganism preferably cannot synthesize serine, glycine and C₁ moieties.

As discussed herein above, in the prior art Yishai et al. (2018), ACS Synth. Biol, 7:2023 a genetically engineered E. coli was produced which cannot grow on formate alone but only if glucose is supplemented. Hence, in order to ensure that no serine, glycine or any C₁ moieties are present in the examples herein below, a microorganism being auxotrophic for serine, glycine and C₁ moieties was used to generate the genetically engineered microorganism of the invention. The microorganism being auxotrophic for serine, glycine and C₁ moieties preferably comprises the four gene deletions ΔserA Δkbl ΔltaE ΔaceA. The first deletion abolishes native serine biosynthesis, the second and third abolish threonine cleavage to glycine, and the fourth deletion prevents the formation of glyoxylate that could potentially be aminated to glycine³². By starting from such a microorganism it was possible to prove that the E. coli is capable of growing on formate as the sole carbon source. The combined activity of the enzymes of items (i) to (iii) enables the cell to metabolize formate into C1-THF, glycine, and serine, relieving these auxotrophies (FIG. 2A).

In accordance with another preferred embodiment of the first aspect of the invention the microorganism is a bacterium, preferably a proteobacterium, more preferably an enterobacterium and most preferably E. coli.

While the present invention is applicable to microorganisms in general, as is discussed herein above in an illustrative manner for E. coli and yeast, the appended examples illustrate the present invention on the basis of E. coli. E. coli is a bacterium, more specifically a proteobacterium of the family of enterobacteria.

In accordance with a preferred embodiment of the first aspect of the invention the microorganism is capable of converting methanol to formate.

As is illustrated in the appended examples, the genetically engineered microorganism of the invention can be further genetically engineered, so that it can convert methanol to formate. Such a further genetically engineered microorganism is capable of growing on methanol as the sole carbon source.

In order to enable the genetically engineered microorganism of the invention to convert methanol to formate it is preferred to additionally express a NAD-dependent methanol dehydrogenase (EC 1.1.1.244), PQQ-dependent methanol dehydrogenase (EC 1.1.99.8), or oxygen-dependent methanol oxidase (EC 1.1.3.13). The use of a NAD-dependent methanol dehydrogenase is illustrated by the appended examples. The methanol dehydrogenase is an NAD+-dependent methanol dehydrogenase that catalyzes the chemical reaction methanol + NAD+ ⇄ formaldehyde + NADH + H+. Thus, the two substrates of this enzyme are methanol and NAD+, whereas its three products are formaldehyde, NADH, and H+. This enzyme belongs to the family of oxidoreductases, specifically those acting on the CH-OH group of the donor with NAD+ or NADP+ as acceptor. The systematic name of this enzyme class is methanol:NAD+ oxidoreductase.

In the examples herein below, methanol dehydrogenases having the amino acid sequences of SEQ ID NOs: 33, 35, 37, 39, and 41 are used, which are encoded by the nucleotide sequences of SEQ ID NOs: 34, 36, 38, 40, and 42, respectively. It is therefore preferred that the methanol dehydrogenase as used herein is at least 80% identical to any one of the amino acid sequences of SEQ ID NOs: 33, 35, 37, 39, and 41 or is encoded by a nucleotide sequence being at least 80% identical to any one of SEQ ID NOs: 34, 36, 38, 40, and 42.
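How a sequence identity threshold such as "at least 80% identical" may be checked can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the two sequences are assumed to be already aligned to equal length with "-" marking gaps, identity is taken as the number of identical non-gap columns divided by the alignment length (other definitions of the denominator exist), and the toy peptides as well as the names `percent_identity` and `meets_identity_threshold` are hypothetical and unrelated to the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / alignment length.
    Gap columns ('-') never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """Return True if the identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides (placeholders, not sequences from the sequence listing).
    print(percent_identity("MKTAY", "MKTAF"))
    print(meets_identity_threshold("MKTAY", "MKTAF"))
```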

SEQ ID NOs: 33/34, 35/36, 37/38, 39/40, and 41/42 are sequences from Bacillus stearothermophilus (UniProt: P42327), Cupriavidus necator (UniProt: F8GNE5), Corynebacterium glutamicum (UniProt: A4QHJ5), Bacillus methanolicus (UniProt: I3E949), and Bacillus methanolicus (UniProt: I3E949) with the mutations Q5L and A363L, respectively.

In a second conversion, formaldehyde is oxidized to formate. In the exemplified E. coli this is done via the endogenous glutathione system, so that once formaldehyde is formed it is oxidized to formate without the necessity of additional genomic engineering (Gutheil, et al. (1997), Biochemical and biophysical research communications, 238(3):693-696). Alternative pathways for oxidizing formaldehyde to formate are described in Lidstrom (2006), Volume 2: Ecophysiology and Biochemistry, pp.618-634.

It is of note that the oxidation of methanol to formate is expected to yield sufficient energy (i.e. reducing power) to enable the genetically engineered microorganism to grow efficiently. Hence, in connection with the microorganism being capable of converting methanol to formate it might not be required to express FDH in order to provide additional energy (i.e. reducing power). For this reason FDH is optionally present in the genetically engineered microorganism of the invention.

In accordance with a further preferred embodiment of the first aspect of the invention the microorganism is capable of converting methane to formate.

Similarly, the genetically engineered microorganism of the invention can be further genetically engineered, so that it can convert methane to formate. Such a further genetically engineered microorganism is capable of growing on methane as the sole carbon source.

In order to enable the genetically engineered microorganism of the invention to convert methane to formate it is preferred to additionally express methane monooxygenase (MMO). Methane monooxygenase (MMO) is an enzyme capable of oxidizing the C—H bond in methane as well as other alkanes. Methane monooxygenase belongs to the class of oxidoreductase enzymes (EC 1.14.13.25). Methane monooxygenase is the first enzyme in the metabolic pathway of methanotrophs, which are bacteria that use methane as source of carbon and energy. The MMO can be selected from the membrane-bound particulate MMO (pMMO), soluble MMO (sMMO), and the related enzyme ammonia monooxygenase (AMO).

Since methane hydroxylation results in methanol (CH₄ + ½ O₂ → CH₃OH), the genetically engineered microorganism of the invention being capable of converting methane to formate is generally also engineered to convert methanol to formate, preferably by expressing a methanol dehydrogenase as discussed herein above.

In accordance with a still further preferred embodiment of the first aspect of the invention the microorganism is capable of converting CO₂ to formate.

Hence, the genetically engineered microorganism of the invention can also be further genetically engineered, so that it can convert CO₂ to formate. Such a further genetically engineered microorganism is capable of growing on CO₂ as the sole carbon source.

One option for converting CO₂ to formate is the electrochemical reduction of CO₂ to formate using formate dehydrogenase, e.g. the metal-independent enzyme type of formate dehydrogenase (FDH) derived from Candida boidinii yeast; see Buddhinie et al. Acc. Chem. Res. 2019, 52, 3, 676-685. In more detail, Buddhinie et al. describe the reduction of CO₂ to formate using electrochemically reduced methyl-viologen as an artificial electron donor. Another option for catalyzing the reduction of CO₂ is the use of a carbon dioxide reductase, in particular the hydrogen-dependent carbon dioxide reductase from Acetobacterium woodii; see Schuchmann and Müller (2013), Science, 342(6164):1382-1385. A yet further option is the use of a hydrogenase enzyme to obtain NADH and the use of a formate dehydrogenase using the produced NADH for CO₂ reduction.

The present invention relates in a second aspect to a method for growing the microorganism as defined in accordance with the first aspect of the invention which expresses FDH, comprising culturing the microorganism under growth conditions comprising formate as the sole carbon source.

It is of note that in accordance with this method the microorganism also has to express FDH since the activity of this enzyme is required to produce enough energy (i.e. reducing power) to enable the microorganism to grow efficiently.

Means and methods for culturing microorganisms under growth conditions are established in the art; see, for example Lodish (2000), Molecular Cell Biology, 4th edition, section 6.1, W.H. Freeman and Company. Many prokaryotes (i.e., bacteria) and single-celled eukaryotes such as yeast, all of which grow in nature as single cells, can be easily grown in culture dishes.

The present invention relates in a third aspect to a method for growing the microorganism being capable of converting methanol to formate as defined in accordance with the first aspect of the invention, comprising contacting the microorganism under growth conditions comprising methanol as the sole carbon source.

For the reasons discussed above, the microorganism being capable of converting methanol to formate might not be required to express FDH.

The present invention relates in a fourth aspect to a method for growing the microorganism being capable of converting methane to formate as defined in accordance with the first aspect of the invention which expresses FDH, comprising culturing the microorganism under growth conditions comprising methane as the sole carbon source.

In this case the microorganism has to express FDH in order to have enough energy (i.e. reducing power) for efficient growth.

The present invention relates in a fifth aspect to a method for growing the microorganism being capable of converting CO₂ to formate as defined in accordance with the first aspect of the invention which expresses FDH, comprising culturing the microorganism under growth conditions with CO₂ as the sole carbon source.

Also in this case the microorganism has to express FDH in order to have enough energy (i.e. reducing power) for efficient growth.

Moreover, in this method generally an electron donor is required in order to reduce CO₂ to formate. Hence, culturing the microorganism under growth conditions with CO₂ as the sole carbon source is generally under conditions wherein electrons from an electron donor reduce CO₂ to formate. The electron donor is preferably H₂ or CO.

The CO₂ may be provided in the form of syngas. Syngas is a fuel gas mixture consisting primarily of H₂, CO and CO₂. Hence, syngas comprises CO₂ as the sole carbon source as well as H₂ and CO as electron donors to reduce CO₂ to formate.

A microorganism is an organism of microscopic or ultramicroscopic size that generally cannot be seen by the naked eye. Viruses are not classified as microorganisms in accordance with the present disclosure. Examples of microorganisms include bacteria, archaea, protozoa, fungi and algae. The microorganism is preferably a bacterium or fungus and is most preferably a bacterium. The fungus is preferably yeast and is most preferably Saccharomyces cerevisiae. Preferred examples of the bacterium will be discussed herein below. The microorganism of the invention is generally capable of converting a C₁ feedstock (e.g. formate, methanol, methane or CO₂) into pyruvate, glycerate, or acetyl-CoA via the formation of glycine, in particular under aerobic conditions. This conversion provides the microorganism with the required energy to survive and grow.

The term “genetically engineered microorganism” as used herein designates a genetically engineered microorganism that has been genetically engineered to express at least one enzyme that is not expressed by a corresponding (wild-type) microorganism that has been used to prepare the genetically engineered microorganism. It follows that the genetically engineered microorganism has been prepared by technical means and does not occur in nature. In this respect it is to be understood that the term “is not expressed” preferably means that no expression of the at least one enzyme is found in a corresponding (wild-type) microorganism. However, the term “is not expressed” also comprises the situation where essentially no expression of the at least one enzyme is found in a corresponding (wild-type) microorganism.

In this respect, genetic engineering generally means the artificial manipulation, modification, or recombination of DNA or other nucleic acid molecules in order to modify an organism or population of organisms. Genetic engineering can be accomplished using multiple art-established techniques. Non-limiting examples are transformation, transfection and transduction. Transformation is the direct alteration of a genetic component of a cell by passing the genetic material through the cell membrane. The cell membrane can be made amenable for the uptake of the genetic components to be transformed into the cell, for example, by divalent cations (e.g. calcium cations) or electroporation. The process used to insert foreign DNA into a cell is usually called transfection. For instance, liposomes and polymers can be used as vectors to deliver DNA into cultured animal cells via transfection. Positively charged liposomes bind with DNA, while polymers can be designed that interact with DNA. They form lipoplexes and polyplexes, respectively, which are then taken up by the cells. Other transfection techniques, e.g. including using electroporation and biolistics, are also known in the art.

The term “expression” (or gene expression) designates the process by which information from a gene is used in the synthesis of a functional gene product, which product is in the present case a protein. Hence, expression comprises the steps of transcription of a gene into mRNA and the translation of the mRNA into protein. Expression is preferably overexpression. Overexpression is the excessive or high expression of a gene. Overexpression can be achieved, for example, by placing the gene under the control of a stronger promoter as compared to the promoter controlling the gene in nature, by using a modified ribosomal binding site (as described above) or by increasing the gene copy number in the genome. Overexpression is preferably determined in a quantitative PCR (qPCR) reaction in comparison to a constitutively expressed reference gene. Accurate interpretation of qPCR data requires normalization using constitutively expressed reference genes. Ribosomal RNA is often used as a reference gene for transcriptional studies in microorganisms, including E. coli. As an alternative a housekeeping gene can be used. Housekeeping genes are typically constitutively expressed genes that are required for the maintenance of basic cellular function of a microorganism. Non-limiting but preferred examples of such housekeeping genes are the cysG, hcaT, idnT and rssA genes which all can be found in E. coli. In the appended examples the rssA gene is used (FIG. 7).

A gene is preferably determined to be overexpressed by qPCR if the normalized expression level of the gene as determined by qPCR in the genetically engineered microorganism of the invention is at least 2-fold, preferably at least 5-fold, more preferably at least 10-fold higher and most preferably at least 20-fold higher as compared to the normalized expression level as determined by qPCR in the corresponding wild-type microorganism. For example, the GcvT, GcvH and GcvP genes to be preferably used herein are from E. coli. In this case, the GcvT, GcvH or GcvP gene is overexpressed if the normalized expression level of the gene as determined by qPCR in the genetically engineered microorganism of the invention is at least 2-fold, preferably at least 5-fold, more preferably at least 10-fold higher and most preferably at least 20-fold higher as compared to the normalized expression level as determined by qPCR in wild-type E. coli (FIG. 7). Similarly, THF ligase, methenyl-THF cyclohydrolase and methylene-THF dehydrogenase genes to be preferably used herein are from Methylobacterium extorquens. In this case, the THF ligase, methenyl-THF cyclohydrolase or methylene-THF dehydrogenase gene is overexpressed if the normalized expression level of the gene as determined by qPCR in the genetically engineered microorganism of the invention is at least 2-fold, preferably at least 5-fold, more preferably at least 10-fold higher and most preferably at least 20-fold higher as compared to the normalized expression level as determined by qPCR in wild-type Methylobacterium extorquens.
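The comparison of normalized expression levels described above can be illustrated with a short Python sketch. It is not taken from the examples: it assumes that relative expression is calculated from qPCR threshold cycle (Ct) values by the commonly used 2^(-ΔΔCt) approach, which presumes approximately 100% amplification efficiency for both the target and the reference gene. The function names and the Ct values are hypothetical; only the 2-, 5-, 10- and 20-fold thresholds are taken from the text.

```python
# Fold-change thresholds named in the text, from most to least preferred.
THRESHOLDS = (20, 10, 5, 2)


def normalized_fold_change(ct_target_engineered: float,
                           ct_reference_engineered: float,
                           ct_target_wild_type: float,
                           ct_reference_wild_type: float) -> float:
    """Relative expression (engineered vs. wild type) by 2^(-ddCt).

    Assumption: both amplicons amplify with about 100% efficiency.
    """
    # Normalize each target Ct to the reference gene of the same strain.
    delta_engineered = ct_target_engineered - ct_reference_engineered
    delta_wild_type = ct_target_wild_type - ct_reference_wild_type
    return 2.0 ** -(delta_engineered - delta_wild_type)


def overexpression_tier(fold_change: float):
    """Return the highest threshold met (20, 10, 5 or 2), or None."""
    for threshold in THRESHOLDS:
        if fold_change >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical Ct values, chosen only to illustrate the arithmetic.
    fold = normalized_fold_change(20.0, 15.0, 24.0, 15.0)
    print(fold, overexpression_tier(fold))
```

A lower Ct value corresponds to more template, which is why a target Ct that is four cycles lower in the engineered strain (at an unchanged reference Ct) corresponds to a 16-fold higher normalized expression level in this sketch.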

As regards the embodiments characterized in this specification, in particular in the claims, it is intended that each embodiment mentioned in a dependent claim is combined with each embodiment of each claim (independent or dependent) said dependent claim depends from. For example, in case of an independent claim 1 reciting 3 alternatives A, B and C, a dependent claim 2 reciting 3 alternatives D, E and F and a claim 3 depending from claims 1 and 2 and reciting 3 alternatives G, H and I, it is to be understood that the specification unambiguously discloses embodiments corresponding to combinations A, D, G; A, D, H; A, D, I; A, E, G; A, E, H; A, E, I; A, F, G; A, F, H; A, F, I; B, D, G; B, D, H; B, D, I; B, E, G; B, E, H; B, E, I; B, F, G; B, F, H; B, F, I; C, D, G; C, D, H; C, D, I; C, E, G; C, E, H; C, E, I; C, F, G; C, F, H; C, F, I, unless specifically mentioned otherwise.
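The 27 combinations listed above are the Cartesian product of the three sets of alternatives. A minimal Python sketch, given purely for illustration with variable names chosen for this example, enumerates them in the same order as in the preceding paragraph:

```python
from itertools import product

# Alternatives recited in claim 1, claim 2 and claim 3 of the example above.
claim_1 = ("A", "B", "C")
claim_2 = ("D", "E", "F")
claim_3 = ("G", "H", "I")

# Every combination of one alternative per claim: 3 x 3 x 3 = 27.
combinations = list(product(claim_1, claim_2, claim_3))

if __name__ == "__main__":
    print(len(combinations))
    print("; ".join(", ".join(combo) for combo in combinations))
```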

Similarly, and also in those cases where independent and/or dependent claims do not recite alternatives, it is understood that if dependent claims refer back to a plurality of preceding claims, any combination of subject-matter covered thereby is considered to be explicitly disclosed. For example, in case of an independent claim 1, a dependent claim 2 referring back to claim 1, and a dependent claim 3 referring back to both claims 2 and 1, it follows that the combination of the subject-matter of claims 3 and 1 is clearly and unambiguously disclosed as is the combination of the subject-matter of claims 3, 2 and 1. In case a further dependent claim 4 is present which refers to any one of claims 1 to 3, it follows that the combination of the subject-matter of claims 4 and 1, of claims 4, 2 and 1, of claims 4, 3 and 1, as well as of claims 4, 3, 2 and 1 is clearly and unambiguously disclosed.

The figures show.

FIG. 1—The synthetic reductive glycine pathway is similar in structure to the reductive acetyl-CoA pathway. Yet, while the latter pathway is restricted to anaerobic conditions, the former can operate under aerobic conditions. Both pathways are highly ATP-efficient, as only 1-2 ATP molecules are consumed in the conversion of formate to pyruvate (e.g., instead of 7 by the Calvin Cycle). Molecular structure in brown corresponds to a sub-structure of tetrahydrofolate. Enzymes of the reductive glycine pathway, as implemented in this study, are indicated in purple (Lpd, unlike the other enzymes of the glycine cleavage system, was not overexpressed). ‘Me’ corresponds to Methylobacterium extorquens and ‘Ec’ corresponds to Escherichia coli. Division of the pathway into modules, as explained in the text, is shown in light brown to the right of the figure.

FIG. 2—Modular establishment of the reductive glycine pathway. (A) Selection scheme of C₁M and C₂M for the biosynthesis of C₁-moieties, glycine, and serine. (B) Overexpression of C₁M and C₂M enabled growth with formate (and CO₂) as sole source of C₁-moieties, glycine, and serine. (C) Selection scheme of C₁M, C₂M, and C₃M to generate biomass building blocks, where acetate oxidation provides reducing power and energy. Deletion of aceA prevents acetate from being used as a carbon source. (D) Overexpression of C₁M, C₂M, and C₃M enabled growth with formate as source of biomass and acetate as an energy source. Genomic integration of C₃M was performed in a strain in which the endogenous glyA and sdaA were deleted. (E) Selection scheme of C₁M, C₂M, C₃M, and EM to use formate as sole carbon and energy source. (F) Growth on formate is demonstrated only when all four modules are overexpressed. Genomic overexpression is indicated by ‘g’, while overexpression from a plasmid is indicated by ‘p’. Experiments were conducted at 10% CO₂ within 96-well plates and were performed in triplicate, which displayed identical growth curves (±5%), and hence were averaged. Doubling times (DT) are shown in the figure.

FIG. 3—Short term evolution improves growth on formate. (A) Test-tube cultivation on formate as sole carbon source. The vertical small red arrows correspond to the addition of formate, increasing the concentration in the medium by 30 mM. Upon reaching an OD₆₀₀ of 0.4, cells were reinoculated into a new test-tube with an initial OD₆₀₀ of 0.03-0.05. Error bars correspond to standard deviation of 2 experiments. Six exemplary cycles of cultivation are shown. (B) Doubling time decreased with cultivation cycle. Error bars correspond to standard deviation of 2 experiments. (C) Growth of the evolved strain (in test-tube) is directly coupled to a decrease in formate concentration. Error bars correspond to standard deviation of 2 experiments. (D) Cultivation of the evolved strain on formate as a sole carbon source within a 96-well plate. Experiments were conducted at 10% CO₂. Plate reader experiments were performed in triplicate, which displayed identical growth curves (±5%), and hence were averaged. Doubling times (DT) are shown in the figure. DT were considerably shorter in the plate reader than in test-tube as the measurements were more accurate (taken every 10 minutes rather than once per day) and since the conditions are different (e.g., more stable cultivation environment in the plate reader).

FIG. 4—Labeling pattern of proteinogenic amino acids confirms the activity of the reductive glycine pathway. As elaborated in FIG. 13, the labeling pattern is consistent with the assimilation of formate and CO₂ via the synthetic pathway, and indicates low cyclic flux via the TCA cycle. Numbers written in italics above the bars correspond to the overall fraction of labeled carbons.

FIG. 5—Engineered growth on methanol. (A) Methanol can be assimilated via the activity of methanol dehydrogenase (MDH), where formaldehyde is oxidized to formate via the native activity of the glutathione system. (B) The Methanol Module (MM) converts methanol to formate and provides the cell with reducing power and energy. (C) Overexpression of MDH from Bacillus stearothermophilus (BsMDH) within the gC₁M gC₂M gC₃M gEM strain, carrying a mutation in the promoter of the pntAB operon (FIG. 12), enabled growth on methanol within a 96-well plate. Experiments were conducted at 10% CO₂. Plate reader experiments were performed in triplicate, which displayed identical growth curves (±5%), and hence were averaged. (D) Comparison of growth on methanol (shown are final cell densities) with different expressed enzymes and in different genetic backgrounds. NAD-dependent MDH from several organisms was tested: Bacillus stearothermophilus (BsMDH), Corynebacterium glutamicum (CgMDH), and Cupriavidus necator N-1 (CnMDH), as well as two MDHs from B. methanolicus (BmMDH2 and BmMDH3) and an improved variant (BmMDH2*, carrying Q5L A363L modifications). Formaldehyde dehydrogenases from Pseudomonas putida (PpFADH; SEQ ID NOs: 49 and 50) and Pseudomonas aeruginosa (PaFADH; SEQ ID NOs: 51 and 52) were further tested. (E) Labeling pattern of proteinogenic amino acids upon feeding with ¹³C-methanol/¹²-CO₂ is identical to that with ¹³C-formate/¹²-CO₂ (FIG. 4), confirming the activity of the reductive glycine pathway. Numbers written in italics above the bars correspond to the overall fraction of labeled carbons.

FIG. 6—Schematic overview of overexpression strategy. Gene overexpression from plasmid is shown in the left column while genomic overexpression is shown in the right column. Promoter and ribosome binding sites are as described in a previous manuscript*. Genomic ‘safe spots’ were described previously**. (* S. Wenk, O. Yishai, S. N. Lindner, A. Bar-Even, An engineering approach for rewiring microbial metabolism. Methods Enzymol 608, 329-367 (2018); ** M. C. Bassalo et al., Rapid and efficient one-step metabolic pathway integration in E. coli, ACS synthetic biology 5, 561-568 (2016))

FIG. 7—Replacement of the native promoter of the GCV operon with a strong constitutive promoter increases gene expression 20-50 fold in a serine auxotroph (SerAux) strain (ΔserA Δkbl ΔltaE ΔaceA). Transcript levels were normalized to the expression of the rrsA gene and are shown relative to the expression of a WT (non-serine auxotroph) strain. As a comparison, the transcript levels induced by a weak constitutive promoter and moderate constitutive promoter are shown*. Experiments were performed in triplicate. (* S. Wenk, O. Yishai, S. N. Lindner, A. Bar-Even, An engineering approach for rewiring microbial metabolism. Methods Enzymol 608, 329-367 (2018))

FIG. 8—Different expression approaches of the genes of C3M (glyA and sdaA) affect growth via the reductive glycine pathway, with acetate serving as an energy source. Expression on a plasmid resulted in identical growth regardless of the promoter strength (green and blue lines). Overexpression of sdaA alone failed to achieve growth (pink and purple lines). Genomic expression (after deletion of endogenous glyA and sdaA) resulted in better growth when gene expression was controlled by a medium strength ribosome binding site (‘C’, pale blue line) than by a strong ribosome binding site (‘A’, brown line). ‘g’ corresponds to genomic expression and ‘p’ to expression on a plasmid. Origin of replication, promoters, and ribosome binding sites are described in a previous study*. (* S. Wenk, O. Yishai, S. N. Lindner, A. Bar-Even, An engineering approach for rewiring microbial metabolism. Methods Enzymol 608, 329-367 (2018))

FIG. 9—Different expression approaches for the gene of the EM (formate dehydrogenase) affect growth via the reductive glycine pathway. Expression on a plasmid supported growth. Genomic overexpression supported growth only when the ribosome binding site was of the highest strength (‘A’). ‘g’ corresponds to genomic expression and ‘p’ to expression on a plasmid. Origin of replication, promoters, and ribosome binding sites are described in a previous study*. (* S. Wenk, O. Yishai, S. N. Lindner, A. Bar-Even, An engineering approach for rewiring microbial metabolism. Methods Enzymol 608, 329-367 (2018))

FIG. 10—Number of colony forming units increases monotonically with OD600 for cells growing on formate as sole carbon source.

FIG. 11—Cell growth on formate directly correlates with increased medium pH due to the accumulation of OH⁻.

FIG. 12—Two mutations emerged within the formatotrophic strain after a short period of evolution. (A) A point mutation in the 5′-UTR of the FDH gene. (B) A point mutation in the promoter of the pntAB gene. Strain K4 corresponds to a strain in which the four modules of the reductive glycine pathway were introduced into its genome, that is, gC1M gC2M gC3M gEM, while strain K4e is the same strain after short-term evolution.

FIG. 13—Change in transcript level in the evolved strain. (A) Levels of FDH transcript increased 2.7-fold in the evolved strain. (B) Levels of pntAB transcript increased by ~14-fold in the evolved strain. In both cases transcript levels were normalized to the rrsA gene and are shown relative to the expression within a non-evolved strain. Experiments were performed in triplicate. Strain K4 corresponds to a strain in which the four modules of the reductive glycine pathway were introduced into its genome, that is, gC1M gC2M gC3M gEM, while strain K4e is the same strain after short-term evolution.

FIG. 14—The evolved strain displays 7.4-fold higher FDH activity in cell extract. FDH activity was measured in a 96-well plate by the addition of formate and NAD⁺ and was followed by the increase in absorbance at 340 nm due to the accumulation of NADH. The results were normalized to mg of total cell protein. Strain K4 corresponds to a strain in which the four modules of the reductive glycine pathway were introduced into its genome, that is, gC1M gC2M gC3M gEM, while strain K4e is the same strain after short-term evolution.

FIG. 15—Introduction of the two mutations found in genome sequencing of the evolved strain (5′-UTR of fdh and promoter region of pntAB) improved growth on formate dramatically and resulted in a growth pattern very similar to that of the evolved strain (see FIG. 3C). Cultivation of the evolved strain on formate as a sole carbon source within a 96-well plate. Experiments were conducted at 10% CO₂. Plate reader experiments were performed in triplicate, which displayed identical growth curves (±5%), and hence were averaged. Strain K4 corresponds to a strain in which the four modules of the reductive glycine pathway were introduced into its genome, that is, gC1M gC2M gC3M gEM.

FIG. 16—Addition of 100 mM sodium bicarbonate enables growth on higher concentrations of formate, as demonstrated with the evolved K4 strain and a K4 strain into whose genome the two mutations found in the evolved strain were introduced. Cultivation of the evolved strain on formate as a sole carbon source within a 96-well plate. Experiments were conducted at 10% CO₂. Plate reader experiments were performed in triplicate, which displayed identical growth curves (±5%), and hence were averaged. Strain K4 corresponds to a strain in which the four modules of the reductive glycine pathway were introduced into its genome, that is, gC1M gC2M gC3M gEM.

FIG. 17—Expected labeling of proteinogenic amino acids upon feeding with ¹³C-formate/¹²CO₂ or ¹²C-formate/¹³CO₂ and according to different metabolic scenarios.

FIG. 18—Number of colony forming units increases monotonically with OD600 for cells growing on methanol as sole carbon source.

FIG. 19—Addition of 100 mM sodium bicarbonate increases final OD600 on methanol, reaching 0.9 instead of 0.2 (FIG. 5C). Consumption of methanol is depicted by the bars: the grey bars correspond to methanol concentration in a test tube without cells (concentration decrease due to evaporation), while the blue bars represent the concentration of methanol in a test tube in which cells are growing on methanol.

The examples illustrate the invention.

EXAMPLE 1—RESULTS
The Reductive Glycine Pathway

Escherichia coli, like most other key biotechnological microorganisms, cannot naturally grow on C₁ feedstocks. In this study, the aim was to design and engineer a simple, linear synthetic pathway that could support E. coli growth on formate or methanol as sole carbon source. The inspiration came from the anaerobic reductive acetyl-CoA pathway (rAcCoAP)²³, which assimilates C₁ compounds very efficiently. The reductive glycine pathway (rGlyP), as shown in FIG. 1, was designed to be the aerobic twin of the rAcCoAP²⁴. Both are linear routes with limited overlap with central metabolism, minimizing the need for regulatory optimization. Both pathways start with the ligation of formate and tetrahydrofolate (THF) and proceed via reduction into a C₁-THF intermediate, which is then condensed, within an enzyme complex, with CO₂ to generate a C₂ compound (acetyl-CoA or glycine). The C₂ compound is finally condensed with another C₁ moiety and metabolized to generate pyruvate as biomass precursor. Importantly, both the rAcCoAP and the rGlyP are characterized by a ‘flat’ thermodynamic profile²⁴,²⁵, that is, both are mostly reversible such that the direction of the metabolic flux they carry is determined mainly by the concentrations of their substrates and products. This thermodynamic profile, while constraining the driving force of the pathway reactions²⁶, indicates very high energetic efficiency, where no energetic input, e.g., in the form of ATP hydrolysis, is wasted. Indeed, both pathways are associated with a very low ATP cost: only 1-2 ATP molecules are invested in the metabolism of formate to pyruvate²⁴. Yet, unlike the rAcCoAP, the key enzymatic components of which are highly oxygen sensitive, the rGlyP can operate under full aerobic conditions. Hence, the rGlyP represents the most efficient theoretical route (in terms of energy utilization, resource consumption, and biomass yield) to assimilate formate in the presence of oxygen²⁴.

Modular-Engineering Approach Establishes Growth on Formate

To facilitate the establishment of formatotrophic growth, the rGlyP was divided into four metabolic modules (FIG. 6): (i) a C₁ Module (C₁M), consisting of formate THF ligase, methenyl-THF cyclohydrolase, and methylene-THF dehydrogenase, all from Methylobacterium extorquens³⁴, together converting formate into methylene-THF; (ii) a C₂ Module (C₂M), consisting of the endogenous enzymes of the GCS (GcvT, GcvH, and GcvP), which condense methylene-THF with CO₂ and ammonia to give glycine; (iii) a C₃ Module (C₃M), consisting of serine hydroxymethyltransferase (SHMT) and serine deaminase, together condensing glycine with another methylene-THF to generate serine and finally pyruvate; and (iv) an Energy Module (EM), which consists of formate dehydrogenase (FDH) from Pseudomonas sp. (strain 101)³⁵, generating reducing power and energy from this C₁ feedstock.

The strategy was to establish the activities of the different modules in consecutive steps, integrating subsequent modules and selecting for their combined activity. The starting point was an E. coli strain that is auxotrophic for serine, glycine, and C₁ moieties (ΔserA Δkbl ΔltaE ΔaceA), where the first deletion abolishes native serine biosynthesis, the second and third abolish threonine cleavage to glycine, and the final deletion prevents the formation of glyoxylate that could potentially be aminated to glycine³². The combined activity of the C₁M and the C₂M, together with the native activity of SHMT, should enable the cell to metabolize formate into C₁-THF, glycine, and serine, relieving these auxotrophies (FIG. 2A).

Into the serine auxotroph strain, the enzymes of the C₁M and the C₂M were introduced, either on a plasmid or in the genome (FIG. 6). For genome integration of the C₁M, all relevant enzymes were combined into one operon, under the regulation of a strong constitutive promoter³⁶, which was inserted into a genomic ‘safe spot’, SS9³⁷. In the case of the C₂M, the native promoter of the GCS was replaced with a strong constitutive one (FIG. 6), increasing transcript levels 20-50 fold (FIG. 7). As expected, growth with formate was observed upon overexpression of both modules (FIG. 2B) and was dependent upon a high CO₂ concentration (10% in the headspace), which thermodynamically and kinetically supports the reductive activity of the GCS. While genomic integration of the enzymes of the C₁M (gC₁M) did not improve growth compared to plasmid expression (pC₁M), replacing plasmid-borne expression of the enzymes of the C₂M (pC₂M) with genomic overexpression (gC₂M) supported a higher growth rate (FIG. 2B).

Next, the aim was to establish formate as the primary carbon source, which requires high expression of the enzymes of the C₃M to convert glycine into the central metabolism intermediate pyruvate (FIG. 2C). To enable formate assimilation to biomass, an energy source is required, which at this stage was chosen to be acetate. The TCA cycle can fully oxidize acetate to generate reducing power and energy, while the deletion of isocitrate lyase (ΔaceA) abolishes the activity of the glyoxylate shunt, thus preventing the cell from using this molecule as a carbon source. Growth should thus be dependent on formate assimilation via the rGlyP for biomass generation and acetate oxidation for the production of reducing power and energy (FIG. 2C).

The enzymes of the C₃M were either overexpressed on a plasmid (pC₃M) or in the genome (gC₃M) (FIG. 6); in the latter case, the native glyA and sdaA were deleted and a synthetic operon harboring both genes under the regulation of a strong constitutive promoter was introduced into another genomic ‘safe spot’, SS7³⁷. Overexpression of the enzymes of the C₃M, within a strain that genomically expresses the enzymes of the C₁M and the C₂M, resulted in growth on formate and acetate (at 10% CO₂) (FIG. 2D). Genomic expression of the C₃M supported more robust growth compared to the C₃M expressed from a plasmid. To confirm that the expression level of the C₃M does not constrain the growth rate, a strain was tested in which the expression of glyA and sdaA is controlled by a stronger ribosome binding site (RBS-A instead of RBS-C³⁶). It was found that this strain grows rather poorly (FIG. 8), indicating that higher expression of these genes is deleterious.

Finally, the aim was to introduce the EM such that formate can serve as sole carbon and energy source (FIG. 2E). Overexpression of FDH on a plasmid (FIG. 6), in the strain carrying the genes of the C₁M, C₂M and C₃M in the genome, enabled growth on formate (FIG. 9). However, when FDH was introduced into yet another genomic ‘safe spot’, SS10³⁷, it failed to establish growth (FIG. 9), suggesting that the expression level of FDH was too low. Therefore, a strain was tested in which the genomic expression of FDH was controlled by a stronger ribosome binding site (RBS-A instead of RBS-C³⁶, FIG. 6). This strain, carrying no plasmid, was able to grow on formate as a sole carbon and energy source (FIG. 2F and FIG. 9). Growth on formate was also observed in a test tube and confirmed by recording monotonically increasing colony-forming units with increased OD (FIG. 10). This is the first case in which growth on formate was made possible in a microorganism that cannot assimilate C₁ compounds natively.

Short-Term Evolution Improves Growth on Formate

To improve growth on formate, a short-term evolution experiment was conducted in fed-batch mode. The engineered strain was cultivated in test tubes, where formate was added every 3-6 days, increasing the concentration in the medium by 30 mM (red arrows in FIG. 3A). Once cell turbidity reached an OD₆₀₀ of 0.4, the cells were diluted to an OD₆₀₀ of 0.03-0.05 and a new cycle of cultivation was started (FIG. 3A shows six typical cycles).

Within 13 cultivation cycles (≤40 generations), growth rate on formate was substantially improved (FIG. 3A), with the doubling time dropping from 65-80 h in the first two cycles to less than 10 h in the last cycle (FIG. 3B). Growth yield on formate also improved, from ≈1.5 gCDW/mol-formate in the first cycle to 2.3±0.2 gCDW/mol-formate in the last. This yield is similar to that of microorganisms growing autotrophically on formate via the Calvin cycle (3.2±1.1 gCDW/mol-formate³⁸). The growth of the evolved bacterium on formate was directly coupled to a decrease in the concentration of the feedstock in the medium (FIG. 3C). Furthermore, as formatotrophy consumes protons (net oxidation and net assimilation both consume formic acid rather than formate), a direct correlation was observed between cell density and the pH of the medium (FIG. 11).
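The doubling times quoted above can be estimated from exponential-phase OD₆₀₀ readings. The following is a minimal Python sketch, assuming purely exponential growth over the selected window; the function name `doubling_time_hours` and the synthetic data are illustrative assumptions and are not taken from the study.

```python
import math


def doubling_time_hours(times_h, od600):
    """Estimate doubling time (h) from a least-squares fit of ln(OD600) versus time.

    Hypothetical helper: assumes all readings lie within the exponential phase.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    growth_rate = sxy / sxx  # specific growth rate, in 1/h
    return math.log(2) / growth_rate


# Synthetic data (not measured values): a culture doubling every 8 h from OD600 = 0.04.
times = [0, 4, 8, 12, 16]
ods = [0.04 * 2 ** (t / 8) for t in times]
print(round(doubling_time_hours(times, ods), 2))
```

Fitting the slope of ln(OD₆₀₀) rather than comparing two single time points makes the estimate less sensitive to noise in any one reading.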

To better characterize growth on formate, growth experiments were conducted in 96-well plates, automatically measuring OD₆₀₀ every ~10 minutes. It was found that maximal cell density increased monotonically with increasing formate concentration from 10 mM to 150 mM (FIG. 3D). Similarly, the doubling time decreased monotonically with increasing formate concentration: from 17 hours with 10 mM formate to less than 8 hours at formate concentrations higher than 100 mM (FIG. 3D). The cellular toxicity of formate, which is attributed to inhibition of cytochrome c oxidase³⁹ and dissipation of the proton motive force⁴⁰, probably explains the increased lag time at formate concentrations of 109 mM and 153 mM, and the failure to grow at higher concentrations.

Adaptive laboratory evolution usually requires hundreds of generations to improve the fitness of E. coli in a substantial way⁴¹⁻⁴³. The strain required less than 40 generations, presumably as the growth of the parent strain was so poor that a small number of mutations was sufficient to drastically improve fitness. To check whether this is indeed the case, multiple colonies of the evolved strain were isolated and their genomes were sequenced. Two mutations were found which occurred in all sequenced colonies (FIG. 12). The first was a single base-pair substitution in the 5′-UTR of the newly introduced FDH gene, which increased the level of transcript 2.5-fold (FIG. 13) and resulted in a 7.4-fold increase in formate oxidation activity in cell extract assays (FIG. 14). The second mutation was a single base-pair substitution in the promoter region of pntAB, which encodes the membrane-bound transhydrogenase. This mutation increased transcript level by more than 13-fold (FIG. 13). The beneficial effect of these two mutations is to be expected, as the first increases energy supply to the cell from formate and the second increases the availability of NADPH, a key cofactor for the activity of the rGlyP (consumed by methylene-THF dehydrogenase), the supply of which could limit pathway activity.
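The cell-extract FDH assay referred to above follows NADH accumulation at 340 nm. A minimal Python sketch of how a specific activity could be derived from the absorbance slope is given below. It assumes the commonly used NADH extinction coefficient of 6.22 mM⁻¹ cm⁻¹; the function name, the optical path length, the assay volume, and the example numbers are hypothetical and are not values from the study.

```python
NADH_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm (commonly used literature value)


def specific_activity(dA340_per_min, path_cm, assay_volume_ml, protein_mg):
    """Return µmol NADH formed per minute per mg of total protein.

    Hypothetical helper based on the Beer-Lambert law; all inputs are
    assumptions supplied by the user, not parameters reported in the study.
    """
    if path_cm <= 0 or protein_mg <= 0:
        raise ValueError("path length and protein amount must be positive")
    rate_mM_per_min = dA340_per_min / (NADH_EXT_COEFF * path_cm)
    umol_per_min = rate_mM_per_min * assay_volume_ml  # 1 mM equals 1 µmol per ml
    return umol_per_min / protein_mg


# Illustrative numbers only: parent and evolved extracts with equal protein load.
parent = specific_activity(0.0622, 1.0, 0.2, 0.01)
evolved = specific_activity(0.0622 * 7.4, 1.0, 0.2, 0.01)
print(round(parent, 3), round(evolved / parent, 1))
```

Because both extracts are normalized to the same protein amount, the fold change depends only on the ratio of the absorbance slopes.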

To confirm that the two mutations suffice to support the improved growth on formate, Multiplex Automated Genomic Engineering (MAGE⁴⁴) was used to introduce these mutations into a non-evolved strain. It was found that while the parent strain could hardly grow in 96-well plates, the strain in which the two mutations were present displayed a growth profile almost identical to that of the evolved strain (FIG. 15). It was therefore concluded that overexpression of FDH and PntAB was sufficient to enable the observed improved growth on formate. By further optimizing cultivation conditions, it was found that addition of 100 mM sodium bicarbonate to the medium enabled the evolved strain, as well as the reconstructed strain, to grow at higher formate concentrations, tolerating even 300 mM (FIG. 16).

Carbon Labeling Confirms Pathway Activity and Sheds Light on Cellular Fluxes

To confirm that growth on formate indeed proceeds via the rGlyP, carbon labeling experiments were performed. The cultures were fed with ¹³C-formate/¹²CO₂, ¹²C-formate/¹³CO₂, and ¹³C-formate/¹³CO₂, and the labeling pattern of proteinogenic amino acids was measured using liquid chromatography-mass spectrometry. The focus was on 7 amino acids (glycine, serine, alanine, valine, proline, threonine, and histidine), which either directly relate to the activity of the rGlyP or originate from different parts of central metabolism, thus providing an indication of key metabolic fluxes.

As shown in FIG. 4, the amino acid labeling confirms the activity of the rGlyP. Specifically, feeding ¹³C-formate/¹²CO₂ resulted in single labeled glycine and double labeled serine and pyruvate (as indicated by the labeling of alanine). As valine, which is derived from two pyruvate molecules, one of which loses its carboxylic acid carbon, is mostly quadruple labeled, it was deduced that pyruvate is labeled in its two non-carboxylic carbons, as predicted for growth via the rGlyP (FIG. 17). Conversely, feeding ¹²C-formate/¹³CO₂ resulted, as expected, in single labeled glycine, serine and pyruvate. As valine is also single labeled, it was deduced that pyruvate is labeled in its carboxylic carbon, again confirming the activity of the rGlyP (FIG. 17). Upon feeding ¹³C-formate/¹³CO₂, all amino acids were nearly completely labeled, where the overall fraction of labeled carbon (marked above the bars in FIG. 4 in italics) is 97-98%, as expected by feeding with 99% ¹³C-labeled formate and 99% ¹³C-labeled CO₂.
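The labeling logic described above can be traced carbon by carbon. The following Python sketch is a simplified illustration, assuming ideal 100% labeled substrates and no label scrambling: glycine takes its carboxyl carbon from CO₂ and its second carbon from formate-derived methylene-THF, serine adds a second formate-derived carbon, pyruvate (read out as alanine) keeps the serine skeleton, and valine combines one whole pyruvate with a second pyruvate that has lost its carboxyl carbon. The function name and data layout are hypothetical.

```python
def expected_labeled_carbons(formate_13c, co2_13c):
    """Return the expected number of 13C atoms per amino acid for growth via the rGlyP.

    Simplified model (an assumption of this sketch): fully labeled substrates,
    no dilution by unlabeled carbon and no exchange reactions.
    """
    f = 1 if formate_13c else 0  # label state of formate-derived carbons
    c = 1 if co2_13c else 0      # label state of CO2-derived carbons

    glycine = [c, f]              # C1 (carboxyl) from CO2, C2 from methylene-THF
    serine = glycine + [f]        # C3 from a second methylene-THF
    pyruvate = list(serine)       # deamination keeps the carbon skeleton
    valine = pyruvate + pyruvate[1:]  # second pyruvate loses its carboxyl carbon

    return {
        "glycine": sum(glycine),
        "serine": sum(serine),
        "alanine": sum(pyruvate),
        "valine": sum(valine),
    }


print(expected_labeled_carbons(True, False))
print(expected_labeled_carbons(False, True))
```

Under these assumptions the sketch reproduces the pattern stated in the text: one, two, two, and four labeled carbons for glycine, serine, alanine, and valine with labeled formate, and one labeled carbon in each with labeled CO₂.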

The labeling of threonine (derived from oxaloacetate) and proline (derived from 2-ketoglutarate) sheds light on the flux via the anaplerotic reactions and the TCA cycle. Specifically, if cyclic flux via the TCA cycle were to predominate over anaplerotic flux, threonine and proline would be expected to be almost fully labeled upon feeding with ¹³C-formate and almost fully unlabeled when feeding with ¹³CO₂ (FIG. 17). Conversely, if anaplerotic flux and non-cyclic flux were to predominate over the cyclic flux, then threonine would be expected to be mostly double labeled on either ¹³C-formate or ¹³CO₂ and proline would be expected to be mostly quadruple labeled on ¹³C-formate and single labeled on ¹³CO₂ (FIG. 17). The results shown in FIG. 4 are thus consistent with high anaplerotic flux and low cyclic flux. This indicates that the cell obtains sufficient reducing power and energy from formate oxidation via FDH, and hence does not need to wastefully oxidize the assimilated carbons within pyruvate and acetate (i.e., investing cellular resources for C₁ assimilation, only to completely oxidize the assimilated product).

Engineered Growth of E. coli on Methanol

Next, the aim was to use the rGlyP for methanol assimilation. A single enzyme, methanol dehydrogenase (MDH), can convert methanol to formaldehyde, which can be oxidized to formate by the endogenous glutathione system⁴⁵ (FIG. 5A). The expression of MDH can thus be regarded as the introduction of another module, the Methanol Module (MM), that serves to metabolize methanol to formate, while providing the cells with reducing power (FIG. 5B). NAD-dependent MDH from several organisms was tested: Bacillus stearothermophilus (BsMDH)¹⁹, Corynebacterium glutamicum (CgMDH)⁴⁶, and Cupriavidus necator N-1 (CnMDH)⁴⁷, as well as two MDHs from Bacillus methanolicus (BmMDH2 and BmMDH3)¹⁰,⁴⁸ and an improved variant (BmMDH2*, carrying Q5L A363L modifications)⁴⁸. These MDH variants were expressed on plasmids in three genetic backgrounds: the parent strain (gC₁M gC₂M gC₃M gEM), the evolved strain, and the parent strain into which the mutation within the promoter of pntAB (FIG. 12) was introduced via MAGE. Overexpression of BsMDH supported growth on 600 mM methanol, which was most efficient in the latter strain (FIG. 5C) and somewhat poorer in the other strains (FIG. 5D). Growth was confirmed by observing monotonically increasing colony-forming units with increased OD (FIG. 18). The other MDH variants failed to support growth (FIG. 5D, final OD₆₀₀ not higher than at inoculation, as indicated by the brown dashed line).

To confirm that growth on methanol indeed depends on formaldehyde oxidation via the glutathione system, the endogenous gene encoding S-(hydroxymethyl)glutathione dehydrogenase was deleted (ΔfrmA) in the above strains. The deletion was found to completely abolish growth on methanol (FIG. 5D), confirming the essentiality of the glutathione system to the observed growth. Moreover, overexpression of NAD-dependent formaldehyde dehydrogenase from Pseudomonas putida (PpFADH; SEQ ID NOs: 49 and 50), as demonstrated in a previous study¹², or from Pseudomonas aeruginosa (PaFADH⁴⁹; SEQ ID NOs: 51 and 52) did not improve growth on methanol (FIG. 5D), indicating that the endogenous glutathione system is sufficiently fast and that the rate-limiting step lies in methanol oxidation.

To confirm that growth on methanol indeed proceeds via the rGlyP, a carbon labeling experiment was performed. The cultures were fed with ¹³C-methanol/¹²CO₂ and the labeling pattern of the proteinogenic amino acids described above was measured. The measured labeling pattern (FIG. 5E) was essentially identical to that observed with ¹³C-formate/¹²CO₂ (FIG. 4), confirming that growth on methanol takes place via the synthetic route.

Notably, the growth rate on methanol was considerably lower than that on formate, with a doubling time of 54±5.5 h. This can be attributed to the slow rate of methanol oxidation. The observed biomass yield was 4.2±0.17 gCDW/mol-methanol, considerably lower than that of microorganisms naturally growing on methanol (7.2±1.2 gCDW/mol-methanol via the Calvin cycle, 12±1.6 gCDW/mol-methanol via the serine cycle, and 15.6±2.7 gCDW/mol-methanol via the RuMP cycle³⁸). It is speculated that the low yield is also related to the slow rate of methanol oxidation: a low growth rate increases the proportional consumption of energy for cell maintenance, thus lowering biomass yield. Addition of 100 mM sodium bicarbonate significantly increased the final OD₆₀₀, but the growth parameters did not improve: doubling time of 55±1 h and biomass yield of 4.2±0.1 gCDW/mol-methanol (FIG. 19, also showing methanol consumption during growth).

Conclusions

This study provides the first demonstration of synthetic formatotrophy and methylotrophy. It is shown that rational design alone can suffice to achieve such a goal, but that short-term evolution can provide useful fine tuning to improve growth characteristics. Further improvement of growth on formate and methanol can be achieved via long-term evolution or via the introduction of metabolic routes that bypass limiting reactions. For example, replacing NAD-dependent MDH with methanol oxidase might reduce biomass yield (as this enzyme dissipates reducing power) but could support a much higher growth rate, as it replaces a thermodynamically and kinetically limited reaction with a favorable and fast one. The C₁-assimilating strains can be further engineered for the production of value-added chemicals. Especially interesting are chemicals that can be derived directly from the rGlyP intermediates or product, and can thus be produced with high yield and productivity. For example, lactate and isobutanol, both of which are derived from pyruvate, should be produced with high yield. Similarly, cysteine, which is derived from serine, a key pathway intermediate, might be an ideal product. Coupling the abiotic synthesis of formate and methanol with their microbial conversion to chemicals of interest will enable an integrated process for the valorization of CO₂ into renewable commodities.

EXAMPLE 2—MATERIAL AND METHODS
Chemicals and Reagents

Primers were synthesized by Integrated DNA Technologies (IDT, Leuven, Belgium). PCR reactions were carried out using either Phusion High-Fidelity DNA Polymerase or DreamTaq. Restrictions and ligations were performed using FastDigest enzymes and T4 DNA ligase, respectively, all purchased from Thermo Fisher Scientific (Dreieich, Germany). Glycine, sodium formate, sodium formate-¹³C, and methanol-¹³C were ordered from Sigma-Aldrich (Steinheim, Germany). ¹³CO₂ was ordered from Cambridge Isotope Laboratories, Inc. (Andover, Mass., USA).

Bacterial Strains

Wild type Escherichia coli strain MG1655 (F⁻ λ⁻ ilvG⁻ rfb-50 rph-1) was used as the host for all genetic modifications. E. coli strain DH5α (F⁻, λ⁻, ϕ80lacZΔM15, Δ(lacZYA-argF)U169, deoR, recA1, endA1, hsdR17(rK⁻ mK⁺), phoA, supE44, thi-1, gyrA96, relA1) and E. coli strain ST18 (pro thi hsdR⁺ Tp^r Sm^r; chromosome::RP4-2 Tc::Mu-Kan::Tn7/λpir ΔhemA)⁵⁰ were used for cloning and conjugation procedures, respectively.

Genome Engineering

Gene knockouts were introduced in MG1655 by P1 phage transduction⁵¹. Single gene knockout mutants from the National BioResource Project (NIG, Japan)⁵² were used as donors of specific mutations. To recycle the selection marker (as multiple gene deletions and integrations were required), all antibiotic cassettes integrated into the genome were flanked by FRT (Flippase Recognition Target) sites. Cells were transformed with a flippase recombinase helper plasmid (FLPe, replicating at 30° C., Gene Bridges), which carries a gene encoding FLP, which recombines at the FRT sites and removes the antibiotic cassette. Elevated temperature (37° C.) was subsequently used to cure the cells of the FLPe plasmid.

Exchange of the native E. coli promoter with a synthetic one was performed using the PCR-mediated λ-Red recombination method. The synthetic promoter fused with an FRT-flanked kanamycin resistance gene was cloned into the pZ vector and the DNA fragment was obtained by PCR amplification with primers containing 50 base pairs of homology for recombination. Recombinant E. coli MG1655 harboring λ-Red recombinase (pRed/ET, Gene Bridges) was cultivated at 30° C., and the expression of λ-Red recombinase was induced by the addition of 10 mM L-arabinose. Electro-competent cells were prepared by washing three times with ddH₂O. The PCR product was introduced into E. coli expressing the λ-Red recombinase via electroporation. Mutants with an exchanged promoter, arising via homologous recombination, were selected on LB agar plates containing 50 μg ml⁻¹ kanamycin and subsequently screened by colony PCR.

To enable genomic overexpression from a synthetic operon, a conjugation-based genetic recombination method was adapted as previously described³⁶. The synthetic operons were digested with BcuI and NotI and ligated with T4 ligase into the pDM4 genome integration vector (with oriR6K), which had previously been digested with the same enzymes. This vector has two 600 bp homology regions compatible with the target spot, a chloramphenicol resistance gene (camR), a levansucrase gene (sacB), and the conjugation gene traJI for the transfer of the plasmid. The resulting ligation products were used to transform chemically competent E. coli ST18 cells. Positive clones growing on chloramphenicol medium supplemented with 5-aminolevulinic acid (50 mg ml⁻¹) were identified by colony PCR, and the confirmed recombinant ST18 strain was used as donor strain for the conjugation. Chloramphenicol-resistant recipient E. coli strains were screened as positive strains for the first round of recombination. Subsequently, sucrose counter-selection and kanamycin resistance tests were carried out to isolate recombinant E. coli strains with the correct synthetic operon integration into the chromosome. All constructs were verified via PCR and sequencing.

Point mutations were introduced into the genome, to establish the mutations shown in FIG. 12, by using multiplex automated genome engineering (MAGE)⁴⁴,⁵³. A single colony of the desired strain(s) transformed with pORTMAGE⁵³ (Addgene catalog no. 72680) was incubated in LB medium supplemented with 100 mg l⁻¹ of ampicillin at 30° C. in a shaking incubator. To start the MAGE cycle, overnight cultures were diluted 100-fold in the same medium and cultivated to an optical density of 0.4-0.5 at 600 nm. 1 ml of each culture was transferred to a sterile microcentrifuge tube and then placed in a 42° C. thermomixer (Thermomixer C, Eppendorf) to express the λ-Red genes by heat shock for 15 min at 1000 rpm. After induction, cells were quickly chilled on ice for at least 15 min, and then made electrocompetent by washing three times with ice-cold ddH₂O. 40 μl of electrocompetent cells were mixed with 2 μl of 50 μM oligomer stock solution and the final volume of the suspension was adjusted to 50 μl. The oligomers used for MAGE were: 5′-T*T*T TTG GCG CTA GAT CAC AGG CAT AAT TTT CAG TAC GTT ATA GGG tGT TTG TTA CTA ATT TAT TTT AAC GGA GTA ACA TTT AGC TCG T*A*C-3′ (pntAB_MAGE; SEQ ID NO: 53), 5′-T*A*A AGT TAA ACA AAA TTA TTT CTA TTA ACT AGT GAA TTC GGT CAt TGC GTC CTG CGC ATA TTA TAT GTG AAT CAC AGT GAT ATG TCA A*G*T-3′ (fdh_MAGE; SEQ ID NO: 54), where the asterisk (*) indicates a phosphorothioate bond. Electroporation was done on a Gene Pulser XCell (Bio-Rad) set to 1.8 kV, 25 μF capacitance, and 200 Ω resistance for a 1 mm gap cuvette. Immediately after electroporation, 1 ml of LB was added to the cuvette and the electroporation mix in LB was transferred to a sterile culture tube and cultured with shaking at 30° C., 240 rpm for 1 hour to allow for recovery. After recovery, 2 ml of LB medium supplemented with ampicillin was added and the culture was further incubated under the same conditions. When the culture reached an OD₆₀₀ of 0.4-0.5, cells were either subjected to an additional MAGE cycle or analyzed for genotype via PCR and sequencing. 8 consecutive MAGE cycles were performed before analyzing the genotype to identify strains carrying the required mutations.
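As a quick check of the electroporation mix described above, the final oligomer concentration follows from simple dilution arithmetic. The Python sketch below is illustrative only; the function name is hypothetical, and the volumes and stock concentration are those stated in the protocol.

```python
def final_concentration_uM(stock_uM, stock_volume_ul, final_volume_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2 (hypothetical helper)."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    return stock_uM * stock_volume_ul / final_volume_ul


cells_ul = 40.0    # electrocompetent cells
oligo_ul = 2.0     # oligomer stock added
stock_uM = 50.0    # oligomer stock concentration
final_ul = 50.0    # adjusted final volume

top_up_ul = final_ul - cells_ul - oligo_ul  # volume needed to reach 50 µl
oligo_final_uM = final_concentration_uM(stock_uM, oligo_ul, final_ul)
print(top_up_ul, oligo_final_uM)
```

With the stated volumes, 8 μl is needed to reach the final volume, and the oligomer ends up at 2 μM in the mix.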

All strains used are shown in Table 1.

TABLE 1

Strains and plasmids used in this study

Strain/Plasmid | Description/Genotype | Source

Strains

MG1655 | F⁻ λ⁻ ilvG⁻ rfb-50 rph-1 | 1
DH5α | F⁻ λ⁻ Φ80lacZΔM15 Δ(lacZYA-argF)U169 deoR recA1 endA1 hsdR17(rK⁻ mK⁺) phoA supE44 thi-1 gyrA96 relA1 | 2
ST18 | pro thi hsdR⁺ Tp^r Sm^r; chromosome::RP4-2 Tc::Mu-Kan::Tn7/λpir ΔhemA | 3
SerAux | MG1655, ΔserA ΔltaE Δkbl ΔaceA | 4
gC₁M | SerAux, ss9-P_STRONG-RBS_C-ftfL-RBS_C-fch-RBS_C-mtdA | This study
gC₂M | SerAux, P_STRONG-RBS_C-gcvT-RBS_NATIVE-gcvH-RBS_NATIVE-gcvP | This study
gC₁M gC₂M | gC₁M, P_STRONG-RBS_C-gcvT-RBS_NATIVE-gcvH-RBS_NATIVE-gcvP | This study
gC₁M gC₂M gC₃M | gC₁M gC₂M, ss7-P_STRONG-RBS_C-glyA-RBS_C-sdaA ΔsdaA ΔglyA | This study
gC₁M gC₂M gC₃M’ | gC₁M gC₂M, ss7-P_STRONG-RBS_A-glyA-RBS_A-sdaA ΔsdaA ΔglyA | This study
gC₁M gC₂M gC₃M gEM (K4) | gC₁M gC₂M gC₃M, ss10-P_STRONG-RBS_A-fdh | This study
gC₁M gC₂M gC₃M gEM’ | gC₁M gC₂M gC₃M, ss10-P_STRONG-RBS_C-fdh | This study
K4 g-PntAB* | K4 strain with a point mutation in promoter region of pntAB | This study
K4 g-FDH* g-PntAB* | K4 strain with a point mutation in both promoter region of pntAB and 5′-UTR region of ss10-P_STRONG-RBS_A-fdh | This study
K4e | Evolved K4 strain after short term evolution | This study

Plasmids

pDM4 | Conjugation plasmid with oriR6K origin, sacB, traJI and chloramphenicol/kanamycin resistance | 5
pZASS | Overexpression plasmid with p15A origin, streptomycin resistance, constitutive strong strength promoter (P_STRONG) | 5
pZASM | Overexpression plasmid with p15A origin, streptomycin resistance, constitutive medium strength promoter (P_MEDIUM) | 5
pZATM | Overexpression plasmid with p15A origin, tetracycline resistance, constitutive medium strength promoter (P_MEDIUM) | 5
pZSSM | Overexpression plasmid with pSC101 origin, streptomycin resistance, constitutive medium strength promoter (P_MEDIUM) | 5
pDM4:SS9-C₁M | pDM4 backbone with 600 bp up/down homology to safe spot 9 ⁶ for the genome integration of P_STRONG-RBS_C-ftfL-RBS_C-fch-RBS_C-mtdA | This study
pDM4:SS7-C₃M | pDM4 backbone with 600 bp up/down homology to safe spot 7 ⁶ for the genome integration of P_STRONG-RBS_C-glyA-RBS_C-sdaA | This study
pDM4:SS10-EM | pDM4 backbone with 600 bp up/down homology to safe spot 10 ⁶ for the genome integration of P_STRONG-RBS_A-fdh | This study
pC₁M | pZSSM backbone for overexpression of RBS_C-ftfL-RBS_C-fch-RBS_C-mtdA from Methylobacterium extorquens | 4
pC₂M | pZATM backbone for overexpression of RBS_C-gcvT-RBS_C-gcvH-RBS_C-gcvP from E. coli | 4
pC₃M | pZASS backbone for overexpression of RBS_C-glyA-RBS_C-sdaA from E. coli | This study
ASS-glyA-sdaA | pZASS backbone for overexpression of RBS_C-glyA-RBS_C-sdaA from E. coli | This study
ASM-glyA-sdaA | pZASS backbone for overexpression of RBS_C-glyA-RBS_C-sdaA from E. coli | This study
ASS-sdaA | pZASS backbone for overexpression of RBS_C-sdaA from E. coli | This study
ASM-sdaA | pZASM backbone for overexpression of RBS_C-sdaA from E. coli | This study
ASS-fdh | pZASS backbone for overexpression of RBS_C-fdh from Pseudomonas putida | This study
ASS-bsMDH | pZASS backbone for overexpression of methanol dehydrogenase from Bacillus stearothermophilus (UniProt, P42327) | This study
ASS-cgMDH | pZASS backbone for overexpression of methanol dehydrogenase from Corynebacterium glutamicum (UniProt, A4QHJ5) | This study
ASS-cnMDH | pZASS backbone for overexpression of methanol dehydrogenase from Cupriavidus necator (UniProt, F8GNE5) | This study
ASS-bmMDH3 | pZASS backbone for overexpression of methanol dehydrogenase from Bacillus methanolicus (UniProt, I3E2P9) | This study
ASS-bmMDH2 | pZASS backbone for overexpression of methanol dehydrogenase from Bacillus methanolicus (UniProt, I3E949) | This study
ASS-bmMDH2* | pZASS backbone for overexpression of engineered methanol dehydrogenase (Q5L A363L) from Bacillus methanolicus (UniProt, I3E949) | This study
ASS-bsMDH/paFADH | pZASS backbone for overexpression of RBS_C-bsMDH-RBS_C-paFADH, a formaldehyde dehydrogenase from Pseudomonas aeruginosa | This study
ASS-bsMDH/ppFADH | pZASS backbone for overexpression of RBS_C-bsMDH-RBS_C-ppFADH, a formaldehyde dehydrogenase from Pseudomonas putida | This study

REFERENCES IN TABLE 1

1 Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453-1462 (1997).

2 Meselson, M. & Yuan, R. DNA restriction enzyme from E. coli. Nature 217, 1110-1114 (1968).

3 Thoma, S. & Schobert, M. An improved Escherichia coli donor strain for diparental mating. FEMS Microbiol Lett 294, 127-132, doi:10.1111/j.1574-6968.2009.01556.x (2009).

4 Yishai, O., Bouzon, M., Doring, V. & Bar-Even, A. In Vivo Assimilation of One-Carbon via a Synthetic Reductive Glycine Pathway in Escherichia coli. ACS synthetic biology, doi:10.1021/acssynbio.8b00131 (2018).

5 Wenk, S., Yishai, O., Lindner, S. N. & Bar-Even, A. An Engineering Approach for Rewiring Microbial Metabolism. Methods Enzymol 608, 329-367, doi:10.1016/bs.mie.2018.04.026 (2018).

6 Bassalo, M. C. et al. Rapid and Efficient One-Step Metabolic Pathway Integration in E. coli. ACS synthetic biology 5, 561-568, doi:10.1021/acssynbio.5b00187 (2016).

Synthetic-Operon Construction

Protein sequences of formate-tetrahydrofolate ligase (ftfL, UniProt: Q83WS0), 5,10-methenyl-tetrahydrofolate cyclohydrolase (fchA, UniProt: Q49135), and 5,10-methylene-tetrahydrofolate dehydrogenase (mtdA, UniProt: P55818) were taken from Methylobacterium extorquens AM1. Formate dehydrogenase (fdh, UniProt: P33160) was taken from Pseudomonas sp. Formaldehyde dehydrogenases were obtained from Pseudomonas aeruginosa (fdhA, UniProt: Q9HTE3) and Pseudomonas putida (fdhA, UniProt: P46154). Methanol dehydrogenases were taken from Bacillus stearothermophilus (adh, UniProt: P42327), Corynebacterium glutamicum (cgR_2695, UniProt: A4QHJ5), Cupriavidus necator (mdh2, UniProt: F8GNE5), and Bacillus methanolicus (UniProt: I3E2P9 and I3E949, as well as an engineered MDH, as reported in ⁴⁸). These genes were codon optimized for E. coli K-12 and synthesized (Baseclear, Netherlands). All gene sequences are listed in the sequence protocol of the application.

Genes native to E. coli, namely serine hydroxymethyltransferase (glyA) and serine deaminase (sdaA), were prepared by PCR amplification from the E. coli MG1655 genome. Genes were integrated into the high copy number cloning vector pNiv to construct synthetic operons using the method described previously³⁶,⁵⁴. Plasmid-based gene overexpression was achieved by cloning the desired synthetic operon into the pZ vector (p15A origin of replication, streptomycin marker³⁶) digested with EcoRI and PstI, using T4 DNA ligase. All molecular biology techniques were performed with standard methods⁵⁵ or following the manufacturer's protocols.

Promoters and ribosome binding sites were used as described previously³⁶,⁵⁴,⁵⁶. Briefly, either a medium strength constitutive promoter (‘PGI-10’⁵⁶) or a strong constitutive promoter (‘PGI-20’⁵⁶) was used, as indicated in the text and in FIG. 6. Likewise, either a medium strength ribosome binding site (RBS_C⁵⁴) or a strong ribosome binding site (RBS_A⁵⁴) was used, as indicated in the text and in FIG. 6.

All plasmids used are shown in Table 1 above.

Growth Medium and Conditions

Luria Bertani (LB) medium (1% NaCl, 0.5% yeast extract, and 1% tryptone) was used for strain propagation. Further cultivation was done in M9 minimal medium (50 mM Na₂HPO₄, 20 mM KH₂PO₄, 1 mM NaCl, 20 mM NH₄Cl, 2 mM MgSO₄, and 100 μM CaCl₂) with trace elements (134 μM EDTA, 13 μM FeCl₃·6H₂O, 6.2 μM ZnCl₂, 0.76 μM CuCl₂·2H₂O, 0.42 μM CoCl₂·2H₂O, 1.62 μM H₃BO₃, 0.081 μM MnCl₂·4H₂O). For the cell growth tests, overnight cultures in LB medium were used to inoculate a pre-culture at an optical density (600 nm, OD₆₀₀) of 0.02 in 4 ml of fresh M9 medium containing 10 mM glucose, 1 mM glycine and 30 mM formate in a 10 ml glass test tube. Cells were then cultivated at 37° C. with shaking at 240 rpm. Cell cultures were harvested by centrifugation (18,407×g, 3 min, 4° C.), washed twice with fresh M9 medium, and used to inoculate the main culture, which was conducted aerobically either in 10 ml glass tubes or in Nunc 96-well microplates (Thermo Fisher Scientific) with carbon sources chosen according to the strain and the specific experiment: 10 mM glucose, 20 mM acetate, 30 mM formate, 600 mM methanol, and/or 10% CO₂ (90% air).

In microplate cultivation, each well contained 150 μl of culture covered with 50 μl of mineral oil (Sigma-Aldrich) to avoid evaporation (note that small gaseous molecules such as O₂ and CO₂ can freely diffuse through this oil layer). Growth experiments were conducted (under either 100% air or 90% air/10% CO₂) in a BioTek Epoch 2 plate reader (BioTek Instruments, USA) at 37° C. Growth (OD₆₀₀) was measured after a kinetic cycle of 12 shaking steps, which alternated between linear and orbital (1 mm amplitude) and were each 60 s long. OD values measured in the plate reader were calibrated to represent OD values in standard cuvettes according to OD_cuvette = OD_plate/0.23.

Glass tube cultures were carried out in a working volume of 4 ml at 37° C. with shaking at 240 rpm. Volume loss due to evaporation was compensated by adding the appropriate amount of sterile double distilled water (ddH₂O) to the culture tube every two days. All growth experiments were performed in triplicate, and the growth curves shown represent the average of these triplicates.
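The plate-reader calibration and triplicate averaging described above can be illustrated with a minimal sketch. Only the factor 0.23 is taken from the text; the function names and the assumption that all replicates share the same time points are illustrative and are not part of the original protocol.

```python
# Factor from the text: OD_cuvette = OD_plate / 0.23
PLATE_TO_CUVETTE_FACTOR = 0.23


def plate_to_cuvette_od(od_plate: float) -> float:
    """Convert a plate-reader OD600 reading to its cuvette-equivalent value."""
    return od_plate / PLATE_TO_CUVETTE_FACTOR


def mean_growth_curve(replicates):
    """Calibrate each reading and average replicate curves point by point.

    `replicates` is a list of equally long lists of plate-reader OD600
    readings (hypothetical layout: one list per replicate well).
    """
    n = len(replicates)
    return [
        sum(plate_to_cuvette_od(value) for value in time_point) / n
        for time_point in zip(*replicates)
    ]
```

For example, three replicate wells that each read 0.23 at a given time point would be reported as a cuvette-equivalent OD₆₀₀ of 1.0.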

¹³C Labeling of Proteinogenic Amino Acids

For stationary isotope tracing of proteinogenic amino acids, cells were cultured in 4 ml of M9 medium supplemented with either labeled or unlabeled carbon sources, that is, ¹³C-formate, ¹³C-methanol and/or ¹³CO₂, under the conditions described above. A 6 L vacuum desiccator (Lab Companion, South Korea) was used for cultures grown on ¹³CO₂; the original gas was expelled using a vacuum pump and the desiccator was then refilled with 90% air and 10% ¹³CO₂. Cells were harvested by centrifugation for 3 min at 18,407×g once the stationary growth phase was reached. Biomass was hydrolyzed by incubation with 1 ml of 6 N hydrochloric acid for 24 h at 95° C. Samples were dried by heating at 95° C. and re-dissolved in 1 ml of ddH₂O. Hydrolyzed amino acids were separated by ultra performance liquid chromatography (Acquity, Waters, Milford, Mass., USA) on a C18 reversed-phase column (Waters) as previously described⁵⁷. Mass spectra were acquired using an Exactive mass spectrometer (Thermo Fisher). Data analysis was performed using Xcalibur (Thermo Fisher). Prior to analysis, amino-acid standards (Sigma-Aldrich) were analyzed under the same conditions in order to determine typical retention times.
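As an illustration of how labeling data of this kind are commonly summarized, the following sketch computes the isotopologue fractions and the mean ¹³C enrichment of one amino acid from its mass isotopologue intensities. It is an assumption that the data were reduced in this way; the function name and input layout are hypothetical, and no natural-abundance correction is applied.

```python
def fractional_labeling(intensities):
    """Summarize a mass isotopologue distribution.

    `intensities[i]` is the signal of the M+i isotopologue, for i = 0..n,
    where n is the number of carbon atoms in the measured ion (assumed
    layout). Returns (fractions, mean_enrichment), where mean_enrichment
    is the average fraction of carbon atoms that are 13C.
    """
    if len(intensities) < 2:
        raise ValueError("need at least the M+0 and M+1 intensities")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    n_carbons = len(intensities) - 1
    fractions = [x / total for x in intensities]
    mean_enrichment = sum(i * f for i, f in enumerate(fractions)) / n_carbons
    return fractions, mean_enrichment
```

For a two-carbon ion such as glycine, intensities of 25, 50 and 25 for M+0, M+1 and M+2 correspond to a mean enrichment of 0.5.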

Dry Weight Analysis

To determine the cell dry weight (CDW) of E. coli grown on formate or methanol, pre-cultures prepared as described above were inoculated at a final OD₆₀₀ of 0.01 into fresh M9 medium containing either formate (30 mM) or methanol (600 mM) in a 125 ml Pyrex Erlenmeyer flask and grown at 37° C. with agitation at 240 rpm. Up to 50 ml of cell culture growing in shake flasks was harvested by centrifugation (3,220×g, 20 min). To remove residual medium compounds, cells were washed by three cycles of centrifugation (7,000×g, 5 min) and resuspension in 2 ml of ddH₂O. Cell suspensions were transferred to a pre-weighed and pre-dried aluminum dish and dried at 90° C. for 16 h. The weight of the dried cells was determined by subtracting the weight of the empty dish from the weight of the dish containing the dried cells.

CDW of E. coli strains was measured during exponential growth phase (OD₆₀₀ of 0.3-0.4) in the presence of 10% CO₂ on 30 mM formate (at OD₆₀₀ of 0.2, 0.37, and 0.41) and on 600 mM methanol (at OD₆₀₀ of 0.21, 0.22, and 0.24). As a control, the CDW of the E. coli strain growing on either formate or methanol was determined during exponential growth phase in the presence of 10% CO₂ and 30 mM formate together with either 10 mM glucose (at OD₆₀₀ of 1.26), 20 mM pyruvate (at OD₆₀₀ of 0.78), or 20 mM succinate (at OD₆₀₀ of 0.37). To determine the CDW of E. coli WT, cells were grown in the presence of 10% CO₂ on 10 mM glucose and the CDW was determined during exponential growth phase (at OD₆₀₀ of 0.78).
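The arithmetic behind the dry weight determination can be sketched as follows. The function names, argument units and the normalization to OD₆₀₀ are illustrative assumptions; the text itself only specifies subtracting the empty dish weight from the weight of the dish with dried cells.

```python
def cell_dry_weight_g_per_l(dish_with_cells_mg, empty_dish_mg, culture_volume_ml):
    """Cell dry weight per culture volume.

    The dried cell mass is the difference between the two dish weights;
    mg of cells per ml of harvested culture is numerically equal to g/l.
    """
    dry_mass_mg = dish_with_cells_mg - empty_dish_mg
    return dry_mass_mg / culture_volume_ml


def cdw_per_od(cdw_g_per_l, od600):
    """Normalize CDW to the OD600 at harvest (gCDW per liter per OD unit)."""
    return cdw_g_per_l / od600
```

With hypothetical values, a dish that gains 10 mg after drying cells from 50 ml of culture corresponds to 0.2 gCDW per liter.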

Enzymes and Chemical Assays

Absorbance changes for all assays were monitored in a BioTek Epoch 2 plate reader. All assays were confirmed to operate within the linear range of the measurement. Results represent averages of at least three cell preparations. To determine the activity of formate dehydrogenase, 1.5 ml of cell culture at an OD₆₀₀ of 1.0, grown in glass test tubes in M9 minimal medium supplemented with glucose and formate, was washed twice with 9 g l⁻¹ sodium chloride. Cells were lysed by adding CelLytic Reagent (Sigma) and incubating for 20 min at room temperature. After cell disruption, cellular debris was removed by centrifugation (18,407×g, 4° C., 10 min) and the supernatant was used for crude extract assays without further purification. The formate dehydrogenase assay was performed in the presence of 10 mM 2-mercaptoethanol, 100 mM sodium formate, 200 mM sodium phosphate buffer pH 7.0, and 2 mM NAD⁺ in a total volume of 200 μl at 37° C.⁵⁸. The increase in NADH concentration resulting from formate oxidation was monitored at 340 nm. Protein concentration was measured using the Bradford Reagent (Sigma) with bovine serum albumin as a standard. Formate and methanol in the culture were quantified by colorimetric assays using a formate assay kit (Sigma-Aldrich) and a methanol assay kit (BioVision), respectively. All samples were diluted to ensure that the readings were within the standard curve range according to the manufacturer's instructions.
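A minimal sketch of how a specific activity could be derived from such an assay is given below. It assumes the standard molar extinction coefficient of NADH at 340 nm (6.22 mM⁻¹ cm⁻¹); the optical path length of the well, the lysate volume and all names are hypothetical inputs that are not given in the text.

```python
# Standard extinction coefficient of NADH at 340 nm, in mM^-1 cm^-1
NADH_EXT_COEFF = 6.22


def fdh_specific_activity(slope_a340_per_min, path_length_cm,
                          assay_volume_ml, lysate_volume_ml,
                          protein_mg_per_ml):
    """Specific activity in micromol NADH per min per mg protein (U/mg).

    slope_a340_per_min: linear increase of absorbance at 340 nm per minute
    path_length_cm:     optical path length of the well (assumed input)
    assay_volume_ml:    total assay volume (0.2 ml in the text)
    lysate_volume_ml:   volume of crude extract added (assumed input)
    protein_mg_per_ml:  protein concentration of the extract (Bradford)
    """
    # Beer-Lambert law: concentration change in mM/min (= micromol/ml/min)
    rate_mm_per_min = slope_a340_per_min / (NADH_EXT_COEFF * path_length_cm)
    # Amount of NADH formed per minute in the whole well
    umol_per_min = rate_mm_per_min * assay_volume_ml
    # Amount of protein present in the well
    protein_mg = protein_mg_per_ml * lysate_volume_ml
    return umol_per_min / protein_mg
```

Because the path length of 200 μl in a microplate well is shorter than the 1 cm of a standard cuvette, it would need to be measured or corrected for the plate actually used.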

Quantitative Polymerase Chain Reaction

Total RNA was extracted from 1 ml of overnight culture at an OD₆₀₀ of 0.5 using the RNeasy Mini Kit (Qiagen, Hilden, Germany), following the protocol of the supplier. All RNA samples were treated with DNase I (Sigma-Aldrich, St. Louis, Mo., US) to remove any residual DNA. First-strand cDNA was synthesized using a qScript cDNA Synthesis kit following the manufacturer's instructions (Quanta Biosciences, Gaithersburg, Md., US), with 1 μg of total RNA as a template in a 20 μl reaction volume. Quantitative reverse-transcription polymerase chain reactions (qRT-PCR) were prepared using a Maxima™ SYBR Green qPCR Master Mix (ThermoFisher Scientific, Darmstadt, Germany) supplemented with 5 μM primers and 5 μl of cDNA template, which had been diluted to 200 μl after synthesis. The primers used for qPCR were: 5′-GCC AAT CTG CAA CAG TGC TC-3′ (pntA_forward, SEQ ID NO: 55), 5′-TTT TTG GCT GGA TGG CAA GC-3′ (pntA_reverse, SEQ ID NO: 56), 5′-CGT GAC GAA TAC CTG ATC GTT-3′ (fdh_forward, SEQ ID NO: 57), and 5′-GGT AGC GTT ACC TTT AGA GTA AGA GTG-3′ (fdh_reverse, SEQ ID NO: 58). PCR was performed in 96-well optical reaction plates (ThermoFisher Scientific, Darmstadt, Germany) as follows: 10 min at 50° C., 5 min at 95° C., 40 cycles of 10 s at 95° C. and 30 s at 60° C., and finally 1 min at 95° C. The specificity of the reactions and the amplicon identities were verified by melting curve analysis. Reaction mixtures without cDNA were used as a negative control. Data were evaluated using the ΔΔCT method⁵⁹ with correction for the PCR efficiency, which was determined from the slope of standard curves. Gene expression levels were normalized to the rrsA gene⁶⁰, and the fold differences in transcript levels and the mean standard error were calculated as described elsewhere⁵⁹.
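One common way to combine CT values with an efficiency correction is sketched below. This is an assumption about the form of the calculation, not a statement of the exact procedure used; the function and argument names are hypothetical. With an amplification factor of 2 for both genes, the expression reduces to the familiar 2^-ΔΔCT formula.

```python
def efficiency_from_slope(slope):
    """Amplification factor per cycle from a standard curve.

    `slope` is the slope of CT plotted against log10(template amount);
    a slope of about -3.32 corresponds to perfect doubling (factor 2).
    """
    return 10 ** (-1.0 / slope)


def relative_expression(ct_target_control, ct_target_sample,
                        ct_ref_control, ct_ref_sample,
                        e_target=2.0, e_ref=2.0):
    """Efficiency-corrected fold change of a target gene (e.g. pntA or fdh)
    in a sample relative to a control, normalized to a reference gene
    (rrsA in the text)."""
    target_ratio = e_target ** (ct_target_control - ct_target_sample)
    ref_ratio = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_ratio / ref_ratio
```

For instance, a target gene that crosses the threshold two cycles earlier in the sample than in the control, with an unchanged reference gene and perfect efficiency, corresponds to a four-fold increase in transcript level.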

Quantification of E. coli Colony Forming Units

Viable cell counts were determined by sampling E. coli cell cultures periodically. 10 μl of cell culture was diluted in 990 μl of sterile M9 medium, and the diluted cell suspension was further diluted either 100-fold or 1000-fold to obtain isolated colonies on agar plates. 100 μl of the serially diluted cell suspension was plated on an LB agar plate and incubated at 37° C. for 24 h. All cell count experiments were conducted at least five times for each OD value to obtain reliable counts.
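The conversion from colony counts to colony forming units per ml of the original culture follows directly from the dilution scheme above and can be sketched as follows. The function and parameter names are illustrative; the default values reflect the first 100-fold dilution (10 μl in 990 μl) and the plated volume of 100 μl stated in the text.

```python
def cfu_per_ml(colonies, second_dilution, first_dilution=100,
               plated_volume_ml=0.1):
    """Colony forming units per ml of undiluted culture.

    colonies:         number of colonies counted on the plate
    second_dilution:  further dilution factor (100 or 1000 in the text)
    first_dilution:   10 microliters into 990 microliters, i.e. 100-fold
    plated_volume_ml: volume of diluted suspension spread on the plate
    """
    total_dilution = first_dilution * second_dilution
    return colonies * total_dilution / plated_volume_ml
```

As a hypothetical example, 50 colonies on a plate from the 100-fold second dilution correspond to 5 × 10⁶ CFU per ml of culture.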

REFERENCES

1 Blankenship, R. E. et al. Comparing photosynthetic and photovoltaic efficiencies and recognizing the potential for improvement. Science 332, 805-809, doi:10.1126/science.1200165 (2011).

2 Scheffe, J. R. & Steinfeld, A. Oxygen exchange materials for solar thermochemical splitting of H2O and CO2: a review. Materials Today 17, 341-348 (2014).

3 Snoeckx, R. & Bogaerts, A. Plasma technology—a novel solution for CO2 conversion? Chem Soc Rev 46, 5805-5863, doi:10.1039/c6cs00066e (2017).

4 Zhang, Q., Kang, J. & Wang, Y. Development of novel catalysts for Fischer-Tropsch synthesis: tuning the product selectivity. ChemCatChem 2, 1030-1058 (2010).

5 Jouny, M., Luc, W. & Jiao, F. General techno-economic analysis of CO2 electrolysis systems. Ind Eng Chem Res 57, 2165-2177 (2018).

6 Yishai, O., Lindner, S. N., Gonzalez de la Cruz, J., Tenenboim, H. & Bar-Even, A. The formate bio-economy. Curr Opin Chem Biol 35, 1-9, doi:10.1016/j.cbpa.2016.07.005 (2016).

7 Szima, S. & Cormos, C. C. Improving methanol synthesis from carbon-free H2 and captured CO2: A techno-economic and environmental evaluation. J CO2 Util 24, 555-563 (2018).

8 Bertsch, J. & Muller, V. Bioenergetic constraints for conversion of syngas to biofuels in acetogenic bacteria. Biotechnology for biofuels 8, 210, doi:10.1186/s13068-015-0393-x (2015).

9 Bennett, R. K., Steinberg, L. M., Chen, W. & Papoutsakis, E. T. Engineering the bioconversion of methane and methanol to fuels and chemicals in native and synthetic methylotrophs. Curr Opin Biotechnol 50, 81-93, doi:10.1016/j.copbio.2017.11.010 (2017).

10 Muller, J. E. et al. Engineering Escherichia coli for methanol conversion. Metab Eng 28, 190-201, doi:10.1016/j.ymben.2014.12.008 (2015).

11 Dai, Z. et al. Metabolic construction strategies for direct methanol utilization in Saccharomyces cerevisiae. Bioresour Technol 245, 1407-1412, doi:10.1016/j.biortech.2017.05.100 (2017).

12 Yu, H. & Liao, J. C. A modified serine cycle in Escherichia coli coverts methanol and CO2 to two-carbon compounds. Nature communications 9, 3992, doi:10.1038/s41467-018-06496-4 (2018).

13 Meyer, F. et al. Methanol-essential growth of Escherichia coli. Nature communications 9, 1508, doi:10.1038/s41467-018-03937-y (2018).

14 Woolston, B. M., King, J. R., Reiter, M., Van Hove, B. & Stephanopoulos, G. Improving formaldehyde consumption drives methanol assimilation in engineered E. coli. Nature communications 9, 2387, doi:10.1038/s41467-018-04795-4 (2018).

15 Bennett, R. K., Gonzalez, J. E., Whitaker, W. B., Antoniewicz, M. R. & Papoutsakis, E. T. Expression of heterologous non-oxidative pentose phosphate pathway from Bacillus methanolicus and phosphoglucose isomerase deletion improves methanol assimilation and metabolite production by a synthetic Escherichia coli methylotroph. Metab Eng, doi:10.1016/j.ymben.2017.11.016 (2017).

16 Gonzalez, J., Bennett, R. K., Papoutsakis, E. T. & Antoniewicz, M. R. Methanol assimilation in Escherichia coli is improved by co-utilization of threonine and deletion of leucine-responsive regulatory protein. Metab Eng, doi:10.1016/j.ymben.2017.11.015 (2017).

17 Rohlhill, J., Sandoval, N. R. & Papoutsakis, E. T. Sort-Seq Approach to Engineering a Formaldehyde-Inducible Promoter for Dynamically Regulated Escherichia coli Growth on Methanol. ACS synthetic biology 6, 1584-1595, doi:10.1021/acssynbio.7b00114 (2017).

18 Woolston, B. M., Roth, T., Kohale, I., Liu, D. R. & Stephanopoulos, G. Development of a formaldehyde biosensor with application to synthetic methylotrophy. Biotechnol Bioeng 115, 206-215, doi:10.1002/bit.26455 (2018).

19 Whitaker, W. B. et al. Engineering the biological conversion of methanol to specialty chemicals in Escherichia coli. Metab Eng 39, 49-59, doi:10.1016/j.ymben.2016.10.015 (2017).

20 Lu, X. et al. Constructing a synthetic pathway for acetyl-coenzyme A from one-carbon through enzyme design. Nature communications 10, 1378, doi: 10.1038/s41467-019-09095-z (2019).

21 Wang, X. et al. Biological conversion of methanol by evolved Escherichia coli carrying a linear methanol assimilation pathway. Bioresour Bioprocess 4, 41 (2017).

22 Anthony, C. The Biochemistry of Methylotrophs. (Academic Press, 1982).

23 Drake, H. L., Kirsten, K. & Matthies, C. in The Prokaryotes 354-420 (Springer New York, 2006).

24 Bar-Even, A., Noor, E., Flamholz, A. & Milo, R. Design and analysis of metabolic pathways supporting formatotrophic growth for electricity-dependent cultivation of microbes. Biochim Biophys Acta 1827, 1039-1047 (2013).

25 Bar-Even, A. Does acetogenesis really require especially low reduction potential? Biochim Biophys Acta 1827, 395-400, doi:10.1016/j.bbabio.2012.10.007 (2013).

26 Noor, E. et al. Pathway thermodynamics highlights kinetic obstacles in central metabolism. PLoS Comput Biol 10, e1003483, doi:10.1371/journal.pcbi.1003483 (2014).

27 Figueroa, I. A. et al. Metagenomics-guided analysis of microbial chemolithoautotrophic phosphite oxidation yields evidence of a seventh natural CO2 fixation pathway. Proc Natl Acad Sci U S A 115, E92-E101, doi:10.1073/pnas.1715549114 (2018).

28 Kawasaki, H., Sato, T. & Kikuchi, G. A new reaction for glycine biosynthesis. Biochem Biophys Res Commun 23, 227-233 (1966).

29 Motokawa, Y. & Kikuchi, G. Glycine metabolism by rat liver mitochondria. Reconstruction of the reversible glycine cleavage system with partially purified protein components. Arch Biochem Biophys 164, 624-633 (1974).

30 Pasternack, L. B., Laude, D. A., Jr. & Appling, D. R. 13C NMR detection of folate-mediated serine and glycine synthesis in vivo in Saccharomyces cerevisiae. Biochemistry 31, 8713-8719 (1992).

31 Tashiro, Y., Hirano, S., Matson, M. M., Atsumi, S. & Kondo, A. Electrical-biological hybrid system for CO2 reduction. Metab Eng 47, 211-218, doi:10.1016/j.ymben.2018.03.015 (2018).

32 Yishai, O., Bouzon, M., Doring, V. & Bar-Even, A. In Vivo Assimilation of One-Carbon via a Synthetic Reductive Glycine Pathway in Escherichia coli. ACS synthetic biology, doi:10.1021/acssynbio.8b00131 (2018).

33 Bang, J. & Lee, S. Y. Assimilation of formic acid and CO2 by engineered Escherichia coli equipped with reconstructed one-carbon assimilation pathways. Proc Natl Acad Sci USA 115, E9271-E9279, doi:10.1073/pnas.1810386115 (2018).

34 Crowther, G. J., Kosaly, G. & Lidstrom, M. E. Formate as the main branch point for methylotrophic metabolism in Methylobacterium extorquens AM1. J Bacteriol 190, 5057-5062 (2008).

35 Tishkov, V. I. & Popov, V. O. Catalytic mechanism and application of formate dehydrogenase. Biochemistry (Mosc) 69, 1252-1267, doi:BCM69111537 [pii] (2004).

36 Wenk, S., Yishai, O., Lindner, S. N. & Bar-Even, A. An Engineering Approach for Rewiring Microbial Metabolism. Methods Enzymol 608, 329-367, doi:10.1016/bs.mie.2018.04.026 (2018).

37 Bassalo, M. C. et al. Rapid and Efficient One-Step Metabolic Pathway Integration in E. coli. ACS synthetic biology 5, 561-568, doi:10.1021/acssynbio.5b00187 (2016).

38 Claassens, N. J., Cotton, C. A., Kopljar, D. & Bar-Even, A. Making quantitative sense of electromicrobial production. Nature Catalysis 2, 437 (2019).

39 Nicholls, P. Formate as an inhibitor of cytochrome c oxidase. Biochem Biophys Res Commun 67, 610-616 (1975).

40 Warnecke, T. & Gill, R. T. Organic acid toxicity, tolerance, and production in Escherichia coli biorefining applications. Microb Cell Fact 4, 25, doi:10.1186/1475-2859-4-25 (2005).

41 Rudolph, B., Gebendorfer, K. M., Buchner, J. & Winter, J. Evolution of Escherichia coli for growth at high temperatures. J Biol Chem 285, 19029-19034, doi:10.1074/jbc.M110.103374 (2010).

42 Dragosits, M. & Mattanovich, D. Adaptive laboratory evolution—principles and applications for biotechnology. Microb Cell Fact 12, 64, doi:10.1186/1475-2859-12-64 (2013).

43 Wytock, T. P. et al. Experimental evolution of diverse Escherichia coli metabolic mutants identifies genetic loci for convergent adaptation of growth rate. PLoS Genet 14, e1007284, doi:10.1371/journal.pgen.1007284 (2018).

44 Wang, H. H. et al. Programming cells by multiplex genome engineering and accelerated evolution. Nature 460, 894-898, doi:nature08187 [pii] 10.1038/nature08187 (2009).

45 Gutheil, W. G., Kasimoglu, E. & Nicholson, P. C. Induction of glutathione-dependent formaldehyde dehydrogenase activity in Escherichia coli and Hemophilus influenza. Biochem Biophys Res Commun 238, 693-696 (1997).

46 Kotrbova-Kozak, A., Kotrba, P., Inui, M., Sajdok, J. & Yukawa, H. Transcriptionally regulated adhA gene encodes alcohol dehydrogenase required for ethanol and n-propanol utilization in Corynebacterium glutamicum R. Appl Microbiol Biotechnol 76, 1347-1356, doi:10.1007/s00253-007-1094-6 (2007).

47 Wu, T. Y. et al. Characterization and evolution of an activator-independent methanol dehydrogenase from Cupriavidus necator N-1. Appl Microbiol Biotechnol 100, 4969-4983, doi:10.1007/s00253-016-7320-3 (2016).

48 Roth, T. B., Woolston, B. M., Stephanopoulos, G. & Liu, D. R. Phage-Assisted Evolution of Bacillus methanolicus Methanol Dehydrogenase 2. ACS synthetic biology 8, 796-806, doi:10.1021/acssynbio.8b00481 (2019).

49 Zhang, W. et al. Expression, purification, and characterization of formaldehyde dehydrogenase from Pseudomonas aeruginosa. Protein Expr Purif 92, 208-213, doi:10.1016/j.pep.2013.09.017 (2013).

50 Thoma, S. & Schobert, M. An improved Escherichia coli donor strain for diparental mating. FEMS Microbiol Lett 294, 127-132, doi:10.1111/j.1574-6968.2009.01556.x (2009).

51 Thomason, L. C., Costantino, N. & Court, D. L. E. coli genome manipulation by P1 transduction. Curr Protoc Mol Biol Chapter 1, Unit 1 17, doi:10.1002/0471142727.mb0117s79 (2007).

52 Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2, 2006-2008, doi:10.1038/msb4100050 (2006).

53 Nyerges, A. et al. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species. Proc Natl Acad Sci U S A 113, 2502-2507, doi:10.1073/pnas.1520040113 (2016).

54 Zelcbuch, L. et al. Spanning high-dimensional expression space using ribosome-binding site combinatorics. Nucleic Acids Res 41, e98, doi:gkt151 [pii] 10.1093/nar/gkt151 (2013).

55 Sambrook, J. & Russell, D. W. Molecular cloning: a laboratory manual. 3rd edn, (Cold Spring Harbor Laboratory Press, 2001).

56 Braatsch, S., Helmark, S., Kranz, H., Koebmann, B. & Jensen, P. R. Escherichia coli strains with promoter libraries constructed by Red/ET recombination pave the way for transcriptional fine-tuning. Biotechniques 45, 335-337, doi:000112907 [pii] 10.2144/000112907 (2008).

57 Giavalisco, P. et al. Elemental formula annotation of polar and lipophilic metabolites using 13C, 15N and 34S isotope labelling, in combination with high-resolution mass spectrometry. Plant J 68, 364-376 (2011).

58 Liu, A., Feng, R. & Liang, B. Microbial surface displaying formate dehydrogenase and its application in optical detection of formate. Enzyme Microb Technol 91, 59-65, doi:10.1016/j.enzmictec.2016.06.002 (2016).

59 Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2^-ΔΔCT method. Methods 25, 402-408 (2001).

60 Zhou, K. et al. Novel reference genes for quantifying transcriptional responses of Escherichia coli to protein overexpression by quantitative PCR. BMC Mol Biol 12, 18, doi:10.1186/1471-2199-12-18 (2011).
