Like plants and algae, cyanobacteria obtain energy from photosynthesis, utilizing energy from sunlight and electrons from water to reduce carbon dioxide (CO2) and thereby ‘fix’ carbon into cell biomass. This photosynthetically fixed carbon can then be used to make metabolites, such as carbohydrates, proteins, and fatty acids, that are ultimately distributed to heterotrophic organisms. Besides their role as primary carbon fixation organisms, cyanobacteria can also be altered to produce useful products. For example, Synechococcus elongatus PCC 7942 has been engineered to produce isobutyraldehyde and butanol; Synechocystis sp. PCC 6803 has been modified to produce ethanol and isoprene.
Cyanobacteria excel at carbon fixation, thanks to their complex carbon concentrating mechanism (ccm), which is composed of bicarbonate pumps, carbon dioxide-uptake systems and the carboxysome. The carboxysome is an approximately 300 MDa compartment essential for carbon concentration, as it enhances carbon fixation by sequestering ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) and carbonic anhydrase (CA) within a protein shell. In the carboxysome lumen, bicarbonate is converted into carbon dioxide by carbonic anhydrase. Such conversion increases the proportion of carbon dioxide to oxygen in the vicinity of Rubisco, which favors Rubisco's carboxylase activity, while the shell limits the loss of carbon dioxide into the bulk cytosol (Cai et al., 2009).
Researchers have explored ways to express the β-carboxysome shell and cyanobacterial form 1B Rubisco in chloroplasts (Lin et al., 2014b; Lin et al., 2014a). However, constructs have not been generated that can assemble the functional multi-protein metabolic carboxysome core.
In cyanobacteria, the key enzyme for photosynthetic CO2 fixation, ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), is bound within proteinaceous polyhedral microcompartments called carboxysomes. A streamlined carboxysome is described herein that was generated by fusing key domains from four proteins into a single protein. This chimeric protein assembles into a functional carboxysome core that can readily be transferred and utilized in other organisms. This is the first instance of the redesign and construction of a carboxysome core and the first instance of a redesign of a bacterial microcompartment core, and it lays the foundation for the generation of novel compartments with industrially relevant functions based on the carboxysome and related bacterial microcompartment architectures.
Described herein are fusion proteins that include a polypeptide comprising at least two small subunit-like domains (SSLDs) from a carbon dioxide concentrating mechanism (CcmM) protein, at least one carbonic anhydrase domain, and at least one encapsulation peptide. The at least two small subunit-like domains (SSLDs) from a carbon dioxide concentrating mechanism (CcmM) protein can bind or nucleate with ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco). The Rubisco can, for example, synthesize 3-phosphoglycerate (3-PGA). In some cases, the at least two small subunit-like domains (SSLDs) from a carbon dioxide concentrating mechanism (CcmM) protein can have a protein sequence with at least 95% sequence identity to any of SEQ ID NO:1-11, 37, 75, 76, or 77.
The at least one carbonic anhydrase domain is an enzyme that can convert bicarbonate to carbon dioxide. For example, the at least one carbonic anhydrase domain comprises at least 95% sequence identity to any of SEQ ID NO:17-21 or 71.
The at least one encapsulation peptide can interact with, nucleate, and/or bind one or more carboxysome shell proteins. In some cases, the at least one encapsulation peptide comprises at least 95% sequence identity to any of SEQ ID NO:12-15 or 16.
Also described herein are expression cassettes that can include a promoter operably linked to a nucleic acid segment encoding such a fusion protein. Cells, plants, bacteria, algae, and/or microalgae can be modified to include such expression cassettes.
Methods are also described herein that can provide carbon fixation. Such methods can include culturing the cells that have nucleic acids or expression vectors that encode any of the fusion proteins described herein. The methods can involve cultivating one or more plants that have nucleic acids or expression vectors that encode any of the fusion proteins described herein. Such cells, plants, bacteria, algae, and/or microalgae can manufacture products such as 3-phosphoglycerate (3-PGA). Such cells, plants, bacteria, algae, and/or microalgae can be cultivated or cultured and then harvested. Products can be harvested from the cells, plants, bacteria, algae, and/or microalgae. Such products can include oils, carbohydrates, grains, vegetables, fruits and other components, as well as 3-phosphoglycerate (3-PGA).
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
A chimeric protein is described herein that can assemble into a functional carboxysome core and that is able to fix carbon by taking atmospheric carbon dioxide and converting it into useful carbon-containing molecules such as 3-phosphoglycerate (3-PGA, also referred to as glycerate 3-phosphate). 3-PGA is a precursor for other useful molecules such as serine, which, in turn, can create cysteine and glycine through the homocysteine cycle.
The chimeric protein is referred to as CcmC (where the final “C” is for chimeric). The CcmC protein structure is schematically illustrated in
The chimeric protein structurally and functionally replaces four gene products required for carboxysome formation (see schematic illustrations in
Functional carboxysomes are needed for the survival of a cyanobacterial host. As illustrated herein, the chimeric CcmC protein can replace the function of native carboxysomes in cyanobacteria.
In CcmC, the small subunit-like domains (SSLDs) and the Encapsulation peptide (EP) are fused to opposite ends of the beta-carbonic anhydrase (β-CA) domain. The SSLDs are available to interact with the large subunit of Rubisco and the Encapsulation peptide can interact with the shell (see
The CcmC construct reduces the genomic load required to assemble a carboxysome by about 1100 bp, which is about 18% of total message required for wild type carboxysomes. In addition, it reduces the number of proteins and, concomitantly, the need to balance the expression levels of four different genes.
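By way of illustration only, the two approximate figures recited above imply a rough size for the wild-type message. The following sketch works through that arithmetic; its only inputs are the approximate values stated above (about 1100 bp and about 18%), so the derived totals are estimates and not measured values.

```python
# Back-of-envelope estimate from the approximate figures stated in the text.
saved_bp = 1100          # approximate reduction in coding sequence (bp)
saved_fraction = 0.18    # approximate fraction of the wild-type message

# Implied size of the wild-type message and of the CcmC-based message.
wild_type_bp = saved_bp / saved_fraction
ccmc_bp = wild_type_bp - saved_bp

print(f"implied wild-type message: ~{wild_type_bp:.0f} bp")
print(f"implied CcmC-based message: ~{ccmc_bp:.0f} bp")
```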
The chimeric CcmC carboxysomes, although smaller, morphologically resemble wild-type carboxysomes (
Carboxysomes
Bacterial microcompartments (BMCs) are a family of architecturally similar but functionally diverse self-assembling organelles composed entirely of protein (Axen et al., 2014; Kerfeld and Erbilgin, 2015). The first BMC identified was the carboxysome (Drews and Niklowitz, 1956). Carboxysomes are about 300 MDa in size. Carboxysomes form compartments (Cheng et al., 2008) that are part of the cyanobacterial carbon concentrating mechanism (ccm), which enhances carbon fixation by sequestering ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) and carbonic anhydrase (CA) within a protein shell. In the carboxysome lumen, bicarbonate is converted into carbon dioxide by a carbonic anhydrase (CA), which increases the proportion of carbon dioxide to oxygen in the vicinity of Rubisco while the carboxysome shell limits the loss of carbon dioxide into the bulk cytosol (Cai et al., 2009). Such increased concentration of carbon dioxide favors Rubisco's carboxylase activity. The product of carbon fixation, 3-phosphoglycerate (3-PGA), exits the carboxysome and can be used in the Calvin cycle or other biosynthetic pathways. Rubisco is the most abundant protein in the biosphere and is responsible for the majority of Earth's primary production of biomass.
Two types of carboxysomes are found in cyanobacteria: α-carboxysomes containing form 1A Rubisco, and β-carboxysomes containing form 1B Rubisco. The constituent core proteins also differ between the two types of carboxysomes, as well as the mode of assembly. Recently it was proposed that a large, conserved multi-domain protein (CsoS2) organizes the Rubisco in the α-carboxysome core (Cai et al., 2015). In contrast, assembly of the β-carboxysome involves a sequence of protein domain interactions among multiple core proteins (Cameron et al., 2013).
In Synechococcus elongatus PCC 7942, the β-carboxysome shell is formed by the structural proteins CcmK, CcmL and CcmO. The core of native carboxysomes is composed of CcmM, M35 and CcmN as well as the enzymes Rubisco (form 1B) and the β-carbonic anhydrase, CcaA (
CcmM Protein
The carbon dioxide concentrating mechanism protein, CcmM, can exist as 58-kDa and 35-kDa protein products in Synechococcus elongatus PCC 7942. The relative composition of the 58-kDa and 35-kDa CcmM proteins is not affected by protease inhibitors.
An amino acid sequence for a Synechococcus elongatus PCC 7942 carbonate dehydratase (CcmM: Synpcc7942_1423; 57833 daltons) is available as accession number ABB57453 (see website at uniprot.org/uniprot/Q03513)(SEQ ID NO:1).
A related CcmM protein from Synechococcus elongatus has a sequence with at least 99% sequence identity to SEQ ID NO:1, as illustrated below (SEQ ID NO:2).
This related protein has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_011242447.1 (GI:499561664), and with the sequence shown below (SEQ ID NO:2)
A related CcmM protein from Prochlorothrix hollandica has a sequence with at least 53% sequence identity to SEQ ID NO:1, as illustrated below (SEQ ID NO:3).
This related CcmM protein from Prochlorothrix hollandica has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_017713783.1 (GI:516317089), and with the sequence shown below (SEQ ID NO:3).
A related CcmM protein from Hassallia byssoidea has a sequence with at least 53% sequence identity to SEQ ID NO:1, as illustrated below. Asterisks below the compared sequences indicate amino acid identity at that position (SEQ ID NO:4).
This related CcmM protein from Hassallia byssoidea has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_039748670.1 (GI:748175120), and with the sequence shown below (SEQ ID NO:4).
CcmM comprises an N-terminal γ-CA domain followed by three small subunit-like domains (SSLDs) with sequence homology to RbcS, the small subunit of Rubisco (Long et al., 2007).
M35S Protein
The ccmM gene encodes two essential carboxysome components, the full-length protein and a truncated form containing only the SSLDs (known as M35 in Synechococcus elongatus PCC 7942). In Synechococcus, the short form is composed of three SSLDs, which are believed to aggregate Rubisco. An amino acid sequence for the CcmM short form from Synechococcus elongatus PCC 7942 is shown below as SEQ ID NO:5, where the SSLD domains are identified in bold and with underlining.
RRFRTSSWQP CAPIQSTNER QVLSELENCL SEHEGEYVRL
LGIDTNTRSR VFEALIQRP
D GSVPESLGSQ PVAVASGGGR
TSSWQSCAPI QSSNERQVLA ELENCLSEHE GEYVRLLGID
TASRSRVFEA LIQDP
QGPVG SAKAAAAPVS SATPSSHSYT
LPALSGQSEA TVLPALESIL QEHKGKYVRL IGIDPAARRR
VAELLIQKP
As illustrated, an SSLD can include any of SEQ ID NOs:75-77.
RRFRTSSWQP CAPIQSTNER QVLSELENCL SEHEGEYVRL
LGIDTNTRSR VFEALIQRP
TSSWQSCAPI QSSNERQVLA ELENCLSEHE GEYVRLLGID
TASRSRVFEA LIQDP
LPALSGQSEA TVLPALESIL QEHKGKYVRL IGIDPAARRR
VAELLIQKP
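By way of illustration only, the relationship between the three SSLD sequences and the short form CcmM residues printed above can be checked programmatically. The sketch below joins the residues exactly as they are printed in this section (which may not be the complete SEQ ID NO:5) and reports where each SSLD occurs; the function name `locate` and the labels "SSLD 1" to "SSLD 3" (assigned by order of appearance) are hypothetical.

```python
# Residues exactly as printed in this section, with spaces and line breaks removed.
# Assumption: this may be only the printed portion of the short form CcmM.
M35_PRINTED = (
    "RRFRTSSWQPCAPIQSTNERQVLSELENCLSEHEGEYVRL"
    "LGIDTNTRSRVFEALIQRP"
    "DGSVPESLGSQPVAVASGGGR"
    "TSSWQSCAPIQSSNERQVLAELENCLSEHEGEYVRLLGID"
    "TASRSRVFEALIQDP"
    "QGPVGSAKAAAAPVSSATPSSHSYT"
    "LPALSGQSEATVLPALESILQEHKGKYVRLIGIDPAARRR"
    "VAELLIQKP"
)

# The three SSLD sequences printed above, in order of appearance.
SSLDS = {
    "SSLD 1": "RRFRTSSWQPCAPIQSTNERQVLSELENCLSEHEGEYVRL" "LGIDTNTRSRVFEALIQRP",
    "SSLD 2": "TSSWQSCAPIQSSNERQVLAELENCLSEHEGEYVRLLGID" "TASRSRVFEALIQDP",
    "SSLD 3": "LPALSGQSEATVLPALESILQEHKGKYVRLIGIDPAARRR" "VAELLIQKP",
}


def locate(domain, sequence):
    """Return the 1-based inclusive (start, end) of domain in sequence, or None."""
    index = sequence.find(domain)
    if index < 0:
        return None
    return index + 1, index + len(domain)


spans = {name: locate(seq, M35_PRINTED) for name, seq in SSLDS.items()}
for name, span in spans.items():
    print(name, span)
```

The residues lying between consecutive spans correspond to the linker segments that separate the SSLDs in the printed sequence.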
This related protein from Acaryochloris marina has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_012165581.1 (GI:501116295), and with the full length sequence shown below (SEQ ID NO:6).
The short form CcmM portion of this Acaryochloris marina related protein contains five SSLDs and is shown below as SEQ ID NO:7.
Some forms of CcmM can have a few amino acids missing from the N-terminus or the C-terminus of the short form CcmM protein. In addition, the N-terminus of the short form CcmM protein can have a methionine.
A related short form CcmM protein from Thermosynechococcus elongatus BP-1 has a sequence with at least 49% sequence identity to SEQ ID NO:5, as illustrated below.
This related protein from Thermosynechococcus elongatus BP-1 has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number NP_681734.1 (GI:22298487) and with the sequence shown below (SEQ ID NO:8).
The short form CcmM portion of this Thermosynechococcus elongatus BP-1 related protein has four SSLDs and is shown below as SEQ ID NO:9.
Some short forms of CcmM can have a few amino acids missing from the N-terminus or the C-terminus of the M35 protein. In addition, the N-terminus of the short form protein can have a methionine.
A related short form CcmM protein from Trichormus azollae has a sequence with at least 52% sequence identity to SEQ ID NO:5, as illustrated below.
This short form CcmM related protein from Trichormus azollae has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_013190978.1 (GI:502956002), and with the sequence shown below (SEQ ID NO:10).
The short form portion of this Trichormus azollae related protein contains three SSLDs and is shown below as SEQ ID NO:11.
Some short forms of CcmM can have a few amino acids missing from the N-terminus or the C-terminus of the protein. In addition, the N-terminus of the short form CcmM protein can have a methionine.
CcmN Protein—Encapsulation Peptide (EP)
CcmN contains multiple hexapeptide-repeats and, at its C-terminus, an encapsulation peptide (EP), which is a short α-helical segment linked to the hexapeptide-repeat domains by a flexible linker sequence (Kinney et al., 2012). In general, encapsulation peptides have poorly conserved sequences but are amphipathic in nature (Aussignargues et al., 2015). A schematic diagram of the CcmN protein is shown in
An amino acid sequence for a Synechococcus elongatus PCC 7942 carbon dioxide concentrating mechanism protein (CcmN: Synpcc7942_1424) is available as accession number ABB57454 (SEQ ID NO:12).
As illustrated herein, SSLD domains are fused with an encapsulation peptide from a CcmN protein. Such an encapsulation peptide can have the following sequence (SEQ ID NO:13).
A related CcmN encapsulation peptide is available from Prochlorothrix hollandica that has at least 65% sequence identity to SEQ ID NO:13, as illustrated below.
This Prochlorothrix hollandica related encapsulation peptide has the following sequence: VYGRDYFLQMRFSLFPD (SEQ ID NO:14).
A related CcmN encapsulation peptide is available from Halothece sp. PCC 7418 (Cai et al. 2016) that has at least 27% sequence identity to SEQ ID NO:13, as illustrated below.
The Halothece sp. PCC 7418 related encapsulation peptide has the following sequence: IYGQTHIERLMVTLFPHKEKFKKKTNDWFLVLGSLLFDDFPNNE (SEQ ID NO:15).
A related CcmN encapsulation peptide is available from Moorea producens that has at least 56% sequence identity to SEQ ID NO:13, as illustrated below.
The Moorea producens related encapsulation peptide has the following sequence: EQFFRRMRQSLNRAFSER (SEQ ID NO:16).
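By way of illustration only, the amphipathic character noted above for encapsulation peptides can be estimated with a helical hydrophobic moment. The sketch below is not a method described in the source: it assumes an ideal α-helix (100 degrees of rotation per residue), treats each whole peptide as helical, and uses Kyte-Doolittle hydropathy values as the hydrophobicity scale. The function name `hydrophobic_moment` is hypothetical. It is applied to the Prochlorothrix hollandica and Moorea producens peptides recited above.

```python
import math

# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
# Assumption: this scale is chosen for illustration; other scales exist.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(peptide, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix."""
    step = math.radians(degrees_per_residue)
    x = sum(KYTE_DOOLITTLE[aa] * math.cos(i * step) for i, aa in enumerate(peptide))
    y = sum(KYTE_DOOLITTLE[aa] * math.sin(i * step) for i, aa in enumerate(peptide))
    return math.hypot(x, y) / len(peptide)


peptides = {
    "Prochlorothrix hollandica EP": "VYGRDYFLQMRFSLFPD",
    "Moorea producens EP": "EQFFRRMRQSLNRAFSER",
}
for name, sequence in peptides.items():
    print(f"{name}: mean hydrophobic moment = {hydrophobic_moment(sequence):.2f}")
```

A larger value indicates that hydrophobic and hydrophilic residues segregate to opposite faces of the modeled helix; a uniform peptide gives a value near zero.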
CcaA Carbonate Dehydratase (Carbonic Anhydrase)
While the CcmM and CcmN are typically conserved and are needed for native carboxysome formation (Long et al., 2010; Kinney et al., 2012), CcaA deletion mutant cell lines can still form carboxysomes (So et al., 2002b). Such CcaA deletion mutant cells exhibit a high carbon dioxide-requiring (hcr) phenotype. The CcaA genes encode carbonic anhydrase, also called carbonate dehydratase. A schematic diagram of the carbonate dehydratase, CcaA, protein is shown in
An amino acid sequence for a Synechococcus elongatus PCC 7942 carbonate dehydratase (CcaA; Synpcc7942_1447; 30185 daltons) is available as accession number ABB57477.1 (see website at uniprot.org/uniprot/P27134)(SEQ ID NO:17).
A related CcaA carbonate dehydratase is available from Synechococcus elongatus that has at least 99% sequence identity to SEQ ID NO:17, as illustrated below.
This CcaA related protein from Synechococcus elongatus has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_011242423.1 (GI:499561640), and with the sequence shown below (SEQ ID NO:18).
A related CcaA carbonate dehydratase is available from Geminocystis herdmanii that has at least 55% sequence identity to SEQ ID NO:17, as illustrated below.
This CcaA related protein from Geminocystis herdmanii has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_017295030.1 (GI:515864402), and with the sequence shown below (SEQ ID NO:19).
A related CcaA carbonate dehydratase is available from Aliterella atlantica that has at least 74% sequence identity to SEQ ID NO:17, as illustrated below.
This CcaA related protein from Aliterella atlantica has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_045053064.1 (GI:769918643), and with the sequence shown below (SEQ ID NO:20).
A related CcaA carbonate dehydratase is available from Leptolyngbya boryana that has at least 74% sequence identity to SEQ ID NO:17, as illustrated below.
This CcaA related protein from Leptolyngbya boryana has a sequence that is available from the National Center for Biotechnology Information database (see website at ncbi.nlm.nih.gov) with accession number WP_017285834.1 (GI:515855206), and with the sequence shown below (SEQ ID NO:21).
Many cyanobacteria lack CcaA (Zarzycki et al., 2013) and its function can be replaced by the γ-CA domain of CcmM (Peña 2010).
CcmC Chimeric Protein
A streamlined carboxysome core, referred to as CcmC, is described herein that combines segments of several carboxysome components into a single chimeric protein. CcmC contains scaffolding domains (the SSLDs that are involved in nucleating Rubisco), an enzymatic domain (carbonic anhydrase), and an encapsulating domain (the EP).
VLFITCSDSR IDPNLITQSG MGELFVIRNA GNLIPPFGAA
NGGEGASIEY AIAALNIEHV VVCGHSHCGA MKGLLKLNQL
QEDMPLVYDW LQHAQATRRL VLDNYSGYET DDLVEILVAE
NVLTQIENLK TYPIVRSRLF QGKLQIFGWI YEVESGEVLQ
ISRTSSDDTG IDECPVRLPG SQEKAILGRC VVPLTEEVAV
APPEPEPVIA AVAAPPANYS SRGWLGSGGS VYGKEQFLRM
Note that amino acids 2-326 of the CcmC protein (with SEQ ID NO:22) are the same as the CcmM short form from Synechococcus elongatus PCC 7942 provided as SEQ ID NO:5. Similarly, amino acids 1-328 of the CcmC protein (with SEQ ID NO:22) are the same as amino acids 1-328 of the M35-EP protein with SEQ ID NO:37. The central amino acids 329-585 of the SEQ ID NO:38 CcmC protein correspond to amino acids 2-258 of the carbonate dehydratase (CcaA) with SEQ ID NO:71. Amino acids 591-608 of the SEQ ID NO:38 CcmC protein correspond to the encapsulation peptide (EP) from a CcmN protein, which has SEQ ID NO:13. Other M35, CcaA, and EP polypeptide segments can substitute for these M35, CcaA, and EP segments to form related CcmC proteins.
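By way of illustration only, the residue ranges recited above can be held in a small lookup structure and checked for internal consistency. In the sketch below, the class `Segment`, the list `CCMC_SEGMENTS`, and the function `tiles_without_gaps` are hypothetical names. Residues 586-590 are not assigned to a segment in the text and are labeled here as an inferred gap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self):
        return self.end - self.start + 1

    def extract(self, sequence):
        """Slice this segment out of a full-length amino acid sequence."""
        return sequence[self.start - 1:self.end]


# Coordinates as recited in the text; the 586-590 entry is inferred, not stated.
CCMC_SEGMENTS = [
    Segment("M35 region (SSLDs)", 1, 328),
    Segment("CcaA-derived region", 329, 585),
    Segment("unassigned residues (inferred gap)", 586, 590),
    Segment("encapsulation peptide (EP)", 591, 608),
]


def tiles_without_gaps(segments):
    """Return True if consecutive segments abut with no gap or overlap."""
    return all(b.start == a.end + 1 for a, b in zip(segments, segments[1:]))


for segment in CCMC_SEGMENTS:
    print(f"{segment.name}: {segment.start}-{segment.end} ({segment.length} residues)")
print("contiguous:", tiles_without_gaps(CCMC_SEGMENTS))
```

The CcaA-derived region spans the same number of residues as amino acids 2-258 of the carbonate dehydratase, which is consistent with the correspondence recited above.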
Such synthetic CcmC core proteins can support the assembly of functionally competent carboxysomes in cyanobacteria.
Such synthetic CcmC core proteins can have some sequence variation. For example, a CcmC core protein can have at least 40% sequence identity, or at least 50% sequence identity, or at least 60% sequence identity, or at least 70% sequence identity, or at least 80% sequence identity, or at least 90% sequence identity, or at least 95% sequence identity, or at least 96% sequence identity, or at least 97% sequence identity, or at least 98% sequence identity, or at least 99% sequence identity (or complementarity) with SEQ ID NO:22. Related CcmC proteins can have, for example, 60-99% sequence identity, or 70-99% sequence identity, or 80-99% sequence identity, or 90-95% sequence identity, or 90-99% sequence identity, or 95-97% sequence identity, or 95-98% sequence identity, or 97-99% sequence identity, or 95-99% sequence identity, or 95-100% sequence identity, or 96-100% sequence identity, or 97-100% sequence identity, or 100% sequence identity (or complementarity) with SEQ ID NO:22.
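By way of illustration only, a percent sequence identity of the kind recited above can be computed from two sequences that have already been aligned. The text does not specify an alignment method or a counting convention, so the sketch below makes two assumptions: the inputs are pre-aligned with "-" marking gaps, and identity is counted over aligned columns in which neither sequence has a gap. The function names and the short example strings are hypothetical and are not sequences from this document.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, minimum_percent):
    """True if the pair reaches a stated minimum, for example 95 for 'at least 95%'."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Hypothetical toy alignment: one mismatch in ten aligned columns.
reference = "ACDEFGHIKL"
variant = "ACDEYGHIKL"
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, 95))
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give different values for the same pair, so the convention should be stated alongside any reported percentage.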
Expression of multiple genes has previously been deemed to be necessary to assemble a BMC core in heterologous systems. However, the construct described herein has a streamlined design that functions to fix carbon even though it is smaller, and consists of a single polypeptide that has small subunit-like domains (SSLDs), Encapsulation peptide (EP), and carbonic anhydrase domains.
The more compact CcmC core protein can accommodate domain components with a variety of sequences related to those described herein. For example, a CcmC core protein can have SSLDs (small subunit-like domains), encapsulation peptide (EP), and carbonic anhydrase domains that have at least 40% sequence identity, or at least 50% sequence identity, or at least 60% sequence identity, or at least 70% sequence identity, or at least 80% sequence identity, or at least 90% sequence identity, or at least 95% sequence identity, or at least 96% sequence identity, or at least 97% sequence identity, or at least 98% sequence identity, or at least 99% sequence identity, or 60-99% sequence identity, or 70-99% sequence identity, or 80-99% sequence identity, or 90-95% sequence identity, or 90-99% sequence identity, or 95-97% sequence identity, or 97-99% sequence identity, or 100% sequence identity (or complementarity) with any of the SEQ ID NOs described herein.
Previous attempts to engineer bacterial microcompartments have focused on associating heterologous proteins to shell proteins using encapsulation peptides (EPs). For example, through the addition of two different EPs to pyruvate decarboxylase and alcohol dehydrogenase, Lawrence et al. were able to repurpose a propanediol utilization (PDU) compartment for ethanol production (Lawrence et al., 2014). Lin et al. showed that the encapsulation peptide from CcmN targets yellow fluorescent protein into carboxysome-like structures formed in mutant tobacco (Nicotiana benthamiana) plants (Lin et al., 2014b).
In contrast to such previous studies, the approach reported here focuses on assembling a multifunctional bacterial microcompartment core using a single polypeptide to nucleate assembly and provide key functions: CcmC nucleates Rubisco, supplies carbonic anhydrase activity, and recruits the shell. This approach allows the packaging of multiple protein domains within a shell using only a single encapsulation peptide (EP).
Shell Proteins
In some cases, it may be useful to express carboxysome shell protein(s) along with the CcmC chimeric core protein.
For example, a carbon dioxide concentrating mechanism protein CcmK and/or CcmL shell protein from Synechococcus elongatus PCC 7942 can be expressed along with the CcmC chimeric core protein. An example of a sequence for such a CcmK shell protein from Synechococcus elongatus PCC 7942 is provided below as SEQ ID NO:23 (see NCBI accession number ABB56317.1; GI:81167977).
An example of a sequence for such a CcmL shell protein from Synechococcus elongatus PCC 7942 is provided below as SEQ ID NO:24 (see NCBI accession number ABB57452.1; GI:81169112).
Such shell proteins can have some sequence variation. For example, such shell proteins can have at least 40% sequence identity, or at least 50% sequence identity, or at least 60% sequence identity, or at least 70% sequence identity, or at least 80% sequence identity, or at least 90% sequence identity, or at least 95% sequence identity, or at least 96% sequence identity, or at least 97% sequence identity, or at least 98% sequence identity, or at least 99% sequence identity, or 60-99% sequence identity, or 70-99% sequence identity, or 80-99% sequence identity, or 90-95% sequence identity, or 90-99% sequence identity, or 95-97% sequence identity, or 97-99% sequence identity, or 100% sequence identity (or complementarity) with SEQ ID NO:23 and/or SEQ ID NO:24.
Rubisco
In some cases, ribulose-1,5-bisphosphate carboxylase/oxygenase, abbreviated as Rubisco herein (also abbreviated as RuBPCase), can also be expressed with the chimeric core carboxysome CcmC protein. Rubisco is an enzyme that can be involved in carbon fixation, to provide building blocks for energy-rich molecules such as glucose. Rubisco can catalyze the carboxylation of ribulose-1,5-bisphosphate, and may be one of the most abundant enzymes on Earth.
For example, a ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) protein can be expressed along with the CcmC chimeric core protein. An example of a sequence for such a Rubisco protein from Synechococcus elongatus PCC 7942 is provided below as SEQ ID NO:25 (see NCBI accession number ABB57456.1; GI:81169116).
Expression
The chimeric carboxysome core protein, shell protein(s), Rubisco protein(s), and combinations thereof can be expressed from an expression cassette or expression vector. An expression cassette can include a nucleic acid segment that encodes a chimeric carboxysome core protein, shell protein, or Rubisco protein operably linked to a promoter to drive expression. In some cases, such polypeptide(s) can be expressed using convenient vectors, or expression systems. The invention therefore provides expression cassettes or vectors useful for expressing one or more chimeric carboxysome core protein, shell protein, or Rubisco protein.
For example, a nucleotide sequence that encodes the chimeric core carboxysome CcmC protein and that can be expressed in a variety of organisms, including Synechococcus elongatus PCC 7942, is shown below as SEQ ID NO:26.
Another nucleotide sequence is provided below that encodes the chimeric core carboxysome CcmC protein and that has been codon-optimized for expression in Escherichia coli (SEQ ID NO:27).
Another nucleotide sequence is provided below that encodes the chimeric core carboxysome CcmC protein and that has been codon-optimized for expression in Nicotiana tabacum (SEQ ID NO:28).
Another nucleotide sequence is provided below that encodes the chimeric core carboxysome CcmC protein and that has been codon-optimized for expression in Chlamydomonas reinhardtii (SEQ ID NO:29).
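By way of illustration only, codon-optimized variants of a coding sequence can be checked to confirm that they encode the same polypeptide. The sketch below assumes the standard genetic code and uses short hypothetical DNA strings, not SEQ ID NO:26-29; the function names `translate` and `encode_same_protein` are also hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence in frame; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a, dna_b):
    """True if two coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


# Hypothetical toy variants that use different synonymous codons.
variant_a = "ATGGTTCTGTAA"
variant_b = "ATGGTGTTATGA"
print(translate(variant_a), translate(variant_b))
print(encode_same_protein(variant_a, variant_b))
```

A check of this kind confirms only that the encoded protein is unchanged; it says nothing about expression level in a given host.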
The expression cassettes or vectors can include a promoter that is operably linked to a nucleic acid segment that encodes the chimeric core carboxysome CcmC protein. A promoter is a nucleotide sequence that controls expression of an operably linked nucleic acid sequence by providing a recognition site for RNA polymerase, and possibly other factors, required for proper transcription. A promoter includes a minimal promoter, consisting only of all basal elements needed for transcription initiation, such as a TATA-box and/or other sequences that serve to specify the site of transcription initiation. A promoter may be obtained from a variety of different sources. For example, a promoter may be derived entirely from a native gene, be composed of different elements derived from different promoters found in nature, or be composed of nucleic acid sequences that are entirely synthetic. A promoter may be derived from many different types of organisms and tailored for use within a given cell.
Any promoter able to direct transcription of an encoded peptide or polypeptide may be used. Accordingly, many promoters may be included within the expression cassette. Some useful promoters include constitutive promoters, inducible promoters, regulated promoters, cell specific promoters, viral promoters, and synthetic promoters. Particularly useful promoters are inducible promoters, especially those induced by inexpensive signals, or promoters that are auto-inducing under certain environmental conditions (e.g. a relatively dense cyanobacterial population).
For expression of one or more chimeric carboxysome core protein, shell protein, Rubisco protein, or combinations thereof in a host cell, one or more expression cassette can be used that has a nucleic acid segment encoding such protein(s) and a promoter operably linked thereto. Such a promoter can be any DNA sequence capable of binding an RNA polymerase and initiating the downstream (3′) transcription of a coding sequence into mRNA. A promoter has a transcription initiation region that is usually placed proximal to the 5′ end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site and a transcription initiation site. A second domain called an operator may be present and overlap an adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits negatively regulated (inducible) transcription, as a gene repressor protein may bind the operator and thereby inhibit transcription of a specific gene.
Constitutive expression may occur in the absence of negative regulatory elements, such as the operator. In addition, positive regulation may be achieved by a gene activator protein binding sequence, which, if present, is usually proximal (5′) to the RNA polymerase binding sequence. An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate transcription of the lac operon in E. coli (Raibaud et al., Ann. Rev. Genet., 18:173 (1984)). Regulated expression may therefore be positive or negative, thereby either enhancing or reducing transcription.
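By way of illustration only, the negative and positive regulation described above can be summarized as a simple decision rule. The sketch below is a deliberately coarse qualitative model, not a quantitative description of any promoter: the class `Promoter`, the function `transcription_level`, and the three output labels are hypothetical, and repression is simplified to a complete shutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Promoter:
    has_operator: bool = False        # site where a repressor can bind
    has_activator_site: bool = False  # site where an activator can bind


def transcription_level(promoter, repressor_bound=False, activator_bound=False):
    """Return a qualitative level: 'off', 'basal', or 'enhanced'."""
    # Negative regulation: a repressor on the operator inhibits transcription.
    if promoter.has_operator and repressor_bound:
        return "off"
    # Positive regulation: a bound activator enhances transcription.
    if promoter.has_activator_site and activator_bound:
        return "enhanced"
    # No operator, or no repressor bound: constitutive (basal) expression.
    return "basal"


constitutive = Promoter()
regulated = Promoter(has_operator=True, has_activator_site=True)
print(transcription_level(constitutive, repressor_bound=True))
print(transcription_level(regulated, repressor_bound=True))
print(transcription_level(regulated, activator_bound=True))
```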
Other examples of promoters that can be employed include promoters of sugar metabolizing enzymes, such as galactose, lactose (lac) (Chang et al., Nature, 198:1056 (1977)), and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (Trp) (Goeddel et al., Nuc. Acids Res., 8:4057 (1980); Yelverton et al., Nuc. Acids Res., 9:731 (1981); U.S. Pat. No. 4,738,921; and EPO Publ. Nos. 036 776 and 121 775). The β-lactamase (bla) promoter system (Weissmann, “The cloning of interferon and other mistakes”, in: Interferon 3 (ed. I. Gresser), 1981), and bacteriophage lambda PL (Shimatake et al., Nature, 292:128 (1981)) and T5 (U.S. Pat. No. 4,689,406) promoter systems also provide useful promoter sequences. Another example is the Chlorella virus promoter (U.S. Pat. No. 6,316,224).
Synthetic promoters that do not occur in nature also function as promoters in host cells. For example, transcription activation sequences of a promoter may be joined with the operon sequences of another promoter, creating a synthetic hybrid promoter (U.S. Pat. No. 4,551,433). For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter and lac operon sequences that is regulated by the lac repressor (Amann et al., Gene, 25:167 (1983); de Boer et al., Proc. Natl. Acad. Sci. USA, 80:21 (1983)). Furthermore, a promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind RNA polymerase and initiate transcription in cyanobacteria or other types of host cells. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a coupled promoter system (Studier et al., J. Mol. Biol., 189:113 (1986); Tabor et al., Proc. Natl. Acad. Sci. USA, 82:1074 (1985)). In addition, a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator region (EPO Publ. No. 267 851).
In some cases, quorum sensing-responsive promoters can be employed in the expression cassettes/vectors. Quorum sensing is a mechanism whereby bacteria are able to indirectly detect the concentration of neighboring cells. A quorum sensing pathway is one that is usually activated when a bacterial population becomes concentrated. For example, biofilm formation is often controlled by quorum sensing. Such quorum sensing promoters can make bacteria, cyanobacteria, or other cells self-induce the genes of interest when a certain cell concentration is reached (e.g., when the cells are ready, or will soon be ready, to be harvested), without the addition of chemical inducers. See, e.g., Miller, Melissa B., and Bonnie L. Bassler. “Quorum sensing in bacteria.” Annual Reviews in Microbiology 55(1): 165-199 (2001).
In some cases, the promoter can become active at certain times during culture or fermentation. For example, the promoter can in some cases be active before, during, or after log phase growth of the cells during culture or fermentation.
For example, LuxI/LuxR genes are a family of genes that produce quorum sensing behavior in bacteria. See, e.g., Waters & Bassler, “Quorum sensing: cell-to-cell communication in bacteria.” Ann Rev Cell Dev Biol 21: 319-46 (2005). Quorum sensing pathways in natural contexts involve a microbe that is capable of producing a diffusible molecule that can pass through the cell membrane, such as the class of molecules called acyl-homoserine lactones (AHL). These molecules can diffuse from the cell that produces them to the outside environment, and then back into other neighboring bacteria. When the concentration of AHL of a specific type becomes high enough, it can stabilize a transcription factor that turns on specific genes. Usually, bacteria use quorum sensing pathways to sense how large their population is: the more surrounding bacteria in the environment, the higher the AHL levels. At a certain cell density, the AHL builds up to a level at which it can bind a receptor protein (e.g., LuxR), stabilizing it and allowing for downstream gene regulation.
Quorum sensing-responsive promoters can be used in any of the expression cassettes or expression vectors described herein. For example, host cells expressing LuxI (or similar protein) can make an AHL signal that could then build up as the cell density increases. When the cells become dense enough, they can turn on the expression of chimeric carboxysome core protein(s), shell protein(s), Rubisco protein(s), or combinations thereof.
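The density-dependent switch described above can be illustrated with a toy model. The sketch below is illustrative only: the logistic growth law, the assumption that AHL simply tracks cell density, the Hill-type promoter response, and every parameter value and function name are assumptions chosen for demonstration, not measurements or methods from this disclosure.

```python
import math


def logistic_density(t_hours, n0=0.05, capacity=2.0, rate=0.25):
    """Cell density (arbitrary OD-like units) under assumed logistic growth."""
    return capacity / (1.0 + (capacity / n0 - 1.0) * math.exp(-rate * t_hours))


def ahl_level(density, production=1.0):
    """Assume the AHL concentration simply tracks cell density."""
    return production * density


def promoter_activity(ahl, threshold=1.0, hill=4.0):
    """Fraction of maximal promoter output (assumed Hill-type response)."""
    return ahl ** hill / (threshold ** hill + ahl ** hill)


def induction_time(step=0.5, max_hours=72.0, on_fraction=0.5):
    """First sampled time (h) at which activity reaches on_fraction, else None."""
    t = 0.0
    while t <= max_hours:
        if promoter_activity(ahl_level(logistic_density(t))) >= on_fraction:
            return t
        t += step
    return None


if __name__ == "__main__":
    # Print density and promoter activity at a few time points.
    for hours in (0, 8, 16, 24):
        density = logistic_density(hours)
        activity = promoter_activity(ahl_level(density))
        print(hours, round(density, 3), round(activity, 3))
    print("induction time (h):", induction_time())
```

In this sketch the promoter stays nearly silent at low density and switches on as the culture approaches the assumed threshold, which mirrors the self-induction behavior described for LuxI/LuxR-type systems without any added chemical inducer.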
One example of a protein that can modulate quorum sensing-responsive promoters is the LuxI from Vibrio fischeri, with the following sequence (SEQ ID NO:30).
A nucleic acid encoding this Vibrio fischeri LuxI protein is shown below (SEQ ID NO:31).
A sequence of a LuxR receptor protein from Vibrio fischeri is shown below (SEQ ID NO:32).
A nucleic acid sequence for this LuxR protein from Vibrio fischeri is provided below as SEQ ID NO:33.
An example of a LuxR-responsive promoter from Vibrio fischeri is shown below as SEQ ID NO:34.
When LuxR is expressed and stabilized (because AHL is present), the LuxR protein binds to a promoter sequence like that shown above as SEQ ID NO:34 and drives gene expression from it.
It is understood that many promoters and associated regulatory elements may be used within the expression cassette/vector to transcribe an RNA encoding a chimeric carboxysome core protein. The promoters described above are provided merely as examples and are not to be considered as a complete list of promoters that are included within the scope of the invention.
The expression cassette of the invention may contain a nucleic acid sequence for increasing the translation efficiency of an mRNA encoding a chimeric carboxysome core protein. Such increased translation serves to increase production of the protein. The presence of an efficient ribosome binding site is useful for gene expression in prokaryotes. In bacterial mRNA, a conserved stretch of six nucleotides, the Shine-Dalgarno sequence, is usually found upstream of the initiating AUG codon (Shine et al., Nature, 254:34 (1975)). This sequence is thought to promote ribosome binding to the mRNA by base pairing between the ribosome binding site and the 3′ end of Escherichia coli 16S rRNA (Steitz et al., “Genetic signals and nucleotide sequences in messenger RNA”, in: Biological Regulation and Development: Gene Expression (ed. R. F. Goldberger), 1979). Such a ribosome binding site, or an operable derivative thereof, is included within the expression cassette of the invention.
A translation initiation sequence can be derived from any expressed gene and can be used within an expression cassette/vector of the invention. Preferably the gene from which the translation initiation sequence is obtained is a highly expressed gene. A translation initiation sequence can be obtained via standard recombinant methods, synthetic techniques, purification techniques, or combinations thereof, which are all well known (Ausubel et al., Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N.Y. (1989); Beaucage and Caruthers, Tetra. Letts., 22:1859 (1981); VanDevanter et al., Nucleic Acids Res., 12:6159 (1984)). Alternatively, translational start sequences can be obtained from numerous commercial vendors (Operon Technologies; Life Technologies Inc., Gaithersburg, Md.). In some embodiments, the T7 translation initiation sequence is used. The T7 translation initiation sequence is derived from the highly expressed T7 Gene 10 cistron and can have a sequence that includes TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATA (SEQ ID NO:35). Other examples of translation initiation sequences include, but are not limited to, the maltose-binding protein (Mal E gene) start sequence (Guan et al., Gene, 67:21 (1997)) present in the pMalc2 expression vector (New England Biolabs, Beverly, Mass.) and the translation initiation sequence for the following genes: thioredoxin gene (Novagen, Madison, Wis.), Glutathione-S-transferase gene (Pharmacia, Piscataway, N.J.), β-galactosidase gene, chloramphenicol acetyltransferase gene and E. coli Trp E gene (Ausubel et al., 1989, Current Protocols in Molecular Biology, Chapter 16, Green Publishing Associates and Wiley Interscience, N.Y.).
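As a rough illustration of how a Shine-Dalgarno-like element might be located in a translation initiation sequence, the sketch below scans the T7 sequence given above (SEQ ID NO:35) for a purine-rich core. The helper name `find_motif` and the choice of "AGGAG" as the query core are assumptions made for this example; other cores or mismatch tolerances may be more appropriate for a given host.

```python
def find_motif(sequence, motif="AGGAG", max_mismatches=0):
    """Return (0-based position, mismatches) for each window close to the motif."""
    seq = "".join(sequence.split()).upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        # Count positions where the window differs from the query motif.
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# T7 Gene 10 translation initiation sequence as listed in the text (SEQ ID NO:35).
T7_LEADER = "TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATA"

if __name__ == "__main__":
    for position, mismatches in find_motif(T7_LEADER):
        print(position, T7_LEADER[position:position + 5], mismatches)
```

Raising `max_mismatches` makes the scan more permissive, which can be useful when screening leaders from hosts whose ribosome binding sites diverge from the E. coli consensus.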
The invention therefore provides an expression cassette or vector that includes a promoter operable in a selected host and a nucleic acid encoding one of the chimeric carboxysome core proteins described herein. The expression cassette can have other elements, for example, termination signals, origins of replication, enhancers, and the like as described herein. The expression cassette can also be placed in a vector for easy replication and maintenance.
An expression cassette or nucleic acid construct of the invention is thought to be particularly advantageous for inducing expression of the polypeptides.
Host Organisms
The chimeric carboxysome core protein can be expressed by a variety of organisms. Examples of organisms that can be modified to express the chimeric carboxysome core protein can include microorganisms, plants (including land-based plants and aqueous plants), and fungi. For example, bacteria, cyanobacteria, algae, microalgae, seaweed, plankton, single-celled fungal cells, multi-celled fungi, plant cells, and multi-celled plants can be modified to express the chimeric carboxysome core protein.
In some cases, the chimeric carboxysome core protein can be expressed in addition to native or endogenous carboxysome components.
Any cyanobacteria can be modified to express the chimeric carboxysome core protein, either permanently or transiently.
Examples of cyanobacterial species that can be modified include Synechococcus elongatus sp. PCC 7942; Synechococcus elongatus 7002; Synechococcus elongatus UTEX 2973; Arthrospira platensis; and Leptolyngbya sp. strain BL0902. Synechococcus elongatus sp. PCC 7942 is one of the dominant model organisms, providing a variety of useful genetic tools. Synechococcus elongatus 7002 is a well-developed model organism with improved productivity and resilience. Synechococcus elongatus UTEX 2973 is related to S. elongatus 7942, and it has greatly improved growth properties. Arthrospira platensis is perhaps the most broadly utilized cyanobacterium in scaled applications. Leptolyngbya sp. strain BL0902 is a bioindustrial strain whose genetic make-up is not as well-studied as some of the model cyanobacterial species.
Further examples of cyanobacterial species that can be modified include, for example, any of those in Table 1.
Species or strain | Genus or lineage |
---|---|
Synechococcus elongatus sp. PCC 7942 | Synechococcus |
Synechococcus elongatus UTEX 2973 | Synechococcus |
Arthrospira platensis | Arthrospira |
Prochlorococcus marinus str. AS9601 | Prochlorococcus |
Acaryochloris marina | |
Anabaena sp. PCC 7120 | |
Anabaena variabilis | |
Synechococcus sp. | Synechococcus |
Cyanothece sp. ATCC | Cyanothece |
Chlorobium tepidum | Chlorobaculum |
Synechococcus sp. | Synechococcus |
Cyanothece sp. PCC | Cyanothece |
Synechococcus sp. | |
Gloeobacter violaceus | |
Prochlorococcus marinus MED4 | Prochlorococcus |
Microcystis aeruginosa | Microcystis |
Prochlorococcus marinus MIT9313 | Prochlorococcus |
Prochlorococcus marinus str. NATL1A | Prochlorococcus |
Arthrospira platensis | Arthrospira; Arthrospira platensis |
Nostoc punctiforme | |
Prochlorococcus marinus str. MIT 9211 | Prochlorococcus |
Prochlorococcus marinus str. MIT 9215 | Prochlorococcus |
Prochlorococcus marinus str. MIT 9301 | Prochlorococcus |
Prochlorococcus marinus str. MIT 9303 | Prochlorococcus |
Prochlorococcus marinus str. MIT 9515 | Prochlorococcus |
Synechococcus elongatus PCC 6301 | Synechococcus |
Cyanothece sp. PCC | Cyanothece |
Cyanothece sp. PCC | Cyanothece |
Prochlorococcus marinus str. NATL2A | Prochlorococcus |
Prochlorococcus marinus str. MIT 9312 | Prochlorococcus |
Rhodopseudomonas palustris CGA009 | Rhodopseudomonas |
Prochlorococcus marinus SS120 | Prochlorococcus |
Synechococcus sp. | Synechococcus |
Synechococcus sp. | Synechococcus |
Synechocystis sp. PCC | Synechocystis |
Synechococcus sp. PCC | Synechococcus |
Synechococcus elongatus PCC 7942 | Synechococcus |
Synechococcus sp. | Synechococcus |
Synechococcus sp. WH | Synechococcus |
Trichodesmium erythraeum IMS101 | Trichodesmium; Trichodesmium erythraeum |
Thermosynechococcus elongatus BP-1 | Thermosynechococcus |
Synechococcus sp. | Synechococcus |
Useful Products
The cells, plants, cyanobacteria, bacteria, algae, microalgae and other cells/organisms that express the fusion proteins described herein can produce a variety of products such as oils, carbohydrates, grains, vegetables, fruits and other components, as well as 3-phosphoglycerate (3-PGA). Examples include oils (fatty acids), alkenes, polyhydroxybutyrate, biomass, carbohydrates, phycocyanin, ethanol, hydrogen, isobutanol, ethylene, and combinations thereof. Products such as oils (fatty acids), alkenes, ethanol, hydrogen, isobutanol, ethylene, and combinations thereof can be used in manufacturing and as biofuels. For example, ethanol, carbohydrate feedstocks, and biomass can be used to make bioethanol. Polyhydroxybutyrate is useful, for example, in bioplastics. Biomass, carbohydrates, and ethanol can also be used in foods and food manufacturing. Ethanol, hydrogen, isobutanol, and ethylene are useful in manufacturing, as a source of energy, and/or for making fuel.
The following non-limiting Examples describe some of the experiments performed.
This Example describes some of the methods that were used during development of the invention.
Cyanobacterial Strain and Growth Conditions
Synechococcus elongatus PCC 7942 (Syn 7942) cultures were grown in 250 ml baffled Erlenmeyer flasks with 60 ml BG-11 medium (Rippka et al., 1979) buffered with 10 mM HEPES pH 8.0 under the following growth chamber settings: temperature of 30° C., light intensity of 40 μmoles photons m−2s−1, shaking at 150 rpm and CO2 concentrations of 5%, 3% or air. Unless otherwise indicated, experiments were performed in cultures at exponential growth phase (OD730=0.4-0.7).
Mutant Generation
Synechococcus elongatus PCC 7942 cells were transformed as described by Kufryk et al. (2002). Cultures were grown to OD730=0.5 and concentrated to OD730=2.5 by centrifugation at 5000 relative centrifugal force (rcf) for 5 minutes. Five microliters of plasmids (˜1 μg of DNA) prepared from E. coli DH5α cells were added to 400 μl of the cyanobacterial cell suspension and incubated for 6 hours. The 400 μl-aliquots were dried on Nucleopore track-etched polycarbonate membranes (GE Healthcare) on top of BG-11 plates and incubated for 12-24 hours. The membranes were transferred to BG-11 plates with the proper selectable marker until resistant colonies were obtained.
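The concentration step above (OD730 of 0.5 brought to 2.5) implies a five-fold reduction in volume. The following is a minimal sketch of that arithmetic, assuming optical density scales linearly with cell density over this range and that no cells are lost during centrifugation; the function name is hypothetical.

```python
def culture_volume_needed(target_od, target_volume_ul, start_od):
    """Volume of starting culture (in microliters) to pellet and resuspend.

    Assumes OD is proportional to cell density and that pelleting is lossless.
    """
    if start_od <= 0 or target_od <= 0 or target_volume_ul <= 0:
        raise ValueError("optical densities and volume must be positive")
    return target_od * target_volume_ul / start_od


if __name__ == "__main__":
    # One 400-microliter transformation aliquot at OD730 = 2.5
    # prepared from a culture at OD730 = 0.5.
    volume = culture_volume_needed(2.5, 400, 0.5)
    print(volume, "microliters of culture per aliquot")
```

Under these assumptions, each 400 μl aliquot corresponds to about 2 ml of the original culture.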
All mutant strains were transformed with pJCC008 plasmid (rbcL-GFP placed under the control of the ccmK2 promoter) (Cameron et al., 2013) for GFP-labeling of the large subunit of Rubisco (RbcL) to enable carboxysome visualization by fluorescence microscopy. The carboxysome-minus strain COREΔ2/RbcL-GFP was generated by replacing synpcc7942_1423 and synpcc7942_1424 genes with a kanamycin resistance/sucrose sensitivity cassette obtained from the pPSBAII-KS plasmid (Lagarde et al., 2000) and using synpcc7942_1422 and synpcc7942_1425 sequences as flanking regions for double homologous recombination. Domains for the generation of chimeric proteins were assigned using the InterPro software (Hunter et al., 2012) and the HMM tool from the JCVI (see website at blast.jcvi.org/web-hmm).
DNA was obtained from Cyanobase (see website at genome.microbedb.jp/cyanobase) and cloned by methods involving restriction digestion and ligation (see, e.g., Sambrook and Russell, 2001) as follows.
Plasmids with genes coding for the chimeric proteins had the following amino acid sequences.
The following is an amino acid sequence for a CcaA-M35 gene (SEQ ID NO:36).
The following is an amino acid sequence for a CcmC protein (SEQ ID NO:38).
VLFITCSDSR IDPNLITQSG MGELFVIRNA GNLIPPFGAA
NGGEGASIEY AIAALNIEHV VVCGHSHCGA MKGLLKLNQL
QEDMPLVYDW LQHAQATRRL VLDNYSGYET DDLVEILVAE
NVLTQIENLK TYPIVRSRLE QGKLQIFGWI YEVESGEVLQ
ISRTSSDDTG IDECPVRLPG SQEKAILGRC VVPLTEEVAV
APPEPEPVIA AVAAPPANYS SRGWLGSGGS VYGKEQFLRM
Note that amino acids 1-328 of the CcmC protein (with SEQ ID NO:38) are the same as amino acids 1-328 of the M35-EP protein with SEQ ID NO:37. The central amino acids 329-585 (in bold) of the SEQ ID NO:38 CcmC protein correspond to amino acids 2-258 of the carbonate dehydratase (CcaA) with SEQ ID NO:71. Amino acids 591-608 of the SEQ ID NO:38 CcmC protein correspond to the encapsulation peptide from a CcmN protein, which has SEQ ID NO: 13.
Note also that in the case of CcmC the C-terminal extension of the β-CA was used as a linker, and its terminal 14 amino acids were replaced by 18 amino acids comprising the EP. Constructs encoding the chimeric proteins, with synpcc7942_1422 and synpcc7942_1425 sequences as flanking regions, were transformed into the COREΔ2/RbcL-GFP strain.
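The residue ranges stated above for the CcmC chimera can be checked for internal consistency with a short script. This is only a bookkeeping sketch: the segment labels and helper names are hypothetical, and the total length of 608 residues is an assumption based on the encapsulation peptide being described as ending at residue 608. The text does not say what residues 586-590 are, so the sketch simply reports them as unassigned.

```python
# (label, first residue, last residue) in CcmC numbering, taken from the text.
SEGMENTS = [
    ("M35 portion shared with M35-EP (SEQ ID NO:37)", 1, 328),
    ("CcaA residues 2-258 (SEQ ID NO:71)", 329, 585),
    ("encapsulation peptide from CcmN", 591, 608),
]


def segment_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def unassigned_ranges(segments, total_length):
    """Inclusive residue ranges not covered by any listed segment."""
    gaps = []
    next_expected = 1
    for _, start, end in sorted(segments, key=lambda s: s[1]):
        if start > next_expected:
            gaps.append((next_expected, start - 1))
        next_expected = max(next_expected, end + 1)
    if next_expected <= total_length:
        gaps.append((next_expected, total_length))
    return gaps


if __name__ == "__main__":
    for label, start, end in SEGMENTS:
        print(label, segment_length(start, end))
    print("unassigned:", unassigned_ranges(SEGMENTS, total_length=608))
```

The CcaA-derived block spans the same number of residues in both numbering schemes (329-585 and 2-258), and the encapsulation peptide range spans 18 residues, which agrees with the 18 amino acids mentioned for the EP.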
Growth in air was used for positive selection and growth in 5% sucrose as confirmation. The COREΔ2/CcmC/RbcL-GFP strain is obtained after CcmC restores growth in air. CcaA (Synpcc7942_1447) was interrupted in the COREΔ2/CcmC/RbcL-GFP strain and in Wild-type/RbcL-GFP by insertion of a gentamycin resistance cassette and selection with 5 μg/ml gentamycin in solid BG-11 plates (resulting in COREΔ3/CcmC/RbcL-GFP strain and ΔCcaA/RbcL-GFP strain, respectively). Primers used are described in Table 2.
Structural Modeling
The predicted domains obtained (
Spectrophotometric Measurements
Culture growth was monitored as the change in optical density at 730 nm (OD730). Chl a concentration was determined by absorbance measurements (at 663 nm) of methanol extracts from 1-ml culture aliquots and calculated according to Lichtenthaler (Lichtenthaler, 1987). Total cell spectra were obtained from 1-ml aliquots of cultures in exponential growth phase, which were diluted to OD730=0.3, and the obtained spectra were normalized to that of Chl a (OD3). Doubling times were calculated using the exponential regression curve fitting online tool available at website doubling-time.com/compute.php. All measurements were performed at least in triplicate from aliquots from different cultures (using the same inoculum from a BG-11 agar plate). All measurements were performed in a Nanodrop2000C spectrophotometer (Thermo Scientific, USA).
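For readers who prefer to compute doubling times locally, the sketch below fits an exponential growth curve by ordinary least squares on log-transformed OD730 values. It is not necessarily the algorithm used by the online tool named above; the function name and the example readings are hypothetical.

```python
import math


def doubling_time(times_h, od_values):
    """Doubling time (h) from a least-squares fit of ln(OD) against time."""
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need at least two matching time/OD pairs")
    logs = [math.log(v) for v in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    growth_rate = sxy / sxx  # slope of ln(OD) versus time, per hour
    if growth_rate <= 0:
        raise ValueError("culture is not growing; doubling time undefined")
    return math.log(2) / growth_rate


if __name__ == "__main__":
    # Hypothetical readings from a culture that doubles every 8 hours.
    hours = [0, 4, 8, 12]
    readings = [0.1 * 2 ** (t / 8) for t in hours]
    print(round(doubling_time(hours, readings), 3))
```

Only readings taken during exponential growth should be passed in, since the log-linear fit assumes a constant growth rate.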
PCR and Immunoblot Analysis
Standard PCR was performed as described in the manufacturer's protocol using EconoTaq Plus Green 2X (Lucigen, USA) and gene-specific primer pairs (Table 2). For protein extraction, pellets from 50 ml culture aliquots were resuspended in 1 ml of lysis buffer (25 mM HEPES-NaOH pH 7.15 mM CaCl2, 5 mM MgCl2, 15% Glycerol, 200 μM PMSF and cOmplete, Mini protease inhibitor (Roche)) and broken in a BeadBug homogenizer (Biospec Products, USA), by beating for 6 cycles of 30 seconds with 2 minutes of incubation on ice between cycles. After 20 minutes of centrifugation at 20000 rcf, 15-μl aliquots plus SDS loading dye were loaded onto an acrylamide gel (without boiling the sample) for SDS-PAGE. SDS-PAGE and immunoblot analysis were performed according to the manufacturer's protocol (BioRad's bulletin 6376) using a polyclonal antibody from rabbit against Syn 7942 CcmM (dilution 1:5000) (Rothamsted Research, UK) as a primary antibody and Goat Anti-Rabbit IgG-HRP (Dilution 1:7000) (Life Tech. #656120) as secondary antibody and 1-Step Ultra TMB-Blotting Solution as substrate (Thermo #37574). For densitometries, total protein extract samples from three independent cultures were normalized according to the peak absorbance at 663 nm, loaded at four decreasing serial dilutions, and blotted as described using Anti-RbcL antibody (Agrisera Cat. AS03 037) at a dilution of 1:10000. Densitometry measurements were performed on the different immunoblots using ImageJ software (Schneider et al., 2012).
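One way band intensities from a dilution series could be compared between strains is sketched below. This is an assumed analysis, not the procedure stated in the text: it fits a line through the origin to intensity versus relative load for each strain and reports the ratio of slopes. The two-fold dilution steps, the intensity values, and the function names are all hypothetical.

```python
def slope_through_origin(relative_loads, band_intensities):
    """Least-squares slope of intensity versus load for a line through zero."""
    if len(relative_loads) != len(band_intensities) or not relative_loads:
        raise ValueError("need matching, non-empty load and intensity lists")
    numerator = sum(x * y for x, y in zip(relative_loads, band_intensities))
    denominator = sum(x * x for x in relative_loads)
    return numerator / denominator


def fold_change(relative_loads, reference_intensities, sample_intensities):
    """Sample signal per unit load divided by reference signal per unit load."""
    reference = slope_through_origin(relative_loads, reference_intensities)
    sample = slope_through_origin(relative_loads, sample_intensities)
    return sample / reference


if __name__ == "__main__":
    loads = [1.0, 0.5, 0.25, 0.125]            # hypothetical two-fold dilutions
    wild_type = [100.0, 50.0, 25.0, 12.5]      # hypothetical band intensities
    mutant = [220.0, 110.0, 55.0, 27.5]        # hypothetical band intensities
    print(round(fold_change(loads, wild_type, mutant), 2))
```

Using the whole dilution series rather than a single lane reduces sensitivity to one saturated or faint band, provided the signal stays in the linear range of detection.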
Oxygen Evolution
Two-ml aliquots were harvested from exponential-phase cultures, supplemented with 10 mM bicarbonate prior to the measurement, and the steady-state rate of oxygen evolution was determined at saturating light intensity (950 μmoles photons m−2 s−1) and 30° C. using an LMI-6000 illuminator (Dolan-Jenner, USA) and an Oxygraph Plus Clark-type electrode (Hansatech, UK).
Fluorescence and Electron Microscopy
Cultures grown to OD730=0.5 in 3% CO2 were transferred to air and grown overnight. For fluorescence microscopy, 1-ml aliquots were concentrated by centrifugation (1500 rcf for 5 minutes), resuspended in 100 μl of BG-11, and visualized (autofluorescence and GFP) using a Zeiss Axio Observer.D1 inverted microscope. For electron microscopy, pellets from 50-ml aliquots were chemically fixed with 2% glutaraldehyde in 50 mM phosphate buffer for 2 hours at room temperature, followed by 1% osmium tetroxide for 2 hours at room temperature, and block stained with 2% aqueous uranyl acetate overnight at 4° C. Cells were dehydrated in an increasing acetone series (2 minutes at 37° C.; 20% acetone increments) and embedded in Spurr's resin (15 minutes at 37° C.; 25% increments) using an MS-9000 Laboratory Microwave Oven (Electron Microscopy Science, USA). Sections (70 nm thick) were cut on a MYX ultramicrotome (RMC Products, USA), positively stained with 6% uranyl acetate and Reynolds lead citrate (Reynolds, 1963) and visualized on a JEM 100CX II transmission electron microscope (JEOL) equipped with an Orius SC200-830 CCD camera (Gatan Inc., USA).
Quantum Efficiency of Photosystem II
Fv/Fm was determined in triplicate using 4-ml culture aliquots from biological replicates at exponential phase in cells dark adapted for three minutes as described previously (Cameron et al., 2013). Briefly, aliquots were diluted with BG-11 immediately before dark adaptation to a chlorophyll concentration of ˜1-2 μg/ml and measured using an Aquapen AP100 (Photon Systems Instruments, Czech Republic). Measurement started at time=0 h when the cultures were transferred from 3% CO2 to air.
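For reference, Fv/Fm is conventionally computed from the minimal (F0) and maximal (Fm) fluorescence of dark-adapted cells as (Fm − F0)/Fm. The sketch below applies that standard definition to triplicate readings and reports the mean and sample standard deviation; the function names and the fluorescence values are hypothetical and are not data from these experiments.

```python
import statistics


def fv_fm(f0, fm):
    """Maximum quantum efficiency of photosystem II: (Fm - F0) / Fm."""
    if fm <= 0 or f0 < 0 or f0 > fm:
        raise ValueError("expected 0 <= F0 <= Fm and Fm > 0")
    return (fm - f0) / fm


def summarize_replicates(readings):
    """Mean and sample standard deviation of Fv/Fm over (F0, Fm) pairs."""
    values = [fv_fm(f0, fm) for f0, fm in readings]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical (F0, Fm) readings from three biological replicates.
    triplicate = [(300.0, 500.0), (310.0, 500.0), (290.0, 500.0)]
    mean_value, sd_value = summarize_replicates(triplicate)
    print(round(mean_value, 3), round(sd_value, 3))
```

Instruments such as the one named above typically report Fv/Fm directly; the calculation is shown only to make the quantity explicit.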
Sequences
Sequences can be found in the GenBank/EMBL data libraries. For example, an amino acid sequence for a Synechococcus elongatus PCC 7942 carbonate dehydratase (CcmM: Synpcc7942_1423) is available as accession number ABB57453 (SEQ ID NO:69).
An amino acid sequence for a Synechococcus elongatus PCC 7942 carbon dioxide concentrating mechanism protein (CcmN: Synpcc7942_1424) is available as accession number ABB57454 (SEQ ID NO:70).
An amino acid sequence for a Synechococcus elongatus PCC 7942 Carbonate dehydratase (CcaA; Synpcc7942_1447) is available as accession number ABB57477.1 (SEQ ID NO:71).
This Example describes construction of chimeric proteins that assemble into a carboxysome core.
The design took into consideration observations that proteins evolve via domain fusions that are reflective of protein-protein interactions. The inventors predicted the domain boundaries in the CcmM, CcmN and CcaA proteins from Synechococcus elongatus PCC 7942 (
1) a ccaA-M35 fusion construct, where the Y-CA domain (Pfam00132) of CcmM was replaced by β-CA (Pfam00484) (
2) a M35-EP fusion construct, where three SSLD domains (Pfam00101) and their native linkers were fused to the EP (
3) M35-ccaA(short)-EP fusion construct, containing three SSLDs and their native linkers, the β-CA, CcaA with a short segment of its C-terminal tail as a linker, and the EP from the C-terminus of CcmN (
A gene coding for a green fluorescent protein (GFP)-labeled large subunit of Rubisco (rbcL-GFP) was inserted into each strain for in vivo visualization of carboxysome formation by fluorescence microscopy (Savage et al., 2010). To test whether the chimeric proteins can assemble into a carboxysome core, the Synechococcus elongatus PCC 7942 ccmM and ccmN were replaced with selectable marker genes (COREΔ2/RbcL-GFP strain; high CO2-requiring (hcr) phenotype). The chimeric genes were then transformed via double homologous recombination to replace the selectable markers of the COREΔ2/RbcL-GFP strain (placing the genes under the same regulation as the ccm operon genes) using growth in air for positive selection. In the case of ccaA-M35, the ccmN gene was reintroduced in the same vector.
Only M35-ccaA(short)-EP expression was able to rescue the hcr phenotype. This construct was named CcmC where the final “C” was for chimeric (
The presence or absence of ccmM, ccmN and ccaA was confirmed by PCR. Sequencing of the region between ccmL and ccmO further indicated that ccmC was integrated into the ccm operon. The CCM insertion site sequence is shown below (SEQ ID NO:72), where the ccmC DNA insert is identified in bold and with underlining, and the portion of the genomic ccmK2 gene disrupted by the ccmC DNA insert is shown in bold (at the beginning of the SEQ ID NO:72 sequence).
AGCCGCGGCA GTCAAGCGCG CCATGTGCGC GATTGTCAGG
AACGACCGGT TGATGCAGCT GTCATTGCCA TCATCGATAC
GGTCAACGTG GAAAACCGCT CCGTCTACGA CAAACGCGAG
CACAGCTAAT GGGCAGGGAT TGAATCCCTG CTGGTCATTG
GTGAGCGCTT ATAACGGCCA AGGCCGACTC AGTTCCGAAG
TCATCACCCA AGTCCGGAGT TTGCTGAACC AGGGCTATCG
GATTGGGACG GAACATGCGG ACAAGCGCCG CTTCCGGACT
AGCTCTTGGC AGCCCTGCGC GCCGATTCAA AGCACGAACG
AGCGCCAGGT CTTGAGCGAA CTGGAAAATT GTCTGAGCGA
ACACGAAGGT GAATACGTTC GCTTGCTCGG CATCGATACC
AATACTCGCA GCCGTGTTTT TGAAGCCCTG ATTCAACGGC
CCGATGGTTC GGTTCCTGAA TCGCTGGGGA GCCAACCGGT
GGCAGTCGCT TCCGGTGGTG GCCGTCAGAG CAGCTATGCC
AGCGTCAGCG GCAACCTCTC AGCAGAAGTG GTCAATAAAG
TCCGCAACCT CTTAGCCCAA GGCTATCGGA TTGGGACGGA
ACATGCAGAC AAGCGCCGCT TTCGGACTAG CTCTTGGCAG
TCCTGCGCAC CGATTCAAAG TTCGAATGAG CGCCAGGTTC
TGGCTGAACT GGAAAACTGT CTGAGCGAGC ACGAAGGTGA
GTACGTTCGC CTGCTGGGCA TCGACACTGC TAGCCGCAGT
CGTGTTTTTG AAGCCCTGAT CCAAGATCCC CAAGGACCGG
TGGGTTCCGC CAAAGCGGCC GCCGCACCTG TGAGTTCGGC
AACGCCCAGC AGCCACAGCT ACACCTCAAA TGGATCGAGT
TCGAGCGATG TCGCTGGACA GGTTCGGGGT CTGCTAGCCC
AAGGCTACCG GATCAGTGCG GAAGTCGCCG ATAAGCGTCG
CTTCCAAACC AGCTCTTGGC AGAGTTTGCC GGCTCTGAGT
GGCCAGAGCG AAGCAACTGT CTTGCCTGCT TTGGAGTCAA
TTCTGCAAGA GCACAAGGGT AAGTATGTGC GCCTGATTGG
GATTGACCCT GCGGCTCGTC GTCGCGTGGC TGAACTGTTG
ATTCAAAAGC CGGGATCTCG CAAGCTCATC GAGGGGTTAC
GGCATTTCCG TACGTCCTAC TACCCGTCTC ATCGGGACCT
GTTCGAGCAG TTTGCCAAAG GTCAGCACCC TCGAGTCCTG
TTCATTACCT GCTCAGACTC GCGCATTGAC CCTAACCTCA
TTACCCAGTC GGGCATGGGT GAGCTGTTCG TCATTCGCAA
CGCTGGCAAT CTGATCCCGC CCTTCGGTGC CGCCAACGGT
GGTGAAGGGG CATCGATCGA ATACGCGATC GCAGCTTTGA
ACATTGAGCA TGTTGTGGTC TGCGGTCACT CGCACTGCGG
TGCGATGAAA GGGCTGCTCA AGCTCAATCA GCTGCAAGAG
GACATGCCGC TGGTCTATGA CTGGCTGCAG CATGCCCAAG
CCACCCGCCG CCTAGTCTTG GATAACTACA GCGGTTATGA
GACTGACGAC TTGGTAGAGA TTCTGGTCGC CGAGAATGTG
CTGACGCAGA TCGAGAACCT TAAGACCTAC CCGATCGTGC
GATCGCGCCT TTTCCAAGGC AAGCTGCAGA TTTTTGGCTG
GATTTATGAA GTTGAAAGCG GCGAGGTCTT GCAGATTAGC
CGTACCAGCA GTGATGACAC AGGCATTGAT GAATGTCCAG
TGCGTTTGCC CGGCAGCCAG GAGAAAGCCA TTCTCGGTCG
TTGTGTCGTC CCCCTGACCG AAGAAGTGGC CGTTGCTCCA
CCAGAGCCGG AGCCTGTGAT CGCGGCTGTG GCGGCTCCAC
CCGCCAACTA CTCCAGTCGC GGTTGGTTGG GATCTGGAGG
CAGTGTCTAC GGCAAGGAAC AGTTTTTGCG GATGCGCCAG
AGCATGTTCC CCGATCGCTA A
GATGTGCAC AGCAGCTCTA
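The relationship between the nucleotide sequence above and the CcmC protein can be spot-checked by translating a short in-frame fragment with the standard genetic code. The sketch below is a minimal, dependency-free translator (the helper names are hypothetical). The fragment is copied from the sequence lines above, and its translation matches the beginning of the CcmC protein fragment listed earlier (VLFITCSDSR IDPNL...).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 0; whitespace is ignored and '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# In-frame fragment copied from the SEQ ID NO:72 lines above.
FRAGMENT = "GTCCTG TTCATTACCT GCTCAGACTC GCGCATTGAC CCTAACCTC"

if __name__ == "__main__":
    print(translate(FRAGMENT))  # VLFITCSDSRIDPNL
```

The same function can be applied to the 3′ end of the insert, where the final codons translate to a stop, to confirm where the open reading frame terminates.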
The portion of the sequence of the ccmL gene at the ccmC integration site is shown below (SEQ ID NO:73).
Protein screening by immunoblot using polyclonal anti-CcmM antibodies showed no cross-reactivity with a total protein extract of the COREΔ2/RbcL-GFP strain, confirming the absence of those proteins (
This Example illustrates assembly of CcmC into functioning carboxysomes.
Fluorescence and transmission electron microscopy were used to assay for formation of carboxysomes (
In contrast, abundant GFP-labeled carboxysomes were observed in the mutant strains COREΔ2/CcmC/RbcL-GFP and COREΔ3/CcmC/RbcL-GFP (
The average carboxysome number (fluorescent puncta across the longitudinal plane) per cell in the wild-type/RbcL-GFP strain was 3.7±1.1 (
The amount of Rubisco protein per mg Chlorophyll a (Chl a) in the different strains was compared by immunoblotting using antibodies against the large subunit RbcL. Both COREΔ2/CcmC/RbcL-GFP and COREΔ3/CcmC/RbcL-GFP strains contained more than 2-fold more RbcL than the Wild-type/RbcL strain (
Analysis by transmission electron microscopy further confirmed carboxysome formation of native (
The chimeric carboxysomes were smaller than wild type carboxysomes. As illustrated in
Abnormally shaped carboxysomes were occasionally observed (“rod carboxysomes”) in the CcmC strains but these have also been observed in wild type cyanobacteria (Gantt and Conti, 1969). Researchers have proposed that such rod carboxysomes may be a type of intermediate during carboxysome formation (Chen et al., 2013). Based on studies by the inventors, these rod carboxysomes could also be indicative of a deficiency in CA activity, as carboxysome aggregation and morphological variation were observed in the control strain ΔCcaA/RbcL-GFP (data not shown).
To determine if the reengineered carboxysomes function comparably to the Wild-type/RbcL-GFP carboxysomes, the growth of cells was analyzed at the exponential growth phase under high CO2 (5%) and low CO2 (air) conditions. No growth difference was observed between the strains when incubated in high CO2 (
This Example illustrates some of the physiological characteristics of a triple deletion strain containing carboxysomes with synthetic cores (COREΔ3/CcmC/RbcL-GFP).
The COREΔ3/CcmC/RbcL-GFP strain has pigmentation differences when compared to Wild-type/RbcL-GFP (
The relative photosynthetic capacities of photosystem II were measured through quantification of chlorophyll fluorescence in dark adapted cells (Fv/Fm) upon transfer of the cultures from 3% CO2 to air (
As illustrated in
As an additional, complementary measure of photosynthetic activity, the oxygen evolution rates of air-grown cultures were compared at high light intensity (950 μmoles photons m−2 s−1). As shown in
These results indicate that the altered composition of the core has a net effect on the physiology of the cell relative to the Wild-type/RbcL-GFP control. Nevertheless, the reengineered core is immediately able to effectively support functional carboxysome assembly (
All patents and publications referenced or mentioned herein are indicative of the levels of skill of those skilled in the art to which the invention pertains, and each such referenced patent or publication is hereby specifically incorporated by reference to the same extent as if it had been incorporated by reference in its entirety individually or set forth herein in its entirety. Applicants reserve the right to physically incorporate into this specification any and all materials and information from any such cited patents or publications.
The following statements are intended to describe and summarize various embodiments of the invention according to the foregoing description in the specification.
Statements
The specific compositions and methods described herein are representative, exemplary and not intended as limitations on the scope of the invention. Other objects, aspects, and embodiments will occur to those skilled in the art upon consideration of this specification, and are encompassed within the spirit of the invention as defined by the scope of the claims. It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The terms and expressions that have been employed are used as terms of description and not of limitation, and there is no intent in the use of such terms and expressions to exclude any equivalent of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention as claimed. Thus, it will be understood that although the present invention has been specifically disclosed by embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims and statements of the invention.
The invention illustratively described herein may be practiced in the absence of any element or elements, or limitation or limitations, which is not specifically disclosed herein as essential. The methods and processes illustratively described herein may be practiced in differing orders of steps, and the methods and processes are not necessarily restricted to the orders of steps indicated herein or in the claims.
As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a plant” or “a seed” or “a cell” includes a plurality of such plants, seeds or cells, and so forth. In this document, the term “or” is used to refer to a nonexclusive or, such that “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated.
Under no circumstances may the patent be interpreted to be limited to the specific examples or embodiments or methods specifically disclosed herein. Under no circumstances may the patent be interpreted to be limited by any statement made by any Examiner or any other official or employee of the Patent and Trademark Office unless such statement is specifically and without qualification or reservation expressly adopted in a responsive writing by Applicants.
The invention has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.
The Abstract is provided to comply with 37 C.F.R. § 1.72(b) to allow the reader to quickly ascertain the nature and gist of the technical disclosure. The Abstract is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
This application is a continuation of U.S. patent application Ser. No. 15/685,742, filed Aug. 24, 2017, which claims benefit of priority to the filing date of U.S. Provisional Application Ser. No. 62/378,979, filed Aug. 24, 2016, the contents of which are specifically incorporated herein by reference in their entirety.
This invention was made with government support under DE-FG02-91ER20021 awarded by the U.S. Department of Energy. The government has certain rights in the invention.
Number | Name | Date | Kind |
---|---|---|---|
20180057546 | Kerfeld et al. | Mar 2018 | A1 |
Number | Date | Country |
---|---|---|
WO-2011094765 | Aug 2011 | WO |
WO-2014182968 | Nov 2014 | WO |
WO-2016077589 | May 2016 | WO |
Entry |
---|
Gonzalez-Esquer et al. (Plant Cell, vol. 27, pp. 2637-2644, Sep. 2015, Epub Aug. 29, 2015). |
Gonzalez-Esquer et al. (Plant Cell, vol. 27, Supplemental Data, Sep. 2015, Epub Aug. 29, 2015). |
“U.S. Appl. No. 15/685,742, Non Final Office Action dated Mar. 6, 2019”, 16 pgs. |
“U.S. Appl. No. 15/685,742, Notice of Allowability dated Sep. 19, 2019”, 4 pgs. |
“U.S. Appl. No. 15/685,742, Notice of Allowance dated Jul. 29, 2019”, 9 pgs. |
“U.S. Appl. No. 15/685,742, Response filed Jan. 23, 2019 to Restriction Requirement dated Nov. 30, 2018”, 6 pgs. |
“U.S. Appl. No. 15/685,742, Response filed Jun. 6, 2019 to Non Final Office Action dated Mar. 6, 2019”, 15 pgs. |
“U.S. Appl. No. 15/685,742, Restriction Requirement dated Nov. 30, 2018”, 7 pgs. |
“U.S. Appl. No. 15/685,742, Supplemental Amendment filed Jul. 15, 2019 to Non Final Office Action dated Mar. 6, 2019”, 4 pgs. |
Aussignargues, Clement, et al., “Bacterial microcompartment assembly: The key role of encapsulation peptides”, Communicative & Integrative Biology 8:3, e1039755, (May/Jun. 2015). |
Axen, Seth D., et al., “A Taxonomy of Bacterial Microcompartment Loci Constructed by a Novel Scoring Method”, PLOS Comput. Biol. 10(10), e1003898., (Oct. 2014), 1-20. |
Baker, Neil R., et al., “Chlorophyll Fluorescence: A Probe of Photosynthesis In Vivo”, Annu. Rev. Plant Biol., 59, (2008), 89-113. |
Biasini, Marco, et al., “SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information”, Nucleic Acids Research, vol. 42, (2014), W252-W258. |
Cai, Fei, et al., “Advances in Understanding Carboxysome Assembly in Prochlorococcus and Synechococcus Implicate CsoS2 as a Critical Component”, Life, 5, (2015), 1141-1171. |
Cai, Fei, “Production and Characterization of Synthetic Carboxysome Shells with Incorporated Luminal Proteins”, Plant Physiology, vol. 170, (Mar. 2016), 1868-1877. |
Cai, Fei, et al., “The Pentameric Vertex Proteins Are Necessary for the Icosahedral Carboxysome Shell to Function as a CO2 Leakage Barrier”, PLoS ONE 4(10), e7521, (Oct. 2009), 1-9. |
Cameron, Jeffrey C., “Biogenesis of a Bacterial Organelle: The Carboxysome Assembly Pathway”, Cell, 155, (2013), 1131-1140. |
Chen, Anna H., et al., “The Bacterial Carbon-Fixing Organelle is Formed by Shell Envelopment of Preassembled Cargo”, PLoS ONE, 8(9), e76127, (Sep. 2013), 1-13. |
Cheng, Shouqiang, et al., “Bacterial microcompartments: their properties and paradoxes”, BioEssays, 30(11-12), (2008), 1084-1095. |
Dragosits, Martin, et al., “Adaptive laboratory evolution – principles and applications for biotechnology”, Microbial Cell Factories, 12: 64, (2013), 17 pgs. |
Drews, G., et al., “Beitrage zur Cytologie der Blaualgen. II. Mitteilung. Zentroplasma und granulare Einschlusse von Phormidium uncinatum [Cytology of Cyanophyceae. II. Centroplasm and granular inclusions of Phormidium uncinatum]”, Archiv fur Mikrobiologie, 24, (1956), 147-162. |
Frank, Stefanie, et al., “Bacterial microcompartments moving into a synthetic biological world”, J. Biotechnol., 163(2), (2013), 273-279. |
Gantt, E., et al., “Ultrastructure of Blue-Green Algae”, J. Bacteriol., 97(3), (1969), 1486-1493. |
Hunter, S., et al., “InterPro in 2011: new developments in the family and domain prediction database”, Nucleic Acids Res., 40(Database Issue), (2012), D306-D312. |
Kerfeld, Cheryl A., et al., “Bacterial microcompartments and the modular construction of microbial metabolism”, Trends Microbiol., 23, (2015), 22-34. |
Kinney, James N., et al., “Elucidating Essential Role of Conserved Carboxysomal Protein CcmN Reveals Common Feature of Bacterial Microcompartment Assembly”, J. Biol. Chem. 287(21), (2012), 17729-17736. |
Kufryk, G. I., “Transformation of the cyanobacterium Synechocystis sp. PCC 6803 as a tool for genetic mapping: optimization of efficiency”, FEMS Microbiol. Lett., 206, (2002), 215-219. |
Lagarde, Delphine, et al., “Increased Production of Zeaxanthin and Other Pigments by Application of Genetic Engineering Techniques to Synechocystis sp. Strain PCC 6803”, Appl. Environ. Microbiol., 66(1), (2000), 64-72. |
Landgraf, Dirk, et al., “Segregation of molecules at cell division reveals native protein localization”, Nat. Methods, 9(5), (2012), 480-482. |
Lawrence, Andrew D., et al., “Solution Structure of a Bacterial Microcompartment Targeting Peptide and Its Application in the Construction of an Ethanol Bioreactor”, ACS Synthetic Biology, 3(7), (2014), 454-465. |
Lichtenthaler, Hartmut K., “[34] Chlorophylls and Carotenoids: Pigments of Photosynthetic Biomembranes”, Methods in Enzymology, 148, (1987), 350-382. |
Lin, Myat T., et al., “A faster Rubisco with potential to increase photosynthesis in crops”, Nature, 513(7519), (2014), 547-550. |
Lin, Myat T., et al., “β-Carboxysomal proteins assemble into highly organized structures in Nicotiana chloroplasts”, Plant J., 79(1), (2014), 1-12. |
Lluch-Senar, Maria, et al., “Defining a minimal cell: essentiality of small ORFs and ncRNAs in a genome-reduced bacterium”, Mol. Syst. Biol., 11: 780, (2015), 1-7. |
Long, Benedict M., et al., “Analysis of Carboxysomes from Synechococcus PCC7942 Reveals Multiple Rubisco Complexes with Carboxysomal Proteins CcmM and CcaA”, J. Biol. Chem., 282(40), (2007), 29323-29335. |
Long, Benedict M., et al., “Functional Cyanobacterial β-Carboxysomes Have an Absolute Requirement for Both Long and Short Forms of the CcmM Protein”, Plant Physiology, 153, (May 2010), 285-293. |
Marsh, Joseph A., “Protein Complexes Are under Evolutionary Selection to Assemble via Ordered Pathways”, Cell, 153, (2013), 461-470. |
Ngo, et al., “In the Protein Folding Problem and Tertiary Structure Prediction”, K. Merz., and S. Le Grand (eds.), (1994), 492-495. |
Pena, Kerry L., et al., “Structural basis of the oxidative activation of the carboxysomal γ-carbonic anhydrase, CcmM”, Proc. Natl. Acad. Sci. USA, 107(6), (2010), 2455-2460. |
Pettersen, Eric F., et al., “UCSF Chimera—a visualization system for exploratory research and analysis”, J Comput Chem., 25(13), (2010), 1605-1612. |
Price, G. Dean, et al., “Advances in understanding the cyanobacterial CO2-concentrating-mechanism (CCM): functional components, Ci transporters, diversity, genetic regulation and prospects for engineering into plants”, Journal of Experimental Botany, 59(7), (2008), 1441-1461. |
Price, G. D., et al., “Isolation and Characterization of High CO2-Requiring-Mutants of the Cyanobacterium Synechococcus PCC7942”, Plant Physiology, 91, (1989), 514-525. |
Price, G. Dean, et al., “The cyanobacterial CCM as a source of genes for improving photosynthetic CO2 fixation in crop species”, Journal of Experimental Botany, 64(3), (2013), 753-768. |
Reynolds, E. S., et al., “The use of lead citrate at high pH as an electron-opaque stain in electron microscopy”, J. Cell Biol. 17, (1963), 208-212. |
Rippka, Rosmarie, et al., “Generic Assignments, Strain Histories and Properties of Pure Cultures of Cyanobacteria”, J. Gen. Microbiol., 111, (1979), 1-61. |
Rokney, Assaf, “E. coli transports aggregated proteins to the poles by a specific and energy-dependent process”, J. Mol. Biol., 392(3), (2009), 589-601. |
Savage, Didi F., et al., “Spatially ordered dynamics of the bacterial carbon fixation machinery”, Science, 327(5970), (2010), 1258-1261. |
Schneider, C. A., et al., “NIH Image to ImageJ: 25 years of image analysis”, Nat. Methods 9(7), (2012), 671-675. |
So, Anthony K.-C., et al., “Characterization of a mutant lacking carboxysomal carbonic anhydrase from the cyanobacterium Synechocystis PCC6803”, Planta 214(3), (2002), 456-467. |
So, Anthony K.-C., et al., “Characterization of the C-terminal extension of carboxysomal carbonic anhydrase from Synechocystis sp. PCC6803”, Funct. Plant Biol., 29(3), (2002), 183-194. |
Takahashi, Shunichi, et al., “Interruption of the Calvin cycle inhibits the repair of Photosystem II from photodamage”, Biochim. Biophys. Acta (BBA)—Bioenergetics, 1708(3), (2005), 352-361. |
Zarzycki, Jan, et al., “Cyanobacterial-based approaches to improving photosynthesis in plants”, Journal of Experimental Botany, 64(3), (2013), 787-798. |
Number | Date | Country |
---|---|---|
20210206816 A1 | Jul 2021 | US |
Number | Date | Country |
---|---|---|
62378979 | Aug 2016 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15685742 | Aug 2017 | US |
Child | 16670125 | | US |