The present invention relates to glycoside hydrolase enzymes. Recombinant nucleic acids containing a cDNA that encodes a novel and highly active arabinoxylanase or a cDNA that encodes a novel and highly active xyloglucanase are provided. Glycoside hydrolases provided herein find a variety of uses including the direct enzymatic processing of the lignocellulosic materials derived from plant feed stock. Recombinant gene cassettes for expression of the glycoside hydrolase enzymes by recombinant microbes that function to carry out such processes in bioreactors are also provided.
The rumen, the foregut of herbivorous ruminant animals such as cattle, functions as a bioreactor to process complex plant material, and fibrolytic enzymes are essential for the digestion of cellulosic biomass in the ruminant diet. A suite of enzymes is required to produce a variety of free sugars (exo-enzymes) as well as oligosaccharides (endo-enzymes) for metabolism by rumen bacteria and subsequent digestion by the ruminant host. Among the numerous and diverse microbes involved in ruminal digestion are the ruminal protozoans, which are single-celled, ciliated eukaryotic organisms.
Also, a broad range of specific classes of glycoside hydrolases are required to effect processing of biofuel feedstocks. One of the critical, time-consuming and rate-limiting steps in development of industrial-scale biofuel production is identification and biochemical characterization of the diverse glycosyl hydrolases required.
To process complex fibrous plant materials, the rumen harbors a complex collection of diverse microorganisms (reviewed by 33, 45). While the diversity and functions of the thousands (32) of microbial species of this unique ecosystem are interesting from both evolutionary and functional perspectives, the rumen also represents a rich resource of enzymes for converting lignocellulosic feedstocks into biofuel (35, 43) and other applications (19). A range of inexpensive, robust enzymes with a broad range of specificities will likely be required for efficient industrial processing of highly complex plant polysaccharides. Identification of such enzymes that microorganisms use to break down plant materials has been greatly facilitated by metagenomics (42), both in the form of activity-based screens (20, 52) or through increasingly powerful, high-throughput genomic DNA sequencing approaches (e.g., 28, 57). As evidenced by numerous studies (e.g., 28, 39, 41), metagenomics has proven to be particularly effective for identification of carbohydrate-active genes of fiber-adherent bacterial species of the rumen.
In addition to bacteria and archaea, the rumen also hosts eukaryotic species, namely anaerobic fungi and ciliate protozoa (reviewed by 33). Addressing the function of ruminal protozoa in particular has been a challenge due to the difficulty of maintaining these organisms in axenic cultures (55). Thus, the diversity and dynamics of ruminal protozoa have historically been assessed by morphogenic studies (reviewed by 12) and molecular phylogenetics (e.g., using 18S rDNA markers; 47). Ruminal protozoa are known to contribute to fiber degradation in their hosts (21), and determination and characterization of their ability to directly process plant material has been addressed by diverse strategies, such as direct, biochemical detection of specific fibrolytic enzymes (e.g., cellulases) in extracts derived from individual protozoan species (e.g., 38, 54), by molecular cloning studies to directly identify genes encoding enzymes capable of degrading cellulose or hemicellulose (e.g., 49, 50) and, most recently, by sequencing of protozoan-derived EST libraries (41). Early studies to establish the capacity of protozoan species to express their own enzymes for degradation of plant material include that of Howard et al. (29), who demonstrated that Epidinium ecaudatum (E. ecaudatum) indeed contains fibrolytic enzyme activity. Similarly, Bailey et al. (3) demonstrated the presence of both a hemicellulase and a xylobiase in E. ecaudatum using purified cell extracts. More recently, Clayet et al. (8), using gel filtration of E. ecaudatum extracts, identified at least ten distinct enzyme activities for plant cell wall degradation; their fractions contained a range of enzymes with glycoside hydrolase (GH) activities, including two distinct carboxymethylcellulases with molecular weights of 23 and 45 kDa (8).
Altogether, about a dozen protozoan fibrolytic genes have been identified in activity-based molecular screens; a comparable number have been identified in informatics-based studies (41), predominantly in ovine and bovine rumen systems. The protozoan enzyme genes characterized to date are diverse, both in terms of the individual GH domains (27) utilized, as well as the combinatorial domain organization of proteins that contain them. GH domains are modular by design; they exist in individual polypeptides in variable copy numbers and variable association with other, non-catalytic modules (e.g., carbohydrate binding domains; reviewed in 27). The described rumen protozoan-derived GH domain genes primarily encode single- or dual-GH5 domains (cellulase superfamily; e.g., 49, 50, 53), or GH10 or GH11 domains (xylanase-related domains; e.g., 4, 14, 15). The combinatorial complexity of fibrolytic genes thus far detected in ruminal ciliates speaks to the potentially diverse utilization of GH modules within the entire ruminal protozoan population (4, 14, 15, 41, 49, 50, 53). Yet, due in part to the importance of demonstrating the existence of fibrolytic genes in a given protozoan species, enzyme cloning studies have largely been conducted using mono-faunated animals, in which the host ruminant is inoculated with a single ciliate species.
Therefore, there is a need to identify and provide a cDNA encoding an enzyme with a substrate specificity that is valuable in the development of various industrial processes for the processing of lignocellulosic materials derived from plant feed stocks.
The invention provides a bovine protozoan glycoside hydrolase cDNA with a substrate specificity that is highly valuable in the development of various industrial processes for the processing of lignocellulosic materials derived from plant feed stocks. Uses include development of complex enzyme cocktails for the direct enzymatic processing of these materials, and use as a gene cassette for expression by recombinant microbes that function to carry out such processes in bioreactors.
Novel glycoside hydrolase enzymes were identified during the course of an activity-based metagenomic screen that was executed to identify genes encoding fibrolytic enzymes present in the metatranscriptome of a bovine ruminal protozoan-enriched cDNA expression library. Among the novel glycoside hydrolase genes identified was a cDNA encoding an enzyme active against a hemicellulose substrate, xylan. Further, more detailed biochemical analyses were performed and indicated that the cDNA (named the Type 2-8.6 cDNA) encodes a novel and highly active arabinoxylanase, which is proven to be highly valuable in various biofuel-related industrial processes.
The rumen, the foregut of herbivorous ruminant animals (e.g., cattle), is a complex ecosystem that functions as a bioreactor to effect processing of complex plant material. Among the numerous and diverse microbes involved in ruminal digestion are the ruminal protozoans, which are single-celled, ciliated eukaryotic organisms. We executed an activity-based screen to identify genes encoding fibrolytic enzymes present in the metatranscriptome of a bovine ruminal protozoan-enriched cDNA expression library. Of the four novel genes identified, two were characterized in biochemical assays. Our results provide evidence for the effective use of functional metagenomics to retrieve novel enzymes from microbial populations that cannot be maintained in axenic cultures.
Therefore, to investigate the potential diversity of fibrolytic enzymes in a total ciliate population, we conducted an activity-based metagenomics screen of the meta-transcriptome of protozoa in the rumen fluid derived from a single, fistulated cow.
Recombinant DNAs encoding the novel glycoside hydrolase (glycosidase) enzymes are provided herein. In certain embodiments, the recombinant nucleic acids can comprise a heterologous promoter that is operably linked to a gene encoding any one of:
Recombinant nucleic acids provided herein can comprise DNA or RNA molecules. In certain embodiments, the promoter provides for expression of the glycosidase in a bacterial cell, a yeast cell, a plant cell, a fungal cell, an algal cell, a protozoan cell, or a mammalian cell. In certain embodiments where the recombinant DNA encodes a protein having at least 85% amino acid sequence identity to SEQ ID NO: 6 and glycosidase activity, the glycosidase can comprise at least one GH5 domain. In certain embodiments where the recombinant DNA encodes a protein having at least 85% amino acid sequence identity to SEQ ID NO: 6 and glycosidase activity, the glycosidase activity comprises a xyloglucanase activity. In certain embodiments where the recombinant DNA encodes a protein having at least 85%, 90%, 95%, 98%, 99%, or 100% amino acid sequence identity to SEQ ID NO: 8 and glycosidase activity, the glycosidase can comprise at least one GH10 domain. In certain embodiments where the recombinant DNA encodes a protein having at least 85% amino acid sequence identity to SEQ ID NO: 8 and glycosidase activity, the glycosidase activity can comprise an arabinoxylanase activity. In certain embodiments where the recombinant DNA encodes a protein having at least 70% amino acid sequence identity to SEQ ID NO: 10 and glycosidase activity, the glycosidase can comprise at least two GH11 domains. In certain embodiments where the recombinant DNA encodes a protein having at least 75% amino acid sequence identity to SEQ ID NO: 12 and glycosidase activity, the glycosidase can comprise at least two GH11 domains. Also provided herein are transformed cells comprising any of the aforementioned recombinant nucleic acids. In certain embodiments, the transformed cell is a bacterial cell, a yeast cell, an algal cell, a protozoan cell, a plant cell, a fungal cell, or a mammalian cell. In certain embodiments, the promoter provides for constitutive and/or inducible expression of the protein in the cell.
Also provided are methods of making a glycosidase comprising the steps of:
Also provided herein are methods for degrading lignocellulosic, cellulosic, and/or hemicellulosic materials with any of the aforementioned transformed cells. In certain embodiments, a method of degrading lignocellulosic, cellulosic, and/or hemicellulosic materials can comprise culturing the transformed cell in the presence of lignocellulosic, cellulosic, and/or hemicellulosic materials under conditions that provide for accumulation of the protein in the cell or in the cell culture medium and for at least partial hydrolysis of lignocellulosic, cellulosic, and/or hemicellulosic materials. In certain embodiments, a method of degrading hemicellulose can comprise culturing the transformed cell in the presence of hemicellulose under conditions that provide for accumulation of the protein in the cell or in the cell culture medium and for at least partial hydrolysis of xyloglucans in said hemicellulose. In certain embodiments, the cell that provides for at least partial hydrolysis of xyloglucans is a cell transformed with a recombinant nucleic acid encoding a protein having at least 85%, 90%, 95%, 98%, 99%, or 100% amino acid sequence identity to SEQ ID NO: 6 and glycosidase activity. In certain embodiments, a method of degrading hemicellulose can comprise culturing the transformed cell in the presence of hemicellulose under conditions that provide for accumulation of the protein in the cell or in the cell culture medium and for at least partial hydrolysis of arabinoxyloglucans in said hemicellulose. In certain embodiments, the cell that provides for at least partial hydrolysis of arabinoxyloglucans is a cell transformed with a recombinant nucleic acid encoding a protein having at least 85%, 90%, 95%, 98%, 99%, or 100% amino acid sequence identity to SEQ ID NO: 8 and glycosidase activity. In certain embodiments, the lignocellulose, cellulose, and/or hemicellulose is obtained from plant biomass. In certain embodiments, the plant biomass is selected from the group consisting of corn fiber, corn stover, wheat straw, rice straw, rice bran, switchgrass, wood, and sugarcane bagasse.
Also provided herein are isolated proteins encoded by any of the aforementioned recombinant nucleic acids. Isolated proteins provided herein include:
Also provided herein are methods for degrading lignocellulosic, cellulosic, and/or hemicellulosic materials with any of the glycosidases encoded by the aforementioned recombinant DNAs. In certain embodiments, methods for degrading lignocellulosic, cellulosic, and/or hemicellulosic materials can comprise incubating any of:
As used herein, the term “heterologous”, when used in the context of two nucleic acid or protein sequences, refers to sequences that are not contiguous to one another in nature. For example, a yeast secretion signal peptide sequence and a protozoan polypeptide sequence are heterologous because the two sequences are not naturally contiguous.
As used herein, the phrase “isolated protein” refers to a protein that has been separated from its naturally occurring cellular host.
As used herein, the term “glycosidase” refers to an enzyme that can hydrolyse at least one substrate in a group comprising lignocelluloses, celluloses, hemicelluloses, xylan, beta-glucan, carboxymethylcellulose, arabinoxylan, xyloglucan, and derivatives thereof. The term “glycosidase” and the phrase “glycosyl hydrolase” are used interchangeably herein.
The phrase “operably linked” as used herein refers to the joining of nucleic acid sequences such that one sequence can provide a required function to a linked sequence. In the context of a promoter, “operably linked” means that the promoter is connected to a sequence of interest such that the transcription of that sequence of interest is controlled and regulated by that promoter. When the sequence of interest encodes a protein and when expression of that protein is desired, “operably linked” means that the promoter is linked to the sequence in such a way that the resulting transcript will be efficiently translated. If the linkage of the promoter to the coding sequence is a transcriptional fusion and expression of the encoded protein is desired, the linkage is made so that the first translational initiation codon in the resulting transcript is the initiation codon of the coding sequence. Alternatively, if the linkage of the promoter to the coding sequence is a translational fusion and expression of the encoded protein is desired, the linkage is made so that the first translational initiation codon contained in the 5′ untranslated sequence associated with the promoter is linked such that the resulting translation product is in frame with the translational open reading frame that encodes the protein desired. 
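The in-frame requirement for a translational fusion described above can be illustrated with a short sketch. It is not part of the disclosed constructs: the function name, the index convention (0-based positions in a DNA-alphabet transcript), and the toy sequences are assumptions made only for illustration.

```python
# Illustrative sketch only: checks whether an initiation codon in the
# 5' untranslated sequence reads in frame into a downstream coding sequence.
# Names, indices, and sequences are hypothetical, not taken from the disclosure.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(transcript: str, upstream_atg: int, cds_start: int) -> bool:
    """Return True if the ATG at `upstream_atg` reads into `cds_start` in frame
    and no stop codon lies between the two positions."""
    seq = transcript.upper()
    # The upstream position must actually be an initiation codon.
    if seq[upstream_atg:upstream_atg + 3] != "ATG":
        return False
    # The distance between the two positions must be a whole number of codons.
    if cds_start < upstream_atg or (cds_start - upstream_atg) % 3 != 0:
        return False
    # No stop codon may interrupt the reading frame before the coding sequence.
    for i in range(upstream_atg, cds_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


if __name__ == "__main__":
    in_frame = "GGATGGCTAAAATGCCCTAA"    # upstream ATG at 2, coding sequence at 11
    shifted = "GGATGGCTAAATGCCCTAA"      # one base deleted, coding sequence at 10
    print(is_in_frame_fusion(in_frame, 2, 11))
    print(is_in_frame_fusion(shifted, 2, 10))
```

For a transcriptional fusion the same check is unnecessary, because the first initiation codon in the transcript is the initiation codon of the coding sequence itself.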
Nucleic acid sequences that can be operably linked include, but are not limited to, sequences that provide gene expression functions (i.e., gene expression elements such as promoters, 5′ untranslated regions, introns, protein coding regions, 3′ untranslated regions, polyadenylation sites, and/or transcriptional terminators), sequences that provide for protein localization functions (signal peptides for extracellular secretion, organellar targeting peptides, and the like), inteins, sequences that provide DNA transfer and/or integration functions (i.e., site specific recombinase recognition sites, integrase recognition sites), sequences that provide for selective functions (i.e., antibiotic resistance markers, biosynthetic genes), sequences that provide scoreable marker functions (i.e., reporter genes), sequences that facilitate in vitro or in vivo manipulations of the sequences (i.e., polylinker sequences, site specific recombination sequences, homologous recombination sequences), and sequences that provide replication functions (i.e., bacterial origins of replication, autonomous replication sequences, centromeric sequences).
Various recombinant nucleic acids encoding glycosidases useful for degradation of lignocellulose, cellulose, and hemicellulose are provided herein. In certain embodiments, the recombinant nucleic acids can comprise a heterologous promoter that is operably linked to a gene encoding any one of:
While the recombinant nucleic acids will typically comprise DNA molecules, the use of recombinant RNAs including, but not limited to, viral RNA vectors is also provided herein. Such viral vectors can comprise a promoter recognized by an RNA-dependent RNA polymerase that is operably linked to an RNA encoding any one of the aforementioned proteins. In certain embodiments, the glycosidase encoding sequence can also be operably linked to a sequence encoding a signal peptide that provides for secretion of the glycosidase from a cell. Such signal peptides can be from a heterologous organism. In certain embodiments, the glycosidase encoding sequence can also be operably linked to a sequence encoding a polyadenylation site and/or transcription termination sequence. Useful polyadenylation site and/or transcription termination sequences can be obtained from a homologous or heterologous source.
Useful promoters that can be used in the recombinant DNAs include promoters that provide for expression of the glycosidase in a bacterial cell, a yeast cell, a plant cell, a fungal cell, an algal cell, a protozoan cell, or a mammalian cell. In certain embodiments, the promoter can be an inducible promoter. Exemplary and non-limiting bacterial promoters include, but are not limited to, bacteriophage, pTAC, pLAC, and pARA (arabinose inducible) promoters. Methanol inducible promoters can also be used in the recombinant DNA vectors. In certain embodiments, the methanol inducible promoters can comprise an AOX promoter (Alcohol Oxidase promoter), DHAS promoter (or DAS promoter) (dihydroxyacetone synthase promoter), FDH promoter (or FMDH promoter) (formate dehydrogenase promoter), MOX promoter (Methanol Oxidase promoter), ZZA1, PEX5-, PEX8-, and PEX14-promoters. Exemplary and non-limiting methanol inducible promoters also include, but are not limited to, promoters from yeast such as Pichia, Hansenula, Candida, and Torulopsis (U.S. Pat. Nos. 8,143,023, 5,750,372 and 6,001,590).
In certain embodiments, the encoded glycosidase protein will comprise at least one conserved protein sequence domain that is characteristic of proteins belonging to a certain glycosyl hydrolase superfamily. Recombinant nucleic acids encoding proteins having at least 85% amino acid sequence identity to SEQ ID NO: 6 and glycosidase activity can thus comprise at least one GH5 domain (i.e., a glycosyl hydrolase family 5 domain). A comparison between the SEQ ID NO:6 protein and a pfam00150 consensus sequence (SEQ ID NO: 18) that shows conserved sequence motifs is provided in
Also provided are proteins having at least 85%, 90%, 95%, 98%, 99% or 100% amino acid sequence identity to SEQ ID NO: 8 and glycosidase activity that comprise at least one GH10 domain. Other glycosidase proteins containing GH10 domains have been described (26, 27; Pollet et al. 2010, Ibid.; and also “pfam00331” on the World Wide Web (internet) at “ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=201160”). In certain embodiments, the glycosidase activity of proteins having at least 85% sequence identity to SEQ ID NO:8 can comprise a preference for arabinoxylan substrates. In certain embodiments, the glycosidase activity of proteins having at least 85% sequence identity to SEQ ID NO:6 can exhibit specific activities of at least about 300, 400, 500 to about 600 or 700 Units/mg towards arabinoxylan substrates.
Also provided are proteins having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% amino acid sequence identity to SEQ ID NO: 10 and glycosidase activity that comprise at least two GH11 domains. Other glycosidase proteins containing GH11 domains have been described (26, 27; Pollet et al. 2010, Ibid.; and also “pfam00457” on the World Wide Web (internet) at “ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=201240”). In certain embodiments, the glycosidase activity of proteins having at least 70% sequence identity to SEQ ID NO:10 can comprise activity towards xylan substrates.
Also provided are proteins having at least 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% amino acid sequence identity to SEQ ID NO: 12 and glycosidase activity that comprise at least two GH11 domains. Other glycosidase proteins containing GH11 domains have been described (26, 27; Pollet et al. 2010, Ibid.; and also “pfam00457” on the World Wide Web (internet) at “ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=201240”). In certain embodiments, the glycosidase activity of proteins having at least 75% sequence identity to SEQ ID NO:12 can comprise activity towards xylan substrates.
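The percent amino acid sequence identity thresholds recited above can be illustrated with a minimal sketch. The disclosure does not specify how identity is computed, so the convention used here (identical residues divided by the number of alignment columns, counting gapped columns) is an assumption, as are the function names and the toy sequences, which are not SEQ ID NO: 6, 8, 10, or 12.

```python
# Illustrative sketch only: percent identity over a pre-computed pairwise
# alignment. The denominator convention (all alignment columns except
# gap-versus-gap) is an assumption; other conventions give other values.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical toy sequence
    candidate = "MKTAYLAKQR"   # one substitution in ten residues
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 85.0))
```

In practice the alignment would come from an alignment program, and the identity value depends on that program's scoring parameters and on how gaps are counted.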
Also provided are deletion derivatives of a protein having at least 75% amino acid sequence identity to SEQ ID NO: 12 and glycosidase activity. In certain embodiments, these deletion derivatives can comprise any of the deletion type Type 4 cDNAs disclosed in
A variety of transformed cells containing the recombinant nucleic acid are also provided herein. In certain embodiments, the transformed cell can be a bacterial cell, a yeast cell, an algal cell, a protozoan cell, a plant cell, a fungal cell, or a mammalian cell. Transformed microorganisms that are particularly useful or adapted for degradation of lignocellulosic, hemicellulosic, and/or cellulosic materials in bioreactors are specifically contemplated. In certain embodiments, the transformed microorganism is a bacterium. Bacteria that can be transformed with the recombinant nucleic acids provided herein include, but are not limited to, Escherichia, Zymomonas, Streptomyces, Bacillus, Lactobacillus, Thermoanaerobacterium, and Clostridium species. In another embodiment, the recombinant nucleic acids can be used to transform Escherichia coli, Zymomonas mobilis, Bacillus stearothermophilus, or Clostridium thermocellum. In certain embodiments, the transformed microorganism is a yeast. Yeasts that can be transformed with the recombinant nucleic acids provided herein include, but are not limited to, Saccharomyces, Schizosaccharomyces, Pichia, Hansenula, Candida, Rhodotorula, Kluyveromyces, and Torulopsis species. In certain embodiments, the transformed microorganism is a fungal microorganism. Fungal microorganisms that can be transformed with the recombinant nucleic acids provided herein include, but are not limited to, Aspergillus, Trichoderma, Rhizopus, and Mucor species. In certain embodiments, the transformed microorganism is an algal microorganism. Algal microorganisms that can be transformed with the recombinant nucleic acids provided herein include, but are not limited to, Thraustochytrium and Schizochytrium species (Cheng et al. Microbiol. Res. 2012 Mar. 20; 167(3):179-86; US Patent Application Publication No. US20110086390). In certain embodiments, the transformed microorganism is a protozoan microorganism. Protozoan microorganisms that can be transformed with the recombinant nucleic acids provided herein include, but are not limited to, Tetrahymena species.
Transformed plant cells containing the recombinant nucleic acids and transformed plants comprising those recombinant nucleic acids are also provided herein. In certain embodiments, the recombinant nucleic acid will provide for regulated induction or activation of the encoded glycosidase protein on an as needed or as desired basis. In certain embodiments, the encoded glycosidase can be interrupted by, or fused to single or multiple Controllable InterVening Protein Sequence (CIVPS) or intein sequences. Controllable InterVening Protein Sequence (CIVPS) or intein sequences and their use in transgenic plants are described in U.S. Pat. No. 8,247,647, which is incorporated herein by reference in its entirety. Other compositions and methods for inducing or activating an enzyme that can be adapted to the glycosidases provided herein are disclosed in U.S. Pat. No. 7,102,057, which is also incorporated herein by reference in its entirety. Plant cells and plants that can be transformed with recombinant nucleic acids provided herein include, but are not limited to, corn, soy, cotton, rice, wheat, sorghum, sugarcane, switchgrass, poplar, aspen, coniferous plants, and the like.
Also provided herein are methods of using the glycosidases, or transformed microorganisms comprising recombinant nucleic acids encoding any of the glycosidases, to degrade lignocellulosic materials or derived cellulosic and/or hemicellulosic materials. Various methods for obtaining cellulosic and/or hemicellulosic materials from lignocellulosic material present in biomass have been described in the literature. Pre-treatments of lignocellulosic materials to render such material suitable for subsequent enzymatic hydrolysis of cellulose and hemicellulose components include, but are not limited to, concentrated acid, dilute acid, alkaline, sulfite, hydrogen peroxide, steam explosion (autohydrolysis), ammonia fiber explosion (AFEX), wet-oxidation, lime, liquid hot water, carbon dioxide explosion, and organic solvent treatments (see Saha, J Ind Microbiol Biotechnol (2003) 30: 279-291 and references cited therein). Techniques for obtaining cellulosic and/or hemicellulosic materials suitable for enzymatic digestion from lignocellulosic material present in biomass are also disclosed in U.S. Pat. Nos. 8,173,406, 8,133,393, 8,057,639, 7,998,713, and 5,411,594, each incorporated herein by reference in its entirety.
In methods for degrading lignocellulosic, cellulosic, or hemi-cellulosic materials provided herein, treatment of materials with any of the glycosidases can be supplemented by treatment with additional glycosidases with complementary substrate specificities. In certain embodiments, treatment with any of the glycosidases provided herein can be supplemented with treatment with one or more lignin degrading enzyme(s), cellulose degrading enzyme(s), and/or exoglucanase(s). Such treatments can be achieved by exposing the lignocellulosic, cellulosic, or hemi-cellulosic materials to glycosidase enzyme preparations, to transformed host cells comprising the glycosidases, or to conditioned cell culture media comprising the glycosidases obtained from the transformed host cells. Supplemental treatment with additional glycosidases with complementary substrate specificities can be simultaneous with, prior to, or after treatment with the glycosidases provided herein.
The following examples are included to demonstrate certain embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
Materials. Carboxymethyl cellulose (CMC), cellulose (fibrous, medium), galactan, laminarin from Laminaria digitata, mannan from Saccharomyces cerevisiae, and xylan from beechwood were obtained from Sigma-Aldrich. AZCL-HE-cellulose, AZCL-xylan (oat), arabinan (sugar beet), β-glucan (oat, medium viscosity), wheat arabinoxylan (medium viscosity) and xyloglucan from tamarind seed (amyloid) were obtained from Megazyme. Reagents for the reducing sugar assay, ammonium iron (III) sulfate dodecahydrate, 3-methyl-2-benzothiazolinone hydrazone hydrochloride hydrate, and sulfamic acid, were obtained from Sigma-Aldrich. Isopropyl β-D-1-thiogalactopyranoside (IPTG) was obtained from Gold Biotechnology.
Rumen Sample Collection, Protozoan Purification and mRNA Purification. Rumen protozoa were harvested by using a procedure based on (36), modified as follows: approximately 3 L of total rumen content (fluid and solids) were collected from a fistulated Holstein cow maintained at the University of Missouri Dairy Farm, in Columbia, Mo. The donor cow was fed a total mixed ration (TMR) of a common lactation diet that consisted of alfalfa haylage, corn silage, corn and protein, minerals and fat-soluble vitamins added to meet or exceed nutrient requirements. The rumen sample was collected in the morning, prior to feeding, into a pre-warmed canister, and transported to the lab within 30 minutes. Aliquots of the sample were treated in three pulses, 30 sec each, with a blender, and pressed through a double layer of cheesecloth to remove bulk solids. The resulting liquid (approximately 1 L) was supplemented to 1% maltose and 0.5% sucrose and then flocculated anaerobically for 1 hr at 39° C. After aspirating the floating feed particle layer, the liquid was dialyzed against eight changes of 39° C. Coleman anaerobic buffer (55) using a home-made 10-μm pore size NITEX filter cloth (Sefar America) bag with gentle agitation. The volume of the resulting material was ~50 mL and represented a concentrated mixture of protozoa, whose composition was verified by microscopic observation. Total RNA was isolated from this material using the TRIzol® Plus RNA Purification System (Invitrogen); mRNA was then purified using the magnetic bead-based FastTrack® 2.0 mRNA Isolation Kit (Invitrogen).
Lambda Zap-Based Protozoan cDNA Library Construction. A Lambda Zap II-based protozoan cDNA expression library was constructed by using the Zap cDNA synthesis kit (Stratagene catalog #200401), starting with 5 μg of polyadenylated mRNA (for detailed protocols, see on the world wide web (internet) site: “genomics.agilent.com/files/Manual/200401.pdf”). After size-fractionation of total cDNA by gel-exclusion chromatography using Sepharose® CL-2B gel filtration medium (procured from Stratagene) in a 1-mL disposable plastic pipet, cDNA fractions ranging from 0.5 to >10 kb were pooled and ligated into the prepared lambda vector (see below), and packaged using the ZAP-cDNA® Gigapack® III Gold Cloning Kit (Stratagene catalog number 200450), according to protocols provided by the manufacturer. After titering, 700,000 p.f.u. of the primary library were amplified on plates using standard lambda phage procedures (Stratagene Zap cDNA Synthesis manual), in order to generate the secondary library, used for activity-based screening.
Activity-based Screening for Fibrolytic Enzymes. To identify candidate fibrolytic enzymes, we utilized IPTG-inducible cDNA expression and plaque-based high-throughput screening on plates containing dye-linked insoluble polysaccharide substrates (44, 51). We screened for two classes of fibrolytic enzymes, cellulases and xylanases, using AZCL-HE-Cellulose and AZCL-Xylan (Oat), respectively (Megazyme, Inc.). The expression library was screened on Petri dishes containing NZYM bottom agar, supplemented with a 1× micronutrient solution (1000×MNS: 3.0 mM H3BO3; 0.46 mM MnCl2; 0.16 mM CuSO4; 0.6 mM ZnSO4; 0.1 mM Na2MoO4; 0.01 mM NiSO4; 0.01 mM CoCl2), and NZYM top agarose, supplemented with 1×MNS plus 20 mM IPTG. To identify cDNA clones encoding fibrolytic enzymes, the top agarose incorporated either AZCL-HE-Cellulose or AZCL-Xylan (Oat), at a final concentration of 0.3% (w/v). For each substrate, approximately one million p.f.u. were screened (10,000 p.f.u. per 150 mm plate). Over a 2-5 day incubation period at 37° C., 70 clones were picked that exhibited xylan-degrading activity and ten that exhibited cellulose-degrading activity. Positives were plaque-purified in three rounds of plaque purification, and then in vivo-excised to generate pBluescript DNA preparations, using procedures described in the Stratagene Zap cDNA Synthesis manual. To initially characterize the positives, rescued plasmid-borne cDNAs were sequenced with the T3 promoter primer (5′-AAT-TAA-CCC-TCA-CTA-AAG-GG-3′; SEQ ID NO:13), which flanks the 5-prime end of the directionally-cloned cDNA insert. Selected clones of the longer cDNA types (Types 3 and 4, see below) were then completely sequenced with the T7 promoter primer (5′-TAA-TAC-GAC-TCA-CTA-TAG-GG-3′; SEQ ID NO:14) in addition to custom, internal primers, when required (data not shown). To initially assess the diversity of the cDNA collection, we used the CAP3 Sequence Assembly Program (30), BLASTp (1) and ClustalW2 (6).
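The scale of the screen described above follows from simple arithmetic, sketched here for convenience. The plating figures (about one million p.f.u. per substrate at 10,000 p.f.u. per plate), the 0.3% (w/v) substrate concentration, and the 1000× micronutrient stock come from the text; the function names and the 100 mL and 1 L batch volumes are hypothetical and chosen only to make the example concrete.

```python
# Illustrative sketch only: arithmetic behind the plating scale and the
# reagent amounts. Batch volumes below are assumed, not taken from the text.
import math


def plates_needed(total_pfu: int, pfu_per_plate: int) -> int:
    """Number of plates required to screen `total_pfu` plaques."""
    return math.ceil(total_pfu / pfu_per_plate)


def substrate_mass_g(percent_w_v: float, volume_ml: float) -> float:
    """Grams of substrate for a given % (w/v), where 1% (w/v) = 1 g per 100 mL."""
    return percent_w_v * volume_ml / 100.0


def stock_volume_ul(stock_fold: float, final_fold: float, final_volume_ml: float) -> float:
    """Microliters of a concentrated stock needed to reach the final strength."""
    return final_fold / stock_fold * final_volume_ml * 1000.0


if __name__ == "__main__":
    per_substrate = plates_needed(1_000_000, 10_000)
    print(per_substrate, "plates per substrate,", 2 * per_substrate, "plates in total")
    # Hypothetical 100 mL batch of top agarose at 0.3% (w/v) AZCL substrate.
    print(substrate_mass_g(0.3, 100.0), "g substrate per 100 mL")
    # Hypothetical 1 L of medium brought to 1x from the 1000x micronutrient stock.
    print(stock_volume_ul(1000.0, 1.0, 1000.0), "uL of 1000x stock per liter")
```

At the stated density, each substrate requires 100 plates of 150 mm diameter, or 200 plates for the two substrates together.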
Sequence Searches, Alignments and Phylogeny.
Protozoan and bacterial glycoside hydrolase sequences were collected for our analysis by BLASTp searches (31) of the GenBank non-redundant protein sequences database (nr), using default search parameters to identify sequences with expectation (E) values of 1e-50 or lower. Protein sequences were aligned using MUSCLE3.6 (16) with a FASTA output format, and then manually edited using Jalview (7). Majority-rule parsimony trees were generated using the program “protpars” of PHYLIP (18), with maximum likelihood branch lengths calculated using TREE-PUZZLE (46). Bootstrap values were calculated using the program “seqboot” of the PHYLIP package. All trees were viewed and printed into a pdf format using A Tree Viewer (58).
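As a rough illustration of the hit-selection step, the sketch below filters BLAST results by the 1e-50 threshold. It is a minimal example under stated assumptions, not the procedure actually used: it assumes the threshold is an E-value cutoff, that results are available in the standard 12-column tabular format (subject identifier in column 2, E-value in column 11), and the function name and example rows are hypothetical.

```python
E_VALUE_CUTOFF = 1e-50  # assumed reading of the 1e-50 threshold as an E-value cutoff


def filter_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, evalue) pairs at or below the cutoff, best hit first.

    Each line is expected in 12-column tabular BLAST format; when a subject
    appears more than once, only its lowest E-value is kept.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= cutoff and evalue < best.get(subject_id, float("inf")):
            best[subject_id] = evalue
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical rows for illustration only (not data from the study).
example_rows = [
    "type1\thitA\t71.2\t310\t89\t0\t1\t310\t5\t314\t1e-120\t420",
    "type1\thitB\t35.0\t120\t78\t2\t10\t129\t40\t159\t2e-12\t60.1",
    "type1\thitC\t55.4\t305\t136\t3\t4\t308\t9\t313\t3e-75\t250",
]

if __name__ == "__main__":
    for subject_id, evalue in filter_hits(example_rows):
        print(subject_id, evalue)
```

Sequences retained by such a filter would then be passed to the alignment and tree-building programs named above.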
Molecular Cloning for Expression Constructs.
To investigate the biochemical properties of representative positives (see Results section for summary of positive classes and
Expression and Purification of Recombinant Enzymes.
The Type 1-7.1 and Type 2-8.6 pET29a constructs were transformed into E. coli BL21(DE3) cells (Invitrogen), and expression cultures for each construct were grown at 37° C. in 500 mL LB broth containing 30 μg/mL kanamycin. The cultures were grown to an OD600 of 0.6 to 0.8, at which point expression was induced by the addition of IPTG to a final concentration of 1 mM. Cultures were then grown at 37° C. for an additional three hours. Bacterial cells were harvested by centrifugation (10,000 g at 4° C. for 10 min) and resuspended in 20 mL of equilibration/wash solution (50 mM sodium phosphate buffer, pH 7.0, and 300 mM NaCl) supplemented with 1× Complete®, EDTA-free protease inhibitor cocktail (Roche), and phenylmethylsulphonyl fluoride to a final concentration of 1 mM. Cell resuspensions were lysed using a French press, and the resulting lysates were centrifuged at 10,000 g at 4° C. for 10 min to remove cell debris. Recombinant proteins were then affinity-purified using TALON® Metal Affinity Resin (Clontech), according to the manufacturer's protocols. The purified enzymes were eluted in a single-step elution with equilibration/wash buffer supplemented with 150 mM imidazole. Proteins were then concentrated and imidazole removed with an ultrafiltration membrane (Vivaspin-20 column, GE Healthcare). Enzyme purity was confirmed by SDS-PAGE (see
Enzyme Assays.
The activity of each enzyme was initially confirmed in a simple, colorimetric assay using the same insoluble substrates that were used in the plate screening procedure. Specifically, one mg of the respective substrate (AZCL-HE-cellulose or AZCL-xylan) was suspended in one mL of protein purification equilibration buffer (300 mM NaCl, 50 mM sodium phosphate, pH 7.0), at 37° C. for 30 min. Reactions were initiated by adding 50 μL of purified enzyme (0.5 to 2.2 mg/mL protein), and the release of solubilized dye was visually validated. Optimal pH conditions were then preliminarily determined in assays using 550 μL of 50 mM Britton-Robinson buffers (5) plus 400 μL 0.2% AZCL-labeled substrates. Reactions were initiated by adding 50 μL of purified enzyme solution (10 μg protein/μL). After incubation at 37° C. for 1 hr, supernatant absorbance at 590 nm was determined (
Polysaccharide Analysis Using HPLC.
Solutions of xylan or β-glucan (250 μL volume at 1.0 mg/mL) were treated with the Type 1-7.1 or Type 2-8.6 proteins (a 10 μL solution at 22 g protein/mL) for ten and 60 min. Negative controls included xylan and β-glucan substrates without enzyme amendments. Samples were incubated at 45° C., and reactions were terminated by adding 10 μL of 0.5 M NaOH. The 25 μL reactions were then injected onto a DX-500 HPLC instrument (Dionex) equipped with a 250×4 mm CarboPac-1 column (Dionex) at a solvent flow rate of 1 mL min−1. The gradient system utilized 100 mM NaOH as Solvent A and 100 mM NaOH, 1 M NaOAc as Solvent B. The gradient was run at 100% A for 15 min, followed by a linear gradient to 100% B over 60 min. Detection was by pulsed amperometry using an ED40 electrochemical detector (Dionex; 25).
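The elution program described above (100% Solvent A for 15 min, then a linear gradient to 100% Solvent B over 60 min) can be written as a simple function of time. This is a minimal sketch with hypothetical function names; it assumes Solvent B is held at 100% after the ramp ends, which the text does not state.

```python
def percent_b(t_min, hold_min=15.0, ramp_min=60.0):
    """Percentage of Solvent B at time t_min (minutes after injection)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= hold_min:
        return 0.0  # isocratic hold at 100% Solvent A
    # Linear ramp; capped at 100% B after the ramp (an assumption of this sketch).
    return min(100.0, 100.0 * (t_min - hold_min) / ramp_min)


def sodium_acetate_molar(t_min):
    """NaOAc concentration (M) in the mixed eluent; Solvent B contains 1 M NaOAc."""
    return percent_b(t_min) / 100.0 * 1.0


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 75):
        print(f"t={t:>2} min  %B={percent_b(t):5.1f}  NaOAc={sodium_acetate_molar(t):.2f} M")
```

Because both solvents contain 100 mM NaOH, only the sodium acetate concentration changes during the run.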
Classification of Protozoan Metagenomic cDNAs.
We sequenced a total of 63 clones positive for glycoside hydrolase activity: 60 identified on xylan substrate and three on cellulose substrate. Sequencing of the 5-prime end of each cDNA generated approximately 800 b.p. of sequence for each clone; analysis of these sequences permits classification of the cDNAs into four Types, which are discussed in detail below. Because eight of the 63 cDNAs likely represent aberrant clones (see discussion below and
SPECIES (Types 1-4): E. ecaudatum, E. ecaudatum, E. ecaudatum, E. ecaudatum
Four types (“TYPES 1-4”, row 1) of cDNAs were recovered from the activity-based screens, which utilized either cellulose- or xylan-based dye-linked “SUBSTRATE” (row 2). The length of the coding region (“CDS”, row 3) is given in amino acids for the longest cDNA of the given “TYPE”. “GH DOMAIN(S)” (row 4) indicates the Glycoside Hydrolase (GH) domain(s) detected by BLASTp homology search, whereas “PFAM” (row 5) indicates the Pfam assignment for the respective GH domain. “BEST MATCH” (row 6) indicates the GenBank (protein) accession number for the best hits, which were all derived from E. ecaudatum (“SPECIES”, row 7). Percent identity and similarity (“% IDENT % SIM”) to the “BEST MATCH” are indicated (row 8). The total number (row 9) of cDNAs sequenced from each “TYPE” and the number of cDNAs with unique 5-prime ends (“UNIQUE cDNAs”, row 10) are indicated.
Cellulase Positives:
The three Type 1 cDNAs were isolated on cellulose indicator plates. Each Type 1 cDNA encodes a single GH5 domain-containing protein (cellulase superfamily; 26); all three cDNA sequences were identical, suggesting that they were independent isolates of the same, amplified cDNA. The Type 1 cDNA sequence is provided as SEQ ID NO:5 and the sequence of the encoded protein is provided as SEQ ID NO:6 in the sequence listing.
Xylanase Positives:
Type 2 through Type 4 cDNAs were isolated on xylan indicator plates. Sequence analysis sorted these positives with xylanase activity into three distinct classes (
The Type 3 cDNA encodes a partial, N-terminal GH11 domain in addition to a second, full-length GH11 domain; thus, it is unlikely to be a full-length cDNA. The Type 3 cDNA sequence is provided as SEQ ID NO:9 and the sequence of the encoded protein is provided as SEQ ID NO:10 in the sequence listing. The Type 4 cDNAs encode a protein with two complete GH11 domains. The Type 4 cDNA sequence is provided as SEQ ID NO:11 and the sequence of the encoded protein is provided as SEQ ID NO: 12 in the sequence listing. Among the Type 4 clones, DNA sequencing identified 13 different 5-prime ends. The longest cDNA encodes an ORF that contains two GH11 domains; whereas the shortest cDNAs encode at least the C-terminal GH11 domain. In addition to these intact Type 4 cDNAs, we also identified eight cDNAs that likely represent aberrant, deletion forms of the full-length Type 4 cDNA (
We next compared sequences of the two-domain GH11 positive types (Type 3 and Type 4) through DNA and polypeptide alignments, which indicate that they represent highly similar, yet distinct genes. The DNA alignments (not shown) between the overlapping 1118 b.p. of the two cDNAs show 94.6% identity, with nine gaps, most of which are in the 3-prime untranslated regions of the two cDNAs. A ClustalW protein alignment (
We performed a phylogenetic analysis on the full-length peptide sequences of each positive Type (
The Type 1-7.1 positive enzyme was identified as a possible cellulase due to its activity on cellulose indicator plates and its GH5 domain homology. While many characterized cellulases are typically active against carboxymethyl cellulose (CMC), the recombinant enzyme derived from our library and comprising the protein of SEQ ID NO:6 had approximately 85 times higher activity against xyloglucan (896.06±14.98 U/mg) than against CMC (10.45±2.03 U/mg), and its activity against β-glucan (334.29±13.92 U/mg) was approximately 32 times its activity against CMC (
(1) The Type 1-7.1 protein comprised the protein sequence of SEQ ID NO: 6, and the Type 2-8.6 protein comprised the sequence of SEQ ID NO: 8.
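The fold differences quoted for the Type 1-7.1 enzyme follow directly from the reported specific activities. The sketch below recomputes them; the function name is hypothetical, and the uncertainty estimate assumes that the ± values are independent standard deviations, which the text does not specify.

```python
import math


def fold_change(activity_a, sd_a, activity_b, sd_b):
    """Ratio of two specific activities with a propagated uncertainty.

    Relative errors are combined in quadrature, which assumes the two
    measurements are independent (an assumption of this sketch).
    """
    ratio = activity_a / activity_b
    sd = ratio * math.sqrt((sd_a / activity_a) ** 2 + (sd_b / activity_b) ** 2)
    return ratio, sd


# Specific activities (U/mg) for the Type 1-7.1 enzyme, as reported in the text.
XYLOGLUCAN = (896.06, 14.98)
BETA_GLUCAN = (334.29, 13.92)
CMC = (10.45, 2.03)

if __name__ == "__main__":
    for label, (activity, sd) in (("xyloglucan", XYLOGLUCAN), ("beta-glucan", BETA_GLUCAN)):
        ratio, ratio_sd = fold_change(activity, sd, *CMC)
        print(f"{label} vs CMC: {ratio:.1f}-fold (+/- {ratio_sd:.1f})")
```

The ratios evaluate to roughly 85.7 for xyloglucan and 32.0 for β-glucan relative to CMC, consistent with the figures of 85 and 32 given in the text. The relatively large uncertainty on the CMC activity dominates the propagated error.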
The closest BLASTp match for the Type 2-8.6 enzyme was also a GH10 domain-containing enzyme. Typically, members of this GH family tend to have low pI values (9). In contrast, Type 2-8.6 has a calculated pI value of 6.35 (Table 2), which is higher than the values reported for GH10 enzymes but lower than those typically observed for GH11 enzymes (which tend to be high; reference 9). Although the enzyme had detectable activity against xylan (95.62±0.38 U/mg), it possessed a higher activity against arabinoxylan (584.39±2.07 U/mg) (
Fibrolytic enzymes are essential for the digestion of cellulosic biomass in the ruminant diet. A suite of enzymes is required to produce a variety of free sugars (exo-enzymes), as well as oligosaccharides (endo-enzymes), for metabolism by rumen bacteria and subsequent digestion by the ruminant host. The HPLC analysis of hydrolysis products resulting from exposure of β-glucan to the Type 1-7.1 enzyme and of xylan to the Type 2-8.6 enzyme indicated that each tested enzyme employs an endo-type of cleavage (
We executed an activity-based metagenomic screen with the aim of assessing the diversity of fibrolytic enzymes encoded by the meta-transcriptome of protozoa present in bovine rumen fluid. Using just two substrates, a cellulose and a hemicellulose, we identified four genes with diverse GH domains and modular organization. Phylogenetic analysis of these genes revealed the closest homologs to be protozoan, and the closest non-protozoan homologs to be most closely related to gram-positive bacteria. These observations support the hypothesis that lignocellulose-degrading genes were acquired by protozoa from ruminal bacteria by horizontal gene transfer (15, 41). There was close homology of the positive sequences obtained in our study (
One strength of activity-based screening is its ability to directly recover genes encoding biocatalysts for specific substrates (e.g., cellulose). In our current study, we identified, expressed and biochemically characterized two enzymes. The putative xyloglucanase possesses a high specific activity towards tamarind xyloglucan (896.06 U/mg protein) (
The powerful and rapid activity-based metagenomics approach does, however, have its own technical and efficiency obstacles (as discussed by 52), as evidenced by our recovery of hybrid/deletion cDNA clones. Thus, more direct, sequence-based metagenomics (e.g., single-cell-based genomic sequencing; reference 57), in combination with molecular phylogenetics (e.g., 13, 48), may represent the more attractive technique for characterizing the population dynamics, functions and fibrolytic genes of ciliate ruminal protozoa.
Having illustrated and described the principles of the present invention, it should be apparent to persons skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. As various modifications could be made in the compositions and methods herein described and illustrated without departing from the scope of the invention, it is intended that all matter contained in the foregoing description or shown in the accompanying drawings shall be interpreted as illustrative rather than limiting. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims appended hereto and their equivalents.
It should also be understood that when introducing elements of the present invention in the claims or in the above description of exemplary embodiments of the invention, the terms “comprising,” “including,” “containing”, and “having” are intended to be open-ended and mean that there may be additional elements other than the listed elements.
Although the materials and methods of this invention have been described in terms of various embodiments and illustrative examples, it will be apparent to those of skill in the art that variations can be applied to the materials and methods described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
This non-provisional US patent application claims the benefit of U.S. Provisional Patent Application No. 61/573,892, which was filed Sep. 14, 2011 and which is incorporated herein by reference in its entirety. The sequence listing that is contained in the file named “52553—107399_ST25.txt”, which is 43,940 bytes (measured in operating system MS-Windows), created on Sep. 14, 2012, is filed herewith by electronic submission and incorporated herein by reference in its entirety. The sequence listing contains SEQ ID NO: 1-18. None.
Number | Date | Country
---|---|---
61573892 | Sep 2011 | US