The present invention relates to a new, versatile fusion protein tool (tag) to improve the expression, solubility and purification of proteins. Particularly, the fusion tag is the substantially full-length Recoverin molecule, as well as the other members of EF-hand family of neuronal calcium sensor proteins, which bind calcium, that can be used for protein purification in a single step, even in the presence of detergents.
Because of the availability of the genome sequence of a large number of organisms, recombinant techniques have allowed the identification, modification, production, and purification of a large number of proteins using different host cell types1-13. Consequently, there has been a large increase in the use of recombinant proteins for a variety of applications including therapeutic and diagnostic uses.2 In addition, numerous laboratories are expressing novel proteins to determine their structure and function, or to discover their therapeutic potential. As well, pharmaceutical companies are involved in large scale production of proteins with therapeutic purposes. High protein purity is thus necessary to satisfy these requirements. It is however very difficult to purify proteins using their intrinsic properties. The use of fusion tags is therefore widely spread to facilitate protein purification.
The production of recombinant proteins often results in the formation of inclusion bodies, host cell toxicity, low levels of expression or improper folding of proteins.2 These problems can often be solved by changing the expression system, the host cell type or by using a fusion tag.2 Many fusion tags are available11 and new tags or new tagging procedures are continuously provided.14-17 These tags can be located at either the N- or C-terminus of the protein of interest, but the choice of the most appropriate tag depends on each protein's particular properties.
The most commonly used small affinity tag is the polyhistidine tag (His-tag). It allows purification of proteins by immobilized metal affinity chromatography. It is typically made of hexahistidine although decahistidine is often more efficient for protein purification in a single step. However, this tag does not allow for improvement of the yield and increased solubility of the protein of interest. Therefore, the most widely used protein tag is the Glutathione S-transferase (GST),18 that is considered as the «Gold Standard». It is a 26 kDa protein which binds with high affinity to glutathione. GST can increase the yield and solubility of proteins of interest, and purification is achieved by binding of GST-tagged proteins to glutathione immobilized to a solid support.
Although the GST tag was found to be very useful to improve the solubility and purification of several proteins of interest, it has several drawbacks. For example, purification of GST-tagged proteins cannot be achieved in the presence of detergents.19 Moreover, separating the protein of interest from GST after cleavage is difficult because of the high affinity of glutathione for GST, which consequently requires an additional dialysis step unless cleavage can be performed on the column. New protein tags are thus needed to improve solubility and facilitate purification of proteins of interest.
According to a first aspect, there is provided a polynucleotide molecule comprising a sequence encoding a protein of interest to be purified, and a tag polynucleotide encoding a protein of an EF-hand calcium-binding family of proteins, wherein the EF-hand calcium-binding protein undergoes conformational change under presence or absence of calcium. Particularly, the conformational change comprises an extrusion of hydrophobic amino acids in the presence of calcium. More particularly, the tag is a member of the EF-hand calcium-binding family of neuronal calcium sensor proteins.
In accordance with a further particular aspect, the invention provides for a vector comprising the polynucleotide as defined herein, operably linked to a promoter.
In accordance with a particular aspect of the invention there is provided an expression cassette comprising: a transcriptional initiation region functional in an expression host cell; the recombinant polynucleotide as defined herein; and a transcriptional termination region functional in the expression host cell.
In accordance with a particular aspect of the invention, there is provided a host cell comprising the expression cassette as defined herein.
In accordance with a particular aspect of the invention, there is provided a method for expressing a protein of interest, comprising the steps of: expressing the protein of interest fused to a Recoverin tag (TagR) in an expression system comprising the vector, or the expression cassette as defined herein.
In accordance with a particular aspect of the invention, there is provided a method of expression of a protein of interest, comprising the steps of: expressing the protein of interest fused to a Recoverin tag (TagR) in a host cell as defined herein. According to a more particular aspect of the invention of this method, the expression of the protein is carried out with minimal growth media to allow folding of the tag protein.
In accordance with a particular aspect of the invention, there is provided a method for producing a protein of interest, wherein the method comprises: growing a cell under conditions that permit expression of the protein as defined herein, wherein the cell comprises: the polynucleotide; or the vector; or the expression cassette, all as defined herein.
In accordance with a particular aspect of the invention, there is provided a method for purifying a protein of interest, comprising the steps of: expressing the protein of interest as defined herein; or producing the protein of interest as defined herein; and purifying the fused-protein by separating the fused protein from unwanted components on hydrophobic affinity chromatography in presence of calcium. Particularly, the purifying step is carried out by eluting the fused-protein from the hydrophobic affinity column in presence of a calcium chelator; and optionally cleaving the eluted fused molecule in order to obtain the purified protein of interest cleaved from the tag. Alternatively, the purifying step is carried out by cleaving the fused-protein of interest from the tag to obtain a mixture comprising the cleaved protein and the tag, and separating the cleaved protein of interest from the tag by eluting the cleaved protein on the hydrophobic affinity column in presence of a calcium chelator.
In accordance with a particular aspect of the invention, there is provided a fusion protein comprising a protein of interest, fused to a protein of an EF-hand calcium-binding family of proteins, wherein the EF-hand calcium-binding protein undergoes conformational change under presence or absence of calcium.
In accordance with a particular aspect of the invention, there is provided a fusion protein encoded by the polynucleotide as defined herein. According to a particular aspect, the encoded protein tag is nonmyristoylated.
In accordance with a particular aspect of the invention, there is provided a kit for the expression and purification of a protein of interest, the kit comprising: the polynucleotide as defined herein; and instructions on how to insert the polynucleotide in a suitable vector; and/or instructions on how to transform the vector in a host cell; and/or instructions on how to isolate a recombinant fused-protein from the host cell; and/or instructions on how to purify the recombinant fused-protein as defined herein.
In accordance with a particular aspect of the invention, there is provided a kit for the expression and purification of a protein of interest, the kit comprising: the vector, or the expression cassette, both as defined herein; and instructions on how to transfect the vector in a host cell; and/or instructions on how to isolate a recombinant fused-protein from the host cell; and/or instructions on how to purify the recombinant fused-protein according to the method as defined herein.
According to a particular aspect of the invention, there is provided use of a protein from an EF-hand calcium-binding family defined herein, as a tag for protein expression and/or solubilisation and/or purification.
In accordance with a particular aspect of the invention, there is provided a protein from an EF-hand calcium-binding family defined herein for use as a protein tag for protein expression and/or solubilisation and/or purification.
BAPTA: 1,2-Bis(2-Aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid; EDTA: Ethylenediaminetetraacetic acid; EGTA: Ethylene glycol-bis(2-aminoethylether)-N,N,N′,N′-tetraacetic acid; tLRAT: truncated lecithin retinol acyltransferase; RP2: retinitis pigmentosa 2; and TagR: Recoverin protein tag.
GCAP: guanylate cyclase activating protein; GCIP: guanylate cyclase inhibiting protein; KChIP: potassium channel interacting protein; NCS1: neuronal calcium sensor 1; VILIP-1: Visinin-like protein.
TEV: Tobacco Etch Virus; HRV 3C: Human rhinovirus 3C.
The term “about” as used herein refers to a margin of + or −10% of the number indicated. For the sake of precision, the term “about” when used in conjunction with, for example, 90% means 90%+/−9%, i.e. from 81% to 99%. More precisely, the term “about” refers to + or −5% of the number indicated, where for example 90% means 90%+/−4.5%, i.e. from 85.5% to 94.5%.
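By way of illustration only, the margin arithmetic in this definition can be sketched as follows. The function name `about_range` and its `margin` parameter are hypothetical and serve only to show how the lower and upper bounds follow from the stated percentage margin.

```python
def about_range(value, margin=0.10):
    """Return the (low, high) bounds covered by "about value".

    margin is the fractional margin: 0.10 for +/-10%, 0.05 for +/-5%.
    """
    delta = abs(value) * margin
    return value - delta, value + delta


# "about 90%" with the 10% margin: 90 +/- 9
low, high = about_range(90.0)

# "about 90%" with the narrower 5% margin: 90 +/- 4.5
narrow_low, narrow_high = about_range(90.0, margin=0.05)

print(low, high)
print(narrow_low, narrow_high)
```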
As used herein the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a cell” includes a plurality of such cells and reference to “the culture” includes reference to one or more cultures and equivalents thereof known to those skilled in the art, and so forth. All technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless clearly indicated otherwise.
As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, un-recited elements or method steps.
The term “gene” as used herein refers to any DNA sequence comprising several operably linked DNA fragments such as a promoter and a 5′ regulatory region, a coding sequence and an untranslated 3′ region comprising a polyadenylation site.
The term “expression cassette” as used herein refers to a transferable region of DNA comprising a chimeric gene which is flanked by one or more restriction or other sites which facilitate precise excision from one DNA locus and insertion into another.
The term “protein of interest” as used herein means any protein for which small or large scale production may present an interest for a person skilled in the art.
Recoverin (full-length or substantially full-length) has now been found to possess properties suitable for protein tags. Recoverin has a molecular weight of about 23 kDa, is highly soluble (>30 mg/ml), it can be highly purified in a single step and its expression level in bacteria is very high (>30 mg/L of culture).20 It is predicted that, based upon the similar function and properties of this family of proteins, other members of EF-hand family of neuronal calcium sensor proteins, which bind calcium, will also behave as suitable protein tags for purification.
In addition to Recoverin, the neuronal calcium sensor family of proteins includes GCAP1, GCAP2 and GCAP3 (guanylate cyclase activating protein), GCIP (guanylate cyclase inhibiting protein), KChIP1, KChIP2 (potassium channel interacting protein), Calsenilin/DREAM, NCS1/Frequenin (neuronal calcium sensor 1), Neurocalcin delta, Hippocalcin and VILIP-1 (Visinin-like protein). There is a large amino acid sequence identity between the different members of this family of proteins. For example, there is 61% identity between Frequenin and Neurocalcin delta and 41% between Frequenin and Recoverin61.
Therefore, in accordance with a particular aspect of the invention there is provided the polynucleotide as defined herein, wherein the encoded tag protein comprises at least about 40% identity with Recoverin, more particularly, the encoded tag protein comprises at least about 60% identity with Recoverin.
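As a non-limiting illustration, percent identity between a candidate tag and Recoverin could be evaluated from a pairwise alignment along the following lines. This is a minimal sketch: the convention used (identical residues divided by aligned, ungapped columns) is only one of several in use and is an assumption here, the function names are hypothetical, and the two short sequences are invented placeholders, not real protein sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_tag, aligned_ref, threshold=40.0):
    """True when the tag reaches the given percent identity to the reference."""
    return percent_identity(aligned_tag, aligned_ref) >= threshold


# Invented placeholder alignment, for illustration only.
toy_ref = "MGNSKSGALS"
toy_tag = "MGKSNS-ALT"
identity = percent_identity(toy_tag, toy_ref)
print(round(identity, 1), meets_identity_threshold(toy_tag, toy_ref, 60.0))
```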
Similarly, there is a high similarity between the overall structure of the members of this family of proteins which can be estimated using the values of root mean square deviation (RMSD). The RMSD is the measure of the average distance between the backbone atoms of superimposed proteins. Very small RMSD values of 1.3 Å have been obtained when comparing the overall fold of Frequenin and Neurocalcin delta and of 1.7 Å between Frequenin and Recoverin. Similarly, the overall structure of Ca2+-bound non-myristoylated GCAP2 closely resembles that of Ca2+-bound non-myristoylated Recoverin (RMSD of 1.9 Å).62 Therefore, other members of the NCS family of proteins could potentially act as tags for purposes similar to those demonstrated here for Recoverin. RMSD>5 Å have been estimated for non-homologous proteins and ˜<2.5 Å for homologous proteins63.
In accordance with a particular aspect of the invention there is provided the polynucleotide as defined herein, wherein the tag protein comprises a root mean square deviation (RMSD) of less than 2.5 Å, when compared to the 3D structure of Recoverin. More particularly, the invention provides the polynucleotide as defined herein, wherein the RMSD is less than 2 Å.
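For illustration, the RMSD between two structures that have already been superimposed can be computed as the square root of the mean squared distance between matched backbone atoms. The sketch below assumes that superposition and atom matching were done beforehand by a structural alignment tool; the function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same unit as the inputs, e.g. angstroms) between matched atoms.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    backbone atoms of two structures that are ALREADY superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates: the second set is the first shifted by 1 angstrom along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
value = rmsd(toy_a, toy_b)

# Thresholds taken from the text: below 2.5 angstroms, more particularly below 2.
print(value, value < 2.5, value < 2.0)
```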
Thus, according to particular embodiments, the proteins GCAP1, GCAP2 and GCAP3, GCIP, KChIP1, KChIP2, Calsenilin/DREAM, NCS1/Frequenin, Neurocalcin delta, Hippocalcin and VILIP-1 may be used as tags, as in the case of Recoverin.
In accordance with a particular aspect of the invention, the Recoverin tag (TagR) cDNA can be cloned from freshly dissected bovine retina, although Recoverin from other organisms is deemed to be as useful. The RNA can be isolated and used for reverse transcription reaction. Then, first-strand cDNA can be used as a template for PCR using primers designed to amplify the coding sequence of Recoverin cDNA and, particularly to introduce restriction sites, more particularly NdeI and/or BamHI restriction sites.
In accordance with a particular aspect of the invention, the substantially full-length, or full-length Recoverin cDNA can be ligated into a plasmid to generate a Recoverin expression vector.
In accordance with a particular aspect of the invention, an EcoRI site in the native sequence of TagR can be changed to a silent mutation. More particularly, the EcoRI site (GAATTC) in the sequence of native Recoverin can be changed to GAGTTC (silent mutation) by site-directed mutagenesis.
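By way of example only, the following sketch checks that changing GAATTC to GAGTTC removes the EcoRI recognition site without altering the encoded amino acids. It assumes, for illustration, that the site starts on a codon boundary (GAA followed by TTC); the actual reading frame should be confirmed against the full coding sequence. The helper names are hypothetical and the table is the standard genetic code.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

ECORI_SITE = "GAATTC"


def translate(dna):
    """Translate an in-frame DNA string (length a multiple of 3)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original, mutated):
    """True when the DNA differs but the encoded peptide is unchanged."""
    return original != mutated and translate(original) == translate(mutated)


native = "GAATTC"   # EcoRI site, assumed in frame as GAA|TTC
mutated = "GAGTTC"  # sequence after site-directed mutagenesis

print(translate(native), translate(mutated))
print(is_silent(native, mutated), ECORI_SITE in mutated)
```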
Applicant has found that the presence of the myristoyl group did not improve the purification of TagR with its fusion partners. Therefore, in accordance with a particular aspect of the invention, the TagR is expressed without its myristoyl group.
In accordance with a particular aspect of the invention, the protein of interest is located at the C-terminus of the protein tag in order to allow the tag's major conformational change to occur at the N-terminus.
In accordance with a particular aspect of the invention, the TagR is located at the N-terminal of the protein of interest that is selected from, for example, but not limited to: tLRAT and RP2.
This protein tag is further engineered to ease its use as a tag for recombination, expression, solubilisation and purification purposes.
In accordance with a particular aspect of the invention, the TagR has introduced in its nucleotide sequence one or more sequences encoding a restriction site, such as, for example, NdeI and/or BamHI, for ease of construction of the fusion cDNA sequence.
Alternatively, the polynucleotide comprises a silent mutation at GAATTC (e.g. within the coding sequence: mutation T to C, at position 330 from 5′) to avoid cleavage by the restriction enzyme specific to the EcoRI site during the construction of the cDNA.
In accordance with a particular aspect of the invention, the nucleotide sequence corresponding to a proteolytic cleavage site is introduced to the TagR nucleotide sequence, for ease of separation of the resulting protein from its fusion partner during purification.
Particularly, a nucleotide sequence encoding the specific thrombin cleavage site (LVPRGS) is inserted into the TagR sequence before the cDNA of the protein of interest. Alternatively, other proteolytic cleavage sites can be introduced such as, for example, the TEV cleavage site corresponding to ENLYFQ/G, or the HRV 3C cleavage site corresponding to LEVLFQ/GP.
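As an illustrative sketch, the effect of each protease on a fusion protein sequence can be modelled by locating the recognition motif and splitting at the scissile bond. The TEV and HRV 3C cut positions follow the slash in the motifs given above; the thrombin cut between R and G of LVPRGS reflects the commonly described specificity and is stated here as an assumption. The function name and the tag and protein sequences are invented placeholders.

```python
# Recognition motif and number of motif residues left on the N-terminal fragment.
CLEAVAGE_SITES = {
    "thrombin": ("LVPRGS", 4),   # LVPR / GS (assumed cut position)
    "TEV": ("ENLYFQG", 6),       # ENLYFQ / G
    "HRV 3C": ("LEVLFQGP", 6),   # LEVLFQ / GP
}


def cleave(fusion, protease):
    """Split a fusion protein at the first recognition motif of the protease."""
    motif, offset = CLEAVAGE_SITES[protease]
    start = fusion.find(motif)
    if start == -1:
        raise ValueError("no " + protease + " site found in the sequence")
    cut = start + offset
    return fusion[:cut], fusion[cut:]


# Placeholder sequences, for illustration only (not real TagR or tLRAT).
toy_tag = "MAAAA"
toy_protein = "MKKKK"
toy_fusion = toy_tag + "GGGGG" + "LVPRGS" + toy_protein

tag_fragment, protein_fragment = cleave(toy_fusion, "thrombin")
print(tag_fragment, protein_fragment)
```

In this model the released protein of interest keeps the residues that follow the scissile bond (here G and S) at its N-terminus.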
In accordance with a particular aspect of the invention, the TagR protein is nonmyristoylated when expressed in prokaryotic expression systems, unless the system is complemented with enzyme and substrate for myristoylation42. Applicant has found that the presence of the myristoyl group did not improve the purification of TagR with its fusion partners. Therefore, in accordance with a particular aspect of the invention, the TagR can be expressed in basic prokaryotic expression systems without having to complement the system to allow myristoylation.
In accordance with a particular aspect of the invention, the TagR protein is complemented with a poly His tag positioned at the N-terminus of the protein of interest to show that TagR can be used in tandem with other tags.
In accordance with a particular aspect of the invention, the TagR protein is complemented with a further TagR positioned at the N-terminus of the first TagR to further enhance solubility of the resulting fusion protein. Such a procedure may be useful for the expression and purification of highly insoluble proteins.
In accordance with a particular aspect of the invention, the tag polynucleotide encodes Recoverin; particularly, the Recoverin-encoding sequence has at least about 90% nucleic acid identity, more particularly at least 94% nucleic acid identity, to SEQ ID NO: 1. More particularly, the Recoverin is substantially full length, or alternatively is about 23 kDa.
In accordance with a particular aspect of the invention, the polynucleotide has introduced therein one or more sequences encoding a restriction site, such as, for example, NdeI and/or BamHI. Alternatively, the polynucleotide comprises a silent mutation at GAATTC (e.g. EcoRI site within the coding sequence: mutation T to C at position 330 from 5′).
In accordance with a further particular aspect, the invention provides for a vector comprising the polynucleotide as defined herein, operably linked to a promoter.
In accordance with a further aspect, the invention further provides an expression cassette comprising: a transcriptional initiation region functional in an expression host cell; the polynucleotide as defined herein; and a transcriptional termination region functional in the expression host cell.
Finally, in accordance with this particular aspect of the invention, there is provided a host cell comprising the expression cassette as defined herein above.
In accordance with a particular aspect of the invention, the substantially full-length, or full-length Recoverin cDNA can be ligated into a plasmid to generate a Recoverin expression vector. Particularly, the plasmid may be chosen from pET11a, pGEX-4T-3 and PEX. Still, the plasmid may be chosen from the series of: pBAC, pAK, BJ, MP, pGPD, MW, pUCP, CY, MAT, pMSP, SNX, PM, etc., well known in the art. More particularly, the plasmid may be chosen from several catalogs well known by the person skilled in the art such as from: addgene (www.addgene.org), thermofisher (www.thermofisher.com), millipore (www.emdmillipore.com), promega (www.promega.ca), EMBL (www.embl.de), GE Healthcare Life Sciences, Agilent, etc.
In accordance with a further aspect, the invention further provides an isolated recombinant polypeptide encoded by the polynucleotide as defined herein, particularly, a recombinant protein comprising a tag encoded by the polynucleotide as defined herein.
Alternatively, the invention provides a protein tag having an amino acid sequence that is at least about 40% identical, particularly at least 60% identical, to SEQ ID NO. 2.
Still alternatively, the invention provides a protein tag having an amino acid sequence that is at least about 90% identical, particularly at least 95% identical, to SEQ ID NO. 4. Even more particularly, the protein tag is as defined by SEQ ID NO. 4.
Also, alternatively, the tag protein lacks its native myristoyl group, which was found not to be essential for the purification of TagR with its fusion partners. Hence, in a particular embodiment, the polynucleotide is as defined in SEQ ID NO. 3.
Recoverin's purification is based on its properties to reversibly bind calcium. Recoverin undergoes a large conformational change upon calcium binding21-24 (
Inversely, the chelation of the calcium ions by EGTA results in a large conformational change of TagR, that results in the sequestration of its hydrophobic residues inside the protein (
Therefore in accordance with a particular aspect of the invention, the purification of the protein of interest can be carried out in a single step hydrophobic chromatography, such as in the case for pure Recoverin20. Applicant has found that the fusion of the protein of interest at the C-terminus of the TagR did not prevent the typical large conformational change of Recoverin taking place when calcium is removed using EGTA to allow its elution from the column.
Hence, the protein of interest can be cleaved and separated from the TagR by proteolysis, particularly using thrombin. The protein of interest is then separated from the TagR by simply adding calcium to the cleaved sample. As a result, Recoverin exposes its hydrophobic amino acids, thereby allowing its binding to the hydrophobic resin. This allows elution of the purified protein of interest free of its TagR fusion partner. TagR is then eluted separately using a buffer containing a calcium chelator such as, for example, EGTA, EDTA or BAPTA.
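The separation logic described above can be summarised, purely as an illustrative toy model, by treating resin binding as a function of two conditions: whether a species carries TagR and whether calcium is present. The model assumes that the protein of interest does not itself bind the hydrophobic resin, which may not hold for every protein; all names are hypothetical.

```python
def binds_resin(carries_tagr, calcium_present):
    """TagR exposes its hydrophobic residues, and so binds, only with calcium."""
    return carries_tagr and calcium_present


def run_column(species, calcium_present):
    """Return (flow_through, retained) names for one pass over the column.

    species is a list of (name, carries_tagr) pairs.
    """
    flow_through, retained = [], []
    for name, carries_tagr in species:
        if binds_resin(carries_tagr, calcium_present):
            retained.append(name)
        else:
            flow_through.append(name)
    return flow_through, retained


# After proteolysis the sample contains the free protein and the free tag.
cleaved_sample = [("protein of interest", False), ("TagR", True)]

# Step 1: calcium added, TagR binds and the protein of interest flows through.
step1_eluted, step1_bound = run_column(cleaved_sample, calcium_present=True)

# Step 2: chelator (e.g. EGTA) removes calcium, TagR is released.
tagr_only = [(name, True) for name in step1_bound]
step2_eluted, step2_bound = run_column(tagr_only, calcium_present=False)

print(step1_eluted, step1_bound, step2_eluted, step2_bound)
```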
Therefore, in accordance with a particular aspect of the invention, there is provided a method for expressing, and purifying a protein of interest, comprising the steps of: expressing the protein fused to a substantially full-length TagR in an expression system; and purifying the fused-protein. Particularly, the purifying step is carried out by eluting the fused-protein from the hydrophobic affinity column in the presence of a calcium chelator; and optionally cleaving the eluted fused molecule in order to obtain the purified protein of interest cleaved from the TagR. Still, particularly, the purifying step is carried out by cleaving the fused-protein of interest from the TagR to obtain a cleaved protein/tag mixture, and then separating the cleaved protein of interest from the tag by eluting the cleaved protein of interest/tag mixture on a hydrophobic affinity column in the presence of calcium.
In accordance with a particular aspect, the invention also provides a method for producing a protein of interest, comprising growing a cell under conditions that permit expression of a protein of interest, wherein the cell comprises: a polynucleotide as defined herein; or a vector as defined herein; or an expression cassette as defined herein.
According to a particular embodiment, a significantly higher level of expression and of solubility is obtained when the cell transfected with the fused-polynucleotide of the invention is cultured in the minimal growth medium than in the normal LB growth medium (all culture conditions are the same except for the growth medium). Indeed, some insoluble proteins are more strongly expressed and more soluble in minimal growth medium. Therefore, in accordance with a particular aspect, the invention also provides a method for producing a protein of interest, particularly an insoluble protein, the method comprising growing a cell as defined herein under minimal growth conditions to allow better folding of the insoluble protein.
TagR also allows purification of highly insoluble proteins in the presence of detergent, which cannot be achieved using GST. Indeed, SDS does not prevent fused TagR-protein binding to the column used for purification.
In accordance with a particular aspect, the invention also provides a kit for the expression and purification of a protein of interest, the kit comprising: a polynucleotide as defined herein; and/or instructions on how to insert the polynucleotide in a suitable vector; and/or instructions on how to transform the vector in a host cell; and/or instructions on how to isolate a recombinant fused-protein from the host cell; and/or instructions on how to purify the recombinant fused-protein.
Alternatively, the kit for the expression and purification of a protein of interest comprises: a vector as defined herein, or an expression cassette as defined herein; and/or instructions on how to transform the vector in a host cell; and/or instructions on how to isolate a recombinant fused-protein from the host cell; and/or instructions on how to purify the recombinant fused-protein according to the method of the invention.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Centigrade, and pressure is at or near atmospheric.
The properties of Recoverin to improve expression and solubility of fusion protein partners have thus been compared with those of the GST tag in the following examples.
1.1 Cloning of tLRAT, RP2 and Recoverin
Two proteins of interest have been used in the present study: truncated lecithin retinol acyltransferase (tLRAT, amino acids 31-196) and retinitis pigmentosa 2 (RP2). Human tLRAT has been cloned in the plasmid pET11a (Novagen) as previously described.47,48 Briefly, RNA from freshly dissected human retinal pigment epithelium was isolated with Tri-reagent (Sigma) and used for reverse transcription reaction with the RevertAid® H minus First Strand cDNA Synthesis Kit (Fermentas). In order to ensure unidirectional cloning, additional NdeI and Bpu1102 I restriction sites were attached to the 5′ end of each primer. The pET11a vector was then linearized with NdeI and Bpu1102 I and subsequently ligated with the purified PCR product corresponding to tLRAT.
The human RP2 construct cloned in the pGEX-4T-3 plasmid to express a GST fusion protein was a kind gift from Dr. Alfred Wittinghofer (Max-Planck-Institut für Molekulare Physiologie, Germany). Recoverin (TagR) cDNA has been cloned from freshly dissected bovine retina to produce the pET11a-Rec vector as previously described.42 RNA was isolated with Tri-reagent and used for reverse transcription reaction (RevertAid Kit). Then, first-strand cDNA was used as a template for PCR using primers designed to amplify the coding sequence of Recoverin cDNA and to introduce NdeI and BamHI restriction sites. The full-length bovine cDNA of Recoverin was then ligated into the pET11a plasmid.
1.2 Preparation of the Different Constructs of tLRAT and RP2 in Fusion with TagR or GST in pET11a and pGEX-4T-3 or in Tandem with the PolyHis Tag
The coding region of tLRAT or RP2 has been inserted between the BamHI and EcoRI sites of the pET11a-Rec vector (with no stop codon). It is noteworthy that there was an EcoRI site (GAATTC) in the sequence of TagR which has been changed to GAGTTC (silent mutation) by site-directed mutagenesis (QuikChange Lightning, Agilent). Moreover, 5 glycines and then a specific thrombin cleavage site (LVPRGS) were inserted before the cDNA of tLRAT or RP2. The final constructions are as follows: BamHI-TagR-5 glycines-Thrombin cleavage site-tLRAT or RP2-EcoRI. RP2 was provided in fusion with GST (pGEX-4T-3, see above). It included 5 glycines and the thrombin cleavage site, such as described above for pET11a. The tLRAT coding sequence has been inserted between the BamHI-EcoRI sites of the pGEX-4T-3 vector. The TagR and GST tags were thus located at the N-terminal of tLRAT and RP2. The polyHis (10 histidines) tag was also used in tandem with TagR using the TagR-tLRAT construct described above. Thus, 10 histidines and 5 glycines have been added at the N-terminal of TagR by PCR using appropriate primers to produce the PolyHis-TagR-tLRAT construct.
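For illustration, the order of elements in the constructs described above (tag, 5 glycines, thrombin cleavage site, protein of interest, with an optional 10-histidine and 5-glycine extension at the N-terminus of the tag) can be sketched at the protein level as follows. The function name is hypothetical, the tag and protein sequences are invented placeholders, and details such as the initiator methionine are deliberately left out.

```python
GLY_LINKER = "G" * 5          # 5 glycines
THROMBIN_SITE = "LVPRGS"      # specific thrombin cleavage site
POLY_HIS = "H" * 10           # 10 histidines


def build_fusion(tag, protein, poly_his=False):
    """Assemble tag - 5 Gly - thrombin site - protein, optionally His-tagged.

    With poly_his=True, 10 histidines and 5 glycines precede the tag,
    mirroring the PolyHis-TagR-tLRAT layout.
    """
    parts = []
    if poly_his:
        parts += [POLY_HIS, GLY_LINKER]
    parts += [tag, GLY_LINKER, THROMBIN_SITE, protein]
    return "".join(parts)


# Placeholder sequences, for illustration only.
toy_tag = "AAA"
toy_protein = "KKK"

plain = build_fusion(toy_tag, toy_protein)
his_tagged = build_fusion(toy_tag, toy_protein, poly_his=True)
print(plain)
print(his_tagged)
```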
Plasmid DNA of TagR-tLRAT, TagR-RP2, GST-tLRAT and GST-RP2 were transformed into E. coli BL21(DE3) RIPL (Novagen) and grown overnight in the LB medium until saturation. Then, fresh LB containing 50 μg/ml ampicillin and chloramphenicol was inoculated with the transformed culture and incubated at 37° C. under agitation (250 rpm) until A600 nm=0.6. Their expression was then induced with isopropyl β-D-thiogalactopyranoside (IPTG) (0.5 and 0.1 mM for the constructions in the pET11a or the pGEX vectors, respectively) followed by an incubation for 16 h at 21° C. Bacteria were then sedimented by centrifugation at 3,275 g for 25 min. Bacterial lysis was done first using 250 μl of lysozyme at a concentration of 4 mg/ml, which includes 1 mM PMSF (phenylmethylsulfonyl fluoride, a protease inhibitor). The resuspended cells were kept on ice for 30 min. The cells were then sonicated and centrifuged at 20,000 g for 30 min at 4° C. It is noteworthy that TagR was expressed without its myristoyl group42 (
The pellet from 50 ml bacterial culture resulting from the expression of TagR-RP2 or TagR-tLRAT have respectively been resuspended in 4.75 ml of buffer A (50 mM Hepes (pH 7.5), 100 mM NaCl, 1 mM CaCl2, 5 mM β-mercaptoethanol) or of buffer B (5 mM Hepes (pH 7.5), 1 mM CaCl2, 5 mM β-mercaptoethanol). After bacteria lysis (see Example 1.3) and centrifugation (20,000 g for 30 min at 4° C.), the cleared lysate was loaded to a column containing 5 ml of phenyl Sepharose 6 Fast Flow (low sub resin; GE Healthcare) that had been preequilibrated with buffer A (TagR-RP2) or B (TagR-tLRAT). The purification of these fusion proteins was performed at 4° C. The presence of 1 mM calcium allows binding of TagR to the column as a result of the extrusion of its hydrophobic amino acids (
The TagR-tLRAT fusion protein has been expressed as described in Example 1.3 except that bacterial culture has been resuspended in buffer A (Example 1.4) containing also 0.05% SDS, which allowed extensive solubilisation of the fusion protein. The supernatant of the centrifugation has been loaded on the phenyl Sepharose 6 Fast Flow column. The largest share of the protein binds to the column in the presence of 0.05% SDS, which is not true for the GST tag. The column was then washed with at least 20 column volumes of the same buffer without SDS (cleavage of the fusion protein cannot be achieved with thrombin in the presence of SDS). TagR-tLRAT was eluted with buffer C (see Example 1.4). Cleavage of TagR-tLRAT has not been assayed in these particular experiments but it can be either performed directly on the column, such as described in Example 1.6, or after the elution of the fusion protein. Purified TagR-tLRAT remains soluble in the absence of SDS.
1.6 Expression, Purification of PolyHis-TagR-tLRAT, Cleavage of PolyHis-TagR and Purification of tLRAT
The PolyHis-TagR-tLRAT fusion protein has been expressed as described for TagR-tLRAT (see Example 1.3). The pellet from 50 ml of the bacterial culture was resuspended in 10 ml of a lysis buffer (100 mM Tris, pH 7.8, 100 mM NaCl, 1 mM EDTA, 1 mM EGTA, 1.4 μg/μl aprotinin). Bacteria were first disrupted by 3 cycles of freeze-thawing and the cell suspension was then sonicated for 3 min (cycles of 5 s) on ice. Bacteria were then centrifuged at 13,000×g for 30 min. Supernatant was discarded and the membrane pellet was resuspended in the loading buffer (500 mM Tris, 5 mM imidazole, 0.1% SDS, pH 7.8). The supernatant was then loaded on a His-Trap column (GE Healthcare). The column was sequentially washed with 200 ml of a buffer containing 500 mM Tris (pH 7.8), 30 mM imidazole and 1 mM CaCl2 and then with an additional buffer containing 50 mM Tris (pH 7.8), 100 mM NaCl, 0.1 M NDSB201 (3-(1-pyridinio)-1-propanesulfonate, Sigma-Aldrich) and 1 mM CaCl2. The PolyHis-TagR has then been cleaved from tLRAT using thrombin directly on the column for 40 hours at room temperature. The column was then washed with 6 column volumes of the buffer used for the cleavage (50 mM Tris, pH 7.8, 100 mM NaCl, 0.1 M NDSB201, 1 mM CaCl2), which allowed removal of thrombin. tLRAT has then been eluted using a buffer containing 50 mM Tris, pH 7.8, 100 mM NaCl, 1 mM CaCl2, 0.1% SDS and 1 mM DTT. Finally, the cleaved PolyHis-TagR and the uncleaved fusion protein have been eluted with a buffer including 500 mM Tris, pH 7.8, 150 mM imidazole and 0.1% SDS.
The analysis of the level of expression and purity of the proteins was carried out on a Bio-Rad Mini-protean II electrophoresis cell. The protein samples were separated using 15% acrylamide SDS-PAGE (SDS-polyacrylamide gel electrophoresis) (or 12% acrylamide, see Example 4.2 and
2.1 Comparison Between Protein Expression in Fusion with TagR Using Different Vectors
Fusion tags can potentially increase the total yield of proteins of interest by improving their expression and decreasing their degradation.2 However, the level of protein expression is highly dependent on the promoter present in the cloning vector. TagR-tLRAT was therefore expressed in the pET11a and pGEX-4T-3 expression vectors using the same culture conditions. The levels of expression were then compared by SDS-PAGE after bacterial lysis. As shown in
2.2 Comparison Between Protein Expression in Fusion with TagR and GST Using the Same Vector
tLRAT has been expressed in fusion with either TagR or GST in the vector pGEX-4T-3 using the same conditions. As can be seen in the SDS-PAGE of the lysed bacteria shown in
3.1 RP2 Solubility is Much Larger in Fusion with TagR than with GST
A similar level of expression of the same fusion protein has been obtained with the pET11a or pGEX-4T-3 vectors (
3.2 Comparison Between tLRAT Solubility in Fusion with TagR or GST and in Different Growth Media
A similar solubility has been obtained with TagR-tLRAT or GST-tLRAT in pGEX-4T-3. Indeed, as can be seen in
As can be seen in
4.2 Purification of TagR-tLRAT, Cleavage of TagR and Purification of tLRAT
tLRAT is a very hydrophobic protein and it requires the presence of detergent for its solubilisation in the absence of a highly soluble fusion partner. We have indeed shown that, among the large number of detergents assayed, only SDS (0.05% w/v; cmc ~0.17-0.23%) allowed full solubilisation of tLRAT.48 Large concentrations of SDS could result in protein denaturation.51 We have thus demonstrated that tLRAT enzymatic activity remains unchanged from 0.05% up to 1% SDS and that this activity is 55,000 times higher than the highest activity reported in the literature.48 It can therefore be concluded that this range of concentrations of SDS has no detrimental effect on tLRAT. As shown in
5.2 Purification of PolyHis-TagR-tLRAT, Cleavage of PolyHis-TagR and Purification of tLRAT
Dual tags, or tandem purification procedures, have been developed in recent years.1,4,9,11,12,22,23 They consist of two different tags that can either both be cloned at the N- or C-terminus, or one at the N-terminus and the other at the C-terminus. These tags can however be more easily cleaved if they are located at a single end of the protein of interest. Dual tags should preferably include a solubilisation tag as well as a purification tag. A single purification step can be achieved if the cloning strategy is properly designed, but this has hardly been achieved until now. In our experiments, the PolyHis tag was cloned at the N-terminus of TagR in fusion with tLRAT to produce the PolyHis-TagR-tLRAT construct. Purification was achieved using immobilized metal affinity chromatography, which allowed binding of the PolyHis tag in the presence of SDS. Purification could also likely be performed using the ability of TagR to bind the hydrophobic phenyl Sepharose resin in the presence of SDS as shown in
A dual TagR is used to further enhance the solubility of a protein of interest without compromising its ability to bind the hydrophobic phenyl Sepharose resin. For this purpose, a second Recoverin molecule is cloned at the C-terminus of a first Recoverin fusion partner. The expression and purification are carried out as already described.
When comparing the properties of TagR and GST, we observed that the pellet of bacteria transformed with pGEX-4T-3 was typically much larger than that obtained with pET11a. This however hardly resulted in the expression of a higher amount of proteins of interest, such as observed in
Although increasing protein expression is an interesting outcome, the main challenge is to get a soluble protein. In fact, high protein expression may be detrimental for protein solubility. Indeed, at high expression rates, protein folding may not be as efficient as when performed at lower expression rates. This can be well appreciated in
We have purposely chosen to perform a large share of the experiments using tLRAT, a protein which is difficult to solubilize. In fact, tLRAT is poorly soluble with either the GST or TagR fusion partners as shown in
We predict that dual TagR fusion partners (one TagR cloned at the C-terminus of another TagR) would further enhance the solubility of proteins of interest without compromising their ability to bind the hydrophobic phenyl Sepharose resin. Dual TagRs would then represent an improved technology with regard to the solubilisation of «difficult-to-solubilize» proteins of interest. Most importantly, purification of proteins of interest could be achieved even when a detergent must be used for their solubilisation, which was shown to be possible with TagR as a fusion partner (
An additional concern from the industry is the production of a soluble protein of interest after its cleavage from its fusion partner. RP2 and tLRAT were shown to remain soluble after their cleavage from TagR (
Dual tagging methods have recently been used by combining, for example, a solubility-enhancing tag and a purification tag. This procedure can however complicate the removal of the tags unless the construct is carefully planned. We have thus added a PolyHis tag at the N-terminus of TagR to find out if they could be used in tandem. This approach allowed tLRAT to be purified and the PolyHis-TagR tags to be removed at the same time (
GST is by far the most widely used protein tag because of its high solubility and because it typically allows protein purification in a single affinity chromatography step using a resin that is covalently coupled with its glutathione substrate. However, the use of detergents prevents binding of the GST tag to the glutathione resin,41 which limits its use for a large range of proteins, notably those associated with membranes, as they typically require detergent for their solubilisation. In addition, the high affinity of GST for glutathione prevents the direct purification of the protein of interest from cleaved GST after protease digestion. Indeed, dialysis must first be performed to remove glutathione from GST, which lengthens the purification procedure and typically leads to a decrease in protein yield. In contrast, the presence of detergents does not prevent purification of the TagR fusion proteins assayed. Moreover, concerns have been raised with regard to the slow binding kinetics of GST to its resin, which make loading of the cell lysates onto the chromatography column highly time-consuming.52
Maltose binding protein (MBP)53 is the second most widely used protein tag. It has been shown to increase the solubility of a number of proteins of interest.54,55 It was also found to be attractive because it allows affinity purification with a low-cost resin covalently coupled with the amylose substrate of MBP. However, we and others56-58 have found that MBP fusion proteins fail to properly bind to the amylose resin. Consequently, MBP is often used in combination with other tags to achieve purification of its fusion partner which, however, lengthens the purification procedure, decreases protein yield and increases the cost of these experiments.2 Therefore, although it is an inexpensive technology, the difficulty of purifying its fusion protein partner in a single affinity chromatography step has led to a decrease in interest in this protein tag.
Thioredoxin59 and ubiquitin-like modifier (SUMO)60 have attracted less interest, even though they enhance solubility, because they are not purification tags. Consequently, they must be used in combination with small purification tags to achieve protein purification, which complicates the procedure. It is also noteworthy that the SUMO technology is very expensive. These protein tags are thus less appealing.
As shown herein, the TagR technology allows a more efficient expression and solubilisation of its fusion partner than GST. Moreover, it allows purification and cleavage of the fusion partner. In addition, it is cheaper to use than GST and the very popular small PolyHis tag. Indeed, the commercially available resins used to purify GST, PolyHis and TagR cost $62, $43 and $34, respectively, to purify 10 mg of protein. TagR thus appears as a new, attractive technology to improve the expression and solubility of proteins of interest. Moreover, it allows purification of its fusion partner and it can be easily removed by proteolytic cleavage.
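By way of illustration only, the resin costs quoted above can be expressed per milligram of purified protein and as a relative saving. The following Python sketch uses only the three figures given in the text. The function names are hypothetical and the calculation ignores any other cost, such as protease, buffers or labour.

```python
# Resin cost (US dollars) to purify 10 mg of protein, as quoted in the text.
RESIN_COST_USD_PER_10_MG = {"GST": 62.0, "PolyHis": 43.0, "TagR": 34.0}


def cost_per_mg(tag):
    """Resin cost in US dollars per mg of purified protein for the given tag."""
    return RESIN_COST_USD_PER_10_MG[tag] / 10.0


def saving_percent(reference_tag, candidate_tag="TagR"):
    """Relative resin-cost saving (%) of candidate_tag compared with reference_tag."""
    reference = RESIN_COST_USD_PER_10_MG[reference_tag]
    candidate = RESIN_COST_USD_PER_10_MG[candidate_tag]
    return 100.0 * (reference - candidate) / reference


for tag in RESIN_COST_USD_PER_10_MG:
    print(f"{tag}: ${cost_per_mg(tag):.2f} per mg")

print(f"TagR saving vs GST: {saving_percent('GST'):.1f}%")
print(f"TagR saving vs PolyHis: {saving_percent('PolyHis'):.1f}%")
```

On these figures, the TagR resin is about 45% cheaper than the GST resin and about 21% cheaper than the PolyHis resin for the same amount of protein.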
We present herein a new protein tag to improve the solubility and purification of proteins of interest. This new protein tag has been called TagR. It is the widely studied 23 kDa protein Recoverin, which possesses properties typical of protein tags. Indeed, Recoverin is highly soluble (>30 mg/ml), it can be highly purified in a single step and its expression level in bacteria is very high (>30 mg/L of culture).42 Its purification is based on its ability to reversibly bind calcium through a calcium-myristoyl switch. Recoverin undergoes a large conformational change upon calcium binding.43-46 Indeed, calcium binding to two of its four EF-hands induces the extrusion of its myristoyl group as well as of several hydrophobic residues (
We have previously shown that the presence of the myristoyl group is not necessary to achieve high purity of Recoverin.42 The properties of TagR to improve solubility of fusion protein partners and to allow their purification have thus been compared with those of the same proteins of interest using the GST tag.
While the invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features herein before set forth, and as follows in the scope of the appended claims.
All patents, patent applications and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent, patent application, or publication was specifically and individually indicated to be incorporated by reference.
CCG CGT
GGATCC
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CA2016/050849 | 7/19/2016 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62194348 | Jul 2015 | US |