The invention relates to the determination of a nucleotide sequence encoding a modified protein, to the development of vectors for the expression thereof, and to the uses of the vectors obtained and of the proteins thus expressed.
A modified protein according to the invention is a protein “of interest”, i.e. a protein, or a part of this protein, which it is sought to isolate, for example in diagnostics, or to transport, for example in therapy, in the peptide sequence of which are included, by intercalation and/or addition, at least two series of amino acid residues: a series of at least six lysine residues and a series of at least six histidine residues. In the remainder of the description, the terms “series” and “tag” will be used interchangeably to represent a group of amino acid residues. In the examples which will follow, the protein of interest is the HIV-1 capsid protein p24, but the subjects of the invention are not of course limited thereto.
According to document WO-A-98/59241, the authors of the present invention have demonstrated that modification of the peptide sequence of the HIV-1 capsid protein p24, by insertion of a tag of six lysine residues, makes it possible to considerably increase the yield from coupling of the protein to the copolymer AMVE67. It has thus been possible to achieve immobilization of 50 molecules of modified protein per copolymer chain.
The immobilization of proteins finds applications in a large number of fields. For example, in chemotherapy, the immobilization of therapeutic proteins makes it possible to increase their lifetime in the blood by limiting proteolytic degradation (Monfardini et al., 1998), but also makes it possible to passively target tumor cells by virtue of the hyperpermeability of these cells (Duncan et al., 1999). In gene therapy, use is made of ligands specific for cell receptors, which are coupled to cationic polymers, in order to transport genes, allowing effective targeting of the cells to be transfected (Varga et al., 2000).
It is known, moreover, that the yield from purification of a protein by immobilized metal ion affinity chromatography (IMAC) is greatly increased when the protein is modified by introducing a tag of at least six histidine residues.
Documents U.S. Pat. No. 5,916,794 and E. Hochuli et al., Bio/Technology, Nature Publishing Co New-York, US, November 1988, pp 1321-1325 describe fusion proteins comprising a protein of interest, namely a restriction endonuclease for U.S. Pat. No. 5,916,794 and dihydrofolate reductase for E. Hochuli et al., and a tag of histidine residues at one or the other of the N- and C-terminal ends of the protein of interest. The presence of this tag makes it possible to increase the yield from isolation of the protein by immobilized metal chelate affinity chromatography.
According to those documents, after isolation, the histidine tag is detached from the protein of interest via the action of thrombin for U.S. Pat. No. 5,916,794, or by chemical or enzymatic cleavage, for example via the action of carboxypeptidase, for E. Hochuli et al., in order to recover, for subsequent use, the protein of interest. This cleavage step is not without risk since, depending on the nature of the amino acids of the protein of interest, and in particular on whether it possesses sites rich in histidine residues, undesired cleavage may occur in the protein. Similarly, the chemical cleavage conditions may be prejudicial to the structure of the protein of interest.
The invention rests on obtaining a modified protein which, at the same time, can be effectively purified by chromatography such as the IMAC technique, can be readily immobilized on a polymer, and has, once purified and immobilized, at least all the biological properties of the native protein for which the modified protein is used, without the need for an additional step using conditions which risk altering the structure of the protein.
Thus, a first subject of the invention is a nucleotide sequence encoding a modified protein of interest, said modified protein of interest having, after purification and immobilization, at least the same biological activity as the native protein of interest and being directly usable, said sequence comprising at least one gene encoding said protein of interest, a “polyK” nucleotide fragment encoding a series of at least six lysine residues, and a “polyH” nucleotide fragment encoding a series of at least six histidine residues.
For the purpose of the present invention, “the same biological activity” is understood as meaning the same in both qualitative and quantitative terms. The applicant has in fact discovered that the insertion and/or the addition both of a histidine tag and of a lysine tag, and then purification and immobilization of the protein thus modified, does not affect the biological function of the protein of interest and alters neither the specificity nor the sensitivity of the protein. This observation is surprising in that, despite the introduction of these two tags representing at least approximately 5% of all the amino acids constituting a protein, for example the HIV capsid protein p24, and despite the immobilization of the protein thus modified, said protein does not appear to lose the conformation which gives it its activity. The term “directly usable” is understood to mean that the modified protein of interest obtained can, after purification and immobilization, be used like the protein of interest, without a prior treatment step to remove one and/or the other of the two histidine and lysine tags.
The invention is of most particular interest in gene therapy, where the protein is coupled to a polymer.
According to the protein under consideration, and in particular depending on the location of its site(s) of activity, in its peptide sequence, the histidine and lysine residue tags, respectively, should be introduced into one and/or the other of the N- and C-terminal ends, or may be intercalated between the epitopes located in said sequence.
Advantageously:
To this effect, a nucleotide sequence of the invention is chosen from the sequences as defined above and also exhibiting the following characteristics:
A preferred nucleotide sequence is a sequence in which the polyK fragment encodes a series of six lysine residues, and/or the polyH fragment encodes a series of six histidine residues.
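By way of illustration only, the following sketch shows what a polyK or polyH fragment amounts to at the nucleotide level. It uses the standard codons for lysine (AAA, AAG) and histidine (CAT, CAC) and translates the fragment back to the residues it encodes. The function names and the alternating codon choice are assumptions made for this example; they are not the sequences of the invention, which are those of the sequence listing.

```python
# Standard genetic code entries for the two residues of interest.
CODONS = {"K": ("AAA", "AAG"), "H": ("CAT", "CAC")}
CODON_TO_RESIDUE = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def poly_fragment(residue: str, length: int = 6) -> str:
    """Return a DNA fragment encoding `length` copies of `residue`.

    The two synonymous codons are alternated purely for illustration;
    this choice is hypothetical and not taken from the description.
    """
    if length < 6:
        raise ValueError("the description calls for at least six residues")
    first, second = CODONS[residue]
    return "".join(first if i % 2 == 0 else second for i in range(length))


def translate(fragment: str) -> str:
    """Translate a fragment made only of lysine and histidine codons."""
    codons = [fragment[i:i + 3] for i in range(0, len(fragment), 3)]
    return "".join(CODON_TO_RESIDUE[codon] for codon in codons)


poly_k = poly_fragment("K")
poly_h = poly_fragment("H")
print(poly_k, translate(poly_k))
print(poly_h, translate(poly_h))
```

A longer tag is obtained by passing a larger `length`; values below six are rejected, in line with the minimum stated above.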
A spacer arm is advantageously chosen from the nucleotide sequences comprising at least any one of SEQ ID NO: 5 to 8. The sequences SEQ ID NO: 9-12 illustrate the peptide sequences encoded by the nucleotide sequences of the spacer arms SEQ ID NO: 5 to 8.
As will be illustrated in the examples, in a particular use for detecting HIV-1, the protein of interest is HIV-1 p24, identified by SEQ ID NO: 13, and the modified protein has a sequence chosen from SEQ ID NO: 14 to 20.
Before disclosing the other subjects of the invention and describing in detail the characteristics and advantages thereof, a definition of certain terms used in the description and the claims is given hereinafter so that the invention and therefore the scope of the protection are clearly delimited.
A “series or tag of amino acid residues” is a short amino acid sequence which is included in the peptide sequence of the native or original protein, at a preferred site, so as to allow this series or tag to be exposed in a relevant manner, while at the same time conserving, or even improving, the biological properties of the native or original protein. In particular according to the invention, the presentation of the histidine residue tag should be favorable with respect to the affinity of this tag for metal ions, as used in the purification technique referred to as IMAC (immobilized metal ion affinity chromatography), and that of the lysine residue tag should be favorable with respect to its attachment to an immobilization phase via a covalent interaction between the tag and reactive functions present on or in said phase.
The expression “intercalation or insertion of a tag” is understood to mean that the tag is introduced within the peptide sequence of the protein of interest, between two amino acids. The expression “addition of a tag” is understood to mean that the tag is “joined onto” the peptide sequence of the protein of interest, at the N- or C-terminal end of said sequence.
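The distinction between addition and intercalation can be sketched on a peptide sequence written in one-letter code. The ten-residue peptide and the helper names below are hypothetical and serve only to make the two operations concrete; practical details such as an initiator methionine preceding an N-terminal tag are ignored.

```python
def add_tag(protein: str, tag: str, end: str) -> str:
    """Join a tag onto the N- or C-terminal end of a peptide sequence."""
    if end == "N":
        return tag + protein
    if end == "C":
        return protein + tag
    raise ValueError("end must be 'N' or 'C'")


def intercalate_tag(protein: str, tag: str, position: int) -> str:
    """Insert a tag between residues `position` and `position + 1` (1-based)."""
    if not 1 <= position < len(protein):
        raise ValueError("position must fall between two residues")
    return protein[:position] + tag + protein[position:]


# Hypothetical ten-residue peptide used only as a stand-in.
protein = "MAVLPTSGDE"
his_tag, lys_tag = "H" * 6, "K" * 6

# Addition: histidine tag at the N-terminal end, lysine tag at the C-terminal end.
print(add_tag(add_tag(protein, his_tag, "N"), lys_tag, "C"))

# Intercalation: lysine tag between residues 4 and 5.
print(intercalate_tag(protein, lys_tag, 4))
```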
In practice, the recombinant modified proteins obtained according to the invention will commonly comprise amino acids which intercalate between the tags, and/or between the tags and the peptide sequence of the native or original protein, without, however, having any effect on the specificity of the tags or on the biological activity of the protein.
The amino acid residues belonging to a tag according to the invention are chosen from natural amino acids and chemically modified amino acids. The chemical modification introduced into the natural amino acid should preserve, or even develop, the specificity of the tag with respect to its role in the attachment. By way of example, mention may be made of replacement of an L amino acid with the corresponding D amino acid, and vice versa; modification of the side chain of the amino acid: in the case of lysine, it may be an acetylation of the amino group of the side chain; modification of the peptide bonds of the tag, such as carba, retro, inverso, retro-inverso, reduced or methyleneoxy bonds.
The immobilization phase to which the attachment of the modified protein is favored by virtue of the lysine residue tag can be a particulate or linear polymer, in particular chosen from homopolymers such as polylysine, polytyrosine; from copolymers such as copolymers of maleic anhydride, copolymers of N-vinylpyrrolidone, natural or synthetic polysaccharides, polynucleotides and copolymers of amino acids such as enzymes. Advantageous polymers are the N-vinylpyrrolidone/N-acryloxysuccinimide copolymer, poly(6-aminoglucose), horseradish peroxidase (HRP) and alkaline phosphatase.
The immobilization phase comprises reactive functions which will interact by covalence with the lysine tag. These reactive functions are chosen from ester, acid, halocarbonyl, sulfhydryl, disulfide, epoxide and aldehyde functions.
The immobilization phase can be attached, directly or indirectly, to a solid support by passive adsorption or by covalence.
This solid support can be in any suitable form, such as a plate, a tip, a bead, the bead optionally being radioactive, fluorescent, magnetic and/or conductive, a strip, a glass tube, a well, a sheet, a chip, or the like. The material of the support is preferably chosen from polystyrenes, styrene-butadiene copolymers, styrene-butadiene copolymers mixed with polystyrenes, polypropylenes, polycarbonates, polystyrene-acrylonitrile copolymers and styrene-methyl methacrylate copolymers, from synthetic and natural fibers, and from polysaccharides and cellulose derivatives, glass and silicon, and their derivatives.
A nucleotide sequence according to the invention can be readily synthesized by routine techniques which those skilled in the art know how to implement.
Another subject of the invention is an expression system, such as a vector, for expressing a nucleotide sequence of the invention.
When the protein of interest is HIV-1 capsid p24, a suitable vector has a nucleotide sequence chosen from SEQ ID NO: 1 to 4, preferably the nucleotide sequence is SEQ ID NO: 1 or 3.
The invention also relates to a kit of vectors for the expression of at least two different nucleotide sequences of the invention.
An advantageous kit comprises vectors encoding the expression at least of two nucleotide sequences in which, with respect to said gene encoding the protein of interest, the two nucleotide fragments, polyK or polyH, are located on the 5′ end of the sequence; or of two nucleotide sequences in which, with respect to said gene encoding the protein of interest, at least one of the two nucleotide fragments, polyK or polyH, is located on the 5′ end of the sequence, and the other of the two nucleotide fragments, polyH or polyK, is located on the 3′ end; or else of two nucleotide sequences in which, with respect to the gene, the two nucleotide fragments, polyK and polyH, are located on the 3′ end of the sequence.
Another advantageous kit comprises vectors encoding the expression at least of a nucleotide sequence in which, with respect to said gene encoding the protein of interest, the two nucleotide fragments, polyK or polyH, are located on the 5′ end of the sequence; of a nucleotide sequence in which, with respect to said gene encoding the protein of interest, at least one of the two nucleotide fragments, polyK or polyH, is located on the 5′ end of the sequence, and the other of the two nucleotide fragments, polyH or polyK, is located on the 3′ end; and of a nucleotide sequence in which, with respect to the gene, the two nucleotide fragments, polyK and polyH, are located on the 3′ end of the sequence.
Another subject of the invention is a host cell comprising at least one vector of the invention, in which at least one nucleotide sequence as defined above is expressed.
This ability to obtain and express, in a vector for example, a nucleotide sequence has led the authors to develop a simple method for obtaining a purified and immobilized modified protein of interest, said modified protein of interest having at least the same biological activity as the protein of interest and being directly usable.
This method comprises the following steps:
The authors have also defined a simple and optimal method for obtaining a purified and immobilized modified protein of interest, said modified protein of interest having at least the same biological activity as the protein of interest and being directly usable, said method comprising the following steps:
According to a variant of the method of the invention, said method can also comprise the following steps:
This method makes it possible to select a purified and immobilized modified protein of interest in which the position of the histidine and lysine tags is optimal from the point of view of the biological activity of the modified protein.
A modified protein of interest according to the invention can be readily purified and immobilized and is directly usable, after purification and immobilization, these steps being carried out with very high yields.
The characteristics and advantages of the various subjects of the invention are illustrated hereinafter, in support of Examples 1 to 6 and of FIGS. 1 to 6, according to which:
Schematically, the vectors for expressing the tagged recombinant proteins were generated from the expression vector pMR24 obtained by ligation of the NcoI-XbaI fragment of pMH24 (Cheynet et al., 1993) containing the p24 gene, with the NcoI-XbaI fragment of pMR-T7 (WO 98/45449, Arnaud et al., 1997) containing all the sequences regulating replication of the plasmid and the elements for expressing the inserted gene. Suitable oligonucleotide linkers providing the coding information relating to the lysine and/or histidine tags were inserted between ClaI and NcoI in the 5′ position and SmaI and XbaI in the 3′ position, so as to obtain a nucleotide sequence according to the invention. The portion of the p24 gene encoding the polypeptide beginning at amino acid 3 (valine) and terminating at amino acid 224 (proline) is conserved in all the constructs.
The seven inserted nucleotide sequences were designed as follows: all have a nucleotide sequence encoding a series (or tag) of 6 histidine residues, which should allow efficient purification of the protein by metal ion affinity (IMAC for immobilized metal ion affinity chromatography), and five of them have a sequence encoding a series (or tag) of six lysines, in order to allow covalent coupling of the protein to the polymer.
The recombinant modified proteins obtained are as follows:
The spacer sequence of the recombinant proteins R24KsH and RHsK24 is represented by “s” and consists of a series of four glycine residues and one serine residue, which can be repeated several times.
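The names of the seven recombinant proteins used in the description (RH24, R24H, RH24K, RK24H, R24KH, R24KsH and RHsK24) can be read as an N-terminal to C-terminal layout. The sketch below assumes that, after a leading “R”, “H” stands for the tag of six histidine residues, “K” for the tag of six lysine residues, “24” for the p24 portion and “s” for the spacer. This reading of the names, the GGGGS order of the spacer residues and the `<p24>` placeholder are assumptions made for the example.

```python
import re

# Construct names as used in the description.
CONSTRUCTS = ["RH24", "R24H", "RH24K", "RK24H", "R24KH", "R24KsH", "RHsK24"]

ELEMENTS = {
    "H": "H" * 6,    # tag of six histidine residues
    "K": "K" * 6,    # tag of six lysine residues
    "24": "<p24>",   # placeholder for the conserved p24 portion
}
SPACER_UNIT = "GGGGS"  # four glycines and one serine; residue order assumed


def layout(name, spacer_repeats=1):
    """Return the elements of a construct from N- to C-terminal end."""
    body = name[1:] if name.startswith("R") else name
    tokens = re.findall(r"24|H|K|s", body)
    if "".join(tokens) != body:
        raise ValueError(f"unrecognized construct name: {name}")
    return [SPACER_UNIT * spacer_repeats if token == "s" else ELEMENTS[token]
            for token in tokens]


for name in CONSTRUCTS:
    print(name, "-".join(layout(name)))

# Constructs carrying a lysine tag in addition to the histidine tag.
with_lysine = [name for name in CONSTRUCTS if ELEMENTS["K"] in layout(name)]
print(len(with_lysine), "of", len(CONSTRUCTS), "constructs carry a lysine tag")
```

Under this reading, all seven constructs carry the histidine tag and five of them also carry the lysine tag, which matches the design stated above.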
E. coli strain XL1 competent bacteria were transformed with the seven plasmids obtained in Example 1, and protein expression was induced by adding isopropyl-β-D-thiogalactopyranoside (IPTG), as previously described (Cheynet et al., 1993, Arnaud et al., 1997). The proteins are extracted, after sonication of the bacterial pellet, in 50 mM Tris buffer, pH 8.0, containing 1 mM EDTA, 10 mM MgCl2 and 100 mM NaCl, in the presence of antiproteases (10 μg/μl leupeptin and 1.25 μg/μl aprotinin), and then purified by IMAC. The purifications were carried out on a zinc ion-activated Sepharose gel. The recombinant proteins comprising a tag of 6 histidine residues are chelated by the metal ions. The chromatographic system used is an FPLC (Akta Explorer, Pharmacia Biotech). The loading loop is 2 ml. The purifications are carried out by injection of protein diluted ½ in the washing buffer, which is a 67 mM phosphate buffer, pH 7.8, containing 0.5 M NaCl.
The proteins of interest are eluted specifically at approximately pH 4.7 by producing a pH gradient using ammonium acetate buffers, pH 6.0 and pH 3.0. The various purification fractions are collected. 10 μl of each of these fractions are deposited onto Whatman 3MM Chr paper and then stained with Coomassie blue. The fractions (nonretained proteins and purified protein) are then migrated on 12% acrylamide gels after reduction with β-mercaptoethanol and heating for 10 minutes at 95° C., and then stained with Coomassie blue. The fractions containing the highest concentrations of protein of interest are then combined and then dialyzed in a PIERCE Slide-A-Lyzer MWCO 10000 dialysis cassette for 1 hour and then overnight at 4° C., against a 50 mM phosphate buffer, pH 7.8. The protein concentrations are then determined using a colorimetric Bradford Coomassie Plus Assay (PIERCE).
The bacterial protein extracts and the purified proteins are migrated on 12% acrylamide gels after reduction with β-mercaptoethanol and heating for 10 minutes at 95° C., and then stained with Coomassie blue. For the purified proteins, a gel run in parallel is transferred by Western blotting onto a nitrocellulose membrane (Hybond C extra, Amersham Life Science). The nonspecific sites of the membrane are then saturated with Tris buffered saline (TBS)-0.1% Tween, to which 5% of milk has been added. After 3 washes in TBS-T, the membrane is incubated for 2 hours at ambient temperature in the presence of the biotinylated rabbit polyclonal anti-p24 antibody diluted 1/10 000 in TBS-T buffer+5% milk. After 3 washes in TBS-T, the membrane is incubated for 1 hour at ambient temperature in the presence of streptavidin-peroxidase (Jackson ImmunoResearch) at 0.5 g/l diluted 1/3000 in TBS-T buffer+5% milk. Three washes in TBS-T are performed before visualization by ECL+ chemiluminescence (Amersham Pharmacia Biotech, RPN2132). Autoradiography for 15 seconds in a dark room is performed on Kodak Biomax MR film.
The analysis of the expression (
Finally, similar amounts of the recombinant proteins RH24, R24H, RH24K, RK24H, R24KH and R24KsH are obtained, namely between 2 and 5 mg per gram of biomass for given culturing and extraction conditions, and only 0.4 mg of RHsK24 is obtained, in agreement with its low level of expression. It is observed that, by optimizing the culturing conditions such as the culture volume and the extraction step, yields of 9 to 16 mg per gram of biomass could be obtained for RH24 and RH24K.
The result of the protein purification step is represented in
The purified proteins are then characterized more precisely by mass spectrometry coupled to liquid chromatography (LC/ESI/MS). The analyses were carried out on an API 100 single-quadrupole mass spectrometer, 140B pumps and a 785A detector (Perkin Elmer). The reverse-phase liquid chromatographies were carried out on a C4 column (Vydac Ref 214PT5115, 5 μm particle size). The elution buffers are, for solvent A: 0.1% (v/v) formic acid in water and, for solvent B: formic acid in a water/acetonitrile (5:95 v/v) solution. A gradient of 40 to 60% of B was used.
For each recombinant protein,
The results show that the molecular masses determined by mass spectrometry are in accordance with those expected for the RH24, RH24K, RK24H and RHsK24 proteins, and that, therefore, the proteins used correspond to those deduced from the translation of the modified gene. The R24KsH, R24KH and R24H proteins exhibit, respectively, a mass deficit of 119, 121 and 123 Da, probably corresponding to the loss of the carboxy-terminal isoleucine. This affects neither of the two tags.
The efficiency of coupling of these diversely tagged proteins to copolymers of maleic anhydride was tested. The covalent immobilization of proteins to polymers is carried out by establishing a covalent amide bond between the anhydride groups of the polymer and the primary amines present on the side chains of the lysine residues, as illustrated in the scheme below. However, since the polymer is not water-soluble, it is necessary to dissolve it in anhydrous DMSO (dimethyl sulfoxide) prior to the coupling reaction carried out in 95% aqueous medium.
Operating Conditions:
The covalent coupling reaction is performed spontaneously by incubation for 3 hours at 37° C. on a thermal stirrer.
The conjugates are then characterized as follows.
The samples are filtered in Ultrafree Millex HV 0.45 μm tubes (Millipore) and then analyzed by steric exclusion chromatography on a Shodex Protein KW 803 column. The chromatographic system is a Kontron HPLC comprising a 422 pump, a 465 automatic injector and a DAD (Diode Array Detector). The elution is performed in 0.1 M phosphate buffer, pH 6.8+0.5% SDS (m/m) with a flow rate of 0.5 ml/min. The detection is carried out by measuring absorbance at 280 nm (at the concentration used, the polymer does not absorb).
The ratio of the area of the peak corresponding to the protein coupled to the polymer versus the sum of the two peaks corresponding to the coupled and free proteins (i.e. the total amount of proteins involved in the reaction) gives the value for the coupling yield (Y).
$$Y = \frac{(\text{Area of the protein/polymer conjugate peak})_{280\,\text{nm}} \times 100}{(\text{Area of the protein/polymer conjugate peak})_{280\,\text{nm}} + (\text{Area of the free protein peak})_{280\,\text{nm}}}$$
The number of proteins per polymer chain is defined by the following relationship: N=n.Y/n′ where n and n′ represent, respectively, the number of moles of proteins and the number of polymer chains in the reaction medium.
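The two relationships above can be written out as a short calculation. This is a minimal sketch under stated assumptions: the peak areas are invented for the example, Y is converted from a percentage to a fraction before being used in N = n.Y/n′, and the polymer quantity is taken as 7.46×10⁻¹¹ mol, a value assumed here because it is consistent with roughly 50 proteins per polymer chain.

```python
def coupling_yield(area_conjugate: float, area_free: float) -> float:
    """Coupling yield Y, in percent, from the two peak areas at 280 nm."""
    total = area_conjugate + area_free
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * area_conjugate / total


def proteins_per_chain(n_protein_mol: float, n_chain_mol: float,
                       yield_percent: float) -> float:
    """N = n * Y / n', with Y converted from percent to a fraction (assumption)."""
    if n_chain_mol <= 0:
        raise ValueError("the number of polymer chains must be positive")
    return n_protein_mol * (yield_percent / 100.0) / n_chain_mol


# Hypothetical peak areas (arbitrary units), chosen only for illustration.
y = coupling_yield(area_conjugate=95.0, area_free=5.0)

# Molar quantities: 3.56e-9 mol of protein; 7.46e-11 mol of polymer (assumed).
n = proteins_per_chain(n_protein_mol=3.56e-9, n_chain_mol=7.46e-11,
                       yield_percent=y)
print(f"Y = {y:.1f} %, N = {n:.1f} proteins per chain")
```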
The data in
The concentrations used are as follows:
[proteins]=0.95 g/l (3.56×10⁻⁹ mol), [AMVE67]=0.048 g/l (7.46×10⁻¹¹ mol).
□ represents the proteins containing only a tag of 6 histidine residues, ▪ represents the proteins with a tag of 6 histidine residues opposite the tag of 6 lysine residues, ▪ represents the proteins with tags of 6 histidine residues and 6 lysine residues which are contiguous. The experiments were carried out 3 times, the values indicated correspond to the mean plus one standard deviation.
In the absence of lysine residues, the coupling yields are between 10 and 30%. They are greater than 95% when the tag of 6 lysine residues is present on the protein. The presence of a tag of 6 lysine residues therefore makes it possible to considerably improve the coupling efficiency (by comparison of RK24H, R24KH, R24KsH and RHsK24 with RH24 and R24H), independently of its N- or C-terminal position (comparison of RK24H with RH24K, and RHsK24 with R24KsH), opposite or adjacent to the tag of 6 histidine residues (comparison of RH24K and RK24H with R24KH, R24KsH and RHsK24).
The improvement in the yield from coupling the Lys-6 proteins to the AMVE67 copolymer suggests that the coupling reaction is regioselective, namely that it involves the lysine residue tag.
The biological reactivity of the conjugates was evaluated as a function of the N- or C-terminal position of the tag and of the N- or C-terminal position of the epitope recognized by the monoclonal antibody. Two proteins were selected for this study, RH24K and RK24H, having, respectively, a tag of six lysine residues at the C-terminal and N-terminal position, and opposite the tag of six histidine residues.
The ELISA protocol was carried out as follows: 100 μl/well of protein-polymer conjugate diluted to 0.25 μg/ml in PBS buffer are immobilized at the bottom of a 96-well microplate (Nunc Immuno Plate Maxisorp surface) by overnight incubation at ambient temperature. The nonspecific sites are then saturated for 2 hours at 37° C. with 200 μl/well of a solution of PBS containing 1% (w/v) Régilait™. The wells are then washed 3 times in PBS-0.05% Tween. The monoclonal antibodies diluted at the appropriate dilution in PBS buffer-0.05% Tween-0.2% Régilait™ are then incubated for 1 hour at 37° C. After 3 washes in PBS-0.05% Tween, the peroxidase-labeled anti-mouse conjugate (Jackson ImmunoResearch) diluted 1/2000 in PBS-0.05% Tween-1% Régilait™ is incubated for 1 hour at 37° C. Three washes in PBS-0.05% Tween are carried out before the visualization during which 100 μl of a solution containing a 30 mg OPD tablet diluted in 10 ml of OPD substrate buffer (Sanofi Pasteur) are incubated for 10 min in the dark at ambient temperature. The reaction is then blocked by adding 100 μl/well of 1 N H2SO4, and the absorbance values are then read on a spectrophotometer at 492 nm.
The data in
The Table gives the signal obtained by ELISA with a protein-polymer conjugate coating.
The results show that the ELISA signal is better when the tag is in an opposite position to the epitope recognized by a monoclonal antibody. Thus, the monoclonal antibody which recognizes an epitope located at the N-terminal position (MAb 15F8) exhibits a signal 1.3 times greater for a protein immobilized via its C-terminal region (RH24K) than for a protein immobilized via its N-terminal region (RK24H). Conversely, an antibody which recognizes an epitope located in the C-terminal position exhibits a signal 8.3 times (MAb 23A5) and 2.25 times (MAb 3D8) greater when the protein is immobilized via its N-terminal region (RK24H) than when said protein is immobilized via its C-terminal region (RH24K).
Given the expression, purification and oriented coupling capacities exhibited by the various double-tagged proteins derived from the p24 model, expression vectors allowing the insertion of a gene of interest for which the three properties would be required were produced. These vectors combine sequences encoding a tag of six histidine residues for efficient purification by metal ion chelation and a tag of six lysine residues for oriented covalent immobilization. According to the use and/or to restrictions imposed by the position of the active site of the protein, the expression vectors proposed exhibit various possible combinations.
The vector pMK81 is derived from the expression vector pH24K by cleavage with NcoI and SmaI, and then by ligation to the sequence of the NcoI-SmaI polyLinker. The vector pMK81 contains, in the 5′ position, a reading frame encoding a His-6 tag, unique cloning sites for the insertion of genes encoding proteins of interest and, in the 3′ position, a reading frame encoding a Lys-6 tag. It is 4935 bp in size.
The vector pMK82 is derived from the expression vector p24KH by cleavage with NcoI and SmaI, and then by ligation to the NcoI-SmaI polyLinker sequence. The vector pMK82 contains, in the 5′ position, a translation start codon, unique cloning sites for the insertion of genes encoding proteins of interest and, in the 3′ position, a reading frame encoding a Lys-6 His-6 double tag. It is 4921 bp in size.
The vector pMK83 is derived from the expression vector pK24H by cleavage with NcoI and XhoI, and then by ligation to the NcoI-XhoI polyLinker sequence. During construction, the XhoI site was deleted. The double-stranded oligonucleotides were obtained by hybridization of each strand in buffer containing 50 mM NaCl, 6 mM Tris/HCl, pH 7.5, and 8 mM MgCl2, by heating for 5 minutes at 65° C., and slow cooling at ambient temperature. The vector pMK83 contains, in the 5′ position, a reading frame encoding a Lys-6 tag, unique cloning sites for the insertion of genes encoding proteins of interest and, in the 3′ position, a reading frame encoding a His-6 tag. It is 4945 bp in size.
The vector pMK84 is derived from the expression vector pHK24 by cleavage with NcoI and SmaI, and then by ligation to the sequence of the NcoI-SmaI polyLinker. The vector pMK84 contains, in the 5′ position, a reading frame encoding a His-6 Lys-6 double tag, unique cloning sites for the insertion of genes encoding proteins of interest and, in the 3′ position, a translation stop codon. It is 4951 bp in size.
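The four vectors described above can be summarized as a small lookup table, from which a vector may be chosen according to the end of the protein that should carry the lysine tag. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the selection rule simply applies the ELISA observation that the signal is better when the tag used for immobilization is opposite the epitope to be recognized.

```python
# 5' and 3' elements and sizes as stated in the description of each vector.
VECTORS = {
    "pMK81": {"five_prime": "His-6", "three_prime": "Lys-6", "size_bp": 4935},
    "pMK82": {"five_prime": "start codon", "three_prime": "Lys-6 His-6",
              "size_bp": 4921},
    "pMK83": {"five_prime": "Lys-6", "three_prime": "His-6", "size_bp": 4945},
    "pMK84": {"five_prime": "His-6 Lys-6", "three_prime": "stop codon",
              "size_bp": 4951},
}


def vectors_with_lysine_tag_at(end: str) -> list:
    """Vectors placing the Lys-6 tag at the N- ('N') or C-terminal ('C') end."""
    key = {"N": "five_prime", "C": "three_prime"}[end]
    return sorted(name for name, v in VECTORS.items() if "Lys-6" in v[key])


def vectors_for_epitope(epitope_end: str) -> list:
    """Vectors whose Lys-6 tag lies opposite the epitope (hypothetical rule)."""
    opposite = {"N": "C", "C": "N"}[epitope_end]
    return vectors_with_lysine_tag_at(opposite)


print(vectors_for_epitope("N"))  # epitope near the N-terminal end
print(vectors_for_epitope("C"))  # epitope near the C-terminal end
```

In practice the choice would also depend on the position of the active site of the protein of interest, which this simple rule does not model.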
The characteristics of the vectors represented in
Number | Date | Country | Kind |
---|---|---|---|
01/15081 | Nov 2001 | FR | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/FR02/04004 | 11/21/2002 | WO |