The present invention relates to the field of biotechnology, particularly to proteomics. Generally speaking, proteomics can be defined as a set of interrelated tools, techniques and methods for studying proteomes. The term proteome is used to define the protein complement of the genome.
Today, the combination of separation technologies, mainly two-dimensional gel electrophoresis (2DE), with mass spectrometry and automated database searching has made possible the high-throughput identification of proteins in complex mixtures. 2DE is the highest-resolution technique for separating complex mixtures of proteins; however, it has several limitations (Membrane proteins and proteomics: Un amour impossible. Santoni V., Molloy M., Rabilloud T. Electrophoresis. 21, 1054-1070, 2000; Proteome profiling-pitfalls and progress. Haynes P. A., Yates III J. R. Yeast. 17, 81-87, 2000):
Sequencing only 3 to 5 amino acids of a peptide by mass spectrometry is enough for a reliable identification of the source protein (Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Mann M, Wilm M., Anal. Chem. 1994, 66, 4390-4399), and the proteolytic digestion of even a hydrophobic protein can generate non-hydrophobic peptides that are easy to handle. Both facts make it preferable to manipulate peptides rather than intact proteins, and they have given rise to variants of proteomics that do not use two-dimensional electrophoresis and that allow the high-throughput identification of a great number of proteins in the analyzed mixtures.
The first step in this direction was taken in 1999 by Link et al. (Link, A.J. et al. Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 1999, 17, 676-682), who coupled a liquid chromatography system directly to a mass spectrometer (LC-MS/MS). They packed a microcapillary column with a cation exchanger followed by a reverse-phase material. Initially, at acidic pH, all peptides are retained in the cation exchanger, and fractions of these peptides are eluted in batches by increasing the salt concentration of the mobile phase. Then a continuous gradient of acetonitrile elutes the peptides retained in the reverse-phase column, and they are analyzed by mass spectrometry in order to identify them in sequence databases. This procedure is repeated several times until all the peptides retained in the cation exchange column have been eluted. The procedure, known as MudPIT (Multidimensional Protein Identification Technology), has allowed the identification of 1484 proteins from a total hydrolysate of Saccharomyces cerevisiae (Washburn M. P. et al. Large-scale analysis of the yeast proteome by multidimensional protein identification technology, Nature Biotechnology, 2001, 19, 242-247). This two-dimensional fractionation is essential for identifying a great number of proteins, given the complexity of the analyzed peptide mixture, and it has promoted the high-throughput identification of proteins by liquid chromatography coupled to mass spectrometry.
However, the relative quantification of proteins in complex mixtures cannot be easily achieved. The first step toward solving this problem was taken by Washburn MP et al. (Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 2002, 74, 1650-1656), who grew Saccharomyces cerevisiae cells in a normal culture medium containing nitrogen-14 (14N) and other cells in an enriched medium containing nitrogen-15 (15N). By this procedure, all the proteins derived from one condition are labeled with 15N while the proteins of the other condition are labeled with 14N. The proteins derived from both conditions are mixed and then identified through the sequencing of peptides by mass spectrometry. The determination of the relative quantities of the proteins is based on the 15N/14N intensity ratio of the corresponding labeled peptides.
This approach based on isotopic labeling is not always possible: because of the high cost of isotopically enriched culture media, it can only be performed in simple organisms such as yeast and bacteria. Moreover, it is important to point out that 15N-labeling introduces certain limitations in the identification process, because the mass shift between the labeled and non-labeled peptides varies according to the number of nitrogen atoms in their sequences.
To overcome this last limitation, other authors restricted the isotopic labeling to certain amino acids: the organism under study is grown in a culture medium depleted of a natural amino acid and supplemented with the corresponding labeled amino acid.
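The sequence dependence of the 15N mass shift can be illustrated with a short calculation. The following is a minimal sketch, assuming complete 15N incorporation and unmodified peptides made of the 20 standard amino acids; the function names and the example sequences are hypothetical and serve only as illustration.

```python
# Mass difference between 15N and 14N in daltons (15.0001089 - 14.0030740).
DELTA_15N = 0.9970349

# Side-chain nitrogen atoms of the standard amino acids that have any.
# Every residue additionally contributes one backbone nitrogen.
SIDE_CHAIN_N = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}


def nitrogen_count(peptide: str) -> int:
    """Total number of nitrogen atoms in an unmodified peptide."""
    return sum(1 + SIDE_CHAIN_N.get(aa, 0) for aa in peptide)


def n15_mass_shift(peptide: str) -> float:
    """Mass shift (Da) of the fully 15N-labeled peptide relative to 14N."""
    return nitrogen_count(peptide) * DELTA_15N


if __name__ == "__main__":
    # Hypothetical sequences of equal length but different nitrogen content.
    for seq in ("GASPVK", "HNQWRK"):
        print(seq, nitrogen_count(seq), round(n15_mass_shift(seq), 4))
```

Two peptides of the same length can therefore show very different light/heavy spacings, which is why the pairing of labeled and non-labeled signals cannot rely on a fixed mass difference.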
This strategy is known as SILAC (stable isotope labeling by amino acids in cell culture), and there is a considerable number of publications that use 12C/13C and 1H/2H labeling in culture media supplemented with labeled and non-labeled leucine, arginine or lysine in the two compared conditions (S.E. Ong, B. Blagoev, I. Kratchmarova, D. B. Kristensen, H. Steen, A. Pandey, M. Mann, Mol. Cell Proteomics. 2002, 1, 376-386; Berger S J, Lee S W, Anderson G A, Lijiana PT, Tolić N, Shen Y, Zhao R, Smith R D High-throughput Global Peptide Proteomic Analysis by Combining Stable Isotope Amino Acid Labeling and Data-Dependent Multiplexed-MS/MS. Analytical Chemistry 2002, 74, 4994-5000; and Precise peptide sequencing and protein quantification in the human proteome through in vivo lysine-Specific Mass Tagging, J. Am. Soc. Mass Spectrom. 2003, 14, 1-7). The mass shift between the light and the heavy amino acid is detected only if the peptide contains the labeled amino acid; therefore, peptides that do not contain the labeled amino acid cannot be used for quantification. SILAC cannot be used universally in all proteomics experiments because of its high cost, and it is only applicable to biological problems that are studied in cell culture.
The most universal method to perform the quantification is 18O-labeling. It consists of the proteolytic digestion of all the proteins from one condition in the presence of normal water (H2O), while the proteins from the other condition are hydrolyzed in the presence of 18O-labeled water (H218O). The peptides obtained in the buffer prepared with H218O can incorporate either one or two 18O atoms at their C-terminal end, whereas the other peptides show their natural isotopic ion distribution.
To perform the quantification, the labeled and non-labeled peptides are mixed, and the ratio of the areas of the isotopic ion distributions of the 16O- and 18O-labeled species in the mass spectra is proportional to the relative concentration of the proteins that originated them in the compared samples.
This method has two limitations: (1) it does not yield a sufficient separation between the isotopic ion distributions of the labeled and non-labeled peptides, and (2) the addition of 18O is not homogeneous, since either one or two 18O atoms are incorporated. Both problems make the relative quantification of the species difficult unless software allows the appropriate interpretation of the complex overlap of the isotopic ion distributions.
To overcome this problem, Yao et al. (Yao X, Afonso C and Fenselau C. Dissection of proteolytic 18O-labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates. J. Proteome Res. 2003, 2, 147-52) proposed the complete labeling of peptides with 18O after the proteolytic digestion, by an extended incubation of the peptides in the presence of trypsin. This guarantees the complete incorporation of two 18O atoms at the C-terminal end of the peptides, so that there is a separation of 4 Da between the isotopic ion distributions of the labeled and non-labeled peptides.
However, on the one hand, some peptides are resistant to the additional incorporation of 18O once they have been generated by tryptic cleavage, and they introduce considerable errors in the quantification. On the other hand, the additional long incubation needed to ensure the complete exchange of 16O by 18O may introduce non-specific cleavages in the peptide sequence, which affects the protein identification in the sequence database.
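The spacing produced by one or two incorporated 18O atoms can be worked out directly. The sketch below is only an illustration: the constant is the 18O/16O mass difference, and the function name and the charge states shown are assumptions made for the example.

```python
# Mass difference between 18O and 16O in daltons (17.9991610 - 15.9949146).
DELTA_18O = 2.0042464


def label_spacing_mz(n_18o: int, charge: int) -> float:
    """m/z spacing between the unlabeled peptide and the species
    carrying n_18o atoms of 18O at its C-terminus."""
    if n_18o not in (1, 2):
        raise ValueError("proteolytic labeling adds one or two 18O atoms")
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_18o * DELTA_18O / charge


if __name__ == "__main__":
    for n in (1, 2):
        for z in (1, 2, 3):
            print(f"{n} x 18O, charge {z}+: {label_spacing_mz(n, z):.4f} m/z")
```

With a single 18O the heavy species lies about 2 Da above the light one and falls inside the natural isotopic envelope of most peptides; complete double labeling doubles the spacing to about 4 Da, and for multiply charged ions the observed spacing shrinks in proportion to the charge.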
Wang et al. implemented a new method named inverse 18O-labeling (Y. Karen Wang, Zhixian Ma, Douglas F. Quin, W. Fu. Inverse 18O Labeling Mass Spectrometry for the Rapid Identification of Marker/Target Proteins. Anal. Chem. 2001, 73, 3742-3750), in which two pools of proteins (each composed of eight proteins), the control sample and the test sample, were prepared. Each pool is divided into two equal fractions that are digested in buffers prepared with 18O and 16O water, respectively. After the digestions, two mixtures are made, in each of which the control and the test sample are represented by fractions labeled with different isotopes. This original method simplifies the data analysis because the subtraction of both experiments eliminates all the proteins whose expression level did not change; it allows the detection of modifications caused by certain perturbations; it is suitable for high-throughput analysis; and it does not introduce any restriction on the signals whose expression level changed.
Inverse labeling facilitates the identification of proteins with extreme differential expression. However, the method requires two experiments, which limits its applicability because it needs a larger amount of sample, and sample amount is very often a limiting factor in proteomics.
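The logic of comparing the two inversely labeled experiments can be sketched as follows. This is a simplified illustration, not the published data-analysis procedure: the function name, the fold-change threshold and the ratio values are hypothetical, and each experiment is assumed to have been reduced to one 18O/16O intensity ratio per peptide.

```python
def flag_changed(forward: dict, inverse: dict, fold: float = 2.0) -> dict:
    """Flag peptides whose 18O/16O ratio inverts between the two experiments.

    forward: control digested in 16O water, test sample in 18O water.
    inverse: control digested in 18O water, test sample in 16O water.
    A real expression change gives reciprocal ratios in the two mixtures;
    unchanged peptides give ratios near 1 in both and are dropped.
    """
    flagged = {}
    for peptide in sorted(forward.keys() & inverse.keys()):
        f, i = forward[peptide], inverse[peptide]
        if f >= fold and i <= 1.0 / fold:
            flagged[peptide] = "up in test"
        elif f <= 1.0 / fold and i >= fold:
            flagged[peptide] = "down in test"
    return flagged


if __name__ == "__main__":
    # Hypothetical 18O/16O intensity ratios for three peptides.
    forward = {"PEP1": 4.0, "PEP2": 1.05, "PEP3": 0.2}
    inverse = {"PEP1": 0.25, "PEP2": 0.95, "PEP3": 5.0}
    print(flag_changed(forward, inverse))
```

Requiring the inversion in both mixtures is what removes unchanged proteins from the comparison, at the price of the second experiment and the additional sample it consumes.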
18O-labeling has also been used in quantitative proteomics for the specific labeling of N-glycopeptides, where one of the samples is deglycosylated in a buffer prepared with H218O and the other in a buffer prepared with normal water (Kuster, B and Mann M. 18O-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal. Chem. 1999, 71, 1431-1440). After mixing equal proportions of the analyzed samples, the relative quantities of the individual proteins are estimated in the same way, from the 16O/18O ratio at the side chain of the Asp residue produced by the deglycosylation of the N-glycopeptides (Gonzalez J, Takao T, Hori H, Besada V, Rodriguez R, Padron G, Shimonishi Y. A Method for Determination of N-Glycosylation Sites in Glycoproteins by Collision-Induced Dissociation Analysis in Fast Atom Bombardment Mass Spectrometry: Identification of the Positions of Carbohydrate-Linked Asparagine in Recombinant α-Amylase by treatment with PNGase-F in 18O-labeled Water. Anal. Biochem., 1992, 205, 151-158).
In fact, the high specificity of this method for N-glycopeptides has been proposed as a strategy to discriminate false-positive identifications (Kuster, B and Mann M. 18O-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal. Chem. 1999, 71, 1431-1440) by restricting the database search to peptides containing potential N-glycosylation sites (Asn-X-Ser/Thr).
Another variant of quantitative proteomics was introduced by Zappacosta and Annan (Zappacosta F, and Annan RS. N-terminal isotope tagging strategy for quantitative proteomics: results-driven analysis of protein abundance changes. Anal Chem. 2004, 76, 6618-6627). It introduces the label into all proteolytic peptides: all the lysine residues are first transformed into homoarginines, and then all the amino-terminal groups of the proteolytic peptides obtained in one of the compared conditions are derivatized with a blocking reagent enriched in a heavy isotope (particularly deuterium), while the amino-terminal groups of the peptides obtained in the other condition are modified with the non-labeled reagent. After mixing both samples, the quantification is performed by estimating the intensity ratio of the isotopic distributions corresponding to the light and heavy peptides.
To avoid the overlap of the isotopic distributions of the labeled and non-labeled species, the derivatization of peptides is performed with deuterated (d5) and normal propionic anhydride. However, the incorporation of more than 3 deuterium atoms might introduce errors in the relative quantification of the light and heavy species because of their different retention times in reversed-phase chromatography, as demonstrated by Zhang and coworkers (Zhang R, Sioma C S, Wang S, Regnier F E. Fractionation of isotopically labeled peptides in quantitative proteomics. Anal Chem. 2001 73: 5142-5149). The errors in the quantification may increase with the number of deuterium atoms incorporated in the heavy species (Zhang R, Sioma C S, Thompson R A, Xiong L, Regnier F E. Controlling deuterium isotope effects in comparative proteomics. Anal Chem. 2002; 74, 3662-3669), and it has been reported that these errors are minimized when the blocking is performed with 13C-labeled reagents (Zhang R, Regnier F E, Minimizing resolution of isotopically coded peptides in comparative proteomics. J. Proteome Res. 2002, 1, 139-147).
The analysis of proteolytic digests of complex protein mixtures constitutes a great challenge, since the overwhelming number of peptides generated surpasses the resolving power of current chromatographic systems and of even the most modern mass spectrometers.
To approach this challenge and to carry out quantitative proteomics without two-dimensional electrophoresis, one trend has been to simplify the peptide mixture by developing methods that selectively isolate a subset of peptides sharing a common characteristic. The study of this subset should not affect the representativeness of the original proteins, so that the largest possible number of proteins present in the initial mixture can be characterized. The combination of a selective isolation of peptides with appropriate isotope labeling techniques allows not only the identification but also the relative quantification of the proteins present in the compared initial mixtures.
Selective Isolation of Cysteine-containing Peptides.
This approach was initiated by Gygi and coworkers (Quantitative analysis of complex protein mixes using isotope-coded affinity tags. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. Nat. Biotechnol. 17, 994-999, 1999) when they proposed the well-known ICAT method (isotope-coded affinity tags), which is based on the selective isolation of cysteine-containing peptides. The method combines affinity chromatography (streptavidin-biotin) with labeling by the ICAT reagent in its light and heavy variants. This reagent consists of three functional elements:
Once the proteins from both conditions have been separately reduced in the presence of DTT, the free cysteines generated in one of the conditions react with the heavy ICAT reagent and those generated in the other condition with the light ICAT reagent. Both protein mixtures are combined in identical quantities and then subjected to proteolytic digestion. The generated peptides are purified on a streptavidin affinity column and, as a consequence, the cysteine-containing peptides modified with the ICAT reagent are selectively isolated. For the relative quantification, the relative intensities of the signals corresponding to the peptides labeled with the light and heavy ICAT reagents are measured. The masses of the peptides labeled with these reagents differ by multiples of 8 mass units, depending on the number of cysteine residues contained in the sequence.
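The dependence of the light/heavy spacing on the cysteine content can be written down in a few lines. This is a minimal sketch that uses the nominal shift of 8 mass units per labeled cysteine stated above; the function names and the example peptides are hypothetical.

```python
# Nominal mass difference between heavy and light ICAT reagent per labeled
# cysteine, as stated in the text (8 mass units).
ICAT_NOMINAL_SHIFT = 8


def is_isolated(peptide: str) -> bool:
    """Only cysteine-containing peptides carry the biotin tag and are
    retained by the streptavidin column."""
    return "C" in peptide


def icat_pair_spacing(peptide: str, charge: int = 1) -> float:
    """Nominal m/z spacing between the light and heavy forms of a peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return ICAT_NOMINAL_SHIFT * peptide.count("C") / charge


if __name__ == "__main__":
    # Hypothetical tryptic peptides.
    for seq in ("AACDEK", "CGHCLR", "AVLIGK"):
        print(seq, is_isolated(seq), icat_pair_spacing(seq, charge=2))
```

A peptide without cysteine is never recovered from the affinity column, so any protein that yields no cysteine-containing peptide is invisible to this approach.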
This methodology presents several limitations:
Another method has been implemented for the selective isolation of methionine-containing peptides by combined fractional diagonal chromatography, known in the literature as the COFRADIC technology (Gevaert K, Van Damme J, Goethals M, Thomas G R, Hoorelbeke B, Demol H, Martens L, Puype M, Staes E, Vandekerckhove J. Chromatographic isolation of methionine-containing peptides for gel-free proteome analysis: identification of more than 800 Escherichia coli proteins. Mol Cell Proteomics. 2002, 11, 896-903). After reduction and S-alkylation, all proteins are digested, and the peptides are fractionated by reverse-phase chromatography and collected in a considerable number of fractions in what the authors call the primary run. Each of these fractions is treated with a solution of hydrogen peroxide (3%) for 3 minutes and is analyzed again in the same chromatographic system under identical conditions in a second run. Once oxidized, the methionine-containing peptides become less hydrophobic and their retention time decreases, so they can be isolated selectively from the peptides that do not contain methionine, whose retention times remain unchanged in the second run and which are discarded. According to the authors, this method simplifies the complex peptide mixture to an extent similar to that obtained by ICAT, and it can be applied to the selective isolation of phosphopeptides, N-terminal peptides of proteins and peptides linked by disulfide bridges (Martens L, Van Damme P, Damme JV, Staes E, Timmerman A, Ghesquiere B, Thomas GR, Vandekerckhove J, Gevaert K. The human platelet proteome mapped by peptide-centric proteomics: a functional protein profile. Proteomics. 2005, 5(12), 3193-3204).
Although the authors state that the oxidation conditions have been optimized to avoid the modification of labile residues such as cysteine and tryptophan, any such modification would affect the selectivity of the method, and the degree of simplification of the peptide mixture would not be similar to that of the ICAT method, as the authors claim.
On the other hand, although the method is said to be amenable to automation, the selective isolation of all the methionine-containing peptides requires a great number of chromatographic runs, which can in turn affect the overall yield of the method.
Selective Isolation of the N-terminal Peptides.
A variant of COFRADIC has also been proposed to selectively isolate the N-terminal peptide of every protein (Gevaert K, Goethals M, Martens L, Van Damme J, Staes A, Thomas G R, Vandekerckhove J. Exploring proteomes and analyzing protein processing by mass spectrometric identification of sorted N-terminal peptides. Nat Biotechnol. 2003; 21, 566-569). The first step of this method consists of blocking all the primary amino groups of the proteins present in the compared complex mixtures. Then a specific proteolysis of the modified proteins is performed, and the peptide mixture is separated by reverse-phase chromatography into a considerable number of fractions.
The new amino groups of the internal peptides generated during the proteolysis, which are present in each of these fractions, are then reacted with a highly hydrophobic blocking group, and the fractions are separated again in the same chromatographic system under conditions identical to those mentioned previously. The retention time of all internal peptides increases considerably after the addition of the second blocking reagent, whereas the N-terminal peptides of the proteins, which were blocked in the first step, are selectively isolated by being collected at the same retention time as their original fraction. This strategy has a disadvantage: for a reliable quantification, the first step, the blocking of the amino groups, should proceed quantitatively, which is difficult to achieve when proteins are present in complex mixtures.
Also, during the blocking of amino groups in this first step, the solubility of the proteins can decrease considerably and cause precipitation, which can in turn affect the quantitative character of the method. Lastly, isolating a single peptide per protein is an excessive simplification, and it may have a negative impact on the identification and quantification of the proteins present in the complex mixtures. A method that allows redundancy by isolating a reduced group of 3-4 peptides per protein would be ideal for proteomics studies without two-dimensional electrophoresis, since it permits the confirmation of the quantification results by other peptides of the same protein.
Selective Isolation of N-glycopeptides.
It has been reported that approximately 91 percent of the membrane proteins present in the Swiss-Prot database are glycoproteins (Gahmberg C G, Tolvanen M. Why mammalian cell surface proteins are glycoproteins. Trends Biochem. Sci. 1996, 21, 308-311). A strategy based on the selective isolation of glycopeptides by lectin affinity chromatography has been proposed for proteome studies (Geng M, Zhang X, Bina M, Regnier F. Proteomics of glycoproteins based on affinity selection of glycopeptides from tryptic digests. J Chromatogr B Biomed Sci Appl. 2001, 752, 293-306; Kaji H, Saito H, Yamauchi Y, Shinkawa T, Taoka M, Hirabayashi J, Kasai K, Takahashi N, Isobe T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003, 21, 667-672).
The use of lectins, either a particular lectin or a battery of them immobilized in a chromatographic column, has a limitation: lectins are not able to recognize all the glycoforms present in the sample and therefore cannot guarantee an efficient selective isolation. To overcome this limitation, Zhang and collaborators (Zhang H, Li XJ, Martin DB, Aebersold R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat Biotechnol. 2003, 21, 627-629) proposed the immobilization of the glycopeptides on a solid support by hydrazide chemistry, followed by their release by the action of PNGase-F. Carrying out this last step in the presence and absence of H218O allows the quantification.
This method can be applied to samples of biological interest such as membrane proteins, which include vaccine and receptor candidates. It can also be applied to serum, which is the most complex proteome. However, its applicability is restricted to samples that are enriched in glycoproteins.
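The N-glycopeptide approaches above depend on peptides that carry the Asn-X-Ser/Thr consensus sequence mentioned earlier. A minimal sketch of how candidate sites could be located in a sequence is given below; excluding proline at position X is the commonly used convention and is an assumption here, since the text only states Asn-X-Ser/Thr, and the function name and test sequence are hypothetical.

```python
import re

# Asn-X-Ser/Thr consensus. The lookahead lets overlapping sites be found.
# Assumption: X may be any residue except proline (common convention).
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence: str) -> list:
    """1-based positions of Asn residues in potential N-glycosylation sites."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def has_sequon(peptide: str) -> bool:
    """True if the peptide could be an N-glycopeptide candidate."""
    return bool(sequon_positions(peptide))


if __name__ == "__main__":
    # Hypothetical sequence: NGT is a candidate site, NPS is not.
    print(sequon_positions("ANGTNPSK"))
```

Restricting a database search to peptides for which `has_sequon` is true mirrors the false-positive filter described for the 18O-labeled deglycosylation approach.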
Selective Isolation of Histidine-containing Peptides.
Immobilized metal chelate affinity chromatography has been used in proteomics for the selective isolation of histidine-containing peptides. Several works evaluate different matrices and immobilized metal ions (Ren D, Penner N A, Slentz B E, Inerowicz H D, Rybalko M, Regnier F E. Contributions of commercial sorbents to the selectivity in immobilized metal affinity chromatography with Cu(II). J Chromatogr A. 2004 1031, 87-92), but the results show that the specificity is still inferior to that of the other previously described methods for the selective isolation of peptides (Ren D, Penner N A, Slentz B E, Mirzaei H, Regnier F. Evaluating immobilized metal affinity chromatography for the selection of histidine-containing peptides in comparative proteomics. J Proteome Res. 2003, 2, 321-329; Ren D, Penner N A, Slentz B E, Regnier F E. Histidine-rich peptide selection and quantification in targeted proteomics. J Proteome Res. 2004, 3, 37-45). In fact, variants involving the chemical modification of peptides before the affinity chromatography have been explored to increase the specificity.
Selective Isolation of Peptides by Cation Exchange Chromatography.
Cation exchange chromatography has been used for the selective isolation of peptides because it easily separates neutral peptides (zero charge) from positively charged peptides (charge: 1+, 2+, 3+, 4+, etc.). Initially it was used to isolate the C-terminal peptide (A new method for the isolation of the C-terminal peptide of proteins by the combination of selective blocking of proteolytic peptides and cation-exchange chromatography. Chapter #12 in: Proteome and Proteome Analysis. (1999) Springer Verlag publishers; and Isolation and characterization of modified species of a mutated (Cys125-Ala) recombinant human interleukin-2. Moya G, Gonzalez L J, Huerta V, Garcia Y. J. Chromatogr. 2002, 971, 129-142) and blocked N-terminal peptides (Selective isolation and identification of N-terminal blocked peptides from tryptic digests. Betancourt L, Besada V, Gonzalez L J, Morera V, Padron G, Takao T, Shimonishi Y. J. Pept. Sci. 2001, 7, 229-237). However, these methods cannot be applied to the massive study of complex mixtures of proteins and proteomes, because they isolate a single peptide per protein, whereas a more reliable quantification of the identified proteins requires the isolation of a larger number of peptides per protein (approximately 3-4 peptides/protein). Moreover, if the identification of the proteins is based on a single peptide per protein, there is a considerable risk of failing to identify a large number of proteins, since not all peptides fragment efficiently in the mass spectrometer and it is not always possible to obtain structural information. In these two methods of selective isolation, the peptides are obtained as neutral species (charge zero) and have no basic amino acid at their C-terminal end, which does not favor their fragmentation in the spectrometer or, consequently, their identification in the sequence database.
The selective isolation of peptides based on cation exchange chromatography was also applied in proteomics studies to isolate peptides that contain neither histidine nor arginine, called nHnR peptides by the authors (SCAPE: A new tool for the Selective Capture of Peptides in Protein identification. Betancourt L, Gil J, Besada V, Gonzalez L J, Fernández-de-Cossio J, García L, Pajón R, Sánchez A, Alvarez F, Padrón G. J. Proteome Res. 2005, 4, 491-496). This method achieved a considerable simplification of the analyzed peptide mixture, to an extent similar to that achieved by ICAT (Quantitative analysis of complex protein mixes using isotope-coded affinity tags. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. Nat. Biotechnol. 17, 994-999, 1999). The chromatographic system used in this method separates the charged peptides (mostly peptides that contain arginine and histidine) from the neutral ones, which are mainly the nHnR peptides completely blocked at their primary amino groups. Before the nHnR peptides are analyzed in the mass spectrometer, the attached blocking group is removed by a hydrolytic treatment to regenerate the free amino groups and thus favor the ionization and fragmentation of the peptides and, consequently, their identification in the databases.
This method isolates the nHnR peptides in the non-retained fraction of the cation exchange chromatography and to achieve the identification of a higher number of proteins it requires another chromatographic step for its additional fractionation. These additional chromatographic steps may cause losses during the manipulation and affect the yields of the method.
The discarded fraction in this method is enriched in peptides that possess arginine and histidine. The arginine presumably should be located in the C-terminal end of the peptides because they were generated by the cleavage of trypsin and the histidine residues should be located inside the sequence.
Tryptic peptides with these characteristics fragment very well in mass spectrometry because, during ionization, they possess a mobile proton within their sequence (located at the histidine), which can induce multiple fragmentations along the polypeptide chain, and at the same time a fixed proton at the C-terminal end of the peptide (located at the arginine) (Paizs B. Suhai S. Towards understanding the tandem mass spectra of protonated oligopeptides: mechanism of amide bond cleavage. J Am Soc Mass Spectrom. 2004 15, 103-13; Cordero M M, Houser J J, Wesdemiotis C. The neutral products formed during backbone fragmentations of protonated peptides in tandem mass spectrometry. Anal. Chem. 1993; 65, 1594-1601). Their ESI-MS/MS spectra, produced in collision-induced dissociation experiments, are rich in y″n series ions and contain enough structural information to achieve a highly reliable identification in the databases.
The analysis of peptides with these characteristics is attractive; however, the fraction that is discarded in the SCAPE method is still very complex. The tryptic digest of a complex mixture of proteins of a given proteome generates on average approximately 18-20 peptides/protein. If SCAPE simplifies this mixture by selectively isolating 4-5 peptides/protein, the discarded fraction mentioned above should contain on average approximately 14-15 peptides/protein. This value is excessively high and is not optimal for a proteomic study without two-dimensional electrophoresis, so a further simplification of this fraction is still necessary. Therefore, a new 2DE-free method based on the selective isolation of peptides enriched in the basic residues arginine and histidine is required.
In spite of the limitations described for these methods, it remains very necessary to identify the proteins present in complex mixtures and to determine their expression levels through the selective and specific isolation of a small group of peptides, thereby simplifying the complex mixture to be characterized by mass spectrometry.
The selective isolation of RH peptides is achieved by combining the blocking of the primary amino groups of proteolytic peptides with a cation exchange chromatography step at acidic pH. This combination simplifies the complex peptide mixture by eliminating, in an effective and simple way, the majority subset of proteolytic peptides that possess at most one arginine or histidine residue in their sequences (R+H≤1). Only a small subset is analyzed: the peptides that carry multiple positive charges and are selectively retained in the cation exchange column, that is, those that possess more than one basic residue, arginine or histidine, in their sequences (R+H>1).
This method can be used to identify the constituent proteins of complex mixtures and to determine their relative quantities under the compared conditions. For this purpose, the mixture of proteins, obtained either artificially or naturally, should be treated according to the steps that are described in the
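The selection rule can be illustrated in silico. The following is a minimal sketch, assuming complete blocking of all primary amino groups and an idealized tryptic digestion (cleavage after K or R, but not before P, with no missed cleavages); the function names and the protein sequence are hypothetical and only illustrate the R+H counting criterion.

```python
def tryptic_digest(protein: str) -> list:
    """Idealized trypsin digestion: cleave C-terminal to K or R unless the
    next residue is P (assumed rule; missed cleavages are ignored)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def rh_count(peptide: str) -> int:
    """Number of arginine plus histidine residues. With all primary amino
    groups blocked, this is taken as the positive charge at acidic pH."""
    return peptide.count("R") + peptide.count("H")


def select_rh_peptides(peptides: list) -> list:
    """Keep the multiply charged peptides (R+H > 1) that would be retained
    on the cation exchanger; peptides with R+H <= 1 are discarded."""
    return [p for p in peptides if rh_count(p) > 1]


if __name__ == "__main__":
    protein = "AHGLRMKGSDRHEHVRPTK"  # hypothetical sequence
    peptides = tryptic_digest(protein)
    for p in peptides:
        print(p, rh_count(p))
    print("RH peptides:", select_rh_peptides(peptides))
```

In this toy digest, the peptide without basic residues and the one with a single arginine are discarded, while the peptides with two or more R/H residues form the isolated subset.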
In general, all the reagents used in the peptide synthesis for the protection of amino groups or other reagents can also be used if they fulfills the previously explained properties. Examples of this type of reagents and the protocols to carry out the modification of amino groups are easily find in the scientific literature (Protective groups in organic synthesis, Teodora W. Greene and Peter G. M. Wuts, p. 494-654, E d. John Wiley & Sons, Inc. (1990) and Peptide Chemistry, Bodanszky, N., p. 74-103, Springer-Verlag, New York (1988) and their use is included in the present invention. This blocking reaction of the primary amino groups (alpha-amino terminal and epsilon-amino of lysines) is very important, so that the positive charge of the peptides at acidic pH only depends on the number of two other basic amino acids present in the peptide sequences: arginine (R) and histidine (H).
The chromatographic system developed here possesses a double function since it not only allows the simplification of the analyzed complex mixture when the RH peptides are selectively isolated but it also allows its further fractionation by applying a gradient increasing the ionic strength or the pH of the mobile phase before being analyzed by reverse phase chromatography coupled to the mass spectrometer. This additional fractionation by cation exchange chromatography is key for the identification of a great number of proteins (Washburn M. P. et al. Large-scale analysis of the yeast proteome by multidimensional protein identification technology, Nature Biotechnology 2001, 19, 242-247). The method of this invention uses desalting steps in the reverse phase columns before the cation exchange chromatography to eliminate the excess of reagents or to change the working solutions.
To apply this method in quantitative proteomics, the peptides generated in one of the two compared conditions must carry one or several heavy isotopes (13C, 15N, 18O, and/or 2H) in their structure, while the peptides generated in the other condition contain the same elements at their natural isotopic abundances (12C, 14N, 16O, and/or 1H). The blocking group of the amino groups of the RH peptides should be eliminated only when it does not carry the isotopic labeling that is essential to achieve the relative quantification. If the modifying reagent is photosensitive, the blocking group can be eliminated by light irradiation of the modified peptides.
The incorporation of the heavy isotopes in the structure of the generated peptides during the proteolysis of proteins in one of the two compared conditions can be carried out in three different ways:
The relative quantification, based on the analysis of the isotopic distribution of the mixture of labeled and non-labeled RH peptides in the mass spectra, is carried out using appropriate software (Fernández de Cossio et al. Isotopica, a Web Software for Isotopic Distribution Analysis of Biopolymers by Mass Spectrometry. Nucleic Acids Research 2004, 32, W674-W678 and Fernández de Cossio et al. Automated Interpretation of Mass Spectra of Complex Mixtures by Matching of Isotope Peak Distributions. Rapid Commun. Mass Spectrom. 2004; 18, 2465-2472). This software calculates the theoretical isotopic distributions of the labeled and non-labeled RH peptides and combines them so that the area of the resulting theoretical isotopic distribution fits, as closely as possible, the area of the isotopic distribution observed experimentally. The proportions between the areas corresponding to the peptides labeled with light and heavy isotopes (12C/13C, 14N/15N, 16O/18O, and/or 1H/2H), once normalized, correspond to the relative proportions of the proteins that contained them in the compared mixtures.
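The fitting idea described in the preceding paragraph can be illustrated with a minimal sketch. It is not the algorithm of the cited software: it assumes, for illustration only, that the heavy peptide has the same isotopic envelope as the light one displaced by a whole number of mass units, that peaks are sampled at 1 Da intervals, and that an ordinary least-squares fit is adequate. The function names and the envelope values in the usage note are hypothetical.

```python
def shift(dist, k, length):
    """Return `dist` moved k positions (k Da) to the right, padded to `length`."""
    out = [0.0] * length
    for i, v in enumerate(dist):
        if i + k < length:
            out[i + k] += v
    return out


def fit_light_heavy(observed, light_shape, mass_shift):
    """Least-squares amounts (a, b) such that observed ~ a*light + b*heavy.

    Assumption: the heavy envelope equals the light envelope shifted by
    `mass_shift` Da (isotopic purity of the label is ignored).
    """
    n = len(observed)
    light = shift(light_shape, 0, n)
    heavy = shift(light_shape, mass_shift, n)
    # Normal equations of the two-component linear model.
    s_ll = sum(x * x for x in light)
    s_hh = sum(y * y for y in heavy)
    s_lh = sum(x * y for x, y in zip(light, heavy))
    s_lo = sum(x * o for x, o in zip(light, observed))
    s_ho = sum(y * o for y, o in zip(heavy, observed))
    det = s_ll * s_hh - s_lh * s_lh
    a = (s_lo * s_hh - s_ho * s_lh) / det
    b = (s_ho * s_ll - s_lo * s_lh) / det
    return a, b
```

For instance, with a hypothetical envelope `[0.55, 0.30, 0.11, 0.04]` and a 3 Da shift (one deuterated acetyl group), a spectrum built as two parts light plus one part heavy is resolved back into those two amounts, and the ratio `a / b` plays the role of the normalized area ratio. The sketch does not constrain the amounts to be non-negative and does not model noise.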
To carry out the quantification it is necessary to know (a) the elemental composition of the analyzed peptide or its sequence, (b) the type of isotopic labeling used in the experiment and (c) the region of the mass spectrum that contains the experimental isotopic distribution of the RH peptide of interest. This information is very restrictive; it allows the experimental noise to be calculated very precisely and overlapping signals that are not of interest for the quantification to be discarded from the analysis. As a result, the quantification process with the software used is very robust and is independent of the method of isotopic labeling. The proposed method is compatible with the ionization modes most frequently used in the characterization of peptides and proteins: electrospray ionization (ESI-MS) and matrix-assisted laser desorption ionization (MALDI-MS). These ionization methods give the molecular mass, which is important information, but they do not provide structural information that allows a reliable identification of the selectively isolated RH peptides in the sequence databases. To achieve this, the peptides of interest are selected in the mass spectrometer and passed through a collision chamber where, by means of a process known as collision induced dissociation, fragments containing enough structural information are generated to allow the elucidation of the complete or partial sequence of the analyzed peptide.
The mass spectrum that contains this information is known as the MSMS spectrum. Each MSMS spectrum is highly characteristic of the sequence of the peptide that originated it; it can be considered a fingerprint of the fragment ions, and it is useful for the reliable identification of the peptides in the sequence databases with the aid of computer programs. In fact, this is the principle of some of the most popular search engines designed for the identification of proteins in the sequence databases: the MASCOT program (Matrix Science Ltd, UK, Perkins, D N, et al. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20, 3551-3567) and the SEQUEST program (Trademark, University of Washington, Seattle Wash., McCormack, A. L. et al. Direct Analysis and Identification of Proteins in Mixtures by LC/MS/MS and Database Searching at the Low-Femtomole Level. Anal. Chem. 1996, 69, 767-776; Eng, J. K. et al. An Approach to Correlate Tandem Mass Spectral Data of Peptides with Amino Acid Sequences in a Protein Database. J. Amer. Soc. Mass. Spectrom. 1994, 5, 976-989; U.S. Pat. No. 5,538,897 (Jul. 23, 1996) Yates, III et al.).
These programs (MASCOT and SEQUEST) compare the MSMS spectra obtained experimentally with the theoretical MSMS spectra of all the peptides in the protein sequence databases that possess a certain molecular mass and that originate from the specific cleavage of the protease used. The theoretical MSMS spectrum whose fragment mass values coincide best with those obtained experimentally should correspond to the analyzed peptide; from it the protein that contains the peptide is inferred, and the identification in the database is carried out.
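The comparison of theoretical and experimental fragment masses can be sketched as follows. This is a deliberately simple shared-peak count and is not the probability-based scoring of MASCOT or the cross-correlation scoring of SEQUEST. It assumes unmodified peptides (no blocking groups), singly charged b- and y-ions only, standard monoisotopic residue masses, and candidate peptides that have already been filtered by precursor mass. The function names and the tolerance value are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mz(peptide):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [MONO[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion with i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # complementary y ion
    return ions


def shared_peaks(peptide, observed, tol=0.2):
    """Count theoretical fragments lying within `tol` of an observed peak."""
    return sum(
        any(abs(t - o) <= tol for o in observed) for t in fragment_mz(peptide)
    )


def best_match(candidates, observed, tol=0.2):
    """Candidate with the largest number of matching fragments."""
    return max(candidates, key=lambda p: shared_peaks(p, observed, tol))
```

Candidates with the same amino acid composition share a precursor mass, so only the fragment pattern separates them; the sketch picks the sequence whose theoretical fragments coincide most often with the observed peaks.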
The following references relate to the application of some mass spectrometry techniques to the identification of proteins, particularly in proteome analysis: Ideker T, Thorsson V, Ranish J A, Christmas R, Buhler J, Eng J K, Bumgarner R, Goodlett D R, Aebersold R, Hood L. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science. 2001, 292, 929-934; Gygi S P, Aebersold R. Mass spectrometry and proteomics. Curr Opin Chem Biol. 2000, 4, 489-494; Gygi S P, Rist B, Aebersold R. Measuring gene expression by quantitative proteome analysis. Curr Opin Biotechnol. 2000, 1, 396-401; Goodlett D R, Bruce J, Anderson G A, Rist B, Pasa-Tolic L, Fiehn O R, Smith R D, Aebersold R. Protein identification with a single accurate mass of a cysteine-containing peptide and constrained database searching. Anal Chem. 2000, 15; 72, 112-118; and Goodlett D R, Aebersold R, Watts J D. Quantitative in vitro kinase reaction as a guide for phosphoprotein analysis by mass spectrometry. Rapid Commun Mass Spectrom. 2000; 14, 344-348.
Taking into account the high selectivity of the proposed method, the search can be restricted to databases that contain only RH peptides in order to guarantee a faster identification, to avoid false positive identifications and to obtain a more reliable identification with the MASCOT and SEQUEST programs.
In silico demonstration of the simplification of complex mixtures of peptides derived from the proteomes of different organisms using the proposed chromatographic system, which allows the selective isolation of multiply-charged peptides (2+, 3+, 4+, etc.) or RH peptides.
The great efficiency reached in DNA sequencing has made the complete genome sequences of several organisms known, and it is possible to predict which proteins are derived from these genomes and which peptides are generated depending on the specific proteolytic treatment applied to the studied proteome. To obtain an exact idea of the magnitude of the simplification achieved when the RH peptides are selectively isolated, a program named SELESTACT was devised. It was written in C for use on a PC and calculates, from the proteome of a given organism:
This program also calculates all these parameters for the proteolytic peptides of a given proteome that possess cysteine residues and can therefore be selectively isolated by the ICAT method, one of the pioneering and most widely used methods for the selective isolation of peptides and its application to proteome studies.
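The kind of calculation described above can be illustrated with a short sketch. It is not the SELESTACT program, whose source is not reproduced here; it assumes complete tryptic cleavage after K or R except when the next residue is P, treats a peptide as an RH peptide when its number of arginine plus histidine residues is two or more (the charge it would carry at acidic pH once its primary amino groups are blocked), and treats any cysteine-containing peptide as isolable by ICAT. All function names are hypothetical.

```python
def tryptic_digest(sequence):
    """Cleave after K or R unless the following residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def blocked_charge(peptide):
    """Positive charge at acidic pH after blocking amino groups: number of R + H."""
    return peptide.count("R") + peptide.count("H")


def summarize(proteins):
    """Per-proteome counts for a dict mapping protein name to sequence."""
    n_tryptic = n_rh = n_cys = covered_rh = covered_cys = 0
    for seq in proteins.values():
        peptides = tryptic_digest(seq)
        rh = [p for p in peptides if blocked_charge(p) >= 2]
        cys = [p for p in peptides if "C" in p]
        n_tryptic += len(peptides)
        n_rh += len(rh)
        n_cys += len(cys)
        covered_rh += bool(rh)    # protein has at least one RH peptide
        covered_cys += bool(cys)  # protein has at least one Cys peptide
    total = len(proteins)
    return {
        "tryptic_per_protein": n_tryptic / total,
        "rh_per_protein": n_rh / total,
        "cys_per_protein": n_cys / total,
        "coverage_rh_percent": 100.0 * covered_rh / total,
        "coverage_cys_percent": 100.0 * covered_cys / total,
    }
```

Applied to a real proteome, the input dictionary would be filled from a sequence database file; parsing is left out because the file format used by the original program is not stated.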
a) Total number of proteins reported in the Swissprot sequence database.
b) Total number of tryptic peptides per protein generated by the specific proteolysis with trypsin of the different analyzed proteomes. It is expressed as an integer.
c) Total number of multiply-charged peptides per analyzed protein obtained when using the chromatographic system described in the heading of this table, and the number obtained by the ICAT method. Both values are expressed as integers.
d) The coverage of the analyzed proteome represents the percentage of the total number of proteins that possess RH peptides and can be isolated when using the chromatographic method described in the heading of the present table. The values correspond to the proposed method and to ICAT.
Table 1 summarizes the results obtained by in silico analysis after applying the method to several proteomes, including bacteria, pathogenic bacteria, yeasts, plants, and mammals. As can be seen, from an average of 18 tryptic peptides per protein the mixture would be simplified considerably, reaching a value that is optimal for these purposes: 4 RH peptides per protein are selectively isolated, the same extent as with the ICAT method. The average proteome coverage, which represents the percentage of the proteome of an organism that can be studied with the proposed method (87.9%), is also very similar to the one achieved with ICAT (87%). However, when individual proteomes are analyzed, a very marked difference is noticed in the case of the M. tuberculosis proteome. With the method object of the present invention 94.6% of the complete proteome can be analyzed, whereas with the ICAT method less than 80% can be analyzed. For proteomic studies of this particular microorganism, the selective isolation method object of this invention should be the method of choice. The SELESTACT software is of great utility for predicting the results expected from the method object of the present invention for a particular organism, and it therefore allows comparison with other methods of selective isolation of peptides for proteomic studies.
This demonstrates that the principle of this invention, the separation of peptides with charges 0 and 1+ from the multiply-charged peptides (charges 2+, 3+, 4+, etc.) by using a chromatographic system, is of great utility and can be successfully used for the proteomic study of organisms of different evolutionary levels, since it allows an ideal simplification of the complex mixture of peptides and at the same time guarantees a high coverage of the proteome under study.
It only remains to demonstrate, in example 2, a chromatographic system that is able to perform the proposed separation of the neutral and singly-charged peptides from the multiply-charged peptides in a simple and effective way.
Selective Isolation of the RH peptides of the recombinant streptokinase (rSK). Evaluation of the selectivity and the specificity of the method.
The method proposed in the
The steps to follow are shown below:
The ESI-MS spectrum (
The signals whose assignments are written in boldface correspond to 13 multiply-charged peptides or RH peptides (peptides T3, T7-8, T10, T12, T14, T16, T18, T20-22, T24-25,
The primary amino groups (the amino terminal group and the epsilon amino groups of lysines) of the tryptic peptides were blocked, and all of the peptides were observed to increase their molecular masses according to the number of acetyl groups added to their structure (see number of added rhombuses, in
On the other hand, none of the multiply-charged or RH peptides shown in the spectrum of the
a) Code corresponding to the tryptic peptides of the rSK observed in the
b) The amino acids highlighted in boldface and italics are the arginine and histidine residues in the sequence of the tryptic peptides of rSK. The numbering of the N- and C-terminal amino acids corresponds to the position of each of them in the sequence of the rSK.
c) Z is the number of positive charges carried by the arginine and histidine residues of the peptides dissolved at acidic pH.
d) experimental mass of the tryptic peptides of the rSK.
e) theoretical mass of the tryptic peptides of the rSK.
These results demonstrate that the designed chromatographic system is very specific and selective for isolating the multiply-charged or RH peptides and that it can be applied successfully in proteomic studies.
Identification and relative quantification of the component proteins of two artificial mixtures (A and B) by means of the selective isolation of RH peptides using three different isotopic labels (2H, 13C and 18O).
Two artificial mixtures A and B composed of the proteins rSK, myoglobin and cytochrome C were prepared in a molar ratio of A with respect to B of 1:1 for the rSK, 2:1 for the cytochrome C and 1:3 for the myoglobin. The sequences of these proteins are shown at the end of the document, sequences 1-3. The selective isolation of the RH peptides was carried out by the method described in example 1. Three experiments were performed using different isotopic labeling methods to demonstrate the compatibility of the proposed method with different kinds of labeling (2H, 13C and 18O), and to demonstrate the possibility of using the reversible blocking of the amino groups when it is desirable to analyze the peptides in the mass spectrometer with their amino groups free.
The blocking of the primary amino groups of the peptides obtained in the tryptic digestion of the mixture “A” was carried out using normal acetic anhydride ((CH3CO)2O), and the peptides of the mixture “B” were derivatized with deuterated acetic anhydride ((CD3CO)2O) provided by ISOTEC (99% isotopic purity). In this experiment the molecular masses of the peptides obtained under the two conditions differed by multiples of 3 mass units, depending on the number of acetyl groups added during the blocking reaction. The overlapping of the isotopic distributions of the labeled and non-labeled peptides can be partial when a single acetyl group is added, and it decreases as the number of blocking groups added to peptides that contain lysine residues within their sequences increases.
The blocking of the primary amino groups of the tryptic peptides of the mixture “A” was carried out using normal acetic anhydride ((12CH3CO)2O), and those obtained from the mixture “B” were derivatized with 13C-labeled acetic anhydride ((13CH3CO)2O, commercialized by ISOTEC, 99% isotopic purity). In this experiment the overlapping of the isotopic distributions of the labeled and non-labeled peptides is high because their mass difference is just 1 Da for each blocking group added to a given peptide.
In this experiment some variants were introduced into the method described in example 1 to demonstrate the combination of 18O-labeling with the reversible blocking of the primary amino groups of peptides. The mixture of proteins A was digested in the buffer described in example 1, and the mixture B was digested in the same buffer prepared with 18O-labeled water (H218O) provided by ISOTEC (99% isotopic purity). By this procedure the peptides obtained in the latter condition are labeled with one or two 18O atoms at their C-terminal end, whereas the other peptides keep their natural isotopic distribution. The amino groups of the tryptic peptides of both conditions were derivatized with a reversible reagent (2-(methylsulfonyl)ethyl succinimidyl carbonate, marketed by Aldrich), and before the mass spectrometric analysis the blocking groups were eliminated in the same step as the removal of O-acylations at the tyrosine residues, using the conditions described previously in example 1.
In this case the analysis of the isotopic distributions is more complex than in the previous cases because the peptides labeled with 18O can incorporate 1 or 2 atoms of 18O at their C-terminal end. Therefore, to calculate the relative quantities of the peptides obtained under both conditions, it is important to take into account that the ratio is given by the area of the isotopic distribution of the peptides containing 16O divided by the sum of the areas of the isotopic distributions of the peptides that incorporated one (18O1) and two (18O2) atoms of 18O, according to the expression:
(Area 16O)/(Area 18O1 + Area 18O2).
In all the above mentioned experiments the relative quantification of the peptides in the analyzed mixtures is carried out using the ISOTOPICA software, as explained in the detailed description of the method object of this invention (Fernández de Cossio et al. Isotopica, a Web Software for Isotopic Distribution Analysis of Biopolymers by Mass Spectrometry. Nucleic Acids Research 2004, 32, W674-W678 and Fernández de Cossio et al. Automated Interpretation of Mass Spectra of Complex Mixtures by Matching of Isotope Peak Distributions. Rapid Commun. Mass Spectrom. 2004; 18: 2465-2472).
The RH peptides of the three proteins present in the prepared mixture were isolated and sequenced in a single LC-MS/MS experiment, and they were identified automatically and unambiguously by the MASCOT program (it
a) Sequence of the RH peptides of the automatically identified proteins by the MASCOT software. The asterisks indicate acetylation in the epsilon amino group of lysine.
b) theoretical proportion of the proteins in the compared artificial mixtures A and B.
c) experimental value obtained for the relative quantities of the proteins present in the two compared artificial mixtures. For each protein, the average value of its relative quantification is highlighted in boldface at the end of each experiment, and the standard deviation is indicated in parentheses.
For the three proteins, the averages of the experimental values for the relative quantifications are in very good agreement with the theoretical values, and the standard deviations obtained were very good (below 5%).
These results demonstrate that the method can be used in quantitative proteomics to determine the relative quantities of the proteins in mixtures with very good accuracy.
The adjustments of the areas corresponding to the experimental isotopic distributions obtained for one peptide of each one of the analyzed proteins in the three experiments mentioned in the present example are shown in the
Note that in all cases a very good fit was obtained between the theoretical contour of the isotopic distributions (red line) and the spectrum obtained experimentally (spectrum shaded in black), independently of the type of labeling used (13C, 2H or 18O).
The results obtained for the relative quantification of the same protein in the three experiments were very similar independently of the type of labeling used. This example demonstrates the versatility and robustness of the method due to its compatibility with different types of labeling to obtain a great accuracy in the relative quantification.
LC-MS/MS and Database Search.
The measurements were carried out in a hybrid mass spectrometer (quadrupole and time of flight, QTof-2) from the Micromass company (Manchester, United Kingdom). The mass spectrometer was connected online to an HPLC (AKTA Basic, Amersham Pharmacia Biotech, Sweden) through a column of 200×1 mm (Vydac, USA). The peptides were eluted with a linear gradient from 5 to 45% of buffer B (0.2% of formic acid in acetonitrile) in 120 minutes.
The mass spectrometer was operated with cone and capillary voltages of 35 and 3000 volts, respectively. For the acquisition of the MSMS spectra, the singly-, doubly- and triply-charged precursor ions were selected automatically once they surpassed an intensity of 7 counts/sec. The measurement mode was changed from MSMS to MS when the total ion current (TIC) diminished to 2 counts/sec or when the MS/MS spectra had been acquired for 4 seconds. The acquisition and the data processing were carried out with the MassLynx software (version 3.5, Micromass), while the identification of the proteins based on the MSMS spectra was performed with the version of the MASCOT software freely available on the Internet. The search parameters included the cysteine modification as well as possible oxidations and deamidations.
Application of the Proposed Method to the Quantitative Proteomics of Complex Mixtures Derived from the Proteome of an Organism (Vibrio cholerae) Grown under Aerobic and Anaerobic Conditions.
A colony of the V. cholerae O1 biotype El Tor strain C7258 (Ogawa; Peru, 1991), grown for 16 h at 37° C. in LB medium (Sambrook, J., Fritsch, E. F., Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York 1989, A.1), was inoculated in 5 mL of syncase broth (pH 7.5) (Finkelstein, R.A., Attasampunna, A., Chulasamaya, M., Charunmethee, P., J. Immunol. 1966, 96, 440-449) supplemented with 0.4% w/v of glucose and 1% w/v of casein hydrolyzate, and grown overnight at 200 rpm and 37° C.
The culture was diluted 1:100 in fresh supplemented syncase medium and precultured to an OD˜0.2 before inoculation into the final aerobic and anaerobic cultures used in the proteomic studies. The anaerobic atmosphere was generated using AnaeroGen™ sachets from the Oxoid company (Basingstoke, Hampshire, UK), which reduce the oxygen level to below 1% and generate a CO2 concentration between 9 and 13% within 30 min. Before the definitive inoculation of the anaerobic culture, a 1 L Erlenmeyer flask containing 200 mL of syncase medium was preconditioned for 30 minutes in anaerobiosis; it was then inoculated with 2 mL of the precultured strain C7258 and stirred at 220 rpm and 37° C. until reaching an optical density of approximately 0.5. The cells were collected by centrifugation at 10 000 g for 5 minutes, washed twice with the electroporation buffer (270 mM sucrose, 1.3 mM Na2HPO4, 1 mM MgCl2, pH 7.4), resuspended in the same buffer, and centrifuged for three minutes at 10 000 g. Aliquots of 10^10 cells were stored at −70° C. Two aliquots (1×10^7 cells), from the aerobic and anaerobic cultures, were dissolved separately in 1 mL of the lysis buffer (500 mM HEPES (pH 8.0), containing 10 mM EDTA and 2 mol/L guanidinium chloride). The complete disruption of the cells was carried out by alternating ultrasound cycles (1 minute) with incubation in an ice bath. This procedure was repeated three times, and the sample was then pipetted repeatedly until the viscosity of the solution diminished. Finally, streptomycin sulphate (Merck, Germany) was added to a concentration of 1%; the sample was incubated again for 15 minutes at 0° C. and centrifuged at 500 g, and the precipitate was discarded.
Later on, the complex mixture was analyzed following the steps described in the
Of the 91 proteins identified using the method of the present invention, 18 (equivalent to 19.8%) do not contain cysteine in their amino acid sequences (proteins #2, 12, 13, 20, 22, 28, 29, 40, 41, 42, 44, 47, 48, 52, 60, 62, 68, and 88, see Table 4) and therefore could not be identified using ICAT, one of the most widely used methods in quantitative proteomics (Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. Nat. Biotechnol. 17, 994-999, 1999; Simplification of complex peptide mixtures for proteomic analysis: reversible biotinylation of cysteinyl peptides. Spahr C S, Susin S A, Bures E J, Robinson J H, Davis M T, McGinley M D, Kroemer G, Patterson S D. Electrophoresis, 21(9), 1635-1650, 2000). Among the ten proteins identified by the present method that changed their expression levels (proteins #8, 9, 33, 43, 46, 55, 58, 70, 88 and 89, Table 4), one (10%) could not be identified with ICAT because it has no cysteine residues in its sequence (protein 88, Table 4); this valuable information would be lost if the ICAT method were used.
These results demonstrate the superiority of the present method for studying complex mixtures of proteins, since it allows the identification of proteins that cannot be identified by an established method such as ICAT.
a) K* represents lysine acetylated at the epsilon amino group. All the identified peptides possess acetyl groups at the amino terminal end. N represents a completely deamidated asparagine residue. The basic amino acids arginine and histidine are highlighted in boldface.
b) The peptides obtained in the aerobic and anaerobic conditions were derivatized with normal ((CH3CO)2O) and deuterated ((CD3CO)2O) acetic anhydrides, respectively. The proportion between both species is expressed as percentages.
c) The sequences (8, 9, 33, 46, 89) and (43, 55, 58, 70, 88) correspond to the proteins overexpressed in anaerobiosis and aerobiosis conditions, respectively.
In Silico Determination of the Content of Blocking Groups in the RH Peptides of Different Genomes.
The presence of several blocking groups in the structure of the RH peptides when the irreversible blocking of the amino groups is used and in particular when deuterated acetic anhydride is used could be a concern mainly for three reasons:
All the RH peptides should possess at least one blocking group, at the N-terminal end, and the presence of additional blocking groups is due to the presence of lysine residues (one blocking group is added for each lysine residue). The presence of lysine in the RH peptides can arise for three reasons:
The first of these factors (a) can be minimized by optimizing the proteolytic digestion, using a higher enzyme to substrate ratio or prolonging the digestion time.
The last two factors (b and c) are intrinsic properties of each proteome; they depend on each of the analyzed proteins, on the relative distribution of proline residues contiguous to lysine residues (K-P), and on the presence of several basic residues in the tryptic peptides.
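The counting rule stated above (one blocking group at the N-terminal end plus one for each lysine residue) lends itself to a short sketch. The function names are hypothetical, and the input is assumed to be a list of RH peptide sequences in one-letter code that has already been produced by some in silico digestion.

```python
from collections import Counter


def blocking_groups(peptide):
    """One group at the N-terminus plus one for each lysine residue."""
    return 1 + peptide.count("K")


def kp_sites(peptide):
    """Number of lysine residues immediately followed by proline (K-P)."""
    return peptide.count("KP")


def blocking_group_distribution(rh_peptides):
    """Map number of blocking groups to the number of peptides carrying them."""
    return Counter(blocking_groups(p) for p in rh_peptides)
```

Dividing each count in the returned distribution by the total number of RH peptides gives the fraction of peptides that carry one, two or more blocking groups for a given proteome.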
To know which is the abundance of K-P residues and the quantity of blocking groups, the sequences of all the RH peptides of several genomes were analyzed (
These results are similar to those obtained in the predictions carried out by an in silico analysis for the genome of this organism and similar to the other genomes shown in the
Zhang and coworkers (Zhang R, Sioma C S, Wang S, Regnier F E. Fractionation of isotopically labeled peptides in quantitative proteomics. Anal Chem. 2001, 73, 5142-5149; Zhang R, Sioma C S, Thompson R A, Xiong L, Regnier F E. Controlling deuterium isotope effects in comparative proteomics. Anal Chem. 2002, 74, 3662-3669) demonstrated that the presence of a single deuterated acetyl group does not affect the relative quantification of the labeled and non-labeled species. The use of deuterium labeling in the quantification would therefore not be a problem for our method, since most of the peptides add one acetyl group (3 deuterium atoms); in the case of the ICAT method, however, all peptides add 8 deuterium atoms and the errors in the quantification could be appreciable (Zhang R, Sioma C S, Wang S, Regnier F E. Fractionation of isotopically labeled peptides in quantitative proteomics. Anal Chem. 2001, 73, 5142-5149). The addition of two acetyl groups (6 deuterium atoms) present in the remaining 20% of the RH peptides of several genomes (
Lastly, the MASCOT software was able to identify automatically all the RH peptides of the rSK protein that possessed one and two blocking groups (FIGS. 8 and 9). This result demonstrates that the presence of multiple blocking groups in the sequence of the RH peptides does not affect the efficiency of their fragmentation in the collision induced dissociation experiments, because the RH peptides are multiply-charged species (R+H>1) with multiple sites of protonation. Consequently, the automatic and efficient identification of the RH peptides in the protein sequence databases is guaranteed by using one of the most popular search engines, the MASCOT program.
Simultaneous Relative Quantification of Proteins Present in Several Samples in a Single Experiment.
The RH peptides isolated by the proposed method can be blocked at their N-terminal end with light and heavy isotopes to carry out the relative quantification. Taking into account the results shown in example 3 of the present invention, where it is demonstrated that the ISOTOPICA software can be used independently of the type of labeling introduced and of the degree of overlapping of the isotopic distributions, this example demonstrates that the relative quantification of proteins can be carried out simultaneously for three or more biological conditions in a single experiment, provided that a different isotopic label is used for the RH peptides in each of the compared conditions.
Three artificial mixtures (A, B and C) composed of the proteins rSK, cytochrome C and myoglobin were prepared. The proteins rSK, cytochrome C and myoglobin in the mixtures A, B and C kept the proportions 1:1:1, 1:2:3 and 1:4:5, respectively. The RH peptides of the proteins present in each of the samples were then isolated. The tryptic peptides of the proteins present in the mixtures A, B and C were derivatized with normal acetic anhydride [(CH3CO)2O], acetic anhydride labeled with one atom of carbon-13 [(13CH3CO)2O] and acetic anhydride labeled with two atoms of carbon-13 [(13CH313CO)2O], respectively. Before the desalting step, the three mixtures of blocked peptides were combined and the procedure continued in the same way up to the identification process. The blocked tryptic peptides of the proteins present in the mixture “C” are 1 Da heavier than the blocked peptides of the mixture “B”, and these in turn are 1 Da heavier than the peptides obtained from the mixture “A”. When A, B and C are mixed, the isotopic distributions of the peptides from the same proteins overlap, and the 13C content reflects the relative quantities of the peptides present in each of the compared conditions.
Table 5 summarizes the results obtained when applying the method for the selective isolation of RH peptides to the artificial mixtures of the three proteins. As can be seen, there is very good agreement between the experimental isotopic distributions and those provided by the software when the theoretical distributions of the peptides are mixed in the appropriate proportions. In particular, these values agree very well with the values expected for each of the proteins, taking into consideration the proportions in which they were prepared in the three artificial mixtures.
a) In boldface the amino acids arginine and histidine are highlighted.
In the
In principle, the comparison can be extended to a greater number of samples; it is only necessary to introduce in each of the conditions an isotopic label such that the molecular masses differ by at least 1 Da. For example, to compare four conditions it would be necessary to use the same three anhydrides shown in this example and to modify the RH peptides derived from the fourth condition with deuterated acetic anhydride [(C2H3CO)2O]. It is also possible to compare 5 and 6 conditions if acetic anhydrides doubly labeled with deuterium and 13C ([(13C2H3CO)2O] and [(13C2H313CO)2O]) are used, respectively. This application would be of great utility in proteomic experiments that require the comparison of multiple conditions, either as internal controls of the system or to study the synergistic effect of diverse drugs in cell lines, tumors, microorganisms, etc.
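The spacing requirement can be checked with a small sketch that tabulates the nominal mass added per acetyl group by each labeled anhydride. The label names are hypothetical shorthand, and nominal (integer) mass differences are assumed: +1 Da for each 13C replacing 12C and +1 Da for each 2H replacing 1H in the acetyl group.

```python
# (number of 13C atoms, number of 2H atoms) in one acetyl group.
LABELS = {
    "CH3CO": (0, 0),
    "13CH3CO": (1, 0),
    "13CH3-13CO": (2, 0),
    "CD3CO": (0, 3),
    "13CD3CO": (1, 3),
    "13CD3-13CO": (2, 3),
}


def shift_per_group(label):
    """Nominal mass added per acetyl group relative to the unlabeled one (Da)."""
    n_13c, n_2h = LABELS[label]
    return n_13c + n_2h


def peptide_shift(label, n_groups):
    """Nominal mass shift of a peptide carrying `n_groups` blocking groups."""
    return shift_per_group(label) * n_groups


def min_spacing(labels, n_groups=1):
    """Smallest nominal mass difference between any two labeled conditions."""
    shifts = sorted(peptide_shift(lab, n_groups) for lab in labels)
    return min(b - a for a, b in zip(shifts, shifts[1:]))
```

With one blocking group the six labels give nominal shifts of 0 to 5 Da, so adjacent conditions are separated by 1 Da; with more blocking groups per peptide the separation grows proportionally.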
If it is desired to compare the expression of proteins under three different conditions using two-dimensional electrophoresis, it would be necessary to obtain three analytical gels for each of the conditions in order to evaluate the reproducibility of the experiment, because of the delicate manipulation that this analytical technique requires, and afterwards to obtain two preparative gels to proceed to the identification by mass spectrometry. The great consumption of reagents, samples and time that a study of such magnitude would require is evident. If LC-MSMS techniques are used, it would be necessary to carry out at least two experiments, comparing one of the samples with each of the other two. However, the possibility of introducing a different label in the RH peptides isolated in each of the conditions makes it possible to obtain the relative quantification of each of the proteins present in the mixtures in a single experiment. This minimizes the inter-experiment errors in the relative quantification and considerably simplifies the work to be carried out. This advantage becomes more appreciable as the number of samples to be compared increases.
Number | Date | Country | Kind |
---|---|---|---|
CU-2005-0197 | Oct 2005 | CU | national |