The present invention relates to agents comprising non-natural protein sequences, the sequence comprising a scaffold portion and at least two protein or polypeptide epitopes, the proteins or polypeptides being relationally linked by a functional relationship such as, and without limitation, being components in a series of components of the same metabolic or signal transduction pathway or a pathway associated with interaction between two systems such as host-parasite or host-pathogen. The proteins or polypeptides may also be, for example and without limitation, related by their interaction with a given protein, or they may occupy a common site in a cell, or they may be isoforms of the same protein or allelic variations thereof. The invention further includes a method of simultaneously calibrating and investigating the quantitative relationships between the at least two proteins or polypeptides.
Biological science commonly involves the study of components which are in some way linked, i.e. they possess an important biological relationship with one another. The link or relationship could be that the proteins are:
1. Isoforms of one another, that is to say a protein that has the same function as another protein but is encoded by a different gene and may have small differences in its sequence. For example, transforming growth factor beta (TGF-B) naturally exists in three versions, or isoforms (TGF-B1, TGF-B2, and TGF-B3), each of which can set off a signalling cascade that starts in the cytoplasm and terminates in the nucleus of the cell. Alternatively, isoforms can be created by alternative splicing of RNA transcribed from a gene.
2. Allelic variants of one another. Polymorphisms of a gene exist in the human population, and other species, which in some cases lead to amino acid sequence differences in the protein between individuals within the species. These polymorphisms might be without functional consequence or they might deliver benefit to the organism in particular circumstances, or they might be associated with medical disease.
3. Components of a particular process. In biology, most phenomena are achieved by the coordinated action of multiple components. These are frequently called pathways, where a series of linked reactions occur within a living cell to produce a specific product or products. There are numerous metabolic pathways (for example glycolysis, the tricarboxylic acid or Krebs cycle, and the urea cycle; see Nicholson for detailed metabolic pathway maps, http://www.tcd.ie/Biochemistry/IUBMB-Nicholson/), pathways in signal transduction (e.g. hormones, cytokines, growth factors, stress; several examples can be found at http://www.grt.kyushu-u.ac.ip/spad/) and pathways associated with the interaction between two systems (e.g. host-parasite interactions, host-pathogen interactions).
4. Proteins sharing a common location in the cell. The relationship could be proteins which interact with a given protein, or proteins which occupy a common site in the cell (e.g. a lipid raft).
It is known from the prior art to use biochemical assays or antibodies specific for each individual protein as herein before described to elucidate the underlying mechanisms of pathways or to detect their presence.
It is also known in the prior art to use single or multiple or His-tagged proteins to detect or aid in the isolation and purification of any of the proteins as herein before described so as to permit their quantification. However, these tagged proteins do not possess the appropriate properties which would allow them to be used for the study of components which are in some way linked and possess an important biological relationship with one another, nor can they be used to study the “natural” molecule, by virtue of the addition of extra protein sequence (which could influence structure or function) in the form of the epitope tag. Moreover, they do not provide any means with which to simultaneously calibrate quantitatively the amount of test product within a sample.
PCT/GB2005/00015 describes calibration complexes comprising a scaffold protein and a method of using the complexes for quantifying the amount of a target protein in a sample. A target moiety, such as a protein, is covalently attached to one or more cysteine or lysine groups on the scaffold protein and the scaffold material has a controlled property such as relative molecular mass or weight or a pH value for the isoelectric point. In this way the target moiety and scaffold protein can be used not only to detect the presence of a target protein in a sample but also as a positive control or an internal standard, or it may be used to generate a calibration curve.
In the present invention we provide agents that have advantageously combined the property of a calibration standard with the ability to quantify multiple related or linked components in a biological system. In addition we describe a method using the agents of the present invention to detect not only the presence of individual relationally linked proteins or polypeptides but also the quantity of each protein in a relationally linked series in a sample. In this way the method and agents of the invention may be used to provide stoichiometric information about the linked series of components in a sample.
According to a first aspect of the invention there is provided a non-natural protein sequence, the sequence comprising a removable scaffold portion and at least two protein or polypeptide epitopes, the protein or polypeptide epitopes being relationally linked.
Throughout the description and claims of this specification, the words “comprise” and “contain” and variations of the words, for example “comprising” and “comprises”, mean “including but not limited to”, and are not intended to (and do not) exclude other moieties, additives, components, integers or steps.
Throughout the description and claims of this specification, the singular encompasses the plural unless the context otherwise requires. In particular, where the indefinite article is used, the specification is to be understood as contemplating plurality as well as singularity, unless the context requires otherwise.
Features, integers, characteristics, compounds, chemical moieties or groups described in conjunction with a particular aspect, embodiment or example of the invention are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith.
Reference herein to “the proteins or polypeptides being relationally linked” is intended to include a shared or common functional relationship such as being components of the same metabolic or catabolic or signal transduction pathway, or isoforms of the same protein or polypeptide, or allelic variations thereof, or proteins or polypeptides that share a common location within a cell or that interact with the same target or protein or homologue thereof.
Reference herein to an “epitope” relates to a specific chemical domain on an antigen that stimulates the production of, and is recognized by, an antibody. Each epitope on a molecule such as a protein elicits the synthesis of a different antibody. An epitope is also known as an antigenic determinant.
Reference herein to “removable scaffold” is intended to include a non-natural protein sequence which may, as an end product contain a scaffold portion or which may prior to its use have the scaffold portion removed by, for example and without limitation enzymatic means. It will be appreciated that the non-natural protein sequence of the present invention when being constructed will comprise a backbone or scaffold on which the multiple epitopes can be arranged and that the epitopes may be manufactured so as to be adjacent one another without an intervening scaffold portion or that the scaffold portion may be cleaved out of the protein sequence so that the end product may exist as a string of multiple relationally linked epitopes.
Preferably, the non-natural protein sequence of the invention comprises a plurality of epitopes. The number of epitopes may be more than 5 or 10 or 20 or 30 or 40 or 50 or more, there being no upper limit for their number; rather, the number is dictated by the practical capabilities of their manufacture and a user's requirements.
Preferably, the epitopes may also function as a calibration entity in so far as an epitope may have a selected and known property such as and without limitation a known molecular weight.
It will be appreciated that the non-natural protein sequences of the present invention provide a product capable of serving simultaneously as a calibrant for two or more different biologically related components as well as providing a means of identification of the same.
Preferably, the epitopes are linked in series in a continuous length of sequence. The epitopes could be linked directly to one another or they could be attached to a backbone sequence comprising non-reactive or inert molecules.
In one embodiment of the invention, where the sequence allows, the epitopes could overlap. Again the epitopes could directly overlap one another or they may be interspersed with regions of an inert backbone sequence.
In an alternative embodiment of the invention, the epitopes could be discontinuous in the primary sequence of the product, that is to say some may be present at the N-terminus, others at the C-terminus and some may be present within the sequence of the scaffold portion.
It will be appreciated that in the instances that the epitopes are linked in series they are essentially non-branched non-natural protein sequences and as such there is no requirement for chemically reactive groups in the scaffold portion.
However in other embodiments, where the epitopes are not linked in series the one or more epitopes could preferably be covalently attached to the calibration portion at a site other than utilising the α-carbon backbone sequence. Such “branched” non-natural protein sequences could preferably be fabricated using covalent bonding through unique (or controlled numbers of) reactive residues such as cysteine, lysine, aspartate, glutamate, tyrosine.
Antibodies recognise relatively few amino acids in the particular protein target, which in one embodiment of the invention comprise a linear stretch of amino acids in the primary sequence (continuous epitope), and in another embodiment comprise amino acids that are discontinuous in primary sequence but contiguous in 3-dimensional space (discontinuous epitope).
In the present invention we use antibodies specific for individual epitopes of a number of relationally linked proteins or polypeptides in the non-natural protein sequence to understand the workings of, for example and without limitation a biochemical pathway. A quantitative understanding of such pathways is most desirable, and this also requires calibration of the output of studies using the protein/polypeptide specific antibodies. Since the non-natural protein sequence of the present invention also includes a calibration portion which comprises known amounts of epitope specific for each antibody in the series, it conveniently provides a means of investigating the quantitative relationships between components within a biochemical pathway or other series of proteins such as isoforms, allelic variations or proteins located within a common intra-cellular compartment.
The linking of a number of epitopes in series is an attractive advantage of the present invention as it reduces the cost of production of each product substantially (virtually in proportion with the number of epitopes in series) and thus makes the product profitable at reasonable price to the consumer.
Reference herein to a “scaffold portion” refers to a protein or concatamer within the non-natural sequence of the agent of the present invention which is non-reactive or innocuous and contributes to the calibration capability only in terms of its dominant physical properties for example by its molecular weight, and/or its pI, and/or its good production (expression) characteristics.
Preferably the scaffold portion of the non-natural protein sequence has a controlled property, preferably this property is relative molecular mass (Mr) or weight (Mwt) or the pH value for the isoelectric point of a given substance in solution (pI).
Preferably, the scaffold portion is a protein.
Preferably the scaffold portion comprises at least one natural or unnatural amino acid with at least one or more chemically reactive groups, preferably within the side chain of a residue.
Preferably the scaffold portion comprises one or more chemically reactive groups, for example, the carboxyl on glutamic acid or aspartic acid or the hydroxyl on tyrosine, and more preferably still comprises at least one cysteine and/or lysine amino acid group. Thus it will be appreciated that the scaffold portion may be a polymer containing a thiol or primary amine functional group or any other protein in which there are suitable reactive side chain groups such as aspartic acid, glutamic acid, cysteine and/or lysine groups available for covalent conjugation with the epitopes. It is desirable that the covalent conjugation of the epitopes to the chemically reactive groups of the scaffold portion be controlled.
Preferably, the number of reactive cysteine and/or lysine groups may be controlled by selecting the scaffold portion protein from a natural source which contains the desired number of reactive cysteine and/or lysine groups.
Preferably, the scaffold portion protein is selected from the group comprising: I27, from titin, which contains two cysteine residues; I39 domain, which is a subunit (subunit 5) of splicing factor 3b and which contains one cysteine residue; organ of Corti protein (Mus musculus) Swiss-Prot/TrEMBL Primary Accession Number Q8R448 which contains one cysteine and one lysine residue; heat shock protein, mitochondrial (Mus musculus) Swiss-Prot/TrEMBL Primary Accession Number Q64433 which contains eleven lysine residues; splicing factor 3B subunit 5 (Mus musculus) Swiss-Prot/TrEMBL Primary Accession Number Q923D4 which contains one cysteine and five lysine residues; ubiquinol-cytochrome C reductase complex ubiquinone-binding protein QP-C (Schizosaccharomyces pombe) Swiss-Prot/TrEMBL Primary Accession Number P50523 which contains one cysteine and six lysine residues; E1B protein (Human adenovirus type 11) Swiss-Prot/TrEMBL Primary Accession Number Q8B8U6 which contains one cysteine residue; chaperonin (Arabidopsis thaliana) Swiss-Prot/TrEMBL Primary Accession Number P34893 which contains nine lysine residues; photosystem II reaction centre H protein (Arabidopsis thaliana) Swiss-Prot/TrEMBL Primary Accession Number P56780 which contains three lysine residues; a NADH-ubiquinone oxidoreductase subunit, mitochondrial [Precursor] (Homo sapiens) Swiss-Prot/TrEMBL Primary Accession Number P56181 which contains one cysteine and nine lysine residues; signal recognition particle protein (Mus musculus) Swiss-Prot/TrEMBL Primary Accession Number P49962 which contains two cysteine and eight lysine residues; DNA polymerase delta subunit 4 (Mus musculus) Swiss-Prot/TrEMBL Primary Accession Number Q9CWP8 which contains two cysteine and six lysine residues.
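By way of illustration only, the selection of a scaffold protein by its number of reactive cysteine and lysine residues could be sketched as a simple residue count over a candidate sequence in one-letter code. The function names and the candidate sequence below are hypothetical placeholders and are not taken from any of the proteins listed above.

```python
from collections import Counter

# One-letter codes for the two reactive residues discussed in the text.
REACTIVE = {"C": "cysteine", "K": "lysine"}


def count_reactive_residues(sequence: str) -> dict:
    """Count cysteine and lysine residues in a one-letter protein sequence."""
    counts = Counter(sequence.upper())
    return {name: counts.get(code, 0) for code, name in REACTIVE.items()}


def matches_requirement(sequence: str, cysteines: int, lysines: int) -> bool:
    """True if the sequence has exactly the desired number of each residue."""
    found = count_reactive_residues(sequence)
    return found["cysteine"] == cysteines and found["lysine"] == lysines


# Hypothetical placeholder sequence, for illustration only.
candidate = "MSTCAGKLVEKRA"
print(count_reactive_residues(candidate))
print(matches_requirement(candidate, cysteines=1, lysines=2))
```

A count of this kind only reports how many such residues are present; whether each residue is accessible and reactive in the folded protein would still need to be established experimentally.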
In a particular embodiment, the scaffold portion may comprise one or more domains, such as I27, from titin. Titin contains a number of β-sandwich domains belonging to the Ig family. The I27 domains usually contain two cysteine residues and fold to form stable structures of 10 kDa. In Nature, the I27 domain contains two cysteine residues (the site for covalent attachment of peptide); however mutation of these cysteine residues, to serine for example, is compatible with domain folding. Thus a presentation system of I27 can be formed where the molecular weight step size is a convenient unit (10 kDa steps) and where one unit (or more if required) can be engineered to possess a single cysteine residue for peptide attachment while all other units of I27 will lack cysteine residues. In alternative embodiments, the units of I27 may lack other reactive residues. These residues may include, but not be limited to, lysine, glutamate and aspartate. A copy of I27 could contain one or more of these reactive residues, offering a controlled number of sites for the covalent attachment of epitopes.
Preferably, the number of reactive cysteine and/or lysine groups may be controlled by modifying any of the aforementioned scaffold portion proteins by selective mutation, that is by adding, removing or rendering ineffective any one or more of the reactive cysteine and/or lysine residues.
Alternatively, one or more of the titin domains may be mutated to possess either one or no cysteine residues. In one embodiment, the non-natural protein sequence comprises one or more I27 domains and a plurality of epitopes, wherein one of the I27 domains comprises a single cysteine residue and the other I27 domains lack a cysteine residue. In an alternative embodiment, the scaffold portion may comprise an I39 domain which is a subunit (subunit 5) of splicing factor 3b. The I39 domain is a 10 kDa domain. Preferably, the scaffold portion is of a convenient molecular weight and is typically selected as 10 kDa for convenience.
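The 10 kDa step size described above lends itself to a simple estimate of the molecular weight of a construct. The following is a minimal sketch only: the 10 kDa unit mass is taken from the text, whereas the average residue mass of 0.110 kDa used for the epitopes is a common rule-of-thumb approximation introduced here as an assumption, and the epitope sequence is a hypothetical placeholder.

```python
# Mass of one scaffold unit (I27 or I39), as stated in the text.
SCAFFOLD_UNIT_KDA = 10.0
# Assumption: rule-of-thumb average mass per amino acid residue.
AVERAGE_RESIDUE_KDA = 0.110


def construct_mass_kda(n_units: int, epitopes: list) -> float:
    """Estimate construct mass as scaffold units plus in-line epitopes."""
    scaffold = n_units * SCAFFOLD_UNIT_KDA
    epitope_residues = sum(len(e) for e in epitopes)
    return scaffold + epitope_residues * AVERAGE_RESIDUE_KDA


# Hypothetical ten-residue epitope, for illustration only.
epitopes = ["ACDEFGHIKL"]
for units in range(1, 6):
    print(units, round(construct_mass_kda(units, epitopes), 2))
```

An estimate of this kind is intended only to show how the number of scaffold units sets the approximate position of the product on a gel; the exact mass would be calculated from the full sequence.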
Preferably, the scaffold portion of the non-natural protein sequence of the present invention is blind to the antibody/antibodies specific to the epitopes, i.e. it is non-reactive. Thus the scaffold portion may be considered capable of discrimination so that it is absolutely or relatively “immunologically blind” or “reactively inert” or substantially so.
Many proteins are covalently modified in a transient manner in response to a number of stimuli. This form of covalent modification can alter the function of the protein in question. Covalent modification can take the form of phosphorylation (of serine, threonine, tyrosine, lysine, arginine, histidine, aspartic acid and glutamic acid residues), sulphation of tyrosine residues, nitrosylation of cysteine residues (http://download.cell.com/supplementarydata/cell/106/6/675/DC1/TableS1.pdf), glycosylation (of threonine, serine residues), ubiquitination and related modifications (of lysine residues). A full understanding of the process of transient covalent modification includes description of the proportion of molecules covalently modified at any particular point in time. This is termed the stoichiometry of modification and it represents the number of moles of modified protein per mole of total target protein (i.e. mol/mol).
In one embodiment of the invention the non-natural protein sequence of the present invention comprises at least two independent epitopes relating to a single protein.
Preferably, the at least two independent epitopes relating to a single protein are both situated on the backbone sequence comprising non-reactive or inert molecules or one may be situated there and the other associated with the scaffold portion.
The agents of the present invention can be used to determine the stoichiometry of modification of a particular site on a protein target. For example, total protein can be determined using a scaffold portion protein containing the epitope for the protein (where antibody binding to this epitope is not affected by any of the covalent modifications the protein can experience) as described above. In addition, the scaffold portion or concatamer portion can contain a single reactive residue to allow the covalent attachment of a second peptide epitope containing the covalent modification site, to which a modification-specific antibody exists. The non-natural protein sequence will thus contain two independent epitopes relating to this one protein, one (in series with the backbone of the scaffold protein) which records the total protein content of a sample, and a second (forming a branch in the polymer) peculiar to the covalent modification site. The use of this agent comprising a non-natural protein sequence with an experimental (test) sample will permit the collection of calibrated data relating to the binding of both antibodies to the agent and will facilitate calculation of the stoichiometry of modified proteins in the sample.
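The calculation of the stoichiometry of modification from the two calibrated measurements can be sketched as follows. This is an illustrative sketch only: it assumes that the amount of modified epitope and the amount of total epitope in the sample have each already been converted to pmol using the calibration data, and the function name and numerical values are hypothetical.

```python
def stoichiometry_of_modification(modified_pmol: float, total_pmol: float) -> float:
    """Moles of modified protein per mole of total target protein (mol/mol)."""
    if total_pmol <= 0:
        raise ValueError("total protein amount must be positive")
    return modified_pmol / total_pmol


# Hypothetical calibrated amounts for one sample, for illustration only.
modified = 0.25  # pmol, from the modification-specific antibody
total = 1.0      # pmol, from the antibody insensitive to modification
print(stoichiometry_of_modification(modified, total))
```

Because both amounts are read against the same agent, any error in the absolute quantity of agent loaded affects both measurements equally and cancels in the ratio.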
Preferably, the non-natural protein sequence comprises a single (or controlled number of) attachment site(s) for the second epitope, which will result in the creation of a branched polymer. Cysteine, lysine, aspartic acid, glutamic acid, tyrosine are the principal reactive residues found in proteins. “In line epitope” sequences (i.e. those continuous with the α-carbon backbone of the scaffold portion) will dictate whether the protein contains controlled numbers of these residues to facilitate attachment of the second epitope (modification specific) without corruption of the first (in line series). For example, where an epitope is being displayed by covalent attachment to a thiol group within the scaffold portion, the presence of thiol groups within the in line epitope(s) would be highly undesirable. In some cases the “in line epitope” sequence might prove incompatible with this approach, and alternative in line epitope sequences will be required.
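The compatibility check described above can be sketched as a scan of each in line epitope for the residue used by the chosen attachment chemistry. This is an illustrative sketch only; the mapping of chemistry names to residues, the function name and the epitope sequences are hypothetical, and only side-chain groups are considered.

```python
# Assumption: attachment chemistry mapped to the one-letter code of the
# side chain it targets (thiol on cysteine, primary amine on lysine).
TARGET_RESIDUE = {"thiol": "C", "amine": "K"}


def incompatible_epitopes(in_line_epitopes: list, chemistry: str) -> list:
    """Return in line epitopes containing the residue used for attachment."""
    residue = TARGET_RESIDUE[chemistry]
    return [e for e in in_line_epitopes if residue in e.upper()]


# Hypothetical in line epitope sequences, for illustration only.
epitopes = ["AGHLV", "TCSPE", "DKLQR"]
print(incompatible_epitopes(epitopes, "thiol"))
print(incompatible_epitopes(epitopes, "amine"))
```

Any epitope flagged in this way would either be replaced by an alternative in line epitope sequence or the attachment chemistry would be changed.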
In an alternative embodiment of the invention, the “branched” sequence may contain information for one or more epitope.
Preferably, multiple epitopes may be a contiguous sequence or in an overlapping sequence or they may contain hapten epitopes (e.g. phosphoamino acid, dyes, other modified amino acids) or alternatively they could be linear or branched structures.
For example, protein isoforms, generated from alternative genes, by alternative splicing, or by post-translational processing (e.g. proteolysis) can be used to create functional versatility from a limited genetic resource. Isoforms can differ in terms of their functional properties, their mode of regulation (partners, modification sites), their location in a cell, or alter their stability and life-time (1).
Thus it will be appreciated that the non-natural protein sequence of the present invention has the ability to bind partners specific to the selected epitopes and simultaneously to provide calibration qualities.
In a further aspect of the present invention, there is provided a non-natural protein sequence for use in detecting the presence of one or more relationally linked proteins or polypeptides in a sample and calibration of the said sample, the non-natural protein sequence product comprising a plurality of epitopes to proteins or polypeptides that are functionally related or linked and a scaffold portion comprising a non-reactive sequence that contributes to the calibration capability only in terms of its dominant physical properties as herein before described.
In a further aspect of the present invention, there is provided a non-natural protein sequence for use in detecting the presence of one or more relationally linked proteins or polypeptides and the absolute concentration of said epitope in a sample, the non-natural protein sequence product comprising a plurality of epitopes to proteins or polypeptides that are functionally related or linked and a scaffold portion comprising a non-reactive sequence that contributes to the calibration capability only in terms of its dominant physical properties as herein before described.
By determining the absolute concentration of an epitope in the sample we are able to equate this amount to a particular protein or protein modification.
In a further aspect of the invention, the non-natural protein sequence of the present invention can be used to calibrate a sandwich ELISA style experiment, wherein one antibody bound to a physical surface captures an antigen and then a second antibody, specific for a second feature on the same antigen, binds to the captured antigen. A schematic representation is presented in
The present non-natural protein sequence or calibration material can incorporate multiple epitopes for antibodies specific for the same protein, spaced within the calibration product in such a way that more than one antibody can bind to its epitope simultaneously. In the example shown in
In a yet further aspect of the invention, there is provided a kit for identifying the presence of and quantifying the amount of at least two relationally linked proteins or polypeptides in a sample, the kit comprising a non-natural protein sequence product comprising a scaffold portion comprising a non-reactive sequence that contributes to the calibration capability only in terms of its dominant physical properties as herein before described and plurality of epitopes to proteins or polypeptides that are functionally related or linked. Optionally, the kit may comprise instructions for use thereof.
In a yet further aspect of the invention, there is provided a method of simultaneously detecting the presence of at least two relationally linked proteins or polypeptides and quantifying the amount of said functionally related or linked proteins or polypeptides in a sample, the method comprising:
The present invention is therefore of great utility to a researcher who may wish to study multiple related components in their system and at the same time provide a positive control, an internal standard or generate a calibration curve.
The present invention is also of great utility in industry, where pathway systems can be altered to accomplish the manufacture of new products, both those which occur in nature and others not normally produced in nature.
Antibodies specific for epitope sequences can be sourced from commercial or collaborative sources or produced using published procedures (Drago G. A., & Colyer, J. (1994) J. Biol. Chem. 269, 25073-25077; Hudson L., & Hay, F. C. (1980) Practical Immunology, 3rd Ed., Blackwell Scientific Publications, Oxford). Conversely, the epitope of an antibody can be defined using published methods (Morris, G. L., Cheng, H-C., Colyer, J., & Wang, J. H. (1991) J. Biol. Chem. 266, 11270-11275).
The epitope of each antibody for a component in a relationally linked series must be established, either by virtue of the immunisation material (i.e. a peptide) or by empirical characterisation of the antibody. The peptide sequence of each chosen epitope (e.g. five sequences from different SERCA enzyme isoforms in example 1) will be selected on the basis of its recognition by an appropriate antibody, and its chemical dissimilarity with other sequences in the calibration product. A gene encoding these peptide epitope sequences in series will be designed using codon usage information for Escherichia coli, or other relevant protein expression host. The gene (epitope gene) would be synthesised using published methods (Maniatis, Fritsch and Sambrook. 1st Ed 1982, 2nd Ed. 1989, Molecular Cloning: A laboratory manual. Cold Spring Harbor Press.).
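The design of an epitope gene from peptide sequences in series can be sketched as a back-translation using one codon per amino acid. This is a minimal sketch under stated assumptions: the codon choices below are codons frequently used in Escherichia coli and are given as an assumption to be checked against a codon usage table for the chosen expression host; the peptide sequences are hypothetical placeholders; and restriction sites, reading frame with the scaffold gene and sequence repeats are not considered.

```python
# Assumption: one frequently used E. coli codon per amino acid.
# These choices should be verified against a codon usage table.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide: str) -> str:
    """Convert a one-letter peptide sequence into a DNA coding sequence."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())


def design_epitope_gene(epitopes: list) -> str:
    """Join the coding sequences of the epitopes in series, in order."""
    return "".join(back_translate(e) for e in epitopes)


# Hypothetical epitope peptides, for illustration only.
epitopes = ["MK", "DE"]
print(design_epitope_gene(epitopes))
```

The resulting sequence is three nucleotides per residue with no start or stop codon, since the epitope gene is intended to be cloned into a further gene encoding the scaffold portion.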
The epitope gene was cloned into a further gene encoding the scaffold portion and an affinity purification tag, at a unique restriction site (Brockwell, D. J., Beddard, G. S., Clarkson, J., Zinober, R., Blake, A. W., Trinick, J., Olmsted, P. D., Smith, D. A., & Radford, S. E. (2002) Biophys. J. 83, 458-472). Inserts with the correct orientation were identified by polymerase chain reaction using appropriate primer sets, and the gene product was expressed in E. coli as described in Brockwell et al. (2002).
The calibration product was purified by affinity chromatography (Brockwell et al., 2002) followed by preparative SDS-PAGE, and the amount of product was determined using a standard protein assay (Smith P K, Krohn R I, Hermanson G T, Mallia A K, Gartner F H, Provenzano M D, Fujimoto E K, Goeke N M, Olson B J, Klenk D C. Measurement of Protein Using Bicinchoninic Acid. Anal. Biochem. 1985; 150: 76-85).
Known amounts of purified calibration product, in a serial dilution series from 10 pmol to 0.01 pmol, were resolved on a 10% SDS-PAGE gel, transferred to PVDF membrane and stained with an antibody specific for the SERCA1a epitope. A series of biological samples containing SERCA proteins were analysed in parallel. Identical experiments were performed in series or in parallel and stained with antibodies specific for each of the other SERCA isoforms in turn. Immunosignals were quantified by densitometry (Rodriguez, P., Bhogal, M. S., & Colyer, J. (2003) J. Biol. Chem. 278, 38593-38600) and a calibration curve relating the immunosignal to calibrant loading (pmol) was used to determine the amount of epitope present in each biological sample.
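The use of the calibration curve can be sketched as a least-squares fit of immunosignal against calibrant loading, followed by inversion of the fit for each biological sample. The sketch below is illustrative only. It assumes that the signal is linear in loading on a log-log scale over the dilution range, which is an assumption and may not hold where the signal saturates; the densitometry values are hypothetical and are not experimental data.

```python
import math


def fit_log_log(loadings_pmol: list, signals: list) -> tuple:
    """Least-squares fit of log10(signal) = intercept + slope * log10(pmol)."""
    xs = [math.log10(v) for v in loadings_pmol]
    ys = [math.log10(v) for v in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def epitope_amount_pmol(signal: float, slope: float, intercept: float) -> float:
    """Invert the calibration curve to read off the epitope amount."""
    return 10 ** ((math.log10(signal) - intercept) / slope)


# Serial dilution of calibration product (pmol), as described in the text.
loadings = [10.0, 1.0, 0.1, 0.01]
# Hypothetical, idealised densitometry values, for illustration only.
signals = [5000.0, 500.0, 50.0, 5.0]

slope, intercept = fit_log_log(loadings, signals)
sample_signal = 250.0  # hypothetical signal from a biological sample
print(epitope_amount_pmol(sample_signal, slope, intercept))
```

A separate curve would be fitted for each antibody in the series, and sample signals falling outside the range of the dilution series would not be interpolated.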
Take, by way of example only, a process A→B→C→D→E, catalysed by enzymes f, g, h, and i. Antibodies specific to each enzyme exist (called f′, g′, h′, and i′) and the epitope for each of these antibodies has been defined (f″, g″, h″, and i″). A calibration standard material could be constructed in which the chemical constituent of each antibody epitope (f″, g″, h″, and i″) is linked in series to form an unnatural protein sequence. This sequence is then linked in series to an additional protein sequence or scaffold portion, which is not recognised by any of the antibodies in this particular experimental series. This additional protein mass or scaffold portion functions in controlling the molecular weight of the non-natural protein sequence of the present invention. The non-natural protein sequence product will contain known amounts of each epitope, and can thus be used in experiments in known amounts to calibrate the signals generated by the experiment. The non-natural protein sequence of the present invention is shown schematically in
Many proteins can be expressed in a variety of isoforms, either from the expression of closely related genes or from the production of alternatively spliced forms of an individual gene, or by combination of both of these mechanisms.
The multifunctional sarcoplasmic/endoplasmic reticulum (Ca2+-Mg2+)-adenosine triphosphatase (SERCA) exists in a number of isoforms generated from different genes (1,2,3), with alternative splicing products of genes SERCA1 and SERCA2 resulting in further diversity. See Table 1 below for details.
990HHVDEKKDLk999
With reference to
The multifunctional protein phosphatase, calcineurin (CaN), is an example of an enzyme expressed in a variety of isoforms. CaN is involved in a large variety of biological events including programmes of gene expression in response to extracellular signals (CaN/NFAT). The calcineurin isoforms are as follows: Calcineurin alpha (CaN alpha); Calcineurin beta (CaN beta); and Calcineurin gamma (CaN gamma) (sequence not commercially available). A product of the invention comprises the three calcineurin isoforms; see Table 2 below for details.
The multifunctional protein kinase, calmodulin-dependent kinase II is an example of an enzyme expressed in a variety of isoforms. CaM kinase II is involved in a large variety of biological events including memory, regulation of vesicle movement, and maladaptive responses in heart failure. The isoforms of CAMKII are listed below:
A calibration product could be constructed from a series of epitope sequences, where each sequence represents the epitope for an antibody specific for an isoform of CaM kinase II. Some epitopes are shared between all or several isoforms; these epitopes could be incorporated in the calibration standard to calibrate multiple isoforms with a single antibody (e.g. module 5). A number of phosphorylation sites exist in the protein. Epitopes for phosphorylation site specific antibodies could be incorporated in the product (e.g. module 6 above) to calibrate the status of phosphorylation too. Details of the antibodies and epitopes are set out below in Table 3.
Polymorphisms occur within biological species in probably every gene. In some cases the polymorphisms occur with altered probability in disease situations, and in those cases are of particular interest and use.
Genetic variation exists within the population of a species, which at the individual gene level is manifest as polymorphisms of a gene. Polymorphisms represent typically single base changes in the sequence of a gene which can occur in the coding or non-coding regions. These deviations in sequence can be without consequence to the gene, or can alter the level of expression of the gene, or can alter the polymer encoded by the gene. In many instances the probability of disease is linked to particular polymorphisms, which serves as a useful screening tool, and as a basis for hypothesis driven research into the cause and management of disease.
Certain polymorphisms in the RYR2 gene, which encodes an ion channel expressed in the heart, are associated with a disorder known as catecholaminergic polymorphic ventricular tachycardia (CPVT), which can provoke electrical irregularity and sudden death when an individual exercises. To date, over 20 separate missense polymorphisms (those which alter the primary sequence of the protein) linked to CPVT have been discovered in the human RYR2 gene. These include:
These are residues conserved between man (and mouse), Drosophila and Caenorhabditis elegans. They lie in regions which are also highly conserved, both across these species and between isoforms of RYR (1, 2 and 3).
A large number of mutations in RyR2 are found in patients with arrhythmogenic right ventricular dysplasia type 2 (ARVD2) and catecholaminergic polymorphic ventricular tachycardia (CPVT). These are believed to play a causal role in disease. A sub-set of known disease-associated mutations of RYR2 includes: (1) R176Q; (2) V2306I; (3) G3946S; and (4) V4653F. To our knowledge, antibodies specific for these mutations do not exist; however, it is likely that they can be generated using a short synthetic peptide immunogen incorporating the mutation site, using techniques known in the art. Calibration of such antibodies could be achieved using a product comprising SEQ ID NOs 19-22 (see Table 4 below).
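As an illustration of how a short peptide incorporating a mutation site might be derived from missense notation of the kind used above, the following Python sketch parses the notation, checks the wild-type residue, and returns the mutated residue with its flanking residues. The function names, the default flank length and the toy sequence are assumptions made for this example; the toy sequence is not RyR2, and real immunogen design would involve further considerations.

```python
import re
from typing import Tuple


def parse_missense(notation: str) -> Tuple[str, int, str]:
    """Parse notation such as 'R176Q' (or 'R176 Q') into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])\s*(\d+)\s*([A-Z])", notation.strip())
    if match is None:
        raise ValueError(f"unrecognised notation: {notation!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def mutant_peptide(protein: str, notation: str, flank: int = 7) -> str:
    """Return the mutated residue plus up to `flank` residues on each side."""
    wild_type, position, mutant = parse_missense(notation)
    index = position - 1  # the notation is 1-based, Python strings are 0-based
    if not 0 <= index < len(protein):
        raise ValueError(f"position {position} lies outside the sequence")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    start = max(0, index - flank)
    end = min(len(protein), index + flank + 1)
    return protein[start:index] + mutant + protein[index + 1:end]


# Toy sequence (hypothetical, for illustration only): arginine at position 7.
toy_protein = "MAGSWTRLKDEYFNPQ"
peptide = mutant_peptide(toy_protein, "R7Q", flank=3)
```

Checking the wild-type residue before substitution guards against off-by-one numbering errors, which are easy to make when a sequence record and a published mutation use different numbering.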
Protein p53 (also called TP53) is associated with a high proportion of cancers in man. For example, in human liver cancer, 26% of cases (559 of 2153 tumours) show mutation in TP53 according to 64 studies (Jackson et al; 2006 Toxicological Sciences 90, 400-418). Similarly, TP53 mutations occur in 42% of spontaneous lung tumours in man. Missense polymorphisms result in mutant proteins, some of which are associated with cancer, such as: (1) R249S, the most frequent TP53 mutation in hepatocellular carcinoma (HCC); (2) R172P; (3) R172H; and (4) R270H. To our knowledge, antibodies specific for these mutations do not exist; however, it is likely that they can be generated using a short synthetic peptide immunogen incorporating the mutation site, using techniques known in the art. Calibration of such antibodies could be achieved using the following product comprising SEQ ID NOs 26-29 of the epitopes after mutation; see Table 5 below:
Proteins are involved in natural processes such as metabolism, blood clotting, and hormone action. For example, glycolysis achieves the following overall chemical reaction:
C6H12O6 + 2 ADP + 2 Pi + 2 NAD+ → 2 CH3COCOO− + 2 ATP + 2 H2O + 2 NADH + 2 H+
It utilises 10 enzymes, and an 11th under anaerobic conditions. Specifically, the enzymes are as follows:
Hexokinase, phosphoglucose isomerase, 6-phosphofructo-1-kinase, fructose bisphosphate aldolase, triose phosphate isomerase, glyceraldehyde 3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase, pyruvate kinase, (lactate dehydrogenase).
A calibration standard comprising epitopes for antibodies to each of these proteins would be useful in the study of glycolytic processes in biology, biotechnology and medicine.
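One way such a standard might be used quantitatively is sketched below. Since every epitope is present once per calibrant molecule, a dilution series of the calibrant provides a standard curve for each antibody from the same lanes. The sketch assumes a linear relationship between band signal and amount over the range used, and all numbers are invented for illustration; the function names `fit_line` and `estimate_amount` are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_amount(signal: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the amount giving `signal`."""
    return (signal - intercept) / slope


# Invented data: amounts of calibrant loaded (fmol) and the band signal
# obtained with two different antibodies on the same dilution series.
calibrant_fmol = [10.0, 20.0, 40.0, 80.0]
standard_signals = {
    "hexokinase": [105.0, 205.0, 405.0, 805.0],
    "pyruvate kinase": [32.0, 62.0, 122.0, 242.0],
}
# Invented signals measured for the same two proteins in a test sample.
sample_signals = {"hexokinase": 305.0, "pyruvate kinase": 92.0}

estimates = {}
for protein, signals in standard_signals.items():
    slope, intercept = fit_line(calibrant_fmol, signals)
    estimates[protein] = estimate_amount(sample_signals[protein], slope, intercept)

# Molar ratio of the two proteins in the sample, on a common scale.
ratio = estimates["hexokinase"] / estimates["pyruvate kinase"]
```

In this invented example the two antibodies differ more than threefold in sensitivity, yet both estimates fall on the same molar scale, so the ratio between the two proteins can be read directly.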
The eukaryotic cell cycle is an essential pathway necessary for all proliferative responses. This cycle involves a number of protein kinases and their partner regulatory proteins (cyclins); the concentrations of the cyclins change throughout the cell cycle to allow passage of the cell through specific controlling check-points. Cell cycle engine parts include: (1) Cyclin A; (2) cdK1; (3) Cyclin D; (4) cdK4; (5) cdK6; (6) Cyclin E; (7) cdK2; and (8) Cyclin B. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-8 as defined in Table 6 below would be useful in the study of cell cycles in biology, biotechnology and medicine.
A series of components acting in concert can form a pathway. Large numbers of pathways exist in biochemistry. One example is a pathway of interactions which control the expression of the cell cycle regulators cyclin A and cyclin E. G1 cyclins that overcome inhibitors of cell cycle progression are: (1) P16; (2) Cyclin D; (3) Retinoblastoma Protein; (4) E2F; (5) Cyclin E; (6) Cyclin A; and (7) P27. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-7 as defined in Table 7 below would be useful in the study of cell cycles in biology, biotechnology and medicine.
An example of a pathway of interactions which communicates extracellular stimuli to changes in gene expression, involving NFAT and calcineurin, is: (1) CHP; (2) FK506; (3) MCIP/calcipressin; (4) AKAP79; (5) CsA/CyA; (6) Cabin 1/CAIN; (7) NFAT(P); (8) PKA; (9) CKI; (10) GSK-3beta; (11) JNK; (12) P38; (13) MEF2; and (14) NFAT. Table 8 below shows the details for the manufacture of a non-natural protein sequence of the present invention using two or more epitopes selected from the group comprising epitopes 1-14.
A further example of a pathway of interactions is the control of cytokine production downstream of the Toll-like receptor. Lipopolysaccharide is a ligand for the Toll-like receptor, and the proteins involved in the signalling network include: (1) TLR; (2) TRIF; (3) TRAF6; (4) JNK; (5) IL10; (6) IL6; (7) RANTES; (8) GCSF; (9) TNF alpha; and (10) MIP1 alpha. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-10 as defined in Table 9 below would be useful in the study of cytokine production in biology, biotechnology and medicine.
Some biological components share a common location in a cell for all or some of their time. A number of signals contained within the primary sequence of proteins control their location in the cell. Residence at such a location is typically dynamic (rather than static), and thus evaluation of the entire protein complement of that location would be valuable in biological research.
Lipid raft domains of biological membranes are an interesting example of a discrete cellular location. Our present understanding places lipid rafts as subdomains of the plasma membrane, characterised by a gel phase lipid composition (lipid and cholesterol) which allows residence of some particular proteins and exclusion of others. Three distinct lipid raft types can be resolved, as summarised in the table below (taken from http://www.bms.ed.ac.uk/research/others/smaciver/Cyto-Topics/lipid rafts and the cytoskeleton.htm):
The centrosome is a common physical location for some biological components. It is located adjacent to the eukaryotic nucleus and serves a variety of functions, including the organisation of microtubules. The centrosome organises the assembly of the mitotic spindle, which permits the correct segregation of chromosomes. Abnormalities in centrosome components can lead to centrosome dysfunction, which is often associated with proliferative diseases such as cancer. Centrosomes also play important roles in cell migration, the movement of cilia and the movement of vesicular membrane structures within cells. The centrosome contains a number of proteins, including: (1) Microtubule; (2) Pericentrin; (3) Centrin; (4) PCM1; (5) Ninein; (6) BBS4; (7) P150 Glued; (8) Dynein; (9) Centriolin; (10) Gamma Tubulin; (11) Polo Kinases; (12) Aurora Kinases; (13) Catanin; and (14) Katanin.
A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-14 as defined in Table 10 below would be useful in the study of cell cycles in biology, biotechnology and medicine. Furthermore, a calibration product comprising the amino acid sequences recognised by antibodies specific to multiple components below is envisaged. In some instances suitable antibodies with known epitope sequences have been described; in other instances such antibodies need to be identified.
Lipid rafts are another physical location: domains of the plasma membrane that are phase separated from the surrounding regions of membrane. The phase separation arises as a consequence of the concentration of cholesterol and sphingomyelin lipids, which group together to form a gel phase. Transmembrane proteins typically cannot enter this microdomain, which is populated instead by proteins anchored through fatty acid or lipid-like units, including GPI (glycosylphosphatidylinositol) anchored proteins and proteins which are both myristoylated and palmitoylated.
Proteins associated with lipid rafts include: (1) Lck (SRC kinase family members); (2) Fyn (SRC kinase family members); (3) H-Ras; (4) ZAP-70; (5) CD3ζ; (6) LAT; (7) Flotillin-1; (8) CD2; (9) PAG; (10) F-actin; and (11) CD59. A calibration product comprising the amino acid sequences recognised by antibodies specific to multiple components above is conceivable. In some instances suitable antibodies with known epitope sequences have been described; in other instances such antibodies need to be identified. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-11 as defined in Table 11 below would be useful in the study of lipid rafts in biology, biotechnology and medicine.
Macromolecular complexes which bring post-translational modification enzymes close to their target substrates are known. AKAPs (A-kinase anchoring proteins) are a good example of such complexes, which bring enzymes involved in signalling together with their substrate or effector proteins, creating local signalling circuits. The mAKAP macromolecular complex contains: (1) PDE4D3; (2) mAKAP; (3) PKA; (4) Epac1; (5) Rap1; (6) MEKK; (7) MEK5; and (8) ERK5. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-8 as defined in Table 12 below would be useful in the study of lipid rafts in biology, biotechnology and medicine.
Many biological polymers assemble stably or transiently into macromolecular complexes, which typically exhibit function. One such complex is the dystrophin complex, which forms a junction between the plasma membrane of a muscle cell and the underlying cytoskeletal structure. The dystrophin complex contains a number of proteins, including: (1) Laminin; (2) Alpha dystroglycan; (3) Beta dystroglycan; (4) Caveolin; (5) Dystrobrevin; (6) Dystrophin; (7) Actin; (8) Alpha sarcoglycan; (9) Beta sarcoglycan; (10) Delta sarcoglycan; (11) Gamma sarcoglycan; and (12) Sarcospan. A product comprising the non-natural protein sequence comprising two or more epitopes selected from 1-12 as defined in Table 13 below would be useful in the study of lipid rafts in biology, biotechnology and medicine.
The cardiac ryanodine receptor (RyR2), located in the sarcoplasmic reticulum (SR), is a calcium release channel which is centrally involved in myocyte excitation-contraction (E-C) coupling. The ryanodine receptor is also the center of a massive macromolecular complex, which includes numerous regulatory proteins that can modulate RyR2 function. This complex includes proteins that interact with the cytoplasmic part of RyR2 directly or indirectly (e.g. calmodulin (CaM), FK-506-binding proteins, protein kinase A, Ca-CaM-dependent protein kinase, phosphatases 1 and 2A, mAKAP, spinophilin, PR130, sorcin, triadin, junctin, calsequestrin and Homer). Understanding the physical and molecular nature of the protein-protein interactions between RyR and these other proteins is important, since this complex and the modulation of the ryanodine receptor are believed to be involved in cardiac arrhythmias, pace-maker function of the heart and cardiac disease.
Notations A and B can be cleaved and removed, which allows for the production of the true RyR2 macromolecular complex epitope calibrant. Notations 1 and 10 are tags which can be used in purification of the calibrant protein and also act as common and widely used antibody epitopes.
The optimised genetic sequence, which encodes all of the proteins in Table 14, is shown above as SEQ ID NO:88. The amino acid sequence resulting from this genetic sequence is given below as SEQ ID NO:89.
The genetic sequence encoding the antibody epitopes that make up the calibrant (and purification tags) is synthesised and inserted into the E. coli expression vector pGS-21a (
The pGS-21a plasmid, which now contains the genetic sequence encoding the RyR2 macromolecular complex calibrant, is transformed into BL21 (DE3) pLysS E. coli cells. Transformed cells are selected and used to express the calibrant protein after induction of gene expression with IPTG. After 3.5 hours of expression, the cells are harvested and re-suspended in sample buffer for analysis by SDS-PAGE and western blot. This is done in order to assess the purity of the calibrant product.
Following successful expression of a considerably pure calibrant product, the calibrant expression was scaled up from 1 ml to 1.5 L (
Mitogen activated protein kinases (MAPK) are at the center of many signal transduction pathways in eukaryotic cells. The study of MAPK pathways is important in research into many disease areas, such as inflammation, cancer and Parkinson's disease. We have used several proteins which are involved in one of the MAPK pathways to produce a corresponding calibrant. Antibody epitopes for the proteins in this pathway have been genetically encoded and expressed in bacteria to produce a single protein that contains the antibody epitopes for each of the labelled proteins. The numerical notation of each protein corresponds to the notation in Table 15 below.
Notations A and B can be cleaved and removed, which allows for the production of the true MAPKinase pathway epitope calibrant. Notations 1 and 11 are tags which can be used in purification of the calibrant protein and also act as common and widely used antibody epitopes.
The optimised genetic sequence, which encodes all of the proteins in Table 15, is shown below as SEQ ID NO:100.
The resulting amino acid sequence from the genetic sequence of SEQ ID NO:100 is given below as SEQ ID NO:101.
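The relationship between a genetic sequence and its resulting amino acid sequence can be checked with a short translation routine. The sketch below uses the standard genetic code and a made-up test sequence encoding a start codon followed by six histidines; it is not SEQ ID NO:100, and the function name `translate` is introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    residues = []
    for start in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]  # KeyError on non-ACGT bases
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Made-up example: ATG followed by six histidine codons and a stop codon.
example_dna = "ATGCATCACCATCACCATCACTAA"
example_protein = translate(example_dna)
```

Codon optimisation changes which synonymous codons are used but not the encoded protein, so translating an optimised sequence in this way should reproduce the intended epitope order exactly.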
The genetic sequence encoding the antibody epitopes that make up the calibrant (and purification tags) is synthesised and inserted into the E. coli expression vector pGS-21a (
The pGS-21a plasmid, which now contains the genetic sequence encoding the MAPK pathway calibrant, is transformed into BL21 (DE3) pLysS E. coli cells. Transformed cells are selected and used to express the calibrant protein after induction of gene expression with IPTG. After 3.5 hours of expression, the cells are harvested and re-suspended in sample buffer for analysis by SDS-PAGE and western blot. This is done in order to assess the purity of the calibrant product.
Number | Date | Country | Kind |
---|---|---|---|
0514729.3 | Jul 2005 | GB | national |
0514885.3 | Jul 2005 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2006/002695 | 7/19/2006 | WO | 00 | 7/2/2008 |