Cell Cycle Genes and Related Methods

Abstract
Novel plant polysaccharide synthesis genes and polypeptides encoded by such genes are provided. These genes and polynucleotide sequences are useful for regulating polysaccharide synthesis and plant phenotype. Moreover, these genes are useful for expression profiling of plant polysaccharide synthesis genes. The invention specifically provides cell cycle polynucleotide and polypeptide sequences isolated from Eucalyptus and Pinus.
Description
FIELD OF THE INVENTION

The present invention relates generally to the field of plant cell cycle genes and polypeptides encoded by such genes, and the use of such polynucleotide and polypeptide sequences for regulating a plant cell cycle. The invention specifically provides cell cycle polynucleotide and polypeptide sequences isolated from Eucalyptus and Pinus and sequences related thereto.


BACKGROUND OF THE INVENTION

Cell growth and division are controlled by the temporal expression of different sets of genes, allowing the dividing cell to progress through the different phases of the cell cycle. Continued growth and organogenesis in plants require precise function of the cell cycle machinery. Plant development, which is directly affected by cell division rates and patterns, is also influenced by environmental factors such as temperature, nutrient availability, and light. See Gastal and Nelson, Plant Physiol. 105:191-7 (1994), Ben-Haj-Salah and Tardieu, Plant Physiol. 109:861-7 (1995), and Sacks et al., Plant Physiol. 114:519-27 (1997). Plant development and phenotype are connected with the cell cycle, and altering expression of the genes involved in the cell cycle can be a useful method of modifying plant development and altering plant phenotype.


The ability to alter expression of cell cycle genes is extremely powerful because the cell cycle drives plant development, including growth rates, responses to environmental cues, and resulting plant phenotype. Control of the plant cell cycle and phenotypes associated with alteration of cell cycle gene expression, in the vascular cambium, in particular, has applications for, inter alia, alteration of wood properties and, in particular, lumber and wood pulp properties. For example, improvements to wood pulp that can be effected by altering cell cycle gene expression include increased or decreased lignin and cellulose content, and altered length, diameter, and lumen diameter of cells. Manipulating the plant cell cycle, and in particular the cambium cell cycle (i.e. the rate and angle of cell division), can also engineer better lumber having increased dimensional stability, increased tensile strength, increased shear strength, increased compression strength, increased shock resistance, increased stiffness, increased or decreased hardness, decreased spirality, decreased shrinkage, and desirable characteristics with respect to weight, density, and specific gravity.


A. Cell Cycle Genes and Proteins


1. Cyclin Dependent Protein Kinase


Progression through the cell cycle is regulated primarily by cyclin-dependent kinases (CDKs). CDKs are a conserved family of eukaryotic serine/threonine protein kinases, which require heterodimer formation with a cyclin subunit for activity. For review see, e.g. Joubes et al., Plant Mol. Biol. 43: 607-20 (2000), Stals and Inze, Trends Plant Sci. 6:359-64 (2001), and John et al., Protoplasma 216: 119-42 (2001).


There are five subclasses of CDKs, each having a different cyclin binding consensus sequence. In CDK type A the cyclin binding consensus sequence is PSTAIRE. Id. The cyclin binding consensus sequences in CDK types B-1, B-2, and C are PPTTLRE, PPTALRE, and PITAIRE, respectively. Joubes et al., Plant Physiol. 126: 1403-15 (2001).
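
By way of illustration only, the cyclin binding consensus sequences listed above could be used to assign a candidate peptide to a CDK subclass by a simple motif search. The following is a minimal sketch covering only the four motifs named in this paragraph; the function name classify_cdk and the example peptide are hypothetical and are not taken from the sequence listing.

```python
# Cyclin-binding consensus motifs for the CDK subclasses named in the text.
CDK_MOTIFS = {
    "PSTAIRE": "A",
    "PPTTLRE": "B-1",
    "PPTALRE": "B-2",
    "PITAIRE": "C",
}


def classify_cdk(peptide):
    """Return the CDK subclass whose motif occurs in the peptide, or None."""
    seq = peptide.upper()
    for motif, subclass in CDK_MOTIFS.items():
        if motif in seq:
            return subclass
    return None


# Hypothetical peptide fragment, made up for illustration only.
example = "MEKYEKVEKIGEGTYGVVYKAEGVPSTAIREISLLKE"
print(classify_cdk(example))
```

An exact-match search of this kind would miss variant motifs, so it is best treated as a first-pass screen rather than a definitive classification.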


Cell cycle progression is directed, in part, by changes in CDK activity. CDK activity is modulated by a number of different cell cycle protein components, such as changes in the abundance of individual cyclins due to changing rates of biosynthesis and proteolysis. Fluctuations in cyclin concentrations result in commensurate fluctuations in CDK activity. Cyclin accumulation is especially important in terminating the G1 phase of the cell cycle because DNA replication is initiated by an increase in CDK activity.


Activation of CDK also requires phosphorylation of a threonine residue within the T-loop of CDK by a CDK-activating kinase (CAK). Umeda et al., Proc. Nat'l Acad. Sci. U.S.A. 97: 13396-400 (2000). It was suggested by Yamaguchi et al., Plant J. 24: 11-20 (2000), that cyclin H is a regulatory subunit of CAK. CDK activity is further regulated by interaction with a CDK regulatory subunit, a small (70-100 amino acids) protein involved in cell cycle regulation.


A cell must exit the cell cycle in order to commit to differentiation, senescence or apoptosis. This process involves the down-regulation of CDK activities. CDK inhibitors (CKI) are low molecular weight proteins, which are important for cell cycle regulation and development. CKIs bind stoichiometrically to CDK and down-regulate the activity of CDKs.


Many biochemical properties of ICK1, the first plant CKI to be identified from Arabidopsis thaliana, are known. Wang et al., Nature 386:451-2 (1997); Wang et al., Plant J. 24: 613-23 (2000). ICK1 is expressed at low levels in many tissue types, and there can be a threshold level of ICK1 that must be overcome before a cell can enter the cell cycle. Wang et al., Plant J. 24: 613-23 (2000). ICK1 is induced by the plant growth regulator abscisic acid (ABA), which inhibits cell division by blocking DNA replication. When the expression of ICK1 increases, there is a corresponding decrease in Cdc2-like histone H1 kinase activity. ICK1 has been shown to bind in vitro with Cdc2a and the cyclin CycD3, and deletion experiments have identified different domain regions for these two interactions.


Altering the expression of CDK regulatory protein or a subunit thereof is known to cause changes in plant phenotype. Overexpression of the Arabidopsis CDK regulatory subunit, CKS1At, resulted in a reduction of leaf size, root growth rates and meristem size. Additionally, overexpression of CKS1At resulted in inhibition of cell-cycle progression, with an extension in the duration of the G1 and G2 phases of the cell cycle.


2. Cyclins


Cyclins are positive regulatory subunits of cyclin-dependent kinase (CDK) enzymes and are required for CDK activity. Fowler et al., Mol. Biotech. 10, 123, 126. Cyclins and CDK complexes provide temporal regulation of transition through the cell cycle. Evidence also suggests that cyclins provide spatial regulation of specific CDK activity, differentially targeting the cytoskeleton, spindle, phragmoplast, nuclear envelope, and chromosomes.


Plant cyclins are classified into five major groups: A, B, C, D, and H. Renaudin et al., Plant Mol. Biol. 32: 1003-18 (1996) and Yamaguchi et al., (supra 2000). Cyclins can be divided into mitotic cyclins (A and B) and G1 cyclins.


The mitotic cyclins possess a consensus sequence (R-x-x-L-x-x-I-x-N) located at the N-terminal region, termed a destruction box, adjacent to a lysine-rich region. The destruction box and lysine-rich region target the mitotic cyclins for ubiquitin-dependent proteolysis during mitosis. Stals, supra at 361, and Fowler, supra at 126. The destruction box in A versus B cyclins differs slightly and this difference is thought to result in slightly different timing of degradation of A versus B cyclins. Fowler, supra at 126. A-type cyclins accumulate during the S, G2, and early M phase of the cell cycle, whereas B-type cyclins accumulate during the late G2 and early M phase. Mironov et al., Plant Cell 11: 509-22 (1999). Three subgroups of A-type cyclins are known in plants, but only one is known in animals. Cyclin A1 (cycA1;zm;1 from Zea mays) is most concentrated during cytokinesis at the microtubule-containing phragmoplast. Expression of cyclin A2 is upregulated by auxins in roots, and by cytokinins in the shoot apex. Abrahams et al., Biochim. Biophys. Acta 28: 1-2 (2001).
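
As an illustrative sketch only, the destruction box consensus given above (R-x-x-L-x-x-I-x-N, where x is any residue) can be expressed as a regular expression and searched for in the N-terminal region of a candidate cyclin. The function name find_destruction_boxes, the 100-residue search window, and the example sequence are assumptions made for this example and do not appear in the cited references.

```python
import re

# Destruction-box consensus from the text: R-x-x-L-x-x-I-x-N (x = any residue).
DESTRUCTION_BOX = re.compile(r"R..L..I.N")


def find_destruction_boxes(peptide, n_terminal_window=100):
    """Return (start, motif) pairs for consensus matches in the N-terminal window.

    The 100-residue default window is an illustrative assumption.
    """
    region = peptide.upper()[:n_terminal_window]
    return [(m.start(), m.group()) for m in DESTRUCTION_BOX.finditer(region)]


# Hypothetical sequence constructed to contain one consensus match.
example = "MASKENRAALGDIGNKKKAKKE"
print(find_destruction_boxes(example))
```

Because the destruction boxes of A-type and B-type cyclins differ slightly, a strict consensus search such as this one may need to be relaxed for a particular cyclin class.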


D-type cyclins, of which five subgroups are known, are thought to control the progression through the G1 phase in response to growth factors and nutrients. Riou-Khamlichi et al., Mol. Cell Biol. 20: 4513-21 (2000). For example, the expression of D-type cyclins is upregulated by sucrose as shown by an increase in cycD2 mRNA 30 minutes after sucrose exposure, and an increase in cycD3 four hours after sucrose exposure. This timing corresponds to early G1-phase and late G1-phase, respectively. Cockcroft et al., Nature 405: 575-9 (2000). Furthermore, in Arabidopsis, a D3 cyclin was shown to be upregulated by the brassinosteroid, epi-brassinolide.


Cyclin D2 proteins bind with CDKA to produce an active complex, which binds to and phosphorylates retinoblastoma-related protein (Rb). This process is found in actively proliferating tissue, suggesting it plays an important function during late G1- and early S-phase. Three different D3-type cyclins are active during tomato fruit development. These proteins all contain a retinoblastoma binding motif and a PEST-destruction motif. There are differences in the spatial and temporal expression of these D3 cyclins, implying different roles during fruit development.


Overexpression of cyclin D was shown to increase overall growth rate. Overexpression of cyclin D2 in tobacco shortens the G1 phase, producing a faster rate of cell cycling.


C- and H-type cyclins were characterized in poplar (Populus tremula×tremuloides) and rice (Oryza sativa) but their exact function is still unclear. Putative cyclins with a lesser degree of peptide sequence conservation have also been identified. For example, Arabidopsis CycJ18 has only 20% identity with homologues over the cyclin box domain. CycJ18 is expressed predominantly in young seedlings. Arabidopsis F3O9.13 protein also has similarity to the cyclin family.


3. Histone Acetyltransferase/Deacetylase


Histone acetyltransferase (HA) and histone deacetylase (HDA) control the net level of acetylation of histones. Histone acetylation and deacetylation are thought to exert their regulatory effects on gene expression by altering the accessibility of nucleosomal DNA to DNA-binding transcriptional activators, other chromatin-modifying enzymes or multi-subunit chromatin remodeling complexes capable of displacing nucleosomes. Lusser et al., Nucleic Acids Res. 27: 4427-35 (1999). Therefore, in general, the HDAs are involved in the repression of gene expression, while HAs are correlated with gene activation.


HA effects acetylation at the ε-amino group of conserved lysine residues clustered near the amino terminus of core histones, which up-regulates gene expression.


HDAs remove acetyl groups from the core histones of the nucleosome. There are numerous family members in the HDA group, many of which are conserved throughout evolution. Lechner et al., Biochim Biophys Acta 5:181-8 (1996). HDAs function as part of multi-protein complexes facilitating chromatin condensation.


HDAs and HAs recognize highly distinct acetylation patterns on the nucleosome. It is thought that different types of HDAs interact with specific regions of the genome, to influence gene silencing.


Schultz et al., Genes Dev. 15: 428-43 (2001), demonstrated that the superfamily of Kruppel-associated-box zinc finger proteins (KRAB-ZFPs) are linked to the nucleosome remodeling and histone deacetylation complex via the PHD (plant homeodomain) and bromodomains of co-repressor KAP-1, to form a cooperative unit that is required for transcriptional repression. A maize HDAC (HD2) has been identified that has no sequence homology to other eukaryotic HDACs, but instead contains sequence similarity to peptidyl-prolyl cis-trans isomerases (PPIases).


The effects of interfering with histone deacetylation are discussed in e.g. Tian and Chen, Proc. Nat'l Acad. Sci. USA 98: 200-5 (2001).


4. Peptidyl Prolyl Cis-Trans Isomerase


Peptidylprolyl isomerases (e.g., peptidylprolyl cis-trans isomerase, peptidyl-prolyl cis-trans isomerase, PPIase, rotamase, cyclophilin) catalyze the interconversion of peptide bonds between the cis and trans conformations at proline residues. Sheldon and Venis, Biochem J. 315: 965-70 (1996). This interconversion is thought to be the rate limiting step of protein folding. PPIases belong to a conserved family of proteins that are present in animals, fungi, bacteria and plants. PPIases are implicated in a number of responses including the response to environmental stress, calcium signals, transcriptional repression, cell cycle control, etc. Viaud, et al., Plant Cell 14: 917-30 (2002).


5. Retinoblastoma-Related Protein


Retinoblastoma (Rb)-related protein putatively regulates progression of the cell cycle through the G1 phase and into S phase. Xie et al., EMBO J. 15: 4900-8 (1996) and Ach et al., Mol. Cell Biol. 17: 5077-86 (1997).


Although Rb is well-characterized in mammalian systems, the role of Rb-related proteins in regulation of G1 phase progression and S phase entry is not well characterized in plants. It is known, however, that Rb-related protein functions through its association with various other cellular proteins involved in cell cycle regulation, such as the cyclins and WD40 proteins. Soni et al., Plant Cell 7:85-103 (1995); Grafi et al., Proc. Natl. Acad. Sci. U.S.A. 93:8962 (1996); Ach et al., Plant Cell 9:1595-606 (1997); Umen and Goodenough, Genes Dev. 15:1652-61 (2001); Mariconti et al., J. Biol. Chem. 277:9911-9 (2002).


6. WD40 Repeat Protein


WD40 is a common repeating motif involved in many different protein-protein interactions. The WD40 domain is found in proteins having a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly. Goh et al., Eur. J. Biochem. 267: 434-49 (2000).


The WD40 domain, which is 40 residues long, typically contains a GH dipeptide 11-24 residues from the N-terminus and the WD dipeptide at the C-terminus. Id. Between the GH dipeptide and the WD dipeptide lies a conserved core which serves as a stable platform where proteins can bind either stably or reversibly. The core forms a propeller-like structure with several blades. Each blade is composed of a four-stranded anti-parallel β-sheet. Each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade. The last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure. The residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands.
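
The sequence features described above can be turned into a rough screening heuristic for a single candidate repeat. The sketch below is illustrative only and rests on stated assumptions: the phrase "11-24 residues from the N-terminus" is read as 11 to 24 residues preceding the GH dipeptide, the length tolerance around 40 residues is arbitrary, and the function name and example segment are hypothetical. Naturally occurring WD40 repeats are degenerate, so profile-based search tools would ordinarily be preferred over an exact dipeptide check.

```python
def looks_like_wd40_repeat(segment, min_len=36, max_len=46):
    """Heuristic check of one candidate repeat against the pattern in the text.

    Assumptions for illustration: the GH dipeptide is preceded by 11 to 24
    residues, and the length tolerance around 40 residues is arbitrary.
    """
    seq = segment.upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    if not seq.endswith("WD"):
        return False
    # GH may start at any offset from 11 through 24 (0-based).
    return any(seq[i:i + 2] == "GH" for i in range(11, 25))


# Hypothetical 40-residue segment built to satisfy the heuristic.
example = "A" * 12 + "GH" + "S" * 24 + "WD"
print(len(example), looks_like_wd40_repeat(example))
```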


Studies in yeast demonstrated that Cdc20, which contains the WD40 motif, is required for the proteolysis of mitotic cyclins. This process is mediated by a ubiquitin-protein ligase called anaphase-promoting complex (APC) or cyclosome. Following ubiquitination and proteolysis by the 26S proteasome, the cell can segregate chromosomes and exit from mitosis. Cdc20 also contains a destruction-box domain.


7. WEE1-Like Protein


WEE1 controls the activity of cyclin-dependent kinases. WEE1 itself is a serine/threonine kinase. Sorrell et al., Planta 215: 518-22 (2002). The enzymatic activity of these protein kinases is controlled by phosphorylation of specific residues in the activation segment of the catalytic domain, sometimes combined with reversible conformational changes in the C-terminal autoregulatory tail. This process is conserved among eukaryotes, from fungi to animals and plants. Similarly, there is a high degree of homology between WEE1 proteins from various organisms. For example, there is 50% identity between the protein kinase domains of the human and maize WEE1 proteins.


Expression of WEE1 has been shown to occur only in actively dividing tissues, and WEE1 is believed to inhibit cell division by acting as a negative regulator of mitosis. WEE1 is believed to prevent the transition from G2 to M by protecting the nucleus from cytoplasmically-activated cyclin B1-complexed CDC2 before the onset of mitosis. For example, over-expression of AtWEE1 (from Arabidopsis) and ZmWEE1 (from Zea mays) in fission yeast inhibits cell division, which results in elongated cells. Sun et al., Proc. Nat'l Acad. Sci. USA 96: 4180-5 (1999).


B. Expression Profiling and Microarray Analysis in Plant Development


The multigenic control of plant phenotype presents difficulties in determining the genes responsible for phenotypic determination. One major obstacle to identifying genes and gene expression differences that contribute to phenotype in plants is the difficulty with which the expression of more than a handful of genes can be studied concurrently. Another difficulty in identifying and understanding gene expression and the interrelationship of the genes that contribute to plant phenotype is the high degree of sensitivity to environmental factors that plants demonstrate.


There have been recent advances using genome-wide expression profiling. In particular, the use of DNA microarrays has been useful to examine the expression of a large number of genes in a single experiment. Several studies of plant gene responses to developmental and environmental stimuli have been conducted using expression profiling. For example, microarray analysis was employed to study gene expression during fruit ripening in strawberry, Aharoni et al., Plant Physiol. 129:1019-1031 (2002), wound response in Arabidopsis, Cheong et al., Plant Physiol. 129:661-7 (2002), pathogen response in Arabidopsis, Schenk et al., Proc. Nat'l Acad. Sci. 97:11655-60 (2000), and auxin response in soybean, Thibaud-Nissen et al., Plant Physiol. 132:118. Whetten et al., Plant Mol. Biol. 47:275-91 (2001) discloses expression profiling of cell wall biosynthetic genes in Pinus taeda L. using cDNA probes. Whetten et al. examined genes which were differentially expressed between differentiating juvenile and mature secondary xylem. Additionally, to determine the effect of certain environmental stimuli on gene expression, gene expression in compression wood was compared to normal wood. 156 of the 2300 elements examined showed differential expression. Whetten, supra at 285. Comparison of juvenile wood to mature wood showed 188 elements as differentially expressed. Id. at 286.


Although expression profiling and, in particular, DNA microarrays provide a convenient tool for genome-wide expression analysis, their use has been limited to organisms for which the complete genome sequence or a large cDNA collection is available. See Hertzberg et al., Proc. Nat'l Acad. Sci. 98:14732-7 (2001a), Hertzberg et al., Plant J., 25:585 (2001b). For example, Whetten, supra, states, “A more complete analysis of this interesting question awaits the completion of a larger set of both pine and poplar ESTs.” Whetten et al. at 286. Furthermore, microarrays comprising cDNA or EST probes may not be able to distinguish genes of the same family because of sequence similarities among the genes. That is, cDNAs or ESTs, when used as microarray probes, may bind to more than one gene of the same family.


Methods of manipulating gene expression to yield a plant with a more desirable phenotype would be facilitated by a better understanding of cell cycle gene expression in various types of plant tissue, at different stages of plant development, and upon stimulation by different environmental cues. The ability to control plant architecture and agronomically important traits would be improved by a better understanding of how cell cycle gene expression affects formation of plant tissues, how cell cycle gene expression causes plant cells to enter or exit cell division, and how plant growth and the cell cycle are connected. Among the large number of genes, the expression of which can change during development of a plant, only a fraction are likely to effect phenotypic changes during any given stage of the plant development.


SUMMARY

Accordingly, there is a need for tools and methods useful in determining the changes in the expression of cell cycle genes that occur during the plant cell cycle. There is also a need for polynucleotides useful in such methods. There is a further need for methods which can correlate changes in cell cycle gene expression to phenotype or stage of plant development. There is a further need for methods of identifying cell cycle genes and gene products that impact plant phenotype, and that can be manipulated to obtain a desired phenotype.


In one aspect, the present invention provides an isolated polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.


In another aspect, the present invention provides a DNA construct comprising at least one polynucleotide having the sequence of any one of SEQ ID NOs: 1-237 and conservative variants thereof.


Another aspect of the invention is a plant cell transformed with a DNA construct comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.


A further aspect of the invention is a transgenic plant comprising a plant cell transformed with a DNA construct comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.


Another aspect of the invention is an isolated polynucleotide comprising a sequence encoding the catalytic or substrate-binding domain of a polypeptide selected from any one of SEQ ID NOs: 261-497, wherein the polynucleotide encodes a polypeptide having the activity of said polypeptide selected from any one of SEQ ID NOs: 261-497.


A further aspect of the invention is a method of making a transformed plant comprising transforming a plant cell with a DNA construct comprising at least one polynucleotide having the sequence of any of SEQ ID NOs: 1-237; and culturing the transformed plant cell under conditions that promote growth of a plant.


In another aspect, the invention provides a wood obtained from a transgenic tree.


In a further aspect, the invention provides a wood pulp obtained from a transgenic tree which has been transformed with the DNA construct of the invention.


Another aspect of the invention is a method of making wood, comprising transforming a plant with a DNA construct comprising a polynucleotide having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof; culturing the transformed plant under conditions that promote growth of a plant; and obtaining wood from the plant.


The invention further provides a method of making wood pulp, comprising transforming a plant with a DNA construct comprising a polynucleotide having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof; culturing the transformed plant under conditions that promote growth of a plant; and obtaining wood pulp from the plant.


In another aspect, the invention provides an isolated polypeptide comprising an amino acid sequence encoded by the isolated polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.


The invention also provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 261-497.


The invention further provides a method of altering a phenotype of a plant, comprising altering expression in the plant of a polypeptide encoded by any one of SEQ ID NOs: 1-237.


In another aspect, the invention provides a polynucleotide comprising a nucleic acid selected from the group consisting of SEQ ID NOs: 471-697.


An aspect of the invention is a method of correlating gene expression in two different samples, comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first sample; detecting a level of expression of the one or more genes in a second sample; comparing the level of expression of the one or more genes in the first sample to the level of expression of the one or more genes in the second sample; and correlating a difference in expression level of the one or more genes between the first and second samples.


A further aspect of the invention is a method of correlating the possession of a plant phenotype to the level of gene expression in the plant of one or more genes comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first plant possessing a phenotype; detecting a level of expression of the one or more genes in a second plant lacking the phenotype; comparing the level of expression of the one or more genes in the first plant to the level of expression of the one or more genes in the second plant; and correlating a difference in expression level of the one or more genes between the first and second plants to possession of the phenotype.


In a further aspect, the invention provides a method of correlating gene expression to a stage of the cell cycle, comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first plant cell in a first stage of the cell cycle; detecting a level of expression of the one or more genes in a second plant cell in a second, different stage of the cell cycle; comparing the level of expression of the one or more genes in the first plant cell to the level of expression of the one or more genes in the second plant cell; and correlating a difference in expression level of the one or more genes between the first and second plant cells to the first or second stage of the cell cycle.
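
Purely as an illustration of the comparing step common to the three correlating methods above, expression levels measured in two samples can be compared gene by gene as a log2 ratio. This is a minimal sketch and not the method of the invention; the function names, the pseudocount, the two-fold threshold, the gene labels, and the signal values are all assumptions made for the example.

```python
import math


def log2_fold_changes(first, second, pseudocount=1.0):
    """Compare expression levels gene by gene between two samples.

    `first` and `second` map gene identifiers to signal intensities.
    The pseudocount (an assumption) avoids division by zero.
    """
    shared = first.keys() & second.keys()
    return {
        gene: math.log2((first[gene] + pseudocount) / (second[gene] + pseudocount))
        for gene in shared
    }


def differentially_expressed(first, second, threshold=1.0):
    """Return genes whose absolute log2 fold change meets the threshold."""
    changes = log2_fold_changes(first, second)
    return sorted(g for g, lfc in changes.items() if abs(lfc) >= threshold)


# Made-up signal values and gene labels, for illustration only.
sample_1 = {"geneA": 799.0, "geneB": 120.0, "geneC": 49.0}
sample_2 = {"geneA": 199.0, "geneB": 110.0, "geneC": 199.0}
print(differentially_expressed(sample_1, sample_2))
```

The genes returned by such a comparison would then be candidates for correlation with the phenotype, sample type, or cell cycle stage that distinguishes the two samples.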


An aspect of the invention is a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237.


Another aspect of the invention is a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237.


The invention further provides a microarray comprising a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 or wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237, wherein each of said two or more oligonucleotides occupies a unique location on said solid support.


In another aspect, the invention provides a method for detecting one or more genes in a sample, comprising contacting the sample with two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a gene comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 under standard hybridization conditions; and detecting the one or more genes of interest which are hybridized to the one or more oligonucleotides.


The invention also provides a method for detecting one or more nucleic acid sequences encoded by one or more genes in a sample, comprising contacting the sample with two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a gene comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 under standard hybridization conditions; and detecting the one or more nucleic acid sequences which are hybridized to the one or more oligonucleotides.


The invention further provides a kit for detecting gene expression comprising the microarray of the invention together with one or more buffers or reagents for a nucleotide hybridization reaction.


Other features, objects, and advantages of the present invention are apparent from the detailed description that follows. It should be understood, however, that the detailed description, while indicating preferred embodiments of the invention, is given by way of illustration only, not limitation. Various changes and modifications within the spirit and scope of the invention will be apparent to those skilled in the art from the detailed description.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1: Exemplary microarray sampling parameters.



FIG. 2: Plasmid map for pWVK202.



FIG. 3: Plasmid map for pGrowth14.



FIG. 4: Plasmid map for pGrowth15.



FIG. 5: Plasmid map for pGrowth16.



FIG. 6: Plasmid map for pGrowth18.



FIG. 7: Plasmid map for pGrowth19.



FIG. 8: Plasmid map for pGrowth20.





LIST OF TABLES

Table 1: shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.


Table 2: identifies plasmid(s), genes, and Genesis ID numbers for constructs described in Example 17.


Table 3: Rooting medium for Populus deltoides.


Table 4: pGrowth information.


Table 5: shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.


Table 6: Differentially expressed cDNAs.


Table 7: Consensus ID information.


Table 8: pGrowth information.


Table 9: Eucalyptus grandis cell cycle genes and proteins.


Table 10: Pinus radiata cell cycle genes and proteins.


Table 11: Annotated peptide sequences of the present invention.


Table 12: Eucalyptus in silico data.


Table 13: Pine in silico data.


Table 14: Oligo table.


Table 15: Peptide table.


Table 16: BLAST sequence alignment table.
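
The selection criterion stated for Tables 1 and 5 (a signal in any one sample that is more than double the mean signal of the other three samples) can be sketched as follows. This is an illustrative example only; the function name, gene labels, and signal values are hypothetical and are not drawn from the tables.

```python
def samples_exceeding_double(signals):
    """Return indices of samples whose signal is more than twice the mean of the others."""
    hits = []
    for i, value in enumerate(signals):
        others = signals[:i] + signals[i + 1:]
        if value > 2 * (sum(others) / len(others)):
            hits.append(i)
    return hits


# Made-up four-sample signals for two hypothetical genes.
genes = {
    "geneX": [100.0, 110.0, 90.0, 450.0],
    "geneY": [100.0, 110.0, 90.0, 120.0],
}
selected = {}
for gene, signals in genes.items():
    hits = samples_exceeding_double(signals)
    if hits:
        selected[gene] = hits
print(selected)
```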


DETAILED DESCRIPTION

The inventors have discovered novel isolated cell cycle genes and polynucleotides useful for identifying the multigenic factors that contribute to a phenotype and for manipulating gene expression to affect a plant phenotype. These genes, which are derived from plants of commercially important forestry genera, pine and eucalyptus, are involved in the plant cell cycle and are, at least in part, responsible for expression of phenotypic characteristics important in commercial wood, such as stiffness, strength, density, fiber dimensions, coarseness, cellulose and lignin content, and extractives content. Generally speaking, the genes and polynucleotides encode a protein which can be a cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, or WD40 repeat protein, or a catalytic domain thereof, or a polypeptide having the same function, and the invention further includes such proteins and polypeptides.


The methods of the present invention for selecting cell cycle gene sequences to target for manipulation will permit better design and control of transgenic plants with more highly engineered phenotypes. The ability to control plant architecture and agronomically important traits in commercially important forestry species will be improved by the information obtained from the methods, such as which genes affect which phenotypes, which genes affect entry into which stage of the cell cycle, which genes are active in which stage of plant development, and which genes are expressed in which tissue at a given point in the cell cycle or plant development.


Unless indicated otherwise, all technical and scientific terms are used herein in a manner that conforms to common technical usage. Generally, the nomenclature of this description and the described laboratory procedures, including cell culture, molecular genetics, and nucleic acid chemistry and hybridization, are well known and commonly employed in the art. Standard techniques are used for recombinant nucleic acid methods, oligonucleotide synthesis, cell culture, tissue culture, transformation, transfection, transduction, analytical chemistry, organic synthetic chemistry, chemical syntheses, chemical analysis, and pharmaceutical formulation and delivery. Generally, enzymatic reactions and purification and/or isolation steps are performed according to the manufacturers' specifications. Absent an indication to the contrary, the techniques and procedures in question are performed according to conventional methodology disclosed, for example, in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2d ed. (Cold Spring Harbor Laboratory Press, 1989), and CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley & Sons, 1989). Specific scientific methods relevant to the present invention are discussed in more detail below. However, this discussion is provided as an example only, and does not limit the manner in which the methods of the invention can be carried out.


A. Plant Cell Cycle Genes and Proteins


1. Cell Cycle Genes, Polynucleotide and Polypeptide Sequences


One aspect of the present invention relates to novel plant cell cycle genes and polypeptides encoded by such genes. As used herein, the term “plant cell cycle genes” refers to genes encoding proteins that function during the plant cell cycle, and the term “plant cell cycle proteins” refers to proteins that function during the plant cell cycle. There are several known families of plant cell cycle proteins, including cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, and WD40 repeat protein. Although there is significant sequence homology within each gene and protein family, each member of each family can display different biochemical properties and altering the expression of at least one of these genes can result in a different plant phenotype.


The present invention provides novel plant cell cycle genes and polynucleotides and novel cell cycle proteins and polypeptides. In accordance with one embodiment of the invention, the novel plant cell cycle genes are the same as those expressed in a wild-type plant of a species of Pinus or Eucalyptus. Exemplary novel plant cell cycle gene sequences of the invention are set forth in Tables 9 and 10, which depict Eucalyptus grandis sequences and Pinus radiata sequences, respectively. Corresponding gene products, i.e., oligonucleotides and polypeptides, are also listed in Tables 14, 15, and 16. The Sequence Listing in APPENDIX 1 provides the sequences of these aspects of the invention.


The sequences of the invention have cell cycle activity and encode proteins that are active in the cell cycle, such as proteins of the cell cycle families discussed above. As discussed in more detail below, manipulation of the expression of the cell cycle genes and polynucleotides, or manipulation of the activity of the encoded proteins and polypeptides, can result in a transgenic plant with a desired phenotype that differs from the phenotype of a wild-type plant of the same species.


Throughout this description, reference is made to cell cycle gene products. As used herein, a “cell cycle gene product” is a product encoded by a cell cycle gene, and includes both nucleotide products, such as RNA, and amino acid products, such as proteins and polypeptides. Examples of specific cell cycle genes of the invention include SEQ ID NOs: 1-237. Examples of specific cell cycle gene products of the invention include products encoded by any one of SEQ ID NOs: 1-237. Reference also is made herein to cell cycle proteins and cell cycle polypeptides. Examples of specific cell cycle proteins and polypeptides of the invention include polypeptides encoded by any of SEQ ID NOs: 1-237 or polypeptides comprising the amino acid sequence of any of SEQ ID NOs: 261-497. One aspect of the invention is directed to a subset of these cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14-58, 60-62, 64-70, 72-75, 77-83, 85-86, 88-91, 93-119, 121-130, 132-148, 150-156, 158-191, 193-207, 209-218, 220-221, 223-231, and 233-237, their respective conservative variants (as that term is defined below), and the nucleotide and amino acid products encoded thereby. Another aspect of the invention is directed to a subset of the cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-76, 78-103, 106, 108-113, 116-121, 124-125, 128-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 208-213, and 215-234, their respective conservative variants, and the nucleotide and amino acid products encoded thereby.
A further aspect of the invention is directed to a subset of the cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-58, 60-62, 64-70, 72-75, 78-83, 85-86, 88-91, 93-103, 106, 108-113, 116-119, 121, 124-125, 128-130, 132-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 209-213, 215-218, 220-221, 223-231, and 233-234, their respective conservative variants, and the nucleotide and amino acid products encoded thereby.
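
By way of illustration only, range lists of the kind recited above can be expanded into explicit sets of SEQ ID numbers so that one subset can be compared against another. The following is a minimal sketch; the function name parse_seq_ids is hypothetical and is not part of the disclosure, and the example string is only the opening portion of the first subset.

```python
import re

def parse_seq_ids(text: str) -> set:
    """Expand a list such as '1-12, 14-58, and 60-62' into a set of integers."""
    ids = set()
    # Each match is either a single number or an inclusive 'start-end' range.
    for start, end in re.findall(r"(\d+)(?:-(\d+))?", text):
        low = int(start)
        high = int(end) if end else low
        ids.update(range(low, high + 1))
    return ids

# Opening portion of the first subset recited above (illustrative fragment only).
fragment = parse_seq_ids("1-12, 14-58, and 60-62")
print(13 in fragment, 59 in fragment, len(fragment))
```

Once parsed, ordinary set operations (for example, `a <= b` or `a - b`) can be used to check whether one recited subset falls within another.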


The present invention also includes sequences that are complements, reverse sequences, or reverse complements to the nucleotide sequences disclosed herein.
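
For clarity, the three relationships named above can be sketched as follows. This is an illustrative example only; the helper names are hypothetical, and the six-base string is an arbitrary placeholder rather than a sequence of the invention.

```python
# Illustrative helpers; names are hypothetical and not taken from the disclosure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)

def reverse(seq: str) -> str:
    """The same bases read in the opposite direction."""
    return seq[::-1]

def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (the opposite strand, 5' to 3')."""
    return complement(seq)[::-1]

example = "ATGGCC"  # arbitrary placeholder sequence
print(complement(example))          # TACCGG
print(reverse(example))             # CCGGTA
print(reverse_complement(example))  # GGCCAT
```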


The present invention also includes conservative variants of the sequences disclosed herein. The term “variant,” as used herein, refers to a nucleotide or amino acid sequence that differs in one or more nucleotide bases or amino acid residues from the reference sequence of which it is a variant.


Thus, in one aspect, the invention includes conservative variant polynucleotides. As used herein, the term “conservative variant polynucleotide” refers to a polynucleotide that hybridizes under stringent conditions to an oligonucleotide probe that, under comparable conditions, binds to the reference gene the conservative variant is a variant of. Thus, for example, a conservative variant of SEQ ID NO: 1 hybridizes under stringent conditions to an oligonucleotide probe that, under comparable conditions, binds to SEQ ID NO: 1. One aspect of the invention provides conservative variant polynucleotides that exhibit at least about 75% sequence identity to their respective reference sequences.


“Sequence identity” has an art-recognized meaning and can be calculated using published techniques. See COMPUTATIONAL MOLECULAR BIOLOGY, Lesk, ed. (Oxford University Press, 1988), BIOCOMPUTING: INFORMATICS AND GENOME PROJECTS, Smith, ed. (Academic Press, 1993), COMPUTER ANALYSIS OF SEQUENCE DATA, PART I, Griffin & Griffin, eds. (Humana Press, 1994), SEQUENCE ANALYSIS IN MOLECULAR BIOLOGY, von Heijne, ed. (Academic Press, 1987), SEQUENCE ANALYSIS PRIMER, Gribskov & Devereux, eds. (Macmillan Stockton Press, 1991), and Carillo & Lipman, SIAM J. Applied Math. 48: 1073 (1988). Methods commonly employed to determine identity or similarity between two sequences include but are not limited to those disclosed in GUIDE TO HUGE COMPUTERS, Bishop, ed. (Academic Press, 1994) and Carillo & Lipman, supra. Methods to determine identity and similarity are codified in computer programs. Preferred computer program methods to determine identity and similarity between two sequences include but are not limited to the GCG program package (Devereux et al., Nucleic Acids Research 12: 387 (1984)), BLASTP, BLASTN, FASTA (Altschul et al., J. Mol. Biol. 215: 403 (1990)), and FASTDB (Brutlag et al., Comp. App. Biosci. 6: 237 (1990)).


The invention includes conservative variant polynucleotides having a sequence identity that is greater than or equal to 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76%, 75%, 74%, 73%, 72%, 71%, 70%, 69%, 68%, 67%, 66%, 65%, 64%, 63%, 62%, 61%, or 60% to any one of SEQ ID NOs: 1 to 237. In such variants, differences between the variant and the reference sequence can occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence.
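
As a simple illustration of how a percent identity figure might be checked against one of the thresholds listed above, the following sketch scores two sequences that have already been aligned (gaps shown as "-"). It assumes one common convention, namely identical columns divided by total alignment length; the programs cited above may use other denominators. The function names and example strings are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_var: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the denominator is the full alignment length, gap columns included.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_ref.upper(), aligned_var.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_ref)

def meets_threshold(aligned_ref: str, aligned_var: str, threshold: float = 75.0) -> bool:
    """True when the identity is greater than or equal to the stated threshold."""
    return percent_identity(aligned_ref, aligned_var) >= threshold

# Arbitrary placeholder alignment: 8 identical columns out of 10.
print(percent_identity("ATGGCCATTG", "ATGGCTAT-G"))
```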


Additional conservative variant polynucleotides contemplated by and encompassed within the present invention include polynucleotides comprising sequences that differ from the polynucleotide sequences of SEQ ID NOs: 1-237, or complements, reverse complements or reverse sequences thereof, as a result of deletions and/or insertions totaling less than 10% of the total sequence length.
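
The 10% limit on insertions and deletions can be illustrated with a short sketch operating on a pre-aligned pair of sequences. It assumes that "total sequence length" means the ungapped length of the reference sequence, which is an interpretation adopted here for illustration; the function name and example strings are hypothetical.

```python
def indel_fraction(aligned_ref: str, aligned_var: str) -> float:
    """Inserted plus deleted positions, divided by the ungapped reference length."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    ref_length = sum(1 for base in aligned_ref if base != "-")
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    # A column counts as an indel when exactly one of the two sequences has a gap.
    indels = sum(
        1 for a, b in zip(aligned_ref, aligned_var) if (a == "-") != (b == "-")
    )
    return indels / ref_length

# Placeholder alignment: one deleted base in a 20-base reference (5%).
fraction = indel_fraction("ATGGCCATTGACCTGAAGCT", "ATGGCCATTG-CCTGAAGCT")
print(fraction < 0.10)
```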


The invention also includes conservative variant polynucleotides that, in addition to sharing a high degree of similarity in their primary structure (sequence) to SEQ ID NOs: 1 to 237, have at least one of the following features: (i) they contain an open reading frame or partial open reading frame encoding a polypeptide having substantially the same functional properties in the cell cycle as the polypeptide encoded by the reference polynucleotide, or (ii) they have nucleotide domains or encoded protein domains in common. The invention includes conservative variants of SEQ ID NOs: 1-237 that encode proteins having the enzyme or biological activity or binding properties of the protein encoded by the reference polynucleotide. Such conservative variants are functional variants, in that they have the enzymatic or binding activity of the protein encoded by the reference polynucleotide.


In accordance with the invention, polynucleotide variants can include a “shuffled gene” such as those described in, e.g., U.S. Pat. Nos. 6,500,639, 6,500,617, 6,436,675, 6,379,964, 6,352,859, 6,335,198, 6,326,204, and 6,287,862. A variant of a nucleotide sequence of the present invention also can be a polynucleotide modified as disclosed in U.S. Pat. No. 6,132,970, which is incorporated herein by reference.


In accordance with one embodiment, the invention provides a polynucleotide that encodes a cell cycle protein from one of the following families: cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, or WD40 repeat protein. SEQ ID NOs: 1-237 provide examples of such polynucleotides.


In accordance with another embodiment, a polynucleotide of the invention encodes the catalytic or protein binding domain of a polypeptide encoded by any of SEQ ID NOs: 1-237 or of a polypeptide comprising any of SEQ ID NOs: 261-497. The catalytic and protein binding domains of the cell cycle proteins of the invention are known in the art. The conserved sequences of these proteins are shown in Entries 1-195 as underlined, bold, and/or italicized text.


The invention also encompasses as conservative variants polynucleotides that differ from the sequences discussed above but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide which is the same as that encoded by a polynucleotide of the present invention. The invention also includes as conservative variants polynucleotides comprising sequences that differ from the polynucleotide sequences discussed above as a result of substitutions that do not affect the amino acid sequence of the encoded polypeptide sequence, or that result in conservative substitutions in the encoded polypeptide sequence.
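
The effect of codon degeneracy can be illustrated by translating two different coding sequences and comparing the resulting polypeptides. The sketch below assumes the standard genetic code and in-frame coding sequences; the function names and the nine-base example sequences are hypothetical placeholders, not sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(_BASES, repeat=3), _AMINO)
}

def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence ('*' marks a stop codon)."""
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))

def encode_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True when two polynucleotides differ only by synonymous codon choices."""
    return translate(seq_a) == translate(seq_b)

# GCT/GCA both encode Ala and CTG/CTT both encode Leu, so these two are synonymous.
print(encode_same_polypeptide("ATGGCTCTG", "ATGGCACTT"))
```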


The present invention also includes an isolated polypeptide encoded by a polynucleotide comprising any of SEQ ID NOs: 1-237 or any of the conservative variants thereof discussed above. The invention also includes polypeptides comprising SEQ ID NOs: 261-497 and 495-497 and conservative variants of these polypeptides. Another aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274-318, 320-322, 324-330, 332-335, 337-343, 345-346, 348-351, 353-379, 381-390, 392-408, 410-416, 418-451, 453-467, 469-478, 480-481, 483-491, and 493-494 and conservative variants thereof. A further aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274, 276-286, 289, 290-297, 300-301, 303-345, 347-363, 366, 368-373, 376-381, 384-385, 388-407, 410-412, 414-415, 420-422, 424-432, 434, 437-443, 445-451, 453-457, 460-464, 468-473, and 475-494 and conservative variants thereof. Another aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274, 276-286, 290-297, 300-301, 303-318, 320-322, 324-330, 332-335, 337-343, 345, 348-351, 353-363, 366, 368-373, 376-381, 384-385, 388-390, 392-407, 410-412, 414-415, 421-422, 424-432, 434, 437-443, 445-451, 453-457, 460-464, 469-473, 475-478, 480-481, 483-491, and 493-494 and conservative variants thereof.


In accordance with the invention, a variant polypeptide or protein refers to an amino acid sequence that is altered by the addition, deletion or substitution of one or more amino acids.


The invention includes conservative variant polypeptides. As used herein, the term “conservative variant polypeptide” refers to a polypeptide that has structural, chemical or biological properties similar to those of the protein of which it is a conservative variant. Guidance in determining which amino acid residues can be substituted, inserted, or deleted can be found using computer programs well known in the art such as Vector NTI Suite (InforMax, MD) software. In one embodiment, the invention provides conservative variant polypeptides that exhibit at least about 75% sequence identity to their respective reference sequences.


A conservative variant protein includes an “isoform” or “analog” of the polypeptide. Polypeptide isoforms and analogs refer to proteins having the same physical and physiological properties and the same biological function, but whose amino acid sequences differ by one or more amino acids or whose sequence includes a non-natural amino acid.


Polypeptides comprising sequences that differ from the polypeptide sequences of SEQ ID NOs: 261-497 as a result of amino acid substitutions, insertions, and/or deletions totaling less than 10% of the total sequence length are contemplated by and encompassed within the present invention.


One aspect of the invention provides conservative variant polypeptides that have the same function in the cell cycle as the proteins of which they are variants, as determined by one or more appropriate assays, such as those described below. The invention includes variant polypeptides that function as cell cycle proteins, such as those having the biological activity of cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, and WD40 repeat protein, and are thus capable of modulating the cell cycle in a plant. As discussed above, the invention includes variant polynucleotides that encode polypeptides that function as cell cycle proteins.


The activities and physical properties of cell cycle proteins can be examined using any method known in the art. The following examples of assay methods are not exhaustive and are included to provide some guidance in examining the activity and distinguishing protein characteristics of cell cycle protein variants.


CDK activity can be assessed using roscovitine as described in Yamaguchi et al., Proc. Natl. Acad. Sci. U.S.A. 100:8019 (2003). CDK histone kinase activity can be assayed using autoradiography to detect histone H1 phosphorylation by CDK as described in Joubés et al., Plant Physiol. 121:857 (1999).


CKI activity can be assayed using a variation of the method described in Zhou et al., Planta. 6:604 (2003). The modified method can employ co-transformation or subsequent transformations to identify the interaction of CKI and cyclins in vivo. For example, in the first transformation pine tissue can be transformed using the method described in U.S. Patent Application Publication No. 2002/0100083 using geneticin selection to obtain transgenic plants possessing cycD3 and cdc2a homologues. The second transformation can be performed using alpha-methyltryptophan as a selectable marker to obtain transformants having an ICK1 homologue as described in U.S. Provisional Application No. 60/476,189. Tissue capable of growing on both geneticin and alpha-methyltryptophan contains the ICK1 homologue and the cycD3 and cdc2a homologues. The CKI activity is determined by comparing the phenotype of transformants having only the cycD3 and cdc2a homologues with the phenotype of transformants having the ICK1 homologue together with the cycD3 and cdc2a homologues.
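
The logic of the two-step selection described above can be summarized in a short sketch. It is illustrative only: the function name is hypothetical, and it simply encodes the inference stated in the text that growth on geneticin indicates the cycD3 and cdc2a homologues and growth on alpha-methyltryptophan indicates the ICK1 homologue.

```python
def inferred_transgenes(grows_on_geneticin: bool, grows_on_amt: bool) -> set:
    """Infer which homologues a tissue carries from its growth on each selective agent.

    'amt' abbreviates alpha-methyltryptophan; the abbreviation is used here for brevity.
    """
    genes = set()
    if grows_on_geneticin:
        genes.update({"cycD3", "cdc2a"})  # selected in the first transformation
    if grows_on_amt:
        genes.add("ICK1")                 # selected in the second transformation
    return genes

# Comparison groups for the phenotype-based determination of CKI activity.
control_group = inferred_transgenes(grows_on_geneticin=True, grows_on_amt=False)
test_group = inferred_transgenes(grows_on_geneticin=True, grows_on_amt=True)
print(sorted(test_group - control_group))  # ['ICK1']
```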


Histone deacetylase activity can be assessed by complementation of the Arabidopsis mutants described in Tian et al., Genetics 165:399 (2003). Histone acetyltransferase activity can be assayed using anacardic acid as described in Balasubramanyam et al., J. Biol. Chem. 278:19134 (2003). Histone acetyltransferase also can be assayed using trichostatin A-treated plant lines as is described in Bhat et al., Plant J. 33:455 (2003). The plant lines described in Bhat et al., supra, also can be used to assay retinoblastoma-related proteins using the co-precipitation method described in Rossi et al., Plant Mol. Biol. 51:401 (2003).


Peptidyl-prolyl isomerase can be assayed as described in Edvardsson et al., FEBS Lett. 542:137 (2003). WD40 proteins can be evaluated based on the possession of the WD40 motif as well as their ability to interact with cdc2. WEE-1 can be assayed using any kinase activity assay known in the art.


2. Methods of Using Cell Cycle Genes, Polynucleotide and Polypeptide Sequences


The present invention provides methods of using plant cell cycle genes and conservative variants thereof. The invention includes methods and constructs for altering expression of plant cell cycle genes and/or gene products for purposes including, but not limited to, (i) investigating function during the cell cycle and ultimate effect on plant phenotype and (ii) effecting a change in plant phenotype. For example, the invention includes methods and tools for modifying wood quality, fiber development, cell wall polysaccharide content, fruit ripening, and plant growth and yield by altering expression of one or more plant cell cycle genes.


The invention comprises methods of altering the expression of any of the cell cycle genes and variants discussed above. Thus, for example, the invention comprises altering expression of a cell cycle gene present in the genome of a wild-type plant of a species of Eucalyptus or Pinus. In one embodiment, the cell cycle gene comprises a nucleotide sequence selected from SEQ ID NOs: 1-237, from the subset thereof comprising SEQ ID NOs: 1-12, 14-58, 60-62, 64-70, 72-75, 77-83, 85-86, 88-91, 93-119, 121-130, 132-148, 150-156, 158-191, 193-207, 209-218, 220-221, 223-231, and 233-237, from the subset thereof comprising SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-76, 78-103, 106, 108-113, 116-121, 124-125, 128-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 208-213, and 215-234, from the subset thereof comprising SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-58, 60-62, 64-70, 72-75, 78-83, 85-86, 88-91, 93-103, 106, 108-113, 116-119, 121, 124-125, 128-130, 132-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 209-213, 215-218, 220-221, 223-231, and 233-234, or the conservative variants thereof, as discussed above.


Techniques that can be employed in accordance with the present invention to alter gene expression include, but are not limited to: (i) over-expressing a gene product; (ii) disrupting a gene's transcript, such as disrupting a gene's mRNA transcript; (iii) disrupting the function of a polypeptide encoded by a gene; or (iv) disrupting the gene itself. Over-expression of a gene product, the use of antisense RNAs, ribozymes, and the use of double-stranded RNA interference (dsRNAi) are valuable techniques for discovering the functional effects of a gene and for generating plants with a phenotype that is different from a wild-type plant of the same species.


Over-expression of a target gene often is accomplished by cloning the gene or cDNA into an expression vector and introducing the vector into recipient cells. Alternatively, over-expression can be accomplished by introducing exogenous promoters into cells to drive expression of genes residing in the genome. The effect of over-expression of a given gene on cell function, biochemical and/or physiological properties can then be evaluated by comparing plants transformed to over-express the gene to plants that have not been transformed to over-express the gene.


Antisense RNA, ribozyme, and dsRNAi technologies typically target RNA transcripts of genes, usually mRNA. Antisense RNA technology involves expressing in, or introducing into, a cell an RNA molecule (or RNA derivative) that is complementary to, or antisense to, sequences found in a particular mRNA in a cell. By associating with the mRNA, the antisense RNA can inhibit translation of the encoded gene product. The use of antisense technology to reduce or inhibit the expression of specific plant genes has been described, for example, in European Patent Publication No. 271988; Smith et al., Nature 334:724-726 (1988); and Smith et al., Plant Mol. Biol. 14:369-379 (1990).


A ribozyme is an RNA that has both a catalytic domain and a sequence that is complementary to a particular mRNA. The ribozyme functions by associating with the mRNA (through the complementary domain of the ribozyme) and then cleaving (degrading) the message using the catalytic domain.


RNA interference (RNAi) involves a post-transcriptional gene silencing (PTGS) regulatory process, in which the steady-state level of a specific mRNA is reduced by sequence-specific degradation of the transcribed, usually fully processed mRNA without an alteration in the rate of de novo transcription of the target gene itself. The RNAi technique is discussed, for example, in Elbashir et al., Methods Enzymol. 26: 199 (2002); McManus & Sharp, Nature Rev. Genetics 3: 737 (2002); PCT application WO 01/75164; Martinez et al., Cell 110: 563 (2002); Elbashir et al., supra; Lagos-Quintana et al., Curr. Biol. 12: 735 (2002); Tuschl et al., Nat. Biotechnol. 20:446 (2002); Tuschl, Chembiochem. 2: 239 (2001); Harborth et al., J. Cell Sci. 114: 4557 (2001); et al., EMBO J. 20:6877 (2001); Lagos-Quintana et al., Science. 294: 8538 (2001); Hutvagner et al., loc cit, 834; Elbashir et al., Nature. 411: 494 (2001).


The present invention provides a DNA construct comprising at least one polynucleotide of SEQ ID NOs: 1-235 or conservative variants thereof, such as the conservative variants discussed above. Any method known in the art can be used to generate the DNA constructs of the present invention. See, e.g. Sambrook et al., supra.


The invention includes DNA constructs that optionally comprise a promoter. Any suitable promoter known in the art can be used. A promoter is a nucleic acid, preferably DNA, that binds RNA polymerase and/or other transcription regulatory elements. As with any promoter, the promoters of the invention facilitate or control the transcription of DNA or RNA to generate an mRNA molecule from a nucleic acid molecule that is operably linked to the promoter. The RNA can encode a protein or polypeptide or can encode an antisense RNA molecule or a molecule useful in RNAi. Promoters useful in the invention include constitutive promoters, inducible promoters, temporally regulated promoters and tissue-preferred promoters.


Examples of useful constitutive plant promoters include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (Odell et al., Nature 313:810 (1985)); the nopaline synthase promoter (An et al., Plant Physiol. 88:547 (1988)); and the octopine synthase promoter (Fromm et al., Plant Cell 1: 977 (1989)). It should be noted that, although the CaMV 35S promoter is commonly referred to as a constitutive promoter, some tissue preference can be seen. The use of CaMV 35S is envisioned by the present invention, regardless of any tissue preference which may be exhibited during use in the present invention.


Inducible promoters regulate gene expression in response to environmental, hormonal, or chemical signals. Examples of hormone inducible promoters include auxin-inducible promoters (Baumann et al., Plant Cell 11:323-334 (1999)), cytokinin-inducible promoters (Guevara-Garcia, Plant Mol. Biol. 38:743-753 (1998)), and gibberellin-responsive promoters (Shi et al., Plant Mol. Biol. 38:1053-1060 (1998)). Additionally, promoters responsive to heat, light, wounding, pathogen resistance, and chemicals such as methyl jasmonate or salicylic acid, can be used in the DNA constructs and methods of the present invention.


Tissue-preferred promoters allow for preferred expression of polynucleotides of the invention in certain plant tissue. Tissue-preferred promoters are also useful for directing the expression of antisense RNA or siRNA in certain plant tissues, which can be useful for inhibiting or completely blocking the expression of targeted genes as discussed above. As used herein, vascular plant tissue refers to xylem, phloem or vascular cambium tissue. Other preferred tissue includes apical meristem, root, seed, and flower. In one aspect, the tissue-preferred promoters of the invention are either “xylem-preferred,” “cambium-preferred” or “phloem-preferred,” and preferentially direct expression of an operably linked nucleic acid sequence in the xylem, cambium or phloem, respectively. In another aspect, the DNA constructs of the invention comprise promoters that are tissue-specific for xylem, cambium or phloem, wherein the promoters are only active in the xylem, cambium or phloem.


A vascular-preferred promoter is preferentially active in any of the xylem, phloem or cambium tissues, or in at least two of the three tissue types. A vascular-specific promoter is specifically active in any of the xylem, phloem or cambium, or in at least two of the three. In other words, the promoters are only active in the xylem, cambium or phloem tissue of plants. Note, however, that because of solute transport in plants, a product that is specifically or preferentially expressed in a tissue may be found elsewhere in the plant after expression has occurred.


In another embodiment, the promoter is under temporal regulation, wherein the ability of the promoter to initiate expression is linked to factors such as the stage of the cell cycle or the stage of plant development. For example, the promoter of a cyclin D2 gene may be expressed only during the G1 and early S-phase, and the promoters of particular cyclin genes may be expressed only within the primary vascular poles of the developing seedling.


Additionally, the promoters of particular cell cycle genes may be expressed only within the cambium in developing secondary vasculature. Within the cambium, particular cell cycle gene promoters may be expressed exclusively in the stem or in the root. Moreover, the cell cycle promoters may be expressed only in the spring (for early wood formation) or only in the summer.


A promoter may be operably linked to the polynucleotide. As used in this context, operably linked refers to linking a polynucleotide encoding a structural gene to a promoter such that the promoter controls transcription of the structural gene. If the desired polynucleotide comprises a sequence encoding a protein product, the coding region can be operably linked to regulatory elements, such as to a promoter and a terminator, that bring about expression of an associated messenger RNA transcript and/or a protein product encoded by the desired polynucleotide. In this instance, the polynucleotide is operably linked in the 5′- to 3′-orientation to a promoter and, optionally, a terminator sequence.


Alternatively, the invention provides DNA constructs comprising a polynucleotide in an “antisense” orientation, the transcription of which produces nucleic acids that can form secondary structures that affect expression of an endogenous cell cycle gene in the plant cell. In another variation, the DNA construct may comprise a polynucleotide that yields a double-stranded RNA product upon transcription that initiates RNA interference of a cell cycle gene with which the polynucleotide is associated. A polynucleotide of the present invention can be positioned within a T-DNA, such that the left and right T-DNA border sequences flank or are on either side of the polynucleotide.
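
The orientations discussed above can be sketched schematically as follows. This is an illustration under stated assumptions, not a description of any particular construct of the invention: the inverted-repeat (sense, spacer, antisense) layout is one common way of obtaining a self-complementary transcript, the spacer and element labels are hypothetical placeholders, and all function names are invented for this example.

```python
# Hypothetical helpers; element labels and sequences are placeholders only.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]

def sense_insert(target: str) -> str:
    """Coding orientation, placed 5' to 3' downstream of the promoter."""
    return target.upper()

def antisense_insert(target: str) -> str:
    """Opposite orientation, so the transcript is complementary to the target mRNA."""
    return reverse_complement(target)

def hairpin_insert(target: str, spacer: str) -> str:
    """Inverted repeat: the two arms of the transcript can pair into double-stranded RNA."""
    return target.upper() + spacer.upper() + reverse_complement(target)

def t_dna_layout(cassettes: list) -> list:
    """Order of elements when the cassettes are flanked by the T-DNA borders."""
    return ["left_border", *cassettes, "right_border"]

print(antisense_insert("ATGC"))        # GCAT
print(hairpin_insert("ATGC", "NNNN"))  # ATGCNNNNGCAT
print(t_dna_layout(["promoter + insert + terminator"]))
```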


It should be understood that the invention includes DNA constructs comprising one or more of any of the polynucleotides discussed above. Thus, for example, a construct may comprise a T-DNA comprising one, two, three, four, five, six, seven, eight, nine, ten, or more polynucleotides.


The invention also includes DNA constructs comprising a promoter that includes one or more regulatory elements. Alternatively, the invention includes DNA constructs comprising a regulatory element that is separate from a promoter. Regulatory elements confer a number of important characteristics upon a promoter region. Some elements bind transcription factors that enhance the rate of transcription of the operably linked nucleic acid. Other elements bind repressors that inhibit transcription activity. The effect of transcription factors on promoter activity can determine whether the promoter activity is high or low, i.e. whether the promoter is “strong” or “weak.”


A DNA construct of the invention can include a nucleotide sequence that serves as a selectable marker useful in identifying and selecting transformed plant cells or plants. Examples of such markers include, but are not limited to, a neomycin phosphotransferase (nptII) gene (Potrykus et al., Mol. Gen. Genet. 199:183-188 (1985)), which confers kanamycin resistance. Cells expressing the nptII gene can be selected using an appropriate antibiotic such as kanamycin or G418. Other commonly used selectable markers include a mutant EPSP synthase gene (Hinchee et al., Bio/Technology 6:915-922 (1988)), which confers glyphosate resistance; and a mutant acetolactate synthase gene (ALS), which confers imidazolinone or sulphonylurea resistance (European Patent Application 154,204, 1985).


The present invention also includes vectors comprising the DNA constructs discussed above. The vectors can include an origin of replication (replicons) for a particular host cell. Various prokaryotic replicons are known to those skilled in the art, and function to direct autonomous replication and maintenance of a recombinant molecule in a prokaryotic host cell.


In one embodiment, the present invention utilizes a pWVR8 vector as described in U.S. Application No. 60/476,222, filed Jun. 6, 2003, or pART27 as described in Gleave, Plant Mol. Biol. 20:1203-27 (1992).


The invention also provides host cells which are transformed with the DNA constructs of the invention. As used herein, a host cell refers to the cell in which a polynucleotide of the invention is expressed. Accordingly, a host cell can be an individual cell, a cell culture or cells that are part of an organism. The host cell can also be a portion of an embryo, endosperm, sperm or egg cell, or a fertilized egg. In one embodiment, the host cell is a plant cell.


The present invention further provides transgenic plants comprising the DNA constructs of the invention. The invention includes transgenic plants that are angiosperms or gymnosperms. The DNA constructs of the present invention can be used to transform a variety of plants, including monocotyledonous plants (e.g., grasses, corn, grains, oat, wheat and barley), dicotyledonous plants (e.g., Arabidopsis, tobacco, legumes, alfalfa, oaks, eucalyptus, maple), and gymnosperms, e.g., Scots pine (see Aronen, Finnish Forest Res. Papers, Vol. 595, 1996), white spruce (Ellis et al., Biotechnology 11:84-89, 1993), and larch (Huang et al., In Vitro Cell 27:201-207, 1991).


The plants also include turfgrass, wheat, maize, rice, sugar beet, potato, tomato, lettuce, carrot, strawberry, cassava, sweet potato, geranium, soybean, and various types of woody plants. Woody plants include trees such as palm, oak, pine, maple, fir, apple, fig, plum and acacia. Woody plants also include rose and grape vines.


In one embodiment, the DNA constructs of the invention are used to transform woody plants, i.e., trees or shrubs whose stems live for a number of years and increase in diameter each year by the addition of woody tissue. The invention includes methods of transforming plants including eucalyptus and pine species of significance in the commercial forestry industry such as plants selected from the group consisting of Eucalyptus grandis and its hybrids, and Pinus taeda, as well as the transformed plants and wood and wood pulp derived therefrom. Other examples of suitable plants include those selected from the group consisting of Pinus banksiana, Pinus brutia, Pinus caribaea, Pinus clausa, Pinus contorta, Pinus coulteri, Pinus echinata, Pinus eldarica, Pinus elliottii, Pinus jeffreyi, Pinus lambertiana, Pinus massoniana, Pinus monticola, Pinus nigra, Pinus palustris, Pinus pinaster, Pinus ponderosa, Pinus radiata, Pinus resinosa, Pinus rigida, Pinus serotina, Pinus strobus, Pinus sylvestris, Pinus taeda, Pinus virginiana, Abies amabilis, Abies balsamea, Abies concolor, Abies grandis, Abies lasiocarpa, Abies magnifica, Abies procera, Chamaecyparis lawsoniana, Chamaecyparis nootkatensis, Chamaecyparis thyoides, Juniperus virginiana, Larix decidua, Larix laricina, Larix leptolepis, Larix occidentalis, Larix sibirica, Libocedrus decurrens, Picea abies, Picea engelmannii, Picea glauca, Picea mariana, Picea pungens, Picea rubens, Picea sitchensis, Pseudotsuga menziesii, Sequoia gigantea, Sequoia sempervirens, Taxodium distichum, Tsuga canadensis, Tsuga heterophylla, Tsuga mertensiana, Thuja occidentalis, Thuja plicata, Eucalyptus alba, Eucalyptus bancroftii, Eucalyptus botryoides, Eucalyptus bridgesiana, Eucalyptus calophylla, Eucalyptus camaldulensis, Eucalyptus citriodora, Eucalyptus cladocalyx, Eucalyptus coccifera, Eucalyptus curtisii, Eucalyptus dalrympleana, Eucalyptus deglupta, Eucalyptus delegatensis, Eucalyptus diversicolor, Eucalyptus dunnii, Eucalyptus ficifolia, Eucalyptus globulus, Eucalyptus gomphocephala, Eucalyptus gunnii, Eucalyptus henryi, Eucalyptus laevopinea, Eucalyptus macarthurii, Eucalyptus macrorhyncha, Eucalyptus maculata, Eucalyptus marginata, Eucalyptus megacarpa, Eucalyptus melliodora, Eucalyptus nicholii, Eucalyptus nitens, Eucalyptus nova-anglica, Eucalyptus obliqua, Eucalyptus occidentalis, Eucalyptus obtusiflora, Eucalyptus oreades, Eucalyptus pauciflora, Eucalyptus polybractea, Eucalyptus regnans, Eucalyptus resinifera, Eucalyptus robusta, Eucalyptus rudis, Eucalyptus saligna, Eucalyptus sideroxylon, Eucalyptus stuartiana, Eucalyptus tereticornis, Eucalyptus torelliana, Eucalyptus urnigera, Eucalyptus urophylla, Eucalyptus viminalis, Eucalyptus viridis, Eucalyptus wandoo, and Eucalyptus youmanii.


As used herein, the term “plant” also is intended to include the fruit, seeds, flower, strobilus, etc. of the plant. A transformed plant of the current invention can be a direct transfectant, meaning that the DNA construct was introduced directly into the plant, such as through Agrobacterium, or the plant can be the progeny of a transfected plant. The second or subsequent generation plant can be produced by sexual reproduction, i.e., fertilization. Furthermore, the plant can be a gametophyte (haploid stage) or a sporophyte (diploid stage).


As used herein, the term “plant tissue” encompasses any portion of a plant, including plant cells. Plant cells include suspension cultures, callus, embryos, meristematic regions, leaves, roots, shoots, gametophytes, sporophytes, pollen, seeds and microspores. Plant tissues can be grown in liquid or solid culture, or in soil or suitable media in pots, greenhouses or fields. As used herein, “plant tissue” also refers to a clone of a plant, seed, progeny, or propagule, whether generated sexually or asexually, and descendants of any of these, such as cuttings or seeds.


In accordance with one aspect of the invention, a transgenic plant that has been transformed with a DNA construct of the invention has a phenotype that is different from a plant that has not been transformed with the DNA construct.


As used herein, “phenotype” refers to a distinguishing feature or characteristic of a plant which can be altered according to the present invention by integrating one or more DNA constructs of the invention into the genome of at least one plant cell of a plant. The DNA construct can confer a change in the phenotype of a transformed plant by modifying any one or more of a number of genetic, molecular, biochemical, physiological, morphological, or agronomic characteristics or properties of the transformed plant cell or plant as a whole.


In one embodiment, transformation of a plant with a DNA construct of the present invention can yield a phenotype including, but not limited to, any one or more of increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides.


In another embodiment, the affected phenotype includes one or more of the following traits: propensity to form reaction wood, a reduced period of juvenility, an increased period of juvenility, self-abscising branches, accelerated reproductive development or delayed reproductive development, as compared to a plant of the same species that has not been transformed with the DNA construct.


In a further embodiment, the phenotype that is different in the transgenic plant includes one or more of the following: lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.


Phenotype can be assessed by any suitable means. The plants can be evaluated based on their general morphology. Transgenic plants can be observed with the naked eye, weighed, and measured for height. The plant can be examined by isolating individual layers of plant tissue, namely phloem and cambium, which are further sectioned into meristematic cells, early expansion, late expansion, secondary wall formation, and late cell maturation. See, e.g., Hertzberg, supra. The plants also can be assessed using microscopic analysis or chemical analysis.


Microscopic analysis includes examining cell types, stage of development, and stain uptake by tissues and cells. Fiber morphology, such as fiber wall thickness and microfibril angle of wood pulp fibers, can be observed using, for example, microscopic transmission ellipsometry. See Ye and Sundström, Tappi J., 80:181 (1997). Wood strength, density, and grain slope in wet wood and standing trees can be determined by measuring the visible and near infrared spectral data in conjunction with multivariate analysis. See U.S. Patent Application Publication Nos. 2002/0107644 and 2002/0113212. Lumen size can be measured using scanning electron microscopy. Lignin structure and chemical properties can be observed using nuclear magnetic resonance spectroscopy as described in Marita et al., J. Chem. Soc., Perkin Trans. I 2939 (2001).


The biochemical characteristics of lignin, cellulose, carbohydrates and other plant extracts can be evaluated by any standard analytical method known in the art, including spectrophotometry, fluorescence spectroscopy, HPLC, mass spectroscopy, and tissue staining methods.


As used herein, “transformation” refers to a process by which a nucleic acid is inserted into the genome of a plant cell. Such insertion encompasses stable introduction into the plant cell and transmission to progeny. Transformation also refers to transient insertion of a nucleic acid, wherein the resulting transformant transiently expresses the nucleic acid. Transformation can occur under natural or artificial conditions using various methods well known in the art. Transformation can be achieved by any known method for the insertion of nucleic acid sequences into a prokaryotic or eukaryotic host cell, including Agrobacterium-mediated transformation protocols, viral infection, whiskers, electroporation, microinjection, polyethylene glycol-treatment, heat shock, lipofection, and particle bombardment. Transformation can also be accomplished using chloroplast transformation as described in e.g. Svab et al., Proc. Natl Acad. Sci. 87:8526-30 (1990).


In accordance with one embodiment of the invention, transformation in Eucalyptus is performed as described in U.S. Patent Application No. 60/476,222 (supra) which is incorporated herein by reference in its entirety. In accordance with another embodiment, transformation of Pinus is accomplished using the methods described in U.S. Patent Application Publication No. 2002/0100083.


Another aspect of the invention provides methods of obtaining wood and/or making wood pulp from a plant transformed with a DNA construct of the invention. Methods of producing a transgenic plant are provided above and are known in the art. A transformed plant can be cultured or grown under any suitable conditions. For example, pine can be cultured and grown as described in U.S. Patent Application Publication No. 2002/0100083. Eucalyptus can be cultured and grown as in, for example, Rydelius, et al., GROWING EUCALYPTUS FOR PULP AND ENERGY, presented at the Mechanization in Short Rotation, Intensive Culture Forestry Conference, Mobile, Ala., 1994. Wood and wood pulp can be obtained from the plant by any means known in the art.


As noted above, the wood or wood pulp obtained in accordance with this invention may demonstrate improved characteristics including, but not limited to, any one or more of lignin composition, lignin structure, wood composition, cellulose polymerization, fiber dimensions, ratio of fibers to other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape. Such improvements include increased or decreased lignin content, increased accessibility of lignin to chemical treatments, improved reactivity of lignin, increased or decreased cellulose content, increased dimensional stability, increased tensile strength, increased shear strength, increased compression strength, increased shock resistance, increased stiffness, increased or decreased hardness, decreased spirality, decreased shrinkage, and differences in weight, density, and specific gravity.


B. Expression Profiling of Cell Cycle Genes


The present invention also provides methods and tools for performing expression profiling of cell cycle genes. Expression profiling is useful in determining whether genes are transcribed or translated, comparing transcript levels for particular genes in different tissues, genotyping, estimating DNA copy number, determining identity of descent, measuring mRNA decay rates, identifying protein binding sites, determining subcellular localization of gene products, correlating gene expression to a phenotype or other phenomenon, and determining the effect on other genes of the manipulation of a particular gene. Expression profiling is particularly useful for identifying gene expression in complex, multigenic events. For this reason, expression profiling is useful in correlating gene expression to plant phenotype and formation of plant tissues and the interconnection thereof to the cell cycle.


Only a small fraction of the genes of a plant's genome are expressed at a given time in a given tissue sample, and not all of the expressed genes necessarily affect the plant phenotype. To identify genes capable of affecting a phenotype of interest, the present invention provides methods and tools for determining, for example, a gene expression profile at a given point in the cell cycle, a gene expression profile at a given point in plant development, and a gene expression profile in a given tissue sample. The invention also provides methods and tools for identifying cell cycle genes whose expression can be manipulated to alter plant phenotype or to alter the biological activity of cell cycle gene products. In support of these methods, the invention also provides methods and tools that distinguish expression of different genes of the same family.


As used herein, “gene expression” refers to the process of transcription of a DNA sequence into an RNA sequence, followed by translation of the RNA into a protein, which may or may not undergo post-translational processing. Thus, the relationship between cell cycle stage and/or developmental stage and gene expression can be observed by detecting, quantitatively or qualitatively, changes in the level of an RNA or a protein. As used herein, the term “biological activity” includes, but is not limited to, the activity of a protein gene product, including enzyme activity.


The present invention provides oligonucleotides that are useful in these expression profiling methods. Each oligonucleotide is capable of hybridizing under a given set of conditions to a cell cycle gene or gene product. In one aspect of the invention, a plurality of oligonucleotides is provided, wherein each oligonucleotide hybridizes under a given set of conditions to a different cell cycle gene product. Examples of oligonucleotides of the present invention include SEQ ID NOs: 471-697. Each of the oligonucleotides of SEQ ID NOs: 471-697 hybridizes under standard conditions to a different gene product of one of SEQ ID NOs: 1-237. The oligonucleotides of the invention are useful in determining the expression of one or more cell cycle genes in any of the above-described methods.


1. Cell, Tissue, Nucleic Acid, and Protein Samples


Samples for use in methods of the present invention may be derived from plant tissue. Suitable plant tissues include, but are not limited to, somatic embryos, pollen, leaves, stems, calli, stolons, microtubers, shoots, xylem, male strobili, pollen cones, vascular tissue, apical meristem, vascular cambium, root, flower, and seed.


According to the present invention, “plant tissue” is used as described previously herein. Plant tissue can be obtained from any of the plant types or species described supra.


In accordance with one aspect of the invention, samples are obtained from plant tissue at different stages of the cell cycle, from plant tissue at different developmental stages, from plant tissue at various times of the year (e.g. spring versus summer), from plant tissues subject to different environmental conditions (e.g. variations in light and temperature) and/or from different types of plant tissue and cells. In accordance with one embodiment, plant tissue is obtained during various stages of maturity and during different seasons of the year. For example, plant tissue can be collected from stem dividing cells, differentiating xylem, early developing wood cells, differentiated spring wood cells, and differentiated summer wood cells. As another example, gene expression in a sample obtained from a plant with developing wood can be compared to gene expression in a sample obtained from a plant which does not have developing wood.


Differentiating xylem includes samples obtained from compression wood, side-wood, and normal vertical xylem. Methods of obtaining samples for expression profiling from pine and eucalyptus are known. See, e.g., Allona et al., Proc. Nat'l Acad. Sci. 95:9693-8 (1998), Whetton et al., Plant Mol. Biol. 47:275-91, and Kirst et al., INT'L UNION OF FORESTRY RESEARCH ORGANIZATIONS BIENNIAL CONFERENCE, S6.8 (June 2003, Umea, Sweden).


In one embodiment of the invention, gene expression in one type of tissue is compared to gene expression in a different type of tissue or to gene expression in the same type of tissue at a different stage of development. Gene expression can also be compared in one type of tissue which is sampled at various times during the year (different seasons). For example, gene expression in juvenile secondary xylem can be compared to gene expression in mature secondary xylem. Similarly, gene expression in cambium can be compared to gene expression in xylem. Furthermore, gene expression in apical meristems can be compared to gene expression in cambium.


In an alternative embodiment, differences in gene expression are determined as cells from different tissues advance through the cell cycle. In this method, the cells from the different tissues are synchronized and their gene expression is profiled. Methods of synchronizing the stage of cell cycle in a sample are known. These methods include, e.g., cold acclimation, photoperiod, and aphidicolin. See, e.g., Nagata et al., Int. Rev. Cytol. 132:1-30 (1992), and Breyne and Zabeau, Curr. Opin. Plant Biol. 4:136-42, 140 (2001). A sample is obtained during a specific stage of the cell cycle and gene expression in that sample is compared to a sample obtained during a different stage of the cell cycle. For example, tissue can be examined in any of the phases of the cell cycle, such as mitosis, G1, G0, S, and G2. In particular, one can examine the changes in gene expression at the G1, G2, and metaphase checkpoints.


In another embodiment of the invention, a sample is obtained from a plant having a specific phenotype and gene expression in that sample is compared to a sample obtained from a plant of the same species that does not have that phenotype. For example, a sample can be obtained from a plant exhibiting a fast rate of growth and gene expression can be compared with that of a sample obtained from a plant exhibiting a normal or slow rate of growth. Differentially expressed genes identified from such a comparison can be correlated with growth rate and are, therefore, useful for manipulating growth rate.


In a further embodiment, a sample is obtained from clonally propagated plants. In one embodiment the clonally propagated plants are of the species Pinus or Eucalyptus. Individual ramets from the same genotype can be sacrificed at different times of year. Thus, for any genotype there can be at least two genetically identical trees sacrificed, early in the season and late in the season. Each of these trees can be divided into juvenile (top) to mature (bottom) samples. Further, tissue samples can be divided into, for example, phloem to xylem, in at least 5 layers of peeling. Each of these samples can be evaluated for phenotype and gene expression. See Entry 196.
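The sampling scheme described above can be enumerated programmatically. The following is a minimal sketch, assuming two hypothetical genotype labels, one ramet sacrificed early and one late in the season, a juvenile and a mature stem position, and exactly five peel layers; all identifiers are illustrative and none come from the specification.

```python
from itertools import product

# Hypothetical labels, for illustration only.
GENOTYPES = ["genotype_A", "genotype_B"]
SEASONS = ["early_season", "late_season"]      # one ramet sacrificed per season
MATURITY = ["juvenile_top", "mature_bottom"]   # position along the stem
LAYERS = [f"peel_{i}" for i in range(1, 6)]    # phloem-to-xylem, five peels


def sample_plan(genotypes, seasons, maturity, layers):
    """Return one record per tissue sample to be phenotyped and profiled."""
    return [
        {"genotype": g, "season": s, "maturity": m, "layer": l}
        for g, s, m, l in product(genotypes, seasons, maturity, layers)
    ]


plan = sample_plan(GENOTYPES, SEASONS, MATURITY, LAYERS)
```

Each record in `plan` identifies one sample, so phenotype measurements and expression data can later be keyed on the same four fields.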


Where cellular components may interfere with an analytical technique, such as a hybridization assay, enzyme assay, a ligand binding assay, or a biological activity assay, it may be desirable to isolate the gene products from such cellular components. Gene products, including nucleic acid and amino acid gene products, can be isolated from cell fragments or lysates by any method known in the art.


Nucleic acids used in accordance with the invention can be prepared by any available method or process, or by other processes as they become known in the art. Conventional techniques for isolating nucleic acids are detailed, for example, in Tijssen, LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION WITH NUCLEIC ACID PROBES, chapter 3 (Elsevier Press, 1993), Berger and Kimmel, Methods Enzymol. 152:1 (1987), and GIBCO BRL & LIFE TECHNOLOGIES TRIZOL RNA ISOLATION PROTOCOL, Form No. 3786 (2000). Techniques for preparing nucleic acid samples, and sequencing polynucleotides from pine and eucalyptus are known. See, e.g., Allona et al., supra and Whetton et al., supra, and U.S. Application No. 60/476,222.


A suitable nucleic acid sample can contain any type of nucleic acid derived from the transcript of a cell cycle gene, i.e., RNA or a subsequence thereof or a nucleic acid for which an mRNA transcribed from a cell cycle gene served as a template. Suitable nucleic acids include cDNA reverse-transcribed from a transcript, RNA transcribed from that cDNA, DNA amplified from the cDNA, and RNA transcribed from the amplified DNA. Detection of such products or derived products is indicative of the presence and/or abundance of the transcript in the sample. Thus, suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse-transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, and RNA transcribed from amplified DNA. As used herein, the category of “transcripts” includes but is not limited to pre-mRNA nascent transcripts, transcript processing intermediates, and mature mRNAs and degradation products thereof.


It is not necessary to monitor all types of transcripts to practice the invention. For example, the expression profiling methods of the invention can be conducted by detecting only one type of transcript, such as mature mRNA levels only.


In one aspect of the invention, a chromosomal DNA or cDNA library (comprising, for example, fluorescently labeled cDNA synthesized from total cell mRNA) is prepared for use in hybridization methods according to recognized methods in the art. See Sambrook et al., supra.


In another aspect of the invention, mRNA is amplified using, e.g., the MessageAmp kit (Ambion). In a further aspect, the mRNA is labeled with a detectable label. For example, mRNA can be labeled with a fluorescent chromophore, such as CyDye (Amersham Biosciences).


In some applications, it may be desirable to inhibit or destroy RNase that often is present in homogenates or lysates, before use in hybridization techniques. Methods of inhibiting or destroying nucleases are well known. In one embodiment of the invention, cells or tissues are homogenized in the presence of chaotropic agents to inhibit nuclease. In another embodiment, RNase is inhibited or destroyed by heat treatment, followed by proteinase treatment.


Protein samples can be obtained by any means known in the art. Protein samples useful in the methods of the invention include crude cell lysates and crude tissue homogenates. Alternatively, protein samples can be purified. Various methods of protein purification well known in the art can be found in Marshak et al., STRATEGIES FOR PROTEIN PURIFICATION AND CHARACTERIZATION: A LABORATORY COURSE MANUAL (Cold Spring Harbor Laboratory Press 1996).


2. Detecting Level of Gene Expression


For methods of the invention that comprise detecting a level of gene expression, any method for observing gene expression can be used, without limitation. Such methods include traditional nucleic acid hybridization techniques, polymerase chain reaction (PCR) based methods, and protein determination. The invention includes detection methods that use solid support-based assay formats as well as those that use solution-based assay formats.


Absolute measurements of the expression levels need not be made, although they can be made. The invention includes methods comprising comparisons of differences in expression levels between samples. Comparison of expression levels can be done visually or manually, or can be automated and done by a machine, using for example optical detection means. See, e.g., Subrahmanyam et al., Blood. 97: 2457 (2001); Prashar et al., Methods Enzymol. 303: 258 (1999). Hardware and software for analyzing differential expression of genes are available, and can be used in practicing the present invention. See, e.g., GenStat Software and GeneExpress® GX Explorer™ Training Manual, supra; Baxevanis & Francis-Ouellette, supra.
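A relative comparison of this kind can be illustrated with a log2 ratio of signals from two samples. The following is a minimal sketch, not the method of any software named above; the gene names, the intensity values, the pseudocount, and the two-fold threshold are all assumptions chosen for illustration.

```python
import math


def log2_ratio(signal_a, signal_b, pseudocount=1.0):
    """Log2 ratio of two background-corrected signals for one gene.

    The pseudocount is an assumption that avoids division by zero.
    """
    return math.log2((signal_a + pseudocount) / (signal_b + pseudocount))


def differential_genes(sample_a, sample_b, min_abs_log2=1.0):
    """Return {gene: log2 ratio} for genes measured in both samples whose
    expression differs by at least the chosen threshold."""
    result = {}
    for gene in sample_a.keys() & sample_b.keys():
        ratio = log2_ratio(sample_a[gene], sample_b[gene])
        if abs(ratio) >= min_abs_log2:
            result[gene] = ratio
    return result


# Hypothetical intensities for two tissues.
cambium = {"gene_1": 799.0, "gene_2": 99.0}
xylem = {"gene_1": 199.0, "gene_2": 99.0}
changed = differential_genes(cambium, xylem)
```

Because only the ratio is used, no absolute calibration of the signal is required.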


In accordance with one embodiment of the invention, nucleic acid hybridization techniques are used to observe gene expression. Exemplary hybridization techniques include Northern blotting, Southern blotting, solution hybridization, and S1 nuclease protection assays.


Nucleic acid hybridization typically involves contacting an oligonucleotide probe and a sample comprising nucleic acids under conditions where the probe can form stable hybrid duplexes with its complementary nucleic acid through complementary base pairing. For example, see PCT application WO 99/32660; Berger & Kimmel, Methods Enzymol. 152: 1 (1987). The nucleic acids that do not form hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. The detectable label can be present on the probe, or on the nucleic acid sample. In one embodiment, the nucleic acids of the sample are detectably labeled polynucleotides representing the mRNA transcripts present in a plant tissue (e.g., a cDNA library). Detectable labels are commonly radioactive or fluorescent labels, but any label capable of detection can be used. Labels can be incorporated by several approaches described, for instance, in WO 99/32660, supra. In one aspect, RNA can be amplified using the MessageAmp kit (Ambion) with the addition of aminoallyl-UTP as well as free UTP. The aminoallyl groups incorporated into the amplified RNA can be reacted with a fluorescent chromophore, such as CyDye (Amersham Biosciences).


Duplexes of nucleic acids are destabilized by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus, specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature and/or lower salt and/or in the presence of destabilizing reagents) hybridization tolerates fewer mismatches.
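The dependence of duplex stability on salt and base composition can be illustrated with a commonly used empirical approximation for the melting temperature of a DNA duplex, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 675/N. This formula is not taken from the present specification and is only a rough estimate; the probe sequence below is hypothetical.

```python
import math


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm(seq, na_molar):
    """Rough melting temperature (degrees C) from an empirical formula.

    na_molar is the monovalent cation concentration in mol/L. The result
    is an approximation only and ignores mismatches and formamide.
    """
    n = len(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent(seq) - 675.0 / n


probe = "ATGC" * 15                  # hypothetical 60-nt probe, 50% GC
tm_high_salt = approx_tm(probe, 0.9)
tm_low_salt = approx_tm(probe, 0.15)
```

Under this approximation, lowering the salt concentration lowers the estimated melting temperature, which corresponds to the higher stringency described above.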


Typically, stringent conditions for short probes (e.g., 10 to 50 nucleotide bases) will be those in which the salt concentration is at least about 0.01 to 1.0 M at pH 7.0 to 8.3 and the temperature is at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.


Under some circumstances, it can be desirable to perform hybridization at conditions of low stringency, e.g., 6×SSPE-T (0.9 M NaCl, 60 mM NaH2PO4, pH 7.6, 6 mM EDTA, 0.005% Triton) at 37° C., to ensure hybridization. Subsequent washes can then be performed at higher stringency (e.g., 1×SSPE-T at 37° C.) to eliminate mismatched hybrid duplexes. Successive washes can be performed at increasingly higher stringency (e.g., down to as low as 0.25×SSPE-T at 37° C. to 50° C.) until a desired level of hybridization specificity is obtained.
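The salt concentrations implied by this wash series follow from linear dilution of the 6×SSPE recipe given above. A minimal sketch of the arithmetic is shown below; it assumes that only the SSPE components scale with the fold-concentration and that the Triton concentration is set separately.

```python
# Component concentrations of 6x SSPE as stated in the text (mol/L).
SSPE_6X = {"NaCl": 0.9, "NaH2PO4": 0.060, "EDTA": 0.006}


def sspe_at(fold):
    """Scale the 6x recipe linearly to another fold-concentration."""
    return {name: conc * fold / 6.0 for name, conc in SSPE_6X.items()}


wash_series = [6.0, 1.0, 0.25]       # hybridization, first wash, final wash
nacl_by_wash = {fold: sspe_at(fold)["NaCl"] for fold in wash_series}
```

On this arithmetic, 1×SSPE corresponds to about 0.15 M NaCl and 0.25×SSPE to about 0.0375 M NaCl, so each successive wash is carried out at lower salt and therefore higher stringency.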


In general, standard conditions for hybridization are a compromise between stringency (hybridization specificity) and signal intensity. Thus, in one embodiment of the invention, the hybridized nucleic acids are washed at successively higher stringency conditions and read between each wash. Analysis of the data sets produced in this manner will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest. For example, the final wash may be selected as that of the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity.
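The selection rule in the preceding example can be sketched as follows. This is an illustrative sketch only: the `WashReading` record, the integer stringency rank, and the readings are hypothetical, and whether a wash gives "consistent results" is assumed to have been judged elsewhere and passed in as a flag.

```python
from dataclasses import dataclass


@dataclass
class WashReading:
    label: str          # description of the wash condition
    stringency: int     # hypothetical rank, higher means more stringent
    signal: float       # signal intensity read after this wash
    background: float   # background intensity read after this wash
    consistent: bool    # whether results were consistent (judged elsewhere)


def select_final_wash(readings, min_fraction=0.10):
    """Return the most stringent wash that is consistent and whose signal
    exceeds min_fraction of the background, or None if no wash qualifies."""
    ok = [
        r for r in readings
        if r.consistent and r.signal > min_fraction * r.background
    ]
    return max(ok, key=lambda r: r.stringency) if ok else None


readings = [
    WashReading("6x SSPE-T", 1, 900.0, 400.0, True),
    WashReading("1x SSPE-T", 2, 300.0, 100.0, True),
    WashReading("0.25x SSPE-T", 3, 5.0, 80.0, True),
]
final_wash = select_final_wash(readings)
```

In this hypothetical data set the most stringent wash loses too much signal, so the intermediate wash is selected.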


a. Oligonucleotide Probes


Oligonucleotide probes useful in nucleic acid hybridization techniques employed in the present invention are capable of binding to a nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. A probe can include natural bases (i.e., A, G, U, C or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the nucleotide bases in the probes can be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, probes can be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.


Oligonucleotide probes can be prepared by any means known in the art. Probes useful in the present invention are capable of hybridizing to a nucleotide product of cell cycle genes, such as one of SEQ ID NOs: 1-237. Probes useful in the invention can be generated using the nucleotide sequences disclosed in SEQ ID NOs: 1-237. The invention includes oligonucleotide probes having at least a 2, 10, 15, 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, or 100 nucleotide fragment of a corresponding contiguous sequence of any one of SEQ ID NOs: 1-237. The invention includes oligonucleotides of less than 2, 1, 0.5, 0.1, or 0.05 kb in length. In one embodiment, the oligonucleotide is 60 nucleotides in length.


Oligonucleotide probes can be designed by any means known in the art. See, e.g., Li and Stormo, Bioinformatics 17: 1067-76 (2001). Oligonucleotide probe design can be effected using software. Exemplary software includes ArrayDesigner, GeneScan, and ProbeSelect. Probes complementary to a defined nucleic acid sequence can be synthesized chemically, generated from longer nucleotides using restriction enzymes, or can be obtained using techniques such as polymerase chain reaction (PCR). PCR methods are well known and are described, for example, in Innis et al. eds., PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, Academic Press Inc. San Diego, Calif. (1990). The probes can be labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Optimally, the nucleic acids in the sample are labeled and the probes are not labeled. Oligonucleotide probes generated by the above methods can be used in solution or solid support-based methods.


The invention includes oligonucleotide probes that hybridize to a product of the coding region or a 3′ untranslated region (3′ UTR) of a cell cycle gene. In one embodiment, the oligonucleotide probe hybridizes to the 3′ UTR of any one of SEQ ID NOs: 1-237. The 3′ UTR is generally a unique region of the gene, even among members of the same family. Therefore, the probes capable of hybridizing to a product of the 3′ UTR can be useful for differentiating the expression of individual genes within a family where the coding regions of the genes are likely to be highly homologous. This allows for the design of oligonucleotide probes to be used as members of a plurality of oligonucleotides, each capable of uniquely binding to a single gene. In another embodiment, the oligonucleotide probe comprises any one of SEQ ID NOs: 471-697. In another embodiment, the oligonucleotide probe consists of any one of SEQ ID NOs: 471-697.
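The idea of choosing a probe region that is unique to one member of a gene family can be sketched as a simple window search. This is a simplified illustration, not the probe design procedure used for SEQ ID NOs: 471-697: it tests only for exact matches, whereas practical design tools also consider near matches and melting temperature. The toy sequences are hypothetical, and the strand of the final probe depends on which strand of the sample is labeled.

```python
def unique_probe(target_utr, other_family_seqs, length=60):
    """Return the first window of `length` nucleotides in target_utr that
    does not occur exactly in any other family member, or None."""
    for start in range(len(target_utr) - length + 1):
        window = target_utr[start:start + length]
        if not any(window in other for other in other_family_seqs):
            return window
    return None


# Toy sequences and a short window length, for illustration only.
target = "AAAACCCCGGGGTTTT"
relatives = ["AAAACCCCGGGGAAAA"]
region = unique_probe(target, relatives, length=8)
```

Here the two toy sequences share their first twelve nucleotides, so the first window accepted is the first one that reaches into the divergent tail of the target.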


b. Oligonucleotide Array Methods


One embodiment of the invention employs two or more oligonucleotide probes in combination to detect a level of expression of one or more cell cycle genes, such as the genes of SEQ ID NOs: 1-237. In one aspect of this embodiment, the level of expression of two or more different genes is detected. The two or more genes may be from the same or different cell cycle gene families discussed above. Each of the two or more oligonucleotides may hybridize to a different one of the genes.


One embodiment of the invention employs two or more oligonucleotide probes, each of which specifically hybridizes to a polynucleotide derived from the transcript of a gene provided by SEQ ID NOs: 1-237. Another embodiment employs two or more oligonucleotide probes, at least one of which comprises a nucleic acid sequence of SEQ ID NOs: 471-697. Another embodiment employs two or more oligonucleotide probes, at least one of which consists of SEQ ID NOs: 471-697.


The oligonucleotide probes may comprise from about 5 to about 60, or from about 5 to about 500, nucleotide bases, such as from about 60 to about 100 nucleotide bases, including from about 15 to about 60 nucleotide bases.


One embodiment of the invention uses solid support-based oligonucleotide hybridization methods to detect gene expression. Solid support-based methods suitable for practicing the present invention are widely known and are described, for example, in PCT application WO 95/11755; Huber et al., Anal. Biochem. 299: 24 (2001); Meiyanto et al., Biotechniques. 31: 406 (2001); Relogio et al., Nucleic Acids Res. 30:e51 (2002). Any solid surface to which oligonucleotides can be bound, covalently or non-covalently, can be used. Such solid supports include filters, polyvinyl chloride dishes, silicon or glass based chips, etc.


One embodiment uses oligonucleotide arrays, i.e. microarrays, which can be used to simultaneously observe the expression of a number of genes or gene products. Oligonucleotide arrays comprise two or more oligonucleotide probes provided on a solid support, wherein each probe occupies a unique location on the support. The location of each probe may be predetermined, such that detection of a detectable signal at a given location is indicative of hybridization to an oligonucleotide probe of a known identity. Each predetermined location can contain more than one molecule of a probe, but each molecule within the predetermined location has an identical sequence. Such predetermined locations are termed features. There can be, for example, 2, 10, 100, 1,000, 2,000, 5,000, or more such features on a single solid support. In one embodiment, each oligonucleotide is located at a unique position on an array at least 2, at least 3, at least 4, at least 5, at least 6, or at least 10 times.


Oligonucleotide probe arrays for detecting gene expression can be made and used according to conventional techniques described, for example, in Lockhart et al., Nat'l Biotech. 14: 1675 (1996), McGall et al., Proc. Nat'l Acad. Sci. USA 93: 13555 (1996), and Hughes et al., Nature Biotechnol. 19:342 (2001). A variety of oligonucleotide array designs is suitable for the practice of this invention.


In one embodiment the one or more oligonucleotides include a plurality of oligonucleotides that each hybridize to a different gene expressed in a particular tissue type. For example, the tissue can be developing wood.


In one embodiment, a nucleic acid sample obtained from a plant can be amplified and, optionally, labeled with a detectable label. Any method of nucleic acid amplification and any detectable label suitable for such purpose can be used. For example, amplification reactions can be performed using, e.g., Ambion's MessageAmp, which creates “antisense” RNA or “aRNA” (complementary in nucleic acid sequence to the RNA extracted from the sample tissue). The RNA can optionally be labeled using CyDye fluorescent labels. During the amplification step, aaUTP is incorporated into the resulting aRNA. The CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction. Subsequent to the amplification and labeling steps, labeled amplified antisense RNAs are precipitated and washed with appropriate buffer, and then assayed for purity. For example, purity can be assayed using a NanoDrop spectrophotometer. The nucleic acid sample is then contacted with an oligonucleotide array having, attached to a solid substrate (a “microarray slide”), oligonucleotide sample probes capable of hybridizing to nucleic acids of interest which may be present in the sample. The step of contacting is performed under conditions where hybridization can occur between the nucleic acids of interest and the oligonucleotide probes present on the array. The array is then washed to remove non-specifically bound nucleic acids, and the signals from the labeled molecules that remain hybridized to oligonucleotide probes on the solid substrate are detected. The step of detection can be accomplished using any method appropriate to the type of label used. For example, the step of detecting can be accomplished using a laser scanner and detector. For example, one can use an Axon scanner, which optionally uses GenePix Pro software to analyze the position of the signal on the microarray slide.


Data from one or more microarray slides can be analyzed by any appropriate method known in the art.


Oligonucleotide probes used in the methods of the present invention, including microarray techniques, can be generated using PCR. PCR primers used in generating the probes are chosen, for example, based on the sequences of SEQ ID NOs:1-237, to result in amplification of unique fragments of the cell cycle genes (i.e., fragments that hybridize to only one polynucleotide of any one of SEQ ID NOs: 1-237 under standard hybridization conditions). Computer programs are useful in the design of primers with the required specificity and optimal hybridization properties. For example, Li and Stormo, supra at 1075, discuss a method of probe selection using ProbeSelect which selects an optimum oligonucleotide probe based on the entire gene sequence as well as other gene sequences to be probed at the same time.
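By way of illustration only, the notion of a unique fragment can be sketched as a search for windows of a target sequence that do not occur in any other sequence to be probed at the same time. The sketch below is not the ProbeSelect method; it assumes exact substring matching as a crude stand-in for hybridization under standard conditions, and the function name `unique_windows` and the toy sequences are hypothetical.

```python
def unique_windows(target, others, length=60):
    """Return (offset, subsequence) pairs from `target` found in no other sequence.

    Assumption: exact substring matching approximates cross-hybridization.
    Near matches, melting temperature, and secondary structure are ignored.
    """
    windows = []
    for start in range(len(target) - length + 1):
        candidate = target[start:start + length]
        if not any(candidate in other for other in others):
            windows.append((start, candidate))
    return windows


# Hypothetical toy sequences, not taken from the sequence listing.
gene_a = "ATGGCTAGCTAGGATCCGATTACA"
gene_b = "ATGGCTAGCTAGGTTTTTTTTTTT"

# A short window length is used here only because the toy sequences are short.
candidates = unique_windows(gene_a, [gene_b], length=12)
print(candidates[0])
```

In practice a design tool would also score candidates for hybridization properties; this sketch only filters on uniqueness.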


In one embodiment, oligonucleotide control probes also are used. Exemplary control probes can fall into at least one of three categories referred to herein as (1) normalization controls, (2) expression level controls and (3) negative controls. In microarray methods, one or more of these control probes may be provided on the array with the inventive cell cycle gene-related oligonucleotides.


Normalization controls correct for dye biases, tissue biases, dust, slide irregularities, malformed slide spots, etc. Normalization controls are oligonucleotide or other nucleic acid probes that are complementary to labeled reference oligonucleotides or other nucleic acid sequences that are added to the nucleic acid sample to be screened. The signals obtained from the normalization controls, after hybridization, provide a control for variations in hybridization conditions, label intensity, reading efficiency and other factors that can cause the signal of a perfect hybridization to vary between arrays. In one embodiment, signals (e.g., fluorescence intensity or radioactivity) read from all other probes used in the method are divided by the signal from the control probes, thereby normalizing the measurements.
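The division step described above can be sketched as follows. This is a minimal illustration; the use of the mean of several control signals as the divisor is an assumption, and the probe identifiers and intensities are hypothetical.

```python
def normalize_signals(signals, control_ids):
    """Divide each test-probe signal by the mean normalization-control signal.

    `signals` maps probe id -> raw intensity. Averaging the controls is an
    assumption made for this sketch; a single control could be used instead.
    """
    control_mean = sum(signals[c] for c in control_ids) / len(control_ids)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return {
        probe: value / control_mean
        for probe, value in signals.items()
        if probe not in control_ids
    }


# Hypothetical raw intensities for two controls and two test probes.
raw = {"ctrl1": 200.0, "ctrl2": 400.0, "probeA": 600.0, "probeB": 150.0}
normalized = normalize_signals(raw, {"ctrl1", "ctrl2"})
print(normalized)
```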


Virtually any probe can serve as a normalization control. Hybridization efficiency varies, however, with base composition and probe length. Preferred normalization probes are selected to reflect the average length of the other probes being used, but they also can be selected to cover a range of lengths. Further, the normalization control(s) can be selected to reflect the average base composition of the other probes being used. In one embodiment, only one or a few normalization probes are used, and they are selected such that they hybridize well (i.e., without forming secondary structures) and do not match any test probes. In one embodiment, the normalization controls are mammalian genes.


Expression level control probes hybridize specifically with constitutively expressed genes present in the biological sample. Virtually any constitutively expressed gene provides a suitable target for expression level control probes. Typically, expression level control probes have sequences complementary to subsequences of constitutively expressed “housekeeping genes” including, but not limited to, certain photosynthesis genes.


“Negative control” probes are not complementary to any of the test oligonucleotides (i.e., the inventive cell cycle gene-related oligonucleotides), normalization controls, or expression controls. In one embodiment, the negative control is a mammalian gene which is not complementary to any other sequence in the sample.


The terms “background” and “background signal intensity” refer to hybridization signals resulting from non-specific binding or other interactions between the labeled target nucleic acids (i.e., mRNA present in the biological sample) and components of the oligonucleotide array. Background signals also can be produced by intrinsic fluorescence of the array components themselves.


A single background signal can be calculated for the entire array, or a different background signal can be calculated for each target nucleic acid. In one embodiment, background is calculated as the average hybridization signal intensity for the lowest 5 to 10 percent of the oligonucleotide probes being used, or, where a different background signal is calculated for each target gene, for the lowest 5 to 10 percent of the probes for each gene. Where the oligonucleotide probes corresponding to a particular cell cycle gene hybridize well and, hence, appear to bind specifically to a target sequence, they should not be used in a background signal calculation. Alternatively, background can be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g., probes directed to nucleic acids of the opposite sense or to genes not found in the sample). In microarray methods, background can be calculated as the average signal intensity produced by regions of the array that lack any oligonucleotide probes at all.
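The first of these calculations, the average intensity of the lowest 5 to 10 percent of probes, can be sketched as shown below. The function name and the rounding rule used to turn a fraction into a probe count are assumptions made for illustration.

```python
def background_signal(intensities, fraction=0.05):
    """Average the lowest `fraction` of probe intensities (0.05 to 0.10 in the text).

    Assumption: the probe count is the fraction of the total rounded to the
    nearest integer, with at least one probe always included.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(intensities)
    count = max(1, round(len(ordered) * fraction))
    lowest = ordered[:count]
    return sum(lowest) / len(lowest)


# Hypothetical intensities 1..100 for a 100-probe array.
signals = [float(i) for i in range(1, 101)]
print(background_signal(signals, 0.05), background_signal(signals, 0.10))
```

The same function can be applied per gene by passing only the intensities of the probes for that gene.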


c. PCR-Based Methods


In another embodiment, PCR-based methods are used to detect gene expression. These methods include reverse-transcriptase-mediated polymerase chain reaction (RT-PCR) including real-time and endpoint quantitative reverse-transcriptase-mediated polymerase chain reaction (Q-RTPCR). These methods are well known in the art. For example, methods of quantitative PCR can be carried out using kits and methods that are commercially available from, for example, Applied BioSystems and Stratagene®. See also Kochanowski, QUANTITATIVE PCR PROTOCOLS (Humana Press, 1999); Innis et al., supra.; Vandesompele et al., Genome Biol. 3: RESEARCH0034 (2002); Stein, Cell Mol. Life Sci. 59: 1235 (2002).


Gene expression can also be observed in solution using Q-RTPCR. Q-RTPCR relies on detection of a fluorescent signal produced proportionally during amplification of a PCR product. See Innis et al., supra. Like the traditional PCR method, this technique employs PCR oligonucleotide primers, typically 15-30 bases long, that hybridize to opposite strands and regions flanking the DNA region of interest. Additionally, a probe (e.g., TaqMan®, Applied Biosystems) is designed to hybridize to the target sequence between the forward and reverse primers traditionally used in the PCR technique. The probe is labeled at the 5′ end with a reporter fluorophore, such as 6-carboxyfluorescein (6-FAM) and a quencher fluorophore like 6-carboxy-tetramethyl-rhodamine (TAMRA). As long as the probe is intact, fluorescent energy transfer occurs which results in the absorbance of the fluorescence emission of the reporter fluorophore by the quenching fluorophore. As Taq polymerase extends the primer, however, the intrinsic 5′ to 3′ nuclease activity of Taq degrades the probe, releasing the reporter fluorophore. The increase in the fluorescence signal detected during the amplification cycle is proportional to the amount of product generated in each cycle.


The forward and reverse amplification primers and internal hybridization probe are designed to hybridize specifically and uniquely with one polynucleotide derived from the transcript of a target gene. In one embodiment, the selection criteria for primer and probe sequences incorporate constraints regarding nucleotide content and size to accommodate TaqMan® requirements.


SYBR Green® can be used as a probe-less Q-RTPCR alternative to the TaqMan®-type assay discussed above. See ABI PRISM® 7900 SEQUENCE DETECTION SYSTEM USER GUIDE, Applied Biosystems, chap. 1-8, App. A-F (2002).


A device measures changes in fluorescence emission intensity during PCR amplification. The measurement is done in “real time,” that is, as the amplification product accumulates in the reaction. Other methods can be used to measure changes in fluorescence resulting from probe digestion. For example, fluorescence polarization can distinguish between large and small molecules based on molecular tumbling (see U.S. Pat. No. 5,593,867).


d. Protein Detection Methods


Proteins can be observed by any means known in the art, including immunological methods, enzyme assays and protein array/proteomics techniques.


Measurement of the translational state can be performed according to several protein methods. For example, whole genome monitoring of protein—the “proteome”—can be carried out by constructing a microarray in which binding sites comprise immobilized, preferably monoclonal, antibodies specific to a plurality of proteins having an amino acid sequence of any of SEQ ID NOs: 261-497 or proteins encoded by the genes of SEQ ID NOs:1-237 or conservative variants thereof. See Wildt et al., Nature Biotechnol. 18: 989 (2000). Methods for making polyclonal and monoclonal antibodies are well known, as described, for instance, in Harlow & Lane, ANTIBODIES: A LABORATORY MANUAL (Cold Spring Harbor Laboratory Press, 1988).


Alternatively, proteins can be separated by two-dimensional gel electrophoresis systems. Two-dimensional gel electrophoresis is well known in the art and typically involves isoelectric focusing along a first dimension followed by SDS-PAGE electrophoresis along a second dimension. See, e.g., Hames et al., GEL ELECTROPHORESIS OF PROTEINS: A PRACTICAL APPROACH (IRL Press, 1990). The resulting electropherograms can be analyzed by numerous techniques, including mass spectrometric techniques, western blotting and immunoblot analysis using polyclonal and monoclonal antibodies, and internal and N-terminal micro-sequencing.


3. Correlating Gene Expression to Phenotype and Tissue Development


As discussed above, the invention provides methods and tools to correlate gene expression to plant phenotype. Gene expression may be examined in a plant having a phenotype of interest and compared to a plant that does not have the phenotype or has a different phenotype. Such a phenotype includes, but is not limited to, increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides.


In another embodiment, the phenotype includes one or more of the following traits: propensity to form reaction wood, a reduced period of juvenility, an increased period of juvenility, self-abscising branches, accelerated reproductive development or delayed reproductive development.


In a further embodiment, the phenotype that differs in the plants compared includes one or more of the following: lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.


Phenotype can be assessed by any suitable means as discussed above.


In a further embodiment, gene expression can be correlated to a given point in the cell cycle, a given point in plant development, and a given tissue sample. Plant tissue can be examined at different stages of the cell cycle, at different developmental stages, at various times of the year (e.g. spring versus summer), under different environmental conditions (e.g. variations in light and temperature), and/or across different types of plant tissue and cells. In accordance with one embodiment, plant tissue is obtained during various stages of maturity and during different seasons of the year. For example, plant tissue can be collected from stem dividing cells, differentiating xylem, early developing wood cells, differentiated spring wood cells, and differentiated summer wood cells.


It will be apparent to those skilled in the art that various modifications and variations can be made in the methods and compositions of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.


The following examples are given to illustrate the present invention. It should be understood, however, that the invention is not to be limited to the specific conditions or details described in these examples. Throughout the specification, any and all references to a publicly available document, including a U.S. patent, are specifically incorporated by reference.


Examples
Example 1

Example 1 illustrates a procedure for RNA extraction and purification, which is particularly useful for RNA obtained from conifer needle, xylem, cambium, and phloem.


Tissue is obtained from conifer needle, xylem, cambium or phloem. The tissue is frozen in liquid nitrogen and ground. The total RNA is extracted using Concert Plant RNA reagent (Invitrogen). The resulting RNA sample is extracted into phenol:chloroform and treated with DNase. The RNA is then incubated at 65° C. for 2 minutes followed by centrifugation at 4° C. for 30 minutes. Following centrifugation, the RNA is extracted into phenol at least 10 times to remove contaminants.


The RNA is further cleaned using RNeasy columns (Qiagen). The purified RNA is quantified using RiboGreen reagent (Molecular Probes) and purity assessed by gel electrophoresis.


RNA is then amplified using MessageAmp (Ambion). Aminoallyl-UTP and free UTP are added to the in vitro transcription of the purified RNA at a ratio of 4:1 aminoallyl-UTP-to-UTP. The aminoallyl-UTP is incorporated into the new RNA strand as it is transcribed. The amino-allyl group is then reacted with Cy dyes to attach the colorimetric label to the resulting amplified RNA using the Amersham procedure modified for use with RNA. Unincorporated dye is removed by ethanol precipitation. The labeled RNA is quantified spectrophotometrically (NanoDrop). The labeled RNA is fragmented by heating to 95° C. as described in Hughes et al., Nature Biotechnol. 19:342 (2001).


Example 2

Example 2 illustrates how cell cycle genes important for wood development in Pinus radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.


Pine trees of the species Pinus radiata are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species Pinus radiata.


RNA is isolated and ESTs generated as described in Example 1 or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in the plant cell cycle. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed using BLAST (NCBI). Sequences from among the known cell cycle genes that show hybridization in silico to ESTs made from samples containing developing wood, but that do not hybridize to ESTs from samples not containing developing wood are selected for further examination.
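The selection rule in the preceding paragraph reduces to a set comparison once the BLAST matches have been tabulated. The sketch below assumes that tabulation has already been done; the mapping of gene identifiers to the EST libraries in which a match was found, the library names, and the function name are all hypothetical.

```python
def select_wood_preferred(hits, wood_libraries):
    """Return genes matched only by ESTs from developing-wood libraries.

    `hits` maps gene id -> set of EST library names with an in silico match.
    Genes with no match at all are not selected.
    """
    selected = []
    for gene, libraries in hits.items():
        in_wood = any(lib in wood_libraries for lib in libraries)
        outside_wood = any(lib not in wood_libraries for lib in libraries)
        if in_wood and not outside_wood:
            selected.append(gene)
    return sorted(selected)


# Hypothetical tabulated matches.
hits = {
    "geneA": {"xylem"},            # wood only: selected
    "geneB": {"xylem", "needle"},  # also outside wood: rejected
    "geneC": {"root"},             # no wood match: rejected
    "geneD": set(),                # no match at all: rejected
}
print(select_wood_preferred(hits, {"xylem", "cambium"}))
```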


cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.


The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.


Example 3

Example 3 illustrates how cell cycle genes important for wood development in Eucalyptus grandis can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.


Eucalyptus trees of the species Eucalyptus grandis are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species Eucalyptus grandis.


RNA is isolated and ESTs generated as described in Example 1 or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in the plant cell cycle. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed as described in, for example, Audic and Claverie, Genome Res. 7:986 (1997). Sequences from among the known cell cycle genes that show hybridization in silico to ESTs made from samples containing developing wood, but do not hybridize to ESTs from samples not containing developing wood are selected for further examination.


cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.


The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.


Example 4

Example 4 illustrates how to detect expression of Pinus radiata cell cycle genes which are important in wood formation using an oligonucleotide microarray prepared as in Example 2. This is an example of a balanced incomplete block designed experiment carried out using aRNA samples prepared from mature-phase phloem (P), cambium (C), expanding xylem found in a layer below the cambium (X1) and differentiating, lignifying xylem cells found deeper in the same growth ring (X2). In this example, cell cycle gene expression is compared among the four samples, namely P, C, X1, and X2.


In the summer, plants of the species Pinus radiata are felled and the bark of the main stem is immediately pulled gently away to reveal the phloem and xylem. The phloem and xylem are then peeled with a scalpel into separate containers of liquid nitrogen. Needles (leaves) and buds from the trees are also harvested with a scalpel into separate containers of liquid nitrogen. RNA is subsequently isolated from the frozen tissue samples as described in Example 1. Equal microgram quantities of total RNA are purified from each sample using RNeasy Mini columns (Qiagen, Valencia, Calif.) according to the manufacturer's instructions.


Amplification reactions are carried out for each of the P, C, X1, and X2 tissue samples. Amplification reactions are performed using Ambion's MessageAmp kit, a T7-based amplification procedure, following the manufacturer's instructions, except that labeled aaUTP is added to the reagent mix during the amplification step. aaUTP is incorporated into the resulting antisense RNA formed during this step. CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction as described in Example 1. Labeled amplified antisense RNAs are precipitated and washed, and then assayed for purity using a NanoDrop spectrophotometer. These labeled antisense RNAs, corresponding to the RNA isolated from the P, C, X1, and X2 tissue samples, constitute the sample nucleic acids, which are referred to as the P, C, X1, and X2 samples.


Normalization control samples of known nucleic acids are added to each sample in a dilution series of 500, 200, 100, 50, 25 and 10 pg/μl for quantitation of the signals. Positive controls corresponding to specific genes showing expression in all tissues of pine, such as housekeeping genes, are also added to the plant sample.
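One way the dilution series might be used for quantitation is to fit a standard curve relating control signal to known concentration and then read unknown signals off that curve. The sketch below assumes a linear relationship on a log-log scale, which the text does not specify; the signal values and function names are hypothetical.

```python
import math

# Dilution series stated in the text, in pg/uL.
DILUTIONS_PG_PER_UL = [500, 200, 100, 50, 25, 10]


def fit_log_log(concentrations, signals):
    """Least-squares line through (log10 concentration, log10 signal).

    Assumption: signal follows a power law of concentration over this range.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(signal, slope, intercept):
    """Invert the fitted line to estimate concentration from a signal."""
    return 10 ** ((math.log10(signal) - intercept) / slope)


# Hypothetical control signals that happen to scale linearly with concentration.
control_signals = [20.0 * c for c in DILUTIONS_PG_PER_UL]
slope, intercept = fit_log_log(DILUTIONS_PG_PER_UL, control_signals)
print(estimate_concentration(1500.0, slope, intercept))
```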


Each of four microarray slides is incubated with 125 μL of a P, C, X1 or X2 sample under a coverslip at 42° C. for 16-18 hours. The arrays are washed in 1×SSC, 0.1% SDS for 10 minutes and then in 0.1×SSC, 0.1% SDS for 10 minutes and then allowed to dry.


The array slides are scanned using an Axon laser scanner and analyzed using GenePix Pro software. Data from the microarray slides are subjected to microarray data analysis using GenStat, SAS, or Spotfire software. Outliers are removed and ratiometric data for each of the datasets are normalized using a global normalization which employs a cubic spline fit applied to correct for differential dye bias and spatial effects. A second transformation is performed to fit control signal ratios to a mean log2=0 (i.e. 1:1 ratio). Normalized data are then subjected to a variance analysis.
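The second transformation can be illustrated as a shift of all log2 ratios by the mean log2 ratio of the control probes, so that the controls average to zero. The sketch below covers only this centering step and does not attempt the cubic spline normalization; treating the transformation as a simple additive shift is an assumption, and the probe identifiers and values are hypothetical.

```python
def center_on_controls(log2_ratios, control_ids):
    """Shift all log2 ratios so the control probes have a mean of 0 (1:1 ratio).

    Assumption: the transformation is a single additive offset on the log2 scale.
    """
    offset = sum(log2_ratios[c] for c in control_ids) / len(control_ids)
    return {probe: value - offset for probe, value in log2_ratios.items()}


# Hypothetical log2 ratios after spline normalization.
ratios = {"ctrl1": 0.4, "ctrl2": 0.6, "geneA": 1.5, "geneB": -0.5}
centered = center_on_controls(ratios, ["ctrl1", "ctrl2"])
print(centered)
```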


Mean signal intensity at any given position on the microarray slide is determined across three of the four P, C, X1, and X2 sample microarray slides. This mean signal at each probe position is compared to the signal at the same position on the sample slide that was not used for calculating the mean. For example, a mean signal at a given position is determined for P, C, and X1, and the signal at that position in the X2 microarray slide is compared to the P, C, and X1 mean signal value.
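The comparison just described can be sketched for a single probe position as follows. Expressing the comparison as a log2 ratio is an assumption made for illustration; on that scale a value above 1 corresponds to a signal more than double the mean of the other three samples. The signal values are hypothetical.

```python
import math


def leave_one_out_log2(signals):
    """For each sample, log2 of its signal over the mean of the other samples.

    `signals` maps sample name (e.g. P, C, X1, X2) -> signal at one probe position.
    """
    ratios = {}
    for held_out, value in signals.items():
        others = [v for name, v in signals.items() if name != held_out]
        ratios[held_out] = math.log2(value / (sum(others) / len(others)))
    return ratios


# Hypothetical signals at one probe position.
position = {"P": 800.0, "C": 100.0, "X1": 100.0, "X2": 100.0}
ratios = leave_one_out_log2(position)

# Samples whose signal is more than double the mean of the other three.
print([name for name, r in ratios.items() if r > 1.0])
```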


Table 1 shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.














TABLE 1

Gene                        PvCX12    PvX12    CvX12
WD40 repeat protein A       −1.24     −0.88    −1.07
CDC2                        −1.09     −0.78    −0.92
CYCLIN                      −1.08     −1       −0.26
WD-40 repeat protein B      −1.01     −0.87    −0.42
CDC2                        −0.83     −0.49    −1.01

P = Phloem
C = Cambium
X1 = xylem layer-1
X2 = xylem layer-2
PvCX12 = Ratio of the signal for Phloem target versus mean signal for Cambium, Xylem1, and Xylem2 targets






The data show that WD40 repeat protein A encodes a WD40 repeat protein that is less highly expressed in cambium than in developing xylem, while WD40 repeat protein B encodes a WD40 repeat protein that is more highly expressed in phloem than in the other tissues.


Signal data are then verified with RT-PCR to confirm gene expression in the target tissue of the genes corresponding to the unique oligonucleotides in the probe.


Example 5

Example 5 demonstrates how one can correlate cell cycle gene expression with agronomically important wood phenotypes such as density, stiffness, strength, distance between branches, and spiral grain.


Mature clonally propagated pine trees are selected from among the progeny of known parent trees for superior growth characteristics and resistance to important fungal diseases. The bark is removed from a tangential section and the trees are examined for average wood density in the fifth annual ring at breast height, stiffness and strength of the wood, and spiral grain. The trees are also characterized by their height, mean distance between major branches, crown size, and forking.


To obtain seedling families that are segregating for major genes that affect density, stiffness, strength, distance between branches, spiral grain and other characteristics that may be linked to any of the genes affecting these characteristics, trees lacking common parents are chosen for specific crosses on the criterion that they exhibit the widest variation from each other with respect to the density, stiffness, strength, distance between branches, and spiral grain criteria. Thus, pollen from a plus tree exhibiting high density, low mean distance between major branches, and high spiral grain is used to pollinate cones from the unrelated plus tree among the selections exhibiting the lowest density, highest mean distance between major branches, and lowest spiral grain. It is useful to note that “plus trees” are crossed such that pollen from a plus tree exhibiting high density is used to pollinate developing cones from another plus tree exhibiting high density, for example, and pollen from a tree exhibiting low mean distance between major branches is used to pollinate developing cones from another plus tree exhibiting low mean distance between major branches.


Seeds are collected from these controlled pollinations and grown such that the parental identity is maintained for each seed and used for vegetative propagation such that each genotype is represented by multiple ramets. Vegetative propagation is accomplished using micropropagation, hedging, or fascicle cuttings. Some ramets of each genotype are stored while vegetative propagules of each genotype are grown to sufficient size for establishment of a field planting. The genotypes are arrayed in a replicated design and grown under field conditions where the daily temperature and rainfall are measured and recorded.


The trees are measured at various ages to determine the expression and segregation of density, stiffness, strength, distance between branches, spiral grain, and any other observable characteristics that may be linked to any of the genes affecting these characteristics. Samples are harvested for characterization of cellulose content, lignin content, cellulose microfibril angle, density, strength, stiffness, tracheid morphology, ring width, and the like. Samples are also examined for gene expression as described in Example 4. Ramets of each genotype are compared to ramets of the same genotype at different ages to establish age:age correlations for these characteristics.
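

The age:age correlations described above could be computed, for example, as a Pearson correlation between genotype means of one trait measured at two ages. The following Python sketch is illustrative only; the genotype labels, ages, and density values are hypothetical placeholders and are not data from this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def age_age_correlation(trait_by_age, age_a, age_b):
    """Correlate genotype means of one trait measured at two ages.

    trait_by_age maps age -> {genotype: mean trait value over ramets}.
    Only genotypes measured at both ages are used.
    """
    shared = sorted(set(trait_by_age[age_a]) & set(trait_by_age[age_b]))
    xs = [trait_by_age[age_a][g] for g in shared]
    ys = [trait_by_age[age_b][g] for g in shared]
    return pearson(xs, ys)


# Hypothetical wood density means (kg/m^3) per genotype at ages 3 and 8 years.
density = {
    3: {"G1": 340.0, "G2": 365.0, "G3": 352.0, "G4": 380.0},
    8: {"G1": 410.0, "G2": 446.0, "G3": 425.0, "G4": 470.0},
}
r = age_age_correlation(density, 3, 8)
print(f"age 3 vs age 8 density correlation: r = {r:.3f}")
```

A high correlation between an early and a late measurement would suggest that the early measurement is a useful predictor of the trait at the later age.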


Example 6

Example 6 demonstrates how the stage of plant development and responses to environmental conditions such as light and season can be correlated to cell cycle gene expression using microarrays prepared as in Example 4. In particular, the changes in gene expression associated with wood density are examined.


Trees of three different clonally propagated Eucalyptus grandis hybrid genotypes are grown on a site with a weather station that measures daily temperatures and rainfall. During the spring and subsequent summer, genetically identical ramets of the three different genotypes are first photographed with north-south orientation marks, using photography at sufficient resolution to show bark characteristics of juvenile and mature portions of the plant, and then felled as in Example 4. The age of the trees is determined by planting records and confirmed by a count of the annual rings. In each of these trees, mature wood is defined as the outermost rings of the tree below breast height, and juvenile wood as the innermost rings of the tree above breast height. Each tree is accordingly sectored as follows:


NM—NORTHSIDE MATURE


SM—SOUTHSIDE MATURE


NT—NORTHSIDE TRANSITION


ST—SOUTHSIDE TRANSITION


NJ—NORTHSIDE JUVENILE


SJ—SOUTHSIDE JUVENILE


Tissue is harvested from the plant trunk as well as from juvenile and mature form leaves. Samples are prepared simultaneously for phenotype analysis, including plant morphology and biochemical characteristics, and gene expression analysis. The height and diameter of the tree at the point from which each sector was taken is recorded, and a soil sample from the base of the tree is taken for chemical assay. Samples prepared for gene expression analysis are weighed and placed into liquid nitrogen for subsequent preparation of RNA samples for use in the microarray experiment. The tissues are denoted as follows:


P—phloem


C—cambium


X1—expanding xylem


X2—differentiating and lignifying xylem


Thin slices in tangential and radial sections from each of the sectors of the trunk are fixed as described in Ruzin, Plant Microtechnique and Microscopy, Oxford University Press, Inc., New York, N.Y. (1999) for anatomical examination and confirmation of wood developmental stage. Microfibril angle is examined at the different developmental stages of the wood, for example juvenile, transition and mature phases of Eucalyptus grandis wood. Other characteristics examined are the ratio of fibers to vessel elements and ray tissue in each sector. Additionally, the samples are examined for characteristics that change between juvenile and mature wood and between spring wood and summer wood, such as fiber morphology, lumen size, and width of the S2 (thickest) cell wall layer. Samples are further examined for measurements of density in the fifth ring and determination of modulus of elasticity using techniques well known to those skilled in the art of wood assays. See, e.g., Wang, et al., Non-destructive Evaluations of Trees, EXPERIMENTAL TECHNIQUES, pp. 28-30 (2000).


For biochemical analysis, 50 grams from each of the harvest samples are freeze-dried and analyzed, using biochemical assays well known to those skilled in the art of plant biochemistry for quantities of simple sugars, amino acids, lipids, other extractives, lignin, and cellulose. See, e.g., Pettersen & Schwandt, J. Wood Chem. & Technol. 11:495 (1991).


In the present example, the phenotypes chosen for comparison are high density wood, average density wood, and low density wood. Nucleic acid samples are prepared as described in Example 3, from trees harvested in the spring and summer. Gene expression profiling by hybridization and data analysis is performed as described in Examples 3 and 4.


Using similar techniques and clonally propagated individuals one can examine cell cycle gene expression as it is related to other complex wood characteristics such as strength, stiffness and spirality.


Example 7

Example 7 demonstrates the ability of the oligonucleotide probes of the invention to distinguish between highly homologous members of a family of cell cycle genes. Hybridization to a particular oligonucleotide on the array identifies a unique WD40 gene that is expressed more strongly in a genotype having a higher density wood than is observed in other genotypes examined. The WD40 gene is also expressed more strongly in mature wood than in juvenile wood and more strongly in summer wood than in spring wood. This gene is not found to be expressed at high levels either in leaves or buds.


The gene expression pattern is confirmed by RT-PCR. This gene, the putative “density-related” gene, is used for in situ hybridization of fixed radial sections. The density-related WD40 gene hybridizes most strongly to the vascular cambium in regions of the stem where the xylem is comprised primarily of fibers with few vessel elements and few xylem ray cells.


These results suggest that the WD40 gene product functions in radial cell division, which occurs in the cambium and results in diameter growth, rather than in axial cell division such as may be important in the apex or leaves. Such a gene would be difficult to identify by cDNA microarrays or other traditional hybridization means because the highly conserved regions present in the gene would result in confusing it with genes encoding enzymes having similar catalytic functions, but acting in axial or radial divisions. Furthermore, from the sequence similarity-based annotation suggesting a function of this gene product in cell division and the observation of this microarray hybridization pattern, confirmed by RT-PCR and in situ hybridization, this gene product functions specifically in developing secondary xylem to guide the cell division patterns of fibers, such that higher expression of this gene results in greater fiber production relative to vessel element or ray production. The fiber content is correlated with a principal components analysis (PCA) variable that accounts for at least 10% of the variation in basic density.


Example 8

Example 8 demonstrates how oligonucleotide probes of the invention can be used to identify one wood “density related” WD40 repeat protein gene and its promoter from among the family of homologous genes. Further, this example demonstrates how a promoter sequence identified using this method is used to transform other hardwood species to result in increased diameter growth rates as compared to wild-type plants of the same species.


The sequence of the WD40 gene is used to probe a Genome Walker library in order to isolate 5′ flanking sequences comprising a promoter region. The promoter region is then operably linked to a beta-glucuronidase reporter gene and cloned into a binary vector for transformation into Eucalyptus using the method described in U.S. Application Ser. No. 60/476,222. Regenerated transgenic tobacco and Eucalyptus plants are then sectioned and stained using X-gluc, demonstrating that the microarray data results in isolation of a promoter capable of highly cambial-specific expression solely in those portions of the stem that develop more fibers than vessel elements or xylem rays.


Using techniques well known to those skilled in the art of molecular biology, the promoter is then operably linked to a cell division promoting gene and this construct placed in a binary vector for transformation into hardwood plants such as Sweetgum and Populus, such that the cell division promoting gene is expressed more strongly than normally in the vascular cambium. This results in increased diameter growth rate in the transgenic hardwood plants relative to control hardwood plants.


Example 9

Example 9 demonstrates how a density related polypeptide can be linked to a tissue-preferred promoter and expressed in pine resulting in a plant with increased wood density.


A density-related polypeptide, which is more highly expressed during the early spring, is identified by the method described in Example 7. A DNA construct having the density-related polypeptide operably linked to a promoter is placed into an appropriate binary vector and transformed into pine using the method of Connett et al. (U.S. patent application Ser. Nos. 09/973,088 and 09/973,089). Pine plants are transformed as described in Connett et al., supra, and the transgenic pine plants are used to establish a forest planting. Increased density even in the spring wood (early wood) is observed in the transgenic pine plants relative to control pine plants which are not transformed with the density related DNA construct.


Example 10

Using techniques well known to those skilled in the art of molecular biology, the sequence of the putative density-related gene isolated in Example 7 is analyzed in genomic DNA isolated from alfalfa. This enables the identification of an orthologue in alfalfa whose sequence is then used to create an RNAi knockout construct. This construct is then transformed into alfalfa. See, e.g., Austin et al., Euphytica 85:381 (1995). The regenerated transgenic plants show lower fiber content and increased ray cell content in the xylem. Such properties improve digestibility, which results in higher growth rates in cattle fed on this alfalfa as compared to wild-type alfalfa of the same species.


Example 11

Example 11 demonstrates how gene expression analysis can be used to find gene variants which are present in mature plants having a desirable phenotype. The presence or absence of such a variant can be used to predict the phenotype of a mature plant, allowing screening of the plants at the seedling stage. Although this example employs eucalyptus, the method used herein is also useful in breeding programs for pine and other tree species.


The sequence of a putative density-related gene is used to probe genomic DNA isolated from Eucalyptus trees that vary in density as described in previous examples. Non-transgenically produced Eucalyptus hybrids of different wood phenotypes are examined. One hybrid exhibits high wood density and another hybrid exhibits lower wood density. A molecular marker in the 3′ portion of the coding region is found which distinguishes a high-density gene variant from a lower density gene variant.


This molecular marker enables tree breeders to assay non-transgenic Eucalyptus hybrids for likely density profiles while the trees are still at seedling stage, whereas in the absence of the marker, tree breeders must wait until the trees have grown for multiple years before density at harvest age can be reliably predicted. This enables selective outplanting of the best trees at seedling stage rather than an expensive culling operation and resultant erosion at thinning age. This molecular marker is further useful in the breeding program to determine which parents will give rise to high density outcross progeny.


Molecular markers found in the 3′ portion of the coding region of the gene that do not correspond to variants seen more frequently in higher or lower wood density non-transgenic Eucalyptus hybrid trees are also useful. These markers are found to be useful for fingerprinting different genotypes of Eucalyptus, for use in identity-tracking in the breeding program and in plantations.


Example 12

This Example describes microarrays for identifying gene expression differences that contribute to the phenotypic characteristics that are important in commercial wood, namely wood appearance, stiffness, strength, density, fiber dimensions, coarseness, cellulose and lignin content, extractives content and the like.


As in Examples 2-4, woody trees of genera that produce commercially important wood products, in this case Pinus and Eucalyptus, are felled from various sites and at various times of year for the collection and isolation of RNA from developing xylem, cambium, phloem, leaves, buds, roots, and other tissues. RNA is also isolated from seedlings of the same genera.


All contigs are compared to both the ESTs made from RNA isolated from samples containing developing wood and the sequences of the ESTs made from RNA of various tissues that do not contain developing wood. Contigs containing primarily ESTs that show more hybridization in silico to ESTs made from RNA isolated from samples containing developing wood than to ESTs made from RNA isolated from samples not containing developing wood are determined to correspond to possible novel genes particularly expressed in developing wood. These contigs are then used for BLAST searches against public domain sequences. Those contigs that hybridize with high stringency to no known genes or genes annotated as having only a “hypothetical protein” are selected for the next step. These contigs are considered putative novel genes showing wood-preferred expression.
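

The two-step selection described above (wood-preferred representation, followed by absence of an informative database annotation) can be sketched as a simple filter. The Python sketch below is illustrative only: it uses the count of member ESTs from each source as a stand-in for in silico hybridization, and the record layout, contig names, and annotations are hypothetical.

```python
def select_wood_preferred_novel(contigs):
    """Return names of contigs that are wood-preferred and putatively novel.

    contigs maps contig name -> {
        "est_sources": list of "wood" / "other" labels, one per member EST,
        "best_hit": description of the best database hit, or None if no hit,
    }
    """
    selected = []
    for name, rec in contigs.items():
        wood = sum(1 for src in rec["est_sources"] if src == "wood")
        other = len(rec["est_sources"]) - wood
        hit = rec["best_hit"]
        # "Novel" here means no hit at all, or only a hypothetical protein.
        novel = hit is None or "hypothetical protein" in hit.lower()
        if wood > other and novel:
            selected.append(name)
    return selected


# Hypothetical contig records for illustration.
contigs = {
    "contig_1": {"est_sources": ["wood", "wood", "wood", "other"], "best_hit": None},
    "contig_2": {"est_sources": ["wood"] * 4, "best_hit": "cyclin D"},
    "contig_3": {"est_sources": ["wood", "wood", "other"], "best_hit": "hypothetical protein"},
    "contig_4": {"est_sources": ["wood", "other", "other", "other"], "best_hit": None},
}
candidates = select_wood_preferred_novel(contigs)
print(candidates)
```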


The longest cDNA clones containing sequences hybridizing to the putative novel genes showing wood-preferred expression are selected from cDNA libraries using techniques well known to those skilled in the art of molecular biology. The cDNAs are sequenced and full-length gene-coding sequences together with untranslated flanking sequences are obtained where possible. Stretches of 45-80 nucleotides (or oligonucleotides) are selected from each of the sequences of putative novel genes showing wood-preferred expression such that each oligonucleotide probe hybridizes at high stringency to only one sequence represented in the ESTs made from RNA isolated from trees or seedlings of the same genus.
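

One way to screen candidate oligonucleotides for uniqueness is to slide a fixed-length window along the gene of interest and discard any window that also occurs in another sequence of the set. The Python sketch below makes a strong simplifying assumption: exact matching on either strand is used as a crude proxy for hybridization at high stringency, which in practice also depends on near-matches and melting temperature. The sequences and names are hypothetical, and a 12-nucleotide window is used only to keep the toy example short (the text above specifies 45-80 nucleotides).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def unique_oligos(target, sequences, length=45):
    """Return (start, oligo) windows of `target` absent from all other sequences.

    sequences maps name -> uppercase DNA string. A window is rejected if it,
    or its reverse complement, occurs exactly in any other sequence.
    """
    others = [s for name, s in sequences.items() if name != target]
    seq = sequences[target]
    hits = []
    for start in range(len(seq) - length + 1):
        oligo = seq[start:start + length]
        rc = reverse_complement(oligo)
        if not any(oligo in o or rc in o for o in others):
            hits.append((start, oligo))
    return hits


# Two hypothetical family members sharing a conserved 5' region.
CONSERVED = "ATGGCCGATCTGAAGGAG"
toy = {
    "geneA": CONSERVED + "ACGTTGCAAGCT",
    "geneB": CONSERVED + "TTGACCGTAAGG",
}
probes = unique_oligos("geneA", toy, length=12)
for start, oligo in probes:
    print(start, oligo)
```

Windows lying entirely within the conserved region are rejected, so the surviving candidates all extend into the gene-specific portion of the sequence.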


Oligomers are then chemically synthesized and placed onto a microarray slide as described in Example 3. Each oligomer corresponds to a particular sequence of a putative novel gene showing wood-preferred expression and to no other gene whose sequence is represented among the ESTs made from RNA isolated from trees or seedlings of the same genus.


Sample preparation and hybridization are carried out as in Example 4. The technique used in this example is more effective than use of a microarray using cDNA probes because the presence of a signal represents significant evidence of the expression of a particular gene, rather than of any of a number of genes that may contain similarities to the cDNA due to conserved functional domains or common evolutionary history. Thus, it is possible to differentiate homologous genes, such as those in the same family, but which may have different functions in phenotype determination.


Thus hybridization data, gained using the method of Example 4, enable the user to identify which of the putative novel genes actually has a pattern of coordinate expression with known genes, a pattern of expression consistent with a particular developmental role, and/or a pattern of expression that suggests that the gene has a promoter that drives expression in a valuable way.


The hybridization data thus obtained using this method can be used, for example, to identify a putative novel gene that shows an expression pattern particular to the tracheids with the lowest cellulose microfibril angle in developing spring wood (early wood). The promoter of this gene can also be isolated as in Example 8, and operably linked to a gene that has been shown as in Example 9 to be associated with late wood (summer wood). Transgenic pine plants containing this construct are generated using the methods of Example 9, and the early wood of these plants is then shown to display several characteristics of late wood, such as higher microfibril angle, higher density, smaller average lumen size, etc.


Example 13

Example 13 demonstrates the use of a cambium-specific promoter functionally linked to a cell cycle gene for increased plant biomass.


Cambium-specific cell cycle transcripts are identified via array analyses of different secondary vasculature layers as described in Example 4. Candidate promoters linked to the genes corresponding to these transcripts are cloned from pine genomic DNA using, e.g., the BD Clontech GenomeWalker kit and tested in transgenic tobacco via a reporter assay(s) for cambium specificity/preference. The cambium-specific promoter overexpressing a cell cycle gene involved in secondary xylem cell division is used to increase wood biomass. A tandem cambium-specific promoter is constructed driving the cell cycle ORF. Boosted transcript levels of the candidate cell cycle gene result in an increased xylem biomass phenotype.


Example 14

Isolation and Characterization of cDNA Clones from Eucalyptus grandis



Eucalyptus grandis cDNA expression libraries were prepared from mature shoot buds, early wood phloem, floral tissue, leaf tissue (two independent libraries), feeder roots, structural roots, xylem or early wood xylem and were constructed and screened as follows.


Total RNA was extracted from the plant tissue using the protocol of Chang et al. (Plant Molecular Biology Reporter 11:113-116 (1993)). mRNA was isolated from the total RNA preparation using either a Poly(A) Quik mRNA Isolation Kit (Stratagene, La Jolla, Calif.) or Dynal Beads Oligo (dT)25 (Dynal, Skogen, Norway). A cDNA expression library was constructed from the purified mRNA by reverse transcriptase synthesis followed by insertion of the resulting cDNA clones in Lambda ZAP using a ZAP Express cDNA Synthesis Kit (Stratagene), according to the manufacturer's protocol. The resulting cDNAs were packaged using a Gigapack II Packaging Extract (Stratagene) using an aliquot (1-5 μl) from the 5 μl ligation reaction dependent upon the library. Mass excision of the library was done using XL1-Blue MRF′ cells and XLOLR cells (Stratagene) with ExAssist helper phage (Stratagene). The excised phagemids were diluted with NZY broth (Gibco BRL, Gaithersburg, Md.) and plated out onto LB-kanamycin agar plates containing X-gal and isopropylthio-beta-galactoside (IPTG).


Of the colonies plated and selected for DNA miniprep, 99% contained an insert suitable for sequencing. Positive colonies were cultured in NZY broth with kanamycin and cDNA was purified by means of alkaline lysis and polyethylene glycol (PEG) precipitation. Agarose gel at 1% was used to screen sequencing templates for chromosomal contamination. Dye primer sequences were prepared using a Turbo Catalyst 800 machine (Perkin Elmer/Applied Biosystems Division, Foster City, Calif.) according to the manufacturer's protocol.


DNA sequence for positive clones was obtained using a Perkin Elmer/Applied Biosystems Division Prism 377 sequencer. cDNA clones were sequenced first from the 5′ end and, in some cases, also from the 3′ end. For some clones, internal sequence was obtained using either Exonuclease III deletion analysis, yielding a library of differentially sized subclones in pBK-CMV, or by direct sequencing using gene-specific primers designed to identified regions of the gene of interest.


The determined cDNA sequences were compared with known sequences in the EMBL database using the computer algorithms FASTA and/or BLASTN. Multiple alignments of redundant sequences were used to build reliable consensus sequences. Based on similarity to known sequences from other plant species, the isolated polynucleotide sequences were identified as encoding transcription factors, as detailed herein. The predicted polypeptide sequences corresponding to the polynucleotide sequences are also depicted therein.


Example 15

Isolation and Characterization of cDNA Clones from Pinus radiata



Pinus radiata cDNA expression libraries (prepared from either shoot bud tissue, suspension cultured cells, early wood phloem (two independent libraries), fascicle meristem tissue, male strobilus, root (unknown lineage), feeder roots, structural roots, female strobilus, cone primordia, female receptive cones and xylem (two independent libraries)) were constructed and screened as described above in Example 14.


DNA sequence for positive clones was obtained using forward and reverse primers on a Perkin Elmer/Applied Biosystems Division Prism 377 sequencer and the determined sequences were compared to known sequences in the database as described above.


Based on similarity to known sequences from other plant species, the isolated polynucleotide sequences were identified as encoding transcription factors, as detailed herein. The predicted polypeptide sequences corresponding to the polynucleotide sequences are also depicted therein.


Example 16

5′ RACE Isolation


To identify additional sequence 5′ or 3′ of a partial cDNA sequence in a cDNA library, 5′ and 3′ rapid amplification of cDNA ends (RACE) was performed using the SMART RACE cDNA amplification kit (Clontech Laboratories, Palo Alto, Calif.). Generally, the method entailed first isolating poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, and then ligating the SMART RACE adaptor to the cDNA to form a library of adaptor-ligated ds cDNA. Gene-specific primers were designed to be used along with adaptor specific primers for both 5′ and 3′ RACE reactions. Using 5′ and 3′ RACE reactions, 5′ and 3′ RACE fragments were obtained, sequenced, and cloned. The process may be repeated until the 5′ and 3′ ends of the full-length gene are identified. A full-length cDNA may be generated by end-to-end PCR using primers specific to the 5′ and 3′ ends of the gene.


For example, to amplify the missing 5′ region of a gene from first-strand cDNA, a primer was designed 5′→3′ from the opposite strand of the template sequence, and from the region between ˜100-200 bp of the template sequence. A successful amplification should give an overlap of ˜100 bp of DNA sequence between the 5′ end of the template and PCR product.
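

The primer design rule described above can be sketched as taking a window of the known partial sequence and writing its reverse complement 5′→3′. In the Python sketch below, the window start (100), primer length (25 nt), and template are hypothetical choices made for illustration; the text above specifies only that the primer is taken from the region between approximately 100 and 200 bp of the template.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def five_prime_race_primer(template, start=100, length=25):
    """Gene-specific primer for 5' RACE, written 5'->3' on the opposite strand.

    `template` is the known partial cDNA (sense strand, 5'->3'). The primer
    anneals to template[start:start + length] and extends toward the 5' end.
    """
    if start + length > len(template):
        raise ValueError("primer window runs past the end of the template")
    return reverse_complement(template[start:start + length])


def expected_overlap(start=100, length=25):
    """Known-sequence bases expected in the RACE product (primer included)."""
    return start + length


# Hypothetical 240-nt partial cDNA, used only to exercise the functions.
template = "ACGT" * 60
primer = five_prime_race_primer(template)
print(primer, expected_overlap())
```

Under these assumptions the product shares about 125 bp with the 5′ end of the template, which is in line with the overlap of roughly 100 bp mentioned above.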


RNA was extracted from four pine tissues, namely seedling, xylem, phloem and structural root, using the Concert Reagent Protocol (Invitrogen, Carlsbad, Calif.) and standard isolation and extraction procedures. The resulting RNA was then treated with DNase, using 10 U/μl DNase I (Roche Diagnostics, Basel, Switzerland). For 100 μg of RNA, 9 μl 10× DNase buffer (Invitrogen, Carlsbad, Calif.), 10 μl of Roche DNase I and 90 μl of RNase-free water were used. The RNA was then incubated at room temperature for 15 minutes and 1/10 volume 25 mM EDTA was added. An RNeasy mini kit (Qiagen, Venlo, The Netherlands) was used for RNA clean up according to the manufacturer's protocol.


To synthesize cDNA, the extracted RNA from xylem, phloem, seedling and root was used with the SMART RACE cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, Calif.) according to the manufacturer's protocol. For the RACE PCR, the cDNA from the four tissue types was combined. The master mix for PCR was created by combining equal volumes of cDNA from xylem, phloem, root and seedling tissues. PCR reactions were performed in 96 well PCR plates, with 1 μl of primer from the primer dilution plate (10 mM) added to the corresponding well positions. 49 μl of master mix was aliquoted into the PCR plate with primers. Thermal cycling commenced on a GeneAmp 9700 (Applied Biosystems, Foster City, Calif.) at the following parameters:


94° C. (5 sec),


72° C. (3 min), 5 cycles;


94° C. (5 sec),


70° C. (10 sec),


72° C. (3 min), 5 cycles;


94° C. (5 sec),


68° C. (10 sec),


72° C. (3 min), 25 cycles.
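

For bookkeeping, the cycling parameters above can be written as a small data structure from which the total cycle count and programmed run time follow. The Python sketch below assumes that the parameters group into three stages (a two-step stage of 5 cycles, then three-step stages of 5 and 25 cycles); the variable and function names are hypothetical, and ramp times between temperatures are not included.

```python
# Each stage: (list of (temperature in deg C, hold time in seconds), cycles).
RACE_PROGRAM = [
    ([(94, 5), (72, 180)], 5),
    ([(94, 5), (70, 10), (72, 180)], 5),
    ([(94, 5), (68, 10), (72, 180)], 25),
]


def total_cycles(program):
    """Total number of PCR cycles across all stages."""
    return sum(cycles for _, cycles in program)


def programmed_hold_seconds(program):
    """Sum of programmed hold times; ramping between steps is excluded."""
    return sum(cycles * sum(sec for _, sec in steps) for steps, cycles in program)


# Per-well reaction volume: 1 ul primer plus 49 ul master mix.
PRIMER_UL = 1
MASTER_MIX_UL = 49
reaction_ul = PRIMER_UL + MASTER_MIX_UL

print("cycles:", total_cycles(RACE_PROGRAM))
print("hold time (min):", round(programmed_hold_seconds(RACE_PROGRAM) / 60, 1))
print("reaction volume (ul):", reaction_ul)
```

The stepwise reduction of the annealing temperature from 72° C. to 68° C. follows the touchdown pattern commonly used for RACE reactions.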


cDNA was separated on an agarose gel following standard procedures. Gel fragments were excised and eluted from the gel by using the Qiagen 96-well Gel Elution kit, following the manufacturer's instructions.


PCR products were ligated into pGEMTeasy (Promega, Madison, Wis.) in a 96 well plate overnight according to the following specifications: 60-80 ng of DNA, 5 μl 2× rapid ligation buffer, 0.5 μl pGEMT easy vector, 0.1 μl DNA ligase, filled to 10 μl with water, and incubated overnight.


Each clone was transformed into E. coli following standard procedures and DNA was extracted from 12 clones picked by following standard protocols. DNA extraction and DNA quality were verified on a 1% agarose gel. The presence of the correct size insert in each of the clones was determined by restriction digests, using the restriction endonuclease EcoRI, and gel electrophoresis, following standard laboratory procedures.
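

Expected fragment sizes for such a diagnostic digest can be predicted from a clone's sequence by locating EcoRI recognition sites (GAATTC, cut after the first G) on the circular plasmid. The Python sketch below is a minimal illustration; the function name and toy sequences are hypothetical, and the sequences are not those of any actual vector or insert.

```python
def ecori_fragments(plasmid):
    """Sorted fragment lengths from an EcoRI digest of a circular plasmid.

    `plasmid` is an uppercase DNA string read once around the circle. Sites
    that span the arbitrary origin of the string are also detected. An uncut
    plasmid is reported as a single molecule of full length.
    """
    site = "GAATTC"
    n = len(plasmid)
    # Append the first bases so that a site wrapping the origin is found.
    extended = plasmid + plasmid[:len(site) - 1]
    cuts = sorted((i + 1) % n for i in range(n) if extended.startswith(site, i))
    if not cuts:
        return [n]
    return sorted(
        (cuts[(k + 1) % len(cuts)] - c) % n or n for k, c in enumerate(cuts)
    )


# Toy circular sequence with two EcoRI sites flanking a 10-nt "insert".
toy_plasmid = "GAATTC" + "A" * 10 + "GAATTC" + "C" * 20
print(ecori_fragments(toy_plasmid))
```

When EcoRI sites flank the cloning site, the smaller predicted fragment corresponds to the insert plus the short flanking vector sequence, so its size on the gel indicates whether an insert of the expected length is present.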


Example 17

Curation of an EST Sequence.


During the production of cDNA libraries, the original transcripts or their DNA counterparts may have features that prevent them from coding for functional proteins. There may be insertions, deletions, base substitutions, or unspliced or improperly spliced introns. If such features exist, it is often possible to identify them so that they can be changed. Similar curation can be performed on any other sequences that have homology to sequences in the public databases.


After determination of the DNA sequence, BLAST analysis shows that it is related to an Arabidopsis gene on the publicly available Arabidopsis genome sequence. However, instead of coding for an approximately 240 amino acid polypeptide, the consensus being curated is predicted to code for a product of only 157 amino acid residues, suggesting an error in the DNA sequence. To identify where the genuine coding region might be, the DNA sequence to the end of each EST is translated in each of the three reading frames and the predicted sequences are aligned with the Arabidopsis gene's amino acid sequence. It is found that the DNA segment in one portion of the EST codes for a sequence with similarity to the carboxyl terminus of the Arabidopsis gene. Therefore, it appears that an unspliced intron is present in the EST.
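

The three-frame translation step described above can be sketched as follows. This Python sketch uses the standard genetic code and translates only the forward strand; the function names and the short test sequence are hypothetical, and the subsequent alignment to the reference protein is not shown.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate an uppercase DNA string from `frame` (0, 1 or 2).

    Stop codons are shown as '*'; codons with unknown bases as 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def three_frame_translations(seq):
    """Translations of the forward strand in reading frames 0, 1 and 2."""
    return [translate(seq.upper(), frame) for frame in range(3)]


for frame, protein in enumerate(three_frame_translations("ATGGCCTAA")):
    print(frame, protein)
```

A shift of the region of similarity from one frame to another along the sequence, or an internal stretch with no similarity, would point to the approximate position of a frameshift or retained intron.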


Unspliced introns are a relatively minor issue with regard to use of a cloned sequence for overexpression of the gene of interest. The RNA resulting from transcription of the cDNA can be expected to undergo normal processing to remove the intron. Antisense and RNAi constructs are also expected to function to suppress the gene of interest. On other occasions, it may be desirable to identify the precise limits of the intron so that it can be removed. When the sequence in question has a published sequence that is highly similar, it may be possible to find the intron by aligning the two sequences and identifying the locations where the sequence identity falls off, aided by the knowledge that introns start with the sequence GT and end with the sequence AG.
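

The GT...AG rule mentioned above can be used to enumerate candidate intron boundaries near the point where similarity to the published sequence falls off. The Python sketch below lists every span that begins with GT, ends with AG, and has a length within a chosen range; the function names, length limits, and toy sequence are hypothetical, and candidates would still need to be confirmed by alignment or experiment.

```python
def gt_ag_candidates(seq, min_len, max_len):
    """List (start, end) spans that start with GT and end with AG.

    Coordinates are 0-based and end-exclusive, so seq[start:end] is the
    candidate intron. Only spans with min_len <= length <= max_len are kept.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptor_ends = [j + 2 for j in range(len(seq) - 1) if seq[j:j + 2] == "AG"]
    return [
        (i, j)
        for i in donors
        for j in acceptor_ends
        if min_len <= j - i <= max_len
    ]


def splice_out(seq, start, end):
    """Remove seq[start:end] and join the flanking sequence."""
    return seq[:start] + seq[end:]


# Toy sequence: two short "exons" interrupted by an 11-nt GT...AG segment.
toy = "ATGGCC" + "GTAAATTTCAG" + "GATTAA"
candidates = gt_ag_candidates(toy, min_len=8, max_len=12)
print(candidates)
print(splice_out(toy, *candidates[0]))
```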


When there is some doubt about the site of the intron because highly similar sequences are not available, the intron location can be verified experimentally. For example, DNA oligomers can be synthesized flanking the region where the suspected intron is located. RNA from the source species, either Pinus or Eucalyptus, is isolated and used as a template to make cDNA using reverse transcriptase. The selected primers are then used in a PCR reaction to amplify the correctly spliced DNA segment (predicted size of approximately 350 bp smaller than the corresponding segment of the original consensus) from the population of cDNAs. The amplified segment is then subjected to sequence analysis and compared to the consensus sequence to identify the differences.


The same procedure can be used when an alternate splicing event (partial intron remaining, or partial loss of an exon) is suspected. When an EST has a small change, such as insertion or deletion of a small number of bases, computer analysis of the EST sequence can still indicate its location when a translation product of the wrong size is predicted or if there is an obvious frameshift. Verification of the true sequence is done by synthesis of primers, production of new cDNA, and PCR amplification as described above.


Example 18

Transformation of Populus deltoides with constructs containing cell cycle genes.


Constructs made as described in the preceding example and shown in Table 2 below were each inoculated into Agrobacterium cultures by standard techniques.


Table 2 identifies plasmid(s), genes, and Genesis ID numbers for constructions described in Example 17.


TABLE 2

Plasmid(s)    Gene                 Genesis ID
pGrw14        Cyclin A             prga001823
pGrw15        Cyclin A             prpe001264
pGrw16        Cyclin D             prxa004540
pGrw18        Cyclin D             prxl006271
pGrw19        Cyclin D             prpb019661
pGrw20        WEE1-like protein    prrd041233


Populus deltoides stock plant cultures were maintained on DKW medium (Driver and Kuniyuki, 1984, McGranahan et al. 1987, available commercially from Sigma/Aldrich) with 2.5 μM zeatin in a growth room with a 16 h photoperiod. For transformation, petioles were excised aseptically from the stock plants using a sharp scalpel blade, cut into 4-6 mm lengths, placed on DKW medium with 1 μg/ml BAP and 1 μg/ml NAA immediately after harvest, and incubated in a dark growth chamber (28 degrees C.) for 24 hours.



Agrobacterium cultures containing the desired constructs were grown to log phase, indicated by an OD600 of 0.8-1.0, then pelleted and resuspended in an equal volume of Agrobacterium Induction Medium (AIM), which contains Woody Plant Medium salts (Lloyd, G., and McCown, B., 1981. Woody plant medium. Proc. Intern. Plant Prop. Soc. 30:421, available commercially from Sigma/Aldrich), 5 g/L glucose and 0.6 g/L MES at pH 5.8, with the addition of 1 μl of a 100 mM stock solution of acetosyringone per ml of AIM. The pellet was resuspended by vortexing. The bacterial cells were incubated for an hour in this medium at 28 degrees C. in an environmental chamber, shaking at 100 rpm.
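

The working concentration of acetosyringone implied by the dilution above follows from a simple dilution calculation. The short Python sketch below is provided only as a check of that arithmetic; the function name is hypothetical.

```python
def final_concentration_uM(stock_mM, stock_ul, medium_ml):
    """Final concentration (micromolar) after adding stock to medium.

    The small volume of added stock is included in the total volume.
    """
    total_ul = medium_ml * 1000.0 + stock_ul
    return stock_mM * stock_ul / total_ul * 1000.0


# 1 ul of 100 mM acetosyringone stock per 1 ml of AIM.
conc = final_concentration_uM(stock_mM=100.0, stock_ul=1.0, medium_ml=1.0)
print(f"acetosyringone: about {conc:.0f} uM")
```

The induction medium therefore contains acetosyringone at approximately 100 μM.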


After the induction period, Populus deltoides explants were exposed to the Agrobacterium mixture for 15 minutes. The explants were then lightly blotted on sterile paper towels, replaced onto the same plant medium and cultured in the dark at 18-20 degrees C. After a three-day co-cultivation period, the explants were transferred to DKW medium in which the NAA concentration was reduced to 0.1 μg/ml and to which was added 400 mg/L timentin to eradicate the Agrobacterium.


After 4 days on eradication medium, explants were transferred to small magenta boxes containing the same medium supplemented with timentin (400 mg/L) as well as the selection agent geneticin (50 mg/L). Explants were transferred every two weeks to fresh selection medium. Calli that grew in the presence of selection were isolated and sub-cultured to fresh selection medium every three weeks. Calli were observed for the production of adventitious shoots.


Adventitious shoots were normally observed within two months from the initiation of transformation. These shoot clusters were transferred to DKW medium to which no NAA was added, and in which the BAP concentration was reduced to 0.5 ug.ml, for shoot elongation, typically for about 14 weeks. Elongated shoots were excised and transferred to BTM medium (Chalupa, Communicationes Instituti Forestalis Checosloveniae 13:7-39, 1983, available commercially from Sigma/Aldrich) at pH5.8, containing 20 g/l sucrose and 5 g/l activated charcoal. See Table 3 below.









TABLE 3
Rooting medium for Populus deltoides.

BTM-1 Media Components      mg/L
NH4NO3                      412
KNO3                        475
Ca(NO3)2•4H2O               640
CaCl2•2H2O                  440*
MgSO4•7H2O                  370
KH2PO4                      170
MnSO4•H2O                   2.3
ZnSO4•7H2O                  8.6
CuSO4•5H2O                  0.25
CoCl2•6H2O                  0.02
KI                          0.15
H3BO3                       6.2
Na2MoO4•2H2O                0.25
FeSO4•7H2O                  27.8
Na2EDTA•2H2O                37.3
Myo-inositol                100
Nicotinic acid              0.5
Pyridoxine HCl              0.5
Thiamine HCl                1
Glycine                     2
Sucrose                     20000
Activated Carbon            5000










After development of roots, typically four weeks, transgenic plants were propagated in the greenhouse by rooted cutting methods, or in vitro through axillary shoot induction for four weeks on DKW medium containing 11.4 uM zeatin, after which the multiplied shoots were separated and transferred to root induction medium. Rooted plants were transferred to soil for evaluation of growth in glasshouse and field conditions.


Example 19

Production of disproportionately large leaves mediated by ectopic expression of certain cyclin D genes


Approximately 100 explants of Populus deltoides per construct were transformed with pGRW16 and pGRW19, which contain genes that normally show preferred expression in the vasculature, driven by a constitutive promoter (the Pinus radiata superubiquitin promoter). Upon regeneration, many of the ramets of many of the translines were observed to have disproportionately large leaves relative to control plants. The leaves were both longer and broader than those of control plants.


Disproportionately large leaves could be a very useful early indicator of high growth potential. Large leaf size can be a function of either increased numbers of leaf cells or increased leaf cell size or both.


Example 20

Production of unusual vascular development mediated by ectopic expression of a cyclin D gene.


Approximately 100 explants of Populus deltoides per construct were transformed with pGRW18. Multiple transgenic lines regenerated from this experiment showed a very unique pleiotropic phenotype. Leaves of these transgenic lines symmetrically folded on both sides of the midrib down the entire length of the leaf. Many petioles of these lines spiraled, and in many cases turned 360 degrees, in a right-handed fashion towards the leaf. The stem showed some thickening and slight bending near the middle.


One ramet of the transgenic line TDL002534 showing these phenotypes was sacrificed to investigate these aberrancies at the tissue level. Transverse sections of a curling petiole stained with toluidine blue revealed retardation of vascular development, but the presence of additional vascular cylinders developing as indicated by the black arrows. The xylem and phloem within the vascular cylinders of the curling petiole appeared to be developmentally similar and spatially oriented correctly. Longitudinal sections of straight and curled petioles may offer an explanation for the spiraling phenomenon. Curled petioles showed more elongated cells on the outside turn of the curl and more compressed cells on the opposite side of the petiole.


Perhaps the most striking phenotype was identified in the leaves. As with the petioles, aberrant vascular development was noted, comprising additional forming vascular cylinders lateral to the larger midrib. In some sections almost fully-formed veins could be seen immediately adjacent to the midrib. In all instances where the folding phenotype was noted, this type of leaf configuration was associated with the phenotype.


The development of additional vascular cylinders in the space where normally a small number of vascular bundles or a single midrib are seen is indicative of unusual cell division activity at the level of early vascular development. Thus, this gene expressed under the control of a vascular-preferred promoter rather than a constitutive promoter could have utility in increasing cell division in later vascular development, creating additional wood.


Example 21

This example illustrates how polynucleotides important for wood development in P. radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.


Open pollinated trees of approximately 16 years of age are selected from plantation-grown sites, in the United States for loblolly pine, and in New Zealand for radiata pine. Trees are felled during the spring and summer seasons to compare the expression of genes associated with these different developmental stages of wood formation. Trees are felled individually and trunk sections are removed from the bottom area approximately one to two meters from the base and within one to two meters below the live crown. The section removed from the basal end of the trunk contains mature wood. The section removed from below the live crown contains juvenile wood. Samples collected during the spring season are termed earlywood or springwood, while samples collected during the summer season are considered latewood or summerwood (Larson et al., Gen. Tech. Rep. FPL-GTR-129. Madison, Wis.: U.S. Department of Agriculture, Forest Service, Forest Products Laboratory. p. 42).


Tissues are isolated from the trunk sections such that phloem, cambium, developing xylem, and maturing xylem are removed. These tissues are collected only from the current year's growth ring. Upon tissue removal in each case, the material is immediately plunged into liquid nitrogen to preserve the nucleic acids and other components. The bark is peeled from the section and phloem tissue removed from the inner face of the bark by scraping with a razor blade. Cambium tissue is isolated from the outer face of the peeled section by gentle scraping of the surface. Developing xylem and lignifying xylem are isolated by sequentially performing more vigorous scraping of the remaining tissue. Tissues are transferred from liquid nitrogen into containers for long term storage at −70° C. until RNA extraction and subsequent analysis is performed.


Example 22

This example illustrates a procedure for RNA extraction and purification, which is particularly useful for RNA obtained from conifer needle, xylem, cambium, and phloem.


Tissue is obtained from conifer needle, xylem, cambium or phloem. The tissue is frozen in liquid nitrogen and ground. The total RNA is extracted using Concert Plant RNA reagent (Invitrogen). The resulting RNA sample is extracted into phenol:chloroform and treated with DNase. The RNA is then incubated at 65° C. for 2 minutes followed by centrifugation at 4° C. for 30 minutes. Following centrifugation, the RNA is extracted into phenol at least 10 times to remove contaminants.


The RNA is further cleaned using RNeasy columns (Qiagen). The purified RNA is quantified using RiboGreen reagent (Molecular Probes) and purity assessed by gel electrophoresis.


RNA is then amplified using MessageAmp (Ambion). Aminoallyl-UTP and free UTP are added to the in vitro transcription of the purified RNA at a ratio of 4:1 aminoallyl-UTP-to-UTP. The aminoallyl-UTP is incorporated into the new RNA strand as it is transcribed. The amino-allyl group is then reacted with Cy dyes to attach the colorimetric label to the resulting amplified RNA using the Amersham procedure modified for use with RNA. Unincorporated dye is removed by ethanol precipitation. The labeled RNA is quantified spectrophotometrically (NanoDrop). The labeled RNA is fragmented by heating to 95° C. as described in Hughes et al., Nature Biotechnol. 19:342 (2001).


Example 23

This Example illustrates how genes important for wood development in P. radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.


Pine trees of the species P. radiata are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species P. radiata.


RNA is isolated and ESTs generated as described in the Example above or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in polysaccharide synthesis. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed using BLAST (NCBI) as follows.


Example 24

Eucalyptus in Silico Data

In silico gene expression can be used to determine the membership of the consensi EST libraries. For each library, a consensus is determined from the number of ESTs in any tissue class divided by the total number of ESTs in a class multiplied by 1000. These values provide a normalized value that is not biased by the extent of sequencing from a library. Several libraries were sampled for a consensus value, including reproductive, bud reproductive, bud vegetative, fruit, leaf, phloem, cambium, xylem, root, stem, sap vegetative, and whole plant libraries.


As shown below, a number of the inventive sequences exhibit vascular-preferred expression (more than 50% of the hits by these sequences if the databases were searched at random would be in libraries made from developing vascular tissue) and thus are likely to be involved in wood-related developmental processes. The data are shown in Table 12.
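The normalization and the vascular-preferred criterion described above can be sketched as follows. This is an illustrative reading only: the EST counts are hypothetical, the choice of phloem, cambium, and xylem as the "developing vascular tissue" classes is an assumption, and the 50% test is applied here to the normalized values rather than to raw hit counts.

```python
from typing import Dict

# Assumption: tissue classes treated as developing vascular tissue
VASCULAR_CLASSES = {"phloem", "cambium", "xylem"}


def normalized_frequency(hits: Dict[str, int],
                         class_totals: Dict[str, int]) -> Dict[str, float]:
    """ESTs of one consensus per tissue class, per 1000 ESTs sequenced in that class."""
    return {
        tissue: 1000.0 * hits.get(tissue, 0) / total
        for tissue, total in class_totals.items()
        if total > 0
    }


def vascular_fraction(norm: Dict[str, float]) -> float:
    """Share of the normalized signal that falls in vascular tissue classes."""
    total = sum(norm.values())
    if total == 0:
        return 0.0
    vascular = sum(v for t, v in norm.items() if t in VASCULAR_CLASSES)
    return vascular / total


# Hypothetical counts for a single consensus sequence
class_totals = {"leaf": 10000, "root": 5000, "xylem": 8000,
                "cambium": 4000, "phloem": 2000}
hits = {"leaf": 2, "xylem": 16, "cambium": 4}

norm = normalized_frequency(hits, class_totals)
fraction = vascular_fraction(norm)
is_vascular_preferred = fraction > 0.5
print(norm, round(fraction, 4), is_vascular_preferred)
```

Dividing by the size of each tissue class before comparing is what keeps a heavily sequenced library from dominating the result.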


Example 25

Pinus in Silico Data

In silico gene expression can be used to determine the membership of the consensi EST libraries. For each library, a consensus is determined from the number of ESTs in any tissue class divided by the total number of ESTs in a class multiplied by 1000. These values provide a normalized value that is not biased by the extent of sequencing from a library. Several libraries were sampled for a consensus value, including needles, phloem, cambium, xylem, root, stem, and whole plant libraries.


As shown below, a number of the inventive sequences exhibit vascular-preferred expression (more than 50% of the hits by these sequences if the databases were searched at random would be in libraries made from developing vascular tissue) and thus are likely to be involved in wood-related developmental processes. The data are shown in Table 13.


Example 26

Sequences that show hybridization in silico to ESTs made from samples containing developing wood, but that do not hybridize to ESTs from samples not containing developing wood are selected for further examination.


cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.


The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.


Example 27

This example illustrates how to detect expression of Pinus radiata genes of the instant application which are important in wood formation using an oligonucleotide microarray prepared as described above. This is an example of a balanced incomplete block designed experiment carried out using aRNA samples prepared from mature-phase phloem (P), cambium (C), expanding xylem found in a layer below the cambium (X1) and differentiating, lignifying xylem cells found deeper in the same growth ring (X2). In this example, cell cycle gene expression is compared among the four samples, namely P, C, X1, and X2.


In the summer, plants of the species Pinus radiata are felled and the bark of the main stem is immediately pulled gently away to reveal the phloem and xylem. The phloem and xylem are then peeled with a scalpel into separate containers of liquid nitrogen. Needles (leaves) and buds from the trees are also harvested with a scalpel into separate containers of liquid nitrogen. RNA is subsequently isolated from the frozen tissue samples as described in Example 1. Equal microgram quantities of total RNA are purified from each sample using RNeasy Mini columns (Qiagen, Valencia, Calif.) according to the manufacturer's instructions.


Amplification reactions are carried out for each of the P, C, X1, and X2 tissue samples. Amplification reactions are performed using Ambion's MessageAmp kit, a T7-based amplification procedure, following the manufacturer's instructions, except that labeled aaUTP is added to the reagent mix during the amplification step. aaUTP is incorporated into the resulting antisense RNA formed during this step. CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction as described in Example 1. Labeled amplified antisense RNAs are precipitated and washed, and then assayed for purity using a NanoDrop spectrophotometer. These labeled antisense RNAs, corresponding to the RNA isolated from the P, C, X1, and X2 tissue samples, constitute the sample nucleic acids, which are referred to as the P, C, X1, and X2 samples.


Normalization control samples of known nucleic acids are added to each sample in a dilution series of 500, 200, 100, 50, 25 and 10 pg/μl for quantitation of the signals. Positive controls corresponding to specific genes showing expression in all tissues of pine, such as housekeeping genes, are also added to the plant sample.
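One way the dilution series could be used for quantitation is as a standard curve relating signal to concentration. The sketch below is an assumption about how such a curve might be fitted, using a least-squares line on log-transformed values; the signal intensities are hypothetical and the function names are illustrative only.

```python
import math
from typing import List, Tuple

# Control concentrations stated in the text, in pg/ul
CONTROL_PG_PER_UL = [500, 200, 100, 50, 25, 10]


def fit_loglog(concs: List[float], signals: List[float]) -> Tuple[float, float]:
    """Least-squares fit of log10(signal) = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(signal: float, slope: float, intercept: float) -> float:
    """Invert the fitted line to estimate concentration from a signal."""
    return 10 ** ((math.log10(signal) - intercept) / slope)


# Hypothetical control signals, taken as exactly proportional to concentration
signals = [40.0 * c for c in CONTROL_PG_PER_UL]
slope, intercept = fit_loglog(CONTROL_PG_PER_UL, signals)
estimated = estimate_concentration(4000.0, slope, intercept)
print(round(slope, 3), round(estimated, 1))
```

Real control signals would deviate from a straight line near saturation and near background, so a fit of this kind would apply only within the range spanned by the series.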


Each of four microarray slides is incubated with 125 μL of a P, C, X1 or X2 sample under a coverslip at 42° C. for 16-18 hours. The arrays are washed in 1×SSC, 0.1% SDS for 10 minutes and then in 0.1×SSC, 0.1% SDS for 10 minutes and then allowed to dry.


The array slides are scanned using an Axon laser scanner and analyzed using GenePix Pro software. Data from the microarray slides are subjected to microarray data analysis using GenStat, SAS, or Spotfire software. Outliers are removed and ratiometric data for each of the datasets are normalized using a global normalization which employs a cubic spline fit applied to correct for differential dye bias and spatial effects. A second transformation is performed to fit control signal ratios to a mean log2=0 (i.e. 1:1 ratio). Normalized data are then subjected to a variance analysis.
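The second transformation can be pictured as a uniform shift of all log2 ratios so that the control spots average zero. The sketch below shows only that shift and assumes the cubic spline correction has already been applied elsewhere; the spot identifiers and ratio values are hypothetical.

```python
from typing import Dict, Iterable


def center_on_controls(log2_ratios: Dict[str, float],
                       control_ids: Iterable[str]) -> Dict[str, float]:
    """Shift every log2 ratio so that the control spots have a mean of 0."""
    controls = [log2_ratios[spot] for spot in control_ids]
    if not controls:
        raise ValueError("at least one control spot is required")
    offset = sum(controls) / len(controls)
    return {spot: value - offset for spot, value in log2_ratios.items()}


# Hypothetical log2 ratios for two control spots and two gene spots
ratios = {"ctrl_1": 0.30, "ctrl_2": 0.10, "gene_a": 1.20, "gene_b": -0.80}
centered = center_on_controls(ratios, ["ctrl_1", "ctrl_2"])
print(centered)
```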


Mean signal intensity for each signal at any given position on the microarray slide is determined for each of three of P, C, X1, and X2 sample microarray slides. This mean signal/probe position is compared to the signal at the same position on sample slide which was not used for calculating the mean. For example, a mean signal at a given position is determined for P, C, and X1 and the signal at that position in the X2 microarray slide is compared to the P, C, and X1 mean signal value.
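The leave-one-out comparison described above can be sketched as follows, using a log2 ratio so that a doubling corresponds to a value of 1. The signal values are hypothetical, and the use of a log2 scale is an assumption made for illustration rather than a statement of how the tabulated values were computed.

```python
import math
from typing import Dict, List

SAMPLES = ("P", "C", "X1", "X2")


def log2_vs_others(signals: Dict[str, float]) -> Dict[str, float]:
    """log2 of each sample's signal over the mean signal of the other three."""
    result = {}
    for sample in SAMPLES:
        others = [signals[s] for s in SAMPLES if s != sample]
        mean_others = sum(others) / len(others)
        result[sample] = math.log2(signals[sample] / mean_others)
    return result


def more_than_doubled(ratios: Dict[str, float]) -> List[str]:
    """Samples whose signal exceeds twice the mean of the other three."""
    return [sample for sample in SAMPLES if ratios[sample] > 1.0]


# Hypothetical signals for one probe position on the four slides
probe_signals = {"P": 900.0, "C": 300.0, "X1": 300.0, "X2": 300.0}
probe_ratios = log2_vs_others(probe_signals)
flagged = more_than_doubled(probe_ratios)
print(probe_ratios, flagged)
```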


Table 5 shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.














TABLE 5

Gene                      PvCX12    PvX12     CvX12
WD40 repeat protein A     −1.24     −0.88     −1.07
CDC2                      −1.09     −0.78     −0.92
CYCLIN                    −1.08     −1        −0.26
WD-40 repeat protein B    −1.01     −0.87     −0.42
CDC2                      −0.83     −0.49     −1.01

P = Phloem
C = Cambium
X1 = xylem layer-1
X2 = xylem layer-2
PvCX12 = Ratio of the signal for Phloem target versus mean signal for Cambium, Xylem1, and Xylem2 targets






The data show that WD40 repeat protein A encodes a WD40 repeat protein that is less highly expressed in cambium than in developing xylem, while WD40 repeat protein B encodes a WD40 repeat protein that is more highly expressed in phloem than in the other tissues.


Signal data are then verified with RT-PCR to confirm gene expression in the target tissue of the genes corresponding to the unique oligonucleotides in the probe.


Example 28

This example illustrates how RNAs of tissues from multiple pine species, in this case both P. radiata and loblolly pine P. taeda trees, are selected for analysis of the pattern of gene expression associated with wood development in the juvenile wood and mature wood forming sections of the trees using the microarrays derived from P. radiata cDNA sequences described in Example 4.


Open pollinated trees of approximately 16 years of age are selected from plantation-grown sites, in the United States for loblolly pine, and in New Zealand for radiata pine. Trees are felled during the spring and summer seasons to compare the expression of genes associated with these different developmental stages of wood formation. Trees are felled individually and trunk sections are removed from the bottom area approximately one to two meters from the base and within one to two meters below the live crown. The section removed from the basal end of the trunk contains mature wood. The section removed from below the live crown contains juvenile wood. Samples collected during the spring season are termed earlywood or springwood, while samples collected during the summer season are considered latewood or summerwood. Larson et al., Gen. Tech. Rep. FPL-GTR-129. Madison, Wis.: U.S. Department of Agriculture, Forest Service, Forest Products Laboratory. p. 42.


Tissues are isolated from the trunk sections such that phloem, cambium, developing xylem, and maturing xylem are removed. These tissues are collected only from the current year's growth ring. Upon tissue removal in each case, the material is immediately plunged into liquid nitrogen to preserve the nucleic acids and other components. The bark is peeled from the section and phloem tissue removed from the inner face of the bark by scraping with a razor blade. Cambium tissue is isolated from the outer face of the peeled section by gentle scraping of the surface. Developing xylem and lignifying xylem are isolated by sequentially performing more vigorous scraping of the remaining tissue. Tissues are transferred from liquid nitrogen into containers for long term storage at −70° C. until RNA extraction and subsequent analysis is performed.


Example 29

This example illustrates procedures alternative to those used in the example above for RNA extraction and purification, particularly useful for RNA obtained from a variety of tissues of woody plants, and a procedure for hybridization and data analysis using the arrays described in Example 4.


RNA is isolated according to the protocol of Chang et al., Plant Mol. Biol. Rep. 11:113. DNA is removed using DNase I (Invitrogen, Carlsbad, Calif.) according to the manufacturer's recommendations. The integrity of the RNA samples is determined using the Agilent 2100 Bioanalyzer (Agilent Technologies, USA).


10 μg of total RNA from each tissue is reverse transcribed into cDNA using known methods.


In the case of Pinus radiata phloem tissue, it can be difficult to extract sufficient amounts of total RNA for normal labelling procedures. Total RNA is extracted and treated as previously described and 100 ng of total RNA is amplified using the Ovation™ Nanosample RNA Amplification system from NuGEN™ (NuGEN, CA, USA). Similar amplification kits such as those manufactured by Ambion may alternatively be used. The amplified RNA is reverse transcribed into cDNA and labelled as described above.


Hybridization and stringency washes are performed using the protocol as described in the US Patent Application for “Methods and Kits for Labeling and Hybridizing cDNA for Microarray Analysis” (supra) at 42° C. The arrays (slides) are scanned using a ScanArray 4000 Microarray Analysis System (GSI Lumonics, Ottawa, ON, Canada). Raw, non-normalized intensity values are generated using QUANTARRAY software (GSI Lumonics, Ottawa, ON, Canada).


A fully balanced, incomplete block experimental design (Kerr and Churchill, Gen. Res. 123:123, 2001) is used in order to design an array experiment that would allow maximum statistical inferences from analyzed data.


Gene expression data are analyzed using the SAS® Microarray Solution software package (The SAS Institute, Cary, N.C., USA). Resulting data are then visualized using JMP® (The SAS Institute, Cary, N.C., USA).


Analysis done for this experiment is an ANOVA approach with mixed model specification (Wolfinger et al., J. Comp. Biol. 8:625-637). Two steps of linear mixed models are applied. The first one, normalization model, is applied for global normalization at slide-level. The second one, gene model, is applied for doing rigorous statistical inference on each gene. Both models are stated in Models (1) and (2).





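\[
\log_2(Y_{ijkls}) = \theta_{ij} + D_k + S_l + DS_{kl} + \omega_{ijkls} \tag{1}
\]

\[
R_{ijkls}^{(g)} = \mu_{ij}^{(g)} + D_k^{(g)} + S_l^{(g)} + DS_{kl}^{(g)} + SS_{ls}^{(g)} + \varepsilon_{ijkls}^{(g)} \tag{2}
\]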


Yijkls represents the intensity of the sth spot in the lth slide with the kth dye applying the jth treatment for the ith cell line. θij, Dk, Sl, and DSkl represent the mean effect of the jth treatment in the ith cell line, the kth dye effect, the lth slide random effect, and the random interaction effect of the kth dye in the lth slide. ωijkls is the stochastic error term. Rijkls(g) represents the residual of the gth gene from model (1). μij(g), Dk(g), Sl(g), and DSkl(g) play roles similar to θij, Dk, Sl, and DSkl except that they are specific to the gth gene. SSls(g) represents the spot-by-slide random effect for the gth gene. εijkls(g) represents the stochastic error term. All random terms are assumed to be normally distributed and mutually independent within each model.


According to the analysis described above, certain cDNAs, some of which are shown in Table 6 below, are found to be differentially expressed.












TABLE 6

Gene corresponding
to SEQ ID            Oligo ID          Gene_Family                                                       Expression
162                  Pra_000171_O_4    Peptidylprolyl isomerase                                          steady state RNA higher in xylem than cambium
164                  Pra_001480_O_3    Peptidylprolyl isomerase                                          steady state RNA lower in xylem than cambium
control              Pra_000218_O_2    RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE LARGE CHAIN (EC1.17.4.1).    steady state RNA lower in xylem than cambium
control              Pra_000193_O_2    PUTATIVE SURFACE PROTEIN.                                         steady state RNA lower in xylem than cambium









The involvement of these specific genes in wood development is inferred through the association of the up-regulation or down-regulation of genes to the particular stages of wood development. Both the spatial continuum of wood development across a section (phloem, cambium, developing xylem, maturing xylem) at a particular season and tree trunk position and the relationships of season and tree trunk position are considered when making associations of gene expression to the relevance in wood development.


Example 30

This example demonstrates how one can correlate polysaccharide gene expression with agronomically important wood phenotypes such as density, stiffness, strength, distance between branches, and spiral grain.


Mature clonally propagated pine trees are selected from among the progeny of known parent trees for superior growth characteristics and resistance to important fungal diseases. The bark is removed from a tangential section and the trees are examined for average wood density in the fifth annual ring at breast height, stiffness and strength of the wood, and spiral grain. The trees are also characterized by their height, mean distance between major branches, crown size, and forking.


To obtain seedling families that are segregating for major genes that affect density, stiffness, strength, distance between branches, spiral grain and other characteristics that may be linked to any of the genes affecting these characteristics, trees lacking common parents are chosen for specific crosses on the criterion that they exhibit the widest variation from each other with respect to the density, stiffness, strength, distance between branches, and spiral grain criteria. Thus, pollen from a tree exhibiting high density, low mean distance between major branches, and high spiral grain is used to pollinate cones from the unrelated plus tree among the selections exhibiting the lowest density, highest mean distance between major branches, and lowest spiral grain. It is useful to note that “plus trees” are crossed such that pollen from a plus tree exhibiting high density are used to pollinate developing cones from another plus tree exhibiting high density, for example, and pollen from a tree exhibiting low mean distance between major branches would be used to pollinate developing cones from another plus tree exhibiting low mean distance between major branches.


Seeds are collected from these controlled pollinations and grown such that the parental identity is maintained for each seed and used for vegetative propagation such that each genotype is represented by multiple ramets. Vegetative propagation is accomplished using micropropagation, hedging, or fascicle cuttings. Some ramets of each genotype are stored while vegetative propagules of each genotype are grown to sufficient size for establishment of a field planting. The genotypes are arrayed in a replicated design and grown under field conditions where the daily temperature and rainfall are measured and recorded.


The trees are measured at various ages to determine the expression and segregation of density, stiffness, strength, distance between branches, spiral grain, and any other observable characteristics that may be linked to any of the genes affecting these characteristics. Samples are harvested for characterization of cellulose content, lignin content, cellulose microfibril angle, density, strength, stiffness, tracheid morphology, ring width, and the like. RNA is then collected from replicated samples of trees showing divergent stiffness and density, or other characteristics, from genotypes that are otherwise as similar as possible in growth habit, in spring and fall so that early and late wood development is assayed. These samples are examined for gene expression similarly as described in above examples.
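The association between expression and a wood phenotype could be summarized, in the simplest case, by a correlation coefficient across genotypes. The sketch below computes a Pearson correlation; the expression values and densities are hypothetical, and the choice of Pearson correlation is an assumption, since the text does not name a statistic.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-genotype values: normalized log2 expression of one gene
# and mean wood density (kg/m^3) measured on ramets of the same genotype
expression = [0.2, 0.8, 1.1, 1.9, 2.4]
density = [380.0, 410.0, 430.0, 470.0, 500.0]

r = pearson(expression, density)
print(round(r, 3))
```

The same function could be applied to measurements of one trait taken on the same genotypes at two ages to obtain an age:age correlation.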









TABLE 7
Consensus ID Information.

Patent app    SEQ ID    Gene Family                              Consensus_ID           Expression
control                 Ribonucleoside-diphosphate reductase     pinusRadiata_000218    up in early spring xylem vs late summer xylem
Cell Cycle    168       Peptidylprolyl isomerase                 pinusRadiata_001692    up in juvenile developing wood vs mature developing xylem
control                 Nitrite transporter                      pinusRadiata_016801    up mature developing xylem vs juvenile cambium









Ramets of each genotype are compared to ramets of the same genotype at different ages to establish age:age correlations for these characteristics.


Example 31

This example demonstrates how responses to environmental conditions such as light and season alter plant phenotype and can be correlated to polysaccharide synthesis gene expression using microarrays. In particular, the changes in gene expression associated with wood density are examined.


Trees of three different clonally propagated E. grandis hybrid genotypes are grown on a site with a weather station that measures daily temperatures and rainfall. During the spring and subsequent summer, genetically identical ramets of the three different genotypes are first photographed with north-south orientation marks, using photography at sufficient resolution to show bark characteristics of juvenile and mature portions of the plant, and then felled. The age of the trees is determined by planting records and confirmed by a count of the annual rings. In each of these trees, mature wood is defined as the outermost rings of the tree below breast height, and juvenile wood as the innermost rings of the tree above breast height. Each tree is accordingly sectored as follows:


NM—NORTHSIDE MATURE


SM—SOUTHSIDE MATURE


NT—NORTHSIDE TRANSITION


ST—SOUTHSIDE TRANSITION


NJ—NORTHSIDE JUVENILE


SJ—SOUTHSIDE JUVENILE


Tissue is harvested from the plant trunk as well as from juvenile and mature form leaves. Samples are prepared simultaneously for phenotype analysis, including plant morphology and biochemical characteristics, and gene expression analysis. The height and diameter of the tree at the point from which each sector was taken is recorded, and a soil sample from the base of the tree is taken for chemical assay. Samples prepared for gene expression analysis are weighed and placed into liquid nitrogen for subsequent preparation of RNA samples for use in the microarray experiment. The tissues are denoted as follows:


P—phloem


C—cambium


X1—expanding xylem


X2—differentiating and lignifying xylem


Thin slices in tangential and radial sections from each of the sectors of the trunk are fixed as described in Ruzin, PLANT MICROTECHNIQUE AND MICROSCOPY, Oxford University Press, Inc., New York, N.Y. (1999) for anatomical examination and confirmation of wood developmental stage. Microfibril angle is examined at the different developmental stages of the wood, for example juvenile, transition and mature phases of Eucalyptus grandis wood. Other characteristics examined are the ratio of fibers to vessel elements and ray tissue in each sector. Additionally, the samples are examined for characteristics that change between juvenile and mature wood and between spring wood and summer wood, such as fiber morphology, lumen size, and width of the S2 (thickest) cell wall layer. Samples are further examined for measurements of density in the fifth ring and determination of modulus of elasticity using techniques well known to those skilled in the art of wood assays. See, e.g., Wang, et al., Non-destructive Evaluations of Trees, EXPERIMENTAL TECHNIQUES, pp. 28-30 (2000).


For biochemical analysis, 50 grams from each of the harvest samples are freeze-dried and analyzed, using biochemical assays well known to those skilled in the art of plant biochemistry for quantities of simple sugars, amino acids, lipids, other extractives, lignin, and cellulose. See, e.g., Pettersen & Schwandt, J. Wood Chem. & Technol. 11:495 (1991).


In the present example, the phenotypes chosen for comparison are high density wood, average density wood, and low density wood. Nucleic acid samples are prepared as described in Example 3, from trees harvested in the spring and summer. Gene expression profiling by hybridization and data analysis is performed as described above.


Using similar techniques and clonally propagated individuals one can examine polysaccharide gene expression as it is related to other complex wood characteristics such as strength, stiffness and spirality.


Example 32

Example 32 demonstrates the use of a vascular-preferred promoter functionally linked to one of the genes of the instant application.


A vascular-preferred promoter is linked to one of the genes in the instant application and used to transform tree species. Boosted transcript levels of the candidate gene in the xylem of the transformants result in an increased xylem biomass phenotype.


In another example, a vascular-preferred promoter such as any of those in ArborGen's November 2003 patent applications is linked to an RNAi construct containing sequences from one of the genes in the instant application and used to transform a tree of the genus from which the gene was isolated. Reduced transcript levels of the candidate gene in the xylem of the transformants result in an increased xylem biomass phenotype.


Example 33

The vector pARB476 was developed using the following steps. The Bluescript vector (Stratagene, La Jolla, Calif.) was modified by adding the Superubiquitin 3′UTR and nos 3′terminator sequence at the KpnI and ClaI sites to produce the vector pARB005 (SEQ ID NO. 773). To this vector the P. radiata superubiquitin promoter with intron was added. The promoter/intron sequence was first amplified from the P. radiata superubiquitin sequence identified in U.S. Pat. No. 6,380,459 using standard PCR techniques and the primers of SEQ ID NOS 774 and 775. The amplified fragment was then ligated into pARB005 using XbaI and PstI restriction digestion to produce the vector pARB119 (SEQ ID NO. 776).
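The restriction sites carried by the two amplification primers can be checked directly from their sequences. The following is a minimal sketch, assuming the standard recognition sequences for XbaI, KpnI, NotI, PstI, and ClaI; the helper name find_sites is illustrative and not part of the original disclosure.

```python
# Recognition sequences of the enzymes named in this example (standard sites).
SITES = {
    "XbaI": "TCTAGA",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
    "PstI": "CTGCAG",
    "ClaI": "ATCGAT",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits

# Primer sequences as listed under SEQ ID 774 and SEQ ID 775.
PRIMER_774 = "AAATCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAA"
PRIMER_775 = "TTTCTGCAGCTTGAAATTGAAATATGACTAACGAAT"

print(find_sites(PRIMER_774))
print(find_sites(PRIMER_775))
```

Under these assumptions the primer of SEQ ID 774 carries the XbaI site and the primer of SEQ ID 775 carries the PstI site, which is consistent with ligating the amplified fragment into pARB005 by XbaI and PstI digestion.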


The Populus tremuloides UDP glucose binding domain gene (patent WO 0071670, ptCelA GenBank number AF072131) was amplified using standard PCR techniques and primers including an ATG and a ClaI site as part of the 5′ primer and a TGA and a ClaI site as part of the 3′ primer. The amplified fragment was then cloned into the ClaI site of pARB119 to produce the vector pARB476 (SEQ ID NO. 777).
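The primer layout described above (a ClaI site adjacent to the ATG in the 5′ primer, and a ClaI site adjacent to the TGA in the 3′ primer) can be sketched as follows. The actual primer sequences are not given in this example, so the annealing regions, the three-base 5′ clamp, and the function name clai_primers are all assumptions made for illustration; the annealing regions are taken from the insert junctions visible in SEQ ID 777.

```python
CLAI = "ATCGAT"  # ClaI recognition sequence (palindromic)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def clai_primers(insert_5prime, insert_3prime, clamp="AAA"):
    """Build forward and reverse primers that place a ClaI site next to ATG / TGA.

    insert_5prime: sense-strand bases at the start of the insert (must begin with ATG)
    insert_3prime: sense-strand bases at the end of the insert (must end with TGA)
    clamp: hypothetical extra 5' bases placed ahead of the restriction site
    """
    insert_5prime = insert_5prime.upper()
    insert_3prime = insert_3prime.upper()
    if not insert_5prime.startswith("ATG"):
        raise ValueError("5' annealing region must start with ATG")
    if not insert_3prime.endswith("TGA"):
        raise ValueError("3' annealing region must end with TGA")
    forward = clamp + CLAI + insert_5prime
    reverse = clamp + CLAI + reverse_complement(insert_3prime)
    return forward, reverse

# Annealing regions copied from the insert ends in SEQ ID 777 (illustrative lengths).
fwd, rev = clai_primers("ATGGATCAGTTCCCCAAG", "CATTGTGTACCCATGA")
print(fwd)
print(rev)
```

A product amplified with such primers would read ATCGAT-ATG at one end and TGA-ATCGAT at the other on the sense strand, matching the junctions seen in SEQ ID 777.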


The NotI cassette containing the P. radiata superubiquitin promoter with intron::UDP glucose binding domain::3′UTR::nos terminator from pARB476 was removed and cloned into the NotI site of pART29 to produce the vector pARB483. The binary vector pART29 is a modified pART27 vector (Gleave, Plant Mol. Biol. 20:1203-1207, 1992) that contains the Arabidopsis thaliana ubiquitin 3 (UBQ3) promoter instead of the nos 5′ promoter and no lacZ sequences.










SEQ ID 773









CGATGGGTGTTATTTGTGGATAATAAATTCGGGTGATGTTCAGTGTTTGTCGTATTTCTCACGAATAAA






TTGTGTTTATGTATGTGTTAGTGTTGTTTGTCTGTTTCAGACCCTCTTATGTTATATTTTTCTTTTCGT





CGGTCAGTTGAAGCCAATACTGGTGTCCTGGCCGGCACTGCAATACCATTTCGTTTAATATAAAGACTC





TGTTATCCGTGAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAATAAAGTTTCTTAAGATTGAATC





CTGTTGCCGGTCTTGCGATGATTATCATATAATTTCTGTTGAATTACGTTAAGCATGTAATAATTAACA





TGTAATGCATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACG





CGATAGAAAACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAG





ATCGCGGCCGCATTTAAATGGTACCCAATTCGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCG





TCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCC





CTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGA





ATGGCGAATGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA





CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCG





CCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACC





TCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTC





GCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACC





CTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGC





TGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTAGGTGGCACTTTTCG





GGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAG





ACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGT





CGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGT





AAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGAT





CCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGC





GGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTT





GGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGC





TGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCT





AACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGA





AGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATT





AACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGC





AGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCG





TGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACAC





GACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAA





GCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATT





TAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTT





CCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAAT





CTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAAC





TCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA





GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGT





GGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGC





GCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACT





GAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCC





GGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTA





TAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAG





CCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACAT





GTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGC





TCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAA





ACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGC





GGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTAT





GCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCA





TGATTACGCCAAGCGCGCAATTAACCCTCACTAAAGGGAACAAAAGCTGGGGCCGCTCTAGAACTAGTG





GATCCCCCGGGCTGCAGGAATTCGTCCAGCAGTTGTCTGGAGCTCCACCAGAAATCTGGAAGCTTAT











SEQ ID 774









AAATCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAA












SEQ ID 775









TTTCTGCAGCTTGAAATTGAAATATGACTAACGAAT












SEQ ID 776









tctagaggtaccatttaaatgcggccgcaaaacccctcacaaatacataaaaaaaattctttatttaat






tatcaaactctccactacctttcccaccaaccgttacaatcctgaatgttggaaaaaactaactacatt





gatataaaaaaactacattacttcctaaatcatatcaaaattgtataaatatatccactcaaaggagtc





tagaagatccacttggacaaattgcccatagttggaaagatgttcaccaagtcaacaagatttatcaat





ggaaaaatccatctaccaaacttactttcaagaaaatccaaggattatagagtaaaaaatctatgtatt





attaagtcaaaaagaaaaccaaagtgaacaaatattgatgtacaagtttgagaggataagacattggaa





tcgtctaaccaggaggcggaggaattccctagacagttaaaagtggccggaatcccggtaaaaaagatt





aaaatttttttgtagagggagtgcttgaatcatgttttttatgatggaaatagattcagcaccatcaaa





aacattcaggacacctaaaattttgaagtttaacaaaaataacttggatctacaaaaatccgtatcgga





ttttctctaaatataactagaattttcataactttcaaagcaactcctcccctaaccgtaaaacttttc





ctacttcaccgttaattacattccttaagagtagataaagaaataaagtaaataaaagtattcacaaac





caacaatttatttcttttatttacttaaaaaaacaaaaagtttatttattttacttaaatggcataatg





acatatcggagatccctcgaacgagaatcttttatctccctggttttgtattaaaaagtaatttattgt





ggggtccacgcggagttggaatcctacagacgcgctttacatacgtctcgagaagcgtgacggatgtgc





gaccggatgaccctgtataacccaccgacacagccagcgcacagtatacacgtgtcatttctctattgg





aaaatgtcgttgttatccccgctggtacgcaaccaccgatggtgacaggtcgtctgttgtcgtgtcgcg





tagcgggagaagggtctcatccaacgctattaaatactcgccttcaccgcgttacttctcatcttttct





cttgcgttgtataatcagtgcgatattctcagagagcttttcattcaaaggtatggagttttgaagggc





tttactcttaacatttgtttttctttgtaaattgttaatggtggtttctgtgggggaagaatcttttgc





caggtccttttgggtttcgcatgtttatttgggttatttttctcgactatggctgacattactagggct





ttcgtgctttcatctgtgttttcttcccttaataggtctgtctctctggaatatttaattttcgtatgt





aagttatgagtagtcgctgtttgtaataggctcttgtctgtaaaggtttcagcaggtgtttgcgtttta





ttgcgtcatgtgtttcagaaggcctttgcagattattgcgttgtactttaatattttgtctccaacctt





gttatagtttccctcctttgatctcacaggaaccctttcttctttgagcattttcttgtggcgttctgt





agtaatattttaattttgggcccgggttctgagggtaggtgattattcacagtgatgtgctttccctat





aaggtcctctatgtgtaagctgttagggtttgtgcgttactattgacatgtcacatgtcacatattttc





ttcctcttatccttcgaactgatggttctttttctaattcgtggattgctggtgccatattttatttct





attgcaactgtattttagggtgtctctttctttttgatttcttgttaatatttgtgttcaggttgtaac





tatgggttgctagggtgtctgccctcttcttttgtgcttctttcgcagaatctgtccgttggtctgtat





ttgggtgatgaattatttattccttgaagtatctgtctaattagcttgtgatgatgtgcaggtatattc





gttagtcatatttcaatttcaagcgatcccccgggctgcaggaattcgtccagcagttgtctggagctc





caccagaaatctggaagcttatcgatgggtgttatttgtggataataaattcgggtgatgttcagtgtt





tgtcgtatttctcacgaataaattgtgtttatgtatgtgttagtgttgtttgtctgtttcagaccctct





tatgttatatttttcttttcgtcggtcagttgaagccaatactggtgtcctggccggcactgcaatacc





atttcgtttaatataaagactctgttatccgtgagctcgaatttccccgatcgttcaaacatttggcaa





taaagtttcttaagattgaatcctgttgccggtcttgcgatgattatcatataatttctgttgaattac





gttaagcatgtaataattaacatgtaatgcatgacgttatttatgagatgggtttttatgattagagtc





ccgcaattatacatttaatacgcgatagaaaacaaaatatagcgcgcaaactaggataaattatcgcgc





gcggtgtcatctatgttactagatcgcggccgcatttaaatggtacccaattcgccctatagtgagtcg





tattacgcgcgctcactggccgtcgttttacaacgtcgtgactgggaaaaccctggcgttacccaactt





aatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgccct





tcccaacagttgcgcagcctgaatggcgaatgggacgcgccctgtagcggcgcattaagcgcggcgggt





gtggtggttacgcgcagcgtgaccgctacacttgccagcgccctagcgcccgctcctttcgctttcttc





ccttcctttctcgccacgttcgccggctttccccgtcaagctctaaatcgggggctccctttagggttc





cgatttagtgctttacggcacctcgaccccaaaaaacttgattagggtgatggttcacgtagtgggcca





tcgccctgatagacggtttttcgccctttgacgttggagtccacgttctttaatagtggactcttgttc





caaactggaacaacactcaaccctatctcggtctattcttttgatttataagggattttgccgatttcg





gcctattggttaaaaaatgagctgatttaacaaaaatttaacgcgaattttaacaaaatattaacgctt





acaatttaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacatt





caaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagta





tgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctc





acccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaac





tggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcactt





ttaaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaagagcaactcggtcgccgca





tacactattctcagaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatga





cagtaagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaa





cgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatc





gttgggaaccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatgg





caacaacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatagact





ggatggaggcggataaagttgcaggaccacttctgcgctcggcccttccggctggctggtttattgctg





ataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccct





cccgtatcgtagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctg





agataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatactttagattg





atttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaa





tcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgag





atcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtt





tgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaata





ctgtccttctagtgtagccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcg





ctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaa





gacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttgg





agcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaag





ggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccag





ggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgt





gatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggttcctggcct





tttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccg





cctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaag





cggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggcacg





acaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagctcactcattagg





caccccaggctttacactttatgcttccggctcgtatgttgtgtggaattgtgagcggataacaatttc





acacaggaaacagctatgaccatgattacgccaagcgcgcaattaaccctcactaaagggaacaaaagc





tggggccgctctag











SEQ ID 777









TCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAAAAAAAATTCTTTATTTAAT






TATCAAACTCTCCACTACCTTTCCCACCAACCGTTACAATCCTGAATGTTGGAAAAAACTAACTACATT





GATATAAAAAAACTACATTACTTCCTAAATCATATCAAAATTGTATAAATATATCCACTCAAAGGAGTC





TAGAAGATCCACTTGGACAAATTGCCCATAGTTGGAAAGATGTTCACCAAGTCAACAAGATTTATCAAT





GGAAAAATCCATCTACCAAACTTACTTTCAAGAAAATCCAAGGATTATAGAGTAAAAAATCTATGTATT





ATTAAGTCAAAAAGAAAACCAAAGTGAACAAATATTGATGTACAAGTTTGAGAGGATAAGACATTGGAA





TCGTCTAACCAGGAGGCGGAGGAATTCCCTAGACAGTTAAAAGTGGCCGGAATCCCGGTAAAAAAGATT





AAAATTTTTTTGTAGAGGGAGTGCTTGAATCATGTTTTTTATGATGGAAATAGATTCAGCACCATCAAA





AACATTCAGGACACCTAAAATTTTGAAGTTTAACAAAAATAACTTGGATCTACAAAAATCCGTATCGGA





TTTTCTCTAAATATAACTAGAATTTTCATAACTTTCAAAGCAACTCCTCCCCTAACCGTAAAACTTTTC





CTACTTCACCGTTAATTACATTCCTTAAGAGTAGATAAAGAAATAAAGTAAATAAAAGTATTCACAAAC





CAACAATTTATTTCTTTTATTTACTTAAAAAAACAAAAAGTTTATTTATTTTACTTAAATGGCATAATG





ACATATCGGAGATCCCTCGAACGAGAATCTTTTATCTCCCTGGTTTTGTATTAAAAAGTAATTTATTGT





GGGGTCCACGCGGAGTTGGAATCCTACAGACGCGCTTTACATACGTCTCGAGAAGCGTGACGGATGTGC





GACCGGATGACCCTGTATAACCCACCGACACAGCCAGCGCACAGTATACACGTGTCATTTCTCTATTGG





AAAATGTCGTTGTTATCCCCGCTGGTACGCAACCACCGATGGTGACAGGTCGTCTGTTGTCGTGTCGCG





TAGCGGGAGAAGGGTCTCATCCAACGCTATTAAATACTCGCCTTCACCGCGTTACTTCTCATCTTTTCT





CTTGCGTTGTATAATCAGTGCGATATTCTCAGAGAGCTTTTCATTCAAAGGTATGGAGTTTTGAAGGGC





TTTACTCTTAACATTTGTTTTTCTTTGTAAATTGTTAATGGTGGTTTCTGTGGGGGAAGAATCTTTTGC





CAGGTCCTTTTGGGTTTCGCATGTTTATTTGGGTTATTTTTCTCGACTATGGCTGACATTACTAGGGCT





TTCGTGCTTTCATCTGTGTTTTCTTCCCTTAATAGGTCTGTCTCTCTGGAATATTTAATTTTCGTATGT





AAGTTATGAGTAGTCGCTGTTTGTAATAGGCTCTTGTCTGTAAAGGTTTCAGCAGGTGTTTGCGTTTTA





TTGCGTCATGTGTTTCAGAAGGCCTTTGCAGATTATTGCGTTGTACTTTAATATTTTGTCTCCAACCTT





GTTATAGTTTCCCTCCTTTGATCTCACAGGAACCCTTTCTTCTTTGAGCATTTTCTTGTGGCGTTCTGT





AGTAATATTTTAATTTTGGGCCCGGGTTCTGAGGGTAGGTGATTATTCACAGTGATGTGCTTTCCCTAT





AAGGTCCTCTATGTGTAAGCTGTTAGGGTTTGTGCGTTACTATTGACATGTCACATGTCACATATTTTC





TTCCTCTTATCCTTCGAACTGATGGTTCTTTTTCTAATTCGTGGATTGCTGGTGCCATATTTTATTTCT





ATTGCAACTGTATTTTAGGGTGTCTCTTTCTTTTTGATTTCTTGTTAATATTTGTGTTCAGGTTGTAAC





TATGGGTTGCTAGGGTGTCTGCCCTCTTCTTTTGTGCTTCTTTCGCAGAATCTGTCCGTTGGTCTGTAT





TTGGGTGATGAATTATTTATTCCTTGAAGTATCTGTCTAATTAGCTTGTGATGATGTGCAGGTATATTC





GTTAGTCATATTTCAATTTCAAGCGATCCCCCGGGCTGCAGGAATTCGTCCAGCAGTTGTCTGGAGCTC





CACCAGAAATCTGGAAGCTTATCGATATGGATCAGTTCCCCAAGTGGAATCCTGTCAATAGAGAAACGT





ATATCGAAAGGCTGTCGGCAAGGTATGAAAGAGAGGGTGAGCCTTCTCAGCTTGCTGGTGTGGATTTTT





TCGTGAGTACTGTTGATCCGCTGAAGGAACCGCCATTGATCACTGCCAATACAGTCCTTTCCATCCTTG





CTGTGGACTATCCCGTCGATAAAGTCTCCTGCTACGTGTCTGATGATGGTGCAGCTATGCTTTCATTTG





AATCTCTTGTAGAAACAGCTGAGTTTGCAAGGAAGTGGGTTCCGTTCTGCAAAAAATTCTCAATTGAAC





CAAGAGCACCGGAGTTTTACTTCTCACAGAAAATTGATTACTTGAAAGACAAGGTTCAACCTTCTTTCG





TGAAAGAACGTAGAGCAATGAAAAGGGATTATGAAGAGTACAAAGTCCGAGTTAATGCCCTGGTAGCAA





AGGCTCAGAAAACACCTGAAGAAGGATGGACTATGCAAGATGGAACACCTTGGCCTGGGAATAACACAC





GTGATCACCCTGGCATGATTCAGGTCTTCCTTGGAAATACTGGAGCTCGTGACATTGAAGGAAATGAAC





TACCTCGTCTAGTATATGTCTCCAGGGAGAAGAGACCTGGCTACCAGCACCACAAAAAGGCTGGTGCAG





AAAATGCTCTGGTGAGAGTGTCTGCAGTACTCACAAATGCTCCCTACATCCTCAATGTTGATTGTGATC





ACTATGTAAACAATAGCAAGGCTGTTCGAGAGGCAATGTGCATCCTGATGGACCCACAAGTAGGTCGAG





ATGTATGCTATGTGCAGTTCCCTCAGAGGTTTGATGGCATAGATAAGAGTGATCGCTACGCCAATCGTA





ACGTAGTTTTCTTTGATGTTAACATGAAAGGGTTGGATGGCATTCAAGGACCAGTATACGTAGGAACTG





GTTGTGTTTTCAACAGGCAAGCACTTTACGGCTACGGGCCTCCTTCTATGCCCAGCTTACGCAAGAGAA





AGGATTCTTCATCCTGCTTCTCATGTTGCTGCCCCTCAAAGAAGAAGCCTGCTCAAGATCCAGCTGAGG





TATACAGAGATGCAAAAAGAGAGGATCTCAATGCTGCCATATTTAATCTTACAGAGATTGATAATTATG





ACGAGCATGAAAGGTCAATGCTGATCTCCCAGTTGAGCTTTGAGAAAACTTTTGGCTTATCTTCTGTCT





TCATTGAGTCTACACTAATGGAGAATGGAGGAGTACCCGAGTCTGCCAACTCACCAACACTCATCAAGG





AAGCAATTCATGTCATCGGCTGTGGCTATGAAGAGAAGACTGAATGGGGAAAAGAGATTGGTTGGATAT





ATGGGTCAGTCACTGAGGATATCTTAAGTGGCTTCAAGATGCACTGCCGAGGATGGAGATCAATTTACT





GCATGCCCGTAAGGCCTGCATTCAAAGGATCTGCACCCATCAACCTGTCTGATAGATTGCACCAGGTCC





TCCGATGGGCTCTTGGTTCTGTGGAAATTTTCTTTAGCAGACACTGTCCCCTCTGGTACGGGTTTGGAG





GAGGCCGTCTTAAATGGCTCCAAAGGCTTGCGTATATAAACACCATTGTGTACCCATGAATCGATGGGT





GTTATTTGTGGATAATAAATTCGGGTGATGTTCAGTGTTTGTCGTATTTCTCACGAATAAATTGTGTTT





ATGTATGTGTTAGTGTTGTTTGTCTGTTTCAGACCCTCTTATGTTATATTTTTCTTTTCGTCGGTCAGT





TGAAGCCAATACTGGTGTCCTGGCCGGCACTGCAATACCATTTCGTTTAATATAAAGACTCTGTTATCC





GTGAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAATAAAGTTTCTTAAGATTGAATCCTGTTGCC





GGTCTTGCGATGATTATCATATAATTTCTGTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGC





ATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACGCGATAGAA





AACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAGATCGCGGC





CGCATTTAAATGGTACCCAATTCGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCGTCGTTTTA





CAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCC





AGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA





TGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACA





CTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTT





CCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCC





AAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTG





ACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCG





GTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAA





CAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTAGGTGGCACTTTTCGGGGAAATG





TGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAAC





CCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTA





TTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATG





CTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGA





GTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTAT





CCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGT





ACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAA





CCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTT





TTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATAC





CAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCG





AACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCAC





TTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTC





GCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGA





GTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGT





AACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGA





TCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAG





CGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCT





TGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTC





CGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCC





ACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTG





CCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGT





CGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACC





TACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCG





GCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTG





TCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGA





AAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTC





CTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCA





GCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAAACCGCCTC





TCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTG





AGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGG





CTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACG





CCAAGCGCGCAATTAACCCTCACTAAAGGGAACAAAAGCTGGGGCCGCTCTAG













TABLE 8

pGrowth Information.

CW AR | Plasmid(s) | Promoter | Gene | Genesis ID
88 | pGrowth14 | SUBIN | Cyclin A | prga001823
88 | pGrowth15 | SUBIN | Cyclin A | prpe001264
88 | pGrowth16 | SUBIN | Cyclin D | prxa004540
88 | pGrowth18 | SUBIN | Cyclin D | prxl006271
88 | pGrowth19 | SUBIN | Cyclin D | prpb019661
88 | pGrowth20 | SUBIN | WEE1-like protein | prrd041233









To make the growth100 plasmids, an acceptor vector (pWVK202) was built by first inserting the NotI-SUBIN::UDPGBD::nos term-NotI cassette from pARB483a into plasmid pWVK147 at NotI. Next, the UDPGBD gene was removed using restriction sites PstI and ClaI. A polylinker containing the restriction sites PstI, NheI, AvrII, ScaI, and ClaI was inserted in place of the UDPGBD gene. Sites AvrII and NheI are both compatible with SpeI, a site found often in the plasmids provided by Genesis. ScaI is blunt, so any fragment can be blunted and then inserted at that position into the acceptor vector. Plasmids were received from Genesis and analyzed to determine which restriction sites would be most suitable for subcloning into the acceptor vector pWVK202. After the ligations were performed, the resulting products were checked by extensive restriction digest analysis to make sure that the desired plasmid had been created.
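The compatibility of AvrII and NheI ends with SpeI ends, and the blunt character of ScaI, follow from where each enzyme cuts within its palindromic recognition site. The following is a minimal sketch of that reasoning, assuming the standard recognition sequences and cut positions for these enzymes; the function names are illustrative only.

```python
# (recognition site, cut offset on the top strand) for palindromic enzymes.
ENZYMES = {
    "AvrII": ("CCTAGG", 1),  # C^CTAGG
    "NheI": ("GCTAGC", 1),   # G^CTAGC
    "SpeI": ("ACTAGT", 1),   # A^CTAGT
    "PstI": ("CTGCAG", 5),   # CTGCA^G
    "ClaI": ("ATCGAT", 2),   # AT^CGAT
    "ScaI": ("AGTACT", 3),   # AGT^ACT
}

def overhang(enzyme):
    """Return (end type, single-stranded overhang) left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # by symmetry, where the bottom-strand cut maps
    if cut < mirror:
        return ("5'", site[cut:mirror])
    if cut > mirror:
        return ("3'", site[mirror:cut])
    return ("blunt", "")

def compatible(a, b):
    """Ends can be ligated if they have the same polarity and the same
    (self-complementary) overhang; two blunt ends are also compatible."""
    return overhang(a) == overhang(b)

for name in ENZYMES:
    print(name, overhang(name))
print("AvrII/SpeI compatible:", compatible("AvrII", "SpeI"))
print("NheI/SpeI compatible:", compatible("NheI", "SpeI"))
```

AvrII, NheI, and SpeI all leave the same four-base 5′ CTAG overhang, so a SpeI fragment can be ligated into either site of the polylinker, although the ligated junction generally no longer matches any of the original recognition sites.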









TABLE 9

Eucalyptus grandis Cell Cycle Genes and Proteins.

Patent DNA SEQ ID NO | Patent Protein SEQ ID NO | Sequence Identifier | ORF start | ORF stop
1 | 236 | eucalyptusSpp_003910 | 387 | 1820
2 | 237 | eucalyptusSpp_019213 | 99 | 1007
3 | 238 | eucalyptusSpp_036800 | 120 | 1004
4 | 239 | eucalyptusSpp_040260 | 23 | 937
5 | 240 | eucalyptusSpp_041965 | 149 | 1033
6 | 241 | eucalyptusSpp_002906 | 199 | 1116
7 | 242 | eucalyptusSpp_001518 | 41 | 982
8 | 243 | eucalyptusSpp_008078 | 291 | 2042
9 | 244 | eucalyptusSpp_009826 | 107 | 2236
10 | 245 | eucalyptusSpp_010364 | 82 | 1749
11 | 246 | eucalyptusSpp_011523 | 151 | 1560
12 | 247 | eucalyptusSpp_024358 | 82 | 1644
13 | 248 | eucalyptusSpp_039125 | 626 | 2782
14 | 249 | eucalyptusSpp_005362 | 13 | 1467
15 | 250 | eucalyptusSpp_044857 | 113 | 1558
16 | 251 | eucalyptusSpp_001743 | 187 | 1686
17 | 252 | eucalyptusSpp_012405 | 238 | 1653
18 | 253 | eucalyptusSpp_003739 | 235 | 1539
19 | 254 | eucalyptusSpp_022338 | 158 | 1618
20 | 255 | eucalyptusSpp_028605 | 205 | 1530
21 | 256 | eucalyptusSpp_041006 | 174 | 1499
22 | 257 | eucalyptusSpp_006643 | 94 | 1332
23 | 258 | eucalyptusSpp_045338 | 176 | 1342
24 | 259 | eucalyptusSpp_046486 | 150 | 1283
25 | 260 | eucalyptusSpp_012070 | 101 | 367
26 | 261 | eucalyptusSpp_006617 | 9 | 1352
27 | 262 | eucalyptusSpp_007827 | 89 | 1486
28 | 263 | eucalyptusSpp_008036 | 80 | 1477
29 | 264 | 010212EGLA007017HT | 160 | 1062
30 | 265 | eucalyptusSpp_001596 | 172 | 1077
31 | 266 | eucalyptusSpp_005870 | 66 | 989
32 | 267 | eucalyptusSpp_006901 | 111 | 1541
33 | 268 | eucalyptusSpp_006902 | 116 | 1615
34 | 269 | eucalyptusSpp_007440 | 155 | 1453
35 | 270 | eucalyptusSpp_008994 | 228 | 2033
36 | 271 | eucalyptusSpp_024580 | 110 | 1258
37 | 272 | eucalyptusSpp_037831 | 50 | 1462
38 | 273 | eucalyptusSpp_034958 | 176 | 739
39 | 274 | 001209EGXC004488HT | 150 | 1529
40 | 275 | 010310EGXD012820HT | 247 | 1971
41 | 276 | 010310EGXD013036HT | 136 | 1644
42 | 277 | 010316EGXF999037HT | 48 | 836
43 | 278 | 010324EGXF002118HT | 49 | 822
44 | 279 | 011019EGKA001923HT | 185 | 751
45 | 280 | eucalyptusSpp_000966 | 103 | 621
46 | 281 | eucalyptusSpp_001037 | 41 | 559
47 | 282 | eucalyptusSpp_004603 | 127 | 693
48 | 283 | eucalyptusSpp_005465 | 28 | 639
49 | 284 | eucalyptusSpp_006571 | 135 | 812
50 | 285 | eucalyptusSpp_006786 | 119 | 613
51 | 286 | eucalyptusSpp_007057 | 38 | 562
52 | 287 | eucalyptusSpp_008670 | 109 | 1872
53 | 288 | eucalyptusSpp_009137 | 74 | 1159
54 | 289 | eucalyptusSpp_010285 | 54 | 2045
55 | 290 | eucalyptusSpp_010600 | 53 | 1879
56 | 291 | eucalyptusSpp_011551 | 7 | 690
57 | 292 | eucalyptusSpp_020743 | 83 | 601
58 | 293 | eucalyptusSpp_023739 | 125 | 535
59 | 294 | eucalyptusSpp_024103 | 55 | 573
60 | 295 | eucalyptusSpp_031985 | 147 | 842
61 | 296 | eucalyptusSpp_032025 | 167 | 487
62 | 297 | eucalyptusSpp_032173 | 195 | 890
63 | 298 | eucalyptusSpp_033340 | 68 | 586
64 | 299 | eucalyptusSpp_009143 | 182 | 3265
65 | 300 | eucalyptusSpp_000349 | 165 | 1145
66 | 301 | eucalyptusSpp_000575 | 529 | 1569
67 | 302 | eucalyptusSpp_000804 | 156 | 1136
68 | 303 | eucalyptusSpp_000805 | 90 | 1073
69 | 304 | eucalyptusSpp_000806 | 66 | 1049
70 | 305 | eucalyptusSpp_002248 | 277 | 1512
71 | 306 | eucalyptusSpp_003203 | 33 | 1076
72 | 307 | eucalyptusSpp_003209 | 65 | 973
73 | 308 | eucalyptusSpp_004429 | 82 | 1047
74 | 309 | eucalyptusSpp_004607 | 43 | 1101
75 | 310 | eucalyptusSpp_004682 | 142 | 1095
76 | 311 | eucalyptusSpp_005786 | 61 | 1257
77 | 312 | eucalyptusSpp_005887 | 193 | 1527
78 | 313 | eucalyptusSpp_005981 | 109 | 1155
79 | 314 | eucalyptusSpp_006766 | 71 | 1213
80 | 315 | eucalyptusSpp_006769 | 109 | 1785
81 | 316 | eucalyptusSpp_006907 | 364 | 2685
82 | 317 | eucalyptusSpp_007518 | 96 | 1412
83 | 318 | eucalyptusSpp_007717 | 116 | 1702
84 | 319 | eucalyptusSpp_007718 | 46 | 1101
85 | 320 | eucalyptusSpp_007741 | 23 | 1258
86 | 321 | eucalyptusSpp_007884 | 404 | 2644
87 | 322 | eucalyptusSpp_008258 | 107 | 2383
88 | 323 | eucalyptusSpp_008465 | 243 | 1625
89 | 324 | eucalyptusSpp_008616 | 126 | 1127
90 | 325 | eucalyptusSpp_008690 | 257 | 1390
91 | 326 | eucalyptusSpp_008708 | 178 | 1632
92 | 327 | eucalyptusSpp_008850 | 290 | 2917
93 | 328 | eucalyptusSpp_009072 | 148 | 1197
94 | 329 | eucalyptusSpp_009465 | 140 | 1567
95 | 330 | eucalyptusSpp_009472 | 376 | 1737
96 | 331 | eucalyptusSpp_009550 | 69 | 1010
97 | 332 | eucalyptusSpp_010284 | 149 | 1423
98 | 333 | eucalyptusSpp_010595 | 365 | 2677
99 | 334 | eucalyptusSpp_010657 | 24 | 923
100 | 335 | eucalyptusSpp_012636 | 221 | 3598
101 | 336 | eucalyptusSpp_012748 | 44 | 1447
102 | 337 | eucalyptusSpp_012879 | 196 | 1314
103 | 338 | eucalyptusSpp_015515 | 193 | 1668
104 | 339 | eucalyptusSpp_015724 | 78 | 1634
105 | 340 | eucalyptusSpp_016167 | 85 | 2826
106 | 341 | eucalyptusSpp_016633 | 74 | 1246
107 | 342 | eucalyptusSpp_017485 | 100 | 4377
108 | 343 | eucalyptusSpp_018007 | 58 | 2439
109 | 344 | eucalyptusSpp_020775 | 159 | 1064
110 | 345 | eucalyptusSpp_023132 | 118 | 1665
111 | 346 | eucalyptusSpp_023569 | 57 | 1628
112 | 347 | eucalyptusSpp_023611 | 250 | 1566
113 | 348 | eucalyptusSpp_024934 | 106 | 1434
114 | 349 | eucalyptusSpp_025546 | 190 | 1917
115 | 350 | eucalyptusSpp_030134 | 102 | 2942
116 | 351 | eucalyptusSpp_031787 | 75 | 1079
117 | 352 | eucalyptusSpp_034435 | 99 | 1148
118 | 353 | eucalyptusSpp_034452 | 232 | 1806
119 | 354 | eucalyptusSpp_035789 | 72 | 1124
120 | 355 | eucalyptusSpp_035804 | 315 | 2069
121 | 356 | eucalyptusSpp_043057 | 145 | 1968
122 | 357 | eucalyptusSpp_046741 | 130 | 1488
123 | 358 | eucalyptusSpp_047161 | 269 | 1693
698 | 718 | eucalyptusSpp_008994 | |
699 | 719 | eucalyptusSpp_009143 | |
700 | 720 | eucalyptusSpp_006366 | |
701 | 721 | eucalyptusSpp_006907 | |
702 | 722 | eucalyptusSpp_012636 | |
703 | 723 | eucalyptusSpp_015724 | |
704 | 724 | eucalyptusSpp_016167 | |
705 | 725 | eucalyptusSpp_017485 | |
706 | 726 | eucalyptusSpp_030134 | |
707 | 727 | eucalyptusSpp_046741 | |
708 | 728 | eucalyptusSpp_047161 | |
709 | 729 | eucalyptusSpp_017378 | |
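The ORF start and stop columns of Table 9 can be converted into an ORF length and an expected protein length. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the stop position includes the termination codon; neither convention is stated in the table, and the function name orf_summary is illustrative.

```python
def orf_summary(start, stop):
    """Return (ORF length in nucleotides, encoded protein length in residues).

    Assumes 1-based inclusive coordinates with the stop codon counted
    inside the ORF (an assumption, not stated in the table).
    """
    length_nt = stop - start + 1
    if length_nt <= 0 or length_nt % 3 != 0:
        raise ValueError("ORF length is not a positive multiple of three")
    codons = length_nt // 3
    return length_nt, codons - 1  # drop the stop codon

# (DNA SEQ ID NO, ORF start, ORF stop) for three rows of Table 9.
ROWS = [(1, 387, 1820), (2, 99, 1007), (25, 101, 367)]

for seq_id, start, stop in ROWS:
    nt, aa = orf_summary(start, stop)
    print(f"SEQ ID NO {seq_id}: {nt} nt, {aa} aa")
```

Under this reading, the first two rows give proteins of 477 and 302 residues, which appear to match the lengths of the first two peptides listed in Table 11.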
















TABLE 10

Pinus radiata cell cycle genes and proteins.

Patent DNA SEQ ID NO | Patent Protein SEQ ID NO | Sequence Identifier | ORF start | ORF stop
124 | 359 | pinusRadiata_001766 | 1163 | 2545
125 | 360 | pinusRadiata_002927 | 152 | 1582
126 | 361 | 990309PRCA009171HT | 389 | 1297
127 | 362 | pinusRadiata_013714 | 38 | 946
128 | 363 | pinusRadiata_016332 | 180 | 1088
129 | 364 | pinusRadiata_021677 | 40 | 948
130 | 365 | pinusRadiata_027562 | 229 | 1134
131 | 366 | pinusRadiata_001504 | 105 | 2642
132 | 367 | pinusRadiata_015211 | 187 | 2580
133 | 368 | pinusRadiata_020421 | 220 | 1749
134 | 369 | pinusRadiata_003187 | 438 | 1748
135 | 370 | pinusRadiata_015661 | 240 | 1631
136 | 371 | pinusRadiata_013874 | 252 | 1604
137 | 372 | pinusRadiata_014615 | 261 | 1817
138 | 373 | pinusRadiata_004578 | 167 | 1576
139 | 374 | pinusRadiata_023387 | 183 | 1598
140 | 375 | pinusRadiata_006970 | 98 | 1126
141 | 376 | pinusRadiata_010322 | 148 | 894
142 | 377 | pinusRadiata_022721 | 287 | 1363
143 | 378 | pinusRadiata_023407 | 251 | 1348
144 | 379 | pinusRadiata_001945 | 229 | 510
145 | 380 | pinusRadiata_008233 | 92 | 409
146 | 381 | pinusRadiata_008234 | 64 | 381
147 | 382 | pinusRadiata_022054 | 68 | 349
148 | 383 | pinusRadiata_012137 | 125 | 1849
149 | 384 | pinusRadiata_012582 | 70 | 1602
150 | 385 | pinusRadiata_015285 | 140 | 1465
151 | 386 | pinusRadiata_017229 | 628 | 2565
152 | 387 | pinusRadiata_020724 | 55 | 1818
153 | 388 | pinusRadiata_004555 | 259 | 1710
154 | 389 | pinusRadiata_004556 | 356 | 1807
155 | 390 | pinusRadiata_005729 | 261 | 1298
156 | 391 | pinusRadiata_007395 | 365 | 2251
157 | 392 | pinusRadiata_009503 | 156 | 1454
158 | 393 | pinusRadiata_011283 | 203 | 1348
159 | 394 | pinusRadiata_012322 | 229 | 1644
160 | 395 | pinusRadiata_018671 | 156 | 1454
161 | 396 | pinusRadiata_023236 | 27 | 2222
162 | 397 | pinusRadiata_000171 | 71 | 1759
163 | 398 | pinusRadiata_000172 | 358 | 2040
164 | 399 | pinusRadiata_001480 | 238 | 756
165 | 400 | pinusRadiata_001481 | 285 | 803
166 | 401 | pinusRadiata_001483 | 190 | 708
167 | 402 | pinusRadiata_001484 | 156 | 674
168 | 403 | pinusRadiata_001692 | 176 | 1912
169 | 404 | pinusRadiata_005313 | 64 | 765
170 | 405 | pinusRadiata_006362 | 93 | 881
171 | 406 | pinusRadiata_006493 | 372 | 1070
172 | 407 | pinusRadiata_006983 | 28 | 594
173 | 408 | pinusRadiata_006984 | 34 | 648
174 | 409 | pinusRadiata_007665 | 481 | 1611
175 | 410 | pinusRadiata_012196 | 93 | 584
176 | 411 | pinusRadiata_013382 | 250 | 1869
177 | 412 | pinusRadiata_016461 | 84 | 422
178 | 413 | pinusRadiata_017611 | 128 | 1213
179 | 414 | pinusRadiata_019776 | 265 | 837
180 | 415 | pinusRadiata_020659 | 38 | 781
181 | 416 | pinusRadiata_022559 | 38 | 526
182 | 417 | pinusRadiata_024188 | 37 | 1158
183 | 418 | pinusRadiata_027973 | 61 | 768
184 | 419 | pinusRadiata_001353 | 421 | 2172
185 | 420 | pinusRadiata_001978 | 163 | 1647
186 | 421 | pinusRadiata_002810 | 192 | 1172
187 | 422 | pinusRadiata_002811 | 131 | 1111
188 | 423 | pinusRadiata_002812 | 149 | 1726
189 | 424 | pinusRadiata_003514 | 948 | 2228
190 | 425 | pinusRadiata_004104 | 332 | 1465
191 | 426 | pinusRadiata_005595 | 232 | 1590
192 | 427 | pinusRadiata_005754 | 207 | 1550
193 | 428 | pinusRadiata_006463 | 221 | 1171
194 | 429 | pinusRadiata_006665 | 221 | 3679
195 | 430 | pinusRadiata_006750 | 269 | 1252
196 | 431 | pinusRadiata_007030 | 214 | 1242
197 | 432 | pinusRadiata_007854 | 119 | 2065
198 | 433 | pinusRadiata_007917 | 186 | 1550
199 | 434 | pinusRadiata_007989 | 244 | 3671
200 | 435 | pinusRadiata_008506 | 163 | 1431
201 | 436 | pinusRadiata_008692 | 155 | 1081
202 | 437 | pinusRadiata_008693 | 537 | 1463
203 | 438 | pinusRadiata_009170 | 284 | 1909
204 | 439 | pinusRadiata_009408 | 610 | 1659
205 | 440 | pinusRadiata_009522 | 241 | 1452
206 | 441 | pinusRadiata_009734 | 223 | 1173
207 | 442 | pinusRadiata_009815 | 251 | 1777
208 | 443 | pinusRadiata_010670 | 367 | 1419
209 | 444 | pinusRadiata_011297 | 284 | 1303
210 | 445 | pinusRadiata_013098 | 684 | 1784
211 | 446 | pinusRadiata_013172 | 336 | 2738
212 | 447 | pinusRadiata_013589 | 81 | 1622
213 | 448 | pinusRadiata_013608 | 399 | 1460
214 | 449 | pinusRadiata_014299 | 207 | 1673
215 | 450 | pinusRadiata_014498 | 263 | 1309
216 | 451 | pinusRadiata_014548 | 232 | 2529
217 | 452 | pinusRadiata_014610 | 56 | 2950
218 | 453 | pinusRadiata_015460 | 56 | 1234
219 | 454 | pinusRadiata_016090 | 193 | 2577
220 | 455 | pinusRadiata_016722 | 187 | 1233
221 | 456 | pinusRadiata_016785 | 51 | 1436
222 | 457 | pinusRadiata_017094 | 525 | 2351
223 | 458 | pinusRadiata_017527 | 152 | 1099
224 | 459 | pinusRadiata_017591 | 470 | 4114
225 | 460 | pinusRadiata_017769 | 196 | 2007
226 | 461 | pinusRadiata_018047 | 214 | 1323
227 | 462 | pinusRadiata_018414 | 68 | 2146
228 | 463 | pinusRadiata_018986 | 874 | 3705
229 | 464 | pinusRadiata_019479 | 360 | 1754
230 | 465 | pinusRadiata_020144 | 185 | 1384
231 | 466 | pinusRadiata_022480 | 241 | 1533
232 | 467 | pinusRadiata_023079 | 230 | 1435
233 | 468 | pinusRadiata_026739 | 101 | 2857
234 | 469 | pinusRadiata_026951 | 43 | 1548
235 | 470 | pinusRadiata_026529 | 206 | 1657
710 | 730 | pinusRadiata_000888 | |
711 | 731 | pinusRadiata_004578 | |
712 | 732 | pinusRadiata_007989 | |
713 | 733 | pinusRadiata_009522 | |
714 | 734 | pinusRadiata_014610 | |
715 | 735 | pinusRadiata_017591 | |
716 | 736 | pinusRadiata_017769 | |
717 | 737 | pinusRadiata_026951 | |
















TABLE 11







Annotated Peptide Sequences of the Present Invention.









Entry
Sequence Description
Annotated Peptide Sequence













1
The amino acid sequence of SEQ ID 261. The conserved eukaryotic protein kinase domain is underlined and the serine/threonine protein kinases active-site signature is in bold.
MGDGSLGSGGRGNSGGGGGGGSRPEWLQQYDLIGKIGEG
TYGLVFLARIKHPSTNRGKYIAIKKFKQSKDGDGVSPTA
IREIMLLREISHENVVKLVNVHINPVDMSLYLAFDYADH
DLYEIIRHHRDKVNQAINPYTVKSLLWQLLNGLNYLHSN
WIIHRDLKPSNILVMGEGEEQGVVKIADFGLARVYQAPL
KPLSDNGVVVTIWYRAPELLLGAKHYTSAVDMWAVGCIF
AELLTLKPLFQGQEVKANPNPFQLDQLDKIFKVLGHPTQ
EKWPMLVNLPHWQSDVQHIQRHKYDDNALGNVVRLSSKN
ATFDLLSKMLEYDPQKRITAAQALEHEYFRMEPLPGRNA
LVPSSPGDKVNYPTRPVDTTTDIEGTTSLQPSQSASSGN
AVPGNMPGPHVVTNRPMPRPMHMVGMQRVPASGMAGYNL
NPSGMGGGMNPSGIPMQRGVANQAQQSRRKDPGMGMGGY
PPQQKQRRF





2
The amino acid sequence of SEQ ID 262. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are in bold.
MEKYQQLAKIGEGTYGIVYKAKDKKSGELLALKKIRLEA
EDEGIPSTAIREISLLKQLQHPNIVRLYDVVHTEKKLTL
VFEFLDQDLKKYLDACGDNGLEPYTVKSFLYQLLQGIAF
CHEHRVLHRDLKPQNLLINMEGELKLADFGLARAFGIPV
RNYTHEVVTLWYRAPDVLMGSRKYSTQVDIWSVGCIFAE
MVNGRPLFPGSSEQDQLLRIFKTLGTPSLKTWPGMAELP
DFKDNFPKYVVQSFKKICPKKLDKTGLDLLSRMLQYDPA
KRISAEQAMGHPYFKDLKLRKPKAAGPGP






3
The amino acid sequence of SEQ ID
MDQYEKIEKIGEGTYGVVYKAIDRSTNKTIALKKIRLEQ



263. The conserved eukaryotic

EDEGVPSTAIREISLLKEMQHGNIVKLQDVVHSERRLYL




protein kinase domain is

VFEYLDLDLKKHMDSCPEFSKDTHTIKMFLYQILRGISY




underlined and the protein kinases

CHSHRVLHRDLKPQNLLLDRRTNSLKLADFGLARAFGIP




ATP-binding region and

VRTFTHEVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFA




serine/threonine protein kinases

EMVNRRPLFPGDSEIDELFKIFRIMGTPNEDSWPGVTSL




active-site signatures are in

PDFKSTFPKWASQDLKTVTPTVDPAGIDLLSKMLCMDPR




bold.

RRITAKVALEHEYFKDVGVIP






4
The amino acid sequence of SEQ ID
MVMKSKLDKYEKLEKLGEGTYGVVYKAQDKTTKEIYALK



264. The conserved eukaryotic


KIRLESEDEGIPSTAIREIALLKELQHPNVVRIHDVIHT





protein kinase domain is

NKKLILVFEFVDYDLKKFLHNFDKGIDPKIVKSLLYQLV




underlined and the protein kinases

RGVAHCHQQKVLHRDLKPQNLLVSQEGILKLGDFGLARA




ATP-binding region and

FGIPVKNYTNEVVTLWYRAPDILLGSKNYSTSVDIWSIG




serine/threonine protein kinases

CIFVEMLNQKPLFPGSSEQDQLKKIFKIMGTPDATKWPG




active-site signatures are in

IAELPDWKPENFEKYPGEPLNKVCPKMDPDGLDLLDKML




bold.

KCNPSERIAAKNAMSHPYFKDIPDNLKKLYN






5
The amino acid sequence of SEQ ID
MDQYEKVEKIGEGTYGVVYKAIDRLTNETIALKKIRLEQ



265. The conserved eukaryotic

EDEGVPSTAIREISLLKEMQHGNIVRLQDVVHSENRLYL




protein kinase domain is

VFEYLDLDLKKHMDSSPDFAKDPRLVKIFLYQILRGIAY




underlined and the protein kinases

CHSHRVLHRDLKPQNLLIDRRTNALKLADFGLARAFGIP




ATP-binding region and

VRTFTHEVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFA




serine/threonine protein kinases

EMVNQRPLFPGDSEIDELFKIFRILGTPNEDTWPGVTAL




active-site signatures are in

PDFKSAFPKWPAKNLQDMVPGLNSAGIDLLSKMLCLDPS




bold.

KRITARSALEHEYFKDIGFVP






6
The amino acid sequence of SEQ ID
MEKYEKLEKVGEGTYGKVYKAKDKATGQLVALKKTRLEM



266. The conserved eukaryotic

DEEGVPPTALREVSLLQLLSQSLYVVRLLSVEHVDGGSK




protein kinase domain is

RKAAAAAAAEGGGGEAHGGGAVGGGKPMLYLVFEYLDTD




underlined and the protein kinases

LKKFIDSHRKGPNPRPVPAATVQNFLYQLLKGVAHCHSH




ATP-binding region and

GVLHRDLKPQNLLVDKEKGILKIADLGLGRAFTVPLKSY




serine/threonine protein kinases

THEVFAFLAILLWRSEGESAADFDSXFRVSPVQVVTLWY




active-site signatures are in

RAPEVLLGSAHYSIGVDMWSVGCIFAEMVRRQALFPGDS




bold.

EFQQLLHIFRLLGTPTEKQWPGVTTLRDWHVYPQWEPQN






LARAVPSLGPDGVDLLSKMLKYDPAERISAKAALDHPFF





DSLDKSQF





7
The amino acid sequence of SEQ ID
MERPATAAVSAMEAFEKLEKVGEGTYGKVYRAREKATGK



267. The conserved eukaryotic


IVALKKTRLHEDEEGVPPTTLREISILRMLSRDPHIVRL





protein kinase domain is

MDVKQGQNKEGKTVLYLVFEYMETDLKKYIRGFRSSGES




underlined and the protein kinases

IPVNIVKSLMYQLCKGVAFCHGHGVLHRDLKPHNLLMDK




ATP-binding region and

KTLTLKIADLGLARAFTVPIKKYTHEILTLWYRAPEVLL




serine/threonine protein kinases

GATHYSTAVDMWSVGCIFAELVTKQALFPGDSELQQLLH




active-site signatures are in

IFRLLGTPNEKMWPGVSSLMNWHEYPQWKPQSLSTAVPN




bold.

LDKDGLDLLSQMLHYEPSRRISAKAAMEHPYFDDVNKTCL






8
The amino acid sequence of SEQ ID
MGCVLGREVSSGIVTESKGRDSSEVETSKRDDSVAAKVE



268. The conserved eukaryotic
GEGKAEEVRTEETQKKEKVEDDQQSREQRRRSKPSTKLG



protein kinase domain is
NLPKHIRGEQVAAGWPSWLSDICGEALNGWIPRRANTFE



underlined and the

KIDKIGQGTYSNVYKAKDLLTGKIVALKKVRFDNLEPES




serine/threonine protein kinases

VRFMAREILILRHLDHPNVVKLEGLVTSRMSCSLYLVFE




active-site signature is in bold.

YMEHDLAGLAASPAIKFTEPQVKCYMHQLLSGLEHCHNR






RVLHRDIKGSNLLIDNGGVLKIGDFGLASFYDPDHKHRM






TSRVVTLWYRPPELLLGANDYGVGIDLWSAGCILAELLA






GKPIMPGRTEVEQLHKIYKLCGSPSEEYWKKYKLPNATL






FKPREPYRRCIRETFKDFPPSSLPLIETLLAIDPAERGT






ATDALQSEFFRTEPYACEPSSLPQYPPSKEMDAKKRDDE





ARRLRAASKGQADGSKKERTRDRRVRAVPAPEANAELQH




NIDRRRLISHANAKSKSEKFPPPHQDGALGFPLGASHRF




DPAVVPPDVPFTSTSFTSSKEHDQTWSGPLVDPPGAPRR




KKHSAGGQRESSKLSMGTNKGRRADSHLKAYESKSIA





9
The amino acid sequence of SEQ ID
MYSKSSAVDDSRESPKDRVSSSRRLSEVKTSRLDSSRRE



269. The conserved eukaryotic
NGFRARDKVGDVSVMLIDKKVNGSARFCDDQIEKKSDRL



protein kinase domain is
QKQRRERAEAAAAADHPGAGRVPKAVEGEQVAAGWPVWL



underlined and the
SAVAGEAIKGWLPRRADTFEKLDKIGQGTYSSVYKARDV



serine/threonine protein kinase

TNNKIVALKRVRFDNLDTESVKFMAREIHILRMLDHPNV




active-site signature is in bold.

IKLEGLITSRMSCSLYLVFEYMEHDLTGLASRPDVKFSE






PQIKCYMKQLLSGLDHCHKHGVLHRDIKGSNLLIDNNGI






LKIADFGLASVFDPHQTAPLTSRVVTLWYRPPELLLGAS






RYGVEVDLWSTGCILGELYTGKPILPGKTEVEQLHKIFK






LCGSPSDDYWRRLHLPHAAVFKPPQPYRRCVAEIFKELP






PVALGLLETLISVDPSQRGTAAFALRSEFFTASPLPCDP





SSLPKYPPSKEIDMKLREEEARRRGAAGGKNELEKRGTK




DSRTNSAYYPNAGQLQVKQCHSNANGRSEIFGPYQEKTV




SGFLVAPPKQARVSKETRKDYAEQPDRASFSGPLVPGPG




FSKAGKELGHSITVSRNTNLSTLSSLVTSRTGDNKQKSG




PLVSESANQASRYSGPIREMEPARKQDRRSHVRTNIDYR




SREDGNSSTKEPALYGRGSAGNKIYVSGPLLVSSNNVDQ




MLKEHDRRIQEHARRARFDKARVGNNHPQAAVDSKLVSV




HDAG





10
The amino acid sequence of SEQ ID
MGCIPTIISDGRRRSAAPDKRRPRPRRSSSEGEAPPHAT



270. The conserved eukaryotic
AAGSEGGESARGAPGKERPEPAPRFVVRSPQGWPPWLVA



protein kinase domain is
AVGHAIGEFVPRCADSFRKLAKIGEGTYSNVYKARDLVT



underlined and the

GKTVALKKVRFDNLEAESIKFMAREILVLTRLNHPNVIK




serine/threonine protein kinase

LEGPVTSRMSSGLYLAFEYMEHDLSGIAARQNGKFTEPQ




active-site signature is in bold.

VKCFMRQLLSGLEHCHNHDVLHRDIKCSNLLIDNEGNLK






IADFGLATFYDPERKQVMTNRVVTLWYRAPELLLGATSY






GIGIDLWSAGCILAELLYGKPIMPGRTEVEQLHKIFKLC






GSPSEAYWNKFKLPNANIFKPPQPYARCIAETFKDFPPS






ALPLLETLLSIDPDERGTATTALNSEFFAAEPHACEPSS





LPKYPPSKEMDLKLIKEKTRRDSSKRPSAIHGSRRDGIH




DRAGRVIPAPEATAENQATLHRPRAMKKANPMSRSEKFP




PAHMDGVVGSSANAWLSGPASNAAPDSRRHRSLNQNPSS




SVGKASTGSSTTQETLKVAPELLQVGSSSLHPCHRMLVY




GSNLTIRSK





11
The amino acid sequence of SEQ ID
MGCICAKQADRGPASPGSGILTGAGTGTGTRSSKIPSGL



271. The conserved protein kinase
FEFEKSGVKEHGGRSGELRKLEEKGSLSKRLRLELGFSH



family domain is underlined, and
RYVEAEQAAAGWPSWLTAVAGDAIQGLVPLKADSFEKLE



the serine/threonine protein

KIGQGTYSSVFRARELANGRMVALKKVRFDNFQPESIQF




kinases active-site signature is

MAREISILRRLDHPNIMKLEGIITSRMSNSIYLVFEYME




in bold.

HDLYGLISSPQVKFSDAQVKCYMKQLLSGIEHCHQHGVI







HRDVKSSNILVNNEGILRIGDFGLANILNPKDRQQLTSH







VVTLWYRPPELLMGSTSYGVTVDLWSVGCVFAELMFRKP






ILRGRTEVEQLHKIFKLCGSPPDGYWKMCKVPQATMFRP






RHAYECTLRERCKGIATSAMKLMETFLSIEPHKRGTASS






ALISEYFRTVPYACDPSSLPKYPPNKEIDAKHREEARRK





KARSRVREAEVGKRPTRIHRASQEQGFSSNIAPKEKRSYA





12
The amino acid sequence of SEQ ID
MAVAAPGHLNVNESPSWGSRSVDCFEKLEQIGEGTYGQV



272. The conserved eukaryotic


YMAKEKKTGEIVALKKIRMDNEREGFPITAIREIKILKK





protein kinase domain is

LHHENVIKLKEIVTSPGPEKDEQGRPEGNKYKGGIYMVF




underlined and the protein kinases

EYMDHDLTGLADRPGMRFSVPQIKCYMRQLLTGLHYCHI




ATP-binding region and

NQVLHRDIKGSNLLIDNEGNLKLADFGLARSFSNDHNAN




serine/threonine protein kinases

LTNRVITLWYRPPELLLGATKYGPAVDMWSVGCIFAELL




active-site signatures are in

HGKPIFPGKDEPEQLNKIFELCGAPDEINWPGVSKIPWY




bold.

NNFKPTRPMKRRLREVFRHFDRHALELLERMLTLDPSQR






ISAKDALDAEYFWADPLPCDPKSLPKYESSHEFQTKKKR





QQQRQHEETAKRQKLQHPPQHPRLPPVQQSGQAHAQMRP




GPNQLMHGSQPPVATGPPGHHYGKPRGPSGGAGRYPSSG




NPGGGYNHPSRGGQGGSGGYNSGPYPPQGRAPPYGSSGM




PGAGPRGGGGNNYGVGPSNYPQGGGGPYGGSGAGRGSNM




MGGNRNQQYGWQQ





13
The amino acid sequence of SEQ ID
MGCICTKGILPAHYRIKDGGLKLSKSSKRSVGSLRRDEL



273. The conserved
AVSANGGGNDAADRLISSPHEVENEVEDRKNVDFNEKLS



serine/threonine protein kinase
KSLQRRATMDVASGGHTQAQLKVGKVGGFPLGERGAQVV



domain is underlined, and the
AGWPSWLTAVAGEAINGWVPRRADSFEKLEKIGQGTYSS



serine/threonine protein kinase

VYRARDLETNTIVALKKVRFANMDPESVRFMAREIIIMR




active-site signature is in bold.

KLDHPNVMKLEGLITSRVSGSLYLVFEYMDHDLAGLAAT






PSIKLTESQIKCYMQQLLRGLEYCHSHGVLHRDIKGSNL







LVDNNGNLKIGDFGLATFFRTNQKQPLTSRVVTLWYRPP







ELLLGSSDYGASVDLWSSGCILAELFAGKPIMPGRTEVE






QLHKIFKLCGSPSEEYWKKSKLPHATIFKPQQPYKRCLL






ETFKDFPSSALGLLDVLLAVEPECRGTASSALQNEFFTS





NPLPSDPSSLPKYPSSKEFDARLRDEEARKHKATAGKAR




GLESIRKGSKESKVVPTSNANADLKASIQKRQEQSNPRS




TGEKPGGTTQNNFILSGQSAKPSLNGSTQIGNANEVEAL




IVPDRELDSPRGGAELRRQRSFMQRRASQLSRFSNSVAV




GGDSHLDCSREKGANTQWRDEGFVARCSHPDGGELAGKH




DWSHHLLHRPISLFKKGGEHSRRDSIASYSPKKGRIHYS




GPLLPSGDNLDEMLKEHERQIQNAVRKARLDKVKTKREY




ADHGQTESLLCWANGR





14
The amino acid sequence of SEQ ID
MDPDPSPDPDPPKSWSIHTRREIIARYEILERVGSGAYS



274. The conserved protein kinase

DVYRGRRLSDGLAVALKEVHDYQSAFREIEALQILRGSP




family domain is underlined and

HVVLLHEYFWREDEDAVLVLEFLRSDLAAVIADASRRPR




the serine/threonine protein

DGGGGGAAALRAGEVKRWMLQVLEGVDACHRNSIVHRDL




kinases active-site signature is


KPGNLLISEEGVLKIADFGQARILLDDGNVAPDYEPESF





in bold.

EERSSEQADILQQPETMEADTTCPEGQEQGAITREAYLR






EVDEFKAKNPRHEIDKETSIFDGDTSCLATCTTSDIGED






PFKGSYVYGAEEAGEDAQGCLTSCVGTRWFRAPELLYGS






TDYGLEVDLWSLGCIFAELLTLEPLFPGISDIDQLSRIF






NVLGNLSEEVWPGCTKLPDYRTISFCKIENPIGLESCLP






NCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLP





VPISALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFG




PLKFTPTSTGFSIQFP





15
The amino acid sequence of SEQ ID
MDPDPSPSPDPPKSWSIHTRREIIARYEILERVGSGAYS



275. The conserved

DVYRGRRLSDGLAVALKEVHDYQSAFREIEALQILRGSP




serine/threonine protein kinase

HVVLLHEYFWREDEDAVLVLEFLRSDLAAVIADASRRPR




domain is underlined, and the

GGGVAPLRAGEGKRWMLQVLEGVDACHRNSIVHRDLKPG




serine/threonine protein kinase


NLLISEEGVLKIADFGQARILLDDGNVAPDYEPESFEER





active-site signature is in bold.

SSEQADILQQPETMEADTTCPEGQEQGAITREAYLREVD






EFKAKNPRHEIDKETSIYDGDTSCLATCTTSDIGEDPFK






GSYVYGAEEAGEDAQGSLTSCVGTRWFRAPELLYGSTDY






GLEVDLWSLGCIFAELLTLEPLFPGISDIDQLSRIFNVL






GNLSEEVWPGCTKLPDYRTISFCKIENPIGLESCLPNCS






SDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPI





SALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLK




FTPTSTGFSIQFP





16
The amino acid sequence of SEQ ID
MSNQHRRSSFSSSTTSSLAKRHASSSSSSLENAGKAFAA



276. The conserved cyclin and
AAVPSHLAKKRAPLGNLTNLKAGDGNSRSSSAPSTLVAN



cyclin C-terminal domains are
ATKLAKTRKGSSTSSSIMGLSGSALPRYASTKPSGVLPS



underlined and the cyclins
VNPSIPRIEIAVDPMSCSMVVSPSRSDMQSVSLDESMST



signature is in bold.
CESFKSPDVEYIDNEDVSAVDSIDRKTFSNLYISDAAAK




TAVNICERDVLMEMETDEKIVNVDDNYSDPQLCATIACD





IYQHLRASEAKKRPSTDFMDRVQKDITASMRAILIDWLV







EVAEEYRLVPDTLYLTVNYIDRYLSGNVMNRQRLQLLGV







ACMMIAAKYEEICAPQVEEFCYITDNTYFKEEVLQMESS






VLNYLKFEMTAPTVKCFLRRFVRAAQGVNEVPSLQLECM






ANYIAELSLLEYDMLCYAPSLVAASAIFLAKFVITPSKR






PWDPTLQHYTLYQPSDLGNCVKDLHRLCFNNHGSTLPAI






REKYSQHKYKYVAKKYCPPSIPPEFFHNLVY






17
The amino acid sequence of SEQ ID
MNKENAVGTKSEAPTIRITRSRSKALGTSTGMLPSSRPS



277. The conserved cyclin and
FKQEQKRTVRANAKRSASDENKGTMVGNASKQHKKRTVL



cyclin C-terminal domains are
NDVTNIFCENSYSNCLNAAKAQTSRQGRKWSMKKDRDVH



underlined.
QSGAVQIMQEDVQAQFVEESSKIKVAESMEITIPDKWAK




RENSEHSISMKDTVAESSRKPQEFICGEKSAALVQPSIV




DIDSKLEDPQACTPYALDIYNYKRSTELERRPSTIYMET





LQKDVTPNMRGILVDWLVEVSEEYKLVPDTLYLTVNLID






RSLSQKFIEKQRLQLLGVTCMLIASKYEEICPPRVEEFC






FITDNTYTSLEVLKMESRVLNLLHFQLSVPTVKTFLRRF






VQAAQVSSEVPSVELEYLANYLAELTLVEYSFLKFLPSL






MAASAVLLARWTLNQSDNPWNLTLEHYTKYKASELKAAV






LALEDLQLNTSGSTLNAIREKYRQQKVNYSLLIHSKANH





EIL





18
The amino acid sequence of SEQ ID
MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRAL



278. The conserved cyclin N- and
SNINSNIIGAPPYPCAVNKRVLSEKNVNSENDLLNAAHR



C-terminal family domains are
PITRQFAAQMAYKQQLRPEENKRTTQSVSNPSKSEDCAI



underlined and the cyclins
LDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDV



signature is in bold.
AEEPVTDIDSGDKENQLAVVEYIDDLYMFYQKAEASSCV





PPNYMDRQQDINERMRGILIDWLIEVHYKFELMDETLYL







TVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEVSVP







VVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPY






VFMRRFLKAAQSDKKLELLSFFIIELSLVEYDMLKFPPS






LLAASAIYTALSTITRTKQWSTTCEWHTSYSEEQLLECA






RLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFL





LDFRL





19
The amino acid sequence of SEQ ID
MASRPIVPVQARGEAAIGGGAGKAAIGGGAGKQQKKNGA



279. The conserved cyclin and
AEGRNRKALGDIGNLVTVRGIEGKVQPHRPITRSFCAQL



cyclin C-terminal domains are
LANAQAAAAAENNKKQAVVNVNGAPSILDVPGAGKRAEP



underlined.
AAAAAAAVAKAAQKKVVKPKQKAEVIDLTSDSEERSRPR




RSNNIMSLRRRKERNHREGICPLSLRSSLLEARLVDWLI





EIHNKFDLMPETLYLTINIIDRFLSVKAVPRRELQLLGM






GALFTASKYEEIWAPEVNDLVCIADRAYSHEQVLAMEKT






ILGKLEWTLTVPTHYVFLVRFIKASLGDRKLENMVYFLA






ELGVMNYATLTYCPSMVAASAVYAARCTLGLTPLWNDTL






KLHTGFSESQLMDCARLLVGYHAKAKENKLQVVYKKYSS






SQREGVALIPPAKALLCEGGGLSSSSSLASSS






20
The amino acid sequence of SEQ ID
MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRA



280. The conserved cyclin and
LSVINQNLVGDRAYPCHVVNKRGHSKRDAVCGKDQVDPV



cyclin C-terminal domains are
HRPLTRKFAAQTASTQQHCIEEAKKPRTAVQERNEFGDC



underlined and the cyclins
IFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMED



signature is in bold.
IVEEEEEEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTE





NCSCVSANYMAQQADINEKMRSILIDWLIEVHDKFDLMH







ETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLACKYE







EVSVPVVGDLILISDKAYTRKEVLEMESLMLNSLQFNMS






VPTPYVFMRRFLKAAESDKKLEVLSFFLIELSLVEYEMV






KFPPSLLAAAAIFTAQCTLYGFKQWTKTCEWHSNYTEDQ






LLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCE






PANFLLGEMKNP






21
The amino acid sequence of SEQ ID
MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRA



281. The conserved cyclin and
LSVINQNLVGDRAYPCHVVNKRGHSKRDAVCGKDQVDPV



cyclin C-terminal domains are
HRPLTRKFAAQTASTQQHCIEEAKKPRTAVQERNEFGDC



underlined and the cyclins
IFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMED



signature is in bold.
IVEEEEEEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTE





NCSCVSANYMAQQADINEKMRSILIDWLIEVHDKFDLMH







ETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLACKYE







EVSVPVVGDLILISDKAYTRKEVLEMEKLMLNSLQFNMS






VPTPYVFMRRFLKAAESDKKLEVLSFFLIELSLVEYEMV






KFPPSLLAAAAIFTAQCTLYGFKQWTKTCEWHSNYTEDQ






LLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCE






AANFLLGEMKNP






22
The amino acid sequence of SEQ ID
MAMVQRQGHDPSSPQEQEDGPSSFLSDDALYCEEGRFEE



282. The conserved cyclin N- and
DDGGGGGQVDGIPLFPSQPADRQQDSPWADEDGEEKEEE



C-terminal family domains are
EAELQSLFSKERGARPELAKDDGGAVAARREAVEWMLMV



underlined.

RGVYGFSALTAVLAVDYLDRFLAGFRLQRDNRPWMTQLV






AVACLALAAKVEETDVPLLVELQEVGDARYVFEAKTVQR






MELLVLSTLGWEMHPVTPLSFVHHVARRLGASPHHGEFT






HWAFLRRCERLLVAAVSDARSLKHLPSVLAAAAMLRVIE






EVEPFRSSEYKAQLLSALHMSQEMVEDCCRFILGIAETA






GDAVTSSLDSFLKRKRRCGHLSPRSPSGVIDASFSCDDE






SNDSWATDPPSDPDDNDDLNPLPKKSRSSSPSSSPSSVP





DKVLDLPFMNRIFEGIVNGSPI





23
The amino acid sequence of SEQ ID
MEASYQPHHHGHLRQHDPSSSQQEEQVPFDALYCSEEHW



283. The conserved cyclin and
GEEDEEEGLASDGLLSEERDHRLLSPRALLDQDLLWEDE



cyclin C-terminal domains are

ELASLFSKEEPGGMRLNLENDPSLADARREAVEWIMRVH




underlined.

AHYAFSALTALLAVNYWDRFTCSFALQEDKPWMTQLSAV






ACLSLAAKVEETQVPLLIDFQVEDSSPVFEAKNIQRMEL






LVLSSLEWKMNPVTPLSFLDYMTRRLGLTGHLCWEFLRR






CENVLLSVISDCRFTCYLPSVIAASTMLHVINGLKPRLD






VEDQTQLLGILAMGMDKIDACYKLIDDDHALRSQRYSHN






KRKFGSVPGSPRGVMELCFSSDGSNDSWSVAASVSSSPE





PHSKKSRAGEEAEDRLLRGLEGEEDDPASADIFSFPH





24
The amino acid sequence of SEQ ID
MALQEEDTRRHYPTAPPFSPDGLYCEDETFGEDLADNAC



284. The conserved cyclin and
EYAGGGARDGLCEIKDPTLPPSLLGQDLFWEDGELASLV



cyclin C-terminal domains are

SRETGTHPCWDELISDGSVALARKDAVGWILRVHGHYGF




underlined.

RPLTAMLAVNYLDRFFLSRSYQRDRPWISQLVAVACLSV






AAKVEETQVPILLDLQVANAKFVFESRTIQRMELLLMST






LDWRMNSVTPISFFDHILRRFGLTTNLHRQFFWMCERLL






LSVVADVRLASFLPSVVATAAMLYVNKEIEPCICSEFLD






QLLSLLKINEDRVNECYELILELSIDHPEILNYKHKRKR






GSVPSSPSGVIDTSFSCDSSNDSWGVASSVSSSLEPRFK





RSRFQDQQMGLPSVNVSSMGVLNSSY





25
The amino acid sequence of SEQ ID

MGQIQYSEKYFDDTYEYRHVVLPPDVAKLLPKNRLLSEN




285. The conserved cyclin-

EWRAIGVQQSRGWVHYAIHRPEPHIMLFRRPLNYQQQQE




dependent kinases regulatory
NQAQQNMLAK



subunit domain is underlined and



the cyclin-dependent kinases



regulatory subunits signature 1 is



in bold.





26
The amino acid sequence of SEQ ID
MGSIDPPKAEQNGTAAAAVADPGQKPGAGDAMPPPPPVK



286. The conserved chromo domain
HSNGTAAEPDVATKRRRMSVLPLEVGTRVMCRWRDGKYH



is underlined and the MOZ/SAS-like

PVKVIERRKLNPGDPNDYEYYVHYTEFNRRLDEWVKLEQ




protein domain is in bold/italics.

LDLNSVETVVDEKVEDKVTGLKMTRHQKRKIDETHVEGH





EELDAASLREHEEFTKVKNIATIELGRYEIETWYFSPFP




PEYNDCSKLYFCEFCLNFMKRKEQLQRHMKKCD






































PKVLDRHLKAAGRG





GLEVDVSKLIWTPYREQG





27
The amino acid sequence of SEQ ID
MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYLLQHMQVL



292. The conserved histone

KPVPARDRDLCRFHADDYVAFLRSITPETQQDQLRQLKR




deacetylase family domain is

FNVGEDCPVFDGLHSFCQTYAGGSVGGAVKLNHGLCDIA




underlined.

INWAGGLHHAKKCEASGFCYVNDIVLGILELLKQHERVL






YVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGTGD






IRDIGYGKGKYYSLNVPLDDGIDDESYHSLFKPIIGKVM






EVFKPGAVVLQCGADSLSGDRLGCFNLSIKGHAECVRYM






RSFNVPVLLLGGGGYTIRNVARCWCYETGVALGLEVDDK





MPQHEYYEYFGPDYTLHVAPSNMENKNSRQLLEEIRSKL




LENLSKLQHAPSVPFQERPPDTELPEADEDQEDPDERWD




PDSDMDVDEDRKPLPSRVKRELIVEPEVKDQDSQKASID




HGRGLDTTQEDNASIKVSDMNSMITDEQSVKMEQDNVNK




PSEQIFPK





28
The amino acid sequence of SEQ ID
MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYYGQGHPMK



293. The conserved histone

PHRIRMTHALLAHYGLLQHMQVLKPVPARDRDLCRFHAD




deacetylase family domain is

DYVAFLRSITPETQQDQLRQLKRFNVGEDCPVFDGLHSF




underlined.

CQTYAGGSVGGAVKLNHGLCDIAINWAGGLHHAKKCEAS






GFCYVNDIVLGILELLKQHERVLYVDIDIHHGDGVEEAF






YTTDRVMTVSFHKFGDYFPGTGDIRDIGYGKGKYYSLNV






PLDDGIDDESYHSLFKPIIGKVMEVFKPGAVVLQCGADS






LSGDRLGCFNLSIKGHAECVRYMRSFNVPVLLLGGGGYT






IRNVARCWCYETGVALGLEVDDKMPQHEYYEYFGPDYTL





HVAPSNMENKNSRQLLEDIRSKLLENLSKLQHAPSVPFQ




ERPPDTELPEADEDQEDPDERWDPDSDMDVDEDRKPLPS




RVKRELIVEPEVKDQDSQKASIDHGRGLDTTQEDNASIK




VSDMNSMITDEQSVKMEQDNVNKPSEQIFPK





29
The amino acid sequence of SEQ ID
MRPKDRISYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVL



294. The conserved histone

SYELHTKMEIYRPHKAYPAELAQFHSPDYVEFLHRITPD




deacetylase domain is underlined.

TQHLFPNDLAKYNLGEDCPVFENLFEFCQIYAGGTIDAA






RRLNNQLCDIAINWAGGLHHAKKCEASGFCYINDLVLGI






LELLKYHARVLYIDIDVHHGDGVEEAFYFTDRVMTVSFH






KFGDMFFPGTGDVKEIGGKEGKFYAINVPLKDGIDDTSF






TRLFKAIISKVVETYQPGAIVLQCGADSLAGDRLGCFNL






SIDGHSECVRFVKKFNLPLLVTGGGGYTKENVARCWVVE






TGVLLDTELPNEIPENEYFKYFAPDYSLKIPRGNIVLEN





LNSKSYLSAIKVQVLENLRNIQHAPSVQMQEVPPDFYIP




DFDEDEQNPDERMDQHTQDKQIQRDDEYYDGDNDNDHNM




DD





30
The amino acid sequence of SEQ ID
MTVAEDFHVNNRSKMVSQATPESRLTGGEDDNSLHNQVD



295. The conserved histone
ELLCQELPERQVILEFEGTRPKPYFSDHNGGENSALGVR



deacetylase family domain is
ATEDDLNSDVEAEEKQKEMTLEDMYKNDGTLYDDDEDDS



underlined and the Zinc finger
DWEPVKRQVELMRWFCTNCTMVNVEDVFLCDICGEHRDS



RanBP2-type profile is in bold.
GILRHGFYASPFMQDVGAPSVEAEVQESREDHARSSPPS





SSTVVGFDEKMLLHSEVEMKSHPHPERADRLQAIAASLA






TAGIFPGRCRSLPVREITKEELQMVHSSEHVDAVEMTSH






MFSSYFTPDTYANEHSARAARIAAGLCADLASTIISGRS






KNGFALVRPPGHHAGIKHAMGFCLHNNAAVAALAAQGAG






AKKVLIVDWDVHHGNGTQEIFDGNKSVLYISLHRHEGGN






FYPGTGAAHEVGTMGAEGYCVNIPWSRRGVGDNDYVFAF






HHIVLPIASAFAPDFTIISAGFDAARGDPLGCCDVTPAG






YAQMTHMLSALSGGKLLVILEGGYNLRSISSSAVAVIKV






LLGDSPISEIADAVPSKAGLRTVLEVLKIQRSYWPSLES





IFWELQSQWGMFLVDNRRKQIRKRRRVLVPIWWKWGRKS




VLYHLLNGHLHVKTKR





31
The amino acid sequence of SEQ ID

MAAAPSSPPTNRVDVFWHDGMLSHDTGRGVFDTGSDPGF




296. The conserved histone

LDVLEKHPENPDRVRNMVSILKRGPISPFISWHTATPAL




deacetylase family domain is

ISQLLSFHSPEYINELVEADKNGGKVLCAGTFLNPGSWD




underlined.

AALLAAGNTLSAMKYVLDGKGKIAYALVRPPGHHAQPSQ






ADGYCFLNNAGLAVRLALDSGCKRVVVVDIDVHYGNGTA






EGFYQSSDVLTISLHMNHGSWGPSHPQSGSVDELGEDEG






YGYNMNIPLPNGTGDRGYEYAVTELVVPAVESFKPEMVV





LVVGQDSSAFDPNGRQCLTMDGYRAIGRTIRGLADRHSG




GRILIVQEGGYHVTYSAYCLHATVEGILDLPDPLLADPI




AYYPEDEAFPVKVVDSIKRYLVDKVPFLKEH





32
The amino acid sequence of SEQ ID
MVESSGGASLPSVGQDARKRRVSYFYEPTIGDYYYGQGH



297. The conserved histone

PMKPHRIRMAHNLIVHYYLHRRMEISRPFPAATTDIRRF




deacetylase family domain is

HSEDYVTFISSVTPETVSDPAFSRQLKRFNVGEDCPVFD




underlined.

GIFGFCQASAGGSMGAAVKLNRGDSDIALNWAGGLHHAK






KSEASGFCYVNDIVLGILELLKVHKRVLYVDIDVHHGDG






VEEAFYTTDRVMTVSFHKFGDFFPGSGHIKDTGAGPGKN






YALNVPLNDGIDDESFRGMFRPIIQKVMEVYQPDAVVLQ






CGADSLSGDRLGCFNLSVKGHADCLRFLRSFNVPLMVLG






GGGYTMRNVARCWCYETAVAVGVEPENDLPYNEYYEYFG





PDYTLHVEPCSMENLNAPKDLERIRNMLLEQLSRIPHAP




SVPFQMTPPITQEPEEAEEDMDERPKPRIWNGEDYESDA




EEDKSQHRSSNADALHDENVEMRDSVGENSGDKTREDRS




PS





33
The amino acid sequence of SEQ ID
MAAIISCHHYHSCCSSLIASKWVGARIPTSCFGRSSTQS



299. The conserved cyclophilin-
NNAASVRQFVTRCSSSPSSRGQWQPHQNGEKGRSFSLRE



type peptidyl-prolyl cis-trans
CAISIALAVGLVTGVPSLDMSTGNAYAASPALPDLSVLI



isomerase family domain is
SGPPIKDPEALLRYALPINNKAIREVQKPLEDITDSLKV



underlined.
AGLRALDSVERNVRQASRVLKQGKNLIVSGLAESKKDHG




VELLDKLEAGMDELQQIVEDGNRDAVAGKQRELLNYVGG




VEEDMVDGFPYEVPEEYKNMPLLKGRAAVDMKVKVKDNP





NLEECVFRIVLDGYNAPVTAGNFVDLVERHFYDGMEIQR






ADGFVVQTGDPEGPAESFIDPSTEKPRTIPLEIMVDGEK






APVYGATLEELGLYKAQTKLPFNAFGTMAMARDEFEDNS






ASSQIFWLLKESELTPSNANILDGRYAVFGYVTENQDFL






ADLKVGDVIESVQVVSGLDNLANPSYKIAG






34
The amino acid sequence of SEQ ID
MAGEDFDIPPADEMNEDFDLPDDDDDAPVMKAGDEKEIG



300. The conserved FKBP-type
KQGLKKKLVKEGDAWETPDNGDEVEVHYTGTLLDGTQFD



peptidylprolyl isomerase domains


SSRDRGTPFKFTLGQGQ






are underlined. The FKBP-type


EAGSPPTIPPNATLQFDVELLSWTSVKDICKD




peptidyl-prolyl cis-trans
GGIFKKILVEGEKWENPKDLDEVLVKYEFQLEDGTTIAR



isomerase signature 1 is in bold

SDGVEFTVKEGHFCPAVAKAVKTMKKGEKVLLTVKPQYG




and the FKBP-type peptidyl-prolyl

FGEKGKPASGDEGAVPPNATLQITLELVSWKTVSEVTDD




cis-trans isomerase signature 2 is
KKVIKKILKEGEGYERPNEGAVVEVKLIGKLQDGTVFVK



in bold/italics.

KGHDDCEELFKFKIDEEQ








SSESKQDLAVVPPSSTVYYEVELVSFVKDKE





SWDMNTEEKIEAAGKKKEEGNVIFKAGKYAKASKRYEKA




VKYIEYDTSFSEDEKKQAKALKVACNLNDAACKLKLKDY




NQAEKLCTKVLELDSRNVKALYRRAQAYIELSDLDLAEF




DIKKALEIDPHNRDVKLEYKVLKEKVKEFNKKDAKFYGN




MFAKMSKLEPVEKTAAKEPEPMSIDSKA





35
The amino acid sequence of SEQ ID
MSTVYVLEPPTKGKVVLNTTHGPLDVELWPKEAPKAVRN



301. The conserved cyclophilin-

FVQLCLEGYYDNTIFHRIIKDFLVQGGDPTGSGTGGESI




type peptidyl-prolyl cis-trans

YGDAFSDEFHSRLRFKHRGLVACANAGSPHSNGSQFFIT




isomerase family domain is

LDRCDWLDRKNTIFGKITGDSIYNLSGLAEVETDKSDRP




underlined and the cyclophilin-

LDPPPKIISVEVLWNPFEDIVPRAPVRSLVPTVPDVQNK




type peptidyl-prolyl cis-trans
EPKKKAVKKLNLLSFGEEAEEEEKALVVVKQKIKSSHDV



isomerase signature is in bold.
LDDPRLLKEHIPSKQVDSYDSKTARDVQSVREALSSKKQ




ELQKESGAEFSNSFREIADDEDDDDDDASFDARMRRQIL




QKRKELGDLPPKPKPKSRDGISARKERETSISRDKDDDD




DDDQPRVEKLSLKKKGIGSEARGERMANADADLQLLNDA




ERGRQLQKQKKHRLRGREDEVLTKLETFKASVFGKPLAS




SAKVGDGDGDLSDWRSVKLKFAPEPGKDRMTRNEDPNDY




VVVDPLLEKGKEKFNRMQAKEKRRGREWAGKSLT





36
The amino acid sequence of SEQ ID
MASAISMHSSGLLLLQGTNGKDVTEMGKAPASSRVANMQ



302. The conserved cyclophilin-
QRKYGATCCVARGLTSRSHYASSLAFKQFSKTPSIKYDR



type peptidyl-prolyl cis-trans
MVEIKAMATDLGLQAKVTNKCFFDVEIGGEPAGRIVIGL



isomerase family domain is

FGDDVPKTVENFRALCTGEKGFGYKGCSFHRIIKDFMIQ




underlined and the cyclophilin-


GGDFTRGNGTGGKSIYGSTFEDENFALKHVGPGVLSMAN





type peptidyl-prolyl cis-trans

AGPSTNGSQFFICTVKTPWLDNRHVVFGQVVDGMDVVQK




isomerase signature is in bold.

LESQETSRSDVPRQPCRIVNCGELPLDG






37
The amino acid sequence of SEQ ID
MAASFTALSNVGSLSSPRNGSEIRRFRPSCNVAASVRPP



303. The conserved cyclophilin-
PLKAGLSASSSSSFSGSLRLIPLSSSPQRKSRPCSVRAS



type peptidyl-prolyl cis-trans
AEAAAAQSKVTNKVYLDISIGNPVGKLVGRIVIGLYGDD



isomerase signature is underlined.

VPQTAENFRALCTGEKGFGYKGSTVHRVIKDFMIQGGDF






DKGNGTGGKSIYGRTFKDENFKLSHVGPGVVSMANAGPN






TNGSQFFICTVKTPWLDQRHVVFGQVLEGMDIVRLIESQ






ETDRGDRPRKRVVVSDCGELPVV






38
The amino acid sequence of SEQ ID
MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRY



304. The conserved FKBP-type

EGVLAETGEVFDSTHEDNTLFSFEIGKGSVISAWDTALR




peptidyl-prolyl cis-trans


TMKVGEVAKITCKPEYAYGSTGSPPDIPPDATLIFEVEL





isomerase signature is underlined

VACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEK




and the FKBP-type peptidyl-prolyl
KRREEAKAAAAARVQAKLDAKKGHGKGKGKAK



cis-trans isomerase signature 2 is



in bold.





39
The amino acid sequence of SEQ ID
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL



305. The conserved cyclophilin-

CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN




type peptidyl-prolyl cis-trans

GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS




isomerase family domain is
QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS



underlined and the cyclophilin-

GRTSKPVVVADCGQLS




type peptidyl-prolyl cis-trans



isomerase signature is in bold.





40
The amino acid sequence of SEQ ID
MPNPKVFFDMTIGGAAAGRVVMELYADTTPRTAENFRAL



306. The conserved cyclophilin-

CTGEKGVGRSKKPLHYKGSKFHRVIPSFMCQGGDFTAGN




type peptidyl-prolyl cis-trans

GTGGESIYGVKFADENFIKKHTGPGILSMANAGPGTNGS




isomerase signature is underlined

QFFICTTKTEWLDGKHVVFGKVVEGMEVVKAIEKVGSSS




and the cyclophilin-type peptidyl-

GRTSKPVVVADCGQLP




prolyl cis-trans isomerase



signature is in bold.





41
The amino acid sequence of SEQ ID
MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRY



307. The conserved FKBP-type

EGVLAETGEVFDSTHEDNTLFSFEIGKGSVISAWDTALR




peptidyl-prolyl cis-trans


TMKVGEVAKITCKPEYAYGSTGSPPDIPPDATLIFEVEL





isomerase signature is underlined

VACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEK




and the FKBP-type peptidyl-prolyl
KRREEAKAAAAARVQAKLDAKKGHGKGKGKAK



cis-trans isomerase signature 2 is



in bold.





42
The amino acid sequence of SEQ ID
MATARSFFLCALLLLATLYLAQAKKSEDLKEVTHKVYFD



308. The conserved cyclophilin-

VEIAGKPAGRIVMGLYGKAVPKTAENFRALCTGEKGTGK




type peptidyl-prolyl cis-trans

SGKPLHYKGSSFHRIIPSFMLQGGDFTLGDGRGGESIYG




isomerase signature is underlined

EKFADENFKLKHTGPGLLSMANAGPDTNGSQFFITTVTT




and the cyclophilin-type peptidyl-

SWLDGRHVVFGKVLSGMDVVYKVEAEGRQSGTPKSKVVI




prolyl cis-trans isomerase

ADSGELPL




signature is in bold.





43
The amino acid sequence of SEQ ID
MMRREISVLLQPRFVLAFLALAVLLLVFAFPFSRQRGDQ



309. The conserved cyclophilin-
VEEEPEITHRVYLDVDIDGQHLGRIVIGLYGEVVPRTVE



type peptidyl-prolyl cis-trans

NFRALCTGEKGKSANGKKLHYKGTPFHRIISGFMIQGGD




isomerase family domain is

VIYGDGKGYESIYGGTFADENFRIKHSHAGIISMVNSGP




underlined and the cyclophilin-

DSNGSQFFITTVKASWLDGEHVVFGRVIQGMDTVYAIEG




type peptidyl-prolyl cis-trans

GAGTYNGKPRKKVIIADSGEIPKSKWDEER




isomerase signature is in bold.





44
The amino acid sequence of SEQ ID
MWATAEGGPPEVTLETSMGSFTVELYFKHAPRTSRNFIE



310. The conserved cyclophilin-

LSRRGYYDNVKFHRIIKDFIVQGGDPTGTGRGGESIYGK




type peptidyl-prolyl cis-trans

KFEDEIKPELKHTGAGILSMANAGPNTNGSQFFITLAPC




isomerase family domain is

PSLDGKHTIFGRVCRGMEIIKRLGSVQTDNNDRPIHDVK




underlined and the cyclophilin-

ILRTSVKD




type peptidyl-prolyl cis-trans



isomerase signature is in bold.





45
The amino acid sequence of SEQ ID
MSNPKVFFDILIGKMKAGRVVMELFADVTPKTAENFRAL



311. The conserved cyclophilin-

CTGEKGIGRSGKPLHYKGSTFHRIIPNFMCQGGDFTRGN




type peptidyl-prolyl cis-trans

GTGGESIYGMKFADENFKIKHTGLGVLSMANAGPDTNGS




isomerase family domain is

QFFICTEKTPWLDGKHVVFGKVIDGYNVVKEMESVGSDS




underlined and the cyclophilin-

GSTRETVAIEDCGQLSEN




type peptidyl-prolyl cis-trans



isomerase signature is in bold.





46
The amino acid sequence of SEQ ID
MDDDFEFPASSNVENDDDDGMDMDDMGGDVPEEEDPVAS



312. The conserved FKBP-type
PAVLKVGEEREIGKAGFKKKLVKEGEGWETPSSGDEVEV



peptidylprolyl isomerase domains


HYTGTLLDGTKFDSSRDRGTPFKFKLGRGQ






are underlined. The FKBP-type


ESGSPPTIPPNATLQFDVE




peptidyl-prolyl cis-trans

LLSWSSVKDICKDGGILKKVLVEGEKWDNPKDLDEVFVK




isomerase signature 1 is in bold

YEASLEDGTLISKSDGVEFTVGDGYFCAALAKAVKTMKK




and the FKBP-type peptidyl-prolyl

GEKVLLTVMPQYAFGETGRPASGDEAAVPPDASLQIMLE




cis-trans isomerase signature 2 is

LVSWKTVSDVTKDKKVLKKTLKEGEGYERPNDGAAVQVR




in bold/italics. The TPR repeat

LCGKLQDGTVFVKKDDEEPFEFKIDEEQ





is in italics.


PTESQQDLAVVPANSTVYYEV






ELLSFVKEKESWEMNNQEKIEAAARKKEEGNAAFKAGKY





VRASKRYEKAVRFIEYDSSFSDEEKQQAKTLKNTCNLND




AACKLKLKDFKEAEKLCTKVLEGDGKNVKALYRRAQAYI




QLVDLDLAEQDIKKALEIDPNNRDVKLEYKILKEKVREY




NKRDAQFYGNMFAKMNKLEHSRTAGMGAKHEAAPMTIDS




KA





47
The amino acid sequence of SEQ ID
MAKPRCFMDISIGGELEGRIVGELYTDVAPKTAENFRAL



313. The conserved cyclophilin-

CTGEKGIGPHTGAPLHYKGVRFHRVIKGFMVQGGDISAG




type peptidyl-prolyl cis-trans

DGTGGESIYGLKFEDENFDLKHERKGMLSMANSGPNTNG




isomerase family domain is

SQFFITTTRTSHLDGKHVVFGRVVKGMGVVRSVEHVTTA




underlined and the cyclophilin-

AGDCPTVDVVIADCGEIPAGADDGIRNFFKDGDTYPDWP




type peptidyl-prolyl cis-trans
ADLDESPAELSWWMDAVDSIKAFGNGSYKKQDYKMALRK



isomerase signature is in bold.
YRKALRYLDICWEKEGIDEVESSSLRKTKSQIFTNSSAC



The TPR repeat is in bold/italics.
KLKLCDLKGALLDAEFAVRDGENN





GIKKELNAAKKKIFERREQ





EKRAYRKMFL





48
The amino acid sequence of SEQ ID
MTKRKNPLVFLDVSIDGDPVERIVIELFADTVPRTAENF



314. The conserved cyclophilin-

RSLCTGEKGVGKTTGKPLHYKGSYFHRIIKGFMAQGGDF




type peptidyl-prolyl cis-trans

SNGNGTGGESIYGGKFADENFKLAHDGPGLLSMANGGPN




isomerase signature is underlined

TNGSQFFIIFKRQPHLDGKHVVFGKVMRGMEVVKKIEQV




and the cyclophilin-type peptidyl-

GSANGKPLQPVKIVDCGETSETGTQDAVVEEKSKSATLK




prolyl cis-trans isomerase
AKKKRSARDSSSESRGKRRQRKSRKERTRKRRRYSSSDS



signature is in bold.
YSSESSDSDSESYSSDTESESKSHSESSVSDSSSSDGRR




RKRKSTKREKLRRQRGKDSRGEQKSARYDKKSRHKSADS




SSDSESESSSRSRSRDDKKKSSRRESARSVSKLKDAEAN




SPENLESPRDREIKKVEDNSSHEEGEFSPKNDVQHNGHG




TDAKFGKYDDQRPRSDGSKKSSGSMRDSPKRLANSVPQG




SPSSSPAHKASEPSSSIRARNPSRSPAPDGNSKRIRKGR




GFTERFSYARRYRTPSPEDVTYRPYHYGRRNFHDRRNDR




YSNYRSYSERSPHRRYRSPPRGRSPPRYQRRRSRSRSVS




RSPGGNKGRYRGRDQSRSRSRSRSRSPRRGSSPANKQLP




LSERLKSRLGTRVDEHSPRRRRSSSRSHDSSRSRSPDEV




PDKHEGKAAPVSPARSRSSSPSGRGLVSYGDASPDSGIN





49
The amino acid sequence of SEQ ID
MSVLLVTSLGDIVVDLHADRCPLTCKNFLKLCRIKYYNG



315. The conserved cyclophilin-

CVFHTVQKDFTAQTGDPTGTGTGGDSVYKFLYGDQARFF




type peptidyl-prolyl cis-trans

MDEIHLDLKHSKTGTVAMASGGENLNASQFYFTLRDDLD




isomerase signature is underlined.

YLDGKHTVFGEVAEGLETLTRINEAYVDEKGRPYKNIRI




The CCHC type zinc finger is in

RHTYILDDPFDDPPQLAELIPDASPEGKPKDEVVDDVRL




bold and the RNA-binding region
EDDWVPLDEQLGPAQLEEAIRAKEAHSRAVVLESIGDIP



RNP-1 (RNA recognition motif) is
DAEIKPPDNV



in bold/italics.









DFSQSVAKLWSQFKRKDSQAAKGKGCFKCGAPDHM






ARECPGSSTRQPLSKYILKEDNAQRGGDDSRYEMVFDED





APESPSHGKKRRGRDDRDDRHKMSRQSVEETKFNDREGG




HSVDKHRQSERSKHREDEMSRDSKASEAGRRRIDRDFPE




EERDGEKYTESHRDRDGKRGDYRDYRKGRADVQTHGDRR




GDENYRRKSAAYDDGHEGAGAARRKDSNDDHHAYRRGYG




DSRKGTRDEDDDGRGRRDDPSYRRSSGHKDSSNGGREEQ




KYRSGETDGKSHPERSHRGDRRR





50
The amino acid sequence of SEQ ID
MRPFNGGSSIACLVLVIAAGALAESQGPHLGSARVVFQT



316. The conserved cyclophilin-

NYGDIEFGFFPGVAPRTVDHIFKLVRLGCYNTNHFFRVD




type peptidyl-prolyl cis-trans

KGFVAQVADVANGRTAPMNDEQRTEAEKTIVGEFSNVKH




isomerase signature is underlined.

VRGILSMGRYDDPDSAQSSFSILLGDAPHLDGKYAIFGR






VTKGDETLKKLEQLPTRREGMFVMPTERITILSSYYYDT





GAESCEEENSTLRRRLAASAVEVERQRMKCFP





51
The amino acid sequence of SEQ ID
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL



317. The conserved cyclophilin-

CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN




type peptidyl-prolyl cis-trans

GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS




isomerase signature is underlined

QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS




and the cyclophilin-type peptidyl-

GRTSKPVVIADSGQLA




prolyl cis-trans isomerase



signature is in bold.





52
The amino acid sequence of SEQ ID
MRFTSITSAIALFAAAASALDKPLDIKVDKAVECSRKTK



318. The conserved FKBP-type

AGDKIQVHYRGTLEADGSEFDASYKRGQPLSFHVGKGQV




peptidyl-prolyl cis-trans


IKGWDQGLLDMCPGEKRTLTIQPDWGYGSRGMGPIPANS





isomerase signature is underlined

VLIFETELVEIAGVAREEL




and the FKBP-type peptidyl-prolyl



cis-trans isomerase signature 2 is



in bold.





53
The amino acid sequence of SEQ ID
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL



319. The conserved cyclophilin-

CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN




type peptidyl-prolyl cis-trans

GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS




isomerase signature is underlined

QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS




and the cyclophilin-type peptidyl-

GRTSKPVVVADCGQLS




prolyl cis-trans isomerase



signature 2 is in bold.





54
The amino acid sequence of SEQ ID
MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSD



320. The conserved FKBP-type
PKLVHRKVGEEKKKPDDLEEVTHKVFFDVEIGGKPAGRI



peptidyl-prolyl cis-trans

VMGLFGKTVPKTVENFRALCTGEKGIGKSGKPLNYKGSQ




isomerase signature is underlined


FHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLK





and the cyclophilin-type peptidyl-

HTDAGRLSMTNAGPDTNGSQFFITTVTTSWLDGRHVVFG




prolyl cis-trans isomerase

KVLSGMDVVHKIEAEGGQSGQPKSIVVISDSGELDL




signature is in bold.





55
The amino acid sequence of SEQ ID

MAVTLHTNLGDIKCEIFCDEVPKAAEHNARGILSMANSG




321. The conserved cyclophilin-

PNTNGSQFFIAYAKQPHLNGLYTIFGRVIHGFEVLDIME




type peptidyl-prolyl cis-trans

KTQTGPGDRPLAEIRLNRVTIHANPLAG




isomerase signature is underlined.





56
The amino acid sequence of SEQ ID
MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSD



322. The conserved FKBP-type
PKLVHRKVGEEKKKPDDLEEVTHKVFFDVEIGGKPAGRI



peptidyl-prolyl cis-trans

VMGLFGKTVPKTVENFRALCTGEKGIGKSGKPLNYKGSQ




isomerase signature is underlined


FHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLK





and the cyclophilin-type peptidyl-

HTDAGRLSMANAGPDTNGSQFFITTVTTSWLDGRHVVFG




prolyl cis-trans isomerase

KVLSGMDVVHKIEAEGGQSGQPKSIVVISDSGELDL




signature is in bold.





57
The amino acid sequence of SEQ ID
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL



323. The conserved cyclophilin-

CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN




type peptidyl-prolyl cis-trans

GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS




isomerase signature is underlined

QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS




and the cyclophilin-type peptidyl-

GRTSKPVVVADCGQLS




prolyl cis-trans isomerase



signature 2 is in bold.





58
The amino acid sequence of SEQ ID
MSPVAANAMEEAAEPEVPAPVTPSKDDADTDAAVSRFLG



324. The conserved A-box of the

FCKSKLGLAEGNCVQSSTLLRKTAHVLRSSGTVIGTGTA




Retinoblastoma-associated protein

EEAERYWFAFVLYTVRRVGERKAEDEQNGSDETEVPLSR




is underlined and the B-box of the

ILKASVLNLIDFFKEIPQFVIKAGAIVSGIYGANWDSRL




Retinoblastoma-associated protein

EAREMQTNYVHLCILCKFYKRICGEFFILNDAKDDMKSA




is in bold.

DSSTSDPVIMYQPFGWLLFLALRIHALSRFKDLVSSTNA





LVSVLAILIIHLPTRFRKFSISDSSQLVKRSEKGVDLVG




SLAYRYDTSEDEIKRTLEKANNVIAEILGITPPPASECK




AENLENVDTDGLIYFGNLMEETSLSSILSTLEKIYEDAT




RNDSEFDERVFINDDDSLLVSGSLSGAAINLTGAKRKYD




SFASPAKTITRPLSPSRSPASHINGIIGGTNLRITATPV




ATAMTTAKWLRTFVSPLPSKPSTDLQGFLASCDRDVTSD




VIRRANIILEAIFPNSPIGERTVTGGLQNANLMDNMWAE




QRRLEALKLYYRVLEAMCRAEAQILHSNNLTSLLTNERF




HRCMLACSAELVLATHKTVTMLFPAVLERTGITAFDLSK




VIESFVRHEETLPRELRRHLNTLEERLLENMVWERGSSM




YNSLVVARPALAPEINRLGLLPEPMPSLDAIALLINFSS




SGLPQSPVQKHEASPGQNGDIRSPKRISTEYRSVLVERN




FTSPVKDRLLALSNIKSKLPPPPLQSAFASPTRPHPGGG




GETCAETAIHIFFSKITKLAAVRINAMLERLQLSQQIKE





GVYCLFQQILSQRTNLFFNRHIDQVILCCFYGVAKINQI






NLTFREIIYNYRKQPQCKPQVFRNVFVDWSTRRNGKAGN






EHVDIISFYNEIFIPSVKPLLVELGPTGATTRTNRTSEV





GNKNDAQCPGSPKISSFPTLPDMSPKKVSASHNVYVSPL




RSSKMDASISHSSKSYYACVGESTHAYQSPSKDLVAINS




RLNGNRKVRGTLNFDDVDAGLVSDSMVANSLYLQNGSSM




SSSTAKSSEK





59
The amino acid sequence of SEQ ID
MRPILMKGHERPLTFLKYNREGDLLFSCAKDHTPTVWFA



325. The conserved G-protein beta
DNGERLGTYRGHNGAVWCCDVSRDSMRLITGSADTTAKL



WD-40 repeat domains are

WSVQNGTQLFTFNFDSPARSVDFSIGDKLAVITTDPFME




underlined.
LPSAIHVKRIARDPADQASESVLVLRGHQGRIARAVWGP




LNKTIISAGEDAVIRIWDSETGKLLRESDKETGHKKAVT





SLMKSVDGSHFVTGSQDKSAKLWDIRTLTLIKTYVTERP





VNAVTMSPLLDHVVLGGGQDASAVTMTDHRAGKFEAKFF




DKILQEEIGGVKGHFGPINALAFNPDGKSFSSGGEDGYV





RLHHFDPDYFNIKI






60
The amino acid sequence of SEQ ID
MDKKRTVVPLVCHGHSRPVVDLFYSPITPDGFFLISASK



326. The conserved G-protein beta


DSSPMLRNGETGDWIGTFEGHKGAVWSCCLDTNALRAAS





domain is underlined and the WD-40


GSADFSAKLWDALSGDELHSFEHKHIVRSCAFSEDTHLL





repeat domains are in bold.


LTGGVEKILRIFDLNRPDAPPREVDNSPGSIRTVAWLHS








DQTILSSCTDIGGVRLWDVRSGKIVQTLETKSPVTSSEV








SQDGRYITTADGSTVKFWDANHFGLVKSYNMPCNIESAS







LEPKLGNKFIAGGEDMWVHIFDFHTGEEIGCNKGHHGPV







HCVRFSPGGESYASGSEDGTIRIWQTGPANNVEGDANPS





NGPVTGKAKVGADEVTRKVEDLQIGKEGKDWREG





61
The amino acid sequence of SEQ ID
MAEGLILKGTMRAHTDMVTAIAIPIDNSDMVVTSSRDKS



327. The conserved G-protein beta

IILWHLTKEEKVYGVPRRRLTGHSHFVQDVVLSSDGQFA




WD-40 repeat domains are

LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN




underlined.

RQIVSASRDRTIKLWNTLGECKYTIQEGEAHTDWVSCVR






FSPNTLQPTIVSASWDRTIKVWNLTNCKRNTLAGHNGY






VNTVAVSPDGSLCASGGKDGVILLWDLAEGKRLYNLEAG





AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV




DLKNEADKTDGTTTAASNKKVIYCTSLNWSADGSTLFSG




YNDGVIRVWGTGRY





62
The amino acid sequence of SEQ ID
MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKS



328. The conserved G-protein beta
IILWHLTKEDKVYGVPRRRLTGHSHFVQDVVLSSDGQFA



WD-40 repeat domains are

LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN




underlined.

RQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVR






FSPNTLQPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGY






VNTVAVSPDGSLCASGGKDGVILLWDLAEGKKLYSLEAG





AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV




DLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFS




GYNDGVIRVWGIGRY





63
The amino acid sequence of SEQ ID
MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKS



329. The conserved G-protein beta

IILWHLTKEDKVYGVPRRRLTGHSHFVQDVVLSSDGQFA




WD-40 repeat domains are

LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN




underlined and the Trp-Asp (WD)

RQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVR




repeats signature is in bold.

FSPNTLQPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGY






VNTVAVSPDGSLCASGGKDGVILLWDLAEGKKLYSLEAG






AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV





DLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFS





GYNDGVIRVWGIGRY






64
The amino acid sequence of SEQ ID
MSGVPAPPFATTTPENGTMSSNSPAFHRDSDDDDDQGEV



330. The conserved G-protein beta
FLDDSDIIHEVAVDDEDLPDADDEADEAEEADDSLHIFT



WD-40 repeat domains are

GHNGEVYSLACSPTDATLVATGAGDDKGFLWRIGHGDWA




underlined.

VELQGHKDSISSLAFSLDGQLLASGSLDGVIQIWDVPSG





NLKGTLDGPGGGIEWIRWHPKGHIILAGSEDSTVWMWNA




DKMAYLNMFSGHGNSVTCGDFTPDGKTICTGSDDATLRI





WNPKSGENIHVVKGHPYHAEGLTSMAISSDSGLAITGAK






DGSVRIVNISSGRVVSSLDAHADSVEFVGLALSSPWAAT






GSLDQKLIIWDLQHSSPRATCDHEDGVTCLSWVGASRFL






ASGCVDGKVRVWDSLSGDCVRTFHGHSDAIQSLSVSANE





EFLVSVSIDGTARVFEIAEFH





65
The amino acid sequence of SEQ ID
MGTSQHQLSSCLQLLPRRRGNKNLIFRRTMASGGAAAVA



331. The conserved G-protein beta
PPPGYKPYRHLKTLTGHVAAVSCVKFSNDGTLLASASLD



WD-40 repeat domains are

KTLIIWSSAALSLLHRLVGHSEGVSDLAWSSDSHYICSA




underlined.

SDDRTLRIWSSRSPFDCLKTLRGHTDFVFCVNFNPQSSL






IVSGSFDETIRIWEVKTGRCLNVIRAHSMPVTSVHFNRD






GSLIVSGSHDGSCKIWDTKNGACLKTLIDDTVPAVSFAK






FSPNGKFILVATLNDTLKLWNYATGKFLKIYTGHKNSVY






CLTSTFSVTNGKYIVSGSEDRCICIWDLQGKNLIQKLEG






HSDTVISVTCHPSENKIASAGLDSDRTVRIWLQDA






66
The amino acid sequence of SEQ ID
MPSQKIETGHQDIVHDVAMDYYGKRVATASSDTTIKIIG



332. The conserved G-protein beta
VSNSSGSQHLASLSGHKGPVWQVAWAHPKFGSILASCSY



WD-40 repeat domains are

DGQVILWKEGNQNDWAQAHVFNDHKSSVNSIAWAPHELG




underlined.

LCLACGSSDGNISVFTARPDGGWDTTRIEQAHPVGVTSV






SWAPSMAPGALVGSGLLDPVQKLASGGCDNTVKVWKLYN





GTWKMDCFPALQMHSDWVRDVAWAPNLGLPKSTIASASQ





DGTVVIWTVAKEGEQWQGKVLKDFKTPVWRVSWSLTGNL






LAVADGNNNVTLWNEAVDGEWQQVTTVEP






67
The amino acid sequence of SEQ ID
MKIAGLKSVENAHDESVWAAAWVPATESRPALLLTGSLD



333. The conserved G-protein beta

ETVKLWRPDELALERTNAGHFLGVVSVAAHPSGVIAASA




WD-40 repeat domains are

SIDSFVRVFDVDTNATIATLEAPPSEVWQMQFDPKGTTL




underlined and the Trp-Asp (WD)

AVAGGGSASIKLWDTATWELNATLSIPRPEQPKPSEKGN




repeats signature is in bold.
KKFVLSVAWSPDGRRLACGSMDGTISIFDVARAKFLHHL





EGHFMPVRSLVFSPVEPRLLFSASDDAHVHMYDSEGKSL






VGSMSGHASWVLSVDVSPDGAALATGSSDRTVRLWDLSM





RAAVQTMSNHSDQVWGVAFRPMAGAGVRAGGRLASVSDD





KSISLYDYS






68
The amino acid sequence of SEQ ID
MEIDLGNLAFDVDFHPSEQLVASGLITGDLLLYRYGDGS



334. The conserved G-protein beta

SPEKLLEVRAHGESCRAVRFINDGKAILTGSPDCSILAT




WD-40 repeat domains are

DVETGSVVARVENAHEAAVNRLVNLTESTIATGDDNGCI




underlined and the Trp-Asp (WD)

KVWDTRQRSCCNTFSAHEDFISDMTFASDSMKLVVTSGD




repeats signature is in bold.

GTLSVCNLRSNKVQTRSEFSEDELLSVVIMKNGRKVVCG






TQSGTLLLYSWGFFKDCSDRFVDLSPSSVDALLKLDEDR





IIAGTENGLISLIGILPNRIIQPIAEHSDHPIERLAFSH





DKKFLGSISHDQTLKLWDLNDILGSEDSPSSQAAIDDSD





SDEMDVDANPPDSSKGNKKKHSGKGNDVGNANNFFADLGD





69
The amino acid sequence of SEQ ID
MSQQPSVILATASYDHTIRFWEAKSGRCYRTIQYPDSQV



335. The conserved G-protein beta

NRLEITPHKRYLAVAGNPSIRLFDVNSNTPQPVMSFDSH




WD-40 repeat domains are

TNNVMAVGFQYDGNWMYSGSEDGTVRIWDLRARGCQREY




underlined and the Trp-Asp (WD)

ESRGAVNTVVLHPNQTELISGDQNGNIRVWDLTANSCSC




repeats signature is in bold.

ELVPEVDTAVRSLTVMWDGSLVVAANNNGTCYVWRLLRG





SQTMTNFEPLHKLQAHNGYILKCLLSPEFCEPHRYLATA






SSDHTVKIWNVEGFTLEKTLIGHQRWVWDCVFSVDGAYL






ITASSDTTARLWSMSTSQDIRVYQGHHKATTCCALHDGA





EGSPG





70
The amino acid sequence of SEQ ID
MEDAMDMEVEVEVEAEEHSPSSSNPSGSSFRRFGLKNSI



336. The conserved G-protein beta
QTNFGSDYVFEITPKFDWSLMGVSLSSNAVKLYSPTTGQ



WD-40 repeat domains are

YCGECRGHSDTVNGISFSGPSSPHVLHSCSSDGTIRAWD




underlined.
TRSFKEVSCISAGPSQEIFSFSFGGSSDSLLSAGCKSQI





LFWDWRNKKQVACLEDSHVDDVTQVCFVPHHQNKLISAS





VDGLICIFDTAGDINDDEHMESVINVGTSIGKVGIFGQT




FEKLWCLTHIETLSVWDWKEGTNEANFEDARKLASDSWS




LDHIDYFVDCHSAEEGEGLWVIGGTNAGTLGYFPVKYKG




GAAIGSPEAVLGGGHSDVVRSVLPMSGMAGTTSKTRGIF





GWTGGEDGRLCCWLSDDSSATSRSWMSSNLVLKSSRSHH





KKNRHQPY





71
The amino acid sequence of SEQ ID
MSQHQEYPMEYAADDYDVGEVEDDMYFHERVMGDSDTDE



337. The conserved G-protein beta
DEEYDHLDNKITDTSAADARRGKDIQGIPWERLSVTREK



domain is underlined and the WD-40
YRRTRIEQYKNYENVPQSGESSEKDCKPTRKGGNYYEFW



repeat domains are in bold.
RNTRSVKSTILHFQLRNLVWSTTKHDVYLMSHFSIIHWS




SLTCKKTEVLDVYGHVAPREKHPGSLLEGFTQTQVSTLA




VRDKLLIAGGFQGELICKNLDRPGVSYCCRTTYDDNAIT




NAVEIYDYPSGAVHFMASNNDCGVRDFDMEKFELSRHFT




FPWPVNHTSLSPDGKLLVIVGDNPEGIVVDSQRGKTIRP






LQGHLDFSFASAWHPDGHIFATGNQDKTCRIWDIRNLSK








SVAVLKGNLGAIRSIRFTSDGRFMAMAEPADFVHVYDVK





SGYEKEQEIDFFGEISGVSFSPDTESLFVGVWDRTYGSL




LQYNRCRNYSYLDSM





72
The amino acid sequence of SEQ ID
MGASSDPNPDVSDEHQKRSEIYTYEAPWHIYAMNWSVRR



338. The conserved G-protein beta
DKKYRLAIASLLDHPAAAAAVPNRVEIVQLDDSTGEIRA



WD-40 repeat domains are
DPNLSFDHPYPATKAAFVPDKDCQRADLLATSSDFLRIW



underlined.
RIADDSSRVDLRSFLNGNKNSEFCRPLTSFDWNEAEPKR





IGTSSIDTTCTIWDIERETVDTQLIAHDKEVYDIAWGGV






SVFASVSADGSVRVFDLRDKEHSTIIYESSEPDTPLVRL





GWNKQDPRYMATIIMDSAKVVVLDIRYPTMPVVELQRHQ





ASVNAIAWAPHSSCHICTAGDDSQALIWDLSSMAQPVEG





GLDPILAYTAGAEIEQLQWSSSQPDWVAIAFSLKLQ





73
The amino acid sequence of SEQ ID
MRGGGGGGDATGWDEDAYRESVLKEREVQTRTVFRAAFA



339. The conserved G-protein beta

PSPSPSPSPDAVVVASSDGSVASYSISACLSDHRLQSLR




WD-40 repeat domains are
FADAKSQNVLEAEPACFLQGHDGPAYDVKFYGEGEDSLL



underlined.

LSCGDDGRIRGWMWRDITSSEAHDHSQGNSAKPVLDLVN





PQSRGPWGALSPIPENNALAVDVKRGSIYAAAGDSCAYC




WDVECGKIKTVFKGHSDYLHCIAARNSSSQIITGSEDGT





ARIWDCRSGKCVQVIDPDKDHKKGFFASVSCLALDASES






WLVCGRGRDLSVWSISASDCIAKISTNAPAQDVLFDDNQ





ILLVGAEPLISRLDMNGAVLSQIHCAPQSVFSVSLHQSG




VTAVGGYGGLVDVISQFGSHLCTFRCKCI





74
The amino acid sequence of SEQ ID
MEAPIIDPLQGDFPEVIEEYLEHGIMKCIAFNRRGTLLA



340. The conserved G-protein beta

AGCTDGSCIIWDFETRGVAKELRDKECTAAITSVCWSKY




WD-40 repeat domains are

GHRILVSASDKSLILWDVLSGEKIAHTTLQHTVLQACLH




underlined.
PGSSTPSICLACPFSSAPMIVDLNTGSTTALPVLTADVS




NGATPLSRNKTSDTSVTYSPCNACFNKHGDLVYAGTSKG




EILIIDHKNVRVCAIVLVSGGAVIKNVVFSRNGQYMLTN





SNDRLIRIYKNLLPPKDGLKMLDELNESFNESDDVEKLK





AIGSKCLELLHEFQDSITRVQWKAPCFSGDGEWVIGGAA




SRGEHKIYIWDRAGHLVKILEGPKEALMDLAWHPVHPII





ISVSLTGLVYIWAKDYTENWSAFAPDFKELEENEEYVER





EDEFDLVPETEKVKGLDVHEDDEVDVLTVERDSVFSDSD




MSQEELCFLPAVPCLDIPEQQDKCVGSCSKLPDGNHSGS




PLSVEAGQNGNASNHNSSPLEPMENSTADDTDGVRLKRK




RKPSEKGLELQAEKVKKPVKPLKSSGRLSKTNKPVIDPD




SSNGVYGDDGSD





75
The amino acid sequence of SEQ ID
MRGVSWPEDGNNPSTSSSSQRNQQQAHAPRAVSGHAASH



341. The conserved G-protein beta
PSASNIFKLLVQREVSPRSKHSSKKLWREASKCQPYPFQ



WD-40 repeat domains are
QSCEAVRDVRQGLISWVESASLRHLSAKYCPLVPPPRST



underlined.

IAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVLRGHR






RTPWVVRFHPLYPEILASGSLDHEVRLWDANTAECIGSR






NFYRPIASIAFHARGELLAVASGHKLYIWHYNRRGETSS





PTIVLRTQRSLRAVHFHPHAAPFLLTAEVNDLDSADSAM




TLATSPGYLHYPPPTVYFADAHSHERSRLADELPLMPLP




LLMWPSFTRDDGRVPLQRIDGDVGLNGQQRVDSSSSVRL




WTYSTPSGQYELLLSPVESGNSPSMPEETGNNAFSSAVE




AEVSQSAMDTVEDMEVQPEERNTQFFSFSDPRFWELPLL




HGWLVGQTQAGPRSVRQSSPGDIETQSAFGEVASVSPIT




SGVMPVSMDPSRFGGRSGSRYRSPGSRGVHVTGPNNDGP




RDENDPQSVVSKLRSELAASLAAAASTELPCTVKLRIWP




HDVKDPCAQLDLESCRLTIPHAVLCSEMGAHFSPCGRFL




AACVACVLPHLESDPGLHGQVNQDVTGVATSPTRHPISA




HQIMYELRIYSLEEATFGIVLASRPVRAAHCLTSIQFSP




TSEHLLLAYGRRHSSLLKSIVIDGENTVPIYTILEVYRV




SDMELVRVLPSAEDEVNVACFHPSVGGGLIYGTKEGKLR





ILHYDSSHGLNLKSSGFLDENVPEVQTYALEC






76
The amino acid sequence of SEQ ID
MDSAVAIAALSLVVGAAIALLFFGNYFRKRRSEVVAMAE



342. The conserved G-protein beta
ADLQPHPKNPSRPPPQPAAKKVHAKSHAHGADKDKNKRH



WD-40 repeat domains are
HPLDLNTLKGHGDSVTGLCFASDGRSLATACADGVVRVF



underlined and the Trp-Asp (WD)

KLDDASNKSFKFLRINLPAGGHPTAVAFGDGVSSVIVAS




repeats signature is in bold.
QHLSGCSLYMYGEEKPTNLDSNKQQTKLPMPEIKWEHHK





VHEQKAILTLSGAAANYDSGDGSTIIASCSEGTDIIIWH





AKTGKILGNVDTNQLKNTMSAISPNGRFIAAAAFTADVK





VWEIVYSKDGSVKGVTKVMQLKGHKSAVTWLCFTPNSEQ







IVTASKDGSIRIWNINVRYHLDEDTKTLKVFPIPLQDSS





GTTLHYERLSLSPDGKILAATHGSMLQWLCIETGKVLDT





AEKAHDGDITCMSWAPQSIPTGDKKVNVLATASGDKKVK






LWAAPPLPS






77
The amino acid sequence of SEQ ID
MEVEPKKASKTFPVKPKLKPKPRTPSGKTPESKYWSSFK



343. The conserved G-protein beta

TTHPLDNLSFSVPSLAFSPSPPHLLAAAHSATVSLFSPH




WD-40 repeat domains are

RTTISSFSDVVSSLSFRSDGQLLAASDLSGLIQVFDVRS




underlined.

RTPLRRLRSHARPVRFVRYPVLDKLHLVSGGDDALVKYW






DVAGESVVSELRGHKDYVRCGDCSPADANCFVTGSYDHV






VKLWDVRVRDGNRAATEVNHGSPVQDVIFLPSGSLVATA






GGNSVKIWDLIGGGRMVYSMESHNKTVTSICVGTMGAQQ






SGEEGVQLRILSVGLDGYMKVFDYSRMKVTHSMRFPAPL





LSIGFSPDSNVRAIGTSNGILYVGKRKAKENAEGGANGI




LGLGSVEEPRRRVLKPSFYRYFHRGQSEKPSEGDYLVMR




PKKVKLAEHDKLLKKFQHKNALISVLGGNDPEKVVAVME




ELVARRALLKCVLNLDADELGLILTFLHKNSTVPRYSSL




LLGLAKKVIDLRLEDIRASDALKGHIRNLKRSVDEEIRI




QEGLQEIQGMVSPLLRIAGRR





78
The amino acid sequence of SEQ ID
MQGGSSGVGYGLKYQARCISDVKADTDHTSFLTGTLSLK



344. The conserved G-protein beta
EENEVHLLRLSSGGTELICEGLFSHPSEIWDLSSCPFDQ



WD-40 repeat domains are
RIFSTVFSTGESYGAAVWQIPELYGQLNSPQLEKIASLD



underlined.

AHSRKISCVLWWPSGRHDKLVSIDEENIFLWGLDCSKKS






AQVQSQESAGMLHNLSGGAWDPHDVNTVAATCESSIQFW






DLRTMKKANSLESVHARDLDYDMRKKHLLVTSEDESGVR





VWDLRMPKAPIQEFPGHTHWTWAVRCNPDYEGLILSAGT





DSAVNLWWSSTASSDELISERLIDSPTRKLDPLLHSYND






YEDSVYGLAWSSREPWIFASLSYDGRVVVESVKPFLSRK






79
The amino acid sequence of SEQ ID
MAEEEGSAELEQQLEEEFAVWKKNTPILYDLLISHALEW



345. The conserved G-protein beta
PSLTVHWAPLLPQPSSSAAAAAGDPSLAAHRLVLGTHTS



WD-40 repeat domains are
DGAPNFLILADALLPSSESDHCGDDAVLPKVEISQKIRV



underlined.
DGEVNRARFMPQNHNIVGAKTNGCEVYVFDCSKQAAKQH




DGGFDPDLRLTGHDGEGYGLSWSPLKENYLLSASHDKKI




CLWDISAAAQDKVLGAMHVFEAHEGAVGDASWHSKNDNL





FGSAGDDCQLMIWDLRTNKAQQCVKAHEKEVNSVSFNSY






NDWILATASSDTTVGLFDMRKLTTPLHVFSSHEGEVLQV






EWDPNHEAVLASSSEDRRVMVWDLNRIGDEQQEGDASDG





PAELLFSHGGHKAKISDFSWNKNEPWVISSVAEDNSVQV





WQMAESICGDDDDMQAMEGYI






80
The amino acid sequence of SEQ ID
MGNYGEEDEDQYFDALEETASVSDRGSNSSDCCSSGSGL



346. The conserved G-protein beta
DENVLDSLGFEFWTKFPESVRARRNRFLMLTGLGIEANS



WD-40 repeat domains are
VDKEDAFPPSCNEIEVYTCKVTRDDGAVQRSLDSYNCIS



underlined.
LLQSSTSIRSNQEVESLRGDSLLSSFRGRSKESDDLTEL




CGMGCPESKRNAVSEFGSVSQGSIEELRRIVASSPLVHP




LLHRKLEYERELIETKQKMGAGWLRKFGSATCISGRQGD




TWSDPDDLEITAGMKMRRVRAHSSKKKYKELSSLYAAQE




FLAHEGSISTMKFSMDGQYLASAGEDTVVRVWKVTEEDR




SERVNVTVDPSCLYFALNESTQLASLNTNKEHIGKAKTF




QRSSDSSCVILPLKVFQITEKPWHEFKGHNGEVLDLSWS




SKGYLLSSSTDKTVRLWRVGCDRCQRVYSHNDYVTCISF




NPVNENFFISGSIDGKVRIWNVFGGQVVAYIDCREIVSA




VCYRSDGKGAIVGTMTGNCLFYSIKDNHLQMDAQVYLHG




KKKSPGKRITGFQFPPNDPGKLMITSADSVIRVLSGLDV




VCKLKGPRNSGGPMIATFTSDGKHVISASEDSNVYIWNY




AGQDKTSSRVKKIWSCESFWSSNASVALPWCGIRTVPEA




LAPPSRSEERRASCAENGENHHMLEEYFQKMPPYSPDCF




SLSRGFFLELLPKGSATWPEEKLSDTSPPTVSSQAISKL




EYKFLKSACHSVLSSAHMWGLVIVTAGWDGRIRTYHNYG




LPVRS





81
The amino acid sequence of SEQ ID
MDIDFKEYRLRCELRGHEDDVRGVCVCGDGSIGTSSRDR



347. The conserved G-protein beta

TVRLWAPSAGERRKYEVARVLLGHKSFVGPLAWVPPSEE




WD-40 repeat domains are

LPEGGIVSGGMDTLVMAWDLRNGEAQTLKGHQLQVTGIV




underlined.

LDGGDIVSASVDCTLIRWKNGQLTEHWEAHKAPIQAVIR






LPSGELVTGSSDTTLKLWRGKTCTQTFVGHTDTVRGLAV






MPDLGILSASHDGSIRLWAVSGECLMEMVDHTSIVYSVD






SHASGLIVSGSEDRFAKIWKDGVCFQSIEHPGCVWDVKF






LEDGDIVTACSDGTIRIWTNQEDRMANSTELELFDLELS





SYKRSRKRVGGLKLEELPGLEALQVPGTSDGQTKVIREG




DNGVAYAWNSTELKWDKIGEVVDGPEDSMNRPALDGVQY




DYVFDVDIGDGEPTRKLPYNRSDNPYDTADKWLLKENLP




LSYRQQIVEFILANSGQRDFNLDPSFRDPYTGSSAYVPG




APSQLAAKQARPTFKHIPKKGMLVFDAAQFDGILKKINE




FNNTLLSNQEKKNLSLTDIEISRLGAVVKILKDTSHYHS




SKFADADFDLMLKLLESWPYEMMFPVIDIFRMVILHPDG




ADGLLRHQEDKKDVLMESIKRATGNPSVPANFLTSIRAV




TNLFKNSAYYSWLQKHRSEMLDAFSSCSSSSNKNLQLSY




ATLLLNYAVLLIEKKDEEGQSQVLSAALELAENESLEVD




ARYRALVAIGSLMLDGLVKRIALDFDVEHIAKAARTSKE




AKIAEVGADIELLIKQS





82
The amino acid sequence of SEQ ID
MEFTEAYKQSGPCCFSPNARFIAVAVDYRLVIRDTLSLK



348. The conserved G-protein beta
VVQLFSCLDKISYIEWALDSEYILCGLYKRPMIQAWSLI



domain is underlined and the WD-40
QPEWTCKIDEGPAGIAYARWSPDSRHILTTSDFQLRLTV



repeat domains are in bold.
WSLVNTACVHVQWPKHASKGVSFTRDGKFAAICTRHDCK




DYINLLSCHNWEIMGVFAVDTLDLADIQWSPDDSAIVIW




DSPLEYKVLVYSPDGRCLFKYQAYESGLGVKSVSWSPCG






QFLAVGSYDQMLRVLSHLTWKTFAEFTHLSNVRAPCCAA







IFKEVDEPLQIDMSELSLSDDYMQGNSGDAPEGHYRVRY






DVTEVPITLPCQKPPADRPNPKQGIGLMSWSNDSQYICT






RNDSMPTILWIWDMRHLELAAILVQKDPIRAAVWDPTGT







RLVLCTGSSHLYMWTPSGAYCVSVPLSQFNITDLKWNSD





GSCLLLKDKESFCCAAAPLPPDESSDYSSDD





83
The amino acid sequence of SEQ ID
MATIAALDDDMVRSMSIGAVFSDFVGKLNSLDFHRKDDI



349. The conserved G-protein beta

LVTAGEDDSVRLYDIANARLLKTTFHKKHGTDRVCFTHH




WD-40 repeat domains are
PNSLICSSTKNLDTGESLRYISMYDNRSLRYFKGHKQRV



underlined.

VSLCMSPINDSFMSGSLDHSVRMWDLRVNACQGILRLRG





RPTVAYDQQGLVFAVAMEGGAIKLFDSRSYDKGPFDAFL





VGGDTSEVCDIKFSNDGKSVLLSTTNNNIYVLDAYAGDK





QCGFNLEPSPSTPIEASFSPDGQYVVSGSGDGTLHAWNI




SRRNEVACWNSHIGVASCLKWAPRRAMFVAASTVLTFWI




PNSEPELASAKGEAGVPPEQV





84
The amino acid sequence of SEQ ID
MSVAELKERHRAATETVNSLRERLKQKRVQLLDTDVAGY



350. The conserved G-protein beta
ARTQGKTPVTFGATDLVCCRTLQGHTGKVYSLDWTPERN



WD-40 repeat domains are

RIVSVSQDGRFIVWNALTSQKTHAIRLPCAWVMTCAFAP




underlined and the beta G-protein


NGQSVACGGLDSVCSIFNLNSPVDRDGNLPVSRMLSGHK




(transducin) is in bold.

GYVSSCQYVPDGDAHLITGSGDQTCVLWDITTGLRTSVF






GGEFQSGHTADVLSVSINGSSPRIFVSGSCDSTARMWDT





RVASRAVHTYHGHEGDVNAVKFFPDGNRFGTGSDDGTCR





LFDIRTGHELQVYYQQRGIDEIPHVTSIAFSISGRLLIA






GYSNGDCFVWDTLLAQVVLNLGSLQNSHEGRISCLGVSA






DGSALCTGSWDTNLKIWAFGGIRRVT






85
The amino acid sequence of SEQ ID
MKKRPRGASLDQAVVDIRRREVGGLSGLSFARRLAASEG



351. The conserved G-protein beta
LVLRLDIYNKLKGHRGCVNTVGFNLDGDIVISGSDDRHV



domain is underlined and the WD-40


KLWDWQTGKVKLSFDSGHLSNVFQAKIMPYTDDRSIVTC





repeat domains are in bold.


AADGQARHAQILEGGQVQTMLLAKHRGRAHKLAIDPGSP








HIVYTCGEDGLVQRLDLRSNTARELFTCREVYGTHVEVV








HLNAIAIDPRNPNLFVIGGSDEYARVYDIRNYKWNGSHN







FGRSANYFCPSHLIGEAHVGITGLAFSGQSELLVSYNDE







SIYLFTQEMGLGPDPLSASTKSVDSNSSEVTSPTAVNVD







DNVTPQVYKGHRNCETVKGVGFFGPKCEYVVSGSDCGRI







FIWKKKGGQLIRVMAADKHVVNCIEPHPHIPALASSGIE








NDIKIWTPKAIERATLPMNVEQLKPKARGWMNRISSPRQ





LLLQLYSLERWPEHGGETSSGLAASQEELTELFFALSAN




GNGSPDGGGDPSGPLL





86
The amino acid sequence of SEQ ID
MSKRGYKLQEFVAHSSNVNCLSIGKKACRLFLTGGDDCK



352. The conserved G-protein beta

VNLWAIGKPNSLMSLCGHTNAVESVAFDSAEVLVLAGAS




WD-40 repeat domains are

SGVIKLWDVEEAKLVRGLTGHRSNCTAMEFHPFGEFFAS




underlined and the Trp-Asp (WD)

GSTDTNLKIWDIRKKGCIHTYKGHTRGISTIRFSPDGRW




repeats signature is in bold.

VVSGGNDNVVKVWDLTAGKLLHDFKFHENHIRSIDFHPL






EFLLATGSADRTVKFWDLETFELIGSSRPEAAGVRAIAF






HPDGRTLFCGLEDSLKVYSWEPVICHDGVDMGWSTLADL





CIHDGKLLGCSYYQSSVGVWVADASLIEPYGTNVKPQQK




DSGDDEIEHQESRPSAKVGTTIRSTSIMRCASPDYETKD




IKNIYVDTASGNPVSSQRVGTTNFAKVTQPLDFNDTPNL




TLRRQGLVTETPDGLSGHVPSKSITQPKVVSRDSPDGKD




SSRRESITFSRTKPGMLLRPAHSRRPSSTKYDVDRLSAC




AEIGVLSSAKSGSESLVDSFLNIKVAPEDGARNGCEDNH




SSVKNVSVESEKVLPLQTPKTEKCDQTVGFKEEINSVKF




VNGVAVVPGRTRTLVEKFEKREKLNSTEDQTINTPENPT




LDKTPPPSLAENEEKSDRLNIVERKATRMSSHMVTAEDR




TPVTLVGSPEDQSTVMAPQRELPADESSKTPPLPVEDLE




IHHGSNVSEDKATILSSQTVSEEDSKRSTLIRNFRRRDR




FKSTEGRSPVMATQRKLPTDESGKTSSLPMEDLEIKGGL




NVSEDKATSFSSRAPPREDRAHSALVRNVRKRDKFKSTN




DTITVMVHQRGLSTDEASTVSVERVERRQLSNNVENPLN




NLPPHSVPPTTTRGEPQYVGSESDSVNHEDVTELLLGNH




EVFLSTLRSRLTKLQVV





87
The amino acid sequence of SEQ ID
MSTFLTGTALSNPNPNKSYEVVQPPNDSVSSLSFNPKAN



353. The conserved G-protein beta

FLVATSWDNQVRCWEIVRSGTSLGTTPKASISHDQPVLC




WD-40 repeat domains are

STWKDDGTTVFSGGCDKQVKMWPLSGGQPMTVAMHDAPI




underlined.

KEISWIPEMNLLVTGSWDKTLRYWDTRQANPVHIQQLPE






RCYALTVRHPLMVVGTADRNLIIYNLQSPQTEFKRISSP





LKYQTRCLAAFPDQQGFLVGSIEGRVGVHHLDDSQQSKN




FTFKCHREGSEIYSVNSLNFHPVHHTFATAGSDGAFNFW





DKDSKQRLKAMSRCSQPIPCSTFNNDGSIFAYSACYDWS





KGAENHNPATAKTYIFLHLPQESEVKGKPRLGTTGRK





88
The amino acid sequence of SEQ ID
MEVEAQQRDVNNVMCQLVDPEGTTLGPPMYLPQDVGPQQ



354. The conserved G-protein beta
LQQMVNKLLSNEDKLPYTFYISDQELVVPLESYLQKNKV



WD-40 repeat domains are
SVEKVLSIVYQPQAIFRIRPVNRCSATIAGHSEAVLSVA



underlined and the Trp-Asp (WD)

FSPDGKQLASGSGDTTVRLWDLSTQTPMFTCKGHKNWVL




repeats signatures are in bold.

SIAWSPDGKHLVSGSKAGEIQCWDPLTGQPSGNPLVGHK






KWITGISWEPVHLSSPCRRFVSSSKDGDARIWDVTLRRC






VICLSGHTLAVTCVKWGGDGVIYTGSQDCTIKVWETSQG





KLIRELKGHGHWVNSLALSTEYVLRTGAFDHTGKQYSSA




EEMKQVALERYKKMKGNAPERLVSGSDDFTMFLWEPSVS




KHPKTRMTGHQQLVNHVYFSPDGQWVASASFDKSVKLWN





GITGKFVAAFRGHVGPVYQISWSADSRLLLSGSKDSTLK







IWDIRTKKLKRDLPGHADEVFAVDWSPDGEKVVSGGKDK





VLKLWMG





89
The amino acid sequence of SEQ ID
MDAGSAHSSSNMKTQSRSPLQEQFLQRRNSRENLDRFIP



355. The conserved G-protein beta
NRSAMDFDYAHYMLTEGRKGKENPAVSSPSREAYRKQLA



WD-40 repeat domains are
ETLNMNRTRILAFKNKPPTPVELIPHELTSAQPAKPTKT



underlined.
RRYIPQTSERTLDAPDLLDDYYLNLLDWGSSNVLSIALG




NTVYLWNASDGSTSELVTIDDETGPVTSVSWAPDGRHIA





VGLNNSDVQLWDSADNRLLRTLRGGHRSRVGSLAWNNHI






LTTGGMDGLIVNNDVRVRSHIVDTYRGHTQEVCGLKWSA






SGQQLASGGNDNILHIWDRSTASSNSPTQWLHRLEEHTA






AVKALAWCPFQGNLLASGGGGGDRTIKFWNTHTGACLNS





VDTGSQVCALLWNKNERELLSSHGFTQNQLTLWKYPSMV





KIAELTGHTSRVLFMAQSPDGCTVASAAGDETLRFWNVF





GVPEVAKPAPKANPEPFAHLNRIR





90
The amino acid sequence of SEQ ID
MEEAIPFKNLPSREYQGHKKKVHSVAWNCTGTKLASGSV



356. The conserved G-protein beta

DQTARVWHIEPHGHGKVKDIELKGHTDSVDQLCWDPKHA




WD-40 repeat domains are

DLIATASGDKTVRLWDARSGKCSQQAELSGENINITYKP




underlined and the Trp-Asp (WD)
DGTHVAVGNRDDELTILDVRKFKPIHKRKFNYEVNEIAW



repeats signature is in bold.
NMSGEMFFLTTGNGTVEVLAYPSLRPVDTLMAHTAGCYC





IAIDPVGRYFAVGSADSLVSLWDISEMLCVRTFTKLEWP






VRTISFNHTGDYVASASEDLFIDISNVQTGRTVHQIPCR





AAMNSVEWNPKYNLLAYAGDDKNKYQADEGVFRIFGFESA





91
The amino acid sequence of SEQ ID
MGKDEEEMRGEIEERLINEEYKVWKKNTPFLYDLVITHA



357. The conserved G-protein beta
LEWPSLTVEWLPDREEPPGKDYSVQKLVLGTHTSENEPN



WD-40 repeat domains are
YLMLAQVQLPLEDAENDARHYDDDRADVGGFGCANGKVQ



underlined.
IIQQINHDGEVNRARYMPQNSFIIATKTVSAEVYVFDYS




KHPSKPPLDGACSPDLRLRGHSTEGYGLSWSKFKQGHLL





SGSDDAQICLWDINATPKNKSLDAMQIFKVHEGVVEDVA






WHLRHEYLFGSVGDDQYLLIWDLRTPSVTKPVQSVVAHQ






SEVNCLAFNPFNEWVVATGSTDKTVKLFDLRKISTALHT






FDAHKEEVFQVGWNPKNETILASCCLGRRLMVWDLSRID





EEQTPEDAEDGPPELLFIHGGHTSKISDFSWNTCEDWVV





ASVAEDNILQIWQMAENIYHDEDDVPGEESNKGS






92
The amino acid sequence of SEQ ID
MMRGFSCTEDGDAPSTSSTSPPPPPPPPHRQQMQAPRAS



358. The conserved G-protein beta
SSSSGQPTSRRSTGNVFKLLARREVSPRSKHSLKKFWGE



WD-40 repeat domains are
ASECQLCPFQQSYEAVRDVRRSLISWVEAFSLQHLSAKY



underlined.

CPLMPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTG






SCLKVLRGHRRTPWVVRFHPLYPEILASGSLDHEVHLWD





ANTAECIGSRNFYRPIASIAFHAQGDLLAVASGHKLYIW





HYNRSGETSSPTIVLRTPRSLRAVHFHPHAAPFLLTAEV





NDLDLTDSAMTLATSPGYLHYPPPTIYLADAHSNERSRL




EDELPLMPSPLLMWPSFTRDDGRATLPHIGGDVGLSGQQ




RVDSLSSGQYEFHPSPIEPSSSTSMHEEMGTDPFSSVRE




SEVTQSAMNIVDNTEVQPEERSTYSFSFSDPRFWELPSV




YGWLVGQTQAAPRTAPSPGALETASALGEVASVSPVRSE




FMPGGMDQPRLGGRSGSGCRSSGSRMMRTAGLNDHPHDE




NYPQSVVSKLRSELEASLAAAASTELPCTVKLRVWPYDM




KDPCALFRSESCRLTIPHAVLCSEMGAHFSPCGRFFAAC




VACVLPQLEADPVLHGQVDPDVTGVATSPTRHPVSAYQI




MYELRIYSLEEATFGMVLASRSIRAAHCLTSIQFSPTSE




HLLLAYGRRHNSLLKSIVIDGENTVPIYSILEVYRVSDM





ELVRVLPSAEDEVNVACFHPSVGGGLVYGTKEGKLRILQ





IDSSGGLNPKSTGFLDENMAEVPTYALEC





93
The amino acid sequence of SEQ ID
MGEGDLPRTEAGVLRGHEGAVLAARFNGDGNYCLSCGKD



359. The conserved G-protein beta

RTIRLWNPHRGIHIKTYKSHGREVRDVHCTSDNSKLISC




WD-40 repeat domains are

GGDRQIFYWDVSTGRVIRRFRGHDSEVNAVKFNDYASVV




underlined.

VSAGYDRSVRAWDCRSHSTEPIQIINTFQDSVMSVCLTK






TEIIGGSVDGTVRTFDIRIGREISDDLGQPVNCISMSND






GNCILASCLDSTLRLVDRSAGELLQEYKGHTCKSYKLDC






CLTNTDAHVAGGSEDGYVFFWDLVDASVISKFRAHSSVV






TSVSYHPKEDCMITASVDGTIKVWKT






94
The amino acid sequence of SEQ ID
MACIKGVGRSASVAMAPDGGYLATGTMAGTVDLSFSSSA



360. The conserved G-protein beta
SLEIFGLDFQSDDRDLPLIAESPSSERFNRLSWGKNGSG



WD-40 repeat domains are
SDEFSLGLIAGGLVDGTIGLWNPLSLIRSEAGDKAIVGH



underlined.

LSRHKGPVRGLEFNVIAPNLLASGADDGEICIWDLAAPR





EPSHFPPLRGSGSAAQGEISFLSWNSKVQHILASTSYNG





TTVVWDLKKQKPVISFSDSVRRRCSVLQWNPDLATQLVV





ASDEDSSPTLRLWDMRNIMSPVKEFAGHTRGVIAMSWCP





NDSSYLVTCAKDNRTICWDTVTGEIVCELPAGSNWNFDV






HWYPKIPGVISASSFDGKIGIYNVEGCSRYGVRENEFGA





ATLRAPKWFKRPVGASFGFGGKVVSFHTRSTGGPSVNSS




EVFVHDIITEQTLVSRSSEFEAAIQSGDRPSLRALCEKK




SQHCESTDDQETWGFLKVLLEDDGTARSKLLAHLGFDIP




TETNDGSQEDLSQQVNALGLEDVTADKVVQEDNNESMVF




PTDNGEDFFNNLPSPRADTPVSTSADGFPTVNAAVEPSQ




DEVDGLEESSDPSFDDSVQRALVVGDYKAAVALCMSANK




LADALVIAHVGGASLWESTRDKYLKMSRLPYLKVVFAMV




NNDLQSLVDTRPLKFWKETLAILCSFAQGEEWAMLCNSL




ASKLMAAGNMLAATLCFICAGNIDKTVEIWSRSLATEHD




GMSYMDLLQDLMEKTIVLALASGQKQFSASVCKLVEKYA




EILASQGLLTTAMDYLKLLGTDDLSPELAVLRDRIAFSV




EAEKGANISAFNGSQDPRGAVYGVDQSNYGMVDTSQHYY




PEAAQPQVPHTVPGSPYGENYQQPFGSSFGKGYNTPMQY




QAPSQASMFVPSEPPQNAQPSFVPTPVTSQPTTRSQFIP




APPLALRNPEQYQQPTLGSHLYPGSVNPTFQPLPHAPGP




VAPVPPQVSSVPGQNMPQAVAPTQMRGFMPVTNPGVVQN




PGPISMQPATPIESAAAQPVVSPAAPPPTVQTADTSNVP




APQKPVIATL





95
The amino acid sequence of SEQ ID
MKERGKGAGRSVDERYTQWKSLVPVLYDWLANHNLVWPS



361. The conserved G-protein beta
LSCRWGPQLEQATYKNRQRLYLSEQTDGSVPNTLVIANV



WD-40 repeat domains are
EVVKPRVAAAEHISQFNEEARSPFVKKFKTIIHPGEVNR



underlined.
IRELPQNSKIVATHTDSPDVLIWDVETQPNRHAVLGAST




SRPDLILTGHKDNAEFALAMSPTEPFVLSGGKDRYVVLW





SIQDHISTLAADPGSAKSPGSAGTNNKQSSKAAGGNDKT





GDSPSIEPRGVYLGHGDTVEDVTFCPSSAQEFCSVGDDS





CLILWDARTGSSPAIKVEKAHHADLHCVDWNPHDVNLIL






TGSADNTVRMFDRRNLTSGGVGSPVHTFEGHNAAVLCVQ






WSPDKSSVFGSSAEDGILNIWDHEKIGRKIETVGSKVPN





SPPGLFFRHAGHRDKVVDFHWNSSDPWTIVSVSDDGEST





GGGGTLQIWRMIDLIYRPEEEVLAELDKFKSHILSCTS






96
The amino acid sequence of SEQ ID
MAKIAPGCEPVAGTLTPSKKREYRVTNRLQEGKRPLYAV



362. The conserved G-protein beta
VFNFIDSRYFNVFATVGGNRVTVYQCLEGGVIAVLQSYI



WD-40 repeat domains are

DEDKDESFYTVSWACNIDRTPFVVAGGINGIIRVIDAGN




underlined and the Trp-Asp (WD)
EKIHRSFVGHGDSINEIRTQPLNPSLIVSASKDESVRLW



repeats signature is in bold.


NVHTGICILIFAGAGGHRNEVLSVDFHPSDKYRIASCGM






DNTVKIWSMKEFWTYVEKSFTWTDLPSKFPTKYVQFPVF






IAPVHSNYVDCNRWLGDFVLSKSVDNEIVLWEPKMKEQS





PGEGSVDILQKYPVPECDIWFIKFSCDFHYHSIAIGNRE





GKIYVWELQSSPPVLIAKLSHPQSKSPIRQTAMSFDGST






ILSCCEDGTIWRWDAITASTS






97
The amino acid sequence of SEQ ID
MNTAMHFGAGWRSIAEMGYTMSRLEIEPESCEDEKSLDG



363. The conserved G-protein beta
VGNSQGPNELPRCLDHELAHLTNLKSRPHEHLIRDFPGR



WD-40 repeat domains are
RALPVSTVKMLAGRECNYSRRGRFSSADCCHMLSRYVPV



underlined.
NGPSPLDQMNSRAYVSQFSADGSLFVAGFQGSHIRIYNV




DKGWKCQKNILTKSLRWTITDTSLSPDQRYLVYASMSPI




VHIVDIGSAAMDSLANITEIHEGLDFSADSGPYSFGIFS





VKFSTDGREVVAGSSDDSIYVYDLVANKLSLRIPAHESD






VNTVCFADESGHIIYSGSDDTYCKVWDRRCLSARNKPAG






VLMGHLEGITFIDSRGDGRYFISNGKDQTIKLWDIRKMG





SDICRRGFRNFEWDYRWMDYPPRARDSKHPFDLSVATYK





GHSVLRTLIRCYFSPVHSTGQKYIYTGSHDSCVYIYDVV





TGAQVAALKHHKSPVRDCSWHPEYPMIVSSSWDGDIVKW





EFFGNGETEIPAMKKRIRRRHLY






98
The amino acid sequence of SEQ ID
MEPQPQAPKKRGRKPKPKEDKKEEQLHQPPPPPPPQQQA



364. The conserved G-protein beta
APAPAPAATRSSTSGSAGGRDRRPQQQHAVDEKYARWKS



WD-40 repeat domains are
LVPVLYDWLANHNLLWPSLSCRWGPQLEQATYKNRQRLY



underlined.
ISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQFNEEAR




SPFIRKYKTIIHPGEVNRVRELPQNPNIVATHTDSPDVL




IWDVESQPNRHAVYGATASRPNLILTGHQENAEFALAMC




PAEPFVLSGGKDKTVVLWSIQDHITASATDQTTNKSPGS




GGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAF





CPSTAQEFCSVGDDSCLILWDARVGTNPVAKVEKAHNGD






LHCVDWNPHDNNLILTGSADNSVNMFDRRNLTSNGVGSP






VYKFEGHKAAVLCVQWSPDKPSVFGSSAEDGLLNIWDYE





RVDKKVDRAPNAPAGLFFQHAGHRDKIVDFHWNAADPWT




MVSVSDDCDTAGGGGTLQIWRMSDLIYRPEEEVLAELEN




FKAHVLECSKA





99
The amino acid sequence of SEQ ID
MGIFEPYRAVGYITTGVPFSVQRLGTETFVTVSVGKAFQ



365. The conserved G-protein beta
VYNCAKLSLVLVGPQLPKKIRALASYREYTFAAYGSDIG



WD-40 repeat domains are
IFKRAHQLATWSGHTAKVCLLLLFGEHILSVDVDGNAYI



underlined and the Trp-Asp (WD)
WAFKGMNYNLSPVGHILLDSNFTPSCIMHPDTYLNKVIL



repeats signature is in bold. The
GSQEGPLQLWNISTKTKLYEFKGWNSSVSSCVSSPALDV



Utp21 specific WD40 associated
VAVGCADGKIHVHNIRYDEELVTFSHSMRGSVTALSFST



putative domain is in italics.

DGQPLLASGSSSGVVSIWNLDKRRLQSVIRDAHDGSIIS





LHFFANEPVLMSSSADNSIKMWIFDTSDGDPRLLRFRSG




HSAPPLCIRFYANGRHILSAGQDRAFRLFSVVQDQQSRE




LSQRHVSKRAKKLKLKEEEIKLKPVIAFDVAEIRERDWC




NVVTSHMDTPQAYVWRLQNFVIGEHILRPCPNKPTPVKA




CMISACGNFAILGTAGGWIERFNLQSGISRGSYIDQLEG




TNSAHDGEVVGVACDATNTLMISAGYAGDIKVWDFKGRE




LKSRWEIGSSLVKISYHRLNGLLATVADDFIIRLFDAVA





LRMVRKFEGHTDRITDLCFSEDGKWLLSSSMDGSLRIWD






IILARQVDAVFVDVSITALSLSPNMDILATTHVDQNGVF





LWVNQSMFSGDSDINLYASGKEVVTVKLPSVSSVEGSQV





EESNEPTIRHSESKDVPSFRPSLEQIPDLVTLSLLPKSQ






WQSLINLDIIKVRNKPVEPPKKPEKAPFFLPSIPSLSGE






ILFKPSEMSDKGDMKADEDKSKITPEVPSSRFLQLLHSC






SEAKNFSPFTTYIKGLSPSTLDLELRMLQIIDDDAVDAD






ADDPQDVDKRQELLSIELLMDYFIHEISCRSNFEFVQAL






VRLFLKIHGETIRRQSVLQNKAKVLLETQCSVWQRVDKL






FQGARCMVAFLSNSQF






100
The amino acid sequence of SEQ ID
MEETKVTCGSWIRRPENVNLAVLGRSPRRRGSAALEIFA



366. The conserved G-protein beta
FDPKSTSLSSSPLVAHVIEEIEGDPLAIAVHPNGEDIVC



WD-40 repeat domains are
FASSGSCLSFELSGQESNLKLLTKELPPLRGIGPQKCMA



underlined.

FSVDGSRFATGGVDGRLRILEWPSLRIILDEPKAHKSIR






DLDFSLDSEFLATTSTDGSARIWKAEDGLPCTTLTRRSD





EKIELCRFSKDGTKPFLFCTVQRGDKAVTGVWDISTWNK




IGHKRLLRKPAVVMSISLDGKYLAQGSKDGDMCVVEVKK




MEVSHWSKRLHLGTSLTSLEFCPIERVVITTSDEWGVLV




TKLNVPADWKAWQVYLLLLGLFLASLVAFYIFYENSDSF




WGFPLGKDQPARPKIGSVLGDPKSADDQNMWGEFGPLDM





101
The amino acid sequence of SEQ ID
MADPVEHQHQQHQQHQLQQQRRRGWRIQGGQYLGEISAL



367. The conserved G-protein beta
CFLHLPPPPLSLSSSPVLSLSSGLDSESRDRPACSFRFP



WD-40 repeat domains are
SAGSGSQVSLFDLASGAMVRTFYVFRGIRVHGIVLGCAD



underlined.
FPGGSSSSSSTLDYVIAVYGERRVKLFRLSVRLGRGAGE




GSGTVLSADLELVSAAPRLSHWVMDVRFLKENGTSEDEL




QRCLTVAIGCSDNSIRLWDVDKCSFVLAVSSPERCLLYS




MRLWGDNLEDLQVASGTIYNEILIWKVVPNHDAPSSNEL




TEEGLTNSCAGNSVHECLRYEAYHICRLVGHEGSIFRIA





WSSDGSKLVSVSDDRSARIWEVHCKVQYSEDAGEVGLLF






GHSARVWDCYISDNLIVTAGEDCSCRVWGLDGQQHDVIK





EHIGRGIWRCLYDPWSSLLVTGGFDSAIKVHKLDASLAE




ASAKQSNIKDLSDGTELFTTHLPNSSGHSGHMDSKSEYV




RCLSFSCEDVMYIATNHGYLYHAKLCNDGDLRWTELAQV




SNEVQIICMELLPSNPYDPRIDADDWVAVGDGKGWTTVV




RVVKNSDSPKVSTSFSWAAEMDRQLLGIHWCKSLGHRFI




FTADPRGALKLWRFFEVSQSSSLYPENSPRISLIAEFKS




DLGARIMCLDVAFESELLICGDLRGNLVLFPLLKDLLLD




TFVVSAAKISPVNHFKGAHGISAVSSISVAHMSFNHIEL




RSTGADGCICYMEYDKGLQSLNFVGMKQVKELSMIESVS




TENESTGYRTSGSYASGFASTDFIIWNLVTEAKVLQVSC




GGWRRPHSYYLGDVPEMKNCFAYVKDDIIYIRRHWIKDS




KDKILPQNLRLQFHGREVHSLCFVTGDFQLRKNKQSSWI




VTGCEDGTVRLTRYTQCTDNWSSSKLLGEHVGGSAVRSI




CCVSNIHTTSSGTSVSDVKGIENLPKDIKGTLMEDECNP




SLLISVGAKRVLTSWLLRRRKQDGKEDDVTDLQEAENSS




LPSSAGSSTFSFQWLSTDMPVKYSVPSKKSGSIKKLIGV




SDTNVRCKSL





102
The amino acid sequence of SEQ ID 368. The conserved G-protein beta WD-40 repeat domains are underlined.

MPYKLSATLSNHSSDVRAVASPSDDLILSASRDSTAISW

FRQSPSSFTPASVIRAGSRFVNAIAYLPPTPRAPQGYAV

VGGQDTVVNVFALGPGDKEEPEYTLVGHTDNVCALSVNS

DDTIISGSWDKTAKVWKDFALVYDLKGHQQSVWAVLAMN






EKEFLTASADRTIKYWVQHKTMQTYEGHRDAVRGLALIP






DIGFASCSNDSEIRVWTMGGDVVYTLSGHTSFVYSLSVL






PNGDLVSAGEDRSVRVWRDGECSQVIVHPAISVWAVSTM






PNGDIISGSSDGVVRVFSESEKRWATASELKALEDQIAS





QSLPSQQVGDVKKTDLPGPEALSVPGKKAGEVKMIRSGD




VVEAHQWDSLASSWQKIGEVVDAIGSGRKQLHDGKEYDY




VFDVDIQEGAPPLKLPYNVSENPYTAAQRFLEQNDLPTG




YLDQVVKFIEQNTAGVKLGNDGYVDPFTGASRYQPATQS




TSNTASSSYMDPFTGGSRHIAESAPSNVPQGSHATGIIP




FSKPIFFKLANVSAMQAKMFQFDEVLRNEISTATLAMRP




DEVIMVNETFTYLSKVVTSTSSARTSLGWIHIETIMQIL




DRWPVPQRFPVIDLGRLVTAYCMNAFSGPGDLEKFFSCL




FRTSEWTSITSGSKALTKAQETNVLLLFRTIANSLDGAP




LNDMEWIKQIFRELAQTPQLVLNKSHRLALASVLFNFSC




IGLKGPVPADVRTLHLTIILQVLRSPNDDPEVAYRTCVA




LGNMLYSDKTRGTPRDAQSPSPTELKSAVAAIKGGFSDP




RINDVHREIMSLI





103
The amino acid sequence of SEQ ID 369. The conserved G-protein beta domain is underlined and the WD-40 repeat domains are in bold.

MPPQKIESGHKDTVHDLAMDYYGKRLATASSDHTINVVG

VSSSGSQHLATLIGHQGPVWQISWAHPKFGSLLASCSYD

GRVIIWREGNPNEWTQAQVFEEHKSSVNSVAWAPHELGL


CLACGSSDGNISVFTARQDGGWDTSRIDQAHPVGVTSVS








WAPSTAPGALVGSGMMEPVQKLCSGGCDNTVKVWKLYNR







VWKLDCFPVLQMHTDWVRDVAWAPNLGLPKSTIASASQD







GRVIIWTLAKEGDQWQGKVLYDFRTPVWRVSWSLTGNIL








AVADGNNNVSLWNEAVDGEWIQVSTVEP






104
The amino acid sequence of SEQ ID 370. The conserved G-protein beta WD-40 repeat domains are underlined and the Trp-Asp (WD) repeats signature is in bold.

MSAPMLEIEARDVVKIVLQFCKENSLHQTFQTLQSECQV

SLNTVDSIETFVADINSGRWDAILPQVAQLKLPRNTLED

LYEQIVLEMIELRELDTARAILRQTQAMGVMKQEQPERY

LRLEHLLVRTYFDPNEAYQDSTKEKRRAQIAQALAAEVT

VVPPSRLMALVGQALKWQQHQGLLPPGTQFDLFRGTAAM




KQDVDDMYPTTLSHTIKFGTKSHAECARFSPDGQFLVSC





SVDGFIEVWDYMSGKLKKDLQYQADETFMMHDDPVLCVD






FSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGV






TSVLFSRDGSQLLSTSFDGSARIHGLKSGKQLKEFRGHS






SYVNDAIFSNDGSRVITASSDCTVKVWDVKTSDCLQTFK






PPPPLRGGDASVNSVHLFPKNADHIVVCNKTSSIYIMTL






QGQVVKSLSSGKREGGDFVAACVSPKGEWIYCVGEDRNL






YCFSCQSGKLEHLMKVHEKDVIGVTHHPHRNLVATYSED






STMKLWKP






105
The amino acid sequence of SEQ ID 371. The conserved G-protein beta WD-40 repeat domains are underlined.

MDLLQSYAEDNDGDLGRHSSPEPSPPRLLPSKSAAPKVD

DTTLALTVAQTNQTLARPIDPSQHAVAFNPTYDQLWAPI

CGPAHPYAKDGIAQGMRNHKLGFVEDAAIGSFLFDEQYN

TFQRYGYAADPCASTGNEYVGDLDALKQNDGISVYNIRQ




QEQKKYAEEYAKKKGEERGEGGREKAEVVSDKSTFHGKE




ERDYQGRSWIAPPKDAKATNDHCYIPKRLVHTWSGHTKG





VSAIRFFPKHGHLILSAGMDTKVKIWDVFNSGKCMRTYM






GHSKAVRDISFCNDGTKFLTAGYDKNIKYWDTETGKVIS






TFSTGKIPYVVKLHPDDEKQNILLAGMSDKKIVQWDMNT





GQITQEYDQHLGAVNTITFVDDNRRFVTSSDDKSLRVWE




FGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDNQI





LIYSTRERFQLNKKKRFAGHIVAGYACQVNFSPDGRFVM






SGDGEGRCWFWDWKSCKVFRTLKCHEGVCIGCEWHPLEQ






SKVATCGWDGLIKYWD






106
The amino acid sequence of SEQ ID 372. The conserved G-protein beta WD-40 repeat domains are underlined.

MESNGNLEQTLQDGRIYRQLNSLIVAHLRDHNFPQAASA

VALATMTPLNVEAPRNRLLELVAKGLAVEKGELLRGVSH

AGTNDLGGSIPASYGLVPAPWTAIDFSSLRDTKGMSKSF

TKHETRHLSDHKNVARCARFSTDGRFFATGSADTSIKLF






EVSKIKQMMLPDSTDGAIRAVIRTFYDHTHPVNDLDFHP






QNTVLISAAKDHTVKFFDYSKATAKRAFRVIQDTHNVRS






VAFHPSGDFLLAGTDHPIPHLYDVNTFQCYLSANVPEFA






VNAAINQVRYSSSGGMYVTASKDGTIRFWDGASANCVRS






IAGAHGAAEVTSANFTKDQRYVLSCGKDSTVKLWEVGTG





RLVKQYLGATHMQLRCQAVFNNTEEFVLSIDEPSNEIVV




WDAMTAEKVARWPSNHNGPPRWIEHSPTEAAFVSCSTDR





SIRFWKETH






107
The amino acid sequence of SEQ ID 373. The conserved G-protein beta WD-40 repeat domains are underlined.

MSNFQGEDGEYVADDFEAEDGDEELHGRESADPESDVDE

IDTPSNRFTDTTADQARRGRDIQGIPWERLSITREKYRR

TRLEQYKNYENVPQSGEKSGKDCTVTEKGNSFYEFRRNS

RSVKSTILHFQLRNLVWATSKHDVYLMSNYSVVHWSSLT




GKKSEVLNLAGHVAPNEKHPGSLLEGFTQTQVSTLAVKD




RFLVAGGFQGELICKFLDRPGISFCSRTTYDDNAITNAV




EIYVSPSGGIHFIASNNDCGVRDFDMENFELSKHFRFPW





PVNHTSLSPDGKLLVIVGDDPEGILVDAKTGKTIMPLRG






HLDFSFASEWHPDGVTFATGNQDKTCRIWDIRNLSKSIA






VLKGNLGAIRSIRYTSDGRYMAIAEPADFVHVYDTKTGY





KKEQEIDFFGEISGMSFSPDTESLFIGVWDRTYGSLLEY




GRRRNFSYLDCLV





108
The amino acid sequence of SEQ ID 374. The conserved G-protein beta WD-40 repeat domains are underlined and the splicing factor motif is in bold.

MGVEEDLEDLNALAESTDAAVDGQAALASAVDSVTLQPA

PPILPPVIPPPAVPVVAPVPTIPPVLRPLAPLPIRPPVL

RPPAPKRDEAGSSDSDSDHDGTAAGSTAEYEITEESRLV

RERHEKAMQDLMMKRRGAALAVPTNDKAVRARLRRLGEP

MTLFGEREMERRDRLRMLMAKLDAEGQLEKLMKAHEDEE





AAASAAPEDVEEEMLQYPFYTEGSKALFNARIDIAKFSI




TRAALRLERARRRRDDPDEDVDAEIDWALKKAESLSLHC




SEIGDDRPLSGCSFSHDGKLLATCSMSGVAKLWDTCRMP




QVNRVLTLKGHTERATDVAFSPVQNHIATASADRTAKLW





NTEGTILKTFEGHLDRLGRIAFHPSGKYLGTTSFDKTWR






LWDIESGEELLLQEGHSRSIYGIDFHRDGSLVASCGLDA





LARVWDLRTGRSILALEGHVKPVLGVSFSPNGYHLATGG





EDNTCRIWDLRKKKSLYTIPAHANLISEVKFEPQEGYFL





VTASYDTTAKVWSARDFKPVKTLSVHEAKITSVDITADA




SHIVTVSHDRTIKLWTSNDDVKEQAMDVD





109
The amino acid sequence of SEQ ID 375. The conserved G-protein beta WD-40 repeat domains are underlined, and the conserved Dip2/Utp12 domain is in bold.

MVKAYLRYEPAAAFGVIASVESNIAYDASGKHLLAPALE

KVGVWHVRQGVCTKALAPSASSAAGPSLAVTAIASSPSS

LIASGYADGSIRIWDFEKGSCETTLNGHKGAVSVLRYGK

LGSLLASGSKDNDIILWDVVGETGLYRLRGHRDQVTDLV

FLDSDKKLVSSSKDKYLRVWDLETQHCMQIVGGHHSEIW




SLDTDPEERYLVTGSADPELRFYTVKNDSSDERSEADAS




GGVGNGDLASHNKWDVLKQFGEIQRQSKDRVATVRFNKN




GNLLACQAAGKLVEVFRVLDEAEAKRKAKRRLHRKREKK




GADVNENSDSSRGIGEGHDTMVTVADVFKLLQTIRASKK




ICSISFCPVAPKSSLATLALSLNNNLLEFHSIEADKTSK




MLTIELQGHRSDVRSVTLSSDNTLLMSTSHNSVKIWNPS




TGSCLRTIDSGYGLCGLIVPQNKHALIGTKDGAIEIFDV




GSGTCIEVVEAHGGSIRSIVAIPNQNGFVTGSADHDIKF




WEYGMKQKPGDNSKHLTVSNVRTLKMNDDVLVVAVSPDA




QKIAVALLDCTVKVFFMDSLKLMHSLYGHRLPVLCLDIS




SDGDLIVTGSADKNLMIWGLDFGDRHKSIFAHGDSIMAV




QFVGNTHYMFSVGKDRLVKYWDADKFELLLTLEGHHADI




WCLAISNRGDFLVTGSHDRSIRRWDRTEEPFFIEEEKEK




RLEEMFESDLDNAFGNKYVPKEEIPEEGAVALAGKKTQE





TLSATDSIIEALDIAEVELKRIAEHEEEKNNGKTAEFHP






NYVMLGLSPSDFILRALSNVQTNDLEQTLLALPFSDALK






LLSYLKDWTTYPDKVELVSRIATVLLQTHYNQLVSTPAA






RPLLTTLKDILHKKVKECKDTIGFNLAAMDHLKQLMALR






SDALFQDAKVKLLEIRSQLSKRLEERTDPREAKRRKKKQ






KKSTNMHAWP






110
The amino acid sequence of SEQ ID 376. The conserved G-protein beta WD-40 repeat domains are underlined.

MGGVQAEREDKDKVSLELTEEILQSMEVGMTFRDYSGRI

SSMDFHRASSYLVTASDDESIRLYDVASATCLKTINSKK

YSVDLVSFTSHPMTVIYSSKNGWDESLRLLSLHDNKYLR

YFKGHHDRVVSLSLCPRNECFISGSLDRTVLLWDQRAEK





CQGLLRVQGRPATAYDDPGLVFAIAFGGCVRMFDARKYE




KGPFEIFSVGGDVSDANVVKFSNDGRLMLLTTTDGHIHV





LDSFRGTLLYTFNVKPTSSKSTLEASFSPEGMFVISGSG






DGSVYAWSVRGGKEVASWLSTDTEPPVIKWAPGNLMFAT





GSSELSFWIPDLSKLGAYVGRK





111
The amino acid sequence of SEQ ID 377. The conserved G-protein beta WD-40 repeat domains are underlined.

MAAFGAAPAGNHNPNKSSEVIQPPSDSVSSLCFSPRANH

LVATSWDNQVRCWELTKNGASVTSVPKASMSHDQPVLCS

AWKDDGTTVFSGGCDKQAKMWSLMSGGQPVTVAMHDAPI

KEIAWIPEMNVLVTGSWDKTLKYWDTRQSNPVHTQQLPE





RCYAMTVRYPLMVVGTADRNLIVFNLQNPQAEFKRFSSP





LKYQTRCVAAFPDQQGFLVGSIEGRVGVHHLDDSQISKN






FTFKCHRDNNDIYSVNSLNFHPVHHTFATAGSDGTFNFW






DKDSKQRLKAMSRCSQPIPCSTFNNDGTIYAYSVCYDWS





KGAENHNPATAKTYIFLHLPQESEVKAKPRVGTTNRK





112
The amino acid sequence of SEQ ID 378. The conserved G-protein beta WD-40 repeat domains are underlined.

MNCSISGEVPEEPVVSTKSGHVFERRLIERYVSDYGKCP

VSGEPLTMDDVLPVKMGKIVKPRPLQAASIPGLLSIFQN

EWDSLMLSNFALEQQLHTARQELSHALYQHDAACRVIAR

LKKERDEARSLLALAERQIPMTASSDIAVNAPAMSNGRK




ASLDEEPGYAGKKMRPGISASIIAEITDCNLALSQQRKK




RQIPSTLAPVEDLERYTQLSSYPLHKTGKPGITSLDICH





SKDIIATGGIDTSAVLFDRSSGQIMSTLSGHSKKVTSVN






FDAQGDMVLTGSADKTVRIWQGSEDGSYNCRHILKDHTA






EVQAITVHATNNYFATASLDNTWCFYEFSTGLCLTQVEG






ASGSEGYTSAAFHPDGLILGTGTSNADVKIWDVKTQANV






TTFSGHTGAITAISFSENGYFLATAAQDGVKLWDLRKLK






NFRTFSAYDKDTGTNSVEFDHSGCYLGLAGSDIRVYQVA





SVKSEWNCVKTFPDLSGTGKVTCVKFGPDSKYIAVGSMD





HNLRIFGLPSEDGAMES






113
The amino acid sequence of SEQ ID 379. The conserved G-protein beta domain is underlined and the WD-40 repeat domains are in bold.

MAAPGVETLKKEIKELKEKIAQHRLDTDGEQPLPAAAKS

KSVPEVSAALKQRRILKGHFGKIYALHWSADSRHLVSAS

QDGKLIIWNGFTTNKVHAIPLRSSWVMTCAYSPSGNLVA


CGGLDNLCSVYKVPHGGNKESSSAQKTYGELAQHEGYLS








CCRFIKDNEIVTSSGDSTCILWDVETKTPKAIFNDHTGD








VMSLAVFDDKGVFVSGSCDATAKLWDHRVHKQCVMTFQG








HESDINSVQFFPDGDAFGTGSDDSSCRLFDIRAYQQINK








YSSDKILCGITSVAFSKTGKSLFAGYDDYNTYVWDTLSG








NQVEVLTGHENRVSCLGVSEDGKALATGSWDTLLKIWA







114
The amino acid sequence of SEQ ID 380. The conserved G-protein beta WD-40 repeat domains are underlined.

MGGVEDESEPASKRMKLSSRVLRGLANGSSRTEPAAGSS

LDLMARPLPIEGDEEVIGSKGVIKRVEFVRLIAKALYSL

GYEKSGARLEEESGIPLQSSVVNLFMQQISDGLWDESVV

TLHKIGLSDENLVKSASFLILEQKFLELLDQEKAMDALK




TLRTEITPLCIKNSRVRELSSCIISPSSCGLLNQNKRNS




TRARSRSELLEELQKLLPPAVIIPERRLEHLVEQALVLQ




TDACMLHNSIDMEMSLYTDHQCGKEHIPCRTLQILQSHN





DEVWLVQFSHNGKYLASASNDRSAIIWEVDENGSVSLKH






KLTGHQKPISSVCWSPDDRQLLTCGVGETVRRWDVSSGE






CLRVYEKAGHGLISCAWFPDGKWICYGVSDRSICMCDLE






GKEIECWKGQRTLSISDLEITSDGKQIISICRETAILLL






DREAKYERMIEENQTITSFSLSKDNRYLLVNLLNQEIHL






WDIKGDFRLVAKYKGLKRSRFVIRSCFGGLKQAFVASGS






EDSQVYIWHKGSGELIEPLPGHSGAVNCVSWNPANHHML






ASASDDRTIRIWGLNELNTRHKGARPNGVHYCNGNGTS






115
The amino acid sequence of SEQ ID 381. The conserved G-protein beta WD-40 repeat domains are underlined.

MTQLAETYACMPSTERGRGILIAGNPKPGSNSVLYTNGR

SVVILNLDNPLDISVYAEHAYPATVARFSPNGEWVASAD

SSGAVRIWGAYNDHVLKKEFKVLSGRIDDLQWSPDGLRI

VASGDGKGKSLVRAFMWDSGTNVGEFDGHSRRVLSCAFK





PTRPFRIVTCGEDFLVNFYEGPPFKFKLSRRDHSNFVNC






LRFSPDGNRFISVSSDKKGIIYDGKTGEKIGELSSDGGH






TGSIYAVSWSPDSKQVITVSADKSAKIWDISEDGSGNLR





KTLTSSGSGGVDDMLVGCLWQNNHLVTVSLGGTISIYTA




GDLDKAPVSFSGHMKNVSSLSVLKGDPKVILSSSYDGLI





IKWIQGIGFSGRVQRKESTQIKCLAAVDEEIVTSGYDNK





VCRVSGSGDAEFIDIGCQPKDLSLALQCPEFALVSTDTG




VVLLRGAKIVSTINLGFAVTASTVAPDGTEAIIGAQDGK





LRIYSISGDTLTEEAVLEKHRGAISVIHYSPDLSMFASG






DLNREAVVWDRASREVRLKNILYHTARINCLAWSPDSST






VATGSLDTCVIIYEVDKPASNRLTIKGAHLGGVYGLAFT






DDFSVVSSGEDACIRVWKINRQ






116
The amino acid sequence of SEQ ID 382. The conserved G-protein beta WD-40 repeat domains are underlined and the SOF1 protein domain is in bold.

MKVKVISRSTDEFTRERSQDLQRVFRNFDPNLRTQEKAV

EYVRALNAAKLDKVFARPFVGAMDGHVDSVSCMAKNPNY

LKGIFSGSMDGDIRLWDIASRRTVCQFPGHQGPVRGLAA

STDGQILVSCGIDSTVRLWNVPVATLGESDGTHENLAKP

LAVYVWKNAFWAVDHQWDGELFATAGAQVDIWNQNRSQP





ISSFEWGTDTVISVRFNPGEPNVLATSGSDRSITLYDLR





MSSPTRKVIMRTKTNAISWNPMEPMNFTAANEDCNCYSY




DARKLEEAKCVHKDHVSAVMDIDYSPTGREFVTGSYDRT





VRIFQYNGGHSREVYHTKRMQRVFCVKFSCDASYVISGS






DDTNLRLWKAKASEQLGVVLPRERRKHEYHEAVKSRYKH






LPEVKRIVRHRHLPKPIYKAGILRRTVNEADRRKEERRK






AHSAPGSSSAEPLRKRRIIKEIE






117
The amino acid sequence of SEQ ID 383. The conserved G-protein beta WD-40 repeat domains are underlined.

MVRSIKNPKKAKRKNKGSKNGDGSSSSSSIPSMPTKVWQ

PGVDKLEEGEELQCDPSAYNSLHAFHIGWPCLSFDIVRD

TLGLVRTEFPHQVYFVAGTQAEKPTWNSIGIFKVSNITG

KRRELVPSKPTDDADEESDSSDSDEDSDDEVGGSGTPIL




QLRKVGHEGCVNRIRAMNQNPHICASWGDSGHVQIWDFS




SHLNALAESEADVSQGASSVFNQAPLVKFGGHKDEGYAL




DWSPLVPGRLVSGDCKNSIHLWEPTSGSTWNVDSTPFIG





HAASVEDLQWSPTEENVFASCSVDGTIAIWDTRLGKTPA






ASFKAHDADVNVISWNRLATCMLASGCDDGTFSIHDLRL





LKEGDSVVAHFEYHKHPVTSIEWSPHEASTLAVSSADCQ





LTIWDLSLEKDEEEEAEFKAKTKEQVNAPEDLPPQLLFV





HQGQKDLKELHWHAQIPGMIVSTAADGFNILMPSNIQST




LPSDGA





118
The amino acid sequence of SEQ ID 384. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are in bold.

MERYKVIKELGDGTYGSVWKALNQQTHEIVAIKKMKRKY

YIWEECINLREVKSLRKLNHPNIIKLKEVIRENNELFFI

FEYMECNLYQIMKERSTPFSETAIIKFCYQILQGLSYMH

RNGYFHRDLKPENLLVTSDLIKIADFGLAREVLTSPPYT

DYVSTRWYRAPEVLLQSPTYTTAIDMWAVGAILAELFTL

HPLFPGESELDEIYKICGVLGTPDYETWPDGMQLAAFRN

FIFPQFLPVNLSVLIPHASPEAIDLITRLCSWDPQKRPT

AEQALHHPFFRIGMSIPLSLGGHFQDNTCAAEVDTKFHS





KKACKAWNGEKESSLECFLGLSLGLKPSLGHLGAMGSQG




VGAVKQEVGSSPGCQSNPKQSLFQVLNSRAILPLFSSSP




NLNVVPVKSSLPSAYTVNSQVMWPTIAGPPAAAVTVSTL




QPSILGDFKIFGKSMGLASQYAGKEASPFS





119
The amino acid sequence of SEQ ID 385. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are boxed in bold.

MGEMGRGINNSSNNNNSNRPAWLQHYDLVGKIGEGTYGL

VFLARSKLPNNRGLRIAIKKFKQSKDGDGVSPTAIREIM

LLREFSHENVVKLVNVHINHVDMSLYLAFDYAEHDLYEI

IRHHREKLNHHNINQYTVKSLLWQLLNGLNYLHSNWIVH

RDLKPSNILVMGEGEEHGVVKIADFGLARIYQAPLKPLS

DNGVVVTIWYRAPELLLGAKHYTSAVDMWAVGCIFAELI

TLKPLFQGVEVKASPNPFQLDQLDKIFKVLGHPTIEKWP

TLMNLPHWSKNLQQIQQHKYDNAGLHIGPIPAKSPAYDL






LSKMLEYDPRKRITAAQALEHEYFRIDPQPGRNALVPSQ





PGEKAINYPPRLVDANTDFDGTIAPQPSQVSSGNAPSGS




IASAAVPAVRPLPQQMQLMGMQRMQNPGMAAFNLGAQAS




MSGLNHNNIALQRGSSQQQAHQQVRRKEPNSGFPNTGYP




PPPKSRRL





120
The amino acid sequence of SEQ ID 386. The conserved protein kinase family domain is underlined. The protein kinases ATP-binding region is in bold and the serine/threonine protein kinases active-site signature is in bold/italics.

MDKYEKLEKVGEGTYGKVYKARDKMTGQLVALKKTRLEM

DEEGVPPSSLREISLLQMLSQSIYVVRLLCVEHVTKKGK

PLLYLVFEYLDTDLKKFIDYRRSVNAGPLPQNVIQSFMY

QLLKGVAHCHSHGVLHRDLKPQNLLVDKSKGLLKVGDLG

LGRAFTVPLKCYTHEVVTLWYRAPEVLLGSTHYSTPVDI

WSVGCIFAEMVRRQPLFPGDCEIQQLLHIFTLLGTPTEE

MWPGVKRLRDWHEYPQWKPENLARAVPNLSPTGLDLISK

MLQCDPAKRISAKAAMNHPYFDDLDKSQF






121
The amino acid sequence of SEQ ID 387. The conserved protein kinase family domain is underlined. The protein kinases ATP-binding region is in bold and the serine/threonine protein kinases active-site signature is in bold/italics.

MDGYEKMDKVGEGTYGKVYMARDKKTGQLVALKKTRLEN

DGEGIPPTALREISLLQMLSQDIYIVRLLDVKHTENKLG

KPLLYLVFEYMESDLKKYIDSYRRSHTKMPPSMIKSFMY

QLCRGVAYCHSRG DKEKGVLKIADLG

LSRAFTVPVKKYTHEIVTLWYRAPEVLLGATHYSLPVDI

WSVGCIFAEMSRMQALFTGDSEVQQLMNIFRFLGTPNEE

VWPGVTKLKDWHIYPEWKPQDISHAVPDLEPSGLDLLSQ

MLVYEPSKRISAKKALEHPYFDDLDKSQF






122
The amino acid sequence of SEQ ID 388. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are in bold.

MDAYEKLEKVGEGTYGKVYKAKDKNTGQLVALKKTRLES

DDEGIPPTALREISLLQMLSQDIHIVRLLDVEHTENKNG

KPLLYLVFEYMDSDLKKYIDGYRRSHTKVPPNIIKSFMY

QLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVVKIADLG

LGRAFTIPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDI

WSVGCIFAEMVRLQALFIGDSEVQQLFKIFSFLGTPNEE

IWPGVTKFRDWHIYPQWKPQDISSAVPDLEPSGVDLLSK

MLVYEPSKRISAKKALEHPYFDDLDKSQF






123
The amino acid sequence of SEQ ID 389. The conserved protein kinase family domain is underlined. The protein kinases ATP-binding region is in bold and the serine/threonine protein kinases active-site signature is in bold/italics.

MDSYEKLEKVGEGTYGKVYKAKDKKTGKLVALKKTRLEN

DGEGIPPTALREISLLQMLSQDMNIVRLLDVEHTENKNG

KPLLYLVFEYMDSDLKKYVDGYRRSHTKMPPKIIKSFMY

QLCQGVAYCHSRG DKQRGVLKIADLG

LGRAFTVPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDI

WSVGCIFAEMSRMHALFCGDSEVQQLMSIFKFLGTPNEG

VWPGVTKLKDWHIYPEWRPQDLSRAVPDLEPSGVDLLTK

MLVYEPSKRISAKKALQHPYFDDLDKSQF






124
The amino acid sequence of SEQ ID 390. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are in bold.

MEKYEKLEKVGEGTYGKVYKGRDKRTGRLVALKKTPFHQ

EEGIPPTAIREISLLKSLSQCIYIVKLLDVKASFNGKGK

HVLFMVFEYADSDLKKHIDAHRQCNTKLSPRSIQSYMFQ

LCKGIAYCHSHGVLHRDLKPQNILVDQKIGLLKIADLGL

GRACTVPIKSYTFEVVTLWYRAPEVLLGAKRYSMALDIW

SLGCIFAELCNLQALFAGDSQIQQLINIFRLLGTPNEQL

WPGVTQLSDWHEFPQWRPQDLSKVVFNLDPNGVDLLSKM

LQYDPAKRISAKEALDHPYFDSLDKSQF






125
The amino acid sequence of SEQ ID 391. The conserved eukaryotic protein kinase domain is underlined and the serine/threonine protein kinases active-site signatures are in bold.

MGCVCGKPSARAADYVESPAEKGASSNSRSSSMASRRLV

APAVMDQGIDAENGHEGDYRTKLRGKQSNGADPVSLLSD

DAEKQRHSRHHQHQQHHPIRPHHLRPQGEFVPNANSNPR

FGNPPRHIEGEQVAAGWPAWLTAVAGEAIKGWIPRRADS

FEKLDKIGQGTYSNVYKARDLDTGKIVALKKVRFDNLEP

ESVRFMAREIQVLRRLDHPNVVKLEGLVTSRMSCSLYLV

FEYMDHDLAGLAACPGIKFTEPQVKCYMQQLLRGLDHCH






SRGVLHRDIKGSNLLIDNGGILKIADFGLATFFHPDQRQ






PLTSRVVTLWYRPPELLLGATEYGVAVDLWSTGCILAEL






LAGKPIMPGRTEVEQLHKIFKLCGSPSEDYWKKSKLPHA






TIFKPQQPYKRCVAETFKDFPPSALALMEVLLAIEPADR






GTATSALKSDFFTTKPLACDPSSLPKYPPSKEFDAKIRD





EEARRQRAAGGRGRDAARRPSRESRAIPAPEANAELAIS




IQKRRLSSQGPSKSKSEKFNPQQEDGAVGFPIEPPRPMH




IGIDAGATSRMYSQQFGPSHSGPLSNQISSSIWGKNQKE




DEIQMAPGRPSRSSKATISDFRKPGACAPQPGADLSHLS




SLVATARSNAGIDTHKDRSGMWQHNRIDAIDGVHNNGKH




EFLEVPEHPNRQDWTRFQQPESFKGLDNYHLQDLPATHH




RKDERVASKEATMNWQGYGGQGGDKIHYSGPLLPPSGNI




DEILKEHERHIQHAVRRARQDKGRPQRSNLSQNERKAFE




HRSFVSGVNGNAGYSDLVNELPISVGSNRLKVSKTRGTE




EIVELRELEREPLSSVMEKYEREHEM





126
The amino acid sequence of SEQ ID 392. The conserved eukaryotic protein kinase domain is underlined and the serine/threonine protein kinases active-site signatures are in bold.

MGCVCAKQSDILGEPESPKVKGSNLASSRWSVSSETKQL

PQHSDSGILHHQHYYHPRDESDEAKLKESNYGGSKRRTR

QGRDPADLDMGIFVRTPSSQSEAELVAAGWPAWMAAFAG

EAIHGWIPRRAESFEKLYKIGQGTYSNVYKARDLDNGKI

VALKKVRFDSLDAESVRFMAREILVLRKLDHPNIVKLEG

LVTSEVSSSLYLVFEYMEHDLAGLAACPGIKFTEPQVKC






YMQQLLQGLDHCHRHGVLHRDIKGSNLLIDNGGILKIAD






FGLATFFYPDQKQLLTSRVVTLWYRPPELLLGATDYGVA






VDIWSAGCILAELLAGKPILPGRTEVEQLHKIFKLCGSP






SEDYWKESKLPHATIFKPQHPYKSCIAEAFKDFSPSALA






LLETLLAIEPGHRGEASGALKSEFFTTEPLSCDPSSLPK





YPPSKEFDAKLRAQETRRQRDVGVRGHGSEAARRTSRLS




RAGPTPNEGAELTALTQKQHSTSHATSNIGSEKPSTKKE




DYTAGLHIDPPRPVNHSYETTGVSRAYDAIRGVAYSGPL




SQTHVSGSTSGKKPKRDHVKGLSGQSSLQPSKPFIVSDS




RSERIYEKSHVTDLSNHSRLAVGRNRDTTDPHKSLSTLM




QQIQDGTLDGIDIGTHEYARAPVSSTKQKSAQLQRPSAL




KYVDNVQLQNTRVGSRQSDERPANKESDMVSHRQGQRIH




CSGPLLHPSANIEDLLQKHEQQIQQAVRRAHHGKREALS




NKSSLPGKKPVDHRAWVSSGKGNKESPYFKGKGNKELSD




LKGGPTAKVTNFRQKVM





127
The amino acid sequence of SEQ ID 393. The conserved protein kinase family domain is underlined. The protein kinases ATP-binding region is in bold and the serine/threonine protein kinases active-site signature is in bold/italics.

MAVANPGQLNLQEAPSWGSRSVNCFEKLEQIGEGTYGQV

YMAKEIETGEIVALKKIRMDNEREGFPITAIREIKLLKK

LQHENVIKLKEIVTSPGPEKDEQGKSDGNKYNGSIYMVF

EYMDHDLTGLAERPGMRFSVPQIKCYMKQLLIGLHYCHI

NQ

DNNGILKLADFGLARSFCSDQNGN

LTNRVITLWYRPPELLLGSTKYGPAVDMWSVGCIFAELL

YGKPILPGKNEPEQLTKIFELCGSPDESNWPGVSKLPWY

SNFKPQRQMKRRVRESFKNFDRHALDLVEKMLTLDPSQR






ISAKDALDAEYFWTDPVPCAPSSLPRYEPSHDFQTKRKR





QQQRQHDEMTKRQKISQHPPQQHVRLPPIQNAGQGHLPL




RPGPNPTMHNPPPQFPVGPSHYTGGPRGAGGQNRHPQNI




RPLHAAQGGGYNANRGYGGPPQQQGGGYPPHGMGNQGPR




GGQFGGRGAGYSQGGPYGGPVGGRGPNVGGGNRGPQFWS




EQ





128
The amino acid sequence of SEQ ID 394. The conserved eukaryotic protein kinase domain is underlined and the serine/threonine protein kinases active-site signature is in bold.

MQNMEDNVQSSWSLHGNKEICARYEILERVGSGTYSDVY

RGRRKADGLIVALKEVHDYQSSWREIEALQRLCGCPNVV

RLYEWFWRENEDAVLVLEFLPSDLYSVIKSGKNKGENGI

PEAEVKAWMIQILQGLADCHANWVIHRDLKPSNLLISAD

GILKLADFGQARILEEPEAIYEVEYELPQEDIVADAPGE

RLMEEDDSVKGVRNEGEEDSSTAVETNFGDMAETANLDL






SWKNEGDMVMQGFTSGVGTRWYRAPELLYGATIYGKEID






LWSLGCILGELLILEPLFSGTSDIDQLSRLVKVLGTPTE






ENWPGCSNLPDYRKLCFPGDGSPVGLKNHVPSCSDSVFS






ILERLVCYDPAARLNAKEVLENKYFVEDPYPVLTHELRV





PSPLREENNFSEDWAKWKDMEADSDLENIDEFNVVHSSD




GFCIKFS





129
The amino acid sequence of SEQ ID 395. The conserved eukaryotic protein kinase domain is underlined and the protein kinases ATP-binding region and serine/threonine protein kinases active-site signatures are in bold.

MDLNQYPEDLNPELPEGTDNVDNPDNNKGSPVPSPHPPL

KPLDPSERYRKGITLGQGTYGIVYKAFDTVTNKTVAVKK

IHLGKAKEGVNVTALREIKLLKELSHPNIIQLIDAYPHK

QNLHIVFEFMETDLEAVIKDRNLVFSPADIKSYLQMTLK

GLAVCHKKWVLHRDMKPNNLLIAADGQLKLGDFGLARLF

GSPDRKFTHQVFAVWYRAPELLFGAKQYGPAVDIWATGC

IFAELLLRKPFLQGVSDLDQIGKIFAAFGTPRQSQWPDV

ASLPDFVEFQFVPAPSLRSLFPMASEDALDLLSKMFTLD






PKNRITAQQALEHRYFSSVPAPTRPDLLPKPSKVDSSRP





PKHASPDGPVVLSPSKARRVMLFPNNLAGILPKQVSQST




TGGTPIEFDMPTQKLREVCPRSRITESGKKHLKRKTMDM




SAALDECAREQEGQEGKTILDPDHQRSAKKEKHM





130
The amino acid sequence of SEQ ID 396. The conserved cyclin N- and C-terminal family domains are underlined.

MAGGQENCVRITRARAACVSKASAPVIQSQVDEKKSRKR

APKRAAVDDLAANASGSQPKRRAVLGDVTNLHAAATDCL

STAEDQVDAPNPSIKGRARNKKKEARTSTKVVKDEIHPE

SNPLADHSSNLSECQKPPAAKLAEQRSLRGVPSKAKQGG




SSNSQSCSKHTDIDKDHTDPQMCTTYVEDIYEYLRNAEL





KNRPSANFMETAQNDITPNMRAILVDWLVEVSEEYKLVP






DTLYLTVSYIDRYLSANPTSRHKLQLLGVSCMLIASKYE






EVCPPHVEEFCYITDNTYTRDEMLSMERKILIFLNFEMT





KPTTKSFLRRFVRASQAGNKAPSLHMEFLANYLAELTLM





ECSFLQYLPSLIAASTVFLSRLTLDFLTNPWNPTLAHYT






GYKASQLKDCVMAIYNVQMNRKGSTLVAIREKYQQHKFK






CVASLPPPPFIAERFFDTPN






131
The amino acid sequence of SEQ ID 397. The conserved cyclin and cyclin C-terminal domains are underlined and the cyclins signature is in bold.

MTGTQASNVRITRARAAKSTLNNALPPLPPAQGKPRGKR

AATESNISGFSVAAEPLKRRAVLSDVSNICKEAAAVDCL

KKPKAVKVVSQNANAKGRGRGIPRNNKKITQEAEIKKET

SPAICNVDDASAGNAIGDDKQNNNVNPLKEVQDNPKELN

PIAEQISVHPHCKQSVEKPNEKEIVVSDNKAAIASLKQQ




STLQSLRIPKQPKYSLKQGNPVPLANLHEDVGRSSCSDF




IDIDSEYKDPQMCTAYVTDIYANMRVVELKRRPLPNFME





TTQRDINANMRSVLIDWLVEVSEEYKLVPDTLYLTVSYI







DRFLSANVVNRQRLQLLGVSCMLVASKYEEICAPPVEEF







CYITDNTYKKEEVLEMEISVLNRLQYDLTTPTTKTFLRR






FIRAAQASCKVSSLHLEFMGNYLAELTLVEYDFLKYLPS






LIAAAAVFVARMTLDPMVHPWNSTLQHYTGYKVSDMRDC






ICAIHDLQLNRKGCTLAAIREKYNQPKFKCVANLFPPPI






ISPQFLIDNEV






132
The amino acid sequence of SEQ ID 398. The conserved cyclin and cyclin C-terminal domains are underlined and the cyclins signature is in bold.

MAAPNQNALLINNNNRRPLVDIGNLVGALNAQCNISKNG

ARKRAFGDIGNLVEDLDAKCTISKYWVRKRPRTNFGVNA

NKGASSSTQGQGIVVRGEQKAWDRIVWGNKQSCAIKMNA

QHVTATQRGTAISISDIIDSSVQDGGIKAPSQLKARKQT

VRTVTATLTARSEDSLRDVLEVPPGIDDGDRDNPLAVVE




YVEDIYHFYRKIEVRSCVPPDYMTRQLEIKDSMRGVIID






WLIEVHRTFLLMPETLYLTVNIIDRYLSIQSVTRNELQL







MGITAMFIASKYEEISPPKINDLVYITKDAYTSKQIVNM






EHTILNRLKFKLTVPTPYVFLVRFLKAAGPDKVMKNLAF






FLVDLCLLHYKMIKYSPSMLAAAAVYTAQCTLKKHPYWN






KTLILHIGYSEAHLRECAHLMADLHLKAEGSNLKSVYKK






YSYPIFGSVAFLSPAKIPAGTVAAPAIDKCAHQIYLRNLR






133
The amino acid sequence of SEQ ID 399. The conserved cyclin N- and C-terminal family domains are underlined.

MFPNKQTQGLVQNKKMASKAAQPKAMVPPQRVPPAANNR

RALGDIGNIVADVGGKCNVTKDGVNGKPLAQVSRPITRS

FGAQLLAQAAANKGISAANNQTQVPVVIPKADVRGNKQR

RTSKSKDIPPTTVVTNESDDCVIIEQAQRIKPTCNHNVG




AVGNKEKPQLLTAKPKSLTASLTSRSAVALRGFRFDDEM




TEAEEDPLPNIDVGDRDNQLAVVEYVEDIYKFYRRTEQM





SCVPDYMPRQQEINPKMRAVLINWLIEVHYRFGLMPETL






YLTTNLIDRYLATQLVSRSNYQLVGATAMLLASKYEEIW






APEMNDFLDILENKFERKHVLVMEKAMLNKLKFHLTVPT






PYVFLVRFLKAAASDEEMENLVFFLMELSLMQYVMIKFP






PSMLAAAAVYTAQITLKKTTVWNDVLKRHTGYSEIDLKE






CTRLMVAFHQSSEESKLNVVFKKYSMPEYDSVALIKPAK






LPA






134
The amino acid sequence of SEQ ID 400. The conserved cyclin and cyclin C-terminal domains are underlined.

MAPSFDCVANAYIESCEDQEKLRQNAQILAQSGENDVDE

PVSMLVQRETHYMLPEDYLQRLRNRTLDVNVRREAVGWI

LKVHSFYNFSAPTAYLAVNYLDRFLSRHRMPQGVKAWMI

QLMAVACLSLAAKMEETQVPLPSDLQREDARFIFDARTI






QRMELLILSTLQWGMRSITPFSFIDYFAYRAVQGHGHGH






DATPKAVMSRAIELILSTTEEIDFMEYRPSAIAAAALLC






AAEEVVPLQAVHYKRALSSSITDVDKDKMFGCYNLIQET






IIEGGCYWTPMSLQSTEKTPVGVLDAAACLSNTPTSSYS





VKPYASVTAAKRRKLNEICSALLVSQAHPC





135
The amino acid sequence of SEQ ID 401. The conserved cyclin and cyclin C-terminal domains are underlined.

MAANFWTSSHCKELLDAEKVGIVHPLDKDQGLTQEDVKI

IKINMSNCIRTLAQYVKLRQRVVATAITYCRRVYTRKSF

TEYDPQLVAPTCLYLASKAEESTVQAKLVIFYMKKYSKH

RYEIKDMLEMEMKLLEALDYYLVIYHPYRPLIQFLQDAG






LNDLKVTAWALVNDTYRTDLILTYPPYMIALACIYFACI






MEEKDAQAWFEELRVDMNEIKNISMEIVDYYDNYRVIPD





EKMNSALNKLPHRF





136
The amino acid sequence of SEQ ID 402. The conserved cyclin domain is underlined.

MAPALSSSYECLSHLLCAEDASNVVGCWDEDESKIFCEE

EEGFGIQHFPDFPVPDDDEIRVLVRKESQYMPGKSYVQS

YQNLGLDFTARQNAIGWILKVHGSYNFGPLTAYLSINYL






DRFLSRNPLPKAKVWMLQLLSVACLSLAAKMEETQVPLL






LDLQAEEPDFLFEPRTIQRMELLVLSTLEWRMLSVTPFS





FVDYFLQGGGGRKPPPRAMVARANELIFNTHTVLDFLEH




RPSAIAAAAVICAAEEVLPLEAAQYKETILSCSLVDKEW




VFGSYNLIQEVLIEKFSTPKKAKSASSSIPQSPVGVLDA




FCLSNNSNNTSLEASLSVNLYASVAAKRRKLNDYCNTWR




MFQHSTC





137
The amino acid sequence of SEQ ID 403. The conserved cyclin domain is underlined.

MAPNCIDCAPSDLFCAEDAFGVVEWGDAETGSLYGDEDQ

LHYNLDICDQHDEHLWDDGELVAFAEKETLYVPNPVEKN

SAEAKARQDAVDWILKVHAHYGFGPVTAVLSINYLDRFL






SANQLQQDKPWMTQLAAVACLSLAAKMDETEVPLLLDFQ






VEEAKYIFESRTIQRMELLVLSTLEWRMSPVTPLSYIDH





ASRMIGLENHHCWIFTMRCKEILLNTLRDAKFLGLLPSV




VAAAIMLHVIKETELVNPCEYENRLLSAMKVNKDMCERC




IGLLIAPESSSLGSFSLGLKRKSSTINIPVPGSPDGVLD




ATFSCSSSSCGSGQSTPGSYDSNNSSILCISPAVIKKRK




LNYEFCSDLHCLED





138
The amino acid sequence of SEQ ID 404. The conserved cyclin-dependent kinases regulatory subunit domain is underlined and the cyclin-dependent kinases regulatory subunits signature 1 is in bold.

MPQIQYSEKYTDDTYEYRHVVLPPETAKLLPKNRLLNEN

EWRAIGVQQSRGWVHYAIHRPEPHIMLFRRPLNYQQNQQ

QQAGAQSQPMGLKAQ





139
The amino acid sequence of SEQ ID 405. The conserved cyclin-dependent kinases regulatory subunit domain is underlined and the cyclin-dependent kinases regulatory subunits signature 1 is in bold.

MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTEN

EWRGIGVQQSRGWVHYAIHCSEPHIMLFRRPLNYEQNHQ

HPEPHIMLFRRPLNCQPNHQPQAHHPT





140
The amino acid sequence of SEQ ID 406. The conserved cyclin-dependent kinases regulatory subunit domain is underlined and the cyclin-dependent kinases regulatory subunits signature 1 is in bold.

MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTEN

EWRGIGVQQSRGWVHYAIHCSEPHIMLFRRPLNYEQNHQ

HPEPHIMLFRRPLNCQPNHQPQAHHPT





141
The amino acid sequence of SEQ ID 407. The conserved cyclin-dependent kinases regulatory subunit domain is underlined and the cyclin-dependent kinases regulatory subunits signature 1 is in bold.

MPQIQYSEKYYDDTYEYRHVVLPPDVARLLPKNRLLNEN

EWRGIGVQQSRGWVHYAIHRPEPHIMLFRRHLNYQQNQQ

QQAQQQPAQAMGLQA





142
The amino acid sequence of SEQ ID 408. The conserved GCN5-related N-acetyltransferase family domain is underlined and the radical SAM family domain is in bold.

MALVETEPVTLIHPEEPKKFKKKPTPGRGGVISHGLTEE

EARVKAIAEIVGAMVEGCRKGEDVDLNALKAAACRRYGL

SRAPKLVEMIAALPDGERAAVLPKLKAKPVRTASGIAVV

AVMSKPHRCPHIATTGNICVYCPGGPDSDFEYSTQSYTG

YEPTSMRAIRARYNPYVQTRSRIDQLKRLGHTVDKVEFI






LMGGTFMSLPADYRDYFIRNLHDALSGHTSSNVEEAVCY






SEHSATKCIGLTIETRPDYCLGPHLRQMLSYGCTRLEIG






VQSTYEDVARDTNRGHTVAAVADCFCLAKDAGFKVVAHM






MPDLPNVGVERDMESFREFFENPAFRADGLKIYPTLVIR





GTGLYELWKTGRYRNYPPEQLVDIIARVLALVPPWTRVY




RVQRDIPMPLVTSGVEKGNLRELALARMDDLGLKCRDVR




TREAGIQDIHHKIRPEVVELVRRDYCANEGWETFLSYED





TRQDILVGLLRLRKCGHNTTCPELKGRCSIVRELHVYGT






AVPVHGRDADKLQHQGYGTLLMEQAERIAWKEHRSIKIA






VISGVGTRHYYRKLGYELEGPYMMKYLN






143
The amino acid sequence of SEQ ID 409. The conserved chromo domain is underlined and the MOZ/SAS-like protein domain is in bold.

MLGFRDLYTSICEHLQRASGRLPIIAAATSLISTPEIAA

VEKENKAPNSVDKMGMGSADESGRFSTSNGQFMNMNNGV

VKEEWKGGVPVVPSAPTTVPVITNVKLETPSSPDHDMAR

KRKLGFLPLEVGTRVLCKWRDGKFHPVKIIERRKLPNGA





TNDYEYYVHYTEFNRRLDEWVKLEQLELDSVETDADEKV






DDKAGSLKMTRHQKRKIDETHVEGNEELDAASLREHEEF





TKVKNITKIELGRYEIETWYFSPFPSEYNNCEKLYFCEF




CLNFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEV





DGKKNKVYAQNLCYLAKLFLDHKTLYYDVDLFLFYILCE






CDERGCHMVGYFSKEKHSEESYNLACILTLPPYQRKGYG






KFLISFSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRV






LLDILKKHKSNISIKELSDMTAIKADDVLSTLQGLDLIQ






YRKGQHAICADPKVLDRHLKAVGRGGLEVDVCKLIWTPY





KEQ





144
The amino acid sequence of SEQ ID 410. The conserved MOZ/SAS-like protein domain is underlined.

MGSLDESTCSEEIRDEGKDSIRTKFKVESTVNNAQNGGN

DNSKKKRAAGLPLEVGIRLLCKWRDSKLHPVKIIERRKL

PNGFPQDYEYYVHYTEFNRRLDEWVKLEQFELDSVETDA




DEKIEDKGGSLKMTRHQKRKIDEIHVEEGQGHEDFDPAS




LREHEEFTKVKNIAKVELGRYEIETWYFSPFPPEYSHCE




KLFFCEFCLNFMKRKEQLQRHMRKCDLKHPPGDEIYRNG





TLSMFEVDGKKNKIYGQNLCYLAKLFLDHKTLYYDVDLF






LFYVLCECDDRGCHVVGYFSKEKHSDEAYNLACILTLPP






YQRKGYGKFLIAFSYELSKKEGKVGTPERPLSDLGLLSY






RGYWTRILLDILKKQRGNISIKELSDMTAIKVEDVISTL






QVLDLIQYRKGQHVICADPKVLDRHLKAAGIAGLEVDVS





KLIWTPYKEQCG





145
The amino acid sequence of SEQ ID 411. The conserved bromo family domain is underlined.

MASAPMVGCDDSRDKHRWVESKVYMRKGHGKGSKGNAGF

NAQNSTAQVRRENDNMGNSIADNGKSEAASEGLSSLSRK

QITVNQDHPPNETSSMPAVGGLQNIDTHVTFKLEGCSKQ




EIWELRKKLTNELEQVRGTFKKLEARELQLRGYSVSAGV




NTSYSASQFSGNDMRNNGGKEVTSEVASGGAITPKQAQR




ESNPPRQLSISLMENNQAASDMGEKGKRTPKANQYYRNS




EFVLGKDKFPPAESKKSKSTGNKKISQSKVFSKETMQVG




KEFMPQKSVNEVFKQCSLLLTKLMKHKYGWVFNLPVDAQ





ALGLHDYHTIIKRPMDLGTVKSKLEKNLYNSPASFAEDV






KLTFSNAMTYNPKGHEVHTMAEQLLQLFEERWKTIYEEH





LDGKMRFGSGQGLGASSSTKKLPFQDSKKNIKKSEPAGG




PSPPKPKSTNHHASRTPSAKKPKAKDPHKRDMTYEEKQK




LSTNLQNLPQERLELIVQIIKKRNPSLCQHDEEIEVDID




SFDTETLWELDRFVTNYKKSLSKNKKKALLADQAKRASE




HGSARNKHPMIGRELPMNNKKGEQGEKVVEIDHMPPVNP




PVVEVEKDGVYAKRSSSSSSSSSDSGSSSSDSDSGSSSG




SESDAYAATSPPAGSNTSARG





146
The amino acid sequence of SEQ ID 412. The conserved GCN5-related N-acetyltransferase family domain is underlined and the bromodomain is in bold.

MEGHSGALGFGQGFSRSSQSPNLSPSPSHSASASVTSSG

QKRKRNEVEHAGVASNSTGMFAVPPSHIYSHLHPMSMSM

PMPMHNSHPSSLSESRDGALTSNDDDDNLTGGNQSQLDS

MSAGNTDGREDFDDEDDDDDDEEDDDEVEGDEEDQDHDP

DADDDSDDGHDSMRTFTAARLDNGAPNSRNLKPKADAAG




VAIAPTVKTEPILDTVKEEKVSGNNNNNSVSANNAQVAP




SGSAVLLSAVKEEANKPTSTDHIQTSGAYCAREESLKRE




EDADRLKFVCFGNDGIDQHMIWLIGLKNIFARQLPNMPK




EYIVRLVMDRSHKSVMIIKQNQVVGGITYRPYLSQKFGE





IAFCAITADEQVKGYGTRLMNHLKQHARDVDGLTHFLTY






ADNNAVGYFIKQDFTKEIKLEKERWHGYIKDYDGGILME





CKIDPKLPYTDLPAMIRWQRQTIDEKIRELSNCHIVYSG




IDIQKKEAGIPRKPIKVEDIPGLKEAGWTTDQWGHSRFR




LLNSPSEGLPNRQVLHAFMRSLHKAMVEHADAWPFKEPV





DPRDVPDYYDIIKDPMDVKRMFTNARTYNTHETIYYKCA






NR






147
The amino acid sequence of SEQ ID 413. The conserved histone deacetylase family domain is underlined.

MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMK

PHRIRMAHSLIVHYALDEKMEVCRPNLLQSRELRVFHAD

DYISFLQSVTPETQHEQLRQLKRFNVGEDCPVFDGLYNF

CQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEAS






GFCYVNDIVLAILELLKVHQRVLYIDIDIHHGDGVEEAF






YSTDRVMSVSFHKFGDYFPGTGHLKDVGYGKGKYYSLNV






PLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCGADS






LSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYT






IRNVARCWCYETAVAVGVEPQDKLPYNEYYEYFGPDYTL





HVAPSNMENQNSAKELAKIRNTLLEQLKRIQHVPSVPFQ




ERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQK




PQNRDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGI




VNENDGAKWPLGEAG





148
The amino acid sequence of SEQ ID 414. The conserved histone deacetylase domain is underlined.

MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMK

PHRIRMAHSLIVHYALDEKMEVCRPNLLQSRELRVFHAD

DYISFLQSVTPETQHEQLRQLKRFNVGEDCPVFDGLYNF






CQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEAS






GFCYVNDIVLAILELLKVHQRVYIDIDIHHGDGVEEAF






YSTDRVMSVSFHKFGDYFPGTGHLKDVGYGKGKYYSLNV






PLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCGADS






LSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYT






IRNVARCWCYETAVGVEVEPQDKLPYNEYYEYFGPDYTL





HVAPSNMENQNSAKELAKIRNTLLEQLKRIQHVPSVPFQ




ERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQK




PQNRDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGI




VNENDGAKWPLGEAG





149
The amino acid sequence of SEQ ID 416. The conserved histone deacetylase family domain is underlined.

MMETGGNSLPSGPDGVKRKVAYFYDPEVGNYYYGQGHPM

KPHRIRMTHALLVQYGLHKEMQILKPYPARDRDLCRFHA

DDYVAFLRGITPETIQDQVKALKRFNVGDDCPVFDGLYQ

YCQTYAGGSVGGAVKLNHKLCDIAINWAGGLHHAKKCEA






SGFCYVNDIVLAILELLKYHKRVLYVDIDIHHGDGVEEA






FYTTDRVMTVSFHKFGDYFPGTGDIRDIGCGKGKYYAVN






VPLDDGIDDESFQSLFKPIIQQVMLVYNPEAIVLQCGAD






SLSGDRLGCFNLSVKGHAECVRYMRSFNVPLLMVGGGGY






TVRNVARCWCYETGVAVGVEIDDKMPQHEYYEYFGPDYT





VHVAPSNMENKNTKQYLDKIRSKILENINSLPCAPSAQF




QVQPPDTDFPELEEEDYDERTRSHKWDGASCDSDSENGD




LKHRNHDVEESAFPRHNLANISYNTKIKLEGVGTGGLDM




AAGTDTKKNDESFEAMDYESGEELRQDHFASTINASQPC




DPALLTGVQNQLQSTDTVKPIEQSGNAPGIPPPSVATVS




TGTRPSSISRTSSLNSMSSVKQGSILGPNPPQGLNASGL




QFPVPTSNSPIRQGGSYSITVQAPDKQGLQNHMKGPQNM




PGNS





150
The amino acid sequence of SEQ ID
MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVL



417. The conserved histone

SYELHKKMEIYRPHKAYPVELAQFHSADYVEFLHRITPD




deacetylase family domain is

TQHLFTKELVKYNMGEDCPVFENLFEFCQIYAGGTIDAA




underlined.

HRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGI






LELLKHHARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFH






KYGDMFFPGTGDVKEVGEREGKYYAINVPLKDGIDDASF






TRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGCFNL






SIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVE






TGVLLDTELPNEIPDNDYIKYFAPDYSLKINTAGNMENL





NSKTYLSAIKVQVMENLRAIQHAPSVQMHEVPPDFYIPD




IDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDME




EAS





151
The amino acid sequence of SEQ ID
MDSSKSEEANILHVFWHEGMLNHDLGTGVFDTLEDPGFL



418. The conserved histone

EVLEKHPENADRVRNMLSILRKGPIAPYTEWHTGRAAYL




deacetylase family domain is

SELYSFHRPDYVDMLAKTSTAGGKTLCHGTRLNPGSWEA




underlined.

ALLAAGTTLEAMRYILDGHGKLSYALVRPPGHHAQPTQA






DGYCFLNNAGLAVELAVASGCKRVAVVDIDVHYGNGTAE






GFYERDDVLTISLHMNHGSWGPSHPQTGFHDEVGRGKGL






GFNLNVPLPNGTGDKGYEHAMHELVVPAISKFMPEMIVL






VIGQDSSAFDPNGRECLTMEGYRKIGQIMRQQADQFSGG






RLVVVQEGGYHITYAAYCLHATLEGVLCLPHPLLSDPIA





YYPEHDIYSERVTFIKNYWQGGIISTTDKRN





152
The amino acid sequence of SEQ ID
MEESGNALVSGPDGSKRRVTYFYDADIGNYYYGQGHPMK



419. The conserved histone

PHRMRMAHNLIVHYGLHQRMEVCRPHLAQSKDIRAFHTD




deacetylase family domain is

DYIHFLSSVAPDTQQEQLRQLKRFNVGEDCPVFDGLFNF




underlined.

CQSSAGGSIGAALKLNRKDADIAINWAGGLHHAKKCEAS






GFCYVNDIVLGILELLKVHQRVLYIDIDIHHGDGVEEAF






YTTDRVMTVSFHKFGDYFPGTGHIKDVGYGKGKYYALNV






PLNDGIDDESYKHLFRPIIQKVMEVYQPEAVVLQCGADS






LSGDRLGCFNLSVKGHADCVRFVRSFNIPLMLVGGGGYT






IRNVARCWCYETAVAVGVEPQDKLPYNEYYEYFGPDYTL





YVAPSNMENLNTEKDLEKMRNVLLEQLSKIQHTPSVPFQ




ERPPDTEFNDEEEEDMEKRSKCRIWDGEYVGSEPEEDGK




LPRFDADTYERSVLKHENKRLVPVSNVEPLKRIKQEEDG




AAV





153
The amino acid sequence of SEQ ID
MDLNLVSHGEEEEGVRRRKVGIVYDERMCKHATPEDQPH



421. The conserved histone

PEQPDRIRVIWDKLNSAGVLHKCVMVEAKEASEEQLAGV




deacetylase family domain is

HSRKHIEVMKSIGTARYNKKKRDKLAASYSSIYFSQGSS




underlined.

EAALLAAGSVVEISEKVASGELDAGVAIVRPPGHHAEAD






KAMGFCLFNNIAIAAKHLVHERPELGVQKVLIVDWDVHH






GNGTQHMFWTDPHVLYFSVHRFDAGTFYPGGDDGFYDKI






GEGKGAGYNINVPWEQGKCGDADYLAVWDHVLVPVAKSY






DPDMVLISGGFDAALGDPLGGCRLTPYGYSLMTKKLMEF






AGGKIVLALEGGYNLKSLADSFLACVEALLKDGPGRSSV





LTHPFGSTWRVIQAVRKELSSFWPALNEELQLPRLLKDA




SESFDKLSSSSSDESSASEDEKKFAEVTSIMEVSPDPSS




ILALTAEDIAQPLAGLKIEEAGTDSQRSSDHTLLDLTND




DTQKLKQFEGEIFVMIGDEESVPSASSSKDQNESTVVLS




KSNIKAHSWRLTFSSIYVWYASYGSNMWNPRFLCYIEGG




QVEGMAKRCCGSEDKLLLKGYSGKLFLIECFLGDHTQIH




GVQEECPFLIQIVVIRVKRMSACIK





154
The amino acid sequence of SEQ ID
MADEDLDLSDVGEVEDEPGEEIESTPPLAVGQEKEINSL



422. The conserved FKBP-type
ALKKKLLKVGTRWETPENGDEVTVHYTGTLPDGTKFDSS



peptidyl-prolyl cis-trans

RDRGEPFTFKLGQGQVIKGWDQGIVTMKKGERALFTIPP




isomerase signature is underlined


ELAYGSSGVRPTIPPNATLQFDVELLSWTNIVDVCNDGG




and the FKBP-type peptidyl-prolyl
ILKRIISEGEKYERPKDPDEVTVKYEAKLEDGTLVAKSP



cis-trans isomerase signatures 1

EEGVEFYVNDGHFCPAIAKAVKTMKRGESVILTIKPTYA




and 2 are in bold.

FGERGKDAEEGFAAIPPNATLTTSLELVSFKAVIAVTED





KKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEK





KGYEGEEPFQFVVDEEQVIAGLDKAVETMKTGEIALITI






GAEYGFGNFETQRDLAVIPPNSTLIYEVEMISFTKEKES





WDMDTTEKIEASKQKKEQGNSLFKVGKYQRAAKKYEKAA




KYIEHDSSFSAEEKKQSKVLKVSCNLNHAACRLKLKDFK




EAVKLCSKVLELESQNVKALYRRAQAYIETADLDLAEFD




IKKALEIEPQNREVQLEYKILKQKQIEYNKKDAKLYGNM




FAKLNKLEAFEGKVLS





155
The amino acid sequence of SEQ ID
MADEGLELSDVAEVEDEPGEEFESAPPLVVGQEKELNSS



423. The conserved FKBP-type
GLKKKLLKAGTRCETPENGDEVTVHYTGTLLDGTKFDSS



peptidyl-prolyl cis-trans

RDRGEPFTFNIGQGQVIKGWDQGIVTMKKREHALFTIPP




isomerase family domains are

ELAYGASGMPPTIPPNATLQFDVELLSWTNIVDVCKDGG




underlined. The FKBP-type
ILKRIISDGEKYERPKDPDEVTVKYEAKLEDGMLVAKSP



peptidyl-prolyl cis-trans

EEGVEFYVNDGNFCPAIVKAVKTMKKGENVTLTIKPAYA




isomerase signatures 1 and 2 are

FGEQGKDAEEGFAAIPPNATITINLQLVSFKAVKEVTED




in bold. The TPR repeat is in
KKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEK



bold/italics.

KGYAGEEPFQFVVDEEQVIAGLDKAVETMKTGEVALITI







GPEYGFGNIETQRDLAVIPPYSTLIYEVEMVSFTKEKES





WDMNTTENIEASKQKKEQGNSLFKVGKYLRAAKKYDKAA




KYIEHDNSFSAEEKKQSKVLKVSCNLNHAACCLKLKDFK




KAVKLCSKVLELESQN





REVRLEYLILKQKQIEYNKKDAKLYGNM





FARQNKLEAIEGKD





156
The amino acid sequence of SEQ ID 424. The conserved cyclophilin-type peptidyl-prolyl cis-trans isomerase signature is underlined and the cyclophilin-type peptidyl-prolyl cis-trans isomerase signature 2 is in bold.
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
GRTSKPVVIADSGQLA





157
The amino acid sequence of SEQ ID 425. The conserved cyclophilin-type peptidyl-prolyl cis-trans isomerase signature is underlined and the cyclophilin-type peptidyl-prolyl cis-trans isomerase signature 2 is in bold.
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
CTGEKGNGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
GRTSKPVVIADSGQLA





158
The amino acid sequence of SEQ ID 426. The conserved cyclophilin-type peptidyl-prolyl cis-trans isomerase signature is underlined and the cyclophilin-type peptidyl-prolyl cis-trans isomerase signature 2 is in bold.
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
GRTSKPVVIADSGQLA





159
The amino acid sequence of SEQ ID 427. The conserved cyclophilin-type peptidyl-prolyl cis-trans isomerase signature is underlined and the cyclophilin-type peptidyl-prolyl cis-trans isomerase signature 2 is in bold.
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
GRTSKPVVIADSGQLA





160
The amino acid sequence of SEQ ID
MADDFELPESAGMMENEDFGDTVFKVGEEKEIGKQGLKK



428. The conserved FKBP-type
LLVKEGGSWETPETGDEVEVHYTGTLLDGTKFDSSRDRG



peptidyl-prolyl cis-trans

TPFKFKLGQGQVIKGWDQGIATMKKGENAVFTIPPDLAY




isomerase signature is underlined

GESGSQPTIPPNATLKFDVELLSWASVKDICKDGGIFKK




and the FKBP-type peptidyl-prolyl
IIKEGEKWEHPKEADEVLVKYEARLEDGTVVSKSEEGVE



cis-trans isomerase signature 1 is

FYVKDGYFCPAFAIAVKTMKKGEKVLLTVKPQYGFGHQG




in bold and underlined. The TPR

REAIGNDVARSTNATLLVDLELVSWKVVDEVTDDKKVLK




repeat is in bold/italics.
KILKQGEGYERPNDGAVVKVKYTGKLEDGTIFEEKGSDE





EPFEFMAGEEQVVDGLDRAVMTMKKGEVALVSVAAEYGY






QTEIKTDLAVVPPKSTLIYEVELVSFVKEKESWDMNTAE





KIEAAGKKKEEGNALFKVGKYFRASKKYEKATKYIEYDT




SFSEEEKKQSKPLK













RDVKLEYRALKEKQKEYNKKEAKFYGNMFARMSKL





EELESRKSGSQKVETANKEEGSDAMAVDGESA





161
The amino acid sequence of SEQ ID
MAASLTPLGAGLAYATIYDQAKVRKLEPTKRSLIALCQH



429. The conserved FKBP-type
SDSQHRRFITRKYHVNVQILNRRDAIRLIGLAAGLCIDL



peptidylprolyl isomerase domain is
SLMYDARGAGLPPQENAKLCDTTCEKELENAPMITTESG



underlined.
LQYKDIKIGNGPSPPIGFQVAANYVAMVPSGQVFDSSLD





KGQPYIFRVGSGQVIKGLDEGLLSMKVGGKRRLYIPGPL






AFPKGLNSAPGRPRVAPSSPVIFDVSLEFIPGLESEEE






162
The amino acid sequence of SEQ ID
MSAASLSADMAIRGTILGKTALHVLGPQVVSQCRQPVMF



430. The conserved FKBP-type
KCPPHTLRKMRFSAQDLQSKNFYSGFTPFKSVFISTSKR



peptidylprolyl isomerase domain is
SWQAGSARAMSQDAAFQSKVTTKCFLDIEIGGDPAGRIV



underlined and the Cyclophilin-

LGLFGEDVPKTAENFRALCTGEKGFGYKGSSFHRIIKDF




type peptidyl-prolyl cis-trans


MLQGGDFDRGDGTGGKSIYGRTFEDENFKLAHVGPGVLS





isomerase signature is in bold.

MANAGPNTNGSQFFICTVKTPWLDKRHVVFGQVIEGMEI






VKKLESEETNRTDRPKRPCRIVDCGELP






163
The amino acid sequence of SEQ ID
MGRIKPQTLLQQSKKKKVPGRISVSTIIVCNLIIIFLMF



431. The conserved FKBP-type
SLVGIYRQRAKRNRATSRSDGDEEMENFGRSKINSVPHQ



peptidylprolyl isomerase domain is

AIVNTTKGLITLELFGKSSAHTVEKFVEWSERGYFNGLP




underlined.

FYRVIKHFVIQVGDPKFAGNREDWTVGGQLNVQLEFSPK






HEAFMLGTSKLEDQGDGFELFITTAPIPDLNDKLNVFGR






VIKGQDVVQEIEEVDTDEHFQPKSPIIINDVRLKDEL






164
The amino acid sequence of SEQ ID
MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFD



432. The conserved cyclophilin-

VDIAGKPAGRVVIGLFGKAVPKTVENFRALCTGEKGVGK




type peptidyl-prolyl cis-trans

SGKPLHYKGSFFHRIIPSFMIQGGDFTLGDGRGGESIYG




isomerase signature is underlined

TKFADENFKLKHTGPVFITTVTTDWLDGRHVVFGKIISG




and the cyclophilin-type peptidyl-

MDVVYKVEAEGRQSGQPKRKVKIADSGELSMD




prolyl cis-trans isomerase



signature is in bold.





165
The amino acid sequence of SEQ ID
MEMDEIQEQSQPQSSEKQDISQESDTGNDKTINAEKITS



434. The conserved FKBP-type
ENAEVEEDDMLPPKVNTEVEVLHDKVTKQIIKEGSGNKP



peptidyl-prolyl cis-trans

SRNSTCFLHYRAWAESTMHKFQDTWQEQQPLELVLGREK




isomerase signature is underlined

KELSGFAIGVAGMKAGERALLHVDWQLGYGEEGNFSFPN




and the TPR repeat is in bold.

VPPRANLIYEAELIGFEEAKEGKARSDMTVEERIEAADR





RRQQGNELFKEDKLAEAMQQYEMALAYMGDDFMFQLFGK




YKDMANAVKNPCHLNMAQCLLKLNRYEEAIGQCNMVLAE




DEKNIKALFRRGKARATLGQTDDAREDFQKVRKFSPEDK




AVIRELRLLAEHDKQVYQKQKEMFKGLFGQKPEQKPKKL




HWFVVFWQWLLSMIRTIFRMRSKTD





166
The amino acid sequence of SEQ ID
MAGAGEGTPEVTLETSMGPITVELYHKHAPKTCRNFLEL



435. The conserved cyclophilin-

SRRGYYNNVKFHRVIKDFMVQGGDPTGTGRGGESIYGPR




type peptidyl-prolyl cis-trans

FEDEITRDLKHTGAGILSMANAGPNTNGSQFFISLAPTP




isomerase signature is underlined

WLDEKHTIFGRVCKGMDVVKRLGNVQTDKNDRPIHDVKI




and the cyclophilin-type peptidyl-

LRTTVKD




prolyl cis-trans isomerase



signature is in bold.





167
The amino acid sequence of SEQ ID
MMDPELMRLAQEQMSKISPDELMKMQRQIMANPDLMRMA



436. The conserved TPR repeat
SENMKNLKPEDIRFAAEQMKNVRKEEMAEISERISRASP



domain is underlined.
EEIEAMKARANLQSAYQLQVAQNLKDQGNQLHARMKYSE




AAEKYLQARNNLTGIPFSEAKSLLLASSSNLMSCYLKTG




QYEECVQTGSEVLAYDAMNVKALYRRGQAYKQIGKLELA





VADLRKAVEVSPEDETIAQALREASTELMEKGGTQDQNG





PRIEEIIEEEAVQPTAEKYPQSAPMVTSVTEDVSDDEQG




SEDQNGFSRDSFQATNAPDGQMYAESLRNLTENPDMLRT




MQSLMKNVDPDSLVALSGGKLSPDMVKTVSGMFGRMSPE




EIQNMMKMSSTLSRQNPSTSSRFDDITRGHSNMDSSPQS




VSVDNDLFEENQNRVGESSTNLSSSAAFSGMPNFSAEMQ




EQVRNQMNDPATRQMFTSMIQNMSPEMMASMSEQFGVKL




SPEDAVKAQNAMASLSPNDLDRLMNWATRLQTAIDYARK




IKNWILGRPGLIFAISMLLLAIILHRFGYIGD





168
The amino acid sequence of SEQ ID
MGVEKEILRPGNGPKPRPGQSVTVHCTGYGKNEDLSQKF



437. The conserved FKBP-type

WSTKDPGQKPFTFTIGQGRVIKGWDEGVLDMQLGEIFKL




peptidylprolyl isomerase domain is


RCSPDYGYGSNGFPAWGIRPNSVLVFEIEVLSVN




underlined and the Cyclophilin-



type peptidyl-prolyl cis-trans



isomerase signature is in bold.





169
The amino acid sequence of SEQ ID
MPNPRCYLDITIGEELEGRILVELYSDVVPKTAENFRAL



438. The conserved cyclophilin-

CTGEKGIGPHTGVPLHYKGLPFHRVIKGFMIQGGDISAQ




type peptidyl-prolyl cis-trans

NGTGGESIYGLKFDDENFQLKHERRGMLSMANSGPNTNG




isomerase family domain is

SQFFITTTRTSHLDGKHVVFGKVIKGMGVVRGIEHTPTE




underlined and the cyclophilin-

SNDRPSLDVVISDCGEIPEGSDDGIANFFKDGDLYPDWP




type peptidyl-prolyl cis-trans
ADLDEKSAEISWWMNAVDSAKCFGNENYKKGDYKMALRK



isomerase signature is in bold.
YRKALRYLDICWEKEEIDEEKSNHLRKTKSQIFTNSSAC




KLKLGDLKGALLDTEFAMRDGEDNVKALFRQGQAYMALK




DVDSAVASFKKALQLEPNDAGIRKELAVATKMINDRRDQ




ERRAYARMFQ





170
The amino acid sequence of SEQ ID
MGDVIDLNGDGGVLKTIIRSAKPGAMQPTEDLPNVDVHY



439. The conserved FKBP-type

EGTLADTGEVFDTTREDNTLFSFELGKGTVIKAWDIAVK




peptidylprolyl isomerase domain is


TMKVGEVARITCKPEYAYGSAGSPPDIPENATLIFEVEL





underlined and the Cyclophilin-

VACKPRKGSTFGSVSDEKARLEELKKQREIAAASKEEEK




type peptidyl-prolyl cis-trans
KRREEAKATAAARVQAKLEAKKGQGRGKGKSKGK



isomerase signature is in bold.





171
The amino acid sequence of SEQ ID
MGLGLKIASASFLPIFNIMATRSLCILLVCFIPVLAHVL



440. The conserved cyclophilin-
SLQDPELGTVRVYFQTTYGDIEFGFFPHVAPKTVEHIYK



type peptidyl-prolyl cis-trans

LVRLGCYNSNHFFRVDKGFVAQVADVVGGREVPLNSEQR




isomerase signature is underlined.

KEGEKTIVGEFSEVKHVRGILSMGRYSDPDSASSSFSIL






LGNAPHLDGQYAVFGKVTKGDDTLKRLEEVPTRQEGIFV






MPLERIRILSTYYYDTNERESNLTCDHEVSILKRRLVES





AYEIEYQRRKCLP





172
The amino acid sequence of SEQ ID
MASKRSLRTMNVWPTLPPLVLLLLLCFSSMSSSVVAKKS



441. The conserved FKBP-type
DVSELQIGVKHKPKSCDIQAHKGDRIKVHYRGSLTDGTV



peptidylprolyl isomerase domain is


FDSSFERGDPIEFELGSGQVIKGWDQGLLGMCVGEKRKL





underlined and the Cyclophilin-


RIPSKLGYGAQGSPPKIPGGATLIFDTELVAVNGKGISN




type peptidyl-prolyl cis-trans
DGDSDL



isomerase signatures are in bold.





173
The amino acid sequence of SEQ ID
MSGAPAERPISYFDITIGGKPIGRIVFSLYADLVPKTAE



442. The conserved FKBP-type

NFRALCTGEKGIGKSGKPLCYAGSGFHRVIKGFMCQGGD




peptidylprolyl isomerase domain is

FTAGNGTGGESIYGEKFEDEAFPVKHTKPFLLSMANAGK




underlined and the Cyclophilin-

DTNGSQFFITVSQTPHLDDKHVVFGEVIKGKSIVRAIEN




type peptidyl-prolyl cis-trans

YPTASGDVPTSPIIISACGVLSPDDPSLAASEETIGDSY




isomerase signatures are in bold.
EDYPEDDDSDVQNPEVALDIARKIRELGNKLFKEGQIEL




ALKKYLKSIRYLDVHPVLPDDSPPELKDSYDALLAPLLL




NSALAALRTQPADAQTAVKNATRALERLELSDADKAKAL




YRRASAHVILKQEDEAEEDLVAASQLSPEDMAISSKLKE




VKDEKKKKREKEKKAFKKMFSS





174
The amino acid sequence of SEQ ID
MASSLRSSLFSSWALDSKSVCSLFNLNPGKMGLPSISTP



443. The conserved FKBP-type
LNWRTCCCSHSSELLELNEGLQSSRRKTVMGLSTVIALS



peptidylprolyl isomerase domain is
LVYCDEVGAVSTSKRALRSQKVPEDEYTTLPNGLKYYDL



underlined.
KVGSGTEAVKGSRVAVHYVAKWKGITFMTSRQGMGITGG





TPYGFDVGASERGAVLKGLDLGVQGMRVGGQRILIVPPE






LAYGNTGIQEIPPNATLEFDVELISIKQSPFGSSVKIVEG






175
The amino acid sequence of SEQ ID
MGAIEDEEPPLKRLKVSSPGLRRGLEEEAPSLSVGSVSI



444. The conserved G-protein beta
LMAKSLSLEEGETVGSKGLIRRVEFVRIITQALYSLGYQ



WD-40 repeat domains are
KAGALLEEESGILLQSSNVALFRKQILDGKWDESVVTLR



underlined.
GIDQVEVEGNTLKAASFLILQQKFFELLDKGNIPEAMKT




LRLEISPMQLNTKRVHELASCIVFPSRCEELGYSKQGNP




KSSQRMKVLQEIQQLLPPSIMIPEKRLERLVEQALNVQR




EACIFHNSLDPALSLYTDHQCGRDQIPTTTLQVLESHKN





EVWFLQFSNNGKYLASASKDCSAIIWEITEGDSFSMKHR






LSAHQKPVSFVAWSPDDKLLLTCGIEEVVKLWNVETGEC






KLTYDKANSGFTSCGWFPDGERFISGGVDKCIYIWDLEG





KELDSWKGQGMPKISDLAVTSDGKEIISICGDNAIVMYN




LDTKTERLIEEESGITSLCVSKDSRFLLLNLANQEIHLW




DIGARSKLLLKYKSHRQSRYVIRSCFSSSDLAFVVSGSE




DSQVYIWHRGNGELLAVLPGHSGTVNCVSWNPVNPHVFA





SASDDYTIRIWGVNRNTFRSKNASSSNGVVHLANGGP






176
The amino acid sequence of SEQ ID
MPGTTAGAGIEPTEPQSLKKLSLKSLKRSFDLFASLHGE



445. The conserved G-protein beta
PQPPDQRSQRIRIACKVRAEYEVVKNLPTLPQREVGSSV



WD-40 repeat domains are
SNSNVGETHSSLTTNQAQGFPTDTSGDLSKDEGKEITSI



underlined and the Trp-Asp (WD)
AVHLQPQTGLIDGKAGAIAGTSTAISSVGSSDRYQPSAA



repeats signature is in bold.
IMKRLPSKWPRPIWHPPWKNYRVISGHLGWVRSVAFDPG





NEWFCTGSADRTIKIWEVATGKLKLTLTGHIEQIRGLAV






SSRHPYLFSAGDDKQVKCWDLEYNKAIRSYHGHLSGVYC






LALHPTLDILCTGGRDSVCRVWDIRTKAQIFALSGHENT






VCSVFTQAIDPQVVTGSHDTTIKLWDLAAGKTMSTLTYH






KKSVRAIAKHPFEHTFASASADNIKKFKLPKGEFLHNML






SQQKTIVNAMAINEDNVLVSAGDNGSLWFWDWKSGHNFQ





QAQTIVQPGSLDSEAGIYALQYDITGSRLVSCEADKTIK





MWKEDETATPESHPINFKAPKDIRRF






177
The amino acid sequence of SEQ ID
MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYG



446. The conserved G-protein beta
HNGERLGTYRGHNGAVWCCDVSRDSTRLITSSADQTAKL



WD-40 repeat domains are

WNVETGAQLFSFNFESPARAVDLAIGDKLVVITTDPFME




underlined.
LPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGP





LNSTIISGGEDSVVRIWDSETGKLLRESDKETGHQKPIT






SLCKSADGSHFLTGSLDKSARLWDIRTLTLIKTYVTERP





VNAVAISPLLDHVVIGGGQEASHVTTTDREAGKFEAKFF




HKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYV





RLHHFDPDYFHIKM






178
The amino acid sequence of SEQ ID
MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYG



447. The conserved G-protein beta
HNGERLGTYRGHNGAVWCCDVSRDSTRLITSSADQTAKL



WD-40 repeat domains are

WNVETGNQLFSFNFESPARAVDLAIGDKLVVITTDPFME




underlined.
LPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGP





LNSTIISGGEDSVVRIWDSETGKLLRESDKETGHQKAIT






SLCKSADGSHFLTGSLDKSARLWDIRTLTLIKTYVTERP





VNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEAKFF




HKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYV





RLHHFDPDYFHIKM






179
The amino acid sequence of SEQ ID
MAENNVGDFIPLDRQEYPSKPAPGAVDSSFWKSFKKKEV



448. The conserved G-protein beta

SRQIAGVTCINFCPEPPHDFAVTSSTRVHIYDGKSCELK




WD-40 repeat domains are

KTITKFKDVAYSGVFRSDGQIIAAGGETGVIQVFNAKSQ




underlined.
MVLRQLKGHGRPVRVVRYSPQDKLHLLSGGDDSMVKWWD




ITTQEELLNLEGHKDYVRCGAASPSSVNLWATGSYDHTV





RLWDLRNSKTVLQLKHGKPLEDVLFFPSGGLLATAGGNV






VKVWDILGGGRPIHTMETHQKTVMAMCISKVPRSGQALG






DAPSRLVTASLDGYMKVFDLDHFKVTHSARYPAPILSMG





ISSLCRTMAVGTSSGLLFIRQRKGQIEDKIHSDSSGLQV




NPVNDEKDSAVLKPNQYRYYLRGRSEKPSEGDYVVKRMA




KVYFQEYDKDLRHFNHSKALVSALKAADSKGTVAVIEEL




VARKRLIQTLSILNLDELELLINFLSRFILVPKYSRFLI




SLTDRVLDARAVDLGKSENLKKQIADLKGIVVQELRVQQ




SMQELQGIIEPLIRASAR





180
The amino acid sequence of SEQ ID
MDVETSSKPTGNKRTYTRLPRQVCVFWQEGRCTRESCNF



449. The conserved C-x8-C-x5-C-x3-
LHVDEPGSVKRGGATNGFAPKRSYNGSDERDTLAAGPPG



H type zinc finger is underlined
GSRRNISARWGRGRGGIFISDERQKIRNKVNYWLAGN



and in italics and the conserved


QRGEE



KYL



SF
VMGSDVKFLTQLSGHVKAIRGIAFPSD




Cys and His residues are in bold. The

SGKLYSGGQDKKVIVWDCQTGQGTDIPLNDEVGCLMSEG




conserved G-protein beta WD-40
PWIFVGLPNAVKAWNILTSTELSLVGPRGQVHALAVGNG



repeat domains are underlined and

MLFAGTHDGSILAWKFSPASNTFEPAASLVGHTQAVVSL




the Trp-Asp (WD) repeats signature

VSGADRLYSGSMDKTIRVWDLGTFQCLQTLRDHTSVVMS




is in bold (non-italics).

LLCWDQFLLSCSLDNTVKVWVATSSGALEVTYTHNEEHG






VLALCGMNDEQAKPVLLCSCNDNTVRLYDLPSFSERGRI






FSRNEVRTFQIAPGGLFFTGDATGELKVWNWATQKS






181
The amino acid sequence of SEQ ID
MSVQELRERHAAATAKVNALRERIKAKRLQLLDTDVATY



450. The conserved G-protein beta
ASSNGRTPISFSFTDLVCCRTLQGHTGKVYSLDWTSEKN



WD-40 repeat domains are

RIVSASQDGRLIVWNALTSQKTHAIKLPCAWVMTCAFSP




underlined.

SGQAVACGGLDSVCSIFQLNNQLDRDGHLPVSRILSGHR






SYVSSCQYVPDGDTHVITGSGDRTCIQWDVTTGQRIAIF






GGEFPLGHTADVMSVSISAANPKEFVSGSCDTTTRLWDT





RIASRAIRTFHGHEADVNTVKFFPDGLRFGSGSDDGTCR





LFDIRTGHQLQVYRQPPRENQSPTVTAIAFSFSGRLLFA






GYSNGDCFVWDTILEKVVLNLGELQNTHNGRISCLGLSA






DGSALCTGSWDKNLKIWAFGGHRKIV






182
The amino acid sequence of SEQ ID
MKVKIISRSTDEFTRERSNDLQRVFRNFDPNLHTQARAQ



451. The conserved G-protein beta
EYVRALNAAKLDKIFAKPFLAAMSGHIDGISAMAKSPRH



WD-40 repeat domains are

LKSIFSGSVDGDIRLWDIAARRTVQQFPGHRGAVRGLTV




underlined.

STEGGRLISCGDDCTVRLWDIPVAGIGESSYGSENVQKP






LATYVGKNSFRAVDYQWDSNVFATGGAQVDIWDHDRSEP






TNSFAWGSDTVISVRFNPAEKDIFATTASDRSIVLYDLR






MASPLNKLIMQTRNNAIAWNPREPMNFTAANEDCNCYSY





DMRRMNISTCVHQDHVSAVMDIDYSPSGREFVTGSYDRT





VRIFPYNAGHSREIYHTKRMQRVFCVKFSGDATYVVSGS






DDANIRLWKAKASEQLGVLLPRERKRHEYLDAVKERFKH





LPEIKRIERHRHLPKPIYKAALLRHTVNAAAKRKEERKR




AHSAPGSVVTNPLRKKRIVAQLE





183
The amino acid sequence of SEQ ID
MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSD



452. The conserved G-protein beta
SENDFDLNNKSPDTTALQAKRGKDIQGIPWNRLNFTREK



WD-40 repeat domains are
YRETRLQQYKNYENLPRPRRSRNLDKECTNFERGSSFYD



underlined.
FRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMH




WSSLKQKGEEVLNVAGPIVPSVKHPGSSPQGLTRVQVSA




MSVKDNLVVAGGFQGELICKYLDKPGVSFCTKISHDENG




ITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTVLER




FSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTV





GTLRGHLDYSFAAAWHPDGYILATGNQDTTCRLWDVRKL






SSSLAVLKGRMGAIRSIRFSSDGRFMAMAEPADFVHLYD





TRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYG




SLLEFNRRRMNYYLDSIL





184
The amino acid sequence of SEQ ID
MAEALVLRGTMEGHTDAVTAIATPIDNSDMIVSSSRDKS



453. The conserved G-protein beta


ILLWNLTKEPEKYGVPRRRLTGHSHFVQDVVISSDGQFA




WD-40 repeat domains are

LSGSWDSELRLWDLNTGLTTRRFVGHTKDVLSVAFSIDN




underlined and the Trp-Asp (WD)

RQIVSASRDRTIKLWNTLGECKYTIQPDAEGHSNWISCV




repeats signatures are in bold.

RFSPSATNPTIVSCSWDRTVKVWNLTNCKLRNTLVGHGG






YVNTAAVSPDGSLCASGGKDGVTMLWDLAEGKRLYSLDA






GDIIYALCFSPNRYWLCAATQQCVKIWDLESKSIVADLR





PDFIPNKKAQIPYCTSLSWSADGSTLFSGYTDGKIRVWG




IGHV





185
The amino acid sequence of SEQ ID
MAAIKSTSRSASVAFAPDAPLLAAGTMAGAIDLSFSSLA



454. The conserved G-protein beta
NLEIFKLDFQSDDPELPVVGECPSNERLNRLSWGSAGGS



WD-40 repeat domains are

FGIIAGGLVDGTINIWNPATLINSEDNGDALIARLEQHT




underlined.

GPVRGLEFNTISTNLLASGAEDGELCIWDLANPTAPTHF






PPLKGVGSGAQGEISFLAWNRKVQHILASTSYSGTTVVW






DLRRQKPIISFPDATRRRCSVLQWNPDASTQLIVASDDD






NSPTLEAWDLRNTISPYKEFVGHSRGVIAMSWCPSDSLF






LLTCAKDNRTLCWDTGSGEIVCELPAGANWNFDVQWSPK






IPGILSTSSFDGKIGIHNIEACSRNVSGEVEFGGAIVRG





GPSALLKAPKWLERPAGVSFGFGGKLASFRPSTVAQAAD




HRHSEVFIHNLVTEDNLVIRSTEFEAAIADGEKVSLRAL




CDRKAEESQSDEEKETWNFLRVMFEDEGTARTKLLEHLG




FKVQSEENGDLQETHSSKIDDIGSEIGKTLTLDDKTEED




VLPQLKGGQDAAIPQDNGEDFFDNLHSPKEEVSLSHVGN




DFVGEKDKDMVVNGAEIEHETEDLTEYSDWNEAIQHSLV




VGDYKGAVLQCLSANRMADALIIAHLGGNSLWEKTRDEY




LKKAKSSYLKVVSAMVNNDLTGLVNSRPLKSWKETLAML




CTYSQREEWTVLCDMLASRLIAAGNVMAATLCYICAGNI




EKTVEIWSRSLKYDYDGRSFVDHLQDVMEKTVVLALATG




QKRVSPSLSKLVENYAELLASQGLLTTAMEYLKLLGTEE




SSHELSILRDRLYLSGTDNKVEASSFPFETRQDLTESQY




NMHQTGFGAPETQKNYQENVHQVLPSGSYTDNYQPTANT




HYIAGYQPAPQQQPSFQNYFTPASYQPAPSPNVFYPSQV




SQAEQSNFAPPVNQPPMKTFVPSTPPILRNVDQYQTPSL




NPQLYQGVSSATVETHPYQTGAPASVSVGTTPGQPSVVP




NFMVPGPVTAPTVTPRGFMPVTTPTQHPLGSANPPVQPQ




SPQSSQVQSV





186
The amino acid sequence of SEQ ID
MAGAADSQLQTLSERDSTPNFKNLHTREYAAHKKKVHSV



455. The conserved G-protein beta

AWNCTGTKLASGSVDQTARVWNIEPHGHSKTKDLELKGH




WD-40 repeat domains are

ADSVDQLCWDPKHSELLATASGDRTVRLWDARSGKCSQQ




underlined and the Trp-Asp (WD)

VELSGENINITFKPDGTHIAVGNRDDELTIIDVRKFKPL




repeats signature is in bold.

HKRKFSYEVNEIAWNTTGELFFLTTGNGTVEVLSYPSLQ






VLHTLVAHTAGCYCIAIDPIGRYFAVGSADALVSLWDLS






EMLCVRTFTKLEWPVRTISFNHDGQYIASASEDLFIDIA






DVQTGRTVHQISCRAAMNSVEWNPKYNLLAFAGDDKNKY






MQDEGVFRVFGFETP






187
The amino acid sequence of SEQ ID
MAATSPVGAGSGRELANPPTDGISNLRFSNHSDHLLVSS



456. The conserved G-protein beta

WDRKVRLYDASANSLKGQFVHGGPVLDCCFHDDASGFSG




WD-40 repeat domains are

SADNTVRRYDFSTRKEDILGRHEAPVRCVEYSYAAGQVI




underlined.

TGSWDKTLKCWDPRGASGQEKTLVGTYSQLERVYSMSLV






GHRLVVATAGRHINVYDLRNMSQPEQRRESSLKYQTRCV





RCYPNGTGFALSSVEGRVAMEFFDLSEAGQAKKYAFKCH





RKSEAGRDTVYPVNAIAFHPIYGTFATGGCDGYVNVWDG





NNKKRLYQYSKYPTSIAALSFSRDGRLLAVASSYTFEEG




EKPHEPDAVFVRSVNEAEVKPKPKVYAAPP





188
The amino acid sequence of SEQ ID
MASDDEEGFKNEEAPGVVDEAEVQEGLRACFPLSFGKQE



457. The conserved G-protein beta
KKQAPLESIHSATKRPEDPRPRRQLGPPRPPPSILAEQE



WD-40 repeat domains are
DSDRFVGPPRPPQFVRDDNDDGEAEIMIGPPRPPAQYSD



underlined and the Trp-Asp (WD)
DHDNEETIGPPKPSYLEKGEETDQMVGPSKRGSDDETSG



repeats signature is in bold.
DSDDGDDAVDFRVPLSNEIVLRGHTKVVSALAIDQTGSR





VLTGSYDYSVRMYDFQGMTSQLKSFRQLEPAEGHQVRSL






SWSPTSDRFLCVTGSAQAKIFDRDGLTLGEFVKGDMYLR






DLKNTKGHISGLTCGEWHPKEKQTILTCSEDGSLRIWDV





NDFNTQKQVIKPKLAKPGRVPVTACAWGRDGKCIAGGVG





DGSIQVWNLKPGWGSRPDLYVAKGHDDDITGLQFSADGN






ILLTRSTDETLKVWDLRKAITPLQVFRDLPNNYAQTNVA





FSPDERLIFTGTSVERDGNSGGLLCFYDRQTLELVLRIG




VSPVHSVVRCTWHPRHNQVFATVGDKKEGGAHILYDPAL




SERGALVCVARAPRKKSLDDFEAKPVIHNPHALPLFRDE




PSRKRQREKARMDPMKSQRPDLPVTGPGFGGRVGSTKGS




LLTQYLLKEGGLIKETWMEEDPREAILKYADVAAKDPKF




IAPAYAQTQPETVFAETDSEEEQK





189
The amino acid sequence of SEQ ID
MKERGQSHAGQPSVDERYTQWKSLVPVLYDWLANHNLVW



458. The conserved G-protein beta
PSLSCRWGPQMHQATYKNSQRLYLSEQTDGTVPNTLVIA



WD-40 repeat domains are
TCEVVKPRVAAAEHISQFNEEARSPFVKKFKTIIHPGEV



underlined.
NRIRELPQNSKIVATHTDGPDVLIWDVDTQPNRQATLGA




ADSRPDLVLTGHKDNAEFALAMSPSAPFVLSGGKDKCVL




LWSIQDHISAATEPSSAKASKTPSSAHGEKVPKIPSIGP





RGVYKGHKDTVEDVQFCPSNAQEFCSVGDDSALILWDAR





NGNEPVIKVEKAHNADLHCVDWNPHDENLILTGSADNSV





RMFDRRNLTSSGVGSPVHKFEGHSAPVLCVQWCPDKASV






FGSAAEDSYLNVWDYEKVGKNVGKKTPPGLFFQHAGHRD





KVVDFHWNSFDPWTIVSVSDDGESTGGGGTLQIWRMSDL




IYRPEDEVLAELERFEAHILSCQNK





190
The amino acid sequence of SEQ ID
MSSLSRELVFLILQFLDEEKFKESVHKLEQESGNMK



459. The conserved G-protein beta

YFDEKAQAGEWDEVERYLSGFTKVDDNRYSMKIFFEIRK




WD-40 repeat domains are

QKYLEALDRQDRAKAVDILVKDLKVFSTFNEELYKEITQ




underlined. The Lissencephaly
LLTLDNFRENEQLSKYGDTKSARTIMMSELKKLIEANPL



type-1-like homology motif is in
FREKLIYPNLKASRLRTLINQSLNWQHQLCKNPRPNPDI



bold and the CTLH, C-terminal to
KTLFTDHACGPPNGARTPTQPTASLGVLPKATTFTPIGP



LisH motif is in italics.
HGPFPSSSTATSGLASWMSNPNMVTSPQAPVAVGPSVPV




PPNQATLLKRPRTPPGSSSVVDYQTADSEQLIKRLRPVS




QSIDEATYPGPTLRVPWSTDDLPKTLARALNEPYPVTSI





DFHPSQQTFLLVGTKNGEITLWEVGSREKLATRSFKIWD





NANCSNHLEAAFVKDSSVSINRVLWSPDGTLIGIAFTKH




LVHTYTFQGLDLRQHLEIDAHVGGVNDLAFSHPNKQLCV





VTCGDDKMIKVWDAVTGRKLYNFEGHDAPVYSVCPHHKE






NIQFIFSTAVDGKIKAWLYDHLGSRVDYDAPGHSCTTMM





YSADGTRLFSCGTSKEGESFLVEWNESEGAIKRTYSGLR




KKGSGVVQFDTTQNHFLAVGDEHLIKFWDMDSTNMLTSC




DAEGGLLNLPRLRFNKEGSLLAVTTVNGIKILANADGQK




LLKTMENRTFDLPSRAHIDAASATSSPATGRMERIERTS




SANTVSGINGVDPAQSSEKLRLSDDLSEKTKIWKLTEIT




DSIQCRCITLPENAAEPASKVSRLLYTNSGVGLLALGSN




AVHKLWKWNRSEQNPSGKATASVHPQRWQPTSGLLMTND





ITDINPEEAVPCIALSKNDSYVMSASGGKVSLFNMMTFK






VMTTFMPPPPASTFLAFHPQDNNIIAIGMEDSTIHIYNV





RVDEVKTKLKGHQKRITGLAFSSTQNILVSSGADAQLCV





WNTETWEKRKSKTIQMPVGKTVSGDTRVQFHSDQLHILV





VHETQLAIYDAYKLERQYQWVPQDALSAPILYATYSCNR




QLIYATFSDG





191
The amino acid sequence of SEQ ID
MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHA



460. The conserved G-protein beta
LEWPSLTVQWLPDREEPPGKDYSVQKMILGTHTSDNEPN



WD-40 repeat domains are
YLMLAQVQLPLEDAENDARQYDDERGEIGGFGCANGKVQ



underlined.
VIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYS




KHPSKPPQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLL





SGSDDAQICLWDINVPAKNKVLEAQQIFKVHEGVVEDVA






WHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVVAHQ






GEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHT






FSCHKEEVFQIGWSPKNETILASCSADRRLMVWDLSRID





EFQTPEDALDGPPELLFIHGGHTSKISDFSWNPCEDWVI




ASVAEDNILQIWQMAENIYHDEEDDMPPEEVV





192
The amino acid sequence of SEQ ID
MSPGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSAD



461. The conserved G-protein beta

RTIKLFGLNASDTPSLLASLTGHEGPVWQVAWAHPKFGS




WD-40 repeat domains are

MLASCSYDGRVIIWREGQQENEWSQVQVFKEHEASVNSI




underlined.

SWAPHELGLCLACGSSDGSITVFTCREDGSWDKTKIDQA






HQVGVTAVSWAPASAPGSLVGQPSDPIQKLVSGGCDNTA






KVWKFYNGSWKLDCFPPLQMHTDWVRDVAWAPNLGLPKS






TIASCSQDGKVVIWTQGKEGDKWEGRILNDFKIPVWRVN






WSLTGNILAVADGNNSVTLWKEAVDGDWNQVTTVQ






193
The amino acid sequence of SEQ ID
MSSGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSAD



462. The conserved G-protein beta

RTIKLFGMNTSDTPTLLASLTGHEGPVWQVAWAHPKFGS




WD-40 repeat domains are

MLASCSYDRRVIIWREGQQENEWSQVQVFKEHEASVNSI




underlined.

SWAPHELGLCLACGSSDGSITVFTGREDGSWDKTKIDQA






HQVGVTAVSWAPASAPGSLVGQPSDPVQKLVSGGCDNTA






KVWKFYNGSWKLDCFPPLQMHTDWVRDVAWAPNLGLPKS






TIASCSQDGRVVIWTQGKEGDKWEGKILNDFKTPVWRIS






WSLTGNILAVADGNNNVTLWKEAVDGEWNQVTTVQ






194
The amino acid sequence of SEQ ID
MKKRSRPSNGHLSTAAKNKSRKTAPITKDPFFDSAHNRN



463. The conserved G-protein beta
KSKGKGKSRGKGEEIFSSDEDDDAIGRDAPAEEEEEIAE



WD-40 repeat domains are
EERETADEKRLRVAKAYLDKIRAITKANEEDNEEEAGED



underlined.
EETEAERRGKRDSLVAEILQQEQLEESGRVQRQLASRVV




TPSKLVECRVVKRHKQSVTAVALTEDDLRGFSASKDGTI




IHWDVETGASEKYEWPSQAVSVSSSNEVSKTQKSKGSKK





QGSKHVLSMAVSSDGRYLATGGLDRYIHLWDTRTQKHIQ






AFRGHRGAVSCLAFRQGTQQLISGSFDRTIKLWSAEDRA





YMDTLYGHQSEILAVDCLRKERVLSVGRDHTLRLWKVPE





ETQLVFRGHAASLECCCFINNEDFLSGSDDGSIELWSML





RKKPVFMAKNAHGHAIVENLSEDTSTREEPDEEVTTRQL




PNGNSIGNGMTNQMGITPSVESWVGAVTVCRGTDLAASG




AGNGVVRLWAIENSSKSLRALHDIPLTGFVNSLTFARSG




RFLIAGVGQEPRLGRWGRIQAARNGVTLCPIELS





195
The amino acid sequence of SEQ ID
MAATFGTINTATSPHNPNKSFEIVQPPNDSISSLSFSPK



464. The conserved G-protein beta

ANYLVATSWDNQVRCWEVLQTGASMPKAAMSHDQPVLCS




WD-40 repeat domains are

TWKDDGTAVFSAGCDKQAKMWPLLTGGQPVTVAMHDAPI




underlined.

KDIAWIPEMNLLATGSWDKTLKYWDTRQSNPVHTQQLPE






RCFALSVRHPLMVVGTADRNLIIFNLQNPQTEFKRISSP






LKYQTRCVAAFPDKQGFLVGSIEGRVGVHHVEEAQQSKN






FTFKCHRDSNDIYAVNSLNFHPVHQTFATAGSDGAFNFW






DKDSKQRLKAMARSNQPIPCSTFNSDGSLYAYAVSYDWS





KGAENHNPATAKHHILLHVPQESEIKGKPRVTTSGRK





196
The amino acid sequence of SEQ ID
MVVMDKGTHQTNEDESESEFIDEDDVIDEISIDEEDLPD



465. The conserved G-protein beta
ADVEGEDVQEDNKRSEPDENSSSLDDAIHTFEGHEDTLF



WD-40 repeat domains are
AVACSPVDATWVASGGGDDKAFMWRIGHATPFFELKGHT



underlined.

DSVVALSFSNDGLLLASGGLDGVVRIWDASTGNLIHVLD






GPGGGIEWVRWHPKGHLVLAGSEDYSTWMWNADLGKCLS






VYTGHCESVTCGDFTPDGKAICTGSADGSLRVWNPQTQE





SKLTVKGYPYHTEGLTCLSISSDSTLVVSGSTDGSVHVV





NIKNGKVVASLVGHSGSIECVRFSPSLTWVATGGMDKKL






MIWELQSSSLRCTCQHEEGVMRLSWSLSSQHIITSSLDG






IVRLWDSRSGVCERVFEGHNDSIQDMVVTVDQRFILTGS






DDTTAKVFEIGAF






197
The amino acid sequence of SEQ ID
MPVFRTAFNGYAVKFSPFVETRLAVATAQNFGIIGNGRQ



466. The conserved G-protein beta

HVLELTPNGIVEVCAFDSSDGLYDCTWSEANENLVVSAS




WD-40 repeat domains are


GDGSVKIWDIALPPVANPIRSLEEHAREVYSVDWNLVRK




underlined and the Trp-Asp (WD)

DCFLSASWDDTIRLWTIDRPQSMRLFKEHTYCIYAAVWN




repeats signature is in bold.

PRHADVFASASGDCTVRIWDVREPNATIIIPAHEHEILS






CDWNKYNDCMLVTGSVDKLIKVWDIRTYRTPMTVLEGHT






YAIRRVKFSPHQESLIASCSYDMTTCMWDYRAPEDALLA






RYDHHTEFAVGIDISVLVEGLLASTGWDETVYVWQHGMD





PRAC





198
The amino acid sequence of SEQ ID
MDSRNRRSRLNLPPGMSPSSLHLETTAGSPGLSRVNSSP



467. The conserved G-protein beta
STPSPSRTTTYSDRFIPSRTGSRLNGFALIDKQPQPLPS



WD-40 repeat domains are
PTRSAAEGRDDASSSSASAYSTLLRNELFGEDVVGPATP



underlined.
ATPEKSTGLYGGSRDSIKSPMSPSRNLFRFKNDHGGNSP




GSPYSASTVGSEGLFSSNVGTPPKPARKITRSPYKVLDA




PALQDDFYLNLVDWSSNNVLAVGLGTCVYLWSACTSKVT




KLCDLGVNDSVCSVGWTPQGTHLAVGTNIGEVQIWDTSR




CKKVRTMGGHCTRAGALAWSSYILSSGSRDRNILHRDIR




VQDDFIRKLVGHKSEVCGLKWSYDDRELASGGNDNQLLV





WNQQSAQPLLRFNEHTAAVKAIAWSPHQHGILASGGGTA





DRCLRFWNTATDTRLNCVDTGSQVCNLVWCKNVNELVST




HGYSQNQIMVWRYPSMSKLATLTGHTLRVLYLAISPDGQ





TIVTGAGDETLRFWSIFPSPKSQSAVHDSGLWSLGRTHIR






199
The amino acid sequence of SEQ ID
MEKKKVVVPIVCHGHSRPIVDLFYSPVTPDGLFLISASK



468. The conserved G-protein beta

DSSTMLRNGETGDWIGTFEGHKGAVWSCCLDNRALRAAS




WD-40 repeat domains are

GSADFSAKIWDALTGDELHCFVHKHIVRACAFSESTSLL




underlined.

LTGGHEKILRIFDLNRPDAPPKEVDNSPGSIRTVAWLHS






DQTILSSNSDAGGVRLWDLRTEKIVRVLETKSPVTSAEV






SQDGRYITTADGNSVKFWDANHFGMVKSYTMPCMVESAS






LEPTMGNMFVAGGEDMWVRLFDFHTGEEIACNKGHHGPV






HCVRFAPGGESYSSGSEDGTIRIWQTLNMNSEENESYGV





NGLSGKVRVGVDDVVQKVEGFQITADGHLNDKPEKPNP





200
The amino acid sequence of SEQ ID
MERYSQGTQKKSEIYTYEAPWQIYGMNWSVRKDKKFRLG



469. The conserved G-protein beta
IGSFLEEYNNRVEIIELDEESGEFKSDPRLAFDHPYPTT



WD-40 repeat domains are
KIMFVPDKECQRPDLLATTGDYLRIWQVCEDRVEPKSLL



underlined.

NNNKNSEFCAPLTSFDWNDADPKRIGTSSIDTTCTIWDI





EKEVVDTQLIAHDKEVYDIAWGEVGVFASVSADGSVRVF





DLRDKEHSTIIYESSQPETPLLRLGWNKQDPRFIATILM





DSCKVVILDIRFPTLPVAELQRHQASVNTIAWAPHSPCH





ICTAGDDSQALIWELSSVSQPLVEGGGLDPILAYTAAAE





INQLQWSSMQPDWVAIAFSNEVQILRV





201
The amino acid sequence of SEQ ID
MQSENNLDESLHLREVQELQGHTDTVWAVAWNPVTGIDG



470. The conserved G-protein beta

APSMLASCSGDKTVRIWENTHTLNSTSPSWACKAVLEET




WD-40 repeat domains are

HTRTVRSCAWSPNGKLLATASFDATTAIWENVGGEFECI




underlined.

ASLEGHENEVKSVSWSASGMLLATCGRDKSVWIWDVQPG





NEFECVSVLQGHTQDVKMVQWHPNRDILVSASYDNSIKV





WAEDGDGDDWACMQTLGNSVSGHTSTVWAVSFNSSGDRM






VSCSDDLTLMVWDTSINPAERSGNAGPWKHLCTISGYHD






RTIFSVHWSRSGLIASGASDDCIRLFS






202
The amino acid sequence of SEQ ID
MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKV



471. The conserved G-protein beta

NMWAIGKPNAILSLSGHSSAVESVTFDSAEALVVAGAAS




WD-40 repeat domains are

GTIKLWDLEEAKIVRTLTGHRSNCISVDFHPFGEFFASG




underlined and the Trp-Asp (WD)

SLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWV




repeats signature is in bold.

VSGGEDNIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQE






FLLATGSADRTVKFWDLETFELIGSAGPETTGVRAMIFN






PDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLADLN





IHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNG




HNEAKLASSGHPSVQQLDNNLKTNMARLSLSHSTESGIK




EPKTTTSLTTTEGLSSTPQRAGIAFSSKNLPASSGPPSY




VSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRP




ETTSDVKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESD




KIDSINQKRMTGNDKTDLNIARAEQHVSSRLDNTNTSSV




VCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRSPTFPWS




ATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETR




EKALTADTPVLVSGRPPTSPGVDMNSFIPRGSHGTSESD




LTVSDDNSAIEELMQQHNAFTSILQARLTKLQVIRRFWQ




RNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC




TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRA




TISATPTIGVDLQAEQRLERCNLCYVELENIKQILVPLI




RRGGAVAKSAQELSLALQEV





203
The amino acid sequence of SEQ ID
MSTLEIEARDVIKIVLQFCKENSLHQTFQTLQNECQVSL



472. The conserved G-protein beta
NTVDSLETFVADINSGRWDVILPQVAQLKLPRKKLEDLY



WD-40 repeat domains are
EQIVLEMIELRELDTARAILRQTQAMGFMKQEQPERYLR



underlined and the Trp-Asp (WD)
LEHLLVRTYFDPREAYHESSKEKRRSQIAQALASEVTVV



repeats signature is in bold.
PPSRLMALIGQSLKWQQHQGLLPPGTQFDLFRGTAAVKA




DEEEMYPTTLAHTIKFGKQSHPECARFSPDGQYLVSCSV





DGFIEVWDYISGKLKKDLQYQADDSFMMHDDAVLCVDFS






RDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTS






LSFSRDGSQLLSTSFDSTARIHGLKSGKALKEFRGHTSY






VNDAIFTSDGGRVITASSDCTVKVWD
VKTTDCIQTFKPP






PPLKGGDVSVNSVHLFPKNSEHIVVCNKASSIYIMTLQG






QVVKSFSSGKREGGDFVAACISPKGEWIYCVGEDRNIYC






FSQQSGKLEHLMKAHDKDIIGVTPHPHRNLLVTYSEDST






MKIWKP






204
The amino acid sequence of SEQ ID
MDIELEDQPFDLDFHPSAPIVAVALITGRLQLFRYVDIS



473. The conserved G-protein beta
SEPERLWTVTAHTESCRAARFINAGSSVLTASPDCSILA



WD-40 repeat domains are

TNVETGQPVARLDNAHGAAINCLTNLTESTIASGDENGI




underlined.

IKVWDTRQNSCCNKFKAHEDYISDMEFVPDTMQLLGTSG






DGTLSVCNLRKNKVHARSEFSEDELLSVALMKNGKKVVC






GSQEGVLLLYSWGYFKDCSDRFVGHPHSVDALLKLDEDT






VLTGSSDGIIRVVSILPNKMIGVIGEHSSYPIERLAFSH






DRNVLGSASHDQILKLWDIHYLHEDDEPETNKQEAVNDE





NVDMDLDVDTEKRPRGSKRKKRAEKGQTSSQKQSSDFFA




DI





205
The amino acid sequence of SEQ ID
MDRIQQIPHTCVARKINLPLGMSKESLALNLPANLAPTM



474. The conserved G-protein beta
SPPSITYSDRFIPSRKASNFEEFALPDKTSPSPNSAGGQ



WD-40 repeat domains are
SSSTNGEGRDDACAAYSALLRTELFPATPDKTEGCRRPV



underlined.
IGSPSGNVFRFKSQQCKSQSPFSLCPVGEDGDLSETGAV




ARKTTRKIPRSPFKVLDAPALQDDFYLNLVDWSSHNILA




VGLSACVYLWSASSSKVTKLCDLGLDDNVCSVAWTQRGT




YLAVGTNNGGVQIWDAAHCKQVRTMEGHCTRVGTLAWNS




HILSSGGRDRNILQRDIRAQDDFVSKFSGHKSEVCGLKW





SYDNRELASGGNDNQLFVWNQQSQQPVLKYNEHTAAVKA





IAWSPHQHGLLASGGGTADRCIRFWNTATNTSLNCVDTG




SQVCNLVWSKNVNELVSTHGYSQNQIIVWRYPTMSKLAT





LTGHTLRVLYLAISPDGQTIVTGAGDETLRFWNVFPSSK





TQQNTIRDMGVWSSGRTHIR





206
The amino acid sequence of SEQ ID
MAGGQGEGEEKVDKLSMELTEDVMKSMEIGAVFKDYNGK



475. The conserved G-protein beta

INSLDFHRTNNYLVTASDDEAIRLFDTASATWQKTSYSK




WD-40 repeat domains are
KYGVDLICFTNHQTSVLYSSKNGWDESLRHLSLMDNKYL



underlined.

RYFKGHHDRVVSLCMSPKGECFMSGSLDRTVLLWDLRID





KCQGLIRVRGRPAVAYDEQGLVFAISNEGGLIKMFDARL




YDKGPFDTFVVEGDKSEASGIKFSNDGKLILLSTMDSNI





HVLDAYQGTTVHSFSVEAVPNGGEAVPNGGTLEASFSPD






GKFVISGSGNGNIHAWSVNSGKEVACWTTEGVIPAVVKW





APRRLMFASGSSVLSLWVPDLSKLASLTGSNSNSAY





207
The amino acid sequence of SEQ ID
MHRVGSTGNTSNSSRPRREKRLTYVLNDANDSRHCSGIN



476. The conserved G-protein beta

CLVISKLSLLGGNDYLFSGSRDGTLKRWELADDSAVCSA




WD-40 repeat domains are

TFESHVDWVNDAVLTGETLVSCSSDTTLKTWRPFSDGVC




underlined.

TRTLRQHSDYVTCLAAASKNSNIVASGGLGREVFIWDIE





AAMAPVSRTSEAMDDDTSNGVLSSGNSVLSTTVRSTNAT




NSASLHTSQLQGYTPIAAKGHKESVYALAMNDVGTLLVS





GGTEKVVRVWDPRSGAKQMKLRGHTDNVRALILDSTGRF






CLSGSSDSIIRLWDLGQQRCVHSYAVHTDSVWALASTPN






FSHVYSGGRDLSLYLTDLTTRESLLLCMEKHPLLRLTLQ





DDSIWVATTDSSLHRWPAEGQNPPKMFQRGGSFLAGNLS




FTRARACLEGSAPVPVNTQPSFVIPGSPGIVQHEILNNR




RHVLTKDAEGTVKLWEITRGAVLDDYGKVSFEEKKEELF




EMVSIPAWFTMDTRLGSMSVHLDTPQCFTAEMYAVDLNV




PDAPEEQKINLAQETLRGLLAHWLSRRRQRLATQASANG




DFPAGQENALRNHISSRIDVHDDAETHIAGILPAFDFST




TSPPSIITEGSQGGPWRKKITDLDGTEDEKDFPWWCLEC




VLHGRLSPRESLKCSFYLHPYEGTTVQVLTQGKLSAPRI




LRIQKVINYVLEKMVLDRPLDSSNSETTFTPGLSGNQSH




AAVVGDGSLRSGARVWQQKAKPLVEILCNNQVLSPDMSL




ATVRTYIWKKPDDLYLYYRLVQNR





208
The amino acid sequence of SEQ ID
MMKGKTIQMQAAHQNHDGETSVACVLWDWHAKHLITAGA



477. The conserved G-protein beta

DNTILIHSYPSSSSSKPITLRHHKNAVTALAINSNVRSL




WD-40 repeat domains are

ASGSVDHSVKLYSYPGGEFQSNVTRFTLPIRSLAFNKSG




underlined.

ELLAAAGDDEGIKLISTIDNSIARVLKGHNGPVTSISFD






PKNEFLASSDSDGTVIYWELSTGKPVHTLKKIAPNTTSN






PTSLNQISWRPDGEMLAVPGRKSEVSMYDRDTAEKLFSL





KGGHSDTICSLAWSPNGKYIATAGTDRQVMVWDADRRQD




IDKQRFDNPICSVAWKPSDNALAVIDVLGRFGVWESPIA




SHMKSPADGAERYDNMEDEEPLMARYEEELEDSVSGSLN




EIINDDDDDDEMGKIPRKILQKKPSVKVEKGKEESNAKA




FKSGQDSFKLKSAMQEAFQPGATQRQSGKRNFLAYNMLG




SVITFDNDGFSHIEVDFHDIGKGCRVPSMTDYFGFTMAS




LSESGSVFGSPQKGEKNPSTLMYRPFSSWANNSEWSMRF




PMGEEVKAVALGSGWVAAVTSLNFLRVFSEGGLQKFVLS




MDGPVVTAAGYENLLVVVSHASNPLLSGDQVLSFTVYDI




SQKTCPLSGRLPLSPGSHLTWLGFSEEGLLSSYDSEGNL




RVFTNDYNGCWVPIFSAARERKSETESIWMVGLNSTQVF




CVVCKLPDTYPQVAPKPVLSVLNLSLPLACSDLGADDLE




NEYLRGSLLLSQMQKKAEDAVACGRESNMEEDSIFKMEA




ALDRCLLRLIANCCKGDKLVRATELARLLSLEKSLQGAI




KLVSAMKLPMLAERFNTILEEKILQENMETISCRRLTSE




AQDMDTPISISVKQVSYGANLGDSPFLPNRQVEPKHSTP




VFSKPDTKIEVDTSEAIAKGCDAQNGNIKSGDAEVQPAS




HNDSIQKPSNPFAKASNTSANQAVQRNASLLSSIKQMKT




ATENEGKRKERARSGSLPQKPAKQSKIS





209
The amino acid sequence of SEQ ID
MKQKRKGHQVDDPKYSVQTPQEDDTPNESGPASEEVESS



478. The conserved G-protein beta
DEEGGNSSNIEDDIIYSSSEEDPVVSSDYEEDEDAESDA



WD-40 repeat domains are
EGVTAEQELEGDIDNALQNYMGTLTVLSNFHGENLKNAE



underlined.
GEDTSGDDDDEEEMPKRAEESDSPEDENDERPKRAEESD




FSEDEDEERPKRAEESDSSEDEVPSRNTVGDVPLRWYKD




EQHIGYDIKGKKIKKQPKKDQLDSFLASTDDSSDWRKVY




DEYNDEEVELTKDEIKFISRLRKGTIPHADVNPYEPYVD




WFDWKDKGHPLSNAPEPKRRFIPSKWEAKKVVKLVRAIR




KGWITFQKAEEKPRFYLMWGDDLKPSEKMANGLSYIPAP




KPKLPGHEESYNPPPEYIPTQEEINSYQLMYEEDRPKFI




PKRFDSLRNVPAYDRFLSEIFERCLDLYLCPRTRKKRIN




IDPESLIPKLPKPKDLQPFPSICFLEYKGHTGAVSCISP





ESSGQWLASGSKDGTVRIWEVETARCLKVWDIGRPIQHI





AWNPVSQLSILAVAVDEEVLVLNTGLGSEDSQEKVAELL




HVKSKPVSADDLGDNTSLTKWIKHEKFDGIKLTHLKPVH




LISWHHKGDYFATVAPDGNTRAVLVHQLSKQQTQNPFKK




MQGRVVHVLFHPSRAIFFVATKTHVRVYDLVKQQLVKRL




VTGLHEVSSMAVHHKGDNLLVGSKEGKVCWFDMDLSTQP





YKTLKNHSKDIHSVAFHDSYPLFASCSDDCKAYVFYGLV





YSDLLQNPLIVPLKVLQGHQSVNGMGVLDCQFHPKQPWL




FTAGADSVVKLYCN





210
The amino acid sequence of SEQ ID
MMSLKRGFEESLVPAKRQKTELSTVTYGDGPRRTSSLES



479. The conserved G-protein beta

PIMLLTGHHAAIYTMKFNPTGTVIASGSHEREIFLWNVH




WD-40 repeat domains are
GDCKNFMVLKGHKNAVLDLHWTTDGCQIISASPDKTLRA



underlined.

WDVETGKQIKKMAEHSSFVNSCCPSRRGPPLVVSGSDDG






TAKLWDLRHRGAIQTFPDKYQITAVGFSDAADKIYSGGI






DNEIKVWDLRRGEVTMRLQGHTDTITGMQLSSDGSYLLT






NSMDCSLRIWDMRPYAPQNRCVKILTGHQHNFEKNLLKC






SWSSDGSKVTAGSADRMVYIWDTTTRRILYKLPGHTGSV





NETGFHPTQPIIGSCSSDKQIYLGEIEPNVGYQAVI





211
The amino acid sequence of SEQ ID
MEFSDTYKHTGPCCFSPDARYLAIAVDYRLVIRDVVTLK



480. The conserved G-protein beta
VVQLYSCMDKISNIEWALDSEYILCGLYKRAMVQAWSLS



WD-40 repeat domains are

QPEWTCKIDEGPAGIAHARWSPDSRHIITTSDFQLRLTV




underlined.

WSLVNTACIHIQWPKHASKGVSETQDSKFAAIATRRDCK





DYVNLLSCHTWEVMGTFTVDTIDLADLEWSPNDSAIVVW




DSPLEYKVLIYSPDGRCLFKYQAYDSWLGVKTVAWSPCS





QFLAVGSYDQTLRTLNHLTWKPFAEFVHVSTVRGPASAV





VFKEVEEPWNLDVSGLHLNDDNAHDIQDGKPAEGHSRVR




YKVVEFPVNVSSQKHPVDKPNPKQGIGLLAWSRDSQYLF




TRNDNMPTALWIWDICRLELAALLIQKEPIRAAAWDPVY





PRVALCTGSSHLYMWTPSGACCVNIPLPQFVVSDLKWNP





DGTSMLLKDRESFCCTFVPMLPEFNDDETNEE





212
The amino acid sequence of SEQ ID
MAKLIETHSCVPSTERGRGILIAGDAKTNSIIYCNGRSV



481. The conserved G-protein beta
IMRNLDNPLEASVYGEHSYPATVARFSPNGEWVASGDTS



WD-40 repeat domains are

GTVRIWGRGSDHTLKYEYKALAGRIDDLEWSADGQRIVV




underlined.

CGDSKGKSMVRAFMWDSGTNVGEFDGHSRRVLSCSFKPT






RPFRVATCGEDFLVNFYEGPPFRFKTSHRDHSNYVNCVR






FAPDGSKFITVGSDRKGVIFDGKMGEKIGELSKEGGHTG






SIYAASWSPDSKQVLTVSADKSAKIWEISETGNGTVKKT





LTFGSQGGADDMLVGCLWLNDYLITVSLGGIVSLLSAVD





PDKPPKTISGHMKSINAIALSLQSGQSEVCSSSYDGVIV






RWILGVGYAGRVERKDSTQIKCLATIEGELVTCGFDNKV





RRVPLLSEQHKESEPIDIGAQPKDLDVAVGCPELTFVST




DAGIIIIRASKIVSTTNVGYAVTAAAISPDGTEAVVGGQ





DGKLRVYSIKGDTLLEESVLERHRGPINAIRFSPDGSMF






ASGDLNREAVVWDRITREVKLKNMVYHTARINCIAWSPD






SSKVATGSLDTCILIYEVGKPASSRITIKGAHLGGVYGL






AFSDQSTVISAGEDACVRVWSLP






213
The amino acid sequence of SEQ ID
MPQPSVILATAGYDHTVRFWEATSGRCYRTLQYPDSQVN



482. The conserved G-protein beta

HLEITPDKQYLAAAGNPHIRLFEVNSNNPQPVISYDSHT




WD-40 repeat domains are

NNVTAVGFQCDGKWMYSGSEDGTVKIWDLRAPGFQREYE




underlined and the Trp-Asp (WD)

SRAAVNTVVLHPNQTELISGDQNGNIRVWDLNANSCSCE




repeats signature is in bold.

LVPEDTAVRSLTVMWDGSLVVAANNHGTCYVWRLMRGTQ





TMTNFEPLHKLQAHNSYILKCLLSPEFCEHHRYLATTSS






DQTVKIWN

VDGFTLERTLTGHQRWVWDCVFSVDGAFLVT






ASSDSTARLWDLSTGEAIRTYQGHHKATVCCALHDGTDG





ASC





214
The amino acid sequence of SEQ ID
MLTKFETKSNRVKGLSFHPKRPWILASLHSGVIQLWDYR



483. The conserved G-protein beta
MGTLIDKFDEHDGPVRGVHFHKTQPLFVSGGDDYKIKVW



WD-40 repeat domains are

NYKMRQCLFTFVGHLDYIRTVHFHNEYPWIVSASDDQTI




underlined and the Trp-Asp (WD)

RLWNWQSRVCISVLTGHNHYVMSASFHPKEDLVVSASLD




repeats signature is in bold. The


QTVRVWD

ISGLRKKTVSPADDLSRLAQMNTDLFGGGDVV




coatomer WD associated region is

VKYVLEGHDRGVNWAAFHTSLPLIVSGADDRQVKLWRMN




in bold/italics.
DTKAWEVDTLRGHTNNVSCVIFHARQDIIVSNSEDKSIR





VWDMSKRTSVQTFRREHDRFWILAAHPEMNLLAAGHDSG





MIVFKLERERPAYVVYGGSLLYVK





IPVLPPGKKSSLLMPPAPILHGGDWPLLRVTKGIFE





GGLENSTSAAYEEEDEEAAADWGEDIDIENIEGENGEAT




VLDDQEVKGGEDDEGGWDMEDLELPPDVAAANVGTNQKT




LFVAPTLGMPVSQIWMQKSSLAGEHAAAGNFETALRLLT




RQLGIKNFSPLKPLFLELYMGSHTFLPSFASVPAFSLAL




QRGWSESASPNIRGPPALVYRLSVLEEKLTVAYRATTEG




RFSEALRLFL





215
The amino acid sequence of SEQ ID
MDLLQNYQDDSEDSNPELRNHPPLEDATATSAPAGVENE



484. The conserved G-protein beta
TSSSPDSSPLRLALPAKSCAPDVDETLMALGVPGSEKKN



WD-40 repeat domains are
NHNKPIDPTQHSVTFNPSYDQLWAPLYGPAHPYAKDGIA



underlined.
QGMRNHKLGFVEDSAIEPFMFDEQYNTFHRYGYAADPSA




SLGSHIVGDLESLKKNDGASVYNLPKREHKRQKLEKKMI




QKDENEEEEKEVGEEVDNPSTEEWLKKNRKSPWAGKKEG




LQTELTEEQKKYAQEHAEKKGDREKGEKVEIVDKTTFHG




KEERDYQGRSWIDPPKDAKATNDHCYIPKRWVHTWSGHT





KGVSAIRFFPKYGHLLLSAGMDTKVKIWDVFNSGKCMRT






YMGHSKAVRDISFSNDGSRFLSAGYDRNIKLWDTETGKV






ISTFSTGKIPYVVKLHPDEDKQNVLLAGMSDKKIVQWDM





NSGEITQEYDQHLGAVNTITFVDNNRRFVTSSDDKSLRV





WEFGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDN






QILIYSTRERFQLNKKKRFAGHIAAGYACQVNFSPDGRF






VMSGDGEGRCWFWDWKTCKVFRTLKCHDNVCIGCEWHPL






EQSKVATCGWDGMIKYWD






216
The amino acid sequence of SEQ ID
MARKGLGTDPAIGSLMSSKKRKEYKVTNRFQEGKRPLYA



485. The conserved G-protein beta
IAFNFIDARYHNIFATAGGTRVTIYQCLEGGAISVLQAY



WD-40 repeat domains are
VDDDKDESFYTLSWACDVNGSPLLVAGGHNGIIRVLDVA



underlined and the Trp-Asp (WD)

NEKVHKSFVGHGDSVNEIRTQALKPSLILSASKDESVRL




repeats signature is in bold.


WN

VQTGICILIFAGAGGHRNEVLSVDFHPSDVYRIASCG






MDNTVKIWSMKEFWTYVEKSFTWTDLPSKFPTKYVQFPV






FIAAVHSNYVDCTRWLGNFILSKSVDNEVVLWEPYSKEQ





STSDGVVDILQKYPVPECDIWFIKFSCDFHYNSMAVGNR





EGKVYVWELQSSPPNLIARLSHAHCKNPIRQTAISHDGS






TILCCCDDGSMWRWDVVQ






217
The amino acid sequence of SEQ ID
MESGAGGSVGARVPSAKPEMLQQPPYSNGDDDNDMERGT



486. The conserved G-protein beta
APVPSSNPNTVSKWELDKDFLCPICMQTMKDAFLTACGH



WD-40 repeat domains are
SFCYMCIMTHLNNKSNCPCCSLYLTNNQLFPNFLLNKLL



underlined and the Trp-Asp (WD)
KKTSACQMASTASPVENLCLSLQQGAEVSVKELDFLLTL



repeats signature is in bold.
LAEKKRKMEQEEAETNMEILLDFLQRLRQQKQAELNEVQ




ADLHYIKDDILALEKRRLELSRARERYSRKLHMLLDDPM




DTTLGHAAIDDGNNVRTAFVRGGQGDAISGKFQQKKAEI




KAQASSQGMQKRANFCHSDSQVLPTLSGLTIARKRRVLA




QFDDLQECYLQKRRRWATQLRKQCDGGLRKERDGNSISR




EGYHAGLEEFQSILTTFTRYSRLRVISELRHGDLFHSAN





IVSSIEFDRDDELFATAGVSRRIKVFDFATVVNEPADVH






CPVVEMSTRSKLSCLSWNKCIKSQIASSDYEGIVTVWDV





NTRQSVMMYEEHEKRAWSVDFSRTEPTRLISGSDDGKVK





VWCTRQETSVLNIDMKANICCVKYNPGSSYYVAVGSADH






HIHYYDLRNPSVPLYEFNGHRKTVSYVKFISTNELASAS







TDSTLRLWD

VRDNCLVRTFKGHTNEKNFVGLTVNSEYIA






CGSETNGVFVYHKAISKPAAWHQFGSPDLDDSDDDTSHF





ISAVCWKSESPTMLAANSQGTIKVLVLAP





218
The amino acid sequence of SEQ ID
MANYVDSKKNFKCVPALQQFYTGGPFRLSSDGSFLVCAC



487. The conserved G-protein beta
NDEVKVVDLATGSVKNTLEGDSELIVALALTPDNKYLFS



WD-40 repeat domains are

ASRSTQIKFWDLSSATCKRTWKAHNGPVADMACDASGGL




underlined.

LATAGADRSILVWDVDGGYCTHSFRGHQGVVTTVIFHPD






PHCLLLFSGSDDATVRIWDLVAKKCISVLEKHFSTVTSL






AISENGWNLLSAGRDKVVNIWDLRDYHCRATIPTYEPLE





AVCVLPTGSRLVSVMNQSRALPENRKKSGAAPVYFLTVG




ERGIVRIWYSEGALCLYEQKSSDAIISSDKDELKGGFVS




AVLLPLTQGVMCVTADQRFLFYNLDESDEGKCDLKVSKR




LIGYNEEIVDLKFLGDEEKFLAVATNLEQVRMYDLSSMT





CVYELSGHTDIVLCLDTVVFSGHSLLASGSKDHTVRIWD





TESKSCICVAAGHMGAVGAVAFSKKAKNFFVSGSSDRTI





KVWSFASVLDFGGISKSIKLSSQAAVAAHDKDINSVAVA






PNDSLICTGSQDRTARIWRLPDLVPVLVLRGHKRGVWCV






EFSPVDQCVMTASGDKTIKIWALSDGSCLKTFEGHTASV






LRASFLTRGTQFVSSGADGLLKLWTIKSNECIATFDQHE






DKIWAMAVGKKTEMLATGGSDSLVNLWHDCTTTDEEEAL





LKEEEAALKDQELLNALADTDYVKAIQLAFELRRPYKLL




NVFTELYSKGHAQDQIQKVIRELGNEELRLLLEYVREWN




TKPKFAHVAQFVLFQLFNVLPPKEIIEVQGISELLEGLI




PYAQRHYSRIDRLMRSTFLLDYTLSSMSVLSPTETDLSS




SNLLARTADPLHAQIDQFHPTHFPEPNLTPIQSLLDSGN




TDSVEVTARRAKKKRVSGNDSEKTTVAEVKIGDMENAFD




EPDVADQGSSRKHKPASSKKRKSIAVGNASIKRIASGNA




VTIALQV





219
The amino acid sequence of SEQ ID
MESSCSSMNSNRHSTEKRCLRPLQKQGASMNKHSSDRFI



488. The conserved G-protein beta
PARGSIDLDVARFMVTQKQKDNNDIHALSPSPSPSKKAY



WD-40 repeat domains are
QKEMADTLLKNAGAADNNCRILSFNGKSSTVSQGSQENV



underlined.
LANLSISRRARRYIPQSADRTLDAPDLLDDYYLNLLDWS





STNVLSTALGNTVYLWDASNSSISELLIADEEEGPVTSV






SWAPDGSQIAVGLNNSVVQLWDSQSNKKLRALKGHHDRV






GALSWNGPILTTGGLDGIIINHDVRTRDHIVQTYKGHTQ






EVCGLKWSPSGQQLASGGNDNLLYIWDKSMASHNPSSQY






FHQLDEHCAAVKALAWCPFQTNLLASGGGTSDGSIKFWN





TQTGACLNTVDTHSQVCSLLWNRHERELLSSHGLNQNQL





TLWKYPSMVKITELTGHTARVLHMAQSPDGYTVASAAAD






ETLKFWQVFGAPDASKKTKDTKGAFNMFHMHIR






220
The amino acid sequence of SEQ ID
MLDEIVADEEEEFNIWKKNTPLLYDVVITHALEWPSLTV



489. The conserved G-protein beta
QWLPDRHQSPTKDYSLQKMIVGTHTSGDEPNYLMIAEVQ



WD-40 repeat domains are
MPLQYSEDGNVGGFESTEAKVHIIQQINHEGEVNRAQYM



underlined.
PQNSFIIATKTVSSDVYVFDYTKHSSNAPQERVCNPELI





LKGHTNEGYSLSWSPLKEGQLLSGSNDAQICFWDINAAS





GRKVVEAKQIFKVHEGAVEDVSWHLKHEYLFGSVGDDCH





LLIWDTRTAAPNKPQHSVVAHESEVNSLAFNPFNEWLLA






TGSADKTVKLFDLRKLSCSLHTFSNHTEEVFQIEWSPMN






ETILASSGGDRRLMVWDLRRIGDEQTSEDAEDGPPRLIF






IHGGHTSKISDFSWNLHDDWLIASVAEDNILQIWQMAEN





IYHDDADIL





221
The amino acid sequence of SEQ ID
MTKEDHGESRDEMGERMVNEEYKLWKKNTPFLYDLVITH



490. The conserved G-protein beta
ALEWPSLTVQWLPPSCKQQQDIIKDDDIDHPNTQMVILG



WD-40 repeat domains are
THTSDNEPNYLILAEVQLHDGTEDEDGDGDVKRPQDKMK



underlined.
PGTSGGAMGKVRILQQINHQKEVNRARYMPQKPTIIATK




TVNADVYVFDYSKHPSKPPQEGRCNPELRLQGHESEGYG





LSWSPLKEGHLLSASDDAQICLWDITAATKAPKVVEANQ






IFRYHDGPVEDVAWHAIHDHLFGSVGDDHHLLLWDIRND





SEKPLHIVEAHQAEVNCLAFNPFNEWIVATGSADRTVAL





HDIRKLDKVLHTCAHHMEEVFQIGWSPQNGAILASCGSD






RRLMVWDLSRIGDEQNPEDAEEAPPELLFIHGGHTSKIS






DFSWNPAEEWVIASVAEDNILQVWQMSEHIYNDDNDSPTA






222
The amino acid sequence of SEQ ID
MAMAMGDENAADPVEEFNIWKKNTPFLYDLVITHALEWP



491. The conserved G-protein beta
SLTVQWLPDRHQSSTADYSLQKMIVGTHTSEDEPNYLMI



WD-40 repeat domains are
AEVQIPLQNSEDNIIGGFESTEAKVQIIQKINHEGEVNK



underlined.

ARYMPQNSFVIATKTVSSDVYVFDYSKHPSKAPQERVCN






PELILKGHSNEGYGLSWSPLKEGYLLSGSNDAQICLWDI





NAAFGKKVLEANQIFKVHEGAVGDVSWHLKHEYLFGSVG





DDCHLLIWDMRTAAPNKPQQSVIAHQSEVNSLAFNPFNE






WLLATGSMDKTVKLFDLRKLSCSLHTFSNHTDQVFQIEW






SPMNETILASSGADRRLMVWDLARIGETPEDEEDGPPEL






LFVHGGHTSKISDFSWNLNDDRVIASVAEDNILQIWQMA





ENIYHDDEDML





223
The amino acid sequence of SEQ ID
MGLFEPFRALGYITDGVPFAVQRRGIETFVTLSVGKAWQ



492. The conserved G-protein beta
IYNCAKLIPVLVGPQMDKKIRALACWRDFTFAATGHDIA



WD-40 repeat domains are
VFRRAHQVATWSGHKAKVTLLLSFGQHVLSVDLEGCLFI



underlined and the Trp-Asp (WD)
WAVAEVNQNKPPIGQIQLGEKFSPSCIMHPDTYLNKVLI



repeats signature is in bold.

GSEEGTLQLWNVNTRKKLYEFKGWGSSIRCCVSSPALDV






VGIGCSDGKIHVHNLRYDEEIVTFMHSTRGAVTALSFRT






DGQPLLAAGGSSGVISIWNLEKKKLQSVIKDAHDSSVCS






LHFFANEPVLMSSATDNSIKMWIFDTTDGEARLLKYRSG






HSAPPMCIRYYGKGRHILSAGQDRAFRIFSVIQDQQSRE





LSQGHVGKRAKKLKVKDEEIKLPPVIAFDAAEIRERDWC




NVVTCHLDDPCAYTWRLQNFVIGEHILKPCLEDPTPVKS




CSISACGNFAVLGTEGGWLERFNLQSGISRGTYIDIGEK





RQCAHNGAVVGLACDATNTLLISGGYNGDIKVWDFKGRE






LKFRWEIEVPLIKIVYHPGNGILATAADDMILRLFDVTA






MRLVRIFVGHMDRVTDLCFSGDGKWLLSSSMDGTIRVWD






IISSRQLNAMHMDSAVTALSLSPGMDMLATTHVGHNGIY





LWANRMIYSKATDIEPFISGKQVVKVSMPTVSSKRESEE




GDEKRTIVAESNVNKSDVSGSLIGDSYSAQLTPELVTLA




LLPKAQWQSLVNLDIIKMRNKPIEPPKKPEKAPFFLPSL




PTLSGERIFIPSSMNGDGDQDETRNDKTVFEARGKKLGG




ESLSFMQLLQSCAKIKDFTTFTNYLKGLSPSAVDMELRL




LQIVDNENISETEHSVELQGIGMLLDYFVNEVSCNNNFE




FVQALIRLFLKIHGETIRCQVSLQEKARKLLEIQSSTWE




RLDTSFQNARCMITFLSSSQF





224
The amino acid sequence of SEQ ID
MIAAVCWVPKGVAKVLPDSAEPPTQEEIQELLKCNVVAE



493. The conserved G-protein beta
SDDNEDSDEESEEMDTETDKNTDAVAKALAAANALGSQS



WD-40 repeat domains are
SDFQRQHKVDDIANGLKELDMDHYDDEDEGIDIFGSGSL



underlined and the Trp-Asp (WD)
GNCYYPANDMDPYLVEQDDDDEDEIEDMTIKPSDLIILS



repeats signature is in bold.
ARNEDDVSHLEVWIYEEETEEGGSNMYVHHDIILPAFPL




SLAWLDCNLKGGEKGNFVAVGTMQPEIELWDLDVLDEVE




PAVVLGGAVKDEASGKTTKLKKKKKNKQAVNFKEGSHTD





AVLGLAWNMEYRNVLASASADKSVKIWD
IVAEKCEHTMQ






PHTDKVQAVAWNPNQATVLLSGSFDRSVIMMDMRAPTHS






GIRWPVPADVESLAWDPHTDHSFMVSAEDGTVRGFDIRA





AASTADFDGKPMFILHAHDKAVCAISYNPAAPSLLTTGS





TDKMVKLWDITNNQPSCIASTNPNVGAVFSAAFSKNSPF






LLATGGSKGILHVWDTLDNSEVARRFGKFRPQN






225
The amino acid sequence of SEQ ID
MIMDENEFCDIFSLRKRLCLLSSQEGEEEEELEAMSQLD



494. The conserved eukaryotic
AGEFTVTGNEEVVAIAEDDVNTGILSQDLFSSQDYCTPS



protein kinase domain is
QPQDSTDLDSKDKAPCPLSPVKSTIQRKRCRPELLSNPP



underlined.
DSIQFSFQRLERVRSEESIQSSSQQLARVRSEVSSSDDF




KTPKITASGQKNYVSQSALALRARVMSPPCIKNPYLDEN




EELNEKIQRSTRRSPACVTPIQSGACLSRYRADFHELEE





IGRGNFSRVYKALNRLDGCCYAVKCSQSELRLDTERKVA






LMEVQSLAALGPHKNIVGYHTAWFENDHLYIQMELCDHN






LTTANDRGILRTDTDFLEAVYQIAQALEFIHGRGVAHLD






VKPENIYVRDGTYKLGDFGRATLINGTLHVEEGDARYMS






REILNDNYEHLDKVDMFSLGATFFELLMRKQYPGSGKRI






DRDTEIKIPILPGFSIYFQKLLQDLVSNDPGKRPSAKDV






LKNPIFNKVRGAKEV






226
The amino acid sequence of SEQ ID
MLAPALEMEPVEPQSLKKLSFKSLKRALDLFSPVHGQIA



495. The conserved G-protein beta
PPDPESKKMRISYKLNFEYGGGSGSEDQVPKRKESGAAQ



WD-40 repeat domains are
NQGQQAAGASNALALPGPEGSKIPPMEKSQNALTVGPSL



underlined and the Trp-Asp (WD)
RPQGLNDVGLHGKGTAIISASGSSDRNLSTSAIMERLPS



repeats signature is in bold.
RWPRPVWHPPWKNYRVISGHLGWVRSIAFDPSNQWFCTG





SADRTIKIWDLASGRLKLTLTGHIEQIRGLAVSSKHTYM






FSAGDDKQVKCWDLEQNKVIRSYHGHLSGVYCLALHPTI






DILLTGGRDSVCRVWDIRSKMQIFALSGHDNTVCSVFAR






PTDPQVVTGSHDTTIKFWD
LRHGKTMTTLTNHKKSVRAM






AQHPKENCFASASADNIKKFQLPRGEFLHNMLSQQKTII






NTMAVNEEGVMATGGDNGSLWFWDWKSGHNFQQAHTIVQ






PGSLESEAGIYALSYDLTGSRLVSCEADKTIKMWKEDEL





ATPETHPLNFKPPKDIRRF





227
The amino acid sequence of SEQ ID
MEEAAKEQSAGSGKPKLLRYGLRSAAKPKEDKKEEQLHQ



496. The conserved G-protein beta
PPPPPPPQQQAAPAPAPAATRSSTSGSAGGRDRRPQQQH



WD-40 repeat domains are
AVDEKYARWKSLVPVLYDWLANHNLLWPSLSCRWGPQLE



underlined.
QATYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAA




EHVSQFNEEARSPFIRKYKTIIHPGEVNRIRELPQNPNI




VATHTDSPDVLIWDVESQPNRHAVYGATASRPNLILTGH




QENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHITASA




TDQTTNKSPGSGGSIIKKTGEGNEETGNGPSVGPRGIYC





GHEDTVEDVAFCPSTAQEFCSVGDDSCLILWDARIGTNP






VAKVEKAHNGDLHCVDWNPHDNNLILTGSADNSVNMFDR





RNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSA





EDGLLNIWDYERVDKKVDRAPNAPAGLFFQHAGHRDKIV





DFHWNTADPWTMVSVSDDCDTAGGGGTLQIWRMSDLIYR




PEEEVLAELENFKAHVLECSKA





228
The amino acid sequence of SEQ ID
MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHA



497. The conserved G-protein beta
LEWPSLTVQWLPDREEPPGKDYSVQKMILGTHTSDNEPN



WD-40 repeat domains are
YLMLAQVQLPLEDAENDARQYDDERGEIGGFGCANGKVQ



underlined.
VIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYS




KHPSKPPQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLL





SGSDDAQICLWDINVPAKNKVLEAQQIFKVHEGVVEDVA





WHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVVAHQ




GEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHT




FSCHKEEVFQIGWSPKNETILASCSADRRLMVWDLSRID




EFQTPEDALDGPPELLFIHGGHTSKISDFSWNPCEDWVI




ASVAEDNILQIWQMAENIYHDEEDDMPPEEVV





229
The amino acid sequence of SEQ ID
MGKYMRKGKGVGEVAVMEVSQGSLGVRTRARTLAAASSQ



498. The conserved cyclin-
KDHRRLGASKSVTTKHQSSAPPASPCVESSMHTCYLELR



dependent kinase inhibitor domain
SRKLEKFSRCYHSAHGATSHGESKRSLSLSEPSRLAVSE



is underlined.
EARVASDKSSHRVLQQQSSVAHSRNNSATFSHNAKPAKA




AQRKERRDDDHTSARPSEAPHEDEDGMEVEASFGENVMD




LDSRERRTRETTPSSYTRDVETMETPGSTTRPPSNAGRR




RFQTEGGHGTRNQFHVPTTNEIEEFFAGAEQQEQRRFTD





RYNYDPVSDSPLPGRFEWVRLRP






230
The amino acid sequence of SEQ ID
MQNMEENVQSSWSLHGNKEICARYEILKRVSSGTYLDVY



499. The conserved

RGRRKEDGLIVALKEVHDYQSSWREIEALQRLCGCPNVV




serine/threonine protein kinase

RLYEVILEFLTSDLYSVIKSAKNKGENGIPEAEVKAWMI




domain is underlined, and the

QILQGLANCHANWVIHRDLKPSNMLISAYGILKLADFGS




serine/threonine protein kinase

MSFLKRAIYEVEYELPQEDILADAPGERLMDEDDSVKGV




active-site signature is in bold.

WNEGEEDSSTAVETNFDDMAETANLDLSWKNEGDMVMQG






FTSGVGTRWYRAPDFLYGATIYGKEIDLWSLGCILGELL






ILEPLFSGTSNIDQLSRLVKVLGLQQKKNWPGCSNLPDY






RKLCFPGDGSPVGLKNHVPNCSDNMFSILERLVCYDPAA






RLNAKEIVENKYFVEDPYPVLTHELRVPSPLREENNFSE





DWAKWKDMEVDSDLENIDEFNVVHSSDGFCIKFS





231
The amino acid sequence of SEQ ID
MADVPESLQQEKDEQGTDKNCCDGKFQKEIDIDDMEEEY



502. The conserved histone
NESSIDDEEENLSDNVATNNMGTTPQGQACMAVTVEGIE



deacetylase family domain is
HANSVGCGRNGREGSEEVTAAEDMGHVSIENIREQGRNR



underlined.
KSSEQLLALYEQEGLLEDDEDDDDVDWEPFEGVTVQMKW




YCTNCTMANSDDSVHCDSCGEHRNSDILRQGFLASPYLP




AESPSSSDVPDERLEESKCVMTTLTPSISPMIGVCCSSL




QSERRTVVGFDERMLLHSEIQMETYPHPERPDRLRAIAA





SLRAAGLFPGKCFSIPAREATCEELQTIHSLEHVNAVES






TSCGMLSHLSPDTYANEHSSLAARLAAGLCADLAKAIMT






GQAQNGFALVRPPGHHAGVKDSMGFCLHNNAAIAVSASR






VVGAKKVLIVDWDVHHGNGTQEIFEADQSVLYISLHRHG






EGFYPGSGAVTEVGSSKGEGYSVNIPWKCGGVGDNDYIF






AFQHAVLPIAEQFEPDLTIISAGFDAAKGDPLGRCEVTP






DGFAHMAQMLSCLSKGKMLVILEGGYNLRSISASATAVI






KVLLGDNPKALPIDIQPSKGGLQTLLEVFEIQSKYWSSL





KGHDQKLRSQWEAQYGSKKRKVIRKRHMHIVGGPVWWKW




GRKRVVYYHWFARVSSRKHL





232
The amino acid sequence of SEQ ID
MASGAGAAGVVEWHQKPPNPKNPVVFFDVTIGTIPAGRI



503. The conserved cyclophilin-

KMELFADIVPRTAENFRQFCTGEYRKAGIPIGYKGCHFH




type peptidyl-prolyl cis-trans


RVIKDFMIQAGDFVKGDGSGCISIYGSKFEDENFIAKHT





isomerase family domain is

GPGLLSMANSGPNTNGCQFFLTCAKCDWLDNKHVVFGRV




underlined and the cyclophilin-

LGEGLLVLRKIENVQTGQHNRPKLPCVIAECGEM




type peptidyl-prolyl cis-trans



isomerase signature is in bold.





233
The amino acid sequence of SEQ ID
MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSD



505. The conserved G-protein beta
SENDFDSNNKSPDTTALQAKRGKDIQGIPWNRLNFTREK



WD-40 repeat domain is underlined.
YRETRLQQYKNYENLPRPRRSRNLDKECTNFERGSSFYD




FRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMH




WSSLKQKGEEVLNVAGPIIPSVKHPGSSPQGLTRVQVSA




MSVKDNLVVAGGFQGELICKYLDKPGVSFCTKISHDENG




ITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTVLER




FSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTV





GTLRGHLDYSFAAAWHPDGYILATGNQDTTCRLWDVRKL





SSSLAVLKGRMGAIRSIRFSSDGRFMAMAEPADFVHLYD




TRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYG




SLLEFNRRRMNYYLDSIL





234
The amino acid sequence of SEQ ID
MDCSGDEEEEQFFESLEEMLSPSDSGSEAADNETGCRNA



506. The conserved G-protein beta
DARSKYEIWKRAPSSIQERRQRFLVRMGLANPSELGNQV



WD-40 repeat domains are
NSTSAESTCSTETANIPNGIERLRENSGAVLRTAGSSGR



underlined.
KTHCKNVINIGLREGSVRSSSSSNGTPDVGEDNGEFGGT




IFSRSGGTWECMCKIKNLDSGKEFVVDELGQDGLWNKLR




EVGTDRQLTMDEFERSLGLSPLVQELMRRESGVAQADCN




GVHHHDAEISSSKRRSWLKALKSAAYSMRRPKEDQSNYD




SERSGRRSGSFDVPWGKPQWTKVRHYRKRYKEFTALYMG




QEIEAHEGSIWTMKFSLDGRYLASAGQDCVIHVREVIES




MRTFGADTPDLYASSAYFSMNGLQELVPLSIEDHANKMK




RGKIIGSKKSSNSDCIVLPNKVFQLSEEPVCSFHGHLLD





VFDLSWSPSQYLLSSSMDKTVRLWKLGHESCLKVFSHND






IVTCIQFNPVDERYFISGSLDGKARIWSIPDRQVVDWSD





LREMVTAVCYTPDGQGGLVGSIKGSCRFYNTSGNKLQLE




NQLNVRSKKKKSSGKKITGFQFAPGGDSQKVLITSADSR




VRVYNGSELVCKYKGFRNTCSQISASFAPNGQHFVCASE




DSRVYIWNHESPRGSGARHEKSSWSHEHFLSQGVSVAIP




WSGMKLQPPVWNSPEFMLGQRHNLLSLQGGKDVGCQNGL




LSREAGEGQESETPLHYISQVSHSCGSQNMVDRDGQDDL




SRYSACISDSRLSSFMAFPESPGNPDDLNSKVFFSDSSS




KGSATWPEEKLPPTRKQSRSNSTSSHYDTLKTHLGNTIQ




GQSGASAAVAWGLVIVTAGHGGEIRSFQNYGLPVRL





235
The amino acid sequence of SEQ ID
MPSIPAIGEFTVCEINRELLTTKDESDTQAKDAYAKILG



507. The conserved G-protein beta
LVFPPISFQIEEGFGSASRQQFDQDLDREDTIVTPSTSE



WD-40 repeat domain is underlined.
GTNALQEGGLLLKGVSVLKNILASSFGPIFSPNDTKVLK




KVELLQGISWHRHKHILAFISGSNQVTVHDFQDPEWRES





SLLVSESQRGIEALEWRPNGGTTLSVACRGGICIWSASY





PGSVAPVRSGVASFLGTSTRGSSVRWTLVDFLQIPGGKA




VTALSWSPTGRLLASASREDSSFTIWDVAQGVGTPLRRG




LGGISLLKWSPTGDYLFSAKPNGTFYLWETNTWTLEQWS




SSGGCVISATWGPDGRMLFMAFSESTTLGSLHFAGRPPS




LDAHLLPMELPEIGSITGGFGNIEKMAWDGCGERLAVSY




TGGDLMYVGLIAIYDTRRTPFISASLVGFIRGPGEQVKP




LAFAFHDKFKQGPLLSVCWSSGLCCTYPLIFRAH





236
The amino acid sequence of SEQ ID
MEEENAKHTEETRQVQVRFTTKLQPALRVPTTSIAIPAH



508. The conserved G-protein beta
LTRYGLSDIVNTLLGNDKPQPFDFLVESELVRTSLEKLL



WD-40 repeat domains are
LIKGISAEKILNIEYILAVVPPKQEEPSLHDDWVSVVDG



underlined.
SYPNFIFSGSFDSIGRIWKGEGLCTHVLEGHRDAITSAA




FIMPSDSSDSFINLATASKDRTLRLWQFKPNEHMTNGKM




VRPYKLLKGHTSSVQTVSACPRRNLICSGSWDCSIKIWQ




TAGEMDIESNAGSVKKRKLEDSTEQIISQIEASRTLEGH




SQCVSSVVWLEKDTIYSASWDHSVRSWDVETGVNSLTVG




CRKALHCLSIGGEGSALIAAGGADSVLRIWDPRMPGTFT




PILQLSSHKSWITACKWHPKSRHHLISASHDGTLKLWDV




RSKVPLTTLEAHKDKVLCADWWKEDCVISGGADSTLQIF




SNLNLT





237
The amino acid sequence of SEQ ID
MNRLRSKRNHILELRLGQSEPEKEATLASNRSRGTNAPI



509. The conserved RING-type zinc
VVEDDDDVVVSSPRSFALARSSVSQRSSRIPIVNEEDLE



finger is underlined.
LRLGLAVTGRTSAEHNPRRRHGRVPPNKPIVLCDDAGEA




DQSSSKKRRTGQQLSSDVQSDESKEVKLTCAICISTMEE




ETSTICGHIFCKKCITNAIHRWKRCPTCRKKLAINNIIHR




IYISSSTG





238
The amino acid sequence of SEQ ID
MEEPPPPAVLPSSEDTSIVSSHSFVNAPPTVPVGLDASI



510. The conserved G-protein beta
PQISTPGINQPGLTIPVPPEAAPLTASLVAASAGMPPAV



WD-40 repeat domains are
VPSFVRPAIVAHPSVMPPPSMPLAALPMPVASAVPVAAP



underlined and the splicing factor
HFPPSTPNDNSITPSMPVPTPIVASSSVPPSVTIPGIAP



motif is in bold.
LPFIAPIPVPSSRPVAPSPFMPPARPLGASVSVAMDVDN




TDEQDQDADNKGESPSSSPDHPEDPSAAEYEITEESRKV




RERQEQAIQELLLRRRAYALAVPTNDSSVRARLRRLNEP





ITLFGEREMERRDRLRALMAKLDAEGQLEKLMKVQEEEE





AAANVDAEEVQEMEGPQVYPFYTEGSQELLKARTEITKF




SLPRAVSRLQRARRKREDPDEDEDEELKCVLQQSAQINM




DCSEIGDDRPLSGCAFSSDGTLLATSAWSGVTKLWSVPN





INKVATLKGHTERVTDVAFSPTNCHLATACADRTAMLWN





SEGVLMKTYEGHLDRLARLAFHPSGLYLGTASFDKTWRL





WDVNTGIELLLQEGHSRSVYGIAFQCDGSLAATCGLDGL






ARIWDLRTGRSILALEGHVKPVLGIDFSPNGYHLATGSE






DHTCRIWDLRKRQSVYIIPAHSHLVSQVKFEPQEGYFLV






TASYDSTAKVWSARDFKSIKVLAGHEAKVTSVDITADGQ






YIATVSHDRTIKLWSSKNSTNDMNIG






239
The amino acid sequence of SEQ ID
MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKV



511. The conserved G-protein beta

NMWAIGKPNAILSLSGHSSAVESVTFDSAEALVVAGAAS




WD-40 repeat domains are

GTIKLWDLEEAKIVRTLTGHRSNCISVDFHPFGEFFASG




underlined and the Trp-Asp (WD)

SLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWV




repeats signature is in bold.

VSGGEDNIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQE






FLLATGSADRTVKFWD
LETFELIGSAGPETTGVRAMIFN






PDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLADLN





IHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNG




HNEAKLASSGHPSVQQLDNNLKTNMARLSLSHSTESGIK




EPKTTTSLTTTEGLSSTPQRAGIAFSSKNLPASSGPPSY




VSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRP




ETTSDAKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESD




KIDSINQKRMTGNDKTDLNIARAEQHVSSRLDNTNTSSV




VCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRSPTFPWS




ATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETR




EKALTADTPVLVSGRPPTSPGVDMNSFIPRGSHGTSESD




LTVSDDNSAIEELMQQHNAFTSILQARLTKLQVIRRFWQ




RNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC




TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRA




TISATPTIGVDLQAEQRLERCNLCYVELENIKQILVPLI




RRGGAVAKSAQELSLALQEV





240
The amino acid sequence of SEQ ID
MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRAL



512. The conserved cyclin N- and
SNINSNIIGAPPYPCAVNKRVLSEKNVNSENDLLNAAHR



C-terminal family domains are
PITRQFAAQMAYKQQLRPEENKRTTQSVSNPSKSEDCAI



underlined.
LDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDV




AEEPVTDIDSGDKENQLAVVEYIDDLYMFYQKAEASSCV





PPNYMDRQQDINERMRGILIDWLIEVHYKFELMDETLYL






TVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEVSVP






VVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPY






VFMRRFLKAAQSDKKLELLSFFIIELSLVEYDMLKFPPS






LLAASAIYTALSTITRTKQWSTTCEWHTSYSEEQLLECA






RLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFL





LDFRL





241
The amino acid sequence of SEQ ID
MQAPREGKSAAAIVGMGKYMKKSKAIPRDVSLLEASPRS



513. The conserved cyclin-
PSATGVRTRAKTLASRRLRRASQRRPPPPAAAAAAAAPS



dependent kinase inhibitor domain
LDASPCPFSYLQLRSRRLRRPRLAPSPEARIDEGPAGSG



is underlined.
SRGSRDASCSARTASSSGGVEGEGACVGRGDRGNGGECV




RDAAVDASYGENDLEIEDRDRSTRESTPCSLIRDSNANT




PPGSTTRQQSSCTAHRTQMSILRSIPTSDEMEEFFAYAE





QRQQRSFIEKYNFDIVKDRPLPGRFEWVQVIP






242
The amino acid sequence of SEQ ID
MDGHSSHLAAQNRSRGSQTPSPSHSAASASATSSIHLKR



514. The conserved GCN5-related N-
KLSAANASAASAAAAAAAAAAAADDHAPPFPPSSISADT



acetyltransferase family domain is
RDGALTSNDDLESISARGGGAGDDSDDDSDDEEEDDGDN



underlined and the bromodomain is
DGGSSLRTFTAARLENVGPAAARNRKIKAESNATVKVEK



in bold.
EDSAKDGGNGAGVGALGPAATSGAGSGSGTVPKEDAVKI




FTENLQASGAYSAREENLKREEEAGRLKFECLSNDGVDD




HMVWLIGLKNIFARQLPNMPKEYIVRLVMDRNHKSVMVI





RRNLVVGGITYRPYASQKFGEIAFCAIKADEQVKGYGTR






LMNHLKQHARDVDGLTHFLTYADNNAVGYFIKQGFTKEI





YLDKDRWHGYIKDYDGGILMECKIDPKLPYTDLSTMVRR




QRQAIDEKIRELSNCHIVYQGIDFQKRDAGVPQNTIKME




DIPGLREAGWTPDQWGYSRFRGLSDQKRLTFFIRQLLKV





LNDHSDAWPFKEPVDAREVPDYYDIIKDPMDLKTMTKRV






ESEQYYVTLEMFIADVKRMFANARTYNSPDTIYFKIATR





LEAHFQSKVQSNLQSGAGKIQQ





243
The amino acid sequence of SEQ ID
MFNGMMDPELFKLAQEQMNRMSPAELAKIQQQMMSNPEL



515. The conserved TPR repeat
MRMASESMKNMRPEDLRQAAEQLKHVRPEEMAEIGEKMA



domain is underlined.
NASPEEIAAVRARADAQMTYEINAAKILKKEGNELHSQG




RFKDASQKYLRAKNNLKGIPSSEGKNLLLACSLNLMSCY




LKTRQYEECIKEGSEALACEEKNLKAFYRRGQAYRELGQ





LKDAVSDLRKAHEISPDDETIAQVLRDTEESLTKEGGSA





PRGVVIEEITEEDETLASVNHESPSEYSEKRHQESEDAH




KGPINGDIMGQMTNSESLKALKGDPDAIRSFQNFISNAD




PTTLAAMGAGNAGEVSPDLIKTASSMIGKMSAEELQKMI




QLASSFPGENPYVTRNSDSNSNSFGNGSIPNVSPDMLKT




ASDMMSKMSPDDLQRMFEMASSSRGKDPSLDANHASSSS




GANLAANLNHILGESEPSSSYHIPSSSRNISSSPLSNFP




SSPGDMQEQIRNQMKDPAMRQMFTSMMKNMSPEMMANMG




KQFGLELSPEDAAKAQEAMSSLSPEMLDKMMRWADRAQR




GVETAKKTKNWLLGRPGMILAICMLLLAVILHRLGFIGS





244
The amino acid sequence of SEQ ID
MIAAISWVPRGASKAVPEVAEPPSKEEIEEILKSGVVER



516. The conserved G-protein beta
SGDSDGEEDDENMDAVASEKADEVSTALSAADALGRISK



WD-40 repeat domains are
VTKAGSGFEDIADGLRELDMDNYDEEDEDVKLFSTGLGD



underlined.
LYYPSNDMDPYLKDKDDDDDTEEIEDLSIKPMDSLIVCA




RTDDEVNLLEVYLLEPSLSDESNMYVHHEVVISEFPLCT




AWLDCPIKGGDKGNFIAVGSMEPAIEIWDLDIIDAVEPC




LVLGGQEELKKKKKKGKKASIKYKEGSHTDSVLGLAWNK




EFRNILASASADRQVKIWDVAAGKCNITMEHHTDKVQAV




AWNHHAPQVLLSGSFDHSVVMKDGRIPSHSGYRWSVTAD




VESLAWDPHSEHFFVVSLEDGTVRGFDVRAAISNSASQS




LPSFTLHAHEKAVSTISYNPAAPNLLATGSTDKMVKLWD





LSNNQPSCIASRNPKAGAVFSVSFSEDSPLLLAIGGSKG





RLEVWDTSSDAAVSRRFGKHGKPKTAEPGS





245
The amino acid sequence of SEQ ID

MKFCKKYQEYMQGQEGKKLPGLGFKKLKKILKRCRRRDS




517. The conserved Zn-finger, RING

LHSQKALQAVQNPRTCPAHCSVCDGSFFPSLLEEMSAVL




domain is underlined, and the SPX,

GCFNKQAQKLLELHLASGFQKYLMWFKGKLRGNHVALIQ




N-terminal domain is in bold.

EGKDLVTYALINAIAIRKILKKYDKIHLSTQGQAFKSQV






QRMHMEILQSPWLCELIAFHINVRETKANSGKGHALFEG





CSLVVDDGKPSLSCELFDSIKLDIDLTCSICLDTVFDSV





SLTCGHIYCYMCACSAASVTIVDGLKAAEPKEKCPLCRE





ARVFEGAVHLDELNILLSRSCPEYWAERLQTERVERVRQ




AKEHWESQCRAFMGVE





246
The amino acid sequence of SEQ ID
MVSTQSTRENPSIFFPPPLKPWLLPVVLSLSLSRQLGMA



518. The conserved G-protein beta
AAAAASLPFKKNYRSSQALQQFYAGGPFAVSSDGSFIAC



WD-40 repeat domains are
NCGDSIKIVDSSNASLRPSIDCGSDTITALSLSPDGKLL



underlined.
FSAGHSRQIRVWDLSTSTCLRSWKGHDGPVMSMACPVSG




GLLATGGADRKVMVWDVDGGFCTHFFKGHDGVVSTVLFH




PDSNRSLLFSGSDDGTIRVWDLLAKKCASTLRGHDSTVT




SLAFSEDGLTLLAAGRDKVVSLWDLHNYACKKTIPMYEV




LESVCVIHSGTVLASQLGLDDQLKVTKESAQNIHFITVG




ERGILRIWKSEGSVCLFKQEHSDVTVISDEDDSRSGFTA




AVMLPLDQGLLCVTADQQFLFYYPEKHPEGIFSLTLCRR




LVGYNEEIVDMKFLGEEENFLAVATNLEQVRVYELASMS




CSYVLAGHTETVLCLDTCISSSGRTLIVTGSKDNSVRLW





DSESRHCIGVGVGHMGAVGAVAFSRKRQDFFVSGSSDRT






LKVWSLDGISEDGVDSTNLKAKAVVAAHDKDINSVAVAP





NDSLVCSGSQDRTACVWRLPDLVSVVVLKGHKRGIWSVE




FSPVDQCVLTASGDKTVKIWAISDGSCLKTFEGHVSSVL




RASFLTRGTQFVSCGADGLVKLWTVRTNECIATYDQHSD




KVWALAVGKKTEMLATGGSDAVVNLWYDSTASDKEDAFR




KEEEGVLKGQELENAVSDADYTKAIELALELRRPHKLFE




LFSELCRTREVGDRVERILSALSGEEVCLLLEYIREWNA




KPKLCHVAQSVLSQVFRILSPTEIVEIKGIGELLEGLIP




YSQRHFSRIDRLVRSTYLLDYTLTGMSVIEPEADRSAVN




DGSPDKSGLEKLEDGLLGENVGEEKIQNKEELESSAYKK




RKLPRSKDRSKKKSKNVVYADAAAISFRA





247
The amino acid sequence of SEQ ID 519. The conserved G-protein beta WD-40 repeat domains are underlined.
MDSAPRRKSGGINLPSGMSETSLRLDGFSGSSSSFRAIS




NLTSPSKSSSISDRFIPCRSSSRLHTFGLVERGSPVKEG




GNEAYSRLLKAELFGSDFGSLSPAGQGSPMSPSKNMLRF




KTESSGPNSPFSPSILRQDSGFSSEASTPPKPPRKVPKT




PHKVLDAPSLQDDFYLNLVDWSSQNTLAVGLGTCVYLWS




ASNSKVTKLCDLGPNDGVCAVQWTREGSYISIGTSLGQV




QIWDGTQCKRVRTMGGHQTRTGVLAWNSRILASGSRDRV




ILQHDLRVPNEFIGKLVGHKSEVCGLKWSHDDRELASGG





NDNQLLVWNQHSQQPVLKLTEHTAAVKAIAWSPHQNGLL






ASGGGTADRCIRFWNTTNGHQTSSVDTGSQVCNLAWSKN





VNELVSTHGYSQNQIMVWKYPSMAKVATLTGHSLRVLYL





AMSPDGQTIVTGAGDETLRFWNVFPSAKAPAPVKDTGLW





SLGRTHIR





248
The amino acid sequence of SEQ ID 520. The conserved G-protein beta WD-40 repeat domains are underlined.
MEDEAEIYDGVRAQFPLTFGKQSKPQTSLESVHSATRRG




GPAPAPAPASSSSLPSTTSPSAAGGAGKSSGLPSLSSSS




TAWLEGLRAGNPRAGREAGIGSRGGDGEDGGRAMIGPPR




PPPGFSANDDGGGEDDDDDGDGVMVGPPPPPPGNLGDGD




DDEEEEEAMIGPPRPPVVDSDEEEEEEEEENRYRLPLSN




EIVLKGHNKIVSALAVDPTGSRVLSGSYDYTVRMFDFQG




MNSRLSSFRDFEPVEGHQVRNLSWSPTADRFLCVTGSAQ




AKIYDRDGLTLGEFVKGDMYIRDLKNTKGHITGLTWGEW




HPKTKETILTSSEDGSLRIWDVNDFKSQKQVIKPKLARP




GRVPVTTCTWDREGKCIAGGIGDGSIQIWNLKPGWGSRP




DIHVEQAHADDITGLKFSSDGKILLTRSFDDSLKVWDLR




LMKNPLKVFEDLPNHYAQTNIACSPDEQLFLTGTSVERE




STIGGLLCFFDRSKLELVSRIGISPTCSVVQCAWHPRLN




QIFATSGDKSQGGTHVLYDPTLSERGALVCVARAPRKKS




VDDFELKPVIHNPHALPLFRDQPSRKRQREKILKDPLKS




HKPELPMNGPGHGGRVGASKGSLLTQYLLKQGGMIKETW




MDEDPREAILKHADAAEKNPKFTRAYAETQPDPVFAKSD




SEDEDK
















TABLE 12








Eucalyptus in silico data.






















SEQ
ConsID















ID
eucSpp
Family
1
2
3
4
5
6
7
8
9
10
11
12
























1
3910
Cyclin-

0.25



0.11

0.20



0.73




dependent




protein




kinase


2
19213
Cyclin-







0.59



0.64




dependent




protein




kinase


3
36800
Cyclin-






0.11
0.36




dependent




protein




kinase


4
40260
Cyclin-



0.85




dependent




protein




kinase


5
41965
Cyclin-




0.35






0.86




dependent




protein




kinase


6
2906
Cyclin-

0.93









0.81




dependent




protein




kinase


7
1518
Cyclin-

0.08
0.28
0.08

0.06
0.11




dependent




protein




kinase


8
8078
Cyclin-




0.17






3.20




dependent




protein




kinase


9
9826
Cyclin-


0.36
0.23


0.15
0.04

0.24

0.43




dependent




protein




kinase


10
10364
Cyclin-





0.11
1.52




0.13




dependent




protein




kinase


11
11523
Cyclin-



0.15

0.06
0.15




2.40




dependent




protein




kinase


12
24358
Cyclin-

0.76




0.07
0.04

0.24




dependent




protein




kinase


13
39125
Cyclin-



0.23




dependent




protein




kinase


14
5362
Cyclin-

0.68



0.06

0.08



1.17




dependent




protein




kinase


15
44857
Cyclin-

0.68



0.06

0.08



1.17




dependent




protein




kinase


16
1743
Cyclin A


0.19

2.10
0.06
0.15


17
12405
Cyclin A





0.06
0.59


2.84


18
3739
Cyclin B

0.42
1.99
0.08







2.33


19
22338
Cyclin B











0.86


20
28605
Cyclin B





0.39

0.04

0.47


21
41006
Cyclin B









0.71


22
6643
Cyclin D

0.85


0.83
0.06
1.06
0.08



0.26


23
45338
Cyclin D











2.03


24
46486
Cyclin D






0.30


25
12070
Cyclin-
0.24

0.82


0.06
0.26




0.92




dependent




kinase




regulatory




subunit


26
6617
Histone

0.08



0.06
0.04
0.55


0.51
0.26




acetyltransferase


27
7827
Histone


2.27


0.11

0.04




acetyltransferase


28
8036
Histone




1.16




acetyltransferase


30
1596
Histone

0.17
0.16
0.08
2.98
0.88
0.26
0.98



0.71




deacetylase


31
5870
Histone


0.19

0.17


0.12



5.43




deacetylase


32
6901
Histone
1.21
0.08



2.01
1.16
0.08




deacetylase


33
6902
Histone

0.08



0.11
1.21




0.47




deacetylase


34
7440
Histone
0.48

1.23
0.15

0.22
0.48
0.20



2.02




deacetylase


35
8994
Histone


0.09



0.15




deacetylase


36
24580
Histone

0.42




1.22




deacetylase


37
37831
Histone

0.08



0.22

0.40

1.19

0.12




deacetylase


38
34958
MAT1 CDK-






0.15
0.23




activating




kinase




assembly




factor


39
22967
Peptidyl-


0.72








0.69




prolyl cis-




trans




isomerase


40
8599
Peptidyl-


0.46
0.08
0.50
0.17
0.51
0.28



3.01




prolyl cis-




trans




isomerase


41
9919
Peptidyl-


0.51

0.35
0.06
0.15
0.43



4.24




prolyl cis-




trans




isomerase


42
15820
Peptidyl-






0.04




6.78




prolyl cis-




trans




isomerase


43
8327
Peptidyl-





0.06

0.04



6.86




prolyl cis-




trans




isomerase


44
4604
Peptidyl-











0.68




prolyl cis-




trans




isomerase


45
966
Peptidyl-

0.59
1.02
0.54
0.69
0.50
0.93
0.59

0.95

18.65




prolyl cis-




trans




isomerase


46
1037
Peptidyl-

0.59




prolyl cis-




trans




isomerase


47
4603
Peptidyl-

0.17



0.17
1.24
0.04



0.34




prolyl cis-




trans




isomerase


48
5465
Peptidyl-


1.21
0.08
0.66
0.11
0.29
0.16



6.99




prolyl cis-




trans




isomerase


49
6571
Peptidyl-

0.51

0.08


0.41
0.08



1.14




prolyl cis-




trans




isomerase


50
6786
Peptidyl-

0.42



0.33
0.06
0.41
0.04




prolyl cis-




trans




isomerase


51
7057
Peptidyl-

0.42



0.11
0.04




prolyl cis-




trans




isomerase


52
8670
Peptidyl-


1.56


0.39

0.20



0.12




prolyl cis-




trans




isomerase


53
9137
Peptidyl-






0.04
0.59




prolyl cis-




trans




isomerase


54
10285
Peptidyl-




0.60
1.16
0.04
0.04



0.45




prolyl cis-




trans




isomerase


55
10600
Peptidyl-


0.16

0.17
0.06





0.46




prolyl cis-




trans




isomerase


56
11551
Peptidyl-



0.08

0.06
0.04
0.08



1.89




prolyl cis-




trans




isomerase


57
20743
Peptidyl-







0.76




prolyl cis-




trans




isomerase


58
23739
Peptidyl-

0.59




prolyl cis-




trans




isomerase


60
31985
Peptidyl-




1.99




prolyl cis-




trans




isomerase


61
32025
Peptidyl-




0.99




prolyl cis-




trans




isomerase


62
32173
Peptidyl-




1.99




prolyl cis-




trans




isomerase


64
9143
Retinoblastoma


0.90



0.15




related




protein


65
349
WD40 repeat
0.24

0.34
0.08
0.17
0.22
0.33
0.08


0.25
2.24




protein


66
575
WD40 repeat

0.25
0.94
0.31
0.34
0.11

0.16

0.47

1.87




protein


67
804
WD40 repeat



0.15
0.34
0.39
0.33
0.39



1.82




protein


68
805
WD40 repeat
0.97
0.51
4.66
0.23
0.17
0.77
0.33
1.07

0.24

4.43




protein


69
806
WD40 repeat




0.83


0.04




protein


70
2248
WD40 repeat

0.08

0.08
1.92
0.06

0.08



0.91




protein


71
3203
WD40 repeat

0.34
0.18
0.15
0.17
0.11
0.30
0.04



0.72




protein


72
3209
WD40 repeat

0.08

0.15
0.17


0.12



0.61




protein


73
4429
WD40 repeat

0.08
1.16
0.08







0.13




protein


74
4607
WD40 repeat

0.76

0.54

0.06
0.07




protein


75
4682
WD40 repeat

0.08
0.28
0.23


1.13
0.08



0.12




protein


76
5786
WD40 repeat

0.08



0.06
0.46
0.08



0.13




protein


77
5887
WD40 repeat

1.61
1.23
0.08

0.06
0.15
0.28



1.41




protein


78
5981
WD40 repeat

0.08









0.37




protein


79
6766
WD40 repeat
0.24
0.08
1.31

0.51
0.06
0.74
0.51



0.28




protein


80
6769
WD40 repeat

0.93


0.17


0.12



2.28




protein


81
6907
WD40 repeat

0.25


0.17
0.06
0.45
0.32

0.47

1.67




protein


82
7518
WD40 repeat


0.91


0.28
0.15
0.55



0.59




protein


83
7717
WD40 repeat


0.47








0.38




protein


84
7718
WD40 repeat
0.24

1.88
0.08

0.22

0.04



0.92




protein


85
7741
WD40 repeat


1.42


0.11



0.47




protein


86
7884
WD40 repeat


1.33
0.15





0.24




protein


87
8258
WD40 repeat
0.72

0.19
0.23
0.87

0.15
0.08



0.08




protein


88
8465
WD40 repeat


0.47
0.08
1.75




protein


89
8616
WD40 repeat


0.57
0.08
0.69


0.16



0.13




protein


90
8690
WD40 repeat


0.26
0.08
0.35
1.39
0.34
0.32

2.13

0.80




protein


91
8708
WD40 repeat


0.57




0.04




protein


92
8850
WD40 repeat


0.09


0.06

0.27



2.03




protein


93
9072
WD40 repeat


1.21

0.17






0.48




protein


94
9465
WD40 repeat
0.24

0.72

0.33

0.15




protein


95
9472
WD40 repeat


0.36

1.99
0.11
0.61




6.90




protein


96
9550
WD40 repeat


0.90


0.11
1.78




protein


97
10284
WD40 repeat
0.24
0.08


1.82

1.22
0.16

0.47

0.28




protein


98
10595
WD40 repeat


0.16

0.17
0.11
6.52




0.85




protein


99
10657
WD40 repeat





0.06

0.12




protein


100
12636
WD40 repeat





0.06





0.65




protein


101
12748
WD40 repeat


1.50
0.08

0.06
1.67
0.04



0.38




protein


102
12879
WD40 repeat



0.08
0.33
0.06
0.04
0.08



2.00




protein


103
15515
WD40 repeat




0.35

0.30




protein


104
15724
WD40 repeat

0.25
0.33
0.15


0.47
0.04



0.39




protein


105
16167
WD40 repeat
0.24



0.52




protein


106
16633
WD40 repeat


1.96




0.12



0.42




protein


107
17485
WD40 repeat


0.65




protein


108
18007
WD40 repeat







0.12




protein


109
20775
WD40 repeat




0.17


0.08




protein


110
23132
WD40 repeat











2.42




protein


111
23569
WD40 repeat






0.91




0.91




protein


112
23611
WD40 repeat






4.15




protein


113
24934
WD40 repeat

0.34





0.04




protein


114
25546
WD40 repeat


0.09




protein


115
30134
WD40 repeat






0.07




protein


116
31787
WD40 repeat


0.19








1.19




protein


117
34435
WD40 repeat




0.35


0.08




protein


118
34452
WD40 repeat


1.44




0.20



0.25




protein


119
35789
WD40 repeat







0.20




protein


120
35804
WD40 repeat






0.19
0.27



0.08




protein


121
43057
WD40 repeat






0.30




0.57




protein


122
46741
WD40 repeat






0.46




protein


123
47161
WD40 repeat






1.78




protein


235
6366
WD40 repeat

0.08
0.68
0.23
0.93
0.11
0.36
0.83

0.24

0.94




protein


236
17378
WD40 repeat


0.65




0.12



0.08




protein


252
45414
Cyclin B











3.13


253
44328
Cyclin-











0.38




dependent




kinase




inhibitor


254
15615
Histone






0.22
0.04




acetyltransferase


255
17239
Peptidyl-



0.08
0.50


0.08




prolyl cis-




trans




isomerase


256
18643
WD40 repeat







0.04



0.90




protein


257
19127
WD40 repeat







0.04



0.89




protein


258
22624
WD40 repeat











1.16




protein


259
32424
WD40 repeat




0.50




protein


260
37472
WD40 repeat







0.08



0.17




protein





In Table 12, the following numbers 1-12 represent the following tissues:


1 is bud reproductive;


2 is bud vegetative;


3 is cambium;


4 is fruit;


5 is leaf;


6 is phloem;


7 is reproductive;


8 is root;


9 is sap vegetative;


10 is stem;


11 is whole; and


12 is xylem.
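
The column legend for Table 12 can be applied programmatically. The following is a minimal sketch only, not part of the disclosed methods: the names `TABLE_12_TISSUES` and `label_row` are hypothetical, and the example row (SEQ ID 68, ConsID 805) is entered on the assumption that its two empty cells fall in columns 9 and 11.

```python
from typing import Dict, List, Optional

# Column legend for Table 12 (Eucalyptus), copied from the list above.
TABLE_12_TISSUES: Dict[int, str] = {
    1: "bud reproductive",
    2: "bud vegetative",
    3: "cambium",
    4: "fruit",
    5: "leaf",
    6: "phloem",
    7: "reproductive",
    8: "root",
    9: "sap vegetative",
    10: "stem",
    11: "whole",
    12: "xylem",
}


def label_row(cells: List[Optional[float]]) -> Dict[str, float]:
    """Pair each non-empty cell with its tissue name (columns are 1-based)."""
    if len(cells) != len(TABLE_12_TISSUES):
        raise ValueError("expected one cell per tissue column")
    return {
        TABLE_12_TISSUES[col]: value
        for col, value in enumerate(cells, start=1)
        if value is not None
    }


# SEQ ID 68 (ConsID 805, WD40 repeat protein); None marks an empty cell.
# Assumption: columns 9 and 11 are the empty ones in this row.
row_68 = [0.97, 0.51, 4.66, 0.23, 0.17, 0.77, 0.33, 1.07, None, 0.24, None, 4.43]
labeled = label_row(row_68)
print(labeled["cambium"], labeled["xylem"])
```

Keeping empty cells as `None` rather than zero preserves the distinction between a tissue with no reported value and a tissue with a reported value of zero.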













TABLE 13







Pine in silico data.






















ConsID















SEQ

pinus



ID

Radiata

Family
1
2
3
4
5
6
7
8
9
10
11
12
























124
1766
Cyclin-



1.02
0.05
1.58
0.15
0.22
0.22
0.18
2.16
4.91




dependent




protein




kinase


125
2927
Cyclin-
0.16




0.19
0.11
0.14
0.04
0.36
0.38
0.17




dependent




protein




kinase


126
7642
Cyclin-


0.22
0.21
0.05



0.07




dependent




protein




kinase


127
13714
Cyclin-





0.11
0.11




dependent




protein




kinase


128
16332
Cyclin-





0.54
0.26

0.14

0.04
0.91




dependent




protein




kinase


129
21677
Cyclin-




0.05
0.14





0.17




dependent




protein




kinase


130
27562
Cyclin-











0.41




dependent




protein




kinase


131
1504
Cyclin-
0.16




0.36

0.35
0.21
0.54
0.09
0.65




dependent




protein




kinase


132
15211
Cyclin-





0.13
0.15



0.19
0.19




dependent




protein




kinase


133
20421
Cyclin-








0.04

0.05
0.95




dependent




protein




kinase


134
3187
Cyclin-





0.34
0.15

0.04
0.18
0.38




dependent




protein




kinase


135
15661
Cyclin-








0.04

0.13




dependent




protein




kinase


136
13874
Cyclin A
0.31




0.27
0.15



0.05


137
14615
Cyclin A
0.16




0.15


138
4578
Cyclin B
0.47
0.14



0.13
0.22



0.74
0.38


139
23387
Cyclin B





0.29
0.26




0.17


140
6970
Cyclin D

0.14





0.27
0.04


141
10322
Cyclin D
0.16


0.19

0.06


0.14

1.12
1.36


142
22721
Cyclin D





0.27

0.36


143
23407
Cyclin D





0.15
0.26




0.31


144
1945
Cyclin-

0.28
0.55
0.41
0.16
1.62
5.02
0.22
0.72

0.39
3.06




dependent




kinase




regulatory




subunit


145
8233
Cyclin-



0.21




dependent




kinase




regulatory




subunit


146
8234
Cyclin-
0.16

0.11




dependent




kinase




regulatory




subunit


147
22054
Cyclin-




0.05

0.22



0.18




dependent




kinase




regulatory




subunit


148
12137
Histone





0.06




1.51
0.19




acetyltransferase


149
12582
Histone





0.64
0.15
1.09


0.33
0.63




acetyltransferase


150
15285
Histone



0.21




0.12

0.70
0.14




acetyltransferase


151
17229
Histone










0.94
0.16




acetyltransferase


152
20724
Histone








0.04

0.19
0.19




acetyltransferase


153
4555
Histone
0.16
0.14



0.97

0.14


0.89
0.89




deacetylase


154
4556
Histone











0.14




deacetylase


155
5729
Histone
0.31
0.28
0.22
0.58
0.22
2.00
0.48
0.07
0.04

2.73
1.46




deacetylase


156
7395
Histone

0.14

0.14


0.19
0.93
0.04

0.14
1.33




deacetylase


157
9503
Histone


0.11




0.14




deacetylase


158
11283
Histone



0.19


0.15



0.96
1.35




deacetylase


159
12322
Histone
0.16




0.06
0.11

0.04

0.05
0.29




deacetylase


161
23236
Histone





0.13


0.11




deacetylase


162
171
Peptidyl-








0.07


0.46




prolyl




cis-trans




isomerase


163
172
Peptidyl-





0.19


0.11
0.18
0.11
0.46




prolyl




cis-trans




isomerase


164
1480
Peptidyl-
2.51
4.20
0.88
2.97
1.58
3.53
7.36
1.33
2.74
0.72
6.62
10.14




prolyl




cis-trans




isomerase


168
1692
Peptidyl-
0.16

0.22

0.65
0.61
0.26

0.29
0.18
1.28
0.34




prolyl




cis-trans




isomerase


169
5313
Peptidyl-

0.14





0.07


0.37
0.17




prolyl




cis-trans




isomerase


170
6362
Peptidyl-

0.14
0.33
0.05

0.06
0.60

0.04

2.92
0.68




prolyl




cis-trans




isomerase


171
6493
Peptidyl-

0.42
0.11
0.21


0.11

0.04

0.25
0.32




prolyl




cis-trans




isomerase


172
6983
Peptidyl-



0.61

0.13




0.04




prolyl




cis-trans




isomerase


174
7665
Peptidyl-


0.11
0.39
0.05
0.62




0.25




prolyl




cis-trans




isomerase


175
12196
Peptidyl-





0.19
0.15
0.14
0.16




prolyl




cis-trans




isomerase


176
13382
Peptidyl-



0.25

0.06

0.07
0.04

0.87
0.15




prolyl




cis-trans




isomerase


177
16461
Peptidyl-



0.19

0.15
0.15

0.04

0.04
0.74




prolyl




cis-trans




isomerase


178
17611
Peptidyl-



0.24
0.11
0.27
0.41



0.99




prolyl




cis-trans




isomerase


179
19776
Peptidyl-





0.13

0.07
0.16

0.05
0.61




prolyl




cis-trans




isomerase


180
20659
Peptidyl-







0.15


0.19




prolyl




cis-trans




isomerase


181
22559
Peptidyl-






0.11
0.14



0.20




prolyl




cis-trans




isomerase


182
24188
Peptidyl-








0.23




prolyl




cis-trans




isomerase


183
27973
Peptidyl-










1.01




prolyl




cis-trans




isomerase


184
1353
WD40


0.44

0.05
0.73


0.11
1.07
0.70
1.32




repeat




protein


185
1978
WD40

0.14

0.05

0.44
0.11
0.21
0.27
0.36
1.46
0.82




repeat




protein


186
2810
WD40

0.42

0.79
0.11
0.39

0.27

0.36
1.69
1.03




repeat




protein


187
2811
WD40








0.14

0.09
0.14




repeat




protein


188
2812
WD40






0.15


0.18
0.04
0.16




repeat




protein


189
3514
WD40



0.63

0.06

0.14

0.18
0.48
0.56




repeat




protein


190
4104
WD40

0.14

0.25

0.27
0.37
0.36
0.19
0.18
0.39
0.53




repeat




protein


191
5595
WD40

0.14

0.25


0.15
0.14
0.07

0.23




repeat




protein


192
5754
WD40
0.31
0.14



0.06

0.07
0.16

0.10
0.16




repeat




protein


193
6463
WD40
0.16
0.56
0.22
0.43

0.81
0.53
0.21
0.08

1.00
0.70




repeat




protein


194
6665
WD40
0.31
0.28

0.45
0.44
0.96


0.07

3.37
2.68




repeat




protein


195
6750
WD40

0.14

0.59
0.05

0.37
0.42
0.04

0.18
0.52




repeat




protein


196
7030
WD40
0.31


0.40
0.54
0.45
0.37

0.07

1.58
3.41




repeat




protein


197
7854
WD40


0.11




0.14


0.05




repeat




protein


198
7917
WD40


0.22
0.39

0.13
0.15



0.18
0.56




repeat




protein


199
7989
WD40


0.11





0.04

0.11




repeat




protein


200
8506
WD40
0.47

0.33

0.11
0.86
0.19
1.28
0.04

1.23
3.12




repeat




protein


201
8692
WD40



0.21

0.06
0.11

0.15

0.10
0.87




repeat




protein


202
8693
WD40


0.11
0.80

0.25

0.14
0.18

0.53
0.31




repeat




protein


203
9170
WD40
0.16

0.11
0.05






0.05




repeat




protein


204
9408
WD40


0.33

0.05
0.41
0.15
0.14


0.41
0.33




repeat




protein


205
9522
WD40


0.11







0.18




repeat




protein


206
9734
WD40


0.11
0.05
0.11
0.15

0.07
0.25

0.11




repeat




protein


207
9815
WD40


0.11







0.18
0.14




repeat




protein


208
10670
WD40



0.40
0.16
0.11


0.16

0.34
0.31




repeat




protein


209
11297
WD40



0.53


0.15

0.16

0.05




repeat




protein


210
13098
WD40



0.19
0.11
0.54
0.31
0.14
0.26

1.85
0.14




repeat




protein


211
13172
WD40








0.04




repeat




protein


212
13589
WD40




0.11
0.06

0.21


0.05
0.37




repeat




protein


213
13608
WD40





0.11


0.04

0.59
0.33




repeat




protein


214
14299
WD40
0.16


0.05



1.09


0.38




repeat




protein


215
14498
WD40



0.21






0.44
0.30




repeat




protein


216
14548
WD40
0.16







0.11

0.11
0.82




repeat




protein


217
14610
WD40
0.16




0.27




repeat




protein


218
16090
WD40





0.43


0.04

0.37
0.85




repeat




protein


219
16722
WD40










0.10




repeat




protein


220
16785
WD40



0.05

0.13




0.38
0.50




repeat




protein


221
17094
WD40





0.29
0.15



0.24
0.81




repeat




protein


222
17527
WD40








0.04

0.10




repeat




protein


223
17591
WD40





0.14




0.10




repeat




protein


224
17769
WD40










0.39




repeat




protein


225
18047
WD40



0.05
0.22
0.98
0.15
2.68
0.07

0.19
0.80




repeat




protein


226
18414
WD40




0.16
0.15


0.34

0.23
0.19




repeat




protein


227
18986
WD40





0.41




0.15




repeat




protein


228
19479
WD40




0.05





0.28
0.32




repeat




protein


229
20144
WD40



0.43


0.29



0.05




repeat




protein


230
22480
WD40






0.15
0.27




repeat




protein


231
23079
WD40





0.13


0.04




repeat




protein


232
26739
WD40






0.15



0.18




repeat




protein


233
26951
WD40



0.21







0.20




repeat




protein


234
26529
WEE1-like








0.04

0.18




protein


237
888
WD40






0.11



0.18




repeat




protein


238
14166
Cyclin-
0.16


0.05






0.05




dependent




kinase




inhibitor


239
3189
Cyclin-





0.06




dependent




protein




kinase


240
9356
Histone


0.11



0.22




0.46




acetyltransferase


241
65
Histone
0.16



0.22
0.27
0.22



0.24
0.34




deacetylase


242
14197
Histone
0.16



0.33





0.05




deacetylase


243
9081
Peptidyl-


0.11

0.05

0.29

0.26

0.69




prolyl




cis-trans




isomerase


244
13417
Peptidyl-





0.06




0.59




prolyl




cis-trans




isomerase


245
5755
WD40
0.16




repeat




protein


246
6670
WD40

0.14

0.05




repeat




protein


247
7027
WD40

0.14



0.15




1.30
0.15




repeat




protein


248
7276
WD40

0.14




0.11



0.05




repeat




protein


249
7390
WD40
0.31
0.14


0.11

0.44



1.29
0.38




repeat




protein


250
12648
WD40



0.05

0.06




0.05
0.94




repeat




protein


251
13171
WD40



0.19

0.63




0.19
0.34




repeat




protein





In Table 13, the following numbers 1-12 represent the following tissues:


1 is bud reproductive;


2 is bud vegetative;


3 is callus;


4 is cambium;


5 is meristem vegetative;


6 is phloem;


7 is reproductive female;


8 is reproductive male;


9 is root;


10 is vascular;


11 is whole; and


12 is xylem.
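
As an illustration of how the Table 13 legend may be used, the sketch below ranks the tissues for a single row by its tabulated in silico value. It is an illustrative example only: the names `TABLE_13_TISSUES` and `rank_tissues` are hypothetical, and the row shown is SEQ ID 164 (ConsID 1480, peptidyl-prolyl cis-trans isomerase), chosen because it has an entry in all twelve columns.

```python
from typing import List, Tuple

# Column legend for Table 13 (pine), in column order 1 through 12.
TABLE_13_TISSUES: List[str] = [
    "bud reproductive",
    "bud vegetative",
    "callus",
    "cambium",
    "meristem vegetative",
    "phloem",
    "reproductive female",
    "reproductive male",
    "root",
    "vascular",
    "whole",
    "xylem",
]


def rank_tissues(values: List[float], top: int = 3) -> List[Tuple[str, float]]:
    """Return the `top` (tissue, value) pairs with the largest values."""
    if len(values) != len(TABLE_13_TISSUES):
        raise ValueError("expected one value per tissue column")
    paired = zip(TABLE_13_TISSUES, values)
    return sorted(paired, key=lambda item: item[1], reverse=True)[:top]


# SEQ ID 164, ConsID 1480: values for columns 1-12 as listed in Table 13.
row_164 = [2.51, 4.20, 0.88, 2.97, 1.58, 3.53, 7.36, 1.33, 2.74, 0.72, 6.62, 10.14]
top3 = rank_tissues(row_164)
for tissue, value in top3:
    print(f"{tissue}: {value:.2f}")
```

For this row the largest value falls in the xylem column, followed by reproductive female and whole.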













TABLE 14







Oligo Table.










Oligo





SEQ


ID
Oligo ID
Microarray Oligo Seq





521
Euc_003910_O_4
GATTTTAAGTAACTCAATTAGCAGTTCCAACATTAAACCATTATTATTACCCCTTTTATC






522
Euc_019213_O_1
CTCAAAAAGTACTTGGATGCGTGCGGTGACAACGGACTCGAACCGTACACTGTCAAATCT





523
Euc_036800_O_4
TTGTCAAGTTGCAGGACGTAGTGCACAGTGAGAGGCGTCTATATCTAGTTTTTGAGTACT





524
Euc_040260_O_1
GAAGAAATTATATAACTAGATACAAGGTTAGCTAGGTATATAATAGCGGTACAAGTCTTT





525
Euc_041965_O_1
GGACAAATCAAGTAGAACTTCTCTCGGCAGCATCAGTTTTTCTAATCCATGCCTTGTTGC





526
Euc_002906_O_1
CTCAGTTCTGATAATGCCTCGGATATATGGCCGAGTGTTCGCTGGACGGCCTCTTATGTT





527
Euc_001518_O_3
GGAGATTCTGAACTGCAACAGCTCCTACACATTTTCAGACTGTTGGGTACTCCAAATGAA





528
Euc_008078_O_2
GACTGGTAAAATCGTTGCACTAAAAAAGGTCCGGTTTGACAACTTGGAACCTGAAAGCGT





529
Euc_009826_O_4
AAACACCAATCTATCAACACTGTCGAGTTTAGTCACTAGTAGAACCGGAGATAACAAACA





530
Euc_010364_O_1
CTATGATCCTGAGCGCAAGCAAGTTATGACCAATAGAGTCGTTACACTATGGTACCGAGC





531
Euc_011523_O_1
TGTTGTGAAGGTAGTTATAGCCATCGATTAGACAGTGATTAAAGTAGTACCCGTGCCAAT





532
Euc_024358_O_2
CCACATACAAGAGTTGTTACGCTACACATCCTATACCATCAAAGGAACGTTGGAATGCCA





533
Euc_039125_O_3
TATGATCGACACAAGCATTTTGTGTTGGAGCCTCAGCTAATTGTATGTCATCGAGTACTT





534
Euc_005362_O_3
AAAATTTTTGCTACGGATAATGTTGTGAGGCGAGGCAGTCGAAATTACGGAGGTTGACTT





535
Euc_044857_O_1
ATGCAGGGATCAAATTTGTGAGTACTACGTAAAATTTTGCTACGGAGGCGAGGCAGTCGA





536
Euc_001743_O_1
GAAGAATACAGGCTCGTACCTGATACACTGTACCTGACTGTTAACTACATAGATCGGTAT





537
Euc_012405_O_1
TCCACCCTAAATGCGATACGTGAAAAGTATAGACAACAGAAGGTAAACTATTCATTACTG





538
Euc_003739_O_2
AGGCTTCTAGTTGCGTTCCCCCAAACTACATGGATCGGCAGCAGGATATTAATGAGCGGA





539
Euc_022338_O_2
GAGAAAAATGACAGATTGATATCGATGATGATGACTGTCGTGTCATCAGTAGTGTGCTTT





540
Euc_028605_O_5
TTTCCAATTGTAGTTCGTCTTTTATTGTAACAATAAATTGATAGATACTGATTCGAAATA





541
Euc_041006_O_1
ACATTTATGCTAACTATAGGAGAACGGAGAATTGTAGCTGCGTCTCTGCTAACTACATGG





542
Euc_006643_O_1
TTCTGGCTTAAAGGCTATTCTTTGTGCACAATGACCTGAGGGAGGTCTCGACAGACCACT





543
Euc_045338_O_1
TTCATCCGGGTCCTGGTTATCATACTCTTATATATGTTGGGGAATAACGGTTCATATGTT





544
Euc_046486_O_3
GGGTGTGCTTAATAGTTCTTATTAGTCTTAGCTTATTATCTTTGATTGGACATGCTATAA





545
Euc_012070_O_2
CTTGCTAAGTAGACATGTTATATTTCTAATGCTTTGAGAACAATATTACAGTATAATTAG





546
Euc_006617_O_2
AATCATCGACTAGACCGATGGTCAAAGTGGTAATCATGTAATTAAACGCGTTTGTCATTG





547
Euc_007827_O_2
ATGGAAAAATCTATGGATATGAAGGATTGAAGATATCCGTCTGGGTAAGCTGTGTATCAT





548
Euc_008036_O_3
TTATGATTTGAGAAAACCCTTGCAGGCTGCGATTTGCGGATCATGACAGCATAGTTTTGC





549
Euc_001596_O_2
GTTTTGTTGTGAGGGCTTGGTAGGTTTTCATTATATTGTAATGTCGACGACAGAGATTTT





550
Euc_005870_O_3
CCAATTAATGTTACTGCTCAAGCTGACGTACCTGCGAAAAAAGCACCAGTGACTGCTAAT





551
Euc_006901_O_3
TGATGTCAAAACGTAGCTCTTTTTTGTGTGAGCTATCCTGCTAAATTAAACCTCAGCAAA





552
Euc_006902_O_1
ACATGAGTATTATGAATACTTCGGTCCTGACTATACACTTCATGTTGCTCCGAGTAACAT





553
Euc_007440_O_2
GAATTGGCGATCACAATCTACTGTAGTCAATACTCAAGTGGGAGGTGTAAATAGATTCCA





554
Euc_008994_O_1
GATCATGTGTAATCAGTATATCAGGTTAGAAACAGTACTCTTGAGCTTAGCGGGCACTGT





555
Euc_024580_O_2
TCCTGTGAAGGTGGTCGACTCAATCAAAAGGTACCTTGTAGATAAGGTACCTTTTCTCAA





556
Euc_037831_O_5
GCATTTTATACGACGGATAGAGTCATGACCGTATCTTTCCATAAGTTTGGGGACTTCTTC





557
Euc_034958_O_3
CCTCGTTTCTTTGCGGTTCGGACGCATCATGGATGTATCTCCAAAGAGTAATCTGTCGAT





558
Euc_022967_O_2
AATTCAGATCTATTAGTGAAAGTTGGCATGAGTCTCAATCTTAGGGGAATACAGTACGGA





559
Euc_008599_O_3
TGATATGAGTATCATAACTCGGATGGTGACAACTTTGTACTACGGTCGGCACCGGTAGAT





560
Euc_009919_O_1
CATATACAATCTTAGTGGATTAGCTGAGGTCGAAACTGACAAGAGTGATCGCCCGTTGGA





561
Euc_015820_O_2
CATGGCTAACGCTGGCCCTAGCACTAATGGGAGCCAATTTTTCATATGCACTGTAAAGAC





562
Euc_008327_O_2
AACAAAGTCTACCTTGACATTAGCATCGGTAACCCTGTCGGGAAACTAGTCGGAAGAATT





563
Euc_004604_O_2
TGTGCTTGGATATACTGTATAAGCATTCTATATTATGCTTGTTGGCTTCGTTTTGAGGGA





564
Euc_000966_O_1
TTAACGTCGACCGCTTCTCTGCCCCTTGAATTTTCCCGAGAAAACCAGGAACCTGCCAAA





565
Euc_001037_O_1
TGTTGAATACGATGTATTATAATGTTGGTGTCTTGGTGAAATACAGAATTATGCTTGCGT





566
Euc_004603_O_2
ATCGCTGTGGCTGATCTCGTCGCTCCGGCTTTTCATAAAAATCATGGCTGAGGCAATCGA





567
Euc_005465_O_2
CTCGCAACCCTATATCTCGCTCAGGCGAAGAAGTCTGAGGATTTGAAAGAGGTGACTCAC





568
Euc_006571_O_1
TGTTTTTGGGTACACGCAGTTAGGATAACTAGCATGAAAGCCCGATCCCGCATATACAGG





569
Euc_006786_O_2
GAGGACTAGCCGGAACTTCATCGAACTCTCTCGGAGGGGTTACTACGATAACGTCAAGTT





570
Euc_007057_O_1
GATGGCTAGCACTGTGTAGAAAGGTGAATTTAAAGTACTTGTCTACACTGCTTATTAAAT





571
Euc_008670_O_2
TGAGACTGTCTTGGCGTGTATTTTGGAATAAACTATTATCACGTTTTGTTAAATATAATA





572
Euc_009137_O_3
TTACAAAATGGCTCTCAGAAAGTATCGAAAGGCCCTGCGCTATCTGGATATCTGCTGGGA





573
Euc_010285_O_2
AATTTTATGTTTGCTACTGCTTAGTGCTTAATGGACTTGCGTAGGTATTCAAATTACAGA





574
Euc_010600_O_1
TGGAACCGTGGTATCGGCTGACGTTATCCGTGATTTTAAGACTGGAGATAGTTTATGCTA





575
Euc_011551_O_2
CTTTGATGTATCCTCAGTGTACTGCTTTTAGCTATGTATAGATCGAGTCAACTCATTGAA





576
Euc_020743_O_3
TTTTTATTATTTACCTTCGCCTTTACGCTGCATACGTTAATAGGTTATTATTTCCTTCAA





577
Euc_023739_O_1
ATTTGTCCATGACAATCGTAGTCGAAGACACGATACGCTCTTAGATGGTACGGAAATCTG





578
Euc_031985_O_2
TGAATAGAGATAACTTTTCTGAGTGTGAATTGGATATTACGTTGCAAATAGCCGAATGAA





579
Euc_032025_O_2
GCTTTAGGTTAGGGATCCCTGTAAGCTGATGATAGATATTGGAGATGGTACTTGTAAGAT





580
Euc_032173_O_1
TGTTGTGTTTGGAAAGGTGCTGTCTGGGATGGATGTTGTCCACAAGATTGAGGCTGAAGG





581
Euc_009143_O_1
GGAAAGCGGGGAATGAGCATGTGGATATTATCTCTTTCTACAATGAAATATTCATTCCTT





582
Euc_000349_O_1
CATCAGGACGTTGACTCTAATTAAGACATATGTGACAGAGCGCCCTGTTAATGCGGTTAC





583
Euc_000575_O_2
CTTTAGGTTTGATCTGTCTGTTTTGTCTATCCTGCGAGTTTCGAGCATGTGCGTGTGTGA





584
Euc_000804_O_1
CAGCCCCAATAGATACTGGCTCTGTGCCGCTACTGAGAACAGTATTAAAATCTGGGACCT





585
Euc_000805_O_2
AAGAATGAAGCTGATATGAGTGATGGAACTACGGGGGCCATGAGCTCAAATAAGAAGGTC





586
Euc_000806_O_1
TGACTACAATTAGCACCTCACCATTATCGAACTGTATAATTGTGCTTGCCTGCTATTATT





587
Euc_002248_O_4
TTGAAGCGGAAATATATATTTATGCTACTACATAAGTAATGTACTACTTGACAAGATGAG





588
Euc_003203_O_1
TACTCGATGTGGTATAGAATTTATCCAATGTACTCCTAAATGTAGATACATCGTGTATTG





589
Euc_003209_O_2
GCTTCGTCTGATACCACTATCAAGATAATAGGCGTGAGCAATAGCTCTGGATCACAGCAC





590
Euc_004429_O_4
GGTCGGCTTGCTAGTGTATCTGATGACAAGAGCATATCACTCTATGATTACTCATGAAGG





591
Euc_004607_O_3
GAAAGGAGAAAAGCATGGAGATCGATCTCGGAAACCTCGCATTCGACGTCGATTTTCATC





592
Euc_004682_O_1
GATTCAGTACCCGGATTCGCAAGTCAACCGGTTGGAGATAACTCCACATAAGCGGTACCT





593
Euc_005786_O_1
TTCCATGTATCAAGCCGCATCAATGTTTGTCGCTGCAATTAACATGTGTGCAGTCGATCC





594
Euc_005887_O_2
TTCAGCGCATTGTGTAAATGTAGATAGGTGATATATTTCTCGTTGCAATGTAGGGTAAGA





595
Euc_005981_O_2
TCCAATAATCACATTTACCATCAACAGGCATCAGCAACATACTGTTGTAGTGTAATTAAT





596
Euc_006766_O_1
GGGCATTCTGACTACCTGCACTGTATAGCTGCACGGAACTCTTCTAGTCAGATTATAACA





597
Euc_006769_O_1
AATCGTCTGGTAGATTGTCAAAAACTAATAAACCTGTGATTGATCCGGATTCTAGTAATG





598
Euc_006907_O_2
AGTTGAGGATTCTCCACTATGACAGCTCTCATGGCTTGAATCTAAAGTCATCTGGTTTTC





599
Euc_007518_O_1
GAACAATCATTCTGTAGAACACTAGAGTCTATATGCTTGACTGTATCGGTTAATTAATTC





600
Euc_007717_O_1
AGATAGCGATAGAGTTATACTGCATGTACTGAGGTAAATGTTTTGATTACTCCACCCAAT





601
Euc_007718_O_1
AAGAATTGTTAGGAGGTGTATACTTTCTGTAACTGTATTCAATGAGCATACACCTGACGG





602
Euc_007741_O_2
CAACTCATATAATGACTGGATTCTGGCAACCGCGTCTTCAGACACAACAGTTGGACTATT





603
Euc_007884_O_1
AGTGTAAAAGGATGCCCCTAATAGATTATATGCCAAGTGTAGTATATATAATAGTGCTTT





604
Euc_008258_O_2
AAGAATCTACAGTTGTCTTATGCTACTCTATTACTCAATTATGCTGTGCTATTGATTGAG





605
Euc_008465_O_4
TCTGAATACATACTTTGTGGTCTCTATAAAAGACCAATGATACAGGCATGGTCATTAATT





606
Euc_008616_O_5
TAAATCTTCTCATGTGCCTGGCGTAAATTTTGCAGTTATTACTAGACCAAGATAGTTTCA





607
Euc_008690_O_4
ACATGGATTCGATCAATCGCCACATGACAACTAAAACAAGCGGTTCACGTGATTGTAATT





608
Euc_008708_O_4
AGATGAGTATGCTCGGGTGTATGATATTCGCAATTACAAGTGGAATGGATCGCATAATTT





609
Euc_008850_O_5
TCTTTGATTCTGTTGTATGGTGTATCTTATTGTATCTTCTATCTGCCCCCCATGTAATTC





610
Euc_009072_O_1
TTCGTTGTGTAGTACTGGGAGTTACTACTTGTATGTATGTAAATCATGTGGCGTCTGTCC





611
Euc_009465_O_1
GGAGATGTGTAATATGTCTGAGCGGTCACACTCTAGCTGTTACATGCGTAAAGTGGGGAG





612
Euc_009472_O_3
CCACCGTTGCGTAACTCGAATAGCCGGATTTTCGTTTTCGTTTTTATTTCCCCGTTAATT





613
Euc_009550_O_1
TGAGATGCTCTGTGTGAGGACTTTTACGAAACTTGAATGGCCCGTAAGGACAATAAGCTT





614
Euc_010284_O_3
TGGGTTGTTGCGACGGGTTCTACAGATAAGACTGTTAAGTTATTTGATCTACGCAAGATC





615
Euc_010595_O_1
GCAGAGGTGCCTACATATGCTTTAGAATGCTAGTAGCTTGGAAGTGCAACACGCTCGTGA





616
Euc_010657_O_1
AGTAAAGTTTAACGACTATGCATCTGTCGTAGTATCAGCCGGCTATGATCGTTCAGTGCG





617
Euc_012636_O_2
CGTTAGGATAGTCTTTAAAGGAGTTGGTGATTATTGATTTCCACCCAATATATGTAGCGT





618
Euc_012748_O_2
GAGCAAGCTACTTACAAAAATCGACAGCGTCTTTACCTATCTGAACAGACAGATGGCAGT





619
Euc_012879_O_2
TCCTTCCGACAAGTACCGTATTGCAAGTTGTGGTATGGACAATACGGTTAAAATCTGGTC





620
Euc_015515_O_1
TTTCACTCGATGACGGTTGGCCGGATAAATAATCGCTTATATAGTCCTAATAAGTTCCAT





621
Euc_015724_O_3
ATATGTAGGTGGTAGAGGTGTGGATATTGCATAGACCGAACCTCCGCAGGTCCGCATTCT





622
Euc_016167_O_1
CCATTGAACTACTTATGGATTACTTTATACATGAAATATCATGCCGGAGTAATTTTGAGT





623
Euc_016633_O_3
AGCATTAGAGACCTGGATTTTAGTCTAGATTCAGAGTTTTTGGCTACGACATCTACTGAT





624
Euc_017485_O_3
AAAGGTTTATCCCTCATTGGATTTGATATATAAACTGAGAGTGTTTTGCCCCCCATTAAA





625
Euc_018007_O_1
GTACAGCGTGTATTTCTTGTTACGATACTTGAGGGGTTAGAGGCACCTACGAATTAGGAA





626
Euc_020775_O_3
ATATCCTTATGAATGAAGTTTGGATGATAAGTGGCGCCAGACTTTCTACTCACCCTTTTT





627
Euc_023132_O_3
TGATCACATCGTTGTTTGCAATAAGACGTCATCAATTTATATCATGACTCTACAGGGACA





628
Euc_023569_O_2
TTTTCCCAGTGTACTGCGAGAGTGATGCTACATAAGTTTACTCTTGTGTCTAACTTTTCC





629
Euc_023611_O_1
AGATTCTACAGATGGCGCTATACGAGCTGTTATACGGACATTTTATGACCATACACATCC





630
Euc_024934_O_3
TGCTACGGGAAACCAGGACAAAACTTGTAGGATTTGGGACATACGAAACTTATCTAAGTC





631
Euc_025546_O_1
CAAGTCATATAGTTACAGTGTCGCATGACAGAACAATTAAGCTCTGGACTAGTAACGACG





632
Euc_030134_O_2
TGCCACATCGTAACCATCATAGCACTTATCATCTAATTATGGTGAAAGGGAGTTATATAT





633
Euc_031787_O_5
GTTTATACTTATAAACAACAGAGAGACAACTGTACAGGTGTTGTAAACACTCCCAGTGTG





634
Euc_034435_O_1
CTGTGTTTTAGCCCGAGGGCCAATCACTTAGTTGCTACTTCGTGGGATAATCAGGTACGG





635
Euc_034452_O_3
GCAAAGTAGAGTTTAAGTTTCGTTGTGCTTGGACCGGAAAACTCACATGCTTAGAGTTTA





636
Euc_035789_O_5
AAGATTTGGGCATAACTTGTATGAACTTTTTCTGTTGTCGACACTGTAATTACACGAGCT





637
Euc_035804_O_4
AAACAGATGCATGTATGCTTCATAACTCTATAGATATGGAAATGTCACTGTACACTGATC





638
Euc_043057_O_2
TTATTGGTGCACAGGACGGAAAATTGCGCATATATTCTATTTCAGGTGATACATTAACAG





639
Euc_046741_O_1
AGGCACAGACACTTGCCTAAACCAATATACAAGGCAGGTATTCTAAGGCGCACCGTGAAT





640
Euc_047161_O_4
CATGCGAAGGTTTCTGGGAATTTTCAGTAGAAAATTCGGTCGTGGCGGCCATCCTCGATA





641
Pra_001766_O_1
TTAAGCTGATAGCTTTAGTTCCTACGTGGAATGTATAAATGCACCATTGTCCATAAGGCA





642
Pra_002927_O_2
GGATGCTCTGGTTACATGACTACTCCTTAGGGAATCAGTCAGACATTTTAAATAACTTCC





643
Pra_007642_O_2
TCATTAAGCGGTACTGGCAGAGGACATGTCTATTTATACAAGCAAATGGTCCTATTGGCT





644
Pra_013714_O_1
ATGTTGGTCAGACCTCAAATATTGTACTCCCCACACTAGGGAGCATTTACGGTGAATATA





645
Pra_016332_O_1
TCCTCTCGACCCTTAGAGTCCTCTGCGAATCTTGTTGTTAGTTACTGTGTACGCTGTAAC





646
Pra_021677_O_3
AAGCATGTTTTGAATTTATGGTGGTGGCATGTGGATATTTGAACTTGGTTGAGAAAAATT





647
Pra_027562_O_2
CATTCCTATTGAAGGGTCAACCTTTAATTTTGGCTAGCAGGACTGTATAGGATTATATGC





648
Pra_001504_O_2
TTATTGTATTTTAGATTCTTGATGGCCATCTAAACTTCTGGCTGCTTGGTGCAACATTGA





649
Pra_015211_O_2
ATAGCTAATGATTCCATGCTATCCATGGTATCTACTTCACGATAATAAAGGTCTTAGTCC





650
Pra_020421_O_2
CACCTAATAGGCCTGAGTATTGCTCACCACTATGCTGATATGGGGAGCAATAACGTTAGT





651
Pra_003187_O_2
TTTCTTTTCACTTTGTACTAATGATCATTGTGACCACAAAATCTTTATACACAATACAGA





652
Pra_015661_O_1
CTTGTCACTATCCTCATATTGATATCACCTCGTGTATGTTGTGGGGTGGCAAAATTACTT





653
Pra_013874_O_1
TATTTTAACTCAGCGACTTACCAGCCTAGTAAGCAATGGGGAGCTTGCATGTATTAGTTT





654
Pra_014615_O_1
ATTCGTCCTGGTCCTTTAGGACATGTACTTATGTCCATGCAAGTGCTTCTTGCCTAAGCT





655
Pra_004578_O_2
TTCTAGGCGATATATATCGCCGTAACTTTGGATGTGTTAAGAATATAGGGGATCATTAGC





656
Pra_023387_O_3
AGTTGCAGAGTGTGTAGCAACTGATGAGCATAGTTGTTATGTTTCTCAACTCAGTTGCAC





657
Pra_006970_O_1
AAGAAACTCATACACTGGACAGGCCAACCTTCCAAATATGTGTTTAGAAAACCTTTGTCT





658
Pra_010322_O_1
AAGGGGTGCTATCCATATCTAGAATCTACCATGCTCAATGAGGTATCTTCATTAGTATAC





659
Pra_022721_O_1
ATCTAATGCTAGTTTATTGATTTCTATGATCCAAGACCTCGTCATAGATCAAGTGCCTAG





660
Pra_023407_O_1
TTGTTATTAAATACCATTCAATATGCTTATGATTCATGAATGCTTAAGAGATTCTGCTGC





661
Pra_001945_O_2
GCTTCTAAACTGTAGAAGCCTGTTATCTTTAGACTCGTGGTTATGTGAACTACTTTTACA





662
Pra_008233_O_1
GGCTGTGGGGATTCGAGCCTGATGGTTATGCACTGTGGCCAGCAAGATGTTGAAGTTTTA





663
Pra_008234_O_4
GCCTGATGGTTATGCACTGTAAGTGATCTGATTTGATTAACTATTTTATCAATTAATTTT





664
Pra_022054_O_2
ATGGTCATTATCCGAGATAGTGCGCTTTGTCATGGGAAAATGACTATTGAATGTGAGTTT





665
Pra_012137_O_2
TTTTCTGGTGCATCCTTAACACAGCTTGGTTACATGGTGAATTACAGTATTTGAAGGAGT





666
Pra_012582_O_2
AGATTTAATGCCACTTAGGTGATCGGTGACCCACTTGTACATATAGATGTTGGCGATGTT





667
Pra_015285_O_2
AAGAAATTCATCAATTCTTTGAAATTATTGTTCCCTTTTGATGCGGCCCCTTTCTGGAGG





668
Pra_017229_O_1
TAAAGTATATTTTAGCCGCTGTTGTTGTAAATTTATGTTTTTCATTGCTATCAACATTTA





669
Pra_020724_O_2
GGTTTTCCTATAAGATGTATGAATTCGCACTGTGGTGCAATTTTATGAATTAAACTCAAA





670
Pra_004555_O_1
TTTACTATTCCGTCTGGGCTTAGAGATGTACGTTAATTGGTCATTTAAGACGACTCAGTT





671
Pra_004556_O_5
TCAAATCTAGTCAATATCCGTGTTGAGCTAAACAAGCGCTGAAAGTTTGCTCGAATCAGC





672
Pra_005729_O_2
AGAAAGTTGTGTACTAATTTGTATTGTAACGTCCATTTATCCAACGAGTCCTCCATTCAT





673
Pra_007395_O_3
CAGTACTGTATTCGAAGATCCTGAAAATTTACTAAAACAAATGGAATATCAACAACCTAG





674
Pra_009503_O_1
TTGCTCTATATAATTTGTGCTCGTGTGTGTACTTGAAGATCCATCCTCACATAGTCCAAT





675
Pra_011283_O_1
GTGTGTATAGTTTTATAACACTCTATGGTATCACTACCACTATGGGCCTGTTTAGTCCAA





676
Pra_012322_O_3
GAAGCAGAATCAGCTTTGACCAGTATTTAGTGTCTTGTATACAATTCTTGTTTCAGTGAA





677
Pra_023236_O_3
AAATCAAGATTAAAATCCGAAACCAAGGCTAACCAGCAAACTGTGAGGTGTACATTGTTG





678
Pra_000171_O_2
TTCCAAGCAGAAGGGCACATGTTGTGACATCAAGTAGTAGATTGTTCTGCAGATTCTGGT





679
Pra_000172_O_1
GTTAATGTAATACATTTAGTTTTTAGATAACTGTTAATGTGTAGTAAAGCACTAGGAAGA





680
Pra_001480_O_3
GAGGCTTCAAAGGTTTTTGTGTCTTTTCTAGTTATTATAAACGCTTCATAGGTTCCTAGG





681
Pra_001692_O_2
GAAGATTGTAAGTTGGGTGAACTTTTTTACCACGCTAGGTTGATCTATTTTAAGACTCTT





682
Pra_005313_ORF_O1
AAAATAGCTGCGCGTACCACAAAGGTGACAAACGCCGGATTTCTCTTATCAGACTTGTCA





683
Pra_006362_O_1
TTTAATTATCATAGTTTTATTCCGGCTATCTTGATCATTCACGGAAGTCCCGAGAGTCAA





684
Pra_006493_O_3
GTGGAGTGAACGTGGTTACTTCAATGGATTACCCTTCTATCGTGTCATTAAACACTTTGT





685
Pra_006983_O_1
GCTAACTCTTCTAGTTGAGATCTCCATCAATTAATGGATACAAACATTGAGTTTCACTTT





686
Pra_007665_O_1
GGATCACTACTGGATTCCGTTACATTAGTTATTGCAAGTTGGTTATTATGTACGTTTATA





687
Pra_012196_O_1
ATGAACAAATGCAATTACCCTGTTTTATTCTATCCCGCTTTAATTAATATTGGTCATGTT





688
Pra_013382_O_1
TTTGCTTGTGGATTGTACTGTGGTACATGGTATAAATCTATAGGCTATGTCGATTATTTT





689
Pra_016461_O_1
ATATAAGATATAAGATATTGCCAGCAAACTATTTGACAGGTTATTTAATAAAGTGTGCTA





690
Pra_017611_O_1
TTTTAAATGTGGACAGAGGCACTATAAGAATGCGAAATATCGTCGGAGCACGACTAATTG





691
Pra_019776_O_1
ATAGACTAGTTCTACAAAGCCCTAGGATGATGGACTTCATTTCTTTTGCATTAAGATGAA





692
Pra_020659_O_1
GATTTCTTATGGGGTTGGAACATTCCTCGCTGCCTTCTGGTAATATTAGGTTATGCGTTT





693
Pra_022559_O_3
AATTGAGGTTGACTGTGTACTTCTCCAGTGGACAGGAGAAAGCGATAAAATTCAAACGTT





694
Pra_024188_O_5
AAGGAAGGGCAAATAGAGCTCGCGCTCAAGAAATACCTTAAATCGATACGGTATTTGGAT





695
Pra_027973_O_2
TAATTTAAGAGCTATGAAACAACTACCTTTTGGAATGGTTTTGTTTTTAGCATCCCAATT





696
Pra_001353_O_1
TTGTAAATTATGCTGGTTCCATATGGGGGTTAATCAGTATCCTGGTTATTTGTGACACCA





697
Pra_001978_O_3
GTTGTGAACTATCAATAGACGGGGATGGTCCTTTTTAGCTGCTCCTTAAGCAGCTCAAAT





698
Pra_002810_O_2
TCAATTCCGGTCATATGTAGACGACTATAATGTTGTTTGTGTCCTATAACTATAGTGTTG





699
Pra_002811_O_1
CATTTTACACCCTATAACAAAATATAGTGTCATAAGTTTACACCAGGTAACAACTCTATA





700
Pra_002812_O_3
ATGGAGAGTTTTATTCATTACATGAAAGAGTATGTCACCTTTCGTGCTCCATCTATTGAT





701
Pra_003514_O_1
TTTCACGTCCTGTATACTCACTCAAGCAACTTTAGGATGAAGAGCTAAAGTATATCAAAG





702
Pra_004104_O_2
AATGCACTCTTTATAAAGTGGGATGAGGTATGTGTTTCCTTCCTATTGGCTAACCTGAAT





703
Pra_005595_O_1
ATTGGGCAATCGTTATTGATTTTACCTATCGCTATCTCACTGTCCGCCAATTTAGTGTAA





704
Pra_005754_O_1
TTTCAGCGGATATAAAGTCTTCCAACTTGTAAACCGGTGCTGTGAAGATTAAAAGTCCTT





705
Pra_006463_O_1
GCTTTAGAGGCAATGGTAGATTATGAAGTCAACACCAGGGAGTTTGACCGTTTGGGACAT





706
Pra_006665_O_1
CATTCAATTTGACATTGGAGTTTCAAGGCATTCCAAGGATAGCATGTACACAAGTTGAAT





707
Pra_006750_O_1
CATAAAATTACTATGGAAGTTGGATCATTATCTATGCCATAGTGGAGTAGAACTAGATTT





708
Pra_007030_O_1
CTCTTGATTCTAGAATCTAAACTACTACCTTGCGGACATGACTGAGCATCTCTCTAACAG





709
Pra_007854_O_1
CAGGGTTGTGCTAGTTTAACATTTTAACTTAATGTAATCATGTAAGCTTTAGAGAGGTGG





710
Pra_007917_O_1
GTAAATGTTTACATTGAGGTCATGCATGAGTGTTAATTACGCTTTCACTACTGTTCACTT





711
Pra_007989_ORF_O2
AATTAAAGCTTGGTTGTATGATCATTTGGGATCGAGAGTAGATTATGATGCTCCTGGGCA





712
Pra_008506_O_1
TTATCTAGCTAGAAGTTGTGAAATTAAGAGGGATGTGAGGATTGGGTTATAACTAGTGTA





713
Pra_008692_ORF_O2
AATGAATCAGGCATTAAAGCGGGAATCATTTATGACTTGGCAACCTGAAAATTCTATTAA





714
Pra_008693_O_2
TTCTTGACGTTTTAATATGGTATGGTATTAAATTTGGAAGGCCTATTCGATTGTTTGCAA





715
Pra_009170_O_1
TTCTTATAACCTGTACGATTGCCGATATATCACCAATTTTGCTGATTTTAATCTGAGTTT





716
Pra_009408_O_1
CAATTTCATATTCGGGTTCAATGTAGTGCCTCTCATTTTAGGGTGATAGCATGAGTTTTT





717
Pra_009522_O_1
TCCACAAGTTAACATAGGTAACTATCGACTGAAGTGAACTGGGGGGCAGAAGCTAACTAT





718
Pra_009734_O_2
TTTAGATAGCCATTTACATTTTACTTATTATTGGACTTGTAAAGATTTTTGTACCCTTGT





719
Pra_009815_O_4
TTGCTGAAATATTTCAAGCTGAAAGTTATGATTCTGGCCAAGAAGTCTACTGAAAATTTG





720
Pra_010670_O_2
AAACATAAGTTTGGCCCAGATTCGGTTTATCATAAAATCTGGCTGCATATAAGGTGTCAG





721
Pra_011297_O_1
ATGTTCTAGAATTTGTCTAAGCTAGCTACTGGTGTTTAACTGATATGGAAAACTTTTGCC





722
Pra_013098_O_2
TTTGGGGAGTACTTTAGTCAATAAAAGTGAAGTGAATCATGATATAAAGGGTTTAAGTAA





723
Pra_013172_O_2
AGAAGTTACTAATTTGTAGATAAATTCTAACGAAGGTGATGATAGCATACACGTAATGAA





724
Pra_013589_O_2
GAATTTTGATGGTAGCGTATGGTTGAAGGAAAACTTGGATATATCATGTAAACATTTTTC





725
Pra_013608_O_1
TTAATGAACCGCTTTTTCCTTGAGAGGCTATGAATGCCTGTAGAACTAATCCTTTAAGTA





726
Pra_014299_O_2
TTTCTCTAACACTATATTTTCTGGTATGACCGCTCTACATTGTATATTAACCCTTGCAAA





727
Pra_014498_O_1
TATATTCACTGTGCTGGGATTATCCTCTCCCCTTTTTGACCCACTGTTGTGTGTATTTGA





728
Pra_014548_O_1
GAGCATACAGCGTTATCTTTGAGACGAGTCATCAATGATAATATCCTCGTAAAAGGTTAC





729
Pra_014610_O_2
TTTATTCAATTACGACGGATTCAGTTGGCCTTTTGTAACATTCAAGTATCCATCTATCAC





730
Pra_016090_O_2
ATGTTCAGGGGTATTAAAAATTCAGAGGATAAATTTCCTCACTCTCAAGTGTTAGATGGT





731
Pra_016722_O_2
CAAAGTCTAGACGTTAATGTTTTGGAACTCTTTTTTCGAATTTGTGCCTATTGAATCACT





732
Pra_016785_O_3
TATAAATATATTGTACTGGGGATCCAAGACATGGCAATATATGTCGAGATTTTCATTTTC





733
Pra_017094_O_3
CTTTTGCATGAGTTCAAATGTCTTTGTGACATATTGTCTTGAACCACCGAGGATATATCA





734
Pra_017527_O_2
GTTTGTATGTCCAATAGATTATAACCTATTTACTGTGACACTATTCTTCACACCCATGTC





735
Pra_017591_ORF_O2
AGATCTAGTTGTTTCAGCATCGTTGGACCAAACTGTTCGTGTATGGGATATAAGTGGCCT





736
Pra_017769_O_2
TGCCGTATCAAAAGATTGGTACTTCCTTATGGACACACAAGATCGTAAGCATGGCTGAAT





737
Pra_018047_O_2
TTGATGGCCACATGAGTTGTTTATACAAGTCGTTGTTTTATGAGAGAACCTTCTTCAGAT





738
Pra_018414_O_1
ATTTCTATAGTGCCATATGCTTGTCGGTTGTCATTGACCTCTAATAGAATAGCCAGAGTA





739
Pra_018986_O_1
TTCACGGCAGTTGAACTAGTCATAGTGGAATATTATTTAAATGGTGTATTCTAGTCACAT





740
Pra_019479_ORF_O1
TGCAGGCGCTCTATAGTTCTGTTCTCTAGCATGAAGTGTGTATTTTATCTATTGTGGACC





741
Pra_020144_O_1
TGTCTTTAATCTTCAGGGTTCGTTACTAACAATTGAGCTCAAATCTCTATTCTGACCAGC





742
Pra_022480_O_1
CATTTATAGAGTTGTGCAAAATCACCCATAATGCTATGAATTGACAGGTGACTGTAATCT





743
Pra_023079_O_2
GGAGAAAATTTCCTATCCCTTTGTGGGTGTGTGAAAAACGAAATATAGAGGAACAATGTG





744
Pra_026739_O_2
ACCAATCATTTATTTGCAGTGTAGTTGATATGAAGGGAGAAATATGACAGTTGGTTTCAA





745
Pra_026951_O_2
AAGTTAATGTTCTCATAGGTTATTCATTGGAGTTGTCTCGTATGTACGCTGTGCCGTAGT





746
Pra_026529_O_2
CTCATAAATTGAGGCTTGCCTACGTTAATTGTTATATATGGAGAGCCATGCTAATTGTTA





747
Euc_006366_O_2
GCAGATCATGTAATTGTATCTCAAATTATAGTATCCGTATTCTGTACAAATGCTCCGGAA





748
Euc_017378_O_1
TCTTTACGCAGATGGTGACTGAAGCTGGTTCCGAGATCGGCATATGTAGCTGGTAGAGGT





749
Pra_000888_O_1
TTCACATTGAGGGTTGCCGTCGGTATTCGCCGATGATATCCTGTTTTACGCGCAACAGTT





750
Pra_014166_O_1
TCATTATTTAGGGTGCAGGCTGTATAAAATGTTGTAAATTGTAGTATCAATGTGTACAAT





751
Pra_003189_O_1
GCATTCACCACGACAGTAAAGTAATCATTATGATTACTAATGTATTGCTTTCATGGGGTG





752
Pra_009356_O_4
AAAGGGTATATTTTGTCTCATGTTGGGGTGATAATTCTCCCTGAAAGTCTCCAAAATATA





753
Pra_000065_ORF_O_2
AAATTTCCGGTTGCCATAGTCTAGTGGGGTGAGGGTTCATTCTAGGGGATTTATTGTGTT





754
Pra_014197_ORF_O1
GCAGTGATAAAGGTACTTCTTGGTGATAATCCTAAAGCCTTACCCATGGATATCCAGCCT





755
Pra_009081_O_2
TTCTTTAACAAGGTAAAAATCCCCCCCTTGGCATGTAGCTCAATTAGTTGTAATGGAACT





756
Pra_013417_O_1
AGTTGTAAACAGTGTAATAAGGAGCAGAAGTTGTGATAGCTTTTAGGAACGATAGACTTT





757
Pra_005755_O_1
TGAACCAATTCTTGTATATTAGATATGTAACATGTATGAATGTCCATAGAGCAGAGCTTT





758
Pra_006670_O_2
AGCCAGGCACGCTTAACTAAATTTCGTTTAGTTCACCATGACTATTCGTTGAACTTAATG





759
Pra_007027_O_1
CAAAACCCCTTGTAGGGTGGACTTCTGTTGTATCCAATTTTTATGGCATAATTAGCTAGT





760
Pra_007276_O_1
AATTTGGTGATTATTCCTTACCATATCGTACTGTACAGATACGGTAAGGTCGAAATATAT





761
Pra_007390_ORF_O1
CATGCCGTGATCGGTCGATTGCATTAAGTGCTGCAAGGATCAAATAGTGGCACTGTCATG





762
Pra_012648_ORF_O1
CAAACATAAATAAGGTTGCTACTTTAAAGGGACATACGGAACGAGTTACTGATGTGGCAT





763
Pra_013171_O_2
ATTTATGGATGAGGTACTCCTTATGAATATCTTCAAACTAAGAAATAACTATATATGCAA





764
Euc_045414_O_2
CTTGGTTTTTGTTGAGCTTTCTATTTCAAGCAATTTGTGATTGGGGGGTTCTGCATTCTT





765
Euc_044328_O_2
ATGTCTAAAGAGCCGTGATCTATGAGTAGATTAGAAACCGCCTTTTTAGTTGCAAACGCC





766
Euc_015615_O_2
TTGCAACAAGGTATACTTAGTCAGTCCTTGTTATGTATGTCTTTTGTCAACCCTTCAGGG





767
Euc_017239_O_3
GGCGGAATCCCTTTGTTCTTTCGAGCTTTACGTGACAAGTCGGCCAGAAAGCAGTAGCAT





768
Euc_018643_O_3
TTGATGTACGAGCCGCTATATCTAATTCTGCCTCCCAGTCACTGCCAAGTTTTACTCTTC





769
Euc_019127_O_5
GTCTTGCATGTCAGCTATTATACAGTCCTGTTTATAGTCCTGTGATGTAATAAAAAGCTG





770
Euc_022624_O_3
AAGTAGGAGATCGTGTAGAGAGAATACTTTCTGCTCTCAGCGGCGAAGAGGTTTGTCTGC





771
Euc_032424_O_1
AATTGTGAGTAGAATAGGAGAAACTTTTGTACAAGATTAATACGTGTGGCATAATAAGAT





772
Euc_037472_O_1
TGATGTGCAGTTTACATTATTATGGTTCGAGTATTATTTAGCTGCCCTATCTTAAGTCAT
















TABLE 15

Peptide Table.

Protein SEQ ID
Target
Patent PEPTIDE Sequence
Patent ORF start
Patent ORF stop















261
CDK type A
MGDGSLGSGGRGNSGGGGGGGSRPEWLQQYDLIGKIGEGTYGLVFLARIKHPST
387
1820





NRGKYIAIKKFKQSKDGDGVSPTAIREIMLLREISHENVVKLVNVHINPVDMSL




YLAFDYADHDLYEIIRHHRDKVNQAINPYTVKSLLWQLLNGLNYLHSNWIIHRD




LKPSNILVMGEGEEQGVVKIADFGLARVYQAPLKPLSDNGVVVTIWYRAPELLL




GAKHYTSAVDMWAVGCIFAELLTLKPLFQGQEVKANPNPFQLDQLDKIFKVLGH




PTQEKWPMLVNLPHWQSDVQHIQRHKYDDNALGNVVRLSSKNATFDLLSKMLEY




DPQKRITAAQALEHEYFRMEPLPGRNALVPSSPGDKVNYPTRPVDTTTDIEGTT




SLQPSQSASSGNAVPGNMPGPHVVTNRPMPRPMHMVGMQRVPASGMAGYNLNPS




GMGGGMNPSGIPMQRGVANQAQQSRRKDPGMGMGGYPPQQKQRRF





262
CDK type A
MEKYQQLAKIGEGTYGIVYKAKDKKSGELLALKKIRLEAEDEGIPSTAIREISL
99
1007




LKQLQHPNIVRLYDVVHTEKKLTLVFEFLDQDLKKYLDACGDNGLEPYTVKSFL




YQLLQGIAFCHEHRVLHRDLKPQNLLINMEGELKLADFGLARAFGIPVRNYTHE




VVTLWYRAPDVLMGSRKYSTQVDIWSVGCIFAEMVNGRPLFPGSSEQDQLLRIF




KTLGTPSLKTWPGMAELPDFKDNFPKYVVQSFKKICPKKLDKTGLDLLSRMLQY




DPAKRISAEQAMGHPYFKDLKLRKPKAAGPGP





263
CDK type A
MDQYEKIEKIGEGTYGVVYKAIDRSTNKTIALKKIRLEQEDEGVPSTAIREISL
120
1004




LKEMQHGNIVKLQDVVHSERRLYLVFEYLDLDLKKHMDSCPEFSKDTHTIKMFL




YQILRGISYCHSHRVLHRDLKPQNLLLDRRTNSLKLADFGLARAFGIPVRTFTH




EVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFAEMVNRRPLFPGDSEIDELFKI




FRIMGTPNEDSWPGVTSLPDFKSTFPKWASQDLKTVTPTVDPAGIDLLSKMLCM




DPRRRITAKVALEHEYFKDVGVIP





264
CDK type A
MVMKSKLDKYEKLEKLGEGTYGVVYKAQDKTTKEIYALKKIRLESEDEGIPSTA
23
937




IREIALLKELQHPNVVRIHDVIHTNKKLILVFEFVDYDLKKFLHNFDKGIDPKI




VKSLLYQLVRGVAHCHQQKVLHRDLKPQNLLVSQEGILKLGDFGLARAFGIPVK




NYTNEVVTLWYRAPDILLGSKNYSTSVDIWSIGCIFVEMLNQKPLFPGSSEQDQ




LKKIFKIMGTPDATKWPGIAELPDWKPENFEKYPGEPLNKVCPKMDPDGLDLLD




KMLKCNPSERIAAKNAMSHPYFKDIPDNLKKLYN





265
CDK type A
MDQYEKVEKIGEGTYGVVYKAIDRLTNETIALKKIRLEQEDEGVPSTAIREISL
149
1033




LKEMQHGNIVRLQDVVHSENRLYLVFEYLDLDLKKHMDSSPDFAKDPRLVKIFL




YQILRGIAYCHSHRVLHRDLKPQNLLIDRRTNALKLADFGLARAFGIPVRTFTH




EVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFAEMVNQRPLFPGDSEIDELFKI




FRILGTPNEDTWPGVTALPDFKSAFPKWPAKNLQDMVPGLNSAGIDLLSKMLCL




DPSKRITARSALEHEYFKDIGFVP





266
CDK type B-1
MEKYEKLEKVGEGTYGKVYKAKDKATGQLVALKKTRLEMDEEGVPPTALREVSL
199
1116




LQLLSQSLYVVRLLSVEHVDGGSKRKPMLYLVFEYLDTDLKKFIDSHRKGPNPR




PVPAATVQNFLYQLLKGVAHCHSHGVLHRDLKPQNLLVDKEKGILKIADLGLGR




AFTVPLKSYTHEVVTLWYRAPEVLLGSAHYSIGVDMWSVGCIFAEMVRRQALFP




GDSEFQQLLHIFRLLGTPTEKQWPGVTTLRDWHVYPQWEPQNLARAVPSLGPDG




VDLLSKMLKYDPAERISAKAALDHPFFDSLDKSQF





267
CDK type B-2
MERPATAAVSAMEAFEKLEKVGEGTYGKVYRAREKATGKIVALKKTRLHEDEEG
41
982




VPPTTLREISILRMLSRDPHIVRLMDVKQGQNKEGKTVLYLVFEYMETDLKKYI




RGFRSSGESIPVNIVKSLMYQLCKGVAFCHGHGVLHRDLKPHNLLMDKKTLTLK




IADLGLARAFTVPIKKYTHEILTLWYRAPEVLLGATHYSTAVDMWSVGCIFAEL




VTKQALFPGDSELQQLLHIFRLLGTPNEKMWPGVSSLMNWHEYPQWKPQSLSTA




VPNLDKDGLDLLSQMLHYEPSRRISAKAAMEHPYFDDVNKTCL





268
CDK type C
MGCVLGREVSSGIVTESKGRDSSEVETSKRDDSVAAKVEGEGKAEEVRTEETQK
291
2042




KEKVEDDQQSREQRRRSKPSTKLGNLPKHIRGEQVAAGWPSWLSDICGEALNGW




IPRRANTFEKIDKIGQGTYSNVYKAKDLLTGKIVALKKVRFDNLEPESVRFMAR




EILILRHLDHPNVVKLEGLVTSRMSCSLYLVFEYMEHDLAGLAASPAIKFTEPQ




VKCYMHQLLSGLEHCHNRRVLHRDIKGSNLLIDNGGVLKIGDFGLASFYDPDHK




HRMTSRVVTLWYRPPELLLGANDYGVGIDLWSAGCILAELLAGKPIMPGRTEVE




QLHKIYKLCGSPSEEYWKKYKLPNATLFKPREPYRRCIRETFKDFPPSSLPLIE




TLLAIDPAERGTATDALQSEFFRTEPYACEPSSLPQYPPSKEMDAKKRDDEARR




LRAASKGQADGSKKERTRDRRVRAVPAPEANAELQHNIDRRRLISHANAKSKSE




KFPPPHQDGALGFPLGASHRFDPAVVPPDVPFTSTSFTSSKEHDQTWSGPLVDP




PGAPRRKKHSAGGQRESSKLSMGTNKGRRADSHLKAYESKSIA





269
CDK type C
MYSKSSAVDDSRESPKDRVSSSRRLSEVKTSRLDSSRRENGFRARDKVGDVSVM
107
2236




LIDKKVNGSARFCDDQIEKKSDRLQKQRRERAEAAAAADHPGAGRVPKAVEGEQ




VAAGWPVWLSAVAGEAIKGWLPRRADTFEKLDKIGQGTYSSVYKARDVTNNKIV




ALKRVRFDNLDTESVKFMAREIHILRMLDHPNVIKLEGLITSRMSCSLYLVFEY




MEHDLTGLASRPDVKFSEPQIKCYMKQLLSGLDHCHKHGVLHRDIKGSNLLIDN




NGILKIADFGLASVFDPHQTAPLTSRVVTLWYRPPELLLGASRYGVEVDLWSTG




CILGELYTGKPILPGKTEVEQLHKIFKLCGSPSDDYWRRLHLPHAAVFKPPQPY




RRCVAEIFKELPPVALGLLETLISVDPSQRGTAAFALRSEFFTASPLPCDPSSL




PKYPPSKEIDMKLREEEARRRGAAGGKNELEKRGTKDSRTNSAYYPNAGQLQVK




QCHSNANGRSEIFGPYQEKTVSGFLVAPPKQARVSKETRKDYAEQPDRASFSGP




LVPGPGFSKAGKELGHSITVSRNTNLSTLSSLVTSRTGDNKQKSGPLVSESANQ




ASRYSGPIREMEPARKQDRRSHVRTNIDYRSREDGNSSTKEPALYGRGSAGNKI




YVSGPLLVSSNNVDQMLKEHDRRIQEHARRARFDKARVGNNHPQAAVDSKLVSV




HDAG





270
CDK type C
MGCIPTIISDGRRRSAAPDKRRPRPRRSSSEGEAPPHATAAGSEGGESARGAPG
82
1749




KERPEPAPRFVVRSPQGWPPWLVAAVGHAIGEFVPRCADSFRKLAKIGEGTYSN




VYKARDLVTGKTVALKKVRFDNLEAESIKFMAREILVLTRLNHPNVIKLEGPVT




SRMSSGLYLAFEYMEHDLSGIAARQNGKFTEPQVKCFMRQLLSGLEHCHNHDVL




HRDIKCSNLLIDNEGNLKIADFGLATFYDPERKQVMTNRVVTLWYRAPELLLGA




TSYGIGIDLWSAGCILAELLYGKPIMPGRTEVEQLHKIFKLCGSPSEAYWNKFK




LPNANIFKPPQPYARCIAETFKDFPPSALPLLETLLSIDPDERGTATTALNSEF




FAAEPHACEPSSLPKYPPSKEMDLKLIKEKTRRDSSKRPSAIHGSRRDGIHDRA




GRVIPAPEATAENQATLHRPRAMKKANPMSRSEKFPPAHMDGVVGSSANAWLSG




PASNAAPDSRRHRSLNQNPSSSVGKASTGSSTTQETLKVAPELLQVGSSSLHPC




HRMLVYGSNLTIRSK





271
CDK type C
MGCICAKQADRGPASPGSGILTGAGTGTGTRSSKIPSGLFEFEKSGVKEHGGRS
151
1560




GELRKLEEKGSLSKRLRLELGFSHRYVEAEQAAAGWPSWLTAVAGDAIQGLVPL




KADSFEKLEKIGQGTYSSVFRARELANGRMVALKKVRFDNFQPESIQFMAREIS




ILRRLDHPNIMKLEGIITSRMSNSIYLVFEYMEHDLYGLISSPQVKFSDAQVKC




YMKQLLSGIEHCHQHGVIHRDVKSSNILVNNEGILRIGDFGLANILNPKDRQQL




TSHVVTLWYRPPELLMGSTSYGVTVDLWSVGCVFAELMFRKPILRGRTEVEQLH




KIFKLCGSPPDGYWKMCKVPQATMFRPRHAYECTLRERCKGIATSAMKLMETFL




SIEPHKRGTASSALISEYFRTVPYACDPSSLPKYPPNKEIDAKHREEARRKKAR




SRVREAEVGKRPTRIHRASQEQGFSSNIAPKEKRSYA





272
CDK type C
MAVAAPGHLNVNESPSWGSRSVDCFEKLEQIGEGTYGQVYMAKEKKTGEIVALK
82
1644




KIRMDNEREGFPITAIREIKILKKLHHENVIKLKEIVTSPGPEKDEQGRPEGNK




YKGGIYMVFEYMDHDLTGLADRPGMRFSVPQIKCYMRQLLTGLHYCHINQVLHR




DIKGSNLLIDNEGNLKLADFGLARSFSNDHNANLTNRVITLWYRPPELLLGATK




YGPAVDMWSVGCIFAELLHGKPIFPGKDEPEQLNKIFELCGAPDEINWPGVSKI




PWYNNFKPTRPMKRRLREVFRHFDRHALELLERMLTLDPSQRISAKDALDAEYF




WADPLPCDPKSLPKYESSHEFQTKKKRQQQRQHEETAKRQKLQHPPQHPRLPPV




QQSGQAHAQMRPGPNQLMHGSQPPVATGPPGHHYGKPRGPSGGAGRYPSSGNPG




GGYNHPSRGGQGGSGGYNSGPYPPQGRAPPYGSSGMPGAGPRGGGGNNYGVGPS




NYPQGGGGPYGGSGAGRGSNMMGGNRNQQYGWQQ





273
CDK type C
MGCICTKGILPAHYRIKDGGLKLSKSSKRSVGSLRRDELAVSANGGGNDAADRL
626
2782




ISSPHEVENEVEDRKNVDFNEKLSKSLQRRATMDVASGGHTQAQLKVGKVGGFP




LGERGAQVVAGWPSWLTAVAGEAINGWVPRRADSFEKLEKIGQGTYSSVYRARD




LETNTIVALKKVRFANMDPESVRFMAREIIIMRKLDHPNVMKLEGLITSRVSGS




LYLVFEYMDHDLAGLAATPSIKLTESQIKCYMQQLLRGLEYCHSHGVLHRDIKG




SNLLVDNNGNLKIGDFGLATFFRTNQKQPLTSRVVTLWYRPPELLLGSSDYGAS




VDLWSSGCILAELFAGKPIMPGRTEVEQLHKIFKLCGSPSEEYWKKSKLPHATI




FKPQQPYKRCLLETFKDFPSSALGLLDVLLAVEPECRGTASSALQNEFFTSNPL




PSDPSSLPKYPSSKEFDARLRDEEARKHKATAGKARGLESIRKGSKESKVVPTS




NANADLKASIQKRQEQSNPRSTGEKPGGTTQNNFILSGQSAKPSLNGSTQIGNA




NEVEALIVPDRELDSPRGGAELRRQRSFMQRRASQLSRFSNSVAVGGDSHLDCS




REKGANTQWRDEGFVARCSHPDGGELAGKHDWSHHLLHRPISLFKKGGEHSRRD




SIASYSPKKGRIHYSGPLLPSGDNLDEMLKEHERQIQNAVRKARLDKVKTKREY




ADHGQTESLLCWANGR





274
CDK type D
MDPDPSPDPDPPKSWSIHTRREIIARYEILERVGSGAYSDVYRGRRLSDGLAVA
13
1467




LKEVHDYQSAFREIEALQILRGSPHVVLLHEYFWREDEDAVLVLEFLRSDLAAV




IADASRRPRDGGGGGAAALRAGEVKRWMLQVLEGVDACHRNSIVHRDLKPGNLL




ISEEGVLKIADFGQARILLDDGNVAPDYEPESFEERSSEQADILQQPETMEADT




TCPEGQEQGAITREAYLREVDEFKAKNPRHEIDKETSIFDGDTSCLATCTTSDI




GEDPFKGSYVYGAEEAGEDAQGCLTSCVGTRWFRAPELLYGSTDYGLEVDLWSL




GCIFAELLTLEPLFPGISDIDQLSRIFNVLGNLSEEVWPGCTKLPDYRTISFCK




IENPIGLESCLPNCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPI




SALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLKFTPTSTGFSIQFP





275
CDK type D
MDPDPSPSPDPPKSWSIHTRREIIARYEILERVGSGAYSDVYRGRRLSDGLAVA
113
1558




LKEVHDYQSAFREIEALQILRGSPHVVLLHEYFWREDEDAVLVLEFLRSDLAAV




IADASRRPRGGGVAPLRAGEGKRWMLQVLEGVDACHRNSIVHRDLKPGNLLISE




EGVLKIADFGQARILLDDGNVAPDYEPESFEERSSEQADILQQPETMEADTTCP




EGQEQGAITREAYLREVDEFKAKNPRHEIDKETSIYDGDTSCLATCTTSDIGED




PFKGSYVYGAEEAGEDAQGSLTSCVGTRWFRAPELLYGSTDYGLEVDLWSLGCI




FAELLTLEPLFPGISDIDQLSRIFNVLGNLSEEVWPGCTKLPDYRTISFCKIEN




PIGLESCLPNCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPISAL




QVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLKFTPTSTGFSIQFP





276
Cyclin A
MSNQHRRSSFSSSTTSSLAKRHASSSSSSLENAGKAFAAAAVPSHLAKKRAPLG
187
1686




NLTNLKAGDGNSRSSSAPSTLVANATKLAKTRKGSSTSSSIMGLSGSALPRYAS




TKPSGVLPSVNPSIPRIEIAVDPMSCSMVVSPSRSDMQSVSLDESMSTCESFKS




PDVEYIDNEDVSAVDSIDRKTFSNLYISDAAAKTAVNICERDVLMEMETDEKIV




NVDDNYSDPQLCATIACDIYQHLRASEAKKRPSTDFMDRVQKDITASMRAILID




WLVEVAEEYRLVPDTLYLTVNYIDRYLSGNVMNRQRLQLLGVACMMIAAKYEEI




CAPQVEEFCYITDNTYFKEEVLQMESSVLNYLKFEMTAPTVKCFLRRFVRAAQG




VNEVPSLQLECMANYIAELSLLEYDMLCYAPSLVAASAIFLAKFVITPSKRPWD




PTLQHYTLYQPSDLGNCVKDLHRLCFNNHGSTLPAIREKYSQHKYKYVAKKYCP




PSIPPEFFHNLVY





277
Cyclin A
MNKENAVGTKSEAPTIRITRSRSKALGTSTGMLPSSRPSFKQEQKRTVRANAKR
238
1653




SASDENKGTMVGNASKQHKKRTVLNDVTNIFCENSYSNCLNAAKAQTSRQGRKW




SMKKDRDVHQSGAVQIMQEDVQAQFVEESSKIKVAESMEITIPDKWAKRENSEH




SISMKDTVAESSRKPQEFICGEKSAALVQPSIVDIDSKLEDPQACTPYALDIYN




YKRSTELERRPSTIYMETLQKDVTPNMRGILVDWLVEVSEEYKLVPDTLYLTVN




LIDRSLSQKFIEKQRLQLLGVTCMLIASKYEEICPPRVEEFCFITDNTYTSLEV




LKMESRVLNLLHFQLSVPTVKTFLRRFVQAAQVSSEVPSVELEYLANYLAELTL




VEYSFLKELPSLMAASAVLLARWTLNQSDNPWNLTLEHYTKYKASELKAAVLAL




EDLQLNTSGSTLNAIREKYRQQKVNYSLLIHSKANHEIL





278
Cyclin B
MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRALSNINSNIIGAPPYPC
235
1539




AVNKRVLSEKNVNSENDLLNAAHRPITRQFAAQMAYKQQLRPEENKRTTQSVSN




PSKSEDCAILDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDVAEEPVT




DIDSGDKENQLAVVEYIDDLYMFYQKAEASSCVPPNYMDRQQDINERMRGILID




WLIEVHYKFELMDETLYLTVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEV




SVPVVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPYVFMRRFLKAAQS




DKKLELLSFFIIELSLVEYDMLKFPPSLLAASAIYTALSTITRTKQWSTTCEWH




TSYSEEQLLECARLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFLLDF




RL





279
Cyclin B
MASRPIVPVQARGEAAIGGGAGKAAIGGGAGKQQKKNGAAEGRNRKALGDIGNL
158
1618




VTVRGIEGKVQPHRPITRSFCAQLLANAQAAAAAENNKKQAVVNVNGAPSILDV




PGAGKRAEPAAAAAAAVAKAAQKKVVKPKQKAEVIDLTSDSERAIEAKKKQQHH




EPTKKEGEKSSRRNMPTLTSVLTARSKAACGMTKKPKEKVVDIDAGDAHNELAA




FEYIEDIYTYYKEAENESLPRNYMSSQPEINEKMRAILVDWLIEIHNKFDLMPE




TLYLTINIIDRFLSVKAVPRRELQLLGMGALFTASKYEEIWAPEVNDLVCIADR




AYSHEQVLAMEKTILGKLEWTLTVPTHYVFLVRFIKASLGDRKLENMVYFLAEL




GVMNYATLTYCPSMVAASAVYAARCTLGLTPLWNDTLKLHTGFSESQLMDCARL




LVGYHAKAKENKLQVVYKKYSSSQREGVALIPPAKALLCEGGGLSSSSSLASSS





280
Cyclin B
MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRALSVINQNLVGDRAYP
205
1530




CHVVNKRGHSKRDAVCGKDQVDPVHRPLTRKFAAQTASTQQHCIEEAKKPRTAV




QERNEFGDCIFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMEDIVEEEE




EEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTENCSCVSANYMAQQADINEKMR




SILIDWLIEVHDKFDLMHETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLAC




KYEEVSVPVVGDLILISDKAYTRKEVLEMESLMLNSLQFNMSVPTPYVFMRRFL




KAAESDKKLEVLSFFLIELSLVEYEMVKFPPSLLAAAAIFTAQCTLYGFKQWTK




TCEWHSNYTEDQLLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCEPAN




FLLGEMKNP





281
Cyclin B
MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRALSVINQNLVGDRAYP
174
1499




CHVVNKRGHSKRDAVCGKDQVDPVHRPLTRKFAAQTASTQQHCIEEAKKPRTAV




QERNEFGDCIFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMEDIVEEEE




EEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTENCSCVSANYMAQQADINEKMR




SILIDWLIEVHDKFDLMHETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLAC




KYEEVSVPVVGDLILISDKAYTRKEVLEMEKLMLNSLQFNMSVPTPYVFMRRFL




KAAESDKKLEVLSFFLIELSLVEYEMVKFPPSLLAAAAIFTAQCTLYGFKQWTK




TCEWHSNYTEDQLLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCEAAN




FLLGEMKNP





282
Cyclin D
MAMVQRQGHDPSSPQEQEDGPSSFLSDDALYCEEGRFEEDDGGGGGQVDGIPLF
94
1332




PSQPADRQQDSPWADEDGEEKEEEEAELQSLFSKERGARPELAKDDGGAVAARR




EAVEWMLMVRGVYGFSALTAVLAVDYLDRFLAGFRLQRDNRPWMTQLVAVACLA




LAAKVEETDVPLLVELQEVGDARYVFEAKTVQRMELLVLSTLGWEMHPVTPLSF




VHHVARRLGASPHHGEFTHWAFLRRCERLLVAAVSDARSLKHLPSVLAAAAMLR




VIEEVEPFRSSEYKAQLLSALHMSQEMVEDCCRFILGIAETAGDAVTSSLDSFL




KRKRRCGHLSPRSPSGVIDASFSCDDESNDSWATDPPSDPDDNDDLNPLPKKSR




SSSPSSSPSSVPDKVLDLPFMNRIFEGIVNGSPI





283
Cyclin D
MEASYQPHHHGHLRQHDPSSSQQEEQVPFDALYCSEEHWGEEDEEEGLASDGLL
176
1342




SEERDHRLLSPRALLDQDLLWEDEELASLFSKEEPGGMRLNLENDPSLADARRE




AVEWIMRVHAHYAFSALTALLAVNYWDRFTCSFALQEDKPWMTQLSAVACLSLA




AKVEETQVPLLIDFQVEDSSPVFEAKNIQRMELLVLSSLEWKMNPVTPLSFLDY




MTRRLGLTGHLCWEFLRRCENVLLSVISDCRFTCYLPSVIAASTMLHVINGLKP




RLDVEDQTQLLGILAMGMDKIDACYKLIDDDHALRSQRYSHNKRKFGSVPGSPR




GVMELCFSSDGSNDSWSVAASVSSSPEPHSKKSRAGEEAEDRLLRGLEGEEDDP




ASADIFSFPH





284
Cyclin D
MALQEEDTRRHYPTAPPFSPDGLYCEDETFGEDLADNACEYAGGGARDGLCEIK
150
1283




DPTLPPSLLGQDLFWEDGELASLVSRETGTHPCWDELISDGSVALARKDAVGWI




LRVHGHYGFRPLTAMLAVNYLDRFFLSRSYQRDRPWISQLVAVACLSVAAKVEE




TQVPILLDLQVANAKFVFESRTIQRMELLLMSTLDWRMNSVTPISFFDHILRRF




GLTTNLHRQFFWMCERLLLSVVADVRLASFLPSVVATAAMLYVNKEIEPCICSE




FLDQLLSLLKINEDRVNECYELILELSIDHPEILNYKHKRKRGSVPSSPSGVID




TSFSCDSSNDSWGVASSVSSSLEPRFKRSRFQDQQMGLPSVNVSSMGVLNSSY





285
Cyclin-dependent kinase regulatory subunit
MGQIQYSEKYFDDTYEYRHVVLPPDVAKLLPKNRLLSENEWRAIGVQQSRGWVH
101
367




YAIHRPEPHIMLFRRPLNYQQQQENQAQQNMLAK





286
Histone acetyltransferase
MGSIDPPKAEQNGTAAAAVADPGQKPGAGDAMPPPPPVKHSNGTAAEPDVATKR
9
1352




RRMSVLPLEVGTRVMCRWRDGKYHPVKVIERRKLNPGDPNDYEYYVHYTEFNRR




LDEWVKLEQLDLNSVETVVDEKVEDKVTGLKMTRHQKRKIDETHVEGHEELDAA




SLREHEEFTKVKNIATIELGRYEIETWYFSPFPPEYNDCSKLYFCEFCLNFMKR




KEQLQRHMKKCDLKHPPGDEIYRSGTLSMFEVDGKKNKVYGQNLCYLAKLFLDH




KTLYYDVDLFLFYVLCECDDRGCHMVGYFSKEKHSEESYNLACILTLPPYQRKG




YGKFLIAFSYELSKKEGKVGTPERPLSDLGLLSYKGYWTRVLLDILKKHKANIS




IKELSDMTAIKADDILNTLQSLDLIQYRKGQHVICADPKVLDRHLKAAGRGGLE




VDVSKLIWTPYREQG





287
Histone acetyltransferase
MAQKHSTAPDPAAEPKKRRRVGFSGIDAGVDPNGCFKVYLVSREEEVGAPDSFC
89
1486




LDPVDLSHFFEEEDGKIYGYEGLKISVWVSCVSFHSYAEIAFESKSDGGKGITD




LNTALKNMFGETLVDNKDDFLQTFSKETQFIRSTVSAGEILKHKHSDDHVNDSV




SNLKVGSDVEAVRMLMGDMTAGHLYSRLVPLVLLLVDGSSPIDVTDSSWELYLL




IQKTSDQQGNFHDRLLGFAAVYRFYHYPDSSRLRLGQILVLPLYQRKGYGRYLL




EVLNNVAIADDVYDFTIEEPVDNLQHLRTCIDVQRLLSFDKVQQAVNSTVSQLK




QGKLSKKTYIPRLLPPPSVVEDARKRFKINKKQFLQCWEILVYLGLDPADKSIQ




DYFSVISNRVRADILGKDSETAGKKVIEVPSDFDPEMSFVMHRAKAGGEANGIQ




VEDNQNKQEEQLQQLIDERLKDIKLIAEKVTQK





288
Histone acetyltransferase
MAQKHSTAPDPAAEPKKRRRVGFSGIDAGVDPNGCFKVYLVSREEEVGAPDSFC
89
1477




LDPVDLSHFFEEEDGKIYGYEGLKISVWVSCVSFHSYAEIAFESKSDGGKGITD




LNTALKNMFGETLVDNKDDFLQTFSKETQFIRSTVSAGEILKHKHSDGHVNDSV




SNLKVGSDVEAVRMLMGDMTAGHLYSRLVPLVLLLVDGSNPIDVTDSSWELYLL




IQKTSDQQGNFHDRLLGFAAVYRFYHYPDSLRLRLGQILVLPLYQRKGYGHYLL




EVLNNVAIADDVYDFTIEEPVDNLQHLRTCIDVQRLLSFDKVQQAVNSTVSQLK




QGKLSKKTYIPRLLPPPSVVEDARKRFKINKKQFLQCWEILVYLGLDPADKSIQ




DYFSVISNRVRADILGKDSETAGKKVIEVPSDFDPEMSFVLHRAKAGGETNGIQ




VEDNQNKQEEQLQQLIDERLKDIKLIAQKVSRK





289
Histone deacetylase
MALPMEFWGVEVKAGQPLKVNPGNAKILHLSQASLGECKSSKGNESVPLHVKFG
160
1062




DQKLVLGTLSTENFPQLAFDLVFEKEFELSHNWKSGSVYFCGYKSVVHDDDDEF




SDLESDSEEEDLPMIGVENGKVAAQASAKTATASANASKVESSGKQKARIPQPM




KVDEDDSDEDDDDEDEDESDEEGVDGEADSDEEEDESDEEETPKKAEIGKKRAA




DSATKTPVPAKKSKLPTPQKTDGKKGGHTATPHPAKQAGKNPANSANKSQSPKS




AGQVSCKSCSKTFNSDGALQSHSKAKHGGK





290
Histone deacetylase
MEFWGVEVKAGQPLKVNPGNAKILHLSQASLGECKSSKGNESVPLHVKFGDQKL
172
1077




VLGTLSTENFPQLAFDLVFEKEFELSHNWKSGSVYFCGYKSVVHDDDDEFSDLE




SDSEEEDLPMIGVENGKVAAQASAKTATASANASKVESSGKQKASIPQPMKVDE




DDSDEDDDEDDDDEDESDEGVDGEADSDEEEDESDEEETPKKAEIGKKRAADSA




TKTPVPAKKSKLPTPQKTDGKKGGHTATPHPAKQAGKNPANSANKSQSPKSAGQ




VSCKSCSKTFNSDGALQSHSKAKHGGK





291
Histone deacetylase
MEFWGVEVKSGEPLNVEPGAETVVHLSQACLGETKEKTKESVLLYVHIGVQKLV
66
989




LGTLSADKFPQIPFDLVFEKSFKLSHNWKNGSVFFSGYKTLLPCGSDADSPYSD




SDTDEGLPINVTAQADVPAKKAPVTANANAAKPNLASAKQKVKIVESNEDGKNE




GDDDEDADVSSDDDAEDDSGDEDMVDGGDESSDEDDDDSEEGESSEEEEPKAQP




SKKRPADSVLKTPASDKKSKLETPQKTDGKKASEHVATPYPSKQAGKAIASKGQ




AKQQTPNSNEFSCKPCNRSFKSDQALQSHNKAKHGGS





292
Histone deacetylase
MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYLLQHMQVLKPVPARDRDLCRFHA
111
1541




DDYVAFLRSITPETQQDQLRQLKRFNVGEDCPVFDGLHSFCQTYAGGSVGGAVK




LNHGLCDIAINWAGGLHHAKKCEASGFCYVNDIVLGILELLKQHERVLYVDIDI




HHGDGVEEAFYTTDRVMTVSFHKFGDYFPGTGDIRDIGYGKGKYYSLNVPLDDG




IDDESYHSLFKPIIGKVMEVFKPGAVVLQCGADSLSGDRLGCFNLSIKGHAECV




RYMRSFNVPVLLLGGGGYTIRNVARCWCYETGVALGLEVDDKMPQHEYYEYFGP




DYTLHVAPSNMENKNSRQLLEEIRSKLLENLSKLQHAPSVPFQERPPDTELPEA




DEDQEDPDERWDPDSDMDVDEDRKPLPSRVKRELIVEPEVKDQDSQKASIDHGR




GLDTTQEDNASIKVSDMNSMITDEQSVKMEQDNVNKPSEQIFPK





293
Histone deacetylase
MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYYGQGHPMKPHRIRMTHALLAHYG
116
1615




LLQHMQVLKPVPARDRDLCRFHADDYVAFLRSITPETQQDQLRQLKRFNVGEDC




PVFDGLHSFCQTYAGGSVGGAVKLNHGLCDIAINWAGGLHHAKKCEASGFCYVN




DIVLGILELLKQHERVLYVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGT




GDIRDIGYGKGKYYSLNVPLDDGIDDESYHSLFKPIIGKVMEVFKPGAVVLQCG




ADSLSGDRLGCFNLSIKGHAECVRYMRSFNVPVLLLGGGGYTIRNVARCWCYET




GVALGLEVDDKMPQHEYYEYFGPDYTLHVAPSNMENKNSRQLLEDIRSKLLENL




SKLQHAPSVPFQERPPDTELPEADEDQEDPDERWDPDSDMDVDEDRKPLPSRVK




RELIVEPEVKDQDSQKASIDHGRGLDTTQEDNASIKVSDMNSMITDEQSVKMEQ




DNVNKPSEQIFPK





294
Histone deacetylase
MRPKDRISYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHTKMEIYRPHK
155
1453




AYPAELAQFHSPDYVEFLHRITPDTQHLFPNDLAKYNLGEDCPVFENLFEFCQI




YAGGTIDAARRLNNQLCDIAINWAGGLHHAKKCEASGFCYINDLVLGILELLKY




HARVLYIDIDVHHGDGVEEAFYFTDRVMTVSFHKFGDMFFPGTGDVKEIGGKEG




KFYAINVPLKDGIDDTSFTRLFKAIISKVVETYQPGAIVLQCGADSLAGDRLGC




FNLSIDGHSECVRFVKKFNLPLLVTGGGGYTKENVARCWVVETGVLLDTELPNE




IPENEYFKYFAPDYSLKIPRGNIVLENLNSKSYLSAIKVQVLENLENIQHAPSV




QMQEVPPDFYIPDFDEDEQNPDERMDQHTQDKQIQRDDEYYDGDNDNDHNMDDS





295
Histone deacetylase
MTVAEDFHVNNRSKMVSQATPESRLTGGEDDNSLHNQVDELLCQELPERQVILE
228
2033




FEGTRPKPYFSDHNGGENSALGVRATEDDLNSDVEAEEKQKEMITLEDMYKNDGT




LYDDDEDDSDWEPVKRQVELMRWFCTNCTMVNVEDVFLCDICGEHRDSGILRHG




FYASPFMQDVGAPSVEAEVQESREDHARSSPPSSSTVVGFDEKMLLHSEVEMKS




HPHPERADRLQAIAASLATAGIFPGRCRSLPVREITKEELQMVHSSEHVDAVEM




TSHMFSSYFTPDTYANEHSARAARIAAGLCADLASTIISGRSKNGFALVRPPGH




HAGIKHAMGFCLHNNAAVAALAAQGAGAKKVLIVDWDVHHGNGTQEIFDGNKSV




LYISLHRHEGGNFYPGTGAAHEVGTMGAEGYCVNIPWSRRGVGDNDYVFAFHHI




VLPIASAFAPDFTIISAGFDAARGDPLGCCDVTPAGYAQMTHMLSALSGGKLLV




ILEGGYNLRSISSSAVAVIKVLLGDSPISEIADAVPSKAGLRTVLEVLKIQRSY




WPSLESIFWELQSQWGIFLVDNRRKQIRKRRRVLVPIWWKWGRKSVLYHLLNGH




LHVKTKR





296
Histone deacetylase
MAAAPSSPPTNRVDVFWHDGMLSHDTGRGVFDTGSDPGFLDVLEKHPENPDRVR
110
1258




NMVSILKRGPISPFISWHTATPALISQLLSFHSPEYINELVEADKNGGKVLCAG




TFLNPGSWDAALLAAGNTLSAMKYVLDGKGKIAYALVRPPGHHAQPSQADGYCF




LNNAGLAVRLALDSGCKRVVVVDIDVHYGNGTAEGFYQSSDVLTISLHMNHGSW




GPSHPQSGSVDELGEDEGYGYNMNIPLPNGTGDRGYEYAVTELVVPAVESFKPE




MVVLVVGQDSSAFDPNGRQCLTMDGYRAIGRTIRGLADRHSGGRILIVQEGGYH




VTYSAYCLHATVEGILDLPDPLLADPIAYYPEDEAFPVKVVDSIKRYLVDKVPF




LKEH





297
Histone deacetylase
MVESSGGASLPSVGQDARKRRVSYFYEPTIGDYYYGQGHPMKPHRIRMAHNLIV
50
1462




HYYLHRRMEISRPFPAATTDIRRFHSEDYVTFISSVTPETVSDPAFSRQLKRFN




VGEDCPVFDGIFGFCQASAGGSMGAAVKLNRGDSDIALNWAGGLHHAKKSEASG




FCYVNDIVLGILELLKVHKRVLYVDIDVHHGDGVEEAFYTTDRVMTVSFHKFGD




FFPGSGHIKDTGAGPGKNYALNVPLNDGIDDESFRGMFRPIIQKVMEVYQPDAV




VLQCGADSLSGDRLGCFNLSVKGHADCLRFLRSFNVPLMVLGGGGYTMRNVARC




WCYETAVAVGVEPENDLPYNEYYEYFGPDYTLHVEPCSMENLNAPKDLERIRNM




LLEQLSRIPHAPSVPFQMTPPITQEPEEAEEDMDERPKPRIWNGEDYESDAEED




KSQHRSSNADALHDENVEMRDSVGENSGDKTREDRSPS





298
MAT1 CDK-activating kinase assembly factor
MVVPSSNPHNREMAIRRRMASTFNKREDDFPSLREYNDYLEEVEEMTFNLIEGV
176
739




DVPTIEAKIAKYQEENAEQIMINRAKKAEEFAAALAASKGLPPQTDPDGALNSQ




AGLSVGTQGQYAPAIAGGQPRPTGMAPQPVPLGTGLDTHGYDDEEMIKLRAERG




GRAGGWSIELSKKRALEEAFGSLWL





299
Peptidylprolyl isomerase
MAAIISCHHYHSCCSSLIASKWVGARIPTSCFGRSSTQSNNAASVRQFVTRCSS
150
1529




SPSSRGQWQPHQNGEKGRSFSLRECAISIALAVGLVTGVPSLDMSTGNAYAASP




ALPDLSVLISGPPIKDPEALLRYALPINNKAIREVQKPLEDITDSLKVAGLRAL




DSVERNVRQASRVLKQGKNLIVSGLAESKKDHGVELLDKLEAGMDELQQIVEDG




NRDAVAGKQRELLNYVGGVEEDMVDGFPYEVPEEYKNMPLLKGRAAVDMKVKVK




DNPNLEECVFRIVLDGYNAPVTAGNFVDLVERHFYDGMEIQRADGFVVQTGDPE




GPAESFIDPSTEKPRTIPLEIMVDGEKAPVYGATLEELGLYKAQTKLPFNAFGT




MAMARDEFEDNSASSQIFWLLKESELTPSNANILDGRYAVFGYVTENQDFLADL




KVGDVIESVQVVSGLDNLANPSYKIAG





300
Peptidylprolyl isomerase
MAGEDFDIPPADEMNEDFDLPDDDDDAPVMKAGDEKEIGKQGLKKKLVKEGDAW
247
1971




ETPDNGDEVEVHYTGTLLDGTQFDSSRDRGTPFKFTLGQGQVIKGWDQGIKTMK




KGENAIFTIPPELAYGEAGSPPTIPPNATLQFDVELLSWTSVKDICKDGGIFKK




ILVEGEKWENPKDLDEVLVKYEFQLEDGTTIARSDGVEFTVKEGHFCPAVAKAV




KTMKKGEKVLLTVKPQYGFGEKGKPASGDEGAVPPNATLQITLELVSWKTVSEV




TDDKKVIKKILKEGEGYERPNEGAVVEVKLIGKLQDGTVFVKKGHDDCEELFKF




KIDEEQVVDGLDKAVMNMKKGEVALLTVAPEYAFGSSESKQDLAVVPPSSTVYY




EVELVSFVKDKESWDMNTEEKIEAAGKKKEEGNVIFKAGKYAKASKRYEKAVKY




IEYDTSFSEDEKKQAKALKVACNLNDAACKLKLKDYNQAEKLCTKVLELDSRNV




KALYRRAQAYIELSDLDLAEFDIKKALEIDPHNRDVKLEYKVLKEKVKEFNKKD




AKFYGNMFAKMSKLEPVEKTAAKEPEPMSIDSKA





301
Peptidylprolyl isomerase
MSTVYVLEPPTKGKVVLNTTHGPLDVELWPKEAPKAVRNFVQLCLEGYYDNTIF
136
1644




HRIIKDFLVQGGDPTGSGTGGESIYGDAFSDEFHSRLRFKHRGLVACANAGSPH




SNGSQFFITLDRCDWLDRKNTIFGKITGDSIYNLSGLAEVETDKSDRPLDPPPK




IISVEVLWNPFEDIVPRAPVRSLVPTVPDVQNKEPKKKAVKKLNLLSFGEEAEE




EEKALVVVKQKIKSSHDVLDDPRLLKEHIPSKQVDSYDSKTARDVQSVREALSS




KKQELQKESGAEFSNSFREIADDEDDDDDDASFDARMRRQILQKRKELGDLPPK




PKPKSRDGISARKERETSISRDKDDDDDDDQPRVEKLSLKKKGIGSEARGERMA




NADADLQLLNDAERGRQLQKQKKHRLRGREDEVLTKLETFKASVFGKPLASSAK




VGDGDGDLSDWRSVKLKFAPEPGKDRMTRNEDPNDYVVVDPLLEKGKEKFNRMQ




AKEKRRGREWAGKSLT





302
Peptidylprolyl isomerase
MASAISMHSSGLLLLQGTNGKDVTEMGKAPASSRVANMQQRKYGATCCVARGLT
48
836




SRSHYASSLAFKQFSKTPSIKYDRMVEIKAMATDLGLQAKVTNKCFFDVEIGGE




PAGRIVIGLFGDDVPKTVENFRALCTGEKGFGYKGCSFHRIIKDFMIQGGDFTR




GNGTGGKSIYGSTFEDENFALKHVGPGVLSMANAGPSTNGSQFFICTVKTPWLD




NRHVVFGQVVDGMDVVQKLESQETSRSDVPRQPCRIVNCGELPLDG





303
Peptidylprolyl isomerase
MAASFTALSNVGSLSSPRNGSEIRRFRPSCNVAASVRPPPLKAGLSASSSSSFS
49
822




GSLRLIPLSSSPQRKSRPCSVRASAEAAAAQSKVTNKVYLDISIGNPVGKLVGR




IVIGLYGDDVPQTAENFRALCTGEKGFGYKGSTVHRVIKDFMIQGGDFDKGNGT




GGKSIYGRTFKDENFKLSHVGPGVVSMANAGPNTNGSQFFICTVKTPWLDQRHV




VFGQVLEGMDIVRLIESQETDRGDRPRKRVVVSDCGELPVV





304
Peptidylprolyl isomerase
MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRYEGVLAETGEVFDSTH
185
751




EDNTLFSFEIGKGSVISAWDTALRTMKVGEVAKITCKPEYAYGSTGSPPDIPPD




ATLIFEVELVACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEKKRREEA




KAAAAARVQAKLDAKKGHGKGKGKAK





305
Peptidylprolyl isomerase
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH
103
621




YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA




NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP




VVVADCGQLS





306
Peptidylprolyl
MPNPKVFFDMTIGGAAAGRVVMELYADTTPRTAENFRALCTGEKGVGRSKKPLH
41
559



isomerase
YKGSKFHRVIPSFMCQGGDFTAGNGTGGESIYGVKFADENFIKKHTGPGILSMA




NAGPGTNGSQFFICTTKTEWLDGKHVVFGKVVEGMEVVKAIEKVGSSSGRTSKP




VVVADCGQLP





307
Peptidylprolyl
MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRYEGVLAETGEVFDSTH
127
693



isomerase
EDNTLFSFEIGKGSVISAWDTALRTMKVGEVAKITCKPEYAYGSTGSPPDIPPD




ATLIFEVELVACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEKKRREEA




KAAAAARVQAKLDAKKGHGKGKGKAK





308
Peptidylprolyl
MATARSFFLCALLLLATLYLAQAKKSEDLKEVTHKVYFDVEIAGKPAGRIVMGL
28
639



isomerase
YGKAVPKTAENFRALCTGEKGTGKSGKPLHYKGSSFHRIIPSFMLQGGDFTLGD




GRGGESIYGEKFADENFKLKHTGPGLLSMANAGPDTNGSQFFITTVTTSWLDGR




HVVFGKVLSGMDVVYKVEAEGRQSGTPKSKVVIADSGELPL





309
Peptidylprolyl
MMRREISVLLQPRFVLAFLALAVLLLVFAFPFSRQRGDQVEEEPEITHRVYLDV
135
812



isomerase
DIDGQHLGRIVIGLYGEVVPRTVENFRALCTGEKGKSANGKKLHYKGTPFHRII




SGFMIQGGDVIYGDGKGYESIYGGTFADENFRIKHSHAGIISMVNSGPDSNGSQ




FFITTVKASWLDGEHVVFGRVIQGMDTVYAIEGGAGTYNGKPRKKVIIADSGEI




PKSKWDEER





310
Peptidylprolyl
MWATAEGGPPEVTLETSMGSFTVELYFKHAPRTSRNFIELSRRGYYDNVKFHRI
119
613



isomerase
IKDFIVQGGDPTGTGRGGESIYGKKFEDEIKPELKHTGAGILSMANAGPNTNGS




QFFITLAPCPSLDGKHTIFGRVCRGMEIIKRLGSVQTDNNDRPIHDVKILRTSV




KD





311
Peptidylprolyl
MSNPKVFFDILIGKMKAGRVVMELFADVTPKTAENFRALCTGEKGIGRSGKPLH
38
562



isomerase
YKGSTFHRIIPNFMCQGGDFTRGNGTGGESIYGMKFADENFKIKHTGLGVLSMA




NAGPDTNGSQFFICTEKTPWLDGKHVVFGKVIDGYNVVKEMESVGSDSGSTRET




VAIEDCGQLSEN





312
Peptidylprolyl
MDDDFEFPASSNVENDDDDGMDMDDMGGDVPEEEDPVASPAVLKVGEEREIGKA
109
1872



isomerase
GFKKKLVKEGEGWETPSSGDEVEVHYTGTLLDGTKFDSSRDRGTPFKFKLGRGQ




VIKGWDEGIKTMKKGENAIFTIPPELAYGESGSPPTIPPNATLQFDVELLSWSS




VKDICKDGGILKKVLVEGEKWDNPKDLDEVFVKYEASLEDGTLISKSDGVEFTV




GDGYFCAALAKAVKTMKKGEKVLLTVMPQYAFGETGRPASGDEAAVPPDASLQI




MLELVSWKTVSDVTKDKKVLKKTLKEGEGYERPNDGAAVQVRLCGKLQDGTVFV




KKDDEEPFEFKIDEEQVIDGLDRAVKNMKKGEVALVTIQPEYAFGPTESQQDLA




VVPANSTVYYEVELLSFVKEKESWEMNNQEKIEAAARKKEEGNAAFKAGKYVRA




SKRYEKAVRFIEYDSSFSDEEKQQAKTLKNTCNLNDAACKLKLKDFKEAEKLCT




KVLEGDGKNVKALYRRAQAYIQLVDLDLAEQDIKKALEIDPNNRDVKLEYKILK




EKVREYNKRDAQFYGNMFAKMNKLEHSRTAGMGAKHEAAPMTIDSKA





313
Peptidylprolyl
MAKPRCFMDISIGGELEGRIVGELYTDVAPKTAENFRALCTGEKGIGPHTGAPL
74
1159



isomerase
HYKGVRFHRVIKGFMVQGGDISAGDGTGGESIYGLKFEDENFDLKHERKGMLSM




ANSGPNTNGSQFFITTTRTSHLDGKHVVFGRVVKGMGVVRSVEHVTTAAGDCPT




VDVVIADCGEIPAGADDGIRNFFKDGDTYPDWPADLDESPAELSWWMDAVDSIK




AFGNGSYKKQDYKMALRKYRKALRYLDICWEKEGIDEVESSSLRKTKSQIFTNS




SACKLKLCDLKGALLDAEFAVRDGENNAKAYFRQGQAHMELNDIDAAAESFSKA




LELEPNDVGIKKELNAAKKKIFERREQEKRAYRKMFL





314
Peptidylprolyl
MTKRKNPLVFLDVSIDGDPVERIVIELFADTVPRTAENFRSLCTGEKGVGKTTG
54
2045



isomerase
KPLHYKGSYFHRIIKGFMAQGGDFSNGNGTGGESIYGGKFADENFKLAHDGPGL




LSMANGGPNTNGSQFFIIFKRQPHLDGKHVVFGKVMRGMEVVKKIEQVGSANGK




PLQPVKIVDCGETSETGTQDAVVEEKSKSATLKAKKKRSARDSSSESRGKRRQR




KSRKERTRKRRRYSSSDSYSSESSDSDSESYSSDTESESKSHSESSVSDSSSSD




GRRRKRKSTKREKLRRQRGKDSRGEQKSARYDKKSRHKSADSSSDSESESSSRS




RSRDDKKKSSRRESARSVSKLKDAEANSPENLESPRDREIKKVEDNSSHEEGEF




SPKNDVQHNGHGTDAKFGKYDDQRPRSDGSKKSSGSMRDSPKRLANSVPQGSPS




SSPAHKASEPSSSIRARNPSRSPAPDGNSKRIRKGRGFTERFSYARRYRTPSPE




DVTYRPYHYGRRNFHDRRNDRYSNYRSYSERSPHRRYRSPPRGRSPPRYQRRRS




RSRSVSRSPGGNKGRYRGRDQSRSRSRSRSRSPRRGSSPANKQLPLSERLKSRL




GTRVDEHSPRRRRSSSRSHDSSRSRSPDEVPDKHEGKAAPVSPARSRSSSPSGR




GLVSYGDASPDSGIN





315
Peptidylprolyl
MSVLLVTSLGDIVVDLHADRCPLTCKNFLKLCRIKYYNGCVFHTVQKDFTAQTG
53
1879



isomerase
DPTGTGTGGDSVYKFLYGDQARFFMDEIHLDLKHSKTGTVAMASGGENLNASQF




YFTLRDDLDYLDGKHTVFGEVAEGLETLTRINEAYVDEKGRPYKNIRIRHTYIL




DDPFDDPPQLAELIPDASPEGKPKDEVVDDVRLEDDWVPLDEQLGPAQLEEAIR




AKEAHSRAVVLESIGDIPDAEIKPPDNVLFVCKLNPVTEDEDLHTIFSRFGTVV




SADVIRDFKTGDSLCYAFIEFENKDSCEQAYFKMDNALIDDRRIKVDFSQSVAK




LWSQFKRKDSQAAKGKGCFKCGAPDHMARECPGSSTRQPLSKYILKEDNAQRGG




DDSRYEMVFDEDAPESPSHGKKRRGRDDRDDRHKMSRQSVEETKFNDREGGHSV




DKHRQSERSKHREDEMSRDSKASEAGRRRIDRDFPEEERDGEKYTESHRDRDGK




RGDYRDYRKGEADVQTHGDRRGDENYRRKSAAYDDGHEGAGAARRKDSNDDHHA




YRRGYGDSRKGTRDEDDDGRGRRDDPSYRRSSGHKDSSNGGREEQKYRSGETDG




KSHPERSHRGDRRR





316
Peptidylprolyl
MRPFNGGSSIACLVLVIAAGALAESQGPHLGSARVVFQTNYGDIEFGFFPGVAP
7
690



isomerase
RTVDHIFKLVRLGCYNTNHFFRVDKGFVAQVADVANGRTAPMNDEQRTEAEKTI




VGEFSNVKHVRGILSMGRYDDPDSAQSSFSILLGDAPHLDGKYAIFGRVTKGDE




TLKKLEQLPTRREGMFVMPTERITILSSYYYDTGAESCEEENSTLRRRLAASAV




EVERQRMKCFP





317
Peptidylprolyl
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH
83
601



isomerase
FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA




NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP




VVIADSGQLA





318
Peptidylprolyl
MRFTSITSAIALFAAAASALDKPLDIKVDKAVECSRKTKAGDKIQVHYRGTLEA
125
535



isomerase
DGSEFDASYKRGQPLSFHVGKGQVIKGWDQGLLDMCPGEKRTLTIQPDWGYGSR




GMGPIPANSVLIFETELVEIAGVAREEL





319
Peptidylprolyl
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH
55
573



isomerase
YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA




NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP




VVVADCGQLS





320
Peptidylprolyl
MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSDPKLVHRKVGEEKKKP
147
842



isomerase
DDLEEVTHKVFFDVEIGGKPAGRIVMGLFGKTVPKTVENFRALCTGEKGIGKSG




KPLNYKGSQFHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLKHTDAGR




LSMTNAGPDTNGSQFFITTVTTSWLDGRHVVFGKVLSGMDVVHKIEAEGGQSGQ




PKSIVVISDSGELDL





321
Peptidylprolyl
MAVTLHTNLGDIKCEIFCDEVPKAAEHNARGILSMANSGPNTNGSQFFIAYAKQ
167
487



isomerase
PHLNGLYTIFGRVIHGFEVLDIMEKTQTGPGDRPLAEIRLNRVTIHANPLAG





322
Peptidylprolyl
MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSDPKLVHRKVGEEKKKP
195
890



isomerase
DDLEEVTHKVFFDVEIGGKPAGRIVMGLFGKTVPKTVENFRALCTGEKGIGKSG




KPLNYKGSQFHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLKHTDAGR




LSMANAGPDTNGSQFFITTVTTSWLDGRHVVFGKVLSGMDVVHKIEAEGGQSGQ




PKSIVVISDSGELDL





323
Peptidylprolyl
MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH
68
586



isomerase
YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA




NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP




VVVADCGQLS





324
Retinoblastoma
MSPVAANAMEEAAEPEVPAPVTPSKDDADTDAAVSRFLGFCKSKLGLAEGNCVQ
182
3265



related protein
SSTLLRKTAHVLRSSGTVIGTGTAEEAERYWFAFVLYTVRRVGERKAEDEQNGS




DETEVPLSRILKASVLNLIDFFKEIPQFVIKAGAIVSGIYGANWDSRLEAREMQ




TNYVHLCILCKFYKRICGEFFILNDAKDDMKSADSSTSDPVIMYQPFGWLLFLA




LRIHALSRFKDLVSSTNALVSVLAILIIHLPTRFRKFSISDSSQLVKRSEKGVD




LVGSLAYRYDTSEDEIKRTLEKANNVIAEILGITPPPASECKAENLENVDTDGL




IYFGNLMEETSLSSILSTLEKIYEDATRNDSEFDERVFINDDDSLLVSGSLSGA




AINLTGAKRKYDSFASPAKTITRPLSPSRSPASHINGIIGGTNLRITATPVATA




MTTAKWLRTFVSPLPSKPSTDLQGFLASCDRDVTSDVIRRANIILEAIFPNSPI




GERTVTGGLQNANLMDNMWAEQRRLEALKLYYRVLEAMCRAEAQILHSNNLTSL




LTNERFHRCMLACSAELVLATHKTVTMLFPAVLERTGITAFDLSKVIESFVRHE




ETLPRELRRHLNTLEERLLENMVWERGSSMYNSLVVARPALAPEINRLGLLPEP




MPSLDAIALLINFSSSGLPQSPVQKHEASPGQNGDIRSPKRISTEYRSVLVERN




FTSPVKDRLLALSNIKSKLPPPPLQSAFASPTRPHPGGGGETCAETAIHIFFSK




ITKLAAVRINAMLERLQLSQQIKEGVYCLFQQILSQRTNLFFNRHIDQVILCCF




YGVAKINQINLTFREIIYNYRKQPQCKPQVFRNVFVDWSTRRNGKAGNEHVDII




SFYNEIFIPSVKPLLVELGPTGATTRTNRTSEVGNKNDAQCPGSPKISSFPTLP




DMSPKKVSASHNVYVSPLRSSKMDASISHSSKSYYACVGESTHAYQSPSKDLVA




INSRLNGNRKVRGTLNFDDVDAGLVSDSMVANSLYLQNGSSMSSSTAKSSEKPES





325
WD40 repeat
MRPILMKGHERPLTFLKYNREGDLLFSCAKDHTPTVWFADNGERLGTYRGHNGA
165
1145



protein
VWCCDVSRDSMRLITGSADTTAKLWSVQNGTQLFTFNFDSPARSVDFSIGDKLA




VITTDPFMELPSAIHVKRIARDPADQASESVLVLRGHQGRIARAVWGPLNKTII




SAGEDAVIRIWDSETGKLLRESDKETGHKKAVTSLMKSVDGSHFVTGSQDKSAK




LWDIRTLTLIKTYVTERPVNAVTMSPLLDHVVLGGGQDASAVTMTDHRAGKFEA




KFFDKILQEEIGGVKGHFGPINALAFNPDGKSFSSGGEDGYVRLHHFDPDYFNI




KI





326
WD40 repeat
MDKKRTVVPLVCHGHSRPVVDLFYSPITPDGFFLISASKDSSPMLRNGETGDWI
529
1569



protein
GTFEGHKGAVWSCCLDTNALRAASGSADFSAKLWDALSGDELHSFEHKHIVRSC




AFSEDTHLLLTGGVEKILRIFDLNRPDAPPREVDNSPGSIRTVAWLHSDQTILS




SCTDIGGVRLWDVRSGKIVQTLETKSPVTSSEVSQDGRYITTADGSTVKFWDAN




HFGLVKSYNMPCNIESASLEPKLGNKFIAGGEDMWVHIFDFHTGEEIGCNKGHH




GPVHCVRFSPGGESYASGSEDGTIRIWQTGPANNVEGDANPSNGPVTGKAKVGA




DEVTRKVEDLQIGKEGKDWREG





327
WD40 repeat
MAEGLILKGTMRAHTDMVTAIAIPIDNSDMVVTSSRDKSIILWHLTKEEKVYGV
156
1136



protein
PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV




LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHTDWVSCVRFSPNTL




QPTIVSASWDRTIKVWNLTNCKLRNTLAGHNGYVNTVAVSPDGSLCASGGKDGV




ILLWDLAEGKRLYNLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED




LRVDLKNEADKTDGTTTAASNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGTG




RY





328
WD40 repeat
MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKSIILWHLTKEDKVYGV
90
1073



protein
PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV




LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVRFSPNTL




QPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGYVNTVAVSPDGSLCASGGKDGV




ILLWDLAEGKKLYSLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED




LRVDLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGI




GRY





329
WD40 repeat
MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKSIILWHLTKEDKVYGV
66
1049



protein
PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV




LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVRFSPNTL




QPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGYVNTVAVSPDGSLCASGGKDGV




ILLWDLAEGKKLYSLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED




LRVDLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGI




GRY





330
WD40 repeat
MSGVPAPPFATTTPENGTMSSNSPAFHRDSDDDDDQGEVFLDDSDIIHEVAVDD
227
1512



protein
EDLPDADDEADEAEEADDSLHIFTGHNGEVYSLACSPTDATLVATGAGDDKGFL




WRIGHGDWAVELQGHKDSISSLAFSLDGQLLASGSLDGVIQIWDVPSGNLKGTL




DGPGGGIEWIRWHPKGHIILAGSEDSTVWMWNADKMAYLNMFSGHGNSVTCGDF




TPDGKTICTGSDDATLRIWNPKSGENIHVVKGHPYHAEGLTSMAISSDSGLAIT




GAKDGSVRIVNISSGRVVSSLDAHADSVEFVGLALSSPWAATGSLDQKLIIWDL




QHSSPRATCDHEDGVTCLSWVGASRFLASGCVDGKVRVWDSLSGDCVRTFHGHS




DAIQSLSVSANEEFLVSVSIDGTARVFEIAEFH





331
WD40 repeat
MGTSQHQLSSCLQLLPRRRGNKNLIFRRTMASGGAAAVAPPPGYKPYRHLKTLT
33
1076



protein
GHVAAVSCVKFSNDGTLLASASLDKTLIIWSSAALSLLHRLVGHSEGVSDLAWS




SDSHYICSASDDRTLRIWSSRSPFDCLKTLRGHTDFVFCVNFNPQSSLIVSGSF




DETIRIWEVKTGRCLNVIRAHSMPVTSVHFNRDGSLIVSGSHDGSCKIWDTKNG




ACLKTLIDDTVPAVSFAKFSPNGKFILVATLNDTLKLWNYATGKFLKIYTGHKN




SVYCLTSTFSVTNGKYIVSGSEDRCICIWDLQGKNLIQKLEGHSDTVISVTCHP




SENKIASAGLDSDRTVRIWLQDA





332
WD40 repeat
MPSQKIETGHQDIVHDVAMDYYGKRVATASSDTTIKIIGVSNSSGSQHLASLSG
65
973



protein
HKGPVWQVAWAHPKFGSILASCSYDGQVILWKEGNQNDWAQAHVFNDHKSSVNS




IAWAPHELGLCLACGSSDGNISVFTARPDGGWDTTRIEQAHPVGVTSVSWAPSM




APGALVGSGLLDPVQKLASGGCDNTVKVWKLYNGTWKMDCFPALQMHSDWVRDV




AWAPNLGLPKSTIASASQDGTVVIWTVAKEGEQWQGKVLKDFKTPVWRVSWSLT




GNLLAVADGNNNVTLWNEAVDGEWQQVTTVEP





333
WD40 repeat
MKIAGLKSVENAHDESVWAAAWVPATESRPALLLTGSLDETVKLWRPDELALER
82
1047



protein
TNAGHFLGVVSVAAHPSGVIAASASIDSFVRVFDVDTNATIATLEAPPSEVWQM




QFDPKGTTLAVAGGGSASIKLWDTATWELNATLSIPRPEQPKPSEKGNKKFVLS




VAWSPDGRRLACGSMDGTISIFDVARAKFLHHLEGHFMPVRSLVFSPVEPRLLF




SASDDAHVHMYDSEGKSLVGSMSGHASWVLSVDVSPDGAALATGSSDRTVRLWD




LSMRAAVQTMSNHSDQVWGVAFRPMAGAGVRAGGRLASVSDDKSISLYDYS





334
WD40 repeat
MEIDLGNLAFDVDFHPSEQLVASGLITGDLLLYRYGDGSSPEKLLEVRAHGESC
43
1101



protein
RAVRFINDGKAILTGSPDCSILATDVETGSVVARVENAHEAAVNRLVNLTESTI




ATGDDNGCIKVWDTRQRSCCNTFSAHEDFISDMTFASDSMKLVVTSGDGTLSVC




NLRSNKVQTRSEFSEDELLSVVIMKNGRKVVCGTQSGTLLLYSWGFFKDCSDRF




VDLSPSSVDALLKLDEDRIIAGTENGLISLIGILPNRIIQPIAEHSDHPIERLA




FSHDKKFLGSISHDQTLKLWDLNDILGSEDSPSSQAAIDDSDSDEMDVDANPPD




SSKGNKKKHSGKGNDVGNANNFFADLGD





335
WD40 repeat
MSQQPSVILATASYDHTIRFWEAKSGRCYRTIQYPDSQVNRLEITPHKRYLAVA
142
1095



protein
GNPSIRLFDVNSNTPQPVMSFDSHTNNVMAVGFQYDGNWMYSGSEDGTVRTWDL




RARGCQREYESRGAVNTVVLHPNQTELISGDQNGNIRVWDLTANSCSCELVPEV




DTAVRSLTVMWDGSLVVAANNNGTCYVWRLLRGSQTMTNFEPLHKLQAHNGYIL




KCLLSPEFCEPHRYLATASSDHTVKIWNVEGFTLEKTLIGHQRWVWDCVFSVDG




AYLITASSDTTARLWSMSTGQDIRVYQGHHKATTCCALHDGAEGSPG





336
WD40 repeat
MEDAMDMEVEVEVEAEEHSPSSSNPSGSSFRRFGLKNSIQTNFGSDYVFEITPK
61
1257



protein
FDWSLMGVSLSSNAVKLYSPTTGQYCGECRGHSDTVNGISFSGPSSPHVLHSCS




SDGTIRAWDTRSFKEVSCISAGPSQEIFSFSFGGSSDSLLSAGCKSQILFWDWR




NKKQVACLEDSHVDDVTQVCFVPHHQNKLISASVDGLICIFDTAGDINDDEHME




SVINVGTSIGKVGIFGQTFEKLWCLTHIETLSVWDWKEGTNEANFEDARKLASD




SWSLDHIDYFVDCHSAEEGEGLWVIGGTNAGTLGYFPVKYKGGAAIGSPEAVLG




GGHSDVVRSVLPMSGMAGTTSKTRGIFGWTGGEDGRLCCWLSDDSSATSRSWMS




SNLVLKSSRSHHKKNRHQPY





337
WD40 repeat
MSQHQEYPMEYAADDYDVGEVEDDMYFHERVMGDSDTDEDEEYDHLDNKITDTS
193
1527



protein
AADARRGKDIQGIPWERLSVTREKYRRTRIEQYKNYENVPQSGESSEKDCKPTR




KGGNYYEFWRNTRSVKSTILHFQLRNLVWSTTKHDVYLMSHFSIIHWSSLTCKK




TEVLDVYGHVAPREKHPGSLLEGFTQTQVSTLAVRDKLLIAGGFQGELICKNLD




RPGVSYCCRTTYDDNAITNAVEIYDYPSGAVHFMASNNDCGVRDFDMEKFELSR




HFTFPWPVNHTSLSPDGKLLVIVGDNPEGIVVDSQRGKTIRPLQGHLDFSFASA




WHPDGHIFATGNQDKTCRIWDIRNLSKSVAVLKGNLGAIRSIRFTSDGRFMAMA




EPADFVHVYDVKSGYEKEQEIDFFGEISGVSFSPDTESLFVGVWDRTYGSLLQY




NRCRNYSYLDSM





338
WD40 repeat
MGASSDPNPDVSDEHQKRSEIYTYEAPWHIYAMNWSVRRDKKYRLAIASLLDHP
109
1155



protein
AAAAAVPNRVEIVQLDDSTGEIRADPNLSFDHPYPATKAAFVPDKDCQRADLLA




TSSDFLRIWRIADDSSRVDLRSFLNGNKNSEFCRPLTSFDWNEAEPKRIGTSSI




DTTCTIWDIERETVDTQLIAHDKEVYDIAWGGVSVFASVSADGSVRVFDLRDKE




HSTIIYESSEPDTPLVRLGWNKQDPRYMATIIMDSAKVVVLDIRYPTMPVVELQ




RHQASVNAIAWAPHSSCHICTAGDDSQALIWDLSSMAQPVEGGLDPILAYTAGA




EIEQLQWSSSQPDWVAIAFSLKLQ





339
WD40 repeat
MRGGGGGGDATGWDEDAYRESVLKEREVQTRTVFRAAFAPSPSPSPSPDAVVVA
71
1213



protein
SSDGSVASYSISACLSDHRLQSLRFADAKSQNVLEAEPACFLQGHDGPAYDVKF




YGEGEDSLLLSCGDDGRIRGWMWRDITSSEAHDHSQGNSAKPVLDLVNPQSRGP




WGALSPIPENNALAVDVKRGSIYAAAGDSCAYCWDVECGKIKTVFKGHSDYLHC




IAARNSSSQIITGSEDGTARIWDCRSGKCVQVIDPDKDHKKGFFASVSCLALDA




SESWLVCGRGRDLSVWSISASDCIAKISTNAPAQDVLFDDNQILLVGAEPLISR




LDMNGAVLSQIHCAPQSVFSVSLHQSGVTAVGGYGGLVDVISQFGSHLCTFRCK




CI





340
WD40 repeat
MEAPIIDPLQGDFPEVIEEYLEHGIMKCIAFNRRGTLLAAGCTDGSCIIWDFET
109
1785



protein
RGVAKELRDKECTAAITSVCWSKYGHRILVSASDKSLILWDVLSGEKIAHTTLQ




HTVLQACLHPGSSTPSICLACPFSSAPMIVDLNTGSTTALPVLTADVSNGATPL




SRNKTSDTSVTYSPCNACFNKHGDLVYAGTSKGEILIIDHKNVRVCAIVLVSGG




AVIKNVVFSRNGQYMLTNSNDRLIRIYKNLLPPKDGLKMLDELNESFNESDDVE




KLKAIGSKCLELLHEFQDSITRVQWKAPCFSGDGEWVIGGAASRGEHKIYIWDR




AGHLVKILEGPKEALMDLAWHPVHPIIISVSLTGLVYIWAKDYTENWSAFAPDF




KELEENEEYVEREDEFDLVPETEKVKGLDVHEDDEVDVLTVERDSVFSDSDMSQ




EELCFLPAVPCLDIPEQQDKCVGSCSKLPDGNHSGSPLSVEAGQNGNASNHNSS




PLEPMENSTADDTDGVRLKRKRKPSEKGLELQAEKVKKPVKPLKSSGRLSKTNK




PVIDPDSSNGVYGDDGSD





341
WD40 repeat
MRGVSWPEDGNNPSTSSSSQRNQQQAHAPRAVSGHAASHPSASNIFKLLVQREV
364
2685



protein
SPRSKHSSKKLWREASKCQPYPFQQSCEAVRDVRQGLISWVESASLRHLSAKYC




PLVPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVLRGHRRTPWVV




RFHPLYPEILASGSLDHEVRLWDANTAECIGSRNFYRPIASIAFHARGELLAVA




SGHKLYIWHYNRRGETSSPTIVLRTQRSLRAVHFHPHAAPFLLTAEVNDLDSAD




SAMTLATSPGYLHYPPPTVYFADAHSHERSRLADELPLMPLPLLMWPSFTRDDG




RVPLQRIDGDVGLNGQQRVDSSSSVRLWTYSTPSGQYELLLSPVESGNSPSMPE




ETGNNAFSSAVEAEVSQSAMDTVEDMEVQPEERNTQFFSFSDPRFWELPLLHGW




LVGQTQAGPRSVRQSSPGDIETQSAFGEVASVSPITSGVMPVSMDPSRFGGRSG




SRYRSPGSRGVHVTGPNNDGPRDENDPQSVVSKLRSELAASLAAAASTELPCTV




KLRIWPHDVKDPCAQLDLESCRLTIPHAVLCSEMGAHFSPCGRFLAACVACVLP




HLESDPGLHGQVNQDVTGVATSPTRHPISAHQIMYELRIYSLEEATFGIVLASR




PVRAAHCLTSIQFSPTSEHLLLAYGRRHSSLLKSIVIDGENTVPIYTILEVYRV




SDMELVRVLPSAEDEVNVACFHPSVGGGLIYGTKEGKLRILHYDSSHGLNLKSS




GFLDENVPEVQTYALEC





342
WD40 repeat
MDSAVAIAALSLVVGAAIALLFFGNYFRKRRSEVVAMAEADLQPHPKNPSRPPP
96
1412



protein
QPAAKKVHAKSHAHGADKDKNKRHHPLDLNTLKGHGDSVTGLCFASDGRSLATA




CADGVVRVFKLDDASNKSFKFLRINLPAGGHPTAVAFGDGVSSVIVASQHLSGC




SLYMYGEEKPTNLDSNKQQTKLPMPEIKWEHHKVHEQKAILTLSGAAANYDSGD




GSTIIASCSEGTDIIIWHAKTGKILGNVDTNQLKNTMSAISPNGRFIAAAAFTA




DVKVWEIVYSKDGSVKGVTKVMQLKGHKSAVTWLCFTPNSEQIVTASKDGSIRI




WNINVRYHLDEDTKTLKVFPIPLQDSSGTTLHYERLSLSPDGKILAATHGSMLQ




WLCIETGKVLDTAEKAHDGDITCMSWAPQSIPTGDKKVNVLATASGDKKVKLWA




APPLPS





343
WD40 repeat
MEVEPKKASKTFPVKPKLKPKPRTPSGKTPESKYWSSFKTTHPLDNLSFSVPSL
116
1702



protein
AFSPSPPHLLAAAHSATVSLFSPHRTTISSFSDVVSSLSFRSDGQLLAASDLSG




LIQVFDVRSRTPLRRLRSHARPVRFVRYPVLDKLHLVSGGDDALVKYWDVAGES




VVSELRGHKDYVRCGDCSPADANCFVTGSYDHVVKLWDVRVRDGNRAATEVNHG




SPVQDVIFLPSGSLVATAGGNSVKIWDLIGGGRMVYSMESHNKTVTSICVGTMG




AQQSGEEGVQLRILSVGLDGYMKVFDYSRMKVTHSMRFPAPLLSIGFSPDSNVR




AIGTSNGILYVGKRKAKENAEGGANGILGLGSVEEPRRRVLKPSFYRYFHRGQS




EKPSEGDYLVMRPKKVKLAEHDKLLKKFQHKNALISVLGGNDPEKVVAVMEELV




ARRALLKCVLNLDADELGLILTFLHKNSTVPRYSSLLLGLAKKVIDLRLEDIRA




SDALKGHIRNLKRSVDEEIRIQEGLQEIQGMVSPLLRIAGRR





344
WD40 repeat
MQGGSSGVGYGLKYQARCISDVKADTDHTSFLTGTLSLKEENEVHLLRLSSGGT
46
1101



protein
ELICEGLFSHPSEIWDLSSCPFDQRIFSTVFSTGESYGAAVWQIPELYGQLNSP




QLEKIASLDAHSRKISCVLWWPSGRHDKLVSIDEENIFLWGLDCSKKSAQVQSQ




ESAGMLHNLSGGAWDPHDVNTVAATCESSIQFWDLRTMKKANSLESVHARDLDY




DMRKKHLLVTSEDESGVRVWDLRMPKAPIQEFPGHTHWTWAVRCNPDYEGLILS




AGTDSAVNLWWSSTASSDELISERLIDSPTRKLDPLLHSYNDYEDSVYGLAWSS




REPWIFASLSYDGRVVVESVKPFLSRK





345
WD40 repeat
MAEEEGSAELEQQLEEEFAVWKKNTPILYDLLISHALEWPSLTVHWAPLLPQPS
23
1258



protein
SSAAAAAGDPSLAAHRLVLGTHTSDGAPNFLILADALLPSSESDHCGDDAVLPK




VEISQKIRVDGEVNRARFMPQNHNIVGAKTNGCEVYVFDCSKQAAKQHDGGFDP




DLRLTGHDGEGYGLSWSPLKENYLLSASHDKKICLWDISAAAQDKVLGAMHVFE




AHEGAVGDASWHSKNDNLFGSAGDDCQLMIWDLRTNKAQQCVKAHEKEVNSVSF




NSYNDWILATASSDTTVGLFDMRKLTTPLHVFSSHEGEVLQVEWDPNHEAVLAS




SSEDRRVMVWDLNRIGDEQQEGDASDGPAELLFSHGGHKAKISDFSWNKNEPWV




ISSVAEDNSVQVWQMAESICGDDDDMQAMEGYI





346
WD40 repeat
MGNYGEEDEDQYFDALEETASVSDRGSNSSDCCSSGSGLDENVLDSLGFEFWTK
404
2644



protein
FPESVRARRNRFLMLTGLGIEANSVDKEDAFPPSCNEIEVYTCKVTRDDGAVQR




SLDSYNCISLLQSSTSIRSNQEVESLRGDSLLSSFRGRSKESDDLTELCGMGCP




ESKRNAVSEFGSVSQGSIEELRRIVASSPLVHPLLHRKLEYERELIETKQKMGA




GWLRKFGSATCISGRQGDTWSDPDDLEITAGMKMRRVRAHSSKKKYKELSSLYA




AQEFLAHEGSISTMKFSMDGQYLASAGEDTVVRVWKVTEEDRSERVNVTVDPSC




LYFALNESTQLASLNTNKEHIGKAKTFQRSSDSSCVILPLKVFQITEKPWHEFK




GHNGEVLDLSWSSKGYLLSSSTDKTVRLWRVGCDRCQRVYSHNDYVTCISFNPV




NENFFISGSIDGKVRIWNVFGGQVVAYIDCREIVSAVCYRSDGKGAIVGTMTGN




CLFYSIKDNHLQMDAQVYLHGKKKSPGKRITGFQFPPNDPGKLMITSADSVIRV




LSGLDVVCKLKGPRNSGGPMIATFTSDGKHVISASEDSNVYIWNYAGQDKTSSR




VKKIWSCESFWSSNASVALPWCGIRTVPEALAPPSRSEERRASCAENGENHHML




EEYFQKMPPYSPDCFSLSRGFFLELLPKGSATWPEEKLSDTSPPTVSSQAISKL




EYKFLKSACHSVLSSAHMWGLVIVTAGWDGRIRTYHNYGLPVRS





347
WD40 repeat
MDIDFKEYRLRCELRGHEDDVRGVCVCGDGSIGTSSRDRTVRLWAPSAGERRKY
107
2383



protein
EVARVLLGHKSFVGPLAWVPPSEELPEGGIVSGGMDTLVMAWDLRNGEAQTLKG




HQLQVTGIVLDGGDIVSASVDCTLIRWKNGQLTEHWEAHKAPIQAVIRLPSGEL




VTGSSDTTLKLWRGKTCTQTFVGHTDTVRGLAVMPDLGILSASHDGSIRLWAVS




GECLMEMVDHTSIVYSVDSHASGLIVSGSEDRFAKIWKDGVCFQSIEHPGCVWD




VKFLEDGDIVTACSDGTIRIWTNQEDRMANSTELELFDLELSSYKRSRKRVGGL




KLEELPGLEALQVPGTSDGQTKVIREGDNGVAYAWNSTELKWDKIGEVVDGPED




SMNRPALDGVQYDYVFDVDIGDGEPTRKLPYNRSDNPYDTADKWLLKENLPLSY




RQQIVEFILANSGQRDFNLDPSFRDPYTGSSAYVPGAPSQLAAKQARPTFKHIP




KKGMLVFDAAQFDGILKKINEFNNTLLSNQEKKNLSLTDIEISRLGAVVKILKD




TSHYHSSKFADADFDLMLKLLESWPYEMMFPVIDIFRMVILHPDGADGLLRHQE




DKKDVLMESIKRATGNPSVPANFLTSIRAVTNLFKNSAYYSWLQKHRSEMLDAF




SSCSSSSNKNLQLSYATLLLNYAVLLIEKKDEEGQSQVLSAALELAENESLEVD




ARYRALVAIGSLMLDGLVKRIALDFDVEHIAKAARTSKEAKIAEVGADIELLIK




QS





348
WD40 repeat
MEFTEAYKQSGPCCFSPNARFIAVAVDYRLVIRDTLSLKVVQLFSCLDKISYIE
243
1625



protein
WALDSEYILCGLYKRPMIQAWSLIQPEWTCKIDEGPAGIAYARWSPDSRHILTT




SDFQLRLTVWSLVNTACVHVQWPKHASKGVSFTRDGKFAAICTRHDCKDYINLL




SCHNWEIMGVFAVDTLDLADIQWSPDDSAIVIWDSPLEYKVLVYSPDGRCLFKY




QAYESGLGVKSVSWSPCGQFLAVGSYDQMLRVLSHLTWKTFAEFTHLSNVRAPC




CAAIFKEVDEPLQIDMSELSLSDDYMQGNSGDAPEGHYRVRYDVTEVPITLPCQ




KPPADRPNPKQGIGLMSWSNDSQYICTRNDSMPTILWIWDMRHLELAAILVQKD




PIRAAVWDPTGTRLVLCTGSSHLYMWTPSGAYCVSVPLSQFNITDLKWNSDGSC




LLLKDKESFCCAAAPLPPDESSDYSSDD





349
WD40 repeat
MATIAALDDDMVRSMSIGAVFSDFVGKLNSLDFHRKDDILVTAGEDDSVRLYDI
126
1127



protein
ANARLLKTTFHKKHGTDRVCFTHHPNSLICSSTKNLDTGESLRYISMYDNRSLR




YFKGHKQRVVSLCMSPINDSFMSGSLDHSVRMWDLRVNACQGILRLRGRPTVAY




DQQGLVFAVAMEGGAIKLFDSRSYDKGPFDAFLVGGDTSEVCDIKFSNDGKSVL




LSTTNNNIYVLDAYAGDKQCGFNLEPSPSTPIEASFSPDGQYVVSGSGDGTLHA




WNISRRNEVACWNSHIGVASCLKWAPRRAMFVAASTVLTFWIPNSEPELASAKG




EAGVPPEQV





350
WD40 repeat
MSVAELKERHRAATETVNSLRERLKQKRVQLLDTDVAGYARTQGKTPVTFGATD
257
1390



protein
LVCCRTLQGHTGKVYSLDWTPERNRIVSVSQDGRFIVWNALTSQKTHAIRLPCA




WVMTCAFAPNGQSVACGGLDSVCSIFNLNSPVDRDSNLPVSRMLSGHKGYVSSC




QYVPDGDAHLITGSGDQTCVLWDITTGLRTSVFGGEFQSGHTADVLSVSTNGSS




PRIFVSGSCDSTARMWDTRVASRAVHTYHGHESGVNAVKFFPDGNRFGTGSDDG




TCRLFDIRTGHELQVYYQQRGIDEIPHVTSIAFSISGRLLIAGYSNGDCFVWDT




LLAQVVLNLGSLQNSHEGRISCLGVSADGSALCTGSWDTNLKIWAFGGIRRVT





351
WD40 repeat
MKKRPRGASLDQAVVDIRRREVGGLSGLSFARRLAASEGLVLRLDIYNKLKGHR
178
1632



protein
GCVNTVGFNLDGDIVISGSDDRHVKLWDWQTGKVKLSFDSGHLSNVFQAKIMPY




TDDRSIVTCAADGQARHAQILEGGQVQTMLLAKHRGRAHKLAIDPGSPHIVYTC




GEDGLVQRLDLRSNTARELFTCREVYGTHVEVVHLNAIAIDPRNPNLFVIGGSD




EYARVYDIRNYKWNGSHNFGRSANYFCPSHLIGEAHVGITGLAFSGQSELLVSY




NDESIYLFTQEMGLGPDPLSASTKSVDSNSSEVTSPTAVNVDDNVTPQVYKGHR




NCETVKGVGFFGPKCEYVVSGSDCGRIFIWKKKGGQLIRVMAADKHVVNCIEPH




PHIPALASSGIENDIKIWTPKAIERATLPMNVEQLKPKARGWMNRISSPRQLLL




QLYSLERWPEHGGETSSGLAAGQEELTELFFALSANGNGSPDGGGDPSGPLL





352
WD40 repeat
MSKRGYKLQEFVAHSSNVNCLSIGKKACRLFLTGGDDCKVNLWAIGKPNSLMSL
290
2917



protein
CGHTNAVESVAFDSAEVLVLAGASSGVIKLWDVEEAKMVRGLTGHRSNCTAMEF




HPFGEFFASGSTDTNLKIWDIRKKGCIHTYKGHTRGISTIRFSPDGRWVVSGGN




DNVVKVWDLTAGKLLHDFKFHENHIRSIDFHPLEFLLATGSADRTVKFWDLETF




ELIGSSRPEAAGVRAIAFHPDGRTLFCGLEDSLKVYSWEPVICHDGVDMGWSTL




ADLCIHDGKLLGCSYYQSSVGVWVADASLIEPYGTNVKPQQKDSGDDEIEHQES




RPSAKVGTTIRSTSIMRCASPDYETKDIKNIYVDTASGNPVSSQRVGTTNFAKV




TQPLDFNDTPNLTLRRQGLVTETPDGLSGHVPSKSITQPKVVSRDSPDGKDSSR




RESITFSRTKPGMLLRPAHSRRPSSTKYDVDRLSACAEIGVLSSAKSGSESLVD




SFLNIKVAPEDGARNGCEDNHSSVKNVSVESEKVLPLQTPKTEKCDQTVGFKEE




INSVKFVNGVAVVPGRTRTLVEKFEKREKLNSTEDQTINTPENPTLDKTPPPSL




AENEEKSDRLNIVERKATRMSSHMVTAEDRTPVTLVGSPEDQSTVMAPQRELPA




DESSKTPPLPVEDLEIHHGSNVSEDKATILSSQTVSEEDSKRSTLIRNFRRRDR




FKSTEGRSPVMATQRKLPTDESGKTSSLPMEDLEIKGGLNVSEDKATSFSSRAP




PREDRAHSALVRNVRKRDKFKSTNDTITVMVHQRGLSTDEASTVSVERVERRQL




SNNVENPLNNLPPHSVPPTTTRGEPQYVGSESDSVNHEDVTELLLGNHEVFLST




LRSRLTKLQVV





353
WD40 repeat
MSTFLTGTALSNPNPNKSYEVVQPPNDSVSSLSFNPKANFLVATSWDNQVRCWE
148
1197



protein
IVRSGTSLGTTPKASISHDQPVLCSTWKDDGTTVFSGGCDKQVKMWPLSGGQPM




TVAMHDAPIKEISWIPEMNLLVTGSWDKTLRYWDTRQANPVHIQQLPERCYALT




VRHPLMVVGTADRNLIIYNLQSPQTEFKRISSPLKYQTRCLAAFPDQQGFLVGS




IEGRVGVHHLDDSQQSKNFTFKCHREGSEIYSVNSLNFHPVHHTFATAGSDGAF




NFWDKDSKQRLKAMSRCSQPIPCSTFNNDGSIFAYSACYDWSKGAENHNPATAK




TYIFLHLPQESEVKGKPRLGTTGRK





354
WD40 repeat
MEVEAQQRDVNNVMCQLVDPEGTTLGPPMYLPQDVGPQQLQQMVNKLLSNEDKL
140
1567



protein
PYTFYISDQELVVPLESYLQKNKVSVEKVLSIVYQPQAIFRIRPVNRCSATIAG




HSEAVLSVAFSPDGKQLASGSGDTTVRLWDLSTQTPMFTCKGHKNWVLSIAWSP




DGKHLVSGSKAGEIQCWDPLTGQPSGNPLVGHKKWITGISWEPVHLSSPCRRFV




SSSKDGDARIWDVTLRRCVICLSGHTLAVTCVKWGGDGVIYTGSQDCTIKVWET




SQGKLIRELKGHGHWVNSLALSTEYVLRTGAFDHTGKQYSSAEEMKQVALERYK




KMKGNAPERLVSGSDDFTMFLWEPSVSKHPKTRMTGHQQLVNHVYFSPDGQWVA




SASFDKSVKLWNGITGKFVAAFRGHVGPVYQISWSADSRLLLSGSKDSTLKIWD




IRTKKLKRDLPGHADEVFAVDWSPDGEKVVSGGKDKVLKLWMG





355
WD40 repeat
MDAGSAHSSSNMKTQSRSPLQEQFLQRRNSRENLDRFIPNRSAMDFDYAHYMLT
376
1737



protein
EGRKGKENPAVSSPSREAYRKQLAETLNMNRTRILAFKNKPPTPVELIPHELTS




AQPAKPTKTRRYIPQTSERTLDAPDLLDDYYLNLLDWGSSNVLSIALGNTVYLW




NASDGSTSELVTIDDETGPVTSVSWAPDGRHIAVGLNNSDVQLWDSADNRLLRT




LRGGHRSRVGSLAWNNHILTTGGMDGLIVNNDVRVRSHIVDTYRGHTQEVCGLK




WSASGQQLASGGNDNILHIWDRSTASSNSPTQWLHRLEEHTAAVKALAWCPFQG




NLLASGGGGGDRTIKFWNTHTGACLNSVDTGSQVCALLWNKNERELLSSHGFTQ




NQLTLWKYPSMVKIAELTGHTSRVLFMAQSPDGCTVASAAGDETLRFWNVFGVP




EVAKPAPKANPEPFAHLNRIR





356
WD40 repeat
MEEAIPFKNLPSREYQGHKKKVHSVAWNCTGTKLASGSVDQTARVWHIEPHGHG
69
1010



protein
KVKDIELKGHTDSVDQLCWDPKHADLIATASGDKTVRLWDARSGKCSQQAELSG




ENINITYKPDGTHVAVGNRDDELTILDVRKFKPIHKRKFNYEVNEIAWNMSGEM




FFLTTGNGTVEVLAYPSLRPVDTLMAHTAGCYCIAIDPVGRYFAVGSADSLVSL




WDISEMLCVRTFTKLEWPVRTISFNHTGDYVASASEDLFIDISNVQTGRTVHQI




PCRAAMNSVEWNPKYNLLAYAGDDKNKYQADEGVFRIFGFESA





357
WD40 repeat
MGKDEEEMRGEIEERLINEEYKVWKKNTPFLYDLVITHALEWPSLTVEWLPDRE
149
1423



protein
EPPGKDYSVQKLVLGTHTSENEPNYLMLAQVQLPLEDAENDARHYDDDRADVGG




FGCANGKVQIIQQINHDGEVNRARYMPQNSFIIATKTVSAEVYVFDYSKHPSKP




PLDGACSPDLRLRGHSTEGYGLSWSKFKQGHLLSGSDDAQICLWDINATPKNKS




LDAMQIFKVHEGVVEDVAWHLRHEYLFGSVGDDQYLLIWDLRTPSVTKPVQSVV




AHQSEVNCLAFNPFNEWVVATGSTDKTVKLFDLRKISTALHTFDAHKEEVFQVG




WNPKNETILASCCLGRRLMVWDLSRIDEEQTPEDAEDGPPELLFIHGGHTSKIS




DFSWNTCEDWVVASVAEDNILQIWQMAENIYHDEDDVPGEESNKGS





358
WD40 repeat
MMRGFSCTEDGDAPSTSSTSPPPPPPPPHRQQMQAPRASSSSSGQPTSRRSTGN
365
2677



protein
VFKLLARREVSPRSKHSLKKFWGEASECQLCPFQQSYEAVRDVRRSLISWVEAF




SLQHLSAKYCPLMPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVL




RGHRRTPWVVRFHPLYPEILASGSLDHEVHLWDANTAECIGSRNFYRPIASIAF




HAQGDLLAVASGHKLYIWHYNRSGETSSPTIVLRTPRSLRAVHFHPHAAPFLLT




AEVNDLDLTDSAMTLATSPGYLHYPPPTIYLADAHSNERSRLEDELPLMPSPLL




MWPSFTRDDGRATLPHIGGDVGLSGQQRVDSLSSSQYEFHPSPIEPSSSTSMHE




EMGTDPFSSVRESEVTQSAMNIVDNTEVQPEERSTYSFSFSDPRFWELPSVYGW




LVGQTQAAPRTAPSPGALETASALGEVASVSPVRSEFMPGGMDQPRLGGRSGSG




CRSSGSRMMRTAGLNDHPHDENYPQSVVSKLRSELEASLAAAASTELPCTVKLR




VWPYDMKDPCALFRSESCRLTIPHAVLCSEMGAHFSPCGRFFAACVACVLPQLE




ADPVLHGQVDPDVTGVATSPTRHPVSAYQIMYELRIYSLEEATFGMVLASRSIR




AAHCLTSIQFSPTSEHLLLAYGRRHNSLLKSIVIDGENTVPIYSILEVYRVSDM




ELVRVLPSAEDEVNVACFHPSVGGGLVYGTKEGKLRILQIDSSGGLNPKSTGFL




DENMAEVPTYALEC





359
WD40 repeat
MGEGDLPRTEAGVLRGHEGAVLAARFNGDGNYCLSCGKDRTIRLWNPHRGIHIK
24
923



protein
TYKSHGREVRDVHCTSDNSKLISCGGDRQIFYWDVSTGRVIRRFRGHDSEVNAV




KFNDYASVVVSAGYDRSVRAWDCRSHSTEPIQIINTFQDSVMSVCLTKTEIIGG




SVDGTVRTFDIRIGREISDDLGQPVNCISMSNDGNCILASCLDSTLRLVDRSAG




ELLQEYKGHTCKSYKLDCCLTNTDAHVAGGSEDGYVFFWDLVDASVISKFRAHS




SVVTSVSYHPKEDCMITASVDGTIKVWKT





360
WD40 repeat
MACIKGVGRSASVAMAPDGGYLATGTMAGTVDLSFSSSASLEIFGLDFQSDDRD
221
3598



protein
LPLIAESPSSERFNRLSWGKNGSGSDEFSLGLIAGGLVDGTIGLWNPLSLIRSE




AGDKAIVGHLSRHKGPVRGLEFNVIAPNLLASGADDGEICIWDLAAPREPSHFP




PLRGSGSAAQGEISFLSWNSKVQHILASTSYNGTTVVWDLKKQKPVISFSDSVR




RRCSVLQWNPDLATQLVVASDEDSSPTLRLWDMRNIMSPVKEFAGHTRGVIAMS




WCPNDSSYLVTCAKDNRTICWDTVTGEIVCELPAGSNWNFDVHWYPKIPGVISA




SSFDGKIGIYNVEGCSRYGVRENEFGAATLRAPKWFKRPVGASFGFGGKVVSFH




TRSTGGPSVNSSEVFVHDIITEQTLVSRSSEFEAAIQSGDRPSLRALCEKKSQH




CESTDDQETWGFLKVLLEDDGTARSKLLAHLGFDIPTETNDGSQEDLSQQVNAL




GLEDVTADKVVQEDNNESMVFPTDNGEDFFNNLPSPRADTPVSTSADGFPTVNA




AVEPSQDEVDGLEESSDPSFDDSVQRALVVGDYKAAVALCMSANKLADALVIAH




VGGASLWESTRDKYLKMSRLPYLKVVFAMVNNDLQSLVDTRPLKFWKETLAILC




SFAQGEEWAMLCNSLASKLMAAGNMLAATLCFICAGNIDKTVEIWSRSLATEHD




GMSYMDLLQDLMEKTIVLALASGQKQFSASVCKLVEKYAEILASQGLLTTAMDY




LKLLGTDDLSPELAVLRDRIAFSVEAEKGANISAFNGSQDPRGAVYGVDQSNYG




MVDTSQHYYPEAAQPQVPHTVPGSPYGENYQQPFGSSFGKGYNTPMQYQAPSQA




SMFVPSEPPQNAQPSFVPTPVTSQPTTRSQFIPAPPLALRNPEQYQQPTLGSHL




YPGSVNPTFQPLPHAPGPVAPVPPQVSSVPGQNMPQAVAPTQMRGFMPVTNPGV




VQNPGPISMQPATPIESAAAQPVVSPAAPPPTVQTADTSNVPAPQKPVIATLTR




LYNETSEALGGSRANPAKKREIEDNSRKIGALFAKLNSGDISKNAADKLVQLCQ




ALDNGDYSTALQIQVLLTTSEWDECNFWLATLKRMIKTRQNVRLS





361
WD40 repeat
MKERGKGAGRSVDERYTQWKSLVPVLYDWLANHNLVWPSLSCRWGPQLEQATYK
44
1447



protein
NRQRLYLSEQTDGSVPNTLVIANVEVVKPRVAAAEHISQFNEEARSPFVKKFKT




IIHPGEVNRIRELPQNSKIVATHTDSPDVLIWDVETQPNRHAVLGASTSRPDLI




LTGHKDNAEFALAMSPTEPFVLSGGKDRYVVLWSIQDHISTLAADPGSAKSPGS




AGTNNKQSSKAAGGNDKTGDSPSIEPRGVYLGHGDTVEDVTFCPSSAQEFCSVG




DDSCLILWDARTGSSPAIKVEKAHHADLHCVDWNPHDVNLILTGSADNTVRMFD




RRNLTSGGVGSPVHTFEGHNAAVLCVQWSPDKSSVFGSSAEDGILNIWDHEKIG




RKIETVGSKVPNSPPGLFFRHAGHRDKVVDFHWNSSDPWTIVSVSDDGESTGGG




GTLQIWRMIDLIYRPEEEVLAELDKFKSHILSCTS





362
WD40 repeat
MAKIAPGCEPVAGTLTPSKKREYRVTNRLQEGKRPLYAVVFNFIDSRYFNVFAT
196
1314



protein
VGGNRVTVYQCLEGGVIAVLQSYIDEDKDESFYTVSWACNIDRTPFVVAGGING




IIRVIDAGNEKIHRSFVGHGDSINEIRTQPLNPSLIVSASKDESVRLWNVHTGI




CILIFAGAGGHRNEVLSVDFHPSDKYRIASCGMDNTVKIWSMKEFWTYVEKSFT




WTDLPSKFPTKYVQFPVFIAPVHSNYVDCNRWLGDFVLSKSVDNEIVLWEPKMK




EQSPGEGSVDILQKYPVPECDIWFIKFSCDFHYHSIAIGNREGKIYVWELQSSP




PVLIAKLSHPQSKSPIRQTAMSFDGSTILSCCEDGTIWRWDAITASTS





363
WD40 repeat
MNTAMHFGAGWRSIAEMGYTMSRLEIEPESCEDEKSLDGVGNSQGPNELPRCLD
193
1668



protein
HELAHLTNLKSRPHEHLIRDFPGRRALPVSTVKMLAGRECNYSRRGRFSSADCC




HMLSRYVPVNGPSPLDQMNSRAYVSQFSADGSLFVAGFQGSHIRIYNVDKGWKC




QKNILTKSLRWTITDTSLSPDQRYLVYASMSPIVHIVDIGSAAMDSLANITEIH




EGLDFSADSGPYSFGIFSVKFSTDGREVVAGSSDDSIYVYDLVANKLSLRIPAH




ESDVNTVCFADESGHIIYSGSDDTYCKVWDRRCLSARNKPAGVLMGHLEGITFI




DSRGDGRYFISNGKDQTIKLWDIRKMGSDICRRGFRNFEWDYRWMDYPPRARDS




KHPFDLSVATYKGHSVLRTLIRCYFSPVHSTGQKYIYTGSHDSCVYIYDVVTGA




QVAALKHHKSPVRDCSWHPEYPMIVSSSWDGDIVKWEFFGNGETEIPAMIKKRIR




RRHLY





364
WD40 repeat
MEPQPQAPKKRGRKPKPKEDKKEEQLHQPPPPPPPQQQAAPAPAPAATRSSTSG
78
1634



protein
SAGGRDRRPQQQHAVDEKYARWKSLVPVLYDWLANHNLLWPSLSCRWGPQLEQA




TYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQFNEEARSPFIRK




YKTIIHPGEVNRVRELPQNPNIVATHTDSPDVLIWDVESQPNRHAVYGATASRP




NLILTGHQENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHITASATDQTTNKS




PGSGGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAFCPSTAQEFCSVG




DDSCLILWDARVGTNPVAKVEKAHNGDLHCVDWNPHDNNLILTGSADNSVNMFD




RRNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSAEDGLLNIWDYERVD




KKVDRAPNAPAGLFFQHAGHRDKIVDFHWNAADPWTMVSVSDDCDTAGGGGTLQ




IWRMSDLIYRPEEEVLAELENFKAHVLECSKA





365
WD40 repeat
MGIFEPYRAVGYITTGVPFSVQRLGTETFVTVSVGKAFQVYNCAKLSLVLVGPQ
85
2826



protein
LPKKIRALASYREYTFAAYGSDIGIFKRAHQLATWSGHTAKVCLLLLFGEHILS




VDVDGNAYIWAFKGMNYNLSPVGHILLDSNFTPSCIMHPDTYLNKVILGSQEGP




LQLWNISTKTKLYEFKGWNSSVSSCVSSPALDVVAVGCADGKIHVHNIRYDEEL




VTFSHSMRGSVTALSFSTDGQPLLASGSSSGVVSIWNLDKRRLQSVIRDAHDGS




IISLHFFANEPVLMSSSADNSIKMWIFDTSDGDPRLLRFRSGHSAPPLCIRFYA




NGRHILSAGQDRAFRLFSVVQDQQSRELSQRHVSKRAKKLKLKEEEIKLKPVIA




FDVAEIRERDWCNVVTSHMDTPQAYVWRLQNFVIGEHILRPCPNKPTPVKACMI




SACGNFAILGTAGGWIERFNLQSGISRGSYIDQLEGTNSAHDGEVVGVACDATN




TLMISAGYAGDIKVWDFKGRELKSRWEIGSSLVKISYHRLNGLLATVADDFIIR




LFDAVALRMVRKFEGHTDRITDLCFSEDGKWLLSSSMDGSLRIWDIILARQVDA




VFVDVSITALSLSPNMDILATTHVDQNGVFLWVNQSMFSGDSDINLYASGKEVV




TVKLPSVSSVEGSQVEESNEPTIRHSESKDVPSFRPSLEQIPDLVTLSLLPKSQ




WQSLINLDIIKVRNKPVEPPKKPEKAPFFLPSIPSLSGEILFKPSEMSDKGDMK




ADEDKSKITPEVPSSRFLQLLHSCSEAKNFSPFTTYIKGLSPSTLDLELRMLQI




IDDDAVDADADDPQDVDKRQELLSIELLMDYFIHEISCRSNFEFVQALVRLFLK




IHGETIRRQSVLQNKAKVLLETQCSVWQRVDKLFQGARCMVAFLSNSQF





366
WD40 repeat
MEETKVTCGSWIRRPENVNLAVLGRSPRRRGSAALEIFAFDPKSTSLSSSPLVA
74
1246



protein
HVIEEIEGDPLAIAVHPNGEDIVCFASSGSCLSFELSGQESNLKLLTKELPPLR




GIGPQKCMAFSVDGSRFATGGVDGRLRILEWPSLRIILDEPKAHKSIRDLDFSL




DSEFLATTSTDGSARIWKAEDGLPCTTLTRRSDEKIELCRFSKDGTKPFLFCTV




QRGDKAVTGVWDISTWNKIGHKRLLRKPAVVMSISLDGKYLAQGSKDGDMCVVE




VKKMEVSHWSKRLHLGTSLTSLEFCPIERVVITTSDEWGVLVTKLNVPADWKAW




QVYLLLLGLFLASLVAFYIFYENSDSFWGFPLGKDQPARPKIGSVLGDPKSADD




QNMWGEFGPLDM





367
WD40 repeat
MADPVEHQHQQHQQHQLQQQRRRGWRIQGGQYLGEISALCFLHLPPPPLSLSSS
100
4377



protein
PVLSLSSGLDSESRDRPACSFRFPSAGSGSQVSLFDLASGAMVRTFYVFRGIRV




HGIVLGCADFPGGSSSSSSTLDYVIAVYGERRVKLFRLSVRLGRGAGEGSGTVL




SADLELVSAAPRLSHWVMDVRFLKENGTSEDELQRCLTVAIGCSDNSIRLWDVD




KCSFVLAVSSPERCLLYSMRLWGDNLEDLQVASGTIYNEILIWKVVPNHDAPSS




NELTEEGLTNSCAGNSVHECLRYEAYHICRLVGHEGSIFRIAWSSDGSKLVSVS




DDRSARIWEVHCKVQYSEDAGEVGLLFGHSARVWDCYISDNLIVTAGEDCSCRV




WGLDGQQHDVIKEHIGRGIWRCLYDPWSSLLVTGGFDSAIKVHKLDASLAEASA




KQSNIKDLSDGTELFTTHLPNSSGHSGHMDSKSEYVRCLSFSCEDVMYIATNHG




YLYHAKLCNDGDLRWTELAQVSNEVQIICMELLPSNPYDPRIDADDWVAVGDGK




GWTTVVRVVKNSDSPKVSTSFSWAAEMDRQLLGIHWCKSLGHRFIFTADPRGAL




KLWRFFEVSQSSSLYPENSPRISLIAEFKSDLGARIMCLDVAFESELLICGDLR




GNLVLFPLLKDLLLDTFVVSAAKISPVNHFKGAHGISAVSSISVAHMSFNHIEL




RSTGADGCICYMEYDKGLQSLNFVGMKQVKELSMIESVSTENESTGYRTSGSYA




SGFASTDFIIWNLVTEAKVLQVSCGGWRRPHSYYLGDVPEMKNCFAYVKDDIIY




IRRHWIKDSKDKILPQNLRLQFHGREVHSLCFVTGDFQLRKNKQSSWIVTGCED




GTVRLTRYTQCTDNWSSSKLLGEHVGGSAVRSICCVSNIHTTSSGTSVSDVKGI




ENLPKDIKGTLMEDECNPSLLISVGAKRVLTSWLLRRRKQDGKEDDVTDLQEAE




NSSLPSSAGSSTFSFQWLSTDMPVKYSVPSKKSGSIKKLIGVSDTNVRCKSLLP




DSEALQSKVSAVDKNEDDWRYLAVTAFLVRHSGSRLIVCFIIVACSDATLAIRA




LVLPYRLWFDVALMVPLSSPVLSLQHVIIGRCQLPDENVQIGNVYVVISGATDG




SIAFWDLTESVEAFMRRLSNIHLEKFMDCQKRPRTGRGSQGGRWWRSLSKIACK




EQPINDPVTAKAIKELNRKLTGGVACGSSSSMLDASPELDSNAANSSFEIIEVN




PFHVLNGVHQSGVNCLHVCETKHGQSSDGRFLYQLVSGGDDQALHLLKFEVLVQ




PPVQVPDVPNSDIRNSILVEEFLLDEQNQKTKCTIEFISQEKIASAHNSAVKGV




WTDGTWVFSTGLDQRVRCWISKDRGTPTELAHFIISVPEPEALDARSICWDQYQ




IAVAGRGMQMIEFHVPSSEIR





368
WD40 repeat
MPYKLSATLSNHSSDVRAVASPSDDLILSASRDSTAISWFRQSPSSFTPASVIR
58
2439



protein
AGSRFVNAIAYLPPTPRAPQGYAVVGGQDTVVNVFALGPGDKEEPEYTLVGHTD




NVCALSVNSDDTIISGSWDKTAKVWKDFALVYDLKGHQQSVWAVLAMNEKEFLT




ASADRTIKYWVQHKTMQTYEGHRDAVRGLALIPDIGFASCSNDSEIRVWTMGGD




VVYTLSGHTSFVYSLSVLPNGDLVSAGEDRSVRVWRDGECSQVIVHPAISVWAV




STMPNGDIISGSSDGVVRVFSESEKRWATASELKALEDQIASQSLPSQQVGDVK




KTDLPGPEALSVPGKKAGEVKMIRSGDVVEAHQWDSLASSWQKIGEVVDAIGSG




RKQLHDGKEYDYVFDVDIQEGAPPLKLPYNVSENPYTAAQRFLEQNDLPTGYLD




QVVKFIEQNTAGVKLGNDGYVDPFTGASRYQPATQSTSNTASSSYMDPFTGGSR




HIAESAPSNVPQGSHATGIIPFSKPIFFKLANVSAMQAKMFQFDEVLRNEISTA




TLAMRPDEVIMVNETFTYLSKVVTSTSSARTSLGWIHIETIMQILDRWPVPQRF




PVIDLGRLVTAYCMNAFSGPGDLEKFFSCLFRTSEWTSITSGSKALTKAQETNV




LLLFRTIANSLDGAPLNDMEWIKQIFRELAQTPQLVLNKSHRLALASVLFNFSC




IGLKGPVPADVRTLHLTIILQVLRSPNDDPEVAYRTCVALGNMLYSDKTRGTPR




DAQSPSPTELKSAVAAIKGGFSDPRINDVHREIMSLI





369
WD40 repeat protein
MPPQKIESGHKDTVHDLAMDYYGKRLATASSDHTINVVGVSSSGSQHLATLIGH
159
1064




QGPVWQISWAHPKFGSLLASCSYDGRVIIWREGNPNEWTQAQVFEEHKSSVNSV




AWAPHELGLCLACGSSDGNISVFTARQDGGWDTSRIDQAHPVGVTSVSWAPSTA




PGALVGSGMMEPVQKLCSGGCDNTVKVWKLYNRVWKLDCFPVLQMHTDWVRDVA




WAPNLGLPKSTIASASQDGRVIIWTLAKEGDQWQGKVLYDFRTPVWRVSWSLTG




NILAVADGNNNVSLWNEAVDGEWIQVSTVEP





370
WD40 repeat protein
MSAPMLEIEARDVVKIVLQFCKENSLHQTFQTLQSECQVSLNTVDSIETFVADI
118
1665




NSGRWDAILPQVAQLKLPRNTLEDLYEQIVLEMIELRELDTARAILRQTQAMGV




MKQEQPERYLRLEHLLVRTYFDPNEAYQDSTKEKRRAQIAQALAAEVTVVPPSR




LMALVGQALKWQQHQGLLPPGTQFDLFRGTAAMKQDVDDMYPTTLSHTIKFGTK




SHAECARFSPDGQFLVSCSVDGFIEVWDYMSGKLKKDLQYQADETFMMHDDPVL




CVDFSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTSVLFSRDGSQL




LSTSFDGSARIHGLKSGKQLKEFRGHSSYVNDAIFSNDGSRVITASSDCTVKVW




DVKTSDCLQTFKPPPPLRGGDASVNSVHLFPKNADHIVVCNKTSSIYIMTLQGQ




VVKSLSSGKREGGDFVAACVSPKGEWIYCVGEDRNLYCFSCQSGKLEHLMKVHE




KDVIGVTHHPHRNLVATYSEDSTMKLWKP





371
WD40 repeat protein
MDLLQSYAEDNDGDLGRHSSPEPSPPRLLPSKSAAPKVDDTTLALTVAQTNQTL
57
16828




ARPIDPSQHAVAFNPTYDQLWAPICGPAHPYAKDGIAQGMRNHKLGFVEDAAIG




SFLFDEQYNTFQRYGYAADPCASTGNEYVGDLDALKQNDGISVYNIRQQEQKKY




AEEYAKKKGEERGEGGREKAEVVSDKSTFHGKEERDYQGRSWIAPPKDAKATND




HCYIPKRLVHTWSGHTKGVSAIRFFPKHGHLILSAGMDTKVKIWDVFNSGKCMR




TYMGHSKAVRDISFCNDGTKFLTAGYDKNIKYWDTETGKVISTFSTGKIPYVVK




LHPDDEKQNILLAGMSDKKIVQWDMNTGQITQEYDQHLGAVNTITFVDDNRRFV




TSSDDKSLRVWEFGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDNQILIY




STRERFQLNKKKRFAGHIVAGYACQVNFSPDGRFVMSGDGEGRCWFWDWKSCKV




FRTLKCHEGVCIGCEWHPLEQSKVATCGWDGLIKYWD





372
WD40 repeat protein
MESNGNLEQTLQDGRIYRQLNSLIVAHLRDHNFPQAASAVALATMTPLNVEAPR
250
1566




NRLLELVAKGLAVEKGELLRGVSHAGTNDLGGSIPASYGLVPAPWTAIDFSSLR




DTKGMSKSFTKHETRHLSDHKNVARCARFSTDGRFFATGSADTSIKLFEVSKIK




QMMLPDSTDGAIRAVIRTFYDHTHPVNDLDFHPQNTVLISAAKDHTVKFFDYSK




ATAKRAFRVIQDTHNVRSVAFHPSGDFLLAGTDHPIPHLYDVNTFQCYLSANVP




EFAVNAAINQVRYSSSGGMYVTASKDGTIRFWDGASANCVRSIAGAHGAAEVTS




ANFTKDQRYVLSCGKDSTVKLWEVGTGRLVKQYLGATHMQLRCQAVFNNTEEFV




LSIDEPSNEIVVWDAMTAEKVARWPSNHNGPPRWIEHSPTEAAFVSCGTDRSIR




FWKETH





373
WD40 repeat protein
MSNFQGEDGEYVADDFEAEDGDEELHGRESADPESDVDEIDTPSNRFTDTTADQ
106
1434




ARRGRDIQGIPWERLSITREKYRRTRLEQYKNYENVPQSGEKSGKDCTVTEKGN




SFYEFRRNSRSVKSTILHFQLRNLVWATSKHDVYLMSNYSVVHWSSLTGKKSEV




LNLAGHVAPNEKHPGSLLEGFTQTQVSTLAVKDRFLVAGGFQGELICKFLDRPG




ISFCSRTTYDDNAITNAVEIYVSPSGGIHFIASNNDCGVRDFDMENFELSKHFR




FPWPVNHTSLSPDGKLLVIVGDDPEGILVDAKTGKTIMPLRGHLDFSFASEWHP




DGVTFATGNQDKTCRIWDIRNLSKSIAVLKGNLGAIRSIRYTSDGRYMAIAEPA




DFVHVYDTKTGYKKEQEIDFFGEISGMSFSPDTESLFIGVWDRTYGSLLEYGRR




RNFSYLDCLV





374
WD40 repeat protein
MGVEEDLEDLNALAESTDAAVDGQAALASAVDSVTLQPAPPILPPVIPPPAVPV
190
1917




VAPVPTIPPVLRPLAPLPIRPPVLRPPAPKRDEAGSSDSDSDHDGTAAGSTAEY




EITEESRLVRERHEKAMQDLMMKRRGAALAVPTNDKAVRARLRRLGEPMTLFGE




REMERRDRLRMLMAKLDAEGQLEKLMKAHEDEEAAASAAPEDVEEEMLQYPFYT




EGSKALFNARIDIAKFSITRAALRLERARRRRDDPDEDVDAEIDWALKKAESLS




LHCSEIGDDRPLSGCSFSHDGKLLATCSMSGVAKLWDTCRMPQVNRVLTLKGHT




ERATDVAFSPVQNHIATASADRTAKLWNTEGTILKTFEGHLDRLGRIAFHPSGK




YLGTTSFDKTWRLWDIESGEELLLQEGHSRSIYGIDFHRDGSLVASCGLDALAR




VWDLRTGRSILALEGHVKPVLGVSFSPNGYHLATGGEDNTCRIWDLRKKKSLYT




IPAHANLISEVKFEPQEGYFLVTASYDTTAKVWSARDFKPVKTLSVHEAKITSV




DITADASHIVTVSHDRTIKLWTSNDDVKEQAMDVD





375
WD40 repeat protein
MVKAYLRYEPAAAFGVIASVESNIAYDASGKHLLAPALEKVGVWHVRQGVCTKA
102
2942




LAPSASSAAGPSLAVTAIASSPSSLIASGYADGSIRIWDFEKGSCETTLNGHKG




AVSVLRYGKLGSLLASGSKDNDIILWDVVGETGLYRLRGHRDQVTDLVFLDSDK




KLVSSSKDKYLRVWDLETQHCMQIVGGHHSEIWSLDTDPEERYLVTGSADPELR




FYTVKNDSSDERSEADASGGVGNGDLASHNKWDVLKQFGEIQRQSKDRVATVRF




NKNGNLLACQAAGKLVEVFRVLDEAEAKRKAKRRLHRKREKKGADVNENGDSSR




GIGEGHDTMVTVADVFKLLQTIRASKKICSISFCPVAPKSSLATLALSLNNNLL




EFHSIEADKTSKMLTIELQGHRSDVRSVTLSSDNTLLMSTSHNSVKIWNPSTGS




CLRTIDSGYGLCGLIVPQNKHALIGTKDGAIEIFDVGSGTCIEVVEAHGGSIRS




IVAIPNQNGFVTGSADHDIKFWEYGMKQKPGDNSKHLTVSNVRTLKMNDDVLVV




AVSPDAQKIAVALLDCTVKVFFMDSLKLMHSLYGHRLPVLCLDISSDGDLIVTG




SADKNLMIWGLDFGDRHKSIFAHGDSIMAVQFVGNTHYMFSVGKDRLVKYWDAD




KFELLLTLEGHHADIWCLAISNRGDFLVTGSHDRSIRRWDRTEEPFFIEEEKEK




RLEEMFESDLDNAFGNKYVPKEEIPEEGAVALAGKKTQETLSATDSIIEALDIA




EVELKRIAEHEEEKNNGKTAEFHPNYVMLGLSPSDFILRALSNVQTNDLEQTLL




ALPFSDALKLLSYLKDWTTYPDKVELVSRIATVLLQTHYNQLVSTPAARPLLTT




LKDILHKKVKECKDTIGFNLAAMDHLKQLMALRSDALFQDAKVKLLEIRSQLSK




RLEERTDPREAKRRKKKQKKSTNMHAWP





376
WD40 repeat protein
MGGVQAEREDKDKVSLELTEEILQSMEVGMTFRDYSGRISSMDFHRASSYLVTA
75
1079




SDDESIRLYDVASATCLKTINSKKYGVDLVSFTSHPMTVIYSSKNGWDESLRLL




SLHDNKYLRYFKGHHDRVVSLSLCPRNECFISGSLDRTVLLWDQRAEKCQGLLR




VQGRPATAYDDPGLVFAIAFGGCVRMFDARKYEKGPFEIFSVGGDVSDANVVKF




SNDGRLMLLTTTDGHIHVLDSFRGTLLYTFNVKPTSSKSTLEASFSPEGMFVIS




GSGDGSVYAWSVRGGKEVASWLSTDTEPPVIKWAPGNLMFATGSSELSFWIPDL




SKLGAYVGRK





377
WD40 repeat protein
MAAFGAAPAGNHNPNKSSEVIQPPSDSVSSLCFSPRANHLVATSWDNQVRCWEL
99
1148




TKNGASVTSVPKASMSHDQPVLCSAWKDDGTTVFSGGCDKQAKMWSLMSGGQPV




TVAMHDAPIKEIAWIPEMNVLVTGSWDKTLKYWDTRQSNPVHTQQLPERCYAMT




VRYPLMVVGTADRNLIVFNLQNPQAEFKRFSSPLKYQTRCVAAFPDQQGFLVGS




IEGRVGVHHLDDSQISKNFTFKCHRDNNDIYSVNSLNFHPVHHTFATAGSDGTF




NFWDKDSKQRLKAMSRCSQPIPCSTFNNDGTIYAYSVCYDWSKGAENHNPATAK




TYIFLHLPQESEVKAKPRVGTTNRK





378
WD40 repeat protein
MNCSISGEVPEEPVVSTKSGHVFERRLIERYVSDYGKCPVSGEPLTMDDVLPVK
232
1806




MGKIVKPRPLQAASIPGLLSIFQNEWDSLMLSNFALEQQLHTARQELSHALYQH




DAACRVIARLKKERDEARSLLALAERQIPMTASSDIAVNAPAMSNGRKASLDEE




PGYAGKKMRPGISASIIAEITDCNLALSQQRKKRQIPSTLAPVEDLERYTQLSS




YPLHKTGKPGITSLDICHSKDIIATGGIDTSAVLFDRSSGQIMSTLSGHSKKVT




SVNFDAQGDMVLTGSADKTVRIWQGSEDGSYNCRHILKDHTAEVQAITVHATNN




YFATASLDNTWCFYEFSTGLCLTQVEGASGSEGYTSAAFHPDGLILGTGTSNAD




VKIWDVKTQANVTTFSGHTGAITAISFSENGYFLATAAQDGVKLWDLRKLKNFR




TFSAYDKDTGTNSVEFDHSGCYLGLAGSDIRVYQVASVKSEWNCVKTFPDLSGT




GKVTCVKFGPDSKYIAVGSMDHNLRIFGLPSEDGAMES





379
WD40 repeat protein
MAAPGVETLKKEIKELKEKIAQHRLDTDGEQPLPAAAKSKSVPEVSAALKQRRI
72
1124




LKGHFGKIYALHWSADSRHLVSASQDGKLIIWNGFTTNKVHAIPLRSSWVMTCA




YSPSGNLVACGGLDNLCSVYKVPHGGNKESSSAQKTYGELAQHEGYLSCCRFIK




DNEIVTSSGDSTCILWDVETKTPKAIFNDHTGDVMSLAVFDDKGVFVSGSCDAT




AKLWDHRVHKQCVMTFQGHESDINSVQFFPDGDAFGTGSDDSSCRLFDIRAYQQ




INKYSSDKILCGITSVAFSKTGKSLFAGYDDYNTYVWDTLSGNQVEVLTGHENR




VSCLGVSEDGKALATGSWDTLLKIWA





380
WD40 repeat protein
MGGVEDESEPASKRMKLSSRVLRGLANGSSRTEPAAGSSLDLMARPLPIEGDEE
315
2069




VIGSKGVIKRVEFVRLIAKALYSLGYEKSGARLEEESGIPLQSSVVNLFMQQIS




DGLWDESVVTLHKIGLSDENLVKSASFLILEQKFLELLDQEKAMDALKTLRTEI




TPLCIKNSRVRELSSCIISPSSCGLLNQNKRNSTRARSRSELLEELQKLLPPAV




IIPERRLEHLVEQALVLQTDACMLHNSIDMEMSLYTDHQCGKEHIPCRTLQILQ




SHNDEVWLVQFSHNGKYLASASNDRSAIIWEVDENGSVSLKHKLTGHQKPISSV




CWSPDDRQLLTCGVGETVRRWDVSSGECLRVYEKAGHGLISCAWFPDGKWICYG




VSDRSICMCDLEGKEIECWKGQRTLSISDLEITSDGKQIISICRETAILLLDRE




AKYERMIEENQTITSFSLSKDNRYLLVNLLNQEIHLWDIKGDFRLVAKYKGLKR




SRFVIRSCFGGLKQAFVASGSEDSQVYIWHKGSGELIEPLPGHSGAVNCVSWNP




ANHHMLASASDDRTIRIWGLNELNTRHKGARPNGVHYCNGNGTS





381
WD40 repeat protein
MTQLAETYACMPSTERGRGILIAGNPKPGSNSVLYTNGRSVVILNLDNPLDISV
145
1968




YAEHAYPATVARFSPNGEWVASADSSGAVRIWGAYNDHVLKKEFKVLSGRIDDL




QWSPDGLRIVASGDGKGKSLVRAFMWDSGTNVGEFDGHSRRVLSCAFKPTRPFR




IVTCGEDFLVNFYEGPPFKFKLSRRDHSNFVNCLRFSPDGNRFISVSSDKKGII




YDGKTGEKIGELSSDGGHTGSIYAVSWSPDSKQVITVSADKSAKIWDISEDGSG




NLRKTLTSSGSGGVDDMLVGCLWQNNHLVTVSLGGTISIYTAGDLDKAPVSFSG




HMKNVSSLSVLKGDPKVILSSSYDGLIIKWIQGIGFSGRVQRKESTQIKCLAAV




DEEIVTSGYDNKVCRVSGSGDAEFIDIGCQPKDLSLALQCPEFALVSTDTGVVL




LRGAKIVSTINLGFAVTASTVAPDGTEAIIGAQDGKLRIYSISGDTLTEEAVLE




KHRGAISVIHYSPDLSMFASGDLNREAVVWDRASREVRLKNILYHTARINCLAW




SPDSSTVATGSLDTCVIIYEVDKPASNRLTIKGAHLGGVYGLAFTDDFSVVSSG




EDACIRVWKINRQ





382
WD40 repeat protein
MKVKVISRSTDEFTRERSQDLQRVFRNFDPNLRTQEKAVEYVRALNAAKLDKVF
130
1488




ARPFVGAMDGHVDSVSCMAKNPNYLKGIFSGSMDGDIRLWDIASRRTVCQFPGH




QGPVRGLAASTDGQILVSCGIDSTVRLWNVPVATLGESDGTHENLAKPLAVYVW




KNAFWAVDHQWDGELFATAGAQVDIWNQNRSQPISSFEWGTDTVISVRFNPGEP




NVLATSGSDRSITLYDLRMSSPTRKVIMRTKTNAISWNPMEPMNFTAANEDCNC




YSYDARKLEEAKCVHKDHVSAVMDIDYSPTGREFVTGSYDRTVRIFQYNGGHSR




EVYHTKRMQRVFCVKFSCDASYVISGSDDTNLRLWKAKASEQLGVVLPRERRKH




EYHEAVKSRYKHLPEVKRIVRHRHLPKPIYKAGILRRTVNEADRRKEERRKAHS




APGSSSAEPLRKRRIIKEIE





383
WD40 repeat protein
MVRSIKNPKKAKRKNKGSKNGDGSSSSSSIPSMPTKVWQPGVDKLEEGEELQCD
269
1693




PSAYNSLHAFHIGWPCLSFDIVRDTLGLVRTEFPHQVYFVAGTQAEKPTWNSIG




IFKVSNITGKRRELVPSKPTDDADEESDSSDSDEDSDDEVGGSGTPILQLRKVG




HEGCVNRIRAMNQNPHICASWGDSGHVQIWDFSSHLNALAESEADVSQGASSVF




NQAPLVKFGGHKDEGYALDWSPLVPGRLVSGDCKNSIHLWEPTSGSTWNVDSTP




FIGHAASVEDLQWSPTEENVFASCSVDGTIAIWDTRLGKTPAASFKAHDADVNV




ISWNRLATCMLASGCDDGTFSIHDLRLLKEGDSVVAHFEYHKHPVTSIEWSPHE




ASTLAVSSADCQLTIWDLSLEKDEEEEAEFKAKTKEQVNAPEDLPPQLLFVHQG




QKDLKELHWHAQIPGMIVSTAADGFNILMPSNIQSTLPSDGA





384
CDK type A
MERYKVIKELGDGTYGSVWKALNQQTHEIVAIKKMKRKYYIWEECINLREVKSL
1163
2545




RKLNHPNIIKLKEVIRENNELFFIFEYMECNLYQIMKERSTPFSETAIIKFCYQ




ILQGLSYMHRNGYFHRDLKPENLLVTSDLIKIADFGLAREVLTSPPYTDYVSTR




WYRAPEVLLQSPTYTTAIDMWAVGAILAELFTLHPLFPGESELDEIYKICGVLG




TPDYETWPDGMQLAAFRNFIFPQFLPVNLSVLIPHASPEAIDLITRLCSWDPQK




RPTAEQALHHPFFRIGMSIPLSLGGHFQDNTCAAEVDTNFHSKKACKGRGMGEK




ESSLECFLGLSLGLKPSLGHLGAMGSQGVGAVKQEVGSSPGCQSNPKQSLFQVL




NSRAILPLFSSSPNLNVVPVKSSLPSAYTVNSQVMWPTIAGPPAAAVTVSTLQP




SILGDFKIFGKSMGLASQYAGKEASPFS





385
CDK type A
MGEMGRGINNSSNNNNSNRPAWLQHYDLVGKIGEGTYGLVFLARSKLPNNRGLR
152
1582




IAIKKFKQSKDGDGVSPTAIREIMLLREFSHENVVKLVNVHINHVDMSLYLAFD




YAEHDLYEIIRHHREKLNHHNINQYTVKSLLWQLLNGLNYLHSNWIVHRDLKPS




NILVMGEGEEHGVVKIADFGLARIYQAPLKPLSDNGVVVTIWYRAPELLLGAKH




YTSAVDMWAVGCIFAELITLKPLFQGVEVKASPNPFQLDQLDKIFKVLGHPTIE




KWPTLMNLPHWSKNLQQIQQHKYDNAGLHIGPIPAKSPAYDLLSKMLEYDPRKR




ITAAQALEHEYFRIDPQPGRNALVPSQPGEKAINYPPRLVDANTDFDGTIAPQP




SQVSSGNAPSGSIASAAVPAVRPLPQQMQLMGMQRMQNPGMAAFNLGAQASMSG




LNHNNIALQRGSSQQQAHQQVRRKEPNSGFPNTGYPPPPKSRRL





386
CDK type B-1
MDKYEKLEKVGEGTYGKVYKARDKMTGQLVALKKTRLEMDEEGVPPSSLREISL
389
1297




LQMLSQSIYVVRLLCVEHVTKKGKPLLYLVFEYLDTDLKKFIDYRRSVNAGPLP




QNVIQSFMYQLLKGVAHCHSHGVLHRDLKPQNLLVDKSKGLLKVGDLGLGRAFT




VPLKCYTHEVVTLWYRAPEVLLGSTHYSTPVDIWSVGCIFAEMVRRQPLFPGDC




EIQQLLHIFTLLGTPTEEMWPGVKRLRDWHEYPQWKPENLARAVPNLSPTGLDL




ISKMLQCDPAKRISAKAAMNHPYFDDLDKSQF





387
CDK type B-1
MDGYEKMDKVGEGTYGKVYMARDKKTGQLVALKKTRLENDGEGIPPTALREISL
38
946




LQMLSQDIYIVRLLDVKHTENKLGKPLLYLVFEYMESDLKKYIDSYRRSHTKMP




PSMIKSFMYQLCRGVAYCHSRGVMHRDLKPHNLLVDKEKGVLKIADLGLSRAFT




VPVKKYTHEIVTLWYRAPEVLLGATHYSLPVDIWSVGCIFAEMSRMQALFTGDS




EVQQLMNIFRFLGTPNEEVWPGVTKLKDWHIYPEWKPQDISHAVPDLEPSGLDL




LSQMLVYEPSKRISAKKALEHPYFDDLDKSQF





388
CDK type B-1
MDAYEKLEKVGEGTYGKVYKAKDKNTGQLVALKKTRLESDDEGIPPTALREISL
180
1088




LQMLSQDIHIVRLLDVEHTENKNGKPLLYLVFEYMDSDLKKYIDGYRRSHTKVP




PNIIKSFMYQLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVVKIADLGLGRAFT




IPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDIWSVGCIFAEMVRLQALFIGDS




EVQQLFKIFSFLGTPNEEIWPGVTKFRDWHIYPQWKPQDISSAVPDLEPSGVDL




LSKMLVYEPSKRISAKKALEHPYFDDLDKSQF





389
CDK type B-1
MDSYEKLEKVGEGTYGKVYKAKDKKTGKLVALKKTRLENDGEGIPPTALREISL
40
948




LQMLSQDMNIVRLLDVEHTENKNGKPLLYLVFEYMDSDLKKYVDGYRRSHTKMP




PKIIKSFMYQLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVLKIADLGLGRAFT




VPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDIWSVGCIFAEMSRMHALFCGDS




EVQQLMSIFKFLGTPNEGVWPGVTKLKDWHIYPEWRPQDLSRAVPDLEPSGVDL




LTKMLVYEPSKRISAKKALQHPYFDDLDKSQF





390
CDK type B-1
MEKYEKLEKVGEGTYGKVYKGRDKRTGRLVALKKTPFHQEEGIPPTAIREISLL
299
1134




KSLSQCIYIVKLLDVKASFNGKGKHVLFMVFEYADSDLKKHIDAHRQCNTKLSP




RSIQSYMFQLCKGIAYCHSHGVLHRDLKPQNILVDQKIGLLKIADLGLGRACTV




PIKSYTFEVVTLWYRAPEVLLGAKRYSMALDIWSLGCIFAELCNLQALFAGDSQ




IQQLINIFRLLGTPNEQLWPGVTQLSDWHEFPQWRPQDLSKVVFNLDPNGVDLL




SKMLQYDPAKRISAKEALDHPYFDSLDKSQF





391
CDK type C
MGCVCGKPSARAADYVESPAEKGASSNSRSSSMASRRLVAPAVMDQGIDAENGH
105
2642




EGDYRTKLRGKQSNGADPVSLLSDDAEKQRHSRHHQHQQHHPIRPHHLRPQGEF




VPNANSNPRFGNPPRHIEGEQVAAGWPAWLTAVAGEAIKGWIPRRADSFEKLDK




IGQGTYSNVYKARDLDTGKIVALKKVRFDNLEPESVRFMAREIQVLRRLDHPNV




VKLEGLVTSRMSCSLYLVFEYMDHDLAGLAACPGIKFTEPQVKCYMQQLLRGLD




HCHSRGVLHRDIKGSNLLIDNGGILKIADFGLATFFHPDQRQPLTSRVVTLWYR




PPELLLGATEYGVAVDLWSTGCILAELLAGKPIMPGRTEVEQLHKIFKLCGSPS




EDYWKKSKLPHATIFKPQQPYKRCVAETFKDFPPSALALMEVLLAIEPADRGTA




TSALKSDFFTTKPLACDPSSLPKYPPSKEFDAKIRDEEARRQRAAGGRGRDAAR




RPSRESRAIPAPEANAELAISIQKRRLSSQGPSKSKSEKFNPQQEDGAVGFPIE




PPRPMHIGIDAGATSRMYSQQFGPSHSGPLSNQISSSIWGKNQKEDEIQMAPGR




PSRSSKATISDFRKPGACAPQPGADLSHLSSLVATARSNAGIDTHKDRSGMWQH




NRIDAIDGVHNNGKHEFLEVPEHPNRQDWTRFQQPESFKGLDNYHLQDLPATHH




RKDERVASKEATMNWQGYGGQGGDKIHYSGPLLPPSGNIDEILKEHERHIQHAV




RRARQDKGRPQRSNLSQNERKAFEHRSFVSGVNGNAGYSDLVNELPISVGSNRL




KVSKTRGTEEIVELRELEREPLSSVMEKYEREHEM





392
CDK type C
MGCVCAKQSDILGEPESPKVKGSNLASSRWSVSSETKQLPQHSDSGILHHQHYY
187
2580




HPRDESDEAKLKESNYGGSKRRTRQGRDPADLDMGIFVRTPSSQSEAELVAAGW




PAWMAAFAGEAIHGWIPRRAESFEKLYKIGQGTYSNVYKARDLDNGKIVALKKV




RFDSLDAESVRFMAREILVLRKLDHPNIVKLEGLVTSEVSSSLYLVFEYMEHDL




AGLAACPGIKFTEPQVKCYMQQLLQGLDHCHRHGVLHRDIKGSNLLIDNGGILK




IADFGLATFFYPDQKQLLTSRVVTLWYRPPELLLGATDYGVAVDIWSAGCILAE




LLAGKPILPGRTEVEQLHKTFKLCGSPSEDYWKESKLPHATIFKPQHPYKSCIA




EAFKDFSPSALALLETLLAIEPGHRGEASGALKSEFFTTEPLSCDPSSLPKYPP




SKEFDAKLRAQETRRQRDVGVRGHGSEAARRTSRLSRAGPTPNEGAELTALTQK




QHSTSHATSNIGSEKPSTKKEDYTAGLHIDPPRPVNHSYETTGVSRAYDAIRGV




AYSGPLSQTHVSGSTSGKKPKRDHVKGLSGQSSLQPSKPFIVSDSRSERIYEKS




HVTDLSNHSRLAVGRNRDTTDPHKSLSTLMQQIQDGTLDGIDIGTHEYARAPVS




STKQKSAQLQRPSALKYVDNVQLQNTRVGSRQSDERPANKESDMVSHRQGQRIH




CSGPLLHPSANIEDLLQKHEQQIQQAVRRAHHGKREALSNKSSLPGKKPVDHRA




WVSSGKGNKESPYFKGKGNKELSDLKGGPTAKVTNFRQKVM





393
CDK type C
MAVANPGQLNLQEAPSWGSRSVNCFEKLEQIGEGTYGQVYMAKEIETGEIVALK
220
1749




KIRMDNEREGFPITAIREIKLLKKLQHENVIKLKEIVTSPGPEKDEQGKSDGNK




YNGSIYMVFEYMDHDLTGLAERPGMRFSVPQIKCYMKQLLIGLHYCHINQVLHR




DIKGSNLLIDNNGILKLADFGLARSFCSDQNGNLTNRVITLWYRPPELLLGSTK




YGPAVDMWSVGCIFAELLYGKPILPGKNEPEQLTKIFELCGSPDESNWPGVSKL




PWYSNFKPQRQMKRRVRESFKNFDRHALDLVEKMLTLDPSQRISAKDALDAEYF




WTDPVPCAPSSLPRYEPSHDFQTKRKRQQQRQHDEMTKRQKISQHPPQQHVRLP




PIQNAGQGHLPLRPGPNPTMHNPPPQFPVGPSHYTGGPRGAGGQNRHPQNIRPL




HAAQGGGYNANRGYGGPPQQQGGGYPPHGMGNQGPRGGQFGGRGAGYSQGGPYG




GPVGGRGPNVGGGNRGPQFWSEQ





394
CDK type D
MQNMEDNVQSSWSLHGNKEICARYEILERVGSGTYSDVYRGRRKADGLIVALKE
438
1748




VHDYQSSWREIEALQRLCGCPNVVRLYEWFWRENEDAVLVLEFLPSDLYSVIKS




GKNKGENGIPEAEVKAWMIQILQGLADCHANWVIHRDLKPSNLLISADGILKLA




DFGQARILEEPEAIYEVEYELPQEDIVADAPGERLMEEDDSVKGVRNEGEEDSS




TAVETNFGDMAETANLDLSWKNEGDMVMQGFTSGVGTRWYRAPELLYGATIYGK




EIDLWSLGCILGELLILEPLFSGTSDIDQLSRLVKVLGTPTEENWPGCSNLPDY




RKLCFPGDGSPVGLKNHVPSCSDSVFSILERLVCYDPAARLNAKEVLENKYFVE




DPYPVLTHELRVPSPLREENNFSEDWAKWKDMEADSDLENIDEFNVVHSSDGFC




IKFS





395
CDK type D
MDLNQYPEDLNPELPEGTDNVDNPDNNKGSPVPSPHPPLKPLDPSERYRKGITL
240
1631




GQGTYGIVYKAFDTVTNKTVAVKKIHLGKAKEGVNVTALREIKLLKELSHPNII




QLIDAYPHKQNLHIVFEFMETDLEAVIKDRNLVFSPADIKSYLQMTLKGLAVCH




KKWVLHRDMKPNNLLIAADGQLKLGDFGLARLFGSPDRKFTHQVFAVWYRAPEL




LFGAKQYGPAVDIWATGCIFAELLLRKPFLQGVSDLDQIGKIFAAFGTPRQSQW




PDVASLPDFVEFQFVPAPSLRSLFPMASEDALDLLSKMFTLDPKNRITAQQALE




HRYFSSVPAPTRPDLLPKPSKVDSSRPPKHASPDGPVVLSPSKARRVMLFPNNL




AGILPKQVSQSTTGGTPIEFDMPTQKLREVCPRSRITESGKKHLKRKTMDMSAA




LDECAREQEGQEGKTILDPDHQRSAKKEKHM





396
Cyclin A
MAGGQENCVRITRARAACVSKASAPVIQSQVDEKKSRKRAPKRAAVDDLAANAS
252
1604




GSQPKRRAVLGDVTNLHAAATDCLSTAEDQVDAPNPSIKGRARNKKKEARTSTK




VVKDEIHPESNPLADHSSNLSECQKPPAAKLAEQRSLRGVPSKAKQGGSSNSQS




CSKHTDIDKDHTDPQMCTTYVEDIYEYLRNAELKNRPSANFMETAQNDITPNMR




AILVDWLVEVSEEYKLVPDTLYLTVSYIDRYLSANPTSRHKLQLLGVSCMLIAS




KYEEVCPPHVEEFCYITDNTYTRDEMLSMERKILIFLNFEMTKPTTKSFLRRFV




RASQAGNKAPSLHMEFLANYLAELTLMECSFLQYLPSLIAASTVFLSRLTLDFL




TNPWNPTLAHYTGYKASQLKDCVMAIYNVQMNRKGSTLVAIREKYQQHKFKCVA




SLPPPPFIAERFFEDTPN





397
Cyclin A
MTGTQASNVRITRARAAKSTLNNALPPLPPAQGKPRGKRAATESNISGFSVAAE
261
1817




PLKRRAVLSDVSNICKEAAAVDCLKKPKAVKVVSQNANAKGRGRGIPRNNKKIT




QEAEIKKETSPAICNVDDASAGNAIGDDKQNNNVNPLKEVQDNPKELNPIAEQI




SVHPHCKQSVEKPNEKEIVVSDNKAAIASLKQQSTLQSLRIPKQPKYSLKQGNP




VPLANLHEDVGRSSCSDFIDIDSEYKDPQMCTAYVTDIYANMRVVELKRRPLPN




FMETTQRDINANMRSVLIDWLVEVSEEYKLVPDTLYLTVSYIDRFLSANVVNRQ




RLQLLGVSCMLVASKYEEICAPPVEEFCYITDNTYKKEEVLEMEISVLNRLQYD




LTTPTTKTFLRRFIRAAQASCKVSSLHLEFMGNYLAELTLVEYDFLKYLPSLIA




AAAVFVARMTLDPMVHPWNSTLQHYTGYKVSDMRDCICAIHDLQLNRKGCTLAA




IREKYNQPKFKCVANLFPPPIISPQFLIDNEV





398
Cyclin B
MAAPNQNALLINNNNRRPLVDIGNLVGALNAQCNISKNGARKRAFGDIGNLVED
167
1576




LDAKCTISKYWVRKRPRTNFGVNANKGASSSTQGQGIVVRGEQKAWDRIVWGNK




QSCAIKMNAQHVTATQRGTAISISDIIDSSVQDGGIKAPSQLKARKQTVRTVTA




TLTARSEDSLRDVLEVPPGIDDGDRDNPLAVVEYVEDIYHFYRKIEVRSCVPPD




YMTRQLEIKDSMRGVIIDWLIEVHRTFLLMPETLYLTVNIIDRYLSIQSVTRNE




LQLMGITAMFIASKYEEISPPKINDLVYITKDAYTSKQIVNMEHTILNRLKFKL




TVPTPYVFLVRFLKAAGPDKVMKNLAFFLVDLCLLHYKMIKYSPSMLAAAAVYT




AQCTLKKHPYWNKTLILHIGYSEAHLRECAHLMADLHLKAEGSNLKSVYKKYSY




PIFGSVAFLSPAKIPAGTVAAPAIDKCAHQIYLRNLR





399
Cyclin B
MFPNKQTQGLVQNKKMASKAAQPKAMVPPQRVPPAANNRRALGDIGNIVADVGG
183
1598




KCNVTKDGVNGKPLAQVSRPITRSFGAQLLAQAAANKGISAANNQTQVPVVIPK




ADVRGNKQRRTSKSKDIPPTTVVTNESDDCVIIEQAQRIKPTCNHNVGAVGNKE




KPQLLTAKPKSLTASLTSRSAVALRGFRFDDEMTEAEEDPLPNIDVGDRDNQLA




VVEYVEDIYKFYRRTEQMSCVPDYMPRQQEINPKMRAVLINWLIEVHYRFGLMP




ETLYLTTNLIDRYLATQLVSRSNYQLVGATAMLLASKYEEIWAPEMNDFLDILE




NKFERKHVLVMEKAMLNKLKFHLTVPTPYVFLVRFLKAAASDEEMENLVFFLME




LSLMQYVMIKFPPSMLAAAAVYTAQITLKKTTVWNDVLKRHTGYSEIDLKECTR




LMVAFHQSSEESKLNVVFKKYSMPEYDSVALIKPAKLPA





400
Cyclin D
MAPSFDCVANAYIESCEDQEKLRQNAQILAQSGENDVDEPVSMLVQRETHYMLP
98
1126




EDYLQRLRNRTLDVNVRREAVGWILKVHSFYNFGAPTAYLAVNYLDRFLSRHRM




PQGVKAWMIQLMAVACLSLAAKMEETQVPLPSDLQREDARFIFDARTIQRMELL




ILSTLQWGMRSITPFSFIDYFAYRAVQGHGHGHDATPKAVMSRAIELILSTTEE




IDFMEYRPSAIAAAALLCAAEEVVPLQAVHYKRALSSSITDVDKDKMFGCYNLI




QETIIEGGCYWTPMSLQSTEKTPVGVLDAAACLSNTPTSSYSVKPYASVTAAKR




RKLNEICSALLVSQAHPC





401
Cyclin D
MAANFWTSSHCKELLDAEKVGIVHPLDKDQGLTQEDVKIIKINMSNCIRTLAQY
148
894




VKLRQRVVATAITYCRRVYTRKSFTEYDPQLVAPTCLYLASKAEESTVQAKLVI




FYMKKYSKHRYEIKDMLEMEMKLLEALDYYLVIYHPYRPLIQFLQDAGLNDLKV




TAWALVNDTYRTDLILTYPPYMIALACIYFACIMEEKDAQAWFEELRVDMNEIK




NISMEIVDYYDNYRVIPDEKMNSALNKLPHRF





402
Cyclin D
MAPALSSSYECLSHLLCAEDASNVVGCWDEDESKIFCEEEEGFGIQHFPDFPVP
287
1363




DDDEIRVLVRKESQYMPGKSYVQSYQNLGLDFTARQNAIGWILKVHGSYNFGPL




TAYLSINYLDRFLSRNPLPKAKVWMLQLLSVACLSLAAKMEETQVPLLLDLQAE




EPDFLFEPRTIQRMELLVLSTLEWRMLSVTPFSFVDYFLQGGGGRKPPPRAMVA




RANELIFNTHTVLDFLEHRPSAIAAAAVICAAEEVLPLEAAQYKETILSCSLVD




KEWVFGSYNLIQEVLIEKFSTPKKAKSASSSIPQSPVGVLDAFCLSNNSNNTSL




EASLSVNLYASVAAKRRKLNDYCNTWRMFQHSTC





403
Cyclin D
MAPNCIDCAPSDLFCAEDAFGVVEWGDAETGSLYGDEDQLHYNLDICDQHDEHL
251
1348




WDDGELVAFAEKETLYVPNPVEKNSAEAKARQDAVDWILKVHAHYGFGPVTAVL




SINYLDRFLSANQLQQDKPWMTQLAAVACLSLAAKMDETEVPLLLDFQVEEAKY




IFESRTIQRMELLVLSTLEWRMSPVTPLSYIDHASRMIGLENHHCWIFTMRCKE




ILLNTLRDAKFLGLLPSVVAAAIMLHVIKETELVNPCEYENRLLSAMKVNKDMC




ERCIGLLIAPESSSLGSFSLGLKRKSSTINIPVPGSPDGVLDATFSCSSSSCGS




GQSTPGSYDSNNSSILCISPAVIKKRKLNYEFCSDLHCLED





404
Cyclin-dependent kinase regulatory subunit
MPQIQYSEKYTDDTYEYRHVVLPPETAKLLPKNRLLNENEWRAIGVQQSRGWVH
229
510




YAIHRPEPHIMLFRRPLNYQQNQQQQAGAQSQPMGLKAQ





405
Cyclin-dependent kinase regulatory subunit
MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTENEWRGIGVQQSRGWVH
92
409




YAIHCSEPHIMLFRRPLNYEQNHQHPEPHIMLFRRPLNCQPNHQPQAHHPT





406
Cyclin-dependent kinase regulatory subunit
MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTENEWRGIGVQQSRGWVH
64
381




YAIHCSEPHIMLFRRPLNYEQNHQHPEPHIMLFRRPLNCQPNHQPQAHHPT





407
Cyclin-dependent kinase regulatory subunit
MPQIQYSEKYYDDTYEYRHVVLPPDVARLLPKNRLLNENEWRGIGVQQSRGWVH
68
349




YAIHRPEPHIMLFRRHLNYQQNQQQQAQQQPAQAMGLQA





408
Histone acetyltransferase
MALVETEPVTLIHPEEPKKFKKKPTPGRGGVISHGLTEEEARVKAIAEIVGAMV
125
1849




EGCRKGEDVDLNALKAAACRRYGLSRAPKLVEMIAALPDGERAAVLPKLKAKPV




RTASGIAVVAVMSKPHRCPHIATTGNICVYCPGGPDSDFEYSTQSYTGYEPTSM




RAIRARYNPYVQTRSRIDQLKRLGHTVDKVEFILMGGTFMSLPADYRDYFIRNL




HDALSGHTSSNVEEAVCYSEHSATKCTGLTIETRPDYCLGPHLRQMLSYGCTRL




EIGVQSTYEDVARDTNRGHTVAAVADCFCLAKDAGFKVVAHMMPDLPNVGVERD




MESFREFFENPAFRADGLKIYPTLVIRGTGLYELWKTGRYRNYPPEQLVDIIAR




VLALVPPWTRVYRVQRDIPMPLVTSGVEKGNLRELALARMDDLGLKCRDVRTRE




AGIQDIHHKIRPEVVELVRRDYCANEGWETFLSYEDTRQDILVGLLRLRKCGHN




TTCPELKGRCSIVRELHVYGTAVPVHGRDADKLQHQGYGTLLMEQAERIAWKEH




RSIKIAVISGVGTRHYYRKLGYELEGPYMMKYLN





409
Histone acetyltransferase
MLGFRDLYTSICEHLQRASGRLPIIAAATSLISTPEIAAVEKENKAPNSVDKMG
70
1602




MGSADESGRFSTSNGQFMNMNNGVVKEEWKGGVPVVPSAPTTVPVITNVKLETP




SSPDHDMARKRKLGFLPLEVGTRVLCKWRDGKFHPVKIIERRKLPNGATNDYEY




YVHYTEFNRRLDEWVKLEQLELDSVETDADEKVDDKAGSLKMTRHQKRKIDETH




VEGNEELDAASLREHEEFTKVKNITKIELGRYEIETWYFSPFPSEYNNCEKLYF




CEFCLNFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEVDGKKNKVYAQNL




CYLAKLFLDHKTLYYDVDLFLFYILCECDERGCHMVGYFSKEKHSEESYNLACI




LTLPPYQRKGYGKFLISFSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRVLLD




ILKKHKSNISIKELSDMTAIKADDVLSTLQGLDLIQYRKGQHAICADPKVLDRH




LKAVGRGGLEVDVCKLIWTPYKEQ





410
Histone acetyltransferase
MGSLDESTCSEEIRDEGKDSIRTKFKVESTVNNAQNGGNDNSKKKRAAGLPLEV
140
1465




GIRLLCKWRDSKLHPVKIIERRKLPNGFPQDYEYYVHYTEFNRRLDEWVKLEQF




ELDSVETDADEKIEDKGGSLKMTRHQKRKIDEIHVEEGQGHEDFDPASLREHEE




FTKVKNIAKVELGRYEIETWYFSPFPPEYSHCEKLFFCEFCLNFMKRKEQLQRH




MRKCDLKHPPGDEIYRNGTLSMFEVDGKKNKIYGQNLCYLAKLFLDHKTLYYDV




DLFLFYVLCECDDRGCHVVGYFSKEKHSDEAYNLACILTLPPYQRKGYGKFLIA




FSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRILLDILKKQRGNISIKELSDM




TAIKVEDVISTLQVLDLIQYRKGQHVICADPKVLDRHLKAAGIAGLEVDVSKLI




WTPYKEQCG





411
Histone acetyltransferase
MASAPMVGCDDSRDKHRWVESKVYMRKGHGKGSKGNAGFNAQNSTAQVRRENDN
628
2565




MGNSIADNGKSEAASEGLSSLSRKQITVNQDHPPNETSSMPAVGGLQNIDTHVT




FKLEGCSKQEIWELRKKLTNELEQVRGTFKKLEARELQLRGYSVSAGVNTSYSA




SQFSGNDMRNNGGKEVTSEVASGGAITPKQAQRESNPPRQLSISLMENNQAASD




MGEKGKRTPKANQYYRNSEFVLGKDKFPPAESKKSKSTGNKKISQSKVFSKETM




QVGKEFMPQKSVNEVFKQCSLLLTKLMKHKYGWVFNLPVDAQALGLHDYHTIIK




RPMDLGTVKSKLEKNLYNSPASFAEDVKLTFSNAMTYNPKGHEVHTMAEQLLQL




FEERWKTIYEEHLDGKMRFGSGQGLGASSSTKKLPFQDSKKNIKKSEPAGGPSP




PKPKSTNHHASRTPSAKKPKAKDPHKRDMTYEEKQKLSTNLQNLPQERLELIVQ




IIKKRNPSLCQHDEEIEVDIDSFDTETLWELDRFVTNYKKSLSKNKKKALLADQ




AKRASEHGSARNKHPMIGRELPMNNKKGEQGEKVVEIDHMPPVNPPVVEVEKDG




VYAKRSSSSSSSSSDSGSSSSDSDSGSSSGSESDAYAATSPPAGSNTSARG





412
Histone acetyltransferase
MEGHSGALGFGQGFSRSSQSPNLSPSPSHSASASVTSSGQKRKRNEVEHAGVAS
55
1818




NSTGMFAVPPSHIYSHLHPMSMSMPMPMHNSHPSSLSESRDGALTSNDDDDNLT




GGNQSQLDSMSAGNTDGREDFDDEDDDDDDEEDDDEVEGDEEDQDHDPDADDDS




DDGHDSMRTFTAARLDNGAPNSRNLKPKADAAGVAIAPTVKTEPILDTVKEEKV




SGNNNNNSVSANNAQVAPSGSAVLLSAVKEEANKPTSTDHIQTSGAYCAREESL




KREEDADRLKFVCFGNDGIDQHMIWLIGLKNIFARQLPNMPKEYIVRLVMDRSH




KSVMIIKQNQVVGGITYRPYLSQKFGEIAFCAITADEQVKGYGTRLMNHLKQHA




RDVDGLTHFLTYADNNAVGYFIKQDFTKEIKLEKERWHGYIKDYDGGILMECKI




DPKLPYTDLPAMIRWQRQTIDEKIRELSNCHIVYSGIDIQKKEAGIPRKPIKVE




DIPGLKEAGWTTDQWGHSRFRLLNSPSEGLPNRQVLHAFMRSLHKAMVEHADAW




PFKEPVDPRDVPDYYDIIKDPMDVKRMFTNARTYNTHETIYYKCANR





413
Histone deacetylase
MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMKPHRIRMAHSLIVHYA
259
1710




LDEKMEVCRPNLLQSRELRVFHADDYISFLQSVTPETQHEQLRQLKRFNVGEDC




PVFDGLYNFCQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEASGFCYVN




DIVLAILELLKVHQRVLYIDIDIHHGDGVEEAFYSTDRVMSVSFHKFGDYFPGT




GHLKDVGYGKGKYYSLNVPLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCG




ADSLSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYTIRNVARCWCYET




AVAVGVEPQDKLPYNEYYEYFGPDYTLHVAPSNMENQNSAKELAKIRNTLLEQL




KRIQHVPSVPFQERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQKPQN




RDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGIVNENDGAKWPLGEAG





414
Histone deacetylase
MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMKPHRIRMAHSLIVHYA
356
1807




LDEKMEVCRPNLLQSRELRVFHADDYISFLQSVTPETQHEQLRQLKRFNVGEDC




PVFDGLYNFCQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEASGFCYVN




DIVLAILELLKVHQRVLYIDIDIHHGDGVEEAFYSTDRVMSVSFHKFGDYFPGT




GHLKDVGYGKGKYYSLNVPLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCG




ADSLSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYTIRNVARCWCYET




AVAVGVEPQDKLPYNEYYEYFGPDYTLHVAPSNMENQNSAKELAKIRNTLLEQL




KRIQHVPSVPFQERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQKPQN




RDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGIVNENDGAKWPLGEAG





415
Histone deacetylase
MEFWGVEVKPGEALTCDPGDERYLHMSQAAIGDKEGAKENERVSLYVHVDGKKF
261
1298




VLGTLSRGKCDQIGLDLVFEKEFKLSHTSQTGSVFVSGYTTVDHEALDGFPDDE




DLESSEDEEEELAQITTLTAKENGGKTGAKPVKPESKSSVTDKAAAKGKPEVKP




PVKKQEDDSDSDEDEDEDEDEDEDDDDEDDEDMKDASASDDGDEEDDSDEESDD




DEEEDEETPKPAAGKKRPMPASDNKSPATDKKAKITTPAGGQKPGADKGKKTEH




IATPYPKHGAKGPASGVKGKETPLGSKQTPGSKVKNSSTPESGKKSGQFKCQSC




SRDFATEGALSSHNAAKHGGK





416
Histone deacetylase
MMETGGNSLPSGPDGVKRKVAYFYDPEVGNYYYGQGHPMKPHRIRMTHALLVQY
365
2251




GLHKEMQILKPYPARDRDLCRFHADDYVAFLRGITPETIQDQVKALKRFNVGDD




CPVFDGLYQYCQTYAGGSVGGAVKLNHKLCDIAINWAGGLHHAKKCEASGFCYV




NDIVLAILELLKYHKRVLYVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPG




TGDIRDIGCGKGKYYAVNVPLDDGIDDESFQSLFKPIIQQVMLVYNPEAIVLQC




GADSLSGDRLGCFNLSVKGHAECVRYMRSFNVPLLMVGGGGYTVRNVARCWCYE




TGVAVGVEIDDKMPQHEYYEYFGPDYTVHVAPSNMENKNTKQYLDKIRSKILEN




INSLPCAPSAQFQVQPPDTDFPELEEEDYDERTRSHKWDGASCDSDSENGDLKH




RNHDVEESAFPRHNLANISYNTKIKLEGVGTGGLDMAAGTDTKKNDESFEAMDY




ESGEELRQDHFASTINASQPCDPALLTGVQNQLQSTDTVKPIEQSGNAPGIPPP




SVATVSTGTRPSSISRTSSLNSMSSVKQGSILGPNPPQGLNASGLQFPVPTSNS




PIRQGGSYSITVQAPDKQGLQNHMKGPQNMPGNS





417
Histone deacetylase
MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHKKMEIYRPHK
156
1454




AYPVELAQFHSADYVEFLHRITPDTQHLFTKELVKYNMGEDCPVFENLFEFCQI




YAGGTIDAAHRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGILELLKH




HARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFHKYGDMFFPGTGDVKEVGEREG




KYYAINVPLKDGIDDASFTRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGC




FNLSIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVETGVLLDTELPNE




IPDNDYIKYFAPDYSLKINTAGNMENLNSKTYLSAIKVQVMENLRAIQHAPSVQ




MHEVPPDFYIPDIDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDMEEAS





418
Histone deacetylase
MDSSKSEEANILHVFWHEGMLNHDLGTGVFDTLEDPGFLEVLEKHPENADRVRN
203
1348




MLSILRKGPIAPYTEWHTGRAAYLSELYSFHRPDYVDMLAKTSTAGGKTLCHGT




RLNPGSWEAALLAAGTTLEAMRYILDGHGKLSYALVRPPGHHAQPTQADGYCFL




NNAGLAVELAVASGCKRVAVVDIDVHYGNGTAEGFYERDDVLTISLHMNHGSWG




PSHPQTGFHDEVGRGKGLGFNLNVPLPNGTGDKGYEHAMHELVVPAISKFMPEM




IVLVIGQDSSAFDPNGRECLTMEGYRKIGQIMRQQADQFSGGRLVVVQEGGYHI




TYAAYCLHATLEGVLCLPHPLLSDPIAYYPEHDIYSERVTFIKNYWQGIISTTD




KRN





419
Histone deacetylase
MEESGNALVSGPDGSKRRVTYFYDADIGNYYYGQGHPMKPHRMRMAHNLIVHYG
229
1644




LHQRMEVCRPHLAQSKDIRAFHTDDYIHFLSSVAPDTQQEQLRQLKRFNVGEDC




PVFDGLFNFCQSSAGGSIGAALKLNRKDADIAINWAGGLHHAKKCEASGFCYVN




DIVLGILELLKVHQRVLYIDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGT




GHIKDVGYGKGKYYALNVPLNDGIDDESYKHLFRPIIQKVMEVYQPEAVVLQCG




ADSLSGDRLGCFNLSVKGHADCVRFVRSFNIPLMLVGGGGYTIRNVARCWCYET




AVAVGVEPQDKLPYNEYYEYFGPDYTLYVAPSNMENLNTEKDLEKMRNVLLEQL




SKIQHTPSVPFQERPPDTEFNDEEEEDMEKRSKCRIWDGEYVGSEPEEDGKLPR




FDADTYERSVLKHENKRLVPVSNVEPLKRIKQEEDGAAV





420
Histone deacetylase
MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHKKMEIYRPHK
156
1454




AYPVELAQFHSADYVEFLHRITPDTQHLFTKELVKYNMGEDCPVFENLFEFCQI




YAGGTIDAAHRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGILELLKH




HARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFHKYGDMFFPGTGDVKEVGEREG




KYYAINVPLKDGIDDASFTRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGC




FNLSIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVETGVLLDTELPNE




IPDNDYIKYFAPDYSLKINTAGNMENLNSKTYLSAIKVQVMENLRAIQHAPSVQ




MHEVPPDFYIPDIDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDMEEAS





421
Histone deacetylase
MDLNLVSHGEEEEGVRRRKVGIVYDERMCKHATPEDQPHPEQPDRIRVIWDKLN
27
2222




SAGVLHKCVMVEAKEASEEQLAGVHSRKHIEVMKSIGTARYNKKKRDKLAASYS




SIYFSQGSSEAALLAAGSVVEISEKVASGELDAGVAIVRPPGHHAEADKAMGFC




LFNNIAIAAKHLVHERPELGVQEVLIVDWDVHHGNGTQHMFWTDPHVLYFSVHR




FDAGTFYPGGDDGFYDKIGEGKGAGYNINVPWEQGKCGDADYLAVWDHVLVPVA




KSYDPDMVLISGGFDAALGDPLGGCRLTPYGYSLMTKKLMEFAGGKIVLALEGG




YNLKSLADSFLACVEALLKDGPSRSSVLTHPFGSTWRVIQAVRKELSSFWPALN




EELQLPRLLKDASESFDKLSSSSSDESSASEDEKKIAEVTSIMEVSPDPSSILA




LTAEDIAQPLAGLKIEEAGTDSQRSSDHTLLDLTNDDTQKLKQFEGEIFVMIGD




EESVPSASSSKDQNESTVVLSKSNIKAHSWRLTFSSIYVWYASYGSNMWNPRFL




CYIEGGQVEGMAKRCCGSEDKTPPQRIQWKVVPHRMFFGRSYTNTWGSGGVSFL




DPNCSDTSEAHVCLYKITLAQFNDLLLQENNLNCGTEHPLVDLSSIDAIRNGNS




ILELIKDSWYGTLIYLGMEGGLPIVTFTCSVCDVEKFKHGQLPLCPPSSRYENI




LIRGLVQGKKLSEDDATAYIRAASTSPLL





422
Peptidylprolyl isomerase
MADEDLDLSDVGEVEDEPGEEIESTPPLAVGQEKEINSLALKKKLLKVGTRWET
71
1759




PENGDEVTVHYTGTLPDGTKFDSSRDRGEPFTFKLGQGQVIKGWDQGIVTMKKG




ERALFTIPPELAYGSSGVRPTIPPNATLQFDVELLSWTNIVDVCNDGGILKRII




SEGEKYERPKDPDEVTVKYEAKLEDGTLVAKSPEEGVEFYVNDGHFCPAIAKAV




KTMKRGESVILTIKPTYAFGERGKDAEEGFAAIPPNATLTTSLELVSFKAVIAV




TEDKKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEKKGYEGEEPFQFV




VDEEQVIAGLDKAVETMKTGEIALITIGAEYGFGNFETQRDLAVIPPNSTLIYE




VEMISFTKEKESWDMDTTEKIEASKQKKEQGNSLFKVGKYQRAAKKYEKAAKYI




EHDSSFSAEEKKQSKVLKVSCNLNHAACRLKLKDFKEAVKLCSKVLELESQNVK




ALYRRAQAYIETADLDLAEFDIKKALEIEPQNREVQLEYKILKQKQIEYNKKDA




KLYGNMFAKLNKLEAFEGKVLS





423
Peptidylprolyl isomerase
MADEGLELSDVAEVEDEPGEEFESAPPLVVGQEKELNSSGLKKKLLKAGTRCET
358
2040




PENGDEVTVHYTGTLLDGTKFDSSRDRGEPFTFNIGQGQVIKGWDQGIVTMKKR




EHALFTIPPELAYGASGMPPTIPPNATLQFDVELLSWTNIVDVCKDGGILKRII




SDGEKYERPKDPDEVTVKYEAKLEDGMLVAKSPEEGVEFYVNDGNFCPAIVKAV




KTMKKGENVTLTIKPAYAFGEQGKDAEEGFAAIPPNATITINLQLVSFKAVKEV




TEDKKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEKKGYAGEEPFQFV




VDEEQVIAGLDKAVETMKTGEVALITIGPEYGFGNIETQRDLAVIPPYSTLIYE




VEMVSFTKEKESWDMNTTENIEASKQKKEQGNSLFKVGKYLRAAKKYDKAAKYI




EHDNSFSAEEKKQSKVLKVSCNLNHAACCLKLKDFKKAVKLCSKVLELESQNVK




ALYRRAQAYIETADLDLAEFDIKKALEIEPQNREVRLEYLILKQKQIEYNKKDA




KLYGNMFARQNKLEAIEGKD





424
Peptidylprolyl isomerase
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH
238
756




FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA




NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP




VVIADSGQLA





425
Peptidylprolyl isomerase
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGNGRSGKPLH
238
756




FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA




NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP




VVIADSGQLA





426
Peptidylprolyl isomerase
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH
238
756




FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA




NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP




VVIADSGQLA





427
Peptidylprolyl isomerase
MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH
238
756




FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA




NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP




VVIADSGQLA





428
Peptidylprolyl isomerase
MADDFELPESAGMMENEDFGDTVFKVGEEKEIGKQGLKKLLVKEGGSWETPETG
176
1912




DEVEVHYTGTLLDGTKFDSSRDRGTPFKFKLGQGQVIKGWDQGIATMKKGENAV




FTIPPDLAYGESGSQPTIPPNATLKFDVELLSWASVKDICKDGGIFKKIIKEGE




KWEHPKEADEVLVKYEARLEDGTVVSKSEEGVEFYVKDGYFCPAFAIAVKTMKK




GEKVLLTVKPQYGFGHQGREAIGNDVARSTNATLLVDLELVSWKVVDEVTDDKK




VLKKILKQGEGYERPNDGAVVKVKYTGKLEDGTIFEEKGSDEEPFEFMAGEEQV




VDGLDRAVMTMKKGEVALVSVAAEYGYQTEIKTDLAVVPPKSTLIYEVELVSFV




KEKESWDMNTAEKIEAAGKKKEEGNALFKVGKYFRASKKYEKATKYIEYDTSFS




EEEKKQSKPLKVTCNLNNAACKLKLKDYTQAEKLCTKVLEVESQNVKALYRRAQ




AYIQTADLELAELDIKKALEIDPNNRDVKLEYRALKEKQKEYNKKEAKFYGNMF




ARMSKLEELESRKSGSQKVETANKEEGSDAMAVDGESA





429
Peptidylprolyl isomerase
MAASLTPLGAGLAYATIYDQAKVRKLEPTKRSLIALCQHSDSQHRRFITRKYHV
64
765




NVQILNRRDAIRLIGLAAGLCIDLSLMYDARGAGLPPQENAKLCDTTCEKELEN




APMITTESGLQYKDIKIGNGPSPPIGFQVAANYVAMVPSGQVFDSSLDKGQPYI




FRVGSGQVIKGLDEGLLSMKVGGKRRLYIPGPLAFPKGLNSAPGRPRVAPSSPV




IFDVSLEFIPGLESEEE





430
Peptidylprolyl isomerase
MSAASLSADMAIRGTILGKTALHVLGPQVVSQCRQPVMFKCPPHTLRKMRFSAQ
93
881




DLQSKNFYSGFTPFKSVFISTSKRSWQAGSARAMSQDAAFQSKVTTKCFLDIEI




GGDPAGRIVLGLFGEDVPKTAENFRALCTGEKGFGYKGSSFHRIIKDFMLQGGD




FDRGDGTGGKSIYGRTFEDENFKLAHVGPGVLSMANAGPNTNGSQFFICTVKTP




WLDKRHVVFGQVIEGMEIVKKLESEETNRTDRPKRPCRIVDCGELP





431
Peptidylprolyl isomerase
MGRIKPQTLLQQSKKKKVPGRISVSTIIVCNLIIIFLMFSLVGIYRQRAKRNRA
372
1070




TSRSDGDEEMENFGRSKINSVPHQAIVNTTKGLITLELFGKSSAHTVEKFVEWS




ERGYFNGLPFYRVIKHFVIQVGDPKFAGNREDWTVGGQLNVQLEFSPKHEAFML




GTSKLEDQGDGFELFITTAPIPDLNDKLNVFGRVIKGQDVVQEIEEVDTDEHFQ




PKSPIIINDVRLKDEL





432
Peptidylprolyl isomerase
MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFDVDIAGKPAGRVVIGL
28
594
FGKAVPKTVENFRALCTGEKGVGKSGKPLHYKGSFFHRIIPSFMIQGGDFTLGD




GRGGESIYGTKFADENFKLKHTGPVFITTVTTDWLDGRHVVFGKIISGMDVVYK




VEAEGRQSGQPKRKVKIADSGELSMD





433
Peptidylprolyl isomerase
MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFDVDIAGKPAGRVVIGL
34
648
FGKAVPKTVENFRALCTGEKGVGKSGKPLHYKGSFFHRIIPSFMIQGGDFTLGD




GRGGESIYGTKFADENFKLKHTGPGFLSMANAGPDTNGSQFFITTVTTDWLDGR




HVVFGKIISGMDVVYKVEAEGRQSGQPKRKVKIADSGELSMD





434
Peptidylprolyl isomerase
MEMDEIQEQSQPQSSEKQDISQESDTGNDKTINAEKITSENAEVEEDDMLPPKV
481
1611
NTEVEVLHDKVTKQIIKEGSGNKPSRNSTCFLHYRAWAESTMHKFQDTWQEQQP




LELVLGREKKELSGFAIGVAGMKAGERALLHVDWQLGYGEEGNFSFPNVPPRAN




LIYEAELIGFEEAKEGKARSDMTVEERIEAADRRRQQGNELFKEDKLAEAMQQY




EMALAYMGDDFMFQLFGKYKDMANAVKNPCHLNMAQCLLKLNRYEEAIGQCNMV




LAEDEKNIKALFRRGKARATLGQTDDAREDFQKVRKFSPEDKAVIRELRLLAEH




DKQVYQKQKEMFKGLFGQKPEQKPKKLHWFVVFWQWLLSMIRTIFRMRSKTD





435
Peptidylprolyl isomerase
MAGAGEGTPEVTLETSMGPITVELYHKHAPKTCRNFLELSRRGYYNNVKFHRVI
93
584
KDFMVQGGDPTGTGRGGESIYGPRFEDEITRDLKHTGAGILSMANAGPNTNGSQ




FFISLAPTPWLDEKHTIFGRVCKGMDVVKRLGNVQTDKNDRPIHDVKILRTTVKD





436
Peptidylprolyl isomerase
MMDPELMRLAQEQMSKISPDELMKMQRQIMANPDLMRMASENMKNLKPEDIRFA
250
1869
AEQMKNVRKEEMAEISERISRASPEEIEAMKARANLQSAYQLQVAQNLKDQGNQ




LHARMKYSEAAEKYLQARNNLTGIPFSEAKSLLLASSSNLMSCYLKTGQYEECV




QTGSEVLAYDAMNVKALYRRGQAYKQIGKLELAVADLRKAVEVSPEDETIAQAL




REASTELMEKGGTQDQNGPRIEEIIEEEAVQPTAEKYPQSAPMVTSVTEDVSDD




EQGSEDQNGFSRDSFQATNAPDGQMYAESLRNLTENPDMLRTMQSLMKNVDPDS




LVALSGGKLSPDMVKTVSGMFGRMSPEEIQNMMKMSSTLSRQNPSTSSRFDDIT




RGHSNMDSSPQSVSVDNDLFEENQNRVGESSTNLSSSAAFSGMPNFSAEMQEQV




RNQMNDPATRQMFTSMIQNMSPEMMASMSEQFGVKLSPEDAVKAQNAMASLSPN




DLDRLMNWATRLQTAIDYARKIKNWILGRPGLIFAISMLLLAIILHRFGYIGD





437
Peptidylprolyl isomerase
MGVEKEILRPGNGPKPRPGQSVTVHCTGYGKNEDLSQKFWSTKDPGQKPFTFTI
84
422
GQGRVIKGWDEGVLDMQLGEIFKLRCSPDYGYGSNGFPAWGIRPNSVLVFEIEV




LSVN





438
Peptidylprolyl isomerase
MPNPRCYLDITIGEELEGRILVELYSDVVPKTAENFRALCTGEKGIGPHTGVPL
128
1213
HYKGLPFHRVIKGFMIQGGDISAQNGTGGESIYGLKFDDENFQLKHERRGMLSM




ANSGPNTNGSQFFITTTRTSHLDGKHVVFGKVIKGMGVVRGIEHTPTESNDRPS




LDVVISDCGEIPEGSDDGIANFFKDGDLYPDWPADLDEKSAEISWWMNAVDSAK




CFGNENYKKGDYKMALRKYRKALRYLDICWEKEEIDEEKSNHLRKTKSQIFTNS




SACKLKLGDLKGALLDTEFAMRDGEDNVKALFRQGQAYMALKDVDSAVASFKKA




LQLEPNDAGIRKELAVATKMINDRRDQERRAYARMFQ





439
Peptidylprolyl isomerase
MGDVIDLNGDGGVLKTIIRSAKPGAMQPTEDLPNVDVHYEGTLADTGEVFDTTR
265
837
EDNTLFSFELGKGTVIKAWDIAVKTMKVGEVARITCKPEYAYGSAGSPPDIPEN




ATLIFEVELVACKPRKGSTFGSVSDEKARLEELKKQREIAAASKEEEKKRREEA




KATAAARVQAKLEAKKGQGRGKGKSKGK





440
Peptidylprolyl isomerase
MGLGLKIASASFLPIFNIMATRSLCILLVCFIPVLAHVLSLQDPELGTVRVYFQ
38
781
TTYGDIEFGFFPHVAPKTVEHIYKLVRLGCYNSNHFFRVDKGFVAQVADVVGGR




EVPLNSEQRKEGEKTIVGEFSEVKHVRGILSMGRYSDPDSASSSFSILLGNAPH




LDGQYAVFGKVTKGDDTLKRLEEVPTRQEGIFVMPLERIRILSTYYYDTNERES




NLTCDHEVSILKRRLVESAYEIEYQRRKCLP





441
Peptidylprolyl isomerase
MASKRSLRTMNVWPTLPPLVLLLLLCFSSMSSSVVAKKSDVSELQIGVKHKPKS
38
526
CDIQAHKGDRIKVHYRGSLTDGTVFDSSFERGDPIEFELGSGQVIKGWDQGLLG




MCVGEKRKLRIPSKLGYGAQGSPPKIPGGATLIFDTELVAVNGKGISNDGDSDL





442
Peptidylprolyl isomerase
MSGAPAERPISYFDITIGGKPIGRIVFSLYADLVPKTAENFRALCTGEKGIGKS
37
1158
GKPLCYAGSGFHRVIKGFMCQGGDFTAGNGTGGESIYGEKFEDEAFPVKHTKPF




LLSMANAGKDTNGSQFFITVSQTPHLDDKHVVFGEVIKGKSIVRAIENYPTASG




DVPTSPIIISACGVLSPDDPSLAASEETIGDSYEDYPEDDDSDVQNPEVALDIA




RKIRELGNKLFKEGQIELALKKYLKSIRYLDVHPVLPDDSPPELKDSYDALLAP




LLLNSALAALRTQPADAQTAVKNATRALERLELSDADKAKALYREASAHVILKQ




EDEAEEDLVAASQLSPEDMAISSKLKEVKDEKKKKREKEKKAFKKMFSS





443
Peptidylprolyl isomerase
MASSLRSSLFSSWALDSKSVCSLFNLNPGKMGLPSISTPLNWRTCCCSHSSELL
61
768
ELNEGLQSSRRKTVMGLSTVIALSLVYCDEVGAVSTSKRALRSQKVPEDEYTTL




PNGLKYYDLKVGSGTEAVKGSRVAVHYVAKWKGITFMTSRQGMGITGGTPYGFD




VGASERGAVLKGLDLGVQGMRVGGQRILIVPPELAYGNTGIQEIPPNATLEFDV




ELISIKQSPEGSSVKIVEG





444
WD40 repeat protein
MGAIEDEEPPLKRLKVSSPGLRRGLEEEAPSLSVGSVSILMAKSLSLEEGETVG
421
2172
SKGLIRRVEFVRIITQALYSLGYQKAGALLEEESGILLQSSNVALFRKQILDGK




WDESVVTLRGIDQVEVEGNTLKAASFLILQQKFFELLDKGNIPEAMKTLRLEIS




PMQLNTKRVHELASCIVFPSRCEELGYSKQGNPKSSQRMKVLQEIQQLLPPSIM




IPEKRLERLVEQALNVQREACIFHNSLDPALSLYTDHQCGRDQIPTTTLQVLES




HKNEVWFLQFSNNGKYLASASKDCSAIIWEITEGDSFSMKHRLSAHQKPVSFVA




WSPDDKLLLTCGIEEVVKLWNVETGECKLTYDKANSGFTSCGWFPDGERFISGG




VDKCIYIWDLEGKELDSWKGQGMPKISDLAVTSDGKEIISICGDNAIVMYNLDT




KTERLIEEESGITSLCVSKDSRFLLLNLANQEIHLWDIGARSKLLLKYKGHRQG




RYVIRSCFGGSDLAFVVSGSEDSQVYIWHRGNGELLAVLPGHSGTVNCVSWNPV




NPHVFASASDDYTIRIWGVNRNTFRSKNASSSNGVVHLANGGP





445
WD40 repeat protein
MPGTTAGAGIEPIEPQSLKKLSLKSLKRSFDLFASLHGEPQPPDQRSQRIRIAC
163
1647
KVRAEYEVVKNLPTLPQREVGSSVSNSNVGETHSSLTTNQAQGFPTDTSGDLSK




DEGKEITSIAVHLQPQTGLIDGKAGAIAGTSTAISSVGSSDRYQPSAAIMKRLP




SKWPRPIWHPPWKNYRVISGHLGWVRSVAFDPGNEWFCTGSADRTIKIWEVATG




KLKLTLTGHIEQIRGLAVSSRHPYLFSAGDDKQVKCWDLEYNKAIRSYHGHLSG




VYCLALHPTLDILCTGGRDSVCRVWDIRTKAQIFALSGHENTVCSVFTQAIDPQ




VVTGSHDTTIKLWDLAAGKTMSTLTYHKKSVRAIAKHPFEHTFASASADNIKKF




KLPKGEFLHNMLSQQKTIVNAMAINEDNVLVSAGDNGSLWFWDWKSGHNFQQAQ




TIVQPGSLDSEAGIYALQYDITGSRLVSCEADKTIKMWKEDETATPESHPINFK




APKDIRRF





446
WD40 repeat protein
MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYGHNGERLGTYRGHNGA
192
1172
VWCCDVSRDSTRLITSSADQTAKLWNVETGAQLFSFNFESPARAVDLAIGDKLV




VITTDPFMELPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGPLNSTII




SGGEDSVVRIWDSETGKLLRESDKETGHQKPITSLCKSADGSHFLTGSLDKSAR




LWDIRTLTLIKTYVTERPVNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEA




KFFHKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYVRLHHFDPDYFHI




KM





447
WD40 repeat protein
MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYGHNGERLGTYRGHNGA
131
1111
VWCCDVSRDSTRLITSSADQTAKLWNVETGNQLFSFNFESPARAVDLAIGDKLV




VITTDPFMELPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGPLNSTII




SGGEDSVVRIWDSETGKLLRESDKETGHQKAITSLCKSADGSHFLTGSLDKSAR




LWDIRTLTLIKTYVTERPVNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEA




KFFHKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYVRLHHFDPDYFHI




KM





448
WD40 repeat protein
MAENNVGDFIPLDRQEYPSKPAPGAVDSSFWKSFKKKEVSRQIAGVTCINFCPE
149
1726
PPHDFAVTSSTRVHIYDGKSCELKKTITKFKDVAYSGVFRSDSQIIAAGGETGV




IQVFNAKSQMVLRQLKGHGRPVRVVRYSPQDKLHLLSGGDDSMVKWWDITTQEE




LLNLEGHKDYVRCGAASPSSVNLWATGSYDHTVRLWDLRNSKTVLQLKHGKPLE




DVLFFPSGGLLATAGGNVVKVWDILGGGRPIHTMETHQKTVMAMCISKVPRSGQ




ALGDAPSRLVTASLDGYMKVFDLDHFKVTHSARYPAPILSMGISSLCRTMAVGT




SSGLLFIRQRKGQIEDKIHSDSSGLQVNPVNDEKDSAVLKPNQYRYYLRGRSEK




PSEGDYVVKRMAKVYFQEYDKDLRHFNHSKALVSALKAADSKGTVAVIEELVAR




KRLIQTLSILNLDELELLINFLSRFILVPKYSRFLISLTDRVLDARAVDLGKSE




NLKKQIADLKGIVVQELRVQQSMQELQGIIEPLIRASAR





449
WD40 repeat protein
MDVETSGKPTGNKRTYTRLPRQVCVFWQEGRCTRESCNFLHVDEPGSVKRGGAT
948
2228
NGFAPKRSYNGSDERDTLAAGPPGGSRRNISARWGRGRGGIFISDERQKIRNKV




CNYWLAGNCQRGEECKYLHSFVMGSDVKFLTQLSGHVKAIRGIAFPSDSGKLYS




GGQDKKVIVWDCQTGQGTDIPLNDEVGCLMSEGPWIFVGLPNAVKAWNILTSTE




LSLVGPRGQVHALAVGNGMLFAGTHDGSILAWKFSPASNTFEPAASLVGHTQAV




VSLVSGADRLYSGSMDKTIRVWDLGTFQCLQTLRDHTSVVMSLLCWDQFLLSCS




LDNTVKVWVATSSGALEVTYTHNEEHGVLALCGMNDEQAKPVLLCSCNDNTVRL




YDLPSFSERGRIFSRNEVRTFQIAPGGLFFTGDATGELKVWNWATQKS





450
WD40 repeat protein
MSVQELRERHAAATAKVNALRERIKAKRLQLLDTDVATYASSNGRTPISFSFTD
332
1465
LVCCRTLQGHTGKVYSLDWTSEKNRIVSASQDGRLIVWNALTSQKTHAIKLPCA




WVMTCAFSPSGQAVACGGLDSVCSIFQLNNQLDRDGHLPVSRILSGHRSYVSSC




QYVPDGDTHVITGSGDRTCIQWDVTTGQRIAIFGGEFPLGHTADVMSVSISAAN




PKEFVSGSCDTTTRLWDTRIASRAIRTFHGHEADVNTVKFFPDGLRFGSGSDDG




TCRLFDIRTGHQLQVYRQPPRENQSPTVTAIAFSFSGRLLFAGYSNGDCFVWDT




ILEKVVLNLGELQNTHNGRISCLGLSADGSALCTGSWDKNLKIWAFGGHRKIV





451
WD40 repeat protein
MKVKIISRSTDEFTRERSNDLQRVFRNFDPNLHTQARAQEYVRALNAAKLDKIF
232
1590
AKPFLAAMSGHIDGISAMAKSPRHLKSIFSGSVDGDIRLWDIAARRTVQQFPGH




RGAVRGLTVSTEGGRLISCGDDCTVRLWDIPVAGIGESSYGSENVQKPLATYVG




KNSFRAVDYQWDSNVFATGGAQVDIWDHDRSEPTNSFAWGSDTVISVRFNPAEK




DIFATTASDRSIVLYDLRMASPLNKLIMQTRNNAIAWNPREPMNFTAANEDCNC




YSYDMRRMNISTCVHQDHVSAVMDIDYSPSGREFVTGSYDRTVRIFPYNAGHSR




EIYHTKRMQRVFCVKFSGDATYVVSGSDDANIRLWKAKASEQLGVLLPRERKRH




EYLDAVKERFKHLPEIKRIERHRHLPKPIYKAALLRHTVNAAAKRKEERKRAHS




APGSVVTNPLRKKRIVAQLE





452
WD40 repeat protein
MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSDSENDFDLNNKSPDTT
207
1550
ALQAKRGKDIQGIPWNRLNFTREKYRETRLQQYKNYENLPRPRRSRNLDKECTN




FERGSSFYDFRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMHWSSLKQ




KGEEVLNVAGPIVPSVKHPGSSPQGLTRVQVSAMSVKDNLVVAGGFQGELICKY




LDKPGVSFCTKISHDENGITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTV




LERFSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTVGTLRGHLDYSFA




AAWHPDGYILATGNQDTTCRLWDVRKLSSSLAVLKGRMGAIRSIRFSSDGRFMA




MAEPADFVHLYDTRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYGSLL




EFNRRRMNYYLDSIL





453
WD40 repeat protein
MAEALVLRGTMEGHTDAVTAIATPIDNSDMIVSSSRDKSILLWNLTKEPEKYGV
221
1171
PRRRLTGHSHFVQDVVISSDGQFALSGSWDSELRLWDLNTGLTTRRFVGHTKDV




LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQPDAEGHSNWISCVRFSPSA




TNPTIVSCSWDRTVKVWNLTNCKLRNTLVGHGGYVNTAAVSPDGSLCASGGKDG




VTMLWDLAEGKRLYSLDAGDIIYALCFSPNRYWLCAATQQCVKIWDLESKSIVA




DLRPDFIPNKKAQIPYCTSLSWSADGSTLFSGYTDGKIRVWGIGHV





454
WD40 repeat protein
MAAIKSTSRSASVAFAPDAPLLAAGTMAGAIDLSFSSLANLEIFKLDFQSDDPE
221
3679
LPVVGECPSNERLNRLSWGSAGGSFGIIAGGLVDGTINIWNPATLINSEDNGDA




LIARLEQHTGPVRGLEFNTISTNLLASGAEDGELCIWDLANPTAPTHFPPLKGV




GSGAQGEISFLAWNRKVQHILASTSYSGTTVVWDLRRQKPIISFPDATRRRCSV




LQWNPDASTQLIVASDDDNSPTLRAWDLRNTISPYKEFVGHSRGVIAMSWCPSD




SLFLLTCAKDNRTLCWDTGSGEIVCELPAGANWNFDVQWSPKIPGILSTSSFDG




KIGIHNIEACSRNVSGEVEFGGAIVRGGPSALLKAPKWLERPAGVSFGFGGKLA




SFRPSTVAQAADHRHSEVFIHNLVTEDNLVIRSTEFEAAIADGEKVSLRALCDR




KAEESQSDEEKETWNFLRVMFEDEGTARTKLLEHLGFKVQSEENGDLQETHSSK




IDDIGSEIGKTLTLDDKTEEDVLPQLKGGQDAAIPQDNGEDFFDNLHSPKEEVS




LSHVGNDFVGEKDKDMVVNGAEIEHETEDLTEYSDWNEAIQHSLVVGDYKGAVL




QCLSANRMADALIIAHLGGNSLWEKTRDEYLKKAKSSYLKVVSAMVNNDLTGLV




NSRPLKSWKETLAMLCTYSQREEWTVLCDMLASRLIAAGNVMAATLCYICAGNI




EKTVEIWSRSLKYDYDGRSFVDHLQDVMEKTVVLALATGQKRVSPSLSKLVENY




AELLASQGLLTTAMEYLKLLGTEESSHELSILRDRLYLSGTDNKVEASSFPFET




RQDLTESQYNMHQTGFGAPETQKNYQENVHQVLPSGSYTDNYQPTANTHYIAGY




QPAPQQQPSFQNYFTPASYQPAPSPNVFYPSQVSQAEQSNFAPPVNQPPMKTFV




PSTPPILRNVDQYQTPSLNPQLYQGVSSATVETHPYQTGAPASVSVGTTPGQPS




VVPNFMVPGPVTAPTVTPRGFMPVTTPTQHPLGSANPPVQPQSPQSSQVQSVTA




ATTPPPTIQNVDTSNVAAEIRPVIGTLRRLYDETSEALGGARANPAKRREIEDN




SRKIGSLFAKLNSGDISSNAASKLVHLCQALESRDYATAFQIQVGLTTSDWDEC




SFWLAALKRMIKVKQNMR





455
WD40 repeat protein
MAGAADSQLQTLSERDSTPNFKNLHTREYAAHKKKVHSVAWNCTGTKLASGSVD
269
1252
QTARVWNIEPHGHSKTKDLELKGHADSVDQLCWDPKHSELLATASGDRTVRLWD




ARSGKCSQQVELSGENINITFKPDGTHIAVGNRDDELTIIDVRKFKPLHKRKFS




YEVNEIAWNTTGELFFLTTGNGTVEVLSYPSLQVLHTLVAHTAGCYCIAIDPIG




RYFAVGSADALVSLWDLSEMLCVRTFTKLEWPVRTISFNHDGQYIASASEDLFI




DIADVQTGRTVHQISCRAAMNSVEWNPKYNLLAFAGDDKNKYMQDEGVFRVFGF




ETP





456
WD40 repeat protein
MAATSPVGAGSGRELANPPTDGISNLRFSNHSDHLLVSSWDRKVRLYDASANSL
214
1242
KGQFVHGGPVLDCCFHDDASGFSGSADNTVRRYDFSTRKEDILGRHEAPVRCVE




YSYAAGQVITGSWDKTLKCWDPRGASGQEKTLVGTYSQLERVYSMSLVGHRLVV




ATAGRHINVYDLRNMSQPEQRRESSLKYQTRCVRCYPNGTGFALSSVEGRVAME




FFDLSEAGQAKKYAFKCHRKSEAGRDTVYPVNAIAFHPIYGTFATGGCDGYVNV




WDGNNKKRLYQYSKYPTSIAALSFSRDGRLLAVASSYTFEEGEKPHEPDAVFVR




SVNEAEVKPKPKVYAAPP





457
WD40 repeat protein
MASDDEEGFKNEEAPGVVDEAEVQEGLRACFPLSFGKQEKKQAPLESIHSATKR
119
2065
PEDPRPRRQLGPPRPPPSILAEQEDSDRFVGPPRPPQFVRDDNDDGEAEIMIGP




PRPPAQYSDDHDNEETIGPPKPSYLEKGEETDQMVGPSKRGSDDETSGDSDDGD




DAVDFRVPLSNEIVLRGHTKVVSALAIDQTGSRVLTGSYDYSVRMYDFQGMTSQ




LKSFRQLEPAEGHQVRSLSWSPTSDRFLCVTGSAQAKIFDRDGLTLGEFVKGDM




YLRDLKNTKGHISGLTCGEWHPKEKQTILTCSEDGSLRIWDVNDFNTQKQVIKP




KLAKPGRVPVTACAWGRDGKCIAGGVGDGSIQVWNLKPGWGSRPDLYVAKGHDD




DITGLQFSADGNILLTRSTDETLKVWDLRKAITPLQVFRDLPNNYAQTNVAFSP




DERLIFTGTSVERDGNSGGLLCFYDRQTLELVLRIGVSPVHSVVRCTWHPRHNQ




VFATVGDKKEGGAHILYDPALSERGALVCVARAPRKKSLDDFEAKPVIHNPHAL




PLFRDEPSRKRQREKARMDPMKSQRPDLPVTGPGFGGRVGSTKGSLLTQYLLKE




GGLIKETWMEEDPREAILKYADVAAKDPKFIAPAYAQTQPETVFAETDSEEEQK





458
WD40 repeat protein
MKERGQSHAGQPSVDERYTQWKSLVPVLYDWLANHNLVWPSLSCRWGPQMHQAT
186
1550
YKNSQRLYLSEQTDGTVPNTLVIATCEVVKPRVAAAEHISQFNEEARSPFVKKF




KTIIHPGEVNRIRELPQNSKIVATHTDGPDVLIWDVDTQPNRQATLGAADSRPD




LVLTGHKDNAEFALAMSPSAPFVLSGGKDKCVLLWSIQDHISAATEPSSAKASK




TPSSAHGEKVPKIPSIGPRGVYKGHKDTVEDVQFCPSNAQEFCSVGDDSALILW




DARNGNEPVIKVEKAHNADLHCVDWNPHDENLILTGSADNSVRMFDRRNLTSSG




VGSPVHKFEGHSAPVLCVQWCPDKASVFGSAAEDSYLNVWDYEKVGKNVGKKTP




PGLFFQHAGHRDKVVDFHWNSFDPWTIVSVSDDGESTGGGGTLQIWRMSDLIYR




PEDEVLAELERFRAHILSCQNK





459
WD40 repeat protein
MSSLSRELVFLILQFLDEEKFKESVHKLEQESGFFFNMKYFDEKAQAGEWDEVE
244
3671
RYLSGFTKVDDNRYSMKIFFEIRKQKYLEALDRQDRAKAVDILVKDLKVFSTFN




EELYKEITQLLTLDNFRENEQLSKYGDTKSARTIMMSELKKLIEANPLFREKLI




YPNLKASRLRTLINQSLNWQHQLCKNPRPNPDIKTLFTDHACGPPNGARTPTQP




TASLGVLPKATTFTPIGPHGPFPSSSTATSGLASWMSNPNMVTSPQAPVAVGPS




VPVPPNQATLLKRPRTPPGSSSVVDYQTADSEQLIKRLRPVSQSIDEATYPGPT




LRVPWSTDDLPKTLARALNEPYPVTSIDFHPSQQTFLLVGTKNGEITLWEVGSR




EKLATRSFKIWDNANCSNHLEAAFVKDSSVSINRVLWSPDGTLIGIAFTKHLVH




TYTFQGLDLRQHLEIDAHVGGVNDLAFSHPNKQLCVVTCGDDKMIKVWDAVTGR




KLYNFEGHDAPVYSVCPHHKENIQFIFSTAVDGKIKAWLYDHLGSRVDYDAPGH




SCTTMMYSADGTRLFSCGTSKEGESFLVEWNESEGAIKRTYSGLRKKGSGVVQF




DTTQNHFLAVGDEHLIKFWDMDSTNMLTSCDAEGGLLNLPRLRFNKEGSLLAVT




TVNGIKILANADGQKLLKTMENRTFDLPSRAHIDAASATSSPATGRMERIERTS




SANTVSGINGVDPAQSSEKLRLSDDLSEKTKIWKLTEITDSIQCRCITLPENAA




EPASKVSRLLYTNSGVGLLALGSNAVHKLWKWNRSEQNPSGKATASVHPQRWQP




TSGLLMTNDITDINPEEAVPCIALSKNDSYVMSASGGKVSLFNMMTFKVMTTFM




PPPPASTFLAFHPQDNNIIAIGMEDSTIHIYNVRVDEVKTKLKGHQKRITGLAF




SSTQNILVSSGADAQLCVWNTETWEKRKSKTIQMPVGKTVSGDTRVQFHSDQLH




ILVVHETQLAIYDAYKLERQYQWVPQDALSAPILYATYSCNRQLIYATFSDGNI




GVYDAEILRPRCRIAPTTYLSSGTSSSTSLPLVVAAHPHEPNQFAIGLSDGAVQ




VLEPSESEGKWGVSPPPENGVVPAVVAGPSTSNQGSEQAPR





460
WD40 repeat protein
MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHALEWPSLTVQWLPDRE
163
1431
EPPGKDYSVQKMILGTHTSDNEPNYLMLAQVQLPLEDAENDARQYDDERGEIGG




FGCANGKVQVIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYSKHPSKP




PQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLLSGSDDAQICLWDINVPAKNKV




LEAQQIFKVHEGVVEDVAWHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVV




AHQGEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHTFSCHKEEVFQIG




WSPKNETILASCSADRRLMVWDLSRIDEFQTPEDALDGPPELLFIHGGHTSKIS




DFSWNPCEDWVIASVAEDNILQIWQMAENIYHDEEDDMPPEEVV





461
WD40 repeat protein
MSPGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSADRTIKLFGLNASDTPS
155
1081
LLASLTGHEGPVWQVAWAHPKFGSMLASCSYDGRVIIWREGQQENEWSQVQVFK




EHEASVNSISWAPNELGLCLACGSSDGSITVFTCREDGSWDKTKIDQAHQVGVT




AVSWAPASAPGSLVGQPSDPIQKLVSGGCDNTAKVWKFYNGSWKLDCFPPLQMH




TDWVRDVAWAPNLGLPKSTIASCSQDGKVVIWTQGKEGDKWEGRILNDFKIPVW




RVNWSLTGNILAVADGNNSVTLWKEAVDGDWNQVTTVQ





462
WD40 repeat protein
MSSGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSADRTIKLFGMNTSDTPT
537
1463
LLASLTGHEGPVWQVAWAHPKFGSMLASCSYDRRVIIWREGQQENEWSQVQVFK




EHEASVNSISWAPHELGLCLACGSSDGSITVFTGREDGSWDKTKIDQAHQVGVT




AVSWAPASAPGSLVGQPSDPVQKLVSGGCDNTAKVWKFYNGSWKLDCFPPLQMH




TDWVRDVAWAPNLGLPKSTIASCSQDGRVVIWTQGKEGDKWEGKILNDFKTPVW




RISWSLTGNILAVADGNNNVTLWKEAVDGEWNQVTTVQ





463
WD40 repeat protein
MKKRSRPSNGHLSTAAKNKSRKTAPITKDPFFDSAHNRNKSKGKGKSRGKGEEI
284
1909
FSSDEDDDAIGRDAPAEEEEEIAEEERETADEKRLRVAKAYLDKIRAITKANEE




DNEEEAGEDEETEAERRGKRDSLVAEILQQEQLEESGRVQRQLASRVVTPSKLV




ECRVVKRHKQSVTAVALTEDDLRGFSASKDGTIIHWDVETGASEKYEWPSQAVS




VSSSNEVSKTQKGKGSKKQGSKHVLSMAVSSDGRYLATGGLDRYIHLWDTRTQK




HIQAFRGHRGAVSCLAFRQGTQQLISGSFDRTIKLWSAEDRAYMDTLYGHQSEI




LAVDCLRKERVLSVGRDHTLRLWKVPEETQLVFRGHAASLECCCFINNEDFLSG




SDDGSIELWSMLRKKPVFMAKNAHGHAIVENLSEDTSTREEPDEEVTTRQLPNG




NSIGNGMTNQMGITPSVESWVGAVTVCRGTDLAASGAGNGVVRLWAIENSSKSL




RALHDIPLTGFVNSLTFARSGRFLIAGVGQEPRLGRWGRIQAARNGVTLCPIELS





464
WD40 repeat protein
MAATFGTINTATSPHNPNKSFEIVQPPNDSISSLSFSPKANYLVATSWDNQVRC
610
1659
WEVLQTGASMPKAAMSHDQPVLCSTWKDDGTAVFSAGCDKQAKMWPLLTGGQPV




TVAMHDAPIKDIAWIPEMNLLATGSWDKTLKYWDTRQSNPVHTQQLPERCFALS




VRHPLMVVGTADRNLIIFNLQNPQTEFKRISSPLKYQTRCVAAFPDKQGFLVGS




IEGRVGVHHVEEAQQSKNFTFKCHRDSNDIYAVNSLNFHPVHQTFATAGSDGAF




NFWDKDSKQRLKAMARSNQPTPCSTFNSDGSLYAYAVSYDWSKGAENHNPATAK




HHILLHVPQESEIKGKPRVTTSGRK





465
WD40 repeat protein
MVVMDKGTHQTNEDESESEFIDEDDVIDEISIDEEDLPDADVEGEDVQEDNKRS
241
1452
EPDENSSSLDDAIHTFEGHEDTLFAVACSPVDATWVASGGGDDKAFMWRIGHAT




PFFELKGHTDSVVALSFSNDGLLLASGGLDGVVRIWDASTGNLIHVLDGPGGGI




EWVRWHPKGHLVLAGSEDYSTWMWNADLGKCLSVYTGHCESVTCGDFTPDGKAI




CTGSADGSLRVWNPQTQESKLTVKGYPYHTEGLTCLSISSDSTLVVSGSTDGSV




HVVNIKNGKVVASLVGHSGSIECVRFSPSLTWVATGGMDKKLMIWELQSSSLRC




TCQHEEGVMRLSWSLSSQHIITSSLDGIVRLWDSRSGVCERVFEGHNDSIQDMV




VTVDQRFILTGSDDTTAKVFEIGAF





466
WD40 repeat protein
MPVFRTAFNGYAVKFSPFVETRLAVATAQNFGIIGNGRQHVLELTPNGIVEVCA
223
1173
FDSSDGLYDCTWSEANENLVVSASGDGSVKIWDIALPPVANPIRSLEEHAREVY




SVDWNLVRKDCFLSASWDDTIRLWTIDRPQSMRLFKEHTYCIYAAVWNPRHADV




FASASGDCTVRIWDVREPNATIIIPAHEHEILSCDWNKYNDCMLVTGSVDKLIK




VWDIRTYRTPMTVLEGHTYAIRRVKFSPHQESLIASCSYDMTTCMWDYRAPEDA




LLARYDHHTEFAVGIDISVLVEGLLASTGWDETVYVWQHGMDPRAC





467
WD40 repeat protein
MDSRNRRSRLNLPPGMSPSSLHLETTAGSPGLSRVNSSPSTPSPSRTTTYSDRF
251
1777
IPSRTGSRLNGFALIDKQPQPLPSPTRSAAEGRDDASSSSASAYSTLLRNELFG




EDVVGPATPATPEKSTGLYGGSRDSIKSPMSPSRNLFRFKNDHGGNSPGSPYSA




STVGSEGLFSSNVGTPPKPARKITRSPYKVLDAPALQDDFYLNLVDWSSNNVLA




VGLGTCVYLWSACTSKVTKLCDLGVNDSVCSVGWTPQGTHLAVGTNIGEVQIWD




TSRCKKVRTMGGHCTRAGALAWSSYILSSGSRDRNILHRDIRVQDDFIRKLVGH




KSEVCGLKWSYDDRELASGGNDNQLLVWNQQSAQPLLRFNEHTAAVKAIAWSPH




QHGILASGGGTADRCLRFWNTATDTRLNCVDTGSQVCNLVWCKNVNELVSTHGY




SQNQIMVWRYPSMSKLATLTGHTLRVLYLAISPDGQTIVTGAGDETLRFWSIFP




SPKSQSAVHDSGLWSLGRTHIR





468
WD40 repeat protein
MEKKKVVVPIVCHGHSRPIVDLFYSPVTPDGLFLISASKDSSTMLRNGETGDWI
367
1419
GTFEGHKGAVWSCCLDNRALRAASGSADFSAKIWDALTGDELHCFVHKHIVRAC




AFSESTSLLLTGGHEKILRIFDLNRPDAPPKEVDNSPGSIRTVAWLHSDQTILS




SNSDAGGVRLWDLRTEKIVRVLETKSPVTSAEVSQDGRYITTADGNSVKFWDAN




HFGMVKSYTMPCMVESASLEPTMGNMFVAGGEDMWVRLFDFHTGEEIACNKGHH




GPVHCVRFAPGGESYSSGSEDGTIRIWQTLNMNSEENESYGVNGLSGKVRVGVD




DVVQKVEGFQITADGHLNDKPEKPNP





469
WD40 repeat protein
MERYSQGTQKKSEIYTYEAPWQIYGMNWSVRKDKKFRLGIGSFLEEYNNRVEII
284
1303
ELDEESGEFKSDPRLAFDHPYPTTKIMFVPDKECQRPDLLATTGDYLRIWQVCE




DRVEPKSLLNNNKNSEFCAPLTSFDWNDADPKRIGTSSIDTTCTIWDIEKEVVD




TQLIAHDKEVYDIAWGEVGVFASVSADGSVRVFDLRDKEHSTIIYESSQPETPL




LRLGWNKQDPRFIATILMDSCKVVILDIRFPTLPVAELQRHQASVNTIAWAPHS




PCHICTAGDDSQALIWELSSVSQPLVEGGGLDPILAYTAAAEINQLQWSSMQPD




WVAIAFSNEVQILRV





470
WD40 repeat protein
MQSENNLDESLHLREVQELQGHTDTVWAVAWNPVTGIDGAPSMLASCSGDKTVR
684
1784
IWENTHTLNSTSPSWACKAVLEETHTRTVRSCAWSPNGKLLATASFDATTAIWE




NVGGEFECIASLEGHENEVKSVSWSASGMLLATCGRDKSVWIWDVQPGNEFECV




SVLQGHTQDVKMVQWHPNRDILVSASYDNSIKVWAEDGDGDDWACMQTLGNSVS




GHTSTVWAVSFNSSGDRMVSCSDDLTLMVWDTSINPAERSGNAGPWKHLCTISG




YHDRTIFSVHWSRSGLIASGASDDCIRLFSESTDDSVTPVDGTSYKLILKKEKA




HSMDVNSVQWHPSEPQLLASASDDGRIKIWEVTRINGLANSH





471
WD40 repeat protein
MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKVNMWAIGKPNAILSLS
336
2738
GHSSAVESVTFDSAEALVVAGAASGTIKLWDLEEAKIVRTLTGHRSNCISVDFH




PFGEFFASGSLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWVVSGGED




NIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQEFLLATGSADRTVKFWDLETFE




LIGSAGPETTGVRAMIFNPDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLA




DLNIHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNGHNEAKLASSGHP




SVQQLDNNLKTNMARLSLSHSTESGIKEPKTTTSLTTTEGLSSTPQRAGIAFSS




KNLPASSGPPSYVSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRPETT




SDVKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESDKIDSINQKRMTGNDKTDL




NIARAEQHVSSRLDNTNTSSVVCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRS




PTFPWSATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETREKALTADTP




VLVSGRPPTSPGVDMNSFIPRGSHGTSESDLTVSDDNSAIEELMQQHNAFTSIL




QARLTKLQVIRRFWQRNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC




TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRATISATPTIGVDLQAE




QRLERCNLCYVELENIKQILVPLIRRGGAVAKSAQELSLALQEV





472
WD40 repeat protein
MSTLEIEARDVIKIVLQFCKENSLHQTFQTLQNECQVSLNTVDSLETFVADINS
81
1622
GRWDVILPQVAQLKLPRKKLEDLYEQIVLEMIELRELDTARAILRQTQAMGFMK




QEQPERYLRLEHLLVRTYFDPREAYHESSKEKRRSQIAQALASEVTVVPPSRLM




ALIGQSLKWQQHQGLLPPGTQFDLFRGTAAVKADEEEMYPTTLAHTIKFGKQSH




PECARFSPDGQYLVSCSVDGFIEVWDYISGKLKKDLQYQADDSFMMHDDAVLCV




DFSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTSLSFSRDGSQLLS




TSFDSTARIHGLKSGKALKEFRGHTSYVNDAIFTSDGGRVITASSDCTVKVWDV




KTTDCIQTFKPPPPLKGGDVSVNSVHLFPKNSEHIVVCNKASSIYIMTLQGQVV




KSFSSGKREGGDFVAACISPKGEWIYCVGEDRNIYCFSQQSGKLEHLMKAHDKD




IIGVTPHPHRNLLVTYSEDSTMKIWKP





473
WD40 repeat protein
MDIELEDQPFDLDFHPSAPIVAVALITGRLQLFRYVDISSEPERLWTVTAHTES
399
1460
CRAARFINAGSSVLTASPDCSILATNVETGQPVARLDNAHGAAINCLTNLTEST




IASGDENGIIKVWDTRQNSCCNKFKAHEDYISDMEFVPDTMQLLGTSGDGTLSV




CNLRKNKVHARSEFSEDELLSVALMKNGKKVVCGSQEGVLLLYSWGYFKDCSDR




FVGHPHSVDALLKLDEDTVLTGSSDGIIRVVSILPNKMIGVIGEHSSYPIERLA




FSHDRNVLGSASHDQILKLWDIHYLHEDDEPETNKQEAVNDENVDMDLDVDTEK




RPRGSKRKKRAEKGQTSSQKQSSDFFADI





474
WD40 repeat protein
MDRIQQIPHTCVARKINLPLGMSKESLALNLPANLAPTMSPPSITYSDRFIPSR
207
1673
KASNFEEFALPDKTSPSPNSAGGQSSSTNGEGRDDACAAYSALLRTELFPATPD




KTEGCRRPVIGSPSGNVFRFKSQQCKSQSPFSLCPVGEDGDLSETGAVARKTTR




KIPRSPFKVLDAPALQDDFYLNLVDWSSHNILAVGLSACVYLWSASSSKVTKLC




DLGLDDNVCSVAWTQRGTYLAVGTNNGGVQIWDAAHCKQVRTMEGHCTRVGTLA




WNSHILSSGGRDRNILQRDIRAQDDFVSKFSGHKSEVCGLKWSYDNRELASGGN




DNQLFVWNQQSQQPVLKYNEHTAAVKAIAWSPHQHGLLASGGGTADRCIRFWNT




ATNTSLNCVDTGSQVCNLVWSKNVNELVSTHGYSQNQIIVWRYPTMSKLATLTG




HTLRVLYLAISPDGQTIVTGAGDETLRFWNVFPSSKTQQNTIRDMGVWSSGRTH




IR





475
WD40 repeat protein
MAGGQGEGEEKVDKLSMELTEDVMKSMEIGAVFKDYNGKINSLDFHRTNNYLVT
263
1309
ASDDEAIRLFDTASATWQKTSYSKKYGVDLICFTNHQTSVLYSSKNGWDESLRH




LSLMDNKYLRYFKGHHDRVVSLCMSPKGECFMSGSLDRTVLLWDLRIDKCQGLI




RVRGRPAVAYDEQGLVFAISNEGGLIKMFDARLYDKGPFDTFVVEGDKSEASGI




KFSNDGKLILLSTMDSNIHVLDAYQGTTVHSFSVEAVPNGGEAVPNGGTLEASF




SPDGKFVISGSGNGNIHAWSVNSGKEVACWTTEGVIPAVVKWAPRRLMFASGSS




VLSLWVPDLSKLASLTGSNSNSAY





476
WD40 repeat protein
MHRVGSTGNTSNSSRPRREKRLTYVLNDANDSRHCSGINCLVISKLSLLGGNDY
232
2529
LFSGSRDGTLKRWELADDSAVCSATFESHVDWVNDAVLTGETLVSCSSDTTLKT




WRPFSDGVCTRTLRQHSDYVTCLAAASKNSNIVASGGLGREVFIWDIEAAMAPV




SRTSEAMDDDTSNGVLSSGNSVLSTTVRSTNATNSASLHTSQLQGYTPIAAKGH




KESVYALAMNDVGTLLVSGGTEKVVRVWDPRSGAKQMKLRGHTDNVRALILDST




GRFCLSGSSDSIIRLWDLGQQRCVHSYAVHTDSVWALASTPNFSHVYSGGRDLS




LYLTDLTTRESLLLCMEKHPLLRLTLQDDSIWVATTDSSLHRWPAEGQNPPKMF




QRGGSFLAGNLSFTRARACLEGSAPVPVNTQPSFVIPGSPGIVQHEILNNRRHV




LTKDAEGTVKLWEITRGAVLDDYGKVSFEEKKEELFEMVSIPAWFTMDTRLGSM




SVHLDTPQCFTAEMYAVDLNVPDAPEEQKINLAQETLRGLLAHWLSRRRQRLAT




QASANGDFPAGQENALRNHISSRIDVHDDAETHIAGILPAFDFSTTSPPSIITE




GSQGGPWRKKITDLDGTEDEKDFPWWCLECVLHGRLSPRESLKCSFYLHPYEGT




TVQVLTQGKLSAPRILRIQKVINYVLEKMVLDRPLDSSNSETTFTPGLSGNQSH




AAVVGDGSLRSGARVWQQKAKPLVEILCNNQVLSPDMSLATVRTYIWKKPDDLY




LYYRLVQNR





477
WD40 repeat protein
MMKGKTIQMQAAHQNHDGETSVACVLWDWHAKHLITAGADNTILIHSYPSSSSS
56
2950
KPITLRHHKNAVTALAINSNVRSLASGSVDHSVKLYSYPGGEFQSNVTRFTLPI




RSLAFNKSGELLAAAGDDEGIKLISTIDNSIARVLKGHNGPVTSISFDPKNEFL




ASSDSDGTVIYWELSTGKPVHTLKKIAPNTTSNPTSLNQISWRPDGEMLAVPGR




KSEVSMYDRDTAEKLFSLKGGHSDTICSLAWSPNGKYIATAGTDRQVMVWDADR




RQDIDKQRFDNPICSVAWKPSDNALAVIDVLGRFGVWESPIASHMKSPADGAER




YDNMEDEEPLMARYEEELEDSVSGSLNEIINDDDDDDEMGKIPRKILQKKPSVK




VEKGKEESNAKAFKSGQDSFKLKSAMQEAFQPGATQRQSGKRNFLAYNMLGSVI




TFDNDGFSHIEVDFHDIGKGCRVPSMTDYFGFTMASLSESGSVFGSPQKGEKNP




STLMYRPFSSWANNSEWSMRFPMGEEVKAVALGSGWVAAVTSLNFLRVFSEGGL




QKFVLSMDGPVVTAAGYENLLVVVSHASNPLLSGDQVLSFTVYDISQKTCPLSG




RLPLSPGSHLTWLGFSEEGLLSSYDSEGNLRVFTNDYNGCWVPIFSAARERKSE




TESIWMVGLNSTQVFCVVCKLPDTYPQVAPKPVLSVLNLSLPLACSDLGADDLE




NEYLRGSLLLSQMQKKAEDAVACGRESNMEEDSIFKMEAALDRCLLRLIANCCK




GDKLVRATELARLLSLEKSLQGAIKLVSAMKLPMLAERFNTILEEKILQENMET




ISCRRLTSEAQDMDTPISISVKQVSYGANLGDSPFLPNRQVEPKHSTPVFSKPD




TKIEVDTSEAIAKGCDAQNGNIKSGDAEVQPASHNDSIQKPSNPFAKASNTSAN




QAVQRNASLLSSIKQMKTATENEGKRKERARSGSLPQKPAKQSKIS





478
WD40 repeat protein
MKQKRKGHQVDDPKYSVQTPQEDDTPNESGPASEEVESSDEEGGNSSNIEDDII
193
2577
YSSSEEDPVVSSDYEEDEDAESDAEGVTAEQELEGDIDNALQNYMGTLTVLSNF




HGENLKNAEGEDTSGDDDDEEEMPKRAEESDSPEDENDERPKRAEESDFSEDED




EERPKRAEESDSSEDEVPSRNTVGDVPLRWYKDEQHIGYDIKGKKIKKQPKKDQ




LDSFLASTDDSSDWRKVYDEYNDEEVELTKDEIKFISRLRKGTIPHADVNPYEP




YVDWFDWKDKGHPLSNAPEPKRRFIPSKWEAKKVVKLVRAIRKGWITFQKAEEK




PRFYLMWGDDLKPSEKMANGLSYIPAPKPKLPGHEESYNPPPEYIPTQEEINSY




QLMYEEDRPKFIPKRFDSLRNVPAYDRFLSEIFERCLDLYLCPRTRKKRINIDP




ESLIPKLPKPKDLQPFPSICFLEYKGHTGAVSCISPESSGQWLASGSKDGTVRI




WEVETARCLKVWDIGRPIQHIAWNPVSQLSILAVAVDEEVLVLNTGLGSEDSQE




KVAELLHVKSKPVSADDLGDNTSLTKWIKHEKFDGIKLTHLKPVHLISWHHKGD




YFATVAPDGNTRAVLVHQLSKQQTQNPFKKMQGRVVHVLFHPSRAIFFVATKTH




VRVYDLVKQQLVKRLVTGLHEVSSMAVHHKGDNLLVGSKEGKVCWFDMDLSTQP




YKTLKNHSKDIHSVAFHDSYPLFASCSDDCKAYVFYGLVYSDLLQNPLIVPLKV




LQGHQSVNGMGVLDCQFHPKQPWLFTAGADSVVKLYCN





479
WD40 repeat protein
MMSLKRGFEESLVPAKRQKTELSTVTYGDGPRRTSSLESPIMLLTGHHAAIYTM
187
1233
KFNPTGTVIASGSHEREIFLWNVHGDCKNFMVLKGHKNAVLDLHWTTDGCQIIS




ASPDKTLRAWDVETGKQIKKMAEHSSFVNSCCPSRRGPPLVVSGSDDGTAKLWD




LRHRGAIQTFPDKYQITAVGFSDAADKIYSGGIDNEIKVWDLRRGEVTMRLQGH




TDTITGMQLSSDGSYLLTNSMDCSLRIWDMRPYAPQNRCVKILTGHQHNFEKNL




LKCSWSSDGSKVTAGSADRMVYIWDTTTRRILYKLPGHTGSVNETGFHPTQPII




GSCSSDKQIYLGEIEPNVGYQAVI





480
WD40 repeat protein
MEFSDTYKHTGPCCFSPDARYLAIAVDYRLVIRDVVTLKVVQLYSCMDKISNIE
51
1436
WALDSEYILCGLYKRAMVQAWSLSQPEWTCKIDEGPAGIAHARWSPDSRHIITT




SDFQLRLTVWSLVNTACIHIQWPKHASKGVSFTQDGKFAAIATRRDCKDYVNLL




SCHTWEVMGTFTVDTIDLADLEWSPNDSAIVVWDSPLEYKVLIYSPDGRCLFKY




QAYDSWLGVKTVAWSPCSQFLAVGSYDQTLRTLNHLTWKPFAEFVHVSTVRGPA




SAVVFKEVEEPWNLDVSGLHLNDDNAHDIQDGKPAEGHSRVRYKVVEFPVNVSS




QKHPVDKPNPKQGIGLLAWSRDSQYLFTRNDNMPTALWIWDICRLELAALLIQK




EPIRAAAWDPVYPRVALCTGSSHLYMWTPSGACCVNIPLPQFVVSDLKWNPDGT




SMLLKDRESFCCTFVPMLPEFNDDETNEE





481
WD40 repeat protein
MAKLIETHSCVPSTERGRGILIAGDAKTNSIIYCNGRSVIMRNLDNPLEASVYG
525
2351
EHSYPATVARFSPNGEWVASGDTSGTVRIWGRGSDHTLKYEYKALAGRIDDLEW




SADGQRIVVCGDSKGKSMVRAFMWDSGTNVGEFDGHSRRVLSCSFKPTRPFRVA




TCGEDFLVNFYEGPPFRFKTSHRDHSNYVNCVRFAPDGSKFITVGSDRKGVIFD




GKMGEKIGELSKEGGHTGSIYAASWSPDSKQVLTVSADKSAKIWEISETGNGTV




KKTLTFGSQGGADDMLVGCLWLNDYLITVSLGGIVSLLSAVDPDKPPKTISGHM




KSINAIALSLQSGQSEVCSSSYDGVIVRWILGVGYAGRVERKDSTQIKCLATIE




GELVTCGFDNKVRRVPLLSEQHKESEPIDIGAQPKDLDVAVGCPELTFVSTDAG




IIIIRASKIVSTTNVGYAVTAAAISPDGTEAVVGGQDGKLRVYSIKGDTLLEES




VLERHRGPINAIRFSPDGSMFASGDLNREAVVWDRITREVKLKNMVYHTARINC




IAWSPDSSKVATGSLDTCILIYEVGKPASSRITIKGAHLGGVYGLAFSDQSTVI




SAGEDACVRVWSLP





482
WD40 repeat protein
MPQPSVILATAGYDHTVRFWEATSGRCYRTLQYPDSQVNHLEITPDKQYLAAAG
152
1099
NPHIRLFEVNSNNPQPVISYDSHTNNVTAVGFQCDGKWMYSGSEDGTVKIWDLR




APGFQREYESRAAVNTVVLHPNQTELISGDQNGNIRVWDLNANSCSCELVPEDT




AVRSLTVMWDGSLVVAANNHGTCYVWRLMRGTQTMTNFEPLHKLQAHNSYILKC




LLSPEFCEHHRYLATTSSDQTVKIWNVDGFTLERTLTGHQRWVWDCVFSVDGAF




LVTASSDSTARLWDLSTGEAIRTYQGHHKATVCCALHDGTDGASC





483
WD40 repeat protein
MLTKFETKSNRVKGLSFHPKRPWILASLHSGVIQLWDYRMGTLIDKFDEHDGPV
470
4114
RGVHFHKTQPLFVSGGDDYKIKVWNYKMRQCLFTFVGHLDYIRTVHFHNEYPWI




VSASDDQTIRLWNWQSRVCISVLTGHNHYVMSASFHPKEDLVVSASLDQTVRVW




DISGLRKKTVSPADDLSRLAQMNTDLFGGGDVVVKYVLEGHDRGVNWAAFGTSL




PLIVSGADRQVGKLWRMNDTKAWEVDTLRGHTNNVSCVIFHARQDIIVSNSEDK




SIRVWDMSKRTSVQTFRREHDRFWILAAHPEMNLLAAGHDSGMIVFKLERERPA




YVVYGGSLLYVKDRYLRTYEFATQKDNPLIPIRKPGSIGPNQGPRSLSYSPTEN




AILICSDADGGAYELYAVPKDSHGRSDTVQEAKKGLGGSAVGVARNRFAVLDKN




HNQVTIKNLKNEVTKKFDLPVTADALFYAGTGNLLCRSEDSVFLFDMQQRTVLG




EIQTPNVRYVVWSNDMENVALLSKHTIIIASKKLSSTCSLHETIRVKSGAWDDN




GIFMYSTLNHIKYCLPNGDSGIIKTLDVPVYITKVSGKSLYCLDRDGKNRVIQI




DITECLFKLALSKKKYDYVINMIRNSQLCGQAIIAYLQQKGFPEVALHFVRDER




TRFNLAVESGNIEIAVASAKEIDEKDHWYRLGVEALRQGNAGIVEYAYQRTKNF




ERLSFLYLITGNLDKLSKMLRIAEMKNDVMGQFHNALYLGDIQERIKILEESGH




LHLAYATASLHGLADIADRLAADLGGNIPVLPPGKKSSLLMPPAPILHGGDWPL




LRVTKGIFEGGLENSTSAAYEEEDEEAAADWGEDIDIENIEGENGEATVLDDQE




VKGGEDDEGGWDMEDLELPPDVAAANVGTNQKTLFVAPTLGMPVSQIWMQKSSL




AGEHAAAGNFETALRLLTRQLGIKNFSPLKPLFLELYMGSHTFLPSFASVPAFS




LALQRGWSESASPNIRGPPALVYRLSVLEEKLTVAYRATTEGRFSEALRLFLNI




LHTIPVIVVDSRKEIDEVKELIGIAKEYVLGLRMEVKRKEIRDDAVRQQELAAY




FTHCNLQKAHLKLALLNAMGISYRCKNYNTAANFARRLLETDPSSNHATKARQV




LQVCERNLQDATQLNYDFRNPFVVCGATFTPIYRGQKEVSCPYCMARFVPDIAG




KLCSICDLAIVGSDASGLFCFATQTR





484
WD40 repeat protein
MDLLQNYQDDSEDSNPELRNHPPLEDATATSAPAGVENETSSSPDSSPLRLALP
196
2007
AKSCAPDVDETLMALGVPGSEKKNNHNKPIDPTQHSVTFNPSYDQLWAPLYGPA




HPYAKDGIAQGMRNHKLGFVEDSAIEPFMFDEQYNTFHRYGYAADPSASLGSHI




VGDLESLKKNDGASVYNLPKREHKRQKLEKKMIQKDENEEEEKEVGEEVDNPST




EEWLKKNRKSPWAGKKEGLQTELTEEQKKYAQEHAEKKGDREKGEKVEIVDKTT




FHGKEERDYQGRSWIDPPKDAKATNDHCYIPKRWVHTWSGHTKGVSAIRFFPKY




GHLLLSAGMDTKVKIWDVFNSGKCMRTYMGHSKAVRDISFSNDGSRFLSAGYDR




NIKLWDTETGKVISTFSTGKIPYVVKLHPDEDKQNVLLAGMSDKKIVQWDMNSG




EITQEYDQHLGAVNTITFVDNNRRFVTSSDDKSLRVWEFGIPVVIKYISEPHMH




SMPSISLHPNTNWLAAQSLDNQILIYSTRERFQLNKKKRFAGHIAAGYACQVNF




SPDGRFVMSGDGEGRCWFWDWKTCKVFRTLKCHDNVCIGCEWHPLEQSKVATCG




WDGMIKYWD





485
WD40 repeat protein
MARKGLGTDPAIGSLMSSKKRKEYKVTNRFQEGKRPLYAIAFNFIDARYHNIFA
214
1323
TAGGTRVTIYQCLEGGAISVLQAYVDDDKDESFYTLSWACDVNGSPLLVAGGHN




GIIRVLDVANEKVHKSFVGHGDSVNEIRTQALKPSLILSASKDESVRLWNVQTG




ICILIFAGAGGHRNEVLSVDFHPSDVYRIASCGMDNTVKIWSMKEFWTYVEKSF




TWTDLPSKFPTKYVQFPVFIAAVHSNYVDCTRWLGNFILSKSVDNEVVLWEPYS




KEQSTSDGVVDILQKYPVPECDIWFIKFSCDFHYNSMAVGNREGKVYVWELQSS




PPNLIARLSHAHCKNPIRQTAISHDGSTILCCCDDGSMWRWDVVQ





486
WD40 repeat protein
MESGAGGSVGARVPSAKPEMLQQPPYSNGDDDNDMERGTAPVPSSNPNTVSKWE
68
2146
LDKDFLCPICMQTMKDAFLTACGHSFCYMCIMTHLNNKSNCPCCSLYLTNNQLF




PNFLLNKLLKKTSACQMASTASPVENLCLSLQQGAEVSVKELDFLLTLLAEKKR




KMEQEEAETNMEILLDFLQRLRQQKQAELNEVQADLHYIKDDILALEKRRLELS




RARERYSRKLHMLLDDPMDTTLGHAAIDDGNNVRTAFVRGGQGDAISGKFQQKK




AEIKAQASSQGMQKRANFCHSDSQVLPTLSGLTIARKRRVLAQFDDLQECYLQK




RRRWATQLRKQCDGGLRKERDGNSISREGYHAGLEEFQSILTTFTRYSRLRVIS




ELRHGDLFHSANIVSSIEFDRDDELFATAGVSRRIKVFDFATVVNEPADVHCPV




VEMSTRSKLSCLSWNKCIKSQIASSDYEGIVTVWDVNTRQSVMMYEEHEKRAWS




VDFSRTEPTRLISGSDDGKVKVWCTRQETSVLNIDMKANICCVKYNPGSSYYVA




VGSADHHIHYYDLRNPSVPLYEFNGHRKTVSYVKFISTNELASASTDSTLRLWD




VRDNCLVRTFKGHTNEKNFVGLTVNSEYIACGSETNGVFVYHKAISKPAAWHQF




GSPDLDDSDDDTSHFISAVCWKSESPTMLAANSQGTIKVLVLAP





487
WD40 repeat protein
MANYVDSKKNFKCVPALQQFYTGGPFRLSSDGSFLVCACNDEVKVVDLATGSVK
874
3705
NTLEGDSELIVALALTPDNKYLFSASRSTQIKRWDLSSATCKRTWKAHNGPVAD




MACDASGGLLATAGADRSILVWDVDGGYCTHSFRGHQGVVTTVIFHPDPHCLLL




FSGSDDATVRIWDLVAKKCISVLEKHFSTVTSLAISENGWNLLSAGRDKVVNIW




DLRDYHCRATIPTYEPLEAVCVLPTGSRLVSVMNQSRALPENRKKSGAAPVYFL




TVGERGIVRIWYSEGALCLYEQKSSDAIISSDKDELKGGFVSAVLLPLTQGVMC




VTADQRFLFYNLDESDEGKCDLKVSKRLIGYNEEIVDLKFLGDEEKFLAVATNL




EQVRMYDLSSMTCVYELSGHTDIVLCLDTVVFSGHSLLASGSKDHTVRIWDTES




KSCICVAAGHMGAVGAVAFSKKAKNFFVSGSSDRTIKVWSFASVLDFGGISKSI




KLSSQAAVAAHDKDINSVAVAPNDSLICTGSQDRTARIWRLPDLVPVLVLRGHK




RGVWCVEFSPVDQCVMTASGDKTIKIWALSDGSCLKTFEGHTASVLRASFLTRG




TQFVSSGADGLLKLWTIKSNECIATFDQHEDKIWAMAVGKKTEMLATGGSDSLV




NLWHDCTTTDEEEALLKEEEAALKDQELLNALADTDYVKAIQLAFELRRPYKLL




NVFTELYSKGHAQDQIQKVIRELGNEELRLLLEYVREWNTKPKFAHVAQFVLFQ




LFNVLPPKEIIEVQGISELLEGLIPYAQRHYSRIDRLMRSTFLLDYTLSSMSVL




SPTETDLSSSNLLARTADPLHAQIDQFHPTHFPEPNLTPIQSLLDSGNTDSVEV




TARRAKKKRVSGNDSEKTTVAEVKIGDMENAFDEPDVADQGSSRKHKPASSKKR




KSIAVGNASIKRIASGNAVTIALQV





488
WD40 repeat protein
MESSCSSMNSNRHSTEKRCLRPLQKQGASMNKHSSDRFIPARGSIDLDVARFMV
360
1754
TQKQKDNNDIHALSPSPSPSKKAYQKEMADTLLKNAGAADNNCRILSFNGKSST




VSQGSQENVLANLSISRRARRYIPQSADRTLDAPDLLDDYYLNLLDWSSTNVLS




TALGNTVYLWDASNSSISELLIADEEEGPVTSVSWAPDGSQIAVGLNNSVVQLW




DSQSNKKLRALKGHHDRVGALSWNGPILTTGGLDGIIINHDVRTRDHIVQTYKG




HTQEVCGLKWSPSGQQLASGGNDNLLYIWDKSMASHNPSSQYFHQLDEHCAAVK




ALAWCPFQTNLLASGGGTSDGSIKFWNTQTGACLNTVDTHSQVCSLLWNRHERE




LLSSHGLNQNQLTLWKYPSMVKITELTGHTARVLHMAQSPDGYTVASAAADETL




KFWQVFGAPDASKKTKTKDTKGAFNMFHMHIR





489
WD40 repeat protein
MLDEIVADEEEEFNIWKKNTPLLYDVVITHALEWPSLTVQWLPDRHQSPTKDYS
185
1384
LQKMIVGTHTSGDEPNYLMIAEVQMPLQYSEDGNVGGFESTEAKVHIIQQINHE




GEVNRAQYMPQNSFIIATKTVSSDVYVFDYTKHSSNAPQERVCNPELILKGHTN




EGYSLSWSPLKEGQLLSGSNDAQICFWDINAASGRKVVEAKQIFKVHEGAVEDV




SWHLKHEYLFGSVGDDCHLLIWDTRTAAPNKPQHSVVAHESEVNSLAFNPFNEW




LLATGSADKTVKLFKLRKLSCSLFTFSNHTEEVFQIEWSPMNETILASSGGDRR




LMVWDLRRIGDEQTSEDAEDGPPELIFIHGGHTSKISDFSWNLHDDWLIASSAE




DNILQIWQMAENIYHDDADIL





490
WD40 repeat protein
MTKEDHGESRDEMGERMVNEEYKLWKKNTPFLYDLVITHALEWPSLTVQWLPPS
241
1533
CKQQQDIIKDDDIDHPNTQMVILGTHTSDNEPNYLILAEVQLHDGTEDEDGDGD




VKRPQDKMKPGTSGGAMGKVRILQQINHQKEVNRARYMPQKPTIIATKTVNADV




YVFDYSKHPSKPPQEGRCNPELRLQGHESEGYGLSWSPLKEGHLLSASDDAQIC




LWDITAATKAPKVVEANQIFRYHDGPVEDVAWHAIHDHLFGSVGDDHHLLLWDI




RNDSEKPLHIVEAHQAEVNCLAFNPFNEWIVATGSADRTVALHDIRKLDKVLHT




CAHHMEEVFQIGWSPQNGAILASCGSDRRLMVWDLSRIGDEQNPEDAEEAPPEL




LFIHGGHTSKISDFSWNPAEEWVIASVAEDNILQVWQMSEHIYNDDNDSPTA





491
WD40 repeat
MAMAMGDENAADPVEEFNIWKKNTPFLYDLVITHALEWPSLTVQWLPDRHQSST
230
1435



protein
ADYSLQKMIVGTHTSEDEPNYLMIAEVQIPLQNSSEDNIIGGFESTEAKVQIIQK




INHEGEVNKARYMPQNSFVIATKTVSSDVYVFDYSKHPSKAPQERVCNPELILK




GHSNEGYGLSWSPLKEGYLLSGSNDAQICLWDINAAFGKKVLEANQIFKVHEGA




VGDVSWHLKHEYLFGSVGDDCHLLIWDMRTAAPNKPQQSVIAHQSEVNSLAFNP




FNEWLLATGSMDKTVKLFDLRKLSCSLHTFSNHSTDQVFQIEWSPMNETILASSG




ADRRLMVWDLARIGETPEDEEDGPPELLFVHGGHTSKISDFSWNLNDDRVIASV




AEDNILQIWQMAENIYHDDEDML





492
WD40 repeat
MGLFEPFRALGYITDGVPFAVQRRGIETFVTLSVGKAWQIYNCAKLIPVLVGPQ
101
2857



protein
MDKKIRALACWRDFTFAATGHDIAVFRRAHQVATWSGHKAKVTLLLSFGQHVLS




VDLEGCLFIWAVAEVNQNKPPIGQIQLGEKFSPSCIMHPDTYLNKVLIGSEEGT




LQLWNVNTRKKLYEFKGWGSSIRCCVSSPALDVVGIGCSDGKIHVHNLRYDEEI




VTFMHSTRGAVTALSFRTDGQPLLAAGGSSGVISIWNLEKKKLQSVIKDAHDSS




VCSLHFFANEPVLMSSATDNSIKMWIFDTTDGEARLLKYRSGHSAPPMCIRYYG




KGRHILSAGQDRAFRIFSVIQDQQSRELSQGHVGKRAKKLKVKDEEIKLPPVIA




FDAAEIRERDWCNVVTCHLDDPCAYTWRLQNFVIGEHILKPCLEDPTPVKSCSI




SACGNFAVLGTEGGWLERFNLQSGISRGTYIDIGEKRQCAHNGAVVGLACDATN




TLLISGGYNGDIKVWDFKGRELKFRWEIEVPLIKIVYHPGNGILATAADDMILR




LFDVTAMRLVRIFVGHMDRVTDLCFSGDGKWLLSSSMDGTIRVWDIISSRQLNA




MHMDSAVTALSLSPGMDMLATTHVGHNGIYLWANRMIYSKATDIEPFISGKQVV




KVSMPTVSSKRESEEGDEKRTIVAESNVNKSDVSGSLIGDSYSAQLTPELVTLA




LLPKAQWQSLVNLDIIKMRNKPIEPPKKPEKAPFFLPSLPTLSGERIFIPSSMN




GDGDQDETRNDKTVFEARGKKLGGESLSFMQLLQSCAKIKDFTTFTNYLKGLSP




SAVDMELRLLQIVDNENISETEHSVELQGIGMLLDYFVNEVSCNNNFEFVQALI




RLFLKIHGETIRCQVSLQEKARKLLEIQSSTWERLDTSFQNARCMITFLSSSQF





493
WD40 repeat
MIAAVCWVPKGVAKVLPDSAEPPTQEEIQELLKCNVVAESDDNEDSDEESEEMD
43
1548



protein
TETDKNTDAVAKALAAANALGSQSSDFQRQHKVDDIANGLKELDMDHYDDEDEG




IDIFGSGSLGNCYYPANDMDPYLVEQDDDDEDEIEDMTIKPSDLIILSARNEDD




VSHLEVWIYEEETEEGGSNMYVHHDIILPAFPLSLAWLDCNLKGGEKGNFVAVG




TMQPEIELWDLDVLDEVEPAVVLGGAVKDEASGKTTKLKKKKKNKQAVNFKEGS




HTDAVLGLAWNMEYRNVLASASADKSVKIWDIVAEKCEHTMQPHTDKVQAVAWN




PNQATVLLSGSFDRSVIMMDMRAPTHSGIRWPVPADVESLAWDPHTDHSFMVSA




EDGTVRGFDIRAAASTADFDGKPMFILHAHDKAVCAISYNPAAPSLLTTGSTDK




MVKLWDITNNQPSCIASTNPNVGAVFSAAFSKNSPFLLATGGSKGILHVWDTLD




NSEVARRFGKFRPQN





494
WEE1-like
MIMDENEFCDIFSLRKRLCLLSSQEGEEEEELEAMSQLDAGEFTVTGNEEVVAI
206
1657



protein
AEDDVNTGILSQDLFSSQDYCTPSQPQDSTDLDSKDKAPCPLSPVKSTIQRKRC




RPELLSNPPDSIQFSFQRLERVRSEESIQSSSQQLARVRSEVSSSDDFKTPKIT




ASGQKNYVSQSALALRARVMSPPCIKNPYLDENEELNEKIQRSTRRSPACVTPI




QSGACLSRYRADFHELEEIGRGNFSRVYKALNRLDGCCYAVKCSQSELRLDTER




KVALMEVQSLAALGPHKNIVGYHTAWFENDHLYIQMELCDHNLTTANDRGILRT




DTDFLEAVYQIAQALEFIHGRGVAHLDVKPENIYVRDGTYKLGDFGRATLINGT




LHVEEGDARYMSREILNDNYEHLDKVDMFSLGATFFELLMRKQYPGSGKRIDRD




TEIKIPILPGFSIYFQKLLQDLVSNDPGKRPSAKDVLKNPIFNKVRGAKEV





495
WD40 repeat
MLAPALEMEPVEPQSLKKLSFKSLKRALDLFSPVHGQIAPPDPESKKMRISYSL
117
1580



protein
NFEYGGGSGSEDQVPKRKESGAAQNQGQQAAGASNALALPGPEGSKIPPMEKSQ




NALTVGPSLRPQGLNDVGLHGKGTAIISASGSSDRNLSTASAIMERLPSRWPRPV




WHPPWKNYRVISGHLGWVRSIAFDPSNQWFCTGSADRTIKIWDLASGRLKLTLT




GHIEQIRGLAVSSKHTYMFSAGDDKQVKCWDLEQNKVIRSYHGHLSGRLKLTLT




PTIDILLTGGRDSVCRVWDIRSKMQIFALSGHDNTVCSVFARPTDPQVVTGSHD




TTIKFWDLRHGKTMTTLTNHKKSVRAMAQHPKENCFASASADNIKKFQLPRGEF




LHNMLSQQKTIINTMAVNEEGVMATGGDNGSLWFWDWKSGHNGQQAHTIVQPGS




LESEAGIYALSYDLTGSRLVSCEADKTIKMWKEDELATPETHPLNFKPPKDIRRF





496
WD40 repeat
MEEAAKEQSAGSGKPKLLRYGLRSAAKPKEDKKEEQLHQPPPPPPPQQQAAPAP
111
1700



protein
APAATRSSTSGSAGGRDRRPQQQHAVDEKYARWKSLVPVLYDWLANHNLLWPSL




SCRWGPQLEQATYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQF




NEEARSPFIRKYKTIIHPGEVNRIRELPQNPNIVATHTDSPDVLIWDVESQPNR




HAVYGATASRPNLILTGHQENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHIT




ASATDQTTNKSPGSGGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAFC




PSTAQEFCSVGDDSCLILWDARIGTNPVAKVEKAHNGDLHCVDWNPHDNNLILT




GSADNSVNMFDRRNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSAEDG




LLNIWDYERVDKKVDRAPNAPAGLFFQHAGHRDKIVDFHWNTADPWTMVSVSDD




CDTAGGGGTLQIWRMSDLIYRPEEEVLAELENGKAHVLECSKA





497
WD40 repeat
MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHALEWPSLTVQWLPDRE
144
1412



protein
EPPGKDYSVQKMILGTHTSDNEPNYLMLAQVQKOKEDAENDARQYDDERGEIGG




EGCANGKVQVTQQTNHDGEVNEARYYIPQNPETTATKTVSAEVYVEDYSKHPSKP




PQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLLSGSDDAQICLWDINVPAKNKV




LEAQQIFKVHEGVVEDVAWHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVV




AHQGEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHTFSCHKEEVFQIG




WSPKNETILASCSADRRLMVWDLSRIDEFQTPEDALDGPPELLFIHGGHTSKIS




DFSWNPCEDWVIASVAEDNILQIWQMAENIYHDEEDDMPPEEVV





498
Cyclin-
MGKYMRKGKGVGEVAVMEVSQGSLGVRTRARTLAAASSQKDHRRLGASKSVTTK
793
1683



dependent
HQSSAPPASPCVESSMHTCYLELRSRKLEKFSRCYHSAHGATSHGESKRSLSLS



kinase
EPSRLAVSEEARVASDKSSHRVLQQQSSVAHSRNNASATFSHNAKPAKAAQRKER



inhibitor
RDDDHTSARPSEAPHEDEDGMEVEASFGENVMDLDSRERRTRETTPSSYTRDVE




TMETPGSTTRPPSNAGRRRFQTEGGHGTRNQFHVPTTNEIEEFFAGAEQQEQRR




FTDRYNYDPVSDSPLPGRFEWVRLRP





499
CDK type D
MQNMEENVQSSWSLHGNKEICARYEILKRVSSGTYLDVYRGRRKEDGLIVALKE
415
2196




VHDYQSSWREIEALQRLCGCPNVVRLYEVILEFLTSDLYSVIKSAKKNKGENGIP




EAEVKAWMIQILQGLANCHANWVIHRDLKPSNMLISAYGILKLADFGSMSFLKR




AIYEVEYELPQEDILADAPGERLMDEDDSVKGVWNEGEEDSSTAVETNFDDMAE




TANLDLSWKNEGDMVMQGFTSGVGTRWYRAPDFLYGATIYGKEIDLWSLGCILG




ELLILEPLFSGTSNIDOLSRLVKVLGLQQKKNWPGCSNLPDYRKLCFPGDGSPV




GLKNHVPNCSDNMFSILERLVCYDPAARLNAKEIVENKYFVEDPYPVLTHELRV




PSPLREENNFSEDWAKWKDMEVDSDLENIDEFNVVHSSDGFCIKFS





500
Histone
MAPVKRIEPEKTKANEGKPKRRKVAFAIDTGIEANDCISLHLVSTPEEMRDAEG
109
1653



acetyltransferase
VEDQSLSFNPEYMQHFVGEHGKIYGYKGLKIDVWLNALSFHAYVDIQYESKVEE




GKSEKEATDLTDIMKRIFGRGLVEDRNAFIQSFSSNSQSIESMIHNEGERIATR




EILTDKGLSAQGDSERLGVSNEIFRLELSDPQIREWHARLEPLVLLFVEGSQPI




EQDDPKWEMYIRVQRESLSGGSAVCRLLGFCTVYRFYHYPDTTRLRISQILVFP




PYQGKGHGLLLLEAVNKTAVSRDSYDVTVEEPSESLQELRDCMDTIRISQILVFP




MPAVKSAVQKLKEANPSDKGAADHCLEGNVNNETVTTSSTKPKNKSGWFPPPGL




VEEVRKHLKISKKQFKRCWEILLYLNLDRSDSQCEDKYHISLMEQIMSELFDKS




SEKSAKGKRVIDIDNEYDNSKTFIMVRTRNPGNGEGFLPEALEGGMEVSQEDQL




KSLFEERLEEIAQIAEKVPSLCKALQMP





501
Histone
MPEDRKKILEALAAKRKAEAESGEKKPRQKSSLNPAKPVSKPVSKPVGGIGSKG
343
1023



deacetylase
KSTSAPISSTKAKSKHKEEVKAKRVTKMDRYETDEDDESEEEEDLDSESDDDEL




SDEDSEDDIKSKSVKKLPPQSKGKAPVKGISSSNGKGRDEKGKGIMKDKGKAKA




KVEESSSDAEGDSDDDGGDLSDDPLQEVDPSNILPSKTRREASQPTNYQFANMS




GDDDDDDDSD





502
Histone
MADVPESLQQEKDEQGTDKNCCDGKFQKEIDIDDMEEEYNESSIDDEEENLSDN
417
2351



deacetylase
VATNNMGTIPQGQACMAVTVEGIEHANSVGCGRNGREGSEEVTAAEDMGHVSIE




NIREQGRNRKSSEQLLALYEQEGLLEDDEDDDDVDWEPFEGVTVQMKWYCTNCT




MANSDDSVHCDSCGEHRNSDILRQGFLASPYLPAESPSSSDVPDERLEESKCVM




TTLTPSISPMIGVCCSSLQSERRTVVGFDERMLLHSEIQMETYPHPERPDRLRA




IAASLRAAGLFPGKCFSIPAREATCEELQTIHSLEHVNAVESTSCGMLSHLSPD




TYANEHSSLAARLAAGLCADLAKAIMTGQAQNGFALVRPPGHHAGVKDSMGFCL




HNNAAIAVSASRVVGAKKVLIVDWDVHHGNGTQEIFEADQSVLYISLHRHGEGF




YPGSGAVTEVGSSKGEGYSVNIPWKCGGVGDNDYIFAFQHAVLPIAEQFEPDLT




IISAGFDAAKGDPLGRCEVTPDGFAHMAQMLSCLSKGKMLVILEGGYNLRSISA




SATAVIKVLLGDNPKALPIDIQPSKGGLQTLLEVFEIQSKYWSSLKGHDQKLRS




QWEAQYGSKKRKVIRKRHMHIVGGPVWWKWGRKRVVYYHWFARVSSRKHL





503
Peptidylprolyl
MASGAGAAGVVEWHQKPPNPKNPVVFFDVTIGTIPAGRIKMELFADIVPRTAEN
69
641



isomerase
FRQFCTGEYRKAGIPIGYKGCHFHRVIKDFMIQAGDFVKGDGSGCISIYGSKFE




DENFIAKHTGPGLLSMANSGPNTNGCQFFLTCAKCDWLDNKHVVFGRVLGEGLL




VLRKIENVQTGQHNRPKLPCVIAECGEM





504
Peptidylprolyl
MAKLVSSVCAFSCQQRHPHSRPRFLSNRDHYNHYHNHSHYHNVCYFPPMMMMQQ
172
1623



isomerase
QLQKQKRMTTKTITSLFKCNSSNHTLLKGLKEFMGFKFRLQAAMLSCEMSILGR




VFAIFFIVHQAAAPFPFNHFDNWLVPPASAVLYSPNTKVPRTGEVALRKSIPAN




PAMKSIQDFLEDIYYLLRFPQRKPYGTMEGDVKSALQIAINEKDSILGSVPLDM




KERGLQLYNFLIDGQGGLQVLIEYIKEKDPDKVSVNLSSSLDTIAQLELLQAPG




LPYLLPEEYQQYPRLNGRATIEFTMEKGDNSMFSVSSGGGLQKTATIQVVLDGY




SAPLTAGNFTKLVIDGAYNGLKLKTTEQAVISDNERAEAGFNLPIEILPAGGFE




PLYRTTLSVQDGELPVLPLSVYGAIAMAHNTISEDYSSPSQFFFYLYDKRNAGL




GGLSFDEGQFSVFGYTTVGKEILPQLKTGDIIKSAKLVDGFDHLVLPSSST





505
WD40 repeat
MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSDSENDFDSNNKSPDTT
231
1768



protein
ALQAKRGKDIQGIPWNRLNFTREKYRETRLQQYKNYENLPRPRRSRNLDKECTN




FERGSSFYDFRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMHWSSLKQ




KGEEVLNVAGPIIPSVKHPGSSPQGLTRVQVSAMSVKDNLVVAGGFQGELICKY




LDKPGVSFCTKISHDENGITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTV




LERFSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTVGTLRGHLDYSFA




AAWHPDGYILATGNQDTTCRLWDVRKLSSSLAVLKGRMGAIRSIRFSSDGRFMA




MAEPADFVHLYDTRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYGSLL




EFNRRRMNYYLDSIL





506
WD40 repeat
MDCSGDEEEEQFFESLEEMLSPSDSGSEAADNETGCRNADARSKYEIWKRAPSS
376
2943



protein
IQERRQRFLVRMGLANPSELGNQVNSTSAESTCSTETANIPNGIERLRENSGAV




LRTAGSSGRKTHCKNVINIGLREGSVRSSSSSNGTPDVGEDNGEFGGTIFSRSG




GTWECMCKIKNLDSGKEFVVDELGQDGLWNKLREVGTDRQLTMDEFERSLGLSP




LVQELMRRESGVAQADCNGVHHHDAEISSSKRRSWLKALKSAAYSMRRPKEDQS




NYDSERSGRRSGSFDVPWGKPQWTKVRHYRKRYKEFTALYMGQEIEAHEGSIWT




MKFSLDGRYLASAGQDCVIHVREVIESMRTFGADTPDLYASSAYFSMNGLQELV




PLSIEDHANKMKRGKIIGSKKSSNSDCIVLPNKVFQLSEEPVCSFHGHLLDVFD




LSWSPSQYLLSSSMDKTVRLWKLGHESCLKVFSHNDIVTCIQFNPVDERYFISG




SLDGKARIWSIPDRQVVDWSDLREMVTAVCYTPDGQGGLVGSIKGSCRFYNTSG




NKLQLENQLNVRSKKKKSSGKKITGFQFAPGGDSQKVLITSADSRVRVYNGSEL




VCKYKGFRNTCSQISASFAPNGQHFVCASEDSRVYIWNHESPRGSGARHEKSSW




SHEHFLSQGVSVAIPWSGMKLQPPVWNSPEFMLGQRHNLLSLQGGKDVGCQNGL




LSREAGEGQESETPLHYISQVSHSCGSQNMVDRDGQDDLSRYSACISDSRLSSF




MAFPESPGNPDDLNSKVFFSDSSSKGSATWPEEKLPPTRKQSRSNSTSSHYDTL




KTHLGNTIQGQSGASAAVAWGLVIVTAGHGGEIRSFQNYGLPVRL





507
WD40 repeat
MPSIPAIGEFTVCEINRELLTTKDESDTQAKDAYAKILGLVFPPISFQIEEGFG
107
1498



protein
SASRQQFDQDLDREDTIVTPSTSEGTNALQEGGLLLKGVSVLKNILASSFGPIF




SPNDTKVLKKVELLQGISWHRHKHILAFISGSNQVTVHDFQDPEWRESSLLVSE




SQRGIEALEWRPNGGTTLSVACRGGICIWSASYPGSVAPVRSGVASFLGTSTRG




SSVRWTLVDFLQIPGGKAVTALSWSPTGRLLASASREDSSFTIWDVAQGVGTPL




RRGLGGISLLKWSPTGDYLFSAKPNGTFYLWETNTWTLEQWSSSGGCVISATWG




PDGRMLFMAFSESTTLGSLHFAGRPPSLDAHLLPMELPEIGSITGGFGNIEKMA




WDGCGERLAVSYTGGDLMYVGLIAIYDTRRTPFISASLVGFIRGPGEQVKPLAF




AFHDKFKQGPLLSVCWSSGLCCTYPLIFRAH





508
WD40 repeat
MEEENAKHTEETRQVQVRFTTKLQPALRVPTTSIAIPAHLTRYGLSDIVNTLLG
118
1425



protein
NDKPQPFDFLVESELVRTSLEKLLLIKGISAEKILNIEYILAVVPPKQEEPSLH




DDWVSVVDGSYPNFIFSGSFDSIGRIWKGEGLCTHVLEGHRDAITSAAFIMPSD




SSDSFINLATASKDRTLRLWQFKPNEHMTNGKMVRPYKLLKGHTSSVQTVSACP




RRNLICSGSWDCSIKIWQTAGEMDIESNAGSVKKRKLEDSTEQIISQIEASRTL




EGHSQCVSSVVWLEKDTIYSASWDHSVRSWDVETGVNSLTVGCRKALHCLSIGG




EGSALIAAGGADSVLRIWDPRMPGTFTPILQLSSHKSWITACKWHPKSRHHLIS




ASHDGTLKLWDVRSKVPLTTLEAHKDKVLCADWWKEDCVISGGADSTLQIFSNL




NLT





509
WD40 repeat
MNRLRSKRNHILELRLGQSEPEKEATLASNRSRGTNAPIVVEDDDDVVVSSPRS
186
797



protein
FALARSSVSQRSSRIPIVNEEDLELRLGLAVTGRTSAEHNPRRRHGRVPPNKPI




VLCDDAGEADQSSSKKRRTGQQLSSDVQSDESKEVKLTCAICISTMEEETSTIC




GHIFCKKCITNAIHRWKRCPTCRKKLAINNIHRIYISSSTG





510
WD40 repeat
MEEPPPPAVLPSSEDTSIVSSHSFVNAPPTVPVGLDASIPQISTPGINQPGLTI
387
2456



protein
PVPPEAAPLTASLVAASAGMPPAVVPSFVRPAIVAHPSVMPPPSMPLAALPMPV




ASAVPVAAPHFPPSTPNDNSITPSMPVPTPIVASSSVPPSVTIPGIAPLPFIAP




IPVPSSRPVAPSPFMPPARPLGASVSVAMDVDNTDEQDQDADNKGESPSSSPDH




PEDPSAAEYEITEESRKVRERQEQAIQELLLRRRAYALAVPTNDSSVRARLRRL




NEPITLFGEREMERRDRLRALMAKLDAEGQLEKLMKVQEEEEAAANVDAEEVQE




MEGPQVYPFYTEGSQELLKARTEITKFSLPRAVSRLQRARRKREDPDEDEDEEL




KCVLQQSAQINMDCSEIGDDRPLSGCAFSSDGTLLATSAWSGVTKLWSVPNINK




VATLKGHTERVTDVAFSPTNCHLATACADRTAMLWNSEGVLMKTYEGHLDRLAR




LAFHPSGLYLGTASFDKTWRLWDVNTGIELLLQEGHSRSVYGIAFQCDGSLAAT




CGLDGLARIWDLRTGRSILALEGHVKPVLGIDFSPNGYHLATGSEDHTCRIWDL




RKRQSVYIIPAHSHLVSQVKFEPQEGYFLVTASYDSTAKVWSARDFKSIKVLAG




HEAKVTSVDITADGQYIATVSHDRTIKLWSSKNSTNDMNIG





511
WD40 repeat
MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKVNMWAIGKPNAILSLS
359
2761



protein
GHSSAVESVTFDSAEALVVAGAASGTIKLWDLEEAKIVRTLTGHRSNCISVDFH




PFGEFFASGSLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWVVSGGED




NIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQEFLLATGSADRTVKFWDLETFE




LIGSAGPETTGVRAMIFNPDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLA




DLNIHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNGHNEAKLASSGHP




SVQQLDNNLKTNMARLSLSHSTESGIKEPKTTTSLTTTEGLSSTPQRAGIAFSS




KNLPASSGPPSYVSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRPETT




SDAKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESDKIDSINQKRMTGNDKTDL




NIARAEQHVSSRLDNTNTSSVVCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRS




PTFPWSATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETREKALTADTP




VLVSGRPPTSPGVDMNSFIPRGSHGTSESDLTVSDDNSAIEELMQQHNAFTSIL




QARLTKLQVIRRFWQRNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC




TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRATISATPTIGVDLQAE




QRLERCNLCYVELENIKQILVPLIRRGGAVAKSAQELSLALQEV





512
Cyclin B
MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRALSNINSNIIGAPPYPC
238
1648




AVNKRVLSEKNVNSENDLLNAAHRPITRQFAAQMAYKQQLRPEENKRTTQSVSN




PSKSEDCAILDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDVAEEPVT




DIDSGDKENQLAVVEYIDDLYMFYQKAEASSCVPPNYMDRQQDINERMRGILID




WLIEVHYKFELMDETLYLTVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEV




SVPVVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPYVFMRRFLKAAQS




DKKLELLSFFIIELSLVEYDMLKFPPSLLAASAIYTALSTITRTKQWSTTCEWH




TSYSEEQLLECARLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFLLDF




RL





513
Cyclin-
MQAPREGKSAAAIVGMGKYMKKSKAIPRDVSLLEASPRSPSATGVRTRAKTLAS
59
859



dependent
RRLRRASQRRPPPPAAAAAAAAPSLDASPCPFSYLQLRSRRLRRPRLAPSPEAR



kinase
IDEGPAGSGSRGSRDASCSARTASSSGGVEGEGACVGRGDRGNGGECVRDAAVD



inhibitor
ASYGENDLEIEDRDRSTRESTPCSLIRDSNANTPPGSTTRQQSSCTAHRTQMSI




LRSIPTSDEMEEFFAYAEQRQQRSFIEKYNFDIVKDRPLPGRFEWVQVIP





514
Histone
MDGHSSHLAAQNRSRGSQTPSPSHSAASASATSSIHLKRKLSAANASAASAAAA
44
1829



acetyltransferase
AAAAAAAADDHAPPFPPSSISADTRDGALTSNDDLESISARGGGAGDDSDDDSD




DEEEDDGDNDGGSSLRTFTAARLENVGPAAARNRKIKAESNATVKVEKEDSAKD




GGNGAGVGALGPAATSGAGSGSGTVPKEDAVKIFTENLQASGAYSAREENLKRE




EEAGRLKFECLSNDGVDDHMVWLIGLKNIFARQLPNMPKEYIVRLVMDRNHKSV




MVIRRNLVVGGITYRPYASQKFGEIAFCAIKADEQVKGYGTRLMNHLKQHARDV




DGLTHFLTYADNNAVGYFIKQGFTKEIYLDKDRWHGYIKDYDGGILMECKIDPK




LPYTDLSTMVRRQRQAIDEKIRELSNCHIVYQGIDFQKRDAGVPQNTIKMEDIP




GLREAGWTPDQWGYSRFRGLSDQKRLTFFIRQLLKVLNDHSDAWPFKEPVDARE




VPDYYDIIKDPMDLKTMTKRVESEQYYVTLEMFIADVKRMFANARTYNSPDTIY




FKIATRLEAHFQSKVQSNLQSGAGKIQQ





515
Peptidylprolyl
MFNGMMDPELFKLAQEQMNRMSPAELAKIQQQMMSNPELMRMASESMKNMRPED
109
1866



isomerase
LRQAAEQLKHVRPEEMAEIGEKMANASPEEIAAVRARADAQMTYEINAAKILKK




EGNELHSQGRFKDASQKYLRAKNNLKGIPSSEGKNLLLACSLNLMSCYLKTRQY




EECIKEGSEALACEEKNLKAFYRRGQAYRELGQLKDAVSDLRKAHEISPDDETI




AQVLRDTEESLTKEGGSAPRGVVIEEITEEDETLASVNHESPSEYSEKRHQESE




DAHKGPINGDIMGQMTNSESLKALKGDPDAIRSFQNFISNADPTTLAAMGAGNA




GEVSPDLIKTASSMIGKMSAEELQKMIQLASSFPGENPYVTRNSDSNSNSFGNG




SIPNVSPDMLKTASDMMSKMSPDDLQRMFEMASSSRGKDPSLDANHASSSSGAN




LAANLNHILGESEPSSSYHIPSSSRNISSSPLSNFPSSPGDMQEQIRNQMKDPA




MRQMFTSMMKNMSPEMMANMGKQFGLELSPEDAAKAQEAMSSLSPEMLDKMMRW




ADRAQRGVETAKKTKNWLLGRPGMILAICMLLLAVILHRLGFIGS





516
WD40 repeat
MIAAISWVPRGASKAVPEVAEPPSKEEIEEILKSGVVERSGDSDGEEDDENMDA
212
1815



protein
VASEKADEVSTALSAADALGRISKVTKAGSGFEDIADGLRELDMDNYDEEDEDV




KLFSTGLGDLYYPSNDMDPYLKDKDDDDDTEEIEDLSIKPMDSLIVCARTDDEV




NLLEVYLLEPSLSDESNMYVHHEVVISEFPLCTAWLDCPIKGGDKGNFIAVGSM




EPAIEIWDLDIIDAVEPCLVLGGQEELKKKKKKGKKASIKYKEGSHTDSVLGLA




WNKEFRNILASASADRQVKIWDVAAGKCNITMEHHTDKVQAVAWNHHAPQVLLS




GSFDHSVVMKDGRIPSHSGYRWSVTADVESLAWDPHSEHFFVVSLEDGTVRGFD




VRAAISNSASQSLPSFTLHAHEKAVSTISYNPAAPNLLATGSTDKMVKLWDLSN




NQPSCIASRNPKAGAVFSVSFSEDSPLLLAIGGSKGRLEVWDTSSDAAVSRRFG




KHGKPKTAEPGS





517
WD40 repeat
MKFCKKYQEYMQGQEGKKLPGLGFKKLKKILKRCRRRDSLHSQKALQAVQNPRT
207
1193



protein
CPAHCSVCDGSFFPSLLEEMSAVLGCFNKQAQKLLELHLASGFQKYLMWFKGKL




RGNHVALIQEGKDLVTYALINAIAIRKILKKYDKIHLSTQGQAFKSQVQRMHME




ILQSPWLCELIAFHINVRETKANSGKGHALFEGCSLVVDDGKPSLSCELFDSIK




LDIDLTCSICLDTVFDSVSLTCGHIYCYMCACSAASVTIVDGLKAAEPKEKCPL




CREARVFEGAVHLDELNILLSRSCPEYWAERLQTERVERVRQAKEHWESQCRAF




MGVE





518
WD40 repeat
MVSTQSTRENPSIFFPPPLKPWLLPVVLSLSLSRQLGMAAAAAASLPFKKNYRS
6
2786



protein
SQALQQFYAGGPFAVSSDGSFIACNCGDSIKIVDSSNASLRPSIDCGSDTITAL




SLSPDGKLLFSAGHSRQIRVWDLSTSTCLRSWKGHDGPVMSMACPVSGGLLATG




GADRKVMVWDVDGGFCTHFFKGHDGVVSTVLFHPDSNRSLLFSGSDDGTIRVWD




LLAKKCASTLRGHDSTVTSLAFSEDGLTLLAAGRDKVVSLWDLHNYACKKTIPM




YEVLESVCVIHSGTVLASQLGLDDQLKVTKESAQNIHFITVGERGILRIWKSEG




SVCLFKQEHSDVTVISDEDDSRSGFTAAVMLPLDQGLLCVTADQQFLFYYPEKH




PEGIFSLTLCRRLVGYNEEIVDMKFLGEEENFLAVATNLEQVRVYELASMSCSY




VLAGHTETVLCLDTCISSSGRTLIVTGSKDNSVRLWDSESRHCIGVGVGHMGAV




GAVAFSRKRQDFFVSGSSDRTLKVWSLDGISEDGVDSTNLKAKAVVAAHDKDIN




SVAVAPNDSLVCSGSQDRTACVWRLPDLVSVVVLKGHKRGIWSVEFSPVDQCVL




TASGDKTVKIWAISDGSCLKTFEGHVSSVLRASFLTRGTQFVSCGADGLVKLWT




VRTNECIATYDQHSDKVWALAVGKKTEMLATGGSDAVVNLWYDSTASDKEDAFR




KEEEGVLKGQELENAVSDADYTKAIELALELRRPHKLFELFSELCRTREVGDRV




ERILSALSGEEVCLLLEYIREWNAKPKLCHVAQSVLSQVFRILSPTEIVEIKGI




GELLEGLIPYSQRHFSRIDRLVRSTYLLDYTLTGMSVIEPEADRSAVNDGSPDK




SGLEKLEDGLLGENVGEEKIQNKEELESSAYKKRKLPRSKDRSKKKSKNVVYAD




AAAISFRA





519
WD40 repeat
MDSAPRRKSGGINLPSGMSETSLRLDGFSGSSSSFRAISNLTSPSKSSSISDRF
213
1726



protein
IPCRSSSRLHTFGLVERGSPVKEGGNEAYSRLLKAELFGSDFGSLSPAGQGSPM




SPSKNMLRFKTESSGPNSPFSPSILRQDSGFSSEASTPPKPPRKVPKTPHKVLD




APSLQDDFYLNLVDWSSQNTLAVGLGTCVYLWSASNSKVTKLCDLGPNDGVCAV




QWTREGSYISIGTSLGQVQIWDGTQCKRVRTMGGHQTRTGVLAWNSRILASGSR




DRVILQHDLRVPNEFIGKLVGHKSEVCGLKWSHDDRELASGGNDNQLLVWNQHS




QQPVLKLTEHTAAVKAIAWSPHQNGLLASGGGTADRCIRFWNTTNGHQTSSVDT




GSQVCNLAWSKNVNELVSTHGYSQNQIMVWKYPSMAKVATLTGHSLRVLYLAMS




PDGQTIVTGAGDETLRFWNVFPSAKAPAPVKDTGLWSLGRTHIR





520
WD40 repeat
MEDEAEIYDGVRAQFPLTFGKQSKPQTSLESVHSATRRGGPAPAPAPASSSSLP
101
2110



protein
STTSPSAAGGAGKSSGLPSLSSSSTAWLEGLRAGNPRAGREAGIGSRGGDGEDG




GRAMIGPPRPPPGFSANDDGGGEDDDDDGDGVMVGPPPPPPGNLGDGDDDEEEE




EAMIGPPRPPVVDSDEEEEEEEEENRYRLPLSNEIVLKGHNKIVSALAVDPTGS




RVLSGSYDYTVRMFDFQSMNSRLSSFRDFEPVEGHQVRNLSWSPTADRFLCVTG




SAQAKIYDRDGLTLGEFVKGDMYIRDLKNTKGHITGLTWGEWHPKTKETILTSS




EDGSLRIWDVNDFKSQKQVIKPKLARPGRVPVTTCTWDREGKCIAGGIGDGSIQ




IWNLKPGWGSRPDIHVEQAHADDITGLKFSSDGKILLTRSFDDSLKVWDLRLMK




NPLKVFEDLPNHYAQTNIACSPDEQLFLTGTSVERESTIGGLLCFFDRSKLELV




SRIGISPTCSVVQCAWHPRLNQIFATSGDKSQGGTHVLYDPTLSERGALVCVAR




APRKKSVDDFELKPVIHNPHALPLFRDQPSRKRQREKILKDPLKSHKPELPMNG




PGHGGRVGASKGSLLTQYLLKQGGMIKETWMDEDPREAILKHADAAEKNPKFTR




AYAETQPDPVFAKSDSEDEDK
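
Because each polypeptide above is listed as a series of short fragments, one way to sanity-check a transcribed entry is to join its fragments and flag any letter outside the 20 standard one-letter amino acid codes. The following is a minimal sketch of that check, not part of the original listing; the names `STANDARD_RESIDUES`, `join_fragments`, and `nonstandard_positions` are hypothetical, and the example fragments are illustrative only. Letters such as O, U, B, Z, and X are valid extended IUPAC codes, so a flagged position is a prompt to compare against the source listing rather than proof of an error.

```python
# Hypothetical helpers for checking transcribed protein sequences.
# Assumption: a sequence is supplied as a list of text fragments, one per line.

# The 20 standard one-letter amino acid codes.
STANDARD_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def join_fragments(fragments):
    """Concatenate fragments into one uppercase sequence, dropping whitespace."""
    return "".join("".join(fragment.split()) for fragment in fragments).upper()


def nonstandard_positions(sequence):
    """Return (1-based position, letter) pairs for letters outside the standard 20."""
    return [
        (position, letter)
        for position, letter in enumerate(sequence, start=1)
        if letter not in STANDARD_RESIDUES
    ]


if __name__ == "__main__":
    # Illustrative fragments only; substitute the fragments of the entry being checked.
    example = join_fragments(["MGKYMRKGKG", "VGEVAVMEVS"])
    print(len(example), nonstandard_positions(example))
```

Any position the check reports can then be compared with the original sequence listing before the entry is used in an alignment.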
















TABLE 16







BLAST Sequence Alignment Table.

















SEQ ID
Target
Patent Identifier
BlastX top hit
Gene name
BlastX e value
BlastX identities
BlastX overlap

















1
CDK type A
eucalyptusSpp_003910
Q9FRN5
PUTATIVE
0
367
492






SERINE/THREONINE






KINASE


2
CDK type A
eucalyptusSpp_019213
O44000
CDC2-LIKE
e−160
217
290






PROTEIN






KINASE TPK2


3
CDK type A
eucalyptusSpp_036800
Q40789
PROTEIN
0
259
294






KINASE






P34CDC2


4
CDK type A
eucalyptusSpp_040260
Q27168
CDC2
e−156
208
304


5
CDK type A
eucalyptusSpp_041965
Q43361
CDC2PA mRNA.
e−159
274
294






SPTREMBL


6
CDK type B-1
eucalyptusSpp_002906
Q9FYT9
Cyclin-
e−159
269
305






dependent






kinase B1-1


7
CDK type B-2
eucalyptusSpp_001518
Q9FSH4
B2-TYPE
0
270
315






CYCLIN






DEPENDENT






KINASE


8
CDK type C
eucalyptusSpp_008078
Q9LDC1
CRK1 protein
0
415
558


9
CDK type C
eucalyptusSpp_009826
Q9LNN0
F8L10.9
0
392
716






protein.






SPTREMBL


10
CDK type C
eucalyptusSpp_010364
Q8GZA7
Putative
e−172
309
499






cyclin-






dependent






protein






kinase.


11
CDK type C
eucalyptusSpp_011523
Q8W2N0
Cyclin-
e−165
273
405






dependent






kinase CDC2C


12
CDK type C
eucalyptusSpp_024358
P93320
CDC2MSC
0
448
523






PROTEIN


13
CDK type C
eucalyptusSpp_039125
O80540
F14J9.26
0
418
743






protein


14
CDK type D
eucalyptusSpp_005362
O80345
CDK-
e−180
305
483






activating






kinase 1AT






(Cdk-






activating






kinase






CAK1At)


15
CDK type D
eucalyptusSpp_044857
O80345
CDK-
e−177
302
477






activating






kinase 1AT






(Cdk-






activating






kinase






CAK1At)


16
Cyclin A
eucalyptusSpp_001743
Q39879
MITOTIC
0
360
508






CYCLIN A2-






TYPE


17
Cyclin A
eucalyptusSpp_012405
Q39878
MITOTIC
e−179
278
470






CYCLIN A2-






TYPE


18
Cyclin B
eucalyptusSpp_003739
Q9LDM4
F2D10.10
e−148
288
466






(F5M15.6)


19
Cyclin B
eucalyptusSpp_022338
P93557
Mitotic
e−168
310
476






cyclin


20
Cyclin B
eucalyptusSpp_028605
Q40337
B-like
e−158
300
439






cyclin.






SPTREMBL


21
Cyclin B
eucalyptusSpp_041006
Q40337
B-like
e−158
300
439






cyclin


22
Cyclin D
eucalyptusSpp_006643
Q9SXN7
NtcycD3-1
1E−73
177
404






protein


23
Cyclin D
eucalyptusSpp_045338
Q8LK74
Cyclin D3.1
e−101
190
332






protein.






SPTREMBL


24
Cyclin D
eucalyptusSpp_046486
Q9ZRX7
CYCLIN D3.2
e−126
196
373






PROTEIN


25
Cyclin-
eucalyptusSpp_012070
CAB69358
SEQUENCE 1
8E−64
83
88



dependent


FROM PATENT



kinase


WO9841642



regulatory



subunit


26
Histone
eucalyptusSpp_006617
O80378
181
0
371
395



acetyltransferase


(Fragment)


27
Histone
eucalyptusSpp_007827
Q9FJT8
Histone
e−148
260
465



acetyltransferase


acetyltransferase






HAT B


28
Histone
eucalyptusSpp_008036
Q9FJT8
Histone
e−149
262
465



acetyltransferase


acetyltransferase






HAT B.






SPTREMBL


30
Histone
eucalyptusSpp_001596
Q9M4T5
Putative
7E−76
156
305



deacetylase


histone






deacetylase






HD2


31
Histone
eucalyptusSpp_005870
Q9M4T4
Putative
7E−66
144
318



deacetylase


histone






deacetylase






HD2c






(AT5g03740/F17C15_160)


32
Histone
eucalyptusSpp_006901
HDAC_ARATH
Histone
0
405
499



deacetylase


deacetylase






(HD)


33
Histone
eucalyptusSpp_006902
AAM13152
HISTONE
0
427
499



deacetylase


DEACETYLASE


34
Histone
eucalyptusSpp_007440
Q8W508
HISTONE
0
369
428



deacetylase


DEACETYLASE


35
Histone
eucalyptusSpp_008994
Q8LD93
Histone
0
354
536



deacetylase


deacetylase,






putative


36
Histone
eucalyptusSpp_024580
Q94EJ2
At1g08460/T27G7_7
e−165
274
373



deacetylase


(HDA8).






SPTREMBL


37
Histone
eucalyptusSpp_037831
Q9FML2
Histone
0
356
464



deacetylase


deacetylase.






SPTREMBL


38
MAT1 CDK-
eucalyptusSpp_034958
Q8LES8
Hypothetical
4E−47
101
190



activating


protein



kinase



assembly



factor


39
Peptidylprolyl
001209EGXC004488HT
TL40_SPIOL
Peptidylprolyl
0
329
392



isomerase


cis-






trans






isomerase,






chloroplast






precursor


40
Peptidylprolyl
010310EGXD012820HT
Q9FJL3
PEPTIDYLPROLYL
0
453
579



isomerase


ISOMERASE


41
Peptidylprolyl
010310EGXD013036HT
O82646
HYPOTHETICAL
0
302
521



isomerase


57.1 KDA






PROTEIN (EC






5.2.1.8)


42
Peptidylprolyl
010316EGXF999037HT
BAB39983
PUTATIVE
e−115
146
172



isomerase


PEPTIDYLPROLYL






CIS-






TRANS






ISOMERASE,






CHLOROPLAST


43
Peptidylprolyl
010324EGXF002118HT
AAK32894
AT5G13120/T19L5_80
e−122
179
264



isomerase


44
Peptidylprolyl
011019EGKA001923HT
AAM14253
HYPOTHETICAL
e−108
146
188



isomerase


20.3 KDA






PROTEIN


45
Peptidylprolyl
eucalyptusSpp_000966
Q8L5T1
Peptidylprolyl
1E−91
155
170



isomerase


isomerase






(Cyclophilin)






(EC






5.2.1.8)


46
Peptidylprolyl
eucalyptusSpp_001037
Q8VX73
CYCLOPHILIN
e−120
155
169



isomerase


(EC 5.2.1.8)


47
Peptidylprolyl
eucalyptusSpp_004603
AAM14253
HYPOTHETICAL
e−108
146
188



isomerase


20.3 KDA






PROTEIN.


48
Peptidylprolyl
eucalyptusSpp_005465
Q9SP02
Cyclophilin
2E−93
172
204



isomerase


ROC7 (EC






5.2.1.8)






(AT5g58710/mzn1_160)






(Pepti . . .


49
Peptidylprolyl
eucalyptusSpp_006571
O49605
EC 5.2.1.8
9E−98
169
224



isomerase


(Cyclophilin-






like






protein)






(Peptidyl-






prolyl


50
Peptidylprolyl
eucalyptusSpp_006786
Q93VG0
Cyclophilin
5E−82
142
164



isomerase


(EC 5.2.1.8)






(Peptidyl-






prolyl cis-






trans


51
Peptidylprolyl
eucalyptusSpp_007057
Q38901
Cytosolic
3E−84
144
172



isomerase


cyclophilin






(EC 5.2.1.8)






(Peptidyl-






prolyl


52
Peptidylprolyl
eucalyptusSpp_008670
Q9FJL3
PEPTIDYLPROLYL
0
423
596



isomerase


ISOMERASE


53
Peptidylprolyl
eucalyptusSpp_009137
Q9C566
Cyclophilin-
e−168
285
361



isomerase


40 (EC






5.2.1.8)






(Expressed






protein)


54
Peptidylprolyl
eucalyptusSpp_010285
Q9LY75
Cyclophylin-
e−160
345
658



isomerase


like protein






(EC 5.2.1.8)






(Peptidyl-






prolyl


55
Peptidylprolyl
eucalyptusSpp_010600
Q93YQ8
HYPOTHETICAL
0
346
475



isomerase


50.1 KDA






PROTEIN






(FRAGMENT)


56
Peptidylprolyl
eucalyptusSpp_011551
Q9ZVG4
T2P11.13
e−115
154
192



isomerase


PROTEIN


57
Peptidylprolyl
eucalyptusSpp_020743
Q8VXA5
PUTATIVE
e−125
161
172



isomerase


CYCLOSPORIN






A-BINDING






PROTEIN


58
Peptidylprolyl
eucalyptusSpp_023739
FK21_NEUCR
FK506-
3E−49
74
112



isomerase


binding






protein






precursor






(FKBP-21)


60
Peptidylprolyl
eucalyptusSpp_031985
Q8L8W5
Cyclophilin-
1E−82
155
229



isomerase


like protein






(EC 5.2.1.8)






(Peptidyl-






prolyl


61
Peptidylprolyl
eucalyptusSpp_032025
Q9LPC7
F22M8.7
1E−45
99
160



isomerase


protein (EC






5.2.1.8)






(Peptidyl-






prolyl cis-






trans


62
Peptidylprolyl
eucalyptusSpp_032173
Q8L8W5
Cyclophilin-
4E−83
156
229



isomerase


like protein






(EC 5.2.1.8)






(Peptidyl-






prolyl


64
Retinoblastoma
eucalyptusSpp_009143
Q9SLZ4
Retinoblastoma-
0
704
1008



related


related



protein


protein


65
WD40 repeat
eucalyptusSpp_000349
AAK49947
TGF-BETA
0
291
326



protein


RECEPTOR-






INTERACTING






PROTEIN 1


66
WD40 repeat
eucalyptusSpp_000575
Q9LW17
WD-40 repeat
e−168
282
341



protein


protein-like






(Expressed






protein)


67
WD40 repeat
eucalyptusSpp_000804
GBLP_SOYBN
Guanine
0
291
326



protein


nucleotide-






binding






protein beta






subunit-like


68
WD40 repeat
eucalyptusSpp_000805
GBLP_MEDSA
Guanine
e−171
291
327



protein


nucleotide-






binding






protein beta


69
WD40 repeat
eucalyptusSpp_000806
GBLP_MEDSA
Guanine
e−171
291
327



protein


nucleotide-






binding






protein beta






subunit-like


70
WD40 repeat
eucalyptusSpp_002248
AAL86002
HYPOTHETICAL
0
261
388



protein


43.8 KDA






PROTEIN


71
WD40 repeat
eucalyptusSpp_003203
Q9SY00
Putative WD-
e−144
236
317



protein


repeat






protein






(AT4G02730/T5J8_2)


72
WD40 repeat
eucalyptusSpp_003209
AAM14986
HYPOTHETICAL
e−160
259
302



protein


32.6 KDA






PROTEIN


73
WD40 repeat
eucalyptusSpp_004429
Q9SZQ5
HYPOTHETICAL
0
260
322



protein


34.3 KDA






PROTEIN


74
WD40 repeat
eucalyptusSpp_004607
AAC27402
EXPRESSED
0
253
356



protein


PROTEIN


75
WD40 repeat
eucalyptusSpp_004682
AAK00964
HYPOTHETICAL
0
264
313



protein


35.3 KDA






PROTEIN


76
WD40 repeat
eucalyptusSpp_005786
Q944S2
At2g47790/F17A22.18
e−155
264
396



protein


(Expressed






protein).






SPTREMBL


77
WD40 repeat
eucalyptusSpp_005887
Q94AB4
AT3g13340/MDC11_13
0
332
446



protein


78
WD40 repeat
eucalyptusSpp_005981
Q8L4X6
WD-repeat
0
315
348



protein


protein






GhTTG2.






SPTREMBL


79
WD40 repeat
eucalyptusSpp_006766
Q8L4M1
Putative WD-
e−137
234
369



protein


40 repeat






protein


80
WD40 repeat
eucalyptusSpp_006769
Q9LJC6
RETINOBLASTOMA-
0
372
566



protein


BINDING






PROTEIN-LIKE


81
WD40 repeat
eucalyptusSpp_006907
Q94C94
Hypothetical
0
446
812



protein


protein.


82
WD40 repeat
eucalyptusSpp_007518
Q93ZN5
AT4G00090/F6N15_8
0
311
436



protein


83
WD40 repeat
eucalyptusSpp_007717
O82266
At2g47990
e−180
327
528



protein


protein






(Hypothetical






58.9 kDa






protein)


84
WD40 repeat
eucalyptusSpp_007718
Q8RWD8
Hypothetical
e−173
278
350



protein


protein.






SPTREMBL


85
WD40 repeat
eucalyptusSpp_007741
Q8LA40
Putative WD-
e−158
269
409



protein


40 repeat






protein,






MSI2


86
WD40 repeat
eucalyptusSpp_007884
Q9FHY2
Similarity
e−149
316
765



protein


to unknown






protein


87
WD40 repeat
eucalyptusSpp_008258
Q9LHN3
EMB|CAB63739.1
0
524
758



protein


(AT3G18860/MCB22_3)


88
WD40 repeat
eucalyptusSpp_008465
Q9FLS2
WD-repeat
0
366
460



protein


protein-like


89
WD40 repeat
eucalyptusSpp_008616
Q9LYK6
Hypothetical
e−148
252
321



protein


protein


90
WD40 repeat
eucalyptusSpp_008690
Q9SW94
G PROTEIN
0
326
376



protein


BETA SUBUNIT


91
WD40 repeat
eucalyptusSpp_008708
Q8L862
Hypothetical
e−167
297
487



protein


protein


92
WD40 repeat
eucalyptusSpp_008850
O22725
F11P17.7
0
402
853



protein


protein.






SPTREMBL


93
WD40 repeat
eucalyptusSpp_009072
Q9SAJ0
F23A5.2 (form
e−176
288
350



protein


2) (mRNA






export






protein,






putative)


94
WD40 repeat
eucalyptusSpp_009465
Q9FLX9
NOTCHLESS
0
384
475



protein


PROTEIN






HOMOLOG


95
WD40 repeat
eucalyptusSpp_009472
Q9SZA4
WD-REPEAT
0
374
457



protein


PROTEIN-LIKE






PROTEIN


96
WD40 repeat
eucalyptusSpp_009550
Q9FKT5
Gb|AAF54217.1
e−167
275
313



protein


(Hypothetical






protein)


97
WD40 repeat
eucalyptusSpp_010284
O22466
WD-40 repeat
0
397
423



protein


protein MSI1


98
WD40 repeat
eucalyptusSpp_010595
Q94C94
Hypothetical
0
419
789



protein


protein


99
WD40 repeat
eucalyptusSpp_010657
Q94AH2
HYPOTHETICAL
0
243
298



protein


33.1 KDA






PROTEIN


100
WD40 repeat
eucalyptusSpp_012636
Q8L611
Hypothetical
0
756
1133



protein


protein


101
WD40 repeat
eucalyptusSpp_012748
AAD10151
PUTATIVE WD-
0
375
469



protein


40 REPEAT






PROTEIN,






MSI4


102
WD40 repeat
eucalyptusSpp_012879
Q8VZY6
FERTILIZATION-
0
291
377



protein


INDEPENDENT






ENDOSPERM






PROTEIN


103
WD40 repeat
eucalyptusSpp_015515
Q8LPI5
Putative WD-
0
360
493



protein


repeat






protein.






SPTREMBL


104
WD40 repeat
eucalyptusSpp_015724
O22607
WD-40 repeat
0
395
522



protein


protein MSI4


105
WD40 repeat
eucalyptusSpp_016167
Q93YS7
Putative WD-
0
663
917



protein


repeat






membrane






protein


106
WD40 repeat
eucalyptusSpp_016633
Q9SUY6
HYPOTHETICAL
e−174
240
384



protein


43.8 KDA






PROTEIN


107
WD40 repeat
eucalyptusSpp_017485
Q8RXC4
Hypothetical
0
650
1348



protein


144.7 kDa






protein


108
WD40 repeat
eucalyptusSpp_018007
O94289
WD repeat-
e−129
302
794



protein


containing






protein


109
WD40 repeat
eucalyptusSpp_020775
Q8W403
Sec13p
e−150
242
304



protein


110
WD40 repeat
eucalyptusSpp_023132
AAK52092
WD-40 REPEAT
0
458
515



protein


PROTEIN


111
WD40 repeat
eucalyptusSpp_023569
Q9XIJ3
T10O24.21.
0
404
576



protein


SPTREMBL


112
WD40 repeat
eucalyptusSpp_023611
Q8L4J2
Cleavage
e−174
301
438



protein


stimulation






factor 50K






chain






(Cleavage






stimulation


113
WD40 repeat
eucalyptusSpp_024934
Q94AB4
AT3g13340/MDC11_13.
0
343
444



protein


WD-






repeat






protein-like






SPTREMBL


114
WD40 repeat
eucalyptusSpp_025546
O22212
Hypothetical
0
352
566



protein


61.8 kDa






Trp-Asp






repeats






containing






protein


115
WD40 repeat
eucalyptusSpp_030134
Q9LVF2
Genomic DNA,
0
677
946



protein


chromosome






3, P1 clone:






MIL23


116
WD40 repeat
eucalyptusSpp_031787
AAL91206
WD REPEAT
0
264
329



protein


PROTEIN-LIKE


117
WD40 repeat
eucalyptusSpp_034435
Q9SAJ0
F23A5.2(form
e−178
290
349



protein


2) (mRNA






export






protein,






putative).






SPTREMBL


118
WD40 repeat
eucalyptusSpp_034452
Q94BR4
Hypothetical
0
381
525



protein


protein






(Putative






pre-mRNA






splicing






factor


119
WD40 repeat
eucalyptusSpp_035789
P93563
Guanine
3E−88
171
356



protein


nucleotide-






binding






protein beta






subunit


120
WD40 repeat
eucalyptusSpp_035804
Q9FNN2
WD-repeat
0
356
589



protein


protein-






like.






SPTREMBL


121
WD40 repeat
eucalyptusSpp_043057
Q9LV35
WD40-repeat
0
472
610



protein


protein.






SPTREMBL


122
WD40 repeat
eucalyptusSpp_046741
Q93VK1
AT4g28450/F20O9_130
0
363
452



protein


123
WD40 repeat
eucalyptusSpp_047161
Q9ZUN8
Putative WD-
0
350
473



protein


40 repeat






protein


124
CDK type A
pinusRadiata_001766
Q9M3W7
PUTATIVE
e−128
237
436






CDC2-RELATED






PROTEIN






KINASE CRK2


125
CDK type A
pinusRadiata_002927
Q9FRN5
PUTATIVE
0
349
470






SERINE/THREONINE






KINASE


126
CDK type B-1
990309PRCA009171HT
Q9FYT8
Cyclin-
e−145
244
303






dependent






kinase B1-2


127
CDK type B-1
pinusRadiata_013714
Q9FYT8
CYCLIN-
e−174
222
304






DEPENDENT






KINASE B1-2


128
CDK type B-1
pinusRadiata_016332
Q9FYT8
CYCLIN-
e−178
228
304






DEPENDENT






KINASE B1-2


129
CDK type B-1
pinusRadiata_021677
Q9FYT8
CYCLIN-
e−176
229
304






DEPENDENT






KINASE B1-2


130
CDK type B-1
pinusRadiata_027562
Q9FYT8
Cyclin-
e−118
211
304






dependent






kinase B1-2


131
CDK type C
pinusRadiata_001504
Q9LNN0
F8L10.9
0
434
790






protein


132
CDK type C
pinusRadiata_015211
Q9LNN0
F8L10.9
0
371
746






protein


133
CDK type C
pinusRadiata_020421
P93320
Cdc2MsC
0
318
432






protein


134
CDK type D
pinusRadiata_003187
O80345
CDK-
e−137
226
485






ACTIVATING






KINASE 1AT






(CDK-






ACTIVATING






KINASE






CAK1AT)


135
CDK type D
pinusRadiata_015661
Q947K6
CDK-
0
266
407






ACTIVATING






KINASE.


136 | Cyclin A | pinusRadiata_013874 | Q96226 | Cyclin | e−108 | 223 | 474
137 | Cyclin A | pinusRadiata_014615 | CAC27333 | PUTATIVE A-LIKE CYCLIN (FRAGMENT) | 0 | 332 | 390
138 | Cyclin B | pinusRadiata_004578 | O65064 | Probable G2/mitotic-specific cyclin (Fragment) | 9E−87 | 162 | 217
139 | Cyclin B | pinusRadiata_023387 | O04389 | B-like cyclin | 2E−98 | 220 | 466
140 | Cyclin D | pinusRadiata_006970 | P93103 | CYCLIN-D LIKE PROTEIN | 1E−75 | 135 | 293
141 | Cyclin D | pinusRadiata_010322 | CAC17049 | SEQUENCE 33 FROM PATENT WO0065040 | e−131 | 171 | 254
142 | Cyclin D | pinusRadiata_022721 | P93103 | CYCLIN-D LIKE PROTEIN | 1E−76 | 137 | 289
143 | Cyclin D | pinusRadiata_023407 | Q9SMD5 | CYCD3,2 PROTEIN | 8E−90 | 139 | 278
144 | Cyclin-dependent kinase regulatory subunit | pinusRadiata_001945 | Q947Y1 | PUTATIVE CYCLIN-DEPENDENT KINASE REGULATORY SUBUNIT | 5E−55 | 74 | 86
145 | Cyclin-dependent kinase regulatory subunit | pinusRadiata_008233 | CAB69358 | SEQUENCE 1 FROM PATENT WO9841642 | 4E−49 | 65 | 86
146 | Cyclin-dependent kinase regulatory subunit | pinusRadiata_008234 | CAB69358 | SEQUENCE 1 FROM PATENT WO9841642 | 4E−49 | 65 | 86
147 | Cyclin-dependent kinase regulatory subunit | pinusRadiata_022054 | CAB69358 | SEQUENCE 1 FROM PATENT WO9841642 | 8E−55 | 70 | 82
148 | Histone acetyltransferase | pinusRadiata_012137 | Q9FK40 | Histone acetyltransferase (AT5g50320/MXI22_3) | 0 | 496 | 555
149 | Histone acetyltransferase | pinusRadiata_012582 | O80378 | 181 (Fragment) SPTREMBL | 0 | 354 | 402
150 | Histone acetyltransferase | pinusRadiata_015285 | O80378 | 181 (Fragment) | 0 | 342 | 401
151 | Histone acetyltransferase | pinusRadiata_017229 | Q9LNC4 | F9P14.9 protein | e−118 | 268 | 585
152 | Histone acetyltransferase | pinusRadiata_020724 | Q9AR19 | Histone acetyltransferase GCN5 (Expressed protein) | e−177 | 355 | 639
153 | Histone deacetylase | pinusRadiata_004555 | AAM13152 | HISTONE DEACETYLASE | 0 | 331 | 488
154 | Histone deacetylase | pinusRadiata_004556 | AAM13152 | HISTONE DEACETYLASE | 0 | 331 | 488
155 | Histone deacetylase | pinusRadiata_005729 | Q9M4U5 | Histone deacetylase 2 isoform b | 9E−62 | 154 | 348
156 | Histone deacetylase | pinusRadiata_007395 | AAM13152 | HISTONE DEACETYLASE | 0 | 335 | 426
157 | Histone deacetylase | pinusRadiata_009503 | Q8W508 | Histone deacetylase | 0 | 365 | 427
158 | Histone deacetylase | pinusRadiata_011283 | AAM19887 | AT1G08460/T27G7_7 | 0 | 255 | 366
159 | Histone deacetylase | pinusRadiata_012322 | Q9FML2 | HISTONE DEACETYLASE (PUTATIVE HISTONE DEACETYLASE) | 0 | 327 | 435
161 | Histone deacetylase | pinusRadiata_023236 | Q8RX28 | Putative histone deacetylase | e−144 | 238 | 390
162 | Peptidylprolyl isomerase | pinusRadiata_000171 | Q9FJL3 | PEPTIDYLPROLYL ISOMERASE | 0 | 364 | 549
163 | Peptidylprolyl isomerase | pinusRadiata_000172 | Q38949 | FK506 BINDING PROTEIN FKBP62 (ROF1) | 0 | 365 | 552
164 | Peptidylprolyl isomerase | pinusRadiata_001480 | Q8VXA5 | PUTATIVE CYCLOSPORIN A-BINDING PROTEIN | e−125 | 161 | 172
168 | Peptidylprolyl isomerase | pinusRadiata_001692 | FKB7_WHEAT | 70 kDa peptidylprolyl isomerase (EC 5.2.1.8) | 0 | 418 | 553
169 | Peptidylprolyl isomerase | pinusRadiata_005313 | AAB64339 | FKBP-TYPE PEPTIDYL-PROLYL CIS-TRANS ISOMERASE | 1E−97 | 135 | 175
170 | Peptidylprolyl isomerase | pinusRadiata_006362 | BAB39983 | PUTATIVE PEPTIDYL-PROLYL CIS-TRANS ISOMERASE, CHLOROPLA . . . 290 3e−77 | 3E−77 | 129 | 168
171 | Peptidylprolyl isomerase | pinusRadiata_006493 | Q9C835 | Hypothetical 26.4 kDa protein (EC 5.2.1.8) (Peptidyl-prol . . . | 2E−62 | 128 | 235
172 | Peptidylprolyl isomerase | pinusRadiata_006983 | AAK96784 | CYCLOPHILIN | e−103 | 151 | 204
174 | Peptidylprolyl isomerase | pinusRadiata_007665 | Q9LDC0 | FKBP-like protein (Genomic DNA, chromosome 3, P1 clone: | e−138 | 239 | 378
175 | Peptidylprolyl isomerase | pinusRadiata_012196 | Q93VG0 | Cyclophilin (EC 5.2.1.8) (Peptidyl-prolyl cis-trans | 4E−74 | 132 | 160
176 | Peptidylprolyl isomerase | pinusRadiata_013382 | Q9C588 | HYPOTHETICAL 60.2 KDA PROTEIN | 0 | 288 | 581
177 | Peptidylprolyl isomerase | pinusRadiata_016461 | O04287 | IMMUNOPHILIN | 9E−66 | 88 | 109
178 | Peptidylprolyl isomerase | pinusRadiata_017611 | Q9C566 | Cyclophilin-40 (EC 5.2.1.8) (Expressed protein) | e−163 | 276 | 360
179 | Peptidylprolyl isomerase | pinusRadiata_019776 | AAM14253 | HYPOTHETICAL 20.3 KDA PROTEIN | e−110 | 146 | 190
180 | Peptidylprolyl isomerase | pinusRadiata_020659 | AAO63961 | Hypothetical protein SPTREMBL | 7E−85 | 159 | 227
181 | Peptidylprolyl isomerase | pinusRadiata_022559 | AAK43974 | PUTATIVE PEPTIDYL-PROLYL CIS-TRANS ISOMERASE | 2E−73 | 113 | 153
182 | Peptidylprolyl isomerase | pinusRadiata_024188 | Q9P3X9 | PEPTIDYL-PROLYL CIS-TRANS ISOMERASE (EC 5.2.1.8) | e−122 | 210 | 379
183 | Peptidylprolyl isomerase | pinusRadiata_027973 | Q9SR70 | T22K18.11 protein (AT3g10060/T22K18_11) | 3E−69 | 125 | 171
184 | WD40 repeat protein | pinusRadiata_001353 | Q9FNN2 | WD-repeat protein-like SPTREMBL | 0 | 317 | 590
185 | WD40 repeat protein | pinusRadiata_001978 | PRL1_ARATH | PP1/PP2A phosphatases pleiotropic regulator PRL1 | 0 | 341 | 502
186 | WD40 repeat protein | pinusRadiata_002810 | AAK49947 | TGF-BETA RECEPTOR-INTERACTING PROTEIN 1 | 0 | 273 | 326
187 | WD40 repeat protein | pinusRadiata_002811 | AAK49947 | TGF-BETA RECEPTOR-INTERACTING PROTEIN 1 | 0 | 273 | 326
188 | WD40 repeat protein | pinusRadiata_002812 | AAM15129 | HYPOTHETICAL 58.9 KDA PROTEIN | e−127 | 225 | 521
189 | WD40 repeat protein | pinusRadiata_003514 | Q9FJ94 | Similarity to myosin heavy chain kinase SPTREMBL | e−137 | 242 | 445
190 | WD40 repeat protein | pinusRadiata_004104 | GBB_ORYSA | Guanine nucleotide-binding protein beta subunit | 0 | 294 | 378
191 | WD40 repeat protein | pinusRadiata_005595 | Q9FTT9 | PUTATIVE DKFZP564O0463 PROTEIN | 0 | 320 | 459
192 | WD40 repeat protein | pinusRadiata_005754 | Q94JT6 | At1g78070/F28K19_28 SPTREMBL | e−168 | 294 | 451
193 | WD40 repeat protein | pinusRadiata_006463 | GBLP_MEDSA | Guanine nucleotide-binding protein beta subunit-like . . . 538 e−152 | e−152 | 261 | 324
194 | WD40 repeat protein | pinusRadiata_006665 | AAM20553 | HYPOTHETICAL 119.9 KDA PROTEIN. 1229 0.0 | 0 | 655 | 1169
195 | WD40 repeat protein | pinusRadiata_006750 | AAM13119 | HYPOTHETICAL 35.4 KDA PROTEIN. 560 e−158 | e−158 | 264 | 312
196 | WD40 repeat protein | pinusRadiata_007030 | Q9LJN8 | MITOTIC CHECKPOINT PROTEIN. 595 e−169 | e−169 | 284 | 335
197 | WD40 repeat protein | pinusRadiata_007854 | Q8H919 | Putative WD domain containing protein | 0 | 429 | 644
198 | WD40 repeat protein | pinusRadiata_007917 | AAD10151 | PUTATIVE WD-40 REPEAT PROTEIN, MSI4 | 0 | 353 | 462
199 | WD40 repeat protein | pinusRadiata_007989 | Q9LRZ0 | Genomic DNA, chromosome 3, TAC clone: K20I9 | 0 | 480 | 687
200 | WD40 repeat protein | pinusRadiata_008506 | MSI1_LYCES | WD-40 repeat protein MSI1 | 0 | 364 | 420
201 | WD40 repeat protein | pinusRadiata_008692 | Q8W403 | Sec13p | e−134 | 218 | 301
202 | WD40 repeat protein | pinusRadiata_008693 | Q8W403 | Sec13p | e−137 | 222 | 301
203 | WD40 repeat protein | pinusRadiata_009170 | Q9M0V4 | U3 snoRNP-associated-like protein. SPTREMBL | e−127 | 244 | 524
204 | WD40 repeat protein | pinusRadiata_009408 | Q9SAJ0 | F23A5.2(FORM 2). 602 e−171 | e−171 | 282 | 350
205 | WD40 repeat protein | pinusRadiata_009522 | Q8RXQ4 | Hypothetical 43.8 kDa protein | e−129 | 231 | 395
206 | WD40 repeat protein | pinusRadiata_009734 | AAO27452 | Peroxisomal targeting signal type 2 receptor. SPTREMBL | e−142 | 227 | 317
207 | WD40 repeat protein | pinusRadiata_009815 | AAM20433 | CELL CYCLE SWITCH PROTEIN | 0 | 326 | 500
208 | WD40 repeat protein | pinusRadiata_010670 | AAN72058 | Expressed protein | e−157 | 264 | 345
209 | WD40 repeat protein | pinusRadiata_011297 | AAM13100 | WD REPEAT PROTEIN ATAN11 | e−157 | 262 | 337
210 | WD40 repeat protein | pinusRadiata_013098 | AAM13153 | HYPOTHETICAL 39.1 KDA PROTEIN. 487 e−136 | e−136 | 229 | 352
211 | WD40 repeat protein | pinusRadiata_013172 | Q8H0T9 | Hypothetical protein | 0 | 437 | 860
212 | WD40 repeat protein | pinusRadiata_013589 | AAK52092 | WD-40 REPEAT PROTEIN | 0 | 448 | 512
213 | WD40 repeat protein | pinusRadiata_013608 | AAC27402 | EXPRESSED PROTEIN | e−141 | 202 | 358
214 | WD40 repeat protein | pinusRadiata_014299 | Q9XED5 | Cell cycle switch protein SPTREMBL | 0 | 335 | 488
215 | WD40 repeat protein | pinusRadiata_014498 | Q9FH64 | WD REPEAT PROTEIN-LIKE | e−152 | 206 | 329
216 | WD40 repeat protein | pinusRadiata_014548 | Q93ZS6 | HYPOTHETICAL 82.2 KDA PROTEIN | 0 | 505 | 763
217 | WD40 repeat protein | pinusRadiata_014610 | Q9M298 | Hypothetical 104.7 kDa protein | 0 | 450 | 922
218 | WD40 repeat protein | pinusRadiata_016090 | Q9SIY9 | Putative WD-40 repeat protein SPTREMBL | 0 | 442 | 802
219 | WD40 repeat protein | pinusRadiata_016722 | O22826 | Putative splicing factor SPTREMBL | e−159 | 257 | 310
220 | WD40 repeat protein | pinusRadiata_016785 | AAG60193 | PUTATIVE WD40 PROTEIN | 0 | 344 | 464
221 | WD40 repeat protein | pinusRadiata_017094 | Q9LV35 | WD40-REPEAT PROTEIN | 0 | 406 | 604
222 | WD40 repeat protein | pinusRadiata_017527 | Q9AYE4 | Hypothetical 35.3 kDa protein | e−154 | 254 | 314
223 | WD40 repeat protein | pinusRadiata_017591 | O80706 | F8K4.21 protein | 0 | 905 | 1218
224 | WD40 repeat protein | pinusRadiata_017769 | Q9XIJ3 | T10O24.21 | 0 | 446 | 607
225 | WD40 repeat protein | pinusRadiata_018047 | Q8VZY6 | FERTILIZATION-INDEPENDENT ENDOSPERM PROTEIN | 0 | 285 | 373
226 | WD40 repeat protein | pinusRadiata_018414 | Q947M8 | COPI | 0 | 455 | 638
227 | WD40 repeat protein | pinusRadiata_018986 | Q9LFE2 | WD40-repeat protein | 0 | 518 | 886
228 | WD40 repeat protein | pinusRadiata_019479 | Q9SZA4 | WD-repeat protein-like protein | e−156 | 276 | 454
229 | WD40 repeat protein | pinusRadiata_020144 | Q8W514 | MSI TYPE NUCLEOSOME/CHROMATIN ASSEMBLY FACTOR C | 0 | 288 | 413
230 | WD40 repeat protein | pinusRadiata_022480 | Q8W514 | MSI type nucleosome/chromatin assembly factor C | e−167 | 287 | 426
231 | WD40 repeat protein | pinusRadiata_023079 | Q8W514 | MSI type nucleosome/chromatin assembly factor C. SPTREMBL | e−169 | 283 | 397
232 | WD40 repeat protein | pinusRadiata_026739 | Q93YS7 | Putative WD-repeat membrane protein. SPTREMBL | 0 | 591 | 918
233 | WD40 repeat protein | pinusRadiata_026951 | Q93VS5 | AT4g18900/F13C5_70 (Hypothetical protein) | e−163 | 290 | 503
234 | WEE1-like protein | pinusRadiata_026529 | Q9SRY9 | F22D16.3 PROTEIN | e−122 | 209 | 451
235 | WD40 repeat protein | eucalyptusSpp_006366 | Q8LF96 | PRL1 protein | 0 | 374 | 492
236 | WD40 repeat protein | eucalyptusSpp_017378 | O22607 | WD-40 repeat protein MSI4 | 0 | 371 | 453
237 | WD40 repeat protein | pinusRadiata_000888 | O22466 | WD-40 repeat protein MSI1 | 0 | 364 | 420
238 | Cyclin-dependent kinase inhibitor | pinusRadiata_014166 | Q9FKB5 | GENOMIC DNA, CHROMOSOME 5, TAC CLONE: K24G6 (CYCLIN-DEPENDENT | 5E−42 | 114 | 304
239 | CDK type D | pinusRadiata_003189 | Q9M5G4 | CDK-activating kinase | 8E−21 | 56 | 100
240 | Histone acetyltransferase | pinusRadiata_009356 | Q9FJT8 | Histone acetyltransferase HAT B | 7E−85 | 187 | 510
241 | Histone deacetylase | pinusRadiata_000065 | Q9LPW6 | F13K23.8 protein. | 5E−18 | 71 | 209
242 | Histone deacetylase | pinusRadiata_014197 | Q8GXJ1 | Putative histone deacetylase | e−170 | 308 | 519
243 | Peptidylprolyl isomerase | pinusRadiata_009081 | Q9ZRQ9 | Cyclophilin (EC 5.2.1.8) (Peptidyl-prolyl cis-trans | e−106 | 185 | 190
244 | Peptidylprolyl isomerase | pinusRadiata_013417 | Q8H4T0 | Putative peptidyl-prolyl cis-trans isomerase protein | e−140 | 235 | 345
245 | WD40 repeat protein | pinusRadiata_005755 | Q9SKW4 | F5J5.6. | e−143 | 144 | 319
246 | WD40 repeat protein | pinusRadiata_006670 | Q9LDG7 | WD-40 repeat protein-like (MJK13.13 protein) | e−163 | 393 | 960
247 | WD40 repeat protein | pinusRadiata_007027 | Q8GWR1 | Hypothetical protein. | e−157 | 276 | 470
248 | WD40 repeat protein | pinusRadiata_007276 | Q9LF27 | Hypothetical 47.3 kDa protein | e−138 | 235 | 428
249 | WD40 repeat protein | pinusRadiata_007390 | Q94AH4 | PUTATIVE RING ZINC FINGER PROTEIN. 91 3e−17 | 3E−17 | 53 | 158
250 | WD40 repeat protein | pinusRadiata_012648 | O22212 | Hypothetical 61.8 kDa Trp-Asp repeats containing protein | 0 | 324 | 561
251 | WD40 repeat protein | pinusRadiata_013171 | Q8H0T9 | Hypothetical protein. | 0 | 437 | 860
252 | Cyclin B | eucalyptusSpp_045414 | Q9LDM4 | F2D10.10 (F5M15.6) | e−142 | 255 | 423
253 | Cyclin-dependent kinase inhibitor | eucalyptusSpp_044328 | Q9FKB5 | GENOMIC DNA, CHROMOSOME 5, TAC CLONE: K24G6 (CYCLIN-DEPENDENT | 1E−54 | 121 | 260
254 | Histone acetyltransferase | eucalyptusSpp_015615 | Q9AR19 | Histone acetyltransferase GCN5 (Expressed protein) | 0 | 390 | 563
255 | Peptidylprolyl isomerase | eucalyptusSpp_017239 | Q8GWM6 | Hypothetical protein | 0 | 364 | 591
256 | WD40 repeat protein | eucalyptusSpp_018643 | Q93VS5 | AT4g18900/F13C5_70 (Hypothetical protein) | 0 | 229 | 327
257 | WD40 repeat protein | eucalyptusSpp_019127 | Q9SRX9 | F22D16.14 protein. SPTREMBL | e−131 | 232 | 337
258 | WD40 repeat protein | eucalyptusSpp_022624 | Q9LFE2 | WD40-repeat protein | 0 | 594 | 868
259 | WD40 repeat protein | eucalyptusSpp_032424 | Q8LPL5 | Cell cycle switch protein | 0 | 255 | 327
260 | WD40 repeat protein | eucalyptusSpp_037472 | Q9SK69 | Putative WD-40 repeat protein (AT2G20330/F11A3.12) | 0 | 461 | 677
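
The E-value entries in the table above use a compact notation ("0", "e−150", "3E−88") written with a typographic minus sign. The following is a minimal sketch, not part of the original disclosure, of how such entries might be read numerically. The helper name `parse_evalue` is hypothetical, and treating the last two numeric columns as a matched count and a total length is an assumption, since this section does not state what those columns mean.

```python
def parse_evalue(text: str) -> float:
    """Convert table shorthand such as 'e−150', '3E−88', or '0' to a float."""
    # Normalise the typographic minus sign (U+2212) to an ASCII hyphen.
    s = text.strip().replace("\u2212", "-").lower()
    # A bare exponent such as 'e-150' is read as 1e-150.
    if s.startswith("e"):
        s = "1" + s
    return float(s)


# Fields of the row for SEQ ID NO 109, copied from the table.
row = ["109", "WD40 repeat protein", "eucalyptusSpp_020775",
       "Q8W403", "Sec13p", "e\u2212150", "242", "304"]

evalue = parse_evalue(row[5])
# Assumption: the last two columns are a matched count and a total length.
ratio = int(row[6]) / int(row[7])
print(evalue, round(ratio, 3))
```

The sketch works on a list of fields rather than on any particular line layout, so it does not depend on how the table rows are typeset.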








Claims
  • 1. An isolated polynucleotide comprising a nucleic acid sequence that (i) is selected from the group consisting of SEQ ID NOs: 1-260 and variants thereof, (ii) is selected from the group consisting of SEQ ID NOs: 521-772 and variants thereof, or (iii) encodes the catalytic or substrate-binding domain of a polypeptide selected from any one of SEQ ID NOs: 261-520, wherein the polynucleotide encodes a polypeptide having the activity of said polypeptide selected from any one of SEQ ID NOs: 261-520.
  • 2.-5. (canceled)
  • 6. The isolated polynucleotide of claim 1, wherein the variant has a sequence identity that is greater than or equal to 80% to any one of SEQ ID NOs: 1-260 or encodes a protein with an amino acid sequence having a sequence identity that is greater than 60%, 65%, 70%, 75%, 80%, 85% or 90% to any one of SEQ ID NOs: 261-520, and wherein the protein encoded by the polynucleotide possesses the activity of the protein encoded by said any one of SEQ ID NOs: 1-260.
  • 7.-8. (canceled)
  • 9. A DNA construct comprising at least one polynucleotide of claim 1, operably linked in sense or antisense orientation to a promoter, wherein the promoter is selected from the group consisting of a constitutive promoter, a strong promoter, an inducible promoter, a regulatable promoter, a temporally regulated promoter, and a tissue-preferred promoter.
  • 10.-13. (canceled)
  • 14. The DNA construct of claim 9, wherein an RNA transcript of the polynucleotide is complementary to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.
  • 15. A plant cell, comprising the DNA construct of claim 9.
  • 16. The plant cell of claim 15, wherein the plant cell is in a transgenic plant, and wherein the phenotype of the plant is different from a plant of the same species which does not comprise the plant cell, wherein the difference in phenotype is in lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.
  • 17.-20. (canceled)
  • 21. The transgenic plant of claim 16, wherein the plant is of a species of Eucalyptus or Pinus.
  • 22. The transgenic plant of claim 16, wherein the plant exhibits one or more traits selected from the group consisting of increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides, reduced period of juvenility, an increased period of juvenility, propensity to form reaction wood, self-abscising branches, accelerated reproductive development or delayed reproductive development as compared to a plant of the same species that has not been transformed with the DNA construct.
  • 23.-31. (canceled)
  • 32. A wood obtained from a transgenic tree which has been transformed with the DNA construct of claim 9.
  • 33. A wood pulp obtained from a transgenic tree which has been transformed with the DNA construct of claim 9.
  • 34.-36. (canceled)
  • 37. An isolated polypeptide comprising an amino acid sequence encoded by the isolated polynucleotide of claim 1.
  • 38.-43. (canceled)
  • 44. The isolated polynucleotide of claim 1, wherein the polynucleotide comprises fewer than about 100 nucleotide bases.
  • 45. A method of correlating gene expression in two different samples, comprising: detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 and conservative variants thereof in a first sample; detecting a level of expression of the one or more genes in a second sample; comparing the level of expression of the one or more genes in the first sample to the level of expression of the one or more genes in the second sample; and correlating a difference in expression level of the one or more genes between the first and second samples.
  • 46. The method of claim 45, wherein the first sample and the second sample are plant tissues that are from the same or different plant.
  • 47. The method of claim 4, wherein the first sample and the second sample are (i) from the same plant tissue, (ii) harvested during a different season of the year, and/or (iii) obtained from plants in different stages of development.
  • 48.-50. (canceled)
  • 51. The method of claim 46, wherein the plant tissue is selected from the group consisting of vascular tissue, apical meristem, vascular cambium, xylem, phloem, root, flower, cone, fruit, and seed.
  • 52. The method of claim 51, wherein the plant tissues are obtained from at least one of (i) a different type of tissue, (ii) a different stage of development, or (iii) different stages of the cell cycle.
  • 53.-54. (canceled)
  • 55. The method of claim 51, wherein the plant tissues are from one or more species of Eucalyptus or Pinus.
  • 56. (canceled)
  • 57. The method of claim 45, wherein the step of detecting is effected using one or more polynucleotides capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 under standard hybridization conditions.
  • 58. (canceled)
  • 59. The method of claim 57, wherein the step of detecting is accomplished by hybridization to a labeled nucleic acid.
  • 60. (canceled)
  • 61. The method of claim 57, wherein at least one of the polynucleotides hybridizes to a 3′ untranslated region of the nucleic acid sequence.
  • 62. (canceled)
  • 63. The method of claim 57, wherein the one or more polynucleotides comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 521-772.
  • 64.-66. (canceled)
  • 67. The method of claim 45, further comprising, prior to the detecting steps, the step of amplifying at least one of the genes.
  • 68. The method of claim 45, further comprising, prior to the detecting steps, the step of labeling at least one of the genes with a detectable label.
  • 69. A combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 or to an RNA transcript of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.
  • 70. (canceled)
  • 71. The combination of claim 69, wherein the oligonucleotides each hybridize to different nucleic acid sequences or to different RNA transcripts.
  • 72. (canceled)
  • 73. The combination of claim 69, wherein at least one of the oligonucleotides hybridizes to a 3′ untranslated region of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.
  • 74.-75. (canceled)
  • 76. The combination of claim 69, wherein at least one of the oligonucleotides comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 521-772.
  • 77.-83. (canceled)
  • 84. The combination of claim 69, comprising from about 2 to about 5000 oligonucleotides.
  • 85. The combination of claim 84, wherein each of the oligonucleotides is labeled with a detectable label.
  • 86. A microarray comprising the combination of claim 69 provided on a solid support, wherein each of the oligonucleotides occupies a unique location on said solid support.
  • 87. (canceled)
  • 88. A method for detecting one or more nucleic acid sequences in a sample, comprising contacting the sample with the combination of claim 69.
  • 89.-91. (canceled)
  • 92. The method of claim 88, wherein at least one of the oligonucleotides hybridizes to a 3′ untranslated region of a gene that comprises the nucleic acid sequence of at least one of SEQ ID NOs: 1-260.
  • 93.-103. (canceled)
  • 104. A kit for detecting gene expression comprising the microarray of claim 86 and one or more buffers or reagents for a nucleotide hybridization reaction.
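
The comparison recited in claim 45 (detect expression in a first sample, detect it in a second, compare, and correlate the difference) can be illustrated with a minimal sketch. This is not the method of the disclosure: the function names, the log2 fold-change metric, the pseudocount, the threshold, and the toy expression values are all hypothetical choices made here for illustration only.

```python
import math


def log2_fold_change(level_first: float, level_second: float,
                     pseudocount: float = 1.0) -> float:
    """Log2 ratio of second-sample to first-sample expression.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((level_second + pseudocount) / (level_first + pseudocount))


def compare_expression(first: dict, second: dict, threshold: float = 1.0) -> dict:
    """Return genes measured in both samples whose |log2 fold change| >= threshold."""
    differences = {}
    for gene in first.keys() & second.keys():
        change = log2_fold_change(first[gene], second[gene])
        if abs(change) >= threshold:
            differences[gene] = change
    return differences


# Hypothetical expression levels; the gene names and values are placeholders.
first_sample = {"gene_a": 7.0, "gene_b": 15.0}
second_sample = {"gene_a": 31.0, "gene_b": 15.0}

changed = compare_expression(first_sample, second_sample)
print(changed)
```

In practice the two dictionaries would hold measured signal for the genes of interest, for example hybridization intensities from two tissues or two seasons, and the threshold would be chosen to suit the assay.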
CROSS REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. provisional application Ser. No. 60/533,036, filed on Dec. 30, 2003, which is specifically incorporated in its entirety herein by reference.

Provisional Applications (1)
Number Date Country
60533036 Dec 2003 US
Divisions (1)
Number Date Country
Parent 11024959 Dec 2004 US
Child 12555853 US