Cell cycle progression proteins

Abstract
Polynucleotides encoding a number of Drosophila gene products are provided. Polynucleotide probes derived from these nucleotide sequences, polypeptides encoded by the polynucleotides and antibodies that bind to the polypeptides are also provided.
Description

The present invention relates to a number of genes implicated in the processes of cell cycle progression, including mitosis and meiosis.


We have now identified a number of genes in the X chromosome of Drosophila, mutations in which disrupt cell cycle progression, for example the processes of mitosis and/or meiosis. We have determined the phenotypes of these mutations, related the mutations to the total genome sequence, and so identified individual genes essential for cell cycle progression.


According to one aspect of the present invention, we provide a use of a polynucleotide as set out in Table 5, or a polypeptide encoded by the polynucleotide, in a method of prevention, treatment or diagnosis of a disease in an individual.


Preferably, the polynucleotide comprises a human polypeptide as set out in column 3 of Table 5. In preferred embodiments, the polynucleotide or polypeptide is used to identify a substance capable of binding to the polypeptide, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.


Alternatively or in addition, the polynucleotide or polypeptide is used to identify a substance capable of modulating the function of the polypeptide, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.


The polynucleotide or polypeptide may be administered to an individual in need of such treatment. Alternatively, or in addition, the substance identified by the method is administered to an individual in need of such treatment.


The use may be for a method of diagnosis, in which the presence or absence of a polynucleotide is detected in a biological sample in a method comprising: (a) bringing the biological sample containing nucleic acid such as DNA or RNA into contact with a probe comprising a fragment of at least 15 nucleotides of the polynucleotide as set out in Table 5 under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.


Alternatively, or in addition, the presence or absence of a polypeptide is detected in a biological sample in a method comprising: (a) providing an antibody capable of binding to the polypeptide; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.


In highly preferred embodiments, the disease comprises a proliferative disease such as cancer.


In a further aspect of the invention, we provide a method of modulating, preferably down-regulating, the expression of a polynucleotide as set out in Table 5 in a cell, the method comprising introducing a double stranded RNA (dsRNA) corresponding to the polynucleotide, or an antisense RNA corresponding to the polynucleotide, or a fragment thereof, into the cell.
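As an illustration of what "an antisense RNA corresponding to the polynucleotide" means at the sequence level, the following minimal sketch derives the antisense strand as the reverse complement of a sense sequence. The function name and the six-base input are hypothetical and are not taken from Table 5 or the Examples; the sketch says nothing about which fragment length or region would be effective in a cell.

```python
# Watson-Crick pairing for RNA bases.
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(sense: str) -> str:
    """Return the antisense RNA (reverse complement) of a sense sequence.

    DNA input is accepted: T is converted to U before complementing.
    """
    rna = sense.upper().replace("T", "U")
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(rna))


# Hypothetical six-base fragment, for illustration only.
sense_fragment = "AUGGCC"
antisense_fragment = antisense_rna(sense_fragment)
```

Applying the function twice returns the original sense sequence, which is a quick check that the two strands are complementary.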


According to another aspect of the present invention, we provide a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


There is provided, according to a further aspect of the present invention, a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


We provide, according to another aspect of the present invention, a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Table 5 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Table 5, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Table 5, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


As a further aspect of the present invention, there is provided a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


We provide, according to a further aspect of the present invention, a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


The present invention, in another aspect, provides a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 3 to 9 and 9A or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 3 to 9 and 9A, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 3 to 9 and 9A, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


In a further aspect of the present invention, there is provided a polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 10 to 29 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 10 to 29, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 10 to 29, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
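Clause (d) in each of the aspects above refers to sequences that are "degenerate as a result of the genetic code", that is, sequences that differ at the nucleotide level but encode the same amino acids. The following minimal sketch shows one way to check this for two coding sequences using the standard genetic code. The function names and the two nine-base sequences are hypothetical and are not taken from the Examples.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def are_degenerate(cds_a: str, cds_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return cds_a.upper() != cds_b.upper() and translate(cds_a) == translate(cds_b)


# Hypothetical sequences: both encode Met-Ala-Lys.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
same_product = are_degenerate(seq_a, seq_b)
```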


As a further aspect of the invention, we provide a polynucleotide probe which comprises a fragment of at least 15 nucleotides of a polynucleotide according to any of the above aspects of the invention.
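By way of illustration only, the following sketch enumerates every contiguous fragment of a given minimum length (15 nucleotides here, following the text above) from a sequence, together with a helper that returns the complementary strand. The function names and the 18-base input are hypothetical; no sequence from the Examples is used, and the sketch does not address probe specificity or hybridisation conditions.

```python
_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def candidate_probes(seq: str, length: int = 15) -> list[str]:
    """Return every contiguous fragment of exactly `length` nucleotides."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 18-base sequence, for illustration only.
example = "ACGTACGTACGTACGTAC"
probes = candidate_probes(example)
```

A sequence shorter than the requested length yields an empty list, since no fragment of that length exists.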


The present invention also provides a polypeptide which comprises any one of the amino acid sequences set out in Examples 1 to 29 or in any of Examples 1 to 2, 2A, 2B and 2C, Examples 3 to 9 and 9A and Examples 10 to 29, or a homologue, variant, derivative or fragment thereof.


Preferably the polypeptide is encoded by a cDNA sequence obtainable from a eukaryotic cDNA library, preferably a metazoan cDNA library (such as insect or mammalian) said DNA sequence comprising a DNA sequence being selectively detectable with a nucleotide sequence, preferably a Drosophila nucleotide sequence, as shown in any one of Examples 1 to 29.


The term “selectively detectable” means that the cDNA used as a probe is used under conditions where a target cDNA is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other cDNAs present in the cDNA library. In this event, background implies a level of signal generated by interaction between the probe and a non-specific cDNA member of the library which is less than one tenth, preferably less than one hundredth, as intense as the specific interaction observed with the target cDNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with 32P. Suitable conditions may be found by reference to the Examples, as well as in the detailed description below.
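The criterion above can be read as a simple ratio test: the background signal should be at least 10 fold, preferably at least 100 fold, weaker than the specific signal. The sketch below encodes that reading. The function name, the parameter names and the numerical values are hypothetical; signal intensities are assumed to be in the same arbitrary units (for example, counts from a radiolabelled probe).

```python
def is_selectively_detectable(
    target_signal: float, background_signal: float, fold: float = 10.0
) -> bool:
    """True if background is more than `fold` times weaker than the target signal.

    fold=10 corresponds to the basic threshold in the text; fold=100 to the
    preferred threshold.
    """
    if target_signal <= 0 or background_signal < 0:
        raise ValueError("signals must be non-negative and target must be positive")
    return background_signal * fold < target_signal


# Hypothetical intensities in arbitrary units.
basic = is_selectively_detectable(1000.0, 50.0)                 # 10-fold test
preferred = is_selectively_detectable(1000.0, 50.0, fold=100)   # 100-fold test
```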


A polynucleotide encoding a polypeptide as described here is also provided.


We further provide a vector comprising a polynucleotide of the invention, for example an expression vector comprising a polynucleotide of the invention operably linked to a regulatory sequence capable of directing expression of said polynucleotide in a host cell.


Also provided is an antibody capable of binding such a polypeptide.


In a further aspect the present invention provides a method for detecting the presence or absence of a polynucleotide of the invention in a biological sample which method comprises: (a) bringing the biological sample containing DNA or RNA into contact with a probe comprising a polynucleotide of the invention under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.


In another aspect the invention provides a method for detecting a polypeptide of the invention present in a biological sample which comprises: (a) providing an antibody of the invention; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.


Knowledge of the genes involved in cell cycle progression allows the development of therapeutic agents for the treatment of medical conditions associated with aberrant cell cycle progression. Accordingly, the present invention provides a polynucleotide of the invention for use in therapy. The present invention also provides a polypeptide of the invention for use in therapy. The present invention further provides an antibody of the invention for use in therapy.


In a specific embodiment, the present invention provides a method of treating a tumour or a patient suffering from a proliferative disease, comprising administering to a patient in need of treatment an effective amount of a polynucleotide, polypeptide and/or antibody of the invention.


The present invention also provides the use of a polypeptide of the invention in a method of identifying a substance capable of affecting the function of the corresponding gene. For example, in one embodiment the present invention provides the use of a polypeptide of the invention in an assay for identifying a substance capable of inhibiting cell cycle progression. The assay involves contacting the polypeptide with a candidate substance or molecule, and detecting modulation of activity of the polypeptide. In preferred embodiments, further steps of isolating or synthesising the substance so identified are carried out.


The substance may inhibit any of the steps or stages in the cell cycle, for example, formation of the nuclear envelope, exit from the quiescent phase of the cell cycle (G0), G1 progression, chromosome decondensation, nuclear envelope breakdown, START, initiation of DNA replication, progression of DNA replication, termination of DNA replication, centrosome duplication, G2 progression, activation of mitotic or meiotic functions, chromosome condensation, centrosome separation, microtubule nucleation, spindle formation and function, interactions with microtubule motor proteins, chromatid separation and segregation, inactivation of mitotic functions, formation of contractile ring, and cytokinesis functions. For example, possible functions of genes of the invention for which it may be desired to identify substances which affect such functions include chromatin binding, formation of replication complexes, replication licensing, phosphorylation or other secondary modification activity, proteolytic degradation, microtubule binding, actin binding, septin binding, microtubule organising centre nucleation activity and binding to components of cell cycle signalling pathways.


In a further aspect the present invention provides a method for identifying a substance capable of binding to a polypeptide of the invention, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.


In an additional aspect, the invention provides kits comprising polynucleotides, polypeptides or antibodies of the invention and methods of using such kits in diagnosing the presence or absence of polynucleotides and polypeptides of the invention, including deleterious mutant forms.


Also provided is a substance identified by the above methods of the invention. Such substances may be used in a method of therapy, such as in a method of affecting cell cycle progression, for example mitosis and/or meiosis.


The invention also provides a process comprising the steps of: (a) performing one of the above methods; and (b) preparing a quantity of those one or more substances identified as being capable of binding to a polypeptide of the invention.


Also provided is a process comprising the steps of: (a) performing one of the above methods; and (b) preparing a pharmaceutical composition comprising one or more substances identified as being capable of binding to a polypeptide of the invention.


We further provide a method for identifying a substance capable of modulating the function of a polypeptide of the invention or a polypeptide encoded by a polynucleotide of the invention, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.


A substance identified by a method or assay according to any of the above methods or processes is also provided, as is the use of such a substance in a method of inhibiting the function of a polypeptide. Use of such a substance in a method of regulating a cell division cycle function is also provided.


We further provide a method of identifying a human nucleic acid sequence, by: (a) selecting a Drosophila polypeptide identified in any of Examples 1 to 29; (b) identifying a corresponding human polypeptide; (c) identifying a nucleic acid encoding the polypeptide of (b).


Preferably, a human homologue of the Drosophila sequence, or a human sequence similar to the Drosophila sequence, is identified in step (b).


Preferably, the human polypeptide has at least one of the biological activities, preferably substantially all the biological activities of the Drosophila polypeptide.


We provide a human polypeptide identified by a method according to the previous aspect of the invention.




BRIEF DESCRIPTION OF THE FIGURES


FIG. 1 shows mitotic index after RNAi knockdown of Corkscrew (CG3954) in Dmel-2 Drosophila cultured cells. Values are an average of triplicate samples. Positive controls are siRNA with the mitotic genes Polo kinase and Orbit; negative controls are siRNA with water and with an siRNA against the non-endogenous gene GL3.



FIG. 2 shows a BLASTP alignment of Drosophila Corkscrew (CG3954) (query sequence), identified in Example 19 as a cell cycle gene, and human Shp2 Protein-tyrosine phosphatase, non-receptor type 11 (genbank accession D13540) (subject sequence).



FIG. 3 shows a histogram of FACS analysis of cell cycle compartment as determined by DNA content in U2OS cells after human Shp2 siRNA transfection for 48 hours. The negative control is transfection with siRNA against the non-endogenous gene GL3.



FIG. 4 shows fluorescence micrographs showing the effect of Shp2 siRNAi in U2OS cells. A) Irregular nuclear shape, B) Increase in apoptosis.



FIG. 5 shows mitotic index after RNAi knockdown of Drosophila discs large 1 Dlg1 (CG1725) in Dmel-2 Drosophila cultured cells. Values are an average of triplicate samples. Positive controls are siRNA with the mitotic genes Polo kinase and Orbit; negative controls are siRNA with water and with an siRNA against the non-endogenous gene GL3.



FIG. 6A shows a BLASTP alignment of Drosophila discs large 1 Dlg1 (CG1725), identified in Example 28 as a cell cycle gene, and human discs, large (Drosophila) homolog 1 (genbank accession U13896).



FIG. 6B shows a ClustalW alignment of Drosophila discs large 1 Dlg1 (CG1725) and human discs, large (Drosophila) homolog 1 (genbank accession U13896).



FIG. 6C shows a BLASTP alignment of Drosophila discs large 1 Dlg1 (CG1725) and human discs, large (Drosophila) homolog 2 (genbank accession U32376).



FIG. 6D shows a ClustalW alignment of Drosophila discs large 1 Dlg1 (CG1725) and human discs, large (Drosophila) homolog 2 (genbank accession U32376).



FIG. 7 shows a ClustalW alignment of Drosophila Dlg1 and the 5 human Dlg genes (Dlg 1-5) so far described.



FIG. 8 shows a histogram of FACS analysis of cell cycle status after siRNA in U2OS cells. Negative control is siRNA against the non-endogenous GL3 gene.



FIG. 9 shows fluorescence micrographs showing the dominant phenotype observed with Dlg1 COD1654 siRNAi in U2OS cells. A) Multicentrosomal cells at prometaphase and anaphase. B) Cytokinesis defect.



FIG. 10 shows fluorescence micrographs showing the dominant phenotype observed with Dlg2 COD1652 siRNAi in U2OS cells. A) Multicentrosomal cell at telophase. B) Cytokinesis defects.




DETAILED DESCRIPTION

We provide for polynucleotides and polypeptides whose sequences are set out, or which are referred to, in any of Examples 1 to 29, including Drosophila and human sequences. In particular, we provide for the sequences, including human sequences, and their use in diagnosis and treatment of disease (including prevention and treatment of diseases, syndromes and symptoms) as described in further detail below. A particularly suitable disease for treatment or diagnosis is a proliferative disease such as cancer or any tumour. The polynucleotides and polypeptides disclosed here may be used in screening assays to identify compounds which are capable of binding to, or inhibiting an activity of, the polypeptide or polynucleotide.


Particularly preferred polypeptides include those set out in Example 19 and referred to as Shp2, as well as those set out in Example 28 and referred to as Dlg1 and Dlg2. Accordingly, we provide for Shp2 polypeptide and polynucleotide, as well as Dlg1 and Dlg2 polypeptide and polynucleotide, for the treatment and diagnosis of diseases such as cancer, as described in further detail below.


By the term “Shp2”, we mean a sequence as set out in Example 19 and having the accession number NM002834, together with its variants, homologues, derivatives, fragments and complements as described in further detail below. Preferably, the term “Shp2” should be taken to refer to the human sequence itself. Two transcript variants (variants 1 and 2 as set out in Example 19) are known, and both are encompassed in the term “Shp2”. Shp2 is also known as Homo sapiens protein tyrosine phosphatase, non-receptor type 11 (PTPN11). Furthermore, various sequences differing in length are known for Shp2, and each of these is intended to be included for the uses and compositions described here.


As used in this document, the terms “Dlg1” and “Dlg2” mean the sequences as set out in Example 28 and having the GENBANK accession numbers U13896 and U32376 respectively. Variants, homologues, derivatives, fragments and complements (as described in further detail below) of each of these sequences are also included within the meaning of these terms.


Dlg1 is also known as “human discs, large (Drosophila) homolog 1” while Dlg2 is also known as “human discs, large (Drosophila) homolog 2, chapsyn-110 channel-associated protein of synapses-110”. Various sequences differing in length are known for Dlg1 and Dlg2, and each of these is intended to be included for the uses and compositions described here.


Preferably, the polypeptides and polynucleotides are such that they give rise to or are associated with defined phenotypes when mutated.


For example, mutations in the polypeptides and polynucleotides may be associated with female sterility; such polypeptides and polynucleotides are conveniently categorised as “Category 1”. Phenotypes associated with Category 1 polypeptides and polynucleotides include any one or more of the following, singly or in combination: female semi-sterile, brown eggs laid; female sterile, few eggs laid, several fully matured eggs in ovarioles; female semi-sterile, lays eggs, but arrest before cortical migration; female sterile, no eggs laid, fully mature eggs but “retained eggs” phenotype, also has a mitotic phenotype: higher mitotic index, uneven chromosome staining, tangled and badly defined chromosomes with frequent bridges; female sterile (semi-sterile), 2-3 fully matured eggs in each of the ovarioles.


Alternatively, mutations in the polypeptides and polynucleotides may be associated with male sterility; such polypeptides and polynucleotides are conveniently categorised as “Category 2”. Phenotypes associated with Category 2 polypeptides and polynucleotides include any one or more of the following, singly or in combination: lethal phase pharate adult, cytokinesis defect: some onion stage cysts with large Nebenkerns; reduced adult viability, cytokinesis defect: onion stage cysts have variable sized Nebenkerns, mitotic phenotype: tangled unevenly condensed chromosomes, anaphases with lagging chromosomes and bridges; semi-lethal male and female, cytokinesis defect: in some cysts, variable sized Nebenkerns; male sterile, cytokinesis defect, different meiotic stages within one cyst, variable sized nuclei, 2-4 nuclei, mitotic phenotype: semi-lethal, rod-like overcondensed chromosomes, high mitotic index, lagging chromosomes and bridges; male sterile, asynchronous meiotic divisions, cysts with large Nebenkern and 1-2 larger nuclei, testis from 2-3 old males become smaller, high mitotic index, colchicine-type overcondensation, many anaphases and telophases, no decondensation in telophase, mitotic phenotype: high mitotic index, colchicine-type overcondensed chromosomes, many ana- and telophases, no decondensation in telophase; cytokinesis defect, small testis, no meiosis observed, variable sized Nebenkerns with 2-4N nuclei; male sterile, cytokinesis defect, larger Nebenkerns with 2-4N nuclei; male sterile, cytokinesis defect: variable sized Nebenkerns with 4N nuclei, some nuclei detached from Nebenkern.


Mutations in the polypeptides and polynucleotides may be associated with a mitotic (neuroblast) phenotype (“Category 3”). Phenotypes associated with Category 3 polypeptides and polynucleotides include any one or more of the following, singly or in combination: lethal phase between pupal and pharate adult (P-pA), high mitotic index, rod-like overcondensed chromosomes, a few circular metaphases, many overcondensed anaphases and telophases, a few tetraploid cells; lethal phase pharate adult, high mitotic index, rod-like overcondensed chromosomes, lagging chromosomes and bridges in anaphase, highly condensed; lethal phase pupal-pharate adult, high mitotic index, colchicine-type overcondensation, high frequency of polyploids; lethal phase pupal-pharate adult, high mitotic index, colchicine-type overcondensed chromosomes, many strongly stained nuclei; lethal phase larval stage 3-pre-pupal-pupal, small optic lobes, missing or small imaginal discs, badly defined chromosomes; lethal phase pharate adult, dot and rod-like overcondensed chromosomes, high mitotic index, overcondensed anaphases some with lagging chromosomes, a few tetraploid cells with overcondensed chromosomes, XYY males; lethal phase embryonic larval phase 3-pre-pupal-pupal, high mitotic index, dot-like chromosomes, strong metaphase arrest; lethal phase larval phase 3-pre-pupal-pupal-pharate adult-adult, high mitotic index, dot and rod-like overcondensed chromosomes, high frequency of polyploids; lethal phase larval stage 3 (few pupae), high mitotic index, colchicine-type overcondensation of chromosomes, polyploid cells, mininuclei formation; lethal phase larval stage 1-2, low mitotic index, few cells in mitosis, metaphase with separated chromosomes; viable, high mitotic index, colchicine-type overcondensed chromosomes, a few polyploid cells; lethal phase pharate adult, high mitotic index, rod-like overcondensed chromosomes, few anaphases with lagging chromosomes; lethal phase larval stage 3-pharate adult, small brain and optic lobes, high mitotic index, rod-like overcondensed chromosomes, fewer ana- and telophases, overcondensed chromosomes in ana- and telophase; lethal phase larval stage 3, small brain, few cells in mitosis, badly defined chromosomes, weak chromosome condensation, abnormal anaphases with broken chromosomes; lethal phase larval stage 3, small brain, high mitotic index, rod-like overcondensed chromosomes, fewer ana- and telophases; semilethal male and female, low mitotic index, badly defined chromosomes, weak/uneven staining, fewer ana- and telophases; lethal phase pupal to pharate adult, lagging chromosomes and bridges in ana- and telophase; lethal phase pupal, uneven chromosome condensation, lagging chromosomes in anaphase; lethal phase pupal, higher mitotic index, colchicine-like overcondensed chromosomes, many ana- and telophases, lagging chromosomes; lethal phase prepupal-pupal, high mitotic index, colchicine-like chromosome condensation, metaphase arrest.


The polypeptides and polynucleotides described here may also be categorised according to their function, or their putative function.


For example, the polypeptides described here preferably comprise, and the polynucleotides described here are ones which preferably encode polypeptides comprising, any one or more of the following: CREB-binding proteins, transcription factors, casein kinases, serine threonine kinases, preferably involved in replication and cell cycle, protein phosphatases, membrane associated proteins, preferably involved in priming synaptic vesicles, dynein light chains, microtubule motor proteins, protein phosphatases, protein phosphatases with p53 dependent expression, proteins capable of inhibiting cell division, ribosomal proteins, motor proteins, cytoskeletal binding proteins linking to plasma membrane, proteins involved in cytokinesis and cell shape, phosphatidylinositol 3-kinases, C-myc oncogenes, transcription factors, dehydrogenases, thioredoxin reductases, cell cycle regulators preferably involved in cyclin degradation; centrosome components, protein tyrosine phosphatases, Wnt oncogenes, ubiquitin ligases, ubiquitin conjugating enzymes, vesicle trafficking proteins, protein kinases (including protein kinases which regulate the G1/S phase transition and/or DNA replication in mammalian cells), serine/threonine kinases, including serine/threonine kinases involved in wingless signaling pathway, components of cell junctions, including components of cell junctions having a role in proliferation and Ras associated effector proteins; hydroxymethyltransferase; glycosylation/membrane protein; hydrogen transporting ATP synthase; role in cell cycle progression.


The practice of the present invention will employ, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA and immunology, which are within the capabilities of a person of ordinary skill in the art. Such techniques are explained in the literature. See, for example, J. Sambrook, E. F. Fritsch, and T. Maniatis, 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Books 1-3, Cold Spring Harbor Laboratory Press; Ausubel, F. M. et al. (1995 and periodic supplements; Current Protocols in Molecular Biology, ch. 9, 13, and 16, John Wiley & Sons, New York, N.Y.); B. Roe, J. Crabtree, and A. Kahn, 1996, DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee, 1990, In Situ Hybridization: Principles and Practice; Oxford University Press; M. J. Gait (Editor), 1984, Oligonucleotide Synthesis: A Practical Approach, Irl Press; D. M. J. Lilley and J. E. Dahlberg, 1992, Methods of Enzymology: DNA Structure Part A: Synthesis and Physical Analysis of DNA Methods in Enzymology, Academic Press; Using Antibodies: A Laboratory Manual: Portable Protocol NO. I by Edward Harlow, David Lane, Ed Harlow (1999, Cold Spring Harbor Laboratory Press, ISBN 0-87969-544-7); Antibodies: A Laboratory Manual by Ed Harlow (Editor), David Lane (Editor) (1988, Cold Spring Harbor Laboratory Press, ISBN 0-87969-314-2), 1855. Handbook of Drug Screening, edited by Ramakrishna Seethala, Prabhavathi B. Fernandes (2001, New York, N.Y., Marcel Dekker, ISBN 0-8247-0562-9); and Lab Ref: A Handbook of Recipes, Reagents, and Other Reference Tools for Use at the Bench, Edited Jane Roskams and Linda Rodgers, 2002, Cold Spring Harbor Laboratory, ISBN 0-87969-630-3. Each of these general texts is herein incorporated by reference.


Polypeptides


It will be understood that polypeptides as described here are not limited to polypeptides having the amino acid sequence set out in Examples 1 to 29 or fragments thereof but also include homologous sequences obtained from any source, for example related viral/bacterial proteins, cellular homologues and synthetic peptides, as well as variants or derivatives thereof.


Thus polypeptides also include homologues from other species, including animals such as mammals (e.g. mice, rats or rabbits), especially primates, more especially humans. More specifically, such homologues include human homologues.


Thus, we describe variants, homologues or derivatives of the amino acid sequence set out in Examples 1 to 29, as well as variants, homologues or derivatives of the nucleotide sequence coding for the amino acid sequences as described here.


In the context of this document, a homologous sequence is taken to include an amino acid sequence which is at least 15, 20, 25, 30, 40, 50, 60, 70, 80 or 90% identical, preferably at least 95 or 98% identical at the amino acid level over at least 50 or 100, preferably 200, 300, 400 or 500 amino acids with any one of the polypeptide sequences shown in the Examples. In particular, homology should typically be considered with respect to those regions of the sequence known to be essential for protein function rather than non-essential neighbouring sequences. This is especially important when considering homologous sequences from distantly related organisms.


Although homology can also be considered in terms of similarity (i.e. amino acid residues having similar chemical properties/functions), in the context of this document, it is preferred to express homology in terms of sequence identity.


Homology comparisons can be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These publicly and commercially available computer programs can calculate % homology between two or more sequences.


% homology may be calculated over contiguous sequences, i.e. one sequence is aligned with the other sequence and each amino acid in one sequence directly compared with the corresponding amino acid in the other sequence, one residue at a time. This is called an “ungapped” alignment. Typically, such ungapped alignments are performed only over a relatively short number of residues (for example less than 50 contiguous amino acids).


Although this is a very simple and consistent method, it fails to take into consideration that, for example, in an otherwise identical pair of sequences, one insertion or deletion will cause the following amino acid residues to be put out of alignment, thus potentially resulting in a large reduction in % homology when a global alignment is performed. Consequently, most sequence comparison methods are designed to produce optimal alignments that take into consideration possible insertions and deletions without penalising unduly the overall homology score. This is achieved by inserting “gaps” in the sequence alignment to try to maximise local homology.
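By way of illustration only, the ungapped comparison described above, and its sensitivity to a single insertion, may be sketched as follows. The function name and the example peptides are hypothetical and are not taken from the Examples.

```python
def ungapped_percent_identity(seq_a, seq_b):
    """Percent identity of two equal-length sequences, compared one residue at a time."""
    if len(seq_a) != len(seq_b):
        raise ValueError("an ungapped comparison needs sequences of equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical peptides: one substitution at the last position.
    print(ungapped_percent_identity("MKTAYIAKQR", "MKTAYIAKQL"))
    # Hypothetical peptides: one inserted residue (X) shifts every later
    # residue out of register, so the ungapped score collapses.
    print(ungapped_percent_identity("ACDEFGHIKL", "AXCDEFGHIK"))
```

In the second pair the two sequences are identical apart from one inserted residue, yet only the first position still matches, which is the situation that gapped alignment methods are designed to handle.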


However, these more complex methods assign “gap penalties” to each gap that occurs in the alignment so that, for the same number of identical amino acids, a sequence alignment with as few gaps as possible—reflecting higher relatedness between the two compared sequences—will achieve a higher score than one with many gaps. “Affine gap costs” are typically used that charge a relatively high cost for the existence of a gap and a smaller penalty for each subsequent residue in the gap. This is the most commonly used gap scoring system. High gap penalties will of course produce optimised alignments with fewer gaps. Most alignment programs allow the gap penalties to be modified. However, it is preferred to use the default values when using such software for sequence comparisons. For example when using the GCG Wisconsin Bestfit package (see below) the default gap penalty for amino acid sequences is −12 for a gap and −4 for each extension.
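A minimal sketch of an affine gap cost is given below, using the default amino acid values quoted above (12 for the existence of a gap, 4 for each extension) expressed as positive costs. It assumes that every gapped residue, including the first, incurs the extension charge; some programs charge the extension only from the second residue onwards, so the function is illustrative rather than a reproduction of any particular package.

```python
def affine_gap_cost(gap_length, gap_open=12.0, gap_extend=4.0):
    """Cost of a single gap of gap_length residues under an affine scheme.

    Assumption: the opening cost is charged once per gap and the extension
    cost is charged for every residue in the gap, including the first.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0.0
    return gap_open + gap_extend * gap_length


if __name__ == "__main__":
    one_long_gap = affine_gap_cost(3)
    three_short_gaps = 3 * affine_gap_cost(1)
    print(one_long_gap, three_short_gaps)
```

Under this scheme one gap of three residues costs less than three separate gaps of one residue each, which is why alignments with few gaps are favoured.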


Calculation of maximum % homology therefore firstly requires the production of an optimal alignment, taking into consideration gap penalties. A suitable computer program for carrying out such an alignment is the GCG Wisconsin Bestfit package (University of Wisconsin, U.S.A.; Devereux et al., 1984, Nucleic Acids Research 12:387). Examples of other software that can perform sequence comparisons include, but are not limited to, the BLAST package (see Ausubel et al., 1999 ibid, Chapter 18), FASTA (Altschul et al., 1990, J. Mol. Biol., 403-410) and the GENEWORKS suite of comparison tools. Both BLAST and FASTA are available for offline and online searching (see Ausubel et al., 1999 ibid, pages 7-58 to 7-60). However, it is preferred to use the GCG Bestfit program.


Although the final % homology can be measured in terms of identity, the alignment process itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled similarity score matrix is generally used that assigns scores to each pairwise comparison based on chemical similarity or evolutionary distance. An example of such a matrix commonly used is the BLOSUM62 matrix—the default matrix for the BLAST suite of programs. GCG Wisconsin programs generally use either the public default values or a custom symbol comparison table if supplied (see user manual for further details). It is preferred to use the public default values for the GCG package, or in the case of other software, the default matrix, such as BLOSUM62.


Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.


The terms “variant” or “derivative” in relation to the amino acid sequences include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) amino acids from or to the sequence, provided that the resultant amino acid sequence retains substantially the same activity as the unmodified sequence, preferably having at least the same activity as the polypeptides presented in the sequence listings in the Examples.


Polypeptides having the amino acid sequence shown in the Examples, or fragments or homologues thereof may be modified for use in the methods and compositions described here. Typically, modifications are made that maintain the biological activity of the sequence. Amino acid substitutions may be made, for example from 1, 2 or 3 to 10, 20 or 30 substitutions provided that the modified sequence retains the biological activity of the unmodified sequence. Alternatively, modifications may be made to deliberately inactivate one or more functional domains of the polypeptides described here. Amino acid substitutions may include the use of non-naturally occurring analogues, for example to increase blood plasma half-life of a therapeutically administered polypeptide.


Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:

ALIPHATIC | Non-polar         | G A P
          |                   | I L V
          | Polar - uncharged | C S T M
          |                   | N Q
          | Polar - charged   | D E
          |                   | K R
AROMATIC  |                   | H F W Y
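For illustration, the groupings of the Table may be encoded as a simple lookup that reports whether two residues fall in the same line (the preferred case), in the same block, or in neither. The data structure and function name are hypothetical, and the groupings reflect our reading of the Table.

```python
# Blocks (second column) and lines (third column) as read from the Table.
SUBSTITUTION_BLOCKS = {
    "non-polar": ["GAP", "ILV"],
    "polar - uncharged": ["CSTM", "NQ"],
    "polar - charged": ["DE", "KR"],
    "aromatic": ["HFWY"],
}


def classify_substitution(residue_a, residue_b):
    """Return 'same line', 'same block' or 'different' for two one-letter residues."""
    a, b = residue_a.upper(), residue_b.upper()
    for lines in SUBSTITUTION_BLOCKS.values():
        for line in lines:
            if a in line and b in line:
                return "same line"
        block = "".join(lines)
        if a in block and b in block:
            return "same block"
    return "different"


if __name__ == "__main__":
    for pair in [("I", "L"), ("G", "V"), ("D", "F")]:
        print(pair, classify_substitution(*pair))
```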


Polypeptides also include fragments of the full length sequences mentioned above. Preferably said fragments comprise at least one epitope. Methods of identifying epitopes are well known in the art. Fragments will typically comprise at least 6 amino acids, more preferably at least 10, 20, 30, 50 or 100 amino acids.


Proteins as described here are typically made by recombinant means, for example as described below. However they may also be made by synthetic means using techniques well known to skilled persons such as solid phase synthesis. Proteins may also be produced as fusion proteins, for example to aid in extraction and purification. Examples of fusion protein partners include glutathione-S-transferase (GST), 6xHis, GAL4 (DNA binding and/or transcriptional activation domains) and β-galactosidase. It may also be convenient to include a proteolytic cleavage site between the fusion protein partner and the protein sequence of interest to allow removal of fusion protein sequences. Preferably the fusion partner will not hinder the function of the protein of interest. Proteins as described here may also be obtained by purification of cell extracts from animal cells.


The proteins may be in a substantially isolated form. It will be understood that the protein may be mixed with carriers or diluents which will not interfere with the intended purpose of the protein and still be regarded as substantially isolated. A protein may also be in a substantially purified form, in which case it will generally comprise the protein in a preparation in which more than 90%, e.g. 95%, 98% or 99% of the protein in the preparation is a protein as described in this document.


A polypeptide may be labeled with a revealing label. The revealing label may be any suitable label which allows the polypeptide to be detected. Suitable labels include radioisotopes, e.g. 125I, enzymes, antibodies, polynucleotides and linkers such as biotin. Labeled polypeptides as described here may be used in diagnostic procedures such as immunoassays to determine the amount of a polypeptide in a sample. Polypeptides or labeled polypeptides may also be used in serological or cell-mediated immune assays for the detection of immune reactivity to said polypeptides in animals and humans using standard protocols.


A polypeptide or labeled polypeptide or fragment thereof may also be fixed to a solid phase, for example the surface of an immunoassay well or dipstick. Such labeled and/or immobilised polypeptides may be packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like. Such polypeptides and kits may be used in methods of detection of antibodies to the polypeptides or their allelic or species variants by immunoassay.


Immunoassay methods are well known in the art and will generally comprise: (a) providing a polypeptide comprising an epitope bindable by an antibody against said protein; (b) incubating a biological sample with said polypeptide under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said polypeptide is formed.


The polypeptides described here may be used in in vitro or in vivo cell culture systems to study the role of their corresponding genes and homologues thereof in cell function, including their function in disease. For example, truncated or modified polypeptides may be introduced into a cell to disrupt the normal functions which occur in the cell. The polypeptides may be introduced into the cell by in situ expression of the polypeptide from a recombinant expression vector (see below). The expression vector optionally carries an inducible promoter to control the expression of the polypeptide.


The use of appropriate host cells, such as insect cells or mammalian cells, is expected to provide for such post-translational modifications (e.g. myristoylation, glycosylation, truncation, lipidation and tyrosine, serine or threonine phosphorylation) as may be needed to confer optimal biological activity on recombinant expression products. Such cell culture systems in which such polypeptides are expressed may be used in assay systems to identify candidate substances which interfere with or enhance the functions of the polypeptides described here in the cell.


Polynucleotides


We demonstrate here that mutations in genes encoding the polypeptides disclosed in the Examples cause a cell cycle defect, and that accordingly these genes and the proteins encoded by them are required for cell cycle function.


Polynucleotides as described in this document include polynucleotides that comprise any one or more of the nucleic acid sequences encoding the polypeptides set out in Examples 1 to 29 and fragments thereof. Such polynucleotides also include polynucleotides encoding the polypeptides described here. It is straightforward to identify a nucleic acid sequence which encodes such a polypeptide, by reference to the genetic code. Furthermore, computer programs are available which translate a nucleic acid sequence to a polypeptide sequence, and/or vice versa. Each and every sequence capable of encoding the polypeptides disclosed in the Examples is considered disclosed in this document, and the disclosure of a polypeptide sequence includes a disclosure of all nucleic acids (and their sequences) which encode that polypeptide sequence.


It will be understood by a skilled person that numerous different polynucleotides can encode the same polypeptide as a result of the degeneracy of the genetic code. In addition, it is to be understood that skilled persons may, using routine techniques, make nucleotide substitutions that do not affect the polypeptide sequence encoded by the polynucleotides described here to reflect the codon usage of any particular host organism in which the polypeptides are to be expressed.
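The degeneracy referred to above can be made concrete with a short sketch that counts how many distinct nucleotide sequences encode a given peptide under the standard genetic code. The function name and example peptides are hypothetical; organisms or organelles using a variant genetic code would need a different table.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide):
    """Number of distinct coding sequences for a peptide (stop codon not included)."""
    peptide = peptide.upper()
    unknown = [aa for aa in peptide if aa not in CODONS_PER_RESIDUE or aa == "*"]
    if unknown:
        raise ValueError(f"unrecognised residues: {unknown}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


if __name__ == "__main__":
    # Met and Trp have one codon each; Leu has six; Lys has two.
    print(count_encodings("MW"), count_encodings("MKL"))
```

Even a short peptide can therefore be encoded by many polynucleotides, which leaves room to choose codons favoured by a particular host organism.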


In preferred embodiments, the polynucleotides comprise those polynucleotides, such as cDNA, mRNA, and genomic DNA of the relevant organism, which encode the polypeptides disclosed in the Examples. Such polynucleotides may typically comprise Drosophila cDNA, mRNA, and genomic DNA, Homo sapiens cDNA, mRNA, and genomic DNA, etc. Accession numbers are provided in the Examples for the polypeptide sequences, and it is straightforward to derive the encoding nucleic acid sequences by use of such accession numbers in a relevant database, such as a Drosophila sequence database, a human sequence database, including a Human Genome Sequence database, GadFly, FlyBase, etc. In particular, the annotated Drosophila sequence database of the Berkeley Drosophila Genome Project (GadFly: Genome Annotation Database of Drosophila at http://www.fruitfly.org/annot/) may be used to identify such Drosophila and human polynucleotide sequences. Relevant sequences may also be obtained by searching sequence databases with the polypeptide sequences using a program such as BLAST. In particular, a search using TBLASTN may be employed.


Furthermore, we provide a method of identifying a human nucleic acid sequence, by: (a) selecting a Drosophila polypeptide identified in any of Examples 1 to 29; (b) identifying a corresponding human polypeptide; (c) identifying a nucleic acid encoding the polypeptide of (b). Step (b) may in particular involve identifying a human homologue of the Drosophila sequence, or a human sequence similar to the Drosophila sequence. Preferably, such a polypeptide has at least one of the biological activities, preferably substantially all the biological activities (such as identified in the Examples) of the Drosophila polypeptide. Preferably, the human polypeptide is involved in an aspect of cell cycle control. A human polypeptide identified as above, as well as a sequence of the human polypeptide and a sequence of the human nucleic acid are also provided.


Polynucleotides as described here may comprise DNA or RNA. They may be single-stranded or double-stranded. They may also be polynucleotides which include within them synthetic or modified nucleotides. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule. For the purposes of this document, it is to be understood that the polynucleotides described herein may be modified by any method available in the art. Such modifications may be carried out in order to enhance the in vivo activity or life span of polynucleotides.


The terms “variant”, “homologue” or “derivative” in relation to a nucleotide sequence include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) nucleotides from or to the sequence. Preferably said variants, homologues or derivatives code for a polypeptide having biological activity.


As indicated above, with respect to sequence homology, preferably there is at least 50 or 75%, more preferably at least 85%, more preferably at least 90% homology to the sequences shown in the sequence listing herein. More preferably there is at least 95%, more preferably at least 98%, homology. Nucleotide homology comparisons may be conducted as described above. A preferred sequence comparison program is the GCG Wisconsin Bestfit program described above. The default scoring matrix has a match value of 10 for each identical nucleotide and −9 for each mismatch. The default gap creation penalty is −50 and the default gap extension penalty is −3 for each nucleotide.
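As a sketch only, the default nucleotide values quoted above (match 10, mismatch −9, gap creation −50, gap extension −3) may be applied to an alignment that has already been produced, with gaps written as "-". The function is hypothetical and does not search for the optimal alignment; it also assumes that the extension penalty applies to every gapped position, including the first.

```python
def score_nucleotide_alignment(row_a, row_b, match=10, mismatch=-9,
                               gap_open=-50, gap_extend=-3):
    """Score two aligned rows of equal length; '-' marks a gap position."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0
    gap_side = None  # which row the current run of gap characters is in
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            if side != gap_side:
                score += gap_open  # charged once when a new gap starts
                gap_side = side
            score += gap_extend    # charged for every gapped position
        else:
            gap_side = None
            score += match if a == b else mismatch
    return score


if __name__ == "__main__":
    print(score_nucleotide_alignment("ACGTACGT", "ACGTACGT"))
    print(score_nucleotide_alignment("ACGTACGT", "AC--ACGT"))
```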


This document also encompasses nucleotide sequences that are capable of hybridising selectively to the sequences presented herein, or any variant, fragment or derivative thereof, or to the complement of any of the above. Nucleotide sequences are preferably at least 15 nucleotides in length, more preferably at least 20, 30, 40 or 50 nucleotides in length.


The term “hybridization” as used herein shall include “the process by which a strand of nucleic acid joins with a complementary strand through base pairing” as well as the process of amplification as carried out in polymerase chain reaction technologies.


Polynucleotides which are capable of selectively hybridising to the nucleotide sequences presented herein, or to their complement, will be generally at least 70%, preferably at least 80 or 90% and more preferably at least 95% or 98% homologous to the corresponding nucleotide sequences presented herein over a region of at least 20, preferably at least 25 or 30, for instance at least 40, 60 or 100 or more contiguous nucleotides.


The term “selectively hybridizable” means that the polynucleotide used as a probe is used under conditions where a target polynucleotide is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other polynucleotides present, for example, in the cDNA or genomic DNA library being screened. In this event, background implies a level of signal generated by interaction between the probe and a non-specific DNA member of the library which is less than one tenth, preferably less than one hundredth, as intense as the specific interaction observed with the target DNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with 32P.


Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex, as taught in Berger and Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press, San Diego Calif.), and confer a defined “stringency” as explained below.


Maximum stringency typically occurs at about Tm−5° C. (5° C. below the Tm of the probe); high stringency at about 5° C. to 10° C. below Tm; intermediate stringency at about 10° C. to 20° C. below Tm; and low stringency at about 20° C. to 25° C. below Tm. As will be understood by those of skill in the art, a maximum stringency hybridization can be used to identify or detect identical polynucleotide sequences while an intermediate (or low) stringency hybridization can be used to identify or detect similar or related polynucleotide sequences.
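The stringency bands above may be expressed, purely for illustration, as a function of the difference between the Tm of the probe and the hybridization temperature. The function name and the treatment of the boundaries (for example, counting anything from Tm to 5° C. below Tm as maximum stringency) are assumptions, since the text gives only approximate ranges.

```python
def classify_stringency(tm_celsius, hybridization_celsius):
    """Map a hybridization temperature to an approximate stringency band."""
    degrees_below_tm = tm_celsius - hybridization_celsius
    if degrees_below_tm < 0:
        return "above Tm"
    if degrees_below_tm <= 5:
        return "maximum"
    if degrees_below_tm <= 10:
        return "high"
    if degrees_below_tm <= 20:
        return "intermediate"
    if degrees_below_tm <= 25:
        return "low"
    return "below the low stringency range"


if __name__ == "__main__":
    # Hypothetical probe with a Tm of 70 degrees C.
    for temperature in (65, 62, 55, 47):
        print(temperature, classify_stringency(70, temperature))
```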


In a preferred aspect, we describe nucleotide sequences that can hybridise to the nucleotide sequence as described here under stringent conditions (e.g. 65° C. and 0.1×SSC, where 1×SSC = 0.15 M NaCl, 0.015 M Na3 citrate, pH 7.0).


Where the polynucleotide is double-stranded, both strands of the duplex, either individually or in combination, are encompassed by the methods and compositions described here. Where the polynucleotide is single-stranded, it is to be understood that the complementary sequence of that polynucleotide is also included.


Polynucleotides which are not 100% homologous to the sequences described here but are encompassed can be obtained in a number of ways. Other variants of the sequences described herein may be obtained for example by probing DNA libraries made from a range of individuals, for example individuals from different populations. In addition, other viral/bacterial, or cellular homologues, particularly cellular homologues found in mammalian cells (e.g. rat, mouse, bovine and primate cells), may be obtained and such homologues and fragments thereof in general will be capable of selectively hybridising to sequences which encode the polypeptides shown in the Examples. Such sequences may be obtained by probing cDNA libraries or genomic DNA libraries made from other animal species with probes comprising all or part of any one of the sequences under conditions of medium to high stringency. The nucleotide sequences of or which encode the human homologues described in the Examples may preferably be used to identify other primate/mammalian homologues, since nucleotide homology between human sequences and other mammalian sequences is likely to be higher than is the case for the Drosophila sequences identified herein.


Similar considerations apply to obtaining species homologues and allelic variants of the polypeptide or nucleotide sequences described here.


Variants and strain/species homologues may also be obtained using degenerate PCR which will use primers designed to target sequences within the variants and homologues encoding conserved amino acid sequences within the sequences described here. Conserved sequences can be predicted, for example, by aligning the amino acid sequences from several variants/homologues. Sequence alignments can be performed using computer software known in the art. For example the GCG Wisconsin PileUp program is widely used.


The primers used in degenerate PCR will contain one or more degenerate positions and will be used at stringency conditions lower than those used for cloning sequences with single sequence primers against known sequences. It will be appreciated by the skilled person that overall nucleotide homology between sequences from distantly related organisms is likely to be very low and thus in these situations degenerate PCR may be the method of choice rather than screening libraries with labeled fragments.
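To illustrate what is meant by degenerate positions, the sketch below computes the degeneracy of a primer written with the standard IUPAC nucleotide ambiguity codes and, for small cases, lists every sequence in the primer pool. The function names and the example primer are hypothetical.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand_primer(primer):
    """All concrete sequences represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    # Hypothetical primer for the peptide Met-Trp-Lys: ATG TGG AAR.
    print(primer_degeneracy("ATGTGGAAR"))
    print(expand_primer("ATGTGGAAR"))
```

Primers are usually designed against conserved stretches rich in residues with few codons, since the degeneracy multiplies across positions.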


In addition, homologous sequences may be identified by searching nucleotide and/or protein databases using search algorithms such as the BLAST suite of programs. This approach is described below and in the Examples.


Alternatively, such polynucleotides may be obtained by site directed mutagenesis of characterised sequences, such as the sequences encoding polypeptides disclosed in the Examples. This may be useful where for example silent codon changes are required to sequences to optimise codon preferences for a particular host cell in which the polynucleotide sequences are being expressed. Other sequence changes may be desired in order to introduce restriction enzyme recognition sites, or to alter the property or function of the polypeptides encoded by the polynucleotides. For example, further changes may be desirable to represent particular coding changes found in the sequences coding polypeptides disclosed in the Examples which give rise to mutant genes which have lost their regulatory function. Probes based on such changes can be used as diagnostic probes to detect such mutants.


The polynucleotides described here may be used to produce a primer, e.g. a PCR primer, a primer for an alternative amplification reaction, a probe e.g. labeled with a revealing label by conventional means using radioactive or non-radioactive labels, or the polynucleotides may be cloned into vectors. Such primers, probes and other fragments will be at least 8, 9, 10, or 15, preferably at least 20, for example at least 25, 30 or 40 nucleotides in length, and are also encompassed by the term “polynucleotides” as used herein.


Polynucleotides such as DNA polynucleotides and probes as described here may be produced recombinantly, synthetically, or by any means available to those of skill in the art. They may also be cloned by standard techniques.


In general, primers will be produced by synthetic means, involving a step wise manufacture of the desired nucleic acid sequence one nucleotide at a time. Techniques for accomplishing this using automated techniques are readily available in the art.


Longer polynucleotides will generally be produced using recombinant means, for example using PCR (polymerase chain reaction) cloning techniques. This will involve making a pair of primers (e.g. of about 15 to 30 nucleotides) flanking a region of the sequence which it is desired to clone, bringing the primers into contact with mRNA or cDNA obtained from an animal or human cell, performing a polymerase chain reaction under conditions which bring about amplification of the desired region, isolating the amplified fragment (e.g. by purifying the reaction mixture on an agarose gel) and recovering the amplified DNA. The primers may be designed to contain suitable restriction enzyme recognition sites so that the amplified DNA can be cloned into a suitable cloning vector.


The polynucleotides or primers may carry a revealing label. Suitable labels include radioisotopes such as 32P or 35S, enzyme labels, or other labels such as biotin. Such labels may be added to the polynucleotides or primers and may be detected using techniques known per se.


Polynucleotides or primers or fragments thereof labeled or unlabeled may be used by a person skilled in the art in nucleic acid-based tests for detecting or sequencing polynucleotides in the human or animal body.


Such tests for detecting generally comprise bringing a biological sample containing DNA or RNA into contact with a probe comprising a polynucleotide or primer as described here under hybridising conditions and detecting any duplex formed between the probe and nucleic acid in the sample. Such detection may be achieved using techniques such as PCR or by immobilising the probe on a solid support, removing nucleic acid in the sample which is not hybridised to the probe, and then detecting nucleic acid which has hybridised to the probe. Alternatively, the sample nucleic acid may be immobilised on a solid support, and the amount of probe bound to such a support can be detected. Suitable assay methods of this and other formats can be found in for example WO89/03891 and WO90/13667.


Tests for sequencing nucleotides include bringing a biological sample containing target DNA or RNA into contact with a probe comprising a polynucleotide or primer under hybridising conditions and determining the sequence by, for example the Sanger dideoxy chain termination method (see Sambrook et al.).


Such a method generally comprises elongating, in the presence of suitable reagents, the primer by synthesis of a strand complementary to the target DNA or RNA and selectively terminating the elongation reaction at one or more of an A, C, G or T/U residue; allowing strand elongation and termination reaction to occur; separating out according to size the elongated products to determine the sequence of the nucleotides at which selective termination has occurred. Suitable reagents include a DNA polymerase enzyme, the deoxynucleotides dATP, dCTP, dGTP and dTTP, a buffer and ATP. Dideoxynucleotides are used for selective termination.
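The logic of reading a sequence from size-separated termination products may be sketched as a toy simulation. It assumes an idealised reaction in which, for each dideoxynucleotide, a product terminates at every position where that base is incorporated; the function names are hypothetical and the template is written 3′ to 5′ starting immediately after the primer.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chain_termination_lanes(template_3_to_5):
    """For each ddNTP lane, list the lengths of the terminated extension products.

    Lengths count nucleotides added to the primer, so a product of length n
    ends at the n-th base of the newly synthesised strand.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5.upper())
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_sequence(lanes):
    """Recover the synthesised strand by ordering all products by size."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


if __name__ == "__main__":
    lanes = chain_termination_lanes("TACGGT")  # hypothetical template
    print(lanes)
    print(read_sequence(lanes))
```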


Tests for detecting or sequencing nucleotides in a biological sample may be used to determine particular sequences within cells in individuals who have, or are suspected to have, an altered gene sequence, for example within cancer cells including leukaemia cells and solid tumours such as breast, ovary, lung, colon, pancreas, testes, liver, brain, muscle and bone tumours. Cells from patients suffering from a proliferative disease may also be tested in the same way.


In addition, the identification of the genes described in the Examples will allow the role of these genes in hereditary diseases to be investigated. In general, this will involve establishing the status of the gene (e.g. using PCR sequence analysis), in cells derived from animals or humans with, for example, neurological disorders or neoplasms.


The probes as described here may conveniently be packaged in the form of a test kit in a suitable container. In such kits the probe may be bound to a solid support where the assay format for which the kit is designed requires such binding. The kit may also contain suitable reagents for treating the sample to be probed, hybridising the probe to nucleic acid in the sample, control reagents, instructions, and the like.


Homology Searching


Sequence homology (or identity) may be determined using any suitable homology algorithm, using for example default parameters.


Advantageously, the BLAST algorithm is employed, with parameters set to default values. The BLAST algorithm is described in detail at http://www.ncbi.nih.gov/BLAST/blast_help.html, which is incorporated herein by reference. The search parameters are defined as follows, and are advantageously set to the defined default parameters.


Advantageously, “substantial homology” when assessed by BLAST equates to sequences which match with an EXPECT value of at least about 7, preferably at least about 9 and most preferably 10 or more. The default threshold for EXPECT in BLAST searching is usually 10.


BLAST (Basic Local Alignment Search Tool) is the heuristic search algorithm employed by the programs blastp, blastn, blastx, tblastn, and tblastx; these programs ascribe significance to their findings using the statistical methods of Karlin and Altschul (see http://www.ncbi.nih.gov/BLAST/blast_help.html) with a few enhancements. The BLAST programs were tailored for sequence similarity searching, for example to identify homologues to a query sequence. The programs are not generally useful for motif-style searching. For a discussion of basic issues in similarity searching of sequence databases, see Altschul et al. (1994).


The five BLAST programs available at http://www.ncbi.nlm.nih.gov perform the following tasks:


blastp compares an amino acid query sequence against a protein sequence database;


blastn compares a nucleotide query sequence against a nucleotide sequence database;


blastx compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database;


tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands);


tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
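The six-frame conceptual translation used by the translating programs listed above may be illustrated as follows. This is a minimal sketch using the standard genetic code, not the implementation used by BLAST; the function names are hypothetical, stop codons are shown as "*" and codons containing unrecognised characters as "X".

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; trailing bases are ignored."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Frames +1, +2, +3 of the given strand, then frames 1 to 3 of its reverse complement."""
    frames = []
    for strand in (seq.upper(), reverse_complement(seq)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


if __name__ == "__main__":
    print(six_frame_translations("ATGGCC"))  # hypothetical query
```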


BLAST uses the following search parameters:


HISTOGRAM Display a histogram of scores for each search; default is yes. (See parameter H in the BLAST Manual).


DESCRIPTIONS Restricts the number of short descriptions of matching sequences reported to the number specified; default limit is 100 descriptions. (See parameter V in the manual page). See also EXPECT and CUTOFF.


ALIGNMENTS Restricts database sequences to the number specified for which high-scoring segment pairs (HSPs) are reported; the default limit is 50. If more database sequences than this happen to satisfy the statistical significance threshold for reporting (see EXPECT and CUTOFF below), only the matches ascribed the greatest statistical significance are reported. (See parameter B in the BLAST Manual).


EXPECT The statistical significance threshold for reporting matches against database sequences; the default value is 10, such that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the statistical significance ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable. (See parameter E in the BLAST Manual).


CUTOFF Cutoff score for reporting high-scoring segment pairs. The default value is calculated from the EXPECT value (see above). HSPs are reported for a database sequence only if the statistical significance ascribed to them is at least as high as would be ascribed to a lone HSP having a score equal to the CUTOFF value. Higher CUTOFF values are more stringent, leading to fewer chance matches being reported. (See parameter S in the BLAST Manual). Typically, significance thresholds can be more intuitively managed using EXPECT.


MATRIX Specify an alternate scoring matrix for BLASTP, BLASTX, TBLASTN and TBLASTX. The default matrix is BLOSUM62 (Henikoff & Henikoff, 1992). The valid alternative choices include: PAM40, PAM120, PAM250 and IDENTITY. No alternate scoring matrices are available for BLASTN; specifying the MATRIX directive in BLASTN requests returns an error response.


STRAND Restrict a TBLASTN search to just the top or bottom strand of the database sequences; or restrict a BLASTN, BLASTX or TBLASTX search to just reading frames on the top or bottom strand of the query sequence.


FILTER Mask off segments of the query sequence that have low compositional complexity, as determined by the SEG program of Wootton & Federhen (1993) Computers and Chemistry 17:149-163, or segments consisting of short-periodicity internal repeats, as determined by the XNU program of Claverie & States (1993) Computers and Chemistry 17:191-201, or, for BLASTN, by the DUST program of Tatusov and Lipman (see http://www.ncbi.nlm.nih.gov). Filtering can eliminate statistically significant but biologically uninteresting reports from the blast output (e.g., hits against common acidic-, basic- or proline-rich regions), leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences.


Low complexity sequence found by a filter program is substituted using the letter “N” in nucleotide sequences (e.g., “NNNNNNNNNNNNN”) and the letter “X” in protein sequences (e.g., “XXXXXXXXX”).
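The substitution step may be sketched as follows. This is a minimal illustration only: it does not implement SEG, XNU or DUST, and simply assumes that the low complexity regions have already been located and are supplied as 0-based, end-exclusive intervals. The function name and its arguments are hypothetical.

```python
def mask_regions(sequence, regions, molecule="protein"):
    """Replace the given regions with 'X' (protein) or 'N' (nucleotide).

    `regions` is a list of (start, end) pairs, 0-based and end-exclusive,
    assumed to have been produced by a separate filter program.
    """
    mask_char = "X" if molecule == "protein" else "N"
    chars = list(sequence)
    for start, end in regions:
        # Clamp each interval to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)
```

For example, masking a proline-rich stretch of a short protein query leaves the flanking residues available for matching.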


Filtering is only applied to the query sequence (or its translation products), not to database sequences. Default filtering is DUST for BLASTN, SEG for other programs.


It is not unusual for nothing at all to be masked by SEG, XNU, or both, when applied to sequences in SWISS-PROT, so filtering should not be expected to always yield an effect. Furthermore, in some cases, sequences are masked in their entirety, indicating that the statistical significance of any matches reported against the unfiltered query sequence should be suspect.


NCBI-gi Causes NCBI gi identifiers to be shown in the output, in addition to the accession and/or locus name.


Most preferably, sequence comparisons are conducted using the simple BLAST search algorithm provided at http://www.ncbi.nlm.nih.gov/BLAST.


Nucleic Acid Vectors


Polynucleotides as described in this document can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, we provide a method of making polynucleotides by introducing a polynucleotide as described here into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells include bacteria such as E. coli, yeast, mammalian cell lines and other eukaryotic cell lines, for example insect Sf9 cells.


Preferably, a polynucleotide in a vector is operably linked to a control sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” means that the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.


The control sequences may be modified, for example by the addition of further transcriptional regulatory elements to make the level of transcription directed by the control sequences more responsive to transcriptional modulators.


Vectors as described here may be transformed or transfected into a suitable host cell as described below to provide for expression of a protein. This process may comprise culturing a host cell transformed with an expression vector as described above under conditions to provide for expression by the vector of a coding sequence encoding the protein, and optionally recovering the expressed protein. Vectors will be chosen that are compatible with the host cell used.


The vectors may be, for example, plasmid or virus vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used, for example, to transfect or transform a host cell.


Control sequences operably linked to sequences encoding a polypeptide described here include promoters/enhancers and other expression regulation signals. These control sequences may be selected to be compatible with the host cell in which the expression vector is designed to be used. The term promoter is well-known in the art and encompasses nucleic acid regions ranging in size and complexity from minimal promoters to promoters including upstream elements and enhancers.


The promoter is typically selected from promoters which are functional in mammalian cells, although prokaryotic promoters and promoters functional in other eukaryotic cells, such as insect cells, may be used. The promoter is typically derived from promoter sequences of viral or eukaryotic genes. For example, it may be a promoter derived from the genome of a cell in which expression is to occur. With respect to eukaryotic promoters, they may be promoters that function in a ubiquitous manner (such as promoters of α-actin, β-actin, tubulin) or, alternatively, a tissue-specific manner (such as promoters of the genes for pyruvate kinase). They may also be promoters that respond to specific stimuli, for example promoters that bind steroid hormone receptors. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR) promoter, the Rous sarcoma virus (RSV) LTR promoter or the human cytomegalovirus (CMV) IE promoter.


It may also be advantageous for the promoters to be inducible so that the levels of expression of the heterologous gene can be regulated during the life-time of the cell. Inducible means that the levels of expression obtained using the promoter can be regulated.


In addition, any of these promoters may be modified by the addition of further regulatory sequences, for example enhancer sequences. Chimeric promoters may also be used comprising sequence elements from two or more different promoters described above.


The polynucleotides may also be inserted into the vectors described above in an antisense orientation to provide for the production of antisense RNA. Antisense RNA or other antisense polynucleotides may also be produced by synthetic means. Such antisense polynucleotides may be used in a method of controlling the levels of RNAs transcribed from genes comprising any one of the polynucleotides as described.


Host Cells


The vectors and polynucleotides may be introduced into host cells for the purpose of replicating the vectors/polynucleotides and/or expressing the polypeptides encoded by the polynucleotides described here. Although such polypeptides may be produced using prokaryotic cells as host cells, it is preferred to use eukaryotic cells, for example yeast, insect or mammalian cells, in particular mammalian cells.


Vectors/polynucleotides as described here may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation and electroporation. Where vectors/polynucleotides are to be administered to animals, several techniques are known in the art, for example infection with recombinant viral vectors such as retroviruses, herpes simplex viruses and adenoviruses, direct injection of nucleic acids and biolistic transformation.


Protein Expression and Purification


Host cells comprising polynucleotides as described here may be used to express polypeptides. Host cells may be cultured under suitable conditions which allow expression of the proteins. Expression of the polypeptides as described may be constitutive such that they are continually produced, or inducible, requiring a stimulus to initiate expression. In the case of inducible expression, protein production can be initiated when required by, for example, addition of an inducer substance to the culture medium, for example dexamethasone or IPTG.


Polypeptides can be extracted from host cells by a variety of techniques known in the art, including enzymatic, chemical and/or osmotic lysis and physical disruption.


The polypeptides may also be produced recombinantly in an in vitro cell-free system, such as the TnT™ (Promega) rabbit reticulocyte system.


Antibodies


We also provide monoclonal or polyclonal antibodies to polypeptides as described here, or fragments thereof. Thus, we further provide a process for the production of monoclonal or polyclonal antibodies to polypeptides.


If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunised with an immunogenic polypeptide bearing an epitope(s) from a polypeptide as described here. Serum from the immunised animal is collected and treated according to known procedures. If serum containing polyclonal antibodies to an epitope from a polypeptide contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art. In order that such antibodies may be made, we also provide polypeptides as described here, or fragments thereof, haptenised to another polypeptide for use as immunogens in animals or humans.


Monoclonal antibodies directed against epitopes in the polypeptides described here can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced against epitopes in the polypeptides can be screened for various properties; i.e., for isotype and epitope affinity.


An alternative technique involves screening phage display libraries where, for example, the phage express scFv fragments on the surface of their coat with a large variety of complementarity determining regions (CDRs). This technique is well known in the art.


Antibodies, both monoclonal and polyclonal, which are directed against epitopes from polypeptides described here are particularly useful in diagnosis, and those which are neutralising are useful in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise anti-idiotype antibodies. Anti-idiotype antibodies are immunoglobulins which carry an “internal image” of the antigen of the agent against which protection is desired.


Techniques for raising anti-idiotype antibodies are known in the art. These anti-idiotype antibodies may also be useful in therapy.


For the purposes of this document, the term “antibody”, unless specified to the contrary, includes fragments of whole antibodies which retain their binding activity for a target antigen. Such fragments include Fv, F(ab′) and F(ab′)2 fragments, as well as single chain antibodies (scFv). Furthermore, the antibodies and fragments thereof may be humanised antibodies, for example as described in EP-A-239400.


Antibodies may be used in a method of detecting polypeptides as described in this document present in biological samples, which method comprises: (a) providing an antibody as described here; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.


Suitable samples include extracts from tissues such as brain, breast, ovary, lung, colon, pancreas, testes, liver, muscle and bone tissues or from neoplastic growths derived from such tissues.


Such antibodies may be bound to a solid support and/or packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like.


Assays


We also provide assays that are suitable for identifying substances which bind to polypeptides as described here and which affect, for example, formation of the nuclear envelope, exit from the quiescent phase of the cell cycle (G0), G1 progression, chromosome decondensation, nuclear envelope breakdown, START, initiation of DNA replication, progression of DNA replication, termination of DNA replication, centrosome duplication, G2 progression, activation of mitotic or meiotic functions, chromosome condensation, centrosome separation, microtubule nucleation, spindle formation and function, interactions with microtubule motor proteins, chromatid separation and segregation, inactivation of mitotic functions, formation of contractile ring, cytokinesis functions, chromatin binding, formation of replication complexes, replication licensing, phosphorylation or other secondary modification activity, proteolytic degradation, microtubule binding, actin binding, septin binding, microtubule organising centre nucleation activity and binding to components of cell cycle signalling pathways.


In addition, we provide assays suitable for identifying substances that interfere with binding of polypeptides as described here, where appropriate, to components of cell division cycle machinery. This includes not only components such as microtubules but also signalling components and regulatory components as indicated above. Such assays are typically in vitro. Assays are also provided that test the effects of candidate substances identified in preliminary in vitro assays on intact cells in whole cell assays. The assays described below, or any suitable assay as known in the art, may be used to identify these substances.


In particular, we provide for the use of a polynucleotide as set out in Table 5, or a polypeptide encoded by the polynucleotide, in a method of identifying a substance capable of binding to the polypeptide, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.


We further provide for use of a polynucleotide as set out in Table 5, or a polypeptide encoded by the polynucleotide, in a method of identifying a substance capable of modulating the function of the polypeptide, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.


The substance identified may be isolated or synthesised, and used for prevention, treatment or diagnosis of a disease in an individual. The substance may be administered to an individual in need of such treatment. Alternatively or in addition, the substance identified by the assay is administered to an individual in need of such treatment. Preferably, the polynucleotide comprises a human polypeptide as set out in column 3 of Table 5.


Therefore, we provide one or more substances identified by any of the assays described below, viz, mitosis assays, meiotic assays, polypeptide binding assays, microtubule binding/polymerisation assays, microtubule purification and binding assays, microtubule organising centre (MTOC) nucleation activity assays, motor protein assays, assays for spindle assembly and function, assays for DNA replication, chromosome condensation assays, kinase assays, kinase inhibitor assays, and whole cell assays, each as described in further detail below.


Candidate Substances


A substance that inhibits cell cycle progression as a result of an interaction with a polypeptide as described here may do so in several ways. For example, if the substance inhibits cell division, mitosis and/or meiosis, it may directly disrupt the binding of a polypeptide as described here to a component of the spindle apparatus by, for example, binding to the polypeptide and masking or altering the site of interaction with the other component. A substance which inhibits DNA replication may do so by inhibiting the phosphorylation or de-phosphorylation of proteins involved in replication. For example, it is known that the kinase inhibitor 6-DMAP (6-dimethylaminopurine) prevents the initiation of replication (Blow, J J, 1993, J Cell Biol 122, 993-1002). Candidate substances of this type may conveniently be preliminarily screened by in vitro binding assays as, for example, described below and then tested, for example in a whole cell assay as described below. Examples of candidate substances include antibodies which recognise a polypeptide as described in this document.


A substance which can bind directly to such a polypeptide may also inhibit its function in cell cycle progression by altering its subcellular localisation and hence its ability to interact with its normal substrate. The substance may alter the subcellular localisation of the polypeptide by directly binding to it, or by indirectly disrupting the interaction of the polypeptide with another component. For example, it is known that interaction between the p68 and p180 subunits of DNA polymerase alpha-primase enzyme is necessary in order for p180 to translocate into the nucleus (Mizuno et al (1998) Mol Cell Biol 18, 3552-62), and accordingly, a substance which disrupts the interaction between p68 and p180 will affect nuclear translocation and hence activity of the primase. A substance which affects mitosis may do so by preventing the polypeptide and components of the mitotic apparatus from coming into contact within the cell.


These substances may be tested using, for example, the whole cell assays described below. Non-functional homologues of a polypeptide as described here may also be tested for inhibition of cell cycle progression, since they may compete with the wild type protein for binding to components of the cell division cycle machinery whilst being incapable of the normal functions of the protein, or may block the function of the protein bound to the cell division cycle machinery. Such non-functional homologues may include naturally occurring mutants and modified sequences or fragments thereof.


Alternatively, instead of preventing the association of the components directly, the substance may suppress the biologically available amount of a polypeptide as described here. This may be by inhibiting expression of the component, for example at the level of transcription, transcript stability, translation or post-translational stability. An example of such a substance would be antisense RNA or double-stranded interfering RNA sequences which suppress the amount of mRNA biosynthesis.


Suitable candidate substances include peptides, especially of from about 5 to 30 or 10 to 25 amino acids in size, based on the sequence of the polypeptides described in the Examples, or variants of such peptides in which one or more residues have been substituted. Peptides from panels of peptides comprising random sequences or sequences which have been varied consistently to provide a maximally diverse panel of peptides may be used.
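As a minimal sketch of how such a peptide panel might be enumerated, the following derives overlapping peptides of a chosen length from a polypeptide sequence and lists single-residue substitution variants at one position. The function names, the default length and step, and the use of the 20 standard amino acids are assumptions made for illustration; no particular panel design is prescribed here.

```python
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tile_peptides(protein_seq, length=15, step=5):
    """Return overlapping peptides of `length` residues, advancing by `step`."""
    if not 5 <= length <= 30:
        raise ValueError("length should be from about 5 to 30 residues")
    if step < 1:
        raise ValueError("step must be at least 1")
    last_start = len(protein_seq) - length
    return [protein_seq[i:i + length] for i in range(0, last_start + 1, step)]


def single_substitution_variants(peptide, position, alphabet=STANDARD_AMINO_ACIDS):
    """Return variants of `peptide` with the residue at `position` substituted."""
    original = peptide[position]
    return [
        peptide[:position] + residue + peptide[position + 1:]
        for residue in alphabet
        if residue != original
    ]
```

A sequence shorter than the requested peptide length yields an empty panel rather than a truncated peptide.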


Suitable candidate substances also include antibody products (for example, monoclonal and polyclonal antibodies, single chain antibodies, chimeric antibodies and CDR-grafted antibodies) which are specific for a polypeptide as described here. Furthermore, combinatorial libraries, peptide and peptide mimetics, defined chemical entities, oligonucleotides, and natural product libraries may be screened for activity as inhibitors of binding of a polypeptide as described here to the cell division cycle machinery, for example mitotic/meiotic apparatus (such as microtubules). The candidate substances may be used in an initial screen in batches of, for example 10 substances per reaction, and the substances of those batches which show inhibition tested individually. Candidate substances which show activity in in vitro screens such as those described below can then be tested in whole cell systems, such as mammalian cells which will be exposed to the inhibitor and tested for inhibition of any of the stages of the cell cycle.
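The batch-wise screening strategy described above may be sketched as follows. The two callables `batch_inhibits` and `single_inhibits` are hypothetical stand-ins for the readout of an in vitro assay performed on a pooled batch and on an individual substance, respectively; they are assumptions for illustration and do not correspond to any particular assay.

```python
def screen_in_batches(candidates, batch_inhibits, single_inhibits, batch_size=10):
    """Screen candidates in pooled batches, then test members of positive batches.

    `batch_inhibits(batch)` and `single_inhibits(candidate)` are hypothetical
    callables returning True when inhibition is observed.
    """
    hits = []
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        # Only batches showing inhibition are resolved into individual tests.
        if batch_inhibits(batch):
            hits.extend(c for c in batch if single_inhibits(c))
    return hits
```

Where active substances are rare, this approach requires fewer individual reactions than testing every candidate separately, since batches showing no inhibition are not resolved further.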


Polypeptide Binding Assays


One type of assay for identifying substances that bind to a polypeptide as described here involves contacting a polypeptide as described here, which is immobilised on a solid support, with a non-immobilised candidate substance and determining whether and/or to what extent the polypeptide as described here and candidate substance bind to each other. Alternatively, the candidate substance may be immobilised and the polypeptide non-immobilised.


In a preferred assay method, the polypeptide is immobilised on beads such as agarose beads. Typically this is achieved by expressing the component as a GST-fusion protein in bacteria, yeast or higher eukaryotic cell lines and purifying the GST-fusion protein from crude cell extracts using glutathione-agarose beads (Smith and Johnson, 1988). As a control, binding of the candidate substance, which is not a GST-fusion protein, to the glutathione-agarose beads is determined in the absence of the polypeptide as described here. The binding of the candidate substance to the immobilised polypeptide is then determined. This type of assay is known in the art as a GST pulldown assay. Again, the candidate substance may be immobilised and the polypeptide non-immobilised.


It is also possible to perform this type of assay using different affinity purification systems for immobilising one of the components, for example Ni-NTA agarose and histidine-tagged components.


Binding of the polypeptide as described here to the candidate substance may be determined by a variety of methods well-known in the art. For example, the non-immobilised component may be labeled (with, for example, a radioactive label, an epitope tag or an enzyme-antibody conjugate). Alternatively, binding may be determined by immunological detection techniques. For example, the reaction mixture can be Western blotted and the blot probed with an antibody that detects the non-immobilised component. ELISA techniques may also be used.


Candidate substances are typically added to a final concentration of from 1 to 1000 nmol/ml, more preferably from 1 to 100 nmol/ml. In the case of antibodies, the final concentration used is typically from 100 to 500 μg/ml, more preferably from 200 to 300 μg/ml.
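For convenience, the concentration ranges above may be converted to molar units. Since 1 nmol/ml equals 1 μmol/l, a range of 1 to 1000 nmol/ml corresponds to 1 to 1000 μM. Converting a mass concentration of antibody requires a molecular weight; the value of 150,000 g/mol used in the example below is an assumption typical of a whole IgG molecule and is not stated in this document. The function names are likewise illustrative.

```python
def nmol_per_ml_to_micromolar(conc_nmol_per_ml):
    """1 nmol/ml = 1 umol/l, so the numerical value is unchanged."""
    return float(conc_nmol_per_ml)


def ug_per_ml_to_micromolar(conc_ug_per_ml, molecular_weight_g_per_mol):
    """Convert a mass concentration in ug/ml to a molar concentration in uM.

    ug/ml is numerically equal to mg/l; dividing by g/mol gives mmol/l,
    and multiplying by 1000 gives umol/l.
    """
    return conc_ug_per_ml * 1000.0 / molecular_weight_g_per_mol
```

Under the assumed molecular weight, 150 μg/ml of antibody corresponds to 1 μM.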


Microtubule Binding/Polymerisation Assays


In the case of polypeptides as described here that bind to microtubules, another type of in vitro assay involves determining whether a candidate substance modulates binding of such a polypeptide to microtubules. Such an assay typically comprises contacting a polypeptide as described here with microtubules in the presence or absence of the candidate substance and determining if the candidate substance has an effect on the binding of the polypeptide as described here to the microtubules. This assay can also be used in the absence of candidate substances to confirm that a polypeptide as described here does indeed bind to microtubules. Microtubules may be prepared and assays conducted as follows:


Microtubule Purification and Binding Assays


Microtubules are purified from 0-3 h-old Drosophila embryos essentially as described previously (Saunders, et al., 1997). About 3 ml of embryos are homogenized with a Dounce homogenizer in 2 volumes of ice-cold lysis buffer (0.1 M Pipes/NaOH, pH 6.6, 5 mM EGTA, 1 mM MgSO4, 0.9 M glycerol, 1 mM DTT, 1 mM PMSF, 1 μg/ml aprotinin, 1 μg/ml leupeptin and 1 μg/ml pepstatin). The microtubules are depolymerized by incubation on ice for 15 min, and the extract is then centrifuged at 16,000 g for 30 min at 4° C. The supernatant is recentrifuged at 135,000 g for 90 min at 4° C. Microtubules in this latter supernatant are polymerized by addition of GTP to 1 mM and taxol to 20 μM and incubation at room temperature for 30 min. A 3 ml aliquot of the extract is layered on top of a 3 ml 15% sucrose cushion prepared in lysis buffer. After centrifuging at 54,000 g for 30 min at 20° C. using a swing out rotor, the microtubule pellet is resuspended in lysis buffer.


Microtubule overlay assays are performed as previously described (Saunders et al., 1997). 500 ng per lane of recombinant Asp, recombinant polypeptide, and bovine serum albumin (BSA, Sigma) are fractionated by 10% SDS-PAGE and blotted onto PVDF membranes (Millipore). The membranes are preincubated in TBST (50 mM Tris pH 7.5, 150 mM NaCl, 0.05% Tween 20) containing 5% low fat powdered milk (LFPM) for 1 h and then washed 3 times for 15 min in lysis buffer. The filters are then incubated for 30 minutes in lysis buffer containing either 1 mM GDP, 1 mM GTP, or 1 mM GTP-γ-S. MAP-free bovine brain tubulin (Molecular Probes) is polymerised at a concentration of 2 μg/ml in lysis buffer by addition of GTP to a final concentration of 1 mM and incubated at 37° C. for 30 min. The nucleotide solutions are removed and the buffer containing polymerised microtubules added to the membranes for incubation for 1 h at 37° C. with addition of taxol at a final concentration of 10 μM for the final 30 min. The blots are then washed 3 times with TBST and the bound tubulin detected using standard Western blot procedures using anti-β-tubulin antibodies (Boehringer Mannheim) at 2.5 μg/ml and the Super Signal detection system (Pierce).


It may be desirable in one embodiment of this type of assay to deplete the polypeptide as described here from cell extracts used to produce polymerised microtubules. This may, for example, be achieved by the use of suitable antibodies.


A simple extension to this type of assay would be to test the effects of purified polypeptide as described here upon the ability of tubulin to polymerise in vitro (for example, as used by Andersen and Karsenti, 1997) in the presence or absence of a candidate substance (typically added at the concentrations described above). Xenopus cell-free extracts may conveniently be used, for example as a source of tubulin.


Microtubule Organising Centre (MTOC) Nucleation Activity Assays


Candidate substances, for example those identified using the binding assays described above, may be screened using a microtubule organising centre nucleation activity assay to determine if they are capable of disrupting MTOCs as measured by, for example, aster formation. This assay in its simplest form comprises adding the candidate substance to a cellular extract which in the absence of the candidate substance has microtubule organising centre nucleation activity resulting in formation of asters.


In a preferred embodiment, the assay system comprises (i) a polypeptide as described here and (ii) components required for microtubule organising centre nucleation activity except for functional polypeptide as described here, which is typically removed by immunodepletion (or by the use of extracts from mutant cells). The components themselves are typically in two parts such that microtubule nucleation does not occur until the two parts are mixed. The polypeptide as described here may be present in one of the two parts initially or added subsequently prior to mixing of the two parts.


Subsequently, the polypeptide as described here and candidate substance are added to the component mix and microtubule nucleation from centrosomes measured, for example by immunostaining for the polypeptide and visualising aster formation by immuno-fluorescence microscopy. The polypeptide may be preincubated with the candidate substance before addition to the component mix. Alternatively, both the polypeptide as described here and the candidate substance may be added directly to the component mix, simultaneously or sequentially in either order.


The components required for microtubule organising centre formation typically include salt-stripped centrosomes prepared as described in Moritz et al., 1998. Stripping centrosome preparations with 2 M KI removes the centrosome proteins CP60, CP190, CNN and γ-tubulin. Of these, neither CP60 nor CP190 appears to be required for microtubule nucleation. The other minimal components are typically provided as a depleted cellular extract, or conveniently, as a cellular extract from cells with a non-functional variant of a polypeptide as described here. Typically, labeled tubulin (usually β-tubulin) is also added to assist in visualising aster formation.


Alternatively, partially purified centrosomes that have not been salt-stripped may be used as part of the components. In this case, only tubulin, preferably labeled tubulin is required to complete the component mix.


Candidate substances are typically added to a final concentration of from 1 to 1000 nmol/ml, more preferably from 1 to 100 nmol/ml. In the case of antibodies, the final concentration used is typically from 100 to 500 μg/ml, more preferably from 200 to 300 μg/ml.


The degree of inhibition of aster formation by the candidate substance may be determined by measuring the number of normal asters per unit area for a control untreated cell preparation, measuring the number of normal asters per unit area for cells treated with the candidate substance, and comparing the results. Typically, a candidate substance is considered to be capable of disrupting MTOC integrity if the treated cell preparations have less than 50%, preferably less than 40, 30, 20 or 10% of the number of asters found in untreated cell preparations. It may also be desirable to stain cells for γ-tubulin to determine the maximum number of possible MTOCs present to allow normalisation between samples.
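The comparison described above may be expressed as a simple calculation. The sketch below computes the aster density of a treated preparation as a percentage of the control density and applies the 50% criterion; the function names and argument layout are illustrative assumptions, and normalisation against γ-tubulin staining is not included.

```python
def percent_of_control(treated_asters, treated_area, control_asters, control_area):
    """Aster density of the treated preparation as a percentage of the control."""
    control_density = control_asters / control_area
    if control_density == 0:
        raise ValueError("control preparation contains no asters")
    treated_density = treated_asters / treated_area
    return 100.0 * treated_density / control_density


def disrupts_mtoc(percent, threshold=50.0):
    """True if the treated preparation has less than `threshold` % of control asters."""
    return percent < threshold
```

A more stringent criterion, for example 10%, can be applied by passing a different `threshold`.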


Motor Protein Assay


The polypeptides may interact with motor proteins such as the Eg5-like motor protein in vitro. The effects of candidate substances on such a process may be determined using assays wherein the motor protein is immobilised on coverslips. Rhodamine labeled microtubules are then added and their translocation can be followed by fluorescence microscopy. The effect of candidate substances may thus be determined by comparing the extent and/or rate of translocation in the presence and absence of the candidate substance. Generally, candidate substances known to bind to a polypeptide as described here would be tested in this assay. Alternatively, a high throughput assay may be used to identify modulators of motor proteins and the resulting identified substances tested for effects on a polypeptide as described above.
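The comparison of translocation rates may be sketched as follows, assuming that each microtubule has been tracked as a list of (x, y) positions in micrometres recorded at a fixed frame interval. The tracking itself, the data layout and the function names are assumptions made for illustration only.

```python
import math


def mean_speed(track, frame_interval_s):
    """Mean speed (micrometres per second) along one tracked microtubule path."""
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    path_length = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    return path_length / (frame_interval_s * (len(track) - 1))


def relative_rate(treated_tracks, control_tracks, frame_interval_s):
    """Ratio of the average speed with the candidate substance to that without."""
    treated = [mean_speed(t, frame_interval_s) for t in treated_tracks]
    control = [mean_speed(t, frame_interval_s) for t in control_tracks]
    return (sum(treated) / len(treated)) / (sum(control) / len(control))
```

A ratio below 1 would indicate slower translocation in the presence of the candidate substance; whether a given difference is meaningful depends on the number of microtubules tracked and the variability between them.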


Typically this assay uses microtubules stabilised by taxol (e.g. Howard and Hyman 1993; Chandra and Endow, 1993—both chapters in “Motility Assays for Motor Proteins” Ed Jon Scholey, pub Academic Press). If however, a polypeptide as described here were to promote stable polymerisation of microtubules (see above) then these microtubules could be used directly in motility assays.


Simple protein-protein binding assays as described above, using a motor protein and a polypeptide as described here may also be used to confirm that the polypeptide binds to the motor protein, typically prior to testing the effect of candidate substances on that interaction.


Assay for Spindle Assembly and Function


A further assay to investigate the function of polypeptide as described here and the effect of candidate substances on those functions is an assay which measures spindle assembly and function. Typically, such assays are performed using Xenopus cell free systems, where two types of spindle assembly are possible. In the “half spindle” assembly pathway, a cytoplasmic extract of CSF arrested oocytes is mixed with sperm chromatin. The half spindles that form subsequently fuse together. A more physiological method is to induce CSF arrested extracts to enter interphase by addition of calcium, whereupon the DNA replicates and kinetochores form. Addition of fresh CSF arrested extract then induces mitosis with centrosome duplication and spindle formation (for discussion of these systems see Tournebize and Heald, 1996).


Again, generally, candidate substances known to bind to a polypeptide as described here, or non-functional polypeptide variants, would be tested in this assay. Alternatively, a high throughput assay may be used to identify modulators of spindle formation and function and the resulting identified substances tested for effects on binding of the polypeptide as described above.


Assays for DNA Replication


Another assay to investigate the function of polypeptide as described here and the effect of candidate substances on those functions is an assay for replication of DNA. A number of cell free systems have been developed to assay DNA replication. These can be used to assay the ability of a substance to prevent or inhibit DNA replication, by conducting the assay in the presence of the substance. Suitable cell-free assay systems include, for example, the SV-40 assay (Li and Kelly, 1984, Proc. Natl. Acad. Sci USA 81, 6973-6977; Waga and Stillman, 1994, Nature 369, 207-212). A Drosophila cell free replication system, for example as described by Crevel and Cotterill (1991), EMBO J 10, 4361-4369, may also be used. A preferred assay is a cell free assay derived from Xenopus egg low speed supernatant extracts described in Blow and Laskey (1986, Cell 47, 577-587) and Sheehan et al. (1988, J. Cell Biol. 106, 1-12), which measures the incorporation of nucleotides into a substrate consisting of Xenopus sperm DNA or HeLa nuclei. The nucleotides may be radiolabelled and incorporation assayed by scintillation counting. Alternatively and preferably, bromo-deoxy-uridine (BrdU) is used as a nucleotide substitute and replication activity measured by density substitution. The latter assay is able to distinguish genuine replication initiation events from incorporation as a result of DNA repair. The human cell-free replication assay reported by Krude, et al (1997), Cell 88, 109-19 may also be used to assay the effects of substances on the polypeptides.


Other In Vitro Assays


Other assays for identifying substances that bind to a polypeptide as described here are also provided. For example, substances which affect chromosome condensation may be assayed using the in vitro cell free system derived from Xenopus eggs, as known in the art.


Substances which affect kinase activity or proteolysis activity are of interest. It is known, for example, that temporal control of ubiquitin-proteasome mediated protein degradation is critical for normal G1 and S phase progression (reviewed in Krek 1998, Curr Opin Genet Dev 8, 36-42). A number of E3 ubiquitin protein ligases, designated SCFs (Skp1-cullin-F-box protein ligase complexes), confer substrate specificity on ubiquitination reactions, while protein kinases phosphorylate substrates destined for destruction and convert them into preferred targets for ubiquitin modification catalyzed by SCFs. Furthermore, ubiquitin-mediated proteolysis due to the anaphase-promoting complex/cyclosome (APC/C) is essential for separation of sister chromatids during mitosis, and exit from mitosis (Listovsky et al., 2000, Exp Cell Res 255, 184-191).


Substances which inhibit or affect kinase activity may be identified by means of a kinase assay as known in the art, for example, by measuring incorporation of 32P into a suitable peptide or other substrate in the presence of the candidate substance. Similarly, substances which inhibit or affect proteolytic activity may be assayed by detecting increased or decreased cleavage of suitable polypeptide substrates.


Assays for these and other protein or polypeptide activities are known to those skilled in the art, and may suitably be used to identify substances which bind to a polypeptide and affect its activity.


Whole Cell Assays


Candidate substances may also be tested on whole cells for their effect on cell cycle progression, including mitosis and/or meiosis. Preferably the candidate substances have been identified by the above-described in vitro methods. Alternatively, rapid throughput screens for substances capable of inhibiting cell division, typically mitosis, may be used as a preliminary screen, and the substances so identified then tested in the in vitro assays described above to confirm that the effect is on a particular polypeptide.


The candidate substance, i.e. the test compound, may be administered to the cell in several ways. For example, it may be added directly to the cell culture medium or injected into the cell. Alternatively, in the case of polypeptide candidate substances, the cell may be transfected with a nucleic acid construct which directs expression of the polypeptide in the cell. Preferably, the expression of the polypeptide is under the control of a regulatable promoter.


Typically, an assay to determine the effect of a candidate substance identified by the method as described here on a particular stage of the cell division cycle comprises administering the candidate substance to a cell and determining whether the substance inhibits that stage of the cell division cycle. Techniques for measuring progress through the cell cycle in a cell population are well known in the art. The extent of progress through the cell cycle in treated cells is compared with the extent of progress through the cell cycle in an untreated control cell population to determine the degree of inhibition, if any. For example, an inhibitor of mitosis or meiosis may be assayed by measuring the proportion of cells in a population which are unable to undergo mitosis/meiosis and comparing this to the proportion of cells in an untreated population.


The concentration of candidate substances used will typically be such that the final concentration in the cells is similar to that described above for the in vitro assays.


A candidate substance is typically considered to be an inhibitor of a particular stage in the cell division cycle (for example, mitosis) if the proportion of cells undergoing that particular stage (i.e., mitosis) is reduced to below 50%, preferably below 40, 30, 20 or 10% of that observed in untreated control cell populations.
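
By way of illustration only, the criterion above can be sketched as a short calculation. The function names and the cell counts are hypothetical and are not taken from the Examples; the default threshold of 0.5 corresponds to the "below 50%" figure, and the preferred lower values (0.4, 0.3, 0.2 or 0.1) may be passed instead.

```python
def relative_stage_fraction(treated_in_stage, treated_total,
                            control_in_stage, control_total):
    """Proportion of treated cells in the stage, expressed relative to
    the proportion observed in the untreated control population."""
    if treated_total <= 0 or control_total <= 0:
        raise ValueError("cell counts must be positive")
    control_fraction = control_in_stage / control_total
    if control_fraction == 0:
        raise ValueError("no control cells were observed in the stage")
    treated_fraction = treated_in_stage / treated_total
    return treated_fraction / control_fraction


def is_inhibitor(relative_fraction, threshold=0.5):
    """True if the proportion falls below the chosen fraction of control."""
    return relative_fraction < threshold


# Hypothetical counts: 12 of 1000 treated cells in mitosis,
# against 40 of 1000 cells in the untreated control population.
ratio = relative_stage_fraction(12, 1000, 40, 1000)
print(f"relative fraction: {ratio:.2f}; inhibitor: {is_inhibitor(ratio)}")
```

A stricter cut-off is applied by passing, for example, `threshold=0.1`.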


Therapeutic Uses


Many tumours are associated with defects in cell cycle progression, for example loss of normal cell cycle control. Tumour cells may therefore exhibit rapid and often aberrant mitosis. One therapeutic approach to treating cancer may therefore be to inhibit mitosis in rapidly dividing cells. Such an approach may also be used for therapy of any proliferative disease in general. Thus, since the polypeptides described here appear to be required for normal cell cycle progression, they represent targets for inhibition of their functions, particularly in tumour cells and other proliferative cells.


The term proliferative disorder is used herein in a broad sense to include any disorder that requires control of the cell cycle, for example cardiovascular disorders such as restenosis and cardiomyopathy; auto-immune disorders such as glomerulonephritis and rheumatoid arthritis; dermatological disorders such as psoriasis; inflammatory, fungal and parasitic disorders such as malaria; emphysema; and alopecia.


One possible approach is to express anti-sense constructs directed against polynucleotides described in this document, preferably selectively in tumour cells, to inhibit gene function and prevent the tumour cell from progressing through the cell cycle. Anti-sense constructs may also be used to inhibit gene function to prevent cell cycle progression in a proliferative cell. Such anti-sense constructs may comprise anti-sense molecules corresponding to any of the polynucleotides, in particular, those identified in Table 5.


Alternatively, or in addition, RNAi may be used to modulate expression of the polynucleotide in a cell. Double stranded RNA may be made as described in the Examples, e.g., by transcribing both strands of a polynucleotide sequence in a suitable vector (e.g., from T7 or other promoters on either side of the cloned sequence), then denaturing and annealing the transcripts. The double stranded RNA (dsRNA) may then be introduced into a relevant cell to inhibit the transcription or expression of the relevant polynucleotide or polypeptide.
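
As a minimal sketch of the two transcripts that anneal to form the double stranded RNA, the following derives the sense and antisense strands from a cloned insert. The insert sequence and the function names are hypothetical; the sketch assumes a promoter on each side of the insert, each reading one strand, and does not model the promoter or vector sequences themselves.

```python
# Translation table for the DNA complement of each base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(dna_strand):
    """RNA with the same sequence as the given DNA strand (T becomes U)."""
    return dna_strand.upper().replace("T", "U")


def dsrna_strands(insert):
    """Return (sense, antisense) RNA strands, both written 5' to 3'.

    One promoter yields a transcript matching the insert as written; the
    promoter on the opposite side yields the reverse complement.
    """
    sense = transcribe(insert)
    antisense = transcribe(reverse_complement(insert))
    return sense, antisense


# Hypothetical 12-nucleotide insert, for illustration only.
sense, antisense = dsrna_strands("ATGGCCTTAAGC")
print(sense)      # AUGGCCUUAAGC
print(antisense)  # GCUUAAGGCCAU
```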


We therefore describe a method of modulating, preferably down-regulating, the expression of a polynucleotide as described here, preferably a polynucleotide as set out in Table 5 in a cell, the method comprising introducing a double stranded RNA (dsRNA) corresponding to the polynucleotide, or an antisense RNA corresponding to the polynucleotide, or a fragment thereof, into the cell.


Another approach is to use non-functional variants of the polypeptides that compete with the endogenous gene product for cellular components of cell cycle machinery, resulting in inhibition of function. Alternatively, compounds identified by the assays described above as binding to a polypeptide may be administered to tumour or proliferative cells to prevent the function of that polypeptide. This may be performed, for example, by means of gene therapy or by direct administration of the compounds. Suitable antibodies may also be used as therapeutic agents.


Alternatively, double-stranded (ds) RNA is a powerful way of interfering with gene expression in a range of organisms, and has recently been shown to be successful in mammals (Wianny and Zernicka-Goetz, 2000, Nat Cell Biol 2, 70-75). Double stranded RNA corresponding to the sequence of a polynucleotide can be introduced into or expressed in oocytes and cells of a candidate organism to interfere with cell division cycle progression.


In addition, a number of the mutations described herein exhibit aberrant meiotic phenotypes. Aberrant meiosis is an important factor in infertility since mutations that affect only meiosis and not mitosis will lead to a viable organism but one that is unable to produce viable gametes and hence reproduce. Consequently, the elucidation of genes involved in meiosis is an important step in diagnosing and preventing/treating fertility problems. Thus the polypeptides identified in mutant Drosophila having meiotic defects (as is clearly indicated in the Examples) may be used in methods of identifying substances that affect meiosis. In addition, these polypeptides, and corresponding polynucleotides, may be used to study meiosis and identify possible mutations that are indicative of infertility. This will be of use in diagnosing infertility problems.


Administration


Substances identified or identifiable by the assay methods described here may preferably be combined with various components to produce compositions. Preferably the compositions are combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition (which may be for human or animal use). Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition as described here may be administered by direct injection. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration. Typically, each protein may be administered at a dose of from 0.01 to 30 mg/kg body weight, preferably from 0.1 to 10 mg/kg, more preferably from 0.1 to 1 mg/kg body weight.


Polynucleotides/vectors encoding polypeptide components (or antisense constructs) for use in inhibiting cell cycle progression, for example, inhibiting mitosis or meiosis, may be administered directly as a naked nucleic acid construct. They may further comprise flanking sequences homologous to the host cell genome. When the polynucleotides/vectors are administered as a naked nucleic acid, the amount of nucleic acid administered may typically be in the range of from 1 μg to 10 mg, preferably from 100 μg to 1 mg. It is particularly preferred to use polynucleotides/vectors that target specifically tumour or proliferative cells, for example by virtue of suitable regulatory constructs or by the use of targeted viral vectors.


Uptake of naked nucleic acid constructs by mammalian cells is enhanced by several known transfection techniques, for example those including the use of transfection agents. Examples of these agents include cationic agents (for example calcium phosphate and DEAE-dextran) and lipofectants (for example lipofectam™ and transfectam™). Typically, nucleic acid constructs are mixed with the transfection agent to produce a composition.


Preferably the polynucleotide, polypeptide, compound or vector described here may be conjugated, joined, linked, fused, or otherwise associated with a membrane translocation sequence.


Preferably, the polynucleotide, polypeptide, compound or vector, etc described here may be delivered into cells by being conjugated with, joined to, linked to, fused to, or otherwise associated with a protein capable of crossing the plasma membrane and/or the nuclear membrane (i.e., a membrane translocation sequence). Preferably, the substance of interest is fused or conjugated to a domain or sequence from such a protein responsible for the translocational activity. Translocation domains and sequences for example include domains and sequences from the HIV-1-trans-activating protein (Tat), Drosophila Antennapedia homeodomain protein and the herpes simplex-1 virus VP22 protein. In a highly preferred embodiment, the substance of interest is conjugated with penetratin protein or a fragment of this. Penetratin comprises the sequence RQIKIWFQNRRMKWKK (SEQ ID NO:1) and is described in Derossi, et al., (1994), J. Biol. Chem. 269, 10444-50; use of penetratin-drug conjugates for intracellular delivery is described in WO/00/01417. Truncated and modified forms of penetratin may also be used, as described in WO/00/29427.


Preferably the polynucleotide, polypeptide, compound or vector is combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition. Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration.


The routes of administration and dosages described are intended only as a guide since a skilled practitioner will be able to determine readily the optimum route of administration and dosage for any particular patient and condition.


Further Aspects


Further aspects of the invention are set out in the following numbered paragraphs; it is to be understood that the invention includes these aspects.


Paragraph 1. A polynucleotide selected from: (a) polynucleotides encoding any one of the polypeptide sequences set out in Examples 1 to 30 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the polynucleotides defined in (a) above, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the polynucleotides defined in (a) above, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


Paragraph 2. A polynucleotide selected from: (a) polynucleotides encoding any one of the polypeptide sequences set out in Examples 1, 2, 2A, 2B and 2C or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the polynucleotides defined in (a) above, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the polynucleotides defined in (a) above, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


Paragraph 3. A polynucleotide selected from: (a) polynucleotides encoding any one of the polypeptide sequences set out in Examples 3 to 9 and 9A or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the polynucleotides defined in (a) above, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the polynucleotides defined in (a) above, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


Paragraph 4. A polynucleotide selected from: (a) polynucleotides encoding any one of the polypeptide sequences set out in Examples 10 to 29 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the polynucleotides defined in (a) above, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the polynucleotides defined in (a) above, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).


Paragraph 5. A polynucleotide probe which comprises a fragment of at least 15 nucleotides of a polynucleotide according to any of Paragraphs 1 to 4.


Paragraph 6. A polypeptide which comprises any one of the amino acid sequences set out in Examples 1 to 30 or in any of Examples 1 to 2, 2A, 2B and 2C, Examples 3 to 9 and 9A and Examples 10 to 29 or a homologue, variant, derivative or fragment thereof.


Paragraph 7. A polynucleotide encoding a polypeptide according to Paragraph 6.


Paragraph 8. A vector comprising a polynucleotide according to any of Paragraphs 1 to 5 and 7.


Paragraph 9. An expression vector comprising a polynucleotide according to any of Paragraphs 1 to 5 and 7 operably linked to a regulatory sequence capable of directing expression of said polynucleotide in a host cell.


Paragraph 10. An antibody capable of binding a polypeptide according to Paragraph 6.


Paragraph 11. A method for detecting the presence or absence of a polynucleotide according to any of Paragraphs 1 to 5 and 7 in a biological sample which comprises: (a) bringing the biological sample containing DNA or RNA into contact with a probe according to Paragraph 5 under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.


Paragraph 12. A method for detecting a polypeptide according to Paragraph 6 present in a biological sample which comprises: (a) providing an antibody according to Paragraph 10; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.


Paragraph 13. A polynucleotide according to any of Paragraphs 1 to 5 and 7 for use in therapy.


Paragraph 14. A polypeptide according to Paragraph 6 for use in therapy.


Paragraph 15. An antibody according to Paragraph 10 for use in therapy.


Paragraph 16. A method of treating a tumour or a patient suffering from a proliferative disease comprising administering to a patient in need of treatment an effective amount of a polynucleotide according to any of Paragraphs 1 to 5 and 7.


Paragraph 17. A method of treating a tumour or a patient suffering from a proliferative disease, comprising administering to a patient in need of treatment an effective amount of a polypeptide according to Paragraph 6.


Paragraph 18. A method of treating a tumour or a patient suffering from a proliferative disease, comprising administering to a patient in need of treatment an effective amount of an antibody according to Paragraph 10.


Paragraph 19. Use of a polypeptide according to Paragraph 6 in a method of identifying a substance capable of affecting the function of the corresponding gene.


Paragraph 20. Use of a polypeptide according to Paragraph 6 in an assay for identifying a substance capable of inhibiting the cell division cycle.


Paragraph 21. Use as set out in Paragraph 20, in which the substance is capable of inhibiting mitosis and/or meiosis.


Paragraph 22. A method for identifying a substance capable of binding to a polypeptide according to Paragraph 6, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.


Paragraph 23. A method for identifying a substance capable of modulating the function of a polypeptide according to Paragraph 6 or a polypeptide encoded by a polynucleotide according to any of Paragraphs 1 to 5 and 7, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.


Paragraph 24. A substance identified by a method or assay according to any of Paragraphs 19 to 23.


Paragraph 25. Use of a substance according to Paragraph 24 in a method of inhibiting the function of a polypeptide.


Paragraph 26. Use of a substance according to Paragraph 24 in a method of regulating a cell division cycle function.


Paragraph 27. A method of identifying a human nucleic acid sequence, by: (a) selecting a Drosophila polypeptide identified in any of Examples 1 to 30; (b) identifying a corresponding human polypeptide; (c) identifying a nucleic acid encoding the polypeptide of (b).


Paragraph 28. A method according to Paragraph 27, in which a human homologue of the Drosophila sequence, or a human sequence similar to the Drosophila sequence, is identified in step (b).


Paragraph 29. A method according to Paragraph 27 or 28, in which the human polypeptide has at least one of the biological activities, preferably substantially all the biological activities of the Drosophila polypeptide.


Paragraph 30. A human polypeptide identified by a method according to Paragraph 27, 28 or 29.


The invention will now be further described by way of Examples, which are meant to serve to assist one of ordinary skill in the art in carrying out the invention and are not intended in any way to limit the scope of the invention.


EXAMPLES
Examples Section A
Identification of Human Cell Cycle Genes

Introduction


In order to identify new cell cycle regulatory genes in Drosophila and their human counterparts, we investigated 33 fly lines obtained by P-element mutagenesis carried out on the X chromosome. All of these fly lines are screened directly for mitotic phenotypes at developmental stages where division is crucial (i.e. the syncytial embryo, larval brains, and male and female meiosis). In each case, the P-element insertion site is identified, leading to the selection of 62 genes flanking the insertion site.


In order to clarify the identity of the mutated “mitotic genes”, we use an RNAi-based knockdown approach in cultured Drosophila cells followed by FACS analysis, mitotic index evaluation (Cellomics Arrayscan) and immunofluorescence observations of mitotic phenotypes for all 63 genes.


The microscope phenotyping approach led to the identification of 30 gene candidates that are required for cell cycle progression, some of which are also detected as presenting some changes in the FACS profile and/or in the mitotic index (see Table 5 for a full summary). Data relating to these genes is presented in Examples Section B, Examples 1 to 29 below.


These genes encode a variety of novel proteins: 6 protein kinases, 2 protein phosphatases, 2 proteins of the ubiquitin-mediated protein degradation pathway, a cytoskeletal protein, a microtubule-binding protein, a homologue of a suspected kinesin-like protein, an RNA polymerase 2 associated cyclin, a ribosomal protein, a protein involved in retrograde (Golgi to ER) transport, a member of the family of thioredoxin reductases, a hydroxymethyltransferase, a Cdk associated protein, an RNA binding protein, an O-acetyl transferase and 9 other novel proteins with no particularly characteristic identifying features.


Human counterparts of the selected genes are identified and tested as described below. A short list of Drosophila and human genes and proteins useful for screening for anti-proliferative molecules is presented as Table 5.

TABLE 5
Short list of potentially new interesting gene candidates

Drosophila Gene Name | Human Homologue Gene Name | Human Homologue Accession Number
CG2028 | Casein kinase I | P48729
CG3011 | Serine hydroxymethyl transferase | AAA63258
CG15309 | DiGeorge syndrome related protein FKSG4 | AAL09354
CG15305 | Human homologue of CG15305 | None
CG2222 | Hypothetical protein FLJ13912 | NP_073607
CG2938 | CAS1 O-acetyltransferase | NP_075051
CG1524 | Ribosomal protein S14 | A25220
CG10778 | Hypothetical protein FLJ13102 (kinesin like) | NP_079163
CG18292 | Cdk associated protein 1 (deleted in oral cancer) | BAA22937
CG10701 | Moesin | A41289
CG10648 | Mak16-like RNA binding protein | NP_115898
CG2854 | CAD38627 hypothetical protein | CAD38627
CG2845 | B-raf | AAA35609
CG1486 | BAA19780 novel protein | BAA19780
CG10964 | 11-cis retinal dehydrogenase | AAC50725
CG2151 | Thioredoxin reductase beta | XP_033135
CG10988 | Gamma tubulin ring complex 3 | AAC39727
CG1558 | Human homologue of CG1558 | None
CG11697 | Novel protein | BAB14444 unnamed protein - similar to a hypothetical protein in the region deleted in human familial
CG3954 | Protein tyrosine phosphatase non-receptor type 11 (Shp2) | AAH08692
CG16903 | Cyclin L ania-6a | AAD53184
CG16983 | Skp1 ubiquitin ligase | XP_054159
CG13363 | CGI-85 | NP_057112
CG18319 | Ubc13 ubiquitin conjugating enzyme | BAA11675
CG14813 | archain | CAA57071
CG8655 | Cdc7 | AAB97512
CG2621 | GSK 3 beta | NP_002084
CG1725 | Dlg1/Dlg2 | XP_012060
CG1594 | JAK-2 Janus kinase 2 | NP_004963
CG2096 | Protein phosphatase 1 | NP_002700


Results


Table 6 shows all significant cell cycle phenotypes observed after RNAi with the Drosophila genes flanking P-element insertion sites identified in Examples 1 to 29. The PCR primers used to create the double stranded RNA (see Materials and Methods above) are shown in each case together with the RNA ID number. Results derived from FACS analysis of cell cycle compartment, mitotic index as determined by the Cellomics mitotic index assay, and cellular phenotypes determined by microscopy are shown.


FACS Analysis of Cell Cycle


FACS analysis is used to assess the effects of Drosophila gene specific RNAi on the cell cycle. Through the determination of the DNA content by propidium iodide quantitation, any changes in the cell cycle distribution in sub-G1 (apoptotic), G1 and G2/M can be observed. 24 genes in the FACS assessment present some changes in cell cycle distribution (Table 6).
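
For illustration only, a much simplified sketch of assigning cells to cell cycle compartments from their propidium iodide signal is given below. It assumes that the position of the G1 peak is already known and uses fixed, hypothetical gate boundaries expressed as multiples of that peak. The S-phase and ">4N" gates are assumptions added so that every cell falls into some compartment. The sketch is not the analysis software used to produce Table 6.

```python
from collections import Counter

# Compartments reported by the sketch. "S" and ">4N" are assumed extra gates.
PHASES = ("sub-G1", "G1", "S", "G2/M", ">4N")


def gate_cell(pi_signal, g1_peak):
    """Assign one cell to a compartment from its DNA content signal.

    Gate boundaries are hypothetical multiples of the G1 peak position.
    """
    ratio = pi_signal / g1_peak
    if ratio < 0.75:
        return "sub-G1"   # less than G1 DNA content (apoptotic)
    if ratio < 1.25:
        return "G1"
    if ratio < 1.75:
        return "S"
    if ratio <= 2.5:
        return "G2/M"     # roughly twice the G1 DNA content
    return ">4N"


def cell_cycle_distribution(pi_signals, g1_peak):
    """Return the fraction of cells falling in each compartment."""
    if not pi_signals:
        raise ValueError("no cells to analyse")
    counts = Counter(gate_cell(s, g1_peak) for s in pi_signals)
    total = len(pi_signals)
    return {phase: counts.get(phase, 0) / total for phase in PHASES}


# Hypothetical signals in arbitrary units, with the G1 peak at 100.
signals = [50, 100, 105, 95, 150, 200, 210, 400]
distribution = cell_cycle_distribution(signals, g1_peak=100)
for phase in PHASES:
    print(f"{phase}: {distribution[phase]:.3f}")
```

Comparing such a distribution for an RNAi-treated population with that of a control population indicates a shift between compartments.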


Mitotic Index Evaluation with Cellomics Arrayscan


An evaluation of mitotic index is performed using the Cellomics arrayscan and the Cellomics proprietary mitotic index HitKit procedure (see Materials and Methods above).


The basic principle of this method is that cells in mitosis are decorated by an antibody directed against a specific mitotic marker. Their proportion relative to the total number of cells is determined, giving the proportion of cells in mitosis. This automated method has the advantage of being more rapid than microscope observation; however, it measures only one feature of the cycling cells. Some mitotic genes that do not significantly affect the overall proportion of cells in mitosis will therefore not be detected. The reverse is also true, as the knockdown of some gene products might affect the mitotic index without displaying any obvious increase in chromosomal or spindle defects. Table 6 presents data only where there was a statistically significant variation in the mitotic index (determined by a t-test value of <0.1) as compared to the RFP RNAi control.
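
The calculation can be sketched as follows, purely by way of illustration. The per-well counts are hypothetical. So that the sketch needs only the Python standard library, an exact two-sided permutation test on the difference of means is swapped in for the t-test named above; it is not the HitKit procedure. The 0.1 cut-off is the one stated in the text.

```python
from itertools import combinations
from statistics import mean


def mitotic_index(marker_positive, total_cells):
    """Proportion of cells carrying the mitotic marker."""
    if total_cells <= 0:
        raise ValueError("total cell count must be positive")
    return marker_positive / total_cells


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference of group means.

    Used here in place of a t-test; it enumerates every way of splitting
    the pooled values into groups of the original sizes.
    """
    pooled = list(treated) + list(control)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), len(treated)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        splits += 1
        # Small tolerance so that ties with the observed split are counted.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / splits


# Hypothetical (marker-positive, total) counts per well.
knockdown_wells = [(10, 1000), (12, 1000), (11, 1000), (9, 1000)]
control_wells = [(30, 1000), (28, 1000), (33, 1000), (31, 1000)]

knockdown_mi = [mitotic_index(p, t) for p, t in knockdown_wells]
control_mi = [mitotic_index(p, t) for p, t in control_wells]
p_value = permutation_p_value(knockdown_mi, control_mi)
print(f"p = {p_value:.4f}; reported: {p_value < 0.1}")
```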


An increase in mitotic index can indicate that the knockdown of a gene essential for completion of mitosis has blocked more cells in mitosis. However, many of the gene knockdowns listed in Table 6 result in a decrease in the mitotic index, suggesting that the population of cells overall is spending less time in mitosis. Possible interpretations of this are that defects in the centrosome duplication cycle block some cells in G1/S so that they are unable to enter mitosis, or that defects in cytokinesis block cells on the exit from mitosis at a point after the assay-specific marker is lost. The loss of checkpoints at mitosis may also allow cells to move faster through mitosis. The increase in mitotic defects observed for most of these genes might then be the result of this lack of checkpoint control.


13 genes in the phenotype assessment present some changes in the mitotic index (Table 6).


Microscope Observation and Cellular Phenotyping


The primary goal of the cell phenotype assessment is to find abnormalities in the following: chromosome number in prometaphase (ploidy), chromosome behaviour in metaphase or anaphase, spindle morphology, number of centrosomes, and cell viability. The secondary goal of the assessment is to evaluate and quantify these abnormalities; this is an essential step, as control cells also present some defects.


The wild-type Drosophila DMEL2 cells present a large range and a significant proportion of chromosomal defects (between 30 and 40%). Therefore, between 300 and 500 mitotic cells were counted for each experiment in order to obtain a statistically significant evaluation of any change in the proportion of defects. The cells categorized as presenting chromosomal defects in the study encompass aneuploid and polyploid prometaphase cells, cells that apparently fail to align their chromosomes at metaphase, and cells with lagging or stretched chromosomes in anaphase. Spindle defects are also noted, but not quantified in the same group. Some candidates are also noted as presenting a significant decrease in the number of mitotic cells (mitotic index) or as affecting the viability of the cells (decrease in cell confluency or presence of apoptotic cells).
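
Because control cells already show a high background of defects, a change in the proportion of defective mitotic cells has to be judged against that background. The sketch below compares two proportions with a two-proportion z-test. This choice of test is an assumption, since the statistical method is not specified here, and the counts are hypothetical.

```python
import math


def two_proportion_z_test(defective_a, counted_a, defective_b, counted_b):
    """Return (z, two-sided p-value) for the difference of two proportions.

    Uses the pooled-proportion standard error and the normal approximation,
    which is reasonable for counts of several hundred cells.
    """
    if counted_a <= 0 or counted_b <= 0:
        raise ValueError("cell counts must be positive")
    p_a = defective_a / counted_a
    p_b = defective_b / counted_b
    pooled = (defective_a + defective_b) / (counted_a + counted_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / counted_a + 1 / counted_b))
    if se == 0:
        raise ValueError("proportions are degenerate (all or no defects)")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 220 defective of 400 mitotic cells after knockdown,
# against 140 defective of 400 mitotic cells in the control (35%).
z, p_value = two_proportion_z_test(220, 400, 140, 400)
print(f"z = {z:.2f}, p = {p_value:.2e}")
```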


A noteworthy observation is that it is difficult to find a unique representative phenotype for most of the genes tested. Rather than one gene=one phenotype, an overall increase in the different categories of chromosomal defects is observed. However, one can often see a more significant increase in one particular subcategory of defects, for example in the proportion of lagging chromatids or the number of centrosomes.


Table 6 describes the data obtained from these studies for genes where a significant phenotype is observed. 30 of the candidate genes show a significant phenotype, 26 of which show an increase in chromosomal defects. This increase in mitotic chromosome behaviour abnormalities is sometimes associated with an increase in mitotic spindle defects. Of the remaining 4 with no increase in chromosomal defects, CG1725 (RNA 528/529) shows a clear increase in spindle defects. With CG1524 (RNA 482/483) there are not enough mitotic cells to do a proper quantification; as the gene product is a ribosomal protein, it is highly probable that its inactivation results in a net increase in the proportion of cell death, explaining the drop in cell confluency also observed. For CG14813 (RNA 586/587), a large proportion of cells are dying and there is an obvious decrease in the number of mitotic cells, which might affect the relative proportion of normal and abnormal mitotic cells. Finally, CG10648 (RNA 488/489) had a lower proportion of chromosomal defects but a high proportion of monopolar and small spindles. The proportion of prometaphase cells and apoptotic cells was also high.


Conclusion


From a collection of Drosophila P-element insertion lines which display phenotypes consistent with an effect on mitosis, we derived a series of novel Drosophila and human genes which represent targets for the development of anti-proliferative therapies. We used three different approaches to validate the role of each gene in the cell cycle and to gather phenotype information following an RNAi-based gene knockdown approach.


Table 5 shows a short list of 30 new interesting human genes demonstrated to play a role in mitosis. This short list is mainly based on the results of the detailed microscope phenotype evaluation (see Table 6), although all of the 42 genes listed in Table 6 show a cell cycle related phenotype in one or more of the 3 assays.


Materials and Methods


Generation and Identification of Lethal, Semi-Lethal and Sterile X Chromosome Mutants Having Defects in Mitosis and/or Meiosis


P-Element Mutagenesis


Transposable elements are widely used for mutagenesis in Drosophila melanogaster as they couple the advantages of providing effective genetic lesions with ease of detecting disrupted genes for the purpose of molecular cloning. To achieve near saturation of the genome with mutations resulting from mobilisation of the P-lacW transposon (a P-element marked with a mini-white gene, bearing the E. coli lacZ gene as an enhancer trap, and an E. coli replicon and ampicillin resistance gene to facilitate ‘plasmid rescue’ of sequences at the site of the P-insertion), Drosophila females that are homozygous for P-lacW (inserted on the second chromosome) are crossed with males carrying the transposase source P(Δ2-3) (Deak et al., 1997). Random transpositions of the mutator element are then ‘captured’ in lines lacking transposase activity. Stable, or balanced, stocks bearing single lethal P-lacW insertions are made to give a collection of 501 lines (Peter et al., submitted) and a further 73 lines that are either sterile or carry a mutation giving a visible morphological phenotype.


Screening for Mitotic and Meiotic Defects


About half of the mutants in the collection are embryonic lethals.


Screens for mutants affecting spermatogenesis within this collection of 501 recessive lethal, semi-lethal and sterile mutants were carried out.


We have carried out cytological screens of the lines that comprise late larval lethals, pupal lethals, pharate and adult semi-lethals and steriles for defective mitosis in the developing larval CNS. This has identified 20 complementation groups that affect all stages of the mitotic cycle. The cytological screens involve examining orcein-stained squashed preparations of the larval CNS to detect abnormal mitotic cells. In lines where defects are identified, the larval CNS is subjected to immunostaining to identify centromeres, spindle microtubules and DNA for further examination. This leads to clarification of the mitotic defect.


As a set of common functions is essential to both mitosis and meiosis, we then identify mutations resulting in sterility and failed progression through male meiosis. This involves examining squashed preparations of larval, pupal or adult testes by phase contrast microscopy. We examine “onion stage” spermatids in the 24 pupal and pharate lethal lines and adult “semi-lethal” and viable lines for variations in the size and number of nuclei, which indicate defects in chromosome segregation or cytokinesis, respectively. A total of 8 lines show such defects.


Further phenotype information for each mutant described in the results section, as observed by phase contrast microscopy of dividing meiocytes, is provided in the “Phenotype” field.


We then examined the ovaries and eggs of females that, when homozygous, are either sterile or produce embryos that fail to develop. Dissected ovaries are examined by microscopy for defects in the mitotic divisions that lead to the formation of the 16 cell egg chambers; for defects in the endoreduplication of the 15 nurse cell nuclei; for cytoskeletal defects in the development of the egg chamber; for defects in meiosis; and for mitotic defects in embryos derived from mutant mothers.


We examined 24 lines that show female sterility or maternal effect lethality when homozygous and identify 5 that display defects of the type described above. In the Examples 1 to 29 below, lines exhibiting mitotic and meiotic phenotypes are categorised generally into three categories:


Category 1: Female Sterile


Category 2: Male Sterile


Category 3: Mitotic (Neuroblast) Phenotypes


Category 1 phenotypes are exhibited by mutations in Examples 1, 2, 2A, 2B and 2C; while Category 2 phenotypes are exhibited by mutations in Examples 3 to 9 and 9A. Category 3 phenotypes are exhibited by mutations in Examples 10 to 29.


Plasmid Rescue of P-Elements from Mutant Drosophila Lines


Genomic DNA was isolated from adult flies by the method of Jowett et al., 1986. Inverse PCR is used to identify flanking chromosomal sequences. The position of the inserted P-element is indicated in the Examples.


Sequence Analysis of P Element Insertion Lines


The open reading frame(s) (ORF(s)) immediately adjacent to the insertion site are identified from the annotated total genome sequence of Drosophila with reference to the ‘GADFLY’ section of the ‘FLYBASE’ Drosophila genome database (database of the Berkeley Drosophila Genome Project). The site of P element insertion and the GenBank accession number of the genomic file which contains the insertion site are included in the results section.


Where the insertion site was within a gene or close to the 5′ end of a gene, disruption of this gene is likely to be responsible for the phenotype, and it is included in the results section under the field heading “Annotated Drosophila Genome Complete Genome Candidate”, as both an accession number and an amino acid sequence. Where the insertion site indicates that the P-element may be affecting expression of two diverging genes (on opposite strands of the DNA) both are included in the results section.
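The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and coordinates are hypothetical, and the `max_upstream` cut-off for "close to the 5′ end" is an assumed parameter, since no distance is specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str    # hypothetical identifier
    start: int   # leftmost coordinate on the genomic segment
    end: int     # rightmost coordinate (inclusive)
    strand: str  # "+" or "-"


def candidate_genes(insertion, genes, max_upstream=1000):
    """Return names of genes that contain the insertion site, or whose
    5' end lies at most max_upstream bp downstream of it.

    max_upstream is an assumed cut-off, not a value from the text.
    """
    hits = []
    for g in genes:
        if g.start <= insertion <= g.end:
            hits.append(g.name)
            continue
        # Distance from the insertion to the 5' end, measured upstream
        # of the gene (against its direction of transcription).
        if g.strand == "+":
            upstream = g.start - insertion
        else:
            upstream = insertion - g.end
        if 0 < upstream <= max_upstream:
            hits.append(g.name)
    return hits


# Two diverging genes (opposite strands) flanking an insertion at 3200.
genes = [
    Gene("geneA", 1000, 3000, "-"),
    Gene("geneB", 3500, 6000, "+"),
    Gene("geneC", 9000, 12000, "+"),
]
print(candidate_genes(3200, genes))
```

With an insertion between two diverging genes, both are returned, which corresponds to the case where both candidates are reported in the results section.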


The Drosophila gene sequence is then used to identify a human homologue. Data on homologues are derived from the Blink (“BLAST Link”) facility provided by the NCBI (National Center for Biotechnology Information) database. Where homologues are not apparent, further searches are made against the NCBI database using BLASTX (which compares the nucleotide query sequence virtually translated in all 6 frames against an amino acid database) or TBLASTN (amino acid query sequence against a nucleotide database virtually translated in all 6 frames) or TBLASTX (nucleotide query sequence against nucleotide database, both virtually translated in all 6 frames). Human homologues are included in the results section under the heading “Human Homologue of Complete Genome Candidate”, as both an accession number and an amino acid sequence.
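To make the phrase “virtually translated in all 6 frames” concrete, the following minimal Python sketch translates a nucleotide sequence in the three forward and three reverse-complement reading frames using the standard genetic code. It illustrates the idea only and is not the BLAST implementation; the function names are hypothetical, and the short query is the first 15 nucleotides of the coding region of the Example 5 sequence (SEQ ID NO:124).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptides."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


for frame, peptide in six_frame_translation("ATGAGCAAATCACTC").items():
    print(frame, peptide)
```

Frame +1 of this query gives MSKSL, the first five residues of the peptide listed as SEQ ID NO:125.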


Additional Sequence Analysis using the Annotated D. melanogaster Sequence (GadFly)


As indicated above, rescue sequences are also used to search the fully annotated version of the Drosophila genome (GadFly; Adams, et al., 2000, Science 287, 2185-2195), using FlyBLAST at the Berkeley Drosophila Genome Project web site (http://www.fruitfly.org/annot/) to identify the genome segment (usually approximately 200-250 kb) containing the P-element insertion site. The graphic representation of the genomic fragment available at GadFly allows the identification of all real and theoretical genes which flank the site of insertion. Candidate genes where the P-element is either inserted within the gene or close to the 5′ end of the gene are identified. In GadFly, the Drosophila genes are given the designation CG (Complete gene) and usually details of human homologues are also given. Such human sequences may also be obtained by using the fly sequences to screen databases with the BLAST series of programs. They may also be found by nucleic acid hybridisation techniques. In both cases homologies are defined using the parameters taught earlier in this patent. In most cases, these data confirm the data derived from the sequence analysis procedure described above, and in some cases new data are obtained. Where available, both sets of data are included in the individual Examples described below.


Confirmation of Cell Cycle Involvement of Candidate Genes Using Double Stranded RNA Interference (RNAi)


P-elements usually insert into the region 5′ to a Drosophila gene. This means that there is sometimes more than one candidate gene affected, as the P-element can insert into the 5′ regions of two diverging genes (one on each DNA strand). In order to confirm which of the candidate genes is responsible for the cell cycle phenotype observed in the fly line, we use the technique of double stranded RNA interference to specifically knock out gene expression in Drosophila cells in tissue culture (Clemens, et al., 2000, Proc. Natl. Acad. Sci. USA, 6499-6503). The overall strategy is to prepare double stranded RNA (dsRNA) specific to each gene of interest and to transfect this into Schneider's Drosophila line 2 (Dmel-2) to inhibit the expression of the particular gene. The dsRNA is prepared from a double stranded, gene specific PCR product with a T7 RNA polymerase binding site at each end. The PCR primers consist of 25-30 bases of gene specific sequence fused to a T7 polymerase binding site (TAATACGACTCACTATAGGGACA) (SEQ ID NO:2), and are designed to amplify a DNA fragment of around 500 bp. Although this is the optimal size, the sequences in fact range from 450 bp to 650 bp. Where possible, PCR amplification is performed using genomic DNA purified from Schneider's Drosophila line 2 (Dmel-2) as a template. This is only feasible where the gene has an exon of 450 bp or more. In instances where the gene possesses only short exons of less than 450 bp, primers are designed in different exons and PCR amplification is performed using cDNA derived from Schneider's Drosophila line 2 (Dmel-2) as a template.
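The primer design rules stated above (a T7 polymerase binding site fused to 25-30 bases of gene specific sequence, with an amplicon of 450 bp to 650 bp) can be expressed as a small check. This is a sketch only: the function names are hypothetical, the gene specific sequences in the usage lines are placeholders, and the binding site is passed as a parameter so that whichever exact site sequence is used for a given primer set can be supplied.

```python
# T7 polymerase binding site as quoted in the text (SEQ ID NO:2).
T7_SITE = "TAATACGACTCACTATAGGGACA"


def add_t7(gene_specific, t7_site=T7_SITE):
    """Fuse the T7 site to the 5' end of a gene specific primer."""
    return t7_site + gene_specific.upper()


def check_design(forward, reverse, amplicon_bp):
    """Return warnings for a primer pair measured against the stated rules.

    forward/reverse are the gene specific parts only (without the T7 site).
    """
    warnings = []
    for label, primer in (("forward", forward), ("reverse", reverse)):
        if not 25 <= len(primer) <= 30:
            warnings.append(
                f"{label} gene specific sequence is {len(primer)} nt, "
                "outside 25-30 nt"
            )
    if not 450 <= amplicon_bp <= 650:
        warnings.append(f"amplicon is {amplicon_bp} bp, outside 450-650 bp")
    return warnings


# Placeholder sequences, for illustration only.
print(add_t7("acgt" * 7))
print(check_design("A" * 26, "C" * 24, 700))
```

An empty list from `check_design` means the pair satisfies both stated constraints; it says nothing about specificity or melting temperature, which would need separate tools.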


A sample of PCR product is analysed by horizontal gel electrophoresis and the DNA purified using a Qiagen QiaQuick PCR purification kit. 1 μg of DNA is used as the template in the preparation of gene specific single stranded RNA using the Ambion T7 Megascript kit. Single stranded RNA is produced from both strands of the template and is purified and immediately annealed by heating to 90 degrees C. for 15 mins followed by gradual cooling to room temperature overnight. A sample of the dsRNA is analysed by horizontal gel electrophoresis.


3 μg of dsRNA is transfected into Schneider's Drosophila line 2 (Dmel-2) using the transfection agent, Transfect (Gibco) and the cells incubated for 72 hours prior to fixation. The DNA content of the cells is analysed by staining with propidium iodide and standard FACS analysis for DNA content. The cells in G1 and G2/S phases of the cell cycle are visualised as two separate population peaks in normal cycling S2 cells. In each experiment, Red Fluorescent Protein dsRNA is used as a negative control.


Preparation of dsRNA


RNA is prepared using an Ambion T7 Megascript kit in the following reaction: μl 10×T7 reaction buffer, 2 μl 75 mM ATP, 2 μl 75 mM GTP, 2 μl 75 mM UTP, 2 μl 75 mM CTP, 2 μl T7 RNA polymerase enzyme mix, 8 μl purified PCR product


Incubate at 37° C. for 6 hours. For convenience this can be done overnight in a PCR machine, such that the reaction is due to finish the next day e.g. 10 hrs 4° C., 6 hrs 37° C., 4° C. ∞ (prog. LISA6)


To degrade the DNA, add 1 μl DNase I (2U/μl) and incubate at 37° C. for 15 mins.


Add 115 μl DEPC-treated water and 15 μl ammonium acetate stop solution (5M ammonium acetate, 100 mM EDTA)


Extract with an equal volume of phenol/chloroform, an equal volume of chloroform and then precipitate the RNA by adding 1 volume of isopropanol. Chill at −20° C. for 15-30 mins, then spin at top speed in a microfuge at 4° C. Remove the supernatant avoiding the RNA pellet, which appears as a clear, jelly-like pellet at the base of the tube. Dry briefly then dissolve the RNA in 20-100 μl DEPC-treated water, depending on the size of the pellet.


At this stage there are 2 complementary single stranded RNAs. To anneal these, incubate the tube at 90° C. for 10 mins, then cool slowly, by transferring to a hot block at 37° C. and then setting the thermostat to room temperature.


Once the hot block has reduced to room temperature, spin down the liquid to the bottom of the tube and run 1 μl on a 1% agarose TBE horizontal gel to check the RNA yield and size.


Transfection of Schneider Line 2 (Dmel-2) Cells with dsRNA (Adherent Protocol)


Transfect 3 μg dsRNA into Schneider line 2 (Dmel-2) cells using Promega Transfast transfection reagent.


Schneider line 2 (Dmel-2) cells are grown in Schneider's medium+10% FCS+penicillin/Streptomycin, at 25° C. For the purpose of transfection with dsRNA, 25 ml of a healthy growing culture should be sufficient for 24-30 transfections. Knock off cells adhering to the bottom of the flask by banging it sharply against the side of the bench, then aliquot 1 ml into each well of 5 six-well plates. Add an additional 2 ml Schneider's medium+10% FCS+penicillin/Streptomycin to each well and incubate the plates overnight in a humid chamber at 25° C.


Vortex the Transfast, then add 9 μl to a sterile eppendorf containing the 3 μg dsRNA. Add 1 ml Schneider's medium (no additives), vortex immediately and incubate at room temperature for 15 mins. In the mean time, carefully remove the Schneider's medium from the six-well plates and replace with Schneider's medium (no additives); ˜1 ml/well.


Once the dsRNA+ Transfast has finished its 15 min incubation, remove the medium from the cells in the six-well plates, replace with the 1 ml dsRNA/Transfast/Schneider's medium and incubate at 25° C. for 1 hr in a humid chamber.


Add 2 ml Schneider's medium containing 10% FCS+pen/strep and return to humid chamber in 25° C. incubator for 24-72 hrs.
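For planning a batch, the per-well amounts given in the adherent protocol above can be scaled with a short helper. This is a minimal sketch; the function name and dictionary keys are hypothetical, and it covers only the reagents used on the day of transfection (3 μg dsRNA, 9 μl Transfast, 1 ml additive-free medium for the mix, about 1 ml additive-free medium to replace the overnight medium, and 2 ml complete medium added after the 1 hr incubation).

```python
import math


def transfection_plan(n_wells):
    """Total reagent amounts for n_wells transfections (per-well values
    taken from the adherent protocol; no allowance for pipetting loss)."""
    return {
        "six_well_plates": math.ceil(n_wells / 6),
        "dsRNA_ug": 3 * n_wells,
        "transfast_ul": 9 * n_wells,
        # 1 ml for the dsRNA/Transfast mix + ~1 ml to replace the medium
        "medium_no_additives_ml": 2 * n_wells,
        # added after the 1 hr incubation
        "medium_complete_ml": 2 * n_wells,
    }


for item, amount in transfection_plan(30).items():
    print(item, amount)
```

In practice a margin for pipetting loss would be added on top of these totals.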


Initially, observations of the effects of dsRNA transfection on the Schneider line 2 cell cycle are made after 72 hrs incubation, but where a significant phenotype is observed, additional transfections are performed and observations made at earlier time points.


For each experiment, transfection with RFP dsRNA is used as a negative control. Cells which have been treated with Transfast, but which have not been transfected with dsRNA, are also included as a control. Transfection with polo or orbit dsRNA, shown in preliminary studies to have an observable effect on the Schneider line 2 cell cycle, is used as a positive control in each experiment.


Immunostaining of DMEL-2 Cells for Microscopic Analysis

    • For microscopic analysis of the DMEL-2 insect cell line, ~4×10⁶ cells (0.5×10⁶ cells for 3 day incubations) are grown on coverslips in the bottom of the wells of six-well plates.
    • Following any required treatments, the medium is carefully removed and replaced with 1 ml PHEMgSO4 fixation buffer (60 mM PIPES, 25 mM Hepes, 10 mM EGTA, 4 mM MgSO4, pH to 6.8 with KOH)+3.7% formaldehyde. Until the cells are fixed they do not adhere strongly to the coverslip, so it is important to pipette gently at this stage.
    • The cells are left to fix for 20 mins, then the buffer is replaced with PBS+0.1% Triton X-100 for 2 mins to permeabilise the cells.
    • Cells are then blocked using PBS+0.1% Triton X-100+1% BSA (freshly prepared) and incubated for 1 hr at RT.
    • Next, cells are incubated with the primary rat α-tubulin antibody YL1/2 (1:300 dil.) (+any other primary antibodies to be used, e.g. gamma-tubulin at 1:500) in PBS+0.1% Triton X-100+1% BSA for 2-3 hrs at RT or alternatively overnight at 4° C.
    • Wash the cells 3 times for 5 mins in PBS+0.1% Triton X-100 and then incubate with the secondary antibody, TRITC-donkey anti-rat (1:500 dil.) (+any other secondary antibodies to be used) in PBS+0.1% Triton X-100+1% BSA, at room temperature for 1 hr.
    • Wash the cells 3 times for 5 mins in PBS+0.1% Triton X-100 and once in PBS alone, then mount on a slide on a drop of N-propyl gallate mounting medium containing DAPI to stain the DNA and seal with nail varnish.
    • View using fluorescence microscopy.


Primary antibodies: anti α-tub, 1:300 (rat YL1/2; SEROTEC); anti γ-tub, 1:500 (mouse; Sigma GTU-88)


Secondary antibodies: TRITC donkey anti-rat IgG at 1:300 (Jackson Immunoresearch, 712-026-150); AlexaFluor 488 goat anti-mouse, 1:300 (Molecular Probes; A-11001)


Transfections of S2 cells were carried out in 6 well tissue culture plates using 3 μg dsRNA per gene. The cells were harvested after three days for immunostaining.


Microscope Observations and Cellular Phenotyping


All studies were performed using a standard operating procedure. For every gene, each phenotypic test was performed in duplicate, in two independent sets of experiments, following a 48 hour period of RNAi induction. The observations were carried out using a Zeiss Axioskop 2 motorized microscope with a 63×/1.4 plan-apochromat Zeiss objective.


Cells were fixed and stained with DAPI, alpha-tubulin and gamma-tubulin to visualise the nucleus/DNA, the microtubule network/spindle and the centrosomes respectively (see immunostaining section).


For each experiment, the numbers of normal-looking and abnormal-looking mitotic cells in prophase/prometaphase, metaphase, anaphase and telophase are quantified. The abnormalities comprise abnormal chromosome number in prometaphase, misaligned chromosomes in metaphase and lagging chromosomes in anaphase. Abnormalities in spindle morphology and in the number of centrosomes are also carefully noted. To get a more complete characterisation of the phenotype, cell viability (cell confluency and number of apoptotic cells) is also assessed, as are the number of multinucleated interphase cells and the nucleus and cell morphology if different from control. Where a phenotype appears to be representative, images are stored for presentation of the data.
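The counts described above lend themselves to two simple summary figures. The sketch below is an assumption about how such figures could be derived, not a statement of the scoring procedure actually used: it computes the percentage of abnormal mitotic figures and expresses a mitotic index as a percentage of the RFP control. All counts in the usage lines are invented for illustration.

```python
def percent_abnormal(normal, abnormal):
    """Percentage of scored mitotic cells that look abnormal."""
    total = normal + abnormal
    if total == 0:
        raise ValueError("no mitotic cells scored")
    return 100.0 * abnormal / total


def mitotic_index_vs_control(mitotic, total, ctrl_mitotic, ctrl_total):
    """Mitotic index of the sample as a percentage of the control's."""
    sample_index = mitotic / total
    control_index = ctrl_mitotic / ctrl_total
    return 100.0 * sample_index / control_index


# Invented counts, for illustration only.
sample = percent_abnormal(normal=60, abnormal=40)
control = percent_abnormal(normal=80, abnormal=20)
print(f"abnormal mitoses: {sample:.0f}% vs {control:.0f}% in control")
print(f"mitotic index: {mitotic_index_vs_control(30, 1000, 40, 1000):.0f}% of control")
```

Whether an "increase in chromosomal defects" is reported as a difference in percentage points or as a relative change is a choice the sketch leaves to the caller.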


FACS Analysis of Transfected Schneider Line 2 Cells


Following transfection and incubation for the desired length of time, transfer the cells to a 15 ml centrifuge tube and pellet by spinning at 2000 rpm for 5 mins. Remove the supernatant, resuspend the cell pellet in 1 ml PBS and pellet a second time by spinning at 2000 rpm for 5 mins. Remove 900 μl of the PBS, resuspend the cells in the remaining PBS and then add 900 μl ethanol drop-wise while vortexing the tube. Transfer the cells to an eppendorf tube and store at −20° C.


On the day of analysis, pellet the cells by spinning in a microfuge for 5 mins at 2000 rpm, remove the supernatant, resuspend the cells in the residual ethanol and add 500 μl PBS. To remove clumps, take the cells up through a 25 gauge needle and transfer to a FACS tube. Add 3 μl 6 mg/ml RNase A (Pharmacia) and 2.5 μl 25 mg/ml propidium iodide and incubate at 37° C. for 30 mins, then store on ice.


Analyse DNA content of the Schneider line 2 cells using FACSCalibur at Babraham Institute. Mutant phenotypes are determined by comparing profiles relative to cells transfected with RFP dsRNA.
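Comparison of a DNA content profile with the RFP control can be illustrated by binning propidium iodide intensities into sub-G1, G1, S, G2/M and greater-than-4N populations. The sketch below is a simplified assumption: the gates are fixed windows around the G1 peak and around twice its value, the `tolerance` parameter is invented, and real analysis would be done in the cytometer software with gates set by eye or by model fitting.

```python
def cell_cycle_fractions(intensities, g1_peak, tolerance=0.15):
    """Fraction of events in each DNA-content class.

    g1_peak is the intensity of the 2N (G1) peak; the G2/M window is
    centred on twice that value. tolerance is an assumed gate half-width.
    """
    if not intensities:
        raise ValueError("no events")
    g1_lo, g1_hi = g1_peak * (1 - tolerance), g1_peak * (1 + tolerance)
    g2_lo, g2_hi = 2 * g1_peak * (1 - tolerance), 2 * g1_peak * (1 + tolerance)
    counts = {"sub-G1": 0, "G1": 0, "S": 0, "G2/M": 0, ">4N": 0}
    for x in intensities:
        if x < g1_lo:
            counts["sub-G1"] += 1
        elif x <= g1_hi:
            counts["G1"] += 1
        elif x < g2_lo:
            counts["S"] += 1
        elif x <= g2_hi:
            counts["G2/M"] += 1
        else:
            counts[">4N"] += 1
    n = len(intensities)
    return {name: c / n for name, c in counts.items()}


def shift_vs_control(sample, control):
    """Per-class difference in fraction (sample minus control)."""
    return {name: sample[name] - control[name] for name in sample}


# Invented intensities, for illustration only.
rfp = cell_cycle_fractions([90, 100, 100, 110, 140, 200, 200, 210], g1_peak=100)
test = cell_cycle_fractions([40, 90, 100, 110, 140, 200, 210, 300], g1_peak=100)
print(shift_vs_control(test, rfp))
```

A positive shift in the sub-G1 class with a negative shift in G2/M would correspond to the kind of profile described in the results as fewer G2/M events with a corresponding increase in sub-G1 events.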


Cellomics Mitotic Index HitKit Procedure

    • To Packard Viewplates containing pre-aliquoted dsRNA samples (1000 ng/well) add 35 μl of logarithmically growing D.Mel-2 cells diluted to 2.3×10⁵ cells/ml in fresh Drosophila-SFM/glutamine/Pen-Strep pre-warmed to 28° C.
    • Incubate the cells with the dsRNA (60 nM) in a humid chamber at 28° C. for 1 hr.
    • Add 100 μl Drosophila-SFM/glutamine/Pen-Strep pre-warmed to 28° C. and return the cells containing the dsRNA to the humid chamber at 28° C. for 72 hrs.
    • Gently remove the medium and slowly add 100 μl Fixation Solution (3.7% formaldehyde, 1.33 mM CaCl2, 2.69 mM KCl, 1.47 mM KH2PO4, 0.52 mM MgCl2-6H2O, 137 mM NaCl, 8.50 mM Na2HPO4-7H2O) pre-warmed to 28° C. Incubate in the fume hood for 15 minutes. It is imperative to use care when manipulating cells before and during fixation.
    • Remove the Fixation Solution and wash with 100 μl Wash Buffer (1.33 mM CaCl2, 2.69 mM KCl, 1.47 mM KH2PO4, 0.52 mM MgCl2-6H2O, 137 mM NaCl, 8.50 mM Na2HPO4-7H2O).
    • Remove the Wash buffer, add 100 μl Permeabilisation Buffer (30.8 mM NaCl, 0.31 mM KH2PO4, 0.57 mM Na2HPO4-7H2O, 0.02% Triton X-100), and incubate for 15 minutes.
    • Remove the Permeabilisation Buffer and wash with 100 μl Wash Buffer.
    • Remove the Wash Buffer and add 50 μl of Staining Solution (1 μg/ml Hoechst 33258, 1.33 mM CaCl2, 2.69 mM KCl, 1.47 mM KH2PO4, 0.52 mM MgCl2-6H2O, 137 mM NaCl, 8.50 mM Na2HPO4-7H2O) per well. Incubate for 1 hour protected from the light.
    • Remove the Staining Solution and wash twice with 100 μl Wash Buffer.
    • Remove the Wash Buffer and replace with 200 μl Wash Buffer containing 0.02% sodium azide.


Seal the plates and analyse the transfection efficiency using the ArrayScan HCS System, running the Application protocol Percent_Transfection20060210×_p2.0 with the 10× objective and the QuadBGRFR filter set.

TABLE 6
Results of FACS, Mitotic Index, and Cell phenotype assays after siRNA gene knockdown in Dmel-2 cells

RNAi primers

Example number | Fly Line | Drosophila gene | RNA ID | RNAi primer
1 | 464 | CG15319 | 452 | TAATACGACTCACTATAGGGAGAACGGCACTTCTTTTTCTGTCACCT (SEQ ID NO:3)
1 | 464 | CG15319 | 453 | TAATACGACTCACTATAGGGAGAATGATGAGCAGCTCCAGCAGTCTCT (SEQ ID NO:4)
2 | 492 | CG2028 | 458 | TAATACGACTCACTATAGGGAGAGAAGCGGATCGTTTGGCGACATTTA (SEQ ID NO:5)
2 | 492 | CG2028 | 459 | TAATACGACTCACTATAGGGAGAAGATGGGCATTGATCGAGGCATAGC (SEQ ID NO:6)
2A | ccr-a2 | CG3011 | 598 | TAATACGACTCACTATAGGGAGATGGCAACGAGTACATCGACCGCATA (SEQ ID NO:7)
2A | ccr-a2 | CG3011 | 599 | TAATACGACTCACTATAGGGAGATACCTTGTCTCCATTGGCCTTGGTG (SEQ ID NO:8)
2B | ewv-b | CG2446 | 602 | TAATACGACTCACTATAGGGAGACCCCAAGGCGATAGATACCACGATA (SEQ ID NO:9)
2B | ewv-b | CG2446 | 603 | TAATACGACTCACTATAGGGAGAATCTCTGGTATGGCCATCAGGCACT (SEQ ID NO:10)
2C | Fs(1)06 | CG15309 | 608 | TAATACGACTCACTATAGGGAGAGGTGAAGACGTTTCAGGCCTATCTA (SEQ ID NO:11)
2C | Fs(1)06 | CG15309 | 609 | TAATACGACTCACTATAGGGAGATCCCAGCCGTTCTCCTTGATCATGT (SEQ ID NO:12)
3 | 167 | CG15305 | 462 | TAATACGACTCACTATAGGGAGATATGTGCATCCATTCGAAAGACTTT (SEQ ID NO:13)
3 | 167 | CG15305 | 463 | TAATACGACTCACTATAGGGAGAATAGGGGAGGTTGTTCTTAGATTGA (SEQ ID NO:14)
4 | 224 | CG2096 | 468 | TAATACGACTCACTATAGGGAGATGAAACCATCCGAGAAGAAGGCCAA (SEQ ID NO:15)
4 | 224 | CG2096 | 469 | TAATACGACTCACTATAGGGAGACAGATAATCATCAAATGCAGGAATC (SEQ ID NO:16)
4 | 224 | CG2222 | 464 | TAATACGACTCACTATAGGGAGAACGGAATGAACTATTTTCCGAACTATTACT (SEQ ID NO:17)
4 | 224 | CG2222 | 465 | TAATACGACTCACTATAGGGAGAGATGTACTGACTGTTGGTGCGCACT (SEQ ID NO:18)
5 | 231 | CG2941 | 470 | TAATACGACTCACTATAGGGAGAATCTGTAGACAGACGGCAGAATTGC (SEQ ID NO:19)
5 | 231 | CG2941 | 471 | TAATACGACTCACTATAGGGAGACGCAATAGCAGTACTTCCATCTTGT (SEQ ID NO:20)
5 | 231 | CG2938 | 474 | TAATACGACTCACTATAGGGAGAATTGGATTGCGAATCGCTCAGGATC (SEQ ID NO:21)
5 | 231 | CG2938 | 475 | TAATACGACTCACTATAGGGAGATTTTCGCGAAGGACATCAATATCAG (SEQ ID NO:22)
6 | 248 | CG6998 | 476 | TAATACGACTCACTATAGGGAGAGGCCTACATCAAGAAGGAGTTCGAC (SEQ ID NO:23)
6 | 248 | CG6998 | 477 | TAATACGACTCACTATAGGGAGATGGTTAGTTGTATTTGCGAATCTTC (SEQ ID NO:24)
8 | ms(1)04 | CG1524 | 482 | TAATACGACTCACTATAGGGAGAGTTGCTGATCGACAAACAAACCCAG (SEQ ID NO:25)
8 | ms(1)04 | CG1524 | 483 | TAATACGACTCACTATAGGGAGACTTTCCAGATACTGCCATCTACAGA (SEQ ID NO:26)
8 | ms(1)04 | CG10778 | 484 | TAATACGACTCACTATAGGGAGAGAGTGTCGCGTGTAGAGGCATTCTT (SEQ ID NO:27)
8 | ms(1)04 | CG10778 | 485 | TAATACGACTCACTATAGGGAGAAAGTACACATGGACGGAGCGGATAG (SEQ ID NO:28)
9 | thb-a | CG1453 | 556 | TAATACGACTCACTATAGGGAGAGGCTGCCGTTTTTCCTTTTTGTTATCC (SEQ ID NO:29)
9 | thb-a | CG1453 | 557 | TAATACGACTCACTATAGGGAGATGATCCTTCCTCTTTGACTCCACCTGTT (SEQ ID NO:30)
9 | thb-a | CG18292 | 558 | TAATACGACTCACTATAGGGAGACGCTAAAAACTAGTAGTTTTGTGTGCCAGG (SEQ ID NO:31)
9 | thb-a | CG18292 | 559 | TAATACGACTCACTATAGGGAGAACCACCATTGCTGGAGCACATGTTG (SEQ ID NO:32)
9A | ms(1)13 | CG5941 | 610 | TAATACGACTCACTATAGGGAGAGGATTAGCACCGTCGACCACGAAAA (SEQ ID NO:33)
9A | ms(1)13 | CG5941 | 611 | TAATACGACTCACTATAGGGAGAAATTTCCTGTGTGGATAACGTGAGGAGTCC (SEQ ID NO:34)
10 | 187 | CG10701 | 490 | TAATACGACTCACTATACGGGAGACGTTCCTGCTGTTTGGCATTCTTCT (SEQ ID NO:35)
10 | 187 | CG10701 | 491 | TAATACGACTCACTATAGGGAGAACCACAATAAGACCACCCACACAGC (SEQ ID NO:36)
10 | 187 | CG10648 | 488 | TAATACGACTCACTATAGGGAGACACCTTCTGCCGCCATGAGTACAAT (SEQ ID NO:37)
10 | 187 | CG10648 | 489 | TAATACGACTCACTATAGGGAGATTCCGCCTCCAGAGCCTTGTTGAAA (SEQ ID NO:38)
11 | 226 | CG2865 | 492 | TAATACGACTCACTATAGGGAGATCAAGGCGTCCATGATCACCTCGAAAT (SEQ ID NO:39)
11 | 226 | CG2865 | 493 | TAATACGACTCACTATAGGGAGAACCTGTCCAGCTGCAACTTGGTCAA (SEQ ID NO:40)
11 | 226 | CG2854 | 494 | TAATACGACTCACTATAGGGAGAGGAGATGGAAAAGGAGCTCGGAAAA (SEQ ID NO:41)
11 | 226 | CG2854 | 495 | TAATACGACTCACTATAGGGAGATCTCAATCCGTATGCCAAGGAGCAC (SEQ ID NO:42)
11 | 226 | CG2845 | 496 | TAATACGACTCACTATAGGGAGAAGTTGACCTCCAAGCTCCACGAACT (SEQ ID NO:43)
11 | 226 | CG2845 | 497 | TAATACGACTCACTATAGGGAGACTGGTGCTTGATGTGTGTCCTAATG (SEQ ID NO:44)
12 | 269 | CG1696 | 500 | TAATACGACTCACTATAGGGAGACACTTGGCGATTGAACATGAAACAA (SEQ ID NO:45)
12 | 269 | CG1696 | 501 | TAATACGACTCACTATAGGGAGAATATAAAAAGCCCCCAAAAGAATTG (SEQ ID NO:46)
12 | 269 | CG1486 | 502 | TAATACGACTCACTATAGGGAGAATTGCACTTTGATTGCAGTCGATTGCG (SEQ ID NO:47)
12 | 269 | CG1486 | 503 | TAATACGACTCACTATAGGGAGAGATGTGGAATGGTGTGACCGTAGTG (SEQ ID NO:48)
13 | 291 | CG10798 | 504 | TAATACGACTCACTATAGGGAGAGACAGGCATATAACTCAGGAACTTA (SEQ ID NO:49)
13 | 291 | CG10798 | 505 | TAATACGACTCACTATAGGGAGACTTGATGATCACCGGCATGTTCTCG (SEQ ID NO:50)
15 | 379 | CG10964 | 552 | TAATACGACTCACTATAGGGAGACGGAGTGCCGTCGTAGTTGACAAAA (SEQ ID NO:51)
15 | 379 | CG10964 | 553 | TAATACGACTCACTATAGGGAGATGACCAAGGACCAAGGCCTCAATGT (SEQ ID NO:52)
15 | 379 | CG2151 | 554 | TAATACGACTCACTATAGGGAGAAGCCCACTGTGATGGTGCGTTCTAT (SEQ ID NO:53)
15 | 379 | CG2151 | 555 | TAATACGACTCACTATAGGGAGAATCTCATCGGCTCCGAACTGCTTGA (SEQ ID NO:54)
17 | 121 | CG10988 | 560 | TAATACGACTCACTATAGGGAGACATTTAAGCAAAATGATTGCCGCCAATAGT (SEQ ID NO:55)
17 | 121 | CG10988 | 561 | TAATACGACTCACTATAGGGAGATCTCAATCCGATGCTGGACTGTGTG (SEQ ID NO:56)
18 | 237 | CG1558 | 562 | TAATACGACTCACTATAGGGAGAGCCCAGAAGGAGCAGCAAAAGTTCT (SEQ ID NO:57)
18 | 237 | CG1558 | 563 | TAATACGACTCACTATAGGGAGATAAGTTACCTGCATCGAGGCATTGT (SEQ ID NO:58)
18 | 237 | CG11697 | 564 | TAATACGACTCACTATAGGGAGAATGATTTATGCGATCGTGATACACA (SEQ ID NO:59)
18 | 237 | CG11697 | 565 | TAATACGACTCACTATAGGGAGACCGCTTCTCTTCCAACTGCCTTTTG (SEQ ID NO:60)
19 | 171 | CG3954 | 566 | TAATACGACTCACTATAGGGAGAGGAGCCGAGTACATCAATGCCAACT (SEQ ID NO:61)
19 | 171 | CG3954 | 567 | TAATACGACTCACTATAGGGAGAATGTAGGTCTTAAACATCTCGCGCT (SEQ ID NO:62)
19 | 171 | CG16903 | 568 | TAATACGACTCACTATAGGGAGAGGAAATCTCGCCCATGGTGCTAGAT (SEQ ID NO:63)
19 | 171 | CG16903 | 569 | TAATACGACTCACTATAGGGAGATGTTCCGATCCACGGTGATTACAGC (SEQ ID NO:64)
20 | 500 | CG4399 | 570 | TAATACGACTCACTATAGGGAGATGCCCCCCTGGATGATAATGCCAAT (SEQ ID NO:65)
20 | 500 | CG4399 | 571 | TAATACGACTCACTATAGGGAGAACTTGCAGCTCGTGACTCTGATGCT (SEQ ID NO:66)
20 | 500 | CG4406 | 572 | TAATACGACTCACTATAGGGAGAATGCTTGTTAAATTTGTTGTCATCTTTTGCC (SEQ ID NO:67)
20 | 500 | CG4406 | 573 | TAATACGACTCACTATAGGGAGAATCTCCTCCGAGTCCTGGAACTTGA (SEQ ID NO:68)
23 | 37 | CG16983 | 580 | TAATACGACTCACTATAGGGAGAATGCCCAGCATCAAGTTGCAATCTT (SEQ ID NO:69)
23 | 37 | CG16983 | 581 | TAATACGACTCACTATAGGGAGACGAAATGCCGCGCTTTACTTCTCCT (SEQ ID NO:70)
23 | 37 | CG13363 | 582 | TAATACGACTCACTATAGGGAGATCCGATACCTGCGCGTCTTTGACAA (SEQ ID NO:71)
23 | 37 | CG13363 | 583 | TAATACGACTCACTATAGGGAGAGCCATTATTACCAGGTCCACTGCTG (SEQ ID NO:72)
24 | 186 | CG18319 | 584 | TAATACGACTCACTATAGGGAGACTCAACGAGAAGGTCCAGACTCAAC (SEQ ID NO:73)
24 | 186 | CG18319 | 585 | TAATACGACTCACTATAGGGAGATCGACGGCATATTTCTGGGTCCACT (SEQ ID NO:74)
25 | 301 | CG14813 | 586 | TAATACGACTCACTATAGGGAGAAATGTGCAGCCTTCGGTGGCGGAGTACGAC (SEQ ID NO:75)
25 | 301 | CG14813 | 587 | TAATACGACTCACTATAGGGAGACAATTACTCGCTCTGAGAAGCTGTC (SEQ ID NO:76)
26 | 148 | CG8655 | 590 | TAATACGACTCACTATAGGGAGAATGCCCTTCATGGCACATGACCGAT (SEQ ID NO:77)
26 | 148 | CG8655 | 591 | TAATACGACTCACTATAGGGAGATTGCTGCTCTTGCTGCACTAGCTGT (SEQ ID NO:78)
27 | 335 | CG2621 | 594 | TAATACGACTCACTATAGGGAGAAATAATAATAACAACGTTATAAGCCAGCCG (SEQ ID NO:79)
27 | 335 | CG2621 | 595 | TAATACGACTCACTATAGGGAGATAATGCGGCTGCGCAAGATGCTGTT (SEQ ID NO:80)
28 | 342 | CG1725 | 528 | TAATACGACTCACTATAGGGAGAGCCACGTTGAAATCGATCACCGACA (SEQ ID NO:81)
28 | 342 | CT4934 | 529 | TAATACGACTCACTATAGGGAGAATAGAAGGAGTTGGCGGGTGGAGAT (SEQ ID NO:82)
28 | 342 | CT41310 | 530 | TAATACGACTCACTATAGGGAGATCTCTTTCGATTTCTTCTCTTCTGT (SEQ ID NO:83)
28 | 342 | CT41310 | 531 | TAATACGACTCACTATAGGGAGATTGATGAACACGGCGACGGGATACA (SEQ ID NO:84)
28 | 342 | CG1594 | 532 | TAATACGACTCACTATAGGGAGAAGGGAATCGTGTGGAAAGACTCGCA (SEQ ID NO:85)
28 | 342 | CG1594 | 533 | TAATACGACTCACTATAGGGAGAACAAGGACAAATCAACGGGACTGGC (SEQ ID NO:86)
29 | 419 | CG12638 | 596 | TAATACGACTCACTATAGGGAGATGTTTGCCATATCATTGCAGCTGCT (SEQ ID NO:87)
29 | 419 | CG12638 | 597 | TAATACGACTCACTATAGGGAGAGATGTCATATTGGCCAGGTCACTGG (SEQ ID NO:88)

RNAi phenotype

Example number | FACS | Mitotic Index (% of RFP control) | Microscopy | Human homologue
1 | Fewer G1 cells, with corresponding increase in G2/M | wt | wt | AAC51331 - CREB-binding protein
2 | Fewer cells in G2/M, with a corresponding increase in sub-G1 events |  | 20% increase in chromosomal defects. Some bright spots scattered in the cytoplasm in the DAPI channel, most of the nuclei are irregularly shaped, MI decreases, and DNA appears hypocondensed; Shape of the cells is also very affected. | P48729 Casein kinase 1, alpha isoform
2A | wt | 91% | 12% increase in chromosomal defects; Multipolar and tripolar spindles | AAA63258 - serine hydroxymethyltransferase
2B | wt | 74% | wt | none
2C | wt | 111% | 20% increase in chromosomal defects; spindle defects, some bipolar spindle | AAL09354 DiGeorge syndrome-related protein FKSG4
3 | Very slightly fewer cycling cells & a corresponding increase in sub-G1 cells | wt | 20% increase in chromosomal defects; Difficult to see a normal spindle | None
4 | wt | wt | 20% increase in chromosomal defects, no defects in centrosomes or spindle | NP_002700 protein phosphatase 1
4 | wt | Not done | 40% increase in chromosomal defects; Multipolar and monopolar spindles; Many polyploid cells; Some hyper-condensed chromosomes | NP_073607 hypothetical protein FLJ13912
5 | Fewer cells in G2/M, with a corresponding increase in sub-G1 events | wt | wt | None
5 | wt | wt | 10% increase in chromosomal defects; Fewer cells indicating cell death; Multipolar spindles | NP_075051 Cas1 O-acetyltransferase
6 | Very slightly fewer cells in G2/M & a corresponding increase in sub-G1 cells | wt | wt | AAH10744 Similar to RIKEN cDNA 6720463E02 gene
8 | Fewer G2/M events, with a corresponding increase in sub-G1 events and a different G1 profile | 63% | Only 38 mitotic cells remained on the slide, cells are very scattered and some are dying. Nuclei are degraded. | A25220 ribosomal protein S14
8 | wt | 78% | 20% increase in chromosomal defects; High number of multipolar spindles | hypothetical protein FLJ13102 (54%) Similarity to Mouse kinesin-like protein KIF4
9 | Slight increase in G1 and sub-G1 cells, but no obvious corresponding decrease in S or G2/M cells | wt | wt | (CG1453) - CAA69621 - kinesin-2
9 | wt | 91% | 20% increase in chromosomal defects; Possible decrease in mitotic index; Some multipolar spindles, few normal looking spindles | BAA22937 - cdk2-associated protein 1; cdk2ap1, deleted in oral cancer 1
9A | Very slight decrease in G1 peak, but no other obvious variation from wt profile | wt | wt | MCT-1 (multiple copies in a T-cell malignancies) (BAA86055)
10 | Fewer G2/M events with a corresponding increase in sub-G1 events | wt | 20% increase in chromosomal defects, misaligned chromosome (40%), spindle with free extracentrosome, cells with more than one spindle. | A41289 human moesin
10 | wt | wt | Proportion of mitotic chromosomal defects a bit lower than normal, high proportion of monopolar spindles and small spindles. Very high proportion of prometaphase cells; Cell death | NP_115898 Mak16-like RNA binding protein
11 | Fewer cells in G2/M and also S. Increased percentage of cells in sub-G1 and G1 | wt | wt | none
11 | wt | wt | 17% increase in chromosomal defects; Higher level of polyploid, prometaphase cells and misaligned chromosomes, anaphase normal | CAD38627 hypothetical protein
11 | wt | wt | More than 20% increase in chromosomal defects; More multipolar spindles | AAA35609 B-raf protein
12 | Fewer cells in G2/M and also S. Increased percentage of cells in sub-G1 and G1 | wt | wt | NP_056158 hypothetical protein
12 | wt | wt | 10% increase in chromosomal defects; More prometaphase cells | BAA19780 Similar to a C. elegans protein in cosmid C14H10
13 | Fewer cells in G2/M. Increased percentage of cells in sub-G1 and G1 | wt | wt | CAA23831 c-myc oncogene
15 | wt | wt | 15% increase in chromosomal defects; high number of disorganised spindles | AAC50725 11-cis retinol dehydrogenase
15 | wt | 81% | 20% increase in chromosomal defects; High proportion of polyploid cells | XP_033135 thioredoxin reductase beta
17 | wt | wt | 22% increase of chromosomal defects; Main feature is a high proportion of metaphase figures with misaligned chromosomes (75% vs 20% in normal cells); Some cells without any centrosomes | AAC39727 - spindle pole body protein spc98 homolog GCP3
18 | wt | 117% | 18% increase in chromosomal defects; Abnormal spindle structures (increased number of centrosomes) | none
18 | Fewer G2/M events, with a corresponding increase in sub-G1 events. Also a different G1 profile from wt. | wt | 18% increase in chromosomal defects; More polyploid cells | BAB14444 unnamed protein - similar to a hypothetical protein in the region deleted in human familial adenomatous polyposis 1
19 | Very slight increase in G1 and sub-G1 cells, but no obvious corresponding decrease in S or G2/M cells | 45% | 20% increase in chromosomal defects; Spindle and centrosome seem normal. Higher level of aneuploidy and polyploidy | AAH08692 - protein tyrosine phosphatase, non-receptor type 11
19 | wt | wt | 20% increase in chromosomal defects; Clear decrease in mitotic index; A lot of spindles seem to be affected in their structure, poles not well defined and microtubule array irregular; Many cells with fused interphase or decondensed nuclei | AAD53184 - cyclin L. ania-6a
20 | Fewer cells in G2/M, with a corresponding increase in sub-G1 events. Also a different G1 profile from wt. | 88% | wt | AAF13722 - neurofilament protein
20 | Slight decrease in G2/M and corresponding slight increase in sub-G1 cells. | wt | wt | XP_131206 similar to GPI-anchor transamidase
23 | Significant decrease in sub-G1 & G1 peaks, with a corresponding increase in the G2/M peak, indicating mitotic arrest. | wt | 30% increase in chromosomal defects; All types of spindle and chromosomal defects are visible but no obvious main one; Higher proportion of aneuploid and polyploid cells; Possible decrease in mitotic index; Cells with excess centrosomes | XP_054159 - hypothetical protein
23 | wt | wt | 40% increase in chromosomal defects; A lot of polyploid cells, multicentrosome but some normal spindle also | NP_057112 CGI-85 protein
24 | Significant decrease in sub-G1 & G1 peaks, but no corresponding increase in the G2/M peak. Probably indicates mitotic arrest. | 91% | 30% increase in chromosomal defects; Various chromosomal defects ranging from number of centrosomes, spindle structure and stretched/lagging chromatids; High number of abnormal anaphases 75% of anaphases (compared to 10-15% in normal cells) | BAA11675 - ubiquitin-conjugating enzyme E2 UbcH-ben
25 | Fewer G1 events, with an increased number of cells in G2/M indicating mitotic arrest | 81% | Cell death; Lower proportion of chromosomal defects | CAA57071 - archain
26 | Very slight decrease in G1 and G2/M peaks, but no significant increase in sub-G1 cells or polyploid cells. | wt | 40% increase in chromosomal defects; Some chromosomal defects in spindle structure but no clear single phenotype | AAB97512 - HsCdc7
27 | wt | wt | 20% increase in chromosomal defects; Many obvious mitotic chromosomal defects and too many centrosomes per cell; Very difficult to find a normal looking mitotic spindle; Most of the anaphases are abnormal with lagging chromosomes | NP_002084 - glycogen synthase kinase 3 beta
28 | Essentially wt profile. Very slight reduction in G1 peak, but no obvious corresponding increase in other peaks |  | No increase in chromosomal defects but many with more than two centrosomes | XP_012060 - discs, large (Drosophila) homolog 2
28 | Very slight reduction in G1 peak, with a corresponding increase in sub-G1 cells. | wt | 20% increase in chromosomal defects; Polyploid cells; Abnormal number of centrosomes in many cells but some normal bipolar spindles | NP_004963 JAK-2 kinase (Janus kinase 2), involved in cytokine receptor signaling
29 | Decrease in the number of cells in G2/M, with an increase in the sub-G1 population. The G1 peak differs in profile from wt. | 94% | wt | B38637 - Ras inhibitor (clone JC265) - human (fragment)


Examples Section B
P-Element Screening Results

The layout of a typical entry in the results section is shown below. Not all fields present in the actual results section contain information for each individual Drosophila line described.


Results Layout (Examples 1 to 29)


Line ID


(Drosophila line designation)


Phenotype


(Description of Drosophila phenotype)


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)


(Accession number, map position according to the Bridges map, Lefevre, 1976)


P element insertion site


(Base pair position within genomic segment)


Annotated Drosophila Genome Complete Genome candidate


(Derived from the GADFLY database of the Berkeley Drosophila Genome Project: accession number, mRNA sequence (complete CDS) and peptide sequence)


Human homologue of Complete Genome candidate


(Derived from Blink and BLAST searches, accession number, mRNA sequence (complete CDS) and peptide sequence)


Putative function


(Derived from homologies or Drosophila experimental data)
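The field layout above can be held as a simple record when the entries are processed by software. The following is a minimal sketch only, assuming that each entry is written as a field name and a value separated by a long dash; the names `ResultEntry`, `parse_field_line`, `parse_segment` and `parse_insertion_site` are hypothetical and are not part of the original disclosure. The values used in the usage note are those given for line 231.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: field name and value are separated by this dash character.
SEPARATOR = "\u2014"


@dataclass
class ResultEntry:
    """One results entry; any field other than the line ID may be absent."""
    line_id: str
    phenotype: Optional[str] = None
    genomic_segment: Optional[str] = None    # accession number of the segment
    map_position: Optional[str] = None       # position on the Bridges map
    insertion_site: Optional[int] = None     # base pair position in the segment
    genome_candidate: Optional[str] = None
    human_homologue: Optional[str] = None
    putative_function: Optional[str] = None


def parse_field_line(line: str) -> Tuple[str, str]:
    """Split 'Field name<dash>value' at the first separator."""
    name, _, value = line.partition(SEPARATOR)
    return name.strip(), value.strip()


def parse_segment(text: str) -> Tuple[str, Optional[str]]:
    """Split 'AE003429 (3F)' into accession number and map position."""
    accession, _, rest = text.partition("(")
    position = rest.rstrip(") ").strip()
    return accession.strip(), position or None


def parse_insertion_site(text: str) -> int:
    """Convert a position such as '153,730' to an integer."""
    return int(text.replace(",", ""))


accession, position = parse_segment("AE003429 (3F)")
entry = ResultEntry(
    line_id="231",
    genomic_segment=accession,
    map_position=position,
    insertion_site=parse_insertion_site("153,730"),
)
```

Fields that carry no information for a given line simply remain `None`, which mirrors the statement that not all fields are populated for every line.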


A specific example is as follows (Example 5, Category 2):


Line ID—231


Phenotype—Semi-lethal male and female, cytokinesis defect. In some cysts, variable sized Nebenkerns


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003429 (3F)


P element insertion site—153,730


Annotated Drosophila genome Complete Genome candidate—CG5014—vap-33-1 vesicle associated membrane protein

(SEQ ID NO:124)CACATCACTAGCTGACAGAATATATGGCTTTTTTACATTTTGCGTTTTCAACTGAAGTTTGCGAAGAAACCGAAGCGTGGTAAACCACTGAAATCGAAAATATCGACAGAAAAGCGACCTAAAGTCGGTGAAGAAGTCGGACGTTGATCGTTGTGTTTTTTTCCCGAAATTTTCTGCAAAAAGCCCGTGCGTGCGTGAGTTTCTCTGGCTCTTGGTTTTTTTTTGTCCATGCGTGTGTGTGTGGTGGCATAAATTTACCGATATTTCGCCTGTGAGAGCGAAACGAACGAAAAAGGAAAGAAAAAAAGAGAGACGAGTAAAGTAAAACGAAACAGGCATAAAAACAGCAGCAGTTTTCTTGATATATTTGGCTAAAAAACGCAAACCAAACAGCGAGCAAGAACAACAAATAGCTGGGCAAAAACAGGACGCACAAAAAATAAAATTAAAACGATAAGAGGCGAAAAGCGGAGAGAGTGAAATTCTCGGCAGCAACAACGACAAGAACAACACCAGGAGCAGCAGCAACAACAACAAGAAAAGCCAGCCGCCACAATGAGCAAATCACTCTTTGATCTTCCGTTGACCATTGAACCAGAACATGAGTTGCGTTTTGTGGGTCCCTTGACCCGACCCGTTGTCAGAATCATGACTCTGGGCAAGAACTCGGCTCTGCCTCTGGTCTTCAAGATCAAGACAACCGCCCCGAAACGCTACTGCGTACGTCCAAACATCGGCAAGATAATTCCCTTTCGATCAACCCAGGTGGAGATCTGCCTTCAGCCATTCGTCTACGATCAGCAGGAGAAGAACAAGCACAAGTTCATGGTGCAGAGCGTCCTGGGACCCATGGATGGTGATCTAAGCGATTTAAATAAATTGTGGAAGGATCTGGAGCCCGAGCAGGTGATGGACGCCAAACTGAAGTGCGTTTTCGAGATGCCCACCGCTGAGGCAAATGCTGAGAACACCAGCGGTGGTGGTGCCGTTGGCGGCGGAACCGGAGCTGGGGGAGGCGGAAGCGCGGGTGCCAATACTAGCTCAGCCAGCGCTGAGGCGCTCGAGAGGAAGCCGAAGCTCTCCAGCGAGGATAAGTTTAAGCGATCCAATTTGCTCGAAACGTCTGAGAGTGTGGACTTGCTGTGCGGAGAGATCAAAGCGCTGCGTGAATGCAACATTGAATTGCGAAGAGAGAATCTTCACTTGAAGGATCAAATCACACGTTTCCGGAGCTCGCCGGCCGTCAAACAGGTGAATGAGCCCTATGCCCCAGTCCTGGCTGAGAAGCAGATTCCGGTCTTTTACATTGCAGTTGCCATTGCTGCGGCCATCGTTAGCCTCCTGCTGGGCAAATTCTTTCTCTGA(SEQ ID NO:125)MSKSLFDLPLTIEPEHELRFVGPFTRPVVTIMTLRNNSALPLVFKIKTTAPKRYCVRPNIGKIIPFRSTQVEICLQPFVYDQQEKNKHKFMVQSVLAPMDADLSDLNKLWKDLEPEQLMDAKLKCVFEMPTAEANAENTSGGGAVGGGTGAAGGGSAGANTSSASAEALESKPKLSSEDKFKPSNLLETSESLDLLSGEIKALRECNIELRRENLHLKDQITRFRSSPAVKQVNEPYAPVLAEKQIPVFYIAVAIAAAIVSLLLGKFFL


Human homologue of Complete Genome candidate


AAD13577 VAMP-associated protein B

(SEQ ID NO:126)1gcgcgcccac ccggtagagg acccccgccc gtgccccgaccggtccccgc ctttttgtaa61aacttaaagc gggcgcagca ttaacgcttc ccgccccggtgacctctcag gggtctcccc121gccaaaggtg ctccgccgct aaggaacatg gcgaaggtggagcaggtcct gagcctcgag181ccgcagcacg agctcaaatt ccgaggtccc ttcaccgatgttgtcaccac caacctaaag241cttggcaacc cgacagaccg aaatgtgtgt tttaaggtgaagactacagc accacgtagg301tactgtgtga ggcccaacag cggaatcatc gatgcaggggcctcaattaa tgtatctgtg361atgttacagc ctttcgatta tgatcccaat gagaaaagtaaacacaagtt tatggttcag421tctatgtttg ctccaactga cacttcagat atggaagcagtatggaagga ggcaaaaccg481gaagacctta tggattcaaa acttagatgt gtgtttgaattgccagcaga gaatgataaa541ccacatgatg tagaaataaa taaaattata tccacaactgcatcaaagac agaaacacca601atagtgtcta agtctctgag ttcttctttg gatgacaccgaagttaagaa ggttatggaa661gaatgtaaga ggctgcaagg tgaagttcag aggctacgggaggagaacaa gcagttcaag721gaagaagatg gactgcggat gaggaagaca gtgcagagcaacagccccat ttcagcatta781gccccaactg ggaaggaaga aggccttagc acccggctcttggctctggt ggttttgttc841tttatcgttg gtgtaattat tgggaagatt gccttgtagaggtagcatgc acaggatggt901aaattggatt ggtggatcca ccatatcatg ggatttaaatttatcataac catgtgtaaa961aagaaattaa tgtatgatga catctcacag gtcttgcctttaaattaccc ctccctgcac1021acacatacac agatacacac acacaaatat aatgtaacgatcttttagaa agttaaaaat1081gtatagtaac tgattgaggg ggaaaagaat gatctttattaatgacaagg gaaaccatga1141gtaatgccac aatggcatat tgtaaatgtc attttaaacattggtaggcc ttggtacatg1201atgctggatt acctctctta aaatgacacc cttcctcgcctgttggtgct ggcccttggg1261gagctggagc ccagcatgct ggggagtgcg gtcagctccacacagtagtc cccacgtggc1321ccactcccgg cccaggctgc tttccgtgtc ttcagttctgtccaagccat cagctccttg1381ggactgatga acagagtcag aagcccaaag gaattgcactgtggcagcat cagacgtact1441cgtcataagt gagaggcgtg tgttgactga ttgacccagcgctttggaaa taaatggcag1501tgctttgttc acttaaaggg accaagctaa atttgtattggttcatgtag tgaagtcaaa1561ctgttattca gagatgttta atgcatattt aacttatttaatgtatttca tctcatgttt1621tcttattgtc acaagagtac agttaatgct gcgtgctgctgaactctgtt gggtgaactg1681gtattgctgc tggagggctg tgggctcctc tgtctctggagagtctggtc atgtggaggt1741ggggtttatt gggatgctgg 
agaagagctg ccaggaagtgttttttctgg gtcagtaaat1801aacaactgtc ataggcaggg aaattctcag tagtgacagtcaactctagg ttaccttttt1861taatgaagag tagtcagtct tctagattgt tcttataccacctctcaacc attactcaca1921cttccagcgc ccaggtccaa gtttgagcct gacctccccttggggaccta gcctggagtc1981aggacaaatg gatcgggctg caaagggtta gaagcgagggcaccagcagt tgtgggtggg2041gagcaaggga agagagaaac tcttcagcga atccttctagtactagttga gagtttgact2101gtgaattaat tttatgccat aaaagaccaa cccagttctgtttgactatg tagcatcttg2161aaaagaaaaa ttataataaa gccccaaaat taaga(SEQ ID NO:127)1makveqvlsl epqhelkfrg pftdvvttnl klgnptdrnvcfkvkttapr rycvrpnsgi61idagasinvs vmlqpfdydp nekskhkfmv qsmfaptdtsdmeavwkeak pedlmdsklr121cvfelpaend kphdveinki isttasktet pivskslssslddtevkkvm eeckrlqgev181qrlreenkqf keedglrmrk tvqsnspisa laptgkeeglstrllalvvl ffivgviigk241ial


Putative function


Membrane associated protein which may be involved in priming synaptic vesicles
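The relationship between the mRNA sequence and the peptide sequence of an entry can be checked by translating the coding region with the standard genetic code. The following is a minimal sketch, assuming that the coding sequence has already been located and is supplied in frame; the function name `translate` is hypothetical. It is shown with the first nine codons and the last five codons of the coding region of SEQ ID NO:124.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Raises KeyError if a codon contains a character other than A, C, G, T or U.
    """
    cds = cds.upper().replace("U", "T")
    residues = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":      # stop codon
            break
        residues.append(residue)
    return "".join(residues)


# Start of the coding region of SEQ ID NO:124 (from the initiating ATG).
start = translate("ATGAGCAAATCACTCTTTGATCTTCCG")
# End of the coding region of SEQ ID NO:124, including the TGA stop codon.
end = translate("AAATTCTTTCTCTGA")
```

Here `start` corresponds to the first nine residues of SEQ ID NO:125 and `end` to its last four residues. A full-length comparison would proceed in the same way on the complete coding region.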


Results Layout for Examples 2A, 2B, 2C and 9A


The results layout for Examples 2A, 2B, 2C and 9A includes, in place of the fourth field “P Element Insertion Site”, a field “P Element Insertion Site Sequence”. This field shows the experimentally determined sequence of the insertion site, rather than the base pair position within the genomic segment that is given in the other Examples.
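Where the entries are handled by software, the two forms of the fourth field may be kept distinct. The following is a minimal sketch under that assumption; the class and function names are hypothetical, and the string `"acgt"` in the usage note is a placeholder rather than a sequence from the Examples.

```python
from dataclasses import dataclass
from typing import Union

# Examples whose fourth field is a sequence rather than a base pair position.
SEQUENCE_EXAMPLES = {"2A", "2B", "2C", "9A"}


@dataclass(frozen=True)
class InsertionPosition:
    base_pair: int      # position within the genomic segment


@dataclass(frozen=True)
class InsertionSequence:
    sequence: str       # experimentally determined insertion site sequence


InsertionSite = Union[InsertionPosition, InsertionSequence]


def insertion_field_label(example_id: str) -> str:
    """Return the name of the fourth field for the given Example."""
    if example_id in SEQUENCE_EXAMPLES:
        return "P Element Insertion Site Sequence"
    return "P Element Insertion Site"


def make_insertion_site(example_id: str, raw: str) -> InsertionSite:
    """Build the appropriate record from the raw field text."""
    if example_id in SEQUENCE_EXAMPLES:
        return InsertionSequence(raw.strip().upper())
    return InsertionPosition(int(raw.replace(",", "")))


site_example_1 = make_insertion_site("1", "44,575")
site_example_2a = make_insertion_site("2A", "acgt")   # placeholder sequence
```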


Category 1—Female Sterile


Example 1 (Category 1)

Line ID—464


Phenotype—Female semi-sterile, brown eggs laid


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003448 (8F)


P element Insertion site—44,575


Annotated Drosophila genome Complete Genome candidate—CG15319-nejire (CREB binding protein, p300/CBP)

(SEQ ID NO:89)CTTAACCAAACAAACAACCTGTGCAACAATTGTCAAAGTGCTAGGCGACAAATAATTTCTGAAAGAAGATTTGACAAGTTCCAATAACGAAAATATCAGAACACACTCGAACTCCAACATAGACGGATCATTGGAGAGTTAGTGAAAAAAAAAAGCGAAAAATCAGAAAAACTTTATAAACTAATAGAAACAATACTACTCAGATTTTTCGAACGTTTTTCGTCTGCGTTTCTGTTTTTTTCCGAATCGAAAGAATCAAACTAACTCTATATGATGGCCGATCACTTAGACGAACCGCCCCAAAAGCGGGTTAAAATGGATCCAACGGATATCTCTTACTTTCTGGAGGAGAACCTGCCCGATGAGCTGGTGTCCTCGAATAGTGGCTGGTCGGATCAGCTGACCGGCGGAGCAGGCGGTGGCAATGGAGGTGGCGGCGCCTCCGGTGTAACCACAAATCCCACATCCGGCCCAAATCCCGGTGGCGGACCCAACAAGCCGGCAGCCCAAGGACCCGGCTCTGGCACAGGCGGAGTCGGTGTTGGAGTGAATGTGGGTGTCGGCGGTGTTGTTGGCGTCGGCGTTGTGCCTTCCCAGATGAACGGAGCCGGCGGCGGCAACGGATCCGGAACGGGTGGCGACGACGGCAGTGGCAACGGCTCAGGAGCGGGCAACAGAATCAGTCAAATGCAACACCAGCAACTGCAGCACCTACTCCAGCAGCAGCAGCAGGGCCAGAAGGGCGCCATGGTGGTGCCCGGCATGCAGCAGCTGGGCAGCAAGTCGCCCAACCTGCAGTCACCCAACCAGGGCGGCATGCAGCAGGTGGTGGGCACTCAGATGGGTATGGTCAACTCAATGCCCATGTCAATATCGAATAATGGCAACAATGGCATGAACGCCATACCAGGCATGAACACCATTGCGCAGGGCAATCTGGGAAACATGGTGCTGACCAACAGCGTTGGCGGCGGCATGGGCGGCATGGTTAATCATCTTAAGCAGCAGCCTGGCGGCGGCGGCGGTGGGATGATCAATTCCGTTTCAGTACCCGGAGGACCTGGAGCAGGAGCTGGTGGCGTTGGAGCTGGCGGCGGAGGAGCCGTTGCCGCAAACCAAGGCATGCATATGCAGAACGGCCCAATGATGGGACGCATGGTGGGGCAACAGCATATGCTTCGTGGCCCGCATCTCATGGGTGCCTCTGGAGGAGCTGGTGGGCCAGGAAACGGGCCTGGTGGCGGAGGACCACGCATGCAGAATCCGAACATGCAAATGACTCAACTCAACAGTCTGCCCTACGGAGTGGGTCAGTATGGTGGCCCAGGCGGTGGTAACAATCCTCAGCAACAGCAGCAGCAACAGCAGCAACAACTTCTCGCCCAGCAGATGGCCCAAAGAGGTGGCGTCGTACCGGGCATGCCGCAGGGTAATCGGCCCGTTGGCACAGTGGTGCCCATGTCCACACTCGGCGGCGATGGATCAGGGCCCGCGGGGCAGCTGGTAAGCGGGAATCCTCAGCAGCAGCAGATGCTGGCGCAGCAGCAAACCGGAGCCATGGGCCCGCGTCCTCCGCAACCAAACCAGCTGCTCGGTCATCCCGGCCAGCAGCAGCAGCAGCAACAGCAGCCTGGCACCTCGCAGCAGCAGCAACAGCAGCAGGGAGTCGGAATCGGAGGAGCAGGCGTTGTGGCCAATGCAGGAACCGTGGCTGGCGTGCCGGCAGTGGCAGGCGGCGGAGCCGGTGGTGCCGTACAATCTAGCGGCCCTGGTGGCGCCAATCGCGATGTGCCCGACGACCGTAAGCGACAGATCCAGCAGCAACTGATGCTGCTCCTCCATGCACACAAATGCAATCGCAGGGAGAACCTGAATCCGAACAGGGAAGTGTGCAACGTTAACTACTGCAAGGCGATGAAATCCGTGCTGGCCCACATGGGCACTTGCAAACAGAGCAAGGACTGCACCATGCAGCATTGTGCCTCT
TCGCGCCAAATTCTGTTGCATTATAAAACGTGCCAGAACAGTGGCTGCGTCATTTGCTATCCCTTCCGGCAGAATCATTCGGTTTTTCAAAATGCGAATGTGCCGCCAGGAGGCGGACCGGCAGGAATTGGAGGTGCGCCACCAGGTGGCGGCGGAGCGGGTGGTGGAGCGGCTGGAGCAGGCGGTAATCTTCAGCAGCAACAGCAGCAGCAACAACAGCAGCAGCAGAACCAGCAGCCCAATCTGACGGGTCTGGTAGTGGATGGCAAGCAAGGACAGCAGGTTGCACCGGGAGGTGGCCAAAATACTGCCATAGTTCTTCCCCAGCAACAGGGAGCGGGCGGTGCACCGGGTGCGCCGAAAACGCCTGCGGATATGGTGCAACAATTGACCCAACAGCAGCAGCAGCAGCAACAGCAGGTTCACCAGCAACAGGTTCAGCAACAGGAACTCCGTCGATTCGATGGCATGAGCCAGCAAGTCGTAGCAGGTGGTATGCAACAGCAGCAGCAGCAGGGTTTGCCTCCTGTGATTCGCATTCAAGGCGCTCAGCCGGCCGTCAGGGTACTGGGACCAGGTGGTCCCGGCGGCCCAAGTGGACCAAATGTTCTGCCGAACGATGTTAACAGCCTGCATCAACAACAGCAACAAATGCTGCAACAGCAGCAGCAACAGGGCCAGAATCGACGACGCGGTGGCCTGGCCACCATGGTGGAGCAACAACAGCAGCATCAGCAACAACAGCAGCAACCCAATCCCGCCCAGCTGGGTGGCAACATTCCAGCACCACTCTCTGTCAACGTCGGTGGCTTTGGCAATACCAATTTCGGTGGTGCAGCTGCCGGCGGAGCCGTGGGAGCCAACGATAAGCAGCAACTGAAGGTGGCCCAAGTGCATCCGCAGAGCCATGGCGTAGGAGCGGGCGGTGCATCAGCGGGCGCCGGGGCGAGTGGTGGTCAAGTGGCAGCCGGTTCCAGTGTCCTGATGCCAGCCGATACCACGGGCAGTGGTAATGCGGGCAATCCCAACCAGAATGCAGGCGGTGTAGCTGGAGGTGCCGGCGGTGGCAATGGCGGAAACACTGGACCTCCGGGCGACAACGAGAAAGACTGGCGGGAATCGGTGACCGCCGATCTGCGCAACCACCTCGTCCACAAACTGGTGCAGGCCATCTTCCCCACCTCGGATCCTACGACCATGCAGGACAAACGGATGCATAATCTCGTTTCATACGCGGAAAAGGTCGAGAAGGACATGTACGAAATGGCCAAGTCCAGATCGGAGTACTATCACCTGCTGGCCGAGAAGATCTACAAGATTCAAAAGGAGCTGGAGGAGAAGCGACTGAAGCGTAAGGAGCAGCATCAGCAGATGCTGATGCAGCAACAGGGCGTTGCGAATCCAGTGGCTGGAGGAGCGGCTGGCGGAGCAGGCAGTGCAGCTGGTGTAGCGGGCGGTGTAGTCTTGCCCCAGCAGCAACAGCAGCAGCAACAACAACAGCAGCAGCAGGGTCAGCAGCCTCTGCAGAGCTGTATCCATCCAAGCATCAGTCCAATGGGCGGTGTGATGCCGCCGCAGCAGCTGCGTCCACAGGGACCACCTGGAATACTGGGCCAACAGACGGCAGCAGGCCTGGGCGTCGGCGTGGGAGTGACCAACAATATGGTTACCATGCGCAGTCATTCGCCCGGTGGCAACATGCTCGCCTTGCAGCAACAACAGCGCATGCAGTTCCCGCAACAACAGCAGCAACAACCGCCAGGGTCTGGAGCCGGCAAAATGCTGGTCGGTCCACCAGGACCCAGTCCCGGTGGCATGGTGGTCAATCCCGCGCTCTCGCCTTACCAGACGACCAATGTGCTCACCAGTCCGGTGCCAGGACAGCAGCAACAGCAGCAGTTCATTAATGCGAACGGCGGCACTGGCGCCAATCCTCAACTGAGCGAAATCATGAAGCAGCGTCACATTCACCAGCAGCAGCAGCAACAACAACAGCAGCA
GCAGCAGGGAATGTTGTTGCCGCAGTCGCCATTTAGCAATTCAACACCTCTACAACAACAACAGCAGCAGCAGCAGCAACAACAGCAGCAGCAGGCGACTAGCAACAGTTTTAGCTCACCAATGCAGCAACAGCAGCAAGGTCAGCAACAGCAACAACAGAAGCCCGGCAGTGTGCTGAATAATATGCCGCCCACGCCCACGAGTCTGGAAGCCCTGAATGCGGGGGCCGGAGCGCCGGGAACTGGAGGATCCGCCTCCAATGTAACGGTTTCAGCTCCGAGCCCATCGCCTGGCTTCTTGTCCAACGGCCCGTCGATTGGCACGCCCTCCAACAATAATAATAATAGTAGTGCTAACAACAACCCGCCCTCGGTGAGCAGTCTAATGCAACAGCCGCTGAGCAATCGGCCGGGTACGCCTCCTTACATACCCGCTTCCCCAGTGCCGGCGACAAGTGCCTCCGGATTAGCGGCGAGCAGTACGCCCGCATCAGCAGCAGCCACCTGTGCGAGTAGTGGCAGTGGCAGCAATAGCAGCAGCGGAGCAACTGCAGCGGGTGCAAGTTCCACGTCATCATCTTCCTCGGCGGGCTCGGGTACACCACTCAGCTCGGTATCGACTCCTACATCGGCCACGATGGCCACCAGCAGCGGTGGTGGTGGTGGTGGTGGGGGCAATGCAGGAGGCGGATCATCCACTACGCCCGCTAGCAATCCACTGCTCCTCATGTCTGGAGGAACGGCAGGAGGCGGAACGGGAGCAACGACCACCACATCGACATCCTCGAGCAGTCGCATGATGAGCAGCTCCAGCAGTCTCTCCTCACAGATGGCTGCCCTGGAGGCTGCGGCGCGAGACAACGACGATGAGACGCCCTCGCCATCCGGCGAGAATACGAACGGCAGTGGTGGCAGTGGAAATGCCGGCGGTATGGCCTCCAAGGGCAAACTGGACTCCATTAAGCAAGATGATGATATCAAGAAGGAGTTTATGGATGACAGCTGTGGCGGAAATAACGATAGCTCGCAGATGGATTGCTCGACGGGTGGTGGCAAGGGCAAGAATGTGAACAACGACGGAACAAGCATGATCAAAATGGAGATCAAGACGGAGGATGGACTCGATGGCGAGGTAAAGATCAAAACGGAGGCCATGGATGTGGACGAGGCTGGAGGATCGACAGCCGGAGAGCATCATGGCGAAGGTGGCGGCGGCAGTGGTGTTGGCGGCGGTAAGGATAACATAAATGGTGCGCACGATGGCGGAGCGACAGGCGGTGCTGTGGACATAAAACCCAAGACGGAGACGAAACCACTCGTACCGGAGCCACTGGCACCCAATGCAGGTGACAAGAAAAAGAAGTGCCAATTCAATCCCGAGGAACTGCGCACCGCTCTCCTGCCAACGCTAGAGAAGCTCTACAGGCAGGAGCCCGAATCCGTGCCCTTTCGCTACCCAGTTGATCCCCAGGCGCTGGGCATACCTGATTACTTTGAAATCGTTAAGAAGCCCATGGACCTGGGCACTATACGCACCAACATCCAGAATGGAAAGTACAGTGATCCCTGGGAATATGTGGACGACGTTTGGCTGATGTTCGACAATGCCTGGCTGTATAATCGCAAAACATCGCGGGTCTATCGCTATTGCACAAAGCTTTCCGAAGTCTTTGAGGCGGAGATTGATCCTGTGATGCAGGCACTGGGATATTGCTGCGGCAGGAAGTACACATTCAATCCACAGGTGCTATGCTGCTACGGCAAGCAGCTCTGCACGATTCCGCGGGATGCCAAGTACTACAGCTACCAGAACAGTCTAAAGGAATACGGTGTCGCCTCAAATAGATACACCTACTGCCAAAAGTGCTTTAACGACATCCAGGGCGATACGGTCACACTGGGCGACGATCCACTGCAATCGCAAACCCAAATCAAAAAGGATCAGTTCAAGGAGATGAAGAACGATCACCTCGAACTGGAGCCGTTTGTCAATTGCCAGG
AGTGCGGACGCAAACAGCACCAAATCTGCGTACTCTGGCTGGATTCTATCTGGCCCGGTGGCTTCGTGTGCGATAACTGCCTGAAAAAGAAGAACTCAAAGCGGAAGGAGAACAAGTTCAATGCGAAACGCCTGCCCACCACCAAGCTGGGCGTGTACATAGAGACGCGGGTGAATAATTTCCTCAAGAAGAAGGAGGCTGGTGCCGGCGAGGTGCACATTCGTGTGGTCAGCTCATCGGACAAGTGTGTAGAGGTGAAGCCCGGCATGCGTCGACGATTCGTCGAGCAGGGCGAGATGATGAACGAGTTCCCATACCGAGCCAAAGCGCTCTTTGCCTTCGAGGAGGTGGATGGCATCGATGTGTGCTTCTTTGGCATGCACGTTCAGGAGTATGGATCCGAGTGCCCGGCGCCGAATACGCGGCGTGTGTATATTGCCTATTTGGATTCCGTTCATTTCTTCCGGCCAAGACAGTACCGTACAGCGGTATATCACGAAATCCTGCTCGGCTATATGGACTACGTGAAACAGCTGGGCTACACAATGGCCCATATCTGGGCCTGTCCGCCATCCGAGGGCGATGACTACATCTTTCACTGCCATCCCACGGACCAGAAGATACCCAAGCCCAAGCGCCTGCAGGAGTGGTACAAAAAGATGCTTGACAAGGGAATGATCGAGCGCATCATACAGGACTACAAGGATATCCTGAAGCAGGCGATGGAGGACAAACTGGGCTCTGCCGCAGAGCTGCCCTACTTTGAGGGCGACTTCTGGCCCAATGTGCTGGAGGAGAGCATCAAGGAACTGGACCAGGAGGAGGAAGAGAAGCGCAAACAGGCCGAGGCCGCGGAAGCAGCAGCTGCGGCAAATCTTTTCTCTATCGAGGAAAATGAAGTAAGCGGCGATGGCAAAAAGAAGGGCCAGAAGAAGGCCAAAAAGTCGAACAAATCGAAAGCGGCGCAGCGTAAGAACAGCAAAAAGTCCAACGAACATCAGTCGGGCAATGATCTCTCCACAAAGATATATGCGACCATGGAGAAGCACAAGGAGGTCTTCTTCGTTATCCGTCTGCATTCGGCGCAGTCGGCAGCTAGTTTAGCGCCCATCCAGGATCCCGATCCGCTGCTCACATGCGATCTGATGGATGGACGCGATGCCTTCCTCACGCTCGCCCGCGACAAGCACTTTGAGTTCTCGTCGCTGCGGCGCGCACAATTCTCCACTCTGTCCATGTTGTATGAGCTGCATAACCAGGGTCAGGACAAGTTTGTTTACACCTGCAACCACTGCAAGACGGCCGTGGAGACGCGCTACCACTGTACTGTTTGTGATGACTTCGATCTGTGTATCGTGTGCAAGGAGAAGGTTGGCCATCAGCACAAGATGGAGAAGCTCGGCTTCGACATCGACGACGGCTCTGCGCTGGCGGATCACAAGCAGGCTAATCCACAGGAGGCCCGCAAGCAATCCATCCAGCGTTGCATCCAATCGCTGGCGCACGCCTGCCAGTGTCGCGATGCCAACTGCCGCCTGCCATCGTGCCAGAAGATGAAGCTCGTTGTCCAGCATACGAAGAACTGCAAGCGCAAGCCCAACGGAGGATGCCCCATTTGCAAGCAGCTTATCGCACTCTGTTGCTATCACGCGAAGAACTGTGAGGAGCAGAAGTGCCCCGTGCCGTTCTGTCCCAACATCAAGCACAAGCTCAAGCAGCAGCAGTCACAGCAGAAATTCCAGCAGCAGCAGTTGCTGCGTCGCCGTGTGGCGCTCATGTCGCGTACAGCAGCTCCAGCGGCTCTGCAAGGCCCAGCTGCAGTAAGCGGTCCGACCGTCGTCTCTGGAGGAGTGCCCGTGGTGGGCATGTCCGGTGTGGCAGTTAGCCAACAGGTGATCCCCGGCCAGGCGGGTATACTGCCTCCAGGGGCGGGTGGCATGTCGCCATCTACCGTGGCAGTTCCATCGCCTGTTTCAGGAGGAGCGGGAGCCGGTGGAATG
GGTGGAATGACATCACCACATCCGCATCAACCAGGTATAGGTATGAAACCTGGTGGCGGTCACTCGCCGTCTCCAAATGTCCTACAAGTGGTGAAGCAGGTCCAGGAAGAGGCAGCTCGTCAGCAGGTATCGCATGGCGGTGGCTTCGGCAAGGGCGTACCCATGGCGCCGCCCGTAATGAATCGACCAATGGGCGGCGCTGGGCCCAACCAAAATGTTGTTAATCAACTTGGTGGCATGGGCGTTGGAGTTGAAGGTGTCGGTGGTGTTGGCGTCGGAGGCGTTGGTGGAGTGGGTGTTAATCAACTGAATTCGGGTGGTGGCAATACACCCGGTGCACCCATTTCCGGTCCCGGAATGAATGTCAATCATCTAATGTCCATGGATCAGTGGGGCGGTGGCGGAGCCGGCGGCGGAGGTGCCAATCCCGGCGGTGGCAATCCACAAGCCCGCTATGCCAACAATACCGGCGGCATGCGCCAACCCACCCATGTGATGCAAACGAATCTGATACCGCCGCAGCAACAGCAACAGATGATGGGCGGACTGGGCGGACCCAACCAACTGGGAGGTGGCCAAATGCCAGTCGGCGGACAGCATGGAGGAATGGGAATGGGCATGGGAGCACCACCAATGGCCGGAACTGTTGGCGGAGTGCGTCCATCTCCCGGAGCAGGAGGTGGAGGTGGAAGTGCGACTGGGGGCGGTCTAAATACGCAACAACTCGCCCTGATTATGCAAAAGATTAAGAACAATCCCACCAACGAGAGCAACCAGCACATCCTTGCCATACTAAAACAGAATCCGCAGATCATGGCGGCGATCATCAAGCAGCGCCAGCAGTCGCAGAACAATGCGGCAGCGGGCGGAGGAGCACCTGGCCCAGGTGGAGCCCTACAGCAGCAGCAGGCCGGTAACGGACCGCAAAATCCTCAACAGCAGCAGCAGCAGCAGCAACAGCAACAGGTGATGCAGCAACAGCAGATGCAGCACATGATGAACCAGCAGCAGGGCGGCGGCGGTCCACAGCAGATGAATCCCAACCAGCAGCAGCAACAGCAGCAGGTTAATCTCATGCAGCAGCAGCAACAAGGTGGACCCGGAGGACCAGGTTCTGGACTTCCCACGCGCATGCCCAATATGCCCAATGCCTTGGGTATGCTGCAGAGTCTTCCGCCCAACATGTCGCCAGGCGTTTCTACTCAGGGAGGAATGGTGCCCAACCAAAACTGGAACAAGATGCGTTACATGCAAATGAGCCAGTACCCGCCACCGTATCCGCAGCGCCAGCGTGGCCCGCACATGGGCGGAGCGGGACCTGGTCCCGGCCAGCAACAGTTCCCCGGTGGCGGAGGTGGAGCGGGCAACTTTAATGCGGGTGGTGCTGGTGGTGCAGGCGGCGTTGTCGGTGTGGGCGGAGTGCCCGGAGGTGCCGGCACGGTGCCCGGTGGCGATCAATACTCGATGGCGAATGCCGCGGCTGCCTCCAATATGCTGCAACAGCAGCAGGGCCAGGTGGGCGTCGGAGTGGGCGTGGGCGTGAAACCAGGACCCGGCCAACAGCAACAGCAGATGGGCGTTGGCATGCCGCCGGGTATGCAGCAGCAACAGCAGCAACAGCAACCGCTGCAGCAGCAGCAGATGATGCAGGTAGCAATGCCAAATGCGAATGCCCAGAATCCGTCGGCGGTGGTTGGCGGACCCAATGCTCAGGTGATGGGTCCGCCGACGCCGCACTCTCTGCAGCAGCAGCTGATGCAATCGGCCCGCTCGTCGCCGCCTATTCGCTCCCCGCAGCCAACGCCATCGCCACGTTCGGCTCCATCGCCACGTGCTGCTCCATCCGCCTCGCCTAGGGCACAGCCCTCGCCGCACCATGTGATGAGCAGTCACTCGCCAGCGCCGCAGGGACCACCGCATGACGGCATGCACAATCATGGCATGCATCATCAGTCGCCACTGCCAGGAGTGCCGCAGGATGTTGGCGTCGGAGTCGGTGT
CGGCGTTGGCGTTGGCGTTAACGTTAACGTCGGCAACGTGGGCGTCGGCAATGCCGGAGGAGCCCTGCCCGACGCCTCCGACCAGCTGACCAAGTTTGTGGAGCGACTCTAGTGCAGCAACAGCAGCAGCACCAGCACCAGCACCACCACCAGCTACAATGGTTGGTAGGCGATGTGGCTAGAGGGCTAGGGCTAGACTGAATGAATGAATGAGTGTCCAGTAGCCGCAGACGGGATGACGACGAAGACCAACCGGCAGGGATAACCAGTGTGTGTTAAGCGAATTAACAACTATTACTAACTTAAATCTTTTTTTTTTTTTTAAACGGCACCACAAATAATTGTATATTGTTATAATTAAATCAACAAATATCGCGCCTAATGTGTACTGTAGATTAAGATGACCCACCATTACAACCACTAACAAATACCTTATTATTTAAGTTTAAGACGAAAGTTGGACAGAGCATTATGATTCGATTTCCATTTTATGTCCGCGATTTAGCAAATATATAATATCATATATTTCATATGCCCCCAAAACACACACACACCATGTATTAATTAATGCGATTCCTTCGTTTCCACTAAGCAGATATAGAAAAAAAAAAA(SEQ ID NO:90)MMADHLDEPPQKRVKMDPTDISYFLEENLPDELVSSNSGWSDQLTGGAGGGNGGGGASGVTTNPTSGPNPGGGPNKPAAQGPGSGTGGVGVGVNVGVGGVVGVGVVPSQMNGAGGGNGSGTGGDDGSGNGSGAGNRISQMQHQQLQHLLQQQQQGQKGAMVVPGMQQLGSKSPNLQSPNQGGMQQVVGTQMGMVNSMPMSISNNGNNGMNAIPGMNTIAQGNLGNMVLTNSVGGGMGGMVNHLKQQPGGGGGGMINSVSVPGGPGAGAGGVGAGGGGAVAANQGMHMQNGPMMGRMVGQQHMLRGPHLMGASGGAGGPGNGPGGGGPRMQNPNMQMTQLNSLPYGVGQYGGPGGGNNPQQQQQQQQQQLLAQQMAQRGGVVPGMPQGNRPVGTVVPMSTLGGDGSGPAGQLVSGNPQQQQMLAQQQTGAMGPRPPQPNQLLGHPGQQQQQQQQPGTSQQQQQQQGVGIGGAGVVANAGTVAGVPAVAGGGAGGAVQSSGPGGANRDVPDDRKRQIQQQLMLLLHAHKCNRRENLNPNREVCNVNYCKAMKSVLAHMGTCKQSKDCTMQHCASSRQILLHYKTCQNSGCVICYPFRQNHSVFQNANVPPGGGPAGIGGAPPGGGGAGGGAAGAGGNLQQQQQQQQQQQQNQQPNLTGLVVDGKQGQQVAPGGGQNTAIVLPQQQGAGGAPGAPKTPADMVQQLTQQQQQQQQQVHQQQVQQQELRRFDGMSQQVVAGGMQQQQQQGLPPVIRIQGAQPAVRVLGPGGPGGPSGPNVLPNDVNSLHQQQQQMLQQQQQQGQNRRRGGLATMVEQQQQHQQQQQQPNPAQLGGNIPAPLSVNVGGFGNTNFGGAAAGGAVGANDKQQLKVAQVHPQSHGVGAGGASAGAGASGGQVAAGSSVLMPADTTGSGNAGNPNQNAGGVAGGAGGGNGGNTGPPGDNEKDWRESVTADLRNHLVHKLVQAIFPTSDPTTMQDKRMWThVSYAEKVEKDMYEMAKSRSEYYHLLAEKIYKIQKELEEKRLKRKEQHQQMLMQQQGVANPVAGGAAGGAGSAAGVAGGVVLPQQQQQQQQQQQQQGQQPLQSCIHPSISPMGGVMPPQQLRPQGPPGILGQQTAAGLGVGVGVTNNMVTMRSHSPGGNMLALQQQQRMQFPQQQQQQPPGSGAGKMLVGPPGPSPGGMVVNPALSPYQTTNVLTSPVPGQQQQQQFINANGGTGANPQLSEIMKQRHIHQQQQQQQQQQQQGMLLPQSPFSNSTPLQQQQQQQQQQQQQQATSNSFSSPMQQQQQGQQQQQQKPGSVLNNMPPTPTSLEALNAGAGAPGTGGSASNVTVSAPSPSPGFLSNGPSIGTPSNNNNNSSANNNPPSVSSLMQQPLSN
RPGTPPYIPASPVPATSASGLAASSTPASAAATCASSGSGSNSSSGATAAGASSTSSSSSAGSGTPLSSVSTPTSATMATSSGGGGGGGGNAGGGSSTTPASNPLLLMSGGTAGGGTGATTTTSTSSSSRMMSSSSSLSSQMAALEAAARDNDDETPSPSGENTNGSGGSGNAGGMASKGKLDSIKQDDDIKKEFMDDSCGGNNDSSQMDCSTGGGKGKNVNNDGTSMIKMEIKTEDGLDGEVKIKTEAMDVDEAGGSTAGEHHGEGGGGSGVGGGKDNINGAHDGGATGGAVDIKPKTETKPLVPEPLAPNAGDKKKKCQFNPEELRTALLPTLEKLYRQEPESVPFRYPVDPQALGIPDYFEIVKKPMDLGTIRTNIQNGKYSDPWEYVDDVWLMFDNAWLYNRKTSRVYRYCTKLSEVFEAEIDPVMQALGYCCGRKYTFNPQVLCCYGKQLCTIPRDAKYYSYQNSLKEYGVASNRYTYCQKCFNDIQGDTVTLGDDPLQSQTQIKKDQFKEMKNDHLELEPFVNCQECGRKQHQICVLWLDSIWPGGFVCDNCLKKKNSKRKENKFNAKRLPTTKLGVYIETRVNNFLKKKEAGAGEVHIRVVSSSDKCVEVKPGMRRRFVEQGEMMNEFPYRAKALFAFEEVDGIDVCFFGMHVQEYGSECPAPNTRRVYIAYLDSVHFFRPRQYRTAVYHEILLGYMDYVKQLGYTMAHIWACPPSEGDDYIFHCHPTDQKIPKPKRLQEWYKKMLDKGMIERIIQDYKDILKQAMEDKLGSAAELPYFEGDFWPNVLEESIKELDQEEEEKRKQAEAAEAAAAANLFSIEENEVSGDGKKKGQKKAKKSNKSKAAQRKNSKKSNEHQSGNDLSTKIYATMEKHKEVFFVIRLHSAQSAASLAPIQDPDPLLTCDLMDGRDAFLTLARDKHFEFSSLRRAQFSTLSMLYELHNQGQDKFVYTCNHCKTAVETRYHCTVCDDFDLCIVCKEKVGHQHRMEKLGFDIDDGSALADHKQANPQEARKQSIQRCIQSLAHACQCRDANCRLPSCQKMKLVVQHTKNCKRKPNGGCPICKQLIALCCYHAKNCEEQKCPVPFCPNIKHKLKQQQSQQKFQQQQLLRRRVALMSRTAAPAALQGPAAVSGPTVVSGGVPVVGMSGVAVSQQVIPGQAGILPPGAGGMSPSTVAVPSPVSGGAGAGGMGGMTSPHPHQPGIGMKPGGGHSPSPNVLQVVKQVQEEAARQQVSHGGGFGKGVPMAPPVMNRPMGGAGPNQNVVNQLGGMGVGVEGVGGVGVGGVGGVGVNQLNSGGGNTPGAPISGPGMNVNHLMSMDQWGGGGAGGGGANPGGGNPQARYANNTGGMRQPTHVMQTNLIPPQQQQQMMGGLGGPNQLGGGQMPVGGQHGGMGMGMGAPPMAGTVGGVRPSPGAGGGGGSATGGGLNTQQLALIMQKIKNNPTNESNQHILAILKQNPQIMAAIIKQRQQSQNNAAAGGGAPGPGGALQQQQAGNGPQNPQQQQQQQQQQQVMQQQQMQHMMNQQQGGGGPQQMNPNQQQQQQQVNLMQQQQQGGPGGPGSGLPTRMPNMPNALGMLQSLPPNMSPGVSTQGGMVPNQNWNKMRYMQMSQYPPPYPQRQRGPHMGGAGPGPGQQQFPGGGGGAGNFNAGGAGGAGGVVGVGGVPGGAGTVPGGDQYSMANAAAASNMLQQQQGQVGVGVGVGVKPGPGQQQQQMGVGMPPGMQQQQQQQQPLQQQQMMQVAMPNANAQNPSAVVGGPNAQVMGPPTPHSLQQQLMQSARSSPPIRSPQPTPSPRSAPSPRAAPSASPRAQPSPHHVMSSHSPAPQGPPHDGMHNHGMHHQSPLPGVPQDVGVGVGVGVGVGVNVNVGNVGVGNAGGALPDASDQLTKYVERL


Human homologue of Complete Genome candidate


AAC51331—CREB-binding protein

(SEQ ID NO:91)1tccgaattcc ttttttttaa ttgaggaatc aacagccgccatcttgtcgc ggacccgacc61ggggcttcga gcgcgatcta ctcggccccg ccggtcccgggccccacaac cgcccgcgca121ccccgctccg cccggccggc ccgctccgcc cggccctcggcgcccgcccc ggcggccccg181ctcgcctctc ggctcggcct cccggagccc ggcggcggcggcggcggcag cggcggcggc241ggcggcggaa cggggggtgg gggggccgcg gcggcggcggcgaccccgct cggcgcattg301tttttcctca cggcggcggc ggcggcgggc cgcgggccgggagcggagcc cggagccccc361tcgtcgtcgg gccgcgagcg aattcattaa gtggggcgcggggggggagc gaggcggcgg421cggcggcggc accatgttct cggggactgc ctgagccgcccggccgggcg ccgtcgctgc481cagccgggcc cgggggggcg gccgggccgc cggggcgcccccaccgcgga gtgtcgcgct541cgggaggcgg gcaggggatg agggggccgc ggccggcggcggcggcggcg gccgggggcg601ggcggtgagc gctgcggggc gctgttgctg tggctgagatttggccgccg cctcccccac661ccggcctgcg ccctccctct ccctcggcgc ccgcccgcgccgctcgcggc gcccgcgctc721gctcctctcc ctcgcagccg gcagggcccc cgacccccgtccgggccctc gccggcccgg781ccgcccgtgc ccggggctgt tttcgcgagc aggtgaaaatggctgagaac ttgctggacg841gaccgcccaa ccccaaaaga gccaaactca gctcgcccggtttctcggcg aatgacagca901cagattttgg atcattgttt gacttggaaa atgatcttcctgatgagctg atacccaatg961gaggagaatt aggcctttta aacagtggga accttgttccagatgctgct tccaaacata1021aacaactgtc ggagcttcta cgaggaggca gcggctctagtatcaaccca ggaataggaa1081atgtgagcgc cagcagcccc gtgcagcagg gcctgggtggccaggctcaa gggcagccga1141acagtgctaa catggccagc ctcagtgcca tgggcaagagccctctgagc cagggagatt1201cttcagcccc cagcctgcct aaacaggcag ccagcacctctgggcccacc cccgctgcct1261cccaagcact gaatccgcaa gcacaaaagc aagtggggctggcgactagc agccctgcca1321cgtcacagac tggacctggt atctgcatga atgctaactttaaccagacc cacccaggcc1381tcctcaatag taactctggc catagcttaa ttaatcaggcttcacaaggg caggcgcaag1441tcatgaatgg atctcttggg gctgctggca gaggaaggggagctggaatg ccgtacccta1501ctccagccat gcagggcgcc tcgagcagcg tgctggctgagaccctaacg caggtttccc1561cgcaaatgac tggtcacgcg ggactgaaca ccgcacaggcaggaggcatg gccaagatgg1621gaataactgg gaacacaagt ccatttggac agccctttagtcaagctgga gggcagccaa1681tgggagccac tggagtgaac ccccagttag ccagcaaacagagcatggtc aacagtttgc1741ccaccttccc tacagatatc 
aagaatactt cagtcaccaacgtgccaaat atgtctcaga1801tgcaaacatc agtgggaatt gtacccacac aagcaattgcaacaggcccc actgcagatc1861ctgaaaaacg caaactgata cagcagcagc tggttctactgcttcatgct cataagtgtc1921agagacgaga gcaagcaaac ggagaggttc gggcctgctcgctcccgcat tgtcgaacca1981tgaaaaacgt tttgaatcac atgacgcatt gtcaggctgggaaagcctgc caagttgccc2041attgtgcatc ttcacgacaa atcatctctc attggaagaactgcacacga catgactgtc2101ctgtttgcct ccctttgaaa aatgccagtg acaagcgaaaccaacaaacc atcctggggt2161ctccagctag tggaattcaa aacacaattg gttctgttggcacagggcaa cagaatgcca2221cttctttaag taacccaaat cccatagacc ccagctccatgcagcgagcc tatgctgctc2281tcggactccc ctacatgaac cagccccaga cgcagctgcagcctcaggtt cctggccagc2341aaccagcaca gcctcaaacc caccagcaga tgaggactctcaaccccctg ggaaataatc2401caatgaacat tccagcagga ggaataacaa cagatcagcagcccccaaac ttgatttcag2461aatcagctct tccgacttcc ctgggggcca caaacccactgatgaacgat ggctccaact2521ctggtaacat tggaaccctc agcactatac caacagcagctcctccttct agcaccggtg2581taaggaaagg ctggcacgaa catgtcactc aggacctgcggagccatcta gtgcataaac2641tcgtccaagc catcttccca acacctgatc ccgcagctctaaaggatcgc cgcatggaaa2701acctggtagc ctatgctaag aaagtggaag gggacatgtacgagtctgcc aacagcaggg2761atgaatatta tcacttatta gcagagaaaa tctacaagatacaaaaagaa ctagaagaaa2821aacggaggtc gcgtttacat aaacaaggca tcttggggaaccagccagcc ttaccagccc2881cgggggctca gccccctgtg attccacagg cacaacctgtgagacctcca aatggacccc2941tgtccctgcc agtgaatcgc atgcaagttt ctcaagggatgaattcattt aaccccatgt3001ccttggggaa cgtccagttg ccacaagcac ccatgggacctcgtgcagcc tccccaatga3061accactctgt ccagatgaac agcatgggct cagtgccagggatggccatt tctccttccc3121gaatgcctca gcctccgaac atgatgggtg cacacaccaacaacatgatg gcccaggcgc3181ccgctcagag ccagtttctg ccacagaacc agttcccgtcatccagcggg gcgatgagtg3241tgggcatggg gcagccgcca gcccaaacag gcgtgtcacagggacaggtg cctggtgctg3301ctcttcctaa ccctctcaac atgctggggc ctcaggccagccagctacct tgccctccag3361tgacacagtc accactgcac ccaacaccgc ctcctgcttccacggctgct ggcatgccat3421ctctccagca cacgacacca cctgggatga ctcctccccagccagcagct cccactcagc3481catcaactcc tgtgtcgtct tccgggcaga 
ctcccaccccgactcctggc tcagtgccca3541gtgctaccca aacccagagc acccctacag tccaggcagcagcccaggcc caggtgaccc3601cgcagcctca aaccccagtt cagcccccgt ctgtggctacccctcagtca tcgcagcaac3661agccgacgcc tgtgcacgcc cagcctcctg gcacaccgctttcccaggca gcagccagca3721ttgataacag agtccctacc ccctcctcgg tggccagcgcagaaaccaat tcccagcagc3781caggacctga cgtacctgtg ctggaaatga agacggagacccaagcagag gacactgagc3841ccgatcctgg tgaatccaaa ggggagccca ggtctgagatgatggaggag gatttgcaag3901gagcttccca agttaaagaa gaaacagaca tagcagagcagaaatcagaa ccaatggaag3961tggatgaaaa gaaacctgaa gtgaaagtag aagttaaagaggaagaagag agtagcagta4021acggcacagc ctctcagtca acatctcctt cgcagccgcgcaaaaaaatc tttaaaccag4081aggagttacg ccaggccctc atgccaaccc tagaagcactgtatcgacag gacccagagt4141cattaccttt ccggcagcct gtagatcccc agctcctcggaattccagac tattttgaca4201tcgtaaagaa tcccatggac ctctccacca tcaagcggaagctggacaca gggcaatacc4261aagagccctg gcagtacgtg gacgacgtct ggctcatgttcaacaatgcc tggctctata4321atcgcaagac atcccgagtc tataagtttt gcagtaagcttgcagaggtc tttgagcagg4381aaattgaccc tgtcatgcag tcccttggat attgctgtggacgcaagtat gagttttccc4441cacagacttt gtgctgctat gggaagcagc tgtgtaccattcctcgcgat gctgcctact4501acagctatca gaataggtat catttctgtg agaagtgtttcacagagatc cagggcgaga4561atgtgaccct gggtgacgac ccttcacagc cccagacgacaatttcaaag gatcagtttg4621aaaagaagaa aaatgatacc ttagaccccg aacctttcgttgattgcaag gagtgtggcc4681ggaagatgca tcagatttgc gttctgcact atgacatcatttggccttca ggttttgtgt4741gcgacaactg cttgaagaaa actggcagac ctcgaaaagaaaacaaattc agtgctaaga4801ggctgcagac cacaagactg ggaaaccact tggaagaccgagtgaacaaa tttttgcggc4861gccagaatca ccctgaagcc ggggaggttt ttgtccgagtggtggccagc tcagacaaga4921cggtggaggt caagcccggg atgaagtcac ggtttgtggattctggggaa atgtctgaat4981ctttcccata tcgaaccaaa gctctgtttg cttttgaggaaattgacggc gtggatgtct5041gcttttttgg aatgcacgtc caagaatacg gctctgattgcccccctcca aacacgaggc5101gtgtgtacat ttcttatctg gatagtattc atttcttccggccacgttgc ctccgcacag5161ccgtttacca tgagatcctt attggatatt tagagtatgtgaagaaatta gggtatgtga5221cagggcacat ctgggcctgt cctccaagtg aaggagatgattacatcttc 
cattgccacc5281cacctgatca aaaaataccc aagccaaaac gactgcaggagtggtacaaa aagatgctgg5341acaaggcgtt tgcagagcgg atcatccatg actacaaggatattttcaaa caagcaactg5401aagacaggct caccagtgcc aaggaactgc cctattttgaaggtgatttc tggcccaatg5461tgttagaaga gagcattaag gaactagaac aagaagaagaggagaggaaa aaggaagaga5521gcactgcagc cagtgaaacc actgagggca gtcagggcgacagcaagaat gccaagaaga5581agaacaacaa gaaaaccaac aagaacaaaa gcagcatcagccgcgccaac aagaagaagc5641ccagcatgcc caacgtgtcc aatgacctgt cccagaagctgtatgccacc atggagaagc5701acaaggaggt cttcttcgtg atccacctgc acgctgggcctgtcatcaac accctgcccc5761ccatcgtcga ccccgacccc ctgctcagct gtgacctcatggatgggcgc gacgccttcc5821tcaccctcgc cagagacaag cactgggagt tctcctccttgcgccgctcc aagtggtcca5881cgctctgcat gctggtggag ctgcacaccc agggccaggaccgctttgtc tacacctgca5941acgagtgcaa gcaccacgtg gagacgcgct ggcactgcactgtgtgcgag gactacgacc6001tctgcatcaa ctgctataac acgaagagcc atgcccataagatggtgaag tgggggctgg6061gcctggatga cgagggcagc agccagggcg agccacagtcaaagagcccc caggagtcac6121gccggctgag catccagcgc tgcatccagt cgctggtgcacgcgtgccag tgccgcaacg6181ccaactgctc gctgccatcc tgccagaaga tgaagcgggtggtgcagcac accaagggct6241gcaaacgcaa gaccaacggg ggctgcccgg tgtgcaagcagctcatcgcc ctctgctgct6301accacgccaa gcactgccaa gaaaacaaat gccccgtgcccttctgcctc aacatcaaac6361acaagctccg ccagcagcag atccagcacc gcctgcagcaggcccagctc atgcgccggc6421ggatggccac catgaacacc cgcaacgtgc ctcagcagagtctgccttct cctacctcag6481caccgcccgg gacccccaca cagcagccca gcacaccccagacgccgcag ccccctgccc6541agccccaacc ctcacccgtg agcatgtcac cagctggcttccccagcgtg gcccggactc6601agccccccac cacggtgtcc acagggaagc ctaccagccaggtgccggcc cccccacccc6661cggcccagcc ccctcctgca gcggtggaag cggctcggcagatcgagcgt gaggcccagc6721agcagcagca cctgtaccgg gtgaacatca acaacagcatgcccccagga cgcacgggca6781tggggacccc ggggagccag atggcccccg tgagcctgaatgtgccccga cccaaccagg6841tgagcgggcc cgtcatgccc agcatgcctc ccgggcagtggcagcaggcg ccccttcccc6901agcagcagcc catgccaggc ttgcccaggc ctgtgatatccatgcaggcc caggcggccg6961tggctgggcc ccggatgccc agcgtgcagc cacccaggagcatctcaccc agcgctctgc7021aagacctgct 
gcggaccctg aagtcgccca gctcccctcagcagcaacag caggtgctga7081acattctcaa atcaaacccg cagctaatgg cagctttcatcaaacagcgc acagccaagt7141acgtggccaa tcagcccggc atgcagcccc agcctggcctccagtcccag cccggcatgc7201aaccccagcc tggcatgcac cagcagccca gcctgcagaacctgaatgcc atgcaggctg7261gcgtgccgcg gcccggtgtg cctccacagc agcaggcgatgggaggcctg aacccccagg7321gccaggcctt gaacatcatg aacccaggac acaaccccaacatggcgagt atgaatccac7381agtaccgaga aatgttacgg aggcagctgc tgcagcagcagcagcaacag cagcagcaac7441aacagcagca acagcagcag cagcaaggga gtgccggcatggctgggggc atggcggggc7501acggccagtt ccagcagcct caaggacccg gaggctacccaccggccatg cagcagcagc7561agcgcatgca gcagcatctc cccctccagg gcagctccatgggccagatg gcggctcaga7621tgggacagct tggccagatg gggcagccgg ggctgggggcagacagcacc cccaacatcc7681agcaagccct gcagcagcgg attctgcagc aacagcagatgaagcagcag attgggtccc7741caggccagcc gaaccccatg agcccccagc aacacatgctctcaggacag ccacaggcct7801cgcatctccc tggccagcag atcgccacgt cccttagtaaccaggtgcgg tctccagccc7861ctgtccagtc tccacggccc cagtcccagc ctccacattccagcccgtca ccacggatac7921agccccagcc ttcgccacac cacgtctcac cccagactggttccccccac cccggactcg7981cagtcaccat ggccagctcc atagatcagg gacacttggggaaccccgaa cagagtgcaa8041tgctccccca gctgaacacc cccagcagga gtgcgctgtccagcgaactg tccctggtcg8101gggacaccac gggggacacg ctagagaagt ttgtggagggcttgtag(SEQ ID NO:92)1maenlldgpp npkraldssp gfsandstdf gslfdlendlpdelipngge lgllnsgnlv61pdaaskhkql sellrggsgs sinpgignvs asspvqqglggqaqgqpnsa nmaslsamgk121splsqgdssa pslpkqaast sgptpaasqa lnpqaqkqvglatsspatsq tgpgicmnan181fnqthpglln snsghslinq asqgqaqvmn gslgaagrgrgagmpyptpa mqgasssvla241etltqvspqm tghaglntaq aggmakmgit gntspfgqpfsqaggqpmga tgvnpqlask301qsmvnslptf ptdikntsvt nvpnmsqmqt svgivptqaiatgptadpek rkliqqqlvl361llhahkcqrr eqangevrac slphcrtmkn vlnhmthcqagkacqvahca ssrqiishwk421nctrhdcpvc lplknasdkr nqqtilgspa sgiqntigsvgtgqqnatsl snpnpidpss481mqrayaalgl pynmqpqtql qpqvpgqqpa qpqthqqmrtlnplgnnpmn ipaggittdq541qppnlisesa lptslgatnp lmndgsnsgn igtlstiptaappsstgvrk gwhehvtqdl601rshlvhklvq aifptpdpaa lkdrrmenlv 
ayakkvegdmyesansrdey yhllaekiyk661iqkeleekrr srlhkqgilg nqpalpapga qppvipqaqpvrppngplsl pvnrmqvsqg721mnsfnpmslg nvqlpqapmg praaspmnhs vqmnsmgsvpgmaispsrmp qppnmmgaht781nnmmaqapaq sqflpqnqfp sssgamsvgm gqppaqtgvsqgqvpgaalp nplnmlgpqa841sqlpcppvtq splhptpppa staagmpslq httppgmtppqpaaptqpst pvsssgqtpt901ptpgsvpsat qtqstptvqa aaqaqvtpqp qtpvqppsvatpqssqqqpt pvhaqppgtp961lsqaaasidn rvptpssvas aetnsqqpgp dvpvlemktetqaedtepdp geskgeprse1021mmeedlqgas qvkeetdiae qksepmevde kkpevkvevkeeeesssngt asqstspsqp1081rkkifkpeel rqalmptlea lyrqdpeslp frqpvdpqllgipdyfdivk npmdlstikr1141kldtgqyqep wqyvddvwlm fnnawlynrk tsrvykfcsklaevfeqeid pvmqslgycc1201grkyefspqt lccygkqlct iprdaayysy qnryhfcekcfteiqgenvt lgddpsqpqt1261tiskdqfekk kndtldpepf vdckecgrkm hqicvlhydiiwpsgfvcdn clkktgrprk1321enkfsakrlq ttrlgnhled rvnkflrrqn hpeagevfvrvvassdktve vkpgmksrfv1381dsgemsesfp yrtkalfafe eidgvdvcff gmhvqeygsdcpppntrrvy isyldsihff1441rprclrtavy heiligyley vkklgyvtgh iwacppsegddyithchppd qkipkpkrlq1501ewykkmldka faeriihdyk difkqatedr ltsakelpyfegdfwpnvle esikeleqee1561eerkkeesta asettegsqg dsknakkknn kktnknkssisrankkkpsm pnvsndlsqk1621lyatmekhke vffvihlhag pvintlppiv dpdpllscdlmdgrdafltl ardkhwefss1681lrrskwstlc mlvelhtqgq drfvytcnec khhvetrwhctvcedydlci ncyntkshah1741kmvkwglgld degssqgepq skspqesrrl siqrciqslvhacqcrnanc slpscqkmkr1801vvqhtkgckr ktnggcpvck qlialccyha khcqenkcpvpfclnikhkl rqqqiqhrlq1861qaqlmrrrma tmntrnvpqq slpsptsapp gtptqqpstpqtpqppaqpq pspvsmspag1921fpsvartqpp ttvstgkpts qvpappppaq pppaaveaarqiereaqqqq hlyrvninns1981mppgrtgmgt pgsqmapvsl nvprpnqvsg pvmpsmppgqwqqaplpqqq pmpglprpvi2041smqaqaavag prmpsvqppr sispsalqdl lrtlkspsspqqqqqvlnil ksnpqlmaaf2101ikqrtakyva nqpgmqpqpg lqsqpgmqpq pgmhqqpslqnlnamqagvp rpgvppqqqa2161mgglnpqgqa lnimnpghnp nmasmnpqyr emlrrqllqqqqqqqqqqqq qqqqqqgsag2221maggmaghgq fqqpqgpggy ppamqqqqrm qqhlplqgssmgqmaaqmgq lgqmgqpglg2281adstpniqqa lqqrilqqqq mkqqigspgq pnpmspqqhmlsgajqashl pgqqiatsls2341nqvrspapvq sprpqsqpph sspspriqpq psphhvspqtgsphpglavt 
massidqghl2401gnpeqsamlp qlntpsrsal sselslvgdt tgdtlekfvegl


Putative function


CREB-binding protein, transcription factor


Example 2 (Category 1)

Line ID—492


Phenotype—Female sterile, few eggs laid, several fully matured eggs in ovarioles


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003490 (11B4-14)


P element insertion site—30,773


Annotated Drosophila genome Complete Genome candidate—CG2028—CK1 alpha (2 splice variants)

(SEQ ID NO:93)TAAAGTGCAAGCTGGAAAAGAAAAGCAAAACAAATTCCGGAGAGCAGAAAGAGAGTTTTTCAAGTGAACGCGTCCAACTGTTTTTGAAGCGAAGCGCTTAGGCGGAGGAGCAGCTAGCCAGGATGGACAAGATGCGGATATTGAAGGAAAGTCGCCCCGAGATAATCGTCGGTGGCAAATATCGGGTGATCAGGAAGATTGGAAGCGGATCGTTTGGCGACATTTACCTGGGCATGAGCATCCAGAGCGGCGAAGAAGTGGCCATCAAGATGGAGAGCGCCCACGCCCGCCATCCGCAGCTGTTGTACGAGGCCAAGCTGTACCGCATTCTGAGCGGCGGCGTTGGATTCCCTCGTATACGTCACCATGGCAAGGAAAAGAACTTCAACACCCTGGTCATGGACCTGCTGGGACCCTCGCTGGAGGATCTGTTCAATTTCTGTACGCGCCATTTCACAATCAAAACGGTTCTGATGCTCGTCGACCAGATGATCGGACGCTTGGAGTACATCCATCTCAAGTGCTTCATCCATCGCGACATCAAGCCGGATAACTTCCTAATGGGCATTGGTCGGCACTGCAATAAGCTGTTCCTGATCGATTTCGGTCTGGCCAAGAAGTTCCGCGATCCGCACACGCGCCATCACATCGTTTACCGCGAGGACAAGAACCTCACCGGCACTGCCCGCTATGCCTCGATCAATGCCCATCTGGGCATCGAGCAGTCGCGGCGTGACGACATGGAATCGCTTGGATACGTGATGATGTACTTCAATCGCGGCGTACTGCCATGGCAAGGCATGAAGGCCAACACCAAGCAGCAGAAATACGAGAAGATCTCCGAAAAGAAGATGTCCACGCCCATCGAGGTCCTCTGCAAGGGCTCGCCGGCCGAGTTCTCCATGTATCTGAACTATTGTCGTAGCCTGCGCTTCGAGGAGCAGCCAGATTACATGTACCTACGTCAATTGTTCCGCATACTGTTCAGAACGCTGAACCATCAGTATGACTACATCTACGACTGGACAATGCTGAAGCAGAAGACCCATCAGGGTCAACCCAATCCAGCTATACTCTTGGAGCAATTGGACAAGGACAAGGAGAAGCAGAACGGCAAGCCCCTGATCGCGGACTAAGAGCTGCAGCGCATTCAGACGAATGGGGGGAGTGCATCAGAGAAGGAGAACGTGGATGCGTGGATGTAAATGACGTTGATGTGGGCGAAAGGCCCGGCAAGGAGCGGAGCAAATATGAAACAGACGCAACCGTAAAATTGAGTAACACCAGCGGTCGTCCGAATGTTTCTTAATATTAATTTAAATTCAATACTAAACAAATAAGGAACCACAAACAAGCAAGCAAC(SEQ ID NO:94)MDKMRILKESRPEIIVGGKYRVIRKIGSGSFGDIYLGMSIQSGEEVAIKMESAHARHPQLLYEAKLYRILSGGVGFPRIRHHGKEKNFNTLVMDLLGPSLEDLFNFCTRHFTIKTVLMLVDQMIGRLEYIHLKCFIHRDIKPDNFLMGIGRHCNKLFLIDFGLAKKFRDPHTRHHIVYREDKNLTGTARYASINAHLGIEQSRRDDMESLGYVMMYFNRGVLPWQGMKANTKQQKYEKISEKKMSTPIEVLCKGSPAEFSMYLNYCRSLRFEEQPDYMYLRQLFRILFRTLNHQYDYIYDWTMLKQKTHQGQPNPAILLEQLDKDKEKQNGKPLIAD(SEQ ID 
NO:95)TTTGGTTGAACCTATCGGGCCCTATCGATATAAGCAAAAGCATTTTTGCTGGATCTACCATTTTATTTTAGTTAATAAAATACATATATTTCCTCTCTTTTTGTTCCGTTTGTGCGCGTACAAAACTAGCTGCGAACTCGTGCAATATTTCATAAACTGAATGGGAAAACAACGATAACGACGAAAGAAAACGAAAACGGATCTGCGACGAAATTTTCCCCGTTCCGTTTTTTTTTCTCCACCAGCAGCAGAAGCAGCAGAGCAAAAGCAGCGAATATATTTGTAAAAGAGAGCCCCAACCTTGAGAAAAAACAACCAGCAGGGCAATAATTAGTTGAATTTATCGTCTGCTGTTTTTCAAGTGAACGCGTCCAACTGTTTTTGAAGCGAAGCGCTTAGGCGGAGGAGCAGCTAGCCAGGATGGACAAGATGCGGATATTGAAGGAAAGTCGCCCCGAGATAATCGTCGGTGGCAAATATCGGGTGATCAGGAAGATTGGAAGCGGATCGTTTGGCGACATTTACCTGGGCATGAGCATCCAGAGCGGCGAAGAAGTGGCCATCAAGATGGAGAGCGCCCACGCCCGCCATCCGCAGCTGTTGTACGAGGCCAAGCTGTACCGCATTCTGAGCGGCGGCGTTGGATTCCCTCGTATACGTCACCATGGCAAGGAAAAGAACTTCAACACCCTGGTCATGGACCTGCTGGGACCCTCGCTGGAGGATCTGTTCAATTTCTGTACGCGCCATTTCACAATCAAAACGGTTCTGATGCTCGTCGACCAGATGATCGGACGCTTGGAGTACATCCATCTCAAGTGCTTCATCCATCGCGACATCAAGCCGGATAACTTCCTAATGGGCATTGGTCGGCACTGCAATAAGCTGTTCCTGATCGATTTCGGTCTGGCCAAGAAGTTCCGCGATCCGCACACGCGCCATCACATCGTTTACCGCGAGGACAAGAACCTCACCGGCACTGCCCGCTATGCCTCGATCAATGCCCATCTGGGCATCGAGCAGTCGCGGCGTGACGACATGGAATCGCTTGGATACGTGATGATGTACTTCAATCGCGGCGTACTGCCATGGCAAGGCATGAAGGCCAACACCAAGCAGCAGAAATACGAGAAGATCTCCGAAAAGAAGATGTCCACGCCCATCGAGGTCCTCTGCAAGGGCTCGCCGGCCGAGTTCTCCATGTATCTGAACTATTGTCGTAGCCTGCGCTTCGAGGAGCAGCCAGATTACATGTACCTACGTCAATTGTTCCGCATACTGTTCAGAACGCTGAACCATCAGTATGACTACATCTACGACTGGACAATGCTGAAGCAGAAGACCCATCAGGGTCAACCCAATCCAGCTATACTCTTGGAGCAATTGGACAAGGACAAGGAGAAGCAGAACGGCAAGCCCCTGATCGCGGACTAAGAGCTGCAGCGCATTCAGACGAATGGGGGGAGTGCATCAGAGAAGGAGAACGTGGATGCGTGGATGTAAATGACGTTGATGTGGGCGAAAGGCCCGGCAAGGAGCGGAGCAAATATGAAACAGACGCAACCGTAAAATTGAGTAACACCAGCGGTCGTCCGAATGTTTCTTAATATTAATTTAAATTCAATACTAAACAAATAAGGAACCACAAACAAGCAAGCAAC(SEQ ID 
NO:96)MDKMRILKESRPEIIVGGKYRVIRKIGSGSFGDIYLGMSIQSGEEVAIKMESAHARHPQLLYEAKLYRILSGGVGFPRIRHHGKEKNFNTLVMDLLGPSLEDLFNFCTRHFTIKTVLMLVDQMIGRLEYIHLKCFIHRDIKPDNFLMGIGRHCNKLFLIDFGLAKKFRDPHTRHHIVYREDKNLTGTARYASINAHLGIEQSRRDDMESLGYVMMYFNRGVLPWQGMKANTKQQKYEKISEKKMSTPIEVLCKGSPAEFSMYLNYCRSLRFEEQPDYMYLRQLFRILFRTLNHQYDYIYDWTMLKQKTHQGQPNPAILLEQLDKDKEKQNGKPLIAD
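
A polypeptide listing can be checked against its nucleotide listing by translating the open reading frame. The following is a minimal sketch only, assuming that the standard genetic code applies; the function name `translate` and the 48-nucleotide fragment, taken from the start of the open reading frame common to SEQ ID NO:93 and SEQ ID NO:95, are illustrative choices and not part of the disclosure.

```python
# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption: the standard code is the appropriate table for these listings.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, stopping at the first stop codon.

    Ambiguity codes such as N are not handled and raise KeyError.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 16 codons of the open reading frame (illustrative fragment).
orf_start = "ATGGACAAGATGCGGATATTGAAGGAAAGTCGCCCCGAGATAATCGTC"
print(translate(orf_start))
```

The printed residues can then be compared by eye, or with a string comparison, against the beginning of the corresponding polypeptide listing.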


Human homologue of Complete Genome candidate


P48729 Casein kinase I, alpha isoform (cki-alpha) (ck1)

(SEQ ID NO: 97)1ccgcctccgt gttccgtttc ctgccgccct cctctcgtagccttgcctag tgtggagccc61caggcctccg tcctcttccc agaggtgtcg aggcttggccccagcctcca tcttcgtctc121tcaggatggc gagtagcagc ggctccaagg ctgaattcattgtcggtggg aaatataaac181tggtacggaa gatcgggtct ggctccttcg gggacatctatttggcgatc aacatcacca241acggcgagga agtggcactg aagctagaat ctcagaaggccaggcatccc cagttgctgt301acgagagcaa gctctataag attcttcaag gtggggttggcatcccccac atacggtggt361atggtcagga aaaagactac aatgtactag tcatggatcttctgggacct agcctcgaag421acctcttcaa tttctgttca agaaggttca caatgaaaactgtacttatg ttagctgacc481agatgatcag tagaattgaa tatgtgcata caaagaattttatacacaga gacattaaac541cagataactt cctaatgggt attgggcgtc actgtaataagttattcctt attgattttg601gtttggccaa aaagtacaga gacaacagga caaggcaacacataccatac agagaagata661aaaacctcac tggcactgcc cgatatgcta gcatcaatgcacatcttggt attgagcaga721gtcgccgaga tgacatggaa tcattaggat atgttttgatgtattttaat agaaccagcc781tgccatggca agggctaaag gctgcaacaa agaaacaaaaatatgaaaag attagtgaaa841agaagatgtc cacgcctgtt gaagttttat gtaaggggtttcctgcagaa tttgcgatgt901acttaaacta ttgtcgtggg ctacgctttg aggaagccccagattacatg tatctgaggc961agctattccg cattcttttc aggaccctga accatcaatatgactacaca tttgattgga1021caatgttaaa gcagaaagca gcacagcagg cagcctcttcaagtgggcag ggtcagcagg1081cccaaacccc cacaggcaag caaactgaca aatccaagagtaacatgaaa ggtttctaat1141ttctaagcat gaattgagga acagaagaag cagacgagatgatcggagca gcatttgttt1201ctccccaaat ctagaaattt tagttcatat gtacactagccagtggttgt ggacaacca(SEQ ID NO: 98)1masssgskae fivggkyklv rkigsgsfgd iylainitngeevalklesq karhpqllye61sklykilqgg vgiphirwyg qekdynvlvm dllgpsledlfnfcsrrftm ktvlmladqm121isrieyvhtk nfihrdikpd nflmgigrhc nklflidfglakkyrdnrtr qhipyredkn181ltgtaryasi nahlgieqsr rddmeslgyv lmyfnrtslpwqglkaatkk qkyekisekk241mstpvevlck gfpaefamyl nycrglrfee apdymylrqlfrilfrtlnh qydytfdwtm301lkqkaaqqaa sssgqgqqaq tptgkqtdks ksnmkgf
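
The human sequences are listed with running position numbers interleaved with the residues (for example `...karhpqllye61sklykilqgg...`). The following is a minimal sketch of how such a listing might be reduced to a plain residue string while checking that each position number agrees with the number of residues preceding it. The function name `clean_listing` is hypothetical, and the fragment is the first 70 residues of SEQ ID NO: 98.

```python
import re


def clean_listing(raw: str) -> str:
    """Remove position numbers and spacing from a numbered sequence listing.

    Each number is expected to give the 1-based position of the residue
    that follows it; a ValueError is raised if it does not.
    """
    residues = []
    for match in re.finditer(r"(\d+)|([A-Za-z]+)", raw):
        number, letters = match.groups()
        if number is not None:
            expected = int(number)
            if len(residues) + 1 != expected:
                raise ValueError(
                    f"position {expected} found after {len(residues)} residues"
                )
        else:
            residues.extend(letters.lower())
    return "".join(residues)


fragment = (
    "1masssgskae fivggkyklv rkigsgsfgd iylainitngeevalklesq "
    "karhpqllye61sklykilqgg"
)
print(clean_listing(fragment))
```

A mismatch between a position number and the running residue count would indicate that characters were lost or gained when the listing was flattened.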


Putative function


Casein kinase


Example 2A (Category 1)

Line ID—ccr-a2


Phenotype—Female semi-sterile; lays eggs, but these arrest before cortical migration


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003435 (5C6)


P element insertion site sequence

(SEQ ID NO: 99)GATCAGACGATATTCGGACTCCAAGCAGAGCACTTTGAAGGTGAGTTCGCCGGAAACCAGGCAAAGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTCTGCTGCTCTAAACGACGCATTTCGTACTCCAAAGTACGAATTTTTTCCCTCAAGCTCTTATTTTCATTAAACAATGAACAGGACCTAACGCCACAGTA


Annotated Drosophila genome Complete Genome candidate—CG3011—glycine hydroxymethyltransferase

(SEQ ID NO: 100)GTAAATGTTGTTTACCAACGTAACGCGTGTTTTCGCTTCGTTGTATTTTCGGTGTCGAATATTTTGGATGCTGGCCAAGAGATAGCGCAGCGATCGGGTCGGAACTCTTGGGCGGACTTATCACTGGGTCGGTCAGGGGTCACGGGTTATCGTTATCGCTTATCAGCCAGCGGCGGCGTCATCTCAGCGCCGGCGACTCTTCTCACTTTGCGGCAGTTCCGATTCGAACGCAGCCGTTTACAAAGACATGCAGCGGGCGCGCTCTACACTGACACAAAAGCTTCGGTTTTGCCTTAGTCGGGACCTGAACACCAAAGTTGGCAACCCGGTTAACTTCGAGACTGGAAAGCTTAGCGGAGCTTTAACTCGCATCGCCGCCAAAAAACAACCATCACCAACGCCATTCTTACCGGCGATCAGACGATATTCGGACTCCAAGCAGAGCACTTTGAAGAATATGGCCGATCAGAAACTGCTGCAAACCCCGCTGGCACAGGGCGATCCGGAGCTGGCCGAGCTGATCAAGAAGGAGAAGGAGCGCCAGCGCGAAGGACTCGAGATGATCGCCAGTGAGAACTTCACCTCGGTGGCGGTTCTCGAGAGCCTGAGCTCCTGCCTGACCAACAAGTACTCCGAGGGATATCCCGGCAAGAGGTACTACGGTGGCAACGAGTACATCGACCGCATAGAGCTGCTCGCCCAGCAACGCGGACGCGAGCTGTTCAACCTGGACGATGAGAAGTGGGGCGTTAATGTGCAGCCTTATTCCGGATCCCCGGCCAATCTGGCTGTCTACACGGGCGTCTGCCGGCCCCACGATCGCATCATGGGCCTGGATCTGCCCGATGGCGGTCACTTGACGCACGGTTTCTTCACGCCCACCAAGAAGATATCGGCCACATCGATCTTCTTCGAGAGCATGCCGTACAAAGTGAACCCGGAGACGGGCATCATCGATTACGATAAGTTGGCGGAGGCGGCGAAGAATTTCCGGCCGCAGATCATCATTGCTGGCATATCGTGCTACTCCCGTCTGCTGGACTATGCGCGTTTCCGACAGATTTGCGATGATGTGGGCGCCTACCTGATGGCCGACATGGCCCATGTGGCGGGCATTGTGGCCGCGGGATTGATACCATCGCCGTTCGAATGGGCCGACATTGTGACCACCACCACGCACAAGACACTGCGAGGTCCGCGCGCCGGCGTGATCTTCTTCCGCAAGGGCGTGCGCAGCACCAAGGCCAATGGAGACAAGGTACTCTACGATCTGGAGGAGCGCATCAACCAGGCGGTGTTTCCATCACTCCAGGGTGGTCCGCACAACAACGCCGTGGCTGGCATTGCCACCGCCTTCAAGCAGGCCAAGAGTCCCGAATTCAAGGCCTACCAGACGCAGGTGCTCAAGAATGCCAAGGCCCTGTGCGATGGCCTCATTTCGCGAGGCTATCAGGTGGCCACCGGCGGCACCGACGTCCATTTGGTGCTGGTCGATGTGCGTAAGGCTGGCCTGACCGGCGCCAAGGCCGAGTACATCCTCGAGGAGGTGGGCATCGCGTGCAACAAGAACACTGTGCCCGGCGACAAGTCCGCCATGAATCCCTCCGGCATCCGGCTGGGCACACCGGCCCTGACCACTCGCGGCCTTGCCGAGCAGGACATCGAGCAGGTGGTGGCCTTCATCGATGCTGCCCTAAAGGTTGGCGTCCAGGCAGCCAAGCTGGCCGGCAGTCCCAAGATAACCGATTACCACAAGACGCTGGCCGAGAATGTGGAGCTCAAGGCCCAGGTGGACGAGATCCGCAAGAATGTGGCCCAGTTCAGCAGGAAATTCCCGCTGCCCGGCCTGGAGACCCTGTAG(SEQ ID NO: 
101)MQRARSTLTQKLRFCLSRDLNTKVGNPVNFETGKLSGALTRIAAKKQPSPTPFLPAIRRYSDSKQSTLKNMADQKLLQTPLAQGDPELAELIKKEKERQREGLEMIASENFTSVAVLESLSSCLTNKYSEGYPGKRYYGGNEYIDRIELLAQQRGRELFNLDDEKWGVNVQPYSGSPANLAVYTGVCRPHDRIMGLDLPDGGHLTHGFFTPTKKISATSIFFESMPYKVNPETGIIDYDKLAEAAKNFRPQIIIAGISCYSRLLDYARFRQICDDVGAYLMADMAHVAGIVAAGLIPSPFEWADIVTTTTHKTLRGPRAGVIFFRKGVRSTKANGDKVLYDLEERINQAVFPSLQGGPHNNAVAGIATAFKQAKSPEFKAYQTQVLKNAKALCDGLISRGYQVATGGTDVHLVLVDVRKAGLTGAKAEYILEEVGIACNKNTVPGDKSAMNPSGIRLGTPALTTRGLAEQDIEQVVAFIDAALKVGVQAAKLAGSPKITDYHKTLAENVELKAQVDEIRKNVAQFSRKFPLPGLETL
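
The P element insertion site sequence (SEQ ID NO: 99) begins with a stretch that also occurs in the candidate transcript (SEQ ID NO: 100). The following is a minimal sketch of how such a shared stretch might be located, using an exact seed match followed by base-by-base extension. The function name `shared_run`, the seed length of 15 and the short fragments used are illustrative assumptions; the source does not describe the alignment method actually used.

```python
def shared_run(flank: str, transcript: str, seed: int = 15):
    """Locate the start of `flank` within `transcript`.

    Returns (start, length): the transcript offset where the first `seed`
    bases of the flank occur, and the number of consecutive identical
    bases from that point. Returns None if the seed is not found.
    """
    start = transcript.find(flank[:seed])
    if start < 0:
        return None
    length = 0
    while (
        length < len(flank)
        and start + length < len(transcript)
        and flank[length] == transcript[start + length]
    ):
        length += 1
    return start, length


# Start of SEQ ID NO: 99 (insertion site sequence), truncated for illustration.
flank = "GATCAGACGATATTCGGACTCCAAGCAGAGCACTTTGAAGGTGAGTTCGCC"
# Corresponding region of SEQ ID NO: 100 (candidate transcript), truncated.
transcript = "CTTACCGGCGATCAGACGATATTCGGACTCCAAGCAGAGCACTTTGAAGAATATGGCCGATCAG"

start, length = shared_run(flank, transcript)
print(start, length, transcript[start:start + length])
```

In these fragments the two sequences are identical over the reported run and differ at the next base; the source does not state the reason for the divergence.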


Human homologue of Complete Genome candidate


AAA63258—serine hydroxymethyltransferase

(SEQ ID NO: 102)1ggcacgaggc ctgcgacttc cgagttgcga tgctgtacttctctttgttt tgggcggctc61ggcctctgca gagatgtggg cagctggtca ggatggccattcgggctcag cacagcaacg121cagcccagac tcagactggg gaagcaaaca ggggctggacaggccaggag agcctgtcgg181acagtgatcc tgagatgtgg gagttgctgc agagggagaaggacaggcag tgtcgtggcc241tggagctcat tgcctcagag aacttctgca gccgagctgcgctggaggcc ctggggtcct301gtctgaacaa caagtactcg gagggttatc ctggcaagagatactatggg ggagcagagg361tggtggatga aattgagctg ctgtgccagc gccgggccttggaagccttt gacctggatc421ctgcacagtg gggagtcaat gtccagccct actccgggtccccagccaac ctggccgtct481acacagccct tctgcaacct cacgaccgga tcatggggctggacctgccc gatgggggcc541agtgatctca cccacggcta catgtctgac gtcaagcggatatcagccac gtccatcttc601ttcgagtcta tgccctataa gctcaacccc aaaactggcctcattgacta caaccagctg661gcactgactg ctcgactttt ccggccacgg ctcatcatagctggcaccag cgcctatgct721cgcctcattg actacgcccg catgagagag gtgtgtgatgaagtcaaagc acacctgctg781gcagacatgg cccacatcag tggcctggtg gctgccaaggtgattccctc gcctttcaag841cacgcggaca tcgtcaccac cactactcac aagactcttcgaggggccag gtcagggctc901atcttctacc ggaaaggggt gaaggctgtg gaccccaagactggccggga gatcccttac961acatttgagg accgaatcaa ctttgccgtg ttcccatccctgcagggggg cccccacaat1021catgccattg ctgcagtagc tgtggcccta aagcaggcctgcacccccat gttccgggag1081tactccctgc aggttctgaa gaatgctcgg gccatggcagatgccctgct agagcgaggc1141tactcactgg tatcaggtgg tactgacaac cacctggtgctggtggacct gcggcccaag1201ggcctggatg gagctcgggc tgagcgggtg ctagagcttgtatccatcac tgccaacaag1261aacacctgtc ctggagaccg aagtgccatc acaccgggcggcctgcggct tggggcccca1321gccttaactt ctcgacagtt ccgtgaggat gacttccggagagttgtgga ctttatagat1381gaaggggtca acattggctt agaggtgaag agcaagactgccaagctcca ggatttcaaa1441tccttcctgc ttaaggactc agaaacaagt cagcgtctggccaacctcag gcaacgggtg1501gagcagtttg ccagggcctt ccccatgcct ggttttgatgagcattgaag gcacctggga1561aatgaggccc acagactcaa agttactctc cttccccctacctgggccag tgaaatagaa1621agcctttcta ttttttggtg cgggagggaa gacctctcacttagggcaag agccaggtat1681agtctccctt cccagaattt gtaactgaga agatcttttctttttccttt ttttggtaac1741aagacttaga aggagggccc 
aggcactttc tgtttgaacccctgtcatga tcacagtgtc1801agagacgcgt cctctttctt ggggaagttg aggagtgcccttcagagcca gtagcaggca1861ggggtgggta ggcaccctcc ttcctgtttt tatctaataaaatgctaacc tgcaaaaaaa1921aaaaaaaaaa a(SEQ ID NO: 103)1aaqtqtgean rgwtgqesls dsdpemwell qrekdrqcrgleliasenfc sraalealgs61clnnkysegy pgkryyggae vvdeiellcq rraleafdldpaqwgvnvqp ysgspanlav121ytallqphdr imgldlpdgg hlthgymsdv krisatsiffesmpyklnpk tglidynqla181ltarlfrprl iiagtsayar lidyarmrev cdevkahlladmahisglva akvipspfkh241adivtttthk tlrgarsgli fyrkgvkavd pktgreilytfedrinfavf pslqggphnh301aiaavavalk qactpmfrey slqvlknara madallergyslvsggtdnh lvlvdlrpkg361ldgaraervl elvsitankn tcpgdrsait pgglrlgapaltsrqfredd frrvvdfide421gvniglevks ktaklqdfks fllkdsetsq rlanlrqrveqfarafpmpg fdeh


Putative function


hydroxymethyltransferase


Example 2B (Category 1)

Line ID—ewv-b


Phenotype—Female sterile, no eggs laid. Eggs reach full maturity but are retained (“retained eggs” phenotype). Also has a mitotic phenotype: higher mitotic index, uneven chromosome staining, tangled and badly defined chromosomes with frequent bridges


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003486 (10D4-6)


P element insertion site sequence

(SEQ ID NO: 104)GACAGGAGCAGCTCGGAACGGACAGGAAAAGCAGGAGACTAAACAGTAAGCAATAAATTGATTTGGCGTATAGTAGCTTACACCAAAGTACATATATTGCCGCATATATAGCCAGCCGGTCACTTGCGGATCAGCCAACGTCCTGGGCCCCAAGGCGATAGATACCACGATAAGGAGATACAGCGATACCACCAATCATTAGCAGGCGACAACGACACATCCGCATCCGCAGAAGATGTCCAACGGCAAGGCGACGGTCTCGTTCTTCGAGACCGGGAGCACCAAACAGTTCGAGTACTGCTACCAGCTCTATCCCCAGGTTCTTAAGCTAAAGGCCGAGAAGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGNCACGACGNTGNAAAACGACGGNCANNGCCANNCTNTGNTGNTNTAAACNACNCATT


Annotated Drosophila genome Complete Genome candidate—CG2446 (2 transcripts)—encodes a novel protein which may be a glycosylation/membrane protein

(SEQ ID NO: 105)AGATAGAACGACAACTCCTGTTCCCGGTTCGTCGTCGTTCGTCATTCCCATATTCGCTTCTCGTATTCCCTCCCATTCCCATTCGCAATCCCAATTCCCAATTCCCGTCACACGAGTTAGCAGCACATCGCACAGCTGCATCGCTCCGCTCCGATCCTTTTTAATTTTTTGTTGTGCCTTCGGTGGCGTGCTCATTTCGAGAACAGAGTAACCCCTTTTTATTTGTCAGTTGTCAACGGCGCCCCTGCAGGCAGAAAGCAGAAACTGAAACAGCAGAGGAAGAAGAAGAAGCAGCACAGCACGGGCACAGCACGAAGCACGCAGCACAGCACAAGCACAGAGGCGAAGCGAAGCAAAGCAAAGCAGAGGCAACACAGAAAAACAGCAAAGCATTGGAGTAGTTGTTTGGATGTGGACGGAAAGGAAGACTGGCGGCGACTAACTAAAAGCAGTACGTTGACAGGAGCAGCTCGGAACGGACAGGAAAAGCAGGAGACTAAACACCAGCCGGTCACTTGCGGATCAGCCAACGTCCTGGGCCCCAAGGCGATAGATACCACGATAAGGAGATACAGCGATACCACCAATCATTAGCAGGCGACAACGACACATCCGCATCCGCAGAAGATGTCCAACGGCAAGGCGACGGTCTCGTTCTTCGAGACCGGGAGCACCAAACAGTTCGAGTACTGCTACCAGCTCTATCCCCAGGTTCTTAAGCTAAAGGCCGAGAAGCGCTGCAAGAAGCCGCAAGAGCTGATCCGCCTGGATCAGTGGTATCAGAATGAACTGCCCAAATTGATTAAGGCACGCGGCAAGGACGCGCATATGGTATACGATGAGCTCGTCCAGTCGATGAAGTGGAAGCAGTCGCGCGGCAAATTCTATCCGCAGCTATCCTACCTGGTCAAGGTCAACACACCGCGCGCCGTCATCCAGGAGACAAAGAAGGCCTTCCGCAAGCTGCCCAATCTGGAGCAGGCGATCACAGCTTTATCGAACCTCAAGGGCGTTGGCACCACAATGGCCAGTGCACTGCTGGCAGCCGCAGCTCCCGATTCGGCACCATTCATGGCCGACGAGTGCCTGATGGCCATACCAGAGATCGAGGGCATCGATTACACCACCAAGGAGTACCTCAACTTCGTCAATCACATTCAGGCCACCGTGGAGCGCCTCAATGCGGAGGTGGGCGGGGATACGCCGCACTGGTCGCCTCATCGCGTGGAGCTGGCCCTCTGGTCACACTATGTGGCCAATGATCTCAGTCCCGAGATGCTCGACGATATGCCGCCGCCTGGATCCGGCGCCTCCACTGGCACCGGTTCACTCAGCACAAACGGCAACAGCAGCAAGGTGCTCGATGGCGACGATACCAACGATGGTGTGGGTGTTGATTTGGACGACGAAAGCCAAGGAGCAGGCGGTCGCAACACTGCTACAGAATCGGAGACAGAGAATGAGAACACCAACCCGGCTGCTCTGACGCCTCTACAGTCGGGCGAGGCCAAGAACAACGCAGCTGCCGTTGGCGCCGCCCTGCAGGACGGTGACTCCAACTTTGTTTCGAACGATTCCACCTCCCAGGAGCCGATCATCGATGACAACGATGGCACCACACAGACAACGGCCACCACTTCCACAGAGGACGGTGAGCCCATCGCCCTAGACATTGGCATTGGCATCGGTTCGAGTGGAACACCGCTCGCCTCGGACTCTGAAAGCAATCAGGAGGCGCCGCCCAAGACCAACAGCCTGCCCATCCTGACTCCCACACAGCACTCGAGCCAGAATCAGAATCAAAAGCAGTCGCCGAGCCAGCCCCACAAAACTAACAATTCGATCACCAACAACGGTCAGCCTGCTCCTTTGGCAGAAGAGGAAGCGGTTACAGCAGCACCACAGCCAGCCAGCAAAGCGACTGCAGCACCAGCCAATGGAAATGGTAACGGGAACGGCGTCCTGGGCGACGAGGATGAGGATG
AGGCGGAGGACGAGGAGGAAGATGAGCTGGACGAGGAGGAGGATAATGAGGCGGAGCTAGAGGCTGACGAGAGCAATAGCAGCAACGGCATTGTGAGGGACAGTAAACTGCAGCAGCTGGCGGCGAACAAGGCGGTGGATGCGGTTTCACCGGTAGCAGCGGGTGCAGACTCGGCACCAGCCATTGGACAGAAGCGTACTGCCCTGCACTGCGATATGGAGCTGAAGAACGCCGGCGGAGTGGGTGTGGGCGTGGGGGAGAAGTCACCGGATCTAAAGAAACTGCGCAGCGAATGA(SEQ ID NO: 106)MSNGKATVSFFETGSTKQFEYCYQLYPQVLKLKAEKRCKKPQELIRLDQWYQNELPKLIKARGKDAHMVYDELVQSMKWKQSRGKFYPQLSYLVKVNTPRAVIQETKKAFRKLPNLEQAITALSNLKGVGTTMASALLAAAAPDSAPFMADECLMAIPEIEGIDYTTKEYLNFVNHIQATVERLNAEVGGDTPHWSPHRVELALWSHYVANDLSPEMLDDMPPPGSGASTGTGSLSTNGNSSKVLDGDDTNDGVGVDLDDESQGAGGRNTATESETENENTNPAALTPLQSGEAKNNAAAVGAALQDGDSNFVSNDSTSQEPIIDDNDGTTQTTATTSTEDGEPIALDIGIGIGSSGTPLASDSESNQEAPPKTNSLPILTPTQHSSQNQNQKQSPSQPHKTNNSITNNGQPAPLAEEEAVTAAPQPASKATAAPANGNGNGNGVLGDEDEDEAEDEEEDELDEEEDNEAELEADESNSSNGIVRDSKLQQLAANKAVDAVSPVAAGADSAPAIGQKRTALHCDMELKNAGGVGVGVGEKSPDLKKLRSE(SEQ ID NO: 107)GCCTGTCAGTTTGACTGTGTGAGTGCATGGCGGACTAAAAAGAACCCGACGACAGCACTGTAAAAATTCGATTTGTGTGCTGTGCAAACGGCGGCGGAAGCGAGCAGATTTTTGGCAAATAGTGAGCGATTATCGGATTGAGTAAATACAACAAACAACAGAGACACGGCCGCAGCAGCAGCAGCATTAACACAGTACGTTGACAGGAGCAGCTCGGAACGGACAGGAAAAGCAGGAGACTAAACACCAGCCGGTCACTTGCGGATCAGCCAACGTCCTGGGCCCCAAGGCGATAGATACCACGATAAGGAGATACAGCGATACCACCAATCATTAGCAGGCGACAACGACACATCCGCATCCGCAGAAGATGTCCAACGGCAAGGCGACGGTCTCGTTCTTCGAGACCGGGAGCACCAAACAGTTCGAGTACTGCTACCAGCTCTATCCCCAGGTTCTTAAGCTAAAGGCCGAGAAGCGCTGCAAGAAGCCGCAAGAGCTGATCCGCCTGGATCAGTGGTATCAGAATGAACTGCCCAAATTGATTAAGGCACGCGGCAAGGACGCGCATATGGTATACGATGAGCTCGTCCAGTCGATGAAGTGGAAGCAGTCGCGCGGCAAATTCTATCCGCAGCTATCCTACCTGGTCAAGGTCAACACACCGCGCGCCGTCATCCAGGAGACAAAGAAGGCCTTCCGCAAGCTGCCCAATCTGGAGCAGGCGATCACAGCTTTATCGAACCTCAAGGGCGTTGGCACCACAATGGCCAGTGCACTGCTGGCAGCCGCAGCTCCCGATTCGGCACCATTCATGGCCGACGAGTGCCTGATGGCCATACCAGAGATCGAGGGCATCGATTACACCACCAAGGAGTACCTCAACTTCGTCAATCACATTCAGGCCACCGTGGAGCGCCTCAATGCGGAGGTGGGCGGGGATACGCCGCACTGGTCGCCTCATCGCGTGGAGCTGGCCCTCTGGTCACACTATGTGGCCAATGATCTCAGTCCCGAGATGCTCGACGATATGCCGCCGCCTGGATCCGGCGCCTCCACTGGCACCGGTTCACTCAGCACAAACGGCAACAGCAGCAAGGTGCTCGATGGCGACGATACCAA
CGATGGTGTGGGTGTTGATTTGGACGACGAAAGCCAAGGAGCAGGCGGTCGCAACACTGCTACAGAATCGGAGACAGAGAATGAGAACACCAACCCGGCTGCTCTGACGCCTCTACAGTCGGGCGAGGCCAAGAACAACGCAGCTGCCGTTGGCGCCGCCCTGCAGGACGGTGACTCCAACTTTGTTTCGAACGATTCCACCTCCCAGGAGCCGATCATCGATGACAACGATGGCACCACACAGACAACGGCCACCACTTCCACAGAGGACGGTGAGCCCATCGCCCTAGACATTGGCATTGGCATCGGTTCGAGTGGAACACCGCTCGCCTCGGACTCTGAAAGCAATCAGGAGGCGCCGCCCAAGACCAACAGCCTGCCCATCCTGACTCCCACACAGCACTCGAGCCAGAATCAGAATCAAAAGCAGTCGCCGAGCCAGCCCCACAAAACTAACAATTCGATCACCAACAACGGTCAGCCTGCTCCTTTGGCAGAAGAGGAAGCGGTTACAGCAGCACCACAGCCAGCCAGCAAAGCGACTGCAGCACCAGCCAATGGAAATGGTAACGGGAACGGCGTCCTGGGCGACGAGGATGAGGATGAGGCGGAGGACGAGGAGGAAGATGAGCTGGACGAGGAGGAGGATAATGAGGCGGAGCTAGAGGCTGACGAGAGCAATAGCAGCAACGGCATTGTGAGGGACAGTAAACTGCAGCAGCTGGCGGCGAACAAGGCGGTGGATGCGGTTTCACCGGTAGCAGCGGGTGCAGACTCGGCACCAGCCATTGGACAGAAGCGTACTGCCCTGCACTGCGATATGGAGCTGAAGAACGCCGGCGGAGTGGGTGTGGGCGTGGGGGAGAAGTCACCGGATCTAAAGAAACTGCGCAGCGAATGA(SEQ ID NO: 108)MSNGKATVSFFETGSTKQFEYCYQLYPQVLKLKAEKRCKKPQELIRLDQWYQNELPKLIKARGKDAHMVYDELVQSMKWKQSRGKFYPQLSYLVKVNTPRAVIQETKKAFRKLPNLEQAITALSNLKGVGTTMASALLAAAAPDSAPFMADECLMAIPEIEGIDYTTKEYLNFVNHIQATVERLNAEVGGDTPHWSPHRVELALWSHYVANDLSPEMLDDMPPPGSGASTGTGSLSTNGNSSKVLDGDDTNDGVGVDLDDESQGAGGRNTATESETENENTNPAALTPLQSGEAKNNAAAVGAALQDGDSNFVSNDSTSQEPIIDDNDGTTQTTATTSTEDGEPIALDIGIGIGSSGTPLASDSESNQEAPPKTNSLPILTPTQHSSQNQNQKQSPSQPHKTNNSITNNGQPAPLAEEEAVTAAPQPASKATAAPANGNGNGNGVLGDEDEDEAEDEEEDELDEEEDNEAELEADESNSSNGIVRDSKLQQLAANKAVDAVSPVAAGADSAPAIGQKRTALHCDMELKNAGGVGVGVGEKSPDLKKLRSE


Human homologue of Complete Genome candidate


CG2446—none


Putative function


glycosylation/membrane protein


Example 2C (Category 1)

Line ID—fs(1)06


Phenotype—Female sterile (semi-sterile), 2-3 fully matured eggs seen in each of the ovarioles


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003449 (9B6-7)


P element insertion site sequence

(SEQ ID NO: 109)CTNCATGNTGNAGGAGACAAGGCGTTCTATATTATATAGNNGATTTTNNTGTATATAAAGGAAGANCTGNGCTAANGNAANAGGCATCTCGATGANTTTNATAATNAGGGCAANTGGTANNAANGGTTTATGCCAAAGTATTACACACCAGGGNTGGGCACAACAGATCTTAACTNANNATAGGNNATTGGNATAANCTTAAATTTGTAAGATTNTGNAATAATATAGTAGAGANNNTCAATACGCATTANTAATNGTGACGATCCCNAGCATAAACTCAAAAAAANCTTATANTTTTATAAAGGCNANNCCNNACTAANNAATTAAANGAANNNCNGNCGCCNCNAAANGATGATTGNGCTATATAANNANANNATTGATNGAGGCACTTATATTATTATAATTAAAACACTTAATTATTNTGTGTGAAATGATTGCACTNNNNATTGGGCNAGAGCCTNNNNCGTATTGANANNNNNNNATTTNGGCTNNANCTGTAAATATCNTACAAACTCGTNATTGCTAAATAACTTTTGTATNCCCCNCTGGTCACTCTGACTTAAACGTNNTTCGNNAAAACAGCGGCTGATCACTGANGTTTTCTCCCGNNTTTCGCTNTCAANCCGAANTANAAACAGGNGAANNTCCCNGATAATTTGNGGNNTANCCCACTGATCACAGNGCCCNNGGATNNNCAAGGAANNGCGATCGAAACCCGNCCTGGNGNAACACNNTTTCCC


Annotated Drosophila genome Complete Genome candidate—CG2968—hydrogen transporting ATP synthase

(SEQ ID NO: 110)CAAAAACAGCGGCTGATCACTGAAGTTTTCTCGTGTTTTTCGCTATCAAACCGAAATAAAAACAGCCCAAAATGTCCTTCGTTAAGAACGCCCGTTTGCTGGCCGCCCGCGGCGCTCGCTTGGCCCAGAACCGCAGCTACTCGGATGAGATGAAGCTGACCTTCGCCGCCGCCAACAAAACCTTCTACGATGCCGCTGTGGTGCGCCAAATCGATGTGCCTTCCTTCTCGGGATCCTTCGGCATCCTGGCCAAGCACGTGCCCACTCTGGCTGTCCTGAAGCCCGGCGTTGTCCAGGTGGTGGAAAACGATGGCAAGACCCTCAAGTTCTTCGTCTCCAGCGGTTCCGTCACCGTCAACGAGGATTCCTCCGTTCAGGTTCTGGCCGAGGAGGCCCACAACATCGAGGACATCGATGCCAATGAGGCGCGCCAGCTGCTCGCGAAATACCAGTCACAGCTTAGCTCCGCTGGCGACGACAAGGCCAAGGCCCAGGCTGCCATTGCCGTGGAGGTCGCCGAAGCGTTAGTCAAGGCTGCCGAATAGACGTAATCACCACACAACCGCCACCAATAAACCACAATCGATGCTTTGTGTCTGAAATAAATAAAAAACATAACGATCACCTTAAAAAGCCAGAGAGTTATGAAACAATAAAAAAGCGA(SEQ ID NO: 111)MSFVKNARLLAARGARLAQNRSYSDEMKLTFAAANKTFYDAAVVRQIDVPSFSGSFGILAKHVPTLAVLKPGVVQVVENDGKTLKFFVSSGSVTVNEDSSVQVLAEEAHNIEDIDANEARQLLAKYQSQLSSAGDDKAKAQAAIAVEVAEALVKAAE


Human homologue of Complete Genome candidate


CAA45016—H(+)-transporting ATP synthase, delta-subunit of the human mitochondrial ATP synthase complex

(SEQ ID NO: 112)1gtcctcctcg ccctccaggc cgcccgcgcc gcgccggagtccgctgtccg ccagctaccc61gcttcctgcc gcccgccgct gccatgctgc ccgccgcgctgctccgccgc ccgggacttg121gccgcctcgt ccgccacgcc cgtgcctatg ccgaggccgccgccgccccg gctgccgcct181ctggccccaa ccagatgtcc ttcaccttcg cctctcccacgcaggtgttc ttcaacggtg241ccaacgtccg gcaggtggac gtgcccacgc tgaccggagccttcggcatc ctggcggccc301acgtgcccac gctgcaggtc ctgcggccgg ggctggtcgtggtgcatgca gaggacggca361ccacctccaa atactttgtg agcagcggtt ccatcgcagtgaacgccgac tcttcggtgc421agttgttggc cgaagaggcc gtgacgctgg acatgttggacctgggggca gccaaggcaa481acttggagaa ggcccaggcg gagctggtgg ggacagctgacgaggccacg cgggcagaga541tccagatccg aatcgaggcc aacgaggccc tggtgaaggccctggagtag gcggtgcgta601cccggtgtcc cgaggcccgg ccaggggctg ggcagggatgccaggtgggc ccagccagct661cctggggtcc cggccacctg gggaagccgc gcctgccaaggaggccacca gagggcagtg721caggcttctg cctgggcccc aggccctgcc tgtgttgaaagctctgggga ctgggccagg781gaagctcctc ctcagctttg agctgtggct gccacccatggggctctcct tccgcctctc841aagatccccc cagcctgacg ggccgcttac catcccctctgccctgcaga gccagccgcc901aaggttgacc tcagcttcgg agccacctct ggatgaactgcccccagccc ccgccccatt961aaagacccgg aagcctgaaa aaaaaaaaaa aaaa(SEQ ID NO: 113)1mlpaallrrp glgrlvrhar ayaeaaaapa aasgpnqmsftfasptqvff nganvrqvdv61ptltgafgil aahvptlqvl rpglvvvhae dgttskyfvssgsiavnads svqllaeeav121tldmldlgaa kanlekaqae lvgtadeatr aeiqirieanealvkale


Putative function


hydrogen transporting ATP synthase


Category 2—Male Steriles


Example 3 (Category 2)

Line ID—167


Phenotype—lethal phase pharate adult, cytokinesis defect. Some onion stage cysts with large nebenkerns


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003428 (3F4-5)


P element insertion site—293,654


Annotated Drosophila genome Complete Genome candidate—CG2829—BcDNA:GH07910 tousled kinase (2 splice variants)

(SEQ ID NO: 114)AGTTTCATTCGGGGATGCTTGGCCTATCGCAAGGAGGATCGCATGGATGTGTTCGCACTGGCCAGGCACGAGTACATTCAGCCACCGATACCGAAACATGGGCGCGGTTCGCTCAATCAGCAACAGCAGGCGCAACAACAGCAGCAGCAACAACAGCAACAGCAGCAGCAACAGTCGTCGACGTCACAGGCCAATTCTACAGGCCAGACATCTTTCTCTGCCCACATGTTTGGCAATATGAATCAGTCGAGTTCGTCCTAGATGAGAGCGACTGCAAAAAAATCGGAATAAACACGGTTATAATATATAAGTACAAATAAACCATATATATGTGTTTATGTTATGTATATATACATAAAGGAAAATAACAAGGCAAATGTGAAAATTAGTGCAAACTGAACGAAAAGACAAAAATAAAACAAAAGGAAACCCAAATGTGATAATATTGTAATATAATGTGAAAAGCAAAACACACACAAATACACAACTCACGCACTTAGCCACGTATGTGTGTGCAGAAAAATATGCGGCGCTTAAAAAAGATGTCCCCCGGCGCCCATTTGCAGATGTCCCCGCAGAACACTTCGTCCCTAAGTCAACACCATCCACATCAACAGCAACAGTTACAACCCCCACAGCAGCAACAACAGCATTTCCCTAACCATCACAGCGCCCAGCAACAGTCGCAGCAGCAGCAGCAACAGGAGCAACAGAATCCCCAGCAGCAGGCGCAACAGCAGCAGCAGATACTCCCACATCAACATTTGCAGCACCTGCACAAGCATCCGCATCAGCTGCAACTGCATCAGCAGCAGCAACAACAACTCCACCAGCAACAGCAGCAACACTTCCACCAGCAGTCGCTGCAAGGGCTGCATCAGGGTAGCAGCAATCCGGATTCGAATATGAGCACTGGCTCCTCGCATAGCGAGAAGGATGTCAATGATATGCTGAGTGGCGGTGCAGCAACGCCAGGAGCTGCAGCAGCAGCGATTCAACAGCAACATCCCGCCTTTGCGCCCACACTGGGAATGCAGCAACCACCGCCGCCCCCACCTCAACACTCCAATAATGGAGGCGAGATGGGCTACTTGTCGGCAGGCACGACCACGACGACGTCGGTGTTAACGGTAGGCAAGCCTCGGACGCCAGCGGAGCGGAAACGGAAGCGAAAAATGCCTCCATGTGCCACTAGTGCGGATGAGGCGGGGAGTGGCGGTGGCTCTGGCGGAGCAGGAGCAACCGTTGTTAACAACAGCAGCCTGAAGGGCAAATCATTGGCCTTTCGTGATATGCCCAAGGTAAACATGAGCCTGAATCTGGGCGATCGTCTGGGAGGATCTGCAGGAAGCGGAGTAGGAGCCGGTGGCGCCGGAAGCGGGGGAGGTGGCGCTGGTTCCGGTTCTGGAAGCGGTGGCGGCAAAAGCGCCCGCCTGATGCTGCCAGTCAGCGACAACAAGAAGATCAACGACTATTTCAATAAGCAGCAAACGGGCGTGGGCGTCGGTGTGCCAGGTGGTGCGGGAGGCAATACCGCTGGCCTTCGAGGATCACATACGGGAGGTGGCAGCAAGTCACCCTCATCCGCCCAGCAGCAGCAAACGGCGGCACAGCAGCAGGGAAGCGGTGTTGCGACGGGAGGCAGTGCAGGCGGTTCCGCTGGCAACCAGGTGCAAGTGCAAACGAGCAGCGCTTACGCCCTTTACCCACCAGCTAGTCCCCAAACCCAGACGTCACAGCAACAGCAGCAGCAGCAACCGGGATCAGACTTTCACTATGTCAACTCCAGCAAGGCGCAGCAACAACAGCAGCGTCAACAGCAACAGACTTCCAATCAAATGGTTCCTCCACACGTGGTCGTTGGCCTTGGTGGTCATCCACTGAGCCTCGCGTCCATTCAGCAGCAGACGCCCTTATCCCAGCAGCAACAGCAGCAACAACAGCAGCAGCAACAGCAGCAACTGGGACCACCGACCACATCGAC
GGCCTCCGTCGTGCCAACGCATCCGCATCAACTCGGATCCCTGGGAGTTGTTGGGATGGTCGGTGTGGGTGTTGGCGTGGGCGTTGGAGTAAATGTGGGTGTGGGACCACCACTGCCACCACCACCGCCGATGGCCATGCCAGCGGCCATTATCACTTATAGTAAGGCCACTCAAACGGAGGTGTCGCTGCATGAATTGCAGGAGCGCGAAGCGGAGCACGAATCGGGCAAGGTGAAGCTAGACGAGATGACACGGCTGTCCGATGAACAAAAGTCCCAAATTGTTGGCAACCAGAAGACGATTGACCAGCACAAGTGCCACATAGCCAAGTGTATTGATGTGGTCAAGAAGCTGTTGAAGGAGAAGAGCAGCATCGAGAAGAAGGAGGCGCGACAGAAGTGCATGCAGAATCGCCTCAGGCTCGGACAGTTTGTTACCCAACGAGTGGGCGCCACATTCCAGGAGAACTGGACGGACGGCTATGCGTTCCAGGAGCTGAGTCGGCGGCAAGAAGAAATAACCGCTGAGCGTGAAGAGATAGATCGGCAGAAAAAGCAGCTGATGAAAAAGCGTCCGGCGGAGTCCGGACGCAAGCGCAACAACAACAGTAACCAGAACAACCAGCAGCAGCAGCAACAGCAACACCAGCAACAGCAGCAGCAACAAAATTCCAACTCGAACGATTCCACGCAGCTGACGAGCGGAGTTGTTACCGGTCCAGGCAGTGATCGTGTGAGCGTAAGCGTCGACAGCGGATTGGGTGGCAATAATGCGGGCGCGATCGGTGGCGGAACCGTTGGTGGTGGCGTTGGAGGTGGTGGTGTTGGAGGCGGTGGTGTCGGAGGCGGCGGTGGACGTGGACTTTCTCGCAGCAATTCGACGCAGGCCAATCAGGCTCAATTGCTGCACAACGGCGGTGGTGGTTCGGGCGGCAATGTCGGCAACTCGGGCGGCGTTGGCGACCGCTTGTCAGATCGAGGAGGAGGAGGTGGCGGCATCGGCGGAAACGATAGCGGCAGCTGCTCGGACTCGGGCACTTTCCTGAAGCCAGACCCCGTATCGGGTGCCTACACAGCGCAGGAGTATTACGAGTACGATGAGATCCTCAAGTTGCGACAAAATGCCCTCAAAAAGGAGGACGCCGACCTGCAGCTGGAGATGGAGAAGCTGGAGCGGGAGCGCAATCTGCACATCCGAGAGCTCAAGCGGATTCTTAACGAGGATCAGTCCCGCTTTAACAATCATCCCGTGCTGAATGATCGCTATCTTCTGTTGATGCTCCTGGGCAAGGGCGGCTTCTCAGAGGTCCACAAGGCCTTCGACCTGAAGGAGCAACGCTATGTCGCATGTAAGGTGCACCAATTAAACAAGGATTGGAAGGAGGATAAGAAAGCTAATTATATCAAACACGCTTTGCGGGAATACAACATTCACAAGGCACTGGATCATCCGCGGGTCGTCAAGCTATACGATGTCTTCGAGATCGATGCGAATTCCTTTTGCACAGTGCTCGAATACTGTGATGGCCACGATCTGGACTTCTATTTGAAGCAACATAAGACTATACCCGAGCGTGAAGCGCGCTCGATAATAATGCAGGTTGTATCTGCACTCAAGTATCTAAATGAGATTAAGCCTCCAGTTATCCACTACGATCTGAAGCCCGGCAACATTCTGCTTACCGAGGGCAACGTCTGCGGCGAGATTAAGATCACCGACTTCGGTCTGTCAAAGGTGATGGACGACGAGAATTACAATCCCGATCACGGCATGGATCTGACCTCTCAGGGGGCGGGAACCTACTGGTATCTGCCACCCGAGTGCTTTGTCGTGGGCAAAAATCCGCCGAAAATCTCCTCCAAAGTGGACGTATGGAGTGTGGGTGTTATCTTCTACCAGTGTCTGTACGGCAAAAAGCCCTTCGGTCACAATCAGTCGCAGGCCACGATTCTCGAGGAGAATACGATCCTGAAGGCCACCGAAGTGCAGTTCTCCAACA
AGCCAACCGTTTCTAACGAGGCCAAG(SEQ ID NO: 115)MCVQKNMRRLKKMSPGAHLQMSPQNTSSLSQHHPHQQQQLQPPQQQQQHFPNHHSAQQQSQQQQQQEQQNPQQQAQQQQQILPHQHLQHLHKHPHQLQLHQQQQQQLHQQQQQHFHQQSLQGLHQGSSNPDSNMSTGSSHSEKDVNDMLSGGAATPGAAAAAIQQQHPAFAPTLGMQQPPPPPPQHSNNGGEMGYLSAGTTTTTSVLTVGKPRTPAERKRKRKMPPCATSADEAGSGGGSGGAGATVVNNSSLKGKSLAFRDMPKVNMSLNLGDRLGGSAGSGVGAGGAGSGGGGAGSGSGSGGGKSARLMLPVSDNKKINDYFNKQQTGVGVGVPGGAGGNTAGLRGSHTGGGSKSPSSAQQQQTAAQQQGSGVATGGSAGGSAGNQVQVQTSSAYALYPPASPQTQTSQQQQQQQPGSDFHYVNSSKAQQQQQRQQQQTSNQMVPPHVVVGLGGHPLSLASIQQQTPLSQQQQQQQQQQQQQQLGPPTTSTASVVPTHPHQLGSLGVVGMVGVGVGVGVGVNVGVGPPLPPPPPMAMPAAIITYSKATQTEVSLHELQEREAEHESGKVKLDEMTRLSDEQKSQIVGNQKTIDQHKCHIAKCIDVVKKLLKEKSSIEKKEARQKCMQNRLRLGQFVTQRVGATFQENWTDGYAFQELSRRQEEITAEREEIDRQKKQLMKKRPAESGRKRNNNSNQNNQQQQQQQHQQQQQQQNSNSNDSTQLTSGVVTGPGSDRVSVSVDSGLGGNNAGAIGGGTVGGGVGGGGVGGGGVGGGGGRGLSRSNSTQANQAQLLHNGGGGSGGNVGNSGGVGDRLSDRGGGGGGIGGNDSGSCSDSGTFLKPDPVSGAYTAQEYYEYDEILKLRQNALKKEDADLQLEMEKLERERNLHIRELKRILNEDQSRFNNHPVLNDRYLLLMLLGKGGFSEVHKAFDLKEQRYVACKVHQLNKDWKEDKKANYIKHALREYNIHKALDHPRVVKLYDVFEIDANSFCTVLEYCDGHDLDFYLKQHKTIPEREARSIIMQVVSALKYLNEIKPPVIHYDLKPGNILLTEGNVCGEIKITDFGLSKVMDDENYNPDHGMDLTSQGAGTYWYLPPECFVVGKNPPKISSKVDVWSVGVIFYQCLYGKKPFGHNQSQATILEENTILKATEVQFSNKPTVSNEAK(SEQ ID NO: 
116)AGTTTCATTCGGGGATGCTTGGCCTATCGCAAGGAGGATCGCATGGATGTGTTCGCACTGGCCAGGCACGAGTACATTCAGCCACCGATACCGAAACATGGGCGCGGTTCGCTCAATCAGCAACAGCAGGCGCAACAACAGCAGCAGCAACAACAGCAACAGCAGCAGCAACAGTCGTCGACGTCACAGGCCAATTCTACAGGCCAGACATCTTTCTCTGCCCACATGTTTGGCAATATGAATCAGTCGAGTTCGTCCTAGTGGTGTCGGTGTCGTTTTGGTTTTGTCGGCGGTTGCTAAACACAATTTAAGTTCACTCGGTTAGCAGACATTACACACTGCCTGCTCTCATACATATTTACGCACTTGTATATACATGCAATGTGCCTGTGTGTGCGCAAGAAACCAGAAAAAACGAAAAGTACAACATTCGTTGAGTCGCGTTCGGCTTAATTTTTTTTTGTGTTACCGTGTGTGTGTTTGTGCTTTGGATTTGCCAATTTTAGCCGACTGGCTCTCAGTGTCGAACTTAAACTTAAAGAGCGAGCAACGTGACGTGTCGCCCAGTGTCGCTTAAAATTCGCGCACACAACTTCCTACTACAAAAAAACGAAAGAAAGAAGGAGAAAAAACGTTAAAGATGTCCCCCGGCGCCCATTTGCAGATGTCCCCGCAGAACACTTCGTCCCTAAGTCAACACCATCCACATCAACAGCAACAGTTACAACCCCCACAGCAGCAACAACAGCATTTCCCTAACCATCACAGCGCCCAGCAACAGTCGCAGCAGCAGCAGCAACAGGAGCAACAGAATCCCCAGCAGCAGGCGCAACAGCAGCAGCAGATACTCCCACATCAACATTTGCAGCACCTGCACAAGCATCCGCATCAGCTGCAACTGCATCAGCAGCAGCAACAACAACTCCACCAGCAACAGCAGCAACACTTCCACCAGCAGTCGCTGCAAGGGCTGCATCAGGGTAGCAGCAATCCGGATTCGAATATGAGCACTGGCTCCTCGCATAGCGAGAAGGATGTCAATGATATGCTGAGTGGCGGTGCAGCAACGCCAGGAGCTGCAGCAGCAGCGATTCAACAGCAACATCCCGCCTTTGCGCCCACACTGGGAATGCAGCAACCACCGCCGCCCCCACCTCAACACTCCAATAATGGAGGCGAGATGGGCTACTTGTCGGCAGGCACGACCACGACGACGTCGGTGTTAACGGTAGGCAAGCCTCGGACGCCAGCGGAGCGGAAACGGAAGCGAAAAATGCCTCCATGTGCCACTAGTGCGGATGAGGCGGGGAGTGGCGGTGGCTCTGGCGGAGCAGGAGCAACCGTTGTTAACAACAGCAGCCTGAAGGGCAAATCATTGGCCTTTCGTGATATGCCCAAGGTAAACATGAGCCTGAATCTGGGCGATCGTCTGGGAGGATCTGCAGGAAGCGGAGTAGGAGCCGGTGGCGCCGGAAGCGGGGGAGGTGGCGCTGGTTCCGGTTCTGGAAGCGGTGGCGGCAAAAGCGCCCGCCTGATGCTGCCAGTCAGCGACAACAAGAAGATCAACGACTATTTCAATAAGCAGCAAACGGGCGTGGGCGTCGGTGTGCCAGGTGGTGCGGGAGGCAATACCGCTGGCCTTCGAGGATCACATACGGGAGGTGGCAGCAAGTCACCCTCATCCGCCCAGCAGCAGCAAACGGCGGCACAGCAGCAGGGAAGCGGTGTTGCGACGGGAGGCAGTGCAGGCGGTTCCGCTGGCAACCAGGTGCAAGTGCAAACGAGCAGCGCTTACGCCCTTTACCCACCAGCTAGTCCCCAAACCCAGACGTCACAGCAACAGCAGCAGCAGCAACCGGGATCAGACTTTCACTATGTCAACTCCAGCAAGGCGCAGCAACAACAGCAGCGTCAACAGCAACAGACTTCCAATCAAATGGTTCCTCCACACGTGGTCGTTGGCCTTGGTGGTCATCCACTGAGCCTCGCGTCCATT
CAGCAGCAGACGCCCTTATCCCAGCAGCAACAGCAGCAACAACAGCAGCAGCAACAGCAGCAACTGGGACCACCGACCACATCGACGGCCTCCGTCGTGCCAACGCATCCGCATCAACTCGGATCCCTGGGAGTTGTTGGGATGGTCGGTGTGGGTGTTGGCGTGGGCGTTGGAGTAAATGTGGGTGTGGGACCACCACTGCCACCACCACCGCCGATGGCCATGCCAGCGGCCATTATCACTTATAGTAAGGCCACTCAAACGGAGGTGTCGCTGCATGAATTGCAGGAGCGCGAAGCGGAGCACGAATCGGGCAAGGTGAAGCTAGACGAGATGACACGGCTGTCCGATGAACAAAAGTCCCAAATTGTTGGCAACCAGAAGACGATTGACCAGCACAAGTGCCACATAGCCAAGTGTATTGATGTGGTCAAGAAGCTGTTGAAGGAGAAGAGCAGCATCGAGAAGAAGGAGGCGCGACAGAAGTGCATGCAGAATCGCCTCAGGCTCGGACAGTTTGTTACCCAACGAGTGGGCGCCACATTCCAGGAGAACTGGACGGACGGCTATGCGTTCCAGGAGCTGAGTCGGCGGCAAGAAGAAATAACCGCTGAGCGTGAAGAGATAGATCGGCAGAAAAAGCAGCTGATGAAAAAGCGTCCGGCGGAGTCCGGACGCAAGCGCAACAACAACAGTAACCAGAACAACCAGCAGCAGCAGCAACAGCAACACCAGCAACAGCAGCAGCAACAAAATTCCAACTCGAACGATTCCACGCAGCTGACGAGCGGAGTTGTTACCGGTCCAGGCAGTGATCGTGTGAGCGTAAGCGTCGACAGCGGATTGGGTGGCAATAATGCGGGCGCGATCGGTGGCGGAACCGTTGGTGGTGGCGTTGGAGGTGGTGGTGTTGGAGGCGGTGGTGTCGGAGGCGGCGGTGGACGTGGACTTTCTCGCAGCAATTCGACGCAGGCCAATCAGGCTCAATTGCTGCACAACGGCGGTGGTGGTTCGGGCGGCAATGTCGGCAACTCGGGCGGCGTTGGCGACCGCTTGTCAGATCGAGGAGGAGGAGGTGGCGGCATCGGCGGAAACGATAGCGGCAGCTGCTCGGACTCGGGCACTTTCCTGAAGCCAGACCCCGTATCGGGTGCCTACACAGCGCAGGAGTATTACGAGTACGATGAGATCCTCAAGTTGCGACAAAATGCCCTCAAAAAGGAGGACGCCGACCTGCAGCTGGAGATGGAGAAGCTGGAGCGGGAGCGCAATCTGCACATCCGAGAGCTCAAGCGGATTCTTAACGAGGATCAGTCCCGCTTTAACAATCATCCCGTGCTGAATGATCGCTATCTTCTGTTGATGCTCCTGGGCAAGGGCGGCTTCTCAGAGGTCCACAAGGCCTTCGACCTGAAGGAGCAACGCTATGTCGCATGTAAGGTGCACCAATTAAACAAGGATTGGAAGGAGGATAAGAAAGCTAATTATATCAAACACGCTTTGCGGGAATACAACATTCACAAGGCACTGGATCATCCGCGGGTCGTCAAGCTATACGATGTCTTCGAGATCGATGCGAATTCCTTTTGCACAGTGCTCGAATACTGTGATGGCCACGATCTGGACTTCTATTTGAAGCAACATAAGACTATACCCGAGCGTGAAGCGCGCTCGATAATAATGCAGGTTGTATCTGCACTCAAGTATCTAAATGAGATTAAGCCTCCAGTTATCCACTACGATCTGAAGCCCGGCAACATTCTGCTTACCGAGGGCAACGTCTGCGGCGAGATTAAGATCACCGACTTCGGTCTGTCAAAGGTGATGGACGACGAGAATTACAATCCCGATCACGGCATGGATCTGACCTCTCAGGGGGCGGGAACCTACTGGTATCTGCCACCCGAGTGCTTTGTCGTGGGCAAAAATCCGCCGAAAATCTCCTCCAAAGTGGACGTATGGAGTGTGGGTGTTATCTTCTACCAGTGTCTGTACGGCAA
AAAGCCCTTCGGTCACAATCAGTCGCAGGCCACGATTCTCGAGGAGAATACGATCCTGAAGGCCACCGAAGTGCAGTTCTCCAACAAGCCAACCGTTTCTAACGAGGCCAAG(SEQ ID NO: 117)MSPGAHLQMSPQNTSSLSQHHPHQQQQLQPPQQQQQHFPNHHSAQQQSQQQQQQEQQNPQQQAQQQQQILPHQHLQHLHKHPHQLQLHQQQQQQLHQQQQQHFHQQSLQGLHQGSSNPDSNMSTGSSHSEKDVNDMLSGGAATPGAAAAAIQQQHPAFAPTLGMQQPPPPPPQHSNNGGEMGYLSAGTTTTTSVLTVGKPRTPAERKRKRKMPPCATSADEAGSGGGSGGAGATVVNNSSLKGKSLAFRDMPKVNMSLNLGDRLGGSAGSGVGAGGAGSGGGGAGSGSGSGGGKSARLMLPVSDNKKINDYFNKQQTGVGVGVPGGAGGNTAGLRGSHTGGGSKSPSSAQQQQTAAQQQGSGVATGGSAGGSAGNQVQVQTSSAYALYPPASPQTQTSQQQQQQQPGSDFHYVNSSKAQQQQQRQQQQTSNQMVPPHVVVGLGGHPLSLASIQQQTPLSQQQQQQQQQQQQQQLGPPTTSTASVVPTHPHQLGSLGVVGMVGVGVGVGVGVNVGVGPPLPPPPPMAMPAAIITYSKATQTEVSLHELQEREAEHESGKVKLDEMTRLSDEQKSQIVGNQKTIDQHKCHIAKCIDVVKKLLKEKSSIEKKEARQKCMQNRLRLGQFVTQRVGATFQENWTDGYAFQELSRRQEEITAEREEIDRQKKQLMKKRPAESGRKRNNNSNQNNQQQQQQQHQQQQQQQNSNSNDSTQLTSGVVTGPGSDRVSVSVDSGLGGNNAGAIGGGTVGGGVGGGGVGGGGVGGGGGRGLSRSNSTQANQAQLLHNGGGGSGGNVGNSGGVGDRLSDRGGGGGGIGGNDSGSCSDSGTFLKPDPVSGAYTAQEYYEYDEILRLRQNALKKEDADLQLEMEKLERERNLHIRELKRILNEDQSRFNNHPVLNDRYLLLMLLGKGGFSEVHKAFDLKEQRYVACKVHQLNKDWKEDKKANYIKHALREYNIHKALDHPRVVKLYDVFEIDANSFCTVLEYCDGHDLDFYLKQHKTIPEREARSIIMQVVSALKYLNEIKPPVIHYDLKPGNILLTEGNVCGEIKITDFGLSKVMDDENYNPDHGMDLTSQGAGTYWYLPPECFVVGKNPPKISSKVDVWSVGVIFYQCLYGKKPFGHNQSQATILEENTILKATEVQFSNKPTVSNEAK


Human homologue of Complete Genome candidate


AAF03095—tousled-like kinase 2

(SEQ ID NO:118)1ccgggcgggg ggttgcggcg ctcaggagag gccccggctc cgccccgggc ctgcccaggg61ggagagcgga gctccgcagc cgggtcgggt cggggcccct cccgggagga gcgtggagcg121cggcggcggc ggcggcagca gaaatgatgg aagaattgca tagcctggac ccacgacggc181aggaattatt ggaggccagg tttactggag taggtgttag taagggacca cttaatagtg241agtcttccaa ccagagcttg tgcagcgtcg gatccttgag tgataaagaa gtagagactc301ccgagaaaaa gcagaatgac cagcgaaatc ggaaaagaaa agctgaacca tatgaaacta361gccaagggaa aggcactcct aggggacata aaattagtga ttactttgag tttgctgggg421gaagcgcgcc aggaaccagc cctggcagaa gtgttccacc agttgcacga tcctcaccgc481aacattcctt atccaatccc ttaccgcgac gagtagaaca gcccctctat ggtttagatg541gcagtgctgc aaaggaggca acggaggagc agtctgctct gccaaccctc atgtcagtga601tgctagcaaa acctcggctt gacacagagc agctggcgca aaggggagct ggcctctgct661tcacttttgt ttcagctcag caaaacagtc cctcatctac gggatctggc aacacagagc721attcctgcag ctcccaaaaa cagatctcca tccagcacag acggacccag tccgacctca781caatagaaaa aatatctgca ctagaaaaca gtaagaattc tgacttagag aagaaggagg841gaagaataga tgatttatta agagccaact gtgatttgag acggcagatt gatgaacagc901aaaagatgct agagaaatac aaggaacgat taaatagatg tgtgacaatg agcaagaaac961tccttataga aaagtcaaaa caagagaaga tggcgtgtag agataagagc atgcaagacc1021gcttgagact gggccacttt actactgtcc gacacggagc ctcatttact gaacagtgga1081cagatggtta tgcttttcag aatcttatca agcaacagga aaggataaat tcacagaggg1141aagagataga aagacaacgg aaaatgttag caaagcggaa acctcctgcc atgggtcagg1201cccctcctgc aaccaatgag cagaaacagc ggaaaagcaa gaccaatgga gctgaaaatg1261aaacgttaac gttagcagaa taccatgaac aagaagaaat cttcaaactc agattaggtc1321atcttaaaaa ggaggaagca gagatccagg cagagctgga gagactagaa agggttagaa1381atctacatat cagggaacta aaaaggatac ataatgaaga taattcacaa tttaaagatc1441atccaacgct aaatgacaga tatttgttgt tacatctttt gggtagagga ggtttcagtg1501aagtttacaa ggcatttgat ctaacagagc aaagatacgt agctgtgaaa attcaccagt1561taaataaaaa ctggagagat gagaaaaagg agaattacca caagcatgca tgtagggaat1621accggattca taaagagctg gatcatccca gaatagttaa gctgtatgat tacttttcac1681tggatactga ctcgttttgt acagtattag aatactgtga gggaaatgat 
ctggacttct1741acctgaaaca gcacaaatta atgtcggaga aagaggcccg gtccattatc atgcagattg1801tgaatgcttt aaagtactta aatgaaataa aacctcccat catacactat gacctcaaac1861caggtaatat tcttttagta aatggtacag cgtgtggaga gataaaaatt acagattttg1921gtctttcgaa gatcatggat gatgatagct acaattcagt ggatggcatg gagctaacat1981cacaaggtgc tggtacttat tggtatttac caccagagtg ttttgtggtt gggaaagaac2041caccaaagat ctcaaataaa gttgatgtgt ggtcggtggg tgtgatcttc tatcagtgtc2101tttatggaag gaagcctttt ggccataacc agtctcagca agacatccta caagagaata2161cgattcttaa agctactgaa gtgcagttcc cgccaaagcc agtagtaaca cctgaagcaa2221aggcgtttat tcgacgatgc ttggcctacc gaaagaggga ccgcattgat gtccagcagc2281tggcctgtga tccctacttg ttgcctcaca tccgaaagtc agtctctaca agtagccctg2341ctggagctgc tattgcatca acctctgggg cgtccaataa cagttcttct aattgagact2401gactccaagg ccacaaactg ttcaacacac acaaagtgga caaatggcgt tcagcagcgg2461gtttggaaca tagcgaatcc gaatggatct gatgaaacct gtaccaggtg cttttatttt2521cttgcttttt tcccatccat agagcatgac agcatcgatt ctcattgagg agaaaccttg2581ggcagctccg gccaggcctt gtaggaaaag gccccgcccg aggttccagc gtcaacggcc2641actgtgtgtg gctgctctga gtgaggaaaa aattaaaaag aaaaactggt tccatgtact2701gtgaacttga aaacttgcag actcaggggg gtccctgatg cagtgcttca gatgaagaat2761gtggacttga aaatacagac tgggctagtc cagtgtctat atttaaactt gttcttttct2821tttaataaag tttaggtaac atctcctgaa aagcttgtag cacaaaggct cagctgggga2881tggtgtttga cttcggagga aaaaagttgc tattgcccgt taaaggcact agagttagtg2941ttttatccct aaataatttc aatttttaaa aacatgcagc ttccctctcc ccttttttat3001ttttgaaaga atacatttgg tcataaagtg aaacccgtat tagcaagtac gaggcaatgt3061tcattccaat cagatgcagc tttctcctcc gtctggtctc ctgtttgcaa ttgcttccct3121catctcagta gggaaaaaat tgagtgggag tactgagatg tgtgggtttt tgccattgga3181caaagaatga ggttagaaga ctgcagcttg gagtctctct aggttttcaa ctatttcttc3241acaatttgaa cacttgacgg ttgtcccttt taatttattt gaagtgctat ttttttaaat3301aaaggttcat ctgtccatgc aaaaaaa(SEQ ID NO:119)1meelhsldpr rqellearft gvgvskgpln sessnqslcs vgslsdkeve tpekkqndqr61nrkrkaepye tsqgkgtprg hkisdyfefa ggsapgtspg rsvppvarss pqhslsnplp121rrveqplygl 
dgsaakeate eqsalptlms vmlakprldt eqlaqrgagl cftfvsaqqn181spsstgsgnt ehscssqkqi siqhrrtqsd ltiekisale nsknsdlekk egriddllra241ncdlrrqide qqkmlekyke rlnrcvtmsk klliekskqe kmacrdksmq drlrlghftt301vrhgasfteq wtdgyafqnl ikqqerinsq reeierqrkm lakrkppamg qappatneqk361qrksktngae netltlaeyh eqeeifklrl ghlkkeeaei qaelerlerv rnlhirelkr421ihnednsqfk dhptlndryl llhllgrggf sevykafdlt eqryvavkih qlnknwrdek481kenyhkhacr eyrihkeldh privklydyf sldtdsfctv leycegndld fylkqhklms541ekearsiimq ivnalkylne ikppiihydl kpgnillvng tacgeikitd fglskimddd601synsvdgmel tsqgagtywy lppecfvvgk eppkisnkvd vwsvgvifyq clygrkpfgh661nqsqqdilqe ntilkatevq fppkpvvtpe akafirrcla yrkrdridvq qlacdpyllp721hirksvstss pagaaiasts gasnnsssn


Putative function


Serine/threonine kinase involved in DNA replication and cell cycle progression
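The human sequences in this listing are printed in a numbered layout in which each position label (1, 61, 121, ...) runs directly into the neighbouring residues. A minimal sketch of how such text might be cleaned before further analysis follows; the function name parse_numbered_sequence is hypothetical and not part of the original disclosure. It strips the labels and checks that each label equals the number of residues read so far plus one, which gives a simple guard against dropped or duplicated characters.

```python
import re


def parse_numbered_sequence(text):
    """Strip fused position labels from a flattened, numbered sequence.

    Returns (sequence, label_errors). label_errors holds one
    (label_found, label_expected) pair for every position label that
    does not equal the number of residues read so far plus one.
    """
    residues = []
    count = 0
    label_errors = []
    # Alternate between runs of digits (labels) and runs of letters (residues).
    for match in re.finditer(r"(\d+)|([A-Za-z]+)", text):
        number, letters = match.groups()
        if number is not None:
            if int(number) != count + 1:
                label_errors.append((int(number), count + 1))
        else:
            residues.append(letters)
            count += len(letters)
    return "".join(residues).upper(), label_errors


# Opening of SEQ ID NO:119 (first 80 residues), copied from the listing.
excerpt = (
    "1meelhsldpr rqellearft gvgvskgpln sessnqslcs vgslsdkeve "
    "tpekkqndqr61nrkrkaepye tsqgkgtprg"
)
sequence, label_errors = parse_numbered_sequence(excerpt)
print(len(sequence), label_errors)
```

An empty error list only shows that the labels and residue counts agree with each other; it does not validate the sequence against the database entry.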


Example 4 (Category 2)

Line ID—224


Phenotype—Semi-lethal male and female, cytokinesis defect. Onion stage cysts have variable sized Nebenkerns. Also has a mitotic phenotype: tangled, unevenly condensed chromosomes, anaphases with lagging chromosomes and bridges.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003450 (9C)


P element insertion site—139,674


Annotated Drosophila genome Complete Genome candidate—CG2096—flapwing, phosphatase type 1

(SEQ ID NO:120)ATCTGTAAGTGAAGTCCACTAACAACCGGTTTACTTGCAGTGCGCAGCTGCCGAACGGGCAAACAGGTCCAGATGACGGAGGCGGAGGTGCGTGGCCTCTGTCTCAAGTCGCGCGAGATCTTCTTGCAACAGCCCATCCTGCTGGAACTGGAGGCACCGCTGATCATCTGCGGCGACATCCACGGCCAGTACACAGACCTGTTGCGCCTGTTCGAGTACGGCGGATTCCCTCCGGCTGCCAACTACTTGTTCCTCGGCGACTACGTCGATCGGGGCAAGCAGTCCCTGGAGACCATCTGTCTGCTGCTGGCCTACAAGATCAAATATCCGGAGAACTTCTTCTTGTTGCGCGGCAACCACGAGTGCGCCAGTATTAATAGGATTTACGGCTTCTACGATGAGTGCAAGCGCCGATACAATGTCAAACTGTGGAAGACTTTCACAGATTGCTTCAACTGTCTGCCGGTAGCCGCCATTATTGACGAAAAGATCTTCTGCTGCCACGGCGGCCTCAGTCCCGATCTTCAGGGCATGGAGCAGATCCGTCGCCTAATGCGACCCACAGATGTGCCGGATACCGGGTTACTGTGCGATCTTCTGTGGAGTGATCCCGACAAGGATGTTCAGGGTTGGGGCGAGAATGATCGCGGTGTGAGCTTCACCTTCGGTGTGGATGTGGTCTCCAAGTTTTTGAACCGCCACGAGCTGGACTTGATCTGCCGTGCACATCAGGTTGTGGAGGATGGCTATGAGTTCTTTGCGCGTCGGCAACTGGTCACGTTGTTCTCGGCGCCCAATTACTGTGGAGAGTTCGACAATGCCGGCGGAATGATGACCGTGGACGACACGCTGATGTGCTCATTCCAGATCCTGAAACCATCCGAGAAGAAGGCCAAGTATCTGTACAGCGGAATGAACTCGTCGCGACCCACAACACCGCAGCGCAGCGCCCCAATGCTTGCGACCAACAAGAAGAAATAATATATCCATCCGCTTCCATTTCCTTAAAGGTTCAACAAACAACAGAAATAAACTTTTACATAGATACACACATATATACATATAAATATAACGAAACGATAGAAAAGGAGAGCGTTAGGCGATAGTAGAGAAAGGGCAAATGATAAATTAAATGTGTGAGCTATTAAAGCAAGCAAAATCGAAGTGCATGAATATCAACATCTATGTGAATCCGTCATTATCTGTTATCTGATGTGTCATCTGTATCCAACTTGATTACCTTATCCGTGTACCTGCTAGTTGCAGCAGCAACATCAGGAGCAACAACACCAGCAGCAGCAGCAGCAGAAACATCAGTGAAACACTCAGAGGCCCATAGTTAAGTCGATTCCTGCATTTGATGATTATCTGTTGAATGGAAATTGTGACAACGTCCCCGTAACAGCAGCTCCCAGATCCAAAACTCCCGAAACATGCAGATAAATAAATACATTAAAAGTACAGCGATGTTAAGCAATGAATTTATATATAGGCTTATTAATGTAAACT(SEQ ID NO:121)MTEAEVRGLCLKSREIFLQQPILLELEAPLIICGDIHGQYTDLLRLFEYGGFPPAANYLFLGDYVDRGKQSLETICLLLAYKIKYPENFFLLRGNHECASINRIYGFYDECKRRYNVKLWKTFTDCFNCLPVAAIIDEKIFCCHGGLSPDLQGMEQIRRLMRPTDVPDTGLLCDLLWSDPDKDVQGWGENDRGVSFTFGVDVVSKFLNRHELDLICRAHQVVEDGYEFFARRQLVTLFSAPNYCGEFDNAGGMMTVDDTLMCSFQILKPSEKKAKYLYSGMNSSRPTTPQRSAPMLATNKKK


Human homologue of Complete Genome candidate


NP_002700 protein phosphatase 1, catalytic subunit, beta isoform

(SEQ ID NO:122)1cctgggtctg acgcggccct gttcgagggg gcctctcttg tttatttatt tattttccgt61gggtgcctcc gagtgtgcgc gcgctctcgc tacccggcgg ggagggggtg gggggagggc121ccgggaaaag ggggagttgg agccggggtc gaaacgccgc gtgacttgta ggtgagagaa181cgccgagccg tcgccgcagc ctccgccgcc gagaagccct tgttcccgct gctgggaagg241agagtctgtg ccgacaagat ggcggacggg gagctgaacg tggacagcct catcacccgg301ctgctggagg tacgaggatg tcgtccagga aagattgtgc agatgactga agcagaagtt361cgaggcttat gtatcaagtc tcgggagatc tttctcagcc agcctattct tttggaattg421gaagcaccgc tgaaaatttg tggagatatt catggacaat atacagattt actgagatta481tttgaatatg gaggtttccc accagaagcc aactatcttt tcttaggaga ttatgtggac541agaggaaagc agtctttgga aaccatttgt ttgctattgg cttataaaat caaatatcca601gagaacttct ttctcttaag aggaaaccat gagtgtgcta gcatcaatcg catttatgga661ttctatgatg aatgcaaacg aagatttaat attaaattgt ggaagacctt cactgattgt721tttaactgtc tgcctatagc agccattgtg gatgagaaga tcttctgttg tcatggagga781ttgtcaccag acctgcaatc tatggagcag attcggagaa ttatgagacc tactgatgtc841cctgatacag gtttgctctg tgatttgcta tggtctgatc cagataagga tgtgcaaggc901tggggagaaa atgatcgtgg tgtttccttt acttttggag ctgatgtagt cagtaaattt961ctgaatcgtc atgatttaga tttgatttgt cgagctcatc aggtggtgga agatggatat1021gaattttttg ctaaacgaca gttggtaacc ttattttcag ccccaaatta ctgtggcgag1081tttgataatg ctggtggaat gatgagtgtg gatgaaactt tgatgtgttc atttcagata1141ttgaaaccat ctgaaaagaa agctaaatac cagtatggtg gactgaattc tggacgtcct1201gtcactccac ctcgaacagc taatccgccg aagaaaaggt gaagaaagga attctgtaaa1261gaaaccatca gatttgttaa ggacatactt cataatatat aagtgtgcac tgtaaaacca1321tccagccatt tgacaccctt tatgatgtca cacctttaac ttaaggagac gggtaaagga1381tcttaaattt ttttctaata gaaagatgtg ctacactgta ttgtaataag tatactctgt1441tatagtcaac aaagttaaat ccaaattcaa aattatccat taaagttaca tcttcatgta1501tcacaatttt taaagttgaa aagcatccca gttaaactag atgtgatagt taaaccagat1561gaaagcatga tgatccatct gtgtaatgtg gttttagtgt tgcttggttg tttaattatt1621ttgagcttgt tttgtttttg tttgttttca ctagaataat ggcaaatact tctaattttt1681ttccctaaac atttttaaaa gtgaaatatg ggaagagctt tacagacatt 
caccaactat1741tattttccct tgtttatcta cttagatatc tgtttaatct tactaagaaa actttcgcct1801cattacatta aaaaggaatt ttagagattg attgttttaa aaaaaaatac gcacattgtc1861caatccagtg attttaatca tacagtttga ctgggcaaac tttacagctg atagtgaata1921ttttgcttta tacaggaatt gacactgatt tggatttgtg cactctaatt tttaacttat1981tgatgctcta ttgtgcagta gcatttcatt taagataagg ctcatatagt attacccaac2041tagttggtaa tgtgattatg tggtaccttg gctttaggtt ttcattcgca cggaacacct2101tttggcatgc ttaacttcct ggtaacacct tcacctgcat tggttttctt tttctttttt2161ctttcttttt tttttttttt ttttttttga gttgttgttt gtttttagat ccacagtaca2221tgagaatcct tttttgacaa gccttggaaa gctgacactg tctctttttc ctccctctat2281acgaaggatg tatttaaatg aatgctggtc agtgggacat tttgtcaact atgggtattg2341ggtgcttaac tgtctaatat tgccatgtga atgttgtata cgattgtaag gcttatgtca2401ctaaagattt ttattctgat tttttcataa tcaaaggtca tatgatactg tatagacaag2461ctttgtagtg aagtatagta gcaataattt ctgtacctga tcaagtttat tgcagccttt2521cttttcctat ttcttttttt taagggttag tattaacaaa tggcaatgag tagaaaagtt2581aacatgaaga ttttagaagg agagaactta caggacacag atttgtgatt ctttgactgt2641gacactattg gatgtgattc taaaagcttt tattgagcat tgtcaaattt gtaagcttca2701tagggatgga catcatatct ataatgccct tctatatgtg ctaccataga tgtgacattt2761ttgaccttaa tatcgtcttt gaaaatgtta aattgagaaa cctgttaact tacattttat2821gaattggcac attgtattac ttactgcaag agatatttca ttttcagcac agtgcaaaag2881ttctttaaaa tgcatatgtc tttttttcta attccgtttt gttttaaagc acattttaaa2941tgtagttttc tcatttagta aaagttgtct aattgatatg aagcctgact gatttttttt3001ttccttacag tgagacattt aagcacacat tttattcaca tagatactat gtccttgaca3061tattgaaatg attcttttct gaaagtattc atgatctgca tatgatgtat taggttaggt3121cacaaaggtt ttatctgagg tgatttaaat aacttcctga ttggagtgtg taagctgagc3181gatttctaat aaaattttag ttgtacactt ttagtagtca tagtgaagca ggtctagaaa3241ataagccttt ggcagggaaa aagggcaatg ttgattaatc tcagtattaa accacattaa3301tctgtatccc attgtctggc ttttgtaaat tcatccaggt caagactaag tatgttggtt3361aataggaatc cttttttttt tttaaagact aaatgtgaaa aaataatcac tacttaagct3421aattaatatt ggtcattaaa tttaaaggat ggaaatttat 
catgtttaaa aattattcaa3481gcactcttaa aaccacttaa acagcctcca gtcataaaaa tgtgttcttt acaaatattt3541gcttggcaac acgacttgaa ataaataaaa ctttgtttct taggagaaaa(SEQ ID NO:123)1madgelnvds litrllevrg crpgkivqmt eaevrglcik sreiflsqpi lleleaplki61cgdihgqytd llrlfeyggf ppeanylflg dyvdrgkqsl eticlllayk ikypenffll121rgnhecasin riygfydeck rrfniklwkt ftdcfnclpi aaivdekifc chgglspdlq181smeqirrimr ptdvpdtgll cdllwsdpdk dvqgwgendr gvsftfgadv vskflnrhdl241dlicrahqvv edgyeffakr qlvtlfsapn ycgefdnagg mmsvdetlmc sfqilkpsek301kakyqyggln sgrpvtpprt anppkkr


Putative function


Protein phosphatase
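The relationship between a nucleotide sequence and the polypeptide it encodes can be checked by conceptual translation. The sketch below assumes the standard genetic code and uses only the first 60 nucleotides of the open reading frame of SEQ ID NO:120, starting at its ATG initiation codon; the helper name translate is illustrative and not part of the original disclosure.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 20 codons of the SEQ ID NO:120 open reading frame (excerpt only).
cds_start = "ATGACGGAGGCGGAGGTGCGTGGCCTCTGTCTCAAGTCGCGCGAGATCTTCTTGCAACAG"
peptide = translate(cds_start)
print(peptide)  # compare with the start of SEQ ID NO:121
```

The translated excerpt corresponds to the first 20 residues of SEQ ID NO:121 (MTEAEVRGLCLKSREIFLQQ).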


Example 5 (Category 2)

Line ID—231


Phenotype—Semi-lethal male and female, cytokinesis defect. In some cysts, variable sized Nebenkerns


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003429 (3F)


P element insertion site—153,730


Annotated Drosophila genome Complete Genome candidate—CG5014—vap-33-1 vesicle associated membrane protein

(SEQ ID NO:124)CACATCACTAGCTGACAGAATATATGGCTTTTTTACATTTTGCGTTTTCAACTGAAGTTTGCGAAGAAACCGAAGCGTGGTAAACCACTGAAATCGAAAATATCGACAGAAAAGCGACCTAAAGTCGGTGAAGAAGTCGCACGTTGATCGTTGTGTTTTTTTCCCGAAATTTTCTGCAAAAAGCCCGTGCGTGCGTGAGTTTCTCTGGCTCTTGCTTTTTTTTTGTCCATGCGTGTGTGTGTGGTCGCATAAATTTACCGATATTTCGCCTGTGAGAGCGAAACGAACGAAAAACGAAAGAAAAAAAGAGAGACGAGTAAAGTAAAACGAAACAGGCATAAAAACAGCAGCAGTTTTCTTGATATATTTGGCTAAAAAACGCAAACCAAACAGCCAGCAAGAACAACAAATAGCTGGGCAAAAACAGGACGCACAAAAAATAAAATTAAAACGATAAGAGGCGAAAAGCGGAGAGAGTGAAATTCTCGGCAGCAACAACGACAAGAACAACACCAGGAGCAGCAGCAACAACAACAACAAAAGCCAGCCGCCACAATGAGCAAATCACTCTTTGATCTTCCGTTGACCATTGAACCAGAACATGAGTTGCGTTTTGTGGGTCCCTTCACCCGACCCGTTGTCACAATCATGACTCTGCGCAACAACTCGGCTCTGCCTCTGGTCTTCAAGATCAAGACAACCGCCCCGAAACGCTACTGCGTACGTCCAAACATCGGCAAGATAATTCCCTTTCGATCAACCCAGGTGGAGATCTGCCTTCAGCCATTCGTCTACGATCAGCAGGAGAAGAACAAGCACAAGTTCATGGTGCAGAGCGTCCTGGCACCCATGGATGCTGATCTAAGCGATTTAAATAAATTGTGGAAGGATCTGGAGCCCGAGCAGCTGATGGACGCCAAACTGAAGTGCGTTTTCGAGATGCCCACCGCTGAGGCAAATGCTGAGAACACCAGCGGTGGTGGTGCCGTTGGCGGCGGAACCGGAGCTGCCGGAGGCGGAAGCGCGGGTGCCAATACTAGCTCAGCCAGCGCTGAGGCGCTCGAGAGCAAGCCGAAGCTCTCCAGCGAGGATAAGTTTAAGCCATCCAATTTGCTCGAAACGTCTGAGAGTCTGGACTTGCTGTCCGGAGAGATCAAAGCGCTGCGTGAATGCAACATTGAATTGCGAAGAGAGAATCTTCACTTGAAGGATCAAATCACACGTTTCCGGAGCTCGCCGGCCGTCAAACAGGTGAATGAGCCCTATGCCCCAGTCCTGGCTGAGAAGCAGATTCCGGTCTTTTACATTGCAGTTGCCATTGCTGCGGCCATCGTTAGCCTCCTGCTGGGCAAATTCTTTCTCTGA(SEQ ID NO:125)MSKSLFDLPLTIEPEHELRFVGPFTRPVVTIMTLRNNSALPLVFKIKTTAPKRYCVRPNIGKIIPFRSTQVEICLQPFVYDQQEKNKHKFMVQSVLAPMDADLSDLNKLWKDLEPEQLMDAKLKCVFEMPTAEANAENTSGGGAVGGGTGAAGGGSAGANTSSASAEALESKPKLSSEDKFKPSNLLETSESLDLLSGEIKALRECNIELRRENLHLKDQITRFRSSPAVKQVNEPYAPVLAEKQIPVFYIAVAIAAAIVSLLLGKFFL


Human homologue of Complete Genome candidate


AAD13577 VAMP-associated protein B

(SEQ ID NO:126)1gcgcgcccac ccggtagagg acccccgccc gtgccccgac cggtccccgc ctttttgtaa61aacttaaagc gggcgcagca ttaacgcttc ccgccccggt gacctctcag gggtctcccc121gccaaaggtg ctccgccgct aaggaacatg gcgaaggtgg agcaggtcct gagcctcgag181ccgcagcacg agctcaaatt ccgaggtccc ttcaccgatg ttgtcaccac caacctaaag241cttggcaacc cgacagaccg aaatgtgtgt tttaaggtga agactacagc accacgtagg301tactgtgtga ggcccaacag cggaatcatc gatgcagggg cctcaattaa tgtatctgtg361atgttacagc ctttcgatta tgatcccaat gagaaaagta aacacaagtt tatggttcag421tctatgtttg ctccaactga cacttcagat atggaagcag tatggaagga ggcaaaaccg481gaagacctta tggattcaaa acttagatgt gtgtttgaat tgccagcaga gaatgataaa541ccacatgatg tagaaataaa taaaattata tccacaactg catcaaagac agaaacacca601atagtgtcta agtctctgag ttcttctttg gatgacaccg aagttaagaa ggttatggaa661gaatgtaaga ggctgcaagg tgaagttcag aggctacggg aggagaacaa gcagttcaag721gaagaagatg gactgcggat gaggaagaca gtgcagagca acagccccat ttcagcatta781gccccaactg ggaaggaaga aggccttagc acccggctct tggctctggt ggttttgttc841tttatcgttg gtgtaattat tgggaagatt gccttgtaga ggtagcatgc acaggatggt901aaattggatt ggtggatcca ccatatcatg ggatttaaat ttatcataac catgtgtaaa961aagaaattaa tgtatgatga catctcacag gtcttgcctt taaattaccc ctccctgcac1021acacatacac agatacacac acacaaatat aatgtaacga tcttttagaa agttaaaaat1081gtatagtaac tgattgaggg ggaaaagaat gatctttatt aatgacaagg gaaaccatga1141gtaatgccac aatggcatat tgtaaatgtc attttaaaca ttggtaggcc ttggtacatg1201atgctggatt acctctctta aaatgacacc cttcctcgcc tgttggtgct ggcccttggg1261gagctggagc ccagcatgct ggggagtgcg gtcagctcca cacagtagtc cccacgtggc1321ccactcccgg cccaggctgc tttccgtgtc ttcagttctg tccaagccat cagctccttg1381ggactgatga acagagtcag aagcccaaag gaattgcact gtggcagcat cagacgtact1441cgtcataagt gagaggcgtg tgttgactga ttgacccagc gctttggaaa taaatggcag1501tgctttgttc acttaaaggg accaagctaa atttgtattg gttcatgtag tgaagtcaaa1561ctgttattca gagatgttta atgcatattt aacttattta atgtatttca tctcatgttt1621tcttattgtc acaagagtac agttaatgct gcgtgctgct gaactctgtt gggtgaactg1681gtattgctgc tggagggctg tgggctcctc tgtctctgga gagtctggtc 
atgtggaggt1741ggggtttatt gggatgctgg agaagagctg ccaggaagtg ttttttctgg gtcagtaaat1801aacaactgtc ataggcaggg aaattctcag tagtgacagt caactctagg ttaccttttt1861taatgaagag tagtcagtct tctagattgt tcttatacca cctctcaacc attactcaca1921cttccagcgc ccaggtccaa gtttgagcct gacctcccct tggggaccta gcctggagtc1981aggacaaatg gatcgggctg caaagggtta gaagcgaggg caccagcagt tgtgggtggg2041gagcaaggga agagagaaac tcttcagcga atccttctag tactagttga gagtttgact2101gtgaattaat tttatgccat aaaagaccaa cccagttctg tttgactatg tagcatcttg2161aaaagaaaaa ttataataaa gccccaaaat taaga(SEQ ID NO:127)1makveqvlsl epqhelkfrg pftdvvttnl klgnptdrnv cfkvkttapr rycvrpnsgi61idagasinvs vmlqpfdydp nekskhkfmv qsmfaptdts dmeavwkeak pedlmdsklr121cvfelpaend kphdveinki isttasktet pivskslsss lddtevkkvm eeckrlqgev181qrlreenkqf keedglrmrk tvqsnspisa laptgkeegl strllalvvl ffivgviigk241ial


Putative function


Membrane-associated protein that may be involved in priming synaptic vesicles
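Each Example in this listing follows the same layout of labelled fields (Line ID, genomic segment with map position, P element insertion site). As an illustration only, the following sketch collects those fields into a small record; the class InsertionRecord and the function parse_record are hypothetical names, and the parser assumes that every field line separates its label from its value with a single dash character as in the text above.

```python
import re
from dataclasses import dataclass
from typing import Optional

EM_DASH = "\u2014"  # separator between field label and value in the listing
SEGMENT_PREFIX = "Annotated Drosophila genome genomic segment"


@dataclass
class InsertionRecord:
    line_id: str
    segment: str                   # genomic segment accession, e.g. AE003429
    map_position: str              # cytological map position, e.g. 3F
    insertion_site: Optional[int]  # None when given as "not determined"


def parse_record(lines):
    """Build an InsertionRecord from the labelled lines of one Example."""
    fields = {}
    for line in lines:
        if EM_DASH in line:
            label, value = line.split(EM_DASH, 1)
            fields[label.strip()] = value.strip()
    segment_label = next(k for k in fields if k.startswith(SEGMENT_PREFIX))
    match = re.fullmatch(r"(\S+)\s*\((.+)\)", fields[segment_label])
    site_text = fields["P element insertion site"].replace(",", "")
    site = int(site_text) if site_text.isdigit() else None
    return InsertionRecord(
        fields["Line ID"], match.group(1), match.group(2), site
    )


# Fields of Example 5, copied from the listing.
example_5 = [
    f"Line ID{EM_DASH}231",
    "Annotated Drosophila genome genomic segment containing P element "
    f"insertion site (and map position){EM_DASH}AE003429 (3F)",
    f"P element insertion site{EM_DASH}153,730",
]
record = parse_record(example_5)
print(record)
```

Records whose insertion site is given as "not determined" are kept, with the site stored as None rather than guessed.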


Example 6 (Category 2)

Line ID—248


Phenotype—Male sterile, cytokinesis defect: different meiotic stages within one cyst, variable sized nuclei, 2-4 nuclei. Also has a mitotic phenotype: semi-lethal, rod-like overcondensed chromosomes, high mitotic index, lagging chromosomes and bridges.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003431 (4D 1)


P element insertion site—299,078


Annotated Drosophila genome Complete Genome candidate—CG6998—cutup (dynein light chain)

(SEQ ID NO:128)CAAAACGTTCAGTTGTGTTTCAGTTGTCGAGAAGTCAGGGTGTTTCTACCTTCCATTTACCGTTCCAGTGTAAAATTCAGGCGACACGCTTAGCGTTACCAAGGAGAACCGCTAAAAAGGGCCACTTTTCAAACGGTTAGATTCCAGTGAAGTTGTAAGCACACAGGGAACCTAAAAAAAAAAAAAACAGCCAAAATGTCTGATCGCAAGGCCGTGATTAAAAATGCCGACATGAGCGAGGAGATGCAGCAGGATGCCGTCGATTGTGCGACACAGGCCCTCGAGAAGTACAACATTGAAAAGGACATTGCGGCCTACATCAAGAAGGAGTTCGACAAAAAATACAATCCCACATGGCATTGCATTGTCGGTCGCAACTTTGGATCGTATGTCACACACGAGACGCGCCACTTTATTTACTTCTATTTGGGCCAGGTGGCTATTTTACTGTTTAAGAGCGGTTAAAGTATTGTCGAGTCGGATGAAGTGGTGGTGAGGAGGCTGATGGAGATGCAGCAGCTGCCCCGCCAGCAGCAACAACAGCAGGGGCAGCAGTCGCATTTCGGAGCATCAGAGGATGAGGATCTAGAGCAGAAACAGCAACAACCA(SEQ ID NO:129)MSDRKAVIKNADMSEEMQQDAVDCATQALEKYNIEKDIAAYIKKEFDKKYNPTWHCIVGRNFGSYVTHETRHFIYFYLGQVAILLFKSG


Human homologue of Complete Genome candidate


AAH10744 Similar to RIKEN cDNA 6720463E02 gene

(SEQ ID NO:130)
1 gctgtgaggc gccagtgcgg agcgggcggg cgggcgggcg ggcgggcggc gcgaggcgga
61 gcgcgggcgg ccggcgaaac tccaagggcg gaccgcggca gggagcgatc ggcctcgggc
121 tgcgggagcc ggagaccgcg gcggcggcgg ctgctgcagc tgcaggagga gcccagggaa
181 caccgcccct gcctgtgctc tgcctcgggc catcgctcct ccccagggcc cagtgcggac
241 tcgcctccgt gaagtgtcac accatgtctg accggaaggc agtgatcaag aacgcagaca
301 tgtctgagga catgcaacag gatgccgttg actgcgccac gcaggccatg gagaagtaca
361 atatagagaa ggacattgct gcctatatca agaaggaatt tgacaagaaa tataacccta
421 cctggcattg tatcgtgggc cgaaattttg gcagctacgt cacacacgag acaaagcact
481 tcatctattt ttacttgggt caagttgcaa tcctcctctt caagtcaggc taggtggcca
541 tggtgaaggt gtcagtggcg gcggcagcga tggcaagcag gcggcgttgc tgggactgtt
601 ttgcactgga gccagcatca ggatgtcctc tccaatggct gtgctactgc atggactgta
661 tactcgattt catgtgtatg tcgcagtaaa caaaaccaaa cctcaaaaaa aaaaaaaaaa
721 aaaaaaaaaa aaaaa
(SEQ ID NO:131)
1 msdrkavikn admsedmqqd avdcatqame kyniekdiaa yikkefdkky nptwhcivgr
61 nfgsyvthet khfiyfylgq vaillfksg


Putative function


Dynein light chain, a subunit of the dynein microtubule motor complex
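The Drosophila polypeptide of SEQ ID NO:129 and its human homologue of SEQ ID NO:131 are both 89 residues long, so their similarity can be illustrated without an alignment program. The sketch below is a simplified position-by-position comparison that assumes no gaps are needed; the function names clean and ungapped_identity are hypothetical. It is not a substitute for a proper alignment of sequences that differ in length.

```python
import re


def clean(seq):
    """Remove position labels and spacing, returning upper-case residues."""
    return re.sub(r"[^A-Za-z]", "", seq).upper()


def ungapped_identity(a, b):
    """Percent identity and mismatch list for two equal-length sequences.

    Mismatches are reported as (1-based position, residue in a, residue in b).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    mismatches = [
        (i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y
    ]
    identity = 100.0 * (len(a) - len(mismatches)) / len(a)
    return identity, mismatches


# SEQ ID NO:129 (Drosophila) and SEQ ID NO:131 (human), copied from the listing.
fly = (
    "MSDRKAVIKNADMSEEMQQDAVDCATQALEKYNIEKDIAAYIKKEFDKKY"
    "NPTWHCIVGRNFGSYVTHETRHFIYFYLGQVAILLFKSG"
)
human = clean(
    "1msdrkavikn admsedmqqd avdcatqame kyniekdiaa yikkefdkky nptwhcivgr"
    "61nfgsyvthet khfiyfylgq vaillfksg"
)
identity, mismatches = ungapped_identity(fly, human)
print(f"{identity:.1f}% identity, mismatches: {mismatches}")
```

On these two sequences the comparison reports three differences (positions 16, 29 and 71), that is 86 of 89 residues identical.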


Example 7 (Category 2)

Line ID—bbl-E1


Phenotype—Male sterile. Asynchronous meiotic divisions, cysts with large Nebenkern and 1-2 larger nuclei, testis from 2-3 day old males become smaller. Also has a mitotic phenotype: high mitotic index, colchicine-type overcondensed chromosomes, many anaphases and telophases, no decondensation in telophase.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003431 (4E)


P element insertion site—not determined


Annotated Drosophila genome Complete Genome candidate


CG2984—Pp2C 1 protein phosphatase

(SEQ ID NO:132)TGTTCGCAAGTCGAGAGCAGAATCGAACGGCAAAAAATGCTGGCGAACAACAAATCATCAAGGTAAAACTGCGCGCCTTGGTCATTAAGTCTTTCATCGAGGATAAAAGACCGATGTCTTTTAACGTTATTGCTGTAAGCAAAAGCAGAAATCACAATCTACTCATAAATCCTCGATTTGGTGCAAATTAAAGGAAATTCATCGGTTTTTGGCGGCCAGTTGCAAACACAAAATACTAAATACGCTAGATGGAGCACGCATACACGCAAGCTCGTTGGCGAACGTAAATTACATACATCATATAGATAGTCGTCCCGCTTGCACTGCCCGTCACAGCGAGGGCTGCGAGAGCGAGAGCGGGAGAGAGAAAGGCCTGAGTCGCTTTTTCTTCTTGTACTTTATATATTTTTTATTGTTTTTTTGTGTTGTGTTGCGTTGTACGTGTGTGTGAGAGTGCCAAATGTCAACGGAAATTACAACACTGCGAGACGGAGAAGTCTAAAAGGCAGAAGAAGAAGAAGCAGCAGCAGGCAGCATAAACAAAACTCGGGGGAAAAATGTTGCCCGCCAATAACAGGAGTAGCACCAGCACCCATACCAACACAAATGCCAACACAATCAACGCCACTACCAATACCACCAACAGATGCCTCATCAATACGGCCATCGAAAAAACGGTAGTCCGTTTGCGAGAGACGGCAGCGAATAGCGCACCAGCTCCAGCCACAGCCTCCGTTACTCGCCACGGCGGCAGCAGCAGCGGCAATAACAACAATAACAGTGCATGCCATCCAGCACTGGATGCCAGCAGTGATGTTGTTGTTGTTGAACCGGCAGCGGTAGGAGTCGCACAGGAGGAAGAGGAAGAGCCGGAGCAAAGGCCAGAGAGGATCAGCATACCCATTCCCGACCTGGCGTTCACCGAGATGGAAGCATATGCCGAGGATATAGTCGTCGATATGGAGGGGGGATCACCAGCCAAGCCTTTAAATCCAAAGAAACAACGTTTAAACTCAGCAACAACCACAACAATAAATCGCTCGAGGGGCGGCGGAGCGGCACAGAGTCGATTACGCCGGTCGGCGGCCATCGTTCCACCGCGATCGATTCCAGAGAGCTGTGCCAGCAGCAGCAATTCCAATTCGAGCAGCAGTTCCAACAGTAATTCCAGTTCCAGCTCCGCTACAGGAAGTAGCGCATCCACCGGCAATCCGTCGCCGTGCTCCTCCCTGGGCGTCAATATGCGCGTAACTGGACAATGCTGCCAGGGAGGCCGGAAATACATGGAGGATCAGTTCTCGGTGGCCTACCAGGAATCACCGATCACCCACGAACTGGAATACGCATTTTTTGGCATCTACGACGGACACGGCGGTCCCGAGGCCGCGCTCTTCGCCAAGGAGCACCTTATGCTCGAGATCGTCAAGCAGAAGCAGTTCTGGTCTGATCAGGATGAGGATGTCCTGCGGGCAATACGCGAGGGATACATCGCCACACATTTCGCCATGTGGCGGGAACAAGAGAAATGGCCACGCACTGCCAATGGGCATCTGAGCACCGCCGGCACCACCGCCACAGTGGCCTTTATGCGTCGCGAGAAGATCTACATTGGTCATGTGGGTGATTCTGGGATCGTTTTGGGTTACCAGAACAAGGGCGAACGCAACTGGCGTGCTCGTCCACTGACCACGGACCACAAGCCGGAGTCACTGGCAGAGAAGACGAGAATCCAGCGTTCCGGCGGCAATGTTGCCATCAAATCGGGAGTTCCGCGAGTGGTATGGAACCGACCCAGGGACCCAATGCATCGCGGTCCCATTCGCCGCAGAACTCTGGTAGATGAAATACCCTTTTTGGCGGTGGCTCGTTCCCTGGGCGATCTCTGGAGCTACAATTCCCGCTTCAAGGAATTCGTTGTGAGTCCCGATCCGGATGTCAAAGTGGTTAAAATAAATCCCAGTACCTTTAGATGCTTAATTTTCGGC
ACCGATGGCCTGTGGAATGTGGTGACCGCCCAGGAGGCGGTGGACAGTGTGCGCAAGGAGCATCTAATCGGCGAGATACTCAACGAGCAGGACGTTATGAATCCCAGCAAGGCGCTGGTGGATCAGGCCCTCAAAACCTGGGCCGCCAAGAAGATGCGTGCGGACAACACGTCCGTTGTGACTGTGATACTAACACCAGCGGCCCGCATTAATTCGCCCACAACGCCAACACGTTCCCCATCCGCGATGGCACGCGACAATGATCTGGAGGTGGAGCTACTGCTGGAGGAGGACGACGAGGAGCTGCCGACACTGGATGTGGAGAACAACTACCCTGACTTTCTCATCGAGGAGCATGAGTATGTGCTGGACCAGCCGTACAGTGCATTGGCCAAGCGACATTCGCCTCCGGAAGCCTTCCGCAACTTCGACTACTTCGATGTGGACGAGGACGAGTTGGATGAAGATGAGGAAACAGTGGAAGAAGACGAGGAGGAGGAGGAGGAAGAGGAGGAAACCAAATCGGTGGGAATTCTACAGCAAAGTTTGTTCAACCCCAGAAAAACGTGGCGCAAGTCAACCATCAACAATTCCTGGAGTGGCGTCACCGAACCGGAACCGGAACCCGATCCCGAACCAGATCGAATAGATGTCTTAACACTGGACATGTACTCCCACACCAGCATTGACAAGGGCACCAATTATGGCGGCAGCATAGCCCAGTCCTCAATAGATCCTGCGGAGACGGCTGAAAATCGTGAGCTGAGTGAGTTGGAGCAGCATCTGGAGAGTAGCTACAGTTTCGCCGAGTCGTACAACTCCCTGTTAAACGAGCAGGAGGAGCAGGAGGCACGCTCACGTTCAGCAGCAGCAGCAGCCGCCGCCGCAGAAGCAGCAGCAGTAGAAGCACAACAAACCACTGCCCATTCCGCATCCGTTGTGCTGGACCGCAGCATGTTGGAGATCATCCAGGAGCAGCAGCACTATCAGCAGCAAGAGGGCTATTCGCTAACGCAACTAGAGACCAGACGTGAAAGGGAGCGGCTGACCGAATCGTGGCCACAGCAGCCGGCTGAGCTGCTCGAGCTGGATGCTCTACTGCAGCAGGAGCGTGCCGAGGAGGAGCAGGTAGCCCTGGAGCAGCAGCAGCAGCGCGAACAGCAAATGGAGCAAATGGAGGTGGAGGCCATTAGTAGTTCGGGACAGCACGAATTTGCTTACCCAGTGACCACCGCCACAGCCAGCGAGTGGTGTGCTACATTACAAGAAGACGAGGAGGAGTTGGACTCCACAGTAATAGACATAGTAATTCAACCCGAACAAGAGTTGCAGGACAATGAAGTGAGCTCCACGTTGCCCGCCACACCCACTCATGTGGAGCCTGAGCAGATTGTGGACAAGATGGAGCCCCTGAAGGTTCAGGAGATGCTAACCGCGGTCGAAAAACCTCCATCCAAGCAGGAAAAGAAGCTGCCGAAGAAGCAAGAGACCAAACAGGTTGCTGTGCTAGATACAGTGGCCGAGATGCCCAAAGAGGATGCCCATGCCGTGCACTATATATTCCAGCGCATTCAAAAGGTTCAGGACTCTGAGGCAACACCAGTGGCCGTGACGAATTCCACAATGGCTGACGCCCTGCCCACCGAATCTAGTGGACTGGGAGGATCTATGACCGCGCCCCGAATCCGACGCTATCGCAACGTGCCCAACGAGAACCATCAGCACATGCAGACGCGTCGTCGTCAGATCTTCAAGCATGTCAAGCCAAAGTCCTTCATACAGTCCAGTGCTGCGGCGATTGTGGCCTATGGAGACAGCACCGAAACGGTCGGAGGAACAGCCGGAGCATCTGGCACACCTGCAGCTGGGCGTGTAGGCGGGGGCGGTGGCGGCGGCGGCGGCAGAGGATCGGCCAGTGGTGGGAGCAGTCCAGCGGTGGCAGCCAATAGTCGGCGGAGCGTCAATGTGGTGGCCAATGCGAGTGGAAACAGCGCTAGCAA
AGTTGTGCCCAGCAGCAGTTCCATGATGATGACCCGCCGCAGTCACACCTTGACGGCCAGCGGTGGTGTGAACAAAAGGCAGCTGCGCAGCAGTCTCTGCACCTTGGGCCTGGGTGTGGGTGTCGGTGTCGGTCTGGGCATGGACCTGGACATGACCAAGCGCACGCTAAGGACAAGGAATGTACCCGCTTTGTCGGGCGGTTCAGCCACGCCATCTAGCAATTCGTCGCCAGCCAGCGGAGGCAGCAGTCCAGCCGGTTTCACAAGCCCAGCCAGTCCGGTCATCACGTCCAGGGGAAGCGGATCGCGTACTACCGCCTCGCCAGCCAGGCGCCTAAAACGCAGTCATGAGGATCGGGAGCAAAGAATGAGCTTGCGACGGAGCACTCTGAGTGGCAGTGCCAGCGGCAGTGGGCTGGTGGGCACTGGTGGGTCGCCCTCGAATGTGAAATCAAATCGCCTGCAGGCCTGCAATGGAGCCATCTCTGCGCGTCCGCCGCCCTCGCCGAAGAAACTGAATGCAGCCGTGCCCACATTGGCAATTGGAACGCGTGCATATACGGCGGCGTTGGCGGCGGCGGCGGATCACCTGAACAAGCGGTGGTCGTTGCGCAGCAGCAGTGGCAACTCTGGCAATCTGATAACCGCCATCAGTTGCTACAGTGACAGGAGCAGGGCGGCGACTGCGGCGGGATCACCGGGATCTGGAGGCGGGGCAGCGGGACCACCAGGAGCATCTTTGGCCGCATCCACAGTCGGCACGCGAAGGCGCTAGGCTAGATTGTAACGAAACATGCGAGCAACTTGCAAGTACAAATCCTAAGCAACGGAAAATTTTAGATCCTAGTATACTACTTTACTGAAAACGCAAAATTGCATAATTTAACCAATTTTTTTATGTGCACAACACACACAC(SEQ ID NO:133)MLPANNRSSTSTHTNTNANTINATTNTTNRCLINTAIEKTVVRLRETAANSAPAPATASVTRHGGSSSGNNNNNSACHPALDASSDVVVVEPAAVGVAQEEEEEPEQRPERISIPIPDLAFTEMEAYAEDIVVDMEGGSPAKPLNPKKQRLNSATTTTINRSRGGGAAQSRLRRSAAIVPPRSIPESCASSSNSNSSSSSNSNSSSSSATGSSASTGNPSPCSSLGVNMRVTGQCCQGGRKYMEDQFSVAYQESPITHELEYAFFGIYDGHGGPEAALFAKEHLMLEIVKQKQFWSDQDEDVLRAIREGYIATHFAMWREQEKWPRTANGHLSTAGTTATVAFMRREKIYIGHVGDSGIVLGYQNKGERNWRARPLTTDHKPESLAEKTRIQRSGGNVAIKSGVPRVVWNRPRDPMHRGPIRRRTLVDEIPFLAVARSLGDLWSYNSRFKEFVVSPDPDVKVVKINPSTFRCLIFGTDGLWNVVTAQEAVDSVRKEHLIGEILNEQDVMNPSKALVDQALKTWAAKKMRADNTSVVTVILTPAARNNSPTTPTRSPSAMARDNDLEVELLLEEDDEELPTLDVENNYPDFLIEEHEYVLDQPYSALAKRHSPPEAFRNFDYFDVDEDELDEDEETVEEDEEEEEEEEETKSVGILQQSLFNPRKTWRKSTINNSWSGVTEPEPEPDPEPDRIDVLTLDMYSHTSIDKGTNYGGSIAQSSIDPAETAENRELSELEQHLESSYSFAESYNSLLNEQEEQEARSRSAAAAAAAAEAAAVEAQQTTAHSASVVLDRSMLEIIQEQQHYQQQEGYSLTQLETRRERERLTESWPQQPAELLELDALLQQERAEEEQVALEQQQQREQQMEQMEVEAISSSGQHEFAYPVTTATASEWCATLQEDEEELDSTVIDIVIQPEQELQDNEVSSTLPATPTHVEPEQIVDKMEPLKVQEMLTAVEKPPSKQEKKLPKKQETKQVAVLDTVAEMPKEDAHAVHYIFQRIQKVQDSEATPVAVTNSTMADALPTESSGLGGSMTAPRIRRYRNVPNENHQHMQTRRRQIFKHVKPKSFI
QSSAAAIVAYGDSTETVGGTAGASGTPAAGRVGGGGGGGGGRGSASGGSSPAVAANSRRSVNVVANASGNSASKVVPSSSSMMMTRRSHTLTASGGVNKRQLRSSLCTLGLGVGVGVGLGMDLDMTKRTLRTRNVPALSGGSATPSSNSSPASGGSSPAGFTSPASPVITSRGSGSRTTASPARRLKRSHEDREQRMSLRRSTLSGSASGSGLVGTGGSPSNVKSNRLQACNGAISARPPPSPKKLNAAVPTLAIGTRAYTAALAAAADHLNKRWSLRSSSGNSGNLITAISCYSDRSRAATAAGSPGSGGGAAGPPGASLAASTVGTRRR


Human homologue of Complete Genome candidate


AAB61637 Wip1

(SEQ ID NO:134)1ctggctctgc tcgctccggc gctccggccc agctctcgcg gacaagtcca gacatcgcgc61gccccccctt ctccgggtcc gccccctccc ccttctcggc gtcgtcgaag ataaacaata121gttggccggc gagcgcctag tgtgtctccc gccgccggat tcggcgggct gcgtgggacc181ggcgggatcc cggccagccg gccatggcgg ggctgtactc gctgggagtg agcgtcttct241ccgaccaggg cgggaggaag tacatggagg acgttactca aatcgttgtg gagcccgaac301cgacggctga agaaaagccc tcgccgcggc ggtcgctgtc tcagccgttg cctccgcggc361cgtcgccggc cgcccttccc ggcggcgaag tctcggggaa aggcccagcg gtggcagccc421gagaggctcg cgaccctctc ccggacgccg gggcctcgcc ggcacctagc cgctgctgcc481gccgccgttc ctccgtggcc tttttcgccg tgtgcgacgg gcacggcggg cgggaggcgg541cacagtttgc ccgggagcac ttgtggggtt tcatcaagaa gcagaagggt ttcacctcgt601ccgagccggc taaggtttgc gctgccatcc gcaaaggctt tctcgcttgt caccttgcca661tgtggaagaa actggcggaa tggccaaaga ctatgacggg tcttcctagc acatcaggga721caactgccag tgtggtcatc attcggggca tgaagatgta tgtagctcac gtaggtgact781caggggtggt tcttggaatt caggatgacc cgaaggatga ctttgtcaga gctgtggagg841tgacacagga ccataagcca gaacttccca aggaaagaga acgaatcgaa ggacttggtg901ggagtgtaat gaacaagtct ggggtgaatc gtgtagtttg gaaacgacct cgactcactc961acaatggacc tgttagaagg agcacagtta ttgaccagat tccttttctg gcagtagcaa1021gagcacttgg tgatttgtgg agctatgatt tcttcagtgg tgaatttgtg gtgtcacctg1081aaccagacac aagtgtccac actcttgacc ctcagaagca caagtatatt atattgggga1141gtgatggact ttggaatatg attccaccac aagatgccat ctcaatgtgc caggaccaag1201aggagaaaaa atacctgatg ggtgagcatg gacaatcttg tgccaaaatg cttgtgaatc1261gagcattggg ccgctggagg cagcgtatgc tccgagcaga taacactagt gccatagtaa1321tctgcatctc tccagaagtg gacaatcagg gaaactttac caatgaagat gagttatacc1381tgaacctgac tgacagccct tcctataata gtcaagaaac ctgtgtgatg actccttccc1441catgttctac accaccagtc aagtcactgg aggaggatcc atggccaagg gtgaattcta1501aggaccatat acctgccctg gttcgtagca atgccttctc agagaatttt ttagaggttt1561cagctgagat agctcgagag aatgtccaag gtgtagtcat accctcaaaa gatccagaac1621cacttgaaga aaattgcgct aaagccctga ctttaaggat acatgattct ttgaataata1681gccttccaat tggccttgtg cctactaatt caacaaacac tgtcatggac 
caaaaaaatt1741tgaagatgtc aactcctggc caaatgaaag cccaagaaat tgaaagaacc cctccaacaa1801actttaaaag gacattagaa gagtccaatt ctggccccct gatgaagaag catagacgaa1861atggcttaag tcgaagtagt ggtgctcagc ctgcaagtct ccccacaacc tcacagcgaa1921agaactctgt taaactcacc atgcgacgca gacttagggg ccagaagaaa attggaaatc1981ctttacttca tcaacacagg aaaactgttt gtgtttgctg aaatgcatct gggaaatgag2041gtttttccaa acttaggata taagagggct ttttaaattt ggtgccgatg ttgaactttt2101tttaagggga gaaaattaaa agaaatatac agtttgactt tttggaattc agcagtttta2161tcctggcctt gtacttgctt gtattgtaaa tgtggatttt gtagatgtta gggtataagt2221tgctgtaaaa tttgtgtaaa tttgtatcca cacaaattca gtctctgaat acacagtatt2281cagagtctct gatacacagt aattgtgaca atagggctaa atgtttaaag aaatcaaaag2341aatctattag attttagaaa aacatttaaa ctttttaaaa tacttattaa aaaatttgta2401taagccactt gtcttgaaaa ctgtgcaact ttttaaagta aattattaag cagactggaa2461aagtgatgta ttttcatagt gacctgtgtt tcacttaatg tttcttagag ccaagtgtct2521tttaaacatt attttttatt tctgatttca taattcagaa ctaaattttt catagaagtg2581ttgagccatg ctacagttag tcttgtccca attaaaatac tatgcagtat ctcttacatc2641agtagcattt ttctaaaacc ttagtcatca gatatgctta ctaaatcttc agcatagaag2701gaagtgtgtt tgcctaaaac aatctaaaac aattcccttc tttttcatcc cagaccaatg2761gcattattag gtcttaaagt agttactccc ttctcgtgtt tgcttaaaat atgtgaagtt2821ttccttgcta tttcaataac agatggtgct gctaattccc aacatttctt aaattatttt2881atatcataca gttttcattg attatatggg tatatattca tctaataaat cagtgaactg2941ttcctcatgt tgctgaaaaa aaaaaaaaaa aaa(SEQ ID NO:135)1maglyslgvs vfsdqggrky medvtqivve peptaeekps prrslsqplp prpspaalpg61gevsgkgpav aareardplp dagaspapsr ccrrrssvaf favcdghggr eaaqfarehl121wgfikkqkgf tssepakvca airkgflach lamwkklaew pktmtglpst sgttasvvii181rgmkmyvahv gdsgvvlgiq ddpkddfvra vevtqdhkpe lpkererieg lggsvmnksg241vnrvvwkrpr lthngpvrrs tvidqipfla varalgdlws ydffsgefvv spepdtsvht301ldpqkhkyii lgsdglwnmi ppqdaismcq dqeekkylmg ehgqscakml vnralgrwrq361rmlradntsa ivicispevd nqgnftnede lylnltdsps ynsqetcvmt pspcstppvk421sleedpwprv nskdhipalv rsnafsenfl evsaeiaren vqgvvipskd pepleencak481altlrihdsl 
nnslpiglvp tnstntvmdq knlkmstpgq mkaqeiertp ptnfkrtlee541snsgplmkkh rrnglsrssg aqpaslptts qrknsvkltm rrrlrgqkki gnpllhqhrk601tvcvc


Putative function


Protein phosphatase with p53-dependent expression, so may be inhibitory to cell division
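The Drosophila and human phosphatase sequences of this Example differ considerably in length, but short stretches are shared exactly. As a rough illustration, and assuming that an exact shared substring is an acceptable first proxy for a conserved stretch, the sketch below finds the longest run of identical residues in two excerpts taken from SEQ ID NO:133 and SEQ ID NO:135. The function name longest_common_substring is illustrative; a real comparison would use a gapped alignment with a substitution matrix.

```python
def longest_common_substring(a, b):
    """Return the longest run of characters found contiguously in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# 35-residue excerpts copied from SEQ ID NO:133 (Drosophila) and
# SEQ ID NO:135 (human); upper-cased, position labels removed.
fly_excerpt = "TLVDEIPFLAVARSLGDLWSYNSRFKEFVVSPDPD"
human_excerpt = "TVIDQIPFLAVARALGDLWSYDFFSGEFVVSPEPD"
shared = longest_common_substring(fly_excerpt, human_excerpt)
print(shared, len(shared))
```

For these excerpts the longest shared run is the eight-residue stretch IPFLAVAR, followed closely by the shorter shared runs LGDLWSY and EFVVSP.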


Example 8 (Category 2)

Line ID—ms(1)04


Phenotype—Cytokinesis defect, small testis, no meiosis observed, variable-sized Nebenkerns with 2-4N nuclei


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003442 (7C-D)


P element insertion site—not determined


Annotated Drosophila genome Complete Genome candidate


CG1524—RpS14A ribosomal protein (2 splice variants)

(SEQ ID NO:136)GATATCCGGTTAACGCAAGTGTTGCTGATCGACAAACAAACCCAGAATGGCACCCAGGAAGGCTAAAGTTCAGAAGGAGGAGGTTCAGGTCCAGCTGGGACCCCAAGTTCGCGACGGCGAGATCGTGTTCGGAGTGGCTCACATCTACGCCAGCTTCAACGACACCTTCGTCCATGTCACTGATCTGTCCGGCCGTGAGACCATCGCTCGTGTCACCGGAGGCATGAAGGTGAAGGCCGATCGTGATGAGGCTTCGCCCTACGCCGCTATGTTGGCCGCTCAGGATGTGGCTGAGAAGTGCAAGACACTGGGCATTACTGCCCTGCATATTAAGCTGCGTGCCACCGGCGGCAACAAGACCAAGACCCCCGGACCCGGCGCCCAGTCCGCTCTGCGTGCTTTGGCCCGTTCGTCCATGAAGATTGGCCGCATCGAGGATGTGACGCCCATCCCATCGGACTCCACCCGCAGGAAGGGCGGTCGCCGTGGTCGTCGTCTGTAGATGGCAGTATCTGGAAAGCAGTAGTCTATGTTTGCGGTCGAAATACAATACTGC(SEQ ID NO:137)MAPRKAKVQKEEVQVQLGPQVRDGEIVFGVAHIYASFNDTFVHVTDLSGRETIARVTGGMKVKADRDEASPYAAMLAAQDVAEKCKTLGITALHIKLRATGGNKTKTPGPGAQSALRALARSSMKIGRIEDVTPIPSDSTRRKGGRRGRRL(SEQ ID NO:138)CAAGTGGTTCGTCTTTAATTTTTCCCTCTTAATTTTTGCGAAAAAAAACCCGACTTTGAGCCCCTAAACTTAAAAAATGTGCCTTCCTCCAGAGTGTTCAGAGCGTCGACTGAAAATGACAAACAAGCTGCCCGGCAGCTAATTTTTTTTTACATTTTTTGTTTTGTTTGTTCGCACGCATTTGTTTTTATTTGTGAAACACGTGGTATAAATGTGGAAATTCCCTTGCTATTCCCGCAGTTGCTGATCGACAAACAAACCCAGAATGGCACCCAGGAAGGCTAAAGTTCAGAAGGAGGAGGTTCAGGTCCAGCTGGGACCCCAAGTTCGCGACGGCGAGATCGTGTTCGGAGTGGCTCACATCTACGCCAGCTTCAACGACACCTTCGTCCATGTCACTGATCTGTCCGGCCGTGAGACCATCGCTCGTGTCACCGGAGGCATGAAGGTGAAGGCCGATCGTGATGAGGCTTCGCCCTACGCCGCTATGTTGGCCGCTCAGGATGTGGCTGAGAAGTGCAAGACACTGGGCATTACTGCCCTGCATATTAAGCTGCGTGCCACCGGCGGCAACAAGACCAAGACCCCCGGACCCGGCGCCCAGTCCGCTCTGCGTGCTTTGGCCCGTTCGTCCATGAAGATTGGCCGCATCGAGGATGTGACGCCCATCCCATCGGACTCCACCCGCAGGAAGGGCGGTCGCCGTGGTCGTCGTCTGTAGATGGCAGTATCTGGAAAGCAGTAGTCTATGTTTGCGGTCGAAATACAATACTGC(SEQ ID NO:139)MAPRKAKVQKEEVQVQLGPQVRDGEIVFGVAHIYASFNDTFVHVTDLSGRETIARVTGGMKVKADRDEASPYAAMLAAQDVAEKCKTLGITALHIKLRATGGNKTKTPGPGAQSALRALARSSMKIGRIEDVTPIPSDSTRRKGGRRGRRL


Human homologue of Complete Genome candidate


A25220 ribosomal protein S14, cytosolic

(SEQ ID NO:140)1ctccgccctc tcccactctc tctttccggt gtggagtctg gagacgacgt gcagaaatgg61cacctcgaaa ggggaaggaa aagaaggaag aacaggtcat cagcctcgga cctcaggtgg121ctgaaggaga gaatgtattt ggtgtctgcc atatctttgc atccttcaat gacacttttg181tccatgtcac tgatctttct ggcaaggaaa ccatctgccg tgtgactggt gggatgaagg241taaaggcaga ccgagatgaa tcctcaccat atgctgctat gttggctgcc caggatgtgg301cccagaggtg caaggagctg ggtatcaccg ccctacacat caaactccgg gccacaggag361gaaataggac caagacccct ggacctgggg cccagtcggc cctcagagcc cttgcccgct421cgggtatgaa gatcgggcgg attgaggatg tcacccccat cccctctgac agcactcgca481ggaagggggg tcgccgtggt cgccgtctgt gaacaagatt cctcaaaata ttttctgtta541ataaattgcc ttcatgtaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa(SEQ ID NO:141)1maprkgkekk eeqvislgpq vaegenvfgv chifasfndt fvhvtdlsgk eticrvtggm61kvkadrdess pyaamlaaqd vaqrckelgi talhiklrat ggnrtktpgp gaqsalrala121rsgmkigrie dvtpipsdst rrkggrrgrr l


Putative function


Ribosomal protein


Example 9 (Category 2)

Line ID—thb-a


Phenotype—Male sterile. Cytokinesis defect, larger Nebenkerns with 2-4N nuclei


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—(10B1-2)


P element insertion site—not determined


Annotated Drosophila genome Complete Genome candidate


2 candidates:


CG1453—kinesin-like protein KIF2 homolog

(SEQ ID NO:142)AAACTAAAAAATTGTGTTGCTGACATCTGGTCGCTTGCAAAACTATTTCTAGCAGATTTTGTGATATTTCGTTGTGATCGGTCGATAAATCCGCCAGTTTTTTTTTTAATGGAAAGTGCTAACACATTGTAGCGGTTGGGAAGATAGCAGGAAAGAGCCAGCGGGCTGCCGTTTTTCCTTTTTGTTATCCGTTGCCAGACGCAACGAAAACGACAGTTGGCATTTGAATTCAGCACAAACACACATACTAACGCCGACCCGCAAGCAGCACACACACACACACTGGGACACTCGAAAAAAAAAAAACAGACGCTGTCGGCGACCTCGACAAGCAGTTGGGTTCGATTTAGTTGTCAATGCCTTGAATTCGGTTCGGGGCTTAGTTTCCACAAGTTTATCGCTCGTCAAGAAACAACGAAATAAAATTATTTTCGACCTAAAAAATCTGACTAAATTGTGTTTTTTGTTTATGTATTTATTTAGGCACATTTTGCACACCACAACGTAGTTACTACATCTACGACTAACGGAACTCCTCCTGCAAGCAGTGGAAGTTGCTGTCCATCAAGCAGTACTCGGAGTTAACGCAGGATAAGCCGGGAGAAAGAGAAAGAGATCGGTGGAGAATAGAGATATACAGGTGGAGTCAAAGAGGAAGGATCATGGACATGATTACGGTGGGGCAGAGCGTCAAGATCAAGCGGACGGATGGCCGCGTCCACATGGCCGTGGTGGCGGTGATCAACCAGTCGGGCAAGTGCATCACAGTCGAATGGTACGAGCGCGGCGAAACGAAGGGCAAGGAGGTAGAACTGGACGCCATACTCACGCTCAATCCGGAGCTAATGCAAGATACTGTCGAACAGCACGCCGCCCCGGAGCCCAAGAAACAAGCCACCGCGCCGATGAACCTCTCGCGTAATCCCACACAATCGGCTATCGGTGGCAATCTCACCAGCCGTATGACCATGGCCGGAAACATGCTGAACAAGATCCAGGAAAGCCAGTCGATTCCCAATCCGATTGTCAGCAGCAATAGCGTGAATACAAACAGCAACTCCAACACTACGGCCGGCGGAGGTGGTGGCACCACAACGTCGACGACCACTGGATTACAGCGTCCACGGTACTCGCAAGCTGCTACCGGCCAGCAGCAGACAAGGATCGCCTCGGCGGTGCCTAATAACACATTGCCCAATCCCAGCGCGGCAGCCAGTGCTGGTCCGGCGGCACAAGGAGTCGCCACTGCGGCCACAACCCAGGGAGCTGGCGGCGCTAGTACCCGGCGATCGCACGCATTGAAAGAGGTGGAGCGACTGAAGGAGAATCGCGAGAAGCGACGCGCCCGACAGGCCGAGATGAAGGAGGAGAAGGTGGCGCTGATGAACCAGGATCCGGGCAATCCAAACTGGGAGACGGCGCAAATGATACGCGAATATCAGAGCACGCTGGAATTTGTGCCGCTGCTCGATGGCCAGGCCGTCGATGACCATCAGATCACAGTGTGCGTGCGCAAGCGTCCCATTAGCCGCAAGGAGGTCAATCGCAAGGAGATCGATGTCATTTCGGTGCCGCGCAAGGACATGCTCATCGTGCACGAGCCGCGCAGCAAGGTCGACCTCACCAAGTTCCTGGAGAACCACAAGTTTCGCTTCGACTACGCCTTCAACGACACGTGCGACAATGCCATGGTATACAAATACACAGCCAAGCCGTTGGTGAAAACCATTTTCGAGGGCGGAATGGCGACGTGCTTCGCCTACGGCCAGACGGGATCGGGCAAAACGCACACCATGGGCGGTGAGTTTAATGGAAAGGTGCAGGACTGCAAGAACGGCATCTACGCCATGGCGGCCAAGGATGTCTTTGTGACCCTGAATATGCCGCGTTACCGCGCCATGAATCTAGTCGTCTCGGCCAGTTTCTTTGAGATTTACAGTGGCAAGGTCTTCGATCTTCTGTCCGACAAGCAGAAACTGCGCGTC
CTGGAGGATGGTAAACAGCAAGTGCAGGTGGTGGGACTCACCGAGAAGGTGGTCGATGGCGTCGAGGAGGTACTGAAGCTCATCCAGCACGGCAATGCTGCCCGAACATCCGGCCAGACGTCGGCCAACTCCAATTCGTCGCGTTCGCACGCCGTTTTCCAGATTGTGCTGCGGCCGCAGGGCTCGACGAAGATCCATGGCAAGTTCTCGTTCATCGATCTGGCGGGCAATGAGCGGGGCGTGGACACTTCCTCGGCCGATCGGCAGACGCGTATGGAGGGTGCCGAGATTAACAAATCGCTGCTGGCCCTCAAGGAGTGCATTCGTGCGTTGGGCAAACAGTCGGCCCACTTGCCCTTCCGTGTCTCCAAACTCACCCAGGTGCTGCGCGACTCGTTCATTGGCGAGAAGAGCAAGACGTGCATGATAGCCATGATCTCGCCGGGACTTAGCTCCTGCGAGCACACGCTCAACACGCTGCGCTATGCGGATCGTGTCAAGGAGCTGGTGGTCAAGGATATCGTCGAAGTTTGCCCTGGCGGCGACACCGAGCCCATCGAGATCACGGACGACGAGGAGGAGGAGGAGCTCAACATGGTGCATCCGCACTCGCATCAGCTGCATCCCAATTCGCATGCACCGGCCAGCCAGTCGAATAATCAGCGTGCTCCGGCCTCTCATCACTCGGGGGCGGTCATTCACAACAATAATAATAACAACAACAAGAACGGAAACGCCGGCAACATGGACCTGGCCATGCTGAGTTCGCTGAGCGAACACGAGATGTCCGACGAGCTGATTGTGCAGCACCAGGCCATCGACGACCTGCAGCAGACGGAGGAGATGGTGGTGGAGTATCATCGCACCGTTAATGCCACACTGGAGACCTTCCTCGCCGAGTCGAAGGCGCTGTACAATCTGACCAACTATGTGGACTACGACCAGGACTCGTACTGCAAACGGGGCGAGTCGATGTTCTCGCAGCTGCTGGACATCGCCATCCAGTGCCGCGACATGATGGCCGAATATCGCGCCAAGTTGGCCAAGGAGGAGATGCTGTCGTGCAGCTTCAATTCGCCGAATGGCAAGCGTTAGT(SEQ ID NO:143)1mitvgqsvki krtdgrvhma vvavinqsgk citvewyerg etkgkeveld ailtlnpelm61qdtveqhaap epkkqatapm nlsrnptqsa iggnltsrmt magnmlnkiq esqsipnpiv121ssnsvntnsn snttaggggg tttstttglq rprysqaatg qqqtriasav pnntlpnpsa181aasagpaaqg vataattqga ggastrrsha lkeverlken rekrrarqae mkeekvalmn241qdpgnpnwet aqmireyqst lefvplldgq avddhqitvc vrkrpisrke vnrkeidvis301vprkdmlivh eprskvdltk flenhkfrfd yafndtcdna mvykytakpl vktifeggma361tcfaygqtgs gkthtmggef ngkvqdckng iyamaakdvf vtlnmpryra mnlvvsasff421eiysgkvfdl lsdkqklrvl edgkqqvqvv gltekvvdgv eevlkliqhg naartsgqts481ansnssrsha vfqivlrpqg stkihgkfsf idlagnergv dtssadrqtr megaeinksl541lalkeciral gkqsahlpfr vskltqvlrd sfigeksktc miamispgls scehtlntlr601yadrvkelvv kdivevcpgg dtepieitdd eeeeelnmvh phshqlhpns hapasqsnnq661rapashhsga vihnnnnnnn kngnagnmdl amlsslsehe msdelivqhq aiddlqqtee721mvveyhrtvn atletflaes kalynltnyv dydqdsyckr gesmfsqlld 
iaiqcrdmma781eyraklakee mlscsfnspn gkrCG18292 - novel(SEQ ID NO:144)CGTAATAACGCCTCCTGATATCGATATCGATATCATATCACAAAAAACAATAAACCAAAAAAGAAACGCTAAAAACTAGTAGTTTTGTGTGCCAGGAAAACGGAAAGGTGGACATAGTTAAGTTACCACAACAACCGACGGATATCGACTCCAGACACCACATCGCCCAGCGCCACCATGGACATCATGGATATCCAGGCCGTAGAGTCCAAGCTGAGTGACGTCACGGTGACACCGATACCGCGCAGCCAAGTGCAGAATTTCTACAATTACCAGCAGCAGCGGGAGCAGCGCGAGCAGCAGCCCCAAATCCAGATATCGGCCATCCACCACTCGCGTGGATCCGTTGGCGGAGGAGGCGGATCCAACTCATCCAACGCTGCCACCGACTACTCCACGAGCAGCGGTGGCAAGCGGGAGCGGGACCGCTCCTCCGCCAGCGACTACAGCAGCTCGTCCAGCAAGCAGAGCTCCGCTGCAGCGGCCAATGCAGCAGCAGCTGCCGCCGCCGTCGCTGCCCTCCAATACTCCCCGCAGTTCCTCCAGGCCCAGCTGGCGCTACTCCAGCAGCAGTCGAACACGACGGCCACGCCGGCAGCCGTCGCCGCTGCGGCCCTCTCGCTGGCCAACATGTGCTCCAGCAATGGTGGTCAGCGGAATTCCGGTGCCGGCGTTTCCTCCACCTCCTCTGGCAGCAATGGCCAGAGCATGGGCCTGAATCTGAGCTCATCGCAGCTAAAGTACCCGCCACCCTCCACCTCGCCCGTGGTGGTGACCACCCAAACTTCGGCCAATATCACCACGCCGCTGACCTCCACGGCCAGCCTGCCCTCAGTGGGCCCGGGCAATGGGCTGACCAAGTACGCCCAGCTGCTGGCCGTCATTGAGGAGATGGGCCGCGATATCCGGCCCACGTACACGGGCTCGCGCAGCTCCACGGAGCGTCTCAAGCGGGGCATTGTCCATGCCCGCATCCTGGTGCGCGAATGCCTCATGGAAACGGAGCGTGCGGCGCGCCAATGA(SEQ ID NO:145)1mdiqaveskl sdvtvtpipr sqvqnfynyq qqreqreqqp qiqisaihhs rgsvgggggs61nssnaatdys tssggkrerd rssasdysss sskqssaaaa naaaaaaava alqyspqflq121aqlallqqqs nttatpaava aaalslanmc ssnggqrnsg agvsstssgs ngqsmglnls181ssqlkyppps tspvvvttqt sanittplts taslpsvgpg ngltkyaqll avieemgrdi241rptytgsrss terlkrgivh arilvreclm eteraarq


Human homologue of Complete Genome candidate


(CG1453)—CAA69621—kinesin-2

(SEQ ID NO:146)1ggccgaatac atcaagcaat ggtaacatct ttaaatgaag ataatgaaag tgtaactgtt61gaatggatag aaaatggaga tacaaaaggc aaagagattg acctggagag catcttttca121cttaaccctg accttgttcc tgatgaagaa attgaaccca gtccagaaac acctccacct181ccagcatcct cagccaaagt aaacaaaatt gtaaagaatc gacggactgt agcttctatt241aagaatgacc ctccttcaag agataataga gtggttggtt cagcacgtgc acggcccagt301caatttcctg aacagtcttc ctctgcacaa cagaatggta gtgtttcaga tatatctcca361gttcaagctg caaaaaagga atttggaccc ccttcacgta gaaaatctaa ttgtgtgaaa421gaagtagaaa aactgcaaga aaaacgagag aaaaggagat tgcaacagca agaacttaga481gaaaaaagag cccaggacgt tgatgctaca aacccaaatt atgaaattat gtgtatgatc541agagacttta gaggaagttt ggattataga ccattaacaa cagcagatcc tattgatgaa601cataggatat gtgtgtgtgt aagaaaacga ccactcaata aaaaagaaac tcaaatgaaa661gatcttgatg taatcacaat tcctagtaaa gatgttgtga tggtacatga accaaaacaa721aaagtagatt taacaaggta cctagaaaac caaacatttc gttttgatta tgcctttgat781gactcagctc ctaatgaaat ggtttacagg tttactgcta aaccactagt ggaaactata841tttgaaaggg gaatggctac atgctttgct tatgggcaga ctggaagtgg aaaaactcat901actatgggtg gtgacttttc aggaaagaac caagattgtt ctaaaggaat ttatgcatta961gcagctcgag atgtcttttt aatgctaaag aagccaaact ataagaagct agaacttcaa1021gtatatgcaa ccttctttga aatttatagt ggaaaggtgt ttgacttgct aaacaggaaa1081acaaaattaa gagttctaga agatggaaaa cagcaggttc aagtggtggg attacaggaa1141cgggaggtca aatgtgttga agatgtactg aaactcattg acataggcaa cagttgcaga1201acatccggtc aaacatctgc aaatgcacat tcatctcgga gccatgcagt gtttcagatt1261attcttagaa ggaaaggaaa actacatggc aaattttctc tcattgattt ggctggaaat1321gaaagaggag ctgatacttc cagtgcggac aggcaaacta ggcttgaagg tgctgaaatt1381aataaaagcc ttttagcact caaggagtgc atcagagcct taggtagaaa taaacctcat1441actcctttcc gtgcaagtaa actcactcag gtgttaagag attctttcat aggtgaaaac1501tctcgtacct gcatgattgc cacaatctct ccaggaatgg catcctgtga aaatactctt1561aatacattaa gatatgcaaa tagggtcaaa gaattgactg tagatccaac tgctgctggt1621gatgttcgtc caataatgca ccatccacca aaccagattg atgacttaga gacacagtgg1681ggtgtgggga gttcccctca gagagatgat ctaaaacttc tttgtgaaca 
aaatgaagaa1741gaagtctctc cacagttgtt tactttccac gaagctgttt cacaaatggt agaaatggaa1801gaacaagttg tagaagatca cagggcagtg ttccaggaat ctattcggtg gttagaagat1861gaaaaggccc tcttagagat gactgaagaa gtagattatg atgtcgattc atatgctaca1921caacttgaag ctattcttga gcaaaaaata gacattttaa ctgaactgcg ggataaagtg1981aaatctttcc gtgcagctct acaagaggag gaacaagcca gcaagcaaat caacccgaag2041agaccccgtg ccctttaaac cggcatttgc tgctaaagga tacccagaac cctcactact2101gtaacataca acggttcagc tgtaagggcc atttgaaagt ttggaatttt aagtgtctgt2161ggaaaatgtt ttgtccttca cctgaattac atttcaattt tgtgaaacac tcttttgtct2221acaaaatgct tctagtccag gaggcacaac caagaactgg gattaatgaa gcattttgtt2281tcatttacac aaatagtgat ttacttttgg agatccttgt cagttttatt ttctatttga2341tgaagtaaga ctgtggactc aatccagagc cagatagtag gggaagccac agcatttcct2401tttaactcag ttcaattttt gtagtgagac tgagcagttt taaatccttt gcgtgcatgc2461atacctcatc agtgattgta cataccttgc ccactcctag agacagctgt gctcactttt2521cctgctttgt gccttgatta aggctactga ccctaaattt ctgaagcaca gccaagaaaa2581attacattcc ttgtcattgt aaattacctt tgtgtgtaca tttttactgt atttgagaca2641ttttttgtgt gtgactagtt aattttgcag gatgtgccat atcattgaac ggaactaaag2701tctgtgacag tggatatagc tgctggacca ttccatctta tatgtaaaga aatctggaat2761tattatttta aaaccatata acatgtgatt ataatttttc ttagcatttt ctttgtaaag2821aactacaata taaactagtt ggtgtataat aaaaagtaat gaaattctga agaaaaaaaa2881aaaaaaaaaa aaaaaaaaaa aaaaa(SEQ ID NO:147)1mvtslnedne svtvewieng dtkgkeidle sifslnpdlv pdeeiepspe tppppassak61vnkivknrrt vasikndpps rdnrvvgsar arpsqfpeqs ssaqqngsvs dispvqaakk121efgppsrrks ncvkeveklq ekrekrrlqq qelrekraqd vdatnpnyei mcmirdfrgs181ldyrplttad pidehricvc vrkrplnkke tqmkdldvit ipskdvvmvh epkqkvdltr241ylenqtfrfd yafddsapne mvyrftakpl vetifergma tcfaygqtgs gkthtmggdf301sgknqdcskg iyalaardvf lmlkkpnykk lelqvyatff eiysgkvfdl lnrktklrvl361edgkqqvqvv glqerevkcv edvlklidig nscrtsgqts anahssrsha vfqiilrrkg421klhgkfslid lagnergadt ssadrqtrle gaeinkslla lkeciralgr nkphtpfras481kltqvlrdsf igensrtcmi atispgmasc entlntlrya nrvkeltvdp taagdvrpim541hhppnqiddl 
etqwgvgssp qrddlkllce qneeevspql ftfheavsqm vemeeqvved601hravfqesir wledekalle mteevdydvd syatqleail eqkidiltel rdkvksfraa661lqeeeqaskq inpkrpral(CG18292) - BAA22937 - cdk2-associated protein 1; cdk2ap1, deleted inoral cancer 1 (doc-1, alias DORC1)(SEQ ID NO:148)1accgcccggc ctcgccgccg ccgccgccgc cctcgcggcc tggccccgcc gcgcccggcg61cgcccgccgc ccggggggat gtcttacaaa ccgaacttgg ccgcgcacat gcccgccgcc121gccctcaacg ccgctgggag tgtccactcg ccttccacca gcatggcaac gtcttcacag181taccgccagc tgctcagtga ctacgggcca ccgtccctag gctacaccca gggaactggg241aacagccagg tgccccaaag caaatacgcg gagctgctgg ccatcattga agagctgggg301aaggagatca gacccacgta cgcagggagc aagagtgcca tggagaggct gaagcgcggc361atcattcacg ctagaggact ggttcgggag tgcttggcag aaacggaacg gaatgccaga421tcctagctgc cttgttggtt ttgaaggatt tccatctttt tacaagatga gaagttacag481ttcatctccc ctgttcagat gaaacccttg ttttcaaaat ggttacagtt tcgtttttcc541tcccatggtt cacttggctc tgaacctaca gtctcaaaga ttgagaaaag attttgcagt601taattaggat ttgcatttta agtagttagg aactgcccag gttttttttg ttttttaagc661attgatttaa aagatgcacg gaaagttatc ttacagcaaa ctgtagtttg cctccaagac721accattgtct ccctttaatc ttctcttttg tatacatttg ttacccatgg tgttctttgt781tccttttcat aagctaatac cactgtaggg attttgtttt gaacgcatat tgacagcacg841ctttacttag tagccggttc ccatttgcca tacaatgtag gttctgctta atgtaacttc901ttttttgctt aagcatttgc atgactatta gtgcttcaaa gtcaattttt aaaaatgcac961aagttataaa tacagaagaa agagcaaccc accaaaccta acaaggaccc ccgaacactt1021tcatactaag actgtaagta gatctcagtt ctgcgtttat tgtaagttga taaaaacatc1081tgggaggaaa tgactaaaac tgtttgcatc tttgtatgta tttattactt gatgtaataa1141agcttatttt cattaacc(SEQ ID NO:149)1msykpnlaah mpaaalnaag svhspstsma tssqyrqlls dygppslgyt qgtgnsqvpq61skyaellaii eelgkeirpt yagsksamer lkrgiiharg lvreclaete rnars


Putative function


(CG1453)—Motor protein


(CG18292)—Cdk2-associated, candidate tumour suppressor


Example 9A (Category 2)

Line ID—ms(1) 13


Phenotype—Male sterile. Cytokinesis defect: variable-sized Nebenkerns with 4N nuclei, some nuclei detached from Nebenkern


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003436 (5D1)


P element insertion site sequence

(SEQ ID NO:150)CATCATGTATCATACATTGAAGACGGATTAGCACCGTCGACCACGAAAAAAGAACGCAAGGAAATCGTGCAAAATGTTCAAAAAGTACGTATGGCATGAGTTAGATGGGGACATCAGACTAACCATAGCAATTCGATCTGTGCAGATTCGAAGAGAAGGACAGCATTTCCAGCATTCAGCAGCTGAAGTCGTCTGTGCAGAAGGGCATACGTGCCAAGTTGCTGGAGGCCTATCCCAAGTTGGAGAGTCACATCGACCTGATCCTGCCCAAGAAGGACTCGTACCGCATCGCCAAGTGGTAGGATGGCTCAGTTCTTGCCACAGCACATAACTCCATTCATATTCCCGATCCCTACTCCTCCACCAGCCATGACCACATCGAACTGCTGCTAAACGGAGCCGGCGACCAGGTGTTCTTTCGCCACCGCGATGGCCCCTGGATGCCTACCCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGNCACGACGTTGNAAAACGACGGNCANNGCCAAGCTCTGCTGCT


Annotated Drosophila genome Complete Genome candidate—CG5941—novel protein with a PUA domain

(SEQ ID NO:151)CGGATTAGCACCGTCGACCACGAAAAAAGAACGCAAGGAAATCGTGCAAAATGTTCAAAAAATTCGAAGAGAAGGACAGCATTTCCAGCATTCAGCAGCTGAAGTCGTCTGTGCAGAAGGGCATACGTGCCAAGTTGCTGGAGGCCTATCCCAAGTTGGAGAGTCACATCGACCTGATCCTGCCCAAGAAGGACTCGTACCGCATCGCCAAGTGCCATGACCACATCGAACTGCTGCTAAACGGAGCCGGCGACCAGGTGTTCTTTCGCCACCGCGATGGCCCCTGGATGCCTACCCTGCGCCTCCTGCACAAGTTCCCCTACTTCGTGACCATGCAGCAAGTGGACAAAGGCGCCATCCGCTTCGTCCTGAGCGGAGCGAACGTCATGTGTCCCGGCCTCACATCGCCAGGCGCCTGTATGACGCCGGCCGACAAGGACACCGTGGTGGCCATCATGGCTGAGGGCAAGGAGCACGCCCTGGCCGTTGGACTCCTCACGTTATCCACACAGGAAATTCTGGCGAAGAACAAAGGCATCGGTATCGAGACGTACCACTTCCTCAACGACGGCCTGTGGAAGTCGAAGCCCGTGAAGTAGGCGAAATAGGAATCTGCACTTGCACTTTTTA(SEQ ID NO:152)MFKKFEEKDSISSIQQLKSSVQKGIRAKLLEAYPKLESHIDLILPKKDSYRIAKCHDHIELLLNGAGDQVFFRHRDGPWMPTLRLLHKFPYFVTMQQVDKGAIRFVLSGANVMCPGLTSPGACMTPADKDTVVAIMAEGKEHALAVGLLTLSTQEILAKNKGIGIETYHFLNDGLWKSKPVK


Human homologue of Complete Genome candidate


MCT-1 (multiple copies in a T-cell malignancy) (BAA86055), a novel candidate oncogene involved in the cell cycle which has a domain similar to cyclin H

(SEQ ID NO:153)1gctacctcca actgctgagg aaccggttgc ctaaaaggag ccggcaaaag cgcctacgtg61gagtccagag gagcggaagt agtcagattt gactgagagc cgtaaagcgc ggctggctct121cgttttccgg ataacgacta cagctccgac tgtcagtgcc ggccttcctc gtgtgagggg181atctgccgga cccctgcaaa ttcaatttct ttcccattcc gggcccttcc ctatcgtcgc241ccccttcacc ttggatcatg ttcaagaaat ttgatgaaaa agaaaatgtg tccaactgca301tccagttgaa aacttcagtt attaagggta ttaagaatca attgatagag caatttccag361gtattgaacc atggcttaat caaatcatgc ctaagaaaga tcctgtcaaa atagtccgat421gccatgaaca tatagaaatc cttacagtaa atggagaatt actctttttt agacaaagag481aagggccttt ttatccaacc ctaagattac ttcacaaata tccttttatc ctgccacacc541agcaggttga taaaggagcc atcaaatttg tactcagtgg agcaaatatc atgtgtccag601gcttaacttc tcctggagct aagctttacc ctgctgcagt agataccatt gttgctatca661tggcagaagg aaaacagcat gctctatgtg ttggagtcat gaagatgtct gcagaagaca721ttgagaaagt caacaaagga attggcattg aaaatatcca ttatttaaat gatgggctgt781ggcatatgaa gacatataaa tgagcctcag aaggaatgca cttgggctaa atatggatat841tgtgctgtat ctgtgtttgt gtctgtgtgt gacagcatga agataatgcc tgtggttatg901ctgaataaat tcaccagatg ctaaaaaaaa aaaaaaaaaa aaa(SEQ ID NO:154)1mfkkfdeken vsnciqlkts vikgiknqli eqfpgiepwl nqimpkkdpv kivrchehie61iltvngellf frqregpfyp tlrllhkypf ilphqqvdkg aikfvlsgan imcpgltspg121aklypaavdt ivaimaegkq halcvgvmkm saediekvnk gigienihyl ndglwhmkty181k


Putative function


Role in cell cycle progression


Category 3—Mitotic (Neuroblast) Phenotypes


Example 10 (Category 3)

Line ID—187


Phenotype—Lethal phase between pupal and pharate adult (P-pA). High mitotic index, rod-like overcondensed chromosomes, a few circular metaphases, many overcondensed anaphases and telophases, a few tetraploid cells


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003445 (8B3-7)


P element insertion site—174,362


Annotated Drosophila genome Complete Genome candidate—CG10701 moesin, cytoskeletal binding protein (4 splice variants)

(SEQ ID NO:155)ACGCCGCATGCACTTTTTTATCTATGATATTATGTTTATTATTTCATTATTGAATCGGGAAAACCAAACGTTTTTTTTTTTTTCGTATACAAATCCATTTGCAGTTTGTAAACTTTAGCGTGCATTCGCATCTAATAGTGATATGTTTTCGCTTTTCACAGGTGATGAACCAGGACGTGAAGAAGGAGAATCCCTTGCAGTTTAGGTTCCGTGCCAAATTCTATCCCGAGGATGTGGCCGAGGAGCTGATCCAGGACATTACACTGCGTCTGTTCTACCTGCAGGTGAAGAATGCCATACTGACCGACGAGATCTATTGTCCGCCAGAGACATCCGTGCTGCTCGCCTCGTACGCCGTCCAGGCGCGTCATGGTGACCACAATAAGACCACCCACACAGCCGGCTTTCTGGCCAACGATCGCCTGCTGCCGCAGCGCGTCATCGACCAGCACAAGATGTCCAAGGACGAGTGGGAGCAGTCGATTATGACCTGGTGGCAGGAGCATCGCAGCATGCTGCGCGAGGATGCCATGATGGAGTATCTGAAGATCGCCCAAGACCTGGAGATGTACGGCGTTAACTACTTTGAGATCCGCAACAAGAAGGGCACGGATCTTTGGCTGGGCGTAGACGCACTGGGTCTGAACATTTACGAGCAGGACGATAGGTTGACGCCGAAAATTGGTTTCCCATGGTCCGAGATTCGCAACATTTCGTTCTCGGAGAAGAAGTTCATCATCAAGCCGATCGACAAGAAGGCTCCGGACTTTATGTTCTTTGCGCCACGTGTCCGCATCAACAAGCGCATTCTGGCCCTCTGCATGGGCAACCACGAGCTGTACATGCGTCGCCGCAAGCCGGACACCATCGATGTGCAGCAGATGAAGGCGCAGGCGCGCGAGGAGAAGAATGCCAAACAGCAGGAACGTGAGAAGCTGCAGCTGGCGCTGGCCGCACGCGAACGCGCTGAAAAGAAGCAGCAGGAGTACGAGGATCGGCTAAAGCAGATGCAGGAGGACATGGAGCGTTCGCAGCGCGATCTGCTTGAGGCGCAGGACATGATCCGCCGGCTGGAGGAGCAGCTGAAGCAGCTGCAGGCCGCCAAGGATGAGCTGGAGCTGCGCCAGAAGGAGCTGCAGGCGATGCTGCAGCGCCTCGAGGAGGCCAAGAATATGGAGGCCGTCGAGAAGCTCAAGCTCGAGGAGGAGATCATGGCCAAGCAGATGGAGGTGCAGCGCATTCAGGACGAGGTCAACGCCAAGGATGAGGAGACAAAGCGTCTGCAGGACGAAGTGGAAGACGCCCGACGCAAGCAGGTCATTGCGGCTGAAGCCGCTGCCGCTCTGCTGGCCGCGTCGACAACGCCGCAGCATCACCACGTGGCCGAGGATGAGAACGAGAACGAGGAGGAGCTGACGAACGGCGATGCCGGTGGCGATGTGTCGCGCGACCTGGACACCGACGAGCATATCAAGGACCCCATCGAGGACAGACGCACGCTGGCCGAGCGCAACGAACGCTTGCACGATCAGCTCAAGGCTCTGAAACAAGATTTGGCGCAGTCTCGCGACGAGACGAAAGAGACGGCAAACGATAAGATTCATCGCGAGAACGTTCGCCAGGGACGTGACAAGTACAAGACGCTCCGCGAGATTCGTAAGGGCAACACAAAGCGTCGCGTCGATCAGTTTGAGAACATGTAAAAGCTATCAAAGATCAGAGATCGATAGTGCGCGGGAAAGAGAGAGGGAGCGGTGAGACTCCAGAAAGA(SEQ ID 
NO:156)MNQDVKKENPLQFRFRAKFYPEDVAEELIQDITLRLFYLQVKNAILTDEIYCPPETSVLLASYAVQARHGDHNKTTHTAGFLANDRLLPQRVIDQHKMSKDEWEQSIMTWWQEHRSMLREDAMMEYLKIAQDLEMYGVNYFEIRNKKGTDLWLGVDALGLNIYEQDDRLTPKIGFPWSEIRNISFSEKKFIIKPIDKKAPDFMFFAPRVRINKRILALCMGNHELYMRRRKPDTIDVQQMKAQAREEKNAKQQEREKLQLALAARERAEKKQQEYEDRLKQMQEDMERSQRDLLEAQDMIRRLEEQLKQLQAAKDELELRQKELQAMLQRLEEAKNMEAVEKLKLEEEIMAKQMEVQRIQDEVNAKDEETKRLQDEVEDARRKQVIAAEAAAALLAASTTPQHHHVAEDENENEEELTNGDAGGDVSRDLDTDEHIKDPIEDRRTLAERNERLHDQLKALKQDLAQSRDETKETANDKIHRENVRQGRDKYKTLREIRKGNTKRRVDQFENM(SEQ ID NO:157)GACAACAGAATCGAATCGTCGCTTTTCCGCTTTTAACCATCGTGTCGCGTTGGTCGGTTGGTTTTCCCGCGTAGCTTGTGGCTGCTCAAGAATATATATATATTTCCCAGACGGAGATTTGCATTGAAAAGGCGTAATAATTCAAAAGCTACTGCGCAATCCGTTTTCGGTGCCCAAAATGGTCGTCGTCTCCGACAGCCGCGTCCGTTTGCCGCGTTACGGCGGAGTCAGCGTCAAACGGAAAACGCTAAATGTGCGCGTCACGACAATGGACGCGGAACTGGAGTTCGCCATTCAGTCGACGACGACGGGCAAGCAATTGTTTGACCAGGTGGTGAAGACGATCGGCCTGCGAGAGGTTTGGTTCTTTGGACTCCAGTACACCGACTCCAAGGGCGACTCCACATGGATCAAGCTGTACAAAAAGCCCGAATCGCCGGCCATAAAGACAATAAAATATTTAAAGCGTGTAAAGAAGTATGTGGACAAAAAGACAGCCGACAGCAATGGAGTAAATCATTTAGAGACGAGCGAAGAGGATGACGACGCCGATGATATGACTGGATCAATGCCGTTTTCGACATGGGTGATGAACCAGGACGTGAAGAAGGAGAATCCCTTGCAGTTTAGGTTCCGTGCCAAATTCTATCCCGAGGATGTGGCCGAGGAGCTGATCCAGGACATTACACTGCGTCTGTTCTACCTGCAGGTGAAGAATGCCATACTGACCGACGAGATCTATTGTCCGCCAGAGACATCCGTGCTGCTCGCCTCGTACGCCGTCCAGGCGCGTCATGGTGACCACAATAAGACCACCCACACAGCCGGCTTTCTGGCCAACGATCGCCTGCTGCCGCAGCGCGTCATCGACCAGCACAAGATGTCCAAGGACGAGTGGGAGCAGTCGATTATGACCTGGTGGCAGGAGCATCGCAGCATGCTGCGCGAGGATGCCATGATGGAGTATCTGAAGATCGCCCAAGACCTGGAGATGTACGGCGTTAACTACTTTGAGATCCGCAACAAGAAGGGCACGGATCTTTGGCTGGGCGTAGACGCACTGGGTCTGAACATTTACGAGCAGGACGATAGGTTGACGCCGAAAATTGGTTTCCCATGGTCCGAGATTCGCAACATTTCGTTCTCGGAGAAGAAGTTCATCATCAAGCCGATCGACAAGAAGGCTCCGGACTTTATGTTCTTTGCGCCACGTGTCCGCATCAACAAGCGCATTCTGGCCCTCTGCATGGGCAACCACGAGCTGTACATGCGTCGCCGCAAGCCGGACACCATCGATGTGCAGCAGATGAAGGCGCAGGCGCGCGAGGAGAAGAATGCCAAACAGCAGGAACGTGAGAAGCTGCAGCTGGCGCTGGCCGCACGCGAACGCGCTGAAAAGAAGCAGCAGGAGTACGAGGATCGGCTAAAGCAGATGCAGGAGGACATGGAGCGTTCGCAGCGCGATC
TGCTTGAGGCGCAGGACATGATCCGCCGGCTGGAGGAGCAGCTGAAGCAGCTGCAGGCCGCCAAGGATGAGCTGGAGCTGCGCCAGAAGGAGCTGCAGGCGATGCTGCAGCGCCTCGAGGAGGCCAAGAATATGGAGGCCGTCGAGAAGCTCAAGCTCGAGGAGGAGATCATGGCCAAGCAGATGGAGGTGCAGCGCATTCAGGACGAGGTCAACGCCAAGGATGAGGAGACAAAGCGTCTGCAGGACGAAGTGGAAGACGCCCGACGCAAGCAGGTCATTGCGGCTGAAGCCGCTGCCGCTCTGCTGGCCGCGTCGACAACGCCGCAGCATCACCACGTGGCCGAGGATGAGAACGAGAACGAGGAGGAGCTGACGAACGGCGATGCCGGTGGCGATGTGTCGCGCGACCTGGACACCGACGAGCATATCAAGGACCCCATCGAGGACAGACGCACGCTGGCCGAGCGCAACGAACGCTTGCACGATCAGCTCAAGGCTCTGAAACAAGATTTGGCGCAGTCTCGCGACGAGACGAAAGAGACGGCAAACGATAAGATTCATCGCGAGAACGTTCGCCAGGGACGTGACAAGTACAAGACGCTCCGCGAGATTCGTAAGGGCAACACAAAGCGTCGCGTCGATCAGTTTGAGAACATGTAAAAGCTATCAAAGATCAGAGATCGATAGTGCGCGGGAAAGAGAGAGGGAGCGGTGAGACTCCAGAAAGA(SEQ ID NO:158)MVVVSDSRVRLPRYGGVSVKRKTLNVRVTTMDAELEFAIQSTTTGKQLFDQVVKTIGLREVWFFGLQYTDSKGDSTWIKLYKKPESPAIKTIKYLKRVKKYVDKKTADSNGVNHLETSEEDDDADDMTGSMPFSTWVMNQDVKKENPLQFRFRAKFYPEDVAEELIQDITLRLFYLQVKNAILTDEIYCPPETSVLLASYAVQARHGDHNKTTHTAGFLANDRLLPQRVIDQHKMSKDEWEQSIMTWWQEHRSMLREDAMMEYLKIAQDLEMYGVNYFEIRNKKGTDLWLGVDALGLNIYEQDDRLTPKIGFPWSEIRNISFSEKKFIIKPIDKKAPDFMFFAPRVRINKRILALCMGNHELYMRRRKPDTIDVQQMKAQAREEKNAKQQEREKLQLALAARERAEKKQQEYEDRLKQMQEDMERSQRDLLEAQDMIRRLEEQLKQLQAAKDELELRQKELQAMLQRLEEAKNMEAVEKLKLEEEIMAKQMEVQRIQDEVNAKDEETKRLQDEVEDARRKQVIAAEAAAALLAASTTPQHHHVAEDENENEEELTNGDAGGDVSRDLDTDEHIKDPIEDRRTLAERNERLHDQLKALKQDLAQSRDETKETANDKIHRENVRQGRDKYKTLREIRKGNTKRRVDQFENM(SEQ ID 
NO:159)CCAAAGCGAAACGGGAGCTCTTGGCACGTGCCCTGCTCACATCCCGTTAATCCATCGACCCCTAAACAAATCGTGGGGGATTCTCCTCTGCACGCCACCTTCATCGATGGGTGTCAATTTTTTACTCTTTTTTTTTTCTATTTGGCTTCTAAATGTGCGCGTCACGACAATGGACGCGGAACTGGAGTTCGCCATTCAGTCGACGACGACGGGCAAGCAATTGTTTGACCAGGTGGTGAAGACGATCGGCCTGCGAGAGGTTTGGTTCTTTGGACTCCAGTACACCGACTCCAAGGGCGACTCCACATGGATCAAGCTGTACAAAAAGCCCGAATCGCCGGCCATAAAGACAATAAAATATTTAAAGCGTGTAAAGAAGTATGTGGACAAAAAGACAGCCGACAGCAATGGAGTAAATCATTTAGAGACGAGCGAAGAGGATGACGACGCCGATGATATGACTGGATCAATGCCGTTTTCGACATGGGTGATGAACCAGGACGTGAAGAAGGAGAATCCCTTGCAGTTTAGGTTCCGTGCCAAATTCTATCCCGAGGATGTGGCCGAGGAGCTGATCCAGGACATTACACTGCGTCTGTTCTACCTGCAGGTGAAGAATGCCATACTGACCGACGAGATCTATTGTCCGCCAGAGACATCCGTGCTGCTCGCCTCGTACGCCGTCCAGGCGCGTCATGGTGACCACAATAAGACCACCCACACAGCCGGCTTTCTGGCCAACGATCGCCTGCTGCCGCAGCGCGTCATCGACCAGCACAAGATGTCCAAGGACGAGTGGGAGCAGTCGATTATGACCTGGTGGCAGGAGCATCGCAGCATGCTGCGCGAGGATGCCATGATGGAGTATCTGAAGATCGCCCAAGACCTGGAGATGTACGGCGTTAACTACTTTGAGATCCGCAACAAGAAGGGCACGGATCTTTGGCTGGGCGTAGACGCACTGGGTCTGAACATTTACGAGCAGGACGATAGGTTGACGCCGAAAATTGGTTTCCCATGGTCCGAGATTCGCAACATTTCGTTCTCGGAGAAGAAGTTCATCATCAAGCCGATCGACAAGAAGGCTCCGGACTTTATGTTCTTTGCGCCACGTGTCCGCATCAACAAGCGCATTCTGGCCCTCTGCATGGGCAACCACGAGCTGTACATGCGTCGCCGCAAGCCGGACACCATCGATGTGCAGCAGATGAAGGCGCAGGCGCGCGAGGAGAAGAATGCCAAACAGCAGGAACGTGAGAAGCTGCAGCTGGCGCTGGCCGCACGCGAACGCGCTGAAAAGAAGCAGCAGGAGTACGAGGATCGGCTAAAGCAGATGCAGGAGGACATGGAGCGTTCGCAGCGCGATCTGCTTGAGGCGCAGGACATGATCCGCCGGCTGGAGGAGCAGCTGAAGCAGCTGCAGGCCGCCAAGGATGAGCTGGAGCTGCGCCAGAAGGAGCTGCAGGCGATGCTGCAGCGCCTCGAGGAGGCCAAGAATATGGAGGCCGTCGAGAAGCTCAAGCTCGAGGAGGAGATCATGGCCAAGCAGATGGAGGTGCAGCGCATTCAGGACGAGGTCAACGCCAAGGATGAGGAGACAAAGCGTCTGCAGGACGAAGTGGAAGACGCCCGACGCAAGCAGGTCATTGCGGCTGAAGCCGCTGCCGCTCTGCTGGCCGCGTCGACAACGCCGCAGCATCACCACGTGGCCGAGGATGAGAACGAGAACGAGGAGGAGCTGACGAACGGCGATGCCGGTGGCGATGTGTCGCGCGACCTGGACACCGACGAGCATATCAAGGACCCCATCGAGGACAGACGCACGCTGGCCGAGCGCAACGAACGCTTGCACGATCAGCTCAAGGCTCTGAAACAAGATTTGGCGCAGTCTCGCGACGAGACGAAAGAGACGGCAAACGATAAGATTCATCGCGAGAACGTTCGCCAGGGACGTGACAAGTACAAGACGCTCCGCGAGATTCGTAAGGGCAAC
ACAAAGCGTCGCGTCGATCAGTTTGAGAACATGTAAAAGCTATCAAAGATCAGAGATCGATAGTGCGCGGGAAAGAGAGAGGGAGCGGTGAGACTCCAGAAAGA(SEQ ID NO:160)MGVNFLLFFFSIWLLNVRVTTMDAELEFAIQSTTTGKQLFDQVVKTIGLREVWFFGLQYTDSKGDSTWIKLYKKPESPAIKTIKYLKRVKKYVDKKTADSNGVNHLETSEEDDDADDMTGSMPFSTWVMNQDVKKENPLQFRFRAKFYPEDVAEELIQDITLRLFYLQVKNAILTDEIYCPPETSVLLASYAVQARHGDHNKTTHTAGFLANDRLLPQRVIDQHKMSKDEWEQSIMTWWQEHRSMLREDAMMEYLKIAQDLEMYGVNYFEIRNKKGTDLWLGVDALGLNIYEQDDRLTPKIGFPWSEIRNISFSEKKFIIKPIDKKAPDFMFFAPRVRINKRILALCMGNHELYMRRRKPDTIDVQQMKAQAREEKNAKQQEREKLQLALAARERAEKKQQEYEDRLKQMQEDMERSQRDLLEAQDMIRRLEEQLKQLQAAKDELELRQKELQAMLQRLEEAKNMEAVEKLKLEEEIMAKQMEVQRIQDEVNAKDEETKRLQDEVEDARRKQVIAAEAAAALLAASTTPQHHHVAEDENENEEELTNGDAGGDVSRDLDTDEHIKDPIEDRRTLAERNERLHDQLKALKQDLAQSRDETKETANDKIHRENVRQGRDKYKTLREIRKGNTKRRVDQFENM(SEQ ID NO:161)AAAGCTCACGAAAAACACGCGGCAATTGGATAAGAAACGAAATTGTTGATCCAACGCGAGGAAGAAGAAGAATTGTGAAGCAAGAAGAAGCGAAAACAAACTGCGATTGCAGCACAAAAACAATAAAGAGTTCAGACGATAATATCCTGGAAAGAAAACATTTCGTTTCGATAAGTACGACAAGACACGAAACAACAAAATGTCTCCAAAAGCGCTAAATGTGCGCGTCACGACAATGGACGCGGAACTGGAGTTCGCCATTCAGTCGACGACGACGGGCAAGCAATTGTTTGACCAGGTGGTGAAGACGATCGGCCTGCGAGAGGTTTGGTTCTTTGGACTCCAGTACACCGACTCCAAGGGCGACTCCACATGGATCAAGCTGTACAAAAAGGTGATGAACCAGGACGTGAAGAAGGAGAATCCCTTGCAGTTTAGGTTCCGTGCCAAATTCTATCCCGAGGATGTGGCCGAGGAGCTGATCCAGGACATTACACTGCGTCTGTTCTACCTGCAGGTGAAGAATGCCATACTGACCGACGAGATCTATTGTCCGCCAGAGACATCCGTGCTGCTCGCCTCGTACGCCGTCCAGGCGCGTCATGGTGACCACAATAAGACCACCCACACAGCCGGCTTTCTGGCCAACGATCGCCTGCTGCCGCAGCGCGTCATCGACCAGCACAAGATGTCCAAGGACGAGTGGGAGCAGTCGATTATGACCTGGTGGCAGGAGCATCGCAGCATGCTGCGCGAGGATGCCATGATGGAGTATCTGAAGATCGCCCAAGACCTGGAGATGTACGGCGTTAACTACTTTGAGATCCGCAACAAGAAGGGCACGGATCTTTGGCTGGGCGTAGACGCACTGGGTCTGAACATTTACGAGCAGGACGATAGGTTGACGCCGAAAATTGGTTTCCCATGGTCCGAGATTCGCAACATTTCGTTCTCGGAGAAGAAGTTCATCATCAAGCCGATCGACAAGAAGGCTCCGGACTTTATGTTCTTTGCGCCACGTGTCCGCATCAACAAGCGCATTCTGGCCCTCTGCATGGGCAACCACGAGCTGTACATGCGTCGCCGCAAGCCGGACACCATCGATGTGCAGCAGATGAAGGCGCAGGCGCGCGAGGAGAAGAATGCCAAACAGCAGGAACGTGAGAAGCTGCAGCTGGCGCTGGCCGCACGCGAACGCGCTGAAAAGAAGCAGCAGGAGTACGAGG
ATCGGCTAAAGCAGATGCAGGAGGACATGGAGCGTTCGCAGCGCGATCTGCTTGAGGCGCAGGACATGATCCGCCGGCTGGAGGAGCAGCTGAAGCAGCTGCAGGCCGCCAAGGATGAGCTGGAGCTGCGCCAGAAGGAGCTGCAGGCGATGCTGCAGCGCCTCGAGGAGGCCAAGAATATGGAGGCCGTCGAGAAGCTCAAGCTCGAGGAGGAGATCATGGCCAAGCAGATGGAGGTGCAGCGCATTCAGGACGAGGTCAACGCCAAGGATGAGGAGACAAAGCGTCTGCAGGACGAAGTGGAAGACGCCCGACGCAAGCAGGTCATTGCGGCTGAAGCCGCTGCCGCTCTGCTGGCCGCGTCGACAACGCCGCAGCATCACCACGTGGCCGAGGATGAGAACGAGAACGAGGAGGAGCTGACGAACGGCGATGCCGGTGGCGATGTGTCGCGCGACCTGGACACCGACGAGCATATCAAGGACCCCATCGAGGACAGACGCACGCTGGCCGAGCGCAACGAACGCTTGCACGATCAGCTCAAGGCTCTGAAACAAGATTTGGCGCAGTCTCGCGACGAGACGAAAGAGACGGCAAACGATAAGATTCATCGCGAGAACGTTCGCCAGGGACGTGACAAGTACAAGACGCTCCGCGAGATTCGTAAGGGCAACACAAAGCGTCGCGTCGATCAGTTTGAGAACATGTAAAAGCTATCAAAGATCAGAGATCGATAGTGCGCGGGAAAGAGAGAGGGAGCGGTGAGACTCCAGAAAGA(SEQ ID NO:162)MSPKALNVRVTTMDAELEFAIQSTTTGKQLFDQVVKTIGLREVWFFGLQYTDSKGDSTWIKLYKKVMNQDVKKENPLQFRFRAKFYPEDVAEELIQDITLRLFYLQVKNAILTDEIYCPPETSVLLASYAVQARHGDHNKTTHTAGFLANDRLLPQRVIDQHKMSKDEWEQSIMTWWQEHRSMLREDAMMEYLKIAQDLEMYGVNYFEIRNKKGTDLWLGVDALGLNIYEQDDRLTPKIGFPWSEIRNISFSEKKFIIKPIDKKAPDFMFFAPRVRINKRILALCMGNHELYMRRRKPDTIDVQQMKAQAREEKNAKQQEREKLQLALAARERAEKKQQEYEDRLKQMQEDMERSQRDLLEAQDMIRRLEEQLKQLQAAKDELELRQKELQAMLQRLEEAKNMEAVEKLKLEEEIMAKQMEVQRIQDEVNAKDEETKRLQDEVEDARRKQVIAAEAAAALLAASTTPQHHHVAEDENENEEELTNGDAGGDVSRDLDTDEHIKDPIEDRRTLAERNERLHDQLKALKQDLAQSRDETKETANDKIHRENVRQGRDKYKTLREIRKGNTKRRVDQFENM


Human homologue of Complete Genome candidate


A41289 human moesin

(SEQ ID NO:163)1ggcacgaggc cagccgaatc caagccgtgt gtactgcgtg ctcagcactg cccgacagtc61ctagctaaac ttcgccaact ccgctgcctt tgccgccacc atgcccaaaa cgatcagtgt121gcgtgtgacc accatggatg cagagctgga gtttgccatc cagcccaaca ccaccgggaa181gcagctattt gaccaggtgg tgaaaactat tggcttgagg gaagtttggt tctttggtct241gcagtaccag gacactaaag gtttctccac ctggctgaaa ctcaataaga aggtgactgc301ccaggatgtg cggaaggaaa gccccctgct ctttaagttc cgtgccaagt tctaccctga361ggatgtgtcc gaggaattga ttcaggacat cactcagcgc ctgttctttc tgcaagtgaa421agagggcatt ctcaatgatg atatttactg cccgcctgag accgctgtgc tgctggcctc481gtatgctgtc cagtctaagt atggcgactt caataaggaa gtgcataagt ctggctacct541ggccggagac aagttgctcc cgcagagagt cctggaacag cacaaactca acaaggacca601gtgggaggag cggatccagg tgtggcatga ggaacaccgt ggcatgctca gggaggatgc661tgtcctggaa tatctgaaga ttgctcaaga tctggagatg tatggtgtga actacttcag721catcaagaac aagaaaggct cagagctgtg gctgggggtg gatgccctgg gtctcaacat781ctatgagcag aatgacagac taactcccaa gataggcttc ccctggagtg aaatcaggaa841catctctttc aatgataaga aatttgtcat caagcccatt gacaaaaaag ccccggactt901cgtcttctat gctccccggc tgcggattaa caagcggatc ttggccttgt gcatggggaa961ccatgaacta tacatgcgcc gtcgcaagcc tgataccatt gaggtgcagc agatgaaggc1021acaggcccgg gaggagaagc accagaagca gatggagcgt gctatgctgg aaaatgagaa1081gaagaagcgt gaaatggcag agaaggagaa agagaagatt gaacgggaga aggaggagct1141gatggagagg ctgaagcaga tcgaggaaca gactaagaag gctcagcaag aactggaaga1201acagacccgt agggctctgg aacttgagca ggaacggaag cgtgcccaga gcgaggctga1261aaagctggcc aaggagcgtc aagaagctga agaggccaag gaggccttgc tgcaggcctc1321ccgggaccag aaaaagactc aggaacagct ggccttggaa atggcagagc tgacagctcg1381aatctcccag ctggagatgg cccgacagaa gaaggagagt gaggctgtgg agtggcagca1441gaaggcccag atggtacagg aagacttgga gaagacccgt gctgagctga agactgccat1501gagtacacct catgtggcag agcctgctga gaatgagcag gatgagcagg atgagaatgg1561ggcagaggct agtgctgacc tacgggctga tgctatggcc aaggaccgca gtgaggagga1621acgtaccact gaggcagaga agaatgagcg tgtgcagaag cacctgaagg ccctcacttc1681ggagctggcc aatgccagag atgagtccaa gaagactgcc aatgacatga 
tccatgctga1741gaacatgcga ctgggccgag acaaatacaa gaccctgcgc cagatccggc agggcaacac1801caagcagcgc attgacgaat ttgagtctat gtaatgggca cccagcctct agggacccct1861cctccctttt tccttgtccc cacactccta cacctaactc acctaactca tactgtgctg1921gagccactaa ctagagcagc cctggagtca tgccaagcat ttaatgtagc catgggacca1981aacctagccc cttagccccc acccacttcc ctgggcaaat gaatggctca ctatggtgcc2041aatggaacct cctttctctt ctctgttcca ttgaatctgt atggctagaa tatcctactt2101ctccagccta gaggtacttt ccacttgatt ttgcaaatgc ccttacactt actgttgtcc2161tatgggagtc aagtgtggag taggttggaa gctagctccc ctcctctccc ctccactgtc2221ttcttcaggt cctgagatta cacggtggag tgtatgcggt ctaggaatga gacaggacct2281agatatcttc tccagggatg tcaactgacc taaaatttgc cctcccatcc cgtttagagt2341tatttaggct ttgtaacgat tgggggaata aaaagatgtt cagtcatttt tgtttctacc2401tcccagatcg gatctgttgc aaactcagcc tcaataagcc ttgtcgttga ctttagggac2461tcaatttctc cccagggtgg atgggggaaa tggtgccttc aagaccttca ccaaacatac2521tagaagggca ttggccattc tattgtggca aggctgagta gaagatccta ccccaattcc2581ttgtaggagt ataggccggt ctaaagtgag ctctatgggc agatctaccc cttacttatt2641attccagatc tgcagtcact tcgtgggatc tgcccctccc tgcttcaata cccaaatcct2701ctccagctat aacagtaggg atgagtaccc aaaagctcag ccagccccat caggactctt2761gtgaaaagag aggatatgtt cacacctagc gtcagtattt tccctgctag gggttttagg2821tctcttcccc tctcagagct acttgggcca tagctcctgc tccacagcca tcccagcctt2881ggcatctaga gcttgatgcc agtaggctca actagggagt gagtgcaaaa agctgagtat2941ggtgagagaa gcctgtgccc tgatccaagt ttactcaacc ctctcaggtg accaaaatcc3001ccttctcatc actcccctca aagaggtgac tgggccctgc ctctgtttga caaacctcta3061acccaggtct tgacaccagc tgttctgtcc cttggagctg taaaccagag agctgctggg3121ggattctggc ctagtccctt ccacaccccc accccttgct ctcaacccag gagcatccac3181ctccttctct gtctcatgtg tgctcttctt ctttctacag tattatgtac tctactgata3241tctaaatatt gatttctgcc ttccttgcta atgcaccatt agaagatatt agtcttgggg3301caggatgatt ttggcctcat tactttacca cccccacacc tggaaagcat atactatatt3361acaaaatgac attttgccaa aattattaat ataagaagct ttcagtatta gtgatgtcat3421ctgtcactat aggtcataca atccattctt aaagtacttg 
ttatttgttt ttattattac3481tgtttgtctt ctccccaggg ttcagtccct caaggggcca tcctgtccca ccatgcagtg3541ccccctagct tagagcctcc ctcaattccc cctggccacc accccccact ctgtgcctga3601ccttgaggag tcttgtgtgc attgctgtga attagctcac ttggtgatat gtcctatatt3661ggctaaattg aaacctggaa ttgtggggca atctattaat agctgcctta aagtcagtaa3721cttaccctta gggaggctgg gggaaaaggt tagattttgt attcaggggt tttttgtgta3781ctttttgggt ttttaaaaaa ttgtttttgg aggggtttat gctcaatcca tgttctattt3841cagtgccaat aaaatttagg tgacttcaaa aaaaaaaaa(SEQ ID NO:164)1mpktisvrvt tmdaelefai qpnttgkqlf dqvvktiglr evwffglqyq dtkgfstwlk61lnkkvtaqdv rkespllfkf rakfypedvs eeliqditqr lfflqvkegi lnddiycppe121tavllasyav qskygdfnke vhksgylagd kllpqrvleq hklnkdqwee riqvwheehr181gmlredavle ylkiaqdlem ygvnyfsikn kkgselwlgv dalglniyeq ndrltpkigf241pwseirnisf ndkkfvikpi dkkapdfvfy aprlrinkri lalcmgnhel ymrrrkpdti301evqqmkaqar eekhqkqmer amlenekkkr emaekekeki erekeelmer lkqieeqtkk361aqqeleeqtr raleleqerk raqseaekla kerqeaeeak eallqasrdq kktqeqlale421maeltarisq lemarqkkes eavewqqkaq mvqedlektr aelktamstp hvaepaeneq481deqdengaea sadlradama kdrseeertt eaeknervqk hlkaltsela nardeskkta541ndmihaenmr lgrdkyktlr qirqgntkqr idefesm
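
The human homologue sequences in this listing are reproduced in a flattened database layout: each run of 60 residues is preceded by its start position (1, 61, 121, and so on) and is usually split into blocks of ten. The following is a minimal Python sketch for recovering contiguous sequences from that layout. It assumes the input contains only `(SEQ ID NO:n)` labels and sequence text, with annotation lines such as "Putative function" already removed. The function name `split_records` is hypothetical and is not part of the original disclosure.

```python
import re

# Matches labels such as "(SEQ ID NO:163)"; \s+ also tolerates a label
# that has been wrapped across a line break after "SEQ ID".
SEQ_LABEL = re.compile(r"\(SEQ ID\s+NO:(\d+)\)")


def split_records(text):
    """Return {sequence number: contiguous sequence} for a flattened listing.

    Position counters and whitespace are removed from each sequence body.
    Assumes the text holds only labels and sequence data.
    """
    # With one capturing group, split() yields
    # [preamble, id1, body1, id2, body2, ...].
    parts = SEQ_LABEL.split(text)
    records = {}
    for seq_id, body in zip(parts[1::2], parts[2::2]):
        records[int(seq_id)] = re.sub(r"[\d\s]+", "", body)
    return records


# Example using the opening of SEQ ID NO:163 and SEQ ID NO:164.
sample = (
    "(SEQ ID NO:163)1ggcacgaggc cagccgaatc caagccgtgt gtactgcgtg "
    "ctcagcactg cccgacagtc61ctagctaaac"
    "(SEQ ID NO:164)1mpktisvrvt tmdaelefai"
)
records = split_records(sample)
print(len(records[163]))  # 70
print(records[164])       # mpktisvrvttmdaelefai
```

Because every digit is stripped, the sketch handles both fully spaced blocks and blocks that were fused during extraction. It would corrupt any input in which digits are part of the data rather than position counters.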


Putative function


Cytoskeletal binding protein linking to the plasma membrane, involved in cytokinesis and cell shape


Example 11 (Category 3)

Line ID—226


Phenotype—Lethal phase pharate adult. High mitotic index, rod-like overcondensed chromosomes, lagging chromosomes and bridges in anaphase, highly condensed


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003423 (2F1-2)


P element insertion site—226,527


Annotated Drosophila genome Complete Genome candidate—CG2865—EG:25E8.4

(SEQ ID NO:165)AGAAAACCACATAAACAAGCCAGCAAACAAGGCACACACTTGCTTGAAAAACGCACAATGACCTTGCCCACAAACACACACGCATCTGCAAACGACGGCGGCAGCGGCAACAACAACCACAGCAATATCAGCAGTAACAACAGCAGCAGCAGCGACGAAGACTCAGACATGTTTGGACCACCCCGCTGCTCCCCGCCCATCGGCTATCACCATCACCGTTCCCGTGTGCCCATGATCTCGCCAAAGCTGCGGCAGCGCGAGGAGCGCAAGCGGATCCTCCAGCTCTGCGCCCACAAGATGGAGAGGATCAAGGACTCGGAGGCGAACCTGCGGCGCAGCGTCTGCATCAACAACACCTACTGCCGCCTGAATGACGAACTGCGGCGCGAGAAGCAGATGCGCTACCTCCAGAATCTGCCCAGAACCAGCGACAGCGGCGCAAGCACCGAACTGGCGCGTGAGAATCTCTTCCAGCCGAACATGGACGACGCCAAGCCGGCCGGCAATAGCACTAGCAATAATATCAACGCCAACGGCAAGCCTTCATCCTCTTTTGGCGATGCCTTTGGCTCCTCAAACGGATCATCGTCGGGTCGCGGCGGAATTTGCTCCCTGGAGAATCAACCGCCCGAGCGTCAGCAGTTGGGGACGCCCGCTGGTGCCTCCGCTCCCGAGGCGGCCAATTCGGCGCCCCTTTCCGTTTCGGGCTCGGCATCGGAACGCGTGAATAACCGAAAACGCCACCTGTCCAGCTGCAACTTGGTCAACGATCTGGAAATACTGGACAGGGAGCTGAGCGCCATCAATGCACCCATGCTGCTAATCGATCCAGAGATTACCCAAGGAGCCGAACAGCTGGAGAAGGCCGCCTTGTCCGCCAGCAGGAAGAGATTGAGGAGCAATAGCGGCAGCGAGGACGAAAGTGATCGCCTGGTGCGCGAGGCTCTGTCCCAGTTCTACATACCGCCACAGCGCCTCATCTCCGCCATTGAGGAGTGTCCCCTGGATGTGGTTGGCTTGGGTATGGGAATGAATGTGAATGTGAATGTGGGAGGAATTAGTGGAATCGGTGGCATCGGAGGAGCTGCAGGCGCTGGCGTCGAAATGCCCGGAGGCAAACGGATGAAGCTGAATGACCATCACCATCTCAATCACCATCACCATTTGCACCATCATCTGGAGCTGGTCGATTTCGACATGAACCAAAACCAAAAGGATTTCGAGGTGATCATGGACGCCTTGAGGCTGGGAACGGCGACACCGCCGAGCGGCGCCAGCAGCGATTCTTGCGGACAGGCGGCGATGATGAGCGAGTCGGCCAGCGTGTTCCACAATCTGGTGGTCACCTCGTTGGAGACATGA(SEQ ID NO:166)MTLPTNTHASANDGGSGNNNHSNISSNNSSSSDEDSDMFGPPRCSPPIGYHHHRSRVPMISPKLRQREERKRILQLCAHKMERIKDSEANLRRSVCINNTYCRLNDELRREKQMRYLQNLPRTSDSGASTELARENLFQPNMDDAKPAGNSTSNNINANGKPSSSFGDAFGSSNGSSSGRGGICSLENQPPERQQLGTPAGASAPEAANSAPLSVSGSASERVNNRKRHLSSCNLVNDLEILDRELSAINAPMLLIDPEITQGAEQLEKAALSASRKRLRSNSGSEDESDRLVREALSQFYIPPQRLISAIEECPLDVVGLGMGMNVNVNVGGISGIGGIGGAAGAGVEMPGGKRMKLNDHHHLNHHHHLHHHLELVDFDMNQNQKDFEVIMDALRLGTATPPSGASSDSCGQAAMMSESASVFHNLVVTSLET


Human homologue of Complete Genome candidate


CG2865—none


Putative function


Putative phosphatidylinositol 3-kinase


Example 12 (Category 3)

Line ID—269


Phenotype—Lethal phase pupal-pharate adult. High mitotic index, colchicine-type overcondensation, high frequency of polyploids


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003568 (19F)


P element insertion site—197,805


Annotated Drosophila genome Complete Genome candidate—CG1696—novel protein

(SEQ ID NO:167)AAAACTCATCGATGCTGCGAAAGTGCGATAGTATCGAATAAACATGAGTGTGTGCATGAGTGTGGGAATTTATTAAACAAAAACGAAACGCGGACAAACTATATTTATGTAATAAACACTAAGCCGCAGCGCCAACGAGTAATGAACAGTCCACGGCCAGGTCGTACTATTCAGGCGAACGCACCTCGCAATCGACTGCAATCAAAGTGCAATAGCTCAATCAATTGATTCGTTTTGCTCAACCAAAAACAAAATCTATTCCCAAATCGGTGCGATAGTTGCCAAAATATAAAAACTACACTACGCTAAAAAAAAAACAATACACTCACACACTGGCGTACAAGACAACAAAAGAGAAGAAGAAGAGCAGACGCCAGATATAAAAAGCCCCCAAAAGAATTGGAAATAAGACCATACCCCTCCTTCTCCCTTGAAAAGGGACCTTAAAACTAGGCGACACCGAATAATTGAACTCAAGTAAAAAACCGGGAAAAGAGAAAAACACTTTCAACAAAATATCTAGAAGCCTTGTTATCGATTTTGTTCCGGGTTTTTTTTGTGTGAGTGTGTGTTGTGTGAAGCGCGCCCGCGGGTGTGTGGGTGAGTGTGCGTGTGGCTCTCGGCGCGTTATCAAAAACAACAACAATTCGTTGCAAAAGAAAAAATAAAGTAGAGGAGGCGGAAGAAGAAGAGGAATCTGCTCGCACCGCGGTCAATCGCGGATCGTGGTCGATTTATCGAATTAATCGCCCCGAACAAAAAAAACACCGTACAAGGACTTGCACTATTTCCAATGATTTCGCTGCTGCAAATGAAATTCCGTGCGCTTTTGTTGTTGCTATCAAAAGTATGGACATGCATTTGTTTCATGTTCAATCGCCAAGTGCGAGCTTTTATCCAGTATCAACCGGTTAAATACGAACTCTTCCCGTTGTCACCCGTCTCGCGGCACCGCCTGAGCCTGGTGCAGCGCAAGACCCTCGTTCTGGACCTGGACGAAACGCTAATCCACTCCCATCACAATGCGATGCCCCGGAATACGGTGAAGCCGGGCACGCCGCACGATTTCACTGTCAAAGTGACCATCGATCGGAATCCAGTGCGCTTTTTCGTGCACAAGCGACCGCATGTGGACTACTTCCTGGACGTGGTCTCGCAGTGGTACGATCTGGTGGTCTTCACGGCCAGCATGGAGATTTACGGAGCGGCGGTGGCAGACAAGCTGGACAACGGACGAAACATCCTCCGGAGGCGATACTACAGACAGCACTGCACGCCCGACTACGGATCCTACACCAAAGACCTGTCGGCCATCTGCAGTGACCTAAATAGGATATTTATCATCGACAATTCGCCCGGCGCCTATCGCTGTTTTCCCAACAACGCCATACCCATCAAGAGTTGGTTCTCGGACCCGATGGACACGGCGCTGCTGTCGCTGCTGCCCATGCTGGATGCGCTGAGGTTCACGAACGACGTGAGATCGGTGCTGTCGAGGAACTTGCACCTGCACCGCCTCTGGTAGCAGGTGGGCCGCCTGTCGCTAGTTTAGTTTA(SEQ ID NO:168)MISLLQMKFRALLLLLSKVWTCICFMFNRQVRAFIQYQPVKYELFPLSPVSRHRLSLVQRKTLVLDLDETLIHSHHNAMPRNTVKPGTPHDFTVKVTIDRNPVRFFVHKRPHVDYFLDVVSQWYDLVVFTASMEIYGAAVADKLDNGRNILRRRYYRQHCTPDYGSYTKDLSAICSDLNRIFIIDNSPGAYRCFPNNAIPIKSWFSDPMDTALLSLLPMLDALRFTNDVRSVLSRNLHLHRLW


Human homologue of Complete Genome candidate


NP056158 hypothetical protein

(SEQ ID NO:169)1gccggggccg gcggtgccgg ggtcatcggg atgatgcggacgcagtgtct gctggggctg61cgcgcgttcg tggccttcgc cgccaagctc tggagcttcttcatttacct tttgcggagg121cagatccgca cggtaattca gtaccaaact gttcgatatgatatcctccc cttatctcct181gtgtcccgga atcggctagc ccaggtgaag aggaagatcctggtgctgga tctggatgag241acacttattc actcccacca tgatggggtc ctgaggcccacagtccggcc tggtacgcct301cctgacttca tcctcaaggt ggtaatagac aaacatcctgtccggttttt tgtacataag361aggccccatg tggatttctt cctggaagtg gtgagccagtggtacgagct ggtggtgttt421acagcaagca tggagatcta tggctctgct gtggcagataaactggacaa tagcagaagc481attcttaaga ggagatatta cagacagcac tgcactttggagttgggcag ctacatcaag541gacctctctg tggtccacag tgacctctcc agcattgtgatcctggataa ctccccaggg601gcttacagga gccatccaga caatgccatc cccatcaaatcctggttcag tgaccccagc661gacacagccc ttctcaacct gctcccaatg ctggatgccctcaggttcac cgctgatgtt721cgttccgtgc tgagccgaaa ccttcaccaa catcggctctggtgacagct gctccccctc781cacctgagtt ggggtggggg ggaaagggag ggcgagcccttgggatgccg tctgatgccc841tgtccaatgt gaggactgcc tgggcagggt ctgcccctcccacccctctc tgccctggga901gccctacact ccacttggag tctggatgga cacatgggccaggggctctg aagcagcctc961actcttaact tcgtgttcac actccatgga aaccccagactgggacacag gcggaagcct1021aggagagccg aatcagtgtt tgtgaagagg caggactggccagagtgaca gacatacggt1081gatccaggag gctcaaagag aagccaagtc agctttgttgtgatttgatt ttttttaaaa1141aactcttgta caaaactgat ctaattcttc actcctgctccaagggctgg gctgtgggtg1201ggatactggg attttgggcc actggatttt ccctaaatttgtcccccctt tactctccct1261ctatttttct ctccttagac tccctcagac ctgtaaccagctttgtgtct tttttccttt1321tctctctttt aaaccatgca ttataacttt gaaacc(SEQ ID NO:170)1mmrtqcllgl rafvafaakl wsffiyllrr qirtviqyqtvrydilplsp vsrnrlaqvk61rkilvldlde tlihshhdgv lrptvrpgtp pdfilkvvidkhpvrffvhk rphvdfflev121vsqwyelvvf tasmeiygsa vadkldnsrs ilkrryyrqhctlelgsyik dlsvvhsdls181sivildnspg ayrshpdnai pikswfsdps dtallnllpmldalrftadv rsvlsrnlhq241hrlw


Putative function


Unknown


Example 13 (Category 3)

Line ID—291


Phenotype—Lethal phase pupal-pharate adult. High mitotic index, colchicine-type overcondensed chromosomes, many strongly stained nuclei


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003427 (3D5)


P element insertion site—131,166


Annotated Drosophila genome Complete Genome candidate—CG10798—dm diminutive, dMyc1

(SEQ ID NO:171)GTCGCGTGTTCAGTTCACCGCGGGTAATTCAGAGAATCGCTTTGTGGATTGGATTTTTGCCTGTTTTCCGCCCGATACAAAAAAAAAAAACCAAACGCTATATAAATAGTTCTGTAGTAAAACCTGAAGCAACACGTTTTAAAATATACAACTACTACTAACAACTGTCACAGCCAAGTTACAAAAGTGCTAAATCCCAGAAATAACCTAAGAGCCGACTTAAAACCGCGCAAATACATAAAAAAAAATCTTCTCCAAAGCAGAAACAAAAACTTGTGAAAAACTAGAATTAAAAAAAGATTTTTTAAAAAAAATCAGCTAGTGCAAAATAAACGGGAAGAATTTTTTTTTGTGTCCCTTTTTTTGGTGTTTTTTCTCCGTCTTTCCCCTTCTTTGACGCAAAAAAAAAAGTGCCCAACTTGCTGGCGGCACGGGAACGGGATAGAAATAGATATAGCCGAAAGCGACTGGAAAGCAAAGGAAGCTAACTAAATTGGATTACAATCAATTAAATAGAGACGGATACGGAAACTATGTTCAGCGAGACAGGCATATAACTCAGGAACTTAAGATATATAGAAAGAAAAAAAAACCCAGACAACATAATCGCAATGGCCCTTTACCGCTCTGATCCGTATTCCATAATGGACGACCAACTTTTTTCAAATATTTCAATATTCGATATGGATAATGATCTGTACGATATGGACAAACTCCTTTCGTCGTCCACCATTCAGAGTGATCTCGAGAAGATCGAGGACATGGAAAGTGTATTTCAAGACTATGACTTAGAGGAGGATATGAAGCCAGAGATCCGCAACATCGACTGCATGTGGCCGGCGATGTCCAGCTGTTTGACCAGCGGTAACGGTAATGGAATAGAGAGCGGAAACAGTGCAGCCTCGTCGTACAGCGAAACCGGTGCCGTATCCCTGGCGATGGTTTCCGGCTCTACGAATCTCTACAGCGCGTATCAACGATCGCAGACGACAGATAACACCCAGTCAAATCAACAGCATGTCGTCAACAGTGCCGAGAACATGCCGGTGATCATCAAGAAGGAGCTCGCAGATCTGGACTACACGGTCTGTCAGAAGCGCCTCCGTTTGAGCGGCGGTGACAAGAAGTCACAGATCCAGGACGAGGTCCATTTAATACCGCCCGGCGGAAGTTTGCTCCGCAAGCGGAACAACCAGGACATTATCCGCAAATCGGGCGAATTGAGCGGCAGCGATAGCATAAAATACCAGAGACCAGACACACCTCACAGTCTTACCGACGAGGTGGCCGCCTCAGAGTTTAGACATAACGTCGACTTGCGTGCCTGCGTGATGGGCAGCAATAATATCTCGCTGACCGGCAATGATAGCGATGTCAACTACATTAAGCAAATCAGCAGGGAGCTTCAGAATACCGGCAAGGATCCGTTGCCGGTGCGTTACATCCCGCCGATCAACGATGTCCTCGATGTGCTCAACCAGCATTCCAATTCGACGGGTGGCCAACAGCAGTTGAACCAACAGCAACTGGACGAGCAACAACAGGCCATCGATATAGCCACTGGACGCAACACAGTGGATTCTCCGCCGACGACCGGCTCTGATAGTGACTCCGATGACGGTGAACCCCTCAACTTTGACCTGCGCCATCATCGCACTAGCAAAAGCGGCAGCAATGCCAGCATCACCACCAACAACAACAACAGCAACAACAAAAACAACAAATTGAAGAACAACAGCAACGGCATGCTGCACATGATGCACATCACCGATCACAGCTACACGCGCTGCAACGATATGGTGGACGATGGTCCCAATTTGGAGACCCCCTCAGATTCCGATGAGGAAATCGATGTCGTTTCATATACGGACAAGAAGCTACCCACAAATCCCTCGTGCCACTTGATGGGCGCCCTACAGTTCCAGATGGCCCATAAGATCTCGATTGATCACATGAAGCAAAAACCGCGCTACAATAACTTCAAT
CTGCCGTACACACCGGCCAGCAGCAGTCCAGTGAAATCGGTGGCCAACTCGCGTTATCCATCACCGTCGAGCACACCGTATCAGAACTGCTCCTCCGCTTCGCCGTCCTACTCGCCGCTATCCGTGGACTCTTCAAATGTCAGCTCGAGCAGCTCCAGTTCCAGTTCGCAGTCAAGCTTCACCACCTCCAGTTCGAACAAGGGACGCAAACGATCCAGTCTGAAGGATCCAGGCTTGTTGATCTCCTCCAGCAGCGTTTATCTGCCGGGAGTCAATAACAAAGTGACGCATAGCTCCATGATGAGCAAAAAGAGTCGTGGCAAGAAGGTGGTTGGCACCTCGTCTGGCAATACATCTCCGATATCGTCTGGCCAGGATGTGGATGCCATGGATCGTAATTGGCAGCGGCGCAGTGGTGGAATTGCCACTAGCACAAGCTCCAACAGCAGTGTCCATCGGAAGGACTTTGTTTTGGGCTTTGATGAGGCCGATACGATCGAGAAGCGCAATCAGCACAATGATATGGAGCGTCAGCGACGCATTGGACTCAAGAACCTCTTTGAGGCTCTAAAGAAACAGATTCCCACAATTAGGGACAAGGAGCGGGCTCCCAAGGTAAATATCCTGCGAGAGGCGGCCAAGCTATGCATCCAGCTGACCCAGGAGGAGAAGGAGCTTAGTATGCAGCGCCAGCTTTTGTCGCTGCAGCTGAAGCAACGTCAGGACACTCTGGCCAGTTACCAAATGGAGTTGAACGAATCGCGCTCGGTTAGTGGATAGTGTTGTCTCATACTATCGGCTTAAAGCGGCGGCGTAGGGCTAGGATAACCCCCAATGTATATGCAAGATTTGTATATCCTCCTACTTTTTTTTTTTTGCAATTTACTTTGATTTAGCTTCGATCCTTTCTTGACATTAAGCCCTAAATATGATTTTTTTCTGGAGAACTTCAATATCAGTTAGTAGGTTATGTTTAACGATTTGCTTGCGCTTTTTCCGCTTTTTTTTTTGTTTTTTTACCATACCATACCATAC(SEQ ID NO:172)MDDQLFSNISIFDMDNDLYDMDKLLSSSTIQSDLEKIEDMESVFQDYDLEEDMKPEIRNIDCMWPAMSSCLTSGNGNGIESGNSAASSYSETGAVSLAMVSGSTNLYSAYQRSQTTDNTQSNQQHVVNSAENMPVIIKKELADLDYTVCQKRLRLSGGDKKSQIQDEVHLIPPGGSLLRKRNNQDIIRKSGELSGSDSIKYQRPDTPHSLTDEVAASEFRHNVDLRACVMGSNNISLTGNDSDVNYIKQISRELQNTGKDPLPVRYIPPINDVLDVLNQHSNSTGGQQQLNQQQLDEQQQAIDIATGRNTVDSPPTTGSDSDSDDGEPLNFDLRHHRTSKSGSNASITTNNNNSNNKNNKLKNNSNGMLHMMHITDHSYTRCNDMVDDGPNLETPSDSDEEIDVVSYTDKKLPTNPSCHLMGALQFQMAHKISIDHMKQKPRYNNFNLPYTPASSSPVKSVANSRYPSPSSTPYQNCSSASPSYSPLSVDSSNVSSSSSSSSSQSSFTTSSSNKGRKRSSLKDPGLLISSSSVYLPGVNNKVTHSSMMSKKSRGKKVVGTSSGNTSPISSGQDVDAMDRNWQRRSGGIATSTSSNSSVHRKDFVLGFDEADTIEKRNQHNDMERQRRIGLKNLFEALKKQIPTIRDKERAPKVNILREAAKLCIQLTQEEKELSMQRQLLSLQLKQRQDTLASYQMELNESRSVSG


Human homologue of Complete Genome candidate


CAA23831 c-myc oncogene

(SEQ ID NO:173)1ctgctcgcgg ccgccaccgc cgggccccgg ccgtccctggctcccctcct gcctcgagaa61gggcagggct tctcagaggc ttggcgggaa aaaagaacggagggagggat cgcgctgagt121ataaaagccg gttttcgggg ctttatctaa ctcgctgtagtaattccagc gagaggcaga181gggagcgagc gggcggccgg ctagggtgga agagccgggcgagcagagct gcgctgcggg241cgtcctggga agggagatcc ggagcgaata gggggcttcgcctctggccc agccctcccg301cttgatcccc caggccagcg gtccgcaacc cttgccgcatccacgaaact ttgcccatag361cagcgggcgg gcactttgca ctggaactta caacacccgagcaaggacgc gactctcccg421acgcggggag gctattctgc ccatttgggg acacttccccgccgctgcca ggacccgctt481ctctgaaagg ctctccttgc agctgcttag acgctggatttttttcgggt agtggaaaac541cagcagcctc ccgcgacgat gcccctcaac gttagcttcaccaacaggaa ctatgacctc601gactacgact cggtgcagcc gtatttctac tgcgacgaggaggagaactt ctaccagcag661cagcagcaga gcgagctgca gcccccggcg cccagcgaggatatctggaa gaaattcgag721ctgctgccca ccccgcccct gtcccctagc cgccgctccgggctctgctc gccctcctac781gttgcggtca cacccttctc ccttcgggga gacaacgacggcggtggcgg gagcttctcc841acggccgacc agctggagat ggtgaccgag ctgctgggaggagacatggt gaaccagagt901ttcatctgcg acccggacga cgagaccttc atcaaaaacatcatcatcca ggactgtatg961tggagcggct tctcggccgc cgccaagctc gtctcagagaagctggcctc ctaccaggct1021gcgcgcaaag acagcggcag cccgaacccc gcccgcggccacagcgtctg ctccacctcc1081agcttgtacc tgcaggatct gagcgccgcc gcctcagagtgcatcgaccc ctcggtggtc1141ttcccctacc ctctcaacga cagcagctcg cccaagtcctgcgcctcgca agactccagc1201gccttctctc cgtcctcgga ttctctgctc tcctcgacggagtcctcccc gcagggcagc1261cccgagcccc tggtgctcca tgaggagaca ccgcccaccaccagcagcga ctctgaggag1321gaacaagaag atgaggaaga aatcgatgtt gtttctgtggaaaagaggca ggctcctggc1381aaaaggtcag agtctggatc accttctgct ggaggccacagcaaacctcc tcacagccca1441ctggtcctca agaggtgcca cgtctccaca catcagcacaactacgcagc gcctccctcc1501actcggaagg actatcctgc tgccaagagg gtcaagttggacagtgtcag agtcctgaga1561cagatcagca acaaccgaaa atgcaccagc cccaggtcctcggacaccga ggagaatgtc1621aagaggcgaa cacacaacgt cttggagcgc cagaggaggaacgagctaaa acggagcttt1681tttgccctgc gtgaccagat cccggagttg gaaaacaatgaaaaggcccc caaggtagtt1741atccttaaaa aagccacagc 
atacatcctg tccgtccaagcagaggagca aaagctcatt1801tctgaagagg acttgttgcg gaaacgacga gaacagttgaaacacaaact tgaacagcta1861cggaactctt gtgcgtaagg aaaagtaagg aaaacgattccttctaacag aaatgtcctg1921agcaatcacc tatgaacttg tttcaaatgc atgatcaaatgcaacctcac aaccttggct1981gagtcttgag actgaaagat ttagccataa tgtaaactgcctcaaattgg actttgggca2041taaaagaact tttttatgct taccatcttt tttttttctttaacagattt gtatttaaga2101attgttttta aaaaatttta a(SEQ ID NO:174)1mplnvsftnr nydldydsvq pyfycdeeen fyqqqqqselqppapsediw kkfellptpp61lspsrrsglc spsyvavtpf slrgdndggg gsfstadqlemvtellggdm vnqsficdpd121detfikniii qdcmwsgfsa aaklvsekla syqaarkdsgspnparghsv cstsslylqd181lsaaasecid psvvfpypln dssspkscas qdssafspssdsllsstess pqgspeplvl241heetppttss dseeeqedee eidvvsvekr qapgkrsesgspsagghskp phsplvlkrc301hvsthqhnya appstrkdyp aakrvkldsv rvlrqisnnrkctsprssdt eenvkrrthn361vlerqrrnel krsffalrdq ipelenneka pkvvilkkatayilsvqaee qkliseedll421rkrreqlkhk leqlrnsca


Putative function


c-myc oncogene, transcription factor


Example 14 (Category 3)

Line ID—316


Phenotype—Lethal phase larval stage 3—Pre-pupal-pupal. Small optic lobes, missing or small imaginal discs, badly defined chromosomes.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003506 (16B-C)


P element insertion site—27,868


Annotated Drosophila genome Complete Genome candidate—CG8465—novel protein (3 splice variants)

(SEQ ID NO:175)TGACAGTCCGCCTCTAATTTAATTTCGTTTGTGCACATTTTGTTTGAAAGACGCTTAAGATTATTGGGTTTTGTTTCATGTATTGTGCCCTTTGTGCTAAAAGTGCATCCGCCATTTTACGCAGAGATGTCGACCTATTTCGGGGTCTATATCCCGACCTCCAAAGCGGGCTGTTTTGAGGGATCGGTGTCGCAGTGCATCGGCTCCATAGCCGCGGTGAACATAAAGCCATCCAATCCGGCGTCTGGATCGGCATCAGTAGCATCGGGATCGCCATCCGGCTCGGCGGCATCCGTGCAAACGGGCAACGCAGACGATGGCAGTGCTGCCACCAAGTACGAGGATCCCGACTATCCACCGGACTCGCCACTGTGGCTGATCTTCACGGAGAAATCCAAGGCGCTGGACATCCTGCGACACTACAAGGAGGCGCGCCTCCGCGAGTTTCCCAATCTGGAGCAGGCGGAGAGTTACGTTCAGTTTGGGTTCGAGAGCATCGAGGCGCTCAAGAGATTTTGCAAGGCAAAGCCCGAAAGCAAGCCCATTCCGATAATCAGCGGTAGCGGTTACAAGAGCTCACCGACCTCGACGGACAATTCGTGCTCCTCCTCGCCGACGGGTAACGGCAGTGGCTTCATCATTCCCCTGGGAAGCAATTCCTCAATGTCGAATTTACTGCTCAGTGACTCACCGACTTCCTCGCCGAGCAGCTCCAGCAACGTCATTGCCAATGGGCGACAGCAGCAGATGCAGCAGCAACAGCAGCAGCAGCCGCAGCAGCCGGATGTGTCCGGAGAAGGCCCTCCTTTCCGGGCGCCCACCAAACAGGAACTGGTAGAGTTTCGCAAGCAAATCGAAGGTGGTCACATAGACCGGGTGAAGAGGATTATATGGGAGAATCCACGATTTTTGATCAGCAGCGGTGATACGCCCACCAGTTTGAAGGAGGGCTGTCGCTATAATGCCATGCACATCTGCGCCCAGGTCAATAAGGCCAGGATCGCTCAGTTGCTGTTAAAGACCATTTCGGATCGGGAGTTCACTCAGCTTTACGTTGGCAAGAAGGGCAGTGGCAAGATGTGTGCTGCCCTCAACATCAGTCTCCTGGACTATTACCTGAACATGCCGGACAAGGGGCGCGGCGAAACACCGCTCCACTTTGCCGCAAAGAACGGTCATGTGGCCATGGTCGAGGTTCTCGTTTCCTATCCGGAGTGCAAATCGCTGCGGAATCATGAGGGCAAGGAGCCCAAGGAAATCATCTGCCTGCGTAATGCTAATGCTACACATGTGACCATCAAGAAGCTGGAGCTGCTCTTGTACGATCCGCATTTTGTGCCCGTACTAAGATCCCAGTCAAATACACTGCCGCCAAAAGTGGGTCAACCGTTCTCGCCCAAAGATCCACCGAACCTGCAACACAAAGCGGACGATTACGAGGGCCTCAGCGTGGACCTGGCAATCAGTGCGCTGGCGGGACCCATGTCCCGCGAAAAGGCCATGAACTTCTATCGCCGTTGGAAGACACCACCGCGGGTCAGCAACAATGTGATGTCGCCGCTGGCTGGTTCACCATTTAGCTCGCCGGTGAAAGTAACCCCAAGCAAGTCGATCTTTGACCGAAGTGCTGGAAACTCGAGTCCAGTCCACTCAGGACGCAGAGTGCTCTTTAGTCCATTGGCGGAGGCGACCAGCTCACCAAAACCGACGAAAAACGTGCCCAATGGCACCAATGAGTGCGAGCACAACAATAATAATGTGAAGCCAGTGTATCCGTTGGAGTTCCCGGCGACACCCATTCGAAAAATGAAACCGGATTTATTCATGGCCTATCGCAATAACAATAGCTTTGATTCGCCATCTTTGGCCGATGACTCCCAAATCCTGGACATGAGCCTAAGCCGCAGCCTGAATGCGTCGCTAAATGACAGCTTCCGTGAGCGGCACATCAAGAACACTGATATCGAGAAGGGTCTGGAGGTGGT
CGGCCGCCAACTGGCACGACAGGAGCAGTTAGAGTGGCGCGAGTACTGGGATTTTCTCGATTCATTTTTGGACATTGGTACGACCGAAGGCCTGGCCCGTCTTGAAGCGTATTTCCTGGAAAAGACCGAACAGCAGGCGGATAAATCAGAAACGGTCTGGAACTTTGCCCATCTGCATCAGTATTTCGATTCGATGGCCGGCGAGCAACAGCAGCAACTCCGAAAGGATAAAAATGAGGCTGCGGGAGCAACTTCGCCATCCGCCGGAGTCATGACTCCGTACACATGCGTAGAGAAGTCGCTGCAAGTGTTCGCCAAGCGCATCACTAAAACGTTGATCAACAAAATCGGCAACATGGTGTCCATCAACGACACGCTGCTCTGTGAGCTCAAAAGACTGAAATCGCTGATTGTCAGCTTCAAGGATGATGCCCGCTTCATTAGCGTGGACTTTAGCAAGGTGCATTCACGTATCGCCCACCTGGTGGCCAGCTATGTGACCCACTCGCAGGAGGTCAGCGTAGCCATGCGTCTACAATTGTTGCAGATGCTCCGAAGTTTGCGGCAACTGCTGGCCGACGAGCGTGGTCGAGAACAGCATTTGGGCTGCGTGTGCGCTAGTCTATTGCTGATGCTGGAACAGGCGCCGACATCCGCCGTGCATCTACCAGACACTCTGAAGACCGAGGAGCTATGTTGCGCCGCCTGGGAGACGGAGCAGTGTTGCGCCTGTCTGTGGGACGCAAATCTCAGCCGTAAGACCAGTCGTCGAAAGCGCACTAAGTCGCTGCGGGCAGCTGCTGTTGTTCAGTCTCAGGGTCAGCTTCAGGATACTTCGGGATCGACAGGGTCGTCCGCCTTGCACGCTTCGCTTGGTGTGGGATCGACCAGTTTGGGAGCATCGAGGGTCGTGGCGTCCGCTTCGAAAGATGCTTGGCGCCGTCAACAAAGCGACGACGAGGACTACGACAGCGATGAGCAAGTAATCTTTTTCGACTGCACTAATGTTACGCTGCCTTATGGAAGCAGCAGCGAGGACGAGGAAAACTTCCGTACGCCGCCGCAAAGCTTGTCGCCAGGTATTTCCATGGATTTGGAGCCGCGTTACGAGTTGTTTATTTTTGGAAACGAGCCAACCAAGCGAGATTTGGATGTGCTGAATGCCCTTTCCAATGTCGACATTGATAAGGAAACACTGCCGCATGTCTACGCCTGGAAGACTGCCATGGAGAGCTACTCCTGTGCTGAAATGAATCTGAACGTCAAGGTTCAAAAGCCGGAGCCTTGGTATTCTGGAACCAGTTCTAGCCACAACAGCCAACCATTGTTGCATCCCAAGCGTCTGCTTGCCACGCCAAAGCTGAATGCCGTGGTCAGCGGCAGACGCGGATCCGGACCATTGACGGCGCCAGTTACACCGCGTCTGGCGCGAACTCCGTCCGCCGCCAGTATTCAAGTTGCATCCGAGACGAATGGCGAGTCGGTCGGAACTGCTGTGACTCCGGCATCGCCGATTTTGAGTTTTGCCGCCTTGACGGCAGCGACGCAGTCATTCCAAACACCATTGAACAAGGTGCGCGGCTTGTTCAGCCAATATCGGGATCAACGGTCCTATAACGAGGGGGACACGCCGCTGGGCAATCGGAACTGAAACGGAATCGGCCCGGAAACAGAAACAGAAACAGCGACTGATTGATGAAAGGCCGACTGCATACTTACCCCCCTGAATAGCCGGTGTCGTCCATTGTCCCTTTTAATGTTAATCGCATGTATATTA(SEQ ID 
NO:176)MSTYFGVYIPTSKAGCFEGSVSQCIGSIAAVNIKPSNPASGSASVASGSPSGSAASVQTGNADDGSAATKYEDPDYPPDSPLWLIFTEKSKALDILRHYKEARLREFPNLEQAESYVQFGFESIEALKRFCKAKPESKPIPIISGSGYKSSPTSTDNSCSSSPTGNGSGFIIPLGSNSSMSNLLLSDSPTSSPSSSSNVIANGRQQQMQQQQQQQPQQPDVSGEGPPFRAPTKQELVEFRKQIEGGHIDRVKRIIWENPRFLISSGDTPTSLKEGCRYNAMHICAQVNKARIAQLLLKTISDREFTQLYVGKKGSGKMCAALNISLLDYYLNMPDKGRGETPLHFAAKNGHVAMVEVLVSYPECKSLRNHEGKEPKEIICLRNANATHVTIKKIELLLYDPHFVPVLRSQSNTLPPKVGQPFSPKDPPNLQHKADDYEGLSVDLAISALAGPMSREKAMNFYRRWKTPPRVSNNVMSPLAGSPFSSPVKVTPSKSIFDRSAGNSSPVHSGRRVLFSPLAEATSSPKPTKNVPNGTNECEHNNNNVKPVYPLEFPATPIRKMKPDLFMAYRNNNSFDSPSLADDSQILDMSLSRSLNASLNDSFRERHIKNTDIEKGLEVVGRQLARQEQLEWREYWDFLDSFLDIGTTEGLARLEAYFLEKTEQQADKSETVWNFAHLHQYFDSMAGEQQQQLRKDKNEAAGATSPSAGVMTPYTCVEKSLQVFAKRITKTLINKIGNMVSINDTLLCELRRLKSLIVSFKDDARFISVDFSKVHSRIAHLVASYVTHSQEVSVAMRLQLLQMLRSLRQLLADERGREQHLGCVCASLLLMLEQAPTSAVHLPDTLKTEELCCAAWETEQCCACLWDANLSRKTSRRKRTKSLRAAAVVQSQGQLQDTSGSTGSSALHASLGVGSTSLGASRVVASASKDAWRRQQSDDEDYDSDEQVIFFDCTNVTLPYGSSSEDEENFRTPPQSLSPGISMDLEPRYELFIFGNEPTKRDLDVLNALSNVDIDKETLPHVYAWKTAMESYSCAEMNLNVKVQKPEPWYSGTSSSHNSQPLLHPKRLLATPKLNAVVSGRRGSGPLTAPVTPRLARTPSAASIQVASETNGESVGTAVTPASPILSFAALTAATQSFQTPLNKVRGLFSQYRDQRSYNEGDTPLGNRN(SEQ ID 
NO:177)TTGATGTTACCCTATTTTTACCGTTGCCTTCGCTTGCCATCAGCGGAACTTTACATTTTTTCACGGAGTTGTGAAGAAGTTGCCTGTTATTTGGTGTTGATGTCAAACCATTTTAACCGCTTACCTTGCAGTGCATCCGCCATTTTACGCAGAGATGTCGACCTATTTCGGGGTCTATATCCCGACCTCCAAAGCGGGCTGTTTTGAGGGATCGGTGTCGCAGTGCATCGGCTCCATAGCCGCGGTGAACATAAAGCCATCCAATCCGGCGTCTGGATCGGCATCAGTAGCATCGGGATCGCCATCCGGCTCGGCGGCATCCGTGCAAACGGGCAACGCAGACGATGGCAGTGCTGCCACCAAGTACGAGGATCCCGACTATCCACCGGACTCGCCACTGTGGCTGATCTTCACGGAGAAATCCAAGGCGCTGGACATCCTGCGACACTACAAGGAGGCGCGCCTCCGCGAGTTTCCCAATCTGGAGCAGGCGGAGAGTTACGTTCAGTTTGGGTTCGAGAGCATCGAGGCGCTCAAGAGATTTTGCAAGGCAAAGCCCGAAAGCAAGCCCATTCCGATAATCAGCGGTAGCGGTTACAAGAGCTCACCGACCTCGACGGACAATTCGTGCTCCTCCTCGCCGACGGGTAACGGCAGTGGCTTCATCATTCCCCTGGGAAGCAATTCCTCAATGTCGAATTTACTGCTCAGTGACTCACCGACTTCCTCGCCGAGCAGCTCCAGCAACGTCATTGCCAATGGGCGACAGCAGCAGATGCAGCAGCAACAGCAGCAGCAGCCGCAGCAGCCGGATGTGTCCGGAGAAGGCCCTCCTTTCCGGGCGCCCACCAAACAGGAACTGGTAGAGTTTCGCAAGCAAATCGAAGGTGGTCACATAGACCGGGTGAAGAGGATTATATGGGAGAATCCACGATTTTTGATCAGCAGCGGTGATACGCCCACCAGTTTGAAGGAGGGCTGTCGCTATAATGCCATGCACATCTGCGCCCAGGTCAATAAGGCCAGGATCGCTCAGTTGCTGTTAAAGACCATTTCGGATCGGGAGTTCACTCAGCTTTACGTTGGCAAGAAGGGCAGTGGCAAGATGTGTGCTGCCCTCAACATCAGTCTCCTGGACTATTACCTGAACATGCCGGACAAGGGGCGCGGCGAAACACCGCTCCACTTTGCCGCAAAGAACGGTCATGTGGCCATGGTCGAGGTTCTCGTTTCCTATCCGGAGTGCAAATCGCTGCGGAATCATGAGGGCAAGGAGCCCAAGGAAATCATCTGCCTGCGTAATGCTAATGCTACACATGTGACCATCAAGAAGCTGGAGCTGCTCTTGTACGATCCGCATTTTGTGCCCGTACTAAGATCCCAGTCAAATACACTGCCGCCAAAAGTGGGTCAACCGTTCTCGCCCAAAGATCCACCGAACCTGCAACACAAAGCGGACGATTACGAGGGCCTCAGCGTGGACCTGGCAATCAGTGCGCTGGCGGGACCCATGTCCCGCGAAAAGGCCATGAACTTCTATCGCCGYFGGAAGACACCACCGCGGGTCAGCAACAATGTGATGTCGCCGCTGGCTGGTTCACCATTTAGCTCGCCGGTGAAAGTAACCCCAAGCAAGTCGATCTTTGACCGAAGTGCTGGAAACTCGAGTCCAGTCCACTCAGGACGCAGAGTGCTCTTTAGTCCATTGGCGGAGGCGACCAGCTCACCAAAACCGACGAAAAACGTGCCCAATGGCACCAATGAGTGCGAGCACAACAATAATAATGTGAAGCCAGTGTATCCGTTGGAGTTCCCGGCGACACCCATTCGAAAAATGAAACCGGATTTATTCATGGCCTATCGCAATAACAATAGCTTTGATTCGCCATCTTTGGCCGATGACTCCCAAATCCTGGACATGAGCCTAAGCCGCAGCCTGAATGCGTCGCTAAATGACAGCTTCCGTGAGCGGCACATCAAGAACACTGATATC
GAGAAGGGTCTGGAGGTGGTCGGCCGCCAACTGGCACGACAGGAGCAGTTAGAGTGGCGCGAGTACTGGGATTTTCTCGATTCATTTTTGGACATTGGTACGACCGAAGGCCTGGCCCGTCTTGAAGCGTATTTCCTGGAAAAGACCGAACAGCAGGCGGATAAATCAGAAACGGTCTGGAACTTTGCCCATCTGCATCAGTATTTCGATTCGATGGCCGGCGAGCAACAGCAGCAACTCCGAAAGGATAAAAATGAGGCTGCGGGAGCAACTTCGCCATCCGCCGGAGTCATGACTCCGTACACATGCGTAGAGAAGTCGCTGCAAGTGTTCGCCAAGCGCATCACTAAAACGTTGATCAACAAAATCGGCAACATGGTGTCCATCAACGACACGCTGCTCTGTGAGCTCAAAAGACTGAAATCGCTGATTGTCAGCTTCAAGGATGATGCCCGCTTCATTAGCGTGGACTTTAGCAAGGTGCATTCACGTATCGCCCACCTGGTGGCCAGCTATGTGACCCACTCGCAGGAGGTCAGCGTAGCCATGCGTCTACAATTGTTGCAGATGCTCCGAAGTTTGCGGCAACTGCTGGCCGACGAGCGTGGTCGAGAACAGCATTTGGGCTGCGTGTGCGCTAGTCTATTGCTGATGCTGGAACAGGCGCCGACATCCGCCGTGCATCTACCAGACACTCTGAAGACCGAGGAGCTATGTTGCGCCGCCTGGGAGACGGAGCAGTGTTGCGCCTGTCTGTGGGACGCAAATCTCAGCCGTAAGACCAGTCGTCGAAAGCGCACTAAGTCGCTGCGGGCAGCTGCTGTTGTTCAGTCTCAGGGTCAGCTTCAGGATACTTCGGGATCGACAGGGTCGTCCGCCTTGCACGCTTCGCTTGGTGTGGGATCGACCAGTTTGGGAGCATCGAGGGTCGTGGCGTCCGCTTCGAAAGATGCTTGGCGCCGTCAACAAAGCGACGACGAGGACTACGACAGCGATGAGCAAGTAATCTTTTTCGACTGCACTAATGTTACGCTGCCTTATGGAAGCAGCAGCGAGGACGAGGAAAACTTCCGTACGCCGCCGCAAAGCTTGTCGCCAGGTATTTCCATGGATTTGGAGCCGCGTTACGAGTTGTTTATTTTTGGAAACGAGCCAACCAAGCGAGATTTGGATGTGCTGAATGCCCTTTCCAATGTCGACATTGATAAGGAAACACTGCCGCATGTCTACGCCTGGAAGACTGCCATGGAGAGCTACTCCTGTGCTGAAATGAATCTGAACGTCAAGGTTCAAAAGCCGGAGCCTTGGTATTCTGGAACCAGTTCTAGCCACAACAGCCAACCATTGTTGCATCCCAAGCGTCTGCTTGCCACGCCAAAGCTGAATGCCGTGGTCAGCGGCAGACGCGGATCCGGACCATTGACGGCGCCAGTTACACCGCGTCTGGCGCGAACTCCGTCCGCCGCCAGTATTCAAGTTGCATCCGAGACGAATGGCGAGTCGGTCGGAACTGCTGTGACTCCGGCATCGCCGATTTTGAGTTTTGCCGCCTTGACGGCAGCGACGCAGTCATTCCAAACACCATTGAACAAGGTGCGCGGCTTGTTCAGCCAATATCGGGATCAACGGTCCTATAACGAGGGGGACACGCCGCTGGGCAATCGGAACTGAAACGGAATCGGCCCGGAAACAGAAACAGAAACAGCGACTGATTGATGAAAGGCCGACTGCATACTTACCCCCCTGAATAGCCGGTGTCGTCCATTGTCCCTTTTAATGTTAATCGCATGTATATTA(SEQ ID 
NO:178)MSTYFGVYIPTSKAGCFEGSVSQCIGSIAAVNIKPSNPASGSASVASGSPSGSAASVQTGNADDGSAATKYEDPDYPPDSPLWLIFTEKSKALDILRHYKEARLREFPNLEQAESYVQFGFESIEALKRFCKARPESKPIPIISGSGYKSSPTSTDNSCSSSPTGNGSGFIIPLGSNSSMSNLLLSDSPTSSPSSSSNVIANGRQQQMQQQQQQQPQQPDVSGEGPPFRAPTKQELVEFRKQIEGGHIDRVKRIIWENPRFLISSGDTPTSLKEGCRYNAMHICAQVNKARIAQLLLKTISDREFTQLYVGKKGSGKMCAALNISLLDYYLNMPDKGRGETPLHFAAKNGHVAMVEVLVSYPECKSLRNHEGKEPKEIICLRNANATHVTIKKLELLLYDPHFVPVLRSQSNTLPPKVGQPFSPKDPPNLQHKADDYEGLSVDLAISALAGPMSREKAMNFYRRWKTPPRVSNNVMSPLAGSPFSSPVKVTPSKSIFDRSAGNSSPVHSGRRVLFSPLAEATSSPKPTKNVPNGTNECEHNNNNVKPVYPLEFPATPIRKMKPDLFMAYRNNNSFDSPSLADDSQILDMSLSRSLNASLNDSFRERHIKNTDIEKGLEVVGRQLARQEQLEWREYWDFLDSFLDIGTTEGLARLEAYFLEKTEQQADKSETVWNFAHLHQYFDSMAGEQQQQLRKDKNEAAGATSPSAGVMTPYTCVEKSLQVFAKRITKTLINKIGNMVSINDTLLCELKRLKSLIVSFKDDARFISVDFSKVHSRIAHLVASYVTHSQEVSVAMRLQLLQMLRSLRQLLADERGREQHLGCVCASLLLMLEQAPTSAVHLPDTLKTEELCCAAWETEQCCACLWDANLSRKTSRRKRTKSLRAAAVVQSQGQLQDTSGSTGSSALHASLGVGSTSLGASRVVASASKDAWRRQQSDDEDYDSDEQVIFFDCTNVTLPYGSSSEDEENFRTPPQSLSPGISMDLEPRYELFIFGNEPTKRDLDVLNALSNVDIDKETLPHVYAWKTAMESYSCAEMNLNVKVQKPEPWYSGTSSSHNSQPLLHPKRLLATPKLNAVVSGRRGSGPLTAPVTPRLARTPSAASIQVASETNGESVGTAVTPASPILSFAALTAATQSFQTPLNKVRGLFSQYRDQRSYNEGDTPLGNRN(SEQ ID 
NO:179)AAAACAGCCAGCTCATTTATTAATGGTTTATCCCTCTCGATGCCCACACATCAACATTGCCATCGCCACGACGGAGCAGCGGACTCGCCACTGTGGCTGATCTTCACGGAGAAATCCAAGGCGCTGGACATCCTGCGACACTACAAGGAGGCGCGCCTCCGCGAGTTTCCCAATCTGGAGCAGGCGGAGAGTTACGTTCAGTTTGGGTTCGAGAGCATCGAGGCGCTCAAGAGATTTTGCAAGGCAAAGCCCGAAAGCAAGCCCATTCCGATAATCAGCGGTAGCGGTTACAAGAGCTCACCGACCTCGACGGACAATTCGTGCTCCTCCTCGCCGACGGGTAACGGCAGTGGCTTCATCATTCCCCTGGGAAGCAATTCCTCAATGTCGAATTTACTGCTCAGTGACTCACCGACTTCCTCGCCGAGCAGCTCCAGCAACGTCATTGCCAATGGGCGACAGCAGCAGATGCAGCAGCAACAGCAGCAGCAGCCGCAGCAGCCGGATGTGTCCGGAGAAGGCCCTCCTTTCCGGGCGCCCACCAAACAGGAACTGGTAGAGTTTCGCAAGCAAATCGAAGGTGGTCACATAGACCGGGTGAAGAGGATTATATGGGAGAATCCACGATTTTTGATCAGCAGCGGTGATACGCCCACCAGTTTGAAGGAGGGCTGTCGCTATAATGCCATGCACATCTGCGCCCAGGTCAATAAGGCCAGGATCGCTCAGTTGCTGTTAAAGACCATTTCGGATCGGGAGTTCACTCAGCTTTACGTTGGCAAGAAGGGCAGTGGCAAGATGTGTGCTGCCCTCAACATCAGTCTCCTGGACTATTACCTGAACATGCCGGACAAGGGGCGCGGCGAAACACCGCTCCACTTTGCCGCAAAGAACGGTCATGTGGCCATGGTCGAGGTTCTCGTTTCCTATCCGGAGTGCAAATCGCTGCGGAATCATGAGGGCAAGGAGCCCAAGGAAATCATCTGCCTGCGTAATGCTAATGCTACACATGTGACCATCAAGAAGCTGGAGCTGCTCTTGTACGATCCGCATTTTGTGCCCGTACTAAGATCCCAGTCAAATACACTGCCGCCAAAAGTGGGTCAACCGTTCTCGCCCAAAGATCCACCGAACCTGCAACACAAAGCGGACGATTACGAGGGCCTCAGCGTGGACCTGGCAATCAGTGCGCTGGCGGGACCCATGTCCCGCGAAAAGGCCATGAACTTCTATCGCCGTTGGAAGACACCACCGCGGGTCAGCAACAATGTGATGTCGCCGCTGGCTGGTTCACCATTTAGCTCGCCGGTGAAAGTAACCCCAAGCAAGTCGATCTTTGACCGAAGTGCTGGAAACTCGAGTCCAGTCCACTCAGGACGCAGAGTGCTCTTTAGTCCATTGGCGGAGGCGACCAGCTCACCAAAACCGACGAAAAACGTGCCCAATGGCACCAATGAGTGCGAGCACAACAATAATAATGTGAAGCCAGTGTATCCGTTGGAGTTCCCGGCGACACCCATTCGAAAAATGAAACCGGATTTATTCATGGCCTATCGCAATAACAATAGCTTTGATTCGCCATCTTTGGCCGATGACTCCCAAATCCTGGACATGAGCCTAAGCCGCAGCCTGAATGCGTCGCTAAATGACAGCTTCCGTGAGCGGCACATCAAGAACACTGATATCGAGAAGGGTCTGGAGGTGGTCGGCCGCCAACTGGCACGACAGGAGCAGTTAGAGTGGCGCGAGTACTGGGATTTTCTCGATTCATTTTTGGACATTGGTACGACCGAAGGCCTGGCCCGTCTTGAAGCGTATTTCCTGGAAAAGACCGAACAGCAGGCGGATAAATCAGAAACGGTCTGGAACTTTGCCCATCTGCATCAGTATTTCGATTCGATGGCCGGCGAGCAACAGCAGCAACTCCGAAAGGATAAAAATGAGGCTGCGGGAGCAACTTCGCCATCCGCCGGAGTCATGACTCCGTACACAT
GCGTAGAGAAGTCGCTGCAAGTGTTCGCCAAGCGCATCACTAAAACGTTGATCAACAAAATCGGCAACATGGTGTCCATCAACGACACGCTGCTCTGTGAGCTCAAAAGACTGAAATCGCTGATTGTCAGCTTCAAGGATGATGCCCGCTTCATTAGCGTGGACTTTAGCAAGGTGCATTCACGTATCGCCCACCTGGTGGCCAGCTATGTGACCCACTCGCAGGAGGTCAGCGTAGCCATGCGTCTACAATTGTTGCAGATGCTCCGAAGTTTGCGGCAACTGCTGGCCGACGAGCGTGGTCGAGAACAGCATTTGGGCTGCGTGTGCGCTAGTCTATTGCTGATGCTGGAACAGGCGCCGACATCCGCCGTGCATCTACCAGACACTCTGAAGACCGAGGAGCTATGTTGCGCCGCCTGGGAGACGGAGCAGTGTTGCGCCTGTCTGTGGGACGCAAATCTCAGCCGTAAGACCAGTCGTCGAAAGCGCACTAAGTCGCTGCGGGCAGCTGCTGTTGTTCAGTCTCAGGGTCAGCTTCAGGATACTTCGGGATCGACAGGGTCGTCCGCCTTGCACGCTTCGCTTGGTGTGGGATCGACCAGTTTGGGAGCATCGAGGGTCGTGGCGTCCGCTTCGAAAGATGCTTGGCGCCGTCAACAAAGCGACGACGAGGACTACGACAGCGATGAGCAAGTAATCTTTTTCGACTGCACTAATGTTACGCTGCCTTATGGAAGCAGCAGCGAGGACGAGGAAAACTTCCGTACGCCGCCGCAAAGCTTGTCGCCAGGTATTTCCATGGATTTGGAGCCGCGTTACGAGTTGTTTATTTTTGGAAACGAGCCAACCAAGCGAGATTTGGATGTGCTGAATGCCCTTTCCAATGTCGACATTGATAAGGAAACACTGCCGCATGTCTACGCCTGGAAGACTGCCATGGAGAGCTACTCCTGTGCTGAAATGAATCTGAACGTCAAGGTTCAAAAGCCGGAGCCTTGGTATTCTGGAACCAGTTCTAGCCACAACAGCCAACCATTGTTGCATCCCAAGCGTCTGCTTGCCACGCCAAAGCTGAATGCCGTGGTCAGCGGCAGACGCGGATCCGGACCATTGACGGCGCCAGTTACACCGCGTCTGGCGCGAACTCCGTCCGCCGCCAGTATTCAAGTTGCATCCGAGACGAATGGCGAGTCGGTCGGAACTGCTGTGACTCCGGCATCGCCGATTTTGAGTTTTGCCGCCTTGACGGCAGCGACGCAGTCATTCCAAACACCATTGAACAAGGTGCGCGGCTTGTTCAGCCAATATCGGGATCAACGGTCCTATAACGAGGGGGACACGCCGCTGGGCAATCGGAACTGAAACGGAATCGGCCCGGAAACAGAAACAGAAACAGCGACTGATTGATGAAAGGCCGACTGCATACTTACCCCCCTGAATAGCCGGTGTCGTCCATTGTCCCTTTTAATGTTAATCGCATGTATATTA(SEQ ID 
NO:180)MPTHQHCHRHDGAADSPLWLIFTEKSKALDILRHYKEARLREFPNLEQAESYVQFGFESIEALKRFCKAKPESKPIPIISGSGYKSSPTSTDNSCSSSPTGNGSGFIIPLGSNSSMSNLLLSDSPTSSPSSSSNVIANGRQQQMQQQQQQQPQQPDVSGEGPPFRAPTKQELVEFRKQIEGGHIDRVKRIIWENPRFLISSGDTPTSLKEGCRYNAMHICAQVNKARIAQLLLKTISDREFTQLYVGKKGSGKMCAALNISLLDYYLNMPDKGRGETPLHFAAKNGHVAMVEVLVSYPECKSLRNHEGKEPKEIICLRNANATHVTIKKLELLLYDPHFVPVLRSQSNTLPPKVGQPFSPKDPPNLQHKADDYEGLSVDLAISALAGPMSREKAMNFYRRWKTPPRVSNNVMSPLAGSPFSSPVKVTPSKSIFDRSAGNSSPVHSGRRVLFSPLAEATSSPKPTKNVPNGTNECEHNNNNVKPVYPLEFPATPIRKMKPDLFMAYRNNNSFDSPSLADDSQILDMSLSRSLNASLNDSFRERHIKNTDIEKGLEVVGRQLARQEQLEWREYWDFLDSFLDIGTTEGLARLEAYFLEKTEQQADKSETVWNFAHLHQYFDSMAGEQQQQLRKDKNEAAGATSPSAGVMTPYTCVEKSLQVFAKRITKTLINKIGNMVSINDTLLCELKRLKSLIVSFKDDARFISVDFSKVHSRIAHLVASYVTHSQEVSVAMRLQLLQMLRSLRQLLADERGREQHLGCVCASLLLMLEQAPTSAVHLPDTLKTEELCCAAWETEQCCACLWDANLSRKTSRRKRTKSLRAAAVVQSQGQLQDTSGSTGSSALHASLGVGSTSLGASRVVASASKDAWRRQQSDDEDYDSDEQVIFFDCTNVTLPYGSSSEDEENFRTPPQSLSPGISMDLEPRYELFIFGNEPTKRDLDVLNALSNVDIDKETLPHVYAWKTAMESYSCAEMNLNVKVQKPEPWYSGTSSSHNSQPLLHPKRLLATPKLNAVVSGRRGSGPLTAPVTPRLARTPSAASIQVASETNGESVGTAVTPASPILSFAALTAATQSFQTPLNKVRGLFSQYRDQRSYNEGDTPLGNRN


Human homologue of Complete Genome candidate


BAA31667 KIAA0692 protein

(SEQ ID NO:181)1gagattttgg ttacagtgtg ggcctgaatc ctccagaggaggaagctgtg acatccaaga61cctgctcggt gccccctagt gacaccgaca cctacagagctggagcgact gcgtctaagg121agccgcccct gtactatggg gtgtgtccag tgtatgaggacgtcccagcg agaaatgaaa181ggatctatgt ttatgaaaat aaaaaggaag cattgcaagctgtcaagatg atcaaagggt241cccgatttaa agctttttct accagagaag acgctgagaaatttgctaga ggaatttgtg301attatttccc ttctccaagc aaaacgtcct taccactgtctcctgtgaaa acagctccac361tctttagcaa tgacaggttg aaagatggtt tgtgcttgtcggaatcagaa acagtcaaca421aagagcgagc gaacagttac aaaaatcccc gcacgcaggacctcaccgcc aagcttcgga481aagctgtgga gaagggagag gaggacacct tttctgaccttatctggagc aacccccggt541atctgatagg ctcaggagac aaccccacta tcgtgcaggaagggtgcagg tacaacgtga601tgcatgttgc tgccaaagag aaccaggctt ccatctgccagctgactctg gacgtcctgg661agaaccctga cttcatgagg ctgatgtacc ctgatgacgacgaggccatg ctgcagaagc721gtatccgtta cgtggtggac ctgtacctca acacccccgacaagatgggc tatgacacac781cgttgcattt tgcttgtaag tttggaaatg cagatgtagtcaacgtgctt tcgtcacacc841atttgattgt aaaaaactca aggaataaat atgataaaacacctgaagat gtaatttgtg901aaagaagcaa aaataaatct gtggaactga aggagcggatcagagagtat ttaaagggcc961actactacgt gcccctcctg agagcggaag agacttcttctccagtcatc ggggagctgt1021ggtccccaga ccagacggct gaggcctctc acgtcagccgctatggaggc agccccagag1081acccggtact gaccctgaga gccttcgcag ggcccctgagtccagccaag gcagaagatt1141ttcgcaagct ctggaaaact ccacctcgag agaaagcaggcttccttcac cacgtcaaga1201agtcggaccc ggaaagaggc tttgagagag tgggaagggagctagctcat gagctggggt1261atccctgggt tgaatactgg gaatttctgg gctgttttgttgatctgtct tcccaggaag1321gcctgcaaag actagaagaa tatctcacac agcaggaaataggcaaaaag gctcaacaag1381aaacaggaga acgggaagcc tcctgccgag ataaagccaccacgtctggc agcaattcca1441tttccgtgag ggcgtttcta gatgaagatg acatgagcttggaagaaata aaaaatcggc1501aaaatgcagc tcgaaataac agcccgccca cagtcggtgcttttggacat acgaggtgca1561gcgccttccc cttggagcag gaggcagacc tcatagaagccgccgagccg ggaggtccac1621acagcagcag aaatgggctc tgccatcctc tgaatcacagcaggaccctg gcgggcaaga1681gaccaaaggc cccccatggg gaggaagccc atctgccacctgtctcggat ttgactgttg1741agtttgataa actgaatttg 
caaaatatag gacgtagcgtttccaagaca ccagatgaaa1801gtacaaaaac taaagatcag atcctgactt caagaatcaatgcagtagaa agagacttgt1861tagagccttc tcccgcagac caactcggga atggccacaggaggacagaa agtgaaatgt1921cagccaggat cgctaaaatg tccttgagtc ccagcagccccaggcacgag gatcagctcg1981aggtcaccag ggaaccggcc aggcggctct tcctttttggagaggagcca tcaaaactcg2041atcaggatgt tttggccgct cttgaatgtg cagacgtcgacccccatcag ttcccggccg2101tgcacagatg gaagagtgct gtcctgtgct actcaccctcggacagacag agttggccca2161gtcccgcggt gaaaggaagg ttcaagtctc agctgccagatctcagtggc cctcacagct2221acagtccggg gagaaacagc gtggctggaa gcaaccccgcaaagccaggc ctgggcagtc2281ctgggcgcta cagccccgtg cacgggagcc agctccgcaggatggcgcgc ctggctgagc2341ttgccgccct gtaggcttgg cgctgggctc tcggtttgttcttcattttt aaagaaggaa2401gggtcatatg tttattgcta aactgtcaaa aaggaatatattctgattaa attattactc2461ctcactttga gggtgtgaga attttagaag atttaaatgttctatataac acttagattt2521ctgatatttt ggaagaagtt agaagttaat gaaagcaaactcagttacca attttctgga2581aaatatccat gtggtaatgt agacttttta ggtggcaatttctaggtctg aaatatagca2641gaggaaaggg cgctgaggca gttgcaggca ggcagccctgtacttaccct gtactcacct2701catccgacag acgctgtgga tgaggagggg cttggcggaggcgtgagcac cgatgtccct2761ttgataacct gcactcacca agatgaacta tttgccgccctgtcttttcc tgggttgggg2821ggtggcatct gatggtggca gagtgcctgt tggttcgcccgtgggtctca tggttcagac2881agagggaggt ggacggcagg gatcagggag ccaggagcgcgcctcagact tgcagcaacc2941attgtgattt gggttgttcg gaatatttaa attactgatcagaagatgaa agtagctttt3001ctcttgggaa gtcttgcagc ccgtgggagt gataccaggagcaacacaga gctcagcagc3061ggcgccaagg tgttccctgt ttcctcagca cgtgagccttcaccgcctgc ttcattcagg3121agccagtgca gcagtaatac agtctataca ttgttctgttttcaaattta tcctgaggct3181ttgttgagca taaatgatta tacgataaag gtatccgttattttggaact catttcagtt3241gggatctcct gtatgcagag tgttgcattt agaggtttgagtcccatctt ggtttcttgc3301cgtgctgact gtagccttca ccttgacttg aatgaaggtctgtggttgga atgtgtgagg3361agccgctgag gtgttcagga ggtgctgcct ggaggtcggtttcttcctgg gtgttacggg3421caactgctca cacagttgtt tctctgtgaa catttccagtgtttaatcca aaatgaaaac3481ccaccaatgc ttttgctaac ttcagtgcct 
tttataaatcatttttaaat ttcctgaact3541tgctttttga ggatatacag ggatattaag tagacgcaggattgtttttg tttgtaaaaa3601ttctgaattg aaactttgtt ttaaaaaaag gcttctttctttcatatgac aagagatagg3661tcaggaatat tggaatcaag atttaaatgt taaaattcgattttgttaca cagggtgtgt3721tcatttgttt tgtagcagac aagatctaga tcccagacagaaacaacaca tgctattcta3781aaaagccgca ttttaaaagg caccttggtt ctcaaaagaaatcagaatat ggatattcgt3841agtgatgatc tgttttctct aaaatcttac catattgtctgtatatggtt gtaaattcaa3901atggaaagta aaacgttttg gccctgattt tgtatgtggaccactgctcc tgatttccca3961ggtcttaggc cacctttgac tgtttctccg tttgtttgtgggcagcgatt ccagtcccaa4021cggaggcatt ctcgtgtgtc ccggggggtt atgtccttcacaaaacactt aatgaaatga4081attacttc(SEQ ID NO:182)1dfgysvglnp peeeavtskt csvppsdtdt yragataskepplyygvcpv yedvparner61iyvyenkkea lqavkmikgs rfkafstred aekfargicdyfpspsktsl plspvktapl121fsndrlkdgl clsesetvnk eransyknpr tqdltaklrkavekgeedtf sdliwsnpry181ligsgdnpti vqegcrynvm hvaakenqas icqitidvlenpdfmrlmyp dddeamlqkr241iryvvdlyln tpdkmgydtp lhfackfgna dvvnvlsshhlivknsrnky dktpedvice301rsknksvelk erireylkgh yyvpllraee tsspvigelwspdqtaeash vsryggsprd361pvltlrafag plspakaedf rklwktppre kagflhhvkksdpergferv grelahelgy421pwveyweflg cfvdlssqeg lqrleeyltq qeigkkaqqetgereascrd kattsgsnsi481svrafldedd msleeiknrq naarnnsppt vgafghtrcsafpleqeadl ieaaepggph541ssrnglchpl nhsrtlagkr pkaphgeeah lppvsdltvefdklnlqnig rsvsktpdes601tktkdqilts rinaverdll epspadqlgn ghrrtesemsariakmslsp ssprhedqle661vtreparrlf lfgeepskld qdvlaaleca dvdphqfpavhrwksavlcy spsdrqswps721pavkgrfksq lpdlsgphsy spgrnsvags npakpglgspgryspvhgsq lrrmarlael781aal
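
The human homologue sequences in this section are printed with a running position index fused to the start of each 60-residue row (1, 61, 121, and so on). As a minimal sketch, assuming the input is the text of a single numbered sequence with the "(SEQ ID NO:…)" label already removed, the following Python strips the indices and checks that each printed index agrees with the residue count so far. The function names `split_numbered`, `clean_sequence` and `index_mismatches` are illustrative and are not part of the specification.

```python
import re


def split_numbered(raw: str) -> list[tuple[int, str]]:
    """Split a numbered sequence run into (printed index, residues) blocks."""
    blocks = []
    for index, residues in re.findall(r"(\d+)([A-Za-z\s]+)", raw):
        # Remove the spaces that separate the 10-residue groups.
        blocks.append((int(index), re.sub(r"\s+", "", residues)))
    return blocks


def clean_sequence(raw: str) -> str:
    """Return the bare residue string with indices and whitespace removed."""
    return "".join(residues for _, residues in split_numbered(raw))


def index_mismatches(raw: str) -> list[tuple[int, int]]:
    """List (printed index, expected index) pairs that disagree."""
    mismatches = []
    expected = 1
    for index, residues in split_numbered(raw):
        if index != expected:
            mismatches.append((index, expected))
        # Resynchronise on the printed index so one bad row is reported once.
        expected = index + len(residues)
    return mismatches


if __name__ == "__main__":
    # First row of SEQ ID NO:181 plus the start of the second row.
    sample = (
        "1gagattttgg ttacagtgtg ggcctgaatc ctccagaggaggaagctgtg "
        "acatccaaga61cctgctcggt"
    )
    print(len(clean_sequence(sample)))
    print(index_mismatches(sample))
```

A non-empty result from `index_mismatches` points to a row where residues were lost or duplicated during text extraction, which is worth checking against the deposited accession before the sequence is used.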


Putative function


Unknown


Example 15 (Category 3)

Line ID—379


Phenotype—Lethal phase pharate adult. Dot and rod-like overcondensed chromosomes, high mitotic index, overcondensed anaphases, some with lagging chromosomes, a few tetraploid cells with overcondensed chromosomes, XYY males.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003443 (7D14-E2)


P element insertion site—130,532


Annotated Drosophila genome Complete Genome candidate—2 candidates:


CG10964—novel, similarity to dehydrogenases

(SEQ ID NO:183)AACGAAACAGCCGGCCGTCAAAATTTTTCCTAACATTTCACTATTTTCACGCTTGTGTTACGGCAATAAAGTCGATTGATAAGCACGGAAAGATCTGGCTGCGGGTCTGGTGAAATCCACAGAACACACGGAACCCGTATAGTAGTGCCGCCCTTTATTGGTTTTATCTCAAGTACGACGCGATAAGATTTCGAGCAACTCGATCGCGGATCTTCGGAAAAAAAAAACATGAACTCCATCCTGATAACCGGCTGCAATCGAGGATTGGGTCTGGGCCTGGTCAAGGCGCTGCTCAATCTTCCCCAGCCGCCGCAGCATCTATTTACCACCTGCCGGAATCGCGAGCAGGCAAAGGAGCTGGAGGATCTAGCCAAGAACCACTCGAACATACACATACTTGAGATTGATTTGAGAAATTTCGATGCCTATGACAAGCTAGTCGCCGACATCGAGGGCGTGACCAAGGACCAAGGCCTCAATGTGCTCTTCAACAATGCCGGCATAGCGCCCAAATCGGCCAGGATAACGGCCGTTCGATCGCAGGAGCTGCTCGACACCTTGCAGACCAACACGGTTGTGCCCATCATGCTGGCCAAGGCGTGTCTGCCGCTCCTTAAGAAGGCAGCCAAAGCGAACGAATCCCAGCCGATGGGCGTGGGCCGTGCCGCCATTATTAACATGTCCTCGATCCTTGGCTCCATCCAGGGCAACACGGACGGCGGAATGTACGCCTATCGCACCTCTAAGTCGGCCTTGAATGCGGCCACCAAGTCGTTGAGCGTGGATCTGTATCCGCAACGCATCATGTGCGTCAGTCTGCATCCTGGCTGGGTGAAAACCGACATGGGTGGCTCCAGTGCCCCCTTGGACGTGCCCACCAGCACGGGACAAATTGTGCAGACCATCAGCAAGCTGGGCGAGAAACAGAACGGCGGTTTTGTCAACTACGACGGCACTCCGCTGGCCTGGTAA(SEQ ID NO:184)MNSILITGCNRGLGLGLVKALLNLPQPPQHLFTTCRNREQAKELEDLAKNHSNIHILEIDLRNFDAYDKLVADIEGVTKDQGLNVLFNNAGIAPKSARITAVRSQELLDTLQTNTVVPIMLAKACLPLLKKAAKANESQPMGVGRAAIINMSSILGSIQGNTDGGMYAYRTSKSALNAATKSLSVDLYPQRIMCVSLHPGWVKTDMGGSSAPLDVPTSTGQIVQTISKLGEKQNGGFVNYDGTPLAW

CG2151-Trxr-1 thioredoxin reductase-1 (2 splice variants)

(SEQ ID 
NO:185)CGACAAGCCAATCGACGTCTCCCTTTCGCACGCTCGTACGAAAGTACAAAAGCTATTGCAAAAGTTGGCTCCGCTTATTCGTTTCGTGCTTTCGCGAGTGCCGAGAGCCGCTACAATACACGCTTAGCAGTTTTTACATTTCCGCTTCGACTACAACAACATTCACTACCCGCCGTTGATCCTTGTTTTCTGTCTGATTTACGTGGAGCACCTACCAACAAGCAACAAAATAATGGCGCCCGTGCAAGGATCCTACGACTACGACCTTATTGTGATTGGAGGCGGCTCAGCTGGCCTGGCCTGCGCCAAGGAGGCAGTCCTCAATGGAGCCCGTGTGGCCTGTCTGGATTTCGTTAAGCCCACGCCCACTCTGGGCACCAAGTGGGGCGTTGGCGGCACCTGCGTGAACGTGGGCTGCATTCCCAAGAAGCTGATGCACCAGGCCTCCCTTCTGGGCGAGGCTGTCCATGAGGCGGCCGCCTACGGCTGGAACGTGGACGAAAAGATCAAGCCAGACTGGCACAAGCTGGTGCAGTCCGTACAGAACCACATCAAGTCCGTCAACTGGGTGACCCGTGTGGATCTGCGCGACAAGAAAGTGGAGTACATCAATGGACTGGGCTCCTTCGTGGACTCGCACACACTGCTGGCCAAGCTGAAGAGCGGCGAGCGCACAATCACCGCCCAGACCTTCGTCATTGCCGTTGGCGGCCGACCACGTTATCCGGATATTCCCGGTGCTGTCGAGTATGGCATCACCAGCGATGATCTGTTCAGTTTGGACCGCGAGCCCGGCAAGACCCTGGTGGTGGGAGCTGGCTACATTGGCTTGGAGTGCGCTGGATTCCTGAAGGGTCTCGGCTACGAGCCCACTGTGATGGTGCGTTCTATTGTGCTGCGTGGCTTCGACCAGCAGATGGCCGAGCTGGTGGCAGCCTCGATGGAGGAGCGTGGCATTCCCTTCCTCCGCAAGACGGTGCCGCTGTCCGTGGAAAAGCAGGATGATGGCAAGCTGCTCGTGAAGTACAAGAACGTGGAGACCGGCGAGGAGGCCGAGGATGTTTACGACACCGTTCTGTGGGCCATCGGCCGCAAGGGTCTGGTGGACGATCTGAACCTGCCCAATGCCGGCGTGACTGTGCAGAAGGACAAGATTCCAGTGGACTCCCAGGAGGCTACCAATGTGGCAAACATCTACGCTGTCGGCGATATCATCTATGGCAAGCCAGAGCTGACGCCCGTCGCCGTTTTGGCTGGCCGTTTGCTGGCCCGCCGCCTGTACGGAGGATCTACCCAGCGCATGGACTACAAGGATGTGGCCACCACCGTTTTCACGCCCCTGGAGTACGCCTGCGTCGGCCTGAGCGAGGAGGATGCCGTCAAGCAGTTCGGAGCCGATGAGATCGAGGTGTTCCACGGCTACTACAAGCCCACGGAGTTCTTCATTCCCCAGAAGAGCGTGCGCTACTGCTACTTGAAGGCTGTGGCCGAGCGCCATGGTGACCAGCGCGTCTATGGACTGCACTATATTGGCCCGGTGGCCGGTGAGGTTATCCAGGGATTCGCTGCCGCTTTGAAGTCTGGCCTGACTATTAACACGCTGATCAACACCGTGGGCATCCATCCCACTACCGCCGAAGAATTCACCCGGCTGGCCATCACCAAGCGCTCCGGACTGGACCCCACGCCGGCCAGCTGCTGCAGCTAAAGCGGGAACGCAGCTCAGCCGCCTGGGACGTGTCGAAGCCGCTTGCTCCACCCGAAATCCCGTAGATGAATGGTTGTTGTCGCGGCCCAGCGATCGATGAGTTCAATAGTTCCGTTTCGTTTCCACAATTAACACCCAACACAATAGCTCTGCGCAAGGGAGGGGCACTGGGCAGCGATGGCGGGTGGAACGACACCAGTGGAACTACCCGCGCGACCAGCCCAACCCACGACTGCTGCGCCGCCGACATGCACTCAAAATTTTGAATTTGTTTGAACCTATGAA
ATTAACTATGAAATCCCCTAAATGTACGGTTGAAGAATATAATTTTTCACC(SEQ ID NO:186)MAPVQGSYDYDLIVIGGGSAGLACAKEAVLNGARVACLDFVKPTPTLGTKWGVGGTCVNVGCIPKKLMHQASLLGEAVHEAAAYGWNVDEKIKPDWHKLVQSVQNHIKSVNWVTRVDLRDKKVEYINGLGSFVDSHTLLAKLKSGERTITAQTFVIAVGGRPRYPDIPGAVEYGITSDDLFSLDREPGKTLVVGAGYIGLECAGFLKGLGYEPTVMVRSIVLRGFDQQMAELVAASMEERGIPFLRKTVPLSVEKQDDGKLLVKYKNVETGEEAEDVYDTVLWAIGRKGLVDDLNLPNAGVTVQKDKIPVDSQEATNVANIYAVGDIIYGKPELTPVAVLAGRLLARRLYGGSTQRMDYKDVATTVFTPLEYACVGLSEEDAVKQFGADEIEVFHGYYKPTEFFIPQKSVRYCYLKAVAERHGDQRVYGLHYIGPVAGEVIQGFAAALKSGLTINTLINTVGIHPTTAEEFTRLAITKRSGLDPTPASCCS(SEQ ID NO:187)CCCGGCCGAACCAGCGAACGTGTTTGTGTTGTGTGTTCCGCCGTCATTTTTCTGCACCCTTTTCGCGAATAGTTTCGTTTCGCCTCCAGCTGGTAGAGTGAAACGCCAAACGTTGAAGAAGGGGAAAGGCCAACAAGATGAACTTGTGCAATTCGAGATTCTCCGTTACGTTCGTGCGGCAGTGCTCGACGATTTTAACGTCTCCTTCGGCTGGCATTATACAAAACAGAGGCTCACTGACAACAAAGGTTCCCCATTGGATTTCCAGTAGTCTCAGCTGTGCCCATCACACGTTTCAGCGAACTATGAACTTGACGGGACAGCGAGGATCACGCGACAGTACTGGAGCTACCGGTGGGAATGCTCCAGCCGGATCCGGTGCCGGCGCACCACCACCCTTCCAGCATCCACATTGCGACAGGGCGGCCATGTACGCGCAACCGGTGCGAAAGATGAGCACCAAAGGAGGATCCTACGACTACGACCTTATTGTGATTGGAGGCGGCTCAGCTGGCCTGGCCTGCGCCAAGGAGGCAGTCCTCAATGGAGCCCGTGTGGCCTGTCTGGATTTCGTTAAGCCCACGCCCACTCTGGGCACCAAGTGGGGCGTTGGCGGCACCTGCGTGAACGTGGGCTGCATTCCCAAGAAGCTGATGCACCAGGCCTCCCTTCTGGGCGAGGCTGTCCATGAGGCGGCCGCCTACGGCTGGAACGTGGACGAAAAGATCAAGCCAGACTGGCACAAGCTGGTGCAGTCCGTACAGAACCACATCAAGTCCGTCAACTGGGTGACCCGTGTGGATCTGCGCGACAAGAAAGTGGAGTACATCAATGGACTGGGCTCCYFCGTGGACTCGCACACACTGCTGGCCAAGCTGAAGAGCGGCGAGCGCACAATCACCGCCCAGACCTTCGTCATTGCCGTTGGCGGCCGACCACGTTATCCGGATATTCCCGGTGCTGTCGAGTATGGCATCACCAGCGATGATCTGTTCAGTTTGGACCGCGAGCCCGGCAAGACCCTGGTGGTGGGAGCTGGCTACATTGGCTTGGAGTGCGCTGGATTCCTGAAGGGTCTCGGCTACGAGCCCACTGTGATGGTGCGTTCTATTGTGCTGCGTGGCTTCGACCAGCAGATGGCCGAGCTGGTGGCAGCCTCGATGGAGGAGCGTGGCATTCCCTTCCTCCGCAAGACGGTGCCGCTGTCCGTGGAAAAGCAGGATGATGGCAAGCTGCTCGTGAAGTACAAGAACGTGGAGACCGGCGAGGAGGCCGAGGATGTTTACGACACCGTTCTGTGGGCCATCGGCCGCAAGGGTCTGGTGGACGATCTGAACCTGCCCAATGCCGGCGTGACTGTGCAGAAGGACAAGATTCCAGTGGACTCCCAGGAGGCTACCAATGTGGCAAACATCTACGCTGTCGGCG
ATATCATCTATGGCAAGCCAGAGCTGACGCCCGTCGCCGTTTTGGCTGGCCGTTTGCTGGCCCGCCGCCTGTACGGAGGATCTACCCAGCGCATGGACTACAAGGATGTGGCCACCACCGTTTTCACGCCCCTGGAGTACGCCTGCGTCGGCCTGAGCGAGGAGGATGCCGTCAAGCAGTTCGGAGCCGATGAGATCGAGGTGTTCCACGGCTACTACAAGCCCACGGAGTTCTTCATTCCCCAGAAGAGCGTGCGCTACTGCTACTTGAAGGCTGTGGCCGAGCGCCATGGTGACCAGCGCGTCTATGGACTGCACTATATTGGCCCGGTGGCCGGTGAGGTTATCCAGGGATTCGCTGCCGCTTTGAAGTCTGGCCTGACTATTAACACGCTGATCAACACCGTGGGCATCCATCCCACTACCGCCGAAGAATTCACCCGGCTGGCCATCACCAAGCGCTCCGGACTGGACCCCACGCCGGCCAGCTGCTGCAGCTAAAGCGGGAACGCAGCTCAGCCGCCTGGGACGTGTCGAAGCCGCTTGCTCCACCCGAAATCCCGTAGATGAATGGTTGTTGTCGCGGCCCAGCGATCGATGAGTTCAATAGTTCCGTTTCGTTTCCACAATTAACACCCAACACAATAGCTCTGCGCAAGGGAGGGGCACTGGGCAGCGATGGCGGGTGGAACGACACCAGTGGAACTACCCGCGCGACCAGCCCAACCCACGACTGCTGCGCCGCCGACATGCACTCAAAATTTTGAATTTGTTTGAACCTATGAAATTAACTATGAAATCCCCTAAATGTACGGTTGAAGAATATAATTTTTCACC(SEQ ID NO:188)MSTKGGSYDYDLIVIGGGSAGLACAKEAVLNGARVACLDFVKPTPTLGTKWGVGGTCVNVGCIPKKLMHQASLLGEAVHEAAAYGWNVDEKIKPDWHKLVQSVQNHIKSVNWVTRVDLRDKKVEYINGLGSFVDSHTLLAKLKSGERTITAQTFVIAVGGRPRYPDIPGAVEYGITSDDLFSLDREPGKTLVVGAGYIGLECAGFLKGLGYEPTVMVRSIVLRGFDQQMAELVAASMEERGIPFLRKTVPLSVEKQDDGKLLVKYKNVETGEEAEDVYDTVLWAIGRKGLVDDLNLPNAGVTVQKDKIPVDSQEATNVANIYAVGDIIYGKPELTPVAVLAGRLLARRLYGGSTQRMDYKDVATTVFTPLEYACVGLSEEDAVKQFGADEIEVFHGYYKPTEFFIPQKSVRYCYLKAVAERHGDQRVYGLHYIGPVAGEVIQGFAAALKSGLTINTLINTVGIHPTTAEEFTRLAITKRSGLDPTPASCCS


Human homologue of Complete Genome candidate


(CG10964)—AAC50725 11-cis retinol dehydrogenase

(SEQ ID NO: 189)1taagcttcgg gcgctgtagt acctgccagc tttcgccacaggaggctgcc acctgtaggt61cacttgggct ccagctatgt ggctgcctct tctgctgggtgccttactct gggcagtgct121gtggttgctc agggaccggc agagcctgcc cgccagcaatgcctttgtct tcatcaccgg181ctgtgactca ggctttgggc gccttctggc actgcagctggaccagagag gcttccgagt241cctggccagc tgcctgaccc cctccggggc cgaggacctgcagcgggtgg cctcctcccg301cctccacacc accctgttgg atatcactga tccccagagcgtccagcagg cagccaagtg361ggtggagatg cacgttaagg aagcagggct ttttggtctggtgaataatg ctggtgtggc421tggtatcatc ggacccacac catggctgac ccgggacgatttccagcggg tgctgaatgt481gaacacaatg ggtcccatcg gggtcaccct tgccctgctgcctctgctgc agcaagcccg541gggccgggtg atcaacatca ccagcgtcct gggtcgcctggcagccaatg gtgggggcta601ctgtgtctcc aaatttggcc tggaggcctt ctctgacagcctgaggcggg atgtagctca661ttttgggata cgagtctcca tcgtggagcc tggcttcttccgaacccctg tgaccaacct721ggagagtctg gagaaaaccc tgcaggcctg ctgggcacggctgcctcctg ccacacaggc781ccactatggg ggggccttcc tcaccaagta cctgaaaatgcaacagcgca tcatgaacct841gatctgtgac ccggacctaa ccaaggtgag ccgatgcctggagcatgccc tgactgctcg901acacccccga acccgctaca gcccaggttg ggatgccaagctgctctggc tgcctgcctc961ctacctgcca gccagcctgg tggatgctgt gctcacctgggtccttccca agcctgccca1021agcagtctac tgaatccagc cttccagcaa gagattgtttttcaaggaca aggactttga1081tttatttctg cccccaccct ggtactgcct ggtgcctgccacaaaata(SEQ ID NO:190)1mwlplllgal lwaviwllrd rqslpasnaf vfitgcdsgfgrllalqldq rgfrvlascl61tpsgaedlqr vassrlhttl lditdpqsvq qaakwvemhvkeaglfglvn nagvagiigp121tpwltrddfq rvlnvntmgp igvtlallpl lqqargrvinitsvlgrlaa ngggycvskf181gleafsdslr rdvahfgirv sivepgffrt pvtnleslektlqacwarlp patqahygga241fltkylkmqq rimnlicdpd ltkvsrcleh altarhprtryspgwdakll wlpasylpas301lvdavltwvl pkpaqavy(CG2151)-XP_033135 thioredoxin reductase beta(SEQ ID NO:191)1ccggacctca ggcccagttc agtgtacttc ccctctctacttcctccctc cagtcccttc61tccatccctc ccttttttgg ctgccccttg cctgccttcctcgccagtag cttgcagagt121agacacgatg acaccttttg caggctaaaa aggctgagagtggcactatg tgcagtgagc181caccatggag gaccaagcag gtcagcggga ctatgatctcctggtggtcg gcgggggatc241tggtggcctg gcttgtgcca 
aggaggccgc ccagctgggaaggaaggtgg ccgtggtgga301ctacgtggaa ccttctcccc aaggcacccg gtggggcctcggcggcacct gcgtcaacgt361gggctgcatc cccaagaagc tgatgcacca ggcggcactgctgggaggcc tgatccaaga421tgcccccaac tatggctggg aggtggccca gcccgtgccgcatgactgga ggaagatggc481agaagctgtt caaaatcacg tgaaatcctt gaactggggccaccgtgtcc agcttcagga541cagaaaagtc aagtacttta acatcaaagc cagctttgttgacgagcaca cggtttgcgg601cgttgccaaa ggtgggaaag agattctgct gtcagccgatcacatcatca ttgctactgg661agggcggccg agatacccca cgcacatcga aggtgccttggaatatggaa tcacaagtga721tgacatcttc tggctgaagg aatcccctgg aaaaacgttggtggtcgggg ccagctatgt781ggccctggag tgtgctggct tcctcaccgg gattgggctggacaccacca tcatgatgcg841cagcatcccc ctccgcggct tcgaccagca aatgtcctccatggtcatag agcacatggc901atctcatggc acccggttcc tgaggggctg tgccccctcgcgggtcagga ggctccctga961tggccagctg caggtcacct gggaggacag caccaccggcaaggaggaca cgggcacctt1021tgacaccgtc ctgtgggcca taggtcgagt cccagacaccagaagtctga atttggagaa1081ggctggggta gatactagcc ccgacactca gaagatcctggtggactccc gggaagccac1141ctctgtgccc cacatctacg ccattggtga cgtggtggaggggcggcctg agctgacacc1201catagcgatc atggccggga ggctcctggt gcagcggctcttcggcgggt cctcagatct1261gatggactac gacaatgttc ccacgaccgt cttcaccccgctggagtatg gctgtgtggg1321gctgtccgag gaggaggcag tggctcgcca cgggcaggagcatgttgagg tctatcacgc1381ccattataaa ccactggagt tcacggtggc tggacgagatgcatcccagt gttatgtaaa1441gatggtgtgc ctgagggagc ccccacagct ggtgctgggcctgcatttcc ttggccccaa1501cgcaggcgaa gttactcaag gatttgctct ggggatcaagtgtggggctt cctatgcgca1561ggtgatgcgg accgtgggta tccatcccac atgctctgaggaggtagtca agctgcgcat1621ctccaagcgc tcaggcctgg accccacggt gacaggctgctgagggtaag cgccatccct1681gcaggccagg gcacacggtg cgcccgccgc cagctcctcggaggccagac ccaggatggc1741tgcaggccag gtttgggggg cctcaaccct ctcctggagcgcctgtgaga tggtcagcgt1801ggagcgcaag tgctggacag gtggcccgtg tgccccacagggatggctca ggggactgtc1861cacctcaccc ctgcacctct cagcctctgc cgccgggcacccccccccag gctcctggtg1921ccagatgatg acgacctggg tggaaaccta ccctgtgggcacccatgtcc gagccccctg1981gcatttctgc aatgcaaata aagagggtac tttttctgaagtgtg(SEQ ID 
NO:192)1medqagqrdy dllvvgggsg glacakeaaq lgrkvavvdyvepspqgtrw glggtcvnvg61cipkklmhqa allggliqda pnygwevaqp vphdwrkmaeavqnhvksln wghrvqlqdr121kvkyfnikas fvdehtvcgv akggkeills adhiiiatggrprypthieg aleygitsdd181ifwlkespgk tlvvgasyva lecagfltgi gldttimmrsiplrgfdqqm ssmviehmas241hgtrflrgca psrvrrlpdg qlqvtwedst tgkedtgtfdtvlwaigrvp dtrslnleka301gvdtspdtqk ilvdsreats vphiyaigdv vegrpeltpiaimagrllvq rlfggssdlm361dydnvpttvf tpleygcvgl seeeavarhg qehvevyhahykpleftvag rdasqcyvkm421vclreppqlv lglhflgpna gevtqgfalg ikcgasyaqvmrtvgihptc seevvklris481krsgldptvt gcxg


Putative function


(CG10964)—unknown, similarity to dehydrogenases


(CG2151)—thioredoxin reductase


Example 16 (Category 3)

Line ID—418


Phenotype—Lethal phase embryonic larval phase 3-prepupal-pupal. High mitotic index, dot-like chromosomes, strong metaphase arrest


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003431 (4C1-16)


P element insertion site—289,752


Annotated Drosophila genome Complete Genome candidate


CG3000—rap, fizzy-related

(SEQ ID NO:193)CTTTGGCTTGTTTGCTTGAAAAAACGTAACTTTTTTTGTTGTAATGAAGGAAGCAGCACGGGCAGTAGACCAACTCGAAATCGCGCATTGCCAACACGTAACGTACCAGCCCGTGTAATAACAGAAGAAACCCCGAGCCGCAACAACAACCCCCGAAAAGCGGTAGTTGTAAGAGTTTTCCCAAAGTGGCAGCGGCAATTACACGGCGAGAAACGAGTTCGCGTCGCGTCCAGCTGTTTGAAAATCAAAATTAACCGTTTTTAGCGCGTGAAACAAGACGTTTAGAACCGTGTTCAAAATCCCTCGTACATAAATTGTGTGTACATTTATATATATATATATTTTCTACGCCACGTTAACCAGACTTTTTAAGTTTTAAATTAAAACTAAAGACGTATTATTTTTTTTTTTTTGAGTGTTTATATTTTTTTTTTTGCAAGTTTTGTTTGGTTACATTTGAGTTTGTGTTGAGTTTTTGCCAGCCAAAGGCGCTTAAGATGTTTAGTCCCGAGTACGAGAAGCGCATCCTGAAGCACTACAGTCCTGTGGCACGGAATCTGTTCAACAACTTCGAGTCGTCCACTACGCCCACATCTCTCGACCGCTTCATACCCTGCAGAGCGTACAACAACTGGCAGACGAACTTTGCGTCAATCAACAAGTCCAATGACAACTCGCCGCAGACGAGTAAGAAGCAGCGGGACTGCGGGGAAACGGCACGCGATAGTCTCGCCTACTCCTGCCTACTGAAGAACGAGCTCCTCGGATCGGCAATCGACGACGTGAAGACCGCCGGCGAGGAGCGGAATGAGAATGCCTACACGCCGGCCGCAAAGCGGAGTCTCTTCAAGTACCAGTCACCCACCAAGCAGGACTACAATGGCGAGTGTCCGTACTCGTTGTCACCCGTCAGCGCCAAAAGTCAGAAGCTGTTGCGATCGCCGCGCAAGGCTACGCGCAAAATCTCTCGCATTCCCTTCAAGGTGCTAGACGCGCCCGAGTTGCAGGACGACTTCTATCTGAACCTGGTCGACTGGTCGTCGCAGAACGTACTGGCTGTAGGCCTGGGCAGCTGTGTCTATCTGTGGAGCGCGTGCACCAGTCAGGTTACCCGCCTGTGTGATCTCAGTCCGGATGCGAATACGGTGACCTCGGTGTCGTGGAACGAGCGTGGCAACACCGTGGCCGTGGGCACACATCACGGCTACGTGACCGTCTGGGATGTGGCGGCCAATAAGCAGATCAACAAACTGAATGGCCATTCGGCGCGTGTGGGCGCCTTGGCATGGAACAGTGACATCCTGTCGAGCGGGTCGCGAGACCGTTGGATCATACAGCGGGATACGAGAACGCCGCAACTGCAATCGGAGCGCAGATTGGCCGGACATCGGCAGGAGGTGTGCGGACTGAAATGGTCACCGGATAATCAATACTTGGCCAGTGGCGGCAACGATAATCGGTTGTATGTGTGGAATCAGCAITCCGTGAATCCCGTACAATCATACACGGAGCATATGGCGGCTGTAAAGGCGATCGCGTGGTCGCCGCATCACCACGGACTCCTGGCCAGCGGCGGTGGAACGGCGGATAGGTGTATCCGTTTCTGGAATACGCTGACGGGCCAGCCCATGCAGTGCGTGGACACGGGCTCGCAGGTTTGCAATCTGGCCTGGTCCAAGCACTCCTCGGAGCTGGTCTCCACGCACGGCTACTCGCAGAACCAGATACTCGTGTGGAAATATCCCTCCCTGACGCAAGTGGCCAAGCTGACGGGCCATTCGTATCGTGTGCTCTATCTGGCGCTGAGTCCCGATGGTGAGGCTATTGTTACGGGCGCCGGCGACGAGACGCTGCGATTTTGGAACGTATTCAGCAAGGCGCGCAGTCAGAAGGAGAACAAGTCCGTTCTGAATCTGTTTGCCAATATCAGATAAGGACAATAACTCCAAGCGAGCGAAGACTGAGCGAGCGCCAAAGGCAAACAC
AACACAACACAAAACAAAACAAAACAAAGCAAAGTATAATATAAATAAAATGGATACTTGAAACCGAAAAACAAAGCCAACCAACCAATCAGCAAAAACCAAGCTGAAGCTAACAAACTAATCGAGCCTATATGCTATATATATACAAACGATTCTTGTTCAGCAGTCGTTTTGTAAATTGTTGTGTGACCCCACAGCAGCAATAGATTAAATAAATTTAAGTTAAGCAATCTGTATAGAACGGTAATTAGCAACATTTACGTAGGTAAACACATGCAATTTATGAAGGAATAACATCAAGAGAGATGGCTGAAACAAGAACTGAAAATGAAACTAAGTCTATGGAAATTGTAAGTAATTGGAAAATCAACAACACCACACTCACACACTATCTTTAATCGACATTTTTTGTTGCTGCTTTTTTAAATGTATTGTTTTTTTTTTGTGGTACACCTACACTACACCTAAGAAAATTGGATACCCCTACATATACATTTATACGTTTATATATATATATTTTTTTGCTAGCCTCTAAGTAACTAACTTTATTTCAAGCAAACATTTATACACATATTTCGCTCACTAGAAACACTCATACCCCCGAAAACACAATGTATATTAAATAAACTTATACAATTTCAAAATGTGCCCCAAAAAGTA(SEQ ID NO:194)MFSPEYEKRILKHYSPVARNLFNNFESSTTPTSLDRFIPCRAYNNWQTNFASINKSNDNSPQTSKKQRDCGETARDSLAYSCLLKNELLGSAIDDVKTAGEERNENAYTPAAKRSLFKYQSPTKQDYNGECPYSLSPVSAKSQKLLRSPRKATRKISRIPFKVLDAPELQDDFYLNLVDWSSQNVLAVGLGSCVYLWSACTSQVTRLCDLSPDANTVTSVSWNERGNTVAVGTHHGYVTVWDVAANKQINKLNGHSARVGALAWNSDILSSGSRDRWIIQRDTRTPQLQSERRLAGHRQEVCGLKWSPDNQYLASGGNDNRLYVWNQHSVNPVQSYTEHMAAVKAIAWSPHHHGLLASGGGTADRCIRFWNTLTGQPMQCVDTGSQVCNLAWSKHSSELVSTHGYSQNQILVWKYPSLTQVAKLTGHSYRVLYLALSPDGEAIVTGAGDETLRFWNVFSKARSQKENKSVLNLFANIR


Human homologue of Complete Genome candidate


XP_009259 Fzr1 protein

(SEQ ID NO:195)1ggccgcggcc gggcctgcgg gagctgcgga ggccggaggcgggcgctgtg cggtgccagg61agaggcgggg tcggcgggag ccagcgagcc acgggagcgagccaggctaa ccttgccgcg121ggccgagccc tgcctcgcca tggaccagga ctatgagcggcgcctgcttc gccagatcgt181catccagaat gagaacacga tgccacgcgt cacagagatgcggcggaccc tgacgcctgc241cagctcccca gtgtcctcgc ccagcaagca cggagaccgcttcatcccct ccagagccgg301agccaactgg agcgtgaact tccacaggat taacgagaatgagaagtctc ccagtcagaa361ccggaaagcc aaggacgcca cctcagacaa cggcaaagacggcctggcct actctgccct421gctcaagaat gagctgctgg gtgccggcat cgagaaggtgcaggacccgc agactgagga481ccgcaggctg cagccctcca cgcctgagaa gaagggtctgttcacgtatt cccttagcac541caagcgctcc agccccgatg acggcaacga tgtgtctccctactccctgt ctcccgtcag601caacaagagc cagaagctgc tccggtcccc ccggaaacccacccgcaaga tctccaagat661ccccttcaag gtgctggacg cgcccgagct gcaggacgacttctacctca atctggtgga721ctggtcgtcc ctcaatgtgc tcagcgtggg gctaggcacctgcgtgtacc tgtggagtgc781ctgtaccagc caggtgacgc ggctctgtga cctctcagtggaaggggact cagtgacctc841cgtgggctgg tctgagcggg ggaacctggt ggcggtgggcacacacaagg gcttcgtgca901gatctgggac gcagccgcag ggaagaagct gtccatgttggagggccaca cggcacgcgt961cggggcgctg gcctggaatg ctgagcagct gtcgtccgggagccgcgacc gcatgatcct1021gcagagggac atccgcaccc cgccactgca gtcggagcggcggctgcagg gccaccggca1081ggaggtgtgc gggctcaagt ggtccacaga ccaccagctcctcgcctcgg ggggcaacga1141caacaagctg ctggtctgga atcactcgag cctgagccccgtgcagcagt acacggagca1201cctggcggcc gtgaaggcca tcgcctggtc cccacatcagcacgggctgc tggcctcggg1261gggcggcaca gctgaccgct gtatccgctt ctggaacacgctgacaggac aaccactgca1321gtgtatcgac acgggctccc aagtgtgcaa tctggcctggtccaagcacg ccaacgagct1381ggtgagcacg cacggctact cacagaacca gatccttgtctggaagtacc cctccctgac1441ccaggtggcc aagctgaccg ggcactccta ccgcgtgctgtacctggcaa tgtcccctga1501tggggaggcc atcgtcactg gtgctggaga cgagaccctgaggttctgga acgtctttag1561caaaacccgt tcgacaaagg agtctgtgtc tgtgctcaacctcttcacca ggatccggta1621aacctgccgg gcaggaccgt gccacaccag ctgtccagagtcggaggacc ccagctcctc1681agcttgcatg gactctgcct tcccagcgct tgtcccccgaggaaggcggc tgggcgggcg1741gggagctggg cctggaggat 
cctggagtct cattaaatgcctgattgtga accatgtcca1801ccagtatctg gggtgggcac gtggtcgggg accctcagcagcaggggctc tgtctccctt1861cccaaagggc gagaaccaca ttggacggtc ccggctcagaccgtctgtac tcagagcgac1921ggatgccccc tgggaccctc actgcctccg tctgttcatcacctgcccac cggagccgca1981tgctcttcct ggaactgccc acgtctgcac agaacagaccaccagacgcc agggctgatt2041ggtgggggcc tgagaccccg gttgcccatt catggctgcaccccaccatg tcaaacccaa2101gaccagcccc aaggccagac caaggcatgt aggcctgggcaggtggctcg gggccactgg2161cggagccagc ctgtggatcc aagagacagt ccccacctgggcttcacggc atccttgcag2221ccacctctgc tgtcactgct cgaagcagca gtctctctggaagcatctgt gtcatggcca2281tcgcccggcg gtcagtgggc ttcagatggg cctgtgcatcctggccaagc gtcaccctca2341cactggagga ggatgtctgc tctggactta tcaccccaggagaactgaac ccggacctgc2401tcactgccct ggctggagag gagcacaaca gatgccacgtcttcgtgcat tcgccaacac2461gtgccctcac agggccagcg tcctccttcc ctgcgcaagacttgcgtccc ccatgcctgc2521tgggtggctg ggtcctgtgg aggccagcag cggtgtggcccccgccccca ggctgcctgt2581gtcttcacct gtcctgtcca ccagcgccaa cagccgtggggaagccaagg agacccaagg2641ggtccaggag gtgggcgccc tccatccttc gagaagcttcccaggctcct ctgcttctct2701gtctcatgct cccaggctgc acagcaggca gggagggaggcaaggcaggg gagtggggcc2761tgagctgagc actgccccct caccccccca ccaccccttcccatttcatc ggtggggacg2821tggagagggt ggggcgggct ggggttggag ggtcccacccaccaccctgc tgtgcttggg2881aacccccact ccccactccc cacatcccaa catcctggtgtctgtcccca gtggggttgg2941cgtgcatgtg tacatatgta tttgtgactt ttctttgg(SEQ ID NO:196)1mdqdyerrll rqiviqnent mprvtemrrt ltpasspvsspskhgdrfip sraganwsvn61fhrineneks psqnrkakda tsdngkdgla ysallknellgagiekvqdp qtedrrlqps121tpekkglfty slstkrsspd dgndvspysl spvsnksqkllrsprkptrk iskiptkvld181apelqddfyl nlvdwsslnv lsvglgtcvy lwsactsqvtrlcdlsvegd svtsvgwser241gnlvavgthk gfvqiwdaaa gkklsmlegh tarvgalawnaeqlssgsrd rmilqrdirt301pplqserrlq ghrqevcglk wstdhqllas ggndnkllvwnhsslspvqq ytehlaavka361iawsphqhgl lasgggtadr cirfwntltg qplqcidtgsqvcnlawskh anelvsthgy421sqnqilvwky psltqvaklt ghsyrvlyla mspdgeaivtgagdetlrfw nvfsktrstk481esvsvlnlft rir


Putative function


Cell cycle regulator involved in cyclin degradation


Example 17 (Category 3)

Line ID—121


Phenotype—Lethal phase larval phase 3-prepupal-pupal-pharate adult-adult. High mitotic index, dot and rod-like overcondensed chromosomes, high frequency of polyploids


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003493 (12B7)


P element insertion site—not determined


Annotated Drosophila genome Complete Genome candidate


CG10988—l(1)dd4, gamma-tubulin ring complex

(SEQ ID NO:197)TAACACTGCACTAAATAATTTTAATAAATTATTTGTATGAAGTACGCGCCAATTGGATGCGTTTTTGTCCTATCTGTCGAAGATTTCACGCATCCCGAACAATTGCCAGTGACTGCACGCCGTATTATAGCCAGGGAACAGCTGTGCGTTTGCCATTGGCCAACAGTTGTTGTCCACTTCGCAATTACCAAGCCATCCAAAATCGGCTGTTTAACGCGCGCTTGATTGGATATTTATGAACAATTCAGTGCACCAGGATGTCGCAGGACAGGATCGCCGGCATCGATGTGGCAACCAATTCCACTGATATATCGAATATCATTAACGAGATGATCATCTGCATCAAGGGCAAGCAGATGCCCGAAGTTCACGAAAAAGCAATGGATCATTTAAGCAAAATGATTGCCGCCAATAGTCGGGTCATTCGGGACTCAAATATGTTGACTGAGCGCGAATGTGTCCAGAAGATAATGAAACTGCTGAGCGCCCGGAATAAGAAGGAGGAGGGCAAAACTGTGTCGGATCACTTCAATGAGCTGTACAGGAAACTCACGTTGACCAAGTGCGATCCGCACATGAGGCACTCGCTAATGACCCATCTACTTACGATGACCGACAATTCGGATGCCGAAAAGGCAGTTGCCAGCGAAGATCCACGTACTCAGTGCGATAATCTCACTCAGATTCTGGTCAGTCGTCTTAACTCAATAAGTTCCTCCATAGCCAGTCTGAATGAGATGGGAGTGGTCAACGGAAATGGAGTAGGAGCAGCAGCGGTAACAGGAGCAGCAGCGGTAACAGGAGCAGCAGCGGTAACAGGAGCAGCAGCGGTAACAGGAGCAGCAGCAAGCCACAGTTATGATGCCACACAGTCCAGCATCGGATTGAGAAAACAGTCCTTGCCCAACTACCTGGATGCAACAAAGATGTTGCCCGAGTCTCGACATGATATAGTGATGAGTGCCATTTACTCCTTCACCGGCGTTCAAGGGAAGTATTTGAAGAAGGATGTGGTAACGGGCCGTTTCAAGCTGGATCAGCAGAACATCAAGTTCCTGACCACCGGCCAAGCGGGCATGTTGCTGCGGCTCTCCGAACTTGGCTACTACCACGATCGAGTGGTCAAGTTTTCGGATGTATCGACCGGTTTCAATGCCATTGGCAGCATGGGCCAGGCCCTGATTTCCAAACTCAAGGAGGAGCTGGCGAATTTTCACGGGCAAGTGGCAATGCTTCACGATGAAATGCAGCGTTTTCGGCAGGCCTCGGTGAATGGAATTGCAAACAAGGGGAAAAAGGATAGTGGGCCCGATGCTGGCGATGAAATGACGCTATTCAAGCTGCTCGCCTGGTATATAAAGCCACTGCACCGGATGCAGTGGTTAACCAAGATTGCCGACGCCTGCCAGGTAAAGAAGGGCGGTGATTTGGCATCGACCGTTTATGATTTCCTTGACAACGGTAACGATATGGTCAATAAATTGGTGGAGGATCTCCTAACTGCCATTTGTGGCCCACTGGTGCGCATGATCTCCAAATGGATTCTGGAGGGCGGCATTAGCGATATGCATAGAGAGTTCTTTGTGAAGTCCATTAAAGATGTGGGCGTTGATCGGCTATGGCACGATAAATTCCGCCTACGATTGCCAATGCTGCCCAAGTTTGTGCCCATGGATATGGCCAATAAGATACTCATGACGGGCAAATCCATTAATTTTCTAAGAGAAATCTGCGAGGAGCAGGGTATGATGAAGGAGCGCGACGAACTAATGAAGGTCATGGAATCTAGTGCCTCTCAAATCTTTTCGTACACACCGGACACCAGTTGGCATGCGGCCGTGGAAACGTGCTACCAGCAGACCTCCAAACATGTCCTCGACATTATGGTGGGCCCACACAAGCTGCTGGATCATTTGCACGGAATGCGGCGCTACTTGCTGTTGGGCCAGGGCGATTTTATTAGCATTCTGATTGAAAACATGAAG
AACGAACTGGAGCGACCGGGCCTTGATATATATGCTAACGATCTCACCTCCATGTTGGATTCCGCTCTGCGCTGTACGAATGCCCAGTACGATGATCCTGATATTCTAAACCATCTCGATGTGATTGTTCAACGACCGTTCAACGGTGATATTGGCTGGAACATCATCTCGCTGCAGTACATTGTCCACGGACCACTGGCCGCCATGCTGGAGTCGACCATGCCAACGTACAAGGTGCTCTTCAAGCCACTCTGGCGCATGAAGCACATGGAGTTTGTGCTCTCGATGAAGATCTGGAAGGAGCAGATGGGCAACGCAAAGGCCCTTCGTACAATGAAGTCCGAAATCGGCAAGGCGTCACACCGCCTCAACCTTTTCACTTCCGAGATCATGCACTTTATCCACCAAATGCAGTACTATGTGCTATTTGAGGTCATCGAGTGCAACTGGGTGGAGCTACAGAAGAAGATGCAGAAGGCTACTACGTTGGACGAAATCCTGGAAGCTCACGAGAAGTTTCTGCAAACGATTTTGGTGGGCTGTTTTGTCAGCAACAAAGCGAGTGTGGAGCATTCGCTGGAGGTGGTGTACGAGAACATTATCGAATTGGAGAAGTGGCAGTCGAGCTTTTACAAGGACTGCTTTAAGGAGCTAAATGCCCGCAAGGAACTGTCCAAAATTGTGGAGAAATCGGAAAAGAAGGGTGTCTACGGACTGACCAACAAGATGATCCTGCAGCGCGACCAGGAGGCGAAGATATTTGCCGAAAAGATGGACATCGCCTGCCGCGGCTTAGAAGTCATAGCAACCGATTACGAAAAGGCTGTCAGCACTTTCCTAATGTCTCTCAACTCTAGCGACGATCCGAATTTGCAGCTCTTTGGCACTCGGCTGGACTTCAACGAGTACTACAAGAAGAGGGACACCAATTTGAGCAAACCCCTGACCTTCGAGCACATGCGCATGAGCAATGTGTTCGCCGTGAACAGTCGCTTCGTGATATGTACGCCGTCCACTCAGGAATAGCGACCAATGTCCATGCAATCGGTTTATCCCAGTGTCCATACATCATACCAAATCCCAAATCCCATACAGCATCAGCACTCCATTCAGTTCAATTGCTGCTAAATATTTGAGATATCTCGATATCATTGGAGCCAATCCAACCAAACAAACTAATCCAATTATTAACTAAGCCTTCGAATCGAAAACAACCTCTATACATATATATCTCAAGCTTTGCCGTCAATCGCCTGGCTGCAAGCCATCAACTTAAGATATCTCCAATACAAAATTATTGAGTAGTTGTAACGAAAGTATTAAGCGACAATTTGTTTGTCGAAAAACGCAACGTTCTATTTTGTTTGCGAATCCCATAATTTTTTTTACATCGAAGCTTAGTTGAAATAGATTTTCGTAAGTGCATTTGCCAATTGCCATGTTGTAATTAAAGAGAATAAGAGAATGTTACGTACTTTAAAAGAATGTTTTAAAAAAGTTAATGTTTTGAACAGTTTTAAACCGTAATGCGAG(SEQ ID 
NO:198)MSQDRIAGIDVATNSTDISNIINEMIICIKGKQMPEVHEKAMDHLSKMIAANSRVIRDSNMLTERECVQKIMKLLSARNKKEEGKTVSDHFNELYRKLTLTKCDPHMRHSLMTHLLTMTDNSDAEKAVASEDPRTQCDNLTQILVSRLNSISSSIASLNEMGVVNGNGVGAAAVTGAAAVTGAAAVTGAAAVTGAAASHSYDATQSSIGLRKQSLPNYLDATKMLPESRHDIVMSAIYSFTGVQGKYLKKDVVTGRFKLDQQNIKFLTTGQAGMLLRLSELGYYHDRVVKFSDVSTGFNAIGSMGQALISKLKEELANFHGQVAMLHDEMQRFRQASVNGIANKGKKDSGPDAGDEMTLFKLLAWYIKPLHRMQWLTKIADACQVKKGGDLASTVYDFLDNGNDMVNKLVEDLLTAICGPLVRMISKWILEGGISDMHREFFVKSIKDVGVDRLWHDKFRLRLPMLPKFVPMDMANKILMTGKSINFLREICEEQGMMKERDELMKVMESSASQIFSYTPDTSWHAAVETCYQQTSKHVLDIMVGPHKLLDHLHGMRRYLLLGQGDFISILIENMKNELERPGLDIYANDLTSMLDSALRCTNAQYDDPDILNHLDVIVQRPFNGDIGWNIISLQYIVHGPLAAMLESTMPTYKVLFKPLWRMKHMEFVLSMKIWKEQMGNAKALRTMKSEIGKASHRLNLFTSEIMHFIHQMQYYVLFEVIECNWVELQKKMQKATTLDEILEAHEKFLQTILVGCFVSNKASVEHSLEVVYENIIELEKWQSSFYKDCFKELNARKELSKIVEKSEKKGVYGLTNKMILQRDQEAKIFAEKMDIACRGLEVIATDYEKAVSTFLMSLNSSDDPNLQLFGTRLDFNEYYKKRDTNLSKPLTFEHMRMSNVFAVNSRFVICTPSTQE


Human homologue of Complete Genome candidate


AAC39727—spindle pole body protein spc98 homolog GCP3

(SEQ ID NO:199)1caggaagggc gcgggccgcg gtccctgcgc gtgcggcggc agtggcggct ctgcccggac61caccgtgcac ggctccgggc gaggatggcg accccggacc agaagtcgcc gaacgttctg121ctgcagaacc tgtgctgcag gatcctgggc aggagcgaag ctgatgtagc ccagcagttc181cagtatgctg tgcgggtgat tggcagcaac ttcgccccaa ctgttgaaag agatgaattt241ttagtagctg aaaaaatcaa gaaagagctt attcgacaac gaagagaagc agatgctgca301ttattttcag aactccacag aaaacttcat tcacagggag ttttgaaaaa taaatggtca361atactctacc tcttgctgag cctcagtgag gacccacgca ggcagccaag caaggtttct421agctatgcta cgttatttgc tcaggcctta ccaagagatg cccactcaac cccttactac481tatgccaggc ctcagaccct tcccctgagc taccaagatc ggagtgccca gtcagcccag541agctccggca gcgtgggcag cagtggcatc agcagcattg gcctgtgtgc cctcagtggc601cccgcgcctg cgccacaatc tctcctccca ggacagtcta atcaagctcc aggagtagga661gattgccttc gacagcagtt ggggtcacga ctcgcatgga ctttaactgc aaatcagcct721tcttcacaag ccactacctc aaaaggtgtc cccagtgctg tgtctcgcaa catgacaagg781tccaggagag aaggggatac gggtggtact atggaaatta cagaagcagc tctggtaagg841gacattttgt acgtctttca gggcatagat ggcaaaaaca tcaaaatgaa caacactgaa901aattgttaca aagtagaagg aaaggcaaat ctaagtaggt ctttgagaga cacagcagtc961aggctttctg agttgggatg gttgcataat aaaatcagaa gatacacgga ccagaggagc1021ctggaccgct cattcggact cgtcgggcag agcttttgtg ctgccttgca ccaggaactc1081agagaatact atcgattgct ctctgtttta cattctcagc tacaactaga ggatgaccag1141ggtgtgaatt tgggacttga gagtagttta acacttcggc gcctcctggt ttggacctat1201gatcccaaaa tacgactgaa gacccttgcg gccctagtgg accactgcca aggaaggaaa1261ggaggtgagc tggcctcagc tgtccacgcc tacacaaaaa caggagaccc gtacatgcgg1321tctctggtgc agcacatcct cagcctcgtg tctcatcctg ttttgagctt cctgtaccgc1381tggatatatg atggggagct tgaggacact taccacgaat tttttgtagc atcagatcca1441acagttaaaa cagatcgact gtggcacgac aagtatactt tgaggaaatc gatgattcct1501tcgtttatga cgatggatca gtctaggaag gtccttttga taggaaaatc aataaatttc1561ttgcaccaag tttgtcatga tcagactccc actacaaaga tgatagctgt gaccaagtct1621gcagagtcac cccaggacgc tgcagaccta ttcacagact tggaaaatgc atttcagggg1681aagattgatg ctgcttattt tgagaccagc aaatacctgt tggatgttct 
caataaaaag1741tacagcttgc tggaccacat gcaggcaatg aggcggtacc tgcttcttgg tcaaggagac1801tttataaggc acttaatgga cttgctaaaa ccagaacttg tccgtccagc tacgactttg1861tatcagcata acttgactgg aattctagaa accgctgtca gagccaccaa cgcacagttt1921gacagtcctg agatcctgcg aaggctggac gtgcggctgc tggaggtctc tccaggtgac1981actggatggg atgtcttcag cctcgattat catgttgacg gaccaattgc aactgtgttt2041actcgagaat gtatgagcca ctacctaaga gtatttaact tcctctggag ggcgaagcgg2101atggaataca tcctcactga catacggaag ggacacatgt gcaatgcaaa gctcctgaga2161aacatgccag agttctccgg ggtgctgcac cagtgtcaca ttttggcctc tgagatggtc2221catttcattc atcagatgca gtattacatc acatttgagg tgcttgaatg ttcttgggat2281gagctttgga acaaagtcca gcaggcccag gatttggatc acatcattgc tgcacacgag2341gtgttcttag acaccatcat ctcccgctgc ctgctggaca gtgactccag ggcactttta2401aatcaactta gagctgtgtt tgatcaaatt attgaacttc agaatgctca agatgcaata2461tacagagctg ctctggaaga attgcagaga cgattacagt ttgaagagaa aaagaaacag2521cgtgaaattg agggccagtg gggagtgacg gcagcagagg aagaggagga aaataagagg2581attggagaat ttaaagaatc tataccaaaa atgtgctcac agttgcgaat attgacccat2641ttctaccagg gtatcgtgca gcagtttttg gtgttactga cgaccagctc tgacgagagt2701cttcggtttc ttagcttcag gctggacttc aacgagcatt acaaagccag ggagcccagg2761ctccgtgtgt ctctgggtac cagggggcgg cgcagctccc acacgtgaag ctcgcggtcc2821tcccagggag ctgcgggtga tgttcgttgc actgctagac acgaaattcc cattgacgtc2881ctgcaggaac tgcatgctgc aggtgtcctg cccttccgcc cacgagtgcg ccatgtttca2941gcggagcggc gtgtgggaga agccacgtcg tgtttcacat gtcggagtcg aatgcatttg3001taaatcccta agtcaagtag gctggctgca ctgttcacat ttgtctctaa aagtcttcat3061cgctaaaaga taccataatt tgctgaggct tcttaagctt tctatgttat aatttatatt3121tgtcacttta aaaaatccat ttcttttaga aaaaattagg gtgataggat attcattagt3181taagatggta acgtcattgc tattttttta acatcctctt tagaggtaat ttttgttaac3241ataaccaaaa attaaattga aacaaaatgt cccaactaag aaaatatata gagcatttta3301ttttttttta gtgttgtaaa atattaacct ctgtgagatc ctttgtatct taatgcatta3361cctttacaca tatttattct tattttctct cctttcagag tttacatttt tatatttaat3421ttactatttc agatttttaa aatagtatag aaaaaagtag 
gagtgataga gaacaaaaat3481actcttatac agtgcaaccc aaataccgcg aatgcatcag ctaaagcagc gtgtaaatag3541gagtgatgag aaagttaatg gagtatttta ttttcaaagt tcctgataag cattggaaag3601aaatcgacat ggataatgaa gatttccttt ttccttgcct attttttcat tgtaaatatt3661tatatactac tgaccaagat gttggggtgg gggggattgt tttttgtaaa aatgtcatta3721tcaggtcaca taaatctgcc tttatgttgc ataagtgaaa atttagaaaa ttaaaagcaa3781ttatctttca aaaaa(SEQ ID NO:200)1matpdqkspn vllqnlccri lgrseadvaq qfqyavrvig snfaptverd eflvaekikk61elirqrread aalfselhrk lhsqgvlknk wsilylllsl sedprrqpsk vssyatlfaq121alprdahstp yyyarpqtlp lsyqdrsaqs aqssgsvgss gissiglcal sgpapapqsl181lpgqsnqapg vgdclrqqlg srlawtltan qpssqattsk gvpsavsrnm trsrregdtg241gtmeiteaal vrdilyvfqg idgknikmnn tencykvegk anlsrslrdt avrlselgwl301hnkirrytdq rsldrsfglv gqsfcaalhq elreyyrlls vlhsqlqled dqgvnlgles361sltlrrllvw tydpkirlkt laalvdhcqg rkggelasav haytktgdpy mrslvqhils421lvshpvlsfl yrwiydgele dtyheffvas dptvktdrlw hdkytlrksm ipsfmtmdqs481rkvlligksi nflhqvchdq tpttkmiavt ksaespqdaa dlftdlenaf qgkidaayfe541tskylldvln kkyslldhmq amrrylllgq gdfirhlmdl lkpelvrpat tlyqhnltgi601letavratna qfdspeilrr ldvrllevsp gdtgwdvfsl dyhvdgpiat vftrecmshy661lrvfnflwra krmeyiltdi rkghmcnakl lrnmpefsgv lhqchilase mvhfihqmqy721yitfevlecs wdelwnkvqq aqdldhiiaa hevfldtiis rclldsdsra llnqlravfd781qiielqnaqd aiyraaleel qrrlqfeekk kqreiegqwg vtaaeeeeen krigefkesi841pkmcsqlril thfyqgivqq flvllttssd eslrflsfrl dfnehykare prlrvslgtr901grrssht


Putative function


Component of the centrosome


Example 18 (Category 3)

Line ID—237


Phenotype—Lethal phase larval stage 3 (few pupae). High mitotic index, colchicine-type overcondensation of chromosomes, polyploid cells, ‘mininuclei’ formation


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE0086 (10C4-5)


P element insertion site—182,487


Annotated Drosophila genome Complete Genome candidate


2 candidates:


CG1558—novel protein

(SEQ ID NO:201)ATGGAGCCAGCCGAAAGTCCAGAAAAATTAATGAAATTCGTACGCCGCAGTGACGTACTGGAATACGTGGGCAACACGAGTGCCGTCGATCTATCGAGCGGTGATCTCTCCGACATCGATCTCAAGGACGTGCCGGCCCAACTGGAGGCCACTTTGAAACCGCGTCGCTATGAAGCAAGCACTTTGTTTAACATTGACCTGGACGATATCTGGGATCCTAGCTGTCAGGAGGACGAGGTGCAGCAGTACAAGGAGCGCGCCCAGAAGGAGCAGCAAAAGTTCTTCGACTTTGTAATGCATGCGGCACTGGACACGGACAATCGCAAGGTTAGCTTCAAGCCAAACAAGGAGCAGCAGCGTTACCTAGATCAGGGACCCAATTTGCAAAACTTCGTGCGAAGCTCGTTGGCTTTCACAAACGCGGCCATCCGATTTCAGGCGGAGCACGAGGACATGATGGAGCTGCAGTGCAATATGGACGATCACTACCTATTCATGCGGAACACCATGATCAACAACGCTATACACCAGAATATGGCCAACCAACGGTGACCCTAAGCTATGCATAAATATACATATGTGAATTGTAGATATTGATAAATTAAATTAAGACTCAGAGATTGTAAGACGGTTTGCTTTTGGCTTATACAGTATAATTCGCTTAGCTGCCTCGAGTACTTTGCACAATGCCTCGATGCAGGTAACTTAAAAATGCAGCTAACTTAATTTTTTTTTTTCTATTTTCTATTTTCTATTCACAC(SEQ ID NO:202)MEPAESPEKLMKFVRRSDVLEYVGNTSAVDLSSGDLSDIDLKDVPAQLEATLKPRRYEASTLFNIDLDDIWDPSCQEDEVQQYKERAQKEQQKFFDFVMHAALDTDNRKVSFKPNKEQQRYLDQGPNLQNFVRSSLAFTNAAIRFQAEHEDMMELQCNMDDHYLFMRNTMINNAIHQNMANQR


CG11697—novel protein

(SEQ ID NO:203)ATGATTTATGCGATCGTGATACACATACTGTCCCTTCTGGTGGGCTGTTTCTATCCAGCATTCGCGTCCTACAAGATCCTGAAAAGTCAGAATTGTAGCGTCAATGATCTTCGCGGATGGTTAATCTACTGGATTGCCTATGGAGTTTATGTGGCCTTTGATTATTTCACAGCGGGTCTGCTGGCATTTATTCCATTGCTAAGTGAGTTCAAGGTGCTTCTCCTGTTCTGGATGTTGCCCTCTGTGGGCGGCGGCAGTGAGGTGATCTACGAGGAGTTCCTGCGATCCTTTAGCTGTAACGAATCCTTCGACCAGGTCCTGGGACGTATCACCTTGGAATGGGGCGAATTGGTGTGGCAACAAGTTTGCTCCGTTCTTAGCCATTTGATGGTTTTGGCAGATCGCTATCTCCTGCCCAGCGGTCATCGTCCTGCCCTCCAAATAACGCCCAGCATCGAGGATCTGGTCAACGATGCCATAGCCAAAAGGCAGTTGGAAGAGAAGCGGAAACAGATGGGTAACTTATCTGATACCATCAACGAGGTTTTGGGAGAAAATATCGATTTAAATATGGATCTGCTGCACGGATCCGAATCTGATTTATTGGTTATTAAGGAGCCTATTTCCAAGCCCAAGGAGAGACCAATACCGCCGCCGAAGCCAATGCGTCAGCCATCATCAAGCAACCAGCAAGAAATGAATCTTTCGTCGCAGTTTATGTGA(SEQ ID NO:204)MIYAIVIHILSLLVGCFYPAFASYKILKSQNCSVNDLRGWLIYWIAYGVYVAFDYFTAGLLAFIPLLSEFKVLLLFWMLPSVGGGSEVIYEEFLRSFSCNESFDQVLGRITLEWGELVWQQVCSVLSHLMVLADRYLLPSGHRPALQITPSIEDLVNDAIAKRQLEEKRKQMGNLSDTINEVLGENIDLNMDLLHGSESDLLVIKEPISKPKERPIPPPKPMRQPSSSNQQEMNLSSQFM


Human homologue of Complete Genome candidate


(CG1558)—none


(CG11697)—BAB14444 unnamed protein—similar to a hypothetical protein in the region deleted in human familial adenomatous polyposis 1

(SEQ ID NO:205)1aacgccgggc agggcggcgg gcgcgctcag tctggcggcg gctgccgtga gctgactgac61gttccgggaa cgccgcagca gcccgcgccg cccgcagcct agccgagccg cgccgcccgg121gcctcgcccg cccgcctgcc cgccatggtg tcatggatca tctccaggct ggtggtgctt181atatttggca ccctttaccc tgcgtattat tcctacaagg ctgtgaaatc aaaggacatt241aaggaatatg tcaaatggat gatgtactgg attatatttg cacttttcac cacagcagag301acattcacag acatcttcct ttgttggttt ccattctatt atgaactaaa aatagcattt361gtagcctggc tgctgtctcc ctacacaaaa ggctccagcc tcctgtacag gaagtttgta421catcccacac tatcttcaaa agaaaaggaa atcgatgatt gtctggtcca agcaaaagac481cgaagttacg atgcccttgt gcacttcggg aagcggggct tgaacgtggc cgccacagcg541gctgtgatgg ctgcttccaa gggacagggt gccttatcgg agagactgcg gagcttcagc601atgcaggacc tcaccaccat caggggagac ggcgcccctg ctccctcggg ccccccacca661ccggggtctg ggcgggccag cggcaaacac ggccagccta agatgtccag gagtgcttct721gagagcgcta gcagctcagg caccgcctag aatccttcga tctcgcttca ggaagaaaag781tacctcatcc tcggccaccg aaaccacgtg agtgagatga gccaacagca ccggatccac841agaatgtttc ttctctgcct taaagagcta ttcactaata acatagaaat ccgcaagctg901ggtgtgcttt gagtgtgcag cctcacaaac atggcctttt ctctctcccc ttccactttt961aaggatttat ttttttcccc cttttcttta ttttgctggg gagaggctaa agggaaaggt1021agtaggggcg ggggtggtga cctttaagtc ttctgaggtt ggtaattttc cacaattgga1081ttgtcattat agacagcagt gtgtttttta gaaagataag agaatcaccc ctatgctgct1141gagatgtaca tttgtaattt atctgttgca tacttagttt ttagtcctgt aaatgcaaac1201acagcatttt ttacaacttt ctttgttctt ggtacttata ctttgaacta tgatgtacat1261atttatggct tttggctttt aatataatgg acttgcaagg gctgccagag gttctgatat1321gtaagaaaac tgcaaaaaca aatatagaca aatattttga ttctagagaa cgtctcagat1381gtgcttataa agcttccaaa tacaactcca gtaagacatc cctttccctg caggagtgtg1441gtctatattc tttagatagt tgtttagtca aaagaccaga caagttacaa actaagagaa1501acaatatttc acaacacagt aaagtgtgat gagaggtcag gggaacatcc cagtaaaaga1561gaagagtcac aggaagctca tctcctccct ggattctgga ttaggagctt ctgaatcttt1621tccagggata ggcaggtagc tcactcttgg tgcaatttct tgaggatggg aacatgtaga1681gctgctggaa ggagtaattc tgtgcttgac aaaggacgat ttctccttta 
tcgtgaccag1741tgctgccgat ttcctgacag aggagcttac actctgagca ccttgtttta gcgaactcta1801gcaaaacttg tttagcttag caaaaacaaa cacacaaaaa actgagaact ctgctgtttc1861agatatgcca taacatacat ctgaaacaca tgtgtaacaa tcaaaatggt gggctctaga1921atggttttgg agctcgagat cttcatgggt tagacttgct ggtcagaccc aggagcacct1981gtggctcaca ccttctgttc ccctcctggc ctgtgcagaa tgtaaacagc agactcatac2041tcaatgggca ctacaggcct tatcagacgt tttatacaag cctggattgc ttagtagggg2101aataaggcat tctctgaggg ggctttccac ttagattgag aattttattt gaaaagaatc2161tggtttaaat ggcattgtgg tccgaggtag ctgctctccc cactgagagc tgagccgaaa2221tataagaata atatatttgt gcttcgagtt ggtgtttctt tcagtgtaat gcatgcagtg2281gtcacaaccc agttactcat aatatttgga ttgtatttgt tcgtagatat gcccagaaga2341ctagagaatt agtgttatat accatataga acttactgtc agtcaactat aaacaggccc2401aattaaaaac tgttccatta ctacgcaaac acatattaga ggcctttgct gatgacacat2461tagctggatc ttagccaccc cagaaagggt ttgatttgaa gctgattgtt gccagatatg2521catattggaa tcccatctac ccatagttcc tctgaaggtg attttgtaat ttgcaaaagg2581gtataggaaa atatacctaa aagcgaattt gtggctgaga ggataaacag aagctgtttg2641ctcatgttct gtgccccaca cccaccaata cctaaatctg ttaaggaaga cagaaaatgt2701tttctttgtg ctcattgagt agttccagac agaagaagaa tatactcttt aaaatgtatt2761tacctgttag ttggaagtac ccagaattat cagaaacgaa tgcaaaaaaa aaaaaaaaaa2821aaaaaagctt acacagcttc ttagcaattt tttttttttt tgccgaaaca ataaattgcc2881tttagcagca gtttaaaatc ctatcgtgaa caacctatat tttcgccatt ttacaatgga2941gagttgtgac aagtacaggt tatcaagttt gcacttaact atgccaaaaa aagtttgaag3001cgctctattc tcagacatgc tgtattatta cttctcattc aagattgaaa aatataaagg3061tatccaaact ctgtcttaat gtaaatgtaa ctatttttcc ttcaagtgtt gactagggag3121tcggtttctc tcttaaagac actcactgta caactgaaag cagctgtcat atttctggca3181aaatgtgttt acgtatctga caagttgtac atttgtgtat gaactgacat aaaatgtgaa3241agcctgtaag tgtacatgta gtggtgtggt gttctgtcta gaggatacaa ctgaatgttt3301ttaatttgct gacttacaga cacaggctgt ttacaaaatg ctagctggaa agtctgtaat3361gttcatgtca taacttttag ttaattgcca ttgagcacct gttctgagga ggtgagatgt3421ggacttgtgc ttataaactg gagagtttag tcataatccc 
tcctggcttt gtgtgaatag3481cttgctcact ttgctggcct ttgaaatgtg ttctccgtga taagctatcc atgtgtttgt3541gataagagtg cttgtcaacc atgaccatct ttgagccttc ctagtcctcc acctggcaca3601gtatttgaaa tggcaaagga tgtgcttcat cctctaacaa acagtgtaca ctcccagagc3661tgatattctg gattgtgact gtgcacattt cctctagttc atgtctgtag tccctataga3721atgatctgta ataaaatagt atactggact gtgcatcaaa gggatgtaaa attacagtat3781tccaaaggtt gaagttctgc tgttttgtta taatgcctga tacacatctt gaataaagtc3841ttaacatttt tctttt(SEQ ID NO:206)1miyaivihil sllvgcfypa fasykilksq ncsvndlrgw liywiaygvy vafdyftagl61lafipllsef kvlllfwmlp svgggseviy eeflrsfscn esfdqvlgri tlewgelvwq121qvcsvlshlm vladryllps ghrpalqitp siedlvndai akrqleekrk qmgnlsdtin181evlgenidln mdllhgsesd llvikepisk pkerpipppk pmrqpsssnq qemnlssqfm241


Putative function


(CG1558)—unknown


(CG11697)—may be deleted in human cancers, possibly a receptor.


Example 19
Corkscrew/Shp2 (Category 3)

Corkscrew (CG3954) is detected as a candidate gene in a screen of a P-element insertion library covering the X chromosome of Drosophila melanogaster (Peter et al. 2001), through the mutant phenotype of fly line 171, as described above.


Mitotic defects are observed in brain squashes: low mitotic index, few cells in mitosis, and metaphases with separated chromosomes. The line is placed in Category 3 as described above.


Rescue and sequencing of genomic DNA flanking the P-element insertion site indicates that the P-element is inserted into the 5′ region of two genes: CG3954 corkscrew and CG16903 cyclin/non-specific RNA polymerase II transcription factor.


Line ID—171


Phenotype—Lethal phase larval stage 1-2. Low mitotic index, few cells in mitosis, metaphase with separated chromosomes


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003423 (2D 1-2)


P element insertion site—42,253


Annotated Drosophila genome Complete Genome candidate


2 candidates: CG3954—corkscrew, a protein tyrosine phosphatase required for cell signaling in eye development (2 splice variants), and CG16903—cyclin/non-specific RNA polymerase II transcription factor


CG3954—corkscrew. Protein tyrosine phosphatase required for cell signaling in eye development, splice variant 1

(SEQ ID NO:207)ATGCTGTTCAACAAATGTCTGGAAAAGTTGTCCAGCTCGCTGGGCAATGTGGTCAATCACAAGCTGCAAGAGAAACAAGTCTACAACAACAACAATATCAACAATAACAATAACAATACGCTAAACAACAACAATGCCTACAACAATCAGCGAAACTTTGAGTACGAAAGAGCCATACAGGCGCACTACGGAAGCAAGGGAAGACGCTCGGAGGAGCGCGAAAGGAGCGGCAAGTTCAAGGCCAGCAAGGGTCGGAAAGCAAAGGTCACCCCACCAACGGAGACACCCGAGGCCCAGGAGCCGGCCTGCAAGAACTGTATGACCCACGACGAGCTGGCCCAGATCATAAAGGGCGTGGCCAAGGGCGCTGACGCGCAACGTAATCGAGACAACCGACTGCAGCGCAGACGTCGTCCTCTCTCCGCCCAACCCTCCGCCGCTGCCTCCGCCTCCACATCGACGGAATCTCTGCACCGTCTTACACCCAGCCCGCAGGCTTCCTACCCGGCCACGCCCACCTCCTGGACAGCCACACCGCCCCAGTTCCCAGCCGCCTTCGGCGGCGCCAGCTGCTCCAACAGCACACTGTCCCTCTTGGCCACCATGCGCGTCCAGCTCCATGGTTACACATGGTTTCATGGCAATCTTTCCGGAAAGGAAGCGGAAAAATTGATCCTGGAGCGGGGCAAGAATGGTTCGTTTCTCGTCCGTGAATCTCAGAGCAAGCCTGGCGACTTCGTCCTTTCCGTGCGCACGGACGACAAAGTAACGCATGTCATGATTCGATGGCAGGACAAGAAGTACGACGTCGGCGGCGGGGAATCCTTTGGCACCTTGTCGGAACTGATCGATCACTACAAGCGTAATCCCATGGTGGAGACGTGCGGAACCGTGGTGCATCTGCGACAGCCATTCAACGCCACACGAATCACGGCGGCCGGCATCAATGCCCGGGTGGAACAGCTGGTCAAGGGAGGTTTCTGGGAGGAATTCGAATCGCTGCAACAGGACAGTCGGGACACATTCTCGCGCAACGAGGGCTACAAACAGGAGAACCGCCTCAAGAATCGCTACCGCAACATATTGCCATACGACCACACGCGCGTCAAGCTGCTGGACGTGGAGCATAGCGTGGCCGGAGCCGAGTACATCAATGCCAACTACATACGGCTGCCCACCGACGGCGACCTGTACAACATGAGCAGCTCGTCGGAGAGCCTGAACAGCTCGGTGCCCTCGTGCCCCGCCTGCACGGCTGCCCAGACACAGCGGAACTGCTCCAACTGCCAGCTGCAAAACAAGACGTGCGTGCAGTGCGCCGTGAAGAGCGCCATTCTGCCGTATAGCAACTGTGCCACCTGCAGCCGCAAGTCAGACTCCCTGAGCAAGCACAAGCGGAGCGAATCCTCGGCCTCTTCATCGCCCTCCTCCGGCTCTGGGTCCGGACCAGGATCGTCGGGCACCAGCGGAGTGAGCAGCGTCAATGGACCCGGCACACCCACCAATCTCACGAGCGGCACAGCCGGATGTCTGGTCGGCCTGCTGAAGAGACACTCGAACGACTCGTCCGGAGCTGTTTCTATATCGATGGCCGAACGGGAACGCGAGAGGGAGCGCGAGATGTTTAAGACCTACATCGCCACCCAGGGCTGTCTGCTCACCCAGCAAGTGAACACGGTGACGGACTTCTGGAACATGGTCTGGCAGGAGAACACGCGGGTGATCGTCATGACCACCAAGGAGTACGAGCGCGGCAAAGAAAAGTGCGCCCGCTACTGGCCGGACGAGGGTAGATCGGAGCAGTTCGGCCACGCGCGGATACAGTGCGTCTCGGAGAACTCGACCAGTGACTATACGCTGCGCGAGTTCCTCGTCTCGTGGCGGGATCAGCCGGCGCGCCGGATCTTTCACTACCATTTCCAGGTGTGGCCGGATCACGGAGTGCCCGCCGATCCGGGCTGTGTGCTCAACTTCCTGCAAGATGT
CAACACGCGTCAGAGTCACCTGGCTCAAGCGGGCGAGAAGCCGGGTCCGATCTGCGTGCACTGCTCTGCGGGCATCGGTCGCACTGGCACCTTTATTGTGATCGATATGATTCTCGATCAGATTGTGCGCAATGGATTGGATACTGAAATCGACATCCAGCGCACCATTCAGATGGTCCGATCGCAGCGTTCCGGTCTTGTGCAAACCGAGGCGCAATACAAGTTCGTCTACTATGCGGTGCAGCACTATATACAGACCCTGATCGCCCGGAAACGAGCTGAGGAGCAGAGCCTGCAGGTTGGCCGCGAGTACACCAATATAAAGTACACGGGCGAAATTGGAAACGATTCACAAAGATCTCCATTACCACCAGCAATTTCTAGCATAAGTTTAGTTCCGAGTAAGACGCCACTGACGCCGACATCGGCGGATTTGGGCACTGGGATGGGCCTAAGCATGGGCGTGGGCATGGGCGTCGGCAACAAGCACGCATCGAAGCAGCAGCCGCCGTTGCCGGTGGTCAACTGCAACAATAATAACAACGGCATTGGCAATAGCGGCTGCAGCAACGGCGGCGGGAGCAGCACCACCAGCAGCAGCAACGGCAGCAGCAACGGTAACATCAACGCCCTACTGGGCGGCATCGGCTTGGGGCTGGGCGGCAATATGCGCAAGTCGAACTTTTACAGCGACTCGCTGAAGCAGCAACAGCAGCGCGAGGAGCAGGCTCCGGCGGGAGCAGGTAAGATGCAGCAGCCGGCGCCGCCGCTGCGACCGCGTCCTGGAATACTCAAGTTGCTCACCAGTCCCGTCATCTTTCAGCAAAATTCAAAAACATTCCCAAAGACATGA(SEQ ID NO:208)MLFNKCLEKLSSSLGNVVNHKLQEKQVYNNNNINNNNNNTLNNNNAYNNQRNFEYERAIQAHYGSKGRRSEERERSGKFKASKGRKAKVTPPTETPEAQEPACKNCMTHDELAQIIKGVAKGADAQRNRDNRLQRRRRPLSAQPSAAASASTSTESLHRLTPSPQASYPATPTSWTATPPQFPAAFGGASCSNSTLSLLATMRVQLHGYTWFHGNLSGKEAEKLILERGKNGSFLVRESQSKPGDFVLSVRTDDKVTHVMIRWQDKKYDVGGGESFGTLSELIDHYKRNPMVETCGTVVHLRQPFNATRITAAGINARVEQLVKGGFWEEFESLQQDSRDTFSRNEGYKQENRLKNRYRNILPYDHTRVKLLDVEHSVAGAEYINANYIRLPTDGDLYNMSSSSESLNSSVPSCPACTAAQTQRNCSNCQLQNKTCVQCAVKSAILPYSNCATCSRKSDSLSKHKRSESSASSSPSSGSGSGPGSSGTSGVSSVNGPGTPTNLTSGTAGCLVGLLKRHSNDSSGAVSISMAEREREREREMFKTYIATQGCLLTQQVNTVTDFWNMVWQENTRVIVMTTKEYERGKEKCARYWPDEGRSEQFGHARIQCVSENSTSDYTLREFLVSWRDQPARRIFHYHFQVWPDHGVPADPGCVLNFLQDVNTRQSHLAQAGEKPGPICVHCSAGIGRTGTFIVIDMILDQIVRNGLDTEIDIQRTIQMVRSQRSGLVQTEAQYKFVYYAVQHYIQTLIARKRAEEQSLQVGREYTNIKYTGEIGNDSQRSPLPPAISSISLVPSKTPLTPTSADLGTGMGLSMGVGMGVGNKHASKQQPPLPVVNCNNNNNGIGNSGCSNGGGSSTTSSSNGSSNGNINALLGGIGLGLGGNMRKSNFYSDSLKQQQQREEQAPAGAGKMQQPAPPLRPRPGILKLLTSPVIFQQNSKTFPKT


CG3954—corkscrew. Protein tyrosine phosphatase required for cell signaling in eye development, splice variant 2

(SEQ ID NO:209)AGTAAAAAAATAGTTTTTTTTTTGTATCCAACCAACCAACTGTAAAAATAAGTTTAAACAAAGCATCTACTCATAAGTTTCATTTTTTTCCGTTAAGTGTCAACATTATTTATTTTTTAAGTGTGCATTCAATAAGAAAATGTCATCGCGAAGATGGTTCCACCCAACGATATCTGGCATCGAAGCTGAGAAACTGCTGCAGGAGCAGGGATTCGACGGCTCCTTCCTCGCCCGCCTCTCCTCCTCGAATCCGGGCGCCTTCACGCTCTCCGTGCGCCGCGGCAACGAGGTGACCCACATCAAAATCCAAAACAATGGCGACTTCTTTGATCTCTACGGTGGTGAAAAGTTCGCCACACTGCCGGAACTGGTACAATACTACATGGAGAATGGCGAGCTAAAGGAGAAGAACGGCCAGGCCATCGAACTCAAGCAGCCGCTGATCTGCGCCGAGCCCACCACGGAAAGATGGTTTCATGGCAATCTTTCCGGAAAGGAAGCGGAAAAATTGATCCTGGAGCGGGGCAAGAATGGTTCGTTTCTCGTCCGTGAATCTCAGAGCAAGCCTGGCGACTTCGTCCTTTCCGTGCGCACGGACGACAAAGTAACGCATGTCATGATTCGATGGCAGGACAAGAAGTACGACGTCGGCGGCGGGGAATCCTTTGGCACCTTGTCGGAACTGATCGATCACTACAAGCGTAATCCCATGGTGGAGACGTGCGGAACCGTGGTGCATCTGCGACAGCCATTCAACGCCACACGAATCACGGCGGCCGGCATCAATGCCCGGGTGGAACAGCTGGTCAAGGGAGGTTTCTGGGAGGAATTCGAATCGCTGCAACAGGACAGTCGGGACACATTCTCGCGCAACGAGGGCTACAAACAGGAGAACCGCCTCAAGAATCGCTACCGCAACATATTGCCATACGACCACACGCGCGTCAAGCTGCTGGACGTGGAGCATAGCGTGGCCGGAGCCGAGTACATCAATGCCAACTACATACGGCTGCCCACCGACGGCGACCTGTACAACATGAGCAGCTCGTCGGAGAGCCTGAACAGCTCGGTGCCCTCGTGCCCCGCCTGCACGGCTGCCCAGACACAGCGGAACTGCTCCAACTGCCAGCTGCAAAACAAGACGTGCGTGCAGTGCGCCGTGAAGAGCGCCATTCTGCCGTATAGCAACTGTGCCACCTGCAGCCGCAAGTCAGACTCCCTGAGCAAGCACAAGCGGAGCGAATCCTCGGCCTCTTCATCGCCCTCCTCCGGCTCTGGGTCCGGACCAGGATCGTCGGGCACCAGCGGAGTGAGCAGCGTCAATGGACCCGGCACACCCACCAATCTCACGAGCGGCACAGCCGGATGTCTGGTCGGCCTGCTGAAGAGACACTCGAACGACTCGTCCGGAGCTGTTTCTATATCGATGGCCGAACGGGAACGCGAGAGGGAGCGCGAGATGTTTAAGACCTACATCGCCACCCAGGGCTGTCTGCTCACCCAGCAAGTGAACACGGTGACGGACTTCTGGAACATGGTCTGGCAGGAGAACACGCGGGTGATCGTCATGACCACCAAGGAGTACGAGCGCGGCAAAGAAAAGTGCGCCCGCTACTGGCCGGACGAGGGTAGATCGGAGCAGTTCGGCCACGCGCGGATACAGTGCGTCTCGGAGAACTCGACCAGTGACTATACGCTGCGCGAGTTCCTCGTCTCGTGGCGGGATCAGCCGGCGCGCCGGATCTTTCACTACCATTTCCAGGTGTGGCCGGATCACGGAGTGCCCGCCGATCCGGGCTGTGTGCTCAACTTCCTGCAAGATGTCAACACGCGTCAGAGTCACCTGGCTCAAGCGGGCGAGAAGCCGGGTCCGATCTGCGTGCACTGCTCTGCGGGCATCGGTCGCACTGGCACCTTTATTGTGATCGATATGATTCTCGATCAGATTGTGCGCAATGGATTGGATACTGAAATCGACATCCAGC
GCACCATTCAGATGGTCCGATCGCAGCGTTCCGGTCTTGTGCAAACCGAGGCGCAATACAAGTTCGTCTACTATGCGGTGCAGCACTATATACAGACCCTGATCGCCCGGAAACGAGCTGAGGAGCAGAGCCTGCAGGTTGGCCGCGAGTACACCAATATAAAGTACACGGGCGAAATTGGAAACGATTCACAAAGATCTCCATTACCACCAGCAATTTCTAGCATAAGTTTAGTTCCGAGTAAGACGCCACTGACGCCGACATCGGCGGATTTGGGCACTGGGATGGGCCTAAGCATGGGCGTGGGCATGGGCGTCGGCAACAAGCACGCATCGAAGCAGCAGCCGCCGTTGCCGGTGGTCAACTGCAACAATAATAACAACGGCATTGGCAATAGCGGCTGCAGCAACGGCGGCGGGAGCAGCACCACCAGCAGCAGCAACGGCAGCAGCAACGGTAACATCAACGCCCTACTGGGCGGCATCGGCTTGGGGCTGGGCGGCAATATGCGCAAGTCGAACTTTTACAGCGACTCGCTGAAGCAGCAACAGCAGCGCGAGGAGCAGGCTCCGGCGGGAGCAGGTAAGATGCAGCAGCCGGCGCCGCCGCTGCGACCGCGTCCTGGAATACTCAAGTTGCTCACCAGTCCCGTCATCTTTCAGCAAAATTCAAAAACATTCCCAAAGACATGA(SEQ ID NO:210)MSSRRWFHPTISGIEAEKLLQEQGFDGSFLARLSSSNPGAFTLSVRRGNEVTHIKIQNNGDFFDLYGGEKFATLPELVQYYMENGELKEKNGQAIELKQPLICAEPTTERWFHGNLSGKEAEKLILERGKNGSFLVRESQSKPGDFVLSVRTDDKVTHVMIRWQDKKYDVGGGESFGTLSELIDHYKRNPMVETCGTVVHLRQPFNATRITAAGINARVEQLVKGGFWEEFESLQQDSRDTFSRNEGYKQENRLKNRYRNILPYDHTRVKLLDVEHSVAGAEYINANYIRLPTDGDLYNMSSSSESLNSSVPSCPACTAAQTQRNCSNCQLQNKTCVQCAVKSAILPYSNCATCSRKSDSLSKHKRSESSASSSPSSGSGSGPGSSGTSGVSSVNGPGTPTNLTSGTAGCLVGLLKRHSNDSSGAVSISMAEREREREREMFKTYIATQGCLLTQQVNTVTDFWNMVWQENTRVIVMTTKEYERGKEKCARYWPDEGRSEQFGHARIQCVSENSTSDYTLREFLVSWRDQPARRIFHYHFQVWPDHGVPADPGCVLNFLQDVNTRQSHLAQAGEKPGPICVHCSAGIGRTGTFIVIDMILDQIVRNGLDTEIDIQRTIQMVRSQRSGLVQTEAQYKFVYYAVQHYIQTLIARKRAEEQSLQVGREYTNIKYTGEIGNDSQRSPLPPAISSISLVPSKTPLTPTSADLGTGMGLSMGVGMGVGNKHASKQQPPLPVVNCNNNNNGIGNSGCSNGGGSSTTSSSNGSSNGNINALLGGIGLGLGGNMRKSNFYSDSLKQQQQREEQAPAGAGKMQQPAPPLRPRPGILKLLTSPVIFQQNSKTFPKT


CG16903—cyclin/non-specific RNA polymerase II transcription factor

(SEQ ID NO:211)ATTTAGTATAAAAGCACGCCTGTTATCGGCTAAATTTACAAAAAAAAAGGGAAAATTAAAAAATTAAAACACTTAAATAAACGCTTTCCTGGGTTAACCGCGCACGAATGGCCACCCGTGGGGCCGGCTCGACTGTGGTCCACACGACGGTGACAGCGCTGACGGTGGAGACGATCACCAATGTCCTGACCACGGTGACTTCGTTCCATTCGAACAGCGTCAACATTTCGAACAACAACAGCAGCAGTGGAGCGGCCCCGGGGGCGGATGCAGCTGGCGGCGATGCAGGGGGCGTGGCAGCGGCTCAGGCGGACGCCAACAAGCCTATCTATCCTCGGCTCTTTAACCGCATCGTGCTGACGCTGGAGAACAGCCTCATTCCGGAGGGCAAAATCGATGTGACGCCATCCAGCCAGGATGGACTGGACCATGAGACGGAGAAGGACCTGCGCATACTGGGCTGCGAGCTTATTCAGACAGCCGGAATTTTGCTGCGCTTGCCGCAGGTTGCCATGGCCACCGGCCAGGTGCTGTTCCAGCGCTTCTTCTACTCGAAGAGCTTTGTGCGGCACAACATGGAGACTGTGGCCATGAGCTGCGTGTGCCTGGCGTCCAAGATCGAGGAGGCGCCGCGCCGCATTAGAGACGTGATCAATGTGTTCCATCACATCAAGCAAGTGCGGGCCCAAAAGGAAATCTCGCCCATGGTGCTAGATCCTTACTACACGAACCTCAAGATGCAGGTGATCAAGGCCGAGCGGCGCGTCCTCAAGGAACTGGGCTTCTGTGTACACGTGAAGCATCCGCACAAGCTGATCGTGATGTATCTGCAGGTGCTTCAGTACGAGAAGCACGAGAAGCTGATGCAGCTCTCCTGGAACTTCATGAATGACTCGCTGAGGACGGACGTTTTTATGCGCTACACACCAGAGGCGATTGCATGCGCCTGCATCTACCTGAGTGCCCGCAAGCTCAACATACCTCTGCCCAACAGCCCGCCGTGGTTCGGCATTTTTCGGGTGCCCATGGCGGACATTACGGATATCTGCTACCGTGTGATGGAGCTGTACATGCGTTCCAAGCCGGTGGTGGAGAAACTGGAGGCGGCCGTGGACGAGCTGAAAAAGCGGTACATTGATGCGCGCAACAAAACGAAGGAGGCAAACACACCGCCGGCTGTAATCACCGTGGATCGGAACAATGGCTCGCACAATGCGTGGGGTGGCTTCATCCAGCGTGCTATCCCACTGCCCTTGCCATCGGAAAAGTCGCCGCAAAAGGATTCGAGGTCACGCTCGCGATCCAGGACGCGCACCCATTCGCGGACACCTCGCTCCCGATCACCCAGGTCCAGGTCGCCTAGTCGCGAGCGCACTAAGAAGACCCACCGCAGTCGATCCTCCCGCTCGCGCTCCCGTTCGCCGCCGAAGCATAAGAAAAAGTCACGTCACTACTCGAGGTCGCCCACGCGCTCCAATTCGCCGCACAGCAAGCACAGGAAGTCGAAATCCTCGCGAGAACGCTCTGAATACTACTCCAAGAAAGATCGGTCTGGAAACCCAGGCAGTAGCAATAATCTAGGTGATGGCGACAAGTATCGCAACTCCGTCTCCAATTCCGGCAAGCACAGTCGGTACTCCTCCTCCTCGTCGCGTCGGAACAGCGGTGGTGGTGGAGACGGAAGAAGCGGAGGAGGAGGTGGTGGCGGCGGTGGAGGCAACGGGAACCACGGCAGCCGAGGGGGGCACAAGCATCGGGATGGCGATCGCTCCAGGGATCGCAAGCGCTAGTGATTGATAGACAAGCGAGACAAACACTCCCTTATATTTAATTGCTCTTTATTTTACAAATTTACAGATTATTTCTACCGATTTAGTAATGCTAATGTGTATTGAAAAAACGAACGCGGGTAAACAATAAATGTAACTCTTCAATC(SEQ ID 
NO:212)MATRGAGSTVVHTTVTALTVETITNVLTTVTSFHSNSVNISNNNSSSGAAPGADAAGGDAGGVAAAQADANKPIYPRLFNRIVLTLENSLIPEGKIDVTPSSQDGLDHETEKDLRILGCELIQTAGILLRLPQVAMATGQVLFQRFFYSKSFVRHNMETVAMSCVCLASKIEEAPRRIRDVINVFHHIKQVRAQKEISPMVLDPYYTNLKMQVIKAERRVLKELGFCVHVKHPHKLIVMYLQVLQYEKHEKLMQLSWNFMNDSLRTDVFMRYTPEAIACACIYLSARKLNIPLPNSPPWFGIFRVPMADITDICYRVMELYMRSKPVVEKLEAAVDELKKRYIDARNKTKEANTPPAVITVDRNNGSHNAWGGFIQRAIPLPLPSEKSPQKDSRSRSRSRTRTHSRTPRSRSPRSRSPSRERTKKTHRSRSSRSRSRSPPKHKKKSRHYSRSPTRSNSPHSKHRKSKSSRERSEYYSKKDRSGNPGSSNNLGDGDKYRNSVSNSGKHSRYSSSSSRRNSGGGGDGRSGGGGGGGGGGNGNHGSRGGHKHRDGDRSRDRKR


Human homologue of Complete Genome candidate


The CG3954 homologue is Homo sapiens protein tyrosine phosphatase, non-receptor type 11 (PTPN11), also known as Shp2. Shp2 has 2 alternative transcripts, with accession numbers NM002834 and NM080601.


NM002834 Homo sapiens protein tyrosine phosphatase, non-receptor type 11 (PTPN11), transcript variant 1, mRNA, also known as Shp2.

(SEQ ID NO:213)1cggccgcggt ttccaggagg aagcaaggat gctttggaca ctgtgcgtgg cgcctccgcg61gagcccccgc gctgccattc ccggccgtcg ctcggtcctc cgctgacggg aagcaggaag121tggcggcggg cgtcgcgagc ggtgacatca cgggggcgac ggcggcgaag ggcgggggcg181gaggaggagc gagccgggcc ggggggcagc tgcacagtct ccgggatccc caggcctgga241ggggggtctg tgcgcggccg gctggctctg ccccgcgtcc ggtcccgagc gggcctccct301cgggccagcc cgatgtgacc gagcccagcg gagcctgagc aaggagcggg tccgtcgcgg361agccggaggg cgggaggaac atgacatcgc ggagatggtt tcacccaaat atcactggtg421tggaggcaga aaacctactg ttgacaagag gagttgatgg cagttttttg gcaaggccta481gtaaaagtaa ccctggagac ttcacacttt ccgttagaag aaatggagct gtcacccaca541tcaagattca gaacactggt gattactatg acctgtatgg aggggagaaa tttgccactt601tggctgagtt ggtccagtat tacatggaac atcacgggca attaaaagag aagaatggag661atgtcattga gcttaaatat cctctgaact gtgcagatcc tacctctgaa aggtggtttc721atggacatct ctctgggaaa gaagcagaga aattattaac tgaaaaagga aaacatggta781gttttcttgt acgagagagc cagagccacc ctggagattt tgttctttct gtgcgcactg841gtgatgacaa aggggagagc aatgacggca agtctaaagt gacccatgtt atgattcgct901gtcaggaact gaaatacgac gttggtggag gagaacggtt tgattctttg acagatcttg961tggaacatta taagaagaat cctatggtgg aaacattggg tacagtacta caactcaagc1021agccccttaa cacgactcgt ataaatgctg ctgaaataga aagcagagtt cgagaactaa1081gcaaattagc tgagaccaca gataaagtca aacaaggctt ttgggaagaa tttgagacac1141tacaacaaca ggagtgcaaa cttctctaca gccgaaaaga gggtcaaagg caagaaaaca1201aaaacaaaaa tagatataaa aacatcctgc cctttgatca taccagggtt gtcctacacg1261atggtgatcc caatgagcct gtttcagatt acatcaatgc aaatatcatc atgcctgaat1321ttgaaaccaa gtgcaacaat tcaaagccca aaaagagtta cattgccaca caaggctgcc1381tgcaaaacac ggtgaatgac ttttggcgga tggtgttcca agaaaactcc cgagtgattg1441tcatgacaac gaaagaagtg gagagaggaa agagtaaatg tgtcaaatac tggcctgatg1501agtatgctct aaaagaatat ggcgtcatgc gtgttaggaa cgtcaaagaa agcgccgctc1561atgactatac gctaagagaa cttaaacttt caaaggttgg acaagggaat acggagagaa1621cggtctggca ataccacttt cggacctggc cggaccacgg cgtgcccagc gaccctgggg1681gcgtgctgga cttcctggag gaggtgcacc ataagcagga gagcatcatg 
gatgcagggc1741cggtcgtggt gcactgcagt gctggaattg gccggacagg gacgttcatt gtgattgata1801ttcttattga catcatcaga gagaaaggtg ttgactgcga tattgacgtt cccaaaacca1861tccagatggt gcggtctcag aggtcaggga tggtccagac agaagcacag taccgattta1921tctatatggc ggtccagcat tatattgaaa cactacagcg caggattgaa gaagagcaga1981aaagaaagag gaaagggcac gaatatacaa atattaagta ttctctagcg gaccagacga2041gtggagatca gagccctctc ccgccttgta ctccaacgcc accctgtgca gaaatgagag2101aagacagtgc tagagtctat gaaaacgtgg gcctgatgca acagcagaaa agtttcagat2161gagaaaacct gccaaaactt cagcacagaa atagatgtgg actttcaccc tctccctaaa2221aagatcaaga acagacgcaa gaaagtttat gtgaagacag aatttggatt tggaaggctt2281gcaatgtggt tgactacctt ttgataagca aaatttgaaa ccatttaaag accactgtat2341tttaactcaa caatacctgc ttcccaatta ctcatttcct cagataagaa gaaatcatct2401ctacaatgta gacaacatta tattttatag aatttgtttg aaattgagga agcagttaaa2461ttgtgcgctg tattttgcag attatgggga ttcaaattct agtaataggc ttttttattt2521ttatttttat acccttaacc agtttaattt tttttttcct cattgttggg gatgatgaga2581agaaatgatt tgggaaaatt aagtaacaac gacctagaaa agtgagaaca atctcattta2641ccatcatgta tccagtagtg gataattcat tttgatggct tctatttttg gccaaatgag2701aattaagcca gtgcctgaga ctgtcagaag ttgacctttg cactggcatt aaagagtcat2761agaaaaaa(SEQ ID NO:214)MTSRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGDFTLSVRRNGAVTHIKIQNTGDYYDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELKYPLNCADPTSERWFHGHLSGKEAEKLLTEKGKHGSFLVRESQSHPGDFVLSVRTGDDKGESNDGKSKVTHVMIRCQELKYDVGGGERFDSLTDLVEHYKKNPMVETLGTVLQLKQPLNTTRINAAEIESRVRELSKLAETTDKVKQGFWEEFETLQQQECKLLYSRKEGQRQENKNKNRYKNILPFDHTRVVLHDGDPNEPVSDYINANIIMPEFETKCNNSKPKKSYIATQGCLQNTVNDFWRMVFQENSRVIVMTTKEVERGKSKCVKYWPDEYALKEYGVMRVRNVKESAAHDYTLRELKLSKVGQGNTERTVWQYHFRTWPDHGVPSDPGGVLDFLEEVHHKQESIMDAGPVVVHCSAGIGRTGTFIVIDILIDIIREKGVDCDIDVPKTIQMVRSQRSGMVQTEAQYRFIYMAVQHYIETLQRRIEEEQKRKRKGHEYTNIKYSLADQTSGDQSPLPPCTPTPPCAEMREDSARVYENVGLMQQQKSFR


NM080601 Homo sapiens protein tyrosine phosphatase, non-receptor type 11 (PTPN11), transcript variant 2, mRNA (version 1)

(SEQ ID NO:215)1gcggaggagg agcgagccgg gccggggggc agctgcacag tctccgggat ccccaggcct61ggaggggggt ctgtgcgcgg ccggctggct ctgccccgcg tccggtcccg agcgggcctc121cctcgggcca gcccgatgtg accgagccca gcggagcctg agcaaggagc gggtccgtcg181cggagccgga gggcgggagg aacatgacat cgcggagatg gtttcaccca aatatcactg241gtgtggaggc agaaaaccta ctgttgacaa gaggagttga tggcagtttt ttggcaaggc301ctagtaaaag taaccctgga gacttcacac tttccgttag aagaaatgga gctgtcaccc361acatcaagat tcagaacact ggtgattact atgacctgta tggaggggag aaatttgcca421ctttggctga gttggtccag tattacatgg aacatcacgg gcaattaaaa gagaagaatg481gagatgtcat tgagcttaaa tatcctctga actgtgcaga tcctacctct gaaaggtggt541ttcatggaca tctctctggg aaagaagcag agaaattatt aactgaaaaa ggaaaacatg601gtagttttct tgtacgagag agccagagcc accctggaga ttttgttctt tctgtgcgca661ctggtgatga caaaggggag agcaatgacg gcaagtctaa agtgacccat gttatgattc721gctgtcagga actgaaatac gacgttggtg gaggagaacg gtttgattct ttgacagatc781ttgtggaaca ttataagaag aatcctatgg tggaaacatt gggtacagta ctacaactca841agcagcccct taacacgact cgtataaatg ctgctgaaat agaaagcaga gttcgagaac901taagcaaatt agctgagacc acagataaag tcaaacaagg cttttgggaa gaatttgaga961cactacaaca acaggagtgc aaacttctct acagccgaaa agagggtcaa aggcaagaaa1021acaaaaacaa aaatagatat aaaaacatcc tgccctttga tcataccagg gttgtcctac1081acgatggtga tcccaatgag cctgtttcag attacatcaa tgcaaatatc atcatgcctg1141aatttgaaac caagtgcaac aattcaaagc ccaaaaagag ttacattgcc acacaaggct1201gcctgcaaaa cacggtgaat gacttttggc ggatggtgtt ccaagaaaac tcccgagtga1261ttgtcatgac aacgaaagaa gtggagagag gaaagagtaa atgtgtcaaa tactggcctg1321atgagtatgc tctaaaagaa tatggcgtca tgcgtgttag gaacgtcaaa gaaagcgccg1381ctcatgacta tacgctaaga gaacttaaac tttcaaaggt tggacaaggg aatacggaga1441gaacggtctg gcaataccac tttcggacct ggccggacca cggcgtgccc agcgaccctg1501ggggcgtgct ggacttcctg gaggaggtgc accataagca ggagagcatc atggatgcag1561ggccggtcgt ggtgcactgc aggtgacagc tcctgctgcc cctctaggcc acagcctgtc1621cctgtctcct agcgcccagg gcttgctttt acctacccac tcctagctct ttaactgtag1681gaagaattta atatctgttt gaggcataga gcaactgcat tgagggacat 
tttgatccca1741aggcatattt ctcctagacc ctacagcact gccattggcc atggccatgg caacatgctc1801agttaaaaca gcaaagacta agtcagcatt atctctgagt ccaccagaag ttgtgcatta1861aacaacttca tcctggaaaa aaaaaaaaaa aa(SEQ ID NO:216)1mtsrrwfhpn itgveaenll ltrgvdgsfl arpsksnpgd ftlsvrrnga vthikiqntg61dyydlyggek fatlaelvqy ymehhgqlke kngdvielky plncadptse rwfhghlsgk121eaeklltekg khgsflvres qshpgdfvls vrtgddkges ndgkskvthv mircqelkyd181vgggerfdsl tdlvehykkn pmvetlgtvl qlkqplnttr inaaeiesrv relsklaett241dkvkqgfwee fetlqqqeck llysrkegqr qenknknryk nilpfdhtrv vlhdgdpnep301vsdyinanii mpefetkcnn skpkksyiat qgclqntvnd fwrmvfqens rvivmttkev361ergkskcvky wpdeyalkey gvmrvrnvke saahdytlre lklskvgqgn tertvwqyhf421rtwpdhgvps dpggvldfle evhhkqesim dagpvvvhcr


NM080601 Homo sapiens protein tyrosine phosphatase, non-receptor type 11 (PTPN11), transcript variant 2, mRNA (version 2)

(SEQ ID NO:217)1cggccgcggt ttccaggagg aagcaaggat gctttggacactgtgcgtgg cgcctccgcg61gagcccccgc gctgccattc ccggccgtcg ctcggtcctccgctgacggg aagcaggaag121tggcggcggg cgtcgcgagc ggtgacatca cgggggcgacggcggcgaag ggcgggggcg181gaggaggagc gagccgggcc ggggggcagc tgcacagtctccgggatccc caggcctgga241ggggggtctg tgcgcggccg gctggctctg ccccgcgtccggtcccgagc gggcctccct301cgggccagcc cgatgtgacc gagcccagcg gagcctgagcaaggagcggg tccgtcgcgg361agccggaggg cgggaggaac atgacatcgc ggagatggtttcacccaaat atcactggtg421tggaggcaga aaacctactg ttgacaagag gagttgatggcagttttttg gcaaggccta481gtaaaagtaa ccctggagac ttcacacttt ccgttagaagaaatggagct gtcacccaca541tcaagattca gaacactggt gattactatg acctgtatggaggggagaaa tttgccactt601tggctgagtt ggtccagtat tacatggaac atcacgggcaattaaaagag aagaatggag661atgtcattga gcttaaatat cctctgaact gtgcagatcctacctctgaa aggtggtttc721atggacatct ctctgggaaa gaagcagaga aattattaactgaaaaagga aaacatggta781gttttcttgt acgagagagc cagagccacc ctggagattttgttctttct gtgcgcactg841gtgatgacaa aggggagagc aatgacggca agtctaaagtgacccatgtt atgattcgct901gtcaggaact gaaatacgac gttggtggag gagaacggtttgattctttg acagatcttg961tggaacatta taagaagaat cctatggtgg aaacattgggtacagtacta caactcaagc1021agccccttaa cacgactcgt ataaatgctg ctgaaatagaaagcagagtt cgagaactaa1081gcaaattagc tgagaccaca gataaagtca aacaaggcttttgggaagaa tttgagacac1141tacaacaaca ggagtgcaaa cttctctaca gccgaaaagagggtcaaagg caagaaaaca1201aaaacaaaaa tagatataaa aacatcctgc cctttgatcataccagggtt gtcctacacg1261atggtgatcc caatgagcct gtttcagatt acatcaatgcaaatatcatc atgcctgaat1321ttgaaaccaa gtgcaacaat tcaaagccca aaaagagttacattgccaca caaggctgcc1381tgcaaaacac ggtgaatgac ttttggcgga tggtgttccaagaaaactcc cgagtgattg1441tcatgacaac gaaagaagtg gagagaggaa agagtaaatgtgtcaaatac tggcctgatg1501agtatgctct aaaagaatat ggcgtcatgc gtgttaggaacgtcaaagaa agcgccgctc1561atgactatac gctaagagaa cttaaacttt caaaggttggacaagggaat acggagagaa1621cggtctggca ataccacttt cggacctggc cggaccacggcgtgcccagc gaccctgggg1681gcgtgctgga cttcctggag gaggtgcacc ataagcaggagagcatcatg gatgcagggc1741cggtcgtggt gcactgcagg 
tgacagctcc tgctgcccctctaggccaca gcctgtccct1801gtctcctagc gcccagggct tgcttttacc tacccactcctagctcttta actgtaggaa1861gaatttaata tctgtttgag gcatagagca actgcattgagggacatttt gatcccaagg1921catatttctc ctagacccta cagcactgcc attggccatggccatggcaa catgctcagt1981taaaacagca aagactaagt cagcattatc tctgagtccaccagaagttg tgcattaaac2041aacttcatcc tggaaaaaaa aaaaaaaaa(SEQ ID NO:218)MTSRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGDFTLSVRRNGAVTHIKIQNTGDYYDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELKYPLNCADPTSERWFHGHLSGKEAEKLLTEKGKHGSFLVRESQSHPGDFVLSVRTGDDKGESNDGKSKVTHVMIRCQELKYDVGGGERFDSLTDLVEHYKKNPMVETLGTVLQLKQPLNTTRINAAEIESRVRELSKLAETTDKVKQGFWEEFETLQQQECKLLYSRKEGQRQENKNKNRYKNILPFDHTRVVLHDGDPNEPVSDYINANIIMPEFETKCNNSKPKKSYIATQGCLQNTVNDFWRMVFQENSRVIVMTTKEVERGKSKCVKYWPDEYALKEYGVMRVRNVKESAAHDYTLRELKLSKVGQGNTERTVWQYHFRTWPDHGVPSDPGGVLDFLEEVHHKQESIMDAGPVVVHCR


Putative function


(CG3954)—protein tyrosine phosphatase


(CG16903)—cyclin, potentially involved in differentiation and neural plasticity


Example 19B
Validation of GENE Function by RNA Interference (RNAi) Knockdown in Drosophila Cultured Cells

To confirm the mitotic role of the target protein, knockdown of Corkscrew (CG3954) expression is performed in cultured Drosophila Dmel-2 cells using a double stranded RNA (dsRNA) from within the Corkscrew (CG3954) CDS corresponding to the following CDS sequence:

(SEQ ID NO:219)GCCGAGTACATCAATGCCAACTACATACGGCTGCCCACCGACGGCGACCTGTACAACATGAGCAGCTCGTCGGAGAGCCTGAACAGCTCGGTGCCCTCGTGCCCCGCCTGCACGGCTGCCCAGACACAGCGGAACTGCTCCAACTGCCAGCTGCAAAACAAGACGTGCGTGCAGTGCGCCGTGAAGAGCGCCATTCTGCCGTATAGCAACTGTGCCACCTGCAGCCGCAAGTCAGACTCCCTGAGCAAGCACAAGCGGAGCGAATCCTCGGCCTCTTCATCGCCCTCCTCCGGCTCTGGGTCCGGACCAGGATCGTCGGGCACCAGCGGAGTGAGCAGCGTCAATGGACCCGGCACACCCACCAATCTCACGAGCGGCACAGCCGGATGTCTGGTCGGCCTGCTGAAGAGACACTCGAACGACTCGTCCGGAGCTGTTTCTATATCGATGGCCGAACGGGAACGCGAGAGGGAGCGCGAGATGTTTAAGACCTACATCGCCACCCA


dsRNA is prepared by annealing complementary RNAs made by in vitro transcription from a PCR fragment created with the following PCR primers:

(SEQ ID NO:220)TAATACGACTCACTATAGGGAGAGCCGAGTACATCAATGCCAACTACAT(SEQ ID NO:221)TAATACGACTCACTATAGGGAGATGGGTGGCGATGTAGGTCTTAAACAT
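The relationship between the two primers and the dsRNA target can be checked with a short script. This is a minimal sketch, not part of the original method: the variable and function names are illustrative, and it assumes that the 23-nucleotide stretch shared by both primers is the T7 RNA polymerase promoter tail used for in vitro transcription. Only the two ends of SEQ ID NO:219 are reproduced, since those are what the primers must match.

```python
# Sequences are copied from the text above; all names are illustrative.
T7_PREFIX = "TAATACGACTCACTATAGGGAGA"  # 5' stretch shared by both primers (assumed T7 promoter tail)
FWD_PRIMER = "TAATACGACTCACTATAGGGAGAGCCGAGTACATCAATGCCAACTACAT"  # SEQ ID NO:220
REV_PRIMER = "TAATACGACTCACTATAGGGAGATGGGTGGCGATGTAGGTCTTAAACAT"  # SEQ ID NO:221

# Only the 5' and 3' ends of the dsRNA target (SEQ ID NO:219) are needed here.
TARGET_5_END = "GCCGAGTACATCAATGCCAACTACATACGGCTGCCCACC"
TARGET_3_END = "GAGATGTTTAAGACCTACATCGCCACCCA"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gene_specific_part(primer: str) -> str:
    """Strip the shared 5' prefix and return the gene-specific 3' portion."""
    if not primer.startswith(T7_PREFIX):
        raise ValueError("primer does not start with the shared 5' prefix")
    return primer[len(T7_PREFIX):]


# The forward primer should match the start of the target on the sense strand.
fwd_ok = TARGET_5_END.startswith(gene_specific_part(FWD_PRIMER))
# The reverse primer should match the end of the target on the opposite strand.
rev_ok = TARGET_3_END.endswith(reverse_complement(gene_specific_part(REV_PRIMER)))

print("forward primer matches 5' end of target:", fwd_ok)
print("reverse primer matches 3' end of target:", rev_ok)
```

Because both primers carry the same 5′ prefix, both strands of the PCR fragment can be transcribed, which is consistent with annealing the two transcripts into dsRNA.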


Cells are transfected with double stranded RNA in the presence of ‘Transfast’ transfection reagent. A control transfection of a non-endogenous RNA corresponding to RFP (red fluorescent protein) is carried out in parallel.


Analysis of Corkscrew CG3954 Knockdown by RNAi in D-Mel2 cells by Cellomics Mitotic Index Assay


For the transfection, 1 μg dsRNA is added to a well of a 96-well Packard viewplate and 35 μl of logarithmically growing DMel-2 cells diluted to 2.3×10⁵ cells/ml in fresh Drosophila-SFM/glutamine/Pen-Strep are added. Cells are incubated with the dsRNA (60 nM) in a humid chamber at 28° C. for 1 hr before addition of 100 μl Drosophila-SFM/glutamine/Pen-Strep. Cells are incubated at 28° C. for 72 hours before analysis. For the assay, cells are fixed and stained using the Cellomics Mitotic Index HitKit following the manufacturer's instructions. The mitotic index of cells in each well is determined using the ArrayScan HCS System, running the Application protocol Mike250502_Polgen_MitoticIndex10×_p2.0 with the 10× objective and the DualBGlp filter set. This automated screening system detects the levels of a specific antigen (phosphorylated histone H3), which is only detectable during mitosis while the chromosomes are condensed.
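The seeding figures in the 96-well protocol can be turned into a cell count per well with simple arithmetic. The sketch below is illustrative only; the function name is hypothetical and the inputs are the volumes and density stated above.

```python
def cells_seeded(volume_ul: float, density_cells_per_ml: float) -> float:
    """Number of cells delivered by a given volume at a given density."""
    return volume_ul / 1000.0 * density_cells_per_ml


# 35 ul of cells at 2.3 x 10^5 cells/ml, as stated in the protocol.
cells_per_well = cells_seeded(35, 2.3e5)

# Volume of cells and medium in the well after the 100 ul top-up
# (the volume contributed by the dsRNA itself is not stated and is ignored).
final_volume_ul = 35 + 100

print(f"cells per well: {round(cells_per_well)}")
print(f"cell suspension plus medium: {final_volume_ul} ul")
```

On these figures each well receives about 8,050 cells, and the cell suspension plus added medium comes to 135 μl.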


Results for Corkscrew (CG3954) are shown in FIG. 1. A reproducible and significant reduction in mitotic index is observed in this assay, indicating a reduction in the number of cells able to exit S-phase and enter mitosis after RNAi.


Analysis of Corkscrew CG3954 Knockdown by RNAi in D-Mel2 Cells by Microscopy


For transfection, 9 μl of Transfast reagent (Promega) is added to 3 μg gene-specific dsRNA in 500 μl Drosophila Schneider's medium (no additives) and incubated at room temperature for 15 min. For control wells an equivalent amount of RFP dsRNA is used. This mix is added to a well of a 6-well tissue culture plate containing a glass coverslip and 500 μl of Dmel-2 cells at 1×10⁶ cells/ml in Schneider's medium. After a 1 hour incubation at 28° C., 2 ml Schneider's medium+10% FCS and pen/strep solution is added and cells are incubated at 28° C. for 48 hours. Cells on the coverslip are fixed in formaldehyde and stained with antibodies which detect α-tubulin and γ-tubulin (centrosomes), and are co-stained with DAPI to detect DNA.


An increase in the number of cells with chromosomal defects (see Table 1 below) is observed upon RNAi. The phenotypes seen are aneuploidy (65% of mitoses compared to 30% in control cells), misaligned chromosomes (80% compared to 40% in control cells), and polyploidy; however, no spindle defects are observed.

dsRNA | Number of cells with chromosomal defects | Number of cells with normal mitosis | % of chromosomal defects (no. defects/total cells in mitosis)
No RNA | 135 | 314 | 39.47
RFP | 137 | 309 | 40.29
CG1725 | 186 | 87 | 68.13


Table 1 shows mitotic defects observed by microscopy after RNAi knockdown of Corkscrew (CG3954) in Dmel2 Drosophila cultured cells.


Example 19C
Shp2 is a Human Homologue of Drosophila Corkscrew CG3954

BLASTP with Drosophila Corkscrew CG3954 reveals 46% (327/700) sequence identity with the human Shp2 gene (genbank accession D13540), indicating that they are homologues. The BLASTP results are shown in FIG. 2.
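The identity figure can be reproduced from the counts quoted above. The sketch below assumes the usual convention that percent identity is the number of identical aligned positions divided by the alignment length (gap columns included); the short aligned strings are made up for illustration and are not taken from FIG. 2.

```python
def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of the alignment length.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment (hypothetical): 5 identical columns out of 7.
toy = percent_identity("MSSR-KW", "MTSRAKW")

# Counts quoted in the text: 327 identities over 700 aligned positions.
reported = round(100.0 * 327 / 700, 1)
print(toy, reported)
```

With the quoted counts the ratio evaluates to 46.7, which the text reports as 46%.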


The sequence of the human Shp2 gene mRNA (2 splice variants) is shown in Example 19 above.


Example 19D
Validation of the Mitotic Role of the Human Homologue by siRNA Knockdown of Shp2 Expression in Human Cultured Cells

Generation of Shp2 siRNA Knockdowns


Knockdown of human Shp2 gene expression is achieved by siRNA (short interfering RNA, Elbashir et al, Nature 2001 May 24; 411(6836):494-8). We used synthetic double stranded RNAs corresponding to two different regions of the Shp2 mRNA. siRNAs are obtained from Dharmacon (our supplier). The siRNA sequences are:

COD1650 | shp2-1 siRNA | AACGUCAAAGAAAGCGCCGCU (SEQ ID NO:222) | Corresponds to nucleotides 1539-1559 in human Shp2 splice variants 1 and 2 (see Example 19 above)
COD1651 | shp2-2 siRNA | AAUUGGCCGGACAGGGACGUU (SEQ ID NO:223) | Corresponds to nucleotides 1766-1786 in human Shp2 splice variants 1 and 2 (see Example 19 above)
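A short sketch of how the siRNA sequences relate to the stated mRNA coordinates: it checks that each 21-nucleotide sequence spans exactly the quoted nucleotide range and shows how a target site could be located in a cDNA-style (T rather than U) mRNA sequence. The function names are illustrative, and the mRNA string in the usage line is a made-up placeholder, not the Shp2 sequence.

```python
# siRNA code -> (sequence, first nucleotide, last nucleotide), as listed above.
SIRNAS = {
    "COD1650": ("AACGUCAAAGAAAGCGCCGCU", 1539, 1559),
    "COD1651": ("AAUUGGCCGGACAGGGACGUU", 1766, 1786),
}


def span_matches_length(seq, start, end):
    """True if the inclusive range start..end covers exactly len(seq) nt."""
    return end - start + 1 == len(seq)


def find_target(mrna_dna, sirna_rna):
    """Return the 1-based start of the siRNA target site, or None.

    mrna_dna is the mRNA written as DNA (T instead of U), as in a
    typical database entry; sirna_rna is the sense-strand siRNA.
    """
    site = sirna_rna.upper().replace("U", "T")
    index = mrna_dna.upper().find(site)
    return index + 1 if index >= 0 else None


spans_ok = all(span_matches_length(*entry) for entry in SIRNAS.values())

# Placeholder mRNA: two arbitrary bases, the COD1650 site, two more bases.
placeholder_mrna = "GG" + SIRNAS["COD1650"][0].replace("U", "T") + "AA"
print(spans_ok, find_target(placeholder_mrna, SIRNAS["COD1650"][0]))
```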


Analysis of siRNA Hu Shp2 Knockdowns in U2OS Cells by Flow Cytometry Analysis


Cells are seeded in 6-well tissue culture dishes at 1×10⁵ cells/well in 2 ml Dulbecco's Modified Eagle's Medium (DMEM) (Sigma)+10% Foetal Bovine Serum (FBS) (Perbio), and incubated overnight (37° C./5% CO2).


For each well, 12 μl of 20 μM siRNA duplex (Dharmacon, Inc) (in RNAse-free H2O) is mixed with 200 μl of Optimem (Invitrogen). In a separate tube 8 μl of oligofectamine reagent (Invitrogen) is mixed with 52 μl of Optimem, and incubated at room temperature for 7-10 mins. The oligofectamine/Optimem mix is then added dropwise to the siRNA/Optimem mix, and this is then mixed gently, before being incubated for 15-20 mins at room temperature. During this incubation the cells are washed once with DMEM (with no FBS or antibiotics added). 600 μl of DMEM (no FBS or antibiotics) is then added to each well.


Following the 15-20 min incubation, 128 μl of Optimem is added to the siRNA/oligofectamine/Optimem mix, and this is added to the cells (in 600 μl DMEM). The transfection mix is added at the edge of each well to assist dilution before contact is made with the cells. Cells are then incubated with the transfection mix for 4 h (37° C./5% CO2). Subsequently 1 ml DMEM+20% FBS is added to each well. Cells are then incubated at 37° C./5% CO2 for 72 h. Cells are harvested by trypsinisation, washed in PBS, fixed in ice-cold 70% EtOH and stained with propidium iodide before FACS analysis.
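The volumes in this protocol imply particular working concentrations, which the following sketch works out. It assumes that volumes are simply additive and that no wash medium remains in the well; the concentrations it prints are derived from the stated volumes and are not values quoted in the text.

```python
# Volumes in microlitres, taken from the protocol above.
SIRNA_UL = 12.0              # 20 uM siRNA duplex
OPTIMEM_SIRNA_UL = 200.0     # Optimem mixed with the siRNA
OLIGOFECTAMINE_UL = 8.0
OPTIMEM_REAGENT_UL = 52.0    # Optimem mixed with oligofectamine
OPTIMEM_TOPUP_UL = 128.0     # added after complex formation
DMEM_IN_WELL_UL = 600.0      # serum-free DMEM already on the cells
DMEM_20_FBS_UL = 1000.0      # added after the 4 h incubation

SIRNA_STOCK_UM = 20.0

# 1 ul of a 1 uM solution contains 1 pmol.
sirna_pmol = SIRNA_UL * SIRNA_STOCK_UM

transfection_mix_ul = (SIRNA_UL + OPTIMEM_SIRNA_UL + OLIGOFECTAMINE_UL
                       + OPTIMEM_REAGENT_UL + OPTIMEM_TOPUP_UL)

volume_4h_ul = transfection_mix_ul + DMEM_IN_WELL_UL
volume_72h_ul = volume_4h_ul + DMEM_20_FBS_UL

# pmol per ul equals uM; multiply by 1000 to express as nM.
conc_4h_nm = 1000.0 * sirna_pmol / volume_4h_ul
conc_72h_nm = 1000.0 * sirna_pmol / volume_72h_ul

# Serum is only introduced with the final 1 ml of DMEM+20% FBS.
final_fbs_percent = 20.0 * DMEM_20_FBS_UL / volume_72h_ul

print(transfection_mix_ul, conc_4h_nm, conc_72h_nm, final_fbs_percent)
```

Under these assumptions the 400 μl transfection mix brings the well to 1 ml during the 4 h incubation, and the later addition of medium with 20% FBS restores the serum to 10%, the level used when the cells were seeded.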


siRNA Hu Shp2 knockdowns are conducted in U2OS cells. As shown in FIG. 3, major changes in the distribution of cells between cell cycle compartments (G1, S, G2/M) are seen with Shp2 siRNA COD1650, which is directed to both alternative transcripts of Shp2. An accumulation of cells in the S compartment of the cell cycle is observed, with a concomitant reduction in the G1 compartment population. This indicates that a proportion of cells may be unable to complete S-phase and enter mitosis.


Subsequent microscopic analysis is performed in order to look at phenotypes resulting from the Shp2 siRNA induced defect and check for the presence of large multinucleate cells which may, due to their size and ploidy, be excluded from the FACS analysis.


Analysis of Hu Shp2 siRNA Knockdowns in U2OS Cells by Microscopy


The transfection method for samples for microscopy is identical to that for FACS except that cells are plated in wells containing a sterile glass coverslip. Cells are incubated with siRNA for 48 hours before formaldehyde fixation and co-staining with DAPI to reveal DNA (blue) and antibodies to reveal microtubules (red) and centrosomes (green). Antibodies used are: rat anti-alpha tubulin (YL12) (supplier Serotec) with secondary antibody goat anti-rat IgG-TRITC (supplier Jackson Immunoresearch) and mouse anti-gamma-tubulin (GTU88) with secondary antibody Alexagreen488-goat anti-mouse IgG (supplier Sigma).


Phenotype analysis by microscopy is conducted on U2OS cells. Results from duplicate experiments in U2OS cells are shown in FIG. 4, and Table 2 below. After siRNA treatment no mitotic defects are seen, only a small increase in binucleate and apoptotic cells. These results are consistent with the FACS analysis, and in conjunction with the results of Corkscrew RNAi in Dmel-2 cells, they confirm that Shp2 is involved in cell cycle progression, in particular, in completing S-phase. Accordingly, modulators of Shp2 activity (as identified by the assays described above) may be used to treat any proliferative disease.

TABLE 2
Description of significant cell division defects after Shp2 siRNA in U2OS cells.
Gene/siRNA | Shp2/COD1650
Cell Type | U2OS
Polyploidy | Normal
Mitotic Defects | Normal
Main knockout phenotype | No mitotic phenotype observed
Additional observations | Increased number of binuclear cells (0.6/field compared to 0.2/field in untreated); increase in apoptotic cells


Example 19E
Expression of Recombinant Hu Shp2 Protein in Insect Cells

A cDNA encoding the Human Shp2 coding region derived by RT-PCR is inserted into the baculovirus expression vector pFastbacHTc (Life Technologies). A baculovirus stock is generated and western blot of subsequent infections of Sf9 insect cells demonstrates expression of N-terminal 6-His tagged proteins of approximately 68 kD. The recombinant protein is purified by Ni-NTA resin affinity chromatography.


Similarly, 6-His tagged Dlg proteins are expressed in bacteria by inserting cDNAs into bacterial expression plasmids pDest17 or pET series. Protein expression in cultures of host E. coli cells transformed with recombinant plasmid is induced by addition of the inducer chemical IPTG. The recombinant protein is purified by Ni-NTA resin affinity chromatography.


Example 19F
Assay for Modulators of Shp2 Activity

Shp2 is a non-transmembrane-type protein tyrosine phosphatase that participates in the signal transduction pathways of a variety of growth factors and cytokines. Shp2 binds directly to the PDGF receptor, EGF receptor, and c-KIT in response to stimulation of cells with the corresponding receptor ligand and undergoes tyrosine phosphorylation. Shp2 is implicated in PDGF-induced RAS activation and EGF stimulation of the RAS-MAP kinase cascade that leads to DNA synthesis. Corkscrew (the putative Drosophila homolog of Shp2) is thought to be required for Ras1 activation or to function in conjunction with Ras1 during signaling by the Sevenless receptor tyrosine kinase. In addition, Shp2 is implicated in insulin-dependent signaling. Shp2 does not interact directly with the insulin receptor, but it binds through its SH2 domains to tyrosine-phosphorylated docking proteins such as IRS1, IRS2, and GAB1 in response to insulin. Overall, Shp2 appears to play a role in growth factor-induced cell proliferation, through activation of the RAS-MAP kinase cascade. In addition to its role in receptor tyrosine kinase-mediated MAP kinase activation, Shp2 may play an important role, partly through its interaction with the membrane glycoprotein SHPS-1, in the activation of MAP kinase in response to the engagement of integrins by the extracellular matrix.


An assay for modulators of Shp2 activity would consist of detection of dephosphorylation of ligand proteins, or phosphotyrosyl peptides derived from ligand proteins, described above, e.g. phosphotyrosyl proteins or peptides derived from SHPS-1, IRS1 or PDGF (Takada et al 1998). Dephosphorylation of the substrate would be detected by quantifying the released inorganic phosphate, or by detecting loss of phosphate using an anti-phosphotyrosine antibody.
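One way a phosphate-release readout might be converted into a measure of modulation is sketched below. The normalisation against an uninhibited reaction and a no-enzyme blank is a common convention assumed here, not a procedure specified in the text, and the signal values are hypothetical.

```python
def percent_inhibition(signal_compound, signal_uninhibited, signal_blank):
    """Percent inhibition of phosphatase activity by a candidate substance.

    signal_compound     readout with enzyme, substrate and candidate
    signal_uninhibited  readout with enzyme and substrate only
    signal_blank        readout with substrate only (no enzyme)

    Negative results indicate that the candidate increases activity.
    """
    window = signal_uninhibited - signal_blank
    if window <= 0:
        raise ValueError("uninhibited signal must exceed the blank")
    return 100.0 * (1.0 - (signal_compound - signal_blank) / window)


# Hypothetical readouts of released inorganic phosphate (arbitrary units).
inhibition = percent_inhibition(signal_compound=0.25,
                                signal_uninhibited=0.85,
                                signal_blank=0.05)
print(round(inhibition, 1))
```

The same normalisation could be applied to an anti-phosphotyrosine antibody readout, provided the direction of the signal is inverted, since that readout falls as the substrate is dephosphorylated.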


Example 20 (Category 3)

Line ID—500


Phenotype—Viable, high mitotic index, colchicine-type overcondensed chromosomes, a few polyploid cells


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003422 (2C)


P element insertion site—247,403


Annotated Drosophila genome Complete Genome candidate CG4399—EAST

(SEQ ID NO:224)ATGTCTAGCCGGAAGGTGCCAGGAGGCTCTGGAGGAGCTGACGAATCCACAGCAGCAGCTGCCCCCCTGGATGATAATGCCAATGCCAGTGTGGAGATTCCAGACAGCAGCGAGGAGCCAGCAATGGGCGTCGGCGAAGAGATGTCTATCATAAGCAAAACACGCACCTCAACTTTGTCAGTGGAGCCCGCTAAGGAGCCAACAGTAACAGCAGAGCTGGAAGGCGAAAAAGAGCTGGAATCGAATCCAGTCTCCAAAACTCCTAGGTCCACGCCTACGCCAACCCTTACGCCAGCCGTCACGCCTACCGCCAGTGATGGAGTGGCGGCCAAGAGCGTGAGGGTTACCCGGCACTCGTCGCCACTGCTTCTGATCATCTCGCCCACGACAAGTAGACGTGAGGTCGGCGACGGAGAGCTAGACACCGAGGAACCAACGGGATCGGGTGGCCAAAGAAAGAGCTCCGTGGAGCGATCTTTGGCGCCCGTTATACGCGGACGAAAGTCCATCAAGGATCTGAAAGAAGCCAAAGAAGTCAAGTCCGAGGAGCCGCCTGCCGCAGCATCAGAGTCACGAGCTGCAAGTGGAGTGACGCCTGGCCAGGTCAAGGAACAGCATGTCGCGGATGGCAACGAAATGGAATCCTTGCCAATCACAGACAAGAAAGACCACAAAGACACAAAAGACAAGGGAGATGAGCGGGAAACCGATCAGGAGGAAGAGAAGGAAAAATCAGCTGATACAGAAATAATTGCAGATACAGAAAAAACTTCGGAGAAACAAAAGTATACAGAGAAGGACAAAGCTGCCGATAAAGATGGAGGAAAAGAAAAAGATATTGATGCAAATAAGGATATAGATAAGGAGAAGGAAAAGGTCAAGGAAGTACTTCCGCCAGTGGTGCCTATAGCACCAGTGACACCCACTTGTAACCGTGTCACACGTAAATCACATGCCCAGGAGCAGGCGATTAACACGCGGGTCACTCGCAATCGTCGCCAGTCCTCTACAGTTGGAGCCAACTCCACCGCGTCTTTGGTAGCTGCATCCTCCTCAGTAACAGAGCAACCCCCTCCATCTCGCGGTCGACGGAAGAAGCCAGTGGTGGTGGCTCCTCCCTTGGAGCCTGCGGTAAAACGGAAGCGATCGCAAGATGTTGAAGCCGACTCAGACGCCAACAACAGCACGAAATACAGCAAGGTGGAAGTGGTAAAGTCTGAGGAAGCTGAGGCACCAGAGGAGGACTCCAGTGCCGTGCCCATTAAGCAGGAATCTGTTGATGGCAACGAGGTCAGTTCTATTTCTCCAACAGTCACGCCCACACCCACACCTGCGCCAACACCAGCTCCAGTCCCGGGCAGTCGACGGGGTCGTGGGCGCCCGCAGAACAGGAACTCCTCTTCGCCTGCAACCACAACGCGGGCAACGCGGCTAAGCAAGGCGGGATCACCGGTTATCCTGACGCCAGTAGCCCAGGAACCGGCGCCACCGAAACGGCGGCGAGTCGGCTCCAGCACACGGAAGACTGTCTCGGCCAGCTCGCTGGCACCCAGCTCGCAGGGCGGCGCCGGGGATGAGGACTCCAAGGACAGTATGGCCTCGTCCATGGACGACCTGCTGATGGCCGCAGCAGATATCAAGCAGGAGAAGCTGACGCCCGATTTCGACGATAGTTTGATGCCAGAAGGCCTGCCCTCTACTTCTGGTGCGTCGAGTGCCAATGGTCATTCCTGCACCGAACCGCTTACTGTGGACACGGAAATTAATGTTAAGCCCGCTGATTCCAAAGTAAAACCAAAGGAGTCACCGGTGGTAGCAGTCGAGGAATCTCCATCACAATCCGAAACGCAATCTGCAAAGGTGTCAGCGCATGCGGGGAAGGCTCCATCTCTTAGTCCAGATATGATAAGTGAAGGCGTGAGCGCGGTCAGTGTTCGAAAGTTTTATAAGAAGCCTGAGTTCCTGGAAAACAATCTGGGCATTGA
AAAGGATCCGGAGCTAGGTGAAATCGTTCAGACGGTTAGTAACAATGACACGGAAACAGATGTGGAGATGGCTGTTGATGGCGAGGTGAATCAACCGTCAACTCCCAAGTCGCAGGATAAAAAGAAAGAGGAGCAGGAAAAGAATCAGAAATCAGGGCTAAAGGCAGCAAAGAAGGCTCCTGCTAAGTTAGAACCTAAAGCTGAAGACATTTCTGAAATTCTTACTGACGTTCCTGTTGATATTTCGACTGAGGCAGTAGAAATTATAGAAGAAGCAGAGGAAGACACTTGTTCAAATAGCTCAATCAAACCAGGTGAGCTCCGACTGGACGAGAGCAACGATGAACCTGAACTGCTTCTTGAAGACGCCCTCATAGTCAATGGTGATGAGAATGAGACACCAGATCAACCGGAGGAAAAGGAGGACCAGGTGGAGTTCTTCCATACAGGAGAATACGACGACTTTGAGCACGAGATTATGGTGGAGCTGGCGAAGGAGGGGGTGCTAGATGCCAGCGGCAATGCATTAAGTCAGCAAAAGGTAGAACTTGAGCATCCCGAGGATGTAACTCTACACGAATCAAAAAATGACATAGAAGCCGAAGAATCGGTTGAACGTAAGCCTCTTAAGGACCCGTCGGTTGCGGACGAAATGGAGGACATGAATGAGGAATCCTATATTGACATTAAGGACCAGACAAATCAACTGTTAGTTGAACACTTGGCAGAAGAGGCCATGGAAGCGGACTGCGGTCCCGAGGATAACAAGGAGAACTTGTCCACGTCTGCTTCGAGCACCGCTGCCGATGGTCTAGATATTCAGTTGGCCATCAAGGAGGATGACGACGAGGAGAAACCGCTTGCAGTTATCGCTGACGAACAGAAGCCTGGGCTGCTGTTGACCAATGACATGAAAGTGGATGAGAAACCAAATGGCAAGCAGGAATCGGTCTGTGATGAGCACGTTCAGCTGGTGCCAAACCTTCGTCAAGAACAGGAAATTCACTTACAAAATCTGGGCCTACTCACGCACCAGGCCGCTGAACATAGGCGCAAGTGTCTGCTTGAGGCACAGGCCCGCCAGGCGCAAATGCAGCTCCAGCAACATCACCACCATCAGCACAAGCGACAAGGAGCGCGCGGAGGAGGCAGTGCCACTCATGTGGAATCCAGCGGTACTTTGAAGACAGTCATCAAGCTGAACAGGAGCAGCAACGGAGGAGTAAGCGGTAGTGGCGGCCTGCCTACTGGTACAGTTATCCATGGAGGCTGTGGCTCCTCTTCAGCTTCTTCCACGTCCTCCTCCTCGGTGGGCAGTGCCACACGTAAGTCAAGCGGGACCTTGGGCTCAGGAGCGGGAGCAGGAGCTGGCGTTCGCCGGCAGTCGCTTAAGATGACATTCCAGAAGGGTCGGGCTCGTGGTCACGGTGCTGCGGATCGATCCGCCGATCAGTATGGCGCCCACGCCGAGGACTCCTACTACACCATTCAAAACGAGAACGAAGGTGCGAAAAAGTTTGTTGTAACTACTGGTAATACCGGCCGCAAGACTAATAACCGTTTCAGCTCAACTAACAACTACCACTCGACGGTAGCCTTGCACGGTAGCAACTCTGCGCTCCAGTACTATTCGTCCCACTCGGAAAGTCAGGGACAGACGGACCACGGCTTCTATCAGATGGTCAAAAAGGACGAAAAGGAGAAGATCCTCATTCCGGAAAAGGCCTCCTCGTTTAAGTTTCACCCAGGGAGACTGTGCGAAGACCAGTGCTACTACTGTAGCGGAAAGTTTGGCCTCTATGACACCCCCTGCCATGTTGGACAAATAAAGTCCGTGGAGCGCCAGCAGAAGATCCTAGCCAACGAGGAGAAGCTCACCGTGGATAACTGCTTGTGCGACGCATGTTTTCGACACGTGGACCGCCGGGCAAATGTGCCATCCTATAAGAAGCGTCTTTCCGCTTCAGGTCACTTGGAGATGGGGTCTGCAGCGGGATCTG
CACTAGAGAAACAGTTTGCTGGCGACAGCGGCGTCATTACGGAATCGGGTGGCGAAGCTGGTTCTACGGCAGCTGTGGCCGTGCAGCAACGATCTTGTGGCGTGAAGGACTGCGTCGAAGCGGCACGACACTCGCTGCGGCGCAAGTGCATACGCAAGAGAGTAAAGAAGTATCAGCTCAGCCTGGAGATTCCCGCAGGCTCGTCGAACGTGGGGCTGTGTGAGGCACATTACAATACGGTCATCCAATTTTCCGGCTGCGTTCTTTGCAAGCGTAGATTAGGCAAGAACCATATGTACAACATAACCACGCAGGACACAATTCGACTGGAAAAGGCGCTGTCCGAGATGGGCATCCCAGTTCAGCTTGGCATGGGCACTGCAGTCTGCAAGCTGTGTCGCTATTTTGCCAACCTTTTGATAAAGCCACCGGATAGCACCAAGGCACAAAAGGCGGAATTCGTGAAGAACTACAGAAAGAGGCTCCTCAAGGTGCATAATCTGCAGGATGGCAGTCATGAGCTGTCCGAAGCGGATGAAGAAGAGGCACCTAATGCAACGGAGACAGAAAGGCCAACCTCAGACGGACACGAAGATCCCGAGATGCCCATGGTAGCGGACTATGATGGACCTACCGACTCCAATTCCAGTAGTTCTTCGACTGCAGCCCTGGACACCAGCAAACAAATGTCCAAGCTTCAGGCCATCCTGCAGCAAAATGTGGGAGCGGATGCGGCAGGAGCTGCGGGAACAGGAACTGTTGCAGCAAGTCCCGGAGGAAGCGGATCTGGGGCAGATATCTCTAACGTATTGCGAGGGAATCCGAACATTTCCATGCGCGAACTTTTCCACGGCGAGGAAGAGCTGGGTGTGCAGTTCAAGGTGCCGTTCGGATGCAGCAGCAGCCAGCGTACTCCGGAGGGCTGGACACGAGTGCAGACTTTCCTACAATACGATGAGCCGACGCGCCGCCTCTGGGAGGAGTTGCAAAAGCCGTACGGAAATCAGAGCTCATTTCTGCGCCACTTGATACTATTAGAGAAGTACTACCGAAACGGAGATCTCGTCCTAGCACCGCATGCTTCCTCCAATGCCACGGTTTACACAGAGACTGTTCGTCAGCGGCTGAATTCGTTTGATCACGGTCACTGCGGTGGATTGAACATCGCAGGCAGCCCTTCTTCTTCGGGTTCCGGCAAGCGCAGTGGAGTTCCTCAACCTACGGGTGCCAGTGTGCTGGCCACCGCCCTCACAACACCCTTGACAAGCCATTCATCCTCCTCTGCATCCATTTCCTCCGAACAGCATTCGTCGGTTGATCCTGTCATTCCGCTGGTAGACCTCAATGATGACGATGAAGGCGAAGATGGGGCAGGAGGAGCGGGCGAAAGGGAGTCGACAAATAGGCAGCAGGACGTAATCTTGGAATGCCTTAGAACTGCCTCTGTGGACAAGCTGACTAAGCAGCTCAGCTCGAATGCGGTGACGATTATCGCCCGGCCCAAAGACAAATCGCAGCTCTCCTGCAACAGCGGATCCTCCACGTCCATTTCCAGCTCCTCGTCCGCTATTTCCTCGCCGGAGGAAGTGGCCGTCACTAAGGTTACAGCAGTCGCACCAGTCCAGTCCAAGGATGCACCGCCACTGGCGCCAGCAAGTAGCGGTGTTAGCAACAGTCGTAGTATCCTTAAAACCAACCTCTTGGGCATGAACAAGGCCGTGGAACTCGTGCCCTTAACGACTGCCCCCCACGCTTACAAGCCAACTGGATGCCATAAGCCTGAGAAACAGCAAAAGATTCTTGACGTGGCCAATAAGCAGCCCGGTAGCCAGGGGGAACCGGTACCATCAAGCGCCTTGCTTGGCCTGCAGTCAAAGCTAAAGCCTCCAACGCATCAGCAGCAGGTCAGCGGATCAGGAGCGGGAACTAGTGGTTCTCAGAAGCCATCTAATGTGGCGCAATTGCTTAGCTCTCCACCGGAGCTAATCAGCTTGCATCGA
CGGCAGACCAGCGGAGCAGCAGCGGGGTCCAGCAGCTTCCTTCAGGGCAAGAGGCTTCAACTTCCACGATCTGGAGCAGGGCCTTCAGGAGCGGGAACGGGAACAGGCGCTGGAGCAGCAGGAAGCCGCAGTGCGGGTGGACCACCACCGCCCAATGTGGTCATACTGCCGGACGCCTTAACCCCCCAGGAGCGACACGAGAGCAAGAGCTGGAAGCCAACGCTGATACCGCTGGAGGATCAGCACAAGGTGCCGAACAAATCACATGCTCTTTATCAGACCGCCGACGGTCGAAGGTTGCCCGCCCTGGTGCAAGTGCAGTCTGGTGGCAAGCCATACCTCATCTCTATCTTCGACTATAACCGCATGTGCATCTTGCGAAGGGAAAAGCTGATGCGGGACCAGTTGCTCAAGAGTAACGCCAAGCCAAAGCCGCAGAACCAGCAACAGCAGCAGGGCCAAACGCACCAGCAGCAGCAGAATTCCGCCGCATCGGCGGCTGCCTTCTCCAACATGGTGAAGTTGGCCCAGCAACACACGGCGCGACAGCAGCTTCAGCAGCTGCAACAGAAGCCACAACAGCAGCAACAATTGCCCACTTTGCAGCCAGGTGGGGTGCGACTTGCCCGGCTGCCGCAAAAACTACTGATGCCACCACTGACTAATCCGCAGATTGGCAGTCAAGCACCCAACTTACAGCCGTTGCTGTCTAGTACGCTGGATAACAGCAACAACTGCTGGCTGTGGAAAAACTTTCCTGATCCCAATCAGTATCTGCTAAATGGAAACGGAGGGGGTGCCGGGAGCTCCTCCAGCAAGTTGCCACATCTCACGGCCAAACCAGCCACGGCAACTAGTAGCTCCGGAGCGGCCAACAAATCAGCAGGAAGCCTATTTACCCTCAAGCAGCAGCAGCACCAGCAGAAACTCATCGACAACGCTATCATGTCAAAGATACCCAAAAGTCTGACAGTAATACCGCAGCAGATGGGTGGTAATACCGGTGGCGATATGGGGGGCAGCAGCTCCTCCGGCAAGGACTGATGACGGCGAAGGAGGGCGCCATGGCCATTAGCCGTAGCGCCGGAGGTAACCCGGCGAAGTAGTAGGATCAACAAGCAGGCGACGTGCAGCTTAAGCGGCGATCTTCAGAACAAGAGGTGACCAGCGGCGGCTCCATGGATATCACAAACTCCACAATTCCATGGCTGCAGTAGAATAAGTGATACACT(SEQ ID 
NO:225)MSSRKVPGGSGGADESTAAAAPLDDNANASVEIPDSSEEPAMGVGEEMSIISKTRTSTLSVEPAKEPTVTAELEGEKELESNPVSKTPRSTPTPTLTPAVTPTASDGVAAKSVRVTRHSSPLLLIISPTTSRREVGDGELDTEEPTGSGGQRKSSVERSLAPVIRGRKSIKDLKEAKEVKSEEPPAAASESRAASGVTPGQVKEQHVADGNEMESLPITDKKDHKDTKDKGDERETDQEEEKEKSADTEIIADTEKTSEKQKYTEKDKAADKDGGKEKDIDANKDIDKEKEKVKEVLPPVVPIAPVTPTCNRVTRKSHAQEQAINTRVTRNRRQSSTVGANSTASLVAASSSVTEQPPPSRGRRKKPVVVAPPLEPAVKRKRSQDVEADSDANNSTKYSKVEVVKSEEAEAPEEDSSAVPIKQESVDGNEVSSISPTVTPTPTPAPTPAPVPGSRRGRGRPQNRNSSSPATTTRATRLSKAGSPVILTPVAQEPAPPKRRRVGSSTRKTVSASSLAPSSQGGAGDEDSKDSMASSMDDLLMAAADIKQEKLTPDFDDSLMPEGLPSTSGASSANGHSCTEPLTVDTEINVKPADSKVKPKESPVVAVEESPSQSETQSAKVSAHAGKAPSLSPDMISEGVSAVSVRKFYKKPEFLENNLGLEKDPELGEIVQTVSNNDTETDVEMAVDGEVNQPSTPKSQDKKKEEQEKNQKSGLKAAKKAPAKLEPKAEDISEILTDVPVDISTEAVEIIEEAEEDTCSNSSIKPGELRLDESNDEPELLLEDALIVNGDENETPDQPEEKEDQVEFFHTGEYDDFEHEIMVELAKEGVLDASGNALSQQKVELEHPEDVTLHESKNDIEAEESVERKPLKDPSVADEMEDMNEESYIDIKDQTNQLLVEHLAEEAMEADCGPEDNKENLSTSASSTAADGLDIQLAIKEDDDEEKPLAVIADEQKPGLLLTNDMKVDEKPNGKQESVCDEHVQLVPNLRQEQEIHLQNLGLLTHQAAEHRRKCLLEAQARQAQMQLQQHHHHQHKRQGARGGGSATHVESSGTLKTVIKLNRSSNGGVSGSGGLPTGTVIHGGCGSSSASSTSSSSVGSATRKSSGTLGSGAGAGAGVRRQSLKMTFQKGRARGHGAADRSADQYGAHAEDSYYTIQNENEGAKKFVVTTGNTGRKTNNRFSSTNNYHSTVALHGSNSALQYYSSHSESQGQTDHGFYQMVKKDEKEKILIPEKASSFKFHPGRLCEDQCYYCSGKFGLYDTPCHVGQIKSVERQQKILANEEKLTVDNCLCDACFRHVDRRANVPSYKKRLSASGHLEMGSAAGSALEKQFAGDSGVITESGGEAGSTAAVAVQQRSCGVKDCVEAARHSLRRKCIRKRVKKYQLSLEIPAGSSNVGLCEAHYNTVIQFSGCVLCKRRLGKNHMYNITTQDTIRLEKALSEMGIPVQLGMGTAVCKLCRYFANLLIKPPDSTKAQKAEFVKNYRKRLLKVHNLQDGSHELSEADEEEAPNATETERPTSDGHEDPEMPMVADYDGPTDSNSSSSSTAALDTSKQMSKLQAILQQNVGADAAGAAGTGTVAASPGGSGSGADISNVLRGNPNISMRELFHGEEELGVQFKVPFGCSSSQRTPEGWTRVQTFLQYDEPTRRLWEELQKPYGNQSSFLRHLILLEKYYRNGDLVLAPHASSNATVYTETVRQRLNSFDHGHCGGLNIAGSPSSSGSGKRSGVPQPTGASVLATALTTPLTSHSSSSASISSEQHSSVDPVIPLVDLNDDDEGEDGAGGAGERESTNRQQDVILECLRTASVDKLTKQLSSNAVTIIARPKDKSQLSCNSGSSTSISSSSSAISSPEEVAVTKVTAVAPVQSKDAPPLAPASSGVSNSRSILKTNLLGMNKAVELVPLTTAPHAYKPTGCHKPEKQQKILDVANKQPGSQGEPVPSSALLGLQSKLKPPTHQQQVSGSGAGTSGSQKPSNVAQLLSSPPELISL
HRRQTSGAAAGSSSFLQGKRLQLPRSGAGPSGAGTGTGAGAAGSRSAGGPPPPNVVILPDALTPQERHESKSWKPTLIPLEDQHKVPNKSHALYQTADGRRLPALVQVQSGGKPYLISIFDYNRMCILRREKLMRDQLLKSNAKPKPQNQQQQQGQTHQQQQNSAASAAAFSNMVKLAQQHTARQQLQQLQQKPQQQQQLPTLQPGGVRLARLPQKLLMPPLTNPQIGSQAPNLQPLLSSTLDNSNNCWLWKNFPDPNQYLLNGNGGGAGSSSSKLPHLTAKPATATSSSGAANKSAGSLFTLKQQQHQQKLIDNAIMSKIPKSLTVIPQQMGGNTGGDMGGSSSSGKD


Human homologue of Complete Genome candidate


AAF13722—neurofilament protein

(SEQ ID NO:226)1atgatgagct tcggcggcgc ggacgcgctg ctgggcgccccgttcgcgcc gctgcatggc61ggcggcagcc tccactacgc gctagcccga aagggtggcgcaggcgggac gcgctccgcc121gctggctcct ccagcggctt ccactcgtgg acacggacgtccgtgagctc cgtgtccgcc181tcgcccagcc gcttccgtgg cgcaggcgcc gcctcaagcaccgactcgct ggacacgctg241agcaacgggc cggagggctg catggtggcg gtggccacctcacgcagtga gaaggagcag301ctgcaggcgc tgaacgaccg cttcgccggg tacatcgacaaggtgcggca gctggaggcg361cacaaccgca gcctggaggg cgaggctgcg gcgctgcggcagcagcaggc gggccgctcc421gctatgggcg agctgtacga gcgcgaggtc cgcgagatgcgcggcgcggt gctgcgcctg481ggcgcggcgc gcggtcagct acgcctggag caggagcacctgctcgagga catcgcgcac541gtgcgccagc gcctagacga cgaggcccgg cagcgagaggaggccgaggc ggcggcccgc601gcgctggcgc gcttcgcgca ggaggccgag gcggcgcgcgtggacctgca gaagaaggcg661caggcgctgc aggaggagtg cggctacctg cggcgccaccaccaggaaga ggtgggcgag721ctgctcggcc agatccaggg ctccggcgcc gcgcaggcgcagatgcaggc cgagacgcgc781gacgccctga agtgcgacgt gacgtcggcg ctgcgcgagattcgcgcgca gcttgaaggc841cacgcggtgc agagcacgct gcagtccgag gagtggttccgagtgaggct ggaccgactg901tcggaggcag ccaaggtgaa cacagacgct atgcgctcagcgcaggagga gataactgag961taccggcgtc agctgcaggc caggaccaca gagctggaggcactgaaaag caccaaggac1021tcactggaga ggcagcgctc tgagctggag gaccgtcatcaggccgacat tgcctcctac1081caggaagcca ttcagcagct ggacgctgag ctgaggaacaccaagtggga gatggccgcc1141cagctgcgag aataccagga cctgctcaat gtcaagatggctctggatat agagatagcc1201gcttacagaa aactcctgga aggtgaagag tgtcggattggctttggccc aattcctttc1261tcgcttccag aaggactccc caaaattccc tctgtgtccactcacataaa ggtgaaaagc1321gaagagaaga tcaaagtggt ggagaagtct gagaaagaaactgtgattgt ggaggaacag1381acagaggaga cccaagtgac tgaagaagtg actgaagaagaggagaaaga ggccaaagag1441gaggagggca aggaggaaga agggggtgaa gaagaggaggcagaaggggg agaagaagaa1501acaaagtctc ccccagcaga agaggctgca tccccagagaaggaagccaa gtcaccagta1561aaggaagagg caaagtcacc ggctgaggcc aagtccccagagaaggagga agcaaaatcc1621ccagccgaag tcaagtcccc tgagaaggcc aagtctccagcaaaggaaga ggcaaagtca1681ccgcctgagg ccaagtcccc agagaaggag gaagcaaaatctccagctga ggtcaagtcc1741cccgagaagg ccaagtcccc 
agcaaaggaa gaggcaaagtcaccggctga ggccaagtct1801ccagagaagg ccaagtcccc agtgaaggaa gaagcaaagtcaccggctga ggccaagtcc1861ccagtgaagg aagaagcaaa atctccagct gaggtcaagtccccggaaaa ggccaagtct1921ccaacgaagg aggaagcaaa gtcccctgag aaggccaagtcccctgagaa ggccaagtcc1981ccagagaagg aagaggccaa gtcccctgag aaggccaagtccccagtgaa ggcagaagca2041aagtcccctg agaaggccaa gtccccagtg aaggcagaagcaaagtcccc tgagaaggcc2101aagtccccag tgaaggaaga agcaaagtcc cctgagaaggccaagtcccc agtgaaggaa2161gaagcaaagt cccctgagaa ggccaagtcc ccagtgaaggaagaagcaaa gacccccgag2221aaggccaagt ccccagtgaa ggaagaagcc aagtccccagagaaggccaa gtccccagag2281aaggccaaga ctcttgatgt gaagtctcca gaagccaagactccagcgaa ggaggaagca2341aggtcccctg cagacaaatt ccctgaaaag gccaaaagccctgtcaagga ggaggtcaag2401tccccagaga aggcgaaatc tcccctgaag gaggatgccaaggcccctga gaaggagatc2461ccaaaaaagg aagaggtgaa gtccccagtg aaggaggaggagaagcccca ggaggtgaaa2521gtcaaagagc ccccaaagaa ggcagaggaa gagaaagcccctgccacacc aaaaacagag2581gagaagaagg acagcaagaa agaggaggca cccaagaaggaggctccaaa gcccaaggtg2641gaggagaaga aggaacctgc tglcgaaaag cccaaagaatccaaagttga agccaagaag2701gaagaggctg aagataagaa aaaagtcccc accccagagaaggaggctcc tgccaaggtg2761gaggtgaagg aagacgctaa acccaaagaa aagacagaggtggccaagaa ggaaccagat2821gatgccaagg ccaaggaacc cagcaaacca gcagagaagaaggaggcagc accggagaaa2881aaagacacca aggaggagaa ggccaagaag cctgaggagaaacccaagac agaggccaaa2941gccaaggaag atgacaagac cctctcaaaa gagcctagcaagcctaaggc agaaaaggct3001gaaaaatcct ccagcacaga ccaaaaagac agcaagcctccagagaaggc cacagaagac3061aaggccgcca aggggaagta aggcagggag aaaggaacatccggaacagc caaagaaact3121cagaagagtc ccggagctca aggatcagag taacacaattttcacttttt ctgtctttat3181gtaagaagaa actgcttaga tgacggggcc tccttcttcaaacaggaatt tctgttagca3241atatgttagc aagagagggc actcccaggc ccctgcccccatgccctccc caggcgatgg3301acaattatga tagcttatgt agctgaatgt gatacatgccgaatgccaca cgtaaacact3361tgactataaa aactgccccc ctcctttcca aataagtgcatttattgcct ctatgtgcaa3421ctgacagatg accgcaataa tgaatgagca gttagaaatacattatgctt gagatgtctt3481aacctattcc caaatgcctt ctgttttcca 
aaggagtggtcaagcccttg cccagagctc3541tctattctgg aagagcggtc caggtggggc cgggcactggccactgaatt atgccagggc3601gcactttcca ctggagttca ctttcaattg cttctgtgcaataaaaccaa gtgcttataa3661aatgaaaaaa aaaaaaaaaa tgctgttatt ctctttccctgggaaggctg ggggcagggc3721aggggaggtc tggatgtgac accccagact gcatgggactgagcaagcat cagt(SEQ ID NO:227)1mmsfggadal lgapfaplhg ggslhyalar kggaggtrsaagsssgfhsw trtsvssvsa61spsrfrgaga asstdsldtl sngpegcmva vatsrsekeqlqalndrfag yidkvrqlea121hnrslegeaa alrqqqagrs amgelyerev remrgavlrlgaargqlrle qehllediah181vrqrlddear qreeaeaaar alarfaqeae aarvdlqkkaqalqeecgyl rrhhqeevge241llgqiqgsga aqaqmqaetr dalkcdvtsa lreiraqleghavqstlqse ewfrvrldrl301seaakvntda mrsaqeeite yrrqlqartt elealkstkdslerqrsele drhqadiasy361qeaiqqldae lrntkwemaa qlreyqdlln vkmaldieiaayrkllegee crigfgpipf421slpeglpkip svsthikvks eekikvveks eketviveeqteetqvteev teceekeake481eegkeeegge eeeaeggeee tksppaeeaa spekeakspvkeeakspaea kspekeeaks541paevkspeka kspakeeaks ppeakspeke eakspaevkspekakspake eakspaeaks601pekakspvke eakspaeaks pvkeeakspa evkspekaksptkeeakspe kakspekaks661pekeeakspe kakspvkaea kspekakspv kaeakspekakspvkeeaks pekakspvke721eakspekaks pvkeeaktpe kakspvkeea kspekakspekaktldvksp eaktpakeea781rspadkfpek akspvkeevk spekaksplk edakapekeipkkeevkspv keeekpqevk841vkeppkkaee ekapatpkte ekkdskkeea pkkeapkpkveekkepavek pkeskveakk901eeaedkkkvp tpekeapakv evkedakpke ktevakkepddakakepskp aekkeaapek961kdtkeekakk peekpkteak akeddktlsk epskpkaekaekssstdqkd skppekated1021kaakgk


Putative function


unknown


Example 21 (Category 3)

Line ID—265


Phenotype—Lethal phase pharate adult. High mitotic index, rod-like overcondensed chromosomes, few anaphases with lagging chromosomes


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003509 (17B4-5)


P element insertion site—52,563


Annotated Drosophila genome Complete Genome candidate


CG6407—Wnt5

(SEQ ID NO:228)CAGTTGTTTACAATTTGTCGTTGAGGGTGGATTACTTCGTCGCGAGTTTCGTTCGTGCATGATGCGGTTGTGGTTGATTGTATACATACATACTATGCACAAATCCAGTTCTCATTTTGTTATTTTACAAATTCTCAGCGAGCGCATGAACTGGCAGCCTATAGCGAGCAGCTAATCACAATATTTACGGCAGATTCGTGGACTCAAGGAAATTCAGCCAGCAGCCAATCGATTTTCTAGTGTTATCGAAAAACATTTTTCATTCCTTCATTTCGTTCAACTAACAATACTAGTTACTACTAACAATACTCTGTAATAGTAATAGTAAGAGGAACAGGAATAGGAATACACATACTCCAAAGCGATAATGAGTTGCTACAGAAAAAGGCACTTTCTATTGTGGCTCTTGCGTGCTGTGTGTATGTTGCACTTAACCGCGAGAGGGGCATATGCCACAGTTGGGTTGCAAGGAGTGCCGACATGGATATATCTCGGCCTCAAGTCCCCCTTCATCGAGTTTGGCAACCAGGTGGAGCAGCTGGCCAATTCCAGCATACCACTGAACATGACCAAGGACGAGCAGGCCAATATGCATCAAGAGGGCCTACGCAAGCTCGGTACGTTCATAAAGCCAGTGGACCTGCGGGACTCGGAGACTGGCTTCGTCAAGGCCGATCTCACCAAGAGACTGGTATTCGATAGACCGAACAACATTACATCTCGCCCTATTCACCCGATACAGGAGGAGATGGATCAGAAGCAGATAATCCTGCTCGACGAGGATACCGACGAGAATGGCCTGCCAGCCAGTCTCACCGACGAGGATCGCAAGTTTATAGTGCCGATGGCGCTCAAGAATATATCGCCCGATCCACGCTGGGCGGCCACTACACCGAGTCCCTCCGCTTTGCAGCCGAACGCTAAAGCCATCTCGACCATTGTGCCCTCGCCTCTGGCCCAGGTCGAGGGGGATCCCACGTCCAACATCGATGACCTGAAGAAGCACATACTCTTCTTGCACAACATGACCAAGACCAATTCGAACTTCGAGTCGAAATTCGTTAAATTCCCGAGCCTGCAAAAGGACAAGGCCAAGACATCGGGAGCTGGCGGTTCGCCGCCCAATCCCAAGCGGCCCCAGCGGCCGATTCATCAGTATTCCGCGCCCATAGCCCCACCAACACCCAAGGTGCCCGCGCCAGATGGCGGCGGCGTAGGAGGAGCAGCTTACAATCCCGGAGAGCAGCCAATTGGTGGCTACTATCAGAACGAGGAACTAGCGAATAATCAATCCCTTCTTAAACCAACAGATACCGACTCCCATCCAGCGGCCGGCGGTAGCAGCCATGGCCAGAAGAATCCCAGCGAGCCCCAGGTGATACTGCTCAACGAGACACTCTCCACGGAGACCTCAATCGAAGCGGATCGCAGTCCATCGATAAACCAGCCCAAGGCGGGATCGCCTGCGCGCACAACAAAGCGACCACCTTGCCTGCGCAATCCCGAGTCCCCGAAATGCATACGTCAGCGTCGGCGGGAGGAGCAACAGCGGCAGCGGGAGCGGGACGAGTGGTTCCGCGGTCAGTCGCAGTACATGCAGCCCCGGTTCGAGCCGATCATACAGACGATTAACAATACGAAGAGATTTGCCGTATCAATCGAGATTCCAGACTCCTTTAAAGTATCCTCCGAGGGATCGGATGGGGAGTTGCTTTCGCGAGTCGAACGCTCGCAGCCCAGCATTAGTAGTAGTAGTAGTAGCAGTAGTAGCAGTAGTAGGAAAATCATGCCAGACTATATTAAGGTATCCATGGAGAACAACACATCCGTCACGGATTATTTTAAGCACGACGTTGTGATGACATCGGCAGATGTCGCCAGCGATAGGGAATTCCTTATCAAGAACATGGAGGAGCACGGAGGCGCTGGCTCCGCGAACAGTCATCACAATGATACGACTCCAACTGCAGACGCATATTCGGAGA
CAATCGATCTTAATCCCAATAACTGCTATAGCGCAATAGGTCTAAGCAACAGCCAAAAGAAGCAATGTGTTAAGCACACCAGCGTGATGCCGGCCATAAGTCGTGGTGCCCGTGCCGCCATCCAGGAGTGCCAGTTTCAGTTCAAGAATCGCCGCTGGAACTGCAGCACAACGAACGACGAGACCGTATTTGGTCCCATGACCAGCCTGGCTGCTCCCGAAATGGCCTTCATCCACGCCCTGGCCGCGGCCACGGTGACCAGCTTCATAGCTCGCGCCTGCCGGGATGGCCAACTGGCCTCCTGCAGCTGCTCCCGCGGCAGTCGACCCAAACAGCTCCACGACGACTGGAAGTGGGGCGGCTGTGGCGACAACCTGGAGTTCGCCTACAAGTTCGCCACGGACTTCATCGATTCGCGGGAGAAGGAAACCAATCGCGAGACGCGTGGCGTTAAGAGAAAACGCGAGGAGATCAACAAGAATCGCATGCATTCCGATGACACGAATGCTTTTAACATAGGTATTAAACGTAACAAAAACGTAGATGCTAAAAACGATACAAGTTTGGTAGTGAGAAACGTTAGGAAAAGCACTGAGGCTGAAAACAGTCACATACTCAATGAGAACTTTGATCAGCACCTATTGGAACTAGAGCAGCGCATTACGAAGGAGATACTTACATCCAAGATAGACGAGGAGGAGATGATTAAGCTGCAGGAGAAGATCAAACAGGAGATTGTCAACACCAAGTTCTTCAAGGGTGAGCAGCAGCCGCGCAAGAAGAAGCGAAAAAACCAGAGAGCCGCCGCCGATGCGCCCGCCTATCCGAGGAACGGCATCAAGGAGAGCTACAAGGATGGCGGCATATTGCCGCGCAGCACGGCCACTGTCAAGGCCAGGAGCCTGATGAACTTGCACAACAACGAGGCCGGACGTCGGGCGGTGATCAAGAAGGCCAGGATAACGTGCAAGTGCCACGGCGTGTCCGGCTCCTGCAGCCTGATCACCTGCTGGCAGCAATTGTCCTCCATCCGGGAGATTGGCGACTATCTGCGCGAGAAGTACGAGGGCGCCACCAAGGTGAAGATCAACAAGCGTGGCCGCCTCCAGATCAAGGACTTGCAATTCAAGGTGCCGACCGCTCACGATCTTATTTACCTAGACGAAAGTCCCGACTGGTGCCGCAATAGCTATGCGCTGCATTGGCCGGGAACGCACGGACGTGTGTGCCACAAAAACTCGTCGGGATTGGAGAGCTGTGCCATCCTCTGCTGCGGCCGGGGCTATAATACGAAGAACATTATAGTTAACGAACGCTGCAATTGCAAATTTCACTGGTGTTGCCAGGTTAAATGTGAAGTTTGTACGAAGGTACTCGAGGAGCACACATGTAAATAGAGCGTTGATTGAATTCGAATGTCTTAATGTTTGTGACTAAGCCATGAAGGAAATAATCGTATTTAAACAGTCCTCTCCATTTTAATTGCCATTACCATACACCATCATATTGCTTCTTCTTAAAATGCT(SEQ ID 
NO:229)MSCYRKRHFLLWLLRAVCMLHLTARGAYATVGLQGVPTWIYLGLKSPFIEFGNQVEQLANSSIPLNMTKDEQANMHQEGLRKLGTFIKPVDLRDSETGFVKADLTKRLVFDRPNNITSRPIHPIQEEMDQKQIILLDEDTDENGLPASLTDEDRKFIVPMALKNISPDPRWAATTPSPSALQPNAKAISTIVPSPLAQVEGDPTSNIDDLKKHILFLHNMTKTNSNEESKFVKFPSLQKDKAKTSGAGGSPPNPKRPQRPIHQYSAPIAPPTPKVPAPDGGGVGGAAYNPGEQPIGGYYQNEELANNQSLLKPTDTDSHPAAGGSSHGQKNPSEPQVILLNETLSTETSIEADRSPSINQPKAGSPARTTKRPPCLRNPESPKCIRQRRREEQQRQRERDEWFRGQSQYMQPRFEPIIQTINNTKRFAVSIEIPDSFKVSSEGSDGELLSRVERSQPSISSSSSSSSSSSRKIMPDYIKVSMENNTSVTDYFKHDVVMTSADVASDREFLIKNMEEHGGAGSANSHHNDTTPTADAYSETIDLNPNNCYSAIGLSNSQKKQCVKHTSVMPAISRGARAAIQECQFQFKNRRWNCSTTNDETVFGPMTSLAAPEMAFIHALAAATVTSFIARACRDGQLASCSCSRGSRPKQLHDDWKWGGCGDNLEFAYKFATDFIDSREKETNRETRGVKRKREEINKNRMHSDDTNAFNIGIKRNKNVDAKNDTSLVVRNVRKSTEAENSHILNENFDQHLLELEQRITKEILTSKIDEEEMIKLQEKIKQEIVNTKFFKGEQQPRKKKRKNQRAAADAPAYPRNGIKESYKDGGILPRSTATVKARSLMNLHNNEAGRRAVIKKARITCKCHGVSGSCSLITCWQQLSSIREIGDYLREKYEGATKVKINKRGRLQIKDLQFKVPTAHDLIYLDESPDWCRNSYALHWPGTHGRVCHKNSSGLESCAILCCGRGYNTKNIIVNERCNCKFHWCCQVKCEVCTKVLEEHTCK


Human homologue of Complete Genome candidate


AAA16842—hWNT5A

(SEQ ID NO:230)1attaattctg gctccacttg ttgctcggcc caggttggggagaggacgga gggtggccgc61agcgggttcc tgagtgaatt acccaggagg gactgagcacagcaccaact agagaggggt121cagggggtgc gggactcgag cgagcaggaa ggaggcagcgcctggcacca gggctttgac181tcaacagaat tgagacacgt ttgtaatcgc tggcgtgccccgcgcacagg atcccagcga241aaatcagatt tcctggtgag gttgcgtggg tggattaatttggaaaaaga aactgcctat301atcttgccat caaaaaactc acggaggaga agcgcagtcaatcaacagta aacttaagag361acccccgatg ctcccctggt ttaacttgta tgcttgaaaattatctgaga gggaataaac421atcttttcct tcttccctct ccagaagtcc attggaatattaagcccagg agttgctttg481gggatggctg gaagtgcaat gtcttccaag ttcttcctagtggctttggc catatttttc541tccttcgccc aggttgtaat tgaagccaat tcttggtggtcgctaggtat gaataaccct601gttcagatgt cagaagtata tattatagga gcacagcctctctgcagcca actggcagga661ctttctcaag gacagaagaa actgtgccac ttgtatcaggaccacatgca gtacatcgga721gaaggcgcga agacaggcat caaagaatgc cagtatcaattccgacatcg acggtggaac781tgcagcactg tggataacac ctctgttttt ggcagggtgatgcagatagg cagccgcgag841acggccttca catacgccgt gagcgcagca ggggtggtgaacgccatgag ccgggcgtgc901cgcgagggcg agctgtccac ctgcggctgc agccgcgccgcgcgccccaa ggacctgccg961cgggactggc tctggggcgg ctgcggcgac aacatcgactatggctaccg ctttgccaag1021gagttcgtgg acgcccgcga gcgggagcgc atccacgccaagggctccta cgagagtgct1081cgcatcctca tgaacctgca caacaacgag gccggccgcaggacggtgta caacctggct1141gatgtggcct gcaagtgcca tggggtgtcc ggctcatgtagcctgaagac atgctggctg1201cagctggcag acttccgcaa ggtgggtgat gccctgaaggagaagtacga cagcgcggcg1261gccatgcggc tcaacagccg gggcaagttg gtacaggtcaacagccgctt caactcgccc1321accacacaag acctggtcta catcgacccc agccctgactactgcgtgcg caatgagagc1381accggctcgc tgggcacgca gggccgcctg tgcaacaagacgtcggaggg catggatggc1441tgcgagctca tgtgctgcgg ccgtgggtac gaccagttcaagaccgtgca gacggagcgc1501tgccactgca agttccactg gtgctgctac gtcaagtgcaagaagtgcac ggagatcgtg1561gaccagtttg tgtgcaagta gtgggtgcca cccagcactcagccccgctc ccaggacccg1621cttatttata gaaagtacag tgattctggt ttttggtttttagaaatatt ttttattttt1681ccccaagaat tgcaaccgga accatttttt ttcctgttaccatctaagaa ctctgtggtt1741tattattaat attataatta 
ttatttggca ataatgggggtgggaaccac gaaaaatatt1801tattttgtgg atctttgaaa aggtaataca agacttcttttggatagtat agaatgaagg1861gggaaataac acatacccta acttagctgt gtgggacatggtacacatcc agaaggtaaa1921gaaatacatt ttctttttct caaatatgcc atcatatgggatgggtaggt tccagttgaa1981agagggtggt agaaatctat tcacaattca gcttctatgaccaaaatgag ttgtaaattc2041tctggtgcaa gataaaaggt cttgggaaaa caaaacaaaacaaaacaaac ctcccttccc2101cagcagggct gctagcttgc tttctgcatt ttcaaaatgataatttacaa tggaaggaca2161agaatgtcat attctcaagg aaaaaaggta tatcacatgtctcattctcc tcaaatattc2221catttgcaga cagaccgtca tattctaata gctcatgaaatttgggcagc agggaggaaa2281gtccccagaa attaaaaaat ttaaaactct tatgtcaagatgttgatttg aagctgttat2341aagaattggg attccagatt tgtaaaaaga cccccaatgattctggacac tagatttttt2401gtttggggag gttggcttga acataaatga aatatcctgtattttcttag ggatacttgg2461ttagtaaatt ataatagtag aaataataca tgaatcccattcacaggttt ctcagcccaa2521gcaacaaggt aattgcgtgc cattcagcac tgcaccagagcagacaacct atttgaggaa2581aaacagtgaa atccaccttc ctcttcacac tgagccctctctgattcctc cgtgttgtga2641tgtgatgctg gccacgtttc caaacggcag ctccactgggtcccctttgg ttgtaggaca2701ggaaatgaaa cattaggagc tctgcttgga aaacagttcactacttaggg atttttgttt2761cctaaaactt ttattttgag gagcagtagt tttctatgttttaatgacag aacttggcta2821atggaattca cagaggtgtt gcagcgtatc actgttatgatcctgtgttt agattatcca2881ctcatgcttc tcctattgta ctgcaggtgt accttaaaactgttcccagt gtacttgaac2941agttgcattt ataagggggg aaatgtggtt taatggtgcctgatatctca aagtcttttg3001tacataacat atatatatat atacatatat ataaatataaatataaatat atctcattgc3061agccagtgat ttagatttac agcttactct ggggttatctctctgtctag agcattgttg3121tccttcactg cagtccagtt gggattattc caaaagttttttgagtcttg agcttgggct3181gtggccccgc tgtgatcata ccctgagcac gacgaagcaacctcgtttct gaggaagaag3241cttgagttct gactcactga aatgcgtgtt gggttgaagatatctttttt tcttttctgc3301ctcacccctt tgtctccaac ctccatttct gttcactttgtggagagggc attacttgtt3361cgttatagac atggacgtta agagatattc aaaactcagaagcatcagca atgtttctct3421tttcttagtt cattctgcag aatggaaacc catgcctattagaaatgaca gtacttatta3481attgagtccc taaggaatat tcagcccact 
acatagatagcttttttttt tttttttttt3541ttttaataag gacacctctt tccaaacagg ccatcaaatatgttcttatc tcagacttac3601gttgttttaa aagtttggaa agatacacat cttttcatacccccccttag gaggttgggc3661tttcatatca cctcagccaa ctgtggctct taatttattgcataatgata tccacatcag3721ccaactgtgg ctctttaatt tattgcataa tgatattcacatcccctcag ttgcagtgaa3781ttgtgagcaa aagatcttga aagcaaaaag cactaattagtttaaaatgt cacttttttg3841gtttttatta tacaaaaacc atgaagtact ttttttatttgctaaatcag attgttcctt3901tttagtgact catgtttatg aagagagttg agtttaacaatcctagcttt taaaagaaac3961tatttaatgt aaaatattct acatgtcatt cagatattatgtatatcttc tagcctttat4021tctgtacttt taatgtacat atttctgtct tgcgtgatttgtatatttca ctggtttaaa4081aaacaaacat cgaaaggctt attccaaatg gaag(SEQ ID NO:231)1magsamsskf flvalaiffs faqvvieans wwslgmnnpvqmsevyiiga qplcsqlagl61sqgqkldchl yqdhmqyige gaktgikecq yqfrhrrwncstvdntsvfg rvmqigsret121aftyavsaag vvnamsracr egelstcgcs raarpkdlprdwlwggcgdn idygyrfake181fvdarereri hakgsyesar ilmnlhnnea grrtvynladvackchgvsg scslktcwlq241ladfrkvgda lkekydsaaa mrlnsrgklv qvnsrfnspttqdlvyidps pdycvrnest301gslgtqgrlc nktsegmdgc elmccgrgyd qfktvqterchckfhwccyv kckkcteivd361qfvck


Putative function


Wnt oncogene


Example 22 (Category 3)

Line ID—392


Phenotype—Lethal phase larval stage 3 to pharate adult, small brain and optic lobes, high mitotic index, rod-like overcondensed chromosomes, fewer ana- and telophases, overcondensed chromosomes in ana- and telophase


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003495 (12D)


P element insertion site—35,688


Annotated Drosophila genome Complete Genome candidate


CG12482—novel protein

(SEQ ID NO:232)ATGGGTTGCACCTGCTGTGACAATAAACCCAAGCCGGAGACCATTGAGATATATTCGGTGAAAATCCGTGAGAATGGTACATACAAGTTGATCAAGATGCAATTGGCGGATATTTGGAGTCACGGATGGGAGCTGCGTATCAATAACTTTGCCGACAAGGAAAAGGTGCCGCACAACGAGAAGGATATTCGCAATCAGGTGTCGGTGGCGCGCAAAGCCAAACAGAGTCTGTGGAACAATAATAAGCATTTTGTGTACTGGTGCCGCTACGGAAGTCGTCAGCAGGATCTGCGAAAGCGACAGGTAACGACGAGTGCCAATCACGTGCTGCTGCACCTGATCAATTGA(SEQ ID NO:233)MGCTCCDNKPKPETIEIYSVKIRENGTYKLIKMQLADIWSHGWELRINNFADKEKVPHNEKDIRNQVSVARKAKQSLWNNNKHFVYWCRYGSRQQDLRKRQVTTSANHVLLHLIN
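The polypeptide listed for a Complete Genome candidate corresponds to the conceptual translation of the open reading frame in the accompanying polynucleotide. The following is a minimal sketch of that relationship, assuming the standard genetic code; the function name `translate` and the table construction are illustrative and are not part of the original disclosure. The usage lines take only the first 30 nucleotides and the last 9 nucleotides of SEQ ID NO:232.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf):
    """Translate a DNA open reading frame, stopping at the first stop codon.

    Hypothetical helper for illustration; assumes the input starts in frame
    and contains only A, C, G and T (upper or lower case).
    """
    orf = orf.upper()
    residues = []
    for start in range(0, len(orf) - len(orf) % 3, 3):
        amino = CODON_TABLE[orf[start:start + 3]]
        if amino == "*":  # stop codon ends the polypeptide
            break
        residues.append(amino)
    return "".join(residues)


# First 30 nucleotides of SEQ ID NO:232
print(translate("ATGGGTTGCACCTGCTGTGACAATAAACCC"))
# Last 9 nucleotides of SEQ ID NO:232, including the TGA stop codon
print(translate("ATCAATTGA"))
```

The first call gives the N-terminal residues of the encoded polypeptide and the second gives its final two residues, which can be compared by eye with the start and end of SEQ ID NO:233.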


Human homologue of Complete Genome candidate


none


Putative function


unknown


Example 23 (Category 3)

Line ID—37


Phenotype—Lethal phase larval stage 3. Small brain, few cells in mitosis, badly defined chromosomes form a broad bend, weak chromosome condensation, abnormal anaphases with broken chromosomes


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003418 (1C1-2)


P element insertion site—105,970


Annotated Drosophila genome Complete Genome candidate


CG16983—skpA, SCF ubiquitin ligase subunit (3 splice variants)

(SEQ ID NO:234)CCATTTGAAAGTATCGGTGTAATTTGTTTTCAGAGAAATTAATTTCCGTTTACTGTGCAATTCGGTGTGAAAGTGTTCAGATTTATCAATGCGTATTCTGCTTTCGACTTCGCCACCAATCTGTGCTGCAAGTTACCATTACCAGGTCCACCTGGTTCCCGCCAGTTTTCTTTCATTGTGGCTAGTTGTTGTTCGTGCCTTCGATAAAGACGTTTAGAGGTGTTTTTAGAGTTTCGCCATCTGGTCACTATAGCCGTTTCGTTTTTTACATGCCCAGCATCAAGTTGCAATCTTCGGATGAGGAGATCTTTGACACGGATATCCAGATCGCCAAGTGCTCCGGCACTATCAAGACCATGCTGGAGGACTGCGGCATGGAGGACGATGAGAATGCCATTGTGCCGTTGCCCAATGTGAATTCGACGATTCTTCGCAAGGTGCTTACCTGGGCTCACTACCACAAGGACGACCCCCAGCCAACGGAGGATGATGAGAGCAAGGAGAAGCGCACAGACGACATTATCTCATGGGATGCAGATTTCCTAAAAGTCGACCAGGGCACACTGTTTGAGCTGATATTGGCAGCGAACTATCTGGACATTAAGGGCCTTCTGGAGCTCACCTGCAAGACTGTTGCAAACATGATTAAGGGAAAGACTCCCGAGGAAATACGCAAGACCTTCAACATTAAGAAGGACTTTTCGCCCGCCGAGGAGGAGCAGGTGCGCAAGGAGAACGAGTGGTGCGAGGAGAAGTAAAGCGCGGCATTTCGCGGGACCAACATTAAGTTGAAACAGCTAGGGGATTCGGGAACGAATTGGATTTGCAGCATTGCAACTTTACTTAGTTGCTACTTTCATTTACATTTTTTTTTATTTTTAACCCCAGCAGAGACTCGATTTAAATTGTGTATAAATGATCTGTTGCTGATTTGATTCGCGGGGTTCATTTTTTGTCGTAAATATATCTCATATACATACATATGCGAGATTGTAACACTCTCTTTAACCTATTGGAGTAACACTTGATTTCACTTTAATAAATATAACTACCCAACAC(SEQ ID NO:235)MPSIKLQSSDEEIFDTDIQIAKCSGTIKTMLEDCGMEDDENAIVPLPNVNSTILRKVLTWAHYHKDDPQPTEDDESKEKRTDDIISWDADFLKVDQGTLFELILAANYLDIKGLLELTCKTVANMIKGKTPEEIRKTFNIKKDFSPAEEEQVRKENEWCEEK(SEQ ID 
NO:236)TTTCGCCATCTGGTCACTATAGCCGTTTCGTTTTTTACGTGAGTATTGTGAATTTGGTGTGTTGATTTATATCTCAGTTGGAGCCTGCGTGGAAATAGTGTCAGTACGTTTAAAGGCATCATCGTAAGGAAAGCCCAAAATGCCCAGCATCAAGTTGCAATCTTCGGATGAGGAGATCTTTGACACGGATATCCAGATCGCCAAGTGCTCCGGCACTATCAAGACCATGCTGGAGGACTGCGGCATGGAGGACGATGAGAATGCCATTGTGCCGTTGCCCAATGTGAATTCGACGATTCTTCGCAAGGTGCTTACCTGGGCTCACTACCACAAGGACGACCCCCAGCCAACGGAGGATGATGAGAGCAAGGAGAAGCGCACAGACGACATTATCTCATGGGATGCAGATTTCCTAAAAGTCGACCAGGGCACACTGTTTGAGCTGATATTGGCAGCGAACTATCTGGACATTAAGGGCCTTCTGGAGCTCACCTGCAAGACTGTTGCAAACATGATTAAGGGAAAGACTCCCGAGGAAATACGCAAGACCTTCAACATTAAGAAGGACTTTTCGCCCGCCGAGGAGGAGCAGGTGCGCAAGGAGAACGAGTGGTGCGAGGAGAAGTAAAGCGCGGCATTTCGCGGGACCAACATTAAGTTGAAACAGCTAGGGGATTCGGGAACGAATTGGATTTGCAGCATTGCAACTTTACTTAGTTGCTACTTTCATTTACATTTTTTTTTATTTTTAACCCCAGCAGAGACTCGATTTAAATTGTGTATAAATGATCTGTTGCTGATTTGATTCGCGGGGTTCATTTTTTGTCGTAAATATATCTCATATACATACATATGCGAGATTGTAACACTCTCTTTAACCTATTGGAGTAACACTTGATTTCACTTTAATAAATATAACTACCCAACAC(SEQ ID NO:237)MPSIKLQSSDEEIFDTDIQIAKCSGTIKTMLEDCGMEDDENAIVPLPNVNSTILRKVLTWAHYHKDDPQPTEDDESKEKRTDDIISWDADFLKVDQGTLFELILAANYLDIKGLLELTCKTVANMIKGKTPEEIRKTFNIKKDFSPAEEEQVRKENEWCEEK(SEQ ID 
NO:238)AAACATCGAAAGTGCACAATCGTTTGTTATCTTTGTACGAAAACAACGGTGATTTCCACACAGGCATAACCTGCAAGAGAAAGCCCAAAATGCCCAGCATCAAGTTGCAATCTTCGGATGAGGAGATCTTTGACACGGATATCCAGATCGCCAAGTGCTCCGGCACTATCAAGACCATGCTGGAGGACTGCGGCATGGAGGACGATGAGAATGCCATTGTGCCGTTGCCCAATGTGAATTCGACGATTCTTCGCAAGGTGCTTACCTGGGCTCACTACCACAAGGACGACCCCCAGCCAACGGAGGATGATGAGAGCAAGGAGAAGCGCACAGACGACATTATCTCATGGGATGCAGATTTCCTAAAAGTCGACCAGGGCACACTGTTTGAGCTGATATTGGCAGCGAACTATCTGGACATTAAGGGCCTTCTGGAGCTCACCTGCAAGACTGTTGCAAACATGATTAAGGGAAAGACTCCCGAGGAAATACGCAAGACCTTCAACATTAAGAAGGACTTTTCGCCCGCCGAGGAGGAGCAGGTGCGCAAGGAGAACGAGTGGTGCGAGGAGAAGTAAAGCGCGGCATTTCGCGGGACCAACATTAAGTTGAAACAGCTAGGGGATTCGGGAACGAATTGGATTTGCAGCATTGCAACTTTACTTAGTTGCTACTTTCATTTACATTTTTTTTTATTTTTAACCCCAGCAGAGACTCGATTTAAATTGTGTATAAATGATCTGTTGCTGATTTGATTCGCGGGGTTCATTTTTTGTCGTAAATATATCTCATATACATACATATGCGAGATTGTAACACTCTCTTTAACCTATTGGAGTAACACTTGATTTCACTTTAATAAATATAACTACCCAACAC(SEQ ID NO:239)MPSIKLQSSDEEIFDTDIQIAKCSGTIKTMLEDCGMEDDENAIVPLPNVNSTILRKVLTWAHYHKDDPQPTEDDESKEKRTDDIISWDADFLKVDQGTLFELILAANYLDIKGLLELTCKTVANMIKGKTPEEIRKTFNIKKDFSPAEEEQVRKENEWCEEK


Human homologue of Complete Genome candidate


XP054159—hypothetical protein

(SEQ ID NO:240)  1gcctcccagc tctcgtcagc ctcctgctgg ccatctcctt aacaccaaac actatgcctt 61caattcagtt gcagagtttt gatggagaga tatttgcagt tgatgtggaa attgccaaac121aatctgtgac tatcaagacc acgttggaag atttgggaat ggatgatgaa ggagatgacc181cagttcctct accaaatgtg aatgcagcag tattaaaaaa ggtcattcag tggtgcaccc241accacaagga tgaccctcct ccccctgaag atgatgagaa caaagaaaag caaacagacg301atatccctgt ttgggaccaa gaattcctga aagttgctca aggaacactt tttgaactca361ttcgggctgc aaactactta gacatcaaag gtttgcttga tgttacatgc aagactgttg421ccaatatgat caaggggaaa actcctgagg agattcgcaa gacattcaat atcaaaaatg481actttactga agaggaggaa gcccaggtac gcaaagagaa ccagtggtgt gaagagaagt541gaaatgttgt gcctgacact gtaacactgt aaggat(SEQ ID NO:241)  1mpsiqlqsfd geifavdvei akqsvtiktt ledlgmddeg ddpvplpnvn aavlkkviqw 61cthhkddppp peddenkekq tddipvwdqe flkvaqgtlf eliraanyld ikglldvtck121tvanmikgkt peeirktfni kndfteeeea qvrkenqwce ek


Putative function


Cell cycle protein, ubiquitin ligase


Example 24 (Category 3)

Line ID—186


Phenotype—Lethal phase larval stage 3. Small brain, high mitotic index, rod-like overcondensed chromosomes, fewer ana- and telophases.


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003494 (12C6-7)


P element insertion site—123,540


Annotated Drosophila genome Complete Genome candidate


CG18319—bendless ubiquitin conjugating enzyme

(SEQ ID NO:242)TTAGTCACAGCAACGCACACACACACTACCAAACGGCTACATTTTTTTTCGAGTGTGTTCGACATTCATAATTTTTGTGGTGGAGCTGCCTGCAAAATCGAATTTTATCAGTTTGCCAACGAAGTTATCGGCCATAACTGCAAATAAAGTTCAGCAATAACTTGGCGCTGTTACGATCTCAACGAGAAGGTCCAGACTCAACCCGCGTTTCCAGTTCACCGCGTAAAAGGAACCAGCTAAACGATGTCCAGCCTGCCACGTCGCATCATCAAGGAGACTCAACGTTTGATGCAGGAGCCAGTGCCTGGGATCAATGCCATTCCCGATGAGAACAATGCCCGTTACTTCCATGTGATCGTGACCGGACCGAACGATTCGCCCTTCGAGGGCGGCGTGTTCAAGCTGGAGCTGTTCCTACCGGAGGACTATCCAATGTCAGCGCCCAAAGTGCGCTTCATCACGAAGATCTACCATCCGAACATCGATCGTTTGGGCCGCATTTGCCTCGACGTGCTGAAGGACAAGTGGAGTCCAGCCCTGCAGATCCGGACCATATTGCTATCCATTCAGGCACTGCTCAGTGCACCCAATCCCGACGATCCGCTGGCCAACGATGTGGCTGAGTTGTGGAAGGTCAACGAGGCGGAGGCCATTCGCAATGCCCGCGAGTGGACCCAGAAATATGCCGTCGAAGACTGAACGCCCGAGGTCAGGAGGAAAGTCAGAAAGCGGATCCGTCAGTTGTATCGGCGTTTTTCCAGAAAGTGGGTGCGTGACATGAACGGGCGGGTGGGTAAATTGAATACTTTAAAAGCAACCAGAAAAACCTAAAACATACGAAAGAAAACATAAAATAAGAAAAAAGTAAGGAAGCAAACATAAAAAAAAACGATTTAAGAACACATTTTTTTTTCGAACCTTCTGGGGCGGGATATACATATAAAATATTAATATATATATTTTTTTCAACCAATCGATCGGGGCGATCGGCGAAATGGAGGAGAGATAGCGAAAGCATTCTTTATGTAAGACGTATACATGTATCCGAAACAAACTAAAAACGAAAAAAAAAAAAAAAAAAAAAAACAGTAATTGGTTTTAGTCGTTTCTATTGATTTGTTCGAGGGTTCTGGTGTCTATATACATATAGCCGTATATAATTCTATGTGTAACTGAAATAACCAACCCATAACCATTAACACATGTAGCATCAGATATGATAAATCAATTGGAAAGGCAAACAAGAAGGGATTTTGATTTCCTTTAACTCGTCATTTGAAAACTCGGCTTAAATGTCAATTCAAAATAGAGAATTTTGATTGTATCATTTTCAGTGTTTCAGAAAATTTAAGATGTGATCGTCCAACTTGTAGACTTTACTTTTCTTAACTAAGAGTTCACCATTTCGATTGATACTTGAGCTTTGCCTGGGTTGTGTCAGAGTCCCTTTGATAAACGATAAATAGTTTTTACTCGAAAACAATTTTTTTTAACCAAACAATGAAGCCTTTAAGCTATTAGTAATTTTTGAAAAAAAAAAAAAATAAAAATATATATATAAAAAATATACAAAAATATGATACATGATCAAAATACAATGAATGCATACACTATATATTTATACAAAAAAAATACAAAAAGAAAAACAAAAGTAGTGGCTTGATTGCGTGAAAATTTCAAGTGCAGTTCTCAACAAAAATTGTGTACAGTAATTAAATGTTTGTCACCGAAATCACTAAAGGATAATCCAAAAAACAATAGCAACCGAAAAGCAACCATAAATCAAAGAGTAAGCGAAAATAAAAATTCAGTTTTCTTTAATTTTAATTAATTTTTTTCTAAGAAAAATAAATAAAAACGAAAAATTCAAAT(SEQ ID 
NO:243)MSSLPRRIIKETQRLMQEPVPGINAIPDENNARYFHVIVTGPNDSPFEGGVFKLELFLPEDYPMSAPKVRFITKIYHPNIDRLGRICLDVLKDKWSPALQIRTILLSIQALLSAPNPDDPLANDVAELWKVNEAEAIRNAREWTQKYAVED


Human homologue of Complete Genome candidate


BAA11675—ubiquitin-conjugating enzyme E2 UbcH-ben

(SEQ ID NO:244)   1actcgtgcgt gaggcgagag gagccggaga cgagaccaga ggccgaactc gggttctgac  61aagatggccg ggctgccccg caggatcatc aaggaaaccc agcgtttgct ggcagaacca 121gttcctggca tcaaagccga accagatgag agcaacgccc gttattttca tgtggtcatt 181gctggccctc aggattcccc ctttgaggga gggactttta aacttgaact attccttcca 241gaagaatacc caatggcagc ccctaaagta cgtttcatga ccaaaattta tcatcctaat 301gtagacaagt tgggaagaat atgtttagat attttgaaag ataagtggtc cccagcactg 361cagatccgca cagttctgct atcgatccag gccttgttaa gtgctcccaa tccagatgat 421ccattagcaa atgatgtagc ggagcagtgg aagaccaacg aagcccaagc catagaaaca 481gctagagcat ggactaggct atatgccatg aataatattt aaattgatac gatcatcaag 541tgtgcatcac ttctcctgtt ctgccaagac ttcctcctct ttgtttgcat ttaatggaca 601cagtcttaga aacattacag aataaaaaag cccagacatc ttcagtcctt tggtgattaa 661atgcacatta gcaaatctat gtcttgtcct gattcactgt cataaagcat gagcagaggc 721tagaagtatc atctggattg ttgtgaaacg tttaaaagca gtggcccctc cctgctttta 781ttcatttccc ccatcctggt ttaagtataa agcactgtga atgaaggtag ttgtcaggtt 841agctgcaggg gtgtgggtgt ttttatttta ttttatttta ttttattttt gaggggggag 901gtagtttaat tttatgggct cctttccccc ttttttggtg atctaattgc attggttaaa 961agcagctaac caggtcttta gaatatgctc tagccaagtc taactttatt tagacgctgt1021agatggacaa gcttgattgt tggaaccaaa atgggaacat taaacaaaca tcacagccct1081cactaataac attgctgtca agtgtagatt ccccccttca aaaaaagctt gtgaccattt1141tgtatggctt gtctggaaac ttctgtaaat cttatgtttt agtaaaatat tttttgttat1201tct(SEQ ID NO:245)   1maglprriik etqrllaepv pgikaepdes naryfhvvia gpqdspfegg tfklelflpe  61eypmaapkvr fmtkiyhpnv dklgricldi lkdkwspalq irtvllsiqa llsapnpddp 121landvaeqwk tneaqaieta rawtrlyamn ni


Putative function


Ubiquitin conjugating enzyme


Example 25 (Category 3)

Line ID—301


Phenotype—Semilethal male and female. Low mitotic index, badly defined chromosomes, weak/uneven staining, fewer ana- and telophases


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003422 (2B7-10)


P element insertion site—96,307


Annotated Drosophila genome Complete Genome candidate


CG14813—deltaCOP, component of coatomer involved in retrograde (Golgi to ER) transport

(SEQ ID NO:246)TCGCAGAACCGAACACGTCAGCTACGGGGATTGATTGTTAAACAACGTTTCTATCGCCCCGCAAATCCGATCCGTAGCAGCAGTCCATCCTGCGCCGTCCGCATCCGATCCGCGAAGTATTTTCCAGGGCAAAAACGTCAAACGCAGCAGCAAAATGGTATTAATTGCTGCGGCTGTCTGCACGAAGAATGGCAAAGTGATTCTGTCACGTCAGTTCGTCGAGATGACGAAGGCACGCATCGAGGGACTGCTGGCTGCCTTTCCCAAGCTGATGACTGCTGGCAAGCAGCACACTTACGTGGAGACGGACTCCGTGCGCTACGTCTACCAGCCGATGGAGAAACTATATATGCTGCTCATCACCACTAAGGCCAGCAACATTCTGGAGGATCTGGAGACCCTGCGCCTCTTCTCGAAAGTGATTCCCGAGTACAGCCACTCGCTCGACGAGAAGGAGATTGTGGAGAATGCCTTCAATCTGATCTTCGCATTTGACGAGATCGTGGCACTCGGCTACAGGGAGAGCGTCAACTTGGCCCAGATCAAGACCTTCGTGGAGATGGACTCACATGAGGAGAAGGTCTACCAGGCAGTGCGTCAGACGCAGGAGCGTGATGCGCGCCAGAAGATGCGCGAGAAGGCCAAGGAACTGCAGCGGCAGCGCATGGAGGCCAGCAAACGGGGTGGTCCCTCCCTGGGTGGCATTGGCAGCCGCAGCGGCGGCTTTAGCGCCGACGGAATTGGCAGTAGCGGCGTGAGCAGCAGTTCCGGTGCCTCCAGCGCCAACACCGGCATCACCTCCATCGATGTGGACACCAAATCCAAGGCGGCTGCCAGTAAACCAGCTTCCCGCAATGCCCTCAAGCTAGGTGGCAAGTCCAAGGACGTCGATAGTTTCGTGGATCAGCTGAAGAACGAGGGCGAGAAGATTGCCAATCTGGCACCGGCGGCGCCCGCTGGAGGTTCCAGTGCTGCAGCTAGCGCCAGTGCAGCGGCCAAGGCAGCTATCGCGTCGGACATTCACAAAGAGAGCGTACATCTGAAGATTGAGGACAAGCTAGTAGTGCGTCTGGGACGCGATGGTGGCGTGCAGCAGTTCGAGAACTCGGGCCTCCTGACGTTGCGCATTACGGACGAGGCCTACGGACGCATTTTGCTGAAGCTGTCTCCCAACCACACACAGGGCCTGCAGTTGCAGACCCACCCCAACGTGGACAAGGAGCTGTTCAAGTCGCGCACTACCATCGGACTAAAGAACTTGGGCAAGCCGTTTCCCCTTAACACCGATGTGGGTGTGCTCAAGTGGCGCTTCGTCTCGCAGGACGAGTCGGCAGTCCCGCTGACCATTAACTGCTGGCCATCGGATAATGGAGAGGGTGGATGCGATGTTAACATTGAGTATGAACTGGAGGCGCAGCAGCTAGAGCTGCAGGACGTGGCCATTGTCATTCCCTTGCCAATGAATGTGCAGCCTTCGGTGGCGGAGTACGACGGCACCTACAACTACGATTCACGCAAGCATGTGCTCCAGTGGCACATTCCAATAATCGATGCCGCCAACAAGTCCGGTTCTATGGAGTTCAGCTGCAGTGCCTCCATTCCCGGTGACTTCTTCCCCTTGCAGGTGTCCTTCGTCTCGAAAACGCCGTATGCGGGCGTCGTGGCCCAGGATGTGGTGCAGGTGGACAGCGAGGCGGCGGTCAAGTATTCAAGCGAGTCCATTCTGTTCGTGGAAAAGTACGAGATCGTGTAGGCCGCGCCGCTGGCCACGCCCACCTAAGTAGTACATAAATATACATAATTTCCCGGGGTCATCCGATGCGATGCAATTAATTCAACTGCTGCAGCATGTTGAGAATTATTTTTCCATGTGCGAACTTTACATATTTATGGCGCAGACAGCTTCTCAGAGCGAGTAATTGATTCC(SEQ ID 
NO:247)MVLIAAAVCTKNGKVILSRQFVEMTKARIEGLLAAFPKLMTAGKQHTYVETDSVRYVYQPMEKLYMLLITTKASNILEDLETLRLFSKVIPEYSHSLDEKEIVENAFNLIFAFDEIVALGYRESVNLAQIKTFVEMDSHEEKVYQAVRQTQERDARQKMREKAKELQRQRMEASKRGGPSLGGIGSRSGGFSADGIGSSGVSSSSGASSANTGITSIDVDTKSKAAASKPASRNALKLGGKSKDVDSFVDQLKNEGEKIANLAPAAPAGGSSAAASASAAAKAAIASDIHKESVHLKIEDKLVVRLGRDGGVQQFENSGLLTLRITDEAYGRILLKLSPNHTQGLQLQTHPNVDKELFKSRTTTGLRNLGRPFPLNTDVGVLKWRFVSQDESAVPLTINCWPSDNGEGGCDVNIEYELEAQQLELQDVAIVIPLPMNVQPSVAEYDGTYNYDSRKHVLQWHIPIIDAANKSGSMEFSCSASIPGDFFPLQVSFVSKTPYAGVVAQDVVQVDSEAAVKYSSESILFVEKYEIV


Human homologue of Complete Genome candidate


CAA57071—archain, possible role in vesicle structure or trafficking

(SEQ ID NO:248)   1cgggcggttc ctgtcaaggg ggcagcaggt ccagagctgc tggtgctccc gttccccaga  61ccctacccct atccccagtg gagccggagt gcggcgcgcc ccaccaccgc cctcaccatg 121gtgctgttgg cagcagcggt ctgcacaaaa gcaggaaagg ctattgtttc tcgacagttt 181gtggaaatga cccgaactcg gattgagggc ttattagcag cttttccaaa gctcatgaac 241actggaaaac aacatacgtt tgttgaaaca gagagtgtaa gatatgtcta ccagcctatg 301gagaaactgt atatggtact gatcactacc aaaaacagca acattttaga agatttggag 361accctaaggc tcttctcaag agtgatccct gaatattgcc gagccttaga agagaatgaa 421atatctgagc actgttttga tttgattttt gcttttgatg aaattgtcgc actgggatac 481cgggagaatg ttaacttggc acagatcaga accttcacag aaatggattc tcatgaggag 541aaggtgttca gagccgtcag agagactcaa gaacgtgaag ctaaggctga gatgcgtcgt 601aaagcaaagg aattacaaca ggcccgaaga gatgcagaga gacagggcaa aaaagcacca 661ggatttggcg gatttggcag ctctgcagta tctggaggca gcacagctgc catgatcaca 721gagaccatca ttgaaactga taaaccaaaa gtggcacctg caccagccag gccttcaggc 781cccagcaagg ctttaaaact tggagccaaa ggaaaggaag tagataactt tgtggacaaa 841ttaaaatctg aaggtgaaac catcatgtcc tctagtatgg gcaagcgtac ttctgaagca 901accaaaatgc atgctccacc cattaatatg gaaagtgtac atatgaagat tgaagaaaag 961ataacattaa cctgtggacg agacggagga ttacagaata tggagttgca tggcatgatc1021atgcttagga tctcagatga caagtatggc cgaattcgtc ttcatgtgga aaatgaagat1081aagaaagggg tgcagctaca gacccatcca aatgtggata aaaaactttt cactgcagag1141tctctaattg gcctgaagaa tccagagaag tcatttccag tcaacagtga cgtaggggtg1201ctaaagtgga gactacaaac cacagaggaa tcttttattc cactgacaat taattgctgg1261ccctcggaga gtggaaatgg ctgtgatgtc aacatagaat atgagctaca agaagataat1321ttagaactga atgatgtggt tatcaccatc ccactcccgt ctggtgtcgg cgcgcctgtt1381atcggtgaga tcgatgggga gtatcgacat gacagtcgac gaaataccct ggagtggtgc1441ctgcctgtga ttgatgccaa aaataagagt ggcagcctgg agtttagcat tgctgggcag1501cccaatgact tcttccctgt tcaagtttcc tttgtctcca agaaaaatta ctgtaacata1561caggttacca aagtgaccca ggtagatgga aacagccccg tcaggttttc cacagagacc1621actttcctag tggataagta tgaaatcctg taataccaag aagagggagc tgaaaaggaa1681aattttcaga ttaataaaga agacgccaat gatggctgaa 
gagtttttcc cagatttaca1741agccactgga gacccctttt ttctgataca atgcacgatt ctctgcgcgc aaggaccctc1801gactcacccc catgtttcag tgtcacagag acattctttg ataaggaaat ggcacaaaca1861taaagggaaa ggctgctaat tttctttggc agattgtatt ggccagcagg aaagcaagct1921ctccagagaa tgcccccagt taaatacctc ctctaccttt acctaagttg ctcctttatt1981tttattttat aataataa(SEQ ID NO:249)   1mvllaaavct kagkaivsrq fvemtrtrie gllaafpklm ntgkqhtfve tesvryvyqp  61meklymvlit tknsniledl etlrlfsrvi peycraleen eisehcfdli fafdeivalg 121yrenvnlaqi rtftemdshe ekvfravret qereakaemr rkakelqqar rdaerqgkka 181pgfggfgssa vsggstaami tetiietdkp kvapaparps gpskalklga kgkevdnfvd 241klksegetim sssmgkrtse atkmhappin mesvhmkiee kitltcgrdg glqnmelhgm 301imlrisddky grirlivene dkkgvqlqth pnvdkklfta esliglknpe ksfpvnsdvg 361vlkwrlqtte esfipitinc wpsesgngcd vnieyelqed nlelndvvit iplpsgvgap 421vigeidgeyr hdsrrntlew clpvidaknk sgslefsiag qpndffpvqv sfvskknycn 481iqvtkvtqvd gnspvrfste ttflvdkyei l


Putative function


Role in vesicle trafficking


Example 26 (Category 3)

Line ID—148


Phenotype—Lethal phase pupal to pharate adult. Lagging chromosomes and bridges in ana- and telophase


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003438 (6B-C)


P element insertion site—116,914


Annotated Drosophila genome Complete Genome candidate


CG8655—cdc7 kinase

(SEQ ID NO:250)ATGCGTTATGACGCCTCCGCCGCTTTCGTGATGCCCTTCATGGCACATGACCGATTCCAGGACTTTTACACGCGCATGGATGTGCCCGAGATCCGGCAGTATATGCGCAATCTCCTGGTGGCACTGCGTCATGTCCACAAGTTCGATGTCATCCATCGCGACGTGAAGCCGAGCAACTTTCTCTACAATCGACGTCGGCGAGAGTTTCTCCTCGTCGATTTCGGTCTGGCCCAGCATGTGAATCCTCCGGCTGCGCGATCTTCCGGAAGTGCCGCCGCCATCGCCGCAGCCAACAACAAAAACAACAACAATAATAACAATAATAATAGCAAACGGCCACGAGAGCGCGAATCAAAGGGGGATGTGCAGCAAATTGCGCTGGATGCTGGTTTGGGTGGAGCAGTGAAGCGTATGCGTTTGCACGAGGAGTCCAACAAGATGCCCCTGAAACCGGTCAACGATATTGCGCCAAGCGATGCGCCGGAGCAGTCAGTAGATGGGTCCAATCACGTCCAGCCACAGCTAGTGCAGCAAGAGCAGCAACAACTGCAGCCGCAACAGCAGCAGCAACAACAGCAGCAGCAACAACAGTCGCAACAGCAGCAGCAGCCGCAGCAGCAGTCGCAACAGCAGCACCCACAACGACAGCCACAACTGGCGCAGATGGATCAAACAGCATCGACGCCATCTGGCAGCAAGTACAATACGAATCGAAATGTCTCGGCAGCAGCGGCTAATAATGCCAAGTGCGTTTGCTTTGCAAATCCCTCAGTTTGCCTCAACTGTCTGATGAAGAAGGAGGTGCACGCCTCCAGGGCAGGAACACCTGGCTATCGGCCGCCCGAGGTTCTGCTCAAGTACCCAGATCAGACCACTGCCGTGGACGTTTGGGCGGCGGGTGTGATATTCCTTTCGATCATGTCAACGGTGTATCCGTTTTTCAAAGCGCCCAACGATTTTATCGCGCTGGCCGAGATTGTAACAATATTTGGAGATCAGGCGATACGGAAGACGGCCTTGGCTCTCGACCGTATGATCACCCTGAGCCAGAGGTCCAGGCCACTGAATCTGCGAAAGTTGTGCCTGCGCTTTCGCTATCGTTCCGTTTTTAGTGATGCCAAGCTCCTCAAGAGCTACGAATCTGTGGACGGAAGCTGCGAAGTGTGCCGGAATTGTGATCAATACTTCTTCAACTGCCTATGCGAGGATAGCGATTACTTGACAGAGCCACTGGACGCATACGAATGTTTTCCACCCAGCGCCTATGACCTACTGGATCGCCTGCTCGAGATTAATCCCCATAAACGAATTACCGCCGAAGAGGCACTAAAGCATCCATTCTTTACGGCCGCCGAGGAGGCCGAGCAGACGGAGCAGGATCAGTTGGCCAATGGAACGCCGCGCAAGATGCGTCGACAAAGATATCAAAGTCACAGAACGGTGGCCGCCTCACAGGAGCAGGTCAAGCAGCAGGTTGCCCTTGATCTGCAGCAAGCGGCCATTAACAAGCTGTGA(SEQ ID 
NO:251)MRYDASAAFVMPFMAHDRFQDFYTRMDVPEIRQYMRNLLVALRHVHKFDVIHRDVKPSNFLYNRRRREFLLVDFGLAQHVNPPAARSSGSAAAIAAANNKNNNNNNNNNSKRPRERESKGDVQQIALDAGLGGAVKRMRLHEESNKMPLKPVNDIAPSDAPEQSVDGSNHVQPQLVQQEQQQLQPQQQQQQQQQQQQSQQQQQPQQQSQQQHPQRQPQLAQMDQTASTPSGSKYNTNRNVSAAAANNAKCVCFANPSVCLNCLMKKEVHASRAGTPGYRPPEVLLKYPDQTTAVDVWAAGVIFLSIMSTVYPFFKAPNDFIALAEIVTIFGDQAIRKTALALDRMITLSQRSRPLNLRKLCLRFRYRSVFSDAKLLKSYESVDGSCEVCRNCDQYFFNCLCEDSDYLTEPLDAYECFPPSAYDLLDRLLEINPHKRITAEEALKHPFFTAAEEAEQTEQDQLANGTPRKMRRQRYQSHRTVAASQEQVKQQVALDLQQAAINKL


Human homologue of Complete Genome candidate


AAB97512—HsCdc7

(SEQ ID NO:252)   1atggaggcgt ctttggggat tcagatggat gagccaatgg ctttttctcc ccagcgtgac  61cggtttcagg ctgaaggctc tttaaaaaaa aacgagcaga attttaaact tgcaggtgtt 121aaaaaagata ttgagaagct ttatgaagct gtaccacagc ttagtaatgt gtttaagatt 181gaggacaaaa ttggagaagg cactttcagc tctgtttatt tggccacagc acagttacaa 241gtaggacctg aagagaaaat tgctgtaaaa cacttgattc caacaagtca tcctataaga 301attgcagctg aacttcagtg cctaacagtg gctggggggc aagataatgt catgggagtt 361aaatactgct ttaggaagaa tgatcatgta gttattgcta tgccatatct ggagcatgag 421tcgtttttgg acattctgaa ttctctttcc tttcaagaag tacgggaata tatgcttaat 481ctgttcaaag ctttgaaacg cattcatcag tttggtattg ttcaccgtga tgttaagccc 541agcaattttt tatataatag gcgcctgaaa aagtatgcct tggtagactt tggtttggcc 601caaggaaccc atgatacgaa aatagagctt cttaaatttg tccagtctga agctcagcag 661gaaaggtgtt cacaaaacaa atcccacata atcacaggaa acaagattcc actgagtggc 721ccagtaccta aggagctgga tcagcagtcc accacaaaag cttctgttaa aagaccctac 781acaaatgcac aaattcagat taaacaagga aaagacggaa aggagggatc tgtaggcctt 841tctgtccagc gctctgtttt tggagaaaga aatttcaata tacacagctc catttcacat 901gagagccctg cagtgaaact catgaagcag tcaaagactg tggatgtact gtctagaaag 961ttagcaacaa aaaagaaggc tatttctacg aaagttatga atagtgctgt gatgaggaaa1021actgccagtt cttgcccagc tagcctgacc tgtgactgct atgcaacaga taaagtttgt1081agtatttgcc tttcaaggcg tcagcaggtt gcccctaggg caggtacacc aggattcaga1141gcaccagagg tcttgacaaa gtgccccaat caaactacag caattgacat gtggtctgca1201ggtgtcatat ttctttcttt gcttagtgga cgatatccat tttataaagc aagtgatgat1261ttaactgctt tggcccaaat tatgacaatt aggggatcca gagaaactat ccaagctgct1321aaaacttttg ggaaatcaat attatgtagc aaagaagttc cagcacaaga cttgagaaaa1381ctctgtgaga gactcagggg tatggattct agcactccca agttaacaag tgatatacag1441gggcatgctt ctcatcaacc agctatttca gagaagactg accataaagc ttcttgcctc1501gttcaaacac ctccaggaca atactcaggg aattcattta aaaaggggga tagtaatagc1561tgtgagcatt gttttgatga gtataatacc aatttagaag gctggaatga ggtacctgat1621gaagcttatg acctgcttga taaacttcta gatctaaatc cagcttcaag aataacagca1681gaagaagctt tgttgcatcc attttttaaa gatatgagct 
tgtga(SEQ ID NO:253)   1measlgiqmd epmafspqrd rfqaegslkk neqnfklagv kkdieklyea vpqlsnvfki  61edkigegtfs svylataqlq vgpeekiavk hliptshpir iaaelqcltv aggqdnvmgv 121kycfrkndhv viampylehe sfldilnsls fqevreymln lfkalkrihq fgivhrdvkp 181snflynrrlk kyalvdfgla qgthdtkiel lkfvqseaqq ercsqnkshi itgnkiplsg 241pvpkeldqqs ttkasvkrpy tnaqiqikqg kdgkegsvgl svqrsvfger nfnihssish 301espavklmkq sktvdvlsrk latkkkaist kvmnsavmrk tasscpaslt cdcyatdkvc 361siclsrrqqv apragtpgfr apevltkcpn qttaidmwsa gviflsllsg rypfykasdd 421ltalaqimti rgsretiqaa ktfgksilcs kevpaqdlrk lcerlrgmds stpkltsdiq 481ghashcpais ektdhkascl vqtppgqysg nsfkkgdsns cehcfdeynt nlegwnevpd 541eaydlldkll dlnpasrita eeallhpffk dmsl


Putative function


Protein kinase which regulates the G1/S phase transition and/or DNA replication in mammalian cells.


Example 27 (Category 3)

Line ID—335


Phenotype—Lethal phase, pupal. Uneven chromosome condensation, lagging chromosomes in anaphase


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003424 (3B1-2)


P element insertion site—286,560


Annotated Drosophila genome Complete Genome candidate


CG2621—shaggy, protein serine/threonine kinase

(SEQ ID NO:254)ATGTTTACCTTCTACACCAATATAAATAATACACTGATCAACAACAACAATTTTAATAATAATACTAGTAACAGTAATAATAATAATAACAACGTTATAAGCCAGCCGATTAAAATACCGCTAACCGAGCGCTTCTCATCGCAAACATCGACGGGCTCGGCGGATAGCGGTGTAATTGTTTCCAGTGCATCGCAGCAGCAACTGCAGTTGCCACCACCACGCAGTAGCAGTGGATCGCTGAGTCTGCCACAAGCGCCACCTGGCGGCAAGTGGCGGCAGAAGCAGCAGCGCCAACAGTTGCTGCTCAGCCAGGACAGCGGCATCGAAAATGGTGTCACCACTCGTCCATCGAAAGCCAAGGACAACCAGGGTGCGGGAAAAGCCAGTCACAATGCCACAAGCTCGAAGGAGAGCGGCGCGCAGTCGAACAGCAGCAGCGAGAGCCTGGGCAGCAATTGCTCCGAGGCCCAGGAGCAGCAGAGAGTAAGAGCCTCCTCCGCTCTGGAGCTCAGCAGCGTGGACACTCCCGTGATCGTCGGCGGTGTGGTCAGTGGAGGCAACAGCATCTTGCGCAGCCGCATTAAGTACAAGAGTACGAACAGCACCGGAACCCAGGGATTCGATGTGGAGGATCGCATCGATGAGGTGGATATCTGTGATGATGATGATGTCGACTGCGATGATCGCGGATCGGAGATCGAGGAGGAGGAGGAGGACCAAACCGAACAAGAGGAGGAGGTCGATGAGGTGGATGCCAAGCCGAAGAACCGACTTTTGCCACCGGATCAGGCGGAACTCACAGTGGCGGCGGCCATGGCACGTCGACGCGATGCCAAGAGCCTGGCCACCGACGGTCACATATATTTCCCACTGCTCAAGATCAGCGAGGATCCGCACATTGATTCGAAGCTGATCAATCGCAAGGATGGCCTCCAGGACACCATGTATTATTTGGACGAATTCGGCAGTCCAAAGTTGCGAGAGAAGTTCGCCCGCAAGCAGAAGCAGCTGCTCGCCAAGCAGCAGAAGCAGTTGATGAAACGTGAAAGGAGGAGCGAGGAGCAGCGCAAGAAGCGAAACACCACCGTGGCATCCAACTTGGCGGCCAGCGGAGCGGTGGTGGACGACACCAAAGATGATTACAAACAACAACCACACTGTGATACTAGCTCTAGGAGCAAAAATAACTCGGTACCCAATCCACCCAGCAGCCATCTCCATCAGAACCACAATCATCTCGTTGTGGATGTGCAAGAGGATGTGGATGATGTGAATGTGGTTGCCACCAGCGACGTGGACAGTGGTGTCGTCAAGATGCGCCGCCATAGCCACGATAACCACTACGACCGAATTCCCCGGAGCAATGCTGCCACCATTACCACCCGCCCTCAAATCGACCAACAGTCGTCGCACCACCAGAACACCGAGGATGTGGAGCAAGGAGCTGAGCCCCAAATCGATGGCGAAGCGGATCTGGATGCGGATGCGGATGCGGACAGCGATGGGAGTGGCGAGAACGTTTAAGACTGCCAAATGGCCAGAACACAGTCCTGCAAAAACCAAACAGGTCGCGATGGTTCTAAAATCACAACAGTTGTTGCAACACCCGGCCAAGGCACCGATCGCGTACAAGAGGTCTCCTATACAGACACAAAGGTCATCGGCAATGGCAGCTTCGGCGTCGTGTTCCAGGCAAAGCTCTGCGATACCGGCGAACTGGTGGCAATCAAAAAAGTTTTACAAGACAGACGATTTAAGAATCGCGAATTGCAAATAATGCGCAAATTGGAGCATTGTAATATTGTGAAGCTTTTGTACTTTTTCTATTCGAGTGGTGAAAAGCGTGATGAAGTATTTTTGAATTTAGTCCTCGAATATATACCAGAAACCGTATACAAAGTGGCTCGCCAATATGCCAAAACCAAGCAAACGATACCAATCAACTTTATTCGGCTCTACATGTATCAACTGTTCAGAAGTTTGGC
CTACATCCACTCGCTGGGCATTTGCCATCGTGATATCAAGCCGCAGAATCTTCTGCTCGATCCGGAGACGGCTGTGCTGAAGCTCTGTGACTTTGGCAGCGCCAAACAGCTGCTGCACGGCGAGCCGAATGTATCGTATATCTGCTCCCGGTATTACCGCGCCCCCGAGCTCATCTTTGGCGCCATCAATTATACAACAAAGATCGATGTCTGGAGTGCCGGTTGCGTTTTGGCCGAACTGCTGCTGGGCCAGCCCATCTTCCCTGGCGATTCCGGTGTGGATCAGCTCGTCGAGGTCATCAAGGTCCTGGGCACACCGACAAGAGAACAGATACGCGAAATGAATCCAAACTACACGGAATTCAAGTTCCCTCAGATTAAGAGTCATCCATGGCAGAAAGTTTTCCGTATACGCACTCCTACAGAAGCTATCAACTTGGTGTCCCTGCTGCTCGAGTATACGCCCAGTGCCAGGATCACACCGCTCAAGGCCTGCGCACATCCGTTCTTCGATGAGCTACGCATGGAGGGTAATCACACCTTGCCCAACGGTCGCGATATGCCGCCGCTGTTCAACTTCACAGAGCATGAGCTCTCAATACAGCCCAGCCTAGTGCCGCAGTTGTTGCCCAAGCATCTGCAGAACGCATCCGGACCTGGCGGCAATCGACCCTCGGCCGGCGGAGCAGCCTCCATTGCGGCCAGCGGCTCCACCAGCGTCTCGTCAACGGGCAGTGGTGCCTCGGTGGAAGGATCCGCCCAGCCACAGTCGCAGGGTACAGCAGCAGCTGCGGGATCCGGATCGGGCGGAGCAACAGCAGGAACCGGCGGAGCGAGTGCCGGTGGACCCGGATCTGGTAACAACAGTAGCAGCGGCGGAGCATCGGGAGCGCCGTCCGCTGTGGCTGCCGGAGGAGCCAATGCCGCCGTCGCTGGCGGTGCTGGTGGTGGTGGCGGAGCCGGTGCGGCGACCGCAGCTGCAACAGCAACTGGCGCTATAGGCGCGACTAATGCCGGCGGCGCCAATGTAACAGATTCATAGGGGAAATAGTAACATACATACACACACTAAATATATATCCAAGCATATATATATAGTAATCATTATATATAACACCTACACCCACAACAACAACAACAGCAATTATATATAATAACCATAAACAAGAATGGAGAAAGCCAATCCAGCAATCACAGCAAACTATATACACAACAACAACAATTAAATTAATTAATGCAATTGATGAAAGAACAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCATCAACCGCAATTTCAAAAGAACTCTAGAAACAGCAAAGGCATAAAATATAACAAAAGAAATATTTTACTTAGGTAAAACATTAAATTTATTTTAAATCTAAAATAAACTAATAAGCATTAAATAATACATGATAATGGTAAATAAACACACAATAATTATAATAGTAGAGCGAGCGCTGATCGATTGTCATTTTATTGCTGCCGC(SEQ ID 
NO:255)MFTFYTNINNTLINNNNNNNNTSNSNNNNNNVISQPIKIPLTERFSSQTSTGSADSGVIVSSASQQQLQLPPPRSSSGSLSLPQAPPGGKWRQKQQRQQLLLSQDSGIENGVTTRPSKAKDNQGAGKASHNATSSKESGAQSNSSSESLGSNCSEAQEQQRVRASSALELSSVDTPVIVGGVVSGGNSILRSRIKYKSTNSTGTQGFDVEDRIDEVDICDDDDVDCDDRGSEIEEEEEDQTEQEEEVDEVDAKPKNRLLPPDQAELTVAAAMARRRDAKSLATDGHIYFPLLKISEDPHIDSKLINRKDGLQDTMYYLDEFGSPKLREKFARKQKQLLAKQQKQLMKRERRSEEQRKKRNTTVASNLAASGAVVDDTKDDYKQQPHCDTSSRSKNNSVPNPPSSHLHQNHNHLVVDVQEDVDDVNVVATSDVDSGVVKMRRHSHDNHYDRIPRSNAATITTRPQIDQQSSHHQNTEDVEQGAEPQIDGEADLDADADADSDGSGENVKTAKLARTQSCKNQTGRDGSKITTVVATPGQGTDRVQEVSYTDTKVIGNGSFGVVFQAKLCDTGELVAIKKVLQDRRFKNRELQIMRKLEHCNIVKLLYFFYSSGEKRDEVFLNLVLEYIPETVYKVARQYAKTKQTIPINFIRLYMYQLFRSLAYIHSLGICHRDIKPQNLLLDPETAVLKLCDFGSAKQLLHGEPNVSYICSRYYRAPELIFGAINYTTKIDVWSAGCVLAELLLGQPIFPGDSGVDQLVEVIKVLGTPTREQIREMNPNYTEFKFPQIKSHPWQKVFRIRTPTEAINLVSLLLEYTPSARITPLKACAHPFFDELRMEGNHTLPNGRDMPPLFNFTEHELSIQPSLVPQLLPKHLQNASGPGGNRPSAGGAASLAASGSTSVSSTGSGASVEGSAQPQSQGTAAAAGSGSGGATAGTGGASAGGPGSGNNSSSGGASGAPSAVAAGGANAAVAGGAGGGGGAGAATAAATATGAIGATNAGGANVTDS


Human homologue of Complete Genome candidate


NP002084—glycogen synthase kinase 3 beta

(SEQ ID NO:256)   1ggagaaggaa ggaaaaggtg attcgcgaag agagtgatca tgtcagggcg gcccagaacc  61acctcctttg cggagagctg caagccggtg cagcagcctt cagcttttgg cagcatgaaa 121gttagcagag acaaggacgg cagcaaggtg acaacagtgg tggcaactcc tgggcagggt 181ccagacaggc cacaagaagt cagctataca gacactaaag tgattggaaa tggatcattt 241ggtgtggtat atcaagccaa actttgtgat tcaggagaac tggtcgccat caagaaagta 301ttgcaggaca agagatttaa gaatcgagag ctccagatca tgagaaagct agatcactgt 361aacatagtcc gattgcgtta tttcttctac tccagtggtg agaagaaaga tgaggtctat 421cttaatctgg tgctggacta tgttccggaa acagtataca gagttgccag acactatagt 481cgagccaaac agacgctccc tgtgatttat gtcaagttgt atatgtatca gctgttccga 541agtttagcct atatccattc ctttggaatc tgccatcggg atattaaacc gcagaacctc 601ttgttggatc ctgatactgc tgtattaaaa ctctgtgact ttggaagtgc aaagcagctg 661gtccgaggag aacccaatgt ttcgtatatc tgttctcggt actatagggc accagagttg 721atctttggag ccactgatta tacctctagt atagatgtat ggtctgctgg ctgtgtgttg 781gctgagctgt tactaggaca accaatattt ccaggggata gtggtgtgga tcagttggta 841gaaataatca aggtcctggg aactccaaca agggagcaaa tcagagaaat gaacccaaac 901tacacagaat ttaaattccc tcaaattaag gcacatcctt ggactaaggt cttccgaccc 961cgaactccac cggaggcaat tgcactgtgt agccgtctgc tggagtatac accaactgcc1021cgactaacac cactggaagc ttgtgcacat tcattttttg atgaattacg ggacccaaat1081gtcaaacatc caaatgggcg agacacacct gcactcttca acttcaccac tcaagaactg1141tcaagtaatc cacctctggc taccatcctt attcctcctc atgctcggat tcaagcagct1201gcttcaaccc ccacaaatgc cacagcagcg tcagatgcta atactggaga ccgtggacag1261accaataatg ctgcttctgc atcagcttcc aactccacct gaacagtccc gacgagccag1321ctgcacagga aaaaccacca gttacttgag tgtcactcag caacactggt cacgtttgga1381aagaatatt(SEQ ID NO:257)   1msgrprttsf aesckpvqqp safgsmkvsr dkdgskvttv vatpgqgpdr pqevsytdtk  61vigngsfgvv yqaklcdsge lvaikkvlqd krfknrelqi mrkldhcniv rlryffyssg 121ekkdevylnl vldyvpetvy rvarhysrak qtlpviyvkl ymyqlfrsla yihsfgichr 181dikpqnllld pdtavlklcd fgsakqlvrg epnvsyicsr yyrapelifg atdytssidv 241wsagcvlael llgqpifpgd sgvdqlveii kvlgtptreq iremnpnyte fkfpqikahp 301wtkvfrprtp 
peaialcsrl leytptarlt pleacahsff delrdpnvkh pngrdtpalf 361nfttqelssn pplatilipp hariqaaast ptnataasda ntgdrgqtnn aasasasnst 421


Putative function


Serine/threonine kinase involved in wingless signaling pathway


Example 28 (Category 3)

Dlg1 (CG1725) is detected as a candidate gene in a screen of a P-element insertion library covering the X chromosome of Drosophila melanogaster (Peter et al. 2001), through the mutant phenotype of fly line 342, as described above.


Mitotic defects are observed in brain squashes: high mitotic index, overcondensed chromosomes, lagging chromosomes and a high proportion of anaphases and telophases compared to normal brains.


Rescue and sequencing of genomic DNA flanking the P-element insertion site indicate that the P-element is inserted into the 5′ region of the Dlg1 gene (CG1725).
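Placing a rescued flanking sequence on an annotated genomic segment amounts to finding where the flank matches the segment on either strand. The following is a minimal sketch of that step, assuming exact matching; the function names `reverse_complement` and `locate_flank` and the toy sequences are hypothetical and are not taken from AE003486 or from the rescue data. In practice an alignment tool that tolerates sequencing errors would normally be used instead.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def locate_flank(segment, flank):
    """Find the first exact match of a flanking read in a genomic segment.

    Returns (position, strand), where position is the 1-based coordinate of
    the match start on the forward strand of the segment, or None if the
    flank matches neither strand. Hypothetical helper for illustration.
    """
    segment = segment.upper()
    flank = flank.upper()
    index = segment.find(flank)
    if index != -1:
        return index + 1, "+"
    index = segment.find(reverse_complement(flank))
    if index != -1:
        return index + 1, "-"
    return None


# Toy data only: a made-up 16-nucleotide "segment" and two made-up reads.
toy_segment = "ATGCGTACCTGAAGTC"
print(locate_flank(toy_segment, "TACCTG"))  # read matching the forward strand
print(locate_flank(toy_segment, "CAGGTA"))  # read matching the reverse strand
```

Comparing the reported coordinate with the annotated start of a candidate gene on the same segment indicates whether the insertion lies within or upstream of that gene.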


Line ID—342


Phenotype—Lethal phase pupal. Higher mitotic index, colchicine-like overcondensed chromosomes, many ana- and telophases, lagging chromosomes


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003486 (10B8-10)


P element insertion site—1128 and 3755


Annotated Drosophila genome Complete Genome candidate


CG1725—dlg, membrane-associated guanylate kinase homologs, role in cell junctions and proliferation (version 1)

(SEQ ID NO:258)CACAAACAACACGCTCGTGCGTGCGATTTAAATATATAGATGTTTCAAAAGTCAACCTCTCTGTTCGCAATTGTGTGCATTTTCGTTTGTCTAGTGCAAAAAGTTGGATAATCACAGGCGGCAAATAAAATAGTAACGAATCGAGTTCAAGAAGAAGAAGAAGAGAAGAGGAAGCAGAGGCAGCAGCGCCGGCATTTGTCCGTGTGTTGTTGTTGTTGTTTGTGCGCGGCTGTAACTTTAACCCTCGAACGCCATAAGATTAAAAAACCAAGTATAACAATAAGTTATAAAATCAATTAAACAAAAGCCGCTGCGATATGACAACGAGGAAAAAGAAGCGCGACGGCGGCGGCAGCGGCGGCGGATTCATCAAGAAAGTTTCGTCACTCTTCAATCTGGATTCGGTGAATGGCGATGATAGCTGGTTATACGAGGACATTCAGCTGGAGCGCGGCAACTCCGGATTGGGCTTTTCCATTGCCGGCGGTACGGATAATCCGCACATCGGCACCGACACCTCCATCTACATCACCAAGCTCATTTCCGGTGGAGCAGCTGCCGCCGATGGACGTCTGAGCATCAACGATATCATCGTATCGGTGAACGATGTGTCCGTGGTGGATGTGCCACATGCCTCCGCCGTGGATGCCCTCAAGAAGGCGGGCAATGTTGTTAAGCTGCATGTGAAGCGAAAACGTGGAACGGCCACCACCCCGGCAGCGGGATCGGCGGCAGGAGATGCTCGGGATAGTGCGGCCAGCGGACCGAAGGTCATCGAAATCGATCTGGTCAAGGGCGGCAAGGGACTGGGCTTCTCAATTGCCGGCGGCATTGGCAACCAGCACATCCCCGGCGACAATGGCATCTATGTGACCAAGTTGATGGACGGCGGAGCAGCGCAGGTGGACGGACGTCTCTCCATCGGAGATAAGCTGATTGCAGTGCGCACCAACGGGAGCGAGAAGAACCTGGAGAACGTAACGCACGAACTGGCGGTGGCCACGTTGAAATCGATCACCGACAAGGTGACGCTGATCAATGGAAAGACACAGCATCTGACCACCAGTGCGTCCGGCGGCGGAGGAGGAGGCCTTTCATCCGGACAACAATTGTCGCAGTCCCAATCGCAGTTGGCCACCAGCCAGAGCCAAAGTCAGGTGCATCAGCAGCAGCATGCGACGCCGATGGTCAATTCGCAGTCGACAGGTGCGCTAAATAGTATGGGACAGACGGTTGTCGATTCACCATCAATACCACAAGCAGCCGCAGCAGTAGCAGCAGCAGCAAATGCATCTGCATCTGCATCAGTCATTGCAAGCAACAACACAATCAGCAACACCACAGTCACCACAGTCACGGCCACGGCCACAGCCAGCAACAGTAGCAGCAAGTTGCCGCCGTCGCTTGGCGCTAACAGCAGCATTAGCATTAGCAATAGCAATAGCAATAGCTTCAGCTTTAATATCAACAACATTAATAGCATCAACAACAACAACAGTAGCAGCAGCAGCACGACGGCAACTGTTGCAGCAGCAACACCAACAGCAGCATCAGCAGCAGCAGCAGCAGCATCATCTCCACCCGCCAACTCCTTCTATAACAATGCTTCCATGCCCGCCCTGCCTGTCGAATCCAATCAAACAAACAACCGATCCCAATCACCCCAGCCGCGCCAGCCCGGGTCGCGATACGCCTCTACAAATGTCCTAGCCGCCGTTCCACCAGGAACTCCACGCGCTGTCAGCACCGAGGATATAACCAGAGAACCGCGCACCATCACCATCCAGAAGGGACCGCAGGGCCTGGGCTTCAATATCGTTGGCGGCGAGGATGGCCAGGGTATCTATGTGTCCTTCATCCTGGCCGGCGGCCCAGCGGATCTCGGGTCGGAGTTGAAGCGTGGCGACCAGCTGCTCAGCGTGAACAATGTCAATCTCACGCACGCCACCCACGAAGAGGCAGCCCAGGCGCTCAAGACTTCTGGC
GGTGTGGTGACCCTGTTGGCGCAGTACCGCCCAGAGGAGTACAATCGCTTCGAGGCACGCATTCAAGAGTTGAAACAACAGGCTGCCCTCGGTGCCGGCGGATCGGGAACGCTGCTGCGCACCACGCAAAAGCGATCGCTGTATGTGCGCGCCCTGTTTGACTACGATCCGAATCGGGATGATGGATTGCCCTCGCGAGGATTGCCCTTTAAGCACGGCGATATCCTGCACGTGACCAATGCCTCCGACGATGAATGGTGGCAGGCACGACGAGTTCTCGGCGACAACGAGGACGAGCAAATCGGTATTGTACCATCGAAAAGGCGTTGGGAGCGCAAAATGCGAGCTAGGGACCGCAGCGTTAAGTTCCAGGGACATGCGGCAGCTAATAATAATCTGGATAAGCAATCGACATTGGATCGAAAGAAAAAGAATTTCACATTCTCGCGCAAATTTCCGTTTATGAAGAGTCGCGATGAGAAGAATGAAGATGGCAGCGACCAAGAGCCCAATGGAGTTGTGAGCAGCACCAGCGAGATTGACATCAATAATGTCAACAACAACCAGTCAAATGAACCGCAACCTTCCGAGGAGAACGTGTTGTCCTACGAGGCCGTACAGCGTTTGTCCATCAACTACACGCGCCCGGTGATTATTCTGGGACCCCTGAAGGATCGCATCAACGATGACCTTATATCAGAGTATCCCGACAAGTTCGGCTCTTGTGTGCCACACACCACCCGACCCAAGCGAGAGTACGAGGTGGATGGTAGGGACTACCACTTTGTATCCTCTCGCGAGCAAATGGAACGGGATATTCAGAATCATCTGTTCATCGAGGCGGGACAGTATAACGACAATCTGTACGGCACATCGGTGGCCAGCGTGCGCGAAGTGGCCGAGAAGGGTAAACACTGCATCCTGGACGTGTCCGGGAACGCCATCAAGCGACTCCAAGTTGCCCAGCTGTATCCCGTCGCCGTGTTCATCAAGCCCAAGTCGGTGGATTCAGTGATGGAAATGAATCGTCGCATGACGGAGGAGCAGGCCAAGAAGACTTACGAGCGGGCGATTAAAATGGAGCAAGAATTCGGCGAATACTTTACGGGCGTTGTCCAAGGCGATACCATCGAGGAGATTTACAGCAAAGTGAAATCGATGATTTGGTCCCAGTCGGGACCAACCATTTGGGTACCTTCCAAGGAATCTCTATGACCAACAGCCACCACAACTTGGACACTGCCGCCTCGAGTTCGATGTCGACCAGTCTCGAGAACAACAATAGGAGCAACAGCAGCAGCAACAAATCAGCAGCCGCAGCAGAAGACGCCGCACTGATGATGCATCACAGTAACAACAGATACTTTTACTTCTACTTCAACAACAAGAACAACAACAACAACAGCAACCACAGCAGCAGCCACAGCGACAACAACAAAAACAACAACACTGACAACGACAGGAAACGG(SEQ ID 
NO:259)MTTRKKKRDGGGSGGGFIKKVSSLFNLDSVNGDDSWLYEDIQLERGNSGLGFSIAGGTDNPHIGTDTSIYITKLISGGAAAADGRLSINDIIVSVNDVSVVDVPHASAVDALKKAGNVVKLHVKRKRGTATTPAAGSAAGDARDSAASGPKVIEIDLVKGGKGLGFSIAGGIGNQHIPGDNGIYVTKLTDGGRAQVDGRLSIGDKLIAVRTNGSEKNLENVTHELAVATLKSITDKVTLIIGKTQHLTTSASGGGGGGLSSGQQLSQSQSQLATSQSQSQVHQQQHATPMVNSQSTGALNSMGQTVVDSPSIPQAAAAVAAAANASASASVIASNNTISNTTVTTVTATATASNDSSKLPPSLGANSSISISNSNSNSNSNNINNINSINNNNSSSSSTTATVAAATPTAASAAAAAASSPPANSFYNNASMPALPVESNQTNNRSQSPQPRQPGSRYASTNVLAAVPPGTPRAVSTEDITREPRTITIQKGPQGLGFNIVGGEDGQGIYVSFILAGGPADLGSELKRGDQLLSVNNVNLTHATHEEAAQALKTSGGVVTLLAQYRPEEYNRFEARIQELKQQAALGAGGSGTLLRTTQKRSLYVRALFDYDPNRDDGLPSRGLPFKHGDILHVTNASDDEWWQARRVLGDNEDEQIGIVPSKRRWERKMRARDRSVKFQGHAAANNNLDKQSTLDRKKKNFTFSRKFPFMKSRDEKNEDGSDQEPNGVVSSTSEIDINNVNNNQSNEPQPSEENVLSYEAVQRLSINYTRPVIILGPLKDRINDDLISEYPDKFGSCVPHTTRPKREYEVDGRDYHFVSSREQMERDIQNHLFIEAGQYNDNLYGTSVASVREVAEKGKHCILDVSGNAIKRLQVAQLYPVAVFIKPKSVDSVMEMNRRMTEEQAKKTYERAIKMEQEFGEYFTGVVQGDTTEEIYSKVKSMIWSQSGPTTWVPSKESL


CG1725—dlg, membrane-associated guanylate kinase homologs, role in cell junctions and proliferation, genbank accession number M73529 (version 2)

(SEQ ID NO:260)1cccccccccc cccagttggg tgtgttgttt tcgtcgcgtt cggttgctcg ctttattttt61ttgtttgttt attttgtttt gtgcaatgga aatgtgaaca caaatgtttc aaaagtcaac121ctctctgttc gcaattgtgt gcattttcgt ttgtctagtg caaaaagttg gataacacag181gcggcaaata aaatagtaac gaatcgagtt caagaagaag aagaagagaa gaggaagcag241aggcagcagc gccggcattt gtccgtgtgt tgttgttgtt gtttgtgcgc ggctgtaact301ttaaccctcg aacgccataa gattaaaaaa ccaactataa caataagtta taaaatcaat361taaacaaaag ccgctgcgat atgacaacga ggaaaaagaa gcgcgacggc ggcggcagcg421gcggcggatt catcaagaaa gtttcgtcac tcttcaatct ggattcggtg aatggcgatg481atagctggtt atacgaggac attcagctgg agcgcggcaa ctccggattg ggcttttcca541ttgccggcgg tacggataat ccgcacatcg gcaccgacac ctccatctac atcaccaagc601tcatttccgg tggagcagct gccgccgatg gacgtctgag catcaacgat atcatcgtat661cggtgaacga tgtgtccgtg gtggatgtgc cacatgcctc cgccgtggat gccctcaaga721aggcgggcaa tgttgttaag ctgcatgtga agcgaaaacg tggaacggcc accaccccgg781cagcgggatc ggcggcagga gatgctcggg atagtgcggc cagcggaccg aaggtcatcg841aaatcgatct ggtcaagggc ggcaagggac tgggcttctc aattgccggc ggcattggca901accagcacat ccccggcgac aatggcatct atgtgaccaa gttgacggac ggcggacgag961cgcaggtgga cggacgtctc tccatcggag ataagctgat tgcagtgcgc accaacggga1021gcgagaagaa cctggagaac gtaacgcacg aactggcggt ggccacgttg aaatcgatca1081ccgacaaggt gacgctgatc attggaaaga cacagcatct gaccaccagt gcgtccggcg1141gcggaggagg aggcctttca tccggacaac aattgtcgca gtcccaatcg cagttggcca1201ccagccagag ccaaagtcag gtgcatcagc agcagcatgc gacgccgatg gtcaattcgc1261agtcgacagg tgcgctaaat agtatgggac agacggttgt cgattcacca tcaataccac1321aagcagccgc agcagtagca gcagcagcaa atgcatctgc atctgcatca gtcattgcaa1381gcaacaacac aatcagcaac accacagtca ccacagtcac ggccacggcc acagccagca1441acgatagcag caagttgccg ccgtcgcttg gcgctaacag cagcattagc attagcaata1501gcaatagcaa tagcaacagc aataatatca acaacattaa tagcatcaac aacaacaaca1561gtagcagcag cagcacgacg gcaactgttg cagcagcaac accaacagca gcatcagcag1621cagcagcagc agcatcatct ccacccgcca actccttcta taacaatgct tccatgcccg1681ccctgcctgt cgaatccaat caaacaaaca accgatccca atcaccccag 
ccgcgccagc1741ccgggtcgcg atacgcctct acaaatgtcc tagccgccgt tccaccagga actccacgcg1801ctgtcagcac cgaggatata accagagaac cacgcaccat caccatccag aagggaccgc1861agggcctggg cttcaatatc gttggcggcg aggatggcca gggtatctat gtgtccttca1921tcctggccgg cggcccagcg gatctcgggt cggagttgaa gcgtggcgac cagctgctca1981gcgtgaacaa tgtcaatctc acgcacgcca cccacgaaga ggcagcccag gcgctcaaga2041cttctggcgg tgtggtgacc ctgttggcgc agtaccgccc agaggagtac aatcgcttcg2101aggcacgcat tcaagagttg aaacaacagg ctgccctcgg tgccggcgga tcgggaacgc2161tgctgcgcac cacgcaaaag cgatcgctgt atgtgcgcgc cctgtttgac tacgatccga2221atcgggatga tggattgccc tcgcgaggat tgccctttaa gcacggcgat atcctgcacg2281tgaccaatgc ctccgacgat gaatggtggc aggcacgacg agttctcggc gacaacgagg2341acgagcaaat cggtattgta ccatcgaaaa ggcgttggga gcgcaaaatg cgagctaggg2401accgcagcgt taagttccag ggacatgcgg cagctaataa taatctggat aagcaatcga2461cattggatcg aaagaaaaag aatttcacat tctcgcgcaa atttccgttt atgaagagtc2521gcgatgagaa gaatgaagat ggcagcgacc aagagcccaa tggagttgtg agcagcacca2581gcgagattga catcaataat gtcaacaaca accagtcaaa tgaaccgcaa ccttccgagg2641agaacgtgtt gtcctacgag gccgtacagc gtttgtccat caactacacg cgcccggtga2701ttattctggg acccctgaag gatcgcatca acgatgacct tatatcagag tatcccgaca2761agttcggctc ctgtgtgcca cacaccaccc gacccaagcg agagtacgag gtggatggta2821gggactacca ctttgtatcc tctcgcgagc aaatggaacg ggatattcag aatcatctgt2881tcatcgaggc gggacagtat aacgacaatc tgtacggcac atcggtggcc agcgtgcgcg2941aagtggccga gaagggtaaa cactgcatcc tggacgtgtc cgggaacgcc atcaagcgac3001tccaagttgc ccagctgtat cccgtcgccg tgttcatcaa gcccaagtcg gtggattcag3061tgatggaaat gaatcgtcgc atgacggagg agcaggccaa gaagacttac gagcgggcga3121ttaaaatgga gcaagaattc ggcgaatact ttacgggcgt tgtccagggc gataccatcg3181aggagatcta cagcaaagtg aaatcgatga tttggtccca gtcgggacca accatttggg3241taccttccaa ggaatctcta tga(SEQ ID 
NO:261)MTTRKKKRDGGGSGGGFIKKVSSLFNLDSVNGDDSWLYEDIQLERGNSGLGFSIAGGTDNPHIGTDTSIYITKLISGGAAAADGRLSINDIIVSVNDVSVVDVPHASAVDALKKAGNVVKLHVKRKRGTATTPAAGSAAGDARDSAASGPKVIEIDLVKGGKGLGFSIAGGIGNQHIPGDNGIYVTKLTDGGRAQVDGRLSIGDKLIAVRTNGSEKNLENVTHELAVATLKSITDKVTLIIGKTQHLTTSASGGGGGGLSSGQQLSQSQSQLATSQSQSQVHQQQHATPMVNSQSTGALNSMGQTVVDSPSIPQAAAAVAAAANASASASVIASNNTISNTTVTTVTATATASNDSSKLPPSLGANSSISISNSNSNSNSNNINNINSINNNNSSSSSTTATVAAATPTAASAAAAAASSPPANSFYNNASMPALPVESNQTNNRSQSPQPRQPGSRYASTNVLAAVPPGTPRAVSTEDITREPRTITIQKGPQGLGFNIVGGEDGQGIYVSFILAGGPADLGSELKRGDQLLSVNNVNLTHATHEEAAQALKTSGGVVTLLAQYRPEEYNRFEARIQELKQQAALGAGGSGTLLRTTQKRSLYVRALFDYDPNRDDGLPSRGLPFKHGDILHVTNASDDEWWQARRVLGDNEDEQIGIVPSKRRWERKMRARDRSVKFQGHAAANNNLDKQSTLDRKKKNFTFSRKFPFMKSRDEKNEDGSDQEPNGVVSSTSEIDINNVNNNQSNEPQPSEENVLSYEAVQRLSINYTRPVIILGPLKDRINDDLISEYPDKFGSCVPHTTRPKREYEVDGRDYHFVSSREQMERDIQNHLFIEAGQYNDNLYGTSVASVREVAEKGKHCILDVSGNAIKRLQVAQLYPVAVFIKPKSVDSVMEMNRRMTEEQAKKTYERAIKMEQEFGEYFTGVVQGDTIEEIYSKVKSMIWSQSGPTIWVPSKESL


Human homologue of Complete Genome candidate


XP012060—discs, large (Drosophila) homolog 2, channel-associated protein of synapses-110′ (version 1)

(SEQ ID NO:262)1gggaattctg gcctgggatt cagtattgct ggggggacag ataatcccca cattggagat61gaccctggca tatttattac gaagattata ccaggaggtg ctgcagcaga ggatggcaga121ctcagggtca atgattgtat cttgcgggtg aatgaggttg atgtgtcaga ggtttcccac181agtaaagcgg tggaagccct gaaggaagca gggtctatcg ttcggctgta tgtgcgtaga241agacgaccta ttttggagac cgttgtggaa atcaaactgt tcaaaggccc taaaggttta301ggcttcagta ttgcaggagg tgtggggaac caacacattc ctggagacaa cagcatttat361gtaactaaaa ttatagatgg aggagctgca caaaaagatg gaaggttgca agtaggagat421agactactaa tggtaaacaa ctacagttta gaagaagtaa cacacgaaga ggcagtagca481atattaaaga acacatcaga ggtagtttat ttaaaagttg gcaaacccac taccatttat541atgactgatc cttatggtcc acctgatatt actcactctt attctccacc aatggaaaac601catctactct ctggcaacaa tggcacttta gaatataaaa cctccctgcc acccatctct661ccaggaaggt actcaccaat tccaaagcac atgcttgttg acgacgacta caccaggcct721ccggaacctg tttacagcac tgtgaacaaa ctatgtgata agcctgcttc tcccaggcac781tattcccctg ttgagtgtga caaaagcttc ctcctctcag ctccctattc ccactaccac841ctaggcctgc tacctgactc tgagatgacc agtcattccc aacatagcac cgcaactcgt901cagccttcaa tgactctcca acgggccgtc tccctggaag gagagcctcg caaggtagtc961ctgcacaaag gctccactgg cctgggcttc aacattgtcg gtggggaaga tggagaaggt1021atttttgtgt ccttcattct ggctggtgga ccagcagacc taagtgggga gctccagaga1081ggagaccaga tcctatcggt gaatggcatt gacctccgtg gtgcatccca cgagcaggca1141gctgctgcac taaagggggc tggacagaca gtgacgatta tagcacaata tcaacctgaa1201gattacgctc gatttgaggc caaaatccat gacctacgag agcagatgat gaaccacagc1261atgagctccg ggtccggatc cctgcgaacc aatcagaaac gctccctcta cgtcagagcc1321atgttcgact acgacaagag caaggacagt gggctgccaa gtcaaggact tagttttaaa1381tatggagata ttctccacgt tatcaatgcc tctgatgatg agtggtggca agccaggaga1441gtcatgctgg agggagacag tgaggagatg ggggtcatcc ccagcaaaag gagggtggaa1501agaaaggaac gtgcccgatt gaagacagtg aagtttaatg ccaaacctgg agtgattgat1561tcgaaagggt cattcaatga caagcgtaaa aagagcttca tcttttcacg aaaattccca1621ttctacaaga acaaggagca gagtgagcag gaaaccagtg atcctgaacg tggacaagaa1681gacctcattc tttcctatga gcctgttaca aggcaggaaa taaactacac 
ccggccggtg1741attatcctgg ggcccatgaa ggatcggatc aatgacgact tgatatctga attccctgat1801aaatttggct cctgtgtgcc tcatactacg aggccaaagc gagactacga ggtggatggc1861agagactatc actttgtcat ttccagagaa caaatggaga aagatatcca agagcacaag1921tttatagaag ccggccagta caatgacaat ttatatggaa ccagtgtgca gtctgtgaga1981tttgtagcag aaagaggcaa acactgtata cttgatgtat caggaaatgc tatcaagcgg2041ttacaagttg cccagctcta tcccattgcc atcttcataa aacccaggtc tctggaacct2101cttatggaga tgaataagcg tctaacagag gaacaagcca agaaaaccta tgatcgagca2161attaagctag aacaagaatt tggagaatat tttacagcta ttgtccaagg agatacttta2221gaagatatat ataaccaatg caagcttgtt attgaagagc aatctgggcc tttcatctgg2281attccctcaa aggaaaagtt ataaattagc tactgcgcct ctgacaacga cagaagagca2341tttagaagaa caaaatatat ataacatact acttggaggc ttttatgttt ttgttgcatt2401tatgtttttg cagtcaatgt gaattcttac gaatgtacaa cacaaactgt atgaagccat2461gaaggaaaca gaggggccaa agggtg(SEQ ID NO:263)1mvnnysleev theeavailk ntsevvylkv gkpttiymtd pygppdiths ysppmenhll61sgnngtleyk tslppispgr yspipkhmlv dddytrppep vystvnklcd kpasprhysp121vecdksflls apyshyhlgl lpdsemtshs qhstatrqps mtlqravsle geprkvvlhk181gstglgfniv ggedgegifv sfilaggpad lsgelqrgdq ilsvngidlr gasheqaaaa241lkgagqtvti iaqyqpedya rfeakihdlr eqmmnhsmss gsgslrtnqk rslyvramfd301ydkskdsglp sqglsfkygd ilhvinasdd ewwqarrvml egdseemgvi pskrrverke361rarlktvkfn akpgvidskg sfndkrkksf ifsrkfpfyk nkeqseqets dpergqedli421lsyepvtrqe inytrpviil gpmkdrindd lisefpdkfg scvphttrpk rdyevdgrdy481hfvisreqme kdiqehkfie agqyndnlyg tsvqsvrfva ergkhcildv sgnaikrlqv541aqlypiaifi kprsleplme mnkrlteeqa kktydraikl eqefgeyfta ivqgdtledi601ynqcklviee qsgpfiwips kekl


DLG2: discs, large homolog 2, chapsyn-110 channel-associated protein of synapses-110′ genbank accession number U32376 (version 2)

(SEQ ID NO:264)1aaaagcaact gaggtcttaa ctttcagacg ctgaattctc atctaattga aattactggg61cataatgcta tatatagcca atgaagagat tttgagctct cactcagtgc cttcaagaca121tgtcgttttg tagtcagaga aaacagagat caatgcattt tcaaactgac agagggaacg181gatgctcttt agtagcacat gcccaggatc gtgtgtgtgg ggcttgcgct gtgctgagaa241gctgaatacc ggtccatatg ctccttattt actgcaatgt tctttgcatg ttactgtgca301ctccggacta acgtgaagaa gtatcgatat caagatgagg acgctccaca tgatcattcc361ttacctcgac taacccacga agtaagaggc ccagaactcg tgcatgtatc agaaaagaac421ctctctcaaa tagaaaatgt ccatggatat gtcctgcagt ctcatatttc tcctctgaag481gccagtcctg ctcctataat tgtcaacaca gatactttgg acacaattcc ttatgtcaat541gggacagaaa ttgaatatga atttgaagaa attacactgg agagggggaa ttctggcctg601ggattcagta ttgctggggg gacagataat ccccacattg gagatgaccc tggcatattt661attacgaaga ttataccagg aggtgctgca gcagaggatg gcagactcag ggtcaatgat721tgtatcttgc gggtgaatga ggttgatgtg tcagaggttt cccacagtaa agcggtggaa781gccctgaagg aagcagggtc tatcgctcgg ctgtatgtgc gtagaagacg acctattttg841gagaccgttg tggaaatcaa actgttcaaa ggccctaaag gtttaggctt cagtattgca901ggaggtgtgg ggaaccaaca cattcctgga gacaacagca tttatgtaac taaaattata961gatggaggag ctgcacaaaa agatggaagg ttgcaagtag gagatagact actaatggta1021aacaactaca gtttagaaga agtaacacac gaagaggcag tagcaatatt aaagaacaca1081tcagaggtag tttatttaaa agttggcaac cccactacca tttatatgac tgatccttat1141ggtccacctg atattactca ctcttattct ccaccaatgg aaaaccatct actctctggc1201aacaatggca ctttagaata taaaacctcc ctgccaccca tctctccagg gaggtactca1261ccaattccaa agcacatgct tgttgacgac gactacacca ggcctccgga acctgtttac1321agcactgtga acaaactatg tgataagcct gcttctccca ggcactattc ccctgttgag1381tgtgacaaaa gcttcctcct ctcagctccc tattcccact accacctagg cctgctacct1441gactctgaga tgaccagtca ttcccaacat agcaccgcaa ctcgtcagcc ttcaatgact1501ctccaacggg ccgtctccct ggaaggagag cctcgcaagg tagtcctgca caaaggctcc1561actggcctgg gcttcaacat tgtcggtggg gaagatggag aaggtatttt tgtgtccttc1621attctggctg gtggaccagc agacctaagt ggggagctcc agagaggaga ccagatccta1681tcggtgaatg gcattgacct ccgtggtgca tcccacgagc aggcagctgc 
tgcactaaag1741ggggctggac agacagtgac gattatagca caatatcaac ctgaagatta cgctcgattt1801gaggccaaaa tccatgacct acgagagcag atgatgaacc acagcatgag ctccgggtcc1861ggatccctgc gaaccaatca gaaacgctcc ctctacgtca gagccatgtt cgactacgac1921aagagcaagg acagtgggct gccaagtcaa ggacttagtt ttaaatatgg agatattctc1981cacgttatca atgcctctga tgatgagtgg tggcaagcca ggagagtcat gctggaggga2041gacagtgagg agatgggggt catccccagc aaaaggaggg tggaaagaaa ggaacgtgcc2101cgattgaaga cagtgaagtt taatgccaaa cctggagtga ttgattcgaa agggtcattc2161aatgacaagc gtaaaaagag cttcatcttt tcacgaaaat tcccattcta caagaacaag2221gagcagagtg agcaggaaac cagtgatcct gaacgtggac aagaagacct cattctttcc2281tatgagcctg ttacaaggca ggaaataaac tacacccggc cggtgattat cctggggccc2341atgaaggatc ggatcaatga cgacttgata tctgaattcc ctgataaatt tggctcctgt2401gtgcctcata ctacgaggcc aaagcgagac tacgaggtgg atggcagaga ctatcacttt2461gtcatttcca gagaacaaat ggagaaagat atccaagagc acaagtttat agaagccggc2521cagtacaatg acaatttata tggaaccagt gtgcagtctg tgagatttgt agcagaaaga2581ggcaaacact gtatacttga tgtatcagga aatgctatca agcggttaca agttgcccag2641ctctatccca ttgccatctt cataaaaccc aggtctctgg aatctcttat ggagatgaat2701aagcgtctaa cagaggaaca agccaagaaa acctatgatc gagcaattaa gctagaacaa2761gaatttggag aatattttac agctattgtc caaggagata ctttagaaga tatatataac2821caatgcaagc ttgttattga agagcaatct gggcctttca tctggattcc ctcaaaggaa2881aagttataaa ttagctactg cgcctctgac aacgacagaa gagcatttag aagaacaaaa2941tatatataac atactacttg gaggctttta tgtttttgtt gcatttatgt ttttgcagtc3001aatgtgaatt cttacgaatg tacaacacaa actgtatgaa gccatgaagg aaacagaggg3061gccaaagggt g(SEQ ID 
NO:265)FFACYCALRTNVKKYRYQDEDAPHDHSLPRLTHEVRGPELVHVEKNLSQIENVHGYVLQSHISPLKASPAPIIVNTDTLDTIPYVNGTEIEYEFEEITLEGNSGLGFSIAGGTDNPHIGDDPGIFITKIIPGGAAAEDGRLRVNDCILRVNEVDVSESHSKAVEALKEAGSIARLYVRRRRPILETVVEIKLFKGPKGLGFSIAGGVGNQHIPGNSIYVTKIIDGGAAQKDGRLQVGDRLLMVNNYSLEEVTHEEAVAILKNTSEVVYLKVNPTTIYMTDPYGPPDITHSYSPPMENHLLSGNNGTLEYKTSLPPISPGRYSPIPKHMVDDDYTRPPEPVYSTVNKLCDKPASPRHYSPVECDKSFLLSAPYSHYHLGLLPDSEMSHSQHSTATRQPSMTLQRAVSLEGEPRKVVLHKGSTGLGFNIVGGEDGEGIFVSFILGGPADLSGELQRGDQILSVNGIDLRGASHEQAAAALKGAGQTVTIIAQYQPEDYARFAKIHDLREQMMNHSMSSGSGSLRTNQKRSLYVRAMFDYDKSKDSGLPSQGLSFKYGDLHVINASDDEWWQARRVMLEGDSEEMGVIPSKRRVERKERARLKTVKFNAKPGVIDSGSFNDKRKKSFIFSRKFPFYKNKEQSEQETSDPERGQEDLILSYEPVTRQEINYTRPIILGPMKDRINDDLISEFPDKFGSCVPHTTRPKRDYEVDGRDYHFVISREQMEKDIQHKFIEAGQYNDNLYGTSVQSVRFVAERGKHCILDVSGNAIKRLQVAQLYPIAIFIKPSLESLMEMNKRLTEEQAKKTYDRAIKLEQEFGEYFTAIVQGDTLEDIYNQCKLVIEESGPFIWIPSKEKL


DLG1: discs, large (Drosophila) homolog 1, genbank accession number U13896

(SEQ ID NO:266)1gttggaaacg gcactgctga gtgaggttga ggggtgtctc ggtatgtgcg ccttggatct61ggtgtaggcg aggtcacgcc tctcttcaga cagcccgagc cttcccggcc tggcgcgttt121agttcggaac tgcgggacgc cggtgggcta gggcaaggtg tgtgccctct tcctgattct181ggagaaaaat gccggtccgg aagcaagata cccagagagc attgcacctt ttggaggaat241atcgttcaaa actaagccaa actgaagaca gacagctcag aagttccata gaacgggtta301ttaacatatt tcagagcaac ctctttcagg ctttaataga tattcaagaa ttttatgaag361tgaccttact ggataatcca aaatgtatag atcgttcaaa gccgtctgaa ccaattcaac421ctgtgaatac ttgggagatt tccagccttc caagctctac tgtgacttca gagacactgc481caagcagcct tagccctagt gtagagaaat acaggtatca ggatqaagat acacctcctc541aagagcatat ttccccacaa atcacaaatg aagtgatagg tccagaattg gttcatgtct601cagagaagaa cttatcagag attgagaatg tccatggatt tgtttctcat tctcatattt661caccaataaa gccaacagaa gctgttcttc cctctcctcc cactgtccct gtgatccctg721tcctgccagt ccctgctgag aatactgtca tcctacccac cataccacag gcaaatcctc781ccccagtact ggtcaacaca gatagcttgg aaacaccaac ttacgttaat ggcacagatg841cagattatga atatgaagaa atcacacttg aaaggggaaa ttcagggctt ggtttcagca901ttgcaggagg tacggacaac ccacacattg gagatgactc aagtattttc attaccaaaa961ttatcacagg gggagcagcc gcccaagatg gaagattgcg ggtcaatgac tgtatattac1021aagtaaatga agtagatgtt cgtgatgtaa cacatagcaa agcagttgaa gcgttgaaag1081aagcagggtc tattgtacgc ttgtatgtaa aaagaaggaa accagtgtca gaaaaaataa1141tggaaataaa gctcattaaa ggtcctaaag gtcttgggtt tagcattgct ggaggtgttg1201gaaatcagca tattcctggg gataatagca tctatgtaac caaaataatt gaaggaggtg1261cagcacataa ggatggcaaa cttcagattg gagataaact tttagcagtg aataacgtat1321gtttagaaga agttactcat gaagaagcag taactgcctt aaagaacaca tctgattttg1381tttatttgaa agtggcaaaa cccacaagta tgtatatgaa tgatggctat gcaccacctg1441atatcaccaa ctcttcttct cagcctgttg ataaccatgt tagcccatct tccttcttgg1501gccagacacc agcatctcca gccagatact ccccagtttc taaagcagta cttggagatg1561atgaaattac aagggaacct agaaaagttg ttcttcatcg tggctcaacg ggccttggtt1621tcaacattgt aggaggagaa gatggagaag gaatatttat ttcctttatc ttagccggag1681gacctgctga tctaagtgga gagctcagaa aaggagatcg tattatatcg 
gtaaacagtg1741ttgacctcag agctgctagt catgagcagg cagcagctgc attgaaaaat gctggccagg1801ctgtcacaat tgttgcacaa tatcgacctg aagaatacag tcgttttgaa gctaaaatac1861atgatttacg ggagcagatg atgaatagta gtattagttc agggtcaggt tctcttcgaa1921ctagccagaa gcgatccctc tatgtcagag ccctttttga ttatgacaag actaaagaca1981gtgggcttcc cagtcaggga ctgaacttca aatttggaga tatcctccat gttattaatg2041cttctgatga tgaatggtgg caagccaggc aggttacacc agatggtgag agcgatgagg2101tcggagtgat tcccagtaaa cgcagagttg agaagaaaga acgagcccga ttaaaaacag2161tgaaattcaa ttctaaaacg agagataaag ggcagtcatt caatgacaag cgtaaaaaga2221acctcttttc ccgaaaattc cccttctaca agaacaagga ccagagtgag caggaaacaa2281gtgatgctga ccagcatgta acttctaatg ccagcgatag tgaaagtagt taccgtggtc2341aagaagaata cgtcttatct tatgaaccag tgaatcaaca agaagttaat tatactcgac2401cagtgatcat attgggacct atgaaagaca ggataaatga tgacttgatc tcagaatttc2461ctgacaaatt tggatcctgt gttcctcata caactagacc aaaacgagat tatgaggtag2521atggaagaga ttatcatttt gtgacttcaa gagagcagat ggaaaaagat atccaggaac2581ataaattcat tgaagctggc cagtataaca atcatctata tggaacaagt gttcagtctg2641tacgagaagt agcaggaaag ggcaaacact gtatccttga tgtgtctgga aatgccataa2701agagattaca gattgcacag ctttacccta tctccatttt tattaaaccc aaatccatgg2761aaaatatcat ggaaatgaat aagcgtctaa cagaagaaca agccagaaaa acatttgaga2821gagccatgaa actggaacag gagtttactg aacatttcac agctattgta cagggggata2881cgctggaaga catttacaac caagtgaaac agatcataga agaacaatct ggttcttaca2941tctgggttcc ggcaaaagaa aagctatgaa aactcatgtt tctctgtttc tcttttccac3001aattccattt tctttggcat ctctttgccc tttcctctgg aaaaaa(SEQ ID 
NO:267)MPVRKQDTQRALHLLEEYRSKLSQTEDRQLRSSIERVINIFQSNLFQALIDIQEFYEVTLLDNPKCIDRSKPSEPIQPVNTWEISSLPSSTVTSETLPSSLSPSVEKYRYQDEDTPPQEHISPQITNEVIGPELVHVSEKNLSEIENVHGFVSHSHISPIKPTEAVLPSPPTVPVIPVLPVPAENTVILPTIPQANPPPVLVNTDSLETPTYVNGTDADYEYEEITLERGNSGLGFSIAGGTDNPHIGDDSSIFITKIITGGAAAQDGRLRVNDCILQVNEVDVRDVTHSKAVEALKEAGSIVRLYVKRRKPVSEKIMEIKLIKGPKGLGFSIAGGVGNQHIPGDNSIYVTKIIEGGAAHKDGKLQIGDKLLAVNNVCLEEVTHEEAVTALKNTSDFVYLKVAKPTSMYMNDGYAPPDITNSSSQPVDNHVSPSSFLGQTPASPARYSPVSKAVLGDDEITREPRKVVLHRGSTGLGFNIVGGEDGEGIFISFILAGGPADLSGELRKGDRIISVNSVDLRAASHEQAAAALKNAGQAVTIVAQYRPEEYSRFEAKIHDLREQMMNSSISSGSGSLRTSQKRSLYVRALFDYDKTKDSGLPSQGLNFKFGDILHVINASDDEWWQARQVTPDGESDEVGVIPSKRRVEKKERARLKTVKFNSKTRDKGQSFNDKRKKNLFSRKFPFYKNKDQSEQETSDADQHVTSNASDSESSYRGQEEYVLSYEPVNQQEVNYTRPVIILGPMKDRINDDLISEFPDKFGSCVPHTTRPKRDYEVDGRDYHFVTSREQMEKDIQEHKFIEAGQYNNHLYGTSVQSVREVAGKGKHCILDVSGNAIKRLQIAQLYPISIFIKPKSMENIMEMNKRLTEEQARKTFERAMKLEQEFTEHFTAIVQGDTLEDIYNQVKQIIEEQSGSYIWVPAKEKL


Putative function


Component of cell junctions, possible role in proliferation


Example 28A
Validation of Dlg1 Function by RNA Interference (RNAi) Knockdown in Drosophila Cultured Cells

To confirm the mitotic role of the target protein, knockdown of Dlg1 expression is performed in cultured Drosophila Dmel-2 cells using a double stranded RNA (dsRNA) from within the Dlg1 (CG1725) gene corresponding to the following sequence:

(SEQ ID NO:268)GGAGGCCTTTCATCCGGACAACAATTGTCGCAGTCCCAATCGCAGTTGGCCACCAGCCAGAGCCAAAGTCAGGTGCATCAGCAGCAGCATGCGACGCCGATGGTCAATTCGCAGTCGACAGGTGCGCTAAATAGTATGGGACAGACGGTTGTCGATTCACCATCAATACCACAAGCAGCCGCAGCAGTAGCAGCAGCAGCAAATGCATCTGCATCTGCATCAGTCATTGCAAGCAACAACACAATCAGCAACACCACAGTCACCACAGTCACGGCCACGGCCACAGCCAGCAACAGTAGCAGCAAGTTGCCGCCGTCGCTTGGCGCTAACAGCAGCATTAGCATTAGCAATAGCAATAGCAATAGCAACAGCAATAATATCAACAACATTAATAGCATCAACAACAACAACAGTAGCAGCAGCAGCACGACGGCAACTGTTGCAGCAGCAACACCAACAGCAGCATCAGCAGCAGCAGCAGCAGCATCATCTCCACCCGCCAACTCCTTCTATAA


dsRNA is prepared by annealing complementary RNAs made by in vitro transcription from a PCR fragment created with the following PCR primers:

(SEQ ID NO:269)TAATACGACTCACTATAGGGAGAGGAGGCCTTTCATCCGGACAACAAT(SEQ ID NO:270)TAATACGACTCACTATAGGGAGATTATAGAAGGAGTTGGCGGGTGGAG
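The relationship between the two primers and the dsRNA target sequence can be checked with a short script. The following is a minimal sketch, assuming that the 23-nucleotide prefix shared by both primers (which contains the T7 RNA polymerase promoter sequence) is a tail added for in vitro transcription and that the remainder of each primer is gene-specific. The function and variable names are illustrative and are not part of the protocol. Only the first and last nucleotides of SEQ ID NO:268 are used.

```python
# Prefix shared by both primers (assumed to be the in vitro transcription tail).
T7_TAIL = "TAATACGACTCACTATAGGGAGA"

# Primers as given in SEQ ID NO:269 and SEQ ID NO:270.
FORWARD_PRIMER = "TAATACGACTCACTATAGGGAGAGGAGGCCTTTCATCCGGACAACAAT"
REVERSE_PRIMER = "TAATACGACTCACTATAGGGAGATTATAGAAGGAGTTGGCGGGTGGAG"

# First and last nucleotides of the dsRNA target (SEQ ID NO:268).
DSRNA_START = "GGAGGCCTTTCATCCGGACAACAATTGTCG"
DSRNA_END = "CATCTCCACCCGCCAACTCCTTCTATAA"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gene_specific_part(primer, tail=T7_TAIL):
    """Strip the shared tail and return the gene-specific 3' portion."""
    if not primer.startswith(tail):
        raise ValueError("primer does not start with the expected tail")
    return primer[len(tail):]


fwd = gene_specific_part(FORWARD_PRIMER)
rev = gene_specific_part(REVERSE_PRIMER)

# The forward primer should read into the start of the target; the reverse
# primer should be the reverse complement of the end of the target.
forward_matches = DSRNA_START.startswith(fwd)
reverse_matches = DSRNA_END.endswith(reverse_complement(rev))

print("forward primer matches target start:", forward_matches)
print("reverse primer matches target end:", reverse_matches)
```

Because both primers carry the same tail, a PCR fragment made with them would be expected to be transcribed from both ends, which fits the annealing of complementary RNAs described above.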


Cells are transfected with double stranded RNA in the presence of ‘Transfast’ transfection reagent. A control transfection of a non-endogenous RNA corresponding to RFP (red fluorescent protein) is carried out in parallel.


Analysis of Dlg1 Knockdown by RNAi in D-Mel2 Cells by Cellomics Mitotic Index Assay


For the transfection, 1 μg dsRNA is added to a well of a 96-well Packard viewplate and 35 μl of logarithmically growing Dmel-2 cells diluted to 2.3×10⁵ cells/ml in fresh Drosophila-SFM/glutamine/Pen-Strep are added. Cells are incubated with the dsRNA (60 nM) in a humid chamber at 28° C. for 1 hr before addition of 100 μl Drosophila-SFM/glutamine/Pen-Strep. Cells are incubated at 28° C. for 72 hours before analysis. For the assay, cells are fixed and stained using the Cellomics Mitotic Index HitKit following the manufacturer's instructions. The mitotic index of cells in each well is determined using the ArrayScan HCS System, running the Application protocol Mike250502_Polgen_MitoticIndex10×_p2.0 with the 10× objective and the DualBGlp filter set. This automated screening system detects the levels of a specific antigen (phosphorylated histone H3) which is only detectable during mitosis while the chromosomes are condensed.
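The principle of the mitotic index measurement can be illustrated with a short calculation. This is only a sketch under stated assumptions: it treats a cell as mitotic when its phosphorylated histone H3 signal is at or above a fixed threshold, and the intensity values and threshold below are hypothetical. The algorithm actually used by the ArrayScan software is not described here.

```python
def mitotic_index(intensities, threshold):
    """Percentage of cells whose phospho-histone H3 signal meets the threshold.

    intensities: per-cell staining intensities (arbitrary units).
    threshold:   cut-off above which a cell is scored as mitotic (assumed).
    """
    if not intensities:
        raise ValueError("no cells were scored")
    positive = sum(1 for value in intensities if value >= threshold)
    return 100.0 * positive / len(intensities)


# Hypothetical per-cell intensities for one well (illustration only).
example_intensities = [12, 15, 240, 9, 310, 14, 11, 13, 280, 10]
example_threshold = 100  # hypothetical cut-off

index = mitotic_index(example_intensities, example_threshold)
print(f"mitotic index: {index:.1f}%")
```

Comparing this percentage between wells treated with gene-specific dsRNA and control (RFP) dsRNA gives the kind of read-out reported for this assay.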


Results for Dlg1 (CG1725) are shown in FIG. 5. A reproducible and significant reduction in mitotic index is observed in this assay, indicating a reduction in the number of cells entering mitosis after RNAi.


Analysis of Dlg1 Knockdown by RNAi in D-Mel2 Cells by Microscopy


For transfection, 9 μl of Transfast reagent (Promega) is added to 3 μg gene specific dsRNA in 500 μl Drosophila Schneider's medium (no additives) and incubated at room temperature for 15 min. For control wells an equivalent amount of RFP dsRNA is used. This mix is added to a well of a 6-well tissue culture plate containing a glass coverslip and 500 μl of Dmel-2 cells at 1×10⁶ cells/ml in Schneider's medium. After a 1 hour incubation at 28° C., 2 ml Schneider's medium+10% FCS and pen/strep solution is added and cells are incubated at 28° C. for 48 hours. Cells on the coverslip are fixed in formaldehyde and stained with antibodies which detect α-tubulin and γ-tubulin (centrosomes), and are co-stained with DAPI to detect DNA.


Although no pronounced increase in the frequency of chromosomal defects (see Table 3 below) was observed upon RNAi, there was a small increase (30% compared to 10% in control cells) of spindle defects, of which the majority (>90%) had multiple centrosomes (more than 2).

TABLE 3
Mitotic defects observed in Dmel-2 cells after siRNA with Dlg1 (CG1725)

dsRNA | Number of cells with chromosomal defects | Number of cells with normal mitosis | % of chromosomal defects (no. of defects/total cells in mitosis)
No RNA | 135 | 314 | 39.47
RFP | 137 | 309 | 40.29
CG1725 | 152 | 169 | 47.35


Example 28B
Human Dlg1 and Dlg2 are Human Homologues of Drosophila Dlg1

BLASTP with Drosophila Dlg1 reveals 59% (306/517) sequence identity with regions of the human discs, large (Drosophila) homolog 1 (GENBANK ACCESSION U13896), and 60% (318/524) sequence identity with regions of human discs, large (Drosophila) homolog 2 (GENBANK ACCESSION U32376), indicating that human Dlg1 and Dlg2 are homologues of Drosophila Dlg1. The BLASTP results are shown in FIG. 6. FIG. 7 shows a Clustal W alignment of Drosophila Dlg1 and the five human Dlg homologues that are currently detailed in the NCBI database. Considering the homology between the human Dlg proteins, it is probable that some or all of them are functionally similar to Drosophila Dlg1.
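The identity percentages quoted above follow directly from the counts of identical residues and aligned positions. The sketch below recomputes them. The function name is illustrative, and the statement that the whole-number figures are obtained by truncation rather than rounding is an inference from the numbers themselves, not something stated in the BLASTP output.

```python
def percent_identity(identical, aligned_length):
    """Percentage of aligned positions that are identical."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned_length


# (label, identical residues, aligned positions) as quoted in the text.
alignments = [
    ("human Dlg1 (U13896)", 306, 517),
    ("human Dlg2 (U32376)", 318, 524),
]

for label, identical, aligned in alignments:
    exact = percent_identity(identical, aligned)
    # int() truncates, which reproduces the whole-number values in the text.
    print(f"{label}: {exact:.2f}% identity, quoted as {int(exact)}%")
```

Under these assumptions the two alignments give 59.19% and 60.69%, consistent with the quoted 59% and 60%.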


The nucleotide sequence of the human Dlg1 and human Dlg2 genes and their deduced amino acid sequences are shown in example 28 above.


Example 28C
Validation of the Mitotic Role of the Human Homologues by siRNA Knockdown of Dlg1 and Dlg2 Expression in Human Cultured Cells

Generation of siRNA human Dlg1 and Dlg2 Knockdowns


Knockdown of human Dlg1 and Dlg2 gene expression is achieved by siRNA (short interfering RNA, Elbashir et al, Nature 2001 May 24; 411(6836):494-8). Synthetic double stranded RNAs corresponding to two different regions of each of the human Dlg1 and Dlg2 mRNAs are used. Synthetic siRNAs are obtained from Dharmacon Inc. The siRNA sequences are:

COD1652 | dlg2-1 | AACAUUGUCGGUGGGGAAGAU (SEQ ID NO:271) | Corresponds to nucleotides 1576-1596 in human Dlg-2 (see example 28 above)
COD1653 | dlg2-2 | AAAACCCAGGUCUCUGGAACC (SEQ ID NO:272) | Corresponds to nucleotides 2664-2684 in human Dlg-2 (see example 28 above)
COD1654 | dlg1-1 | AAAGGGGAAAUUCAGGGCUUG (SEQ ID NO:273) | Corresponds to nucleotides 871-891 in human Dlg-1 (see example 28 above)
COD1655 | dlg1-2 | AAGUAGCAGGAAAGGGCAAAC (SEQ ID NO:274) | Corresponds to nucleotides 2647-2667 in human Dlg-1 (see example 28 above)
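The stated nucleotide positions can be checked by converting an siRNA sequence to DNA and locating it in the corresponding cDNA. The sketch below does this for the dlg1-1 siRNA (AAAGGGGAAAUUCAGGGCUUG), using only nucleotides 841-900 of the human Dlg1 sequence (SEQ ID NO:266) rather than the full sequence. The function name and the use of an excerpt with a start offset are illustrative choices.

```python
def sirna_target_coordinates(sirna_rna, excerpt_dna, excerpt_start):
    """Locate an siRNA sense sequence in a cDNA excerpt.

    sirna_rna:     siRNA sequence written as RNA (with U).
    excerpt_dna:   stretch of the cDNA, written as DNA.
    excerpt_start: 1-based position of the first excerpt nucleotide
                   in the full cDNA numbering.
    Returns (start, end) as 1-based inclusive positions, or None.
    """
    target = sirna_rna.upper().replace("U", "T")
    offset = excerpt_dna.upper().find(target)
    if offset < 0:
        return None
    start = excerpt_start + offset
    return start, start + len(target) - 1


# Nucleotides 841-900 of human Dlg1 (SEQ ID NO:266), spaces removed.
DLG1_EXCERPT = (
    "cagattatgaatatgaagaaatcacacttg"
    "aaaggggaaattcagggcttggtttcagca"
)
DLG1_EXCERPT_START = 841

coords = sirna_target_coordinates(
    "AAAGGGGAAAUUCAGGGCUUG", DLG1_EXCERPT, DLG1_EXCERPT_START
)
print("dlg1-1 maps to nucleotides", coords)
```

The same function could be applied to the other three siRNAs by supplying the relevant stretch of the Dlg1 or Dlg2 sequence and its starting position.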


Analysis of siRNA Hu Dlg1 and Dlg2 Knockdowns in U2OS Cells by Flow Cytometry Analysis


Cells are seeded in 6-well tissue culture dishes at 1×10⁵ cells/well in 2 ml Dulbecco's Modified Eagle's Medium (DMEM) (Sigma)+10% Foetal Bovine Serum (FBS) (Perbio), and incubated overnight (37° C./5% CO2).


For each well, 12 μl of 20 μM siRNA duplex (Dharmacon, Inc) (in RNAse-free H2O) is mixed with 200 μl of Optimem (Invitrogen). In a separate tube 8 μl of oligofectamine reagent (Invitrogen) is mixed with 52 μl of Optimem, and incubated at room temperature for 7-10 mins. The oligofectamine/Optimem mix is then added dropwise to the siRNA/Optimem mix, and this is then mixed gently, before being incubated for 15-20 mins at room temperature. During this incubation the cells are washed once with DMEM (with no FBS or antibiotics added). 600 μl of DMEM (no FBS or antibiotics) is then added to each well.


Following the 15-20 min incubation, 128 μl of Optimem is added to the siRNA/oligofectamine/Optimem mix, and this is added to the cells (in 600 μl DMEM). The transfection mix is added at the edge of each well to assist dilution before contact is made with the cells. Cells are then incubated with the transfection mix for 4 h (37° C./5% CO2). Subsequently 1 ml DMEM+20% FBS is added to each well. Cells are then incubated at 37° C./5% CO2 for 72 h. Cells are harvested by trypsinisation, washed in PBS, fixed in ice-cold 70% EtOH and stained with propidium iodide before FACS analysis.
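The volumes given for the transfection allow the siRNA and serum concentrations in each well to be worked out. The following sketch does the arithmetic, assuming that volumes are simply additive and that the DMEM used for the wash is fully removed before the 600 μl of DMEM is added. The variable names are illustrative.

```python
# Volumes per well, in microlitres, as given in the protocol.
SIRNA_VOLUME_UL = 12
SIRNA_STOCK_UM = 20            # siRNA duplex stock concentration (micromolar)
OPTIMEM_WITH_SIRNA_UL = 200
OLIGOFECTAMINE_UL = 8
OPTIMEM_WITH_OLIGOFECTAMINE_UL = 52
OPTIMEM_TOP_UP_UL = 128
DMEM_ON_CELLS_UL = 600
DMEM_20_FBS_UL = 1000          # added after the 4 h incubation
FBS_PERCENT_IN_ADDED_MEDIUM = 20

# Amount of siRNA: micromolar x microlitres = picomoles.
sirna_pmol = SIRNA_STOCK_UM * SIRNA_VOLUME_UL

# Total transfection mix added to each well.
mix_ul = (SIRNA_VOLUME_UL + OPTIMEM_WITH_SIRNA_UL + OLIGOFECTAMINE_UL
          + OPTIMEM_WITH_OLIGOFECTAMINE_UL + OPTIMEM_TOP_UP_UL)

# Volume in the well during the 4 h incubation and after adding serum medium.
volume_during_ul = mix_ul + DMEM_ON_CELLS_UL
volume_after_ul = volume_during_ul + DMEM_20_FBS_UL

# Concentrations: pmol per microlitre is micromolar; x1000 gives nanomolar.
sirna_nM_during = sirna_pmol * 1000 / volume_during_ul
sirna_nM_after = sirna_pmol * 1000 / volume_after_ul
fbs_percent_after = FBS_PERCENT_IN_ADDED_MEDIUM * DMEM_20_FBS_UL / volume_after_ul

print(f"transfection mix: {mix_ul} ul, siRNA: {sirna_pmol} pmol")
print(f"siRNA during 4 h incubation: {sirna_nM_during:.0f} nM")
print(f"siRNA after adding serum medium: {sirna_nM_after:.0f} nM")
print(f"FBS after adding serum medium: {fbs_percent_after:.0f}%")
```

On these assumptions the 1 ml of DMEM+20% FBS brings the well to 2 ml at 10% FBS, the same serum concentration as the medium in which the cells are seeded.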


siRNA Hu Dlg1 and Dlg2 knockdowns are conducted in U2OS cells. As shown in FIG. 8, major changes in the distribution of cells between cell cycle compartments (G1, S, G2/M) are seen with Dlg1 siRNA COD1564 and Dlg2 siRNA COD1562. In both cases an accumulation of cells with a 2N DNA content, indicated as the G2/M compartment of the cell cycle, is observed with a concomitant reduction in the 1N DNA content G1 compartment population. This indicates that a proportion of cells may be unable to exit mitosis and re-enter G1, and so may be unable to complete cytokinesis, or may have entered the next cycle as polyploid cells.


Subsequent microscopic analysis is performed in order to phenotype the Hu Dlg1 and Dlg2 siRNA induced defect and check for the presence of large multinucleate cells which may, due to their size and ploidy, be excluded from the FACS analysis.


Analysis of Hu Dlg1 and Dlg2 siRNA Knockdowns in U2OS Cells by Microscopy


The transfection method for samples for microscopy is identical to that for FACS except that cells are plated in wells containing a sterile glass coverslip. Cells are incubated with siRNA for 48 hours before formaldehyde fixation and co-staining with DAPI to reveal DNA (blue) and antibodies to reveal microtubules (red) and centrosomes (green). Antibodies used are: rat anti-alpha tubulin (YL12) (supplier Serotec) with secondary antibody goat anti-rat IgG-TRITC (supplier Jackson Immunoresearch) and mouse anti-gamma-tubulin (GTU88) with secondary antibody Alexagreen488-goat anti-mouse IgG (supplier Sigma).


Phenotype analysis by microscopy is conducted on U2OS cells. Results from duplicate experiments in U2OS cells are shown in FIGS. 9 and 10, and Table 4 below. Generally, after siRNA treatment more of the cells in mitosis appear to be in the early stages (prometaphase) rather than the later stages (metaphase, anaphase, telophase). A high frequency of cells have multiple centrosomes, as is also observed after RNAi in Dmel-2 cells (see above). In addition, transfected cells appear to be unable to carry out cytokinesis successfully, which may account for the increase in polyploid cells.

TABLE 4
Brief description of significant cell division defects after Dlg1 and 2 siRNA in U2OS cells.

Gene/siRNA | Dlg1/COD1564 | Dlg2/COD1562
Cell Type | U2OS | U2OS
Polyploidy | Increased (4.8/field compared to 1.6/field in untreated) | Increased (4.8/field compared to 1.6/field in untreated)
Mitotic Defects | Increased (23% compared to 13% in untreated) | Increased (36% compared to 13% in untreated)
Main knockout phenotype | Increased number of multi-centrosomal cells (7.3% compared to 2.6% in untreated) | Increased number of multi-centrosomal cells (6.6% compared to 2.6% in untreated)
 | Cytokinesis defects (10% compared to 0% in untreated) | Cytokinesis defects (23% compared to 0% in untreated)
 | Large increase in apoptotic cells | Large increase in apoptotic cells
Additional observations | Increase in ratio of prophase to prometaphase (61% compared to 43% in untreated cells) | Increase in ratio of prophase to prometaphase (72% compared to 43% in untreated cells)
 | Decrease in ratio of metaphase (6% compared to 22% in untreated cells) | Decrease in ratio of metaphase (5% compared to 22% in untreated cells)
 | | Decrease in ratio of anaphase and telophase (19% compared to 27% in untreated cells)


The above results confirm that Dlg1 and Dlg2 are involved in cell cycle progression, in particular in achieving successful cell separation during cytokinesis. The multiplication of centrosomes in many cells after Dlg1 or Dlg2 RNAi may reflect failure to undergo cytokinesis so that cells prematurely enter the next cycle, or may indicate that the centrosome duplication cycle is overriding normal cell cycle checkpoints. Accordingly, modulators of Dlg1 and Dlg2 activity (as identified by the assays described above) may be used to treat any proliferative disease.


Example 28D
Expression of Recombinant Hu Dlg Protein in Insect Cells

A cDNA encoding the human Dlg1 or Dlg2 coding region, derived by RT-PCR, is inserted into the baculovirus expression vector pFastbacHTc (Life Technologies). A baculovirus stock is generated, and western blotting of subsequent infections of Sf9 insect cells demonstrates expression of N-terminal 6-His tagged proteins of approximately 100 kD (Dlg1) and 97 kD (Dlg2). The recombinant protein is purified by Ni-NTA resin affinity chromatography.


Similarly, 6-His tagged Dlg proteins are expressed in bacteria by inserting cDNAs into bacterial expression plasmids pDest17 or pET series. Protein expression in cultures of host E. coli cells transformed with recombinant plasmid is induced by addition of the inducer chemical IPTG. The recombinant protein is purified by Ni-NTA resin affinity chromatography.


Example 28E
Assay for Modulators of Dlg Activity

Dlgs are membrane-associated guanylate kinase (MAGUK) homologues and contain several protein-protein interaction domains, including PDZ domains, SH3 domains and a C-terminal guanylate kinase homology region that does not possess guanylate kinase activity but may act as a protein-protein interaction domain. Several proteins are known to bind huDlg1, including the adenomatous polyposis coli (APC) tumour suppressor protein, the human papillomavirus E6 transforming protein, the transforming adenovirus E4 protein, and the PDZ-binding kinase PBK (Gaudet et al., 2000). An assay for modulators of Dlg activity would consist of an ELISA-type assay in which full length Dlg protein, or individual PDZ domains of Dlg protein, expressed in bacteria or insect cells (as described above), are bound to a solid support. Interaction with the PDZ binding proteins described above could then be measured by antibody detection of, or radioactive labelling of, the PDZ binding proteins.
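One way the readout of such a binding assay might be scored is as percent inhibition of the binding signal relative to a no-compound control. The sketch below is an assumption for illustration only: the text does not specify how the signal would be quantified, and the function names, the 50% threshold and the absorbance values used in the usage note are all hypothetical.

```python
from typing import Dict


def percent_inhibition(signal: float, max_signal: float, background: float) -> float:
    """Percent loss of binding signal relative to the no-compound control.

    signal      -- reading with the candidate substance present
    max_signal  -- reading with binding partner but no candidate substance
    background  -- reading with no binding partner (assumed blank well)
    """
    window = max_signal - background
    if window <= 0:
        raise ValueError("max_signal must be greater than background")
    return 100.0 * (1.0 - (signal - background) / window)


def flag_modulators(
    readings: Dict[str, float],
    max_signal: float,
    background: float,
    threshold: float = 50.0,  # hypothetical cut-off, not taken from the text
) -> Dict[str, float]:
    """Return candidates whose inhibition meets the threshold, rounded to 0.1%."""
    hits = {}
    for name, signal in readings.items():
        inhibition = percent_inhibition(signal, max_signal, background)
        if inhibition >= threshold:
            hits[name] = round(inhibition, 1)
    return hits
```

For example, with hypothetical readings of 0.32 for one candidate and 1.15 for another, a control of 1.20 and a background of 0.10, only the first candidate would be flagged, at roughly 80% inhibition. Substances that increase binding give negative values under this definition and would need a separate criterion.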


Example 29 (Category 3)

Line ID—419


Phenotype—Lethal phase, prepupal-pupal. High mitotic index, colchicine-like chromosome condensation, metaphase arrest


Annotated Drosophila genome genomic segment containing P element insertion site (and map position)—AE003450 (9C)


P element insertion site—292,726


Annotated Drosophila genome Complete Genome candidate


CG12638—sprint, ras associated protein

(SEQ ID NO:275)
ATGTTTGCCATATCATTGCAGCTGCTCAGCTCGCTGGCCAGCGATTTGGACATAATGCTAAACGATCTTCGATCGGCGCCGAGTCATGCTGCAACAGCAACAGCAACAGCAACAACAACGGCAACAGTTGCAACTGCAACCGCAACAACAACGGCCAACCGGCAGCAGCAACATCATAATCACCATAATCAGCAGCAAATGCAATCAAGGCAATTGCATGCACATCATTGGCAGAGCATTAACAACAATAAGAATAACAACATTAGTAACAAAAACAACAACAACAACAACAATAATAACAATAACATTAATAACAATAATAATAATAATAATCATTCGGCACACCCACCTTGCCTGATCGATATTAAGCTGAAGTCAAGCCGATCGGCAGCAACAAAAATAACCCATACAACAACCGCCAATCAGCTGCAGCAACAACAACGCCGCCGTGTGGCACCCAAGCCACTGCCACGCCCACCGCGACGTACCCGCCCAACGGGACAAAAGGAGGTGGGGCCGTCTGAAGAGGATGGGGACACGGATGCCAGTGACCTGGCCAATATGACATCACCGCTGAGCGCCAGTGCAGCGGCCACTCGAATCAACGGCCTCTCGCCGGAAGTGAAGAAAGTCCAGCGGTTGCCACTGTGGAATGCGCGAAACGGAAACGGAAGTACCACCACCCACTGTCACCCAACCGGCGTCTCTGTGCAACGCCGTCTGCCCATCCAAAGTCATCAGCAGCGAATTCTAAACCAACGATTTCATCACCAGCGAATGCATCATGGGTAA

(SEQ ID NO:276)
MFAISLQLLSSLASDLDIMLNDLRSAPSHAATATATATTTATVATATATTTANRQQQHHNHHNQQQMQSRQLHAHHWQSINNKNNNISNKNNNNNNNNNNNINNNNNNNNHSAHPPCLIDIKLKSSRSAATKITHTTTANQLQQQQRRRVAPKPLPRPPRRTRPTGQKEVGPSEEDGDTDASDLANMTSPLSASAAATRINGLSPEVKKVQRLPLWNARNGNGSTTTHCHPTGVSVQRRLPIQSHQQRILNQRFHHQRMHHG
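The relationship between the nucleotide sequence of SEQ ID NO:275 and the polypeptide of SEQ ID NO:276 can be checked by conceptual translation with the standard genetic code. The following is a minimal sketch: the function name `translate` and the compact codon-table construction are illustrative choices, and only the first ten and last four codons of SEQ ID NO:275 are used in the usage note.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order:
# the first base varies slowest and the third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon.

    Spaces are ignored and case is normalised; codons containing
    unrecognised characters are reported as 'X'.
    """
    sequence = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First ten codons of SEQ ID NO:275
    print(translate("ATGTTTGCCATATCATTGCAGCTGCTCAGC"))
    # Last four codons of SEQ ID NO:275, ending in the TAA stop codon
    print(translate("CATCATGGGTAA"))
```

The first call yields MFAISLQLLS and the second yields HHG, matching the start and end of SEQ ID NO:276.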


Human homologue of Complete Genome candidate


B38637—Ras inhibitor (clone JC265)—human (fragment)

(SEQ ID NO:277)
1 ggccggcagc ggctgagcga catgagcatt tctacttcct cctccgactc gctggagttc
61 gaccggagca tgcctctgtt tggctacgag gcggacacca acagcagcct ggaggactac
121 gagggggaaa gtgaccaaga gaccatggcg ccccccatca agtccaaaaa gaaaaggagc
181 agctccttcg tgctgcccaa gctcgtcaag tcccagctgc agaaggtgag cggggtgttc
241 agctccttca tgaccccgga gaagcggatg gtccgcagga tcgccgagct ttcccgggac
301 aaatgcacct acttcgggtg cttagtgcag gactacgtga gcttcctgca ggagaacaag
361 gagtgccacg tgtccagcac cgacatgctg cagaccatcc ggcagttcat gacccaggtc
421 aagaactatt tgtctcagag ctcggagctg gaccccccca tcgagtcgct gatccctgaa
481 gaccaaatag atgtggtgct ggaaaaagcc atgcacaagt gcatcttgaa gcccctcaag
541 gggcacgtgg aggccatgct gaaggacttt cacatggccg atggctcatg gaagcaactc
601 aaggagaacc tgcagcttgt gcggcagagg aatccgcagg agctgggggt cttcgccccg
661 acccctgatt ttgtggatgt ggagaaaatc aaagtcaagt tcatgaccat gcagaagatg
721 tattcgccgg aaaagaaggt catgctgctg ctgcgggtct gcaagctcat ttacacggtc
781 atggagaaca actcagggag gatgtatggc gctgatgact tcttgccagt cctgacctat
841 gtcatagccc agtgtgacat gcttgaattg gacactgaaa tcgagtacat gatggagctc
901 ctagacccat cgctgttaca tggagaagga ggctattact tgacaagcgc atatggagca
961 ctttctctga taaagaattt ccaagaagaa caagcagcgc gactgctcag ctcagaaacc
1021 agagacaccc tgaggcagtg gcacaaacgg agaaccacca accggaccat cccctctgtg
1081 gacgacttcc agaattacct ccgagttgca tttcaggagg tcaacagtgg ttgcacagga
1141 aagaccctcc ttgtgagacc ttacatcacc actgaggatg tgtgtcagat ctgcgctgag
1201 aagttcaagg tgggggaccc tgaggagtac agcctctttc tcttcgttga cgagacatgg
1261 cagcagctgg cagaggacac ttaccctcaa aaaatcaagg cggagctgca cagccgacca
1321 cagccccaca tcttccactt tgtctacaaa cgcatcaaga acgatcctta tggcatcatt
1381 ttccagaacg gggaagaaga cctcaccacc tcctagaaga caggcgggac ttcccagtgg
1441 tgcatccaaa ggggagctgg aagccttgcc ttcccgcttc tacatgcttg agcttgaaaa
1501 gcagtcacct cctcggggac ccctcagtgt agtgactaag ccatccacag gccaactcgg
1561 ccaagggcaa ctttagccac gcaaggtagc tgaggtttgt gaaacagtag gattctcttt
1621 tggcaatgga gaattgcatc tgatggttca agtgtcctga gattgtttgc tacctacccc
1681 cagtcaggtt ctaggttggc ttacaggtat gtatatgtgc agaagaaaca cttaagatac
1741 aagttctttt gaattcaaca gcagatgctt gcgatgcagt gcgtcaggtg attctcactc
1801 ctgtggatgg cttcatccct g

(SEQ ID NO:278)
1 grqrlsdmsi stsssdslef drsmplfgye adtnssledy egesdqetma ppikskkkrs
61 ssfvlpklvk sqlqkvsgvf ssfmtpekrm vrriaelsrd kctyfgclvq dyvsflqenk
121 echvsstdml qtirqfmtqv knylsqssel dppieslipe dqidvvleka mhkcilkplk
181 ghveamlkdf hmadgswkql kenlqlvrqr npqelgvfap tpdfvdveki kvkfmtmqkm
241 yspekkvmll lrvckliytv mennsgrmyg addflpvlty viaqcdmlel dteieymmel
301 ldpsllhgeg gyyltsayga lsliknfqee qaarllsset rdtlrqwhkr rttnrtipsv
361 ddfqnylrva fqevnsgctg ktllvrpyit tedvcqicae kfkvgdpeey slflfvdetw
421 qqlaedtypq kikaelhsrp qphifhfvyk rikndpygii fqngeedltt s


Putative function


Ras associated effector protein


REFERENCES

Altschul, S. F. and Lipman, D. J. (1990) Protein database searches for multiple alignments. Proc. Natl. Acad. Sci. USA 87: 5509-5513


Burge, C. and Karlin, S. (1997) Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78-94.


Deak, P., Omar, M. M., Saunders, R. D. C., Pal, M., Komonyi, O., Szidonya, J., Maroy, P., Zhang, Y., Ashburner, M., Benos, P., Savakis, C., Siden-Kiamos, I., Louis, C., Bolshakov, V. N., Kafatos, F. C., Madueno, E., Modolell, J., Glover, D. M. (1997) Correlating physical and cytogenetic maps in chromosomal region 86E-87F of Drosophila melanogaster. Genetics 147:1697-1722.


Gaudet, S., Branton, D. and Lue, R. A. (2000) Characterisation of PDZ-binding kinase, a mitotic kinase. Proc. Natl. Acad. Sci. USA 97: 5167-5172


Jowett, T. (1986) Preparation of nucleic acids. In “Drosophila: A Practical Approach.” Ed Roberts, D. B. IRL Press Oxford.


Lefevre, G. (1976) A photographic representation and interpretation of the polytene chromosomes of Drosophila melanogaster salivary glands. In: The Genetics and Biology of Drosophila, Eds Ashburner, M. and Novitski, E. Academic Press.


Pirrotta, V. (1986) Cloning Drosophila genes. In “Drosophila: A Practical Approach.” Ed Roberts, D. B. IRL Press Oxford.


Saunders, R. D. C., Glover, D. M., Ashburner, M., Siden-Kiamos, I., Louis, C., Monastirioti, M., Savakis, C., Kafatos, F. C. (1989) PCR amplification of DNA microdissected from a single polytene chromosome band: a comparison with conventional microcloning. Nucleic Acids Res. 17:9027-9037


Takada, T., Matozaki, T., Takeda, H., Fukunaga, K., Noguchi, T., Fujioka, Y., Okazaki, I., Tsuda, M., Yamao, T., Ochi, F., Kasuga, M. (1998) Roles of the complex formation of SHPS-1 with SHP-2 in insulin-stimulated mitogen-activated protein kinase activation. J. Biol. Chem. 273(15):9234-9242


Torok, T., Tick, G., Alvarado, M., Kiss, I. (1993) P-lacW insertional mutagenesis on the second chromosome of Drosophila melanogaster: isolation of lethals with different overgrowth phenotypes. Genetics 135(1):71-80


Each of the applications and patents mentioned in this document, and each document cited or referenced in each of the above applications and patents, including during the prosecution of each of the applications and patents (“application cited documents”) and any manufacturer's instructions or catalogues for any products cited or mentioned in each of the applications and patents and in any of the application cited documents, are hereby incorporated herein by reference. Furthermore, all documents cited in this text, and all documents cited or referenced in documents cited in this text, and any manufacturer's instructions or catalogues for any products cited or mentioned in this text, are hereby incorporated herein by reference.


Various modifications and variations of the described methods and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the claims.

Claims
  • 1. Use of a polynucleotide as set out in Table 5, or a polypeptide encoded by the polynucleotide, in a method of prevention, treatment or diagnosis of a disease in an individual.
  • 2. A use as claimed in claim 1, in which the polynucleotide comprises a human polypeptide as set out in column 3 of Table 5.
  • 3. A use as claimed in claim 1 or 2, in which the polynucleotide or polypeptide is used to identify a substance capable of binding to the polypeptide, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.
  • 4. A use as claimed in claim 1, 2 or 3, in which the polynucleotide or polypeptide is used to identify a substance capable of modulating the function of the polypeptide, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.
  • 5. A use as claimed in any preceding claim, in which the polynucleotide or polypeptide is administered to an individual in need of such treatment.
  • 6. A use as claimed in any preceding claim, in which the substance identified by the method is administered to an individual in need of such treatment.
  • 7. A use as claimed in claim 1 or 2 in a method of diagnosis, in which the presence or absence of a polynucleotide is detected in a biological sample in a method comprising: (a) bringing the biological sample containing nucleic acid such as DNA or RNA into contact with a probe comprising a fragment of at least 15 nucleotides of the polynucleotide as set out in Table 5 under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.
  • 8. A use as claimed in claim 1 or 2 in a method of diagnosis, in which the presence or absence of a polypeptide is detected in a biological sample in a method comprising: (a) providing an antibody capable of binding to the polypeptide; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.
  • 9. A use as claimed in any preceding claim, in which the disease comprises a proliferative disease such as cancer.
  • 10. A method of modulating, preferably down-regulating, the expression of a polynucleotide as set out in Table 5 in a cell, the method comprising introducing a double stranded RNA (dsRNA) corresponding to the polynucleotide, or an antisense RNA corresponding to the polynucleotide, or a fragment thereof, into the cell.
  • 11. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Example 19, preferably Shp2 polynucleotide, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 12. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Example 28, preferably Dlg1 or Dlg2 polynucleotide, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 13. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Table 5 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Table 5, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Table 5, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 14. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 1 to 18, 20 to 27 and 29, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 15. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 1, 2, 2A, 2B and 2C, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 16. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 3 to 9 and 9A or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 3 to 9 and 9A, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 3 to 9 and 9A, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 17. A polynucleotide selected from: (a) polynucleotides comprising any one of the nucleotide sequences set out in Examples 10 to 29 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequences set out in Examples 10 to 29, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequences set out in Examples 10 to 29, or a fragment thereof; (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
  • 18. A polynucleotide probe which comprises a fragment of at least 15 nucleotides of a polynucleotide according to any of claims 11 to 16.
  • 19. A polypeptide which comprises any one of the amino acid sequences set out in any of the following: (a) Example 19, preferably Shp2 polypeptide; (b) Example 28, preferably Dlg1 or Dlg2 polypeptide; (c) Table 5; (d) Examples 1 to 18, 20 to 27 and 29; (e) Examples 1 to 2, 2A, 2B and 2C; (f) Examples 3 to 9 and 9A; (g) Examples 10 to 29; or a homologue, variant, derivative or fragment thereof.
  • 20. A polynucleotide encoding a polypeptide according to claim 19.
  • 21. A vector comprising a polynucleotide according to any of claims 11 to 18 and 20.
  • 22. An expression vector comprising a polynucleotide according to any of claims 11 to 18 and 20 operably linked to a regulatory sequence capable of directing expression of said polynucleotide in a host cell.
  • 23. An antibody capable of binding a polypeptide according to claim 19.
  • 24. A method for detecting the presence or absence of a polynucleotide according to any of claims 11 to 18 and 20 in a biological sample which comprises: (a) bringing the biological sample containing DNA or RNA into contact with a probe according to claim 18 under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.
  • 25. A method for detecting a polypeptide according to claim 19 present in a biological sample which comprises: (a) providing an antibody according to claim 23; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.
  • 26. A polynucleotide according to any of claims 11 to 18 and 20 for use in therapy.
  • 27. A polypeptide according to claim 19 for use in therapy.
  • 28. An antibody according to claim 23 for use in therapy.
  • 29. A method of treating a tumour or a patient suffering from a proliferative disease comprising administering to a patient in need of treatment an effective amount of a polynucleotide according to any of claims 11 to 18 and 20.
  • 30. A method of treating a tumour or a patient suffering from a proliferative disease, comprising administering to a patient in need of treatment an effective amount of a polypeptide according to claim 19.
  • 31. A method of treating a tumour or a patient suffering from a proliferative disease, comprising administering to a patient in need of treatment an effective amount of an antibody according to claim 23.
  • 32. Use of a polypeptide according to claim 19 in a method of identifying a substance capable of affecting the function of the corresponding gene.
  • 33. Use of a polypeptide according to claim 19 in an assay for identifying a substance capable of inhibiting the cell division cycle.
  • 34. Use as claimed in claim 33, in which the substance is capable of inhibiting mitosis and/or meiosis.
  • 35. A method for identifying a substance capable of binding to a polypeptide according to claim 19, which method comprises incubating the polypeptide with a candidate substance under suitable conditions and determining whether the substance binds to the polypeptide.
  • 36. A method for identifying a substance capable of modulating the function of a polypeptide according to claim 19 or a polypeptide encoded by a polynucleotide according to any of claims 11 to 18 and 20, the method comprising the steps of: incubating the polypeptide with a candidate substance and determining whether activity of the polypeptide is thereby modulated.
  • 37. A substance identified by a method or assay according to any of claims 32 to 36.
  • 38. Use of an antibody according to claim 23 or a substance according to claim 37 in a method of inhibiting the function of a polypeptide.
  • 39. Use of an antibody according to claim 23 or a substance according to claim 37 in a method of regulating a cell division cycle function.
  • 40. A method of identifying a human nucleic acid sequence, by: (a) selecting a Drosophila polypeptide identified in any of Examples 11 to 39; (b) identifying a corresponding human polypeptide; (c) identifying a nucleic acid encoding the polypeptide of (b).
  • 41. A method according to claim 40, in which a human homologue of the Drosophila sequence, or a human sequence similar to the Drosophila sequence, is identified in step (b).
  • 42. A method according to claim 40 or 41, in which the human polypeptide has at least one of the biological activities, preferably substantially all the biological activities of the Drosophila polypeptide.
  • 43. A human polypeptide identified by a method according to claim 40, 41 or 42.
Priority Claims (3)
Number Date Country Kind
GB 0126506.5 Nov 2001 GB national
GB 0128384.5 Nov 2001 GB national
GB 0203185.4 Feb 2002 GB national
Continuations (2)
Number Date Country
Parent 10840060 May 2004 US
Child 11634815 Dec 2006 US
Parent PCT/GB02/04780 Oct 2002 US
Child 10840060 May 2004 US