The present invention relates generally to the cloning, identification, and expression of the CA125 gene's glycosylated amino terminal domain, the multiple repeat domain, and the carboxy terminal domain in vitro and, more specifically, to the use of recombinant CA125 with epitope binding sites for diagnostic and therapeutic purposes. Additionally, the genomic DNA, a molecule encoding a 5′ upstream region of CA125, and a genomic DNA sequence for the amino terminal, extracellular repeats, and carboxy terminal of CA125 have been determined.
CA125 is an antigenic determinant located on the surface of ovarian carcinoma cells with essentially no expression in normal adult ovarian tissue. Elevated in the sera of patients with ovarian adenocarcinoma, CA125 has played a critical role for more than 15 years in the management of these patients relative to their response to therapy and also as an indicator of recurrent disease.
It is well established that CA125 is not uniquely expressed in ovarian carcinoma, but is also found in both normal secretory tissues and other carcinomas (e.g., pancreas, liver, colon) [Hardardottir H et al., Distribution of CA125 in embryonic tissue and adult derivatives of the fetal periderm, Am J Obstet. Gynecol. 163; 6(1):1925-1931 (1990); Zurawski V R et al., Tissue distribution and characteristics of the CA125 antigen, Cancer Rev. 11-12:102-108 (1988); and O'Brien T J et al., CA125 antigen in human amniotic fluid and fetal membranes, Am J Obstet. Gynecol. 155:50-55, (1986); Nap M et al., Immunohistochemical characterization of 22 monoclonal antibodies against the CA125 antigen: 2nd report from the ISOBM TD-1 workshop, Tumor Biology 17:325-332 (1996)]. Notwithstanding, CA125 correlates directly with the disease status of affected patients (i.e., progression, regression, and no change), and has become the “gold standard” for monitoring patients with ovarian carcinoma [Bast R C et al., A radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian cancer, N Engl J Med. 309:883-887 (1983); and Bon G C et al., Serum tumor marker immunoassays in gynecologic oncology: Establishment of reference values, Am J Obstet. Gynecol. 174:107-114 (1996)]. CA125 is especially useful in post-menopausal patients where endometrial tissue has become atrophic and, as a result, is not a major source of normal circulating CA125.
During the mid-1980s, the inventor of the present invention and others developed M11, a monoclonal antibody to CA125. M11 binds to a dominant epitope on the repeat structure of the CA125 molecule [O'Brien T J et al., New monoclonal antibodies identify the glycoprotein carrying the CA125 epitope, Am J Obstet Gynecol 165:1857-64 (1991)]. More recently, the inventor and others developed a purification and stabilization scheme for CA125, which allows for the accumulation of highly purified high molecular weight CA125 [O'Brien T J et al., More than 15 years of CA125: What is known about the antigen, its structure and its function, Int J Biological Markers 13(4):188-195 (1998)].
Considerable progress has been made over the years to further characterize the CA125 molecule, its structure and its function. The CA125 molecule is a high molecular weight glycoprotein with a predominance of O-linked sugar side chains. The native molecule exists as a very large complex (˜2-5 million daltons). The complex appears to be composed of an epitope-containing CA125 molecule and binding proteins which carry no CA125 epitopes. The CA125 molecule is heterogeneous in both size and charge, most likely due to continuous deglycosylation of the side chains during its life-span in bodily fluids. The core CA125 subunit is in excess of 200,000 daltons, and retains the capacity to bind both OC125 and M11 class antibodies.
Despite the advances in detection and quantitation of serum tumor markers like CA125, the majority of ovarian cancer patients are still diagnosed at an advanced stage of the disease, Stage III or IV. Further, the management of patients' responses to treatment and the detection of disease recurrence remain major problems. There thus remains a need to significantly improve and standardize current CA125 assay systems. Further, the development of an early indicator of risk of ovarian cancer will provide a useful tool for early diagnosis and improved prognosis.
The genomic DNA and a full-length cDNA sequence of human CA125 have been determined. Additionally, a nucleic acid molecule encoding a 5′ upstream region of the CA125 gene has been determined.
The genomic sequence for CA125 and a 5′ upstream region has been determined. A DNA sequence showing the 5′ upstream region and the amino terminal portion of the CA125 molecule is set out in Table 27. The extracellular amino terminal domain is made of exons: Exon 1 from 2205-11679; Exon 2 from 13464-13570; Exon 3 from 16177-34419; Exon 4 from 34575-38024; Exon 5 from 38689-38800; Exon 6 from 40578-45257; Exon 7 from 47360-47395; Exon 8 from 52407-52442; Exon 9 from 52686-52744 as set out in SEQ ID NO 311. A DNA sequence showing the extracellular repeat portion of the CA125 molecule is set out in Table 28. The repeat portion is made of exons: Exon R1 from 1-130; Exon R2 from 442-510; Exon R3 from 5479-5652; Exon R4 from 6301-6334; Exon R5 from 6593-6657; Exon R1 from 7558-7683; Exon R2 from 8216-8284; Exon R3 from 8877-9050; Exon R4 from 9380-9413; Exon R5 from 9675-9739; Exon R1 from 10201-10291; Exon R2 from 10524-10592; Exon R3 from 11200-11373; Exon R4 from 11722-11755; Exon R5 from 12016-12036; Exon R1 from 12169-12295; Exon R2 from 12532-12600; Exon R3 from 13219-13392; Exon R4 from 13723-13756; Exon R5 from 14016-14077; Exon R1 from 15001-15126; Exon R2 from 15367-15435; Exon R1 from 15648-15773; Exon R2 from 16002-16070; Exon R3 from 16653-16826; Exon R4 from 17158-17191; Exon R5 from 17453-17517; Exon R1 from 18532-18657; Exon R2 from 18888-18956; Exon R3 from 19633-19806; Exon R4 from 20141-20176; Exon R5 from 20387-20449; Exon R1 from 21609-21731; Exon R2 from 21940-22008; Exon R3 from 22605-22778; Exon R4 from 23109-23142; Exon R1 from 29046-29168; Exon R2 from 29266-29334; Exon R3 from 33917-34090; Exon R4 from 36702-36734; Exon R5 from 38270-38320; Exon R1 from 39104-39224; Exon R2 from 39315-39383; Exon R3 from 39532-39705; Exon R4 from 41862-41992 as set out in SEQ ID NO 312. A DNA sequence showing the carboxy terminal domain of the CA125 molecule is set out in Table 29. 
The carboxy terminal portion is made of exons: Exon C1 from 1-66; Exon C2 from 1802-1947; Exon C3 from 4198-4350; Exon C4 from 4679-4747; Exon C5 from 6811-6978; Exon C6 from 11232-11270; Exon C7 from 11594-11677; Exon C8 from 14095-14187 as set out in SEQ ID NO 313. A full length cDNA molecule for CA125 is set out in Table 30 and SEQ ID NO 314. A CA125 protein is set out in Table 31 and SEQ ID NO 315.
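The exon coordinates listed above lend themselves to simple arithmetic. The following is a minimal sketch, not part of the disclosed method, that computes exon lengths, the total exonic length, and the intervening (intron) lengths for the carboxy terminal exons C1 through C8. It assumes the coordinates are 1-based and inclusive positions in SEQ ID NO 313; the names `CARBOXY_EXONS`, `exon_length`, and `intron_lengths` are illustrative only.

```python
# Carboxy terminal exon coordinates copied from the text (SEQ ID NO 313).
# Assumption: positions are 1-based and inclusive.
CARBOXY_EXONS = [
    ("C1", 1, 66),
    ("C2", 1802, 1947),
    ("C3", 4198, 4350),
    ("C4", 4679, 4747),
    ("C5", 6811, 6978),
    ("C6", 11232, 11270),
    ("C7", 11594, 11677),
    ("C8", 14095, 14187),
]


def exon_length(start: int, end: int) -> int:
    """Length in nucleotides of an exon with inclusive coordinates."""
    return end - start + 1


def intron_lengths(exons):
    """Number of nucleotides between each pair of consecutive exons."""
    return [nxt[1] - cur[2] - 1 for cur, nxt in zip(exons, exons[1:])]


lengths = {name: exon_length(start, end) for name, start, end in CARBOXY_EXONS}
total_exonic = sum(lengths.values())

if __name__ == "__main__":
    for name, size in lengths.items():
        print(f"Exon {name}: {size} nt")
    print(f"Total exonic length: {total_exonic} nt")
    print(f"Intron lengths: {intron_lengths(CARBOXY_EXONS)}")
```

The same helper functions can be applied to the amino terminal and repeat exon lists by substituting their coordinates.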
The CA125 gene has been cloned and multiple repeat sequences as well as the glycosylated amino terminal and the carboxy terminus have been identified. CA125 requires a transcript of more than 35,000 bases and occupies approximately 150,000 bp on chromosome 19q13.2. The CA125 molecule comprises three major domains: an extracellular amino terminal domain (Domain 1); a large multiple repeat domain (Domain 2); and a carboxy terminal domain (Domain 3) which includes a transmembrane anchor with a short cytoplasmic domain. The amino terminal domain is assembled by combining five genomic exons, four very short amino terminal sequences and one extraordinarily large exon. This domain is dominated by its capacity for O-glycosylation and its resultant richness in serine and threonine residues. Additionally, an amino terminal extension is present, which comprises four genomic exons. Analysis of the amino terminal extension revealed that its amino acid composition is consistent with the amino acid composition of the amino terminal domain.
The extracellular repeat domain, which characterizes the CA125 molecule, also represents a major portion of the CA125 molecular structure. It is downstream from the amino terminal domain and presents itself in a much different manner to its extracellular matrix neighbors. These repeats are characterized by many features, including a highly conserved nature and uniformity in exon structure. Most consistently, each repeat contains a cysteine-enclosed sequence that may form a cysteine loop. Domain 2 comprises 156 amino acid repeat units of the CA125 molecule. The repeat domain constitutes the largest proportion of the CA125 molecule. The repeat units also include the epitopes now well-described and classified for both major classes of CA125 antibodies, the OC125 group and the M11 group. More than 60 repeat units have been identified, sequenced, and contiguously placed in the CA125 domain structure. The repeat sequences demonstrated 70-85% homology to each other. The existence of the repeat sequences was confirmed by expression of the recombinant protein in E. coli where both OC125/M11 class antibodies were found to bind to sites on the CA125 repeat.
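To make the 70-85% figure concrete, the following sketch shows one simple way a pairwise percentage could be computed for two repeat units. It is an illustration under stated assumptions, not the method used to obtain the figure above: the text does not specify how homology was calculated, the function `percent_identity` is a hypothetical helper, and the two short sequences are invented placeholders rather than CA125 repeats. The sequences are assumed to be already aligned and of equal length.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions carrying the same residue in two
    pre-aligned, equal-length sequences (position-by-position identity)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Invented 10-residue placeholders differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

For real repeat units, an alignment step that handles insertions and deletions would normally precede this calculation.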
The CA125 molecule is anchored at its carboxy terminal through a transmembrane domain and a short cytoplasmic tail. The carboxy terminal also contains a proteolytic cleavage site approximately 50 amino acids upstream from the transmembrane domain, which allows for proteolytic cleavage and release of the CA125 molecule. The identification and sequencing of multiple repeat domains of the CA125 antigen provides potentially new clinical and therapeutic applications for detecting, monitoring and treating patients with ovarian cancer and other carcinomas where CA125 is expressed. For example, the ability to express repeat domains of CA125 with the appropriate epitopes would provide a much needed standard reagent for research and clinical applications. Current assays for CA125 utilize as standards either CA125 produced from cultured cell lines or from patient ascites fluid. Neither source is defined with regard to the quality or purity of the CA125 molecule. The present invention overcomes the disadvantages of current assays by providing multiple repeat domains of CA125 with epitope binding sites. At least one or more of any of the more than 60 repeats shown in Table 16 can be used as a “gold standard” for testing the presence of CA125. Furthermore, new and more specific assays may be developed utilizing recombinant products for antibody production.
Perhaps even more significantly, the multiple repeat domains of CA125 or other domains could also be used for the development of a potential vaccine for patients with ovarian cancer. In order to induce cellular and humoral immunity in humans to CA125, murine antibodies specific for CA125 were utilized in anticipation of patient production of anti-idiotypic antibodies, thus indirectly allowing the induction of an immune response to the CA125 molecule. With the availability of recombinant CA125, especially domains which encompass epitope binding sites for known murine antibodies, it will be feasible to more directly stimulate patients' immune systems to CA125 and, as a result, extend the life of ovarian carcinoma patients.
The recombinant CA125 of the present invention may also be used to develop therapeutic targets. Molecules like CA125, which are expressed on the surface of tumor cells, provide potential targets for immune stimulation, drug delivery, biological modifier delivery or any agent which can be specifically delivered to ultimately kill the tumor cells. Humanized or human antibodies to CA125 epitopes could be used to deliver drugs or toxic agents, including radioactive agents, to mediate direct killing of tumor cells. Natural ligands having a natural binding affinity for domains on the CA125 molecule could also be utilized to deliver therapeutic agents to tumor cells.
CA125 expression may further provide a survival or metastatic advantage to ovarian tumor cells. Antisense oligonucleotides derived from the CA125 repeat sequences could be used to down-regulate the expression of CA125. Further, antisense therapy could be used in association with a tumor cell delivery system of the type described above.
Recombinant domains of the CA125 molecule also have the potential to identify small molecules, which bind to individual domains of the CA125 molecule. These small molecules could also be used as delivery agents or as biological modifiers.
In one aspect of the present invention, a CA125 molecule is disclosed comprising: (a) an extracellular amino terminal domain, comprising 5 genomic exons, wherein exon 1 comprises amino acids #1-33 of SEQ ID NO: 299, exon 2 comprises amino acids #34-1593 of SEQ ID NO: 299, exon 3 comprises amino acids #1594-1605 of SEQ ID NO: 299, exon 4 comprises amino acids #1606-1617 of SEQ ID NO: 299, and exon 5 comprises amino acids #1618-1637 of SEQ ID NO: 299; (b) an amino terminal extension, comprising 4 genomic exons, wherein exon 1 comprises amino acids #1-3157 of SEQ ID NO: 310, exon 2 comprises amino acids #3158-3193 of SEQ ID NO: 310, exon 3 comprises amino acids #3194-9277 of SEQ ID NO: 310, and exon 4 comprises amino acids #9278-10,427 of SEQ ID NO: 310; (c) a multiple repeat domain, wherein each repeat unit comprises 5 genomic exons, wherein exon 1 comprises amino acids #1-42 in any of SEQ ID NOS: 164 through 194; exon 2 comprises amino acids #43-65 in any of SEQ ID NOS: 195 through 221; exon 3 comprises amino acids #66-123 in any of SEQ ID NOS: 222 through 249; exon 4 comprises amino acids #124-135 in any of SEQ ID NOS: 250 through 277; and exon 5 comprises amino acids #136-156 in any of SEQ ID NOS: 278 through 298; and (d) a carboxy terminal domain comprising a transmembrane anchor with a short cytoplasmic domain, and further comprising 9 genomic exons, wherein exon 1 comprises amino acids #1-11 of SEQ ID NO: 300; exon 2 comprises amino acids #12-33 of SEQ ID NO: 300; exon 3 comprises amino acids #34-82 of SEQ ID NO: 300; exon 4 comprises amino acids #83-133 of SEQ ID NO: 300; exon 5 comprises amino acids #134-156 of SEQ ID NO: 300; exon 6 comprises amino acids #157-212 of SEQ ID NO: 300; exon 7 comprises amino acids #213-225 of SEQ ID NO: 300; exon 8 comprises amino acids #226-253 of SEQ ID NO: 300; and exon 9 comprises amino acids #254-284 of SEQ ID NO: 300.
In another aspect of the invention, the repeats comprise amino acids selected from the group consisting of SEQ ID NO 11-46, 69-80 and 58-161, wherein the repeats are in any order.
In another aspect of the present invention, the N-glycosylation sites of the amino terminal domain marked (x) in
In another aspect of the present invention, the serine and threonine O-glycosylation pattern for the amino terminal domain is marked (o) in SEQ ID NO: 299 in
In another aspect of the present invention, the N-glycosylation sites of the amino terminal extension marked (x) in Table 26 are encoded at positions #139, #434, #787, #930, #957, #1266, #1375, #1633, #1840, #1877, #1890, #2345, #2375, #2737, #3085, #3178, #3501, #4221, #4499, #4607, #4614, #4625, #5048, #5133, #5322, #5396, #5422, #5691, #5865, #6090, #6734, #6861, #6963, #8031, #8057, #8326, #8620, #8686, #8915, #9204, #9495, #9787, #10,077, and #10,175.
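Candidate N-glycosylation positions of the kind listed above are commonly located by scanning a protein sequence for the consensus sequon Asn-X-Ser/Thr, where X is any residue other than proline. The sketch below illustrates that scan. The use of this rule is an assumption, since the text does not state how the marked sites were identified; `find_n_glyc_sites` is a hypothetical helper, and the example sequence is an invented placeholder, not part of SEQ ID NO: 310.

```python
import re

# Consensus sequon N-X-S/T with X != P, wrapped in a lookahead so that
# overlapping sequons are all reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sites(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons."""
    return [match.start() + 1 for match in _SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Invented placeholder: sequons at N2 (N-G-T) and N10 (N-L-S);
    # N6 is followed by proline and is therefore skipped.
    print(find_n_glyc_sites("MNGTANPSQNLSA"))
```

A sequon match indicates only a potential site; whether a given asparagine is actually glycosylated depends on the expressing cell and the local protein structure.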
In another aspect, the serine and threonine O-glycosylation pattern for the amino terminal extension is marked (o) in Table 26.
In another aspect of the present invention, exon 1 in the repeat domain comprises at least 31 different copies; exon 2 comprises at least 27 different copies; exon 3 comprises at least 28 different copies; exon 4 comprises at least 28 different copies, and exon 5 comprises at least 21 different copies.
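As a quick consistency check on the repeat exon structure recited above, the following sketch tabulates the five repeat-unit exons with the amino acid ranges and SEQ ID NO ranges given in this summary, then computes each exon's length and the number of sequences in each SEQ ID NO range. It assumes all ranges are inclusive; the names `REPEAT_EXONS` and `span` are illustrative only.

```python
# Repeat-unit exons as recited in the summary:
# (exon number, first residue, last residue, first SEQ ID NO, last SEQ ID NO).
# Assumption: every range is inclusive at both ends.
REPEAT_EXONS = [
    (1, 1, 42, 164, 194),
    (2, 43, 65, 195, 221),
    (3, 66, 123, 222, 249),
    (4, 124, 135, 250, 277),
    (5, 136, 156, 278, 298),
]


def span(first: int, last: int) -> int:
    """Number of items in an inclusive integer range."""
    return last - first + 1


exon_lengths = {n: span(a, b) for n, a, b, _, _ in REPEAT_EXONS}
copy_counts = {n: span(s, t) for n, _, _, s, t in REPEAT_EXONS}
repeat_unit_length = sum(exon_lengths.values())

if __name__ == "__main__":
    for n in exon_lengths:
        print(f"exon {n}: {exon_lengths[n]} residues, "
              f"{copy_counts[n]} listed sequences")
    print(f"repeat unit: {repeat_unit_length} residues")
```

Under these assumptions the exon lengths sum to the 156-residue repeat unit, and the SEQ ID NO ranges contain 31, 27, 28, 28, and 21 entries, matching the copy counts stated for exons 1 through 5.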
In another aspect of the present invention, the repeat domain comprises 156 amino acid repeat units which comprise epitope binding sites. The epitope binding sites are located in at least part of the C-enclosure at amino acids #59-79 (marked C-C) in SEQ ID NO: 150 in
In another aspect, the 156 amino acid repeat unit comprises O-glycosylation sites at positions #128, #129, #132, #133, #134, #135, #139, #145, #146, #148, #150, #151, and #156 in SEQ ID NO: 150 in
In another aspect of the invention, the multiple repeat domain is made of repeats selected from SEQ ID NOS 11-46, 69-80 and 58-161, wherein the repeat units are in any order.
In yet another aspect, the transmembrane domain of the carboxy terminal domain is located at positions #230-252 (underlined) in SEQ ID NO: 300 of
In another aspect of the present invention, an isolated nucleic acid of the CA125 gene is disclosed, which comprises a nucleotide sequence selected from the group consisting of: (a) the nucleotide sequences set forth in SEQ ID NOS: 311, 312, 313 and 314; (b) a nucleotide sequence having at least 70% sequence identity to any one of the sequences in (a); (c) a degenerate variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).
In another aspect of the present invention, an isolated nucleic acid of the CA125 gene is disclosed, comprising a sequence that encodes a polypeptide with the amino acid sequence selected from the group consisting of: (a) the amino acid sequences set forth in SEQ ID NO: 315; (b) an amino acid sequence having at least 50% sequence identity to any one of the sequences in (a); (c) a conservative variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).
In yet another aspect, a vector comprising the nucleic acid of the CA125 gene is disclosed. The vector may be a cloning vector, a shuttle vector, or an expression vector. A cultured cell comprising the vector is also disclosed.
In yet another aspect, a method of expressing CA125 antigen in a cell is disclosed, comprising the steps of: (a) providing at least one nucleic acid comprising a nucleotide sequence selected from the group consisting of: (i) the nucleotide sequences set forth in SEQ ID NOS: 49, 67, 81, 83-145, 147, 150, and 152; (ii) a nucleotide sequence having at least 70% sequence identity to any one of the sequences in (i); (iii) a degenerate variant of any one of (i) to (ii); and (iv) a fragment of any one of (i) to (iii); (b) providing cells comprising an mRNA encoding the CA125 antigen; and (c) introducing the nucleic acid into the cells, wherein the CA125 antigen is expressed in the cells.
In yet another aspect, a purified polypeptide of the CA125 gene is disclosed, comprising an amino acid sequence selected from the group consisting of: (a) the amino acid sequences set forth in SEQ ID NOS: 11-48, 50, 68-80, 82, 146, 148, 149, 150, 151, and 153-158; (b) an amino acid sequence having at least 50% sequence identity to any one of the sequences in (a); (c) a conservative variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).
In another aspect, a purified antibody is disclosed that selectively binds to an epitope in the receptor-binding domain of CA125 protein, wherein the epitope is within the amino acid sequence selected from the group consisting of: (a) the amino acid sequences set forth in SEQ ID NOS: 11-48, 50, 68-80, 146, 151, and 153-158; (b) an amino acid sequence having at least 50% sequence identity to any one of the sequences in (a); (c) a conservative variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).
More specifically, this invention relates to a purified antibody that selectively binds to an epitope in the CA125 protein of SEQ ID NO 315. Similarly, the purified antibody selectively binds to an amino acid sequence having at least 50% sequence identity to said sequence; the purified antibody selectively binds to an amino acid sequence having at least 60% sequence identity to said sequence; the purified antibody selectively binds to an amino acid sequence having at least 70% sequence identity to said sequence; the purified antibody selectively binds to an amino acid sequence having at least 80% sequence identity to said sequence; and the purified antibody selectively binds to an amino acid sequence having at least 90% sequence identity to said sequence. Additionally, purified antibody can be a conservative variant of the amino acid sequence set forth in SEQ ID NO 315 or a fragment thereof.
A diagnostic for detecting and monitoring the presence of CA125 antigen is also disclosed, which comprises recombinant CA125 comprising at least one repeat unit of the CA125 repeat domain including epitope binding sites selected from the group consisting of amino acid sequences set forth in SEQ ID NOS: 11-48, 50, 68-80, 82, 146, 150, 151, 153-161, and 162 (amino acids #1,643-11,438).
A therapeutic vaccine to treat mammals with elevated CA125 antigen levels or at risk of developing a disease or disease recurrence associated with elevated CA125 antigen levels is also disclosed. The vaccine comprises recombinant CA125 repeat domains including epitope binding sites, wherein the repeat domains are selected from the group of amino acid sequences consisting of SEQ ID NOS: 11-48, 50, 68-80, 82, 146, 148, 149, 150, 151, 153-161, and 162 (amino acids #1,643-11,438), and amino acids #175-284 of SEQ ID NO: 300. Mammals include animals and humans.
In another aspect of the present invention, an antisense oligonucleotide is disclosed that inhibits the expression of CA125 encoded by: (a) the nucleotide sequences set forth in SEQ ID NOS: 49, 67, 81, 83-145, 147, 150, and 152; (b) a nucleotide sequence having at least 70% sequence identity to any one of the sequences in (a); (c) a degenerate variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).
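An antisense oligonucleotide is, in sequence terms, the reverse complement of a chosen window of the target (sense) strand. The sketch below shows that derivation for a DNA oligonucleotide. It is illustrative only: `reverse_complement` and `antisense_oligo` are hypothetical helpers, the 1-based window convention is an assumption, and the example target is an invented placeholder rather than any of the listed SEQ ID NOS.

```python
# Watson-Crick complements for DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target: str, start: int, length: int) -> str:
    """Antisense DNA oligonucleotide (5' to 3') against the window of
    `target` beginning at 1-based position `start` and spanning `length` bases."""
    if start < 1 or length < 1:
        raise ValueError("start and length must be positive")
    window = target[start - 1:start - 1 + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the target")
    return reverse_complement(window)


if __name__ == "__main__":
    # Invented 8-base placeholder target; window ATGC gives antisense GCAT.
    print(antisense_oligo("ATGCCGTA", start=1, length=4))
```

Practical oligonucleotide design would additionally weigh window accessibility, length, and backbone chemistry, none of which this sketch models.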
The preceding and further aspects of the present invention will be apparent to those of ordinary skill in the art from the following description of the presently preferred embodiments of the invention, such description being merely illustrative of the present invention.
In accordance with the present invention, conventional molecular biology, microbiology, and recombinant DNA techniques may be used that will be apparent to those skilled in the relevant art. Such techniques are explained fully in the literature (see, e.g., Maniatis, Fritsch & Sambrook, “Molecular Cloning: A Laboratory Manual” (1982); “DNA Cloning: A Practical Approach,” Volumes I and II (D. N. Glover ed. 1985); “Oligonucleotide Synthesis” (M. J. Gait ed. 1984); “Nucleic Acid Hybridization” (B. D. Hames & S. J. Higgins eds. (1985)); “Transcription and Translation” (B. D. Hames & S. J. Higgins eds. (1984)); “Animal Cell Culture” (R. I. Freshney, ed. (1986)); “Immobilized Cells And Enzymes” (IRL Press, (1986)); and B. Perbal, “A Practical Guide To Molecular Cloning” (1984)).
Therefore, if appearing herein, the following terms shall have the definitions set out below.
A “vector” is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.
A “DNA molecule” refers to the polymeric form of deoxyribonucleotides (adenine, guanine, thymine, or cytosine) in either single stranded form, or a double-stranded helix. This term refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear DNA molecules (e.g., restriction fragments), viruses, plasmids, and chromosomes.
As used herein, the term “gene” shall mean a region of DNA encoding a polypeptide chain.
“Messenger RNA” or “mRNA” shall mean an RNA molecule that encodes one or more polypeptides.
“DNA polymerase” shall mean an enzyme which catalyzes the polymerization of deoxyribonucleotide triphosphates to make DNA chains using a DNA template.
“Reverse transcriptase” shall mean an enzyme which catalyzes the polymerization of deoxy- or ribonucleotide triphosphates to make DNA or RNA chains using an RNA or DNA template.
“Complementary DNA” or “cDNA” shall mean the DNA molecule synthesized by polymerization of deoxyribonucleotides by an enzyme with reverse transcriptase activity.
An “isolated nucleic acid” is a nucleic acid the structure of which is not identical to that of any naturally occurring nucleic acid or to that of any fragment of a naturally occurring genomic nucleic acid spanning more than three separate genes. The term therefore covers, for example, (a) a DNA which has the sequence of part of a naturally occurring genomic DNA molecule but is not flanked by both of the coding sequences that flank that part of the molecule in the genome of the organism in which it naturally occurs; (b) a nucleic acid incorporated into a vector or into the genomic DNA of a prokaryote or eukaryote in a manner such that the resulting molecule is not identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), or a restriction fragment; and (d) a recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a fusion protein.
“Oligonucleotide”, as used herein in referring to the probes or primers of the present invention, is defined as a molecule comprised of two or more deoxy- or ribonucleotides, preferably more than ten. Its exact size will depend upon many factors which, in turn, depend upon the ultimate function and use of the oligonucleotide.
“DNA fragment” includes polynucleotides and/or oligonucleotides and refers to a plurality of joined nucleotide units formed from naturally-occurring bases and cyclofuranosyl groups joined by native phosphodiester bonds. This term effectively refers to naturally-occurring species or synthetic species formed from naturally-occurring subunits. “DNA fragment” also refers to purine and pyrimidine groups and moieties which function similarly but which have non naturally-occurring portions. Thus, DNA fragments may have altered sugar moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing species. They may also contain altered base units or other modifications, provided that biological activity is retained. DNA fragments may also include species which include at least some modified base forms. Thus, purines and pyrimidines other than those normally found in nature may be so employed. Similarly, modifications on the cyclofuranose portions of the nucleotide subunits may also occur as long as biological function is not eliminated by such modifications.
“Primer” shall refer to an oligonucleotide, whether occurring naturally or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product, which is complementary to a nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing agent such as a DNA polymerase and at a suitable temperature and pH. The primer may be either single-stranded or double-stranded and must be sufficiently long to prime the synthesis of the desired extension product in the presence of the inducing agent. The exact length of the primer will depend upon many factors, including temperature, the source of primer and the method used. For example, for diagnostic applications, depending on the complexity of the target sequence, the oligonucleotide primer typically contains 10-25 or more nucleotides, although it may contain fewer nucleotides.
The primers herein are selected to be “substantially” complementary to different strands of a particular target DNA sequence. This means that the primers must be sufficiently complementary to hybridize with their respective strands. Therefore, the primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5′ end of the primer, with the remainder of the primer sequence being complementary to the strand. Alternatively, non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence to hybridize therewith and thereby form the template for the synthesis of the extension product.
As used herein, the term “hybridization” refers generally to a technique wherein denatured RNA or DNA is combined with complementary nucleic acid sequence which is either free in solution or bound to a solid phase. As recognized by one skilled in the art, complete complementarity between the two nucleic acid sequences is not a pre-requisite for hybridization to occur. The technique is ubiquitous in molecular genetics and its use centers around the identification of particular DNA or RNA sequences within complex mixtures of nucleic acids.
As used herein, “restriction endonucleases” and “restriction enzymes” shall refer to bacterial enzymes which cut double-stranded DNA at or near a specific nucleotide sequence.
“Purified polypeptide” refers to any peptide generated from CA125 either by proteolytic cleavage or chemical cleavage.
“Degenerate variant” refers to any amino acid variation in the repeat sequence, which fulfills the homology exon structure and conserved sequences and is recognized by the M11, OC125 and ISOBM series of antibodies.
“Fragment” refers to any part of the CA125 molecule identified in a purification scheme.
“Conservative variant antibody” shall mean any antibody that fulfills the criteria of M11, OC125 or any of the ISOBM antibody series.
The CA125 gene has been cloned and multiple repeat sequences as well as the carboxy terminus have been identified. The genomic DNA for the CA125 gene is set out in SEQ ID NO 311-313. The CA125 molecule comprises three major domains: an extracellular amino terminal domain (Domain 1); a large multiple repeat domain (Domain 2); and a carboxy terminal domain (Domain 3) which includes a transmembrane anchor with a short cytoplasmic domain. The amino terminal domain is assembled by combining five genomic exons, four very short amino terminal sequences and one extraordinarily large exon. This domain is dominated by its capacity for O-glycosylation and its resultant richness in serine and threonine residues. Additionally, an amino terminal extension is present, which comprises four genomic exons. The amino acid composition of the amino terminal extension was found to be consistent with the amino acid composition of the amino terminal domain. The molecular structure is dominated by a repeat domain comprising 156 amino acid repeat units, which encompass the epitope binding sites. More than 60 repeat units have been identified, sequenced, and contiguously placed in the CA125 domain structure. The repeat units encompass an interactive disulfide bridged C-enclosure and the site of OC125 and M11 binding. The repeat sequences demonstrated 70-85% homology to each other. Expression of the repeats was demonstrated in E. coli. The CA125 molecule is anchored at its carboxy terminal through a transmembrane domain and a short cytoplasmic tail. The carboxy terminal also contains a proteolytic cleavage site approximately 50 amino acids upstream from the transmembrane domain, which allows for proteolytic cleavage and release of the CA125 molecule. Any one of the repeat domains has the potential for use as a new gold standard for detecting and monitoring the presence of the CA125 antigen. 
Further, the repeat domains or other domains, especially the domain C-terminal to the repeat domain, also provide a basis for the development of a vaccine, which would be useful for the treatment of ovarian cancer and other carcinomas where CA125 is elevated.
The DNA sequences of the present invention can also be characterized as encoding the amino acid sequence equivalents of the amino acid sequence. Equivalents, as used in this context, include peptides of substantially similar length and amino acid identity to those disclosed, but having a conservative amino acid substitution at a non-critical residue position. A conservative amino acid substitution is a substitution in which an amino acid residue is replaced with an amino acid residue of differing identity, but whose R group can be characterized as chemically similar. Four common categories include: polar but uncharged R groups; positively charged R groups; negatively charged R groups; and hydrophobic R groups. A preferred conservative substitution involves the substitution of a second hydrophobic residue for a first hydrophobic residue, the first and second hydrophobic residues differing primarily in the size of the R group. The hydrophobic residue would be predicted to be located internally in the folded peptide structure, and the mild perturbation caused only by a change in the size of an R group at an internally located residue would not alter the antigenicity of the protein.
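The four R-group categories named above can be expressed as a small lookup that tests whether a given substitution stays within one category. The sketch below is illustrative only. The assignment of individual residues to categories follows one common textbook grouping and is an assumption, since the text names the categories but not their members; `R_GROUP_CLASSES` and `is_conservative` are hypothetical names.

```python
# One common textbook assignment of the 20 standard amino acids
# (one-letter codes) to the four categories named in the text.
# Assumption: the text does not list category members.
R_GROUP_CLASSES = {
    "polar_uncharged": set("STCNQY"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "hydrophobic": set("AVLIMFWPG"),
}


def r_group_class(residue: str) -> str:
    """Category name for a one-letter amino acid code."""
    for name, members in R_GROUP_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when the residues differ in identity but share a category."""
    if original.upper() == replacement.upper():
        return False  # same residue, so not a substitution at all
    return r_group_class(original) == r_group_class(replacement)


if __name__ == "__main__":
    print(is_conservative("L", "V"))  # hydrophobic for hydrophobic
    print(is_conservative("D", "K"))  # negative for positive
```

The leucine-to-valine case corresponds to the preferred substitution described above, in which two hydrophobic residues differ mainly in side-chain size.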
The isolated cDNA sequences (Table 30 and SEQ ID NO 314) of the present invention can be inserted into an expression vector. Such vectors contain all necessary regulatory signals to promote the expression of a DNA sequence of interest. Expression vectors are typically either prokaryote or eukaryote specific. Expression vectors can be introduced into either prokaryote or eukaryote cells to produce CA125 proteins or portions thereof. This cDNA sequence was expressed to provide the CA125 molecule set out in Table 31 and SEQ ID NO 315.
Materials and Methods
A. Tissue Collection, RNA Isolation and cDNA Synthesis
Both normal and ovarian tumor tissues were utilized for cDNA preparation. Tissues were routinely collected and stored at −80° C. according to a tissue collection protocol.
Total RNA isolation was performed according to the manufacturer's instructions using the TriZol Reagent purchased from GibcoBRL (Catalog #15596-018). In some instances, mRNA was isolated using oligo dT affinity chromatography. The amount of RNA recovered was quantitated by UV spectrophotometry. First strand complementary DNA (cDNA) was synthesized using 5.0 μg of RNA and random hexamer primers according to the manufacturer's protocol utilizing a first strand synthesis kit obtained from Clontech (Catalog #K1402-1). The purity of the cDNA was evaluated by PCR using primers specific for the β-tubulin gene. These primers span an intron such that the PCR products generated from pure cDNA can be distinguished from cDNA contaminated with genomic DNA.
B. Identification and Ordering of CA125 Repeat Units
It has been demonstrated that the 2-5 million dalton CA125 glycoprotein (with repeat domains) can be chemically segmented into glycopeptide fragments using cyanogen bromide. As shown in
To convert CA125 into a consistent glycopeptide, the CA125 parent molecule was processed by cyanogen bromide digestion. This cleavage process resulted in two main fractions on Coomassie blue staining following polyacrylamide gel electrophoresis. An approximately 60 kDa band and a more dominant 40 kDa band were identified as shown in
The 40 kDa and 60 kDa bands were excised from PVDF blots and submitted to amino terminal and internal peptide amino acid sequencing as described and practiced by Harvard Sequencing, (Harvard Microchemistry Facility and The Biological Laboratories, 16 Divinity Avenue, Cambridge, Mass. 02138). Sequencing was successful only for the 40 kDa band where both amino terminal sequences and some internal sequences were obtained as shown in Table 1 at SEQ ID NOS: 1-4. The 40 kDa fragment of the CA125 protein was found to have homology to two translated EST sequences (GenBank Accession Nos. BE005912 and AA640762). Visual examination of these translated sequences revealed similar amino acid regions, indicating a possible repetitive domain. The nucleotide and amino acid sequences for EST Genbank Accession No. BE005912 (corresponding to SEQ ID NO: 5 and SEQ ID NO: 6, respectively) are illustrated in Table 1. Common sequences are boxed or underlined.
In an attempt to identify other individual members of this proposed repeat family, two oligonucleotide primers were synthesized based upon regions of homology in these EST sequences. Shown in Table 2A, the primer sequences correspond to SEQ ID NOS: 7 and 8 (sense primers) and SEQ ID NOS: 9 and 10 (antisense primers). Repeat sequences were amplified in accordance with the methods disclosed in the following references: Shigemasa K et al., p21: A monitor of p53 dysfunction in ovarian neoplasia, Int. J. Gynecol. Cancer 7:296-303 (1997) and Shigemasa K et al., p16 Overexpression: A potential early indicator of transformation in ovarian carcinoma, J. Soc. Gynecol. Invest. 4:95-102 (1997). Ovarian tumor cDNA obtained from a tumor cDNA bank was used.
Amplification was accomplished in a Thermal Cycler (Perkin-Elmer Cetus). The reaction mixture consisted of 1 U Taq DNA Polymerase in storage buffer A (Promega), 1× Thermophilic DNA Polymerase 10×Mg free buffer (Promega), 300 mM dNTPs, 2.5 mM MgCl2, and 0.25 mM each of the sense and antisense primers for the target gene. A 20 μl reaction included 1 μl of cDNA synthesized from 50 ng of serous tumor mRNA as the template. PCR reactions required an initial denaturation step at 94° C./1.5 min. followed by 35 cycles of 94° C./0.5 min., 48° C./0.5 min., 72° C./0.5 min. with a final extension at 72° C./7 min. Three bands were initially identified (approximately 400 bp, 800 bp, and 1200 bp) and isolated. After size analysis by agarose gel electrophoresis, these bands as well as any other products of interest were then ligated into a T-vector plasmid (Promega) and transformed into competent E. coli cells of the DH5α strain. After growth on selective media, individual colonies were cultured overnight at 37° C., and plasmid DNA was extracted using the QIAprep Spin Miniprep kit (Qiagen). Positive clones were identified by restriction digests using Apa I and Sac I. Inserts were sequenced using an ABI automatic sequencer, Model 377, T7 primers, and a Big Dye Terminator Cycle Sequencing Kit (Applied Biosystems).
Obtained sequences were analyzed using the Pileup program of the Wisconsin Genetics Computer Group (GCG). Repeat units were ordered using primers designed against two highly conserved regions within the nucleotide sequence of these identified repeat units. Shown in Table 2B, the sense and antisense primers (5′-GTCTCTATGTCAATGGTTTCACCC-3′/5′-TAGCTGCTCTCTGTCCAGTCC-3′, SEQ ID NOS: 301 and 302, respectively) faced away from one another within any one repeat, creating an overlap sequence and thus enabling amplification across the junction of any two repeat units. PCR reactions, cloning, sequencing, and analysis were performed as described above.
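The logic of the outward-facing primer pair can be illustrated with a small in-silico sketch. It uses the two primer sequences given above, but the flanking filler bases, the toy repeat units, and the function names are hypothetical and chosen only to make the geometry visible; real repeat units are far longer and differ in sequence.

```python
SENSE = "GTCTCTATGTCAATGGTTTCACCC"    # sense primer from the text
ANTISENSE = "TAGCTGCTCTCTGTCCAGTCC"   # antisense primer from the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template, sense, antisense):
    """Return the top-strand product bounded by the two primers, or None.

    The sense primer must match the template, and the antisense primer's
    binding site (its reverse complement) must lie downstream of it.
    """
    start = template.find(sense)
    if start < 0:
        return None
    site = reverse_complement(antisense)
    end = template.find(site, start + len(sense))
    if end < 0:
        return None
    return template[start:end + len(site)]


def toy_repeat(tail):
    """Hypothetical repeat unit: the antisense site lies UPSTREAM of the
    sense site, so the primers face away from each other within a unit."""
    return "TTTT" + reverse_complement(ANTISENSE) + "AAAA" + SENSE + tail


unit_a = toy_repeat("CCCCC")
unit_b = toy_repeat("GGGGG")

# Within a single unit there is no downstream antisense site: no product.
single_unit_product = amplicon(unit_a, SENSE, ANTISENSE)

# Across two tandem units the product spans the junction between them.
junction_product = amplicon(unit_a + unit_b, SENSE, ANTISENSE)

if __name__ == "__main__":
    print(single_unit_product)
    print(junction_product)
```

In this toy model the product starts in the 3′ portion of the upstream unit and ends in the 5′ portion of the downstream unit, which is why sequencing such products reveals which repeat units are adjacent.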
C. Identification and Assembly of the CA125 Amino Terminal Domain
In search of open reading frames containing sequences in addition to CA125 repeat units, database searches were performed using the BLAST program available at the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). Using a repeat unit as the query sequence, cosmid AC008734 was identified as having multiple repeat sequences throughout the unordered (35) contiguous pieces of DNA, also known as contigs. One of these contigs, #32, was found to have exons 1 and 2 of a repeat region at its 3′ end. Contig#32 was also found to contain a large open reading frame (ORF) upstream of the repeat sequence. PCR was again used to verify the existence of this ORF and confirm its connection to the repeat sequence. The specific primers recognized the 3′ end of this ORF (5′-CAGCAGAGACCAGCACGAGTACTC-3′) (SEQ ID NO: 51) and sequence within the repeat (5′-TCCACTGCCATGGCTGAGCT-3′) (SEQ ID NO: 52). The remainder of the amino-terminal domain was assembled from this contig in a similar manner. With each PCR confirmation, a new primer (see Table 10A) was designed against the assembled sequence and used in combination with a primer designed against another upstream potential ORF (Set 1: 5′-CCAGCACAGCTCTTCCCAGGAC-3′/5′-GGAATGGCTGAGCTGACGTCTG-3′(SEQ ID NO: 53 and SEQ ID NO: 54); Set 2: 5′-CTTCCCAGGACAACCTCAAGG-3′/5′-GCAGGATGAGTGAGCCACGTG-3′(SEQ ID NO: 55 and SEQ ID NO: 56); Set 3: 5′-GTCAGATCTGGTGACCTCACTG-3′/5′-GAGGCACTGGAAAGCCCAGAG-3′) (SEQ ID NO: 57 and SEQ ID NO: 58). Potential adjoining sequence (contig #7 containing EST AU133673) was also identified using contig #32 sequence as query sequence in database searches. Confirmation primers were designed and used in a typical manner (5′-CTGATGGCATTATGGAACACATCAC-3′/5′-CCCAGAACGAGAGACCAGTGAG-3′) (SEQ ID NO: 59 and SEQ ID NO: 60).
In order to identify the 5′ end of the CA125 sequence, 5′ Rapid Amplification of cDNA Ends (FirstChoice™ RLM-RACE Kit, Ambion) was performed using tumor cDNA. The primary PCR reaction used a sense primer supplied by Ambion (5′-GCTGATGGCGATGAATGAACACTG-3′) (SEQ ID NO: 61) and an anti-sense primer specific to confirmed contig #32 sequence (5′-CCCAGAACGAGAGACCAGTGAG-3′) (SEQ ID NO: 62). The secondary PCR was then performed using nested primers, sense from Ambion (5′-CGCGGATCCGAACACTGCGTTTGCTGGCTTTGATG-3′) (SEQ ID NO: 63) and the anti-sense was specific to confirmed contig #7 sequence (5′-CCTCTGTGTGCTGCTTCATTGGG-3′) (SEQ ID NO: 64). The RACE PCR product (a band of approximately 300 bp) was cloned and sequenced as previously described.
D. Identification and Assembly of the CA125 Carboxy Terminal Domain
Database searches using confirmed repeat units as query also identified a cDNA sequence (GenBank AK024365) containing other repeat units, but also a potential carboxy terminal sequence. The contiguous nature of this sequence with assembled CA125 was confirmed using PCR (5′-GGACAAGGTCACCACACTCTAC-3′/5′-GCAGATCCTCCAGGTCTAGGTGTG-3′), (SEQ ID NO: 303 and SEQ ID NO: 304, respectively) as well as contig and EST analysis.
E. Expression of 6×His-Tagged CA125 Repeat in E. coli
The open reading frame of a CA125 repeat shown in Table 11 was amplified by PCR with the sense primer (5′-ACCGGATCCATGGGCCACACAGAGCCTGGCCC-3′) (SEQ ID NO: 65) and the antisense primer (5′-TGTAAGCTTAGGCAGGGAGGATGGAGTCC-3′) (SEQ ID NO: 66). PCR was performed in a reaction mixture consisting of ovarian tumor cDNA derived from 50 ng of mRNA, 5 pmol each of sense and antisense primers for the CA125 repeat, 0.2 mmol of dNTPs, and 0.625 U of Taq polymerase in 1× buffer in a final volume of 25 ml. This mixture was subjected to 1 minute of denaturation at 95° C. followed by 30 cycles of PCR consisting of the following: denaturation for 30 seconds at 95° C., 30 seconds of annealing at 62° C., and 1 minute of extension at 72° C. with an additional 7 minutes of extension on the last cycle. The product was electrophoresed through a 2% agarose gel for separation. The PCR product was purified and digested with the restriction enzymes Bam HI and Hind III. This digested PCR product was then ligated into the expression vector pQE-30, which had also been digested with Bam HI and Hind III. This clone allows for expression of recombinant 6×His-tagged CA125 repeat. Transformed E. coli (JM109) were grown to an OD600 of 1.5-2.0 at 37° C. and then induced with IPTG (0.1 mM) for 4-6 hours at 25° C. to produce recombinant protein. Whole E. coli lysate was electrophoresed through a 12% SDS polyacrylamide gel and Coomassie stained to detect highly expressed proteins.
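The cloning strategy relies on each primer carrying a restriction site near its 5′ end. As a brief illustrative check, the sketch below locates the Bam HI (GGATCC) and Hind III (AAGCTT) recognition sequences in the two primers given above. The helper name `site_positions` is hypothetical, and positions are 0-based offsets from the 5′ end of each primer.

```python
BAMHI_SITE = "GGATCC"    # Bam HI recognition sequence
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence

SENSE_PRIMER = "ACCGGATCCATGGGCCACACAGAGCCTGGCCC"   # SEQ ID NO: 65
ANTISENSE_PRIMER = "TGTAAGCTTAGGCAGGGAGGATGGAGTCC"  # SEQ ID NO: 66


def site_positions(seq, site):
    """Return every 0-based offset at which `site` occurs in `seq`."""
    positions = []
    index = seq.find(site)
    while index != -1:
        positions.append(index)
        index = seq.find(site, index + 1)
    return positions


if __name__ == "__main__":
    # Each primer carries exactly one of the two sites, after a short
    # 5' clamp of three bases.
    print("Bam HI in sense primer:", site_positions(SENSE_PRIMER, BAMHI_SITE))
    print("Hind III in antisense primer:",
          site_positions(ANTISENSE_PRIMER, HINDIII_SITE))
    # The sense primer places an ATG codon directly after the Bam HI site.
    print("Codon after Bam HI site:", SENSE_PRIMER[9:12])
```

Because the two sites differ, the digested product can ligate into the Bam HI/Hind III-digested vector in only one orientation.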
F. Western Blot Analysis
Proteins were separated on a 12% SDS-PAGE gel and electroblotted at 100V for 40 minutes at 4° C. to nitrocellulose membrane. Blots were blocked overnight in phosphate-buffered saline (PBS) pH 7.3 containing 5% non-fat milk. CA125 antibodies M11, OC125, or ISOBM 9.2 were incubated with the membrane at a dilution of 5 μg/ml in 5% milk/PBS-T (PBS plus 0.1% TX-100) and incubated for 2 hours at room temperature. The blot was washed for 30 minutes with several changes of PBS and incubated with a 1:10,000 dilution of horseradish peroxidase (HRP) conjugated goat anti-mouse IgG antibody (Bio-Rad) for 1 hour at room temperature. Blots were washed for 30 minutes with several changes of PBS and incubated with a chemiluminescent substrate (ECL from Amersham Pharmacia Biotech) before a 10-second exposure to X-ray film for visualization.
G. Northern Blot Analysis
Total RNA samples (approximately 10 μg) were separated by electrophoresis through a 6.3% formaldehyde, 1.2% agarose gel in 0.02 M MOPS, 0.05 M sodium acetate (pH 7.0), and 0.001 M EDTA. The RNAs were then blotted to Hybond-N (Amersham) by capillary action in 20×SSPE and fixed to the membrane by baking for 2 hours at 80° C. A PCR product representing one 400 bp repeat of the CA125 molecule was radiolabelled using the Prime-a-Gene Labeling System available from Promega (cat. #U1100). The blot was probed and stripped according to the ExpressHyb Hybridization Solution protocol available from Clontech (Catalog #8015-1).
In 1997, a system was described by a co-inventor of the present invention and others for purification of CA125 (primarily from patient ascites fluid), which when followed by cyanogen bromide digestion, resulted in peptide fragments of CA125 of 60 kDa and 40 kDa [O'Brien T J et al., More than 15 years of CA125: What is known about the antigen, its structure and its function, Int J Biological Markers 13(4)188-195 (1998)]. Both fragments were identifiable by Coomassie blue staining on polyacrylamide gels and by Western blot. Both fragments were shown to bind both OC125 and M11 antibodies, indicating both major classes of epitopes were preserved in the released peptides (
Protein sequencing of the 40 kDa band yielded both amino terminal sequences and some internal sequences generated by protease digestion (Table 1—SEQ ID NOS: 1-4). Insufficient yields of the 60 kDa band resulted in unreliable sequence information. Unfortunately, efforts to amplify PCR products utilizing redundant primers designed to these sequences were not successful. In mid 2000, an EST (#BE005912) was entered into the GCG database, which contained homology to the 40 kDa band sequence as shown in Table 1 (SEQ ID NOS: 5 and 6). The translation of this EST indicated good homology to the amino terminal sequence of the 40 kDa repeat (e.g. residues 2-12 of SEQ ID NO:6) with only one amino acid difference (i.e. an asparagine is present instead of phenylalanine in the EST sequence). Also, some of the internal sequences are partially conserved (e.g. SEQ ID NO: 2 and to a lesser extent, SEQ ID NO: 3 and SEQ ID NO: 4). More importantly, all the internal sequences are preceded by a basic amino acid (Table 1, indicated by arrows) appropriate for proteolysis by the trypsin used to create the internal peptides from the 40 kDa cyanogen bromide repeat. Utilizing the combined sequences, those obtained by amino acid sequencing and those identified in the EST (#BE005912) and a second EST (#AA640762) identified in the database, sense primers were created as follows: 5′-GGA GAG GGT TCT GCA GGG TC-3′ (SEQ ID NO: 7) representing amino acids ERVLQG (SEQ ID NO: 8) and anti-sense primer, 5′ GTG AAT GGT ATC AGG AGA GG-3′ (SEQ ID NO: 9) representing PLLIPF (SEQ ID NO: 10). Using PCR, the presence of transcripts was confirmed representing these sequences in ovarian tumors and their absence in normal ovary and either very low levels or no detectable levels in a mucinous tumor (
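The stated correspondence between the two primers and the peptides ERVLQG and PLLIPF can be checked by translating the primer sequences with the standard genetic code. The sketch below is illustrative only; the function names are hypothetical, and the reading frames used (offset 1 for the sense primer, offset 0 for the reverse complement of the antisense primer) are the ones that reproduce the peptides named in the text.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq, frame=0):
    """Translate complete codons of `seq`, starting at offset `frame`."""
    seq = seq[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(0, len(seq) - 2, 3))


SENSE_PRIMER = "GGAGAGGGTTCTGCAGGGTC"      # SEQ ID NO: 7
ANTISENSE_PRIMER = "GTGAATGGTATCAGGAGAGG"  # SEQ ID NO: 9

sense_peptide = translate(SENSE_PRIMER, frame=1)
antisense_peptide = translate(reverse_complement(ANTISENSE_PRIMER), frame=0)

if __name__ == "__main__":
    print(sense_peptide)      # ERVLQG
    print(antisense_peptide)  # PLLIPF
```

The antisense primer is translated from its reverse complement because it anneals to the coding strand, so the peptide it represents is read from the opposite strand.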
After cloning and sequencing of the amplified 400 base pair PCR products, a series of sequences were identified, which had high homology to each other but which were clearly distinct repeat entities (
Examples of each category of repeats were sequenced, and the results are shown in Tables 3, 4, and 5. The sequences represent amplification and sequence data of PCR products obtained using oligonucleotide primers derived from an EST (Genbank Accession No. BE005912). Table 3 illustrates the amino acid sequence for a 400 bp repeat in the CA125 molecule, which is identified as SEQ ID NO: 11 through SEQ ID NO: 21. Table 4 illustrates the amino acid sequence for an 800 bp repeat in the CA125 molecule, which corresponds to SEQ ID NO: 22 through SEQ ID NO: 35. Table 5 illustrates the amino acid sequence for a 1200 bp repeat in the CA125 molecule, which is identified as SEQ ID NO: 36 through SEQ ID NO: 46. Assembly of these repeat sequences (which showed 75-80% homology to each other as determined by GCG Software (GCG=Genetics Computer Group) using the Pileup application) utilizing PCR amplification and sequencing of overlapping sequences allowed for the construction of a 9 repeat structure. The amino acid sequence for the 9 repeat structure is shown in Table 6 as SEQ ID NO: 47. The individual C-enclosures are highlighted in the table.
Using the assembled repeat sequence in Table 6 to search GenBank databases, a cDNA sequence referred to as Genbank Accession No. AK024365 (entered on Sep. 29, 2000) was discovered. Table 7 shows the amino acid sequence for AK024365, which corresponds to SEQ ID NO: 48. AK024365 was found to overlap with two repeats of the assembled repeat sequence shown in Table 6. Individual C-enclosures are highlighted in Table 7.
The cDNA for AK024365 allowed alignment of four additional repeats as well as a downstream carboxy terminus sequence of the CA125 gene. Table 8 illustrates the complete DNA sequence of 13 repeats contiguous with the carboxy terminus of the CA125 molecule, which corresponds to SEQ ID NO: 49. Table 9 illustrates the complete amino acid sequence of the 13 repeats and the carboxy terminus of the CA125 molecule, which corresponds to SEQ ID NO: 50. The carboxy terminus domain was further confirmed by the existence of two ESTs (Genbank Accession Nos. AW150602 and AI923224) in the GenBank database, both of which confirmed the stop codon indicated (TGA) as well as the poly A signal sequence (AATAA) and the poly A tail (see Table 9). The presence of these repeats has been confirmed in serous ovarian tumors and their absence in normal ovarian tissue and mucinous tumors as expected (see
To date, 45 repeat sequences have been identified with high homology to each other. To order these repeat units, overlapping sequences were amplified using a sense primer (5′ GTC TCT ATG TCA ATG GTT TCA CCC-3′) (SEQ ID NO: 305) from an upstream repeat and an antisense primer from a downstream repeat sequence (antisense 5′ TAG CTG CTC TCT GTC CAG TCC-3′) (SEQ ID NO: 306). Attempts have been made to place these repeats in a contiguous fashion as shown in
Final confirmation of the relationship of the putative CA125 repeat domain to the known CA125 molecule was achieved by expressing a recombinant repeat domain in E. coli. In
To further characterize the epitope location of the CA125 antibodies, recombinant CA125 repeat was digested with the endoprotease Lys-C and separately with the protease Asp-N. In both cases, epitope recognition was destroyed. As shown in
To determine transcript size of the CA125 molecule, Northern blot analysis was performed on mRNA extracts from both normal and tumor tissues. In agreement with the notion that CA125 may be represented by an unusually large transcript due to its known mega dalton size in tumor sera, ascites fluid, and peritoneal fluid [Nustad K et al., CA125-epitopes and molecular size, Int. J of Biolog. Markers, 13(4)196-199 (1998)], a transcript was discovered which barely entered the gel from the holding well (
Evidence demonstrates that the repeat domain of the CA125 molecule encompasses a minimum of 45 different 156 amino acid repeat units and possibly greater than 60 repeats, as individual repeats occur more than once in the sequence. This finding may well account for the extraordinary size of the observed transcript. The amino acid composition of the repeat units (
Also noteworthy is a totally conserved methionine at position 24 of the repeat (
Based on homology of the repeat sequences to chromosome 19q 13.2 (cosmid #AC008734) and confirmed by genomic amplification, it has been established that each repeat is comprised of 5 exons (covering approximately 1900 bases of genomic DNA): exon 1 comprises 42-amino acids (#1-42); exon 2 comprises 23 amino acids (#43-65); exon 3 comprises 58 amino acids (#66-123); exon 4 comprises 12 amino acids (#124-135); and exon 5 comprises 21 amino acids (#136-156) (see
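The exon boundaries listed above can be checked arithmetically: the five exons should tile the 156-residue repeat unit without gaps or overlaps. The following minimal sketch encodes the stated residue ranges; the dictionary and function names are hypothetical.

```python
# Residue ranges (1-based, inclusive) of the five exons of one repeat unit,
# as stated in the text.
REPEAT_EXONS = {
    1: (1, 42),
    2: (43, 65),
    3: (66, 123),
    4: (124, 135),
    5: (136, 156),
}


def exon_lengths(exons=REPEAT_EXONS):
    """Return the number of amino acids encoded by each exon."""
    return {n: end - start + 1 for n, (start, end) in exons.items()}


def exon_for_residue(position, exons=REPEAT_EXONS):
    """Return the exon number that encodes a given repeat-unit residue."""
    for n, (start, end) in exons.items():
        if start <= position <= end:
            return n
    raise ValueError(f"position {position} is outside the 156-residue repeat")


if __name__ == "__main__":
    lengths = exon_lengths()
    print(lengths)                # per-exon residue counts
    print(sum(lengths.values()))  # total residues in one repeat unit
    print(exon_for_residue(24))   # exon holding the conserved methionine
```

With these ranges the exon sizes are 42, 23, 58, 12, and 21 residues, which sum to 156, and the conserved methionine at position 24 falls in exon 1.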
Currently, the repetitive units of the repeat domain of the CA125 molecule constitute the majority of its extracellular molecular structure. These sequences have been presented in a tandem fashion based on overlap sequencing data. Some sequences may be incorrectly placed and some repeat units may not as yet be identified (Table 21). More recently, an additional repeat was identified in CA125 as shown in Tables 22 and 23 (SEQ. ID NOS: 307 and 308). The exact position has not yet been identified. Also, there is a potential that alternate splicing and/or mutation could account for some of the repeat variants that are listed. Studies are being conducted to compare both normal tissue derived CA125 repeats to individual tumor derived CA125 repeats to determine if such variation is present. Currently, the known exon configurations would easily accommodate the greater than 60 repeat units as projected. It is, therefore, unlikely that alternate splicing is a major contributor to the repetitive sequences in CA125. It should also be noted that the genomic database for chromosome 19q 13.2 only includes about 10 repeat units, thus indicating a discrepancy between the data of the present invention (more than 60 repeats) and the genomic database. A recent evaluation of the methods used for selection and assembly for genomic sequence [Marshall E, DNA Sequencing: Genome teams adjust to shotgun marriage, Science 292:1982-1983 (2001)] reports that “more research is needed on repeat blocks of almost identical DNA sequence which are more common in the human genome. Existing assembly programs can't handle them well and often delete them.” The CA125 repeat units located on chromosome 19 may well be victims of deletion in the genomic database, thus accounting for most CA125 repeat units absent from the current databases.
A. Sequence Confirmation and Assembly of the Amino Terminal Domain (Domain 1) of the CA125 Molecule
As previously mentioned, homology for repeat sequences was found in the chromosome 19 cosmid AC008734 of the GCG database. This cosmid at the time consisted of 35 unordered contigs. After searching the cosmid for repeat sequences, contig #32 was found to have exons 1 and 2 of a repeat unit at its 3′ end. Contig #32 also had a large open reading frame upstream from the two repeat units, which suggested that this contig contained sequences consistent with the amino terminal end of the CA125 molecule. A sense primer was synthesized to the upstream non-repeat part of contig #32 coupled with a specific primer from within the repeat region (see Methods). PCR amplification of ovarian tumor cDNA confirmed the contiguous positioning of these two domains.
The PCR reaction yielded a band of approximately 980 bp. The band was sequenced and found to connect the upstream open reading frame to the repeat region of CA125. From these data, more primer sets (see Methods) were synthesized and used in PCR reactions to piece together the entire open reading frame contained in contig #32. To find the 5′ most end of the sequence, an EST (AU133673) was discovered, which linked contig #32 to contig #7 of the same cosmid. Specific primers were synthesized, (5′-CTGATGGCATTATGGAACACATCAC-3′ (SEQ ID NO: 59) and 5′-CCCAGAACGAGAGACCAGTGAG-3′ (SEQ ID NO: 60)), to the EST and contig #32. A PCR reaction was performed to confirm that part of the EST sequence was in fact contiguous with contig #32. Confirmation of this contiguous 5′ prime sequencing strategy using overlapping sequences allowed the assembly of the 5′ region (Domain 1) (
The amino terminal domain comprises five genomic exons covering approximately 13,250 bp. Exon 1, a small exon, (amino acids #1-33) is derived from contig #7 (
Potential N-glycosylation sites marked (x) are encoded at positions #81, #271, #320, #624, #795, #834, #938, and #1,165 (see
With additional research, an extension of the glycosylated amino terminal sequence was identified and cloned. Table 24 (SEQ ID NO: 309) illustrates the DNA sequence of the CA125 amino terminal extension. Table 25 (SEQ ID NO: 310) illustrates the protein sequence for the amino terminal extension of the CA125 gene. It should be noted that the last four amino acids, TDGI, in SEQ ID NO: 310 belong to exon 1 of the amino terminal domain. Table 26 illustrates the serine/threonine O-glycosylation pattern for the CA125 amino terminal extension.
B. Sequence Confirmation and Assembly of the CA125 Carboxy Terminal End (Domain 3)
A search of Genbank using the repeat sequences described above uncovered a cDNA sequence referred to as Genbank accession number AK024365. This sequence was found to have 2 repeat sequences, which overlapped 2 known repeat sequences of a series of 6 repeats. As a result, the cDNA allowed the alignment of all six carboxy terminal repeats along with a unique carboxy terminal sequence. The carboxy terminus was further confirmed by the existence of two other ESTs (Genbank accession numbers AW150602 and AI923224), both of which confirmed a stop codon as well as a poly-A signal sequence and a poly-A tail (see GCG database #AF414442). The sequence of the carboxy terminal domain was confirmed using primers designed to sequence just downstream of the repeat domain (sense primer 5′ GGA CAA GGT CAC CAC ACT CTA C-3′) (SEQ ID NO: 303) and an antisense primer (5′-GCA GAT CCT CCA GGT CTA GGT GTG-3′) (SEQ ID NO: 304) designed to the carboxy terminus (
The carboxy terminal domain covers more than 14,000 genomic bp. By ligation, this domain comprises nine exons as shown in
Assembly of the CA125 molecule as validated by PCR amplification of overlap sequence provides a picture of the whole molecule (see
The CA125 molecule comprises three major domains: an extracellular amino terminal domain (Domain 1), a large multiple repeat domain (Domain 2), and a carboxy terminal domain (Domain 3), which includes a transmembrane anchor with a short cytoplasmic domain (
Efforts to purify CA125 over the years were obviously complicated by the presence of this amino terminal domain, which is unlikely to have any epitope sites recognized by the OC125 or M11 class antibodies. As the CA125 molecule is degraded in vivo, it is likely that this highly glycosylated amino terminal end will be found associated with varying numbers of repeat units. This could very well account for both the charge and size heterogeneity of the CA125 molecule so often identified from serum and ascites fluid. Also of note are two T-TALK sequences at amino acids #45-58 (underlined in
The extracellular repeat domain, which characterizes the CA125 molecule, also represents a major portion of the molecular structure. It is downstream from the amino terminal domain and presents itself in a much different manner to its extracellular matrix neighbors. These repeats are characterized by many features including a highly-conserved nature (
Initial evidence suggests that this area is a potential site for antibody binding and also for ligand binding. The highly conserved methionine and several highly conserved sequences within the repeat domain also suggests a functional capacity for these repeat units. The extensive glycosylation of exons 4 & 5 of the repeat unit and the N-glycosylation potential in exon 1 and the 5′ end of exon 2 might further point to a functional capacity for the latter part of exon 2 and exon 3 which includes the C-enclosure (see
The carboxy terminal domain of the CA125 molecule comprises an extracellular domain, which does not have any homology to other known domains. It encodes a typical transmembrane domain and a short cytoplasmic tail. It also contains a proteolytic cleavage site approximately 50 amino acids upstream from the transmembrane domain. This would allow for proteolytic cleavage and release of the CA125 molecule (
These features of the CA125 molecule suggest a signal transduction pathway involvement in the biological function of CA125 [Fendrick J L et al., CA125 phosphorylation is associated with its secretion from the WISH human amnion cell line, Tumor Biology 18:278-289 (1997); and Konish I et al., Epidermal growth factor enhances secretion of the ovarian tumor-associated cancer antigen CA125 from the human amnion WISH cell line, J Soc. Gynecol. Invest. 1:89-96 (1994)]. It also reinforces the prediction of phosphorylation prior to CA125 release from the membrane surface as previously proposed [Fendrick J L et al., CA125 phosphorylation is associated with its secretion from the WISH human amnion cell line, Tumor Biology 18:278-289 (1997); and Konish I et al., Epidermal growth factor enhances secretion of the ovarian tumor-associated cancer antigen CA125 from the human amnion WISH cell line, J Soc. Gynecol. Invest. 1:89-96 (1994)]. Furthermore, a putative proteolytic cleavage site on the extra-cellular side of the transmembrane domain is present at position #176-181.
How well does the CA125 structure described in the present invention compare to the previously known CA125 structure? O'Brien et al. reported that a number of questions needed to be addressed: 1) the multivalent nature of the molecule; 2) the heterogeneity of CA125; 3) the carbohydrate composition; 4) the secretory or membrane bound nature of the CA125 molecule; 5) the function of the CA125 molecule; and 6) the elusive CA125 gene [More than 15 years of CA125: What is known about the antigen, its structure and its function, Int J Biological Markers 13(4)188-195 (1998)]. Several of these questions have been addressed in the present invention including, of course, the gene and its protein core product. Perhaps most interesting is the question of whether an individual large transcript accounts for the whole CA125 molecule, or whether a number of smaller transcripts represent subunits that specifically associate to produce the CA125 molecule. From the results produced by way of the present invention, it is now apparent that the transcript of CA125 is large, similar to some of the mucin gene transcripts, e.g., MUC 5B [see Verma M et al., Mucin genes: Structure, expression and regulation, Glycoconjugate J. 11:172-179 (1994); and Gendler S J et al., Epithelial mucin genes, Annu. Rev. Physiol. 57:607-634 (1995)]. The protein core extracellular domains all have a high capacity for O-glycosylation and, therefore, probably account for the heterogeneity of charge and size encountered in the isolation of CA125. The data also confirm the O-glycosylation inhibition data, indicating CA125 to be rich in O-glycosylation [Lloyd K O et al., Synthesis and secretion of the ovarian cancer antigen CA125 by the human cancer cell line NIH: OVCAR-3, Tumor Biology 22, 77-82 (2001); Lloyd K O et al., Isolation and characterization of ovarian cancer antigen CA125 using a new monoclonal antibody (VK-8): Identification as a mucin-type molecule, Int. J. Cancer, 71:842-850 (1997); and Fendrick J L et al., Characterization of CA125 synthesized by the human epithelial amnion WISH cell line, Tumor Biology 14:310-318 (1993)].
The repeat domain which includes more than 60 repeat units accounts for the multivalent nature of the epitopes present, as each repeat unit likely contains epitope binding sites for both OC125-like antibodies and M11-like antibodies. The presence of a transmembrane domain and cleavage site confirms the membrane association of CA125, and reinforces the data which indicates a dependence of CA125 release on proteolysis. Also, the release of CA125 from the cell surface may well depend on cytoplasmic phosphorylation and be the result of EGF signaling [Nustad K et al., Specificity and affinity of 26 monoclonal antibodies against the CA125 antigen: First report from the ISOBM TD-1 workshop, Tumor Biology 17:196-219 (1996)]. As for the question of inherent capacity of CA125 for proteolytic activity, this does not appear to be the case. However, it is likely that the associated proteins isolated along with CA125 (e.g. the 50 kDa protein which has no antibody binding ability) may have proteolytic activity. In any case, proteolysis of an extracellular cleavage site is the most likely mechanism of CA125 release. Such cleavage would be responsive to cytoplasmic signaling and mediated by an associated extracellular protease activity.
In summary, the large number of tandem repeats of the CA125 molecule, which dominate its molecular structure and contain the likely epitope binding sites of the CA125 molecule, was unexpected. Also, one cannot as yet account for the proteolytic activity, which has plagued the isolation and characterization of this molecule for many years. While no protease domain per se is constitutively part of the CA125 molecule, there is a high likelihood of a direct association by an extracellular protease with the ligand binding domains of the CA125 molecule. Finally, what is the role of the dominant repeat domain of this extracellular structure? Based on the expression data of CA125 on epithelial surfaces and in glandular ducts, it is reasonable to conclude that the unique structure of these repeat units with their cysteine loops plays a role as glandular anti-invasive molecules (bacterial entrapment) and/or a role in anti-adhesion (maintaining patency) between epithelial surfaces and in ductal linings.
Recently, Yin and Lloyd described the partial cloning of the CA125 antigen using a completely different approach from that described in the present invention [Yin T W T et al., Molecular cloning of the CA125 ovarian cancer antigen. Identification as a new mucin (MUC16), J Biol. Chem. 276:27371-27375 (2001)]. Utilizing a polyclonal antibody to CA125 to screen an expression library of the ovarian tumor cell line OVCAR-3, these researchers identified a 5965 bp clone containing a stop codon and a poly A tail, which included nine partially conserved tandem repeats followed by a potential transmembrane region with a cytoplasmic tail. Although differing in a few bases, the 5965 bp sequence is almost completely homologous to the carboxy terminus region shown in Table 21. As mentioned above, the cytoplasmic tail has the potential for phosphorylation, and a transmembrane domain would anchor this part of the CA125 molecule to the surface of the epithelial or tumor cell. In the extracellular matrix, a relatively short transition domain connects the transmembrane anchor to a series of tandem repeats (nine in the case of Yin and Lloyd).
By contrast, the major extracellular part of the molecule of the present invention as shown is upstream from the sequence described by Yin and includes a large series of tandem repeats. These results, of course, provide a different picture of the CA125 molecule, one which suggests that CA125 is dominated by the series of extracellular repeats. Also included is a major amino terminal domain (~1638 amino acids) for the CA125 molecule, which is believed to account for a great deal of the O-glycosylation known to be an important structural component of CA125.
In conclusion, a CA125 molecule is disclosed which requires a transcript of more than 35,000 bases and occupies approximately 150,000 bp on chromosome 19q13.2. It is dominated by a large series of extracellular repeat units (156 amino acids each), which offer the potential for molecular interactions, especially through a highly conserved unique cysteine loop. The repeat units also include the epitopes now well-described and classified for both major classes of CA125 antibodies (i.e., the OC125 and the M11 groups). The CA125 molecule is anchored at its carboxy terminal through a transmembrane domain and a short cytoplasmic tail. CA125 also contains a highly glycosylated amino terminal domain, which includes a large extracellular exon typical of some mucins. Given the massive presence of the repeat domain on both epithelial surfaces and ovarian tumor cell surfaces, it might be anticipated that CA125 plays a major role in determining the extracellular environment surrounding epithelial and tumor cells.
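As a rough consistency check on the figures quoted above, the following Python sketch computes the minimum number of coding bases implied by the repeat domain and the amino terminal domain. The count of 60 repeats is a lower bound taken from the text ("more than 60"), the 1638-residue amino terminal length is approximate, and all constant and function names are illustrative assumptions rather than part of the disclosure.

```python
# Back-of-the-envelope estimate of coding bases implied by the stated domain sizes.
# All values are taken from the description; names are hypothetical.

CODON_LENGTH = 3            # bases per amino acid residue
REPEAT_UNITS = 60           # lower bound: the text says "more than 60"
REPEAT_LENGTH_AA = 156      # residues per repeat unit
AMINO_TERMINAL_AA = 1638    # approximate length of the amino terminal domain


def coding_bases(residues: int) -> int:
    """Number of nucleotides needed to encode the given number of residues."""
    return residues * CODON_LENGTH


repeat_domain_aa = REPEAT_UNITS * REPEAT_LENGTH_AA
repeat_domain_bases = coding_bases(repeat_domain_aa)
amino_terminal_bases = coding_bases(AMINO_TERMINAL_AA)

# Lower bound only: excludes the carboxy terminal domain and untranslated regions.
lower_bound_bases = repeat_domain_bases + amino_terminal_bases

print(f"Repeat domain: {repeat_domain_aa} residues, {repeat_domain_bases} bases")
print(f"Amino terminal domain: {AMINO_TERMINAL_AA} residues, {amino_terminal_bases} bases")
print(f"Lower bound on coding bases: {lower_bound_bases}")
```

Under these assumptions the two domains alone account for roughly 33,000 coding bases, which is consistent with a transcript of more than 35,000 bases once the carboxy terminal domain and untranslated regions are included.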
Advantages and Uses of the CA125 Recombinant Products
1) Current assays for CA125 utilize as standards CA125 produced either from cultured cell lines or from patient ascites fluid. Neither source is defined with regard to the quality or purity of the CA125 molecule. Therefore, arbitrary units are used to describe patient levels of CA125. Because cut-off values are important in the treatment of patients with elevated CA125 and because many different assay systems are used clinically to measure CA125, it is relevant and indeed necessary to define a standard for all CA125 assays. Recombinant CA125 containing epitope binding sites could fulfill this need for standardization. Furthermore, new and more specific assays may be developed utilizing recombinant products for antibody production.
There are now some highly reliable computer programs that can identify peptide sequences within the primary structure of a protein that are likely to be immunogenic. Such programs can be used to identify immunogenic sequences within the inferred CA125 structure. Thus, knowledge of the nucleotide sequence of CA125 cDNA and genomic DNA can lead to the design of synthetic “epitopes” and preparation of highly specific polyclonal and monoclonal antibodies. Antibodies are useful in the development of immunoassays having diagnostic uses. Alternatively, recombinant expression of CA125 protein clearly provides an appropriate antigen for preparing specific antibodies to CA125.
2) Vaccines: Adequate data now exist [see Wagner U et al., Immunological consolidation of ovarian carcinoma recurrences with monoclonal anti-idiotype antibody ACA125: Immune responses and survival in palliative treatment, Clin. Cancer Res. 7:1112-1115 (2001)] which suggest and support the idea that CA125 could be used as a therapeutic vaccine to treat patients with ovarian carcinoma. Heretofore, in order to induce cellular and humoral immunity in humans to CA125, murine antibodies specific for CA125 were utilized in anticipation of patient production of anti-idiotypic antibodies, thus indirectly allowing the induction of an immune response to the CA125 molecule. With the availability of recombinant CA125, especially domains which encompass epitope binding sites for known murine antibodies and domains directly anchoring CA125 on the tumor cell, it will be feasible to more directly stimulate patients' immune systems to CA125 and, as a result, extend the life of ovarian carcinoma patients as demonstrated by Wagner et al.
Several approaches can be utilized to achieve such a therapeutic response in the immune system by: 1) directly immunizing the patient with recombinant antigen containing the CA125 epitopes or other domains; 2) harvesting dendritic cells from the patient; 3) expanding these cells in in vitro culture; 4) activating the dendritic cells with the recombinant CA125 epitope domain or other domains or with peptides derived from these domains [see Santin A D et al., Induction of ovarian tumor-specific CD8+ cytotoxic T lymphocytes by acid-eluted peptide-pulsed autologous dendritic cells, Obstetrics & Gynecology 96(3):422-430 (2000)]; and then 5) returning these immune stem cells to the patient to achieve an immune response to CA125. This procedure can also be accomplished using specific peptides which are compatible with histocompatibility antigens of the patient. Such peptides compatible with the HLA-A2 binding motifs common in the population are indicated in
3) Therapeutic Targets: Molecules that are expressed on the surface of tumor cells, as CA125 is, offer potential targets for immune stimulation, drug delivery, biological modifier delivery or any agent which can be specifically delivered to ultimately kill the tumor cells. CA125 offers such potential as a target: 1) Antibodies to CA125 epitopes or newly described potential epitopes: Most especially, humanized or human antibodies to CA125 could directly activate the patient's immune system to attack and kill tumor cells. Antibodies could also be used to deliver drugs or toxic agents, including radioactive agents, to mediate direct killing of tumor cells. 2) Natural ligands: Under normal circumstances, molecules are bound to the CA125 molecule; e.g., a 50 kDa protein which does not contain CA125 epitopes co-purifies with CA125. Such a molecule, which might have a natural binding affinity for domains on the CA125 molecule, could also be utilized to deliver therapeutic agents to tumor cells.
4) Anti-sense therapy: CA125 expression may provide a survival or metastatic advantage to ovarian tumor cells. As such, antisense oligonucleotides derived from the CA125 sequence could be used to down-regulate the expression of CA125. Antisense therapy could be used in association with a tumor cell delivery system such as described above.
5) Small Molecules: Recombinant domains of CA125 also offer the potential to identify small molecules which bind to individual domains of the molecule. Small molecules either from combinatorial chemical libraries or small peptides can also be used as delivery agents or as biological modifiers.
6) Transgenic Animals/Transformed Cells: CA125 cDNA and genomic DNA can be used to develop transgenic animal models and can be used, under low stringency conditions, to clone CA125 cDNAs and genomic DNAs of other animal species. The CA125 cDNA can also be used to prepare stable transformants. Bacterial cells could be transformed with CA125 cDNA to include these genes.
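Item 1 above refers to computer programs that flag peptide sequences likely to be immunogenic, without naming a particular method. As one illustration only, the Python sketch below averages the classical Hopp-Woods hydrophilicity scale over a sliding window and ranks the most hydrophilic windows as candidate antigenic sites. The choice of this scale, the six-residue window, and the function names are assumptions made for the example; the fragment scanned is one of the repeat-region sequences listed later in this document.

```python
# Sliding-window hydrophilicity scan (Hopp-Woods scale) as a simple,
# illustrative antigenic-site heuristic. This is not the method named in
# the text; the text does not identify any specific program.

HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_windows(sequence, window=6):
    """Return (start, peptide, mean score) for every scorable window.

    Whitespace is removed first, so block-of-ten sequence listings can be
    passed directly. Windows containing symbols outside the scale (for
    example the ambiguity code X) are skipped.
    """
    seq = "".join(sequence.split()).upper()
    results = []
    for start in range(len(seq) - window + 1):
        peptide = seq[start:start + window]
        if any(residue not in HOPP_WOODS for residue in peptide):
            continue
        score = sum(HOPP_WOODS[residue] for residue in peptide) / window
        results.append((start, peptide, score))
    return results


def top_candidates(sequence, window=6, count=3):
    """Return the `count` most hydrophilic windows, highest score first."""
    ranked = sorted(hydrophilicity_windows(sequence, window),
                    key=lambda item: item[2], reverse=True)
    return ranked[:count]


# Repeat-region fragment copied from the sequence listing in this document.
fragment = "MGHPGSRKFN ITERVLQGLL NPIFKNSSVG PLYSGCRLTS LRPEKDGAAT"

for start, peptide, score in top_candidates(fragment):
    print(f"position {start + 1}: {peptide} (mean hydrophilicity {score:.2f})")
```

A hydrophilicity peak is only a first-pass indicator of surface exposure. Any candidate identified this way would still need experimental confirmation, and glycosylation of the native protein may mask sites that look favorable in the primary sequence.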
All references referred to herein are hereby incorporated by reference in their entirety.
It should be understood that various changes and modifications to the presently preferred embodiments described herein will be apparent to those skilled in the art. Such changes and modifications can be made without departing from the spirit and scope of the present invention and without diminishing its attendant advantages.
MGHPGSRKFN ITERVLQGLL NPIFKNSSVG PLYSGCRLTS LRPEKDGAAT
GMDAVCLYHP NPKRPGLDRE QLYCELSQLT HNITELGPYS LDRDSLYVNG
KHGAATGVDA ICTLRLDPTG PGLDRERLYW ELSQLTNSVT ELGPYTLDRD
TLLRPEKRGA ATGVDTICTH RLDPLNPGLD REQLYWELSK LTRGIIELGP
AICTHRPDPT GPGLDREQLY LELSQLTHSI TELGPYTLDR DSLYVNGFTH
HYEENMQHPG SRKFNTTERV LQGLLKPLFK NTSV
GPLYSGCRLT LLRPEKHEAA TGVDTICTHR VDPI
GPGLDRERLY WELSQLTNSI TELGPYTLDR DSLY
ATRVDAVCTH RPDPKSPGLD RERLYWKLSQ LTHGITELGP YTLDRNSLYV
PEKHGAATGV DAICTLRLDP TGPGLDRERL YWELSQLTNS VTELGPYTLD
RLTLLRXEKX XAATXVDXXC XXXXDPXXPG LDREXLYWEL SXLTXXIXEL
ICTHHPDPQS PGLNREQLYW ELSQLTHGIT ELGPYTLDRD SLYVDGFTHW
ATGMDAVCLY HPNPKRPGLD REQLY
TGVDTICTHR VDPIGPGLDR ERLYWELSQL TNSITELGPY TLDRDSLYVN
EKHGAATGVD AICTHRLDPK SPGVDREQLY WELSQLTNGI KELGPYTLDR
VDTICTHRLD PLNPGLDREQ LYWELSKLTR GIIELGPYLL DRGSLYVNGF
DGAATGVDAI CTHHLNPQSP GLDREQLYWQ LSQMTNGIKE LGPYTLDRNS
MRRTGSRKFN TMESVLQGLL KPLFKNTSVG PLYSGCRLTL LRPEKDGAAT
GVDAICTHRL DPKSPGLNRE QLYWELSKL
CLICGVLVTT RRRKKEGEYN VQQQCPGYYQ SHLDLEDLQ
SLRPEKDSSA MAVDAICTHR PDPEDLGLDR ERLYWELSNL TNGIQELGPY
DAICSHRLDP KSPGLNREQL YWELSQLTHG IKELGPYTLD RNSLYVNGFT
GAATGVDAIC THHLNPQSPG LDREQLYWQL SQMTNGIKEL GPYTLDRNSL
VCLYHPNPKR PGLDREQLYW ELSQLTHNIT ELGPYSLDRD SLYVNGFTHQ
ATGVDAICTH RLDPKSPGLN REQLYWELSK LTNDIEELGP YTLDRNSLYV
LRPEKDGAAT RVDAACTYRP DPKSPGLDRE QLYWELSQLT HSITELGPYT
TICTHRLDPL NPGLDREQLY WELSKLTRGI IELGPYLLDR GSLYVNGFTH
AATGVDAICT HRLDPKSPGV DREQLYWELS QLTNGIKELG PYTLDRNSLY
SEKDGAATGV DAICTHRLDP KSPGVDREQL YWELSQLTNG IKELGPYTLD
RLTLLRXEKX XAATXVDXXC XXXXDPXXPG LDREXLYWEL SXLTXXIXEL
VDAICTYRPD PKSPGLDREQ LYWELSQLTH SITELGPYTQ DRDSLYVNGF
QEAATGVDTI CTHRVDPIGP GLDRERLYWE LSQLTNSITE LGPYTLDRDS
LLRPEKHGAA TGVDAICTLR LDPTGPGLDR ERLYWELSQL TNSITELGPY
XXCXXXXDPX XPGLDREXLY WELSXLTXXI XELGPYXLDR XSLYVNGFTH
AATGVDTICT HRVDPIGPGL DREXLYWELS XLTXXIXELG PYXLDRXSLY
RXEKXXAATX VDXXCXXXXD PXXPGLDREX LYWELSXLTX XIXELGPYXL
CRLTLLRPEK HGAATGVDAI CTLRLDPTGP GLDREXLYWE LSXLTXXIXE
CXXXXDPXXP GLDREXLYWE LSXLTXXIXE LGPYXLDRXS LYVNGFTHRS
TRVDAVCTHR PDPKSPGLDR EXLYWELSXL TXXIXELGPY XLDRXSLYVN
EKXXAATXVD XXCXXXXDPX XPGLDREXLY WELSXLTXXI XELGPYXLDR
TSLRSEKDGA ATGVDAICIH HLDPKSPGLD REXLYWELSX LTXXIXELGP
DAICTHRPDP KIPGLDRQQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT
GAATRVDAVC THRPDPKSPG LDRERLYWKL SQLTHGITEL GPYTLDRHSL
LRPKKDGAAT KVDAICTYRP DPKSPGLDRE QLYWELSQLT HSITELGPYT
NGAETRVDLL CTYLQPLSGP GLPIKQVFHE LSQQTHGITR LGPYSLDKDS
DGAATGVDTT CTYHPDPVGP GLDIQQLYWE LSQLTHGVTQ LGFYVLDRDS
CCTTGATGAC AGGGAGCAGG AGCACTAAAG CCACACCAGA AATGGATTCA
GGACTGACAG GAGCCACCTT GTCACCTAAG ACATCTACAG GTGCAATCGT
GGTGACAGAA CATACTCTGC CCTTTACTTC CCCAGATAAG ACCTTGGCCA
GTCCTACATC TTCGGTTGTG GGAAGAACCA CCCAGTCTTT GGGGGTGATG
TCCTCTGCTC TCCCTGAGTC AACCTCTAGA GGAATGACAC ACTCCGAGCA
AAGAACCAGC CCATCGCTGA GTCCCCAGGT CAATGGAACT CCCTCTAGGA
ACTACCCTGC TACAAGCATG GTTTCAGGAT TGAGTTCCCC AAGGACCAGG
ACCAGTTCCA CAGAAGGAAA TTTTACCAAA GAAGCATCTA CATACACACT
CACTGTAGAG ACCACAAGTG GCCCAGTCAC TGAGAAGTAC ACAGTCCCCA
CTGAGACCTC AACAACTGAA GGTGACAGCA CAGAGACCCC CTGGGACACA
AGATATATTC CTGTAAAAAT CACATCTCCA ATGAAAACAT TTGCAGATTC
AACTGCATCC AAGGAAAATG CCCCAGTGTC TATGACTCCA GCTGAGACCA
CAGTTACTGA CTCACATACT CCAGGAAGGA CAAACCCATC ATTTGGGACA
CTTTATTCTT CCTTCCTTGA CCTATCACCT AAAGGGACCC CAAATTCCAG
AGGTGAAACA AGCCTGGAAC TGATTCTATC AACCACTGGA TATCCCTTCT
CCTCTCCTGA ACCTGGCTCT GCAGGACACA GCAGAATAAG TACCAGTGCG
CCTTTGTCAT CATCTGCTTC AGTTCTCGAT AATAAAATAT CAGAGACCAG
CATATTCTCA GGCCAGAGTC TCACCTCCCC TCTGTCTCCT GGGGTGCCCG
AGGCCAGAGC CAGCACAATG CCCAACTCAG CTATCCCTTT TTCCATGACA
CTAAGCAATG CAGAAACAAG TGCCGAAAGG GTCAGAAGCA CAATTTCCTC
TCTGGGGACT CCATCAATAT CCACAAAGCA GACAGCAGAG ACTATCCTTA
CCTTCCATGC CTTCGCTGAG ACCATGGATA TACCCAGCAC CCACATAGCC
AAGACTTTGG CTTCAGAATG GTTGGGAAGT CCAGGTACCC TTGGTGGCAC
CAGCACTTCA GCGCTGACAA CCACATCTCC ATCTACCACT TTAGTCTCAG
AGGAGACCAA CACCCATCAC TCCACGAGTG GAAAGGAAAC AGAAGGAACT
TTGAATACAT CTATGACTCC ACTTGAGACC TCTGCTCCTG GAGAAGAGTC
CGAAATGACT GCCACCTTGG TCCCCACTCT AGGTTTTACA ACTCTTGACA
GCAAGATCAG AAGTCCATCT CAGGTCTCTT CATCCCACCC AACAAGAGAG
CTCAGAACCA CAGGCAGCAC CTCTGGGAGG CAGAGTTCCA GCACAGCTGC
CCACGGGAGC TCTGACATCC TGAGGGCAAC CACTTCCAGC ACCTCAAAAG
CATCATCATG GACCAGTGAA AGCACAGCTC AGCAATTTAG TGAACCCCAG
CACACACAGT GGGTGGAGAC AAGTCCTAGC ATGAAAACAG AGAGACCCCC
AGCATCAACC AGTGTGGCAG CCCCTATCAC CACTTCTGTT CCCTCAGTGG
TCTCTGGCTT CACCACCCTG AAGACCAGCT CCACAAAAGG GATTTGGCTT
GAAGAAACAT CTGCAGACAC ACTCATCGGA GAATCCACAG CTGGCCCAAC
CACCCATCAG TTTGCTGTTC CCACTGGGAT TTCAATGACA GGAGGCAGCA
GCACCAGGGG AAGCCAGGGC ACAACCCACC TACTCACCAG AGCCACAGCA
TCATCTGAGA CATCCGCAGA TTTGACTCTG GCCACGAACG GTGTCCCAGT
CTCCGTGTCT CCAGCAGTGA GCAAGACGGC TGCTGGCTCA AGTCCTCCAG
GAGGGACAAA GCCATCATAT ACAATGGTTT CTTCTGTCAT CCCTGAGACA
TCATCTCTAC AGTCCTCAGC TTTCAGGGAA GGAACCAGCC TGGGACTGAC
TCCATTAAAC ACTAGACATC CCTTCTCTTC CCCTGAACCA GACTCTGCAG
GACACACCAA GATAAGCACC AGCATTCCTC TGTTGTCATC TGCTTCAGTT
CTTGAGGATA AAGTGTCAGC GACCAGCACA TTCTCACACC ACAAAGCCAC
CTCATCTATT ACCACAGGGA CTCCTGAAAT CTCAACAAAG ACAAAGCCCA
GCTCAGCCGT TCTTTCCTCC ATGACCCTAA GCAATGCAGC AACAAGTCCT
GAAAGAGTCA GAAATGCAAC TTCCCCTCTG ACTCATCCAT CTCCATCAGG
GGAAGAGACA GCAGGGAGTG TCCTCACTCT CAGCACCTCT GCTGAGACTA
CAGACTCACC TAACATCCAC CCAACTGGGA CACTGACTTC AGAATCGTCA
GAGAGTCCTA GCACTCTCAG CCTCCCAAGT GTCTCTGGAG TCAAAACCAC
ATTTTCTTCA TCTACTCCTT CCACTCATCT ATTTACTAGT GGAGAAGAAA
CAGAGGAAAC TTCGAATCCA TCTGTGTCTC AACCTGAGAC TTCTGTTTCC
AGAGTAAGGA CCACCTTGGC CAGCACCTCT GTCCCTACCC CAGTATTCCC
CACCATGGAC ACCTGGCCTA CACGTTCAGC TCAGTTCTCT TCATCCCACC
TAGTGAGTGA GCTCAGAGCT ACGAGCAGTA CCTCAGTTAC AAACTCAACT
GGTTCAGCTC TTCCTAAAAT ATCTCACCTC ACTGGGACGG CAACAATGTC
ACAGACCAAT AGAGACACGT TTAATGACTC TGCTGCACCC CAAAGCACAA
CTTGGCCAGA GACTAGTCCC AGATTCAAGA CAGGGTTACC TTCAGCAACA
ACCACTGTTT CAACCTCTGC CACTTCTCTC TCTGCTACTG TAATGGTCTC
TAAATTCACT TCTCCAGCAA CTAGTTCCAT GGAAGCAACT TCTATCAGGG
AACCATCAAC AACCATCCTC ACAACAGAGA CCACGAATGG CCCAGGCTCT
ATGGCTGTGG CTTCTACCAA CATCCCAATT GGAAAGGGCT ACATTACTGA
AGGAAGATTG GACACAAGCC ATCTGCCCAT TGGAACCACA GCTTCCTCTG
AGACATCTAT GGATTTTACC ATGGCCAAAG AAAGTGTCTC AATGTCAGTA
TCTCCATCTC AGTCCATGGA TGCTGCTGGC TCAAGCACTC CAGGAAGGAC
AAGCCAATTC GTTGACACAT TTTCTGATGA TGTCTATCAT TTAACATCCA
GAGAAATTAC AATACCTAGA GATGGAACAA GCTCAGCTCT GACTCCACAA
ATGACTGCAA CTCACCCTCC ATCTCCTGAT CCTGGCTCTG CTAGAAGCAC
CTGGCTTGGC ATCTTGTCCT CATCTCCTTC TTCTCCTACT CCCAAAGTCA
CAATGAGCTC CACATTTTCA ACTCAGAGAG TCACCACAAG CATGATAATG
GACACAGTTG AAACTAGTCG GTGGAACATG CCCAACTTAC CTTCCACGAC
TTCCTTGACA CCAAGTAATA TTCCAACAAG TGGTGCCATA GGAAAAAGCA
CCCTGGTTCC CTTGGACACT CCATCTCCAG CCACATCATT GGAGGCATCA
GAAGGGGGAC TTCCAACCCT CAGCACCTAC CCTGAATCAA CAAACACACC
CAGCATCCAC CTCGGAGCAC ACGCTAGTTC AGAAAGTCCA AGCACCATCA
AACTTACCAT GGCTTCAGTA GTAAAACCTG GCTCTTACAC ACCTCTCACC
TTCCCCTCAA TAGAGACCCA CATTCATGTA TCAACAGCCA GAATGGCTTA
CTCTTCTGGG TCTTCACCTG AGATGACAGC TCCTGGAGAG ACTAACACTG
GTAGTACCTG GGACCCCACC ACCTACATCA CCACTACGGA TCCTAAGGAT
ACAAGTTCAG CTCAGGTCTC TACACCCCAC TCAGTGAGGA CACTCAGAAC
CACAGAAAAC CATCCAAAGA CAGAGTCCGC CACCCCAGCT GCTTACTCTG
GAAGTCCTAA AATCTCAAGT TCACCCAATC TCACCAGTCC GGCCACAAAA
GCATGGACCA TCACAGACAC AACTGAACAC TCCACTCAAT TACATTACAC
AAAATTGGCA GAAAAATCAT CTGGATTTGA GACACAGTCA GCTCCAGGAC
CTGTCTCTGT AGTAATCCCT ACCTCCCCTA CCATTGGAAG CAGCACATTG
GAACTAACTT CTGATGTCCC AGGGGAACCC CTGGTCCTTG CTCCCAGTGA
GCAGACCACA ATCACTCTCC CCATGGCAAC ATGGCTGAGT ACCAGTTTGA
CAGAGGAAAT GGCTTCAACA GACCTTGATA TTTCAAGTCC AAGTTCACCC
ATGAGTACAT TTGCTATTTT TCCACCTATG TCCACACCTT CTCATGAACT
TTCAAAGTCA GAGGCAGATA CCAGTGCCAT TAGAAATACA GATTCAACAA
CGTTGGATCA GCACCTAGGA ATCAGGAGTT TGGGCAGAAC TGGGGACTTA
ACAACTGTTC CTATCACCCC ACTGACAACC ACGTGGACCA GTGTGATTGA
ACACTCAACA CAAGCACAGG ACACCCTTTC TGCAACGATG AGTCCTACTC
ACGTGACACA GTCACTCAAA GATCAAACAT CTATACCAGC CTCAGCATCC
CCTTCCCATC TTACTGAAGT CTACCCTGAG CTCGGGACAC AAGGGAGAAG
CTCCTCTGAG GCAACCACTT TTTGGAAACC ATCTACAGAC ACACTGTCCA
GAGAGATTGA GACTGGCCCA ACAAACATTC AATCCACTCC ACCCATGGAC
AACACAACAA CAGGGAGCAG TAGTAGTGGA GTCACCCTGG GCATAGCCCA
CCTTCCCATA GGAACATCCT CCCCAGCTGA GACATCCACA AACATGGCAC
TGGAAAGAAG AAGTTCTACA GCCACTGTCT CTATGGCTGG GACAATGGGA
CTCCTTGTTA CTAGTGCTCC AGGAAGAAGC ATCAGCCAGT CATTAGGAAG
AGTTTCCTCT GTCCTTTCTG AGTCAACTAC TGAAGGAGTC ACAGATTCTA
GTAAGGGAAG CAGCCCAAGG CTGAACACAC AGGGAAATAC AGCTCTCTCC
TCCTCTCTTG AACCCAGCTA TGCTGAAGGA AGCCAGATGA GCACAAGCAT
CCCTCTAACC TCATCTCCTA CAACTCCTGA TGTGGAATTC ATAGGGGGCA
GCACATTTTG GACCAAGGAG GTCACCACAG TTATGACCTC AGACATCTCC
AAGTCTTCAG CAAGGACAGA GTCCAGCTCA GCTACCCTTA TGTCCACAGC
TTTGGGAAGC ACTGAAAATA CAGGAAAAGA AAAACTCAGA ACTGCCTCTA
TGGATCTTCC ATCTCCAACT CCATCAATGG AGGTGACACC ATGGATTTCT
CTCACTCTCA GTAATGCCCC CAATACCACA GATTCACTTG ACCTCAGCCA
TGGGGTGCAC ACCAGCTCTG CAGGGACTTT GGCCACTGAC AGGTCATTGA
ATACTGGTGT CACTAGAGCC TCCAGATTGG AAAACGGCTC TGATACCTCT
TCTAAGTCCC TGTCTATGGG AAACAGCACT CACACTTCCA TGACTTACAC
AGAGAAGAGT GAAGTGTCTT CTTCAATCCA TCCCCGACCT GAGACCTCAG
CTCCTGGAGC AGAGACCACT TTGACTTCCA CTCCTGGAAA CAGGGCCATA
AGCTTAACAT TGCCTTTTTC ATCCATTCCA GTGGAAGAAG TCATTTCTAC
AGGCATAACC TCAGGACCAG ACATCAACTC AGCACCCATG ACACATTCTC
CCATCACCCC ACCAACAATT GTATGGACCA GTACAGGCAC AATTGAACAG
TCCACTCAAC CACTACATGC AGTTTCTTCA GAAAAAGTTT CTGTGCAGAC
ACAGTCAACT CCATATGTCA ACTCTGTGGC AGTGTCTGCT TCCCCTACCC
ATGAGAATTC AGTCTCTTCT GGAAGCAGCA CATCCTCTCC ATATTCCTCA
GCCTCACTTG AATCCTTGGA TTCCACAATC AGTAGGAGGA ATGCAATCAC
TTCCTGGCTA TGGGACCTCA CTACATCTCT CCCCACTACA ACTTGGCCAA
GTACTAGTTT ATCTGAGGCA CTGTCCTCAG GCCATTCTGG GGTTTCAAAC
CCAAGTTCAA CTACGACTGA ATTTCCACTC TTTTCAGCTG CATCCACATC
TGCTGCTAAG CAAAGAAATC CAGAAACAGA GACCCATGGT CCCCAGAATA
CAGCCGCGAG TACTTTGAAC ACTGATGCAT CCTCGGTCAC AGGTCTTTCT
GAGACTCCTG TGGGGGCAAG TATCAGCTCT GAAGTCCCTC TTCCAATGGC
CATAACTTCT AGATCAGATG TTTCTGGCCT TACATCTGAG AGTACTGCTA
ACCCGAGTTT AGGCACAGCC TCTTCAGCAG GGACCAAATT AACTAGGACA
ATATCCCTGC CCACTTCAGA GTCTTTGGTT TCCTTTAGAA TGAACAAGGA
TCCATGGACA GTGTCAATCC CTTTGGGGTC CCATCCAACT ACTAATACAG
AAACAAGCAT CCCAGTAAAC AGCGCAGGTC CACCTGGCTT GTCCACAGTA
GCATCAGATG TAATTGACAC ACCTTCAGAT GGGGCTGAGA GTATTCCCAC
TGTCTCCTTT TCCCCCTCCC CTGATACTGA AGTGACAACT ATCTCACATT
TCCCAGAAAA GACAACTCAT TCATTTAGAA CCATTTCATC TCTCACTCAT
GAGTTGACTT CAAGAGTGAC ACCTATTCCT GGGGATTGGA TGAGTTCAGC
TATGTCTACA AAGCCCACAG GAGCCAGTCC CTCCATTACA CTGGGAGAGA
GAAGGACAAT CACCTCTGCT GCTCCAACCA CTTCCCCCAT AGTTCTCACT
GCTAGTTTCA CAGAGACCAG CACAGTTTCA CTGGATAATG AAACTACAGT
AAAAACCTCA GATATCCTTG ACGCACGGAA AACAAATGAG CTCCCCTCAG
ATAGCAGTTC TTCTTCTGAT CTGATCAACA CCTCCATAGC TTCTTCAACT
ATGGATGTCA CTAAAACAGC CTCCATCAGT CCCACTAGCA TCTCAGGAAT
GACAGCAAGT TCCTCCCCAT CTCTCTTCTC TTCAGATAGA CCCCAGGTTC
CCACATCTAC AACAGAGACA AATACAGCCA CCTCTCCATC TGTTTCCAGT
AACACCTATT CTCTTGATGG GGGCTCCAAT GTGGGTGGCA CTCCATCCAC
TTTACCACCC TTTACAATCA CCCACCCTGT CGAGACAAGC TCGGCCCTAT
TAGCCTGGTC TAGACCAGTA AGAACTTTCA GCACCATGGT CAGCACTGAC
ACTGCCTCCG GAGAAAATCC TACCTCTAGC AATTCTGTGG TGACTTCTGT
TCCAGCACCA GGTACATGGA CCAGTGTAGG CAGTACTACT GACTTACCTG
CCATGGGCTT TCTCAAGACA AGTCCTGCAG GAGAGGCACA CTCACTTCTA
GCATCAACTA TTGAACCAGC CACTGCCTTC ACTCCCCATC TCTCAGCAGC
AGTGGTCACT GGATCCAGTG CTACATCAGA AGCCAGTCTT CTCACTACGA
GTGAAAGCAA AGCCATTCAT TCTTCACCAC AGACCCCAAC TACACCCACC
TCTGGAGCAA ACTGGGAAAC TTCAGCTACT CCTGAGAGCC TTTTGGTAGT
CACTGAGACT TCAGACACAA CACTTACCTC AAAGATTTTG GTCACAGATA
CCATCTTGTT TTCAACTGTG TCCACGCCAC CTTCTAAATT TCCAAGTACG
GGGACTCTGT CTGGAGCTTC CTTCCCTACT TTACTCCCGG ACACTCCAGC
CATCCCTCTC ACTGCCACTG AGCCAACAAG TTCATTAGCT ACATCCTTTG
ATTCCACCCC ACTGGTGACT ATAGCTTCGG ATAGTCTTGG CACAGTCCCA
GAGACTACCC TGACCATGTC AGAGACCTCA AATGGTGATG CACTGGTTCT
TAAGACAGTA AGTAACCCAG ATAGGAGCAT CCCTGGAATC ACTATCCAAG
GAGTAACAGA AAGTCCACTC CATCCTTCTT CCACTTCCCC CTCTAAGATT
GTTGCTCCAC GGAATACAAC CTATGAAGGT TCGATCACAG TGGCACTTTC
TACTTTGCCT GCGGGAACTA CTGGTTCCCT TGTATTCAGT CAGAGTTCTG
AAAACTCAGA GACAACGGCT TTGGTAGACT CATCAGCTGG GCTTGAGAGG
GCATCTGTGA TGCCACTAAC CACAGGAAGC CAGGGTATGG CTAGCTCTGG
AGGAATCAGA AGTGGGTCCA CTCACTCAAC TGGAACCAAA ACATTTTCTT
CTCTCCCTCT GACCATGAAC CCAGGTGAGG TTACAGCCAT GTCTGAAATC
ACCACGAACA GACTGACAGC TACTCAATCA ACAGCACCCA AAGGGATACC
TGTGAAGCCC ACCAGTGCTG AGTCAGGCCT CCTAACACCT GTCTCTGCCT
CCTCAAGCCC ATCAAAGGCC TTTGCCTCAC TGACTACAGC TCCCCCAACT
TGGGGGATCC CACAGTCTAC CTTGACATTT GAGTTTTCTG AGGTCCCAAG
TTTGGATACT AAGTCCGCTT CTTTACCAAC TCCTGGACAG TCCCTGAACA
CCATTCCAGA CTCAGATGCA AGCACAGCAT CTTCCTCACT GTCCAAGTCT
CCAGAAAAAA ACCCAAGGGC AAGGATGATG ACTTCCACAA AGGCCATAAG
TGCAAGCTCA TTTCAATCAA CAGGTTTTAC TGAAACCCCT GAGGGATCTG
CCTCCCCTTC TATGGCAGGG CATGAACCCA GAGTCCCCAC TTCAGGAACA
GGGGACCCTA GATATGCCTC AGAGAGCATG TCTTATCCAG ACCCAAGCAA
GGCATCATCA GCTATGACAT CGACCTCTCT TGCATCAAAA CTCACAACTC
TCTTCAGCAC AGGTCAAGCA GCAAGGTCTG GTTCTAGTTC CTCTCCCATA
AGCCTATCCA CTGAGAAAGA AACAAGCTTC CTTTCCCCCA CTGCATCCAC
CTCCAGAAAG ACTTCACTAT TTCTTGGGCC TTCCATGGCA AGGCAGCCCA
ACTCTAAATA TGTCCCAGGA GGAGCCTCCT GAGTTAACCT CAAGCCAGAC
CATTGCAGAA GAAGAGGGAA CAACAGCTGA AACACAGACG TTAACCTTCA
CACCATCTGA GACCCCAACA TCCTTGTTAC CTGTCTCTTC TCCCACAGAA
CCCACAGCCA GAAGAAAGAG TTCTCCAGAA ACATGGGCAA GCTCTATTTC
AGTTCCTGCC AAGACCTCCT TGGTTGAAAG TAAGAATGCC CTGCTCCTTC
AAGCCAGGCA GCACAAGGAA ATTCCACGTG GCCTGCCCCA GCAGAGGAGA
CGGGGACCAG TCCAGCAGGT AAATATAGAC CTTGTTTCCA TTTCTGCTCT
TCTACCACTC TCAAAATCAT GAGCTCCAAG GAACCCGGCA TCAGCCCAGA
GATCAGGTCC ACTGTGAGAA ATTCTCCTTG GAAGACTCCA GAAACAACTG
TTCCCATGGA GACCACAGTG GAACCAGTCA CCCTTCAGTC CACAGCCCTA
GGAAGTGGCA GCACCAGCAT CTCTCACCTG CCCACAGGAA CCACATCACC
AACCAAGTCA CCAACAGAAA ATATGTTGGC TACAGAAAGG GTCTCCCTCT
CCCCATCCCC ACCTGAGGCT TGGACCAACC TTTATTCTGG AACTCCAGGA
GGGACCAGGC AGTCACTGGC CACAATGTCC TCTGTCTCCC TAGAGTCACC
AACTGCTAGA AGCATCACAG GGACTGGTCA GCAAAGCAGT CCAGAACTGG
TTTTAAAGAC AACTGGAATG GAATTCTCTA TGTGGCATGG CTCTACTGGA
GGGACCACAG GGGACACACA TGTCTCTCTG AGCACATCTT CCAATATCCT
TGAAGACCCT GTAACCAGCC CAAACTCTGT GAGCTCATTG ACAGATAAAT
CCAAACATAA AACCGAGACA TGGGTCAGCA CCACAGCCAT TCCCTCCACT
GTCCTGAATA ATAAGATAAT GGCAGCTGAA CAACAGACAA GTCGATCTGT
GGATGAGGCT TATTCATCAA CTAGTTCTTG GTCAGATCAG ACATCTGGGA
GTGACATCAC CCTTGGTGCA TCTCCTGATG TCACAAACAC ATTATACATC
ACCTCCACAG CACAAACCAC CTCACTAGTA TCTCTGCCCT CTGGAGACCA
AGGCATTACA AGCCTCACCA ATCCCTCAGG AGGAAAAACA AGCTCTGCAT
CATCTGTCAC ATCTCCTTCA ATAGGGCTTG AGACTCTGAT GGCCAATGTA
AGTGCAGTGA CAAGTGACAT TGCCCCTACT GCTGGGCATC TATCTCAGAC
TTCATCTCCT GCGGAAGTGA GCATCCTGGA CATAACCACA GCTCCTACTC
CAGGTATCTC CACCACCATC ACCACCATGG GAACCAACTC AATCTCAACT
ACCACACCCA ACCCAGAAGT GGGTATGAGT ACCATGGACA GCACCCCGGC
CACAGAGAGG CACACAACTT CTACAGAACA CCCTTCCACC TGGTCTTCCA
CAGCTGCATC AGATTCCTGG ACTGTCACAG ACATGACTTC AAACTTGAAA
GTTGCAAGAT CTCCTGGAAC AATTTCCACA ATGCATACAA CTTCATTCTT
AGCCTCAAGC ACTGAATTAG ACTCCATGTC TACTCCCCAT GGCCGTATAA
CTGTCATTGG AACCAGCCTG GTCACTCCAT CCTCTGATGC TTCAGCTGTA
AAGACAGAGA CCAGTACAAG TGAAAGAACA TTGAGTCCTT CAGACACAAC
TGCATCTACT CCCATCTCAA CTTTTTCTCG TGTCCAGAGG ATGAGCATCT
CAGTTCCTGA CATTTTAAGT ACAAGTTGGA CTCCCAGTAG TACAGAAGCA
GAAGATGTGC CTGTTTCAAT GGTTTCTACA GATCATGCTA GTACAAAGAC
TGACCCAAAT ATGCCCCTGT CCACTTTTCT GTTTGATTCT CTGTCCACTC
TTGACTGGGA CACTGGGAGA TCTCTGTCAT CAGCCACAGC CACTACCTCA
GCTCCTCAGG GGGCCACAAC TCCCCAAGAA CTCACTTTGG AAACCATGAT
CAGCCCAGCT ACCTCACAGT TGCCCTTCTC TATAGGGCAC ATTACAAGTG
CAGTCATACC AGCTGCAATG GCAAGGAGCT CTGGAGTTAC TTTTTCAAGA
CCAGATCCCA CAAGCAAAAA GGCAGAGCAG ACTTCCACTC AGCTTCCCAC
CACCACTTCT GCACATCCAG AGCAGGTGCC CAGATCAGCA GCAACAACTC
TGGATGTGAT CCCACACACA GCAAAAACTC CAGATGCAAC TTTTCAGAGA
CAAGGGCAGA CAGCTCTTAC AACAGAGGCA AGAGCTACAT CTGACTCCTG
GAATGAGAAA GAAAAATCAA CCCCAAGTGC ACCTTGGATC ACTGAGATGA
TGAATTCTGT CTCAGAAGAT ACCATCAAGG AGGTTACCAG CTCCTCCAGT
GTGTTAAGGA CCCTGAATAC GCTGGACATA AACTTGGAAT CTGGGACGAC
TTCATCCCCA AGTTGGAAAA GCAGCCCATA TGAGAGAATT GCCCCTTCTG
AGTCTACCAC AGACAAAGAG GCAATTCACC CTTCTACAAA CACAGTAGAG
ACCACTGGCT GGGTCACAAG TTCCGAACAT GCTTCTCATT CCACTATCCC
AGCCCACTCA GCGTCATCCA AACTCACATC TCCAGTGGTT ACAACCTCCA
CCAGGGAACA AGCAATAGTT TCTATGTCAA CAACCACATG GCCAGAGTCT
ACAAGGGCTA GAACAGAGCC TAATTCCTTC TTGACTATTG AACTGAGGGA
CGTCAGCCCT TACATGGACA CCAGCTCAAC CACACAAACA AGTTTTATCT
CTTCCCCAGG TTCCACTGCG ATCACCAAGG GGCCTAGAAC AGAAATTACC
TCCTCTAAGA GAATATCCAG CTCATTCCTT GCCCAGTCTA TGAGGTCGTC
AGACAGCCCC TCAGAAGCCA TCTCCAGGCT GTCTAACTTT CCTGCCATGA
CAGAATCTGG AGGAATGATC CTTGCTATGC AAACAAGTCC ACCTGGCGCT
ACATCACTAA GTGCACCTAC TTTGGATACA TCAGCCACAG CCTCCTGGAC
AGGGACTCCA CTGGCTACGA CTCAGAGATT TACATACTCA GAGAAGACCA
CTCTCTTTAG CAAAGGTCCT GAGGATACAT CACAGCCAAG CCCTCCCTCT
GTGGAAGAAA CCAGCTCTTC CTCTTCCCTG GTACCTATCA ATGCTACAAC
CTCGCCTTCC AATATTTTGT TGACATCACA AGGGCACAGT CCCTCCTCTA
CTCCACCTGT GACCTCAGTT TTCTTGTCTG AGACCTCTGG CCTGGGGAAG
ACCACAGACA TGTCGAGGAT AAGCTTGGAA CCTGGCACAA GTTTACCTCC
CAATTTGAGC AGTACAGCAG GTGAGGCGTT ATCCACTTAT GAAGCCTCCA
GAGATACAAA GGCAATTCAT CATTCTGCAG ACACAGCAGT GACGAATATG
GAGGCAACCA GTTCTGAATA TTCTCCTATC CCAGGCCATA CAAAGCCATC
CAAAGCCACA TCTCCATTGG TTACCTCCCA CATCATGGGG GACATCACTT
CTTCCACATC AGTATTTGGC TCCTCCGAGA CCACAGAGAT TGAGACAGTG
TCCTCTGTGA ACCAGGGACT TCAGGAGAGA AGCACATCCC AGGTGGCCAG
CTCTGCTACA GAGACAAGCA CTGTCATTAC CCATGTGTCT AGTGGTGATG
CTACTACTCA TGTCACCAAG ACACAAGCCA CTTTCTCTAG CGGAACATCC
ATCTCAAGCC CTCATCAGTT TATAACTTCT ACCAACACAT TTACAGATGT
GAGCACCAAC CCCTCCACCT CTCTGATAAT GACAGAATCT TCAGGAGTGA
CCATCACCAC CCAAACAGGT CCTACTGGAG CTGCAACACA GGGTCCATAT
CTCTTGGACA CATCAACCAT GCCTTACTTG ACAGAGACTC CATTAGCTGT
GACTCCAGAT TTTATGCAAT CAGAGAAGAC CACTCTCATA AGCAAAGGTC
CCAAGGATGT GTCCTGGACA AGCCCTCCCT CTGTGGCAGA AACCAGCTAT
CCCTCTTCCC TGACACCTTT CTTGGTCACA ACCATACCTC CTGCCACTTC
CACGTTACAA GGGCAACATA CATCCTCTCC TGTTTCTGCG ACTTCAGTTC
TTACCTCTGG ACTGGTGAAG ACCACAGATA TGTTGAACAC AAGCATGGAA
CCTGTGACCA ATTCACCTCA AAATTTGAAC AATCCATCAA ATGAGATACT
GGCCACTTTG GCAGCCACCA CAGATATAGA GACTATTCAT CCTTCCATAA
ACAAAGCAGT GACCAATATG GGGACTGCCA GTTCAGCACA TGTACTGCAT
TCCACTCTCC CAGTCAGCTC AGAACCATCT ACAGCCACAT CTCCAATGGT
TCCTGCCTCC AGCATGGGGG ACGCTCTTGC TTCTATATCA ATACCTGGTT
CTGAGACCAC AGACATTGAG GGAGAGCCAA CATCCTCCCT GACTGCTGGA
CGAAAAGAGA ACAGCACCCT CCAGGAGATG AACTCAACTA CAGAGTCAAA
CATCATCCTC TCCAATGTGT CTGTGGGGGC TATTACTGAA GCCACAAAAA
TGGAAGTCCC CTCTTTTGAT GCAACATTCA TACCAACTCC TGCTCAGTCA
ACAAAGTTCC CAGATATTTT CTCAGTAGCC AGCAGTAGAC TTTCAAACTC
TCCTCCCATG ACAATATCTA CCCACATGAC CACCACCCAG ACAGGGTCTT
CTGGAGCTAC ATCAAAGATT CCACTTGCCT TAGACACATC AACCTTGGAA
ACCTCAGCAG GGACTCCATC AGTGGTGACT GAGGGGTTTG CCCACTCAAA
AATAACCACT GCAATGAACA ATGATGTCAA GGACGTGTCA CAGACAAACC
CTCCCTTTCA GGATGAAGCC AGCTCTCCCT CTTCTCAAGC ACCTGTCCTT
GTCACAACCT TACCTTCTTC TGTTGCTTTC ACACCGCAAT GGCACAGTAC
CTCCTCTCCT GTTTCTATGT CCTCAGTTCT TACTTCTTCA CTGGTAAAGA
CCGCAGGCAA GGTGGATACA AGCTTAGAAA CAGTGACCAG TTCACCTCAA
AGATATAGAG ACAACGCATC CTTCCATAAA CACAGTAGTT ACCAATGTGG
GGACCACCGG TTCAGCATTT GAATCACATT CTACTGTCTC AGCTTACCCA
GAGCCATCTA AAGTCACATC TCCAAATGTT ACCACCTCCA CCATGGAAGA
CACCACAATT TCCAGATCAA TACCTAAATC CTCTAAGACT ACAAGAACTG
AGACTGAGAC AACTTCCTCC CTGACTCCTA AACTGAGGGA GACCAGCGTC
TCCCAGGAGA TCACCTCGTC CACAGAGACA AGCACTGTTC CTTACAAAGA
GCTCACTGGT GCCACTACCG AGGTATCCAG GACAGATGTC ACTTCCTCTA
GCAGTACATC CTTCCCTGGC CCTGATCAGT CCACAGTGTC ACTAGACATC
TCCACAGAAA CCAACACCAG GCTGTCTACC TCCCCAATAA TGACAGAATC
TGCAGAAATA ACCATCACCA CCCAAACAGG TCCTCATGGG GCTACATCAC
AGGATACTTT TACCATGGAC CCATCAAATA CAACCCCCCA GGCAGGGATC
CACTCAGCTA TGACTCATGG ATTTTCACAA TTGGATGTGA CCACTCTTAT
GAGCAGAATT CCACAGGATG TATCATGGAC AAGTCCTCCC TCTGTGGATA
AAACCAGCTC CCCCTCTTCC TTTCTGCCCT CACCTGCAAT GACCACACCT
TCCCTGATTT CTTCTACCTT ACCAGAGGAT AAGCTCTCCT CTCCTATGAC
TTCACTTCTC ACCTCTGGCC TAGTGAAGAT TACAGACATA TTACGTACAC
GCTTGGAACC TGTGACCAGC TCACTTCCAA ATTTCAGCAG CACCTCAGAT
AAGATACTGG CCACTTCTAA AGACAGTAAA GACACAAAGG AAATTTTTCC
TTCTATAAAC ACAGAAGAGA CCAATGTGAA AGCCAACAAC TCTGGACATG
AATCCCATTC CCCTGCACTG GCTGACTCAG AGACACCCAA AGCCACAACT
CAAATGGTTA TCACCACCAC TGTGGGAGAT CCAGCTCCTT CCACATCAAT
GCCAGTGCAT GGTTCCTCTG AGACTACAAA CATTAAGAGA GAGCCAACAT
ATTTCTTGAC TCCTAGACTG AGAGAGACCA GTACCTCTCA GGAGTCCAGC
TTTCCCACGG ACACAAGTTT TCTACTTTCC AAAGTCCCCA CTGGTACTAT
TACTGAGGTC TCCAGTACAG GGGTCATCTC TTCTAGCAAA ATTTCCACCC
CAGACCATGA TAAGTCCACA GTGCCACCTG ACACCTTCAC AGGAGAGATC
CCCAGGGTCT TCACCTCCTC TATTAAGACA AAATCTGCAG AAATGACGAT
CACCACCCAA GCAAGTCCTC CTGAGTCTGC ATCGCACAGT ACCCTTCCCT
TGGACACATC AACCACACTT TCCCAGGGAG GGACTCATTC AACTGTGACT
CAGGGATTCC CATACTCAGA GGTGACCACT CTCATGGGCA TGGGTCCTGG
GAATGTGTCA TGGATGACAA CTCCCCCTGT GGAAGAAACC AGCTCTGTGT
CTTCCCTGAT GTCTTCACCT GCCATGACAT CCCCTTCTCC TGTTTCCTCC
ACATCACCAC AGAGCATCCC CTCCTCTCCT CTTCCTGTGA CTGCACTTCC
TACTTCTGTT CTGGTGACAA CCACAGATGT GTTGGGCACA ACAAGCCCAG
AGTCTGTAAC CAGTTCACCT CCAAATTTGA GCAGCATCAC TCATGAGAGA
CCGGCCACTT ACAAAGACAC TGCACACACA GAAGCCGCCA TGCATCATTC
CACAAACACC GCAGTGACCA ATGTAGGGAC TTCCGGGTCT GGACATAAAT
CACAATCCTC TGTCCTAGCT GACTCAGAGA CATCGAAAGC CACACCTCTG
ATGAGTACCA CCTCCACCCT GGGGGACACA AGTGTTTCCA CATCAACTCC
TAATATCTCT CAGACTAACC AAATTCAAAC AGAGCCAACA GCATCCCTGA
GCCCTAGACT GAGGGAGAGC AGCACGTCTG AGAAGACCAG CTCAACAACA
GAGACAAATA CTGCCTTTTC TTATGTGCCC ACAGGTGCTA TTACTCAGGC
CTCCAGAACA GAAATCTCCT CTAGCAGAAC ATCCATCTCA GACCTTGATC
GGTCCACAAT AGCACCCGAC ATCTCCACAG GAATGATCAC CAGGCTCTTC
ACCTCCCCCA TCATGACAAA ATCTGCAGAA ATGACCGTCA CCACTCAAAC
AACTACTCCT GGGGCTACAT CACAGGGTAT CCTTCCCTGG GACACATCAA
CCACACTTTT CCAGGGAGGG ACTCATTCAA CCGTGTCTCA GGGATTCCCA
CACTCAGAGA TAACCACTCT TCGGAGCAGA ACCCCTGGAG ATGTGTCATG
GATGACAACT CCCCCTGTGG AAGAAACCAG CTCTGGGTTT TCCCTGATGT
CACCTTCCAT GACATCCCCT TCTCCTGTTT CCTCCACATC ACCAGAGAGC
ATCCCCTCCT CTCCTCTCCC TGTGACTGCA CTTCTTACTT CTGTTCTGGT
GACAACCACA AATGTATTGG GCACAACAAG CCCAGAGCCC GTAACGAGTT
CACCTCCAAA TTTAAGCAGC CCCACACAGG AGAGACTGAC CACTTACAAA
GACACTGCGC ACACAGAAGC CATGCATGCT TCCATGCATA CAAACACTGC
AGTGGCCAAC GTGGGGACCT CCATTTCTGG ACATGAATCA CAATCTTCTG
TCCCAGCTGA TTCAGACACA TCCAAAGCCA CATCTCCAAT GGGTACCACC
TTCGCCATGG GGGATACAAG TGTTTCTACA TCAACTCCTG CCTTCTTTGA
GACTAGAATT CAGACTGAAT CAACATCCTC TTTGATTCCT GGATTAAGGG
ACACCAGGAC GTCTGAGGAG ATCAACACTG TGACAGAGAC CAGCACTGTC
CTTTCAGAAG TGCCCACTAC TACTACTACT GAGGTCTCCA GGACAGAAGT
TATCACTTCC AGCAGAACAA CCATCTCAGG GCCTGATCAT TCCAAAATGT
CACCCTACAT CTCCACAGAA ACCATCACCA GGCTCTCCAC TTTTCCTTTT
GTAACAGGAT CCACAGAAAT GGCCATCACC AACCAAACAG GTCCTATAGG
GACTATCTCA CAGGCTACCC TTACCCTGGA CACATCAAGC ACAGCTTCCT
GGGAAGGGAC TCACTCACCT GTGACTCAGA GATTTCCACA CTCAGAGGAG
ACCACTACTA TGAGCAGAAG TACTAAGGGC GTGTCATGGC AAAGCCCTCC
CTCTGTGGAA GAAACCAGTT CTCCTTCTTC CCCAGTGCCT TTACCTGCAA
TAACCTCACA TTCATCTCTT TATTCCGCAG TATCAGGAAG TAGCCCCACT
TCTGCTCTCC CTGTGACTTC CCTTCTCACC TCTGGCAGGA GGAAGACCAT
AGACATGTTG GACACACACT CAGAACTTGT GACCAGCTCC TTACCAAGTG
CAAGTAGCTT CTCAGGTGAG ATACTCACTT CTGAAGCCTC CACAAATACA
GAGACAATTC ACTTTTCAGA GAACACAGCA GAAACCAATA TGGGGACCAC
CAATTCTATG CATAAACTAC ATTCCTCTGT CTCAATCCAC TCCCAGCCAT
CCGGACACAC ACCTCCAAAG GTTACTGGAT CTATGATGGA GGACGCTATT
GTTTCCACAT CAACACCTGG TTCTCCTGAG ACTAAAAATG TTGACAGAGA
CTCAACATCC CCTCTGACTC CTGAACTGAA AGAGGACAGC ACCGCCCTGG
TGATGAACTC AACTACAGAG TCAAACACTG TTTTCTCCAG TGTGTCCCTG
GATGCTGCTA CTGAGGTCTC CAGGGCAGAA GTCACCTACT ATGATCCTAC
ATTCATGCCA GCTTCTGCTC AGTCAACAAA GTCCCCAGAC ATTTCACCTG
AAGCCAGCAG CAGTCATTCT AACTCTCCTC CCTTGACAAT ATCTACACAC
AAGACCATCG CCACACAAAC AGGTCCTTCT GGGGTGACAT CTCTTGGCCA
ACTGACCCTG GACACATCAA CCATAGCCAC CTCAGCAGGA ACTCCATCAG
CCAGAACTCA GGATTTTGTA GATTCAGAAA CAACCAGTGT CATGAACAAT
GATCTCAATG ATGTGTTGAA GACAAGCCCT TTCTCTGCAG AAGAAGCCAA
CTCTCTCTCT TCTCAGGCAC CTCTCCTTGT GACAACCTCA CCTTCTCCTG
TAACTTCCAC ATTGCAAGAG CACAGTACCT CCTCTCTTGT TTCTGTGACC
TCAGTACCCA CCCCTACACT GGCGAAGATC ACAGACATGG ACACAAACTT
AGAACCTGTG ACTCGTTCAC CTCAAAATTT AAGGAACACC TTGGCCACTT
CAGAAGCCAC CACAGATACA CACACAATGC ATCCTTCTAT AAACACAGCA
GTGGCCAATG TGGGGACCAC CAGTTCACCA AATGAATTCT ATTTTACTGT
CTCACCTGAC TCAGACCCAT ATAAAGCCAC ATCCGCAGTA GTTATCACTT
CCACCTCGGG GGACTCAATA GTTTCCACAT CAATGCCTAG ATCCTCTGCG
ATGAAAAAGA TTGAGTCTGA GACAACTTTC TCCCTGATAT TTAGACTGAG
GGAGACTAGC ACCTCCCAGA AAATTGGCTC ATCCTCAGAC ACAAGCACGG
TCTTTGACAA AGCATTCACT GCTGCTACTA CTGAGGTCTC CAGAACAGAA
CTCACCTCCT CTAGCAGAAC ATCCATCCAA GGCACTGAAA AGCCCACAAT
GTCACCGGAC ACCTCCACAA GATCTGTCAC CATGCTTTCT ACTTTTGCTG
GCCTGACAAA ATCCGAAGAA AGGACCATTG CCACCCAAAC AGGTCCTCAT
AGGGCGACAT CACAGGGTAC CCTTACCTGG GACACATCAA TCACAACCTC
ACAGGCAGGG ACCCACTCAG CTATGACTCA TGGATTTTCA CAATTAGATT
TGTCCACTCT TACGAGTAGA GTTCCTGAGT ACATATCAGG GACAAGCCCA
CCCTCTGTGG AAAAAACCAG CTCTTCCTCT TCCCTTCTGT CTTTACCAGC
AATAACCTCA CCGTCCCCTG TACCTACTAC ATTACCAGAA AGTAGGCCGT
CTTCTCCTGT TCATCTGACT TCACTCCCCA CCTCTGGCCT AGTGAAGACC
ACAGATATGC TGGCATCTGT GGCCAGTTTA CCTCCAAACT TGGGCAGCAC
CTCACATAAG ATACCGACTA CTTCAGAAGA CATTAAAGAT ACAGAGAAAA
TGTATCCTTC CACAAACATA GCAGTAACCA ATGTGGGGAC CACCACTTCT
GAAAAGGAAT CTTATTCGTC TGTCCCAGCC TACTCAGAAC CACCCAAAGT
CACCTCTCCA ATGGTTACCT CTTTCAACAT AAGGGACACC ATTGTTTCCA
CATCCATGCC TGGCTCCTCT GAGATTACAA GGATTGAGAT GGAGTCAACA
TTCTCCCTGG CTCATGGGCT GAAGGGAACC AGCACCTCCC AGGACCCCAT
CGTATCCACA GAGAAAAGTG CTGTCCTTCA CAAGTTGACC ACTGGTGCTA
CTGAGACCTC TAGGACAGAA GTTGCCTCTT CTAGAAGAAC ATCCATTCCA
GGCCCTGATC ATTCCACAGA GTCACCAGAC ATCTCCACTG AAGTGATCCC
CAGCCTGCCT ATCTCCCTTG GCATTACAGA ATCTTCAAAT ATGACCATCA
TCACTCGAAC AGGTCCTCCT CTTGGCTCTA CATCACAGGG CACATTTACC
TTGGACACAC CAACTACATC CTCCAGGGCA GGAACACACT CGATGGCGAC
TCAGGAATTT CCACACTCAG AAATGACCAC TGTCATGAAC AAGGACCCTG
AGATTCTATC ATGGACAATC CCTCCTTCTA TAGAGAAAAC CAGCTTCTCC
TCTTCCCTGA TGCCTTCACC AGCCATGACT TCACCTCCTG TTTCCTCAAC
ATTACCAAAG ACCATTCACA CCACTCCTTC TCCTATGACC TCACTGCTCA
CCCCTAGCCT AGTGATGACC ACAGACACAT TGGGCACAAG CCCAGAACCT
ACAACCAGTT CACCTCCAAA TTTGAGCAGT ACCTCACATG AGATACTGAC
AACAGATGAA GACACCACAG CTATAGAAGC CATGCATCCT TCCACAAGCA
CAGCAGCGAC TAATGTGGAA ACCACCAGTT CTGGACATGG GTCACAATCC
TCTGTCCTAG CTGACTCAGA AAAAACCAAG GCCACAGCTC CAATGGATAC
CACCTCCACC ATGGGGCATA CAACTGTTTC CACATCAATG TCTGTTTCCT
CTGAGACTAC AAAAATTAAG AGAGAGTCAA CATATTCCTT GACTCCTGGA
CTGAGAGAGA CCAGCATTTC CCAAAATGCC AGCTTTTCCA CTGACACAAG
TATTGTTCTT TCAGAAGTCC CCACTGGTAC TACTGCTGAG GTCTCCAGGA
CAGAAGTCAC CTCCTCTGGT AGAACATCCA TCCCTGGCCC TTCTCAGTCC
ACAGTTTTGC CAGAAATATC CACAAGAACA ATGACAAGGC TCTTTGCCTC
GCCCACCATG ACAGAATCAG CAGAAATGAC CATCCCCACT CAAACAGGTC
CTTCTGGGTC TACCTCACAG GATACCCTTA CCTTGGACAC ATCCACCACA
AAGTCCCAGG CAAAGACTCA TTCAACTTTG ACTCAGAGAT TTCCACACTC
AGAGATGACC ACTCTCATGA GCAGAGGTCC TGGAGATATG TCATGGCAAA
GCTCTCCCTC TCTGGAAAAT CCCAGCTCTC TCCCTTCCCT GCTGTCTTTA
CCTGCCACAA CCTCACCTCC TCCCATTTCC TCCACATTAC CAGTGACTAT
CTCCTCCTCT CCTCTTCCTG TGACTTCACT TCTCACCTCT AGCCCGGTAA
CGACCACAGA CATGTTACAC ACAAGCCCAG AACTTGTAAC CAGTTCACCT
CCAAAGCTGA GCCACACTTC AGATGAGAGA CTGACCACTG GCAAGGACAC
CACAAATACA GAAGCTGTGC ATCCTTCCAC AAACACAGCA GCGTCCAATG
TGGAGATTCC CAGCTCTGGA CATGAATCCC CTTCCTCTGC CTTAGCTGAC
TCAGAGACAT CCAAAGCCAC ATCACCAATG TTTATTACCT CCACCCAGGA
GGATACAACT GTTGCCATAT CAACCCCTCA CTTCTTGGAG ACTAGCAGAA
TTCAGAAAGA GTCAATTTCC TCCCTGAGCC CTAAATTGAG GGAGACAGGC
AGTTCTGTGG AGACAAGCTC AGCCATAGAG ACAAGTGCTG TCCTTTCTGA
AGTGTCCGTT GGTGCTACTA CTGAGATCTC CAGGACAGAA GTCACCTCCT
CTAGCAGAAC ATCCATCTCT GGTTCTGCTG AGTCCACAAT GTTGCCAGAA
ATATCCACCA CAAGAAAAAT CATTAAGTTC CCTACTTCCC CCATCCTGGC
AGAATCATCA GAAATGACCA TCAAGACCCA AACAAGTCCT CCTGGGTCTA
CATCAGAGAG TACCTTTACA TTAGACACAT CAACCACTCC CTCCTTGGTA
ATAACCCATT CGACTATGAC TCAGAGATTG CCACACTCAG AGATAACCAC
TCTTGTGAGT AGAGGTGCTG GGGATGTGCC ACGGCCCAGC TCTCTCCCTG
TGGAAGAAAC AAGCCCTCCA TCTTCCCAGC TGTCTTTATC TGCCATGATC
TCACCTTCTC CTGTTTCTTC CACATTACCA GCAAGTAGCC ACTCCTCTTC
TGCTTCTGTG ACTTCACTTC TCACACCAGG CCAAGTGAAG ACTACTGAGG
TGTTGGACGC AAGTGCAGAA CCTGAAACCA GTTCACCTCC AAGTTTGAGC
AGCACCTCAG TTGAAATACT GGCCACCTCT GAAGTCACCA CAGATACGGA
GAAAATTCAT CCTTTCTCAA ACACGGCAGT AACCAAAGTT GGAACTTCCA
GTTCTGGACA TGAATCCCCT TCCTCTGTCC TACCTGACTC AGAGACAACC
AAAGCCACAT CGGCAATGGG TACCATCTCC ATTATGGGGG ATACAAGTGT
TTCTACATTA ACTCCTGCCT TATCTAACAC TAGGAAAATT CAGTCAGAGC
CAGCTTCCTC ACTGACCACC AGATTGAGGG AGACCAGCAC CTCTGAAGAG
ACCAGCTTAG CCACAGAAGC AAACACTGTT CTTTCTAAAG TGTCCACTGG
TGCTACTACT GAGGTCTCCA GGACAGAAGC CATCTCCTTT AGCAGAACAT
CCATGTCAGG CCCTGAGCAG TCCACAATGT CACAAGACAT CTCCATAGGA
ACCATCCCCA GGATTTCTGC CTCCTCTGTC CTGACAGAAT CTGCAAAAAT
GACCATCACA ACCCAAACAG GTCCTTCGGA GTCTACACTA GAAAGTACCC
TTAATTTGAA CACAGCAACC ACACCCTCTT GGGTGGAAAC CCACTCTATA
GTAATTCAGG GATTTCCACA CCCAGAGATG ACCACTTCCA TGGGCAGAGG
TCCTGGAGGT GTGTCATGGC CTAGCCCTCC CTTTGTGAAA GAAACCAGCC
CTCCATCCTC CCCGCTGTCT TTACCTGCCG TGACCTCACC TCATCCTGTT
TCCACCACAT TCCTAGCACA TATCCCCCCC TCTCCCCTTC CTGTGACTTC
ACTTCTCACC TCTGGCCCGG CGACAACCAC AGATATCTTG GGTACAAGCA
CAGAACCTGG AACCAGTTCA TCTTCAAGTT TGAGCACCAC CTCCCATGAG
AGACTGACCA CTTACAAAGA CACTGCACAT ACAGAAGCCG TGCATCCTTC
CACAAACACA GGAGGGACCA ATGTGGCAAC CACCAGCTCT GGATATAAAT
CACAGTCCTC TGTCCTAGCT GACTCATCTC CAATGTGTAC CACCTCCACC
ATGGGGGATA CAAGTGTTCT CACATCAACT CCTGCCTTCC TTGAGACTAG
GAGGATTCAG ACAGAGCTAG CTTCCTCCCT GACCCCTGGA TTGAGGGAGT
CCAGTGGCTC TGAAGGGACC AGCTCAGGCA CCAAGATGAG CACTGTCCTC
TCTAAAGTGC CCACTGGTGC TACTACTGAG ATCTCCAAGG AAGACGTCAC
CTCCATCCCA GGTCCCGCTC AATCCACAAT ATCACCAGAC ATCTCCACAA
GAACCGTCAG CTGGTTCTCT ACATCCCCTG TCATGACAGA ATCAGCAGAA
ATAACCATGA ACACCCATAC AAGTCCTTTA GGGGCCACAA CACAAGGCAC
CAGTACTTTG GCCACGTCAA GCACAACCTC TTTGACAATG ACACACTCAA
CTATATCTCA AGGATTTTCA CACTCACAGA TGAGCACTCT TATGAGGAGG
GGTCCTGAGG ATGTATCATG GATGAGCCCT CCCCTTCTGG AAAAAACTAG
ACCTTCCTTT TCTCTGATGT CTTCACCAGC CACAACTTCA CCTTCTCCTG
TTTCCTCCAC ATTACCAGAG AGCATCTCTT CCTCTCCTCT TCCTGTGACT
TCACTCCTCA CGTCTGGCTT GGCAAAAACT ACAGATATGT TGCACAAAAG
CTCAGAACCT GTAACCAACT CACCTGCAAA TTTGAGCAGC ACCTCAGTTG
AAATACTGGC CACCTCTGAA GTCACCACAG ATACAGAGAA AACTCATCCT
TCTTCAAACA GAACAGTGAC CGATGTGGGG ACCTCCAGTT CTGGACATGA
ATCCACTTCC TTTGTCCTAG CTGACTCACA GACATCCAAA GTCACATCTC
CAATGGTTAT TACCTCCACC ATGGAGGATA CGAGTGTCTC CACATCAACT
CCTGGCTTTT TTGAGACTAG CAGAATTCAG ACAGAACCAA CATCCTCCCT
GACCCTTGGA CTGAGAAAGA CCAGCAGCTC TGAGGGGACC AGCTTAGCCA
CAGAGATGAG CACTGTCCTT TCTGGAGTGC CCACTGGTGC CACTGCTGAA
GTCTCCAGGA CAGAAGTCAC CTCCTCTAGC AGAACATCCA TCTCAGGCTT
TGCTCAGCTC ACAGTGTCAC CAGAGACTTC CACAGAAACC ATCACCAGAC
TCCCTACCTC CAGCATAATG ACAGAATCAG CAGAAATGAT GATCAAGACA
CAAACAGATC CTCCTGGGTC TACACCAGAG AGTACTCATA CTGTGGACAT
ATCAACAACA CCCAACTGGG TAGAAACCCA CTCGACTGTG ACTCAGAGAT
TTTCACACTC AGAGATGACC ACTCTTGTGA GCAGAAGCCC TGGTGATATG
TTATGGCCTA GTCAATCCTC TGTGGAAGAA ACCAGCTCTG CCTCTTCCCT
GCTGTCTCTG CCTGCCACGA CCTCACCTTC TCCTGTTTCC TCTACATTAG
TAGAGGATTT CCCTTCCGCT TCTCTTCCTG TGACTTCTCT TCTCACCCCT
GGCCTGGTGA TAACCACAGA CAGGATGGGC ATAAGCAGAG AACCTGGAAC
CAGTTCCACT TCAAATTTGA GCAGCACCTC CCATGAGAGA CTGACCACTT
TGGAAGACAC TGTAGATACA GAAGACATGC AGCCTTCCAC ACACACAGCA
GTGACCAACG TGAGGACCTC CATTTCTGGA CATGAATCAC AATCTTCTGT
CCTATCTGAC TCAGAGACAC CCAAAGCCAC ATCTCCAATG GGTACCACCT
ACACCATGGG GGAAACGAGT GTTTCCATAT CCACTTCTGA CTTCTTTGAG
ACCAGCAGAA TTCAGATAGA ACCAACATCC TCCCTGACTT CTGGATTGAG
GGAGACCAGC AGCTCTGAGA GGATCAGCTC AGCCACAGAG GGAAGCACTG
TCCTTTCTGA AGTGCCCAGT GGTGCTACCA CTGAGGTCTC CAGGACAGAA
GTGATATCCT CTAGGGGAAC ATCCATGTCA GGGCCTGATC AGTTCACCAT
ATCACCAGAC ATCTCTACTG AAGCGATCAC CAGGCTTTCT ACTTCCCCCA
TTATGACAGA ATCAGCAGAA AGTGCCATCA CTATTGAGAC AGGTTCTCCT
GGGGCTACAT CAGAGGGTAC CCTCACCTTG GACACCTCAA CAACAACCTT
TTGGTCAGGG ACCCACTCAA CTGCATCTCC AGGATTTTCA CACTCAGAGA
TGACCACTCT TATGAGTAGA ACTCCTGGAG ATGTGCCATG GCCGAGCCTT
CCCTCTGTGG AAGAAGCCAG CTCTGTCTCT TCCTCACTGT CTTCACCTGC
CATGACCTCA ACTTCTTTTT TCTCCACATT ACCAGAGAGC ATCTCCTCCT
CTCCTCATCC TGTGACTGCA CTTCTCACCC TTGGCCCAGT GAAGACCACA
GACATGTTGC GCACAAGCTC AGAACCTGAA ACCAGTTCAC CTCCAAATTT
GAGCAGCACC TCAGCTGAAA TATTAGCCAC GTCTGAAGTC ACCAAAGATA
GAGAGAAAAT TCATCCCTCC TCAAACACAC CTGTAGTCAA TGTAGGGACT
GTGATTTATA AACATCTATC CCCTTCCTCT GTTTTGGCTG ACTTAGTGAC
AACAAAACCC ACATCTCCAA TGGCTACCAC CTCCACTCTG GGGAATACAA
GTGTTTCCAC ATCAACTCCT GCCTTCCCAG AAACTATGAT GACACAGCCA
ACTTCCTCCC TGACTTCTGG ATTAAGGGAG ATCAGTACCT CTCAAGAGAC
CAGCTCAGCA ACAGAGAGAA GTGCTTCTCT TTCTGGAATG CCCACTGGTG
CTACTACTAA GGTCTCCAGA ACAGAAGCCC TCTCCTTAGG CAGAACATCC
ACCCCAGGTC CTGCTCAATC CACAATATCA CCAGAAATCT CCACGGAAAC
CATCACTAGA ATTTCTACTC CCCTCACCAC GACAGGATCA GCAGAAATGA
CCATCACCCC CAAAACAGGT CATTCTGGGG CATCCTCACA AGGTACCTTT
ACCTTGGACA CATCAAGCAG AGCCTCCTGG CCAGGAACTC ACTCAGCTGC
AACTCACAGA TCTCCACACT CAGGGATGAC CACTCCTATG AGCAGAGGTC
CTGAGGATGT GTCATGGCCA AGCCGCCCAT CAGTGGAAAA AACTAGCCCT
CCATCTTCCC TGGTGTCTTT ATCTGCAGTA ACCTCACCTT CGCCACTTTA
TTCCACACCA TCTGAGAGTA GCCACTCATC TCCTCTCCGG GTGACTTCTC
TTTTCACCCC TGTCATGATG AAGACCACAG ACATGTTGGA CACAAGCTTG
GAACCTGTGA CCACTTCACC TCCCAGTATG AATATCACCT CAGATGAGAG
TCTGGCCACT TCTAAAGCCA CCATGGAGAC AGAGGCAATT CAGCTTTCAG
AAAACACAGC TGTGACTCAG ATGGGCACCA TCAGCGCTAG ACAAGAATTC
TATTCCTCTT ATCCAGGCCT CCCAGAGCCA TCCAAAGTGA CATCTCCAGT
GGTCACCTCT TCCACCATAA AAGACATTGT TTCTACAACC ATACCTGCTT
CCTCTGAGAT AACAAGAATT GAGATGGAGT CAACATCCAC CCTGACCCCC
ACACCAAGGG AGACCAGCAC CTCCCAGGAG ATCCACTCAG CCACAAAGCC
AAGCACTGTT CCTTACAAGG CACTCACTAG TGCCACGATT GAGGACTCCA
TGACACAAGT CATGTCCTCT AGCAGAGGAC CTAGCCCTGA TCAGTCCACA
ATGTCACAAG ACATATCCAG TGAAGTGATC ACCAGGCTCT CTACCTCCCC
CATCAAGGCA GAATCTACAG AAATGACCAT TACCACCCAA ACAGGTTCTC
CTGGGGCTAC ATCAAGGGGT ACCCTTACCT TGGACACTTC AACAACTTTT
ATGTCAGGGA CCCACTCAAC TGCATCTCAA GGATTTTCAC ACTCACAGAT
GACCGCTCTT ATGAGTAGAA CTCCTGGAGA TGTGCCATGG CTAAGCCATC
CCTCTGTGGA AGAAGCCAGC TCTGCCTCTT TCTCACTGTC TTCACCTGTC
ATGACCTCAT CTTCTCCCGT TTCTTCCACA TTACCAGACA GCATCCACTC
TTCTTCGCTT CCTGTGACAT CACTTCTCAC CTCAGGGCTG GTGAAGACCA
CAGAGCTGTT GGGCACAAGC TCAGAACCTG AAACCAGTTC ACCCCCAAAT
TTGAGCAGCA CCTCAGCTGA AATACTGGCC ACCACTGAAG TCACTACAGA
TACAGAGAAA CTGGAGATGA CCAATGTGGT AACCTCAGGT TATACACATG
AATCTCCTTC CTCTGTCCTA GCTGACTCAG TGACAACAAA GGCCACATCT
TCAATGGGTA TCACCTACCC CACAGGAGAT ACAAATGTTC TCACATCAAC
CCCTGCCTTC TCTGACACCN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
CTCACTGACT CCTGGGTTGA TGGAGACCAG CATCTCTGAA GAGACCAGCT
CTGCCACAGA AAAAAGCACT GTCCTTTCTA GTGTGCCCAC TGGTGCTACT
ACTGAGGTCT CCAGGACAGA AGCCATCTCT TCTAGCAGAA CATCCATCCC
AGGCCCTGCT CAATCCACAA TGTCATCAGA CACCTCCATG GAAACCATCA
CTAGAATTTC TACCCCCCTC ACAAGGAAAG AATCAACAGA CATGGCCATC
ACCCCCAAAA CAGGTCCTTC TGGGGCTACC TCGCAGGGTA CCTTTACCTT
GGACTCATCA AGCACAGCCT CCTGGCCAGG AACTCACTCA GCTACAACTC
AGAGATTTCC ACAGTCAGTG GTGACAACTC CTATGAGCAG AGGTCCTGAG
GATGTGTCAT GGCCAAGCCC GCTGTCTGTG GAAAAAAACA GCCCTCCATC
TTCCCTGGTA TCTTCATCTT CAGTAACCTC ACCTTCGCCA CTTTATTCCA
CACCATCTGG GAGTAGCCAC TCCTCTCCTG TCCCTGTCAC TTCTCTTTTC
ACCTCTATCA TGATGAAGGC CACAGACATG TTGGATGCAA GTTTGGAACC
TGAGACCACT TCAGCTCCCA ATATGAATAT CACCTCAGAT GAGAGTCTGG
CCACTTCTAA AGCCACCACG GAGACAGAGG CAATTCACGT TTTTGAAAAT
ACAGCAGCGT CCCATGTGGA AACCACCAGT GCTACAGAGG AACTCTATTC
CTCTTCCCCA GGCTTCTCAG AGCCAACAAA AGTGATATCT CCAGTGGTCA
CCTCTTCCTC TATAAGAGAC AACATGGTTT CCACAACAAT GCCTGGCTCC
TCTGGCATTA CAAGGATTGA GATAGAGTCA ATGTCATCTC TGACCCCTGG
ACTGAGGGAG ACCAGAACCT CCCAGGACAT CACCTCATCC ACAGAGACAA
GCACTGTCCT TTACAAGATG TCCTCTGGTG CCACTCCTGA GGTCTCCAGG
ACAGAAGTTA TGCCCTCTAG CAGAACATCC ATTCCTGGCC CTGCTCAGTC
CACAATGTCA CTAGACATCT CCGATGAAGT TGTCACCAGG CTGTCTACCT
CTCCCATCAT GACAGAATCT GCAGAAATAA CCATCACCAC CCAAACAGGT
TATTCTCTGG CTACATCCCA GGTTACCCTT CCCTTGGGCA CCTCAATGAC
CTTTTTGTCA GGGACCCACT CAACTATGTC TCAAGGACTT TCACACTCAG
AGATGACCAA TCTTATGAGC AGGGGTCCTG AAAGTCTGTC ATGGACGAGC
CCTCGCTTTG TGGAAACAAC TAGATCTTCC TCTTCTCTGA CATCATTACC
TCTCACGACC TCACTTTCTC CTGTGTCCTC CACATTACTA GACAGTAGCC
CCTCCTCTCC TCTTCCTGTG ACTTCACTTA TCCTCCCAGG CCTGGTGAAG
ACTACAGAAG TGTTGGATAC AAGCTCAGAG CCTAAAACCA GTTCATCTCC
AAATTTGAGC AGCACCTCAG TTGAAATACC GGCCACCTCT GAAATCATGA
CAGATACAGA GAAAATTCAT CCTTCCTCAA ACACAGCGGT GGCCAAAGTG
AGGACCTCCA GTTCTGTTCA TGAATCTCAT TCCTCTGTCC TAGCTGACTC
AGAAACAACC ATAACCATAC CTTCAATGGG TATCACCTCC GCTGTGGACG
ATACCACTGT TTTCACATCA AATCCTGCCT TCTCTGAGAC TAGGAGGATT
CCGACAGAGC CAACATTCTC ATTGACTCCT GGATTCAGGG AGACTAGCAC
CTCTGAAGAG ACCACCTCAA TCACAGAAAC AAGTGCAGTC CTTTATGGAG
TGCCCACTAG TGCTACTACT GAAGTCTCCA TGACAGAAAT CATGTCCTCT
AATAGAACAC ACATCCCTGA CTCTGATCAG TCCACGATGT CTCCAGACAT
CATCACTGAA GTGATCACCA GGCTCTCTTC CTCATCCATG ATGTCAGAAT
CAACACAAAT GACCATCACC ACCCAAAAAA GTTCTCCTGG GGCTACAGCA
CAGAGTACTC TTACCTTGGC CACAACAACA GCCCCCTTGG CAAGGACCCA
CTCAACTGTT CCTCCTAGAT TTTTACACTC AGAGATGACA ACTCTTATGA
GTAGGAGTCC TGAAAATCCA TCATGGAAGA GCTCTCCCTT TGTGGAAAAA
ACTAGCTCTT CATCTTCTCT GTTGTCCTTA CCTGTCACGA CCTCACCTTC
TGTTTCTTCC ACATTACCGC AGAGTATCCC TTCCTCCTCT TTTTCTGTGA
CTTCACTCCT CACCCCAGGC ATGGTGAAGA CTACAGACAC AAGCACAGAA
CCTGGAACCA GTTTATCTCC AAATCTGAGT GGCACCTCAG TTGAAATACT
GGCTGCCTCT GAAGTCACCA CAGATACAGA GAAAATTCAT CCTTCTTCAA
GCATGGCAGT GACCAATGTG GGAACCACCA GTTCTGGACA TGAACTATAT
TCCTCTGTTT CAATCCACTC GGAGCCATCC AAGGCTACAT ACCCAGTGGG
TACTCCCTCT TCCATGGCTG AAACCTCTAT TTCCACATCA ATGCCTGCTA
ATTTTGAGAC CACAGGATTT GAGGCTGAGC CATTTTCTCA TTTGACTTCT
GGATTTAGGA AGACAAACAT GTCCCTGGAC ACCAGCTCAG TCACACCAAC
AAATACACCT TCTTCTCCTG GGTCCACTCA CCTTTTACAG AGTTCCAAGA
CTGATTTCAC CTCTTCTGCA AAAACATCAT CCCCAGACTG GCCTCCAGCC
TCACAGTATA CTGAAATTCC AGTGGACATA ATCACCCCCT TTAATGCTTC
TCCATCTATT ACGGAGTCCA CTGGGATAAC CTCCTTCCCA GAATCCAGGT
TTACTATGTC TGTAACAGAA AGTACTCATC ATCTGAGTAC AGATTTGCTG
CCTTCAGCTG AGACTATTTC CACTGGCACA GTGATGCCTT CTCTATCAGA
GGCCATGACT TCATTTGCCA CCACTGGAGT TCCACGAGCC ATCTCAGGTT
CAGGTAGTCC ATTCTCTAGG ACAGAGTCAG GCCCTGGGGA TGCTACTCTG
TCCACCATTG CAGAGAGCCT GCCTTCATCC ACTCCTGTGC CATTCTCCTC
TTCAACCTTC ACTACCACTG ATTCTTCAAC CATCCCAGCC CTCCATGAGA
TAACTTCCTC TTCAGCTACC CCATATAGAG TGGACACCAG TCTTGGGACA
GAGAGCAGCA CTACTGAAGG ACGCTTGGTT ATGGTCAGTA CTTTGGACAC
TTCAAGCCAA CCAGGCAGGA CATCTTCAAC ACCCATTTTG GATACCAGAA
TGACAGAGAG CGTTGAGCTG GGAACAGTGA CAAGTGCTTA TCAAGTTCCT
TCACTCTCAA CACGGTTGAC AAGAGAATGC GCATGGCGAG AAGGGAGAAG
ATGGAACACA TCACAAAAAT ACCCAATGAA GCAGCACACA GAGGTACCAT
AAGACCAGTC AAAGGCCCTC AGACATCCAC TTCGCCTGCC AGTCCTAAAG
AGAATGGAGA CCACAACCAC AGCTCTGAAG ACCACCACCA CAGCTCTGAA
GACCACTTCC AGAGCCACCT TGACCACCAG TGTCTATACT CCCACTTTGG
GAACACTGAC TCCCCTCAAT GCATCAATGC AAATGGCCAG CACAATCCCC
ACAGAAATGA TGATCACAAC CCCATATGTT TTCCCTGATG TTCCAGAAAC
GACATCCTCA TTGGCTACCA GCCTGGGAGC AGAAACCAGC ACAGCTCTTC
CCAGGACAAC CCCATCTGTT TTCAATAGAG AATCAGAGAC CACAGCCTCA
CTGGTCTCTC GTTCTGGGGC AGAGAGAAGT CCGGTTATTC AAACTCTAGA
TGTTTCTTCT AGTGAGCCAG ATACAACAGC TTCATGGGTT ATCCATCCTG
CAGAGACCAT CCCAACTGTT TCCAAGACAA CCCCCAATTT TTTCCACAGT
GAATTAGACA CTGTATCTTC CACAGCCACC AGTCATGGGG CAGACGTCAG
CTCAGCCATT CCAACAAATA TCTCACCTAG TGAACTAGAT GCACTGACCC
CACTGGTCAC TATTTCGGGG ACAGATACTA GTACAACATT CCCAACACTG
ACTAAGTCCC CACATGAAAC AGAGACAAGA ACCACATGGC TCACTCATCC
TGCAGAGACC AGCTCAACTA TTCCCAGAAC AATCCCCAAT TTTTCTCATC
ATGAATCAGA TGCCACACCT TCAATAGCCA CCAGTCCTGG GGCAGAAACC
AGTTCAGCTA TTCCAATTAT GACTGTCTCA CCTGGTGCAG AAGATCTGGT
GACCTCACAG GTCACTAGTT CTGGGACAGA CAGAAATATG ACTATTCCAA
CTTTGACTCT TTCTCCTGGT GAACCAAAGA CGATAGCCTC ATTAGTCACC
CATCCTGAAG CACAGACAAG TTCGGCCATT CCAACTTCAA CTATCTCGCC
TGCTGTATCA CGGTTGGTGA CCTCAATGGT CACCAGTTTG GCGGCAAAGA
CAAGTACAAC TAATCGAGCT CTGACAAACT CCCCTGGTGA ACCAGCTACA
ACAGTTTCAT TGGTCACGCA TCCTGCACAG ACCAGCCCAA CAGTTCCCTG
GACAACTTCC ATTTTTTTCC ATAGTAAATC AGACACCACA CCTTCAATGA
CCACCAGTCA TGGGGCAGAA TCCAGTTCAG CTGTTCCAAC TCCAACTGTT
TCAACTGAGG TACCAGGAGT AGTGACCCCT TTGGTCACCA GTTCTAGGGC
AGTGATCAGT ACAACTATTC CAATTCTGAC TCTTTCTCCT GGTGAACCAG
AGACCACACC TTCAATGGCC ACCAGTCATG GGGAAGAAGC CAGTTCTGCT
ATTCCAACTC CAACTGTTTC ACCTGGGGTA CCAGGAGTGG TGACCTCTCT
GGTCACTAGT TCTAGGGCAG TGACTAGTAC AACTATTCCA ATTCTGACTT
TTTCTCTTGG TGAACCAGAG ACCACACCTT CAATGGCCAC CAGTCATGGG
ACAGAAGCTG GCTCAGCTGT TCCAACTGTT TTACCTGAGG TACCAGGAAT
GGTGACCTCT CTGGTTGCTA GTTCTAGGGC AGTAACCAGT ACAACTCTTC
CAACTCTGAC TCTTTCTCCT GGTGAACCAG AGACCACACC TTCAATGGCC
ACCAGTCATG GGGCAGAAGC CAGCTCAACT GTTCCAACTG TTTCACCTGA
GGTACCAGGA GTGGTGACCT CTCTGGTCAC TAGTTCTAGT GGAGTAAACA
GTACAAGTAT TCCAACTCTG ATTCTTTCTC CTGGTGAACT AGAAACCACA
CCTTCAATGG CCACCAGTCA TGGGGCAGAA GCCAGCTCAG CTGTTCCAAC
TCCAACTGTT TCACCTGGGG TATCAGGAGT GGTGACCCCT CTGGTCACTA
GTTCCAGGGC AGTGACCAGT ACAACTATTC CAATTCTAAC TCTTTCTTCT
AGTGAGCCAG AGACCACACC TTCAATGGCC ACCAGTCATG GGGTAGAAGC
CAGCTCAGCT GTTCTAACTG TTTCACCTGA GGTACCAGGA ATGGTGACCT
CTCTGGTCAC TAGTTCTAGA GCAGTAACCA GTACAACTAT TCCAACTCTG
ACTATTTCTT CTGATGAACC AGAGACCACA ACTTCATTGG TCACCCATTC
TGAGGCAAAG ATGATTTCAG CCATTCCAAC TTTAGCTGTC TCCCCTACTG
TACAAGGGCT GGTGACTTCA CTGGTCACTA GTTCTGGGTC AGAGACCAGT
GCGTTTTCAA ATCTAACTGT TGCCTCAAGT CAACCAGAGA CCATAGACTC
ATGGGTCGCT CATCCTGGGA CAGAAGCAAG TTCTGTTGTT CCAACTTTGA
CTGTCTCCAC TGGTGAGCCG TTTACAAATA TCTCATTGGT CACCCATCCT
GCAGAGAGTA GCTCAACTCT TCCCAGGACA ACCTCAAGGT TTTCCCACAG
TGAATTAGAC ACTATGCCTT CTACAGTCAC CAGTCCTGAG GCAGAATCCA
GCTCAGCCAT TTCAACAACT ATTTCACCTG GTATACCAGG TGTGCTGACA
TCACTGGTCA CTAGCTCTGG GAGAGACATC AGTGCAACTT TTCCAACAGT
GCCTGAGTCC CCACATGAAT CAGAGGCAAC AGCCTCATGG GTTACTCATC
CTGCAGTCAC CAGCACAACA GTTCCCAGGA CAACCCCTAA TTATTCTCAT
AGTGAACCAG ACACCACACC ATCAATAGCC ACCAGTCCTG GGGCAGAAGC
CACTTCAGAT TTTCCAACAA TAACTGTCTC ACCTGATGTA CCAGATATGG
TAACCTCACA GGTCACTAGT TCTGGGACAG ACACCAGTAT AACTATTCCA
ACTCTGACTC TTTCTTCTGG TGAGCCAGAG ACCACAACCT CATTTATCAC
CTATTCTGAG ACACACACAA GTTCAGCCAT TCCAACTCTC CCTGTCTCCC
CTGGTGCATC AAAGATGCTG ACCTCACTGG TCATCAGTTC TGGGACAGAC
AGCACTACAA CTTTCCCAAC ACTGACGGAG ACCCCATATG AACCAGAGAC
AACAGCCATA CAGCTCATTC ATCCTGCAGA GACCAACACA ATGGTTCCCA
GGACAACTCC CAAGTTTTCC CATAGTAAGT CAGACACCAC ACTCCCAGTA
GCCATCACCA GTCCTGGGCC AGAAGCCAGT TCAGCTGTTT CAACGACAAC
TATCTCACCT GATATGTCAG ATCTGGTGAC CTCACTGGTC CCTAGTTCTG
GGACAGACAC CAGTACAACC TTCCCAACAT TGAGTGAGAC CCCATATGAA
CCAGAGACTA CAGCCACGTG GCTCACTCAT CCTGCAGAAA CCAGCACAAC
GGTTTCTGGG ACAATTCCCA ACTTTTCCCA TAGGGGATCA GACACTGCAC
CCTCAATGGT CACCAGTCCT GGAGTAGACA CGAGGTCAGG TGTTCCAACT
ACAACCATCC CACCCAGTAT ACCAGGGGTA GTGACCTCAC AGGTCACTAG
TTCTGCAACA GACACTAGTA CAGCTATTCC AACTTTGACT CCTTCTCCTG
GTGAACCAGA GACCACAGCC TCATCAGCTA CCCATCCTGG GACACAGACT
GGCTTCACTG TTCCAATTCG GACTGTTCCC TCTAGTGAGC CAGATACAAT
GGCTTCCTGG GTCACTCATC CTCCACAGAC CAGCACACCT GTTTCCAGAA
CAACCTCCAG TTTTTCCCAT AGTAGTCCAG ATGCCACACC TGTAATGGCC
ACCAGTCCTA GGACAGAAGC CAGTTCAGCT GTACTGACAA CAATCTCACC
TGGTGCACCA GAGATGGTGA CTTCACAGAT CACTAGTTCT GGGGCAGCAA
CCAGTACAAC TGTTCCAACT TTGACTCATT CTCCTGGTAT GCCAGAGACC
ACAGCCTTAT TGAGCACCCA TCCCAGAACA GAGACAAGTA AAACATTTCC
TGCTTCAACT GTGTTTCCTC AAGTATCAGA GACCACAGCC TCACTCACCA
TTAGACCTGG TGCAGAGACT AGCACAGCTC TCCCAACTCA GACAACATCC
TCTCTCTTCA CCCTACTTGT AACTGGAACC AGCAGAGTTG ATCTAAGTCC
AACTGCTTCA CCTGGTGTTT CTGCAAAAAC AGCCCCACTT TCCACCCATC
CAGGGACAGA AACCAGCACA ATGATTCCAA CTTCAACTCT TTCCCTTGGT
CACGAGTACT CTAACTCTGA CTGTTTCCCC TGCTGTCTCT GGGCTTTCCA
GTGCCTCTAT AACAACTGAT AAGCCCCAAA CTGTGACCTC CTGGAACACA
GAAACCTCAC CATCTGTAAC TTCAGTTGGA CCCCCAGAAT TTTCCAGGAC
TGTCACAGGC ACCACTATGA CCTTGATACC ATCAGAGATG CCAACACCAC
CTAAAACCAG TCATGGAGAA GGAGTGAGTC CAACCACTAT CTTGAGAACT
ACAATGGTTG AAGCCACTAA TTTAGCTACC ACAGGTTCCA GTCCCACTGT
GGCCAAGACA ACAACCACCT TCAATACACT GGCTGGAAGC CTCTTTACTC
CTCTGACCAC ACCTGGGATG TCCACCTTGG CCTCTGAGAG TGTGACCTCA
AGAACAAGTA AGAATAACTT TTTTATTGTG GTAAAATATA AATACTATAA
CATTCTCCCC AGGGATTTCC ACATCCTCCA TCCCCAGCTC CACAGGTAGG
AGCAGCCACA GTCCCATTCA TGGTGCCATT CACCCTCAAC TTCACCATCA
CCAACCTGCA GTACGAGGAG GACATGCGGC ACCCTGGTTC CAGGAAGTTC
AACGCCACAG AGAGAGAACT GCAGGGTCTG GTGAGAGCCC CGCCCACCGT
TTGTTCAGGA ATAGCAGTCT GGAATACCTC TATTCAGGCT GCAGACTAGC
CTCACTCAGG TGAGACGCTC CTTAAGAAAA ACACAGCCCA ACAGGTGAAT
CGGCAGTGGA TGCCATCTGC ACACATCGCC CTGACCCTGA AGACCTCGGA
CTGGACAGAG AGCGACTGTA CTGGGAGCTG AGCAATCTGA CAAATGGCAT
CCAGGAGCTG GGCCCCTACA CCCTGGACCG GAACAGTCTC TATGTCAATG
GTGAGCAGCT GTGATGTGGT TGGAGGCTCT TCCTCCTTGC TGAGCAGCCT
TTCACCCATC GAAGCTCTAT GCCCACCACC AGCAGTGAGT ATTCAACTCA
ACCTCCACAG TGGATGTGGG AACCTCAGGG ACTCCATCCT CCAGCCCCAG
CCCCACGAGT AAGTACCAGT CAATGGCATC TCTATTAGAG CATGCTATCT
TCACCAACCT GCAGTACGAG GAGGACATGC GTCGCACTGG CTCCAGGAAG
TTCAACACCA TGGAGAGTGT CCTGCAGGGT CTGGTTAGTG TCCTGCCCTC
TCTGTACTCT GGCTGCAGAT TGACCTTGCT CAGGTGAGAA CTTAGAATTT
GGAGTGGATG CCATCTGCAC CCACCGCCTT GACCCCAAAA GCCCTGGACT
CAACAGGGAG CAGCTGTACT GGGAGCTAAG CAAACTGACC AATGACATTG
AAGAGCTGGG CCCCTACACC CTGGACAGGA ACAGTCTCTA TGTCAATGGT
TCCACCACCA GCAGTGAGTA TTCAACTCAT ATCCACATGC CTCGGTTCCT
AGAACCTCAG GGACTCCATC CTCCCTCTCC AGCCCCACAA GTAAGTATCA
CTTCACCATC ACCAACCTGC AGTATGAGGA GGACATGCAT CGCCCTGGAT
CTAGGAAGTT CAACACCACA GAGAGGGTCC TGCAGGGTCT GGTTAGCACC
GTTGGCCCTC TGTACTCTGG CTGCAGACTG ACCTCTCTCA GGTGAGACCT
CTGAGAAGGA TGGAGCAGCC ACTGGAGTGG ATGCCATCTG CATCCATCAT
CTTGACCCCA AAAGCCCTGG ACTCAACAGA GAGCGGCTGT ACTGGGAGCT
GAGCCGACTG ACCAATGGCA TCAAAGAGCT GGGCCCCTAC ACCCTGGACA
GGAACAGTCT CTATGTCAAT GGTGAGCAGC TGTGATGTGG TTGGAGTCTT
CAGCAGTGAG TATTCAACTC ATGTCCACAT GCCCCTGATC CTACATTAAG
TCAACTTCAC CATCACCAAC CTGAAGTATG AGGAGGACAT GCATCGCCCT
GGCTCCAGGA AGTTCAACAC CACTGAGAGG GTCCTGCAGA CTCTGGTTAG
ACACCAGTGT TGGCCTTCTG TACTCTGGCT GCAGACTGAC CTTGCTCAGG
TGCCATCTGC ACCCACCGTC TTGACCCCAA AAGCCCTGGA GTGGACAGGG
AGCAGCTATA CTGGGAGCTG AGCCAGCTGA CCAATGGCAT CAAAGAGCTG
GGCCCCTACA CCCTGGACAG GAACAGTCTC TATGTCAATG GTGAGCAGCT
GCAGCAGTGA GTATTCAACT CATGTCCATG ATGCCCCTGA TCCTACATCA
ACTCCATCCT CCCTCCCCAG CCCCACAAGT AAGTACCAGC CAATGGTATC
GCTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA
CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA
CCACGGAGCG GGTCCTGCAG GGTCTGGTTA GTGCTCCACC CTCCTCACTC
TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTGAGA CCTTAGAAGA
GCCAGCCCTC TCCTGGTGCT ATTCACAATT AACTTCACCA TCACTAACCT
GCGGTATGAG GAGAACATGC ATCAGCCTGG CTCTAGAAAG TTTAACACCA
CGGAGAGAGT CCTTCAGGGT CTGGTAAGAG CCCCACATAC CTCATTCTAC
GCAGACTGAC CTTGCTCAGG TGAGAACTGA GAACAGCCAG TCTGACTGAT
CGCCCTGATC CCAAAAGCCC TGGACTGGAC AGAGAGCAGC TATACTGGGA
GCTGAGCCAG CTGACCCACA GCATCACTGA GCTGGGCCCC TACACACTGG
ACAGGGACAG TCTCTATGTC AATGGTGAGT AGTTGTGATG TGGTTGGAGT
CTAAACCTGG TCCCTCGGGT AAGTACAAAT CAATCGCATC TCTGTTAGAG
TGCTATTCAC TCTCAACTTC ACCATCACCA ACCTGCGGTA TGAGGAGAAC
ATGCAGCACC CTGGCTCCAG GAAGTTCAAC ACCACGGAGA GGGTCCTTCA
GGGCCTGNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
TCAAGAGCAC CAGTGTTGGC CCTCTGTACT CTGGCTGCAG ACTGACTTTG
CTCAGGNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNAGAAT
GCCACTGGAG TGGATGCCAT CTGCACCCAC CACCCTGACC CCAAAAGCCC
TAGGCTGGAC AGAGAGCAGC TGTATTGGGA GCTGAGCCAG CTGACCCACA
ATATCACTGA GCTGGGCCCC TATGCCCTGG ACAACGACAG CCTCTTTGTC
AATGGTGAGC AATTGTGATG TGGTTGGAGT TTCTTCTTCC TTGCTGAGCA
GGAGCTCTGT GTCCACCACC AGCACTNNNN NNNNNNNNNN NNNNNNNNNN
AGTGTATCTG GGAGCATCTA AGACTCCAGC CTCGATATTT GGCCCTTCAN
ATCACTAACC TGCGGTATGA GGAGAACATG TGGCCTGGCT CCAGGAAGTT
CAACACTACA GAGAGGGTCC TTCAGGGCCT GGTGAGAGCC CTGCCCACCT
GTTCAAGAAC ACCAGTGTTG GCCCTCTGTA CTCTGGCTGC AGGCTGACCT
TGCTCAGGTG AGAACTGAGA ATAACCAGTC TGGCTACCCC AAGTGTTCCC
ACCGCCCTGA CCCCACAGGC CCTGGGCTGG ACAGAGAGCA GCTGTATTTG
GAGCTGAGCC AGCTGACCCA CAGCATCACT GAGCTGGGCC CCTACACACT
GGACAGGGAC AGTCTCTATG TCAATGGTGA GCGGCTGTGA TGTGGTTGGA
GGTCAGCGAG GAGCCATTCA CACTGAACTT CACCATCAAC AACCTGCGCT
ACATGGCGGA CATGGGCCAA CCCGGCTCCC TCAAGTTCAA CATCACAGAC
AACGTCATGC AGCACCTGGT GAGAGGCCTG CCTCCCGCTG CAGCCCTGCC
ACGGTACACA GGCTGCAGGG TCATCGCACT AAGGTGAGAA ACTCCCCCAC
TCCTCTGCAC CTACCTGCAG CCCCTCAGCG GCCCAGGTCT GCCTATCAAG
CAGGTGTTCC ATGAGCTGAG CCAGCAGACC CATGGCATCA CCCGGCTGGG
CCCCTACTCT CTGGACAAAG ACAGCCTCTA CCTTAACGGT GAGCAGCTAT
TCTGTCAGAA GCCACAACAG GTATTTGGGG CCATTTTTCC TCCTCGAAGA
ATCTCCAGTA TTCACCAGAT ATGGGCAAGG GCTCAGCTAC ATTCAACTCC
ACCGAGGGGG TCCTTCAGCA CCTGGTGAGA CCCTGGTCCC AGCAGCTCCT
TTCTACTTGG GTTGCCAACT GATCTCCCTC AGGTGAGACC ACTTCCTGGC
CCACTGGTGT GGACACCACC TGCACCTACC ACCCTGACCC TGTGGGCCCC
GGGCTGGACA TACAGCAGCT TTACTGGGAG CTGAGTCAGC TGACCCATGG
TGTCACCCAA CTGGGCTTCT ATGTCCTGGA CAGGGATAGC CTCTTCATCA
ATGGTGAGTG TCAGGCTGAA CTTGGATTTA CAGTGACTTT TGGGGAGTTG
ATAAATTTCC ACATTGTCAA CTGGAACCTC AGTAATCCAG ACCCCACATC
CTCAGAGTAC ATCACCCTGC TGAGGGACAT CCAGGACAAG GTGGGGCATC
GGTCACCAAC TTGACGTAAG TTCTGAAGGT CATAAGCAGT GACCAAGCTT
ACCCCAGCCT GGTGGAGCAA GTCTTTCTAG ATAAGACCCT GAATGCCTCA
TTCCATTGGC TGGGCTCCAC CTACCAGTTG GTGGACATCC ATGTGACAGG
ATGGAGTCAT CAGTTTATCA ACCAACAAGC AGCTCCAGCA CCCAGCACTT
CTACCTGAAT TTCACCATCA CCAACCTACC ATATTCCCAG GACAAAGCCC
AGCCAGGCAC CACCAATTAC CAGAGGAACA AAAGGAATAT TGAGGATGCG
GCAGCATCAA GAGTTATTTT TCTGACTGTC AAGTTTCAAC ATTCAGGTAA
GTAACTTCTC GCCACTGGCT CGGAGAGTAG ACAGAGTTGC CATCTATGAG
GAATTTCTGC GGATGACCCG GAATGGTACC CAGCTGCAGA ACTTCACCCT
GGACAGGAGC AGTGTCCTTG TGGATGGTAA AGCTCCCTGG GTCATTGGGA
AGCCCTTAAC TGGGAATTCT GGTAAGTCTC AAAGAAGCCC CAGCCCAGGG
CCTTCTGGGC TGTCATCCTC ATCGGCTTGG CAGGACTCCT GGGAGTCATC
ACATGCCTGA TCTGCGGTGT CCTGGTGAGC AAGGAAGGGT TGCTTGTCTT
CGCCGGCGGA AGAAGGAAGG AGAATACAAC GTCCAGCAAC AGTGCCCAGG
CTACTACCAG TCACACCTAG ACCTGGAGGA TCTGCAATGA CTGGAACTTG
This application is a continuation-in-part of PCT/US02/11734 filed Apr. 12, 2002. This application claims the benefit of U.S. Provisional Application Ser. No. 60/284,175 filed Apr. 17, 2001, U.S. Provisional Application Ser. No. 60/299,380 filed Jun. 19, 2001, and U.S. Provisional Application Ser. No. 60/345,180 filed Dec. 21, 2001 through PCT/US02/11734, and is a continuation-in-part of U.S. application Ser. No. 09/965,738, now U.S. Pat. No. 7,309,760, filed Sep. 27, 2001, through PCT/US02/11734. This application is a continuation-in-part of provisional application 60/427,045 (filed Nov. 15, 2002). All of these cited applications are hereby specifically incorporated by reference. Applicant hereby specifically claims the benefit of these prior filed applications under 35 U.S.C. §§119(e), 120 and 363.
Number | Name | Date | Kind |
---|---|---|---|
6074828 | Amara et al. | Jun 2000 | A |
6335194 | Bennett et al. | Jan 2002 | B1 |
6451602 | Popoff et al. | Sep 2002 | B1 |
6468546 | Mitcham et al. | Oct 2002 | B1 |
6962980 | Mitcham et al. | Nov 2005 | B2 |
7205142 | Lloyd et al. | Apr 2007 | B2 |
20020119158 | Algate et al. | Aug 2002 | A1 |
20030091580 | Mitcham et al. | May 2003 | A1 |
20030096238 | Salceda et al. | May 2003 | A1 |
20030143667 | O'Brien et al. | Jul 2003 | A1 |
20040005579 | Birse et al. | Jan 2004 | A1 |
20040009474 | Leach et al. | Jan 2004 | A1 |
20040127401 | O'Brien et al. | Jul 2004 | A1 |
20070015907 | O'Brien et al. | Jan 2007 | A1 |
Number | Date | Country |
---|---|---|
0288082 | Oct 1988 | EP |
1 074 617 | Feb 2001 | EP |
WO 9425482 | Nov 1994 | WO |
WO 9634965 | Nov 1996 | WO |
WO 0036107 | Jun 2000 | WO |
WO 0058473 | Oct 2000 | WO |
WO 0142277 | Jun 2001 | WO |
WO 0192523 | Jun 2001 | WO |
WO 0170804 | Sep 2001 | WO |
WO 03025148 | Sep 2001 | WO |
WO 0175067 | Oct 2001 | WO |
WO 0206317 | Jan 2002 | WO |
WO 02071928 | Sep 2002 | WO |
WO 02092836 | Nov 2002 | WO |
WO 03029271 | Apr 2003 | WO |
Number | Date | Country |
---|---|---|
20070015907 A1 | Jan 2007 | US |
Number | Date | Country |
---|---|---|
60284175 | Apr 2001 | US |
60299380 | Jun 2001 | US |
60345180 | Dec 2001 | US |
60427045 | Nov 2002 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/US02/11734 | Apr 2002 | US |
Child | 10475117 | | US |
Parent | 09965738 | Sep 2001 | US |
Child | PCT/US02/11734 | | US |