FUSION PROTEINS AND METHODS THEREOF

Abstract
The invention discloses oncogenic fusion proteins. The invention provides methods for treating gene-fusion based cancers.
Description
SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Nov. 23, 2015, is named 19240.1034US2_SL.txt and is 2,208,530 bytes in size.


BACKGROUND OF THE INVENTION

Glioblastoma multiforme (GBM) is the most common form of brain cancer and among the most incurable and lethal of all human cancers. The current standard of care includes surgery, chemotherapy, and radiation therapy. However, the prognosis of GBM remains uniformly poor. There are few available targeted therapies and none that specifically target GBM.


The target population of GBM patients who may carry EGFR gene fusions and would benefit from targeted inhibition of EGFR kinase activity is estimated to correspond to 6,000 patients per year world-wide.


SUMMARY OF THE INVENTION

The invention is based, at least in part, on the discovery of a highly expressed class of gene fusions in GBM, which join the receptor tyrosine kinase (RTK) domain of EGFR genes to the coiled-coil domain of septin proteins, such as Septin-14, or to a polypeptide comprising a phosphoserine phosphatase (PSPH) protein or a polypeptide comprising a Cullin-associated and neddylation-dissociated (CAND) protein. The invention is based, at least in part, on the finding that EGFR-SEPT fusions, EGFR-PSPH fusions, and EGFR-CAND fusions identify a subset of GBM patients who will benefit from targeted inhibition of the tyrosine kinase activity of EGFR. Fusions of EGFR genes identified in glioblastoma patients are useful therapeutic targets.


An aspect of the invention is directed to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In one embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein comprising the tyrosine kinase domain of an EGFR protein fused 5′ to a polypeptide comprising the coiled-coil domain of a Septin protein. In one embodiment, the Septin protein is Septin-1, Septin-2, Septin-3, Septin-4, Septin-5, Septin-6, Septin-7, Septin-8, Septin-9, Septin-10, Septin-11, Septin-12, Septin-13, or Septin-14. In another embodiment, the Septin protein is Septin-14 (SEPT14). In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein comprising the tyrosine kinase domain of an EGFR protein fused 5′ to a polypeptide comprising a phosphoserine phosphatase (PSPH) protein. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein comprising the tyrosine kinase domain of an EGFR protein fused 3′ to a polypeptide comprising a Cullin-associated and neddylation-dissociated (CAND) protein. In one embodiment, the CAND protein is CAND1, CAND2, or CAND3. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein encoded by an EGFR-SEPT14 nucleic acid, wherein EGFR-SEPT14 comprises a combination of exons 1-25 of EGFR located on human chromosome 7p11.2 spliced 5′ to a combination of exons 7-10 of SEPT14 located on human chromosome 7, wherein a genomic breakpoint occurs in any one of exons 1-25 of EGFR and any one of exons 7-10 of SEPT14. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein encoded by an EGFR-PSPH nucleic acid, wherein EGFR-PSPH comprises a combination of exons 1-25 of EGFR located on human chromosome 7p12 spliced 5′ to a combination of exons 1-10 of PSPH located on human chromosome 7p11.2, wherein a genomic breakpoint occurs in any one of exons 1-25 of EGFR and any one of exons 1-10 of PSPH. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified fusion protein encoded by an EGFR-CAND1 nucleic acid, wherein EGFR-CAND1 comprises a combination of exons 1-25 of EGFR located on human chromosome 7p12 spliced 3′ to a combination of exons 1-16 of CAND1 located on human chromosome 12q14, wherein a genomic breakpoint occurs in any one of exons 1-25 of EGFR and any one of exons 1-16 of CAND1. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a synthetic nucleic acid encoding the EGFR fusion proteins described above.


An aspect of the invention is directed to a purified EGFR-SEPT14 fusion protein comprising SEQ ID NO: 1 or 5. In one embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified EGFR-SEPT14 fusion protein having a genomic breakpoint comprising SEQ ID NO: 4. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified EGFR-PSPH fusion protein comprising SEQ ID NO: 7 or 11. In another embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified EGFR-PSPH fusion protein having a genomic breakpoint comprising SEQ ID NO: 10. In one embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified EGFR-CAND1 fusion protein comprising SEQ ID NO: 13 or 8495. In one embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a purified EGFR-CAND1 fusion protein having a genomic breakpoint comprising SEQ ID NO: 15. In one embodiment, the purified fusion protein is essentially free of other human proteins.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-SEPT14 fusion protein comprising SEQ ID NO: 2.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-SEPT14 fusion protein having a genomic breakpoint comprising SEQ ID NO: 4.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-PSPH fusion protein comprising SEQ ID NO: 8.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-PSPH fusion protein having a genomic breakpoint comprising SEQ ID NO: 10.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-CAND1 fusion protein comprising SEQ ID NO: 14.


An aspect of the invention is directed to a synthetic nucleic acid encoding an EGFR-CAND1 fusion protein having a genomic breakpoint comprising SEQ ID NO: 15.


An aspect of the invention is directed to an antibody or antigen-binding fragment thereof that specifically binds to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In one embodiment, the fusion protein is an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein. In another embodiment, the EGFR-SEPT fusion protein is EGFR-SEPT14. In one embodiment, the EGFR-SEPT fusion protein comprises the amino acid sequence of SEQ ID NO: 1, 3, or 5. In one embodiment, the EGFR-CAND fusion protein is EGFR-CAND1. In one embodiment, the EGFR-CAND fusion protein comprises the amino acid sequence of SEQ ID NO: 13, 16, or 8495.


An aspect of the invention is directed to an antibody or antigen-binding fragment thereof that specifically binds to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide comprising the coiled-coil domain of a Septin protein. In another embodiment, the EGFR-SEPT fusion protein is EGFR-SEPT14. In one embodiment, the EGFR-SEPT fusion protein comprises the amino acid sequence of SEQ ID NO: 1, 3, or 5.


An aspect of the invention is directed to an antibody or antigen-binding fragment thereof that specifically binds to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide comprising a phosphoserine phosphatase (PSPH) protein. In one embodiment, the EGFR-PSPH fusion protein comprises the amino acid sequence of SEQ ID NO: 7, 9, or 11.


An aspect of the invention is directed to an antibody or antigen-binding fragment thereof, that specifically binds to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide comprising a Cullin-associated and neddylation-dissociated (CAND) protein. In one embodiment, the EGFR-CAND fusion protein is EGFR-CAND1. In one embodiment, the EGFR-CAND fusion protein comprises the amino acid sequence of SEQ ID NO: 13, 16, or 8495.


An aspect of the invention is directed to a composition for decreasing the expression level or activity of a fusion protein in a subject comprising the tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein, the composition in an admixture of a pharmaceutically acceptable carrier comprising an inhibitor of the fusion protein. In one embodiment, the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion; a siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND; or a combination thereof. In another embodiment, the CAND protein is CAND1. In a further embodiment, the SEPT protein is SEPT14. In some embodiments, the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.


An aspect of the invention is directed to a method for treating a gene-fusion associated cancer in a subject in need thereof, the method comprising administering to the subject an effective amount of an EGFR fusion molecule inhibitor. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the EGFR fusion comprises an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In one embodiment, the EGFR fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein. In one embodiment, the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion; a siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND; or a combination thereof. In one embodiment, the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.


An aspect of the invention is directed to a method of decreasing growth of a solid tumor in a subject in need thereof, the method comprising administering to the subject an effective amount of an EGFR fusion molecule inhibitor, wherein the inhibitor decreases the size of the solid tumor. In one embodiment, the subject is afflicted with a gene-fusion associated cancer. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the solid tumor comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the EGFR fusion comprises an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In one embodiment, the EGFR fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein. In one embodiment, the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion; a siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND; or a combination thereof. In one embodiment, the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.


An aspect of the invention is directed to a method of reducing cell proliferation in a subject afflicted with a gene-fusion associated cancer, the method comprising administering to the subject an effective amount of an EGFR fusion molecule inhibitor, wherein the inhibitor decreases cell proliferation. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the EGFR fusion comprises an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In one embodiment, the EGFR fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein. In one embodiment, the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion; a siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND; or a combination thereof. In one embodiment, the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.


An aspect of the invention is directed to a diagnostic kit for determining whether a sample from a subject exhibits a presence of an EGFR fusion, the kit comprising at least one oligonucleotide that specifically hybridizes to an EGFR fusion, or a portion thereof. In one embodiment, the oligonucleotides comprise a set of nucleic acid primers or in situ hybridization probes. In another embodiment, the oligonucleotide comprises SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, 89, or a combination thereof. In a further embodiment, the primers prime a polymerase reaction only when an EGFR fusion is present. In some embodiments, the fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein. In other embodiments, the determining comprises gene sequencing, selective hybridization, selective amplification, gene expression analysis, or a combination thereof.
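The principle that a primer pair yields a product only when a fusion is present can be illustrated with a minimal in silico sketch. It assumes exact primer matching, with one primer annealing in the 5′ partner and the other in the 3′ partner. The function names and all sequences below are hypothetical toy values chosen for illustration; they are not taken from the Sequence Listing and do not correspond to any of the recited SEQ ID NOS.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str):
    """Return the predicted product length, or None if no product forms.

    The forward primer must match the sense strand of the template, and the
    reverse primer must match the opposite strand downstream of it.
    Exact matching only; real primer design tolerates mismatches.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Toy sequences, hypothetical and NOT taken from the Sequence Listing.
GENE_A = "ATGGCCTTGACCGGTACA"      # stands in for the 5' partner
GENE_B = "CTGCAAGATTCGGAGTAA"      # stands in for the 3' partner
FUSION = GENE_A[:12] + GENE_B[6:]  # chimeric transcript joining the two

FORWARD = "GCCTTGAC"                      # anneals within GENE_A
REVERSE = reverse_complement("TTCGGAGT")  # anneals within GENE_B

for name, template in [("gene A", GENE_A), ("gene B", GENE_B), ("fusion", FUSION)]:
    print(name, amplicon_length(template, FORWARD, REVERSE))
```

In this sketch only the chimeric template contains both primer sites in the required orientation, so neither unfused partner is predicted to yield a product.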


An aspect of the invention is directed to a diagnostic kit for determining whether a sample from a subject exhibits a presence of an EGFR fusion protein, the kit comprising an antibody that specifically binds to an EGFR fusion protein comprising SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495, wherein the antibody will recognize the protein only when an EGFR fusion protein is present. In one embodiment, the fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein. In one embodiment, the subject is afflicted with a gene-fusion associated cancer. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma.


An aspect of the invention is directed to a method for detecting the presence of an EGFR fusion in a human subject. The method comprises obtaining a biological sample from the human subject; and detecting whether or not there is an EGFR fusion present in the subject. In one embodiment, the detecting comprises measuring EGFR fusion protein levels by ELISA using an antibody directed to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495; western blot using an antibody directed to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495; mass spectroscopy, isoelectric focusing, or a combination thereof.


An aspect of the invention is directed to a method for detecting the presence of an EGFR fusion in a human subject. The method comprises obtaining a biological sample from a human subject; and detecting whether or not there is a nucleic acid sequence encoding an EGFR fusion protein in the subject. In one embodiment, the nucleic acid sequence comprises any one of SEQ ID NOS: 2, 4, 8, 10, 14, and 15. In another embodiment, the detecting comprises using hybridization, amplification, or sequencing techniques to detect an EGFR fusion. In a further embodiment, the amplification uses primers comprising SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, or 89. In some embodiments, the fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein.





BRIEF DESCRIPTION OF THE FIGURES

To conform to the requirements for PCT patent applications, many of the figures presented herein are black and white representations of images originally created in color. In the below descriptions and the examples, the colored plots and images are described in terms of their appearance in black and white. The original color versions can be viewed in Frattini et al., (2013) Nature Genetics, 45(10):1141-49 (including the accompanying Supplementary Information available in the on-line version of the manuscript available on the Nature Genetics web site). For the purposes of the PCT, the contents of Frattini et al., (2013) Nature Genetics, 45(10):1141-49, including the accompanying “Supplementary Information,” are herein incorporated by reference.



FIG. 1A is a chromosome view of validated GBM genes scoring at the top of each of the three categories by MutComFocal. The plot shows mutated genes without significant copy number alterations (Mut, mutation %, frequency of mutations). Previously known GBM genes are indicated in green (light grey in black and white image), new and independently validated GBM genes are indicated in red (dark grey in black and white image).



FIG. 1B is a chromosome view of validated GBM genes scoring at the top of each of the three categories by MutComFocal. The plot shows mutated genes in regions of focal and recurrent amplifications (Amp-Mut, Amplification/mutation scores). Previously known GBM genes are indicated in green (light grey in black and white image), new and independently validated GBM genes are indicated in red (dark grey in black and white image).



FIG. 1C is a chromosome view of validated GBM genes scoring at the top of each of the three categories by MutComFocal. The plot shows mutated genes in regions of focal and recurrent deletions (Del-Mut, Deletion/mutation scores). Previously known GBM genes are indicated in green (light grey in black and white image), new and independently validated GBM genes are indicated in red (dark grey in black and white image).



FIGS. 2A-B show localization of altered residues in LZTR-1. FIG. 2A shows lysates from 293T cells transfected with vectors expressing LZTR-1 and the Flag-Cul3 wild type (WT), Flag-Cul3-dominant negative (DN) or the empty vector, which were immunoprecipitated with Flag antibody and assayed by western blot with the indicated antibodies. *, non-specific band; left bracket indicates Cul3 polypeptides. The molecular weight is indicated on the right. FIG. 2B shows a homology model of the Kelch (green; grey in black and white image of left hand side of ribbon diagram), BTB (cyan; light grey in black and white image of center and right of ribbon diagram) and BACK (purple; dark grey in black and white image of center and right of ribbon diagram) domains of LZTR-1 with the Cul3 N-terminal domain (white) docked onto the putative binding site. GBM mutations are indicated in red (dark grey in black and white image; left hand side of ribbon diagram).



FIG. 2C. Sequence alignment of the six blades from the Kelch β-propeller domain. Each blade contains four core β-strands, labeled a, b, c, d. Conserved residues are highlighted in gray and residues mutated in GBM are shown in red. Insertions at the end of blades 5 and 6 are indicated in brackets. Figure discloses SEQ ID NO: 8478.



FIGS. 3A-B. Loss of CD drives mesenchymal transformation of GBM. 3a, Immunofluorescence staining of human brain cortex using δ-catenin antibody (red, left panel); Nuclei are counterstained with Dapi (blue, right panel). 3b, Immunofluorescence staining of human primary GBM included in tissue microarrays (TMA) using δ-catenin antibody (red); Nuclei are counterstained with Dapi (blue). A representative δ-catenin-positive and negative tumor is shown in the left and right panel, respectively.



FIGS. 3C-D. Loss of CD drives mesenchymal transformation of GBM. 3c, Kaplan-Meier analysis for glioma patients with low CTNND2 mRNA expression 2-fold, red line) compared with the rest of glioma (blue line). 3d, Kaplan-Meier analysis for glioma patients with low CTNND2 mRNA expression (2-fold) and decreased CTNND2 gene copy number (1) (red line) compared with the rest of glioma (blue line).



FIGS. 3E-F. Loss of CD drives mesenchymal transformation of GBM. 3e, Growth rate of U87 glioma cells transduced with a lentivirus expressing δ-catenin (squares) or the empty vector (circles, average of triplicate cultures). 3f, Expression of mesenchymal genes in glioma cells expressing δ-catenin or the empty vector (averages of triplicate quantitative RT-PCR). All error bars are SD. *, p≦0.005; **, p≦0.001.



FIGS. 3G-H. Loss of CD drives mesenchymal transformation of GBM. 3g, Immunofluorescence staining for βIII-tubulin (upper panels) and PSD95 (lower panels) in glioma cells expressing δ-catenin or the empty vector. 3h, Western blot using the indicated antibodies in glioma cells expressing δ-catenin or the empty vector. Vinculin is shown as control for loading.



FIG. 4A. EGFR-SEPT14 gene fusion identified by whole transcriptome sequencing. Split reads are shown aligning on the breakpoint. The predicted reading frame at the breakpoint is shown at the top with EGFR sequences in blue and SEPT14 in red. The amino acid sequence (TOP) is SEQ ID NO: 1; the nucleotide sequence (bottom) is SEQ ID NO: 2.
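The notion of a split read aligning on a breakpoint can be sketched as follows. This is a simplified illustration that assumes exact string matching against a known chimeric sequence and a minimum number of anchoring bases on each side of the junction; the function name, the `min_anchor` threshold, and the toy sequence are hypothetical and are not taken from the Sequence Listing or from the pipeline used to produce the figure.

```python
def spans_junction(read: str, fusion: str, junction: int, min_anchor: int = 5) -> bool:
    """Return True if `read` matches `fusion` exactly and covers the junction
    with at least `min_anchor` bases on each side.

    `junction` is the 0-based index of the first base contributed by the
    3' partner gene.
    """
    start = fusion.find(read)
    while start != -1:
        end = start + len(read)
        if junction - start >= min_anchor and end - junction >= min_anchor:
            return True
        # Try the next occurrence of the read, if any.
        start = fusion.find(read, start + 1)
    return False


# Toy chimeric sequence (hypothetical): 12 bases from the 5' partner
# followed by 12 bases from the 3' partner.
FUSION = "ATGGCCTTGACC" + "GATTCGGAGTAA"
JUNCTION = 12

reads = ["CTTGACCGATTCGG", "ATGGCCTTGAC", "GACCGATT"]
for read in reads:
    print(read, spans_junction(read, FUSION, JUNCTION))
```

Only the first toy read counts as a split read here: the second lies entirely within the 5′ partner, and the third crosses the junction but anchors fewer than five bases on the 5′ side.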



FIG. 4B. EGFR-SEPT14 gene fusion identified by whole transcriptome sequencing. (left panel), EGFR-SEPT14-specific PCR from cDNA derived from GBMs. Marker, 1 kb ladder. (right panel), Sanger sequencing chromatogram showing the reading frame at the breakpoint (SEQ ID NO: 4) and putative translation of the fusion protein (SEQ ID NO: 3) in the positive sample.



FIG. 4C. EGFR-SEPT14 gene fusion identified by whole transcriptome sequencing. EGFR-Septin14 fusion protein sequence (SEQ ID NO: 5) and schematics. Regions corresponding to EGFR and Septin14 are shown in blue (left hand side of diagram; grey in black and white image; sequence comprising “MRP . . . VIQ” amino acids of SEQ ID NO: 5) and red (right hand side of diagram; light grey in black and white image; sequence comprising “LQD . . . RKK” amino acids of SEQ ID NO: 5), respectively. The fusion joins the tyrosine kinase domain of EGFR and the coiled-coil domain of Septin14.



FIG. 4D. EGFR-SEPT14 gene fusion identified by whole transcriptome sequencing. Genomic fusion of EGFR exon 25 with intron 9 of SEPT14. In the fused mRNA, exon 24 of EGFR is spliced 5′ to exon 10 of SEPT14. Solid arrows indicate the position of the fusion genome primers that generate a fusion-specific PCR product in the GBM sample TCGA-27-1837.



FIG. 5A. Expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. Growth rate of SNB19 glioma cells transduced with a lentivirus expressing EGFR-SEPT14, EGFRvIII, EGFR WT or the empty vector (average of triplicate cultures).



FIG. 5B. Expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. Migration assay in SNB19 glioma cells transduced with a lentivirus expressing EGFR-SEPT14, EGFRvIII, EGFR WT or the empty vector.



FIG. 5C. Expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. Quantification of the cell covered area for the experiments shown in FIG. 5B (average of triplicate cultures). All error bars are SD.



FIG. 5D. Expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. In vivo inhibition of tumor growth by EGFR kinase inhibitors in glioma patient derived xenografts carrying EGFR-SEPT14 fusion but not wild type EGFR. T-C indicates the median difference in survival between drug treated and vehicle (control) treated mice.
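As a worked illustration of the T-C value, the sketch below reads it as the median survival of the drug treated group minus the median survival of the vehicle (control) group. That reading is an assumption about how the statistic is computed, and the survival times are hypothetical numbers invented for the example; they are not data from the xenograft experiments.

```python
from statistics import median


def t_minus_c(treated_days, control_days):
    """Median survival of treated mice minus median survival of controls."""
    return median(treated_days) - median(control_days)


# Hypothetical survival times in days (illustrative only, not study data).
treated = [41, 38, 45, 50, 43]
control = [28, 30, 25, 33, 29]

print("T-C (days):", t_minus_c(treated, control))
```

A positive T-C under this reading indicates that treated animals survived longer than controls, and a value near zero indicates no survival benefit.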



FIG. 5E. Expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. Kinetics of tumor growth for the same xenografts treated with Lapatinib or vehicle (control). All error bars are SD.



FIG. 6 shows the distribution of substitutions from whole exome data.



FIG. 7 shows dinucleotide distribution in mutated sites.



FIG. 8. Sequence alignment of selected LZTR-1 orthologs. Mutations detected in GBM are indicated in red above the aligned sequences. The LZTR-1 gene is present in most metazoans, including the sponge Amphimedon queenslandica, which is generally recognized as the most ancient surviving metazoan lineage (17). LZTR-1 is also present in some near-metazoan unicellular protists, including Capsaspora owczarzaki (included in the Figure) and the choanoflagellates Salpingoeca rosetta and Monosiga brevicollis. These opisthokonts are key organisms for the study of the evolution of multicellularity, differentiation and cell-cell communication in animals and help in our understanding of the role of molecular pathways in cancer (18). LZTR-1 has a characteristic Kelch-BTB-BACK-BTB-BACK domain architecture, and unlike the BTB-BACK-Kelch proteins (8), there has been little, if any, duplication of the LZTR-1 gene since its appearance. Despite its name, LZTR-1 does not contain a leucine zipper region. Figure discloses SEQ ID NOS 8479-8482, respectively, in order of appearance.



FIG. 9. Sequence alignment of BTB-BACK domains. The two BTB-BACK domains of LZTR-1 are included along with the predicted secondary structure from HHpred (6). The 3-box is the Cul3 binding element within the BACK domain. The secondary structures of KLHL3 (PDB ID 4HXI), KLHL11 (PDB ID 4AP2) and Gigaxonin (PDB ID 3HVE) are based on the crystal structures. The secondary structure of SPOP is based on a crystal structure for the BTB and 3-box region (PDB ID 3HTM) and HHpred predictions for the remainder of the BACK domain. Only the N-terminal half of the BACK domain from KLHL3, KLHL11 and Gigaxonin is included, as SPOP and LZTR-1 contain truncated versions of the BACK domain. Figure discloses SEQ ID NOS 8483-8488, respectively, in order of appearance.



FIG. 10A. Pattern of somatic mutations, CNVs and expression of CTNND2 in GBM. Schematic representation of identified somatic mutations in CTNND2 shown in the context of the known domain structure of the protein. Numbers refer to amino acid residues of the δ-catenin protein.



FIG. 10B. Pattern of somatic mutations, CNVs and expression of CTNND2 in GBM. Somatic deletions of CTNND2. Samples are sorted according to the focality of CTNND2 deletion. In the red-blue scale, white corresponds to normal (diploid) copy number, blue is deletion and red is gain.



FIG. 10C. Pattern of somatic mutations, CNVs and expression of CTNND2 in GBM. Pattern of expression of δ-catenin in the developing mouse brain (embryonic day 14.5), as determined by immunostaining. The highest levels of δ-catenin are detected in the cortical plate (CP) that contains differentiating neurons. IZ, intermediate zone; VZ/SVZ ventricular zone/subventricular zone; LV, lateral ventricle.



FIG. 10D. Pattern of somatic mutations, CNVs and expression of CTNND2 in GBM. CTNND2 mRNA expression analysis from Atlas-TCGA samples shows that CTNND2 is significantly down-regulated in the mesenchymal subgroup. In the green-red scale, black is the median, green is down-regulation and red is up-regulation.



FIG. 11A. EGFR-PSPH gene fusion identified by whole transcriptome sequencing. Split reads are shown aligning on the breakpoint. The predicted reading frame at the breakpoint is shown at the top with EGFR sequences in blue (grey in black and white image; encompassing “SRR . . . VIQ” amino acids and “AGT . . . CAG” nucleotides) and PSPH in red (light grey in black and white image; encompassing “DAF . . . QQV” amino acids and “GAT . . . CAA” nucleotides). The amino acid sequence (TOP) is SEQ ID NO: 7; the nucleotide sequence (bottom) is SEQ ID NO: 8.



FIG. 11B. EGFR-PSPH gene fusion identified by whole transcriptome sequencing. (left panel), EGFR-PSPH-specific PCR from cDNA derived from GBMs. Marker, 1 kb ladder. (right panel), Sanger sequencing chromatogram showing the reading frame (SEQ ID NO: 10) at the breakpoint and putative translation of the fusion protein (SEQ ID NO: 9) in the positive sample.



FIG. 11C. EGFR-PSPH gene fusion identified by whole transcriptome sequencing. EGFR-PSPH fusion protein sequence (SEQ ID NO: 11) and schematics. Regions corresponding to EGFR and PSPH are shown in blue (grey in black and white image; left hand side of schematic; sequence comprising “MRP . . . VIQ” amino acids of SEQ ID NO: 11) and red (light grey in black and white image; right hand side of schematic, encompassing amino acids “DAF . . . LEE” of SEQ ID NO: 11), respectively. The fusion includes the tyrosine kinase domain of EGFR and the last 35 amino acids of PSPH.



FIG. 12A. NFASC-NTRK1 gene fusion identified by whole transcriptome sequencing. Split reads are shown aligning on the breakpoint. The predicted reading frame at the breakpoint is shown at the top with NFASC sequences in blue (grey in black and white image; encompassing “RVQ . . . GED” amino acids and “AGA . . . ATT” nucleotides) and NTRK1 in red (light grey in black and white image; encompassing “YTN . . . VGL” amino acids and “AGA . . . AAG” nucleotides). Figure discloses SEQ ID NOS 8489-8490, respectively, in order of appearance.



FIG. 12B. NFASC-NTRK1 gene fusion identified by whole transcriptome sequencing. (left panel), NFASC-NTRK1 specific PCR from cDNA derived from GBMs. Marker, 1 kb ladder. (right panel), Sanger sequencing chromatogram showing the reading frame (SEQ ID NO: 8491) at the breakpoint and putative translation of the fusion protein (SEQ ID NO: 8492) in the positive sample.



FIG. 12C. NFASC-NTRK1 gene fusion identified by whole transcriptome sequencing. NFASC-NTRK1 fusion protein sequence (SEQ ID NO: 8493) and schematics. Regions corresponding to NFASC and NTRK1 are shown in blue (grey in black and white image; sequence comprising “MAR . . . GED” amino acids) and red (light grey in black and white image; sequence comprising “YTN . . . VLG” amino acids), respectively. The fusion includes two of the five fibronectin-type III domains of neurofascin and the protein kinase domain of NTRK1.



FIG. 12D. NFASC-NTRK1 gene fusion identified by whole transcriptome sequencing. Genomic fusion of NFASC intron 9 with intron 21 of NTRK1. In the fused mRNA, exon 21 of NFASC is spliced 5′ to exon 10 of NTRK1. Solid arrows indicate the position of the fusion genome primers that generate a fusion specific PCR product in the GBM sample TCGA-06-5411.



FIG. 13 shows the expression measured by read depth from RNA-seq data. Note the very high level of expression in the regions of the genes implicated in the fusion events.



FIG. 14A. CAND1-EGFR gene fusion identified by whole transcriptome sequencing. Split reads are shown aligning on the breakpoint. The predicted reading frame at the breakpoint is shown at the top with CAND1 sequences in blue (grey in black and white image; sequence comprising “TSA . . . LSR” amino acids of SEQ ID NO: 13 and “TTA . . . CAG” nucleotides of SEQ ID NO: 14) and EGFR in red (light grey in black and white image; sequence comprising “CTG . . . VGX” amino acids of SEQ ID NO: 13 and “ATC . . . GGC” nucleotides of SEQ ID NO: 14). The amino acid sequence (TOP) is SEQ ID NO: 13; the nucleotide sequence (bottom) is SEQ ID NO: 14.



FIG. 14B. CAND1-EGFR gene fusion identified by whole transcriptome sequencing. (left panel), CAND1-EGFR specific PCR from cDNA derived from GBMs. Marker, 1 kb ladder. (right panel), Sanger sequencing chromatogram showing the reading frame at the breakpoint (SEQ ID NO: 15) and putative translation of the fusion protein (SEQ ID NO: 16) in the positive sample (boxed sequences). Figure also discloses SEQ ID NO: 8494.



FIG. 14C. CAND1-EGFR gene fusion identified by whole transcriptome sequencing. CAND1-EGFR fusion protein sequence (SEQ ID NO: 8495). Regions corresponding to CAND1 and EGFR are shown in blue (grey in black and white image; sequence comprising “MAS . . . LSR” amino acids of SEQ ID NO: 8495) and red (grey in black and white image; sequence comprising “CTG . . . IGA*” amino acids of SEQ ID NO: 8495), respectively.



FIG. 14D. CAND1-EGFR gene fusion identified by whole transcriptome sequencing. Genomic fusion of CAND1 intron 4 with intron 15 of EGFR. In the fused mRNA, exon 4 of CAND1 is spliced 5′ to exon 16 of EGFR.



FIG. 15 is a photographic image of a blot showing the interaction with Cul3 and protein stability of wild type and mutant LZTR-1. Lysates from SF188 glioma cells transfected with vectors expressing Myc-LZTR-1 and Flag-Cul3 or the empty vector were immunoprecipitated with Flag antibody and assayed by western blot with the indicated antibodies. *, non-specific band; arrowhead indicates neddylated Cul3.



FIG. 16A is a set of photographic images of blots showing the interaction with Cul3 and protein stability of wild type and mutant LZTR-1. In vitro analysis of the interaction between Cul3 and LZTR-1 wild type and GBM related mutants. Left panel, In vitro translated Myc-LZTR-1 input. Right panel, In vitro translated Myc-LZTR-1 was mixed with Flag-Cul3 immunoprecipitated from transfected HEK-293T cells. Bound proteins were analyzed by western blot using the indicated antibodies.



FIG. 16B is a photographic image of a blot showing the interaction with Cul3 and protein stability of wild type and mutant LZTR-1. Steady state protein levels of wild type LZTR-1 and GBM-related mutants.



FIG. 16C is a photographic image of a blot (top) and a graph (bottom). Top panel, Cells transfected with LZTR-1 wild type or the R810W mutant were treated with cycloheximide for the indicated time. Bottom panel, Quantification of LZTR-1 wild type and LZTR-1-R810W protein from the experiment in the top panel.



FIG. 16D is a photographic image of a blot showing the interaction with Cul3 and protein stability of wild type and mutant LZTR-1. Semi-quantitative RT-PCR evaluation of LZTR-1 wild type and LZTR-1-R810W RNA expression in cells transfected as in FIG. 16C.



FIG. 17A is a graph showing functional analysis of LZTR-1 wild type and GBM associated mutants in GBM-derived cells. GSEA shows up-regulation of genes associated with the phenotype of “spherical cultures” of glioma cells in primary human GBM carrying mutations in the LZTR-1 gene [Enrichment Score (ES)=0.754; P (family-wise error rate, FWER)=0.000; q (false discovery rate, FDR)=0.000].



FIG. 17B is a graph showing functional analysis of LZTR-1 wild type and GBM associated mutants in GBM-derived cells. Sphere forming assay (left panel) and western blot analysis (right panel) of GBM-derived glioma spheres (#48) expressing vector or LZTR-1. Data are Mean±SD of triplicate samples (p=0.0036). Error bars are SD.



FIG. 17C is a linear regression plot of in vitro limiting dilution assay using GBM-derived glioma spheres #46 expressing vector or LZTR-1. The frequency of sphere forming cells was 8.49±1.04 and 1.44±0.05% in vector and LZTR-1 expressing cells, respectively (p=0.00795). Each data point represents the average of triplicates. Error bars are SD.



FIG. 17D is a graph and photographic microscopy images showing functional analysis of LZTR-1 wild type and GBM associated mutants in GBM-derived cells. Left upper panels, Bright field microphotographs of GBM-derived line 46 cells six days after transduction with vector or LZTR-1 expressing lentivirus. Left lower panels, Bright field microphotographs of spheres from GBM-derived glioma cells #46 transduced with lentivirus expressing vector or LZTR-1 from experiment in FIG. 17C. Right panel, The size of tumor spheres from cultures in FIG. 17C was determined by microscopy review after 14 days of culture. n=60 spheres from triplicates for each condition. Data are Mean±SD (p<0.0001). Error bars are SD.



FIG. 17E is a photographic image of a western blot analysis of GBM-derived cells #84 expressing vector or LZTR-1.



FIG. 17F is a linear regression plot of in vitro limiting dilution assay using GBM-derived line 84 expressing vector, LZTR-1, LZTR-1-R810W or LZTR-1-W437STOP. The frequency of sphere forming cells was 7.2±0.92 for vector, 1.48±0.09 for LZTR-1 wild type (p=0.0096); 7.82±0.99 for LZTR-1-R810W (p=0.2489); and 6.74±1.07 for LZTR-1-W437STOP (p=0.2269). Error bars are SD.



FIGS. 18A-B are photographic microscopy images showing expression of δ-catenin in neurons and δ-catenin driven loss of mesenchymal marker in GBM. FIG. 18A shows a pattern of expression of δ-catenin in the developing brain, as determined by immunostaining. Double immunofluorescence staining of brain cortex using δ-catenin antibody (red; dark grey in black and white image (center)) and βIII-tubulin (green; light grey in black and white image (right)); Nuclei are counterstained with Dapi (blue; grey in black and white image (Left)). FIG. 18B shows a pattern of expression of δ-catenin in the adult brain, as determined by immunostaining. Upper panels, Double immunofluorescence staining of brain cortex using δ-catenin antibody (red; dark grey in black and white image (center)) and MAP2 (green; light grey in black and white image (right)); Nuclei are counterstained with Dapi (blue; grey in black and white image (Left)). Lower panels, Double immunofluorescence staining of brain cortex using δ-catenin antibody (red; dark grey in black and white image) and GFAP (green; light grey in black and white image); Nuclei are counterstained with Dapi (blue; grey in black and white image).



FIG. 18C is a photographic image of a western blot using the indicated antibodies for U87 cells expressing δ-catenin wild type, glioma-associated δ-catenin mutants or the empty vector. FBN, fibronectin. Vinculin is shown as control for loading.



FIGS. 19A-B show a functional analysis of δ-catenin in mesenchymal GBM. FIG. 19A is a photographic microscopy image of immunofluorescence for fibronectin, collagen-5α1 (COL5A1) and smooth muscle actin (SMA) in glioma spheres #48 four days after infection with lentiviruses expressing δ-catenin or the empty vector. Nuclei are counterstained with Dapi. FIG. 19B is a bar graph showing the quantification of fluorescence intensity for SMA, COL5A1 and FBN for cultures treated as in FIG. 19A. n=3 independent experiments; data indicate mean±SD.



FIG. 19C is a bar graph showing the quantification of fluorescence intensity for βIII-tubulin in cells #48 infected with lentiviruses expressing CTNND2 or the empty vector.



FIG. 19D shows photographic microscopy images of a time course analysis of βIII-tubulin expression in glioma spheres #48 transduced with lentiviruses expressing CTNND2 or the empty vector. Note the loss of βIII-tubulin expressing cells from the advanced culture.



FIGS. 19E-F are graphs. FIG. 19E shows a linear regression plot of in vitro limiting dilution assay using GBM-derived cells #48 expressing vector or δ-catenin. The frequency of sphere forming cells was 7.42±1.16 and 0.88±0.02 for vector and δ-catenin, respectively (p=0.0098). Error bars are SD. FIG. 19F shows a longitudinal analysis of bioluminescence imaging in mice injected intracranially with GBM-derived line 48 expressing vector or δ-catenin. n=3 mice for vector and 5 for δ-catenin. Data are mean±SEM of photon counts.



FIGS. 20A-E show the functional analysis of EGFR-SEPT14 fusion and effect of inhibition of EGFR kinase on glioma growth. FIG. 20A is a graph of a sphere forming assay in the absence of EGF of GBM-derived primary cells (#48) expressing vector, EGFR wild type, EGFRvIII or EGFR-SEPT14 fusion. Data are Mean±SD of triplicate samples (p=0.0051 and 0.027 for EGFR-SEPT14 fusion and EGFRvIII compared with vector, respectively). FIG. 20B is a western blot analysis of GBM-derived primary cells (#48) expressing vector, EGFRvIII or EGFR-SEPT14 fusion cultured in the presence of EGF. FIG. 20C is a photographic image of a blot showing GBM-derived cells (#48) expressing vector, EGFRvIII or EGFR-SEPT14 fusion that were cultured in the absence of EGF for 48 h and then stimulated with EGF 20 ng/ml for the indicated time. Cells were assayed by western blot using the indicated antibodies. FIG. 20D is a graph of GSEA showing up-regulation of STAT3 target genes in primary human GBM carrying the EGFR-SEPT14 fusion gene [Enrichment Score (ES)=0.738; P (family-wise error rate, FWER)=0.000; q (false discovery rate, FDR)=0.000]. FIG. 20E is a bar graph showing the survival of GBM-derived cells (#48) expressing vector, EGFR wild type, EGFRvIII or EGFR-SEPT14 fusion after treatment with lapatinib for 48 h at the indicated concentrations. Data are Mean±SD of triplicate samples.



FIG. 21 is a plot showing the number of mutations in TCGA samples harboring MutComFocal gene candidates. For a given gene G, the number of mutations M_G in samples harboring G was plotted as solid circles. The mean of M_G is also plotted as asterisks. Given the mean μ and standard deviation σ of the number of mutations in all TCGA samples, the 95% confidence interval of a sample being hyper-mutated (μ±1.96*σ) was plotted, showing that for all G, the mean of M_G falls well within the 95% confidence interval, demonstrating that MutComFocal genes do not tend to occur in hypermutated samples.
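
By way of non-limiting illustration, the interval check described for FIG. 21 (mean plus or minus 1.96 standard deviations of the per-sample mutation count) can be sketched as follows. The function names and the example counts are hypothetical and are not data from the TCGA analysis; the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev


def hypermutation_interval(all_counts):
    """Return (low, high) = mean -/+ 1.96 * SD of per-sample mutation counts."""
    mu = mean(all_counts)
    sigma = pstdev(all_counts)  # assumption: population SD (estimator not stated in the text)
    return mu - 1.96 * sigma, mu + 1.96 * sigma


def mean_within_interval(gene_counts, all_counts):
    """True if the mean mutation count of samples harboring a gene lies inside the interval."""
    low, high = hypermutation_interval(all_counts)
    return low <= mean(gene_counts) <= high


# Hypothetical mutation counts, for illustration only.
all_counts = [40, 55, 60, 48, 52, 70, 65, 45]
gene_counts = [50, 58, 62]  # samples harboring a hypothetical gene G
print(mean_within_interval(gene_counts, all_counts))
```

A gene whose samples have a mean count inside the interval would, under this sketch, not be flagged as preferentially occurring in hypermutated samples.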



FIGS. 22A-B show the pattern of somatic mutations, CNVs and expression of CTNND2 in GBM. FIG. 22A shows photographic microscopy images of immunofluorescence staining of human primary GBM included in tissue microarrays (TMA) using δ-catenin antibody (red; dark grey in black and white image); Nuclei are counterstained with Dapi (blue; grey in black and white image). Two representative δ-catenin-positive and two δ-catenin-negative tumors are shown in the upper and lower panels, respectively. FIG. 22B is a Western Blot analysis of the expression of δ-catenin in a panel of GBM-derived glioma sphere cultures. Brain, normal human brain. Arrowhead indicates δ-catenin; Asterisk, non-specific band. Vinculin is shown as control for loading.



FIGS. 23A-B show the effects of expression of δ-catenin in glioma cells. FIG. 23A is a western blot using the indicated antibodies in glioma cells expressing δ-catenin or the empty vector. Vinculin is shown as control for loading. FIG. 23B shows photographic microscopy images of U87 glioma cells transduced with a lentivirus expressing wild type δ-catenin, δ-catenin GBM-derived mutants or the empty vector and analyzed by fluorescence microscopy.



FIG. 23C is a bar graph that shows the effects of expression of δ-catenin in glioma cells. The number of cells displaying neural processes was scored. At least 200 cells/sample were analyzed.



FIG. 23D shows photographs of longitudinal bioluminescence imaging for one representative mouse injected intracranially with glioma sphere cells #48 transduced with lentivirus expressing CTNND2 (lower panels) or the empty vector (upper panels).



FIG. 24 is a heat map showing amplification surrounding the genomic neighborhood of EGFR, SEPT14, and PSPH among samples harboring EGFR fusions. Copy number was plotted as log2 ratio across the genomic region of chr7:55000000-56500000 for samples with EGFR-PSPH (top three rows) and EGFR-SEPT14 (bottom six rows). Genomic coordinates are also plotted for EGFR (blue; dark grey in black and white image), SEPT14 (yellow; light grey in black and white image), and PSPH (cyan; grey in black and white image).



FIG. 25 is a plot showing that the expression of EGFR-SEPT14 fusion promotes an aggressive phenotype and inhibition of EGFR kinase delays GBM growth in vivo. Growth rate of U87 glioma cells transduced with a lentivirus expressing EGFR-SEPT14, EGFRvIII, EGFR WT or the empty vector (average of triplicate cultures).



FIG. 26 is a map showing differential expression of GBM tumor samples harboring EGFR-SEPT14 fusions and EGFRvIII rearrangements. After filtering for statistical significance for differential expression, ten genes remained that distinguished the EGFR-SEPT14 phenotype from the EGFRvIII phenotype. Log2 expression was plotted as a heat map. Samples were hierarchically clustered by Euclidean distance using average linkage. This clustering demonstrates clear separation between EGFR-SEPT14 samples (red; dark grey in black and white image; corresponding to top half of intensity bar of left hand side) and EGFRvIII samples (green; light grey in black and white image; corresponding to bottom half of intensity bar of left hand side), confirming the unique molecular signature of the EGFR-SEPT14 gene fusion.
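
By way of non-limiting illustration, hierarchical clustering by Euclidean distance with average linkage, as named for FIG. 26, can be sketched in a few lines. This is a simplified agglomerative procedure written for clarity, not the software used to produce the figure; the sample names and the two-gene expression profiles are hypothetical (the figure itself is based on ten genes).

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering of samples by average-linkage Euclidean distance.

    profiles: dict mapping sample name -> list of log2 expression values.
    Returns a list of clusters, each a list of sample names.
    """
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        total = sum(dist(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# Hypothetical log2 expression profiles (two genes per sample), for illustration only.
profiles = {
    "fusion_1": [5.0, 1.0],
    "fusion_2": [5.2, 0.8],
    "vIII_1": [1.0, 4.0],
    "vIII_2": [0.9, 4.3],
}
print(average_linkage(profiles, n_clusters=2))
```

With well-separated profiles such as these, the two hypothetical fusion samples and the two hypothetical vIII samples fall into separate clusters.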





DETAILED DESCRIPTION OF THE INVENTION

Gene fusions retaining the RTK-coding domain of EGFR are the most frequent gene fusion events in GBM. EGFR gene fusions occur in 7.6% of GBM patients and frequently implicate the SEPT14 gene as the 3′ partner in the fusion, with a consistent breakpoint at the RNA level. This makes the EGFR fusions highly manageable genetic alterations both diagnostically and therapeutically. In one embodiment, EGFR fusions enhance the proliferative and migratory capacity of glioma cells. In another embodiment, the EGFR fusions also confer sensitivity to EGFR inhibition to human GBM grown as mouse xenografts. Gene fusions encompassing RTK-coding genes are thus implicated in the pathogenesis of GBM and provide a strong rationale for the inclusion of GBM patients harboring EGFR fusions in clinical trials based on EGFR inhibitors. The target population of GBM patients who may carry EGFR gene fusions can benefit from targeted inhibition of EGFR kinase activity, and is estimated to correspond to 20,000 patients per year world-wide (~1,000 in USA/year).


Glioblastoma multiforme (GBM) is the most common form of brain tumor in adults, accounting for 12-15% of intracranial tumors and 50-60% of primary brain tumors. GBM is among the most lethal and incurable forms of human cancer. The history of successful targeted therapy of cancer largely coincides with the inactivation of recurrent and oncogenic gene fusions in hematological malignancies and, recently, in some types of epithelial cancer. Targeted therapies against common genetic alterations in GBM have not changed the dismal clinical outcome of the disease, most likely because they have systematically failed to eradicate the truly addicting oncoprotein activities of GBM. Recurrent chromosomal rearrangements resulting in the creation of oncogenic gene fusions have not previously been found in GBM.


GBM is among the most difficult forms of cancer to treat in humans (1). So far, the therapeutic approaches that have been tested against potentially important oncogenic targets in GBM have met limited success (2-4). Recurrent chromosomal translocations leading to production of oncogenic fusion proteins are viewed as initiating and addicting events in the pathogenesis of human cancer, thus providing the most desirable molecular targets for cancer therapy (5, 6). Recurrent and oncogenic gene fusions have not been found in GBM. Chromosomal rearrangements are hallmarks of hematological malignancies but recently they have also been uncovered in subsets of solid tumors (breast, prostate, lung and colorectal carcinoma) (7, 8). Important and successful targeted therapeutic interventions for patients whose tumors carry these rearrangements have stemmed from the discovery of functional gene fusions, especially when the translocations involve kinase-coding genes (BCR-ABL, EML4-ALK) (9, 10). GBM, the most common malignant brain tumor, remains one of the most challenging forms of cancer to treat. The abundance of passenger mutations and large regions of copy number alterations has complicated the definition of the landscape of driver mutations in glioblastoma.


A hallmark of GBM is rampant chromosomal instability (CIN), which leads to aneuploidy (11). CIN and aneuploidy are early events in the pathogenesis of cancer (12). Without being bound by theory, genetic alterations targeting mitotic fidelity might be responsible for missegregation of chromosomes during mitosis, resulting in aneuploidy (13, 14).


Epidermal growth factor receptor (EGFR) is a transmembrane glycoprotein and a member of the protein kinase superfamily. EGFR is a cell surface protein that binds epidermal growth factor and other members of the epidermal growth factor family. Binding of the protein to a ligand induces receptor dimerization and tyrosine autophosphorylation and leads to cell proliferation. Mutations that lead to EGFR overexpression or overactivity have been associated with a number of cancers, including lung cancer, anal cancers and glioblastoma multiforme.


Phosphoserine phosphatase (PSPH) is an enzyme responsible for the third and last step in L-serine formation. It catalyzes magnesium-dependent hydrolysis of L-phosphoserine and is also involved in an exchange reaction between L-serine and L-phosphoserine. Deficiency of this protein is thought to be linked to Williams syndrome.


The singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise.


The term “about” is used herein to mean approximately, in the region of, roughly, or around. When the term “about” is used in conjunction with a numerical range, it modifies that range by extending the boundaries above and below the numerical values set forth. In general, the term “about” is used herein to modify a numerical value above and below the stated value by a variance of 20%.
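
By way of non-limiting illustration, the 20% variance attached to the term “about” can be expressed as a simple numerical check. The function names below are hypothetical and serve only to make the arithmetic explicit.

```python
def about_range(value, variance=0.20):
    """Return (low, high) bounds for 'about value' using a fractional variance."""
    delta = abs(value) * variance
    return value - delta, value + delta


def is_about(candidate, stated, variance=0.20):
    """True if candidate falls within the 'about' range of the stated value."""
    low, high = about_range(stated, variance)
    return low <= candidate <= high


print(about_range(100))   # bounds for "about 100"
print(is_about(85, 100))  # 85 lies within 20% of 100
```

Under this reading, “about 100 amino acids” would cover lengths from 80 to 120 amino acids.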


DNA and Amino Acid Manipulation Methods and Purification Thereof

The practice of aspects of the present invention can employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Molecular Cloning A Laboratory Manual, 3rd Ed., ed. by Sambrook (2001), Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription and Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series, Methods In Enzymology (Academic Press, Inc., N.Y.), specifically, Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). All patents, patent applications and references cited herein are incorporated by reference in their entireties.


One skilled in the art can obtain a protein in several ways, which include, but are not limited to, isolating the protein via biochemical means or expressing a nucleotide sequence encoding the protein of interest by genetic engineering methods.


A protein is encoded by a nucleic acid (including, for example, genomic DNA, complementary DNA (cDNA), synthetic DNA, as well as any form of corresponding RNA). For example, it can be encoded by a recombinant nucleic acid of a gene. The proteins of the invention can be obtained from various sources and can be produced according to various techniques known in the art. For example, a nucleic acid that encodes a protein can be obtained by screening DNA libraries, or by amplification from a natural source. A protein can be a fragment or portion thereof. The nucleic acids encoding a protein can be produced via recombinant DNA technology and such recombinant nucleic acids can be prepared by conventional techniques, including chemical synthesis, genetic engineering, enzymatic techniques, or a combination thereof. For example, a fusion protein of the invention comprises a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. For example, the fusion protein can be an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein. An example of an EGFR-SEPT fusion protein is EGFR-SEPT14. In one embodiment, an EGFR-SEPT14 fusion polypeptide can have the amino acid sequence shown in SEQ ID NO: 1, 3, or 5. An example of an EGFR-PSPH fusion protein is a polypeptide having the amino acid sequence shown in SEQ ID NO: 7, 9, or 11. An example of an EGFR-CAND fusion protein is EGFR-CAND1. In one embodiment, an EGFR-CAND1 fusion polypeptide can have the amino acid sequence shown in SEQ ID NO: 13, 16, or 8495.


The GenBank ID for the EGFR gene is 1956. Four isoforms are listed for EGFR, e.g., having GenBank Accession Nos. NP_005219 (corresponding nucleotide sequence NM_005228); NP_958439 (corresponding nucleotide sequence NM_201282); NP_958440 (corresponding nucleotide sequence NM_201283); NP_958441 (corresponding nucleotide sequence NM_201284). The nucleotide and amino acid sequences can be readily obtained by one of ordinary skill in the art using the listed accession numbers.


The GenBank ID for the SEPT14 gene is 346288. The GenBank Accession No. for SEPT14 is NP_997249 (corresponding nucleotide sequence NM_207366). The nucleotide and amino acid sequences can be readily obtained by one of ordinary skill in the art using the listed accession numbers.


The GenBank ID for the PSPH gene is 5723. The GenBank Accession No. for PSPH is NP_004568 (corresponding nucleotide sequence NM_004577). The nucleotide and amino acid sequences can be readily obtained by one of ordinary skill in the art using the listed accession numbers.


The GenBank ID for the CAND1 gene is 55832. The GenBank Accession No. for CAND1 is NP_060918 (corresponding nucleotide sequence NM_018448). The nucleotide and amino acid sequences can be readily obtained by one of ordinary skill in the art using the listed accession numbers.


As used herein, an “EGFR fusion molecule” can be a nucleic acid which encodes a polypeptide corresponding to a fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. For example, an EGFR fusion molecule can include an EGFR-SEPT fusion (e.g., an EGFR-SEPT14 fusion polypeptide comprising the amino acid sequence shown in SEQ ID NO: 1, 3, or 5, or comprising the nucleic acid sequence shown in SEQ ID NO: 2 or 4); an EGFR-PSPH fusion (e.g., comprising the amino acid sequence shown in SEQ ID NO: 7, 9, or 11, or comprising the nucleic acid sequence shown in SEQ ID NO: 8 or 10); or an EGFR-CAND fusion (e.g., an EGFR-CAND1 fusion polypeptide comprising the amino acid sequence shown in SEQ ID NO: 13, 16, or 8495, or comprising the nucleic acid sequence shown in SEQ ID NO: 14 or 15). For example, an EGFR fusion molecule can include an EGFR-containing fusion comprising the amino acid sequence corresponding to GenBank Accession No. NP_005219, NP_958439, NP_958440, or NP_958441. An EGFR fusion molecule can also include a tyrosine kinase domain of an EGFR protein fused to a protein encoded by any one of the genes listed in Table 10. An EGFR fusion molecule can include a variant of the above described examples, such as a fragment thereof.









TABLE 10
Fusion Partners

Gene        Gene        Gene        Gene
ABCA13      C21orf29    CAMKK1      DNAJC6
ABCC1       CACNA1C     CAMSAP1     DYRK3
ABCC12      CACNA1G     CAMTA1      EIF2C2
ABCC6       CNTNAP4     CAP2        FAM184B
ABL1        CUL3        CCDC147     FREM2
ADAM12      DMD         CCDC158     GDPD2
ADCY10      DUSP27      CELF2       GLI3
ADCY2       ECE1        CILP        IL1RN
ADCY8       EYS         CMYA5       ISX
AGBL4       FAM172A     COL14A1     KIDINS220
AHNAK       FAM184B     CORO7       LRBA
ANXA7       EGFR4       CSMD2       LY75
AP4S1       ITGAV       CUL3        MDH2
AQP2        LRP1        DDI2        MMP12
ARMC6       LY75        DEPDC5      N4BP2L2
ATP5B       MAPKAP1     DEPDC7      NCF2
ATP6AP1L    MYT1        DI10L       NCOR1
ATP6V0D2    NCF2        DMD         NCRNA00157
ATXN1       NCOR1       EDA         NRXN3
BAHD1       NHSL2       EFHC1       PARP16
BBX         NKAIN2      EFS         PLA2G2F
BCA10       NR3C1       EIF2C2      PLEK2
C15orf23    NUP188      ENTPD2      PRKCH
C15orf33    OSBPL10     EYS         PTPRS
C21orf29    PACSIN1     FAM160A1    ROBO1
C2CD3       PARP16      MUSK        SASH3
C6orf170    PDZRN4      NEUROG1     SH3BP5
C7orf44     POLM        NHSL2       SLC44A2
CACNA1C     PPP1R3A     NR3C1       SLC5A4
CACNA1G     PSEN1       ODZ1        SNX5
FAM168A     PTPRD       PCDH12      SORCS2
FAM172A     PTPRS       PLCL1       SRRM1
FAM192A     RALYL       PLEKHM3     SSX3
FAM19A2     RERE        PLOD3       STAG2
FBXL4       RIMBP2      PRKCH       STK24
FH          RNF216      PSEN1       SURF6
FREM2       SDAD1       SEPT5       SYNPO2
GAPVD1      SEC14L3     SLC44A2     TAF1
GLI3        SH3RF3      SNTA1       TMEM80
GPR182      SLC9A1      USP48       TNFRSF10B
GSTA3       SMOC2       VSNL1       TTYH1
IGFBP3      SNX5        WDFY1       UNC93B1
ITGA9       TACC2       WISP2       VSNL1
ITGB2       SRGAP1      XRRA1       XRCC4
JOSD2       SSX3        LRRC4B      ZNF410
KIDINS220   SUMF1       LRRK2       TRIOBP
LAMA2       SYNPO2      MAPKAP1     TTYH1
LCLAT1      TNFRSF10B   MST1R       LRBA
LIN9


The nucleic acid can be any type of nucleic acid, including genomic DNA, complementary DNA (cDNA), recombinant DNA, synthetic or semi-synthetic DNA, as well as any form of corresponding RNA. A cDNA is a form of DNA artificially synthesized from a messenger RNA template and is used to produce gene clones. A synthetic DNA is free of modifications that can be found in cellular nucleic acids, which include, but are not limited to, histones and methylation. For example, a nucleic acid encoding an EGFR fusion molecule can comprise a recombinant nucleic acid encoding such a protein. The nucleic acid can be a non-naturally occurring nucleic acid created artificially (such as by assembling, cutting, ligating or amplifying sequences). It can be double-stranded or single-stranded.


The invention further provides for nucleic acids that are complementary to an EGFR fusion molecule. Complementary nucleic acids can hybridize to the nucleic acid sequence described above under stringent hybridization conditions. Non-limiting examples of stringent hybridization conditions include temperatures above 30° C., above 35° C., in excess of 42° C., and/or salinity of less than about 500 mM, or less than 200 mM. Hybridization conditions can be adjusted by the skilled artisan via modifying the temperature, salinity and/or the concentration of other reagents such as SDS or SSC.
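
By way of non-limiting illustration of complementarity only, the reverse complement of a DNA sequence can be computed as sketched below. The sketch assumes unambiguous A, C, G and T bases; the function name and the example sequence are hypothetical and are not taken from the Sequence Listing. It does not model hybridization stringency, temperature or salinity.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence of A/C/G/T bases."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported bases: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # hypothetical 4-base example
```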


According to the invention, protein variants can include amino acid sequence modifications. For example, amino acid sequence modifications fall into one or more of three classes: substitutional, insertional or deletional variants. Insertions can include amino and/or carboxyl terminal fusions as well as intrasequence insertions of single or multiple amino acid residues. Insertions ordinarily will be smaller insertions than those of amino or carboxyl terminal fusions, for example, on the order of one to four residues. Deletions are characterized by the removal of one or more amino acid residues from the protein sequence. These variants ordinarily are prepared by site-specific mutagenesis of nucleotides in the DNA encoding the protein, thereby producing DNA encoding the variant, and thereafter expressing the DNA in recombinant cell culture.


In one embodiment, an EGFR fusion molecule comprises a protein or polypeptide encoded by a nucleic acid sequence encoding an EGFR fusion molecule, such as the sequences shown in SEQ ID NOS: 2, 4, 8, 10, 14, or 15. In some embodiments, the nucleic acid sequence encoding an EGFR fusion molecule is about 70%, about 75%, about 80%, about 85%, about 90%, about 93%, about 95%, about 97%, about 98%, or about 99% identical to SEQ ID NOS: 2, 4, 8, 10, 14, or 15. In another embodiment, the polypeptide can be modified, such as by glycosylations and/or acetylations and/or chemical reaction or coupling, and can contain one or several non-natural or synthetic amino acids. An example of an EGFR fusion molecule is the polypeptide having the amino acid sequence shown in SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. In some embodiments, the EGFR fusion molecule that is a polypeptide is about 70%, about 75%, about 80%, about 85%, about 90%, about 93%, about 95%, about 97%, about 98%, or about 99% identical to SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. In another embodiment, an EGFR fusion molecule can be a fragment of an EGFR fusion protein. For example, the EGFR fusion molecule can encompass any portion of at least about 8 consecutive amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. The fragment can comprise at least about 10 amino acids, at least about 20 amino acids, at least about 30 amino acids, at least about 40 amino acids, at least about 50 amino acids, at least about 60 amino acids, or at least about 75 amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495.
Fragments include all possible amino acid lengths between about 8 and about 100 amino acids, for example, lengths between about 10 and about 100 amino acids, between about 15 and about 100 amino acids, between about 20 and about 100 amino acids, between about 35 and about 100 amino acids, between about 40 and about 100 amino acids, between about 50 and about 100 amino acids, between about 70 and about 100 amino acids, between about 75 and about 100 amino acids, or between about 80 and about 100 amino acids. Fragments include all possible amino acid lengths between about 100 and 800 amino acids, for example, lengths between about 125 and 800 amino acids, between about 150 and 800 amino acids, between about 175 and 800 amino acids, between about 200 and 800 amino acids, between about 225 and 800 amino acids, between about 250 and 800 amino acids, between about 275 and 800 amino acids, between about 300 and 800 amino acids, between about 325 and 800 amino acids, between about 350 and 800 amino acids, between about 375 and 800 amino acids, between about 400 and 800 amino acids, between about 425 and 800 amino acids, between about 450 and 800 amino acids, between about 475 and 800 amino acids, between about 500 and 800 amino acids, between about 525 and 800 amino acids, between about 550 and 800 amino acids, between about 575 and 800 amino acids, between about 600 and 800 amino acids, between about 625 and 800 amino acids, between about 650 and 800 amino acids, between about 675 and 800 amino acids, between about 700 and 800 amino acids, between about 725 and 800 amino acids, between about 750 and 800 amino acids, or between about 775 and 800 amino acids.
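
By way of non-limiting illustration, percent identity and the enumeration of fragments of consecutive amino acids can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length (no alignment is performed), and the function names and the short example sequences are hypothetical and are not sequences from the Sequence Listing.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def fragments(seq: str, length: int = 8):
    """All fragments of `length` consecutive residues; empty if seq is shorter."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 8-residue sequences differing at one position.
print(percent_identity("MRPSGTAG", "MRPSGTAA"))
# Hypothetical 10-residue sequence yields three 8-residue fragments.
print(fragments("ABCDEFGHIJ", 8))
```

In practice, identity between sequences of different lengths would first require an alignment step, which this sketch deliberately leaves out.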


Chemical Synthesis.


Nucleic acid sequences encoding an EGFR fusion molecule can be synthesized, in whole or in part, using chemical methods known in the art. Alternatively, a polypeptide can be produced using chemical methods to synthesize its amino acid sequence, such as by direct peptide synthesis using solid-phase techniques. Protein synthesis can either be performed using manual techniques or by automation. Automated synthesis can be achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer).


Optionally, polypeptide fragments can be separately synthesized and combined using chemical methods to produce a full-length molecule. For example, these methods can be utilized to synthesize a fusion protein of the invention. In one embodiment, a fusion protein of the invention comprises a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. For example, the fusion protein can be an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein. An example of an EGFR-SEPT fusion protein is EGFR-SEPT14. In one embodiment, an EGFR-SEPT14 fusion polypeptide can have the amino acid sequence shown in SEQ ID NO: 1, 3, or 5. An example of an EGFR-PSPH fusion protein is a polypeptide having the amino acid sequence shown in SEQ ID NO: 7, 9, or 11. An example of an EGFR-CAND fusion protein is EGFR-CAND1. In one embodiment, an EGFR-CAND1 fusion polypeptide can have the amino acid sequence shown in SEQ ID NO: 13, 16, or 8495.


Obtaining, Purifying and Detecting EGFR Fusion Molecules.


A polypeptide encoded by a nucleic acid, such as a nucleic acid encoding an EGFR fusion molecule, or a variant thereof, can be obtained by purification from human cells expressing a protein or polypeptide encoded by such a nucleic acid. Non-limiting purification methods include size exclusion chromatography, ammonium sulfate fractionation, ion exchange chromatography, affinity chromatography, and preparative gel electrophoresis.


A synthetic polypeptide can be substantially purified via high performance liquid chromatography (HPLC), such as ion exchange chromatography (IEX-HPLC). The composition of a synthetic polypeptide, such as an EGFR fusion molecule, can be confirmed by amino acid analysis or sequencing.


Other constructions can also be used to join a nucleic acid sequence encoding a polypeptide/protein of the claimed invention to a nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp., Seattle, Wash.). Including cleavable linker sequences (i.e., those specific for Factor Xa or enterokinase (Invitrogen, San Diego, Calif.)) between the purification domain and a polypeptide encoded by a nucleic acid of the invention also can be used to facilitate purification. For example, the skilled artisan can use an expression vector encoding 6 histidine residues that precede a thioredoxin or an enterokinase cleavage site in conjunction with a nucleic acid of interest. The histidine residues facilitate purification by immobilized metal ion affinity chromatography, while the enterokinase cleavage site provides a means for purifying the polypeptide encoded by, for example, an EGFR-SEPT, EGFR-CAND, EGFR-PSPH, or EGFR-containing, nucleic acid.


Host cells which contain a nucleic acid encoding an EGFR fusion molecule, and which subsequently express the same, can be identified by various procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques which include membrane, solution, or chip-based technologies for the detection and/or quantification of nucleic acid or protein. For example, the presence of a nucleic acid encoding an EGFR fusion molecule can be detected by DNA-DNA or DNA-RNA hybridization or amplification using probes or fragments of nucleic acids encoding the same. In one embodiment, a nucleic acid fragment of an EGFR fusion molecule can encompass any portion of at least about 8 consecutive nucleotides of SEQ ID NOS: 2, 8, or 14. In another embodiment, the fragment can comprise at least about 10 consecutive nucleotides, at least about 15 consecutive nucleotides, at least about 20 consecutive nucleotides, or at least about 30 consecutive nucleotides of SEQ ID NOS: 2, 8, or 14. Fragments can include all possible nucleotide lengths between about 8 and about 100 nucleotides, for example, lengths between about 15 and about 100 nucleotides, or between about 20 and about 100 nucleotides. Nucleic acid amplification-based assays involve the use of oligonucleotides selected from sequences encoding an EGFR fusion molecule to detect transformants which contain a nucleic acid encoding a protein or polypeptide of the same.


Protocols are known in the art for detecting and measuring the expression of a polypeptide encoded by a nucleic acid, such as a nucleic acid encoding an EGFR fusion molecule, using either polyclonal or monoclonal antibodies specific for the polypeptide. Non-limiting examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay using monoclonal antibodies reactive to two non-interfering epitopes on a polypeptide encoded by a nucleic acid, such as a nucleic acid encoding an EGFR fusion molecule, can be used, or a competitive binding assay can be employed.


Labeling and conjugation techniques are known by those skilled in the art and can be used in various nucleic acid and amino acid assays. Methods for producing labeled hybridization or PCR probes for detecting sequences related to nucleic acid sequences encoding a protein, such as EGFR fusion molecule, include, but are not limited to, oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, nucleic acid sequences, such as nucleic acids encoding an EGFR fusion molecule, can be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and can be used to synthesize RNA probes in vitro by addition of labeled nucleotides and an appropriate RNA polymerase such as T7, T3, or SP6. These procedures can be conducted using a variety of commercially available kits (Amersham Pharmacia Biotech, Promega, and US Biochemical). Suitable reporter molecules or labels which can be used for ease of detection include radionuclides, enzymes, and fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, and/or magnetic particles.


A fragment can be a fragment of a protein, such as an EGFR fusion protein. For example, a fragment of an EGFR fusion can encompass any portion of at least about 8 consecutive amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. The fragment can comprise at least about 10 consecutive amino acids, at least about 20 consecutive amino acids, at least about 30 consecutive amino acids, at least about 40 consecutive amino acids, at least about 50 consecutive amino acids, at least about 60 consecutive amino acids, at least about 70 consecutive amino acids, at least about 75 consecutive amino acids, at least about 80 consecutive amino acids, at least about 85 consecutive amino acids, at least about 90 consecutive amino acids, at least about 95 consecutive amino acids, at least about 100 consecutive amino acids, at least about 200 consecutive amino acids, at least about 300 consecutive amino acids, at least about 400 consecutive amino acids, at least about 500 consecutive amino acids, at least about 600 consecutive amino acids, at least about 700 consecutive amino acids, or at least about 800 consecutive amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. Fragments include all possible amino acid lengths between about 8 and about 100 amino acids, for example, lengths between about 10 and about 100 amino acids, between about 15 and about 100 amino acids, between about 20 and about 100 amino acids, between about 35 and about 100 amino acids, between about 40 and about 100 amino acids, between about 50 and about 100 amino acids, between about 70 and about 100 amino acids, between about 75 and about 100 amino acids, or between about 80 and about 100 amino acids.


Cell Transfection

Host cells transformed with a nucleic acid sequence of interest can be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The polypeptide produced by a transformed cell can be secreted or contained intracellularly depending on the sequence and/or the vector used. Expression vectors containing a nucleic acid sequence, such as a nucleic acid encoding an EGFR fusion molecule, can be designed to contain signal sequences which direct secretion of soluble polypeptide molecules encoded by the nucleic acid. Cell transfection and culturing methods are described in more detail below.


A eukaryotic expression vector can be used to transfect cells in order to produce proteins encoded by nucleotide sequences of the vector, e.g. those encoding an EGFR fusion molecule. Mammalian cells can contain an expression vector (for example, one that contains a nucleic acid encoding a fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein) via introducing the expression vector into an appropriate host cell via methods known in the art.


A host cell strain can be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed polypeptide encoded by a nucleic acid, in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” form of the polypeptide also can be used to facilitate correct insertion, folding and/or function. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC; 10801 University Boulevard, Manassas, Va. 20110-2209) and can be chosen to ensure the correct modification and processing of the foreign protein.


An exogenous nucleic acid can be introduced into a cell via a variety of techniques known in the art, such as lipofection, microinjection, calcium phosphate or calcium chloride precipitation, DEAE-dextran-mediated transfection, or electroporation. Electroporation is carried out at an appropriate voltage and capacitance to result in entry of the DNA construct(s) into cells of interest (such as glioma cells (cell line SF188), neuroblastoma cells (cell lines IMR-32, SK-N-SH, SH-F and SH-N), astrocytes and the like). Other transfection methods also include modified calcium phosphate precipitation, polybrene precipitation, liposome fusion, and receptor-mediated gene delivery.


Cells that will be genetically engineered can be primary and secondary cells obtained from various tissues, and include cell types which can be maintained and propagated in culture. Non-limiting examples of primary and secondary cells include epithelial cells, neural cells, endothelial cells, glial cells, fibroblasts, muscle cells (such as myoblasts), keratinocytes, formed elements of the blood (e.g., lymphocytes, bone marrow cells), and precursors of these somatic cell types.


Vertebrate tissue can be obtained by methods known to one skilled in the art, such as a punch biopsy or other surgical methods of obtaining a tissue source of the primary cell type of interest. In one embodiment, a punch biopsy or removal (e.g., by aspiration) can be used to obtain a source of cancer cells (for example, glioma cells, neuroblastoma cells, and the like). A mixture of primary cells can be obtained from the tissue, using methods readily practiced in the art, such as explanting or enzymatic digestion (for example, using enzymes such as pronase, trypsin, collagenase, elastase, dispase, and chymotrypsin). Biopsy methods have also been described in U.S. Pat. No. 7,419,661 and PCT application publication WO 2001/32840, each of which is hereby incorporated by reference.


Primary cells can be acquired from the individual to whom the genetically engineered primary or secondary cells are administered. However, primary cells can also be obtained from a donor, other than the recipient, of the same species. The cells can also be obtained from another species (for example, rabbit, cat, mouse, rat, sheep, goat, dog, horse, cow, bird, or pig). Primary cells can also include cells from an isolated or purified vertebrate tissue source grown attached to a tissue culture substrate (for example, flask or dish) or grown in a suspension; cells present in an explant derived from tissue; both of the aforementioned cell types plated for the first time; and cell culture suspensions derived from these plated cells. Secondary cells can be plated primary cells that are removed from the culture substrate and replated, or passaged, in addition to cells from the subsequent passages. Secondary cells can be passaged one or more times. These primary or secondary cells can contain expression vectors having a gene that encodes an EGFR fusion molecule.


Cell Culturing

Various culturing parameters can be used with respect to the host cell being cultured. Appropriate culture conditions for mammalian cells are well known in the art (Cleveland W L, et al., J Immunol Methods, 1983, 56(2): 221-234) or can be determined by the skilled artisan (see, for example, Animal Cell Culture: A Practical Approach 2nd Ed., Rickwood, D. and Hames, B. D., eds. (Oxford University Press: New York, 1992)). Cell culturing conditions can vary according to the type of host cell selected. Commercially available medium can be utilized. Non-limiting examples of medium include, for example, Minimal Essential Medium (MEM, Sigma, St. Louis, Mo.); Dulbecco's Modified Eagles Medium (DMEM, Sigma); Ham's F10 Medium (Sigma); HyClone cell culture medium (HyClone, Logan, Utah); RPMI-1640 Medium (Sigma); and chemically-defined (CD) media, which are formulated for various cell types, e.g., CD-CHO Medium (Invitrogen, Carlsbad, Calif.).


The cell culture media can be supplemented as necessary with supplementary components or ingredients, including optional components, in appropriate concentrations or amounts, as necessary or desired. Cell culture medium solutions provide at least one component from one or more of the following categories: (1) an energy source, usually in the form of a carbohydrate such as glucose; (2) all essential amino acids, and usually the basic set of twenty amino acids plus cysteine; (3) vitamins and/or other organic compounds required at low concentrations; (4) free fatty acids or lipids, for example linoleic acid; and (5) trace elements, where trace elements are defined as inorganic compounds or naturally occurring elements that can be required at very low concentrations, usually in the micromolar range.


The medium also can be supplemented electively with one or more components from any of the following categories: (1) salts, for example, magnesium, calcium, and phosphate; (2) hormones and other growth factors, such as serum, insulin, transferrin, and epidermal growth factor; (3) protein and tissue hydrolysates, for example, peptone or peptone mixtures which can be obtained from purified gelatin, plant material, or animal byproducts; (4) nucleosides and bases, such as adenosine, thymidine, and hypoxanthine; (5) buffers, such as HEPES; (6) antibiotics, such as gentamycin or ampicillin; (7) cell protective agents, for example, pluronic polyol; and (8) galactose. In one embodiment, soluble factors can be added to the culturing medium.


The mammalian cell culture that can be used with the present invention is prepared in a medium suitable for the type of cell being cultured. In one embodiment, the cell culture medium can be any one of those previously discussed (for example, MEM) that is supplemented with serum from a mammalian source (for example, fetal bovine serum (FBS)). In another embodiment, the medium can be a conditioned medium to sustain the growth of host cells.


Three-dimensional cultures can be formed from agar (such as Gey's Agar), hydrogels (such as matrigel, agarose, and the like; Lee et al., (2004) Biomaterials 25: 2461-2466) or polymers that are cross-linked. These polymers can comprise natural polymers and their derivatives, synthetic polymers and their derivatives, or a combination thereof. Natural polymers can be anionic polymers, cationic polymers, amphipathic polymers, or neutral polymers. Non-limiting examples of anionic polymers can include hyaluronic acid, alginic acid (alginate), carrageenan, chondroitin sulfate, dextran sulfate, and pectin. Some examples of cationic polymers include, but are not limited to, chitosan or polylysine (Peppas et al., (2006) Adv Mater. 18: 1345-60; Hoffman, A. S., (2002) Adv Drug Deliv Rev. 43: 3-12; Hoffman, A. S., (2001) Ann NY Acad Sci 944: 62-73). Examples of amphipathic polymers can include, but are not limited to, collagen, gelatin, fibrin, and carboxymethyl chitin. Non-limiting examples of neutral polymers can include dextran, agarose, or pullulan (Peppas et al., (2006) Adv Mater. 18: 1345-60; Hoffman, A. S., (2002) Adv Drug Deliv Rev. 43: 3-12; Hoffman, A. S., (2001) Ann NY Acad Sci 944: 62-73).


Cells to be cultured can harbor introduced expression vectors, such as plasmids. The expression vector constructs can be introduced via transformation, microinjection, transfection, lipofection, electroporation, or infection. The expression vectors can contain coding sequences, or portions thereof, encoding the proteins for expression and production. Expression vectors containing sequences encoding the produced proteins and polypeptides, as well as the appropriate transcriptional and translational control elements, can be generated using methods well known to and practiced by those skilled in the art. These methods include synthetic techniques, in vitro recombinant DNA techniques, and in vivo genetic recombination which are described in J. Sambrook et al., 2001, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y. and in F. M. Ausubel et al., 1989, Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.


EGFR Fusion Molecule Inhibitors

The invention provides methods for use of compounds that decrease the expression level or activity of an EGFR fusion molecule in a subject. In addition, the invention provides methods for using compounds for the treatment of a gene-fusion associated cancer. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma.


As used herein, an “EGFR fusion molecule inhibitor” refers to a compound that interacts with an EGFR fusion molecule of the invention and modulates its activity and/or its expression. For example, the compound can decrease the activity or expression of an EGFR fusion molecule. The compound can be an antagonist of an EGFR fusion molecule (e.g., an EGFR fusion molecule inhibitor). Some non-limiting examples of EGFR fusion molecule inhibitors include peptides (such as peptide fragments comprising an EGFR fusion molecule, or antibodies or fragments thereof), small molecules, and nucleic acids (such as siRNA or antisense RNA specific for a nucleic acid comprising an EGFR fusion molecule). Antagonists of an EGFR fusion molecule decrease the amount or the duration of the activity of an EGFR fusion protein. In one embodiment, the fusion protein comprises a tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein (e.g., EGFR-SEPT (such as EGFR-SEPT14), EGFR-PSPH, or EGFR-CAND (such as EGFR-CAND1)). Antagonists include proteins, nucleic acids, antibodies, small molecules, or any other molecule which decreases the activity of an EGFR fusion molecule.


The term “modulate,” as it appears herein, refers to a change in the activity or expression of an EGFR fusion molecule. For example, modulation can cause a decrease in protein activity, binding characteristics, or any other biological, functional, or immunological properties of an EGFR fusion molecule, such as an EGFR fusion protein.


In one embodiment, an EGFR fusion molecule inhibitor can be a peptide fragment of an EGFR fusion protein that binds to the protein itself.


For example, the EGFR fusion polypeptide can encompass any portion of at least about 8 consecutive amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. The fragment can comprise at least about 10 consecutive amino acids, at least about 20 consecutive amino acids, at least about 30 consecutive amino acids, at least about 40 consecutive amino acids, at least about 50 consecutive amino acids, at least about 60 consecutive amino acids, at least about 70 consecutive amino acids, at least about 75 consecutive amino acids, at least about 80 consecutive amino acids, at least about 85 consecutive amino acids, at least about 90 consecutive amino acids, at least about 95 consecutive amino acids, at least about 100 consecutive amino acids, at least about 200 consecutive amino acids, at least about 300 consecutive amino acids, at least about 400 consecutive amino acids, at least about 500 consecutive amino acids, at least about 600 consecutive amino acids, at least about 700 consecutive amino acids, or at least about 800 consecutive amino acids of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. Fragments include all possible amino acid lengths between about 8 and about 100 amino acids, for example, lengths between about 10 and about 100 amino acids, between about 15 and about 100 amino acids, between about 20 and about 100 amino acids, between about 35 and about 100 amino acids, between about 40 and about 100 amino acids, between about 50 and about 100 amino acids, between about 70 and about 100 amino acids, between about 75 and about 100 amino acids, or between about 80 and about 100 amino acids. These peptide fragments can be obtained commercially or synthesized via liquid phase or solid phase synthesis methods (Atherton et al., (1989) Solid Phase Peptide Synthesis: a Practical Approach. IRL Press, Oxford, England). The EGFR fusion peptide fragments can be isolated from a natural source, genetically engineered, or chemically prepared. These methods are well known in the art.


An EGFR fusion molecule inhibitor can be a protein, such as an antibody (monoclonal, polyclonal, humanized, chimeric, or fully human), or a binding fragment thereof, directed against an EGFR fusion molecule. An antibody fragment can be a form of an antibody other than the full-length form and includes portions or components that exist within full-length antibodies, in addition to antibody fragments that have been engineered. Antibody fragments can include, but are not limited to, single chain Fv (scFv), diabodies, Fv, and (Fab′)2, triabodies, Fc, Fab, CDR1, CDR2, CDR3, combinations of CDR's, variable regions, tetrabodies, bifunctional hybrid antibodies, framework regions, constant regions, and the like (see, Maynard et al., (2000) Ann. Rev. Biomed. Eng. 2:339-76; Hudson (1998) Curr. Opin. Biotechnol. 9:395-402). Antibodies can be obtained commercially, custom generated, or synthesized against an antigen of interest according to methods established in the art (see U.S. Pat. Nos. 6,914,128, 5,780,597, and 5,811,523; Roland E. Kontermann and Stefan Dübel (editors), Antibody Engineering, Vol. I & II, (2010) 2nd ed., Springer; Antony S. Dimitrov (editor), Therapeutic Antibodies: Methods and Protocols (Methods in Molecular Biology), (2009), Humana Press; Benny Lo (editor) Antibody Engineering: Methods and Protocols (Methods in Molecular Biology), (2004) Humana Press, each of which are hereby incorporated by reference in their entireties). For example, antibodies directed to an EGFR fusion molecule can be obtained commercially from Abcam, Santa Cruz Biotechnology, Abgent, R&D Systems, Novus Biologicals, etc. Human antibodies directed to an EGFR fusion molecule (such as monoclonal, humanized, fully human, or chimeric antibodies) can be useful antibody therapeutics for use in humans. In one embodiment, an antibody or binding fragment thereof is directed against SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495.


Inhibition of RNA encoding an EGFR fusion molecule can effectively modulate the expression of an EGFR fusion molecule. Inhibitors are selected from the group comprising: siRNA; interfering RNA or RNAi; dsRNA; RNA Polymerase III transcribed DNAs; ribozymes; and antisense nucleic acids, which can be RNA, DNA, or an artificial nucleic acid.


Antisense oligonucleotides, including antisense DNA, RNA, and DNA/RNA molecules, act to directly block the translation of mRNA by binding to targeted mRNA and preventing protein translation. For example, antisense oligonucleotides of at least about 15 bases and complementary to unique regions of the DNA sequence encoding an EGFR fusion molecule can be synthesized, e.g., by conventional phosphodiester techniques (Dallas et al., (2006) Med. Sci. Monit. 12(4):RA67-74; Kalota et al., (2006) Handb. Exp. Pharmacol. 173:173-96; Lutzelburger et al., (2006) Handb. Exp. Pharmacol. 173:243-59). Antisense nucleotide sequences include, but are not limited to: morpholinos, 2′-O-methyl polynucleotides, DNA, RNA and the like.
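
As a minimal illustrative sketch, an antisense DNA oligonucleotide complementary to a chosen window of a target mRNA can be derived computationally as the reverse complement of that window. The function name `antisense_dna` and the toy target string below are hypothetical; the toy string is not one of the sequences of the Sequence Listing. Selecting a window that is unique to the target, as contemplated above, is a separate step that is not shown.

```python
# Complement table for reading an RNA (or DNA) target and writing a DNA oligo.
DNA_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(target: str, start: int, length: int = 15) -> str:
    """Return the antisense DNA oligo (written 5'->3') for target[start:start+length].

    The default length of 15 follows the "at least about 15 bases" language
    in the text; it is a parameter, not a fixed requirement.
    """
    region = target.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("target is too short for the requested window")
    # Reverse the window, then complement each base.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(region))


if __name__ == "__main__":
    toy_mrna = "AUGGCCAAGUUCGAUCGGA"  # hypothetical toy sequence
    print(antisense_dna(toy_mrna, start=0, length=15))  # ATCGAACTTGGCCAT
```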


siRNA comprises a double stranded structure containing from about 15 to about 50 base pairs, for example from about 21 to about 25 base pairs, and having a nucleotide sequence identical or nearly identical to an expressed target gene or RNA within the cell. The siRNA comprises a sense RNA strand and a complementary antisense RNA strand annealed together by standard Watson-Crick base-pairing interactions. The sense strand comprises a nucleic acid sequence which is substantially identical to a nucleic acid sequence contained within the target mRNA molecule. “Substantially identical” to a target sequence contained within the target mRNA refers to a nucleic acid sequence that differs from the target sequence by about 3% or less. The sense and antisense strands of the siRNA can comprise two complementary, single-stranded RNA molecules, or can comprise a single molecule in which two complementary portions are base-paired and are covalently linked by a single-stranded “hairpin” area. See also, McManus and Sharp (2002) Nat Rev Genetics, 3:737-47, and Sen and Blau (2006) FASEB J., 20:1293-99, the entire disclosures of which are herein incorporated by reference.
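
The "about 3% or less" criterion can be illustrated with a short Python sketch. The function names, the toy sequences, and the treatment of 3% as a hard cutoff on the fraction of mismatched positions are assumptions made for illustration; the text says "about 3%" and does not prescribe a particular calculation.

```python
def fraction_mismatched(sense: str, target_region: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    s = sense.upper().replace("T", "U")
    t = target_region.upper().replace("T", "U")
    if not s or len(s) != len(t):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for a, b in zip(s, t) if a != b) / len(s)


def is_substantially_identical(sense: str, target_region: str,
                               max_fraction: float = 0.03) -> bool:
    """True if the sense strand differs from the target region by <= max_fraction.

    Assumption: 0.03 is used as a strict threshold for "about 3% or less".
    """
    return fraction_mismatched(sense, target_region) <= max_fraction


if __name__ == "__main__":
    target_21 = "GCAUGCAUGCAUGCAUGCAUG"   # hypothetical 21-nt window
    one_off_21 = "GCAUGCAUGCAUGCAUGCAUC"  # one mismatch: 1/21, about 4.8%
    print(is_substantially_identical(target_21, target_21))   # True
    print(is_substantially_identical(one_off_21, target_21))  # False
```

Under this reading, a single mismatch in a 21-nucleotide sense strand already exceeds 3%, whereas a single mismatch in a 50-nucleotide strand (2%) does not.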


The siRNA can be altered RNA that differs from naturally-occurring RNA by the addition, deletion, substitution and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of the siRNA or to one or more internal nucleotides of the siRNA, or modifications that make the siRNA resistant to nuclease digestion, or the substitution of one or more nucleotides in the siRNA with deoxyribo-nucleotides. One or both strands of the siRNA can also comprise a 3′ overhang. As used herein, a 3′ overhang refers to at least one unpaired nucleotide extending from the 3′-end of a duplexed RNA strand. For example, the siRNA can comprise at least one 3′ overhang of from 1 to about 6 nucleotides (which includes ribonucleotides or deoxyribonucleotides) in length, or from 1 to about 5 nucleotides in length, or from 1 to about 4 nucleotides in length, or from about 2 to about 4 nucleotides in length. For example, each strand of the siRNA can comprise 3′ overhangs of dithymidylic acid (“TT”) or diuridylic acid (“uu”).
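
For illustration, the two strands of an siRNA duplex carrying 3′ overhangs can be assembled from a target mRNA window as sketched below. The function name `sirna_duplex` and the 19-nucleotide toy window are hypothetical, and the default "TT" overhang is only one of the options mentioned above. Both strands are written 5′ to 3′, and the overhang length check mirrors the "from 1 to about 6 nucleotides" range.

```python
from typing import Tuple

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def sirna_duplex(target_rna: str, overhang: str = "TT") -> Tuple[str, str]:
    """Return (sense, antisense) strands, each 5'->3', with a 3' overhang appended.

    The sense core copies the target window; the antisense core is its
    reverse complement, so the two cores base-pair over their full length.
    """
    if not 1 <= len(overhang) <= 6:
        raise ValueError("overhang length outside the 1 to 6 nucleotide range")
    core = target_rna.upper().replace("T", "U")
    antisense_core = "".join(RNA_COMPLEMENT[base] for base in reversed(core))
    return core + overhang, antisense_core + overhang


if __name__ == "__main__":
    sense, antisense = sirna_duplex("GCAUGCAUGCAUGCAUGCA")  # toy 19-nt window
    print(sense)      # GCAUGCAUGCAUGCAUGCATT
    print(antisense)  # UGCAUGCAUGCAUGCAUGCTT
```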


siRNA can be produced chemically or biologically, or can be expressed from a recombinant plasmid or viral vector (for example, see U.S. Pat. No. 7,294,504 and U.S. Pat. No. 7,422,896, the entire disclosures of which are herein incorporated by reference). Exemplary methods for producing and testing dsRNA or siRNA molecules are described in U.S. Patent Application Publication No. 2002/0173478 to Gewirtz, U.S. Pat. No. 8,071,559 to Hannon et al., and in U.S. Pat. No. 7,148,342 to Tolentino et al., the entire disclosures of which are herein incorporated by reference.


In one embodiment, an siRNA directed to a human nucleic acid sequence comprising an EGFR fusion molecule can be generated against any one of SEQ ID NOS: 2, 4, 8, 10, 14, or 15. In another embodiment, an siRNA directed to a human nucleic acid sequence comprising a breakpoint of an EGFR fusion molecule can be generated against any one of SEQ ID NOS: 4, 10, or 15.


RNA polymerase III transcribed DNAs contain promoters, such as the U6 promoter. These DNAs can be transcribed to produce small hairpin RNAs in the cell that can function as siRNA or linear RNAs, which can function as antisense RNA. The EGFR fusion molecule inhibitor can comprise ribonucleotides, deoxyribonucleotides, synthetic nucleotides, or any suitable combination such that the target RNA and/or gene is inhibited. In addition, these forms of nucleic acid can be single, double, triple, or quadruple stranded. (See for example Bass (2001) Nature, 411:428-429; Elbashir et al., (2001) Nature, 411:494-498; U.S. Pat. No. 6,509,154; U.S. Patent Application Publication No. 2003/0027783; and PCT Publication Nos. WO 00/044895, WO 99/032619, WO 00/01846, WO 01/029058, WO 00/044914).


An EGFR fusion molecule inhibitor can be a small molecule that binds to an EGFR fusion protein described herein and disrupts its function. Small molecules are a diverse group of synthetic and natural substances generally having low molecular weights. They can be isolated from natural sources (for example, plants, fungi, microbes and the like), obtained commercially and/or available as libraries or collections, or synthesized. Candidate small molecules that inhibit an EGFR fusion protein can be identified via in silico screening or high-throughput (HTP) screening of combinatorial libraries according to methods established in the art (e.g., see Potyrailo et al., (2011) ACS Comb Sci. 13(6):579-633; Mensch et al., (2009) J Pharm Sci. 98(12):4429-68; Schnur (2008) Curr Opin Drug Discov Devel. 11(3):375-80; and Jhoti (2007) Ernst Schering Found Symp Proc. (3):169-85, each of which is hereby incorporated by reference in its entirety). Most conventional pharmaceuticals, such as aspirin, penicillin, and many chemotherapeutics, are small molecules, can be obtained commercially, can be chemically synthesized, or can be obtained from random or combinatorial libraries as described below (see, e.g., Werner et al., (2006) Brief Funct. Genomic Proteomic 5(1):32-6).


Non-limiting examples of EGFR fusion molecule inhibitors include the EGFR inhibitors AZD4547 (see Gavine et al., (2012) Cancer Res, 72(8); 2045-56; see also PCT Application Publication No. WO2008/075068, each of which are hereby incorporated by reference in their entireties); NVP-BGJ398 (see Guagnano et al., (2011) J. Med. Chem., 54:7066-7083; see also U.S. Patent Application Publication No. 2008-0312248 A1, each of which are hereby incorporated by reference in their entireties); PD173074 (see Guagnano et al., (2011) J. Med. Chem., 54:7066-7083; see also Mohammadi et al., (1998) EMBO J., 17:5896-5904, each of which are hereby incorporated by reference in their entireties); NF449 (EMD Millipore (Billerica, Mass.) Cat. No. 480420; see also Krejci, (2010) the Journal of Biological Chemistry, 285(27):20644-20653, which is hereby incorporated by reference in its entirety); LY2874455 (Active Biochem; see Zhao et al. (2011) Mol Cancer Ther. (11):2200-10; see also PCT Application Publication No. WO 2010129509, each of which are hereby incorporated by reference in their entireties); TKI258 (Dovitinib); BIBF-1120 (Intedanib-Vargatef); BMS-582664 (Brivanib alaninate); AZD-2171 (Cediranib); TSU-68 (Orantinib); AB-1010 (Masitinib); AP-24534 (Ponatinib); and E-7080 (by Eisai). A non-limiting example of an EGFR fusion molecule inhibitor includes the inhibitor KHS101 (Wurdak et al., (2010) PNAS, 107(38): 16542-47, which is hereby incorporated by reference in its entirety).


Structures of EGFR fusion molecule inhibitors useful for the invention include, but are not limited to: the EGFR inhibitor AZD4547,


embedded image (structure of AZD4547)


the EGFR inhibitor NVP-BGJ398,


embedded image (structure of NVP-BGJ398)


the EGFR inhibitor PD173074,


embedded image (structure of PD173074)


the EGFR inhibitor LY2874455,


embedded image (structure of LY2874455)


and the EGFR inhibitor NF449 (EMD Millipore (Billerica, Mass.) Cat. No. 480420),


embedded image (structure of NF449)


Other EGFR inhibitors include, but are not limited to:


embedded image


embedded image


A structure of an EGFR fusion molecule inhibitor useful for the invention includes, but is not limited to, the inhibitor KHS101,


embedded image (structure of KHS101)


Assessment and Therapeutic Treatment

The invention provides a method of decreasing the growth of a solid tumor in a subject. The tumor can be associated with, but is not limited to, glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the method comprises detecting the presence of an EGFR fusion molecule in a sample obtained from a subject. In some embodiments, the sample is incubated with an agent that binds to an EGFR fusion molecule, such as an antibody, a probe, a nucleic acid primer, and the like. In further embodiments, the method comprises administering to the subject an effective amount of an EGFR fusion molecule inhibitor, wherein the inhibitor decreases the size of the solid tumor.


The invention also provides a method for treating or preventing a gene-fusion associated cancer in a subject, such as, but not limited to, glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the method comprises detecting the presence of an EGFR fusion molecule in a sample obtained from a subject, the presence of the fusion being indicative of a gene-fusion associated cancer, and administering to the subject in need thereof a therapeutic treatment against a gene-fusion associated cancer. In some embodiments, the sample is incubated with an agent that binds to an EGFR fusion molecule, such as an antibody, a probe, a nucleic acid primer, and the like.


The invention also provides a method for decreasing in a subject in need thereof the expression level or activity of a fusion protein comprising the tyrosine kinase domain of an EGFR protein fused to a polypeptide that constitutively activates the tyrosine kinase domain of the EGFR protein. In some embodiments, the method comprises obtaining a biological sample from the subject. In some embodiments, the sample is incubated with an agent that binds to an EGFR fusion molecule, such as an antibody, a probe, a nucleic acid primer, and the like. In some embodiments, the method comprises administering to the subject a therapeutic amount of a composition comprising an admixture of a pharmaceutically acceptable carrier and an inhibitor of the fusion protein of the invention. In another embodiment, the method further comprises determining the fusion protein expression level or activity. In another embodiment, the method further comprises detecting whether the fusion protein expression level or activity is decreased as compared to the fusion protein expression level or activity prior to administration of the composition, thereby decreasing the expression level or activity of the fusion protein. In some embodiments, the fusion protein is an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or an EGFR-SEPT fusion protein.
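

As an illustration only, the comparison of fusion protein expression level or activity before and after administration can be sketched as simple arithmetic. The function names, the use of replicate means, and the numeric values below are assumptions made for this sketch; the specification does not prescribe a particular measurement unit, statistic, or cutoff.

```python
from statistics import mean


def percent_change(before, after):
    """Percent change of the mean post-administration level relative to baseline.

    `before` and `after` are lists of replicate measurements (hypothetical units).
    A negative result means the level decreased.
    """
    baseline = mean(before)
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    return 100.0 * (mean(after) - baseline) / baseline


def is_decreased(before, after):
    """True if the mean level after administration is below the mean baseline."""
    return mean(after) < mean(before)


if __name__ == "__main__":
    # Hypothetical replicate measurements, for illustration only.
    pre = [10.0, 10.0]
    post = [5.0, 5.0]
    print(percent_change(pre, post), is_decreased(pre, post))
```

A real assay would normally add a normalization control and a statistical test; both are omitted here to keep the sketch minimal.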


The administering step in each of the claimed methods can comprise a drug administration, such as administration of an EGFR fusion molecule inhibitor (for example, a pharmaceutical composition comprising an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein; or an siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND fusion gene). In one embodiment, the therapeutic molecule to be administered comprises a polypeptide of an EGFR fusion molecule, comprising at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 93%, at least about 95%, at least about 97%, at least about 98%, at least about 99%, or 100% of the amino acid sequence of SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495, and exhibits the function of decreasing expression of such a protein, thus treating a gene fusion-associated cancer. In another embodiment, administration of the therapeutic molecule decreases the size of the solid tumor associated with glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In a further embodiment, administration of the therapeutic molecule decreases cell proliferation in a subject afflicted with a gene-fusion associated cancer.


In another embodiment, the therapeutic molecule to be administered comprises an siRNA directed to a human nucleic acid sequence comprising an EGFR fusion molecule. In one embodiment, the siRNA is directed to any one of SEQ ID NOS: 2, 4, 8, 10, 14, or 15. In a further embodiment, the therapeutic molecule to be administered comprises an antibody or binding fragment thereof that is directed against SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 16, or 8495. In some embodiments, the therapeutic molecule to be administered comprises a small molecule that specifically binds to an EGFR protein, such as AZD4547, NVP-BGJ398, PD173074, NF449, TKI258, BIBF-1120, BMS-582664, AZD-2171, TSU-68, AB-1010, AP-24534, E-7080, or LY2874455.


An EGFR fusion molecule, for example, a fusion between EGFR and SEPT, PSPH, or CAND, can be determined at the level of the DNA, RNA, or polypeptide. Optionally, detection can be determined by performing an oligonucleotide ligation assay, a conformation-based assay, a hybridization assay, a sequencing assay, an allele-specific amplification assay, a microsequencing assay, a melting curve analysis, a denaturing high performance liquid chromatography (DHPLC) assay (for example, see Jones et al., (2000) Hum Genet., 106(6):663-8), or a combination thereof. In one embodiment, the detection is performed by sequencing all or part of an EGFR fusion molecule (e.g., an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion), or by selective hybridization or amplification of all or part of an EGFR fusion molecule (e.g., an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion). An EGFR fusion molecule specific amplification (e.g., an EGFR-SEPT (such as an EGFR-SEPT14), EGFR-CAND (such as an EGFR-CAND1), or EGFR-PSPH nucleic acid specific amplification) can be carried out before the fusion identification step.


The invention provides for a method of detecting a chromosomal alteration in a subject afflicted with a gene-fusion associated cancer. In some embodiments, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. In one embodiment, the chromosomal alteration is an in-frame fused transcript described herein, for example an EGFR fusion molecule. An alteration in a chromosome region occupied by an EGFR fusion molecule such as a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion, can be any form of mutation(s), deletion(s), rearrangement(s) and/or insertion(s) in the coding and/or non-coding region of the locus, alone or in various combination(s). Mutations can include point mutations. Insertions can encompass the addition of one or several residues in a coding or non-coding portion of the gene locus. Insertions can comprise an addition of between 1 and 50 base pairs in the gene locus. Deletions can encompass any region of one, two or more residues in a coding or non-coding portion of the gene locus, such as from two residues up to the entire gene or locus. Deletions can affect smaller regions, such as domains (introns) or repeated sequences or fragments of less than about 50 consecutive base pairs, although larger deletions can occur as well. Rearrangement includes inversion of sequences. The alteration in a chromosome region occupied by an EGFR fusion molecule, e.g., a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion, can result in amino acid substitutions, altered RNA splicing or processing, product instability, the creation of stop codons, production of oncogenic fusion proteins, frame-shift mutations, and/or truncated polypeptide production.
The alteration can result in the production of an EGFR fusion molecule, for example, a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion, with altered function, stability, targeting or structure. The alteration can also cause a reduction, or even an increase in protein expression. In one embodiment, the alteration in the chromosome region occupied by an EGFR fusion molecule can comprise a chromosomal rearrangement resulting in the production of an EGFR fusion molecule, such as an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion. This alteration can be determined at the level of the DNA, RNA, or polypeptide. In another embodiment, the detection or determination comprises nucleic acid sequencing, selective hybridization, selective amplification, gene expression analysis, or a combination thereof. In another embodiment, the detection or determination comprises protein expression analysis, for example by western blot analysis, ELISA, or other antibody detection methods.


The present invention provides a method for treating a gene-fusion associated cancer in a subject in need thereof. In one embodiment, the method comprises obtaining a sample from the subject to determine the level of expression of an EGFR fusion molecule in the subject. In some embodiments, the sample is incubated with an agent that binds to an EGFR fusion molecule, such as an antibody, a probe, a nucleic acid primer, and the like. In another embodiment, the detection or determination comprises nucleic acid sequencing, selective hybridization, selective amplification, gene expression analysis, or a combination thereof. In another embodiment, the detection or determination comprises protein expression analysis, for example by western blot analysis, ELISA, or other antibody detection methods. In some embodiments, the method further comprises assessing whether to administer an EGFR fusion molecule inhibitor based on the expression pattern of the subject. In further embodiments, the method comprises administering an EGFR fusion molecule inhibitor to the subject. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma.


In one embodiment, the invention provides for a method of detecting the presence of altered RNA expression of an EGFR fusion molecule in a subject, for example, one afflicted with a gene-fusion associated cancer. In another embodiment, the invention provides for a method of detecting the presence of an EGFR fusion molecule in a subject. In some embodiments, the method comprises obtaining a sample from the subject to determine whether the subject expresses an EGFR fusion molecule. In some embodiments, the sample is incubated with an agent that binds to an EGFR fusion molecule, such as an antibody, a probe, a nucleic acid primer, and the like. In other embodiments, the detection or determination comprises nucleic acid sequencing, selective hybridization, selective amplification, gene expression analysis, or a combination thereof. In another embodiment, the detection or determination comprises protein expression analysis, for example by western blot analysis, ELISA, or other antibody detection methods. In some embodiments, the method further comprises assessing whether to administer an EGFR fusion molecule inhibitor based on the expression pattern of the subject. In further embodiments, the method comprises administering an EGFR fusion molecule inhibitor to the subject. Altered RNA expression includes the presence of an altered RNA sequence, the presence of an altered RNA splicing or processing, or the presence of an altered quantity of RNA. These can be detected by various techniques known in the art, including sequencing all or part of the RNA or by selective hybridization or selective amplification of all or part of the RNA.


In a further embodiment, the method can comprise detecting the presence or expression of an EGFR fusion molecule, such as a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion. Altered polypeptide expression includes the presence of an altered polypeptide sequence, the presence of an altered quantity of polypeptide, or the presence of an altered tissue distribution. These can be detected by various techniques known in the art, including by sequencing and/or binding to specific ligands (such as antibodies). In one embodiment, the detecting comprises using a northern blot; real time PCR and primers directed to SEQ ID NOS: 2, 4, 8, 10, 14, or 15; a ribonuclease protection assay; a hybridization, amplification, or sequencing technique to detect an EGFR fusion molecule, such as one comprising SEQ ID NOS: 2, 4, 8, 10, 14, or 15; or a combination thereof. In another embodiment, the PCR primers comprise SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, or 89. In a further embodiment, primers used for the screening of EGFR fusion molecules, comprise SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, or 89. In some embodiments, primers used for genomic detection of an EGFR fusion comprise SEQ ID NOS 40, 41, 42, 43, 44, 45, or 89.


Various techniques known in the art can be used to detect or quantify altered gene or RNA expression or nucleic acid sequences, which include, but are not limited to, hybridization, sequencing, amplification, and/or binding to specific ligands (such as antibodies). Other suitable methods include allele-specific oligonucleotide (ASO), oligonucleotide ligation, allele-specific amplification, Southern blot (for DNAs), Northern blot (for RNAs), single-stranded conformation analysis (SSCA), PFGE, fluorescent in situ hybridization (FISH), gel migration, clamped denaturing gel electrophoresis, denaturing HPLC, melting curve analysis, heteroduplex analysis, RNase protection, chemical or enzymatic mismatch cleavage, ELISA, radio-immunoassays (RIA) and immuno-enzymatic assays (IEMA).


Some of these approaches (such as SSCA and constant gradient gel electrophoresis (CGGE)) are based on a change in electrophoretic mobility of the nucleic acids, as a result of the presence of an altered sequence. According to these techniques, the altered sequence is visualized by a shift in mobility on gels. The fragments can then be sequenced to confirm the alteration. Some other approaches are based on specific hybridization between nucleic acids from the subject and a probe specific for wild type or altered gene or RNA. The probe can be in suspension or immobilized on a substrate. The probe can be labeled to facilitate detection of hybrids. Some of these approaches are suited for assessing a polypeptide sequence or expression level, such as Western blot, ELISA and RIA. The latter require the use of a ligand specific for the polypeptide, for example, the use of a specific antibody.


Hybridization.


Hybridization detection methods are based on the formation of specific hybrids between complementary nucleic acid sequences that serve to detect nucleic acid sequence alteration(s). A detection technique involves the use of a nucleic acid probe specific for a wild type or altered gene or RNA, followed by the detection of the presence of a hybrid. The probe can be in suspension or immobilized on a substrate or support (for example, as in nucleic acid array or chips technologies). The probe can be labeled to facilitate detection of hybrids. In one embodiment, the probe according to the invention can comprise a nucleic acid directed to SEQ ID NOS: 2, 4, 8, 10, 14, or 15. For example, a sample from the subject can be contacted with a nucleic acid probe specific for a gene encoding an EGFR fusion molecule, and the formation of a hybrid can be subsequently assessed. In one embodiment, the method comprises contacting simultaneously the sample with a set of probes that are specific for an EGFR fusion molecule. Also, various samples from various subjects can be investigated in parallel.


According to the invention, a probe can be a polynucleotide sequence which is complementary to and specifically hybridizes with a gene or RNA, or a target portion thereof, corresponding to an EGFR fusion molecule. Useful probes are those that are complementary to the gene, RNA, or target portion thereof. Probes can comprise single-stranded nucleic acids of between 8 and 1000 nucleotides in length, for instance between 10 and 800, between 15 and 700, or between 20 and 500. Longer probes can be used as well. A useful probe of the invention is a single stranded nucleic acid molecule of between 8 and 500 nucleotides in length, which can specifically hybridize to a region of a gene or RNA that corresponds to an EGFR fusion molecule.
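

By way of a hedged illustration, the notions of probe complementarity, probe length, and parallel screening of several samples with a set of probes can be sketched in silico as follows. The sequences are short invented toy strings, not any of the SEQ ID NOS of the Sequence Listing, and the function and variable names are hypothetical. Hybridization is modeled here as perfect Watson-Crick complementarity, which is a simplifying assumption; real hybridization depends on stringency conditions.

```python
MIN_PROBE_LEN, MAX_PROBE_LEN = 8, 1000  # probe length range stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridizes(probe, target):
    """True if the probe is perfectly complementary to some stretch of the target."""
    if not MIN_PROBE_LEN <= len(probe) <= MAX_PROBE_LEN:
        raise ValueError("probe length outside the 8 to 1000 nucleotide range")
    return reverse_complement(probe) in target.upper()


def screen(samples, probes):
    """Test every probe against every sample; returns {sample: {probe: bool}}."""
    return {
        sample_name: {
            probe_name: hybridizes(probe_seq, sample_seq)
            for probe_name, probe_seq in probes.items()
        }
        for sample_name, sample_seq in samples.items()
    }


if __name__ == "__main__":
    # Toy sequences (assumptions): a "fusion" transcript and a non-fused one.
    samples = {
        "tumor": "ATGGCCTTGACCGTAGGCTACGTTAGCATCGGATTCAGGA",
        "normal": "ATGGCCTTGACCGTAGGCTAGGGAAACCCTTT",
    }
    probes = {
        # Probe complementary to a stretch spanning the toy fusion junction.
        "junction": reverse_complement("GGCTACGTTAGC"),
        # Probe complementary to a stretch shared by both toy sequences.
        "upstream": reverse_complement("ATGGCCTTGACC"),
    }
    for sample, result in screen(samples, probes).items():
        print(sample, result)
```

In this sketch only a probe that spans the junction distinguishes the fused sequence from the non-fused one, which is why junction-spanning probes are the informative ones in such a panel.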


The sequence of the probes can be derived from the sequences of the EGFR fusion genes provided herein. Nucleotide substitutions can be performed, as well as chemical modifications of the probe. Such chemical modifications can be accomplished to increase the stability of hybrids (e.g., intercalating groups) or to label the probe. Some examples of labels include, without limitation, radioactivity, fluorescence, luminescence, and enzymatic labeling.


A guide to the hybridization of nucleic acids is found in e.g., Sambrook, ed., Molecular Cloning: A Laboratory Manual (3rd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, 1989; Current Protocols In Molecular Biology, Ausubel, ed. John Wiley & Sons, Inc., New York, 2001; Laboratory Techniques In Biochemistry And Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Tijssen, ed. Elsevier, N.Y., 1993.


Sequencing.


Sequencing can be carried out using techniques well known in the art, using automatic sequencers. The sequencing can be performed on the complete EGFR fusion molecule or on specific domains thereof.
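

As a non-authoritative sketch of how sequencing output might be examined for a fusion, the following example scans sequence reads for a junction-spanning signature built from the end of an upstream partner and the start of a downstream partner. All sequences are invented toy strings rather than sequences from the Sequence Listing, the flank length of 10 is an arbitrary assumption, and exact string matching stands in for the alignment step that a real pipeline would use.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def junction_signature(upstream, downstream, flank=10):
    """Join the last `flank` bases of the upstream partner to the first
    `flank` bases of the downstream partner."""
    if len(upstream) < flank or len(downstream) < flank:
        raise ValueError("partner sequences are shorter than the flank length")
    return (upstream[-flank:] + downstream[:flank]).upper()


def reads_spanning_junction(reads, signature):
    """Return the reads that contain the signature on either strand."""
    forward = signature.upper()
    reverse = reverse_complement(forward)
    return [r for r in reads if forward in r.upper() or reverse in r.upper()]


if __name__ == "__main__":
    # Toy partner sequences (assumptions, for illustration only).
    upstream = "ATGGCCTTGACCGTAGGCTA"
    downstream = "CGTTAGCATCGGATTCAGGA"
    signature = junction_signature(upstream, downstream)

    spanning_read = (upstream + downstream)[6:34]
    reads = [
        spanning_read,                      # covers the junction
        reverse_complement(spanning_read),  # same fragment, opposite strand
        upstream + "GGGAAACCC",             # upstream partner only
    ]
    print(len(reads_spanning_junction(reads, signature)), "junction-spanning reads")
```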


Amplification.


Amplification is based on the formation of specific hybrids between complementary nucleic acid sequences that serve to initiate nucleic acid reproduction. Amplification can be performed according to various techniques known in the art, such as by polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement amplification (SDA) and nucleic acid sequence based amplification (NASBA). These techniques can be performed using commercially available reagents and protocols. Useful techniques in the art encompass real-time PCR, allele-specific PCR, or PCR based single-strand conformational polymorphism (SSCP). Amplification usually requires the use of specific nucleic acid primers to initiate the reaction. For example, nucleic acid primers useful for amplifying sequences corresponding to an EGFR fusion molecule are able to specifically hybridize with a portion of the gene locus that flanks a target region of the locus. In one embodiment, amplification comprises using forward and reverse PCR primers directed to SEQ ID NOS: 2, 4, 8, 10, 14, or 15. Nucleic acid primers useful for amplifying sequences from an EGFR fusion molecule specifically hybridize with a portion of the EGFR fusion molecule. In certain subjects, the presence of an EGFR fusion molecule corresponds to a subject with a gene fusion-associated cancer. In one embodiment, amplification can comprise using forward and reverse PCR primers comprising nucleotide sequences of SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, or 89.
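

The logic of amplification with a forward primer and a reverse primer that flank a target region can be illustrated with a minimal in silico PCR sketch. The template and primers below are invented toy sequences, not SEQ ID NOS 32-45 or 87-89, and the function names are hypothetical. The sketch assumes perfect primer matching and a single binding site per primer; it is not a model of reaction chemistry.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, forward, reverse):
    """Return the predicted amplicon on the given strand, or None.

    The forward primer must match the template strand as written; the reverse
    primer must match the opposite strand, so its reverse complement is
    searched for downstream of the forward primer site.
    """
    template = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse)

    start = template.find(fwd)
    if start == -1:
        return None
    rev_start = template.find(rev_site, start + len(fwd))
    if rev_start == -1:
        return None
    return template[start:rev_start + len(rev_site)]


if __name__ == "__main__":
    # Toy fusion template: an upstream partner joined to a downstream partner.
    fusion = "ATGGCCTTGACCGTAGGCTA" + "CGTTAGCATCGGATTCAGGA"
    non_fused = "ATGGCCTTGACCGTAGGCTA" + "GGGAAACCCTTTGGGAAACC"

    forward = "GGCCTTGACC"                       # lies in the upstream partner
    reverse = reverse_complement("CATCGGATTC")   # anneals in the downstream partner

    print(in_silico_pcr(fusion, forward, reverse))     # a product is predicted
    print(in_silico_pcr(non_fused, forward, reverse))  # no product is predicted
```

Because the two primers sit on different fusion partners, a product is predicted only when the partners are joined, which mirrors the statement that detection of an amplification product indicates the presence of the fusion.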


Non-limiting amplification methods include, e.g., polymerase chain reaction, PCR (PCR Protocols, A Guide To Methods And Applications, ed. Innis, Academic Press, N.Y., 1990 and PCR Strategies, 1995, ed. Innis, Academic Press, Inc., N.Y.); ligase chain reaction (LCR) (Wu (1989) Genomics 4:560; Landegren (1988) Science 241:1077; Barringer (1990) Gene 89:117); transcription amplification (Kwoh (1989) PNAS 86:1173); and, self-sustained sequence replication (Guatelli (1990) PNAS 87:1874); Q Beta replicase amplification (Smith (1997) J. Clin. Microbiol. 35:1477-1491), automated Q-beta replicase amplification assay (Burg (1996) Mol. Cell. Probes 10:257-271) and other RNA polymerase mediated techniques (e.g., NASBA, Cangene, Mississauga, Ontario; see also Berger (1987) Methods Enzymol. 152:307-316; U.S. Pat. Nos. 4,683,195 and 4,683,202; and Sooknanan (1995) Biotechnology 13:563-564). All the references stated above are incorporated by reference in their entireties.


The invention provides for a nucleic acid primer, wherein the primer can be complementary to and hybridize specifically to a portion of an EGFR fusion molecule, such as a nucleic acid (e.g., DNA or RNA), in certain subjects having a gene fusion-associated cancer. In one embodiment, the gene-fusion associated cancer comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. Primers of the invention can be specific for fusion sequences in a nucleic acid (DNA or RNA) encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion. By using such primers, the detection of an amplification product indicates the presence of a fusion of a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion. Examples of primers of this invention can be single-stranded nucleic acid molecules of about 5 to 60 nucleotides in length, or about 8 to about 25 nucleotides in length. The sequence can be derived directly from the sequence of an EGFR fusion molecule, e.g. a nucleic acid encoding an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion. Perfect complementarity is useful to ensure high specificity; however, certain mismatch can be tolerated. For example, a nucleic acid primer or a pair of nucleic acid primers as described above can be used in a method for detecting the presence of a gene fusion-associated cancer in a subject. In one embodiment, primers can be used to detect an EGFR fusion molecule, such as a primer comprising SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, 89; or a combination thereof.
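

The statement that perfect complementarity favors specificity while some mismatch can be tolerated can be illustrated with a small sketch that scans a template for primer annealing sites under a mismatch budget. The template and primers are invented toy sequences, the names are hypothetical, and counting mismatched positions is a simplifying assumption; actual tolerance depends on primer length, mismatch position, and reaction conditions.

```python
MIN_PRIMER_LEN, MAX_PRIMER_LEN = 5, 60  # primer length range given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def annealing_sites(template, primer, max_mismatches=0):
    """Return (position, mismatches) for each template window to which the
    primer could anneal with at most `max_mismatches` mismatched bases."""
    if not MIN_PRIMER_LEN <= len(primer) <= MAX_PRIMER_LEN:
        raise ValueError("primer length outside the 5 to 60 nucleotide range")
    site = reverse_complement(primer)  # sequence the primer pairs with
    template = template.upper()
    hits = []
    for pos in range(len(template) - len(site) + 1):
        window = template[pos:pos + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


if __name__ == "__main__":
    # Toy template and primers (assumptions, for illustration only).
    template = "ATGGCCTTGACCGTAGGCTACGTTAGCATCGGATTCAGGA"
    perfect = reverse_complement("CCGTAGGCTACGTT")
    one_off = reverse_complement("CCGTAGGATACGTT")  # one base changed

    print(annealing_sites(template, perfect))
    print(annealing_sites(template, one_off))
    print(annealing_sites(template, one_off, max_mismatches=1))
```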


Specific Ligand Binding.


As discussed herein, a nucleic acid encoding an EGFR fusion molecule, or expression of an EGFR fusion molecule, can also be detected by screening for alteration(s) in a sequence or expression level of a polypeptide encoded by the same. Different types of ligands can be used, such as specific antibodies. In one embodiment, the sample is contacted with an antibody specific for a polypeptide encoded by an EGFR fusion molecule and the formation of an immune complex is subsequently determined. Various methods for detecting an immune complex can be used, such as ELISA, radioimmunoassays (RIA) and immuno-enzymatic assays (IEMA).


For example, an antibody can be a polyclonal antibody, a monoclonal antibody, as well as fragments or derivatives thereof having substantially the same antigen specificity. Fragments include Fab, F(ab′)2, or CDR regions. Derivatives include single-chain antibodies, humanized antibodies, or poly-functional antibodies. An antibody specific for a polypeptide encoded by an EGFR fusion molecule can be an antibody that selectively binds such a polypeptide. In one embodiment, the antibody is raised against a polypeptide encoded by an EGFR fusion molecule or an epitope-containing fragment thereof. Although non-specific binding towards other antigens can occur, binding to the target polypeptide occurs with a higher affinity and can be reliably discriminated from non-specific binding. In one embodiment, the method can comprise contacting a sample from the subject with an antibody specific for an EGFR fusion molecule, and determining the presence of an immune complex. Optionally, the sample can be contacted to a support coated with antibody specific for an EGFR fusion molecule. In one embodiment, the sample can be contacted simultaneously, or in parallel, or sequentially, with various antibodies specific for different forms of an EGFR fusion molecule, e.g., an EGFR-SEPT fusion (such as an EGFR-SEPT14 fusion), an EGFR-CAND fusion (such as an EGFR-CAND1 fusion), or an EGFR-PSPH fusion.


The invention also provides for a diagnostic kit comprising products and reagents for detecting in a sample from a subject the presence of an EGFR fusion molecule. The kit can be useful for determining whether a sample from a subject exhibits reduced expression of an EGFR fusion molecule. For example, the diagnostic kit according to the present invention comprises any primer, any pair of primers, any nucleic acid probe and/or any ligand, or any antibody directed specifically to an EGFR fusion molecule. The diagnostic kit according to the present invention can further comprise reagents and/or protocols for performing a hybridization, amplification, or antigen-antibody immune reaction. In one embodiment, the kit can comprise nucleic acid primers that specifically hybridize to and can prime a polymerase reaction from an EGFR fusion molecule, the primers comprising SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, 89, or a combination thereof. In one embodiment, primers can be used to detect an EGFR fusion molecule, such as a primer comprising SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, 89; or a combination thereof. In a further embodiment, primers used for the screening of EGFR fusion molecules comprise SEQ ID NOS 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 87, 88, or 89.


The diagnostic methods can be performed in vitro, ex vivo, or in vivo. These methods utilize a sample from the subject in order to assess the status of an EGFR fusion molecule. The sample can be any biological sample derived from a subject, which contains nucleic acids or polypeptides. Examples of such samples include, but are not limited to, fluids, tissues, cell samples, organs, and tissue biopsies. Non-limiting examples of samples include blood, liver, plasma, serum, saliva, urine, or seminal fluid. The sample can be collected according to conventional techniques and used directly for diagnosis or stored. The sample can be treated prior to performing the method, in order to render or improve availability of nucleic acids or polypeptides for testing. Treatments include, for instance, lysis (e.g., mechanical, physical, or chemical) and centrifugation. The nucleic acids and/or polypeptides can be pre-purified or enriched by conventional techniques, and/or reduced in complexity. Nucleic acids and polypeptides can also be treated with enzymes or other chemical or physical treatments to produce fragments thereof. In one embodiment, the sample is contacted with reagents, such as probes, primers, or ligands, in order to assess the presence of an EGFR fusion molecule. Contacting can be performed in any suitable device, such as a plate, tube, well, or glass. In some embodiments, the contacting is performed on a substrate coated with the reagent, such as a nucleic acid array or a specific ligand array. The substrate can be a solid or semi-solid substrate such as any support comprising glass, plastic, nylon, paper, metal, or polymers. The substrate can be of various forms and sizes, such as a slide, a membrane, a bead, a column, or a gel. The contacting can be made under any condition suitable for a complex to be formed between the reagent and the nucleic acids or polypeptides of the sample.


Nucleic Acid Delivery Methods

Delivery of nucleic acids into viable cells can be effected ex vivo, in situ, or in vivo by use of vectors, such as viral vectors (e.g., lentivirus, adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). Non-limiting techniques suitable for the transfer of nucleic acid into mammalian cells in vitro include the use of liposomes, electroporation, microinjection, cell fusion, DEAE-dextran, and the calcium phosphate precipitation method (see, for example, Anderson, (1998) Nature, supplement to 392(6679):25). Introduction of a nucleic acid or a gene encoding a polypeptide of the invention can also be accomplished with extrachromosomal substrates (transient expression) or artificial chromosomes (stable expression). Cells can also be cultured ex vivo in the presence of therapeutic compositions of the present invention in order to proliferate or to produce a desired effect on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.


Nucleic acids can be inserted into vectors and used as gene therapy vectors. A number of viruses have been used as gene transfer vectors, including papovaviruses, e.g., SV40 (Madzak et al., (1992) J Gen Virol. 73(Pt 6):1533-6), adenovirus (Berkner (1992) Curr Top Microbiol Immunol. 158:39-66; Berkner (1988) Biotechniques, 6(7):616-29; Gorziglia and Kapikian (1992) J Virol. 66(7):4407-12; Quantin et al., (1992) Proc Natl Acad Sci USA. 89(7):2581-4; Rosenfeld et al., (1992) Cell. 68(1):143-55; Wilkinson et al., (1992) Nucleic Acids Res. 20(9):2233-9; Stratford-Perricaudet et al., (1990) Hum Gene Ther. 1(3):241-56), vaccinia virus (Moss (1992) Curr Opin Biotechnol. 3(5):518-22), adeno-associated virus (Muzyczka, (1992) Curr Top Microbiol Immunol. 158:97-129; Ohi et al., (1990) Gene. 89(2):279-82), herpesviruses including HSV and EBV (Margolskee (1992) Curr Top Microbiol Immunol. 158:67-95; Johnson et al., (1992) Brain Res Mol Brain Res. 12(1-3):95-102; Fink et al., (1992) Hum Gene Ther. 3(1):11-9; Breakefield and Geller (1987) Mol Neurobiol. 1(4):339-71; Freese et al., (1990) Biochem Pharmacol. 40(10):2189-99), and retroviruses of avian (Bandyopadhyay and Temin (1984) Mol Cell Biol. 4(4):749-54; Petropoulos et al., (1992) J Virol. 66(6):3391-7), murine (Miller et al. (1992) Mol Cell Biol. 12(7):3262-72; Miller et al., (1985) J Virol. 55(3):521-6; Sorge et al., (1984) Mol Cell Biol. 4(9):1730-7; Mann and Baltimore (1985) J Virol. 54(2):401-7; Miller et al., (1988) J Virol. 62(11):4337-45), and human origin (Shimada et al., (1991) J Clin Invest. 88(3):1043-7; Helseth et al., (1990) J Virol. 64(12):6314-8; Page et al., (1990) J Virol. 64(11):5270-6; Buchschacher and Panganiban (1992) J Virol. 66(5):2731-9).


Non-limiting examples of in vivo gene transfer techniques include transfection with viral (e.g., retroviral) vectors (see U.S. Pat. No. 5,252,479, which is incorporated by reference in its entirety) and viral coat protein-liposome mediated transfection (Dzau et al., (1993) Trends in Biotechnology 11:205-210, which is incorporated by reference in its entirety). For example, naked DNA vaccines are generally known in the art; see Brower, (1998) Nature Biotechnology, 16:1304-1305, which is incorporated by reference in its entirety. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see, e.g., U.S. Pat. No. 5,328,470) or by stereotactic injection (see, e.g., Chen, et al., (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells that produce the gene delivery system.


For reviews of nucleic acid delivery protocols and methods see Anderson et al. (1992) Science 256:808-813; U.S. Pat. Nos. 5,252,479, 5,747,469, 6,017,524, 6,143,290, 6,410,010, and 6,511,847; and U.S. Application Publication No. 2002/0077313, which are all hereby incorporated by reference in their entireties. For additional reviews, see Friedmann (1989) Science, 244:1275-1281; Verma, Scientific American: 68-84 (1990); Miller (1992) Nature, 357: 455-460; Kikuchi et al. (2008) J Dermatol Sci. 50(2):87-98; Isaka et al. (2007) Expert Opin Drug Deliv. 4(5):561-71; Jager et al. (2007) Curr Gene Ther. 7(4):272-83; Waehler et al. (2007) Nat Rev Genet. 8(8):573-87; Jensen et al. (2007) Ann Med. 39(2):108-15; Herweijer et al. (2007) Gene Ther. 14(2):99-107; Eliyahu et al. (2005) Molecules 10(1):34-64; and Altaras et al. (2005) Adv Biochem Eng Biotechnol. 99:193-260, all of which are hereby incorporated by reference in their entireties.


An EGFR fusion nucleic acid can also be delivered in a controlled release system. For example, the EGFR fusion molecule can be administered using intravenous infusion, an implantable osmotic pump, a transdermal patch, liposomes, or other modes of administration. In one embodiment, a pump can be used (see Sefton (1987) Biomed. Eng. 14:201; Buchwald et al. (1980) Surgery 88:507; Saudek et al. (1989) N. Engl. J. Med. 321:574). In another embodiment, polymeric materials can be used (see Medical Applications of Controlled Release, Langer and Wise (eds.), CRC Press, Boca Raton, Fla. (1974); Controlled Drug Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Langer and Peppas, (1983) J. Macromol. Sci. Rev. Macromol. Chem. 23:61; see also Levy et al. (1985) Science 228:190; During et al. (1989) Ann. Neurol. 25:351; Howard et al. (1989) J. Neurosurg. 71:105). In yet another embodiment, a controlled release system can be placed in proximity of the therapeutic target, thus requiring only a fraction of the systemic dose (see, e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138 (1984)). Other controlled release systems are discussed in the review by Langer (Science (1990) 249:1527-1533).


Pharmaceutical Compositions and Administration for Therapy

An inhibitor of the invention can be incorporated into pharmaceutical compositions suitable for administration, for example the inhibitor and a pharmaceutically acceptable carrier.


An EGFR fusion molecule or inhibitor of the invention can be administered to the subject once (e.g., as a single injection or deposition). Alternatively, an EGFR fusion molecule or inhibitor can be administered once or twice daily to a subject in need thereof for a period of from about two to about twenty-eight days, or from about seven to about ten days. An EGFR fusion molecule or inhibitor can also be administered once or twice daily to a subject for a period of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 times per year, or a combination thereof. Furthermore, an EGFR fusion molecule or inhibitor of the invention can be co-administered with another therapeutic. Where a dosage regimen comprises multiple administrations, the effective amount of the EGFR fusion molecule or inhibitor administered to the subject can comprise the total amount of gene product administered over the entire dosage regimen.


An EGFR fusion molecule or inhibitor can be administered to a subject by any means suitable for delivering the EGFR fusion molecule or inhibitor to cells of the subject, such as cancer cells, e.g., glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma. For example, an EGFR fusion molecule or inhibitor can be administered by methods suitable to transfect cells. Transfection methods for eukaryotic cells are well known in the art, and include direct injection of the nucleic acid into the nucleus or pronucleus of a cell; electroporation; liposome transfer or transfer mediated by lipophilic materials; receptor-mediated nucleic acid delivery; bioballistic or particle acceleration; calcium phosphate precipitation; and transfection mediated by viral vectors.


The compositions of this invention can be formulated and administered to reduce the symptoms associated with a gene fusion-associated cancer, e.g., glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma, by any means that produces contact of the active ingredient with the agent's site of action in the body of a subject, such as a human or animal (e.g., a dog, cat, or horse). They can be administered by any conventional means available for use in conjunction with pharmaceuticals, either as individual therapeutic active ingredients or in a combination of therapeutic active ingredients. They can be administered alone, but are generally administered with a pharmaceutical carrier selected on the basis of the chosen route of administration and standard pharmaceutical practice.


A therapeutically effective dose of EGFR fusion molecule or inhibitor can depend upon a number of factors known to those of ordinary skill in the art. The dose(s) of the EGFR fusion molecule inhibitor can vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the EGFR fusion molecule inhibitor to have upon the nucleic acid or polypeptide of the invention. These amounts can be readily determined by a skilled artisan. Any of the therapeutic applications described herein can be applied to any subject in need of such therapy, including, for example, a mammal such as a dog, a cat, a cow, a horse, a rabbit, a monkey, a pig, a sheep, a goat, or a human.


Pharmaceutical compositions for use in accordance with the invention can be formulated in conventional manner using one or more physiologically acceptable carriers or excipients. The therapeutic compositions of the invention can be formulated for a variety of routes of administration, including systemic and topical or localized administration. Techniques and formulations generally can be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. (20th Ed., 2000), the entire disclosure of which is herein incorporated by reference. For systemic administration, an injection is useful, including intramuscular, intravenous, intraperitoneal, and subcutaneous. For injection, the therapeutic compositions of the invention can be formulated in liquid solutions, for example in physiologically compatible buffers such as Hank's solution or Ringer's solution. In addition, the therapeutic compositions can be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms are also included. Pharmaceutical compositions of the present invention are characterized as being at least sterile and pyrogen-free. These pharmaceutical formulations include formulations for human and veterinary use.


According to the invention, a pharmaceutically acceptable carrier can comprise any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Any conventional media or agent that is compatible with the active compound can be used. Supplementary active compounds can also be incorporated into the compositions.


A pharmaceutical composition containing EGFR fusion molecule inhibitor can be administered in conjunction with a pharmaceutically acceptable carrier, for any of the therapeutic effects discussed herein. Such pharmaceutical compositions can comprise, for example antibodies directed to an EGFR fusion molecule, or a variant thereof, or antagonists of an EGFR fusion molecule. The compositions can be administered alone or in combination with at least one other agent, such as a stabilizing compound, which can be administered in any sterile, biocompatible pharmaceutical carrier including, but not limited to, saline, buffered saline, dextrose, and water. The compositions can be administered to a patient alone, or in combination with other agents, drugs or hormones.


Sterile injectable solutions can be prepared by incorporating the EGFR fusion molecule inhibitor (e.g., a polypeptide or antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated herein, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated herein. In the case of sterile powders for the preparation of sterile injectable solutions, examples of useful preparation methods are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.


In some embodiments, the EGFR fusion molecule inhibitor can be applied via transdermal delivery systems, which slowly release the active compound for percutaneous absorption. Permeation enhancers can be used to facilitate transdermal penetration of the active factors in the conditioned media. Transdermal patches are described in, for example, U.S. Pat. No. 5,407,713; U.S. Pat. No. 5,352,456; U.S. Pat. No. 5,332,213; U.S. Pat. No. 5,336,168; U.S. Pat. No. 5,290,561; U.S. Pat. No. 5,254,346; U.S. Pat. No. 5,164,189; U.S. Pat. No. 5,163,899; U.S. Pat. No. 5,088,977; U.S. Pat. No. 5,087,240; U.S. Pat. No. 5,008,110; and U.S. Pat. No. 4,921,475.


“Subcutaneous” administration can refer to administration just beneath the skin (i.e., beneath the dermis). Generally, the subcutaneous tissue is a layer of fat and connective tissue that houses larger blood vessels and nerves. The size of this layer varies throughout the body and from person to person. The interface between the subcutaneous and muscle layers can be encompassed by subcutaneous administration. This mode of administration can be feasible where the subcutaneous layer is sufficiently thin so that the factors present in the compositions can migrate or diffuse from the locus of administration. Thus, where intradermal administration is utilized, the bolus of composition administered is localized proximate to the subcutaneous layer.


Administration of the cell aggregates (such as DP or DS aggregates) is not restricted to a single route, but can encompass administration by multiple routes. For instance, exemplary administrations by multiple routes include, among others, a combination of intradermal and intramuscular administration, or intradermal and subcutaneous administration. Multiple administrations can be sequential or concurrent. Other modes of application by multiple routes will be apparent to the skilled artisan.


In other embodiments, this implantation method will be a one-time treatment for some subjects. In further embodiments of the invention, multiple cell therapy implantations will be required. In some embodiments, the cells used for implantation will generally be subject-specific genetically engineered cells. In another embodiment, cells obtained from a different species or another individual of the same species can be used. Thus, using such cells can require administering an immunosuppressant to prevent rejection of the implanted cells. Such methods have also been described in U.S. Pat. No. 7,419,661 and PCT application publication WO 2001/32840, and are hereby incorporated by reference.


A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation or ingestion), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.


Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, a pharmaceutically acceptable polyol like glycerol, propylene glycol, liquid polyethylene glycol, and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it can be useful to include isotonic agents in the composition, for example, sugars, polyalcohols such as mannitol and sorbitol, and sodium chloride. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.


Sterile injectable solutions can be prepared by incorporating the inhibitor (e.g., a polypeptide or antibody or small molecule) of the invention in the required amount in an appropriate solvent with one or a combination of ingredients enumerated herein, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated herein. In the case of sterile powders for the preparation of sterile injectable solutions, examples of useful preparation methods are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.


Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier and subsequently swallowed.


Pharmaceutically compatible binding agents and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.


Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.


In some embodiments, the effective amount of the administered EGFR fusion molecule inhibitor is at least about 0.0001 μg/kg body weight, at least about 0.00025 μg/kg body weight, at least about 0.0005 μg/kg body weight, at least about 0.00075 μg/kg body weight, at least about 0.001 μg/kg body weight, at least about 0.0025 μg/kg body weight, at least about 0.005 μg/kg body weight, at least about 0.0075 μg/kg body weight, at least about 0.01 μg/kg body weight, at least about 0.025 μg/kg body weight, at least about 0.05 μg/kg body weight, at least about 0.075 μg/kg body weight, at least about 0.1 μg/kg body weight, at least about 0.25 μg/kg body weight, at least about 0.5 μg/kg body weight, at least about 0.75 μg/kg body weight, at least about 1 μg/kg body weight, at least about 5 μg/kg body weight, at least about 10 μg/kg body weight, at least about 25 μg/kg body weight, at least about 50 μg/kg body weight, at least about 75 μg/kg body weight, at least about 100 μg/kg body weight, at least about 150 μg/kg body weight, at least about 200 μg/kg body weight, at least about 250 μg/kg body weight, at least about 300 μg/kg body weight, at least about 350 μg/kg body weight, at least about 400 μg/kg body weight, at least about 450 μg/kg body weight, at least about 500 μg/kg body weight, at least about 550 μg/kg body weight, at least about 600 μg/kg body weight, at least about 650 μg/kg body weight, at least about 700 μg/kg body weight, at least about 750 μg/kg body weight, at least about 800 μg/kg body weight, at least about 850 μg/kg body weight, at least about 900 μg/kg body weight, at least about 950 μg/kg body weight, at least about 1000 μg/kg body weight, at least about 2000 μg/kg body weight, at least about 3000 μg/kg body weight, at least about 4000 μg/kg body weight, at least about 5000 μg/kg body weight, at least about 6000 μg/kg body weight, at least about 7000 μg/kg body weight, at least about 8000 μg/kg body weight, at least about 9500 μg/kg body weight, or at least about 10,000 μg/kg body weight.
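The arithmetic relating a weight-based amount to the total amount administered over a multi-administration regimen can be sketched as follows. This is a minimal illustration only: the function name, the 70 kg body weight, and the twice-daily, seven-day regimen are hypothetical values chosen to show the calculation and are not taken from the specification.

```python
def regimen_amounts_ug(dose_ug_per_kg, body_weight_kg, doses_per_day, days):
    """Return (amount per administration, total amount over the regimen), in micrograms.

    All four inputs are illustrative assumptions supplied by the caller.
    """
    if min(dose_ug_per_kg, body_weight_kg, doses_per_day, days) <= 0:
        raise ValueError("all inputs must be positive")
    # Weight-based amount for a single administration.
    per_administration = dose_ug_per_kg * body_weight_kg
    # Total amount summed over every administration in the regimen.
    total = per_administration * doses_per_day * days
    return per_administration, total


# Hypothetical example: 100 ug/kg, 70 kg subject, twice daily for 7 days.
per_dose, total = regimen_amounts_ug(100, 70, 2, 7)
print(per_dose, total)
```

Under these assumed inputs, the total over the regimen is simply the per-administration amount multiplied by the number of administrations.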


Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention.


All publications and other references mentioned herein are incorporated by reference in their entirety, as if each individual publication or reference were specifically and individually indicated to be incorporated by reference. Publications and references cited herein are not admitted to be prior art.


EXAMPLES

Examples are provided below to facilitate a more complete understanding of the invention. The following examples illustrate the exemplary modes of making and practicing the invention. However, the scope of the invention is not limited to specific embodiments disclosed in these Examples, which are for purposes of illustration only, since alternative methods can be utilized to obtain similar results.


Example 1
The Integrated Landscape of Driver Genomic Alterations in Glioblastoma

To address the challenge of driver mutations in glioblastoma (GBM) and uncover new driver genes in human GBM, a computational platform was developed that integrates the analysis of copy number variations and somatic mutations from a whole-exome dataset. The full spectrum of in-frame gene fusions was unveiled from a large transcriptome dataset of glioblastoma. The analyses revealed focal copy number variations and mutations in all the genes previously implicated in glioblastoma pathogenesis. Recurrent copy number variations and somatic mutations were detected in 18 genes not yet implicated in glioblastoma. For each of the new genes, the occurrence of focal and recurrent copy number changes in addition to somatic mutations underscores the relevance for glioblastoma pathogenesis. Without being bound by theory, mutations in LZTR-1, a Kelch-BTB-BACK-BTB-BACK adaptor of Cul3-containing E3 ligase complexes, impacted ubiquitination of LZTR-1 substrates. Loss-of-function mutations of CTNND2 (coding for δ-catenin) targeted a neural-specific gene and were associated with the transformation of glioma cells along the mesenchymal lineage, a hallmark of aggressive glioblastoma. Reconstitution of δ-catenin in mesenchymal glioma cells reprogrammed them towards a neuronal cell fate. Recurrent translocations were also identified that fuse in-frame the coding sequence of EGFR to several partners in 7.6% of tumors, with EGFR-Septin-14 scoring as the most frequent functional gene fusion in human glioblastoma. EGFR fusions enhance proliferation and motility of glioma cells and confer sensitivity to EGFR inhibition in glioblastoma xenografts. These results provide important insights into the pathogenesis of glioblastoma and highlight new targets for therapeutic intervention.


Glioblastoma (GBM) is the most common primary intrinsic malignant brain tumor, affecting ~10,000 new patients each year with a median survival of only 12-15 months1,2. Identifying and understanding the functional significance of the genetic alterations that drive initiation and progression of GBM is crucial to develop more effective therapies. Previous efforts in GBM genome characterization included array-based profiling of copy number changes, methylation and gene expression and targeted sequencing of candidate genes3-6. These studies identified somatic changes in well-known GBM genes (EGFR, PTEN, IDH1, TP53, NF1, etc.) and nominated putative cancer genes with somatic mutations, but the functional consequences of most alterations are unknown. The lack of strict correlation between somatic alterations and functionality in GBM is manifested by regions of large copy number variations (CNVs), in which the relevant gene(s) are masked within genomic domains encompassing many other genes. Furthermore, although the potential of next-generation sequencing of the whole coding exome is widely recognized for the nomination of new cancer genes, the elevated somatic mutation rate of GBM is a significant challenge for statistical approaches that aim to distinguish genes harboring driver mutations from those harboring passenger mutations. A statistical approach was used to nominate driver genes in GBM by integrating whole-exome sequencing data, used to call somatic mutations, with a CNV analysis that prioritizes focality and magnitude of the genetic alterations.


Chromosomal rearrangements resulting in recurrent and oncogenic gene fusions are hallmarks of hematological malignancies and recently they have also been uncovered in solid tumors (breast, prostate, lung and colorectal carcinoma)7,8. Recently, a small subset of GBM was reported to harbor FGFR-TACC gene fusions, and data were provided indicating that patients with FGFR-TACC-positive tumors would benefit from targeted FGFR kinase inhibition9. It remains unknown whether gene fusions involving other RTK-coding genes exist in GBM to create different oncogene-addicted states. A large RNA-sequencing dataset of primary GBM and glioma stem cells (GSCs) was analyzed and the global landscape of in-frame gene fusions in human GBM was reported.


Nomination of Candidate GBM Genes

Focal CNVs and point mutations provide exquisite information on candidate driver genes by pinpointing their exact location. Without being bound by theory, the integration of somatic point mutations and focal CNV information in a single framework will nominate candidate genes implicated in GBM. MutComFocal is an algorithm designed for this purpose, in which driver genes are ranked by an integrated recurrence, focality and mutation score (see Methods). Overall, this strategy was applied to a cohort of 139 GBM and matched normal DNA samples analyzed by whole-exome sequencing to identify somatic mutations, and to 469 GBM analyzed by the Affymetrix SNP6.0 platform to identify CNVs.


The whole-exome analysis identified a mean of 43 protein-changing somatic mutations per tumor sample. The distribution of substitutions shows a higher rate of transitions vs transversions (67%), with a strong preference for C→T and G→A (55%) (FIG. 6). As seen in other tumor types10, 19.2% of the mutations occurred in a CpG dinucleotide context (FIG. 7). Among somatic small nucleotide variants, the most frequently mutated genes have well-established roles in cancer, including GBM (TP53, EGFR, PTEN, and IDH1). In addition to known cancer genes, whole-exome sequencing identified several potentially new candidate driver genes mutated in ~5% of tumor samples. To uncover the most likely driver genes of GBM initiation and/or progression, the mutation results were integrated with common focal genomic alterations, detected using an algorithm applied to high-density SNP arrays to generate MutComFocal scores. This analysis stratified somatically mutated genes into three groups: recurrently mutated genes without significant copy number alterations (Mut), mutated genes in regions of focal and recurrent amplifications (Amp-Mut) and mutated genes in regions of focal and recurrent deletions (Del-Mut). Employing this framework, a list of 67 genes was generated that score at the top of each of the three categories and that included nearly all the genes previously implicated in GBM. These genes, which are labeled in green in FIG. 1, include IDH1 (Mut, FIG. 1a), PIK3C2B, MDM4, MYCN, PIK3CA, PDGFRA, KIT, EGFR, and BRAF (Amp-Mut, FIG. 1b) and PIK3R1, PTEN, RB1, TP53, NF1 and ATRX (Del-Mut, FIG. 1c). Interestingly, the analysis also selected 52 new candidate driver genes previously unreported in GBM. Based upon their role in development and homeostasis of the CNS and their potential function in oncogenesis and tumor progression, 24 genes were selected for re-sequencing in an independent dataset of 35 GBM and matched normal controls. Eighteen genes were found somatically mutated by Sanger sequencing in the independent panel and are labeled in red in FIG. 1. Each of the validated new GBM genes is targeted by somatic mutations and CNVs in a cumulative fraction of between 2.9% and 45.7% of GBM (FIG. 9).
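The substitution classes referred to above (transitions versus transversions, and mutations in a CpG dinucleotide context) can be illustrated with a minimal Python sketch. The function names and the toy inputs are hypothetical and are not part of the analysis pipeline described herein; the sketch only encodes the standard definitions, in which a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, and a transversion exchanges a purine for a pyrimidine or vice versa.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Label a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def transition_fraction(substitutions):
    """Fraction of (ref, alt) pairs that are transitions."""
    labels = [classify_substitution(r, a) for r, a in substitutions]
    return labels.count("transition") / len(labels)


def in_cpg_context(sequence, pos):
    """True if the base at 0-based pos is the C or the G of a forward-strand CpG."""
    base = sequence[pos].upper()
    if base == "C":
        return pos + 1 < len(sequence) and sequence[pos + 1].upper() == "G"
    if base == "G":
        return pos > 0 and sequence[pos - 1].upper() == "C"
    return False


# Toy input (hypothetical): three transitions and one transversion.
toy = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A")]
print(transition_fraction(toy))
```

Checking both the C and the G position of a CpG captures C→T changes on either strand, since a G→A change on the forward strand corresponds to a C→T change on the reverse strand.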


The commonly mutated and focally deleted genes that exhibited top MutComFocal scores and were validated in the independent GBM dataset included BCOR, LRP family members, HERC2, LZTR-1 and CTNND2. BCOR, a chromosome X-linked gene, encodes a component of the nuclear co-repressor complex that is essential for normal development of the neuroectoderm and stem cell functions11-13. BCOR mutations have recently been described in retinoblastoma and medulloblastoma, thus indicating that loss-of-function mutations in BCOR are common genetic events in neuroectodermal tumors14,15. LRP1B is a member of the LDL receptor family and is among the most frequently mutated genes in human cancer (FIG. 1c)16. Interestingly, two other LDL receptor family members (LRP2 and LRP1) are mutated in 4.4% and 2.9% of tumors, respectively (FIG. 1a). The LRP proteins are highly expressed in the neuroepithelium and are essential for morphogenesis of the forebrain in mouse and humans17,18. The tumor suppressor function of LRP proteins in GBM may be linked to their ability to promote chemosensitivity and control signaling through the Sonic hedgehog pathway, which is responsible for maintenance of cancer initiating cells in GBM19-21. The gene coding for the Hect ubiquitin ligase Herc2 is localized on chromosome 15q13 and is deleted and mutated in 15.1% and 2.2% of GBM cases, respectively. This gene has been implicated in severe neurodevelopmental syndromes. Moreover, protein substrates of Herc2 are crucial factors in genome stability and DNA damage-repair, two cell functions frequently disrupted in cancer22,23.


Loss-of-Function Genetic Alterations Target the LZTR-1 and CTNND2 Genes in GBM


A gene that received one of the highest Del-Mut scores by MutComFocal is LZTR-1 (FIG. 1c). The LZTR-1 coding region had non-synonymous mutations in 4.4% of GBM, and the LZTR-1 locus (human chromosome 22q11) was deleted in 22.4% of GBM. LZTR-1, which is normally expressed in the human brain, codes for a protein with a characteristic Kelch-BTB-BACK-BTB-BACK domain architecture (FIGS. 8, 9). The LZTR-1 gene is highly conserved in the metazoans and was initially proposed to function as a transcriptional regulator, but follow-up studies have excluded a transcriptional role for this protein24. Most proteins with BTB-BACK domains are substrate adaptors in Cullin3 (Cul3) ubiquitin ligase complexes, in which the BTB-BACK region binds to the N-terminal domain of Cul3, while a ligand binding domain, often a Kelch 6-bladed β-propeller motif, binds to substrates targeted for ubiquitination25. To determine whether LZTR-1 directly binds Cul3, co-immunoprecipitation experiments were performed. FIG. 2a shows that Cul3 immunoprecipitates contain LZTR-1, thus indicating that LZTR-1 is an adaptor in Cul3 ubiquitin ligase complexes.


To address the potential function of LZTR-1 mutants, a homology model of LZTR-1 was built based in part on the crystal structures of the MATH-BTB-BACK protein SPOP (ref. 26), the BTB-BACK-Kelch proteins KLHL3 and KLHL11 (ref. 27), and the Kelch domain of Keap1 (ref. 28) (FIG. 2b). Without being bound by theory, the second BTB-BACK region of LZTR-1 binds Cul3 because of the presence of a φ-X-E motif in this BTB domain, followed by a 3-Box/BACK region (FIG. 9)26. However, the preceding BTB-BACK region also participates in Cul3 binding. Four of the six LZTR-1 mutations identified in GBM are located within the Kelch domain and target highly conserved amino acids (FIG. 2b, c, FIG. 8). Interestingly, the concentration of LZTR-1 mutations in the Kelch domain reflects a similar pattern of mutations in the Kelch-coding region of the KLHL3 gene, recently identified in families with hypertension and electrolytic abnormalities29,30. The R198G and G248R mutations localize to the b-c loop of the Kelch domain, in a region predicted to provide the substrate-binding surface of the domain28. The W105R mutation targets a highly conserved anchor residue in the Kelch repeats and the T288I mutation disrupts a buried residue that is conserved in LZTR-1 (FIG. 2b, FIG. 8). Both of these mutations are expected to perturb the folding of the Kelch domain. The remaining two mutations, located in the BTB-BACK domains, are predicted to affect the interaction with Cul3 either by removing the entire BTB-BACK-BTB-BACK region (W437STOP) or by disrupting the folding of the last helical hairpin in the BTB-BACK domain (R810W, FIG. 2b). The pattern of mutations of LZTR-1 in GBM indicates that they impair binding either to specific substrates or to Cul3.


Among the top-ranking genes in MutComFocal, CTNND2 is the gene expressed at the highest levels in the normal brain. CTNND2 codes for δ-catenin, a member of the p120 subfamily of catenins that is expressed almost exclusively in the nervous system where it is crucial for neurite elongation, dendritic morphogenesis and synaptic plasticity31-33. Germ-line hemizygous loss of CTNND2 severely impairs cognitive function and underlies some forms of mental retardation34,35. CTNND2 shows a pronounced clustering of mutations in GBM. The observed spectrum of mutations includes four mutations located in the armadillo-coding domain and one in the region coding for the N-terminal coiled-coil domain (FIG. 10a). These regions are the two most relevant functional domains of δ-catenin and each of the mutations targets highly conserved residues with probably (K629Q, A776T, S881L, D999E) and possibly (A71T) damaging consequences36. Together with focal genomic losses of CTNND2 (FIG. 10b), the mutation pattern indicates that CTNND2 is a tumor suppressor gene in GBM.


It was asked whether the expression of CTNND2 is down-regulated during oncogenic transformation in the CNS. Immunostaining experiments showed that δ-catenin is strongly expressed in the normal human and mouse brain, with the highest expression in neurons (FIG. 3a, FIG. 10c). Conversely, the immunostaining analysis of 69 GBM revealed negligible or absent expression of δ-catenin in 21 cases (FIG. 3b). Oncogenic transformation in the CNS frequently results in loss of the default proneural cell fate in favor of an aberrant mesenchymal phenotype, which is associated with a very aggressive clinical outcome37. The analysis of gene expression profiles of 498 GBM from the ATLAS-TCGA collection showed that low expression of CTNND2 is strongly enriched in tumors identified by a mesenchymal gene expression signature (T-test p-value=2.4×10^-12, FIG. 10d). Tumors with low CTNND2 expression were also characterized by poor clinical outcome and, among them, tumors with copy number losses of the CTNND2 gene displayed the worst prognosis (FIG. 3c, d). Mesenchymal transformation of GBM, which is detected in the vast majority of established glioma cell lines, is associated with an apparently irreversible loss of the proneural cell fate and neuronal markers37. Expression of δ-catenin in the U87 human glioma cell line reduced cell proliferation (FIG. 3e), decreased the expression of mesenchymal markers (FIG. 3f) and induced neuronal differentiation, as shown by elongation of β3-tubulin-positive neurites and development of branched dendritic processes that stained positive for the post-synaptic marker PSD95 (FIG. 3g). Accordingly, δ-catenin decreased expression of cyclin A, an S-phase cyclin, and up-regulated the Cdk inhibitor p27Kip1 and the neuronal-specific gene N-cadherin (FIG. 3h). Thus, restoring the normal expression of δ-catenin reprograms mesenchymal glioma cells towards the proneural lineage.


Recurrent EGFR Fusions in GBM


To identify gene fusions in GBM, RNA-seq data were analyzed from a total of 185 GBM samples (161 primary GBM plus 24 short-term cultures of glioma stem-like cells (GSCs) freshly isolated from patients carrying primary GBM). The analysis of the RNA-seq dataset led to the discovery of 92 candidate rearrangements that give rise to in-frame fusion transcripts (Table 7). Besides the previously reported FGFR3-TACC3 fusion events, the most frequent recurrent in-frame fusions involved EGFR in 7.6% of samples (14/185, 3.8%-11.3% CI). Nine of the 14 EGFR fusions included the recurrent partners SEPT14 (6/185, 3.2%) and PSPH (3/185, 1.6%) as the 3′ gene segment in the fusion. Two in-frame highly expressed fusions were also found involving the neurotrophic tyrosine kinase receptor 1 gene (NTRK1) as 3′ gene with two different 5′ partners (NFASC-NTRK1 and BCAN-NTRK1). Fusions with a similar structure involving NTRK1 are commonly found in papillary thyroid carcinomas38. Using EXome-Fuse, an algorithm for the reconstruction of genomic fusions from whole-exome data, it was determined that EGFR-SEPT14 and NTRK1 fusions are the result of recurrent chromosomal translocations, and the corresponding genomic breakpoints were reconstructed (Table 8).
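By way of illustration only, the fusion frequencies quoted above can be recomputed from the stated counts. The text does not say how the confidence interval was obtained; the sketch below assumes a normal-approximation (Wald) 95% interval, which comes out near the reported range for 14/185. The function name `wald_interval` is hypothetical.

```python
import math

def wald_interval(successes, total, z=1.959964):
    """Proportion with a normal-approximation (Wald) confidence interval.

    Assumption: the source does not name its interval method; Wald is used
    here only because it is the simplest choice.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Counts taken from the text: 185 samples in total.
counts = [("EGFR (any partner)", 14), ("EGFR-SEPT14", 6), ("EGFR-PSPH", 3)]
for label, count in counts:
    p, low, high = wald_interval(count, 185)
    print(f"{label}: {100 * p:.1f}% (95% CI {100 * low:.1f}%-{100 * high:.1f}%)")
```

For small counts such as 3/185, an exact binomial interval would generally be preferred over the Wald approximation.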


By sequencing the PCR products spanning the fusion breakpoint, each of the three types of recurrent in-frame fusion predictions (EGFR-SEPT14, EGFR-PSPH and NTRK1 fusions, FIG. 4, FIG. 11, and FIG. 12) was validated. In FIG. 4a, b the prediction and cDNA sequence validation are shown, respectively, for one of the tumors harboring an EGFR-SEPT14 fusion (TCGA-27-1837). The amplified cDNA contained an open reading frame for a protein of 1,041 amino acids resulting from the fusion of an EGFR amino-terminal portion of residues 1-982 with a SEPT14 carboxy-terminal portion of residues 373-432 (FIG. 4c). Thus, the structure of the EGFR-Septin14 fusion proteins involves EGFR at the N-terminus, providing a receptor tyrosine kinase domain fused to a coiled-coil domain from Septin14. Exon-specific gene expression analysis from the RNA-seq coverage in TCGA-27-1837 demonstrated that the EGFR and SEPT14 exons implicated in the fusion are highly overexpressed compared with the mRNA sequences not included in the fusion event (FIG. 13). Using PCR, the genomic breakpoint coordinates were mapped to chromosome 7 (#55,268,937 for EGFR and #55,870,909 for SEPT14, genome build GRCh37/hg19), falling within EGFR exon 25 and SEPT14 intron 9, which gives rise to a transcript in which the 5′ EGFR exon 24 is spliced to the 3′ SEPT14 exon 10 (FIG. 4d). Interestingly, the fused EGFR-PSPH cDNA and predicted fusion protein in the GBM sample TCGA-06-5408 involve the same EGFR N-terminal region implicated in the EGFR-SEPT14 fusion, with PSPH providing a carboxy-terminal portion of 35 amino acids (FIG. 11). An example of a fusion in which the EGFR-TK region is the 3′ partner is the CAND1-EGFR fusion in GSC-3316 (FIG. 14). Thus, either in the more frequent fusions in which EGFR is the 5′ partner or in those with EGFR as the 3′ gene, the region of the EGFR mRNA coding for the TK domain is invariably retained in each of the fusion transcripts (Table 7).
RT-PCR and genomic PCR followed by Sanger sequencing from GBM TCGA-06-5411 were also used to successfully validate the NFASC-NTRK1 fusions, in which the predicted fusion protein includes the TK domain of the high-affinity NGF receptor (TrkA) fused downstream to the immunoglobulin-like region of the cell adhesion and ankyrin-binding region of neurofascin (FIG. 12).


To confirm that GBM harbor recurrent EGFR fusions and determine the frequency in an independent dataset, cDNA from a panel of 248 GBMs was screened, and 10 additional cases harboring EGFR-SEPT14 fusions were discovered (4%). Conversely, NFASC-NTRK1 fusions were not detected in this dataset. The frequency of EGFR-PSPH fusions was 2.2% (3/135).


The discovery of recurrent EGFR fusions in GBM is of particular interest. EGFR is activated in a significant fraction of primary GBM (~25%) by an in-frame deletion of exons 2-7 (EGFRvIII)39. To establish the functional relevance of EGFR fusions, it was determined whether the most frequent EGFR fusion in GBM (EGFR-SEPT14) provides an alternative mechanism of EGFR activation and confers sensitivity to EGFR inhibition. The EGFR-SEPT14 cDNA was cloned, and lentiviruses expressing EGFR-SEPT14, EGFRvIII or EGFR wild type were prepared. Transduction of the SNB19 glioma cell line (which lacks genomic alteration of EGFR) with the recombinant lentiviruses showed that cells expressing EGFR-SEPT14 or EGFRvIII proliferated at a rate that was 2-fold higher than control cells or cells expressing wild type EGFR (FIG. 5a). Furthermore, EGFR-SEPT14 and EGFRvIII markedly enhanced the ability of SNB19 cells to migrate in a wound assay (FIG. 5b, c). Finally, it was investigated whether EGFR-SEPT14 fusions confer sensitivity to inhibition of EGFR-TK activity in vivo. The analysis of a collection of 30 GBM xenografts directly established in the mouse from human GBM identified one xenograft model (D08-0537 MG) harboring the EGFR-SEPT14 fusion. The D08-0537 MG xenograft had been established from a heavily pretreated GBM. Treatment of D08-0537 MG tumors with two EGFR inhibitors showed that each of the two drugs significantly delayed the rate of tumor growth (FIG. 5d). Interestingly, lapatinib, an irreversible EGFR inhibitor recently proposed to target EGFR alterations in GBM40, displayed the strongest anti-tumor effects (FIG. 5d, e). Conversely, EGFR inhibitors were ineffective against the GBM xenograft D08-0714 MG, which lacks genomic alterations of the EGFR gene (FIG. 5d, e). Taken together, these data demonstrate that the EGFR-SEPT14 fusion confers a proliferative and migratory phenotype to glioma cells and imparts sensitivity to EGFR inhibition to human glioma harboring the fusion gene.


Discussion


A computational pipeline is described for the nomination of somatic cancer genes. This approach combines the frequency, magnitude and focality of CNVs at any locus in the human genome with the somatic mutation rate for the genes residing at that genomic location. Thus, two of the genetic hallmarks of driver cancer genes (focality of copy number aberrations and point mutations) are integrated into a single score. The approach identifies marks of positive somatic selection in large unbiased cancer genome studies by efficiently removing the large burden of passenger mutations that characterize most human tumors, and it will be applicable to the dissection of the genomic landscape of other cancer types.


Besides recognizing nearly all the known genes reported to have functional relevance in GBM, our study discovered and validated somatic mutations in 18 new genes, which also harbor focal and recurrent CNVs in a significant fraction of GBM. For some of these genes, their importance extends beyond GBM, as underscored by cross-tumor relevance (e.g. BCOR) and protein family recurrence (e.g. LRP family members). For example, mutations of LZTR-1 have been reported in other tumors. In particular, mutations of the highly conserved residues in the Kelch domain (W105, G248, T288) and in the second BTB-BACK domain (R810) reported here are recurrent events in other tumor types41. Thus, understanding the nature of the substrates of LZTR-1-Cul3 ubiquitin ligase activity will provide important insights into the pathogenesis of multiple cancer types.


The identification of genetic and epigenetic loss-of-function alterations of the CTNND2 gene clustered in mesenchymal GBM provides a clue to the genetic events driving this aggressive GBM subtype. The important functions of δ-catenin for such crucial neuronal morphogenesis activities as the coordinated control of axonal and dendritic arborization indicate that full-blown mesenchymal transformation in the brain requires loss of the master regulators constraining cell determination in the CNS along the default neuronal lineage. The ability of δ-catenin to reprogram glioma cells that express mesenchymal genes towards a neuronal fate reveals an unexpected plasticity of mesenchymal GBM that might be exploited therapeutically.


In this study, the landscape of gene fusions is reported from a large dataset of GBM analyzed by RNA-Sequencing. In-frame gene fusions retaining the RTK-coding domain of EGFR emerged as the most frequent gene fusion events in GBM. In this tumor, EGFR is frequently targeted by focal amplifications, and our finding underscores the strong recombinogenic probability of focally amplified genes, as recently reported for the myc locus in medulloblastoma42. Resembling intragenic rearrangements that generate the EGFRvIII allele, EGFR-SEPT14 fusions enhance the proliferative and migratory capacity of glioma cells. They also confer sensitivity to EGFR inhibition to human GBM grown as mouse xenografts. These findings highlight the relevance of gene fusions implicating RTK-coding genes in the pathogenesis of GBM9. They also provide a strong rationale for the inclusion of GBM patients harboring EGFR fusions in clinical trials based on EGFR inhibitors.


Methods


139 paired tumor-normal samples from TCGA were analyzed with the SAVI pipeline43. The SAVI algorithm estimates frequencies of variant alleles in a sample as well as the difference in allele frequency between paired samples. The algorithm establishes posterior high credibility intervals for those frequencies and differences of frequencies, which can be used for genotyping the samples on the one hand, and detecting somatic mutations in the case of tumor/normal pairs of samples on the other. The algorithm allows for random sequencing errors and uses the Phred scores of the sequenced alleles as an estimate of their reliability. To integrate point mutation and 469 GBM CNV data (Affymetrix SNP6.0), MutComFocal (see below) was used. The MutComFocal algorithm assigns a driver score to each gene through three different strategies that give priority to lesions, samples, and genes in which there is less uncertainty regarding potential tumorigenic drivers. First, the focality component of the score is inversely proportional to the size of the genomic lesion to which a gene belongs and thus prioritizes more focal genomic lesions. Second, the recurrence component of the MutComFocal score is inversely proportional to the total number of genes altered in a sample, which prioritizes samples with a smaller number of altered genes. Finally, the mutation component of the score is inversely proportional to the total number of genes mutated in a sample, which achieves the two-fold goal of prioritizing mutated genes on one hand, and samples with a smaller number of mutations on the other.


RNA-Seq data from 161 GBM tumor samples from TCGA were also analyzed, plus 24 samples generated from our own dataset of GSCs. Nine of these samples, previously reported in other studies, were kept in the list to evaluate recurrence9. The samples were analyzed by means of the ChimeraScan algorithm in order to detect a list of gene fusion candidates44. Using the Pegasus annotation pipeline (http://sourceforge.net/projects/pegasus-fus/), the fusion transcript was reconstructed, the reading frame was annotated and protein domains were detected that are either conserved or lost in the new chimeric event. The genomic breakpoint of recurrent gene fusion RNA transcripts was also probed for using whole-exome sequencing data (EXome-Fuse algorithm)9. The Kaplan-Meier survival analyses for CTNND2 CNV and CTNND2 expression were obtained using the REMBRANDT glioma dataset.


SAVI (Statistical Algorithm for Variant Frequency Identification):


The frequency of alleles in a sample was estimated by the SAVI pipeline, which constructs an empirical Bayesian prior for those frequencies, using data from the whole sample, and obtains a posterior distribution and high credibility intervals for each alleleS1. The prior and posterior are distributed over a discrete set of frequencies with a precision of 1% and are connected by a modified binomial likelihood, which allows for some error rate. More precisely, a prior distribution p(f) of the frequency f and a prior for the error e uniform on the interval [0, E] was assumed for a fixed 0≦E≦1. The sequencing data at a particular allele is a random experiment producing a string of m (the total depth at the allele) bits with n “1” s (the variant depth at the allele). Assuming a binomial likelihood of the data and allowing for bits being misread due to random errors, the posterior probability P(f) of the frequency f is







$$P(f) = \frac{p(f)}{C}\cdot\frac{1}{1-2f}\int_{f}^{f+E-2Ef} x^{n}\,(1-x)^{m-n}\,dx$$


where C is a normalization constant. For a particular allele, the value of E is determined by the quality of the nucleotides sequenced at that position as specified by their Phred scores. The SAVI pipeline takes as input the reads produced by the sequencing technology, filters out low quality reads and maps the rest onto a human reference genome. After mapping, a Bayesian prior for the distribution of allele frequencies for each sample is constructed by an iterative posterior update procedure, starting with a uniform prior. To genotype the sample, the posterior high credibility intervals were used for the frequency of the alleles at each genomic location. Alternatively, combining the Bayesian priors from different samples, posterior high credibility intervals were obtained for the difference between the samples of the frequencies of each allele. Finally, the statistically significant differences between the tumor and normal samples are reported as somatic variants. To estimate the positive predictive value of SAVI in the TCGA GBM samples, 41 mutations were selected for independent validation by Sanger sequencing. 39 of the 41 mutations were confirmed using Sanger sequencing, resulting in a validation rate of 0.95 (95% CI 0.83-0.99).
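By way of illustration only, the posterior described above can be sketched numerically on the 1% frequency grid. The sketch assumes that a read shows the variant bit with probability x = f + e − 2ef when the true frequency is f and the error rate is e, and it averages the binomial kernel over e uniform on [0, E]. After the substitution x = f + e − 2ef this is the same integral over x from f to f + E − 2Ef, with constant factors absorbed into the normalization, and it avoids a division when f = 0.5. The function names, the midpoint-rule integration and the central interval are assumptions of this sketch and not the SAVI implementation.

```python
def savi_like_posterior(n, m, E, prior=None, steps=200):
    """Posterior over f = 0.00, 0.01, ..., 1.00 for n variant reads out of m.

    Hypothetical helper: averages x**n * (1 - x)**(m - n) over an error
    rate e uniform on [0, E], with x = f + e - 2*e*f.
    """
    freqs = [i / 100 for i in range(101)]
    if prior is None:
        prior = [1.0 / len(freqs)] * len(freqs)  # uniform starting prior
    weights = []
    for f, p_f in zip(freqs, prior):
        total = 0.0
        for k in range(steps):
            e = E * (k + 0.5) / steps        # midpoint rule on [0, E]
            x = f + e - 2.0 * e * f          # chance a read shows the variant
            total += x ** n * (1.0 - x) ** (m - n)
        weights.append(p_f * total / steps)
    norm = sum(weights)                      # plays the role of the constant C
    return freqs, [w / norm for w in weights]

def central_interval(freqs, post, mass=0.95):
    """Equal-tailed credible interval (an assumed stand-in for the
    'high credibility interval' of the text)."""
    tail = (1.0 - mass) / 2.0
    cum, low, high, found_low = 0.0, freqs[0], freqs[-1], False
    for f, p in zip(freqs, post):
        cum += p
        if not found_low and cum >= tail:
            low, found_low = f, True
        if cum >= 1.0 - tail:
            high = f
            break
    return low, high

# Hypothetical allele: 30 variant reads at total depth 100, error bound 1%.
freqs, post = savi_like_posterior(n=30, m=100, E=0.01)
mode = freqs[post.index(max(post))]
print(mode, central_interval(freqs, post))
```

Passing the posterior of one round as the `prior` of the next gives a simple analogue of the iterative posterior update mentioned in the text.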


Candidate genes were ranked by the number of somatic non-synonymous mutations. A robust fit of the ratio of non-synonymous to synonymous mutations was generated with a bisquare weighting function. Excess of non-synonymous alterations was estimated using a Poisson distribution with mean equal to the product of the ratio from the robust fit and the number of synonymous mutations. Genes in highly polymorphic genomic regions were filtered out based on an independent cohort of normal samples. The list of these regions includes families of genes known to generate false positives in somatic predictions (e.g. HLA, KRT and OR).
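By way of illustration only, the Poisson step can be sketched as an upper-tail probability. The robust bisquare fit is not reproduced here; its output is taken as an input (`fitted_ratio`), and the numbers in the example are hypothetical.

```python
import math

def poisson_upper_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    term = math.exp(-mean)       # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mean / i         # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def nonsynonymous_excess_pvalue(n_nonsyn, n_syn, fitted_ratio):
    """Chance of seeing at least n_nonsyn non-synonymous mutations when the
    expected count is fitted_ratio * n_syn, as described in the text."""
    expected = fitted_ratio * n_syn
    return poisson_upper_tail(n_nonsyn, expected)

# Hypothetical gene: 10 non-synonymous and 1 synonymous mutation,
# with an assumed fitted non-synonymous/synonymous ratio of 3.0.
print(nonsynonymous_excess_pvalue(10, 1, 3.0))
```

A small value flags a gene whose non-synonymous count exceeds what the cohort-wide ratio would predict.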


MutComFocal.


Key cancer genes are often found amplified or deleted in chromosomal regions containing many other genes. Point mutations and gene fusions, on the other hand, provide more specific information about which genes may be implicated in the oncogenic process. MutComFocal, a Bayesian approach aiming to identify driver genes by integrating CNV and point mutation data, was developed.


For a particular sample, let (c1, N1), . . . , (ck, Nk) describe the amplification lesions in that sample, so that Ni is the number of genes in the i-th lesion and ci is its copy number change from normal. For a gene belonging to the i-th lesion, the amplification recurrence sample score is defined as ci/(Σj cj·Nj) and its amplification focality sample score is defined as (ci/Σj cj)·(1/Ni). To obtain the amplification recurrence and focality scores for a particular gene, the corresponding sample scores were summed over all samples and the result was normalized so that each score sums to 1. The deletion recurrence and focality scores are defined in a similar manner. The mutation score is analogous to a recurrence score in which it was assumed that mutated genes belong to lesions with only one gene.
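By way of illustration only, the recurrence and focality scores can be sketched as follows, reading the focality sample score as (ci/Σj cj)·(1/Ni). The data layout (a list of samples, each a list of lesions given as a copy number change and a gene list) and the function names are assumptions of this sketch, which also assumes that a gene falls in at most one lesion per sample.

```python
from collections import defaultdict

def sample_scores(lesions):
    """Per-gene recurrence and focality scores for one sample.

    lesions: list of (copy_change, [gene, ...]) tuples (hypothetical layout).
    """
    total_change = sum(c for c, _ in lesions)                 # sum_j c_j
    weighted_genes = sum(c * len(g) for c, g in lesions)      # sum_j c_j * N_j
    recurrence, focality = {}, {}
    for c, genes in lesions:
        for gene in genes:
            recurrence[gene] = c / weighted_genes
            focality[gene] = (c / total_change) / len(genes)
    return recurrence, focality

def cohort_scores(samples):
    """Sum the sample scores over all samples, then normalize each to sum to 1."""
    rec_sum, foc_sum = defaultdict(float), defaultdict(float)
    for lesions in samples:
        if not lesions:
            continue  # a sample without lesions contributes nothing
        rec, foc = sample_scores(lesions)
        for gene, value in rec.items():
            rec_sum[gene] += value
        for gene, value in foc.items():
            foc_sum[gene] += value
    rec_total, foc_total = sum(rec_sum.values()), sum(foc_sum.values())
    rec_norm = {g: v / rec_total for g, v in rec_sum.items()}
    foc_norm = {g: v / foc_total for g, v in foc_sum.items()}
    return rec_norm, foc_norm

# Hypothetical amplification data for two samples.
samples = [
    [(2.0, ["EGFR"]), (1.0, ["A", "B", "C"])],
    [(1.0, ["EGFR", "A"])],
]
recurrence, focality = cohort_scores(samples)
print(recurrence["EGFR"], focality["EGFR"])
```

Within a single sample both scores sum to 1 over the altered genes, so a sample with few altered genes and a lesion with few genes each concentrate more weight on their members.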


The amplification/mutation score is defined as the product of the two amplification scores and the mutation score, while the deletion/mutation score is defined as the product of the two deletion scores and the mutation score. The amplification/mutation and deletion/mutation scores are normalized to 1 and, for each score, genes are divided into tiers iteratively, so that the top 2^H remaining genes are included in the next tier, where H is the entropy of the scores of the remaining genes normalized to 1. Based on their tier across the different types of scores, genes are assigned to being either deleted/mutated or amplified/mutated, and genes in the top tiers are grouped into contiguous regions. The top genes in each region are considered manually and selected for further functional validation.
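By way of illustration only, the iterative tiering can be sketched as below. The sketch assumes that H is the Shannon entropy in bits (so that 2^H is an effective number of genes) and that 2^H is rounded to the nearest integer with a minimum of one gene per tier; neither detail is stated in the text, and the function name and example scores are hypothetical.

```python
import math

def entropy_tiers(scores):
    """Split genes into tiers; each tier takes the top round(2**H) remaining genes.

    scores: dict mapping gene name to a non-negative score.
    """
    remaining = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tiers = []
    while remaining:
        total = sum(value for _, value in remaining)
        if total <= 0:
            tiers.append([gene for gene, _ in remaining])  # nothing left to rank
            break
        probs = [value / total for _, value in remaining]  # renormalize to 1
        h = -sum(p * math.log2(p) for p in probs if p > 0)
        size = max(1, min(len(remaining), round(2 ** h)))
        tiers.append([gene for gene, _ in remaining[:size]])
        remaining = remaining[size:]
    return tiers

# Hypothetical normalized scores for four genes.
print(entropy_tiers({"a": 0.9, "b": 0.05, "c": 0.03, "d": 0.02}))
```

When one gene dominates the scores the entropy is low and the first tier is small; when scores are nearly uniform the entropy is high and most remaining genes share a tier.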


The recurrence and focality scores can be interpreted as the posterior probabilities that a gene is driving the selection of the disease, under two different priors for this: one global and one local in nature. The recurrence score is higher if a gene participates in many samples that do not have too many altered genes, while the focality score is higher if the gene participates in many focal lesions. Besides lending strong support to the inference of a gene as a potential driver, the directionality of the copy number alteration (amplification or deletion) informs us of the likely behavior of the candidate gene as an oncogene or tumor suppressor, respectively.


The genes displayed in FIG. 1 are selected based on the MutComFocal ranking (top 250 genes), the size of the minimal region (less than 10 genes) and the frequency of mutations (more than 2% for deletion/mutations and at least 1% for amplification/mutations).


RNA-Seq Bioinformatics Analysis.


161 RNA-Seq GBM tumor samples were analyzed from The Cancer Genome Atlas (TCGA), a public repository containing large-scale genome-sequencing of different cancers, plus 24 patient-derived GSCs. Nine of the GSC samples reported in previous studies were kept in the list to evaluate recurrenceS2. The samples were analyzed by means of the ChimeraScanS3 algorithm in order to detect a list of gene fusion candidates. Briefly, ChimeraScan detects those reads that discordantly align to different transcripts of the same reference (split inserts). These reads provide an initial set of putative fusion candidates. Finally, the algorithm realigns the initially unmapped reads to the putative fusion candidates and detects those reads that align across the junction boundary (split reads). These reads provide the genomic coordinates of the breakpoint.


RNA-Seq analysis detected a total of 39,329 putative gene fusion events. In order to focus the experimental analysis on biologically relevant fused transcripts, the Pegasus annotation pipeline (http://sourceforge.net/projects/pegasus-fus/) was applied. For each putative fusion, Pegasus reconstructs the entire fusion sequence on the basis of genomic fusion breakpoint coordinates and gene annotations. Pegasus also annotates the reading frame of the resulting fusion sequences as either in-frame or frame-shift. Moreover, Pegasus detects the protein domains that are either conserved or lost in the new chimeric event by predicting the amino acid sequence and automatically querying the UniProt web service. On the basis of the Pegasus annotation report, relevant gene fusions were selected for further experimental validation according to the reading frame and the conserved/lost domains. The selected list (Table 7) was based on in-frame events expressed by ten or more reads and at least one read spanning the breakpoint. To filter out candidate trans-splicing events, the analysis focused on events with putative breakpoints at a distance of at least 25 kb.
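By way of illustration only, the selection criteria above (in-frame, ten or more reads, at least one breakpoint-spanning read, breakpoints at least 25 kb apart) can be sketched as a simple filter. The record layout and names are hypothetical and do not reflect the actual Pegasus output format; treating breakpoints on different chromosomes as satisfying the distance criterion is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class FusionCandidate:
    """Hypothetical record for one annotated fusion candidate."""
    gene5: str
    gene3: str
    in_frame: bool
    total_reads: int
    spanning_reads: int
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int

def passes_filters(c, min_reads=10, min_spanning=1, min_distance=25_000):
    """Apply the selection criteria described in the text."""
    if not c.in_frame:
        return False
    if c.total_reads < min_reads or c.spanning_reads < min_spanning:
        return False
    if c.chrom5 != c.chrom3:
        return True  # assumption: inter-chromosomal events pass the distance test
    return abs(c.pos5 - c.pos3) >= min_distance

# Breakpoint coordinates are those given in the text for EGFR-SEPT14 (hg19);
# the read counts are hypothetical.
egfr_sept14 = FusionCandidate("EGFR", "SEPT14", True, 25, 3,
                              "chr7", 55_268_937, "chr7", 55_870_909)
print(passes_filters(egfr_sept14))
```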


EXome-Fuse: Identification of Genetic Rearrangements Using Whole-Exome Data.


Although whole-exome sequencing data contains low intronic coverage that reduces the sensitivity for fusion discovery, it is readily available through the TCGA database. To characterize the genomic breakpoint of the chromosomal rearrangement, EXome-Fuse was designed: a gene fusion discovery pipeline particularly designed to analyze whole-exome data. For the samples harboring EGFR-SEPT14, EGFR-PSPH, NFASC-NTRK1, and BCAN-NTRK1 fusions in RNA, EXome-Fuse was applied to the corresponding whole-exome sequencing data deposited in TCGA. This algorithm can be divided into three stages: split insert identification, split read identification, and virtual reference alignment. Mapping against the human genome reference hg18 with BWA, all split inserts were first identified to compile a preliminary list of fusion candidates. This list was purged of any false positives produced from paralogous gene pairs using the Duplicated Genes Database and the EnsemblCompara GeneTreeS4. Pseudogenes in the candidate list were annotated using the list from the HUGO Gene Nomenclature Committee (HGNC) databaseS5 and given lower priority. Candidates between homologous genes were also filtered out, as were those with homologous or low-complexity regions around the breakpoint. For the remaining fusion candidates, any supporting split reads and their mates were probed using BLAST with a word size of 16, an identity cutoff of 90%, and an expectation cutoff of 10^-4. Finally, a virtual reference was created for each fusion transcript and all reads were re-aligned to calculate a final tally of split inserts and split reads such that all aligning read pairs maintain F-R directionality.


Targeted Exon Sequencing


All protein-coding exons for the 24 genes of interest were sequenced using genomic DNA extracted from frozen tumors and matched blood. 500 ng of DNA from each sample was sheared to an average of 150 bp in a Covaris instrument for 360 seconds (duty cycle, 10%; intensity, 5; cycles per burst, 200). Barcoded libraries were prepared using the Kapa High-Throughput Library Preparation Kit Standard (Kapa Biosystems). Libraries were amplified using the KAPA HiFi Library Amplification kit (Kapa Biosystems) (8 cycles). Libraries were quantified using Qubit Fluorimetric Quantitation (Invitrogen), and the quality and size were assessed using an Agilent Bioanalyzer. An equimolar pool of the 4 barcoded libraries (300 ng each) was created and 1,200 ng was input to exon capture using one reaction tube of the custom Nimblegen SeqCap EZ (Roche) with custom probes targeting the coding exons of the 38 genes. Capture by hybridization was performed according to the manufacturer's protocols with the following modifications: (A) 1 nmol of a pool of blocker oligonucleotides (complementary to the barcoded adapters) was used, and (B) post-capture PCR amplification was done using the KAPA HiFi Library Amplification kit instead of the Phusion High-Fidelity PCR Master Mix with HF Buffer Kit, in a 60 μl volume, since the Kapa HiFi kit greatly reduced or eliminated the bias against GC-rich regions. The pooled capture library was quantified by Qubit (Invitrogen) and Bioanalyzer (Agilent) and sequenced on an Illumina MiSeq sequencer using the 2×150 paired-end cycle protocol. Reads were aligned to the hg19 build of the human genome using BWA with duplicate removal using samtools as implemented by Illumina MiSeq Reporter. Variant detection was performed using GATK UnifiedGenotyper. Somatic mutations were identified for paired samples using SomaticSniper and filtered for frequency of less than 3% in normal and over 3% in tumor samples.
Variants were annotated with Charity annotator to identify protein-coding changes and cross-referenced against known dbSNP, 1000 Genomes, and COSMIC mutations. Sanger sequencing was used to confirm each mutation from normal and tumor DNA.
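By way of illustration only, the frequency filter applied to the SomaticSniper calls (less than 3% in normal and over 3% in tumor) can be sketched as follows. The function name, the use of raw read counts as input and the handling of zero-depth positions are assumptions of this sketch.

```python
def is_somatic_candidate(normal_alt, normal_depth, tumor_alt, tumor_depth,
                         threshold=0.03):
    """Keep a variant when its allele frequency is below the threshold in the
    normal sample and above it in the tumor sample."""
    if normal_depth == 0 or tumor_depth == 0:
        return False  # assumption: positions without coverage are not called
    normal_freq = normal_alt / normal_depth
    tumor_freq = tumor_alt / tumor_depth
    return normal_freq < threshold and tumor_freq > threshold

# Hypothetical read counts at one position.
print(is_somatic_candidate(normal_alt=1, normal_depth=200,
                           tumor_alt=40, tumor_depth=180))
```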


Modeling of LZTR-1


Structural templates for the Kelch and BTB-BACK regions of human LZTR-1 were identified with HHpredS6. An initial 3D model was generated with the I-TASSER serverS7. The Cul3 N-terminal domain was docked onto the model by superposing the KLHL3BTB-BACK/Cul3NTD crystal structure (PDB ID 4HXI, Xi and Privé PLOS ONE 2013) onto the second LZTR-1 BTB-BACK domain. The model does not include higher quaternary structure, although many BTB domains, and many Kelch domains, are known to self-associateS8. The short linkage between the end of the first BACK domain and the beginning of the second BTB domain would appear to preclude an intrachain BTB-BTB pseudo-homodimer; without being bound by theory, LZTR-1 self-associates and forms higher order assemblies. Both BACK domains are the shorter, atypical form of the domain and consist of 2 helical hairpin motifs, as in SPOPS9,S10, and not the 4-hairpin motif seen in most BTB-BACK-Kelch proteinsS10,S11. The model of the Kelch domain predicts an unusual 1+3 velcro arrangementS12, with the N-terminal region contributing strand d of blade 1 and the C-terminal region contributing strands a, b, c of the same blade, although an alternative 2+2 velcro model cannot be ruled out.


Cell Culture


SNB19 and U87 cells were cultured in DMEM supplemented with 10% Fetal Bovine Serum. Growth rate was determined by plating cells in six-well plates 3 days after infection with the lentivirus indicated in Figure Legends. The number of viable cells was determined by trypan blue exclusion in triplicate cultures obtained from triplicate independent infections. Migration was evaluated by a wound assay: confluent cells were scratched with a pipette tip and cultured in 0.25% FBS. After 16 h, images were taken using the Olympus IX70 connected to a digital camera. Images were processed using the ImageJ64 software. The area of the cell-free wound was assessed in triplicate samples. Experiments were repeated twice.


Immunofluorescence and Western Blot


Immunofluorescence staining on brain tumor tissue microarrays was performed as previously describedS17. Immunofluorescence microscopy was performed on cells fixed with 4% para-formaldehyde (PFA) in phosphate buffer. Cells were permeabilized using 0.2% Triton X 100. Antibodies and concentrations used in immunofluorescence staining are:



















Antibody         Host         Dilution   Supplier
B-III Tubulin    Mouse        1:400      Promega
Catenin D2       Guinea Pig   1:500      Acris
Fibronectin      Mouse        1:1,000    BD-Pharmingen
PSD-95           Rabbit       1:500      Invitrogen









Secondary antibodies conjugated to Alexa Fluor 594 (Molecular Probes) were used. DNA was stained by DAPI (Sigma). Fluorescence microscopy was performed on a Nikon A1R MP microscope.


Western blot analysis of U87 cells transduced with pLOC-GFP or pLOC-CTNND2 was performed using the following antibodies:



















Antibody          Host     Dilution   Supplier
Anti-Vinculin     Mouse    1:400      SIGMA
Anti-N-Cadherin   Mouse    1:200      BD-Pharmingen
Cyclin A          Rabbit   1:500      Santa Cruz
P27               Mouse    1:250      BD Transduction









Cloning and Lentiviral Production


The lentiviral expression vectors pLOC-GFP and pLOC-LZTR1 were purchased from Open Biosystems. The full length EGFR-SEPT14 cDNA was amplified from tumor sample TCGA-27-1837. Primers used were: EGFR FW: 5′-agcgATGCGACCCTCCGGGA-3′ (SEQ ID NO: 30) and SEPT14 REV: 5′-TCTTACGATGTTTGTCTTTCTTTGT-3′ (SEQ ID NO: 31). EGFR wild type, EGFRvIII and EGFR-SEPT14 cDNAs were cloned into pLOC and lentiviral particles were produced using published protocolsS13-S15.


Genomic and mRNA RT-PCR


Total RNA was extracted from cells by using the RNeasy Mini Kit (QIAGEN), following the manufacturer's instructions. 500 ng of total RNA was retro-transcribed by using the Superscript III kit (Invitrogen), following the manufacturer's instructions. The cDNAs obtained after the retro-transcription were used as templates for qPCR as describedS13,S15. The reaction was performed with a Roche480 thermal cycler, using the Absolute Blue QPCR SYBR Green Mix from Thermo Scientific. The relative amount of specific mRNA was normalized to GAPDH. Results are presented as the mean±SD of triplicate amplifications. The validation of fusion transcripts was performed using both genomic and RT-PCR with forward and reverse primer combinations designed within the margins of the paired-end read sequences detected by RNA-seq. Expressed fusion transcript variants were subjected to direct sequencing to confirm sequence and translation frame. Primers used for the screening of gene fusions are:











hEGFR-RT-FW1 (SEQ ID NO: 32): 5′-GGGTGACTGTTTGGGAGTTGATG-3′;
hSEP14-RT-REV1 (SEQ ID NO: 33): 5′-TGTTTGTCTTTCTTTGTATCGGTGC-3′;
hEGFR-RT-FW1 (SEQ ID NO: 34): 5′-GTGATGTCTGGAGCTACGGG-3′;
hPSPH-RT-REV1 (SEQ ID NO: 35): 5′-TGCCTGATCACATTTCCTCCA-3′;
hNFASC-RT-FW1 (SEQ ID NO: 36): 5′-AGTTCCGTGTCATTGCCATCAAC-3′;
hNTRK1-RT-REV1 (SEQ ID NO: 37): 5′-TGTTTCGTCCTTCTTCTCCACCG-3′;
hCAND1-RT-FW1 (SEQ ID NO: 38): 5′-GGAAAAAATGACATCCAGCGAC-3′;
hEGFR-RT-REV1 (SEQ ID NO: 39): 5′-TGGGTGTAAGAGGCTCCACAAG-3′.






Primers used for genomic detection of gene fusions are:











genomic EGFR-FW1 (SEQ ID NO: 40): 5′-GGATGATAGACGCAGATAGTCGCC-3′;
genomic SEPT14-REV1 (SEQ ID NO: 41): 5′-TCCAGTTGTTTTTTCTCTTCCTCG-3′;
genomic NFASC-FW1 (SEQ ID NO: 42): 5′-AAGGGAGAGGGGACCAGAAAGAAC-3′;
genomic NTRK1-REV1 (SEQ ID NO: 43): 5′-GAAAGGAAGAGGCAGGCAAAGAC-3′;
genomic CAND1-FW1 (SEQ ID NO: 44): 5′-GCAATAGCAAAACAGGAAGATGTC-3′;
genomic EGFR-REV1 (SEQ ID NO: 45): 5′-GAACACTTACCCATTCGTTGG-3′.






Subcutaneous Xenografts and Drug Treatment


Female athymic mice (nu/nu genotype, Balb/c background, 6 to 8 weeks old) were used for all antitumor studies. Patient-derived adult human glioblastoma xenografts were maintained. Xenografts were excised from host mice under sterile conditions and homogenized with the use of a tissue press/modified tissue cytosieve (BioWhittaker Inc, Walkersville, Md.), and the tumor homogenate was loaded into a repeating Hamilton syringe (Hamilton, Co., Reno, Nev.) dispenser. Cells were injected subcutaneously into the right flank of the athymic mouse at an inoculation volume of 50 μl with a 19-gauge needleS16. Subcutaneous tumors were measured twice weekly with hand-held vernier calipers (Scientific Products, McGraw, Ill.). Tumor volumes, V, were calculated with the following formula: [(width)^2×(length)]/2=V (mm^3). For the subcutaneous tumor studies, groups of mice randomly selected by tumor volume were treated with EGFR kinase inhibitors when the median tumor volumes were on average 150 mm^3 and were compared with control animals receiving vehicle (saline). Erlotinib was administered at 100 mg/Kg orally daily for 10 days. Lapatinib was administered at 75 mg/Kg orally twice per day for 20 days. Response to treatment was assessed by delay in tumor growth and tumor regression. Growth delay, expressed as T-C, is defined as the difference in days between the median time required for tumors in treated and control animals to reach a volume five times greater than that measured at the start of the treatment. Tumor regression is defined as a decrease in tumor volume over two successive measurements. Statistical analysis was performed using a SAS statistical analysis program, the Wilcoxon rank order test for growth delay, and Fisher's exact test for tumor regression as previously described.
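By way of illustration only, the tumor volume and growth delay (T-C) calculations can be sketched as follows. The sketch assumes the conventional caliper formula in which the width is squared, which is what the mm^3 unit implies, and it takes the first measurement day on which a tumor reaches five times its starting volume, without interpolating between measurements. The function names and the example measurements are hypothetical.

```python
from statistics import median

def tumor_volume(width_mm, length_mm):
    """Caliper-based volume in cubic millimeters: (width^2 x length) / 2."""
    return (width_mm ** 2) * length_mm / 2.0

def days_to_fivefold(days, volumes):
    """First measurement day on which the volume is at least 5x the volume
    at the start of treatment (the first entry); None if never reached."""
    target = 5.0 * volumes[0]
    for day, volume in zip(days, volumes):
        if volume >= target:
            return day
    return None

def growth_delay(treated, control):
    """T-C in days: median time to 5x volume in treated minus control animals.

    treated, control: lists of (days, volumes) series, one per animal.
    """
    t_days = [days_to_fivefold(d, v) for d, v in treated]
    c_days = [days_to_fivefold(d, v) for d, v in control]
    if None in t_days or None in c_days:
        raise ValueError("some tumors never reached five times the starting volume")
    return median(t_days) - median(c_days)

# Hypothetical measurements (day, volume in mm^3) for two animals per group.
treated = [([0, 4, 8, 12], [150, 300, 600, 800]),
           ([0, 4, 8, 12, 16], [150, 200, 400, 700, 900])]
control = [([0, 4, 8], [150, 500, 800]),
           ([0, 4, 8], [150, 760, 900])]
print(tumor_volume(6, 10), growth_delay(treated, control))
```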


REFERENCES



  • 1 Porter, K. R., McCarthy, B. J., Freels, S., Kim, Y. & Davis, F. G. Prevalence estimates for primary brain tumors in the United States by age, gender, behavior, and histology. Neuro-oncology 12, 520-527, doi:10.1093/neuonc/nop066 (2010).

  • 2 Stupp, R. et al. Radiotherapy plus concomitant and adjuvant temozolomide for glioblastoma. The New England journal of medicine 352, 987-996, doi:10.1056/NEJMoa043330 (2005).

  • 3 Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061-1068, doi:10.1038/nature07385 (2008).

  • 4 Noushmehr, H. et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 17, 510-522, doi:10.1016/j.ccr.2010.03.017 (2010).

  • 5 Parsons, D. W. et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807-1812, doi:10.1126/science.1164382 (2008).

  • 6 Verhaak, R. G. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98-110, doi:10.1016/j.ccr.2009.12.020 (2010).

  • 7 Bass, A. J. et al. Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet 43, 964-968, doi:10.1038/ng.936 (2011).

  • 8 Chinnaiyan, A. M. & Palanisamy, N. Chromosomal aberrations in solid tumors. Prog Mol Biol Transl Sci 95, 55-94, doi:10.1016/B978-0-12-385071-3.00004-6 (2010).

  • 9 Singh, D. et al. Transforming fusions of FGFR and TACC genes in human glioblastoma. Science 337, 1231-1235, doi:10.1126/science.1220834 (2012).

  • 10 Rubin, A. F. & Green, P. Mutation patterns in cancer genomes. Proc Natl Acad Sci USA 106, 21766-21770, doi:10.1073/pnas.0912499106 (2009).

  • 11 Fan, Z. et al. BCOR regulates mesenchymal stem cell function by epigenetic mechanisms. Nat Cell Biol 11, 1002-1009, doi:10.1038/ncb1913 (2009).

  • 12 Wamstad, J. A. & Bardwell, V. J. Characterization of Bcor expression in mouse development. Gene Expr Patterns 7, 550-557, doi:10.1016/j.modgep.2007.01.006 (2007).

  • 13 Wamstad, J. A., Corcoran, C. M., Keating, A. M. & Bardwell, V. J. Role of the transcriptional corepressor Bcor in embryonic stem cell differentiation and early embryonic development. PLoS One 3, e2814, doi:10.1371/journal.pone.0002814 (2008).

  • 14 Pugh, T. J. et al. Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature 488, 106-110, doi:10.1038/nature11329 (2012).

  • 15 Zhang, J. et al. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 481, 329-334, doi:10.1038/nature10733 (2012).

  • 16 Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899-905, doi:10.1038/nature08822 (2010).

  • 17 Kantarci, S. et al. Mutations in LRP2, which encodes the multiligand receptor megalin, cause Donnai-Barrow and facio-oculo-acoustico-renal syndromes. Nat Genet 39, 957-959, doi:10.1038/ng2063 (2007).

  • 18 Willnow, T. E. et al. Defective forebrain development in mice lacking gp330/megalin. Proc Natl Acad Sci USA 93, 8460-8464 (1996).

  • 19 Christ, A. et al. LRP2 is an auxiliary SHH receptor required to condition the forebrain ventral midline for inductive signals. Dev Cell 22, 268-278, doi:10.1016/j.devcel.2011.11.023 (2012).

  • 20 Cowin, P. A. et al. LRP1B deletion in high-grade serous ovarian cancers is associated with acquired chemotherapy resistance to liposomal doxorubicin. Cancer Res 72, 4060-4073, doi:10.1158/0008-5472.CAN-12-0203 (2012).

  • 21 Lima, F. R. et al. Glioblastoma: therapeutic challenges, what lies ahead. Biochim Biophys Acta 1826, 338-349, doi:10.1016/j.bbcan.2012.05.004 (2012).

  • 22 Bekker-Jensen, S. et al. HERC2 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Nat Cell Biol 12, 80-86; sup pp 81-12, doi:10.1038/ncb2008 (2010).

  • 23 Harlalka, G. V. et al. Mutation of HERC2 causes developmental delay with Angelman-like features. J Med Genet 50, 65-73, doi:10.1136/jmedgenet-2012-101367 (2013).

  • 24 Nacak, T. G., Leptien, K., Fellner, D., Augustin, H. G. & Kroll, J. The BTB-kelch protein LZTR-1 is a novel Golgi protein that is degraded upon induction of apoptosis. J Biol Chem 281, 5065-5071, doi:10.1074/jbc.M509073200 (2006).

  • 25 Stogios, P. J., Downs, G. S., Jauhal, J. J., Nandra, S. K. & Prive, G. G. Sequence and structural analysis of BTB domain proteins. Genome Biol 6, R82, doi:10.1186/gb-2005-6-10-r82 (2005).

  • 26 Errington, W. J. et al. Adaptor protein self-assembly drives the control of a cullin-RING ubiquitin ligase. Structure 20, 1141-1153, doi:10.1016/j.str.2012.04.009 (2012).

  • 27 Canning, P. et al. Structural basis for Cul3 assembly with the BTB-Kelch family of E3 ubiquitin ligases. J Biol Chem, doi:10.1074/jbc.M112.437996 (2013).

  • 28 Lo, S. C., Li, X., Henzl, M. T., Beamer, L. J. & Hannink, M. Structure of the Keap1:Nrf2 interface provides mechanistic insight into Nrf2 signaling. EMBO J 25, 3605-3617, doi:10.1038/sj.emboj.7601243 (2006).

  • 29 Boyden, L. M. et al. Mutations in kelch-like 3 and cullin 3 cause hypertension and electrolyte abnormalities. Nature 482, 98-102, doi:10.1038/nature10814 (2012).

  • 30 Louis-Dit-Picard, H. et al. KLHL3 mutations cause familial hyperkalemic hypertension by impairing ion transport in the distal nephron. Nat Genet 44, 456-460, S451-453, doi:10.1038/ng.2218 (2012).

  • 31 Abu-Elneel, K. et al. A delta-catenin signaling pathway leading to dendritic protrusions. J Biol Chem 283, 32781-32791, doi:10.1074/jbc.M804688200 (2008).

  • 32 Arikkath, J. et al. Delta-catenin regulates spine and synapse morphogenesis and function in hippocampal neurons during development. J Neurosci 29, 5435-5442, doi:10.1523/JNEUROSCI.0835-09.2009 (2009).

  • 33 Kosik, K. S., Donahue, C. P., Israely, I., Liu, X. & Ochiishi, T. Delta-catenin at the synaptic-adherens junction. Trends Cell Biol 15, 172-178, doi:10.1016/j.tcb.2005.01.004 (2005).

  • 34 Israely, I. et al. Deletion of the neuron-specific protein delta-catenin leads to severe cognitive and synaptic dysfunction. Curr Biol 14, 1657-1663, doi:10.1016/j.cub.2004.08.065 (2004).

  • 35 Jun, G. et al. delta-Catenin is genetically and biologically associated with cortical cataract and future Alzheimer-related structural and functional brain changes. PLoS One 7, e43728, doi:10.1371/journal.pone.0043728 (2012).

  • 36 Hicks, S., Wheeler, D. A., Plon, S. E. & Kimmel, M. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum Mutat 32, 661-668, doi:10.1002/humu.21490 (2011).

  • 37 Phillips, H. S. et al. Molecular subclasses of high-grade glioma predict prognosis, delineate a pattern of disease progression, and resemble stages in neurogenesis. Cancer Cell 9, 157-173, doi:10.1016/j.ccr.2006.02.019 (2006).

  • 38 Pierotti, M. A. & Greco, A. Oncogenic rearrangements of the NTRK1/NGF receptor. Cancer Lett 232, 90-98, doi:10.1016/j.canlet.2005.07.043 (2006).

  • 39 Dunn, G. P. et al. Emerging insights into the molecular and cellular basis of glioblastoma. Genes Dev 26, 756-784, doi:10.1101/gad.187922.112 (2012).

  • 40 Vivanco, I. et al. Differential sensitivity of glioma- versus lung cancer-specific EGFR mutations to EGFR kinase inhibitors. Cancer Discov 2, 458-471, doi:10.1158/2159-8290.CD-11-0284 (2012).

  • 41 Forbes, S. A. et al. COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38, D652-657, doi:10.1093/nar/gkp995 (2010).

  • 42 Northcott, P. A. et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49-56, doi:10.1038/nature11327 (2012).

  • 43 Tiacci, E. et al. BRAF mutations in hairy-cell leukemia. The New England journal of medicine 364, 2305-2315, doi:10.1056/NEJMoa1014209 (2011).

  • 44 Iyer, M. K., Chinnaiyan, A. M. & Maher, C. A. ChimeraScan: a tool for identifying chimeric transcription in sequencing data. Bioinformatics 27, 2903-2904, doi:10.1093/bioinformatics/btr467 (2011).

  • S1 Tiacci, E. et al. BRAF mutations in hairy-cell leukemia. The New England journal of medicine 364, 2305-2315, doi:10.1056/NEJMoa1014209 (2011).

  • S2 Singh, D. et al. Transforming fusions of FGFR and TACC genes in human glioblastoma. Science 337, 1231-1235, doi:10.1126/science.1220834 (2012).

  • S3 Iyer, M. K., Chinnaiyan, A. M. & Maher, C. A. ChimeraScan: a tool for identifying chimeric transcription in sequencing data. Bioinformatics 27, 2903-2904, doi:10.1093/bioinformatics/btr467 (2011).

  • S4 Vilella, A. J. et al. EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res 19, 327-335, doi:10.1101/gr.073585.107 (2009).

  • S5 Seal, R. L., Gordon, S. M., Lush, M. J., Wright, M. W. & Bruford, E. A. genenames.org: the HGNC resources in 2011. Nucleic Acids Res 39, D514-519, doi:10.1093/nar/gkq892 (2011).

  • S6 Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951-960, doi:10.1093/bioinformatics/bti125 (2005).

  • S7 Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 5, 725-738, doi:10.1038/nprot.2010.5 (2010).

  • S8 Stogios, P. J., Downs, G. S., Jauhal, J. J., Nandra, S. K. & Prive, G. G. Sequence and structural analysis of BTB domain proteins. Genome Biol 6, R82, doi:10.1186/gb-2005-6-10-r82 (2005).

  • S9 Errington, W. J. et al. Adaptor protein self-assembly drives the control of a cullin-RING ubiquitin ligase. Structure 20, 1141-1153, doi:10.1016/j.str.2012.04.009 (2012).

  • S10 Zhuang, M. et al. Structures of SPOP-substrate complexes: insights into molecular architectures of BTB-Cul3 ubiquitin ligases. Mol Cell 36, 39-50, doi:10.1016/j.molcel.2009.09.022 (2009).

  • S11 Canning, P. et al. Structural basis for Cul3 assembly with the BTB-Kelch family of E3 ubiquitin ligases. J Biol Chem, doi:10.1074/jbc.M112.437996 (2013).

  • S12 Fulop, V. & Jones, D. T. Beta propellers: structural rigidity and functional diversity. Curr Opin Struct Biol 9, 715-721 (1999).

  • S13 Carro, M. S. et al. The transcriptional network for mesenchymal transformation of brain tumours. Nature 463, 318-325, doi:10.1038/nature08712 (2010).

  • S14 Niola, F. et al. Mesenchymal high-grade glioma is maintained by the ID-RAP1 axis. J Clin Invest 123, 405-417, doi:10.1172/JCI63811 (2013).

  • S15 Zhao, X. et al. The N-Myc-DLL3 cascade is suppressed by the ubiquitin ligase Huwe1 to inhibit proliferation and promote neurogenesis in the developing brain. Dev Cell 17, 210-221, doi:10.1016/j.devcel.2009.07.009 (2009).

  • S16 Friedman, H. S. et al. Experimental chemotherapy of human medulloblastoma cell lines and transplantable xenografts with bifunctional alkylating agents. Cancer Res 48, 4189-4195 (1988).

  • S17 Srivastava, M. et al. The Amphimedon queenslandica genome and the evolution of animal complexity. Nature 466, 720-726, doi:10.1038/nature09201 (2010).

  • S18 Sebe-Pedros, A., Roger, A. J., Lang, F. B., King, N. & Ruiz-Trillo, I. Ancient origin of the integrin-mediated adhesion and signaling machinery. Proc Natl Acad Sci USA 107, 10142-10147, doi:10.1073/pnas.1002257107 (2010).



Example 2
Genomic Alterations in Glioblastoma

Glioblastoma remains one of the most challenging forms of cancer to treat. This example discusses a computational platform that integrates the analysis of copy number variations and somatic mutations and unravels the landscape of in-frame gene fusions in glioblastoma. Mutations were found with loss of heterozygosity of LZTR-1, an adaptor of Cul3-containing E3 ligase complexes. Mutations and deletions disrupt LZTR-1 function, which restrains self-renewal and growth of glioma spheres retaining stem cell features. Loss-of-function mutations of CTNND2 target a neural-specific gene and are associated with transformation of glioma cells along the very aggressive mesenchymal phenotype. Recurrent translocations are reported that fuse the coding sequence of EGFR to several partners, with EGFR-SEPT14 as the most frequent functional gene fusion in human glioblastoma. EGFR-SEPT14 fusions activate Stat3 signaling and confer mitogen independence and sensitivity to EGFR inhibition. These results provide important insights into the pathogenesis of glioblastoma and highlight new targets for therapeutic intervention.


Glioblastoma (GBM) is the most common primary intrinsic malignant brain tumor, affecting ~10,000 new patients each year, with a median survival of 12-15 months1,2. Identifying and understanding the functional significance of genetic alterations that drive initiation and progression of GBM is crucial to develop effective therapies. Previous efforts in GBM genome characterization identified somatic changes in well-known GBM genes (EGFR, PTEN, IDH1, TP53, NF1, etc.) and nominated putative cancer genes with somatic mutations, but the functional consequence of most alterations is unknown3-6. Furthermore, the abundance of passenger mutations and large regions of copy number variations (CNVs) complicates the definition of the landscape of driver mutations in glioblastoma. To address this challenge, a statistical approach was used to nominate driver genes in GBM by integrating somatic mutations identified by whole-exome sequencing with a CNV analysis that prioritizes focality and magnitude of the genetic alterations.


Recurrent and oncogenic gene fusions are hallmarks of hematological malignancies and have also been uncovered in solid tumors7,8. Recently, a small subset of GBM was reported to harbor FGFR-TACC gene fusions, indicating that patients with FGFR-TACC-positive tumors would benefit from targeted FGFR kinase inhibition9. It remains unknown whether gene fusions involving other RTK-coding genes exist and produce oncogene addiction in GBM. Here, a large RNA-sequencing dataset of primary GBM and Glioma Sphere Cultures (GSCs) is investigated and the global landscape of in-frame gene fusions in human GBM is reported.


Nomination of Candidate GBM Genes


Without being bound by theory, integration of somatic point mutations and focal CNVs will uncover candidate driver GBM genes. MutComFocal is an algorithm designed to rank genes by an integrated recurrence, focality and mutation score (see Methods). This strategy was applied to 139 GBM and matched normal DNA analyzed by whole-exome sequencing to identify somatic mutations and 469 GBM analyzed by the Affymetrix SNP6.0 platform to identify CNVs.


The whole-exome analysis revealed a mean of 43 nonsynonymous somatic mutations per tumor sample. The distribution of substitutions shows a higher rate of transitions versus transversions (67%), with a strong preference for C→T and G→A (55%) (FIG. 6). As seen in other tumor types10, 19.2% of the mutations occurred in a CpG dinucleotide context (FIG. 7). Among somatic small nucleotide variants, the most frequently mutated genes have roles in cancer, including GBM (TP53, EGFR, PTEN, and IDH1). In addition to known cancer genes, new candidate driver genes were mutated in ~5% of tumor samples. By integrating mutational and common focal genomic lesions, MutComFocal stratified somatically mutated genes into three groups: recurrently mutated genes without significant copy number alterations (Mut), in regions of focal and recurrent amplifications (Amp-Mut) and in regions of focal and recurrent deletions (Del-Mut).
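As a non-limiting illustration, the split of single-base substitutions into transitions and transversions can be sketched as follows. The input format (a list of reference/alternate base pairs), the function names and the example calls are hypothetical and are not taken from the pipeline used in this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def substitution_summary(substitutions):
    """Return (fraction of transitions, fraction that are C>T or G>A)."""
    total = len(substitutions)
    if total == 0:
        raise ValueError("no substitutions supplied")
    n_transitions = sum(is_transition(r, a) for r, a in substitutions)
    n_ct_ga = sum((r, a) in {("C", "T"), ("G", "A")} for r, a in substitutions)
    return n_transitions / total, n_ct_ga / total


if __name__ == "__main__":
    # Hypothetical calls, not data from the study.
    calls = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A"), ("G", "T"), ("T", "C")]
    print(substitution_summary(calls))
```

C→T and G→A are counted together because they describe the same change read from opposite DNA strands.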


A list of 67 genes was generated that score at the top of each of the three categories and included nearly all the genes previously implicated in GBM. Among these genes (labeled in light grey in FIG. 1) are IDH1 (Mut, FIG. 1a), PIK3C2B, MDM4, MYCN, PIK3CA, PDGFRA, KIT, EGFR, and BRAF (Amp-Mut, FIG. 1b) and PIK3R1, PTEN, RB1, TP53, NF1 and ATRX (Del-Mut, FIG. 1c). The analysis also selected 52 new candidate driver genes previously unreported in GBM. Based upon their role in CNS development and homeostasis as well as their potential function in gliomagenesis, 24 genes were selected for re-sequencing in an independent dataset of 83 GBM and matched normal controls. Eighteen genes were found somatically mutated by Sanger sequencing in the independent panel (labeled in dark grey in FIG. 1). Each validated new GBM gene is targeted by somatic mutations and CNVs in a cumulative fraction ranging between 2.9% and 45.7% of GBM. Furthermore, mutations of the 18 new GBM genes occur mostly in tumors with global mutation rates similar to the mean of 43 mutations per tumor and well within the 95% confidence interval, indicating that mutations of the 18 new genes do not cluster in hypermutated tumors (FIG. 2C and FIG. 9).


Among the commonly mutated and focally deleted genes exhibiting top MutComFocal scores and validated in the independent GBM dataset are BCOR, LRP family members, HERC2, LZTR-1 and CTNND2. BCOR, an X-linked gene, encodes a component of the nuclear co-repressor complex that is essential for normal development of neuroectoderm and stem cell functions11-13. BCOR mutations have recently been described in retinoblastoma and medulloblastoma14,15. LRP1B, a member of the LDL receptor family, is among the most frequently mutated genes in human cancer (FIG. 1c)16. Interestingly, two other LDL receptor family members (LRP2 and LRP1) are mutated in 4.4% and 2.9% of tumors, respectively (FIG. 1a). The LRP proteins are highly expressed in the neuroepithelium and are essential for forebrain morphogenesis in mouse and humans17,18. The tumor suppressor function of LRP proteins in GBM may relate to the ability to promote chemosensitivity and to control signaling in the Sonic hedgehog pathway, which is implicated in cancer initiating cells in GBM19-21. Localized on chromosome 15q13, the Hect ubiquitin ligase Herc2 gene is deleted and mutated in 15.1% and 2.2% of GBM cases, respectively. Herc2 has been implicated in severe neurodevelopmental syndromes and Herc2 substrates regulate genome stability and DNA damage-repair22,23.


LZTR-1 Mutations Inactivate a Cullin-3 Adaptor to Drive Self-Renewal and Growth of Glioma Spheres


A gene that received one of the highest Del-Mut scores by MutComFocal is LZTR-1 (FIG. 1c). The LZTR-1 coding region had non-synonymous mutations in 4.4%, and the LZTR-1 locus (human chromosome 22q11) was deleted in 22.4% of GBM. Among the 18 new GBM genes, LZTR-1 had the highest co-occurrence score of mutations and deletions (Fisher's exact test, p=0.0007). It also scored at the top of the list of genes whose CNVs are statistically correlated with expression (Pearson correlation between LZTR-1 CNVs and expression is 0.36, p-value < 10⁻⁶ by Student's t-distribution). Finally, LZTR-1 emerged as the gene with the highest correlation for monoallelic expression of mutant alleles in tumors harboring LZTR-1 deletions (p-value=0.0007). Taken together, these findings indicate that LZTR-1 is concurrently targeted in GBM by mutations and copy number loss, fulfilling the two-hits model for tumor suppressor inactivation in cancer.
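For illustration, a co-occurrence test of the kind mentioned above can be computed from a 2×2 table of tumors (mutated or not, deleted or not) with a two-sided Fisher's exact test. The sketch below is a plain-Python implementation based on the hypergeometric distribution. The function name and the counts are hypothetical, and the example is not intended to reproduce the reported p-value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-7)
    )


if __name__ == "__main__":
    # Hypothetical counts: rows = mutated / not mutated,
    # columns = deleted / not deleted.
    print(fisher_exact_two_sided(5, 1, 26, 107))
```

A small p-value from such a table indicates that mutations and deletions of the gene occur together more often than expected by chance, which is the pattern expected under the two-hits model.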


LZTR-1 codes for a protein with a characteristic Kelch-BTB-BACK-BTB-BACK domain architecture (FIGS. 2C, 8, 9) and is expressed in normal brain. The LZTR-1 gene is highly conserved in metazoans. Although it was initially proposed that LZTR-1 functions as a transcriptional regulator, this role was not confirmed in follow-up studies24. Most proteins with BTB-BACK domains are substrate adaptors in Cullin-3 (Cul3) ubiquitin ligase complexes, in which the BTB-BACK region binds to the N-terminal domain of Cul3, while a ligand binding domain, often a Kelch 6-bladed β-propeller motif, binds to substrates targeted for ubiquitylation25. To ask whether LZTR-1 directly binds Cul3, co-immunoprecipitation experiments were performed in human glioma cells. FIG. 15 shows that Cul3 immunoprecipitates contain LZTR-1, indicating that LZTR-1 is an adaptor in Cul3 ubiquitin ligase complexes.


To address the function of LZTR-1 mutants, a homology model of LZTR-1 was built based partly on the crystal structures of the MATH-BTB-BACK protein SPOP26, the BTB-BACK-Kelch proteins KLHL327 and KLHL1128, and the Kelch domain of Keap129 (FIG. 2b). Without being bound by theory, the second BTB-BACK region of LZTR-1 binds Cul3 because of a φ-X-E motif in this BTB domain, followed by a 3-Box/BACK region (FIG. 9)26. However, the preceding BTB-BACK region can also participate in Cul3 binding. Five of seven LZTR-1 mutations identified in GBM are located within the Kelch domain and target highly conserved amino acids (FIG. 2b, FIG. 2C, FIG. 8). Interestingly, the concentration of LZTR-1 mutations in the Kelch domain reflects a similar pattern of mutations in the Kelch-coding region of KLHL3, recently identified in families with hypertension and electrolyte abnormalities30,31.


The R198G and G248R mutations localize to the b-c loop of the Kelch domain, in a region predicted to provide the substrate-binding surface29. The W105R mutation targets a highly conserved anchor residue in the Kelch repeats and the T288I mutation disrupts a buried residue conserved in LZTR-1 (FIG. 2b, FIG. 2C, FIG. 8). Both mutations are expected to perturb folding of the Kelch domain. The E353STOP mutation is expected to produce a misfolded Kelch domain besides removing the C-terminal BTB-BACK regions. Located in the BTB-BACK domains, the remaining two mutations either truncate the entire BTB-BACK-BTB-BACK region (W437STOP) or are predicted to disrupt the folding of the last helical hairpin in the BTB-BACK domain (R810W, FIG. 2b).


To ask whether the mutations predicted to affect the BTB-BACK domains perturb the interaction with Cul3, in vitro translated wild type, E353STOP, W437STOP and R810W LZTR-1 Myc-tagged proteins were prepared and their ability to bind to Flag-Cul3 purified from mammalian cells was tested. Wild type LZTR-1 bound Flag-Cul3, but the E353STOP and W437STOP mutants lost this property. However, the R810W mutant retained Cul3 binding in this assay (FIG. 16A). Besides promoting ubiquitin-mediated degradation of substrates, Cullin adaptors are short-lived proteins that undergo auto-ubiquitylation and destruction by the same Cullin complexes that direct substrate ubiquitylation32-34. Thus, impaired ubiquitin ligase activity of the LZTR-1-Cul3 complex should result in accumulation of mutant LZTR-1 proteins. Each of the three LZTR-1 mutants predicted to compromise integrity of the BTB-BACK domains accumulated at higher levels than wild-type LZTR-1 in transient transfection assays (FIG. 16B). The steady state and half-life of the LZTR-1 R810W mutant protein were markedly increased, in the absence of changes of the mutant mRNA (FIG. 16C-D). Thus, as for the two truncated mutants, the R810W mutation compromised protein degradation.


Next, the biological consequences of LZTR-1 inactivation in human GBM were examined. Comparison of the gene expression patterns of GBM harboring mutations and deletions of LZTR-1 with those of GBM carrying normal LZTR-1 revealed that tumors with genetic inactivation of LZTR-1 were enriched for genes associated with glioma sphere growth and proliferation35 (FIG. 17A). Introduction of LZTR-1 in three independent GBM-derived sphere cultures resulted in strong inhibition of glioma sphere formation and expression of glioma stem cell markers (FIG. 17B-E). LZTR-1 also decreased the size of tumor spheres, induced a flat and adherent phenotype and reduced proteins associated with cell cycle progression (cyclin A, PLK1, p107, FIG. 17D-E). Interestingly, both R810W and W437STOP LZTR-1 mutations abolished the ability of LZTR-1 to impair glioma sphere formation (FIG. 17F). The above experiments indicate that LZTR-1 inactivation in human GBM drives self-renewal and growth of glioma spheres.


Inactivation of CTNND2 Induces Mesenchymal Transformation in Glioblastoma


Among the top ranking genes in MutComFocal, CTNND2 is expressed at the highest levels in normal brain. CTNND2 codes for δ-catenin, a member of the p120 subfamily of catenins expressed almost exclusively in the nervous system where it is crucial for neurite elongation, dendritic morphogenesis and synaptic plasticity36-38. Germ-line hemizygous loss of CTNND2 impairs cognitive functions and underlies some forms of mental retardation39,40. CTNND2 shows pronounced clustering of mutations in GBM. The observed spectrum of mutations includes four mutations in the armadillo-coding domain and one in the region coding for the N-terminal coiled-coil domain (FIG. 10A), the two most relevant functional domains of δ-catenin. Each mutation targets highly conserved residues with probably (K629Q, A776T, S881L, D999E) and possibly (A71T) damaging consequences41. GBM harbors focal genomic losses of CTNND2, and deletions correlate with loss of CTNND2 expression (FIG. 10B).


Immunostaining experiments showed that δ-catenin is strongly expressed in normal brain, particularly in neurons, as demonstrated by co-staining with the neuronal markers β3-tubulin and MAP2 but not the astrocytic marker GFAP (FIG. 18A-B). Conversely, immunostaining of 69 GBM and western blot of 9 glioma sphere cultures revealed negligible or absent expression of δ-catenin in 21 tumors and in most glioma sphere cultures (FIG. 22). Oncogenic transformation in the CNS frequently disrupts the default proneural cell fate and induces an aberrant mesenchymal phenotype associated with aggressive clinical outcome42. Gene expression analysis of 498 GBM from ATLAS-TCGA showed that low CTNND2 expression is strongly enriched in tumors exhibiting the mesenchymal gene expression signature (t-test p-value=2.4×10⁻¹², FIG. 10D). Tumors with reduced CTNND2 were characterized by poor clinical outcome and, among them, tumors with CTNND2 copy number loss displayed the worst prognosis (FIG. 3C-D). Patients with low CTNND2 expression showed the worst clinical outcome in mesenchymal GBM, though non-mesenchymal tumors also demonstrated poor prognosis, albeit with reduced strength (FIG. 3D).


Mesenchymal transformation of GBM is associated with irreversible loss of proneural cell fate and neuronal markers42 and is detected in most established glioma cell lines. Expression of δ-catenin in the U87 human glioma cell line reduced cell proliferation (FIG. 3E), elevated expression of neuronal proteins βIII-tubulin, PSD95 (a post-synaptic marker) and N-cadherin (FIG. 3G, FIG. 23A) and decreased mRNA and protein levels of mesenchymal markers (FIG. 3F, FIG. 18C, FIG. 23A). These effects were associated with morphologic changes characterized by neurite extension and development of branched dendritic processes (FIG. 3F, FIG. 23B-23C). Conversely, expression of the A776T, K629Q and D999E mutants of CTNND2 failed to induce neuronal features and down-regulate the mesenchymal marker fibronectin (FBN, FIG. 18C, FIG. 23B-23C). Consistent with δ-catenin inhibition of cell proliferation in glioma cells, only wild type δ-catenin decreased cyclin A, an S-phase cyclin (FIG. 18C).


Next, the effect of expressing δ-catenin in GBM-derived sphere culture #48, which lacks the endogenous δ-catenin protein (FIG. 22B) and expresses high levels of mesenchymal markers, was analyzed43. Introduction of δ-catenin in sphere culture #48 strongly reduced the mesenchymal proteins smooth muscle actin (SMA), collagen-5A1 (Col5A1) and FBN, as measured by quantitative immunofluorescence (FIGS. 19A-B). It also induced βIII-tubulin more than eight-fold (FIGS. 19C-D). Time course analysis showed the highest degree of βIII-tubulin-positive neurite extension at 4-6 days post-transduction followed by progressive depletion of neuronal-like cells from culture (FIG. 19D). Finally, whether δ-catenin impacts self-renewal and growth of glioma spheres in vitro and their ability to grow as tumor masses in vivo was examined. In a limiting dilution assay, δ-catenin inhibited glioma sphere formation more than 8-fold (FIG. 19E). To determine the effect of δ-catenin on brain tumorigenesis in vivo, #48 glioma sphere cultures were generated expressing luciferase and bioluminescence imaging was conducted at different times after stereotactic transduction of control and δ-catenin-expressing cells in the mouse brain. When compared to controls, a 5-fold inhibition of tumor growth by δ-catenin was observed at each time point analyzed (FIG. 19F, FIG. 23D). These results identify CTNND2 inactivation as a key genetic alteration driving the aggressive mesenchymal phenotype of GBM.


Recurrent EGFR Fusions in GBM


To identify gene fusions in GBM, RNA-seq data were analyzed from a total of 185 GBM samples (161 primary GBM plus 24 short-term glioma sphere cultures freshly isolated from patients carrying primary GBM). The analysis of RNA-seq led to the discovery of 92 candidate rearrangements giving rise to in-frame fusion transcripts (Table 7). Besides previously reported FGFR3-TACC3 fusion events, the most frequent recurrent in-frame fusions involved EGFR in 7.6% of samples (14/185, 3.8%-11.3% CI). Nine of 14 EGFR fusions included recurrent partners SEPT14 (6/185, 3.2%) and PSPH (3/185, 1.6%) as the 3′ gene segment in the fusion. All EGFR-SEPT14 and two of three EGFR-PSPH gene fusions occurred within amplified regions of the fusion genes (FIG. 24).
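The frequency estimate above can be illustrated with a short calculation. The text does not state how the confidence interval was derived, so the sketch below assumes a 95% normal-approximation (Wald) interval, which gives bounds close to the reported 3.8%-11.3%. The function name is hypothetical.

```python
from math import sqrt


def wald_interval(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a 95% interval; bounds are clipped to [0, 1].
    """
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # 14 EGFR fusions among 185 samples, as reported in the text.
    estimate, lower, upper = wald_interval(14, 185)
    print(f"{estimate:.1%} ({lower:.1%}-{upper:.1%})")
```

For small counts such as the 3 of 185 EGFR-PSPH fusions, the Wald approximation is known to be less reliable, and an exact or Wilson interval may be preferable.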


The quantitative analysis of expressed reads spanning the fusion breakpoint versus reads spanning EGFR exons not implicated in the fusion transcripts revealed that EGFR fusion genes were expressed at higher levels in five of nine tumors (Table 11). Two highly expressed in-frame fusions were also identified that involve the neurotrophic tyrosine kinase receptor 1 gene (NTRK1) as the 3′ gene with two different 5′ partners (NFASC-NTRK1 and BCAN-NTRK1). Fusions involving NTRK1 are common in papillary thyroid carcinomas44. Using EXomeFuse, an algorithm that reconstructs genomic fusions from whole-exome data, it was found that EGFR-SEPT14 and NTRK1 fusions result from recurrent chromosomal translocations, and the corresponding genomic breakpoints were reconstructed (Table 12).


The sequence of the PCR products spanning the fusion breakpoint validated all three types of recurrent in-frame fusion predictions (EGFR-SEPT14, EGFR-PSPH and NTRK1 fusions, FIGS. 4, 11, 12). In FIGS. 4A-B, the prediction and cDNA sequence validation are shown, respectively, for one tumor harboring an EGFR-SEPT14 fusion (TCGA-27-1837). The amplified cDNA contained an open reading frame for a 1,041 amino-acid protein resulting from the fusion of EGFR residues 1-982 with SEPT14 residues 373-432 (FIG. 4C). Thus, the structure of EGFR-Septin14 fusions involves EGFR at the N-terminus, providing a receptor tyrosine kinase domain fused to a coiled-coil domain from Septin14. Exon-specific RNA-seq expression in TCGA-27-1837 demonstrated that EGFR and SEPT14 exons implicated in the fusion are highly expressed compared with mRNA sequences not included in the fusion event (FIG. 13).


Using PCR, the genomic breakpoint was mapped to chromosome 7 (#55,268,937 for EGFR and #55,870,909 for SEPT14, genome build GRCh37/hg19) within EGFR exon 25 and SEPT14 intron 9, creating a transcript in which the 5′ EGFR exon 24 is spliced to the 3′ SEPT14 exon 10 (FIG. 4D). Interestingly, the fused EGFR-PSPH cDNA and predicted fusion protein in sample TCGA-06-5408 involve the same EGFR N-terminal region implicated in the EGFR-SEPT14 fusion, with PSPH providing a carboxy-terminal portion of 35 amino acids (FIG. 11). An example of a fusion in which the EGFR-TK region is the 3′ partner is the CAND1-EGFR fusion in the glioma sphere culture #16 (FIG. 14). Each fusion transcript includes the region of the EGFR mRNA coding for the TK domain (Table 7). RT-PCR and genomic PCR followed by Sanger sequencing of GBM TCGA-06-5411 validated the NFASC-NTRK1 fusion, in which the predicted fusion protein includes the TK domain of the high-affinity NGF receptor (TrkA) fused downstream to the immunoglobulin-like region of the cell adhesion and ankyrin-binding region of neurofascin (FIG. 12).


To confirm that GBM harbors recurrent EGFR fusions and to determine their frequency in an independent dataset, cDNA from a panel of 248 GBMs was screened, and 10 additional cases with EGFR-SEPT14 fusions (4%) were discovered. Conversely, NFASC-NTRK1 fusions were not detected in this dataset. The frequency of EGFR-PSPH fusions was 2.2% (3/135).


The discovery of recurrent EGFR fusions in GBM is of particular interest. EGFR is activated in a significant fraction of primary GBM (˜25%) by an in-frame deletion of exons 2-7 (EGFRvIII)45. However, seven of nine tumors harboring EGFR-SEPT14 and EGFR-PSPH gene fusions lacked the EGFRvIII rearrangement (Table 13). It was therefore asked whether the most frequent EGFR fusion in GBM (EGFR-SEPT14) provides an alternative mechanism of EGFR activation and confers sensitivity to EGFR inhibition. First, it was investigated whether EGFR gene fusions cluster into any gene expression subtype of GBM (proneural, neural, classical, mesenchymal). Although no individual subtype displayed a statistically significant enrichment of EGFR fusions, 8 of 9 GBMs harboring EGFR-SEPT14 or EGFR-PSPH belonged to the classical or mesenchymal subtype (Fisher's P value=0.05 for classical/mesenchymal enrichment, Table 14).


Next, the effects of ectopic EGFR-SEPT14, EGFRvIII or EGFR wild type on glioma cells were investigated. Lentiviral transduction of #48 human glioma sphere culture (which lacks genomic alteration of EGFR) showed that cells expressing EGFR-SEPT14 or EGFRvIII, but not those expressing wild type EGFR or vector, retained growth and self-renewal in the absence of EGF and bFGF (FIG. 20A). Accordingly, established glioma cell lines expressing EGFR-SEPT14 or EGFRvIII proliferated at a higher rate than control cells or cells expressing wild type EGFR (FIG. 5A, FIG. 25). Furthermore, EGFR-SEPT14 and EGFRvIII markedly enhanced migration of glioma cells in a wound assay (FIG. 5B-C). The above findings indicate that EGFR-SEPT14 might constitutively activate signaling events downstream of EGFR. When analyzed in the presence and absence of mitogens, the expression of EGFR-SEPT14 (or EGFRvIII) in glioma sphere culture #48 triggered constitutive activation of phospho-STAT3 but had no effect on phospho-ERK and phospho-AKT (FIG. 20B-C). This is consistent with enrichment of STAT3-target genes in primary human GBM harboring EGFR-SEPT14 fusions compared with tumors carrying wild type EGFR (FIG. 20D). Differential gene expression analysis identified a set of 9 genes up-regulated in EGFR-SEPT14 tumors compared with EGFRvIII-positive GBM (FIG. 26). These genes broadly relate to inflammatory/immune response, and some code for chemokines (CXCL9, 10, 11) that have been associated with aggressive glioma phenotypes46.


Finally, it was investigated whether EGFR-SEPT14 fusions confer sensitivity to inhibition of EGFR-TK. Treatment of #48 cells expressing EGFR-SEPT14, EGFRvIII, wild type EGFR or vector control with lapatinib, an irreversible EGFR inhibitor recently proposed to target EGFR alterations in GBM47, revealed that EGFR-SEPT14 and EGFRvIII, but not wild-type EGFR, sensitized glioma cells to pharmacological EGFR inhibition (FIG. 20E). Similar effects were obtained following treatment of #48-derivatives with erlotinib, another inhibitor of EGFR-TK (FIG. 5D).


To ask whether sensitivity to EGFR-TK inhibition is retained in human glioma cells naturally harboring EGFR-SEPT14 in vivo, an EGFR-SEPT14-positive GBM xenograft (D08-0537 MG) established from a heavily pretreated patient was used. Treatment of D08-0537 MG tumors with lapatinib or erlotinib showed that both drugs significantly delayed tumor growth, with lapatinib displaying the strongest anti-tumor effects. Conversely, EGFR inhibitors were ineffective against GBM xenograft D08-0714 MG, which lacks EGFR genomic alterations (FIG. 5E). Taken together, these data indicate that EGFR-SEPT14 fusions confer mitogen-independent growth, constitutively activate STAT3 signaling and impart sensitivity to EGFR kinase inhibition to glioma cells harboring the fusion gene.


Discussion


A computational pipeline was described that combines the frequency, magnitude and focality of CNVs at any locus in the human genome with the somatic mutation rate for genes residing at that genomic location, thus integrating into a single score two genetic hallmarks of driver cancer genes (focality of CNVs and point mutations). Besides recognizing nearly all genes known to have functional relevance in GBM, this study discovered and validated somatic mutations in 18 new genes, which also harbor focal and recurrent CNVs in a significant fraction of GBM. The importance of some of these genes extends beyond GBM, as underscored by cross-tumor relevance (e.g. BCOR), and protein family recurrence (e.g. LRP family members).


Also, the LZTR1 mutations targeting highly conserved residues in the Kelch domain (W105, G248, T288) and in the second BTB-BACK domain (R810) are recurrent events in other tumor types48. Thus, understanding the nature of substrates of LZTR1-Cul3 ubiquitin ligase activity will provide important insights into the pathogenesis of multiple cancer types. The importance of LZTR1 genetic alterations in GBM is underscored by three observations: the concurrent targeting of LZTR1 by mutations and deletions, which supports a two-hit mechanism of tumor suppressor gene inactivation; the impact of mutations targeting the BTB-BACK domains on Cul3 binding and/or protein stability; and the ability of these mutations to release glioma cells from the restraining activity of the wild-type protein on self-renewal.


The finding that loss-of-function alterations of CTNND2 cluster in mesenchymal GBM provides a clue to the genetic events driving this aggressive GBM subtype. The crucial function of δ-catenin in neuronal morphogenesis indicates that full-blown mesenchymal transformation in the brain requires loss of master regulators constraining cell determination along the neuronal lineage. Introduction of δ-catenin in human glioma spheres collapsed the mesenchymal phenotype and inhibited sphere formation and tumor growth. Thus, the ability of δ-catenin to reprogram glioma cells expressing mesenchymal genes towards a neuronal fate unravels an unexpected plasticity of mesenchymal GBM that might be exploited therapeutically.


In this study, the landscape of gene fusions from a large dataset of GBM analyzed by RNA-Sequencing is also reported. In-frame gene fusions retaining the RTK-coding domain of EGFR emerged as the most frequent gene fusion in GBM. In this tumor, EGFR is frequently targeted by focal amplifications, and this finding underscores the strong recombinogenic probability of focally amplified genes, as recently reported for the myc locus in medulloblastoma49. Resembling intragenic rearrangements that generate the EGFRvIII allele, EGFR-SEPT14 fusions impart to glioma cells the ability to self-renew and grow in the absence of mitogens, constitutively activate STAT3 signaling, and confer sensitivity to EGFR inhibition. These findings highlight the relevance of fusions implicating RTK-coding genes in the pathogenesis of GBM9. They also provide a strong rationale for the inclusion of GBM patients harboring EGFR fusions in clinical trials based on EGFR inhibitors.


Methods


SAVI (Statistical Algorithm for Variant Frequency Identification).


The frequencies of variant alleles were estimated in 139 paired tumor and normal whole-exome samples from TCGA using the SAVI pipeline50. The algorithm estimates the frequency of variant alleles by constructing an empirical Bayesian prior for those frequencies, using data from the whole sample, and obtains a posterior distribution and high-credibility intervals for each allele50. The prior and posterior are distributed over a discrete set of frequencies with a precision of 1% and are connected by a modified binomial likelihood, which allows for some error rate. More precisely, a prior distribution p(f) of the frequency f and a prior for the error e, uniform on the interval [0,E] for a fixed 0≤E≤1, are assumed. The sequencing data at a particular allele are treated as a random experiment producing a string of m bits (the total depth at the allele) with n ‘1’s (the variant depth at the allele). Assuming a binomial likelihood of the data and allowing for bits being misread because of random errors, the posterior probability P(f) of the frequency f is







$$P(f) = \frac{p(f)}{C} \cdot \frac{1}{b-1} \int_{f}^{f+E-2Ef} x^{n} (1-x)^{m-n} \, dx$$








where C is a normalization constant. For a particular allele, the value of E is determined by the quality of the nucleotides sequenced at that position as specified by their Phred scores. The SAVI pipeline takes as input the reads produced by the sequencing technology, filters out low-quality reads and maps the rest onto a human reference genome. After mapping, a Bayesian prior for the distribution of allele frequencies for each sample is constructed by an iterative posterior update procedure starting with a uniform prior. To genotype the sample, the posterior high-credibility intervals for the frequency of the alleles at each genomic location were used. Alternatively, by combining the Bayesian priors from different samples, posterior high-credibility intervals were obtained for the difference between the samples in the frequency of each allele. Finally, the statistically significant differences between the tumor and normal samples are reported as somatic variants. To estimate the positive predictive value of SAVI in the TCGA GBM samples, 41 mutations were selected for independent validation by Sanger sequencing. Of these, 39 were confirmed, resulting in a 0.95 (95% CI 0.83-0.99) validation rate.
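The posterior update described above can be illustrated with a minimal Python sketch. It is not the SAVI implementation. It assumes that a read shows the variant with probability x = f + e − 2ef when the true frequency is f and the error rate is e, and it averages the binomial likelihood numerically over e in [0, E] instead of using a closed-form prefactor. The function name, the midpoint integration rule, and the example depths are illustrative assumptions.

```python
def savi_posterior(prior, n, m, E, steps=200):
    """Posterior over a discrete grid of allele frequencies (illustrative only).

    prior : dict mapping a frequency f in [0, 1] to its prior probability p(f)
    n, m  : variant depth and total depth at the allele
    E     : upper bound of the uniform prior on the error rate e
    """
    def likelihood(f):
        if E == 0:
            return f ** n * (1 - f) ** (m - n)
        # Midpoint rule over e in [0, E]; x is the chance that a read shows '1'.
        total = 0.0
        for k in range(steps):
            e = E * (k + 0.5) / steps
            x = f + e - 2 * e * f
            total += x ** n * (1 - x) ** (m - n)
        return total / steps

    # The binomial coefficient is the same for every f and cancels in C.
    weights = {f: p * likelihood(f) for f, p in prior.items()}
    C = sum(weights.values())  # normalization constant
    return {f: w / C for f, w in weights.items()}


# Uniform prior on a 1% grid, the starting point of the iterative prior update.
grid = [i / 100 for i in range(101)]
uniform = {f: 1 / len(grid) for f in grid}
post = savi_posterior(uniform, n=12, m=40, E=0.01)  # hypothetical depths
mode = max(post, key=post.get)
```

Feeding the resulting posterior back in as the next prior mimics one step of the iterative update; high-credibility intervals can then be read off the cumulative sum of the posterior.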


Candidate genes were ranked by the number of somatic nonsynonymous mutations. A robust fit of the ratio of nonsynonymous to synonymous mutations was generated with a bisquare weighting function. The excess of nonsynonymous alterations was estimated using a Poisson distribution with a mean equal to the product of the ratio from the robust fit and the number of synonymous mutations. Genes in highly polymorphic genomic regions were filtered out based on an independent cohort of normal samples. The list of these regions includes families of genes known to generate false positives in somatic predictions (for example, the HLA, KRT and OR gene families).
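The Poisson test for an excess of nonsynonymous mutations can be sketched as follows. The sketch assumes that the robust fit has already produced a nonsynonymous-to-synonymous ratio and that the statistic of interest is the upper tail P(X ≥ observed); the function names and the numbers in the example are hypothetical.

```python
import math


def poisson_upper_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = 0.0
    for i in range(k):    # accumulate P(X = 0) .. P(X = k - 1)
        cdf += term
        term *= mu / (i + 1)
    return max(0.0, 1.0 - cdf)


def nonsynonymous_excess_p(n_nonsyn, n_syn, fitted_ratio):
    """P value for seeing at least n_nonsyn nonsynonymous mutations when the
    expected count is fitted_ratio * n_syn."""
    mu = fitted_ratio * n_syn
    return poisson_upper_tail(n_nonsyn, mu)


# Hypothetical gene: 10 nonsynonymous and 2 synonymous mutations, ratio 2.5.
p_excess = nonsynonymous_excess_p(10, 2, 2.5)
```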


MutComFocal.


Key cancer genes are often amplified or deleted in chromosomal regions containing many other genes. Point mutations and gene fusions, conversely, provide more specific information about which genes may be implicated in the oncogenic process. MutComFocal, a Bayesian approach that assigns a driver score to each gene by integrating point mutations and CNV data from 469 GBMs (Affymetrix SNP6.0), was developed. In general, MutComFocal uses three different strategies. First, the focality component of the score is inversely proportional to the size of the genomic lesion to which a gene belongs and thus prioritizes more focal genomic lesions. Second, the recurrence component of the MutComFocal score is inversely proportional to the total number of genes altered in a sample, which prioritizes samples with a smaller number of altered genes. Third, the mutation component of the score is inversely proportional to the total number of genes mutated in a sample, which achieves the twofold goal of prioritizing mutated genes on one hand and prioritizing samples with a smaller number of mutations on the other.


More specifically, for a particular sample, let (c1, N1), . . . , (ck, Nk) describe the amplification lesions in that sample, so that Ni is the number of genes in the ith lesion and ci is its copy number change from normal. For a gene belonging to the ith lesion, the amplification recurrence sample score is defined as ci/(Σj cjNj), and its amplification focality sample score is defined as (ci/Σj cj)×(1/Ni). To obtain the amplification recurrence and focality scores for a particular gene, the corresponding sample scores were summed over all the samples and the result was normalized so that each score sums to 1. The deletion recurrence and focality scores are defined in a similar manner. The mutation score is analogous to a recurrence score in which it is assumed that mutated genes belong to lesions with only one gene.
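The sample scores can be made concrete with a short Python sketch. It assumes the recurrence sample score of a gene in lesion i is c_i / Σ_j(c_j·N_j) and the focality sample score is (c_i / Σ_j c_j) × (1/N_i), which is one reading of the definitions in this paragraph. The function name, gene labels, and copy-number values are hypothetical.

```python
from collections import defaultdict


def amplification_scores(samples):
    """samples: list of samples; each sample is a list of lesions, and each
    lesion is (copy_number_change, [names of genes in the lesion]).

    Returns (recurrence, focality) dicts, each normalized to sum to 1.
    """
    recurrence = defaultdict(float)
    focality = defaultdict(float)
    for lesions in samples:
        total_c = sum(c for c, _ in lesions)
        total_cn = sum(c * len(genes) for c, genes in lesions)
        for c, genes in lesions:
            for gene in genes:
                recurrence[gene] += c / total_cn            # c_i / sum_j c_j N_j
                focality[gene] += (c / total_c) / len(genes)  # (c_i / sum_j c_j) / N_i

    def normalize(scores):
        total = sum(scores.values())
        return {g: v / total for g, v in scores.items()}

    return normalize(recurrence), normalize(focality)


# Two hypothetical samples; G1 sits in a focal, high-amplitude lesion.
samples = [
    [(2.0, ["G1"]), (1.0, ["G2", "G3", "G4"])],
    [(1.0, ["G1", "G5"])],
]
rec, foc = amplification_scores(samples)
```

In this toy input G1 receives the largest share of both scores because it is amplified in both samples and, in the first, is the only gene in its lesion.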


The amplification/mutation score is defined as the product of the two amplification scores and the mutation score, whereas the deletion/mutation score is defined as the product of the two deletion scores and the mutation score. The amplification/mutation and deletion/mutation scores are normalized to 1, and for each score, genes are divided into tiers iteratively so that the top 2^H remaining genes are included in the next tier, where H is the entropy of the scores of the remaining genes normalized to 1. On the basis of their tier across the different types of scores, genes are assigned to being either deleted/mutated or amplified/mutated, and genes in the top tiers are grouped into contiguous regions. The top genes in each region are considered manually and selected for further functional validation.
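The iterative tiering step can be sketched in Python as below. The sketch assumes that H is the Shannon entropy in bits of the renormalized scores, so that the tier size is two raised to the power H, and that this size is rounded to the nearest integer with a minimum of one gene; neither the rounding rule nor the function name comes from the description.

```python
import math


def entropy_tiers(scores):
    """Split genes into tiers; scores is a dict gene -> nonnegative score."""
    remaining = sorted(scores, key=scores.get, reverse=True)
    tiers = []
    while remaining:
        total = sum(scores[g] for g in remaining)
        if total <= 0:
            tiers.append(remaining)  # nothing left to rank: one final tier
            break
        probs = [scores[g] / total for g in remaining]
        # Shannon entropy (bits) of the renormalized remaining scores.
        H = -sum(p * math.log2(p) for p in probs if p > 0)
        # 2**H is the "effective number" of genes sharing the remaining score.
        size = min(len(remaining), max(1, round(2 ** H)))
        tiers.append(remaining[:size])
        remaining = remaining[size:]
    return tiers


# One dominant gene followed by three equal ones (hypothetical scores).
tiers = entropy_tiers({"a": 0.97, "b": 0.01, "c": 0.01, "d": 0.01})
```

A sharply peaked score distribution has low entropy and yields a small first tier, whereas a flat distribution places all remaining genes in one tier.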


The recurrence and focality scores can be interpreted as the posterior probabilities that a gene is driving the selection of the disease under two different priors, one global and one local in nature. The recurrence score is higher if a gene participates in many samples that do not have too many altered genes, whereas the focality score is higher if the gene participates in many focal lesions. Besides lending strong support to the inference of a gene as a potential driver, the directionality of the copy number alteration (amplification or deletion) informs the probable behavior of the candidate gene as an oncogene or tumor suppressor, respectively.


The genes displayed in FIG. 1 were selected on the basis of the MutComFocal ranking (top 250 genes), the size of the minimal region (less than 10 genes) and the frequency of mutations (more than 2% for deletion/mutations and at least 1% for amplification/mutations).


RNA-Seq Bioinformatics Analysis.


A total of 161 RNA-seq GBM tumor samples from TCGA, a public repository containing large-scale genome sequencing of different cancers, were analyzed, plus 24 patient-derived GSCs. Nine GSC samples reported in previous studies were kept in the analysis to evaluate recurrence9. The samples were analyzed using the ChimeraScan51 algorithm to detect a list of gene fusion candidates. Briefly, ChimeraScan detects those reads that discordantly align to different transcripts of the same reference (split inserts). These reads provide an initial set of putative fusion candidates. The algorithm then realigns the initially unmapped reads to the putative fusion candidates and detects those reads that align across the junction boundary (split reads). These reads provide the genomic coordinates of the breakpoint.


RNA-seq analysis detected a total of 39,329 putative gene fusion events. To focus the experimental analysis on biologically relevant fused transcripts, the Pegasus annotation pipeline (http://sourceforge.net/projects/pegasus-fus/) was applied. For each putative fusion, Pegasus reconstructs the entire fusion sequence on the basis of the genomic fusion breakpoint coordinates and gene annotations. Pegasus also annotates the reading frame of the resulting fusion sequences as either in-frame or frameshift. Moreover, Pegasus detects the protein domains that are either conserved or lost in the new chimeric event by predicting the amino acid sequence and automatically querying the UniProt web service. On the basis of the Pegasus annotation report, relevant gene fusions were selected for further experimental validation according to the reading frame and the conserved and lost domains. The selected list was based on in-frame events expressed by ten or more reads, with at least one read spanning the breakpoint. To filter out candidate trans-splicing events, only events with putative breakpoints at a distance of at least 25 kb were pursued.
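The selection criteria in this paragraph amount to a simple filter, sketched below in Python. The record layout and names are hypothetical and do not reflect the Pegasus output format. The sketch also assumes that the 25 kb distance rule applies only when both breakpoints lie on the same chromosome, which the description does not state explicitly. The EGFR and SEPT14 coordinates are the hg19 breakpoints given earlier in this description; the read counts are invented for illustration.

```python
from dataclasses import dataclass

MIN_TOTAL_READS = 10      # "ten or more reads"
MIN_SPANNING_READS = 1    # "at least one read spanning the breakpoint"
MIN_DISTANCE_BP = 25_000  # "at least 25 kb"


@dataclass
class FusionCandidate:
    name: str
    in_frame: bool
    total_reads: int
    spanning_reads: int
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int


def passes_filters(c):
    """Return True if the candidate meets all selection criteria."""
    if not c.in_frame:
        return False
    if c.total_reads < MIN_TOTAL_READS or c.spanning_reads < MIN_SPANNING_READS:
        return False
    if c.chrom5 != c.chrom3:
        return True  # assumption: inter-chromosomal events are not trans-splicing
    return abs(c.pos5 - c.pos3) >= MIN_DISTANCE_BP


example = FusionCandidate("EGFR-SEPT14", True, 40, 12,  # read counts are made up
                          "chr7", 55_268_937, "chr7", 55_870_909)
keep = passes_filters(example)
```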


Identification of Genetic Rearrangements Using Whole-Exome Data.


Although whole-exome sequencing data contain low intronic coverage that reduces the sensitivity for fusion discovery, they are readily available through the TCGA database. To characterize the genomic breakpoint of the chromosomal rearrangement, EXome-Fuse, a new gene fusion discovery pipeline designed specifically to analyze whole-exome data, was developed. For the samples harboring EGFR-SEPT14, EGFR-PSPH, NFASC-NTRK1 and BCAN-NTRK1 fusions in RNA, EXome-Fuse was applied to the corresponding whole-exome sequencing data deposited in TCGA. This algorithm can be divided into three stages: split-insert identification, split-read identification and virtual reference alignment. Mapping against the human genome reference hg18 with BWA, all split inserts are first identified to compile a preliminary list of fusion candidates. This list was pruned of any false positives produced from paralogous gene pairs using the Duplicated Genes Database and the EnsemblCompara GeneTrees52. Pseudogenes in the candidate list were annotated using the list from the HUGO Gene Nomenclature Committee (HGNC) database53 and were given lower priority. Candidates between homologous genes were also filtered out, as were those with homologous or low-complexity regions around the breakpoint. For the remaining fusion candidates, supporting split reads and their mates were searched for using BLAST with a word size of 16, an identity cutoff of 90% and an expectation cutoff of 10⁻⁴. A virtual reference was created for each fusion transcript, and all reads were realigned to calculate a final tally of split inserts and split reads such that all aligning read pairs maintain forward-reverse directionality.


Targeted Exon Sequencing.


All protein-coding exons for the 24 genes of interest were sequenced using genomic DNA extracted from frozen tumors and matched blood. Five-hundred nanograms of DNA from each sample were sheared to an average size of 150 bp in a Covaris instrument for 360 s (duty cycle, 10%; intensity, 5; cycles per burst, 200). Bar-coded libraries were prepared using the Kapa High-Throughput Library Preparation Kit Standard (Kapa Biosystems). Libraries were amplified using the KAPA HiFi Library Amplification kit (Kapa Biosystems) (eight cycles). Libraries were quantified using Qubit Fluorimetric Quantitation (Invitrogen), and the quality and size were assessed using an Agilent Bioanalyzer. An equimolar pool of the four bar-coded libraries (300 ng each) was created, and 1,200 ng was input to exon capture using one reaction tube of the custom Nimblegen SeqCap EZ (Roche) with custom probes targeting the coding exons of the 38 genes. Capture by hybridization was performed according to the manufacturer's protocols with the following modifications: 1 nmol of a pool of blocker oligonucleotides (complementary to the bar-coded adapters) was used, and post-capture PCR amplification was done using the KAPA HiFi Library Amplification kit, instead of the Phusion High-Fidelity PCR Master Mix with HF Buffer Kit, in a 60 μl volume, as the Kapa HiFi kit greatly reduced or eliminated the bias against GC-rich regions.


The pooled capture library was quantified by Qubit (Invitrogen) and Bioanalyzer (Agilent) and sequenced on an Illumina MiSeq sequencer using the 2×150 paired-end cycle protocol. Reads were aligned to the hg19 build of the human genome using BWA with duplicate removal using SAMtools as implemented by Illumina MiSeq Reporter. Variant detection was performed using GATK UnifiedGenotyper. Somatic mutations were identified for paired samples using SomaticSniper and filtered for frequency of less than 3% in normal samples and over 3% in tumor samples. Variants were annotated with the Charity annotator to identify protein-coding changes and cross-referenced against known dbSNP, 1000 Genomes and COSMIC variants. Sanger sequencing was used to confirm each mutation from normal and tumor DNA.


Enrichment of Amplified and Deleted Genes for Single-Nucleotide Variants (SNVs).


Although MutComFocal combines SNV and CNV data to identify genes driving oncogenesis, it does not explicitly determine whether amplified or deleted genes are enriched for SNVs within the same sample. Deletions and SNVs of a gene within the same sample might indicate a two-hit model of a tumor suppressor. Alternatively, amplifications and gain-of-function mutations of an oncogene within the same sample might further promote oncogenesis. For each MutComFocal candidate gene, the number of TCGA samples with both amplification and SNVs, amplification alone, SNVs alone or neither was determined. The corresponding Fisher's P value was calculated. A similar analysis was performed for deletions.
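The four sample counts form a 2×2 contingency table, and the Fisher's P value can be computed from it as in the Python sketch below. This is a generic two-sided Fisher's exact test, not code from the study. It assumes the two-sided P value is the sum of the probabilities of all tables that are no more likely than the observed one, and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: gene amplified / not amplified in the sample.
    Columns: gene carries an SNV / no SNV.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x samples in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: both, amplification only, SNV only, neither.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

The same function applies to the deletion analysis by replacing the amplification counts with deletion counts.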


Correlation Between Copy Number and Expression.


One method of assessing the functional relevance of an amplified or deleted gene is to assess the effect of gene dosage. For each gene nominated by MutComFocal, the Pearson's correlation coefficient was calculated between copy number and expression. The corresponding P values were computed using paired Student's t test.


Allele-Specific Expression of SNVs.


For a given gene nominated by MutComFocal, RNA sequencing can determine whether the mutant or wild-type allele is expressed. Toward this end, VCFtools54 was applied to the TCGA BAM RNA-seq files produced by TopHat, which produces the depth of reads calling the reference (R) and variant (V) allele. A measure of relative expression of the variant allele is then V/(V+R). For each mutation, the binomial P value of observing more than V out of V+R reads was calculated, assuming that it is equally probable for a read to call the variant or reference. The binomial P values of each mutation were then pooled using the Stouffer's Z-score method to calculate the combined P value per gene.
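The allele-specific expression test can be sketched in Python as follows. The sketch assumes an inclusive upper tail, P(X ≥ V), for the binomial P value so that the result is never exactly zero, and it uses the unweighted form of Stouffer's method; both choices, the function names, and the read depths are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist


def variant_fraction(v, r):
    """Relative expression of the variant allele, V / (V + R)."""
    return v / (v + r)


def binomial_upper_tail(v, r):
    """P(X >= v) for X ~ Binomial(v + r, 0.5)."""
    n = v + r
    return sum(comb(n, k) for k in range(v, n + 1)) / 2 ** n


def stouffer_combined_p(p_values, eps=1e-12):
    """Combine one-sided P values with the unweighted Stouffer Z-score method."""
    nd = NormalDist()
    # Clamp to avoid infinite Z-scores at P = 0 or P = 1.
    zs = [nd.inv_cdf(1 - min(max(p, eps), 1 - eps)) for p in p_values]
    z_combined = sum(zs) / sqrt(len(zs))
    return 1 - nd.cdf(z_combined)


# Hypothetical gene with two mutations: (variant depth, reference depth).
depths = [(30, 12), (18, 9)]
per_mutation_p = [binomial_upper_tail(v, r) for v, r in depths]
gene_p = stouffer_combined_p(per_mutation_p)
```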


Ruling Out Passenger Mutations in Hypermutated Samples.


To rule out the possibility that MutComFocal candidates tend to be passenger mutations in hypermutated samples, the number of mutations in samples harboring a MutComFocal mutation was compared to the distribution N of the number of mutations in each TCGA sample. Because the number of TCGA samples was well above 30, N was assumed to be well approximated by the normal distribution, and its mean, μ, and s.d., σ, were calculated. For each MutComFocal mutation, the Z-test was performed, and all mutations failed to reach statistical significance after correction by the Benjamini-Hochberg method.
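The Z-test and the Benjamini-Hochberg correction can be sketched in Python as below. The sketch assumes a one-sided test (more mutations than expected) and the sample standard deviation; these choices and the function names are assumptions, and the correction shown is the standard Benjamini-Hochberg step-up adjustment.

```python
from statistics import NormalDist, mean, stdev


def z_test_p(mutation_count, all_counts):
    """One-sided P value that a sample carries more mutations than expected
    under a normal approximation of the per-sample mutation counts."""
    mu, sigma = mean(all_counts), stdev(all_counts)
    z = (mutation_count - mu) / sigma
    return 1 - NormalDist().cdf(z)


def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])  # hypothetical P values
```

A mutation would be flagged as occurring in a hypermutated sample only if its adjusted P value fell below the chosen threshold.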


Determining the Presence of EGFRvIII Transcripts.


To determine the prevalence of EGFRvIII transcripts, an in-house script was created to calculate the number of split inserts and split reads supporting the junction between EGFR exons 1 and 8. The EGFRvIII isoform was considered to be expressed if there were more than five split reads or five split inserts in a sample.


Calculating the Relative Expression of EGFR Fusions Compared to Wild-Type EGFR.


To determine the functional relevance of EGFR-SEPT14 and EGFR-PSPH fusions, the relative expression of the fusion and wild-type transcripts within each sample was determined on the basis of BAM files mapped by TopHat and provided by TCGA. As a proxy for expression of the transcript, the depth of reads covering either a mutant or wild-type junction was calculated. In particular, the depth of reads covering the fusion breakpoint of EGFR-SEPT14 or EGFR-PSPH was used to estimate the expression of the fusion transcript. Because all EGFR fusions stereotypically involved exon 24 joined to either SEPT14 or PSPH, the depth of reads covering the junctions between EGFR exons 25-26, 26-27 and 27-28 was taken as a specific gauge of wild-type EGFR expression.


Enrichment of the Classical and Mesenchymal Subtype Among Samples with EGFR Fusions.


To assess whether samples with EGFR fusions tended to occur in a particular GBM subtype, each TCGA GBM sample was first classified by expression according to the methods of Verhaak et al.6. The number of classical, mesenchymal, proneural and neural samples was then tallied with and without EGFR gene fusions. The combined class of classical and mesenchymal phenotype was enriched for EGFR fusions according to the Fisher's exact test.


Copy Number Variation in EGFR Fusions.


Gene fusions often arise from genomic instability. Motivated by this observation, segmented SNP array data were downloaded from TCGA, and the log2 ratio between the tumor and normal copy numbers was calculated. This ratio was plotted along the chromosomal neighborhood of EGFR, SEPT14 and PSPH (chr7:55,000,000-56,500,000).


GSEA.


To determine the biological impact of LZTR1 mutations, GSEA55 was used, which is an analytical tool that harnesses expression data to nominate gene sets enriched for a particular phenotype. Having identified TCGA samples with LZTR1 SNVs, GSEA was applied to the TCGA expression data. Samples with LZTR1 SNVs were first compared against those with wild-type LZTR1 (excluding LZTR1 deletions). To assess statistical significance, the data set was randomized by permuting gene sets 500 times, and only gene sets with an FDR q<0.05 were considered.


Differential Expression Between Samples with EGFR-SEPT14 and EGFRvIII.


In-house differential expression analysis was also performed to determine a distinct molecular signature distinguishing the EGFR-SEPT14 and EGFRvIII phenotypes. Toward this end, a t test was performed comparing the expression of the two groups of samples for each gene. After correction using the Benjamini-Hochberg method, only genes with FDR<0.05 were considered. In addition, genes with a variance less than the tenth percentile or an absolute value lower than two across all samples were excluded. These filters left a predictive set of ten genes. Hierarchical clustering was then performed on the expression of these ten genes using Euclidean distance and average linkage.


Modeling of LZTR1.


Structural templates for the kelch and BTB-BACK regions of human LZTR1 were identified with HHpred56. An initial three-dimensional model was generated with the I-TASSER server57. The CUL3 N-terminal domain was docked onto the model by superposing the KLHL3 BTB-BACK/CUL3 NTD crystal structure27 onto the second LZTR1 BTB-BACK domain. The model does not include higher quaternary structure, although many BTB domains, and many kelch domains, are known to self-associate25. The short linkage between the end of the first BACK domain and the beginning of the second BTB domain would seem to preclude an intrachain BTB-BTB pseudo-homodimer, and without being bound by theory, LZTR1 should self-associate and form higher-order assemblies. Both BACK domains are the shorter, atypical form of the domain and consist of two helical hairpin motifs, as in SPOP26,58, and not the four-hairpin motif seen in most BTB-BACK-kelch proteins28,58. The model from the kelch domain predicts an unusual 1+3 velcro arrangement59, with the N-terminal region contributing strand d of blade 1 and the C-terminal region contributing strands a, b and c of the same blade, although an alternative 2+2 velcro model cannot be ruled out.


Cell Culture.


U87 cells were obtained from ATCC. SNB19, U87 and HEK-293T cells were cultured in DMEM supplemented with 10% fetal bovine serum (FBS). Growth rates were determined by plating cells in six-well plates at 3 d after infection with the lentivirus indicated in the figure legends. The number of viable cells was determined by Trypan blue exclusion in triplicate cultures obtained from triplicate independent infections. For the wound assay testing migration, confluent cells were scratched with a pipette tip and cultured in 0.25% FBS. After 16 h, images were taken using the Olympus IX70 connected to a digital camera. Images were processed using the ImageJ64 software. The area of the cell-free wound was assessed in triplicate samples. Experiments were repeated twice.


GBM-derived primary cultures were grown in DMEM:F12 medium containing N2 and B27 supplements and human recombinant FGF-2 and EGF (50 ng/ml each; Peprotech). For sphere formation, cells were infected with lentiviral particles. Four days later, single cells were plated at a density of 1 cell per well in triplicate in low-attachment 96-well plates. The number and the size of spheres were scored after 10-14 d. Limiting dilution assays were performed as described previously60. Spheres were dissociated into single cells and plated in low-attachment 96-well plates in 0.2 ml of medium containing growth factors (EGF and FGF-2), except for the EGFR-transduced cells, which were cultured in the absence of EGF. Cultures were left undisturbed for 10 d, and then the percentage of wells not containing spheres for each cell dilution was calculated and plotted against the number of cells per well. Linear regression lines were plotted, and the number of cells required to generate at least one sphere in every well (the stem cell frequency) was calculated. The experiment was repeated twice. Treatment of GBM primary cultures with erlotinib or lapatinib was performed in cells transduced with the pLOC vector, wild-type pLOC-EGFR, EGFRvIII or EGFR-SEPT14 and selected with blasticidin for 5 d. Cells were seeded on 6-cm dishes in the absence of EGF and treated with the indicated drugs at the indicated doses for 48 h. Each treatment group was seeded in triplicate. Absolute viable cell counts were determined by Trypan blue exclusion and counted on a hemocytometer. EGF stimulation of EGFR-transduced primary glioma cells was performed in cells deprived of growth factors for 48 h. Cells were collected at the indicated times and processed for protein blot analysis.
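The limiting dilution readout can be sketched in Python as below. The sketch takes the description literally: it fits an ordinary least-squares line to the percentage of sphere-free wells against the number of cells per well and reports the cell number at which the fitted line reaches 0% sphere-free wells. The function names and the example data are hypothetical, and the published protocol cited above may fit a transformed response instead.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cells_for_sphere_in_every_well(cells_per_well, pct_wells_without_spheres):
    """Cell number at which the fitted line predicts 0% sphere-free wells."""
    slope, intercept = fit_line(cells_per_well, pct_wells_without_spheres)
    if slope >= 0:
        raise ValueError("sphere-free wells do not decrease with cell number")
    return -intercept / slope  # x-intercept of the regression line


# Hypothetical dilution series: cells per well and % wells without spheres.
doses = [1, 5, 10, 20]
negative_pct = [96.0, 80.0, 60.0, 20.0]
frequency = cells_for_sphere_in_every_well(doses, negative_pct)
```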


Immunofluorescence.


Immunofluorescence staining on normal mouse and human brain and brain tumor tissue microarrays was performed as previously described43,61,62. Immunofluorescence microscopy was performed on cells fixed with 4% paraformaldehyde in phosphate buffer. Cells were permeabilized using 0.2% Triton X-100. The antibodies and concentrations used in the immunofluorescence staining are detailed in Table 15.


Secondary antibodies conjugated to Alexa Fluor 594 (1:300, A11037, Molecular Probes) or Alexa 488 (1:500, A11008, Molecular Probes) were used. DNA was stained with DAPI (Sigma). Fluorescence microscopy was performed on a Nikon A1R MP microscope. Quantification of the fluorescence intensity staining in primary or established glioma cells was performed using NIH ImageJ software (see URLs). A histogram of the intensity of fluorescence of each point of a representative field for each condition was generated. The fluorescence intensity of ten fields from three independent experiments was scored, standardized to the number of cells in the field and divided by the intensity of the vector.


Protein Blotting, Immunoprecipitation and In Vitro Binding.


Protein blot analysis and immunoprecipitation were performed using the antibodies detailed in Table 16. For the in vitro binding between CUL3 and LZTR1, wild-type and mutant LZTR1 were translated in vitro using the TNT Quick Coupled Transcription/Translation System (Promega). Flag-CUL3 was immunoprecipitated from transfected HEK-293T cells with Flag-M2 beads (Sigma) using RIPA buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate (DOC), 0.1% SDS, 1 mM phenylmethylsulfonyl fluoride (PMSF), 10 mM NaF, 0.5 M Na3VO4 (sodium orthovanadate) and Complete Protease Inhibitor Cocktail, Roche). Binding was performed in 200 mM NaCl plus 0.5% NP-40 for 2 h at 4° C. Immunocomplexes were analyzed by SDS-PAGE and immunoblot.


Cloning and Lentiviral Production.


The lentiviral expression vectors pLOC-GFP and pLOC-CTNND2 were purchased from Open Biosystems. Full-length EGFR-SEPT14 cDNA was amplified from tumor sample TCGA-27-1837. Wild-type EGFR, EGFRvIII and EGFR-SEPT14 cDNAs were cloned into the pLOC vector. pCDNA-MYC-Hist-LZTR1 was a kind gift24. pCDNA-Flag-CUL3 was a gift. Wild-type and mutant cDNAs for LZTR1 and CTNND2 obtained by site-directed mutagenesis (QuikChange II, Agilent) were cloned into the pLOC vector. Lentiviral particles were produced using published protocols43, 61, 62, 63, 64.


Genomic PCR and RT-PCR.


Total RNA was extracted from cells using an RNeasy Mini Kit (QIAGEN) following the manufacturer's instructions. Five-hundred nanograms of total RNA was retrotranscribed using the Superscript III kit (Invitrogen) following the manufacturer's instructions. The cDNAs obtained after the retrotranscription were used as templates for quantitative PCR as described43,64. The reaction was performed with a Roche480 thermal cycler using the Absolute Blue QPCR SYBR Green Mix from Thermo Scientific. The relative amount of specific mRNA was normalized to GAPDH. Results are presented as the mean±s.d. of triplicate amplifications. The validation of fusion transcripts was performed using both genomic PCR and RT-PCR with forward and reverse primer combinations designed within the margins of the paired-end read sequences detected by RNA-seq. Expressed fusion transcript variants were subjected to direct sequencing to confirm the sequence and translation frame. The primers used for the screening of gene fusions are detailed in Table 17. The primers used for genomic detection of gene fusions are listed in Table 18. Semiquantitative RT-PCR to detect exogenous wild-type MYC-LZTR1 and mutant p.Arg801Trp LZTR1 was performed using the primers listed in Table 19.
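As an illustration of normalizing a transcript to GAPDH and reporting the mean ± s.d. of triplicate amplifications, the sketch below uses the comparative Ct calculation, 2^-(Ct_target - Ct_GAPDH). The text does not state which formula was used (it refers to the cited protocols), so the formula, the assumption of roughly 100% amplification efficiency, the function name and the Ct values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_gapdh):
    """Return (mean, s.d.) of 2**-(Ct_target - Ct_GAPDH) over paired replicate wells."""
    if len(ct_target) != len(ct_gapdh):
        raise ValueError("target and GAPDH need the same number of replicates")
    # Assumed comparative-Ct formula; one value per replicate well.
    values = [2.0 ** -(t - g) for t, g in zip(ct_target, ct_gapdh)]
    return mean(values), stdev(values)


# Hypothetical triplicate Ct values.
avg, sd = relative_expression([22.0, 22.0, 22.0], [20.0, 20.0, 20.0])
print(avg, sd)
```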


Subcutaneous Xenografts and Drug Treatment.


Female athymic mice (nu/nu genotype, BALB/c background, 6-8 weeks old) were used for all antitumor studies. Patient-derived adult human glioblastoma xenografts were maintained. Xenografts were excised from host mice under sterile conditions and homogenized with the use of a tissue press and modified tissue cytosieve (Biowhitter Inc.), and tumor homogenate was loaded into a repeating Hamilton syringe (Hamilton, Co.) dispenser. Cells were injected subcutaneously into the right flank of the athymic mouse at an inoculation volume of 50 μl with a 19-gauge needle65.


Subcutaneous tumors were measured twice weekly with hand-held vernier calipers (Scientific Products). Tumor volumes (V) were calculated with the following formula: V (mm³) = ((width)² × (length))/2. For the subcutaneous tumor studies, groups of mice randomly selected by tumor volume were treated with EGFR kinase inhibitors when the median tumor volumes were an average of 150 mm³ and were compared with control animals receiving vehicle (saline).
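A short sketch of the caliper-based volume estimate follows. It assumes the conventional form of the formula in which the width is squared, which is what yields a volume in mm³; the function name and the measurements are hypothetical.

```python
def tumor_volume_mm3(width_mm, length_mm):
    """Estimate tumor volume (mm^3) from caliper width and length in mm."""
    # Assumed formula: V = (width^2 x length) / 2
    return (width_mm ** 2) * length_mm / 2.0


# A hypothetical 6 mm x 8 mm tumor.
print(tumor_volume_mm3(6, 8))
```

A tumor of this size would be close to the 150 mm³ median volume at which treatment was started.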


Erlotinib was administered at 100 mg per kg body weight orally once per day for 10 d. Lapatinib was administered at 75 mg per kg body weight orally twice per day for 20 d. Response to treatment was assessed by a delay in tumor growth and tumor regression.


Growth delay, expressed as a T-C value, is defined as the difference in days between the median time required for tumors in treated and control animals to reach a volume five times greater than that measured at the start of the treatment. Tumor regression is defined as a decrease in tumor volume over two successive measurements. Statistical analysis was performed with a SAS statistical analysis program, using the Wilcoxon rank-order test for growth delay and Fisher's exact test for tumor regression.
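The two response definitions above can be expressed as a small sketch. It makes several assumptions that the text does not specify: the time to reach five times the starting volume is taken as the first measurement day on which that volume is met (no interpolation), animals whose tumors never reach it are excluded from the median, and regression is read as a decline at two consecutive measurements. All names and data are hypothetical.

```python
from statistics import median


def days_to_fivefold(days, volumes):
    """Days from treatment start until volume first reaches 5x the starting volume."""
    target = 5 * volumes[0]
    for day, volume in zip(days, volumes):
        if volume >= target:
            return day - days[0]
    return None  # never reached during follow-up


def growth_delay(treated, control):
    """T-C: median days to 5x volume in treated animals minus that in controls."""
    def group_median(group):
        times = [days_to_fivefold(d, v) for d, v in group]
        return median(t for t in times if t is not None)
    return group_median(treated) - group_median(control)


def has_regression(volumes):
    """True if the volume falls at two successive measurements."""
    return any(a > b > c for a, b, c in zip(volumes, volumes[1:], volumes[2:]))


# Hypothetical (days, volumes in mm^3) series, one tuple per animal.
control = [([0, 3, 7, 10], [100, 300, 500, 900]),
           ([0, 3, 7, 10], [100, 250, 450, 600])]
treated = [([0, 3, 7, 10, 14, 17], [100, 90, 80, 200, 400, 550]),
           ([0, 3, 7, 10, 14], [100, 150, 300, 450, 500])]

print(growth_delay(treated, control))
print(has_regression(treated[0][1]))
```

The group comparisons themselves (Wilcoxon rank-order and Fisher's exact tests) are not reproduced here.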


Intracranial Injection.


GBM-derived primary cells were first infected with a lentivirus expressing luciferase and subsequently transduced with the pLOC vector or pLOC-CTNND2 lentiviral particles. Intracranial injection was performed in 9-week-old male nu/nu mice (Charles River Laboratories). Briefly, 5×10^5 cells were resuspended in 2.5 μl of PBS and injected into the caudate putamen using a stereotaxic frame (coordinates relative to the bregma: 0.6 mm anterior; 1.65 mm medio-lateral; 3 mm depth-ventral). Tumor growth was monitored using the IVIS Imaging system. Briefly, mice were anesthetized with 3% isoflurane before intraperitoneal injection of 100 mg per kg body weight D-luciferin (Xenogen). Ten minutes after injection of D-luciferin, images were acquired for 1 min with the Xenogen IVIS system (Xenogen) using Living Image acquisition and analysis software (Xenogen). The bioluminescent signal was expressed in photons per second and displayed as a pseudo-color image representing the spatial distribution of photon counts.


URLs.


DNA and RNA sequencing and copy number variant data in The Cancer Genome Atlas (TCGA), http://cancergenome.nih.gov; glioma patient survival data from the Repository for Molecular Brain Neoplasia Data (REMBRANDT), https://caintegrator.nci.nih.gov/rembrandt/; sequence data deposition in database of Genotypes and Phenotypes (dbGaP), http://www.ncbi.nlm.nih.gov/gap; gene fusion annotation software package Pegasus, http://sourceforge.net/projects/pegasus-fus/.


Data Access.


RNA sequencing data from twenty-four human GBM sphere cultures in this study were deposited under the dbGaP study accession phs000505.v2.p1. RNA and DNA sequencing data of TCGA GBM samples were also analyzed from the dbGaP study accession phs000178.v1.p1.


REFERENCES FOR EXAMPLE 2



  • 1 Porter, K. R., McCarthy, B. J., Freels, S., Kim, Y. & Davis, F. G. Prevalence estimates for primary brain tumors in the United States by age, gender, behavior, and histology. Neuro-oncology 12, 520-527, doi:10.1093/neuonc/nop066 (2010).

  • 2 Stupp, R. et al. Radiotherapy plus concomitant and adjuvant temozolomide for glioblastoma. The New England journal of medicine 352, 987-996, doi: 10.1056/NEJMoa043330 (2005).

  • 3 Cancer Genome Atlas Research, N. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061-1068, doi: 10.1038/nature07385 (2008).

  • 4 Noushmehr, H. et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 17, 510-522, doi:10.1016/j.ccr.2010.03.017 (2010).

  • 5 Parsons, D. W. et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807-1812, doi: 10.1126/science.1164382 (2008).

  • 6 Verhaak, R. G. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98-110, doi:10.1016/j.ccr.2009.12.020 (2010).

  • 7 Bass, A. J. et al. Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet 43, 964-968, doi:10.1038/ng.936 (2011).

  • 8 Chinnaiyan, A. M. & Palanisamy, N. Chromosomal aberrations in solid tumors. Prog Mol Biol Transl Sci 95, 55-94, doi: 10.1016/B978-0-12-385071-3.00004-6 (2010).

  • 9 Singh, D. et al. Transforming fusions of FGFR and TACC genes in human glioblastoma. Science 337, 1231-1235, doi: 10.1126/science.1220834 (2012).

  • 10 Rubin, A. F. & Green, P. Mutation patterns in cancer genomes. Proc Natl Acad Sci USA 106, 21766-21770, doi:10.1073/pnas.0912499106 (2009).

  • 11 Fan, Z. et al. BCOR regulates mesenchymal stem cell function by epigenetic mechanisms. Nat Cell Biol 11, 1002-1009, doi: 10.1038/ncb1913 (2009).

  • 12 Wamstad, J. A. & Bardwell, V. J. Characterization of Bcor expression in mouse development. Gene Expr Patterns 7, 550-557, doi:10.1016/j.modgep.2007.01.006 (2007).

  • 13 Wamstad, J. A., Corcoran, C. M., Keating, A. M. & Bardwell, V. J. Role of the transcriptional corepressor Bcor in embryonic stem cell differentiation and early embryonic development. PLoS One 3, e2814, doi:10.1371/journal.pone.0002814 (2008).

  • 14 Pugh, T. J. et al. Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature 488, 106-110, doi: 10.1038/nature11329 (2012).

  • 15 Zhang, J. et al. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 481, 329-334, doi:10.1038/nature10733 (2012).

  • 16 Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899-905, doi: 10.1038/nature08822 (2010).

  • 17 Kantarci, S. et al. Mutations in LRP2, which encodes the multiligand receptor megalin, cause Donnai-Barrow and facio-oculo-acoustico-renal syndromes. Nat Genet 39, 957-959, doi:10.1038/ng2063 (2007).

  • 18 Willnow, T. E. et al. Defective forebrain development in mice lacking gp330/megalin. Proc Natl Acad Sci USA 93, 8460-8464 (1996).

  • 19 Christ, A. et al. LRP2 is an auxiliary SHH receptor required to condition the forebrain ventral midline for inductive signals. Dev Cell 22, 268-278, doi:10.1016/j.devcel.2011.11.023 (2012).

  • 20 Cowin, P. A. et al. LRP1B deletion in high-grade serous ovarian cancers is associated with acquired chemotherapy resistance to liposomal doxorubicin. Cancer Res 72, 4060-4073, doi: 10.1158/0008-5472.CAN-12-0203 (2012).

  • 21 Lima, F. R. et al. Glioblastoma: therapeutic challenges, what lies ahead. Biochim Biophys Acta 1826, 338-349, doi: 10.1016/j.bbcan.2012.05.004 (2012).

  • 22 Bekker-Jensen, S. et al. HERC2 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Nat Cell Biol 12, 80-86; sup pp 81-12, doi:10.1038/ncb2008 (2010).

  • 23 Harlalka, G. V. et al. Mutation of HERC2 causes developmental delay with Angelman-like features. J Med Genet 50, 65-73, doi: 10.1136/jmedgenet-2012-101367 (2013).

  • 24 Nacak, T. G., Leptien, K., Fellner, D., Augustin, H. G. & Kroll, J. The BTB-kelch protein LZTR-1 is a novel Golgi protein that is degraded upon induction of apoptosis. J Biol Chem 281, 5065-5071, doi: 10.1074/jbc.M509073200 (2006).

  • 25 Stogios, P. J., Downs, G. S., Jauhal, J. J., Nandra, S. K. & Prive, G. G. Sequence and structural analysis of BTB domain proteins. Genome Biol 6, R82, doi:10.1186/gb-2005-6-10-r82 (2005).

  • 26 Errington, W. J. et al. Adaptor protein self-assembly drives the control of a cullin-RING ubiquitin ligase. Structure 20, 1141-1153, doi:10.1016/j.str.2012.04.009 (2012).

  • 27 Ji, A. X. & Prive, G. G. Crystal structure of KLHL3 in complex with Cullin3. PLoS One 8, e60445, doi: 10.1371/journal.pone.0060445 (2013).

  • 28 Canning, P. et al. Structural basis for Cul3 assembly with the BTB-Kelch family of E3 ubiquitin ligases. J Biol Chem, doi:10.1074/jbc.M112.437996 (2013).

  • 29 Lo, S. C., Li, X., Henzl, M. T., Beamer, L. J. & Hannink, M. Structure of the Keap1:Nrf2 interface provides mechanistic insight into Nrf2 signaling. EMBO J 25, 3605-3617, doi:10.1038/sj.emboj.7601243 (2006).

  • 30 Boyden, L. M. et al. Mutations in kelch-like 3 and cullin 3 cause hypertension and electrolyte abnormalities. Nature 482, 98-102, doi: 10.1038/nature10814 (2012).

  • 31 Louis-Dit-Picard, H. et al. KLHL3 mutations cause familial hyperkalemic hypertension by impairing ion transport in the distal nephron. Nat Genet 44, 456-460, S451-453, doi: 10.1038/ng.2218 (2012).

  • 32 Emanuele, M. J. et al. Global identification of modular cullin-RING ligase substrates. Cell 147, 459-474, doi:10.1016/j.cell.2011.09.019 (2011).

  • 33 Galan, J. M. & Peter, M. Ubiquitin-dependent degradation of multiple F-box proteins by an autocatalytic mechanism. Proc Natl Acad Sci USA 96, 9124-9129 (1999).

  • 34 Zhang, D. D. et al. Ubiquitination of Keap1, a BTB-Kelch substrate adaptor protein for Cul3, targets Keap1 for degradation by a proteasome-independent pathway. J Biol Chem 280, 30091-30099, doi:10.1074/jbc.M501279200 (2005).

  • 35 Gunther, H. S. et al. Glioblastoma-derived stem cell-enriched cultures form distinct subgroups according to molecular and phenotypic criteria. Oncogene 27, 2897-2909, doi: 10.1038/sj.onc.1210949 (2008).

  • 36 Abu-Elneel, K. et al. A delta-catenin signaling pathway leading to dendritic protrusions. J Biol Chem 283, 32781-32791, doi: 10.1074/jbc.M804688200 (2008).

  • 37 Arikkath, J. et al. Delta-catenin regulates spine and synapse morphogenesis and function in hippocampal neurons during development. J Neurosci 29, 5435-5442, doi: 10.1523/JNEUROSCI.0835-09.2009 (2009).

  • 38 Kosik, K. S., Donahue, C. P., Israely, I., Liu, X. & Ochiishi, T. Delta-catenin at the synaptic-adherens junction. Trends Cell Biol 15, 172-178, doi:10.1016/j.tcb.2005.01.004 (2005).

  • 39 Israely, I. et al. Deletion of the neuron-specific protein delta-catenin leads to severe cognitive and synaptic dysfunction. Curr Biol 14, 1657-1663, doi:10.1016/j.cub.2004.08.065 (2004).

  • 40 Jun, G. et al. delta-Catenin is genetically and biologically associated with cortical cataract and future Alzheimer-related structural and functional brain changes. PLoS One 7, e43728, doi: 10.1371/journal.pone.0043728 (2012).

  • 41 Hicks, S., Wheeler, D. A., Plon, S. E. & Kimmel, M. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum Mutat 32, 661-668, doi: 10.1002/humu.21490 (2011).

  • 42 Phillips, H. S. et al. Molecular subclasses of high-grade glioma predict prognosis, delineate a pattern of disease progression, and resemble stages in neurogenesis. Cancer Cell 9, 157-173, doi:10.1016/j.ccr.2006.02.019 (2006).

  • 43 Carro, M. S. et al. The transcriptional network for mesenchymal transformation of brain tumours. Nature 463, 318-325, doi: 10.1038/nature08712 (2010).

  • 44 Pierotti, M. A. & Greco, A. Oncogenic rearrangements of the NTRK1/NGF receptor. Cancer Lett 232, 90-98, doi: 10.1016/j.canlet.2005.07.043 (2006).

  • 45 Dunn, G. P. et al. Emerging insights into the molecular and cellular basis of glioblastoma. Genes Dev 26, 756-784, doi: 10.1101/gad.187922.112 (2012).

  • 46 Liu, C. et al. Chemokine receptor CXCR3 promotes growth of glioma. Carcinogenesis 32, 129-137, doi: 10.1093/carcin/bgq224 (2011).

  • 47 Vivanco, I. et al. Differential sensitivity of glioma-versus lung cancer-specific EGFR mutations to EGFR kinase inhibitors. Cancer Discov 2, 458-471, doi:10.1158/2159-8290.CD-11-0284 (2012).

  • 48 Forbes, S. A. et al. COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38, D652-657, doi: 10.1093/nar/gkp995 (2010).

  • 49 Northcott, P. A. et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49-56, doi: 10.1038/nature11327 (2012).

  • Srivastava, M. et al. The Amphimedon queenslandica genome and the evolution of animal complexity. Nature 466, 720-726 (2010).

  • Stogios et al. Sequence and structural analysis of BTB domain proteins. Genome Biol. 6(10):R82 (2005).

  • Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 21(7):951-60 (2005).



Tables and Legends for Examples









TABLE 7

Gene fusions identified through RNA sequencing.

Sample | Chrom 5p | Start 5p | End 5p | Chrom 3p | Start 3p | End 3p
G17807.TCGA-28-5209-01A-01R-1850-01.4 | chr7 | 55086724 | 55268105 | chr7 | 56078745 | 56079561
G17197.TCGA-06-0211-01B-01R-1849-01.2 | chr7 | 55433140 | 55479781 | chr7 | 55861236 | 55886915
G17650.TCGA-28-2513-01A-01R-1850-01.2 | chr7 | 55086724 | 55268105 | chr7 | 55861236 | 55863784
G17506.TCGA-27-1835-01A-01R-1850-01.2 | chr4 | 1795038 | 1808660 | chr4 | 1741428 | 1746894
G17191.TCGA-06-0211-01A-01R-1849-01.2 | chr7 | 55433140 | 55479781 | chr7 | 55861236 | 55886915
G17512.TCGA-27-1837-01A-01R-1850-01.2 | chr7 | 55086724 | 55268105 | chr7 | 55861236 | 55863784
NYU_A | chr4 | 1795038 | 1808660 | chr4 | 1737457 | 1746894
G17814.TCGA-06-5411-01A-01R-1849-01.4 | chr1 | 204797781 | 204951147 | chr1 | 156844362 | 156851641
G17223.TCGA-06-0750-01A-01R-1849-01.2 | chr7 | 55177539 | 55268105 | chr7 | 55861236 | 55863784
G17798.TCGA-32-5222-01A-01R-1850-01.4 | chr7 | 55086724 | 55268105 | chr7 | 55861236 | 55863784
G17195.TCGA-06-0138-01A-02R-1849-01.2 | chr12 | 69753531 | 69764754 | chr12 | 58339410 | 58351050
G17803.TCGA-76-4925-01A-01R-1850-01.4 | chr4 | 1795038 | 1808660 | chr4 | 1739324 | 1746894
NYU_B | chr17 | 7387849 | 7388175 | chr17 | 7604058 | 7605804
G17507.TCGA-28-1747-01C-01R-1850-01.2 | chr7 | 55086724 | 55268105 | chr7 | 55861236 | 55863784
G17469.TCGA-06-2557-01A-01R-1849-01.2 | chr7 | 7388705 | 73604247 | chr7 | 74173109 | 74175019
G17785.TCGA-06-5413-01A-01R-1849-01.4 | chr14 | 32798478 | 32903022 | chr14 | 34393422 | 34400420
G17467.TCGA-14-0736-02A-01R-2005-01.2 | chr14 | 102606211 | 102606508 | chr19 | 12856190 | 12859137
GBM-CUMC3316_L1 | chr22 | 22160138 | 22221969 | chr12 | 58166799 | 58176322
G17663.TCGA-19-2619-01A-01R-1850-01.2 | chr1 | 156611739 | 156628524 | chr1 | 156844697 | 156851641
G17203.TCGA-06-0211-02A-02R-2005-01.2 | chr7 | 54825187 | 54826938 | chr7 | 55224225 | 55224641
G17784.TCGA-76-4929-01A-01R-1850-01.4 | chr2 | 9675962 | 9695916 | chr2 | 9724106 | 9731643
G17675.TCGA-19-2624-01A-01R-1850-01.2 | chr7 | 55086724 | 55240816 | chr12 | 65107223 | 65110598
G17796.TCGA-41-5651-01A-01R-1850-01.4 | chr12 | 58166382 | 58166910 | chr12 | 58162351 | 58163738
G17666.TCGA-06-5415-01A-01R-1849-01.2 | chr5 | 134209459 | 134210219 | chr2 | 219103386 | 219119069
G17219.TCGA-06-0158-01A-01R-1849-01.2 | chr2 | 42396489 | 42396775 | chrX | 41374190 | 41420896
G17790.TCGA-06-5856-01A-01R-1849-01.4 | chr12 | 122150657 | 122150869 | chr12 | 58191442 | 58193702
NYU_B | chr1 | 19746154 | 19811991 | chr1 | 19401001 | 19433459
G17657.TCGA-19-1787-01B-01R-1850-01.2 | chr3 | 100428159 | 100438901 | chr3 | 100348441 | 100414320
G17643.TCGA-12-5295-01A-01R-1849-01.2 | chr20 | 43976964 | 43977063 | chr3 | 52740659 | 52742194
NYU_G | chr5 | 100191806 | 100238986 | chr5 | 102260660 | 102365415
G17494.TCGA-14-2554-01A-01R-1850-01.2 | chr3 | 48956280 | 48965245 | chr3 | 48265643 | 48266974
G17196.TCGA-06-0178-01A-01R-1849-01.2 | chr12 | 58123421 | 58135939 | chr8 | 94827532 | 94830345
G17782.TCGA-26-5136-01B-01R-1850-01.4 | chr1 | 95392393 | 95392734 | chr1 | 95448278 | 95448861
GBM-CUMC3296_L1 | chr7 | 55639963 | 55640199 | chr12 | 68642024 | 68645357
G17199.TCGA-06-0744-01A-01R-1849-01.2 | chr7 | 55086724 | 55087057 | chr9 | 102741464 | 102784507
G17792.TCGA-28-5204-01A-01R-1850-01.4 | chr4 | 6988888 | 6996028 | chr4 | 8288287 | 8298197
G17476.TCGA-06-2569-01A-01R-1849-01.2 | chr6 | 7310149 | 7313540 | chrX | 31137344 | 32328392
G17195.TCGA-06-0138-01A-02R-1849-01.2 | chr12 | 69139935 | 69145969 | chr12 | 58109542 | 58115337
G17212.TCGA-06-0129-01A-01R-1849-01.2 | chr5 | 140855568 | 140858112 | chr6 | 163956013 | 163991168
G17213.TCGA-06-0157-01A-01R-1849-01.2 | chr4 | 2061586 | 2062888 | chr16 | 90071278 | 90075837
G17200.TCGA-06-0125-01A-01R-1849-01.2 | chr1 | 27022521 | 27094489 | chr14 | 24624395 | 24630546
G17804.TCGA-06-5408-01A-01R-1849-01.4 | chr7 | 55086724 | 55268105 | chr7 | 56078745 | 56079561
G17656.TCGA-28-2514-01A-02R-1850-01.2 | chr1 | 27022521 | 27024030 | chr1 | 49493541 | 49202173
G17654.TCGA-41-4097-01A-01R-1850-01.2 | chr5 | 146017813 | 146461053 | chr1 | 156278751 | 156294879
G17206.TCGA-06-0125-02A-11R-2005-01.2 | chr1 | 27022521 | 27094489 | chr14 | 24624365 | 24630546
G17792.TCGA-28-5204-01A-01R-1850-01.4 | chr9 | 36190891 | 36204175 | chr9 | 33911961 | 33920398
G17663.TCGA-19-2619-01A-01R-1850-01.2 | chr1 | 205719084 | 205719360 | chr1 | 205797153 | 205811016
NYU_E | chr16 | 69221049 | 69333676 | chr16 | 69349910 | 69358944
NYU_G | chr19 | 42759129 | 42759308 | chr19 | 42734337 | 42744293
NYU_B | chr12 | 6457892 | 6473312 | chr12 | 6437923 | 6443409
G17675.TCGA-19-2624-01A-01R-1850-01.2 | chr12 | 69201970 | 69230528 | chr12 | 63037762 | 63195939
G17484.TCGA-14-0787-01A-01R-1849-01.2 | chr1 | 155385534 | 155532323 | chr1 | 156374054 | 156384544
G17792.TCGA-28-5204-01A-01R-1850-01.4 | chr7 | 100065174 | 100076901 | chr7 | 100054237 | 100061308
BT299 | chr1 | 23037330 | 23037535 | chr1 | 1246964 | 1256472
G17667.TCGA-26-5134-01A-01R-1850-01.2 | chr4 | 54373483 | 54424435 | chr4 | 53457128 | 53494329
GBM-CUMC3338_L1 | chr19 | 41198855 | 41207363 | chr19 | 41171813 | 41192899
G17815.TCGA-19-5960-01A-11R-1850-01.4 | chr1 | 154918047 | 154928566 | chr1 | 154897208 | 154904890
G17662.TCGA-32-1970-01A-01R-1850-01.2 | chr19 | 58740069 | 58758159 | chr19 | 50285013 | 50310366
NYU_E | chr7 | 65540775 | 65557649 | chr7 | 65592690 | 54519550
G17816.TCGA-28-5215-01A-01R-1850-01.4 | chr7 | 55086724 | 55268105 | chr7 | 56078745 | 56079561
G17650.TCGA-28-2513-01A-01R-1850-01.2 | chr7 | 55433140 | 55433921 | chr7 | 55861236 | 55914329
BT308 | chr20 | 33222418 | 33265088 | chr20 | 33302578 | 33303168
G17656.TCGA-28-2514-01A-02R-1850-01.2 | chr7 | 55086724 | 55268105 | chr7 | 55538305 | 55588822
G17639.TCGA-12-3652-01A-01R-1849-01.2 | chr9 | 33290509 | 33295424 | chr9 | 34379016 | 34382866
G17796.TCGA-41-5651-01A-01R-1850-01.4 | chr12 | 58087885 | 58090155 | chr12 | 57960908 | 57978553
G17468.TCGA-19-0957-02A-11R-2005-01.2 | chr1 | 61547979 | 61554351 | chr10 | 90043860 | 90122481
G17790.TCGA-06-5856-01A-01R-1849-01.4 | chr13 | 20304378 | 20357082 | chr13 | 20977808 | 20987525
GBM-CUMC3338_L1 | chr17 | 7465318 | 7470321 | chr17 | 7215977 | 7217039
G17790.TCGA-06-5856-01A-01R-1849-01.4 | chr12 | 117175594 | 117175842 | chr12 | 69229608 | 69239210
G17802.TCGA-28-5208-01A-01R-1850-01.4 | chr7 | 51203854 | 51258774 | chr7 | 55861236 | 55886915
G17210.TCGA-12-0616-01A-01R-1849-01.2 | chr7 | 101006121 | 101176423 | chr7 | 30536238 | 30536850
NYU_B | chr2 | 233562014 | 233613791 | chr2 | 234867560 | 234928164
G17469.TCGA-06-2557-01A-01R-1849-01.2 | chr7 | 55086724 | 55268105 | chr7 | 55861236 | 55863784
G17480.TCGA-27-1830-01A-01R-1850-01.2 | chr2 | 2319216080 | 231945027 | chr2 | 230222345 | 230377651
G17634.TCGA-19-2625-01A-01R-1850-01.2 | chr22 | 36424287 | 36424584 | chr22 | 37966253 | 37967937
G17654.TCGA-41-4097-01A-01R-1850-01.2 | chr3 | 66119284 | 66313802 | chr2 | 86371055 | 86374955
G17485.TCGA-14-1402-02A-01R-2005-01.2 | chr3 | 142166711 | 142166852 | chr1 | 245912644 | 246093238
G17498.TCGA-02-2483-01A-01R-1849-01.2 | chr9 | 119976636 | 120177316 | chr16 | 332024 | 333367
G17799.TCGA-06-1804-01A-01R-1849-01.4 | chr13 | 60971426 | 61109366 | chr13 | 47345391 | 47345630
G17660.TCGA-06-5414-01A-01R-1849-01.2 | chr8 | 61591338 | 61655655 | chr8 | 59717976 | 59872566
GBM-CUMC3296_L1 | chr12 | 67663060 | 69688935 | chr7 | 55238867 | 55275029
G17505.TCGA-06-2564-01A-01R-1849-01.2 | chr5 | 216843 | 218296 | chr5 | 1253286 | 1282738
G17798.TCGA-32-5222-01A-01R-1850-01.4 | chr1 | 11313895 | 11322607 | chr1 | 6880240 | 6932096
GBM-CUMC3297_L1 | chr22 | 24640520 | 24641109 | chr22 | 18659538 | 18660159
GBM-CUMC3342_L1 | chr10 | 61083747 | 61122351 | chr10 | 104375024 | 104378415
G17500.TCGA-27-1831-01A-01R-1850-01.2 | chr13 | 39261172 | 39438742 | chr13 | 41790516 | 41808142
G17212.TCGA-06-0129-01A-01R-1849-01.2 | chr6 | 3410421 | 3456792 | chr6 | 47199268 | 47221256
GBM-CUMC3322_L1 | chr7 | 141251077 | 141255366 | chr7 | 158715067 | 158738881
G17675.TCGA-19-2624-01A-01R-1850-01.2 | chr12 | 54378945 | 54379793 | chr12 | 56295196 | 56297263
G17803.TCGA-76-4925-01A-01R-1850-01.4 | chr17 | 74046508 | 74068577 | chr19 | 10683347 | 10694745
G17796.TCGA-41-5651-01A-01R-1850-01.4 | chr12 | 58176535 | 58186855 | chr12 | 57849876 | 57851788
G17663.TCGA-19-2619-01A-01R-1850-01.2 | chr7 | 70597788 | 70800718 | chr7 | 94827660 | 94925724
G17207.TCGA-06-0156-01A-03R-1849-01.2 | chr11 | 910774 | 910873 | chr16 | 4390252 | 4438631
G17802.TCGA-28-5208-01A-01R-1850-01.4 | chr7 | 55433140 | 55433921 | chr7 | 56078745 | 56079561
G17485.TCGA-14-1402-02A-01R-2005-01.2 | chr7 | 151216545 | 151217009 | chr7 | 151253202 | 151372722
NYU_E | chr11 | 88033697 | 88070940 | chr11 | 87846430 | 87883122
G17212.TCGA-06-0129-01A-01R-1849-01.2 | chr12 | 51079615 | 51128912 | chr6 | 42029094 | 42048642
G17787.TCGA-26-5139-01A-01R-1850-01.4 | chr19 | 13915605 | 13920033 | chr1 | 156005092 | 156012703
G17675.TCGA-19-2624-01A-01R-1850-01.2 | chr12 | 65218409 | 65232630 | chr12 | 63037762 | 63226038
G17800.TCGA-06-5859-01A-01R-1849-01.4 | chr18 | 48556582 | 48586285 | chr18 | 56934268 | 56936732
G17638.TCGA-28-2499-01A-01R-1850-01.2 | chr2 | 234263152 | 234299128 | chr2 | 234967479 | 234978666



Sample | Strand 5p | Strand 3p | Genes 5p | Genes 3p | Total frags (split inserts + split reads) | Spanning frags (split reads)
G17807.TCGA-28-5209-01A-01R-1850-01.4 | + |  | EGFR | PSPH | 6849 | 5648
G17197.TCGA-06-0211-01B-01R-1849-01.2 | + |  | LANCL2 | SEPT14 | 2619 | 2078
G17650.TCGA-28-2513-01A-01R-1850-01.2 | + |  | EGFR | SEPT14 | 1899 | 1464
G17506.TCGA-27-1835-01A-01R-1850-01.2 | + | + | FGFR3 | TACC3 | 1749 | 1604
G17191.TCGA-06-0211-01A-01R-1849-01.2 | + |  | LANCL2 | SEPT14 | 1367 | 1128
G17512.TCGA-27-1837-01A-01R-1850-01.2 | + |  | EGFR | SEPT14 | 989 | 796
NYU_A | + | + | FGFR3 | TACC3 | 973 | 492
G17814.TCGA-06-5411-01A-01R-1849-01.4 | + | + | KIAA0756, NFASC | NTRK1 | 841 | 751
G17223.TCGA-06-0750-01A-01R-1849-01.2 | + |  | EGFR | SEPT14 | 534 | 414
G17798.TCGA-32-5222-01A-01R-1850-01.4 | + |  | EGFR | SEPT14 | 528 | 495
G17195.TCGA-06-0138-01A-02R-1849-01.2 | + | + | YEATS4 | XRCC6BP1 | 328 | 306
G17803.TCGA-76-4925-01A-01R-1850-01.4 | + | + | FGFR3 | TACC3 | 303 | 203
NYU_B | + | + | POLR2A | WRAP53 | 263 | 240
G17507.TCGA-28-1747-01C-01R-1850-01.2 | + |  | EGFR | SEPT14 | 181 | 142
G17469.TCGA-06-2557-01A-01R-1849-01.2 | + | + | EIF4H | GTF2I | 180 | 142
G17785.TCGA-06-5413-01A-01R-1849-01.4 | + |  | AKAP6 | EGLN3 | 171 | 129
G17467.TCGA-14-0736-02A-01R-2005-01.2 | + | + | WDR20 | ASNA1 | 151 | 75
GBM-CUMC3316_L1 |  | + | MAPK1 | FAM119B, DKFZp586D0919 | 145 | 96
G17663.TCGA-19-2619-01A-01R-1850-01.2 | + | + | BCAN | NTRK1 | 130 | 17
G17203.TCGA-06-0211-02A-02R-2005-01.2 |  | + | SEC61G | EGFR | 129 | 103
G17784.TCGA-76-4929-01A-01R-1850-01.4 |  |  | ADAM17 | YWHAQ | 106 | 95
G17675.TCGA-19-2624-01A-01R-1850-01.2 | + |  | EGFR | GNS | 92 | 59
G17796.TCGA-41-5651-01A-01R-1850-01.4 | + |  | FAM119B, DKFZp586D0919 | METTL1 | 29 | 18
G17666.TCGA-06-5415-01A-01R-1849-01.2 | + | + | TXNDC15 | ARPC2 | 90 | 36
G17219.TCGA-06-0158-01A-01R-1849-01.2 | + |  | EML4 | CASK | 86 | 63
G17790.TCGA-06-5856-01A-01R-1849-01.4 | + |  | TMEM120B | AVIL | 75 | 56
NYU_B |  |  | CAPZB | UBR4 | 72 | 48
G17657.TCGA-19-1787-01B-01R-1850-01.2 | + | + | TFG | GPR128 | 23 | 19
G17643.TCGA-12-5295-01A-01R-1849-01.2 |  | + | SDC4 | SPCS1 | 68 | 18
NYU_G |  | + | ST8SIA4 | PAM | 60 | 55
G17494.TCGA-14-2554-01A-01R-1850-01.2 | + | + | Arl2, ARIH2 | CAMP | 59 | 52
G17196.TCGA-06-0178-01A-01R-1849-01.2 |  | + | AGAP2 | TMEM67 | 59 | 21
G17782.TCGA-26-5136-01B-01R-1850-01.4 |  |  | CNN3 | ALG14 | 54 | 49
GBM-CUMC3296_L1 |  |  | VOPP1 | IL22 | 48 | 35
G17199.TCGA-06-0744-01A-01R-1849-01.2 | + |  | EGFR | ERP44, KIAA0573 | 46 | 30
G17792.TCGA-28-5204-01A-01R-1850-01.4 | + | + | TBC1D14 | HTRA3 | 45 | 33
G17476.TCGA-06-2569-01A-01R-1849-01.2 |  |  | SSR1 | DMD | 44 | 33
G17195.TCGA-06-0138-01A-02R-1849-01.2 | + | + | SLC35E3 | OS9 | 44 | 38
G17212.TCGA-06-0129-01A-01R-1849-01.2 | + | + | PCDHGC3, PCDHGC4 | QKI | 43 | 24
G17213.TCGA-06-0157-01A-01R-1849-01.2 | + |  | NAT8L | DBNDD1, DKFZp761L2416 | 39 | 24
G17200.TCGA-06-0125-01A-01R-1849-01.2 | + | + | ARID1A | hoip, RNF31 | 39 | 26
G17804.TCGA-06-5408-01A-01R-1849-01.4 | + |  | EGFR | PSPH | 38 | 37
G17656.TCGA-28-2514-01A-02R-1850-01.2 | + |  | ARID1A | BEND5 | 37 | 29
G17654.TCGA-41-4097-01A-01R-1850-01.2 |  |  | PPP2R2B | CCT3, DKFZp667A196 | 37 | 18
G17206.TCGA-06-0125-02A-11R-2005-01.2 | + | + | ARID1A | hoip, RNF31 | 36 | 21
G17792.TCGA-28-5204-01A-01R-1850-01.4 | + | + | CLTA | UBE2R2 | 36 | 31
G17663.TCGA-19-2619-01A-01R-1850-01.2 |  |  | NUCKS1 | PM20D1 | 34 | 31
NYU_E | + | + | SNTB2 | VPS4A | 14 | 13
NYU_G |  |  | ERF | GSK3A | 14 | 3
NYU_B |  |  | SCNN1A | TNFRSF1A | 13 | 9
G17675.TCGA-19-2624-01A-01R-1850-01.2 | + |  | MDM2 | PPM1H | 34 | 22
G17484.TCGA-14-0787-01A-01R-1849-01.2 |  |  | ASH1L | C1orf61 | 33 | 30
G17792.TCGA-28-5204-01A-01R-1850-01.4 |  |  | TSC22D4 | C7orf61 | 12 | 1
BT299 | + |  | EPHB2 | CPSF3L | 33 | 6
G17667.TCGA-26-5134-01A-01R-1850-01.2 |  |  | LNX1 | USP46 | 33 | 27
GBM-CUMC3338_L1 |  |  | ADCK4 | NUMBL | 11 | 4
G17815.TCGA-19-5960-01A-11R-1850-01.4 |  |  | PBXIP1 | PMVK | 10 | 8
G17662.TCGA-32-1970-01A-01R-1850-01.2 | + | + | ZNF544 | DKFZp586H1320, AP2A1 | 31 | 28
NYU_E | + | + | ASL | CRCP | 10 | 8
G17816.TCGA-28-5215-01A-01R-1850-01.4 | + |  | EGFR | PSPH | 31 | 28
G17650.TCGA-28-2513-01A-01R-1850-01.2 | + |  | LANCL2 | SEPT14 | 31 | 14
BT308 |  |  | PIGU | NCOA6 | 30 | 26
G17656.TCGA-28-2514-01A-02R-1850-01.2 | + |  | EGFR | GASP, VOPP1 | 29 | 23
G17639.TCGA-12-3652-01A-01R-1849-01.2 | + |  | NFX1 | C9orf24 | 28 | 20
G17796.TCGA-41-5651-01A-01R-1850-01.4 | + | + | OS9 | KIF5A | 28 | 23
G17468.TCGA-19-0957-02A-11R-2005-01.2 | + |  | NFIA | RNLS, C1orf59 | 27 | 14
G17790.TCGA-06-5856-01A-01R-1849-01.4 |  |  | PSPC1 | CRYL1 | 26 | 18
GBM-CUMC3338_L1 | + |  | SENP3 | GPS2, KIAA1787 | 26 | 15
G17790.TCGA-06-5856-01A-01R-1849-01.4 |  | + | C12orf49 | MDM2 | 26 | 16
G17802.TCGA-28-5208-01A-01R-1850-01.4 |  |  | COBL | SEPT14 | 23 | 15
G17210.TCGA-12-0616-01A-01R-1849-01.2 | + |  | EMID2 | GGCT | 22 | 16
NYU_B | + | + | KIAA0642, GIGYF2 | TRPM8 | 22 | 20
G17469.TCGA-06-2557-01A-01R-1849-01.2 | + |  | EGFR | SEPT14 | 21 | 13
G17480.TCGA-27-1830-01A-01R-1850-01.2 | + |  | PSMD1 | DNER | 20 | 14
G17634.TCGA-19-2625-01A-01R-1850-01.2 |  |  | RBM9 | LGALS2 | 20 | 16
G17654.TCGA-41-4097-01A-01R-1850-01.2 | + |  | SLC25A26 | IMMT | 19 | 17
G17485.TCGA-14-1402-02A-01R-2005-01.2 |  |  | XRN1 | SMYD3 | 19 | 11
G17498.TCGA-02-2483-01A-01R-1849-01.2 |  | + | ASTN2 | ARHGDIG, PDIA2 | 18 | 6
G17799.TCGA-06-1804-01A-01R-1849-01.4 | + |  | TDRD3 | ESD | 18 | 13
G17660.TCGA-06-5414-01A-01R-1849-01.2 | + |  | CHD7 | TOX | 18 | 16
GBM-CUMC3296_L1 | + | + | CAND1 | EGFR | 17 | 14
G17505.TCGA-06-2564-01A-01R-1849-01.2 |  |  | CCDC127 | hTERT, TERT | 17 | 12
G17798.TCGA-32-5222-01A-01R-1850-01.4 |  | + | MTOR | KIAA0833, CAMTA1 | 17 | 13
GBM-CUMC3297_L1 |  | + | DKFZp566O011, GGT5 | USP18 | 16 | 6
GBM-CUMC3342_L1 |  | + | FAM13C | DKFZp434E2022, SUFU | 15 | 11
G17500.TCGA-27-1831-01A-01R-1850-01.2 | + |  | FREM2 | MTRF1 | 15 | 14
G17212.TCGA-06-0129-01A-01R-1849-01.2 |  |  | SLC22A23, DKFZp434F011 | TNFRSF21 | 15 | 11
GBM-CUMC3322_L1 | + | + | AGK | WDR60 | 15 | 12
G17675.TCGA-19-2624-01A-01R-1850-01.2 | + |  | HOXC10 | WIBG | 15 | 10
G17803.TCGA-76-4925-01A-01R-1850-01.4 |  |  | SRP68 | AP1M2 | 14 | 8
G17796.TCGA-41-5651-01A-01R-1850-01.4 | + | + | TSFM | INHBE | 14 | 9
G17663.TCGA-19-2619-01A-01R-1850-01.2 | + | + | WBSCR17 | PPP1R9A | 14 | 10
G17207.TCGA-06-0156-01A-03R-1849-01.2 |  |  | CHID1 | Magmas, hCG_1787779, CORO7 | 13 | 13
G17802.TCGA-28-5208-01A-01R-1850-01.4 | + |  | LANCL2 | PSPH | 13 | 2
G17485.TCGA-14-1402-02A-01R-2005-01.2 |  |  | RHEB | H91620, PRKAG2 | 12 | 11
NYU_E |  |  | CTSC | RAB38 | 12 | 11
G17212.TCGA-06-0129-01A-01R-1849-01.2 | + | + | DIP2B | TAF8 | 12 | 7
G17787.TCGA-26-5139-01A-01R-1850-01.4 | + |  | ZSWIM4 | UBQLN4 | 11 | 9
G17675.TCGA-19-2624-01A-01R-1850-01.2 | + |  | TBC1D30, KIAA0984 | PPM1H | 10 | 6
G17800.TCGA-06-5859-01A-01R-1849-01.4 | + |  | SMAD4 | RAX | 10 | 10
G17638.TCGA-28-2499-01A-01R-1850-01.2 | + | + | DGKD | SPP2 | 10 | 10

Sample | Gene Breakpoint 5p | Gene Breakpoint 3p | Frame Type | SEQ ID NO | Fused Sequence
G17807.TCGA-28-5209-01A-01R-1850-01.4 | 56268106 | 56079562 | InFrame | 8276 | ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG
CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT
GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA
GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA
GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT
GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG
ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT
TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC
TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT
ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG
ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG
AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA
GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA
GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC
ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC
AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG
GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA
AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG
GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT
GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG
GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG
ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT
ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT
TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG
TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG
GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG
CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG
TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT
CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT
GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG
GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG
GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG
AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA
AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA
TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT
CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT
ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA
ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT
AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA
ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA
AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC
GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC
TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG
ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT
TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG
TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG
GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG
ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT
CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA
GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG
CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT
CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG
GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC
CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG
TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA
GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC
TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG
AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT
CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT
GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA
GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT
CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC
CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC
CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC
TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG
CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA
AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA
CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA
CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA
AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG
AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA
ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG
TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC
CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG
AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT
CAGCCACCCATATGTACCATCGATGTCTACATGATCAT
GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA
AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC

CCGAGACCCCCAGCGCTACCTTGTCATTCAGTGATGCT







TTCATTGGATTTGGAGGAAATGTGATCAGGCAACAAG







TCAAGGATAACGCCAAATGGTATATCACTGATTTTGTA







GAGCTGCTGGGAGAACTGGAAGAA


G17197.TCGA-06-0211-01G-01R-1849-01.2
55479782
55886916
InFrame
8277
ATGGGCGAGACCATGTCAAAGAGGCTGAAGCTCCACC
TGGGAGGGGAGGCAGAAATGGAGGAACGGGCGTTCG
TCAACCCCTTCCCGGACTACGAGGCCGCCGCCGGGGC
GCTGCTCGCCTCCGGAGCGGCCGAAGAGACAGGCTGT
GTTCGTCCCCCGGCGACCACGGATGAGCCCGGCCTCCC




TTTTCATCAGGACGGGAAGATCATTCATAATTTCATAA







GACGGATCCAGACCAAAATTAAAGATCTTCTGCAGCA







AATGGAAGAAGGGCTGAAGACAGCTGATCCCCATGAC







TGCTCTGCTTATACTGGCTGGACAGGCATAGCCCTTTT







GTACCTGCAGTTGTACCGGGTCACATGTGACCAAACCT







ACCTGCTCCGATCCCTGGATTACGTAAAAAGAACACTT







CGGAATCTGAATGGCCGCAGGGTCACCTTCCTCTGTGG







GGATGCTGGCCCCCTGGCTGTTGGAGCTGTGATTTATC







ACAAACTCAGAAGTGACTGTGAGTCCCAGGAATGTGT







CACAAAACTTTTGCAGCTCCAGAGATCGGTTGTCTGCC







AAGAATCAGACCTTCCTGATGAGCTGCTTTATGGACGG







GCAGGTTATCTGTATGCCTTACTGTACCTGAACACAGA







GATAGGTCCAGGCACCGTGTGTGAGTCAGCTATTAAA







GAGGTAGTCAATGCTATTATTGAATCGGGTAAGACTTT







GTCAAGGGAAGAAAGAAAAACGGAGCGCTGCCCGCT







GTTGTACCAGTGGCACCGGAAGCAGTACGTTGGAGCA







GCCCATGGCATGGCTGGAATTTACTATATGTTAATGCA







GCCGGCAGCAAAAGTGGACCAAGAAACCTTGACAGAA







ATGGTGAAACCCAGTATTGATTATGTGCGCCACAAAAA







ATTCCGATCTGGGAATTACCCATCATCATTAAGCAATG







AAACAGACCGGCTGGTGCACTGGTGCCACGGCGCCCC







GGGGGTCATCCACATGCTCATGCAGGCGTACAAG|GG







GCTGTTACCCTTTGCTGTGGTAGGGAGTACAGATGAA







GTGAAAGTTGGAAAAAGGATGGTCAGAGGCCGTCACT







ACCCTTGGGGAGTTTTGCAAGTGGAAAATGAAAATCA







CTGTGACTTCGTTAAGCTCCGAGATATGCTTCTTTGTAC







CAATATGGAAAATCTAAAAGAAAAAACCCACACTCAG







CACTATGAATGTTATAGGTACCAAAAACTGCAGAAAAT







GGGCTTTACAGATGTGGGTCCAAACAACCAGCCAGTT







AGTTTTCAAGAAATCTTTGAAGCCAAAAGACAAGAGTT







CTATGATCAATGTCAGAGGGAAGAAGAAGAGTTGAAA







CAGAGATTTATGCAGCGAGTCAAGGAGAAAGAAGCAA







CATTTAAAGAAGCTGAAAAAGAGCTGCAGGACAAGTT







CGAGCATCTTAAAATGATTCAACAGGAGGAGATAAGG







AAGCTCGAGGAAGAGAAAAAACAACTGGAAGGAGAA







ATCATAGATTTTTATAAAATGAAAGCTGCCTCCGAAGC







ACTGCAGACTCAGCTGAGCACCGATACAAAGAAAGAC







AAACATCGTAAGAAA


G17650.TCGA-28-2513-01A-01R-1850-01.2
55268106
55863785
InFrame
8278
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG
CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT
GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA
GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA
GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


G17506.TCGA-27-1835-01A-01R-1850-01.2
1808661
1741429
InFrame
8279
ATGGGCGCCCCTGCCTGCGCCCTCGCGCTCTGCGTGGC
CGTGGCCATCGTGGCCGGCGCCTCCTCGGAGTCCTTG
GGGACGGAGCAGCGCGTCGTGGGGCGAGCGGCAGAA
GTCCCGGGCCCAGAGCCCGGCCAGCAGGAGCAGTTG
GTCTTCGGCAGCGGGGATGCTGTGGAGCTGAGCTGTC




CCCCGCCCGGGGGTGGTCCCATGGGGCCCACTGTCTG







GGTCAAGGATGGCACAGGGCTGGTGCCCTCGGAGCGT







GTCCTGGTGGGGCCCCAGCGGCTGCAGGTGCTGAATG







CCTCCCACGAGGACTCCGGGGCCTACAGCTGCCGGCA







GCGGCTCACGCAGCGCGTACTGTGCCACTTCAGTGTGC







GGGTGACAGACGCTCCATCCTCGGGAGATGACGAAGA







CGGGGAGGACGAGGCTGAGGACACAGGTGTGGACAC







AGGGGCCCCTTACTGGACACGGCCCGAGCGGATGGAC







AAGAAGCTGCTGGCCGTGCCGGCCGCCAACACCGTCC







GCTTCCGCTGCCCAGCCGCTGGCAACCCCACTCCCTCC







ATCTCCTGGCTGAAGAACGGCAGGGAGTTCCGCGGCG







AGCACCGCATTGGAGGCATCAAGCTGCGGCATCAGCA







GTGGAGCCTGGTCATGGAAAGCGTGGTGCCCTCGGAC







CGCGGCAACTACACCTGCGTCGTGGAGAACAAGTTTG







GCAGCATCCGGCAGACGTACACGCTGGACGTGCTGGA







GCGCTCCCCGCACCGGCCCATCCTGCAGGCGGGGCTG







CCGGCCAACCAGACGGCGGTGCTGGGCAGCGACGTG







GAGTTCCACTGCAAGGTGTACAGTGACGCACAGCCCC







ACATCCAGTGGCTCAAGCACGTGGAGGTGAATGGCAG







CAAGGTGGGCCCGGACGGCACACCCTACGTTACCGTG







CTCAAGTCCTGGATCAGTGAGAGTGTGGAGGCCGACG







TGCGCCTCCGCCTGGCCAATGTGTCGGAGCGGGACGG







GGGCGAGTACCTCTGTCGAGCCACCAATTTCATAGGC







GTGGCCGAGAAGGCCTTTTGGCTGAGCGTTCACGGGC







CCCGAGCAGCCGAGGAGGAGCTGGTGGAGGCTGACG







AGGCGGGCAGTGTGTATGCAGGCATCCTCAGCTACGG







GGTGGGCTTCTTCCTGTTCATCCTGGTGGTGGCGGCTG







TGACGCTCTGCCGCCTGCGCAGCCCCCCCAAGAAAGG







CCTGGGCTCCCCCACCGTGCACAAGATCTCCCGCTTCC







CGCTCAAGCGACAGGTGTCCCTGGAGTCCAACGCGTC







CATGAGCTCCAACACACCACTGGTGCGCATCGCAAGG







CTGTCCTCAGGGGAGGGCCCCACGCTGGCCAATGTCT







CCGAGCTCGAGCTGCCTGCCGACCCCAAATGGGAGCT







GTCTCGGGCCCGGCTGACCCTGGGCAAGCCCCTTGGG







GAGGGCTGCTTCGGCCAGGTGGTCATGGCGGAGGCC







ATCGGCATTGACAAGGACCGGGCCGCCAAGCCTGTCA







CCGTAGCCGTGAAGATGCTGAAAGACGATGCCACTGA







CAAGGACCTGTCGGACCTGGTGTCTGAGATGGAGATG







ATGAAGATGATCGGGAAACACAAAAACATCATCAACC







TGCTGGGCGCCTGCACGCAGGGCGGGCCCCTGTACGT







GCTGGTGGAGTACGCGGCCAAGGGTAACCTGCGGGA







GTTTCTgcgggcgcggcggcccccgggccTGGACTACTCCTTC







GACACCTGCAAGCCGCCCGAGGAGCAGCTCACCTTCA







AGGACCTGGTGTCCTGTGCCTACCAGGTGGCCCGGGG







CATGGAGTACTTGGCCTCCCAGAAGTGCATCCACAGG







GACCTGGCTGCCCGCAATGTGCTGGTGACCGAGGACA







ACGTGATGAAGATCGCAGACTTCGGGCTGGCCCGGGA







CGTGCACAACCTCGACTACTACAAGAAGACGACCAAC







GGCCGGCTGCCCGTGAAGTGGATGGCGCCTGAGGCCT







TGTTTGACCGAGTCTACACTCACCAGAGTGACGTCTGG







TCCTTTGGGGTCCTGCTCTGGGAGATCTTCACGCTGGG







GGGCTCCCCGTACCCCGGCATCCCTGTGGAGGAGCTCT







TCAAGCTGCTGAAGGAGGGCCACCGCATGGACAAGCC







CGCCAACTGCACACACGACCTGTACATGATCATGCGG







GAGTGCTGGCATGCCGCGCCCTCCCAGAGGCCCACCTT







CAAGCAGCTGGTGGAGGACCTGGACCGTGTCCTTACC







GTGACGTCCACCGAC|GTAAAGGCGACACAGGAGGAG







AACCGGGAGCTGAGGAGCAGGTGTGAGGAGCTCCAC







GGGAAGAACCTGGAACTGGGGAAGATCATGGACAGG







TTCGAAGAGGTTGTGTACCAGGCCATGGAGGAAGTTC







AGAAGCAGAAGGAACTTTCCAAAGCTGAAATCCAGAA







AGTTCTAAAAGAAAAAGACCAACTTACCACAGATCTGA







ACTCCATGGAGAAGTCCTTCTCCGACCTCTTCAAGCGT







TTTGAGAAACAGAAAGAGGTGATCGAGGGCTACCGCA







AGAACGAAGAGTCACTGAAGAAGTGCGTGGAGGATT







ACCTGGCAAGGATCACCCAGGAGGGCCAGAGGTACCA







AGCCCTGAAGGCCCACGCGGAGGAGAAGCTGCAGCT







GGCAAACGAGGAGATCGCCCAGGTCCGGAGCAAGGC







CCAGGCGGAAGCGTTGGCCCTCCAGGCCAGCCTGAGG







AAGGAGCAGATGCGCATCCAGTCGCTGGAGAAGACA







GTGGAGCAGAAGACTAAAGAGAACGAGGAGCTGACC







AGGATCTGCGACGACCTCATCTCCAAGATGGAGAAGA







TC


G17191.TCGA-06-0211-01A-01R-1849-01.2
55479782
55886916
InFrame
8280
ATGGGCGAGACCATGTCAAAGAGGCTGAAGCTCCACC
TGGGAGGGGAGGCAGAAATGGAGGAACGGGCGTTCG
TCAACCCCTTCCCGGACTACGAGGCCGCCGCCGGGGC
GCTGCTCGCCTCCGGAGCGGCCGAAGAGACAGGCTGT
GTTCGTCCCCCGGCGACCACGGATGAGCCCGGCCTCCC




TTTTCATCAGGACGGGAAGATCATTCATAATTTCATAA







GACGGATCCAGACCAAAATTAAAGATCTTCTGCAGCA







AATGGAAGAAGGGCTGAAGACAGCTGATCCCCATGAC







TGCTCTGCTTATACTGGCTGGACAGGCATAGCCCTTTT







GTACCTGCAGTTGTACCGGGTCACATGTGACCAAACCT







ACCTGCTCCGATCCCTGGATTACGTAAAAAGAACACTT







CGGAATCTGAATGGCCGCAGGGTCACCTTCCTCTGTGG







GGATGCTGGCCCCCTGGCTGTTGGAGCTGTGATTTATC







ACAAACTCAGAAGTGACTGTGAGTCCCAGGAATGTGT







CACAAAACTTTTGCAGCTCCAGAGATCGGTTGTCTGCC







AAGAATCAGACCTTCCTGATGAGCTGCTTTATGGACGG







GCAGGTTATCTGTATGCCTTACTGTACCTGAACACAGA







GATAGGTCCAGGCACCGTGTGTGAGTCAGCTATTAAA







GAGGTAGTCAATGCTATTATTGAATCGGGTAAGACTTT







GTCAAGGGAAGAAAGAAAAACGGAGCGCTGCCCGCT







GTTGTACCAGTGGCACCGGAAGCAGTACGTTGGAGCA







GCCCATGGCATGGCTGGAATTTACTATATGTTAATGCA







GCCGGCAGCAAAAGTGGACCAAGAAACCTTGACAGAA







ATGGTGAAACCCAGTATTGATTATGTGCGCCACAAAAA







ATTCCGATCTGGGAATTACCCATCATCATTAAGCAATG







AAACAGACCGGCTGGTGCACTGGTGCCACGGCGCCCC







GGGGGTCATCCACATGCTCATGCAGGCGTACAAG|GG







GCTGTTACCCTTTGCTGTGGTAGGGAGTACAGATGAA







GTGAAAGTTGGAAAAAGGATGGTCAGAGGCCGTCACT







ACCCTTGGGGAGTTTTGCAAGTGGAAAATGAAAATCA







CTGTGACTTCGTTAAGCTCCGAGATATGCTTCTTTGTAC







CAATATGGAAAATCTAAAAGAAAAAACCCACACTCAG







CACTATGAATGTTATAGGTACCAAAAACTGCAGAAAAT







GGGCTTTACAGATGTGGGTCCAAACAACCAGCCAGTT







AGTTTTCAAGAAATCTTTGAAGCCAAAAGACAAGAGTT







CTATGATCAATGTCAGAGGGAAGAAGAAGAGTTGAAA







CAGAGATTTATGCAGCGAGTCAAGGAGAAAGAAGCAA







CATTTAAAGAAGCTGAAAAAGAGCTGCAGGACAAGTT







CGAGCATCTTAAAATGATTCAACAGGAGGAGATAAGG







AAGCTCGAGGAAGAGAAAAAACAACTGGAAGGAGAA







ATCATAGATTTTTATAAAATGAAAGCTGCCTCCGAAGC







ACTGCAGACTCAGCTGAGCACCGATACAAAGAAAGAC







AAACATCGTAAGAAA


G17512.TCGA-27-1837-01A-01R-1850-01.2
55268106
55863785
InFrame
8281
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG
CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT
GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA
GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA
GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


NYU_A
1808661
1737458
InFrame
8282
ATGGGCGCCCCTGCCTGCGCCCTCGCGCTCTGCGTGGC







CGTGGCCATCGTGGCCGGCGCCTCCTCGGAGTCCTTG







GGGACGGAGCAGCGCGTCGTGGGGCGAGCGGCAGAA







GTCCCGGGCCCAGAGCCCGGCCAGCAGGAGCAGTTG







GTCTTCGGCAGCGGGGATGCTGTGGAGCTGAGCTGTC







CCCCGCCCGGGGGTGGTCCCATGGGGCCCACTGTCTG







GGTCAAGGATGGCACAGGGCTGGTGCCCTCGGAGCGT







GTCCTGGTGGGGCCCCAGCGGCTGCAGGTGCTGAATG







CCTCCCACGAGGACTCCGGGGCCTACAGCTGCCGGCA







GCGGCTCACGCAGCGCGTACTGTGCCACTTCAGTGTGC







GGGTGACAGACGCTCCATCCTCGGGAGATGACGAAGA







CGGGGAGGACGAGGCTGAGGACACAGGTGTGGACAC







AGGGGCCCCTTACTGGACACGGCCCGAGCGGATGGAC







AAGAAGCTGCTGGCCGTGCCGGCCGCCAACACCGTCC







GCTTCCGCTGCCCAGCCGCTGGCAACCCCACTCCCTCC







ATCTCCTGGCTGAAGAACGGCAGGGAGTTCCGCGGCG







AGCACCGCATTGGAGGCATCAAGCTGCGGCATCAGCA







GTGGAGCCTGGTCATGGAAAGCGTGGTGCCCTCGGAC







CGCGGCAACTACACCTGCGTCGTGGAGAACAAGTTTG







GCAGCATCCGGCAGACGTACACGCTGGACGTGCTGGA







GCGCTCCCCGCACCGGCCCATCCTGCAGGCGGGGCTG







CCGGCCAACCAGACGGCGGTGCTGGGCAGCGACGTG







GAGTTCCACTGCAAGGTGTACAGTGACGCACAGCCCC







ACATCCAGTGGCTCAAGCACGTGGAGGTGAATGGCAG







CAAGGTGGGCCCGGACGGCACACCCTACGTTACCGTG







CTCAAGTCCTGGATCAGTGAGAGTGTGGAGGCCGACG







TGCGCCTCCGCCTGGCCAATGTGTCGGAGCGGGACGG







GGGCGAGTACCTCTGTCGAGCCACCAATTTCATAGGC







GTGGCCGAGAAGGCCTTTTGGCTGAGCGTTCACGGGC







CCCGAGCAGCCGAGGAGGAGCTGGTGGAGGCTGACG







AGGCGGGCAGTGTGTATGCAGGCATCCTCAGCTACGG







GGTGGGCTTCTTCCTGTTCATCCTGGTGGTGGCGGCTG







TGACGCTCTGCCGCCTGCGCAGCCCCCCCAAGAAAGG







CCTGGGCTCCCCCACCGTGCACAAGATCTCCCGCTTCC







CGCTCAAGCGACAGGTGTCCCTGGAGTCCAACGCGTC







CATGAGCTCCAACACACCACTGGTGCGCATCGCAAGG







CTGTCCTCAGGGGAGGGCCCCACGCTGGCCAATGTCT







CCGAGCTCGAGCTGCCTGCCGACCCCAAATGGGAGCT







GTCTCGGGCCCGGCTGACCCTGGGCAAGCCCCTTGGG







GAGGGCTGCTTCGGCCAGGTGGTCATGGCGGAGGCC







ATCGGCATTGACAAGGACCGGGCCGCCAAGCCTGTCA







CCGTAGCCGTGAAGATGCTGAAAGACGATGCCACTGA







CAAGGACCTGTCGGACCTGGTGTCTGAGATGGAGATG







ATGAAGATGATCGGGAAACACAAAAACATCATCAACC







TGCTGGGCGCCTGCACGCAGGGCGGGCCCCTGTACGT







GCTGGTGGAGTACGCGGCCAAGGGTAACCTGCGGGA







GTTTCTgcgggcgcggcggcccccgggccTGGACTACTCCTTC







GACACCTGCAAGCCGCCCGAGGAGCAGCTCACCTTCA







AGGACCTGGTGTCCTGTGCCTACCAGGTGGCCCGGGG







CATGGAGTACTTGGCCTCCCAGAAGTGCATCCACAGG







GACCTGGCTGCCCGCAATGTGCTGGTGACCGAGGACA







ACGTGATGAAGATCGCAGACTTCGGGCTGGCCCGGGA







CGTGCACAACCTCGACTACTACAAGAAGACGACCAAC







GGCCGGCTGCCCGTGAAGTGGATGGCGCCTGAGGCCT







TGTTTGACCGAGTCTACACTCACCAGAGTGACGTCTGG







TCCTTTGGGGTCCTGCTCTGGGAGATCTTCACGCTGGG







GGGCTCCCCGTACCCCGGCATCCCTGTGGAGGAGCTCT







TCAAGCTGCTGAAGGAGGGCCACCGCATGGACAAGCC







CGCCAACTGCACACACGACCTGTACATGATCATGCGG







GAGTGCTGGCATGCCGCGCCCTCCCAGAGGCCCACCTT







CAAGCAGCTGGTGGAGGACCTGGACCGTGTCCTTACC







GTGACGTCCACCGAC|TTTAAGGAGTCGGCCTTGAGG







AAGCAGTCCTTATACCTCAAGTTCGACCCCCTCCTGAG







GGACAGTCCTGGTAGACCAGTGCCCGTGGCCACCGAG







ACCAGCAGCATGCACGGTGCAAATGAGACTCCCTCAG







GACGTCCGCGGGAAGCCAAGCTTGTGGAGTTCGATTT







CTTGGGAGCACTGGACATTCCTGTGCCAGGCCCACCCC







CAGGTGTTCCCGCGCCTGGGGGCCCACCCCTGTCCACC







GGACCTATAGTGGACCTGCTCCAGTACAGCCAGAAGG







ACCTGGATGCAGTGGTAAAGGCGACACAGGAGGAGA







ACCGGGAGCTGAGGAGCAGGTGTGAGGAGCTCCACG







GGAAGAACCTGGAACTGGGGAAGATCATGGACAGGT







TCGAAGAGGTTGTGTACCAGGCCATGGAGGAAGTTCA







GAAGCAGAAGGAACTTTCCAAAGCTGAAATCCAGAAA







GTTCTAAAAGAAAAAGACCAACTTACCACAGATCTGAA







CTCCATGGAGAAGTCCTTCTCCGACCTCTTCAAGCGTTT







TGAGAAACAGAAAGAGGTGATCGAGGGCTACCGCAA







GAACGAAGAGTCACTGAAGAAGTGCGTGGAGGATTA







CCTGGCAAGGATCACCCAGGAGGGCCAGAGGTACCAA







GCCCTGAAGGCCCACGCGGAGGAGAAGCTGCAGCTG







GCAAACGAGGAGATCGCCCAGGTCCGGAGCAAGGCC







CAGGCGGAAGCGTTGGCCCTCCAGGCCAGCCTGAGGA







AGGAGCAGATGCGCATCCAGTCGCTGGAGAAGACAGT







GGAGCAGAAGACTAAAGAGAACGAGGAGCTGACCAG







GATCTGCGACGACCTCATCTCCAAGATGGAGAAGATC


G17814.TCGA-06-5411-01A-01R-1849-01.4
204951148
156844363
InFrame
8283
ATGGCCAGGCAGCCACCGCCGCCCTGGGTCCATGCAG
CCTTCCTCCTCTGCCTCCTCAGTCTTGGCGGAGCCATCG
AAATTCCTATGGATCTGACGCAGCCGCCAACCATCACC
AAGCAGTCAGCGAAGGATCACATCGTGGACCCCCGTG
ATAACATCCTGATTGAGTGTGAAGCAAAAGGGAACCC




TGCCCCCAGCTTCCACTGGACACGAAACAGCAGATTCT







TCAACATCGCCAAGGACCCCCGGGTGTCCATGAGGAG







GAGGTCTGGGACCCTGGTGATTGACTTCCGCAGTGGC







GGGCGGCCGGAGGAATATGAGGGGGAATATCAGTGC







TTCGCCCGCAACAAATTTGGCACGGCCCTGTCCAATAG







GATCCGCCTGCAGGTGTCTAAATCTCCTCTGTGGCCCA







AGGAAAACCTAGACCCTGTCGTGGTCCAAGAGGGCGC







TCCTTTGACGCTCCAGTGCAACCCCCCGCCTGGACTTCC







ATCCCCGGTCATCTTCTGGATGAGCAGCTCCATGGAGC







CCATCACCCAAGACAAACGTGTCTCTCAGGGCCATAAC







GGAGACCTATACTTCTCCAACGTGATGCTGCAGGACAT







GCAGACCGACTACAGTTGTAACGCCCGCTTCCACTTCA







CCCACACCATCCAGCAGAAGAACCCTTTCACCCTCAAG







GTCCTCACCAACCACCCTTATAATGACTCGTCCTTAAGA







AACCACCCTGACATGTACAGTGCCCGAGGAGTTGCAG







AAAGAACACCAAGCTTCATGTATCCCCAGGGCACCGC







GAGCAGCCAGATGGTGCTTCGTGGCATGGACCTCCTG







CTGGAATGCATCGCCTCCGGGGTCCCAACACCAGACAT







CGCATGGTACAAGAAAGGTGGGGACCTCCCATCTGAT







AAGGCCAAGTTTGAGAACTTTAATAAGGCCCTGCGTAT







CACAAATGTCTCTGAGGAAGACTCCGGGGAGTATTTCT







GCCTGGCCTCCAACAAGATGGGCAGCATCCGGCACAC







GATCTCGGTGAGAGTAAAGGCTGCTCCCTACTGGCTG







GACGAACCCAAGAACCTTATTCTGGCTCCTGGCGAGG







ATGGGAGACTGGTGTGTCGAGCCAATGGAAACCCCAA







ACCCACTGTCCAGTGGATGGTGAATGGGGAACCTTTG







CAATCGGCACCACCTAACCCAAACCGTGAGGTGGCCG







GAGACACCATCATCTTCCGGGACACCCAGATCAGCAG







CAGGGCTGTGTACCAGTGCAACACCTCCAACGAGCAT







GGCTACCTGCTGGCCAACGCCTTTGTCAGTGTGCTGGA







TGTGCCGCCTCGGATGCTGTCGCCCCGGAACCAGCTCA







TTCGAGTGATTCTTTACAACCGGACGCGGCTGGACTGC







CCTTTCTTTGGGTCTCCCATCCCCACACTGCGATGGTTT







AAGAATGGGCAAGGAAGCAACCTGGATGGTGGCAAC







TACCATGTTTATGAGAACGGCAGTCTGGAAATTAAGAT







GATCCGCAAAGAGGACCAGGGCATCTACACCTGTGTC







GCCACCAACATCCTGGGCAAAGCTGAAAACCAAGTCC







GCCTGGAGGTCAAAGACCCCACCAGGATCTACCGGAT







GCCCGAGGACCAGGTGGCCAGAAGGGGCACCACGGT







GCAGCTGGAGTGTCGGGTGAAGCACGACCCCTCCCTG







AAACTCACCGTCTCCTGGCTGAAGGATGACGAGCCGC







TCTATATTGGAAACAGGATGAAGAAGGAAGACGACTC







CCTGACCATCTTTGGGGTGGCAGAGCGGGACCAGGGC







AGTTACACGTGTGTCGCCAGCACCGAGCTAGACCAAG







ACCTGGCCAAGGCCTACCTCACCGTGCTAGCTGATCAG







GCCACTCCAACTAACCGTTTGGCTGCCCTGCCCAAAGG







ACGGCCAGACCGGCCCCGGGACCTGGAGCTGACCGAC







CTGGCCGAGAGGAGCGTGCGGCTGACCTGGATCCCCG







GGGATGCTAACAACAGCCCCATCACAGACTACGTCGTC







CAGTTTGAAGAAGACCAGTTCCAACCTGGGGTCTGGC







ATGACCATTCCAAGTACCCCGGCAGCGTTAACTCAGCC







GTCCTCCGGCTGTCCCCGTATGTCAACTACCAGTTCCGT







GTCATTGCCATCAACGAGGTTGGGAGCAGCCACCCCA







GCCTCCCATCCGAGCGCTACCGAACCAGTGGAGCACC







CCCCGAGTCCAATCCTGGTGACGTGAAGGGAGAGGG







GACCAGAAAGAACAACATGGAGATCACGTGGACGCCC







ATGAATGCCACCTCGGCCTTTGGCCCCAACCTGCGCTA







CATTGTCAAGTGGAGGCGGAGAGAGACTCGAGAGGC







CTGGAACAACGTCACAGTGTGGGGCTCTCGCTACGTG







GTGGGGCAGACCCCAGTCTACGTGCCCTATGAGATCC







GAGTCCAGGCTGAAAATGACTTCGGGAAGGGCCCTGA







GCCAGAGTCCGTCATCGGTTACTCCGGAGAAGATT|AC







ACTAACAGCACATCTGGAGACCCGGTGGAGAAGAAG







GACGAAACACCTTTTGGGGTCTCGGTGGCTGTGGGCC







TGGCCGTCTTTGCCTGCCTCTTCCTTTCTACGCTGCTCC







TTGTGCTCAACAAATGTGGACGGAGAAACAAGTTTGG







GATCAACCGCCCGGCTGTGCTGGCTCCAGAGGATGGG







CTGGCCATGTCCCTGCATTTCATGACATTGGGTGGCAG







CTCCCTGTCCCCCACCGAGGGCAAAGGCTCTGGGCTCC







AAGGCCACATCATCGAGAACCCACAATACTTCAGTGAT







GCCTGTGTTCACCACATCAAGCGCCGGGACATCGTGCT







CAAGTGGGAGCTGGGGGAGGGCGCCTTTGGGAAGGT







CTTCCTTGCTGAGTGCCACAACCTCCTGCCTGAGCAGG







ACAAGATGCTGGTGGCTGTCAAGGCACTGAAGGAGGC







GTCCGAGAGTGCTCGGCAGGACTTCCAGCGTGAGGCT







GAGCTGCTCACCATGCTGCAGCACCAGCACATCGTGC







GCTTCTTCGGCGTCTGCACCGAGGGCCGCCCCCTGCTC







ATGGTCTTTGAGTATATGCGGCACGGGGACCTCAACC







GCTTCCTCCGATCCCATGGACCTGATGCCAAGCTGCTG







GCTGGTGGGGAGGATGTGGCTCCAGGCCCCCTGGGTC







TGGGGCAGCTGCTGGCCGTGGCTAGCCAGGTCGCTGC







GGGGATGGTGTACCTGGCGGGTCTGCATTTTGTGCAC







CGGGACCTGGCCACACGCAACTGTCTAGTGGGCCAGG







GACTGGTGGTCAAGATTGGTGATTTTGGCATGAGCAG







GGATATCTACAGCACCGACTATTACCGTGTGGGAGGC







CGCACCATGCTGCCCATTCGCTGGATGCCGCCCGAGA







GCATCCTGTACCGTAAGTTCACCACCGAGAGCGACGT







GTGGAGCTTCGGCGTGGTGCTCTGGGAGATCTTCACC







TACGGCAAGCAGCCCTGGTACCAGCTCTCCAACACGG







AGGCAATCGACTGCATCACGCAGGGACGTGAGTTGGA







GCGGCCACGTGCCTGCCCACCAGAGGTCTACGCCATC







ATGCGGGGCTGCTGGCAGCGGGAGCCCCAGCAACGC







CACAGCATCAAGGATGTGCACGCCCGGCTGCAAGCCC







TGGCCCAGGCACCTCCTGTCTACCTGGATGTCCTGGGC


G17223.TCGA-06-0750-01A-01R-1849-01.2
55268106
55863785
InFrame
8284
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG
CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT
GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA
GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA
GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


G17798.TCGA-32-5222-01A-01R-1850-01.4
55268106
55863785
InFrame
8285
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG
CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT
GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA
GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA
GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


G17195.
69764755
58339411
InFrame
8286
ATGTTCAAGAGAATGGCCGAATTTGGGCCTGACTCCG


TCGA-06-




GCGGGAGAGTAAAGGGTGTTACTATCGTTAAACCAAT


0138-




AGTTTACGGTAATGTTGCTCGGTATTTTGGAAAGAAAA


01A-02R-




GAGAAGAAGATGGGCACACTCATCAGTGGACAGTATA


1849-




TGTGAAACCATATAGAAATGAGGATATGTCAGCATAT


01.2




GTGAAGAAAATCCAGTTTAAATTACATGAAAGCTATG







GCAATCCTTTAAGAGTTGTTACTAAACCTCCATATGAA







ATTACTGAAACAGGATGGGGTGAATTCGAAATAATCA







TCAAAATATTTTTCATTGACCCTAATGAAAGACCTGTAA







CCCTGTATCATTTGCTAAAGCTGTTTCAATCAGACACCA







ATGCAATGCTGGGGAAAAAGACAGTGGTTTCAGAGTT







CTATGATGAAATGATATTTCAAGACCCAACAGCAATGA







TGCAACAATTATTGACAACATCTCGTCAGCTAACATTA







GGAGCCTATAAGCATGAAACAGAAT|ATCCATATGTCA







AACTTCTGCTTGATGCTATGAAACACTCAGGTTGTGCT







GTTAACAAAGATAGACACTTTTCTTGCGAAGACTGTAA







TGGAAATGTCAGTGGAGGTTTTGATGCTTCAACATCTC







AGATAGTTTTGTGCCAGAATAATATCCATAATCAGGCC







CATATGAACAGAGTGGTCACACACGAGCTTATTCATGC







ATTTGATCATTGTCGTGCCCATGTCGACTGGTTCACCA







ACATCAGACATTTGGCGTGCTCAGAGGTTCGAGCTGCT







AACCTTAGTGGAGACTGCTCACTTGTCAATGAAATATT







CAGGTTACATTTTGGATTAAAACAACACCACCAGACTT







GTGTGCGAGACAGAGCCACTCTTTCTATCCTGGCTGTT







AGGAATATCAGCAAAGAAGTAGCTAAAAAGGCTGTTG







ATGAAGTTTTTGAATCTTGTTTCAATGACCATGAACCTT







TTGGAAGGATCCCACATAACAAGACTTATGCAAGATAT







GCTCACAGAGACTTTGAAAACCGTGATCGGTATTATTC







AAATATA


G17803.
1808661
1739325
InFrame
8287
ATGGGCGCCCCTGCCTGCGCCCTCGCGCTCTGCGTGGC


8287




CGTGGCCATCGTGGCCGGCGCCTCCTCGGAGTCCTTG


TCGA-76-




GGGACGGAGCAGCGCGTCGTGGGGCGAGCGGCAGAA


4925-




GTCCCGGGCCCAGAGCCCGGCCAGCAGGAGCAGTTG


01A-01R-




GTCTTCGGCAGCGGGGATGCTGTGGAGCTGAGCTGTC


1850-




CCCCGCCCGGGGGTGGTCCCATGGGGCCCACTGTCTG


01.4




GGTCAAGGATGGCACAGGGCTGGTGCCCTCGGAGCGT







GTCCTGGTGGGGCCCCAGCGGCTGCAGGTGCTGAATG







CCTCCCACGAGGACTCCGGGGCCTACAGCTGCCGGCA







GCGGCTCACGCAGCGCGTACTGTGCCACTTCAGTGTGC







GGGTGACAGACGCTCCATCCTCGGGAGATGACGAAGA







CGGGGAGGACGAGGCTGAGGACACAGGTGTGGACAC







AGGGGCCCCTTACTGGACACGGCCCGAGCGGATGGAC







AAGAAGCTGCTGGCCGTGCCGGCCGCCAACACCGTCC







GCTTCCGCTGCCCAGCCGCTGGCAACCCCACTCCCTCC







ATCTCCTGGCTGAAGAACGGCAGGGAGTTCCGCGGCG







AGCACCGCATTGGAGGCATCAAGCTGCGGCATCAGCA







GTGGAGCCTGGTCATGGAAAGCGTGGTGCCCTCGGAC







CGCGGCAACTACACCTGCGTCGTGGAGAACAAGTTTG







GCAGCATCCGGCAGACGTACACGCTGGACGTGCTGGA







GCGCTCCCCGCACCGGCCCATCCTGCAGGCGGGGCTG







CCGGCCAACCAGACGGCGGTGCTGGGCAGCGACGTG







GAGTTCCACTGCAAGGTGTACAGTGACGCACAGCCCC







ACATCCAGTGGCTCAAGCACGTGGAGGTGAATGGCAG







CAAGGTGGGCCCGGACGGCACACCCTACGTTACCGTG







CTCAAGTCCTGGATCAGTGAGAGTGTGGAGGCCGACG







TGCGCCTCCGCCTGGCCAATGTGTCGGAGCGGGACGG







GGGCGAGTACCTCTGTCGAGCCACCAATTTCATAGGC







GTGGCCGAGAAGGCCTTTTGGCTGAGCGTTCACGGGC







CCCGAGCAGCCGAGGAGGAGCTGGTGGAGGCTGACG







AGGCGGGCAGTGTGTATGCAGGCATCCTCAGCTACGG







GGTGGGCTTCTTCCTGTTCATCCTGGTGGTGGCGGCTG







TGACGCTCTGCCGCCTGCGCAGCCCCCCCAAGAAAGG







CCTGGGCTCCCCCACCGTGCACAAGATCTCCCGCTTCC







CGCTCAAGCGACAGGTGTCCCTGGAGTCCAACGCGTC







CATGAGCTCCAACACACCACTGGTGCGCATCGCAAGG







CTGTCCTCAGGGGAGGGCCCCACGCTGGCCAATGTCT







CCGAGCTCGAGCTGCCTGCCGACCCCAAATGGGAGCT







GTCTCGGGCCCGGCTGACCCTGGGCAAGCCCCTTGGG







GAGGGCTGCTTCGGCCAGGTGGTCATGGCGGAGGCC







ATCGGCATTGACAAGGACCGGGCCGCCAAGCCTGTCA







CCGTAGCCGTGAAGATGCTGAAAGACGATGCCACTGA







CAAGGACCTGTCGGACCTGGTGTCTGAGATGGAGATG







ATGAAGATGATCGGGAAACACAAAAACATCATCAACC







TGCTGGGCGCCTGCACGCAGGGCGGGCCCCTGTACGT







GCTGGTGGAGTACGCGGCCAAGGGTAACCTGCGGGA







GTTTCTgcgggcgcggcggcccccgggccTGGACTACTCCTTC







GACACCTGCAAGCCGCCCGAGGAGCAGCTCACCTTCA







AGGACCTGGTGTCCTGTGCCTACCAGGTGGCCCGGGG







CATGGAGTACTTGGCCTCCCAGAAGTGCATCCACAGG







GACCTGGCTGCCCGCAATGTGCTGGTGACCGAGGACA







ACGTGATGAAGATCGCAGACTTCGGGCTGGCCCGGGA







CGTGCACAACCTCGACTACTACAAGAAGACGACCAAC







GGCCGGCTGCCCGTGAAGTGGATGGCGCCTGAGGCCT







TGTTTGACCGAGTCTACACTCACCAGAGTGACGTCTGG







TCCTTTGGGGTCCTGCTCTGGGAGATCTTCACGCTGGG







GGGCTCCCCGTACCCCGGCATCCCTGTGGAGGAGCTCT







TCAAGCTGCTGAAGGAGGGCCACCGCATGGACAAGCC







CGCCAACTGCACACACGACCTGTACATGATCATGCGG







GAGTGCTGGCATGCCGCGCCCTCCCAGAGGCCCACCTT







CAAGCAGCTGGTGGAGGACCTGGACCGTGTCCTTACC







GTGACGTCCACCGAC|GTGCCAGGCCCACCCCCAGGT







GTTCCCGCGCCTGGGGGCCCACCCCTGTCCACCGGACC







TATAGTGGACCTGCTCCAGTACAGCCAGAAGGACCTG







GATGCAGTGGTAAAGGCGACACAGGAGGAGAACCGG







GAGCTGAGGAGCAGGTGTGAGGAGCTCCACGGGAAG







AACCTGGAACTGGGGAAGATCATGGACAGGTTCGAAG







AGGTTGTGTACCAGGCCATGGAGGAAGTTCAGAAGCA







GAAGGAACTTTCCAAAGCTGAAATCCAGAAAGTTCTA







AAAGAAAAAGACCAACTTACCACAGATCTGAACTCCAT







GGAGAAGTCCTTCTCCGACCTCTTCAAGCGTTTTGAGA







AACAGAAAGAGGTGATCGAGGGCTACCGCAAGAACG







AAGAGTCACTGAAGAAGTGCGTGGAGGATTACCTGGC







AAGGATCACCCAGGAGGGCCAGAGGTACCAAGCCCTG







AAGGCCCACGCGGAGGAGAAGCTGCAGCTGGCAAAC







GAGGAGATCGCCCAGGTCCGGAGCAAGGCCCAGGCG







GAAGCGTTGGCCCTCCAGGCCAGCCTGAGGAAGGAGC







AGATGCGCATCCAGTCGCTGGAGAAGACAGTGGAGCA







GAAGACTAAAGAGAACGAGGAGCTGACCAGGATCTG







CGACGACCTCATCTCCAAGATGGAGAAGATC


NYU_B
7388176
7604059
InFrame
8288
ATGCACGGGGGTGGCCCCCCCTCGGGGGACAGCGCAT







GCCCGCTGCGCACCATCAAGAGAGTCCAGTTCGGAGT







CCTGAGTCCGGATGAACTG|GTCCCTGTCCTTCGAATG







GTGGAAGGTGATACCATCTATGATTACTGCTGGTATTC







TCTGATGTCCTCAGCCCAGCCAGACACCTCCTACGTGG







CCAGCAGCAGCCGGGAGAACCCGATTCATATCTGGGA







CGCATTCACTGGAGAGCTCCGGGCTTCCTTTCGCGCCT







ACAACCACCTGGATGAGCTGACGGCAGCCCATTCGCTC







TGCTTCTCCCCGGATGGCTCCCAGCTCTTCTGTGGCTTC







AACCGGACTGTGCGTGTTTTTTCCACGGCCCGGCCTGG







CCGAGACTGCGAGGTCCGAGCCACATTTGCAAAAAAG







CAGGGCCAGAGCGGCATCATCTCCTGCATAGCCTTCAG







CCCAGCCCAGCCCCTCTATGCCTGTGGCTCCTACGGCC







GCTCCCTGGGTCTGTATGCCTGGGATGATGGCTCCCCT







CTCGCCTTGCTGGGAGGGCACCAAGGGGGCATCACCC







ACCTCTGCTTTCATCCCGATGGCAACCGCTTCTTCTCAG







GAGCCCGCAAGGATGCTGAGCTCCTGTGCTGGGATCT







CCGGCAGTCTGGTTACCCACTGTGGTCCCTGGGTCGAG







AGGTGACCACCAATCAGCGCATCTACTTCGATCTGGAC







CCGACCGGGCAGTTCCTAGTGAGTGGCAGCACGAGCG







GGGCTGTCTCTGTGTGGGACACGGACGGGCCTGGCAA







TGATGGGAAGCCGGAGCCCGTGTTGAGTTTTCTGCCCC







AGAAGGACTGCACCAATGGCGTGAGCCTGCACCCTAG







CCTGCCTCTCCTGGCCACTGCCTCCGGTCAGCGTGTGT







TTCCTGAGCCCACAGAGAGTGGGGACGAAGGAGAGG







AGCTGGGCCTTCCCTTGCTCTCCACGCGCCACGTCCAC







CTTGAATGTCGGCTTCAGCTCTGGTGGTGTGGGGGGG







CGCCAGACTCCAGCATCCCTGATGATCACCAGGGCGA







GAAAGGGCAGGGAGGAACGGAGGGAGGTGTGGGTG







AGCTGATA


G17507.
55268106
55863785
InFrame
8289
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG


TCGA-28-




CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT


1747-




GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA


01C-01R-




GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA


1850-




GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT


01.2




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


G17469.
73604248
74173110
InFrame
8290
ATGGCGGACTTCGACACCTACGACGATCGGGCCTACA


TCGA-06-




GCAGCTTCGGCGGCGGCAGAGGGTCCCGCGGCAGTG


2557-




CTGGTGGCCATGGTTCCCGTAGCCAGAAGGAGTTGCC


01A-01R-




CACAGAGCCCCCCTACACAGCATACGTAGGAAATCTAC


1849-




CTTTCAATACGGTTCAGGGCGACATAGATGCTATCTTT


01.2




AAGGATCTCAGCATAAGGAGTGTACGGCTAGTCAGAG







ACAAAGACACAGATAAATTTAAAGGATTCTGCTATGTA







GAATTCGATGAAGTGGATTCCCTTAAGGAAGCCTTGAC







ATACGATGGTGCACTGTTGGGCGATCGGTCACTTCGTG







TGGACATTGCAGAAGGCAGAAAACAAGATAAAGGTG







GCTTTGGATTCAGAAAAGGTGGACCAGATGACAGAG|







AAATAAAAGAGACTGATGGAAGCTCTCAGATCAAGCA







AGAACCAGACCCCACGTGG


G17785.
32903023
3440021
InFrame
8291
ATGTTAACCATGAGCGTGACACTTTCCCCCCTGAGGTC


TCGA-06-




ACAGGACCTGGATCCCATGGCTACTGATGCTTCACCCA


5413-




TGGCCATCAACATGACACCCACTGTGGAGCAGGGTGA


01A-01R-




GGGAGAAGAGGCAATGAAGGACATGGACTCTGACCA


1849-




GCAGTATGAAAAGCCACCCCCACTACACACAGGGGCT


01.4




GACTGGAAGATTGTCCTCCACTTACCTGAAATTGAGAC







CTGGCTCCGGATGACCTCAGAGAGGGTCCGAGACCTA







ACCTATTCAGTCCAGCAGGATTCGGACAGCAAGCATGT







GGATGTACATCTAGTTCAACTAAAG|GCAATGGTGGCT







TGCTATCCGGGAAATGGAACAGGTTATGTTCGCCACGT







GGACAACCCCAACGGTGATGGTCGCTGCATCACCTGC







ATCTACTATCTGAACAAGAATTGGGATGCCAAGCTACA







TGGTGGGATCCTGCGGATATTTCCAGAGGGGAAATCA







TTCATAGCAGATGTGGAGCCCATTTTTGACAGACTCCT







GTTCTTCTGGTCAGATCGTAGGAACCCACACGAAGTGC







AGCCCTCTTACGCAACCAGATATGCTATGACTGTCTGG







TACTTTGATGCTGAAGAAAGGGCAGAAGCCAAAAAGA







AATTCAGGAATTTAACTAGGAAAACTGAATCTGCCCTC







ACTGAAGAC


G17467.
102606509
12856191
InFrame
8292
ATGGCGACGGAGGGAGGAGGGAAGGAGATGAACGA


TCGA-14-




GATTAAGACCCAATTCACCACCCGGGAAGGTCTGTACA


0736-




AGCTGCTGCCGCACTCGGAGTACAGCCGGCCCAACCG


02A-01R-




GGTGCCCTTCAACTCGCAGGGATCCAACCCTGTCCGCG


2005-




TCTCCTTCGTAAACCTCAACGACCAGTCTGGCAACGGC


01.2




GACCGCCTCTGCTTCAATGTGGGCCGGGAGCTGTACTT







CTATATCTACAAGGGGGTCCGCAAG|GAGATTGACCCC







AGCCTGGGCGTGGCGGAGCTGCCTGACGAGTTCTTCG







AGGAGGACAACATGCTGAGCATGGGCAAGAAGATGA







TGCAGGAGGCCATGAGCGCATTTCCCGGCATCGATGA







GGCCATGAGCTATGCCGAGGTCATGAGGCTGGTGAAG







GGCATGAACTTCTCGGTGGTGGTATTTGACACGGCACC







CACGGGCCACACCCTGAGGCTGCTCAACTTCCCCACCA







TCGTGGAGCGGGGCCTGGGCCGGCTTATGCAGATCAA







GAACCAGATCAGCCCTTTCATCTCACAGATGTGCAACA







TGCTGGGCCTGGGGGACATGAACGCAGACCAGCTGGC







CTCCAAGCTGGAGGAGACGCTGCCCGTCATCCGCTCA







GTCAGCGAACAGTTCAAGGACCCTGAGCAGACAACTT







TCATCTGCGTATGCATTGCTGAGTTCCTGTCCCTGTATG







AGACAGAGAGGCTGATCCAGGAGCTGGCCAAGTGCA







AGATTGACACACACAATATAATTGTCAACCAGCTCGTC







TTCCCCGACCCCGAGAAGCCCTGCAAGATGTGTGAGG







CCCGTCACAAGATCCAGGCCAAGTATCTGGACCAGAT







GGAGGACCTGTATGAAGACTTCCACATCGTGAAGCTG







CCGCTGTTACCCCATGAGGTGCGGGGGGCAGACAAGG







TCAACACCTTCTCGGCCCTCCTCCTGGAGCCCTACAAG







CCCCCCAGTGCCCAG


GBM-
22160139
58166800
InFrame
8293
atggcggcggcggcggcggcggGCGCGGGCCCGGAGATGGT


CUMC3316_L1




CCGCGGGCAGGTGTTCGACGTGGGGCCGCGCTACACC







AACCTCTCGTACATCGGCGAGGGCGCCTACGGCATGG







TGTGCTCTGCTTATGATAATGTCAACAAAGTTCGAGTA







GCTATCAAGAAAATCAGCCCCTTTGAGCACCAGACCTA







CTGCCAGAGAACCCTGAGGGAGATAAAAATCTTACTG







CGCTTCAGACATGAGAACATCATTGGAATCAATGACAT







TATTCGAGCACCAACCATCGAGCAAATGAAAGATGTAT







ATATAGTACAGGACCTCATGGAAACAGATCTTTACAAG







CTCTTGAAGACACAACACCTCAGCAATGACCATATCTG







CTATTTTCTCTACCAGATCCTCAGAGGGTTAAAATATAT







CCATTCAGCTAACGTTCTGCACCGTGACCTCAAGCCTTC







CAACCTGCTGCTCAACACCACCTGTGATCTCAAG|GCC







CTGAGCCTGTGCAATTATTTCGAGAGTCAAAATGTGGA







TTTCCGAGGCAAGAAGGTGATCGAACTGGGTGCGGG







GACAGGCATCGTGGGGATCTTGGCAGCGCTGCAGGG







GGGGGATGTTACCATCACTGACCTGCCCCTGGCCCTAG







AACAGATCCAGGGCAACGTCCAGGCCAATGTGCCAGC







TGGAGGCCAGGCCCAGGTGCGTGCCTTGTCCTGGGGG







ATTGACCATCATGTCTTCCCTGCAAACTATGACCTGGT







GCTGGGGGCTGATATCGTGTACCTGGAACCCACCTTCC







CTCTGCTGCTGGGGACCCTCCAACACCTGTGCAGGCCC







CATGGCACCATCTATCTGGCCTCCAAGATGAGAAAGG







AGCATGGGACAGAGAGCTTCTTTCAGCACCTCCTGCCC







CAGCATTTCCAACTGGAGCTGGCTCAGCGGGATGAGG







ATGAAAATGTCAACATCTATAGGGCCAGGCACAGGGA







ACCAAGACCTGCT


G17663.
156628525
156844698
InFrame
8294
ATGGCCCAGCTGTTCCTGCCCCTGCTGGCAGCCCTGGT


TCGA-19-




CCTGGCCCAGGCTCCTGCAGCTTTAGCAGATGTTCTGG


2619-




AAGGAGACAGCTCAGAGGACCGCGCTTTTCGCGTGCG


01A-01R-




CATCGCGGGCGACGCGCCACTGCAGGGCGTGCTCGGC


1850-




GGCGCCCTCACCATCCCTTGCCACGTCCACTACCTGCG


01.2




GCCACCGCCGAGCCGCCGGGCTGTGCTGGGCTCTCCG







CGGGTCAAGTGGACTTTCCTGTCCCGGGGCCGGGAGG







CAGAGGTGCTGGTGGCGCGGGGAGTGCGCGTCAAGG







TGAACGAGGCCTACCGGTTCCGCGTGGCACTGCCTGC







GTACCCAGCGTCGCTCACCGACGTCTCCCTGGCGCTGA







GCGAGCTGCGCCCCAACGACTCAGGTATCTATCGCTGT







GAGGTCCAGCACGGCATCGATGACAGCAGCGACGCTG







TGGAGGTCAAGGTCAAAGGGGTCGTCTTTCTCTACCG







AGAGGGCTCTGCCCGCTATGCTTTCTCCTTTTCTGGGG







CCCAGGAGGCCTGTGCCCGCATTGGAGCCCACATCGC







CACCCCGGAGCAGCTCTATGCCGCCTACCTTGGGGGCT







ATGAGCAATGTGATGCTGGCTGGCTGTCGGATCAGAC







CGTGAGGTATCCCATCCAGACCCCACGAGAGGCCTGTT







ACGGAGACATGGATGGCTTCCCCGGGGTCCGGAACTA







TGGTGTGGTGGACCCGGATGACCTCTATGATGTGTACT







GTTATGCTGAAGACCTAAATGGAGAACTGTTCCTGGGT







GACCCTCCAGAGAAGCTGACATTGGAGGAAGCACGG







GCGTACTGCCAGGAGCGGGGTGCAGAGATTGCCACCA







CGGGCCAACTGTATGCAGCCTGGGATGGTGGCCTGGA







CCACTGCAGCCCAGGGTGGCTAGCTGATGGCAGTGTG







CGCTACCCCATCGTCACACCCAGCCAGCGCTGTGGTGG







GGGCTTGCCTGGTGTCAAGACTCTCTTCCTCTTCCCCAA







CCAGACTGGCTTCCCCAATAAGCACAGCCGCTTCAACG







TCTACTGCTTCCGAGACTCGGCCCAGCCTTCTGCCATCC







CTGAGgcctccaacccagcctccaacccagcctcTGATGGACTA







GAGGCTATCGTCACAGTGACAGAGACCCTGGAGGAAC







TGCAGCTGCCTCAGGAAGCCACAGAGAGTGAATCCCG







TGGGGCCATCTACTCCATCCCCATCATGGAGGACGGA







GGAGGTGGAAGCTCCACTCCAGAAGACCCAGCAGAG







GCCCCTAGGACGCTCCTAGAATTTGAAACACAATCCAT







GGTACCGCCCACGGGGTTCTCAGAAGAGGAAGGTAA







GGCATTggaggaagaagagaaatatgaagatgaagaagagaaag







aggaggaagaagaagaggaggaggtggaggatgaggCTCTGTGG







GCATGGCCCAGCGAGCTCAGCAGCCCGGGCCCTGAGG







CCTCTCTCCCCACTGAGCCAGCAGCCCAGGAGGAGTCA







CTCTCCCAGGCGCCAGCAAGGGCAGTCCTGCAGCCTG







GTGCATCACCACTTCCTGATGGAGAGTCAGAAGCTTCC







AGGCCTCCAAGGGTCCATGGACCACCTACTGAGACTCT







GCCCACTCCCAGGGAGAGGAACCTAGCATCCCCATCA







CCTTCCACTCTGGTTGAGGCAAGAGAGGTGGGGGAGG







CAACTGGTGGTCCTGAGCTATCTGGGGTCCCTCGAGG







AGAGAGCGAGGAGACAGGAAGCTCCGAGGGTGCCCC







TTCCCTGCTTCCAGCCACACGGGCCCCTGAGGGTACCA







GGGAGCTGGAGGCCCCCTCTGAAGATAATTCTGGAAG







AACTGCCCCAGCAGGGACCTCAGTGCAGGCCCAGCCA







GTGCTGCCCACTGACAGCGCCAGCCGAGGTGGAGTGG







CCGTGGTCCCCGCATCAGGTGACTGTGTCCCCAGCCCC







TGCCACAATGGTGGGACATGCTTGGAGGAGGAGGAA







GGGGTCCGCTGCCTATGTCTGCCTGGCTATGGGGGGG







ACCTGTGCGATGTTGGCCTCCGCTTCTGCAACCCCGGC







TGGGACGCCTTCCAGGGCGCCTGCTACAAGCACTTTTC







CACACGAAGGAGCTGGGAGGAGGCAGAGACCCAGTG







CCGGATGTACGGCGCGCATCTGGCCAGCATCAGCACA







CCCGAGGAACAGGACTTCATCAACAACCGGTACCGGG







AGTACCAGTGGATCGGACTCAACGACAGGACCATCGA







AGGCGACTTCTTGTGGTCGGATGGCGTCCCCCTGCTCT







ATGAGAACTGGAACCCTGGGCAGCCTGACAGCTACTT







CCTGTCTGGAGAGAACTGCGTGGTCATGGTGTGGCAT







GATCAGGGACAATGGAGTGACGTGCCCTGCAACTACC







ACCTGTCCTACACCTGCAAGATGGGGCTGGTGTCCTGT







GGGCCGCCACCGGAGCTGCCCCTGGCTCAAGTGTTCG







GCCGCCCACGGCTGCGCTATGAGGTGGACACTGTGCT







TCGCTACCGGTGCCGGGAAGGACTGGCCCAGCGCAAT







CTGCCGCTGATCCGATGCCAAGAGAACGGTCGTTGGG







AGGCCCCCCAGATCTCCTGTGTGCCCAGAAGACCT|GT







CTCGGTGGCTGTGGGCCTGGCCGTCTTTGCCTGCCTCT







TCCTTTCTACGCTGCTCCTTGTGCTCAACAAATGTGGAC







GGAGAAACAAGTTTGGGATCAACCGCCCGGCTGTGCT







GGCTCCAGAGGATGGGCTGGCCATGTCCCTGCATTTCA







TGACATTGGGTGGCAGCTCCCTGTCCCCCACCGAGGG







CAAAGGCTCTGGGCTCCAAGGCCACATCATCGAGAAC







CCACAATACTTCAGTGATGCCTGTGTTCACCACATCAA







GCGCCGGGACATCGTGCTCAAGTGGGAGCTGGGGGA







GGGCGCCTTTGGGAAGGTCTTCCTTGCTGAGTGCCACA







ACCTCCTGCCTGAGCAGGACAAGATGCTGGTGGCTGT







CAAGGCACTGAAGGAGGCGTCCGAGAGTGCTCGGCA







GGACTTCCAGCGTGAGGCTGAGCTGCTCACCATGCTG







CAGCACCAGCACATCGTGCGCTTCTTCGGCGTCTGCAC







CGAGGGCCGCCCCCTGCTCATGGTCTTTGAGTATATGC







GGCACGGGGACCTCAACCGCTTCCTCCGATCCCATGGA







CCTGATGCCAAGCTGCTGGCTGGTGGGGAGGATGTGG







CTCCAGGCCCCCTGGGTCTGGGGCAGCTGCTGGCCGT







GGCTAGCCAGGTCGCTGCGGGGATGGTGTACCTGGCG







GGTCTGCATTTTGTGCACCGGGACCTGGCCACACGCAA







CTGTCTAGTGGGCCAGGGACTGGTGGTCAAGATTGGT







GATTTTGGCATGAGCAGGGATATCTACAGCACCGACT







ATTACCGTGTGGGAGGCCGCACCATGCTGCCCATTCGC







TGGATGCCGCCCGAGAGCATCCTGTACCGTAAGTTCAC







CACCGAGAGCGACGTGTGGAGCTTCGGCGTGGTGCTC







TGGGAGATCTTCACCTACGGCAAGCAGCCCTGGTACC







AGCTCTCCAACACGGAGGCAATCGACTGCATCACGCA







GGGACGTGAGTTGGAGCGGCCACGTGCCTGCCCACCA







GAGGTCTACGCCATCATGCGGGGCTGCTGGCAGCGGG







AGCCCCAGCAACGCCACAGCATCAAGGATGTGCACGC







CCGGCTGCAAGCCCTGGCCCAGGCACCTCCTGTCTACC







TGGATGTCCTGGGC


G17203.
54825188
55224226
InFrame
8295
ATGGATCAGGTAATGCAGTTTGTTGAGCCAAGTCGGC


TCGA-06-




AGTTTGTAAAGGACTCCATTCGGCTGGTTAAAAGATGC


0211-




ACTAAACCTGATAGAAAAG|TGTGTAACGGAATAGGT


02A-02R-




ATTGGTGAATTTAAAGACTCACTCTCCATAAATGCTAC


2005-




GAATATTAAACACTTCAAAAACTGCACCTCCATCAGTG


01.2




GCGATCTCCACATCCTGCCGGTGGCATTTAGGGGTGAC







TCCTTCACACATACTCCTCCTCTGGATCCACAGGAACTG







GATATTCTGAAAACCGTAAAGGAAATCACAGGGTTTTT







GCTGATTCAGGCTTGGCCTGAAAACAGGACGGACCTC







CATGCCTTTGAGAACCTAGAAATCATACGCGGCAGGA







CCAAGCAACATGGTCAGTTTTCTCTTGCAGTCGTCAGC







CTGAACATAACATCCTTGGGATTACGCTCCCTCAAGGA







GATAAGTGATGGAGATGTGATAATTTCAGGAAACAAA







AATTTGTGCTATGCAAATACAATAAACTGGAAAAAACT







GTTTGGGACCTCCGGTCAGAAAACCAAAATTATAAGC







AACAGAGGTGAAAACAGCTGCAAGGCCACAGGCCAG







GTCTGCCATGCCTTGTGCTCCCCCGAGGGCTGCTGGG







GCCCGGAGCCCAGGGACTGCGTCTCTTGCCGGAATGT







CAGCCGAGGCAGGGAATGCGTGGACAAGTGCAACCTT







CTGGAGGGTGAGCCAAGGGAGTTTGTGGAGAACTCTG







AGTGCATACAGTGCCACCCAGAGTGCCTGCCTCAGGC







CATGAACATCACCTGCACAGGACGGGGACCAGACAAC







TGTATCCAGTGTGCCCACTACATTGACGGCCCCCACTG







CGTCAAGACCTGCCCGGCAGGAGTCATGGGAGAAAAC







AACACCCTGGTCTGGAAGTACGCAGACGCCGGCCATG







TGTGCCACCTGTGCCATCCAAACTGCACCTACGGATGC







ACTGGGCCAGGTCTTGAAGGCTGTCCAACGAATGGGC







CTAAGATCCCGTCCATCGCCACTGGGATGGTGGGGGC







CCTCCTCTTGCTGCTGGTGGTGGCCCTGGGGATCGGCC







TCTTCATGCGAAGGCGCCACATCGTTCGGAAGCGCAC







GCTGCGGAGGCTGCTGCAGGAGAGGGAGCTTGTGGA







GCCTCTTACACCCAGTGGAGAAGCTCCCAACCAAGCTC







TCTTGAGGATCTTGAAGGAAACTGAATTCAAAAAGATC







AAAGTGCTGGGCTCCGGTGCGTTCGGCACGGTGTATA







AGGGACTCTGGATCCCAGAAGGTGAGAAAGTTAAAAT







TCCCGTCGCTATCAAGGAATTAAGAGAAGCAACATCTC







CGAAAGCCAACAAGGAAATCCTCGATGAAGCCTACGT







GATGGCCAGCGTGGACAACCCCCACGTGTGCCGCCTG







CTGGGCATCTGCCTCACCTCCACCGTGCAGCTCATCAC







GCAGCTCATGCCCTTCGGCTGCCTCCTGGACTATGTCC







GGGAACACAAAGACAATATTGGCTCCCAGTACCTGCTC







AACTGGTGTGTGCAGATCGCAAAGGGCATGAACTACT







TGGAGGACCGTCGCTTGGTGCACCGCGACCTGGCAGC







CAGGAACGTACTGGTGAAAACACCGCAGCATGTCAAG







ATCACAGATTTTGGGCTGGCCAAACTGCTGGGTGCGG







AAGAGAAAGAATACCATGCAGAAGGAGGCAAAGTGC







CTATCAAGTGGATGGCATTGGAATCAATTTTACACAGA







ATCTATACCCACCAGAGTGATGTCTGGAGCTACGGGG







TGACTGTTTGGGAGTTGATGACCTTTGGATCCAAGCCA







TATGACGGAATCCCTGCCAGCGAGATCTCCTCCATCCT







GGAGAAAGGAGAACGCCTCCCTCAGCCACCCATATGT







ACCATCGATGTCTACATGATCATGGTCAAGTGCTGGAT







GATAGACGCAGATAGTCGCCCAAAGTTCCGTGAGTTG







ATCATCGAATTCTCCAAAATGGCCCGAGACCCCCAGCG







CTACCTTGTCATTCAGGGGGATGAAAGAATGCATTTGC







CAAGTCCTACAGACTCCAACTTCTACCGTGCCCTGATG







GATGAAGAAGACATGGACGACGTGGTGGATGCCGAC







GAGTACCTCATCCCACAGCAGGGCTTCTTCAGCAGCCC







CTCCACGTCACGGACTCCCCTCCTGAGCTCTCTGAGTG







CAACCAGCAACAATTCCACCGTGGCTTGCATTGATAGA







AATGGGCTGCAAAGCTGTCCCATCAAGGAAGACAGCT







TCTTGCAGCGATACAGCTCAGACCCCACAGGCGCCTTG







ACTGAGGACAGCATAGACGACACCTTCCTCCCAGTGCC







TGaATACATAAACCAGTCCGTTCCCAAAAGGCCCGCTG







GCTCTGTGCAGAATCCTGTCTATCACAATCAGCCTCTG







AACCCCGCGCCCAGCAGAGACCCACACTACCAGGACC







CCCACAGCACTGCAGTGGGCAACCCCGAGTATCTCAAC







ACTGTCCAGCCCACCTGTGTCAACAGCACATTCGACAG







CCCTGCCCACTGGGCCCAGAAAGGCAGCCACCAAATT







AGCCTGGACAACCCTGACTACCAGCAGGACTTCTTTCC







CAAGGAAGCCAAGCCAAATGGCATCTTTAAGGGCTCC







ACAGCTGAAAATGCAGAATACCTAAGGGTCGCGCCAC







AAAGCAGTGAATTTATTGGAGCA


G17784.
9675963
9731644
InFrame
8296
ATGAGGCAGTCTCTCCTATTCCTGACCAGCGTGGTTCC


TCGA-76-




TTTCGTGCTGGCGCCGCGACCTCCGGATGACCCGGGC


4929-




TTCGGCCCCCACCAGAGACTCGAGAAGCTTGATTCTTT


01A-01R-




GCTCTCAGACTACGATATTCTCTCTTTATCTAATATCCA


1850-




GCAGCATTCGGTAAGAAAAAGAGATCTACAGACTTCA


01.4




ACACATGTAGAAACACTACTAACTTTTTCAGCTTTGAA







AAGGCATTTTAAATTATACCTGACATCAAGTACTGAAC







GTTTTTCACAAAATTTCAAGGTCGTGGTGGTGGATGGT







AAAAACGAAAGCGAGTACACTGTAAAATGGCAGGACT







TCTTCACTGGACACGTGGTTGGTGAGCCTGACTCTAGG







GTTCTAGCCCACATAAGAGATGATGATGTTATAATCAG







AATCAACACAGATGGGGCCGAATATAACATAGAG|GA







ATTGTTGGATAAATATTTAATAGCCAATGCAACTAATC







CAGAGAGTAAGGTCTTCTATCTGAAAATGAAGGGTGA







TTACTTCCGGTACCTTGCTGAAGTTGCGTGTGGTGATG







ATCGAAAACAAACGATAGATAATTCCCAAGGAGCTTAC







CAAGAGGCATTTGATATAAGCAAGAAAGAGATGCAAC







CCACACACCCAATCCGCCTGGGGCTTGCTCTTAACTTTT







CTGTATTTTACTATGAGATTCTTAATAACCCAGAGCTTG







CCTGCACGCTGGCTAAAACGGCTTTTGATGAGGCCATT







GCTGAACTTGATACACTGAATGAAGACTCATACAAAG







ACAGCACCCTCATCATGCAGTTGCTTAGAGACAACCTA







ACACTTTGGACATCAGACAGTGCAGGAGAAGAATGTG







ATGCGGCAGAAGGGGCTGAAAAC


G17675.
55240817
65110599
InFrame
8297
ATGGAGGAAGACGGCGTCCGCAAGTGTAAGAAGTGC


TCGA-19-




GAAGGGCCTTGCCGCAAAGTGTGTAACGGAATAGGTA


2624-




TTGGTGAATTTAAAGACTCACTCTCCATAAATGCTACG


01A-01R-




AATATTAAACACTTCAAAAACTGCACCTCCATCAGTGG


1850-




CGATCTCCACATCCTGCCGGTGGCATTTAGGGGTGACT


01.2




CCTTCACACATACTCCTCCTCTGGATCCACAGGAACTG







GATATTCTGAAAACCGTAAAGGAAATCACAGGGTTTTT







GCTGATTCAGGCTTGGCCTGAAAACAGGACGGACCTC







CATGCCTTTGAGAACCTAGAAATCATACGCGGCAGGA







CCAAGCAACAGGACCAGACAACTGTATCCAGTGTGCC







CACTACATTGACGGCCCCCACTGCGTCAAGACCTGCCC







GGCAGGAGTCATGGGAGAAAACAACACCCTGGTCTGG







AAGTACGCAGACGCCGGCCATGTGTGCCACCTGTGCC







ATCCAAACTGCACCTACGGATGCACTGGGCCAGGTCTT







GAAGGCTGTCCAACGAATGGGCCTAAGATCCCGTCCA







TCGCCACTGGGATGGTGGGGGCCCTCCTCTTGCTGCTG







GTGGTGGCCCTGGGGATCGGCCTCTTCATGCGAAGGC







GCCACATCGTTCGGAAGCGCACGCTGCGGAGGCTGCT







GCAGGAGAGGGAG|ATACAGGTTTGACCCCCGTCTCA







TGTTCAGCAATCGCGGCAGTGTCAGGACTCGAAGATTT







TCCAAACATCTTCTG


G17796.
58166911
58163739
InFrame
8298
ATGGCGGACCCCGGCCCAGATCCCGAATCTGAGTCGG


TCGA-41-




AATCGGTGTTCCCGCGGGAGGTCGGGCTCTTTGCAGA


5651-




CTCTTACTCGGAGAAGAGCCAGTTCTGTTTCTGTGGGC


01A-01R-




ATGTGCTGACCATCACGCAGAACTTTGGGTCCCGCCTC


1850-




GGGGTGGCGGCGCGCGTGTGGGACGCGGCCCTGAGC


01.4




CTGTGCAATTATTTCGAGAGTCAAAATGTGGATTTCCG







AGGCAAGAAGGTGATCGAACTGGGTGCGGGGACAGG







CATCGTGGGGATCTTGGCAGCGCTGCAGG|TGGAACT







GTCACCGCTGTTCCCAGACACACTTATTCTGGGTCTGG







AGATCCGGGTGAAGGTCTCAGACTATGTACAAGACCG







GATTCGGGCCCTACGCGCAGCTCCTGCAGGTGGCTTCC







AGAACATCGCCTGTCTCCGTAGCAATGCCATGAAGCAC







CTTCCTAACTTCTTCTACAAGGGCCAGCTGACAAAGAT







GTTCTTCCTCTTCCCCGACCCACATTTCAAGCGGACAAA







GCACAAGTGGCGAATCATCAGTCCCACCCTGCTAGCA







GAATATGCCTACGTGCTAAGAGTTGGGGGGCTGGTGT







ATACCATAACCGATGTGCTGGAGCTACACGACTGGAT







GTGCACTCATTTCGAAGAGCACCCACTGTTTGAGCGTG







TGCCTCTGGAGGACCTGAGTGAAGACCCCGTTGTGGG







ACATCTAGGCACCTCAACTGAGGAGGGGAAGAAAGTT







CTACGTAATGGAGGGAAGAATTTCCCAGCCATCTTCCG







AAGAATACAAGATCCCGTCCTCCAGGCAGTGACCTCCC







AAACCAGCCTGCCTGGTCAC


G17666.
134210220
219103387
InFrame
8299
ATGGTCCCGGCTGCCGGTCGACGACCGCCCCGCGTCA


TCGA-06-




TGCGGCTCCTCGGCTGGTGGCAAGTATTGCTGTGGGT


5415-




GCTGGGACTTCCCGTCCGCGGCGTGGAGG|GATACAA


01A-01R-




TGTCTCTTTGCTATATGACCTTGAAAATCTTCCGGCATC


1849-




CAAGGATTCCATTGTGCATCAAGCTGGCATGTTGAAGC


01.2




GAAATTGTTTTGCCTCTGTCTTTGAAAAATACTTCCAAT







TCCAAGAAGAGGGCAAGGAAGGAGAGAACAGGGCAG







TTATCCATTATAGGGATGATGAGACCATGTATGTTGAG







TCTAAAAAGGACAGAGTCACAGTAGTCTTCAGCACAG







TGTTTAAGGATGACGACGATGTGGTCATTGGAAAGGT







GTTCATGCAGGAGTTCAAAGAAGGACGCAGAGCCAGC







CACACAGCCCCACAGGTCCTCTTTAGCCACAGGGAACC







TCCTCTGGAGCTGAAAGACACAGACGCCGCTGTGGGT







GACAACATTGGCTACATTACCTTTGTGCTGTTCCCTCGT







CACACCAATGCCAGTGCTCGAGACAACACCATCAACCT







GATCCACACGTTCCGGGACTACCTGCACTACCACATCA







AGTGCTCTAAGGCCTATATTCACACACGTATGCGGGCG







AAAACGTCTGACTTCCTCAAGGTGCTGAACCGCGCACG







CCCAGATGCCGAGAAAAAAGAAATGAAAACAATCACG







GGGAAGACGTTTTCATCCCGC


G17219.
42396776
41420897
InFrame
8300
ATGGACGGTTTCGCCGGCAGTCTCG|GTACACTTCATG


TCGA-06-




TTGGTGATGAAATTCGAGAAATCAATGGCATCAGTGT


0158-




GGCTAACCAAACAGTGGAACAACTGCAAAAAATGCTT


01A-01R-




AGGGAAATGCGGGGGAGTATTACCTTCAAGATTGTGC


1849-




CAAGTTACCGCACTCAGTCTTCGTCCTGTGAGAGAGAT


01.2




TCCCCTTCCACTTCCAGACAGTCCCCAGCTAATGGTCAT







AGCAGCACTAACAATTCTGTTTCGGACTTGCCATCAAC







TACCCAACCAAAAGGACGACAGATCTATGTAAGAGCA







CAATTTGAATATGATCCAGCCAAGGATGACCTCATCCC







CTGTAAAGAAGCTGGCATTCGATTCAGAGTTGGTGAC







ATCATCCAGATTATTAGTAAGGATGATCATAATTGGTG







GCAGGGTAAACTGGAAAACTCCAAAAATGGAACTGCA







GGTCTCATTCCTTCTCCTGAACTTCAGGAATGGCGAGT







AGCTTGCATTGCCATGGAGAAGACCAAACAGGAGCAG







CAGGCCAGCTGTACTTGGTTTGGCAAGAAAAAGAAGC







AGTACAAAGATAAATATTTGGCAAAGCACAATGCAGT







GTTTGATCAATTAGATCTTGTCACATATGAAGAAGTAG







TAAAACTGCCAGCATTCAAGAGGAAAACACTAGTCTTA







TTAGGCGCACATGGTGTTGGGAGAAGACACATAAAAA







ACACTCTCATCACAAAGCACCCAGACCGGTTTGCGTAC







CCTATTCCACATACAACCAGACCTCCAAAGAAAGACGA







AGAAAATGGAAAGAATTATTACTTTGTATCTCATGACC







AAATGATGCAAGACATCTCTAATAACGAGTACTTGGA







GTACGGCAGCCACGAGGATGCGATGTATGGGACAAA







ACTGGAGACCATCCGGAAGATCCACGAGCAGGGGCTG







ATTGCAATACTGGACGTGGAGCCTCAGGCACTGAAGG







TCCTGAGAACTGCAGAGTTTGCTCCTTTTGTTGTTTTCA







TTGCTGCACCAACTATTACTCCAGGTTTAAATGAGGAT







GAATCTCTTCAGCGTCTGCAGAAGGAGTCTGACATCTT







ACAGAGAACATATGCACACTACTTCGATCTCACAATTA







TCAACAATGAAATTGATGAGACAATCAGACATCTGGA







GGAAGCTGTTGAGCTCGTGTGCACAGCCCCACAGTGG







GTCCCTGTCTCCTGGGTCTAT


G17790.
122150870
58193703
InFrame
8301
ATGTCCGGGCAGCTGGAGCGTTGCGAGCGCGAATGGC


TCGA-06-




ACGAGCTGGAGGGAGAATTTCAAGAACTGCAG|GACA


5856-




TGAAGAATGCAACCCTCTCCCTGAATTCTAATGACAGT


01A-01R-




GAGCCAAAATATTACCCTATAGCAGTTCTGTTGAAAAA


1849-




CCAGAATCAGGAGCTGCCTGAGGATGTAAACCCTGCC


01.4




AAAAAGGAGAATTACCTCTCTGAACAGGACTTTGTGTC







TGTGTTTGGCATCACAAGAGGGCAATTTGCAGCTCTGC







CTGGCTGGAAACAGCTCCAAATGAAGAAAGAAAAGG







GGCTTTTC


NYU_B
19746155
19433460
InFrame
8302
ATGGGTGGCCCACAGGACAAGGATGAACGCACAATCG







CCCTTGTGAGGCCGTGGCCTTGGGGGCACCAGGCCCT







GGATCCAGCATATGGCCTGGACACGATGCACCCAAGC







AGGCGCAGCCTCCCCTTCCCTCTGAACTGTCAGCTTGC







AAGGGTTGGAACTGCTGATTATGGAAGTCCCTCGGAT







CAGAGTGATCAGCAGCTGGACTGTGCCTTGGACCTAA







TGAGGCGCCTGCCTCCCCAGCAAATCGAGAAAAACCT







CAGCGACCTGATCGACCTG|GATGTCCCCGTTGAGGCC







CTCACCACGGTGAAGCCATACTGCAATGAGATCCATGC







CCAGGCTCAACTGTGGCTCAAGAGAGACCCCAAGGCA







TCCTATGATGCCTGGAAGAAGTGTCTTCCTATCAGAGG







GATAGATGGCAATGGGAAAGCCCCCAGCAAATCAGAG







CTCCGCCATCTCTATTTGACTGAGAAGTATGTGTGGAG







GTGGAAACAGTTCCTGAGTCGTCGGGGGAAGAGGAC







CTCCCCCTTGGATCTCAAACTGGGGCATAACAACTGGC







TGCGACAAGTGCTTTTCACTCCAGCAACGCAGGCCGCA







CGGCAGGCAGCCTGTACCATTGTGGAAGCTCTAGCCA







CCATTCCCAGCCGCAAGCAGCAGGTCCTGGACCTGCTT







ACCAGTTACCTGGATGAGCTGAGCATAGCTGGGGAGT







GTGCAGCTGAGTACCTGGCTCTCTACCAGAAGCTCATC







ACTTCTGCGCACTGGAAAGTCTACTTGGCAGCTCGGG







GAGTCCTACCCTATGTGGGCAACCTCATCACCAAGGAA







ATAGCTCGTCTGCTGGCCCTGGAGGAGGCTACCCTGA







GTACCGATCTGCAGCAGGGTTATGCCCTTAAAAGTCTC







ACAGGCCTTCTCTCCTCCTTTGTTGAGGTGGAATCCATC







AAAAGACATTTTAAAAGTCGCTTGGTGGGTACTGTGCT







GAATGGATACCTGTGCTTGCGGAAGCTGGTGGTGCAG







AGGACCAAGCTGATCGATGAGACGCAGGACATGCTGC







TGGAGATGCTGGAGGACATGACCACAGGTACAGAATC







AGAAACCAAGGCCTTCATGGCTGTGTGCATTGAGACA







GCCAAGCGCTACAATCTGGATGACTACCGGACCCCGG







TGTTCATCTTCGAGAGGCTCTGCAGCATCATTTATCCTG







AGGAGAATGAAGTCACTGAGTTCTTTGTGACCCTGGA







GAAGGATCCCCAACAAGAAGACTTCTTACAGGGCAGG







ATGCCTGGGAACCCGTATAGCAGCAATGAGCCAGGCA







TCGGGCCGCTGATGAGGGATATAAAGAACAAGATTTG







CCAGGACTGTGACTTAGTGGCCCTCCTGGAAGATGAC







AGTGGCATGGAGCTTCTAGTGAACAATAAAATCATTA







GTTTGGACCTTCCTGTGGCTGAAGTTTACAAGAAAGTC







TGGTGTACCACGAATGAGGGAGAGCCCATGAGGATTG







TTTATCGTATGCGGGGGCTGCTGGGCGATGCCACAGA







GGAGTTCATTGAGTCCCTGGACTCTACTACAGatgaaga







agaagatgaagaagaagTGTATAAAATGGCTGGTGTGATG







GCCCAGTGTGGGGGCCTGGAATGCATGCTTAACAGAC







TCGCAGGGATCAGAGATTTCAAGCAGGGACGCCACCT







TCTAACAGTGCTACTGAAATTGTTCAGTTACTGCGTGA







AGGTGAAAGTCAACCGGCAGCAACTGGTCAAACTGGA







AATGAACACCTTGAACGTCATGCTGGGGACCCTAAACC







TGGCCCTTGTAGCTGAACAAGAAAGCAAGGACAGTGG







GGGTGCAGCTGTGGCTGAGCAGGTGCTTAGCATCATG







GAGATCATTCTAGATGAGTCCAATGCTGAGCCCCTGAG







TGAGGACAAGGGCAACCTCCTCCTGACAGGTGACAAG







GATCAACTGGTGATGCTCTTGGACCAGATCAACAGCAC







CTTTGTTCGCTCCAACCCCAGTGTGCTCCAGGGCCTGC







TTCGCATCATCCCGTACCTTTCCTTTGGAGAGGTGGAG







AAAATGCAGATCTTGGTGGAGCGATTCAAACCATACT







GCAACTTTGATAAATATGATGAAGATCACAGTGGTGAT







GATAAAGTCTTCCTGGACTGCTTCTGTAAAATAGCTGC







TGGCATCAAGAACAACAGCAATGGGCACCAGCTGAAG







GATCTGATTCTCCAGAAGGGGATCACCCAGAATGCACT







TGACTACATGAAAAAGCACATCCCTAGCGCCAAGAATT







TGGATGCCGACATCTGGAAAAAGTTTTTGTCTCGCCCA







GCCTTGCCATTTATCCTAAGGCTGCTTCGGGGCCTGGC







CATCCAGCACCCTGGCACCCAGGTTCTGATTGGAACTG







ATTCCATCCCGAACCTGCATAAGCTGGAGCAGGTGTCC







AGTGATGAGGGCATTGGGACCTTGGCAGAGAACCTGC







TGGAAGCCCTGCGGGAACACCCTGACGTAAACAAGAA







GATTGACGCAGCCCGCAGGGAGACCCGGGCAGAGAA







GAAGCGCATGGCCATGGCAATGAGGCAGAAGGCCCT







GGGCACCCTGGGCATGACGACAAATGAAAAGGGCCA







GGTCGTGACCAAGACAGCACTCCTGAAGCAGATGGAA







GAGCTGATCGAGGAGCCTGGCCTCACGTGCTGCATCT







GCAGGGAGGGATACAAGTTCCAGCCCACAAAGGTCCT







GGGCATTTATACCTTCACGAAGCGGGTAGCCTTGGAG







GAGATGGAGAATAAGCCCCGGAAACAGCAGGGCTAC







AGCACCGTGTCCCACTTCAACATTGTGCACTACGACTG







CCATCTGGCTGCCGTCAGGTTGGCTCGAGGCCGGGAA







GAGTGGGAGAGTGCCGCCCTGCAGAATGCCAACACCA







AGTGCAACGGGCTCCTTCCGGTCTGGGGACCTCATGTC







CCTGAATCAGCTTTTGCCACTTGCTTGGCAAGACACAA







CACTTACCTCCAGGAATGTACAGGCCAGCGGGAGCCC







ACGTATCAGCTCAACATCCATGACATCAAACTGCTCTTC







CTGCGCTTCGCCATGGAGCAGTCGTTCAGCGCAGACA







CTGGCGGGGGCGGCCGGGAGAGCAACATCCACCTGA







TCCCGTACATCATTCACACTGTGCTTTACGTCCTGAACA







CAACCCGAGCAACTTCCCGAGAAGAGAAGAACCTCCA







AGGCTTTCTGGAACAGCCCAAGGAGAAGTGGGTGGA







GAGTGCCTTTGAAGTGGACGGGCCCTACTATTTCACAG







TCTTGGCCCTTCACATCCTGCCCCCTGAGCAGTGGAGA







GCCACACGTGTGGAAATCTTGCGGAGGCTGTTGGTGA







CCTCGCAGGCTCGGGCAGTGGCTCCAGGTGGAGCCAC







CAGGCTGACAGATAAGGCAGTGAAGGACTATTCCGCT







TACCGTTCTTCCCTTCTCTTTTGGGCCCTCGTCGATCTC







ATTTACAACATGTTTAAGAAACAAACAACCCCAACTGT







GGGAGGGATTGACACTGGCAGCCTGGAGCCTTGTGTC







TGTGAGAAGGTGCCTACCAGTAACACAGAGGGAGGCT







GGTCCTGCTCTCTCGCTGAGTACATCCGCCACAACGAC







ATGCCCATCTACGAAGCTGCCGACAAAGCCCTGAAAA







CCTTCCAGGAGGAGTTCATGCCAGTGGAGACCTTCTCA







GAGTTCCTCGATGTGGCCGGTCTTTTATCAGAAATCAC







CGATCCAGAGAGCTTCCTGAAGGACCTGTTGAACTCA







GTCCCC


G17657.TCGA-19-1787-01B-01R-1850-01.2
100438902
100348442
InFrame
8303
ATGAACGGACAGTTGGATCTAAGTGGGAAGCTAATCA







TCAAAGCTCAACTTGGGGAGGATATTCGGCGAATTCCT







ATTCATAATGAAGATATTACTTATGATGAATTAGTGCT







AATGATGCAACGAGTTTTCAGAGGAAAACTTCTGAGT







AATGATGAAGTAACAATAAAGTATAAAGATGAAGATG







GAGATCTTATAACAATTTTTGATAGTTCTGACCTTTCCT







TTGCAATTCAGTGCAGTAGGATACTGAAACTGACATTA







TTTG|GAAAATCTACTTCCTCATCAAGCACCCCTACAGA







GTTCTGCAGGAATGGTGGAACCTGGGAAAATGGCAGA







TGTATTTGTACAGAAGAGTGGAAAGGACTGAGATGTA







CAATTGCTAATTTTTGTGAAAATAGTACCTATATGGGTT







TTACTTTTGCCAGAATCCCAGTGGGCAGATATGGACCA







TCCTTGCAAACATGTGGCAAGGATACTCCAAATGCGG







GCAATCCAATGGCAGTCCGGTTGTGCAGTCTCTCTCTA







TATGGAGAGATAGAATTACAAAAAGTGACAATAGGAA







ATTGCAATGAAAATCTGGAAACCCTGGAAAAGCAGGT







AAAGGATGTCACAGCACCACTTAATAACATTTCTTCTG







AAGTCCAGATTTTAACATCTGATGCCAATAAATTAACT







GCTGAGAACATCACTAGTGCTACGCGAGTGGTTGGAC







AGATATTCAACACTTCCAGAAATGCTTCACCTGAGGCA







AAGAAAGTTGCCATAGTAACAGTGAGTCAACTCCTAG







ATGCCAGTGAAGATGCTTTTCAAAGAGTTGCTGCTACT







GCTAATGATGATGCCCTTACAACGCTTATTGAGCAAAT







GGAGACTTATTCCTTGTCTTTGGGTAATCAATCAGTGG







TGGAACCTAACATAGCAATACAGTCAGCAAATUCTCT







TCAGAAAATGCGGTGGGGCCTTCAAATGTTCGCTTCTC







TGTGCAGAAAGGAGCTAGCAGTTCTCTAGTTTCTAGTT







CAACATTTATACATACAAATGTGGATGGCCTTAACCCA







GATGCACAGACTGAGCTTCAGGTCTTGCTTAATATGAC







GAAAAATTACACCAAGACATGCGGCTTTGTAGTTTATC







AAAATGACAAGCTTTTCCAATCAAAAACTTTTACAGCT







AAATCGGATTTTAGTCAAAAAATTATCTCAAGCAAAAC







TGATGAAAATGAGCAAGATCAGAGTGCTTCTGTTGAC







ATGGTCTTTAGTCCAAAGTACAACCAAAAAGAATTTCA







ACTCTATTCCTATGCCTGTGTCTATTGGAATTTGTCAGC







GAAGGACTGGGACACATATGGCTGTCAAAAAGACAAG







GGCACTGATGGATTCCTGCGCTGCCGCTGCAACCATAC







TACTAATTTTGCTGTATTAATGACTTTCAAAAAGGATTA







TCAATATCCCAAATCACTTGACATATTATCCAACGTTGG







ATGTGCACTGTCTGTTACTGGTCTGGCTCTCACAGTTAT







ATTTCAGATTGTCACCAGGAAAGTCAGAAAAACCTCAG







TAACCTGGGTTTTGGTCAATCTGTGCATATCAATGTTG







ATTTTCAACCTCCTCTTTGTGTTTGGAATTGAAAACTCC







AATAAGAACTTGCAGACAAGTGATGGTGACATCAATA







ATATTGACTTTGACAATAATGACATACCCAGGACAGAC







ACCATTAACATCCCGAATCCCATGTGCACTGCGATTGC







CGCCTTACTGCACTATTTTCTGTTAGTGACATTTACCTG







GAACGCACTCAGCGCTGCACAGCTCTATTACCTTCTAA







TAAGGACCATGAAGCCTCTTCCTCGGCATTTCATTCTTT







TCATCTCATTAATTGGATGGGGAGTCCCAGCTATAGTA







GTGGCTATAACAGTGGGAGTTATTTATTCTCAGAATGG







AAATAATCCACAGTGGGAATTAGACTACCGGCAAGAG







AAAATCTGCTGGCTGGCAATTCCAGAACCCAATGGTGT







TATAAAAAGTCCGCTGTTGTGGTCATTCATCGTACCTG







TAACCATTATCCTCATCAGCAATGTTGTTATGTTTATTA







CAATCTCGATCAAAGTGCTGTGGAAGAATAACCAGAA







CCTGACAAGCACAAAAAAAGTTTCATCCATGAAGAAG







ATTGTTAGCACATTATCTGTTGCAGTTGTTTTTGGAATT







ACCTGGATTCTAGCATACCTGATGCTAGTTAATGATGA







TAGCATCAGGATCGTCTTCAGCTACATATTCTGCCTTTT







CAACACTACACAGGGATTGCAAATTTTTATCCTGTACA







CTGTTAGAACAAAAGTCTTCCAGAGTGAAGCTTCCAAA







GTGTTGATGTTGCTATCGTCTATTGGGAGAAGGAAGTC







ATTGCCTTCAGTGACGCGGCCGAGGCTGCGTGTAAAG







ATGTATAATTTCCTCAGGTCATTGCCAACCTTACATGAA







CGCTTTAGGCTACTGGAAACCTCTCCGAGTACTGAGGA







AATCACACTCTCTGAAAGTGACAATGCAAAGGAAAGC







ATC


G17643.TCGA-12-5295-01A-01R-1849-01.2
43976965
52740660
InFrame
8304
ATGGCCCCCGCCCGTCTGTTCGCGCTGCTGCTGTTCTTC







GTAGGCGGAGTCGCCGAGTCG|GATTACAAGGGCCAG







AAGCTAGCTGAACAGATGTTTCAGGGAATTATTCTTTT







TTCTGCAATAGTTGGATTTATCTACGGGTACGTGGCTG







AACAGTTCGGGTGGACTGTCTATATAGTTATGGCCGG







ATTTGCTTTTTCATGTTTGCTGACACTTCCTCCATGGCC







CATCTATCGCCGGCATCCTCTCAAGTGGTTACCTGTTCA







AGAATCAAGCACAGACGACAAGAAACCAGGGGAAAG







AAAAATTAAGAGGCATGCTAAAAATAAT


NYU_G
100191807
102260661
InFrame
8305
ATGCGCTCCATTAGGAAGAGGTGGACGATCTGCACAA







TAAGTCTGCTCCTGATCTTTTATAAGACAAAAGAAATA







GCAAGAACTGAGGAGCACCAGGAGACGCAACTCATCG







GAGATGGTGAATTGTCTTTGAGTCGGTCACTTGTCAAT







AGCTCTGATAAAATCATTCGAAAGGCTGGCTCTTCAAT







CTTCCAGCACAATGTAGAAGGTTGGAAAATCAATTCCT







CTTTGGTCCTAGAGATAAGGAAGAACATACTTCGTTTC







TTAGATGCAGAACGAGATGTGTCAGTGGTCAAGAGCA







GTTTTAAGCCTGGTGATGTCATACACTATGTGCTTGAC







AGGCGCCGGACACTAAACATTTCTCATGATCTACATAG







CCTCCTACCTGAAGTTTCACCAATGAAGAATCGCAGGT







TTAAGACCTGTGCAGTTGTTGGAAATTCTGGCATTCTG







TTAGACAGTGAATGTGGAAAGGAGATTGACAGTCACA







ATTTTGTAATAAGGTGTAATCTAGCTCCTGTGGTGGAG







TTTGCTGCAGATGTGGGAACTAAATCAGATTTTATTAC







CATGAATCCATCAGTTGTACAAAGAGCATTTGGAGGCT







TTCGAAATGAGAGTGACAGAGAAAAATTTGTGCATAG







ACTTTCCATGCTGAATGACAGTGTCCTTTGGATTCCTGC







TTTCATGGTCAAAGGAGGAGAGAAGCACGTGGAGTG







GGTTAATGCATTAATCCTTAAGAATAAACTGAAAGTGC







GAACTGCCTATCCGTCATTGAGACTTATTCATGCTGTCA







GAGG|GTTTTGTGATGAAGGAACCTGTACAGATAAAG







CCAATATTCTGTATGCCTGGGCGAGAAATGCTCCCCCT







ACCCGGCTCCCCAAAGGTGTTGGATTCAGAGTTGGAG







GAGAGACTGGAAGTAAATACTTTGTACTACAGGTACA







CTATGGGGATATTAGTGCTTTTAGAGATAATAACAAGG







ACTGTTCTGGTGTGTCCTTACACCTCACACGTCTGCCAC







AGCCTTTAATTGCTGGCATGTACCTTATGATGTCTGTTG







ACACTGTTATCCCAGCAGGAGAAAAAGTGGTGAATTC







TGACATTTCATGCCATTATAAAAATTATCCAATGCATGT







CTTTGCCTATAGAGTTCACACTCACCATTTAGGTAAGG







TAGTAAGTGGATACAGAGTAAGAAATGGACAGTGGAC







ACTGATTGGACGGCAGAGCCCTCAGCTGCCACAGGCT







TTCTACCCTGTGGGGCATCCAGTTGATGTAAGTTTTGG







TGACCTACTGGCTGCAAGATGTGTATTCACTGGTGAAG







GAAGGACAGAAGCCACACACATTGGTGGCACGTCTAG







TGATGAAATGTGCAACTTATACATTATGTATTACATGG







AAGCCAAGCATGCAGTTTCTTTCATGACCTGTACCCAG







AATGTAGCTCCAGATATGTTCAGAACCATACCACCAGA







GGCCAACATTCCAATTCCCGTGAAGTCTGATATGGTTA







TGATGCATGAACATCATAAAGAAACAGAATATAAAGA







TAAGATTCCTTTACTACAGCAGCCAAAACGAGAAGAA







GAAGAAGTGTTAGACCAGGGTGATTTCTATTCACTACT







TTCCAAGCTGCTAGGAGAAAGGGAAGATGTTGTTCAT







GTGCACAAATATAATCCTACAGAAAAGGCAGAATCAG







AGTCAGACCTGGTAGCTGAGATTGCAAATGTAGTCCA







AAAAAAGGATCTTGGTCGATCTGATGCCAGAGAGGGT







GCAGAACATGAGAGGGGTAATGCTATTCTTGTCAGAG







ACAGAATTCACAAATTCCACAGACTAGTATCTACCTTG







AGGCCACCAGAGAGCAGAGTTTTCTCATTACAGCAGC







CCCCACCTGGTGAAGGCACCTGGGAACCAGAACACAC







AGGAGATTTCCACATGGAAGAGGCACTGGATTGGCCT







GGAGTATACTTGTTACCAGGCCAGGTTTCTGGGGTGG







CTCTAGACCCTAAGAATAACCTGGTGATTTTCCACAGA







GGTGACCATGTCTGGGATGGAAACTCGTTTGACAGCA







AGTTTGTTTACCAGCAAATAGGACTCGGACCAATTGAA







GAAGACACTATTCTTGTCATAGATCCAAATAATGCTGC







AGTACTCCAGTCCAGTGGAAAAAATCTGTTTTACTTGC







CACATGGCTTGAGTATAGATAAAGATGGGAATTATTG







GGTCACAGACGTGGCTCTCCATCAGGTGTTCAAACTGG







ATCCAAACAATAAAGAAGGCCCTGTATTAATCCTGGGA







AGGAGCATGCAACCAGGCAGTGACCAGAATCACTTCT







GTCAACCCACTGATGTGGCTGTGGATCCAGGCACTGG







AGCCATTTATGTATCAGATGGTTACTGCAACAGCAGGA







TTGTGCAGTTTTCACCAAGTGGAAAGTTCATCACACAG







TGGGGAGAAGAGTCTTCAGGGAGCAGTCCTCTGCCAG







GCCAGTTCACTGTTCCTCACAGCTTGGCTCTTGTGCCTC







TTTTGGGCCAATTATGTGTGGCAGACCGGGAAAATGG







TCGGATCCAGTGTTTTAAAACTGACACCAAAGAATTTG







TGAGAGAGATTAAGCATTCATCATTTGGAAGAAATGT







ATTTGCAATTTCATATATACCAGGCTTGCTCTTTGCAGT







GAATGGGAAGCCTCATTTTGGGGACCAAGAACCTGTA







CAAGGATTTGTGATGAACTTTTCCAATGGGGAAATTAT







AGACATCTTCAAGCCAGTGCGCAAGCACTTTGATATGC







CTCATGATATTGTTGCATCTGAAGATGGGACTGTGTAC







ATTGGAGATGCTCATACCAACACCGTGTGGAAGTTCAC







CTTGACTGAGAAATTGGAACATCGATCAGTTAAAAAG







GCTGGCATTGAGGTCCAGGAAATCAAAGAAGCCGAG







GCAGTTGTTGAAACCAAAATGGAGAACAAACCCACCT







CCTCAGAATTGCAGAAGATGCAAGAGAAACAGAAACT







GATCAAAGAGCCAGGCTCGGGAGTGCCTGTTGTTCTC







ATTACAACCCTTCTGGTTATTCCGGTGGTTGTCCTGCTG







GCCATTGCCATATTTATTCGGTGGAAAAAATCAAGGGC







CTTTGGAGCAGATTCTGAACACAAACTCGAGACGAGTT







CAGGAAGAGTACTGGGAAGATTTAGAGGAAAGGGAA







GTGGAGGCTTAAACCTTGGTAATTTCTTTGCAAGCCGT







AAGGGCTACAGTCGAAAAGGGTTTGACCGGCTTAGCA







CTGAGGGCAGTGACCAAGAGAAAGAGGATGATGGAA







GTGAATCAGAAGAGGAGTATTCAGCACCTCTGCCTGC







GCTCGCACCTTCCTCCTCC


G17494.TCGA-14-2554-01A-01R-1850-01.2
48965246
48265844
InFrame
8306
ATGTCAGTGGACATGAATAGCCAGGGGTCTGACAGCA







ATGAAGAGGACTATGACCCAAATTGTgaggaagaggaaga







agaagaagaagaCGACCCTGGGGACATAGAGGACTATTA







CGTGGGAGTAGCCAGCGATGTGGAGCAGCAGGGGGC







TGATGCCTTTGATCCCGAGGAGTACCAGTTCACTTGCT







TGACCTACAAGGAATCTGAGGGTGCCCTCAATGAGCA







CATGACCAGCTTAGCTTCTGTCCTAAAG|GATGGGGAC







CCAGACACGCCAAAGCCTGTGAGCTTCACAGTGAAGG







AGACAGTGTGCCCCAGGACGACACAGCAGTCACCAGA







GGATTGTGACTTCAAGAAGGACGGGCTGGTGAAGCG







GTGTATGGGGACAGTGACCCTCAACCAGGCCAGGGGC







TCCTTTGACATCAGTTGTGATAAGGATAACAAGAGATT







TGCCCTGCTGGGTGATTTCTTCCGGAAATCTAAAGAGA







AGATTGGCAAAGAGTTTAAAAGAATTGTCCAGAGAAT







CAAGGATTTTTTGCGGAATCTTGTACCCAGGACAGAGT







CC


G17196.TCGA-06-0178-01A-01R-1849-01.2
58123422
94827533
InFrame
8307
ATGAGCCGGGGCGCGGGCGCGCTTCAGCGCCGGACA







ACGACCTACCTCATCTCGCTGACCCTGGTTAAGCTCGA







gtcggtgcctccgccgccgccttctccgtctgcggccgcggccggcgcc







gccggtgccAGAGGCTCCGAGACTGGGGATCCTGGCAGC







CCCCGAGGCGCGGAGGAGCCGGGCAAGAAGCGGCAC







GAACGTCTCTTCCACCGGCAGGATGCGCTGTGGATCA







GCACGAGCAGCGCGGGCACCGGGGGCGCGGAGCCCC







CAGCCCTGTCCCCGGCTCCGGCCAGTCCGGCCCGCCCA







GTCTCCCCCGCTCCCGGCCGCCGCCTCTCCCTCTGGGC







CGTCCCTCCGGGACCCCCGCTCTCCGGGGGACTGAGC







CCCGACCCCAAGCCTGGGGGCGCCCCCACCTCCTCCCG







GCGCCCCCTGCTCAGCAGCCCGAGCTGGGGCGGCCCG







GAGCCCGAAGGCCGGGCGGGCGGCGGCATCCCTGGC







TCATCCTCTCCGCACCCTGGCACCGGCAGCCGGAGGCT







CAAGGTGGCGCCTCCTCCGCCGGCTCCCAAGCCTTGCA







AGACCGTGACCACGAGTGGAGCCAAAGCCGGCGGGG







GCAAGGGCGCGGGTAGCCGCCTGTCATGGCCCGAAA







GCGAGGGCAAGCCCAGGGTCAAGGGGTCAAAGAGCA







GCGCCGGGACTGGAGCTTCGGTCTCTgccgccgccaccgcc







gccgccgccgGGGGAGGGGGCTCTACAGCTTCGACCTCT







GGTGGGGTCGGGGCTGGGGCTGGAGCCCGAGGGAA







GTTGTCCCCTCGGAAAGGCAAGAGTAAGACCTTGGAC







AACAGTGACTTGCATCCGGGACCGCCTGCCGGCTCTCC







TCCTCCGCTAACCCTCCCACCAACTCCGAGTCCAGCCAC







TGCTGTCACCGCTGCTTCCGCGCAGCCCCCCGGGCCTG







CACCTCCAATCACTCTGGAGCCTCCAGCTCCGGGGCTG







AAACGGGGCCGGGAGGGGGGCCGAGCATCCACTCGT







GACCGCAAGATGCTCAAGTTTATCAGCGGCATCTTCAC







CAAGAGCACAGGAGGGCCTCCTGGCTCCGGGCCCCTT







CCCGGACCCCCCAGCCTGTCTTCTGGCAGCGGGTCCAG







GGAGCTGCTGGGCGCCGAGCTCCGCGCTTCCCCTAAG







GCTGTGATCAATAGCCAGGAATGGACTTTGAGCCGCT







CCATTCCTGAACTGCGCCTGGGTGTGCTGGGCGATGCC







AGGAGTGGGAAGTCATCGCTCATCCACCGATTCCTGAC







TGGCTCATACCAGGTGCTGGAGAAGACAGAGAGTGA







GCAGTACAAGAAAGAAATGTTGGTGGATGGACAGAC







ACATCTGGTGCTAATCCGAGAGGAAGCTGGGGCACCT







GATGCCAAGTTCTCAGGCTGGGCAGATGCTGTGATCTT







CGTCTTCAGCCTGGAGGATGAGAACAGTTTCCAGGCT







GTGAGCCGTCTCCATGGGCAGCTGAGTTCCCTTCGCGG







GGAGGGACGAGGAGGCCTGGCCTTGGCACTGGTGGG







GACACAAGACAGGATCAGTGCTTCCTCCCCTCGGGTG







GTGGGAGATGCTCGTGCCAGAGCTCTGTGCGCGGACA







TGAAACGCTGCAGCTACTATGAGACTTGTGCAACCTAT







GGGCTCAATGTGGATCGGGTCTTCCAGGAGGTGGCCC







AGAAGGTGGTGACCTTGCGCAAGCAGCAACAGCTTCT







GGCTGCCTGCAAGTCCCTGCCCAGCTCCCCAAGCCACT







CAGCTGCATCCACTCCGGTAGCTGGCCAGGCTAGTAAC







GGGGGCCACACTAGCGACTACTCTTCTTCCCTCCCGTC







CTCACCGAATGTTGGTCACCGGGAGCTCCGAGCCGAG







GCAGCTGCAGTGGCTGGATTGAGCACCCCAGGGTCCC







TGCACCGGGCAGCCAAGCGCAGGACCAGCCTTTTTGC







GAATCGTCGGGGTAGTGACTCCGAGAAACGAAGCTTG







GATAGTCGGGGAGAGACAACAGGGAGTGGGCGAGCC







ATCCCCATCAAACAGAGCTTCCTACTAAAACGAAGTGG







CAATTCCTTGAACAAAGAATGGAAGAAGAAATATGTA







ACCCTGTCCAGTAATGGCTTTCTACTCTACCACCCCAGT







ATTAACGATTACATCCACAGTACCCACGGCAAGGAGAT







GGACTTGCTGCGAACAACAGTCAAAGTCCCGGGCAAG







CGGCCCCCGAGGGCCATCTCTGCCTTTGGCCCCTCAGC







CAGCATTAACGGGCTCGTCAAGGACATGAGCACTGTC







CAGATGGGTGAAGGCCTGGAAGCCACTACTCCCATGC







CAAGCCCTAGCCCCAGCCCCAGTTCCCTGCAGCCACCA







CCAGATCAGACATCCAAACACCTGCTGAAGCCAGACC







GGAATTTGGCCCGAGCCCTCAGCACGGACTGTACCCC







ATCTGGAGACCTGAGCCCCCTGAGTCGGGAACCCCCTC







CTTCTCCCATGGTGAAGAAGCAGAGGAGGAAAAAATT







GACAACACCATCCAAGACTGAAGGCTCGGCTGGGCAG







GCTGAAGTATGAAGGTTATTCTTTCAGCAGTGTCCTGT







ATTATGGAAATGAAGCTACTCTTCTTATTTTTGATCTGC







TGTTCTTCTGTGTTGTGGATTTGGCTTGCCAAAATTTTA







TTTTAGCATCCTTCCTTACATATCTACAACAAGAGATTT







TTAGATATATCCGTAATACAGTAGGACAAAAGAATTTG







GCATCCAAAACATTGGTGGATCAAAGATTTTTGATT


G17782.TCGA-26-5136-01B-01R-1850-01.4
95392394
9544862
InFrame
8308
ATGACCCACTTCAACAAGGGCCCTTCCTATGGGCTCTC







GGCCGAAGTCAAGAACAAG|GTGTTGTGTAACGGACC







AGGAACATGTGTTCCTATCTGTGTATCTGCCCTTCTCCT







TGGGATACTAGGAATAAAGAAAGTGATCATTGTCTAC







GTTGAAAGCATCTGCCGTGTAGAAACGTTATCCATGTC







CGGAAAGATTCTGTTTCATCTCTCAGATTACTTCATTGT







TCAGTGGCCGGCTCTGAAAGAAAAGTATCCCAAATCG







GTGTACCTTGGGCGAATTGTT


GBM-CUMC3296_L1
55639964
68645358
InFrame
8309
ATGAGGCGCCAGCCTGCGAAGGTGGCGGCGCTGCTGC







TCGGGCTGCTCTTGGAG|CATATTGAAGGTGATGACCT







GCATATCCAGAGGAATGTGCAAAAGCTGAAGGACACA







GTGAAAAAGCTTGGAGAGAGTGGAGAGATCAAAGCA







ATTGGAGAACTGGATTTGCTGTTTATGTCTCTGAGAAA







TGCCTGCATT


G17199.TCGA-06-0744-01A-01R-1849-01.2
55087058
102784508
InFrame
8310
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG







CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT







GGAGGAAAAGAAAG|CTGACATAGCCCAGAGATACA







GGATAAGCAAATACCCAACCCTCAAATTGTTTCGTAAT







GGGATGATGATGAAGAGAGAATACAGGGGTCAGCGA







TCAGTGAAAGCATTGGCAGATTACATCAGGCPACAAA







AAAGTGACCCCATTCAAGAAATTCGGGACTTAGCAGA







AATCACCACTCTTGATCGCAGCAAAAGAAATATCATTG







GATATTTTGAGCAAAAGGACTCGGACAACTATAGAGT







TTTTGAACGAGTAGCGAATATTTTGCATGATGACTGTG







CCTTTCTTTCTGCATTTGGGGATGTTTCAAAACCGGAA







AGATATAGTGGCGACAACATAATCTACAAACCACCAG







GGCATTCTGCTCCGGATATGGTGTACTTGGGAGCTATG







ACAAATTTTGATGTGACTTACAATTGGATTCAAGATAA







ATGTGTTCCTCTTGTCCGAGAAATAACATTTGAAAATG







GAGAGGAATTGACAGAAGAAGGACTGCCTTTTCTCAT







ACTCTTTCACATGAAAGAAGATACAGAAAGTTTAGAAA







TATTCCAGAATGAAGTAGCTCGGCAATTAATAAGTGAA







AAAGGTACAATAAACTTTTTACATGCCGATTGTGACAA







ATTTAGACATCCTCTTCTGCACATACAGAAAACTCCAG







CAGATTGTCCTGTAATCGCTATTGACAGCTTTAGGCAT







ATGTATGTGTTTGGAGACTTCAAAGATGTATTAATTCC







TGGAAAACTCAAGCAATTCGTATTTGACTTACATTCTG







GAAAACTGCACAGAGAATTCCATCATGGACCTGACCC







AACTGATACAGCCCCAGGAGAGCAAGCCCAAGATGTA







GCAAGCAGTCCACCTGAGAGCTCCTTCCAGAAACTAGC







ACCCAGTGAATATAGGTATACTCTATTGAGGGATCGA







GATGAGCTT


G17792.TCGA-28-5204-01A-01R-1850-01.4
6996029
8288288
InFrame
8311
ATGACTGATGGAAAACTCTCCACCTCTACAAATGGCGT







AGCCTTCATGGGTATTCTGGATGGTCGACCAGGAAAC







CCCCTTCAGAACCTGCAACACGTCAATCTCAAGGCGCC







CCGACTCCTCTCCGCGCCTGAGTACGGGCCCAAGCTGA







AACTCAGGGCTTTAGAAGACCGGCACAGCCTCCAGTC







CGTGGACTCGGGGATTCCTACCCTGGAGATCGGGAAC







CCGGAGCCTGTACCCTGCAGCGCGGTCCACGTGAGGA







GGAAGCAGTCCGACTCCGACCTCATCCCCGAGCGGGC







CTTCCAGAGCGCCTGCGCGCTGCCATCCTGTGCGCCAC







CAGCTCCTAGCAGCACCGAGCGGGAACAGAGCGTGCG







CAAATCCTCCACGTTTCCCAGGACAGGCTATGACTCGG







TAAAGCTCTATAGCCCGACCTCCAAAGCCCTGACCCGC







AGCGATGATGTCTCCGTCTGCAGCGTGTCCAGTCTTGG







GACAGAGCTGTCCACCACGCTGTCCGTCAGCAATGAG







GACATCTTGGACCTTGTGGTCACGAGCAGCTCCAGTGC







CATTGTGACCCTGGAGAATGACGATGACCCACAGTTTA







CCAACGTCACCTTGAGCTCTATCAAGGAAACCCGTGGC







TTACACCAGCAGGACTGTGTTCATGAAGCTGAGGAGG







GGAGTAAATTGAAAATATTGGGGCCATTTAGTAACTTC







TTTGCAAGGAACTTGCTTGCTAGAAAACAAAGTGCAA







GGCTTGACAAACACAATGACTTGGGATGGAAGTTATTT







GGGAAAGCGCCACTCCGAGAGAATGCCCAGAAGGATT







CAAAGAGAATACAGAAGgaatatgaagacaaggctggaagac







ctagCAAGCCACCCTCTCCAAAGCAGAATGTGAGGAAG







AATCTTGACTTTGAACCACTTTCCACCACCGCACTCATC







CTCGAGGACAGACCAGC|ACACCCGCTGTTTGGCCGCA







ACGTGCCCCTGTCCAGCGGTTCTGGCTTCATCATGTCA







GAGGCCGGCCTGATCATCACCAATGCCCACGTGGTGT







CCAGCAACAGTGCTGCCCCGGGCAGGCAGCAGCTCAA







GGTGCAGCTACAGAATGGGGACTCCTATGAGGCCACC







ATCAAAGACATCGACAAGAAGTCGGACATTGCCACCA







TCAAGATCCATCCCAAGAAAAAGCTCCCTGTGTTGTTG







CTGGGTCACTCGGCCGACCTGCGGCCTGGGGAGTTTG







TGGTGGCCATCGGCAGTCCCTTCGCCCTACAGAACACA







GTGACAACGGGCATCGTCAGCACTGCCCAGCGGGAGG







GCAGGGAGCTGGGCCTCCGGGACTCCGACATGGACTA







CATCCAGACGGATGCCATCATCAACTACGGGAACTCCG







GGGGACCACTGGTGAACCTGGATGGCGAGGTCATTGG







CATCAACACGCTCAAGGTCACGGCTGGCATCTCCTTTG







CCATCCCCTCAGACCGCATCACACGGTTCCTCACAGAG







TTCCAAGACAAGCAGATCAAAGACTGGAAGAAGCGCT







TCATCGGCATACGGATGCGGACGATCACACCAAGCCT







GGTGGATGAGCTGAAGGCCAGCAACCCGGACTTCCCA







GAGGTCAGCAGTGGAATTTATGTGCAAGAGGTTGCGC







CGAATTCACCTTCTCAGAGAGGCGGCATCCAAGATGG







TGACATCATCGTCAAGGTCAACGGGCGTCCTCTAGTGG







ACTCGAGTGAGCTGCAGGAGGCCGTGCTGACCGAGTC







TCCTCTCCTACTGGAGGTGCGGCGGGGGAACGACGAC







CTCCTCTTCAGCATCGCACCTGAGGTGGTCATG


G17476.TCGA-06-2569-01A-01R-1849-01.2
7310150
32328393
InFrame
8312
ATGAGACTCCTCCCCCGCTTGCTGCTGCTTCTCTTACTC







GTGTTCCCTGCCACTGTCTTGTTCCGAGGCGGCCCCAG







AGGCTTGTTAGCAGTGGCACAAGATCTTACAGAGGAT







GAAGAAACAGTAGAAGATTCCATAATTGAGGATGAAG







ATGATGAAGCCGAGGTAGAAGAAGATGAACCCACAG







ATTTG|CACACTGTCCGTGAAGAAACGATGATGGTGAT







GACTGAAGACATGCCTTTGGAAATTTCTTATGTGCCTT







CTACTTATTTGACTGAAATCACTCATGTCTCACAAGCCC







TATTAGAAGTGGAACAACTTCTCAATGCTCCTGACCTC







TGTGCTAAGGACTTTGAAGATCTCTTTAAGCAAGAGGA







GTCTCTGAAGAATATAAAAGATAGTCTACAACAAAGCT







CAGGTCGGATTGACATTATTCATAGCAAGAAGACAGC







AGCATTGCAAAGTGCAACGCCTGTGGAAAGGGTGAAG







CTACAGGAAGCTCTCTCCCAGCTTGATTTCCAATGGGA







AAAAGTTAACAAAATGTACAAGGACCGACAAGGGCGA







TTTGACAGATCTGTTGAGAAATGGCGGCGTTTTCATTA







TGATATAAAGATATTTAATCAGTGGCTAACAGAAGCTG







AACAGTTTCTCAGAAAGACACAAATTCCTGAGAATTGG







GAACATGCTAAATACAAATGGTATCTTAAGGAACTCCA







GGATGGCATTGGGCAGCGGCAAACTGTTGTCAGAACA







TTGAATGCAACTGGGGAAGAAATAATTCAGCAATCCTC







AAAAACAGATGCCAGTATTCTACAGGAAAAATTGGGA







AGCCTGAATCTGCGGTGGCAGGAGGTCTGCAAACAGC







TGTCAGACAGAAAAAAGAGGCTAGAAGAACAAAAGA







ATATCTTGTCAGAATTTCAAAGAGATTTAAATGAATTT







GTTTTATGGTTGGAGGAAGCAGATAACATTGCTAGTAT







CCCACTTGAACCTGGAAAAGAGCAGCAACTAAAAGAA







AAGCTTGAGCAAGTCAAGTTACTGGTGGAAGAGTTGC







CCCTGCGCCAGGGAATTCTCAAACAATTAAATGAAACT







GGAGGACCCGTGCTTGTAAGTGCTCCCATAAGCCCAG







AAGAGCAAGATAAACTTGAAAATAAGCTCAAGCAGAC







AAATCTCCAGTGGATAAAGGTTTCCAGAGCTTTACCTG







AGAAACAAGGAGAAATTGAAGCTCAAATAAAAGACCT







TGGGCAGCTTGAAAAAAAGCTTGAAGACCTTGAAGAG







CAGTTAAATCATCTGCTGCTGTGGTTATCTCCTATTAGG







AATCAGTTGGAAATTTATAACCAACCAAACCAAGAAG







GACCATTTGACGTTAAGGAAACTGAAATAGCAGTTCA







AGCTAAACAACCGGATGTGGAAGAGATTTTGTCTAAA







GGGCAGCATTTGTACAAGGAAAAACCAGCCACTCAGC







CAGTGAAGAGGAAGTTAGAAGATCTGAGCTCTGAGTG







GAAGGCGGTAAACCGTTTACTTCAAGAGCTGAGGGCA







AAGCAGCCTGACCTAGCTCCTGGACTGACCACTATTGG







AGCCTCTCCTACTCAGACTGTTACTCTGGTGACACAAC







CTGTGGTTACTAAGGAAACTGCCATCTCCAAACTAGAA







ATGCCATCTTCCTTGATGTTGGAGGTACCTGCTCTGGC







AGATTTCAACCGGGCTTGGACAGAACTTACCGACTGG







CTTTCTCTGCTTGATCAAGTTATAAAATCACAGAGGGT







GATGGTGGGTGACCTTGAGGATATCAACGAGATGATC







ATCAAGCAGAAGGCAACAATGCAGGATTTGGAACAGA







GGCGTCCCCAGTTGGAAGAACTCATTACCGCTGCCCAA







AATTTGAAAAACAAGACCAGCAATCAAGAGGCTAGAA







CAATCATTACGGATCGAATTGAAAGAATTCAGAATCAG







TGGGATGAAGTACAAGAACACCTTCAGAACCGGAGGC







AACAGTTGAATGAAATGTTAAAGGATTCAACACAATG







GCTGGAAGCTAAGGAAGAAGCTGAGCAGGTCTTAGG







ACAGGCCAGAGCCAAGCTTGAGTCATGGAAGGAGGG







TCCCTATACAGTAGATGCAATCCAAAAGAAAATCACAG







AAACCAAGCAGTTGGCCAAAGACCTCCGCCAGTGGCA







GACAAATGTAGATGTGGCAAATGACTTGGCCCTGAAA







CTTCTCCGGGATTATTCTGCAGATGATACCAGAAAAGT







CCACATGATAACAGAGAATATCAATGCCTCTTGGAGAA







GCATTCATAAAAGGGTGAGTGAGCGAGAGGCTGCTTT







GGAAGAAACTCATAGATTACTGCAACAGTTCCCCCTGG







ACCTGGAAAAGTTTCTTGCCTGGCTTACAGAAGCTGAA







ACAACTGCCAATGTCCTACAGGATGCTACCCGTAAGGA







AAGGCTCCTAGAAGACTCCAAGGGAGTAAAAGAGCTG







ATGAAACAATGGCAAGACCTCCAAGGTGAAATTGAAG







CTCACACAGATGTTTATCACAACCTGGATGAAAACAGC







CAAAAAATCCTGAGATCCCTGGAAGGTTCCGATGATG







CAGTCCTGTTACAAAGACGTTTGGATAACATGAACTTC







AAGTGGAGTGAACTTCGGAAAAAGTCTCTCAACATTA







GGTCCCATTTGGAAGCCAGTTCTGACCAGTGGAAGCG







TCTGCACCTTTCTCTGCAGGAACTTCTGGTGTGGCTAC







AGCTGAAAGATGATGAATTAAGCCGGCAGGCACCTAT







TGGAGGCGACTTTCCAGCAGTTCAGAAGCAGAACGAT







GTACATAGGGCCTTCAAGAGGGAATTGAAAACTAAAG







AACCTGTAATCATGAGTACTCTTGAGACTGTACGAATA







TTTCTGACAGAGCAGCCTTTGGAAGGACTAGAGAAAC







TCTACCAGGAGCCCAGAGAGCTGCCTCCTGAGGAGAG







AGCCCAGAATGTCACTCGGCTTCTACGAAAGCAGGCT







GAGGAGGTCAATACTGAGTGGGAAAAATTGAACCTGC







ACTCCGCTGACTGGCAGAGAAAAATAGATGAGACCCT







TGAAAGACTCCGGGAACTTCAAGAGGCCACGGATGAG







CTGGACCTCAAGCTGCGCCAAGCTGAGGTGATCAAGG







GATCCTGGCAGCCCGTGGGCGATCTCCTCATTGACTCT







CTCCAAGATCACCTCGAGAAAGTCAAGGCACTTCGAG







GAGAAATTGCGCCTCTGAAAGAGAACGTGAGCCACGT







CAATGACCTTGCTCGCCAGCTTACCACTTTGGGCATTC







AGCTCTCACCGTATAACCTCAGCACTCTGGAAGACCTG







AACACCAGATGGAAGCTTCTGCAGGTGGCCGTCGAGG







ACCGAGTCAGGCAGCTGCATGAAGCCCACAGGGACTT







TGGTCCAGCATCTCAGCACTTTCTTTCCACGTCTGTCCA







GGGTCCCTGGGAGAGAGCCATCTCGCCAAACAAAGTG







CCCTACTATATCAACCACGAGACTCAAACAACTTGCTG







GGACCATCCCAAAATGACAGAGCTCTACCAGTCTTTAG







CTGACCTGAATAATGTCAGATTCTCAGCTTATAGGACT







GCCATGAAACTCCGAAGACTGCAGAAGGCCCTTTGCTT







GGATCTCTTGAGCCTGTCAGCTGCATGTGATGCCTTGG







ACCAGCACAACCTCAAGCAAAATGACCAGCCCATGGA







TATCCTGCAGATTATTAATTGTTTGACCACTATTTATGA







CCGCCTGGAGCAAGAGCACAACAATTTGGTCAACGTC







CCTCTCTGCGTGGATATGTGTCTGAACTGGCTGCTGAA







TGTTTATGATACGGGACGAACAGGGAGGATCCGTGTC







CTGTCTTTTAAAACTGGCATCATTTCCCTGTGTAAAGCA







CATTTGGAAGACAAGTACAGATACCTTTTCAAGCAAGT







GGCAAGTTCAACAGGATTTTGTGACCAGCGCAGGCTG







GGCCTCCTTCTGCATGATTCTATCCAAATTCCAAGACA







GTTGGGTGAAGTTGCATCCTTTGGGGGCAGTAACATT







GAGCCAAGTGTCCGGAGCTGCTTCCAATTTGCTAATAA







TAAGCCAGAGATCGAAGCGGCCCTCTTCCTAGACTGG







ATGAGACTGGAACCCCAGTCCATGGTGTGGCTGCCCG







TCCTGCACAGAGTGGCTGCTGCAGAAACTGCCAAGCA







TCAGGCCAAATGTAACATCTGCAAAGAGTGTCCAATCA







TTGGATTCAGGTACAGGAGTCTAAAGCACTTTAATTAT







GACATCTGCCAAAGCTGCTTTTTTTCTGGTCGAGTTGC







AAAAGGCCATAAAATGCACTATCCCATGGTGGAATATT







GCACTCCGACTACATCAGGAGAAGATGTTCGAGACTTT







GCCAAGGTACTAAAAAACAAATTTCGAACCAAAAGGT







ATTTTGCGAAGCATCCCCGAATGGGCTACCTGCCAGTG







CAGACTGTCTTAGAGGGGGACAACATGGAAACTCCCG







TTACTCTGATCAACTTCTGGCCAGTAGATTCTGCGCCTG







CCTCGTCCCCTCAGCTTTCACACGATGATACTCATTCAC







GCATTGAACATTATGCTAGCAGGCTAGCAGAAATGGA







AAACAGCAATGGATCTTATCTAAATGATAGCATCTCTC







CTAATGAGAGCATAGATGATGAACATTTGTTAATCCAG







CATTACTGCCAAAGTTTGAACCAGGACTCCCCCCTGAG







CCAGCCTCGTAGTCCTGCCCAGATCTTGATTTCCTTAGA







GAGTGAGGAAAGAGGGGAGCTAGAGAGAATCCTAGC







AGATCTTGAGGAAGAAAACAGGAATCTGCAAGCAGAA







TATGACCGTCTAAAGCAGCAGCACGAACATAAAGGCC







TGTCCCCACTGCCGTCCCCTCCTGAAATGATGCCCACCT







CTCCCCAGAGTCCCCGGGATGCTGAGCTCATTGCTGAG







GCCAAGCTACTGCGTCAACACAAAGGCCGCCTGGAAG







CCAGGATGCAAATCCTGGAAGACCACAATAAACAGCT







GGAGTCACAGTTACACAGGCTAAGGCAGCTGCTGGAG







CAACCCCAGGCAGAGGCCAAAGTGAATGGCACAACG







GTGTCCTCTCCTTCTACCTCTCTACAGAGGTCCGACAGC







AGTCAGCCTATGCTGCTCCGAGTGGTTGGCAGTCAAAC







TTCGGACTCCATGGGTGAGGAAGATCTTCTCAGTCCTC







CCCAGGACACAAGCACAGGGTTAGAGGAGGTGATGG







AGCAACTCAACAACTCCTTCCCTAGTTCAAGAGGAAGA







AATACCCCTGGAAAGCCAATGAGAGAGGACACAATG


G17195.TCGA-06-0138-01A-02R-1849-01.2
69145970
58109543
InFrame
8313
ATGCCACTCCTGGGTCAGACGGTGAGGTCGGCGTCTG







CGAGGACGCGGCGGTGGAGTAGAAGGGCAGCCGGA







GACAGGCCCGGCGCCCCTTCCGAGGCTAGACGGCCCC







AGCTTCGCGGGGATCATGGCATTGTGGACCGAGTGCG







GGGCCACTGGCGAATCGCCGGCTCCTGTTCAACCTGCT







GGTGTCCATCTGCATTGTGTTCCTCAACAAATGGATTT







ATGTGTACCTTCCCCAACATGAGCCTGACCCTGGTGCA







CTTCGTGGTCACCTGGCTGGGCTTGTATATCTGCCAGA







AGCTGGACATCTTTGCCCCCAAAAGTCTGCCGCCCTCC







AGGCTCCTCCTCCTGGCCCTCAGCTTCTGTGGCTTTGTG







GTCTTCACTAACCTTTCTCTGCAGAACAACACCATAGG







CACCTATCAGCTGGCCAAGGCCATGACCACGCCGGTG







ATCATAGCCATCCAGACCTTCTGCTACCAGAAAACCTT







CTCCACCAGAATCCAGCTCACGCTGATTCCTATAACTTT







AGGTGTAATCCTAAATTCTTATTACGATGTGAAGTTTA







ATTTCCTTGGAATGGTGTTTGCTGCTCTTGGTGTTTTAG







TTACATCCCTTTATCAAGTGTGGGTAGGAGCCAAACAG







CATGAATTACAAGTGAACTCAATGCAGCTGCTGTACTA







CCAGGCTCCGATGTCATCTGCCATGTTGCTGGTTGCTG







TGCCCTTCTTTGAGCCAGTGTTTGGAGAAGGAGGAAT







ATTTGGTCCCTGGTCAGTTTCTGCTTTG|TTCCTCTGTG







ACGAGGGTGCAGGTATCTCTGGGGACTACATCGATCG







CGTGGACGAGCCCTTGTCCTGCTCTTATGTGCTGACCA







TTCGCACTCCTCGGCTCTGCCCCCACCCTCTCCTCCGGC







CCCCACCCAGTGCTGCACCGCAGGCCATCCTCTGTCAC







CCTTCCCTACAGCCTGAGGAGTACATGGCCTACGTTCA







GAGGCAAGCCGACTCAAAGCAGTATGGAGATAAAATC







ATAGAGGAGCTGCAAGATCTAGGCCCCCAAGTGTGGA







GTGAGACCAAGTCTGGGGTGGCACCCCAAAAGATGGC







AGGTGCGAGCCCGACCAAGGATGACAGTAAGGACTCA







GATTTCTGGAAGATGCTTAATGAGCCAGAGGACCAGG







CCCCAGGAGGGGAGGAGGTGCCGGCTGAGGAGCAGG







ACCCAAGCCCTGAGGCAGCAGATTCAGCTTCTGGTGCT







CCCAATGATTTTCAGAACAACGTGCAGGTCAAAGTCAT







TCGAAGCCCTGCGGATTTGATTCGATTCATAGAGGAGC







TGAAAGGTGGAACAAAAAAGGGGAAGCCAAATATAG







GCCAAGAGCAGCCTGTGGATGATGCTGCAGAAGTCCC







TCAGAGGGAACCAGAGAAGGAAAGGGGTGATCCAGA







ACGGCagagagagatggaagaagaggaggatgaggaatggatg







aggatgaagatgaggatgaACGGCAGTTACTGGGAGAATTT







GAGAAGGAACTGGAAGGGATCCTGCTTCCGTCAGACC







GAGACCGGCTCCGTTCGGAGGTGAAGGCTGGCATGG







AGCGGGAACTGGAAAACATCATCCAGGAGACAGAGA







AAGAGCTGGACCCAGATGGGCTGAAGAAGGAGTCAG







AGCGGGATCGGGCAATGCTGGCTCTCACATCCACTCTC







AACAAACTCATCAAAAGACTGGAGGAAAAACAGAGTC







CAGAGCTGGTGAAGAAGCACAAGAAAAAGAGGGTTG







TCCCCAAAAAGCCTCCCCCATCACCCCAACCTACAGAG







GAGGATCCTGAGCACAGAGTCCGGGTCCGGGTCACCA







AGCTCCGTCTCGGAGGCCCTAATCAGGATCTGACTGTC







CTCGAGATGAAACGGGAAAACCCACAGCTGAAACAAA







TCGAGGGGCTGGTGAAGGAGCTGCTGGAGAGGGAGG







GACTCACAGCTGCAGGGAAAATTGAGATCAAAATTGT







CCGCCCATGGGCTGAAGGGACTGAAGAGGGTGCACG







TTGGCTGACTGATGAGGACACGAGAAACCTCAAGGAG







ATCTTCTTCAATATCTTGGTGCCGGGAGCTGAAGAGGC







CCAGAAGGAACGCCAGCGGCAGAAAGAGCTGGAGAG







CAATTACCGCCGGGTGTGGGGCTCTCCAGGTGGGGAG







GGCACAGGGGACCTGGACGAATTTGACTTC


G17212.TCGA-06-0129-01A-01R-1849-01.2
140858113
163956014
InFrame
8314
ATGGTCCCAGAGGCCTGGAGGAGCGGACTGGTAAGC







ACCGGGAGGGTAGTGGGAGTTTTGCTTCTGCTTGGTG







CCTTGAACAAGGCTTCCACGGTCATTCACTATGAGATC







CCGGAGGAAAGAGAGAAGGGTTTCGCTGTGGGCAAC







GTGGTCGCGAACCTTGGTTTGGATCTCGGTAGCCTCTC







AGCCCGCAGGTTCCGGGTGGTGTCTGGAGCTAGCCGA







AGATTCTTTGAGGTGAACCGGGAGACCGGAGAGATGT







TTGTGAACGACCGTCTGGATCGAGAGGAGCTGTGTGG







GACACTGCCCTCTTGCACTGTAACTCTGGAGTTGGTAG







TGGAGAACCCGCTGGAGCTGTTCAGCGTGGAAGTGGT







GATCCAGGACATCAACGACAACAATCCTGCTTTCCCTA







CCCAGGAAATGAAATTGGAGATTAGCGAGGCCGTGGC







TCCGGGGACGCGCTTTCCGCTCGAGAGCGCGCACGAT







CCCGATGTGGGAAGCAACTCTTTACAAACCTATGAGCT







GAGCCGAAATGAATACTTTGCGCTTCGCGTGCAGACG







CGGGAGGACAGCACCAAGTACGCGGAGCTGGTGTTG







GAGCGCGCCCTGGACCGAGAACGGGAGCCTAGTCTCC







AGTTAGTGCTGACGGCGTTGGACGGAGGGACCCCAGC







TCTCTCCGCCAGCCTGCCTATTCACATCAAGGTGCTGG







ACGCGAATGACAATGCGCCTGTCTTCAACCAGTCCTTG







TACCGGGCGCGCGTCCTGGAGGATGCACCCTCCGGCA







CGCGCGTGGTACAAGTCCTTGCAACGGATCTGGATGA







AGGCCCCAACGGTGAAATTATTTACTCCTTCGGCAGCC







ACAACCGCGCCGGCGTGCGGCAACTATTCGCCTTAGA







CCTTGTAACCGGGATGCTGACAATCAAGGGTCGGCTG







GACTTCGAGGACACCAAACTCCATGAGATTTACATCCA







GGCCAAAGACAAGGGCGCCAATCCCGAAGGAGCACA







TTGCAAAGTGTTGGTGGAGGTTGTGGATGTGAATGAC







AACGCCCCGGAGATCACAGTCACCTCCGTGTACAGCCC







AGTACCCGAGGATGCCCCTCTGGGGACTGTCATCGCTT







TGCTCAGTGTGACTGACCTGGATGCTGGCGAGAACGG







GCTGGTGACCTGCGAAGTTCCACCGGGTCTCCCTTTCA







GCCTTACTTCTTCCCTCAAGAATTACTTCACTTTGAAAA







CCAGTGCAGACCTGGATCGGGAGACTGTGCCAGAATA







CAACCTCAGCATCACCGCCCGAGACGCCGGAACCCCTT







CCCTCTCAGCCCTTACAATAGTGCGTGTTCAAGTGTCC







GACATCAATGACAACCCTCCACAATCTTCTCAATCTTCC







TACGACGTTTACATTGAAGAAAACAACCTCCCCGGGGC







TCCAATACTAAACCTAAGTGTCTGGGACCCCGACGCCC







CGCAGAATGCTCGGCTTTCTTTCTTTCTCTTGGAGCAA







GGAGCTGAAACCGGGCTAGTGGGTCGCTATTTCACAA







TAAATCGTGACAATGGCATAGTGTCATCCTTAGTGCCC







CTAGACTATGAGGATCGGCGGGAATTTGAATTAACAG







CTCATATCAGCGATGGGGGCACCCCGGTCCTAGCCACC







AACATCAGCGTGAACATATTTGTCACTGATCGCAATGA







CAATGCCCCCCAGGTCCTATATCCTCGGCCAGGTGGGA







GCTCGGTGGAGATGCTGCCTCGAGGTACCTCAGCTGG







CCACCTAGTGTCACGGGTGGTAGGCTGGGACGCGGAT







GCAGGGCACAATGCCTGGCTCTCCTACAGTCTCTTGGG







ATCCCCTAACCAGAGCCTTTTTGCCATAGGGCTGCACA







CTGGTCAAATCAGTACTGCCCGTCCAGTCCAAGACACA







GATTCACCCAGGCAGACTCTCACGGTCTTGATCAAAGA







CAATGGGGAGCCTTCGCTCTCCACCACTGCTACCCTCA







CTGTGTCAGTAACCGAGGACTCTCCTGAAGCCCGAGCC







GAGTTCCCCTCTGGCTCTGCCCCCCGGGAGCAGAAAA







AAAATCTCACCTTTTATCTACTTCTTTCTCTAATCCTGGT







TTCTGTGGGGTTTGTGGTCACAGTGTTCGGAGTAATCA







TATTCAAAGTTTACAAGTGGAAGCAGTCTAGAGACCTA







TACCGAGCCCCGGTGAGCTCACTGTACCGAACACCAG







GGCCCTCCTTGCACGCGGACGCCGTGCGGGGAGGCCT







GATGTCGCCGCACCTTTACCATCAGGTGTATCTCACCA







CGGACTCCCGCCGCAGCGACCCGCTGCTGAAGAAACC







TGGTGCAGCCAGTCCACTGGCCAGCCGCCAGAACACG







CTGCGGAGCTGTGATCCGGTGTTCTATAGGCAGGTGT







TGGGTGCAGAGAGCGCCCCTCCCGGACAG|GAGGAG







CAAAATAGAGGCAAGCCCAATTGGGAGCATCTAAATG







AAGATTTACATGTACTAATCACTGTGGAAGATGCTCAG







AACAGAGCAGAAATCAAATTGAAGAGAGCAGTTGAAG







AAGTGAAGAAATTATTGGTACCTGCAGCAGAAGGAGA







AGACAGCCTGAAGAAGATGCAGCTGATGGAGCTTGCG







ATTCTGAATGGCACCTACAGAGATGCCAACATTAAATC







ACCAGCCCTTGCCTTTTCTCTTGCAGCAACAGCCCAGG







CTGCTCCAAGGATCATTACTGGGCCTGCGCCGGTTCTC







CCACCAGCTGCCCTGCGTACTCCTACGCCAGCTGGCCC







TACCATAATGCCTTTGATCAGACAAATACAGACCGCTG







TCATGCCAAACGGAACTCCTCACCCAACTGCTGCAATA







GTTCCTCCAGGGCCCGAAGCTGGTTTAATCTATACACC







CTATGAGTACCCCTACACATTGGCACCAGCTACATCAA







TCCTTGAGTATCCTATTGAACCTAGTGGTGTATTAGGT







GCGGTGGCTACTAAAGTTCGAAGGCACGATATGCGTG







TCCATCCTTACCAAAGGATTGTGACCGCAGACCGAGCC







GCCACCGGCAAC


G17213.TCGA-06-0157-01A-01R-1849-01.2
2062889
90075838
InFrame
8315
ATGGCGGACATCGAGCAGTACTACATGAAGCCGCCCG







|AGATCGTTAAGGAGGCTGAGGTGCCGCAGGCTGCGC







TGGGCGTCCCAGCCCAGGGGACAGGGGACAATGGCC







ACACGCCTGTGGAGGAGGAGGTCGGGGGCATCCCAG







TACCAGCACCGGGGCTCCTGCAGGTCACGGAGAGGAG







GCAGCCTCTGAGCAGCGTCTCCTCTCTGGAGGTCCACT







TCGACCTCCTGGACCTCACTGAGCTCACCGACATGTCG







GACCAGGAGCTGGCCGAGGTCTTTGCTGACTCGGACG







ACGAGAACCTCAACACCGAGTCCCCAGCAGGTCTGCA







CCCGCTGCCCCGGGCCGGCTACCTGCGCTCCCCTTCCT







GGACGAGGACAAGGGCTGAGCAGAGCCACGAGAAGC







AGCCCCTAGGCGACCCCGAGCGGCAGGCCACAGTCCT







GGACACGTTTCTCACTGTGGAGAGGCCCCAGGAGGAC


G17200.TCGA-06-0125-01A-01R-1849-01.2
27094490
24624366
InFrame
8316
ATGGCCGCGCAGGTCGCCCCCGCCGCCGCCAGCAGCC







TGGGCAACCCGCCGCCGCCGCCGCCCTCGGAGCTGAA







GAAAGCCGAgcagcagcagcgggaggaggcggggggcgaggcg







gcggcggcggcagcggcCGAGCGCGGGGAAATGAAGGCA







GCCGCCGGGCAGGAAAGCGAGGGCCCCGCCGTGGGG







CCGCCGCAGCCGCTGGGAAAGGAGCTGCAGGACGGG







GCCGAGAGCAATGGGggtggcggcggcggcggacccagc







ggcggcggGCCCGGCGCGGAGCCGGACCTGAAGAACTC







GAACGGGAACGCGGGCCCTAGGCCCGCCCTGAACAAT







AACCTCACGGAGCCGCCcggcggcggcggtggcggcagcagc







gatggggtgggggcgCCTCCTCACTCAGCCGCGGCCGCCTT







GCCGCCCCCAGCCTACGGCTTCGGGCAACCCTACGGCC







GGAGCCCGTCTGCCGTCGCCGCCGCCGCGGCCGCCGT







CTTCCACCAACAACATGGCGGACAACAAAGCCCTGGC







CTGGCAGCGCTGCAGAGCGGCGGCGGCGGGGGCCTG







GAGCCCTACGCGGGGCCCCAGCAGAACTCTCACGACC







ACGGCTTCCCCAACCACCAGTACAACTCCTACTACCCC







AACCGCAGCGCCTACCCCCCGCCCGCCCCGGCCTACGC







GCTGAGCTCCCCGAGAGGTGGCACTCCGGGCTCCGGC







GCGGCGGCGGCTGCCGGCTCCAAGCCGCCTCCCTCCTC







CAGCGCCTCCGCCTCCTCGTCGTCTTCGTCCTTCGCTCA







GCAGCGCTTCGGGGCCATGGGGGGAGGCGGCCCCTC







CGCGGCCGGCGGGGGAACTCCCCAGCCCACCGCCACC







CCCACCCTCAACCAACTGCTCACGTCGCCCAGCTCGGC







CCGGGGCTACCAGGGCTACCCCGGGGGCGACTACAGT







GGCGGGCCCCAGGACGGGGGCGCCGGCAAGGGCCCG







GCGGACATGGCCTCGCAGTGTTGGGGggctgcggcggcgg







cagctgcggcggcggcCGCCTCGGGAGGGGCCCAACAAAG







GAGCCACCACGCGCCCATGAGCCCCGGGAGCAGCGGC







GGCGGGGGGCAGCCGCTCGCCCGGACCCCTCAGCCAT







CCAGTCCAATGGATCAGATGGGCAAGATGAGACCTCA







GCCATATGGCGGGACTAACCCATACTCGCAGCAACAG







GGACCTCCGTCAGGACCGCAGCAAGGACATGGGTACC







CAGGGCAGCCATACGGGTCCCAGACCCCGCAGCGGTA







CCCGATGACCATGCAGGGCCGGGCGCAGAGTGCCATG







GGCGGCCTCTCTTATACACAGCAGATTCCTCCTTATGG







ACAACAAGGCCCCAGCGGGTATGGTCAACAGGGCCAG







ACTCCATATTACAACCAGCAAAGTCCTCACCCTCAGCA







GCAGCAGCCACCCTACTCCCAGCAACCACCGTCCCAGA







CCCCTCATGCCCAACCTTCGTATCAGCAGCAGCCACAG







TCTCAACCACCACAGCTCCAGTCCTCTCAGCCTCCATAC







TCCCAGCAGCCATCCCAGCCTCCACATCAGCAGTCCCC







GGCTCCATACCCCTCCCAGCAGTCGACGACACAGCAGC







ACCCCCAGAGCCAGCCCCCCTACTCACAGCCACAGGCT







CAGTCTCCTTACCAGCAGCAGCAACCTCAGCAGCCAGC







ACCCTCGACGCTCTCCCAGCAGGCTGCGTATCCTCAGC







CCCAGTCTCAGCAGTCCCAGCAAACTGCCTATTCCCAG







CAGCGCTTCCCTCCACCGCAGGAGCTATCTCAAGATTC







ATTTGGGTCTCAGGCATCCTCAGCCCCCTCAATGACCT







CCAGTAAGGGAGGGCAAGAAGATATGAACCTGAGCCT







TCAGTCAAGACCCTCCAGCTTGCCTGATCTATCTGGTTC







AATAGATGACCTCCCCATGGGGACAGAAGGAGCTCTG







AGTCCTGGAGTGAGCACATCAGGGATTTCCAGCAGCC







AAGGAGAGCAGAGTAATCCAGCTCAGTCTCCTTTCTCT







CCTCATACCTCCCCTCACCTGCCTGGCATCCGAGGCCCT







TCCCCGTCCCCTGTTGGCTCTCCCGCCAGTGTTGCTCAG







TCTCGCTCAGGACCACTCTCGCCTGCTGCAGTGCCAGG







CAACCAGATGCCACCTCGGCCACCCAGTGGCCAGTCG







GACAGCATCATGCATCCTTCCATGAACCAATCAAGCAT







TGCCCAAGATCGAGGTTATATGCAGAGGAACCCCCAG







ATGCCCCAGTACAGTTCCCCCCAGCCCGGCTCAGCCTT







ATCTCCGCGTCAGCCTTCCGGAGGACAGATACACACA







GGCATGGGCTCCTACCAGCAGAACTCCATGGGGAGCT







ATGGTCCCCAGGGGGGTCAGTATGGCCCACAAGGTGG







CTACCCCAGGCAGCCAAACTATAATGCCTTGCCCAATG







CCAACTACCCCAGTGCAGGCATGGCTGGAGGCATAAA







CCCCATGGGTGCCGGAGGTCAAATGCATGGACAGCCT







GGCATCCCACCTTATGGCACACTCCCTCCAGGGAGGAT







GAGTCACGCCTCCATGGGCAACCGGCCTTATGGCCCTA







ACATGGCCAATATGCCACCTCAGGTTGGGTCAGGGAT







GTGTCCCCCACCAGGGGGCATGAACCGGAAAACCCAA







GAAACTGCTGTCGCCATGCATGTTGCTGCCAACTCTAT







CCAAAACAGGCCGCCAGGCTACCCCAATATGAATCAA







GGGGGCATGATGGGAACTGGACCTCCTTATGGACAAG







GGATTAATAGTATGGCTGGCATGATCAACCCTCAGGG







ACCCCCATATTCCATGGGTGGAACCATGGCCAACAATT







CTGCAGGGATGGCAGCCAGCCCAGAGATGATGGGCCT







TGGGGATGTAAAGTTAACTCCAGCCACCAAAATGAAC







AACAAGGCAGATGGGACACCCAAGACAGAATCCAAAT







CCAAGAAATCCAGTTCTTCTACTACAACCAATGAGAAG







ATCACCAAGTTGTATGAGCTGGGTGGTGAGCCTGAGA







GGAAGATGTGGGTGGACCGTTATCTGGCCTTCACTGA







GGAGAAGGCCATGGGCATGACAAATCTGCCTGCTGTG







GGTAGGAAACCTCTGGACCTCTATCGCCTCTATGTGTC







TGTGAAGGAGATTGGTGGATTGACTCAG|ATGCAGGC







CCTGACTTCCTGTGAGTGCACCATCTGTCCTGACTGCTT







CCGCCAGCACTTCACCATCGCCTTGAAGGAGAAGCAC







ATCACAGACATGGTGTGCCCTGCCTGTGGCCGCCCCGA







CCTCACCGATGACACACAGTTGCTCAGCTACTTCTCTAC







CCTTGACATCCAGCTTCGCGAGAGCCTAGAGCCAGAT







GCCTATGCGTTGTTCCATAAGAAGCTGACCGAGGGTG







TGCTGATGCGGGACCCCAAGTTCTTGTGGTGTGCCCAG







TGCTCCTTTGGCTTCATATATGAGCGTGAGCAGCTGGA







GGCAACTTGTCCCCAGTGTCACCAGACCTTCTGTGTGC







GCTGCAAGCGCCAGTGGGAGGAGCAGCACCGAGGTC







GGAGCTGTGAGGACTTCCAGAACTGGAAACGCATGAA







CGACCCAGAATACCAGGCCCAGGGCCTAGCAATGTAT







CTTCAGGAAAACGGCATTGACTGCCCCAAATGCAAGTT







CTCGTACGCCCTGGCCCGAGGAGGCTGCATGCACTTTC







ACTGTACCCAGTGCCGCCACCAGTTCTGCAGCGGCTGC







TACAATGCCTTTTACGCCAAGAATAAATGTCCAGAGCC







TAACTGCAGGGTGAAAAAGTCCCTGCACGGCCACCAC







CCTCGAGACTGCCTCTTCTACCTGCGGGACTGGACTGC







TCTCCGGCTTCAGAAGCTGCTACAGGACAATAACGTCA







TGTTTAATACAGAGCCTCCAGCTGGGGCCCGGGCAGT







CCCTGGAGGCGGCTGCCGAGTGATAGAGCAGAAGGA







GGTTCCCAATGGGCTCAGGGACGAAGCTTGTGGCAAG







GAAACTCCAGCTGGCTATGCCGGCCTGTGCCAGGCAC







ACTACAAAGAGTATCTTGTGAGCCTCATCAATGCCCAC







TCGCTGGACCCAGCCACCTTGTATGAGGTGGAAGAGC







TGGAGACGGCCACTGAGCGCTACCTGCACGTACGCCC







CCAGCCTTTGGCTGGAGAGGATCCCCCTGCTTACCAGG







CCCGCTTGTTACAGAAGCTGACAGAAGAGGTACCCTT







GGGACAGAGTATCCCCCGCAGGCGGAAG


G17804.TCGA-06-5408-01A-01R-1849-01.4
55268105
56079562
InFrame
8317
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG







CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT







GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA







GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA







GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT







GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|GATGCT







TTCATTGGATTTGGAGGAAATGTGATCAGGCAACAAG







TCAAGGATAACGCCAAATGGTATATCACTGATTTTGTA







GAGCTGCTGGGAGAACTGGAAGAA


G17656.TCGA-28-2514-01A-02R-1850-01.2
27024031
49202124
InFrame
8318
ATGGCCGCGCAGGTCGCCCCCGCCGCCGCCAGCAGCC







TGGGCAACCCGCCGCCGCCGCCGCCCTCGGAGCTGAA







GAAAGCCGAgcagcagcagcgggaggaggcggggggcgaggcg







gcggcggcggcagcggcCGAGCGCGGGGAAATGAAGGCA







GCCGCCGGGCAGGAAAGCGAGGGCCCCGCCGTGGGG







CCGCCGCAGCCGCTGGGAAAGGAGCTGCAGGACGGG







GCCGAGAGCAATGGGggtggcggcggcggcggagccggcagc







ggcggcggGCCCGGCGCGGAGCCGGACCTGAAGAACTC







GAACGGGAACGCGGGCCCTAGGCCCGCCCTGAACAAT







AACCTCACGGAGCCGCCcggcggcggcggtggcggcagcagc







gatggggtgggggcgCCTCCTCACTCAGCCGCGGCCGCCTT







GCCGCCCCCAGCCTACGGCTTCGGGCAACCCTACGGCC







GGAGCCCGTCTGCCGTCGCCGCCGCCGCGGCCGCCGT







CTTCCACCAACAACATGGCGGACAACAAAGCCCTGGC







CTGGCAGCGCTGCAGAGCGGCGGCGGCGGGGGCCTG







GAGCCCTACGCGGGGCCCCAGCAGAACTCTCACGACC







ACGGCTTCCCCAACCACCAGTACAACTCCTACTACCCC







AACCGCAGCGCCTACCCCCCGCCCGCCCCGGCCTACGC







GCTGAGCTCCCCGAGAGGTGGCACTCCGGGCTCCGGC







GCGGCGGCGGCTGCCGGCTCCAAGCCGCCTCCCTCCTC







CAGCGCCTCCGCCTCCTCGTCGTCTTCGTCCTTCGCTCA







GCAGCGCTTCGGGGCCATGGGGGGAGGCGGCCCCTC







CGCGGCCGGCGGGGGAACTCCCCAGCCCACCGCCACC







CCCACCCTCAACCAACTGCTCACGTCGCCCAGCTCGGC







CCGGGGCTACCAGGGCTACCCCGGGGGCGACTACAGT







GGCGGGCCCCAGGACGGGGGCGCCGGCAAGGGCCCG







GCGGACATGGCCTCGCAGTGTTGGGGggctgcggcggcgg







cagctgcggcggcggcCGCCTCGGGAGGGGCCCAACAAAG







GAGCCACCACGCGCCCATGAGCCCCGGGAGCAGCGGC







GGCGGGGGGCAGCCGCTCGCCCGGACCCCTCAGTGTC







CATCTGGGAAGCGGGATTTGGGTTGATGAGGAGAAAT







GGCACCAGCTACAAGTAACCCAAGGAGATTCCAAGTA







CACGAAGAACTTGGCAGTTATGATTTGGGGAACAGAT







GTTCTGAAAAACAGAAGCGTCACAGGCGTCGCCACAA







AAAAAAAGAAAGATGCAGTCCCTAAACCACCCCTCTCG







CCTCACAAACTAAGCATCGTCAGAGAGTGTTTGTATGA







CAGAATAGCACAAGAAACTGTGGATGAAACTGAAATT







GCACAGAGACTCTCCAAAGTCAACAAGTACATCTGTGA







AAAAATCATGGATATCAATAAATCCTGTAAAAATGAAG







AACGAAGGGAAGCAAAATACAATTTGCAA


G17654.TCGA-41-4097-01A-01R-1850-01.2
146017814
156294880
InFrame
8319
ATGCTTCTCTCGTTACCTGCCTTACATCTTCAGACCTCC







GAACACCATCCTTTCTTCCAGCTGCCACACAGAAGGCT







CGGACCATGGTGCAGTCCCACTGGCTCCCCTGCCCCCC







TCTCCTGTGAGACTGGCTGCGGGGAGGGATCATGGAT







ACTTGTCTGCCGGCTTCTGGTTCCCACGCAAGTAAGCC







TGCTGTCAATGGAGGAGGACATTGATACCCGCAAAAT







CAACAACAGTTTCCTGCGCGACCACAGCTATGCGACCG







AAGCTGACATTATCTCTACGGTAGAATTCAACCACACG







GGAGAATTACTAGCGACAGGGGACAAGGGGGGTCGG







GTTGTAATATTTCAACGAGAGCAGGAGAGTAAAAATC







AGGTTCATCGTAGGGGTGAATACAATGTTTACAGCAC







ATTCCAGAGCCATGAACCCGAGTTCGATTACCTGAAGA







GTTTAGAAATAGAAGAAAAAATCAATAAAATAAGATG







GCTCCCCCAGCAGAATGCAGCTTACTTTCTTCTGTCTAC







TAATGATAAAACTGTGAAGCTGTGGAAAGTCAGCGAG







CGTGATAAGAGGCCAGAAGGCTACAATCTGAAAGATG







AGGAGGGCCGGCTCCGGGATCCTGCCACCATCACAAC







CCTGCGGGTGCCTGTCCTGAGACCCATGGACCTGATG







GTGGAGGCCACCCCACGAAGAGTATTTGCCAACGCAC







ACACATATCACATCAACTCCATATCTGTCAACAGCGAC







TATGAAACCTACATGTCCGCTGATGACCTGAGGATTAA







CCTATGGAACTTTGAAATAACCAATCAAAGTTTTAATA







TTGTGGACATTAAGCCAGCCAACATGGAGGAGCTCAC







GGAGGTGATCACAGCAGCCGAGTTCCACCCCCATCATT







GCAACACCTTCGTGTACAGCAGCAGCAAAGGGACAAT







CCGGCTGTGTGACATGCGGGCATCTGCCCTGTGTGAC







AGGCACACCAAAT|CAGGGGAAATGCTGTCTGTAGCT







GAGCACTTCCTGGAGCAGCAGATGCACCCAACAGTGG







TGATCAGTGCTTACCGCAAGGCATTGGATGATATGATC







AGCACCCTAAAGAAAATAAGTATCCCAGTCGACATCAG







TGACAGTGATATGATGCTGAACATCATCAACAGCTCTA







TTACTACCAAAGCCATCAGTCGGTGGTCATCTTTGGCT







TGCAACATTGCCCTGGATGCTGTCAAGATGGTACAGTT







TGAGGAGAATGGTCGGAAAGAGATTGACATAAAAAA







ATATGCAAGAGTGGAAAAGATACCTGGAGGCATCATT







GAAGACTCCTGTGTCTTGCGTGGAGTCATGATTAACAA







GGATGTGACCCATCCACGTATGCGGCGCTATATCAAG







AACCCTCGCATTGTGCTGCTGGATTCTTCTCTGGAATAC







AAGAAAGGAGAAAGCCAGACTGACATTGAGATTACAC







GAGAGGAGGACTTCACCCGAATTCTCCAGATGGAGGA







AGAGTACATCCAGCAGCTCTGTGAGGACATTATCCAAC







TGAAGCCCGATGTGGTCATCACTGAAAAGGGCATCTC







AGATTTAGCTCAGCACTACCTTATGCGGGCCAATATCA







CAGCCATCCGCAGAGTCCGGAAGACAGACAATAATCG







CATTGCTAGAGCCTGTGGGGCCCGGATAGTCAGCCGA







CCAGAGGAACTGAGAGAAGATGATGTTGGAACAGGA







GCAGGCCTGTTGGAAATCAAGAAAATTGGAGATGAAT







ACTTTACTTTCATCACTGACTGCAAAGACCCCAAGGCC







TGCACCATTCTCCTCCGGGGGGCTAGCAAAGAGATTCT







CTCGGAAGTAGAACGCAACCTCCAGGATGCCATGCAA







GTGTGTCGCAATGTTCTCCTGGACCCTCAGCTGGTGCC







AGGGGGTGGGGCCTCCGAGATGGCTGTGGCCCATGCC







TTGACAGAAAAATCCAAGGCCATGACTGGTGTGGAAC







AATGGCCATACAGGGCTGTTGCCCAGGCCCTAGAGGT







CATTCCTCGTACCCTGATCCAGAACTGTGGGGCCAGCA







CCATCCGTCTACTTACCTCCCTTCGGGCCAAGCACACCC







AGGAGAACTGTGAGACCTGGGGTGTAAATGGTGAGA







CGGGTACTTTGGTGGACATGAAGGAACTGGGCATATG







GGAGCCATTGGCTGTGAAGCTGCAGACTTATAAGACA







GCAGTGGAGACGGCAGTTCTGCTACTGCGAATTGATG







ACATCGTTTCAGGCCACAAAAAGAAAGGCGATGACCA







GAGCCGGCAAGGCGGGGCTCCTGATGCTGGCCAGGA







G


G17206.TCGA-06-0125-02A-11R-2005-01.2
27094490
24624366
InFrame
8320
ATGGCCGCGCAGGTCGCCCCCGCCGCCGCCAGCAGCC







TGGGCAACCCGCCGCCGCCGCCGCCCTCGGAGCTGAA







GAAAGCCGAgcagcagcagcgggaggaggcggggggcgaggcg







gcggcggcggcagcggcCGAGCGCGGGGAAATGAAGGCA







GCCGCCGGGCAGGAAAGCGAGGGCCCCGCCGTGGGG







CCGCCGCAGCCGCTGGGAAAGGAGCTGCAGGACGGG







GCCGAGAGCAATGGGggtggcggcggcggcggacccagc







ggcggcggGCCCGGCGCGGAGCCGGACCTGAAGAACTC







GAACGGGAACGCGGGCCCTAGGCCCGCCCTGAACAAT







AACCTCACGGAGCCGCCcggcggcggcggtggcggcagcagc







gatggggtgggggcgCCTCCTCACTCAGCCGCGGCCGCCTT







GCCGCCCCCAGCCTACGGCTTCGGGCAACCCTACGGCC







GGAGCCCGTCTGCCGTCGCCGCCGCCGCGGCCGCCGT







CTTCCACCAACAACATGGCGGACAACAAAGCCCTGGC







CTGGCAGCGCTGCAGAGCGGCGGCGGCGGGGGCCTG







GAGCCCTACGCGGGGCCCCAGCAGAACTCTCACGACC







ACGGCTTCCCCAACCACCAGTACAACTCCTACTACCCC







AACCGCAGCGCCTACCCCCCGCCCGCCCCGGCCTACGC







GCTGAGCTCCCCGAGAGGTGGCACTCCGGGCTCCGGC







GCGGCGGCGGCTGCCGGCTCCAAGCCGCCTCCCTCCTC







CAGCGCCTCCGCCTCCTCGTCGTCTTCGTCCTTCGCTCA







GCAGCGCTTCGGGGCCATGGGGGGAGGCGGCCCCTC







CGCGGCCGGCGGGGGAACTCCCCAGCCCACCGCCACC







CCCACCCTCAACCAACTGCTCACGTCGCCCAGCTCGGC







CCGGGGCTACCAGGGCTACCCCGGGGGCGACTACAGT







GGCGGGCCCCAGGACGGGGGCGCCGGCAAGGGCCCG







GCGGACATGGCCTCGCAGTGTTGGGGggctgcggcggcgg







cagctgcggcggcggcCGCCTCGGGAGGGGCCCAACAAAG







GAGCCACCACGCGCCCATGAGCCCCGGGAGCAGCGGC







GGCGGGGGGCAGCCGCTCGCCCGGACCCCTCAGCCAT







CCAGTCCAATGGATCAGATGGGCAAGATGAGACCTCA







GCCATATGGCGGGACTAACCCATACTCGCAGCAACAG







GGACCTCCGTCAGGACCGCAGCAAGGACATGGGTACC







CAGGGCAGCCATACGGGTCCCAGACCCCGCAGCGGTA







CCCGATGACCATGCAGGGCCGGGCGCAGAGTGCCATG







GGCGGCCTCTCTTATACACAGCAGATTCCTCCTTATGG







ACAACAAGGCCCCAGCGGGTATGGTCAACAGGGCCAG







ACTCCATATTACAACCAGCAAAGTCCTCACCCTCAGCA







GCAGCAGCCACCCTACTCCCAGCAACCACCGTCCCAGA







CCCCTCATGCCCAACCTTCGTATCAGCAGCAGCCACAG







TCTCAACCACCACAGCTCCAGTCCTCTCAGCCTCCATAC







TCCCAGCAGCCATCCCAGCCTCCACATCAGCAGTCCCC







GGCTCCATACCCCTCCCAGCAGTCGACGACACAGCAGC







ACCCCCAGAGCCAGCCCCCCTACTCACAGCCACAGGCT







CAGTCTCCTTACCAGCAGCAGCAACCTCAGCAGCCAGC







ACCCTCGACGCTCTCCCAGCAGGCTGCGTATCCTCAGC







CCCAGTCTCAGCAGTCCCAGCAAACTGCCTATTCCCAG







CAGCGCTTCCCTCCACCGCAGGAGCTATCTCAAGATTC







ATTTGGGTCTCAGGCATCCTCAGCCCCCTCAATGACCT







CCAGTAAGGGAGGGCAAGAAGATATGAACCTGAGCCT







TCAGTCAAGACCCTCCAGCTTGCCTGATCTATCTGGTTC







AATAGATGACCTCCCCATGGGGACAGAAGGAGCTCTG







AGTCCTGGAGTGAGCACATCAGGGATTTCCAGCAGCC







AAGGAGAGCAGAGTAATCCAGCTCAGTCTCCTTTCTCT







CCTCATACCTCCCCTCACCTGCCTGGCATCCGAGGCCCT







TCCCCGTCCCCTGTTGGCTCTCCCGCCAGTGTTGCTCAG







TCTCGCTCAGGACCACTCTCGCCTGCTGCAGTGCCAGG







CAACCAGATGCCACCTCGGCCACCCAGTGGCCAGTCG







GACAGCATCATGCATCCTTCCATGAACCAATCAAGCAT







TGCCCAAGATCGAGGTTATATGCAGAGGAACCCCCAG







ATGCCCCAGTACAGTTCCCCCCAGCCCGGCTCAGCCTT







ATCTCCGCGTCAGCCTTCCGGAGGACAGATACACACA







GGCATGGGCTCCTACCAGCAGAACTCCATGGGGAGCT







ATGGTCCCCAGGGGGGTCAGTATGGCCCACAAGGTGG







CTACCCCAGGCAGCCAAACTATAATGCCTTGCCCAATG







CCAACTACCCCAGTGCAGGCATGGCTGGAGGCATAAA







CCCCATGGGTGCCGGAGGTCAAATGCATGGACAGCCT







GGCATCCCACCTTATGGCACACTCCCTCCAGGGAGGAT







GAGTCACGCCTCCATGGGCAACCGGCCTTATGGCCCTA







ACATGGCCAATATGCCACCTCAGGTTGGGTCAGGGAT







GTGTCCCCCACCAGGGGGCATGAACCGGAAAACCCAA







GAAACTGCTGTCGCCATGCATGTTGCTGCCAACTCTAT







CCAAAACAGGCCGCCAGGCTACCCCAATATGAATCAA







GGGGGCATGATGGGAACTGGACCTCCTTATGGACAAG







GGATTAATAGTATGGCTGGCATGATCAACCCTCAGGG







ACCCCCATATTCCATGGGTGGAACCATGGCCAACAATT







CTGCAGGGATGGCAGCCAGCCCAGAGATGATGGGCCT







TGGGGATGTAAAGTTAACTCCAGCCACCAAAATGAAC







AACAAGGCAGATGGGACACCCAAGACAGAATCCAAAT







CCAAGAAATCCAGTTCTTCTACTACAACCAATGAGAAG







ATCACCAAGTTGTATGAGCTGGGTGGTGAGCCTGAGA







GGAAGATGTGGGTGGACCGTTATCTGGCCTTCACTGA







GGAGAAGGCCATGGGCATGACAAATCTGCCTGCTGTG







GGTAGGAAACCTCTGGACCTCTATCGCCTCTATGTGTC







TGTGAAGGAGATTGGTGGATTGACTCAG|ATGCAGGC







CCTGACTTCCTGTGAGTGCACCATCTGTCCTGACTGCTT







CCGCCAGCACTTCACCATCGCCTTGAAGGAGAAGCAC







ATCACAGACATGGTGTGCCCTGCCTGTGGCCGCCCCGA







CCTCACCGATGACACACAGTTGCTCAGCTACTTCTCTAC







CCTTGACATCCAGCTTCGCGAGAGCCTAGAGCCAGAT







GCCTATGCGTTGTTCCATAAGAAGCTGACCGAGGGTG







TGCTGATGCGGGACCCCAAGTTCTTGTGGTGTGCCCAG







TGCTCCTTTGGCTTCATATATGAGCGTGAGCAGCTGGA







GGCAACTTGTCCCCAGTGTCACCAGACCTTCTGTGTGC







GCTGCAAGCGCCAGTGGGAGGAGCAGCACCGAGGTC







GGAGCTGTGAGGACTTCCAGAACTGGAAACGCATGAA







CGACCCAGAATACCAGGCCCAGGGCCTAGCAATGTAT







CTTCAGGAAAACGGCATTGACTGCCCCAAATGCAAGTT







CTCGTACGCCCTGGCCCGAGGAGGCTGCATGCACTTTC







ACTGTACCCAGTGCCGCCACCAGTTCTGCAGCGGCTGC







TACAATGCCTTTTACGCCAAGAATAAATGTCCAGAGCC







TAACTGCAGGGTGAAAAAGTCCCTGCACGGCCACCAC







CCTCGAGACTGCCTCTTCTACCTGCGGGACTGGACTGC







TCTCCGGCTTCAGAAGCTGCTACAGGACAATAACGTCA







TGTTTAATACAGAGCCTCCAGCTGGGGCCCGGGCAGT







CCCTGGAGGCGGCTGCCGAGTGATAGAGCAGAAGGA







GGTTCCCAATGGGCTCAGGGACGAAGCTTGTGGCAAG







GAAACTCCAGCTGGCTATGCCGGCCTGTGCCAGGCAC







ACTACAAAGAGTATCTTGTGAGCCTCATCAATGCCCAC







TCGCTGGACCCAGCCACCTTGTATGAGGTGGAAGAGC







TGGAGACGGCCACTGAGCGCTACCTGCACGTACGCCC







CCAGCCTTTGGCTGGAGAGGATCCCCCTGCTTACCAGG







CCCGCTTGTTACAGAAGCTGACAGAAGAGGTACCCTT







GGGACAGAGTATCCCCCGCAGGCGGAAG


G17792.TCGA-28-5204-01A-01R-1850-01.4
36204176
33911962
InFrame
8321
ATGGCTGAGCTGGATCCGTTCGGCGCCCCTGCCGGCG







CCCCTGGCGGTCCCGCGCTGGGGAACGGAGTGGCCG







GCGCCGGCGAAGAAGACCCGGCTGCGGCCTTCTTGGC







GCAGCAAGAGAGCGAGATTGCGGGCATCGAGAACGA







CGAGGCCTTCGCCATCCTGGAcggcggcgcccccgggcccca







gccgcacggcgagccgccggggggTCCGGATGCTGTTGATGG







AGTAATGAATGGTGAATACTACCAGGAAAGTAATGGT







CCAACAGACAGTTATGCAGCTATTTCACAAGTGGATCG







ATTGCAGTCAGAGCCTGAAAGTATCCGTAAATGGAGA







GAAGAACAAATGGAACGCTTGGAAGCCCTTGATGCCA







ATTCTCGGAAGCAAGAAGCAGAGTGGAAAGAAAAGG







CAATAAAGGAGCTAGAAGAATGGTATGCAAGACAGG







ACGAGCAGCTACAGAAAACAAAAGCAAACAACAG|GA







CTATCCTATTAAGTGTAATCTCACTGCTTAATGAGCCCA







ACACCTTCTCCCCAGCCAATGTCGATGCTTCAGTTATGT







TCAGGAAATGGAGAGACAGTAAAGGAAAAGACAAAG







AATATGCTGAAATTATTAGGAAACAAGTTTCAGCCACT







AAGGCCGAAGCAGAAAAGGATGGAGTGAAGGTCCCC







ACAACCCTGGCGGAATACTGCATCAAAACTAAAGTGC







CTTCCAATGACAACAGCTCAGATTTGCTTTACGACGAC







TTGTatgatgacgacattgatgatgaagatgaggaggaggaagatgc







cgactgttatgatgatgatgatTCTGGGAATGAGGAGTCG


G17663.TCGA-19-2619-01A-01R-1850-01.2
205719085
205811017
InFrame
8322
ATGTCGCGGCCTGTCAG|GTTTATGGAGAGAAATCCCT







TAACCAATGCAATAATCAGGACCACCACGGCACTCACC







ATATTCAAAGCAGGGGTCAAGTTCAATGTCATCCCCCC







AGTGGCCCAGGCCACAGTCAACTTCCGGATTCACCCTG







GACAGACAGTCCAAGAGGTCCTAGAACTCACGAAGAA







CATTGTGGCTGATAACAGAGTCCAGTTCCATGTGTTGA







GTGCCTTTGACCCCCTCCCCGTCAGCCCTTCTGATGACA







AGGCCTTGGGCTACCAGCTGCTCCGCCAGACCGTACA







GTCCGTCTTCCCGGAAGTCAATATTACTGCCCCAGTTA







CTTCTATTGGCAACACAGACAGCCGATTCTTTACAAAC







CTCACCACTGGCATCTACAGGTTCTACCCCATCTACATA







CAGCCTGAAGACTTCAAACGCATCCATGGAGTCAACG







AGAAAATCTCAGTCCAAGCCTATGAGACCCAAGTGAA







ATTCATCTTTGAGTTGATTCAGAATGCTGACACAGACC







AGGAGCCAGTTTCTCACCTGCACAAACTG


NYU_E
69333677
69349911
InFrame
8323
ATGAGGGTAGCTGCGGCGACTGCGGCGGCTGGAGCG







GGGCCGGCCATGGCGGTGTGGACGCGGGCCACCAAA







GCGGGGCTGGTGGAGCTGCTCCTGAGGGAGCGCTGG







GTCCGAGTGGTGGCCGAGCTGAGCGGGGAGAGCCTG







AGCCTGACGGGCGACGCCGCCGCGGCCGAGCTGGAG







CCCGCTCTGGGACCCGCGGCCGCCGCCTTCAACGGCCT







CCCAAACGGCGGCGGCGCGGGCGACTCGCTGCCCGG







GAGCCCAAgccgcggcctggggcccccgagcccgccggcgccgcct







cggggccccgcgggtgaggcgggcgcgtcgccgcccgtgcccgggT







GCGGGTGGTGAAGCAAGAGGCGGGCGGCCTGGGCAT







CAGCATCAAGGGCGGCCGCGAGAACCGGATGCCGATC







CTCATCTCCAAGATCTTCCCCGGGCTGGCTGCCGACCA







GAGCCGGGCGCTGCGGCTGGGCGACGCCATCCTGTCG







GTGAACGGCACCGACCTGCGCCAGGCCACCCACGACC







AGGCCGTGCAGGCGCTGAAGCGCGCGGGCAAGGAGG







TGCTGCTGGAGGTCAAGTTCATCCGAGAAGTAACACC







ATATATCAAGAAGCCATCATTAGTATCAGATCTGCCGT







GGGAAGGTGCAGCCCCCCAGTCACCAAGCTTTAGTGG







CAGTGAGGACTCTGGTTCGCCAAAACACCAGAACAGC







ACCAAGGACAGGAAGATCATCCCTCTCAAAATGTGCTT







TGCTGCTAGAAACCTAAGCATGCCGGATCTGGAAAAC







AGATTGATAGAGCTACATTCTCCTGATAGCAGGAACAC







GTTGATCCTACGCTGCAAAGATACAGCCACAGCACACT







CCTGGTTCGTAGCTATCCACACCAACATAATGGCTCTC







CTCCCACAGGTGTTGGCTGAACTCAACGCCATGCTTGG







GGCAACCAGTACAGCAGGAGGCAGTAAAGAGGTGAA







GCATATTGCCTGGCTGGCAGAACAGGCAAAACTAGAT







GGTGGAAGACAGCAATGGAGACCTGTCCTCATGGCTG







TGACTGAGAAGGATTTGCTGCTCTATGACTGTATGCCG







TGGACAAGAGATGCCTGGGCGTCACCATGCCACAGCT







ACCCACTTGTTGCCACCAGGTTGGTTCATTCTGGCTCC







GGATGTCGATCCCCCTCCCTTGGATCTGACCTTACATTT







GCTACCAGGACAGGCTCTCGACAGGGCATTGAGATGC







ATCTCTTCAGGGTGGAGACACATCGGGATCTGTCATCC







TGGACCAGGATACTTGTTCAGGGTTGCCATGCTGCTGC







TGAGCTGATCAAGGAAGTCTCTCTAGGCTGCATGTTAA







ATGGCCAAGAGGTGAGGCTTACTATTCACTATGAAAAT







GGGTTCACCATCTCAAGGGAAAATGGAGGCTCCAGCA







GCATATTGTACCGCTACCCCTTTGAAAGGCTGAAGATG







TCTGCTGATGATGGCATCCGAAATCTATACTTGGATTTT







GGTGGTCCCGAGGGAGAACTG|AAAGCCATTGATCTG







GTGACGAAAGCCACAGAGGAGGACAAAGCCAAGAAC







TACGAGGAGGCGCTGCGGCTGTACCAGCATGCGGTGG







AGTACTTCCTCCACGCTATCAAGTATGAGGCCCACAGC







GACAAGGCCAAGGAGAGCATTCGAGCCAAGTGCGTG







CAGTACCTAGACCGGGCCGAGAAGCTGAAGGATTATT







TACGAAGCAAAGAGAAACACGGCAAGAAGCCAGTCA







AAGAGAACCAGAGTGAGGGCAAGGGCAGTGACAGTG







ACAGTGAAGGGGATAATCCGGAGAAAAAGAAACTGC







AAGAACAGCTGATGGGTGCCGTCGTGATGGAGAAGCC







CAACATACGGTGGAACGACGTGGCCGGGCTGGAGGG







GGCCAAGGAGGCCCTCAAAGAAGCTGTCATTTTGCCA







ATCAAATTCCCACACTTGTTCACAGGCAAGCGCACCCC







CTGGCGGGGGATTCTGCTGTTCGGACCCCCTGGCACA







GGGAAATCCTACCTGGCCAAAGCCGTGGCAACAGAGG







CCAACAACTCCACCTTCTTCTCTGTGTCCTCCTCAGATC







TGATGTCCAAGTGGCTGGGGGAGAGTGAAAAGCTGG







TCAAGAACCTGTTTGAGCTGGCCAGGCAGCACAAGCC







CTCCATCATCTTCATCGATGAGGTGGATTCCCTCTGCG







GGTCCCGAAATGAAAATGAGAGTGAGGCCGCCCGGA







GGATCAAAACGGAGTTCTTGGTCCAGATGCAGGGGGT







GGGGAATAACAATGATGGGACTCTGGTTCTTGGAGCC







ACAAACATCCCATGGGTGTTGGATTCGGCCATCAGGA







GGAGGTTTGAAAAACGAATTTATATCCCCTTGCCGGAG







GAAGCTGCCCGCGCCCAGATGTTCCGGTTGCATCTCGG







GAGCACTCCCCACAACCTCACGGATGCAAACATCCACG







AGCTGGCCCGGAAGACGGAAGGCTACTCGGGCGCGG







ACATCAGCATCATCGTGCGGGACTCTCTCATGCAGCCC







GTGAGGAAGGTGCAGTCGGCCACACACTTCAAAAAGG







TCTGTGGCCCCTCTCGCACCAACCCCAGCATGATGATT







GATGACCTCCTGACTCCATGCTCACCAGGGGACCCAG







GAGCCATGGAGATGACTTGGATGGATGTCCCTGGGGA







CAAACTCTTAGAGCCTGTGGTTTGCATGTCGGACATGC







TGCGGTCTCTGGCCACCACCCGGCCCACGGTGAATGC







AGACGACCTCCTGAAAGTGAAGAAATTCTCAGAGGAC







TTTGGGCAAGAGAGT


NYU_G
42759130
42744294
InFrame
8324
ATGAAGACCCCGGCGGACACAG|GTGACAGCGGGAA







GGTGACCACAGTCGTAGCCACTCTAGGCCAAGGCCCA







GAGCGCTCCCAAGAAGTGGCTTACACGGACATCAAAG







TGATTGGCAATGGCTCATTTGGGGTCGTGTACCAGGC







ACGGCTGGCAGAGACCAGGGAACTAGTCGCCATCAAG







AAGGTTCTCCAGGACAAGAGGTTCAAGAACCGAGAGC







TGCAGATCATGCGTAAGCTGGACCACTGCAATATTGTG







AGGCTGAGATACTTTTTCTACTCCAGTGGCGAGAAGAA







AGACGAGCTTTACCTAAATCTGGTGCTGGAATATGTGC







CCGAGACAGTGTACCGGGTGGCCCGCCACTTCACCAA







GGCCAAGTTGACCATCCCTATCCTCTATGTCAAGGTGT







ACATGTACCAGCTCTTCCGCAGCTTGGCCTACATCCACT







CCCAGGGCGTGTGTCACCGCGACATCAAGCCCCAGAA







CCTGCTGGTGGACCCTGACACTGCTGTCCTCAAGCTCT







GCGATTTTGGCAGTGCAAAGCAGTTGGTCCGAGGGGA







GCCCAATGTCTCCTACATCTGTTCTCGCTACTACCGGGC







CCCAGAGCTCATCTTTGGAGCCACTGATTACACCTCAT







CCATCGATGTTTGGTCAGCTGGCTGTGTACTGGCAGAG







CTCCTCTTGGGCCAGCCCATCTTCCCTGGGGACAGTGG







GGTGGACCAGCTGGTGGAGATCATCAAGGTGCTGGG







AACACCAACCCGGGAACAAATCCGAGAGATGAACCCC







AACTACACGGAGTTCAAGTTCCCTCAGATTAAAGCTCA







CCCCTGGACAAAGGTGTTCAAATCTCGAACGCCGCCA







GAGGCCATCGCGCTCTGCTCTAGCCTGCTGGAGTACAC







CCCATCCTCAAGGCTCTCCCCACTAGAGGCCTGTGCGC







ACAGCTTCTTTGATGAACTGCGATGTCTGGGAACCCAG







CTGCCTAACAACCGCCCACTTCCCCCTCTCTTCAACTTC







AGTGCTGGTGAACTCTCCATCCAACCGTCTCTCAACGC







CATTCTTATCCCTCCTCACTTGAGGTCCCCAGCGGGCAC







TACCACCCTCACCCCGTCCTCACAAGCTTTAACTGAGAC







TCCGACCAGCTCAGACTGGCAGTCGACCGATGCCACA







CCTACCCTCACTAACTCCTCC


NYU_B
6457893
6443410
InFrame
8325
ATGGGCATGGCCAGGGGCAGCCTCACTCGGGTTCCAG







GGGTGATGGGAGAGGGCACTCAGGGCCCAGAGCTCA







GCCTTGACCCTGACCCTTGCTCTCCCCAATCCACTCCGG







GGCTCATGAAGGGGAACAAGCTGGAGGAGCAGGACC







CTAGACCTCTGCAGCCCATACCAGGTCTCATGGAGGG







GAACAAGCTGGAGGAGCAGGACTCTAGCCCTCCACAG







TCCACTCCAGGGCTCATGAAGGGGAACAAGCGTGAGG







AGCAGGGGCTGGGCCCCGAACCTGCGGCGCCCCAGCA







GCCCACGGCGGAGGAGGAGGCCCTGATCGAGTTCCAC







CGCTCCTACCGAGAGCTCTTCGAGTTCTTCTGCAACAA







CACCACCATCCACGGCGCCATCCGCCTGGTGTGCTCCC







AGCACAACCGCATGAAGACGGCCTTCTGGGCAGTGCT







GTGGCTCTGCACCTTTGGCATGATGTACTGGCAATTCG







GCCTGCTTTTCGGAGAGTACTTCAGCTACCCCGTCAGC







CTCAACATCAACCTCAACTCGGACAAGCTCGTCTTCCCC







GCAGTGACCATCTGCACCCTCAATCCCTACAGGTACCC







GGAAATTAAAGAGGAGCTGGAGGAGCTGGACCGCAT







CACAGAGCAGACGCTCTTTGACCTGTACAAATACAGCT







CCTTCACCACTCTCGTGGCCGGCTCCCGCAGCCGTCGC







GACCTGCGGGGGACTCTGCCGCACCCCTTGCAGCGCC







TGAGGGTCCCGCCCCCGCCTCACGGGGCCCGTCGAGC







CCGTAGCGTGGCCTCCAGCTTGCGGGACAACAACCCC







CAGGTGGACTGGAAGGACTGGAAGATCGGCTTCCAGC







TGTGCAACCAGAACAAATCGGACTGCTTCTACCAGACA







TACTCATCAGGGGTGGATGCGGTGAGGGAGTGGTACC







GCTTCCACTACATCAACATCCTGTCGAGGCTGCCAGAG







ACTCTGCCATCCCTGGAGGAGGACACGCTGGGCAACT







TCATCTTCGCCTGCCGCTTCAACCAGGTCTCCTGCAACC







AGGCGAATTACTCTCACTTCCACCACCCGATGTATGGA







AACTGCTATACTTTCAATGACAAGAACAACTCCAACCT







CTGGATGTCTTCCATGCCTGGAATCAACAACGGTCTGT







CCCTGATGCTGCGCGCAGAGCAGAATGACTTCATTCCC







CTGCTGTCCACAGTGACTGGGGCCCGGGTAATGGTGC







ACGGGCAGGATGAACCTGCCTTTATGGATGATGGTGG







CTTTAACTTGCGGCCTGGCGTGGAGACCTCCATCAGCA







TGAGGAAGGAAACCCTGGACAGACTTGGGGGCGATT







ATGGCGACTGCACCAAGAATGGCAGTGATGTTCCTGTT







GAGAACCTTTACCCTTCAAAGTACACACAGCAGGTGTG







TATTCACTCCTGCTTCCAGGAGAGCATGATCAAGGAGT







GTGGCTGTGCCTACATCTTCTATCCGCGGCCCCAGAAC







GTGGAGTACTGTGACTACAGAAAGCACAGTTCCTGGG







GGTACTGCTACTATAAGCTCCAGGTTGACTTCTCCTCA







GACCACCTGGGCTGTTTCACCAAGTGCCGGAAGCCAT







GCAGCGTGACCAGCTACCAGCTCTCTGCTGGTTACTCA







CGATGGCCCTCGGTGACATCCCAGGAATGGGTCTTCCA







GATGCTATCGCGACAGAACAATTACACCGTCAACAACA







AGAGAAATGGAGTGGCCAAAGTCAACATCTTCTTCAA







GGAGCTGAACTACAAAACCAATTCTGAGTCTCCCTCTG







TCACG|GTGCTCCTGGAGCTGTTGGTGGGAATATACCC







CTCAGGGGTTATTGGACTGGTCCCTCACCTAGGGGAC







AGGGAGAAGAGAGATAGTGTGTGTCCCCAAGGAAAA







TATATCCACCCTCAAAATAATTCGATTTGCTGTACCAAG







TGCCACAAAGGAACCTACTTGTACAATGACTGTCCAGG







CCCGGGGCAGGATACGGACTGCAGGGAGTGTGAGAG







CGGCTCCTTCACCGCTTCAGAAAACCACCTCAGACACT







GCCTCAGCTGCTCCAAATGCCGAAAGGAAATGGGTCA







GGTGGAGATCTCTTCTTGCACAGTGGACCGGGACACC







GTGTGTGGCTGCAGGAAGAACCAGTACCGGCATTATT







GGAGTGAAAACCTTTTCCAGTGCTTCAATTGCAGCCTC







TGCCTCAATGGGACCGTGCACCTCTCCTGCCAGGAGAA







ACAGAACACCGTGTGCACCTGCCATGCAGGTTTCTTTC







TAAGAGAAAACGAGTGTGTCTCCTGTAGTAACTGTAA







GAAAAGCCTGGAGTGCACGAAGTTGTGCCTACCCCAG







ATTGAGAATGTTAAGGGCACTGAGGACTCAGGCACCA







CAGTGCTGTTGCCCCTGGTCATTTTCTTTGGTCTTTGCC







TTTTATCCCTCCTCTTCATTGGTTTAATGTATCGCTACCA







ACGGTGGAAGTCCAAGCTCTACTCCATTGTTTGTGGGA







AATCGACACCTGAAAAAGAGGGGGAGCTTGAAGGAA







CTACTACTAAGCccctggccccaaacccaagcttcagtcccactcc







aggcttcacccccaccctgggcttcagtcccgtgcccagttccaccttca







cctccagctccaCCTATACCCCCGGTGACTGTCCCAACTTT







GCGGCTCCCCGCAGAGAGGTGGCACCACCCTATCAGG







GGGCTGACCCCATCCTTGCGACAGCCCTCGCCTCCGAC







CCCATCCCCAACCCCCTTCAGAAGTGGGAGGACAGCG







CCCACAAGCCACAGAGCCTAGACACTGATGACCCCGC







GACGCTGTACGCCGTGGTGGAGAACGTGCCCCCGTTG







CGCTGGAAGGAATTCGTGCGGCGCCTAGGGCTGAGCG







ACCACGAGATCGATCGGCTGGAGCTGCAGAACGGGC







GCTGCCTGCGCGAGGCGCAATACAGCATGCTGGCGAC







CTGGAGGCGGCGCACGCCGCGGCGCGAGGCCACGCT







GGAGCTGCTGGGACGCGTGCTCCGCGACATGGACCTG







CTGGGCTGCCTGGAGGACATCGAGGAGGCGCTTTgcgg







ccccgccgccctcccgcccgcgcccAGTCTTCTCAGA


G17675.TCGA-19-2624-01A-01R-1850-01.2
69230529
63195940
InFrame
8326
ATGGTGAGGAGCAGGCAAATGTGCAATACCAACATGT







CTGTACCTACTGATGGTGCTGTAACCACCTCACAGATT







CCAGCTTCGGAACAAGAGACCCTGGTTAGACCAAAGC







CATTGCTTTTGAAGTTATTAAAGTCTGTTGGTGCACAA







AAAGACACTTATACTATGAAAGAGGTTCTTTTTTATCTT







GGCCAGTATATTATGACTAAACGATTATATGATGAGAA







GCAACAACATATTGTATATTGTTCAAATGATCTTCTAG







GAGATTTGTTTGGCGTGCCAAGCTTCTCTGTGAAAGAG







CACAGGAAAATATATACCATGATCTACAGGAACTTGGT







AGTAGTCAATCAGCAGGAATCATCGGACTCAGGTACA







TCTGTGAGTGAGAACAGGTGTCACCTTGAAGGTGGGA







GTGATCAAAAGGACCTTGTACAAGAGCTTCAGGAAGA







GAAACCTTCATCTTCACATTTGGTTTCTAGACCATCTAC







CTCATCTAGAAGGAGAGCAATTAGTGAGACAGAAGAA







AATTCAGATGAATTATCTGGTGAACGACAAAGAAAAC







GCCACAAATCTGATAGTATTTCCCTTTCCTTTGATGAAA







GCCTGGCTCTGTGTGTAATAAGGGAGATATGTTGTGA







AAGAAGCAGTAGCAGTGAATCTACAGGGACGCCATCG







AATCCGGATCTTGATGCTGGTGTAAGTGAACATTCAGG







TGATTGGTTGGATCAGGATTCAGTTTCAGATCAGTTTA







GTGTAGAATTTGAAGTTGAATCTCTCGACTCAGAAGAT







TATAGCCTTAGTGAAGAAGGACAAGAACTCTCAGATG







AAGATGATGAGGTATATCAAGTTACTGTGTATCAGGC







AGGGGAGAGTGATACAGATTCATTTGAAGAAGATCCT







GAAATTTCCTTAGCT|GAATCCGAGGGTGTTTCCTGCC







ACTATTGGTCGCTGTTTGACGGGCACGCGGGGTCCGG







GGCCGCGGTGGTGGCGTCACGCCTGCTGCAGCACCAC







ATCACGGAGCAGCTGCAGGACATCGTGGACATCCTGA







AGAACTCCGCCGTCCTGCCCCCTACCTGCCTGGGGGAG







GAGCCTGAGAACACGCCCGCCAACAGCCGGACTCTGA







CCCGGGCAGCCTCCCTGCGCGGAGGGGTGGGGGCCC







CGGGCTCCCCCAGCACGCCCCCCACACGCTTCTTTACC







GAGAAGAAGATTCCCCATGAGTGCCTGGTCATCGGAG







CGCTTGAAAGTGCATTCAAGGAAATGGACCTACAGAT







AGAACGAGAGAGGAGTTCATATAATATATCTGGTGGC







TGCACGGCCCTCATTGTGATTTGCCTTTTGGGGAAGCT







GTATGTTGCAAATGCTGGGGATAGCAGGGCCATAATC







ATCAGAAATGGAGAAATTATCCCCATGTCTTCAGAATT







TACCCCCGAGACGGAGCGCCAGCGACTTCAGTACCTG







GCATTCATGCAGCCTCACTTGCTGGGAAATGAGTTCAC







ACATTTGGAGTTTCCAAGGAGAGTACAGAGAAAGGAG







CTTGGAAAGAAGATGCTCTACAGGGACTTTAATATGAC







AGGCTGGGCATACAAAACCATTGAGGATGAGGACTTG







AAGTTCCCCCTTATATATGGAGAAGGCAAGAAGGCCC







GGGTAATGGCAACTATTGGAGTGACCAGGGGACTTGG







GGACCATGACCTGAAGGTGCATGACTCCAACATCTACA







TTAAACCATTCCTGTCTTCAGCTCCAGAGGTAAGAATC







TACGATCTTTCAAAATATGATCATGGATCAGATGATGT







GCTGATCTTGGCCACTGATGGACTCTGGGACGTTTTAT







CAAATGAAGAAGTAGCAGAAGCAATCACTCAGTTTCTT







CCTAACTGTGATCCAGATGATCCTCACAGGTACACACT







GGCAGCTCAGGACCTGGTGATGCGTGCCCGGGGTGTG







CTGAAGGACAGAGGATGGCGGATATCTAATGACCGAC







TGGGCTCAGGAGACGACATTTCTGTATATGTCATTCCT







TTAATACATGGAAACAAGCTGTCA


G17484.TCGA-14-0787-01A-01R-1849-01.2
155385535
156384545
InFrame
8327
ATGGACCCTAGAAATACTGCTATGTTAGGATTGGGTTC







TGATTCCGAAGGTTTTTCAAGAAAGAGTCCTTCTGCCA







TCAGTACTGGCACATTGGTCAGTAAGAGAGAAGTAGA







GCTAGAAAAAAACACAAAGGAGGAAGAGGACCTTCG







CAAACGGAATCGAGAAAGAAACATCGAAGCTGGGAA




AGATGATGGTTTGACTGATGCACAGCAACAGTTTTCAG







TGAAAGAAACAAACTTTTCAGAGGGAAATTTAAAATT







GAAAATTGGCCTCCAGGCTAAGAGAACTAAAAAACCT







CCAAAGAACTTGGAGAACTATGTATGTCGACCTGCCAT







AAAAACAACTATTAAGCACCCAAGGAAAGCACTTAAA







AGTGGAAAGATGACGGATGAAAAGAATGAACACTGTC







CTTCAAAACGAGACCCTTCAAAGTTGTACAAGAAAGCA







GATGATGTTGCAGCCATTGAATGCCAGTCTGAAGAAG







TCATCCGTCTTCATTCACAGGGAGAAAACAATCCTTTG







TCTAAGAAGCTGTCTCCAGTACACTCAGAAATGGCAGA







TTATATTAATGCAACGCCATCTACTC-TTC-FTGGTAGCCG







GGATCCTGATTTAAAGGACAGAGCATTACTTAATGGA







GGAACTAGTGTAACAGAAAAGTTGGCACAGCTGATTG







CTACCTGTCCTCCTTCCAAGTCTTCCAAGACAAAACCGA







AGAAGTTAGGAACTGGCACTACAGCAGGATTGGTTAG







CAAGGATTTGATCAGGAAAGCAGGTGTTGGCTCTGTA







GCTGGAATAATACATAAGGACTTAATAAAAAAGCCAA







CCATCAGCACAGCAGTTGGATTGGTAACTAAAGATCCT







GGGAAAAAGCCAGTGTTTAATGCAGCAGTAGGATTGG







TCAATAAGGACTCTGTGAAAAAACTGGGAACTGGCAC







TACAGCGGTATTCATTAATAAAAACTTAGGCAAAAAGC







CAGGAACTATCACTACAGTAGGACTGCTAAGCAAAGA







TTCAGGAAAGAAGCTAGGAATTGGTATTGTTCCAGGTT







TAGTGCATAAAGAGTCTGGCAAGAAGTTAGGACTTGG







CACTGTGGTTGGACTGGTTAATAAAGATTTGGGAAAG







AAATTGGGTTCTACTGTTGGCCTAGTGGCCAAGGACTG







TGCAAAGAAGATTGTAGCAAGTTCAGCAATGGGATTG







GTTAATAAGGACATTGGAAAGAAACTAATGAGTTGTC







CTTTGGCAGGTCTGATCAGTAAAGATGCCATAAACCTT







AAAGCCGAAGCACTGCTCCCCACTCAGGAACCGCTTAA







GGCTTCTTGTAGTACAAACATCAATAATCAGGAAAGTC







AGGAACTTTCTGAATCCCTGAAAGATAGTGCCACCAGC







AAAACTTTTGAAAAGAATGTTGTACGGCAGAATAAAG







AAAGCATATTGGAAAAGTTCTCAGTACGAAAAGAAAT







CATTAATTTGGAGAAAGAAATGTTTAATGAAGGAACA







TGCATTCAGCAAGACAGTTTCTCATCCAGTGAAAAGGG







ATCTTATGAAACCTCAAAGCATGAAAAGCAGCCTCCTG







TATATTGCACTTCTCCGGACTTTAAAATGGGAGGTGCT







TCTGATGTATCTACCGCTAAATCCCCATTCAGTGCAGTA







GGAGAAAGCAATCTCCCTTCCCCATCACCTACTGTATCT







GTTAATCCTTTAACCAGAAGTCCCCCTGAAACTTCTTCA







CAGTTGGCTCCTAATCCATTACTTTTAAGTTCTACTACA







GAACTAATCGAAGAAATTTCTGAATCTGTTGGAAAGA







ACCAGTTTACTTCTGAAAGTACCCACTTGAACGTTGGT







CATAGGTCAGTTGGTCATAGTATAAGTATTGAATGTAA







AGGGATTGATAAAGAGGTAAATGATTCAAAAACTACC







CATATAGATATTCCAAGAATAAGCTCTTCCCTTGGAAA







AAAGCCAAGTTTGACTTCTGAATCCAGCATTCATACTA







TTACTCCTTCAGTTGTTAACTTCACTAGTTTATTTAGTA







ATAAGCCTTTTTTAAAACTGGGTGCAGTATCTGCATCA







GACAAACACTGCCAAGTTGCTGAAAGCCTAAGTACTA







GTTTGCAGTCCAAACCATTAAAAAAAAGAAAAGGAAG







AAAACCTCGGTGGACTAAAGTGGTGGCAAGAAGCACA







TGCCGGTCTCCAAAAGGGCTAGAATTAGAAAGATCAG







AGCTTTTTAAAAACGTTTCATGTAGCTCACTATCAAATA







GTAATTCTGAGCCAGCCAAGTTTATGAAAAACATTGGA







CCCCCTTCATTTGTAGATCATGACTTCCTTAAACGCCGA







TTGCCAAAGTTGAGCAAATCCACAGCTCCATCTCTTGC







TCTCTTAGCTGATAGTGAAAAACCATCTCATAAGTCTTT







TGCTACTCACAAACTATCCTCCAGTATGTGTGTCTCTAG







TGACCTTTTGTCTGATATTTATAAGCCCAAAAGAGGAA







GGCCTAAATCTAAGGAGATGCCTCAACTGGAAGGGCC







ACCTAAAAGGACTTTAAAAATCCCTGCTTCTAAAGTGT







TTTCTTTACAGTCTAAGGAAGAACAAGAACCCCCAATT







TTACAGCCAGAAATTGAAATCCCTTCCTTCAAACAAGG







TCTGTCTGTGTCTCCTTTTCCAAAAAAGAGAGGCAGGC







CTAAGAGGCAAATGAGGTCACCAGTCAAGATGAAGCC







ACCTGTACTGTCAGTGGCTCCATTTGTTGCCACTGAAA







GTCCAAGCAAGCTAGAATCTGAAAGTGACAACCATAG







AAGTAGCAGTGATTTCTTTGAGAGCGAGGATCAACTTC







AGGATCCAGATGACCTAGATGACAGTCATAGGCCAAG







TGTCTGTAGTATGAGTGACCTTGAGATGGAACCAGAT







AAAAAAATTACCAAGAGAAACAATGGACAATTAATGA







AAACAATTATCCGCAAAATAAATAAAATGAAGACTTTA







AAGAGAAAGAAACTGTTGAATCAGATTCTTTCAAGTTC







TGTAGAATCAAGTAATAAAGGGAAAGTGCAATCCAAA







CTCCATAATACGGTATCAAGTCTTGCTGCCACATTTGG







CTCTAAATTGGGCCAACAGATAAATGTCAGCAAGAAA







GGAACCATTTATATAGGAAAGAGAAGAGGTCGCAAAC







CAAAAACTGTCTTAAATGGTATTCTTTCTGGTAGTCCTA







CTAGCCTTGCTGTTCTTGAGCAAACAGCTCAACAGGCA







GCTGGGTCAGCATTAGGACAGATTCTTCCCCCATTACT







GCCTTCATCTGCTAGTAGTTCTGAGATTCTTCCATCACC







TATTTGCTCTCAGTCTTCTGGGACTAGTGGAGGTCAGA







GCCCTGTAAGTAGTGATGCAGGTTTTGTTGAACCCAGT







TCAGTGCCATATTTGCATTTACACTCCAGACAGGGCAG







TATGATTCAGACTCTTGCAATGAAGAAGGCCTCAAAG







GGGAGGAGGCGGTTATCTCCTCCTACTTTGTTGCCAAA







TTCTCCTTCGCACTTGAGTGAACTCACATCTCTAAAAGA







AGCTACTCCTTCCCCAATCAGTGAGTCTCATAGTGATG







AGACCATTCCCAGTGATAGTGGAATTGGAACAGATAA







TAACAGCACATCAGACAGGGCAGAGAAATTTTGTGGG







CAAAAAAAGAGGAGGCATTCTTTTGAGCATGTTTCTCT







GATTCCCCCTGAAACCTCTACAGTGCTAAGCAGTCTTA







AAGAAAAACATAAACACAAATGTAAGCGCAGGAATCA







TGATTACCTCAGCTATGACAAGATGAAAAGGCAGAAA







CGAAAACGGAAAAAGAAATATCCCCAGCTTCGAAATA







GACAGGATCCAGACTTTATTGCAGAGCTGGAGGAACT







AATAAGTCGCCTAAGTGAAATTCGGATCACTCATCGAA







GTCATCATTTTATCCCCCGAGATCTTCTGCCAACTATCT







TTCGAATCAACTTTAATAGTTTCTATACACATCCTTCTTT







CCCCTTAGACCCTTTGCACTACATTCGAAAACCTGACTT







AAAAAAGAAAAGAGGGAGACCCCCTAAGATGAGGGA







GGCAATGGCTGAAATGCCTTTTATGCACAGCCTTAGTT







TTCCTCTTTCTAGTACTGGATTCTATCCATCTTATGGTAT







GCCTTACTCTCCTTCACCCCTTACAGCTGCTCCCATAGG







ATTAGGTTACTATGGAAGGTATCCTCCCACTCTTTATCC







ACCTCCTCCATCTCCTTCTTTCACCACGCCACTTCCACCT







CCTTCCTATATGCATGCTGGTCATTTACTTCTCAATCCT







GCCAAATACCATAAGAAAAAGCATAAGCTACTTCGACA







GGAGGCCTTTCTTACAACCAGCAGGACTCCCCTCCTTT







CCATGAGTACCTACCCCAGTGTTCCTCCTGAGATGGCC







TATGGTTGGATGGTTGAGCACAAACACAGGCACCGTC







ACAAACACAGAGAACACCGTTCTTCTGAACAACCCCAG







GTTTCTATGGACACTGGCTCTTCCCGATCTGTCCTGGA







ATCTTTGAAGCGCTATAGATTTGGAAAGGATGCTGTTG







GAGAGCGATATAAGCATAAGGAAAAGCACCGTTGTCA







CATGTCCTGCCCTCATCTCTCTCCTTCAAAAAGCTTAAT







AAACAGAGAGGAACAGTGGGTCCACCGAGAGCCTTCA







GAATCTAGTCCATTGGCCTTGGGATTGCAGACACCTTT







ACAGATTGACTGTTCAGAAAGTTCTCCAAGCTTATCCC







TTGGAGGATTCACTCCCAACTCTGAGCCAGCCAGCAGT







GATGAACATACAAACCTTTTCACAAGTGCAATAGGCAG







CTGCAGAGTTTCAAACCCTAACTCCAGTGGCCGGAAG







AAATTAACTGACAGCCCTGGACTCTTTTCTGCACAGGA







CACTTCACTAAATCGGCTTCACAGAAAGGAGTCACTGC







CTTCTAACGAAAGGGCAGTACAGACTTTGGCAGGCTC







CCAGCCAACCTCTGATAAACCCTCCCAGCGGCCATCAG







AGAGCACAAATTGTAGCCCTACCCGGAAAAGGTCTTC







ATCTGAGAGTACTTCTTCAACAGTAAACGGAGTTCCCT







CTCGAAGTCCAAGATTAGTTGCTTCTGGGGATGACTCT







GTGGATAGTCTGCTGCAGCGGATGGTACAAAATGAGG







ACCAAGAGCCCATGGAGAAAAGTATTGATGCTGTGAT







TGCAACTGCCTCTGCACCACCTTCTTCCAGTCCAGGCC







GTAGCCACAGCAAGGACCGAACCCTGGGAAAACCAGA







CAGCCTTTTAGTGCCTGCAGTCACAAGTGACTCTTGCA







ATAATAGCATCTCACTCCTATCTGAAAAGTTGACAAGC







AGCTGTTCCCCCCATCATATCAAGAGAAGTGTAGTGGA







AGCTATGCAACGCCAAGCTCGGAAAATGTGCAATTAC







GACAAAATCTTGGCCACAAAGAAAAACCTAGACCATG







TCAATAAAATCTTAAAAGCCAAAAAACTTCAAAGGCAG







GCCAGGACAGGGAATAACTTTGTGAAACGTAGGCCAG







GTCGACCTCGGAAATGTCCCCTTCAGGCTGTCGTATCA







ATGCAAGCATTCCAGGCTGCTCAGTTTGTCAACCCAGA







ATTGAACAGAGACGAGGAAGGAGCAGCACTGCACCTC







AGTCCTGACACAGTTACAGATGTAATTGAGGCTGTTGT







TCAGAGTGTAAATCTGAACCCAGAACATAAAAAGGGG







TTGAAGAGAAAAGGTTGGCTATTGGAAGAACAGACCA







GAAAAAAGCAGAAGCCATTACCAGAGGAAGAAGAGC







AAGAGAATAATAAAAGCTTTAATGAAGCACCAGTTGA







GATTCCCAGTCCTTCTGAAACCCCAGCTAAACCTTCTGA







ACCTGAAAGTACCTTGCAGCCTGTGCTTTCTCTCATCCC







AAGGGAAAAGAAGCCCCCACGTCCCCCAAAGAAGAAG







TATCAGAAAGCAGGGCTGTATTCTGACGTTTACAAAAC







TACAGA|CTTCTCACCTGGGGCGGGTGGGTTCTGCACC







ACCCTCCCACCCTCCTTCCTCCGTGTGGACGATAGAGC







CACATCCAGCACCACGGACAGCTCCCGGGCGCCTTCAT







CTCCTCGTCCTCCAGGCAGCACAAGCCATTGTGGAATC







TCCACCAGGTGTACAGAACGGTGCCTCTGCGTCCTGCC







ACTCAGGACCTCTCAAGTCCCCGATGTGATGGCTCCTC







AGCATGATCAGGAgaaattccatgatcttgcttattcctgtcttggg







aagtccttctccatgtctaaccaagatctatatggctatagcaccagctc







tttggctcttggcttggcatggctaagttgggagACCAAAAAGAAG







AATGTACTTCATCTGGTTGGGCTGGATTCCCTC


G17792.TCGA-28-5204-01A-01R-1850-01.4
100065175
10061309
InFrame
8328
ATGAGCGGGGGCAAGAAGAAGAGTAGTTTCCAAATCA







CCAGCGTCACCACGGACTATGAGGGCCCTGGGAGCCC







AGGGGCTTCGGATCCCCCTACCCCACAGCCCCCAACCG







GGCCCCCGCCCCGCCTGCCCAATGGGGAGCCCAGCCC







CGATCCGGGGGGCAAGGGCACCCCCCGGAATGGCTCC




CCACCACCTGGGGCCCCTTCCTCCCGTTTCCGGGTGGT







GAAGCTGCCCCACGGCCTGGGAGAGCCTTATCGCCGC







GGTCGCTGGACGTGTGTGGATGTTTATGAGCGAGACC







TGGAGCCCCACAGCTTCGGCGGACTCCTGGAGGGAAT







TCGAGGGGCCTCAGGGGGCGCCGGGGGCAGATCTTT







GGATTCCAGGTTGGAGCTGGCCAGCCTCGGCCTGGGC







GCCCCCACCCCACCGTCAGGCCTGTCTCAGGGCCCCAC







CTCCTGGCTCCGTCCACCCCCCACCTCTCCTGGACCTCA







GGCCCGCTCCTTCACTGGGGGACTGGGCCAGCTGGTG







GTGCCCAGCAAAGCCAAGGCAGAGAAACCCCCACTGT







CGGCCTCCTCACCCCAGCAGCGCCCCCCAGAGCCTGAG







ACCGGTGAGAGTGCGGGCACATCCCGGGCTGCCACGC







CCCTGCCCTCTCTGAGGGTGGAAGCGGAGGCTGGGGG







CTCAGGGGCCAGGACCCCTCCACTGTCCCGGAGGAAA







GCTGTAGACATGCGGCTGCGGATGGAGTTGGGTGCTC







CAGAAGAGATGGGGCAGGTGCCCCCACTTGACTCTCG







CCCCAGCTCCCCAGCCCTCTACTTCACCCACGATGCCA







GCCTGGTTCACAAATCTCCAGACCCCTTCGGAGCAGTA







GCAGCTCAGAAGTTCAGCCTGGCCCACTCCATGTTGGC







CATCAGTGGTCACCTAGACAGCGACGATGATAGTGGC







TCCGGAAGCCTGGTTGGCATTGACAACAAAATCGAGC







AAGCCATG|TGTTTTCTTCTGGAGGCAAAAAATTAAACC







AACCATCTCAGGACACCCTGACTCCAAGAAACACTCAT







TGAAGAAGATGGAGAAGACTCTCCAGGTGGTTGAGAC







TTTGAGGTTGGTCGAGCTCCCAAAAGAGGCTAAGCCC







AAGTTGGGTGAGTCCCCCGAGCTGGCAGATCCCTGCG







TGTTGGCCAAGACTACAGAGGAGACCGAGGTGGAGCT







GGGCCAACAGGGCCAATCCCTACTGCAGCTGCCGAGG







ACGGCCGTCAAGTCTGTCTCCACGCTCATGGTCTCTGC







CCTGCAGAGCGGCTGGCAGATGTGCAGCTGGAAGTCA







TCAGTGAGTTCTGCCTCAGTCAGCTCCCAAGTGAGGAC







GCAGTCACCTTTGAAGACTCCGGAGGCTGAGTTGCTGT







GGGAGGTGTACCTGGTGCTGTGGGCCGTTCGGAAACA







CCTGCGCCGGCTGTACCGCAGGCAGGAGAGGCACAG







ACGGCACCACGTCCGATGCCATGCTGCCCCCCGACCCA







ACCCGGCTCAGTCCCTGAAACTGGATGCCCAAAGTCCC







CTC


BT299
23037536
1256473
InFrame
8329
ATGGCTCTGCGGAGGCTGGGGGCCGCGCTGCTGCTGC







TGCCGCTGCTCGCCGCCGTGGAAg|GGGCCGGCCAGG







ACGTGGGCCGAAGCTGCATCCTGGTCTCCATTGCGGG







CAAGAATGTCATGCTGGACTGTGGAATGCACATGGGC







TTCAATGACGACCGACGCTTCCCTGACTTCTCCTACATC







ACCCAGAACGGCCGCCTAACAGACTTCCTGGACTGTGT







GATCATTAGCCACTTCCACCTGGACCACTGCGGGGCAC







TCCCCTACTTCAGCGAGATGGTGGGCTACGACGGGCC







CATCTACATGACTCACCCCACCCAGGCCATCTGCCCCAT







CTTGCTGGAGGACTACCGCAAGATCGCCGTAGACAAG







AAGGGCGAGGCCAACTTCTTCACCTCCCAGATGATCAA







AGACTGCATGAAGAAGGTGGTGGCTGTCCACCTCCAC







CAGACGGTCCAGGTAGATGATGAGCTGGAGATCAAG







GCCTACTATGCAGGCCACGTGCTGGGGGCAGCCATGT







TCCAGATTAAAGTGGGCTCAGAGTCTGTGGTCTACACG







GGTGATTATAACATGACCCCAGACCGACACTTAGGAG







CTGCCTGGATTGACAAGTGCCGCCCCAACCTGCTCATC







ACAGAGTCCACGTACGCCACGACCATCCGTGACTCCAA







GCGCTGCCGGGAGCGAGACTTCCTGAAGAAAGTCCAC







GAGACCGTGGAGCGTGGTGGGAAGGTGCTGATACCT







GTGTTCGCGCTGGGCCGCGCCCAGGAGCTCTGCATCC







TCCTGGAGACCTTCTGGGAGCGCATGAACCTGAAGGT







GCCCATCTACTTCTCCACGGGGCTGACCGAGAAGGCC







AACCACTACTACAAGCTGTTCATCCCCTGGACCAACCA







GAAGATCCGCAAGACTTTCGTGCAGAGGAACATGTTT







GAGTTCAAGCACATCAAGGCCTTCGACCGGGCTTTTGC







TGACAACCCAGGACCGATGGTTGTGTTTGCCACGCCA







GGAATGCTGCACGCTGGGCAGTCCCTGCAGATCTTCC







GGAAATGGGCCGGAAACGAAAAGAACATGGTCATCAT







GCCCGGCTACTGCGTGCAGGGCACCGTCGGCCACAAG







ATCCTCAGCGGGCAGCGGAAGCTCGAGATGGAGGGG







CGGCAGGTGCTGGAGGTCAAGATGCAGGTGGAGTAC







ATGTCATTCAGCGCACACGCGGACGCCAAGGGCATCA







TGCAGCTGGTGGGCCAGGCAGAGCCGGAGAGCGTGC







TGCTGGTGCATGGCGAGGCCAAGAAGATGGAGTTCCT







GAAGCAGAAGATCGAGCAGGAGCTCCGGGTCAACTG







CTACATGCCGGCCAATGGCGAGACGGTGACGCTGCCC







ACAAGCCCCAGCATCCCCGTAGGCATCTCGCTGGGGCT







GCTGAAGCGGGAGATGGCGCAGGGGCTGCTCCCTGA







GGCCAAGAAGCCTCGGCTCCTGCACGGCACCCTGATC







ATGAAGGACAGCAACTTCCGGCTGGTGTCCTCAGAGC







AAGCCCTCAAAGAGCTGGGTCTGGCTGAGCACCAGCT







GCGCTTCACCTGCCGCGTGCACCTGCATGACACACGCA







AGGAGCAGGAGACGGCATTGCGCGTCTACAGCCACCT







CAAGAGCGTCCTGAAGGACCACTGTGTGCAGCACCTC







CCAGACGGCTCTGTGACTGTGGAGTCCGTCCTCCTCCA







GGCCGCCGCCCCTTCTGAGGACCCAGGCACCAAGGTG







CTGCTGGTCTCCTGGACCTACCAGGACGAGGAGCTGG







GGAGCTTCCTCACATCTCTGCTGAAGAAGGGCCTCCCC







CAGGCCCCCAGC


G17667.TCGA-26-5134-01A-01R-1850-01.2
54373484
53494330
InFrame
8330
ATGAACCAGCCAGAGTCTGCCAACGATCCTGAACCCCT







GTGTGCAGTGTGTGGCCAAGCCCACTCCTTGGAGGAA







AACCACTTCTACAGCTATCCAGAGGAAGTGGATGATG







ACCTCATCTGCCACATCTGCCTGCAGGCTTTGCTGGAC







CCCCTGGACACTCCGTGTGGACACACCTACTGCACCCT




CTGCCTCACCAACTTCCTGGTGGAGAAGGACTTCTGTC







CCATGGACCGCAAGCCTCTGGTTCTGCAGCACTGCAAG







AAGTCCAGCATCCTGGTCAACAAACTCCTCAACAAGCT







ACTGGTGACCTGCCCATTCAGGGAGCACTGCACCCAG







GTGTTGCAGCGCTGTGACCTCGAGCATCACTTTCAAAC







CAGCTGTAAAGGTGCCTCCCACTACGGCCTGACCAAA







GATAGGAAGAGGCGCTCACAAGATGGCTGTCCAGACG







GCTGTGCGAGCCTCACAGCCACGGCTCCCTCCCCAGAG







GTTTCTGCAGCTGCCACCATCTCCTTAATGACAGACGA







GCCTGGCCTAGACAACCCTGCCTACGTGTCCTCGGCAG







AGGACGGGCAGCCAGCAATCAGCCCAGTGGACTCTGG







CCGGAGCAACCGAACTAGGGCACGGCCCTTTGAGAGA







TCCACTATTAGAAGCAGATCATTTAAAAAAATAAATCG







AGCTTTGAGTGTTCTTCGAAGGACAAAGAGCGGGAGT







GCAGTTGCCAACCATGCCGACCAGGGCAGGGAAAATT







CTGAAAACACCACTGCCCCTGAAG|TTTGGAAACACAT







GCTACTG


GBM-CUMC3338_L1
41198856
41192900
InFrame
8331
ATGTGGCTGAAGGTGGGGGGCCTACTTCGGGGGACC




GGTGGACAGCTGGGCCAGACTGTTGGTTGGCCTTGTG







GGGCCCTGGGGCCTGGGCCCCACCGCTGGGGACCATG







TGGAGGTTCTTGGGCCCAAAAGTTTTACCAGGATGGG







CCTGGGAGAGGCCTGGGTGAGGAGGACATTCGCAGG







GCACGGGAGGCCCGTCCCAGGAAGACACCCCGGCCCC







AGCTGAGTGACCGCTCTCGAGAACGCAAGGTGCCTGC







CTCCCGCATCAGCCGCTTGGCCAACTTTGGGGGACTGG







CTGTGGGCTTGGGGCTAGGAGTACTGGCCGAGATGGC







TAAGAAGTCCATGCCAGGAGGTCGTCTGCAGTCAGAG







GGTGGTTCTGGGCTGGACTCCAGCCCCTTCCTGTCGGA







GGCCAATGCCGAGCGGATTGTGCAGACCTTATGTACA







GTTCGAGGGGCCGCCCTCAAGGTTGGCCAGATGCTCA







GCATCCAGGACAACAGCTTCATCAGCCCTCAGCTGCAG







CACATCTTTGAGCGGGTCCGCCAGAGCGCCGACTTCAT







GCCCCGCTGGCAGATGCTGagagttcttgaagaggagctcggc







agggactggcaggccaaggtggcctccttggaggaggtgccctttgCC







GCTGCCTCAATTGGGCAGGTGCACCAGGGCCTGCTGA







GGGACGGGACGGAGGTGGCCGTGAAGATCCAGTACC







CCGGCATAGCCCAGAGCATTCAGAGCGATGTCCAGAA







CCTGCTGGCGGTACTCAAGATGAGCGCGGCCCTGCCC







GCGGGCCTGTTTGCCGAGCAGAGCCTGCAGGCCTTGC







AGCAGGAGCTGGCTTGGGAGTGTGACTACCGTCGTGA







GGCGGCTTGTGCCCAGAATTTCAGGCAGCTGCTGGCA







AATGACCCCTTCTTCCGGGTCCCAGCCGTGGTTAAGGA







GCTGTGCACGACACGGGTGCTGGGCATGGAGCTGGCT







GGAGGGGTCCCCCTGGACCAGTGCCAGGGCCTAAGCC







AGGACCTGCGGAACCAGATTTGCTTCCAGCTCCTGACG







CTGTGTCTGCGGGAGCTGTTTGAGTTCCGATTCATGCA







GACTGACCCCAACTGGGCCAACTTCCTGTATGATGCCT







CCAGCCACCAGGTGACCCTGCTGGACTTTGGTGCAAG







CCGGGAGTTTGGGACAGAGTTCACAGACCATTACATC







GAGGTGGTGAAGGCTGCAGCTGATGGAGACAGAGAC







TGTGTCCTGCAGAAGTCCAGGGACCTCAAATTCCTCAC







AGGCTTTGAAACCAAG|GGCGGACCCCGGAGGCCTGA







GCGGCACCTGCCCCCAGCCCCCTGTGGGGCCCCGGGG







CCCCCAGAAACCTGCAGGACGGAGCCAGACGGGGCG







GGCACCATGAACAAGTTACGGCAGAGCCTGCGGCGGA







GGAAGCCAGCCTACGTGCCCGAGGCGTCGCGCCCGCA







CCAGTGGCAGGCAGACGAGGACGCGGTGCGGAAGGG







CACGTGCAGCTTCCCGGTCAGGTACCTGGGTCACGTG







GAGGTAGAGGAGTCCCGGGGAATGCACGTGTGTGAA







GATGCGGTGAAGAAGCTGAAGGCGATGGGCCGAAAG







TCCGTGAAGTCTGTCCTGTGGGTGTCAGCCGATGGGC







TCCGAGTGGTGGACGACAAAACCAAGGATCTTCTGGT







CGACCAGACCATCGAAAAGGTCTCCTTTTGTGCTCCTG







ACCGCAACCTGGACAAGGCTTTCTCCTATATCTGTCGT







GACGGGACTACCCGCCGCTGGATCTGCCACTGTTTTCT







GGCACTGAAGGACTCCGGCGAGAGGCTGAGCCACGCT







GTGGGCTGTGCTTTTGCCGCCTGCCTGGAGCGAAAAC







AGCGACGGGAGAAGGAATGTGGGGTCACGGCCGCCT







TCGATGCCAGCCGCACCAGCTTCGCCCGCGAGGGCTC







CTTCCGCCTGTCTGGGGGTGGGCGGCCTGCTGAGCGA







GAGGCCCCGGACAAGAAGAAAGCAGAGGCAGCAGCT







GCCCCCACTGTGGCTCCTGGCCCTGCCCAGCCTGGGCA







CGTGTCCCCGACACCAGCCACCACATCCCCTGGTGAGA







AGGGTGAGGCAGGCACCCCTGTGGCTGCAGGCACCAC







TGCGGCCGCCATCCCCCGGCGCCATGCACCCCTGGAG







CAGCTGGTTCGCCAGGGCTCCTTCCGTGGGTTCCCAGC







ACTCAGCCAGAAGAACTCGCCTTTCAAACGGCAGCTG







AGCCTACGGCTGAATGAGCTGCCATCCACGCTGCAGC







GCCGCACTGACTTCCAGGTGAAGGGCACAGTGCCTGA







GATGGAGCCTCCTGGTGCCGGCGACAGTGACAGCATC







AACGCTCTGTGCACACAGATCAGTTCATCTTTTGCCAG







TGCTGGAGCGCCAGCACCAGGGCCACCACCTGCCACA







ACAGGGACTTCTGCCTGGGGTGAGCCCTCCGTGCCCCC







TGCAGCTGCCTTCCAGCCTGGGCACAAGCGGACACCTT







CAGAGGCTGAGCGATGGCTGGAGGAGGTGTCACAGG







TGGCCAAGGCCcagcagcagcagcagcagcaacagcaacagca







gcagcagcagcagcagcaacagcagcaagcagcCTCAGTGGCCC







CAGTGCCCACCATGCCTCCTGCCCTGCAGCCTTTCCCCG







CCCCCGTGGGGCCCTTTGACGCTGCACCTGCCCAAGTG







GCCGTGTTCCTGCCACCCCCACACATGCAGCCCCCTTTT







GTGCCCGCCTACCCGGGCTTGGGCTACCCACCGATGCC







CCGGGTGCCCGTGGTGGGCATCACACCCTCACAGATG







GTGGCAAACGCCTTCTGCTCAGCCGCCCAGCTCCAGCC







TCAGCCTGCCACTCTGCTTGGGAAAGCTGGGGCCTTCC







CGCCCCCTGCCATACCCAGTGCCCCTGGGAGCCAGGCC







CGCCCTCGCCCCAATGGGGCCCCCTGGCCCCCTGAGCC







AGCGCCTGCCCCAGCTCCAGAGTTGGACCCCTTTGAGG







CCCAGTGGGCGGCATTAGAAGGCAAAGCCACTGTAGA







GAAACCCTCCAACCCCTTTTCTGGCGACCTGCAAAAGA







CATTCGAGATTGAACTG


G17815.TCGA-19-5960-01A-11R-1850-01.4
154918048
154904891
InFrame
8332
ATGGCCTCCTGCCCAGACTCTGATAATAGCTGGGTGCT







TGCTGGCTCCGAGAGCCTGCCAGTGGAGACACTGGGC







CCGGCATCCAGGATGGACCCAGAATCTGAGAGAGCCC







TGCAGGCCCCTCACAGCCCCTCCAAGACAGATGGGAA







AGAATTAGCTGGGACCATGGATGGAGAAGGGACGCT




CTTCCAGACTGAAAGCCCTCAGTCTGGCAGCATTCTAA







CAGAGGAGACTGAGGTCAAGGGCACCCTGGAAGGTG







ATGTTTGTGGTGTGGAGCCTCCTGGCCCAGGAGACAC







AGTAGTCCAGGGAGACCTGCAGGAGACCACCGTGGTG







ACAGGCCTGGGACCAGACACACAGGACCTGGAAGGC







CAGAGCCCTCCACAGAGCCTGCCTTCAACCCCCAAAGC







AGCTTGGATCAGGGAGGAGGGCCGCTGCTCCAGCAGT







GACGATGACACCGACGTGGACATGGAGGGTCTGCGG







AGACGGCGGGGCCGGGAGGCCGGCCCACCTCAGCCC







ATGGTGCCCCTGGCTGTGGAGAACCAGGCTGGGGGTG







AGGGTGCAGGCGGGGAGCTGGGCATCTCCCTCAACAT







GTGCCTCCTTGGGGCCCTGGTTCTGCTTGGCCTGGGG







GTCCTCCTCTTCTCAGGTGGCCTCTCAGAGTCTGAGAC







TGGGCCCATGGAGGAAGTGGAGCGGCAGGTCCTCCCA







GACCCCGAGGTGCTGGAAGCTGTGGGGGACAGGCAG







GATGGGCTAAGGGAACAGCTGCAGGCCCCAGTGCCTC







CTGACAGTGTCCCCAGCCTGCAAAACATGGGTCTTCTG







CTGGACAAGCTGGCCAAGGAGAACCAGGACATCCGGC







TGCTGCAGGCCCAGCTGCAGGCCCAAAAGGAAGAGCT







TCAGAGCCTGATGCACCAGCCCAAAGGGCTAGAGGAG







GAGAATGCCCAGCTCCGGGGGGCTCTGCAGCAGGGC







GAAGCCTTCCAGCGGGCTCTGGAGTCAGAGCTGCAGC







AGCTGCGGGCCCGGCTCCAGGGGCTGGAGGCCGACT







GTGTCCGGGGCCCAGATGGGGTGTGCCTCAGTGGGG







GTAGAGGCCCACAGGGTGACAAGGCCATCAGGGAGC







AAGGCCCCAGGGAGCAGGAGCCAGAACTCAGCTTCCT







GAAGCAGAAGGAACAGCTGGAGGCTGAGGCACAGGC







ATTAAGGCAAGAGTTAGAGAGGCAGCGACGGCTGCT







GGGGTCTGTACAGCAGGATCTGGAGAGGAGCTTGCA







GGATGCCAGCCGCGGGGACCCAGCTCATGCTGGCTTG







GCTGAGCTGGGCCACAGATTGGCCCAGAAACTGCAGG







GCCTGGAGAACTGGGGCCAGGACCCTGGGGTCTCTGC







CAATGCCTCAAAGGCCTGGCACCAGAAGTCCCACTTCC







AGAATTCTAGGGAGTGGAGTGGAAAGGAAAAGTGGT







GGGATGGGCAGAGAGACCGGAAGGCTGAGCACTGGA







AACATAAGAAGGAAGAATCTGGCCGGGAAAGGAAGA







AGAACTGGGGAGGTCAGGAGGACAGGGAGCCAGCAG







GAAGGTGGAAGGAGGGCAGGCCAAGGGTGGAGGAG







TCGGGGAGCAAGAAGGAGGGCAAGCGACAGGGCCCG







AAGGAACCCCCAAGGAAAAGTGGTAGCTTCCACTCCT







CTGGAGAAAAGCAGAAGCAACCTCGGTGGAGGGAAG







GGACTAAGGACAGCCATGACCCCCTGCCATCCTGGGC







AGAGCTGTTGAGGCCCAAGTACCGGGCACCCCAGGGC







TGCTCAGGTGTGGACGAGTGTGCCCGGCAGGAGGGC







CTGACTTTCTTTGGCACAGAGCTAGCCCCAGTGCGGCA







ACAGGAGCTGGCCTCTCTGCTAAGAACATACTTGGCAC







GGCTGCCCTGGGCTGGGCAGCTGACCAAGGAGCTACC







CCTCTCACCTGCTTTCTTTGGTGAGGATGGCATCTTCCG







TCATGACCGCCTCCGCTTCCGGGATTTTGTGGATGCCC







TGGAGGACAGCTTGGAGGAGGTGGCTGTGCAACAGA







CAGGTGATGATGATGAAGTAGATGACTTTGAGGACTT







CATCTTCAGCCACTTCTTTGGAGACAAAGCACTGAAGA







AGAG|ACTTGGAGCTGATGTCTGTGCTGTCCTCCGGCT







CTCTGGTCCACTCAAGGAACAGTATGCTCAGGAGCAT







GGCTTGAACTTCCAGAGACTCCTGGACACCAGCACCTA







CAAGGAGGCCTTTCGGAAGGACATGATCCGCTGGGGA







GAGGAGAAACGCCAGGCTGACCCAGGCTTCTTTTGCA







GGAAGATTGTGGAGGGCATCTCCCAGCCCATCTGGCT







GGTGAGTGACACACGGAGAGTGTCTGACATCCAGTGG







TTTCGGGAGGCCTATGGGGCCGTGACGCAGACGGTCC







GCGTTGTAGCGTTGGAGCAGAGCCGACAGCAGCGGG







GCTGGGTGTTCACGCCAGGGGTGGACGATGCTGAGTC







AGAATGTGGCCTGGACAACTTCGGGGACTTTGACTGG







GTCATCGAGAACCATGGAGTTGAACAGCGCCTGGAGG







AGCAGTTGGAGAACCTGATAGAATTTATCCGCTCCAGA







CTT


G17662.TCGA-32-1970-01A-01R-1850-01.2
58758160
50285014
InFrame
8333
ATGGAAGCACGTTCTATGCTGGTTCCACCCCAGGCATC







TGTGTGCTTCGAGGATGTGGCTATGGCATTCACACAG







GAGGAGTGGGAACAGCTGGACCTGGCCCAGAGGACA







CTGTACCGAGAGGTGACACTGGAGACCTGGGAGCATA







TTGTCTCCCTGGGGCTTTTCCTTTCCAAATCTGATGTGA




TCTCTCAGCTGGAGCAAGAAGAGGACCTGTGCAGGGC







AGAGCAGGAGGCCCCCCGAG|GTAAGAGCAAAGAGG







CGGAAATTAAGAGAATCAACAAGGAACTGGCCAACAT







CCGCTCCAAGTTCAAAGGAGACAAAGCCTTGGATGGC







TACAGTAAGAAAAAATATGTGTGTAAACTGCTTTTCAT







CTTCCTGCTTGGCCATGACATTGACTTTGGGCACATGG







AGGCTGTGAATCTGTTGAGTTCCAATAAATACACAGAG







AAGCAAATAGGTTACCTGTTCATTTCTGTGCTGGTGAA







CTCGAACTCGGAGCTGATCCGCCTCATCAACAACGCCA







TCAAGAATGACCTGGCCAGCCGCAACCCCACCTTCATG







TGCCTGGCCCTGCACTGCATCGCCAACGTGGGCAGCC







GGGAGATGGGCGAGGCCTTTGCCGCTGACATCCCCCG







CATCCTGGTGGCCGGGGACAGCATGGACAGTGTCAAG







CAGAGTGCGGCCCTGTGCCTCCTTCGACTGTACAAGGC







CTCGCCTGACCTGGTGCCCATGGGCGAGTGGACGGCG







CGTGTGGTACACCTGCTCAATGACCAGCACATGGGTGT







GGTCACGGCCGCCGTCAGCCTCATCACCTGTCTCTGCA







AGAAGAACCCAGATGACTTCAAGACGTGCGTCTCTCTG







GCTGTGTCGCGCCTGAGCCGGATCGTCTCCTCTGCCTC







CACCGACCTCCAGGACTACACCTACTACTTCGTCCCAG







CACCCTGGCTCTCGGTGAAGCTCCTGCGGCTGCTGCAG







TGCTACCCGCCTCCAGAGGATGCGGCTGTGAAGGGGC







GGCTGGTGGAATGTCTGGAGACTGTGCTCAACAAGGC







CCAGGAGCCCCCCAAATCCAAGAAGGTGCAGCATTCC







AACGCCAAGAACGCCATCCTCTTCGAGACCATCAGCCT







CATCATCCACTATGACAGTGAGCCCAACCTCCTGGTTC







GGGCCTGCAACCAGCTGGGCCAGTTCCTGCAGCACCG







GGAGACCAACCTGCGCTACCTGGCCCTGGAGAGCATG







TGCACGCTGGCCAGCTCCGAGTTCTCCCATGAAGCCGT







CAAGACGCACATTGACACCGTCATCAATGCCCTCAAGA







CGGAGCGGGACGTCAGCGTGCGGCAGCGGGCGGCTG







ACCTCCTCTACGCCATGTGTGACCGGAGCAATGCCAAG







CAGATCGTGTCGGAGATGCTGCGGTACCTGGAGACGG







CAGACTACGCCATCCGCGAGGAGATCGTCCTGAAGGT







GGCCATCCTGGCCGAGAAGTACGCCGTGGACTACAGC







TGGTACGTGGACACCATCCTCAACCTCATCCGCATTGC







GGGCGACTACGTGAGTGAGGAGGTGTGGTACCGTGT







GCTACAGATCGTCACCAACCGTGATGACGTCCAGGGC







TATGCCGCCAAGACCGTCTTTGAGGCGCTCCAGGCCCC







TGCCTGTCACGAGAACATGGTGAAGGTTGGCGGCTAC







ATCCTTGGGGAGTTTGGGAACCTGATTGCTGGGGACC







CCCGCTCCAGCCCCCCAGTGCAGTTCTCCCTGCTCCACT







CCAAGTTCCATCTGTGCAGCGTGGCCACGCGGGCGCT







GCTGCTGTCCACCTACATCAAGTTCATCAACCTCTTCCC







CGAGACCAAGGCCACCATCCAGGGCGTCCTGCGGGCC







GGCTCCCAGCTGCGCAATGCTGACGTGGAGCTGCAGC







AGCGAGCCGTGGAGTACCTCACCCTCAGCTCAGTGGC







CAGCACCGACGTCCTGGCCACGGTGCTGGAGGAGATG







CCGCCCTTCCCCGAGCGCGAGTCGTCCATCCTGGCCAA







GCTGAAACGCAAGAAGGGGCCAGGGGCCGGCAGCGC







CCTGGACGATGGCCGGAGGGACCCCAGCAGCAACGA







CATCAACGGGGGCATGGAGCCCACCCCCAGCACTGTG







TCGACGCCCTCGCCCTCCGCCGACCTCCTGGGGCTGCG







GGCAGCCCCTCCCCCGGCAGCACCCCCGGCTTCTGCAG







GAGCAGGGAACCTTCTGGTGGACGTCTTCGATGGCCC







GGCCGCCCAGCCCAGCCTGGGGCCCACCCCCGAGGAG







GCCTTCCTCAGCGAGCTGGAGCCGCCTGCCCCCGAGA







GCCCCATGGCTTTGCTGGCTGACCCAGCTCCAGCTGCT







GACCCAGGTCCTGAGGACATCGGCCCTCCCATTCCGGA







AGCCGATGAGTTGCTGAATAAGTTTGTGTGTAAGAAC







AACGGGGTCCTGTTCGAGAACCAGCTGCTGCAGATCG







GAGTCAAGTCAGAGTTCCGACAGAACCTGGGCCGCAT







GTATCTCTTCTATGGCAACAAGACCTCGGTGCAGTTCC







AGAATTTCTCACCCACTGTGGTTCACCCGGGAGACCTC







CAGACTCATCTGGCTGTGCAGACCAAGCGCGTGGCGG







CGCAGGTGGACGGCGGCGCGCAGGTGCAGCAGGTGC







TCAATATCGAGTGCCTGCGGGACTTCCTGACGCCCCCG







CTGCTGTCCGTGCGCTTCCGGTACGGTGGCGCCCCCCA







GGCCCTCACCCTGAAGCTCCCAGTGACCATCAACAAGT







TCTTCCAGCCCACCGAGATGGCGGCCCAGGATTTCTTC







CAGCGCTGGAAGCAGCTGAGCCTCCCTCAACAGGAGG







CGCAGAAAATCTTCAAAGCCAACCACCCCATGGACGCA







GAAGTTACTAAGGCCAAGCTTCTGGGGTTTGGCTCTGC







TCTCCTGGACAATGTGGACCCCAACCCTGAGAACTTCG







TGGGGGCGGGGATCATCCAGACTAAAGCCCTGCAGGT







GGGCTGTCTGCTTCGGCTGGAGCCCAATGCCCAGGCC







CAGATGTACCGGCTGACCCTGCGCACCAGCAAGGAGC







CCGTCTCCCGTCACCTGTGTGAGCTGCTGGCACAGCAG







TTC


NYU_E
65557650
65592691
InFrame
8334
ATGGCCTCGGAGAGTGGGAAGCTTTGGGGTGGCCGG







TTTGTGGGTGCAGTGGACCCCATCATGGAGAAGTTCA







ACGCGTCCATTGCCTACGACCGGCACCTTTGGGAGGT







GGATGTTCAAGGCAGCAAAGCCTACAGCAGGGGCCTG







GAGAAGGCAGGGCTCCTCACCAAGGCCGAGATGGAC







CAGATACTCCATGGCCTAGACAAGGTGGCTGAGGAGT







GGGCCCAGGGCACCTTCAAACTGAACTCCAATGATGA







GGACATCCACACAGCCAATGAGCGCCGCCTGAAGGAG







CTCATTGGTGCAACGGCAGGGAAGCTGCACACGGGAC







GGAGCCGGAATGACCAGGTGGTCACAGACCTCAGGCT







GTGGATGCGGCAGACCTGCTCCACGCTCTCGGGCCTCC







TCTGGGAGCTCATTAGGACCATGGTGGATCGGGCAGA







GGCGGAACGTGATGTTCTCTTCCCGGGGTACACCCATT







TGCAGAGGGCCCAGCCCATCCGCTGGAGCCACTGGAT







TCTGAGCCACGCCGTGGCACTGACCCGAGACTCTGAG







CGGCTGCTGGAGGTGCGGAAGCGGATCAATGTCCTGC







CCCTGGGGAGTGGGGCCATTGCAGGCAATCCCCTGGG







TGTGGACCGAGAGCTGCTCCGAGCAGAACTCAACTTT







GGGGCCATCACTCTCAACAGCATGGATGCCACTAGTG







AGCGGGACTTTGTGGCCGAGTTCCTGTTCTGGGCTTCG







CTGTGCATGACCCATCTCAGCAGGATGGCCGAGGACC







TCATCCTCTACTGCACCAAGGAATTCAGCTTCGTGCAG







CTCTCAGATGCCTACAGCACGGGAAGCAGCCTGATGC







CCCAGAAGAAAAACCCCGACAGTTTGGAGCTGATCCG







GAGCAAGGCTGGGCGTGTGTTTGGGCGGTGTGCCGG







GCTCCTGATGACCCTCAAGGGACTTCCCAGCACCTACA







ACAAAGACTTACAGGAGGACAAGGAAGCTGTGTTTGA







AGTGTCAGACACTATGAGTGCCGTGCTCCAGGTGGCC







ACTGGCGTCATCTCTACGCTGCAGATTCACCAAGAGAA







CATGGGACAGGCTCTCAGCCCCGACATGCTGGCCACT







GACCTTGCCTATTACCTGGTCCGCAAAGGGATGCCATT







CCGCCAGGCCCACGAGGCCTCCGGGAAAGCTGTGTTC







ATGGCCGAGACCAAGGGGGTCGCCCTCAACCAGCTGT







CACTGCAGGAGCTGCAGACCATCAG|GAAGGATGCCA







ATTCTGCGCTTCTCAGTAACTACGAGGTATTTCAGTTAC







TAACTGATCTGAAAGAGCAGCGTAAAGAAAGTGGAAA







GAATAAACACAGCTCTGGGCAACAGAACTTGAACACT







ATCACCTATGAAACGTTAAAATACATATCAAAAACACC







ATGCAGGCACCAGAGTCCTGAAATTGTCAGAGAATTTC







TCACAGCATTGAAAAGCCACAAGTTGACCAAAGCTGA







GAAGCTCCAGCTGCTGAACCACCGGCCTGTGACTGCT







GTGGAGATCCAGCTGATGGTGGAAGAGAGTGAAGAG







CGGCTCACGGAGGAGCAGATTGAAGCTCTTCTCCACA







CCGTCACCAGCATTCTGCCTGCAGAGCCAGAGGCTGA







GCAGAAGAAGAATACAAACAGCAATGTGGCAATGGAC







GAAGAGGACCCAGCA


G17816.TCGA-28-5215-01A-01R-1850-01.4
55268106
56079562
InFrame
8335
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG







CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT







GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA







GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA







GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|GATGCT







TTCATTGGATTTGGAGGAAATGTGATCAGGCAACAAG







TCAAGGATAACGCCAAATGGTATATCACTGATTTTGTA







GAGCTGCTGGGAGAACTGGAAGAA


G17650.TCGA-28-2513-01A-01R-1850-01.2
55433922
55914330
InFrame
8336
ATGGGCGAGACCATGTCAAAGAGGCTGAAGCTCCACC







TGGGAGGGGAGGCAGAAATGGAGGAACGGGCGTTCG







TCAACCCCTTCCCGGACTACGAGGCCGCCGCCGGGGC







GCTGCTCGCCTCCGGAGCGGCCGAAGAGACAGGCTGT







GTTCGTCCCCCGGCGACCACGGATGAGCCCGGCCTCCC




TTTTCATCAGGACGGGAAG|CAAAAAGAAAATAATATT







CGTTGTTTAACTACGATTGGACATTTTGGTTTGAATGT







TTGCCCAATCAGTTGGTGAGCAGATCTATCCGACAAGG







ATTCACTTTTAATATTCTCTGTGTGGGGGAGACTGGAA







TTGGAAAATCGACACTGATAGACACATTGTTTAATACT







AACTTGAAAGATAACAAATCCTCACATTTTTACTCAAAT







GTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAG







CAATGTTCAGTTGAAATTGACTGTTGTGGAGACAGTAG







GGTATGGTGATCAAATAGACAAAGAAGCCAGCTACCA







ACCAATAGTTGACTACATAGATGCCCAATTTGAGGCCT







ATCTTCAAGAAGAACTGAAGATTAAACGTTCCTTGTTT







GAGTACCATGATTCTCGCGTCCACGTGTGTCTTTACTTC







ATTTCACCTACAGGACATTCCCTGAAGTCTCTTGATCTA







TTAACAATGAAGAACCTTGACAGTAAGGTGAATATTAT







ACCACTGATTGCCAAAGCAGACACTATTTCTAAAAATG







ATTTACAGACGTTTAAGAATAAGATAATGAGTGAATTG







ATTAGCAATGGCATCCAGATATATCAGCTCCCAACAGA







TGAAGAAACTGCTGCTCAAGCGAACTCCTCAGTTAGTG







GGCTGTTACCCTTTGCTGTGGTAGGGAGTACAGATGA







AGTGAAAGTTGGAAAAAGGATGGTCAGAGGCCGTCA







CTACCCTTGGGGAGTTTTGCAAGTGGAAAATGAAAAT







CACTGTGACTTCGTTAAGCTCCGAGATATGCTTCTTTGT







ACCAATATGGAAAATCTAAAAGAAAAAACCCACACTCA







GCACTATGAATGTTATAGGTACCAAAAACTGCAGAAA







ATGGGCTTTACAGATGTGGGTCCAAACAACCAGCCAG







TTAGTTTTCAAGAAATCTTTGAAGCCAAAAGACAAGAG







TTCTATGATCAATGTCAGAGGGAAGAAGAAGAGTTGA







AACAGAGATTTATGCAGCGAGTCAAGGAGAAAGAAGC







AACATTTAAAGAAGCTGAAAAAGAGCTGCAGGACAAG







TTCGAGCATCTTAAAATGATTCAACAGGAGGAGATAA







GGAAGCTCGAGGAAGAGAAAAAACAACTGGAAGGAG







AAATCATAGATTTTTATAAAATGAAAGCTGCCTCCGAA







GCACTGCAGACTCAGCTGAGCACCGATACAAAGAAAG







ACAAACATCGTAAGAAA


BT308
33222419
33303169
InFrame
8337
ATGGCGGCTCCCTTGGTCCTGGTGCTGGTGGTGGCTG







TGACAGTGCGGGCGGCCTTGTTCCGCTCCAGTCTGGCC







GAGTTCATTTCCGAGCGGGTGGAGGTGGTGTCCCCAC







TGAGCTCTTGGAAGAGAGTGGTTGAAGGCCTTTCACT







GTTGGACTTGGGAGTATCTCCGTATTCTGGAGCAGTAT







TTCATGAAACTCCATTAATAATATACCTCTTTCATTTCCT







AATTGACTATGCTGAATTGGTGTTTATGATAACTGATG







CACTCACTGCTATTGCCCTGTATTTTGCAATCCAGGACT







TCAATAAAGTTGTGTTTAAAAAGCAGAAACTCCTCCTA







GAACTGGACCAGTATGCCCCAGATGTGGCCGAACTCA







TCCGGACCCCTATGGAAATGCGTTACATCCCTTTGAAA







GTGGCCCTGTTCTATCTCTTAAATCCTTACACGATTTTG







TCTTGTGTTGCCAAGTCTACCTGTGCCATCAACAACACC







CTCATTGCTTTCTTCATTTTGACTACGATAAAAG|ACAT







AACCAGTGCGGTGCAATCCAAGCGAAGAAAATCCAAG


G17656.TCGA-28-2514-01A-02R-1850-01.2
55268106
55588823
InFrame
8338
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG







CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT







GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA







GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA







GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|TGCACA







GAAGCCAAAAAGCATTGCTGGTATTTCGAAGGACTCT







ATCCAACCTATTATATATGCCGCTCCTACGAGGACTGC







TGTGGCTCCAGGTGCTGTGTGCGGGCCCTCTCCATACA







GAGGCTGTGGTACTTCTGGTTCCTTCTGATGATGGGCG







TGCTTTTCTGCTGCGGAGCCGGCTTCTTCATCCGGAGG







CGCATGTACCCCCCGCCGCTGATCGAGGAGCCAGCCTT







CAATGTGTCCTACACCAGGCAGCCCCCAAATCCCGGCC







CAGGAGCCCAGCAGCCGGGGCCGCCCTATTACACCGA







CCCAGGAGGACCGGGGATGAACCCTGTCGGGAATTCC







ATGGCAATGGCTTTCCAGGTCCCACCCAACTCACCCCA







GGGGAGTGTGGCCTGCCCGCCCCCTCCAGCCTACTGC







AACACGCCTCCGCCCCCGTACGAACAGGTAGTGAAGG







CCAAG


G17639.TCGA-12-3652-01A-01R-1849-01.2
33295425
34382867
InFrame
8339
ATGGCGGAGGCGCCTCCTGTCTCAGGTACTTTTAAATT







CAATACAGATGCTGCTGAATTCATTCCTCAGGAGAAAA







AAAATTCTGGTCTAAATTGTGGGACTCAAAGGAGACT







AGACTCTAATAGGATTGGTAGAAGAAATTACAGTTCAC







CACCTCCCTGTCACCTTTCCAGGCAGGTCCCTTATGATG




AAATCTCTGCTGTTCATCAGCATAGTTATCATCCGTCAG







GAAGCAAACCTAAGAGTCAGCAGACGTCTTTCCAGTCC







TCTCCTTGTAATAAATCGCCCAAGAGCCATGGCCTTCA







GAATCAACCTTGGCAGAAATTGAGGAATGAGAAGCAC







CATATCAGAGTCAAGAAAGCACAGAGTCTTGCTGAGC







AGACCTCAGATACAGCTGGATTAGAGAGCTCGACCAG







ATCAGAGAGTGGGACAGACCTCAGAGAGCATAGTCCT







TCTGAGAGTGAGAAGGAAGTTGTGGGTGCAGATCCCA







GGGGAGCAAAACCCAAAAAAGCAACACAGTTTGTATA







CAGCTATGGTAGAGGACCAAAAGTCAAGGGGAAACTC







AAATGTGAATGGAGTAACCGAACAACTCCAAAACCGG







AGGATGCTGGACCCGAAAGTACCAAACCTGTGGGGGT







TTTCCACCCTGACTCTTCAGAGGCATCCTCTAGAAAAG







GAGTATTGGATGGGTATGGAGCCAGACGAAATGAGC







AGAGAAGATACCCACAGAAAAGGCCTCCCTGGGAAGT







GGAGGGGGCCAGGCCACGACCAGGCAGAAATCCACC







AAAACAGGAGGGCCACCGACATACAAACGCAGGACAC







AGAAACAACATGGGCCCCATTCCAAAGGATGACCTCA







ATGAAAGACCAGCAAAATCTACCTGTGACAGTGAGAA







CTTGGCAGTCATCAACAAGTCTTCCAGGAGGGTTGACC







AAGAGAAATGCACTGTACGGAGGCAGGATCCTCAAGT







AGTATCTCCTTTCTCCCGAGGCAAACAGAACCATGTGC







TAAAGAATGTGGAAACGCACACAG|CAGACAAATGCA







GCCCAAACTACCTGGGCAGTGACTGGTACAACACATG







GAGGATGGAACCTTACAACAGCAGCTGCTGCAACAAG







TATACCACCTACCTTCCTCGGCTGCCTAAGGAGGCCAG







GATGGAGACAGCAGTTCGAGGAATGCCCTTGGAATGC







CCTCCTAGGCCGGAGCGGCTCAATGCCTACGAGCGCG







AAGTGATGGTGAACATGCTGAACTCACTGTCGCGGAA







CCAGCAGCTGCCGCGGATCACGCCCCGATGCGGGTGC







GTGGACCCGCTGCCCGGCCGCCTGCCCTTCCATGGTTA







CGAAAGTGCTTGCTCGGGCCGCCACTACTGTCTGCGCG







GGATGGACTACTACGCCAGCGGGGCGCCCTGCACCGA







CCGCCGCCTGCGGCCTTGGTGCCGGGAGCAACCGACT







ATGTGTACCTCCCTACGAGCACCGGCCCGGAATGCAGT







GTGCTGTTACAACTCCCCCGCCGTCATACTACCCATATC







CGAACCT


G17796.TCGA-41-5651-01A-01R-1850-01.4
58090156
57960909
InFrame
8340
ATGGCGGCGGAAACGCTGCTGTCCAGTTTGTTAGGAC







TGCTGCTTCTGGGACTCCTGTTACCCGCAAGTCTGACC







GGCGGTGTCGGGAGCCTGAACCTGGAGGAGCTGAGT







GAGATGCGTTATGGGATCGAGATCCTGCCGTTGCCTGT







CATGGGAGGGCAGAGCCAATCTTCGGACGTGGTGATT




GTCTCCTCTAAGTACAAACAGCGCTATGAGTGTCGCCT







GCCAGCTGGAGCTATTCACTTCCAGCGTGAAAGGGAG







GAGGAAACACCTGCTTACCAAGGGCCTGGGATCCCTG







AGTTGTTGAGCCCAATGAGAGATGCTCCCTGCTTGCTG







AAGACAAAGGACTGGTGGACATATGAATTCTGTTATG







GACGCCACATCCAGCAATACCACATGGAAGATTCAGA







GATCAAAGGTGAAGTCCTCTATCTCGGCTACTACCAAT







CAGCCTTCGACTGGGATGATGAAACAGCCAAGGCCTC







CAAGCAGCATCGTCTTAAACGCTACCACAGCCAGACCT







ATGGCAATGGGTCCAAGTGCGACCTTAATGGGAGGCC







CCGGGAGGCCGAGGTTCGG|GGTTGTACTGAACGCTT







TGTGTCCAGCCCGGAGGAGATTCTGGATGTGATTGAT







GAAGGGAAATCAAATCGTCATGTGGCTGTCACCAACA







TGAATGAACACAGCTCTCGGAGCCACAGCATCTTCCTC







ATCAACATCAAGCAGGAGAACATGGAAACGGAGCAG







AAGCTCAGTGGGAAGCTGTATCTGGTGGACCTGGCAG







GGAGTGAGAAGGTCAGCAAGACTGGAGCAGAGGGAG







CCGTGCTGGACGAGGCAAAGAATATCAACAAGTCACT







GTCAGCTCTGGGCAATGTGATCTCCGCACTGGCTGAG







GGCACTAAAAGCTATGTTCCATATCGTGACAGCAAAAT







GACAAGGATTCTCCAGGACTCTCTCGGGGGAAACTGC







CGGACGACTATGTTCATCTGTTGCTCACCATCCAGTTAT







AATGATGCAGAGACCAAGTCCACCCTGATGTTTGGGC







AGCGGGCAAAGACCATTAAGAACACTGCCTCAGTAAA







TTTGGAGTTGACTGCTGAGCAGTGgaagaagaaatatgag







asggagaaggagaagacaaaggcccagaaggagaCGATTGCGA







AGCTGGAGGCTGAGCTGAGCCGGTGGCGCAATGGAG







AGAATGTGCCTGAGACAGAGCGCCTGGCTGGGGAGG







AGGCAGCCCTGGGAGCCGAGCTCTGTGAGGAGACCCC







TGTGAATGACAACTCATCCATCGTGGTGCGCATCGCGC







CCGAGGAGCGGCAGAAATACGAGGAGGAGATCCGCC







GTCTCTATAAGCAGCTTGACGACAAGGATGATGAAAT







CAACCAACAAAGCCAACTCATAGAGAAGCTCAAGCAG







CAAATGCTGGACCAGGAAGAGCTGCTGGTGTCCACCC







GAGGAGACAACGAGAAGGTCCAGCGGGAGCTGAGCC







ACCTGCAATCAGAGAACGATGCCGCTAAGGATGAGGT







GAAGGAAGTGCTGCAGGCCCTGGAGGAGCTGGCTGT







GAACTATGACCAGAAGTCCCAGGAGGTGGAGGAGAA







GAGCCAGCAGAACCAGCTTCTGGTGGATGAGCTGTCT







CAGAAGGTGGCCACCATGCTGTCCCTGGAGTCTGAGT







TGCAGCGGCTACAGGAGGTCAGTGGACACCAGCGAA







AACGAATTGCTGAGGTGCTGAACGGGCTGATGAAGGA







TCTGAGCGAGTTCAGTGTCATTGTGGGCAACGGGGAG







ATTAAGCTGCCAGTGGAGATCAGTGGGGCCATCGAGG







AGGAGTTCACTGTGGCCCGACTCTACATCAGCAAAATC







AAATCAGAAGTCAAGTCTGTGGTCAAGCGGTGCCGGC







AGCTGGAGAACCTCCAGGTGGAGTGTCACCGCAAGAT







GGAAGTGACCGGGCGGGAGCTCTCATCCTGCCAGCTC







CTCATCTCTCAGCATGAGGCCAAGATCCGCTCGCTTAC







GGAATACATGCAGAGCGTGGAGCTAAAGAAGCGGCA







CCTGGAAGAGTCCTATGACTCCTTGAGCGATGAGCTG







GCCAAGCTCCAGGCCCAGGAAACTGTGCATGAAGTGG







CCCTGAAGGACAAGGAGCCTGACACTCAGGATGCAGA







TGAAGTGAAGAAGGCTCTGGAGCTGCAGATGGAGAG







TCACCGGGAGGCCCATCACCGGCAGCTGGCCCGGCTC







CGGGACGAGATCAACGAGAAGCAGAAGACCATTGAT







GAGCTCAAAGACCTAAATCAGAAGCTCCAGTTAGAGC







TAGAGAAGCTTCAGGCTGACTACGAGAAGCTGAAGAG







CGAAGAACACGAGAAGAGCACCAAGCTGCAGGAGCT







GACATTTCTGTACGAGCGACATGAGCAGTCCAAGCAG







GACCTCAAGGGTCTGGAGGAGACAGTTGCCCGGGAAC







TCCAGACCCTCCACAACCTTCGCAAGCTGTTCGTTCAA







GACGTCACGACTCGAGTCAAGAAAAGTGCAGAAATGG







AGCCCGAAGACAGTGGGGGGATTCACTCCCAAAAGCA







GAAGATTTCCTTTCTTGAGAACAACCTGGAACAGCTTA







CAAAGGTTCACAAACAGCTGGTACGTGACAATGCAGA







TCTGCGTTGTGAGCTTCCTAAATTGGAAAAACGACTTA







GGGCTACGGCTGAGAGAGTTAAGGCCCTGGAGGGTG







CACTGAAGGAGGCCAAGGAGGGCGCCATGAAGGACA







AGCGCCGGTACCAGCAGGAGGTGGACCGCATCAAGG







AGGCCGTTCGCTACAAGAGCTCGGGCAAACGGGGCCA







TTCTGCCCAGATTGCCAAACCCGTCCGGCCTGGCCACT







ACCCAGCATCCTCACCCACCAACCCCTATGGCACCCGG







AGCCCTGAGTGCATCAGTTACACCAACAGCCTCTTCCA







GAACTACCAGAATCTCTACCTGCAGGCCACACCCAGCT







CCACCTCAGATATGTACTTTGCAAACTCCTGTACCAGC







AGTGGAGCCACATCTTCTGGCGGCCCCTTGGCTTCCTA







CCAGAAGGCCAACATGGACAATGGAAATGCCACAGAT







ATCAATGACAATAGGAGTGACCTGCCGTGTGGCTATG







AGGCTGAGGACCAGGCCAAGCTTTTCCCTCTCCACCAA







GAGACAGCAGCCAGC


G17468.TCGA-19-0957-02A-11R-2005-01.2
61554352
90122482
InFrame
8341
ATGAAGCTTGCTGACAGCGTAATGGCAGGGAAAGCTT







CCGACGGCTCCATCAAATGGCAGCTCTGCTACGACATC







TCGGCCAGAACTTGGTGGATGGATGAATTTCATCCTTT







CATCGAAGCACTTCTGCCCCACGTCCGAGCCTTTGCCT







ACACATGGTTCAACCTGCAGGCCCGAAAACGAAAATA




CTTCAAAAAACATGAAAAGCGTATGTCAAAAGAAGAA







GAGAGAGCCGTGAAGGATGAATTGCTAAGTGAAAAA







CCAGAGGTCAAGCAGAAGTGGGCATCTCGACTTCTGG







CAAAGTTGCGGAAAGATATCCGACCCGAATATCGAGA







GGATTTTGTTCTTACAGTTACAGGGAAAAAACCTCCAT







GTTGTGTTCTTTCCAACCCAGACCAGAAAGGCAAGATG







CGAAGAATTGACTGCCTCCGCCAGGCAGATAAAGTCT







GGAGGTTGGACCTTGTTATGGTGATTTTGTTTAAAGGT







ATTCCGCTGGAAAGTACTGATGGCGAGCGCCTTGTAA







AGTCCCCACAATGCTCTAATCCAGGGCTCTGTGTCCAA







CCCCATCACATAGGGGTTTCTGTTAAGGAACTCGATTT







ATATTTGGCATACTTTGTGCATGCAGCAG|TAATTAGT







GAATGCCAAAGGCAGCAACTGGAGGCTGTGAGCTACT







CCTCTCGATATGCTCTGGGCCTCTTTTATGAAGCTGGT







ACGAAGATTGATGTCCCTTGGGCTGGGCAGTACATCA







CCAGTAATCCCTGCATACGCTTCGTCTCCATTGATAATA







AGAAGCGCAATATAGAGTCATCAGAAATTGGGCCTTC







CCTCGTGATTCACACCACTGTCCCATTTGGAGTTACATA







CTTGGAACACAGCATTGAGGATGTGCAAGAGTTAGTC







TTCCAGCAGCTGGAAAACATTTTGCCGGGTTTGCCTCA







GCCAATTGCTACCAAATGCCAAAAATGGAGACATTCAC







AGGTTACAAATGCTGCTGCCAACTGTCCTGGCCAAATG







ACTCTGCATCACAAACCTTTCCTTGCATGTGGAGGGGA







TGGATTTACTCAGTCCAACTTTGATGGCTGCATCACTTC







TGCCCTATGTGTTCTGGAAGCTTTAAAGAATTATATT


G17790.TCGA-06-5856-01A-01R-1849-01.4
20304379
20987526
InFrame
8342
ATGATGTTAAGAGGAAACCTGAAGCAAGTGCGCATTG







AGAAAAACCCGGCCCGCCTTCGCGCCCTGGAGTCCGC







GGTGGGCGAGAGCGAGCCGGCGGCCGCGGCAGCCAT







GGCGCTCGCTCTTGCCGGGGAGCCGGCACCGCCCGCG







CCCGCGCCTCCAGAGGACCACCCGGACGAGGAGATGG




GGTTCACTATCGACATCAAGAGTTTCCTCAAGCCGGGC







GAGAAGACGTACACGCAGCGCTGCCGCCTCTTCGTGG







GAAATCTGCCCACCGACATCACGGAGGAGGACTTCAA







GAGGCTCTTCGAACGCTATGGCGAGCCCAGCGAAGTC







TTCATCAACCGGGACCGTGGCTTCGGCTTCATCCGCTT







GGAATCCAGAACCCTGGCTGAAATTGCAAAAGCAGAG







CTGGACGGCACCATTCTCAAGAGCAGACCTCTACGGAT







TCGCTTCGCTACACATGGAGCAGCCTTGACTGTCAAGA







ACCTTTCTCCAGTTGTTTCCAATGAGCTGCTAGAGCAA







GCATTTTCTCAGTTTGGTCCAGTAGAGAAAGCTGTTGT







GGTTGTGGATGATCGCGGTAGAGCTACAGGAAAAGGT







TTTGTAGAGTTTGCAGCAAAACCTCCTGCACGAAAGGC







TCTGGAAAGATGTGGTGATGGGGCATTCTTGCTAACA







ACGACCCCTCGTCCAGTCATTGTGGAACCCATGGAGCA







GTTTGATGATGAAGATGGCTTGCCAGAGAAGCTGATG







CAGAAAACTCAACAATATCATAAGGAAAGAGAACAAC







CACCACGTTTTGCTCAACCTGGGACATTTGAATTTGAG







TATGCATCTCGATGGAAGGCTCTTGATGAAATGGAAA







AGCAGCAGCGTGAGCAGGTTGATAGAAACATCAGAG







AAGCCAAAGAGAAACTGGAGGCAGAAATGGAAGCAG







CTAGGCATGAACACCAATTAATGCTAATGAGGCAAGA







TCTAATGAGGCGTCAAGAAGAACTCAGACGCTTGGAA







GAACTCAGAAACCAAGAGTTGCAAAAACGGAAGCAAA







TACAACTAAGACATGAAGAGGAGCATCGGCGGCGTGA







GGAAGAAATGATCCGACACAGAGAACAGGAGGAACT







GAGGCGACAGCAAGAGGGCTTTAAGCCAAACTACATG







GAAAAT|GAAGGAATCGTGTCTCCTAGTGACCTGGACC







TTGTCATGTCAGAAGGGTTGGGCATGCGGTATGCATTC







ATTGGACCCCTGGAAACCATGCATCTCAATGCAGAAG







GTATGTTAAGCTACTGCGACAGATACAGCGAAGGCAT







AAAACATGTCCTACAGACTTTTGGACCCATTCCAGAGT







TTTCCAGGGCCACTGCTGAGAAGGTTAACCAGGACAT







GTGCATGAAGGTCCCTGATGACCCGGAGCACTTAGCT







GCCAGGAGGCAGTGGAGGGACGAGTGCCTCATGAGA







CTCGCCAAGTTGAAGAGTCAAGTGCAGCCCCAG


GBM-CUMC3338_L1
7470322
7217040
InFrame
8343
ATGAAAGAGACTATACAAGGGACCGGGTCCTGGGGG




CCTGAGCCTCCTGGACCCGGCATACCCCCAGCTTACTC







AAGTCCCAGGCGGGAGCGTCTTCGTTGGCCCCCACCTC







CCAAACCCCGACTCAAGTCAGGTGGAGGGTTTGGGCC







AGATCCTGGGTCAGGGACCACAGTGCCAGCCAGACGC







CTCCCTGTCCCCCGACCCTCTTTTGATGCCTCAGCAAGT







gaagaggaggaagaagaggaggaggaggaggatgaagatgaagag







gaggaAGTGGCAGCTTGGAGGCTGCCCCCAAGATGGA







GTCAGCTGGGAACCTCCCAGCGGCCCCGCCCTTCCCGC







CCCACTCATCGAAAAACCTGCTCACAGCGCCGCCGCCG







AGCCATGAGAGCCTTCCGGATGCTGCTCTACTCAAAAA







GCACCTCGCTGACATTCCACTGGAAGCTTTgggggcgcca







ccggggccggcggcggggccTCGCACACCCCAAGAACCATCT







TTCACCCCAGCAAGGGGGTGCGACGCCACAGGTGCCA







TCCCCCTGTTGTCGTTTTGACTCCCCCCGGGGGCCACCT







CCACCCCGGCTGGGTCTGCTAGGTGCTCTCATGGCTGA







GGATGGGGTGAGAGGGTCTCCACCAGTGCCCTCTGGG







CCCCCCATGGAGGAAGATGGACTCAGGTGGACTCCAA







AGTCTCCTCTGGACCCTGACTCGGGCCTCCTTTCATGTA







CTCTGCCCAACGGTTTTGGGGGACAATCTGGGCCAGA







AGGGGAGCGCAGCTTGGCACCCCCTGATGCCAGCATC







CTCATCAGCAATGTGTGCAGCATCGGGGACCATGTGG







CCCAGGAGCTTTTTCAGGGCTCAGATTTGGGCATGGCA







GAAGAGGCAGAGAGGCCTGGGGAGAAAGCCGGCCA







GCACAGCCCCCTGCGAGAGGAGCATGTGACCTGCGTA







CAGAGCATCTTGGACGAATTCCTTCAAACGTATGGCAG







CCTCATACCCCTCAGCACTGATGAGGTAGTAGAGAAG







CTGGAGGACATTTTCCAGCAGGAGTTTTCCACCCCTTC







CAGGAAGGGCCTGGTGTTGCAGCTGATCCAGTCTTAC







CAGCGGATGCCAGGCAATGCCATGGTGAGGGGCTTCC







GAGTGGCTTATAAGCGGCACGTGCTGACCATGGATGA







CTTGGGGACCTTGTATGGACAGAACTGGCTCAATGAC







CAGGTGATGAACATGTATGGAGACCTGGTCATGGACA







CAGTCCCTGAAAAGGTGCATTTCTTCAATAGTTTCTTCT







ATGATAAACTCCGTACCAAGGGTTATGATGGGGTGAA







AAGGTGGACCAAAAAC|ACCCGGCACTACGTGGGCTC







AGCAGCTGCTTTTGCAGGGACACCAGAGCATGGACAA







TTCCAAGGCAGTCCTGGTGGTGCCTATGGGACTGCTCA







GCCCCCACCTCACTATGGGCCCACACAGCCAGCTTATA







GTCCTAGTCAGCAGCTCAGAGCTCCTTCGGCATTCCCT







GCAGTGCAGTACCTATCTCAGCCACAGCCACAGCCCTA







TGCTGTGCATGGCCACTTTCAGCCCACTCAGACAGGTT







TCCTCCAGCCTGGTGGTGCCCTGTCCTTGCAAAAGCAG







ATGGAACATGCTAACCAGCAGACTGGCTTCTCCGACTC







ATCCTCTCTGCGCCCCATGCACCCCCAGGCTCTGCATCC







AGCCCCTGGACTCCTTGCTTCCCCCCAGCTCCCTGTGCA







GATGCAGCCAGCAGGAAAGTCGGGCTTTGCAGCTACC







AGCCAACCTGGCCCTCGGCTCCCCTTCATCCAACACAG







CCAGAACCCGCGATTCTACCACAAG


G17790.TCGA-06-5856-01A-01R-1849-01.4
117175595
69229609
InFrame
8344
ATGGTGAACCTGGCGGCCATGGTGTGGCGCCGGCTTC







TGCGGAAGAGGTGGGTGCTCGCCCTGGTCTTCGGGCT







GTCGCTCGTCTACTTCCTCAGCAGCACCTTCAAGCAG|







GATCTTGATGCTGGTGTAAGTGAACATTCAGGTGATTG







GTTGGATCAGGATTCAGTUCAGATCAGTTTAGTGTAG




AATTTGAAGTTGAATCTCTCGACTCAGAAGATTATAGC







CTTAGTGAAGAAGGACAAGAACTCTCAGATGAAGATG







ATGAGGTATATCAAGTTACTGTGTATCAGGCAGGGGA







GAGTGATACAGATTCATTTGAAGAAGATCCTGAAATTT







CCTTAGCTGACTATTGGAAATGCACTTCATGCAATGAA







ATGAATCCCCCCCTTCCATCACATTGCAACAGATGTTG







GGCCCTTCGTGAGAATTGGCTTCCTGAAGATAAAGGG







AAAGATAAAGGGGAAATCTCTGAGAAAGCCAAACTGG







AAAACTCAACACAAGCTGAAGAGGGCTTTGATGTTCCT







GATTGTAAAAAAACTATAGTGAATGATTCCAGAGAGT







CATGTGTTGAGGAAAATGATGATAAAATTACACAAGC







TTCACAATCACAAGAAAGTGAAGACTATTCTCAGCCAT







CAACTTCTAGTAGCATTATTTATAGCAGCCAAGAAGAT







GTGAAAGAGTTTGAAAGGGAAGAAACCCAAGACAAA







GAAGAGAGTGTGGAATCTAGTTTGCCCCTTAATGCCAT







TGAACCTTGTGTGATTTGTCAAGGTCGACCTAAAAATG







GTTGCATTGTCCATGGCAAAACAGGACATCTTATGGCC







TGCTTTACATGTGCAAAGAAGCTAAAGAAAAGGAATA







AGCCCTGCCCAGTATGTAGACAACCAATTCAAATGATT







GTGCTAACTTATTTCCCC


G17802.TCGA-28-5208-01A-01R-1850-01.4
51203855
55886916
InFrame
8345
ATGGACGCGCCGCGCGCCTCGGCGGCCAAGCCCCCGA







CCGGGAGGAAGATGAAGGCTCGTGCTCCCCCACCTCC







TGGAAAGGCTGCCACTCTGCATGTGCACAGTGACCAG







AAGCCCCCCCACGATGGGGCCCTCGGGTCGCAGCAGA







ACTTGGTTCGCATGAAGGAGGCGCTGAGGGCCAGCAC




CATGGACGTCACCGTGGTCCTGCCTAGTGGGCTGGAG







AAGAGGAGCGTGCTCAATGGGAGCCATGCGATGATG







GACCTACTGGTTGAACTTTGCCTTCAGAACCACCTGAA







TCCATCCCACCATGCCCTTGAAATTCGGTCTTCAGAAAC







CCAACAACCTTTGAGTTTTAAGCCAAATACTTTGATTG







GGACCCTGAATGTGCATACTGTGTTTCTGAAAGAAAAA







GTTCCTGAAGAGAAGGTTAAGCCTGGTCCCCCTAAGG







TGCCTGAGAAATCTGTGCGTTTGGTCGTGAATTACCTG







CGGACACAAAAAGCTGTTGTGCGTGTGAGCCCTGAGG







TTCCTCTCCAGAATATTCTCCCAGTCATTTGTGCAAAGT







GTGAGGTCAGCCCAGAGCACGTGGTTCTCCTCAGGGA







CAACATTGCCGGAGAGGAGCTGGAGCTGTCCAAGTCC







CTGAACGAGCTCGGGATAAAGGAGCTCTACGCGTGGG







ACAACAGAAGAGAAACCTTTAGGAAATCATCACTTGG







CAATGATGAGACAGATAAAGAGAAGAAAAAATTTCTG







GGATTTTTCAAAGTTAATAAAAGAAGCAATAGTAAGG







CTGAGCAGCTCGTGCTGTCGGGTGCAGACAGCGATGA







GGACACCTCCAGGGCTGCCCCAGGAAGGGGTTTGAAC







GGCTGTTTAACGACCCCCAACTCCCCATCCATGCACTC







ACGTTCTCTTACGCTGGGTCCATCCCTCTCGCTGGGCA







GCATCTCAGGGGTGTCCGTGAAGTCGGAGATGAAGAA







GCGCCGAGCCCCTCCTCCTCCAGGTTCAGGGCCACCTG







TGCAAGACAAGGCATCGGAAAAG|GGGCTGTTACCCT







TTGCTGTGGTAGGGAGTACAGATGAAGTGAAAGTTGG







AAAAAGGATGGTCAGAGGCCGTCACTACCCTTGGGGA







GTTTTGCAAGTGGAAAATGAAAATCACTGTGACTTCGT







TAAGCTCCGAGATATGCTTCTTTGTACCAATATGGAAA







ATCTAAAAGAAAAAACCCACACTCAGCACTATGAATGT







TATAGGTACCAAAAACTGCAGAAAATGGGCTTTACAG







ATGTGGGTCCAAACAACCAGCCAGTTAGTTTTCAAGAA







ATCTTTGAAGCCAAAAGACAAGAGTTCTATGATCAATG







TCAGAGGGAAGAAGAAGAGTTGAAACAGAGATTTAT







GCAGCGAGTCAAGGAGAAAGAAGCAACATTTAAAGA







AGCTGAAAAAGAGCTGCAGGACAAGTTCGAGCATCTT







AAAATGATTCAACAGGAGGAGATAAGGAAGCTCGAG







GAAGAGAAAAAACAACTGGAAGGAGAAATCATAGATT







TTTATAAAATGAAAGCTGCCTCCGAAGCACTGCAGACT







CAGCTGAGCACCGATACAAAGAAAGACAAACATCGTA







AGAAA


G17210.TCGA-12-0616-01A-01R-1849-01.2
101176424
30536851
InFrame
8346
ATGAAGCTGGCCCTGCTCCTGCCCTGGGCGTGTTGCTG







CCTCTGCGGGTCGGCGCTGGCCACCGGCTTCCTCTATC







CCTTCTCGGCCGCAGCTCTGCAGCAGCACGGCTACCCC







GAGCCCGGCGCCGGCTCCCCTGGCAGCGGCTACGCGA







GCCGCCGGCACTGGTGCCATCACACAGTGACACGGAC




GGTGTCCTGCCAGGTGCAGAATGGCTCGGAGACGGTG







GTCCAGCGCGTGTACCAGAGCTGCCGGTGGCCGGGGC







CCTGCGCCAACCTCGTAAGTTACAGGACTCTGATCAGA







CCCACCTACAGAGTGTCCTACCGCACGGTGACGGTGCT







GGAGTGGAGATGCTGCCCTGGCTTCACCGGGAGCAAC







TGTGATGAGGAATGCATGAACTGCACCCGGCTCAGTG







ACATGAGTGAGCGACTGACCACACTGGAGGCCAAG|A







TTATTTGCATGGGTGCAAAAGAAAATGGTTTGCCGCTG







GAGTATCAAGAGAAGTTAAAAGCAATAGAACCAAATG







ACTATACAGGAAAGGTCTCAGAAGAAATTGAAGACAT







CATCAAAAAGGGGGAAACACAAACTCTT


NYU_B
233613792
234862561
InFrame
8347
ATGGCAGCGGAAACGCAGACACTGAACTTTGGGCCTG







AATGGCTCCGAGCTCTGTCCAGTGGTGGGAGTATTAC







ATCCCCTCCTCTTTCTCCAGCATTGCCGAAGTATAAATT







AGCAGATTATCGTTACGGCAGAGAAGAAATGTTAGCA







CTTTTCCTTAAAGACAACAAGATACCTTCAGACCTTCTG







GATAAAGAATTTCTGCCTATCCTCCAGGAGGAACCCCT







TCCACCATTGGCTCTGGTACCCTTTACAGAAGAAGAAC







AG|CTCAAAGAAATTCTCGAATGTTCTCACCTATTAACA







GTTATTAAAATGGAAGAAGCTGGGGATGAAATTGTGA







GCAATGCCATCTCCTACGCTCTATACAAAGCCTTCAGC







ACCAGTGAGCAAGACAAGGATAACTGGAATGGGCAG







CTGAAGCTTCTGCTGGAGTGGAACCAGCTGGACTTAG







CCAATGATGAGATTTTCACCAATGACCGCCGATGGGA







GTCTGCTGACCTTCAAGAAGTCATGTTTACGGCTCTCA







TAAAGGACAGACCCAAGTTTGTCCGCCTCTTTCTGGAG







AATGGCTTGAACCTACGGAAGTTTCTCACCCATGATGT







CCTCACTGAACTCTTCTCCAACCACTTCAGCACGCTTGT







GTACCGGAATCTGCAGATCGCCAAGAATTCCTATAATG







ATGCCCTCCTCACGTTTGTCTGGAAACTGGTTGCGAAC







TTCCGAAGAGGCTTCCGGAAGGAAGACAGAAATGGCC







GGGACGAGATGGACATAGAACTCCACGACGTGTCTCC







TATTACTCGGCACCCCCTGCAAGCTCTCTTCATCTGGGC







CATTCTTCAGAATAAGAAGGAACTCTCCAAAGTCATTT







GGGAGCAGACCAGGGGCTGCACTCTGGCAGCCCTGG







GAGCCAGCAAGCTTCTGAAGACTCTGGCCAAAGTGAA







GAACGACATCAATGCTGCTGGGGAGTCCGAGGAGCTG







GCTAATGAGTACGAGACCCGGGCTGTTGAGCTGTTCA







CTGAGTGTTACAGCAGCGATGAAGACTTGGCAGAACA







GCTGCTGGTCTATTCCTGTGAAGCTTGGGGTGGAAGC







AACTGTCTGGAGCTGGCGGTGGAGGCCACAGACCAGC







ATTTCATCGCCCAGCCTGGGGTCCAGAATTTTCTTTCTA







AGCAATGGTATGGAGAGATTTCCCGAGACACCAAGAA







CTGGAAGATTATCCTGTGTCTGTTTATTATACCCTTGGT







GGGCTGTGGCTTTGTATCATTTAGGAAGAAACCTGTCG







ACAAGCACAAGAAGCTGCTTTGGTACTATGTGGCGTTC







TTCACCTCCCCCTTCGTGGTCTTCTCCTGGAATGTGGTC







TTCTACATCGCCTTCCTCCTGCTGTTTGCCTACGTGCTG







CTCATGGATTTCCATTCGGTGCCACACCCCCCCGAGCT







GGTCCTGTACTCGCTGGTCTTTGTCCTCTTCTGTGATGA







AGTGAGACAGTGGTACGTAAATGGGGTGAATTATTTT







ACTGACCTGTGGAATGTGATGGACACGCTGGGGCTTT







TTTACTTCATAGCAGGAATTGTATTTCGGCTCCACTCTT







CTAATAAAAGCTCTTTGTATTCTGGACGAGTCATTTTCT







GTCTGGACTACATTATTTTCACTCTAAGATTGATCCACA







TTTTTACTGTAAGCAGAAACTTAGGACCCAAGATTATA







ATGCTGCAGAGGATGCTGATCGATGTGTTCTTCTTCCT







GTTCCTCTTTGCGGTGTGGATGGTGGCCTTTGGCGTGG







CCAGGCAAGGGATCCTTAGGCAGAATGAGCAGCGCTG







GAGGTGGATATTCCGTTCGGTCATCTACGAGCCCTACC







TGGCCATGTTCGGCCAGGTGCCCAGTGACGTGGATGG







TACCACGTATGACTTTGCCCACTGCACCTTCACTGGGA







ATGAGTCCAAGCCACTGTGTGTGGAGCTGGATGAGCA







CAACCTGCCCCGGTTCCCCGAGTGGATCACCATCCCCC







TGGTGTGCATCTACATGTTATCCACCAACATCCTGCTG







GTCAACCTGCTGGTCGCCATGTTTGGCTACACGGTGGG







CACCGTCCAGGAGAACAATGACCAGGTCTGGAAGTCC







CAGAGGTACTTCCTGGTGCAGGAGTACTGCAGCCGCC







TCAATATCCCCTTCCCCTTCATCGTCTTCGCTTACTTCTA







CATGGTGGTGAAGAAGTGCTTCAAGTGTTGCTGCAAG







GAGAAAAACATGGAGTCTTCTGTCTGCTGTTTCAAAAA







TGAAGACAATGAGACTCTGGCATGGGAGGGTGTCATG







AAGGAAAACTACCTTGTCAAGATCAACACAAAAGCCA







ACGACACCTCAGAGGAAATGAGGCATCGATTTAGACA







ACTGGATACAAAGCTTAATGATCTCAAGGGTCTTCTGA







AAGAGATTGCTAATAAAATCAAA


G17469.TCGA-06-2557-01A-01R-1849-01.2
55268106
55863785
InFrame
8348
ATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGG







CGCTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCT







GGAGGAAAAGAAAGTTTGCCAAGGCACGAGTAACAA







GCTCACGCAGTTGGGCACTTTTGAAGATCATTTTCTCA







GCCTCCAGAGGATGTTCAATAACTGTGAGGTGGTCCTT




GGGAATTTGGAAATTACCTATGTGCAGAGGAATTATG







ATCTTTCCTTCTTAAAGACCATCCAGGAGGTGGCTGGT







TATGTCCTCATTGCCCTCAACACAGTGGAGCGAATTCC







TTTGGAAAACCTGCAGATCATCAGAGGAAATATGTACT







ACGAAAATTCCTATGCCTTAGCAGTCTTATCTAACTATG







ATGCAAATAAAACCGGACTGAAGGAGCTGCCCATGAG







AAATTTACAGGAAATCCTGCATGGCGCCGTGCGGTTCA







GCAACAACCCTGCCCTGTGCAACGTGGAGAGCATCCA







GTGGCGGGACATAGTCAGCAGTGACTTTCTCAGCAAC







ATGTCGATGGACTTCCAGAACCACCTGGGCAGCTGCC







AAAAGTGTGATCCAAGCTGTCCCAATGGGAGCTGCTG







GGGTGCAGGAGAGGAGAACTGCCAGAAACTGACCAA







AATCATCTGTGCCCAGCAGTGCTCCGGGCGCTGCCGTG







GCAAGTCCCCCAGTGACTGCTGCCACAACCAGTGTGCT







GCAGGCTGCACAGGCCCCCGGGAGAGCGACTGCCTG







GTCTGCCGCAAATTCCGAGACGAAGCCACGTGCAAGG







ACACCTGCCCCCCACTCATGCTCTACAACCCCACCACGT







ACCAGATGGATGTGAACCCCGAGGGCAAATACAGCTT







TGGTGCCACCTGCGTGAAGAAGTGTCCCCGTAATTATG







TGGTGACAGATCACGGCTCGTGCGTCCGAGCCTGTGG







GGCCGACAGCTATGAGATGGAGGAAGACGGCGTCCG







CAAGTGTAAGAAGTGCGAAGGGCCTTGCCGCAAAGTG







TGTAACGGAATAGGTATTGGTGAATTTAAAGACTCACT







CTCCATAAATGCTACGAATATTAAACACTTCAAAAACT







GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTG







GCATTTAGGGGTGACTCCTTCACACATACTCCTCCTCTG







GATCCACAGGAACTGGATATTCTGAAAACCGTAAAGG







AAATCACAGGGTTTTTGCTGATTCAGGCTTGGCCTGAA







AACAGGACGGACCTCCATGCCTTTGAGAACCTAGAAA







TCATACGCGGCAGGACCAAGCAACATGGTCAGTTTTCT







CTTGCAGTCGTCAGCCTGAACATAACATCCTTGGGATT







ACGCTCCCTCAAGGAGATAAGTGATGGAGATGTGATA







ATTTCAGGAAACAAAAATTTGTGCTATGCAAATACAAT







AAACTGGAAAAAACTGTTTGGGACCTCCGGTCAGAAA







ACCAAAATTATAAGCAACAGAGGTGAAAACAGCTGCA







AGGCCACAGGCCAGGTCTGCCATGCCTTGTGCTCCCCC







GAGGGCTGCTGGGGCCCGGAGCCCAGGGACTGCGTC







TCTTGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGG







ACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAGTT







TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAG







TGCCTGCCTCAGGCCATGAACATCACCTGCACAGGACG







GGGACCAGACAACTGTATCCAGTGTGCCCACTACATTG







ACGGCCCCCACTGCGTCAAGACCTGCCCGGCAGGAGT







CATGGGAGAAAACAACACCCTGGTCTGGAAGTACGCA







GACGCCGGCCATGTGTGCCACCTGTGCCATCCAAACTG







CACCTACGGATGCACTGGGCCAGGTCTTGAAGGCTGT







CCAACGAATGGGCCTAAGATCCCGTCCATCGCCACTGG







GATGGTGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCC







CTGGGGATCGGCCTCTTCATGCGAAGGCGCCACATCG







TTCGGAAGCGCACGCTGCGGAGGCTGCTGCAGGAGA







GGGAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGC







TCCCAACCAAGCTCTCTTGAGGATCTTGAAGGAAACTG







AATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTT







CGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGGT







GAGAAAGTTAAAATTCCCGTCGCTATCAAGGAATTAA







GAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCT







CGATGAAGCCTACGTGATGGCCAGCGTGGACAACCCC







CACGTGTGCCGCCTGCTGGGCATCTGCCTCACCTCCAC







CGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCC







TCCTGGACTATGTCCGGGAACACAAAGACAATATTGG







CTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAA







AGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCA







CCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACA







CCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCA







AACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG







AAGGAGGCAAAGTGCCTATCAAGTGGATGGCATTGGA







ATCAATTTTACACAGAATCTATACCCACCAGAGTGATG







TCTGGAGCTACGGGGTGACTGTTTGGGAGTTGATGAC







CTTTGGATCCAAGCCATATGACGGAATCCCTGCCAGCG







AGATCTCCTCCATCCTGGAGAAAGGAGAACGCCTCCCT







CAGCCACCCATATGTACCATCGATGTCTACATGATCAT







GGTCAAGTGCTGGATGATAGACGCAGATAGTCGCCCA







AAGTTCCGTGAGTTGATCATCGAATTCTCCAAAATGGC







CCGAGACCCCCAGCGCTACCTTGTCATTCAG|CTGCAG







GACAAGTTCGAGCATCTTAAAATGATTCAACAGGAGG







AGATAAGGAAGCTCGAGGAAGAGAAAAAACAACTGG







AAGGAGAAATCATAGATTTTTATAAAATGAAAGCTGCC







TCCGAAGCACTGCAGACTCAGCTGAGCACCGATACAA







AGAAAGACAAACATCGTAAGAAA


G17480.TCGA-27-1830-01A-01R-1850-01.2
231945028
230377652
InFrame
8349
ATGATCACCTCGGCCGCTGGAATTATTTCTCTTCTGGAT







GAAGATGAACCACAGCTTAAGGAATTTGCACTACACA







AATTGAATGCAGTTGTTAATGACTTCTGGGCAGAAATT







TCCGAGTCCGTAGACAAAATAGAGGTTTTATACGAAG







ATGAAGGTTTCCGGAGTCGGCAGTTTGCAGCCTTAGT




GGCATCTAAAGTATTTTATCACCTGGGGGCTTTTGAGG







AGTCTCTGAATTATGCTCTTGGAGCAGGGGACCTCTTC







AATGTCAATGATAACTCTGAATATGTGGAAACTATTAT







AGCAAAATGCATTGATCACTACACCAAACAATGTGTGG







AAAATGCAGATTTGCCTGAAGGAGAAAAAAAACCAAT







TGACCAGAGATTGGAAGGCATCGTAAATAAAATGTTC







CAGCGATGTCTAGATGATCACAAGTATAAACAGGCTAT







TGGCATTGCTCTGGAGACACGAAGACTGGACGTCTTT







GAAAAGACCATACTGGAGTCGAATGATGTCCCAGGAA







TGTTAGCTTATAGCCTTAAGCTCTGCATGTCTTTAATGC







AGAATAAACAGTTTCGGAATAAAGTACTAAGAGTTCTA







GTTAAAATCTACATGAACTTGGAGAAACCTGATTTCAT







CAATGTTTGTCAGTGCTTAATTTTCTTAGATGATCCTCA







GGCTGTGAGTGATATCTTAGAGAAACTGGTAAAGGAA







GACAACCTCCTGATGGCATATCAGATTTGTTTTGATTTG







TATGAAAGTGCTAGCCAGCAGTTTTTGTCATCTGTAAT







CCAGAATCTTCGAACTGTTGGCACCCCTATTGCTTCTGT







GCCTGGATCCACTAATACGGGTACTGTTCCGGGATCAG







AGAAAGACAGTGACTCGATGGAAACAGAAGAAAAGA







CAAGCAGTGCATTTGTAGGAAAGACACCAGAAGCCAG







TCCAGAGCCTAAGGACCAGACTTTGAAAATGATTAAA







ATTTTAAGTGGTGAAATGGCTATTGAGTTACATCTGCA







GTTCTTAATACGAAACAATAATACAGACCTCATGATTC







TAAAAAACACAAAGGATGCAGTACGGAATTCTGTATG







TCATACTGCAACCGTTATAGCAAACTCTTTTATGCACTG







TGGGACAACCAGTGACCAGTTTCTTAGAGATAATTTGG







AATGGTTAGCCAGAGCCACTAACTGGGCAAAATTTACT







GCTACAGCCAGTTTGGGTGTAATTCATAAGGGTCATGA







AAAAGAAGCATTACAGTTAATGGCAACATACCTTCCCA







AGGATACTTCTCCAGGATCAGCCTATCAGGAAGGTGG







AGGTCTCTATGCACTAGGTCTTATTCATGCCAATCATG







GTGGTGATATAATTGACTATCTGCTTAATCAGCTTAAG







AACGCCAGCAATGATTGCAACTTTTTCCTGTACCTGTG







AGGAGCAGTACGTGGGTACTTTCTGTGAAGAATACGA







TGCTTGCCAGAGGAAACCTTGCCAAAACAACGCGAGC







TGTATTGATGCAAATGAAAAGCAAGATGGGAGCAATT







TCACCTGTGTTTGCCTTCCTGGTTATACTGGAGAGCTTT







GCCAGTCCAAGATTGATTACTGCATCCTAGACCCATGC







AGAAATGGAGCAACATGCATTTCCAGTCTCAGTGGATT







CACCTGCCAGTGTCCAGAAGGATACTTCGGATCTGCTT







GTGAAGAAAAGGTGGACCCCTGCGCCTCGTCTCCGTG







CCAGAACAACGGCACCTGCTATGTGGACGGGGTACAC







TTTACCTGCAACTGCAGCCCGGGCTTCACAGGGCCGAC







CTGTGCCCAGCTTATTGACTTCTGTGCCCTCAGCCCCTG







TGCTCATGGCACGTGCCGCAGCGTGGGCACCAGCTAC







AAATGCCTCTGTGATCCAGGTTACCATGGCCTCTACTG







TGAGGAGGAATATAATGAGTGCCTCTCCGCTCCATGCC







TGAATGCAGCCACCTGCAGGGACCTCGTTAATGGCTAT







GAGTGTGTGTGCCTGGCAGAATACAAAGGAACACACT







GTGAATTGTACAAGGATCCCTGCGCTAACGTCAGCTGT







CTGAACGGAGCCACCTGTGACAGCGACGGCCTGAATG







GCACGTGCATCTGTGCACCCGGGTTTACAGGTGAAGA







GTGCGACATTGACATAAATGAATGTGACAGTAACCCCT







GCCACCATGGTGGGAGCTGCCTGGACCAGCCCAATGG







TTATAACTGCCACTGCCCGCATGGTTGGGTGGGAGCA







AACTGTGAGATCCACCTCCAATGGAAGTCCGGGCACA







TGGCGGAGAGCCTCACCAACATGCCACGGCACTCCCTC







TACATCATCATTGGAGCCCTCTGCGTGGCCTTCATCCTT







ATGCTGATCATCCTGATCGTGGGGATTTGCCGCATCAG







CCGCATTGAATACCAGGGTTCTTCCAGGCCAGCCTATG







AGGAGTTCTACAACTGCCGCAGCATCGACAGCGAGTT







CAGCAATGCCATTGCATCCATCCGGCATGCCAGGTTTG







GAAAGAAATCCCGGCCTGCAATGTATGATGTGAGCCC







CATCGCCTATGAAGATTACAGTCCTGATGACAAACCCT







TGGTCACACTGATTAAAACTAAAGATTTG


G17634.TCGA-19-2625-01A-01R-1850-01.2
36424288
37967938
InFrame
8350
ATGGCGGAGGGCGCCCAGCCGCATCAGCCGCCTCAGC







TcgggcccggcgccgccgcccggggcATGAAGCGGGAGTCGG







AGCTGGAGCTGCCGGTGCCCGGAGCGGGAGGAGACG







GAGCCGATCCCGGCCTGAGCAAGCGGCCGCGCACTGA







GGAGGCGGCGGCCGACGGTGGCGGCGGGATGCAG|G




GGGAACTTGAGGTTAAGAACATGGACATGAAGCCGG







GGTCAACCCTGAAGATCACAGGCAGCATCGCCGATGG







CACTGATGGCTTTGTAATTAATCTGGGCCAGGGGACA







GACAAGCTGAACCTGCATTTCAACCCTCGCTTCAGCGA







ATCCACCATTGTCTGCAACTCATTGGACGGCAGCAACT







GGGGGCAAGAACAACGGGAAGATCACCTGTGCTTCAG







CCCAGGGTCAGAGGTCAAGTTCACAGTGACCTTTGAG







AGTGACAAATTCAAGGTGAAGCTGCCAGATGGGCACG







AGCTGACTTTTCCCAACAGGCTGGGTCACAGCCACCTG







AGCTACCTGAGCGTAAGGGGCGGGTTCAACATGTCCT







CTTTCAAGTTAAAAGAA


G17654.TCGA-41-4097-01A-01R-1850-01.2
66313803
86374956
InFrame
8351
ATGGACCGGCCGGGGTTCGTGGCAGCGCTGGTGGCT







GGTGGGGTAGCAGGTGTTTCTGTTGACTTGATATTATT







TCCTCTGGATACCATTAAAACCAGGCTGCAGAGTCCCC







AAGGATTTAGTAAGGCTGGTGGTTTTCATGGAATATAT







GCTGGCGTTCCTTCTGCTGCTATTGGATCCTTTCCTAAT




GCTGCTGCATTTTTTATCACCTATGAATATGTGAAGTG







GTTTTTGCATGCTGATTCATCTTCATATTTGACACCTAT







GAAACATATGTTGGCTGCCTCTGCTGGAGAAGTGGTT







GCCTGCCTGATTCGAGTTCCATCTGAAGTGGTTAAGCA







GAGGGCACAGGTATCTGCTTCTACAAGAACATTTCAGA







TTTTCTCTAACATCTTATATGAAGAGGGTATCCAAGGG







TTGTATCGAGGCTATAAAAGCACAGTTTTAAGAGAG|A







TAGAAGAAGTCAGAGATGCCATGGAAAATGAAATGAG







AACCCAGCTTCGCCGACAGGCAGCTGCCCACACTGATC







ACTTGCGAGATGTCCTTAGGGTACAAGAACAGGAATT







GAAGTCTGAATTTGAGCAGAACCTGTCTGAGAAACTCT







CTGAACAAGAATTACAATTTCGTCGTCTCAGTCAAGAG







CAAGTTGACAACTTTACTCTGGATATAAATACTGCCTAT







GCCAGACTCAGAGGAATCGAACAGGCTGTTCAGAGCC







ATGCAGTTGCTGAAGAGGAAGCCAGAAAAGCCCACCA







ACTCTGGCTTTCAGTGGAGGCATTAAAGTACAGCATGA







AGACCTCATCTGCAGAAACACCTACTATCCCGCTGGGT







AGTGCAGTTGAGGCCATCAAAGCCAACTGTTCTGATAA







TGAATTCACCCAAGCTTTAACCGCAGCTATCCCTCCAG







AGTCCCTGACCCGTGGGGTGTACAGTGAAGAGACCCT







TAGAGCCCGTTTCTATGCTGTTCAAAAACTGGCCCGAA







GGGTAGCAATGATTGATGAAACCAGAAATAGCTTGTA







CCAGTACTTCCTCTCCTACCTACAGTCCCTGCTCCTATTC







CCACCTCAGCAACTGAAGCCGCCCCCAGAGCTCTGCCC







TGAGGATATAAACACATTTAAATTACTGTCATATGCTTC







CTATTGCATTGAGCATGGTGATCTGGAGCTAGCAGCA







AAGTTTGTCAATCAGCTGAAGGGGGAATCCAGACGAG







TGGCACAGGACTGGCTGAAGGAAGCCCGAATGACCCT







AGAAACGAAACAGATAGTGGAAATCCTGACAGCATAT







GCCAGCGCCGTAGGAATAGGAACCACTCAGGTGCAGC







CAGAG


G17485.
142166712
246093239
InFrame
8352
ATGGGAGTCCCCAAGTTTTACAGATGGATCTCAGAGC


TCGA-14-




GGTATCCCTGTCTCAGCGAAGTGGTGAAAGAGCATCA


1402-




G|GTGATCTGCAACTCTTTCACCATCTGTAATGCGGAG


02A-01R-




ATGCAGGAAGTTGGTGTTGGCCTATATCCCAGTATCTC


2005-




TTTGCTCAATCACAGCTGTGACCCCAACTGTTCGATTGT


01.2




GTTCAATGGGCCCCACCTCTTACTGCGAGCAGTCCGAG







ACATCGAGGTGGGAGAGGAGCTCACCATCTGCTACCT







GGATATGCTGATGACCAGTGAGGAGCGCCGGAAGCA







GCTGAGGGACCAGTACTGCTTTGAATGTGACTGTTTCC







GTTGCCAAACCCAGGACAAGGATGCTGATATGCTAAC







TGGTGATGAGCAAGTATGGAAGGAAGTTCAAGAATCC







CTGAAAAAAATTGAAGAACTGAAGGCACACTGGAAGT







GGGAGCAGGTTCTGGCCATGTGCCAGGCAATCATAAG







CAGCAATTCTGAACGGCTTCCCGATATCAACATCTACC







AGCTGAAGGTGCTCGACTGCGCCATGGATGCCTGCAT







CAACCTCGGCCTGTTGGAGGAAGCCTTGTTCTATGGTA







CTCGGACCATGGAGCCATACAGGATTTTTTTCCCAGGA







AGCCATCCCGTCAGAGGGGTTCAAGTGATGAAAGTTG







GCAAACTGCAGCTACATCAAGGCATGTTTCCCCAAGCA







ATGAAGAATCTGAGACTGGCTTTTGATATTATGAGAGT







GACACATGGCAGAGAACACAGCCTGATTGAAGATTTG







ATTCTACTTTTAGAAGAATGCGACGCCAACATCAGAGC







ATCC


G17498.
119976637
332025
InFrame
8353
atggccgccgccggcgccCGGCTCAgccccggccccggctcggggc


TCGA-02-




tccgggggcggccgAGGCTCTGCTTCCACCCGGGgccgccgc


2483-




cactgctgccgctgctgctgctgttcctgctcctgctgccgccgccgccgc


01A-01R-




tgctgGCCGGCGCCACCGCCGCTGCCTCGCGGGAGCCC


1849-




GACAGCCCGTGCCGGCTGAAGACCGTCACGGTGTCCA


01.2




CACTGCCCGCCCTGCGGGAGAGCGACATCggctggagcg







gcgcccgcgccggggccggggctgggaccggggccggagccgCCGC







CGCCGCCGCGTCCCCGGGCTCTCCTGGCTCTGCCGGCA







CCGCCGCCGAGTCGCGCCTCCTGCTCTTTGTGCGTAAC







GAGCTGCCGGGGCGCATCGCGGTGCAGGACGACCTG







GACAACACCGAGCTGCCCTTCTTCACCCTGGAGATGTC







TGGCACAGCAGCGGACATCTCGCTGGTGCACTGGAGA







CAGCAGTGGCTGGAGAATGGCACCTTGTACTTCCACGT







CTCCATGAGCAGCTCCGGGCAGCTGGCCCAAGCCACC







GCCCCCACTCTCCAGGAGCCCTCGGAGATTGTTGAGG







AGCAGATGCACATCCTCCACATTTCTGTGATGGGTGGC







CTCATCGCGCTGCTGCTGCTGCTGCTGGTGTTCACCGT







GGCGCTGTACGCCCAGCGACGTTGGCAGAAGCGTCGC







CGCATCCCCCAGAAGAGCGCAAGCACAGAAGCCACTC







ATGAGATCCACTACATCCCATCTGTGCTGCTGGGTCCC







CAGGCGCGGGAGAGCTTCCGTTCATCCCGGCTGCAAA







CCCACAATTCCGTCATTGGCGTGCCCATCCGGGAGACT







CCCATCCTGGATGACTATGACTGTGAGGAGGATGAGG







AGCCACCTAGGCGGGCCAACCATGTCTCCCGCGAGGA







CGAGTTTGGCAGCCAGGTGACCCACACTCTGGACAGT







CTGGGACATCCAGGGGAAGAGAAGGTGGACTTTGAG







AAGAAAG|ACCCAAGCCTGCCCAATGTGCAGGTGACC







AGGCTGACACTCCTGTCGGAACAGGCTCCGGGGCCCG







TCGTCATGGATCTCACAGGGGACCTGGCTGTTCTGAAG







GACCAGGTGTTTGTCCTGAAGGAAGGTGTTGATTACA







GAGTGAAGATCTCCTTCAAGGTCCACAGGGAGATTGT







CAGCGGCCTCAAGTGTCTGCACCACACCTACCGCCGG







GGCCTGCGCGTGGACAAGACCGTCTACATGGTGGGCA







GCTATGGCCCGAGCGCCCAGGAGTATGAGTTTGTGAC







TCCGGTGGAGGAAGCGCCGAGGGGTGCGCTGGTGCG







GGGCCCCTATCTGGTGGTGTCCCTCTTCACCGACGATG







ACAGGACGCACCACCTGTCCTGGGAGTGGGGTCTCTG







CATCTGCCAGGACTGGAAGGAC


G17799.
61109367
47345631
InFrame
8354
ATGCTGCGATTACAGATGACTGATGGTCATATAAGTTG


TCGA-06-




CACAGCAGTAGAATTTAGTTATATGTCAAAAATAAGCC


1804-




TGAACACACCACCTGGAACTAAAGTTAAGCTCTCAGGC


01A-01R-




ATTGTTGACATAAAAAATGGATTCCTGCTCTTGAATGA


1849-




CTCTAACACCACAGTTCTTGGTGGTGAAGTGGAACACC


01.4




TTATTGAGAAATGGGAGTTACAGAGAAGCTTATCAAA







ACACAATAGAAGCAATATTGGAACTGAAGGTGGACCA







CCGCCTTTTGTGCCTTTTGGACAGAAGTGTGTATCTCAT







GTCCAAGTGGATAGCAGAGAACTTGATCGAAGAAAAA







CATTGCAAGTTACAATGCCTGTCAAACCTACAAATGAT







AATGATGAATTTGAAAAGCAAAGGACGGCTGCTATTG







CTGAAGTTGCAAAGAGCAAGGAAACCAAGACATTTGG







AGGAGGTGGTGGTGGTGCTAGAAGTAATCTCAATATG







AATGCTGCTGGTAACCGAAATAGGGAAGTTTTACAGA







AAGAAAAGTCAACCAAATCAGAGGGAAAACATGAAG







GTGTCTATAGAGAACTGGTTGATGAGAAAGCTCTGAA







GCACATAACGGAAATGGGCTTCAGTAAGGAAGCATCG







AGGCAAGCTCTTATGGATAATGGCAACAACTTAGAAG







CAGCACTGAACGTACTTCTTACAAGCAATAAACAGAAA







CCTGTTATGGGTCCTCCTCTGAGAGGTAGAGGAAAAG







GCAGGGGGCGAATAAGATCTGAAGATGAAGAGGACC







TGGGAAATGCAAGGCCATCAGCACCAAGCACATTATTT







GATTTCTTGGAATCTAAAATGGGAACTTTGAATGTGGA







AGAACCTAAATCACAGCCACAGCAGCTTCATCAGGGA







CAATACAGATCATCAAATACTGAGCAAAATGGAGTAA







AAGATAATAATCATCTGAGACATCCTCCTCGAAATGAT







ACCAGGCAGCCAAGAAATGAAAAACCGCCTCGTTTTC







AAAGAGACTCCCAAAATTCAAAGTCAGTTTTAGAAGG







CAGTGGATTACCTAGAAATAGAGGTTCTGAAAGACCA







AGTACTTCTTCAGTATCTGAAGTATGGGCTGAAGACAG







AATCAAATGTGATAGACCGTATTCTAGATATGACAGAA







CTAAAGATACTTCATATCCTTTAGGTTCTCAGCATAGTG







ATGGTGCTTTTAAAAAAAGAGATAACTCTATGCAAAGC







AGATCAGGAAAAGGTCCCTCCTTTGCAGAGGCAAAAG







AAAATCCACTTCCTCAAGGATCTGTAGATTATAATAAT







CAAAAACGTGGAAAAAGAGAAAGCCAAACATCTATTC







CTGATTATTTTTATGACAGGAAATCACAAACAATAAAT







AATGAAGCTTTCAGTGGTATAAAAATTGAAAAACATTT







TAATGTAAATACTGATTATCAGAATCCAGTTCGAAGTA







ATAGTTTCATTGGTGTTCCAAATGGAGAAGTAGAAATG







CCACTGAAAGGAAGACGAATAGGACCTATTAAGCCAG







CAGGACCTGTCACAGCTGTACCCTGTGATGATAAAATA







TTTTACAATAGTGGGCCCAAACGAAGATCTGGGCCAAT







TAAGCCAGAAAAAATACTAGAATCATCTATTCCTATGG







AGTATGCAAAAATGTGGAAACCTGGAGATGAATGTTT







TGCACTTTATTGGGAAGACAACAAGTTTTACCGGGCAG







AAGTTGAAGCCCTCCATTCTTCGGGTATGACAGCAGTT







GTTAAATTCATTGACTACGGAAACTATGAAGAGGTGCT







ACTGAGCAATATCAAGCCCATTCAAACAGAGGCATGG







|GGTTATGATCATAGCTACTACTTCATTGCAACCTTTAT







TACTGACCACATCAGACATCATGCTAAATACCTGAATG







CA


G17660.
61655656
69872567
InFrame
8355
ATGGCAGATCCAGGAATGATGAGTCTTTTTGGCGAGG


TCGA-06-




ATGGGAATATTTTCAGTGAAGGTCTTGAAGGCCTCGG


5414-




AGAATGTGGTTACCCGGAAAATCCAGTAAATCCTATG


01A-01R-




GGTCAGCAAATGCCAATAGACCAAGGCTTTGCCTCTTT


1849-




ACAGCCATCCCTTCATCATCCTTCAACTAATCAAAATCA


01.2




AACAAAGCTGACACATTTTGATCACTATAATCAGTATG







AACAACAAAAGATGCATCTGATGGATCAGCCGAACAG







AATGATGAGCAACACCCCTGGGAACGGACTCGCGTCT







CCGCACTCGCAGTATCACACCCCTCCCGTTCCTCAGGT







GCCCCATGGTGGCAGTGGTGGCGGTCAGATGGGTGTC







TACCCTGGCATGCAGAATGAGAGGCATGGGCAATCCT







TTGTGGACAGCAGCTCCATGTGGGGCCCCAGGGCTGT







TCAGGTACCAGACCAGATACGAGCCCCCTACCAGCAG







CAGCAGCCACAGCCGCAGCCACCGCAGCCGGCTCCGT







CGGGGCCCCCTGCACAGGGCCACCCTCAGCACATGCA







GCAGATGGGCAGCTATATGGCACGTGGGGATTTTTCC







ATGCAGCAGCATGGTCAGCCACAGCAGAGGATGAGCC







AGTTTTCCCAAGGCCAAGAGGGCCTCAATCAGGGAAA







TCCTTTTATTGCCACCTCAGGACCTGGCCACTTGTCCCA







CGTGCCCCAGCAGAGTCCCAGCATGGCACCTTCCTTGC







GTCACTCGGTGCAGCAGTTCCATCACCACCCCTCTACT







GCTCTCCATGGAGAATCCGTTGCCCACAGTCCCAGATT







CTCCCCGAATCCTCCCCAACAAGGGGCTGTTAGGCCGC







AAACCCTTAACTTTAGTTCTCGGAGCCAGACAGTCCCC







TCTCCTACTATAAACAACTCAGGGCAGTATTCTCGATAT







CCTTACAGTAACCTAAATCAGGGATTAGTTAACAATAC







AGGGATGAATCAAAATTTAGGCCTTACAAATAATACTC







CAATGAATCAGTCCGTACCAAGATACCCCAATGCTGTA







GGATTCCCATCAAACAGTGGTCAAGGACTAATGCACC







AGCAGCCCATCCACCCCAGTGGCTCACTTAACCAAATG







AACACACAAACTATGCATCCTTCACAGCCTCAGGGAAC







TTATGCCTCTCCACCTCCCATGTCACCCATGAAAGCAAT







GAGTAATCCAGCAGGCACTCCTCCTCCACAAGTCAGGC







CGGGAAGTGCTGGGATACCAATGGAAGTTGGCAGTTA







TCCAAATATGCCCCATCCTCAGCCATCTCACCAGCCCCC







TGGTGCCATGGGAATCGGACAGAGGAATATGGGCCCC







AGAAACATGCAGCAGTCTCGTCCATTTATAGGCATGTC







CTCGGCACCAAGGGAATTGACTGGGCACATGAGGCCA







AATGGTTGTCCTGGTGTTGGCCTTGGAGACCCACAAGC







AATCCAGGAACGACTGATACCTGGCCAACAACATCCTG







GTCAACAGCCATCTTTTCAGCAGTTGCCAACCTGTCCTC







CACTGCAGCCTCACCCGGGCTTGCACCACCAGTCTTCA







CCTCCACACCCTCATCACCAGCCTTGGGCACAGCTCCA







CCCATCACCCCAGAACACCCCGCAGAAAGTGCCTGTGC







ATCAG|TTTGACGGTGAGAACATGTATATGAGCATGAC







AGAGCCGAGCCAGGACTATGTGCCAGCCAGCCAGTCC







TACCCTGGTCCAAGCCTGGAAAGTGAAGACTTCAACAT







TCCACCAATTACTCCTCCTTCCCTCCCAGACCACTCGCT







GGTGCACCTGAATGAAGTTGAGTCTGGTTACCATTCTC







TGTGTCACCCCATGAACCATAATGGCCTGCTACCATTTC







ATCCACAAAACATGGACCTCCCTGAAATCACAGTCTCC







AATATGCTGGGCCAGGATGGAACACTGCTTTCTAATTC







CATTTCTGTGATGCCAGATATACGAAACCCAGAAGGA







ACTCAGTACAGTTCCCATCCTCAGATGGCAGCCATGAG







ACCAAGGGGCCAGCCTGCAGACATCAGGCAGCAGCCA







GGAATGATGCCACATGGCCAGCTGACTACCATTAACCA







GTCACAGCTAAGTGCTCAACTTGGTTTGAATATGGGAG







GAAGCAATGTTCCCCACAACTCACCATCTCCACCTGGA







AGCAAGTCTGCAACTCCTTCACCATCCAGTTCAGTGCA







TGAAGATGAAGGCGATGATACCTCTAAGATCAATGGT







GGAGAGAAGCGGCCTGCCTCTGATATGGGGAAAAAA







CCAAAAACTCCCAAAAAGAAGAAGAAGAAGGATCCCA







ATGAGCCCCAGAAGCCTGTGTCTGCCTATGCGTTATTC







TTTCGTGATACTCAGGCCGCCATCAAGGGCCAAAATCC







AAACGCTACCTTTGGCGAAGTCTCTAAAATTGTGGCTT







CAATGTGGGACGGTTTAGGAGAAGAGCAAAAACAGG







TCTATAAAAAGAAAACCGAGGCTGCGAAGAAGGAGTA







CCTGAAGCAACTCGCAGCATACAGAGCCAGCCTTGTAT







CCAAGAGCTACAGTGAACCTGTTGACGTGAAGACATC







TCAACCTCCTCAGCTGATCAATTCGAAGCCGTCGGTGT







TCCATGGGCCCAGCCAGGCCCACTCGGCCCTGTACCTA







AGTTCCCACTATCACCAACAACCGGGAATGAATCCTCA







CCTAACTGCCATGCATCCTAGTCTCCCCAGGAACATAG







CCCCCAAGCCGAATAACCAAATGCCAGTGACTGTCTCT







ATAGCAAACATGGCTGTGTCCCCTCCTCCTCCCCTCCAG







ATCAGCCCGCCTCTTCACCAGCATCTCAACATGCAGCA







GCACCAGCCGCTCACCATGCAGCAGCCCCTTGGGAAC







CAGCTCCCCATGCAGGTCCAGTCTGCCTTACACTCACC







CACCATGCAGCRAGGATTTACTCTTCAACCCGACTATC







AGACTATTATCAATCCTACATCTACAGCTGCACAAGTT







GTCACCCAGGCAATGGAGTATGTGCGTTCGGGGTGCA







GAAATCCTCCCCCACAACCGGTGGACTGGAATAACGA







CTACTGCAGTAGTGGGGGCATGCAGAGGGACAAAGC







ACTGTACCTTACT


GBM-
67688936
55238868
InFrame
8356
ATGGCGAGCGCCTCGTACCACATTTCCAATTTGCTGGA


CUMC3296_L1




AAAAATGACATCCAGCGACAAGGACTTTAGGTTTATG







GCTACAAATGATTTGATGACGGAACTGCAGAAAGATT







CCATCAAGTTGGATGATGATAGTGAAAGGAAAGTAGT







GAAAATGATTTTGAAGTTATTGGAAGATAAAAATGGA







GAGGTACAGAATTTAGCTGTCAAATGTCTTGGTCCTTT







AGTGAGTAAAGTGAAAGAATACCAAGTAGAGACAATT







GTAGATACCCTCTGCACTAACATGCTTTCTGATAAAGA







ACAACTTCGAGACATTTCAAGTATTGGTCTTAAAACAG







TAATTGGAGAACTTCCTCCAGCTTCCAGTGGCTCTGCA







TTAGCTGCTAATGTATGTAAAAAGATTACTGGACGTCT







TACAAGTGCAATAGCAAAACAGGAAGATGTCTCTGTTC







AGCTAGAAGCCTTGGATATTATGGCTGATATGTTGAGC







AG|ATGCACTGGGCCAGGTCTTGAAGGCTGTCCAACG







AATGGGCCTAAGATCCCGTCCATCGCCACTGGGATGG







TGGGGGCCCTCCTCTTGCTGCTGGTGGTGGCCCTGGG







GATCGGCCTCTTCATGCGAAGGCGCCACATCGTTCGGA







AGCGCACGCTGCGGAGGCTGCTGCAGGAGAGGGAGC







TTGTGGAGCCTCTTACACCCAGTGGAGAAGCTCCCAAC







CAAGCTCTCTTGAGGATCTTGAAGGAAACTGAATTCAA







AAAGATCAAAGTGCTGGGCTCCGGTGCGTTCGGCACG







GTGTATAAGGGACTCTGGATCCCAGAAGGTGAGAAAG







TTAAAATTCCCGTCGCTATCAAGGAATTAAGAGAAGCA







ACATCTCCGAAAGCCAACAAGGAAATCCTCGATGAAG







CCTACGTGATGGCCAGCGTGGACAACCCCCACGTGTG







CCGCCTGCTGGGCATCTGCCTCACCTCCACCGTGCAGC







TCATCACGCAGCTCATGCCCTTCGGCTGCCTCCTGGAC







TATGTCCGGGAACACAAAGACAATATTGGCTCCCAGTA







CCTGCTCAACTGGTGTGTGCAGATCGCAAAGGGCATG







AACTACTTGGAGGACCGTCGCTTGGTGCACCGCGACCT







GGCAGCCAGGAACGTACTGGTGAAAACACCGCAGCAT







GTCAAGATCACAGATTTTGGGCTGGCCAAACTGCTGG







GTGCGGAAGAGAAAGAATACCATGCAGAAGGAGGCA







AAGTGCCTATCAAGTGGATGGCATTGGAATCAATTTTA







CACAGAATCTATACCCACCAGAGTGATGTCTGGAGCTA







CGGGGTGACTGTTTGGGAGTTGATGACCTTTGGATCC







AAGCCATATGACGGAATCCCTGCCAGCGAGATCTCCTC







CATCCTGGAGAAAGGAGAACGCCTCCCTCAGCCACCC







ATATGTACCATCGATGTCTACATGATCATGGTCAAGTG







CTGGATGATAGACGCAGATAGTCGCCCAAAGTTCCGT







GAGTTGATCATCGAATTCTCCAAAATGGCCCGAGACCC







CCAGCGCTACCTTGTCATTCAGGGGGATGAAAGAATG







CATTTGCCAAGTCCTACAGACTCCAACTTCTACCGTGCC







CTGATGGATGAAGAAGACATGGACGACGTGGTGGAT







GCCGACGAGTACCTCATCCCACAGCAGGGCTTCTTCAG







CAGCCCCTCCACGTCACGGACTCCCCTCCTGAGCTCTCT







GAGTGCAACCAGCAACAATTCCACCGTGGCTTGCATTG







ATAGAAATGGGCTGCAAAGCTGTCCCATCAAGGAAGA







CAGCTTCTTGCAGCGATACAGCTCAGACCCCACAGGCG







CCTTGACTGAGGACAGCATAGACGACACCTTCCTCCCA







GTGCCTGaATACATAAACCAGTCCGTTCCCAAAAGGCC







CGCTGGCTCTGTGCAGAATCCTGTCTATCACAATCAGC







CTCTGAACCCCGCGCCCAGCAGAGACCCACACTACCAG







GACCCCCACAGCACTGCAGTGGGCAACCCCGAGTATC







TCAACACTGTCCAGCCCACCTGTGTCAACAGCACATTC







GACAGCCCTGCCCACTGGGCCCAGAAAGGCAGCCACC







AAATTAGCCTGGACAACCCTGACTACCAGCAGGACTTC







TTTCCCAAGGAAGCCAAGCCAAATGGCATCTTTAAGG







GCTCCACAGCTGAAAATGCAGAATACCTAAGGGTCGC







GCCACAAAGCAGTGAATTTATTGGAGCA


G17505.
216844
1282739
InFrame
8357
ATGAATAACCTAAATGATCCCCCAAATTGGAATATCCG


TCGA-06-




GCCTAATTCCAGGGCGGATGGTGGTGATGGAAGCAG


2564-




GTGGAATTATGCCCTGTTGGTTCCAATGCTGGGATTGG


01A-01R-




CTGCTTTTC|GGGTTGGCTGTGTTCCGGCCGCAGAGCA


1849-




CCGTCTGCGTGAGGAGATCCTGGCCAAGTTCCTGCACT


01.2




GGCTGATGAGTGTGTACGTCGTCGAGCTGCTCAGGTC







TTTCTTTTATGTCACGGAGACCACGTTTCAAAAGAACA







GGCTCTTTTTCTACCGGAAGAGTGTCTGGAGCAAGTTG







CAAAGCATTGGAATCAGACAGCACTTGAAGAGGGTGC







AGCTGCGGGAGCTGTCGGAAGCAGAGGTCAGGCAGC







ATCGGGAAGCCAGGCCCGCCCTGCTGACGTCCAGACT







CCGCTTCATCCCCAAGCCTGACGGGCTGCGGCCGATTG







TGAACATGGACTACGTCGTGGGAGCCAGAACGTTCCG







CAGAGAAAAGAGGGCCGAGCGTCTCACCTCGAGGGT







GAAGGCACTGTTCAGCGTGCTCAACTACGAGCGGGCG







CGGCGCCCCGGCCTCCTGGGCGCCTCTGTGCTGGGCC







TGGACGATATCCACAGGGCCTGGCGCACCTTCGTGCT







GCGTGTGCGGGCCCAGGACCCGCCGCCTGAGCTGTAC







TTTGTCAAGGTGGATGTGACGGGCGCGTACGACACCA







TCCCCCAGGACAGGCTCACGGAGGTCATCGCCAGCAT







CATCAAACCCCAGAACACGTACTGCGTGCGTCGGTATG







CCGTGGTCCAGAAGGCCGCCCATGGGCACGTCCGCAA







GGCCTTCAAGAGCCACGTCTCTACCTTGACAGACCTCC







AGCCGTACATGCGACAGTTCGTGGCTCACCTGCAGGA







GACCAGCCCGCTGAGGGATGCCGTCGTCATCGAGCAG







AGCTCCTCCCTGAATGAGGCCAGCAGTGGCCTCTTCGA







CGTCTTCCTACGCTTCATGTGCCACCACGCCGTGCGCA







TCAGGGGCAAGTCCTACGTCCAGTGCCAGGGGATCCC







GCAGGGCTCCATCCTCTCCACGCTGCTCTGCAGCCTGT







GCTACGGCGACATGGAGAACAAGCTGTTTGCGGGGAT







TCGGCGGGACGGGCTGCTCCTGCGTTTGGTGGATGAT







TTCTTGTTGGTGACACCTCACCTCACCCACGCGAAAAC







CTTCCTCAGGACCCTGGTCCGAGGTGTCCCTGAGTATG







GCTGCGTGGTGAACTTGCGGAAGACAGTGGTGAACTT







CCCTGTAGAAGACGAGGCCCTGGGTGGCACGGCTTTT







GTTCAGATGCCGGCCCACGGCCTATTCCCCTGGTGCGG







CCTGCTGCTGGATACCCGGACCCTGGAGGTGCAGAGC







GACTACTCCAGCTATGCCCGGACCTCCATCAGAGCCAG







TCTCACCTTCAACCGCGGCTTCAAGGCTGGGAGGAAC







ATGCGTCGCAAACTCTTTGGGGTCTTGCGGCTGAAGTG







TCACAGCCTGTTTCTGGATTTGCAGGTGAACAGCCTCC







AGACGGTGTGCACCAACATCTACAAGATCCTCCTGCTG







CAGGCGTACAGGTTTCACGCATGTGTGCTGCAGCTCCC







ATTTCATCAGCAAGTTTGGAAGAACCCCACATTTTTCCT







GCGCGTCATCTCTGACACGGCCTCCCTCTGCTACTCCAT







CCTGAAAGCCAAGAACGCAGGGATGTCGCTGGGGGC







CAAGGGCGCCGCCGGCCCTCTGCCCTCCGAGGCCGTG







CAGTGGCTGTGCCACCAAGCATTCCTGCTCAAGCTGAC







TCGACACCGTGTCACCTACGTGCCACTCCTGGGGTCAC







TCAGGACAGCCCAGACGCAGCTGAGTCGGAAGCTCCC







GGGGACGACGCTGACTGCCCTGGAGGCCGCAGCCAAC







CCGGCACTGCCCTCAGACTTCAAGACCATCCTGGAC


G17798.
11313895
6880241
InFrame
8358
ATGCTTGGAACCGGACCTGCCGCCGCCACCACCGCTGC


TCGA-32-




CACCACATCTAGCAATGTGAGCGTCCTGCAGCAGTTTG


5222-




CCAGTGGCCTAAAGAGCCGGAATGAGGAAACCAGGG


01A-01R-




CCAAAGCCGCCAAGGAGCTCCAGCACTATGTCACCAT


1850-




GGAACTCCGAGAGATGAGTCAAGAGGAGTCTACTCGC


01.4




TTCTATGACCAACTGAACCATCACATTTTTGAATTGGTT







TCCAGCTCAGATGCCAATGAGAGGAAAGGTGGCATCT







TGGCCATAGCTAGCCTCATAGGAGTGGAAGGTGGGAA







TGCCACCCGAATTGGCAGATTTGCCAACTATCTTCGGA







ACCTCCTCCCCTCCAATGACCCAGTTGTCATGGAAATG







GCATCCAAGGCCATTGGCCGTCTTGCCATGGCAGGGG







ACACTTTTACCGCTGAGTACGTGGAATTTGAGGTGAAG







CGAGCCCTGGAATGGCTGGGTGCTGACCGCAATGAGG







GCCGGAGACATGCAGCTGTCCTGGTTCTCCGTGAGCT







GGCCATCAGCGTCCCTACCTTCTTCTTCCAGCAAGTGC







AACCCTTCTTTGACAACATTTTTGTGGCCGTGTGGGAC







CCCAAACAGGCCATCCGTGAGGGAGCTGTAGCCGCCC







TTCGTGCCTGTCTGATTCTCACAACCCAGCGTGAGCCG







AAGGAGATGCAGAAGCCTCAGTGGTACAGGCACACAT







TTGAAGAAGCAGAGAAGGGATTTGATGAGACCTTGGC







CAAAGAGAAGGGCATGAATCGGGATGATCGGATCCAT







GGAGCCTTGTTGATCCTTAACGAGCTGGTCCGAATCAG







CAGCATGGAGGGAGAG|agcgtttcccaaagtgtattctgcgg







aacTAGCACCTACTGTGTTCTCAACACCGTGCCACCTAT







AGAAGATGATCATGGGAACAGCAATAGTAGTCATGTA







AAAATCTTTTTACCGAAAAAGCTGCTTGAATGTCTGCC







GAAATGTTCAAGTTTACCAAAAGAGAGGCACCGCTGG







AACACTAATGAGGAAATTGCAGCTTATTTAATAACATT







TGAGAAACACGAAGAATGGCTAACCACCTCCCCTAAG







ACAAGACCACAGAATGGCTCAATGATACTCTACAACAG







GAAGAAAGTGAAATACAGGAAAGATGGGTATTGCTG







GAAAAAGAGGAAAGATGGGAAAACGACCAGAGAGGA







CCACATGAAACTCAAGGTCCAGGGAGTGGAGTGCTTG







TACGGCTGCTATGTCCATTCCTCCATCATCCCCACCTTC







CACCGGAGGTGCTACTGGCTCCTTCAGAACCCCGACAT







CGTCCTGGTGCACTACCTGAACGTGCCGGCCATCGAG







GACTGCGGCAAGCCTTGCGGCCCCATCCTCTGCTCCAT







CAACACCGACAAGAAGGAGTGGGCGAAATGGACGAA







AGAAGAGCTCATCGGGCAGCTGAAACCCATGTTCCAT







GGCATCAAGTGGACCTGCAGCAATGGGAACAGCAGCT







CAGGCTTCTCGGTGGAACAGCTGGTGCAGCAGATCCT







CGACAGCCACCAGACCAAGCCCCAGCCGCGGACCCAC







AACTGCCTCTGCACCGGCAGCCTGGGAGCTGGCGGCA







GCGTGCATCACAAGTGTAACAGCGCCAAACACCGCAT







CATCTCGCCCAAGGTGGAGCCACGGACAGGGGGGTAC







GGGAGCCACTCGGAGGTGCAGCACAATGACGTGTCG







GAGGGCAAGCACGAGCACAGCCACAGCAAGGGCTCC







AGCCGTGAGAAGAGGAACGGCAAGGTGGCCAAGCCC







GTGCTCCTGCACCAGAGCAGCACCGAGGTCTCCTCCAC







CAACCAGGTGGAAGTCCCCGACACCACCCAGAGCTCC







CCTGTGTCCATCAGCAGCGGGCTCAACAGCGACCCGG







ACATGGTGGACAGCCCGGTGGTCACAGGTGTGTCCGG







TATGGCGGTGGCCTCTGTGATGGGGAGCTTGTCCCAG







AGCGCCACGGTGTTCATGTCAGAGGTCACCAATGAGG







CCGTGTACACCATGTCCCCCACCGCTGGCCCCAACCAC







CACCTCCTCTCACCTGACGCCTCTCAGGGCCTCGTCCTG







GCCGTGAGCTCTGATGGCCACAAGTTCGCCTTTCCCAC







CACGGGCAGCTCGGAGAGCCTGTCCATGCTGCCCACC







AACGTGTCCGAAGAGCTGGTCCTCTCCACCACCCTCGA







CGGTGGCCGGAAGATTCCAGAAACCACCATGAACTTT







GACCCCGACTGTTTCCTTAATAACCCAAAGCAGGGCCA







GACGTACGGGGGTGGAGGCCTGAAAGCCGAGATGGT







CAGCTCCAACATCCGGCACTCGCCACCCGGGGAGCGG







AGCTTCAGCTTTACCACCGTCCTCACCAAGGAGATCAA







GACCGAGGACACCTCCTTCGAGCAGCAGATGGCCAAA







GAAGCGTACTCCTCCTCCGCGGCGGCTGTGGCAGCCA







GCTCCCTCACCCTGACCGCCGGCTCCAGCCTCCTGCCG







TCGGGCGGCGGCCTGAGTCCCAGCACCACCCTGGAGC







AGATGGACTTCAGCGCCATCGACTCCAACAAGGACTA







CACGTCCAGCTTCAGCCAGACGGGCCACAGCCCCCAC







ATCCACCAGACCCCCTCCCCGAGCTTCTTCCTGCAGGA







CGCCAGCAAACCCCTCCCCGTCGAGCAGAACACCCACA







GCAGCCTGAGTGACTCTGGGGGCACCTTCGTGATGCC







CACGGTGAAAACGGAGGCCTCGTCCCAAACCAGCTCC







TGCAGCGGTCACGTGGAGACGCGGATCGAGTCCACTT







CCTCCCTCCACCTCATGCAGTTCCAGGCCAACTTCCAG







GCCATGACGGCAGAAGGGGAGGTCACCATGGAGACC







TCGCAGGCGGCGGAAGGGAGCGAGGTCCTGCTCAAG







TCTGGGGAGCTGCAGGCTTGCAGCTCTGAGCACTACCT







GCAGCCGGAGACCAACGGGGTAATCCGAAGCGCCGG







CGGCGTCCCCATCCTCCCGGGCAACGTGGTGCAGGGA







CTCTACCCCGTGGCCCAGCCCAGCCTCGGCAACGCCTC







CAACATGGAGCTCAGCCTGGACCACTTTGACATCTCCT







TCAGCAACCAGTTCTCCGACCTGATCAACGACTTCATCT







CCGTGGAGGGGGGCAGCAGCACCATCTATGGGCACCA







GCTGGTGTCGGGGGACAGCACGGCGCTCTCACAGTCA







GAGGACGGGGCGCGGGCCCCCTTCACCCAGGCAGAG







ATGTGCCTCCCCTGCTGTAGCCCCCAGCAGGGTAGCCT







GCAGCTGAGCAGCTCGGAGGGCGGGGCCAGCACCAT







GGCCTACATGCACGTCGCCGAGGTGGTCTCGGCCGCC







TCGGCCCAGGGCACCCTAGGCATGCTGCAGCAGAGCG







GACGGGTGTTCATGGTGACCGACTACTCCCCAGAGTG







GTCTTACCCAGAGGGAGGAGTGAAGGTCCTCATCACA







GGCCCGTGGCAAGAAGCCAGCAATAACTACAGCTGCC







TGTTTGACCAGATCTCAGTGCCTGCATCCCTGATTCAG







CCTGGGGTGCTGCGCTGCTACTGCCCAGCCCATGACAC







TGGTCTTGTGACCCTACAAGTTGCCTTCAACAACCAGA







TCATCTCCAACTCGGTGGTGTTTGAGTACAAAGCCCGG







GCTCTGCCCACGCTCCCTTCCTCCCAGCACGACTGGCT







GTCGTTGGACGATAACCAGTTCAGGATGTCCATCCTGG







AACGACTGGAGCAGATGGAGAGGAGGATGGCCGAGA







TGACGGGGTCCCAGCAGCACAAACAGGCGAGCGGAG







GCGGCAGCAGTGGAGGCGGCAGCGGGAGCGGGAAT







GGAGGGAGCCAGGCACAGTGTGCTTCTGGGACTGGG







GCCTTGGGGAGCTGCTTTGAGAGCCGTGTGGTCGTGG







TATGCGAGAAGATGATGAGCCGAGCCTGCTGGGCGAA







GTCCAAGCACTTGATCCACTCAAAGACTTTCCGCGGAA







TGACCCTACTCCACCTGGCCGCTGCCCAGGGCTATGCC







ACCCTAATCCAGACCCTCATCAAATGGCGTACAAAGCA







CGCGGATAGCATTGACCTGGAACTGGAAGTTGACCCC







TTGAATGTGGACCACTTCTCCTGTACTCCTCTGATGTG







GGCGTGTGCCCTAGGGCACTTGGAAGCTGCCGTCGTG







CTGTACAAGTGGGACCGTCGGGCCATCTCGATTCCCGA







CTCTCTAGGAAGGCTGCCTTTGGGAATTGCCAGGTCAC







GGGGTCATGTGAAATTAGCAGAGTGTCTGGAGCACCT







GCAGAGAGATGAGCAGGCTCAGCTGGGACAGAACCC







CAGAATCCACTGTCCTGCAAGCGAAGAGCCCAGCACA







GAGAGCTGGATGGCCCAGTGGCACAGCGAAGCCATCA







GCTCTCCAGAAATACCCAAGGGAGTCACTGTTATTGCA







AGCACCAACCCAGAGCTGAGAAGACCTCGTTCTGAAC







CCTCTAATTACTACAGCAGTGAGAGCCACAAAGATTAT







CCGGCTCCCAAAAAGCATAAATTGAACCCTGAGTACTT







CCAGACAAGGCAGGAGAAGCTGCTTCCCACTGCACTG







AGTCTGGAAGAGCCAAATATCAGGAAGCAAAGCCCTA







GTTCTAAGCAGTCTGTCCCCGAGACACTCAGCCCCAGT







GAAGGAGTGAGGGACTTCAGCCGGGAACTCTCCCCTC







CCACTCCAGAGACTGCAGCATTTCAAGCCTCTGGATCT







CAGCCTGTAGGAAAGTGGAATTCCAAAGATCTTTACAT







TGGTGTGTCTACAGTACAGGTGACTGGAAATCCGAAG







GGGACCAGTGTAGGAAAGGAGGCAGCACCTTCACAG







GTGCGTCCACGGGAACCAATGAGTGTCCTGATGATGG







CTAACAGAGAGGTGGTGAATACAGAGCTGGGGTCCTA







CCGTGATAGTGCAGAAAATGAAGAATGCGGCCAGCCC







ATGGATGACATACAGGTGAACATGATGACCTTGGCAG







AACACATTATTGAAGCCACACCTGACCGAATCAAGCAG







GAGAATTTTGTGCCCATGGAGTCCTCAGGATTGGAAA







GAACAGACCCTGCCACCATTAGCAGTACAATGAGCTG







GCTGGCCAGTTATCTAGCGGATGCTGACTGCCTTCCCA







GTGCTGCCCAGATCCGAAGTGCATATAACGAGCCTCTA







ACCCCTTCTTCTAATACCAGCTTGAGCCCTGTTGGCTCT







CCCGTCAGTGAAATCGCTTTCGAGAAACCTAACCTTCC







CTCCGCCGCGGATTGGTCAGAATTCCTGAGTGCATCTA







CCAGTGAGAAGGTAGAGAATGAGTTTGCTCAGCTCAC







TCTGTCTGATCATGAACAGAGAGAACTCTATGAGGCTG







CCAGGCTTGTCCAGACAGCTTTCCGGAAATACAAGGG







CCGACCCTTGCGGGAACAGCAAGAAGTAGCTGCTGCT







GTTATTCAGCGTTGTTACAGAAAATATAAACAGTACGC







ACTTTATAAAAAGATGACACAGGCTGCCATCCTTATCC







AGAGCAAATTCCGAAGTTACTATGAACAAAAAAAATTC







CAGCAGAGCCGACGGGCTGCTGTGCTCATCCAAAAGT







ACTACCGAAGTTATAAGAAATGTGGCAAAAGACGGCA







GGCTCGCCGGACGGCTGTGATTGTACAACAGAAACTC







AGGAGCAGTTTGCTAACCAAAAAGCAGGATCAAGCTG







CTCGAAAAATAATGAGGTTTCTTCGCCGCTGTCGCCAC







AGCCCCCTGGTGGACCATAGGCTGTACAAAAGGAGTG







AAAGAATTGAAAAAGGCCAAGGAACT


GBM-
24640521
18659539
InFrame
8359
ATGGCCCGGGGCTACGGGGCCACGGTCAGCCTAGTCC


CUMC3297_L1




TGCTGGGTCTGGGGCTGGCGCTGGCTGTCATTGTGCT







GGCTGTGGTCCTCTCTCGACACCAGGCCCCATGTGGCC







CCCAGGCCTTTGCCCACGCTGCTGTTGCCGCCGACTCC







AAGGTCTGCTCGGATATTGGACG|GCAGGAAACTGCA







TATCTTCTGGTTTACATGAAGATGGAGTGC


GBM-
61083748
104375025
InFrame
8360
ATGTTTGTGTATGTGCTCACTCCCGGAGAGCAATCAGG


CUMC3342_L1




GAGACGGCTCCCCGGCCAGACTTGGCTGATGTTTTCTT







GTTTCTGTTTCAGCCTTCAGGATAATTCCTTCAGCAGCA







CCACTGTAACAGAGTGTGACGAAGATCCAGTCTCTCTA







CATGAAGACCAGACTGATTGCTCCAGTCTCAGAGATG







AAAACAATAAAGAGAACTACCCCGACGCAGGGGCTCT







GGTAGAAGAGCACGCGCCGCCCTCTTGGGAGCCGCAG







CAGCAGAATGTAGAGGCGACCGTGCTGGTGGACAGC







GTATTGCGACCCAGCATGGGCAACTTCAAGTCCAGGA







AGCCCAAGTCCATCTTCAAAGCGGAGAGCGGGAGGA







GCCACGGAGAAAGTCAGGAGACAGAGCATGTGGTAT







CCAGCCAGTCAGAGTGTCAGGTGAGAGCAGGAACACC







AGCTCATGAGAGTCCACAAAACAATGCCTTCAAGTGCC







AAGAAACAGTGCGACTTCAACCAAG|GAGCCGCAAAG







ACAGCCTGGAAAGTGACAGCTCCACGGCCATCATTCCC







CATGAGCTGATTCGCACGCGGCAGCTTGAGAGCGTAC







ATCTGAAATTCAACCAGGAGTCCGGAGCCCTCATTCCT







CTCTGCCTAAGGGGCAGGCTCCTGCATGGACGGCACT







TTACATATAAAAGTATCACAGGTGACATGGCCATCACG







TTTGTCTCCACGGGAGTGGAAGGCGCCTTTGCCACTGA







GGAGCATCCTTACGCGGCTCATGGACCCTGGTTACAAA







TTCTGTTGACCGAAGAGTTTGTAGAGAAAATGTTGGA







GGATTTAGAAGATTTGACTTCTCCAGAGGAATTCAAAC







TTCCCAAAGAGTACAGCTGGCCTGAAAAGAAGCTGAA







GGTCTCCATCCTGCCTGACGTGGTGTTCGACAGTCCGC







TACAC


G17500.
39438743
41808143
InFrame
8361
ATGCACTCAGCCGGGACTCCCGGGTTATCCTCGCGCCG


TCGA-27-




GACAGGCAACTCCACCAGCTTTCAACCAGGACCGCCAC


1831-




CGCCGCCCCGgctgctgctgctgctgctgctTCTCCTGTCACTG


01A-01R-




GTAAGCCGCGTCCCGGCACAGCCCGCTGCCTTCGGCA


1850-




GGGCGTTGCTGTCCCCTGGTCTCGCGGGGGCTGCAGG


01.2




GGTCCCTGCTGAGGAGGCCATAGTGCTGGCGAACCGC







GGACTCCGGGTGCCTTTCGGCCGTGAAGTCTGGCTGG







ATCCCCTGCATGACCTGGTGTTGCAGGTGCAGCCCGG







GGACCGCTGCGCGGTTTCGGTACTAGACAACGACGCA







CTGGCCCAGCGACCGGGCCGCCTGAGTCCCAAGCGCT







TCCCGTGCGACTTTGGCCCTGGCGAGGTGCGCTACTCT







CACCTGGGCGCGCGCAGCCCGTCTCGGGACCGCGTCC







GGCTGCAGCTGCGCTATGACGCGCCCGGAGGGGCAGT







AGTGCTACCACTGGTACTGGAGGTGGAGGTGGTCTTC







ACCCAGCTGGAGGTTGTGACTCGGAACTTGCCTCTGGT







CGTGGAAGAGCTGCTGGGGACCAGCAATGCCCTGGAC







GCGCGGAGCCTGGAGTTCGCCTTCCAGCCCGAGACAG







AGGAGTGCCGCGTGGGCATCCTGTCCGGCTTGGGCGC







GCTGCCTCGCTATGGAGAACTCCTCCACTACCCGCAGG







TCCCTGGAGGAGCCAGAGAGGGAGGCGCCCCGGAGA







CTCTCCTGATGGACTGCAAAGCTTTCCAGGAACTAGGC







GTGCGCTATCGCCACACAGCCGCCAGTCGCTCACCAAA







CAGGGACTGGATACCCATGGTGGTGGAGCTGCGTTCA







CGAGGGGCTCCTGTGGGCAGCCCTGCTTTGAAACGCG







AGCACTTCCAGGTTCTGGTGAGGATCCGAGGAGGGGC







CGAGAACACTGCACCCAAGCCCAGTTTCGTGGCCATG







ATGATGATGGAGGTGGACCAGTTTGTACTGACGGCCC







TGACCCCAGACATGCTGGCAGCCGAGGATGCTGAGTC







TCCCTCTGACCTGTTGATCTTCAACCTTACTTCTCCATTC







CAGCCTGGCCAGGGCTACTTGGTGAGCACCGATGATC







GCAGCCTGCCCCTTTCCTCCTTCACTCAGAGGGATCTG







CGGCTCCTGAAGATTGCCTACCAGCCCCCTTCTGAAGA







CTCTGACCAGGAGCGCCTCTTTGAACTGGAATTGGAG







GTAGTGGATCTAGAAGGAGCAGCTTCAGACCCTTTTGC







CTTCATGGTAGTGGTGAAGCCCATGAACACAATGGCTC







CGGTGGTCACCCGGAATACCGGTCTTATTCTCTATGAG







GGTCAGTCTCGGCCCCTCACAGGCCCTGCAGGCAGTG







GTCCGCAAAACTTGGTCATCAGCGATGAGGATGACCT







AGAAGCAGTGCGGCTAGAGGTGGTGGCTGGGCTCCG







GCATGGTCACCTTGTCATTCTGGGTGCTTCCAGTGGCA







GCTCTGCTCCCAAGAGCTTTACAGTGGCTGAGCTGGCA







GCCGGCCAGGTGGTCTACCAGCATGATGACAGAGACG







GCTCGCTGAGCGACAACCTGGTGCTTCGCATGGTGGA







TGGAGGAGGCAGGCACCAGGTACAGTTTCTGTTCCCC







ATCACCTTAGTGCCTGTGGATGACCAGCCACCTGTTCT







CAATGCCAACACGGGGCTGACACTGGCAGAGGGTGA







AACAGTGCCCATCCTGCCCCTTTCCCTGAGTGCAACTG







ACATGGATTCAGATGATTCTCTGCTGCTTTTTGTGCTG







GAGTCACCCTTCTTAACTACGGGGCATCTGCTTCTCCG







CCAAACTCACCCTCCCCATGAGAAGCAGGAACTTCTCA







GAGGCCTTTGGAGGAAGGAGGGGGCATTTTATGAGC







GAACAGTGACAGAGTGGCAGCAGCAGGACATAACAG







AGGGCAGGCTGTTCTATAGACACTCTGGGCCCCATAGT







CCTGGGCCAGTCACAGACCAGTTCACATTTAGAGTCCA







GGATAACCATGACCCTCCTAATCAGTCCGGGCTACAGC







GGTTTGTGATTCGTATCCATCCTGTGGATCGCCTCCCTC







CGGAGCTGGGCAGTGGCTGTCCCCTTCGTATGGTGGT







ACAGGAATCCCAGCTCACACCACTGAGGAAGAAGTGG







CTGCGCTACACTGACCTGGACACAGATGACCGAGAAC







TACGTTACACAGTGACTCAGTCCCCCACAGACACAGAC







GAAAATCACCTGCCAGCCCCACTGGGTACCTTGGTCTT







GACTGACAACCCCTCAGTCGTGGTGACCCATTTTACCC







AAGCCCAGATCAACCATCATAAAATTGCTTACAGACCC







CCGGGTCAAGAACTGGGCGTGGCTACTCGAGTGGCCC







AGTTCCAGTTCCAGGTGGAAGACCGAGCTGGGAATGT







GGCTCCAGGTACCTTTACCCTTTACTTGCATCCCGTGGA







CAACCAGCCACCTGAGATCCTCAACACCGGCTTCACTA







TTCAGGAGAAGGGTCACCACATCCTGAGTGAGACAGA







GTTGCACGTGAATGATGTAGACACTGATGTTGCCCATA







TCTCTTTCACTCTCACTCAGGCACCCAAACATGGCCACA







TGAGAGTGTCTGGACAGATCCTGCATGTAGGGGGTCT







CTTCCACTTGGAGGACATAAAACAGGGCCGAGTTTCCT







ATGCCCATAATGGGGACAAGTCCCTGACTGATAGCTG







CTCCTTGGAAGTCAGTGACAGACATCATGTGGTGCCCA







TCACTCTCAGAGTAAATGTCCGGCCAGTGGATGATGA







AGTGCCCATACTGAGCCATCCTACTGGCACTCTGGAGT







CCTATCTAGATGTCTTAGAAAATGGGGCTACTGAAATC







ACTGCCAATGTTATTAAGGGGACCAATGAGGAAACTG







ATGACTTGATGTTGACTTTCCTCTTGGAAGATCCACCTT







TGTATGGGGAAATCTTGGTCAATGGCATTCCAGCAGA







GCAGTTTACTCAAAGGGACATCTTGGAGGGCTCTGTTG







TATATACCCACACCAGTGGTGAGATAGGCCTATTGCCT







AAAGCGGATTCTTTTAACCTGAGTCTGTCAGATATGTC







TCAAGAATGGAGAATTGGTGGCAATACTATCCAAGGA







GTTACTATATGGGTGACCATCCTGCCTGTTGATAGCCA







GGCCCCAGAAATCTTTGTAGGTGAACAGTTGATAGTA







ATGGAAGGTGATAAAAGTGTTATAACATCAGTGCATA







TAAGTGCTGAAGATGTCGACTCCCTGAATGATGACATC







TTGTGCACTATAGTTATTCAGCCTACTTCAGGTTATGTT







GAAAACATTTCTCCAGCACCAGGCTCTGAGAAATCAAG







AGCAGGGATTGCCATAAGTGCTTTCAACTTGAAAGATC







TCAGGCAGGGCCACATAAACTATGTCCAGAGTGTCCAT







AAAGGGGTGGAACCTGTGGAGGACCGATTTGTATTTC







GTTGTTCTGATGGCATTAACTTTTCAGAGAGACAGTTC







TTCCCCATTGTAATCATTCCCACCAATGATGAACAGCCA







GAGATGTTTATGAGAGAATTTATGGTGATGGAAGGCA







TGAGTCTGGTAATTGATACACCCATTCTCAATGCTGCT







GATGCTGATGTTCCCCTGGATGATTTAACTTTCACTATT







ACCCAATTCCCCACTCATGGTCACATCATGAATCAGCT







GATAAATGGCACGGTTTTGGTCGAAAGCTTCACCTTGG







ATCAGATCATAGAGAGTTCCAGCATTATTTATGAGCAT







GATGACTCCGAGACCCAGGAAGACAGTTTTGTGATTA







AACTAACAGATGGGAAGCACTCTGTGGAAAAGACGGT







CCTCATTATAGTTATCCCTGTTGATGATGAGACGCCCA







GAATGACTATCAATAATGGACTAGAAATAGAAATTGG







GGATACCAAGATTATCAACAACAAAATATTAATGGCAA







CAGATTTAGATTCAGAAGACAAATCTTTGGTTTATATT







ATTCGTTATGGGCCAGGACATGGCTTATTACAGAGAC







GAAAACCTACTGGTGCCTTTGAAAATATCACACTGGGC







ATGAATTTTACCCAGGATGAAGTAGACAGAAACTTAAT







TCAGTATGTCCATTTGGGGCAAGAGGGCATTCGGGAC







CTAATTAAATTTGATGTGACTGATGGAATAAATCCCCT







CATAGATCGTTACTTTTATGTGTCCATCGGGAGCATTG







ACATTGTCTTCCCTGATGTGATAAGTAAGGGAGTGTCC







TTGAAAGAAGGTGGCAAAGTCACTCTTACAACAGACC







TACTAAGCACTAGTGACTTGAACAGTCCTGATGAAAAC







TTGGTTTTTACCATCACCAGGGCTCCCATGCGAGGTCA







CCTGGAATGCACGGATCAGCCTGGTGTGTCCATCACGT







CTTTCACTCAGCTGCAACTGGCTGGAAACAAAATCTAC







TACATCCACACAGCTGATGATGAAGTGAAAATGGACA







GTTTTGAGTTTCAAGTCACCGATGGACGTAACCCTGTC







TTTCGGACATTCCGTATCTCCATTAGCGATGTGGACAA







TAAAAAGCCAGTGGTCACCATCCACAAGCTGGTTGTCA







GTGAAAGTGAAAACAAGCTGATTACTCCTTTTGAGCTC







ACTGTCGAAGACAGAGATACTCCTGACAAGCTCCTGA







AATTCACTATCACCCAGGTGCCTATTCATGGCCATCTCC







TATTCAACAATACCAGACCTGTCATGGTTTTTACCAAGC







AAGACTTGAATGAAAACTTAATCAGCTACAAACATGAT







GGCACTGAGTCAAGTGAAGATAGCTTCTCCTTCACAGT







GACTGATGGCACCCATACAGACTTCTATGTTTTTCCTGA







TACGGTGTTTGAAACAAGGAGACCCCAAGTGATGAAG







ATCCAGGTCTTGGCTGTTGACAACAGTGTCCCCCAAAT







CGCAGTGAATAAGGGGGCCTCTACACTTCGCACTCTAG







CCACTGGCCACTTGGGGTTCATGATCACAAGCAAAATA







TTGAAAGTGGAGGACAGAGACAGCTTACACATTTCTCT







TAGATTTATCGTGACAGAGGCCCCTCAACATGGATATC







TTCTCAACCTGGACAAAGGCAACCACAGCATCACTCAG







TTCACACAAGCTGACATTGATGACATGAAAATATGCTA







TGTCTTAAGAGAAGGGGCTAATGCCACAAGTGATATG







TTCTATTTTGCAGTTGAAGATGGTGGTGGAAACAAGTT







AACGTACCAGAATTTTCGTCTGAATTGGGCATGGATCT







CCTTTGAAAAGGAATATTACCTGGTCAATGAGGACTCC







AAATTTCTAGATGTTGTTCTTAAACGTAGAGGTTACTT







GGGAGAAACTTCTTTTATAAGTATTGGCACAAGAGAC







AGAACTGCAGAAAAAGACAAAGACTTCAAGGGCAAA







GCACAGAAACAAGTGCAGTTCAACCCAGGCCAGACCA







GGGCCACATGGCGAGTGCGGATCCTGAGTGATGGGG







AGCATGAGCAGTCTGAAACCTTTCAGGTGGTACTCTCA







GAGCCCGTGCTGGCTGCCTTGGAATTCCCCACAGTCGC







CACTGTTGAGATCGTTGATCCAGGAGATGAGCCAACT







GTGTTTATTCCCCAGTCCAAATACTCCGTTGAAGAAGA







TGTTGGTGAGCTGTTCATTCCCATCAGGAGGAGCGGA







GATGTGAGCCAGGAGTTGATGGTGGTCTGTTATACCC







AACAAGGAACAGCAACTGGAACTGTGCCGACTTCCGT







GTTGTCTTACTCTGATTACATATCCAGGCCTGAGGACC







ACACCAGTGTTGTCCGCTTTGACAAAGATGAACGGGA







GAAACTGTGTCGGATAGTCATAATTGATGACTCTTTGT







ACGAGGAGGAGGAAACCTTCCATGTCCTTCTGAGCAT







GCCCATGGGGGGAAGAATCGGATCAGAGTTCCCAGG







GGCTCAAGTTACAATCGTTCCTGACAAAGATGATGAAC







CCATCTTTTACTTCGGTGATGTGGAATACTCTGTGGAT







GAGAGTGCTGGCTATGTGGAAGTGCAGGTGTGGAGA







ACGGGCACTGACCTGTCCAAGTCTTCTAGTGTCACAGT







GAGGTCTCGGAAAACAGATCCTCCCTCTGCAGATGCTG







GAACAGACTATGTGGGCATCAGCCGTAATTTAGATTTT







GCACCTGGAGTCAACATGCAGCCTGTTCGTGTTGTCAT







TCTGGATGACCTTGGACAACCAGCGCTGGAGGGAATT







GAGAAATTTGAACTGGTGCTTCGCATGCCTATGAACGC







AGCCCTTGGCGAGCCCAGCAAAGCCACAGTGTCCATA







AATGACTCTGTCTCCGATTTGCCTAAGATGCAATTCAA







AGAACGAATATATACTGGCAGCGAAAGTGATGGGCAG







ATAGTTACAATGATCCATAGGACTGGGGATGTCCAGT







ACAGATCTTCAGTGAGATGCTACACCCGGCAGGGGTC







TGCACAGGTGATGATGGACTTTGAAGAACGCCCAAAC







ACTGATACCTCCATCATCACATTCCTCCCTGGTGAGACA







GAAAAGCCCTGCATTCTTGAGCTGATGGACGATGTGCT







CTATGAGGAGGTAGAGGAGCTCCGCCTGGTACTCGGC







ACTCCACAAAGCAACTCTCCCTTTGGGGCTGCAGTTGG







TGAACAAAATGAAACTCTCATAAGGATCCGAGATGAT







GCTGATAAGACTGTTATTAAATTTGGAGAAACCAAATT







TAGTGTCACTGAACCCAAAGAACCTGGAGAGTCGGTG







GTTATAAGAATTCCAGTGATTCGCCAAGGAGACACTTC







AAAGGTTTCCATTGTGAGAGTCCACACCAAGGATGGC







TCGGCCACCTCTGGAGAAGACTACCACCCTGTGTCAGA







AGAAATTGAGTTTAAGGAAGGGGAAACCCAGCACGTG







GTTGAAATCGAAGTTACCTTTGACGGGGTGAGAGAGA







TGAGAGAGGCCTTCACTGTTCACCTAAAACCTGATGAA







AATATGATAGCAGAGATGCAGTTGACGAAAGCCATTG







TGTACATAGAAGAAATGAGCAGCATGGCAGATGTCAC







TTTTCCTTCTGTCCCTCAAATTGTATCCCTGTTGATGTAT







GACGACACTTCCAAAGCTAAGGAGAGTGCTGAACCCA







TGTCTGGCTATCCTGTCATCTGTATCACAGCTTGCAACC







CCAAATATTCAGACTACGATAAAACAGGCTCTATCTGT







GCAAGTGAGAACATCAATGACACTTTGACGCGGTACC







GGTGGCTGATTAGTGCACCTGCGGGCCCTGACGGTGT







GACCAGCCCTATGAGAGAAGTGGACTTCGACACCTTTT







TTACGTCATCCAAGATGGTCACACTGGACTCCATATAC







TTTCAGCCTGGCTCCCGGGTACAGTGCGCAGCTCGTGC







TGTGAACACCAATGGGGATGAAGGCCTGGAGCTCATG







AGCCCTATTGTAACCATCAGCAGAGAAGAAGGTCTTTG







TCAGCCCCGTGTACCTGGGGTTGTTGGAGCAGAGCCG







TTCTCAGCTAAATTGCGCTACACAGGCCCTGAGGATGC







AGACTACACAAACCTTATCAAGCTCACTGTCACAATGC







CACACATAGATGGCATGCTCCCCGTGATCTCCACTAGA







GAGCTTTCCAACTTTGAGCTCACCCTCAGCCCTGATGG







CACAAGAGTTGGAAACCACAAGTGCTCCAACCTCCTG







GATTATACTGAAGTGAAGACTCATTATGGTTTCTTGAC







TGATGCTACCAAAAATCCAGAAATAATTGGAGAGACA







TATCCTTACCAGTACAGCTTGTCCATCAGAGGTTCCACT







ACCTTGCGCTTCTACCGGAACCTGAACCTAGAGGCCTG







TTTATGGGAGTTCGTTAGCTACTATGACATGTCAGAAC







TCCTTGCTGACTGTGGTGGCACCATTGGAACAGATGG







ACAGTGTGGATGTGAAATTGGACCCCAAGGATTTGCG







AATAGATACATTTCGAGCCAAAGGAGCAGGAGGGCA







GCATGTTAATAAAACTGATAGTGCCGTCAGACTTGTCC







ACATCCCCACAGGGCTAGTAGTAGAATGCCAACAAGA







AAGATCACAGATAAAAAATAAAGAAATAGCCTTTCGT







GTGTTGAGAGCTAGACTCTACCAGCAGATTATTGAGA







AAGACAAGCGTCAGCAACAAAGTGCTAGAAAACTGCA







GGTGGGAACAAGAGCCCAGTCAGAGCGAATTCGGAC







ATATAATTTCACCCAGGATAGAGTCAGTGACCACAGG







ATAGCATATGAAGTTCGTGATATTAAGGCTCAGTCTCA







TTCCACGGGTGGTAGTCGTGACCCTGCACATTCCACAT







TCCTATCCTTGGATTCTGTGAGATCACCTGGTATTCTCA







TTATGACTTCTTCTGTTAGGAATTTTTATGTGGTGGGA







AGGGCCTGGATCAGC


G17212.TCGA-06-0129-01A-01R-1849-01.2
3410422
47221257
InFrame
8362
ATGGCCATAGAccggcggcgcgaggcggcgggcggcgggcctgg







gcggcagccggccccggccgaggagaacggctccctgccgcccgggg







acgcggcggcctcggcgcccctcgggggacgcgcgggccccggcggc







ggcgcggAGATCCAGCCGCTGCCCCCACTGCATCCTGGA







GGCGGCCCGCACCCGAGCTGCTGCTCCGCGGCTGCGG




CCCCGAGCCTCTTGTTGCTGGACTATGACGGGTCGGTG







CTGCCCTTCCTCGGGGGCCTGGGCGGGGGCTATCAGA







AGACCCTCGTGCTGCTCACCTGGATCCCGGCGCTGTTC







ATCGGCTTCAGCCAGTTCTCGGACTCGTTCCTCCTGGA







CCAGCCCAACTTCTGGTGCCGCGGGGCCGGCAAAGGC







ACCGAGCTGGCAGGGGTCACCACCACAGGCCGGGGC







GGGGACATGGGCAACTGGACCAGCCTCCCCACCACCC







CCTTCGCCACTGCCCCCTGGGAGGCTGCGGGCAACCG







GAGCAACAGCAGCGGCGCGGACGGAGGCGACACACC







ACCCCTGCCATCCCCTCCGGACAAGGGGGACAACGCC







TCCAACTGTGACTGCCGCGCATGGGACTACGGCATCC







GCGCCGGCCTCGTCCAGAACGTGGTCAGCAAGTGGGA







TCTTGTGTGTGATAATGCCTGGAAGGTCCATATCGCTA







AGTTCTCCTTACTGGTTGGATTAATCTTTGGCTACCTAA







TAACTGGATGCATTGCTGACTGGGTCGGCCGGCGGCC







TGTGCTGCTGTTTTCCATCATCTTCATTCTGATCTTTGG







ACTGACTGTGGCACTGTCAGTGAATGTGACAATGTTCA







GCACACTCAGGTTCTTTGAAGGATTTTGCCTGGCTGGA







ATCATTCTCACCTTGTATGCTTTAC|GTATCGATATCCT







GAAGCTTGTAGCAGCCCAAGTGGGAAGCCAGTGGAA







AGATATCTATCAGTTTCTTTGCAATGCCAGTGAGAGGG







AGGTTGCTGCTTTCTCCAATGGGTACACAGCCGACCAC







GAGCGGGCCTACGCAGCTCTGCAGCACTGGACCATCC







GGGGCCCCGAGGCCAGCCTCGCCCAGCTAATTAGCGC







CCTGCGCCAGCACCGGAGAAACGATGTTGTGGAGAAG







ATTCGTGGGCTGATGGAAGACACCACCCAGCTGGAAA







CTGACAAACTAGCTCTCCCGATGagccccagcccgcttagcc







cgagccccatccccagccccaacGCGAAACTTGAGAATTCCG







CTCTCCTGACGGTGGAGCCTTCCCCACAGGACAAGAAC







AAGGGCTTCTTCGTGGATGAGTCGGAGCCCCTTCTCCG







CTGTGACTCTACATCCAGCGGCTCCTCCGCGCTGAGCA







GGAACGGTTCCTTTATTACCAAAGAAAAGAAGGACAC







AGTGTTGCGGCAGGTACGCCTGGACCCCTGTGACTTG







CAGCCTATCTTTGATGACATGCTCCACTTTCTAAATCCT







GAGGAGCTGCGGGTGATTGAAGAGATTCCCCAGGCTG







AGGACAAACTAGACCGGCTATTCGAAATTATTGGAGT







CAAGAGCCAGGAAGCCAGCCAGACCCTCCTGGACTCT







GTTTATAGCCATCTTCCTGACCTGCTG


GBM-CUMC3322_L1
141255367
158715068
InFrame
8363
ATGGAAAACACTG|ATCGAAAAGTATCCTCCTTGCACA




CCTCCCGAGTTCAGAGGCAGATGGTGGTCTCCGTTCAC







GACTTACCCGAGAAGAGCTTTGTGCCCCTGCTGGACA







GCAAATACGTCCTCTGTGTGTGGGATATTTGGCAGCCT







TCAGGGCCACAGAAAGTTCTGATATGTGAGTCCCAGG







TCACGTGTTGCTGCTTGAGCCCTTTGAAAGCATTTTTAC







TGTTTGCCGGAACAGCGCACGGCTCAGTTGTCGTCTGG







GATTTGAGAGAAGACTCAAGGCTGCATTACTCTGTGAC







GCTGAGCGATGGCTTCTGGACGTTCCGGACCGCCACG







TTTTCCACCGATGGAATCCTTACCTCAGTAAACCACCG







AAGCCCTCTTCAAGCAGTAGAACCTATCTCAACGTCCG







TCCACAAAAAGCAGAGCTTTGTGCTTTCACCCTTTTCTA







CTCAAGAAGAAATGTCAGGTTTGTCCTTCCACATCGCT







TCCTTGGATGAGAGTGGGGTTCTCAATGTATGGGTGG







TTGTTGAATTACCAAAGGCAGACATCGCAGGTTCAATA







AGTGATTTAGGTCTGATGCCTGGAGGGAGGGTCAAGC







TGGTACATAGTGCTCTGATCCAGTTGGGTGACAGTCTT







TCCCATAAAGGTAATGAATTTTGGGGCACTACACAAAC







ACTGAATGTTAAATTTCTGCCTTCAGATCCTAATCACTT







TATTATTGGCACAGACATGGGTCTCATAAGCCATGGCA







CAAGACAAGATTTGAGAGTGGCTCCCAAACTATTCAAA







CCTCAGCAACATGGTATAAGACCAGTGAAAGTTAATGT







CATTGATTTTTCACCATTTGGAGAACCAATATTTTTGGC







CGGCTGTTCGGACGGAAGCATCAGGCTGCACCAGCTG







AGCTCCGCGTTTCCGCTCCTGCAGTGGGACAGCAGCAC







GGACAGCCATGCGGTCACCGGCCTGCAGTGGTCCCCA







ACCAGGCCTGCCGTGTTCCTGGTGCAGGACGACACAT







CCAACATCTACATCTGGGACCTCCTCCAGAGCGATCTG







GGTCCTGTCGCCAAACAGCAGGTCTCCCCCAACAGGCT







GGTGGCCATGGCTGCGGTGGGTGAGCCTGAGAAGGC







TGGTGGCAGCTTCCTGGCCCTGGTGCTGGCCAGGGCG







TCTGGCTCCATCGACATCCAGCACCTGAAGAGGCGGT







GGGCGGCCCCGGAGGTGGACGAGTGCAACAGGCTGC







GTCTGCTTTTGCAGGAAGCCCTGTGGCCAGAGGGAAA







ACTGCACAAG


G17675.TCGA-19-2624-01A-01R-1850-01.2
54379794
56297264
InFrame
8364
ATGACATGCCCTCGCAATGTAACTCCGAACTCGTACGC







GGAGCCCTTGGCTGCGCCCGGCGGAGGAGAGCGCTAT







AGCCGGAGCGCAGGCATGTATATGCAGTCTGGGAGTG







ACTTCAATTGCGGGGTGATGAGGGGCTGCGGGCTCGC







GCCCTCGCTCTCCAAGAGGGACGAGGGCAGCAGCCCC




AGCCTCGCCCTCAACACCTATCCGTCCTACCTCTCGCAG







CTGGACTCCTGGGGCGACCCCAAAGCCGCCTATCGCCT







GGAACAACCTGTTGGCAGGCCGCTGTCCTCCTGCTCCT







ACCCACCTAGTGTCAAGGAGGAGAATGTCTGCTGCAT







GTACAGCGCAGAGAAGCGGGCGAAAAGTGGCCCCGA







GGCAGCTCTCTACTCCCACCCCTTGCCGGAGTCCTGCC







TTGGGGAGCACGAGGTACCCGTGCCCAGCTACTACCG







CGCCAGCCCGAGCTACTCCGCGCTGGACAAGACGCCC







CACTGTTCTGGGGCCAACGACTTCGAAGCCCCTTTCGA







GCAGCGGGCCAGTCTCAACCCGCGCGCCGAACATCTG







GAATCGCCTCAGCTGGGGGGCAAAGTGAGTTTCCCTG







AGACCCCCAAGTCCGACAGCCAGACCCCCAGCCCCAAT







GAAATCAAGACGGAGCAGAGCCTGGCGGGCCCTAAA







GGGAGCCCCTCGGAGAGCGAAAAGGAGAGGGCCAAA







GCTGCCGACTCCAGCCCAGACACCTCGGATAACGAAG







CGAAAG|GCAAGTATATCGCGTCAACACAGCGACCTG







ACGGGACCTGGCGCAAGCAGCGGAGGGTGAAAGAAG







GATATGTGCCCCAGGAGGAGGTCCCAGTATATGAAAA







CAAGTATGTGAAGTTTTTCAAGAGTAAACCAGAGTTGC







CCCCAGGGCTAAGCCCTGAGGCCACTGCTCCTGTCACC







CCATCCAGGCCTGAAGGTGGTGAACCAGGCCTCTCCA







AGACAGCCAAACGTAACCTGAAGCGAAAGGAGAAGA







GGCGGCAGCAGCAAGAGAAAGGAGAGGCAGAGGCCT







TGAGCAGGACTCTTGATAAGGTGTCCCTGGAAGAGAC







AGCCCAACTCCCCAGTGCTCCACAGGGCTCTCGGGCA







GCCCCCACAGCTGCATCTGACCAGCCTGACTCAGCTGC







CACCACTGAGAAAGCCAAGAAGATAAAGAACCTAAAG







AAGAAACTCCGGCAGGTGGAAGAGCTGCAGCAGCGG







ATCCAGGCTGGGGAAGTCAGCCAGCCTAGCAAAGAGC







AGCTAGAAAAGCTAGCAAGGAGGAGGGCGCTAGAAG







AGGAGTTAGAGGACTTGGAGTTAGGCCTC


G17803.TCGA-76-4925-01A-01R-1850-01.4
74046509
10694746
InFrame
8365
ATGGCTGCTGAGAAGCAGGTCCCAggcggcggcggcggcg







gcggcagtggcggcggcggtggcagtggcggcggcggtagcggcggt







ggACGTGGTGCCGGAGGGGAAGAAAATAAAGAAAAC







GAACGCCCTTCGGCCGGATCGAAGGCAAACAAAGAAT







TTGGGGATAGCCTGAGTTTGGAGATTCTTCAGATTATT




AAGGAATCCCAGCAGCAGCATGGTTTACGGCATGGAG







ATTTTCAGAGGTACAGGGGCTACTGTTCCCGTAGACAA







AGACGTCTTCGAAAAACACTCAACTTCAAGATGGGTAA







CAGACACAAATTCACAGGGAAGAAAGTGACTGAAGAG







CTTCTGACCGATAATAGATACTTGCTTCTGGTTCTGATG







GATGCTGAAAGAGCCTGGAGCTACGCCATGCAGCTGA







AACAGGAAGCCAACACTGAACCCCGAAAACGGTTTCA







CTTGTTATCTCGCCTACGCAAAGCCGTGAAGCATGCAG







AGGAATTGGAACGCTTGTGTGAGAGCAATCGCGTGGA







TGCCAAGACCAAATTAGAGGCTCAGGCTTACACAGCTT







ACCTCTCAGGAATGCTACGTTTTGAACATCAAGAATGG







AAAGCTGCCATTGAGGCTTTTAACAAATGCAAAACTAT







CTATGAGAAGCTAGCCAGTGCTTTCACAGAGGAGCAG







GCTGTGCTGTATAACCAACGTGTGGAAGAGATTTCACC







CAACATCCGCTATTGTGCATATAATATTGGGGACCAGT







CAGCCATCAATGAACTCATGCAGATGAGATTGAGGTCT







GGGGGCACTGAGGGTCTCTTGGCTGAAAAATTGGAGG







CTTTGATCACTCAGACTCGAGCCAAACAGGCAGCTACC







ATGAGTGAAGTGGAGTGGAGAGGGAGAACGGTTCCA







GTGAAGATTGACAAAGTGCGCATTTTCTTATTAGGACT







GGCTGATAACGAAGCAGCTATTGTCCAGGCTGAAAGC







GAAGAAACTAAGGAGCGCCTGTTTGAATCAATGCTCA







GCGAGTGTCGGGACGCCATCCAGGTGGTTCGGGAGG







AGCTCAAGCCAGATCAG|CCATTGATCAGCCGCAACTA







CAAGGGCGATGTGGCCATGAGCAAGATTGAGCACTTC







ATGCCTTTGCTGGTACAGCGGGAGGAGGAAGGCGCCC







TGGCCCCGCTGCTGAGCCACGGCCAGGTCCACTTCCTA







TGGATCAAACACAGCAACCTCTACTTGGTGGCCACCAC







ATCGAAGAATGCCAATGCCTCCCTGGTGTACTCCTTCC







TGTATAAGACAATAGAGGTATTCTGCGAATACTTCAAG







GAGCTGGAGGAGGAGAGCATCCGGGACAACTTTGTCA







TCGTCTACGAGTTGCTGGACGAGCTCATGGACTTTGGC







TTCCCGCAGACCACCGACAGCAAGATCCTGCAGGAGT







ACATCACTCAGCAGAGCAACAAGCTGGAGACGGGCAA







GTCACGGGTGCCACCCACTGTCACCAACGCTGTGTCCT







GGCGCTCCGAGGGTATCAAGTATAAGAAGAACGAGGT







CTTCATTGATGTCATAGAGTCTGTCAACCTGCTGGTCA







ATGCCAACGGCAGCGTCCTTCTGAGCGAAATCGTCGG







TACCATCAAGCTCAAGGTGTTTCTGTCAGGAATGCCAG







AGCTGCGGCTGGGCCTCAATGACCGCGTGCTCTTCGA







GCTCACTGGCCTTTCAGGCAGCAAGAACAAATCAGTA







GAGCTGGAGGATGTAAAATTCCACCAGTGCGTGCGGC







TCTCTCGCTTTGACAACGACCGCACCATCTCCTTCATCC







CGCCTGATGGTGACTTTGAGCTCATGTCATACCGCCTC







AGCACCCAGGTCAAGCCACTGATCTGGATTGAGTCTGT







CATTGAGAAGTTCTCCCACAGCCGCGTGGAGATCATG







GTCAAGGCCAAGGGGCAGTTTAAGAAACAGTCAGTGG







CCAACGGTGTGGAGATATCTGTGCCTGTACCCAGCGAT







GCCGACTCCCCCAGATTCAAGACCAGTGTGGGCAGCG







CCAAGTATGTGCCGGAGAGAAACGTCGTGATTTGGAG







TATTAAGTCTTTCCCGGGGGGCAAGGAGTACTTGATGC







GAGCCCACTTTGGCCTCCCCAGTGTGGAAAAGGAAGA







GGTGGAGGGCCGGCCCCCCATCGGGGTCAAGTTTGAG







ATCCCCTACTTCACCGTCTCTGGGATCCAGGTCCGATA







CATGAAGATCATTGAGAAAAGTGGTTACCAGGCCCTG







CCCTGGGTTCGCTACATCACCCAGAGTGGCGATTACCA







ACTTCGTACCAGC


G17796.TCGA-41-5651-01A-01R-1850-01.4
58186856
57849877
InFrame
8366
ATGTCGCTGCTGCGGTCGCTGCGCGTGTTTCTGGTCGC







GCGGACCGGGAGCTACCCGGCTGGGTCTCTTCTGCGT







CAGTCGCCCCAGCCAAGGCACACATTTTATGCTGGGCC







CCGTCTGTCTGCCTCGGCCTCCAGCAAGGAGCTCCTCA







TGAAGCTGCGGCGGAAAACAGGCTACTCCTTTGTAAA




TTGCAAGAAAGCTCTGGAGACTTGTGGCGGGGACCTC







AAACAGGCAGAGATCTGGCTCCACAAGGAGGCCCAGA







AGGAGGGCTGGAGCAAAGCTGCCAAGCTCCAAGGGA







GGAAGACCAAAGAAGGCCTGATTGGGCTGTTGCAGG







AAGGAAACACAACTGTATTAGTAGAGGTAAACTGTGA







GACAGATTTTGTTTCTAGAAATTTAAAATTTCAACTGTT







GGTCCAGCAAGTAGCCCTTGGAACCATGATGCATTGTC







AGACCCTAAAGGATCAACCCTCTGCATACAGTAAAgtgc







agtggctcacgcctgtaaacctagcactttgggaggctgaagcaggtgg







atcacttgagGGTTTCTTGAATTCCTCTGAGCTTTCTGGAC







TTCCAGCTGGGCCTGACAGAGAAGGCTCACTCAAGGA







TCAGTTGGCTTTAGCAATTG|ACTCCACTTCAGCCTACA







GCTCCCTGCTCACTTTTCACCTGTCCACTCCTCGGTCCC







ACCACCTGTACCATGCCCGCCTGTGGCTGCACGTGCTC







CCCACCCTTCCTGGCACTCTTTGCTTGAGGATCTTCCGA







TGGGGACCAAGGAGGAGGCGCCAAGGGTCCCGCACT







CTCCTGGCTGAGCACCACATCACCAACCTGGGCTGGCA







TACCTTAACTCTGCCCTCTAGTGGCTTGAGGGGTGAGA







AGTCTGGTGTCCTGAAACTGCAACTAGACTGCAGACCC







CTAGAAGGCAACAGCACAGTTACTGGACAACCGAGGC







GGCTCTTGGACACAGCAGGACACCAGCAGCCCTTCCTA







GAGCTTAAGATCCGAGCCAATGAGCCTGGAGCAGGCC







GGGCCAGGAGGAGGACCCCCACCTGTGAGCCTGCGAC







CCCCTTATGTTGCAGGCGAGACCATTACGTAGACTTCC







AGGAACTGGGATGGCGGGACTGGATACTGCAGCCCG







AGGGGTACCAGCTGAATTACTGCAGTGGGCAGTGCCC







TCCCCACCTGGCTGGCAGCCCAGGCATTGCTGCCTCTT







TCCATTCTGCCGTCTTCAGCCTCCTCAAAGCCAACAATC







CTTGGCCTGCCAGTACCTCCTGTTGTGTCCCTACTGCCC







GAAGGCCCCTCTCTCTCCTCTACCTGGATCATAATGGC







AATGTGGTCAAGACGGATGTGCCAGATATGGTGGTGG







AGGCCTGTGGCTGCAGC


G17663.TCGA-19-2619-01A-01R-1850-01.2
70800719
94827661
InFrame
8367
ATGGCTTCACTGAGAAGAGTCAAAGTGCTGTTGGTGTT







GAACTTGATCGCGGTAGCCGGCTTCGTGCTCTTCCTGG







CCAAGTGCCGGCCCATCGCGGTGCGCAGCGGAGACGC







CTTCCACGAGATCCGGCCGCGCGCCGAGGTGGCCAAC







CTCAGCGCGCACAGCGCCAGCCCCATCCAGGATGCGG




TCCTGAAGCGCCTGTCGCTGCTGGAGGACATCGTGTAC







CGGCAGCTGAATGGCTTATCCAAATCCCTTGGGCTCAT







TGAAGGTTATGGTGGGCGGGGTAAAGGGGGCCTTCC







GGCTACTCTTTCCCCGGCTGAAGAAGAAAAGGCTAAG







GGACCCCATGAGAAGTATGGCTACAATTCATACCTCAG







TGAAAAAATTTCACTGGACCGTTCCATTCCGGATTATC







GTCCCACCAA|ATTTGTTATTGGGCGGGAAAAACCAGG







ACAAGTGAGCGAGGTTGCCCAGTTGATAAGCCAGACA







CTGGAACAGGAGAGGCGCCAGAGAGAGCTGCTGGAA







CAGCACTATGCCCAGTATGATGCCGACGATGACGAGA







ACACTGTGGCTGAATTGCAAGGAATGTCTGGCAACTG







CAATAACAATAACAACTATTTTCTTAAGACAGGAGAAT







ATGCCACAGATGAAGAAGAAGATGAGGTAGGACCTGT







CCTTCCTGGCAGCGACATGGCCATTGAAGTCTTTGAGC







TGCCTGAGAATGAGGACATGTTTTCCCCATCAGAACTG







GACACAAGCAAGCTCAGTCACAAGTTCAAAGAGTTGC







AAATCAAACATGCAGTTACAGAAGCAGAGATTCAAAA







ATTGAAGACCAAGCTGCAGGCAGCAGAAAATGAGAAA







GTGAGGTGGGAACTAGAAAAAACCCAACTCCAACAAA







ACATAGAAGAGAATAAGGAAAGAATGTTGAAGTTGGA







AAGCTACTGGATTGAGGCCCAAACATTATGCCACACA







GTGAATGAGCATCTCAAAGAGACTCAAAGCCAGTATC







AGGCCTTGGAAAAGAAATACAACAAGGCAAAGAAGTT







GATCAAGGATTTTCAACAAAAAGAGCTTGATTTCATCA







AAAGACAGGAAGCAGAAAGAAAGAAAATAGAAGATT







TGGAAAAAGCTCATCTTGTGGAAGTGCAAGGCCTCCA







AGTGCGGATTAGAGATTTGGAAGCTGAGGTATTCAGG







CTACTGAAGCAAAATGGGACTCAAGTTAACAATAATAA







CAACATCTTTGAGAGAAGAACATCTCTTGGTGAAGTCT







CTAAAGGGGATACCATGGAGAACTTGGATGGCAAGCA







GACATCTTGCCAAGATGGCCTAAGTCAAGACTTGAATG







AAGCAGTCCCAGAGACAGAGCGCCTGGATTCAAAAGC







ACTGAAAACTCGAGCCCAGCTCTCTGTGAAGAACAGA







CGCCAGAGACCCTCTAGGACAAGACTGTATGATAGTG







TTAGTTCCACAGATGGGGAGGACAGTCTAGAGAGAAA







GCCATCAAACAGTTTCTATAACCACATGCATATTACCA







AATTACTTCCACCTAAGGGTTTGAGAACGTCTTCTCCA







GAATCAGATTCTGGTGTTCCACCCCTCACCCCGGTGGA







TAGCAATGTGCCCTTCTCGTCTGACCACATAGCTGAAT







TTCAAGAAGAACCACTGGACCCAGAAATGGGGCCTCT







CTCCTCTATGTGGGGAGACACTTCACTGTTTTCTACTTC







AAAGTCTGATCATGATGTGGAAGAATCTCCTTGCCATC







ACCAAACCACCAACAAGAAAATATTACGAGAAAAAGA







TGATGCCAAAGATCCCAAATCACTAAGGGCATCCAGTT







CATTGGCGGTGCAAGGAGGAAAAATTAAGCGGAAGTT







TGTGGATCTGGGGGCGCCTTTGCGAAGGAATTCCAGC







AAGGGAAAGAAGTGGAAAGAAAAAGAAAAAGAAGCC







AGTAGGTTTTCTGCAGGTAGCAGGATCTTCAGAGGCA







GACTGGAAAACTGGACACCCAAGCCATGTTCAACAGC







TCAGACCTCCACTCGTTCCCCTTGCATGCCTTTCTCATG







GTTTAATGACAGCCGGAAAGGATCCTATTCCTTCAGGA







ACCTGCCTGCGCCTACAAGTTCCCTTCAGCCTTCTCCTG







AGACTCTAATTTCAGATAAAAAGGGGTCCAAGGTAGA







AAACACATGGATTACAAAAGCAAACAAGAGAAACCCA







AATCCCTCCTCTTCTTCAATCTTTGGAAGGCATTCTCAA







CTTATGTCTGTAGTCTGGATCCAAGAAACCAATAATTTT







ACCTTCAATGATGACTTCAGTCCCAGCAGTACCAGTTC







AGCAGACCTCAGCGGCTTAGGAGCAGAACCTAAAACA







CCAGGGCTCTCTCAGTCCTTAGCACTGTCATCAGATGA







GATCCTTGATGATGGACAGTCTCCCAAACACAGTCAGT







GTCAGAATCGGGCCGTTCAGGAATGGAGTGTGCAGCA







GGTTTCTCACTGGTTAATGAGCCTAAATCTGGAGCAGT







ATGTATCTGAATTCAGTGCCCAAAACATCACTGGAGAA







CAGCTCCTGCAGTTGGATGGAAATAAACTTAAGGCTCT







TGGAATGACAGCATCCCAGGACCGAGCAGTGGTCAAA







AAGAAACTCAAGGAAATGAAGATGTCTCTAGAGAAGG







CTCGGAAGGCCCAAGAGAAAATGGAAAAACAAAGAG







AAAAGCTAAGGAGAAAGGAGCAAGAGCAAATGCAGA







GGAAGTCCAAAAAGACAGAAAAGATGACGTCAACTAC







AGCCGAGGGTGCTGGTGAGCAG


G17207.TCGA-06-0156-01A-03R-1849-01.2
910775
4438632
InFrame
8368
ATGCAG|AGCACGCAGGCCCATGAGAACAGCAGGGAT







AGCCGGCTGGCATGGATGGGCACCTGGGAGCACCTTG







TGTCTACTGGATTCAACCAGATGCGTGAGCGCGAAGT







GAAGCTGTGGGACACGCGGTTCTTCTCCAGCGCCCTG







GCCTCCCTCACCTTGGACACCTCGCTTGGGTGTCTCGT




GCCTCTGCTGGACCCTGACTCTGGGCTCCTGGTCCTGG







CAGGAAAGGGCGAGAGGCAGCTGTACTGTTACGAGG







TGGTCCCGCAGCAGCCGGCGCTGAGCCCAGTGACCCA







GTGTGTCCTGGAGAGCGTGCTGCGTGGGGCTGCCCTT







GTGCCCCGGCAGGCGCTGGCCGTCATGAGCTGCGAGG







TACTCCGCGTCCTACAGCTGAGCGACACAGCCATCGTG







CCCATCGGCTACCATGTGCCCCGCAAGGCTGTGGAGTT







CCACGAGGACCTGTTCCCGGACACTGCCGGCTGTGTG







CCTGCCACCGACCCCCATAGCTGGTGGGCTGGGGACA







ACCAGCAGGTGCAGAAGGTCAGCCTCAACCCCGCCTG







CCGGCCCCACCCGAGCTTCACTTCCTGTCTGGTGCCCC







CTGCGGAGCCCCTCCCTGACACAGCCCAGCCTGCGGT







GATGGAGACACCCGTGGGTGATGCAGACGCAAGCGA







GGGTTTCTCTTCCCCTCCCAGTTCGCTGACCTCGCCCTC







CACGCCCTCCAGCCTGGGGCCCTCACTCTCCAGCACCA







GTGGCATCGGGACCAGCCCCAGTTTGAGGTCGCTGCA







GAGCCTGCTGGGCCCCAGTTCCAAGTTCCGCCATGCTC







AGGGCACTGTCCTGCACCGAGACAGCCACATCACCAA







CCTCAAGGGGCTCAACCTCACCACACCTGGTGAGAGT







GACGGCTTCTGTGCCAACAAGCTGCGTGTGGCCGTGC







CGCTGCTCAGCAGCGGGGGACAGGTGGCTGTGCTTGA







GCTACGGAAGCCTGGCCGCCTGCCCGACACGGCACTG







CCCACGCTGCAGAATGGGGCAGCTGTGACTGATCTGG







CCTGGGACCCCTTTGACCCCCATCGCCTCGCTGTGGCT







GGTGAGGACGCCAGGATCCGACTGTGGCGGGTACCC







GCAGAGGGCCTGGAAGAGGTGCTCACCACGCCAGAG







ACTGTGCTCACAGGCCACACGGAGAAGATCTGCTCCCT







GCGCTTCCACCCACTGGCAGCCAATGTGCTGGCCTCGT







CCTCCTATGACCTCACTGTTCGCATCTGGGACCTTCAG







GCTGGAGCTGATCGGCTGAAGCTGCAGGGCCACCAAG







ACCAGATCTTCAGCCTGGCCTGGAGTCCTGATGGGCA







GCAGCTGGCCACTGTCTGCAAGGATGGGCGTGTGCGG







GTCTACAGGCCCCGGAGTGGCCCTGAGCCCCTGCAGG







AAGGCCCAGGGCCCAAGGGAGGACGCGGAGCTCGCA







TTGTCTGGGTATGTGATGGTCGCTGTCTGCTGGTGTCT







GGCTTTGACAGCCAAAGTGAGCGCCAGCTGCTCCTATA







TGAAGCTGAGGCCCTGGCCGGCGGACCCTTGGCAGTG







TTGGGCCTGGACGTGGCTCCCTCAACCCTGCTGCCCAG







CTACGACCCAGACACTGGCCTGGTGCTCCTGACCGGCA







AGGGCGACACCCGTGTATTCCTGTACGAGCTGCTCCCC







GAGTCCCCTTTCTTCCTGGAGTGCAACAGCTTCACGTC







GCCTGACCCCCACAAGGGCCTCGTCCTCCTGCCTAAGA







CGGAGTGCGACGTGCGGGAAGTGGAGCTGATGCGGT







GCCTGCGGCTGCGTCAGTCCTCCCTGGAGCCTGTGGCC







TTCCGGCTGCCCCGAGTCCGGAAAGAGTTCTTCCAGGA







TGACGTGTTCCCAGACACGGCTGTGATCTGGGAGCCT







GTGCTCAGTGCCGAGGCCTGGCTGCAAGGCGCTAATG







GGCAGCCCTGGCTTCTCAGCCTGCAGCCTCCTGACATG







AGCCCAGTGAGCCAAGCCCCCCGAGAGGCCCCTGCTC







GTCGGGCCCCATCCTCAGCGCAGTACCTGGAAGAAAA







GTCTGACCAGCAAAAGAAGGAGGAGCTGCTGAATGCC







ATGGTGGCAAAACTGGGGAACCGGGAGGACCCACTCC







CCCAGGACTCCTTTGAAGGCGTGGACGAGGACGAGTG







GGAC


G17802.TCGA-28-5208-01A-01R-1850-01.4
55433922
56079562
InFrame
8369
ATGGGCGAGACCATGTCAAAGAGGCTGAAGCTCCACC







TGGGAGGGGAGGCAGAAATGGAGGAACGGGCGTTCG







TCAACCCCTTCCCGGACTACGAGGCCGCCGCCGGGGC







GCTGCTCGCCTCCGGAGCGGCCGAAGAGACAGGCTGT







GTTCGTCCCCCGGCGACCACGGATGAGCCCGGCCTCCC




TTTTCATCAGGACGGGAAG|GATGCTTTCATTGGATTT







GGAGGAAATGTGATCAGGCAACAAGTCAAGGATAAC







GCCAAATGGTATATCACTGATTTTGTAGAGCTGCTGGG







AGAACTGGAAGAA


G17485.TCGA-14-1402-02A-01R-2005-01.2
151216546
151372723
InFrame
8370
ATGCCGCAGTCCAAGTCCCGGAAGATCGCGATCCTGG







GCTACCGGTCTGTGG|CCTCCGGCCTCTCCTCCTCTCCG







TCAACACCCACCCAAGTGACCAAGCAGCACACGTTTCC







CCTGGAATCCTATAAGCACGAGCCTGAACGGTTAGAG







AATCGCATCTATGCCTCGTCTTCCCCCCCGGACACAGG




GCAGAGGTTCTGCCCGTCTTCCTTCCAGAGCCCGACCA







GGCCTCCACTGGCATCACCGACACACTATGCTCCCTCC







AAAGCCGCGGCGCTGGCGGCGGCCCTGGGACCCGCG







GAAGCCGGCATGCTGGAGAAGCTGGAGTTCGAGGAC







GAAGCAGTAGAAGACTCAGAAAGTGGTGTTTACATGC







GATTCATGAGGTCACACAAGTGTTATGACATCGTTCCA







ACCAGTTCAAAGCTTGTTGTCTTTGATACTACATTACAA







GTTAAAAAGGCCTTCTTTGCTTTGGTAGCCAACGGTGT







CCGAGCAGCGCCACTGTGGGAGAGTAAAAAACAAAGT







TTTGTAGGAATGCTAACAATTACAGATTTCATAAATAT







ACTACATAGATACTATAAATCACCTATGGTACAGATTT







ATGAATTAGAGGAACATAAAATTGAAACATGGAGGGA







GCTTTATTTACAAGAAACATTTAAGCCTTTAGTGAATAT







ATCTCCAGATGCAAGCCTCTTCGATGCTGTATACTCCTT







GATCAAAAATAAAATCCACAGATTGCCCGTTATTGACC







CTATCAGTGGGAATGCACTTTATATACTTACCCACAAA







AGAATCCTCAAGTTCCTCCAGCTTTTTATGTCTGATATG







CCAAAGCCTGCCTTCATGAAGCAGAACCTGGATGAGC







TTGGAATAGGAACGTACCACAACATTGCCTTCATACAT







CCAGACACTCCCATCATCAAAGCCTTGAACATATTTGT







GGAAAGACGAATATCAGCTCTGCCTGTTGTGGATGAG







TCAGGAAAAGTTGTAGATATTTATTCCAAATTTGATGT







AATTAATCTTGCTGCTGAGAAAACATACAATAACCTAG







ATATCACGGTGACCCAGGCCCTTCAGCACCGTTCACAG







TATTTTGAAGGTGTTGTGAAGTGCAATAAGCTGGAAAT







ACTGGAGACCATCGTGGACAGAATAGTAAGAGCTGAG







GTCCATCGGCTGGTGGTGGTAAATGAAGCAGATAGTA







TTGTGGGTATTATTTCCCTGTCGGACATTCTGCAAGCC







CTGATCCTCACACCAGCAGGTGCCAAACAAAAGGAGA







CAGAAACGGAG


NYU_E
88033698
87883123
InFrame
8371
ATGGGTGCTGGGCCCTCCTTGCTGCTCGCCGCCCTCCT







GCTGCTTCTCTCCGGCGACGGCGCCGTGCGCTGCGAC







ACACCTGCCAACTGCACCTATCTTGACCTGCTGGGCAC







CTGGGTCTTCCAGGTGGGCTCCAGCGGTTCCCAGCGC







GATGTCAACTGCTCGGTTATGGGACCACAAGAAAAAA







AAGTAGTGGTGTACCTTCAGAAGCTGGATACAGCATA







TGATGACCTTGGCAATTCTGGCCATTTCACCATCATTTA







CAACCAAGGCTTTGAGATTGTGTTGAATGACTACAAGT







GGTTTGCCTTTTTTAAGTATAAAGAAGAGGGCAGCAA







GGTGACCACTTACTGCAACGAGACAATGACTGGGTGG







GTGCATGATGTGTTGGGCCGGAACTGGGCTTGTTTCA







CCGGAAAGAAGGTGGGAACTGCCTCTGAGAATGTGTA







TGTCAACATAGCACACCTTAAGAATTCTCAGGAAAAGT







ATTCTAATAGGCTCTACAAGTATGATCACAACTTTGTG







AAAGCTATCAATGCCATTCAGAAGTCTTGGACTGCAAC







TACATACATGGAATATGAGACTCTTACCCTGGGAGATA







TGATTAGGAGAAGTGGTGGCCACAGTCGAAAAATCCC







AAGGCCCAAACCTGCACCACTGACTGCTGAAATACAG







CAAAAGATTTTGCATTTGCCAACATCTTGGGACTGGAG







AAATGTTCATGGTATCAATTTTGTCAGTCCTGTTCGAAA







CCAAG|GTCAAGAAAGATTTGGAAACATGACGAGGGT







CTATTACCGAGAAGCTATGGGTGCATTTATTGTCTTCG







ATGTCACCAGGCCAGCCACATTTGAAGCAGTGGCAAA







GTGGAAAAATGATTTGGACTCCAAGTTAAGTCTCCCTA







ATGGCAAACCGGTTTCAGTGGTTTTGTTGGCCAACAAA







TGTGACCAGGGGAAGGATGTGCTCATGAACAATGGCC







TCAAGATGGACCAGTTCTGCAAGGAGCACGGTTTCGT







AGGATGGTTTGAAACATCAGCAAAGGAAAATATAAAC







ATTGATGAAGCCTCCAGATGCCTGGTGAAACACATACT







TGCAAATGAGTGTGACCTAATGGAGTCTATTGAGCCG







GACGTCGTGAAGCCCCATCTCACATCAACCAAGGTTGC







CAGCTGCTCTGGCTGTGCCAAATCC


G17212.TCGA-06-0129-01A-01R-1849-01.2
51128913
42019095
InFrame
8372
ATGGCGGAACGAGGCCTGGAGCCGTCGCCGGCCGCG







GTGGCGGCGCTGCCGCCTGAAGTGCGGGCGCAGCTG







GCGGAGCTGGAGCTGGAGCTCTCGGAGGGGGACATC







ACCCAGAAGGGCTATGAAAAGAAAAGGTCCAAACTCC







TATCTCCTTACAGCCCGCAGACACAAGAAACTGATTCA




GCAGTACAGAAAGAACTTAGAAACCAGACACCTGCTC







CATCTGCAGCTCAAACTTCTGCTCCCTCTAAGTACCACC







GAACTCGATCTGGGGGAGCCAGGGATGAACGATATCG







ATCAGATATCCACACAGAAGCAGTTCAGGCTGCACTG







GCAAAGCATAAAGAACAGAAGATGGCTTTGCCCATGC







CAACCAAAAGGCGATCCACATTTGTTCAGTCTCCTGCA







GATGCCTGCACACCTCCTGACACATCTTCGGCCTCTGA







GGATGAGGGCTCTCTGAGACGCCAAGCTGCGCTCTCT







GCTGCCTTGCAACAGAGCTTACAGAATGCTGAGTCCTG







GATCAACCGTTCAATTCAGGGATCGTCCACTTCTTCATC







CGCATCTTCTACGCTGTCCCACGGAGAGGTCAAAGGA







ACCAGTGGGTCTCTAGCTGATGTATTTGCCAATACTCG







AATAGAGAATTTCTCTGCTCCTCCTGATGTCACTACAAC







TACCTCTTCCTCCTCATCATCTTCCTCAATTCGCCCAGCA







AACATTGACCTGCCCCCCTCGGGGATAGTTAAAGGCAT







GCACAAAGGATCCAACAGGTCCAGCCTTATGGATACA







GCTGATGGTGTTCCTGTCAGTAGCAGAGTATCTACAAA







AATCCAGCAGCTTCTGAACACTCTGAAACGACCCAAAA







GGCCTCCCTTAAAGGAATTTTTTGTGGATGACTCTGAA







GAAATTGTGGAAGTACCTCAGCCAGACCCCAACCAGC







CCAAGCCGGAGGGACGGCAGATGACCCCTGTGAAAG







GAGAGCCTTTAGGAGTCATCTGTAACTGGCCTCCTGCT







CTTGAATCTGCCCTGCAGCGCTGGGGTACCACTCAAGC







AAAATGCTCCTGTCTGACTGCACTGGACATGACAGGG







AAACCAGTTTACACTCTTACATATGGAAAGTTGTGGAG







CAGAAGTTTAAAGTTGGCCTACACACTTCTTAATAAAC







TGGGGACCAAAAATGAACCTGTGTTAAAACCTGGAGA







CAGGGTAGCCCTGGTTTACCCCAACAATGATCCAGTCA







TGTTTATGGTGGCTTTCTATGGATGCCTCCTGGCAGAA







GTGATTCCAGTGCCTATAGAGGTACCTCTTACCAGAAA







GGATGCTGGAGGTCAGCAGATTGGCTTCTTGCTAGGA







AGCTGTGGTATTGCCTTAGCTCTTACCAGTGAAGTTTG







TCTAAAAGGACTGCCAAAAACCCAGAATGGAGAAATT







GTACAGTTTAAAGGTTGGCCCCGGCTCAAATGGGTTGT







AACAGATTCCAAGTACCTTTCAAAGCCACCGAAAGACT







GGCAGCCACACATCTCACCTGCTGGGACAGAACCGGC







ATACATTGAGTATAAAACAAGCAAAGAAGGGAGTGTA







ATGGGAGTTACAGTATCCCGGCTTGCAATGTTGTCTCA







CTGCCAAGCTCTGTCGCAGGCCTGCAATTATTCTGAAG







GGGAAACAATAGTAAATGTCTTAGACTTTAAGAAGGA







TGCTGGGCTGTGGCACGGCATGTTTGCGAATGTAATG







AATAAGATGCACACAATCAGCGTACCCTACTCTGTTAT







GAAAACCTGTCCTCTCTCTTGGGTCCAAAGAGTACATG







CTCACAAAGCCAAGGTAGCTTTAGTAAAATGTCGGGA







CTTGCACTGGGCTATGATGGCACATCGGGACCAAAGA







GACGTGAGCTTGAGTTCCCTCCGAATGTTAATTGTGAC







TGATGGAGCTAACCCCTGGTCCGTGTCATCCTGTGATG







CCTTCCTGAGTCTGTTCCAAAGTCATGGACTGAAGCCT







GAGGCCATCTGTCCGTGCGCCACGTCTGCTGAAGCCAT







GACTGTAGCAATCCGCAGGCCTGGAGTTCCAGGAGCC







CCTTTGCCAGGAAGAGCCATTCTCTCAATGAATGGATT







GAGCTATGGGGTAATACGGGTCAATACTGAAGATAAA







AATTCAGCACTGACGGTCCAGGATGTAGGGCATGTAA







TGCCTGGTGGGATGATGTGCATTGTGAAACCAGATGG







ACCTCCCCAGCTCTGCAAAACAGATGAAATTGGAGAA







ATCTGTGTTAGCTCCAGAACTGGAGGCATGATGTACTT







TGGGCTTGCTGGTGTGACAAAAAATACATTTGAGGTA







ATTCCAGTGAATTCTGCAGGCTCTCCTGTTGGGGATGT







GCCATTCATCCGATCAGGATTGCTGGGGTTTGTAGGGC







CGGGTAGTTTGGTGTTCGTGGTTGGGAAAATGGATGG







CTTACTGATGGTTAGTGGTCGAAGACATAATGCTGATG







ACATTGTTGCTACTGGATTGGCTGTAGAATCAATAAAG







ACTGTTTATAGAGGAAGAATTGCTGTGTTTTCTGTGTC







TGTATTTTATGATGAGCGCATTGTGGTGGTTGCGGAAC







AAAGACCTGATGCTTCTGAGGAAGATAGTTTCCAGTG







GATGAGCCGCGTGCTGCAGGCGATCGATAGCATTCAT







CAAGTGGGGGTTTATTGTCTTGCTCTGGTGCCAGCCAA







TACATTGCCAAAAACTCCACTAGGAGGAATCCATATAT







CTCAGACGAAACAACTCTTTCTGGAGGGATCACTGCAT







CCTTGCAACATCCTCATGTGCCCCCATACATGTGTGAC







AAACTTGCCAAAGCCCCGGCAAAAACAACCAGGTGTA







GGCCCTGCTTCCGTGATGGTTGGGAATCTGGTTGCTG







GAAAACGTATAGCACAAGCTGCTGGAAGGGATCTGGG







ACAAATAGAAGAGAATGATTTGGTGAGGAAGCACCAG







TTTCTGGCAGAGATCCTACAGTGGCGAGCCCAGGCGA







CTCCTGACCATGTACTCTTCATGCTGTTAAATGCCAAG







GGAACCACTGTATGCACAGCCAGCTGCCTTCAGCTTCA







TAAGCGAGCAGAGAGGATTGCATCTGTTCTTGGTGAT







AAGGGACATCTAAATGCAGGAGATAATGTGGTGTTGC







TCTATCCACCTGGCATTGAGTTAATCGCCGCCTTCTATG







GCTGCCTGTATGCGGGCTGTATACCTGTGACCGTCAGA







CCTCCACATGCTCAGAACCTCACGGCCACGCTGCCCAC







TGTCCGAATGATTGTTGATGTCAGCAAAGCAGCCTGTA







TTCTCACCAGTCAGACCCTAATGAGGCTACTGAGGTCC







CGAGAGGCAGCAGCAGCTGTGGATGTGAAAACCTGG







CCAACCATCATTGACACAGATGATTTACCCAGGAAAAG







GTTACCTCAGCTGTATAAACCGCCCACTCCTGAGATGT







TGGCATATCTTGATTTTAGTGTCTCCACAACTGGCATGC







TTACAGGAGTGAAGATGTCCCACTCTGCAGTGAACGCT







CTGTGTCGAGCCATCAAGCTCCAGTGTGAGTTGTACTC







TTCTCGGCAGATCGCCATCTGCCTTGACCCTTACTGTG







GACTTGGCTTCGCGCTCTGGTGTCTCTGCAGTGTCTAT







TCAGGCCACCAGTCTGTCTTAATTCCTCCTATGGAGTTA







GAGAACAACCTTTTCCTCTGGCTCTCCACAGTCAACCA







GTACAAAATAAGGGACACTTTCTGCTCCTATTCAGTGA







TGGAGCTCTGCACCAAAGGTCTTGGGAACCAAGTGGA







AGTGCTAAAGACCAGAGGGATCAACCTCTCCTGCGTCC







GGACCTGTGTGGTGGTGGCGGAGGAGAGGCCCCGCG







TTGCACTCCAGCAGTCCTTCTCTAAGCTCTTCAAAGACA







TCGGGCTGTCCCCGCGGGCTGTCAGCACCACTTTTGGA







TCAAGAGTCAATGTAGCAATATGTTTACAGGGAACCTC







AGGGCCTGATCCGACTACTGTGTATGTGGATCTGAAAT







CACTAAGACATGACAGGGTTCGTCTCGTGGAACGTGG







CGCCCCTCAGAGTTTGCTTCTCTCAGAGTCTGGAAAG|







AGATCGGGAAGTAAACAGTCCACTAACCCTGCCGATA







ACTATCATCTGGCCCGGAGGAGAACCCTGCAGGTGGT







TGTGAGCTCCTTGCTGACAGAGGCAGGGTTTGAGAGT







GCCGAGAAAGCATCCGTGGAAACGCTGACAGAGATGC







TGCAGAGCTACATTTCAGAAATTGGGAGAAGTGCCAA







GTCTTACTGTGAGCACACAGCCAGGACCCAGCCCACAC







TGTCCGATATCGTGGTCACACTTGTTGAGATGGGTTTC







AATGTGGACACTCTCCCTGCTTATGCAAAACGGTCTCA







GAGGATGGTCATCACTGCTCCTCCGGTGACCAATCAGC







CAGTGACCCCCAAGGCCCTCACTGCAGGGCAGAACCG







ACCCCACCCGCCGCACATCCCCAGCCATTTTCCTGAGTT







CCCTGATCCCCACACCTACATCAAAACTCCGACGTACC







GTGAGCCCGTGTCAGACTACCAGGTCCTGCGGGAGAA







GGCTGCATCCCAGAGGCGCGATGTGGAGCGGGCACTT







ACCCGTTTCATGGCCAAGACAGGCGAGACTCAGAGTC







TTTTCAAAGATGACGTCAGCACATTTCCATTGATTGCTG







CCAGACCTTTCACCATCCCCTACCTGACAGCTCTTCTTC







CGTCTGAACTGGAGATGCAACAAATGGAAGAGACAGA







TTCCTCGGAGCAGGATGAACAGACAGACACAGAGAAC







CTTGCTCTTCATATCAGCATGatagagtctcgctccgtcaccca







ggctggagtgcagtggcaagatcttggctcactgcaacctccgcctcct







gggttcaagcgattctccagcctcagcctcctgagtagctggaattaca







gGAGGATTCTGGAGCCGAGAAGGAGAACACCTCTGTC







CTGCAGCAGAACCCCTCCTTGTCGGGTAGCCGGAATG







GGGAGGAGAACATCATCGATAACCCTTATCTGCGGCC







GG


G17787.TCGA-26-5139-01A-01R-1850-01.4
13920034
156012704
InFrame
8373
ATGGAACCCCCCGCGGCCAAGCGGAGCCGGGGCTGCC







CCGCGGGACCCGAGGAGCGCGATgccggggccggggccgc







gcgtggccggggccggcccgAGGCGCTGCTGGACCTCAGCG







CCAAGCGGGTAGCCGAGAGCTGGGCCTTCGAGCAGGT







GGAGGAGCGGTTCTCCCGGGTGCCTGAGCCCGTCCAG




AAGCGCATCGTGTTTTGGTCGTTTCCACGCAGTGAACG







GGAAATATGTATGTACTCGTCGCTGGGTTACCCGCCCC







CAGAGGGCGAGCACGATGCCCGGGTGCCCTTTACCCG







CGGGCTGCACCTGCTCCAGAGCGGGGCCGTGGACCGC







GTGTTGCAAGTGGGATTCCACCTGAGCGGAAACATCC







GCGAGCCAGGGAGTCCTGGAGAGCCCGAGCGCCTCTA







CCATGTCTCCATCAGCTTTGATCGCTGCAAGATCACGT







CCGTGAGCTGCGGCTGTGACAACCGCGACCTCTTCTAC







TGTGCCCACGTGGTGGCCCTGTCCCTGTACCGCATTCG







GCACGCCCACCAGGTGGAGCTGCGGCTGCCCATCTCC







GAGACGCTCTCCCAGATGAACCGGGACCAGCTGCAGA







AGTTCGTGCAGTACCTCATCAGCGCCCATCACACTGAG







GTGCTGCCCACTGCTCAGCGCTTGGCTGATGAGATCCT







CCTGCTGGGCTCCGAGATCAACTTGGTGAATGGTGCCC







CAGACCCCACCGCCGGCGCAGGAATCGAGGACGCCAA







CTGCTGGCACCTGGACGAGGAGCAGATCCAGGAGCA







GGTGAAGCAGCTACTGTCCAATGGCGGCTACTACGGG







GCCAGCCAGCAGCTGCGCTCCATGTTCAGCAAGGTGC







GGGAGATGCTGCGAATGCGGGACTCCAACGGGGCGC







GCATGCTGATTCTCATGACCGAGCAGTTCCTGCAGGAC







ACGCGCCTGGCCCTGTGGCGGCAGCAGGGCGCGGGC







ATGACGGACAAGTGCCGGCAGCTCTGGGATGAGCTGG







|GGATGTTCAATAGCCCAGAAATGCAAGCCCTCCTCCA







GCAGATCTCTGAGAACCCCCAGCTGATGCAGAATGTG







ATCTCAGCACCCTACATGCGCAGCATGATGCAGACGCT







TGCCCAGAACCCCGACTTTGCTGCTCAGATGATGGTGA







ATGTGCCGCTCTTCGCGGGGAACCCCCAACTGCAGGA







GCAGCTCCGCCTGCAGCTCCCAGTCTTCCTGCAGCAGA







TGCAGAACCCAGAGTCACTCTCCATCCTTACCAATCCCC







GAGCCATGCAGGCATTGCTGCAGATCCAGCAGGGACT







ACAGACCTTGCAGACCGAGGCCCCTGGGCTGGTACCC







AGCCTTGGCTCCTTTGGGATATCCCGGACCCCAGCACC







CTCAGCAGGCAGCAACGCAGGGTCTACGCCCGAGGCC







CCCACTTCCTCACCAGCCACGCCAGCCACATCTTCTCCA







ACAGGGGCTTCCAGCGCCCAGCAGCAACTCATGCAGC







AGATGATCCAGCTTTTGGCTGGAAGTGGAAACTCACA







GGTGCAGACGCCAGAAGTGAGATTTCAGCAGCAGCTG







GAGCAGCTCAACTCCATGGGCTTCATCAATCGTGAGGC







TAACCTGCAGGCCCTGATTGCCACAGGAGGGGACATC







AACGCAGCTATCGAGAGACTGCTGGGCTCCCAGCTCTC







C


G17675.TCGA-19-2624-01A-01R-1850-01.2
65232631
63226059
InFrame
8374
ATGGACGTCCTTCCCAccggcgggggccgcccggggcTCCGG







ACGGAGCTGGAATTCCGCGGCGGCGGTGGCGAGGCG







AGGCTGGAGAGTCAGGAGGAAGAAACGATTCCTGCA







GCTCCCCCAGCCCCGCGCCTCCGGGGAGCGGCGGAGC







GGCCGCGGCGCTCCCGGGACACGTGGGACGGCGATG




AGGACACGGAGCCCGGCGAGGCGTGCGGCGGCCGCA







CAAGCCGCACGGCGTCCCTGGTGAGCGGGCTGCTCAA







CGAGCTGTACAGCTGCACAGAGGAGGAGGAggcggcgg







gcgggggccgcggggccgagggccgccggcggcgccgcgACAGCC







TCGACAGCTCCACCGAGGCCTCGGGCTCCGACGTGGT







CCTGGGCGGCCGCAGCGGTGCCGGCGACTCCCGCGTG







CTGCAGGAGCTGCAGGAGCGACCGAGCCAGCGGCAT







CAGATGCTGTACCTGCGGCAGAAAGACGCTAATGAAC







TGAAGACGATCCTTCGAGAGCTAAAGTACAGAATTGG







CATCCAGTCGGCCAAGTTACTTCGGCATCTGAAGCAGA







AAGATAGGCTTCTGCATAAAGTGCAGAGGAACTGTGA







TATTGTGACTGCCTGCTTGCAGGCTGTGTCACAGAAGA







GAAGAGTTGATACCAAGTTGAAATTCACTCTTGAGCCA







TCTTTAGGTCAAAATGGTTTTCAGCAGTGGTACGATGC







TCTCAAGGCAGTTGCCAGGCTATCCACAGGAATACCAA







AGGAATGGAGGAGAAAGGTTTGGTTGACCTTGGCAG







ATCATTATTTGCACAGTATAGCCATTGACTGGGACAAA







ACCATGCGCTTCACTTTCAATGAAAGGAGTAATCCTGA







TGATGACTCCATGGGAATTCAGATAGTCAAGGACCTTC







ACCGCACAGGCTGTAGTTCTTACTGTGGCCAGGAGGC







TGAGCAGGACAGGGTTGTGTTGAAGCGGGTGCTGCTG







GCCTATGCCCGATGGAACAAAACTGTTGGGTACTGCC







AAGGCTTTAACATCCTGGCTGCACTAATTCTGGAAGTG







ATGGAAGGCAATGAAGGGGATGCCCTGAAAATTATGA







TTTACCTTATTGATAAGGTACTTCCCGAAAGCTATTTCG







TCAATAATCTCCGGGCATTGTCTGTGGATATGGCTGTC







TTCAGAGACCTTTTAAGAATGAAGCTGCCGGAATTATC







TCAGCACCTGGATACTCTTCAGAGAACTGCAAACAAAG







AAAGTGGAGGTGGATATGAGCCCCCACTTACAAATGT







CTTCACGATGCAGTGGTTTCTGACTCTCTTTGCCACATG







CCTCCCTAATCAGACCGTTTTAAAGATCTGGGATTCAG







TCTTCTTTGAAGGTTCAGAAATCATCCTAAGGGTGTCG







CTGGCTATCTGGGCAAAATTAGGAGA|GGTTATCAAT







GCCGGGAAGAGCACACACAATGAAGACCAAGCCAGCT







GTGAGGTGCTCACTGTGAAGAAGAAGGCAGGGGCCG







TGACCTCAACCCCAAACAGGAACTCATCCAAGAGACG







GTCCTCCCTTCCCAATGGGGAAGGGCTGCAGCTGAAG







GAGAACTCGGAATCCGAGGGTGTTTCCTGCCACTATTG







GTCGCTGTTTGACGGGCACGCGGGGTCCGGGGCCGC







GGTGGTGGCGTCACGCCTGCTGCAGCACCACATCACG







GAGCAGCTGCAGGACATCGTGGACATCCTGAAGAACT







CCGCCGTCCTGCCCCCTACCTGCCTGGGGGAGGAGCCT







GAGAACACGCCCGCCAACAGCCGGACTCTGACCCGGG







CAGCCTCCCTGCGCGGAGGGGTGGGGGCCCCGGGCTC







CCCCAGCACGCCCCCCACACGCTTCTTTACCGAGAAGA







AGATTCCCCATGAGTGCCTGGTCATCGGAGCGCTTGAA







AGTGCATTCAAGGAAATGGACCTACAGATAGAACGAG







AGAGGAGTTCATATAATATATCTGGTGGCTGCACGGC







CCTCATTGTGATTTGCCTTTTGGGGAAGCTGTATGTTG







CAAATGCTGGGGATAGCAGGGCCATAATCATCAGAAA







TGGAGAAATTATCCCCATGTCTTCAGAATTTACCCCCG







AGACGGAGCGCCAGCGACTTCAGTACCTGGCATTCAT







GCAGCCTCACTTGCTGGGAAATGAGTTCACACATTTGG







AGTTTCCAAGGAGAGTACAGAGAAAGGAGCTTGGAA







AGAAGATGCTCTACAGGGACTTTAATATGACAGGCTG







GGCATACAAAACCATTGAGGATGAGGACTTGAAGTTC







CCCCTTATATATGGAGAAGGCAAGAAGGCCCGGGTAA







TGGCAACTATTGGAGTGACCAGGGGACTTGGGGACCA







TGACCTGAAGGTGCATGACTCCAACATCTACATTAAAC







CATTCCTGTCTTCAGCTCCAGAGGTAAGAATCTACGAT







CTTTCAAAATATGATCATGGATCAGATGATGTGCTGAT







CTTGGCCACTGATGGACTCTGGGACGTTTTATCAAATG







AAGAAGTAGCAGAAGCAATCACTCAGTTTCTTCCTAAC







TGTGATCCAGATGATCCTCACAGGTACACACTGGCAGC







TCAGGACCTGGTGATGCGTGCCCGGGGTGTGCTGAAG







GACAGAGGATGGCGGATATCTAATGACCGACTGGGCT







CAGGAGACGACATTTCTGTATATGTCATTCCTTTAATAC







ATGGAAACAAGCTGTCA


G17800.
48586286
56936733
InFrame
8375
ATGGACAATATGTCTATTACGAATACACCAACAAGTAA


TCGA-06-




TGATGCCTGTCTGAGCATTGTGCATAGTTTGATGTGCC


5859-




ATAGACAAGGTGGAGAGAGTGAAACATTTGCAAAAAG


01A-01R-




AGCAATTGAAAGTTTGGTAAAGAAGCTGAAGGAGAAA


1849-




AAAGATGAATTGGATTCTTTAATAACAGCTATAACTAC


01.4




AAATGGAGCTCATCCTAGTAAATGTGTTACCATACAGA







GAACATTGGATGGGAGGCTTCAGGTGGCTGGTCGGAA







AGGATTTCCTCATGTGATCTATGCCCGTCTCTGGAGGT







GGCCTGATCTTCACAAAAATGAACTAAAACATGTTAAA







TATTGTCAGTATGCGTTTGACTTAAAATGTGATAGTGT







CTGTGTGAATCCATATCACTACGAACGAGTTGTATCAC







CTGGAATTGATCTCTCAGGATTAACACTGCAGAGTAAT







GCTCCATCAAGTATGATGGTGAAGGATGAATATGTGC







ATGACTTTGAGGGACAGCCATCGTTGTCCACTGAAGG







ACATTCAATTCAAACCATCCAGCATCCACCAAGTAATC







GTGCATCGACAGAGACATACAGCACCCCAGCTCTGTTA







GCCCCATCTGAGTCTAATGCTACCAGCACTGCCAACTT







TCCCAACATTCCTGTGGCTTCCACAAGTCAGCCTGCCA







GTATACTGGGGGGCAGCCATAGTGAAGGACTGTTGCA







GATAGCATCAGGGCCTCAGCCAGGACAGCAGCAGAAT







GGATTTACTGGTCAGCCAGCTACTTACCATCATAACAG







CACTACCACCTGGACTGGAAGTAGGACTGCACCATAC







ACACCTAATTTGCCTCACCACCAAAACGGCCATCTTCA







GCACCACCCGCCTATGCCGCCCCATCCCGGACATTACT







GGCCTGTTCACAATGAGCTTGCATTCCAGCCTCCCATTT







CCAATCATCCTG|GTGTGGTTCCAGAACCGACGGGC


G17638.
234299129
234967480
InFrame
8376
ATggcggcggcggcgggcgcccctccgccgggtcccccgcaaccgcct


TCGA-28-




ccgccgccgccgcccgAGGAGTCGTCCGACAGCGAGCCCG


2499-




AGGCGGAGCCCGGCTCCCCACAGAAGCTCATCCGCAA


01A-01R-




GGTGTCCACGTCGGGTCAGATCCGACAGAAGACCATC


1850-




ATCAAAGAGGGGATGCTGACCAAACAGAACAATTCAT


01.2




TCCAGCGATCAAAAAGGAGATACTTTAAGCTTCGAGG







GCGAACGCTTTACTATGCCAAAACGGCAAAGTCAATCA







TATTTGATGAGGTGGATCTGACAGATGCCAGCGTAGC







TGAATCCAGTACCAAAAACGTCAACAACAGTTTTACG|







GTTGAGGTCCTAGATGAGAACAACTTGGTCATGAATTT







AGAGTTCAGCATCCGGGAGACTACATGCAGGAAGGAT







TCTGGAGAAGATCCCGCTACATGTGCCTTCCAGAGGG







ACTACTATGTGTCCACAGCTGTTTGCAGAAGCACCGTG







AAGGTATCTGCCCAGCAGGTGCAGGGCGTGCATGCTC







GCTGCAGCTGGTCCTCCTCCACGTCTGAGTCTTACAGC







AGCGAAGAGATGATTTTTGGGGACATGTTGGGATCTC







ATAAATGGAGAAACAATTATCTATTTGGTCTCATTTCA







GACGAGTCCATAAGTGAACAATTTTATGATCGGTCACT







TGGGATCATGAGAAGGGTATTGCCTCCTGGAAACAGA







AGGTACCCAAACCACCGGCACAGAGCAAGAATAAATA







CTGACTTTGAG



















sample, Protein Start 5p, Protein Stop 5p, Protein Start 3p, Protein Stop 3p, SEQ ID NO:, Protein Sequence, Exon Break 5p, Exon Break 3p





G17807.
1
982
190
225
8377
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
6


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




28-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




5209-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.4





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQDAFIGFGGNVI










RQQVKDNAKWYITDFVELLGELEE




G17197.
1
336
240
432
8378
MGETMSKRLKLHLGGEAEMEERAFVNPFPDY
6
7


TCGA-





EAAAGALLASGAAEETGCVRPPATTDEPGLPF




06-





HQDGKIIHNFIRRIQTKIKDLLQQMEEGLKTAD




0211-





PHDCSAYTGWTGIALLYLQLYRVTCDQTYLLR




01B-





SLDYVKRTLRNLNGRRVTFLCGDAGPLAVGAV




01R-





IYHKLRSDCESQECVTKLLQLQRSVVCQESDLP




1849-





DELLYGRAGYLYALLYLNTEIGPGYVCESAIKEV




01.2





VNAIIESGKTLSREERKTERCPLLYQWHRKQYV










GAAHGMAGIYYMLMQPAAKVDQETLTEMV










KPSIDYVRHKKFRSGNYPSSLSNETDRLVHWC










HGAPGVIHMLMQAYKGLLPFAVVGSTDEVK










VGKRMVRGRHYPWGVLQVENENHCDFVKL










RDMLLCTNMENLKEKTHTQHYECYRYQKLQK










MGFTDVGPNNQPVSFQEIFEAKRQEFYDQC










QREEEELKQRFMQRVKEKEATFKEAEKELQDK










FEHLKMIQQEEIRKLEEEKKQLEGEIIDFYKMK










AASEALQTQLSTDTKKDKHRKK




G17650.
1
982
373
432
8379
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




28-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




2513-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




G17506.
1
760
647
838
8380
MGAPACALALCVAVAIVAGASSESLGTEQRV
17
11


TCGA-





VGRAAEVPGPEPGQQEQLVFGSGDAVELSCP




27-





PPGGGPMGPTVWVKDGTGLVPSERVLVGPQ




1835-





RLQVLNASHEDSGAYSCRQRLTQRVLCHFSVR




01A-





VTDAPSSGDDEDGEDEAEDTGVDTGAPYWT




01R-





RPERMDKKLLAVPAANTVRFRCPAAGNPTPSI




1850-





SWLKNGREFRGEHRIGGIKLRHQQWSLVMES




01.2





VVPSDRGNYTCVVENKFGSIRQTYTLDVLERS










PHRPILQAGLPANQTAVLGSDVEFHCKVYSDA










QPHIQWLKHVEVNGSKVGPDGTPYVTVLKS










WISESVEADVRLRLANVSERDGGEYLCRATNF










IGVAEKAFWLSVHGPRAAEEELVEADEAGSVY










AGILSYGVGFFLFILVVAAVTLCRLRSPPKKGLG










SPTVHKISRFPLKRQVSLESNASMSSNTPLVRI










ARLSSGEGPTLANVSELELPADPKWELSRARLT










LGKPLGEGCFGQVVMAEAIGIDKDRAAKPVT










VAVKMLKDDATDKDLSDLVSEMEMMKMIG










KHKNIINLLGACTQGGPLYVLVEYAAKGNLREF










LRARRPPGLDYSFDTCKPPEEQLTFKDLVSCAY










QVARGMEYLASQKCIHRDLAARNVLVTEDNV










MKIADFGLARDVHNLDYYKKTTNGRLPVKW










MAPEALFDRVYTHQSDVWSFGVLLWEIFTLG










GSPYPGIPVEELFKLLKEGHRMDKPANCTHDL










YMIMRECWHAAPSQRPTFKQLVEDLDRVLTV










TSTDVKATQEENRELRSRCEELHGKNLELGKI










MDRFEEVVYQAMEEVQKQKELSKAEIQKVLK










EKDQLTTDLNSMEKSFSDLFKRFEKQKEVIEGY










RKNEESLKKCVEDYLARITQEGQRYQALKAHA










EEKLQLANEEIAQVRSKAQAEALALQASLRKE










QMRIQSLEKTVEQKTKENEELTRICDDLISKME










KI




G17191.
1
336
240
432
8381
MGETMSKRLKLHLGGEAEMEERAFVNPFPDY
6
7


TCGA-





EAAAGALLASGAAEETGCVRPPATTDEPGLPF




06-





HQDGKIIHNFIRRIQTKIKDLLQQMEEGLKTAD




0211-





PHDCSAYTGWTGIALLYLQLYRVTCDQTYLLR




01A-





SLDYVKRTLRNLNGRRVTFLCGDAGPLAVGAV




01R-





IYHKLRSDCESQECVTKLLQLQRSVVCQESDLP




1849-





DELLYGRAGYLYALLYLNTEIGPGTVCESAIKEV




01.2





VNAIIESGKTLSREERKTERCPLLYQWHRKQYV










GAAHGMAGIYYMLMQPAAKVDQETLTEMV










KPSIDYVRHKKFRSGNYPSSLSNETDRLVHWC










HGAPGVIHMLMQAYKGLLPFAVVGSTDEVK










VGKRMVRGRHYPWGVLQVENENHCDFVKL










RDMLLCTNMENLKEKTHTQHYECYRYQKLQK










MGFTDVGPNNQPVSFQEIFEAKRQEFYDQC










QREEEELKQRFMQRVKEKEATFKEAEKELQDK










FEHLKMIQQEEIRKLEEEKKQLEGEIIDFYKMK










AASEALQTQLSTDTKKDKHRKK




G17512.
1
982
373
432
8382
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




27-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




1837-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKIVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




NYU_A
1
760
548
838
8383
MGAPACALALCVAVAIVAGASSESLGTEQRV
17
8








VGRAAEVPGPEPGQQEQLVFGSGDAVELSCP










PPGGGPMGPTVWVKDGTGLVPSERVLVGPQ










RLQVLNASHEDSGAYSCRQRLTQRVLCHFSVR










VTDAPSSGDDEDGEDEAEDTGVDTGAPYWT










RPERMDKKLLAVPAANTVRFRCPAAGNPTPSI










SWLKNGREFRGEHRIGGIKLRHQQWSLVMES










VVPSDRGNYTCVVENKFGSIRQTYTLDVLERS










PHRPILQAGLPANQTAVLGSDVEFHCKVYSDA










QPHIQWLKHVEVNGSKVGPDGTPYVTVLKS










WISESVEADVRLRLANVSERDGGEYLCRATNF










IGVAEKAFWLSVHGPRAAEEELVEADEAGSVY










AGILSYGVGFFLFILVVAAVTLCRLRSPPKKGLG










SPTVHKISRFPLKRQVSLESNASMSSNTPLVRI










ARLSSGEGPTLANVSELELPADPKWELSRARLT










LGKPLGEGCFGQVVMAEAIGIDKDRAAKPVT










VAVKMLKDDATDKDLSDLVSEMEMMKMIG










KHKNIINLLGACTQGGPLYVLVEYAAKGNLREF










LRARRPPGLDYSFDTCKPPEEQLTFKDLVSCAY










QVARGMEYLASQKCIHRDLAARNVLVTEDNV










MKIADFGLARDVHNLDYYKKTTNGRLPVKW










MAPEALFDRVYTHQSDVWSFGVLLWEIFTLG










GSPYPGIPVEELFKLLKEGHRMDKPANCTHDL










YMIMRECWHAAPSQRPTFKQLVEDLDRVLTV










TSTDFKESALRKQSLYLKFDPLLRDSPGRPVPV










ATETSSMHGANETPSGRPREAKLVEFDFLGAL










DIPVPGPPPGVPAPGGPPLSTGPIVDLLQYSQ










KDLDAVVKATQEENRELRSRCEELHGKNLELG










KIMDRFEEVVYQAMEEVQKQKELSKAEIQKVL










KEKDQLTTDLNSMEKSFSDLFKRFEKQKEVIEG










YRKNEESLKKCVEDYLARITQEGQRYQALKAH










AEEKLQLANEEIAQVRSKAQAEALALQASLRK










EQMRIQSLEKTVEQKTKENEELTRICDDLISKM










EKI




G17814.
1
834
399
796
8384
MARQPPPPWVHAAFLLCLLSLGGAIEIPMDLT
21
10


TCGA-





QPPTITKQSAKDHIVDPRDNILIECEAKGNPAP




06-





SFHWTRNSRFFNIAKDPRVSMRRRSGTLVIDF




5411-





RSGGRPEEYEGEYQCFARNKFGTALSNRIRLQ




01A-





VSKSPLWPKENLDPVVVQEGAPLTLQCNPPP




01R-





GLPSPVIFWMSSSMEPITQDKRVSQGHNGDL




1849-





YFSNVMLQDMQTDYSCNARFHFTHTIQQKN




01.4





PFTLKVLTNHPYNDSSLRNHPDMYSARGVAE










RTPSFMYPQGTASSQMVLRGMDLLLECIASG










VPTPDIAWYKKGGDLPSDKAKFENFNKALRIT










NVSEEDSGEYFCLASNKMGSIRHTISVRVKAA










PYWLDEPKNLILAPGEDGRLVCRANGNPKPT










VQWMVNGEPLQSAPPNPNREVAGDTIIFRD










TQISSRAVYQCNTSNEHGYLLANAFVSVLDVP










PRMLSPRNQLIRVILYNRTRLDCPFFGSPIPTLR










WFKNGQGSNLDGGNYHVYENGSLEIKMIRKE










DQGIYTCVATNILGKAENQVRLEVKDPTRIYR










MPEDQVARRGTTVQLECRVKHDPSLKLTVSW










LKDDEPLYIGNRMKKEDDSLTIFGVAERDQGS










YTCVASTELDQDLAKAYLTVLADQATPTNRLA










ALPKGRPDRPRDLELTDLAERSVRLTWIPGDA










NNSPITDYVVQFEEDQFQPGVWHDHSKYPG










SVNSAVLRLSPYVNYQFRVIAINEVGSSHPSLP










SERYRTSGAPPESNPGDVKGEGTRKNNMEIT










WTPMNATSAFGPNLRYIVKWRRRETREAWN










NVTVWGSRYVVGQTPVYVPYEIRVQAENDFG










KGPEPESVIGYSGEDYTNSTSGDPVEKKDETPF










GVSVAVGLAVFACLFLSTLLLVLNKCGRRNKF










GINRPAVLAPEDGLAMSLHFMTLGGSSLSPTE










GKGSGLQGHIIENPQYFSDACVHHIKRRDIVLK










WELGEGAFGKVFLAECHNLLPEQDKMLVAVK










ALKEASESARQDFQREAELLTMLQHQHIVRFF










GVCTEGRPLLMVFEYMRHGDLNRFLRSHGPD










AKLLAGGEDVAPGPLGLGQLLAVASQVAAG










MVYLAGLHFVHRDLATRNCLVGQGLVVKIGD










FGMSRDIYSTDYYRVGGRTMLPIRWMPPESIL










YRKFTTESDVWSFGVVLWEIFTYGKQPWYQL










SNTEAIDCITQGRELERPRACPPEVYAIMRGC










WQREPQQRHSIKDVHARLQALAQAPPVYLD










VLG




G17223.
1
982
373
432
8385
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




06-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




0750-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1849-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




G17798.
1
982
373
432
8386
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




32-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




5222-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.4





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




G17195.
1
171
63
246
8387
MFKRMAEFGPDSGGRVKGVTIVKPIVYGNVA
6
2


TCGA-





RYFGKKREEDGHTHQWTVYVKPYRNEDMSA




06-





YVKKIQFKLHESYGNPLRVVTKPPYEITETGWG




0138-





EFEIIIKIFFIDPNERPVTLYHLLKLFQSDTNAML




01A-





GKKTVVSEFYDEMIFQDPTAMMQQLLTTSRQ




02R-





LTLGAYKHETEYPYVKLLLDAMKHSGCAVNKD




1849-





RHFSCEDCNGNVSGGFDASTSQIVLCQNNIH




01.2





NQAHMNRVVTHELIHAFDHCRAHVDWFTNI










RHLACSEVRAANLSGDCSLVNEIFRLHFGLKQ










HHQTCVRDRATLSILAVRNISKEVAKKAVDEV










FESCFNDHEPFGRIPHNKTYARYAHRDFENRD










RYYSNI




G17803.
1
760
612
838
8388
MGAPACALALCVAVAIVAGASSESLGTEQRV
17
10


TCGA-





VGRAAEVPGPEPGQQEQLVFGSGDAVELSCP




76-





PPGGGPMGPTVWVKDGTGLVPSERVLVGPQ




4925-





RLQVLNASHEDSGAYSCRQRLTQRVLCHFSVR




01A-





VTDAPSSGDDEDGEDEAEDTGVDTGAPYWT




01R-





RPERMDKKLLAVPAANTVRFRCPAAGNPTPSI




1850-





SWLKNGREFRGEHRIGGIKLRHQQWSLVMES




01.4





VVPSDRGNYTCVVENKFGSIRQTYTLDVLERS










PHRPILQAGLPANQTAVLGSDVEFHCKVYSDA










QPHIQWLKHVEVNGSKVGPDGTPYVTVLKS










WISESVEADVRLRLANVSERDGGEYLCRATNF










IGVAEKAFWLSVHGPRAAEEELVEADEAGSVY










AGILSYGVGFFLFILVVAAVTLCRLRSPPKKGLG










SPTVHKISRFPLKRQVSLESNASMSSNTPLVRI










ARLSSGEGPTLANVSELELPADPKWELSRARLT










LGKPLGEGCFGQVVMAEAIGIDKDRAAKPVT










VAVKMLKDDATDKDLSDLVSEMEMMKMIG










KHKNIINLLGACTQGGPLYVLVEYAAKGNLREF










LRARRPPGLDYSFDTCKPPEEQLTFKDLVSCAY










QVARGMEYLASQKCIHRDLAARNVLVTEDNV










MKIADFGLARDVHNLDYYKKTTNGRLPVKW










MAPEALFDRVYTHQSDVWSFGVLLWEIFTLG










GSPYPGIPVEELFKLLKEGHRMDKPANCTHDL










YMIMRECWHAAPSQRPTFKQLVEDLDRVLTV










TSTDVPGPPPGVPAPGGPPLSTGPIVDLLQYS










QKDLDAVVKATQEENRELRSRCEELHGKNLEL










GKIMDRFEEVVYQAMEEVQKQKELSKAEIQK










VLKEKDQLTTDLNSMEKSFSDLFKRFEKQKEVI










EGYRKNEESLKKCVEDYLARITQEGQRYQALK










AHAEEKLQLANEEIAQVRSKAQAEALALQASL










RKEQMRIQSLEKTVEQKTKENEELTRICDDLIS










KMEKI




NYU_B
1
31
214
548
8389
MHGGGPPSGDSACPLRTIKRVQFGVLSPDEL
1
5








VPVLRMVEGDTIYDYCWYSLMSSAQPDTSYV










ASSSRENPIHIWDAFTGELRASFRAYNHLDELT










AAHSLCFSPDGSQLFCGFNRTVRVFSTARPGR










DCEVRATFAKKQGQSGIISCIAFSPAQPLYACG










SYGRSLGLYAWDDGSPLALLGGHQGGITHLCF










HPDGNRFFSGARKDAELLCWDLRQSGYPLWS










LGREVTTNQRIYFDLDPTGQFLVSGSTSGAVS










VWDTDGPGNDGKPEPVLSFLPQKDCTNGVSL










HPSLPLLATASGQRVFPEPTESGDEGEELGLPL










LSTRHVHLECRLQLWWCGGAPDSSIPDDHQ










GEKGQGGTEGGVGELI




G17507.
1
982
373
432
8390
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




28-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




1747-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01C-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKIVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




G17469.
1
136
980
998
8391
MADFDTYDDRAYSSFGGGRGSRGSAGGHGS
4
34


TCGA-





RSQKELPTEPPYTAYVGNLPFNTVQGDIDAIFK




06-





DLSIRSVRLVRDKDTDKFKGFCYVEFDEVDSLK




2557-





EALTYDGALLGDRSLRVDIAEGRKQDKGGFGF




01A-





RKGGPDDREIKETDGSSQIKQEPDPTW




01R-










1849-










01.2










G17785.
1
108
119
239
8392
MLTMSVTLSPLRSQDLDPMATDASPMAINM
2
2


TCGA-





TPTVEQGEGEEAMKDMDSDQQYEKPPPLHT




06-





GADWKIVLHLPEIETWLRMTSERVRDLTYSVQ




5413-





QDSDSKHVDVHLVQLKAMVACYPGNGTGYV




01A-





RHVDNPNGDGRCITCIYYLNKNWDAKLHGGI




01R-





LRIFPEGKSFIADVEPIFDRLLFFWSDRRNPHEV




1849-





QPSYATRYAMTVWYFDAEERAEAKKKFRNLT




01.4





RKTESALTED




G17467.
1
83
103
348
8393
MATEGGGKEMNEIKTQFTTREGLYKLLPHSEY
1
3


TCGA-





SRPNRVPFNSQGSNPVRVSFVNLNDQSGNG




14-





DRLCFNVGRELYFYIYKGVRKEIDPSLGVAELP




0736-





DEFFEEDNMLSMGKKMMQEAMSAFPGIDE




02A-





AMSYAEVMRLVKGMNFSVVVFDTAPTGHTL




01R-





RLLNFPTIVERGLGRLMQIKNQISPFISQMCN




2005-





MLGLGDMNADQLASKLEETLPVIRSVSEQFK




01.2





DPEQTTFICVCIAEFLSLYETERLIQELAKCKIDT










HNIIVNQLVFPDPEKPCKMCEARHKIQAKYLD










QMEDLYEDFHIVKLPLLPHEVRGADKVNTFSA










LLLEPYKPPSAQ




GBM-
1
164
59
226
8394
MAAAAAAGAGPEMVRGQVFDVGPRYTNLS
3
2


CUMC3316_L1





YIGEGAYGMVCSAYDNVNKVRVAIKKISPFEH










QTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQ










MKDVYIVQDLMETDLYKLLKTQHLSNDHICYF










LYQILRGLKYIHSANVLHRDLKPSNLLLNTTCDL










KALSLCNYFESQNVDFRGKKVIELGAGTGIVGI










LAALQGGDVTITDLPLALEQIQGNVQANVPA










GGQAQVRALSWGIDHHVFPANYDLVLGADI










VYLEPTFPLLLGTLQHLCRPHGTIYLASKMRKE










HGTESFFQHLLPQHFQLELAQRDEDENVNIYR










ARHREPRPA




G17663.
1
876
417
796
8395
MAQLFLPLLAALVLAQAPAALADVLEGDSSED
13
11


TCGA-





RAFRVRIAGDAPLQGVLGGALTIPCHVHYLRP




19-





PPSRRAVLGSPRVKWTFLSRGREAEVLVARGV




2619-





RVKVNEAYRFRVALPAYPASLTDVSLALSELRP




01A-





NDSGIYRCEVQHGIDDSSDAVEVKVKGVVFLY




01R-





REGSARYAFSFSGAQEACARIGAHIATPEQLY




1850-





AAYLGGYEQCDAGWLSDQTVRYPIQTPREAC




01.2





YGDMDGFPGVRNYGVVDPDDLYDVYCYAED










LNGELFLGDPPEKLTLEEARAYCQERGAEIATT










GQLYAAWDGGLDHCSPGWLADGSVRYPIVT










PSQRCGGGLPGVKTLFLFPNQTGFPNKHSRF










NVYCFRDSAQPSAIPEASNPASNPASDGLEAI










VTVTETLEELQLPQEATESESRGAIYSIPIMEDG










GGGSSTPEDPAEAPRTLLEFETQSMVPPTGFS










EEEGKALEEEEKYEDEEEKEEEEEEEEVEDEAL










WAWPSELSSPGPEASLPTEPAAQEESLSQAPA










RAVLQPGASPLPDGESEASRPPRVHGPPTETL










PTPRERNLASPSPSTLVEAREVGEATGGPELSG










VPRGESEETGSSEGAPSLLPATRAPEGTRELEA










PSEDNSGRTAPAGTSVQAQPVLPTDSASRGG










VAVVPASGDCVPSPCHNGGTCLEEEEGVRCL










CLPGYGGDLCDVGLRFCNPGWDAFQGACYK










HFSTRRSWEEAETQCRMYGAHLASISTPEEQ










DFINNRYREYQWIGLNDRTIEGDFLWSDGVPL










LYENWNPGQPDSYFLSGENCVVMVWHDQG










QWSDVPCNYHLSYTCKMGLVSCGPPPELPLA










QVFGRPRLRYEVDTVLRYRCREGLAQRNLPLI










RCQENGRWEAPQISCVPRRPVSVAVGLAVFA










CLFLSTLLLVLNKCGRRNKFGINRPAVLAPEDG










LAMSLHFMTLGGSSLSPTEGKGSGLQGHIIEN










PQYFSDACVHHIKRRDIVLKWELGEGAFGKVF










LAECHNLLPEQDKMLVAVKALKEASESARQD










FQREAELLTMLQHQHIVRFFGVCTEGRPLLM










VFEYMRHGDLNRFLRSHGPDAKLLAGGEDVA










PGPLGLGQLLAVASQVAAGMVYLAGLHFVHR










DLATRNCLVGQGLVVKIGDFGMSRDIYSTDYY










RVGGRTMLPIRWMPPESILYRKFTTESDVWS










FGVVLWEIFTYGKQPWYQLSNTEAIDCITQGR










ELERPRACPPEVYAIMRGCWQREPQQRHSIK










DVHARLQALAQAPPVYLDVLG




G17203.
1
31
336
1210
8396
MDQVMQFVEPSRQFVKDSIRLVKRCTKPDRK
2
9


TCGA-





VCNGIGIGEFKDSLSINATNIKHFKNCTSISGDL




06-





HILPVAFRGDSFTHTPPLDPQELDILKTVKEITG




0211-





FLLIQAWPENRTDLHAFENLEIIRGRTKQHGQ




02A-





FSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLC




02R-





YANTINWKKLFGTSGQKTKIISNRGENSCKAT




2005-





GQVCHALCSPEGCWGPEPRDCVSCRNVSRG




01.2





RECVDKCNLLEGEPREFVENSECIQCHPECLP










QAMNITCTGRGPDNCIQCAHYIDGPHCVKTC










PAGVMGENNTLVWKYADAGHVCHLCHPNC










TYGCTGPGLEGCPTNGPKIPSIATGMVGALLLL










LVVALGIGLFMRRRHIVRKRTLRRLLQERELVE










PLTPSGEAPNQALLRILKETEFKKIKVLGSGAFG










TVYKGLWIPEGEKVKIPVAIKELREATSPKANK










EILDEAYVMASVDNPHVCRLLGICLTSIVQLIT










QLMPFGCLLDYVREHKDNIGSQYLLNWCVQI










AKGMNYLEDRRLVHRDLAARNVLVKTPQHV










KITDFGLAKLLGAEEKEYHAEGGKVPIKWMAL










ESILHRIYTHQSDVWSYGVTVWELMTFGSKPY










DGIPASEISSILEKGERLPQPPICTIDVYMIMVK










CWMIDADSRPKFRELIIEFSKMARDPQRYLVI










QGDERMHLPSPTDSNFYRALMDEEDMDDV










VDADEYLIPQQGFFSSPSTSRTPLLSSLSATSNN










STVACIDRNGLQSCPIKEDSFLQRYSSDPTGAL










TEDSIDDTFLPVPEYINQSVPKRPAGSVQNPVY










HNQPLNPAPSRDPHYQDPHSTAVGNPEYLNT










VQPTCVNSTFDSPAHWAQKGSHQISLDNPDY










QQDFFPKEAKPNGIFKGSTAENAEYLRVAPQS










SEFIGA




G17784.
1
150
98
245
8397
MRQSLLFLTSVVPFVLAPRPPDDPGFGPHQRL
4
2


TCGA-





EKLDSLLSDYDILSLSNIQQHSVRKRDLQTSTH




76-





VETLLTFSALKRHFKLYLTSSTERFSQNFKVVVV




4929-





DGKNESEYTVKWQDFFTGHVVGEPDSRVLA




01A-





HIRDDDVIIRINTDGAEYNIEELLDKYLIANATN




01R-





PESKVFYLKMKGDYFRYLAEVACGDDRKQTID




1850-





NSQGAYQEAFDISKKEMQPTHPIRLGLALNFS




01.4





VFYYEILNNPELACTLAKTAFDEAIAELDTLNED










SYKDSTLIMQLLRDNLTLWTSDSAGEECDAAE










GAEN




G17675.
1
228
559
584
8398
MEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSL
14
15


TCGA-





SINATNIKHFKNCTSISGDLHILPVAFRGDSFTH




19-





TPPLDPQELDILKTVKEITGFLLIQAWPENRTD




2624-





LHAFENLEIIRGRTKQQDQTTVSSVPTTLTAPT




01A-





ASRPARQESWEKTTPWSGSTQTPAMCATCAI




01R-





QTAPTDALGQVLKAVQRMGLRSRPSPLGW




1850-





WGPSSCCWWWPWGSASSCEGATSFGSARC




01.2





GGCCRRGRYRFDPRLMFSNRGSVRTRRFSKH










LL




G17796.
1
96
117
301
8399
MADPGPDPESESESVFPREVGLFADSYSEKSQ
2
3


TCGA-





FCFCGHVLTITQNFGSRLGVAARVWDAALSLC




41-





NYFESQNVDFRGKKVIELGAGTGIVGILAALQ




5651-





VELSPLFPDTLILGLEIRVKVSDYVQDRIRALRA




01A-





APAGGFQNIACLRSNAMKHLPNFFYKGQLTK




01R-





MFFLFPDPHFKRTKHKWRIISPTLLAEYAYVLR




1850-





VGGLVYTITDVLELHDWMCTHFEEHPLFERVP




01.4





LEDLSEDPVVGHLGTSTEEGKKVLRNGGKNFP










AIFRRIQDPVLQAVTSQTSLPGH




G17666.
1
34
90
300
8400
MVPAAGRRPPRVMRLLGWWQVLLWVLGLP
1
6


TCGA-





VRGVEGYNVSLLYDLENLPASKDSIVHQAGML




06-





KRNCFASVFEKYFQFQEEGKEGENRAVIHYRD




5415-





DETMYVESKKDRVTVVFSTVFKDDDDVVIGK




01A-





VFMQEFKEGRRASHTAPQVLFSHREPPLELKD




01R-





TDAAVGDNIGYITFVLFPRHTNASARDNTINLI




1849-





HTFRDYLHYHIKCSKAYIHTRMRAKTSDFLKVL




01.2





NRARPDAEKKEMKTITGKTFSSR




G17219.
1
8
528
926
8401
MDGFAGSLGTLHVGDEIREINGISVANQTVE
1
17


TCGA-





QLQKMLREMRGSITFKIVPSYRTQSSSCERDS




06-





PSTSRQSPANGHSSTNNSVSDLPSTTQPKGR




0158-





QIYVRAQFEYDPAKDDLIPCKEAGIRFRVGDII




01A-





QIISKDDHNWWQGKLENSKNGTAGLIPSPEL




01R-





QEWRVACIAMEKTKQEQQASCTWFGKKKKQ




1849-





YKDKYLAKHNAVFDQLDLVTYEEVVKLPAFKR




01.2





KTLVLLGAHGVGRRHIKNTLITKHPDRFAYPIP










HTTRPPKKDEENGKNYYFVSHDQMMQDISN










NEYLEYGSHEDAMYGTKLETIRKIHEQGLIAIL










DVEPQALKVLRTAEFAPFVVFIAAPTITPGLNE










DESLQRLQKESDILQRTYAHYFDLTIINNEIDETI










RHLEEAVELVCTAPQWVPVSWVY




G17790.
1
23
740
819
8402
MSGQLERCEREWHELEGEFQELQDMKNATL
1
18


TCGA-





SLNSNDSEPKYYPIAVLLKNQNQELPEDVNPA




06-





KKENYLSEQDFVSVFGITRGQFAALPGWKQL




5856-





QMKKEKGLF




01A-










01R-










1849-










01.4










NYU_B
1
93
4035
5204
8403
MGGPQDKDERTIALVRPWPWGHQALDPAY
3
82








GLDTMHPSRRSLPFPLNCQLARVGTADYGSP










SDQSDQQLDCALDLMRRLPPQQIEKNLSDLID










LDVPVEALTTVKPYCNEIHAQAQLWLKRDPK










ASYDAWKKCLPIRGIDGNGKAPSKSELRHLYLT










EKYVWRWKQFLSRRGKRTSPLDLKLGHNNW










LRQVLFTPATQAARQAACTIVEALATIPSRKQ










QVLDLLTSYLDELSIAGECAAEYLALYQKLITSA










HWKVYLAARGVLPYVGNLITKEIARLLALEEAT










LSTDLQQGYALKSLTGLLSSFVEVESIKRHFKSR










LVGTVLNGYLCLRKLVVQRTKLIDETQDMLLE










MLEDMTTGTESETKAFMAVCIETAKRYNLDD










YRTPVFIFERLCSIIYPEENEVTEFFVTLEKDPQQ










EDFLQGRMPGNPYSSNEPGIGPLMRDIKNKIC










QDCDLVALLEDDSGMELLVNNKIISLDLPVAE










VYKKVWCTTNEGEPMRIVYRMRGLLGDATEE










FIESLDSTTDEEEDEEEVYKMAGVMAQCGGL










ECMLNRLAGIRDFKQGRHLLTVLLKLFSYCVKV










KVNRQQLVKLEMNTLNVMLGTLNLALVAEQ










ESKDSGGAAVAEQVLSIMEIILDESNAEPLSED










KGNLLLTGDKDQLVMLLDQINSTFVRSNPSVL










QGLLRIIPYLSFGEVEKMQILVERFKPYCNFDKY










DEDHSGDDKVFLDCFCKIAAGIKNNSNGHQL










KDLILQKGITQNALDYMKKHIPSAKNLDADIW










KKFLSRPALPFILRLLRGLAIQHPGTQVLIGTDSI










PNLHKLEQVSSDEGIGTLAENLLEALREHPDV










NKKIDAARRETRAEKKRMAMAMRQKALGTL










GMTTNEKGQVVTKTALLKQMEELIEEPGLTCC










ICREGYKFQPTKVLGIYTFTKRVALEEMENKPR










KQQGYSTVSHFNIVHYDCHLAAVRLARGREE










WESAALQNANTKCNGLLPVWGPHVPESAFA










TCLARHNTYLQECTGQREPTYQLNIHDIKLLFL










RFAMEQSFSADTGGGGRESNIHLIPYIIHTVLY










VLNTTRATSREEKNLQGFLEQPKEKWVESAFE










VDGPYYFTVLALHILPPEQWRATRVEILRRLLV










TSQARAVAPGGATRLTDKAVKDYSAYRSSLLF










WALVDLIYNMFKKQTTPTVGGIDTGSLEPCVC










EKVPTSNTEGGWSCSLAEYIRHNDMPIYEAAD










KALKTFQEEFMPVETFSEFLDVAGLLSEITDPES










FLKDLLNSVP




G17657.
1
89
39
797
8404
MNGQLDLSGKLIIKAQLGEDIRRIPIHNEDITYD
3
2


TCGA-





ELVLMMQRVFRGKLLSNDEVTIKYKDEDGDLI




19-





TIFDSSDLSFAIQCSRILKLTLFGKSTSSSSTPTEF




1787-





CRNGGTWENGRCICTEEWKGLRCTIANFCEN




01B-





STYMGFTFARIPVGRYGPSLQTCGKDTPNAG




01R-





NPMAVRLCSLSLYGEIELQKVTIGNCNENLETL




1850-





EKQVKDVTAPLNNISSEVQILTSDANKLTAENI




01.2





TSATRVVGQIFNTSRNASPEAKKVAIVTVSQLL










DASEDAFQRVAATANDDALTTLIEQMETYSLS










LGNQSVVEPNIAIQSANFSSENAVGPSNVRFS










VQKGASSSLVSSSTFIHTNVDGLNPDAQTELQ










VLLNMTKNYTKTCGFVVYQNDKLFQSKTFTA










KSDFSQKIISSKTDENEQDQSASVDMVFSPKY










NQKEFQLYSYACVYWNLSAKDWDTYGCQKD










KGTDGFLRCRCNHTTNFAVLMTFKKDYQYPK










SLDILSNVGCALSVTGLALTVIFQIVTRKVRKTS










VTWVLVNLCISMLIFNLLFVFGIENSNKNLQTS










DGDINNIDFDNNDIPRTDTINIPNPMCTAIAAL










LHYFLLVTFTWNALSAAQLYYLLIRTMKPLPRH










FILFISLIGWGVPAIVVAITVGVIYSQNGNNPQ










WELDYRQEKICWLAIPEPNGVIKSPLLWSFIVP










VTIILISNVVMFITISIKVLWKNNQNLTSTKKVS










SMKKIVSTLSVAVVFGITWILAYLMLVNDDSIR










IVFSYIFCLFNTTQGLQIFILYTVRTKVFQSEASK










VLMLLSSIGRRKSLPSVTRPRLRVKMYNFLRSL










PTLHERFRLLETSPSTEEITLSESDNAKESI




G17643.
1
20
79
169
8405
MAPARLFALLLFFVGGVAESDYKGQKLAEQM
1
2


TCGA-





FQGIILFSAIVGFIYGYVAEQFGWTVYIVMAGF




12-





AFSCLLTLPPWPIYRRHPLKWLPVQESSTDDK




5295-





KPGERKIKRHAKNN




01A-










01R-










1849-










01.2










NYU_G
1
265
119
974
8406
MRSIRKRWTICTISLLLIFYKTKEIARTEEHQET
4
5








QLIGDGELSLSRSLVNSSDKIIRKAGSSIFQHNV










EGWKINSSLVLEIRKNILRFLDAERDVSVVKSSF










KPGDVIHYVLDRRRTLNISHDLHSLLPEVSPMK










NRRFKTCAVVGNSGILLDSECGKEIDSHNFVIR










CNLAPVVEFAADVGTKSDFITMNPSVVQRAF










GGFRNESDREKFVHRLSMLNDSVLWIPAFMV










KGGEKHVEWVNALILKNKLKVRTAYPSLRLIH










AVRGFCDEGTCTDKANILYAWARNAPPTRLP










KGVGFRVGGETGSKYFVLQVHYGDISAFRDN










NKDCSGVSLHLTRLPQPLIAGMYLMMSVDTV










IPAGEKVVNSDISCHYKNYPMHVFAYRVHTH










HLGKVVSGYRVRNGQWTLIGRQSPQLPQAFY










PVGHPVDVSFGDLLAARCVFTGEGRTEATHIG










GTSSDEMCNLYIMYYMEAKHAVSFMTCTQN










VAPDMFRTIPPEANIPIPVKSDMVMMHEHH










KETEYKDKIPLLQQPKREEEEVLDQGDFYSLLS










KLLGEREDVVHVHKYNPTEKAESESDLVAEIA










NVVQKKDLGRSDAREGAEHERGNAILVRDRI










HKFHRLVSTLRPPESRVFSLQQPPPGEGTWEP










EHTGDFHMEEALDWPGVYLLPGQVSGVALD










PKNNLVIFHRGDHVWDGNSFDSKFVYQQIGL










GPIEEDTILVIDPNNAAVLQSSGKNLFYLPHGL










SIDKDGNYWVTDVALHQVFKLDPNNKEGPVL










ILGRSMQPGSDQNHFCQPTDVAVDPGTGAIY










VSDGYCNSRIVQFSPSGKFITQWGEESSGSSPL










PGQFTVPHSLALVPLLGQLCVADRENGRIQCF










KTDTKEFVREIKHSSFGRNVFAISYIPGLLFAVN










GKPHFGDQEPVQGFVMNFSNGEIIDIFKPVRK










HFDMPHDIVASEDGTVYIGDAHTNTVWKFTL










TEKLEHRSVKKAGIEVQEIKEAEAVVETKMEN










KPTSSELQKMQEKQKLIKEPGSGVPVVLITTLL










VIPVVVLLAIAIFIRWKKSRAFGADSEHKLETSS










GRVLGRFRGKGSGGLNLGNFFASRKGYSRKG










FDRLSTEGSDQEKEDDGSESEEEYSAPLPALAP










SSS




G17494.
1
85
67
170
8407
MSVDMNSQGSDSNEEDYDPNCEEEEEEEED
1
2


TCGA-





DPGDIEDYYVGVASDVEQQGADAFDPEEYQF




14-





TCLTYKESEGALNEHMTSLASVLKDGDPDTPK




2554-





PVSFTVKETVCPRTTQQSPEDCDFKKDGLVKR




01A-





CMGTVTLNQARGSFDISCDKDNKRFALLGDF




01R-





FRKSKEKIGKEFKRIVQRIKDFLRNLVPRTES




1850-










01.2










G17196.
1
852
922
995
8408
MSRGAGALQRRTTTYLISLTLVKLESVPPPPPS
13
27


TCGA-





PSAAAAGAAGARGSETGDPGSPRGAEEPGKK




06-





RHERLFHRQDALWISTSSAGTGGAEPPALSPA




0178-





PASPARPVSPAPGRRLSLWAVPPGPPLSGGLS




01A-





PDPKPGGAPTSSRRPLLSSPSWGGPEPEGRA




01R-





GGGIPGSSSPHPGTGSRRLKVAPPPPAPKPCK




1849-





TVTTSGAKAGGGKGAGSRLSWPESEGKPRVK




01.2





GSKSSAGTGASVSAAATAAAAGGGGSTASTS










GGVGAGAGARGKLSPRKGKSKTLDNSDLHPG










PPAGSPPPLTLPPTPSPATAVTAASAQPPGPA










PPITLEPPAPGLKRGREGGRASTRDRKMLKFIS










GIFTKSTGGPPGSGPLPGPPSLSSGSGSRELLG










AELRASPKAVINSQEWTLSRSIPELRLGVLGDA










RSGKSSLIHRFLTGSYQVLEKTESEQYKKEMLV










DGQTHLVLIREEAGAPDAKFSGWADAVIFVFS










LEDENSFQAVSRLHGQLSSLRGEGRGGLALAL










VGTQDRISASSPRVVGDARARALCADMKRCS










YYETCATYGLNVDRVFQEVAQKVVTLRKQQQ










LLAACKSLPSSPSHSAASTPVAGQASNGGHTS










DYSSSLPSSPNVGHRELRAEAAAVAGLSTPGSL










HRAAKRRTSLFANRRGSDSEKRSLDSRGETTG










SGRAIPIKQSFLLKRSGNSLNKEWKKKYVTLSS










NGFLLYHPSINDYIHSTHGKEMDLLRTTVKVP










GKRPPRAISAFGPSASINGLVKDMSTVQMGE










GLEATTPMPSPSPSPSSLQPPPDQTSKHLLKP










DRNLARALSTDCTPSGDLSPLSREPPPSPMVK










KQRRKKLTTPSKTEGSAGQAEDEGYSFSSVLYY










GNEATLLIFDLLFFCVVDLACQNFILASFLTYLQ










QEIFRYIRNTVGQKNLASKTLVDQRFLI




G17782.
1
19
140
216
8409
MTHFNKGPSYGLSAEVKNKVLCNGPGTCVPI
1
4


TCGA-





CVSALLLGILGIKKVIIVYVESICRVETLSMSGKIL




26-





FHLSDYFIVQWPALKEKYPKSVYLGRIV




5136-










01B-










01R-










1850-










01.4










GBM-
1
18
132
179
8410
MRRQPAKVAALLLGLLLEHIEGDDLHIQRNVQ
1
4


CUMC3296_L1





KLKDTVKKLGESGEIKAIGELDLLFMSLRNACI




G17199.
1
29
96
406
8411
MRPSGTAGAALLALLAALCPASRALEEKKADI
1
5


TCGA-





AQRYRISKYPTLKLFRNGMMMKREYRGQRSV




06-





KALADYIRQQKSDPIQEIRDLAEITTLDRSKRNII




0744-





GYFEQKDSDNYRVFERVANILHDDCAFLSAFG




01A-





DVSKPERYSGDNIIYKPPGHSAPDMVYLGAM




01R-





TNFDVTYNWIQDKCVPLVREITFENGEELTEE




1849-





GLPFLILFHMKEDTESLEIFQNEVARQLISEKGT




01.2





INFLHADCDKFRHPLLHIQKTPADCPVIAIDSFR










HMYVFGDFKDVLIPGKLKQFVFDLHSGKLHRE










FHHGPDPTDTAPGEQAQDVASSPPESSFQKL










APSEYRYTLLRDRDEL




G17792.
1
320
162
453
8412
MTDGKLSTSTNGVAFMGILDGRPGNPLQNL
4
3


TCGA-





QHVNLKAPRLLSAPEYGPKLKLRALEDRHSLQ




28-





SVDSGIPTLEIGNPEPVPCSAVHVRRKQSDSDL




5204-





IPERAFQSACALPSCAPPAPSSTEREQSVRKSS




01A-





TFPRTGYDSVKLYSPTSKALTRSDDVSVCSVSS




01R-





LGTELSTTLSVSNEDILDLVVTSSSSAIVTLEND




1850-





DDPQFTNVTLSSIKETRGLHQQDCVHEAEEGS




01.4





KLKILGPFSNFFARNLLARKQSARLDKHNDLG










WKLFGKAPLRENAQKDSKRIQKEYEDKAGRP










SKPPSPKQNVRKNLDFEPLSTTALILEDRPAHP










LFGRNVPLSSGSGFIMSEAGLIITNAHVVSSNS










AAPGRQQLKVQLQNGDSYEATIKDIDKKSDIA










TIKIHPKKKLPVLLLGHSADLRPGEFVVAIGSPF










ALQNTVTTGIVSTAQREGRELGLRDSDMDYIQ










TDAIINYGNSGGPLVNLDGEVIGINTLKVTAGI










SFAIPSDRITRFLTEFQDKQIKDWKKRFIGIRM










RTITPSLVDELKASNPDFPEVSSGIYVQEVAPN










SPSQRGGIQDGDIIVKVNGRPLVDSSELQEAV










LTESPLLLEVRRGNDDLLFSIAPEVVM




G17476.
1
64
1974
3685
8413
MRLLPRLLLLLLLVFPATVLFRGGPRGLLAVAQ
2
42


TCGA-





DLTEDEETVEDSIIEDEDDEAEVEEDEPTDLHT




06-





VREETMMVMTEDMPLEISYVPSTYLTEITHVS




2569-





QALLEVEQLLNAPDLCAKDFEDLFKQEESLKNI




01A-





KDSLQQSSGRIDIIHSKKTAALQSATPVERVKL




01R-





QEALSQLDFQWEKVNKMYKDRQGRFDRSVE




1849-





KWRRFHYDIKIFNQWLTEAEQFLRKTQIPEN




01.2





WEHAKYKWYLKELQDGIGQRQTVVRTLNAT










GEEIIQQSSKTDASILQEKLGSLNLRWQEVCKQ










LSDRKKRLEEQKNILSEFQRDLNEFVLWLEEAD










NIASIPLEPGKEQQLKEKLEQVKLLVEELPLRQ










GILKQLNETGGPVLVSAPISPEEQDKLENKLKQ










TNLQWIKVSRALPEKQGEIEAQIKDLGQLEKKL










EDLEEQLNHLLLWLSPIRNQLEIYNQPNQEGP










FDVKETEIAVQAKQPDVEEILSKGQHLYKEKPA










TQPVKRKLEDLSSEWKAVNRLLQELRAKQPDL










APGLTTIGASPTQTVTLVTQPVVTKETAISKLE










MPSSLMLEVPALADFNRAWTELTDWLSLLDQ










VIKSQRVMVGDLEDINEMIIKQKATMQDLEQ










RRPQLEELITAAQNLKNKTSNQEARTIITDRIER










IQNQWDEVQEHLQNRRQQLNEMLKDSTQW










LEAKEEAEQVLGQARAKLESWKEGPYTVDAI










QKKITETKQLAKDLRQWQTNVDVANDLALKL










LRDYSADDTRKVHMITENINASWRSIHKRVSE










REAALEETHRLLQQFPLDLEKFLAWLTEAETTA










NVLQDATRKERLLEDSKGVKELMKQWQDLQ










GEIEAHTDVYHNLDENSQKILRSLEGSDDAVLL










QRRLDNMNFKWSELRKKSLNIRSHLEASSDQ










WKRLHLSLQELLVWLQLKDDELSRQAPIGGDF










PAVQKQNDVHRAFKRELKTKEPVIMSTLETVR










IFLTEQPLEGLEKLYQEPRELPPEERAQNVTRLL










RKQAEEVNTEWEKLNLHSADWQRKIDETLER










LRELQEATDELDLKLRQAEVIKGSWQPVGDLLI










DSLQDHLEKVKALRGEIAPLKENVSHVNDLAR










QLTTLGIQLSPYNLSTLEDLNTRWKLLQVAVE










DRVRQLHEAHRDFGPASQHFLSTSVQGPWE










RAISPNKVPYYINHETQTTCWDHPKMTELYQS










LADLNNVRFSAYRTAMKLRRLQKALCLDLLSLS










AACDALDQHNLKQNDQPMDILQIINCLTTIYD










RLEQEHNNLVNVPLCVDMCLNWLLNVYDTG










RTGRIRVLSFKTGIISLCKAHLEDKYRYLFKQVA










SSTGFCDQRRLGLLLHDSIQIPRQLGEVASFGG










SNIEPSVRSCFQFANNKPEIEAALFLDWMRLE










PQSMVWLPVLHRVAAAETAKHQAKCNICKE










CPIIGFRYRSLKHFNYDICQSCFFSGRVAKGHK










MHYPMVEYCTPTTSGEDVRDFAKVLKNKFRT










KRYFAKHPRMGYLPVQTVLEGDNMETPVTLI










NFWPVDSAPASSPQLSHDDTHSRIEHYASRLA










EMENSNGSYLNDSISPNESIDDEHLLIQHYCQS










LNQDSPLSQPRSPAQILISLESEERGELERILADL










EEENRNLQAEYDRLKQQHEHKGLSPLPSPPE










MMPTSPQSPRDAELIAEAKLLRQHKGRLEAR










MQILEDHNKQLESQLHRLRQLLEQPQAEAKV










NGTTVSSPSTSLQRSDSSQPMLLRVVGSQTSD










SMGEEDLLSPPQDTSTGLEEVMEQLNNSFPSS










RGRNTPGKPMREDTM




G17195.
1
261
193
667
8414
MPLLGQTVRSASARTRRWSRRAAGDRPGAP
6
6


TCGA-





SEARRPQLRGDHGIVDRVRGHWRIAGSCSTC




06-





WCPSALCSSTNGFMCTFPNMSLTLVHFVVT




0138-





WLGLYICQKLDIFAPKSLPPSRLLLLALSFCGFV




01A-





VFTNLSLQNNTIGTYQLAKAMTTPVIIAIQTFC




02R-





YQKTFSTRIQLTLIPITLGVILNSYYDVKFNFLG




1849-





MVFAALGVLVTSLYQVWVGAKQHELQVNS




01.2





MQLLYYQAPMSSAMLLVAVPFFEPVFGEGGI










FGPWSVSALFLCDEGAGISGDYIDRVDEPLSCS










YVLTIRTPRLCPHPLLRPPPSAAPQAILCHPSLQ










PEEYMAYVQRQADSKQYGDKIIEELQDLGPQ










VWSETKSGVAPQKMAGASPTKDDSKDSDFW










KMLNEPEDQAPGGEEVPAEEQDPSPEAADSA










SGAPNDFQNNVQVKVIRSPADLIRFIEELKGG










TKKGKPNIGQEQPVDDAAEVPQREPEKERGD










PERQREMEEEEDEDEDEDEDEDERQLLGEFE










KELEGILLPSDRDRLRSEVKAGMERELENIIQET










EKELDPDGLKKESERDRAMLALTSTLNKLIKRL










EEKQSPELVKKHKKKRVVPKKPPPSPQPTEED










PEHRVRVRVTKLRLGGPNQDLTVLEMKRENP










QLKQIEGLVKELLEREGLTAAGKIEIKIVRPWAE










GTEEGARWLTDEDTRNLKEIFFNILVPGAEEA










QKERQRQKELESNYRRVWGSPGGEGTGDLD










EFDF




G17212.
1
810
134
341
8415
MVPEAWRSGLVSTGRVVGVLLLLGALNKAST
1
4


TCGA-





VIHYEIPEEREKGFAVGNVVANLGLDLGSLSAR




06-





RFRVVSGASRRFFEVNRETGEMFVNDRLDRE




0129-





ELCGTLPSCTVTLELVVENPLELFSVEVVIQDIN




01A-





DNNPAFPTQEMKLEISEAVAPGTRFPLESAHD




01R-





PDVGSNSLQTYELSRNEYFALRVQTREDSTKY




1849-





AELVLERALDREREPSLQLVLTALDGGTPALSA




01.2





SLPIHIKVLDANDNAPVFNQSLYRARVLEDAPS










GTRVVQVLATDLDEGPNGEIIYSFGSHNRAGV










RQLFALDLVTGMLTIKGRLDFEDTKLHEIYIQA










KDKGANPEGAHCKVLVEVVDVNDNAPEITVT










SVYSPVPEDAPLGTVIALLSVTDLDAGENGLVT










CEVPPGLPFSLTSSLKNYFTLKTSADLDRETVPE










YNLSITARDAGTPSLSALTIVRVQVSDINDNPP










QSSQSSYDVYIEENNLPGAPILNLSVWDPDAP










QNARLSFFLLEQGAETGLVGRYFTINRDNGIVS










SLVPLDYEDRREFELTAHISDGGTPVLATNISV










NIFVTDRNDNAPQVLYPRPGGSSVEMLPRGT










SAGHLVSRVVGWDADAGHNAWLSYSLLGSP










NQSLFAIGLHTGQISTARPVQDTDSPRQTLTV










LIKDNGEPSLSTTATLTVSVTEDSPEARAEFPSG










SAPREQKKNLTFYLLLSLILVSVGFVVTVEGVIIF










KVYKWKQSRDLYRAPVSSLYRTPGPSLHADAV










RGGLMSPHLYHQVYLTTDSRRSDPLLKKPGAA










SPLASRQNTLRSCDPVFYRQVLGAESAPPGQE










EQNRGKPNWEHLNEDLHVLITVEDAQNRAEI










KLKRAVEEVKKLLVPAAEGEDSLKKMQLMELA










ILNGTYRDANIKSPALAFSLAATAQAAPRIITGP










APVLPPAALRTPTPAGPTIMPLIRQIQTAVMP










NGTPHPTAAIVPPGPEAGLIYTPYEYPYTLAPA










TSILEYPIEPSGVLGAVATKVRRHDMRVHPYQ










RIVTADRAATGN




G17213.
1
12
131
278
8416
MADIEQYYMKPPEIVKEAEVPQAALGVPAQG
2
2


TCGA-





TGDNGHTPVEEEVGGIPVPAPGLLQVTERRQ




06-





PLSSVSSLEVHFDLLDLTELTDMSDQELAEVFA




0157-





DSDDENLNTESPAGLHPLPRAGYLRSPSWTRT




01A-





RAEQSHEKQPLGDPERQAIVLDTFLTVERPQE




01R-





D




1849-










01.2










G17200.
1
1066
710
1072
8417
MAAQVAPAAASSLGNPPPPPPSELKKAEQQQ
11
12


TCGA-





REEAGGEAAAAAAAERGEMKAAAGQESEGP




06-





AVGPPQPLGKELQDGAESNGGGGGGGAGSG




0125-





GGPGAEPDLKNSNGNAGPRPALNNNLTEPP




01A-





GGGGGGSSDGVGAPPHSAAAALPPPAYGFG




01R-





QPYGRSPSAVAAAAAAVFHQQHGGQQSPGL




1849-





AALQSGGGGGLEPYAGPQQNSHDHGFPNH




01.2





QYNSYYPNRSAYPPPAPAYALSSPRGGTPGSG










AAAAAGSKPPPSSSASASSSSSSFAQQRFGAM










GGGGPSAAGGGTPQPTATPTLNQLLTSPSSA










RGYQGYPGGDYSGGPQDGGAGKGPADMAS










QCWGAAAAAAAAAAASGGAQQRSHHAPM










SPGSSGGGGQPLARTPQPSSPMDQMGKMR










PQPYGGTNPYSQQQGPPSGPQQGHGYPGQ










PYGSQTPQRYPMTMQGRAQSAMGGLSYTQ










QIPPYGQQGPSGYGQQGQTPYYNQQSPHPQ










QQQPPYSQQPPSQTPHAQPSYQQQPQSQPP










QLQSSQPPYSQQPSQPPHQQSPAPYPSQQST










TQQHPQSQPPYSQPQAQSPYQQQQPQQPA










PSTLSQQAAYPQPQSQQSQQTAYSQQRFPPP










QELSQDSFGSQASSAPSMTSSKGGQEDMNLS










LQSRPSSLPDLSGSIDDLPMGTEGALSPGVSTS










GISSSQGEQSNPAQSPFSPHTSPHLPGIRGPSP










SPVGSPASVAQSRSGPLSPAAVPGNQMPPRP










PSGQSDSIMHPSMNQSSIAQDRGYMQRNPQ










MPQYSSPQPGSALSPRQPSGGQIHTGMGSY










QQNSMGSYGPQGGQYGPQGGYPRQPNYN










ALPNANYPSAGMAGGINPMGAGGQMHGQ










PGIPPYGTLPPGRMSHASMGNRPYGPNMAN










MPPQVGSGMCPPPGGMNRKTQETAVAMH










VAANSIQNRPPGYPNMNQGGMMGTGPPY










GQGINSMAGMINPQGPPYSMGGTMANNSA










GMAASPEMMGLGDVKLTPATKMNNKADGT










PKTESKSKKSSSSTTTNEKITKLYELGGEPERKM










WVDRYLAFTEEKAMGMTNLPAVGRKPLDLY










RLYVSVKEIGGLTQMQALTSCECTICPDCFRQ










HFTIALKEKHITDMVCPACGRPDLTDDTQLLS










YFSTLDIQLRESLEPDAYALFHKKLTEGVLMRD










PKFLWCAQCSFGFIYEREQLEATCPQCHQTFC










VRCKRQWEEQHRGRSCEDFQNWKRMNDPE










YQAQGLAMYLQENGIDCPKCKFSYALARGGC










MHFHCTQCRHQFCSGCYNAFYAKNKCPEPN










CRVKKSLHGHHPRDCLFYLRDWTALRLQKLLQ










DNNVMFNTEPPAGARAVPGGGCRVIEQKEV










PNGLRDEACGKETPAGYAGLCQAHYKEYLVSL










INAHSLDPATLYEVEELETATERYLHVRPQPLA










GEDPPAYQARLLQKLTEEVPLGQSIPRRRK




G17804.
1
982
190
225
8418
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
6


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




06-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




5408-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1849-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.4





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTPQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HIPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSIVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQDAFIGFGGNVI










RQQVKDNAKWYITDFVELLGELEE




G17656.
1
379
298
421
8419
MAAQVAPAAASSLGNPPPPPPSELKKAEQQQ
1
5


TCGA-





REEAGGEAAAAAAAERGEMKAAAGQESEGP




28-





AVGPPQPLGKELQDGAESNGGGGGGGAGSG




2514-





GGPGAEPDLKNSNGNAGPRPALNNNLTEPP




01A-





GGGGGGSSDGVGAPPHSAAAALPPPAYGFG




02R-





QPYGRSPSAVAAAAAAVFHQQHGGQQSPGL




1850-





AALQSGGGGGLEPYAGPQQNSHDHGFPNH




01.2





QYNSYYPNRSAYPPPAPAYALSSPRGGTPGSG










AAAAAGSKPPPSSSASASSSSSSFAQQRFGAM










GGGGPSAAGGGTPQPTATPTLNQLLTSPSSA










RGYQGYPGGDYSGGPQDGGAGKGPADMAS










QCWGAAAAAAAAAAASGGAQQRSHHAPM










SPGSSGGGGQPLARTPQVHLGSGIWVDEEK










WHQLQVTQGDSKYTKNLAVMIWGTDVLKN










RSVTGVATKKKKDAVPKPPLSPHKLSIVRECLY










DRIAQETVDETEIAQRLSKVNKYICEKIMDINK










SCKNEERREAKYNLQ




G17654.
1
329
102
545
8420
MLLSLPALHLQTSEHHPFFQLPHRRLGPWCSP
7
6


TCGA-





TGSPAPLSCETGCGEGSWILVCRLLVPTQVSLL




41-





SMEEDIDTRKINNSFLRDHSYATEADIISTVEFN




4097-





HTGELLATGDKGGRVVIFQREQESKNQVHRR




01A-





GEYNVYSTFQSHEPEFDYLKSLEIEEKINKIRWL




01R-





PQQNAAYFLLSTNDKTVKLWKVSERDKRPEG




1850-





YNLKDEEGRLRDPATITTLRVPVLRPMDLMVE




01.2





ATPRRVFANAHTYHINSISVNSDYETYMSADD










LRINLWNFEITNQSFNIVDIKPANMEELTEVIT










AAEFHPHHCNTFVYSSSKGTIRLCDMRASALC










DRHTKSGEMLSVAEHFLEQQMHPTVVISAYR










KALDDMISTLKKISIPVDISDSDMMLNIINSSIT










TKAISRWSSLACNIALDAVKMVQFEENGRKEI










DIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH










PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITR










EEDFTRILQMEEEYIQQLCEDIIQLKPDVVITEK










GISDLAQHYLMRANITAIRRVRKTDNNRIARA










CGARIVSRPEELREDDVGTGAGLLEIKKIGDEY










FTFITDCKDPKACTILLRGASKEILSEVERNLQD










AMQVCRNVLLDPQLVPGGGASEMAVAHALT










EKSKAMTGVEQWPYRAVAQALEVIPRTLIQN










CGASTIRLLTSLRAKHTQENCETWGVNGETGT










LVDMKELGIWEPLAVKLQTYKTAVETAVLLLRI










DDIVSGHKKKGDDQSRQGGAPDAGQE




G17206.
1
1066
710
1072
8421
MAAQVAPAAASSLGNPPPPPPSELKKAEQQQ
11
12


TCGA-





REEAGGEAAAAAAAERGEMKAAAGQESEGP




06-





AVGPPQPLGKELQDGAESNGGGGGGGAGSG




0125-





GGPGAEPDLKNSNGNAGPRPALNNNLTEPP




02A-





GGGGGGSSDGVGAPPHSAAAALPPPAYGFG




11R-





QPYGRSPSAVAAAAAAVFHQQHGGQQSPGL




2005-





AALQSGGGGGLEPYAGPQQNSHDHGFPNH




01.2





QYNSYYPNRSAYPPPAPAYALSSPRGGTPGSG










AAAAAGSKPPPSSSASASSSSSSFAQQRFGAM










GGGGPSAAGGGTPQPTATPTLNQLLTSPSSA










RGYQGYPGGDYSGGPQDGGAGKGPADMAS










QCWGAAAAAAAAAAASGGAQQRSHHAPM










SPGSSGGGGQPLARTPQPSSPMDQMGKMR










PQPYGGTNPYSQQQGPPSGPQQGHGYPGQ










PYGSQTPQRYPMTMQGRAQSAMGGLSYTQ










QIPPYGQQGPSGYGQQGQTPYYNQQSPHPQ










QQQPPYSQQPPSQTPHAQPSYQQQPQSQPP










QLQSSQPPYSQQPSQPPHQQSPAPYPSQQST










TQQHPQSQPPYSQPQAQSPYQQQQPQQPA










PSTLSQQAAYPQPQSQQSQQTAYSQQRFPPP










QELSQDSFGSQASSAPSMTSSKGGQEDMNLS










LQSRPSSLPDLSGSIDDLPMGTEGALSPGVSTS










GISSSQGEQSNPAQSPFSPHTSPHLPGIRGPSP










SPVGSPASVAQSRSGPLSPAAVPGNQMPPRP










PSGQSDSIMHPSMNQSSIAQDRGYMQRNPQ










MPQYSSPQPGSALSPRQPSGGQIHTGMGSY










QQNSMGSYGPQGGQYGPQGGYPRQPNYN










ALPNANYPSAGMAGGINPMGAGGQMHGQ










PGIPPYGTLPPGRMSHASMGNRPYGPNMAN










MPPQVGSGMCPPPGGMNRKTQETAVAMH










VAANSIQNRPPGYPNMNQGGMMGTGPPY










GQGINSMAGMINPQGPPYSMGGTMANNSA










GMAASPEMMGLGDVKLTPATKMNNKADGT










PKTESKSKKSSSSTTTNEKITKLYELGGEPERKM










WVDRYLAFTEEKAMGMTNLPAVGRKPLDLY










RLYVSVKEIGGLTQMQALTSCECTICPDCFRQ










HFTIALKEKHITDMVCPACGRPDLTDDTQLLS










YFSTLDIQLRESLEPDAYALFHKKLTEGVLMRD










PKFLWCAQCSFGFIYEREQLEATCPQCHQTFC










VRCKRQWEEQHRGRSCEDFQNWKRMNDPE










YQAQGLAMYLQENGIDCPKCKFSYALARGGC










MHFHCTQCRHQFCSGCYNAFYAKNKCPEPN










CRVKKSLHGHHPRDCLFYLRDWTALRLQKLLQ










DNNVMFNTEPPAGARAVPGGGCRVIEQKEV










PNGLRDEACGKETPAGYAGLCQAHYKEYLVSL










INAHSLDPATLYEVEELETATERYLHVRPQPLA










GEDPPAYQARLLQKLTEEVPLGQSIPRRRK




G17792.
1
161
121
238
8422
MAELDPFGAPAGAPGGPALGNGVAGAGEED
4
4


TCGA-





PAAAFLAQQESEIAGIENDEAFAILDGGAPGP




28-





QPHGEPPGGPDAVDGVMNGEYYQESNGPT




5204-





DSYAAISQVDRLQSEPESIRKWREEQMERLEA




01A-





LDANSRKQEAEWKEKAIKELEEWYARQDEQL




01R-





QKTKANNRTILLSVISLLNEPNTFSPANVDASV




1850-





MFRKWRDSKGKDKEYAEIIRKQVSATKAEAEK




01.4





DGVKVPTTLAEYCIKTKVPSNDNSSDLLYDDLY










DDDIDDEDEEEEDADCYDDDDSGNEES




G17663.
1
5
322
502
8423
MSRPVRFMERNPLTNAIIRTTTALTIFKAGVKF
1
9


TCGA-





NVIPPVAQATVNFRIHPGQTVQEVLELTKNIV




19-





ADNRVQFHVLSAFDPLPVSPSDDKALGYQLLR




2619-





QTVQSVFPEVNITAPVTSIGNTDSRFFTNLTTG




01A-





IYRFYPIYIQPEDFKRIHGVNEKISVQAYETQVK




01R-





FIFELIQNADTDQEPVSHLHKL




1850-










01.2










NYU_E
1
510
7
437
8424
MRVAAATAAAGAGPAMAVWTRATKAGLVE
6
2








LLLRERWVRVVAELSGESLSLTGDAAAAELEP










ALGPAAAAFNGLPNGGGAGDSLPGSPSRGLG










PPSPPAPPRGPAGEAGASPPVRRVRVVKQEA










GGLGISIKGGRENRMPILISKIFPGLAADQSRAL










RLGDAILSVNGTDLRQATHDQAVQALKRAGK










EVLLEVKFIREVTPYIKKPSLVSDLPWEGAAPQ










SPSFSGSEDSGSPKHQNSTKDRKIIPLKMCFAA










RNLSMPDLENRLIELHSPDSRNTLILRCKDTAT










AHSWFVAIHTNIMALLPQVLAELNAMLGATS










TAGGSKEVKHIAWLAEQAKLDGGRQQWRPV










LMAVTEKDLLLYDCMPWTRDAWASPCHSYP










LVATRLVHSGSGCRSPSLGSDLTFATRTGSRQ










GIEMHLFRVETHRDLSSWTRILVQGCHAAAEL










IKEVSLGCMLNGQEVRLTIHYENGFTISRENG










GSSSILYRYPFERLKMSADDGIRNLYLDFGGPE










GELKAIDLVTKATEEDKAKNYEEALRLYQHAVE










YFLHAIKYEAHSDKAKESIRAKCVQYLDRAEKL










KDYLRSKEKHGKKPVKENQSEGKGSDSDSEG










DNPEKKKLQEQLMGAVVMEKPNIRWNDVA










GLEGAKEALKEAVILPIKFPHLFTGKRTPWRGIL










LFGPPGTGKSYLAKAVATEANNSTFFSVSSSDL










MSKWLGESEKLVKNLFELARQHKPSIIFIDEVD










SLCGSRNENESEAARRIKTEFLVQMQGVGNN










NDGTLVLGATNIPWVLDSAIRRRFEKRIYIPLPE










EAARAQMFRLHLGSTPHNLTDANIHELARKTE










GYSGADISIIVRDSLMQPVRKVQSATHFKKVC










GPSRTNPSMMIDDLLTPCSPGDPGAMEMT










WMDVPGDKLLEPVVCMSDMLRSLATTRPTV










NADDLLKVKKFSEDFGQES




NYU_G
1
7
95
483
8425
MKTPADTGDSGKVTTVVATLGQGPERSQEV
1
2








AYTDIKVIGNGSFGVVYQARLAETRELVAIKKV










LQDKRFKNRELQIMRKLDHCNIVRLRYFFYSSG










EKKDELYLNLVLEYVPETVYRVARHFTKAKLTIP










ILYVKVYMYQLFRSLAYIHSQGVCHRDIKPQNL










LVDPDTAVLKLCDFGSAKQLVRGEPNVSYICS










RYYRAPELIFGATDYTSSIDVWSAGCVLAELLL










GQPIFPGDSGVDQLVEIIKVLGTPTREQIREM










NPNYTEFKFPQIKAHPWTKVEKSRTPPEAIALC










SSLLEYTPSSRLSPLEACAHSFFDELRCLGTQLP










NNRPLPPLFNFSAGELSIQPSLNAILIPPHLRSP










AGTTTLTPSSQALTETPTSSDWQSTDATPTLT










NSS




NYU_B
1
602
13
455
8426
MGMARGSLTRVPGVMGEGTQGPELSLDPDP
11
2








CSPQSTPGLMKGNKLEEQDPRPLQPIPGLME










GNKLEEQDSSPPQSTPGLMKGNKREEQGLGP










EPAAPQQPTAEEEALIEFHRSYRELFEFFCNNT










TIHGAIRLVCSQHNRMKTAFWAVLWLCTFG










MMYWQFGLLFGEYFSYPVSLNINLNSDKLVFP










AVTICTLNPYRYPEIKEELEELDRITEQTLFDLYK










YSSFTTLVAGSRSRRDLRGTLPHPLQRLRVPPP










PHGARRARSVASSLRDNNPQVDWKDWKIGF










QLCNQNKSDCFYQTYSSGVDAVREWYRFHYI










NILSRLPETLPSLEEDTLGNFIFACRFNQVSCN










QANYSHFHHPMYGNCYTFNDKNNSNLWMS










SMPGINNGLSLMLRAEQNDFIPLLSTVTGARV










MVHGQDEPAFMDDGGFNLRPGVETSISMRK










ETLDRLGGDYGDCTKNGSDVPVENLYPSKYT










QQVCIHSCFQESMIKECGCAYIFYPRPQNVEY










CDYRKHSSWGYCYYKLQVDFSSDHLGCFTKCR










KPCSVTSYQLSAGYSRWPSVTSQEWVFQMLS










RQNNYTVNNKRNGVAKVNIFFKELNYKTNSE










SPSVTVLLELLVGIYPSGVIGLVPHLGDREKRDS










VCPQGKYIHPQNNSICCTKCHKGTYLYNDCPG










PGQDTDCRECESGSFTASENHLRHCLSCSKCR










KEMGQVEISSCTVDRDTVCGCRKNQYRHYW










SENLFQCFNCSLCLNGTVHLSCQEKQNTVCTC










HAGFFLRENECVSCSNCKKSLECTKLCLPQIEN










VKGTEDSGTTVLLPLVIFFGLCLLSLLFIGLMYR










YQRWKSKLYSIVCGKSTPEKEGELEGTTTKPLA










PNPSFSPTPGFTPTLGFSPVPSSTFTSSSTYTPG










DCPNFAAPRREVAPPYQGADPILATALASDPI










PNPLQKWEDSAHKPQSLDTDDPATLYAVVEN










VPPLRWKEFVRRLGLSDHEIDRLELQNGRCLR










EAQYSMLATWRRRTPRREATLELLGRVLRDM










DLLGCLEDIEEALCGPAALPPAPSLLR




G17675.
1
306
137
514
8427
MVRSRQMCNTNMSVPTDGAVTTSQIPASEQ
10
3


TCGA-





ETLVRPKPLLLKLLKSVGAQKDTYTMKEVLFYL




19-





GQYIMTKRLYDEKQQHIVYCSNDLLGDLFGVP




2624-





SFSVKEHRKIYTMIYRNLVVVNQQESSDSGTS




01A-





VSENRCHLEGGSDQKDLVQELQEEKPSSSHLV




01R-





SRPSTSSRRRAISETEENSDELSGERQRKRHKS




1850-





DSISLSFDESLALCVIREICCERSSSSESTGTPSN




01.2





PDLDAGVSEHSGDWLDQDSVSDQFSVEFEVE










SLDSEDYSLSEEGQELSDEDDEVYQVTVYQAG










ESDTDSFEEDPEISLAESEGVSCHYWSLFDGH










AGSGAAVVASRLLQHHITEQLQDIVDILKNSA










VLPPTCLGEEPENTPANSRTLTRAASLRGGVG










APGSPSTPPTRFFTEKKIPHECLVIGALESAFKE










MDLQIERERSSYNISGGCTALIVICLLGKLYVAN










AGDSRAIIIRNGEIIPMSSEFTPETERQRLQYLA










FMQPHLLGNEFTHLEFPRRVQRKELGKKMLY










RDFNMTGWAYKTIEDEDLKFPLIYGEGKKARV










MATIGVTRGLGDHDLKVHDSNIYIKPFLSSAPE










VRIYDLSKYDHGSDDVLILATDGLWDVLSNEE










VAEAITQFLPNCDPDDPHRYTLAAQDLVMRA










RGVLKDRGWRISNDRLGSGDDISVYVIPLIHG










NKLS




G17484.
1
2002
24
156
8428
MDPRNTAMLGLGSDSEGFSRKSPSAISTGTLV
6
4


TCGA-





SKREVELEKNTKEEEDLRKRNRERNIEAGKDD




14-





GLTDAQQQFSVKETNFSEGNLKLKIGLQAKRT




0787-





KKPPKNLENYVCRPAIKTTIKHPRKALKSGKMT




01A-





DEKNEHCPSKRDPSKLYKKADDVAAIECQSEE




01R-





VIRLHSQGENNPLSKKLSPVHSEMADYINATP




1849-





STLLGSRDPDLKDRALLNGGTSVTEKLAQLIAT




01.2





CPPSKSSKTKPKKLGTGTTAGLVSKDLIRKAGV










GSVAGIIHKDLIKKPTISTAVGLVTKDPGKKPVF










NAAVGLVNKDSVKKLGTGTTAVFINKNLGKKP










GTITTVGLLSKDSGKKLGIGIVPGLVHKESGKKL










GLGTVVGLVNKDLGKKLGSTVGLVAKDCAKKI










VASSAMGLVNKDIGKKLMSCPLAGLISKDAIN










LKAEALLPTQEPLKASCSTNINNQESQELSESL










KDSATSKTFEKNVVRQNKESILEKFSVRKEIINL










EKEMFNEGTCIQQDSFSSSEKGSYETSKHEKQ










PPVYCTSPDFKMGGASDVSTAKSPFSAVGES










NLPSPSPTVSVNPLTRSPPETSSQLAPNPLLLSS










TTELIEEISESVGKNQFTSESTHLNVGHRSVGH










SISIECKGIDKEVNDSKTTHIDIPRISSSLGKKPSL










TSESSIHTITPSVVNFTSLFSNKPFLKLGAVSAS










DKHCQVAESLSTSLQSKPLKKRKGRKPRWTKV










VARSTCRSPKGLELERSELFKNVSCSSLSNSNSE










PAKFMKNIGPPSFVDHDFLKRRLPKLSKSTAPS










LALLADSEKPSHKSFATHKLSSSMCVSSDLLSDI










YKPKRGRPKSKEMPQLEGPPKRTLKIPASKVFS










LQSKEEQEPPILQPEIEIPSFKQGLSVSPFPKKR










GRPKRQMRSPVKMKPPVLSVAPFVATESPSK










LESESDNHRSSSDFFESEDQLQDPDDLDDSHR










PSVCSMSDLEMEPDKKITKRNNGQLMKTIIRK










INKMKTLKRKKLLNQILSSSVESSNKGKVQSKL










HNTVSSLAATFGSKLGQQINVSKKGTIYIGKRR










GRKPKTVLNGILSGSPTSLAVLEQTAQQAAGS










ALGQILPPLLPSSASSSEILPSPICSQSSGTSGGQ










SPVSSDAGFVEPSSVPYLHLHSRQGSMIQTLA










MKKASKGRRRLSPPTLLPNSPSHLSELTSLKEA










TPSPISESHSDETIPSDSGIGTDNNSTSDRAEKF










CGQKKRRHSFEHVSLIPPETSTVLSSLKEKHKH










KCKRRNHDYLSYDKMKRQKRKRKKKYPQLRN










RQDPDFIAELEELISRLSEIRITHRSHHFIPRDLL










PTIFRINFNSFYTHPSFPLDPLHYIRKPDLKKKR










GRPPKMREAMAEMPFMHSLSFPLSSTGFYPS










YGMPYSPSPLTAAPIGLGYYGRYPPTLYPPPPS










PSFTTPLPPPSYMHAGHLLLNPAKYHKKKHKL










LRQEAFLTTSRTPLLSMSTYPSVPPEMAYGW










MVEHKHRHRHKHREHRSSEQPQVSMDTGSS










RSVLESLKRYRFGKDAVGERYKHKEKHRCHMS










CPHLSPSKSLINREEQWVHREPSESSPLALGLQ










TPLQIDCSESSPSLSLGGFTPNSEPASSDEHTNL










FTSAIGSCRVSNPNSSGRKKLTDSPGLFSAQDT










SLNRLHRKESLPSNERAVQTLAGSQPTSDKPS










QRPSESTNCSPTRKRSSSESTSSTVNGVPSRSP










RLVASGDDSVDSLLQRMVQNEDQEPMEKSID










AVIATASAPPSSSPGRSHSKDRTLGKPDSLLVP










AVTSDSCNNSISLLSEKLTSSCSPHHIKRSVVEA










MQRQARKMCNYDKILATKKNLDHVNKILKAK










KLQRQARTGNNFVKRRPGRPRKCPLQAVVS










MQAFQAAQFVNPELNRDEEGAALHLSPDTV










TDVIEAVVQSVNLNPEHKKGLKRKGWLLEEQ










TRKKQKPLPEEEEQENNKSFNEAPVEIPSPSET










PAKPSEPESTLQPVLSLIPREKKPPRPPKKKYQK










AGLYSDVYKTTDFSPGAGGFCTTLPPSFLRVD










DRATSSTTDSSRAPSSPRPPGSTSHCGISTRCT










ERCLCVLPLRTSQVPDVMAPQHDQEKFHDLA










YSCLGKSFSMSNQDLYGYSTSSLALGLAWLSW










ETKKKNVLHLVGLDSL




G17792.
1
326
21
206
8429
MSGGKKKSSFQITSVTTDYEGPGSPGASDPPT
4
2


TCGA-





PQPPTGPPPRLPNGEPSPDPGGKGTPRNGSP




28-





PPGAPSSRFRVVKLPHGLGEPYRRGRWTCVD




5204-





VYERDLEPHSFGGLLEGIRGASGGAGGRSLDS




01A-





RLELASLGLGAPTPPSGLSQGPTSWLRPPPTSP




01R-





GPQARSFTGGLGQLVVPSKAKAEKPPLSASSP




1850-





QQRPPEPETGESAGTSRAATPLPSLRVEAEAG




01.4





GSGARTPPLSRRKAVDMRLRMELGAPEEMG










QVPPLDSRPSSPALYFTHDASLVHKSPDPFGA










VAAQKFSLAHSMLAISGHLDSDDDSGSGSLV










GIDNKIEQAMVFFWRQKIKPTISGHPDSKKHS










LKKMEKTLQVVETLRLVELPKEAKPKLGESPEL










ADPCVLAKTTEETEVELGQQGQSLLQLPRTAV










KSVSTLMVSALQSGWQMCSWKSSVSSASVS










SQVRTQSPLKTPEAELLWEVYLVLWAVRKHLR










RLYRRQERHRRHHVRCHAAPRPNPAQSLKLD










AQSPL




BT299
1
20
10
600
8430
MALRRLGAALLLLPLLAAVEGAGQDVGRSCIL










VSIAGKNVMLDCGMHMGFNDDRRFPDFSYI










TQNGRLTDFLDCVIISHFHLDHCGALPYFSEM










VGYDGPIYMTHPTQAICPILLEDYRKIAVDKKG










EANFFTSQMIKDCMKKVVAVHLHQTVQVDD










ELEIKAYYAGHVLGAAMFQIKVGSESVVYTGD










YNMTPDRHLGAAWIDKCRPNLLITESTYATTI










RDSKRCRERDFLKKVHETVERGGKVLIPVFALG










RAQELCILLETFWERMNLKVPIYFSTGLTEKAN










HYYKLFIPWTNQKIRKTFVQRNMFEFKHIKAF










DRAFADNPGPMVVFATPGMLHAGQSLQIFR










KWAGNEKNMVIMPGYCVQGTVGHKILSGQR










KLEMEGRQVLEVKMQVEYMSFSAHADAKGI










MQLVGQAEPESVLLVHGEAKKMEFLKQKIEQ










ELRVNCYMPANGETVTLPTSPSIPVGISLGLLK










REMAQGLLPEAKKPRLLHGTLIMKDSNFRLVS










SEQALKELGLAEHQLRFTCRVHLHDTRKEQET










ALRVYSHLKSVLKDHCVQHLPDGSVTVESVLL










QAAAPSEDPGTKVLLVSWTYQDEELGSFLTSL










LKKGLPQAPS




G17667.
1
258
50
56
8431
MNQPESANDPEPLCAVCGQAHSLEENHFYSY
4
4


TCGA-





PEEVDDDLICHICLQALLDPLDTPCGHTYCTLC




26-





LTNFLVEKDFCPMDRKPLVLQHCKKSSILVNKL




5134-





LNKLLVTCPFREHCTQVLQRCDLEHHFQTSCK




01A-





GASHYGLTKDRKRRSQDGCPDGCASLTATAP




01R-





SPEVSAAATISLMTDEPGLDNPAYVSSAEDGQ




1850-





PAISPVDSGRSNRTRARPFERSTIRSRSFKKINR




01.2





ALSVLRRTKSGSAVANHADQGRENSENTTAP










EVWKHMLL




GBM-
1
432
8
609
8432
MWLKVGGLLRGTGGQLGQIVGWPCGALGP
14
2


CUMC3338_L1





GPHRWGPCGGSWAQKFYQDGPGRGLGEED










IRRAREARPRKTPRPQLSDRSRERKVPASRISR










LANFGGLAVGLGLGVLAEMAKKSMPGGRLQ










SEGGSGLDSSPFLSEANAERIVQTLCTVRGAAL










KVGQMLSIQDNSFISPQLQHIFERVRQSADF










MPRWQMLRVLEEELGRDWQAKVASLEEVPF










AAASIGQVHQGLLRDGTEVAVKIQYPGIAQSI










QSDVQNLLAVLKMSAALPAGLFAEQSLQALQ










QELAWECDYRREAACAQNFRQLLANDPFFRV










PAVVKELCTTRVLGMELAGGVPLDQCQGLSQ










DLRNQICFQLLTLCLRELFEFRFMQTDPNWAN










FLYDASSHQVTLLDFGASREFGTEFTDHYIEVV










KAAADGDRDCVLQKSRDLKFLTGFETKGGPR










RPERHLPPAPCGAPGPPETCRTEPDGAGTMN










KLRQSLRRRKPAYVPEASRPHQWQADEDAVR










KGTCSFPVRYLGHVEVEESRGMHVCEDAVKK










LKAMGRKSVKSVLWVSADGLRVVDDKTKDLL










VDQTIEKVSFCAPDRNLDKAFSYICRDGTTRR










WICHCFLALKDSGERLSHAVGCAFAACLERKQ










RREKECGVTAAFDASRTSFAREGSFRLSGGGR










PAEREAPDKKKAEAAAAPTVAPGPAQPGHVS










PTPATTSPGEKGEAGTPVAAGTTAAAIPRRHA










PLEQLVRQGSFRGFPALSQKNSPFKRQLSLRL










NELPSTLQRRTDFQVKGTVPEMEPPGAGDSD










SINALCTQISSSFASAGAPAPGPPPATTGTSA










WGEPSVPPAAAFQPGHKRTPSEAERWLEEVS










QVAKAQQQQQQQQQQQQQQQQQQQQA










ASVAPVPTMPPALQPFPAPVGPFDAAPAQVA










VFLPPPHMQPPFVPAYPGLGYPPMPRVPVVG










ITPSQMVANAFCSAAQLQPQPATLLGKAGAF










PPPAIPSAPGSQARPRPNGAPWPPEPAPAPA










PELDPFEAQWAALEGKAIVEKPSNPFSGDLQ










KTFEIEL




G17815.
1
700
32
192
8433
MASCPDSDNSWVLAGSESLPVETLGPASRM
10
2


TCGA-





DPESERALQAPHSPSKTDGKELAGTMDGEGT




19-





LFQTESPQSGSILTEETEVKGTLEGDVCGVEPP




5960-





GPGDTVVQGDLQETTVVTGLGPDTQDLEGQ




01A-





SPPQSLPSTPKAAWIREEGRCSSSDDDTDVD




11R-





MEGLRRRRGREAGPPQPMVPLAVENQAGG




1850-





EGAGGELGISLNMCLLGALVLLGLGVLLFSGGL




01.4





SESETGPMEEVERQVLPDPEVLEAVGDRQDG










LREQLQAPVPPDSVPSLQNMGLLLDKLAKEN










QDIRLLQAQLQAQKEELQSLMHQPKGLEEEN










AQLRGALQQGEAFQRALESELQQLRARLQGL










EADCVRGPDGVCLSGGRGPQGDKAIREQGP










REQEPELSFLKQKEQLEAEAQALRQELERQRR










LLGSVQQDLERSLQDASRGDPAHAGLAELGH










RLAQKLQGLENWGQDPGVSANASKAWHQK










SHFQNSREWSGKEKWWDGQRDRKAEHWK










HKKEESGRERKKNWGGQEDREPAGRWKEGR










PRVEESGSKKEGKRQGPKEPPRKSGSFHSSGE










KQKQPRWREGTKDSHDPLPSWAELLRPKYRA










PQGCSGVDECARQEGLTFFGTELAPVRQQEL










ASLLRTYLARLPWAGQLTKELPLSPAFFGEDGI










FRHDRLRFRDFVDALEDSLEEVAVQQTGDDD










EVDDFEDFIFSHFFGDKALKKRLGADVCAVLRL










SGPLKEQYAQEHGLNFQRLLDTSTYKEAFRKD










MIRWGEEKRQADPGFFCRKIVEGISQPIWLVS










DTRRVSDIQWFREAYGAVTQIVRVVALEQSR










QQRGWVFTPGVDDAESECGLDNFGDFDWVI










ENHGVEQRLEEQLENLIEFIRSRL




G17662.
1
81
23
977
8434
MEARSMLVPPQASVCFEDVAMAFTQEEWE
3
2


TCGA-





QLDLAQRTLYREVTLETWEHIVSLGLFLSKSDV




32-





ISQLEQEEDLCRAEQEAPRGKSKEAEIKRINKEL




1970-





ANIRSKFKGDKALDGYSKKKYVCKLLFIFLLGH




01A-





DIDFGHMEAVNLLSSNKYTEKQIGYLFISVLVN




01R-





SNSELIRLINNAIKNDLASRNPTFMCLALHCIA




1850-





NVGSREMGEAFAADIPRILVAGDSMDSVKQS




01.2





AALCLLRLYKASPDLVPMGEWTARVVHLLND










QHMGVVTAAVSLITCLCKKNPDDFKTCVSLAV










SRLSRIVSSASTDLQDYTYYFVPAPWLSVKLLRL










LQCYPPPEDAAVKGRLVECLETVLNKAQEPPK










SKKVQHSNAKNAILFETISLIIHYDSEPNLLVRA










CNQLGQFLQHRETNLRYLALESMCTLASSEFS










HEAVKTHIDTVINALKTERDVSVRQRAADLLY










AMCDRSNAKQIVSEMLRYLETADYAIREEIVLK










VAILAEKYAVDYSWYVDTILNLIRIAGDYVSEE










VWYRVLQIVTNRDDVQGYAAKTVFEALQAPA










CHENMVKVGGYILGEFGNLIAGDPRSSPPVQF










SLLHSKFHLCSVATRALLLSTYIKFINLFPETKATI










QGVLRAGSQLRNADVELQQRAVEYLTLSSVA










STDVLATVLEEMPPFPERESSILAKLKRKKGPG










AGSALDDGRRDPSSNDINGGMEPTPSTVSTP










SPSADLLGLRAAPPPAAPPASAGAGNLLVDVF










DGPAAQPSLGPTPEEAFLSELEPPAPESPMALL










ADPAPAADPGPEDIGPPIPEADELLNKFVCKN










NGVLFENQLLQIGVKSEFRQNLGRMYLFYGN










KTSVQFQNFSPTVVHPGDLQTHLAVQTKRVA










AQVDGGAQVQQVLNIECLRDFLTPPLLSVRFR










YGGAPQALTLKLPVTINKFFQPTEMAAQDFF










QRWKQLSLPQQEAQKIFKANHPMDAEVTKA










KLLGFGSALLDNVDPNPENFVGAGIIQTKALQ










VGCLLRLEPNAQAQMYRLTLRTSKEPVSRHLC










ELLAQQF




NYU_E
1
416
3
148
8435
MASESGKLWGGRFVGAVDPIMEKFNASIAYD
15
2








RHLWEVDVQGSKAYSRGLEKAGLLTKAEMD










QILHGLDKVAEEWAQGTFKLNSNDEDIHTAN










ERRLKELIGATAGKLHTGRSRNDQVVTDLRLW










MRQTCSTLSGLLWELIRTMVDRAEAERDVLF










PGYTHLQRAQPIRWSHWILSHAVALTRDSERL










LEVRKRINVLPLGSGAIAGNPLGVDRELLRAEL










NFGAITLNSMDATSERDFVAEFLFWASLCMT










HLSRMAEDLILYCTKEFSFVQLSDAYSTGSSLM










PQKKNPDSLELIRSKAGRVFGRCAGLLMTLKG










LPSTYNKDLQEDKEAVFEVSDTMSAVLQVAT










GVISTLQIHQENMGQALSPDMLATDLAYYLV










RKGMPFRQAHEASGKAVFMAETKGVALNQL










SLQELQTIRKDANSALLSNYEVFQLLTDLKEQR










KESGKNKHSSGQQNLNTITYETLKYISKTPCRH










QSPEIVREFLTALKSHKLTKAEKLQLLNHRPVT










AVEIQLMVEESEERLTEEQIEALLHTVTSILPAE










PEAEQKKNTNSNVAMDEEDPA




G17816.
1
982
190
225
8436
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
6


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




28-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




5215-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.4





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQDAFIGFGGNVI










RQQVKDNAKWYITDFVELLGELEE




G17650.
1
68
18
432
8437
MGETMSKRLKLHLGGEAEMEERAFVNPFPDY
1
3


TCGA-





EAAAGALLASGAAEETGCVRPPATTDEPGLPF




28-





HQDGKQKENNIRCLTTIGHFGFECLPNQLVSR




2513-





SIRQGFTFNILCVGETGIGKSTLIDTLFNTNLKD




01A-





NKSSHFYSNVGLQIQTYELQESNVQLKLTVVE




01R-





TVGYGDQIDKEASYQPIVDYIDAQFEAYLQEEL




1850-





KIKRSLFEYHDSRVHVCLYFISPTGHSLKSLDLLT




01.2





MKNLDSKVNIIPLIAKADTISKNDLQTFKNKIM










SELISNGIQIYQLPTDEETAAQANSSVSGLLPFA










VVGSTDEVKVGKRMVRGRHYPWGVLQVEN










ENHCDFVKLRDMLLCTNMENLKEKTHTQHYE










CYRYQKLQKMGFTDVGPNNQPVSFQEIFEAK










RQEFYDQCQREEEELKQRFMQRVKEKEATFK










EAEKELQDKFEHLKMIQQEEIRKLEEEKKQLEG










EIIDFYKMKAASEALQTQLSTDTKKDKHRKK




BT308
1
176
2050
2063
8438
MAAPLVLVLVVAVTVRAALFRSSLAEFISERVE
6
16








VVSPLSSWKRVVEGLSLLDLGVSPYSGAVFHE










TPLIIYLFHFLIDYAELVFMITDALTAIALYFAIQD










FNKVVFKKQKLLLELDQYAPDVAELIRTPMEM










RYIPLKVALFYLLNPYTILSCVAKSTCAINNTLIA










FFILTTIKDITSAVQSKRRKSK




G17656.
1
982
18
172
8439
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
2


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




28-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




2514-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




02R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1850-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTTYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQCTEAKKHCWYF










EGLYPTYYICRSYEDCCGSRCCVRALSIQRLWY










FWFLLMMGVLFCCGAGFFIRRRMYPPPLIEEP










AFNVSYTRQPPNPGPGAQQPGPPYYTDPGG










PGMNPVGNSMAMAFQVPPNSPQGSVACPP










PPAYCNTPPPPYEQVVKAK




G17639.
1
344
94
262
8440
MAEAPPVSGTFKFNTDAAEFIPQEKKNSGLNC
2
3


TCGA-





GTQRRLDSNRIGRRNYSSPPPCHLSRQVPYDE




12-





ISAVHQHSYHPSGSKPKSQQTSFQSSPCNKSP




3652-





KSHGLQNQPWQKLRNEKHHIRVKKAQSLAE




01A-





QTSDTAGLESSTRSESGTDLREHSPSESEKEVV




01R-





GADPRGAKPKKATQFVYSYGRGPKVKGKLKC




1849-





EWSNRTTPKPEDAGPESTKPVGVFHPDSSEAS




01.2





SRKGVLDGYGARRNEQRRYPQKRPPWEVEG










ARPRPGRNPPKQEGHRHTNAGHRNNMGPIP










KDDLNERPAKSTCDSENLAVINKSSRRVDQEK










CTVRRQDPQVVSPFSRGKQNHVLKNVETHTA










DKCSPNYLGSDWYNTWRMEPYNSSCCNKYT










TYLPRLPKEARMETAVRGMPLECPPRPERLNA










YEREVMVNMLNSLSRNQQLPRITPRCGCVDP










LPGRLPFHGYESACSGRHYCLRGMDYYASGA










PCTDRRLRPWCREQPTMCTSLRAPARNAVCC










YNSPAVILPISEP




G17796.
1
193
167
1032
8441
MAAETLLSSLLGLLLLGLLLPASLTGGVGSLNLE
5
7


TCGA-





ELSEMRYGIEILPLPVMGGQSQSSDVVIVSSKY




41-





KQRYECRLPAGAIHFQREREEETPAYQGPGIP




5651-





ELLSPMRDAPCLLKTKDWWTYEFCYGRHIQQ




01A-





YHMEDSEIKGEVLYLGYYQSAFDWDDETAKA




01R-





SKQHRLKRYHSQTYGNGSKCDLNGRPREAEV




1850-





RGCTERFVSSPEEILDVIDEGKSNRHVAVTNM




01.4





NEHSSRSHSIFLINIKQENMETEQKLSGKLYLV










DLAGSEKVSKTGAEGAVLDEAKNINKSLSALG










NVISALAEGTKSYVPYRDSKMTRILQDSLGGN










CRTTMFICCSPSSYNDAETKSTLMFGQRAKTIK










NTASVNLELTAEQWKKKYEKEKEKTKAQKETI










AKLEAELSRWRNGENVPETERLAGEEAALGA










ELCEETPVNDNSSIVVRIAPEERQKYEEEIRRLY










KQLDDKDDEINQQSQLIEKLKQQMLDQEELL










VSTRGDNEKVQRELSHLQSENDAAKDEVKEV










LQALEELAVNYDQKSQEVEEKSQQNQLLVDE










LSQKVATMLSLESELQRLQEVSGHQRKRIAEV










LNGLMKDLSEFSVIVGNGEIKLPVEISGAIEEEF










TVARLYISKIKSEVKSVVKRCRQLENLQVECHR










KMEVTGRELSSCQLLISQHEAKIRSLTEYMQSV










ELKKRHLEESYDSLSDELAKLQAQETVHEVALK










DKEPDTQDADEVKKALELQMESHREAHHRQ










LARLRDEINEKQKTIDELKDLNQKLQLELEKLQ










ADYEKLKSEEHEKSTKLQELTFLYERHEQSKQD










LKGLEETVARELQTLHNLRKLFVQDVTTRVKKS










AEMEPEDSGGIHSQKQKISFLENNLEQLTKVH










KQLVRDNADLRCELPKLEKRLRATAERVKALE










GALKEAKEGAMKDKRRYQQEVDRIKEAVRYK










SSGKRGHSAQIAKPVRPGHYPASSPTNPYGTR










SPECISYTNSLFQNYQNLYLQATPSSTSDMYFA










NSCTSSGATSSGGPLASYQKANMDNGNATDI










NDNRSDLPCGYEAEDQAKLFPLHQETAAS




G17468.
1
209
176
342
8442
MKLADSVMAGKASDGSIKWQLCYDISARTW
2
5


TCGA-





WMDEFHPFIEALLPHVRAFAYTWFNLQARKR




19-





KYFKKHEKRMSKEEERAVKDELLSEKPEVKQK




0957-





WASRLLAKLRKDIRPEYREDFVLTVTGKKPPCC




02A-





VLSNPDQKGKMRRIDCLRQADKVWRLDLVM




11R-





VILFKGIPLESTDGERLVKSPQCSNPGLCVQPH




2005-





HIGVSVKELDLYLAYFVHAAVISECQRQQLEAV




01.2





SYSSRYALGLFYEAGTKIDVPWAGQYITSNPCI










RFVSIDNKKRNIESSEIGPSLVIHTTVPFGVTYLE










HSIEDVQELVFQQLENILPGLPQPIATKCQKW










RHSQVTNAAANCPGQMTLHHKPFLACGGDG










FTQSNFDGCITSALCVLEALKNYI




G17790.
1
386
211
319
8443
MMLRGNLKQVRIEKNPARLRALESAVGESEP
6
6


TCGA-





AAAAAMALALAGEPAPPAPAPPEDHPDEEM




06-





GFTIDIKSFLKPGEKTYTQRCRLFVGNLPTDITE




5856-





EDFKRLFERYGEPSEVFINRDRGFGFIRLESRTL




01A-





AEIAKAELDGTILKSRPLRIRFATHGAALTVKNL




01R-





SPVVSNELLEQAFSQFGPVEKAVVVVDDRGR




1849-





ATGKGFVEFAAKPPARKALERCGDGAFLLTTT




01.4





PRPVIVEPMEQFDDEDGLPEKLMQKTQQYHK










EREQPPRFAQPGTFEFEYASRWKALDEMEKQ










QREQVDRNIREAKEKLEAEMEAARHEHQLML










MRQDLMRRQEELRRLEELRNQELQKRKQIQL










RHEEEHRRREEEMIRHREQEELRRQQEGFKP










NYMENEGIVSPSDLDLVMSEGLGMRYAFIGP










LETMHLNAEGMLSYCDRYSEGIKHVLQTFGPI










PEFSRATAEKVNQDMCMKVPDDPEHLAARR










QWRDECLMRLAKLKSQVQPQ




GBM-
1
447
160
327
8444
MKETIQGTGSWGPEPPGPGIPPAYSSPRRERL
8
6


CUMC3338_L1





RWPPPPKPRLKSGGGFGPDPGSGTTVPARRL










PVPRPSFDASASEEEEEEEEEEDEDEEEEVAA










WRLPPRWSQLGTSQRPRPSRPTHRKTCSQRR










RRAMRAFRMLLYSKSTSLTFHWKLWGRHRG










RRRGLAHPKNHLSPQQGGATPQVPSPCCRFD










SPRGPPPPRLGLLGALMAEDGVRGSPPVPSG










PPMEEDGLRWTPKSPLDPDSGLLSCTLPNGF










GGQSGPEGERSLAPPDASILISNVCSIGDHVA










QELFQGSDLGMAEEAERPGEKAGQHSPLREE










HVTCVQSILDEFLQTYGSLIPLSTDEVVEKLEDI










FQQEFSTPSRKGLVLQLIQSYQRMPGNAMVR










GFRVAYKRHVLTMDDLGTLYGQNWLNDQV










MNMYGDLVMDTVPEKVHFFNSFFYDKLRTK










GYDGVKRWTKNTRHYVGSAAAFAGTPEHGQ










FQGSPGGAYGTAQPPPHYGPTQPAYSPSQQL










RAPSAFPAVQYLSQPQPQPYAVHGHFQPTQT










GFLQPGGALSLQKQMEHANQQTGFSDSSSLR










PMHPQALHPAPGLLASPQLPVQMQPAGKSG










FAATSQPGPRLPFIQHSQNPRFYHK




G17790.
1
37
228
497
8445
MVNLAAMVWRRLLRKRWVLALVFGLSLVYFL
1
9


TCGA-





SSTFKQDLDAGVSEHSGDWLDQDSVSDQFSV




06-





EFEVESLDSEDYSLSEEGQELSDEDDEVYQVTV




5856-





YQAGESDTDSFEEDPEISLADYWKCTSCNEM




01A-





NPPLPSHCNRCWALRENWLPEDKGKDKGEIS




01R-





EKAKLENSTQAEEGFDVPDCKKTIVNDSRESC




1849-





VEENDDKITQASQSQESEDYSQPSTSSSIIYSSQ




01.4





EDVKEFEREETQDKEESVESSLPLNAIEPCVICQ










GRPKNGCIVHGKTGHLMACFTCAKKLKKRNK










PCPVCRQPIQMIVLTYFP




G17802.
1
344
240
432
8446
MDAPRASAAKPPTGRKMKARAPPPPGKAAT
7
7


TCGA-





LHVHSDQKPPHDGALGSQQNLVRMKEALRA




28-





STMDVTVVLPSGLEKRSVLNGSHAMMDLLVE




5208-





LCLQNHLNPSHHALEIRSSETQQPLSFKPNTLI




01A-





GTLNVHTVFLKEKVPEEKVKPGPPKVPEKSVRL




01R-





VVNYLRTQKAVVRVSPEVPLQNILPVICAKCEV




1850-





SPEHVVLLRDNIAGEELELSKSLNELGIKELYA




01.4





WDNRRETFRKSSLGNDETDKEKKKFLGFFKVN










KRSNSKAEQLVLSGADSDEDTSRAAPGRGLN










GCLTTPNSPSMHSRSLTLGPSLSLGSISGVSVK










SEMKKRRAPPPPGSGPPVQDKASEKGLLPFA










VVGSTDEVKVGKRMVRGRHYPWGVLQVEN










ENHCDFVKLRDMLLCTNMENLKEKTHTQHYE










CYRYQKLQKMGFTDVGPNNQPVSFQEIFEAK










RQEFYDQCQREEEELKQRFMQRVKEKEATFK










EAEKELQDKFEHLKMIQQEEIRKLEEEKKQLEG










EIIDFYKMKAASEALQTQLSTDTKKDKHRKK




G17210.
1
149
141
188
8447
MKLALLLPWACCCLCGSALATGFLYPFSAAAL
4
4


TCGA-





QQHGYPEPGAGSPGSGYASRRHWCHHTVTR




12-





TVSCQVQNGSETVVQRVYQSCRWPGPCANL




0616-





VSYRTLIRPTYRVSYRTVTVLEWRCCPGFTGSN




01A-





CDEECMNCTRLSDMSERLTTLEAKIICMGAKE




01R-





NGLPLEYQEKLKAIEPNDYTGKVSEEIEDIIKKG




1849-





ETQTL




01.2










NYU_B
1
89
380
1104
8448
MAAETQTLNFGPEWLRALSSGGSITSPPLSPA
4
10








LPKYKLADYRYGREEMLALFLKDNKIPSDLLDK










EFLPILQEEPLPPLALVPFTEEEQLKEILECSHLLT










VIKMEEAGDEIVSNAISYALYKAFSTSEQDKDN










WNGQLKLLLEWNQLDLANDEIFTNDRRWES










ADLQEVMFTALIKDRPKFVRLFLENGLNLRKFL










THDVLTELFSNHFSTLVYRNLQIAKNSYNDALL










TFVWKLVANFRRGFRKEDRNGRDEMDIELHD










VSPITRHPLQALFIWAILQNKKELSKVIWEQTR










GCTLAALGASKLLKTLAKVKNDINAAGESEELA










NEYETRAVELFTECYSSDEDLAEQLLVYSCEAW










GGSNCLELAVEATDQHFIAQPGVQNFLSKQW










YGEISRDTKNWKIILCLFIIPLVGCGFVSFRKKPV










DKHKKLLWYYVAFFTSPFVVFSWNVVFYIAFLL










LFAYVLLMDFHSVPHPPELVLYSLVFVLFCDEV










RQWYVNGVNYFTDLWNVMDTLGLFYFIAGI










VFRLHSSNKSSLYSGRVIFCLDYIIFTLRLIHIFTV










SRNLGPKIIMLQRMLIDVFFFLFLFAVWMVAF










GVARQGILRQNEQRWRWIFRSVIYEPYLAMF










GQVPSDVDGTTYDFAHCTFTGNESKPLCVELD










EHNLPRFPEWITIPLVCIYMLSTNILLVNLLVA










MFGYTVGTVQENNDQVWKFQRYFLVQEYCS










RLNIPFPFIVFAYFYMVVKKCFKCCCKEKNMES










SVCCFKNEDNETLAWEGVMKENYLVKINTKA










NDTSEEMRHRFRQLDTKLNDLKGLLKEIANKI










K




G17469.
1
982
373
432
8449
MRPSGTAGAALLALLAALCPASRALEEKKVCQ
24
10


TCGA-





GTSNKLTQLGTFEDHFLSLQRMFNNCEVVLG




06-





NLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVE




2557-





RIPLENLQIIRGNMYYENSYALAVLSNYDANKT




01A-





GLKELPMRNLQEILHGAVRFSNNPALCNVESI




01R-





QWRDIVSSDFLSNMSMDFQNHLGSCQKCDP




1849-





SCPNGSCWGAGEENCQKLTKIICAQQCSGRC




01.2





RGKSPSDCCHNQCAAGCTGPRESDCLVCRKF










RDEATCKDTCPPLMLYNPTEYQMDVNPEGKY










SFGATCVKKCPRNYVVTDHGSCVRACGADSY










EMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDS










LSINATNIKHFKNCTSISGDLHILPVAFRGDSFT










HTPPLDPQELDILKTVKEITGFLLIQAWPENRT










DLHAFENLEIIRGRTKQHGQFSLAVVSLNITSL










GLRSLKEISDGDVIISGNKNLCYANTINWKKLF










GTSGQKTKIISNRGENSCKATGQVCHALCSPE










GCWGPEPRDCVSCRNVSRGRECVDKCNLLEG










EPREFVENSECIQCHPECLPQAMNITCTGRGP










DNCIQCAHYIDGPHCVKTCPAGVMGENNTLV










WKYADAGHVCHLCHPNCTYGCTGPGLEGCP










TNGPKIPSIATGMVGALLLLLVVALGIGLFMRR










RHIVRKRTLRRLLQERELVEPLTPSGEAPNQAL










LRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEK










VKIPVAIKELREATSPKANKEILDEAYVMASVD










NPHVCRLLGICLTSTVQLITQLMPFGCLLDYVR










EHKDNIGSQYLLNWCVQIAKGMNYLEDRRLV










HRDLAARNVLVKTPQHVKITDFGLAKLLGAEE










KEYHAEGGKVPIKWMALESILHRIYTHQSDV










WSYGVTVWELMTFGSKPYDGIPASEISSILEK










GERLPQPPICTIDVYMIMVKCWMIDADSRPK










FRELIIEFSKMARDPQRYLVIQLQDKFEHLKMI










QQEEIRKLEEEKKQLEGEIIDFYKMKAASEALQ










TQLSTDTKKDKHRKK




G17480.
1
471
331
737
8450
MITSAAGIISLLDEDEPQLKEFALHKLNAVVND
12
6


TCGA-





FWAEISESVDKIEVLYEDEGFRSRQFAALVASK




27-





VFYHLGAFEESLNYALGAGDLFNVNDNSEYVE




1830-





TIIAKCIDHYTKQCVENADLPEGEKKPIDQRLE




01A-





GIVNKMFQRCLDDHKYKQAIGIALETRRLDVF




01R-





EKTILESNDVPGMLAYSLKLCMSLMQNKQFR




1850-





NKVLRVLVKIYMNLEKPDFINVCQCLIFLDDPQ




01.2





AVSDILEKLVKEDNLLMAYQICFDLYESASQQF










LSSVIQNLRTVGTPIASVPGSTNTGTVPGSEKD










SDSMETEEKTSSAFVGKTPEASPEPKDQTLKM










IKILSGEMAIELHLQFLIRNNNTDLMILKNTKD










AVRNSVCHTATVIANSFMHCGTTSDQFLRDN










LEWLARATNWAKFTATASLGVIHKGHEKEAL










QLMATYLPKDTSPGSAYQEGGGLYALGLIHA










NHGGDIIDYLLNQLKNASNDATFSCTCEEQYV










GTFCEEYDACQRKPCQNNASCIDANEKQDGS










NFTCVCLPGYTGELCQSKIDYCILDPCRNGATC










ISSLSGFTCQCPEGYFGSACEEKVDPCASSPCQ










NNGTCYVDGVHFTCNCSPGFTGPTCAQLIDF










CALSPCAHGTCRSVGTSYKCLCDPGYHGLYCE










EEYNECLSAPCLNAATCRDLVNGYECVCLAEY










KGTHCELYKDPCANVSCLNGATCDSDGLNGT










CICAPGFTGEECDIDINECDSNPCHHGGSCLD










QPNGYNCHCPHGWVGANCEIHLQWKSGH










MAESLTNMPRHSLYIIIGALCVAFILMLIILIVGI










CRISRIEYQGSSRPAYEEFYNCRSIDSEFSNAIAS










IRHARFGKKSRPAMYDVSPIAYEDYSPDDKPL










VTLIKTKDL




G17634.
1
62
2
132
8451
MAEGAQPHQPPQLGPGAAARGMKRESELEL
1
2


TCGA-





PVPGAGGDGADPGLSKRPRTEEAAADGGGG




19-





MQGELEVKNMDMKPGSTLKITGSIADGTDGF




2625-





VINLGQGTDKLNLHFNPRFSESTIVCNSLDGSN




01A-





WGQEQREDHLCFSPGSEVKFTVTFESDKFKVK




01R-





LPDGHELTFPNRLGHSHLSYLSVRGGFNMSSF




1850-





KLKE




01.2










G17654.
1
151
467
758
8452
MDRPGFVAALVAGGVAGVSVDLILFPLDTIKT
6
13


TCGA-





RLQSPQGFSKAGGFHGIYAGVPSAAIGSFPNA




41-





AAFFITYEYVKWFLHADSSSYLTPMKHMLAAS




4097-





AGEVVACLIRVPSEVVKQRAQVSASTRTFQIFS




01A-





NILYEEGIQGLYRGYKSTVLREIEEVRDAMENE




01R-





MRTQLRRQAAAHTDHLRDVLRVQEQELKSEF




1850-





EQNLSEKLSEQELQFRRLSQEQVDNFTLDINT




01.2





AYARLRGIEQAVQSHAVAEEEARKAHQLWLS










VEALKYSMKTSSAETPTIPLGSAVEAIKANCSD










NEFTQALTAAIPPESLTRGVYSEETLRARFYAV










QKLARRVAMIDETRNSLYQYFLSYLQSLLLFPP










QQLKPPPELCPEDINTFKLLSYASYCIEHGDLEL










AAKFVNQLKGESRRVAQDWLKEARMTLETK










QIVEILTAYASAVGIGTTQVQPE




G17485.
1
25
118
369
8453
MGVPKFYRWISERYPCLSEVVKEHQVICNSFTI
1
6


TCGA-





CNAEMQEVGVGLYPSISLLNHSCDPNCSIVFN




14-





GPHLLLRAVRDIEVGEELTICYLDMLMTSEERR




1402-





KQLRDQYCFECDCFRCQTQDKDADMLTGDE




02A-





QVWKEVQESLKKIEELKAHWKWEQVLAMCQ




01R-





AIISSNSERLPDINIYQLKVLDCAMDACINLGLL




2005-





EEALFYGTRTMEPYRIFFPGSHPVRGVQVMK




01.2





VGKLQLHQGMFPQAMKNLRLAFDIMRVTHG










REHSLIEDLILLLEECDANIRAS




G17498.
1
338
85
225
8454
MAAAGARLSPGPGSGLRGRPRLCFHPGPPPL
3
3


TCGA-





LPLLLLFLLLLPPPPLLAGATAAASREPDSPCRLK




02-





TVTVSTLPALRESDIGWSGARAGAGAGTGAG




2483-





AAAAAASPGSPGSAGTAAESRLLLFVRNELPG




01A-





RIAVQDDLDNTELPFFTLEMSGTAADISLVHW




01R-





RQQWLENGTLYFHVSMSSSGQLAQATAPTL




1849-





QEPSEIVEEQMHILHISVMGGLIALLLLLLVFTV




01.2





ALYAQRRWQKRRRIPQKSASTEATHEIHYIPS










VLLGPQARESFRSSRLQTHNSVIGVPIRETPILD










DYDCEEDEEPPRRANHVSREDEFGSQVTHTL










DSLGHPGEEKVDFEKKDPSLPNVQVTRLTLLSE










QAPGPVVMDLTGDLAVLKDQVFVLKEGVDY










RVKISFKVHREIVSGLKCLHHTYRRGLRVDKTV










YMVGSYGPSAQEYEFVTPVEEAPRGALVRGP










YLVVSLFTDDDRTHHLSWEWGLCICQDWKD




G17799.
1
613
256
282
8455
MLRLQMTDGHISCTAVEFSYMSKISLNTPPGT
12
10


TCGA-





KVKLSGIVDIKNGFLLLNDSNTTVLGGEVEHLIE




06-





KWELQRSLSKHNRSNIGTEGGPPPFVPFGQK




1804-





CVSHVQVDSRELDRRKTLQVTMPVKPTNDN




01A-





DEFEKQRTAAIAEVAKSKETKTFGGGGGGARS




01R-





NLNMNAAGNRNREVLQKEKSTKSEGKHEGV




1849-





YRELVDEKALKHITEMGFSKEASRQALMDNG




01.4





NNLEAALNVLLTSNKQKPVMGPPLRGRGKGR










GRIRSEDEEDLGNARPSAPSTLFDFLESKMGTL










NVEEPKSQPQQLHQGQYRSSNTEQNGVKDN










NHLRHPPRNDTRQPRNEKPPRFQRDSQNSKS










VLEGSGLPRNRGSERPSTSSVSEVWAEDRIKC










DRPYSRYDRTKDTSYPLGSQHSDGAFKKRDNS










MQSRSGKGPSFAEAKENPLPQGSVDYNNQK










RGKRESQTSIPDYFYDRKSQTINNEAFSGIKIEK










HFNVNTDYQNPVRSNSFIGVPNGEVEMPLKG










RRIGPIKPAGPVTAVPCDDKIFYNSGPKRRSGP










IKPEKILESSIPMEYAKMWKPGDECFALYWED










NKFYRAEVEALHSSGMTAVVKFIDYGNYEEVL










LSNIKPIQTEAWGYDHSYYFIATFITDHIRHHA










KYLNA




G17660.
1
555
34
526
8456
MADPGMMSLFGEDGNIFSEGLEGLGECGYPE
2
2


TCGA-





NPVNPMGQQMPIDQGFASLQPSLHHPSTNQ




06-





NQTKLTHFDHYNQYEQQKMHLMDQPNRM




5414-





MSNTPGNGLASPHSQYHTPPVPQVPHGGSG




01A-





GGQMGVYPGMQNERHGQSFVDSSSMWGP




01R-





RAVQVPDQIRAPYQQQQPQPQPPQPAPSGP




1849-





PAQGHPQHMQQMGSYMARGDFSMQQHG




01.2





QPQQRMSQFSQGQEGLNQGNPFIATSGPGH










LSHVPQQSPSMAPSLRHSVQQFHHHPSTALH










GESVAHSPRFSPNPPQQGAVRPQTLNFSSRS










QTVPSPTINNSGQYSRYPYSNLNQGLVNNTG










MNQNLGLTNNTPMNQSVPRYPNAVGFPSNS










GQGLMHQQPIHPSGSLNQMNTQTMHPSQP










QGTYASPPPMSPMKAMSNPAGTPPPQVRP










GSAGIPMEVGSYPNMPHPQPSHQPPGAMGI










GQRNMGPRNMQQSRPFIGMSSAPRELTGH










MRPNGCPGVGLGDPQAIQERLIPGQQHPGQ










QPSFQQLPTCPPLQPHPGLHHQSSPPHPHHQ










PWAQLHPSPQNTPQKVPVHQFDGENMYMS










MTEPSQDYVPASQSYPGPSLESEDFNIPPITPP










SLPDHSLVHLNEVESGYHSLCHPMNHNGLLPF










HPQNMDLPEITVSNMLGQDGTLLSNSISVMP










DIRNPEGTQYSSHPQMAAMRPRGQPADIRQ










QPGMMPHGQLTTINQSQLSAQLGLNMGGS










NVPHNSPSPPGSKSATPSPSSSVHEDEGDDTS










KINGGEKRPASDMGKKPKTPKKKKKKDPNEP










QKPVSAYALFFRDTQAAIKGQNPNATFGEVSK










IVASMWDGLGEEQKQVYKKKTEAAKKEYLKQ










LAAYRASLVSKSYSEPVDVKTSQPPQLINSKPS










VFHGPSQAHSALYLSSHYHQQPGMNPHLTA










MHPSLPRNIAPKPNNQMPVTVSIANMAVSP










PPPLQISPPLHQHLNMQQHQPLTMQQPLGN










QLPMQVQSALHSPTMQQGFTLQPDYQTIINP










TSTAAQVVTQAMEYVRSGCRNPPPQPVDW










NNDYCSSGGMQRDKALYLT




GBM-
1
163
627
1210
8457
MASASYHISNLLEKMTSSDKDFRFMATNDLM
4
16


CUMC3296_L1





TELQKDSIKLDDDSERKVVKMILKLLEDKNGEV










QNLAVKCLGPLVSKVKEYQVETIVDTLCTNML










SDKEQLRDISSIGLKTVIGELPPASSGSALAANV










CKKITGRLTSAIAKQEDVSVQLEALDIMADML










SRCTGPGLEGCPTNGPKIPSIATGMVGALLLLL










VVALGIGLFMRRRHIVRKRTLRRLLQERELVEP










LTPSGEAPNQALLRILKETEFKKIKVLGSGAFGT










WKGLWIPEGEKVKIPVAIKELREATSPKANKEI










LDEAYVMASVDNPHVCRLLGICLTSIVQLITQL










MPFGCLLDYVREHKDNIGSQYLLNWCVQIAK










GMNYLEDRRLVHRDLAARNVLVKTPQHVKIT










DFGLAKLLGAEEKEYHAEGGKVPIKWMALESI










LHRIYTHQSDVWSYGVTVWELMTFGSKPYD










GIPASEISSILEKGERLPQPPICTIDVYMIMVKC










WMIDADSRPKFRELIIEFSKMARDPQRYLVIQ










GDERMHLPSPTDSNFYRALMDEEDMDDVVD










ADEYLIPQQGFFSSPSTSRTPLLSSLSATSNNST










VACIDRNGLQSCPIKEDSFLQRYSSDPTGALTE










DSIDDTFLPVPEYINQSVPKRPAGSVQNPVYH










NQPLNPAPSRDPHYQDPHSTAVGNPEYLNTV










QPTCVNSTFDSPAHWAQKGSHQISLDNPDY










QQDFFPKEAKPNGIFKGSTAENAEYLRVAPQS










SEFIGA




G17505.
1
40
525
1132
8458
MNNLNDPPNWNIRPNSRADGGDGSRWNY
2
3


TCGA-





ALLVPMLGLAAFRVGCVPAAEHRLREEILAKFL




06-





HWLMSVYVVELLRSFFYVTETTFQKNRLFFYR




2564-





KSVWSKLQSIGIRQHLKRVQLRELSEAEVRQH




01A-





REARPALLTSRLRFIPKPDGLRPIVNMDYVVGA




01R-





RTFRREKRAERLTSRVKALFSVLNYERARRPGL




1849-





LGASVLGLDDIHRAWRTFVLRVRAQDPPPELY




01.2





FVKVDVTGAYDTIPQDRLTEVIASIIKPQNTYC










VRRYAVVQKAAHGHVRKAFKSHVSTLTDLQP










YMRQFVAHLQETSPLRDAVVIEQSSSLNEASS










GLFDVFLRFMCHHAVRIRGKSYVQCQGIPQG










SILSTLLCSLCYGDMENKLFAGIRRDGLLLRLVD










DFLLVTPHLTHAKTFLRTLVRGVPEYGCVVNLR










KTVVNFPVEDEALGGTAFVQMPAHGLFPWC










GLLLDTRTLEVQSDYSSYARTSIRASLTFNRGFK










AGRNMRRKLFGVLRLKCHSLFLDLQVNSLQIV










CTNIYKILLLQAYRFHACVLQLPFHQQVWKNP










TFFLRVISDTASLCYSILKAKNAGMSLGAKGAA










GPLPSEAVQWLCHQAFLLKLTRHRVTYVPLLG










SLRTAQTQLSRKLPGTTLTALEAAANPALPSDF










KTILD




G17798.
1
280
15
1673
8459
MLGTGPAAATTAATTSSNVSVLQQFASGLKS
6
2


TCGA-





RNEETRAKAAKELQHYVTMELREMSQEESTR




32-





FYDQLNHHIFELVSSSDANERKGGILAIASLIGV




5222-





EGGNATRIGRFANYLRNLLPSNDPVVMEMAS




01A-





KAIGRLAMAGDTFTAEYVEFEVKRALEWLGA




01R-





DRNEGRRHAAVLVLRELAISVPTFFFQQVQPF




1850-





FDNIFVAVWDPKQAIREGAVAALRACLILTTQ




01.4





REPKEMQKPQWYRHTFEEAEKGFDETLAKEK










GMNRDDRIHGALLILNELVRISSMEGESVSQS










VFCGTSTYCVLNTVPPIEDDHGNSNSSHVKIFL










PKKLLECLPKCSSLPKERHRWNTNEEIAAYLITF










EKHEEWLTTSPKTRPQNGSMILYNRKKVKYRK










DGYCWKKRKDGKTTREDHMKLKVQGVECLY










GCYVHSSIIPTFHRRCYWLLQNPDIVLVHYLNV










PAIEDCGKPCGPILCSINTDKKEWAKWTKEELI










GQLKPMFHGIKWTCSNGNSSSGFSVEQLVQ










QILDSHQTKPQPRTHNCLCTGSLGAGGSVHH










KCNSAKHRIISPKVEPRTGGYGSHSEVQHNDV










SEGKHEHSHSKGSSREKRNGKVAKPVLLHQSS










TEVSSTNQVEVPDTTQSSPVSISSGLNSDPDM










VDSPVVTGVSGMAVASVMGSLSQSATVFMS










EVTNEAVYTMSPTAGPNHHLLSPDASQGLVL










AVSSDGHKFAFPTTGSSESLSMLPTNVSEELVL










STTLDGGRKIPETTMNFDPDCFLNNPKQGQT










YGGGGLKAEMVSSNIRHSPPGERSFSFTTVLT










KEIKTEDTSFEQQMAKEAYSSSAAAVAASSLTL










TAGSSLLPSGGGLSPSTTLEQMDFSAIDSNKDY










TSSFSQTGHSPHIHQTPSPSFFLQDASKPLPVE










QNTHSSLSDSGGTFVMPTVKTEASSQTSSCSG










HVETRIESTSSLHLMQFQANFQAMTAEGEVT










METSQAAEGSEVLLKSGELQACSSEHYLQPET










NGVIRSAGGVPILPGNVVQGLYPVAQPSLGN










ASNMELSLDHFDISFSNQFSDLINDFISVEGGS










STIYGHQLVSGDSTALSQSEDGARAPFTQAE










MCLPCCSPQQGSLQLSSSEGGASTMAYMHV










AEVVSAASAQGTLGMLQQSGRVFMVTDYSP










EWSYPEGGVKVLITGPWQEASNNYSCLFDQIS










VPASLIQPGVLRCYCPAHDTGLVTLQVAFNNQ










IISNSVVFEYKARALPTLPSSQHDWLSLDDNQF










RMSILERLEQMERRMAEMTGSQQHKQASG










GGSSGGGSGSGNGGSQAQCASGTGALGSCF










ESRVVVVCEKMMSRACWAKSKHLIHSKTFRG










MTLLHLAAAQGYATLIQTLIKWRTKHADSIDL










ELEVDPLNVDHFSCTPLMWACALGHLEAAVV










LYKWDRRAISIPDSLGRLPLGIARSRGHVKLAE










CLEHLQRDEQAQLGQNPRIHCPASEEPSTES










WMAQWHSEAISSPEIPKGVTVIASTNPELRRP










RSEPSNYYSSESHKDYPAPKKHKLNPEYFQTR










QEKLLPTALSLEEPNIRKQSPSSKQSVPETLSPS










EGVRDFSRELSPPTPETAAFQASGSQPVGKW










NSKDLYIGVSTVQVTGNPKGTSVGKEAAPSQ










VRPREPMSVLMMANREVVNTELGSYRDSAE










NEECGQPMDDIQVNMMTLAEHIIEATPDRIK










QENFVPMESSGLERTDPATISSTMSWLASYLA










DADCLPSAAQIRSAYNEPLTPSSNTSLSPVGSP










VSEIAFEKPNLPSAADWSEFLSASTSEKVENEF










AQLTLSDHEQRELYEAARLVQTAFRKYKGRPL










REQQEVAAAVIQRCYRKYKQYALYKKMTQAA










ILIQSKFRSYYEQKKFQQSRRAAVLIQKYYRSYK










KCGKRRQARRTAVIVQQKLRSSLLTKKQDQA










ARKIMRFLRRCRHSPLVDHRLYKRSERIEKGQ










GT




GBM-
1
57
358
372
8460
MARGYGATVSLVLLGLGLALAVIVLAVVLSRH
1
10


CUMC3297_L1





QAPCGPQAFAHAAVAADSKVCSDIGRQETAY










LLVYMKMEC




GBM-
1
169
341
484
8461
MFVYVLTPGEQSGRRLPGQTWLMFSCFCFSL
4
9


CUMC3342_L1





QDNSFSSTTVTECDEDPVSLHEDQTDCSSLRD










ENNKENYPDAGALVEEHAPPSWEPQQQNVE










ATVLVDSVLRPSMGNFKSRKPKSIFKAESGRS










HGESQETEHVVSSQSECQVRAGTPAHESPQN










NAFKCQETVRLQPRSRKDSLESDSSTAIIPHELI










RTRQLESVHLKFNQESGALIPLCLRGRLLHGRH










FTYKSITGDMAITFVSTGVEGAFATEEHPYAA










HGPWLQILLTEEFVEKMLEDLEDLTSPEEFKLP










KEYSWPEKKLKVSILPDVVFDSPLH




G17500.
1
2661
303
468
8462
MHSAGTPGLSSRRTGNSTSFQPGPPPPPRLLL
16
7


TCGA-





LLLLLLSLVSRVPAQPAAFGRALLSPGLAGAAG




27-





VPAEEAIVLANRGLRVPFGREVWLDPLHDLVL




1831-





QVQPGDRCAVSVLDNDALAQRPGRLSPKRFP




01A-





CDFGPGEVRYSHLGARSPSRDRVRLQLRYDAP




01R-





GGAVVLPLVLEVEVVFTQLEVVTRNLPLVVEEL




1850-





LGTSNALDARSLEFAFQPETEECRVGILSGLGA




01.2





LPRYGELLHYPQVPGGAREGGAPETLLMDCK










AFQELGVRYRHTAASRSPNRDWIPMVVELRS










RGAPVGSPALKREHFQVLVRIRGGAENTAPKP










SFVAMMMMEVDQFVLTALTPDMLAAEDAE










SPSDLLIFNLTSPFQPGQGYLVSTDDRSLPLSSF










TQRDLRLLKIAYQPPSEDSDQERLFELELEVVD










LEGAASDPFAFMVVVKPMNTMAPVVTRNTG










LILYEGQSRPLTGPAGSGPQNLVISDEDDLEAV










RLEVVAGLRHGHLVILGASSGSSAPKSFTVAEL










AAGQVVYQHDDRDGSLSDNLVLRMVDGGG










RHQVQFLFPITLVPVDDQPPVLNANTGLTLAE










GETVPILPLSLSATDMDSDDSLLLFVLESPFLTT










GHLLLRQTHPPHEKQELLRGLWRKEGAFYERT










VTEWQQQDITEGRLFYRHSGPHSPGPVTDQF










TFRVQDNHDPPNQSGLQRFVIRIHPVDRLPPE










LGSGCPLRMVVQESQLTPLRKKWLRYTDLDT










DDRELRYTVTQSPTDTDENHLPAPLGTLVLTD










NPSVVVTHFTQAQINHHKIAYRPPGQELGVA










TRVAQFQFQVEDRAGNVAPGTFTLYLHPVDN










QPPEILNTGFTIQEKGHHILSETELHVNDVDTD










VAHISFTLTQAPKHGHMRVSGQILHVGGLFHL










EDIKQGRVSYAHNGDKSLTDSCSLEVSDRHHV










VPITLRVNVRPVDDEVPILSHPTGTLESYLDVLE










NGATEITANVIKGTNEETDDLMLTFLLEDPPLY










GEILVNGIPAEQFTQRDILEGSVVYTHTSGEIGL










LPKADSFNLSLSDMSQEWRIGGNTIQGVTIW










VTILPVDSQAPEIFVGEQLIVMEGDKSVITSVHI










SAEDVDSLNDDILCTIVIQPTSGYVENISPAPGS










EKSRAGIAISAFNLKDLRQGHINYVQSVHKGV










EPVEDRFVFRCSDGINFSERQFFPIVIIPTNDEQ










PEMFMREFMVMEGMSLVIDTPILNAADADV










PLDDLTFTITQFPTHGHIMNQLINGTVLVESFT










LDQIIESSSIIYEHDDSETQEDSFVIKLTDGKHSV










EKTVLIIVIPVDDETPRMTINNGLEIEIGDTKIIN










NKILMATDLDSEDKSLVYIIRYGPGHGLLQRRK










PTGAFENITLGMNFTQDEVDRNLIQYVHLGQ










EGIRDLIKFDVTDGINPLIDRYFYVSIGSIDIVFP










DVISKGVSLKEGGKVTLTTDLLSTSDLNSPDEN










LVFTITRAPMRGHLECTDQPGVSITSFTQLQLA










GNKIYYIHTADDEVKMDSFEFQVTDGRNPVF










RTFRISISDVDNKKPVVTIHKLVVSESENKLITPF










ELTVEDRDTPDKLLKFTITQVPIHGHLLFNNTR










PVMVFTKQDLNENLISYKHDGTESSEDSFSFT










VTDGTHTDFYVFPDTVFETRRPQVMKIQVLA










VDNSVPQIAVNKGASTLRTLATGHLGFMITSKI










LKVEDRDSLHISLRFIVTEAPQHGYLLNLDKGN










HSITQFTQADIDDMKICYVLREGANATSDMFY










FAVEDGGGNKLTYQNFRLNWAWISFEKEYYL










VNEDSKFLDVVLKRRGYLGETSFISIGTRDRTA










EKDKDFKGKAQKQVQFNPGQTRATWRVRILS










DGEHEQSETFQVVLSEPVLAALEFPTVAIVEIV










DPGDEPTVFIPQSKYSVEEDVGELFIPIRRSGD










VSQELMVVCYTQQGTATGIVPTSVLSYSDYIS










RPEDHTSVVRFDKDEREKLCRIVIIDDSLYEEEE










TFHVLLSMPMGGRIGSEFPGAQVTIVPDKDD










EPIFYFGDVEYSVDESAGYVEVQVWRTGTDLS










KSSSVTVRSRKTDPPSADAGTDYVGISRNLDF










APGVNMQPVRVVILDDLGQPALEGIEKFELVL










RMPMNAALGEPSKATVSINDSVSDLPKMQFK










ERIYTGSESDGQIVTMIHRTGDVQYRSSVRCY










TRQGSAQVMMDFEERPNTDTSIITFLPGETEK










PCILELMDDVLYEEVEELRLVLGTPQSNSPFGA










AVGEQNETLIRIRDDADKTVIKFGETKFSVTEP










KEPGESVVIRIPVIRQGDTSKVSIVRVHTKDGS










ATSGEDYHPVSEEIEFKEGETQHVVEIEVTFDG










VREMREAFTVHLKPDENMIAEMQLTKAIVYIE










EMSSMADVTFPSVPQIVSLLMYDDTSKAKES










AEPMSGYPVICITACNPKYSDYDKTGSICASEN










INDTLTRYRWLISAPAGPDGVTSPMREVDFDT










FFTSSKMVTLDSIYFQPGSRVQCAARAVNTNG










DEGLELMSPIVTISREEGLCQPRVPGVVGAEPF










SAKLRYTGPEDADYTNLIKLTVTMPHIDGMLP










VISTRELSNFELTLSPDGTRVGNHKCSNLLDYT










EVKTHYGFLTDATKNPEIIGETYPYQYSLSIRGS










TTLRFYRNLNLEACLWEFVSYYDMSELLADCG










GTIGTDGQVDVKLDPKDLRIDTFRAKGAGGQ










HVNKTDSAVRLVHIPTGLVVECQQERSQIKNK










EIAFRVLRARLYQQIIEKDKRQQQSARKLQVG










TRAQSERIRTYNFTQDRVSDHRIAYEVRDIKA










QSHSTGGSRDPAHSTFLSLDSVRSPGILIMTSS










VRNFYVVGRAWIS




G17212.
1
304
415
655
8463
MAIDRRREAAGGGPGRQPAPAEENGSLPPG
3
4


TCGA-





DAAASAPLGGRAGPGGGAEIQPLPPLHPGGG




06-





PHPSCCSAAAAPSLLLLDYDGSVLPFLGGLGG




0129-





GYQKTLVLLTWIPALFIGFSQFSDSFLLDQPNF




01A-





WCRGAGKGTELAGVTTTGRGGDMGNWTSL




01R-





PTTPFATAPWEAAGNRSNSSGADGGDTPPLP




1849-





SPPDKGDNASNCDCRAWDYGIRAGLVQNVV




01.2





SKWDLVCDNAWKVHIAKFSLLVGLIFGYLITG










CIADWVGRRPVLLFSIIFILIFGLTVALSVNVTM










FSTLRFFEGFCLAGIILTLYALRIDILKLVAAQVG










SQWKDIYQFLCNASEREVAAFSNGYTADHER










AYAALQHWTIRGPEASLAQLISALRQHRRND










VVEKIRGLMEDTTQLETDKLALPMSPSPLSPSP










IPSPNAKLENSALLTVEPSPQDKNKGFFVDESE










PLLRCDSTSSGSSALSRNGSFITKEKKDTVLRQ










VRLDPCDLQPIFDDMLHFLNPEELRVIEEIPQA










EDKLDRLFEIIGVKSQEASQTLLDSVYSHLPDLL




GBM-
1
4
641
1066
8464
MENTDRKVSSLHTSRVQRQMVVSVHDLPEKS
2
16


CUMC3322_L1





FVPLLDSKYVLCVWDIWQPSGPQKVLICESQV










TCCCLSPLKAFLLFAGTAHGSVVVWDLREDSR










LHYSVTLSDGFWTFRTATFSTDGILTSVNHRSP










LQAVEPISTSVHKKQSFVLSPFSTQEEMSGLSF










HIASLDESGVLNVWVVVELPKADIAGSISDLGL










MPGGRVKLVHSALIQLGDSLSHKGNEFWGTT










QTLNVKFLPSDPNHFIIGTDMGLISHGTRQDL










RVAPKLFKPQQHGIRPVKVNVIDFSPFGEPIFL










AGCSDGSIRLHQLSSAFPLLQWDSSTDSHAVT










GLQWSPTRPAVFLVQDDTSNIYIWDLLQSDL










GPVAKQQVSPNRLVAMAAVGEPEKAGGSFL










ALVLARASGSIDIQHLKRRWAAPEVDECNRLR










LLLQEALWPEGKLHK




G17675.
1
250
13
204
8465
MTCPRNVTPNSYAEPLAAPGGGERYSRSAG
1
2


TCGA-





MYMQSGSDFNCGVMRGCGLAPSLSKRDEGS




19-





SPSLALNTVPSYLSQLDSWGDPKAAYRLEQPV




2624-





GRPLSSCSYPPSVKEENVCCMYSAEKRAKSGP




01A-





EAALYSHPLPESCLGEHEVPVPSYYRASPSYSA




01R-





LDKTPHCSGANDFEAPFEQRASLNPRAEHLES




1850-





PQLGGKVSFPETPKSDSQTPSPNEIKTEQSLAG




01.2





PKGSPSESEKERAKAADSSPDTSDNEAKGKYIA










STQRPDGTWRKQRRVKEGYVPQEEVPVYEN










KYVKFFKSKPELPPGLSPEATAPVTPSRPEGGE










PGLSKTAKRNLKRKEKRRQQQEKGEAEALSRT










LDKVSLEETAQLPSAPQGSRAAPTAASDQPDS










AATTEKAKKIKNLKKKLRQVEELQQRIQAGEVS










QPSKEQLEKLARRRALEEELEDLELGL




G17803.
1
359
14
425
8466
MAAEKQVPGGGGGGGSGGGGGSGGGGSG
9
2


TCGA-





GGRGAGGEENKENERPSAGSKANKEFGDSLS




76-





LEILQIIKESQQQHGLRHGDFQRYRGYCSRRQ




4925-





RRLRKTLNFKMGNRHKFTGKKVTEELLTDNRY




01A-





LLLVLMDAERAWSYAMQLKQEANTEPRKRF




01R-





HLLSRLRKAVKHAEELERLCESNRVDAKTKLEA




1850-





QAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIY




01.4





EKLASAFTEEQAVLYNQRVEEISPNIRYCAYNI










GDQSAINELMQMRLRSGGTEGLLAEKLEALIT










QTRAKQAATMSEVEWRGRTVPVKIDKVRIFLL










GLADNEAAIVQAESEETKERLFESMLSECRDAI










QVVREELKPDQPLISRNYKGDVAMSKIEHFM










PLLVQREEEGALAPLLSHGQVHFLWIKHSNLY










LVATTSKNANASLVYSFLYKTIEVFCEYFKELEE










ESIRDNFVIVYELLDELMDFGFPQTTDSKILQEY










ITQQSNKLETGKSRVPPTVTNAVSWRSEGIKY










KKNEVFIDVIESVNLLVNANGSVLLSEIVGTIKL










KVFLSGMPELRLGLNDRVLFELTGLSGSKNKS










VELEDVKFHQCVRLSRFDNDRTISFIPPDGDFE










LMSYRLSTQVKPLIWIESVIEKFSHSRVEIMVK










AKGQFKKQSVANGVEISVPVPSDADSPRFKTS










VGSAKYVPERNVVIWSIKSFPGGKEYLMRAHF










GLPSVEKEEVEGRPPIGVKFEIPYFTVSGIQVRY










MKIIEKSGYQALPWVRYITQSGDYQLRTS




G17796.
1
211
100
350
8467
MSLLRSLRVFLVARTGSYPAGSLLRQSPQPRH
6
2


TCGA-





TFYAGPRLSASASSKELLMKLRRKTGYSFVNCK




41-





KALETCGGDLKQAEIWLHKEAQKEGWSKAAK




5651-





LQRKTKEGLIGLLQEGNTTVLVEVNCETDFV




01A-





SRNLKFQLLVQQVALGTMMHCQTLKDQPSA




01R-





YSKVQWLTPVNLALWEAEAGGSLEGFLNSSEL




1850-





SGLPAGPDREGSLKDQLALAIDSTSAYSSLLTF




01.4





HLSTPRSHHLYHARLWLHVLPTLPGTLCLRIFR










WGPRRRRQGSRTLLAEHHITNLGWHTLTLPS










SGLRGEKSGVLKLQLDCRPLEGNSTVTGQPRR










LLDTAGHQQPFLELKIRANEPGAGRARRRTPT










CEPATPLCCRRDHYVDFQELGWRDWILQPEG










YQLNYCSGQCPPHLAGSPGIAASFHSAVFSLLK










ANNPWPASTSCCVPTARRPLSLLYLDHNGNV










VKTDVPDMVVEACGCS




G17663.
1
140
585
1374
8468
MASLRRVKVLLVLNLIAVAGFVLFLAKCRPIAV
2
5


TCGA-





RSGDAFHEIRPRAEVANLSAHSASPIQDAVLK




19-





RLSLLEDIVYRQLNGLSKSLGLIEGYGGRGKGG




2619-





LPATLSPAEEEKAKGPHEKYGYNSYLSEKISLDR




01A-





SIPDYRPTKFVIGREKPGQVSEVAQLISQTLEQ




01R-





ERRQRELLEQHYAQYDADDDENTVAELQGM




1850-





SGNCNNNNNYFLKTGEYATDEEEDEVGPVLP




01.2





GSDMAIEVFELPENEDMFSPSELDTSKLSHKF










KELQIKHAVTEAEIQKLKTKLQAAENEKVRWE










LEKTQLQQNIEENKERMLKLESYWIEAQTLCH










TVNEHLKETQSQYQALEKKYNKAKKLIKDFQQ










KELDFIKRQEAERKKIEDLEKAHLVEVQGLQVR










IRDLEAEVFRLLKQNGTQVNNNNNIFERRTSL










GEVSKGDTMENLDGKQTSCQDGLSQDLNEA










VPETERLDSKALKTRAQLSVKNRRQRPSRTRLY










DSVSSTDGEDSLERKPSNSFYNHMHITKLLPPK










GLRTSSPESDSGVPPLTPVDSNVPFSSDHIAEF










QEEPLDPEMGPLSSMWGDTSLFSTSKSDHDV










EESPCHHQTTNKKILREKDDAKDPKSLRASSSL










AVQGGKIKRKFVDLGAPLRRNSSKGKKWKEK










EKEASRFSAGSRIFRGRLENWTPKPCSTAQTST










RSPCMPFSWFNDS6RKGSYSFRNLPAPTSSLQP










SPETLISDKKGSKVENTWITKANKRNPNPSSSS










IFGRHSQLMSVVWIQETNNFTFNDDFSPSSTS










SADLSGLGAEPKTPGLSQSLALSSDEILDDGQS










PKHSQCQNRAVQEWSVQQVSHWLMSLNLE










QYVSEFSAQNITGEQLLQLDGNKLKALGMTA










SQDRAVVKKKLKEMKMSLEKARKAQEKMEK










QREKLRRKEQEQMQRKSKKTEKMTSTTAEGA










GEQ




G17207.
1
2
205
925
8469
MQSTQAHENSRDSRLAWMGTWEHLVSTGF
1
8


TCGA-





NQMREREVKLWDTRFFSSALASLTLDTSLGCL




06-





VPLLDPDSGLLVLAGKGERQLYCYEVVPQQPA




0156-





LSPVTQCVLESVLRGAALVPRQALAVMSCEVL




01A-





RVLQLSDTAIVPIGYHVPRKAVEFHEDLFPDTA




03R-





GCVPATDPHSWWAGDNQQVQKVSLNPACR




1849-





PHPSFTSCLVPPAEPLPDTAQPAVMETPVGD




01.2





ADASEGFSSPPSSLTSPSTPSSLGPSLSSTSGIGT










SPSLRSLQSLLGPSSKFRHAQGTVLHRDSHITN










LKGLNLTTPGESDGFCANKLRVAVPLLSSGGQ










VAVLELRKPGRLPDTALPTLQNGAAVTDLAW










DPFDPHRLAVAGEDARIRLWRVPAEGLEEVLT










TPETVLTGHTEKICSLRFHPLAANVLASSSYDLT










VRIWDLQAGADRLKLQGHQDQIFSLAWSPD










GQQLATVCKDGRVRVYRPRSGPEPLQEGPGP










KGGRGARIVWVCDGRCLLVSGFDSQSERQLL










LYEAEALAGGPLAVLGLDVAPSTLLPSYDPDTG










LVLLTGKGDTRVFLYELLPESPFFLECNSFTSPD










PHKGLVLLPKTECDVREVELMRCLRLRQSSLEP










VAFRLPRVRKEFFQDDVFPDTAVIWEPVLSAE










AWLQGANGQPWLLSLQPPDMSPVSQAPRE










APARRAPSSAQYLEEKSDQQKKEELLNAMVA










KLGNREDPLPQDSFEGVDEDEWD




G17802.
1
68
190
225
8470
MGETMSKRLKLHLGGEAEMEERAFVNPFPDY
1
6


TCGA-





EAAAGALLASGAAEETGCVRPPATTDEPGLPF




28-





HQDGKDAFIGFGGNVIRQQVKDNAKWYITD




5208-





FVELLGELEE




01A-










01R-










1850-










01.4










G17485.
1
17
156
569
8471
MPQSKSRKIAILGYRSVASGLSSSPSTPTQVTK
1
4


TCGA-





QHTFPLESYKHEPERLENRIYASSSPPDTGQRF




14-





CPSSFQSPTRPPLASPTHYAPSKAAALAAALGP




1402-





AEAGMLEKLEFEDEAVEDSESGVYMRFMRSH




02A-





KCYDIVPTSSKLVVFDTTLQVKKAFFALVANGV




01R-





RAAPLWESKKQSFVGMLTITDFINILHRYYKSP




2005-





MVQIYELEEHKIETWRELYLQETFKPLVNISPD




01.2





ASLFDAVYSLIKNKIHRLPVIDPISGNALYILTHK










RILKFLQLFMSDMPKPAFMKQNLDELGIGTYH










NIAFIHPDTPIIKALNIFVERRISALPVVDESGKV










VDIYSKFDVINLAAEKTYNNLDITVTQALQHRS










QYFEGVVKCNKLEILETIVDRIVRAEVHRLVVV










NEADSIVGIISLSDILQALILTPAGAKQKETETE




NYU_E
1
252
68
211
8472
MGAGPSLLLAALLLLLSGDGAVRCDTPANCTY
5
2








LDLLGTWVFQVGSSGSQRDVNCSVMGPQEK










KVVVYLQKLDTAYDDLGNSGHFTIIYNQGFEIV










LNDYKWFAFFKYKEEGSKVTTYCNETMTGWV










HDVLGRNWACFTGKKVGTASENVYVNIAHLK










NSQEKYSNRLYKYDHNFVKAINAIQKSWTATT










YMEYETLTLGDMIRRSGGHSRKIPRPKPAPLT










AEIQQKILHLPTSWDWRNVHGINFVSPVRNQ










GQERFGNMTRVYYREAMGAFIVFDVTRPATF










EAVAKWKNDLDSKLSLPNGKPVSVVLLANKC










DQGKDVLMNNGLKMDQFCKEHGFVGWFET










SAKENINIDEASRCLVKHILANECDLMESIEPD










VVKPHLTSTKVASCSGCAKS




G17212.
1
1367
15
338
8473
MAERGLEPSPAAVAALPPEVRAQLAELELELS
34
2


TCGA-





EGDITQKGYEKKRSKLLSPYSPQTQETDSAVQ




06-





KELRNQTPAPSAAQTSAPSKYHRTRSGGARD




0129-





ERYRSDIHTEAVQAALAKHKEQKMALPMPTK




01A-





RRSTFVQSPADACTPPDTSSASEDEGSLRRQA




01R-





ALSAALQQSLQNAESWINRSIQGSSTSSSASST




1849-





LSHGEVKGTSGSLADVFANTRIENFSAPPDVT




01.2





TTTSSSSSSSSIRPANIDLPPSGIVKGMHKGSN










RSSLMDTADGVPVSSRVSTKIQQLLNTLKRPK










RPPLKEFFVDDSEEIVEVPQPDPNQPKPEGRQ










MTPVKGEPLGVICNWPPALESALQRWGTTQ










AKCSCLTALDMTGKPVYTLTYGKLWSRSLKLA










YTLLNKLGTKNEPVLKPGDRVALVYPNNDPV










MFMVAFYGCLLAEVIPVPIEVPLTRKDAGGQ










QIGFLLGSCGIALALTSEVCLKGLPKTQNGEIV










QFKGWPRLKWVVTDSKYLSKPPKDWQPHISP










AGTEPAYIEYKTSKEGSVMGVTVSRLAMLSHC










QALSQACNYSEGETIVNVLDFKKDAGLWHG










MFANVMNKMHTISVPYSVMKTCPLSWVQR










VHAHKAKVALVKCRDLHWAMMAHRDQRD










VSLSSLRMLIVTDGANPWSVSSCDAFLSLFQS










HGLKPEAICPCATSAEAMTVAIRRPGVPGAPL










PGRAILSMNGLSYGVIRVNTEDKNSALTVQDV










GHVMPGGMMCIVKPDGPPQLCKTDEIGEICV










SSRTGGMMYFGLAGVTKNTFEVIPVNSAGSP










VGDVPFIRSGLLGFVGPGSLVFVVGKMDGLL










MVSGRRHNADDIVATGLAVESIKTVYRGRIAV










FSVSVFYDERIVVVAEQRPDASEEDSFQWMS










RVLQAIDSIHQVGVYCLALVPANTLPKTPLGGI










HISQTKQLFLEGSLHPCNILMCPHTCVTNLPKP










RQKQPGVGPASVMVGNLVAGKRIAQAAGRD










LGQIEENDLVRKHQFLAEILQWRAQATPDHV










LFMLLNAKGTTVCTASCLQLHKRAERIASVLG










DKGHLNAGDNVVLLYPPGIELIAAFYGCLYAG










CIPVTVRPPHAQNLTATLPTVRMIVDVSKAAC










ILTSQTLMRLLRSREAAAAVDVKTWPTIIDTDD










LPRKRLPQLYKPPTPEMLAYLDFSVSTTGMLT










GVKMSHSAVNALCRAIKLQCELYSSRQIAICLD










PYCGLGFALWCLCSVYSGHQSVLIPPMELENN










LFLWLSIVNQYKIRDTFCSYSVMELCTKGLGN










QVEVLKTRGINLSCVRTCVVVAEERPRVALQQ










SFSKLFKDIGLSPRAVSTTFGSRVNVAICLQGTS










GPDPTTVYVDLKSLRHDRVRLVERGAPQSLLL










SESGKRSGSKQSTNPADNYHLARRRTLQVVVS










SLLTEAGFESAEKASVETLTEMLQSYISEIGRSA










KSYCEHTARTQPTLSDIVVTLVEMGFNVDTLP










AYAKRSQRMVITAPPVTNQPVTPKALTAGQN










RPHPPHIPSHFPEFPDPHTYIKTPTYREPVSDY










QVLREKAASQRRDVERALTRFMAKTGETQSL










FKDDVSTFPLIAARPFTIPYLTALLPSELEMQQ










MEETDSSEQDEQTDTENLALHISMIESRSVTQ










AGVQWQDLGSLQPPPPGFKRFSSLSLLSSWN










YRRILEPRRRTPLSCSRTPPCRVAGMGRRTSSI










TLICGR




G17787.
1
337
376
601
8474
MEPPAAKRSRGCPAGPEERDAGAGAARGRG
5
7


TCGA-





RPEALLDLSAKRVAESWAFEQVEERFSRVPEP




26-





VQKRIVFWSFPRSEREICMYSSLGYPPPEGEH




5139-





DARVPFTRGLHLLQSGAVDRVLQVGFHLSGNI




01A-





REPGSPGEPERLYHVSISFDRCKITSVSCGCDN




01R-





RDLFYCAHVVALSLYRIRHAHQVELRLPISETLS




1850-





QMNRDQLQKFVQYLISAHHTEVLPTAQRLAD




01.4





EILLLGSEINLVNGAPDPTAGAGIEDANCWHL










DEEQIQEQVKQLLSNGGYYGASQQLRSMFSK










VREMLRMRDSNGARMLILMTEQFLQDTRLA










LWRQQGAGMTDKCRQLWDELGMFNSPEM










QALLQQISENPQLMQNVISAPYMRSMMQTL










AQNPDFAAQMMVNVPLFAGNPQLQEQLRL










QLPVFLQQMQNPESLSILTNPRAMQALLQIQ










QGLQTLQTEAPGLVPSLGSFGISRTPAPSAGS










NAGSTPEAPTSSPATPATSSPTGASSAQQQL










MQQMIQLLAGSGNSQVQTPEVRFQQQLEQL










NSMGFINREANLQALIATGGDINAAIERLLGS










QLS




G17675.
1
473
82
514
8475
MDVLPTGGGRPGLRTELEFRGGGGEARLESQ
8
2


TCGA-





EEETIPAAPPAPRLRGAAERPRRSRDTWDGDE




19-





DTEPGEACGGRTSRTASLVSGLLNELYSCTEEE




2624-





EAAGGGRGAEGRRRRRDSLDSSTEASGSDVV




01A-





LGGRSGAGDSRVLQELQERPSQRHQMLYLR




01R-





QKDANELKTILRELKYRIGIQSAKLLRHLKQKDR




1850-





LLHKVQRNCDIVTACLQAVSQKRRVDTKLKFT




01.2





LEPSLGQNGFQQWYDALKAVARLSTGIPKEW










RRKVWLTLADHYLHSIAIDWDKTMRFTFNER










SNPDDDSMGIQIVKDLHRTGCSSYCGQEAEQ










DRVVLKRVLLAYARWNKTVGYCQGFNILAALI










LEVMEGNEGDALKIMIYLIDKVLPESYFVNNLR










ALSVDMAVFRDLLRMKLPELSQHLDTLQRTA










NKESGGGYEPPLTNVFTMQWFLTLFATCLPN










QTVLKIWDSVFFEGSEIILRVSLAIWAKLGEVIN










AGKSTHNEDQASCEVLTVKKKAGAVTSTPNR










NSSKRRSSLPNGEGLQLKENSESEGVSCHYWS










LFDGHAGSGAAVVASRLLQHHITEQLQDIVDI










LKNSAVLPPTCLGEEPENTPANSRTLTRAASLR










GGVGAPGSPSTPPTRFFTEKKIPHECLVIGALE










SAFKEMDLQIERERSSYNISGGCTALIVICLLGK










LYVANAGDSRAIIIRNGEIIPMSSEFTPETERQR










LQYLAFMQPHLLGNEFTHLEFPRRVQRKELGK










KMLYRDFNMTGWAYKTIEDEDLKFPLIYGEGK










KARVMATIGVTRGLGDHDLKVHDSNIYIKPFL










SSAPEVRIYDLSKYDHGSDDVLILATDGLWDVL










SNEEVAEAITQFLPNCDPDDPHRYTLAAQDLV










MRARGVLKDRGWRISNDRLGSGDDISVYVIPL










IHGNKLS




G17800.
1
318
97
104
8476
MDNMSITNTPTSNDACLSIVHSLMCHRQGG
7
2


TCGA-





ESETFAKRAIESLVKKLKEKKDELDSLITAITTNG




06-





AHPSKCVTIQRTLDGRLQVAGRKGFPHVIYAR




5859-





LWRWPDLHKNELKHVKYCQYAFDLKCDSVCV




01A-





NPYHYERVVSPGIDLSGLTLQSNAPSSMMVK




01R-





DEYVHDFEGQPSLSTEGHSIQTIQHPPSNRAS




1849-





TETYSTPALLAPSESNATSTANFPNIPVASTSQ




01.4





PASILGGSHSEGLLQIASGPQPGQQQNGFTG










QPATYHHNSTTTWTGSRTAPYTPNLPHHQN










GHLQHHPPMPPHPGHYWPVHNELAFQPPIS










NHPGVVPEPTG




G17638.
1
116
70
211
8477
MAAAAGAPPPGPPQPPPPPPPEESSDSEPEA
3
3


TCGA-





EPGSPQKLIRKVSTSGQIRQKTIIKEGMLTKQN




28-





NSFQRSKRRYFKLRGRTLYYAKTAKSIIFDEVDL




2499-





TDASVAESSTKNVNNSFTVEVLDENNLVMNL




01A-





EFSIRETTCRKDSGEDPATCAFQRDYYVSTAVC




01R-





RSTVKVSAQQVQGVHARCSWSSSTSESYSSEE




1850-





MIFGDMLGSHKWRNNYLFGLISDESISEQFYD




01.2





RSLGIMRRVLPPGNRRYPNHRHRARINTDFE









Annotation information in each column is described below for Table 7:


sample: Name of TCGA or private sample.


#chrom5p: 5′ chromosome


#start5p: 5′ genomic start coordinate


#end5p: 5′ genomic end coordinate


#chrom3p: 3′ chromosome


#start3p: 3′ genomic start coordinate


#end3p: 3′ genomic end coordinate


strand5p: 5′ strand


strand3p: 3′ strand


genes5p: 5′ gene


genes3p: 3′ gene


total_frags (split inserts+split reads): Total number of split inserts and split reads


spanning_frags (split reads): Number of split reads


GeneBreakpoint5p: The genomic coordinate of the breakpoint in the 5′ gene


GeneBreakpoint3p: The genomic coordinate of the breakpoint in the 3′ gene


FrameType: Reading frame of gene fusions. Values include in-frame, frameshift, or null (no transcript information was found in the Ensembl Homo_sapiens.GRCh37.60.gtf file).


FusedSequence: Reconstructed sequence of the fusion RNA transcript


ProteinStart5p: The start coordinate of the 5′ protein segment


ProteinStop5p: The stop coordinate (breakpoint) of the 5′ protein segment


ProteinStart3p: The start coordinate (breakpoint) of the 3′ protein segment


ProteinStop3p: The stop coordinate of the 3′ protein segment


ProteinSequence: Reconstructed sequence of the fusion protein


ExonBreak5p: The last exon of the 5′ gene before the breakpoint


ExonBreak3p: The first exon of the 3′ gene after the breakpoint
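The column annotations above can be read as a record layout. The following is a minimal sketch of such a record, assuming a reduced subset of the Table 7 columns; the class name, helper names, inclusive protein coordinates, and the example values are hypothetical and are not taken from the table.

```python
from typing import NamedTuple, Optional, Tuple


class FusionCall(NamedTuple):
    """A reduced, illustrative view of one Table 7 row (hypothetical layout)."""
    sample: str
    genes5p: str
    genes3p: str
    total_frags: int            # split inserts + split reads
    spanning_frags: int         # split reads only
    frame_type: Optional[str]   # "in-frame", "frameshift", or None for null
    protein_start5p: int
    protein_stop5p: int
    protein_start3p: int
    protein_stop3p: int


def split_insert_count(call: FusionCall) -> int:
    # total_frags is defined as split inserts + split reads,
    # and spanning_frags as split reads, so the difference is split inserts.
    return call.total_frags - call.spanning_frags


def segment_lengths(call: FusionCall) -> Tuple[int, int]:
    # Assumption: start and stop coordinates are inclusive.
    len5p = call.protein_stop5p - call.protein_start5p + 1
    len3p = call.protein_stop3p - call.protein_start3p + 1
    return len5p, len3p


# Hypothetical example values, for illustration only.
example = FusionCall("SAMPLE-X", "EGFR", "SEPT14", 20, 8, "in-frame",
                     1, 100, 200, 350)
print(split_insert_count(example), segment_lengths(example))
```

If the coordinates in the table are not inclusive, the length helper would need to be adjusted accordingly.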


Table 8. Genomic breakpoints of gene fusions detected through whole-exome DNA sequencing. Annotation information in each column is described below:











TABLE 8
Genomic breakpoints of gene fusions detected through whole-exome DNA sequencing. Annotation information in each column is described below:

sample | split reads | gene5p | chr5p | Sense 5p | Start 5p | End 5p | Breakpoint 5p | Exon Before Breakpoint 5p
TCGA-06-0750-01A-01D-1492-08 | 51 | EGFR | chr7 | + | 55054218 | 55268221 | 55268221 | 24
TCGA-27-1837-01A-01D-1494-08 | 648 | EGFR | chr7 | + | 55086724 | 55268937 | 55268937 | 24
TCGA-28-2513-01A-01D-1494-08 | 378 | EGFR | chr7 | + | 55086724 | 55269001 | 55269001 | 24
TCGA-06-5411-01A-01D-1696-08 | 676 | NFASC | chr1 | + | 204797781 | 204951827 | 204951827 | 21
TCGA-32-5222-01A-01D-1486-08 | 0 | EGFR | chr7 | + | 55086724 | N/A | N/A | N/A
TCGA-28-5209-01A-01D-1486-08 | 0 | EGFR | chr7 | + | 55086724 | N/A | N/A | N/A

sample | split reads | Gene 3p | chr3p | Sense 3p | start3p | end3p | Breakpoint 3p | Exon After Breakpoint 3p
TCGA-06-0750-01A-01D-1492-08 | 51 | SEPT14 | chr7 |  | 55828730 | 55871487 | 55871487 | 10
TCGA-27-1837-01A-01D-1494-08 | 648 | SEPT14 | chr7 |  | 55861236 | 55870909 | 55870909 | 10
TCGA-28-2513-01A-01D-1494-08 | 378 | SEPT14 | chr7 |  | 55861236 | 55871369 | 55871369 | 10
TCGA-06-5411-01A-01D-1696-08 | 676 | NTRK1 | chr1 | + | 156844170 | 156851642 | 156844170 | 10
TCGA-32-5222-01A-01D-1486-08 | 0 | SEPT14 | chr7 |  | 55861236 | N/A | N/A | N/A
TCGA-28-5209-01A-01D-1486-08 | 0 | PSPH | chr7 |  | 56078743 | N/A | N/A | N/A

sample | split inserts | posA5p | posB5p | readDir 5p | posA3p | posB3p | readDir 3p
TCGA-06-0750-01A-01D-1492-08 | 17 | 55242336 | 55268011 | Fwd | 55871183 | 55871412 | Fwd
TCGA-27-1837-01A-01D-1494-08 | 505 | 55268346 | 55268884 | Fwd | 55870367 | 55870871 | Fwd
TCGA-28-2513-01A-01D-1494-08 | 251 | 55268377 | 55268962 | Fwd | 55871008 | 55871339 | Fwd
TCGA-06-5411-01A-01D-1696-08 | 131 | 204951788 | 204951826 | Fwd | 156844171 | 156844232 | Rev
TCGA-32-5222-01A-01D-1486-08 | 1 | 55628015 | 55628015 | Fwd | 55872287 | 55872287 | Fwd
TCGA-28-5209-01A-01D-1486-08 | 1 | 55198162 | 55198162 | Rev | 56087184 | 56087184 | Fwd





Annotation information in each column is described below for Table 8:


sample: Name of TCGA sample


split reads: Total number of split reads


gene5p: 5′ gene


chr5p: 5′ chromosome


sense5p: 5′ sense


start5p: 5′ genomic start coordinate


end5p: 5′ genomic end coordinate


breakpoint5p: 5′ genomic coordinate of breakpoint


exonBeforeBreakpoint5p: Exon number of 5′ gene before the breakpoint


gene3p: 3′ gene


chr3p: 3′ chromosome


sense3p: 3′ sense


start3p: 3′ genomic start coordinate


end3p: 3′ genomic end coordinate


breakpoint3p: 3′ genomic coordinate of breakpoint


exonAfterBreakpoint3p: Exon number of 3′ gene after the breakpoint


split inserts: Total number of split inserts


posA5p: Coordinate of split insert read closest to 5′ end in 5′ gene


posB5p: Coordinate of split insert read closest to 3′ end in 5′ gene


readDir5p: Read direction of split insert reads in 5′ gene


posA3p: Coordinate of split insert read closest to 5′ end in 3′ gene


posB3p: Coordinate of split insert read closest to 3′ end in 3′ gene


readDir3p: Read direction of split insert reads in 3′ gene













TABLE 11
Relative expression of EGFR fusion and wild-type transcripts. Expression is estimated using the depth of reads covering the fusion breakpoint or wild-type exon junctions excluded from the fusion transcript. These wild-type exons include exons 25-26, 26-27, and 27-28.

Sample | Fusion Bp | Exon 25-26 | Exon 26-27 | Exon 27-28

EGFR-SEPT14
TCGA-28-2513 | 1464 | 21 | 12 | 25
TCGA-27-1837 | 796 | 6 | 5 | 6
TCGA-06-0750 | 414 | 69 | 61 | 101
TCGA-32-5222 | 495 | 256 | 190 | 348
TCGA-28-1747 | 142 | 426 | 300 | 502
TCGA-06-2557 | 13 | 1031 | 657 | 1254

EGFR-PSPH
TCGA-28-5209 | 5648 | 216 | 122 | 232
TCGA-06-5408 | 37 | 232 | 200 | 292
TCGA-28-5215 | 28 | 29 | 26 | 44
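The read depths in Table 11 can be condensed into a single fusion-to-wild-type ratio per sample. The sketch below is one possible summary, assuming the wild-type level is taken as the mean depth over the three excluded exon junctions; this ratio and the function name are illustrative assumptions and are not a metric defined in the specification.

```python
from statistics import mean

# Read depths copied from Table 11: (fusion breakpoint, exon 25-26, 26-27, 27-28).
TABLE_11 = {
    "TCGA-28-2513": (1464, 21, 12, 25),
    "TCGA-27-1837": (796, 6, 5, 6),
    "TCGA-06-0750": (414, 69, 61, 101),
    "TCGA-32-5222": (495, 256, 190, 348),
    "TCGA-28-1747": (142, 426, 300, 502),
    "TCGA-06-2557": (13, 1031, 657, 1254),
    "TCGA-28-5209": (5648, 216, 122, 232),
    "TCGA-06-5408": (37, 232, 200, 292),
    "TCGA-28-5215": (28, 29, 26, 44),
}


def fusion_to_wildtype_ratio(fusion_depth, wildtype_depths):
    """Ratio of fusion-breakpoint depth to mean wild-type junction depth.

    Assumption: the mean of the wild-type junction depths is used as the
    wild-type expression estimate.
    """
    return fusion_depth / mean(wildtype_depths)


ratios = {
    sample: fusion_to_wildtype_ratio(row[0], row[1:])
    for sample, row in TABLE_11.items()
}

for sample, ratio in ratios.items():
    print(f"{sample}: {ratio:.2f}")
```

A ratio above 1 indicates that reads covering the fusion breakpoint outnumber reads covering the wild-type junctions under this assumed summary; a ratio below 1 indicates the opposite.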
















TABLE 12
Genomic breakpoints of gene fusions detected through whole-exome DNA sequencing.

sample | split reads | Gene 5p | Chr 5p | Sense 5p | Start 5p | End 5p | Breakpoint 5p | Exon Before Breakpoint 5p
TCGA-06-0750-01A-01D-1492-08 | 51 | EGFR | chr7 | + | 55054218 | 55268221 | 55268221 | 24
TCGA-27-1837-01A-01D-1494-08 | 648 | EGFR | chr7 | + | 55086724 | 55268937 | 55268937 | 24
TCGA-28-2513-01A-01D-1494-08 | 378 | EGFR | chr7 | + | 55086724 | 55269001 | 55269001 | 21
TCGA-06-5411-01A-01D-1696-08 | 676 | NFASC | chr1 | + | 204797781 | 204951827 | 204951827 | 21
TCGA-32-5222-01A-01D-1486-08 | 0 | EGFR | chr7 | + | 55086724 | N/A | N/A | N/A
TCGA-28-5209-01A-01D-1486-08 | 0 | EGFR | chr7 | + | 55086724 | N/A | N/A | N/A

sample | Gene 3p | Chr 3p | Sense 3p | Start 3p | End 3p | Breakpoint 3p | Exon After Breakpoint 3p
TCGA-06-0750-01A-01D-1492-08 | SEPT14 | chr7 |  | 55828730 | 55871487 | 55871487 | 10
TCGA-27-1837-01A-01D-1494-08 | SEPT14 | chr7 |  | 55861236 | 55870909 | 55870909 | 10
TCGA-28-2513-01A-01D-1494-08 | SEPT14 | chr7 |  | 55861236 | 55871369 | 55871369 | 10
TCGA-06-5411-01A-01D-1696-08 | NTRK1 | chr1 | + | 156844170 | 156851642 | 156844170 | 10
TCGA-32-5222-01A-01D-1486-08 | SEPT14 | chr7 |  | 55861236 | N/A | N/A | N/A
TCGA-28-5209-01A-01D-1486-08 | PSPH | chr7 |  | 56078743 | N/A | N/A | N/A

sample | split inserts | posA 5p | posB 5p | Read Dir 5p | posA 3p | posB 3p | Read Dir 3p
TCGA-06-0750-01A-01D-1492-08 | 17 | 55242336 | 55268011 | Fwd | 55871183 | 55871412 | Fwd
TCGA-27-1837-01A-01D-1494-08 | 505 | 55268346 | 55268884 | Fwd | 55870367 | 55870871 | Fwd
TCGA-28-2513-01A-01D-1494-08 | 251 | 55268377 | 55268962 | Fwd | 55871008 | 55871339 | Fwd
TCGA-06-5411-01A-01D-1696-08 | 131 | 204951788 | 204951826 | Fwd | 156844171 | 156844232 | Fwd
TCGA-32-5222-01A-01D-1486-08 | 1 | 55268015 | 55268015 | Fwd | 55872287 | 55872287 | Fwd
TCGA-28-5209-01A-01D-1486-08 | 1 | 55198162 | 55198162 | Rev | 56087184 | 56087184 | Fwd





Annotation information in each column is described below for Table 12:


sample: Name of TCGA sample


split reads: Total number of split reads


gene5p: 5′ gene


chr5p: 5′ chromosome


sense5p: 5′ sense


start5p: 5′ genomic start coordinate


end5p: 5′ genomic end coordinate


breakpoint5p: 5′ genomic coordinate of breakpoint


exonBeforeBreakpoint5p: Exon number of 5′ gene before the breakpoint


gene3p: 3′ gene


chr3p: 3′ chromosome


sense3p: 3′ sense


start3p: 3′ genomic start coordinate


end3p: 3′ genomic end coordinate


breakpoint3p: 3′ genomic coordinate of breakpoint


exonAfterBreakpoint3p: Exon number of 3′ gene after the breakpoint


split inserts: Total number of split inserts


posA5p: Coordinate of split insert read closest to 5′ end in 5′ gene


posB5p: Coordinate of split insert read closest to 3′ end in 5′ gene


readDir5p: Read direction of split insert reads in 5′ gene


posA3p: Coordinate of split insert read closest to 5′ end in 3′ gene


posB3p: Coordinate of split insert read closest to 3′ end in 3′ gene


readDir3p: Read direction of split insert reads in 3′ gene.













TABLE 13
Analysis of the incidence of EGFR-SEPT14 and EGFR-PSPH gene fusions in GBM with or without the EGFRvIII rearrangement.

Isoform | EGFR-SEPT14 | EGFR-PSPH | Non-Fusion | Total
EGFRvIII | 1 | 1 | 14 | 16
No EGFRvIII | 5 | 2 | 64 | 71
Total | 6 | 3 | 78 | 87
















TABLE 14
Enrichment of classical/mesenchymal subtype among samples with EGFR-SEPT14 or EGFR-PSPH.

 | Classical | Mesenchymal | Proneural | Neural
EGFR Fusion | 3 | 5 | 1 | 0
No Fusion | 37 | 47 | 38 | 28
Total | 40 | 52 | 39 | 28

Fisher's p-value = 0.0500 for the enrichment of classical/mesenchymal subtype
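The reported Fisher's p-value can be checked from the counts in Table 14. The sketch below assumes a one-sided test on a 2x2 table (classical or mesenchymal versus proneural or neural, fusion versus no fusion), with non-fusion counts taken as the column totals minus the fusion counts; the choice of a one-sided test and the function name are assumptions, since the specification does not state how the test was set up.

```python
from math import comb


def hypergeom_upper_tail(k, n_drawn, n_success, n_total):
    """P(X >= k) for X successes among n_drawn items drawn without
    replacement from n_total items, n_success of which are successes.
    This is the one-sided Fisher's exact test p-value for enrichment."""
    upper = min(n_drawn, n_success)
    numerator = sum(
        comb(n_success, i) * comb(n_total - n_success, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_drawn)


# Counts from Table 14 (Total row and EGFR Fusion row).
totals = {"Classical": 40, "Mesenchymal": 52, "Proneural": 39, "Neural": 28}
fusion = {"Classical": 3, "Mesenchymal": 5, "Proneural": 1, "Neural": 0}

n_total = sum(totals.values())                           # all samples
n_cm = totals["Classical"] + totals["Mesenchymal"]       # classical/mesenchymal
n_fusion = sum(fusion.values())                          # samples with a fusion
n_cm_fusion = fusion["Classical"] + fusion["Mesenchymal"]

p_value = hypergeom_upper_tail(n_cm_fusion, n_fusion, n_cm, n_total)
print(f"one-sided Fisher's exact p = {p_value:.4f}")
```

Only the standard library is used here so that the arithmetic is visible; a statistics package could be substituted for the helper function.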













TABLE 15
Antibodies and concentrations used in immunofluorescence staining.

Antibody | Host | Dilution | Vendor
β-III Tubulin | Mouse | 1:400 | Promega
δ-Catenin | Guinea Pig | 1:500 | Acris
Fibronectin | Mouse | 1:1000 | BD-Pharmingen
Col5A1 | Rabbit | 1:200 | Santa Cruz Biotech
PSD-95 | Rabbit | 1:500 | Invitrogen
Smooth muscle actin | Mouse | 1:200 | Sigma
















TABLE 16
Antibodies and concentrations used for Western blots and immunoprecipitation assays.

Antibody | Host | Dilution | Vendor
Anti-Vinculin | Mouse | 1:400 | SIGMA
Anti-N-Cadherin | Mouse | 1:200 | BD-Pharmingen
Cyclin A | Rabbit | 1:500 | Santa Cruz Biotech
p27 | Mouse | 1:250 | BD Transduction
β-III Tubulin | Mouse | 1:400 | Promega
δ-Catenin | Guinea Pig | 1:500 | Acris
Fibronectin | Mouse | 1:1000 | BD-Pharmingen
p107 | Rabbit | 1:1000 | Santa Cruz Biotech
Nestin | Mouse | 1:500 | BD-Pharmingen
CD133 | Rabbit | 1:200 | Abcam
Sox2 | Rabbit | 1:500 | Cell Signaling
EGFR | Mouse | 1:1000 | Millipore
AKT | Rabbit | 1:1000 | Cell Signaling
pAKT-S473 | Rabbit | 1:1000 | Cell Signaling
ERK1/2 | Rabbit | 1:1000 | Cell Signaling
pERK1/2 | Rabbit | 1:1000 | Cell Signaling
STAT3 | Rabbit | 1:1000 | Santa Cruz Biotech
pSTAT3-Y705 | Rabbit | 1:1000 | Cell Signaling
LZTR1 | Rabbit | 1:1000 | Abcam
Cul3 | Rabbit | 1:1000 | Bethyl
















TABLE 17
Primers used for screening gene fusions from cDNA.

Primer | SEQ ID NO: | Sequence
hEGFR-RT-FW1 | 32 | 5′-GGGTGACTGTTTGGGAGTTGATG-3′
hSEP14-RT-REV1 | 33 | 5′-TGTTTGTCTTTCTTTGTATCGGTGC-3′
hEGFR-RT-FW1 | 87 | 5′-AGAGGTGACCACCAATCAGC-3′
hPSPH-RT-REV1 | 88 | 5′-CGTGTCCCACACAGAGACAG-3′
hNFASC-RT-FW1 | 36 | 5′-AGTTCCGTGTCATTGCCATCAAC-3′
hNTRK1-RT-REV1 | 37 | 5′-TGTTTCGTCCTTCTTCTCCACCG-3′
hCAND1-RT-FW1 | 38 | 5′-GGAAAAAATGACATCCAGCGAC-3′
hEGFR-RT-REV1 | 39 | 5′-TGGGTGTAAGAGGCTCCACAAG-3′
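When working with the primer pairs in Table 17, it can help to compute the reverse complement of a primer (the sequence of the strand it anneals to) and its GC content. The following is a minimal sketch using two sequences from the table; the function names are illustrative, and no claim is made here about annealing temperatures or primer specificity.

```python
# Complement map for unambiguous DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


# Sequences copied from Table 17 (SEQ ID NO: 32 and SEQ ID NO: 33).
PRIMERS = {
    "hEGFR-RT-FW1": "GGGTGACTGTTTGGGAGTTGATG",
    "hSEP14-RT-REV1": "TGTTTGTCTTTCTTTGTATCGGTGC",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 3), reverse_complement(seq))
```

The reverse complement of a reverse primer gives the sequence expected on the sense strand of the fusion transcript, which is the form in which it would appear in a reconstructed fused sequence.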
















TABLE 18
Primers used for genomic detection of gene fusions.

Primer | SEQ ID NO: | Sequence
genomic EGFR-FW1 | 40 | 5′-GGATGATAGACGCAGATAGTCGCC-3′
genomic SEPT14-REV1 | 41 | 5′-TCCAGTTGTTTTTTCTCTTCCTCG-3′
genomic NFASC-FW1 | 12 | 5′-TCCGAGTCCAGGCTGAAAATG-3′
genomic NTRK1-REV1 | 89 | 5′-CTACTTCCTATCTCACCCCAAAAGG-3′
genomic CAND1-FW1 | 44 | 5′-GCAATAGCAAAACAGGAAGATGTC-3′
genomic EGFR-REV1 | 45 | 5′-GAACACTTACCCATTCGTTGG-3′
















TABLE 19
Primers used for semiquantitative RT-PCR to detect exogenous Myc-LZTR1 WT and mutant LZTR1-R801W.

Primer | SEQ ID NO: | Sequence
LZTR1 FW | 90 | 5′-TCCCACATCTCAGACAAGCA-3′
His-Tag REV | 91 | 5′-TCAATGGTGATGGTGATGATG-3′
GAPDH FW | 92 | 5′-GAAGGTGAAGGTCGGAGTCAAC-3′
GAPDH REV | 93 | 5′-CAGAGTTAAAAGCAGCCCTGGT-3′








Claims
  • 1. A cDNA encoding a fusion protein comprising the tyrosine kinase domain of EGFR fused to: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein.
  • 2. The cDNA of claim 1, wherein the Septin protein is Septin-1, Septin-2, Septin-3, Septin-4, Septin-5, Septin-6, Septin-7, Septin-8, Septin-9, Septin-10, Septin-11, Septin-12, Septin-13, or Septin-14.
  • 3. The cDNA of claim 1, wherein the Septin protein is Septin-14 (SEPT14).
  • 4. The cDNA of claim 1, wherein the CAND protein is CAND1, CAND2, or CAND3.
  • 5. The cDNA of claim 1, wherein the CAND protein is CAND1.
  • 6. The cDNA of claim 1, wherein the fusion protein is EGFR-SEPT14, EGFR-CAND1, or EGFR-PSPH.
  • 7. The cDNA of claim 1, wherein the cDNA comprising the tyrosine kinase domain of EGFR fused to the coiled-coil domain of a Septin protein comprises SEQ ID NO: 2, or SEQ ID NO: 4 or has a genomic breakpoint comprising SEQ ID NO: 4.
  • 8. The cDNA of claim 1, wherein the cDNA comprising the tyrosine kinase domain of EGFR fused to a PSPH protein comprises SEQ ID NO: 8 or SEQ ID NO: 10 or has a genomic breakpoint comprising SEQ ID NO: 10.
  • 9. The cDNA of claim 1, wherein the cDNA comprising the tyrosine kinase domain of EGFR fused to a CAND protein comprises SEQ ID NO: 14 or SEQ ID NO: 15, or has a genomic breakpoint comprising SEQ ID NO: 15.
  • 10. A purified fusion protein comprising the tyrosine kinase domain of EGFR fused to: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein.
  • 11. The purified fusion protein of claim 10, wherein the fusion protein is EGFR-SEPT14, EGFR-CAND1, or EGFR-PSPH.
  • 12. The purified fusion protein of claim 10, wherein the fusion protein comprises SEQ ID NO: 1, 5, 7, 11, 13, or 8495.
  • 13. An antibody or antigen-binding fragment thereof, that specifically binds to a purified fusion protein comprising a tyrosine kinase domain of an EGFR protein fused to a polypeptide comprising: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein.
  • 14. The antibody or antigen-binding fragment of claim 13, wherein the Septin protein is SEPT14.
  • 15. The antibody or antigen-binding fragment of claim 13, wherein the CAND protein is CAND1.
  • 16. The antibody or antigen-binding fragment of claim 13, wherein the fusion protein is EGFR-SEPT14, EGFR-CAND1, or EGFR-PSPH.
  • 17. The antibody or antigen-binding fragment of claim 13, wherein the EGFR-SEPT14 fusion protein comprises the amino acid sequence of SEQ ID NO: 1, 3, or 5.
  • 18. The antibody or antigen-binding fragment of claim 13, wherein the EGFR-CAND1 fusion protein comprises the amino acid sequence of SEQ ID NO: 13, 16, or 8495.
  • 19. The antibody or antigen-binding fragment of claim 13, wherein the EGFR-PSPH fusion protein comprises the amino acid sequence of SEQ ID NO: 7, 9, or 11.
  • 20. A composition for decreasing the expression level or activity of a fusion protein in a subject comprising the tyrosine kinase domain of an EGFR protein fused to a polypeptide comprising: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein;
  • 21. The composition of claim 20, wherein the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein; an siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND fusion gene; or a combination thereof.
  • 22. The composition of claim 20, wherein the fusion protein is EGFR-SEPT14, EGFR-CAND1, or EGFR-PSPH.
  • 23. The composition of claim 21, wherein the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.
  • 24. A method of decreasing growth of a solid tumor in a subject in need thereof, the method comprising administering to the subject an effective amount of an EGFR fusion molecule inhibitor, wherein the inhibitor decreases the size of the solid tumor, and wherein the EGFR fusion comprises the tyrosine kinase domain of EGFR fused to: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein.
  • 25. The method of claim 24, wherein the solid tumor comprises glioblastoma multiforme, breast cancer, lung cancer, prostate cancer, or colorectal carcinoma.
  • 26. The method of claim 24, wherein the inhibitor comprises an antibody that specifically binds to an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, an EGFR-CAND fusion protein, or a fragment thereof; a small molecule that specifically binds to an EGFR protein; an antisense RNA or antisense DNA that decreases expression of an EGFR-SEPT fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND fusion protein; an siRNA that specifically targets an EGFR-SEPT fusion gene, an EGFR-PSPH fusion gene, or an EGFR-CAND fusion gene; or a combination thereof.
  • 27. The method of claim 24, wherein the fusion protein is EGFR-SEPT14, EGFR-CAND1, or EGFR-PSPH.
  • 28. The method of claim 26, wherein the small molecule that specifically binds to an EGFR protein comprises AZD4547, NVP-BGJ398, PD173074, NF449, TK1258, BIBF-1120, BMS-582664, AZD-2171, TSU68, AB1010, AP24534, E-7080, LY2874455, or a combination thereof.
  • 29. A diagnostic kit for determining whether a sample from a subject exhibits a presence of an EGFR fusion, the kit comprising at least one oligonucleotide that specifically hybridizes to an EGFR fusion, or a portion thereof, wherein the EGFR fusion comprises the tyrosine kinase domain of EGFR fused to: (i) the coiled-coil domain of a Septin protein;(ii) a phosphoserine phosphatase (PSPH) protein; or(iii) a Cullin-associated and neddylation-dissociated (CAND) protein.
  • 30. The kit of claim 29, wherein the oligonucleotides comprise a set of nucleic acid primers or in situ hybridization probes.
  • 31. The kit of claim 29, wherein the oligonucleotide comprises SEQ ID NO: 32, 33, 34, 35, 38, 39, 40, 41, 44, 45, 87, or 88.
  • 32. The kit of claim 30, wherein the primers prime a polymerase reaction only when an EGFR fusion is present.
  • 33. The kit of claim 29, wherein the fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein.
  • 34. The kit of claim 29, wherein the determining comprises gene sequencing, selective hybridization, selective amplification, gene expression analysis, or a combination thereof.
  • 35. A diagnostic kit for determining whether a sample from a subject exhibits a presence of an EGFR fusion protein, the kit comprising an antibody that specifically binds to an EGFR fusion protein comprising SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495, wherein the antibody will recognize the protein only when an EGFR fusion protein is present.
  • 36. The kit of claim 35, wherein the fusion protein is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein.
  • 37. A method for detecting the presence of an EGFR fusion in a human subject, the method comprising: (a) obtaining a biological sample from the human subject; and(b) detecting whether or not there is an EGFR fusion present in the subject;
  • 38. The method of claim 37, wherein the detecting comprises measuring EGFR fusion protein levels by ELISA using an antibody directed to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495; western blot using an antibody directed to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 16, or 8495; mass spectroscopy, isoelectric focusing, or a combination thereof.
  • 39. The method of claim 37, wherein the detecting of step (b) comprises detecting whether or not there is a nucleic acid sequence encoding an EGFR fusion protein in the subject.
  • 40. The method of claim 39, wherein the nucleic acid sequence comprises any one of SEQ ID NOS: 2, 4, 8, 10, 14, or 15.
  • 41. The method of claim 39, wherein the detecting comprises using hybridization, amplification, or sequencing techniques to detect an EGFR fusion.
  • 42. The method of claim 41, wherein the amplification uses primers comprising SEQ ID NO: 32, 33, 34, 35, 38, 39, 40, 41, 44, 45, 87, or 88.
  • 43. The method of claim 37, wherein the EGFR fusion is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein.
  • 44. The method of claim 39, wherein the EGFR fusion is an EGFR-SEPT14 fusion protein, an EGFR-PSPH fusion protein, or an EGFR-CAND1 fusion protein.
Parent Case Info

The application is a continuation of PCT International Application No. PCT/US2014/026351 filed on Mar. 13, 2014, which claims the benefit of and priority to U.S. Provisional Patent Application No. 61/793,086, filed on Mar. 15, 2013, the contents of which are hereby incorporated by reference in their entireties. All patents, patent applications and publications cited herein are hereby incorporated by reference in their entirety. The disclosures of these publications in their entireties are hereby incorporated by reference into this application. This patent disclosure contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the U.S. Patent and Trademark Office patent file or records, but otherwise reserves any and all copyright rights.

GOVERNMENT SUPPORT

This invention was made with government support under Grant No. R01CA101644 awarded by the National Cancer Institute. The Government has certain rights in the invention.

Provisional Applications (1)
Number Date Country
61793086 Mar 2013 US
Continuations (1)
Number Date Country
Parent PCT/US2014/026351 Mar 2014 US
Child 14853568 US