CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of priority of Singapore application No. 10201601142V, filed 16 Feb. 2016, the contents of it being hereby incorporated by reference in its entirety for all purposes.
FIELD OF THE INVENTION
The invention relates to a method for determining the presence or absence of at least one promoter in a cancerous biological sample relative to a non-cancerous biological sample.
BACKGROUND OF THE INVENTION
Gastric cancer (GC) is the third leading cause of global cancer mortality with high prevalence in many East Asian countries. GC patients often present with late-stage disease, and clinical management remains challenging as exemplified by several recent negative Phase II and Phase III clinical trials. At the molecular level, studies have identified characteristic gene mutations, copy number alterations, gene fusions, and transcriptional patterns in GC. However, few of these have been clinically translated into targeted therapies, with the exception of HER2-positive GC and trastuzumab. There is thus a strong need for additional and more comprehensive explorations of GC, as these may highlight new biomarkers for disease detection, predicting patient prognosis or responses to therapy, as well as new therapeutic modalities.
Promoter elements are cis-regulatory elements which function to link gene transcription initiation to upstream regulatory stimuli, integrating inputs from diverse signaling pathways. Promoters represent an important reservoir of biological, functional, and regulatory diversity, as current estimates suggest that 30-50% of genes in the human genome are associated with multiple promoters, which can be selectively activated as a function of developmental lineage and cellular state. Differential usage of alternative promoters causes the generation of distinct 5′ untranslated regions (5′ UTRs) and first exons in transcripts, which in turn can influence mRNA expression levels, translational efficiencies, and generation of different protein isoforms through gain and loss of 5′ coding domains. To date, promoter alterations in cancer have been largely studied on a gene-by-gene basis, and very little is known about the global extent of promoter-level diversity in GC and other solid malignancies.
Accordingly, there is a need for a method of profiling promoter elements in cancer.
SUMMARY
In one aspect there is provided a method for determining the presence or absence of at least one promoter in a cancerous biological sample relative to a non-cancerous biological sample, comprising: contacting the cancerous biological sample with at least one antibody specific for histone modifications H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises at least one region specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid; and determining the presence or absence of at least one promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a non-cancerous biological sample.
In another aspect there is provided a method for determining the prognosis of cancer in a subject, comprising, contacting a cancerous biological sample obtained from the subject with at least one antibody specific for histone modification H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises at least one region specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid; and determining the presence or absence of at least one cancer-associated promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a reference nucleic acid sequence, wherein the presence or absence of the at least one cancer-associated promoter in the cancerous biological sample is indicative of the prognosis of the cancer in the subject.
In another aspect there is provided a biomarker for detecting cancer in a subject, the biomarker comprising at least one promoter having a change in signal intensity of H3K4me3 in a cancerous biological sample relative to a non-cancerous biological sample.
In another aspect there is provided a method for modulating the activity of at least one cancer-associated promoter in a cell, comprising administering an inhibitor of EZH2 to the cell.
In another aspect there is provided a method for modulating the immune response of a subject to cancer, comprising administering to the subject an inhibitor of EZH2, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject.
In another aspect there is provided a method for determining the presence or absence of at least one cancer-associated promoter in a cancerous biological sample relative to a non-cancerous biological sample, comprising: contacting the cancerous biological sample with at least one antibody specific for histone modifications H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises at least one region specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid at a read depth of 20M; and determining the presence or absence of at least one cancer-associated promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a non-cancerous biological sample.
In one aspect, there is provided a biomarker comprising at least one promoter having a change in signal intensity of H3K4me3 in a cancerous biological sample relative to a non-cancerous biological sample for use in detecting cancer in a subject.
In one aspect, there is provided a use of a biomarker comprising at least one promoter having a change in signal intensity of H3K4me3 in a cancerous biological sample relative to a non-cancerous biological sample in the manufacture of a medicament for detecting cancer in a subject.
In one aspect, there is provided an inhibitor of EZH2 for use in modulating the activity of at least one cancer-associated promoter in a cell.
In one aspect, there is provided a use of an inhibitor of EZH2 in the manufacture of a medicament for modulating the activity of at least one cancer-associated promoter in a cell.
In one aspect, there is provided an inhibitor of EZH2 for use in modulating the immune response of a subject to cancer, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject.
In one aspect, there is provided a use of an inhibitor of EZH2 in the manufacture of a medicament for modulating the immune response of a subject to cancer, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject.
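By way of non-limiting illustration only, the selection and comparison steps recited in the aspects above may be sketched computationally as follows. The sketch assumes that H3K4me3 and H3K4me1 signals have already been quantified per genomic region; the names `Region`, `is_promoter_like`, `call_promoter_change` and `classify_regions`, the fold-change threshold `min_fold` and the `pseudocount` are hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical container for per-region histone-mark signals."""
    name: str
    tumor_h3k4me3: float   # H3K4me3 signal in the cancerous sample
    tumor_h3k4me1: float   # H3K4me1 signal in the cancerous sample
    normal_h3k4me3: float  # H3K4me3 signal in the non-cancerous sample


def is_promoter_like(region: Region) -> bool:
    """True when the H3K4me3:H3K4me1 signal ratio is greater than 1."""
    # Comparing the signals directly avoids dividing by zero when H3K4me1 is 0.
    return region.tumor_h3k4me3 > region.tumor_h3k4me1


def call_promoter_change(region: Region, min_fold: float = 1.5,
                         pseudocount: float = 1.0) -> str:
    """Label a region by its change in H3K4me3 signal, cancerous vs non-cancerous.

    min_fold and pseudocount are illustrative assumptions only.
    """
    fold = (region.tumor_h3k4me3 + pseudocount) / (region.normal_h3k4me3 + pseudocount)
    if fold >= min_fold:
        return "gained"
    if fold <= 1.0 / min_fold:
        return "lost"
    return "unaltered"


def classify_regions(regions, min_fold: float = 1.5) -> dict:
    """Keep regions with ratio > 1, then label each by its H3K4me3 change."""
    return {r.name: call_promoter_change(r, min_fold)
            for r in regions if is_promoter_like(r)}
```

In this sketch a region whose H3K4me3 signal does not exceed its H3K4me1 signal is discarded before any comparison with the non-cancerous sample is made, mirroring the order of the steps recited above.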
Definitions
The following are some definitions that may be helpful in understanding the description of the present invention. These are intended as general definitions and should in no way limit the scope of the present invention to those terms alone, but are put forth for a better understanding of the following description.
As used herein, the term “promoter” is intended to refer to a region of DNA that initiates transcription of a particular gene.
As used herein, the term “cancerous” relates to being affected by or showing abnormalities characteristic of cancer.
As used herein, the term “biological sample” refers to a sample of tissue or cells from a patient that has been obtained from, removed or isolated from the patient. The term “obtained or derived from” as used herein is meant to be used inclusively. That is, it is intended to encompass any nucleotide sequence directly isolated from a biological sample or any nucleotide sequence derived from the sample.
As used herein, the term “antibody” or “antibodies” refers to molecules with an immunoglobulin-like domain and includes antigen binding fragments, monoclonal, recombinant, polyclonal, chimeric, fully human, humanised, bispecific and heteroconjugate antibodies; a single variable domain, single chain Fv, a domain antibody, immunologically effective fragments and diabodies.
The term “specifically binds” as used throughout the present specification in relation to antigen binding proteins means that the antigen binding protein binds to a target epitope on an antigen with a greater affinity than that which results when bound to a non-target epitope. In certain embodiments, specific binding refers to binding to a target with an affinity that is at least 10, 50, 100, 250, 500, or 1000 times greater than the affinity for a non-target epitope. For example, binding affinity may be as measured by routine methods, e.g., by competition ELISA or by measurement of Kd with BIACORE™, KINEXA™ or PROTEON™.
As used herein, the term “isolated” relates to a biological component (such as a nucleic acid molecule, protein or organelle) that has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been “isolated” include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
As used herein, the term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompassing known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. “Nucleotide” includes, but is not limited to, a monomer that includes a base linked to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A nucleotide sequence refers to the sequence of bases in a polynucleotide.
As used herein, the term “prognosis” or grammatical variants thereof refers to a prediction of the probable course and outcome of a clinical condition or disease. A prognosis of a patient is usually made by evaluating factors or symptoms of a disease that are indicative of a favorable or unfavorable course or outcome of the disease. The term “prognosis” does not refer to the ability to predict the course or outcome of a condition with 100% accuracy. Instead, the term “prognosis” refers to an increased probability that a certain course or outcome will occur; that is, that a course or outcome is more likely to occur in a patient exhibiting a given condition, when compared to those individuals not exhibiting the condition.
As used herein, the term “modulating” is intended to refer to an adjustment of the immune response to a desired level.
As used herein, the term “annotated promoter” refers to a promoter mapping close (<500 bp) to a known Gencode transcription start site (TSS).
The term “unannotated promoter” refers to a promoter mapping to genomic regions devoid of known Gencode TSSs.
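For illustration only, the distinction between annotated and unannotated promoters may be sketched as a nearest-TSS distance test. The sketch assumes that each promoter is summarised by a single coordinate (for example a peak midpoint) on one chromosome and that the known TSS coordinates are supplied as a sorted list; the function names are hypothetical, and treating every promoter 500 bp or more from the nearest TSS as unannotated is a simplifying assumption.

```python
import bisect

ANNOTATED_MAX_DISTANCE_BP = 500  # "close" is defined above as <500 bp


def nearest_tss_distance(position, sorted_tss):
    """Distance in bp from position to the closest TSS, or None if no TSS is given."""
    if not sorted_tss:
        return None
    i = bisect.bisect_left(sorted_tss, position)
    candidates = []
    if i < len(sorted_tss):
        candidates.append(sorted_tss[i] - position)      # nearest TSS at or after position
    if i > 0:
        candidates.append(position - sorted_tss[i - 1])  # nearest TSS before position
    return min(candidates)


def classify_promoter(position, sorted_tss):
    """Return 'annotated' if a known TSS lies within 500 bp, else 'unannotated'."""
    distance = nearest_tss_distance(position, sorted_tss)
    if distance is not None and distance < ANNOTATED_MAX_DISTANCE_BP:
        return "annotated"
    return "unannotated"
```

The binary search keeps each lookup logarithmic in the number of TSSs, which is convenient when many promoter regions are classified against a genome-wide annotation.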
As used herein, the term “canonical” in the context of a promoter refers to a promoter region exhibiting unaltered H3K4me3 peaks.
As used herein, the term “detectable label” or “reporter” refers to a detectable marker or reporter molecules, which can be attached to nucleic acids. Typical labels include fluorophores, radioactive isotopes, ligands, chemiluminescent agents, metal sols and colloids, and enzymes. Methods for labeling and guidance in the choice of labels useful for various purposes are discussed, e.g., in Sambrook et al., in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989) and Ausubel et al., in Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences (1987).
As used herein, the term “hypomethylated” refers to a decrease in the normal methylation level of DNA.
As used herein, the term “hypermethylated” refers to an increase in the normal methylation level of DNA.
As used herein, the term “about”, in the context of concentrations of components of the formulations, typically means +/−5% of the stated value, more typically +/−4% of the stated value, more typically +/−3% of the stated value, more typically, +/−2% of the stated value, even more typically +/−1% of the stated value, and even more typically +/−0.5% of the stated value.
Throughout this disclosure, certain embodiments may be disclosed in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the disclosed ranges. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed sub-ranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Certain embodiments may also be described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the disclosure. This includes the generic description of the embodiments with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
Unless the context requires otherwise or specifically stated to the contrary, integers, steps, or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.
The word “substantially” does not exclude “completely” e.g. a composition which is “substantially free” from Y may be completely free from Y. Where necessary, the word “substantially” may be omitted from the definition of the invention.
The invention illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms “comprising”, “including”, “containing”, etc. shall be read expansively and without limitation. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the inventions embodied therein herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.
The invention has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
Other embodiments are within the following claims and non-limiting examples. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be better understood with reference to the detailed description when considered in conjunction with the non-limiting examples and the accompanying drawings, in which:
FIG. 1: Somatic Promoter Alterations in Primary Gastric Adenocarcinoma.
A) Example of an unaltered GC promoter. The UCSC genome track of the RHOA TSS (shaded box) highlights similar H3K4me3 signals in GC and matched normal samples. Similar signals are seen in GC lines. The bottom two tracks display similar levels of RNA expression in the same GC and matched normal sample (RNAseq).
B) Example of a gained somatic promoter. The UCSC genome track of the CEACAM6 TSS (shaded box) highlights gain of H3K4me3 signals in GC samples and GC lines, compared to matched normal samples. In contrast, no changes are observed at the TSS of CEACAM5, an adjacent gene. Concordant tumor-specific gain of RNA expression is shown in the bottom 2 tracks displaying RNA-seq profiles of the same GC and matched normal samples.
C) Example of a lost somatic promoter. The UCSC genome track of the ATP4A TSS (shaded box) highlights loss of H3K4me3 signals in GC samples and GC lines compared to matched normal samples. Concordant tumor-specific loss of RNA expression is shown in the bottom 2 tracks displaying RNA-seq profiles of the same GC and gastric normal samples.
D) Heatmap of H3K4me3 read densities (row scaled) of somatic promoters (rows) in primary GCs and matched normal samples.
E) Correlation between H3K4me3 promoter signals and H3K27ac activity signals in primary gastric samples (r=0.91, P<0.001). Each data point corresponds to a single H3K4me3 hi/H3K4me1 lo region. Analysis was performed using data from 16 N/T pairs (Table 4).
F) Top 5 gene sets associated with canonical gained and lost somatic promoters. Genesets associated with genes up and downregulated in GC are rediscovered. Also note that gene sets related to H3K27me3 and SUZ12, a PRC2 component, are enriched.
FIG. 2: Association of Somatic Promoter Alterations with Gene Expression in GC and Other Tumor Types
A) Example of a GC somatic promoter. Example is for illustrative purposes only.
B) Changes in RNA-seq expression (top) and DNA methylation (bottom) in discovery samples between somatic promoters and all promoters. Top—Boxplot depicting changes in RNA-seq expression between 9 paired primary GC and gastric normal samples at genomic regions exhibiting somatic promoters (gained and lost) (***P<0.001, Wilcoxon Test). Bottom—Boxplot depicting changes in DNA methylation (β-values) at regions exhibiting somatic promoters between 20 paired GC and gastric normal samples, compared to all promoters. (***P<0.001, Wilcoxon test)
C) Independent Validation Cohorts. Boxplot depicting changes in RNA-seq expression at genomic regions exhibiting somatic promoters across 354 (321 GC, 33 normal) TCGA Stomach adenocarcinoma (STAD) samples, compared to all promoters (***P<0.001, Wilcoxon test)
D) Somatic Promoters in Other Cancer Types. Boxplot depicting changes in RNA-seq expression at genomic regions exhibiting GC somatic promoters compared against all promoters, across 326 TCGA Colon adenocarcinoma (COAD) samples (286 COAD, 40 normal; ***P<0.001, Wilcoxon test), 170 TCGA kidney renal clear cell carcinoma (ccRCC) samples (98 ccRCC and 72 normal; ***P<0.001, Wilcoxon test), and 115 TCGA lung adenocarcinoma (LUAD) samples (58 LUAD, 57 normal; ***P<0.001 somatic gain vs all promoters and somatic gain vs. somatic loss, Wilcoxon test).
FIG. 3: Alternative Promoters in GC
A) UCSC browser track of the HNF4α gene. GC and matched gastric normal samples have equal H3K4me3 signals at the canonical HNF4α promoter. However, an alternative promoter, seen by H3K4me3 gain, can be observed at a downstream TSS in GCs compared to matched normals. At the RNA level, both in-house and TCGA STAD samples also show gain of gene expression at the alternate promoter TSS compared to normal samples.
B) UCSC browser track of the EPCAM gene. Another example of alternative promoter usage at a downstream TSS. Gain of H3K4me3 is observed at a TSS downstream of the canonical promoter, while the canonical promoter exhibits equal H3K4me3 signals in GC and gastric normal. Gain of RNA-seq expression can also be observed in GC at the alternative promoter driven transcript in both in-house and TCGA STAD samples.
C) UCSC browser track of the RASA3 gene, demonstrating H3K4me3 and RNA-seq signals highlighting gain of promoter activity at an un-annotated TSS (dark grey box) corresponding to a novel N-terminal truncated RASA3 transcript. Expression of this variant transcript was validated through 5′RACE in GC lines (bottom).
D) Functional domains of the translated RASA3 canonical and alternate isoform. The alternate transcript is predicted to encode a RASA3 protein missing the RASGAP domain.
E) Effect of overexpression of RASA3 canonical (CanT) and alternate (SomT) isoforms on the migration capability of SNU1967 (top) and GES1 (bottom) cells. Representative images of RASA3-Ctl (Empty vector), RASA3-CanT and RASA3-SomT in migration assays (n=3). Barplots show the % area of migrated cells vs the area of transwell membrane. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test)
FIG. 4: Somatic Promoter Alterations Exhibit Immunoediting Signatures
A) Schematic outlining alternative promoter usage leading to alternative transcript usage (Transcript box) and N terminally truncated protein isoforms (protein box).
B) Barplot showing the average % of peptides with predicted high-affinity binding to MHC Class I (HLA-A, B, and C, IC50<=50 nM). N-terminal peptides associated with recurrent somatic promoters (alternative promoters) show significantly enriched predicted MHC I binding compared to canonical GC peptides (P<0.01, Fisher's test), random peptides from the human proteome (P<0.001) and C-terminal peptides (P<0.01) derived from the same genes exhibiting the N-terminal alterations. Canonical peptides refer to peptides derived from protein coding genes overexpressed in GC through non-alternative promoters.
C) Percentage (%) of high affinity peptides predicted to bind different HLA-alleles categorized by somatic gain or loss. Most alleles have a greater number of N-terminal lost peptides predicted to have high binding affinity.
D) Quantification of somatic promoter expression using Nanostring profiling. Top—Distinct Nanostring probes were designed to measure expression of alternate and canonical promoter driven transcripts. 2 probes were designed for each gene—a canonical probe at the 5′ transcript marked by unaltered H3K4me3, and an alternate probe at the 5′ transcript of the somatic promoter. Bottom—Heatmap of alternative promoter expression from 95 GCs and matched normal samples. GC samples have been ordered left to right by their levels of somatic promoter usage.
E) Association between Somatic Promoters and T-cell immune correlates (Singapore (SG) cohort). Top left—Expression of T-cell markers CD8A (P=0.1443) and the T-cell cytolytic markers GZMA (P=0.0001) and PRF1 (P=0.00806) in GC samples with either high or low somatic promoter usage (SG). Samples with high alternative promoter usage show lower expression of immune markers. All P values are from Wilcoxon one sided test. Right-Kaplan-Meier analysis comparing overall survival curves between validation samples with high somatic promoter usage (top 25%) and low somatic promoter usage (bottom 25%) (HR=2.56, P=0.02).
F) Association of Somatic Promoters with T-cell Correlates in TCGA and ACRG Cohorts. (Left) Expression of T-cell markers CD8A (P=0.02), GZMA (P=0.01) and PRF1 (P=0.03) in TCGA STAD with either high or low somatic promoter usage. T-cell markers were evaluated by RNA-seq (Transcripts per million). (Right) Expression of T-cell markers CD8A (P=0.035), GZMA (P=0.001) and PRF1 (P=0.025) in ACRG GC samples with either high or low somatic promoter usage. All P values are from Wilcoxon one sided test.
G) EpiMAX Heatmap of total cytokine responses (Fold change relative to Actin) for 15 peptide pools against 9 donors.
H) Individual cytokine responses against 15 peptides for two individual donors (Donor 2 and Donor 3) showing complex cytokine responses (FC2).
FIG. 5: Somatic Promoters are Associated with EZH2 Occupancy
A) Binding enrichment of ReMap-defined TFBSs at genomic regions exhibiting somatic promoters. TFs were sorted according to their binding frequency at all H3K4me3-defined promoter regions. EZH2 and SUZ12 binding sites significantly overlap regions exhibiting somatic promoters (gained and lost) (P<0.01, Empirical distribution test).
B) Proportion of RNA transcripts associated with somatic promoters changing upon GSK126 treatment in IM95 cells, compared to RNA transcripts associated with unaltered promoters. The top somatic promoter figure is for illustrative purposes only. Unaltered promoters were defined as all gene promoters except the somatic promoters. The proportion of genes changing upon treatment, as a proportion of all genes, is also shown. Somatic promoters are more likely to change expression after GSK126 treatment relative to unaltered promoters (OR 1.46, P<0.001) or all GSK126 regulated genes (OR 9.21, P<0.001, Fisher Test)
C) UCSC browser track of the SLC9A9 TSS, a gene with loss of promoter activity. Gain of expression is seen after inhibition of EZH2 using GSK126 in IM95 cells at both day 6 (D6) and Day 9 (D9) treatment.
D) UCSC browser track of the PSCA TSS, with loss of promoter activity. Gain of expression is seen after inhibition of EZH2 using GSK126 in IM95 cells at both day 6 (D6) and Day 9 (D9) treatment.
FIG. 6: Somatic promoters reveal novel cancer-associated transcripts
A) Distribution of distances for different promoter categories to the nearest annotated TSSs. (left) The first barplot shows distance distributions for promoters present in gastric normal tissues, the second for promoters present in GC samples, and the third for promoters exhibiting somatic alterations (i.e. different in tumor vs normal). (right) The barplots present distance distributions associated with either lost or gained somatic promoters. A substantial proportion of gained somatic promoters occupy locations distant from previously annotated TSSs.
B) Median functional scores of unannotated promoters as predicted by GenoSkyline across 7 different tissues. Unannotated promoters exhibited high functional scores for GI, fetal and ESC tissues.
C) Boxplot depicting average RNA-seq reads for CAGE-validated promoters, comparing all promoters against somatic promoters, each also supported by CAGE data. (**P<0.001, Wilcoxon one sided test). Somatic promoters are observed to have lower levels of RNA-seq expression.
D) Cartoon depicting proposed effects of dynamic range on NanoChIP-seq and RNA-seq sensitivity in detecting lowly expressed transcripts. Due to a more restricted dynamic range, epigenomic profiling may detect active promoters missed by RNA-sequencing, due to the random sampling of abundantly expressed genes by RNAseq.
E) Down and Up-sampling analysis. The y-axis depicts the number of transcripts detected that overlap either all promoters or somatic promoters at varying RNA-sequencing depths. Original primary sample RNA-seq data was sequenced at ~106M reads which was down-sampled to 20M, 40M and 60M reads. Deep RNA-seq data was additionally generated at ~139M read depth.
F) Cancer-associated transcripts detected at deep but not regular RNA-seq depth. The UCSC genome browser track for ABCA13 shows an example of a novel transcript detected by NanoChIP-seq at a read depth of 20M but only detected by RNA-sequencing at read depth of ~139M (Deep sequencing GC). This transcript is not detected by regular depth RNA-seq (GC).
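For illustration only, the down-sampling analysis referred to in panel E may be sketched as random sampling of reads without replacement followed by counting the transcripts that remain detectable. The sketch assumes that each read has already been assigned to a transcript identifier; the function names, the fixed `seed` and the `min_reads` detection threshold are hypothetical and do not describe the procedure actually used to generate the figure.

```python
import random
from collections import Counter


def downsample_reads(reads, target_depth, seed=0):
    """Randomly keep target_depth reads without replacement (all reads if fewer exist)."""
    if target_depth >= len(reads):
        return list(reads)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(list(reads), target_depth)


def count_detected_transcripts(reads, min_reads=1):
    """Number of transcripts supported by at least min_reads reads."""
    counts = Counter(reads)
    return sum(1 for n in counts.values() if n >= min_reads)
```

Applying `count_detected_transcripts` to the output of `downsample_reads` at several target depths gives a curve of detected transcripts against sequencing depth; lowly expressed transcripts are the first to drop out as depth is reduced.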
FIG. 7: Chromatin Profiles of Primary GC
A) Chromatin profiles of primary GCs, matched normal gastric mucosae, and GC cell lines for 3 marks (H3K4me3, H3K27ac and H3K4me1). Shown are UCSC genome browser tracks of the GC driver gene MYC highlighting strong H3K4me3 and H3K27ac signals and low H3K4me1 at promoter locations
B) H3K4me3, H3K27ac and H3K4me1 signal distributions at transcription start sites (TSS). Line plots show the distribution of chromatin signals for H3K4me3 hi/H3K4me1 lo regions at TSS regions (+/−3 kb). Heatmaps were plotted using ngs.plot(6) for the top 10,000 H3K4me3 hi/H3K4me1 lo regions
C) Density distributions of H3K4me3:H3K4me1 ratios at identified H3K4me3 regions. All regions with H3K4me3/H3K4me1 ratios >1 were selected for further analysis (73%)
D) Distribution of H3K4me3 hi/H3K4me1 lo regions against representative gene body features (top). The arrow represents the TSS.
E) Enrichment of H3K4me3 hi/H3K4me1 lo regions against 15 chromatin states (columns) defined in different gastrointestinal tissues from the Epigenome Roadmap database (rows). Each column is scaled from 0 to 1.
F) Overlap of H3K4me3 hi/H3K4me1 lo regions with FANTOM5 CAGE data
FIG. 8: Epithelial features of GC promoters
A) Spearman correlation heat-map between H3K4me3 signals of primary GC, gastric normal samples (red type, highlighted by red arrow) and various tissue types from the Epigenome Roadmap database across all H3K4me3 hi/H3K4me1 lo regions
B) Overlap of H3K4me3 hi/H3K4me1 lo regions with H3K4me3 regions identified in GC cell lines (87%), gastrointestinal fibroblast cells (61%) and colon carcinoma lines (74%)
FIG. 9: GC Somatic Promoter Features
A) Differential (somatic) H3K4me3 regions identified from 2 independent algorithms DESeq2 and edgeR. 96% of regions identified from DESeq2 overlapped those identified using edgeR. Both sets were pooled for subsequent analysis.
B) Principal component analysis of 16 GC and gastric normal samples based on somatic promoters
C) Heatmap of H3K27ac read densities across 16 GC and gastric normal samples across 1959 somatic promoters.
D) Correlation between H3K4me3 promoter signals and H3K27ac activity signals in primary gastric samples for gained somatic (Left, r=0.78, p<0.001) and lost somatic (Right, r=0.82, p<0.001) promoters. Each data point corresponds to a single H3K4me3 hi/H3K4me1 lo region. Analysis was performed using data from 16 N/T pairs (Table 4).
E) Volcano plot of somatic promoters (Top) highlighting the dynamic range of fold change differences (x-axis) and the false discovery rate (FDR)-adjusted significance (−log10 scale, y axis). The majority of the somatic promoters lie between FC 1 and 2.82, which likely reflects the dynamic range of ChIP-seq. The Table (bottom) lists the number of somatic promoters identified at differing levels of stringency. Despite varying FDR thresholds, the majority of differential peaks are still preserved (e.g. 59% at q<0.01).
F) Enrichment analysis of somatic promoters at varying fold change and FDR (q value) for top 5 genesets (FIG. 1F) associated with gained (red) and lost somatic promoters (blue). X axis reflects the −log10 p value for gene-sets found to be enriched in subsets of somatic promoters. Even at stricter fold change (FC 2) and q-value thresholds (0.05, 0.01 and 0.001), similar GC specific and PRC2 associated signatures are still observed.
FIG. 10: Association of Somatic Promoters with Gene Expression in GC and Other Tumor Types
A) Example of a GC somatic promoter. Example is for illustrative purposes only.
B) Changes in RNA-seq expression (top) and DNA methylation (bottom) discovery samples between somatic promoters and unaltered promoters. Top—Boxplot depicting changes in RNA-seq expression between 9 paired primary GC and gastric normal samples at genomic regions exhibiting somatic promoters (gained and lost) (***P<0.001, Wilcoxon Test). Bottom—Boxplot depicting changes in DNA methylation (β-values) at regions exhibiting somatic promoters between 20 paired GC and gastric normal samples, compared to unaltered promoters (***P<0.001, Wilcoxon test)
C) Independent Validation Cohorts. Boxplot depicting changes in RNA-seq expression at genomic regions exhibiting somatic promoters across 354 (321 GC, 33 normal) TCGA Stomach adenocarcinoma (STAD) samples, compared to unaltered promoters (***P<0.001, Wilcoxon test)
D) Somatic Promoters in Other Cancer Types. Boxplot depicting changes in RNA-seq expression at genomic regions exhibiting GC somatic promoters compared to unaltered promoters, across 328 TCGA Colon adenocarcinoma (COAD) samples (286 COAD, 40 normal; ***P<0.001, Wilcoxon test), 170 TCGA kidney renal clear cell carcinoma (ccRCC) samples (98 ccRCC and 72 normal; ***P<0.001, Wilcoxon test), and 115 TCGA lung adenocarcinoma (LUAD) samples (58 LUAD, 57 normal; ***P<0.001 Somatic gain vs unaltered and somatic gain vs somatic loss, *P<0.05 Somatic loss vs unaltered, Wilcoxon test).
FIG. 11: Changes in DNA methylation at CpG island containing promoters
A) Boxplot depicting changes in DNA methylation (β-values) at CpG island bearing somatic promoters between 20 paired GC and gastric normal samples, compared to all promoters bearing CpG islands (**P<0.001, Wilcoxon test)
FIG. 12: Expression distribution of alternative and canonical isoforms
A) Barplot showing distribution of T/N ratios of canonical and alternative transcript isoforms for all alternative transcripts (Global—top), HNF4α (middle), and EPCAM (bottom) using four independent quantification techniques: Cufflinks, MISO, Kallisto and NanoString. The NanoString platform is introduced in FIG. 4 of the Main Text. ++ NanoString analysis is confined to queried probes. (*P<0.05, **P<0.01, ***P<0.001, Wilcoxon one sided test).
B) Boxplot showing the T/N ratio of N-terminal reads mapping to canonical promoters, compared to N-terminal reads mapping to alternative promoters. Alternative promoter driven transcripts exhibit significantly higher T/N ratios (p=0.04, Wilcoxon one sided test).
FIG. 13: Characterization of RASA3 Isoform
A) UCSC browser track of the RASA3 gene demonstrating H3K4me3 and RNA-seq signals at Somatic and Canonical TSSs. The Canonical TSS has equal signals while the Somatic TSS shows gain of promoter activity at an un-annotated TSS corresponding to a novel N-terminal truncated RASA3 transcript.
B) UCSC browser track of the RASA3 gene demonstrating RNA-seq signals for the NCC24 GC cell line at Somatic and Canonical TSSs. NCC24 only expresses RASA3 SomT (also see C).
C) Left—Identification of RASA3 SomT and CanT transcripts in NCC24 and NCC59 GC cells by 5′RACE. A third line (MKN1) was negative for RASA3 SomT, as shown in the gel picture. A no-RNA template was run as a negative control. Right—Western blot highlighting expression of RASA3 SomT protein in NCC24 cells.
D) RAS GTP assays. (left) The Western blot shows levels of RAS in GES1 cells transfected with either empty vector (EV), RASA3 CanT or RASA3 SomT (n=3). GES1 cells were serum-starved overnight followed by serum stimulation for 30 minutes prior to harvest and a RAS-GTP pull down assay. Total RAS was measured in corresponding whole cell protein lysates. β-actin was used as a loading control. Positive (GTP) and negative (GDP) controls from the pull down assay are also shown. (right) The barplot quantifies active RAS intensity from three independent pull-down assays, performed in GES1 cells transfected with either empty vector (EV), RASA3 CanT or RASA3 SomT under FBS exposed conditions. Data is shown as mean±SD; n=3. (*P<0.05, Student's two sided t-test).
E) Cell proliferation assays of SNU1967, GES1 and AGS cells after transfection with RASA3 CanT and SomT normalized to Day 0. (Data is shown as mean±SD performed in triplicate, representative of 3 independent experiments).
F) Effect of overexpression of RASA3 CanT and SomT isoforms on the invasive capability of GES1 and SNU1967 cells. Representative images of EV, RASA3-WT and RASA3-Var in invasion assay (n=3). Barplot showing % area of invaded cells vs the area of transwell membrane. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test).
G) Effect of overexpression of RASA3 CanT and SomT protein isoforms on the migration capability of highly migratory KRAS mutated AGS cells. Barplot showing % area of migrated cells vs the area of transwell membrane. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test). RASA3 WT induces more potent migration suppression than RASA3 Var, suggesting that RASA3 WT is a migration inhibitor.
H) siRNA-mediated knockdown of RASA3 SomT in NCC24 cells. Cells were treated with sc-siRNA (control) and 2 RASA3 siRNAs (siRNA1-hs.Ri.RASA3.13 TriFECTa® Kit DsiRNA and siRNA-3-Silencer® Select Pre-Designed siRNA s355). (Left) Barplots showing fold change differences in mRNA expression of RASA3 SomT after treatment with siRNA-1 and siRNA-3. Data is shown as mean±SD; n=3. (Right) Western blotting results confirming RASA3 SomT protein reductions. Cells were harvested and lysed after 48 hrs of transfection. (***P<0.001, Student's one sided t-test).
I) Effect of siRNA knockdown of RASA3 SomT isoform on the migration (left) and invasive (right) capability of NCC24 cells from two independent siRNAs. Representative images of sc-siRNA (control), siRNA-1, and siRNA-3 in migration and invasion assays (n=3). Barplot showing % area of migrated/invaded cells vs the area of transwell membrane. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test).
FIG. 14: Characterization of MET Isoforms
A) UCSC browser track of the MET gene, demonstrating H3K4me3 and RNA-seq signals highlighting gain of promoter activity at an alternative downstream locus (dark grey box).
B) Functional domains of the MET canonical (WT) and alternative (Var) isoform. The alternative isoform is predicted to encode a MET protein with an N terminally truncated SEMA domain.
C) Expression of MET (Var) transcripts in GC lines, as detected by 5′RACE.
D) Western blot of HEK293 cells transfected with empty vector (EV), MET canonical full length (MET-WT) and truncated Variant (MET-Var) at 0, 15 and 30 minutes of HGF treatment (100 ng/ml) (n=3). GAB1, STAT3 and ERK1/2 are known downstream effectors of MET signaling. The number below each band is the intensity quantified using Image Lab. In both untreated and HGF-treated conditions, MET-Var transfected cells exhibited higher levels of p-Gab1 (Y627), a key mediator of MET signaling (2.48-3.95 fold; p=0.003 (untreated), p<0.05 (T15 and T30)). In untreated samples, cells transfected with MET-Var also exhibited higher pERK1/2 levels (2.74 fold) and higher p-STAT3 (Y705) levels (1.80 fold) compared to MET-WT (p=0.023 and p=0.026 for pERK and p-STAT3 (Y705) respectively).
E) Bar graphs showing increase in pERK1/2 for EV, MET-WT and MET-Var at T0, T15 and T30, reflecting effects of HGF treatment. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test)
F) Bar graphs showing increase in p-GAB1 (Y627), p-STAT3 (Y705), and pERK1/2 in cells transfected with MET-Var compared to EV and MET-WT. Graphs for all 3 time points are shown. Data is shown as mean±SD; n=3. (*P<0.05, **P<0.01, ***P<0.001, Student's one sided t-test)
FIG. 15: Immunogenicity of N-terminal peptides
A) Barplot showing average % of N-terminal peptides with predicted high-affinity binding to MHC Class I HLA-A (IC50<=50 nM). As comparison, the figure in the Main Text represents average percentages based on all three HLA classes (HLA-A, HLA-B, HLA-C). N-terminal peptides associated with recurrent somatic alternative promoters show significantly enriched predicted MHC I binding compared to canonical GC peptides (p<0.01), random peptides from human proteome and C-terminal peptides (p<0.001, Fisher's Test) derived from the same genes exhibiting the N-terminal alterations.
B) MHC Binding Predictions using N-terminal peptides inferred by RNA-seq analysis alone. Annotated transcripts exhibiting different N-terminal exons in GC vs normals were identified using two different RNA-seq algorithms (DEXSeq(7) and Voom-diffsplice(8)) (FC>=2, FDR 0.05). This analysis identified 96 genes with potential alternative N-terminal transcripts, of which 46 (48%) were predicted to result in differing N terminal peptides (Purple bar).
FIG. 16: Immunogenicity Assay and Nanostring Profiling
A) Scatter plot of fold change (T vs N) of expression of alternate and canonical probes from NanoString and RNA-seq data of the same samples. An improved correlation is observed using the alternate probes
B) Left—Expression of T-cell markers CD8A, GZMA and PRF1 in SG series (top), TCGA STAD (middle) and ACRG cohort (bottom) with high or low somatic promoter usage after adjustment of tumor purities as estimated by ASCAT. P values (Wilcoxon one sided test) are: CD8A—p=0.09 (SG), 0.004 (TCGA), 0.3 (ACRG); GZMA—0.0001 (SG), 0.002 (TCGA), 0.166 (ACRG), PRF1—0.013 (SG), 0.006 (TCGA), 0.3 (ACRG). Right—Expression of T-cell markers CD8A, GZMA and PRF1 in SG series (top), TCGA STAD (middle) and ACRG cohort (bottom) with high or low somatic promoter usage after adjustment of tumor content as estimated by ESTIMATE. p values (Wilcoxon one sided test) are: CD8A—p=0.28 (SG), 0.17 (TCGA), 0.37 (ACRG), GZMA—0.0005 (SG), 0.03 (TCGA), 0.09 (ACRG), PRF1—0.02 (SG), 0.22 (TCGA), 0.17 (ACRG). Samples with high alternative promoter usage are in red, while those with low usage are in blue.
C) Kaplan-Meier analysis comparing overall survival curves between validation samples with high somatic promoter usage and low somatic promoter usage (split by median) (HR=1.81, P=0.04)
D) Left—Expression of T-cell markers CD8A, GZMA and PRF1 in TCGA STAD with high or low somatic promoter usage after adjustment of mutation burden. P values (Wilcoxon one sided test) are: P=0.02 (CD8A), 0.01 (GZMA) and 0.03 (PRF1). Right—Expression of T-cell markers CD8A, GZMA and PRF1 in ACRG cohort with high or low somatic promoter usage after adjustment of mutation burden. P values (Wilcoxon one sided test) are: P=0.167 (CD8A), 0.009 (GZMA) and 0.03 (PRF1).
E) Heatmap of alternative promoter expression from 264 ACRG GCs for all gained alternative promoters. GC samples have been ordered left to right by their levels of somatic promoter usage.
FIG. 17: Functional Assessment of Peptide Immunogenicity
A) Individual cytokine responses against 15 peptides for other normal donor PBMCs tested against different peptide pools.
B) Experimental Immunogenicity Assay. Experimental design of in-vitro assay—i) Immature dendritic cells (DCs) cultured from CD14+ monocytes from HLA-A02:06 donors were differentiated into mature DCs (see Methods). Mature DCs were exposed to isogenic GC cell lysates (AGS cells) expressing Canonical (CanT) and Somatic (SomT) RASA3 isoforms. ii) Antigen presentation and T-cell activation: DCs presenting Can or Som RASA3 isoforms were co-cultured with HLA-matched T cells, resulting in T-cells primed against CanT or SomT RASA3. Primed T cells were then independently co-cultured with RASA3 CanT or RASA3 SomT expressing GC cells for two days, and markers of T-cell activation were assessed.
C) Concentration of interferon-gamma (IFN-γ) secretion by co-culture of T cells primed with RASA3 CanT or SomT Isoforms, after antigen challenge. RASA3 CanT primed T cells released significantly more IFN-γ when co-cultured with RASA3 CanT expressing cells, compared to T cells primed with RASA3 SomT and co-cultured with RASA3 SomT expressing cells (P=0.02, representative of n=3 experiments). IFN-γ levels were determined by ELISA.
FIG. 18: EZH2 Inhibition
A) Barplot showing increased enrichment of EZH2 binding sites in HFE-145 cells at somatic promoters compared to all promoters (P<0.01).
B) Growth curves of IM95 GC cells after GSK126 administration. Cell proliferation was monitored from 24 to 216 hours and represented relative to DMSO control treated cells (means±s.e.m. represents data from three experiments, and each experiment was performed in duplicate)
C) Top 5 enriched curated gene sets (C2) for the set of genes identified from differential analysis of GSK126 treated vs DMSO control IM95 RNA-seq data at promoter loci.
D) UCSC browser track of alternative promoter ESRRG with loss of promoter activity (GC (red) and normal gastric tissue (blue) H3K4me3). Gain of expression is seen after inhibition of EZH2 using GSK126 in IM95 cells at both day 6 (D6) and Day 9 (D9) treatment.
FIG. 19: Unannotated somatic promoters
A) Barplot showing fold enrichment of L1 (FC=8.02, P<0.001) and ERV1 (FC=2.78, P<0.001) repeat elements at unannotated promoter regions compared to all promoters
B) Boxplot comparing H3K27ac signals (rpm) at unannotated somatic promoters with annotated somatic promoters. Unannotated somatic promoters have lower H3K27ac signals.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
In a first aspect, the present invention refers to a method for determining the presence or absence of at least one promoter in a cancerous biological sample relative to a non-cancerous biological sample. The method comprises contacting the cancerous biological sample with at least one antibody or antibodies specific for histone modifications H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises at least one region or regions specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid; and determining the presence or absence of at least one promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a non-cancerous biological sample.
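By way of illustration only, the selection of regions having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1 may be sketched as follows. This is a minimal sketch and not the implementation used in the Examples; the `Region` record, the `select_promoter_like` function and the signal units are hypothetical names and assumptions introduced here for clarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One genomic region with two histone-mark signals (hypothetical units)."""
    name: str
    h3k4me3: float  # H3K4me3 signal, e.g. normalised read density (assumption)
    h3k4me1: float  # H3K4me1 signal in the same units (assumption)


def select_promoter_like(regions, min_ratio=1.0):
    """Keep regions whose H3K4me3/H3K4me1 ratio exceeds min_ratio.

    The comparison is written as a product so that a region with
    zero H3K4me1 signal does not cause a division by zero.
    """
    return [r for r in regions if r.h3k4me3 > min_ratio * r.h3k4me1]


regions = [
    Region("region_a", h3k4me3=10.0, h3k4me1=2.0),  # ratio 5.0: kept
    Region("region_b", h3k4me3=3.0, h3k4me1=6.0),   # ratio 0.5: dropped
    Region("region_c", h3k4me3=5.0, h3k4me1=5.0),   # ratio 1.0: dropped (not greater than 1)
]
selected = [r.name for r in select_promoter_like(regions)]
```

In this sketch a ratio of exactly 1 is excluded, consistent with the wording "greater than 1".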
In one embodiment, the cancerous and non-cancerous biological sample may comprise a single cell, multiple cells, fragments of cells, body fluid or tissue. In one embodiment the cancerous and non-cancerous biological sample may be obtained from the same subject.
In one embodiment, the cancerous and non-cancerous biological sample are each obtained from different subjects.
The contacting step in accordance with the method as described herein may comprise the immunoprecipitation of chromatin with the antibodies specific for the histone modifications. Examples of histone modification include but are not limited to H3K27ac, H3K4me3, H3K4me1. In a preferred embodiment, the histone modification is H3K4me3 and/or H3K4me1. In yet another embodiment, the histone modification is H3K27ac.
The method may further comprise mapping at least one promoter from the cancerous biological sample against at least one reference nucleic acid sequence to identify a gene transcript associated with the at least one promoter.
In some embodiments, the at least one reference nucleic acid sequence may comprise a nucleic acid sequence derived from: i) an annotated genome sequence; ii) a de novo transcriptome assembly; and/or iii) a non-cancerous nucleic acid sequence library or database.
In one embodiment, the change of signal intensity of H3K4me3 may be greater than a 0.5 fold, greater than a 1 fold, greater than a 1.5 fold, greater than a 2 fold, greater than a 2.5 fold or greater than a 3 fold increase or decrease relative to the signal intensity of H3K4me3 in the non-cancerous biological sample. In a preferred embodiment, the change of signal intensity of H3K4me3 may be greater than a 1.5 fold increase or decrease relative to the signal intensity of H3K4me3 in the non-cancerous biological sample. In another embodiment, the change of signal intensity of H3K4me3 greater than a 0.5 fold, greater than a 1 fold, greater than a 1.5 fold, greater than a 2 fold, greater than a 2.5 fold or greater than a 3 fold increase relative to the signal intensity of H3K4me3 in a non-cancerous biological sample, may correlate to the presence of at least one cancer-associated promoter in the cancerous biological sample.
In a preferred embodiment the change of signal intensity of H3K4me3 greater than a 1.5 fold increase relative to the signal intensity of H3K4me3 in a non-cancerous biological sample, may correlate to the presence of at least one cancer-associated promoter in the cancerous biological sample.
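As a non-limiting illustration, the fold-change comparison described above may be sketched as follows. The function name `classify_promoter`, the pseudocount, and the reading of a "1.5 fold decrease" as a tumor/normal ratio below 1/1.5 are assumptions made for this sketch only; they are not taken from the Examples.

```python
def classify_promoter(tumor_signal, normal_signal, fold=1.5, pseudocount=1.0):
    """Label a region by the change in H3K4me3 signal, tumor versus normal.

    A pseudocount (assumption) is added to both signals so that regions
    with no signal in one sample still give a finite ratio.
    """
    ratio = (tumor_signal + pseudocount) / (normal_signal + pseudocount)
    if ratio > fold:
        return "gained"      # candidate cancer-associated (gained) promoter
    if ratio < 1.0 / fold:
        return "lost"        # candidate lost promoter
    return "unaltered"


labels = [
    classify_promoter(30.0, 10.0),  # ratio 31/11, above 1.5
    classify_promoter(10.0, 30.0),  # ratio 11/31, below 1/1.5
    classify_promoter(10.0, 12.0),  # ratio 11/13, between the two cut-offs
]
```

Other thresholds recited above (for example 2 fold or 3 fold) can be explored by changing the `fold` argument.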
In one embodiment, the activity of the at least one cancer-associated promoter may correlate with an increase of SUZ12 or EZH2 binding sites relative to the total promoter population.
In one embodiment, an increase of SUZ12 or EZH2 binding sites correlates with an upregulation of activity of the at least one cancer-associated promoter. In another embodiment, the increase of SUZ12 or EZH2 binding sites correlates with a downregulation of activity of the at least one cancer-associated promoter.
In one embodiment, the at least one promoter may be a canonical promoter that is positioned within 100 bp, 200 bp, 300 bp, 400 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp or 1000 bp from a known gene transcript start site. In a preferred embodiment, the at least one promoter may be a canonical promoter that is positioned within 500 bp from a known gene transcript start site. The gene transcript start site may be associated with one or more of a cell-type specification gene, a cell adhesion gene, a cell mediated immunity gene, a gastric cancer-associated or deregulated gene, a PRC2 target gene or a transcription factor. In one embodiment, the gene transcript start site may be associated with an oncogene. The gene transcript start site may be associated with a gene selected from the group consisting of MYC, MET, CEACAM6, CLDN7, CLDN3, HOTAIR, PVT1, HNF4α, RASA3, GRIN2D, EpCAM and a combination thereof.
In one embodiment, the cancer is gastrointestinal cancer, gastric cancer or colon cancer.
In another embodiment, the at least one promoter may be an alternative promoter that may be associated with a canonical promoter, wherein the canonical promoter may be present in both the cancerous biological sample and the non-cancerous biological sample, and i) wherein the alternative promoter may be only present in the cancerous biological sample, or ii) wherein the alternative promoter may be only absent in the cancerous biological sample.
In some embodiments, the at least one promoter is an unannotated promoter that is positioned more than 100 bp, more than 200 bp, more than 300 bp, more than 400 bp, more than 500 bp, more than 600 bp, more than 700 bp, more than 800 bp, more than 900 bp or more than 1000 bp away from a gene transcript start site. In a preferred embodiment, the at least one promoter is an unannotated promoter that is positioned more than 500 bp away from a gene transcript start site.
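For illustration only, the distance criterion that separates a promoter positioned within 500 bp of a known transcript start site from an unannotated promoter positioned more than 500 bp away may be sketched as follows. The sketch assumes a single chromosome, a promoter represented by one coordinate, and a pre-sorted list of TSS coordinates; the function names and labels are hypothetical.

```python
import bisect


def distance_to_nearest_tss(position, sorted_tss):
    """Return the distance in bp from position to the closest TSS.

    sorted_tss must be a non-empty list of coordinates in ascending order.
    """
    if not sorted_tss:
        raise ValueError("at least one TSS coordinate is required")
    i = bisect.bisect_left(sorted_tss, position)
    # Only the TSS just before and just after the position can be nearest.
    neighbours = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(position - tss) for tss in neighbours)


def classify_by_tss_distance(position, sorted_tss, window_bp=500):
    """Label a promoter position relative to known TSSs (hypothetical labels)."""
    distance = distance_to_nearest_tss(position, sorted_tss)
    return "near_known_tss" if distance <= window_bp else "unannotated"


known_tss = [1000, 5000]
calls = [classify_by_tss_distance(p, known_tss) for p in (1400, 3000, 5600)]
```

Here a distance of exactly 500 bp is treated as "within 500 bp"; this boundary choice is an assumption of the sketch.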
In one embodiment, the method as described herein further comprises measuring the expression level of the at least one alternative promoter in the cancerous biological sample and non-cancerous biological sample, wherein the measuring comprises digital profiling of reporter probes; and determining the differential expression level of the at least one alternative promoter relative to the non-cancerous biological sample, based on the digital profiling of the reporter probes, to validate the presence or absence of at least one alternative promoter in the cancerous biological sample relative to a non-cancerous biological sample.
The step of measuring may be conducted using a NanoString™ platform.
In another aspect, the present invention provides a method for determining the prognosis of cancer in a subject. The method comprises contacting a cancerous biological sample obtained from the subject with at least one antibody or antibodies specific for histone modification H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises at least one region or regions specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid; and determining the presence or absence of at least one cancer-associated promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a reference nucleic acid sequence, wherein the presence or absence of the at least one cancer-associated promoter in the cancerous biological sample is indicative of the prognosis of the cancer in the subject.
In one embodiment, the at least one cancer-associated promoter may be an alternative promoter that is associated with a canonical promoter, wherein the canonical promoter may be present in both the cancerous biological sample and the reference nucleic acid sequence, and i) wherein the alternative promoter may be only present in the cancerous biological sample, or ii) wherein the alternative promoter may be only absent in the cancerous biological sample.
The presence or absence of the at least one alternative promoter in the cancerous sample may be indicative of a poor prognosis of cancer survival in the subject.
In one embodiment the method as described herein further comprises measuring the expression level of the at least one alternative promoter in the cancerous biological sample and the reference nucleic acid sequence, wherein the measuring comprises digital profiling of reporter probes; and determining the differential expression level of the at least one alternative promoter relative to the non-cancerous biological sample, based on the digital profiling of the reporter probes, to validate the presence or absence of at least one alternative promoter in the cancerous biological sample relative to the reference nucleic acid sequence.
The step of measuring may be conducted using a NanoString™ platform.
In another aspect the present invention provides a biomarker for detecting cancer in a subject, the biomarker comprising at least one promoter having a change in signal intensity of H3K4me3 in a cancerous biological sample relative to a non-cancerous biological sample.
In one embodiment, the at least one promoter comprises an increase of EZH2 binding sites relative to the total promoter population. In one embodiment, the at least one promoter may be hypomethylated. In another embodiment, the at least one promoter may be hypermethylated.
The at least one promoter may be a canonical promoter that is positioned less than 500 bp away from a gene transcript start site. In one embodiment, the gene transcript start site may be associated with one or more of a cell-type specification gene, a cell adhesion gene, a cell mediated immunity gene, a gastric cancer-associated or deregulated gene, a PRC2 target gene or a transcription factor. In one embodiment, the gene transcript start site may be associated with an oncogene.
In one embodiment, the gene transcript start site may be associated with a gene selected from the group consisting of MYC, MET, CEACAM6, CLDN7, CLDN3, HOTAIR, PVT1, HNF4α, RASA3, GRIN2D, EpCAM or a combination thereof.
In one embodiment, the at least one promoter may be an alternative promoter that may be associated with a canonical promoter, wherein the canonical promoter may be present in both a cancerous sample and a non-cancerous sample, and i) wherein the alternative promoter may be only present in a cancerous sample, or ii) wherein the alternative promoter may be only absent in a cancerous sample.
In one embodiment, the at least one promoter may be an unannotated promoter that may be positioned more than 100 bp, more than 200 bp, more than 300 bp, more than 400 bp, more than 500 bp, more than 600 bp, more than 700 bp, more than 800 bp, more than 900 bp or more than 1000 bp away from a gene transcript start site. In a preferred embodiment, the at least one promoter may be an unannotated promoter that may be positioned more than 500 bp away from a gene transcript start site.
In another aspect, there is provided a method for modulating the activity of at least one cancer-associated promoter in a cell, comprising administering an inhibitor of EZH2 to the cell. In another aspect there is provided a method for modulating the immune response of a subject to cancer, comprising administering to the subject an inhibitor of EZH2, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject.
In one embodiment, the inhibitor of EZH2 may modulate the expression of immunogenic N-terminal peptides.
In one embodiment, the at least one cancer-associated promoter may be an alternative promoter that may be associated with a canonical promoter, wherein the canonical promoter may be present in both a cancerous sample and a non-cancerous sample, and i) wherein the alternative promoter may only be present in a cancerous sample, or ii) wherein the alternative promoter may only be absent in a cancerous sample.
In one embodiment, the alternative promoter is associated with a transcript variant, and wherein the transcript variant encodes a N-terminal protein variant.
In one embodiment, the N-terminal protein variant may be an N-terminal truncated protein or an N-terminal elongated protein. In one embodiment, the inhibitor of EZH2 may be a siRNA or a small molecule.
In one embodiment, the inhibitor of EZH2 may be GSK126.
In another aspect, there is provided use of an inhibitor of EZH2 in the manufacture of a medicament for modulating the activity of at least one cancer-associated promoter in a cell.
In another aspect there is provided use of an inhibitor of EZH2, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject, in the manufacture of a medicament for modulating the immune response of a subject to cancer.
In another aspect, there is provided an inhibitor of EZH2 for use in modulating the activity of at least one cancer-associated promoter in a cell. In yet another aspect, there is provided an inhibitor of EZH2 for use in modulating the immune response of a subject to cancer, wherein the EZH2 is associated with at least one cancer-associated promoter in the subject.
In another aspect there is provided a method for determining the presence or absence of at least one cancer-associated promoter in a cancerous biological sample relative to a non-cancerous biological sample. The method comprises: contacting the cancerous biological sample with antibodies specific for histone modifications H3K4me3 and H3K4me1; isolating nucleic acid from the cancerous biological sample having a signal ratio of H3K4me3 relative to H3K4me1 greater than 1, wherein the isolated nucleic acid comprises regions specific to said histone modifications; detecting a signal intensity of H3K4me3 in the isolated nucleic acid at a read depth of 20M; and determining the presence or absence of at least one cancer-associated promoter in the cancerous biological sample based on the change in the signal intensity of H3K4me3 relative to the signal intensity of H3K4me3 in a non-cancerous biological sample.
EXPERIMENTAL SECTION
Methods and Materials
Primary Tissue Samples and Cell Lines
Primary patient samples were obtained from the SingHealth tissue repository with approvals from institutional research ethics review committees and signed patient informed consent. ‘Normal’ (non-malignant) samples used in this study refer to samples harvested from the stomach, from sites distant from the tumour and exhibiting no visible evidence of tumour or intestinal metaplasia/dysplasia upon surgical assessment. Tumor samples were confirmed by cryosectioning to contain >60% tumor cells. FU97, IM95, MKN7, OCUM1 and RERF-GC-1B cell lines were obtained from the Japan Health Science Research Resource Bank. AGS, KATOIII and SNU16 cells, and the Hs 1.Int and Hs 738.St/Int gastrointestinal fibroblast lines, were obtained from the American Type Culture Collection. NCC-59, NCC-24, SNU-1967 and SNU-1750 were obtained from the Korean Cell Line Bank. YCC3, YCC7, YCC21 and YCC22 were gifts from Yonsei Cancer Centre, South Korea. HFE145 cells were a gift from Dr. Hassan Ashktorab, Howard University. GES-1 cells were a gift from Dr. Alfred Cheng, Chinese University of Hong Kong. Cell line identities were confirmed by STR DNA profiling using ANSI/ATCC ASN-0002-2011 guidelines. For our study, MKN7 cells, listed as a commonly misidentified cell line by ICLAC (http://iclac.org/databases/cross-contaminations/), exhibited a perfect match (100%) with MKN7 reference profiles in the Japanese Collection of Research Bioresources Cell Bank. All cell lines were negative for mycoplasma contamination as assessed by the MycoAlert™ Mycoplasma Detection Kit (Lonza) and the MycoSensor qPCR Assay Kit (Agilent Technologies). PBMCs from healthy donors were collected under protocol CIRB Ref No. 2010/720/E.
Nano-ChIPseq
Nano-ChIP-Seq was performed as described below.
Primary Tissue and Cell Line Fixation
Fresh-frozen cancer and normal tissues were dissected using a razor blade in liquid nitrogen to obtain ~5 mg sized pieces for each ChIP. Tissue pieces were fixed in 1% formaldehyde/PBS buffer for 10 min at room temperature. Fixation was stopped by addition of glycine to a final concentration of 125 mM. Tissue pieces were washed 3 times with TBSE buffer. For cell lines, 1 million freshly harvested cells were fixed in 1% formaldehyde/medium buffer for 10 minutes (min) at room temperature. Fixation was stopped by addition of glycine to a final concentration of 125 mM. Fixed cells were washed 3 times with TBSE buffer, and centrifuged (5,000 r.p.m., 5 min).
ChIP
Pelleted cells and pulverized tissues were lysed in 100 μl 1% SDS lysis buffer and sonicated to 300-500 bp using a Bioruptor (Diagenode). ChIP was performed using the following antibodies: H3K4me3 (07-473, Millipore); H3K4me1 (ab8895, Abcam); H3K27ac (ab4729, Abcam).
WGA
After recovery of ChIP and input DNA, whole-genome-amplification was performed using the WGA4 kit (Sigma-Aldrich) and BpmI-WGA primers. Amplified DNAs were purified using PCR purification columns (QIAGEN) and digested with BpmI (New England Biolabs) to remove WGA adapters.
Library Preparation and Sequencing
30 ng of amplified DNA was used for each sequencing library preparation (New England Biolabs). 8 libraries were multiplexed (New England Biolabs) and sequenced on 2 lanes of a Hiseq2500 sequencer (Illumina) to an average depth of 20-30 million reads per library.
Sequencing reads were trimmed (10 bp from front and back) and mapped against human genome reference hg19 using the Burrows-Wheeler Aligner (BWA) (version 0.6.2) ‘aln’ algorithm. Reading statistics were generated using mapstat from samtools. We filtered reads based on their mapping quality (MAPQ>=10) and used uniquely mapped reads to perform peak calling using CCAT v3.0. We chose a MAPQ value of ≥10 because i) MAPQ≥10 has been previously reported as a reliable value for confident read mapping, ii) MAPQ≥10 has been recommended by the developers of the BWA-algorithm as a suitable threshold for confident mapping, and iii) independent studies comparing various read alignment algorithms have shown that mapping accuracies plateau at a 10-12 MAPQ threshold.
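The trimming and mapping-quality filter described above may be illustrated by the following minimal sketch. It is not the BWA or samtools workflow itself; it only shows the two criteria (10 bp removed from each end, MAPQ>=10) applied to plain text records. The function names are hypothetical, and the additional selection of uniquely mapped reads is not shown.

```python
def trim_read(sequence, qualities, n=10):
    """Remove n bases (and their quality values) from both ends of a read."""
    if len(sequence) <= 2 * n:
        return "", ""  # nothing would remain after trimming
    end = len(sequence) - n
    return sequence[n:end], qualities[n:end]


def passes_mapq(sam_line, min_mapq=10):
    """Return True for a mapped SAM alignment record with MAPQ >= min_mapq.

    In the SAM format, field 2 is the bitwise FLAG (0x4 = read unmapped)
    and field 5 is the mapping quality (MAPQ).
    """
    if sam_line.startswith("@"):
        return False  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    return not (flag & 0x4) and mapq >= min_mapq


# Toy records with illustrative values only.
records = [
    "read1\t0\tchr1\t100\t37\t30M\t*\t0\t0\t" + "A" * 30 + "\t" + "I" * 30,
    "read2\t0\tchr1\t200\t5\t30M\t*\t0\t0\t" + "C" * 30 + "\t" + "I" * 30,
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t" + "G" * 30 + "\t" + "I" * 30,
]
kept = [line.split("\t")[0] for line in records if passes_mapq(line)]
trimmed = trim_read("A" * 10 + "CGT" + "T" * 10, "I" * 23)
```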
EZH2 ChIP-seq
Cells were cross-linked with 1% formaldehyde for 10 minutes at room temperature, and cross-linking was stopped by adding glycine to a final concentration of 0.2M. Chromatin was extracted and sonicated to ˜500 bp fragments. EZH2 antibodies (Catalog #5246, Cell Signaling) were used for chromatin immunoprecipitation (ChIP). 30 ng of ChIPed DNA was used for each sequencing library preparation (New England Biolabs). The library was sequenced on a Hiseq2500 (Illumina). Input DNA from cells prior to immunoprecipitation was used to normalize ChIP-seq peak calling. Prior to sequencing, qPCR was used to verify that positive and negative control ChIP regions were amplified in the linear range. Sequencing reads were mapped against human genome reference hg19 using the Burrows-Wheeler Aligner (BWA) (version 0.7) ‘aln’ algorithm. Reading statistics were generated using mapstat from samtools. We filtered reads based on their mapping quality (MAPQ>=10) and used uniquely mapped reads to perform peak calling using MACS2.
Quality Control Assessments of Nano-ChIPseq Data
ChIP Enrichment Assessment
We assessed ChIP library qualities (H3K27ac, H3K4me3 and H3K4me1) using two different methods. First, we estimated ChIP qualities, particularly for H3K27ac and H3K4me3, by interrogating their enrichment levels at annotated promoters of protein-coding genes. Specifically, we computed median read densities of input and input-corrected ChIP signals around the transcription start sites (TSSs, +/−500 bp) of highly expressed protein-coding genes. For each sample, we then compared read density ratios of ChIP over input as a surrogate of data quality, retaining only those samples where the ChIP/input ratio was greater than 2-fold. Using this criterion, all H3K4me3 and H3K27ac samples (GC lines and primary samples) exhibited greater than 2-fold enrichment, indicating successful enrichment. Second, we used CHANCE (ChIp-seq ANalytics and Confidence Estimation), a software tool for ChIP-seq quality control and protocol optimization that indicates whether a ChIP library shows successful or weak enrichment. CHANCE assessment confirmed that the large majority (81%) of samples in our study exhibited successful enrichment. The quality status of each library, as assessed by both methods, is reported in Table 1.
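The first quality check (ChIP over input enrichment at TSS windows, retained when greater than 2-fold) can be sketched as follows. This is an illustration only: `tss_enrichment` is a hypothetical name, the inputs are assumed to be per-TSS read densities already computed for the +/−500 bp windows, and the ratio of medians is one plausible reading of the comparison described above.

```python
from statistics import median


def tss_enrichment(chip_density, input_density, min_fold=2.0):
    """Return (fold, passed) for ChIP vs input read densities at TSS windows."""
    chip_med = median(chip_density)
    input_med = median(input_density)
    if input_med == 0:
        raise ValueError("input median density is zero")
    fold = chip_med / input_med
    # A library passes only if enrichment is strictly greater than min_fold.
    return fold, fold > min_fold
```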
TABLE 1
Read mapping statistics of Nano-ChIPseq libraries
S. No | Patient No | Group | Sample ID | Library ID | Histone Modification | Total Reads | Total Mapped Reads | # of Peaks (FDR <5%, CCAT) | CHANCE Enrichment | ChIP enrichment around TSS (>2 Fold)
1 | 1 | N | 2000639 | CHG023 | H3K4Me1 | 116,179,997 | 56,009,114 | 11,438 | successful | yes
2 | 1 | N | 2000639 | CHG079 | H3K4Me3 | 144,760,092 | 45,662,594 | 13,301 | successful | yes
3 | 1 | N | 2000639 | CHG022 | H3K27Ac | 107,005,238 | 47,688,264 | 30,155 | successful | yes
4 | 1 | N | 2000639 | CHG021 | Input | 108,432,681 | 53,434,667 | - | - | -
5 | 1 | T | 2000639 | CHG019 | H3K4Me1 | 139,751,844 | 62,529,719 | 9,133 | successful | yes
6 | 1 | T | 2000639 | CHG078 | H3K4Me3 | 176,761,815 | 52,219,714 | 15,417 | successful | yes
7 | 1 | T | 2000639 | CHG018 | H3K27Ac | 125,811,014 | 56,636,793 | 22,220 | successful | yes
8 | 1 | T | 2000639 | CHG017 | Input | 133,549,980 | 62,465,142 | - | - | -
9 | 2 | N | 2000721 | CHG081 | H3K4Me3 | 123,984,264 | 41,723,243 | 13,046 | successful | yes
10 | 2 | N | 2000721 | CHG031 | H3K4Me1 | 142,898,092 | 61,716,210 | 17,896 | successful | yes
11 | 2 | N | 2000721 | CHG030 | H3K27Ac | 142,881,448 | 56,328,103 | 24,624 | successful | yes
12 | 2 | N | 2000721 | CHG029 | Input | 144,582,591 | 67,254,098 | - | - | -
13 | 2 | T | 2000721 | CHG080 | H3K4Me3 | 128,094,707 | 52,416,345 | 12,751 | successful | yes
14 | 2 | T | 2000721 | CHG026 | H3K27Ac | 132,143,844 | 52,416,345 | 45,274 | successful | yes
15 | 2 | T | 2000721 | CHG027 | H3K4Me1 | 120,824,194 | 54,688,706 | 48,701 | successful | yes
16 | 2 | T | 2000721 | CHG025 | Input | 150,621,523 | 65,242,401 | - | - | -
17 | 3 | N | 2000986 | CHG083 | H3K4Me3 | 145,813,278 | 44,476,466 | 13,305 | successful | yes
18 | 3 | N | 2000986 | CHG039 | H3K4Me1 | 112,190,461 | 52,061,916 | 14,977 | successful | yes
19 | 3 | N | 2000986 | CHG038 | H3K27Ac | 136,195,033 | 47,671,991 | 26,993 | successful | yes
20 | 3 | N | 2000986 | CHG037 | Input | 125,858,642 | 58,503,831 | - | - | -
21 | 3 | T | 2000986 | CHG082 | H3K4Me3 | 199,735,230 | 48,070,517 | 13,296 | successful | yes
22 | 3 | T | 2000986 | CHG035 | H3K4Me1 | 99,757,592 | 48,602,649 | 25,882 | successful | yes
23 | 3 | T | 2000986 | CHG034 | H3K27Ac | 127,564,120 | 45,231,776 | 29,278 | successful | yes
24 | 3 | T | 2000986 | CHG033 | Input | 127,392,001 | 57,846,771 | - | - | -
25 | 4 | N | 980437 | CHG087 | H3K4Me3 | 252,269,976 | 16,106,111 | 6,925 | weak | yes
26 | 4 | N | 980437 | CHG089 | H3K27Ac | 248,399,140 | 21,095,856 | 20,018 | weak | yes
27 | 4 | N | 980437 | CHG086 | Input | 223,083,607 | 13,951,728 | - | - | -
28 | 4 | T | 980437 | CHG091 | H3K4Me3 | 254,777,628 | 12,340,257 | 7,007 | weak | yes
29 | 4 | T | 980437 | CHG093 | H3K27Ac | 215,915,787 | 19,054,278 | 48,614 | weak | yes
30 | 4 | T | 980437 | CHG090 | Input | 214,007,053 | 18,743,433 | - | - | -
31 | 5 | N | 980097 | CHG097 | H3K27Ac | 254,991,965 | 17,871,717 | 10,566 | weak | yes
32 | 5 | N | 980097 | CHG094 | Input | 248,345,017 | 15,056,998 | - | - | -
33 | 5 | T | 980097 | CHG101 | H3K27Ac | 254,857,885 | 16,050,861 | 81,607 | successful | yes
34 | 5 | T | 980097 | CHG098 | Input | 235,148,448 | 16,412,565 | - | - | -
35 | 6 | N | 990068 | CHG441 | H3K4Me3 | 25,942,766 | 18,661,944 | 9,040 | successful | yes
36 | 6 | N | 990068 | CHG443 | H3K27Ac | 28,993,775 | 20,404,671 | 30,306 | successful | yes
37 | 6 | N | 990068 | CHG444 | Input | 16,583,307 | 14,164,125 | - | - | -
38 | 6 | T | 990068 | CHG437 | H3K4Me3 | 19,295,687 | 15,981,638 | 23,546 | successful | yes
39 | 6 | T | 990068 | CHG439 | H3K27Ac | 30,394,067 | 26,279,884 | 84,958 | successful | yes
40 | 6 | T | 990068 | CHG440 | Input | 54,957,058 | 46,535,339 | - | - | -
41 | 7 | N | 2000085 | CHG449 | H3K4Me3 | 22,207,074 | 17,120,624 | 13,421 | weak | yes
42 | 7 | N | 2000085 | CHG451 | H3K27Ac | 31,752,518 | 26,505,029 | 93,432 | successful | yes
43 | 7 | N | 2000085 | CHG452 | Input | 23,861,825 | 20,188,881 | - | - | -
44 | 7 | T | 2000085 | CHG445 | H3K4Me3 | 27,386,842 | 17,898,292 | 16,274 | successful | yes
45 | 7 | T | 2000085 | CHG447 | H3K27Ac | 37,833,126 | 29,893,873 | 67,464 | successful | yes
46 | 7 | T | 2000085 | CHG448 | Input | 25,476,868 | 21,590,215 | - | - | -
47 | 8 | N | 980401 | GCC005 | H3K4Me3 | 47,143,397 | 32,011,124 | 9,739 | weak | yes
48 | 8 | N | 980401 | GCC006 | H3K4Me1 | 49,813,057 | 38,517,830 | 29,304 | successful | yes
49 | 8 | N | 980401 | GCC007 | H3K27Ac | 49,333,955 | 34,378,734 | 104,483 | successful | yes
50 | 8 | N | 980401 | GCC008 | Input | 48,654,609 | 39,027,473 | - | - | -
51 | 8 | T | 980401 | GCC002 | H3K4Me1 | 46,014,858 | 35,781,553 | 5,374 | weak | yes
52 | 8 | T | 980401 | GCC001 | H3K4Me3 | 40,037,248 | 16,724,980 | 11,773 | successful | yes
53 | 8 | T | 980401 | GCC003 | H3K27Ac | 70,844,500 | 51,841,868 | 108,169 | successful | yes
54 | 8 | T | 980401 | GCC004 | Input | 55,650,648 | 46,769,330 | - | - | -
55 | 9 | N | 980447 | GCC013 | H3K4Me3 | 49,510,760 | 43,302,748 | 10,442 | successful | yes
56 | 9 | N | 980447 | GCC014 | H3K4Me1 | 51,911,778 | 46,524,450 | 18,916 | weak | yes
57 | 9 | N | 980447 | GCC015 | H3K27Ac | 43,725,655 | 38,581,698 | 147,189 | successful | yes
58 | 9 | N | 980447 | GCC016 | Input | 43,722,729 | 36,570,838 | - | - | -
59 | 9 | T | 980447 | GCC010 | H3K4Me1 | 51,224,701 | 40,643,956 | 7,959 | successful | yes
60 | 9 | T | 980447 | GCC009 | H3K4Me3 | 41,895,137 | 28,002,598 | 9,325 | weak | yes
61 | 9 | T | 980447 | GCC011 | H3K27Ac | 75,243,898 | 63,172,397 | 98,169 | successful | yes
62 | 9 | T | 980447 | GCC012 | Input | 40,502,678 | 33,280,117 | - | - | -
63 | 10 | N | 2001206 | GCC021 | H3K4Me3 | 42,094,067 | 35,485,202 | 12,682 | successful | yes
64 | 10 | N | 2001206 | GCC022 | H3K4Me1 | 44,213,793 | 38,760,554 | 50,615 | weak | yes
65 | 10 | N | 2001206 | GCC023 | H3K27Ac | 47,356,714 | 34,355,781 | 112,565 | successful | yes
66 | 10 | N | 2001206 | GCC024 | Input | 58,885,884 | 49,927,340 | - | - | -
67 | 10 | T | 2001206 | GCC017 | H3K4Me3 | 48,193,228 | 36,729,294 | 13,835 | successful | yes
68 | 10 | T | 2001206 | GCC018 | H3K4Me1 | 43,730,845 | 35,480,758 | 44,504 | weak | yes
69 | 10 | T | 2001206 | GCC019 | H3K27Ac | 52,518,766 | 42,398,517 | 111,758 | successful | yes
70 | 10 | T | 2001206 | GCC020 | Input | 81,949,870 | 70,380,385 | - | - | -
71 | 11 | N | 980436 | GCC029 | H3K4Me3 | 27,612,232 | 20,121,957 | 12,398 | weak | yes
72 | 11 | N | 980436 | GCC030 | H3K4Me1 | 22,983,565 | 20,452,059 | 53,077 | weak | yes
73 | 11 | N | 980436 | GCC031 | H3K27Ac | 23,061,305 | 15,315,483 | 104,880 | successful | yes
74 | 11 | N | 980436 | GCC032 | Input | 24,411,542 | 21,182,579 | - | - | -
75 | 11 | T | 980436 | GCC025 | H3K4Me3 | 31,564,679 | 24,866,375 | 8,625 | weak | yes
76 | 11 | T | 980436 | GCC026 | H3K4Me1 | 51,645,661 | 38,028,800 | 58,456 | successful | yes
77 | 11 | T | 980436 | GCC027 | H3K27Ac | 51,093,256 | 35,496,776 | 102,351 | successful | yes
78 | 11 | T | 980436 | GCC028 | Input | 25,606,490 | 20,820,223 | - | - | -
79 | 12 | N | 980417 | GCC037 | H3K4Me3 | 18,976,505 | 15,277,228 | 10,387 | successful | yes
80 | 12 | N | 980417 | GCC039 | H3K27Ac | 30,443,642 | 25,447,390 | 70,910 | successful | yes
81 | 12 | N | 980417 | GCC038 | H3K4Me1 | 22,127,416 | 18,537,610 | 109,119 | successful | yes
82 | 12 | N | 980417 | GCC040 | Input | 33,758,416 | 28,242,473 | - | - | -
83 | 12 | T | 980417 | GCC033 | H3K4Me3 | 42,615,610 | 27,972,601 | 10,260 | successful | yes
84 | 12 | T | 980417 | GCC035 | H3K27Ac | 33,438,272 | 29,141,996 | 76,369 | successful | yes
85 | 12 | T | 980417 | GCC034 | H3K4Me1 | 31,115,402 | 26,172,044 | 142,635 | weak | yes
86 | 12 | T | 980417 | GCC036 | Input | 26,806,807 | 22,277,771 | - | - | -
87 | 13 | N | 980319 | GCC075 | H3K4Me3 | 34,503,108 | 26,201,666 | 9,466 | successful | yes
88 | 13 | N | 980319 | GCC076 | H3K4Me1 | 32,308,832 | 28,194,660 | 56,964 | weak | yes
89 | 13 | N | 980319 | GCC077 | H3K27Ac | 28,534,828 | 24,595,902 | 73,073 | successful | yes
90 | 13 | N | 980319 | GCC078 | Input | 31,533,287 | 26,147,884 | - | - | -
91 | 13 | T | 980319 | GCC071 | H3K4Me3 | 31,707,599 | 22,793,555 | 14,049 | successful | yes
92 | 13 | T | 980319 | GCC073 | H3K27Ac | 42,548,744 | 35,755,479 | 102,971 | successful | yes
93 | 13 | T | 980319 | GCC072 | H3K4Me1 | 28,112,304 | 24,361,418 | 196,347 | weak | yes
94 | 13 | T | 980319 | GCC074 | Input | 28,895,896 | 24,529,014 | - | - | -
95 | 14 | N | 990275 | GCC088 | H3K4Me3 | 39,968,810 | 31,536,231 | 7,964 | successful | yes
96 | 14 | N | 990275 | GCC089 | H3K27Ac | 52,738,627 | 22,089,449 | 70,246 | successful | yes
97 | 14 | N | 990275 | GCC090 | Input | 33,342,252 | 21,049,309 | - | - | -
98 | 14 | T | 990275 | GCC085 | H3K4Me3 | 26,399,904 | 14,795,436 | 25,423 | weak | yes
99 | 14 | T | 990275 | GCC086 | H3K27Ac | 45,712,891 | 25,668,453 | 183,458 | successful | yes
100 | 14 | T | 990275 | GCC087 | Input | 40,285,061 | 32,790,063 | - | - | -
101 | 15 | N | 2000877 | GCC082 | H3K4Me3 | 52,151,546 | 22,229,998 | 11,368 | successful | yes
102 | 15 | N | 2000877 | GCC083 | H3K27Ac | 45,775,899 | 41,027,897 | 61,175 | weak | yes
103 | 15 | N | 2000877 | GCC084 | Input | 38,226,148 | 30,117,584 | - | - | -
104 | 15 | T | 2000877 | GCC079 | H3K4Me3 | 49,368,282 | 24,022,463 | 9,837 | successful | yes
105 | 15 | T | 2000877 | GCC080 | H3K27Ac | 38,621,705 | 33,990,267 | 41,048 | successful | yes
106 | 15 | T | 2000877 | GCC081 | Input | 38,824,621 | 32,814,299 | - | - | -
107 | 16 | N | 20020720 | GCC100 | H3K4Me3 | 58,679,413 | 34,278,884 | 9,901 | successful | yes
108 | 16 | N | 20020720 | GCC101 | H3K27Ac | 43,532,496 | 37,750,917 | 65,167 | successful | yes
109 | 16 | N | 20020720 | GCC102 | Input | 39,544,734 | 31,454,551 | - | - | -
110 | 16 | T | 20020720 | GCC097 | H3K4Me3 | 57,599,648 | 16,022,427 | 12,922 | successful | yes
111 | 16 | T | 20020720 | GCC098 | H3K27Ac | 35,400,105 | 29,507,542 | 74,115 | successful | yes
112 | 16 | T | 20020720 | GCC099 | Input | 37,092,424 | 29,452,932 | - | - | -
113 | 17 | N | 20021007 | GCC094 | H3K4Me3 | 56,788,147 | 18,217,449 | 16,073 | successful | yes
114 | 17 | N | 20021007 | GCC095 | H3K27Ac | 40,488,514 | 33,372,754 | 122,851 | successful | yes
115 | 17 | N | 20021007 | GCC096 | Input | 40,712,616 | 34,440,613 | - | - | -
116 | 17 | T | 20021007 | GCC091 | H3K4Me3 | 33,903,211 | 27,230,052 | 7,843 | weak | yes
117 | 17 | T | 20021007 | GCC092 | H3K27Ac | 50,268,912 | 19,156,361 | 98,104 | successful | yes
118 | 17 | T | 20021007 | GCC093 | Input | 34,936,961 | 29,417,989 | - | - | -
119 | CL1 | FU97 | FU97 | GCC043 | H3K27Ac | 30,087,131 | 22,566,178 | 21,867 | successful | yes
120 | CL1 | FU97 | FU97 | GCC041 | H3K4Me3 | 26,986,288 | 23,243,556 | 26,562 | successful | yes
121 | CL1 | FU97 | FU97 | GCC045 | Input | 33,566,067 | 23,430,741 | - | - | -
122 | CL10 | RERF-GC-1B | RERF-GC-1B | CHG374 | H3K27Ac | 39,882,820 | 19,500,590 | 11,201 | successful | yes
123 | CL10 | RERF-GC-1B | RERF-GC-1B | CHG371 | H3K4Me3 | 42,450,431 | 25,988,948 | 16,625 | successful | yes
124 | CL10 | RERF-GC-1B | RERF-GC-1B | CHG376 | Input | 21,437,700 | 16,948,709 | - | - | -
125 | CL11 | SNU16 | SNU16 | CHG236 | H3K27Ac | 21,726,635 | 16,967,938 | 13,619 | successful | yes
126 | CL11 | SNU16 | SNU16 | CHG233 | H3K4Me3 | 20,136,058 | 18,151,002 | 19,445 | successful | yes
127 | CL11 | SNU16 | SNU16 | CHG232 | Input | 19,522,181 | 14,558,761 | - | - | -
128 | CL12 | SNU1750 | SNU1750 | CHG230 | H3K27Ac | 18,716,777 | 15,805,037 | 15,074 | successful | yes
129 | CL12 | SNU1750 | SNU1750 | CHG227 | H3K4Me3 | 16,655,044 | 14,883,880 | 18,130 | successful | yes
130 | CL12 | SNU1750 | SNU1750 | CHG226 | Input | 19,602,424 | 13,575,272 | - | - | -
131 | CL13 | YCC21 | YCC21 | CHG429 | H3K27Ac | 22,884,268 | 13,861,557 | 21,415 | successful | yes
132 | CL13 | YCC21 | YCC21 | CHG427 | H3K4Me3 | 22,788,225 | 15,669,142 | 20,120 | successful | yes
133 | CL13 | YCC21 | YCC21 | CHG431 | Input | 40,378,916 | 34,747,778 | - | - | -
134 | CL13 | YCC22 | YCC22 | GCC063 | H3K27Ac | 33,314,935 | 23,877,905 | 11,774 | successful | yes
135 | CL13 | YCC22 | YCC22 | GCC061 | H3K4Me3 | 27,410,298 | 24,163,717 | 25,417 | successful | yes
136 | CL13 | YCC22 | YCC22 | GCC065 | Input | 26,685,596 | 18,976,555 | - | - | -
137 | CL14 | YCC3 | YCC3 | GCC053 | H3K27Ac | 27,581,400 | 21,579,098 | 14,118 | successful | yes
138 | CL14 | YCC3 | YCC3 | GCC051 | H3K4Me3 | 22,106,259 | 18,914,296 | 17,276 | successful | yes
139 | CL14 | YCC3 | YCC3 | GCC055 | Input | 27,745,993 | 18,854,658 | - | - | -
140 | CL15 | YCC7 | YCC7 | CHG424 | H3K27Ac | 38,599,550 | 22,445,268 | 32,770 | successful | yes
141 | CL15 | YCC7 | YCC7 | CHG422 | H3K4Me3 | 19,594,480 | 14,546,474 | 22,521 | successful | yes
142 | CL15 | YCC7 | YCC7 | CHG426 | Input | 24,527,190 | 21,748,808 | - | - | -
143 | CL2 | HFE145 | HFE145 | CHG245 | H3K4Me3 | 24,122,708 | 19,760,850 | 18,492 | successful | yes
144 | CL2 | HFE145 | HFE145 | CHG244 | Input | 22,447,791 | 17,960,470 | - | - | -
145 | CL2 | HFE145 | HFE145 | HFE145-EZH2-MJ-5246 | H3K4Me3 | 50,701,700 | 45,821,209 | 17,299 | weak | -
146 | CL2 | HFE145 | HFE145 | HFE145-input-MJ | Input | 36,885,332 | 36,157,452 | - | - | -
147 | CL3 | Hs1.Int | Hs1.Int | HsInt-K4me3.merged | H3K4Me3 | 37,088,221 | 32,789,363 | 22,518 | successful | -
148 | CL3 | Hs1.Int | Hs1.Int (replicate) | HsInt-G-K4me3.merged | H3K4Me3 | 30,617,105 | 27,713,302 | 20,298 | successful | -
149 | CL3 | Hs1.Int | Hs1.Int | HsInt-input.merged | Input | 32,275,816 | 28,576,200 | - | - | -
150 | CL4 | Hs738.St/Int | Hs738.St/Int | Hs738-K4me3.merged | H3K4Me3 | 37,945,394 | 33,334,651 | 150,552 | successful | -
151 | CL4 | Hs738.St/Int | Hs738.St/Int | Hs738-K4me3.merged | Input | 32,275,816 | 24,581,922 | - | - | -
152 | CL5 | IM95 | IM95 | CHG434 | H3K27Ac | 23,309,435 | 9,168,213 | 27,692 | successful | yes
153 | CL5 | IM95 | IM95 | CHG432 | H3K4Me3 | 25,179,506 | 14,069,213 | 19,956 | successful | yes
154 | CL5 | IM95 | IM95 | CHG436 | Input | 37,968,519 | 33,292,944 | - | - | -
155 | CL6 | KATO3 | KATO3 | CHG242 | H3K27Ac | 24,559,532 | 17,356,721 | 28,730 | successful | yes
156 | CL6 | KATO3 | KATO3 | CHG238 | Input | 20,527,352 | 14,593,025 | - | - | -
157 | CL7 | MKN7 | MKN7 | CHG419 | H3K27Ac | 35,301,333 | 30,804,178 | 24,268 | successful | yes
158 | CL7 | MKN7 | MKN7 | CHG417 | H3K4Me3 | 28,119,400 | 24,793,006 | 23,766 | successful | yes
159 | CL7 | MKN7 | MKN7 | CHG421 | Input | 35,839,896 | 31,791,610 | - | - | -
160 | CL8 | NCC59 | NCC59 | CHG218 | H3K27Ac | 22,973,156 | 19,828,610 | 14,937 | successful | yes
161 | CL8 | NCC59 | NCC59 | CHG215 | H3K4Me3 | 15,642,441 | 13,907,147 | 12,410 | successful | yes
162 | CL8 | NCC59 | NCC59 | CHG214 | Input | 17,926,188 | 13,139,789 | - | - | -
163 | CL9 | OCUM1 | OCUM1 | CHG212 | H3K27Ac | 24,573,737 | 20,570,185 | 17,284 | successful | yes
164 | CL9 | OCUM1 | OCUM1 | CHG209 | H3K4Me3 | 19,557,872 | 17,178,274 | 15,445 | successful | yes
165 | CL9 | OCUM1 | OCUM1 | CHG208 | Input | 20,585,679 | 16,680,529 | - | - | -
Promoter Analysis
Promoter (H3K4me3 hi/H3K4me1 lo) regions were identified by calculating the H3K4me3:H3K4me1 ratio for all H3K4me3 regions merged across normal and GC samples. We estimated the required sample size to achieve 80% power and 10% type I error (http://powerandsamplesize.com/) based on the average signals of the top 100 differential promoters between tumor and normal samples. This analysis yielded a recommended sample size of 11 (average), which is met in our study (16 N/T). Regions with H3K4me3:H3K4me1 ratios <1 in both normal and GC samples were excluded from further analysis. For all analyses performed in this study, promoter regions were defined as genomic locations exhibiting H3K4me3 hi/H3K4me1 lo signals, and for all subsequent analyses, H3K4me3 signals were compared only within this pre-defined H3K4me3 hi/H3K4me1 lo subset. H3K27ac data was used for correlative analysis. H3K4me3 data (fastqs) for colon carcinoma lines was downloaded from public databases: Hct116 and Caco2 from ENCODE, and V503 and V400 from GSE36204. To compare promoter signals between GC and normal samples, we used the DESeq2 and edgeR Bioconductor packages with a read count matrix of ChIP-seq signals, adjusting for replicate information. Regions with fold changes greater than 1.5 (FDR<0.1) were selected as significantly different. The criteria of FC 1.5 and q<0.1 were based on previous literature in which ChIP-seq profiles were compared using DESeq2 and edgeR with similar thresholds. Significantly altered promoters identified by DESeq2 overlapped almost completely with altered promoters found by edgeR. A regularized log transformation of the DESeq2 read counts was used to plot PCAs and heatmaps.
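The two selection rules in this paragraph (the H3K4me3:H3K4me1 ratio filter, and the fold change and FDR thresholds) can be sketched in Python as below. The function names are hypothetical, the statistics are assumed to come from a tool such as DESeq2 or edgeR as log2 fold changes (tumor vs normal) with adjusted p-values, and treating the 1.5-fold threshold symmetrically for gained and lost promoters is an assumption.

```python
import math


def is_promoter(ratio_normal, ratio_tumor, min_ratio=1.0):
    """Keep a region unless its H3K4me3:H3K4me1 ratio is < min_ratio in both groups."""
    return not (ratio_normal < min_ratio and ratio_tumor < min_ratio)


def classify_promoter(log2_fc, fdr, min_fc=1.5, max_fdr=0.1):
    """Label a promoter 'gained', 'lost' or 'unaltered' (tumor vs normal)."""
    if fdr >= max_fdr or abs(log2_fc) <= math.log2(min_fc):
        return "unaltered"
    return "gained" if log2_fc > 0 else "lost"
```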
Transcriptome Analysis
RNA-seq data was obtained from the European Genome-phenome Archive under Accession No: EGAS00001001128. Data was processed by first aligning to GENCODE v19 transcript annotations using TopHat v2.0.12. Cufflinks 2.2.0 was used to generate FPKM abundance measures. For identification of novel transcripts, Cufflinks was used without employing a reference transcript annotation. Transcripts were then merged across all GC and normal samples and compared against GENCODE annotations to identify novel transcripts using Cuffmerge 2.2.0. Deep-depth strand-specific RNA sequencing was also performed on 10 additional primary samples. Total RNA was extracted using the Qiagen RNeasy Mini kit, and RNA-seq libraries were constructed according to the manufacturer's instructions using the Illumina Stranded Total RNA Sample Prep Kit v2 (Illumina, San Diego, Calif., USA) with the Ribo-Zero Gold option (Epicentre, Madison, Wis., USA) and 1 μg total RNA. Sequencing was performed using the paired-end 101 bp read option. TCGA datasets were downloaded from the TCGA Data Portal (https://tcga-data.nci.nih.gov/tcga) in the form of fastq files, which were then aligned to GENCODE v19 transcript annotations using TopHat v2.0.12. To analyze promoter-associated RNA expression, RNA-seq reads from TCGA samples (tumors and normals) were mapped against the genomic locations of promoter regions originally defined by epigenomic profiling in the discovery samples, including all promoters, gained somatic promoters, and lost somatic promoters (see FIG. 1 in Main Text). RNA-seq reads mapping to these epigenome-defined promoter regions were then quantified and normalized by promoter length (kilobases) and by total library size, and fold changes in expression were computed between tumor and normal TCGA sample groups. The length of a promoter locus was defined as the number of base pairs (bps) between the start and stop genomic coordinates of the H3K4me3 region as identified by the peak caller program CCAT v3.0.
Isoform-level quantification for alternative promoter-driven transcripts was performed using Cufflinks (FPKM), Kallisto (TPM) and MISO (isoform-centric analysis). Assigned counts for each isoform were normalized by DESeq2.
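The promoter-level normalization described above (reads scaled by promoter length in kilobases and by total library size, followed by a tumor versus normal fold change) can be sketched as follows. The function names, the per-million scaling of library size, and the pseudocount are assumptions made for illustration and are not stated in the text.

```python
def promoter_rpkm(read_count, start, end, library_size):
    """Reads per kilobase of promoter per million reads in the library."""
    length_kb = (end - start) / 1000.0  # promoter length from peak coordinates
    if length_kb <= 0 or library_size <= 0:
        raise ValueError("promoter length and library size must be positive")
    return read_count / length_kb / (library_size / 1e6)


def fold_change(tumor_values, normal_values, pseudocount=1.0):
    """Ratio of group means; the pseudocount (assumed) avoids division by zero."""
    tumor_mean = sum(tumor_values) / len(tumor_values)
    normal_mean = sum(normal_values) / len(normal_values)
    return (tumor_mean + pseudocount) / (normal_mean + pseudocount)
```

For example, 500 reads over a 2 kb promoter in a library of 20 million reads give 12.5 reads per kilobase per million.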
DNA Methylation Analysis
Genomic DNA of gastric tumors and matched normal gastric tissues was extracted (QIAGEN) and processed for DNA methylation profiling using Illumina HumanMethylation450 BeadChips (HM450). Methylation β-values were calculated and background corrected using the methylumi R BioConductor package. Normalization was performed using the BMIQ method (wateRmelon package in R). CpG island locations were downloaded from the UCSC genome browser. Overlaps of at least 1 bp between promoter loci and CpG islands were identified using BEDTools intersect. For each group (all promoters, gained somatic promoters and lost somatic promoters), we identified probes overlapping the predicted promoter regions and calculated average beta value differences. A two-sample Wilcoxon test was performed.
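The overlap criterion (at least 1 bp) and the average beta value difference can be illustrated with a short Python sketch. It assumes half-open, BED-style coordinates on the same chromosome and paired per-probe beta values; the function names are hypothetical, and this is not a substitute for BEDTools or for the Wilcoxon test mentioned above.

```python
def overlaps(a_start, a_end, b_start, b_end, min_bp=1):
    """True if two half-open intervals on one chromosome share >= min_bp bases."""
    return min(a_end, b_end) - max(a_start, b_start) >= min_bp


def mean_beta_difference(tumor_betas, normal_betas):
    """Average per-probe (tumor - normal) beta value for paired probe lists."""
    if len(tumor_betas) != len(normal_betas) or not tumor_betas:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [t - n for t, n in zip(tumor_betas, normal_betas)]
    return sum(diffs) / len(diffs)
```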
Survival Analysis
Kaplan-Meier survival analysis was used with overall survival as the outcome metric. Log-rank tests were used to assess the significance of the Kaplan-Meier analysis.
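For readers unfamiliar with the estimator, a minimal Kaplan-Meier sketch in plain Python is given below. It is illustrative only: the function name is hypothetical, events are assumed to be coded 1 for an observed death and 0 for censoring, and the log-rank test is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with >= 1 death."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored subjects leave the risk set
    return curve
```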
Gene Set Enrichment Analysis
Gene set enrichment analysis was performed using MSigDB by computing the overlap of genes associated with somatic promoters against the C2 set of curated genes.
Mass Spectrometry and Data Analysis
Peptide-level mass spectrometry data for 90 colon and rectal cancer (CRC) samples and 60 normal colon epithelium samples, generated by the Clinical Proteomic Tumor Analysis Consortium (NCI/NIH), were downloaded from the CPTAC portal (https://cptac-data-portal.georgetown.edu/cptac). Spectral counts were extracted using IDPicker's idQuery tool. Differentially expressed peptides were identified by fitting a linear model (limma R) on quantile normalized and log2 transformed spectral counts. For GC cell line mass spectrometry, AGS, GES-1, SNU1750 and MKN1 cells were extracted with RIPA buffer supplemented with protease inhibitor. 150 μg protein extract of each biological quadruplicate (i.e. 4 replicates per cell line) was separated on a 12% NuPAGE Novex Bis-Tris precast gel (Thermo Scientific). For in-gel digestion, samples were separated into two fractions and reduced in 10 mM DTT for 1 h at 56° C. followed by alkylation with 55 mM iodoacetamide (Sigma) for 45 min in the dark. Tryptic digests were performed in 50 mM ammonium bicarbonate buffer with 2 μg trypsin (Promega) at 37° C. overnight. Peptides were desalted on StageTips and analysed by nanoflow liquid chromatography on an EASY-nLC 1200 system coupled to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific). Peptides were separated on a C18 reversed-phase column (25 cm long, 75 μm inner diameter) packed in-house with ReproSil-Pur C18-AQ 1.9 μm resin (Dr Maisch). The column was mounted on an Easy Flex Nano Source and temperature controlled by a column oven (Sonation) at 40° C. A 225-min gradient from 2 to 40% acetonitrile in 0.5% formic acid at a flow of 225 nl/min was used. Spray voltage was set to 2.4 kV. The Q Exactive HF was operated with a TOP20 MS/MS spectra acquisition method per MS full scan. MS scans were conducted with 60,000 and MS/MS scans with 15,000 resolution. For data analysis, raw files were processed with MaxQuant version 1.5.2.8 against the UNIPROT annotated human protein database.
Carbamidomethylation was set as a fixed modification, while methionine oxidation and protein N-acetylation were considered as variable modifications. Search results were processed with MaxQuant and filtered with a false discovery rate of 0.01. The match-between-runs option and LFQ quantitation were activated. LFQ intensities were filtered to remove potential contaminants and reverse proteins, and were log2 transformed. Missing values were then imputed using the open-source software Perseus (0.5 width, 1.8 downshift), and the data were fitted using linear models (limma R).
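The imputation settings quoted above (0.5 width, 1.8 downshift) correspond to drawing missing values from a normal distribution that is narrowed and shifted down relative to the observed intensities. The sketch below illustrates that idea in plain Python; it is not Perseus itself, the function name is hypothetical, missing values are assumed to be encoded as `None`, and the fixed seed is only there to make the example reproducible.

```python
import random
from statistics import mean, stdev


def impute_downshifted(values, width=0.5, downshift=1.8, seed=0):
    """Replace None entries with draws from N(mu - downshift*sd, (width*sd)^2)."""
    observed = [v for v in values if v is not None]
    mu, sd = mean(observed), stdev(observed)
    rng = random.Random(seed)
    return [
        v if v is not None else rng.gauss(mu - downshift * sd, width * sd)
        for v in values
    ]
```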
5′ RACE and Gene Cloning
5′ Rapid amplification of cDNA ends (5′ RACE) was performed using the 5′ RACE System for Rapid Amplification of cDNA Ends, Version 2 (Invitrogen, 18374-058). Briefly, 2 μg of total RNA was used for each reverse transcription reaction with SuperScript™ II reverse transcriptase and gene-specific primer 1 for each gene. After cDNA synthesis, RNase mix (RNase H and RNase T1) was used to degrade the RNA. First strand cDNAs were then purified with S.N.A.P. columns, and tailed with dCTP and TdT. dC-tailed cDNAs were amplified using the abridged anchor primer and nested gene-specific primer 2 with GoTaq® Hot Start Polymerase (Promega, M5001). Subsequently, primary PCR products were reamplified with the abridged universal amplification primer (AUAP) and gene-specific primer 3. Gel electrophoresis was performed. PCR bands of interest were excised and purified for cloning with the TA Cloning Kit (Invitrogen, K2020). A minimum of 12 independent colonies were isolated, and purified plasmid DNA was sequenced bi-directionally on an ABI 3730 DNA analyzer (Applied Biosystems) (Table 2). Constructs for MET transcripts were generated by PCR amplification of full-length cDNAs encoding wild type and variant MET from KATOIII cells. Wild type and variant RASA3 full-length transcripts were PCR amplified from NCC59 cells. cDNA fragments were cloned into the pCI-Puro-HA vector (modified from Promega's pCI-Neo vector, a gift from Wanjin Hong, Institute of Molecular and Cell Biology, Singapore). Plasmids were transiently transfected into cell lines using Lipofectamine 3000 (Thermo Scientific).
TABLE 2
RACE Primers
Gene | Gene specific primer 1 | Gene specific primer 2 | Gene specific primer 3
RASA3 | 5′ GGAGTAGATACGCTCCGT 3′ (SEQ ID NO: 1837) | 5′ CACAGCCAGTGGCCGCTCAGGTA 3′ (SEQ ID NO: 1838) | 5′ CTTCTCCACTGCCAGGATGTT 3′ (SEQ ID NO: 1839)
MET | 5′ TAGGAGAATGTACTGTAT 3′ (SEQ ID NO: 1840) | 5′ GGAGACACTGGATGGGAGTC 3′ (SEQ ID NO: 1841) | 5′ CGAGAAACCACAACCTGCAT 3′ (SEQ ID NO: 1842)
Western Blotting
3×10^5 HEK293 cells were seeded and transfected using Lipofectamine 3000 (Thermo Scientific). Cells were serum starved for 16 hours before addition of human HGF (R&D Systems, 100 ng/ml) for 0, 15 and 30 minutes, and were immediately harvested on ice with cold Triton X-100 lysis buffer (50 mM Tris pH 8.0, 150 mM NaCl, 1% Triton X-100) containing protease and phosphatase inhibitors (Roche). Protein concentration was measured by Pierce BCA protein assay (Thermo Scientific). Cell lysates were heated at 95° C. for 10 min in SDS sample buffer and 20 μg of each cell lysate was loaded per well. Proteins were transferred to nitrocellulose membranes. Western blotting was performed by incubating membranes for 4 hrs at room temperature with the following antibodies: Met and β-actin (Santa Cruz); p-MET (Y1234/1235 and Y1349), pSTAT3 (S727 and Y705), STAT3, ERK, p-ERK, Gab1 and pGab1 (Y627) (Cell Signaling). Membranes were incubated in secondary antibodies at 1:3,000 for 1 hr at room temperature and developed with SuperSignal West Femto Maximum Sensitivity substrate (Thermo Scientific) using the ChemiDoc™ MP Imaging System (BIO-RAD). Western blot bands were quantified using Image Lab software (BIO-RAD). Experiments were repeated in triplicate.
Cell Proliferation Assays
3×10^3 GES1, SNU1967 and AGS cells were plated into 96-well plates in media with 10% fetal bovine serum and left overnight to attach. The next day (Day 0), cells were transiently transfected with wild-type and variant RASA3 constructs using Lipofectamine 3000 (Thermo Scientific). The amount of construct was 40 ng/well for AGS cells and 100 ng/well for GES1 and SNU1967 cells. Cell proliferation was measured by the WST-8 assay (Cell Counting Kit-8, Dojindo) from 24 to 120 hours post-transfection. 10 μL of WST-8 solution was added per well and the absorbance was measured at 450 nm after 2 hours of incubation in a humidified incubator.
Transfection with RASA3 siRNAs
Two RASA3 siRNAs were used to silence the RASA3 SomT transcript in NCC24 cells (hs.Ri.RASA3.13.1 TriFECTa® Kit DsiRNA Duplex (Integrated DNA Technologies), and Silencer® Select Pre-Designed siRNA s355 (Life Technologies)). NCC24 cells were transfected either with the above two siRNAs or a non-targeting control (ON-TARGETplus Non-targeting pool, Dharmacon) at a final concentration of 100 nM for 48 hours, subsequently followed by qPCR and western validation and migration/invasion assays.
Migration and Invasion Assays
To determine cell migratory capacities, GES1, SNU1967 and AGS cells transfected with wild-type or variant RASA3, and siRNA-treated NCC24 cells, were tested using Corning Costar 6.5 mm Transwell with 8.0 μm Pore Polycarbonate Membrane Inserts (3422, Corning, N.Y., USA). 2.5×10^4 AGS cells, 2×10^4 GES1 cells, 3×10^4 SNU1967 cells and 5×10^4 NCC24 cells were suspended in 0.1 ml serum-free RPMI medium and added to the top of the Transwell insert. 0.6 ml RPMI containing 10% FBS was added into the bottom well as a chemoattractant. After incubation for 24 h at 37° C. in a 5% CO2 incubator, cells were fixed with 3.7% formaldehyde and permeabilized with 100% methanol. Non-migrated cells were scraped off the upper surface of the membrane with cotton swabs. Migrated cells were stained with 0.5% crystal violet. Migration was quantified as the total area covered by migrated cells relative to the area of the Transwell membrane, calculated using ImageJ software. For cell invasion assays, the above Transwell inserts were coated with 0.1 ml (300 μg/mL) Corning Matrigel matrix (354234, Corning, N.Y., USA) for 2 to 4 h at 37° C. before use. All subsequent steps were identical to the migration assay protocol.
Measurement of RASA3 mRNA Levels
Total RNA was extracted from three independent experiments using the Qiagen RNeasy Mini kit according to the manufacturer's instructions. RNA was reverse transcribed using Improm-II™ Reverse Transcriptase (Promega). Real-time PCR was performed in triplicate using the Quantifast SYBR Green PCR kit (Qiagen) on an Applied Biosystems 7900HT Real-Time PCR System. Fold change was calculated using the Delta Ct method and normalised to β-actin. Primer sequences are as follows. β-actin: F-5′ TCCCTGGAGAAGAGCTACG 3′ (SEQ ID NO: 1843), R-5′ GTAGTTTCGTGGATGCCACA 3′ (SEQ ID NO: 1844); RASA3 SomT: F-5′ TTGTGAGTGGTTCAGCGGTA 3′ (SEQ ID NO: 1845), R-5′ TCAAGCGAAACCATCTCTTCT 3′ (SEQ ID NO: 1846).
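The Delta Ct calculation with normalisation to β-actin can be written out as a short Python sketch. The function names are hypothetical, and the calculation assumes ideal amplification efficiency (a doubling of product per cycle), which the text does not state.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of the target relative to the reference gene: 2^-(delta Ct)."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_change_vs_control(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a treated sample relative to a control sample."""
    treated = relative_expression(ct_target, ct_ref)
    control = relative_expression(ct_target_ctrl, ct_ref_ctrl)
    return treated / control
```

For instance, a target Ct of 24 against a β-actin Ct of 18 in the treated sample, compared with 26 against 18 in the control, corresponds to a 4-fold increase.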
RAS-GTP Assay
GES1 cells were transfected with either RASA3 CanT, RASA3 SomT or empty vector for 48 hours. Cells were harvested for protein in FBS-containing media or subjected to overnight serum starvation followed by serum stimulation for 30 minutes prior to harvest. Proteins were extracted using ice-cold lysis buffer (Active RAS Pull-down and Detection Kit) containing protease inhibitor cocktail (Nacalai Tesque). The active RAS fraction was obtained using the Active RAS Pull-down and Detection Kit (Thermo Fisher Scientific) according to the manufacturer's instructions. Total RAS was measured in the corresponding whole cell protein lysates. β-actin was used as a loading control. Protein concentrations were determined using the Pierce BCA protein assay (Thermo Scientific). SDS sample buffer was added to the lysates, and the samples were boiled at 100° C. for 5 minutes. Samples were loaded in each well of a 4-15% Mini-Protean TGX gel (Biorad) and transferred to a PVDF membrane using a semi-dry blotting system (Biorad). Membranes were probed with anti-RAS (1 in 200 dilution, supplied in the Active RAS Pull-down and Detection Kit) or β-actin (1 in 5000 dilution, Sigma A5316) in 5% milk-PBST at 4° C. overnight. Secondary anti-mouse antibody (LNA931, Amersham) was used at a dilution of 1 in 2000 for 1 hour at room temperature. Membranes were developed using Amersham ECL Prime Western Blotting Detection Reagent and imaged using a Chemidoc Imaging system (Biorad).
Altered Peptide and Antigen Prediction
Altered peptides were defined as variant N-terminal protein sequences arising from somatic alterations in alternative promoter usage. The following filters were applied to select the pool of altered peptides: i) fold change of at least 1.5 for alternate vs. canonical RNA-seq expression; ii) only one canonical and one alternate isoform per gene locus; iii) annotated transcripts confirmed as protein coding by Gencode. Canonical promoters were defined as regions exhibiting unaltered H3K4me3 peaks. Random peptides from the human proteome were generated from amino acid sequences of Gencode coding transcripts. N-terminal peptide gains were identified as cases where the alternative transcript was associated with a different 5′ region predicted to result in a different translated protein sequence compared to the canonical transcript. For each N-terminal altered protein, we evaluated binding of 9-mer peptides using NetMHCpan 2.8 with a strict threshold of IC50≤50 nM to identify strong MHC binders. N-terminal gained peptides were mapped against protein assembly data of the same gene to evaluate protein expression. Antigen predictions were performed against HLA types of 13 GC samples predicted using OptiType. OptiType was run using default parameters, except that BWA mem was used as the aligner for pre-filtering reads aligning to the OptiType-provided reference sequences. 3 samples with poor coverage and unpaired reads with mismatches were omitted from analysis. Eleven HLA-A, HLA-B, and HLA-C allelic variants of increased prevalence in the South East Asian population (HLA-A*02:07/HLA-A*11:01/HLA-A*24:02/HLA-A*33:03/HLA-A*24:07, HLA-B*13:01/HLA-B*40:01/HLA-B*46:01, HLA-C*03:04/HLA-C*07:02/HLA-C*08:01) were obtained from the Allele Frequency Net Database (http://www.allelefrequencies.net).
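The peptide enumeration and strong-binder threshold can be sketched in Python as below. The sketch does not call NetMHCpan: predicted affinities are assumed to be supplied as a dictionary keyed by hypothetical (peptide, allele) pairs with values in nM, and the function names are illustrative.

```python
def nine_mers(protein_seq, k=9):
    """Enumerate all overlapping k-mers (default 9-mers) of a protein sequence."""
    return [protein_seq[i:i + k] for i in range(len(protein_seq) - k + 1)]


def strong_binders(predicted_ic50_nm, threshold_nm=50.0):
    """Keep (peptide, allele) pairs whose predicted IC50 is <= threshold_nm."""
    return {
        key: ic50
        for key, ic50 in predicted_ic50_nm.items()
        if ic50 <= threshold_nm
    }
```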
Association of Cytolytic Markers with Alternative Promoter Usage
Local immune cytolytic activity was evaluated using the expression of Granzyme A (GZMA) and Perforin (PRF1). Tumor content was estimated using two algorithms: ASCAT(79) (aberrant cell fraction) and ESTIMATE (tumor purity). Expression data for the SG series was downloaded (GSE15460), normalized using the robust multi-array average algorithm in the ‘affy’ R package, and log2 transformed. Affymetrix SNP Array 6.0 data for the SG series was downloaded from GSE31168 and GSE85466. Mutation frequencies for TCGA STAD samples were downloaded from the TCGA STAD publication data (https://tcga-data.nci.nih.gov/docs/publications/stad_20140 using level 2 curated MAF files (QCv5_blacklist_Pass.aggregated.capture.tcga.uuid.curated.somatic.maf) filtered for “Missense” variant classification. Expression data for TCGA STAD samples (TPM) was computed using the kallisto algorithm. Raw SNP Array 6.0 .CEL files for TCGA gastric cancers (STAD) were downloaded from the GDC data portal (https://gdc-portal.nci.nih.gov/). Access to this dataset was obtained using dbGaP credentials and an ID issued by eRA Commons. Precomputed ESTIMATE scores for TCGA STAD were downloaded from http://bioinformatics.mdanderson.org/estimate/ and converted to tumor purity using the formula cos(0.6049872018 + 0.0001467884 × ESTIMATE score). Preprocessed expression data for the ACRG series was downloaded from GSE62254, and pre-computed ASCAT scores were obtained from collaborators (JL). Expression of cytolytic markers was adjusted for missense mutation frequencies and tumor purity using a spline regression model.
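The conversion from ESTIMATE score to tumor purity quoted above can be written directly as a one-line function. The constants are those given in the text; the function name is hypothetical.

```python
import math


def estimate_to_purity(estimate_score):
    """Tumor purity = cos(0.6049872018 + 0.0001467884 * ESTIMATE score)."""
    return math.cos(0.6049872018 + 0.0001467884 * estimate_score)
```

Higher ESTIMATE scores (more stromal and immune signal) map to lower predicted purity over the range where the argument of the cosine stays below π.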
Peptides and Cells for Cytokine Assays
A set of peptides for 15 representative alternative promoters was purchased from GenScript (GenScript). Peptide sequences and composition of peptide pools for each alternative promoter are described in Table 3. Control peptide pools for human Actin were purchased from JPT (PM-ACTS, PepMix™ Human (Actin) JPT). Peripheral blood mononuclear cells (PBMCs) were obtained from 9 healthy volunteers of whom 8 PBMC samples were HLA-typed (Table 3).
TABLE 3
HLA types of healthy PBMC donors
Sample | HLA-A | HLA-B | HLA-C
Donor 1 | A*11:01, A*24:02 | B*15:01, B*51:01 | C*04:01, C*14:02
Donor 2 | A*11:01, A*33:03 | B*40:01, B*58:01 | C*03:02, C*07:02
Donor 3 | A*03:01, A*33:03 | B*35:03, B*38:01 | C*12:03, C*12:03
Donor 4 | A*02:07, A*24:07 | B*15:02, B*46:01 | C*01:02, C*08:01
Donor 5 | A*02:03, A*11:01 | B*15:02, B*51:01 | C*08:01, C*14:02
Donor 6 | A*02:01, A*68:01 | B*15:13, B*40:06 | C*08:01, C*15:02
Donor 7 | A*02:07, A*33:03 | B*27:04, B*58:01 | C*03:02, C*12:02
Donor 8 | A*02:03, A*11:01 | B*38:02, B*46:01 | C*01:02, C*07:02
Donor 9 | Not determined
EpiMAX Assay
PBMCs were labelled with 1 μM CFSE (Life Technologies, Thermo Fisher Scientific) and cultured at a density of 200,000 cells per well in complete culture medium (cRPMI comprising RPMI 1640 medium (Gibco, Thermo Fisher Scientific), 15 mM HEPES (Gibco), 1% non-essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), 1% penicillin/streptomycin (Gibco), 2 mM L-glutamine (Gibco), 50 μM β-mercaptoethanol (Sigma, Merck), and 10% heat-inactivated FCS (Hyclone)) for 5 days. Individual peptide pools of each alternative promoter were added at the start of the culture at a concentration of 1 μg/ml for each peptide. At the end of day 5, cells were stained with LIVE/DEAD® fixable near-IR dead cell stain kit (Life Technologies), and labelled with CD4-BUV737 (BD), CD8-PacificBlue (BD), CD3-PE (BioLegend), CD19-PE/TexasRed (Beckman), and CD56-APC (BD). Analysis of T cell proliferation by CFSE dilution was performed by flow cytometry using an LSRII (BD). In addition, magnetic bead-based cytokine multiplex analysis (human cytokine panel 1, Millipore, Merck) was performed on cell culture supernatants to measure secreted cytokine levels.
IFN-γ Assay
To test the immunogenicity of the RASA3 WT and Variant protein sequences, CD14+ monocytes were isolated from an HLA-A*02:06 donor by positive selection using magnetic beads (Miltenyi, Germany). Dendritic cells (DCs) were generated using GM-CSF (1000 IU/ml) and IL-4 (400 IU/ml), and further matured with TNF (10 ng/ml), IL-1β (10 ng/ml), IL-6 (10 ng/ml) (Miltenyi, Germany) and PGE2 (1 μg/ml) (Stemcell Technologies, Canada) for 24 hours. The DCs were then primed with lysates of AGS cells expressing WT RASA3 or Variant RASA3 for 24 hours, before being co-cultured with T cells from the same donor at a ratio of 1:5. After 5 days of co-culture with DCs, T cells were isolated by positive selection using CD3 magnetic beads (Miltenyi, Germany) and co-cultured with AGS cells expressing either WT or Variant RASA3 at a ratio of 20:1 for two days. Supernatants were harvested and IFN-γ release was measured by ELISA (R&D, USA).
NanoString Analysis
NanoString nCounter Reporter CodeSets were designed for 95 genes (83 upregulated in GC and 11 downregulated) and 5 housekeeping genes (AGPAT1, CLTC, B2M, POLR2L and TBP, covering a broad expression range) on the SG series samples. For each gene, we designed 3 probes, targeting a) the 5′ end of the alternate promoter location, b) the 5′ end of the canonical promoter (defined by promoter regions of equal enrichment in both GC and normal samples OR the longest protein coding transcript) and c) a common downstream probe. Vendor-provided nCounter software (nSolver) was used for data analysis. Raw counts were normalized using the geometric mean of the internal positive control probes included in each CodeSet.
A separate NanoString assay was designed for 88 genes on the ACRG cohort. For each gene, we designed 3 probes, targeting a) the 5′ end of the alternate promoter location, b) the 5′ end of the canonical promoter (defined by promoter regions of equal enrichment in both GC and normal samples OR the longest protein coding transcript).
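The positive-control normalization mentioned for the NanoString data can be sketched as follows. The text states only that raw counts were normalized using the geometric mean of the internal positive control probes; the choice of the across-sample average of those geometric means as the reference, along with all function and sample names, is an assumption made for illustration and is not taken from the nSolver software itself.

```python
import math
from typing import Dict, List


def geometric_mean(values: List[float]) -> float:
    """Geometric mean of strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def positive_control_factors(pos_counts: Dict[str, List[float]]) -> Dict[str, float]:
    """Per-sample scaling factor = reference / geometric mean of positive controls.

    Assumption: the reference is the arithmetic mean of the per-sample
    geometric means, so that samples are scaled to a common level.
    """
    geo = {sample: geometric_mean(counts) for sample, counts in pos_counts.items()}
    reference = sum(geo.values()) / len(geo)
    return {sample: reference / g for sample, g in geo.items()}


def normalize_counts(raw: Dict[str, float], factor: float) -> Dict[str, float]:
    """Multiply every probe count of one sample by its scaling factor."""
    return {probe: count * factor for probe, count in raw.items()}


if __name__ == "__main__":
    # Hypothetical positive-control counts for two samples.
    controls = {"sample_A": [100.0, 400.0], "sample_B": [400.0, 1600.0]}
    factors = positive_control_factors(controls)
    # Hypothetical probe names for one gene in sample_A.
    print(normalize_counts({"alt_promoter_probe": 10.0, "canonical_probe": 40.0},
                           factors["sample_A"]))
```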
Repeat Enrichment Analysis
Repetitive element families over-represented at regions exhibiting somatic promoter alterations were identified using RepeatMasker annotations from the UCSC Table Browser (GRCh37/hg19). “Unknown”, “Simple_Repeat” and “Satellite” annotations were filtered from the repeat set. Repetitive elements were included only if they overlapped a promoter by a minimum of 50%. Enrichment of repetitive element families was assessed using a binomial test with Benjamini-Hochberg FDR correction and all promoter regions were used as the background.
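A minimal sketch of the enrichment test described above is given below, using only the Python standard library. It assumes a one-sided (over-representation) binomial test in which the background probability for a repeat family is its frequency among all promoter regions; the one-sided choice, the function names, and the example p-values are illustrative assumptions.

```python
import math
from typing import List


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed in log space; requires 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    total = 0.0
    for i in range(max(k, 0), n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log(1.0 - p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: 30 of 200 somatic promoters overlap a repeat family
    # that overlaps 5% of all promoters (the background).
    p_value = binomial_upper_tail(30, 200, 0.05)
    print(p_value)
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```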
Functional Prediction Analysis
Genome wide and tissue specific functional scores were downloaded from GenoCanyon (http://genocanyon.med.yale.edu/GenoCanyon_Downloads.html, Version 1.0.3) and GenoSkyline (http://genocanyon.med.yale.edu/GenoSkyline) respectively. Overlaps were calculated using bedtools IntersectBed and functional scores over each unannotated somatic promoter were computed.
Transcription Factor Enrichment
Transcription factor binding sites for 237 TFs were obtained from the ReMap database, a public database of ENCODE and other public ChIP-seq TFBS data sets. Overlaps were calculated and counted against the somatic promoter set. Relative enrichment scores were calculated as the ratio of (#bases in state and overlap feature)/(#bases in genome) to [(#bases overlap feature)/(#bases in genome)×(#bases in state)/(#bases in genome)].
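The relative enrichment score amounts to an observed-over-expected ratio: the observed fraction of the genome covered by both the state and the feature, divided by the fraction expected if the two were independent. A short Python sketch follows; the function and parameter names and the example base counts are hypothetical.

```python
def relative_enrichment(bases_state_and_feature: float,
                        bases_state: float,
                        bases_feature: float,
                        bases_genome: float) -> float:
    """Observed / expected overlap between a state (e.g. somatic promoters)
    and an overlap feature (e.g. binding sites of one transcription factor)."""
    if min(bases_state, bases_feature, bases_genome) <= 0:
        raise ValueError("state, feature and genome sizes must be positive")
    observed = bases_state_and_feature / bases_genome
    expected = (bases_feature / bases_genome) * (bases_state / bases_genome)
    return observed / expected


if __name__ == "__main__":
    # Hypothetical base counts, for illustration only.
    print(relative_enrichment(50, 100, 1000, 10000))
```

A score above 1 indicates more overlap than expected by chance under independence, and a score below 1 indicates depletion.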
EZH2 Inhibition
IM95 cells were treated with GSK126 (Selleck, USA; dissolved in DMSO), a selective EZH2 inhibitor, at a concentration of 5 μM. Control cells were treated with the same concentration of DMSO (0.1%). Cell proliferation was monitored in 96-well plates post-treatment with GSK126 using the CellTiter-Glo® Luminescent Cell Viability Assay (Promega) for three independent experiments. For RNA-seq analysis, total RNA was extracted using the Qiagen RNeasy Mini Kit according to the manufacturer's instructions. RNA-seq differential analysis for promoter loci was carried out using edgeR on read counts mapping to H3K4me3 regions, estimated using featureCounts. RNA-seq gene-level differential analysis was performed using Cuffdiff 2.2.1.
Additional Information
Accession codes: Genomic data for this study have been deposited in the National Center for Biotechnology Information GEO database under accession numbers GSE51776 and GSE75898 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=kfoxqeamzftpal&acc=GSE75898).
Results
Identifying Epigenomic Promoter Alterations in GC
Using NanoChIP-seq, we profiled three histone modification marks (H3K4me3, H3K27ac and H3K4me1) across 17 GCs, matched normal gastric mucosae (34 samples) and 13 GC cell lines, generating 110 epigenomic profiles (Tables 1 and 4 provide clinical and sequencing metrics) (FIG. 1a). Quality control of the Nano-ChIPseq data was performed using two independent methods: ChIP-enrichment at known promoters, and employing the ChIP-seq quality control and validation tool CHANCE (CHip-seq ANalytics and Confidence Estimation). Comparisons of Nano-ChIPseq read densities at 1,000 promoters associated with highly expressed protein-coding genes confirmed successful enrichment in all H3K27ac and H3K4me3 libraries. CHANCE analysis also revealed that the large majority (81%) of samples exhibited successful enrichment (Table 1). We have previously also shown that Nano-ChIP signals exhibit a good concordance with orthogonal ChIP-qPCR results.
TABLE 4
Clinicopathological Parameters of samples used
Sample ID | Platform | Age | Gender | Site of Tumor | Stage (T) | Stage (N) | Stage (M) | Stage AJCC7 | Grade | Lauren's Classification | EBV status | TCGA Subtype
20021007 | ChIPseq + Infinium450K | 53.8 | male | GE junction | T2b | N0 | m0 | 2A | poorly differentiated | intestinal type adenocarcinoma | unknown | GS
20020720 | ChIPseq + Infinium450K | 75.2 | male | antrum | T2a | N1 | m0 | 2A | moderately differentiated | intestinal type adenocarcinoma | unknown | CIN
2001206 | ChIPseq + Infinium450K | 64.8 | male | antrum | T4a | N3b | m1 | 4 | poorly differentiated | diffuse type adenocarcinoma | unknown | CIN
2000877 | ChIPseq + Infinium450K | 44.6 | male | cardia | T2a | N1 | m0 | 2A | poorly differentiated | intestinal type adenocarcinoma | unknown | CIN
2000085 | ChIPseq + Infinium450K | 52.6 | male | lesser curve | T2 | N0 | m0 | 1B | moderately differentiated | intestinal type adenocarcinoma | yes | GS
990275 | ChIPseq + Infinium450K | 71.6 | male | lesser curve | T4a | N0 | m0 | 2B | moderately differentiated | intestinal type adenocarcinoma | no | CIN
990068 | ChIPseq + Infinium450K | 73.3 | male | body | T4a | N2 | m0 | 3B | poorly differentiated | intestinal type adenocarcinoma | no | GS
980447 | ChIPseq + Infinium450K | 68.8 | male | lesser curve | T4a | T3b | m1 | 4 | poorly differentiated | intestinal type adenocarcinoma | unknown | CIN
980436 | ChIPseq + Infinium450K | 65.0 | female | lesser curve | T4a | N1 | m0 | 3A | moderately differentiated | intestinal type adenocarcinoma | unknown | GS
980401 | ChIPseq + Infinium450K | 82.9 | female | unknown | T4a | N1 | m0 | 3A | poorly differentiated | diffuse type adenocarcinoma | unknown | GS
980319 | ChIPseq + Infinium450K | 67.8 | male | unknown | T4a | N1 | m0 | 3A | poorly differentiated | mixed/OTHERS | yes | GS
2000986 | ChIPseq + Infinium450K + RNA-seq | 39.0 | female | pylorus | T4a | T3b | m1 | 4 | poorly differentiated | diffuse type adenocarcinoma | unknown | GS
2000721 | ChIPseq + Infinium450K + RNA-seq | 70.9 | male | lesser curve | T4a | T3b | m1 | 4 | poorly differentiated | diffuse type adenocarcinoma | yes | GS
2000639 | ChIPseq + Infinium450K + RNA-seq | 69.5 | male | lesser curve | T4a | N3a | m1 | 4 | moderately differentiated | intestinal type adenocarcinoma | yes | GS
980437 | ChIPseq + Infinium450K + RNA-seq | 67.8 | female | incisura | T4a | T3b | m0 | 3C | poorly differentiated | intestinal type adenocarcinoma | unknown | CIN
980417 | ChIPseq + Infinium450K + RNA-seq | 67.0 | male | lesser curve | T4a | T3b | m0 | 3C | poorly differentiated | diffuse type adenocarcinoma | yes | GS
980097 | ChIPseq + Infinium450K + RNA-seq | 65.4 | male | unknown | T2 | N1 | m0 | 2A | undifferentiated | mixed/OTHERS | unknown | EBV
980418 | Infinium450K | 88.0 | male | greater curve | T4a | N2 | m0 | 3B | moderately differentiated | intestinal type adenocarcinoma | unknown | -
57689477 | RNA-seq | 84.5 | female | greater curve | T1b | N0 | m0 | 1A | moderately differentiated | intestinal type adenocarcinoma | no | -
43658255 | RNA-seq | 66.6 | male | antrum | T4a | N3a | m1 | 4 | moderately differentiated | intestinal type adenocarcinoma | unknown | -
2000892 | RNA-seq | 71.3 | female | lesser curve | T2 | N1 | m0 | 2A | moderately differentiated | intestinal type adenocarcinoma | no | -
To enable accurate promoter identification, we integrated data from multiple histone modifications, selecting H3K4me3 regions simultaneously co-depleted for H3K4me1(42) (“H3K4me3 hi/H3K4me1 lo regions”; FIG. 7, Methods). Comparisons against data from external sources, including GENCODE reference transcripts, ENCODE chromatin-state models, and CAGE (CAP analysis gene expression) databases, validated the vast majority of H3K4me3 hi/H3K4me1 lo regions as true promoter elements (see section titled “Validation of H3K4me3 hi/H3K4me1 lo regions as true promoters” and FIG. 7). Because primary gastric tissues comprise several different tissue types, including epithelial cells, immune cells, and stroma, we further confirmed that our promoter profiles were reflective of bona fide gastric epithelia by comparisons against Epigenome Roadmap data for gastric and non-gastric tissues. Gastric tumor and matched normal promoter profiles exhibited the highest correlations to Roadmap gastric mucosae, and were distinct from other gastrointestinal tissues (small intestine, colon mucosa, colon sigmoid), stomach-associated muscle, skin, and blood (CD14) (FIG. 8). Primary tissue promoter profiles also showed a significant overlap with promoter profiles of GC cell lines (87%), which are purely epithelial in origin, compared to gastrointestinal fibroblast lines (58-69%), and colon carcinoma lines (59-74%) (FIG. 8).
In total, we mapped ~23,000 promoter elements in the Nano-ChIPseq cohort. Visual exploration of these promoter elements identified three main promoter categories: unaltered promoters, promoters gained in tumors (gained somatic or tumor-specific promoters), and promoters present in normal gastric tissues but lost or decreased in GC (lost somatic or normal-specific promoters) (FIG. 1a-c). Representative examples of unaltered promoters included RhoA (FIG. 1a), while CEACAM6, an intracellular adhesion gene, exhibited somatic promoter gain at the CEACAM6 transcription start site (TSS) in tumor samples and cell lines (FIG. 1b). Conversely, ATP4A, a parietal cell-associated H+/K+ ATPase with decreased expression in GC(43), exhibited somatic promoter loss (FIG. 1c). The CEACAM6 promoter gain and the ATP4A promoter loss were correlated with increased CEACAM6 and decreased ATP4A gene expression, respectively, in the same samples (FIGS. 1b and 1c).
Previous studies have established distinct molecular subtypes of GC. Due to limited sample sizes, however, we elected in the current study to identify promoter alterations (“somatic promoters”) present in multiple GC tissues relative to control tissues irrespective of subtype. Focusing on recurrent alterations also has the benefit of reducing potential artefacts due to “private” epigenomic variation or individual sample-specific technical errors. Using two complementary read-count based algorithms commonly used for analysis of ChIP-seq data, we identified ~2000 highly recurrent somatic promoters, of which 75% were gained in GCs (FC 1.5, q<0.1). Two-dimensional heat-map clustering and principal components analysis (PCA) plots based on somatic promoters confirmed a separation of GCs from normal samples based on promoter alterations (FIG. 1d and FIG. 9). Somatic promoter H3K4me3 levels were also highly correlated with H3K27ac signals (r=0.91, P<0.001, FIG. 1e), commonly regarded as a marker of active regulatory elements. This correlation was observed across all somatic promoters (r=0.84, P<0.001, FIG. 1e), and also when gained somatic and lost somatic promoters were analyzed separately (r=0.78, P<0.001 for gained somatic; r=0.82, P<0.001 for lost somatic, FIG. 9). Pathway analysis revealed that gained somatic and lost somatic promoters were significantly associated with expression genesets previously reported to be up- and downregulated in GC, respectively (FIG. 10). These included upregulated oncogenes (MET, ABL2), cell adhesion genes (CEACAM6) and claudin family members (CLDN7, CLDN3). 15-18% of somatic promoters mapped to non-coding RNAs (ncRNAs), including HOTAIR and PVT1, previously associated with GC (Table 5). Additional analyses at increasing thresholds of stringency (FC from 1.5-2 and FDR from 0.1-0.001) yielded similar results, supporting the robustness of this analysis (FIG. 9).
These results demonstrate that normal gastric epithelia and GCs can be distinguished on the basis of epigenomic promoter profiles.
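The thresholding used to call somatic promoters (FC 1.5, q<0.1) can be illustrated with a small Python sketch. It assumes that the fold change is expressed as a tumor/normal ratio and that promoter loss is called symmetrically at a ratio of 1/1.5 or below; the symmetric rule, the function names, and the example values are assumptions for illustration and do not reproduce the read-count based algorithms used in the study.

```python
from typing import Iterable, Tuple

MIN_FOLD_CHANGE = 1.5   # fold-change threshold from the text
MAX_Q_VALUE = 0.1       # q-value threshold from the text


def classify_promoter(fold_change: float, q_value: float,
                      min_fc: float = MIN_FOLD_CHANGE,
                      max_q: float = MAX_Q_VALUE) -> str:
    """Label a promoter as 'gained', 'lost' or 'unaltered'.

    fold_change is assumed to be the tumor/normal H3K4me3 signal ratio.
    """
    if q_value >= max_q:
        return "unaltered"
    if fold_change >= min_fc:
        return "gained"
    if fold_change <= 1.0 / min_fc:
        return "lost"
    return "unaltered"


def fraction_gained(promoters: Iterable[Tuple[float, float]]) -> float:
    """Fraction of somatic (gained or lost) promoters that are gained."""
    labels = [classify_promoter(fc, q) for fc, q in promoters]
    somatic = [label for label in labels if label != "unaltered"]
    if not somatic:
        return 0.0
    return somatic.count("gained") / len(somatic)


if __name__ == "__main__":
    # Hypothetical (fold change, q-value) pairs.
    example = [(2.0, 0.01), (3.1, 0.05), (1.8, 0.001), (0.5, 0.01), (1.2, 0.01)]
    print(fraction_gained(example))
```

Raising `min_fc` or lowering `max_q` corresponds to the more stringent thresholds mentioned in the text.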
TABLE 5
|
|
Non coding RNAs associated with Altered promoters
|
Gene
H3K4Me3 (T/N)
|
|
AC004158.2
Gain
|
AC004870.4
Gain
|
AC005281.1
Gain
|
AC005550.4
Gain
|
AC007040.5
Gain
|
AC007392.3
Gain
|
AC009229.6
Gain
|
AC012531.23
Gain
|
AC016683.6
Gain
|
AC016995.3
Gain
|
AC019201.1
Loss
|
AC068134.6
Gain
|
AC069277.2
Gain
|
AC073479.1
Loss
|
AC079779.4
Loss
|
AC090051.1
Loss
|
AC092296.1
Gain
|
AC092594.1
Gain
|
AC092635.1
Loss
|
AC096579.1
Loss
|
AC096579.13
Loss
|
AC096579.7
Loss
|
AC116351.2
Gain
|
AC128653.1
Loss
|
AC131951.1
Loss
|
AC133680.1
Loss
|
AC140912.1
Gain
|
AC144521.1
Gain
|
AF127936.5
Loss
|
AJ003147.8
Gain
|
AL031721.1
Gain
|
AL109618.1
Gain
|
AL122015.1
Gain
|
AL122127.1
Loss
|
AL122127.2
Loss
|
AL122127.3
Loss
|
AL122127.4
Loss
|
AL122127.5
Loss
|
AL139319.1
Gain
|
AP000525.9
Gain
|
AP001065.15
Gain
|
C11orf95
Gain
|
C1orf132
Loss
|
CASC9
Gain
|
CCAT1
Gain
|
CECR7
Loss
|
CT49
Gain
|
CTB-175P5.4
Gain
|
CTC-228N24.1
Gain
|
CTC-276P9.1
Loss
|
CTC-480C2.1
Gain
|
CTD-2008P7.9
Loss
|
CTD-2147F2.1
Gain
|
CTD-2201E18.5
Gain
|
CTD-2314B22.1
Gain
|
CTD-2314B22.3
Gain
|
CTD-2532K18.1
Gain
|
CTD-2591A6.2
Gain
|
FENDRR
Loss
|
FZD10-AS1
Gain
|
GS1-179L18.1
Gain
|
GS1-259H13.2
Gain
|
H19
Gain
|
hsa-mir-4537
Loss
|
hsa-mir-4538
Loss
|
hsa-mir-4539
Loss
|
JRK
Loss
|
LINC00237
Gain
|
LINC00278
Loss
|
LINC00355
Gain
|
LINC00365
Loss
|
LINC00393
Gain
|
LINC00665
Gain
|
LINC00668
Gain
|
LINC00669
Gain
|
LINC00675
Loss
|
LINC00858
Gain
|
LINC00898
Gain
|
LINC00939
Gain
|
LINC00960
Gain
|
MIR1184-1
Gain
|
MIR135B
Gain
|
MIR144
Loss
|
MIR196B
Gain
|
MIR3147
Gain
|
MIR3185
Gain
|
MIR31HG
Loss
|
MIR4488
Gain
|
MIR4634
Gain
|
MIR663A
Gain
|
MIR663B
Loss
|
MIR935
Gain
|
MLLT4-AS1
Gain
|
PVT1
Gain
|
RN7SKP258
Gain
|
RN7SL773P
Gain
|
RNA5S17
Gain
|
RNA5SP18
Gain
|
RNA5SP19
Gain
|
RNA5SP75
Loss
|
RNU1-92P
Gain
|
RNVU1-10
Gain
|
RP11-108K3.1
Gain
|
RP11-138J23.1
Gain
|
RP11-13A1.1
Gain
|
RP11-161I10.1
Gain
|
RP11-163N6.2
Gain
|
RP11-168L22.2
Gain
|
RP11-16E12.2
Loss
|
RP11-177F15.1
Gain
|
RP11-191L9.4
Gain
|
RP11-211C9.1
Gain
|
RP11-229C3.2
Loss
|
RP11-246A10.1
Gain
|
RP11-25H12.1
Gain
|
RP11-276H19.2
Gain
|
RP11-288G11.3
Loss
|
RP11-299P2.1
Loss
|
RP11-2E17.1
Loss
|
RP11-308B16.2
Gain
|
RP11-326A19.4
Gain
|
RP11-346D19.1
Gain
|
RP11-347D21.4
Gain
|
RP11-348J24.2
Gain
|
RP11-351J23.2
Gain
|
RP11-356J5.12
Gain
|
RP11-357H14.17
Gain
|
RP11-371I1.2
Gain
|
RP11-137D17.1
Gain
|
RP11-395B7.2
Gain
|
RP11-3J1.1
Gain
|
RP11-400N13.2
Gain
|
RP11-403I13.5
Gain
|
RP11-408B11.2
Gain
|
RP11-426L16.8
Gain
|
RP11-431M3.1
Loss
|
RP11-434D9.2
Gain
|
RP11-43F13.4
Gain
|
RP11-44H4.1
Gain
|
RP11-44N12.5
Gain
|
RP11-451B8.1
Gain
|
RP11-453F18_B.1
Gain
|
RP11-460N16.1
Gain
|
RP11-469L4.1
Loss
|
RP11-472N13.2
Gain
|
RP11-48O20.4
Loss
|
RP11-499F3.2
Gain
|
RP11-514D23.1
Loss
|
RP11-547I7.2
Gain
|
RP11-575F12.1
Gain
|
RP11-576D8.4
Gain
|
RP11-599B13.3
Loss
|
RP11-608O21.1
Gain
|
RP11-60A8.1
Gain
|
RP11-61G19.1
Gain
|
RP11-626G11.4
Gain
|
RP11-626H12.1
Gain
|
RP11-627G23.1
Loss
|
RP11-632K5.3
Gain
|
RP11-66B24.2
Gain
|
RP11-66B24.7
Gain
|
RP11-689K5.3
Gain
|
RP1-170O19.14
Gain
|
RP1-170O19.17
Gain
|
RP11-776H12.1
Gain
|
RP11-79P5.7
Gain
|
RP11-809C18.5
Gain
|
RP11-81H14.2
Loss
|
RP11-831A10.2
Loss
|
RP11-834C11.14
Gain
|
RP11-834C11.6
Loss
|
RP11-867G2.6
Gain
|
RP11-89F3.2
Gain
|
RP11-933H2.4
Gain
|
RP11-963H4.3
Loss
|
RP1-274L7.1
Gain
|
RP13-137A17.4
Loss
|
RP13-137A17.6
Loss
|
RP13-379O24.3
Loss
|
RP1-63G5.5
Gain
|
RP1-79C4.4
Gain
|
RP3-522D1.1
Gain
|
RP4-562J12.2
Gain
|
RP4-594A5.1
Gain
|
RP5-1077H22.2
Loss
|
RP5-1121A15.3
Gain
|
RP5-884M6.1
Gain
|
RP5-916L7.2
Gain
|
RP6-114E22.1
Gain
|
SNORA31
Gain
|
SNORA48
Gain
|
SNORD56B
Loss
|
snoU13
Gain
|
SOX21-AS1
Loss
|
TPTEP1
Loss
|
TTTY15
Loss
|
U3
Loss
|
U8
Loss
|
|
Validation of H3K4Me3 Hi/H3K4Me1 Lo Regions as True Promoters
Four lines of evidence support the vast majority of H3K4me3 hi/H3K4me1 lo regions as true promoters. First, H3K4me3 hi/H3K4me1 lo regions were strongly enriched at genomic locations 1 kb upstream of known GENCODE transcription start sites (TSSs) (FIG. 7). Second, at TSS regions, H3K4me3 signals exhibited a classical skewed bimodal intensity pattern, previously reported to be associated with promoters (FIG. 7). Third, when overlapped with regions defined by the Epigenomic Roadmap (EpiRd) 15-state model, we observed significant enrichments of H3K4me3 hi/H3K4me1 lo regions at proximal promoter states (TSSs/Regions flanking transcription sites) in gastrointestinal tissues relative to other tissues (FIG. 7). Fourth, CAGE (CAP analysis gene expression) is a specialized transcriptome sequencing method used to map gene promoters using 5′ mRNA data. Integration with CAGE data from the FANTOM5 consortium revealed an 81% overlap of H3K4me3 hi/H3K4me1 lo regions with robust CAGE tag clusters (FIG. 7).
Somatic Promoters in GC Exhibit Deregulation in Diverse Cancer Types
To explore relationships between epigenomic promoter alterations and gene expression, we analyzed RNA-seq data from the same discovery cohort (~106 million reads/sample), quantifying RNA-seq transcript reads mapping to the epigenome-guided promoter regions or directly downstream. Examining somatic promoter regions (FIG. 2A provides an illustrative example of a gained somatic promoter), we observed significantly increased expression at gained somatic promoters in GCs, and significantly decreased expression at lost somatic promoters, compared to either all promoters (P<0.001, FIG. 2B) or unaltered promoters (P<0.001, FIG. 10). Among other types of epigenetic modifications, previous studies have also reported a reciprocal relationship between active regulatory regions and DNA methylation. Using Infinium 450K DNA methylation arrays, we identified 7,505 CpG sites overlapping somatic promoter regions (5,213 sites for gained somatic promoters, 2,292 sites for lost somatic promoters). Promoters gained in GC were significantly hypomethylated compared to all promoters (P<0.001, Wilcoxon test), while promoters lost in GC were hypermethylated (P<0.001, Wilcoxon test) (FIG. 2b, bottom). As DNA methylation typically occurs in CpG-rich regions (56), we then repeated the analysis focusing only on CpG island-bearing promoters (Methods and Materials). Similar to the original results, CpG island-bearing promoters gained in GC were significantly hypomethylated compared to all CpG island-bearing promoters (P<0.001, Wilcoxon test), while CpG island-bearing promoters lost in GC were hypermethylated (P<0.001, Wilcoxon test) (FIG. 11).
To validate the somatic promoter alterations in a larger independent GC cohort and also to examine their behavior in other cancer types, we proceeded to query RNA-seq data of 354 samples from the TCGA consortium (n=321 GC, n=33 matched normals). To perform this analysis, RNA-seq reads from TCGA samples were mapped against the epigenome-guided somatic promoter regions defined by the discovery samples, and normalized to calculate fold change differences in expression in GC vs. normals (see Methods and Materials). Similar to the discovery series, we observed that TCGA GCs also exhibited significantly increased expression at gained somatic promoters, while lost somatic promoters exhibited decreased expression, relative to either all promoters (P<0.001, FIG. 2C) or unaltered promoters (P<0.001, FIG. 10). We further tested the tissue-specificity of the GC somatic promoters by querying RNA-seq data from other tumor types, including colon cancer, kidney renal clear cell carcinoma (ccRCC), and lung adenocarcinoma (LUAD) (FIG. 2d). Almost two-thirds (n=1231, 63%, FC=1.5) of GC somatic promoters were also differentially regulated in TCGA colon cancer samples. Similarly, a significant proportion of GC somatic promoters were associated with differential RNA-seq expression in TCGA ccRCC (n=939, 48%, FC=1.5) and LUAD samples (n=1059, 54%, FC=1.5) (FIG. 2D). This result suggests that many GC somatic promoters are also likely associated with deregulated promoter activity in other solid epithelial malignancies.
Role of Alternative Promoters
By comparing the somatic promoters against the reference Gencode database (V19), we discovered extensive use of alternative promoters (18%) in GCs, defined as situations where a common unaltered promoter is present in both normal tissues and tumors (canonical promoter) but a secondary tumor-specific promoter is engaged in the latter (alternative promoter). The remaining 82% of somatic promoters corresponded to single major isoforms or unannotated transcripts (see later). 57% of the alternative promoters occurred downstream of the canonical promoter. Using multiple RNA-seq analysis methods, we confirmed that transcript isoforms driven by alternative promoters are overexpressed in GCs to a significantly greater degree than canonical promoters in the same gene (Methods and Materials, FIG. 12). For example, HNF4α, a transcription factor overexpressed in GC, is driven by two promoters (P1 and P2). At the HNF4α canonical promoter (“P2”), we observed equal promoter signals in GCs and normal tissues; however, we also observed gain of an additional promoter in GCs at a transcription start site 45 kb downstream (“P1”). Similar HNF4α P1 promoter gains were also observed in GC cell lines (FIG. 3a), with RNA-seq analysis supporting HNF4α P1 isoform expression in GCs. Alternative promoter usage was also observed at the EpCAM gene, frequently used to identify circulating tumor cells, causing expression of EpCAM transcript ENST00000263735.4 (FIG. 3b). Notably, both the HNF4α and EpCAM alternative isoforms exhibited significantly greater cancer overexpression compared to their canonical isoforms (FIG. 12). Other genes associated with tumor-specific alternative promoters, many reported for the first time, included NKX6-3 (FC 1.83, q<0.05) and GRIN2D (FC 1.9, q<0.001). A complete list of GC tumor-specific promoters is provided (Table 6).
TABLE 6
|
|
Alternative Promoters
|
Change
|
H3K4Me3
in
|
Loci
(T/N)
Type
protein
Gene
|
|
chr2: 69900550-69901900
Loss
Alternate
1
AAK1
|
chr2: 44058400-44060450
Gain
Alternate
1
ABCG5
|
chr1: 179108750-
Gain
Alternate
1
ABL2
|
179113100
|
chr1: 6451200-6453300
Gain
Alternate
1
ACOT7
|
chr7: 991700-995250
Gain
Alternate
1
ADAP1
|
chr11: 69811750-
Gain
Alternate
1
ANO1
|
69814800
|
chr19: 50308050-
Gain
Alternate
1
AP2A1
|
50309350
|
chr17: 36620950-
Gain
Alternate
1
ARHGAP23
|
36622550
|
chr2: 10902450-10904150
Gain
Alternate
1
ATP6V1C2
|
chr7: 70060000-70066050
Gain
Alternate
1
AUTS2
|
chr18: 60804550-
Loss
Alternate
1
BCL2
|
60807050
|
chr11: 1463100-1464700
Gain
Alternate
1
BRSK2
|
chr4: 2038150-2039400
Gain
Alternate
1
C4orf48
|
chr21: 44482600-
Gain
Alternate
1
CBS
|
44484300
|
chr3: 46988600-46990000
Gain
Alternate
1
CCDC12
|
chr16: 28946800-
Gain
Alternate
1
CD19
|
28948350
|
chr6: 4836100-4837550
Gain
Alternate
1
CDYL
|
chr6: 118985250-
Loss
Alternate
1
CEP85L
|
118986450
|
chr9: 124497650-
Gain
Alternate
1
DAB2IP
|
124504300
|
chr19: 6474700-6477300
Gain
Alternate
1
DENND1C
|
chr4: 955250-957700
Gain
Alternate
1
DGKQ
|
chr16: 21059250-
Gain
Alternate
1
DNAH3
|
21060650
|
chr7: 35074250-35076850
Gain
Alternate
1
DPY19L1
|
chr6: 56553350-56559100
Gain
Alternate
1
DST
|
chr2: 47595450-47602500
Gain
Alternate
1
EPCAM
|
chrX: 137860100-
Gain
Alternate
1
FGF13
|
137861300
|
chr3: 69283500-69286950
Gain
Alternate
1
FRMD4B
|
chr7: 99774000-99776200
Gain
Alternate
1
GPC2
|
chr10: 25754300-
Gain
Alternate
1
GPR158
|
25755900
|
chr11: 123458150-
Gain
Alternate
1
GRAMD1B
|
123465950
|
chr20: 43029650-
Gain
Alternate
1
HNF4A
|
43032200
|
chr17: 46639600-
Gain
Alternate
1
HOXB3
|
46642950
|
chr7: 23506000-23515500
Gain
Alternate
1
IGF2BP3
|
chr1: 38410700-38414500
Loss
Alternate
1
INPP5B
|
chr19: 17952000-
Gain
Alternate
1
JAK3
|
17953950
|
chr14: 24891600-
Loss
Alternate
1
KHNYN
|
24897600
|
chr18: 21452050-
Gain
Alternate
1
LAMA3
|
21455250
|
chr5: 154091500-
Loss
Alternate
1
LARP1
|
154095100
|
chr5: 38605950-38609550
Loss
Alternate
1
LIFR
|
chr16: 1013250-1015550
Gain
Alternate
1
LMF1
|
chr19: 49003900-
Gain
Alternate
1
LMTK3
|
49005550
|
chr1: 156896950-
Gain
Alternate
1
LRRC71
|
156898350
|
chr1: 156893100-
Gain
Alternate
1
LRRC71
|
156894550
|
chr1: 236045300-
Loss
Alternate
1
LYST
|
236047550
|
chr20: 33134200-
Gain
Alternate
1
MAP1LC3A
|
33135900
|
chr7: 130125100-
Gain
Alternate
1
MEST
|
130127800
|
chr7: 116363550-
Gain
Alternate
1
MET
|
116365500
|
chr3: 158448250-
Gain
Alternate
1
MFSD1
|
158451400
|
chr1: 1562700-1565700
Gain
Alternate
1
MIB2
|
chr14: 102700300-
Gain
Alternate
1
MOK
|
102702150
|
chr17: 60756900-
Gain
Alternate
1
MRC2
|
60758850
|
chr8: 144652950-
Gain
Alternate
1
MROH6
|
144655550
|
chr7: 100607850-
Gain
Alternate
1
MUC12
|
100613600
|
chr11: 76902300-
Gain
Alternate
1
MYO7A
|
76903800
|
chr1: 24434350-24435800
Gain
Alternate
1
MYOM3
|
chr6: 126136250-
Loss
Alternate
1
NCOA7
|
126140700
|
chr2: 233755200-
Gain
Alternate
1
NGEF
|
233756650
|
chr2: 233791350-
Gain
Alternate
1
NGEF
|
233792700
|
chr17: 26119900-
Gain
Alternate
1
NOS2
|
26121850
|
chr1: 200007500-
Gain
Alternate
1
NR5A2
|
200010950
|
chr18: 55099800-
Gain
Alternate
1
ONECUT2
|
55108900
|
chr8: 107629450-
Loss
Alternate
1
OXR1
|
107632850
|
chr4: 169575100-
Loss
Alternate
1
PALLD
|
169577200
|
chr19: 18364400-
Loss
Alternate
1
PDE4C
|
18366800
|
chr4: 111557000-
Gain
Alternate
1
PITX2
|
111559350
|
chr8: 145009000-
Gain
Alternate
1
PLEC
|
145018500
|
chr19: 49370000-
Gain
Alternate
1
PLEKHA4
|
49372300
|
chr11: 16944700-
Gain
Alternate
1
PLEKHA7
|
16947800
|
chr1: 6530450-6535000
Gain
Alternate
1
PLEKHG5
|
chr5: 74990850-74992350
Gain
Alternate
1
POC5
|
chr6: 35359200-35364100
Loss
Alternate
1
PPARD
|
chr19: 49631500-
Gain
Alternate
1
PPFIA3
|
49632100
|
chr22: 22900650-
Gain
Alternate
1
PRAME
|
22902550
|
chr9: 132458700-
Gain
Alternate
1
PRRX2
|
132461300
|
chr9: 139873000-
Gain
Alternate
1
PTGDS
|
139874300
|
chr1: 29562850-29565950
Gain
Alternate
1
PTPRU
|
chr17: 2878500-2880550
Gain
Alternate
1
RAP1GAP2
|
chr9: 134548500-
Loss
Alternate
1
RAPGEF1
|
134553400
|
chr3: 24851300-24854350
Loss
Alternate
1
RARB
|
chr13: 114769100-
Gain
Alternate
1
RASA3
|
114771100
|
chr20: 399750-402500
Gain
Alternate
1
RBCK1
|
chr19: 14088450-
Gain
Alternate
1
RFX1
|
14090950
|
chr4: 3310150-3312100
Gain
Alternate
1
RGS12
|
chr8: 74035400-74036300
Loss
Alternate
1
SBSPON
|
chr21: 38063750-
Loss
Alternate
1
SIM2
|
38066650
|
chr19: 19215350-
Gain
Alternate
1
SLC25A42
|
19217300
|
chr7: 103021250-
Loss
Alternate
1
SLC26A5
|
103022850
|
chr12: 40425950-
Loss
Alternate
1
SLC2A13
|
40427700
|
chr12: 20975550-
Gain
Alternate
1
SLCO1B3
|
20976900
|
chr16: 68418000-
Loss
Alternate
1
SMPD3
|
68421750
|
chr4: 186729400-
Loss
Alternate
1
SORBS2
|
186734150
|
chr2: 231206350-
Gain
Alternate
1
SP140L
|
231208750
|
chr7: 87854350-87856200
Gain
Alternate
1
SRI
|
chr3: 17734300-17735900
Gain
Alternate
1
TBC1D5
|
chr8: 67866500-67867950
Gain
Alternate
1
TCF24
|
chr6: 10409250-10419650
Gain
Alternate
1
TFAP2A
|
chr3: 129512300-
Gain
Alternate
1
TMCC1
|
129514550
|
chr18: 20910450-
Gain
Alternate
1
TMEM241
|
20912050
|
chr2: 218874000-
Gain
Alternate
1
TNS1
|
218875450
|
chr8: 141017700-
Gain
Alternate
1
TRAPPC9
|
141019200
|
chr4: 8435700-8439650
Loss
Alternate
1
TRMT44
|
chr21: 45844650-
Gain
Alternate
1
TRPM2
|
45846700
|
chrX: 107016000-
Loss
Alternate
1
TSC22D3
|
107021000
|
chr2: 3371900-3374350
Gain
Alternate
1
TSSC1
|
chr17: 40784750-
Loss
Alternate
1
TUBG2
|
40786950
|
chr16: 1428050-1430700
Gain
Alternate
1
UNKL
|
chr12: 109507100-
Gain
Alternate
1
USP30
|
109508350
|
chr20: 50719850-
Gain
Alternate
1
ZFP64
|
50723350
|
chr4: 8128400-8130450
Gain
Alternate
0
ABLIM2
|
chr16: 72660100-
Gain
Alternate
0
AC004158.2
|
72662050
|
chr2: 66801200-66811950
Gain
Alternate
0
AC007392.3
|
chr2: 114081700-
Gain
Alternate
0
AC016745.3
|
114084050
|
chr19: 52104750-
Loss
Alternate
0
AC018755.16
|
52106000
|
chr2: 19504600-19506400
Gain
Alternate
0
AC092594.1
|
chr2: 118899750-
Gain
Alternate
0
AC093901.1
|
118901550
|
chr17: 263900-267650
Loss
Alternate
0
AC108004.3
|
chr3: 18734950-18736300
Gain
Alternate
0
AC144521.1
|
chr12: 109568950-
Loss
Alternate
0
ACACB
|
109570000
|
chrX: 23783150-
Gain
Alternate
0
ACOT9
|
23786000
|
chr7: 5601050-5603800
Gain
Alternate
0
ACTB
|
chr7: 15600650-
Gain
Alternate
0
AGMO
|
15602200
|
chr21: 45336050-
Loss
Alternate
0
AGPAT3
|
45337600
|
chr15: 86232000-
Loss
Alternate
0
AKAP13
|
86236800
|
chr9: 112909300-
Loss
Alternate
0
AKAP2
|
112915400
|
chr2: 241496150-
Gain
Alternate
0
ANKMY1
|
241498200
|
chr2: 242127000-
Loss
Alternate
0
ANO7
|
242129850
|
chr5: 139972550-
Gain
Alternate
0
APBB3
|
139973900
|
chr18: 24443050-
Loss
Alternate
0
AQP4-AS1
|
24445900
|
chr4: 86395150-86399900
Loss
Alternate
0
ARHGAP24
|
chr19: 47362700-
Gain
Alternate
0
ARHGAP35
|
47367650
|
chr9: 35672750-35677150
Loss
Alternate
0
ARHGEF39
|
chrX: 100739600-
Gain
Alternate
0
ARMCX4
|
100741600
|
chr9: 120175650-
Loss
Alternate
0
ASTN2
|
120177900
|
chr3: 193270000-
Loss
Alternate
0
ATP13A4
|
193274550
|
chr18: 77102950-
Loss
Alternate
0
ATP9B
|
77104300
|
chr1: 179486050-
Loss
Alternate
0
AXDND1
|
179487950
|
chr4: 102332100-
Gain
Alternate
0
BANK1
|
102333250
|
chr1: 94046300-94051100
Loss
Alternate
0
BCAR3
|
chr11: 27686500-
Gain
Alternate
0
BDNF-AS
|
27687900
|
chr20: 11897750-
Loss
Alternate
0
BTBD3
|
11902000
|
chr11: 63531650-
Gain
Alternate
0
C11orf95
|
63533550
|
chr19: 30199050-
Gain
Alternate
0
C19orf12
|
30200500
|
chr1: 207991400-
Loss
Alternate
0
C1orf132
|
208001200
|
chr6: 109571700-
Gain
Alternate
0
C6orf183
|
109573350
|
chr8: 128305850-
Gain
Alternate
0
CASC8
|
128307550
|
chr5: 43409150-43412850
Loss
Alternate
0
CCL28
|
chr8: 95245700-95247400
Gain
Alternate
0
CDH17
|
chr7: 105603300-
Loss
Alternate
0
CDHR3
|
105604700
|
chr7: 90338500-90340500
Loss
Alternate
0
CDK14
|
chr7: 29184550-29187650
Gain
Alternate
0
CHN2
|
chr15: 79011600-
Gain
Alternate
0
CHRNB4
|
79013200
|
chr7: 139226300-
Gain
Alternate
0
CLEC2L
|
139228850
|
chr6: 25164900-25167200
Loss
Alternate
0
CMAHP
|
chr16: 81684900-
Loss
Alternate
0
CMIP
|
81687600
|
chr6: 37391200-37392800
Gain
Alternate
0
CMTR1
|
chr3: 74662150-74664400
Loss
Alternate
0
CNTN3
|
chr11: 111172600-
Loss
Alternate
0
COLCA1
|
111176650
|
chr6: 36722500-36725900
Loss
Alternate
0
CPNE5
|
chr11: 85392850-
Loss
Alternate
0
CREBZF
|
85394650
|
chr16: 21288600-
Gain
Alternate
0
CRYM
|
21290700
|
chr5: 60597450-60601050
Loss
Alternate
0
CTC-436P18.3
|
chr15: 45544050-45548600
Loss
Alternate
0
CTD-2651B20.3
|
chr20: 110300-111350
Gain
Alternate
0
DEFB126
|
chr2: 234326350-
Loss
Alternate
0
DGKD
|
234331500
|
chr1: 223101350-
Loss
Alternate
0
DISP1
|
223104800
|
chr11: 111852050-
Loss
Alternate
0
DIXDC1
|
111855050
|
chr13: 50759600-
Gain
Alternate
0
DLEU1
|
50762100
|
chr1: 46954600-46956800
Gain
Alternate
0
DMBX1
|
chr16: 30021900-
Gain
Alternate
0
DOC2A
|
30023950
|
chr6: 56715250-56717500
Gain
Alternate
0
DST
|
chr18: 46894350-
Loss
Alternate
0
DYM
|
46895900
|
chr5: 106838450-
Loss
Alternate
0
EFNA5
|
106842400
|
chr4: 111331750-
Gain
Alternate
0
ENPEP
|
111333350
|
chr14: 74461400-
Loss
Alternate
0
ENTPD5
|
74463450
|
chr19: 55590850-
Gain
Alternate
0
EPS8L1
|
55593800
|
chr5: 172332450-
Loss
Alternate
0
ERGIC1
|
172333000
|
chr1: 17024500-17028900
Gain
Alternate
0
ESPNP
|
chr1: 216892850-
Loss
Alternate
0
ESRRG
|
216898200
|
chr1: 217249050-
Loss
Alternate
0
ESRRG
|
217252200
|
chr6: 36326200-36331550
Gain
Alternate
0
ETV7
|
chr12: 124778800-
Loss
Alternate
0
FAM101A
|
124786100
|
chr17: 47822200-
Loss
Alternate
0
FAM117A
|
47825200
|
chr4: 187025100-
Loss
Alternate
0
FAM149A
|
187028650
|
chr1: 178986050-
Loss
Alternate
0
FAM20B
|
178987900
|
chr7: 102574000-
Loss
Alternate
0
FBXL13
|
102576900
|
chr16: 86529000-
Loss
Alternate
0
FENDRR
|
86534050
|
chr20: 34192700-
Loss
Alternate
0
FER1L4
|
34196000
|
chr8: 124926550-
Gain
Alternate
0
FER1L6
|
124929550
|
chr7: 121942750-
Gain
Alternate
0
FEZF1
|
121947900
|
chr12: 32654200-
Loss
Alternate
0
FGD4
|
32659150
|
chr16: 86608950-
Gain
Alternate
0
FOXL1
|
86611800
|
chr8: 75230900-75235150
Gain
Alternate
0
GDAP1
|
chr7: 100288750-
Gain
Alternate
0
GIGYF1
|
100293000
|
chr11: 58694450-
Loss
Alternate
0
GLYATL1
|
58696550
|
chr5: 89854500-89855350
Loss
Alternate
0
GPR98
|
chr2: 165476750-
Gain
Alternate
0
GRB14
|
165479250
|
chr9: 140056700-
Gain
Alternate
0
GRIN1
|
140058300
|
chr19: 48900250-
Gain
Alternate
0
GRIN2D
|
48904400
|
chr9: 104466750-
Gain
Alternate
0
GRIN3A
|
104468450
|
chr3: 14642850-14644150
Loss
Alternate
0
GRIP2
|
chr11: 2016000-2021350
Gain
Alternate
0
H19
|
chrX: 152760450-
Gain
Alternate
0
HAUS7
|
152761150
|
chr7: 18534500-18539050
Loss
Alternate
0
HDAC9
|
chr15: 83619150-
Loss
Alternate
0
HOMER2
|
83622750
|
chr7: 27159450-27164850
Gain
Alternate
0
HOXA3
|
chr7: 27208400-27220700
Gain
Alternate
0
HOXA9
|
chr17: 46678350-
Gain
Alternate
0
HOXB6
|
46683450
|
chr17: 46694850-
Gain
Alternate
0
HOXB8
|
46697150
|
chr3: 11178050-11179900
Gain
Alternate
0
HRH1
|
chr3: 11195250-11198600
Gain
Alternate
0
HRH1
|
chr3: 11265900-11269000
Gain
Alternate
0
HRH1
|
chr1: 23543800-23544900
Gain
Alternate
0
HTR1D
|
chrX: 130711450-
Gain
Alternate
0
IGSF1
|
130713600
|
chr17: 38016450-
Loss
Alternate
0
IKZF3
|
38022250
|
chr2: 113619100-
Loss
Alternate
0
IL1B
|
113622250
|
chr4: 143394250-
Gain
Alternate
0
INPP4B
|
143396200
|
chr19: 2255550-2257400
Loss
Alternate
0
JSRP1
|
chr17: 68071050-
Loss
Alternate
0
KCNJ16
|
68073700
|
chr14: 88788450-
Gain
Alternate
0
KCNK10
|
88791000
|
chr4: 56914350-56916700
Gain
Alternate
0
KIAA1211
|
chr10: 24725650-
Loss
Alternate
0
KIAA1217
|
24728200
|
chr11: 33398050-
Gain
Alternate
0
KIAA1549L
|
33400750
|
chr15: 31637200-
Loss
Alternate
0
KLF13
|
31640250
|
chr19: 55019200-
Gain
Alternate
0
LAIR2
|
55020400
|
chr1: 65991250-65992850
Loss
Alternate
0
LEPR
|
chr5: 78014050-78017100
Loss
Alternate
0
LHFPL2
|
chr12: 113904650-
Gain
Alternate
0
LHX5
|
113906650
|
chr22: 30651400-
Gain
Alternate
0
LIF
|
30654850
|
chr20: 21085550-
Gain
Alternate
0
LINC00237
|
21087550
|
chr13: 74234250-
Gain
Alternate
0
LINC00393
|
74236800
|
chr3: 8652200-8654000
Gain
Alternate
0
LMCD1-AS1
|
chr20: 6031700-6033850
Gain
Alternate
0
LRRN4
|
chr3: 116161150-
Gain
Alternate
0
LSAMP
|
116164900
|
chr11: 1889150-1894600
Loss
Alternate
0
LSP1
|
chrX: 149588950-
Gain
Alternate
0
MAMLD1
|
149590100
|
chr1: 27683050-27684600
Loss
Alternate
0
MAP3K6
|
chrX: 20115700-
Loss
Alternate
0
MAP7D2
|
20118300
|
chr3: 150959500-
Gain
Alternate
0
MED12L
|
150960300
|
chr22: 42148300-
Loss
Alternate
0
MEI1
|
42150300
|
chr1: 205537050-
Loss
Alternate
0
MFSD4
|
205540700
|
chr1: 22489600-22491100
Gain
Alternate
0
MIR4418
|
chr19: 748150-750100
Gain
Alternate
0
MISP
|
chr3: 69914350-69917750
Loss
Alternate
0
MITF
|
chr6: 168215700-168217350
Gain
Alternate
0
MLLT4-AS1
|
chr19: 1286150-1288700
Gain
Alternate
0
MUM1
|
chr19: 50690700-
Gain
Alternate
0
MYH14
|
50695700
|
chr17: 73606350-73609450
Gain
Alternate
0
MYO15B
|
chr17: 31010250-
Gain
Alternate
0
MYO1D
|
31012000
|
chr18: 55888350-
Loss
Alternate
0
NEDD4L
|
55892150
|
chr2: 131965200-
Gain
Alternate
0
NF1P8
|
131968600
|
chr14: 27147750-27148900
Gain
Alternate
0
NOVA1-AS1
|
chr11: 108040050-
Loss
Alternate
0
NPAT
|
108041550
|
chr7: 98248450-98250250
Gain
Alternate
0
NPTX2
|
chr15: 76302650-
Loss
Alternate
0
NRG4
|
76305350
|
chr9: 132370500-
Gain
Alternate
0
NTMT1
|
132373750
|
chr3: 32118200-32120100
Gain
Alternate
0
OSBPL10
|
chr19: 14171500-
Loss
Alternate
0
PALM3
|
14173250
|
chr7: 32107350-32111900
Loss
Alternate
0
PDE1C
|
chr3: 111450850-
Loss
Alternate
0
PHLDB2
|
111453300
|
chr12: 18395250-
Loss
Alternate
0
PIK3C2G
|
18399450
|
chr8: 110534900-
Loss
Alternate
0
PKHD1L1
|
110536100
|
chr20: 8094750-8096650
Gain
Alternate
0
PLCB1
|
chr1: 6544500-6545600
Gain
Alternate
0
PLEKHG5
|
chr22: 41990400-
Gain
Alternate
0
PMM1
|
41991450
|
chr6: 31150550-31154950
Loss
Alternate
0
POU5F1
|
chr11: 7626600-7631400
Loss
Alternate
0
PPFIBP2
|
chr2: 182895050-
Gain
Alternate
0
PPP1R1C
|
182896750
|
chr8: 143759850-
Loss
Alternate
0
PSCA
|
143765700
|
chr8: 27237450-27239750
Loss
Alternate
0
PTK2B
|
chr8: 142384050-
Gain
Alternate
0
PTP4A3
|
142385550
|
chr9: 96767600-96770450
Loss
Alternate
0
PTPDC1
|
chr12: 120661250-
Loss
Alternate
0
PXN
|
120664850
|
chr18: 52384600-
Loss
Alternate
0
RAB27B
|
52386250
|
chr11: 82706750-
Loss
Alternate
0
RAB30
|
82709350
|
chr8: 95485350-95488300
Gain
Alternate
0
RAD54B
|
chr4: 82964050-82966400
Gain
Alternate
0
RASGEF1B
|
chr4: 40512300-40518850
Loss
Alternate
0
RBM47
|
chr9: 116225550-
Gain
Alternate
0
RGS3
|
116228700
|
chr10: 62758000-
Loss
Alternate
0
RHOBTB1
|
62762450
|
chr8: 104510350-
Gain
Alternate
0
RIMS2
|
104514700
|
chr21: 38379100-
Gain
Alternate
0
RIPPLY3
|
38379750
|
chr8: 61324800-61327100
Gain
Alternate
0
RP11-163N6.2
|
chr20: 6301750-6304300
Gain
Alternate
0
RP11-199O14.1
|
chr3: 187606800-187608950
Gain
Alternate
0
RP11-30O15.1
|
chr1: 39191950-39194400
Loss
Alternate
0
RP11-334L9.1
|
chr11: 112140350-112142500
Gain
Alternate
0
RP11-356J5.12
|
chr6: 82809950-82812100
Gain
Alternate
0
RP11-379B8.1
|
chr14: 39702300-39706400
Loss
Alternate
0
RP11-407N17.3
|
chr1: 203394800-203398950
Gain
Alternate
0
RP11-435P24.3
|
chr9: 72091300-72092650
Gain
Alternate
0
RP11-470P21.2
|
chr15: 82161650-82163400
Gain
Alternate
0
RP11-499F3.2
|
chr4: 88631250-88631950
Gain
Alternate
0
RP11-742B18.1
|
chr11: 94372300-94374550
Gain
Alternate
0
RP11-867G2.5
|
chr3: 131049650-131051500
Gain
Alternate
0
RP11-933H2.4
|
chr17: 10746250-10749200
Loss
Alternate
0
RP11-963H4.3
|
chr6: 85334900-85337050
Gain
Alternate
0
RP1-90L14.1
|
chr7: 156735150-156736500
Gain
Alternate
0
RP5-1121A15.3
|
chr2: 55236200-55238400
Loss
Alternate
0
RTN4
|
chr16: 51186150-
Loss
Alternate
0
SALL1
|
51187850
|
chr2: 200326950-
Gain
Alternate
0
SATB2
|
200329550
|
chr3: 53031650-53034600
Gain
Alternate
0
SFMBT1
|
chr14: 71849000-
Loss
Alternate
0
SIPA1L1
|
71850350
|
chr1: 232760700-
Gain
Alternate
0
SIPA1L2
|
232767700
|
chr7: 100448750-
Gain
Alternate
0
SLC12A9
|
100451750
|
chr12: 105344050-
Loss
Alternate
0
SLC41A2
|
105348050
|
chr6: 31843950-31847850
Loss
Alternate
0
SLC44A4
|
chr1: 75840850-75842350
Gain
Alternate
0
SLC44A5
|
chr1: 205637750-
Gain
Alternate
0
SLC45A3
|
205639250
|
chr11: 26985950-
Gain
Alternate
0
SLC5A12
|
26987450
|
chr14: 23622000-
Loss
Alternate
0
SLC7A8
|
23623950
|
chr22: 31459200-
Gain
Alternate
0
SMTN
|
31461650
|
chr20: 10197250-10201300
Gain
Alternate
0
SNAP25-AS1
|
chr16: 1842850-1844950
Loss
Alternate
0
SPSB3
|
chr11: 4010850-4011700
Loss
Alternate
0
STIM1
|
chr8: 99951150-99961750
Gain
Alternate
0
STK3
|
chr7: 23761400-23764000
Gain
Alternate
0
STK31
|
chr1: 110573450-
Loss
Alternate
0
STRIP1
|
110574700
|
chr7: 73131100-73134700
Gain
Alternate
0
STX1A
|
chr20: 46411750-
Gain
Alternate
0
SULF2
|
46414250
|
chr12: 79438650-
Gain
Alternate
0
SYT1
|
79440250
|
chr15: 57509850-
Loss
Alternate
0
TCF12
|
57515600
|
chr12: 110411050-
Gain
Alternate
0
TCHP
|
110419200
|
chr21: 32640100-
Loss
Alternate
0
TIAM1
|
32641350
|
chr19: 3707600-3711250
Loss
Alternate
0
TJP3
|
chr10: 102830000-
Loss
Alternate
0
TLX1NB
|
102833650
|
chr2: 228241600-
Gain
Alternate
0
TM4SF20
|
228244450
|
chr16: 19427700-
Gain
Alternate
0
TMC5
|
19435900
|
chr7: 47490900-47493500
Loss
Alternate
0
TNS3
|
chr8: 144436800-
Gain
Alternate
0
TOP1MT
|
144438000
|
chr13: 45955000-
Gain
Alternate
0
TPT1-AS1
|
45957700
|
chr17: 3459750-3462900
Loss
Alternate
0
TRPV3
|
chr3: 12522200-12524700
Gain
Alternate
0
TSEN2
|
chr22: 46683150-
Loss
Alternate
0
TTC38
|
46685350
|
chr6: 133003800-
Gain
Alternate
0
VNN1
|
133008900
|
chr15: 53831700-
Gain
Alternate
0
WDR72
|
53833550
|
chr11: 102617350-
Gain
Alternate
0
WTAPP1
|
102619450
|
chr11: 68436350-
Gain
Alternate
0
Novel Gene
|
68438200
|
chr12: 125226400-
Loss
Alternate
0
Novel Gene
|
125228400
|
chr12: 89240400-
Gain
Alternate
0
Novel Gene
|
89241750
|
chr14: 99752650-
Loss
Alternate
0
Novel Gene
|
99754000
|
chr18: 76805850-
Gain
Alternate
0
Novel Gene
|
76809250
|
chr19: 53560600-
Gain
Alternate
0
Novel Gene
|
53562700
|
chr2: 45227500-45229600
Gain
Alternate
0
Novel Gene
|
chr2: 134784950-
Gain
Alternate
0
Novel Gene
|
134786450
|
chr2: 176458500-
Gain
Alternate
0
Novel Gene
|
176460750
|
chr20: 46600150-
Gain
Alternate
0
Novel Gene
|
46603250
|
chr4: 10830100-10832350
Gain
Alternate
0
Novel Gene
|
chr5: 35404300-35405800
Gain
Alternate
0
Novel Gene
|
chr5: 42999400-43001150
Gain
Alternate
0
Novel Gene
|
chr5: 72496650-72498300
Gain
Alternate
0
Novel Gene
|
chr1: 204682350-
Loss
Alternate
0
Novel Gene
|
204684550
|
chr6: 868400-871100
Loss
Alternate
0
Novel Gene
|
chr1: 220635500-
Gain
Alternate
0
Novel Gene
|
220637400
|
chr6: 47146850-47150550
Loss
Alternate
0
Novel Gene
|
chr6: 160720200-
Gain
Alternate
0
Novel Gene
|
160722150
|
chr6: 170474550-
Gain
Alternate
0
Novel Gene
|
170475800
|
chr1: 242107250-
Gain
Alternate
0
Novel Gene
|
242109450
|
chr7: 27274550-27276500
Gain
Alternate
0
Novel Gene
|
chr9: 17905350-17908250
Loss
Alternate
0
Novel Gene
|
chr9: 31848250-31849950
Gain
Alternate
0
Novel Gene
|
chrX: 56133300-
Gain
Alternate
0
Novel Gene
|
56134800
|
chrX: 3466450-3468750
Gain
Alternate
0
Novel Gene
|
chrX: 6849150-6851300
Gain
Alternate
0
Novel Gene
|
chr11: 60941900-
Loss
Alternate
0
Novel Gene
|
60945700
|
chr11: 71350450-
Gain
Alternate
0
Novel Gene
|
71351500
|
chr11: 119775600-
Loss
Alternate
0
Novel Gene
|
119779600
|
chr5: 82391600-82392950
Gain
Alternate
0
XRCC4
|
chr3: 141107100-
Loss
Alternate
0
ZBTB38
|
141108400
|
chr18: 45660800-
Loss
Alternate
0
ZBTB7C
|
45664950
|
chr13: 100619800-
Gain
Alternate
0
ZIC5
|
100623100
|
chr2: 180425300-
Loss
Alternate
0
ZNF385B
|
180426950
|
chr19: 53539900-
Gain
Alternate
0
ZNF702P
|
53541600
|
|
To explore the influence of alternative promoters on protein diversity, we identified 714 tumor-specific promoter alterations that were predicted to change N-terminal protein composition and were supported by both H3K4me3 and RNA-seq data. The vast majority of these alterations (>95%) were in-frame with the canonical protein. Of the 714 alterations, 47% (n=338) were predicted to cause gains of new N-terminal peptides in tumors (see Methods). To confirm protein-level expression of these N-terminal peptides in gastrointestinal cancer, we queried publicly available peptide spectral data from 90 TCGA colorectal cancer (CRC) samples and 60 normal colon samples. CRC data were used for this analysis because large-scale proteomic data for primary GCs are not currently available, and because many GC somatic promoters are also observed in CRC (FIG. 2d). Among the N-terminal peptides predicted to be gained in tumors, we confirmed protein expression of 33% (112/338) in the CRC data (Table 7), of which 51.8% were overexpressed in CRC samples relative to normal colon samples (FDR 10%). In a separate experiment, we further investigated whether these N-terminal peptides also exhibit tumor overexpression in proteomic data from 3 GC cell lines and 1 normal gastric epithelial line (GES1) (Methods and Materials). Similar to the CRC data, 48% of the N-terminal peptides were overexpressed in the GC lines relative to normal GES1 gastric cells. Taken collectively, these analyses suggest that alternative promoters may contribute significantly towards proteomic diversity in gastrointestinal cancer.
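As an illustration of the thresholding mentioned above, the following is a minimal sketch of how per-peptide overexpression calls at a 10% false discovery rate could be made. The text states only the FDR level; the use of the Benjamini-Hochberg step-up procedure, the function name `benjamini_hochberg`, and the example p-values are assumptions made for illustration and are not taken from the described analysis. The sketch also recomputes the two fractions quoted in the text from the stated counts.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.10) -> List[bool]:
    """Return one reject flag per p-value under the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose sorted p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every hypothesis ranked at or below the cutoff is rejected.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[idx] = True
    return reject


# Fractions quoted in the text, recomputed from the stated counts.
gained_fraction = 338 / 714      # predicted N-terminal gains among 714 alterations
confirmed_fraction = 112 / 338   # gains with peptide-level support in the CRC data

# Hypothetical per-peptide p-values (tumor versus normal); NOT from the source.
example_p = [0.001, 0.2, 0.03, 0.04, 0.5]
flags = benjamini_hochberg(example_p, fdr=0.10)
```

With these hypothetical inputs, the first, third and fourth peptides would be called overexpressed at the 10% level; a different multiple-testing correction would generally give different calls.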
TABLE 7
|
|
Spectral Counts from CRC samples of N-terminal peptides predicted to be gained in GC
|
SEQ_ID_NO
Peptide
GeneId
Spectral Count
|
|
SEQ ID NO: 1
IDNSQVESGSLEDDWDFLPPKK
ENSG00000179218.9
2602
|
|
SEQ ID NO: 2
FYALSASFEPFSNK
ENSG00000179218.9
2047
|
|
SEQ ID NO: 3
EQFLDGDGWTSR
ENSG00000179218.9
1370
|
|
SEQ ID NO: 4
IKDPDASKPEDWDER
ENSG00000179218.9
805
|
|
SEQ ID NO: 5
GDVTAQIALQPALK
ENSG00000112096.12
601
|
|
SEQ ID NO: 6
GISLNPEQWSQLK
ENSG00000113387.7
536
|
|
SEQ ID NO: 7
AYHSFLVEPISCHAWNK
ENSG00000130429.8
497
|
|
SEQ ID NO: 8
IAVQPGTVGPQGR
ENSG00000134871.13
468
|
|
SEQ ID NO: 9
VLAQNSGFDLQETLVK
ENSG00000146731.6
435
|
|
SEQ ID NO: 10
CKDDEFTHLYTLIVRPDNTYEVK
ENSG00000179218.9
424
|
|
SEQ ID NO: 11
AKIDDPTDSKPEDWDKPEHIPDPDAK
ENSG00000179218.9
414
|
|
SEQ ID NO: 12
VHVIFNYK
ENSG00000179218.9
396
|
|
SEQ ID NO: 13
HEQNIDCGGGYVK
ENSG00000179218.9
361
|
|
SEQ ID NO: 14
LIDFGLAR
ENSG00000065534.14
359
|
|
SEQ ID NO: 15
TWKPTLVILR
ENSG00000130429.8
358
|
|
SEQ ID NO: 16
AIWNVINWENVTER
ENSG00000112096.12
353
|
|
SEQ ID NO: 17
IDDPTDSKPEDWDKPEHIPDPDAK
ENSG00000179218.9
323
|
|
SEQ ID NO: 18
NVRPDYLK
ENSG00000112096.12
320
|
|
SEQ ID NO: 19
NSVSQISVLSGGK
ENSG00000130429.8
317
|
|
SEQ ID NO: 20
DGNVLLHEMQIQHPTASLIAK
ENSG00000146731.6
314
|
|
SEQ ID NO: 21
AGATHVER
ENSG00000145016.9
311
|
|
SEQ ID NO: 22
LVALLNTLDR
ENSG00000119383.15
298
|
|
SEQ ID NO: 23
HHAAYVNNLNVTEEK
ENSG00000112096.12
296
|
|
SEQ ID NO: 24
FYGDEEKDKGLQTSQDAR
ENSG00000179218.9
290
|
|
SEQ ID NO: 25
KVHVIFNYK
ENSG00000179218.9
283
|
|
SEQ ID NO: 26
GPLPAAPPVAPER
ENSG00000115310.13
282
|
|
SEQ ID NO: 27
VLLSALER
ENSG00000100714.11
277
|
|
SEQ ID NO: 28
SVSIGYLLVK
ENSG00000134871.13
276
|
|
SEQ ID NO: 29
IQQEIAVQNPLVSER
ENSG00000167770.7
271
|
|
SEQ ID NO: 30
GELLEAIKR
ENSG00000112096.12
268
|
|
SEQ ID NO: 31
AHNQDLGLAGSCLAR
ENSG00000134871.13
265
|
|
SEQ ID NO: 32
YVVVTGITPTPLGEGK
ENSG00000100714.11
256
|
|
SEQ ID NO: 33
MEDLDQSPLVSSSDSPPRPQPAFK
ENSG00000115310.13
254
|
|
SEQ ID NO: 34
AAQAPSSFQLLYDLK
ENSG00000100714.11
253
|
|
SEQ ID NO: 35
LQAQLNELQAQLSQK
ENSG00000137497.13
250
|
|
SEQ ID NO: 36
ALQFLEEVK
ENSG00000146731.6
244
|
|
SEQ ID NO: 37
LLTSGYLQR
ENSG00000167770.7
242
|
|
SEQ ID NO: 38
GDLNDCFIPCTPK
ENSG00000100714.11
241
|
|
SEQ ID NO: 39
ASSEGGTAAGAGLDSLHK
ENSG00000130429.8
240
|
|
SEQ ID NO: 40
EAVTEILGIEPDREK
ENSG00000211460.7
236
|
|
SEQ ID NO: 41
EVEERPAPTPWGSK
ENSG00000130429.8
235
|
|
SEQ ID NO: 42
IITEGFEAAK
ENSG00000146731.6
235
|
|
SEQ ID NO: 43
YLNIFGESQPNPK
ENSG00000004864.9
234
|
|
SEQ ID NO: 44
LTAASVGVQGSGWGWLGFNK
ENSG00000112096.12
229
|
|
SEQ ID NO: 45
IAPLEEGTLPFNLAEAQR
ENSG00000004864.9
221
|
|
SEQ ID NO: 46
GQTLVVQFTVK
ENSG00000179218.9
220
|
|
SEQ ID NO: 47
AQLGVQAFADALLIIPK
ENSG00000146731.6
217
|
|
SEQ ID NO: 48
QVAPEKPVK
ENSG00000113387.7
217
|
|
SEQ ID NO: 49
VATAQDDITGDGTTSNVLIIGELLK
ENSG00000146731.6
215
|
|
SEQ ID NO: 50
GLLPQLLGVAPEK
ENSG00000004864.9
214
|
|
SEQ ID NO: 51
NAYVWTLK
ENSG00000130429.8
214
|
|
SEQ ID NO: 52
IYGADDIELLPEAQHK
ENSG00000100714.11
211
|
|
SEQ ID NO: 53
CHAIIDEQPLIFK
ENSG00000169756.12
210
|
|
SEQ ID NO: 54
KGISLNPEQWSQLK
ENSG00000113387.7
209
|
|
SEQ ID NO: 55
GIDPFSLDALSK
ENSG00000146731.6
207
|
|
SEQ ID NO: 56
LLQCYPPPEDAAVK
ENSG00000196961.8
207
|
|
SEQ ID NO: 57
GVPTGFILPIR
ENSG00000100714.11
204
|
|
SEQ ID NO: 58
IVTCGTDR
ENSG00000130429.8
204
|
|
SEQ ID NO: 59
TPVPSDIDISR
ENSG00000100714.11
203
|
|
SEQ ID NO: 60
YQEALAK
ENSG00000112096.12
198
|
|
SEQ ID NO: 61
VAWVSHDSTVCLADADKK
ENSG00000130429.8
197
|
|
SEQ ID NO: 62
LDIDPETITWQR
ENSG00000100714.11
194
|
|
SEQ ID NO: 63
IDNSQVESGSLEDDWDFLPPK
ENSG00000179218.9
192
|
|
SEQ ID NO: 64
LAILQVGNR
ENSG00000100714.11
192
|
|
SEQ ID NO: 65
AQAALAVNISAAR
ENSG00000146731.6
191
|
|
SEQ ID NO: 66
GALALAQAVQR
ENSG00000100714.11
189
|
|
SEQ ID NO: 67
TDPTTLTDEEINR
ENSG00000100714.11
189
|
|
SEQ ID NO: 68
LELSVLYK
ENSG00000167770.7
188
|
|
SEQ ID NO: 69
GLDGYQGPDGPR
ENSG00000134871.13
187
|
|
SEQ ID NO: 70
LSGLEQPQGALQTR
ENSG00000133316.11
184
|
|
SEQ ID NO: 71
SCQTALVEILDVIVR
ENSG00000067704.8
182
|
|
SEQ ID NO: 72
DDNMFQIGK
ENSG00000113387.7
181
|
|
SEQ ID NO: 73
EHNGQVTGIDWAPESNR
ENSG00000130429.8
179
|
|
SEQ ID NO: 74
KIKDPDASKPEDWDER
ENSG00000179218.9
178
|
|
SEQ ID NO: 75
MFGIPVVVAVNAFK
ENSG00000100714.11
178
|
|
SEQ ID NO: 76
FFEHFIEGGR
ENSG00000167770.7
177
|
|
SEQ ID NO: 77
IFHELTQTDK
ENSG00000100714.11
174
|
|
SEQ ID NO: 78
FINLFPETK
ENSG00000196961.8
172
|
|
SEQ ID NO: 79
FYGDEEKDK
ENSG00000179218.9
172
|
|
SEQ ID NO: 80
FNGGGHINHSIFWTNLSPNGGGEPK
ENSG00000112096.12
169
|
|
SEQ ID NO: 81
DPDASKPEDWDER
ENSG00000179218.9
168
|
|
SEQ ID NO: 82
LGSPDYGNSALLSLPGYRPTTR
ENSG00000137497.13
168
|
|
SEQ ID NO: 83
ASGDSARPVLLQVAESAYR
ENSG00000004864.9
167
|
|
SEQ ID NO: 84
TDTESELDLISR
ENSG00000100714.11
166
|
|
SEQ ID NO: 85
LDFVCSFLQK
ENSG00000137497.13
165
|
|
SEQ ID NO: 86
WIDETPPVDQPSR
ENSG00000119383.15
165
|
|
SEQ ID NO: 87
GLLGALTSTPYSPTQHLER
ENSG00000153310.14
164
|
|
SEQ ID NO: 88
KPEDWDEEMDGEWEPPVIQNPEYK
ENSG00000179218.9
162
|
|
SEQ ID NO: 89
FSDIQIR
ENSG00000100714.11
160
|
|
SEQ ID NO: 90
STSFNVQDLLPDHEYK
ENSG00000065534.14
160
|
|
SEQ ID NO: 91
GEQGFMGNTGPTGAVGDR
ENSG00000134871.13
159
|
|
SEQ ID NO: 92
QPSQGPTFGIK
ENSG00000100714.11
157
|
|
SEQ ID NO: 93
THLSLSHNPEQK
ENSG00000100714.11
157
|
|
SEQ ID NO: 94
APVPSTCSSTFPEELSPPSHQAK
ENSG00000137497.13
155
|
|
SEQ ID NO: 95
GEGGTTNPHIFPEGSEPK
ENSG00000167770.7
155
|
|
SEQ ID NO: 96
TALAEAELEYNPEHVSR
ENSG00000067704.8
155
|
|
SEQ ID NO: 97
FPLLKPSPK
ENSG00000067704.8
154
|
|
SEQ ID NO: 98
DQAANLMANR
ENSG00000198947.10
153
|
|
SEQ ID NO: 99
HLTAQVR
ENSG00000137497.13
153
|
|
SEQ ID NO:
FVLSSGK
ENSG00000179218.9
149
|
100
|
|
SEQ ID NO:
SSLPPVLGTESDATVK
ENSG00000065534.14
148
|
101
|
|
SEQ ID NO:
AWGAVVPLVGK
ENSG00000153310.14
146
|
102
|
|
SEQ ID NO:
IEGYPDPEVVWFK
ENSG00000065534.14
145
|
103
|
|
SEQ ID NO:
GKNVLINK
ENSG00000179218.9
144
|
104
|
|
SEQ ID NO:
GLQTSQDAR
ENSG00000179218.9
144
|
105
|
|
SEQ ID NO:
HTLTQIK
ENSG00000146731.6
144
|
106
|
|
SEQ ID NO:
VHAELADVLTEAVVDSILAIK
ENSG00000146731.6
144
|
107
|
|
SEQ ID NO:
YVIHTVGPIAYGEPSASQAAELR
ENSG00000133315.6
142
|
108
|
|
SEQ ID NO:
IQSSHNFQLESVNK
ENSG00000135052.12
141
|
109
|
|
SEQ ID NO:
QIDNPDYK
ENSG00000179218.9
140
|
110
|
|
SEQ ID NO:
DAEGILEDLQSYR
ENSG00000153310.14
139
|
111
|
|
SEQ ID NO:
YTAESSDTLCPR
ENSG00000067704.8
139
|
112
|
|
SEQ ID NO:
EESREPAPASPAPAGVEIR
ENSG00000113657.8
138
|
113
|
|
SEQ ID NO:
EMDRETLIDVAR
ENSG00000146731.6
138
|
114
|
|
SEQ ID NO:
NEVSFVIHNLPVLAK
ENSG00000086475.10
138
|
115
|
|
SEQ ID NO:
QVAPEKPVKK
ENSG00000113387.7
137
|
116
|
|
SEQ ID NO:
FLINLEGGDIR
ENSG00000067704.8
136
|
117
|
|
SEQ ID NO:
LSVNSVTAGDYSR
ENSG00000211460.7
135
|
118
|
|
SEQ ID NO: 119
QAQVNLTVVDKPDPPAGTPCASDIR
ENSG00000065534.14
135
|
|
SEQ ID NO:
IFDDVSSGVSQLASK
ENSG00000101199.8
134
|
120
|
|
SEQ ID NO:
PDASKPEDWDER
ENSG00000179218.9
134
|
121
|
|
SEQ ID NO:
YGGAPQALTLK
ENSG00000196961.8
132
|
122
|
|
SEQ ID NO:
LVTPGETPSWTGSGFVR
ENSG00000172037.9
131
|
123
|
|
SEQ ID NO:
EQISDIDDAVR
ENSG00000113387.7
129
|
124
|
|
SEQ ID NO: 125
KPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPR
ENSG00000115310.13
129
|
|
SEQ ID NO:
ATSSTQSLAR
ENSG00000137497.13
128
|
126
|
|
SEQ ID NO:
LLVPTQFVGAIIGK
ENSG00000136231.9
128
|
127
|
|
SEQ ID NO:
GELLEAIK
ENSG00000112096.12
126
|
128
|
|
SEQ ID NO:
FFQPTEMAAQDFFQR
ENSG00000196961.8
124
|
129
|
|
SEQ ID NO:
GSGSRPGIEGDTPR
ENSG00000113657.8
121
|
130
|
|
SEQ ID NO: 131
NAIDDGCVVPGAGAVEVAMAEALIK
ENSG00000146731.6
121
|
|
SEQ ID NO: 132
AAAAAAVGPGAGGAGSAVPGGAGPCATVSVFPGAR
ENSG00000142453.7
120
|
|
SEQ ID NO:
DFLTPPLLSVR
ENSG00000196961.8
120
|
133
|
|
SEQ ID NO:
LFVVPADEAQAR
ENSG00000105223.14
120
|
134
|
|
SEQ ID NO:
WMIQYNNLNLK
ENSG00000100714.11
120
|
135
|
|
SEQ ID NO:
SLPISLVFLVPVR
ENSG00000169896.12
119
|
136
|
|
SEQ ID NO:
ALQVGCLLR
ENSG00000196961.8
118
|
137
|
|
SEQ ID NO:
ESFNPESYELDK
ENSG00000086475.10
118
|
138
|
|
SEQ ID NO:
TGWISTSSIWK
ENSG00000067704.8
118
|
139
|
|
SEQ ID NO:
EYAEDDNIYQQK
ENSG00000167770.7
117
|
140
|
|
SEQ ID NO:
TQIAICPNNHEVHIYEK
ENSG00000130429.8
117
|
141
|
|
SEQ ID NO:
SLEAQVAHADQQLR
ENSG00000137497.13
116
|
142
|
|
SEQ ID NO:
SVTLLIK
ENSG00000146731.6
116
|
143
|
|
SEQ ID NO:
IHFVPGWDCHGLPIEIK
ENSG00000067704.8
115
|
144
|
|
SEQ ID NO:
QQPDTELEIQQK
ENSG00000067704.8
115
|
145
|
|
SEQ ID NO:
KGEPVSAEDLGVSGALTVLMK
ENSG00000100714.11
114
|
146
|
|
SEQ ID NO:
LGIGMDTCVIPLR
ENSG00000086475.10
113
|
147
|
|
SEQ ID NO: 148
QPSWDPSPVSSTVPAPSPLSAAAVSPSK
ENSG00000115310.13
113
|
|
SEQ ID NO:
QISEGVEYIHK
ENSG00000065534.14
109
|
149
|
|
SEQ ID NO:
SEGGTAAGAGLDSLHK
ENSG00000130429.8
108
|
150
|
|
SEQ ID NO:
PTGFILPIR
ENSG00000100714.11
107
|
151
|
|
SEQ ID NO:
SQAGVSSGAPPGR
ENSG00000137497.13
107
|
152
|
|
SEQ ID NO:
VCGDSDKGFVVINQK
ENSG00000146731.6
107
|
153
|
|
SEQ ID NO:
LGIVQGIVGAR
ENSG00000172037.9
104
|
154
|
|
SEQ ID NO:
FLSLPEVR
ENSG00000106066.9
103
|
155
|
|
SEQ ID NO:
GLVLDHGAR
ENSG00000146731.6
102
|
156
|
|
SEQ ID NO:
LKNQVTQLK
ENSG00000100714.11
102
|
157
|
|
SEQ ID NO: 158
TSVQFQNFSPTVVHPGDLQTQLAVQTK
ENSG00000196961.8
102
|
|
SEQ ID NO:
EPPYGADVLR
ENSG00000067704.8
101
|
159
|
|
SEQ ID NO:
AAGPLLTDECR
ENSG00000133315.6
100
|
160
|
|
SEQ ID NO:
IIEVAPQVATQNVNPTPGATS
ENSG00000086475.10
100
|
161
|
|
SEQ ID NO:
LFSQGQDVSNK
ENSG00000130396.16
100
|
162
|
|
SEQ ID NO:
VSGPWEEADAEAVAR
ENSG00000090006.13
100
|
163
|
|
SEQ ID NO:
VTGTQPITCTWMK
ENSG00000065534.14
100
|
164
|
|
SEQ ID NO:
VLIDIR
ENSG00000113387.7
99
|
165
|
|
SEQ ID NO:
AVLEEGTDVVIK
ENSG00000067704.8
98
|
166
|
|
SEQ ID NO:
QFAEILHFTLR
ENSG00000153310.14
97
|
167
|
|
SEQ ID NO: 168
IVGAPMHDLLLWNNATVTTCHSK
ENSG00000100714.11
96
|
|
SEQ ID NO:
AYIQENLELVEK
ENSG00000100714.11
95
|
169
|
|
SEQ ID NO:
EIGLLSEEVELYGETK
ENSG00000100714.11
95
|
170
|
|
SEQ ID NO:
DSFLGSIPGK
ENSG00000067704.8
94
|
171
|
|
SEQ ID NO:
QLDALLEALK
ENSG00000172037.9
94
|
172
|
|
SEQ ID NO:
IIDEDFELTER
ENSG00000065534.14
93
|
173
|
|
SEQ ID NO:
DTINLLDQR
ENSG00000135052.12
92
|
174
|
|
SEQ ID NO:
VVQSLEQTAR
ENSG00000211460.7
92
|
175
|
|
SEQ ID NO:
DDSNLYINVK
ENSG00000100714.11
90
|
176
|
|
SEQ ID NO:
VSGQPQSVTASSDK
ENSG00000101199.8
90
|
177
|
|
SEQ ID NO:
EFCQQEVEPMCK
ENSG00000167770.7
89
|
178
|
|
SEQ ID NO:
AGNSLAASTAEETAGSAQGR
ENSG00000172037.9
88
|
179
|
|
SEQ ID NO:
EYWMDPEGEMKPGR
ENSG00000113387.7
88
|
180
|
|
SEQ ID NO:
LQSQLLSIEK
ENSG00000106976.14
88
|
181
|
|
SEQ ID NO:
AGESVELFGK
ENSG00000065534.14
86
|
182
|
|
SEQ ID NO:
NGEFFMSPNDFVTR
ENSG00000004864.9
86
|
183
|
|
SEQ ID NO:
VVVGAPQEIVAANQR
ENSG00000169896.12
86
|
184
|
|
SEQ ID NO:
SQAPLESSLDSLGDVFLDSGRK
ENSG00000137497.13
85
|
185
|
|
SEQ ID NO:
GCLELIK
ENSG00000100714.11
84
|
186
|
|
SEQ ID NO:
HSQTDQEPMCPVGMNK
ENSG00000134871.13
84
|
187
|
|
SEQ ID NO:
NPQVCGPGR
ENSG00000090006.13
83
|
188
|
|
SEQ ID NO:
SRGPGAPCQDVDECAR
ENSG00000090006.13
83
|
189
|
|
SEQ ID NO:
TKDEYLINSQTTEHIVK
ENSG00000067704.8
83
|
190
|
|
SEQ ID NO:
IATTTASAATAAAIGATPR
ENSG00000137497.13
82
|
191
|
|
SEQ ID NO:
LGHELQQAGLK
ENSG00000137497.13
82
|
192
|
|
SEQ ID NO:
TEVPPLLLILDR
ENSG00000136631.8
82
|
193
|
|
SEQ ID NO:
YGDEEKDK
ENSG00000179218.9
82
|
194
|
|
SEQ ID NO:
SESQGTAPAFK
ENSG00000065534.14
81
|
195
|
|
SEQ ID NO:
LPQEPGREQVVEDRPVGGR
ENSG00000135052.12
80
|
196
|
|
SEQ ID NO:
LPYGGQCRPCPCPEGPGSQR
ENSG00000172037.9
79
|
197
|
|
SEQ ID NO:
VYLLYRPGHYDILYK
ENSG00000167770.7
79
|
198
|
|
SEQ ID NO:
FQVATDALK
ENSG00000137497.13
78
|
199
|
|
SEQ ID NO:
LQEGQTLEFLVASVPK
ENSG00000172037.9
78
|
200
|
|
SEQ ID NO:
LQGAVCGVSSGPPPPR
ENSG00000011028.9
78
|
201
|
|
SEQ ID NO:
IQNVVTSFAPQR
ENSG00000172037.9
77
|
202
|
|
SEQ ID NO:
VSTLQNQR
ENSG00000169896.12
77
|
203
|
|
SEQ ID NO:
LSQLEEHLSQLQDNPPQEK
ENSG00000137497.13
76
|
204
|
|
SEQ ID NO:
SQAPLESSLDSLGDVFLDSGR
ENSG00000137497.13
76
|
205
|
|
SEQ ID NO:
AGPDLASCLDVDECR
ENSG00000090006.13
75
|
206
|
|
SEQ ID NO:
GTCHYYANK
ENSG00000134871.13
74
|
207
|
|
SEQ ID NO:
HKSETDTSLIR
ENSG00000146731.6
74
|
208
|
|
SEQ ID NO:
KQQNQELQEQLR
ENSG00000137497.13
74
|
209
|
|
SEQ ID NO:
SGDLYVLAADK
ENSG00000067704.8
74
|
210
|
|
SEQ ID NO:
AFGFSHLEALLDDSK
ENSG00000167770.7
73
|
211
|
|
SEQ ID NO:
EILTLLQGVHQGAGFQDIPK
ENSG00000211460.7
73
|
212
|
|
SEQ ID NO:
IQQCPGTETAEYQSLCPHGR
ENSG00000090006.13
73
|
213
|
|
SEQ ID NO:
KDPDASKPEDWDER
ENSG00000179218.9
73
|
214
|
|
SEQ ID NO: 215
SYWLSTTAPLPMMPVAEDEIKPYISR
ENSG00000134871.13
73
|
|
SEQ ID NO:
VPQDVLQK
ENSG00000086475.10
73
|
216
|
|
SEQ ID NO:
DFGSFDKFK
ENSG00000112096.12
72
|
217
|
|
SEQ ID NO:
FIILSQEGSLCSVSIEK
ENSG00000065534.14
72
|
218
|
|
SEQ ID NO:
LAVATFAGIENK
ENSG00000004864.9
72
|
219
|
|
SEQ ID NO:
RLENAGSLK
ENSG00000065534.14
72
|
220
|
|
SEQ ID NO:
AAMPPQIIQFPEDQK
ENSG00000065534.14
71
|
221
|
|
SEQ ID NO:
EAQNLSAMEIR
ENSG00000067704.8
71
|
222
|
|
SEQ ID NO:
ILVAGDSMDSVK
ENSG00000196961.8
71
|
223
|
|
SEQ ID NO:
LVHSYPYDWR
ENSG00000067704.8
71
|
224
|
|
SEQ ID NO:
AEAGDAALSVAEWLR
ENSG00000186635.10
70
|
225
|
|
SEQ ID NO:
ELSNFYFSIIK
ENSG00000067704.8
70
|
226
|
|
SEQ ID NO:
AEAAAPYTVLAQSAPR
ENSG00000090006.13
69
|
227
|
|
SEQ ID NO:
GPGAPCQDVDECAR
ENSG00000090006.13
69
|
228
|
|
SEQ ID NO:
VSDFYDIEER
ENSG00000065534.14
69
|
229
|
|
SEQ ID NO:
NNDFYVTGESYAGK
ENSG00000106066.9
68
|
230
|
|
SEQ ID NO:
QPVVDTFDIR
ENSG00000142453.7
68
|
231
|
|
SEQ ID NO:
QQLQALSEPQPR
ENSG00000135052.12
68
|
232
|
|
SEQ ID NO:
APAEILNGKEISAQIR
ENSG00000100714.11
67
|
233
|
|
SEQ ID NO:
KLDVEEPDSANSSFYSTR
ENSG00000137497.13
67
|
234
|
|
SEQ ID NO:
QPPPDSSEEAPPATQNFIIPK
ENSG00000119383.15
67
|
235
|
|
SEQ ID NO:
SLADVDAILAR
ENSG00000172037.9
67
|
236
|
|
SEQ ID NO: 237
TGGSAQPETPYSGPGLLIDSLVLLPR
ENSG00000172037.9
67
|
|
SEQ ID NO:
CDLCQEVLADIGFVK
ENSG00000169756.12
66
|
238
|
|
SEQ ID NO:
FIAGTGCLVR
ENSG00000184207.8
66
|
239
|
|
SEQ ID NO:
HHAAYVNNLNVTEEKYQEALAK
ENSG00000112096.12
66
|
240
|
|
SEQ ID NO:
QGIVHLDLKPENIMCVNK
ENSG00000065534.14
66
|
241
|
|
SEQ ID NO:
TLGDQLSLLLGAR
ENSG00000011028.9
66
|
242
|
|
SEQ ID NO:
CTHWAEGGK
ENSG00000100714.11
65
|
243
|
|
SEQ ID NO:
FGLYLPLFKPSVSTSK
ENSG00000004864.9
65
|
244
|
|
SEQ ID NO:
GSCYPATGDLLVGR
ENSG00000172037.9
65
|
245
|
|
SEQ ID NO:
VMPLIIQGFK
ENSG00000086475.10
65
|
246
|
|
SEQ ID NO:
TPLWIGLAGEEGSR
ENSG00000011028.9
64
|
247
|
|
SEQ ID NO:
TQPDGTSVPGEPASPISQR
ENSG00000137497.13
64
|
248
|
|
SEQ ID NO:
VWGVPIPVFHHK
ENSG00000067704.8
64
|
249
|
|
SEQ ID NO:
ALLNVVDNAR
ENSG00000105223.14
63
|
250
|
|
SEQ ID NO:
GGTTNPHIFPEGSEPK
ENSG00000167770.7
63
|
251
|
|
SEQ ID NO:
YTVNFLEAK
ENSG00000142453.7
63
|
252
|
|
SEQ ID NO:
ATIQGVLR
ENSG00000196961.8
62
|
253
|
|
SEQ ID NO:
GPLGDQYQTVK
ENSG00000172037.9
62
|
254
|
|
SEQ ID NO:
VAAQVDGGAQVQQVLNIECLR
ENSG00000196961.8
62
|
255
|
|
SEQ ID NO:
FTPVVCGLR
ENSG00000090006.13
61
|
256
|
|
SEQ ID NO: 257
LFPNSLDQTDMHGDSEYNIMFGPDICGPGTK
ENSG00000179218.9
61
|
|
SEQ ID NO:
TILLSTTDPADFAVAEALEK
ENSG00000130396.16
61
|
258
|
|
SEQ ID NO:
LTYLGCASVNAPR
ENSG00000011454.12
60
|
259
|
|
SEQ ID NO:
SCYLSSLDLLLEHR
ENSG00000133315.6
60
|
260
|
|
SEQ ID NO:
VVATTQMQAADAR
ENSG00000166825.9
60
|
261
|
|
SEQ ID NO: 262
GVGGSQPPDIDKTELVEPTEYLVVHLK
ENSG00000166825.9
59
|
|
SEQ ID NO:
KEIHTVPDMGK
ENSG00000119383.15
59
|
263
|
|
SEQ ID NO:
LFTALFPFEK
ENSG00000169896.12
59
|
264
|
|
SEQ ID NO:
SLESALK
ENSG00000130429.8
59
|
265
|
|
SEQ ID NO:
VDDQIAIVFK
ENSG00000119383.15
59
|
266
|
|
SEQ ID NO:
VLDPAIPIPDPYSSR
ENSG00000172037.9
59
|
267
|
|
SEQ ID NO:
ATPFIECNGGR
ENSG00000134871.13
58
|
268
|
|
SEQ ID NO: 269
CSVCEAPAIAIAVHSQDVSIPHCPAGWR
ENSG00000134871.13
58
|
|
SEQ ID NO:
EAQVAHADQQLR
ENSG00000137497.13
58
|
270
|
|
SEQ ID NO:
EIILDDDECPLQIFR
ENSG00000130396.16
58
|
271
|
|
SEQ ID NO:
TPAAIPATPVAVSQPIR
ENSG00000130396.16
58
|
272
|
|
SEQ ID NO:
DLGFFGIYK
ENSG00000004864.9
57
|
273
|
|
SEQ ID NO:
EERPAPTPWGSK
ENSG00000130429.8
57
|
274
|
|
SEQ ID NO:
YVGFGNTPPPQK
ENSG00000101199.8
57
|
275
|
|
SEQ ID NO:
CLFQSPLFAK
ENSG00000142453.7
56
|
276
|
|
SEQ ID NO:
SETDTSLIR
ENSG00000146731.6
56
|
277
|
|
SEQ ID NO:
ILETWGELLSK
ENSG00000011454.12
54
|
278
|
|
SEQ ID NO:
YSGLCPHVVVLVATVR
ENSG00000100714.11
54
|
279
|
|
SEQ ID NO:
ENSLLFDPLSSSSSNK
ENSG00000166825.9
53
|
280
|
|
SEQ ID NO:
IKNEAEPEFASR
ENSG00000198947.10
53
|
281
|
|
SEQ ID NO:
VSAPDGPCPTGFER
ENSG00000090006.13
53
|
282
|
|
SEQ ID NO:
AQGIAQGAIR
ENSG00000172037.9
52
|
283
|
|
SEQ ID NO:
KVCGDSDKGFVVINQK
ENSG00000146731.6
52
|
284
|
|
SEQ ID NO:
LWSGYSLLYFEGQEK
ENSG00000134871.13
52
|
285
|
|
SEQ ID NO:
VPIWDQDIQFLPGSQK
ENSG00000133316.11
52
|
286
|
|
SEQ ID NO:
YLSYTLNPDLIR
ENSG00000166825.9
52
|
287
|
|
SEQ ID NO:
YVIGVGDAFR
ENSG00000169896.12
52
|
288
|
|
SEQ ID NO:
DLEVVEGSAAR
ENSG00000065534.14
51
|
289
|
|
SEQ ID NO:
FAVGSGSR
ENSG00000130429.8
50
|
290
|
|
SEQ ID NO:
GFGQSVVQLQGSR
ENSG00000169896.12
50
|
291
|
|
SEQ ID NO:
GLPGEVLGAQPGPR
ENSG00000134871.13
50
|
292
|
|
SEQ ID NO:
LAETLGR
ENSG00000169756.12
50
|
293
|
|
SEQ ID NO:
LPPKVESLESLYFTPIPAR
ENSG00000137497.13
50
|
294
|
|
SEQ ID NO:
PTDSKPEDWDKPEHIPDPDAK
ENSG00000179218.9
50
|
295
|
|
SEQ ID NO:
QLSLPQQEAQK
ENSG00000196961.8
50
|
296
|
|
SEQ ID NO:
DVTTFFSGK
ENSG00000101199.8
49
|
297
|
|
SEQ ID NO:
GQVEQANQELQELIQSVK
ENSG00000172037.9
49
|
298
|
|
SEQ ID NO:
IDDVLHTLTGAMSLLR
ENSG00000130396.16
49
|
299
|
|
SEQ ID NO:
LQLPNCIEDPVSPIVLR
ENSG00000169896.12
49
|
300
|
|
SEQ ID NO:
VESLESLYFTPIPAR
ENSG00000137497.13
49
|
301
|
|
SEQ ID NO:
FGDPLGYEDVIPEADREGVIR
ENSG00000169896.12
48
|
302
|
|
SEQ ID NO:
LEPNAQAQMYR
ENSG00000196961.8
48
|
303
|
|
SEQ ID NO:
DSLEDCVTIWGPEGR
ENSG00000011028.9
47
|
304
|
|
SEQ ID NO:
EAVTEILGIEPDR
ENSG00000211460.7
47
|
305
|
|
SEQ ID NO:
FQNLDKK
ENSG00000130429.8
47
|
306
|
|
SEQ ID NO:
GGECASPLPGLR
ENSG00000090006.13
47
|
307
|
|
SEQ ID NO:
IAVSKPSGPQPQADLQALLQSGAQVR
ENSG00000105223.14
47
|
308
|
|
SEQ ID NO:
VLELSIPASAEQIQHLAGAIAER
ENSG00000172037.9
47
|
309
|
|
SEQ ID NO:
AAPVPTAPAAGAPLMDFGNDFVPPAPR
ENSG00000115310.13
46
|
310
|
|
SEQ ID NO:
GGYTCVCPDGFLLDSSR
ENSG00000090006.13
46
|
311
|
|
SEQ ID NO:
VLLTRPGEGGTGLPGPPLITR
ENSG00000152894.10
46
|
312
|
|
SEQ ID NO:
ELQPQQQPR
ENSG00000130396.16
45
|
313
|
|
SEQ ID NO:
FCQLHSSGARPPAPAVPGLTR
ENSG00000090006.13
45
|
314
|
|
SEQ ID NO:
LAAGDQLLSVDGR
ENSG00000130396.16
45
|
315
|
|
SEQ ID NO:
SLTLDTWEPELLK
ENSG00000114331.8
45
|
316
|
|
SEQ ID NO:
EQVPGFTPR
ENSG00000100714.11
44
|
317
|
|
SEQ ID NO:
ETGVPIAGR
ENSG00000100714.11
44
|
318
|
|
SEQ ID NO:
KITIGQAPTEK
ENSG00000100714.11
44
|
319
|
|
SEQ ID NO:
FSTMPFLYCNPGDVCYYASR
ENSG00000134871.13
43
|
320
|
|
SEQ ID NO:
LLTIGDANGEIQR
ENSG00000142453.7
43
|
321
|
|
SEQ ID NO:
LQSQVISELDACK
ENSG00000132205.6
43
|
322
|
|
SEQ ID NO:
LTILAAR
ENSG00000065534.14
43
|
323
|
|
SEQ ID NO:
LVECLETVLNK
ENSG00000196961.8
43
|
324
|
|
SEQ ID NO:
SSPQFGVTLLTYELLQR
ENSG00000004864.9
43
|
325
|
|
SEQ ID NO:
YQCHEEGLVPSK
ENSG00000172037.9
43
|
326
|
|
SEQ ID NO:
GCQLCPPFGSEGFR
ENSG00000090006.13
42
|
327
|
|
SEQ ID NO:
KPGLEEAVESACAMR
ENSG00000067704.8
42
|
328
|
|
SEQ ID NO:
LVQCVDAFEEK
ENSG00000065534.14
42
|
329
|
|
SEQ ID NO:
QWFINITDIK
ENSG00000067704.8
42
|
330
|
|
SEQ ID NO:
SQLEAIFLR
ENSG00000105223.14
42
|
331
|
|
SEQ ID NO:
VLEGSELELAK
ENSG00000137497.13
42
|
332
|
|
SEQ ID NO:
VVQDLAAR
ENSG00000172037.9
42
|
333
|
|
SEQ ID NO:
AIMEFNPR
ENSG00000169896.12
41
|
334
|
|
SEQ ID NO:
ALAEGGSILSR
ENSG00000172037.9
41
|
335
|
|
SEQ ID NO:
EICPAGPGYHYSASDLR
ENSG00000090006.13
41
|
336
|
|
SEQ ID NO:
EQVVEDRPVGGR
ENSG00000135052.12
41
|
337
|
|
SEQ ID NO:
LYCNPGDVCYYASR
ENSG00000134871.13
41
|
338
|
|
SEQ ID NO:
TQDASGPELILPASIEFR
ENSG00000130396.16
41
|
339
|
|
SEQ ID NO:
YSEIEPSTEGEVIYR
ENSG00000172037.9
41
|
340
|
|
SEQ ID NO:
AWCVNCFACSTCNTK
ENSG00000169756.12
40
|
341
|
|
SEQ ID NO:
DDPTDSKPEDWDKPEHIPDPDAK
ENSG00000179218.9
40
|
342
|
|
SEQ ID NO:
IVQATTLLTMDK
ENSG00000130396.16
40
|
343
|
|
SEQ ID NO:
VDLSTSTDWK
ENSG00000133315.6
40
|
344
|
|
SEQ ID NO:
AQLLQQTR
ENSG00000213380.9
39
|
345
|
|
SEQ ID NO:
DVDECQLFR
ENSG00000090006.13
39
|
346
|
|
SEQ ID NO:
IEGYPDPEVVWFKDDQSIR
ENSG00000065534.14
39
|
347
|
|
SEQ ID NO:
LSSMAMISGLSGR
ENSG00000065534.14
39
|
348
|
|
SEQ ID NO:
NNGVLFENQLLQIGVK
ENSG00000196961.8
39
|
349
|
|
SEQ ID NO:
RADPAELR
ENSG00000004864.9
39
|
350
|
|
SEQ ID NO:
SAPASQASLR
ENSG00000137497.13
39
|
351
|
|
SEQ ID NO:
DWEQFEYK
ENSG00000137497.13
38
|
352
|
|
SEQ ID NO:
IQAELAVILK
ENSG00000137497.13
38
|
353
|
|
SEQ ID NO:
SNRDELELELAENRK
ENSG00000137497.13
38
|
354
|
|
SEQ ID NO:
TPVPEKVPPPKPATPDFR
ENSG00000065534.14
38
|
355
|
|
SEQ ID NO:
VSLEPHQGPGTPESK
ENSG00000137497.13
38
|
356
|
|
SEQ ID NO:
CTEPEDQLYYVK
ENSG00000106066.9
37
|
357
|
|
SEQ ID NO:
ECYFDTAAPDACDNILAR
ENSG00000090006.13
37
|
358
|
|
SEQ ID NO:
FGLGSVAGAVGATAVYPIDLVK
ENSG00000004864.9
37
|
359
|
|
SEQ ID NO:
GQEDAILSYEPVTR
ENSG00000082458.7
37
|
360
|
|
SEQ ID NO:
IMELEGR
ENSG00000135052.12
37
|
361
|
|
SEQ ID NO:
TCVSLAVSR
ENSG00000196961.8
37
|
362
|
|
SEQ ID NO:
TILTLTGVSTLGDVK
ENSG00000184207.8
37
|
363
|
|
SEQ ID NO:
VLQIVTNRDDVQGYAAK
ENSG00000196961.8
37
|
364
|
|
SEQ ID NO:
AFGFSHLEALLDDSKELQR
ENSG00000167770.7
36
|
365
|
|
SEQ ID NO:
AGPDSAGIALYSHEDVCVFK
ENSG00000142453.7
36
|
366
|
|
SEQ ID NO:
AQGVLAAQAR
ENSG00000172037.9
36
|
367
|
|
SEQ ID NO:
LPSFQQSCR
ENSG00000213380.9
36
|
368
|
|
SEQ ID NO:
MLSSFLSEDVFK
ENSG00000166825.9
36
|
369
|
|
SEQ ID NO:
DTEQTLYQVQER
ENSG00000172037.9
35
|
370
|
|
SEQ ID NO:
DVEVTKEEFVLAAQK
ENSG00000004864.9
35
|
371
|
|
SEQ ID NO:
INQLSEENGDLSFK
ENSG00000137497.13
35
|
372
|
|
SEQ ID NO:
LNIPATNVFANR
ENSG00000146733.9
35
|
373
|
|
SEQ ID NO:
SLVKPITQLLGR
ENSG00000169896.12
35
|
374
|
|
SEQ ID NO:
YLCEGTESPYQTGQLHPAIR
ENSG00000152894.10
35
|
375
|
|
SEQ ID NO:
ASMQPIQIAEGTGITTR
ENSG00000137497.13
34
|
376
|
|
SEQ ID NO:
IAGALGGLLTPLFLR
ENSG00000064545.10
34
|
377
|
|
SEQ ID NO:
LGASALDSIQEFR
ENSG00000032444.11
34
|
378
|
|
SEQ ID NO:
SGTIFDNFLITNDEAYAEEFGNETWGVTK
ENSG00000179218.9
34
|
379
|
|
SEQ ID NO:
TVLDLQSSLAGVSENLK
ENSG00000132205.6
34
|
380
|
|
SEQ ID NO:
AGPDLASCLDVDECRER
ENSG00000090006.13
33
|
381
|
|
SEQ ID NO:
EGGTAAGAGLDSLHK
ENSG00000130429.8
33
|
382
|
|
SEQ ID NO:
FYEFSQR
ENSG00000153310.14
33
|
383
|
|
SEQ ID NO:
GEWIKPGAIVIDCGINYVPDDK
ENSG00000100714.11
33
|
384
|
|
SEQ ID NO:
NDPYHPDHFNCANCGK
ENSG00000169756.12
33
|
385
|
|
SEQ ID NO:
SLEPHQGPGTPESK
ENSG00000137497.13
33
|
386
|
|
SEQ ID NO:
SLGEENFEVVK
ENSG00000132561.9
33
|
387
|
|
SEQ ID NO:
THIDTVINALK
ENSG00000196961.8
33
|
388
|
|
SEQ ID NO:
VHAELADVLTEAVVDSILAIKK
ENSG00000146731.6
33
|
389
|
|
SEQ ID NO:
VMQHQYQVSNLGQR
ENSG00000169896.12
33
|
390
|
|
SEQ ID NO:
ASFITPVPGGVGPMTVAMLMQSTVESAK
ENSG00000100714.11
32
|
391
|
|
SEQ ID NO:
FEHFIEGGR
ENSG00000167770.7
32
|
392
|
|
SEQ ID NO:
LQQAQLYPIAIFIKPK
ENSG00000082458.7
32
|
393
|
|
SEQ ID NO:
MTLADIER
ENSG00000004864.9
32
|
394
|
|
SEQ ID NO:
TVELLSGVVDQTK
ENSG00000004864.9
32
|
395
|
|
SEQ ID NO:
AMDYDLLLR
ENSG00000172037.9
31
|
396
|
|
SEQ ID NO:
DFGSFDK
ENSG00000112096.12
31
|
397
|
|
SEQ ID NO:
EPAVYFKEQFLDGDGWTSR
ENSG00000179218.9
31
|
398
|
|
SEQ ID NO:
FLINLEGGDIREESSYK
ENSG00000067704.8
31
|
399
|
|
SEQ ID NO:
GEWIKPGAIVIDCGINYVPDDKKPNGR
ENSG00000100714.11
31
|
400
|
|
SEQ ID NO:
HAVVVGR
ENSG00000100714.11
31
|
401
|
|
SEQ ID NO:
LEGDTFLLLIQSLK
ENSG00000104450.8
31
|
402
|
|
SEQ ID NO:
NTSVVDSEPVR
ENSG00000162614.14
31
|
403
|
|
SEQ ID NO:
PGTTDQVPR
ENSG00000113657.8
31
|
404
|
|
SEQ ID NO:
QLDQHLDLLK
ENSG00000172037.9
31
|
405
|
|
SEQ ID NO:
TVIVHGFTLGEK
ENSG00000067704.8
31
|
406
|
|
SEQ ID NO:
YAPDDIPNINSTCFK
ENSG00000130396.16
31
|
407
|
|
SEQ ID NO:
AADLLYAMCDR
ENSG00000196961.8
30
|
408
|
|
SEQ ID NO:
EMGEAFAADIPR
ENSG00000196961.8
30
|
409
|
|
SEQ ID NO:
IQGTLQPHAR
ENSG00000172037.9
30
|
410
|
|
SEQ ID NO:
LPIAVNGSLIYGVCAGK
ENSG00000059691.7
30
|
411
|
|
SEQ ID NO:
VNDDLISEFPHK
ENSG00000082458.7
30
|
412
|
|
SEQ ID NO:
DGGCSLPILR
ENSG00000090006.13
29
|
413
|
|
SEQ ID NO:
ENVDYIIQELR
ENSG00000136631.8
29
|
414
|
|
SEQ ID NO:
GAAVDEYFR
ENSG00000142453.7
29
|
415
|
|
SEQ ID NO:
GETAVPGAPEALR
ENSG00000184207.8
29
|
416
|
|
SEQ ID NO:
ILYSFATAFR
ENSG00000011454.12
29
|
417
|
|
SEQ ID NO:
NVFECNDQVVK
ENSG00000169896.12
29
|
418
|
|
SEQ ID NO:
STGSFVGELMYK
ENSG00000004864.9
29
|
419
|
|
SEQ ID NO:
TIRDLEVVEGSAAR
ENSG00000065534.14
29
|
420
|
|
SEQ ID NO:
TVFEALQAPACHENMVK
ENSG00000196961.8
29
|
421
|
|
SEQ ID NO:
VGLLQYGSTVK
ENSG00000132561.9
29
|
422
|
|
SEQ ID NO:
YVLSNQYRPDISPTER
ENSG00000130396.16
29
|
423
|
|
SEQ ID NO:
AEAELEYNPEHVSR
ENSG00000067704.8
28
|
424
|
|
SEQ ID NO:
ASPDLVPMGEWTAR
ENSG00000196961.8
28
|
425
|
|
SEQ ID NO:
CEACAPGHFGDPSRPGGR
ENSG00000172037.9
28
|
426
|
|
SEQ ID NO:
EDGYSDASGFGYCFR
ENSG00000090006.13
28
|
427
|
|
SEQ ID NO:
GDLIGVVEALTR
ENSG00000032444.11
28
|
428
|
|
SEQ ID NO:
LAILQVGNRDDSNLYINVK
ENSG00000100714.11
28
|
429
|
|
SEQ ID NO:
NDAGQAECSCQVTVDDAPASENTK
ENSG00000065534.14
28
|
430
|
|
SEQ ID NO:
QNWFEAFEILDK
ENSG00000106066.9
28
|
431
|
|
SEQ ID NO:
SSEGLLATATVPLDLFK
ENSG00000157617.12
28
|
432
|
|
SEQ ID NO:
STTTIGLVQALGAHLYQNVFACVR
ENSG00000100714.11
28
|
433
|
|
SEQ ID NO:
VLVLEMFSGGDAAALER
ENSG00000172037.9
28
|
434
|
|
SEQ ID NO:
KQVAPEKPVK
ENSG00000113387.7
27
|
435
|
|
SEQ ID NO:
LQELEGTYEENER
ENSG00000172037.9
27
|
436
|
|
SEQ ID NO:
LVEQHGSDIWWTLPPEQLLPK
ENSG00000067704.8
27
|
437
|
|
SEQ ID NO:
NPTFMCLALHCIANVGSR
ENSG00000196961.8
27
|
438
|
|
SEQ ID NO:
SSDGRPDSGGTLR
ENSG00000130396.16
27
|
439
|
|
SEQ ID NO:
AAPQPLNLVSSVTLSK
ENSG00000114861.14
26
|
440
|
|
SEQ ID NO:
AVQAQGGESQQEAQR
ENSG00000137497.13
26
|
441
|
|
SEQ ID NO:
DFLNQEGADPDSIEMVATR
ENSG00000172037.9
26
|
442
|
|
SEQ ID NO:
GQVLDVVER
ENSG00000172037.9
26
|
443
|
|
SEQ ID NO:
LALIQPSR
ENSG00000146733.9
26
|
444
|
|
SEQ ID NO:
LQQDVLQFQK
ENSG00000135052.12
26
|
445
|
|
SEQ ID NO:
LTFEELER
ENSG00000162614.14
26
|
446
|
|
SEQ ID NO:
QVTPLFIHFR
ENSG00000166825.9
26
|
447
|
|
SEQ ID NO:
SFNVQDLLPDHEYK
ENSG00000065534.14
26
|
448
|
|
SEQ ID NO:
SSCISQHVISEAK
ENSG00000090006.13
26
|
449
|
|
SEQ ID NO:
VLQIVTNR
ENSG00000196961.8
26
|
450
|
|
SEQ ID NO:
VVGDVAYDEAK
ENSG00000100714.11
26
|
451
|
|
SEQ ID NO:
ALQSGPPQSR
ENSG00000136231.9
25
|
452
|
|
SEQ ID NO:
ITIGQAPTEK
ENSG00000100714.11
25
|
453
|
|
SEQ ID NO:
KAQGVLAAQAR
ENSG00000172037.9
25
|
454
|
|
SEQ ID NO:
LKENLYPYLGPSTLR
ENSG00000136631.8
25
|
455
|
|
SEQ ID NO:
LPVTINK
ENSG00000196961.8
25
|
456
|
|
SEQ ID NO:
SILTAIPNDDPYFHITK
ENSG00000213380.9
25
|
457
|
|
SEQ ID NO:
SLGNVIHPDVVVNGGQDQSK
ENSG00000067704.8
25
|
458
|
|
SEQ ID NO:
AVQTSIATAYR
ENSG00000114331.8
24
|
459
|
|
SEQ ID NO:
DASKPEDWDER
ENSG00000179218.9
24
|
460
|
|
SEQ ID NO:
IPVSGPFLVK
ENSG00000136231.9
24
|
461
|
|
SEQ ID NO:
LLGPAGLTWER
ENSG00000138162.13
24
|
462
|
|
SEQ ID NO:
LPVEAFSAVFTK
ENSG00000032444.11
24
|
463
|
|
SEQ ID NO:
SEESTTVHSSPGATGTALFPTR
ENSG00000205277.5
24
|
464
|
|
SEQ ID NO:
SEESTTVHSSPGATGTALFPTR
ENSG00000205277.5
24
|
465
|
|
SEQ ID NO:
SEESTTVHSSPGATGTALFPTR
ENSG00000205277.5
24
|
466
|
|
SEQ ID NO:
TKVHAELADVLTEAVVDSILAIK
ENSG00000146731.6
24
|
467
|
|
SEQ ID NO:
YGEGHQAWIIGIVEK
ENSG00000086475.10
24
|
468
|
|
SEQ ID NO:
ADLYLEGK
ENSG00000067704.8
23
|
469
|
|
SEQ ID NO:
CLEEKNEILQGK
ENSG00000137497.13
23
|
470
|
|
SEQ ID NO:
FIFDCVSQEYGINPER
ENSG00000184207.8
23
|
471
|
|
SEQ ID NO:
IHGTEEGQQILK
ENSG00000137497.13
23
|
472
|
|
SEQ ID NO:
KIQTQLQR
ENSG00000166825.9
23
|
473
|
|
SEQ ID NO:
KVVGDVAYDEAK
ENSG00000100714.11
23
|
474
|
|
SEQ ID NO:
LDSISGNLQR
ENSG00000132205.6
23
|
475
|
|
SEQ ID NO:
LFEDLEFQQLER
ENSG00000019144.12
23
|
476
|
|
SEQ ID NO:
SLGNVIHPDVVVNGGQDQSKEPPYGADVLR
ENSG00000067704.8
23
|
477
|
|
SEQ ID NO:
TEVNSGFFYK
ENSG00000146731.6
23
|
478
|
|
SEQ ID NO:
TSAGTFPGSQPQAPASPVLPARPPPPPLPR
ENSG00000090006.13
23
|
479
|
|
SEQ ID NO:
VHSPQQVDFR
ENSG00000065534.14
23
|
480
|
|
SEQ ID NO:
VLTGNTIALVLGGGGAR
ENSG00000032444.11
23
|
481
|
|
SEQ ID NO:
VSALSVVR
ENSG00000004864.9
23
|
482
|
|
SEQ ID NO:
ASLENGVLLCDLINK
ENSG00000136153.15
22
|
483
|
|
SEQ ID NO:
ETLIDVAR
ENSG00000146731.6
22
|
484
|
|
SEQ ID NO:
FESKPQSQEVK
ENSG00000065534.14
22
|
485
|
|
SEQ ID NO:
GHLQIAACPNQDPLQGTTGLIPLLGIDVWEHAYYLQYK
ENSG00000112096.12
22
|
486
|
|
SEQ ID NO:
GICEALEDSDGRQDSPAGELPK
ENSG00000132561.9
22
|
487
|
|
SEQ ID NO:
GYLAPSGDLSLR
ENSG00000090006.13
22
|
488
|
|
SEQ ID NO:
LQSQLLSIEKEVEEYK
ENSG00000106976.14
22
|
489
|
|
SEQ ID NO:
SGQGSDRGSGSRPGIEGDTPR
ENSG00000113657.8
22
|
490
|
|
SEQ ID NO:
VAISTFQK
ENSG00000213380.9
22
|
491
|
|
SEQ ID NO:
GQDIFIIQTIPR
ENSG00000161542.12
21
|
492
|
|
SEQ ID NO:
ITLDAQDVLAHLVQMAFK
ENSG00000130396.16
21
|
493
|
|
SEQ ID NO:
RTEVPPLLLILDR
ENSG00000136631.8
21
|
494
|
|
SEQ ID NO:
SSPPVQFSLLHSK
ENSG00000196961.8
21
|
495
|
|
SEQ ID NO:
SSTGSPTSPLNAEK
ENSG00000065534.14
21
|
496
|
|
SEQ ID NO:
TKFPAEQYYR
ENSG00000211460.7
21
|
497
|
|
SEQ ID NO:
ANFWYQPSFHGVDLSALR
ENSG00000142453.7
20
|
498
|
|
SEQ ID NO:
DAQIAMMQQR
ENSG00000137497.13
20
|
499
|
|
SEQ ID NO:
EHGAFDAVK
ENSG00000100714.11
20
|
500
|
|
SEQ ID NO:
GLAQADGTLITCVDSGILR
ENSG00000133316.11
20
|
501
|
|
SEQ ID NO:
GLNCEQCQDFYR
ENSG00000172037.9
20
|
502
|
|
SEQ ID NO:
KVVATTQMQAADAR
ENSG00000166825.9
20
|
503
|
|
SEQ ID NO:
MKLTHSLQEELEK
ENSG00000151914.13
20
|
504
|
|
SEQ ID NO:
NIDVFNVEDQKR
ENSG00000135052.12
20
|
505
|
|
SEQ ID NO:
QASDKDDRPFQGEDVENSR
ENSG00000130396.16
20
|
506
|
|
SEQ ID NO:
SLDQTDMHGDSEYNIMFGPDICGPGTK
ENSG00000179218.9
20
|
507
|
|
SEQ ID NO:
STIFHSSPDASGTTPSSAHSTTSGR
ENSG00000205277.5
20
|
508
|
|
SEQ ID NO:
STIFHSSPDASGTTPSSAHSTTSGR
ENSG00000205277.5
20
|
509
|
|
SEQ ID NO:
STIFHSSPDASGTTPSSAHSTTSGR
ENSG00000205277.5
20
|
510
|
|
SEQ ID NO:
STIFHSSPDASGTTPSSAHSTTSGR
ENSG00000205277.5
20
|
511
|
|
SEQ ID NO:
VCLHVQK
ENSG00000169896.12
20
|
512
|
|
SEQ ID NO:
VSQFLQVLETDLYR
ENSG00000213380.9
20
|
513
|
|
SEQ ID NO:
VSSTATTQDVIETLAEK
ENSG00000130396.16
20
|
514
|
|
SEQ ID NO:
YNTRPLGQEPPR
ENSG00000090006.13
20
|
515
|
|
SEQ ID NO:
ANHPMDAEVTK
ENSG00000196961.8
19
|
516
|
|
SEQ ID NO:
ASELGHSLNENVLKPAQEK
ENSG00000101199.8
19
|
517
|
|
SEQ ID NO:
AWVSHDSTVCLADADKK
ENSG00000130429.8
19
|
518
|
|
SEQ ID NO:
FSYDLSQCINQMK
ENSG00000135052.12
19
|
519
|
|
SEQ ID NO:
IYQFTAASPK
ENSG00000005020.8
19
|
520
|
|
SEQ ID NO:
KQDEPIDLFMIEIMEMK
ENSG00000146731.6
19
|
521
|
|
SEQ ID NO:
NIMAGLQQTNSEK
ENSG00000198947.10
19
|
522
|
|
SEQ ID NO:
RPDYLK
ENSG00000112096.12
19
|
523
|
|
SEQ ID NO:
SEESTTVHSSPVATATTPSPAR
ENSG00000205277.5
19
|
524
|
|
SEQ ID NO:
SEESTTVHSSPVATATTPSPAR
ENSG00000205277.5
19
|
525
|
|
SEQ ID NO:
SEESTTVHSSPVATATTPSPAR
ENSG00000205277.5
19
|
526
|
|
SEQ ID NO:
SEESTTVHSSPVATATTPSPAR
ENSG00000205277.5
19
|
527
|
|
SEQ ID NO:
THLTSLK
ENSG00000211460.7
19
|
528
|
|
SEQ ID NO:
AQEAEQLLR
ENSG00000172037.9
18
|
529
|
|
SEQ ID NO:
AQIINDAFNLASAHK
ENSG00000166825.9
18
|
530
|
|
SEQ ID NO:
DQLGGWFQSSLLTSVAAR
ENSG00000067704.8
18
|
531
|
|
SEQ ID NO:
GADDIELLPEAQHK
ENSG00000100714.11
18
|
532
|
|
SEQ ID NO:
GFSHLEALLDDSK
ENSG00000167770.7
18
|
533
|
|
SEQ ID NO:
GLLTDSPAATVLAEAR
ENSG00000019144.12
18
|
534
|
|
SEQ ID NO:
HSNFLGAYDSIR
ENSG00000172037.9
18
|
535
|
|
SEQ ID NO:
KNEFQGELEK
ENSG00000135052.12
18
|
536
|
|
SEQ ID NO:
SFLEEVLASGLHSR
ENSG00000136631.8
18
|
537
|
|
SEQ ID NO:
TEILGIEPDREK
ENSG00000211460.7
18
|
538
|
|
SEQ ID NO:
VILLDPSIIEAK
ENSG00000104450.8
18
|
539
|
|
SEQ ID NO:
AETVQAALEEAQR
ENSG00000172037.9
17
|
540
|
|
SEQ ID NO:
AFVENYPQFK
ENSG00000136631.8
17
|
541
|
|
SEQ ID NO:
DFISNLLK
ENSG00000065534.14
17
|
542
|
|
SEQ ID NO:
DGFFGLSISDR
ENSG00000172037.9
17
|
543
|
|
SEQ ID NO:
DHVFQVNNFEALK
ENSG00000169896.12
17
|
544
|
|
SEQ ID NO:
DPTDSKPEDWDKPEHIPDPDAK
ENSG00000179218.9
17
|
545
|
|
SEQ ID NO:
KIIELK
ENSG00000146731.6
17
|
546
|
|
SEQ ID NO:
LCCPVALAQDVTGALEDALAK
ENSG00000213380.9
17
|
547
|
|
SEQ ID NO:
PAIAHLIHSLNPVR
ENSG00000106066.9
17
|
548
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
549
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
550
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
551
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
552
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
553
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
554
|
|
SEQ ID NO:
PSSTPTTHFSASSTTLGR
ENSG00000205277.5
17
|
555
|
|
SEQ ID NO:
QFVTGIIDSLTISPK
ENSG00000132561.9
17
|
556
|
|
SEQ ID NO:
SEAVLQSPEFAIFR
ENSG00000198947.10
17
|
557
|
|
SEQ ID NO:
TTQGLTALLLSLK
ENSG00000136631.8
17
|
558
|
|
SEQ ID NO:
VPLSVQLKPEVSPTQDIR
ENSG00000125826.15
17
|
559
|
|
SEQ ID NO:
VTAIDFR
ENSG00000004864.9
17
|
560
|
|
SEQ ID NO:
YLIFPNPVCLEPGISYK
ENSG00000172037.9
17
|
561
|
|
SEQ ID NO:
YRLPNTLKPDSYR
ENSG00000166825.9
17
|
562
|
|
SEQ ID NO:
AFLLSLAALR
ENSG00000105223.14
16
|
563
|
|
SEQ ID NO:
DLAQYSSNDAVVETSLTK
ENSG00000114331.8
16
|
564
|
|
SEQ ID NO:
DRLPQEPGREQVVEDRPVGGR
ENSG00000135052.12
16
|
565
|
|
SEQ ID NO:
EAIQHPADEKLQEK
ENSG00000153310.14
16
|
566
|
|
SEQ ID NO:
EFQNNPNPR
ENSG00000169896.12
16
|
567
|
|
SEQ ID NO:
ELSAALQDKK
ENSG00000137497.13
16
|
568
|
|
SEQ ID NO:
ELSGSGLER
ENSG00000213380.9
16
|
569
|
|
SEQ ID NO:
ELWILNR
ENSG00000166825.9
16
|
570
|
|
SEQ ID NO:
FSTEYELQQLEQFKK
ENSG00000166825.9
16
|
571
|
|
SEQ ID NO:
GPALCGSQR
ENSG00000090006.13
16
|
572
|
|
SEQ ID NO:
GPLEPGPPKPGVPQEPGR
ENSG00000125826.15
16
|
573
|
|
SEQ ID NO:
GSLYQCDYSTGSCEPIR
ENSG00000169896.12
16
|
574
|
|
SEQ ID NO:
IQTQLQR
ENSG00000166825.9
16
|
575
|
|
SEQ ID NO:
KNSSIIGDYKQICSQLSER
ENSG00000011454.12
16
|
576
|
|
SEQ ID NO:
LEINFEELLK
ENSG00000162614.14
16
|
577
|
|
SEQ ID NO:
LIVPEPDVDFDAK
ENSG00000132205.6
16
|
578
|
|
SEQ ID NO:
LVGPEGFVVTEAGFGADIGMEK
ENSG00000100714.11
16
|
579
|
|
SEQ ID NO:
QEHCGCYTLLVENK
ENSG00000065534.14
16
|
580
|
|
SEQ ID NO:
RSQAGVSSGAPPGR
ENSG00000137497.13
16
|
581
|
|
SEQ ID NO:
SPGSTPTTHFPASSTTSGHSEK
ENSG00000205277.5
16
|
582
|
|
SEQ ID NO:
SPGSTPTTHFPASSTTSGHSEK
ENSG00000205277.5
16
|
583
|
|
SEQ ID NO:
SPGSTPTTHFPASSTTSGHSEK
ENSG00000205277.5
16
|
584
|
|
SEQ ID NO:
SPGSTPTTHFPASSTTSGHSEK
ENSG00000205277.5
16
|
585
|
|
SEQ ID NO:
VLSQIDVAQK
ENSG00000198947.10
16
|
586
|
|
SEQ ID NO:
YGGMFCNVEGAFESK
ENSG00000113657.8
16
|
587
|
|
SEQ ID NO:
ATVVVEATEPEPSGSIANPAASTSPSLSHR
ENSG00000131711.10
15
|
588
|
|
SEQ ID NO:
EMTADVIELK
ENSG00000067704.8
15
|
589
|
|
SEQ ID NO:
GEQGFMGNTGPTGAVGDRGPK
ENSG00000134871.13
15
|
590
|
|
SEQ ID NO:
LAEAELEYNPEHVSR
ENSG00000067704.8
15
|
591
|
|
SEQ ID NO:
LESEEDVSQAFLEAVAEEKPHVKPYFSK
ENSG00000065534.14
15
|
592
|
|
SEQ ID NO:
LMCELGNDVINR
ENSG00000114331.8
15
|
593
|
|
SEQ ID NO:
QQQDYWLIDVR
ENSG00000166825.9
15
|
594
|
|
SEQ ID NO:
SSEGGTAAGAGLDSLHK
ENSG00000130429.8
15
|
595
|
|
SEQ ID NO:
SYKPVFWSPSSR
ENSG00000067704.8
15
|
596
|
|
SEQ ID NO:
TAHLDEEVNKGDILVVATGQPEMVK
ENSG00000100714.11
15
|
597
|
|
SEQ ID NO:
TRPDGNCFYR
ENSG00000167770.7
15
|
598
|
|
SEQ ID NO:
TSGQCLCR
ENSG00000172037.9
15
|
599
|
|
SEQ ID NO:
AFCVANK
ENSG00000114331.8
14
|
600
|
|
SEQ ID NO:
AMISGLSGR
ENSG00000065534.14
14
|
601
|
|
SEQ ID NO:
AVESSKPLSNAQPSGPLKPVGN
ENSG00000065534.14
14
|
602
|
|
SEQ ID NO:
AYHSFLVEPISCHAWNKDR
ENSG00000130429.8
14
|
603
|
|
SEQ ID NO:
EGVVDIYNCVK
ENSG00000152894.10
14
|
604
|
|
SEQ ID NO:
GTWIHPEIDNPEYSPDPSIYAYDNFGVLGLDLWQVK
ENSG00000179218.9
14
|
605
|
|
SEQ ID NO:
HLTQAVCTVK
ENSG00000141447.12
14
|
606
|
|
SEQ ID NO:
ITISPLQELTLYNPER
ENSG00000136231.9
14
|
607
|
|
SEQ ID NO:
LACESASSTEVSGALK
ENSG00000169896.12
14
|
608
|
|
SEQ ID NO:
LDCTQCLQHPWLMK
ENSG00000065534.14
14
|
609
|
|
SEQ ID NO:
LDEEAENLVATVVPTHLAAAVPEVAVYLK
ENSG00000119383.15
14
|
610
|
|
SEQ ID NO:
LPEDDEPPARPPPPPPASVSPQAEPVWTPPAPAPAAPPSTPAAPK
ENSG00000115310.13
14
|
611
|
|
SEQ ID NO:
LPNTLKPDSYR
ENSG00000166825.9
14
|
612
|
|
SEQ ID NO:
LSSTQQSLAEK
ENSG00000082805.15
14
|
613
|
|
SEQ ID NO:
LVALETGIQK
ENSG00000019144.12
14
|
614
|
|
SEQ ID NO:
MHGGGPTVTAGLPLPK
ENSG00000100714.11
14
|
615
|
|
SEQ ID NO:
QQALELVVQEVSSVLR
ENSG00000157617.12
14
|
616
|
|
SEQ ID NO:
QSMAFSILNTPK
ENSG00000137497.13
14
|
617
|
|
SEQ ID NO:
SSNLLDLK
ENSG00000142453.7
14
|
618
|
|
SEQ ID NO:
VLQDQLK
ENSG00000135052.12
14
|
619
|
|
SEQ ID NO:
WVSHDSTVCLADADKK
ENSG00000130429.8
14
|
620
|
|
SEQ ID NO:
AAQLDGLEAR
ENSG00000172037.9
13
|
621
|
|
SEQ ID NO:
ANALASATCER
ENSG00000169756.12
13
|
622
|
|
SEQ ID NO:
ATDNEPSQFSEPR
ENSG00000132205.6
13
|
623
|
|
SEQ ID NO:
CGFSELYSWQR
ENSG00000067704.8
13
|
624
|
|
SEQ ID NO:
DLLQAAQDK
ENSG00000172037.9
13
|
625
|
|
SEQ ID NO:
EPAPASPAPAGVEIR
ENSG00000113657.8
13
|
626
|
|
SEQ ID NO:
EYELFEFR
ENSG00000136631.8
13
|
627
|
|
SEQ ID NO:
HKPGIVQETTFDLGGDIHSGTALPTSK
ENSG00000130396.16
13
|
628
|
|
SEQ ID NO:
IWDLQGSEEPVFR
ENSG00000133316.11
13
|
629
|
|
SEQ ID NO:
LFGDVEASLGR
ENSG00000213380.9
13
|
630
|
|
SEQ ID NO:
LHTLGDNLLDPR
ENSG00000172037.9
13
|
631
|
|
SEQ ID NO:
RFSDIQIR
ENSG00000100714.11
13
|
632
|
|
SEQ ID NO:
SEVYGPMK
ENSG00000166825.9
13
|
633
|
|
SEQ ID NO:
SLSESAATR
ENSG00000159788.14
13
|
634
|
|
SEQ ID NO:
VTCVEMEPLAEYVVR
ENSG00000152894.10
13
|
635
|
|
SEQ ID NO:
YLFEEDNLLR
ENSG00000132561.9
13
|
636
|
|
SEQ ID NO:
AAECLDVDECHR
ENSG00000090006.13
12
|
637
|
|
SEQ ID NO:
AGMSSLKG
ENSG00000146731.6
12
|
638
|
|
SEQ ID NO:
ALASATCER
ENSG00000169756.12
12
|
639
|
|
SEQ ID NO:
CDSHDDPALGLVSGQCR
ENSG00000172037.9
12
|
640
|
|
SEQ ID NO:
DCSIALPYVCK
ENSG00000011028.9
12
|
641
|
|
SEQ ID NO:
DISLQGPGLAPEHCYIENLR
ENSG00000019144.12
12
|
642
|
|
SEQ ID NO:
FVLDHEDGLNLNEDLENFLQK
ENSG00000137497.13
12
|
643
|
|
SEQ ID NO:
GANQHATDEEGKDPLSIAVEAANADIVTLLR
ENSG00000114331.8
12
|
644
|
|
SEQ ID NO:
GFSHLEALLDDSKELQR
ENSG00000167770.7
12
|
645
|
|
SEQ ID NO:
GSGVSNFAQLIVR
ENSG00000152894.10
12
|
646
|
|
SEQ ID NO:
IINDAFNLASAHK
ENSG00000166825.9
12
|
647
|
|
SEQ ID NO:
KVVQSLEQTAR
ENSG00000211460.7
12
|
648
|
|
SEQ ID NO:
QPAVEEPAEVTATVLASR
ENSG00000076662.5
12
|
649
|
|
SEQ ID NO:
QTQVLGLTQTCETLK
ENSG00000169896.12
12
|
650
|
|
SEQ ID NO:
RVEDAYILTCNVSLEYEK
ENSG00000146731.6
12
|
651
|
|
SEQ ID NO:
TLDFDALSVGQR
ENSG00000113657.8
12
|
652
|
|
SEQ ID NO:
VVNAMGK
ENSG00000169756.12
12
|
653
|
|
SEQ ID NO:
AKIDDPTDSKPEDWDKPEHIPD
ENSG00000179218.9
11
|
654
|
|
SEQ ID NO:
ALEQLLTELDDFLK
ENSG00000169129.10
11
|
655
|
|
SEQ ID NO:
ASKPEDWDER
ENSG00000179218.9
11
|
656
|
|
SEQ ID NO:
DLNQLFQQDSSSR
ENSG00000082805.15
11
|
657
|
|
SEQ ID NO:
ETPGRPPDPTGAPLPGPTGDPVKPTSLETPSAPLLSR
ENSG00000032444.11
11
|
658
|
|
SEQ ID NO:
GSACEEDVDECAQEPPPCGPGR
ENSG00000090006.13
11
|
659
|
|
SEQ ID NO:
KASSEGGTAAGAGLDSLHK
ENSG00000130429.8
11
|
660
|
|
SEQ ID NO:
LGFITNNSSK
ENSG00000184207.8
11
|
661
|
|
SEQ ID NO:
LPSHSDFLAELR
ENSG00000169896.12
11
|
662
|
|
SEQ ID NO:
LQDVHVAEGK
ENSG00000065534.14
11
|
663
|
|
SEQ ID NO:
LVTCTGYHQVR
ENSG00000133316.11
11
|
664
|
|
SEQ ID NO:
SIQLPTTVR
ENSG00000166825.9
11
|
665
|
|
SEQ ID NO:
VLSELGR
ENSG00000067704.8
11
|
666
|
|
SEQ ID NO:
WAPNENKFAVGSGSR
ENSG00000130429.8
11
|
667
|
|
SEQ ID NO:
AQELQQTGVLGAFESSFWHMQEK
ENSG00000172037.9
10
|
668
|
|
SEQ ID NO:
ASAAAAAGGGATGHPGGGQGAENPAGLK
ENSG00000104450.8
10
|
669
|
|
SEQ ID NO:
EAENFHEEDDVDVRPAR
ENSG00000162614.14
10
|
670
|
|
SEQ ID NO:
ERLPSHSDFLAELR
ENSG00000169896.12
10
|
671
|
|
SEQ ID NO:
EWSLESSPAQNWTPPQPR
ENSG00000101199.8
10
|
672
|
|
SEQ ID NO:
FYALSASFEPFSNKG
ENSG00000179218.9
10
|
673
|
|
SEQ ID NO:
GISLNPEQWSQLKEQISDIDDAVR
ENSG00000113387.7
10
|
674
|
|
SEQ ID NO:
HPLLVGHMPVMVAK
ENSG00000104728.11
10
|
675
|
|
SEQ ID NO:
IAHGNSSIIADR
ENSG00000100714.11
10
|
676
|
|
SEQ ID NO:
IYADSLKPNIPYK
ENSG00000130396.16
10
|
677
|
|
SEQ ID NO:
LAILDSQAGQIR
ENSG00000019144.12
10
|
678
|
|
SEQ ID NO:
NMVVDDDSPEMYK
ENSG00000162614.14
10
|
679
|
|
SEQ ID NO:
NRLDCTQCLQHPWLMK
ENSG00000065534.14
10
|
680
|
|
SEQ ID NO:
PVLLQVAESAYR
ENSG00000004864.9
10
|
681
|
|
SEQ ID NO:
QEPLGSDSEGVNCLAYDEAIMAQQDR
ENSG00000167770.7
10
|
682
|
|
SEQ ID NO:
QEVEELWIGLNDLK
ENSG00000011028.9
10
|
683
|
|
SEQ ID NO:
SFVIHNLPVLAK
ENSG00000086475.10
10
|
684
|
|
SEQ ID NO:
STTFHSSPR
ENSG00000205277.5
10
|
685
|
|
SEQ ID NO:
STTFHSSPR
ENSG00000205277.5
10
|
686
|
|
SEQ ID NO:
STTFHSSPR
ENSG00000205277.5
10
|
687
|
|
SEQ ID NO:
TAAGLMHTFNAHAATDITGFGILGHAQNLAK
ENSG00000086475.10
10
|
688
|
|
SEQ ID NO:
TGAFGLR
ENSG00000172037.9
10
|
689
|
|
SEQ ID NO:
TSLTVVLLR
ENSG00000076662.5
10
|
690
|
|
SEQ ID NO:
VPPLLIYGPFGTGK
ENSG00000130589.12
10
|
691
|
|
SEQ ID NO:
VPSFAAGR
ENSG00000136231.9
10
|
692
|
|
SEQ ID NO:
VPVGDQPPDIEFQIR
ENSG00000106976.14
10
|
693
|
|
SEQ ID NO:
VYDPASPQR
ENSG00000133316.11
10
|
694
|
|
SEQ ID NO:
WFYIDFGGVKPMGSEPVPK
ENSG00000004864.9
10
|
695
|
|
SEQ ID NO:
WTPPAPAPAAPPSTPAAPK
ENSG00000115310.13
10
|
696
|
|
SEQ ID NO:
YDNQWFHGCTSTGR
ENSG00000011028.9
10
|
697
|
|
SEQ ID NO:
YFSYDCGADFPGVPLAPPR
ENSG00000172037.9
10
|
698
|
|
SEQ ID NO:
YGDEEKDKGLQTSQDAR
ENSG00000179218.9
10
|
699
|
|
SEQ ID NO:
YLETADYAIR
ENSG00000196961.8
10
|
700
|
|
SEQ ID NO:
AKQPDLAPGLTTIGASPTQTVTLVTQPVVTK
ENSG00000198947.10
9
|
701
|
|
SEQ ID NO:
ASPLLPANHVTMAK
ENSG00000067704.8
9
|
702
|
|
SEQ ID NO:
AVLELLQRPGNAR
ENSG00000105963.9
9
|
703
|
|
SEQ ID NO:
CFQVQGQEPQSR
ENSG00000011028.9
9
|
704
|
|
SEQ ID NO:
DKGLQTSQDAR
ENSG00000179218.9
9
|
705
|
|
SEQ ID NO:
DLTALSNMLPK
ENSG00000166825.9
9
|
706
|
|
SEQ ID NO:
DPFSLDALSK
ENSG00000146731.6
9
|
707
|
|
SEQ ID NO:
FGDPLGYEDVIPEADR
ENSG00000169896.12
9
|
708
|
|
SEQ ID NO:
FGLYLPLFK
ENSG00000004864.9
9
|
709
|
|
SEQ ID NO:
FSTEYELQQLEQFK
ENSG00000166825.9
9
|
710
|
|
SEQ ID NO:
GAVYLFHGTSGSGISPSHSQR
ENSG00000169896.12
9
|
711
|
|
SEQ ID NO:
HLCELLAQQF
ENSG00000196961.8
9
|
712
|
|
SEQ ID NO:
ILDQENLSSTALVK
ENSG00000169129.10
9
|
713
|
|
SEQ ID NO:
ISETTMLQSGMK
ENSG00000130396.16
9
|
714
|
|
SEQ ID NO:
ISYHGSCPQGLADSAWIPFR
ENSG00000011028.9
9
|
715
|
|
SEQ ID NO:
KQNWFEAFEILDK
ENSG00000106066.9
9
|
716
|
|
SEQ ID NO:
PISLVFLVPVR
ENSG00000169896.12
9
|
717
|
|
SEQ ID NO:
SKESSQVTSR
ENSG00000136631.8
9
|
718
|
|
SEQ ID NO:
SPPPCTYGR
ENSG00000090006.13
9
|
719
|
|
SEQ ID NO:
SQLNCLLLSGR
ENSG00000133316.11
9
|
720
|
|
SEQ ID NO:
TPLSAAAHTHPVYCVNVVGTQNAHNLITVSTDGK
ENSG00000158560.10
9
|
721
|
|
SEQ ID NO:
VNYDEENWR
ENSG00000166825.9
9
|
722
|
|
SEQ ID NO:
VSFVIHNLPVLAK
ENSG00000086475.10
9
|
723
|
|
SEQ ID NO:
VTLRPYLTPNDR
ENSG00000166825.9
9
|
724
|
|
SEQ ID NO:
WNVINWENVTER
ENSG00000112096.12
9
|
725
|
|
SEQ ID NO:
ADTDGGLIFR
ENSG00000163975.7
8
|
726
|
|
SEQ ID NO:
AGYTGLR
ENSG00000172037.9
8
|
727
|
|
SEQ ID NO:
AVESSKPLSNAQPSGPLKPVGNAK
ENSG00000065534.14
8
|
728
|
|
SEQ ID NO:
CSEGFVLAEDGRR
ENSG00000132561.9
8
|
729
|
|
SEQ ID NO:
DLMVLNDVYR
ENSG00000166825.9
8
|
730
|
|
SEQ ID NO:
FPAEQYYR
ENSG00000211460.7
8
|
731
|
|
SEQ ID NO:
FTGHCSCRPGVSGVR
ENSG00000172037.9
8
|
732
|
|
SEQ ID NO:
GDPGDTGAPGPVGMK
ENSG00000134871.13
8
|
733
|
|
SEQ ID NO:
GGPSLSSVLNELPSAATLR
ENSG00000167608.7
8
|
734
|
|
SEQ ID NO:
IKDPDASKPEDWDERAK
ENSG00000179218.9
8
|
735
|
|
SEQ ID NO:
ILCIGAVPGLQPR
ENSG00000110237.3
8
|
736
|
|
SEQ ID NO:
IQSDLTSHEISLEEMKK
ENSG00000198947.10
8
|
737
|
|
SEQ ID NO:
ITGHFYACQVAQR
ENSG00000136231.9
8
|
738
|
|
SEQ ID NO:
KVVGDVAYDEAKER
ENSG00000100714.11
8
|
739
|
|
SEQ ID NO:
LDTDILLGATCGLK
ENSG00000184207.8
8
|
740
|
|
SEQ ID NO:
LVSAVVEYGGK
ENSG00000136631.8
8
|
741
|
|
SEQ ID NO:
MLGVAAGMTHSNMANALASATCER
ENSG00000169756.12
8
|
742
|
|
SEQ ID NO:
NIPNGLQEFLDPLCQR
ENSG00000130396.16
8
|
743
|
|
SEQ ID NO:
QADIIGKPSR
ENSG00000184207.8
8
|
744
|
|
SEQ ID NO:
QEISIMNCLHHPK
ENSG00000065534.14
8
|
745
|
|
SEQ ID NO:
QIVSEMLR
ENSG00000196961.8
8
|
746
|
|
SEQ ID NO:
RAEQLLQDAR
ENSG00000172037.9
8
|
747
|
|
SEQ ID NO:
RFENAPDSAK
ENSG00000082805.15
8
|
748
|
|
SEQ ID NO:
SGAPWFK
ENSG00000162614.14
8
|
749
|
|
SEQ ID NO:
SIVEHVASK
ENSG00000146733.9
8
|
750
|
|
SEQ ID NO:
SLVGLSQER
ENSG00000130396.16
8
|
751
|
|
SEQ ID NO:
TVNELQNLSSAEVVVPR
ENSG00000136231.9
8
|
752
|
|
SEQ ID NO:
VIAVVNK
ENSG00000130396.16
8
|
753
|
|
SEQ ID NO:
VSHSELR
ENSG00000146733.9
8
|
754
|
|
SEQ ID NO:
WSDGVGFSYHNFDR
ENSG00000011028.9
8
|
755
|
|
SEQ ID NO:
YGADDIELLPEAQHK
ENSG00000100714.11
8
|
756
|
|
SEQ ID NO:
AKPEASFQVWNK
ENSG00000073849.10
7
|
757
|
|
SEQ ID NO:
ALQLSNSPGASSAFLK
ENSG00000170776.15
7
|
758
|
|
SEQ ID NO:
ASSEGGTAAGAGLDSLHKNSVSQISVLSGGK
ENSG00000130429.8
7
|
759
|
|
SEQ ID NO:
AVEMAAQR
ENSG00000184207.8
7
|
760
|
|
SEQ ID NO:
AVLELLQR
ENSG00000105963.9
7
|
761
|
|
SEQ ID NO:
AYAQQLADWAR
ENSG00000165912.11
7
|
762
|
|
SEQ ID NO:
DHSAIPVINR
ENSG00000166825.9
7
|
763
|
|
SEQ ID NO:
DLRDPAVCR
ENSG00000172037.9
7
|
764
|
|
SEQ ID NO:
FGSCVPHTTRPR
ENSG00000082458.7
7
|
765
|
|
SEQ ID NO:
GPQYGTLEK
ENSG00000165912.11
7
|
766
|
|
SEQ ID NO:
HWDDVVCESR
ENSG00000172037.9
7
|
767
|
|
SEQ ID NO:
IVLYQTDASLTPWTVR
ENSG00000032444.11
7
|
768
|
|
SEQ ID NO:
KVHSPQQVDFR
ENSG00000065534.14
7
|
769
|
|
SEQ ID NO:
LCTDHGSQLVTITNR
ENSG00000011028.9
7
|
770
|
|
SEQ ID NO:
LDFLPDMMVEGR
ENSG00000048740.13
7
|
771
|
|
SEQ ID NO:
LEAVAEEKPHVKPYFSK
ENSG00000065534.14
7
|
772
|
|
SEQ ID NO:
LEVDAIVNAANSSLLGGGGVDGCIHR
ENSG00000133315.6
7
|
773
|
|
SEQ ID NO:
LLHEMQIQHPTASLIAK
ENSG00000146731.6
7
|
774
|
|
SEQ ID NO:
LLVEELPLR
ENSG00000198947.10
7
|
775
|
|
SEQ ID NO:
LMNSQLVTTEK
ENSG00000073849.10
7
|
776
|
|
SEQ ID NO:
LSNPPSAGPIVVHCSAGAGR
ENSG00000152894.10
7
|
777
|
|
SEQ ID NO:
LSPSSTETTTLPGSPTTPSLSEK
ENSG00000205277.5
7
|
778
|
|
SEQ ID NO:
LSPSSTETTTLPGSPTTPSLSEK
ENSG00000205277.5
7
|
779
|
|
SEQ ID NO:
LSPSSTETTTLPGSPTTPSLSEK
ENSG00000205277.5
7
|
780
|
|
SEQ ID NO:
LSPSSTETTTLPGSPTTPSLSEK
ENSG00000205277.5
7
|
781
|
|
SEQ ID NO:
MYLFYGNK
ENSG00000196961.8
7
|
782
|
|
SEQ ID NO:
PPLLLILDR
ENSG00000136631.8
7
|
783
|
|
SEQ ID NO:
PSLSLGTITDEEMK
ENSG00000137497.13
7
|
784
|
|
SEQ ID NO:
QCHECIEHIR
ENSG00000106066.9
7
|
785
|
|
SEQ ID NO:
QQNQELQEQLR
ENSG00000137497.13
7
|
786
|
|
SEQ ID NO:
SFAPILPHLAEEVFQHIPY
ENSG00000067704.8
7
|
787
|
|
SEQ ID NO:
SGLCPHVVVLVATVR
ENSG00000100714.11
7
|
788
|
|
SEQ ID NO:
SITILSTPEGTSAACK
ENSG00000136231.9
7
|
789
|
|
SEQ ID NO:
SLEGSDDAVLLQR
ENSG00000198947.10
7
|
790
|
|
SEQ ID NO:
SMDAETYVEGQR
ENSG00000130396.16
7
|
791
|
|
SEQ ID NO:
STTSGLVGESTPSR
ENSG00000205277.5
7
|
792
|
|
SEQ ID NO:
STTSGLVGESTPSR
ENSG00000205277.5
7
|
793
|
|
SEQ ID NO:
STTSGLVGESTPSR
ENSG00000205277.5
7
|
794
|
|
SEQ ID NO:
STTSGLVGESTPSR
ENSG00000205277.5
7
|
795
|
|
SEQ ID NO:
TQGSSTSWFGSNQSKPEFTVDLK
ENSG00000165322.13
7
|
796
|
|
SEQ ID NO:
VIMIVTDGRPQDSVAEVAAK
ENSG00000132561.9
7
|
797
|
|
SEQ ID NO:
VPPPKPATPDFR
ENSG00000065534.14
7
|
798
|
|
SEQ ID NO:
WGFCPIK
ENSG00000011028.9
7
|
799
|
|
SEQ ID NO:
YAVQVAEGMGYLESKR
ENSG00000061938.12
7
|
800
|
|
SEQ ID NO:
AAEEIGIKATHIKLPR
ENSG00000100714.11
6
|
801
|
|
SEQ ID NO:
AGDAVNVVVTGGK
ENSG00000132205.6
6
|
802
|
|
SEQ ID NO:
AGDTLSGTCLLIANK
ENSG00000142453.7
6
|
803
|
|
SEQ ID NO:
AGDTLSGTCLLIANKR
ENSG00000142453.7
6
|
804
|
|
SEQ ID NO:
AIDYEIQR
ENSG00000059691.7
6
|
805
|
|
SEQ ID NO:
ALEQALEK
ENSG00000166825.9
6
|
806
|
|
SEQ ID NO:
ALSSAGER
ENSG00000172037.9
6
|
807
|
|
SEQ ID NO:
CFLCDSR
ENSG00000172037.9
6
|
808
|
|
SEQ ID NO:
DAEEWVQQLK
ENSG00000005020.8
6
|
809
|
|
SEQ ID NO:
DDEFTHLYTLIVRPDNTYEVK
ENSG00000179218.9
6
|
810
|
|
SEQ ID NO:
DFGSFDKFKEK
ENSG00000112096.12
6
|
811
|
|
SEQ ID NO:
DGDVQAGANLSFNR
ENSG00000158560.10
6
|
812
|
|
SEQ ID NO:
EFASHLQQLQDALNELTEEHSK
ENSG00000137497.13
6
|
813
|
|
SEQ ID NO:
ETLPELPSVTR
ENSG00000059691.7
6
|
814
|
|
SEQ ID NO:
GAPMHDLLLWNNATVTTCHSK
ENSG00000100714.11
6
|
815
|
|
SEQ ID NO:
HKSDFGK
ENSG00000179218.9
6
|
816
|
|
SEQ ID NO:
IALETSLSK
ENSG00000076662.5
6
|
817
|
|
SEQ ID NO:
IGDFGLMR
ENSG00000061938.12
6
|
818
|
|
SEQ ID NO:
ILREEGPK
ENSG00000004864.9
6
|
819
|
|
SEQ ID NO:
KSEAPFTHK
ENSG00000162614.14
6
|
820
|
|
SEQ ID NO:
LCGDLVSCFQER
ENSG00000165912.11
6
|
821
|
|
SEQ ID NO:
LLDLLEGLTGQK
ENSG00000198947.10
6
|
822
|
|
SEQ ID NO:
LLEQSIQSAQETEK
ENSG00000198947.10
6
|
823
|
|
SEQ ID NO:
LQAEDCSIACLPR
ENSG00000152894.10
6
|
824
|
|
SEQ ID NO:
MNVVFAVK
ENSG00000136631.8
6
|
825
|
|
SEQ ID NO:
NPPAAYIQK
ENSG00000184922.9
6
|
826
|
|
SEQ ID NO:
NTSLNPQELQR
ENSG00000125826.15
6
|
827
|
|
SEQ ID NO:
NVLINKDIR
ENSG00000179218.9
6
|
828
|
|
SEQ ID NO:
PAETLKPMGN
ENSG00000065534.14
6
|
829
|
|
SEQ ID NO:
PAETLKPMGN
ENSG00000065534.14
6
|
830
|
|
SEQ ID NO:
PFSLDALSK
ENSG00000146731.6
6
|
831
|
|
SEQ ID NO:
PLLPANHVTMAK
ENSG00000067704.8
6
|
832
|
|
SEQ ID NO:
PSGYTCACDSGFR
ENSG00000090006.13
6
|
833
|
|
SEQ ID NO:
PSVVLSAAHTVAAR
ENSG00000032444.11
6
|
834
|
|
SEQ ID NO:
QASNGVLIR
ENSG00000166825.9
6
|
835
|
|
SEQ ID NO:
QGLELAADCHLSR
ENSG00000130396.16
6
|
836
|
|
SEQ ID NO:
QVEELLMAMEK
ENSG00000082805.15
6
|
837
|
|
SEQ ID NO:
QVEKEETNEIQVVNEEPQR
ENSG00000135052.12
6
|
838
|
|
SEQ ID NO:
RLEAEFPPHHSQSTFR
ENSG00000061938.12
6
|
839
|
|
SEQ ID NO:
SWDTNLIECNLDQELK
ENSG00000131711.10
6
|
840
|
|
SEQ ID NO:
TGEPCVAELTEENFQR
ENSG00000082805.15
6
|
841
|
|
SEQ ID NO:
VECEPSWQPFQGHCYR
ENSG00000011028.9
6
|
842
|
|
SEQ ID NO:
VRFTPVVCGLR
ENSG00000090006.13
6
|
843
|
|
SEQ ID NO:
VSLSQPR
ENSG00000090006.13
6
|
844
|
|
SEQ ID NO:
AAEGYTQFYYVDVLDGK
ENSG00000205277.5
5
|
845
|
|
SEQ ID NO:
AALEEVEGDVAELELK
ENSG00000114331.8
5
|
846
|
|
SEQ ID NO:
AEEFGNETWGVTK
ENSG00000179218.9
5
|
847
|
|
SEQ ID NO:
AFEDWLNDDLGSYQGAQGNR
ENSG00000101199.8
5
|
848
|
|
SEQ ID NO:
ATQEWLEK
ENSG00000137497.13
5
|
849
|
|
SEQ ID NO:
CSQFCTTGMDGGMSIWDVK
ENSG00000130429.8
5
|
850
|
|
SEQ ID NO:
DQLVIPDGQEEEQEAAGEGR
ENSG00000135052.12
5
|
851
|
|
SEQ ID NO:
EAQEAEAFALYHK
ENSG00000099991.12
5
|
852
|
|
SEQ ID NO:
EGNCSGCIQDCNR
ENSG00000104450.8
5
|
853
|
|
SEQ ID NO:
EGQIQSVVTYDLALDSGRPHSR
ENSG00000169896.12
5
|
854
|
|
SEQ ID NO:
EIDAALQKK
ENSG00000162614.14
5
|
855
|
|
SEQ ID NO:
ERFQNLDKK
ENSG00000130429.8
5
|
856
|
|
SEQ ID NO:
ETQPPDLPTTALGGCPSDWIQFLNK
ENSG00000011028.9
5
|
857
|
|
SEQ ID NO:
FREFLESQEDYDPCWSLQEK
ENSG00000101199.8
5
|
858
|
|
SEQ ID NO:
GGTAAGAGLDSLHK
ENSG00000130429.8
5
|
859
|
|
SEQ ID NO:
GLNPGTLNILVR
ENSG00000152894.10
5
|
860
|
|
SEQ ID NO:
GQLAPVFQR
ENSG00000213380.9
5
|
861
|
|
SEQ ID NO:
GSAASTCILTIESK
ENSG00000162614.14
5
|
862
|
|
SEQ ID NO:
ICGVEDAVSEMTR
ENSG00000146733.9
5
|
863
|
|
SEQ ID NO:
IITEGFEAAKEK
ENSG00000146731.6
5
|
864
|
|
SEQ ID NO:
ILKDIANR
ENSG00000067704.8
5
|
865
|
|
SEQ ID NO:
IQDLEHHLGLALNEVQAAK
ENSG00000011454.12
5
|
866
|
|
SEQ ID NO:
IVDAVIEQVK
ENSG00000170776.15
5
|
867
|
|
SEQ ID NO:
KVNVLQK
ENSG00000082805.15
5
|
868
|
|
SEQ ID NO:
LLLQCQVSSDPPATIIWTLNGK
ENSG00000065534.14
5
|
869
|
|
SEQ ID NO:
LSFEEMER
ENSG00000162614.14
5
|
870
|
|
SEQ ID NO:
LSPIPAVPASVPLQAWHPAK
ENSG00000104450.8
5
|
871
|
|
SEQ ID NO:
NQDNEDEWPLAEILSVK
ENSG00000172977.8
5
|
872
|
|
SEQ ID NO:
PTTLTDEEINR
ENSG00000100714.11
5
|
873
|
|
SEQ ID NO:
QIIEDQSGHYIWVPSPEKL
ENSG00000082458.7
5
|
874
|
|
SEQ ID NO:
QIQESEHMK
ENSG00000065534.14
5
|
875
|
|
SEQ ID NO:
RDFGSFDK
ENSG00000112096.12
5
|
876
|
|
SEQ ID NO:
RPQLEELITAAQNLK
ENSG00000198947.10
5
|
877
|
|
SEQ ID NO:
RPYWCISR
ENSG00000067704.8
5
|
878
|
|
SEQ ID NO:
SEESTASHSSQDATGTIVLPAR
ENSG00000205277.5
5
|
879
|
|
SEQ ID NO:
SEESTASHSSQDATGTIVLPAR
ENSG00000205277.5
5
|
880
|
|
SEQ ID NO:
SEESTASHSSQDATGTIVLPAR
ENSG00000205277.5
5
|
881
|
|
SEQ ID NO:
SEESTASHSSQDATGTIVLPAR
ENSG00000205277.5
5
|
882
|
|
SEQ ID NO:
SGTIFDNFLITNDEAY
ENSG00000179218.9
5
|
883
|
|
SEQ ID NO:
SQDADSPGSSGAPENLTFK
ENSG00000130396.16
5
|
884
|
|
SEQ ID NO:
TCYPLESRPSLSLGTITDEEMK
ENSG00000137497.13
5
|
885
|
|
SEQ ID NO:
TGLFTPDMAFETIVK
ENSG00000106976.14
5
|
886
|
|
SEQ ID NO:
VATEAEFSPEDSPSVR
ENSG00000155629.10
5
|
887
|
|
SEQ ID NO:
VPPPCDLGR
ENSG00000090006.13
5
|
888
|
|
SEQ ID NO:
VVSNFILQALQGEPLTVYGSGSQTR
ENSG00000115652.10
5
|
889
|
|
SEQ ID NO:
AAIVFTDGR
ENSG00000132561.9
4
|
890
|
|
SEQ ID NO:
AGKGEVTFEDVK
ENSG00000004864.9
4
|
891
|
|
SEQ ID NO:
AIDLEIK
ENSG00000162614.14
4
|
892
|
|
SEQ ID NO:
AIEEELQEIASEPTNK
ENSG00000132561.9
4
|
893
|
|
SEQ ID NO:
ASFITPVPGGVGPMTVAMLMQSTVESAKR
ENSG00000100714.11
4
|
894
|
|
SEQ ID NO:
CAVVSSAGSLK
ENSG00000073849.10
4
|
895
|
|
SEQ ID NO:
CHYYANK
ENSG00000134871.13
4
|
896
|
|
SEQ ID NO:
CLTALPYICK
ENSG00000011028.9
4
|
897
|
|
SEQ ID NO:
DEELPTLLHFAAK
ENSG00000155629.10
4
|
898
|
|
SEQ ID NO:
DKVMPLIIQGFK
ENSG00000086475.10
4
|
899
|
|
SEQ ID NO:
DKVVALAEGR
ENSG00000101199.8
4
|
900
|
|
SEQ ID NO:
DQVFGSNLANLCQR
ENSG00000165322.13
4
|
901
|
|
SEQ ID NO:
DVFNVEDQKR
ENSG00000135052.12
4
|
902
|
|
SEQ ID NO:
EAELEYNPEHVSR
ENSG00000067704.8
4
|
903
|
|
SEQ ID NO:
EATDVIIIHSK
ENSG00000166825.9
4
|
904
|
|
SEQ ID NO:
EQYDVPQEWR
ENSG00000205277.5
4
|
905
|
|
SEQ ID NO:
ESPQDSAITR
ENSG00000011454.12
4
|
906
|
|
SEQ ID NO:
EVVLQWFTENSK
ENSG00000166825.9
4
|
907
|
|
SEQ ID NO:
EYFTFPASK
ENSG00000130396.16
4
|
908
|
|
SEQ ID NO:
FFDSACTMGAYHPLLYEK
ENSG00000073849.10
4
|
909
|
|
SEQ ID NO:
FGSFDKFK
ENSG00000112096.12
4
|
910
|
|
SEQ ID NO:
FIEAGQFNDNLYGTSIQSVR
ENSG00000082458.7
4
|
911
|
|
SEQ ID NO:
FIPGSALNGMVEMMDR
ENSG00000067704.8
4
|
912
|
|
SEQ ID NO:
GHLQIAACPNQD
ENSG00000112096.12
4
|
913
|
|
SEQ ID NO:
GSWQPVGDLLIDSLQDHLEK
ENSG00000198947.10
4
|
914
|
|
SEQ ID NO:
HVVPGVER
ENSG00000130589.12
4
|
915
|
|
SEQ ID NO:
IDYGTGHEAAFAAFLCCLCK
ENSG00000119383.15
4
|
916
|
|
SEQ ID NO:
IVGNGSEQQLQK
ENSG00000011454.12
4
|
917
|
|
SEQ ID NO:
KESEETIIQTDEDVPGPVPVK
ENSG00000152894.10
4
|
918
|
|
SEQ ID NO:
LEPAGPACPEGGR
ENSG00000213380.9
4
|
919
|
|
SEQ ID NO:
LETLTNQFSDSK
ENSG00000082805.15
4
|
920
|
|
SEQ ID NO:
LFSGSQVR
ENSG00000059691.7
4
|
921
|
|
SEQ ID NO:
LLEILK
ENSG00000082805.15
4
|
922
|
|
SEQ ID NO:
LLQQFPLDLEK
ENSG00000198947.10
4
|
923
|
|
SEQ ID NO:
LLTESVNSVIAQAPPVAQEALKK
ENSG00000198947.10
4
|
924
|
|
SEQ ID NO:
LPVEDKIR
ENSG00000100714.11
4
|
925
|
|
SEQ ID NO:
LPYGGQCR
ENSG00000172037.9
4
|
926
|
|
SEQ ID NO:
LSTAITLLPLEEGR
ENSG00000019144.12
4
|
927
|
|
SEQ ID NO:
LTASSTCGLNGPQPYCIVSHLQDEKK
ENSG00000172037.9
4
|
928
|
|
SEQ ID NO:
LVTPHGESEQIGVIPSK
ENSG00000082458.7
4
|
929
|
|
SEQ ID NO:
NAEVRPPFTYASLIR
ENSG00000114861.14
4
|
930
|
|
SEQ ID NO:
PAETLKPMGNAKPDENLK
ENSG00000065534.14
4
|
931
|
|
SEQ ID NO:
PGGAGPCATVSVFPGAR
ENSG00000142453.7
4
|
932
|
|
SEQ ID NO:
QELNTIASKPPR
ENSG00000169896.12
4
|
933
|
|
SEQ ID NO:
RFSTEYELQQLEQFKK
ENSG00000166825.9
4
|
934
|
|
SEQ ID NO:
RVPPPCAPGR
ENSG00000090006.13
4
|
935
|
|
SEQ ID NO:
SCHAGFGSPAGWDVPVGALIQR
ENSG00000163975.7
4
|
936
|
|
SEQ ID NO:
SFGHFPGPEFLDVEK
ENSG00000165322.13
4
|
937
|
|
SEQ ID NO:
SITEVGEALK
ENSG00000198947.10
4
|
938
|
|
SEQ ID NO:
SLQADTTNTDTALTTLEEALAEKER
ENSG00000082805.15
4
|
939
|
|
SEQ ID NO:
SSNLLDLKNPFFR
ENSG00000142453.7
4
|
940
|
|
SEQ ID NO:
TGYAFVDCPDESWALK
ENSG00000136231.9
4
|
941
|
|
SEQ ID NO:
TQVTFFFPLDLSYR
ENSG00000169896.12
4
|
942
|
|
SEQ ID NO:
TSKDDLLLTDFEGALK
ENSG00000011454.12
4
|
943
|
|
SEQ ID NO:
TVTINTEQK
ENSG00000065534.14
4
|
944
|
|
SEQ ID NO:
VADLLQHINLMK
ENSG00000152894.10
4
|
945
|
|
SEQ ID NO:
VDANISVHHPGEPLGVR
ENSG00000059691.7
4
|
946
|
|
SEQ ID NO:
VMVGDLEDINEMIIK
ENSG00000198947.10
4
|
947
|
|
SEQ ID NO:
VVGDVAYDEAKER
ENSG00000100714.11
4
|
948
|
|
SEQ ID NO:
VYLLYR
ENSG00000167770.7
4
|
949
|
|
SEQ ID NO:
WANGLSEEKPLSVPR
ENSG00000064545.10
4
|
950
|
|
SEQ ID NO:
WAPNENK
ENSG00000130429.8
4
|
951
|
|
SEQ ID NO:
WCVLSTPEIQK
ENSG00000163975.7
4
|
952
|
|
SEQ ID NO:
WMDPEGEMKPGR
ENSG00000113387.7
4
|
953
|
|
SEQ ID NO:
WVLLQDILLK
ENSG00000198947.10
4
|
954
|
|
SEQ ID NO:
YEEQRPSLK
ENSG00000162614.14
4
|
955
|
|
SEQ ID NO:
YGLLNVTK
ENSG00000165322.13
4
|
956
|
|
SEQ ID NO:
YQHIGLVAMFR
ENSG00000169896.12
4
|
957
|
|
SEQ ID NO:
YVPAIAHLIHSLNPVR
ENSG00000106066.9
4
|
958
|
|
SEQ ID NO:
AAILQTEVDALR
ENSG00000082805.15
3
|
959
|
|
SEQ ID NO:
ADGGPEAGELPSIGEATAALALAGR
ENSG00000019144.12
3
|
960
|
|
SEQ ID NO:
AENYWWR
ENSG00000061938.12
3
|
961
|
|
SEQ ID NO:
AEQPPHLTPGIR
ENSG00000146733.9
3
|
962
|
|
SEQ ID NO:
AIEALSGK
ENSG00000136231.9
3
|
963
|
|
SEQ ID NO:
AIGNIELGIR
ENSG00000131711.10
3
|
964
|
|
SEQ ID NO:
AMNNSWHPECFR
ENSG00000169756.12
3
|
965
|
|
SEQ ID NO:
APNLSSGNVSLK
ENSG00000155629.10
3
|
966
|
|
SEQ ID NO:
AQVAHADQQLR
ENSG00000137497.13
3
|
967
|
|
SEQ ID NO:
AREHFGTVK
ENSG00000211460.7
3
|
968
|
|
SEQ ID NO:
ARFEQMAKAREE
ENSG00000162614.14
3
|
969
|
|
SEQ ID NO:
ASFANEDGQVSPGSLLLAGAIAGMPAASLVTPADVIK
ENSG00000004864.9
3
|
970
|
|
SEQ ID NO:
AVVVGFDPHFSYMK
ENSG00000184207.8
3
|
971
|
|
SEQ ID NO:
DDLLLTDFEGALK
ENSG00000011454.12
3
|
972
|
|
SEQ ID NO:
DNEETGFGSGTR
ENSG00000166825.9
3
|
973
|
|
SEQ ID NO:
DVDGLTSINAGK
ENSG00000100714.11
3
|
974
|
|
SEQ ID NO:
EAGIQPSLLCVR
ENSG00000163975.7
3
|
975
|
|
SEQ ID NO:
EDFNSKHMANQRALGK
ENSG00000172037.9
3
|
976
|
|
SEQ ID NO:
EEGDLGPVYGFQWR
ENSG00000176890.11
3
|
977
|
|
SEQ ID NO:
EELSSGDSLSPDPWK
ENSG00000130396.16
3
|
978
|
|
SEQ ID NO:
ELQKAVEEMK
ENSG00000198947.10
3
|
979
|
|
SEQ ID NO:
ENSMLREEMHRRFENAPDSAKTK
ENSG00000082805.15
3
|
980
|
|
SEQ ID NO:
EQISDIDDAVRK
ENSG00000113387.7
3
|
981
|
|
SEQ ID NO:
EVVDAGLVGLER
ENSG00000138162.13
3
|
982
|
|
SEQ ID NO:
FEALQAPACHENMVK
ENSG00000196961.8
3
|
983
|
|
SEQ ID NO:
FHLCSVATR
ENSG00000196961.8
3
|
984
|
|
SEQ ID NO:
FNLDTENAMTFQENAR
ENSG00000169896.12
3
|
985
|
|
SEQ ID NO:
FTEEIPLK
ENSG00000136231.9
3
|
986
|
|
SEQ ID NO:
GALTSTPYSPTQHLER
ENSG00000153310.14
3
|
987
|
|
SEQ ID NO:
GDEGPIGHQGPIGQEGAPGR
ENSG00000134871.13
3
|
988
|
|
SEQ ID NO:
GDSGQPLFLTPYIEAGK
ENSG00000106066.9
3
|
989
|
|
SEQ ID NO:
GEPVSAEDLGVSGALTVLMK
ENSG00000100714.11
3
|
990
|
|
SEQ ID NO:
GFSGIFPACHPCHACFGDWDR
ENSG00000172037.9
3
|
991
|
|
SEQ ID NO:
GIDTPQCHR
ENSG00000172037.9
3
|
992
|
|
SEQ ID NO:
GWDSSHEDDLPVYLAR
ENSG00000113657.8
3
|
993
|
|
SEQ ID NO:
HEQNIDCGGGYV
ENSG00000179218.9
3
|
994
|
|
SEQ ID NO:
HLNQGTDEDIYLLGK
ENSG00000073849.10
3
|
995
|
|
SEQ ID NO:
IAELQQR
ENSG00000137497.13
3
|
996
|
|
SEQ ID NO:
ILVVITDGEK
ENSG00000169896.12
3
|
997
|
|
SEQ ID NO:
INDAFNLASAHK
ENSG00000166825.9
3
|
998
|
|
SEQ ID NO:
INLPAPNPDHVGGYK
ENSG00000004864.9
3
|
999
|
|
SEQ ID NO:
IQEILTQVK
ENSG00000136231.9
3
|
1000
|
|
SEQ ID NO:
IQPTTPSEPTAIK
ENSG00000198947.10
3
|
1001
|
|
SEQ ID NO:
ISPGSTEITTLPGSTTTPGLSEASTTFYSSPR
ENSG00000205277.5
3
|
1002
|
|
SEQ ID NO:
ISPGSTEITTLPGSTTTPGLSEASTTFYSSPR
ENSG00000205277.5
3
|
1003
|
|
SEQ ID NO:
ISPGSTEITTLPGSTTTPGLSEASTTFYSSPR
ENSG00000205277.5
3
|
1004
|
|
SEQ ID NO:
ISPGSTEITTLPGSTTTPGLSEASTTFYSSPR
ENSG00000205277.5
3
|
1005
|
|
SEQ ID NO:
ISSMERGLR
ENSG00000082805.15
3
|
1006
|
|
SEQ ID NO:
IVLDVGCGSGILSFFAAQAGAR
ENSG00000142453.7
3
|
1007
|
|
SEQ ID NO:
IYGADDIELLPEAQHKAEVYTK
ENSG00000100714.11
3
|
1008
|
|
SEQ ID NO:
KDVKLDK
ENSG00000170776.15
3
|
1009
|
|
SEQ ID NO:
KFQETEQTIQK
ENSG00000132205.6
3
|
1010
|
|
SEQ ID NO:
KFSYDLSQCINQMK
ENSG00000135052.12
3
|
1011
|
|
SEQ ID NO:
KLPAENGSSSAETLNAK
ENSG00000065534.14
3
|
1012
|
|
SEQ ID NO:
KLTELENELNTK
ENSG00000130396.16
3
|
1013
|
|
SEQ ID NO:
KQTENPK
ENSG00000198947.10
3
|
1014
|
|
SEQ ID NO:
KQVTPLFIHFR
ENSG00000166825.9
3
|
1015
|
|
SEQ ID NO:
KRVEDAYILTCNVSLEYEK
ENSG00000146731.6
3
|
1016
|
|
SEQ ID NO:
KVPFAWCAPESLK
ENSG00000061938.12
3
|
1017
|
|
SEQ ID NO:
LAGAPAPK
ENSG00000184207.8
3
|
1018
|
|
SEQ ID NO:
LHELYEKVFSRRADR
ENSG00000032444.11
3
|
1019
|
|
SEQ ID NO:
LLDPEDVDTTYPDKK
ENSG00000198947.10
3
|
1020
|
|
SEQ ID NO:
LLESLQENHFQEDEQFLGAVMPR
ENSG00000086475.10
3
|
1021
|
|
SEQ ID NO:
LLQVAVEDR
ENSG00000198947.10
3
|
1022
|
|
SEQ ID NO:
LLVSDIQTIQPSLNSVNEGGQK
ENSG00000198947.10
3
|
1023
|
|
SEQ ID NO:
LNLHSADWQR
ENSG00000198947.10
3
|
1024
|
|
SEQ ID NO:
LPAENGSSSAETLNAK
ENSG00000065534.14
3
|
1025
|
|
SEQ ID NO:
LPLEDADIIK
ENSG00000110237.3
3
|
1026
|
|
SEQ ID NO:
LPLQMALTELETLAEK
ENSG00000104728.11
3
|
1027
|
|
SEQ ID NO:
LPTEWNVLGTDQSLHDAGPR
ENSG00000170776.15
3
|
1028
|
|
SEQ ID NO:
LQEALSQLDFQWEK
ENSG00000198947.10
3
|
1029
|
|
SEQ ID NO:
LQEPSAQANCCDSEKNGDIGQQIK
ENSG00000132205.6
3
|
1030
|
|
SEQ ID NO:
LQSQVISELDACKECTQGVQR
ENSG00000132205.6
3
|
1031
|
|
SEQ ID NO:
LYIGNLSENAAPSDLESIFK
ENSG00000136231.9
3
|
1032
|
|
SEQ ID NO:
MLESYLHAK
ENSG00000142453.7
3
|
1033
|
|
SEQ ID NO:
NLLLATR
ENSG00000061938.12
3
|
1034
|
|
SEQ ID NO:
NVLLHEMQIQHPTASLIAK
ENSG00000146731.6
3
|
1035
|
|
SEQ ID NO:
QKPCDLPLR
ENSG00000136231.9
3
|
1036
|
|
SEQ ID NO:
QPAAFIVTQYPLPNTVK
ENSG00000152894.10
3
|
1037
|
|
SEQ ID NO:
QQLGHIEAWAEK
ENSG00000130396.16
3
|
1038
|
|
SEQ ID NO:
QREEHYFCK
ENSG00000133315.6
3
|
1039
|
|
SEQ ID NO:
QVFHALEDELQK
ENSG00000151914.13
3
|
1040
|
|
SEQ ID NO:
QWMENPNNNPIHPNLR
ENSG00000166825.9
3
|
1041
|
|
SEQ ID NO:
SAQALVEQMVNEGVNADSIK
ENSG00000198947.10
3
|
1042
|
|
SEQ ID NO:
SATSVLVGEPTTSPISSGSTETTALPGSTTTAGLSEK
ENSG00000205277.5
3
|
1043
|
|
SEQ ID NO:
SATSVLVGEPTTSPISSGSTETTALPGSTTTAGLSEK
ENSG00000205277.5
3
|
1044
|
|
SEQ ID NO:
SATSVLVGEPTTSPISSGSTETTALPGSTTTAGLSEK
ENSG00000205277.5
3
|
1045
|
|
SEQ ID NO:
SAVEGMPSNLDSEVAWGK
ENSG00000198947.10
3
|
1046
|
|
SEQ ID NO:
SEDSTIYDLLKDPVSLR
ENSG00000104728.11
3
|
1047
|
|
SEQ ID NO:
SLESALKDLK
ENSG00000130429.8
3
|
1048
|
|
SEQ ID NO:
SPNPALTFCVK
ENSG00000019144.12
3
|
1049
|
|
SEQ ID NO:
STTFYTSPR
ENSG00000205277.5
3
|
1050
|
|
SEQ ID NO:
STTFYTSPR
ENSG00000205277.5
3
|
1051
|
|
SEQ ID NO:
STTFYTSPR
ENSG00000205277.5
3
|
1052
|
|
SEQ ID NO:
STTFYTSPR
ENSG00000205277.5
3
|
1053
|
|
SEQ ID NO:
TCHYYANK
ENSG00000134871.13
3
|
1054
|
|
SEQ ID NO:
TCSECQELHWGDPGLQCHACDCDSR
ENSG00000172037.9
3
|
1055
|
|
SEQ ID NO:
TCYPLESR
ENSG00000137497.13
3
|
1056
|
|
SEQ ID NO:
TEFQLELPVK
ENSG00000169896.12
3
|
1057
|
|
SEQ ID NO:
TKEPVIMSTLETVR
ENSG00000198947.10
3
|
1058
|
|
SEQ ID NO:
TPLWIGLAGEEGSRR
ENSG00000011028.9
3
|
1059
|
|
SEQ ID NO:
TQSLNPAPFSPLTAQQMKPEKPSTLQRPQETVIR
ENSG00000130396.16
3
|
1060
|
|
SEQ ID NO:
TVGWNVPVGYLVESGR
ENSG00000163975.7
3
|
1061
|
|
SEQ ID NO:
VASSSSGNNFLSGSPASPMGDILQTPQFQMR
ENSG00000137497.13
3
|
1062
|
|
SEQ ID NO:
VAWVSHDSTVCLADADK
ENSG00000130429.8
3
|
1063
|
|
SEQ ID NO:
VEQQPDYR
ENSG00000130396.16
3
|
1064
|
|
SEQ ID NO:
VIQEVSGLPSEGASEGNQYTPDAQR
ENSG00000169129.10
3
|
1065
|
|
SEQ ID NO:
VLDLLDPASGDLVIR
ENSG00000079616.8
3
|
1066
|
|
SEQ ID NO:
VLLHEMQIQHPTASLIAK
ENSG00000146731.6
3
|
1067
|
|
SEQ ID NO:
VMDKVTSDETR
ENSG00000138162.13
3
|
1068
|
|
SEQ ID NO:
VPRYELLLK
ENSG00000127084.13
3
|
1069
|
|
SEQ ID NO:
VQFGASHVFK
ENSG00000130396.16
3
|
1070
|
|
SEQ ID NO:
VSCIVSAAK
ENSG00000169129.10
3
|
1071
|
|
SEQ ID NO:
VTEILGIEPDREK
ENSG00000211460.7
3
|
1072
|
|
SEQ ID NO:
VVDALNQGLPR
ENSG00000079616.8
3
|
1073
|
|
SEQ ID NO:
WKTPAAIPATPVAVSQPIR
ENSG00000130396.16
3
|
1074
|
|
SEQ ID NO:
YLETADYAIREEIVLK
ENSG00000196961.8
3
|
1075
|
|
SEQ ID NO:
YLNWESDQPDNPSEENCGVIR
ENSG00000011028.9
3
|
1076
|
|
SEQ ID NO:
YVGFGNTPPPQKK
ENSG00000101199.8
3
|
1077
|
|
SEQ ID NO:
AAGNFATK
ENSG00000130396.16
2
|
1078
|
|
SEQ ID NO:
AEGERQPPPDSSEEAPPATQNFIIPK
ENSG00000119383.15
2
|
1079
|
|
SEQ ID NO:
AGLVVEDALFETLPSDVR
ENSG00000171488.10
2
|
1080
|
|
SEQ ID NO:
AHCGDPVSLAAAGDGSPDIGPTGELSGSLK
ENSG00000127084.13
2
|
1081
|
|
SEQ ID NO:
AILQNHTDFKDK
ENSG00000142453.7
2
|
1082
|
|
SEQ ID NO:
AINVYGTSEPSQESELTTVGEKPEEPK
ENSG00000065534.14
2
|
1083
|
|
SEQ ID NO:
ALGEDQVAETSAMSDVLKDILK
ENSG00000157617.12
2
|
1084
|
|
SEQ ID NO:
ANIVMVLEIVSGGELFER
ENSG00000065534.14
2
|
1085
|
|
SEQ ID NO:
APEEQGLLPNGEPSQHSSAPQK
ENSG00000169129.10
2
|
1086
|
|
SEQ ID NO:
APGLGVLSPSGEER
ENSG00000065534.14
2
|
1087
|
|
SEQ ID NO:
AQDDVSEWASK
ENSG00000132561.9
2
|
1088
|
|
SEQ ID NO:
ASSISEEVAVGSIAATLK
ENSG00000170776.15
2
|
1089
|
|
SEQ ID NO:
ATLALDSVLTEEGK
ENSG00000170776.15
2
|
1090
|
|
SEQ ID NO:
AVGGDRQEAIQPGCIGGPKGLPGLPGPPGPTGAKGLRGIPGFAGADGGP
ENSG00000134871.13
2
|
1091
|
|
SEQ ID NO:
AVGLVSTWTQR
ENSG00000127084.13
2
|
1092
|
|
SEQ ID NO:
AVSSADPR
ENSG00000138162.13
2
|
1093
|
|
SEQ ID NO:
AWHAFFTAAER
ENSG00000165912.11
2
|
1094
|
|
SEQ ID NO:
DCTQCLQHPWLMK
ENSG00000065534.14
2
|
1095
|
|
SEQ ID NO:
DEISDDAKDFISNLLK
ENSG00000065534.14
2
|
1096
|
|
SEQ ID NO:
DFGPASQHFLSTSVQGPWER
ENSG00000198947.10
2
|
1097
|
|
SEQ ID NO:
DFLDSLGFSTR
ENSG00000176890.11
2
|
1098
|
|
SEQ ID NO:
DGEWEPPVIQNPEYK
ENSG00000179218.9
2
|
1099
|
|
SEQ ID NO:
DTSPAPSGTTSAFVK
ENSG00000205277.5
2
|
1100
|
|
SEQ ID NO:
EAEDRARQEEERR
ENSG00000130396.16
2
|
1101
|
|
SEQ ID NO:
EAPYGAPR
ENSG00000090006.13
2
|
1102
|
|
SEQ ID NO:
ECAIYTNR
ENSG00000104450.8
2
|
1103
|
|
SEQ ID NO:
EGIVALRR
ENSG00000146731.6
2
|
1104
|
|
SEQ ID NO:
EGPYTVDAIQK
ENSG00000198947.10
2
|
1105
|
|
SEQ ID NO:
EKELQTIFDTLPPMR
ENSG00000198947.10
2
|
1106
|
|
SEQ ID NO:
ELEQQLQESAR
ENSG00000019144.12
2
|
1107
|
|
SEQ ID NO:
EQLDKIQSSHNFQLESVNK
ENSG00000135052.12
2
|
1108
|
|
SEQ ID NO:
EVTKEEFVLAAQK
ENSG00000004864.9
2
|
1109
|
|
SEQ ID NO:
EVVPGDSVNSLLSILDVITGHQHPQR
ENSG00000032444.11
2
|
1110
|
|
SEQ ID NO:
EYWMDPEGEMKPGRK
ENSG00000113387.7
2
|
1111
|
|
SEQ ID NO:
FGFSHLEALLDDSK
ENSG00000167770.7
2
|
1112
|
|
SEQ ID NO:
FGSQASQK
ENSG00000101199.8
2
|
1113
|
|
SEQ ID NO:
FHELTQTDK
ENSG00000100714.11
2
|
1114
|
|
SEQ ID NO:
FLDLGISIAENR
ENSG00000125826.15
2
|
1115
|
|
SEQ ID NO:
FLLDCGIR
ENSG00000065534.14
2
|
1116
|
|
SEQ ID NO:
FVDPSQDHALAK
ENSG00000130396.16
2
|
1117
|
|
SEQ ID NO:
FYGDEEK
ENSG00000179218.9
2
|
1118
|
|
SEQ ID NO:
GAWLGMNFNPK
ENSG00000011028.9
2
|
1119
|
|
SEQ ID NO:
GILVFQLK
ENSG00000130396.16
2
|
1120
|
|
SEQ ID NO:
GISLNPEQWSQL
ENSG00000113387.7
2
|
1121
|
|
SEQ ID NO:
GLYLPLFKPSVSTSK
ENSG00000004864.9
2
|
1122
|
|
SEQ ID NO:
GMEDLIPLVNR
ENSG00000106976.14
2
|
1123
|
|
SEQ ID NO:
GPIGHQGPIGQEGAPGR
ENSG00000134871.13
2
|
1124
|
|
SEQ ID NO:
GPNKHTLTQIK
ENSG00000146731.6
2
|
1125
|
|
SEQ ID NO:
GPTCNEFTGQCHCR
ENSG00000172037.9
2
|
1126
|
|
SEQ ID NO:
GSEGEPGIR
ENSG00000134871.13
2
|
1127
|
|
SEQ ID NO:
GTDVREPDDSPQGR
ENSG00000011028.9
2
|
1128
|
|
SEQ ID NO:
GWAGDSGPQGR
ENSG00000134871.13
2
|
1129
|
|
SEQ ID NO:
HAQEELPPPPPQKK
ENSG00000198947.10
2
|
1130
|
|
SEQ ID NO:
HSTVLENTDGK
ENSG00000163975.7
2
|
1131
|
|
SEQ ID NO:
IEELEEALR
ENSG00000082805.15
2
|
1132
|
|
SEQ ID NO:
IEGSGDQIDTYELSGGAR
ENSG00000106976.14
2
|
1133
|
|
SEQ ID NO:
IELHGKPIEVEHSVPK
ENSG00000136231.9
2
|
1134
|
|
SEQ ID NO:
IIDEDFELTERECIK
ENSG00000065534.14
2
|
1135
|
|
SEQ ID NO:
IKLIDFGLAR
ENSG00000065534.14
2
|
1136
|
|
SEQ ID NO:
ILDLLNEGSAR
ENSG00000079616.8
2
|
1137
|
|
SEQ ID NO:
ILMELDGPNWR
ENSG00000104450.8
2
|
1138
|
|
SEQ ID NO:
IPQAVVDVSSHLQK
ENSG00000171488.10
2
|
1139
|
|
SEQ ID NO:
IQAEQVDAVTLSGEDIYTAGK
ENSG00000163975.7
2
|
1140
|
|
SEQ ID NO:
IVIYVQQTTNK
ENSG00000011454.12
2
|
1141
|
|
SEQ ID NO:
IVSEFDYVEK
ENSG00000166825.9
2
|
1142
|
|
SEQ ID NO:
KADTLPR
ENSG00000049323.11
2
|
1143
|
|
SEQ ID NO:
KINQLSEENGDLSFK
ENSG00000137497.13
2
|
1144
|
|
SEQ ID NO:
KIQEILTQVK
ENSG00000136231.9
2
|
1145
|
|
SEQ ID NO:
KKLPAENGSSSAETLNAK
ENSG00000065534.14
2
|
1146
|
|
SEQ ID NO:
KLLLQCQVSSDPPATIIWTLNGK
ENSG00000065534.14
2
|
1147
|
|
SEQ ID NO:
KPAAGLSAAPVPTAPAAGAPL
ENSG00000115310.13
2
|
1148
|
|
SEQ ID NO:
KSPSSDSWTCADTSTER
ENSG00000101199.8
2
|
1149
|
|
SEQ ID NO:
KSSTGSPTSPLNAEK
ENSG00000065534.14
2
|
1150
|
|
SEQ ID NO:
LALLNEK
ENSG00000137497.13
2
|
1151
|
|
SEQ ID NO:
LDIDEK
ENSG00000130396.16
2
|
1152
|
|
SEQ ID NO:
LIAPLEGYTR
ENSG00000167608.7
2
|
1153
|
|
SEQ ID NO:
LKEEEEDKK
ENSG00000179218.9
2
|
1154
|
|
SEQ ID NO:
LKNQVTQLKEQVPGFTPR
ENSG00000100714.11
2
|
1155
|
|
SEQ ID NO:
LLDPQTNTEIANYPIYK
ENSG00000011454.12
2
|
1156
|
|
SEQ ID NO:
LLDRLPSFQQSCR
ENSG00000213380.9
2
|
1157
|
|
SEQ ID NO:
LLEAIKR
ENSG00000112096.12
2
|
1158
|
|
SEQ ID NO:
LLGFGSALLDNVDPNPENFVGAGIIQTK
ENSG00000196961.8
2
|
1159
|
|
SEQ ID NO:
LQAQLNELQAQLSQKEQAAEHYK
ENSG00000137497.13
2
|
1160
|
|
SEQ ID NO:
LQDVHVAEGKK
ENSG00000065534.14
2
|
1161
|
|
SEQ ID NO:
LQGEVLALEEER
ENSG00000019144.12
2
|
1162
|
|
SEQ ID NO:
LSALHLEVR
ENSG00000165912.11
2
|
1163
|
|
SEQ ID NO:
LSSQLVEHCQK
ENSG00000198947.10
2
|
1164
|
|
SEQ ID NO:
LSVMGCDVLK
ENSG00000163975.7
2
|
1165
|
|
SEQ ID NO:
LTAASVGVQGSGWGWLGFNKER
ENSG00000112096.12
2
|
1166
|
|
SEQ ID NO:
LTDVAIGAPGEEDNR
ENSG00000169896.12
2
|
1167
|
|
SEQ ID NO:
LTHGVLHTK
ENSG00000105223.14
2
|
1168
|
|
SEQ ID NO:
LVTDPDSGLCSHYWGAIIR
ENSG00000130396.16
2
|
1169
|
|
SEQ ID NO:
MDPEGEMKPGR
ENSG00000113387.7
2
|
1170
|
|
SEQ ID NO:
MELLVK
ENSG00000145362.12
2
|
1171
|
|
SEQ ID NO:
MVSMMEGVIQK
ENSG00000130396.16
2
|
1172
|
|
SEQ ID NO:
MVVASSK
ENSG00000100714.11
2
|
1173
|
|
SEQ ID NO:
NDAGQAECSCQVTVDDAPASENTKAPEMK
ENSG00000065534.14
2
|
1174
|
|
SEQ ID NO:
NILSEFQR
ENSG00000198947.10
2
|
1175
|
|
SEQ ID NO:
NLLEVSEVEQELACQNDHSSALQNIK
ENSG00000136631.8
2
|
1176
|
|
SEQ ID NO:
NLVDSYMAIVNK
ENSG00000106976.14
2
|
1177
|
|
SEQ ID NO:
NVNVFFPHFK
ENSG00000151116.12
2
|
1178
|
|
SEQ ID NO:
PASAEQIQHLAGAIAER
ENSG00000172037.9
2
|
1179
|
|
SEQ ID NO:
PAVPASVPLQAWHPAK
ENSG00000104450.8
2
|
1180
|
|
SEQ ID NO:
PFSAIYFPCYAHVK
ENSG00000004864.9
2
|
1181
|
|
SEQ ID NO:
PGPVPAHSLCGHLVPK
ENSG00000172037.9
2
|
1182
|
|
SEQ ID NO:
PLQGTTGLIPLLGIDVWEHAYYLQYK
ENSG00000112096.12
2
|
1183
|
|
SEQ ID NO:
PNENKFAVGSGSR
ENSG00000130429.8
2
|
1184
|
|
SEQ ID NO:
PPVQFSLLHSK
ENSG00000196961.8
2
|
1185
|
|
SEQ ID NO:
QAPIGGDFPAVQK
ENSG00000198947.10
2
|
1186
|
|
SEQ ID NO:
QKLQDVHVAEGK
ENSG00000065534.14
2
|
1187
|
|
SEQ ID NO:
QLAAYIADKVDAAQMPQEAQK
ENSG00000198947.10
2
|
1188
|
|
SEQ ID NO:
QLSESSKLK
ENSG00000157617.12
2
|
1189
|
|
SEQ ID NO:
QQTANKVEIEK
ENSG00000011454.12
2
|
1190
|
|
SEQ ID NO:
QSSSSRDDNMFQIGK
ENSG00000113387.7
2
|
1191
|
|
SEQ ID NO:
QYTYGLVSCGLDR
ENSG00000004139.9
2
|
1192
|
|
SEQ ID NO:
RAGNSLAASTAEETAGSAQGR
ENSG00000172037.9
2
|
1193
|
|
SEQ ID NO:
REAPYGAPR
ENSG00000090006.13
2
|
1194
|
|
SEQ ID NO:
REPAPNAPGDIAAAFPAER
ENSG00000138162.13
2
|
1195
|
|
SEQ ID NO:
RGWDSSHEDDLPVYLAR
ENSG00000113657.8
2
|
1196
|
|
SEQ ID NO:
RLEEESAQLK
ENSG00000011454.12
2
|
1197
|
|
SEQ ID NO:
RQVEKEETNEIQVVNEEPQR
ENSG00000135052.12
2
|
1198
|
|
SEQ ID NO:
RSESQGTAPAFK
ENSG00000065534.14
2
|
1199
|
|
SEQ ID NO:
SCTEETHGFICQK
ENSG00000011028.9
2
|
1200
|
|
SEQ ID NO:
SDFGKFVLSSGK
ENSG00000179218.9
2
|
1201
|
|
SEQ ID NO:
SEYMEGNVR
ENSG00000166825.9
2
|
1202
|
|
SEQ ID NO:
SFAPILPHLAEEVFQHIPYIK
ENSG00000067704.8
2
|
1203
|
|
SEQ ID NO:
SKVPQETQSGGGSR
ENSG00000049323.11
2
|
1204
|
|
SEQ ID NO:
SPATTLSPASTTSSGVSEESTTSHSR
ENSG00000205277.5
2
|
1205
|
|
SEQ ID NO:
SPATTLSPASTTSSGVSEESTTSHSR
ENSG00000205277.5
2
|
1206
|
|
SEQ ID NO:
SPATTLSPASTTSSGVSEESTTSHSRPGSTHTTAFPDSTTTPGLSR
ENSG00000205277.5
2
|
1207
|
|
SEQ ID NO:
SPATTLSPASTTSSGVSEESTTSHSRPGSTHTTAFPDSTTTPGLSR
ENSG00000205277.5
2
|
1208
|
|
SEQ ID NO:
SQDLQVIDLLTVGESR
ENSG00000169231.9
2
|
1209
|
|
SEQ ID NO:
SREPQAKPQLDLSIDSLDLSCEEGTPLSITSK
ENSG00000137497.13
2
|
1210
|
|
SEQ ID NO:
SRQELASGLPSPAATQELPVER
ENSG00000138162.13
2
|
1211
|
|
SEQ ID NO:
SSAAAGAPSR
ENSG00000049323.11
2
|
1212
|
|
SEQ ID NO:
SSPNVANQPPSPGGK
ENSG00000130396.16
2
|
1213
|
|
SEQ ID NO:
SSSEVLVLAETLDGVR
ENSG00000130589.12
2
|
1214
|
|
SEQ ID NO:
SVQEIAEQLLLENHPAR
ENSG00000151914.13
2
|
1215
|
|
SEQ ID NO:
TCTGYHQVR
ENSG00000133316.11
2
|
1216
|
|
SEQ ID NO:
TGETSR
ENSG00000113387.7
2
|
1217
|
|
SEQ ID NO:
TIQNQLR
ENSG00000169896.12
2
|
1218
|
|
SEQ ID NO:
TLFSLMQYSEEFR
ENSG00000169896.12
2
|
1219
|
|
SEQ ID NO:
TPAPDGPR
ENSG00000032444.11
2
|
1220
|
|
SEQ ID NO:
TPGQIVSEK
ENSG00000059691.7
2
|
1221
|
|
SEQ ID NO:
TPVPEK
ENSG00000065534.14
2
|
1222
|
|
SEQ ID NO:
TTLLDPDSCR
ENSG00000205277.5
2
|
1223
|
|
SEQ ID NO:
TTTESEVMK
ENSG00000100714.11
2
|
1224
|
|
SEQ ID NO:
TVLQIDCGLQLANDSVNR
ENSG00000104450.8
2
|
1225
|
|
SEQ ID NO:
VAQQPLSLVGCEVVPDPSPDHLYSFR
ENSG00000169129.10
2
|
1226
|
|
SEQ ID NO:
VHALNNVNK
ENSG00000198947.10
2
|
1227
|
|
SEQ ID NO:
VIVMPTTK
ENSG00000067704.8
2
|
1228
|
|
SEQ ID NO:
VLQEDLEQEQVR
ENSG00000198947.10
2
|
1229
|
|
SEQ ID NO:
VPAHAVVVR
ENSG00000163975.7
2
|
1230
|
|
SEQ ID NO:
WLNEVEFK
ENSG00000198947.10
2
|
1231
|
|
SEQ ID NO:
WTDGSIINFISWAPGK
ENSG00000011028.9
2
|
1232
|
|
SEQ ID NO:
WTDGSIINFISWAPGKPR
ENSG00000011028.9
2
|
1233
|
|
SEQ ID NO:
WVNAQFSK
ENSG00000198947.10
2
|
1234
|
|
SEQ ID NO:
YDNFGVLGLDLWQVK
ENSG00000179218.9
2
|
1235
|
|
SEQ ID NO:
YLLYRPGHYDILYK
ENSG00000167770.7
2
|
1236
|
|
SEQ ID NO:
YLSSLDLLLEHR
ENSG00000133315.6
2
|
1237
|
|
SEQ ID NO:
YLVHCLQSELNNYMPAFLDDPEENSLQRPK
ENSG00000130396.16
2
|
1238
|
|
SEQ ID NO:
YRDPGVLPWGALEEEEEDGGR
ENSG00000167608.7
2
|
1239
|
|
SEQ ID NO:
AAAAAVGPGAGGAGSAVPGGAGPCATVSVFPGAR
ENSG00000142453.7
1
|
1240
|
|
SEQ ID NO:
AAAKVALTKRADPAELR
ENSG00000004864.9
1
|
1241
|
|
SEQ ID NO:
AAATEEPEVIPDPAK
ENSG00000152894.10
1
|
1242
|
|
SEQ ID NO:
AAEEPQQQK
ENSG00000167770.7
1
|
1243
|
|
SEQ ID NO:
AAGDGSPDIGPTGELSGSLKIPNR
ENSG00000127084.13
1
|
1244
|
|
SEQ ID NO:
AAGLQAEIGQVK
ENSG00000082805.15
1
|
1245
|
|
SEQ ID NO:
AASGVPR
ENSG00000155629.10
1
|
1246
|
|
SEQ ID NO:
ACGNMFGLMHGTCPETSGGLLICLPR
ENSG00000086475.10
1
|
1247
|
|
SEQ ID NO:
ADSAVSQEQLR
ENSG00000165912.11
1
|
1248
|
|
SEQ ID NO:
AEEKPHVKPYFSK
ENSG00000065534.14
1
|
1249
|
|
SEQ ID NO:
AELEYNPEHVSR
ENSG00000067704.8
1
|
1250
|
|
SEQ ID NO:
AEQLLQDAR
ENSG00000172037.9
1
|
1251
|
|
SEQ ID NO:
AEYMRIQAQQQATKPSKEMS
ENSG00000017373.11
1
|
1252
|
|
SEQ ID NO:
AFCGLGTTGMWR
ENSG00000110237.3
1
|
1253
|
|
SEQ ID NO:
AFLEAVAEEKPHVKPYFSK
ENSG00000065534.14
1
|
1254
|
|
SEQ ID NO:
AHKQCALKLLR
ENSG00000141447.12
1
|
1255
|
|
SEQ ID NO:
ALMDLLQLTR
ENSG00000079616.8
1
|
1256
|
|
SEQ ID NO:
ALQDFEEPDK
ENSG00000061938.12
1
|
1257
|
|
SEQ ID NO:
ALQFLEEVKVSR
ENSG00000146731.6
1
|
1258
|
|
SEQ ID NO:
ALQHMAAMSSAQIVSATAIHNKLGLPGIPRPT
ENSG00000187079.10
1
|
1259
|
|
SEQ ID NO:
AMAYETLEQYGK
ENSG00000104450.8
1
|
1260
|
|
SEQ ID NO:
AMLAAVLEQELPALAENLHQEQK
ENSG00000142733.10
1
|
1261
|
|
SEQ ID NO:
AMLAAVLEQELPALAENLHQEQK
ENSG00000142733.10
1
|
1262
|
|
SEQ ID NO:
ANGITMYAVGVGKAIEEELQEIASEPTNK
ENSG00000132561.9
1
|
1263
|
|
SEQ ID NO:
APAPDVPGCSR
ENSG00000172037.9
1
|
1264
|
|
SEQ ID NO:
APILPHLAEEVFQHIPYIK
ENSG00000067704.8
1
|
1265
|
|
SEQ ID NO:
AQALLADVDTLLFDCDGVLWR
ENSG00000184207.8
1
|
1266
|
|
SEQ ID NO:
AQNSGFDLQETLVK
ENSG00000146731.6
1
|
1267
|
|
SEQ ID NO:
ARFEQMAK
ENSG00000162614.14
1
|
1268
|
|
SEQ ID NO:
ARPEAYQVPASYQPDEEER
ENSG00000125826.15
1
|
1269
|
|
SEQ ID NO:
ARTSAGVGAWGAAAVGRTAGVR
ENSG00000133315.6
1
|
1270
|
|
SEQ ID NO:
ASIPLKELEQFNSDIQK
ENSG00000198947.10
1
|
1271
|
|
SEQ ID NO:
ATSCFPRPMTPRDR
ENSG00000137497.13
1
|
1272
|
|
SEQ ID NO:
AVTSVSGPGEHLR
ENSG00000169231.9
1
|
1273
|
|
SEQ ID NO:
CAEVVSGK
ENSG00000067704.8
1
|
1274
|
|
SEQ ID NO:
CFGLLLSPGK
ENSG00000011454.12
1
|
1275
|
|
SEQ ID NO:
CGDSDKGFVVINQK
ENSG00000146731.6
1
|
1276
|
|
SEQ ID NO:
CGGLSCNGAAATADLALGR
ENSG00000172037.9
1
|
1277
|
|
SEQ ID NO:
CLCPPDFAGK
ENSG00000090006.13
1
|
1278
|
|
SEQ ID NO:
CLQHPWLMK
ENSG00000065534.14
1
|
1279
|
|
SEQ ID NO:
CLVENAGDVAFVR
ENSG00000163975.7
1
|
1280
|
|
SEQ ID NO:
CSGNIDPMDPDACDPHTGQCLR
ENSG00000172037.9
1
|
1281
|
|
SEQ ID NO:
CTEGPIDLVFVIDGSK
ENSG00000132561.9
1
|
1282
|
|
SEQ ID NO:
CTQCLQHPWLMK
ENSG00000065534.14
1
|
1283
|
|
SEQ ID NO:
CVRWAPNENK
ENSG00000130429.8
1
|
1284
|
|
SEQ ID NO:
DALLEALK
ENSG00000172037.9
1
|
1285
|
|
SEQ ID NO:
DCCFEISAPDKR
ENSG00000005020.8
1
|
1286
|
|
SEQ ID NO:
DDRTGTGTLSVFGMQARYSLR
ENSG00000176890.11
1
|
1287
|
|
SEQ ID NO:
DEDFELTERECIK
ENSG00000065534.14
1
|
1288
|
|
SEQ ID NO:
DISLQGPGLAPE
ENSG00000019144.12
1
|
1289
|
|
SEQ ID NO:
DITAALAAER
ENSG00000106976.14
1
|
1290
|
|
SEQ ID NO:
DLNVISSLLK
ENSG00000225485.3
1
|
1291
|
|
SEQ ID NO:
DQREPLPPAPAENEMK
ENSG00000104728.11
1
|
1292
|
|
SEQ ID NO:
DQSPLVSSSDSPPRPQPAFK
ENSG00000115310.13
1
|
1293
|
|
SEQ ID NO:
DRRGSGKPR
ENSG00000130396.16
1
|
1294
|
|
SEQ ID NO:
DSSHAFTLDELR
ENSG00000163975.7
1
|
1295
|
|
SEQ ID NO:
DWDSPYSHDLDTSADSVGNACR
ENSG00000105223.14
1
|
1296
|
|
SEQ ID NO:
EAEQLLRGPLGDQYQTVK
ENSG00000172037.9
1
|
1297
|
|
SEQ ID NO:
EAEVQTWLQQIGFSK
ENSG00000004139.9
1
|
1298
|
|
SEQ ID NO:
EDTVQSVK
ENSG00000106066.9
1
|
1299
|
|
SEQ ID NO:
EEAEQVLGQAR
ENSG00000198947.10
1
|
1300
|
|
SEQ ID NO:
EGIVALR
ENSG00000146731.6
1
|
1301
|
|
SEQ ID NO:
EGTEAEPLPLR
ENSG00000142733.10
1
|
1302
|
|
SEQ ID NO:
EGTEAEPLPLR
ENSG00000142733.10
1
|
1303
|
|
SEQ ID NO:
EGTPGIFQK
ENSG00000205277.5
1
|
1304
|
|
SEQ ID NO:
EGVIQNFK
ENSG00000130396.16
1
|
1305
|
|
SEQ ID NO:
EIDAALQK
ENSG00000162614.14
1
|
1306
|
|
SEQ ID NO:
EIHTVPDMGKWKR
ENSG00000119383.15
1
|
1307
|
|
SEQ ID NO:
EKLTAASVGVQGSGWGWLGFNK
ENSG00000112096.12
1
|
1308
|
|
SEQ ID NO:
ELEAKMLAQKAEEKENHCPTMLR
ENSG00000079616.8
1
|
1309
|
|
SEQ ID NO:
ELEEKDGDVQAGANLSFNR
ENSG00000158560.10
1
|
1310
|
|
SEQ ID NO:
ELETLTTNYQWLCTR
ENSG00000198947.10
1
|
1311
|
|
SEQ ID NO:
ELLLSGPPEVAAPDTPYLHVDSAAQR
ENSG00000138162.13
1
|
1312
|
|
SEQ ID NO:
ELQDGIGQR
ENSG00000198947.10
1
|
1313
|
|
SEQ ID NO:
EMSKKAPSEISRK
ENSG00000198947.10
1
|
1314
|
|
SEQ ID NO:
ENIRQEISIMNCLHHPK
ENSG00000065534.14
1
|
1315
|
|
SEQ ID NO:
EPMKAPLCGEGDQPGGFESQEK
ENSG00000138162.13
1
|
1316
|
|
SEQ ID NO:
EPYAREMLAISFISAVNR
ENSG00000225485.3
1
|
1317
|
|
SEQ ID NO:
ERARKFSGSGLAMGLGSASASAWRR
ENSG00000082458.7
1
|
1318
|
|
SEQ ID NO:
ERARKFSGSGLAMGLGSASASAWRR
ENSG00000082458.7
1
|
1319
|
|
SEQ ID NO:
ERVLSLSQALATEASQWHR
ENSG00000105559.7
1
|
1320
|
|
SEQ ID NO:
ESGRGSSTPPGPIAALGMPDTGPGSSSLGK
ENSG00000127084.13
1
|
1321
|
|
SEQ ID NO:
ESGSLEDDWDFLPPKK
ENSG00000179218.9
1
|
1322
|
|
SEQ ID NO:
EVARNVFECNDQVVK
ENSG00000169896.12
1
|
1323
|
|
SEQ ID NO:
EVPEEGPGAPAR
ENSG00000186635.10
1
|
1324
|
|
SEQ ID NO:
EYQEDLALR
ENSG00000125826.15
1
|
1325
|
|
SEQ ID NO:
FAGDSLK
ENSG00000151914.13
1
|
1326
|
|
SEQ ID NO:
FGPGDQVR
ENSG00000114331.8
1
|
1327
|
|
SEQ ID NO:
FGVLGLDLWQVK
ENSG00000179218.9
1
|
1328
|
|
SEQ ID NO:
FKDNPTVVVEDLR
ENSG00000114331.8
1
|
1329
|
|
SEQ ID NO:
FNGAPTANFQQDVGTK
ENSG00000073849.10
1
|
1330
|
|
SEQ ID NO:
FNHPAEAKWMK
ENSG00000019144.12
1
|
1331
|
|
SEQ ID NO:
FNRALNCMNLPPDK
ENSG00000184922.9
1
|
1332
|
|
SEQ ID NO:
FRLAEDGKR
ENSG00000132561.9
1
|
1333
|
|
SEQ ID NO:
FSAEALR
ENSG00000073849.10
1
|
1334
|
|
SEQ ID NO:
FSPEVPGQK
ENSG00000131711.10
1
|
1335
|
|
SEQ ID NO:
FTDFEEVR
ENSG00000106976.14
1
|
1336
|
|
SEQ ID NO:
FVPIIGIAMPLSSR
ENSG00000151835.9
1
|
1337
|
|
SEQ ID NO:
FWPAIDDGLRR
ENSG00000105223.14
1
|
1338
|
|
SEQ ID NO:
FWVVDQTHFYLGSANMDWR
ENSG00000105223.14
1
|
1339
|
|
SEQ ID NO:
GAAVDEYFRQPVVDTFDIR
ENSG00000142453.7
1
|
1340
|
|
SEQ ID NO:
GAFHRPVLGGFR
ENSG00000165912.11
1
|
1341
|
|
SEQ ID NO:
GAGLAWGVHDCQLCSER
ENSG00000090006.13
1
|
1342
|
|
SEQ ID NO:
GAPISAYQIVVEELHPHRT
ENSG00000152894.10
1
|
1343
|
|
SEQ ID NO:
GATGHPGGGQGAENPAGLKSQGNELFR
ENSG00000104450.8
1
|
1344
|
|
SEQ ID NO:
GCLELIKETGVPIAGR
ENSG00000100714.11
1
|
1345
|
|
SEQ ID NO:
GCPQEDSDIAFLIDGSGSIIPHDFR
ENSG00000169896.12
1
|
1346
|
|
SEQ ID NO:
GDEGPIGHQGPIGQEGAPGRPGSPGLPGMPGR
ENSG00000134871.13
1
|
1347
|
|
SEQ ID NO:
GDKGERGAPGVTGPK
ENSG00000134871.13
1
|
1348
|
|
SEQ ID NO:
GDNVLINTFSGLLK
ENSG00000142733.10
1
|
1349
|
|
SEQ ID NO:
GDNVLINTFSGLLK
ENSG00000142733.10
1
|
1350
|
|
SEQ ID NO:
GDTGNPGAPGTPGTKGWAGDSGPQGRP
ENSG00000134871.13
1
|
1351
|
|
SEQ ID NO:
GEFAIDGYSVR
ENSG00000005020.8
1
|
1352
|
|
SEQ ID NO:
GEGLYADPYGLLHEGR
ENSG00000017373.11
1
|
1353
|
|
SEQ ID NO:
GEIAPLKENVSHVNDLAR
ENSG00000198947.10
1
|
1354
|
|
SEQ ID NO:
GEWKPRQIDNPDYK
ENSG00000179218.9
1
|
1355
|
|
SEQ ID NO:
GGCVALATGSAMGLWEVK
ENSG00000011028.9
1
|
1356
|
|
SEQ ID NO:
GGHDIILAAFDNFK
ENSG00000184922.9
1
|
1357
|
|
SEQ ID NO:
GGSQPPDIDKTELVEPTEYLVVHLK
ENSG00000166825.9
1
|
1358
|
|
SEQ ID NO:
GGVSAVPGFR
ENSG00000134871.13
1
|
1359
|
|
SEQ ID NO:
GHLQIAACPNQDPLQGTTGLIPLLGIDVWEHAY
ENSG00000112096.12
1
|
1360
|
|
SEQ ID NO:
GHPDRLPLQMALTELETLAEK
ENSG00000104728.11
1
|
1361
|
|
SEQ ID NO:
GKEAGEVR
ENSG00000169896.12
1
|
1362
|
|
SEQ ID NO:
GKNVLINKDIR
ENSG00000179218.9
1
|
1363
|
|
SEQ ID NO:
GLCFLFGSNLR
ENSG00000169896.12
1
|
1364
|
|
SEQ ID NO:
GLEEAVESACAMR
ENSG00000067704.8
1
|
1365
|
|
SEQ ID NO:
GLGKYICQKCHAIIDEQPL
ENSG00000169756.12
1
|
1366
|
|
SEQ ID NO:
GNCFCYGHASECAPAPGAPAHAEGMVHGACICK
ENSG00000172037.9
1
|
1367
|
|
SEQ ID NO:
GPAPARPKMLVISGGDGYEDFRLSSGGGSSS
ENSG00000110237.3
1
|
1368
|
|
SEQ ID NO:
GPGAGSALDDGRR
ENSG00000196961.8
1
|
1369
|
|
SEQ ID NO:
GPPSSVPK
ENSG00000184922.9
1
|
1370
|
|
SEQ ID NO:
GQLQDELEKGER
ENSG00000082805.15
1
|
1371
|
|
SEQ ID NO:
GQTPEAGADKRSPRRASAAAAAGGGATGHPGG
ENSG00000104450.8
1
|
1372
|
|
SEQ ID NO:
GREPASCEDLCGGGVGADGGGSDR
ENSG00000065534.14
1
|
1373
|
|
SEQ ID NO:
GRISVSLQEEASGGSLAAPAR
ENSG00000032444.11
1
|
1374
|
|
SEQ ID NO:
GSDGMDAVRSAPTLIR
ENSG00000150672.12
1
|
1375
|
|
SEQ ID NO:
GSRPGIEGDTPR
ENSG00000113657.8
1
|
1376
|
|
SEQ ID NO:
GTISFFEIDGR
ENSG00000172977.8
1
|
1377
|
|
SEQ ID NO:
GTWIHPEIDNPEYSPD
ENSG00000179218.9
1
|
1378
|
|
SEQ ID NO:
GVTDTLAQIR
ENSG00000017373.11
1
|
1379
|
|
SEQ ID NO:
GWDCHGLPIEIK
ENSG00000067704.8
1
|
1380
|
|
SEQ ID NO:
HCELCRPFFYR
ENSG00000172037.9
1
|
1381
|
|
SEQ ID NO:
HFQIDYDEDGNCSLIISDVCGDDDAK
ENSG00000065534.14
1
|
1382
|
|
SEQ ID NO:
HGGLSLVQTTDYIYPIVDDPYMMGR
ENSG00000086475.10
1
|
1383
|
|
SEQ ID NO:
HLDTLHNFVSR
ENSG00000151914.13
1
|
1384
|
|
SEQ ID NO:
HLNPGLQLYR
ENSG00000114331.8
1
|
1385
|
|
SEQ ID NO:
HTEILEILEIPQLMDTCVR
ENSG00000213380.9
1
|
1386
|
|
SEQ ID NO:
HTLTQIKDAVR
ENSG00000146731.6
1
|
1387
|
|
SEQ ID NO:
IAALNASSTIEDDHEGSFK
ENSG00000099991.12
1
|
1388
|
|
SEQ ID NO:
IAEIQAR
ENSG00000152894.10
1
|
1389
|
|
SEQ ID NO:
IDALREELMEGMDR
ENSG00000132205.6
1
|
1390
|
|
SEQ ID NO:
IFEEQPCLRK
ENSG00000099991.12
1
|
1391
|
|
SEQ ID NO:
IFLTEQPLEGLEK
ENSG00000198947.10
1
|
1392
|
|
SEQ ID NO:
IFSAYIK
ENSG00000130429.8
1
|
1393
|
|
SEQ ID NO:
IIDRIHGTEEGQQILK
ENSG00000137497.13
1
|
1394
|
|
SEQ ID NO:
ILHKGEELAK
ENSG00000169129.10
1
|
1395
|
|
SEQ ID NO:
INELENGGEILNETRSFHHK
ENSG00000059691.7
1
|
1396
|
|
SEQ ID NO:
IPASAEQIQHLAGAIAER
ENSG00000172037.9
1
|
1397
|
|
SEQ ID NO:
IQGTLQPH
ENSG00000172037.9
1
|
1398
|
|
SEQ ID NO:
IQNQWDEVQEHLQNR
ENSG00000198947.10
1
|
1399
|
|
SEQ ID NO:
IQNVVTSFAPQRRAAWWQSENGIPA
ENSG00000172037.9
1
|
1400
|
|
SEQ ID NO:
IRQKVDDCERCR
ENSG00000011454.12
1
|
1401
|
|
SEQ ID NO:
ITEQEKLK
ENSG00000151914.13
1
|
1402
|
|
SEQ ID NO:
ITSVSTGNLCTEEQTPPPRPEAYPIPTQTYTR
ENSG00000130396.16
1
|
1403
|
|
SEQ ID NO:
IVLGGTTVHNTK
ENSG00000136631.8
1
|
1404
|
|
SEQ ID NO:
IVTTHIR
ENSG00000106976.14
1
|
1405
|
|
SEQ ID NO:
KDAEGILEDLQSYR
ENSG00000153310.14
1
|
1406
|
|
SEQ ID NO:
KDVEVTKEEFVLAAQK
ENSG00000004864.9
1
|
1407
|
|
SEQ ID NO:
KEADMQQK
ENSG00000158560.10
1
|
1408
|
|
SEQ ID NO:
KHPSSPECLVSAQK
ENSG00000137497.13
1
|
1409
|
|
SEQ ID NO:
KIQNHIQTLK
ENSG00000198947.10
1
|
1410
|
|
SEQ ID NO:
KISEESGETAKRR
ENSG00000099991.12
1
|
1411
|
|
SEQ ID NO:
KIYAVEASTMAQHAEVLVK
ENSG00000142453.7
1
|
1412
|
|
SEQ ID NO:
KKEELNAVR
ENSG00000198947.10
1
|
1413
|
|
SEQ ID NO:
KKGPGAGSALDDGR
ENSG00000196961.8
1
|
1414
|
|
SEQ ID NO:
KLMQIR
ENSG00000151914.13
1
|
1415
|
|
SEQ ID NO:
KLSSQLVEHCQK
ENSG00000198947.10
1
|
1416
|
|
SEQ ID NO:
KLTFEYR
ENSG00000119383.15
1
|
1417
|
|
SEQ ID NO:
KMEEEPLGPDLEDLKR
ENSG00000198947.10
1
|
1418
|
|
SEQ ID NO:
KMSGTVSK
ENSG00000136631.8
1
|
1419
|
|
SEQ ID NO:
KQVAPEKPVKK
ENSG00000113387.7
1
|
1420
|
|
SEQ ID NO:
KSSTGSPTSPLNAEKLESEEDVSQ
ENSG00000065534.14
1
|
1421
AF
|
|
SEQ ID NO:
KTRPDGNCFYR
ENSG00000167770.7
1
|
1422
|
|
SEQ ID NO:
KVSTLQNQR
ENSG00000169896.12
1
|
1423
|
|
SEQ ID NO:
LAGEEEALR
ENSG00000125826.15
1
|
1424
|
|
SEQ ID NO:
LCDNIVSESESTTAR
ENSG00000170776.15
1
|
1425
|
|
SEQ ID NO:
LCIEHVEEHGLDIDGIYR
ENSG00000165322.13
1
|
1426
|
|
SEQ ID NO:
LCQFEEAKQDCDQALQLADGNV
ENSG00000104450.8
1
|
1427
K
|
|
SEQ ID NO:
LDAWEEAQVEFMASHGNDAAR
ENSG00000105963.9
1
|
1428
|
|
SEQ ID NO:
LDEDLTTLGQMSK
ENSG00000110237.3
1
|
1429
|
|
SEQ ID NO:
LDLFEISQPTEDLEFHGVMR
ENSG00000130396.16
1
|
1430
|
|
SEQ ID NO:
LEAIKR
ENSG00000112096.12
1
|
1431
|
|
SEQ ID NO:
LEMLQQIANR
ENSG00000151914.13
1
|
1432
|
|
SEQ ID NO:
LESEEDVSQAFLEAVAEEKPHVK
ENSG00000065534.14
1
|
1433
|
|
SEQ ID NO:
LESEEDVSQAFLEAVAEEKPHVK
ENSG00000065534.14
1
|
1434
PY
|
|
SEQ ID NO:
LETMARNEVIADINCK
ENSG00000141447.12
1
|
1435
|
|
SEQ ID NO:
LEYNVDAANGIVMEGYLFK
ENSG00000114331.8
1
|
1436
|
|
SEQ ID NO:
LFPNSLDQTDMHGDSEYNIMFG
ENSG00000179218.9
1
|
1437
PDICGPGTKK
|
|
SEQ ID NO:
LGCTMSMR
ENSG00000059691.7
1
|
1438
|
|
SEQ ID NO:
LGIEKTDPTTLTDEEINR
ENSG00000100714.11
1
|
1439
|
|
SEQ ID NO:
LGIVNVDEAVLHFK
ENSG00000155629.10
1
|
1440
|
|
SEQ ID NO:
LGYTPLIVACHYGNVK
ENSG00000145362.12
1
|
1441
|
|
SEQ ID NO:
LHEMQIQHPTASLIAK
ENSG00000146731.6
1
|
1442
|
|
SEQ ID NO:
LHYNELGAK
ENSG00000198947.10
1
|
1443
|
|
SEQ ID NO:
LKAVQAQGGESQQEAQR
ENSG00000137497.13
1
|
1444
|
|
SEQ ID NO:
LKEDMKKIVAVPLNEQK
ENSG00000138640.10
1
|
1445
|
|
SEQ ID NO:
LKEEEEDKKR
ENSG00000179218.9
1
|
1446
|
|
SEQ ID NO:
LKELNDWLTK
ENSG00000198947.10
1
|
1447
|
|
SEQ ID NO:
LKLSFEEMER
ENSG00000162614.14
1
|
1448
|
|
SEQ ID NO:
LKLTFEELER
ENSG00000162614.14
1
|
1449
|
|
SEQ ID NO:
LKPEIQCVSAK
ENSG00000163975.7
1
|
1450
|
|
SEQ ID NO:
LLEATPTDSCGYFR
ENSG00000142733.10
1
|
1451
|
|
SEQ ID NO:
LLEATPTDSCGYFR
ENSG00000142733.10
1
|
1452
|
|
SEQ ID NO:
LLKGESALQR
ENSG00000114331.8
1
|
1453
|
|
SEQ ID NO:
LLNEGQR
ENSG00000163975.7
1
|
1454
|
|
SEQ ID NO:
LNGFQLENFTLK
ENSG00000136231.9
1
|
1455
|
|
SEQ ID NO:
LNKILK
ENSG00000067704.8
1
|
1456
|
|
SEQ ID NO:
LNREVAESPRPR
ENSG00000019144.12
1
|
1457
|
|
SEQ ID NO:
LPPSSPQKLADVAAPPGGPPPPH
ENSG00000017373.11
1
|
1458
SPYSGPPSR
|
|
SEQ ID NO:
LQDAFSAIGQNADLDLPQIAVVG
ENSG00000106976.14
1
|
1459
GQSAGK
|
|
SEQ ID NO:
LQELEGTYEENERALESK
ENSG00000172037.9
1
|
1460
|
|
SEQ ID NO:
LQQQCDDYGSSYLGVIELIGEK
ENSG00000132205.6
1
|
1461
|
|
SEQ ID NO:
LSAHTHTLSLTDINELVCGAPGD
ENSG00000172037.9
1
|
1462
APCATSPCGGAGCR
|
|
SEQ ID NO:
LSFEEMERQRR
ENSG00000162614.14
1
|
1463
|
|
SEQ ID NO:
LSGWLAQQEDAHR
ENSG00000032444.11
1
|
1464
|
|
SEQ ID NO:
LSHFEYVKNEDLEK
ENSG00000061938.12
1
|
1465
|
|
SEQ ID NO:
LSIPQLSVTDYEIM
ENSG00000198947.10
1
|
1466
|
|
SEQ ID NO:
LSIPQLSVTDYEIMEQR
ENSG00000198947.10
1
|
1467
|
|
SEQ ID NO:
LSPAYSLGSLTGASPCQSPCVQR
ENSG00000019144.12
1
|
1468
|
|
SEQ ID NO:
LSSGGGSSSETVGR
ENSG00000110237.3
1
|
1469
|
|
SEQ ID NO:
LTEEQCLFSAWLSEKEDAVNK
ENSG00000198947.10
1
|
1470
|
|
SEQ ID NO:
LVAAGGLDAVLYWCR
ENSG00000004139.9
1
|
1471
|
|
SEQ ID NO:
LVEFSAFLEQQR
ENSG00000187079.10
1
|
1472
|
|
SEQ ID NO:
LVPSVNGVR
ENSG00000100714.11
1
|
1473
|
|
SEQ ID NO:
LVTPHGESEQIGVIPSKK
ENSG00000082458.7
1
|
1474
|
|
SEQ ID NO:
LVVTQEDVELAYQEAMMNMAR
ENSG00000086475.10
1
|
1475
LNRTAAGLMH
|
|
SEQ ID NO: 1476 | MAAAEAGGDDAR | ENSG00000184207.8 | 1
SEQ ID NO: 1477 | MAVWEAEQLGGLQR | ENSG00000130589.12 | 1
SEQ ID NO: 1478 | MEALENR | ENSG00000132561.9 | 1
SEQ ID NO: 1479 | MEFDEKELRR | ENSG00000106976.14 | 1
SEQ ID NO: 1480 | MESGRGSSTPPGPIAALGMPDTGPG | ENSG00000127084.13 | 1
SEQ ID NO: 1481 | MESGRGSSTPPGPIAALGMPDTGPGSSSLGK | ENSG00000127084.13 | 1
SEQ ID NO: 1482 | MESQLK | ENSG00000082805.15 | 1
SEQ ID NO: 1483 | MGMSFGLESGK | ENSG00000114126.13 | 1
SEQ ID NO: 1484 | MGNAAGSAEQPAGPAAPPPK | ENSG00000184922.9 | 1
SEQ ID NO: 1485 | MIISTPQRLTSSGSVLIGSPYTPAPAMVTQTHIA | ENSG00000114126.13 | 1
SEQ ID NO: 1486 | MILTNPEGR | ENSG00000152894.10 | 1
SEQ ID NO: 1487 | MKAAKSGTKDGLEK | ENSG00000074964.12 | 1
SEQ ID NO: 1488 | MLEDLGFKDLTLQPR | ENSG00000125826.15 | 1
SEQ ID NO: 1489 | MNSLTLNR | ENSG00000213380.9 | 1
SEQ ID NO: 1490 | MSDKSDLKAELER | ENSG00000158560.10 | 1
SEQ ID NO: 1491 | MSGSSGGAAAPAASSGPAAAASAAGSGCGGGA | ENSG00000038382.13 | 1
SEQ ID NO: 1492 | MSKSLGNVIHP | ENSG00000067704.8 | 1
SEQ ID NO: 1493 | MVSTSATDEPR | ENSG00000032444.11 | 1
SEQ ID NO: 1494 | NANSSPVASTTPSASATTNPASATTLDQSKA | ENSG00000166825.9 | 1
SEQ ID NO: 1495 | NATLVNEADKLR | ENSG00000166825.9 | 1
SEQ ID NO: 1496 | NAVLEHMEELQEQVALLTER | ENSG00000184922.9 | 1
SEQ ID NO: 1497 | NDKSYWLSTTAPLPMMPVAEDEIKPYISR | ENSG00000134871.13 | 1
SEQ ID NO: 1498 | NFVKEAEEISSNRR | ENSG00000213380.9 | 1
SEQ ID NO: 1499 | NILVSDMEMNEQQE | ENSG00000011028.9 | 1
SEQ ID NO: 1500 | NLAATLQDIETK | ENSG00000019144.12 | 1
SEQ ID NO: 1501 | NLEELYLVGSLSHDISR | ENSG00000171488.10 | 1
SEQ ID NO: 1502 | NLLEVSEVEQELACQNDHSSALQNIKR | ENSG00000136631.8 | 1
SEQ ID NO: 1503 | NLVGSGSEIQFLSEAQDDPQKR | ENSG00000115652.10 | 1
SEQ ID NO: 1504 | NRTEAEVKR | ENSG00000169129.10 | 1
SEQ ID NO: 1505 | NSLSVLSPK | ENSG00000171488.10 | 1
SEQ ID NO: 1506 | NTSAASTAQLVEATEELRR | ENSG00000172037.9 | 1
SEQ ID NO: 1507 | NVQVFLISGGFR | ENSG00000146733.9 | 1
SEQ ID NO: 1508 | NYPSSLCALCVGDEQGR | ENSG00000163975.7 | 1
SEQ ID NO: 1509 | PCPCPEGPGSQR | ENSG00000172037.9 | 1
SEQ ID NO: 1510 | PCQDVDECAR | ENSG00000090006.13 | 1
SEQ ID NO: 1511 | PDENLKSASKEELKK | ENSG00000065534.14 | 1
SEQ ID NO: 1512 | PEAYQVPASYQPDEEERAR | ENSG00000125826.15 | 1
SEQ ID NO: 1513 | PEGEMKPGR | ENSG00000113387.7 | 1
SEQ ID NO: 1514 | PETPYSGPGLLIDSLVLLPR | ENSG00000172037.9 | 1
SEQ ID NO: 1515 | PEVVWFK | ENSG00000065534.14 | 1
SEQ ID NO: 1516 | PGAGAVEVAMAEALIK | ENSG00000146731.6 | 1
SEQ ID NO: 1517 | PGEMGPQGPPGEPGFRGAPGK | ENSG00000134871.13 | 1
SEQ ID NO: 1518 | PGETPSWTGSGFVR | ENSG00000172037.9 | 1
SEQ ID NO: 1519 | PGFHGQAAR | ENSG00000172037.9 | 1
SEQ ID NO: 1520 | PGHVGQMGPVGAPGRPGPPGPPGPK | ENSG00000134871.13 | 1
SEQ ID NO: 1521 | PILPHLAEEVFQHIPYIK | ENSG00000067704.8 | 1
SEQ ID NO: 1522 | PKIDDVLHTLTGAMSLLRR | ENSG00000130396.16 | 1
SEQ ID NO: 1523 | PKMLVISGGDGYEDFR | ENSG00000110237.3 | 1
SEQ ID NO: 1524 | PPDIDKTELVEPTEYLVVHLK | ENSG00000166825.9 | 1
SEQ ID NO: 1525 | PPKPATPDFR | ENSG00000065534.14 | 1
SEQ ID NO: 1526 | PPVIQNPEYK | ENSG00000179218.9 | 1
SEQ ID NO: 1527 | PPVLGTESDATVK | ENSG00000065534.14 | 1
SEQ ID NO: 1528 | PQLLGVAPEK | ENSG00000004864.9 | 1
SEQ ID NO: 1529 | PRMSAQEQLERMR | ENSG00000105559.7 | 1
SEQ ID NO: 1530 | PSGPATAEDPGRRPVLPQR | ENSG00000132205.6 | 1
SEQ ID NO: 1531 | PTPRPVPMKRHIFR | ENSG00000186635.10 | 1
SEQ ID NO: 1532 | PVAGSELPR | ENSG00000176890.11 | 1
SEQ ID NO: 1533 | PYWCISR | ENSG00000067704.8 | 1
SEQ ID NO: 1534 | QAASPLEPK | ENSG00000137497.13 | 1
SEQ ID NO: 1535 | QAEEVNTEWEK | ENSG00000198947.10 | 1
SEQ ID NO: 1536 | QAEGLSEDGAAMAVEPTQIQLSK | ENSG00000198947.10 | 1
SEQ ID NO: 1537 | QAPSSFQLLYDLK | ENSG00000100714.11 | 1
SEQ ID NO: 1538 | QAQLEKELSAALQDKK | ENSG00000137497.13 | 1
SEQ ID NO: 1539 | QAQVNLTVVDKPD | ENSG00000065534.14 | 1
SEQ ID NO: 1540 | QDCDQALQLADGNVK | ENSG00000104450.8 | 1
SEQ ID NO: 1541 | QEMVIEVKAIGGKK | ENSG00000110237.3 | 1
SEQ ID NO: 1542 | QETPPPRSPPVANSGSTGFSRRGSGRGGGPTP | ENSG00000105559.7 | 1
SEQ ID NO: 1543 | QGPMTQAINR | ENSG00000170776.15 | 1
SEQ ID NO: 1544 | QHEVEEATNILTATR | ENSG00000114331.8 | 1
SEQ ID NO: 1545 | QIASLTGLVQSALLR | ENSG00000017373.11 | 1
SEQ ID NO: 1546 | QICSQLSER | ENSG00000011454.12 | 1
SEQ ID NO: 1547 | QKASGDSAR | ENSG00000004864.9 | 1
SEQ ID NO: 1548 | QKMEEEKRRTEEER | ENSG00000162614.14 | 1
SEQ ID NO: 1549 | QLELACETQEEVDSWK | ENSG00000106976.14 | 1
SEQ ID NO: 1550 | QLNETGGPVLVSAPISPEEQDKLENK | ENSG00000198947.10 | 1
SEQ ID NO: 1551 | QLPKPNQDTMQILFR | ENSG00000165322.13 | 1
SEQ ID NO: 1552 | QLQTLAPK | ENSG00000105223.14 | 1
SEQ ID NO: 1553 | QNGDSAYLYLLSAR | ENSG00000125826.15 | 1
SEQ ID NO: 1554 | QPDVEEILSK | ENSG00000198947.10 | 1
SEQ ID NO: 1555 | QQNLAVSESPVTPSALAELLDLLDSR | ENSG00000059691.7 | 1
SEQ ID NO: 1556 | QQQMHIVDMLSK | ENSG00000130396.16 | 1
SEQ ID NO: 1557 | QSSHNFQLESVNK | ENSG00000135052.12 | 1
SEQ ID NO: 1558 | QTLLAESEALTSYSHR | ENSG00000167608.7 | 1
SEQ ID NO: 1559 | QTSVADLLASFNDQSTSDYLVVYLR | ENSG00000167770.7 | 1
SEQ ID NO: 1560 | QVFGQTTIHQHIPFNWDSEFVQLHFGK | ENSG00000004864.9 | 1
SEQ ID NO: 1561 | QVVQDLLK | ENSG00000141447.12 | 1
SEQ ID NO: 1562 | RASAAAAAGGGATGHPGGGQGAENPAGLK | ENSG00000104450.8 | 1
SEQ ID NO: 1563 | RCDLCAPGYYGFGPTGCQACQCSHEGALSSLCEK | ENSG00000172037.9 | 1
SEQ ID NO: 1564 | RCEQVQPGYFR | ENSG00000172037.9 | 1
SEQ ID NO: 1565 | RDNEVDGQDYHFVVSR | ENSG00000082458.7 | 1
SEQ ID NO: 1566 | RDPSSNDINGGMEPTPSTVSTPSPSADLLGLR | ENSG00000196961.8 | 1
SEQ ID NO: 1567 | REMAAASAAAISGAGR | ENSG00000079616.8 | 1
SEQ ID NO: 1568 | RETLFTLDDQALGPELTAPAPEPPAEEPR | ENSG00000213380.9 | 1
SEQ ID NO: 1569 | RFSTEYELQQLEQFK | ENSG00000166825.9 | 1
SEQ ID NO: 1570 | RGSDELTVPRYR | ENSG00000017373.11 | 1
SEQ ID NO: 1571 | RIEGSGDQIDTYELSGGAR | ENSG00000106976.14 | 1
SEQ ID NO: 1572 | RKEEEEAEDK | ENSG00000179218.9 | 1
SEQ ID NO: 1573 | RLDIDEKPLVVQLNWNKDDR | ENSG00000130396.16 | 1
SEQ ID NO: 1574 | RPPEPEKAPPAAPTRPSALELK | ENSG00000184922.9 | 1
SEQ ID NO: 1575 | RPRPQGRSVSEPR | ENSG00000125744.7 | 1
SEQ ID NO: 1576 | RQAEGLSEDGAAMAVEPTQIQLSK | ENSG00000198947.10 | 1
SEQ ID NO: 1577 | RRKVPPSGSGGSELSNGEAGEAYR | ENSG00000110237.3 | 1
SEQ ID NO: 1578 | RSLELQTRTEEEKK | ENSG00000127084.13 | 1
SEQ ID NO: 1579 | RSSYLLAITTERSK | ENSG00000225485.3 | 1
SEQ ID NO: 1580 | RVAAQVDGGAQVQQVLNIECLR | ENSG00000196961.8 | 1
SEQ ID NO: 1581 | SAEESDRLR | ENSG00000130396.16 | 1
SEQ ID NO: 1582 | SCDCDPMGSQDGGR | ENSG00000172037.9 | 1
SEQ ID NO: 1583 | SDVLETVVLINPSDEAVSTEVR | ENSG00000131711.10 | 1
SEQ ID NO: 1584 | SEDYELLCPNGAR | ENSG00000163975.7 | 1
SEQ ID NO: 1585 | SFGSSLMESEVNLDR | ENSG00000198947.10 | 1
SEQ ID NO: 1586 | SGHDQVVELLLERGAPLLAR | ENSG00000145362.12 | 1
SEQ ID NO: 1587 | SGLTSLHLAAQEDKVNVADILTK | ENSG00000145362.12 | 1
SEQ ID NO: 1588 | SGRPSCLYSAARPSGSYR | ENSG00000124831.14 | 1
SEQ ID NO: 1589 | SGTIFDNFLITNDEA | ENSG00000179218.9 | 1
SEQ ID NO: 1590 | SGTLALVEPLVASLDPGR | ENSG00000004139.9 | 1
SEQ ID NO: 1591 | SKIVGAPMHDLLLWNNATVTTCHSK | ENSG00000100714.11 | 1
SEQ ID NO: 1592 | SKPEDWDER | ENSG00000179218.9 | 1
SEQ ID NO: 1593 | SLEGSDDAVLLQRRLDNMNFKWSELR | ENSG00000198947.10 | 1
SEQ ID NO: 1594 | SLNPEQWSQLK | ENSG00000113387.7 | 1
SEQ ID NO: 1595 | SLSDPSRRGELAGPGFEGPGGEPIREV | ENSG00000110237.3 | 1
SEQ ID NO: 1596 | SNRDELELELAENR | ENSG00000137497.13 | 1
SEQ ID NO: 1597 | SPARPQPGEGPGGPGGPPEVSR | ENSG00000105559.7 | 1
SEQ ID NO: 1598 | SPARPQPGEGPGGPGGPPEVSR | ENSG00000105559.7 | 1
SEQ ID NO: 1599 | SPDTTLSPASTTSSGVSEESTTSHSR | ENSG00000205277.5 | 1
SEQ ID NO: 1600 | SPDTTLSPASTTSSGVSEESTTSHSR | ENSG00000205277.5 | 1
SEQ ID NO: 1601 | SPDTTLSPASTTSSGVSEESTTSHSR | ENSG00000205277.5 | 1
SEQ ID NO: 1602 | SPFPSQHLEAPEDK | ENSG00000198947.10 | 1
SEQ ID NO: 1603 | SPGPPQVDGTPTMSLERPPR | ENSG00000155629.10 | 1
SEQ ID NO: 1604 | SPTTTLSPASMTSLGVGEESTTSR | ENSG00000205277.5 | 1
SEQ ID NO: 1605 | SPTTTLSPASMTSLGVGEESTTSR | ENSG00000205277.5 | 1
SEQ ID NO: 1606 | SPTTTLSPASMTSLGVGEESTTSR | ENSG00000205277.5 | 1
SEQ ID NO: 1607 | SPTTTLSPASMTSLGVGEESTTSR | ENSG00000205277.5 | 1
SEQ ID NO: 1608 | SQAYADYIGFILTLNEGVK | ENSG00000119383.15 | 1
SEQ ID NO: 1609 | SQMNCNLGTCQLQR | ENSG00000205277.5 | 1
SEQ ID NO: 1610 | SRQELNTIASKPPR | ENSG00000169896.12 | 1
SEQ ID NO: 1611 | SSHVTIDTLK | ENSG00000163975.7 | 1
SEQ ID NO: 1612 | SSQNDSPGDASEGPEYLAIGNLDPRGR | ENSG00000145016.9 | 1
SEQ ID NO: 1613 | STEYELQQLEQFKK | ENSG00000166825.9 | 1
SEQ ID NO: 1614 | STSFNVQDLLPDHEYKFR | ENSG00000065534.14 | 1
SEQ ID NO: 1615 | SVEQEVVQSQLNHCVNLYK | ENSG00000198947.10 | 1
SEQ ID NO: 1616 | SVYTMPLANHR | ENSG00000090006.13 | 1
SEQ ID NO: 1617 | SWAEDEKQKAETVQAALEEAQR | ENSG00000172037.9 | 1
SEQ ID NO: 1618 | SWCSGHLHLRCPR | ENSG00000032444.11 | 1
SEQ ID NO: 1619 | SYVDTGGVSR | ENSG00000184922.9 | 1
SEQ ID NO: 1620 | SYVITGSWNPK | ENSG00000011454.12 | 1
SEQ ID NO: 1621 | TAIWEDQNLR | ENSG00000205277.5 | 1
SEQ ID NO: 1622 | TALLTAGDIYLLSTFR | ENSG00000169231.9 | 1
SEQ ID NO: 1623 | TEALMDAQKEDFNSK | ENSG00000172037.9 | 1
SEQ ID NO: 1624 | TEFCLHDGPPYANGDPHVGHALNK | ENSG00000067704.8 | 1
SEQ ID NO: 1625 | TESSGGWQNR | ENSG00000011028.9 | 1
SEQ ID NO: 1626 | THIESSGHGVDTCLHVVLSSKVCR | ENSG00000019144.12 | 1
SEQ ID NO: 1627 | TKVHAELADVLTEAVVDSILAIKK | ENSG00000146731.6 | 1
SEQ ID NO: 1628 | TLEIALEQKKEECLK | ENSG00000082805.15 | 1
SEQ ID NO: 1629 | TLNATGEEIIQQSSK | ENSG00000198947.10 | 1
SEQ ID NO: 1630 | TLPSMVHR | ENSG00000101199.8 | 1
SEQ ID NO: 1631 | TMNGDMR | ENSG00000120549.11 | 1
SEQ ID NO: 1632 | TNHIGWVQEFLNEENR | ENSG00000184922.9 | 1
SEQ ID NO: 1633 | TNIQLPACLR | ENSG00000213380.9 | 1
SEQ ID NO: 1634 | TPDELQK | ENSG00000198947.10 | 1
SEQ ID NO: 1635 | TPLERDDLHESVFR | ENSG00000151914.13 | 1
SEQ ID NO: 1636 | TSGNQDEILVIR | ENSG00000106976.14 | 1
SEQ ID NO: 1637 | TTLSPASSTSPGLQGESTAFQTHPASTHTTPSPPSTATAPVEESTTYHR | ENSG00000205277.5 | 1
SEQ ID NO: 1638 | TTLSPASSTSPGLQGESTAFQTHPASTHTTPSPPSTATAPVEESTTYHR | ENSG00000205277.5 | 1
SEQ ID NO: 1639 | TTLSPASSTSPGLQGESTAFQTHPASTHTTPSPPSTATAPVEESTTYHR | ENSG00000205277.5 | 1
SEQ ID NO: 1640 | TTQGLTALLLSLKK | ENSG00000136631.8 | 1
SEQ ID NO: 1641 | TTQIINITMTK | ENSG00000137497.13 | 1
SEQ ID NO: 1642 | TWVQQSETK | ENSG00000198947.10 | 1
SEQ ID NO: 1643 | VAIGPSVLNAAR | ENSG00000067704.8 | 1
SEQ ID NO: 1644 | VAYIPDEMAAQQNPLQQPR | ENSG00000136231.9 | 1
SEQ ID NO: 1645 | VDSDMNDAYLGYAAAIILR | ENSG00000169896.12 | 1
SEQ ID NO: 1646 | VEDAYILTCNVSLEYEK | ENSG00000146731.6 | 1
SEQ ID NO: 1647 | VGAPMHDLLLWNNATVTTCHSK | ENSG00000100714.11 | 1
SEQ ID NO: 1648 | VHLFDIITQYR | ENSG00000213380.9 | 1
SEQ ID NO: 1649 | VIECFNVESR | ENSG00000104728.11 | 1
SEQ ID NO: 1650 | VLGHFEKPLFLELCR | ENSG00000032444.11 | 1
SEQ ID NO: 1651 | VLMDLQNQK | ENSG00000198947.10 | 1
SEQ ID NO: 1652 | VLTTSPSR | ENSG00000019144.12 | 1
SEQ ID NO: 1653 | VMLPPGAQHSDEK | ENSG00000130396.16 | 1
SEQ ID NO: 1654 | VNFRPRYVTRYKTVTQLEWRCCPGFRGGDCQEGPK | ENSG00000132205.6 | 1
SEQ ID NO: 1655 | VPDMAEIQSR | ENSG00000032444.11 | 1
SEQ ID NO: 1656 | VQLLSQYDNEK | ENSG00000184922.9 | 1
SEQ ID NO: 1657 | VSRASSPEGRHLPSPQLGTK | ENSG00000105559.7 | 1
SEQ ID NO: 1658 | VTCTGYHQVR | ENSG00000133316.11 | 1
SEQ ID NO: 1659 | VTEFDAAR | ENSG00000136631.8 | 1
SEQ ID NO: 1660 | VVQEENQHMQMTIQALQDELR | ENSG00000082805.15 | 1
SEQ ID NO: 1661 | VYLDLTPVK | ENSG00000169129.10 | 1
SEQ ID NO: 1662 | WCATSDPEQHK | ENSG00000163975.7 | 1
SEQ ID NO: 1663 | WFSIQNNQLVYQK | ENSG00000114331.8 | 1
SEQ ID NO: 1664 | WIEFCQLLSER | ENSG00000198947.10 | 1
SEQ ID NO: 1665 | WYQNPDYNFFNNYK | ENSG00000073849.10 | 1
SEQ ID NO: 1666 | YADSLKPNIPYK | ENSG00000130396.16 | 1
SEQ ID NO: 1667 | YENHSATAESSR | ENSG00000152894.10 | 1
SEQ ID NO: 1668 | YLITATLTPER | ENSG00000132205.6 | 1
SEQ ID NO: 1669 | YLQQPGCLLVGTNMDNR | ENSG00000184207.8 | 1
SEQ ID NO: 1670 | YLRELSGSGLER | ENSG00000213380.9 | 1
SEQ ID NO: 1671 | YLSASEYGSSVDGHPEVPETK | ENSG00000169129.10 | 1
SEQ ID NO: 1672 | YNASSQQQR | ENSG00000165322.13 | 1
SEQ ID NO: 1673 | YQETMSAIR | ENSG00000198947.10 | 1
SEQ ID NO: 1674 | YSFWLTTIPEQSFQGSPSADTLK | ENSG00000134871.13 | 1
SEQ ID NO: 1675 | YTKQGFGNLPICMAK | ENSG00000100714.11 | 1
SEQ ID NO: 1676 | YVPAIAHLIHSLN | ENSG00000106066.9 | 1
SEQ ID NO: 1677 | AAECLDVDECHRVPPPCDLGR | ENSG00000090006.13 | 0
SEQ ID NO: 1678 | AEGGKRPAR | ENSG00000104450.8 | 0
SEQ ID NO: 1679 | AEPVWTPPAPAPAAPPSTPAAPK | ENSG00000115310.13 | 0
SEQ ID NO: 1680 | AFLCPLICHNGGVCVKPDR | ENSG00000090006.13 | 0
SEQ ID NO: 1681 | AHLIHSLNPVR | ENSG00000106066.9 | 0
SEQ ID NO: 1682 | AIAHLIHSLNPVR | ENSG00000106066.9 | 0
SEQ ID NO: 1683 | AIWNVINW | ENSG00000112096.12 | 0
SEQ ID NO: 1684 | AIWNVINWENV | ENSG00000112096.12 | 0
SEQ ID NO: 1685 | ANGITMYAVGVGK | ENSG00000132561.9 | 0
SEQ ID NO: 1686 | AQPVPFVPQVLGVMIGAGVAVVVTAVLILLVVRR | ENSG00000032444.11 | 0
SEQ ID NO: 1687 | ARILTAAR | ENSG00000004139.9 | 0
SEQ ID NO: 1688 | AVGPGAGGAGSAVPGGAGPCATVSVFPGAR | ENSG00000142453.7 | 0
SEQ ID NO: 1689 | AYDNFGVLGLDLWQVK | ENSG00000179218.9 | 0
SEQ ID NO: 1690 | CVCPAGFR | ENSG00000090006.13 | 0
SEQ ID NO: 1691 | CVHGPTGSR | ENSG00000090006.13 | 0
SEQ ID NO: 1692 | CVPPRTSAGTFPGSQPQAPASPVLPAR | ENSG00000090006.13 | 0
SEQ ID NO: 1693 | DHPSSHSAQPPR | ENSG00000138162.13 | 0
SEQ ID NO: 1694 | DKERLQAMMTHLHVKSTEPK | ENSG00000114861.14 | 0
SEQ ID NO: 1695 | DLDNAEEKADALNK | ENSG00000011454.12 | 0
SEQ ID NO: 1696 | DLYSALIQFFQIFPEYK | ENSG00000106066.9 | 0
SEQ ID NO: 1697 | DPASDKLLGPAGLTWERNLPGAGVGKEMAGVPPTLR | ENSG00000138162.13 | 0
SEQ ID NO: 1698 | DSAVMDDSVVIPSHQVSTLAK | ENSG00000145362.12 | 0
SEQ ID NO: 1699 | DSSTPYQEIAAVPSAGR | ENSG00000138162.13 | 0
SEQ ID NO: 1700 | DWDSPYSHDLDT | ENSG00000105223.14 | 0
SEQ ID NO: 1701 | DWDSPYSHDLDTS | ENSG00000105223.14 | 0
SEQ ID NO: 1702 | EDLDQSPLVSSSDSPPRPQPAFK | ENSG00000115310.13 | 0
SEQ ID NO: 1703 | EESREPAPASPAPA | ENSG00000113657.8 | 0
SEQ ID NO: 1704 | ELSSKGVK | ENSG00000176890.11 | 0
SEQ ID NO: 1705 | EMELRRQALEEERR | ENSG00000019144.12 | 0
SEQ ID NO: 1706 | ENGTVPK | ENSG00000165322.13 | 0
SEQ ID NO: 1707 | ENKEVVLQWFTENSK | ENSG00000166825.9 | 0
SEQ ID NO: 1708 | EVAESPRPR | ENSG00000019144.12 | 0
SEQ ID NO: 1709 | FILDNLK | ENSG00000151835.9 | 0
SEQ ID NO: 1710 | FLEAVAEEKPHVKPYFSK | ENSG00000065534.14 | 0
SEQ ID NO: 1711 | FPIEGGQKDPK | ENSG00000107957.12 | 0
SEQ ID NO: 1712 | FSTEYELQQLEQFKKDNEETGFGSGTR | ENSG00000166825.9 | 0
SEQ ID NO: 1713 | FWPAIDDGLR | ENSG00000105223.14 | 0
SEQ ID NO: 1714 | FYIDFGGVKPMGSEPVPKSR | ENSG00000004864.9 | 0
SEQ ID NO: 1715 | GADLIEEAASRIVDAVIEQVKAAGALLTEGE | ENSG00000170776.15 | 0
SEQ ID NO: 1716 | GADYAEPTWNLK | ENSG00000166825.9 | 0
SEQ ID NO: 1717 | GDEEKDKGLQTSQDAR | ENSG00000179218.9 | 0
SEQ ID NO: 1718 | GDILQTPQFQMR | ENSG00000137497.13 | 0
SEQ ID NO: 1719 | GDNLPQYR | ENSG00000205277.5 | 0
SEQ ID NO: 1720 | GNEAVASR | ENSG00000135052.12 | 0
SEQ ID NO: 1721 | GPNKHTLTQIKDAVR | ENSG00000146731.6 | 0
SEQ ID NO: 1722 | GQGPMFLDADFVAFTNHFK | ENSG00000198947.10 | 0
SEQ ID NO: 1723 | GTATPELHTATDYR | ENSG00000170776.15 | 0
SEQ ID NO: 1724 | GWAGDSGPQGRPGVFGLPGEK | ENSG00000134871.13 | 0
SEQ ID NO: 1725 | GYLAPSGDLSLRR | ENSG00000090006.13 | 0
SEQ ID NO: 1726 | HAEQQALR | ENSG00000142453.7 | 0
SEQ ID NO: 1727 | IEDPSLLNSR | ENSG00000032444.11 | 0
SEQ ID NO: 1728 | IFMEEVPGGSLSSLLRS | ENSG00000142733.10 | 0
SEQ ID NO: 1729 | IFMEEVPGGSLSSLLRS | ENSG00000142733.10 | 0
SEQ ID NO: 1730 | IIEVAPQVATQNVNPTPGAT | ENSG00000086475.10 | 0
SEQ ID NO: 1731 | ILNSDQTTCR | ENSG00000132561.9 | 0
SEQ ID NO: 1732 | ISCWGHSEPSMR | ENSG00000105223.14 | 0
SEQ ID NO: 1733 | IVVHSVENMNFR | ENSG00000184922.9 | 0
SEQ ID NO: 1734 | KAVAHMK | ENSG00000132561.9 | 0
SEQ ID NO: 1735 | KDITAALAAER | ENSG00000106976.14 | 0
SEQ ID NO: 1736 | KDNEETGFGSGTR | ENSG00000166825.9 | 0
SEQ ID NO: 1737 | KHQGHFLLGTLSR | ENSG00000061938.12 | 0
SEQ ID NO: 1738 | KIAEIQARR | ENSG00000152894.10 | 0
SEQ ID NO: 1739 | KKEADMQQK | ENSG00000158560.10 | 0
SEQ ID NO: 1740 | KLFGGPGSRR | ENSG00000110237.3 | 0
SEQ ID NO: 1741 | KPAAGLSAAPVPTAPAAGAP | ENSG00000115310.13 | 0
SEQ ID NO: 1742 | KSSTGSPTSPLNAEKLESEEDVSQA | ENSG00000065534.14 | 0
SEQ ID NO: 1743 | KVVATTQMQAADARK | ENSG00000166825.9 | 0
SEQ ID NO: 1744 | LADSDQASKVQQQK | ENSG00000137497.13 | 0
SEQ ID NO: 1745 | LAYVSCVR | ENSG00000032444.11 | 0
SEQ ID NO: 1746 | LGIVQGIVGARNTSAASTAQLVEATEELRREIG | ENSG00000172037.9 | 0
SEQ ID NO: 1747 | LHYNELGAKVTERKQQ | ENSG00000198947.10 | 0
SEQ ID NO: 1748 | LIEVGPSGAQFLGK | ENSG00000145362.12 | 0
SEQ ID NO: 1749 | LKQTNLQWIK | ENSG00000198947.10 | 0
SEQ ID NO: 1750 | LKTVFYR | ENSG00000104728.11 | 0
SEQ ID NO: 1751 | LLISCWGHSEPSMR | ENSG00000105223.14 | 0
SEQ ID NO: 1752 | LMFDRSEVYGPMK | ENSG00000166825.9 | 0
SEQ ID NO: 1753 | LMLEWQFQK | ENSG00000130396.16 | 0
SEQ ID NO: 1754 | LPAAPPVAPER | ENSG00000115310.13 | 0
SEQ ID NO: 1755 | LPPVLGTESDATVK | ENSG00000065534.14 | 0
SEQ ID NO: 1756 | LPQEPGR | ENSG00000135052.12 | 0
SEQ ID NO: 1757 | LQGQDSERVRAWQR | ENSG00000165912.11 | 0
SEQ ID NO: 1758 | LSRKGGHER | ENSG00000019144.12 | 0
SEQ ID NO: 1759 | LTELENELNTK | ENSG00000130396.16 | 0
SEQ ID NO: 1760 | LTGKAEGGK | ENSG00000104450.8 | 0
SEQ ID NO: 1761 | LWEAVKRR | ENSG00000061938.12 | 0
SEQ ID NO: 1762 | LWHLDPDTEYEIR | ENSG00000152894.10 | 0
SEQ ID NO: 1763 | LYGVVLTPPMK | ENSG00000061938.12 | 0
SEQ ID NO: 1764 | MELEEVTRLLNLKDK | ENSG00000104450.8 | 0
SEQ ID NO: 1765 | MIEDSGPGMKVLL | ENSG00000136631.8 | 0
SEQ ID NO: 1766 | MPVAGSELPR | ENSG00000176890.11 | 0
SEQ ID NO: 1767 | NFVLVLSPGALDK | ENSG00000004139.9 | 0
SEQ ID NO: 1768 | NIMFGPDICGPGTK | ENSG00000179218.9 | 0
SEQ ID NO: 1769 | NITIIVEDPIAESCNDKAKLRGPL | ENSG00000145016.9 | 0
SEQ ID NO: 1770 | NPKAEVARAQAALAVNISAARGLQDVLRTNLGPK | ENSG00000146731.6 | 0
SEQ ID NO: 1771 | NQVTQLK | ENSG00000100714.11 | 0
SEQ ID NO: 1772 | NVINWENVTER | ENSG00000112096.12 | 0
SEQ ID NO: 1773 | PGHYDILYK | ENSG00000167770.7 | 0
SEQ ID NO: 1774 | PGSPGLPGMPGR | ENSG00000134871.13 | 0
SEQ ID NO: 1775 | PLEEGLNKAIHYFR | ENSG00000115652.10 | 0
SEQ ID NO: 1776 | PLSTRVPR | ENSG00000132561.9 | 0
SEQ ID NO: 1777 | PSAGFLPTHR | ENSG00000090006.13 | 0
SEQ ID NO: 1778 | PSGPQPQADLQALLQSGAQVR | ENSG00000105223.14 | 0
SEQ ID NO: 1779 | PSSSGSTGTKLSPARSTTSGLVGESTPSR | ENSG00000205277.5 | 0
SEQ ID NO: 1780 | PSSSGSTGTKLSPARSTTSGLVGESTPSR | ENSG00000205277.5 | 0
SEQ ID NO: 1781 | QGYILNSDQTTCR | ENSG00000132561.9 | 0
SEQ ID NO: 1782 | QVFEELWK | ENSG00000059691.7 | 0
SEQ ID NO: 1783 | QVKPKTVSEEERKV | ENSG00000065534.14 | 0
SEQ ID NO: 1784 | QYISKMIEDSGPGMK | ENSG00000136631.8 | 0
SEQ ID NO: 1785 | QYMPWEAALSSLSYFK | ENSG00000166825.9 | 0
SEQ ID NO: 1786 | RADVLAFPSSGFTDLAEIVSR | ENSG00000032444.11 | 0
SEQ ID NO: 1787 | RAVAAQPGRKR | ENSG00000172977.8 | 0
SEQ ID NO: 1788 | RDEGSQDQTGSLSRARPSSR | ENSG00000110237.3 | 0
SEQ ID NO: 1789 | RDPEVGKDELSKPSSDAESR | ENSG00000138162.13 | 0
SEQ ID NO: 1790 | RMQSSADLIIQEFMDLRTR | ENSG00000151914.13 | 0
SEQ ID NO: 1791 | SASFEPFSNK | ENSG00000179218.9 | 0
SEQ ID NO: 1792 | SDQIGLPDFNAGAMENWGLVTYR | ENSG00000166825.9 | 0
SEQ ID NO: 1793 | SFACQCPEGHVLR | ENSG00000132561.9 | 0
SEQ ID NO: 1794 | SFLKLILQVEKWQEECEEGEGRTIIHCLNGGGR | ENSG00000152894.10 | 0
SEQ ID NO: 1795 | SFPAAQIPIAVEEPGSSSRESVSKAGMPVSADAAK | ENSG00000138162.13 | 0
SEQ ID NO: 1796 | SFTQGEGAR | ENSG00000132561.9 | 0
SEQ ID NO: 1797 | SFTQGEGARPLSTR | ENSG00000132561.9 | 0
SEQ ID NO: 1798 | SHTLSHASYLR | ENSG00000145362.12 | 0
SEQ ID NO: 1799 | SLEQLQK | ENSG00000137497.13 | 0
SEQ ID NO: 1800 | SPHTTLSPAGSTTR | ENSG00000205277.5 | 0
SEQ ID NO: 1801 | SPHTTLSPAGSTTR | ENSG00000205277.5 | 0
SEQ ID NO: 1802 | SPHTTLSPAGSTTR | ENSG00000205277.5 | 0
SEQ ID NO: 1803 | SPHTTLSPAGSTTR | ENSG00000205277.5 | 0
SEQ ID NO: 1804 | SQTLIDLNR | ENSG00000059691.7 | 0
SEQ ID NO: 1805 | SSHNFQLESVNK | ENSG00000135052.12 | 0
SEQ ID NO: 1806 | STCAPSPQR | ENSG00000138162.13 | 0
SEQ ID NO: 1807 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1808 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1809 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1810 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1811 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1812 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1813 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1814 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1815 | STTFYSSPR | ENSG00000205277.5 | 0
SEQ ID NO: 1816 | TATAGAISELTESRLR | ENSG00000128487.12 | 0
SEQ ID NO: 1817 | TEVAIGPSVLNAAR | ENSG00000067704.8 | 0
SEQ ID NO: 1818 | TGDPQETLRR | ENSG00000137497.13 | 0
SEQ ID NO: 1819 | THLSLSHNPEQKGVPTGFILPIRDIR | ENSG00000100714.11 | 0
SEQ ID NO: 1820 | THTATGIR | ENSG00000169896.12 | 0
SEQ ID NO: 1821 | TLATQLNQQK | ENSG00000151914.13 | 0
SEQ ID NO: 1822 | TPVPEKVPPPKPATPDF | ENSG00000065534.14 | 0
SEQ ID NO: 1823 | TVQQPTVQHR | ENSG00000132561.9 | 0
SEQ ID NO: 1824 | TYQGFWNPPLAPR | ENSG00000152894.10 | 0
SEQ ID NO: 1825 | VLCGDAGLLRGLADGLVQAGVGTEALLTPLVGRLARL | ENSG00000142733.10 | 0
SEQ ID NO: 1826 | VLCGDAGLLRGLADGLVQAGVGTEALLTPLVGRLARL | ENSG00000142733.10 | 0
SEQ ID NO: 1827 | VNYDEENWRK | ENSG00000166825.9 | 0
SEQ ID NO: 1828 | VPEGFTCR | ENSG00000090006.13 | 0
SEQ ID NO: 1829 | WSELRKKSLNIR | ENSG00000198947.10 | 0
SEQ ID NO: 1830 | WSSRGSGGWGVYRSPSFGAGEGLLR | ENSG00000110237.3 | 0
SEQ ID NO: 1831 | WYQPSFHGVDLSALR | ENSG00000142453.7 | 0
SEQ ID NO: 1832 | YCNPGDVCYYASR | ENSG00000134871.13 | 0
SEQ ID NO: 1833 | YGNLGHVNIGAIQEPLAFILPK | ENSG00000213380.9 | 0
SEQ ID NO: 1834 | YITISGNR | ENSG00000151914.13 | 0
SEQ ID NO: 1835 | YLSYTLNPDLIRK | ENSG00000166825.9 | 0
SEQ ID NO: 1836 | YMVTER | ENSG00000105223.14 | 0
To examine possible functions of somatic promoters in cancer development, we focused on RASA3, a RAS GTPase-activating protein required for Gαi-induced inhibition of mitogen-activated protein kinases. In both GCs (50%) and GC lines, we observed gain of promoter activity at an intronic region 127 kb downstream of the canonical RASA3 TSS (FIG. 3c, top, FIG. 10). RNA-seq and 5′ RACE analysis confirmed expression of this shorter RASA3 isoform (FIG. 3c, bottom), which was also observed in TCGA RNA-seq data (FIG. 3c). Compared to the canonical full-length RASA3 protein (CanT), the shorter 31 kDa RASA3 somatic isoform (SomT) is predicted to lack the N-terminal RasGAP domain (FIG. 3d). Consistent with this prediction, transfection of RASA3 CanT into GES1 normal gastric epithelial cells induced lower levels of active GTP-bound RAS compared to either empty vector or RASA3 SomT transfected cells, indicating that RASA3 CanT has higher RasGAP activity (FIG. 13).
To address functions of RASA3 SomT, we transfected the RASA3 CanT and SomT isoforms into SNU1967 GC cells. Compared to untransfected cells, transfection of RASA3 SomT into SNU1967 cells significantly stimulated migration (P<0.01) and invasion (P<0.01), while RASA3 CanT significantly suppressed invasion (P<0.001) (FIG. 3e, FIG. 13). Similarly, transfection of RASA3 SomT into GES1 cells significantly stimulated migration (P<0.01, FIG. 3e) and invasion (P<0.01, FIG. 13), while RASA3 CanT did not. When tested on KRAS-mutated AGS GC cells, which are innately highly migratory, expression of RASA3 CanT potently suppressed migration, while RASA3 SomT exhibited significantly less attenuation (P<0.01, FIG. 13). These results suggest that tumor-specific use of RASA3 SomT is likely to increase GC cell migration and invasion. Notably, RASA3 CanT and SomT transfections did not alter SNU1967, GES1 or AGS cellular proliferation rates (FIG. 13). To confirm that these observations are not due to non-physiological in vitro expression levels, we then examined NCC24 GC cells, which normally express high endogenous levels of RASA3 SomT and minimal RASA3 CanT (FIG. 13). Silencing of endogenous RASA3 SomT using two independent siRNA constructs significantly inhibited NCC24 migration and invasion (P<0.01 to P<0.001) (FIG. 13), consistent with RASA3 SomT playing a role in promoting cancer migration and invasion.
In an earlier study, we reported a transcript isoform of the MET receptor tyrosine kinase, driven by an internal alternative promoter, which has been independently confirmed in other cancer types. However, functional implications of this MET variant remain unclear. RNA-seq and 5′ RACE analysis confirmed transcript expression of this shorter isoform, predicted to harbor a truncated SEMA domain (FIG. 14). To assess functional differences between wild-type MET (MET-WT) and variant MET (MET-Var), we performed transient transfections of MET-WT and MET-Var into HEK293 cells. In both untreated and HGF-treated conditions, MET-Var transfected cells exhibited significantly higher levels of p-Gab1 (Y627), a key mediator of MET signaling (e.g. 2.48-3.95 fold comparing MET-Var vs MET-WT; P=0.003 (untreated), P<0.05 (T15 and T30)) (66). In addition, in HGF-untreated samples, cells transfected with MET-Var also exhibited higher p-ERK1/2 levels (2.74 fold) and higher p-STAT3 (Y705) (67-70) levels (1.80 fold) compared to MET-WT (P=0.023 and P=0.026 for p-ERK1/2 and p-STAT3 (Y705), respectively). These results suggest that expression of the MET-Var isoform may promote MET downstream signaling kinetics in a manner important for GC tumorigenesis.
Somatic Promoters Correlate with Tumor Immunity
Cancer immunoediting is a process where developing tumors sculpt their immunogenic and antigenic profile to evade host immune surveillance. Mechanisms of cancer immunoediting are diverse, including upregulation of inhibitory immune checkpoint molecules such as PD-L1. To explore potential contributions of somatic promoters to tumor immunity, we identified somatic promoter-associated N-terminal peptides with high predicted binding affinity (IC50≤50 nM, FIG. 4a) for GC-specific MHC Class I HLA alleles (Tables 8 and 9), which are required for antigen presentation to CD8+ cytotoxic T cells. Analysis of recurrent somatic promoter-associated peptides using the NetMHCpan-2.8 algorithm revealed a significant enrichment in high-affinity MHC I binding compared to multiple control peptide populations, including canonical GC peptides (average 36% vs 24%; P<0.01), randomly selected peptides (P<0.001), and C-terminal peptides (P<0.01) (FIG. 4b shows HLA-A, B, and C combined; FIG. 15a depicts data for HLA-A only). The majority of high-affinity somatic promoter-associated peptides corresponded to situations where the somatic transcript lacking the N-terminal peptide is overexpressed in tumors relative to normal tissues (78% lost; 76/97 high-affinity peptides, FIG. 4c). Notably, because N-terminal-lacking transcripts driven by somatic TSSs are overexpressed in tumors to a significantly greater degree than transcripts driven by the canonical TSS (P<0.05, Wilcoxon one-sided test) (FIG. 12), such a scenario would be predicted to result in relative depletion of these N-terminal immunogenic peptides in tumors.
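The affinity cutoff described above can be illustrated with a minimal sketch. It assumes a hypothetical list of (peptide, allele, predicted IC50 in nM) records, such as might be parsed from a binding predictor's output, and it assumes that a peptide counts as high-affinity if any tested allele meets the cutoff. The record values, peptide names and function names are illustrative only and are not taken from the study.

```python
from typing import Iterable, List, NamedTuple, Set

HIGH_AFFINITY_IC50_NM = 50.0  # cutoff stated in the text (IC50 <= 50 nM)


class Prediction(NamedTuple):
    peptide: str      # peptide identifier (hypothetical)
    allele: str       # HLA allele the prediction refers to
    ic50_nm: float    # predicted IC50 in nM (lower means stronger binding)


def high_affinity_peptides(preds: Iterable[Prediction],
                           cutoff: float = HIGH_AFFINITY_IC50_NM) -> Set[str]:
    """Peptides with at least one allele at or below the cutoff (assumed rule)."""
    return {p.peptide for p in preds if p.ic50_nm <= cutoff}


def fraction_high_affinity(preds: Iterable[Prediction]) -> float:
    """Share of distinct peptides classified as high-affinity binders."""
    records: List[Prediction] = list(preds)
    peptides = {p.peptide for p in records}
    if not peptides:
        return 0.0
    return len(high_affinity_peptides(records)) / len(peptides)


# Hypothetical predictions, for illustration only.
example = [
    Prediction("PEPTIDE_A", "A*11:01", 12.0),
    Prediction("PEPTIDE_A", "B*40:01", 900.0),
    Prediction("PEPTIDE_B", "A*11:01", 4200.0),
    Prediction("PEPTIDE_C", "C*03:02", 50.0),   # exactly at the cutoff: included
    Prediction("PEPTIDE_D", "A*24:02", 51.0),   # just above the cutoff: excluded
]

print(sorted(high_affinity_peptides(example)))
print(fraction_high_affinity(example))
```

Comparing this fraction between somatic promoter-associated peptides and a control peptide set corresponds to the kind of enrichment comparison reported above (36% vs 24%).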
Interestingly, an analogous N-terminal analysis using RNA-seq data alone (in the absence of epigenomic data) revealed that epigenome-guided N-terminal peptides exhibited significantly higher predicted immunogenicity scores compared to RNA-seq-only identified peptides (36.10% vs 27% for MHC presentation, P=0.02, Fisher's exact test), suggesting that epigenome-guided promoter identification can provide complementary value to RNA-seq-only guided analyses (FIG. 15).
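A comparison of two proportions of this kind can be sketched with a two-sided Fisher's exact test on a 2x2 table. The text reports only percentages, so the counts below are hypothetical placeholders chosen to give 36% and 27%, and the resulting P value is not expected to match the reported one. The two-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is an assumption; the sidedness used in the study is not stated.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two peptide sets; columns are high-affinity / not high-affinity.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x high-affinity peptides in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts: 36 of 100 epigenome-guided peptides and 27 of 100
# RNA-seq-only peptides classified as high-affinity binders.
p_value = fisher_exact_two_sided(36, 64, 27, 73)
print(p_value)
```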
TABLE 8
HLA prediction of GC samples
Sample | A1 | A2 | B1 | B2 | C1 | C2
2000639 | A*33:03 | A*24:02 | B*58:01 | B*40:01 | C*03:02 | C*03:67
2000721 | A*11:01 | A*11:01 | B*46:01 | B*15:01 | C*01:02 | C*04:01
2000986 | A*24:02 | A*11:01 | B*40:01 | B*38:02 | C*07:02 | C*15:02
980437 | A*33:03 | A*02:07 | B*40:01 | B*39:01 | C*07:02 | C*04:01
990068 | A*02:03 | A*11:01 | B*51:01 | B*55:02 | C*08:01 | C*14:02
2000085 | A*24:07 | A*34:01 | B*15:21 | B*15:21 | C*04:03 | C*04:03
980401 | A*33:03 | A*11:01 | B*58:01 | B*40:01 | C*03:02 | C*07:02
980447 | A*11:01 | A*11:01 | B*38:02 | B*27:04 | C*12:02 | C*07:02
2001206 | A*02:07 | A*24:02 | B*46:01 | B*40:06 | C*01:02 | C*08:01
980436 | A*02:03 | A*02:07 | B*46:01 | B*46:01 | C*01:02 | C*01:02
980417 | A*33:03 | A*11:01 | B*58:01 | B*46:01 | C*03:02 | C*01:02
980319 | A*33:03 | A*11:02 | B*58:01 | B*27:04 | C*03:02 | C*12:02
20021007 | A*24:10 | A*24:02 | B*15:27 | B*40:01 | C*03:04 | C*04:01
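As an illustration of how the per-sample HLA calls in Table 8 could be pooled into a panel of alleles for binding prediction, the following minimal sketch transcribes the table and tallies the distinct alleles and the number of samples carrying each. The pooling rule and the function names are assumptions made for illustration; the source does not describe how the allele panel was assembled.

```python
from collections import Counter
from typing import Dict, List

# Per-sample HLA calls transcribed from Table 8: sample, A1, A2, B1, B2, C1, C2.
TABLE_8 = """
2000639 A*33:03 A*24:02 B*58:01 B*40:01 C*03:02 C*03:67
2000721 A*11:01 A*11:01 B*46:01 B*15:01 C*01:02 C*04:01
2000986 A*24:02 A*11:01 B*40:01 B*38:02 C*07:02 C*15:02
980437 A*33:03 A*02:07 B*40:01 B*39:01 C*07:02 C*04:01
990068 A*02:03 A*11:01 B*51:01 B*55:02 C*08:01 C*14:02
2000085 A*24:07 A*34:01 B*15:21 B*15:21 C*04:03 C*04:03
980401 A*33:03 A*11:01 B*58:01 B*40:01 C*03:02 C*07:02
980447 A*11:01 A*11:01 B*38:02 B*27:04 C*12:02 C*07:02
2001206 A*02:07 A*24:02 B*46:01 B*40:06 C*01:02 C*08:01
980436 A*02:03 A*02:07 B*46:01 B*46:01 C*01:02 C*01:02
980417 A*33:03 A*11:01 B*58:01 B*46:01 C*03:02 C*01:02
980319 A*33:03 A*11:02 B*58:01 B*27:04 C*03:02 C*12:02
20021007 A*24:10 A*24:02 B*15:27 B*40:01 C*03:04 C*04:01
"""


def parse_calls(text: str) -> Dict[str, List[str]]:
    """Map each sample identifier to its six allele calls."""
    calls = {}
    for line in text.strip().splitlines():
        sample, *alleles = line.split()
        calls[sample] = alleles
    return calls


def allele_panel(calls: Dict[str, List[str]]) -> List[str]:
    """Sorted list of distinct alleles observed in any sample."""
    return sorted({a for alleles in calls.values() for a in alleles})


def carrier_counts(calls: Dict[str, List[str]]) -> Counter:
    """Number of samples carrying each allele (homozygous calls count once)."""
    return Counter(a for alleles in calls.values() for a in set(alleles))


calls = parse_calls(TABLE_8)
panel = allele_panel(calls)
carriers = carrier_counts(calls)
print(len(calls), "samples,", len(panel), "distinct alleles")
print(carriers.most_common(3))
```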
TABLE 9
Recurrent N terminal sequences with high affinity to MHC Class I
SEQ ID NO. | Gene | N terminal sequence | High Affinity HLA
SEQ ID NO: 1847 | ENSG00000007171.12 | MACPWKFLFKTKFHQYAMNGEKDINNNVEKAPCATSSPVTQDDLQYHNLSKQQNESPQPLVETGKKSPESLVKLDATPLSSPRHVRIKNWGSGMTFQDTLHHKAKGILTCRSKSCLGSIMTPKSLTRGPRDKPTPPDELLPQAIEFVNQYYGSFKEAKIEEHLARVEAVTKEIETTGTYQLTGDELIFATKQAWRNAPRCIGRIQWSNLQVFDARSCSTARE | A*02:03, A*02:07, A*11:01, A*11:02, A*24:10, A*34:01, B*15:01, B*15:21, B*15:27, B*27:04, B*39:01, B*40:01, B*46:01, B*58:01, C*03:02, C*12:02
SEQ ID NO: 1848
ENSG00000011028.9
MGPGRPAPAPWPRHLLRC
A*02:03, A*11:01, A*11:02,
|
VLLLGCLHLGRPGAPGDAA
A*24:02, A*24:07, A*24:10,
|
LPEPNVFLIFSHGLQGCLEA
A*33:03, B*15:01, B*15:27,
|
QGGQVRVTPACNTSLPAQ
B*38:02, B*39:01, B*40:01,
|
RWKWVSRNRLFNLGTMQ
B*40:06, B*51:01, B*58:01,
|
CLGTGWPGTNTTASLGMY
C*03:02, C*03:04, C*12:02,
|
ECDREALNLRWHCRTLGD
C*14:02
|
QLSLLLGARTSNISKPGTLE
|
RGDQTRSGQWRIYGSEED
|
LCALPYHEVYTIQGNSHGK
|
PCTIPFKYDNQWFHGCTST
|
GREDGHLWCATTQDYGK
|
DERWGFCPIKSNDCETFW
|
DKDQLTDSCYQFNFQSTLS
|
WREAWASCEQQGADLLSI
|
TEIHEQTYINGLLTGYSSTL
|
WIGLNDLDTSGGWQWSD
|
NSPLKYLNWESDQPDNPS
|
EENCGVIRTESSGGWQNR
|
DCSIALPYVCKKKPNATAEP
|
TPPDRWANVKVECEPSW
|
QPFQGHCYRLQAEKRSW
|
QESKKACLRGGGDLVSIHS
|
MAELEFITKQIKQEVEELWI
|
GLNDLKLQMNFEWSDGSL
|
VSFTHWHPFEPNNFRDSLE
|
DCVTIWGPEGRWNDSPC
|
NQSLPSICKKAGQLSQGAA
|
EEDHGCRKGWTWHSPSC
|
YWLGEDQVTYSEARRLCT
|
DHGSQLVTITNREEQAFVS
|
SLIYNWEGEYFWTALQDL
|
NSTGSFFWLSGDEVMYTH
|
WNRDQPGYSRGGCVALA
|
TGSAMGLWEVKNCTSFRA
|
RYICRQSLGTPVTPELPGPD
|
PTPSLTGSCPQGWASDTKL
|
RYCYKVFSSERLQDKKSWV
|
QAQGACQELGAQLLSLASY
|
EEEHFVANMLNKIFGESEP
|
EIHEQHWFWIGLNRRDPR
|
GGQSWRWSDGVGFSYHN
|
FDRSRHDDDDIRGCAVLDL
|
ASLQWVAMQCDTQLDWI
|
CKIPRGTDVREPDDSPQGR
|
REWLRFQEAEYKFFEHHST
|
WAQAQRICTWFQAELTSV
|
HSQAELDFLSHNLQKFSRA
|
QEQHWWIGLHTSESDGRF
|
RWTDGSIINFISWAPGKPR
|
PVGKDKKCVYMTASRED
|
WGDQRCLTALPYICKRSNV
|
TKETQPPDLPTTALGGCPS
|
DWIQFLNKCFQVQGQEPQ
|
SRVKWSEAQFSCEQQEAQ
|
LVTITNPLEQAFITASLPNV
|
TFDLWIGLHASQRDFQWV
|
EQEPLMYANWAPGEPSG
|
PSPAPSGNKPTSCAVVLHS
|
PSAHFTGRWDDRSCTEET
|
HGFICQKGTDPSLSPSPAAL
|
PPAPGTELSYLNGTFRLLQK
|
PLRWHDALLLCESRNASLA
|
YVPDPYTQAFLTQAARGLR
|
TPLWIGLAGEEGSRRYSW
|
VSEEPLNYVGWQDGEPQ
|
QPGGCTYVDVDGAWRTT
|
SCDTKLQGAVCGVSSGPPP
|
PRRISYHGSCPQGLADSA
|
WIPEREHCYSFHMELLLGH
|
KEARQRCQRAGGAVLSILD
|
EMENVFVWEHLQSYEGQS
|
RGAWLGMNFNPKGGTLV
|
WQDNTAVNYSNWGPPGL
|
GPSMLSHNSCYWIQSNSG
|
LWRPGACTNITMGVVCKL
|
PRAEQSSFSPSALPENPAAL
|
VVVLMAVLLLLALLTAALIL
|
YRRRQSIERGAFEGARYSR
|
SSSSPTEATEKNILVSDME
|
MNEQQE
|
|
SEQ ID NO: 1849
ENSG00000020256.15
MNASSEGESFAGSVQIPG
A*02:03, B*15:01, C*03:02,
|
GTTVLVELTPDIHICGICKQ
C*03:04
|
QFNNLDAFVAHKQSGCQL
|
TGTSAAAPSTVQFVSEETV
|
PATQTQTTTRTITSETQTIT
|
VSAPEFVFEHGYQTY
|
|
SEQ ID NO: 1850
ENSG00000032389.8
MEDDAPVIYGLEFQARALT
A*02:03, A*24:07, A*24:10,
|
PQTAETDAIRFLVGTQSLKY
A*33:03, B*15:01, B*15:21,
|
DNQIHIIDFDDENNIINKNV
B*15:27, B*38:02, B*39:01,
|
LLHQAGEIWHISASPADRG
B*40:01, B*40:06, B*46:01,
|
VLTTCYNRRDIIESFGILPVA
B*51:01, B*55:02, B*58:01,
|
QSPTIVFVNTLHQVFFRGQ
C*01:02, C*03:02, C*03:04,
|
VAASDSKVLTCAAVWR
C*03:67, C*04:01, C*08:01,
|
C*12:02, C*14:02, C*15:02
|
|
SEQ ID NO: 1851
ENSG00000037042.8
MLEAILGGGGLPVEGRGST
A*02:03, A*11:01, A*11:02,
|
EFEAFRLILFGSEDSVLPSPL
A*24:02, A*24:07, A*24:10,
|
LYKMAHMGSDGGVLPVH
B*40:01, B*40:06, B*51:01,
|
YATILFSL
C*01:02, C*04:03, C*08:01,
|
C*14:02
|
|
SEQ ID NO: 1852
ENSG00000053747.11
MAAAARPRGRALGPVLPP
A*02:03, A*11:01, A*11:02,
|
TPLLLLVLRVLPACGATARD
A*24:02, A*24:07, A*24:10,
|
PGAAAGLSLHPTYFNLAEA
A*33:03, B*15:01, B*39:01,
|
ARIWATATCGERGPGEGR
B*40:01, B*55:02, B*58:01,
|
PQPELYCKLVGGPTAPGSG
C*03:02, C*03:04, C*03:67,
|
HTIQGQFCDYCNSEDPRKA
C*07:02, C*12:02, C*14:02,
|
HPVTNAIDGSERWWQSPP
C*15:02
|
LSSGTQYNRVNLTLDLGQL
|
FHVAYILIKFANSPRPDLWV
|
LERSVDFGSTYSPWQYFAH
|
SKVDCLKEFGREANMAVT
|
RDDDVLCVTEYSRIVPLEN
|
GEVVVSLINGRPGAKNFTF
|
SHTLREFTKATNIRLRFLRT
|
NTLLGHLISKAQRDPTVTR
|
RYYYSIKDISIGGQCVCNGH
|
AEVCNINNPEKLFRCECQH
|
HTCGETCDRCCTGYNQRR
|
WRPAAWEQSHECEACNC
|
HGHASNCYYDPDVERQQA
|
SLNTQGIYAGGGVCINCQH
|
NTAGVNCEQCAKGYYRPY
|
GVPVDAPDGCIPCSCDPEH
|
ADGCEQGSGRCHCKPNFH
|
GDNCEKCAIGYYNFPFCLRI
|
PIFPVSTPSSEDPVAGDIKG
|
CDCNLEGVLPEICDAHGRC
|
LCRPGVEGPRCDTCRSGFY
|
SFPICQACWCSALGSYQM
|
PCSSVTGQCECRPGVTGQ
|
RCDRCLSGAYDFPHCQGSS
|
SACDPAGTINSNLGYCQCK
|
LHVEGPTCSRCKLLYWNLD
|
KENPSGCSECKCHKAGTVS
|
GTGECRQGDGDCHCKSHV
|
GGDSCDTCEDGYFALEKSN
|
YFGCQGCQCDIGGALSSM
|
CSGPSGVCQCREHVVGKV
|
CQRPENNYYFPDLHHMKY
|
EIEDGSTPNGRDLRFGFDP
|
LAFPEFSWRGYAQMTSVQ
|
NDVRITLNVGKSSGSLFRVI
|
LRYVNPGTEAVSGHITIYPS
|
WGAAQSKEIIFLPSKEPAFV
|
TVPGNGFADPFSITPGIWV
|
ACIKAEGVLLDYLVLLPRDY
|
YEASVLQLPVTEPCAYAGP
|
PQENCLLYQHLPVTRFPCT
|
LACEARHFLLDGEPRPVAV
|
RQPTPAHPVMVDLSGREV
|
ELHLRLRIPQVGHYVVVVE
|
YSTEAAQLFVVDVNVKSSG
|
SVLAGQVNIYSCNYSVLCR
|
SAVIDHMSRIAMYELLADA
|
DIQLKGHMARFLLHQVCII
|
PIEEFSAEYVRPQVHCIASY
|
GRFVNQSATCVSLAHETPP
|
TALILDVLSGRPFPHLPQQS
|
SPSVDVLPGVTLKAPQNQ
|
VTLRGRVPHLGRYVFVIHF
|
YQAAHPTFPAQVSVDGG
|
WPRAGSFHASFCPHVLGC
|
RDQVIAEGQIEFDISEPEVA
|
ATVKVPEGKSLVLVRVLVV
|
PAENYDYQILHKKSMDKSL
|
EFITNCGKNSFYLDPQTASR
|
FCKNSARSLVAFYHKGALP
|
CECHPTGATGPHCSPEGG
|
QCPCQPNVIGRQCTRCAT
|
GHYGFPRCKPCSCGRRLCE
|
EMTGQCRCPPRTVRPQCE
|
VCETHSFSFHPMAGCEGC
|
NCSRRGTIEAAMPECDRDS
|
GQCRCKPRITGRQCDRCAS
|
GFYRFPECVPCNCNRDGTE
|
PGVCDPGTGACLCKENVE
|
GTECNVCREGSFHLDPANL
|
KGCTSCFCFGVNNQCHSS
|
HKRRTKFVDMLGWHLETA
|
DRVDIPVSFNPGSNSMVA
|
DLQELPATIHSASWVAPTS
|
YLGDKVSSYGGYLTYQAKS
|
FGLPGDMVLLEKKPDVQLT
|
GQHMSIIYEETNTPRPDRL
|
HHGRVHVVEGNFRHASSR
|
APVSREELMTVLSRLADVRI
|
QGLYFTETQRLTLSEVGLEE
|
ASDTGSGRIALAVEICACPP
|
AYAGDSC
|
|
SEQ ID NO: 1853
ENSG00000059145.14
MPSVSKAAAAALSGSPPQ
A*02:03, A*24:10, A*33:03,
|
TEKPTHYRYLKEFRTEQCPL
B*15:01, B*39:01, B*40:01,
|
FSQHKCAQHRPFTCFHWH
B*58:01, C*03:02, C*03:04,
|
FLNQRRRRPLRRRDGTFNY
C*15:02
|
SPDVYCSKYNEATGVCPDG
|
DECPYLHRTTGDTERKYHL
|
RYYKTGTCIHETDARGHCV
|
KNGLHCAFAHGPLDLRPPV
|
CDVRELQAQEALQNGQLG
|
GGEGVPDLQPGVLASQA
|
MIEKILSEDPRWQDANFVL
|
GSYKTEQCPKPPRLCRQGY
|
ACPHYHNSRDRRRNPRRF
|
QYRSTPCPSVKHGDEWGE
|
PSRCDGGDGCQYCHSRTE
|
QQFHPESTKCNDMRQTGY
|
CPRGPFCAFAHVEKSLGM
|
VNEWGCHDLHLTSPSSTG
|
SGQPGNAKRRDSPAEGGP
|
RGSEQDSKQNHLAVFAAV
|
HPPAPSVSSSVASSLASSAG
|
SGSSSPTALPAPPARALPLG
|
PASSTVEAVLGSALDLHLS
|
NVNIASLEKDLEEQDGHDL
|
GAAGPRSLAGSAPVAIPGS
|
LPRAPSLHSPSSASTSPLGS
|
LSQPLPGPVGSSA
|
|
SEQ ID NO: 1854
ENSG00000060656.15
MARAQALVLALTFQLCAPE
A*02:03, A*11:01, A*11:02,
|
TETPAAGCTFEEASDPAVP
A*24:02, A*24:10, A*33:03,
|
CEYSQAQYDDFQWEQVRI
A*34:01, B*15:01, B*15:27,
|
HPGTRAPADLPHGSYLMV
B*38:02, B*39:01, B*40:01,
|
NTSQHAPGQRAHVIFQSLS
B*55:02, B*58:01, C*03:02,
|
ENDTHCVQFSYFLYSRDGH
C*03:04, C*07:02, C*12:02,
|
SPGTLGVYVRVNGGPLGS
C*14:02, C*15:02
|
AVWNMTGSHGRQWHQA
|
ELAVSTFWPNEYQVLFEALI
|
SPDRRGYMGLDDILLLSYP
|
CAKAPHFSRLGDVEVNAG
|
QNASFQCMAAGRAAEAE
|
RFLLQRQSGALVPAAGVR
|
HISHRRFLATEPLAAVSRAE
|
QDLYRCVSQAPRGAGVSN
|
FAELIVKEPPTPIAPPQLLRA
|
GPTYLIIQLNTNSIIGDGPIV
|
RKEIEYRMARGPWAEVHA
|
VSLQTYKLWHLDPDTEYEI
|
SVLLTRPGDGGTGRPGPPL
|
ISRTKCAEPMRAPKGLAFA
|
EIQARQLTLQWEPLGYNVT
|
RCHTYTVSLCYHYTLGSSH
|
NQTIRECVKTEQGVSRYTIK
|
NLLPYRNVHVRLVLTNPEG
|
RKEGKEVTFQTDEDVPSGI
|
AAESLTFTPLEDMIFLKWEE
|
PQEPNGLITQYEISYQSIESS
|
DPAVNVPGPRRTISKLRNE
|
TYHVFSNLHPGTTYLFSVR
|
ARTGKGFGQAALTEITTNIS
|
APSEDYADMPSPLGESENT
|
ITVLLRPAQGRGAPISVYQV
|
IVEEERARRLRREPGGQDC
|
FPVPLTFEAALARGLVHYF
|
GAELAASSLPEAMPFTVGD
|
NQTYRGFWNPPLEPRKAY
|
LIYFQAASHLKGETRLNCIRI
|
ARKAACKESKRPLEVSQRS
|
EEMGLILGICAGGLAVLILLL
|
GAIIVIIRKGKPVNMTKATV
|
NYRQEKTHMMSAVDRSFT
|
DQSTLQEDERLGLSFMDT
|
HGYSTRGDQRSGGVTEAS
|
SLLGGSPRRPCGRKGSPYH
|
TGQLHPAVRVADLLQHIN
|
QMKTAEGYGFKQEYESFFE
|
GWDATKKKDKVKGSRQEP
|
MPAYDRHRVKLHPMLGD
|
PNADYINANYIDGYHRSNH
|
FIATQGPKPEMVYDFWR
|
MVWQEHCSSIVMITKLVE
|
VGRVKCSRYWPEDSDTYG
|
DIKIMLVKTETLAEYVVRTF
|
ALERRGYSARHEVRQFHFT
|
AWPEHGVPYHATGLLAFIR
|
RVKASTPPDAGPIVIHCSA
|
GTGRTGCYIVLDVMLDMA
|
ECEGVVDIYNCVKTLCSRR
|
VNMIQTEEQYIFIHDAILEA
|
CLCGETTIPVSEFKATYKEM
|
IRIDPQSNSSQLREEFQTLN
|
SVTPPLDVEECSIALLPRNR
|
DKNRSMDVLPPDRCLPFLI
|
STDGDSNNYINAALTDSYT
|
RSAAFIVTLHPLQSTTPDF
|
WRLVYDYGCTSIVMLNQL
|
NQSNSAWPCLQYWPEPG
|
RQQYGLMEVEFMSGTAD
|
EDLVARVFRVQNISRLQEG
|
HLLVRHFQFLRWSAYRDTP
|
DSKKAFLHLLAEVDKWQA
|
ESGDGRTIVHCLNGGGRS
|
GTFCACATVLEMIRCHNLV
|
DVFFAAKTLRNYKPNMVE
|
TMDQYHFCYDVALEYLEGL
|
ESR
|
|
SEQ ID NO: 1855
ENSG00000066248.10
METRESEDLEKTRRKSASD
A*02:03, A*11:01, A*11:01,
|
QWNTDNEPAKVKPELLPE
A*11:02, A*11:02, A*24:02,
|
KEETSQADQDIQDKEPHC
A*24:10, A*33:03, A*33:03,
|
HIPIKRNSIFNRSIRRKSKAK
A*34:01, B*15:01, B*15:21,
|
ARDNPERNASCLADSQDN
B*15:27, B*39:01, B*40:01,
|
GKSVNEPLTLNIPWSRMPP
B*46:01, B*58:01, C*03:02,
|
CRT
C*03:04, C*03:67, C*12:02,
|
C*14:02
|
|
SEQ ID NO: 1856
ENSG00000077092.14
MTTSGHACPVPAVNGHM
A*24:02, A*24:07, A*24:10,
|
THYPATPYPLLFPPVIGGLS
A*34:01, B*15:01, B*15:21,
|
LPPLHGLHGHPPPSGCSTP
B*15:27, B*46:01, B*51:01,
|
SPATIETQS
B*55:02, C*01:02, C*03:02,
|
C*04:01, C*07:02, C*12:02,
|
C*14:02
|
|
SEQ ID NO: 1857
ENSG00000079308.12
MTRLSWCFSCVIRWGKYL
A*02:03, A*02:07, B*27:04,
|
FSCLLPLRFCLRSQPEDLEA
B*39:01, B*46:01, C*01:02,
|
PKTHRFKVKTFKKVKPCGIC
C*03:02, C*03:04, C*03:67,
|
RQVITQEGCTCKVCSFSCH
C*08:01, C*14:02
|
RKCQAKVAAPCVPPSNHE
|
LVPITTENAPKNVVDKGEG
|
ASRGGNTRKSLEDNGSTRV
|
TPSVQPHLQPIRN
|
|
SEQ ID NO: 1858
ENSG00000080823.17
MKNYKAIGKIGEGTFSEVM
A*02:03, A*33:03, B*40:01,
|
KMQSLRDGNYYACKQMK
C*03:02, C*14:02
|
QRFESIEQVNNLREIQALRR
|
LNPHPNILMLHEVVFDRKS
|
GSLALICELMDMNIYELIRG
|
RRYPLSEKKIMHYMYQLCK
|
SLDHIHRNGIFHRDVKPENI
|
LIKQDVLKLGD
|
|
SEQ ID NO: 1859
ENSG00000097021.15
MARPGLIHSAPGLPDTCAL
A*02:03
|
LQPPAASAAAAPS
|
|
SEQ ID NO: 1860
ENSG00000100441.5
MPTWGARPASPDRFAVSA
A*02:03, A*02:07, A*11:01,
|
EAENKVREQQPHVERIFSV
A*11:02, A*24:02, A*24:07,
|
GVSVLPKDCPDNPHIWLQ
A*24:10, A*33:03, B*15:01,
|
LEGPKENASRAKEYLKGLCS
B*15:21, B*15:27, B*40:01,
|
PELQDEIHYPPKLHCIFLGA
B*40:06, B*55:02, B*58:01,
|
QGFFLDCLAWSTSAHLVPR
C*03:02, C*03:04, C*03:67,
|
APGSLMISGLTEAFVMAQS
C*04:01, C*04:03, C*07:02,
|
RVEELAERLSWDFTPGPSS
C*08:01, C*14:02, C*15:02
|
GASQCTGVLRDFSALLQSP
|
GDAHREALLQLPLAVQEEL
|
LSLVQEASSGQGPGALAS
|
WEGRSSALLGAQCQGVRA
|
PPSDGRESLDTGSMGPGD
|
CRGARGDTYAVEKEGGKQ
|
GGPREMDWGWKELPGEE
|
AWEREVALRPQSVGGGAR
|
ESAPLKGKALGKEEIALGG
|
GGFCVHREPPGAHGSCHR
|
AAQSRGASLLQRLHNGNA
|
SPPRVPSPPPAPEPPWHC
|
GDRGDCGDRGDVGDRGD
|
KQQGMARGRGPQWKRG
|
ARGGNLVTGTQRFKEALQ
|
DPFTLCLANVPGQPDLRHI
|
VIDGSNVAMVHGLQHYFS
|
SRGIAIAVQYFWDRGHRDI
|
TVFVPQWRFSKDAKVRES
|
HFLQKLYSLSLLSLTPSRVM
|
DGKRISSYDDRFMVKLAEE
|
TDGIIVSNDQFRDLAEESEK
|
W
|
|
SEQ ID NO: 1861
ENSG00000103056.7
MVLYTTPFPNSCLSALHCV
A*02:03, A*02:07, A*11:01,
|
SWALIFPCYWLVDRLAASF
A*11:02, A*24:02, A*24:07,
|
IPTTYEKRQRADDPCCLQLL
A*24:10, B*15:01, B*15:21,
|
CTALFTPIYLALLVASLPFAF
B*15:27, B*27:04, B*38:02,
|
LGFLFWSPLQSARRPYIYSR
B*39:01, B*40:01, B*40:06,
|
LEDKGLAGGAALLSEWKG
B*46:01, B*51:01, B*55:02,
|
TGPGKSFCFATANVCLLPD
B*58:01, C*01:02, C*03:02,
|
SLARVNNLFNTQARAKEIG
C*03:04, C*03:67, C*04:01,
|
QRIRNGAARPQIKIYIDSPT
C*04:03, C*07:02, C*08:01,
|
NTSISAASFSSLVSPQGGD
C*12:02, C*15:02
|
GVARAVPGSIKRTASVEYK
|
GDGGRHPGDEAANGPAS
|
GDPVDSSSPEDACIVRIGG
|
EEGGRPPEADDPVPGGQA
|
RNGAGGGPRGQTPNHNQ
|
QDGDSGSLGSPSASRESLV
|
KGRAGPDTSASGEPGANS
|
KLLYKASVVKKAAARRRRH
|
PDEAFDHEVSAFFPANLDF
|
LCLQEVFDKRAATKLKEQL
|
HGYFEYILYDVGVYGCQGC
|
CSFKCLNSGLLFASRYPI
|
|
SEQ ID NO: 1862
ENSG00000103227.14
MLGAGLIKIRGDRCWRDL
A*02:03, A*11:01, A*11:02,
|
TCMDFHYETQPMPNPVA
A*24:02, A*24:07, A*24:10,
|
YYLHHSPWWFHRFETLSN
A*33:03, B*15:01, B*38:02,
|
HFIELLVPFFLFLGRRACIIH
B*40:01, B*58:01, C*03:02,
|
GVLQILFQAVLIVSGNLSFL
C*03:04, C*07:02, C*14:02,
|
NWLTMVPSLACFDDATLG
C*15:02
|
FLFPSGPGSLKDRVLQMQ
|
RDIRGARPEPRFGSVVRRA
|
ANVSLGVLLAWLSVPVVLN
|
LLSSRQVMNTHFNSLHIVN
|
TYGAFGSITKERAEVILQGT
|
ASSNASAPDAMWEDYEFK
|
CKPGDPSRRPCLISPYHYRL
|
DWLMWFAAFQTYEHND
|
WIIHLAGKLLASDAEALSLL
|
AHNPFAGRPPPRWVRGE
|
HYRYKFSRPGGRHAAEGK
|
WWVRKRIGAYFPPLS
|
|
SEQ ID NO: 1863
ENSG00000105559.7
MEGSRPRSSLSLASSASTIS
A*02:03, A*11:01, A*11:02,
|
SLSSLSPKKPTRAVNKIHAF
A*24:10, A*33:03, B*39:01,
|
GKRGNALRRDPNLPVHIR
B*40:01, B*58:01, C*03:02,
|
GWLHKQDSSGLRLWKRR
C*03:04, C*14:02
|
WFVLSGHCLFYYKDSREES
|
VLGSVLLPSYNIRPDGPGA
|
PRGRRFTFTAEHPGMRTY
|
VLAADTLEDLRGWLRALG
|
RASRAEGDDYGQPRSPAR
|
PQPGEGPGGPGGPPEVSR
|
GEEGRISESPEVTRLSRGRG
|
RPRLLTPSPTTDLHSGLQM
|
RRARSPDLFTPLSRPPSPLS
|
LPRPRSAPARRPPAPSGDT
|
APPARPHTPLSRIDVRPPLD
|
WGPQRQTLSRPPTPRRGP
|
PSEAGGGKPPRSPQHWSQ
|
EPRTQAHSGSPTYLQLPPR
|
PPGTRASMVLLPGPPLEST
|
FHQSLETDTLLTKLCGQDR
|
LLRRLQEEIDQKQEEKEQLE
|
AALELTRQQLGQATREAG
|
APGRAWGRQRLLQDRLVS
|
VRATLCHLTQERERVWDT
|
YSGLEQELGTLRETLEYLLH
|
LGSPQDRVSAQQQLWMV
|
EDTLAGLGGPQKPPPHTEP
|
DSPSPVLQGEESSERESLPE
|
SLELSSPRSPETDWGRPPG
|
GDKDLASPHLGLGSPRVSR
|
ASSPEGRHLPSPQLGTKAP
|
VARPRMSAQEQLERMRR
|
NQECGRPFPRPTSPRLLTL
|
GRTLSPARRQPDVEQRPV
|
VGHSGAQKWLRSSGSWSS
|
PRNTTPYLPTSEGHRERVLS
|
LSQALATEASQWHRMMT
|
GGNLDSQGDPLPGVPLPP
|
SDPTRQETPPPRSPPVANS
|
GSTGFSRRGSGRGGGPTP
|
WGPAWDAGIAPPVLPQD
|
EGAWPLRVTLLQSSF
|
|
SEQ ID NO: 1864
ENSG00000105639.14
MAPPSEETPLIPQRSCSLLS
A*02:03, A*11:01, A*11:02,
|
TEAGALHVLLPARGPGPPQ
A*24:02, A*24:07, A*24:10,
|
RLSFSFGDHLAEDLCVQAA
A*33:03, B*15:01, B*39:01,
|
KASGILPVYHSLFALATEDL
B*40:01, B*55:02, B*58:01,
|
SCWFPPSHIFSVEDASTQV
C*03:02, C*03:04, C*07:02,
|
LLYRIRFYFPNWFGLEKCHR
C*14:02
|
FGLRKDLASAILDLPVLEHL
|
FAQHRSDLVSGRLPVGLSL
|
KEQGECLSLAVLDLARMAR
|
EQAQRPGELLKTVSYKACL
|
PPSLRDLIQGLSFVTRRRIR
|
RTVRRALRRVAACQADRH
|
SLMAKYIMDLERLDPAGA
|
AETFHVGLPGALGGHDGL
|
GLLRVAGDGGIAWTQGEQ
|
EVLQPFCDFPEIVDISIKQA
|
PRVGPAGEHRLVTVTRTD
|
NQILEAEFPGLPEALSFVAL
|
VDGYFRLTTDSQHFFCKEV
|
APPRLLEEVAEQCHGPITLD
|
FAINKLKTGGSRPGSYVLRR
|
SPQDFDSFLLTVCVQNPLG
|
PDYKGCLIRRSPTGTFLLVG
|
LSRPHSSLRELLATCWDGG
|
LHVDGVAVTLTSCCIPRPKE
|
KSNLIVVQRGHSPPTSSLV
|
QPQSQYQLSQMTFHKIPA
|
DSLEWHENLGHGSFTKIYR
|
GCRHEVVDGEARKTEVLLK
|
VMDAKHKNCMESFLEAAS
|
LMSQVSYRHLVLLHGVCM
|
AGDSTMVQEFVHLGAIDM
|
YLRKRGHLVPASWKLQVV
|
KQLAYALNYLEDKGLPHGN
|
VSARKVLLAREGADGSPPFI
|
KLSDPGVSPAVLSLEMLTD
|
RIPWVAPECLREAQTLSLE
|
ADKWGFGATVWEVFSGV
|
TMPISALDPAKKLQFYEDR
|
QQLPAPKWTELALLIQQC
|
MAYEPVQRPSFRAVIRDLN
|
SLISSDYELLSDPTPGALAPR
|
DGLWNGAQLYACQDPTIF
|
EERHLKYISQLGKGNFGSV
|
ELCRYDPLGDNTGALVAVK
|
QLQHSGPDQQRDFQREIQ
|
ILKALHSDFIVKYRGVSYGP
|
GRQSLRLVMEYLPSGCLRD
|
FLQRHRARLDASRLLLYSSQ
|
ICKGMEYLGSRRCVHRDLA
|
ARNILVESEAHVKIADFGLA
|
KLLPLDKDYYVVREPGQSPI
|
FWYAPESLSDNIFSRQSDV
|
WSFGVVLYELFTYCDKSCS
|
PSAEFLRMMGCERDVPAL
|
CRLLELLEEGQRLPAPPACP
|
AEVHELMKLCWAPSPQDR
|
PSFSALGPQLDMLWSGSR
|
GCETHAFTAHPEGKHHSLS
|
FS
|
|
SEQ ID NO: 1865
ENSG00000105650.17
MQAPVPHSQRRESFLYRS
A*02:03, B*15:01, B*39:01,
|
DSDYELSPKAMSRNSSVAS
B*40:01, C*03:02, C*03:04,
|
DLHGEDMIVTPFAQVLASL
C*15:02
|
RTVRSNVAALARQQCLGA
|
AKQGPVGN
|
|
SEQ ID NO: 1866
ENSG00000105963.9
MAKERRRAVLELLQRPGN
A*02:03, A*24:10, B*15:01,
|
ARCADCGAPDPDWASYTL
C*03:02, C*03:04
|
GVFICLSCSGIHRNIPQVSK
|
VKSVRLDAWEEAQVEFMA
|
SHGNDAARARFESKVPSFY
|
YRPTP
|
|
SEQ ID NO: 1867
ENSG00000105976.10
MKAPAVLAPGILVLLFTLV
A*02:03, A*11:01, A*11:02,
|
QRSNGECKEALAKSEMNV
A*24:02, A*24:07, A*24:10,
|
NMKYQLPNFTAETPIQNVI
A*33:03, A*34:01, B*15:01,
|
LHEHHIFLGATNYIYVLNEE
B*15:27, B*39:01, B*40:01,
|
DLQKVAEYKTGPVLEHPDC
B*58:01, C*03:02, C*03:04,
|
FPCQDCSSKANLSGGVWK
C*03:67, C*07:02, C*12:02,
|
DNINMALVVDTYYDDQLIS
C*14:02, C*15:02
|
CGSVNRGTCQRHVFPHNH
|
TADIQSEVHCIFSPQIEEPS
|
QCPDCVVSALGAKVLSSVK
|
DRFINFFVGNTINSSYFPDH
|
PLHSISVRRLKETKDGFMFL
|
TDQSYIDVLPEFRDSYPIKY
|
VHAFESNNFIYFLTVQRETL
|
DAQTFHTRIIRFCSINSGLH
|
SYMEMPLECILTEKRKKRST
|
KKEVFNILQAAYVSKPGAQ
|
LARQIGASLNDDILFGVFA
|
QSKPDSAEPMDRSAMCAF
|
PIKYVNDFFNKIVNKNNVR
|
CLQHFYGPNHEHCFNRTLL
|
RNSSGCEARRDEYRTEFTT
|
ALQRVDLFMGQFSEVLLTS
|
ISTFIKGDLTIANLGTSEGRF
|
MQVVVSRSGPSTPHVNFL
|
LDSHPVSPEVIVEHTLNQN
|
GYTLVITGKKITKIPLNGLGC
|
RHFQSCSQCLSAPPFVQCG
|
WCHDKCVRSEECLSGTWT
|
QQICLPAIYKVFPNSAPLEG
|
GTRLTICGWDFGFRRNNK
|
FDLKKTRVLLGNESCTLTLS
|
ESTMNTLKCTVGPAMNKH
|
FNMSIIISNGHGTTQYSTFS
|
YVDPVITSISPKYGPMAGG
|
TLLTLTGNYLNSGNSRHISI
|
GGKTCTLKSVSNSILECYTP
|
AQTISTEFAVKLKIDLANRE
|
TSIFSYREDPIVYEIHPTKSFI
|
SGGSTITGVGKNLNSVSVP
|
RMVINVHEAGRNFTVACQ
|
HRSNSEIICCTTPSLQQLNL
|
QLPLKTKAFFMLDGILSKYF
|
DLIYVHNPVFKPFEKPVMIS
|
MGNENVLEIKGNDIDPEA
|
VKGEVLKVGNKSCENIHLH
|
SEAVLCTVPNDLLKLNSELN
|
IEWKQAISSTVLGKVIVQP
|
DQNFTGLIAGVVSISTALLL
|
LLGFFLWLKKRKQIKDLGSE
|
LVRYDARVHTPHLDRLVSA
|
RSVSPTTEMVSNESVDYRA
|
TFPEDQFPNSSQNGSCRQ
|
VQYPLTDMSPILTSGDSDIS
|
SPLLQNTVHIDLSALNPELV
|
QAVQHVVIGPSSLIVHFNE
|
VIGRGHFGCVYHGTLLDN
|
DGKKIHCAVKSLNRITDIGE
|
VSQFLTEGIIMKDFSHPNVL
|
SLLGICLRSEGSPLVVLPYM
|
KHGDLRNFIRNETHNPTVK
|
DLIGFGLQVAKGMKYLASK
|
KFVHRDLAARNCMLDEKF
|
TVKVADFGLARDMYDKEY
|
YSVHNKTGAKLPVKWMAL
|
ESLQTQKFTTKSDVWSFGV
|
LLWELMTRGAPPYPDVNT
|
FDITVYLLQGRRLLQPEYCP
|
DPLYEVMLKCWHPKAEM
|
RPSFSELVSRISAIFSTFIGEH
|
YVHVNATYVNVKCVAPYP
|
SLLSSEDNADDEVDTRPAS
|
FWETS
|
|
SEQ ID NO: 1868
ENSG00000107317.7
MATHHTLWMGLALLGVL
A*02:03, B*15:01, C*03:02,
|
GDLQAAPEAQVSVQPNFQ
C*03:04, C*12:02
|
QD
|
|
SEQ ID NO: 1869
ENSG00000111700.8
MDQHQHLNKTAESASSEK
A*11:01, A*11:02
|
KKTRRCNGFK
|
|
SEQ ID NO: 1870
ENSG00000111860.9
MWGRFLAPEASGRDSPG
A*02:03, A*11:01, A*11:02,
|
GARSFPAGPDYSSAWLPA
A*24:02, A*24:07, A*24:10,
|
NESLWQATTVPSNHRNN
A*33:03, B*15:01, B*15:27,
|
HIRRHSIASDSGDTGIGTSC
B*39:01, B*40:01, C*03:02,
|
SDSVEDHSTSSGTLSFKPSQ
C*03:04, C*14:02
|
SLITLPTAHVMPSNSSASIS
|
KLRESLTPDGSKWSTSLMQ
|
TLGNHSRGEQDSSLDMKD
|
FRPLRKWSSLSKLTAPDNC
|
GQGGTVCREESRNGLEKIG
|
KAKALTSQLRTIGPSCLHDS
|
MEMLRLEDKEINKKRSSTL
|
DCKYKFESCSKEDFRASSST
|
LRRQPVDMTYSALPESKPI
|
MTSSEAFEPPKYLMLGQQ
|
AVGGVPIQPSVRTQMWLT
|
EQLRTNPLEGRNTEDSYSL
|
APWQQQQIEDFRQGSETP
|
MQVLTGSSRQSYSPGYQD
|
FSKWESMLKIKEGLLRQKEI
|
VIDRQKQQITHLHERIRDN
|
ELRAQHAMLGHYVNCEDS
|
YVASLQPQYENTSLQTPFS
|
EESVSHSQQGEFEQKLAST
|
EKEVLQLNEFLKQRLSLFSE
|
EKKKLEEKLKTRDRYISSLKK
|
KCQKESEQNKEKQRRIETL
|
EKYLADLPTLDDVQSQSLQ
|
LQILEEKNKNLQEALIDTEK
|
KLEEIKKQCQDKETQLICQK
|
KKEKELVTTVQSLQQKVER
|
CLEDGIRLPMLDAKQLQNE
|
NDNLRQQNETASKIIDSQQ
|
DEIDRMILEIQSMQGKLSK
|
EKLTTQKMMEELEKKERN
|
VQRLTKALLENQRQTDETC
|
SLLDQGQEPDQSRQQTVL
|
SKRPLFDLTVIDQLFKEMSC
|
CLFDLKALCSILNQRAQGK
|
EPNLSLLLGIRSMNCSAEET
|
ENDHSTETLTKKLSDVCQL
|
RRDIDELRTTISDRYAQDM
|
GDNCITQ
|
|
SEQ ID NO: 1871
ENSG00000111912.14
XEKTCSSLEREPHFSLLTMR
A*02:03, A*11:01, A*11:02,
|
GQRLPLDIQIFYCARPDEEP
A*24:02, A*24:07, A*24:10,
|
FVKIITVEEAKRRKSTCSYYE
A*33:03, B*15:01, B*15:27,
|
DEDEEVLPVLRPHSALLEN
B*40:01, B*55:02, C*03:02,
|
MHIEQLARRLPARVQGYP
C*03:04, C*03:67, C*12:02,
|
WRLAYSTLEHGTSLKTLYRK
C*14:02, C*15:02
|
SASLDSPVLLVIKDMDNQIF
|
GAYATHPFKFSDHYYGTGE
|
TFLYTFSPHFKVFKWSGEN
|
SYFINGDISSLELGGGGGRF
|
GLWLDADLYHGRSNSCST
|
FNNDILSKKEDFIVQDLEV
|
WAFD
|
|
SEQ ID NO: 1872
ENSG00000112033.9
MEQPQEEAPEVREEEEKEE
A*02:03, A*02:07, A*11:01,
|
VAEAEGAPELNGGPQHAL
A*11:02, A*24:02, A*24:07,
|
PSSSYTDLSRSSSPPSLLDQL
A*24:10, A*33:03, A*34:01,
|
QMGCDGASCGSLNMECR
B*15:01, B*15:21, B*15:27,
|
VCGDKASGFHYGVHACEG
B*27:04, B*38:02, B*39:01,
|
CKGFFRRTIRMKLEYEKCER
B*40:01, B*40:06, B*46:01,
|
SCKIQKKNRNKCQYCRFQK
B*51:01, B*55:02, B*58:01,
|
CLALGMSHNAIRFGRMPE
C*01:02, C*03:02, C*03:04,
|
AEKRKLVAGLTANEGSQYN
C*04:01, C*04:03, C*07:02,
|
PQVADLKAFSKHIYNAYLK
C*08:01, C*12:02, C*15:02
|
NFNMTKKKARSILTGKASH
|
TAPFVIHDIETLWQAEKGL
|
VWKQLVNGLPPYKEISVHV
|
FYRCQCTTVETVRELTEFAK
|
SIPSFSSLFLNDQVTLLKYG
|
VHEAIFAMLASIVNKDGLL
|
VANGSGFVTREFLRSLRKP
|
FSDIIEPKFEFAVKFNALELD
|
DSDLALFIAAIILCGDRPGL
|
MNVPRVEAIQDTILRALEF
|
HLQANHPDAQYLFP
|
|
SEQ ID NO: 1873
ENSG00000113594.5
MMDIYVCLKRPSWMVDN
A*02:03, A*11:01, A*11:02,
|
KRMRTASNFQWLLSTFILL
A*24:02, A*24:07, A*24:10,
|
YLMNQVNSQKKGAPHDLK
A*33:03, A*34:01, B*15:01,
|
CVTNNLQVWNCSWKAPS
B*39:01, B*40:01, B*58:01,
|
GTGRGTDYEVCIENRSRSC
C*03:02, C*03:04, C*03:67,
|
YQLEKTSIKIPALSHGDYEITI
C*12:02, C*14:02, C*15:02
|
NSLHDFGSSTSKFTLNEQN
|
VSLIPDTPEILNLSADFSTST
|
LYLKWNDRGSVFPHRSNVI
|
WEIKVLRKESMELVKLVTH
|
NTTLNGKDTLHHWSWAS
|
DMPLECAIHFVEIRCYIDNL
|
HFSGLEEWSDWSPVKNIS
|
WIPDSQTKVFPQDKVILVG
|
SDITFCCVSQEKVLSALIGH
|
TNCPLIHLDGENVAIKIRNIS
|
VSASSGTNVVFTTEDNIFG
|
TVIFAGYPPDTPQQLNCET
|
HDLKEIICSWNPGRVTALV
|
GPRATSYTLVESFSGKYVRL
|
KRAEAPTNESYQLLFQMLP
|
NQEIYNFTLNAHNPLGRSQ
|
STILVNITEKVYPHTPTSFKV
|
KDINSTAVKLSWHLPGNFA
|
KINFLCEIEIKKSNSVQEQR
|
NVTIKGVENSSYLVALDKL
|
NPYTLYTFRIRCSTETFWK
|
WSKWSNKKQHLTTEASPS
|
KGPDTWREWSSDGKNLIIY
|
WKPLPINEANGKILSYNVS
|
CSSDEETQSLSEIPDPQHKA
|
EIRLDKNDYIISVVAKNSVG
|
SSPPSKIASMEIPNDDLKIE
|
QVVGMGKGILLTWHYDP
|
NMTCDYVIKWCNSSRSEP
|
CLMDWRKVPSNSTETVIES
|
DEFRPGIRYNFFLYGCRNQ
|
GYQLLRSMIGYIEELAPIVA
|
PNFTVEDTSADSILVKWED
|
IPVEELRGFLRGYLFYFGKG
|
ERDTSKMRVLESGRSDIKV
|
KNITDISQKTLRIADLQGKT
|
SYHLVLRAYTDGGVGPEKS
|
MYVVTKENSVGLIIAILIPVA
|
VAVIVGVVTSILCYRKREWI
|
KETFYPDIPNPENCKALQF
|
QKSVCEGSSALKTLEMNPC
|
TPNNVEVLETRSAFPKIEDT
|
EIISPVAERPEDRSDAEPEN
|
HVVVSYCPPIIEEEIPNPAA
|
DEAGGTAQVIYIDVQSMY
|
QPQAKPEEEQENDPVGGA
|
GYKPQMHLPINSTVEDIAA
|
EEDLDKTAGYRPQANVNT
|
WNLVSPDSPRSIDSNSEIVS
|
FGSPCSINSRQFLIPPKDED
|
SPKSNGGGWSFTNFFQNK
|
PND
|
|
SEQ ID NO: 1874
ENSG00000114541.10
MASVFMCGVEDLLFSGSR
A*02:03, A*11:01, A*11:02,
|
FVWNLTVSTLRRWYTERLR
A*24:10, A*33:03, A*34:01,
|
ACHQVLRTWCGLQDVYQ
B*40:01, B*58:01, C*07:02,
|
MTEGRHCQVHLLDDRRLE
C*12:02, C*14:02
|
LLVQPKLLARELLDLVASHF
|
NLKEKEYFGITFIDDTGQQ
|
NWLQLDHRVLDHDLPKKP
|
GPTILHFAVRFYIESISFLKD
|
KTTVELFFLNAKACVHKGQ
|
IEVESETIFKLAAFILQEAKG
|
DYTSDENARKDLKTLPAFP
|
TKTLQEHPSLAYCEDRVIEH
|
YLKIKGLTRGQAVVQY
|
|
SEQ ID NO: 1875
ENSG00000115977.14
MKKFFDSRREQGGSGLGS
A*02:03, A*11:01, A*11:02,
|
GSSGGGGSTSGLGSGYIGR
A*24:02, A*24:07, A*24:10,
|
VFGIGRQQVTVDEVLAEG
B*15:01, B*39:01, B*40:01,
|
GFAIVFLVRTSNGMKCALK
C*03:02, C*12:02, C*14:02
|
RMFVNNEHDLQVCKREIQI
|
MRDLSGHKNIVGYIDSSIN
|
NVSSGDVWEVLILMDFCR
|
GGQVVNLMNQRLQTGFT
|
ENEVLQIFCDTCEAVARLH
|
QCKTPIIHRDLKVENILLHD
|
RGHYVLCDFGSATNKFQN
|
PQTEGVNAVEDEIKKYTTL
|
SYRAPEMVNLYSGKIITTKA
|
DIWALGCLLYKLCYFTLPFG
|
ESQVAICDGNFTIPDNSRYS
|
QDMHCLIRYMLEPDPDKR
|
PDIYQVSYFSFKLLKKECPIP
|
NVQNSPIPAKLPEPVKASE
|
AAAKKTQPKARLTDPIPTTE
|
TSIAPRQRPKAGQTQPNP
|
GILPIQPALTPRKRATVQPP
|
PQAAGSSNQPGLLASVPQ
|
PKPQAPPSQPLPQTQAKQ
|
PQAPPTPQQTPSTQAQGL
|
PAQAQATPQHQQQLFLK
|
QQQQQQQPPPAQQQPA
|
GTFYQQQQAQTQQFQAV
|
HPATQKPAIAQFPVVSQG
|
GSQQQLMQNFYQQQQQ
|
QQQQQQQQQLATALHQ
|
QQLMTQQAALQQKPTMA
|
AGQQPQPQPAAAPQPAP
|
AQEPAIQAPVRQQPKVQT
|
TPPPAVQGQKVGSLTPPSS
|
PKTQRAGHRRILSDVTHSA
|
VFGVPASKSTQLLQAAAAE
|
AELLDPGRQTLQ
|
|
SEQ ID NO: 1876
ENSG00000116833.9
MSSNSDTGDLQESLKHGLT
A*02:03
|
PIGAGLPDRHGSPIPARGR
|
LV
|
|
SEQ ID NO: 1877
ENSG00000118855.14
MDAGKLARHPTDTGSERA
C*03:02, C*03:04, C*14:02
|
VPALAEIRPWWAPPLRPQ
|
|
SEQ ID NO: 1878
ENSG00000119547.5
MKAAYTAYRCLTKDLEGCA
A*02:03, A*11:01, A*11:02,
|
MNPELTMESLGTLHGPAG
A*24:10, A*33:03, B*15:01,
|
GGSGGGGGGGGGGGGG
B*15:27, B*39:01, B*58:01,
|
GPGHEQELLASPSPHHAG
C*03:02, C*03:04, C*07:02,
|
RGAAGSLRGPPPPPTAHQ
C*14:02
|
ELGTAAAAAAAASRSAMV
|
TSMASILDGGDYRPELSIPL
|
HHAMSMSCDSSPPGMG
|
MSNTYTTLTPLQPLPPISTV
|
SDKFHHPHPHHHPHHHH
|
HHHHQRLSGNVSGSFTLM
|
RDERGLPAMNNLYSPYKE
|
MPGMSQSLSPLAATPLGN
|
GLGGLHNAQQSLPNYGPP
|
GHDKMLSPNFDAHHTAM
|
LTRGEQHLSRGLGTPPAA
|
MMSHLNGLHHPGHTQSH
|
GPVLAPSRERPPSSSSGSQ
|
VATSGQLEEINTKEVAQRIT
|
AELKRYSIPQAIFAQRVLCR
|
SQGTLSDLLRNPKPWSKLK
|
SGRETFRRMWKWLQEPEF
|
QRMSALRLAA
|
|
SEQ ID NO: 1879
ENSG00000125826.15
MDEKTKKAEEMALSLTRA
A*02:03, A*02:07, A*11:01,
|
VAGGDEQVAMKCAIWLA
A*11:02, A*24:10, A*33:03,
|
EQRVPLSVQLKPEVSPTQD
B*40:01, C*03:02, C*03:04
|
IRLWVSVEDAQMHTVTIW
|
LTVRPDMTVASLKDMVFL
|
DYGFPPVLQQWVIGQRLA
|
RDQETLHSHGVRQNGDSA
|
YLYLLSARNTSLNPQELQRE
|
RQLRMLEDLGFKDLTLQPR
|
GPLEPGPPKPGVPQEPGR
|
GQPDAVPEPPPVGWQCP
|
GCTFINKPTRPGCEMCCRA
|
RPEAYQVPASYQPDEEERA
|
RLAGEEEALRQYQQRKQQ
|
QQEGNYLQHVQLDQRSLV
|
LNTEPAECPVCYSVLAPGE
|
AVVLRECLHTFCRECLQGTI
|
RNSQEAEVSCPFIDNTYSCS
|
GKLLEREIKALLTPEDYQRF
|
LDLGISIAENRSAFSYHCKT
|
PDCKGWCFFEDDVNEFTC
|
PVCFHVNCLLCKAIHEQM
|
NCKEYQEDLALRAQNDVA
|
ARQTTEMLKVMLQQGEA
|
MRCPQCQIVVQKKDGCD
|
WIRCTVCHTEICWVTKGPR
|
WGPGGPGDTSGGCRCRV
|
NGIPCHPSCQNCH
|
|
SEQ ID NO: 1880
ENSG00000129116.13
MSALASRSAPAMQSSGSF
A*02:03, A*11:01, A*11:02,
|
NYARPKQFIAAQNLGPAS
A*24:02, A*24:10, A*33:03,
|
GHGTPASSPSSSSLPSPMS
B*15:01, B*39:01, B*40:01,
|
PTPRQFGRAPVPPFAQPF
B*58:01, C*03:02, C*03:04
|
GAEPEAPWGSSSPSPPPPP
|
PPVFSPTAAFPVPDVFPLPP
|
PPPPLPSPGQASHCSSPAT
|
RFGHSQTPAAFLSALLPSQ
|
PPPAAVNALGLPKGVTPA
|
GFPKKASRTARIASDEEIQG
|
TKDAVIQDLERKLRFKEDLL
|
NNGQPRLTYEERMARRLL
|
GADSATVFNIQEPEEETAN
|
QEYKVSSCEQRLISEIEYRLE
|
RSPVDESGDEVQYGDVPV
|
ENGMAPFFEMKLKHYKIFE
|
GMPVTFTCRVAGNPKPKIY
|
WFKDGKQISPKSDHYTIQR
|
DLDGTCSLHTTASTLDDDG
|
NYTIMAANPQGRISCTGRL
|
MVQAVNQRGRSPRSPSG
|
HPHVRRPRSRSRDSGDEN
|
EPIQERFFRPHFLQAPGDLT
|
VQEGKLCRMDCKVSGLPT
|
PDLSWQLDGKPVRPDSAH
|
KMLVRENGVHSLIIEPVTSR
|
DAGIYTCIATNRAGQNSFS
|
LELVVAAKE
|
|
SEQ ID NO: 1881
ENSG00000129682.9
MSGKVTKPKEEKDASKVLD
A*02:03, A*02:07, A*24:10,
|
DAPPGTQEYIMLRQDSIQS
A*34:01, B*27:04, B*38:02,
|
AELKKKESPFRAKCHEIFCC
B*39:01, B*46:01, B*55:02,
|
PLKQVHHKENTEPEEPQLK
C*03:02, C*07:02, C*08:01,
|
GIVTKLYSRQGYHLQLQAD
C*15:02
|
GTIDGTKDEDSTYTLFNLIP
|
VGLRVVAIQGVQTKLYLA
|
|
SEQ ID NO: 1882
ENSG00000131374.10
MYHSLSETRHPLQPEEQEV
A*02:03, A*24:02, A*24:07,
|
GIDPLSSYSNKSGGDSNKN
A*24:10, A*33:03, B*27:04,
|
GRRTSSTLDSEGTFNSYRKE
B*51:01, C*07:02, C*15:02
|
WEELFVNNNYLATIRQKGI
|
NGQLRSSRFRSICWKLFLC
|
VLPQDKSQWISRIEELRAW
|
YSNIKEIHITNPRKVVGQQ
|
DL
|
|
SEQ ID NO: 1883
ENSG00000131620.13
MWEASGMEERALEELAM
A*02:03, A*24:10, A*33:03,
|
EETALDPLLAEAAGAVDGE
B*38:02, B*40:01, C*01:02
|
GAPPGGPSAQAATMRVN
|
EKYSTLPAEDRSVHIINICAI
|
EDIGYLPSEGTLLNSLSVDP
|
DAECKYGLYFRDGRRKVDY
|
ILVYHHKRPSGNRTLVRRV
|
QHSDTPSGARSVKQDHPL
|
PGKGASLDAGSGEPP
|
|
SEQ ID NO: 1884
ENSG00000132005.4
MATQAYTELQAAPPPSQP
B*15:01, B*58:01, C*03:02,
|
PQAPPQAQPQPPPPPPPA
C*03:04, C*03:67, C*12:02,
|
APQPPQPPTAAATPQPQY
C*14:02
|
VTELQSPQPQAQPPGGQK
|
QYVTELPAVPAPSQPTGAP
|
TPSPAPQQYIVVTVSEGAM
|
RASETVSEASPGSTASQTG
|
VPTQVVQQVQGTQQRLL
|
VQTSVQAKPGHVSPLQLT
|
NIQVPQQALPTQRLVVQS
|
AAPGSKGGQVSLTVHGTQ
|
QVHSPPEQSPVQANSSSSK
|
TAGAPTGTVPQQLQVHGV
|
QQSVPVTQERSVVQATPQ
|
APKPGPVQPLTVQGLQPV
|
HVAQEVQQLQQVPVPHV
|
YSSQVQYVEGGDASYTASA
|
IRSSTYSYPETPLYTQTASTS
|
YYEAAGTATQVSTPATSQA
|
VASSGS
|
|
SEQ ID NO: 1885
ENSG00000132359.9
MFGRKRSVSFGGFGWIDK
A*02:03, A*11:01, A*11:02,
|
TMLASLKVKKQELANSSDA
A*34:01, B*40:01, C*03:02,
|
TLPDRPLSPPLTAPPTMKSS
C*03:04, C*14:02, C*15:02
|
EFFEMLEKMQGIKLEEQKP
|
GPQKNKDDYIPYPSIDEVV
|
EKGGPYPQVILPQFGGYWI
|
EDPENVGTPTSLGSSICEEE
|
EEDNLSPNTFGYKLECKGE
|
ARAYRRHFLGKDHLNFYCT
|
GSSLGNLILSVKCEEAEGIEY
|
LRVILRSKLKTVHERIPLAGL
|
SKLPSVPQIAKAFCDDAVG
|
LRFNPVLYPKASQ
|
|
SEQ ID NO: 1886
ENSG00000134490.9
MCVRRSLVGLTFCTCYLAS
A*02:03, A*11:01, A*11:02,
|
YLTNKYVLSVLKFTYPTLFQ
A*24:02, A*24:07, A*24:10,
|
GWQTLIGGLLLHVSWKLG
A*33:03, B*15:01, B*15:27,
|
WVEINSSSRSHVLVWLPAS
B*58:01, C*03:02, C*03:04,
|
VLFVGIIYAGSRALSRLAIPV
C*12:02
|
FLTLHNVAEVIICGYQKCFQ
|
KEKTSPAKICSALLLLAAAG
|
CLPFNDSQFNPDGYFWAII
|
HLLCVGAYKILQKSQKPSAL
|
SDIDQQYLNYIFSVVLLAFA
|
SHPTGDLFSVLDFPFLYFYR
|
FHGSCCASGFLGFFLMFST
|
VKLKNLLAPGQCAAWIFFA
|
KIITAGLSILLFDAILTSATTG
|
CLLLGALGEALLVFSERKSS
|
|
SEQ ID NO: 1887
ENSG00000135093.8
MLSSRAEAAMTAADRAIQ
A*02:03, A*02:07, A*11:01,
|
RFLRTGAAVRYKVMKNW
A*11:02, A*24:02, A*24:07,
|
GVIGGIAAALAAGIYVIWG
A*24:10, B*15:21, B*27:04,
|
PITERKKRRKGLVPGLVNL
B*38:02, B*39:01, B*40:01,
|
GNTCFMNSLLQGLSACPA
B*51:01, B*58:01, C*03:02,
|
FIRWLEEFTSQYSRDQKEP
C*07:02, C*14:02, C*15:02
|
PSHQYLSLTLLHLLKALSCQ
|
EVTDDEVLDASCLLDVLRM
|
YRWQISSFEEQDAHELFHV
|
ITSSLEDERDRQPRVTHLFD
|
VHSLEQQSEITPKQITCRTR
|
GSPHPTSNHWKSQHPFHG
|
RLTSN
|
|
SEQ ID NO: 1888
ENSG00000136231.9
MNKLYIGNLSENAAPSDLE
A*02:03, A*11:01, A*11:02,
|
SIFKDAKIPVSGPFLVKTGY
A*24:10, A*33:03, A*34:01,
|
AFVDCPDESWALKAIEALS
B*15:01, B*15:27, C*03:02,
|
GKIELHGKPIEVEHSVPKRQ
C*03:04, C*14:02
|
RIRKLQIRNIPPHLQWEVLD
|
SLLVQYGVVESCEQVNTDS
|
ETAVVNVTYSSKDQARQA
|
LDKLNGFQLENFTLKVAYIP
|
DEMAAQQNPLQQPRGRR
|
GLGQRGSSRQGSPGSVSK
|
QKPCDLPLRLLVPTQFVGAI
|
IGKEGATIRNITKQTQSKID
|
VHRKENAGAAEKSITILSTP
|
EGTSAACKSILEIMHKEAQ
|
DIKFTEEIPLKILAHNNFVG
|
RLIGKEGRNLKKIEQDTDTK
|
ITISPLQELTLYNPERTITVK
|
GNVETCAKAEEEIMKKIRE
|
SYENDIASMNLQAHLIPGL
|
NLNALGLFPPTSGMPPPTS
|
GPPSAMTPPYPQFEQSETE
|
TVHLFIPALSVGAIIGKQGQ
|
HIKQLSRFAGASIKIAPAEA
|
PDAKVRMVIITGPPEAQFK
|
AQGRIYGKIKEENFVSPKEE
|
VKLEAHIRVPSFAAGRVIGK
|
GGKTVNELQNLSSAEVVVP
|
RDQTPDENDQVVVKITGH
|
FYACQVAQRKIQEILTQVK
|
QHQQQKALQSGPPQSRRK
|
|
SEQ ID NO: 1889
ENSG00000136848.12
MEPDSLLDQDDSYESPQE
A*02:03
|
RPGSRRSLPGSLSEKSPSM
|
EPSAATPFRVTGFLSRRLKG
|
SIKRTKSQPKLDRNHSFRHI
|
|
SEQ ID NO: 1890
ENSG00000137203.6
MLWKLTDNIKYEDCEDRH
A*02:03, A*11:01, A*11:02,
|
DGTSNGTARLPQLGTVGQ
A*24:02, A*24:10, A*33:03,
|
SPYTSAPPLSHTPNADFQP
B*39:01, C*14:02
|
PYFPPPYQPIYPQSQDPYS
|
HVNDPYSLNPLHAQPQPQ
|
HPGWPGQRQSQESGLLHT
|
HRGLPHQLSGLDPRRDYRR
|
HEDLLHGPHALSSGLGDLSI
|
HSLPHAIEEVPHVEDPGINI
|
PDQTVIKKGPVSLSKSNSN
|
AVSAIPINKDNLFGGVVNP
|
NEVFCSVPGRLSLLSSTSK
|
|
SEQ ID NO: 1891
ENSG00000137474.15
MVILQQGDHVWMDLRLG
A*02:03, A*11:01, A*11:02,
|
QEFDVPIGAVVKLCDSGQV
A*24:02, A*24:07, A*24:10,
|
QVVDDEDNEHWISPQNA
A*33:03, B*15:01, B*39:01,
|
THIKPMHPTSVHGVEDMI
B*40:01, B*55:02, B*58:01,
|
RLGDLNEAGILRNLLIRYRD
C*03:02, C*03:04, C*03:67,
|
HLIYTYTGSILVAVNPYQLLS
C*07:02, C*12:02, C*14:02,
|
IYSPEHIRQYTNKKIGEMPP
C*15:02
|
HIFAIADNCYFNMKRNSRD
|
QCCIISGESGAGKTESTKLIL
|
QFLAAISGQHSWIEQQVLE
|
ATPILEAFGNAKTIRNDNSS
|
RFGKYIDIHFNKRGAIEGAK
|
IEQYLLEKSRVCRQALDERN
|
YHVFYCMLEGMSEDQKKK
|
LGLGQASDYNYLAMGNCI
|
TCEGRVDSQEYANIRSAM
|
KVLMFTDTENWEISKLLAA
|
ILHLGNLQYEARTFENLDA
|
CEVLFSPSLATAASLLEVNP
|
PDLMSCLTSRTLITRGETVS
|
TPLSREQALDVRDAFVKGI
|
YGRLFVWIVDKINAAIYKPP
|
SQDVKNSRRSIGLLDIFGFE
|
NFAVNSFEQLCINFANEHL
|
QQFFVRHVFKLEQEEYDLE
|
SIDWLHIEFTDNQDALDMI
|
ANKPMNIISLIDEESKFPKG
|
TDTTMLHKLNSQHKLNAN
|
YIPPKNNHETQFGINHFAG
|
IVYYETQGFLEKNRDTLHG
|
DIIQLVHSSRNKFIKQIFQA
|
DVAMGAETRKRSPTLSSQF
|
KRSLELLMRTLGACQPFFV
|
RCIKPNEFKKPMLFDRHLC
|
VRQLRYSGMMETIRIRRAG
|
YPIRYSFVEFVERYRVLLPG
|
VKPAYKQGDLRGTCQRMA
|
EAVLGTHDDWQIGKTKIFL
|
KDHHDMLLEVERDKAITD
|
RVILLQKVIRGFKDRSNFLK
|
LKNAATLIQRHWRGHNCR
|
KNYGLMRLGFLRLQALHRS
|
RKLHQQYRLARQRIIQFQA
|
RCRAYLVRKAFRHRLWAVL
|
TVQAYARGMIARRLHQRL
|
RAEYLWRLEAEKMRLAEEE
|
KLRKEMSAKKAKEEAERKH
|
QERLAQLAREDAERELKEK
|
EAARRKKELLEQMERARH
|
EPVNHSDMVDKMFGFLG
|
TSGGLPGQEGQAPSGFED
|
LERGRREMVEEDLDAALPL
|
PDEDEEDLSEYKFAKFAATY
|
FQGTTTHSYTRRPLKQPLLY
|
HDDEGDQLAALAVWITILR
|
FMGDLPEPKYHTAMSDGS
|
EKIPVMTKIYETLGKKTYKR
|
ELQALQGEGEAQLPEGQK
|
KSSVRHKLVHLTLKKKSKLT
|
EEVTKRLHDGESTVQGNS
|
MLEDRPTSNLEKLHFIIGNG
|
ILRPALRDEIYCQISKQLTH
|
NPSKSSYARGWILVSLCVG
|
CFAPSEKFVKYLRNFIHGGP
|
PGYAPYCEERLRRTFVNGT
|
RTQPPSWLELQATKSKKPI
|
MLPVTFMDGTTKTLLTDSA
|
TTAKELCNALADKISLKDRF
|
GFSLYIALFD
|
|
SEQ ID NO: 1892
ENSG00000138075.7
MGDLSSLTPGGSMGLQV
A*02:03, A*02:07, A*11:01,
|
NRGSQSSLEGAPATAPEPH
A*11:02, A*24:02, A*24:07,
|
SLGILHASYSVSHRVRPW
A*24:10, A*33:03, A*34:01,
|
WDITSCRQQWTRQILKDV
B*15:01, B*15:21, B*15:27,
|
SLYVESGQIMCILGSSGSGK
B*27:04, B*38:02, B*39:01,
|
TTLLDAMSGRLGRAGTFLG
B*40:01, B*40:06, B*46:01,
|
EVYVNGRALRREQFQDCFS
B*55:02, B*58:01, C*03:02,
|
YVLQSDTLLSSLTVRETLHY
C*03:04, C*03:67, C*04:01,
|
TALLAIRRGNPGSFQKKVE
C*04:03, C*07:02, C*08:01,
|
AVMAELSLSHVADRLIGNY
C*12:02, C*14:02, C*15:02
|
SLGGISTGERRRVSIAAQLL
|
QDPKVMLFDEPTTGLDCM
|
TANQIVVLLVELARRNRIVV
|
LTIHQPRSELFQLFDKIAILS
|
FGELIFCGTPAEMLDFFND
|
CGYPCPEHSNPFDFY
|
|
SEQ ID NO: 1893
ENSG00000142185.12
MEPSALRKAGSEQEEGFE
A*02:03, A*11:01, A*11:02,
|
GLPRRVTDLGMVSNLRRS
A*24:02, A*24:07, A*24:10,
|
NSSLFKSWRLQCPFGNND
A*33:03, A*34:01, B*15:01,
|
KQESLSSWIPENIKKKECVY
B*15:27, B*39:01, B*40:01,
|
FVESSKLSDAGKVVCQCGY
B*58:01, C*03:02, C*03:04,
|
THEQHLEEATKPHTFQGT
C*12:02, C*14:02, C*15:02
|
QWDPKKHVQEMPTDAFG
|
DIVFTGLSQKVKKYVRVSQ
|
DTPSSVIYHLMTQHWGLD
|
VPNLLISVTGGAKNFNMKP
|
RLKSIFRRGLVKVAQTTGA
|
WIITGGSHTGVMKQVGEA
|
VRDFSLSSSYKEGELITIGVA
|
TWGTVHRREGLIHPTGSFP
|
AEYILDEDGQGNLTCLDSN
|
HSHFILVDDGTHGQYGVEI
|
PLRTRLEKFISEQTKERGGV
|
AIKIPIVCVVLEGGPGTLHTI
|
DNATTNGTPCVVVEGSGR
|
VADVIAQVANLPVSDITISLI
|
QQKLSVFFQEMFETFTESRI
|
VEWTKKIQDIVRRRQLLTV
|
FREGKDGQQDVDVAILQA
|
LLKASRSQDHFGHENWDH
|
QLKLAVAWNRVDIARSEIF
|
MDEWQWKPSDLHPTMT
|
AALISNKPEFVKLFLENGVQ
|
LKEFVTWDTLLYLYENLDPS
|
CLFHSKLQMHHVAQVLRE
|
LLGDFTQPLYPRPRHNDRL
|
RLLLPVPHVKLNVQGVSLR
|
SLYKRSSGHVTFTMDPIRD
|
LLIWAIVQNRRELAGIIWA
|
QSQDCIAAALACSKILKELS
|
KEEEDTDSSEEMLALAEEY
|
EHRAIGVFTECYRKDEERA
|
QKLLTRVSEAWGKTTCLQL
|
ALEAKDMKFVSHGGIQAFL
|
TKVWWGQLSVDNGLWR
|
VTLCMLAFPLLLTGLISFREK
|
RLQDVGTPAARARAFFTAP
|
VVVFHLNILSYFAFLCLFAY
|
VLMVDFQPVPSWCECAIY
|
LWLFSLVCEEMRQLFYDPD
|
ECGLMKKAALYFSDFWNK
|
LDVGAILLFVAGLTCRLIPA
|
TLYPGRVILSLDFILFCLRLM
|
HIFTISKTLGPKIIIVKRMMK
|
DVFFFLFLLAVWVVSFGVA
|
KQAILIHNERRVDWLFRGA
|
VYHSYLTIFGQIPGYIDGVN
|
FNPEHCSPNGTDPYKPKCP
|
ESDATQQRPAFPEWLTVLL
|
LCLYLLFTNILLLNLLIAMFN
|
YTFQQVQEHTDQIWKFQR
|
HDLIEEYHGRPAAPPPFILL
|
SHLQLFIKRVVLKTPAKRHK
|
QLKNKLEKNEEAALLSWEI
|
YLKENYLQNRQFQQKQRP
|
EQKIEDISNKVDAMVDLLD
|
LDPLKRSGSMEQRLASLEE
|
QVAQTAQALHWIVRTLRA
|
SGFSSEADVPTLASQKAAE
|
EPDAEPGGRKKTEEPGDSY
|
HVNARHLLYPNCPVTRFPV
|
PNEKVPWETEFLIYDPPFYT
|
AERKDAAAMDPMGENP
|
MGRTGLRGRGSLSCFGPN
|
HTLYPMVTRWRRNEDGAI
|
CRKSIKKMLEVLVVKLPLSE
|
HWALPGGSREPGEMLPRK
|
LKRILRQEHWPSFENLLKC
|
GMEVYKGYMDDPRNTDN
|
AWIETVAVSVHFQDQNDV
|
ELNRLNSNLHACDSGASIR
|
WQVVDRRIPLYANHKTLL
|
QKAAAEFGAHY
|
|
SEQ ID NO: 1894
ENSG00000142235.4
MRQVLWLCNVCVTARETR
A*02:03, A*33:03, B*15:01,
|
HHLHLPAILDKMPAPGALI
B*39:01, B*40:01, C*03:02,
|
LLAAVSASGCLASPAHPDG
C*03:04
|
FALGRAPLAPPYAVVLISCS
|
GLLAFIFLLLTCLCCKRGDV
|
GFKEFENPEGEDCSGEYTP
|
PAEETSSSQSLPDVYILPLAE
|
VSLPMPAPQPSHSDMTTP
|
LGLSRQHLSYLQEIGSGWF
|
GKVILGEIFSDYTPAQVVVK
|
ELRASAGPLEQRKFISEAQP
|
YRSLQHPNVLQCLGLCVET
|
LPFLLIMEFCQLGDLKRYLR
|
AQRPPEGLSPELPPRDLRTL
|
QRMGLEIARGLAHLHSHN
|
YV
|
|
SEQ ID NO: 1895
ENSG00000142661.14
MTLPHSLGGAGDPRPPQA
A*02:03, A*11:01, A*11:02,
|
MEVHRLEHRQEEEQKEER
A*24:02, A*24:07, A*24:10,
|
QHSLRMGSSVRRRTFRSSE
A*33:03, B*15:01, B*15:27,
|
EEHEFSAADYALAAALALT
B*39:01, B*40:01, B*58:01,
|
ASSELSWEAQLRRQTSAVE
C*03:02, C*03:04, C*03:67,
|
LEERGQKRVGFGNDWERT
C*07:02, C*08:01, C*12:02,
|
EIAFLQTHRLLRQRRDWKT
C*14:02
|
LRRRTEEKVQEAKELRELCY
|
GRGPWFWIPLRSHAVWE
|
HTTVLLTCTVQASPPPQVT
|
WYKNDTRIDPRLFRAGKYR
|
ITNNYGLLSLEIRRCAIEDSA
|
TYTVRVKNAHGQASSFAK
|
VLVRTYLGKDAGFDSEIFKR
|
STFGPSVEFTSVLKPVFARE
|
KEPFSLSCLFSEDVLDAESIQ
|
WFRDGSLLRSSRRRKILYTD
|
RQASLKVSCTYKEDEGLYM
|
VRVPSPFGPREQSTYVLVR
|
DAEAENPGAPGSPLNVRCL
|
DVNRDCLILTWAPPSDTRG
|
NPITAYTIERCQGESGEWIA
|
CHEAPGGTCRCPIQGLVEG
|
QSYRFRVRAISRVGSSVPSK
|
ASELVVMGDHDAARRKTE
|
IPFDLGNKITISTDAFEDTVT
|
IPSPPTNVHASEIREAYVVL
|
AWEEPSPRDRAPLTYSLEK
|
SVIGSGTWEAISSESPVRSP
|
RFAVLDLEKKKSYVFRVRA
|
MNQYGLSDPSEPSEPIALR
|
GPPATLPPPAQVQAFRDT
|
QTSVSLTWDPVKDPELLGY
|
YIYSRKVGTSEWQTVNNKP
|
IQGTRFTVPGLRTGKEYEFC
|
VRSVSEAGVGESSAATEPIR
|
VKQALATPSAPYGFALLNC
|
GKNEMVIGWKPPKRRGG
|
GKILGYFLDQHDSEELDWH
|
AVNQQPIPTRVCKVSDLHE
|
GHFYEFRARAANWAGVG
|
ELSAPSSLFECKEWTMPQP
|
GPPYDVRASEVRATSLVLQ
|
WEPPLYMGAGPVTGYHVS
|
FQEEGSEQWKPVTPGPISG
|
THLRVSDLQPGKSYVFQVQ
|
AMNSAGLGQPSMPTDPV
|
LLEDKPGAHEIEVGVDEEG
|
FIYLAFEAPEAPDSSEFQWS
|
KDYKGPLDPQRVKIEDKVN
|
KSKVILKEPGLEDLGTYSVIV
|
TDADEDISASHTLTEEELEK
|
LKKLSHEIRNPVIKLISGWNI
|
DILERGEVRLWLEVEKLSPA
|
AELHLIFNNKEIFSSPNRKIN
|
FDREKGLVEVIIQNLSEEDK
|
GSYTAQLQDGKAKNQITLT
|
LVDDDFDKLLRKADAKRRD
|
WKRKQGPYFERPLQWKVT
|
EDCQVQLTCKVTNTKKETR
|
FQWFFQRAEMPDGQYDP
|
ETGTGLLCIEELSKKDKGIYR
|
AMVSDDRGEDDTILDLTG
|
DALDAIFTELGRIGALSATP
|
LKIQGTEEGIRIFSKVKYYNV
|
EYMKTTWFHKDKRLESGD
|
RIRTGTTLDEIWLHILDPKD
|
SDKGKYTLEIAAGKEVRQLS
|
TDLSGQAFEDAMAEHQRL
|
KTLAIIEKNRAKVVRGLPDV
|
ATIMEDKTLCLTCIVSGDPT
|
PEISWLKNDQPVTFLDRYR
|
MEVRGTEVTITIEKVNSEDS
|
GRYGVFVKNKYGSETGQV
|
TISVFKHGDEPKELKSM
|
|
SEQ ID NO: 1896
ENSG00000143669.9
MSTDSNSLAREFLTDVNRL
A*02:03, A*11:01, A*11:02,
|
CNAVVQRVEAREEEEEETH
A*24:02, A*24:07, A*24:10,
|
MATLGQYLVHGRGFLLLTK
A*33:03, A*34:01, B*15:01,
|
LNSIIDQALTCREELLTLLLSL
B*15:27, B*39:01, B*40:01,
|
LPLVWKIPVQEEKATDFNL
B*55:02, B*58:01, C*03:02,
|
PLSADIILTKEKNSSSQRST
C*03:04, C*03:67, C*07:02,
|
QEKLHLEGSALSSQVSAKV
C*12:02, C*14:02, C*15:02
|
NVFRKSRRQRKITHRYSVR
|
DARKTQLSTSDSEANSDEK
|
GIAMNKHRRPHLLHHFLTS
|
FPKQDHPKAKLDRLATKEQ
|
TPPDAMALENSREIIPRQG
|
SNTDILSEPAALSVISNMN
|
NSPFDLCHVLLSLLEKVCKF
|
DVTLNHNSPLAASVVPTLT
|
EFLAGFGDCCSLSDNLESR
|
VVSAGWTEEPVALIQRML
|
FRTVLHLLSVDVSTAEMM
|
PENLRKNLTELLRAALKIRIC
|
LEKQPDPFAPRQKKTLQEV
|
QEDFVFSKYRHRALLLPELL
|
EGVLQILICCLQSAASNPFY
|
FSQAMDLVQEFIQHHGFN
|
LFETAVLQMEWLVLRDGV
|
PPEASEHLKALINSVMKIM
|
STVKKVKSEQLHHSMCTRK
|
RHRRCEYSHFMHHHRDLS
|
GLLVSAFKNQVSKNPFEET
|
ADGDVYYPERCCCIAVCAH
|
QCLRLLQQASLSSTCVQILS
|
GVHNIGICCCMDPKSVIIPL
|
LHAFKLPALKNFQQHILNIL
|
NKLILDQLGGAEISPKIKKA
|
ACNICTVDSDQLAQLEETL
|
QGNLCDAELSSSLSSPSYRF
|
QGILPSSGSEDLLWKWDAL
|
KAYQNFVFEEDRLHSIQIA
|
NHICNLIQKGNIVVQWKLY
|
NYIFNPVLQRGVELAHHCQ
|
HLSVTSAQSHVCSHHNQC
|
LPQDVLQIYVKTLPILLKSRV
|
IRDLFLSCNGVSQIIELNCLN
|
GIRSHSLKAFETLIISLGEQQ
|
KDASVPDIDGIDIEQKELSS
|
VHVGTSFHHQQAYSDSPQ
|
SLSKFYAGLKEAYPKRRKTV
|
NQDVHINTINLFLCVAFLCV
|
SKEAESDRESANDSEDTSG
|
YDSTASEPLSHMLPCISLES
|
LVLPSPEHMHQAADIWS
|
MCRWIYMLSSVFQKQFYR
|
LGGFRVCHKLIFMIIQKLFR
|
SHKEEQGKKEGDTSVNEN
|
QDLNRISQPKRTMKEDLLS
|
LAIKSDPIPSELGSLKKSADS
|
LGKLELQHISSINVEEVSAT
|
EAAPEEAKLFTSQESETSLQ
|
SIRLLEALLAICLHGARTSQ
|
QKMELELPNQNLSVESILFE
|
MRDHLSQSKVIETQLAKPL
|
FDALLRVALGNYSADFEHN
|
DAMTEKSHQSAEELSSQP
|
GDFSEEAEDSQCCSFKLLVE
|
EEGYEADSESNPEDGETQD
|
DGVDLKSETEGFSASSSPN
|
DLLENLTQGEIIYPEICMLEL
|
NLLSASKAKLDVLAHVFESF
|
LKIIRQKEKNVFLLMQQGT
|
VKNLLGGFLSILTQDDSDF
|
QACQRVLVDLLVSLMSSRT
|
CSEELTLLLRIFLEKSPCTKIL
|
LLGILKIIESDTTMSPSQYLT
|
FPLLHAPNLSNGVSSQKYP
|
GILNSKAMGLLRRARVSRS
|
KKEADRESFPHRLLSSWHI
|
APVHLPLLGQNCWPHLSE
|
GFSVSLWFNVECIHEAEST
|
TEKGKKIKKRNKSLILPDSSF
|
DGTESDRPEGAEYINPGER
|
LIEEGCIHIISLGSKALMIQV
|
WADPHNATLIFRVCMDSN
|
DDMKAVLLAQVESQENIFL
|
PSKWQHLVLTYLQQPQGK
|
RRIHGKISIWVSGQRKPDV
|
TLDFMLPRKTSLSSDSNKTF
|
CMIGHCLSSQEEFLQLAGK
|
WDLGNLLLFNGAKVGSQE
|
AFYLYACGPNHTSVMPCK
|
YGKPVNDYSKYINKEILRCE
|
QIRELFMTKKDVDIGLLIESL
|
SVVYTTYCPAQYTIYEPVIRL
|
KGQMKTQLSQRPFSSKEV
|
QSILLEPHHLKNLQPTEYKT
|
IQGILHEIGGTGIFVFLFARV
|
VELSSCEETQALALRVILSLI
|
KYNQQRVHELENCNGLSM
|
IHQVLIKQKCIVGFYILKTLL
|
EGCCGEDIIYMNENGEFKL
|
DVDSNAIIQDVKLLEELLLD
|
WKIWSKAEQGVWETLLAA
|
LEVLIRADHHQQMFNIKQL
|
LKAQVVHHFLLTCQVLQEY
|
KEGQLTPMPREVCRSFVKII
|
AEVLGSPPDLELLTIIFNFLL
|
AVHPPTNTYVCHNPTNFYF
|
SLHIDGKIFQEKVRSIMYLR
|
HSSSGGRSLMSPGFMVISP
|
SGFTASPYEGENSSNIIPQQ
|
MAAHMLRSRSLPAFPTSSL
|
LTQSQKLTGSLGCSIDRLQ
|
NIADTYVATQSKKQNSLGS
|
SDTLKKGKEDAFISSCESAK
|
TVCEMEAVLSAQVSVSDV
|
PKGVLGFPVVKADHKQLG
|
AEPRSEDDSPGDESCPRRP
|
DYLKGLASFQRSHSTIASLG
|
LAFPSQNGSAAVGRWPSL
|
VDRNTDDWENFAYSLGYE
|
PNYNRTASAHSVTEDCLVP
|
ICCGLYELLSGVLLILPDVLL
|
EDVMDKLIQADTLLVLVNH
|
PSPAIQQGVIKLLDAYFARA
|
SKEQKDKFLKNRGFSLLAN
|
QLYLHRGTQELLECFIEMFF
|
GRHIGLDEEFDLEDVRNM
|
GLFQKWSVIPILGLIETSLYD
|
NILLHNALLLLLQILNSCSKV
|
ADMLLDNGLLYVLCNTVA
|
ALNGLEKNIPMSEYKLLAC
|
DIQQLFIAVTIHACSSSGSQ
|
YFRVIEDLIVMLGYLQNSK
|
NKRTQNMAVALQLRVLQ
|
AAMEFIRTTANHDSENLTD
|
SLQSPSAPHHAVVQKRKSI
|
AGPRKFPLAQTESLLMKM
|
RSVANDELHVMMQRRMS
|
QENPSQATETELAQRLQRL
|
TVLAVNRIIYQEFNSDIIDIL
|
RTPENVTQSKTSVFQTEISE
|
ENIHHEQSSVFNPFQKEIFT
|
YLVEGFKVSIGSSKASGSKQ
|
QWTKILWSCKETFRMQLG
|
RLLVHILSPAHAAQERKQIF
|
EIVHEPNHQEILRDCLSPSL
|
QHGAKLVLYLSELIHNHQG
|
ELTEEELGTAELLMNALKLC
|
GHKCIPPSASTKADLIKMIK
|
EEQKKYETEEGVNKAAWQ
|
KTVNNNQQSLFQRLDSKS
|
KDISKIAADITQAVSLSQGN
|
ERKKVIQHIRGMYKVDLSA
|
SRHWQELIQQLTHDRAV
|
WYDPIYYPTSWQLDPTEG
|
PNRERRRLQRCYLTIPNKYL
|
LRDRQKSEDVVKPPLSYLFE
|
DKTHSSFSSTVKDKAASESI
|
RVNRRCISVAPSRETAGELL
|
LGKCGMYFVEDNASDTVE
|
SSSLQGELEPASFSWTYEEI
|
KEVHKRWWQLRDNAVEIF
|
LTNGRTLLLAFDNTKVRDD
|
VYHNILTNNLPNLLEYGNIT
|
ALTNLWYTGQITNFEYLTH
|
LNKHAGRSFNDLMQYPVF
|
PFILADYVSETLDLNDLLIYR
|
NLSKPIAVQYKEKEDRYVD
|
TYKYLEEEYRKGAREDDPM
|
PPVQPYHYGSHYSNSGTVL
|
HFLVRMPPFTKMFLAYQD
|
QSFDIPDRTFHSTNTTWRL
|
SSFESMTDVKELIPEFFYLPE
|
FLVNREGFDFGVRQNGER
|
VNHVNLPPWARNDPRLFI
|
LIHRQALESDYVSQNICQW
|
IDLVFGYKQKGKASVQAIN
|
VFHPATYFGMDVSAVEDP
|
VQRRALETMIKTYGQTPR
|
QLFHMAHVSRPGAKLNIE
|
GELPAAVGLLVQFAFRETR
|
EQVKEITYPSPLSWIKGLK
|
WGEYVGSPSAPVPVVCFS
|
QPHGERFGSLQALPTRAIC
|
GLSRNFCLLMTYSKEQGVR
|
SMNSTDIQWSAILSWGYA
|
DNILRLKSKQSEPPVNFIQS
|
SQQYQVTSCAWVPDSCQL
|
FTGSKCGVITAYTNRFTSST
|
PSEIEMETQIHLYGHTEEIT
|
SLFVCKPYSILISVSRDGTCII
|
WDLNRLCYVQSLAGHKSP
|
VTAVSASETSGDIATVCDS
|
AGGGSDLRLWTVNGDLV
|
GHVHCREIICSVAFSNQPE
|
GVSINVIAGGLENGIVRLW
|
STWDLKPVREITFPKSNKPI
|
ISLTFSCDGHHLYTANSDGT
|
VIAWCRKDQQRLKQPMFY
|
SFLSSYAAG
|
|
SEQ ID NO: 1897
ENSG00000143882.5
MSEFWLISAPGDKENLQAL
A*02:03, A*11:01, A*11:02,
|
ERMNTVTSKSNLSYNTKFA
A*33:03, B*58:01, C*03:02,
|
IPDFKVGTLDSLVGLSDELG
C*03:04
|
KLDTFAESLIRRMAQSVVE
|
VMEDSKGKVQEHLLANGV
|
DLTSFVTHFEWD
|
|
SEQ ID NO: 1898
ENSG00000145214.9
MAAAAEPGARAWLGGGS
A*02:03, A*11:01, A*11:02,
|
PRPGSPACSPVLGSGGRAR
A*33:03, B*15:01, B*39:01,
|
PGPGPGPGPERAGVRAPG
B*40:01, C*03:02, C*03:04
|
PAAAPGHSFRKVTLTKPTF
|
CHLCSDFIWGLAGFLCDVC
|
NFMSHEKCLKHVRIPCTSV
|
APSLVRVPVAHCFGPRGLH
|
KRKFCAVCRKVLEAPALHC
|
EVCELHLHPDCVPFACSDC
|
RQCHQDGHQDHDTHHH
|
HWREGNLPSGARCEVCRK
|
TCGSSDVLAGVRCEWCGV
|
QAHSLCSAALAPECGFGRL
|
RSLVLPPACVRLLPGGFSKT
|
QSFRIVEAAEPGEGGDGA
|
DGSAAVGPGRETQATPES
|
GKQTLKIFDGDDAVRRSQF
|
RLVTVSRLAGAEEVLEAALR
|
AHHIPEDPGHLELCRLPPSS
|
QACDAWAGGKAGSAVISE
|
EGRSPGSGEATPEAWVIRA
|
LPRAQEVLKIYPGWLKVGV
|
AYVSVRVTPKSTARSVVLE
|
VLPLLGRQAESPESFQLVEV
|
AMGCRHVQRTMLMDEQ
|
PLLDRLQDIRQMSVRQVS
|
QTRFYVAESRDVAPHVSLF
|
VGGLPPGLSPEEYSSLLHEA
|
GATKATVVSVSHIYSSQGA
|
VVLDVACFAEAERLYMLLK
|
DMAVRGRLLTALVLPDLLH
|
AKLPPDSCPLLVFVNPKSG
|
GLKGRDLLCSFRKLLNPHQ
|
VFDLTNGGPLPGLHLFSQV
|
PCFRVLVCGGDGTVGWVL
|
GALEETRYRLACPEPSVAIL
|
PLGTGNDLGRVLRWGAGY
|
SGEDPFSVLLSVDEADAVL
|
MDRWTILLDAHEAGSAEN
|
DTADAEP
|
|
SEQ ID NO: 1899
ENSG00000151025.9
MGAMAYPLLLCLLLAQLGL
A*02:03, A*02:07, A*11:01,
|
GAVGASRDPQGRPDSPRE
A*11:02, A*24:02, A*24:07,
|
RTPKGKPHAQQPGRASAS
A*24:10, A*33:03, B*15:01,
|
DSSAPWSRSTDGTILAQKL
B*39:01, B*40:01, B*55:02,
|
AEEVPMDVASYLYTGDSH
B*58:01, C*03:02, C*03:04,
|
QLKRANCSGRYELAGLPGK
C*03:67, C*07:02, C*12:02,
|
WPALASAHPSLHRALDTLT
C*14:02
|
HATNFLNVMLQSNKSREQ
|
NLQDDLDWYQALVWSLLE
|
GEPSISRAAITFSTDSLSAPA
|
PQVFLQATREESRILLQDLS
|
SSAPHLANATLETEWFHGL
|
RRKWRPHLHRRGPNQGP
|
RGLGHSWRRKDGLGGDKS
|
HFKWSPPYLECENGSYKPG
|
WLVTLSSAIYGLQPNLVPEF
|
RGVMKVDINLQKVDIDQC
|
SSDGWFSGTHKCHLNNSE
|
CMPIKGLGFVLGAYECICK
|
AGFYHPGVLPVNNFRRRG
|
PDQHISGSTKDVSEEAYVC
|
LPCREGCPFCADDSPCFVQ
|
EDKYLRLAIISFQALCMLLD
|
FVSMLVVYHFRKAKSIRAS
|
GLILLETILFGSLLLYFPVVILY
|
FEPSTFRCILLRWARLLGFA
|
TVYGTVTLKLHRVLKVFLSR
|
TAQRIPYMTGGRVMRML
|
AVILLVVFWFLIGWTSSVC
|
QNLEKQISLIGQGKTSDHLI
|
FNMCLIDRWDYMTAVAEF
|
LFLLWGVYLCYAVRTVPSA
|
FHEPRYMAVAVHNELIISAI
|
FHTIRFVLASRLQSDWML
|
MLYFAHTHLTVTVTIGLLLI
|
PKFSHSSNNPRDDIATEAY
|
EDELDMGRSGSYLNSSINS
|
AWSEHSLDPEDIRDELKKL
|
YAQLEIYKRKKMITNNPHL
|
QKKRCSKKGLGRSIMRRIT
|
EIPETVSRQCSKEDKEGAD
|
HGTAKGTALIRKNPPESSG
|
NTGKSKEETLKNRVFSLKKS
|
HSTYDHVRDQTEESSSLPT
|
ESQEEETTENSTLESLSGKK
|
LTQKLKEDSEAESTESVPLV
|
CKSASAHNLSSEKKTGHPR
|
TSMLQKSLSVIASAKEKTLG
|
LAGKTQTAGVEERTKSQKP
|
LPKDKETNRNHSNSDNTET
|
KDPAPQNSNPAEEPRKPQ
|
KSGIMKQQRVNPTTANSD
|
LNPGTTQMKDNFDIGEVC
|
PWEVYDLTPGPVPSESKV
|
QKHVSIVASEMEKNPTFSL
|
KEKSHHKPKAAEVCQQSN
|
QKRIDKAEVCLWESQGQSI
|
LEDEKLLISKTPVLPERAKEE
|
NGGQPRAANVCAGQSEEL
|
PPKAVASKTENENLNQIGH
|
QEKKTSSSEENVRGSYNSS
|
NNFQQPLTSRAEVCPWEF
|
ETPAQPNAGRSVALPASSA
|
LSANKIAGPRKEEIWDSFK
|
V
|
|
SEQ ID NO: 1900
ENSG00000151229.8
MSRKASENVEYTLRSLSSL
A*02:03, A*02:07, A*11:01,
|
MGERRRKQPEPDAASAAG
A*11:02, A*24:10, A*34:01,
|
ECSLLAAAESSTSLQSAGA
B*15:01, B*15:21, B*15:27,
|
GGGGVGDLERAARRQFQ
B*27:04, B*40:01, B*40:06,
|
QDETPAFVYVVAVFSALGG
B*46:01, B*55:02, B*58:01,
|
FLFGYDTGVVSGAMLLLKR
C*01:02, C*03:02, C*03:04,
|
QLSLDALWQELLVSSTVGA
C*03:67, C*04:01, C*04:03,
|
AAVSALAGGALNGVFGRR
C*08:01, C*12:02, C*15:02
|
AAILLASALFTAGSAVLAAA
|
NNKETLLAGRLVVGLGIGIA
|
SMTVPVYIAEVSPPNLRGR
|
LVTINTLFITGGQFFASVVD
|
GAFSYLQKDGW
|
|
SEQ ID NO: 1901
ENSG00000151914.13
MAGYLSPAAYLYVEEQEYL
A*02:03, A*11:01, A*11:02,
|
QAYEDVLERYKDERDKVQ
A*24:02, A*24:07, A*24:10,
|
KKTFTKWINQHLMKVRKH
A*33:03, A*34:01, B*15:01,
|
VNDLYEDLRDGHNLISLLEV
B*15:27, B*39:01, B*40:01,
|
LSGDTLPREKGRMRFHRL
B*55:02, B*58:01, C*03:02,
|
QNVQIALDYLKRRQVKLVN
C*03:04, C*07:02, C*12:02,
|
IRNDDITDGNPKLTLGLIWT
C*14:02, C*15:02
|
IILHFQISDIHVTGESEDMS
|
AKERLLLWTQQATEGYAGI
|
RCENFTTCWRDGKLFNAII
|
HKYRPDLIDMNTVAVQSN
|
LANLEHAFYVAEKIGVIRLL
|
DPEDVDVSSPDEKSVITYVS
|
SLYDAFPKVPEGGEGIGAN
|
DVEVKWIEYQNMVNYLIQ
|
WIRHHVTTMSERTFPNNP
|
VELKALYNQYLQFKETEIPP
|
KETEKSKIKRLYKLLEIWIEF
|
GRIKLLQGYHPNDIEKEWG
|
KLIIAMLEREKALRPEVERL
|
EMLQQIANRVQRDSVICE
|
DKLILAGNALQSDSKRLESG
|
VQFQNEAEIAGYILECENLL
|
RQHVIDVQILIDGKYYQAD
|
QLVQRVAKLRDEIMALRN
|
ECSSVYSKGRILTTEQTKLM
|
ISGITQSLNSGFAQTLHPSL
|
TSGLTQSLTPSLTSSSMTSG
|
LSSGMTSRLTPSVTPAYTP
|
GFPSGLVPNFSSGVEPNSL
|
QTLKLMQIRKPLLKSSLLDQ
|
NLTEEEINMKFVQDLLNW
|
VDEMQVQLDRTEWGSDL
|
PSVESHLENHKNVHRAIEE
|
FESSLKEAKISEIQMTAPLKL
|
TYAEKLHRLESQYAKLLNTS
|
RNQERHLDTLHNFVSRAT
|
NELIWLNEKEEEEVAYDWS
|
ERNTNIARKKDYHAELMRE
|
LDQKEENIKSVQEIAEQLLL
|
ENHPARLTIEAYRAAMQT
|
QWSWILQLCQCVEQHIKE
|
NTAYFEFFNDAKEATDYLR
|
NLKDAIQRKYSCDRSSSIHK
|
LEDLVQESMEEKEELLQYK
|
STIANLMGKAKTIIQLKPRN
|
SDCPLKTSIPIKAICDYRQIEI
|
TIYKDDECVLANNSHRAK
|
WKVISPTGNEAMVPSVCF
|
TVPPPNKEAVDLANRIEQQ
|
YQNVLTLWHESHINMKSV
|
VSWHYLINEIDRIRASNVAS
|
IKTMLPGEHQQVLSNLQSR
|
FEDFLEDSQESQVFSGSDIT
|
QLEKEVNVCKQYYQELLKS
|
AEREEQEESVYNLYISEVRN
|
IRLRLENCEDRLIRQIRTPLE
|
RDDLHESVFRITEQEKLKKE
|
LERLKDDLGTITNKCEEFFS
|
QAAASSSVPTLRSELNVVL
|
QNMNQVYSMSSTYIDKLK
|
TVNLVLKNTQAAEALVKLY
|
ETKLCEEEAVIADKNNIENLI
|
STLKQWRSEVDEKRQVFH
|
ALEDELQKAKAISDEMFKT
|
YKERDLDFDWHKEKADQL
|
VERWQNVHVQIDNRLRDL
|
EGIGKSLKYYRDTYHPLDD
|
WIQQVETTQRKIQENQPE
|
NSKTLATQLNQQKMLVSEI
|
EMKQSKMDECQKYAEQYS
|
ATVKDYELQTMTYRAMVD
|
SQQKSPVKRRRMQSSADLI
|
IQEFMDLRTRYTALVTLMT
|
QYIKFAGDSLKRLEEEEKSL
|
EEEKKEHVEKAKELQKWVS
|
NISKTLKDAEKAGKPPFSK
|
QKISSEEISTKKEQLSEALQT
|
IQLFLAKHGDKMTDEERNE
|
LEKQVKTLQESYNLLFSESL
|
KQLQESQTSGDVKVEEKLD
|
KVIAGTIDQTTGEVLSVFQ
|
AVLRGLIDYDTGIRLLETQL
|
MISGLISPELRKCFDLKDAK
|
SHGLIDEQILCQLKELSKAK
|
EIISAASPTTIPVLDALAQS
|
MITESMAIKVLEILLSTGSLV
|
IPATGEQLTLQKAFQQNLV
|
SSALFSKVLERQNMCKDLI
|
DPCTSEKVSLIDMVQRSTL
|
QENTGMWLLPVRPQEGG
|
RITLKCGRNISILRAAHEGLI
|
DRETMFRLLSAQLLSGGLI
|
NSNSGQRMTVEEAVREGV
|
IDRDTASSILTYQVQTGGII
|
QSNPAKRLTVDEAVQCDLI
|
TSSSALLVLEAQRGYVGLI
|
WPHSGEIFPTSSSLQQELIT
|
NELAYKILNGRQKIAALYIP
|
ESSQVIGLDAAKQLGIIDNN
|
TASILKNITLPDKMPDLGDL
|
EACKNARRWLSFCKFQPST
|
VHDYRQEEDVFDGEEPVT
|
TQTSEETKKLFLSYLMINSY
|
MDANTGQRLLLYDGDLDE
|
AVGMLLEGCHAEFDGNTA
|
IKECLDVLSSSGVFLNNASG
|
REKDECTATPSSFNKCHCG
|
EPEHEETPENRKCAIDEEFN
|
EMRNTVINSEFSQSGKLAS
|
TISIDPKVNSSPSVCVPSLIS
|
YLTQTELADISMLRSDSENI
|
LTNYENQSRVETNERANEC
|
SHSKNIQNFPSDLIENPIMK
|
SKMSKFCGVNETENEDNT
|
NRDSPIFDYSPRLSALLSHD
|
KLMHSQGSFNDTHTPESN
|
GNKCEAPALSFSDKTMLSG
|
QRIGEKFQDQFLGIAAINIS
|
LPGEQYGQKSLNMISSNP
|
QVQYHNDKYISNTSGEDEK
|
THPGFQQMPEDKEDESEIE
|
EYSCAVTPGGDTDNAIVSL
|
TCATPLLDETISASDYETSLL
|
NDQQNNTGTDTDSDDDF
|
YDTPLFEDDDHDSLLLDGD
|
DRDCLHPEDYDTLQEEND
|
ETASPADVFYDVSKENENS
|
MVPQGAPVGSLSVKNKAH
|
CLQDFLMDVEKDELDSGE
|
KIHLNPVGSDKVNGQSLET
|
GSERECTNILEGDESDSLTD
|
YDIVGGKESFTASLKFDDSG
|
SWRGRKEEYVTGQEFHSD
|
TDHLDSMQSEESYGDYIYD
|
SNDQDDDDDDGIDEEGG
|
GIRDENGKPRCQNVAEDM
|
DIQLCASILNENSDENENIN
|
TMILLDKMHSCSSLEKQQR
|
VNVVQLASPSENNLVTEKS
|
NLPEYTTEIAGKSKENLLNH
|
EMVLKDVLPPIIKDTESEKT
|
FGPASISHDNNNISSTSELG
|
TDLANTKVKLIQGSELPELT
|
DSVKGKDEYFKNMTPKVD
|
SSLDHIICTEPDLIGKPAEES
|
HLSLIASVTDKDPQGNGSD
|
LIKGRDGKSDILIEDETSIQK
|
MYLGEGEVLVEGLVEEENR
|
HLKLLPGKNTRDSFKLINSQ
|
FPFPQITNNEELNQKGSLK
|
KATVTLKDEPNNLQIIVSKS
|
PVQFENLEEIFDTSVSKEIS
|
DDITSDITSWEGNTHFEESF
|
TDGPEKELDLFTYLKHCAK
|
NIKAKDVAKPNEDVPSHVL
|
ITAPPMKEHLQLGVNNTKE
|
KSTSTQKDSPLNDMIQSN
|
DLCSKESISGGGTEISQFTP
|
ESIEATLSILSRKHVEDVGK
|
NDFLQSERCANGLGNDNS
|
SNTLNTDYSFLEINNKKERI
|
EQQLPKEQALSPRSQEKEV
|
QIPELSQVFVEDVKDILKSR
|
LKEGHMNPQEVEEPSACA
|
DTKILIQNLIKRITTSQLVNE
|
ASTVPSDSQMSDSSGVSP
|
MTNSSELKPESRDDPFCIG
|
NLKSELLLNILKQDQHSQKI
|
TGVFELMRELTHMEYDLEK
|
RGITSKVLPLQLENIFYKLLA
|
DGYSEKIEHVGDFNQKACS
|
TSEMMEEKPHILGDIKSKE
|
GNYYSPNLETVKEIGLESST
|
VWASTLPRDEKLKDLCNDF
|
PSHLECTSGSKEMASGDSS
|
TEQFSSELQQCLQHTEKM
|
HEYLTLLQDMKPPLDNQES
|
LDNNLEALKNQLRQLETFE
|
LGLAPIAVILRKDMKLAEEF
|
LKSLPSDFPRGHVEELSISH
|
QSLKTAFSSLSNVSSERTKQ
|
IMLAIDSEMSKLAVSHEEFL
|
HKLKSFSDWVSEKSKSVKD
|
IEIVNVQDSEYVKKRLEFLK
|
NVLKDLGHTKMQLETTAF
|
DVQFFISEYAQDLSPNQSK
|
QLLRLLNTTQKCFLDVQES
|
VTTQVERLETQLHLEQDLD
|
DQKIVAERQQEYKEKLQGI
|
CDLLTQTENRLIGHQEAFM
|
IGDGTVELKKYQSKQEELQ
|
KDMQGSAQALAEVVKNTE
|
NFLKENGEKLSQEDKALIE
|
QKLNEAKIKCEQLNLKAEQ
|
SKKELDKVVTTAIKEETEKV
|
AAVKQLEESKTKIENLLDW
|
LSNVDKDSERAGTKHKQVI
|
EQNGTHFQEGDGKSAIGE
|
EDEVNGNLLETDVDGQVG
|
TTQENLNQQYQKVKAQHE
|
KIISQHQAVIIATQSAQVLL
|
EKQGQYLSPEEKEKLQKN
|
MKELKVHYETALAESEKKM
|
KLTHSLQEELEKFDADYTEF
|
EHWLQQSEQELENLEAGA
|
DDINGLMTKLKRQKSFSED
|
VISHKGDLRYITISGNRVLE
|
AAKSCSKRDGGKVDTSAT
|
HREVQRKLDHATDRFRSLY
|
SKCNVLGNNLKDLVDKYQ
|
HYEDASCGLLAGLQACEAT
|
ASKHLSEPIAVDPKNLQRQ
|
LEETKALQGQISSQQVAVE
|
KLKKTAEVLLDARGSLLPAK
|
NDIQKTLDDIVGRYEDLSKS
|
VNERNEKLQITLTRSLSVQD
|
GLDEMLDWMGNVESSLK
|
EQDVGTGYCRSSEQYKCH
|
E
|
|
SEQ ID NO: 1902
ENSG00000152359.10
MSSDEEKYSLPVVQNDSSR
A*02:03, A*11:01, A*11:02,
|
GSSVSSNLQEEYEELLHYAI
A*24:02, A*24:10, A*33:03,
|
VTPNIEPCASQSSHPKGEL
A*34:01, B*39:01, B*40:01,
|
VPDVRISTIHDILHSQGNNS
B*55:02, C*03:02, C*03:04,
|
EVRETAIEVGKGCDFHISSH
C*12:02
|
SKTDESSPVLSPRKPSHPV
|
MDFFSSHLLADSSSPATNS
|
SHTDAHEILVSDFLVSDENL
|
QKMENVLDLWSSGLKTNII
|
SELSKWRLNFIDWHRME
|
MRKEKEKHAAHLKQLCNQ
|
INELKELQKTFEISIGRKDEV
|
ISSLSHAIGKQKEKIELMRTF
|
FHWRIGHVRARQDVYEGK
|
LADQYYQRTLLKKVWKVW
|
RSVVQKQWKDVVERACQ
|
ARAEEVCIQISNDYEAKVA
|
MLSGALENAKAEIQRMQH
|
EKEHFEDSMKKAFMRGVC
|
ALNLEAMTIFQNRNDAGI
|
DSTNNKKEEYGPGVQGKE
|
HSAHLDPSAPPMPLPVTSP
|
LLPSPPAAVGGASATAVPS
|
AASMTSTRAASASSVHVP
|
VSALGAGSAATAASEEMY
|
VPRVVTSAQQKAGRTITAR
|
ITGRCDFASKNRISSSLAIM
|
GVSPPMSSVVVEKHHPVT
|
VQTIPQATAAKYPRTIHPES
|
STSASRSLGTRSAHTQSLTS
|
VHSIKVVD
|
|
SEQ ID NO: 1903
ENSG00000153046.13
MASEELYEVERIVDKRKNK
A*02:03, A*11:01, A*11:02,
|
KGKTEYLVRWKGYDSEDD
A*33:03, B*15:01, C*03:02,
|
TWEPEQHLVNCEEYIHDF
C*07:02, C*15:02
|
NRRHTEKQKESTLTRTNRT
|
SPNNARKQISRSTNSNFSK
|
TSPKALVIGKDHESKNSQLF
|
AASQKFRKNTAPSLSSRKN
|
|
SEQ ID NO: 1904
ENSG00000154556.13
MSYYQRPFSPSAYSLPASL
A*02:03, A*11:01, A*11:02,
|
NSSIVMQHGTSLDSTDTYP
A*24:10, A*33:03, B*15:01,
|
QHAQSLDGTTSSSIPLYRSS
B*15:27, B*39:01, B*58:01,
|
EEEKRVTVIKAPHYPGIGPV
C*03:02, C*03:04, C*07:02,
|
DESGIPTAIRTTVDRPKDW
C*12:02, C*14:02, C*15:02
|
YKTMFKQIHMVHKPDDDT
|
DMYNTPYTYNAGLYNPPY
|
SAQSHPAAKTQTYRPLSKS
|
HSDNSPNAFKDASSPVPPP
|
HVPPPVPPLRPRDRSSTEK
|
HDWDPPDRKVDTRKFRSE
|
PRSIFEYEPGKSSILQHERPA
|
SLYQSSIDRSLERPMSSAS
|
MASDFRKRRKSEPAVGPP
|
RGLGDQSASRTSPGRVDLP
|
GSSTTLTKSFTSSSPSSPSRA
|
KGGDDSKICPSLCSYSGLN
|
GNPSSELDYCSTYRQHLDV
|
PRDSPRAISFKNGWQMAR
|
QNAEIWSSTEETVSPKIKSR
|
SCDDLLNDDCDSFPDPKVK
|
SESMGSLLCEEDSKESCPM
|
AWGSPYVPEVRSNGRSRIR
|
HRSARNAPGFLKMYKKM
|
HRINRKDLMNSEVICSVKS
|
RILQYESEQQHKDLLRAWS
|
QCSTEEVPRDMVPTRISEF
|
EKLIQKSKSMPNLGDDMLS
|
PVTLEPPQNGLCPKRRFSIE
|
YLLEEENQSGPPARGRRGC
|
QSNALVPIHIEVTSDEQPR
|
AHVEFSDSDQDGVVSDHS
|
DYIHLEGSSFCSESDFDHFS
|
FTSSESFYGSSHHHHHHHH
|
HHHRHLISSCKGRCPASYT
|
RFTTMLKHERARHENTEEP
|
RRQEMDPGLSKLAFLVSPV
|
PFRRKKNSAPKKQTEKAKC
|
KASVFEALDSALKDICDQIK
|
AEKKRGSLPDNSILHRLISEL
|
LPDVPERNSSLRALRRSPLH
|
QPLHPLPPDGAIHCPPYQN
|
DCGRMPRSASFQDVDTAN
|
SSCHHQDRGGAL
|
|
SEQ ID NO: 1905
ENSG00000155275.14
MAEVGRTGISYPGALLPQG
A*02:03, A*11:01, A*11:02,
|
FWAAVEVWLERPQVANK
A*24:02, A*24:10, A*33:03,
|
RLCGARLEARWSAALPCAE
B*15:01, B*15:27, B*39:01,
|
ARGPGTSAGSEQKERGPG
B*40:01, B*55:02, B*58:01,
|
PGQGSPGGGPGPRSLSGP
C*03:02, C*14:02, C*15:02
|
EQGTACCELEEAQGQCQQ
|
EEAQREAASVPLRDSGHP
|
GHAEGREGDFPAADLDSL
|
WEDFSQSLARGNSELLAFL
|
TSSGAGSQPEAQRELDVVL
|
RTVIPKTSPHCPLTTPRREIV
|
VQDVLNGTITFLPLEEDDE
|
GNLKVKMSNVYQIQLSHS
|
KEEWFISVLIFCPERWHSD
|
GIVYPKPTWLGEELLAKLAK
|
WSVENKKSDFKSTLSLISIM
|
KYSKAYQELKEKYKEMVKV
|
WPEVTDPEKFVYEDVAIAA
|
YLLILWEEERAERRLTARQS
|
FVDLGCGNGLLVHILSSEG
|
HPGRGIDVRRRKIWDMYG
|
PQTQLEEDAITPNDKTLFP
|
DVDWLIGNHSDELTPWIP
|
VIAARSSYNCRFFVLPCCFF
|
DFIGRYSRRQSKKTQYREYL
|
DFIKEVGFTCGFHVDEDCL
|
RIPSTKRVCLVGKSRTYPSS
|
REASVDEKRTQYIKSRRGC
|
PVSPPGWELSPSPRWVAA
|
GSAGHCDGQQALDARVG
|
CVTRAWAAEHGAGPQAE
|
GPWLPGFHPREKAERVRN
|
CAALPRDFIDQVVLQVANL
|
LLGGKQLNTRSSRNGSLKT
|
WNGGESLSLAEVANELDT
|
ETLRRLKRECGGLQTLLRNS
|
HQVFQVVNGRVHIRDWR
|
EETLWKTKQPEAKQRLLSE
|
ACKTRLCWFFMHHPDGC
|
ALSTDCCPFAHGPAELRPP
|
RTTPRKKIS
|
|
SEQ ID NO: 1906
ENSG00000155506.12
MATQVEPLLPGGATLLQA
A*02:03
|
EEHGGLVRKKPPPAPEGKG
|
EPGPNDVRGGEPDGSARR
|
PRPPCAKPHKEGTGQQER
|
ESPRPLQLPGAEGPAISDG
|
EEGGGEPGAGGGAAGAA
|
GAGRRDFVEAPPPKVNPW
|
TKNALPPVLTTVNGQ
|
|
SEQ ID NO: 1907
ENSG00000157514.12
MNTEMYQTPMEVAVYQL
A*02:03, A*24:02, A*24:07,
|
HNFSISFFSSLLGGDVVSVK
A*24:10, B*15:01, C*03:02,
|
LD
C*03:04, C*03:67, C*12:02,
|
C*15:02
|
|
SEQ ID NO: 1908
ENSG00000158321.11
MDGPTRGHGLRKKRRSRS
A*02:03, A*24:10, B*15:01,
|
QRDRERRSRGGLGAGAAG
B*15:27, B*39:01, B*58:01,
|
GGGAGRTRALSLASSSGSD
C*03:02, C*03:04, C*03:67,
|
KEDNGKPPSSAPSRPRPPR
C*12:02, C*14:02, C*15:02
|
RKRRESTSAEEDIIDGFAMT
|
SFVTFEALEKDVALKPQER
|
VEKRQTPLTKKKREALTNG
|
LSFHSKKSRLSHPHHYSSDR
|
ENDRNLCQHLGKRKKMPK
|
ALRQLKPGQNSCRDSDSES
|
ASGESKGFHRSSSRERLSDS
|
SAPSSLGTGYFCDSDSDQE
|
EKASDASSEKLFNTVIVNKD
|
PELGVGTLPEHDSQDAGPI
|
VPKISGLERSQEKSQDCCKE
|
PIFEPVVLKDPCPQVAQPIP
|
QPQTEPQLRAPSPDPDLV
|
QRTEAPPQPPPLSTQPPQ
|
GPPEAQLQPAPQPQVQRP
|
PRPQSPTQLLHQNLPPVQ
|
AHPSAQSLSQPLSAYNSSSL
|
SLNSLSSSRSSTPAKTQPAP
|
PHISHHPSASPFPLSLPNHS
|
PLHSFTPTLQPPAHSHHPN
|
MFAPPTALPPPPPLT
|
|
SEQ ID NO: 1909
ENSG00000158486.9
MGATGRLELTLAAPPHPG
A*02:03, A*02:07, A*11:01,
|
PAFQRSKARETQGEEEGSE
A*11:02, A*24:02, A*24:07,
|
MQIAKSDSIHHMSHSQGQ
A*24:10, A*33:03, A*34:01,
|
PELPPLPASANEEPSGLYQT
B*15:01, B*15:21, B*15:27,
|
VMSHSFYPPLMQRTSWTL
B*27:04, B*38:02, B*39:01,
|
AAPFKEQHHHRGPSDSIA
B*40:01, B*40:06, B*46:01,
|
NNYSLMAQDLKLKDLLKVY
B*51:01, B*55:02, B*58:01,
|
QPATISVPRDRTGQGLPSS
C*01:02, C*03:02, C*03:04,
|
GNRSSSEPMRKKTKFSSRN
C*03:67, C*04:01, C*04:03,
|
KEDSTRIKLAFKTSIFSPMK
C*07:02, C*08:01, C*12:02,
|
KEVKTSLTFPGSRPMSPEQ
C*14:02, C*15:02
|
QLDVMLQQEMEMESKEK
|
KPSESDLERYYYYLTNGIRK
|
DMIAPEEGEVMVRISKLIS
|
NTLLTSPFLEPLMVVLVQE
|
KENDYYCSLMKSIVDYILM
|
DPMERKRLFIESIPRLFPQR
|
VIRAPVPWHSVYRSAKKW
|
NEEHLHTVNPMMLRLKEL
|
WFAEFRDLRFVRTAEILAG
|
KLPLQPQEFWDVIQKHCLE
|
AHQTLLNKWIPTCAQLFTS
|
RKEHWIHFAPKSNYDSSRN
|
IEEYFASVASFMSLQLRELV
|
IKSLEDLVSLFMIHKDGNDF
|
KEPYQEMKFFIPQLIMIKLE
|
VSEPIIVFNPSFDGCWELIR
|
DSFLEIIKNSNGIPKLKYIPLK
|
FSFTAAAADRQCVKAAEP
|
GEPSMHAAATAMAELKGY
|
NLLLGTVNAEEKLVSDFLIQ
|
TFKVFQKNQVGPCKYLNV
|
YKKYVDLLDNTAEQNIAAF
|
LKENHDIDDFVTKINAIKKR
|
RNEIASMNITVPLAMFCLD
|
ATALNHDLCERAQNLKDH
|
LIQFQVDVNRDTNTSICNQ
|
YSHIADKVSEVPANTKELVS
|
LIEFLKKSSAVTVFKLRRQLR
|
DASERLEFLMDYADLPYQI
|
EDIFDNSRNLLLHKRDQAE
|
MDLIKRCSEFELRLEGYHRE
|
LESFRKREVMTTEEMKHN
|
VEKLNELSKNLNRAFAEFEL
|
INKEEELLEKEKSTYPLLQA
|
MLKNKVPYEQLWSTAYEF
|
SIKSEEWMNGPLFLLNAEQ
|
IAEEIGNMWRTTYKLIKTLS
|
DVPAPRRLAENVKIKIDKFK
|
QYIPILSISCNPGMKDRHW
|
QQISEIVGYEIKPTETTCLSN
|
MLEFGFGKFVEKLEPIGAA
|
ASKEYSLEKNLDRMKLDW
|
VNVTFSFVKYRDTDTNILC
|
AIDDIQMLLDDHVIKTQTM
|
CGSPFIKPIEAECRKWEEKLI
|
RIQDNLDAWLKCQATWLY
|
LEPIFSSEDIIAQMPEEGRK
|
FGIVDSYWKSLMSQAVKD
|
NRILVAADQPRMAEKLQE
|
ANFLLEDIQKGLNDYLEKKR
|
LFFPRFFFLSNDELLEILSETK
|
DPLRVQPHLKKCFEGIAKLE
|
FTDNLEIVGMISSEKETVPFI
|
QKIYPANAKGMVEKWLQ
|
QVEQMMLASMREVIGLGI
|
EAYVKVPRNHWVLQWPG
|
QVVICVSSIFWTQEVSQAL
|
AENTLLDFLKKSNDQIAQIV
|
QLVRGKLSSGARLTLGALT
|
VIDVHARDVVAKLSEDRVS
|
DLNDFQWISQLRYYWVAK
|
DVQVQIITTEALYGYEYLGN
|
SPRLVITPLTDRCYRTLMGA
|
LKLNLGGAPEGPAGTGKTE
|
TTKDLAKALAKQCVVFNCS
|
DGLDYKAMGKFFKGLAQA
|
GAWACFDEFNRIEVEVLSV
|
VAQQILSIQQAIIRKLKTFIF
|
EGTELSLNPTCAVFIT
|
|
SEQ ID NO: 1910
ENSG00000159263.11
MKEKSKNAAKTRREKENG
A*02:03, A*24:02, A*24:07,
|
EFYELAKLLPLPSAITSQLDK
A*24:10, A*34:01, B*15:01,
|
ASIIRLTTSYLKMRAVFPEG
B*15:21, B*15:27, B*38:02,
|
LGDA
B*39:01, B*40:01, B*40:06,
|
B*51:01, B*55:02, C*14:02,
|
C*15:02
|
|
SEQ ID NO: 1911
ENSG00000159788.14
MFRAGEASKRPLPGPSPPR
A*02:03, A*11:01, A*11:02,
|
VRSVEVARGRAGYGFTLSG
A*24:10, A*33:03, A*34:01,
|
QAPCVLSCVMRGSPADFV
B*15:01, B*40:01, B*55:02,
|
GLRAGDQILAVNEINVKKA
C*15:02
|
SHEDVVKLIGKCSGVLHMV
|
IAEGVGRFESCSSDEEGGLY
|
EGKGWLKPKLDSKALGINR
|
AERVVEEMQSGGIFNMIF
|
ENPSLCASNSEPLKLKQRSL
|
SESAATRFDVGHESINNPN
|
PNMLSKEEISKVIHDDSVFS
|
IGLESHDDFALDASILNVA
|
MIVGYLGSIELPSTSSNLES
|
DSLQAIRGCMRRLRAEQKI
|
HSLVTMKIMHDCVQLSTD
|
KAGVVAEYPAEKLAFSAVC
|
PDDRRFFGLVTMQTNDD
|
GSLAQEEEGALRTSCHVF
|
MVDPDLFNHKIHQGIARR
|
FGFECTADPDTNGCLEFPA
|
SSLPVLQFISVLYRDMGELI
|
EGMRARAFLDGDADAHQ
|
NNSTSSNSDSGIGNFHQEE
|
KSNRVLVVD
|
|
SEQ ID NO: 1912
ENSG00000160200.13
MPSETPQAEVGPTGCPHR
A*02:03, A*11:01, A*11:02,
|
SGPHSAKGSLEKGSPEDKE
A*24:10, A*33:03, B*15:01,
|
AKEPLWIRPDAPSRCTWQ
B*38:02, B*39:01, B*40:01,
|
LGRPASESPHHHTAPAKSP
B*58:01, C*03:02, C*03:04,
|
KILPDILKKIGDTPMVRINKI
C*07:02, C*14:02
|
GKKFGLKCELLAKCEFFNA
|
GGSVKDRISLRMIEDAERD
|
GTLKPGDTIIEPTSGNTGIG
|
LALAAAVRGYRCIIVMPEK
|
MSSEKVDVLRALGAEIVRT
|
PTNARFDSPESHVGVAWR
|
LKNEIPNSHILDQYRNASN
|
PLAHYDTTADEILQQCDGK
|
LDMLVASVGTGGTITGIAR
|
KLKEKCPGCRIIGVDPEGSIL
|
AEPEELNQTEQTTYEVEGI
|
GYDFIPTVLDRTVVDKWFK
|
SNDEEAFTFARMLIAQEGL
|
LCGGSAGSTVAVAVKAAQ
|
ELQEGQRCVVILPDSVRNY
|
MTKFLSDRWMLQKGFLKE
|
EDLTEKKPWWWHLRVQE
|
LGLSAPLTVLPTITCGHTIEIL
|
REKGFDQAPVVDEAGVILG
|
MVTLGNMLSSLLAGKVQP
|
SDQVGKVIYKQFKQIRLTD
|
TLGRLSHILEMDHFALVVH
|
EQIQYHSTGKSSQRQMVF
|
GVVTAIDLLNFVAAQERDQ
|
K
|
|
SEQ ID NO: 1913
ENSG00000160799.7
MQDGRKGGAYAGKMEAT
A*02:03
|
TAGVGRLEEEALRRKERLK
|
ALREKTG
|
|
SEQ ID NO: 1914
ENSG00000160838.9
MSSEQSAPGASPRAPRPG
A*02:03, A*11:01, A*11:02,
|
TQKSSGAVTKKGERAAKEK
A*24:02, A*24:07, A*24:10,
|
PATVLPPVGEEEPKSPEEY
B*40:01, B*55:02, C*01:02,
|
QCSGVLETDFAELCTRWG
C*03:02, C*04:01, C*04:03,
|
YTDFPKVVNRPRPHPPFVP
C*07:02, C*15:02
|
SASLSEKATLDDPRLSGSCS
|
LNSLESKYVFFRPTIQVELE
|
QEDSKSVKEIYIRGWKVEE
|
RILGVFSKCLPPLTQLQAIN
|
LWKVGLTDKTLTTFIELLPL
|
CSSTLRKVSLEGNPLPEQSY
|
HKL
|
|
SEQ ID NO: 1915
ENSG00000164093.11
METNCRKLVSACVQLGVQ
A*11:01, A*11:02, A*33:03
|
PAAVECLFSKDSEIKKVEFT
|
DSPESRKEAASSKFFPRQH
|
|
SEQ ID NO: 1916
ENSG00000164764.10
MRTLWMALCALSRLWPG
A*11:01, A*11:02, A*24:10,
|
AQAGCAEAGRCCPGRDPA
A*33:03, B*55:02, C*03:02,
|
CFARGWRLDRVYGTCFCD
C*03:04
|
QACRFTGDCCFDYDRACP
|
ARPCFVGEWSPWSGCAD
|
QCKPTTRVRRRSVQQEPQ
|
NGGAPCPPLEERAGCLEYS
|
TPQGQDCGHTYVPAFITTS
|
AFNKERTRQATSPHWSTH
|
TEDAGYCMEFKTESLTPHC
|
ALENWPLTRWMQYLREG
|
YTVCVDCQPPAMNSVSLR
|
CSGDGLDSDGNQTLHWQ
|
AIGNPRCQGTWKKVRRVD
|
QCSCPAVHSFIFI
|
|
SEQ ID NO: 1917
ENSG00000164830.13
MDYLTTFTEKSGRLLRGTA
A*33:03
|
NRLLGFGGGGEARQVRFE
|
DYLREPAQGDLGCGSPPH
|
RPPAPSSPEGP
|
|
SEQ ID NO: 1918
ENSG00000166689.10
MAAATVGRDTLPEHWSY
A*33:03
|
GVCRDGRVFFINDQLRCTT
|
WLHPRTGEPVNSGHMIRS
|
DLPRGWEE
|
|
SEQ ID NO: 1919
ENSG00000167157.9
MDSAAAAFALDKPALGPG
A*11:01, A*11:02, C*03:02,
|
PPPPPPALGPGDCAQARK
C*03:04, C*03:67
|
NFSVSHLLDLEEVAAAGRL
|
AARPGARAEAREGAAREP
|
SGGSSGSEAAPQ
|
|
SEQ ID NO: 1920
ENSG00000167632.10
MSVPDYMQCAEDHQTLL
A*02:03, A*02:07, A*11:01,
|
VVVQPVGIVSEENFFRIYKR
A*11:02, A*24:02, A*24:07,
|
ICSVSQISVRDSQRVLYIRYR
A*24:10, A*33:03, B*15:01,
|
HHYPPENNEWGDFQTHR
B*15:27, B*39:01, B*40:01,
|
KVVGLITITDCFSAKDWPQ
B*55:02, B*58:01, C*03:02,
|
TFEKFHVQKEIYGSTLYDSR
C*03:04, C*03:67, C*07:02,
|
LFVFGLQGEIVEQPRTDVA
C*12:02, C*14:02, C*15:02
|
FYPNYEDCQTVEKRIEDFIE
|
SLFIVLESKRLDRATDKSGD
|
KIPLLCVPFEKKDFVGLDTD
|
SRHYKKRCQGRMRKHVG
|
DLCLQAGMLQDSLVHYH
|
MSVELLRSVNDFLWLGAA
|
LEGLCSASVIYHYPGGTGG
|
KSGARRFQGSTLPAEAANR
|
HRPGALTTNGINPDTSTEI
|
GRAKNCLSPEDIIDKYKEAIS
|
YYSKYKNAGVIELEACIKAV
|
RVLAIQKRSMEASEFLQNA
|
VYINLRQLSEEEKIQRYSILS
|
ELYELIGFHRKSAFFKRVAA
|
MQCVAPSIAEPGWRACYK
|
LLLETLPGYSLSLDPKDFSR
|
GTHRGWAAVQMRLLHEL
|
VYASRRMGNPALSVRHLSF
|
LLQTMLDFLSDQEKKDVA
|
QSLENYTSKCPGTMEPIAL
|
PGGLTLPPVPFTKLPIVRHV
|
KLLNLPASLRPHKMKSLLG
|
QNVSTKSPFIYSPIIAHNRG
|
EERNKKIDFQWVQGDVCE
|
VQLMVYNPMPFELRVEN
|
MGLLTSGVEFESLPAALSLP
|
AESGLYPVTLVGVPQTTGTI
|
TVNGYHTTVFGVFSDCLLD
|
NLPGIKTSGSTVEVIPALPR
|
LQISTSLPRSAHSLQPSSGD
|
EISTNVSVQLYNGESQQLII
|
KLENIGMEPLEKLEVTSKVL
|
TTKEKLYGDFLSWKLEETLA
|
QFPLQPGKVATFTINIKVKL
|
DFSCQENLLQDLSDDGISV
|
SGFPLSSPFRQVVRPRVEG
|
KPVNPPESNKAGDYSHVKT
|
LEAVLNFKYSGGPGHTEGY
|
YRNLSLGLHVEVEPSVFFTR
|
VSTLPATSTRQCHLLLDVF
|
NSTEHELTVSTRSSEALILH
|
AGECQRMAIQVDKFNFES
|
FPESPGEKGQFANPKQLEE
|
ERREARGLEIHSKLGICWRI
|
PSLKRSGEASVEGLLNQLVL
|
EHLQLAPLQWDVLVDGQP
|
CDREAVAACQVGDPVRLE
|
VRLTNRSPRSVGPFALTVV
|
PFQDHQNGVHNYDLHDT
|
VSFVGSSTFYLDAVQPSGQ
|
SACLGALLFLYTGDFFLHIRF
|
HEDSTSKELPPSWFCLPSV
|
HVCALEAQA
|
|
SEQ ID NO: 1921
ENSG00000170615.10
MDHAEENEILAATQRYYVE
A*02:03, A*02:07, A*11:01,
|
RPIFSHPVLQERLHTKDKVP
A*11:02, A*24:02, A*24:07,
|
DSIADKLKQAFTCTPKKIRN
A*24:10, A*33:03, A*34:01,
|
IIYMFLPITKWLPAYKFKEY
B*15:01, B*15:21, B*15:27,
|
VLGDLVSGISTGVLQLPQG
B*27:04, B*38:02, B*39:01,
|
LAFAMLAAVPPIFGLYSSFY
B*40:01, B*40:06, B*46:01,
|
PVIMYCFLGTSRHISIGPFA
B*51:01, B*55:02, B*58:01,
|
VISLMIGGVAVRLVPDDIVI
C*01:02, C*03:02, C*03:04,
|
PGGVNATNGTEARDALRV
C*03:67, C*04:01, C*04:03,
|
KVAMSVTLLSGIIQFCLGVC
C*08:01, C*12:02, C*14:02,
|
RFGFVAIYLTEPLVRGFTTA
C*15:02
|
AAVHVFTSMLKYLFGVKTK
|
RYSGIFSVVYSTVAVLQNV
|
KNLNVCSLGVGLMVFGLLL
|
GGKEFNERFKEKLPAPIPLE
|
FFAVVMGTGISAGFNLKES
|
YNVDVVGTLPLGLLPPANP
|
DTSLFHLVYVDAIAIAIVGFS
|
VTISMAKTLANKHGYQVD
|
GNQELIALGLCNSIGSLFQT
|
FSISCSLSRSLVQEGTGGKT
|
QLAGCLASLMILLVILATGF
|
LFESLPQAVLSAIVIVNLKG
|
MFMQFSDLPFFWRTSKIEL
|
TIWLTTFVSSLFLGLDYGLIT
|
AVIIALLTVIYRTQS
|
|
SEQ ID NO: 1922
ENSG00000171680.16
MHYDGHVRFDLPPQGSVL
A*02:03, A*02:07, A*11:01,
|
ARNVSTRSCPPRTSPAVDL
A*11:02, A*24:10, A*33:03,
|
EEEEEESSVDGKGDRKSTG
B*15:01, B*39:01, B*40:01,
|
LKLSKKKARRRHTDDPSKE
B*58:01, C*03:02, C*03:04,
|
CFTLKFDLNVDIETEIVPAM
C*07:02, C*12:02, C*14:02,
|
KKKSLGEVLLPVFERKGIAL
C*15:02
|
GKVDIYLDQSNTPLSLTFEA
|
YRFGGHYLRVKAPAKPGDE
|
GKVEQGMKDSKSLSLPILR
|
PAGTGPPALERVDAQSRRE
|
SLDILAPGRRRKNMSEFLG
|
EASIPGQEPPTPSSCSLPSG
|
SSGSTNTGDSWKNRAASR
|
FSGFFSSGPSTSAFGREVDK
|
MEQLEGKLHTYSLFGLPRL
|
PRGLRFDHDSWEEEYDED
|
EDEDNACLRLEDSWRELID
|
GHEKLTRRQCHQQEAVW
|
ELLHTEASYIRKLRVIINLFLC
|
CLLNLQESGLLCEVEAERLF
|
SNIPEIAQLHRRLWASVMA
|
PVLEKARRTRALLQPGDFL
|
KGFKMFGSLFKPYIRYCME
|
EEGCMEYMRGLLRDNDLF
|
RAYITWAEKHPQCQRLKLS
|
DMLAKPHQRLTKYPLLLKS
|
VLRKTEEPRAKEAVVAMIG
|
SVERFIHHVNACMRQRQE
|
RQRLAAVVSRIDAYEVVES
|
SSDEVDKLLKEFLHLDLTAPI
|
PGASPEETRQLLLEGSLRM
|
KEGKDSKMDVYCFLFTDLL
|
LVTKAVKKAERTRVIRPPLL
|
VDKIVCRELRDPGSFLLIYLN
|
EFHSAVGAYTFQASGQALC
|
RGWVDTIYNAQNQLQQL
|
RAQEPPGSQQPLQSLEEEE
|
DEQEEEEEEEEEEEEGEDS
|
GTSAASSPTIMRKSSGSPD
|
SQHCASDGSTETLAMVVV
|
EPGDTLSSPEEDSGPFSSQS
|
DETSLSTTASSATPTSELLPL
|
GPVDGRSCSMDSAYGTLS
|
PTSLQDFVAPGPMAELVP
|
RAPESPRVPSPPPSPRLRRR
|
TPVQLLSCPPHLLKSKSEAS
|
LLQLLAGAGTHGTPSAPSR
|
SLSELCLAVPAPGIRTQGSP
|
QEAGPSWDCRGAPSPGSG
|
PGLVGCLAGEPAGSHRKRC
|
GDLPSGASPRVQPEPPPGV
|
SAQHRKLTLAQLYRIRTTLL
|
LNSTLTASEV
|
|
SEQ ID NO: 1923
ENSG00000171791.10
MAHAGRTGYDNREIVMK
A*02:03, A*11:01, A*11:02,
|
YIHYKLSQRGYEWDAGDV
A*24:02, A*24:07, A*24:10,
|
GAAPPGAAPAPGIFSSQPG
A*33:03, A*34:01, B*15:21,
|
HTPHPAASRDPVARTSPLQ
B*27:04, B*40:01, B*40:06,
|
TPAAPGAAAGPALSPVPPV
B*46:01, B*55:02, B*58:01,
|
VHLTLRQAGDDFSRRYRRD
C*01:02, C*03:02, C*04:01,
|
FAEMSSQLHLTPFTARGRF
C*04:03, C*14:02
|
ATVVEELFRDGVNWGRIV
|
AFFEFGGVMCVESVNREM
|
SPLVDNIALWMTEYLNRHL
|
HTWIQDNGGWDAFVELY
|
GPS
|
|
SEQ ID NO: 1924
ENSG00000172765.12
MKRGTSLHSRRGKPEAPK
A*02:03, A*33:03, C*03:02,
|
GSPQINRKSGQEMTAVM
C*03:04
|
QSGRPRSSSTTDAPTSSAM
|
MEIACAAAAAAAACLPGE
|
EGTAE
|
|
SEQ ID NO: 1925
ENSG00000174672.11
MTSTGKDGGAQHAQYVG
A*02:03, A*11:01, A*11:02,
|
PYRLEKTLGKGQTGLVKLG
A*24:02, A*24:10, A*33:03,
|
VHCVTCQKVAIKIVNREKLS
B*40:01, C*03:02, C*03:04,
|
ESVLMKVEREIAILKLIEHPH
C*14:02
|
VLKLHDVYENKKYLYLVLEH
|
VSGGELFDYLVKKGRLTPK
|
EARKFFRQIISALDFCHSHSI
|
CHRDLKPENLLLDEKNNIRI
|
ADFGMASLQVGDSLLETSC
|
GSPHYACPEVIRGEKYDGR
|
KADVWSCGVILFALLVGAL
|
PFDDDNLRQLLEKVKRGVF
|
HMPHFIPPDCQSLLRGMIE
|
VDAARRLTLEHIQKHIWYI
|
GGKNEPEPEQPIPRKVQIR
|
SLPSLEDIDPDVLDSMHSL
|
GCFRDRNKLLQDLLSEEEN
|
QEKMIYFLLLDRKERYPSQE
|
DEDLPPRNEIDPPRKRVDS
|
PMLNRHGKRRPERKSMEV
|
LSVTDGGSPVPARRAIEMA
|
QHGQSKAMFSKSLDIAEA
|
HPQFSKEDRSRSISGASSGL
|
STSPLSSPRVTPHPSPRGSP
|
LPTPKGTPVHTPKESPAGT
|
PNPTPPSSPSVGGVPWRA
|
RLNSIKNSFLGSPRFHRRKL
|
QVPTPEEMSNLTPESSPEL
|
AKKSWFGNFISLEKEEQIFV
|
VIKDKPLSSIKADIVHAFLSI
|
PSLSHSVISQTSFRAEYKAT
|
GGPAVFQKPVKFQVDITYT
|
EGGEAQKENGIYSVTFTLLS
|
GPSRRFKRVVETIQAQLLST
|
HDPPAAQHLSEPPPPAPGL
|
SWGAGLKGQKVATSYESSL
|
|
SEQ ID NO: 1926
ENSG00000177380.9
MMCEVMPTISEDGRRGSA
A*02:03, A*11:01, A*11:02,
|
LGPDEAGGELERLMVTML
A*24:10, A*33:03, B*15:01,
|
TERERLLETLREAQDGLAT
B*39:01, B*40:01, B*58:01,
|
AQLRLRELGHEKDSLQRQL
C*03:02, C*03:04, C*03:67,
|
SIALPQEFAALTKELNLCRE
C*12:02
|
QLLEREEEIAELKAERNNTR
|
LLLEHLECLVSRHERSLRMT
|
VVKRQAQSPGGVSSEVEV
|
LKALKSLFEHHKALDEKVRE
|
RLRMALERVAVLEEELELS
|
NQETLNLREQLSRRRSGLE
|
EPGKDGDGQTLANGLGPG
|
GDSNRRTAELEEALERQRA
|
EVCQLRERLAVLCRQMSQ
|
LEEELGTAHRELGKAEEAN
|
SKLQRDLKEALAQREDME
|
ERITTLEKRYLSAQREATSL
|
HDANDKLENELASKESLYR
|
QSEEKSRQLAEWLDDAKQ
|
KLQQTLQKAETLPEIEAQLA
|
QRVAALNKAEERHGNFEE
|
RLRQLEAQLEEKNQELQRA
|
RQREKMNDDHNKRLSETV
|
DKLLSESNERLQLHLKERM
|
GALEEKNSLSEEIANMKKL
|
QDELLLNKEQLLAEMERM
|
QMEIDQLRGRPPSSYSRSL
|
PGSALELRYSQAPTLPSGA
|
HLDPYVAGSGRAGKRGR
|
WSGVKEEPSKDWERSAPA
|
GSIPPPFPGELDGSDEEEAE
|
GMFGAELLSPSGQADVQT
|
LAIMLQEQLEAINKEIKLIQE
|
EKETTEQRAEELESRVSSSG
|
LDSLGRYRSSCSLPPSLTTST
|
LASPSPPSSGHSTPRLAPPS
|
PAREGTDKANHVPKEEAG
|
APRGEGPAIPGDTPPPTPR
|
SARLERMTQALALQAGSLE
|
DGGPPRGSEGTPDSLHKA
|
PKKKSIKSSIGRLFGKKEKG
|
RMGPPGRDSSSLAGTPSD
|
ETLATDPLGLAKLTGPGDK
|
DRRNKRKHELLEEACRQGL
|
PFAAWDGPTVVSWLELW
|
VGMPAWYVAACRANVKS
|
GAIMANLSDTEIQREIGISN
|
PLHRLKLRLAIQEMVSLTSP
|
SAPASSRTSTGNVWMTHE
|
EMESLTATTKPILAYGDMN
|
HEWVGNDWLPSLGLPQY
|
RSYFMESLVDARMLDHLN
|
KKELRGQLKMVDSFHRVSL
|
HYGIMCLKRLNYDRKDLER
|
RREESQTQIRDVMVWSNE
|
RVMGWVSGLGLKEFATNL
|
TESGVHGALLALDETFDYS
|
DLALLLQIPTQNAQARQLL
|
EKEFSNLISLGTDRRLDEDS
|
AKSFSRSPSWRKMFREKDL
|
RGVTPDSAEMLPPNFRSA
|
AAGALGSPGLPLRKLQPEG
|
QTSGSSRADGVSVRTYSC
|
|
SEQ ID NO: 1927
ENSG00000177455.7
MPPPRLLFFLLFLTPMEVR
A*02:03, A*11:01, A*11:02,
|
PEEPLVVKVEEGDNAVLQC
A*24:10, B*39:01, B*40:01,
|
LKGTSDGPTQQLTWSRES
B*58:01, C*03:02, C*03:04,
|
PLKPFLKLSLGLPGLGIHMR
C*12:02, C*14:02, C*15:02
|
PLAIWLFIFNVSQQMGGFY
|
LCQPGPPSEKAWQPGWT
|
VNVEGSGELFRWNVSDLG
|
GLGCGLKNRSSEGPSSPSG
|
KLMSPKLYVWAKDRPEIW
|
EGEPPCLPPRDSLNQSLSQ
|
DLTMAPGSTLWLSCGVPP
|
DSVSRGPLSWTHVHPKGP
|
KSLLSLELKDDRPARDMW
|
VMETGLLLPRATAQDAGK
|
YYCHRGNLTMSFHLEITAR
|
PVLWHWLLRTGGWKVSA
|
VTLAYLIFCLCSLVGILHLQR
|
ALVLRRKRKRMTDPTRRFF
|
KVTPPPGSGPQNQYGNVL
|
SLPTPTSGLGRAQRWAAG
|
LGGTAPSYGNPSSDVQAD
|
GALGSRSPPGVGPEEEEGE
|
GYEEPDSEEDSEFYENDSN
|
LGQDQLSQDGSGYENPED
|
EPLGPEDEDSFSNAESYEN
|
EDEELTQPVARTMDFLSPH
|
GSAWDPSREATSLGSQSYE
|
DMRGILYAAPQLRSIRGQP
|
GPNHEEDADSYENMDNP
|
DGPDPAWGGGGRMGTW
|
STR
|
|
SEQ ID NO: 1928
ENSG00000178209.10
MVAGMLMPRDQLRAIYE
A*02:03, A*11:01, A*11:02,
|
VLFREGVMVAKKDRRPRSL
A*24:02, A*24:10, A*33:03,
|
HPHVPGVTNLQVMRAMA
A*34:01, B*55:02, C*03:02,
|
SLRARGLVRETFAWCHFY
C*03:04
|
WYLTNEGIAHLRQYLHLPP
|
EIVPASLQRVRRPVAMVM
|
PARRTPHVQAVQGPLGSP
|
PKRGPLPTEEQRVYRRKEL
|
EEVSPETPVVPATTQRTLA
|
RPGPEPAPAT
|
|
SEQ ID NO: 1929
ENSG00000181035.9
MGNGVKEGPVRLHEDAE
A*02:03, A*11:01, A*11:02,
|
AVLSSSVSSKRDHRQVLSSL
A*24:02, A*24:07, A*24:10,
|
LSGALAGALAKTAVAPLDR
A*33:03, B*15:01, B*39:01,
|
TKIIFQVSSKRFSAKEAFRVL
B*40:01, C*03:02, C*03:04,
|
YYTYLNEGFLSLWRGNSAT
C*03:67, C*12:02, C*14:02
|
MVRVVPYAAIQFSAHEEYK
|
RILGSYYGFRGEALPPWPR
|
LFAGALAGTTAASLTYPLDL
|
VRARMAVTPKEMYSNIFH
|
VFIRISREEGLKTLYHGFMP
|
TVLGVIPYAGLSFFTYETLKS
|
LHREYSGRRQPYPFERMIF
|
GACAGLIGQSASYPLDVVR
|
RRMQTAGVTGYPRASIAR
|
TLRTIVREEGAVRGLYKGLS
|
MNWVKGPIAVGISFTTFDL
|
MQILLRHLQS
|
|
SEQ ID NO: 1930
ENSG00000185404.12
MAGGGSDLSTRGLNGGVS
A*02:03, A*24:10, A*33:03,
|
QVANEMNHLPAHSQSLQ
C*03:02
|
RLFTEDQDVDEGLVYDTVF
|
KHFKRHKLEISNAIKKTFPFL
|
EGLRDRELITNK
|
|
SEQ ID NO: 1931
ENSG00000185686.13
MERRRLWGSIQSRYISMS
A*02:03, A*11:01, A*11:02,
|
VWTSPRRLVELAGQSLLKD
A*24:10, A*33:03, B*15:01,
|
EALAIAALELLPRELFPPLF
B*39:01, B*40:01, B*58:01,
|
MAAFDGRHSQTLKAMVQ
C*03:02, C*03:04, C*14:02
|
AWPFTCLPLGVLMKGQHL
|
HLETFKAVLDGLDVLLAQE
|
VRPRRWKLQVLDLRKNSH
|
QDFWTVWSGNRASLYSFP
|
EPEAAQPMTKKRKVDGLS
|
TEAEQPFIPVEVLVDLFLKE
|
GACDELFSYLIEKVKRKKNV
|
LRLCCKKLKIFAMPMQDIK
|
MILKMVQLDSIEDLEVTCT
|
WKLPTLAKFSPYLGQMINL
|
RRLLLSHIHASSYISPEKEEQ
|
YIAQFTSQFLSLQCLQALYV
|
DSLFFLRGRLDQLLRHVMN
|
PLETLSITNCRLSEGDVMHL
|
SQSPSVSQLSVLSLSGVML
|
TDVSPEPLQALLERASATL
|
QDLVFDECGITDDQLLALL
|
PSLSHCSQLTTLSFYGNSISI
|
SALQSLLQHLIGLSNLTHVL
|
YPVPLESYEDIHGTLHLERL
|
AYLHARLRELLCELGRPSM
|
VWLSANPCPHCGDRTFYD
|
PEPILCPCFMPN
|
|
SEQ ID NO: 1932
ENSG00000185989.9
MAVEDEGLRVFQSVKIKIG
A*02:03, A*11:01, A*11:02,
|
EAKNLPSYPGPSKMRDCYC
A*24:02, A*24:07, A*24:10,
|
TVNLDQEEVFRTKIVEKSLC
A*33:03, B*15:01, B*15:27,
|
PFYGEDFYCEIPRSFRHLSF
B*39:01, B*40:01, B*58:01,
|
YIFDRDVFRRDSIIGKVAIQ
C*03:02, C*03:04, C*07:02,
|
KEDLQKYHNRDTWFQLQH
C*12:02, C*14:02
|
VDADSEVQGKVHLELRLSE
|
VITDTGVVCHKLATRIVEC
|
QGLPIVNGQCDPYATVTLA
|
GPFRSEAKKTKVKRKTNNP
|
QFDEVFYFEVTRPCSYSKKS
|
HFDFEEEDVDKLEIRVDLW
|
NASNLKFGDEFLGELRIPLK
|
VLRQSSSYEAWYFLQPRD
|
NGSKSLKPDDLGSLRLNVV
|
YTEDHVFSSDYYSPLRDLLL
|
KSADVEPVSASAAHILGEV
|
CREKQEAAVPLVRLFLHYG
|
RVVPFISAIASAEVKRTQDP
|
NTIFRGNSLASKCIDETMKL
|
AGMHYLHVTLKPAIEEICQ
|
SHKPCEIDPVKLKDGENLE
|
NNMENLRQYVDRVFHAIT
|
ESGVSCPTVMCDIFFSLREA
|
AAKRFQDDPDVRYTAVSSF
|
IFLRFFAPAILSPNLFQLTPH
|
HTDPQTSRTLTLISKTVQTL
|
GSLSKSKSASFKESYMATFY
|
EFFNEQKYADAVKNFLDLIS
|
SSGRRDPKSVEQPIVLKEG
|
|
SEQ ID NO: 1933
ENSG00000196961.8
MPAVSKGDGMRGLAVFIS
A*02:03, A*11:01, A*11:02,
|
DIRNCKSKEAEIKRINKELA
A*24:02, A*24:07, A*24:10,
|
NIRSKFKGDKALDGYSKKK
A*33:03, A*34:01, B*15:01,
|
YVCKLLFIFLLGHDIDFGHM
B*15:27, B*39:01, B*40:01,
|
EAVNLLSSNKYTEKQIGYLFI
B*40:06, B*58:01, C*03:02,
|
SVLVNSNSELIRLINNAIKN
C*03:04, C*03:67, C*08:01,
|
DLASRNPTFMCLALHCIAN
C*12:02, C*14:02, C*15:02
|
VGSREMGEAFAADIPRILV
|
AGDSMDSVKQSAALCLLRL
|
YKASPDLVPMGEWTARVV
|
HLLNDQHMGVVTAAVSLI
|
TCLCKKNPDDFKTCVSLAV
|
SRLSRIVSSASTDLQDYTYY
|
FVPAPWLSVKLLRLLQCYP
|
PPEDAAVKGRLVECLETVL
|
NKAQEPPKSKKVQHSNAK
|
NAILFETISLIIHYDSEPNLLV
|
RACNQLGQFLQHRETNLR
|
YLALESMCTLASSEFSHEAV
|
KTHIDTVINALKTERDVSVR
|
QRAADLLYAMCDRSNAKQ
|
IVSEMLRYLETADYAIREEIV
|
LKVAILAEKYAVDYSWYVD
|
TILNLIRIAGDYVSEEVWYR
|
VLQIVTNRDDVQGYAAKT
|
VFEALQAPACHENMVKVG
|
GYILGEFGNLIAGDPRSSPP
|
VQFSLLHSKFHLCSVATRAL
|
LLSTYIKFINLFPETKATIQG
|
VLRAGSQLRNADVELQQR
|
AVEYLTLSSVASTDVLATVL
|
EEMPPFPERESSILAKLKRK
|
KGPGAGSALDDGRRDPSS
|
NDINGGMEPTPSTVSTPSP
|
SADLLGLRAAPPPAAPPAS
|
AGAGNLLVDVFDGPAAQP
|
SLGPTPEEAFLSPGPEDIGP
|
PIPEADELLNKFVCKNNGV
|
LFENQLLQIGVKSEFRQNL
|
GRMYLFYGNKTSVQFQNF
|
SPTVVHPGDLQTQLAVQT
|
KRVAAQVDGGAQVQQVL
|
NIECLRDFLTPPLLSVRFRY
|
GGAPQALTLKLPVTINKFF
|
QPTEMAAQDFFQRWKQL
|
SLPQQEAQKIFKANHPMD
|
AEVTKAKLLGFGSALLDNV
|
DPNPENFVGAGIIQTKALQ
|
VGCLLRLEPNAQAQMYRL
|
TLRTSKEPVSRHLCELLAQQ
|
F
|
|
SEQ ID NO: 1934
ENSG00000197530.8
MAGALRRGRALGSRPSGP
A*02:03, A*11:01, A*11:02,
|
TVSSRRSPQCPVAQEGLGA
A*24:02, A*24:07, A*24:10,
|
RSRPRVAPRSLARCGPSSRL
A*33:03, B*15:01, B*39:01,
|
MGWKPSEARGQSQSFQA
B*40:01, B*58:01, C*03:02,
|
SGLQPRSLKAARRATGRPD
C*03:04, C*07:02, C*12:02,
|
RSRAAPPNMDPDPQAGV
C*14:02
|
QVGMRVVRGVDWKWGQ
|
QDGGEGGVGTVVELGRH
|
GSPSTPDRTVVVQWDQG
|
TRTNYRAGYQGAHDLLLYD
|
NAQIGVRHPNIICDCCKKH
|
GLRGMRWKCRVCLDYDLC
|
TQCYMHNKHELAHAFDRY
|
ETAHSRPVTLSPRQGLPRIP
|
LRGIFQGAKVVRGPDWE
|
WGSQDGGEGKPGRVVDI
|
RGWDVETGRSVASVTWA
|
DGTTNVYRVGHKGKVDLK
|
CVGEAAGGFYYKDHLPRLG
|
KPAELQRRVSADSQPFQH
|
GDKVKCLLDTDVLREMQE
|
GHGGWNPRMAEFIGQTG
|
TVHRITDRGDVRVQFNHE
|
TRWTFHPGALTKHHSFWV
|
GDVVRVIGDLDTVKRLQA
|
GHGEWTDDMAPALGRVG
|
KVVKVFGDGNLRVAVAGQ
|
RWTFSPSCLVAYRPEEDAN
|
LDVAERARENKSSLSVALD
|
KLRAQKSDPEHPGRLVVEV
|
ALGNAARALDLLRRRPEQV
|
DTKNQGRTALQVAAYLGQ
|
VELIRLLLQARAGVDLPDDE
|
GNTALHYAALGNQPEATR
|
VLLSAGCRADAINSTQSTA
|
LHVAVQRGFLEVVRALCER
|
GCDVNLPDAHSDTPLHSAI
|
SAGTGASGIVEVLTEVPNID
|
VTATNSQGFTLLHHASLKG
|
HALAVRKILARARQLVDAK
|
KEDGFTALHLAALNNHREV
|
AQILIREGRCDVNVRNRKL
|
QSPLHLAVQQAHVGLVPLL
|
VDAGCSVNAEDEEGDTAL
|
HVALQRHQLLPLVADGAG
|
GDPGPLQLLSRLQASGLPG
|
SAELTVGAAVACFLALEGA
|
DVSYTNHRGRSPLDLAAEG
|
RVLKALQGCAQRFRERQA
|
GGGAAPGPRQTLGTPNTV
|
TNLHVGAAPGPEAAECLV
|
CSELALLVLFSPCQHRTVCE
|
ECARRMKKCIRCQVVVSKK
|
LRPDGSEVASAAPAPGPPR
|
QLVEELQSRYRQMEERITC
|
PICIDSHIRLVFQCGHGACA
|
PCGSALSACPICRQPIRDRI
|
QIFV
|
|
SEQ ID NO: 1935
ENSG00000204839.4
MAGGVWGRSRAREAPVG
A*02:03, A*11:01, A*11:02,
|
ALTLTALTEGIRARQGQPQ
A*24:02, A*24:07, A*24:10,
|
GPPSAGPQPKSWEVKPEA
A*33:03, B*39:01, B*40:01,
|
EPQTQALTAPSEAEPGRGA
B*58:01, C*03:02, C*03:04,
|
TVPEAGSEPCSLNSALEPAP
C*14:02
|
EGPHQVPQSSWEEGVLAD
|
LALYTAACLEEAGFAGTQA
|
TVLTLSSALEARGERLEDQV
|
HALVRGLLAQVPSLAEGRP
|
WRAALRVLSALALEHARD
|
VVCALLPRSLPADRVAAEL
|
WRSLSRNQRVNGQVLVQL
|
LWALKGASGPEPQALAAT
|
RALGEMLAVSGCVGATRG
|
FYPHLLLALVTQLHKLARSP
|
CSPDMPKIWVLSHRGPPH
|
SHASCAVEALKALLTGDGG
|
RMVVTCMEQAGGWRRLV
|
GAHTHLEGVLLLASAMVA
|
HADHHLRGLFADLLPRLRS
|
ADDPQRLTAMAFFTGLLQ
|
SRPTARLLREEVILERLLTW
|
QGDPEPTVRWLGLLGLGH
|
LALNRRKVRHVSTLLPALLG
|
ALGEGDARLVGAALGALR
|
RLLLRPRAPVRLLSAELGPR
|
LPPLLDDTRDSIRASAVGLL
|
GTLVRRGRGGLRLGLRGPL
|
RKLVLQSLVPLLLRLHDPSR
|
DAAESSEWTLARCDHAFC
|
WGLLEELVTVAHYDSPEAL
|
SHLCCRLVQRYPGHVPNFL
|
SQTQGYLRSPQDPLRRAA
|
AVLIGFLVHHASPGCVNQD
|
LLDSLFQDLGRLQSDPKPA
|
VAAAAHVSAQQVA
|
|
SEQ ID NO: 1936
ENSG00000205277.5
MLVIWILTLALRLCASVTTV
A*02:03, A*11:01, A*11:02,
|
TPGSTVNTSIGGNTTSASTP
A*24:02, A*24:10, A*33:03,
|
SSSDPFTTFSDYGVSVTFIT
B*15:01, B*39:01, B*40:01,
|
GSTATKHFLDSSTNSGHSE
B*55:02, B*58:01, C*03:02,
|
ESTVSHSGPGATGTTLFPS
C*03:04, C*03:67, C*07:02,
|
HSATSVFVGEPKTSPITSAS
C*12:02, C*14:02, C*15:02
|
METTALPGSTTTAGLSEKS
|
TTFYSSPRSPDRTLSPARTT
|
SSGVSEKSTTSHSRPGPTHT
|
IAFPDSTTMPGVSQESTAS
|
HSIPGSTDTTLSPGTTTPSSL
|
GPESTTFHSSPGYTKTTRLP
|
DNTTTSGLLEASTPVHSST
|
GSPHTTLSPSSSTTHEGEPT
|
TFQSWPSSKDTSPAPSGTT
|
SAFVKLSTTYHSSPSSTPTT
|
HFSASSTTLGHSEESTPVHS
|
SPVATATTPPPARSATSGH
|
VEESTAYHRSPGSTQTMHF
|
PESSTTSGHSEESATFHGST
|
THTKSSTPSTTAALAHTSYH
|
SSLGSTETTHFRDSSTISGRS
|
EESKASHSSPDAMATTVLP
|
AGSTPSVLVGDSTPSPISSG
|
SMETTALPGSTTKPGLSEKS
|
TTFYSSPRSPDTTHLPASM
|
TSSGVSEESTTSHSRPGSTH
|
TTAFPGSTTMPGLSQESTA
|
SHSSPGPTDTTLSPGSTTAS
|
SLGPEYTTFHSRPGSTETTL
|
LPDNTTASGLLEASMPVHS
|
STRSPHTTLSPAGSTTRQG
|
ESTTFHSWPSSKDTRPAPP
|
TTTSAFVEPSTTSHGSPSSIP
|
TTHISARSTTSGLVEESTTY
|
HSSPGSTQTMHFPESDTTS
|
GRGEESTTSHSSTTHTISSA
|
PSTTSALVEEPTSYHSSPGS
|
TATTHFPDSSTTSGRSEEST
|
ASHSSQDATGTIVLPARSTT
|
SVLLGESTTSPISSGSMETT
|
ALPGSTTTPGLSERSTTFHS
|
SPRSPATTLSPASTTSSGVS
|
EESTTSRSRPGSTHTTAFPD
|
STTTPGLSRHSTTSHSSPGS
|
TDTTLLPASTTTSGPSQEST
|
TSHSSSGSTDTALSPGSTTA
|
LSFGQESTTFHSNPGSTHT
|
TLFPDSTTSSGIVEASTRVH
|
SSTGSPRTTLSPASSTSPGL
|
QGESTAFQTHPASTHTTPS
|
PPSTATAPVEESTTYHRSP
|
GSTPTTHFPASSTTSGHSEK
|
STIFHSSPDASGTTPSSAHS
|
TTSGRGESTTSRISPGSTEIT
|
TLPGSTTTPGLSEASTTFYSS
|
PRSPTTTLSPASMTSLGVG
|
EESITSRSQPGSTHSTVSPA
|
STTTPGLSEESTTVYSSSRG
|
STETTVFPHSTTTSVHGEEP
|
TTFHSRPASTHTTLFTEDST
|
TSGLTEESTAFPGSPASTQT
|
GLPATLTTADLGEESTTFPS
|
SSGSTGTKLSPARSTTSGLV
|
GESTPSRLSPSSTETTTLPGS
|
PTTPSLSEKSTTFYTSPRSPD
|
ATLSPATTTSSGVSEESSTS
|
HSQPGSTHTTAFPDSTTTS
|
DLSQEPTTSHSSQGSTEATL
|
SPGSTTASSLGQQSTTFHSS
|
PGDTETTLLPDDTITSGLVE
|
ASTPTHSSTGSLHTTLTPAS
|
STSAGLQEESTTFQSWPSS
|
SDTTPSPPGTTAAPVEVST
|
TYHSRPSSTPTTHFSASSTT
|
LGRSEESTTVHSSPGATGT
|
ALFPTRSATSVLVGEPTTSP
|
ISSGSTETTALPGSTTTAGLS
|
EKSTTFYSSPRSPDTTLSPAS
|
TTSSGVSEESTTSHSRPGST
|
HTTAFPGSTTMPGVSQEST
|
ASHSSPGSTDTTLSPGSTTA
|
SSLGPESITFHSSPGSTETT
|
LLPDNTTASGLLEASTPVHS
|
STGSPHTTLSPAGSTTRQG
|
ESTTFQSWPSSKDTMPAP
|
PTTTSAFVELSTTSHGSPSS
|
TPTTHFSASSTTLGRSEEST
|
TVHSSPVATATTPSPARSTT
|
SGLVEESTAYHSSPGSTQT
|
MHFPESSTASGRSEESRTS
|
HSSTTHTISSPPSTTSALVEE
|
PTSYHSSPGSTATTHFPDSS
|
TTSGRSEESTASHSSQDAT
|
GTIVLPARSTTSVLLGESTTS
|
PISSGSMETTALPGSTTTPG
|
LSEKSTTFHSSPRSPATTLSP
|
ASTTSSGVSEESTTSHSRPG
|
STHTTAFPDSTTTPGLSRHS
|
TTSHSSPGSTDTTLLPASTT
|
TSGPSQESTTSHSSPGSTDT
|
ALSPGSTTALSFGQESTTFH
|
SSPGSTHTTLFPDSTTSSGI
|
VEASTRVHSSTGSPRTTLSP
|
ASSTSPGLQGESTAFQTHP
|
ASTHTTPSPPSTATAPVEES
|
TTYHRSPGSTPTTHFPASST
|
TSGHSEKSTIFHSSPDASGT
|
TPSSAHSTTSGRGESTTSRI
|
SPGSTEITTLPGSTTTPGLSE
|
ASTTFYSSPRSPTTTLSPAS
|
MTSLGVGEESTTSRSQPGS
|
THSTVSPASTTTPGLSEEST
|
TVYSSSPGSTETTVFPRTPT
|
TSVRGEEPTTFHSRPASTH
|
TTLFTEDSTTSGLTEESTAFP
|
GSPASTQTGLPATLTTADL
|
GEESTTFPSSSGSTGTTLSP
|
ARSTTSGLVGESTPSRLSPS
|
STETTTLPGSPTTPSLSEKST
|
TFYTSPRSPDATLSPATTTS
|
SGVSEESSTSHSQPGSTHT
|
TAFPDSTTTPGLSRHSTTSH
|
SSPGSTDTTLLPASTTTSGP
|
SQESTTSHSSPGSTDTALSP
|
GSTTALSFGQESTTFHSSPG
|
STHTTLFPDSTTSSGIVEAST
|
RVHSSTGSPRTTLSPASSTS
|
PGLQGESTTFQTHPASTHT
|
TPSPPSTATAPVEESTTYHR
|
SPGSTPTTHFPASSTTSGHS
|
EKSTIFHSSPDASGTTPSSA
|
HSTTSGRGESTTSRISPGST
|
EITTLPGSTTTPGLSEASTTF
|
YSSPRSPTTTLSPASMTSLG
|
VGEESTTSRSQPGSTHSTV
|
SPASTTTPGLSEESTTVYSSS
|
PGSTETTVFPRSTTTSVRGE
|
EPTTFHSRPASTHTTLFTED
|
STTSGLTEESTAFPGSPAST
|
QTGLPATLTTADLGEESTTE
|
PSSSGSTGTTLSPARSTTSG
|
LVGESTPSRLSPSSTETTTLP
|
GSPTTPSLSEKSTTFYTSPRS
|
PDATLSPATTTSSGVSEESS
|
TSHSQPGSTHTTAFPDSTT
|
TSGLSQEPTASHSSQGSTE
|
ATLSPGSTTASSLGQQSTTF
|
HSSPGDTETTLLPDDTITSG
|
LVEASTPTHSSTGSLHTTLT
|
PASSTSAGLQEESTTFQSW
|
PSSSDTTPSPPGTTAAPVE
|
VSTTYHSRPSSTPTTHFSAS
|
STTLGRSEESTTVHSSPGAT
|
GTALFPTRSATSVLVGEPTT
|
SPISSGSTETTALPGSTTTA
|
GLSEKSTTFYSSPRSPDTTLS
|
PASTTSSGVSEESTTSHSRP
|
GSTHTTAFPGSTTMPGVS
|
QESTASHSSPGSTDTTLSP
|
GSTTASSLGPESTTFHSGPG
|
STETTLLPDNTTASGLLEAS
|
TPVHSSTGSPHTTLSPAGST
|
TRQGESTTFQSWPNSKDT
|
TPAPPTTTSAFVELSTTSHG
|
SPSSTPTTHFSASSTTLGRS
|
EESTTVHSSPVATATTPSPA
|
RSTTSGLVEESTTYHSSPGS
|
TQTMHFPESDTTSGRGEES
|
TTSHSSTTHTISSAPSTTSAL
|
VEEPTSYHSSPGSTATTHFP
|
DSSTTSGRSEESTASHSSQ
|
DATGTIVLPARSTTSVLLGE
|
STTSPISSGSMETTALPGST
|
TTPGLSEKSTTFHSSPRSPA
|
TTLSPASTTSSGVSEESTTS
|
HSRPGSTHTTAFPDSTTTP
|
GLSRHSTTSHSSPGSTDTTL
|
LPASTTTSGSSQESTTSHSS
|
SGSTDTALSPGSTTALSFG
|
QESTTFHSSPGSTHTTLFPD
|
STTSSGIVEASTRVHSSTGS
|
PRTTLSPASSTSPGLQGEST
|
AFQTHPASTHTTPSPPSTA
|
TAPVEESTTYHRSPGSTPTT
|
HFPASSTTSGHSEKSTIFHS
|
SPDASGTTPSSAHSTTSGR
|
GESTTSRISPGSTEITTLPGS
|
TTTPGLSEASTTFYSSPRSP
|
TTTLSPASMTSLGVGEESTT
|
SRSQPGSTHSTVSPASTTTP
|
GLSEESTTVYSSSPGSTETT
|
VFPRSTTTSVRREEPTTFHS
|
RPASTHTTLFTEDSTTSGLT
|
EESTAFPGSPASTQTGLPA
|
TLTTADLGEESTTFPSSSGS
|
TGTKLSPARSTTSGLVGEST
|
PSRLSPSSTETTTLPGSPQP
|
SLSEKSTTFYTSPRSPDATLS
|
PATTTSSGVSEESSTSHSQP
|
GSTHTTAFPDSTTTSGLSQ
|
EPTTSHSSQGSTEATLSPGS
|
TTASSLGQQSTTFHSSPGD
|
TETTLLPDDTITSGLVEASTP
|
THSSTGSLHTTLTPASSTST
|
GLQEESTTFQSWPSSSDTT
|
PSPPSTTAVPVEVSTTYHSR
|
PSSTPTTHFSASSTTLGRSE
|
ESTTVHSSPGATGTALFPTR
|
SATSVLVGEPTTSPISSGSTE
|
TTALPGSTTTAGLSEKSTTF
|
YSSPRSPDTTLSPASTTSSG
|
VSEESTTSHSRPGSMHTTA
|
FPSSTTMPGVSQESTASHS
|
SPGSTDTTLSPGSTTASSLG
|
PESTTEHSSPGSTETTLLPD
|
NTTASGLLEASTPVHSSTGS
|
PHTTLSPAGSTTRQGESTT
|
FQSWPNSKDTTPAPPTTTS
|
AFVELSTTSHGSPSSTPTTH
|
FSASSTTLGRSEESTTVHSS
|
PVATATTPSPARSTTSGLVE
|
ESTTYHSSPGSTQTMHFPE
|
SNTTSGRGEESTTSHSSTTH
|
TISSAPSTTSALVEEPTSYHS
|
SPGSTATTHFPDSSTTSGRS
|
EESTASHSSQDATGTIVLPA
|
RSTTSVLLGESTTSPISSGS
|
METTALPGSTTTPGLSEKST
|
TFHSSPSSTPTTHFSASSTTL
|
GRSEESTTVHSSPVATATTP
|
SPARSTTSGLVEESTAYHSS
|
PGSTQTMHFPESSTASGRS
|
EESRTSHSSTTHTISSPPSTT
|
SALVEEPTSYHSSPGSIATT
|
HFPESSTTSGRSEESTASHS
|
SPDTNGITPLPAHFTTSGRI
|
AESTTFYISPGSMETTLAST
|
ATTPGLSAKSTILYSSSRSPD
|
QTLSPASMTSSSISGEPTSL
|
YSQAESTHTTAFPASTTTSG
|
LSQESTTFHSKPGSTETTLS
|
PGSITTSSFAQEFTTPHSQP
|
GSALSTVSPASTTVPGLSEE
|
STTFYSSPGSTETTAFSHSN
|
TMSIHSQQSTPFPDSPGFT
|
HTVLPATLTTTDIGQESTAF
|
HSSSDATGTTPLPARSTAS
|
DLVGEPTTFYISPSPTYTTLF
|
PASSSTSGLTEESTTFHTSPS
|
FTSTIVSTESLETLAPGLCQE
|
GQIWNGKQCVCPQGYVG
|
YQCLSPLESFPVETPEKLNA
|
TLGMTVKVTYRNFTEKMN
|
DASSQEYQNFSTLFKNRM
|
DVVLKGDNLPQYRGVNIR
|
RLLNGSIVVKNDVILEADYT
|
LEVEELFENLAEIVKAKIMN
|
ETRTTLLDPDSCRKAILCYSE
|
EDTFVDSSVTPGFDFQEQC
|
TQKAAEGYTQFYYVDVLD
|
GKLACVNKCTKGTKSQMN
|
CNLGTCQLQRSGPRCLCPN
|
TNTHWYWGETCEFNIAKS
|
LVYGIVGAVMAVLLLALIILI
|
ILFSLSQRKRHREQYDVPQ
|
EWRKEGTPGIFQKTAIWE
|
DQNLRESRFGLENAYNNF
|
RPTLETVDSGTELHIQRPE
|
MVASTV
|
|
SEQ ID NO: 1937
ENSG00000205744.5
MESRAEGGSPAVFDWFFE
A*02:03, A*11:01, A*11:02,
|
AACPASLQEDPPILRQFPP
A*24:10, A*33:03, B*15:01,
|
DFRDQEAMQMVPKFCFP
B*39:01, B*40:01, B*55:02,
|
FDVEREPPSPAVQHFTFAL
B*58:01, C*03:02, C*03:04,
|
TDLAGNRRFGFCRLRAGT
C*14:02
|
QSCLCILSHLPWFEVFYKLL
|
NTVGDLLAQDQVTEAEELL
|
QNLFQQSLSGPQASVGLEL
|
GSGVTVSSGQGIPPPTRGN
|
SKPLSCFVAPDSGRLPSIPE
|
NRNLTELVVAVTDENIVGL
|
FAALLAERRVLLTASKLSTLT
|
SCVHASCALLYPMRWEHV
|
LIPTLPPHLLDYCCAPMPYL
|
IGVHASLAERVREKALEDV
|
VVLNVDANTLETTFNDVQ
|
ALPPDVVSLLRLRLRKVALA
|
PGEGVSRLFLKAQALLFGG
|
YRDALVCSPGQPVTFSEEV
|
FLAQKPGAPLQAFHRRAV
|
HLQLFKQFIEARLEKLNKGE
|
GFSDQFEQEITGCGASSGA
|
LRSYQLWADNLKKGGGAL
|
LHSVKAKTQPAVKNMYRS
|
AKSGLKGVQSLLMYKDGD
|
SVLQRGGSLRAPALPSRSD
|
RLQQRLPITQHFGKNRPLR
|
PSRRRQLEEGTSEPPGAGT
|
PPLSPEDEGCPWAEEALDS
|
SFLGSGEELDLLSEILDSLSM
|
GAKSAGSLRPSQSLDCCHR
|
GDLDSCFSLPNIPRWQPD
|
DKKLPEPEPQPLSLPSLQN
|
ASSLDATSSSKDSRSQLIPS
|
ESDQEVTSPSQSSTASADP
|
SIWGDPKPSPLTEPLILHLT
|
PSHKAAEDSTAQENPTPW
|
LSTAPTEPSPPESPQILAPTK
|
PNFDIAWTSQPLDPSSDPS
|
SLEDPRARPPKALLAERAHL
|
QPREEPGALNSPATPTSNC
|
QKSQPSSRPRVADLKKCFE
|
G
|
|
SEQ ID NO: 1938
ENSG00000213420.3
MSALRPLLLLLLPLCPGPGP
A*02:03, A*11:01, A*11:02,
|
GPGSEAKVTRSCAETRQVL
A*24:02, A*24:10, A*33:03,
|
GARGYSLNLIPPALISGEHL
B*15:01, B*15:27, B*38:02,
|
RVCPQEYTCCSSETEQRLIR
B*39:01, B*40:01, B*58:01,
|
ETEATFRGLVEDSGSFLVHT
C*03:02, C*03:04, C*12:02,
|
LAARHRKFDEFFLEMLSVA
C*14:02, C*15:02
|
QHSLTQLFSHSYGRLYAQH
|
ALIFNGLFSRLRDFYGESGE
|
GLDDTLADFWAQLLERVF
|
PLLHPQYSFPPDYLLCLSRL
|
ASSTDGSLQPFGDSPRRLR
|
LQITRTLVAARAFVQGLET
|
GRNVVSEALKVPVSEGCSQ
|
ALMRLIGCPLCRGVPSLMP
|
CQGFCLNVVRGCLSSRGLE
|
PDWGNYLDGLLILADKLQ
|
GPFSFELTAESIGVKISEGL
|
MYLQENSAKVSAQVFQEC
|
GPPDPVPARNRRAPPPRE
|
EAGRLWSMVTEEERPTTA
|
AGTNLHRLVWELRERLAR
|
MRGFWARLSLTVCGDSR
|
MAADASLEAAPCWTGAG
|
RGRYLPPVVGGSPAEQVN
|
NPELKVDASGPDVPTRRRR
|
LQLRAATARMKTAALGHD
|
LDGQDADEDASGSGGGQ
|
QYADDWMAGAVAPPARP
|
PRPPYPPRRDGSGGKGGG
|
GSARYNQGRSRSGGASIGF
|
HTQTILILSLSALALLGPR
|
|
SEQ ID NO: 1939
ENSG00000225485.3
MNGVAFCLVGIPPRPEPRP
A*02:03, A*11:01, A*11:02,
|
PQLPLGPRDGCSPRRPFP
A*24:02, A*24:07, A*24:10,
|
WQGPRTLLLYKSPQDGFG
B*15:01, B*39:01, B*40:01,
|
FTLRHFIVYPPESAVHCSLK
B*55:02, B*58:01, C*03:02,
|
EEENGGRGGGPSPRYRLEP
C*03:04, C*03:67, C*12:02,
|
MDTIFVKNVKEDGPAHRA
C*14:02, C*15:02
|
GLRTGDRLVKVNGESVIGK
|
TYSQVIALIQNSDDTLELSI
|
MPKDEDILQLAYSQDAYLK
|
GNEPYSGEARSIPEPPPICY
|
PRKTYAPPARASTRATMVP
|
EPTSALPSDPRSPAAWSDP
|
GLRVPPAARAHLDNSSLG
|
MSQPRPSPGAFPHLSSEPR
|
TPRAFPEPGSRVPPSRLEC
|
QQALSHWLSNQVPRRAG
|
ERRCPAMAPRARSASQDR
|
LEEVAAPRPWPCSTSQDAL
|
SQLGQEGWHRARSDDYLS
|
RATRSAEALGPGALVSPRF
|
ERCGWASQRSSARTPACP
|
TRDLPGPQAPPPSGLQGL
|
DDLGYIGYRSYSPSFQRRT
|
GLLHALSFRDSPFGGLPTF
|
NLAQSPASFPPEASEPPRV
|
VRPEPSTRALEPPAEDRGD
|
EVVLRQKPPTGRKVQLTPA
|
RQMNLGFGDESPEPEASG
|
RGERLGRKVAPLATTEDSL
|
ASIPFIDEPTSPSIDLQAKHV
|
PASAVVSSAMNSAPVLGT
|
SPSSPTFTFTLGRHYSQDCS
|
SIKAGRRSSYLLAITTERSKS
|
CDDGLNTFRDEGRVLRRLP
|
NRIPSLRMLRSFFTDGSLDS
|
WGTSEDADAPSKRHSTSD
|
LSDATFSDIRREGWLYYKQI
|
LTKKGKKAGSGLRQWKRV
|
YAALRARSLSLSKERREPGP
|
AAAGAAAAGAGEDEAAPV
|
CIG
|
|
SEQ ID NO: 1940
ENSG00000243449.2
MFRAALEDSVEKKSSLKET
A*02:03, A*24:10, A*33:03,
|
ETTSKGTSKYDRERETEMK
B*27:04, B*38:02, B*39:01,
|
TVMGMKMHFWVRTPAS
B*40:01, C*01:02, C*03:02,
|
GRGRGGSDHARSRAAPLP
C*03:04, C*03:67, C*04:01,
|
LLA
C*07:02, C*14:02, C*15:02
|
|
SEQ ID NO: 1941
ENSG00000261787.1
MDRGRPAGSPLSASAEPA
A*02:03, A*24:02, A*24:10,
|
PLAAAIRDSRPGRTGPGPA
A*33:03, B*40:01, C*03:02,
|
GPGGGSRSGSGRPAAANA
C*03:04, C*12:02, C*14:02
|
ARERSRVQTLRHAFLELQR
|
TLPSVPPDTKLSKLDVLLLA
|
TTYIAHLTRSLQDDAEAPA
|
DAGLGALRGDGYLHPVKK
|
WPMRSRLYIGATGQFLKH
|
SVSGEKTNHDNTPTDSQP
|
|
TABLE 10
Peptide pools for alternative promoters
SEQ ID NO. | Peptide Pool | Alternative Promoter | Peptide Sequence | Corresponding HLA variant
SEQ ID NO: 1942 | 1 | DNAH3 | MAEKLQEANFLLEDI | A*02:01
SEQ ID NO: 1943 | | | QYSHIADKVSEVPAN | A*02:03
SEQ ID NO: 1944 | | | FLKKSSAVTVKLRR | A*03:01
SEQ ID NO: 1945 | | | PKLKYIPLKFSFTAA | A*24:02
SEQ ID NO: 1946 | | | EHLHTVNPMMLRLKE | A*33:03
SEQ ID NO: 1947 | | | VSDFLIQTFKVFQKN | B*15:01
SEQ ID NO: 1948 | | | DNTAEQNIAAFLKEN | B*40:01
SEQ ID NO: 1949 | | | VNPMMLRLKELWFAE | B*58:01
SEQ ID NO: 1950 | | | KTSLTFPGSRPMSPE | C*03:02
SEQ ID NO: 1951 | | | IEEYFASVASFMSLQ | C*14:02
SEQ ID NO: 1952 | | | NEIASMNITVPLAMF | C*15:02
SEQ ID NO: 1953 | 2 | DST | NPKLTLGLIWTIILH | A*02:01
SEQ ID NO: 1954 | | | FTKWINQHLMKVRKH | A*02:03
SEQ ID NO: 1955 | | | ERDKVQKKTFTKWIN | A*03:01
SEQ ID NO: 1956 | | | ISLLEVLSGDTLPRE | B*40:01
SEQ ID NO: 1957 | | | MAGYLSPAAYLYVEE | C*03:02
SEQ ID NO: 1958 | | | MAGYLSPAAYLYVE | C*14:02
SEQ ID NO: 1959 | 3 | EPS8L1 | ADVSQYPVNHLVTFC | A*02:01
SEQ ID NO: 1960 | | | EVDILNHVFDDVESF | A*02:03
SEQ ID NO: 1961 | | | MSTATGPEAAPKPSA | A*11:01
SEQ ID NO: 1962 | | | AQPDVHFFQGLRLGA | A*33:03
SEQ ID NO: 1963 | | | ILNHVFDDVESFVSR | B*15:02
SEQ ID NO: 1964 | | | VSQYPVNHLVTFCLG | B*35:03
SEQ ID NO: 1965 | | | PASKEELESYPLGAI | B*40:01
SEQ ID NO: 1966 | | | EPERAQPDVHFFQGL | B*58:01
SEQ ID NO: 1967 | 4 | FRMD4B | VEDLLFSGSRFVWNL | A*02:01
SEQ ID NO: 1968 | | | LLDLVASHFNLKEKE | A*11:01
SEQ ID NO: 1969 | | | TVSTLRRWYTERLRA | A*33:03
SEQ ID NO: 1970 | | | QIEVESETIFKLAAF | B*40:01
SEQ ID NO: 1971 | | | VWNLTVSTLRRWYTE | B*58:01
SEQ ID NO: 1972 | | | AVRFYIESISFLKDK | C*07:02
SEQ ID NO: 1973 | 5 | LAMA3 | AEGVLLDYLVLLPRD | A*02:01
SEQ ID NO: 1974 | | | SRIAMYELLADADIQ | A*02:03
SEQ ID NO: 1975 | | | RTNTLLGHLISKAQR | A*03:01
SEQ ID NO: 1976 | | | VIHFYQAAHPTFPAQ | A*24:02
SEQ ID NO: 1977 | | | TKATNIRLRFLRTNT | A*33:03
SEQ ID NO: 1978 | | | YAQMTSVQNDVRITL | A*68:01
SEQ ID NO: 1979 | | | CLLYQHLPVTRFPCT | B*15:01
SEQ ID NO: 1980 | | | DKVSSYGGYLTYQAK | B*15:02
SEQ ID NO: 1981 | | | LSGREVELHLRLRIP | B*40:01
SEQ ID NO: 1982 | | | LHKKSMDKSLEFITN | B*58:01
SEQ ID NO: 1983 | | | DGYFALEKSNYFGCQ | C*03:02
SEQ ID NO: 1984 | | | ENNYYFPDLHHMKYE | C*07:02
SEQ ID NO: 1985 | | | ILRYVNPGTEAVSGH | C*12:02
SEQ ID NO: 1986 | | | ADPFSITPGIWVACI | C*15:02
SEQ ID NO: 1987 | 6 | MET | QNVILHEHHIFLGAT | A*02:01
SEQ ID NO: 1988 | | | CKEALAKSEMNVNMK | A*02:03
SEQ ID NO: 1989 | | | MDRSAMCAFPIKYVN | A*11:01
SEQ ID NO: 1990 | | | TDQVIDVLPEFRDS | A*24:02
SEQ ID NO: 1991 | | | LDAQTFHTRIIRFCS | A*33:03
SEQ ID NO: 1992 | | | SNNFIYFLTVQRETL | A*68:01
SEQ ID NO: 1993 | | | KDGFMFLTDQAYIDV | B*15:01
SEQ ID NO: 1994 | | | RDSYPIKYVHAFESN | B*35:03
SEQ ID NO: 1995 | | | QKVAEYKTGPVLEHP | B*40:01
SEQ ID NO: 1996 | | | CSSKANLSGGVWKDN | B*58:01
SEQ ID NO: 1997 | | | RDEYRTEFTTALQRV | C*07:02
SEQ ID NO: 1998 | | | TINSSYFPDHPLHSI | C*12:03
SEQ ID NO: 1999 | | | PMDRSAMCAFPIKYV | C*15:02
SEQ ID NO: 2000 | 7 | MIB2 | GASGIVEVLTEVPNI | A*02:01
SEQ ID NO: 2001 | | | QGFTLLHHASLKGHA | A*03:01
SEQ ID NO: 2002 | | | ENKSSLSVALDKLRA | A*11:01
SEQ ID NO: 2003 | | | QVAAYLGQVELIRLL | A*24:02
SEQ ID NO: 2004 | | | TALHLAALNNHREVA | A*33:03
SEQ ID NO: 2005 | | | CVGEAAGGFYYKDHL | A*68:01
SEQ ID NO: 2006 | | | LQRRVSADSQFFQHG | B*15:01
SEQ ID NO: 2007 | | | GNLRVAVAGQRWTFS | B*58:01
SEQ ID NO: 2008 | | | EDGFTALHLAALNNH | C*03:02
SEQ ID NO: 2009 | | | GGFYYKDHLPRLGKP | C*07:02
SEQ ID NO: 2010 | 8 | MRC2 | DSCYQFNFQSTLSWR | A*02:01
SEQ ID NO: 2011 | | | TDGSIINFISWAPGK | A*02:03
SEQ ID NO: 2012 | | | RDCSIALPYVCKKKP | A*11:01
SEQ ID NO: 2013 | | | EWLRFQEAEYKFFEH | A*24:02
SEQ ID NO: 2014 | | | SGDEVMYTHWNRDQP | A*33:03
SEQ ID NO: 2015 | | | RFEQAFVSSLIYNWE | B*15:02
SEQ ID NO: 2016 | | | GWTWHSPSCYWLGED | B*38:02
SEQ ID NO: 2017 | | | TNRFEQAFVSSLIYN | B*40:01
SEQ ID NO: 2018 | | | QGRREWLRFQEAEYK | B*40:06
SEQ ID NO: 2019 | | | LCALPYHEVYTIQGN | B*51:01
SEQ ID NO: 2020 | | | CPIKSNDCETFWDKD | B*58:01
SEQ ID NO: 2021 | | | GGCVALATGSAMGLW | C*03:02
SEQ ID NO: 2022 | | | EGEYFWTALQDLNST | C*14:02
SEQ ID NO: 2023 | 9 | NOS2 | PDELLPQAIEFVNQY | A*02:01
SEQ ID NO: 2024 | | | SKSCLGSIMTPKSLT | A*11:01
SEQ ID NO: 2025 | | | VKLDATPLSSPRHVR | A*68:01
SEQ ID NO: 2026 | | | IGRIQWSNLQVFDAR | B*15:01
SEQ ID NO: 2027 | | | AIEFVNQYYGSFKEA | B*15:02
SEQ ID NO: 2028 | | | TKEIETTGTYQLTGD | B*40:01
SEQ ID NO: 2029 | | | MACPWKFLFKTK | B*58:01
SEQ ID NO: 2030 | 10 | PLEC | RPRSLHPHVPGVTNL | A*02:01
SEQ ID NO: 2031 | | | MVAGMLMPRDQL | A*11:01
SEQ ID NO: 2032 | | | HLRQYLHLPPEIVPA | A*24:02
SEQ ID NO: 2033 | | | RETFAWCHFYWYLTN | C*03:02
SEQ ID NO: 2034 | 11 | PLEKHG5 | KKKSLGEVLLPVFER | A*02:01
SEQ ID NO: 2035 | | | LWASVMAPVLEKARR | A*03:01
SEQ ID NO: 2036 | | | LHTEASYIRKLRVII | A*33:03
SEQ ID NO: 2037 | | | SLGEVLLPVFERKGI | A*68:01
SEQ ID NO: 2038 | | | WKNRAASRFSGFFSS | B*15:01
SEQ ID NO: 2039 | | | KNMSEFLGEASIPGQ | B*40:01
SEQ ID NO: 2040 | | | GSSGSTNTGDSWKNR | B*58:01
SEQ ID NO: 2041 | | | TFEAYRFGGHYLRVK | C*14:02
SEQ ID NO: 2042 | 12 | PTGDS | THHTLWMGLALLGVL | A*02:01
SEQ ID NO: 2043 | | | HTLWMGLALLGVLGD | A*02:03
SEQ ID NO: 2044 | | | APEAQVSVQPNFQQD | B*15:01
SEQ ID NO: 2045 | | | MATHHTLWMGLA | C*03:02
SEQ ID NO: 2046 | 13 | RASA3 | GPSKMRDCYCTVNLD | A*02:03
SEQ ID NO: 2047 | | | EIPRSFRHLSFYIFD | A*03:01
SEQ ID NO: 2048 | | | RYTAVSSFIFLRFFA | A*11:01
SEQ ID NO: 2049 | | | FKESYMATFYEFFNE | A*24:02
SEQ ID NO: 2050 | | | LSFYIFDRDVFRRDS | A*33:03
SEQ ID NO: 2051 | | | KESYMATFYEFFNEQ | B*15:01
SEQ ID NO: 2052 | | | DADSEVQGKVHLELR | B*40:01
SEQ ID NO: 2053 | | | DVRYTAVSSFIFLRF | B*58:01
SEQ ID NO: 2054 | | | DHVFSSDYYSPLRDL | C*03:02
SEQ ID NO: 2055 | | | GEDFYCEIPRSFRHL | C*07:02
SEQ ID NO: 2056 | | | SSDYYSPLRDLLLKS | C*14:02
SEQ ID NO: 2057 | 14 | TRPM2 | HSKLQMHHVAQVLRE | A*02:03
SEQ ID NO: 2058 | | | RLKSIFRRGLVKVAQ | A*03:01
SEQ ID NO: 2059 | | | HPTMTAALISNKPEF | A*11:01
SEQ ID NO: 2060 | | | LLGDFTQPLYPRPRH | A*33:03
SEQ ID NO: 2061 | | | ECGLMKKAALYFSDF | B*15:01
SEQ ID NO: 2062 | | | VQLKEFYTWDTLLYL | B*40:01
SEQ ID NO: 2063 | | | MKKAALYFSDFWNKL | B*58:01
SEQ ID NO: 2064 | | | HVTFTMDPIRDLLIW | C*12:02
SEQ ID NO: 2065 | | | AALYFSDFWNKLDVG | C*14:02
SEQ ID NO: 2066 | 15 | IKZF3 | SAAVLNDYSLTKSHE | A*03:01
SEQ ID NO: 2067 | | | LERHVVSFDSSRPTS | A*33:03
SEQ ID NO: 2068 | | | LNDYSLTKSHEMENV | C*03:02
To explore whether somatic promoters might contribute to reducing tumor antigen burden and immunoreactivity in vivo, we examined correlations between promoter alterations and intra-tumor T-cell activity in several primary GC cohorts. First, to detect promoter alterations in a cohort of 95 GC-normal pairs (SG cohort), we generated a customized Nanostring panel targeting the top 95 recurrent GC somatic promoters, measuring transcripts associated with either the canonical promoter or the alternative promoter. There was a significant correlation between the Nanostring data and RNA-seq (FIG. 16, r=0.65, P<0.001), with ~35% of transcripts driven by alternative promoters upregulated in more than half of the GCs (FIG. 4D). Second, to examine markers of T-cell activity in these same GC samples, we analyzed previously published microarray data to measure CD8A (a measure of CD8+ tumor infiltrating lymphocytes), and granzyme A (GZMA) and perforin (PRF1), which are both T-cell effectors and validated markers of T-cell cytolytic activity. We confirmed that these three genes (CD8A, GZMA, and PRF1) were not themselves associated with somatic promoters. Comparing the top and bottom quartiles, GCs with high somatic promoter usage exhibited significantly lower GZMA and PRF1 levels (P<0.001 and P=0.01, Wilcoxon test), indicating lower T-cell cytolytic activity (FIG. 4E, top left), and also a trend towards lower CD8A levels (P=0.14, Wilcoxon one sided test). Using two different algorithms (ASCAT and ESTIMATE), we further confirmed that the decreased GZMA and PRF1 levels are independent of tumor purity differences between GCs (FIG. 16). Similar results were obtained upon splitting the GC samples based on median promoter usage score (GZMA, P<0.001 and PRF1, P=0.03). Patients with GCs exhibiting high somatic promoter usage (top 25%) also showed poor survival compared to patients with GCs with low somatic promoter usage (bottom 25%) (FIG. 4e top right, HR=2.55, P=0.02). Dividing patients by their median somatic promoter usage score showed similar survival differences (FIG. 11, HR=1.81, P=0.04).
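The quartile comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the promoter usage scores and GZMA values are made-up numbers, the function names are hypothetical, and the one-sided rank-sum P value uses a normal approximation without tie or continuity correction.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j map to ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_p_lower(x, y):
    """One-sided rank-sum P value for 'x tends to be lower than y'.

    Normal approximation to the Mann-Whitney U statistic (an assumption;
    exact tests or tie corrections may be preferable for small samples).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return NormalDist().cdf((u1 - mean_u) / sd_u)


def quartile_groups(scores):
    """Return (bottom 25%, top 25%) sample indices ordered by score."""
    k = len(scores) // 4
    if k == 0:
        raise ValueError("need at least four samples")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k], order[-k:]


# Hypothetical data: one promoter usage score and one GZMA value per tumor.
scores = [3, 14, 7, 0, 11, 5, 15, 2, 9, 12, 1, 6, 13, 4, 8, 10]
gzma = [20.0 - s for s in scores]

low_idx, high_idx = quartile_groups(scores)
p_value = rank_sum_p_lower([gzma[i] for i in high_idx],
                           [gzma[i] for i in low_idx])
print(f"one-sided P (high usage has lower GZMA): {p_value:.4f}")
```

The median split mentioned in the text would use the same test with the two halves of the ordered samples in place of the quartiles.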
To validate these findings, we then analyzed two other prominent GC cohorts—one from TCGA, and another from the Asian Cancer Research Group (ACRG). In the TCGA cohort, availability of RNA-seq data allowed us to infer somatic promoter usage directly from next-generation sequencing (NGS) data (FIG. 2c). Similar to the Singapore cohort, TCGA GCs with high somatic promoter usage (top 25%) exhibited decreased CD8A (P=0.002, Wilcoxon one sided test), GZMA (P=0.001, Wilcoxon one sided test) and PRF1 levels (P=0.005, Wilcoxon one sided test, FIG. 4e bottom left) compared to GCs with low somatic promoter usage (bottom 25%) in a manner independent of tumor purity (FIG. 16). Notably, as previous studies have suggested that somatic mutation burden may also correlate with intra-tumor T-cell cytolytic response, we further repeated the analysis after adjusting for the total number of missense mutations in each sample using a regression based approach. Even after correcting for somatic mutation burden, we still observed decreased CD8A (P=0.02, Wilcoxon one sided test), GZMA (P=0.01, Wilcoxon one sided test) and PRF1 expression (P=0.03, Wilcoxon one sided test) in samples with high somatic promoter usage (top 25% against bottom 25%) (FIG. 11).
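One way to picture the mutation-burden adjustment is to regress each marker on the missense mutation count and carry the residuals forward into the quartile comparison. The sketch below assumes a simple linear regression with a single covariate; the text states only that a regression-based approach was used, so the model form, the function name and the data values are all illustrative assumptions.

```python
from statistics import mean


def residualize(values, covariate):
    """Return residuals of an ordinary least-squares fit of values on covariate."""
    x_bar, y_bar = mean(covariate), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    if sxx == 0:
        raise ValueError("covariate has no variance")
    slope = sum((x - x_bar) * (y - y_bar)
                for x, y in zip(covariate, values)) / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Hypothetical per-sample missense mutation counts and GZMA expression.
missense_counts = [40, 55, 70, 90, 120, 150, 200, 260]
gzma_expression = [6.1, 5.8, 6.4, 6.9, 7.2, 7.0, 8.1, 8.6]

# Expression with the linear mutation-burden trend removed.
adjusted_gzma = residualize(gzma_expression, missense_counts)
print([round(v, 3) for v in adjusted_gzma])
```

The adjusted values can then be compared between the high and low promoter usage groups in the same way as the unadjusted expression values.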
We leveraged a third independent cohort of GC samples from ACRG. Using NanoString to target 89 canonical and alternative promoters along with various immune markers, we profiled 264 primary GC samples from the ACRG cohort. 40% of alternative promoter transcripts showed tumor specific expression in more than half of the samples (FIG. 11). Once again, samples with high somatic promoter usage (top 25%) showed significantly lower expression of T-cell cytolytic activity markers including CD8A (P=0.035, Wilcoxon one sided test), CD4A (P=0.005, Wilcoxon one sided test), GZMA (P=0.001, Wilcoxon one sided test) and PRF1 (P=0.025, Wilcoxon one sided test) (FIG. 4e, bottom right) (FIG. 16). Similar results were obtained upon splitting the GC samples based on median promoter usage score (Table 11). Also, after adjusting for mutational burden (for cases where information is available), samples with high somatic promoter usage still showed decreased CD8A (P=0.167, Wilcoxon one sided test), GZMA (P=0.009, Wilcoxon one sided test), and PRF1 (P=0.03, Wilcoxon one sided test) expression (FIG. 11). Taken collectively, these results, observed across multiple GC cohorts and assessed using diverse technologies (microarray, RNA-seq, Nanostring), all support a significant association between somatic promoter usage and reduced tumor immunity levels. Importantly, the decreased levels of T-cell cytolytic activity associated with somatic promoter usage are likely independent of tumor purity and mutational load.
TABLE 11
P values of Wilcoxon test between ACRG samples with high and low somatic promoter usage.
Immune Marker | Top and Bottom 25 pctl | Divided by median (50 pctl)
CD4A | 0.01151 | 0.06053
CD8A | 0.07829 | 0.02482
CTLA4 | 0.2048 | 0.2952
FOXP3 | 0.1054 | 0.1673
GZMA | 0.002593 | 0.005957
IFNg | 0.2376 | 0.8045
IL-10 | 0.8391 | 0.9311
LAG3 | 0.1672 | 0.2627
PD1 | 0.1192 | 0.1506
PDL1 | 0.5668 | 0.5869
PRF1 | 0.01272 | 0.05873
TIM3 | 0.578 | 0.9424
TNFA | 0.1394 | 0.7184
* All P values are from Wilcoxon two sided test
Somatic Promoter Associated Peptides are Immunogenic In Vitro
To functionally test the ability of N-terminal peptides depleted in GC to elicit immune responses, we conducted in vitro assays using the high-throughput EPIMAX (EPItope MAXimum) platform, which allows multi-epitope testing for both T cell proliferation and cytokine production. First, we identified N-terminal peptides predicted to exhibit high HLA-binding affinities across a pool of healthy PBMC (peripheral blood mononuclear cell) donors. Second, selecting 15 alternative promoter-associated peptides for testing, we generated peptide pools for each peptide (Tables 9 and 10, Methods), which were then used to stimulate PBMCs from 9 healthy donors. T cell proliferation and cytokine production levels were measured and benchmarked against control peptides (Table 12). Across all 135 exposures (15 peptides across 9 donors), we observed strong cytokine responses for 79 peptide pools (58%; FC-2 relative to Actin peptides) (FIG. 4g), inducing complex Th1, Th2 and Th17 polarizations in a donor-dependent fashion (FIG. 17).
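The fold-change normalization reported in Table 12 can be reproduced with a short calculation: each pool's total cytokine concentration is divided by the total for the Actin control from the same donor. The sketch below uses the Donor 1 totals from Table 12; the function name is hypothetical, and treating a fold change of at least 2 as a response is an assumption made here for illustration.

```python
# Total analytes (pg/ml) for Donor 1, taken from Table 12.
donor1_totals = {
    "DNAH3": 958.03, "DST": 979.67, "EPS8L1": 1553.16, "FRMD4B": 708.13,
    "LAMA3": 1135.99, "MET": 1071.99, "MIB2": 1513.99, "MRC2": 694.14,
    "NOS2": 516.99, "PLEC": 1469.58, "PLEKHG5": 814.82, "PTGDS": 1027.44,
    "RASA3": 549.85, "TRPM2": 1420.32, "IKZF3": 263.93, "Actin": 331.40,
}


def fold_changes(totals, control="Actin"):
    """Divide each pool's total cytokine response by the control total."""
    baseline = totals[control]
    return {name: total / baseline
            for name, total in totals.items() if name != control}


fc = fold_changes(donor1_totals)

# Assumed response criterion: fold change of at least 2 over Actin.
responders = [name for name, value in fc.items() if value >= 2.0]

for name, value in fc.items():
    print(f"{name}: {value:.2f}")
print(f"{len(responders)} of {len(fc)} pools at or above 2-fold")
```

For Donor 1 the computed ratios match the last column of Table 12 to two decimal places.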
TABLE 12
Cytokine Responses of N terminal Peptides
Columns GM-CSF through TNFa give analyte concentration (pg/ml).
Sample | Treatment | GM-CSF | IFN-g | IL-2 | IL-3 | IL-4 | IL-7 | IL-9 | IL-10 | IL-13 | IL-15 | IL-17A | sCD40L | TNFa | Total analytes (pg/ml) | Fold change of total cytokine response (normalized against Actin control)
Donor 1 | DNAH3 | 99.39 | 228.45 | 89 | 6.35 | 2.12 | 0.085 | 7.32 | 24.91 | 228.24 | 0.925 | 1.88 | 4.47 | 264.89 | 958.03 | 2.89
Donor 1 | DST | 114.18 | 149.87 | 58.02 | 11.41 | 0.03 | 0.085 | 14.11 | 57.29 | 311.22 | 0.925 | 1.58 | 8.97 | 251.98 | 979.67 | 2.96
Donor 1 | EPS8L1 | 153.07 | 351.34 | 100.97 | 11.8 | 0.03 | 0.085 | 28.88 | 33.71 | 431.94 | 0.925 | 0.02 | 6.17 | 434.22 | 1553.16 | 4.69
Donor 1 | FRMD4B | 55.53 | 121.17 | 76.42 | 10.54 | 0.03 | 1.43 | 16.77 | 36.13 | 198.37 | 0.925 | 0.93 | 3.76 | 186.12 | 708.13 | 2.14
Donor 1 | LAMA3 | 67.29 | 152.66 | 99.6 | 4.83 | 1.72 | 0.085 | 9.11 | 25.85 | 264.85 | 0.925 | 0.02 | 2.8 | 506.25 | 1135.99 | 3.43
Donor 1 | MET | 54.4 | 93.08 | 96.36 | 6.27 | 0.03 | 0.085 | 5.52 | 25.85 | 179.02 | 0.925 | 0.02 | 3.76 | 606.67 | 1071.99 | 3.23
Donor 1 | MIB2 | 97.14 | 201.48 | 94.37 | 5.92 | 0.03 | 0.085 | 18.62 | 27 | 381.6 | 0.925 | 0.67 | 1.81 | 684.34 | 1513.99 | 4.57
Donor 1 | MRC2 | 52.57 | 63.61 | 53.15 | 5.58 | 0.03 | 0.085 | 3.32 | 37.5 | 184.11 | 0.925 | 0.76 | 1.81 | 290.69 | 694.14 | 2.09
Donor 1 | NOS2 | 31.72 | 130.64 | 26.25 | 3.51 | 0.03 | 0.085 | 5.04 | 28.47 | 133.76 | 0.925 | 0.02 | 1.62 | 154.92 | 516.99 | 1.56
Donor 1 | PLEC | 107.71 | 393.6 | 96.29 | 14.5 | 10.68 | 0.085 | 27.93 | 59.1 | 413.41 | 0.925 | 0.02 | 7.78 | 337.55 | 1469.58 | 4.43
Donor 1 | PLEKHG5 | 74.89 | 128.23 | 96.23 | 9.37 | 3.33 | 0.085 | 9.16 | 40.97 | 207.45 | 0.925 | 4.22 | 3.64 | 236.32 | 814.82 | 2.46
Donor 1 | PTGDS | 29.12 | 223.36 | 63.06 | 2.73 | 0.03 | 0.085 | 10.02 | 48.05 | 254.29 | 0.925 | 0.02 | 0.01 | 395.74 | 1027.44 | 3.10
Donor 1 | RASA3 | 33.95 | 50.06 | 58.28 | 3.84 | 0.03 | 0.085 | 8.6 | 39.39 | 196.78 | 0.925 | 0.02 | 0.01 | 157.88 | 549.85 | 1.66
Donor 1 | TRPM2 | 121.32 | 323.62 | 90.23 | 6.24 | 2.53 | 0.085 | 18.26 | 51.65 | 368.92 | 0.925 | 0.02 | 7.61 | 428.91 | 1420.32 | 4.29
Donor 1 | IKZF3 | 9.53 | 59.94 | 23.36 | 0.94 | 0.03 | 0.085 | 1.22 | 42.98 | 76.06 | 0.925 | 0.02 | 0.01 | 48.83 | 263.93 | 0.80
Donor 1 | Actin | 19.75 | 147.18 | 34.21 | 1.46 | 0.03 | 0.085 | 1.22 | 10.1 | 14.2 | 0.925 | 0.02 | 0.78 | 101.44 | 331.40 | 1.00
Donor 2 | DNAH3 | 279.27 | 1324.9 | 24 | 0.5 | 0.03 | 0.085 | 1.22 | 18.44 | 156.05 | 0.925 | 2.26 | 4.59 | 130.71 | 1942.98 | 28.04
Donor 2 | DST | 773.57 | 6732.16 | 46.6 | 2 | 0.03 | 0.085 | 1.22 | 23.76 | 370.78 | 0.925 | 2.56 | 3.88 | 257.33 | 8214.90 | 118.57
Donor 2 | EPS8L1 | 427.99 | 1030.19 | 85.97 | 3.33 | 4.33 | 0.085 | 18.4 | 21.15 | 386.22 | 0.925 | 0.76 | 4.3 | 167.42 | 2151.07 | 31.05
Donor 2 | FRMD4B | 390.31 | 1070.19 | 94.99 | 3.93 | 10.28 | 1.27 | 1.22 | 19.9 | 415.04 | 0.925 | 0.02 | 5.24 | 159.4 | 2172.72 | 31.36
Donor 2 | LAMA3 | 358.14 | 643.22 | 67.18 | 2.34 | 0.03 | 0.085 | 1.22 | 11.66 | 362.67 | 0.925 | 0.02 | 0.17 | 109.58 | 1557.24 | 22.48
Donor 2 | MET | 302.2 | 256.37 | 64.56 | 1.53 | 0.91 | 0.085 | 1.22 | 14.16 | 312.32 | 0.925 | 2.39 | 4.24 | 84.79 | 1045.70 | 15.09
Donor 2 | MIB2 | 173.84 | 141.37 | 17.97 | 0.73 | 0.03 | 0.085 | 1.22 | 13.23 | 153.31 | 0.925 | 0.02 | 0.65 | 61.99 | 565.37 | 8.16
Donor 2 | MRC2 | 1401.1 | 5545.58 | 205.47 | 5.98 | 6.32 | 0.085 | 13.83 | 14.06 | 889.87 | 0.925 | 6.68 | 4.59 | 531.62 | 8626.11 | 124.50
Donor 2 | NOS2 | 342.89 | 462.07 | 83.01 | 2.88 | 10.88 | 2.29 | 15.36 | 21.57 | 288.7 | 0.925 | 5.91 | 3.82 | 89.68 | 1329.99 | 19.20
Donor 2 | PLEC | 280.02 | 357.65 | 74.41 | 2.44 | 0.03 | 0.085 | 19.79 | 24.07 | 343.1 | 0.925 | 5.46 | 2.49 | 83.91 | 1194.38 | 17.24
Donor 2 | PLEKHG5 | 236.12 | 757.03 | 103.14 | 2.69 | 4.13 | 0.085 | 1.22 | 24.39 | 155.22 | 0.925 | 1.54 | 6.63 | 89.39 | 1382.51 | 19.95
Donor 2 | PTGDS | 142.7 | 621.5 | 33.17 | 1.39 | 0.03 | 0.17 | 1.22 | 13.75 | 63.73 | 0.925 | 2.39 | 4.83 | 57.06 | 942.87 | 13.61
Donor 2 | RASA3 | 630.2 | 2755.29 | 67.63 | 0.98 | 4.53 | 0.085 | 15.24 | 36.44 | 363.46 | 0.925 | 0.02 | 3.28 | 281.27 | 4159.35 | 60.03
Donor 2 | TRPM2 | 495.45 | 1211.48 | 60.61 | 2.96 | 0.03 | 0.085 | 2.44 | 5.29 | 542.44 | 0.925 | 0.02 | 3.28 | 143.48 | 2468.49 | 35.63
Donor 2 | IKZF3 | 427.38 | 1705.57 | 71.33 | 1.36 | 0.03 | 0.085 | 21.04 | 43.4 | 419.93 | 0.925 | 0.02 | 4.77 | 116.74 | 2812.58 | 40.59
Donor 2 | Actin | 15.58 | 7.71 | 11.28 | 0.76 | 0.03 | 1.73 | 1.22 | 5.29 | 13.75 | 0.925 | 0.02 | 1.81 | 9.18 | 69.29 | 1.00
Donor 3 | DNAH3 | 42.21 | 664.34 | 19.01 | 0.005 | 0.03 | 0.085 | 1.22 | 5.08 | 15.32 | 0.925 | 0.02 | 0.01 | 29.25 | 777.51 | 4.56
Donor 3 | DST | 100.36 | 273.74 | 14.76 | 0.005 | 0.03 | 0.085 | 1.22 | 27 | 58.89 | 0.925 | 7.41 | 1.17 | 63.68 | 549.28 | 3.22
Donor 3 | EPS8L1 | 208.07 | 530.49 | 41.94 | 1.07 | 3.73 | 0.085 | 1.22 | 13.12 | 107.94 | 0.925 | 0.85 | 0.01 | 50.21 | 959.66 | 5.63
Donor 3 | FRMD4B | 143.55 | 211.78 | 47.51 | 0.73 | 0.03 | 0.085 | 1.22 | 17.71 | 91.8 | 0.925 | 0.02 | 1.11 | 53.79 | 570.26 | 3.35
Donor 3 | LAMA3 | 100.19 | 509.46 | 23.21 | 1.08 | 0.03 | 0.085 | 1.22 | 36.97 | 34.67 | 0.925 | 1.19 | 0.01 | 50.95 | 759.99 | 4.46
Donor 3 | MET | 143.98 | 322.33 | 34.04 | 1.99 | 0.03 | 0.085 | 1.22 | 12.39 | 29.84 | 0.925 | 2.64 | 0.01 | 54.62 | 604.10 | 3.55
Donor 3 | MIB2 | 113.31 | 127.71 | 16.28 | 0.05 | 0.03 | 0.085 | 1.22 | 9.27 | 39.67 | 0.925 | 0.02 | 0.01 | 39.41 | 347.99 | 2.04
Donor 3 | MRC2 | 150.52 | 323.25 | 48.19 | 0.96 | 0.03 | 0.085 | 1.22 | 11.66 | 54.63 | 0.925 | 0.58 | 0.09 | 74.36 | 666.50 | 3.91
Donor 3
NOS2
186.72
328.5
75.34
4.54
0.03
0.085
1.22
18.02
95.19
0.925
1.96
2.06
69.18
783.77
4.60
|
Donor 3
PLEC
132.57
235.34
52.69
0.76
0.03
0.085
1.22
27.21
69.82
0.925
2.93
1.05
43.28
567.91
3.33
|
Donor 3
PLEKHG5
275.71
343.92
56.78
0.69
0.03
0.085
1.22
14.06
132.99
0.925
0.49
0.01
118.75
945.66
5.55
|
Donor 3
PTGDS
185.73
186.82
57.3
0.005
0.28
0.085
1.22
18.44
127.35
0.925
0.02
0.01
90.73
668.92
3.93
|
Donor 3
RASA3
133.59
93.84
40.44
0.01
0.06
0.085
1.22
9.68
73.67
0.925
2.3
1.49
53.69
411.00
2.41
|
Donor 3
TRPM2
176.42
154.05
46.74
1.05
0.03
1.43
1.22
10.93
133.4
0.925
0.02
0.01
72
598.23
3.51
|
Donor 3
IKZF3
32.69
169.24
18.82
0.005
0.03
0.085
1.22
10.52
16.55
0.925
0.02
0.01
21.41
271.53
1.59
|
Donor 3
Actin
56.66
60.86
13.4
0.56
4.53
0.085
1.22
2.56
5.96
0.925
2.89
0.01
20.69
170.35
1.00
|
Donor 4
DNAH3
0.66
0.005
2.21
0.005
0.03
0.085
1.22
0.41
0.58
0.925
0.02
0.01
2.38
8.54
1.24
|
Donor 4
DST
1.83
1.05
1.06
0.005
0.03
0.085
1.22
3.61
2.32
0.925
0.02
0.01
19.23
31.40
4.55
|
Donor 4
EPS8L1
0.66
1.35
0.98
0.005
0.03
2.01
1.22
4.24
1.95
0.925
0.02
0.01
1.86
15.26
2.21
|
Donor 4
FRMD4B
0.66
0.005
2.01
0.07
0.03
0.085
1.22
2.02
1.19
0.925
0.02
0.01
0.6
8.85
1.28
|
Donor 4
LAMA3
0.66
2.26
1.99
0.005
0.03
0.085
1.22
0.09
1.25
0.925
0.02
0.01
2.34
10.89
1.58
|
Donor 4
MET
0.66
0.3
1.19
0.005
0.03
0.085
1.22
4.77
2.69
0.925
0.13
0.01
1.61
13.63
1.98
|
Donor 4
MIB2
0.66
0.005
1.6
0.005
0.03
0.085
1.22
6.55
0.03
0.925
0.02
0.01
2.12
13.26
1.92
|
Donor 4
MRC2
0.66
1.05
0.98
0.005
0.03
0.085
1.22
4.77
0.3
0.925
0.02
0.01
2.08
12.14
1.76
|
Donor 4
NOS2
0.66
2.49
1.02
0.005
0.03
0.085
1.22
6.55
2.14
0.925
0.02
0.01
1.47
16.63
2.41
|
Donor 4
PLEC
1.42
0.005
1.66
0.005
0.03
0.085
1.22
5.29
0.79
0.925
0.31
0.02
16.87
28.63
4.15
|
Donor 4
PLEKHG5
0.66
0.005
1.15
0.005
0.03
0.085
1.22
3.19
1.19
0.925
0.02
0.01
0.8
9.29
1.35
|
Donor 4
PTGDS
0.66
3.65
2.26
0.005
0.03
0.085
1.22
3.19
2.08
0.925
0.02
0.01
10.06
24.20
3.51
|
Donor 4
RASA3
0.66
0.01
2.55
0.005
0.03
0.085
1.22
3.3
1.44
0.925
0.02
0.01
1.81
12.07
1.75
|
Donor 4
TRPM2
0.66
1.35
1.32
0.005
0.03
0.085
1.22
4.98
1.05
0.925
0.02
0.01
1.7
13.36
1.94
|
Donor 4
IKZF3
0.66
0.9
1.21
0.005
0.03
0.085
1.22
2.56
3.12
0.925
0.02
0.01
3.25
14.00
2.03
|
Donor 4
Actin
0.66
0.01
1.27
0.005
0.03
0.085
1.22
0.18
0.99
0.925
0.02
0.01
1.49
6.90
1.00
|
Donor 5
DNAH3
0.66
0.005
1.66
0.84
0.03
0.085
1.22
2.87
1.05
0.925
0.27
0.01
2.82
12.45
0.78
|
Donor 5
DST
0.66
0.6
0.79
0.005
0.03
0.085
1.22
3.61
3.18
0.925
0.02
0.01
2.06
13.20
0.82
|
Donor 5
EPS8L1
0.66
0.16
1.93
0.005
0.03
1.43
1.22
3.4
1.19
0.925
0.58
0.01
3.54
15.08
0.94
|
Donor 5
FRMD4B
0.66
2.03
1.71
0.005
0.03
0.085
1.22
0.09
0.3
0.925
0.02
0.01
1.86
8.95
0.56
|
Donor 5
LAMA3
0.66
0.01
1.93
0.005
0.03
2.29
1.22
0.41
0.3
0.925
0.02
0.01
1.86
9.87
0.62
|
Donor 5
MET
0.66
0.005
1.69
0.005
0.03
0.085
1.22
0.09
1.44
0.925
0.02
0.01
2.54
8.72
0.54
|
Donor 5
MIB2
0.66
0.005
2.44
0.005
0.03
0.95
1.22
1.71
0.06
0.925
0.02
0.01
2.71
10.75
0.67
|
Donor 5
MRC2
0.66
0.005
3.06
0.005
0.03
0.085
1.22
0.09
0.92
0.925
0.02
0.01
1.38
8.41
0.52
|
Donor 5
NOS2
0.66
1.2
1.9
0.005
0.03
0.085
1.22
0.09
1.89
0.925
1.11
0.01
3.63
12.76
0.80
|
Donor 5
PLEC
0.66
0.01
1.56
0.005
0.03
0.085
1.22
1.28
0.03
0.925
0.85
0.01
2.06
8.73
0.54
|
Donor 5
PLEKHG5
0.66
0.005
1.77
0.54
0.49
0.085
1.22
0.09
1.19
0.925
0.93
0.01
3.21
11.13
0.69
|
Donor 5
PTGDS
0.66
0.005
0.48
0.005
0.03
0.085
1.22
2.66
2.57
0.925
1.71
0.01
2.08
12.44
0.78
|
Donor 5
RASA3
0.66
0.3
2.21
0.005
0.03
0.085
1.22
1.49
1.44
0.925
0.02
0.01
1.9
10.30
0.64
|
Donor 5
TRPM2
0.66
0.005
1.1
0.005
0.03
0.085
1.22
0.09
0.03
0.925
0.02
0.01
0.92
5.10
0.32
|
Donor 5
IKZF3
0.66
4.81
2.52
0.005
0.03
2.94
1.22
4.66
0.03
0.925
0.02
0.01
1.52
19.35
1.21
|
Donor 5
Actin
0.66
1.65
1.4
0.005
0.03
0.085
1.22
5.5
1.44
0.925
0.02
0.01
3.08
16.03
1.00
|
Donor 6
DNAH3
59.45
150.57
19.71
0.58
0.91
1.73
1.22
26.38
150.33
0.925
28.58
5.59
367.48
813.46
3.66
|
Donor 6
DST
44.3
186.38
22.05
1.56
0.03
0.085
28.27
21.57
149.86
0.925
6.68
4.12
170.63
636.19
2.86
|
Donor 6
EPS8L1
47.7
132.54
24.08
2.42
0.03
0.085
1.22
23.24
53.62
0.925
10.24
4.59
322.88
623.57
2.81
|
Donor 6
FRMD4B
12.51
94.1
18.98
0.5
4.13
0.78
1.22
27
33.89
0.925
0.8
0.24
24.26
219.34
0.99
|
Donor 6
LAMA3
47.4
31
11.77
0.54
0.03
0.085
1.22
15
48.92
0.925
8.14
0.01
254.81
419.85
1.89
|
Donor 6
MET
36.59
255.47
19.03
1.92
0.03
0.4
1.22
59.85
64.07
0.925
3.14
4.24
56.57
503.46
2.27
|
Donor 6
MIB2
28.73
46.26
15.32
1.69
7.7
0.085
1.22
16.35
44.57
0.925
1.58
0.58
202.54
367.55
1.65
|
Donor 6
MRC2
30.56
173.28
11.42
0.3
0.03
0.085
1.22
15.31
25.45
0.925
13.84
2.86
70.54
345.82
1.56
|
Donor 6
NOS2
70.25
513.42
21.89
2.25
0.03
1.11
1.22
72.8
117.93
1.85
2.77
2.06
197.11
1004.69
4.52
|
Donor 6
PLEC
52.82
69.38
21.92
1.42
0.03
0.085
1.22
20.11
58.11
0.925
16.23
2.43
262.58
507.26
2.28
|
Donor 6
PLEKHG5
23.2
140.24
15.8
0.19
0.03
0.085
1.22
20.73
55.53
0.925
1.96
0.17
136.4
396.48
1.78
|
Donor 6
PTGDS
44.5
194.94
14.38
1.12
0.03
0.085
1.22
30.35
54.69
0.925
6.64
2.43
125.84
477.15
2.15
|
Donor 6
RASA3
67.6
91.21
19.34
1.53
0.03
0.085
7.62
43.82
212.13
0.925
14.56
2.18
273.27
734.30
3.31
|
Donor 6
TRPM2
24.72
145.01
12.57
0.005
0.03
0.085
1.22
22.4
16.66
0.925
1.5
3.28
67.52
295.93
1.33
|
Donor 6
IKZF3
63.92
108.75
23.63
1.97
0.03
0.085
5.1
46.57
131.23
0.925
22.4
2.86
116.65
524.12
2.36
|
Donor 6
Actin
18.81
135.48
11.03
0.5
0.03
0.085
1.22
4.66
8.77
0.925
2.22
0.01
38.39
222.13
1.00
|
Donor 7
DNAH3
25.1
28.72
2.1
0.005
0.03
0.085
1.22
7.49
2.45
0.925
0.02
0.09
48.76
117.00
1.64
|
Donor 7
DST
20.84
93.16
3.11
0.005
0.03
0.085
1.22
10.1
4.73
0.925
1.02
0.01
80.77
216.01
3.03
|
Donor 7
EPS8L1
1.32
0.9
2.84
0.005
0.03
0.085
1.22
3.4
0.03
0.925
0.63
0.01
7.74
19.14
0.27
|
Donor 7
FRMD4B
12.7
21.99
3.25
0.005
0.03
0.085
1.22
2.66
1.7
0.925
0.02
0.01
27.73
72.33
1.01
|
Donor 7
LAMA3
2.88
3.49
3.13
0.005
0.03
0.085
1.22
1.06
2.32
0.925
0.02
0.38
7.3
22.85
0.32
|
Donor 7
MET
0.66
1.05
1.82
0.005
0.03
0.085
1.22
3.09
0.22
0.925
0.02
0.01
8.53
17.67
0.25
|
Donor 7
MIB2
44.9
19.98
7.32
0.005
0.03
0.085
1.22
0.63
8.89
0.925
0.02
0.01
30.68
114.70
1.61
|
Donor 7
MRC2
4.99
6.61
2.17
0.005
0.03
0.085
1.22
0.09
2.2
0.925
0.02
0.01
15.08
33.44
0.47
|
Donor 7
NOS2
64.4
61.11
9.55
0.38
0.03
2.29
1.22
3.93
10.2
0.925
0.18
0.01
29.13
183.36
2.57
|
Donor 7
PLEC
68.55
449.86
8.19
0.005
0.03
0.085
1.22
6.34
13.64
0.925
0.02
1.43
36.75
587.05
8.23
|
Donor 7
PLEKHG5
39.34
37.86
7.75
0.005
0.03
0.085
1.22
7.6
5.31
0.925
0.02
2.92
55.5
158.57
2.22
|
Donor 7
PTGDS
32.88
24.01
4.51
0.005
2.73
0.085
1.22
7.6
3.9
0.925
0.02
0.01
45.13
123.03
1.73
|
Donor 7
RASA3
42.8
44.03
7.54
0.005
0.03
0.085
1.22
7.8
14.2
0.925
0.02
0.31
36.75
155.72
2.18
|
Donor 7
TRPM2
29.69
140.85
2.97
0.005
0.03
0.085
1.22
25.75
3.72
0.925
0.02
0.01
124.46
329.74
4.62
|
Donor 7
IKZF3
43.4
29.69
8.26
0.005
0.03
0.085
1.22
5.71
6.88
0.925
0.02
0.45
37.8
134.48
1.89
|
Donor 7
Actin
3.31
6.53
0.77
0.01
0.03
2.29
1.22
7.7
0.14
0.925
0.02
0.01
48.35
71.31
1.00
|
Donor 8
DNAH3
110.13
191.67
72.91
1.32
0.03
4.85
3.47
9.27
105.51
0.925
0.4
0.78
121.93
623.20
47.79
|
Donor 8
DST
58.57
75.26
15.34
0.38
0.49
0.085
1.22
12.81
45.35
0.925
0.02
2.43
79.79
292.67
22.44
|
Donor 8
EPS8L1
88.89
63.7
41.38
1.19
0.03
0.085
6.26
10.1
121.32
0.925
0.02
4.24
92.38
430.52
33.02
|
Donor 8
FRMD4B
29.4
65.37
9.26
0.42
0.03
0.085
6.48
8.43
53.96
0.925
0.02
1.68
53.45
229.71
17.62
|
Donor 8
LAMA3
197.84
534.58
80.04
6.66
5.92
0.085
11.96
16.25
222.4
0.925
0.49
0.01
173.02
1250.18
95.87
|
Donor 8
MET
166.16
260.07
34.37
1.29
0.03
0.95
6.15
19.79
180.96
0.925
3.81
0.01
150.63
825.15
63.28
|
Donor 8
MIB2
55.58
97.75
8.09
3.34
0.03
0.4
10.38
14.37
48.48
0.925
4.22
0.01
70.89
314.47
24.12
|
Donor 8
MRC2
18.72
20.86
7.27
0.005
0.03
0.085
1.22
5.92
27.67
0.925
0.02
0.01
27.96
110.70
8.49
|
Donor 8
NOS2
79.04
62.03
23.6
1.36
0.03
0.085
8.21
11.98
120.62
0.925
1.28
0.01
53.5
362.67
27.81
|
Donor 8
PLEC
190.8
360.99
57.12
8.89
0.03
0.085
33.62
22.19
218.93
0.925
0.67
0.58
135.11
1029.94
78.98
|
Donor 8
PLEKHG5
30.37
80.65
6.89
0.005
0.03
0.085
1.22
12.39
12.62
0.925
0.08
0.01
34.21
179.94
13.76
|
Donor 8
PTGDS
17.08
7.78
5.28
0.005
1.92
0.085
1.22
13.44
25.12
0.925
0.67
2.31
25.09
100.93
7.74
|
Donor 8
RASA3
125.64
123.92
31.79
2.26
0.03
0.085
51.42
14.69
295.64
0.925
3.02
1.3
122.48
773.20
59.29
|
Donor 8
TRPM2
24.34
6.76
9.28
0.54
0.03
0.085
1.22
10.62
36.72
0.925
0.76
0.38
38.24
129.90
9.96
|
Donor 8
IKZF3
91.55
147.61
33.66
1.15
0.03
0.085
3.39
9.16
104.46
0.925
1.02
2.8
80.67
476.51
36.54
|
Donor 8
Actin
0.66
1.12
1.9
0.22
0.03
0.085
1.22
3.61
0.03
0.925
0.02
0.58
2.64
13.04
1.00
|
Donor 9
DNAH3
18.58
8.02
1.45
0.005
0.91
0.085
1.22
12.71
4.02
0.925
0.18
0.78
106.41
155.30
2.24
|
Donor 9
DST
18.02
15.32
3.89
0.17
0.03
0.085
1.22
8.22
1.19
0.925
0.02
0.01
64.97
114.07
1.64
|
Donor 9
EPS8L1
0.66
3.49
16.23
0.005
0.03
0.085
1.22
2.77
3.18
0.925
0.58
0.01
7.16
36.35
0.52
|
Donor 9
FRMD4B
5.93
3.18
2.93
0.005
0.03
0.085
1.22
0.09
0.92
0.925
0.04
0.01
12.73
28.10
0.40
|
Donor 9
LAMA3
0.66
4.03
2.75
0.005
0.03
2.01
1.22
1.28
1.51
0.925
0.02
0.01
6.68
21.13
0.30
|
Donor 9
MET
2.43
0.005
2.88
0.005
0.03
0.085
1.22
4.66
0.92
0.925
0.02
0.01
15.76
28.95
0.42
|
Donor 9
MIB2
13.91
10.55
5.42
0.005
0.03
0.085
1.22
6.55
4.25
0.925
0.02
0.01
63.45
106.43
1.53
|
Donor 9
MRC2
0.66
15.32
5.84
0.005
0.03
0.085
1.22
9.06
3.42
0.925
0.02
0.01
11.63
48.23
0.69
|
Donor 9
NOS2
27.96
18.69
4.86
0.005
0.03
0.085
1.22
22.19
2.01
0.925
1.19
0.01
220.43
299.61
4.32
|
Donor 9
PLEC
3.36
4.73
2.7
0.005
0.03
2.01
1.22
1.92
0.65
0.925
0.02
0.01
15.95
33.53
0.48
|
Donor 9
PLEKHG5
1.42
1.35
2.97
0.56
4.13
0.085
1.22
4.03
0.51
0.925
0.02
0.01
8.07
25.50
0.37
|
Donor 9
PTGDS
9.72
1.5
2.15
0.005
0.03
0.085
1.22
5.71
1.95
0.925
0.02
0.01
47.71
71.04
1.02
|
Donor 9
RASA3
2.48
6.14
2.12
0.005
0.03
0.085
1.22
4.03
0.03
0.925
1.19
0.01
14.78
33.05
0.48
|
Donor 9
TRPM2
5.56
0.9
4.77
0.38
0.03
0.085
1.22
4.03
1.32
0.925
0.02
0.01
10.04
29.29
0.42
|
Donor 9
IKZF3
9.67
0.005
6.18
0.005
0.03
1.43
1.22
5.08
1.32
0.925
0.08
0.01
31.98
57.94
0.83
|
Donor 9
Actin
0.66
3.49
0.77
0.36
0.03
2.01
1.22
2.13
1.05
0.925
0.58
0.01
56.18
69.42
1.00
|
|
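The last two values in each row of the table above are numerically consistent with (i) the sum of the thirteen preceding readouts and (ii) that sum divided by the sum for the Actin control of the same donor. The following is a minimal Python sketch of that arithmetic using two Donor 1 rows copied from the table. The function names are illustrative assumptions and are not part of the original disclosure.

```python
# Readouts for two Donor 1 rows, copied from the table above.
# Only the arithmetic is illustrated; column meanings are not restated here.
DONOR1 = {
    "EPS8L1": [153.07, 351.34, 100.97, 11.8, 0.03, 0.085, 28.88,
               33.71, 431.94, 0.925, 0.02, 6.17, 434.22],
    "Actin": [19.75, 147.18, 34.21, 1.46, 0.03, 0.085, 1.22,
              10.1, 14.2, 0.925, 0.02, 0.78, 101.44],
}


def row_total(values):
    """Sum of all readouts in one row, rounded to two decimals."""
    return round(sum(values), 2)


def fold_over_control(rows, control="Actin"):
    """Ratio of each row total to the control row total (hypothetical helper)."""
    control_total = sum(rows[control])
    return {name: round(sum(vals) / control_total, 2)
            for name, vals in rows.items()}


totals = {name: row_total(vals) for name, vals in DONOR1.items()}
folds = fold_over_control(DONOR1)

for name in DONOR1:
    print(f"{name}: total={totals[name]:.2f}, fold over Actin={folds[name]:.2f}")
```

Under this reading, the Actin row of every donor has a fold value of 1.00 by construction, which matches the tabulated entries.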
To test the immunogenic capacity of specific N-terminal peptides in a more cellular setting, we then assessed the responses of T cells previously primed to recognize either altered or wild-type peptides when co-cultured with HLA-matched isogenic GC cells expressing the altered or wild-type peptides, respectively (FIG. 12). By MHC-I affinity screening, a VMCDIFFSL nonamer in the WT RASA3 N-terminus was predicted to bind with high affinity to both the HLA-A*02:01 (IC50=6.93 nM) and HLA-A*02:06 (IC50=9.74 nM) alleles. Using HLA-A*02:06 T cells that are cross-reactive to HLA-A*02:01-positive AGS cells, we tested release of interferon gamma (IFNγ) from primed T cells after exposure to AGS lysates expressing either RASA3 CanT or SomT isoforms. ELISA assays demonstrated that T cells primed to recognize RASA3 CanT released significantly more IFNγ when co-cultured with RASA3 CanT-expressing AGS cells than when co-cultured with RASA3 SomT-expressing AGS cells. In contrast, T cells primed with RASA3 SomT did not exhibit appreciable IFNγ release when co-cultured with RASA3 SomT-expressing AGS cells, indicating that RASA3 SomT is less immunogenic (FIG. 12). Taken together, these in vitro results demonstrate that peptides predicted to be depleted in GCs through somatic promoter alterations can produce immunogenic responses, with the magnitude of the immune response depending on both peptide sequence and host immune background.
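As an illustration of how predicted IC50 values such as those quoted above are typically interpreted, the following Python sketch bins a peptide-allele pair by its predicted IC50, where a lower value indicates tighter predicted binding. The 50 nM and 500 nM cut-offs are a widely used convention and are an assumption here; the text does not state which thresholds were applied. The function and constant names are likewise hypothetical.

```python
# Predicted IC50 values (nM) for the VMCDIFFSL nonamer, as quoted in the text.
PREDICTED_IC50_NM = {
    "HLA-A*02:01": 6.93,
    "HLA-A*02:06": 9.74,
}

# Assumed cut-offs (common convention, not taken from this document).
HIGH_AFFINITY_NM = 50.0
INTERMEDIATE_AFFINITY_NM = 500.0


def classify_binding(ic50_nm):
    """Bin a predicted IC50 (nM); lower IC50 means stronger predicted binding."""
    if ic50_nm < HIGH_AFFINITY_NM:
        return "high"
    if ic50_nm < INTERMEDIATE_AFFINITY_NM:
        return "intermediate"
    return "low"


calls = {allele: classify_binding(ic50)
         for allele, ic50 in PREDICTED_IC50_NM.items()}

for allele, call in calls.items():
    print(f"VMCDIFFSL / {allele}: {call} predicted affinity")
```

With these assumed cut-offs, both alleles fall in the high-affinity bin, in line with the description of the nonamer as a predicted high-affinity binder.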
Somatic Promoters are Associated with EZH2 Occupancy
To identify potential oncogenic mechanisms driving somatic promoter alterations, we intersected the genomic locations of the somatic promoters with transcription factor binding sites (TFBS) of 237 transcription factors from 83 different tissues. Regions exhibiting somatic promoters were significantly enriched in regions associated with EZH2 (P<0.01) and SUZ12 (P<0.01) binding (FIG. 6a, Table 13), confirming earlier findings on a smaller cohort. Both EZH2 and SUZ12 are components of the PRC2 epigenetic regulator complex, which is upregulated in many cancer types including GC. To validate these findings, we then performed EZH2 ChIP-sequencing on HFE-145 normal gastric epithelial cells (Methods and Materials). Concordant with the previous findings, we observed significant enrichment of EZH2 binding sites at somatic promoters compared to all promoters (enrichment score 27 vs. 13 for all promoters, P<0.01), and this EZH2 enrichment remained significant when the gained somatic promoters (enrichment score 28, P<0.01) and lost somatic promoters (enrichment score 24, P<0.01) were analyzed separately (FIG. 18).
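The intersection step described above can be sketched in Python as a simple interval-overlap test. The three promoter loci are taken from Table 13; the binding-site intervals are hypothetical placeholders chosen only to exercise the code, and the sketch does not reproduce the enrichment score or the P-values, whose computation is not specified in this passage.

```python
# Somatic promoter loci (chromosome, start, end, gene) taken from Table 13.
PROMOTERS = [
    ("chr13", 100634350, 100638150, "ZIC2"),
    ("chr3", 181428150, 181434750, "SOX2"),
    ("chr16", 68770300, 68774200, "CDH1"),
]

# Hypothetical binding-site intervals (placeholders, not real peak calls).
BINDING_SITES = [
    ("chr13", 100636000, 100636500),
    ("chr3", 181430000, 181431000),
    ("chr1", 1000, 2000),
]


def intervals_overlap(chrom_a, start_a, end_a, chrom_b, start_b, end_b):
    """True if two half-open intervals on the same chromosome share a base."""
    return chrom_a == chrom_b and start_a < end_b and start_b < end_a


def promoters_with_binding(promoters, sites):
    """Genes whose promoter interval overlaps at least one binding site."""
    hits = set()
    for chrom, start, end, gene in promoters:
        if any(intervals_overlap(chrom, start, end, *site) for site in sites):
            hits.add(gene)
    return hits


overlapping = promoters_with_binding(PROMOTERS, BINDING_SITES)
fraction = len(overlapping) / len(PROMOTERS)
print(sorted(overlapping), f"{fraction:.2f}")
```

For genome-scale inputs, an interval tree or a sorted sweep would replace the nested loop used here; the quadratic scan is kept only for readability.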
TABLE 13
Somatic Promoters Overlapping EZH2/SUZ12 Binding Sites
|
Loci
Annotation Status
Associated Gene
|
chrX: 136647100-
Known
ZIC3
|
136648150
|
chr13: 100634350-
Known
ZIC2
|
100638150
|
chr13: 100630200-
Known
ZIC2
|
100634000
|
chr20: 50719850-
Known
ZFP64
|
50723350
|
chr18: 45660800-
Known
ZBTB7C
|
45664950
|
chr1: 185226150-
Known
Y_RNA
|
185227950
|
chr3: 13920600-
Known
WNT7A
|
13921250
|
chr2: 71126100-
Known
VAX2
|
71129800
|
chr5: 6448050-
Known
UBE2QL1
|
6451150
|
chr8: 72986650-
Known
TRPA1
|
72987850
|
chr22: 17082250-
Known
TPTEP1
|
17084550
|
chr19: 55657350-
Known
TNNT1
|
55658650
|
chr19: 55666950-
Known
TNNI3
|
55668450
|
chr22: 42320400-
Known
TNFRSF13C
|
42323750
|
chr8: 119962100-
Known
TNFRSF11B
|
119965650
|
chr21: 42873650-
Known
TMPRSS2
|
42881750
|
chr20: 1164650-
Known
TMEM74B
|
1168700
|
chr17: 53797250-
Known
TMEM100
|
53803100
|
chr11: 119291200-
Known
THY1
|
119294700
|
chr20: 55203450-
Known
TFAP2C
|
55206500
|
chr6: 10409250-
Known
TFAP2A; TFAP2A-AS1
|
10419650
|
chr6: 85471550-
Known
TBX18
|
85475350
|
chr20: 46411750-
Known
SULF2
|
46414250
|
chr8: 70403800-
Known
SULF1
|
70408450
|
chr5: 172753250-
Known
STC2
|
172757450
|
chr14: 38675750-
Known
SSTR1
|
38681750
|
chr7: 20824950-
Known
SP8
|
20827850
|
chr13: 95362100-
Known
SOX21; SOX21-AS1
|
95368650
|
chr3: 181428150-
Known
SOX2
|
181434750
|
chr8: 101660950-
Known
SNX31
|
101662650
|
chr20: 10197250-
Known
SNAP25; SNAP25-AS1
|
10201300
|
chr20: 48598400-
Known
SNAI1
|
48604100
|
chr14: 70346050-
Known
SMOC1
|
70347700
|
chr12: 85303950-
Known
SLC6A15
|
85307700
|
chr19: 17981100-
Known
SLC5A5
|
17986400
|
chr2: 228580350-
Known
SLC19A3
|
228583450
|
chr3: 121656650-
Known
SLC15A2
|
121658300
|
chr6: 100910100-
Known
SIM1
|
100913300
|
chr21: 44842150-
Known
SIK1
|
44848700
|
chr7: 37953600-
Known
SFRP4
|
37956950
|
chr4: 154708850-
Known
SFRP2
|
154714150
|
chr16: 23193600-
Known
SCNN1G
|
23197800
|
chr16: 23312800-
Known
SCNN1B
|
23315350
|
chr2: 200326950-
Known
SATB2
|
200329550
|
chr20: 50415800-
Known
SALL4
|
50419950
|
chr20: 981750-
Known
RSPO4
|
984100
|
chr1: 148247000-
Known
RP11-89F3.2
|
148248800
|
chr12: 54472600-
Known
RP11-834C11.6; RP11-834C11.7
|
54477950
|
chr5: 72746300-
Known
RP11-79P5.7
|
72748200
|
chr1: 61103800-
Known
RP11-776H12.1
|
61106600
|
chr11: 134335600-
Known
RP11-627G23.1
|
134339750
|
chr11: 69830350-
Known
RP11-626H12.1
|
69834850
|
chr16: 89987550-
Known
RP11-566K11.4; TUBB3
|
89991500
|
chr16: 86319900-
Known
RP11-514D23.1
|
86321550
|
chr3: 50191700-
Known
RP11-493K19.3; SEMA3F
|
50195800
|
chr3: 132756350-
Known
RP11-469L4.1; TMEM108
|
132758550
|
chr6: 26613750-
Known
RP11-457M11.6
|
26615600
|
chr3: 87841650-
Known
RP11-451B8.1
|
87842700
|
chr1: 113391350-
Known
RP11-426L16.8; RP3-522D1.1
|
113395900
|
chr12: 85711250-
Known
RP11-408B11.2
|
85713200
|
chr6: 106807450-
Known
RP11-404H14.1
|
106809950
|
chr1: 149230550-
Known
RP11-403I13.5
|
149232000
|
chr1: 222138950-
Known
RP11-400N13.2
|
222144050
|
chr3: 178577000-
Known
RP11-385J1.2
|
178578500
|
chr17: 46721450-
Known
RP11-357H14.17
|
46725800
|
chr5: 522450-
Known
RP11-310P5.2; SLC9A3
|
524750
|
chr15: 80542500-
Known
RP11-2E17.1
|
80545200
|
chr5: 74343750-
Known
RP11-229C3.2
|
74351250
|
chr5: 63460450-
Known
RNF180
|
63463050
|
chr1: 228742450-
Known
RNA5SP19
|
228743450
|
chr1: 228781900-
Known
RNA5S17; RNA5SP18
|
228785450
|
chr21: 38379100-
Known
RIPPLY3
|
38379750
|
chr21: 43180350-
Known
RIPK4
|
43189850
|
chr8: 104510350-
Known
RIMS2; RP11-1C8.4
|
104514700
|
chr10: 62758000-
Known
RHOBTB1
|
62762450
|
chr15: 90039550-
Known
RHCG
|
90040150
|
chr2: 86564650-
Known
REEP1
|
86566000
|
chr4: 82964050-
Known
RASGEF1B; RP11-689K5.3
|
82966400
|
chr3: 75707050-
Known
RARRES2P1
|
75708850
|
chr8: 85093500-
Known
RALYL
|
85097700
|
chr8: 128805200-
Known
PVT1
|
128810000
|
chr1: 29562850-
Known
PTPRU
|
29565950
|
chr7: 158378250-
Known
PTPRN2
|
158380350
|
chr1: 170630400-
Known
PRRX1; RP1-79C4.4
|
170636550
|
chr6: 150463250-
Known
PPP1R14C
|
150464400
|
chr12: 133264050-
Known
POLE; PXMP2; RP13-672B3.2
|
133266950
|
chr5: 74990850-
Known
POC5
|
74992350
|
chr20: 56280450-
Known
PMEPA1
|
56287350
|
chr16: 57315850-
Known
PLLP
|
57319550
|
chr1: 6544500-
Known
PLEKHG5
|
6545600
|
chr14: 69950300-
Known
PLEKHD1
|
69951550
|
chr1: 201251800-
Known
PKP1
|
201254650
|
chr2: 42275400-
Known
PKDCC
|
42282950
|
chr12: 130823500-
Known
PIWIL1
|
130825600
|
chr4: 111557000-
Known
PITX2
|
111559350
|
chr7: 32107350-
Known
PDE1C
|
32111900
|
chr1: 55504650-
Known
PCSK9
|
55507550
|
chr15: 102029650-
Known
PCSK6
|
102031300
|
chr3: 142606500-
Known
PCOLCE2
|
142609050
|
chr14: 37129750-
Known
PAX9
|
37133800
|
chr1: 17443850-
Known
PADI2
|
17446850
|
chr8: 99951150-
Known
OSR2; RP11-44N12.5; STK3
|
99961750
|
chr1: 161991300-
Known
OLFML2B
|
161994850
|
chr7: 8473050-
Known
NXPH1
|
8474100
|
chr9: 87282200-
Known
NTRK2
|
87286150
|
chr19: 15309800-
Known
NOTCH3
|
15311950
|
chr4: 56500900-
Known
NMU
|
56504300
|
chr1: 183385400-
Known
NMNAT2
|
183388500
|
chr8: 41502400-
Known
NKX6-3
|
41510150
|
chr10: 134596450-
Known
NKX6-2; RP11-288G11.3
|
134599400
|
chr4: 85417400-
Known
NKX6-1
|
85421400
|
chr2: 233791350-
Known
NGEF
|
233792700
|
chrX: 107016000-
Known
NCBP2L; TSC22D3
|
107021000
|
chr11: 1150000-
Known
MUC5AC
|
1157350
|
chr7: 100607850-
Known
MUC12; MUC3A; RP11-395B7.2
|
100613600
|
chr16: 56699800-
Known
MT1G; MT1H
|
56705700
|
chr12: 132313150-
Known
MMP17
|
132317650
|
chr7: 73036850-
Known
MLXIPL
|
73039200
|
chr19: 54482850-
Known
MIR935
|
54485950
|
chr9: 21554500-
Known
MIR31HG
|
21561150
|
chr17: 46800050-
Known
MIR3185; PRAC1; PRAC2
|
46802400
|
chr1: 1562700-
Known
MIB2
|
1565700
|
chr1: 205537050-
Known
MFSD4
|
205540700
|
chr13: 31480150-
Known
MEDAG
|
31483050
|
chr2: 132152200-
Known
MED15P3
|
132153000
|
chr3: 150959500-
Known
MED12L
|
150960300
|
chr2: 149894250-
Known
LYPD6B
|
149897500
|
chr11: 1889150-
Known
LSP1
|
1894600
|
chr1: 156896950-
Known
LRRC71
|
156898350
|
chr11: 61275250-
Known
LRRC10B; MIR4488
|
61276400
|
chr9: 103789900-
Known
LPPR1
|
103792650
|
chr16: 1013250-
Known
LMF1
|
1015550
|
chr1: 2980250-
Known
LINC00982; PRDM16
|
2991900
|
chr3: 75719150-
Known
LINC00960
|
75723200
|
chr20: 21085550-
Known
LINC00237
|
21087550
|
chr19: 55127750-
Known
LILRB1
|
55130550
|
chr7: 103968400-
Known
LHFPL3
|
103969950
|
chr1: 202182400-
Known
LGR6
|
202184350
|
chr1: 202161700-
Known
LGR6
|
202163400
|
chr1: 65991250-
Known
LEPR
|
65992850
|
chr1: 205424550-
Known
LEMD1; RP11-576D8.4
|
205426850
|
chr20: 9494050-
Known
LAMP5; RP5-1119D9.4
|
9498000
|
chr6: 129203450-
Known
LAMA2
|
129207800
|
chr19: 51485750-
Known
KLK7
|
51487700
|
chr3: 126073900-
Known
KLF15
|
126077300
|
chr1: 245315950-
Known
KIF26B
|
245321950
|
chr1: 180880350-
Known
KIAA1614
|
180883200
|
chr15: 81070500-
Known
KIAA1199
|
81075050
|
chr20: 43728950-
Known
KCNS1
|
43730250
|
chr14: 88788450-
Known
KCNK10
|
88791000
|
chr7: 119911950-
Known
KCND2
|
119914550
|
chr1: 111210100-
Known
KCNA3
|
111218300
|
chr16: 31366400-
Known
ITGAX
|
31369100
|
chr20: 13200350-
Known
ISM1
|
13202100
|
chr16: 54316250-
Known
IRX3
|
54322800
|
chr5: 2748900-
Known
IRX2
|
2751450
|
chr17: 38016450-
Known
IKZF3
|
38022250
|
chr22: 23229500-
Known
IGLC1; IGLJ1; IGLL5
|
23237350
|
chr19: 46579500-
Known
IGFL4
|
46581300
|
chr7: 45927300-
Known
IGFBP1
|
45929150
|
chr7: 23506000-
Known
IGF2BP3
|
23515500
|
chr6: 87646350-
Known
HTR1E
|
87648250
|
chr5: 175084150-
Known
HRH2
|
175086850
|
chr3: 11195250-
Known
HRH1
|
11198600
|
chr4: 175439400-
Known
HPGD
|
175445700
|
chr12: 54386800-
Known
HOXC6; HOXC9; HOXC-AS1; HOXC-AS2
|
54395700
|
chr12: 54421700-
Known
HOXC6
|
54423400
|
chr12: 54410150-
Known
HOXC4; HOXC6; RP11-834C11.14
|
54413050
|
chr12: 54446200-
Known
HOXC4
|
54449350
|
chr12: 54331500-
Known
HOXC13; HOXC-AS5
|
54334550
|
chr12: 54375250-
Known
HOXC10; HOXC-AS3; RP11-834C11.12
|
54381900
|
chr17: 46701450-
Known
HOXB9
|
46705000
|
chr17: 46804450-
Known
HOXB13
|
46808100
|
chr7: 27159450-
Known
HOXA3; HOXA-AS2
|
27164850
|
chr7: 27208400-
Known
HOXA10; HOXA9; HOXA-AS4; MIR196B; RP1-170O19.20
|
27220700
|
chr7: 27221300-
Known
HOTTIP; HOXA11; HOXA11-AS; HOXA13; RP1-170O19.14
|
27251300
|
chr12: 54365950-
Known
HOTAIR; HOXC11
|
54373250
|
chr1: 6478800-
Known
HES2
|
6480950
|
chr11: 2016000-
Known
H19
|
2021350
|
chr11: 45942850-
Known
GYLTL1B
|
45946400
|
chr9: 140056700-
Known
GRIN1
|
140058300
|
chr15: 72488700-
Known
GRAMD2
|
72491050
|
chr17: 72425800-
Known
GPRC5C
|
72433550
|
chr5: 89854500-
Known
GPR98
|
89855350
|
chrX: 133117900-
Known
GPC3
|
133120700
|
chr19: 2700850-
Known
GNG7
|
2702900
|
chr7: 99526050-
Known
GJC3; RP4-604G5.1
|
99527900
|
chr8: 75230900-
Known
GDAP1; JPH1
|
75235150
|
chr7: 74379400-
Known
GATSL1
|
74380400
|
chr20: 61046800-
Known
GATA5; RP13-379O24.3
|
61052500
|
chr8: 11533800-
Known
GATA4
|
11540650
|
chr8: 11557150-
Known
GATA4
|
11568950
|
chr11: 11640700-
Known
GALNT18
|
11644650
|
chr12: 130645350-
Known
FZD10; FZD10-AS1
|
130646800
|
chr6: 96460900-
Known
FUT9
|
96466650
|
chr13: 39259850-
Known
FREM2
|
39263000
|
chr16: 86600550-
Known
FOXC2; RP11-463O9.5
|
86601800
|
chr6: 1608550-
Known
FOXC1
|
1611700
|
chr14: 38051900-
Known
FOXA1; TTC6
|
38070050
|
chr17: 39965500-
Known
FKBP10; LEPREL4
|
39970950
|
chr9: 133813800-
Known
FIBCD1
|
133816150
|
chr11: 69630950-
Known
FGF3
|
69635350
|
chr3: 13973700-
Known
FGD5P1
|
13975200
|
chr10: 95325600-
Known
FFAR4
|
95329150
|
chr7: 121942750-
Known
FEZF1; FEZF1-AS1
|
121947900
|
chr16: 86529000-
Known
FENDRR
|
86534050
|
chr21: 42687850-
Known
FAM3B
|
42691150
|
chr17: 66593700-
Known
FAM20A
|
66598900
|
chr1: 179711850-
Known
FAM163A
|
179712600
|
chr8: 53476650-
Known
FAM150A
|
53479500
|
chr4: 187025100-
Known
FAM149A
|
187028650
|
chr12: 124778800-
Known
FAM101A
|
124786100
|
chr7: 27281600-
Known
EVX1; EVX1-AS
|
27284150
|
chrX: 103498450-
Known
ESX1
|
103500200
|
chr1: 216892850-
Known
ESRRG
|
216898200
|
chr19: 55590850-
Known
EPS8L1
|
55593800
|
chr8: 144950100-
Known
EPPK1
|
144953650
|
chr17: 48608600-
Known
EPN3
|
48615100
|
chr1: 23037600-
Known
EPHB2
|
23041300
|
chr9: 112080500-
Known
EPB41L4B
|
112082950
|
chr7: 155250600-
Known
EN2
|
155253200
|
chr19: 14885900-
Known
EMR2
|
14888350
|
chr22: 37821950-
Known
ELFN2; RP1-63G5.5
|
37823900
|
chr19: 1286150-
Known
EFNA2; MUM1
|
1288700
|
chr20: 57874800-
Known
EDN3
|
57877300
|
chr15: 45399500-
Known
DUOX2; DUOXA2
|
45410700
|
chr16: 30021900-
Known
DOC2A
|
30023950
|
chr7: 96633500-
Known
DLX6; DLX6-AS1; DLX6-AS2
|
96636700
|
chr7: 96652750-
Known
DLX5
|
96654900
|
chr19: 6474700-
Known
DENND1C
|
6477300
|
chr10: 94831200-
Known
CYP26A1
|
94834300
|
chr4: 48987500-
Known
CWH43
|
48989500
|
chr8: 104382100-
Known
CTHRC1
|
104385900
|
chr5: 174177950-
Known
CTD-2532K18.1; MIR4634
|
174179050
|
chr14: 19924450-
Known
CTD-2314B22.3
|
19925600
|
chr14: 19640850-
Known
CTD-2314B22.1
|
19641750
|
chr15: 97838750-
Known
CTD-2147F2.1
|
97841300
|
chr5: 134912900-
Known
CTC-321K16.1; CXCL14
|
134915350
|
chr5: 134371700-
Known
CTC-276P9.1
|
134375750
|
chr16: 21288600-
Known
CRYM
|
21290700
|
chr2: 102002650-
Known
CREG2
|
102005250
|
chr15: 78632500-
Known
CRABP1
|
78634200
|
chr3: 9745600-
Known
CPNE9
|
9747050
|
chr16: 89640950-
Known
CPNE7
|
89643950
|
chr3: 99355450-
Known
COL8A1
|
99359900
|
chr6: 33160200-
Known
COL11A2
|
33161450
|
chr6: 35754500-
Known
CLPSL1
|
35755750
|
chr21: 36041150-
Known
CLIC6
|
36045150
|
chr17: 7161850-
Known
CLDN7; RP1-4G17.5
|
7167950
|
chr7: 73181100-
Known
CLDN3
|
73185850
|
chr3: 190034900-
Known
CLDN1; CLDN16
|
190041800
|
chr7: 29184550-
Known
CHN2; CPVL
|
29187650
|
chr2: 27340450-
Known
CGREF1
|
27342750
|
chr13: 28538700-
Known
CDX2
|
28543950
|
chr5: 149545100-
Known
CDX1
|
149550500
|
chr16: 68677900-
Known
CDH3; RP11-615I2.2
|
68681200
|
chr16: 68770300-
Known
CDH1
|
68774200
|
chr11: 6279800-
Known
CCKBR
|
6283200
|
chr18: 57363700-
Known
CCBE1; RP11-2N1.2
|
57365350
|
chr8: 76189900-
Known
CASC9
|
76191050
|
chr6: 17392850-
Known
CAP2
|
17396100
|
chr1: 20808950-
Known
CAMK2N1
|
20814450
|
chr7: 44265350-
Known
CAMK2B
|
44266400
|
chr8: 86350000-
Known
CA3
|
86351450
|
chr5: 2751850-
Known
C5orf38; IRX2
|
2754050
|
chr3: 138664900-
Known
C3orf72; FOXL2
|
138667100
|
chr17: 77019250-
Known
C1QTNF1; C1QTNF1-AS1
|
77024000
|
chr1: 223565950-
Known
C1orf65
|
223567600
|
chr1: 190440800-
Known
BRINP3; RP11-161I10.1; RP11-547I7.2
|
190450200
|
chr2: 198650550-
Known
BOLL
|
198651850
|
chr15: 83952250-
Known
BNC1
|
83953300
|
chr4: 42152300-
Known
BEND4
|
42155900
|
chr17: 47209750-
Known
B4GALNT2
|
47211400
|
chr11: 134279600-
Known
B3GAT1
|
134282050
|
chr4: 94748600-
Known
ATOH1
|
94754050
|
chr9: 120175650-
Known
ASTN2
|
120177900
|
chr9: 133319400-
Known
ASS1
|
133324650
|
chr11: 2285750-
Known
ASCL2
|
2292550
|
chr16: 329250-
Known
ARHGDIG
|
332250
|
chr8: 145908800-
Known
ARHGAP39
|
145912600
|
chr4: 86395150-
Known
ARHGAP24
|
86399900
|
chr18: 24443050-
Known
AQP4; AQP4-AS1
|
24445900
|
chr11: 71318250-
Known
AP000867.1
|
71320050
|
chr5: 79864800-
Known
ANKRD34B
|
79866650
|
chr2: 133014850-
Known
ANKRD30BL; MIR663B
|
133015750
|
chr12: 85672750-
Known
ALX1
|
85675650
|
chr6: 168195400-
Known
AL009178.1; C6orf123
|
168198750
|
chr10: 4867450-
Known
AKR1E2
|
4870200
|
chr16: 3232300-
Known
AJ003147.8
|
3234150
|
chr8: 11203650-
Known
AF131216.5; TDH
|
11206800
|
chr17: 15847250-
Known
ADORA2B
|
15850800
|
chr7: 5601050-
Known
ACTB
|
5603800
|
chr7: 100490350-
Known
ACHE
|
100495550
|
chr3: 18734950-
Known
AC144521.1
|
18736300
|
chr2: 131593950-
Known
AC133785.1; ARHGEF4
|
131595800
|
chr4: 44447900-
Known
AC131951.1; KCTD8
|
44452050
|
chr17: 7982650-
Known
AC129492.6; ALOX12B
|
7984350
|
chr5: 1003400-
Known
AC116351.2; RP11-43F13.4
|
1005850
|
chr2: 100721300-
Known
AC092667.2; AFF3
|
100722600
|
chr2: 286750-
Known
AC079779.4; FAM150B
|
288600
|
chr2: 132121200-
Known
AC073869.1
|
132122150
|
chr2: 233282700-
Known
AC068134.5; AC068134.6
|
233286450
|
chr16: 31495650-
Known
AC026471.6; SLC5A2
|
31500700
|
chr12: 54348250-
Known
AC012531.23; HOXC12
|
54351050
|
chr2: 118561200-
Known
AC009312.1
|
118562150
|
chr16: 51182700-
Known
AC009166.5; SALL1
|
51185700
|
chr2: 171671550-
Known
AC007405.8; GAD1
|
171676200
|
chr2: 66801200-
Known
AC007392.3
|
66811950
|
chr2: 71113350-
Known
AC007040.5
|
71116800
|
chr7: 15720950-
Known
AC005550.4; MEOX2
|
15728900
|
chr6: 1611750-
Unknown
—
|
1616000
|
chr15: 96958950-
Unknown
—
|
96961350
|
chr2: 66652100-
Unknown
—
|
66655200
|
chr2: 8833050-
Unknown
—
|
8834200
|
chr9: 17905350-
Unknown
—
|
17908250
|
chr5: 2746900-
Unknown
—
|
2748550
|
chr7: 45001800-
Unknown
—
|
45003250
|
chr12: 52257150-
Unknown
—
|
52258000
|
chr2: 218874000-
Unknown
—
|
218875450
|
chr19: 30214300-
Unknown
—
|
30216100
|
chr8: 140717350-
Unknown
—
|
140719650
|
chr7: 27264550-
Unknown
—
|
27266100
|
chr19: 48900250-
Unknown
—
|
48904400
|
chr16: 51186150-
Unknown
—
|
51187850
|
chr9: 132458700-
Unknown
—
|
132461300
|
chr11: 44337850-
Unknown
—
|
44339250
|
chr17: 46694850-
Unknown
—
|
46697150
|
chr10: 124898400-
Unknown
—
|
124900700
|
chr6: 10382900-
Unknown
—
|
10384750
|
chr8: 144489000-
Unknown
—
|
144490750
|
chr20: 49837550-
Unknown
—
|
49839250
|
chr3: 193921100-
Unknown
—
|
193922050
|
chr13: 100619800-100623100 | Unknown | —
chr1: 165320950-165322700 | Unknown | —
chr1: 180203650-180205650 | Unknown | —
chr1: 23543800-23544900 | Unknown | —
chr8: 144842350-144844000 | Unknown | —
chr5: 174162150-174163450 | Unknown | —
chr1: 184632450-184634700 | Unknown | —
chr13: 21295150-21296450 | Unknown | —
chr1: 156893100-156894550 | Unknown | —
chr20: 46434400-46435400 | Unknown | —
chr11: 33398050-33400750 | Unknown | —
chr6: 134216650-134218050 | Unknown | —
chr2: 45176050-45177700 | Unknown | —
chr13: 36044350-36045800 | Unknown | —
chr2: 45227500-45229600 | Unknown | —
chr10: 43427950-43429950 | Unknown | —
chr1: 152079200-152081300 | Unknown | —
chr7: 54731350-54733200 | Unknown | —
chr20: 4201500-4202700 | Unknown | —
chr8: 145555300-145556800 | Unknown | —
chr7: 64733800-64735500 | Unknown | —
chrX: 119124000-119127100 | Unknown | —
chr3: 14642850-14644150 | Unknown | —
chr10: 102488400-102492200 | Unknown | —
chr5: 42999400-43001150 | Unknown | —
chr21: 38063750-38066650 | Unknown | —
chr2: 131010400-131011600 | Unknown | —
chr19: 30018700-30020150 | Unknown | —
chr5: 72731550-72734700 | Unknown | —
chr8: 102092150-102094400 | Unknown | —
chr4: 4867350-4869600 | Unknown | —
chr4: 4854350-4855850 | Unknown | —
chr7: 156735150-156736500 | Unknown | —
chr1: 161442450-161443650 | Unknown | —
chr12: 54356450-54358100 | Unknown | —
chr1: 48174300-48176650 | Unknown | —
chr7: 25900700-25903050 | Unknown | —
chr10: 102830000-102833650 | Unknown | —
chr6: 137310350-137312150 | Unknown | —
chr1: 152081400-152084100 | Unknown | —
chr7: 27274550-27276500 | Unknown | —
chr12: 113904650-113906650 | Unknown | —
chr1: 17024500-17028900 | Unknown | —
chr5: 72528750-72529950 | Unknown | —
chr9: 99481850-99483650 | Unknown | —
chr1: 46954600-46956800 | Unknown | —
chr17: 26119900-26121850 | Unknown | —
chr1: 2253650-2254650 | Unknown | —
chr7: 73060250-73063150 | Unknown | —
chr19: 1754200-1758750 | Unknown | —
chr9: 29211200-29215700 | Unknown | —
chr7: 31375200-31377000 | Unknown | —
chr1: 165344500-165346650 | Unknown | —
chr10: 57389650-57391700 | Unknown | —
chr1: 163441550-163443100 | Unknown | —
chr1: 200842700-200844850 | Unknown | —
chr20: 44639000-44640950 | Unknown | —
chr2: 176952400-176953750 | Unknown | —
chr20: 6031700-6033850 | Unknown | —
chr5: 2738550-2740800 | Unknown | —
chr3: 74662150-74664400 | Unknown | —
chr10: 134600350-134602350 | Unknown | —
chr1: 152084900-152085650 | Unknown | —
chr8: 52520450-52521550 | Unknown | —
chr1: 121279850-121280850 | Unknown | —
chr13: 37729350-37731000 | Unknown | —
chr7: 8390700-8392150 | Unknown | —
chr12: 32818500-32820350 | Unknown | —
chr16: 15350450-15351950 | Unknown | —
chr2: 58342200-58346950 | Unknown | —
chr3: 112383300-112384750 | Unknown | —
chr19: 1682300-1683350 | Unknown | —
chr4: 27077050-27078000 | Unknown | —
chr8: 23507850-23509050 | Unknown | —
chr4: 10782250-10783600 | Unknown | —
chr17: 12927950-12928650 | Unknown | —
chr2: 11989300-11990550 | Unknown | —
chr7: 23074700-23076100 | Unknown | —
chr22: 28479200-28480250 | Unknown | —
chr9: 36763800-36766950 | Unknown | —
chr6: 28757250-28758600 | Unknown | —
chr1: 50032150-50033200 | Unknown | —
chr6: 4334150-4335300 | Unknown | —
chr1: 195732150-195733300 | Unknown | —
chr6: 170483200-170484200 | Unknown | —
chr12: 38447100-38448600 | Unknown | —
chr7: 86667750-86669950 | Unknown | —
chr16: 9683650-9684650 | Unknown | —
chr1: 171342100-171343300 | Unknown | —
chr20: 47203350-47204450 | Unknown | —
chr20: 62030950-62034000 | Unknown | —
chr1: 168323150-168325650 | Unknown | —
chr6: 10133900-10134950 | Unknown | —
chr4: 71924850-71926200 | Unknown | —
chrX: 130711450-130713600 | Unknown | —
chr12: 38549550-38551600 | Unknown | —
chr2: 131094200-131095000 | Unknown | —
chr1: 183626800-183628050 | Unknown | —
chr6: 28918100-28918850 | Unknown | —
chr2: 198504700-198507250 | Unknown | —
chr11: 71350450-71351500 | Unknown | —
chr20: 47001000-47003900 | Unknown | —
chr21: 10600500-10603150 | Unknown | —
chr3: 34131250-34132150 | Unknown | —
chr5: 7170200-7171750 | Unknown | —
chr17: 50486700-50487400 | Unknown | —
chr2: 122809550-122810150 | Unknown | —
chr8: 57178000-57179050 | Unknown | —
chr4: 142803450-142805000 | Unknown | —
chr10: 118367950-118370350 | Unknown | —
chrX: 115004100-115005700 | Unknown | —
chr3: 53961050-53963000 | Unknown | —
chr6: 28920750-28922800 | Unknown | —
chr17: 11769750-11770850 | Unknown | —
chr6: 1594950-1595600 | Unknown | —
chr15: 79783300-79784500 | Unknown | —
chr7: 83684250-83685650 | Unknown | —
chr18: 2246500-2247900 | Unknown | —
chr10: 36147250-36148500 | Unknown | —
chr7: 91023500-91025650 | Unknown | —
chr2: 79337900-79339650 | Unknown | —
chrX: 115002950-115003900 | Unknown | —
chr1: 34557900-34558600 | Unknown | —
chr19: 523250-524300 | Unknown | —
chr13: 91315500-91317200 | Unknown | —
chr6: 26330700-26333000 | Unknown | —
chr9: 115565950-115567400 | Unknown | —
chr14: 42380150-42381450 | Unknown | —
chr7: 76356350-76358750 | Unknown | —
chr13: 108578200-108579350 | Unknown | —
chr8: 90569800-90570900 | Unknown | —
chr3: 185842600-185844550 | Unknown | —
chr1: 207903150-207904800 | Unknown | —
chr2: 14988000-14988950 | Unknown | —
chr12: 47819700-47821500 | Unknown | —
chr1: 83728350-83730000 | Unknown | —
chr11: 105384700-105387850 | Unknown | —
chr3: 88557900-88558600 | Unknown | —
chr6: 142290050-142291600 | Unknown | —
chr3: 83265600-83268250 | Unknown | —
To experimentally test if inhibiting EZH2/PRC2 activity might modulate somatic promoter usage in GC, we treated IM95 GC cells with GSK126, a highly selective small-molecule inhibitor of EZH2 methyltransferase activity. This line was selected as it has previously been shown to be sensitive to EZH2 depletion (FIG. 14). RNA-seq analysis of GSK126-treated IM95 cells at two treatment time points (Day 6 and 9) confirmed that genes upregulated upon EZH2 inhibition are enriched in previously identified PRC2 target gene sets (FIG. 18). GSK126 treatment caused deregulation of 2134 promoters in total. Of 1959 promoters exhibiting somatic alterations in primary GCs (FIG. 1D), GSK126 treatment caused deregulation of 251 somatic promoters in IM95 cells (12.8%). This proportion was significantly greater than the proportion of unaltered promoters exhibiting deregulation after GSK126 challenge (8.8%, OR 1.46, P<0.001, Fisher Test, FIG. 5B), suggesting heightened sensitivity of somatic promoters to EZH2 inhibition. The proportion of somatic promoters deregulated after EZH2 inhibition was also greater than the total proportion of genes (as defined by Gencode) regulated by GSK126 (1.5%, OR 9.21, P<0.001, FIG. 5B). Of those promoters that were deregulated by GSK126 and also mapped to somatic promoters lost in primary GC, 89.6% were reactivated following GSK126 administration (78/87, FC>=2, qval<0.1, Methods and Materials), consistent with EZH2 functioning to repress these promoters. For example, FIGS. 5C and 5D highlight two lost somatic promoters (SLC9A9 and PSCA) exhibiting expression gain after GSK126 treatment. These results thus suggest a general role for EZH2 in regulating epigenomic promoter alterations in GC.
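The enrichment comparison above can be illustrated with a minimal sketch of an odds ratio and a Fisher exact test on a 2x2 table. The somatic promoter counts (251 of 1959) are taken from the text; the unaltered promoter counts are hypothetical placeholders chosen only to give 8.8%, so the printed odds ratio is not expected to reproduce the reported value of 1.46. The one-sided form of the test is likewise an assumption, as the text does not state which alternative was used.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P-value, P(X >= a), with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Somatic promoters: 251 of 1959 deregulated (from the text).
somatic_dereg, somatic_total = 251, 1959
# Unaltered promoters: HYPOTHETICAL counts chosen to give 8.8% deregulated.
unaltered_dereg, unaltered_total = 880, 10000

table = (somatic_dereg, somatic_total - somatic_dereg,
         unaltered_dereg, unaltered_total - unaltered_dereg)
print(f"OR = {odds_ratio(*table):.2f}, one-sided P = {fisher_exact_greater(*table):.2e}")
```

Exact integer arithmetic via `math.comb` is used here to avoid a third-party dependency; a statistics library would normally be preferred for real analyses.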
Somatic Promoters Reveal Novel Cancer-Associated Transcripts
Finally, when analyzing the altered somatic promoters with respect to their proximity to known genes, we found that somatic promoters could be classified into annotated and unannotated categories. Annotated promoters were defined as promoters mapping close (<500 bp) to a known Gencode transcription start site (TSS), while unannotated promoters refer to those mapping to genomic regions devoid of known Gencode TSSs. The majority of promoters present in non-malignant tissues, and also promoters unchanged between tumors and normal tissues, mapped closely to previously annotated TSSs (72%-92%). In contrast, only 41% of somatic promoters mapped to annotated promoter locations, while the remaining 59% mapped to “unannotated” locations, distant from Gencode TSSs and in many cases 2-10 kb away (FIG. 6a).
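The annotated/unannotated classification described above can be sketched as a nearest-TSS lookup. This is an illustrative sketch only: the coordinates are hypothetical, the function names are invented for the example, and the distance definition (zero if a TSS lies inside the promoter region, otherwise the gap to the nearest region edge) is an assumption, since the text states only the 500 bp threshold.

```python
from bisect import bisect_left


def distance_to_nearest_tss(start, end, sorted_tss):
    """Distance in bp from the region [start, end] to the nearest TSS (0 if one lies inside)."""
    i = bisect_left(sorted_tss, start)
    best = float("inf")
    if i < len(sorted_tss):
        tss = sorted_tss[i]  # first TSS at or after the region start
        best = 0 if tss <= end else tss - end
    if i > 0:
        best = min(best, start - sorted_tss[i - 1])  # last TSS before the region start
    return best


def classify(promoters, tss_by_chrom, max_dist=500):
    """Label each (chrom, start, end) promoter as 'annotated' or 'unannotated'."""
    labels = {}
    for chrom, start, end in promoters:
        dist = distance_to_nearest_tss(start, end, tss_by_chrom.get(chrom, []))
        labels[(chrom, start, end)] = "annotated" if dist < max_dist else "unannotated"
    return labels


# HYPOTHETICAL TSS positions and promoter regions, for illustration only.
tss_by_chrom = {"chr1": [1000, 50000]}
promoters = [("chr1", 1200, 2000), ("chr1", 10000, 11000), ("chr2", 100, 900)]
for region, label in classify(promoters, tss_by_chrom).items():
    print(region, label)
```

Keeping the TSS positions sorted per chromosome allows each lookup to run in logarithmic time, which matters when classifying thousands of promoter regions against a full annotation.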
To test the functional relevance of these unannotated promoters, we used GenoCanyon, a nucleotide-level quantification of genomic functional potential that integrates multiple levels of conservation and epigenomic information. We observed that 81% of the unannotated promoter regions exhibited a maximum genome-wide functional score of greater than 0.9 (range 0-1), indicating high functional potential. To ascertain tissue type specificities, we then applied tissue-specific annotations using GenoSkyline, an extension of the GenoCanyon framework integrating Roadmap Epigenomics data. We observed that GI tissues had the 3rd highest median score after ESC and fetal tissues, consistent with our tumors being gastric in lineage and also de-differentiated (FIG. 5b). In a separate analysis, recent studies have also suggested that endogenous repeat elements in the human genome may contribute significantly to regulatory element variation, and hypomethylation of repeat elements can induce cancer-associated transcription. We found that unannotated promoters were also significantly enriched for the repeat elements ERV1 (P<0.0001 Unannotated vs. All) and L1 (P<0.0001 Unannotated vs. All, FIG. 13).
Compared to annotated promoters, unannotated promoters exhibited weaker H3K27ac signals, suggesting that unannotated promoters might have lower activity and decreased gene expression levels (FIG. 13). Supporting this, somatic promoters, even those supported by CAGE tags (indicating true promoters), exhibited significantly lower RNA-seq expression levels compared to all CAGE tag-supported promoters (FIG. 5c). We thus hypothesized that unannotated promoters might be associated with low transcript levels, thereby rendering them more challenging to detect by conventional-depth transcriptome sequencing given the very wide dynamic range of cellular transcriptomes (10-10,000 transcripts per cell for different genes) (FIG. 5d). To test this possibility, we employed both down-sampling and up-sampling analysis. Not surprisingly, decreasing levels of RNA-seq depth caused a concomitant decrease in detected somatic promoter transcripts. For example, downsampling to ~40M reads caused ~250 transcripts (FPKM>0, FIG. 5e) to be rendered undetectable at somatic promoters. More convincingly, in the reciprocal experiment, we experimentally generated deep RNA-seq data for 5 matched GC/normal pairs (average read depth 140M compared to standard 100M), and confirmed the additional detection of 435 new somatic promoter-associated transcripts (FPKM>0) (FIG. 5e). We estimate that usage of deep RNA-sequencing data allowed us to discover additional transcripts for 22% of the unannotated promoters, not previously detectable at regular-depth RNA-seq (FIG. 5f). These results demonstrate that despite being associated with bona fide cancer-associated transcripts, many somatic promoters defined by epigenomic profiling may have been missed by conventional-depth RNA-seq.
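The effect of sequencing depth on transcript detection can be illustrated with a minimal down-sampling sketch. It is not the procedure used in the study: the per-transcript read counts are hypothetical, and binomial thinning (keeping each read independently with a fixed probability) is assumed as the down-sampling model. A transcript is counted as detected when at least one read survives, which corresponds to FPKM>0.

```python
import random


def downsample(read_counts, fraction, rng):
    """Binomial thinning: keep each read independently with probability `fraction`."""
    return {
        name: sum(1 for _ in range(n) if rng.random() < fraction)
        for name, n in read_counts.items()
    }


def n_detected(read_counts):
    """Number of transcripts with at least one read (equivalent to FPKM > 0)."""
    return sum(1 for n in read_counts.values() if n > 0)


# HYPOTHETICAL read counts per transcript at full depth; not data from the study.
full_depth = {"tx_high": 500, "tx_mid": 40, "tx_low": 3, "tx_rare": 1}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for fraction in (1.0, 0.4, 0.1):
    sub = downsample(full_depth, fraction, rng)
    print(f"fraction {fraction:.1f}: {n_detected(sub)} of {len(full_depth)} transcripts detected")
```

A fraction of 0.4 loosely mirrors reducing a 100M-read library to about 40M reads; low-abundance transcripts such as the hypothetical `tx_rare` are the first to drop out, which is the behavior the down-sampling analysis is intended to expose.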
Discussion
Identifying somatically-altered cis-regulatory elements, and understanding how these elements direct cancer-associated gene expression, represents a critical scientific goal. Here, we defined close to 2000 promoters exhibiting altered activity in GC, indicating that somatic promoters in GC are pervasive. Promoters are canonically defined as proximal cis-regulatory elements that recruit general transcription factors to initiate transcription. However, selection and activation of TSSs by RNA polymerase at core promoters is dependent on multiple factors. Core promoters are differentially distributed between genes of different functions, and chromatin distributions and epigenetic landscapes of core promoter regions can also differ in a tissue-specific manner. Presence of multiple transcription initiation sites within the same gene can generate distinct transcript isoforms with different 5′UTRs that can act as switches to regulate gene expression, and usage of alternative 5′UTRs can also impact both translation and protein stability of cancer-associated genes such as BRCA1, TGF-β and ERG. Such findings demonstrate that specific promoter element activity is complex and cell context dependent, with impact on downstream transcriptional, translational, and functional processes.
A significant proportion (~18%) of somatic promoters corresponded to alternative promoters. In cancer, alternative promoter utilization is of major relevance, as increasing numbers of genes (e.g. LEF1, TP53, TGFB3) are now being shown to exhibit distinct alternative-promoter associated isoforms that differentially affect malignant growth. In the current study, we identified alternative promoters in genes both known and novel to GC biology with significant clinical and translational implications. For example, we discovered an alternative promoter at the EpCAM gene locus specifically activated in gastric tumors. In GC, EpCAM encodes a transmembrane glycoprotein which has been proposed as a marker for circulating tumor cells, and EpCAM expression levels have been correlated with GC patient prognosis. However, little is known about the specific cellular mechanisms driving high EpCAM expression in GC. Our finding that EpCAM is regulated in GC not through its canonical promoter, but instead through a cancer-specific alternative promoter, may lend credence to recent reports suggesting that in addition to acting as an experimentally convenient surface marker, EpCAM may actually play a more direct pro-oncogenic role in stimulating cellular proliferation.
Another novel example of an alternative promoter-associated gene, identified for the first time in our study, was RASA3. While a functional role for RASA3 in cancer remains to be definitively established, studies from other biological fields have shown that RASA3 can inhibit RAP1, which in turn has been implicated in invasion and metastasis in various cancers. RASA3 depletion can enhance signaling by integrins and mitogen-activated protein kinases, and the possibility that RASA3 can act as a tumor suppressor has also been recently suggested through independent cross-species cancer studies. A plausible role for RASA3 as a potential tumor suppressor is consistent with our own results where expression of wild-type RASA3 potently inhibited cell migration and invasion in GC cell lines, while N-terminal variant RASA3 enhanced migration and invasion in normal gastric epithelial cells. A third example of an alternative-promoter driven gene was MET, which has been extensively investigated as a target for cancer therapy. While we and others have previously reported expression of an N-terminal truncated MET variant in cancer, functional implications of this truncated MET variant have remained unclear. In the present study, experimental assessment of MET wild-type and variant signaling revealed that truncated MET variants may have different downstream signaling effects compared to full-length MET isoforms. Under the experimental conditions used, we observed significant differences in phosphorylation patterns of ERK, STAT3 and GAB1, in a manner consistent with MET-Var being more pro-oncogenic compared to wild-type MET, as ERK, STAT3, and GAB1 have all been shown to facilitate MET-induced signaling.
The MET signaling pathway is known to be particularly complex with multiple feedback loops, and understanding how expression of the N-terminal short MET isoform might modulate downstream survival signaling will be an important subject of future research, particularly in light of recent unsuccessful clinical trials targeting MET in lung cancer using antibodies.
Our study also revealed an unexpected relationship between somatic promoters and tumor immunity. Specifically, we discovered that alternative promoter isoforms overexpressed in GC were significantly depleted of N-terminal peptides predicted to be potentially immunogenic, based on computational predictions of high-affinity MHC Class I binding and other immunological assays. We believe that this finding is relevant to cancer immunity, as it builds on previous findings from the literature establishing the existence of self-reactive T-cells, the potential immunogenicity of overexpressed tumor antigens, and the process of tumor immunoediting. First, while the majority of self-reactive T-cells are clonally deleted during early development, numerous groups have also demonstrated the frequent persistence of self-reactive T-cells in the periphery. For example, analysis of transgenic mice has shown that 25-40% of autoreactive T-cells are likely to escape clonal deletion even in the presence of the deleting ligand, and in humans, Yu et al. have demonstrated that clonal deletion prunes the T-cell repertoire but does not fully eliminate self-reactive T-cell clones. Importantly, while such self-reactive T-cells are typically low-avidity and are not capable of recognizing self-antigens under normal physiological conditions, they still retain the ability to become activated and to produce effector and memory cells under conditions of appropriate stimulation, such as infection and the mounting of anti-tumor responses.
Second, in cancer, several studies have shown that self-reactive T-cells can exhibit immunologic activity towards overexpressed tumor antigens, even if these antigens are also expressed at lower levels in normal tissues. One well-known example is the melanocyte differentiation antigen Melan-A/MART-1, which is expressed by normal melanocytes and overexpressed in malignant melanoma cells. T-cell recognition of Melan-A/MART-1 has been detected in 50% of melanoma patients, and even healthy individuals have been shown to exhibit a disproportionately high frequency of Melan-A/MART-1-specific T-cells in the peripheral blood. Besides Melan-A/MART-1, other examples of tumor-associated self-antigens inducing immunological recognition in both healthy individuals and cancer patients include tyrosinase-related proteins (TRP-1 and TRP-2) and glycoprotein (gp) 100 in melanoma, and HA in mastocytoma cells. Such examples clearly demonstrate that in certain cases, normally expressed proteins can still become immunogenic when overexpressed in cancer. Third, tumor immunoediting, the acquired capacity of developing tumors to escape immune control, is a recognized hallmark of cancer. Tumor immune escape can occur via different mechanisms, such as through upregulation of immune checkpoint inhibitors (e.g. PD-L1), and altered transcription of antigen presenting genes or tumor-specific antigens. For example, decreased expression of melanoma antigens (e.g. gp100, MART-1, and HA) has been associated with melanoma progression to later disease stages. Besides overt downregulation of the entire gene, it is thus highly plausible that transcriptional changes affecting splice forms and promoter variants may also contribute to tumor immunoediting.
For example, very recent work in B-cell acute lymphoblastic leukemia (B-ALL) has described the production of N-terminally truncated CD19 transcript variants in response to CD19 CAR-T (chimeric antigen receptor-armed T-cell) therapy, clearly showing that promoter transcript variants can indeed arise as a consequence of immunologic pressure. Taken collectively, we believe that these previously established findings all point to a plausible role for alternative promoters in reducing the immunogenic potential of tumors. In this regard, our observation that regions exhibiting somatic promoter alterations showed a significant overlap with binding targets of the Polycomb repressive complex 2 (PRC2) epigenetic regulator complex, and are particularly sensitive to EZH2 inhibition, suggests that pharmacologic approaches for reawakening somatic promoter-associated epitopes might represent an attractive strategy for increasing anti-tumor T-cell immunoreactivity and anti-tumor activity.
In conclusion, our study indicates an important role for somatic promoters in GC. We also note that a significant portion (52%) of the somatic promoters localized to unannotated TSSs, consistent with recent studies indicating the existence of hundreds of transcript loci remaining to be annotated. Interestingly, a large portion of the human transcriptome has been shown to originate from repetitive elements that can exhibit promoter activity and/or express noncoding RNAs. Unannotated promoters activated in our GC study were found to be enriched in ERV1 and L1 repeat elements, which have been shown to be associated with stage-specific transcription in early human embryonic cells, suggesting an as yet unknown functional role for these promoters. Analysis of these unannotated promoters is likely to provide fertile ground for new and hitherto unanticipated insights into mechanisms of GC development and progression.