Methods Of Detecting Cancer

BACKGROUND

1. Field of Invention

The present application encompasses proteins, and peptide fragments of those proteins produced by proteolytic digestion, that are useful for diagnosing or monitoring for the presence of cancer in an individual.

2. Description of the Related Art

Screening mammograms typically have a sensitivity of 75% and a specificity of around 98%, resulting in a false positive rate of roughly 5% per mammogram (Brown, Houn, Sickles, & Kessler, 1995; Kolb, Lichy, & Newhouse, 2002; Luftner & Possinger, 2002). Follow-up imaging to evaluate false positives costs the US over $4 billion, with an additional $1.6 billion for biopsies alone. In 2010, of the 1.6 million biopsies performed, as few as 16% (only 261,000) were found to have cancer (Grady, 2012). The answer to improving the diagnostic performance of imaging may be found in pre- and post-imaging diagnostics that focus on genetic and proteomic information, more specifically biomarkers (Armstrong, Handorf, Chen, & Bristol Demeter, 2013; Li, Zhang, Rosenzweig, Wang, & Chan, 2002).

Tissue and serum are commonly the most logical places to begin biomarker research; however, the large dynamic range of both media makes discovery quite difficult (Schiess, Wollscheid, & Aebersold, 2009). The answers may lie in less complex biological fluids, such as saliva and tears. The use of tears as a diagnostic medium is not a novel application, as the tear proteome has been extensively investigated previously (Böhm et al., 2012; 2011; Lebrecht, Boehm, Schmidt, Koelbl, & Grus, 2009a; Lebrecht et al., 2009b; Wu & Zhang, 2007). In this application, a quantitative assay for the detection of a panel of tear-based biomarkers in response to cancer by triple quadrupole LC mass spectrometry is proposed. From this quantitative information, the framework for a Clinical Laboratory Improvement Amendments (CLIA) protocol will be defined.

SUMMARY

Methods of determining whether a subject has cancer are provided herein. The methods include obtaining a sample from the subject and detecting the level of at least one of the markers provided in Table 2A or Table 2B in the sample. The subject is likely to have cancer if the levels of the markers of Table 2A are increased or if the levels of the markers of Table 2B are decreased as compared to the levels in a control sample lacking cancer. The sample is optionally a lacrimal secretion, such as an ocular wash, saliva, or another bodily fluid.

Kits for performing the methods described herein are also provided. The kits may comprise an eye wash solution and collection materials such as tubes. The tube for collection may comprise a protease inhibitor or other protein stabilizing agent.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a set of photographs of a NuPAGE showing the proteins collected from each of the pooled ocular wash samples. The lane numbers correspond to pool numbers with the even numbers being breast cancer pools and the odd numbers being control pools.

FIG. 2 is a graph comparing the protein expression in cancer and control samples showing increased expression of several proteins in breast cancer samples as compared to controls based on peak intensity as determined by LC-MS/MS.

FIG. 3 is a graph comparing the protein expression in cancer and control samples showing decreased expression of several proteins in breast cancer samples as compared to controls based on peak intensity as determined by LC-MS/MS.

DETAILED DESCRIPTION

Provided herein are proteins and trypsin-produced polypeptides, listed in Tables 2A and 2B in the Examples, which are shown in the Examples to increase or decrease in biological samples in response to the presence of breast cancer as compared to controls. The trypsin product sequences and the full-length amino acid sequences of the proteins identified as being up-regulated or down-regulated in cancer samples are provided in Appendix I and Appendix II, respectively. These proteins and peptides are biomarkers and will be used to determine the disease state of a patient or other subject.

Subjects include humans, domesticated animals such as cats, dogs, cows, and pigs, or other animals susceptible to cancer. A “patient” indicates a subject who is diagnosed with a disease or with cancer or who is being tested for cancer. Thus subject and patient may be used interchangeably herein. The subjects may be suspected of having cancer, in particular breast cancer. The subjects may have an increased risk of developing breast cancer. For example, the subject may be at increased risk of cancer or suspected of having cancer because of a positive mammography result, detection of a lump in the breast, or a positive test for a gene known to increase the risk of cancer such as BRCA, or the subject may already have had a resection, biopsy or other procedure to remove the cancer. The subject may be undergoing or have previously undergone treatment for cancer, and the methods and kits herein are used to monitor progression of treatment or alternatively to monitor for recurrence or spread of the cancer. The cancer may be detected as early as stage I or II cancer, but later stages will also be detected.

Also provided herein are methods and kits to collect ocular wash samples for use to determine the expression levels of the identified proteins or polypeptides in lacrimal secretions. In addition, the use of tubes for collection containing protease inhibitor or protein stabilizing agents is covered. The kits further contain buffers or reagents for the elution of breast cancer biomarkers from the eye. The design of devices to collect the applied saline solution from the corner of the exposed ocular surface as well as the packaging of this device together with saline and a pre-prepared sample collection tube are also disclosed.

The methods disclosed herein encompass the use of these breast cancer biomarkers, singly or in multiples, in a CLIA based protocol utilizing a triple quadrupole LC-MS platform, which will be carried out at a centralized laboratory testing facility. The ocular wash samples collected from individuals may be shipped to the testing facility in this embodiment. The identified proteins and their subsequent proteolytic fragments are used for quantitative analysis of diagnostic peptides produced in the triple quad. A threshold value or a relative or actual value in terms of polypeptide concentration directly relating to the polypeptides listed in Tables 2A and 2B can be defined or samples can be compared directly to non-cancerous controls. The quantitative information in report form could be provided to physicians to help in making decisions regarding the pathway of patient care. Physicians may base treatment decisions on these results and the final step may include administration of an appropriate anti-cancer therapeutic to the subject.

In an alternative embodiment, the polypeptides of Tables 2A and 2B may be detected by implementing binding agents (e.g., antibodies, peptoids, coated surfaces) and reagents that accommodate a binding interaction specific to these proteins to produce a reaction which can be quantitated based on production of a detectable signal such as fluorescence, color change, or UV absorbance. Implementing these components in a cartridge with a partnering reading instrument that could be used at the point of care is also provided. Binding agents for these proteins and polypeptides may also be used for detection in a lateral flow device. Thus methods of detecting the level of protein expression in the samples using a binding partner such as an antibody may be used to detect the markers provided herein in an immunoassay.

The immunoassay typically includes contacting a test sample with an antibody that specifically binds to or otherwise recognizes a biomarker, and detecting the presence of a complex of the antibody bound to the biomarker in the sample. The immunoassay procedure may be selected from a wide variety of immunoassay procedures known to the art involving recognition of antibody/antigen complexes, including enzyme-linked immunosorbent assays (ELISA), radioimmunoassay (RIA), and Western blots, and use of multiplex assays, including use of antibody arrays, wherein several desired antibodies are placed on a support, such as a glass bead or plate, and reacted or otherwise contacted with the test sample. Such assays are well-known to the skilled artisan.

The detection of the biomarkers described herein in a sample may be performed in a variety of ways. In one embodiment, the method provides the reverse-transcription of complementary DNAs from mRNAs obtained from the sample. Fluorescent dye-labeled complementary RNAs may be transcribed from complementary DNAs which are then hybridized to the arrays of oligonucleotide probes. The fluorescent color generated by hybridization is read by machine, such as an Agilent Scanner and data are obtained and processed using software, such as Agilent Feature Extraction Software (9.1). Such array based methods include microarray analysis to develop a gene expression profile. As used herein, the term “gene expression profile” refers to the expression levels of mRNAs or proteins of a panel of genes in the subject. As used herein, the term “panel of diagnostic genes” refers to a panel of genes whose expression level can be relied on to diagnose or predict the status of the disease. Included in this panel of genes are those listed in Tables 2A and 2B, as well as any combination thereof, as provided herein. In other embodiments, complementary DNAs are reverse-transcribed from mRNAs obtained from the sample, amplified and simultaneously quantified by real-time PCR, thereby enabling both detection and quantification (as absolute number of copies or relative amount when normalized to DNA input or additional normalizing genes) of a specific gene product in the complementary DNA sample as well as the original mRNA sample.

The methods of this invention include detecting at least one biomarker. However, any number of biomarkers may be detected. It is preferred that at least two biomarkers are detected in the analysis. However, it is realized that three, four, or more, including all, of the biomarkers described herein may be utilized in the analysis. Thus, not only can one or more markers be detected, one to 40, preferably two to 40, two to 30, two to 20 biomarkers, two to 10 biomarkers, or some other combination, may be detected and analyzed as described herein. In addition, other biomarkers not herein described may be combined with any of the presently disclosed biomarkers to aid in the diagnosis of cancer. Moreover, any combination of the above biomarkers may be detected in accordance with the present invention.

The markers of Table 2A may be increased at least 2 fold, 4 fold, 5 fold, 8 fold, 10 fold or more relative to the level of the marker in the control sample. The markers of Table 2B are decreased at least 1.5 fold, 2 fold, 3 fold, 4 fold or more relative to the level of the marker in the control sample. The control sample may be a sample from a subject that does not have cancer, a pooled sample from subjects that do not have cancer or may be a control or baseline expression level known to be the average expression level of subjects without cancer.
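The fold-change comparison described above can be sketched as follows. This is a minimal illustration, not the assay's actual decision rule: the function names, the treatment of a marker absent from the control as an infinite fold increase, and the treatment of a marker absent from both samples as unchanged are all assumptions made for the example.

```python
def fold_change(sample_level: float, control_level: float) -> float:
    """Return the sample/control ratio for one marker."""
    if sample_level < 0 or control_level < 0:
        raise ValueError("levels must be non-negative")
    if control_level == 0:
        # Assumption: no expression in the control counts as an unbounded
        # increase; absent in both samples counts as no change.
        return float("inf") if sample_level > 0 else 1.0
    return sample_level / control_level


def meets_threshold(sample_level: float, control_level: float,
                    table: str, min_fold: float) -> bool:
    """Check a marker against a fold threshold.

    table is "2A" (marker expected to increase in cancer) or
    "2B" (marker expected to decrease in cancer).
    """
    ratio = fold_change(sample_level, control_level)
    if table == "2A":
        return ratio >= min_fold
    if table == "2B":
        # An N-fold decrease means the sample is at most 1/N of the control.
        return ratio <= 1.0 / min_fold
    raise ValueError("table must be '2A' or '2B'")
```

For example, `meets_threshold(10.0, 2.0, "2A", 2.0)` evaluates a five-fold increase against a two-fold threshold, and `meets_threshold(1.0, 2.0, "2B", 1.5)` evaluates a two-fold decrease against a 1.5-fold threshold.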

Several terms are used throughout this disclosure and should be defined as commonly used in the art, or as specifically provided herein. As provided herein, mass spectrometry or MS refers to an analytical technique that uses electric or magnetic fields to determine the mass-to-charge ratio of peptides and chemical compounds in order to identify or determine peptide sequences and chemical structures. LC-MS/MS refers to an analytical technique combining the separation capabilities of high performance liquid chromatography (HPLC) with the mass analysis of tandem mass spectrometry. Triple quadrupole mass spectrometry refers to tandem mass spectrometry with three quadrupole stages in series (Q1, Q2, and Q3). This technique allows for targeted detection of molecules of interest. Ion pairs refers to a parent peptide detected in Q1 in its doubly or triply charged form and a resulting y or b ion generated in Q2 and detected in Q3 of a triple quadrupole mass spectrometry instrument. SIS internal peptide refers to a synthesized isotopically-labeled peptide with the same sequence as the peptide to be monitored in Q1 and used as an internal standard for reference to quantify the peptide of interest. The y ion refers to an ion generated from the C-terminus of a peptide fragment. The b ion refers to an ion generated from the N-terminus of a peptide fragment. Quantitative Ion refers to the selected highest intensity y or b ion used to determine the quantity of its parent protein in a biological sample. Qualitative Ion refers to the ion or ions chosen to confirm the assignment of the Quantitative Ion to the selected protein of interest and of the labeled peptide to the selected standards.
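As an illustration of how an ion pair and an SIS internal peptide are typically used, the sketch below computes the Q1 mass-to-charge ratio of a multiply protonated parent peptide and estimates the amount of the native peptide from the ratio of native to labeled peak areas. It assumes that the native and labeled peptides give equal instrument response and that a known amount of SIS peptide was spiked into the sample; the function names and units are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def precursor_mz(monoisotopic_mass: float, charge: int) -> float:
    """m/z of a peptide carrying `charge` protons (e.g. 2 or 3)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass + charge * PROTON_MASS) / charge


def quantify_with_sis(native_area: float, sis_area: float,
                      sis_amount_fmol: float) -> float:
    """Estimate the native peptide amount from the quantitative-ion peak areas.

    Assumes equal response for the native and isotopically labeled
    peptides, so amount scales with the area ratio.
    """
    if sis_area <= 0:
        raise ValueError("SIS peak area must be positive")
    return native_area / sis_area * sis_amount_fmol
```

A native peak area twice that of the standard, with 50 fmol of standard spiked, would give an estimate of 100 fmol under these assumptions.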

CLIA refers to the Clinical Laboratory Improvement Amendments, which are federal regulatory standards that apply to all clinical laboratory testing performed on humans in the United States, except clinical trials and basic research (CLIA related Federal Register and Code of Federal Regulation Announcements). CLIA approved laboratory refers to a clinical lab which performs laboratory testing on human specimens for diagnosis, prevention, or treatment of disease or impairment and is approved and monitored by an FDA approved regulatory organization. CLIA waived test refers to a clinical laboratory test meeting specific criteria for risk, error and complexity as defined by the Food and Drug Administration (FDA).

Point-of-care device refers to an instrument or cartridge available at the location of patient and physician care containing binding agents to a biomarker, or series of biomarkers of interest, and can generate information on the presence, absence, and in some cases concentrations of detected biomarkers. Analyte refers to any measurable biomarker which can be protein, peptide, macromolecule, metabolite, small molecule, or autoantibody. Biological fluid as used herein refers to tears, whole blood, serum, urine, and saliva. Biomarker refers to any substance (e.g. protein, peptide, metabolite, polynucleotide sequence) whose concentration level changes in the body (e.g. increased or decreased) as a result of a disease or condition. Marker and biomarker may be used interchangeably herein.

Lateral flow test refers to a device used to measure the presence of an analyte in a biological fluid using porous paper or sintered polymer. ELISA refers to enzyme-linked immunosorbent assay, which utilizes antibodies to detect the presence and concentration of an analyte of interest. Diagnostic Panel refers to a group of molecules (e.g. proteins or peptides) whose combined concentrations are used to diagnose a disease state (e.g. cancer). A breast cancer marker refers to a molecule (e.g. protein, peptide, metabolite, polynucleotide sequence) whose concentration level in the body changes (e.g. is increased or decreased) as a result of breast cancer.

In addition to being useful to diagnose cancer and in particular breast cancer in a subject, the kits and methods provided herein may be used to monitor treatment or recurrence of cancer in an individual previously diagnosed with cancer. Thus if the levels of the markers in Table 2A begin to rise or the levels of the markers in Table 2B begin to decrease over time in the same subject after treatment, further chemotherapeutics targeting the cancer may be administered. The methods and kits may also be used to monitor the effectiveness of a chemotherapeutic treatment. In this alternative, the levels of the biomarkers in Table 2A would decrease over time if the treatment regime is effective and either would not change or may increase over time if the treatment regime is not effective in a single subject. The levels of the biomarkers in Table 2B would increase over time during treatment with a therapeutic that is effective and would either not change or decrease over time if the treatment regime is not effective in a single subject.
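The monitoring logic above can be expressed as a small sketch. It compares the first and last measurements of one marker in a single subject and reports whether the direction of change matches what the text describes for an effective treatment. The 10% tolerance used to call a level unchanged, the first-versus-last comparison, and the function names are assumptions for illustration only.

```python
def trend(levels, tolerance=0.10):
    """Classify a time-ordered series of marker levels.

    Returns "rising", "falling", or "stable" based on the relative
    change between the first and last measurements.
    """
    if len(levels) < 2:
        raise ValueError("at least two measurements are needed")
    first, last = levels[0], levels[-1]
    if first <= 0:
        raise ValueError("the baseline level must be positive")
    change = (last - first) / first
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "stable"


def treatment_signal(table, levels, tolerance=0.10):
    """Interpret the trend of a Table 2A or Table 2B marker during treatment."""
    if table not in ("2A", "2B"):
        raise ValueError("table must be '2A' or '2B'")
    # Table 2A markers fall, and Table 2B markers rise, when treatment works.
    effective_direction = "falling" if table == "2A" else "rising"
    if trend(levels, tolerance) == effective_direction:
        return "consistent with effective treatment"
    return "not consistent with effective treatment"
```

A practical implementation would likely use more than two time points and an assay-specific measure of variability rather than a fixed tolerance.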

Treating cancer includes, but is not limited to, reducing the number of cancer cells or the size of a tumor or mass in the subject, reducing progression of a cancer to a more aggressive form, reducing proliferation of cancer cells or reducing the speed of tumor growth, killing of cancer cells, reducing metastasis of cancer cells or reducing the likelihood of recurrence of a cancer in a subject. Treating a subject as used herein refers to any type of treatment that imparts a benefit to a subject afflicted with a disease or at risk of developing the disease, including improvement in the condition of the subject (e.g., in one or more symptoms), delay in the progression of the disease, delay the onset of symptoms or slow the progression of symptoms, etc.

The present disclosure is not limited to the specific details of construction, arrangement of components, or method steps set forth herein. The compositions and methods disclosed herein are capable of being made, practiced, used, carried out and/or formed in various ways that will be apparent to one of skill in the art in light of the disclosure that follows. The phraseology and terminology used herein is for the purpose of description only and should not be regarded as limiting to the scope of the claims. Ordinal indicators, such as first, second, and third, as used in the description and the claims to refer to various structures or method steps, are not meant to be construed to indicate any specific structures or steps, or any particular order or configuration to such structures or steps. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to facilitate the disclosure and does not imply any limitation on the scope of the disclosure unless otherwise claimed. No language in the specification, and no structures shown in the drawings, should be construed as indicating that any non-claimed element is essential to the practice of the disclosed subject matter. The use herein of the terms “including,” “comprising,” or “having,” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof, as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements.

Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this disclosure. Use of the word “about” to describe a particular recited amount or range of amounts is meant to indicate that values very near to the recited amount are included in that amount, such as values that could or naturally would be accounted for due to manufacturing tolerances, instrument and human error in forming measurements, and the like. All percentages referring to amounts are by weight unless indicated otherwise.

No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between any definitions and/or description found in the cited references.

The following examples are meant only to be illustrative and are not meant as limitations on the scope of the invention or of the appended claims.

EXAMPLES
Example 1
Methods for Collecting Ocular Wash Samples

This study was carried out under institutional review board approval, and participants were recruited at two clinics based in Arkansas, The Breast Center and Highlands Oncology Group, as well as two clinics based in Washington, PeaceHealth Southwest and PeaceHealth Longview Surgery Center. The inclusion/exclusion criteria used by the clinics for patient selection are given in Table 1.

TABLE 1

Inclusion/Exclusion Criteria for participant selection

Inclusion criteria. Individuals who are:

    Between 18 and 100 years of age
    Presenting for a routine check up
    Presenting for the evaluation of an abnormal exam or test (mammogram, ultrasound, MRI, PET, etc.)
    Presenting for the evaluation of a palpable lump or mass
    Presenting with a mass, pre or post biopsy, as long as a portion of the mass is remaining
    Currently have or are in treatment for breast cancer

Exclusion criteria. Individuals who are:

    <18 years of age or >100 years of age
    Experiencing a concurrent eye infection or trauma
    Currently experiencing acute conjunctivitis
    Known to have abnormal production of tears (too much or too little)

Ocular wash samples were obtained by rinsing the exposed surface of the eye with Optics Laboratory single use Eye-Cept Rewetting drops. The single use dropper, selected to eliminate contamination, was used to apply approximately five drops of rewetting saline to the outside corner of the eye. After application, the solution naturally flowed across the surface of the eye and pooled in the inner corner/duct next to the nose. The solution was then collected by suction using a one mL tuberculin syringe, with no needle attached, and transferred to a pre-labeled 0.5 mL tube with an o-ring screw top cap. The optimal total volume from each collection is approximately 100 μL; however, actual volumes can vary. Samples were stored between −20° C. and −80° C. (depending on the freezer unit available) within two hours of collection.

Samples collected at participating clinics were retrieved by Ascendant personnel on a weekly basis and transferred on dry ice to Ascendant's laboratory facility. In the case of the Washington based clinics, samples were shipped to Ascendant on dry ice on a monthly basis.

Data collected from the participants included: sex, race, age, previous cancer history, family history of breast cancer, stage of current cancer (I, II, III, IV), tumor size, breast cancer subtype (Ductal Carcinoma In Situ, Invasive Ductal Carcinoma, Invasive Lobular Carcinoma, Lobular Carcinoma In Situ, and Unknown) and tumor grade. A spreadsheet was created to track data and stratify samples based on selected criteria.

Control samples were collected, using the procedure detailed above, from volunteers between the ages of 18 and 100 who reported they were cancer and mass free as per the inclusion criteria outlined in the IRB approved collection protocol. Exclusion criteria were the same as for the breast cancer patients. All control participants were recruited from the general population; consent and sample collection were performed by Ascendant Diagnostics personnel. Data collected from control participants included: sex, age, race, previous history of breast cancer, family history of breast cancer, and current medications.

All samples in the tear bank were stored at −80° C. and freeze thaw cycles were limited to three times, as protein degradation was observed after three freeze thaw cycles. In some cases samples were aliquoted to minimize freeze thaw cycles further.

Example 2
Methods for Preparation of Sample Pools for LC MS/MS

Eight pooled samples (four breast cancer pools and four control pools), each with a total volume of 300 μL, were assembled from banked tear samples for the purpose of label free quantitation using in-gel digestion. All breast cancer ocular wash samples used were taken from individuals with stage I and II breast cancer and were collected prior to treatment. Controls were age matched for accuracy of comparison.

To ensure sample integrity, MALDI-TOF data were collected on aliquots from each of the individual samples that were included in the pooled samples. Prior to MALDI testing, tear samples were purified using a ZipTip C18. This procedure serves to remove any contaminants that may be present in the sample and to concentrate the proteins for easier detection. A 15 μL aliquot was removed from the freezer and thawed at room temperature (~22° C.) for 10 minutes. The protocol for the ZipTip C18 was adapted from the user manual supplied by Millipore, and a variable pipette with a total volume capacity of 10 μL was used for all sample preparations. The ZipTip C18 was equilibrated in a wetting solution of acetonitrile (ACN) containing 0.1% TFA for 10 cycles (1 cycle involves aspirating 10 μL of solution into the tip and dispensing it). Following equilibration, the tip was washed with ddH₂O (0.1% TFA) for 10 cycles. The sample was then loaded for 10 cycles, followed by a wash with ddH₂O (0.1% TFA) for 10 cycles. The load procedure followed by the wash procedure was carried out a total of five times to ensure maximum protein binding. Bound proteins were eluted in 5 μL of ACN (0.1% TFA) for 20 cycles into a clean tube. The ACN (0.1% TFA) was removed using an Eppendorf Vacufuge plus for 10 minutes at 45° C. Samples were then reconstituted in 5 μL ddH₂O (0.1% TFA) and spotted onto a ground steel MALDI target. Each sample was spotted a total of three times at 1 μL each time, allowing complete drying of the spot before more material was added. After the final spotting was completely dry, 1 μL of a saturated sinapinic acid matrix solution (40 mg of sinapinic acid in 1 mL of a 50:50 solution of ACN/ddH₂O with 0.1% TFA) was spotted onto each sample, and all samples were allowed to dry completely on the bench top prior to data collection. One microliter of protein standard was added to several locations on the MALDI target as well. The protein standard was spotted only once, followed by addition of the same sinapinic acid matrix used for the ocular wash (OW) samples.

Data were collected on a Bruker Reflex III MALDI-TOF mass spectrometer in linear positive mode, as linear mode increases the sensitivity. Acquisition of all spectra was performed both manually and automatically (user unbiased acquisition) using Bruker Daltonics flexControl software. For each spot, MALDI-TOF mass spectra were acquired at least three times, with a total of 200 laser shots accumulated for each run. Shot accumulation was programmed using a fuzzy logic operator to consider only spectra with S/N better than 20 between m/z 2,000 and 45,000. Sample integrity was evaluated by visual inspection of the generated MALDI-TOF spectrum. High mass peak splitting together with an increased quantity of low mass peaks suggests that protein degradation has occurred; samples showing these features were not used further.

Total protein content of each pool was determined using a bicinchoninic acid (BCA) protein assay kit with a 1:20 (v/v) ratio of standard and unknown to working reagent and an incubation time of 30 min at 37° C. To ensure reliable total protein content calculation, a series of dilutions was made for each sample (i.e. 1:2, 1:4, 1:6) and all dilutions were plated in triplicate. A standard curve using diluted albumin (2 mg/ml, 1.5 mg/ml, 1 mg/ml, 0.75 mg/ml, 0.5 mg/ml, 0.25 mg/ml, 0.125 mg/ml, 0.025 mg/ml and 0 mg/ml) was generated, and blank subtraction was applied to all standards and unknowns. The protein concentration for each unknown was calculated using a four-parameter fit of the standard curve. Concentrations were multiplied by the dilution factor and averaged to give an accurate total protein content calculation. Assays were only considered valid if the coefficient of variation (% CV) was 15% or below.
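The back-calculation and validity check in this paragraph can be sketched as below. The sketch starts from concentrations that have already been read off the standard curve, so the four-parameter fit itself is not shown. It assumes that the % CV criterion is applied to the triplicate wells of each dilution, which the text does not state explicitly; the function names and data layout are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation of replicate readings, in percent."""
    average = mean(values)
    if average == 0:
        raise ValueError("cannot compute % CV when the mean is zero")
    return stdev(values) / average * 100.0


def total_protein(readings, max_cv=15.0):
    """Return the dilution-corrected protein concentration in mg/ml.

    `readings` maps a dilution factor (2 for a 1:2 dilution) to the
    replicate concentrations interpolated from the standard curve.
    """
    corrected = []
    for factor, replicates in readings.items():
        cv = percent_cv(replicates)
        if cv > max_cv:
            raise ValueError(f"dilution 1:{factor} has % CV {cv:.1f}, above {max_cv}")
        # Multiply the replicate mean by the dilution factor.
        corrected.append(mean(replicates) * factor)
    # Average across dilutions for the final estimate.
    return mean(corrected)
```

For instance, triplicates of about 0.50 mg/ml at 1:2 and about 0.25 mg/ml at 1:4 both correct to roughly 1.0 mg/ml, and their average is reported.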

Using the total protein content determined by BCA, 25 μg of protein from each pool was loaded onto a NuPAGE Bis-Tris 4-12% gradient separation gel and run using methods standard for an individual skilled in the art, as shown in FIG. 1. Following separation of the ocular proteome, between 20 and 22 slices were cut for each lane and subjected to disulfide reduction using dithiothreitol, followed by sulfhydryl alkylation using iodoacetamide, and finally trypsin digestion. Specific slice counts for each sample were as follows: Lane 1=20 slices, Lane 2=21 slices, Lane 3=22 slices, Lane 4=21 slices, Lane 5=20 slices, Lane 6=20 slices, Lane 7=21 slices, Lane 8=21 slices.
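The volume of each pool needed to load a fixed protein mass follows directly from the BCA concentration. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical, and the example concentration is illustrative rather than a measured value.

```python
def load_volume_ul(target_mass_ug: float, concentration_mg_per_ml: float) -> float:
    """Volume in microliters that contains the target protein mass."""
    if concentration_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    # 1 mg/ml equals 1 ug/uL, so mass divided by concentration gives uL.
    return target_mass_ug / concentration_mg_per_ml
```

At an illustrative concentration of 2.5 mg/ml, a 25 μg load corresponds to 10 μL of sample.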

Example 3
Methods and Results for Label Free Quantitation by LC MS/MS

Twenty μL from each trypsin digestion reaction was loaded onto a nanoAcquity UPLC (Waters) and eluted using a gradient from 3% to 99% of 0.1% formic acid, 75% acetonitrile over 30 minutes. An LTQ Orbitrap Velos (Thermo Scientific) was used for detection of the peptides produced by proteolytic cleavage. Raw data files from the LC-MS/MS analysis were submitted to MASCOT for protein identification against the UniProtKB database, using a 2 ppm peptide mass tolerance and a 0.5 Da fragment mass tolerance. The output from MASCOT was then uploaded into the software packages Scaffold and MaxQuant for analysis.
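To make the 2 ppm peptide mass tolerance concrete, the sketch below converts a parts-per-million tolerance into an absolute window and tests whether an observed value falls inside it. This illustrates the general ppm calculation only and is not a description of how the search engine applies it internally; the function names are hypothetical.

```python
def ppm_window(theoretical: float, ppm: float = 2.0):
    """Return the (low, high) bounds of a ppm tolerance window."""
    delta = theoretical * ppm / 1_000_000
    return theoretical - delta, theoretical + delta


def within_ppm(observed: float, theoretical: float, ppm: float = 2.0) -> bool:
    """True if the observed value lies within the ppm tolerance."""
    error_ppm = abs(observed - theoretical) / theoretical * 1_000_000
    return error_ppm <= ppm
```

For a peptide of mass 1000 Da, 2 ppm corresponds to a window of plus or minus 0.002 Da, which is far tighter than the 0.5 Da tolerance applied to fragment ions.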

Greater than 700 protein hits were identified using this method. In order to isolate potential biomarker candidates, peak intensities for each group (cancer and control) were averaged for each protein and fold change was determined with respect to cancer. In addition, a Student's t-test was applied to each protein, providing a p-value. All proteins with a fold change of greater than 1.5 and a p-value <0.05 were considered as possible biomarker candidates. P-values and fold changes were assessed on a case by case basis, and some proteins with higher p-values were included in the candidate biomarkers list. The list was then narrowed based on biological relevance to breast cancer, other cancer subtypes, and cancer processes. The complete list of candidate biomarkers is given in Tables 2A and 2B and shown in graphic form in FIGS. 2 and 3.
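The initial screen described above (fold change greater than 1.5 and p-value below 0.05) can be sketched with the standard library alone. Instead of computing a p-value, the sketch compares the pooled-variance Student's t statistic with the two-tailed 5% critical value for 6 degrees of freedom, which is equivalent to p < 0.05 only for four cancer pools against four control pools. The function names, the returned dictionary, and the handling of proteins absent from one group are assumptions; the case by case review described in the text is not modeled.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed 5% critical value of Student's t for 6 degrees of freedom
# (four cancer pools plus four control pools, minus two).
T_CRIT_DF6 = 2.447


def student_t(a, b):
    """Pooled-variance two-sample t statistic for groups a and b."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    diff = mean(a) - mean(b)
    if se == 0:
        # No spread in either group: any difference is unbounded in t.
        return 0.0 if diff == 0 else float("inf")
    return diff / se


def screen_protein(cancer, control, min_fold=1.5, t_crit=T_CRIT_DF6):
    """Screen one protein using peak intensities from each pool."""
    mc, mk = mean(cancer), mean(control)
    direction = "up" if mc >= mk else "down"
    hi, lo = max(mc, mk), min(mc, mk)
    if lo > 0:
        fold = hi / lo
    else:
        # No expression in one group: report an unbounded fold change.
        fold = float("inf") if hi > 0 else 1.0
    t = student_t(cancer, control)
    passes = fold > min_fold and abs(t) > t_crit
    return {"direction": direction, "fold": fold, "t": t, "passes": passes}
```

With a different number of pools the critical value must be replaced by the one for the corresponding degrees of freedom, or an exact p-value computed with a statistics package.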

TABLE 2A

Biomarkers with increased expression in cancer as compared to control samples.

Protein ID                          P-Value   Fold Change
CLEC3B                              0.067     No expression in control
KLK8                                0.07      No expression in control
C8A                                 0.149     No expression in control
HRC                                 0.17      No expression in control
KLK13                               0.178     No expression in control
C7                                  0.207     No expression in control
ALDH1A1                             0.24      No expression in control
APOL1                               0.32      No expression in control
MUC-1                               0.27      40.6
BLMH                                0.212     38.1
SPRR1B                              0.117     35.1
SERPINB2                            0.11      16.1
Putative uncharacterized protein    0.165     11.7
RAB-30                              0.153     11.3
C4A                                 0.099     9.6
PRDX6                               0.14      7.6
CFHR1                               0.169     7.4
A1BG                                0.11      7.2
GGH                                 0.14      7.1
EZR                                 0.066     6.3
SERPINF2                            0.16      5.9
HPX                                 0.1       5.5
CRISP3                              0.0238    5.2
CPA4                                0.14      4.8
PGLYRP2                             0.06      3.9
CASP14                              0.068     3.3
Ig Kappa Chain V-III region POM     0.001     2.6
ALB                                 0.014     2.4
CFH                                 0.042     2.1
SLC34A2                             0.105     29.3

TABLE 2B

Biomarkers with a decrease in expression in cancer samples as compared to controls

Protein ID                          P-Value   Fold Change
GAS6                                0.045     3.5
CTSL1                               0.051     3.4
SFRP1                               0.059     3.4
BPI                                 0.045     2.5
CHID1                               0.0546    2.2
MSN                                 0.0545    2.06
ERAP1                               0.014     1.6
QPCT                                0.045     1.6
ATRN                                0.062     1.6
LTF                                 0.051     1.5

To further confirm protein identity, the peptide sequences produced by trypsin digestion were mapped back to the original protein sequence. Trypsin products unique to particular proteins were noted, as these sequences have the potential to be used as diagnostic peptides as well as isotopically labeled standards in the final CLIA triple quadrupole mass spectrometry assay. The sequences of the trypsin products and the full-length protein markers identified in Tables 2A and 2B are provided in Appendix I and Appendix II, respectively.
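One way to find trypsin products unique to a protein is an in silico digest followed by a comparison across proteins. The sketch below applies the common rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), with no missed cleavages. The two sequences are invented toy examples and not proteins from Tables 2A or 2B; the minimum peptide length of six residues and the function names are likewise assumptions.

```python
import re


def trypsin_digest(sequence: str, min_length: int = 6):
    """Split a protein sequence after K or R, except before P."""
    # Zero-width split points require Python 3.7 or later.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def unique_peptides(proteins: dict, min_length: int = 6):
    """Map each protein name to the peptides found in no other protein."""
    owners = {}
    for name, sequence in proteins.items():
        for peptide in set(trypsin_digest(sequence, min_length)):
            owners.setdefault(peptide, set()).add(name)
    unique = {name: [] for name in proteins}
    for peptide, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].append(peptide)
    return {name: sorted(peptides) for name, peptides in unique.items()}


# Hypothetical toy sequences for illustration only.
toy_proteins = {
    "TOY_A": "MSTLEEKAAGGLLDRPEPTIDEK",
    "TOY_B": "GGSAAVVRAAGGLLDRPEPTIDEKWWQQNNK",
}
print(unique_peptides(toy_proteins))
```

In this toy case the shared peptide AAGGLLDRPEPTIDEK is excluded because it occurs in both sequences, which mirrors why only protein-specific trypsin products are candidates for diagnostic peptides.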

Example 4
Methods for Schirmer Strip Collections and Processing

Institutional review board approval was obtained for the collection of tears using Schirmer strips. For collection, the rounded tip of the Schirmer strip was folded over at the 0 mm line, forming a lip. The folded portion was placed in the lower eyelid of the participant, who was asked to close the eye and keep it closed for 5 minutes. After 5 minutes the strip was removed, placed in a sterile, pre-labeled 1.5 mL snap top tube, and stored at −20° C. or −80° C. depending on availability. Collection criteria stated that the strip could be removed early if the 35 mm mark was reached before 5 minutes had elapsed.

Data collected from participants included the following: age, sex, race, current use of birth control or hormone replacement therapy, ophthalmological infections, current or recent chemotherapy treatments, family history of cancer, genetic testing (BRCA1/2) if available, cancer stage, cancer type, hormone receptor status, size of mass, tumor grade, and previous history of cancer. A spreadsheet was constructed to house this information and allow for sample stratification based on desired characteristics. Sample total protein content was also entered into the database.
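
Purely as an illustrative sketch, the stratification described above could be expressed in Python with one record per sample and a simple filter. The class name, the field names, and the reduced set of fields shown are hypothetical and do not reflect the layout of the spreadsheet actually used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TearSample:
    """One participant sample; only a few of the collected fields are shown."""
    sample_id: str
    age: int
    sex: str
    has_cancer: bool
    cancer_stage: Optional[str] = None
    total_protein_ug: Optional[float] = None


def stratify(samples, **criteria):
    """Return the samples whose fields equal every keyword criterion given."""
    return [s for s in samples
            if all(getattr(s, key) == value for key, value in criteria.items())]


samples = [
    TearSample("S1", 54, "F", True, cancer_stage="II"),
    TearSample("S2", 49, "F", False),
    TearSample("S3", 61, "F", True, cancer_stage="I"),
]
stage_two = stratify(samples, has_cancer=True, cancer_stage="II")
```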

To elute the proteins bound to the Schirmer strip, the strips were first diced and placed in a clean, sterile 1.5 mL snap top tube. 200 μL of 1×PBS was added to the diced strip, and the sample was incubated overnight at 4° C. with mild shaking. Following elution, the samples were spun briefly to collect the strip fragments at the bottom of the tube, and the supernatant was transferred to a new, clean 1.5 mL snap top tube. Total protein content was determined using a BCA assay, as described above, and the samples were stored at −80° C. until further use.
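
For illustration, the total protein recovered in an eluate follows from the measured concentration and the elution volume. The short sketch below assumes that the BCA assay reports concentration in μg/mL and that the full 200 μL of PBS is recovered. The function name and these units are assumptions of the sketch.

```python
def total_protein_ug(concentration_ug_per_ml, volume_ul=200.0):
    """Total protein (in micrograms) in an eluate of the given volume (in microliters)."""
    return concentration_ug_per_ml * volume_ul / 1000.0


# A hypothetical eluate measured at 500 ug/mL in 200 uL contains 100 ug of protein.
print(total_protein_ug(500.0))
```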

REFERENCES

Armstrong, K., Handorf, E. A., Chen, J., & Bristol Demeter, M. N. (2013). Breast cancer risk prediction and mammography biopsy decisions: a model-based study. American Journal of Preventive Medicine, 44(1), 15-22. doi:10.1016/j.amepre.2012.10.002

Böhm, D., Keller, K., Pieter, J., Boehm, N., Wolters, D., Siggelkow, W., et al. (2012). Comparison of tear protein levels in breast cancer patients and healthy controls using a de novo proteomic approach. Oncology Reports, 28(2), 429-438. doi: 10.3892/or.2012.1849

Böhm, D., Keller, K., Wehrwein, N., Lebrecht, A., Schmidt, M., Kölbl, H., & Grus, F.-H. (2011). Serum proteome profiling of primary breast cancer indicates a specific biomarker profile. Oncology Reports, 26(5), 1051-1056. doi:10.3892/or.2011.1420

Brown, M. L., Houn, F., Sickles, E. A., & Kessler, L. G. (1995). Screening Mammography in Community Practice: Positive Predictive. American Journal of Radiology, 165, 1373-1377.

Grady, D. (2012). Study of Breast Biopsies Finds Surgery Used Too Extensively. New York Times, 1-4.

Kolb, T., Lichy, J., & Newhouse, J. (2002). Comparison of the performance of screening mammography, physical examination, and breast US and evaluation of factors that influence them: an analysis of 27,825 patient evaluations. Radiology, 225(1), 165-175.

Lebrecht, A., Boehm, D., Schmidt, M., Koelbl, H., & Grus, F. H. (2009a). Surface-enhanced Laser Desorption/Ionisation Time-of-flight Mass Spectrometry to Detect Breast Cancer Markers in Tears and Serum. Cancer Genomics & Proteomics, 6(2), 75-83.

Lebrecht, A., Boehm, D., Schmidt, M., Koelbl, H., Schwirz, R. L., & Grus, F. H. (2009b). Diagnosis of breast cancer by tear proteomic pattern. Cancer Genomics & Proteomics, 6(3), 177-182.

Li, J., Zhang, Z., Rosenzweig, J., Wang, Y., & Chan, D. (2002). Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem, 48(8), 1296-1304.

Luftner, D., & Possinger, K. (2002). Nuclear matrix proteins as biomarkers for breast cancer. Expert Rev Mol Diagn, 2(1), 23-31. doi:10.1586/14737159.2.1.23

Schiess, R., Wollscheid, B., & Aebersold, R. (2009). Targeted proteomic strategy for clinical biomarker discovery. Molecular Oncology, 3(1), 33-44. doi:10.1016/j.molonc.2008.12.001

Wu, K., & Zhang, Y. (2007). Clinical application of tear proteomics: Present and future prospects. Proteomics. Clinical Applications, 1(9), 972-982. doi: 10.1002/prca.200700125

APPENDIX I

Sequences shown to be up regulated in subjects with cancer:

Protein Name
Uniprot
IPI
Gene Name:

Complement C4A
P0C0L4
IPI00032258
C4A_Human

Trypsin Fragments

1. AEFQDALEK
2. DHAVDLIQK
3. DKGQAGLQR
4. EMSGSPASGIPVK
5. FACYYPR
6. FGLLDEDGKK
7. GHLFLQTDQPIYNPGQR
8. GLCVATPVQLR
9. GLQDEDGYR
10. GPEVQLVAHSPWLK
11. GSFEFPVGDAVSK
12. HLVPGAPFLLQALVR
13. LLATLCSAEVCQCAEGK
14. LNMGITDLQGLR
15. ITQVLHFTK
16. NVNFQK
17. QGSFQGGFR
18. SCGLHQLLR
19. VDFTLSSER
20. VDVQAGACEGK
21. VFALDQK
22. VGDTLNLNLR
23. VLSLAQEQVGGSPEK
24. VTASDPLDTLGSEGALSPGGVASLLR
25. YLDKTEQWSTLPPETK

10 20 30 40

MRLLWGLIWA SSFFTLSLQK PRLLLFSPSV VHLGVPLSVG

50 60 70 80

VQLQDVPRGQ VVKGSVFLRN PSRNNVPCSP K⁽¹⁵⁾VDFTLSSER

90 100 110 120

DFALLSLQVP LKDAK⁽¹⁸⁾SCGLH QLLR⁽¹⁰⁾GPEVQL VAHSPWLKDS

130 140 150 160

LSRTTNIQGI NLLFSSRR⁽⁷⁾GH LFLQTDQPIY NPGQRVRYR⁽²¹⁾V

170 180 190 200

FALDQKMRPS TDTITVMVEN SHGLRVRKKE VYMPSSIFQD

210 220 230 240

DFVIPDISEP GTWKISARFS DGLESNSSTQ FEVKKYVLPN

250 260 270 280

FEVKITPGKP YILTVPGHLD EMQLDIQARY IYGKPVQGVA

290 300 310 320

YVR⁽⁶⁾FGLLDED GKKTFFRGLE SQTKLVNGQS HISLSK(2)AEFQ

330 340 350 360

DALEK
⁽¹⁴⁾
LNMGI TDLQGLRLYV AAAIIESPGG EMEEAELTSW

370 380 390 400

YFVSSPFSLD LSKTKR⁽¹²⁾HLVP GAPFLLQALV R⁽⁴⁾EMSGSPASG

410 420 430 440

IPVKVSATVS SPGSVPEVQD IQQNTDGSGQ VSIPIIIPQT

450 460 470 480

ISELQLSVSA GSPHPAIARL TVAAPPSGGP GFLSIERPDS

490 500 510 520

RPPR⁽²²⁾VGDTLN LNLRAVGSGA TFSHYYMIL SRGQIVFMNR

530 540 550 560

EPKRTLTSVS VFVDHHLAPS FYFVAFYYHG DHPVANSLR⁽²⁰⁾V

570 580 590 600

DVQAGACEGK LELSVDGAKQ YRNGESVKLH LETDSLALVA

610 620 630 640

LGALDTALYA AGSKSHKPLN MGKVFEAMNS YDLGCGPGGG

650 660 670 680

DSALQVFQAA GLAFSDGDQW TLSRKRLSCP KEKTTRKKR⁽¹⁶⁾N

690 700 710 720

VNFQKAINEK LGQYASPTAK RCCQDGVTRL PMMRSCEQRA

730 740 750 760

ARVQQPDCRE PFLSCCQFAE SLRKKSR⁽³⁾DKG QAGLQRALEI

770 780 790 800

LQEEDLIDED DIPVRSFFFE NWLWRVETVD RFQILTLWLP

810 820 830 840

DSLTTWEIHG LSLSKTK⁽⁸⁾GLC VATPVQLRVF REFHLHLRLP

850 860 870 880

MSVRRFEQLE LRPVLYNYLD KNLTVSVHVS PVEGLCLAGG

890 900 910 920

GGLAQQVLVP AGSARPVAFS VVPTAAAAVS LKVVAR⁽¹¹⁾GSFE

930 940 950 960

FPVGDAVSKV LQIEKEGAIH REELVYELNP LDHRGRTLEI

970 980 990 1000

PGNSDPNMIP DGDFNSYVR⁽²⁴⁾V TASDPLDTLG SEGALSPGGV

1010 1020 1030 1040

ASLLRLPRGC GEQTMIYLAP TLAASR⁽²⁵⁾YLDK TEQWSTLPPE

1050 1060 1070 1080

TK
⁽²⁾
DHAVDLIQ KGYMRIQQFR KADGSYAAWL SRDSSTWLTA

1090 1100 1110 1120

FVLK⁽²³⁾VLSLAQ EQVGGSPEKL QETSNWLLSQ QQADGSFQDP

1130 1140 1150 1160

CPVLDRSMQG GLVGNDETVA LTAFVTIALH HGLAVFQDEG

1170 1180 1190 1200

AEPLKQRVEA SISKANSFLG EKASAGLLGA HAAAITAYAL

1210 1220 1230 1240

TLTKAPVDLL GVAHNNLMAM AQETGDNLYW GSVTGSQSNA

1250 1260 1270 1280

VSPTPAPRNP SDPMPQAPAL WIETTAYALL HLLLHEGKAE

1290 1300 1310 1320

MADQASAWLT R⁽¹⁷⁾QGSFQGGFR STQDTVIALD ALSAYWIASH

1330 1340 1350 1360

TTEERGLNVT LSSTGRNGFK SHALQLNNRQ IRGLEEELQF

1370 1380 1390 1400

SLGSKINVKV GGNSKGTLKV LRTYNVLDMK NTTCQDLQIE

1410 1420 1430 1440

VTVKGHVEYT MEANEDYEDY EYDELPAKDD PDAPLQPVTP

1450 1460 1470 1480

LQLFEGRRNR RRREAPKVVE EQESRVHYTV CIWRNGKVGL

1490 1500 1510 1520

SGMAIADVTL LSGFHALRAD LEKLTSLSDR YVSHFETEGP

1530 1540 1550 1560

HVLLYFDSVP TSRECVGFEA VQEVPVGLVQ PASATLYDYY

1570 1580 1590 1600

NPERRCSVFY GAPSKSR⁽¹³⁾LLA TLCSAEVCQC AEGKCPRQRR

1610 1620 1630 1640

ALER⁽⁹⁾GLQDED GYRMK⁽⁵⁾FACYY PRVEYGFQVK VLREDSRAAF

1650 1660 1670 1680

RLFETK⁽¹⁸⁾ITQV LHFTKDVKAA ANQMRNFLVR ASCRLRLEPG

1690 1700 1710 1720

KEYLIMGLDG ATYDLEGHPQ YLLDSNSWIE EMPSERLCRS

1730 1740

TRQRAACAQL NDFLQEYGTQ GCQV

Protein Name
Uniprot
IPI
Gene Name

Histidine Rich Protein
P04196
IPI00022371
HRG_Human

Trypsin fragments

1. YKEENDDFASFR
2. ADLFYDVEALDLESPK

MKALIAALLL ITLQYSCAVS PTDCSAVEPE AEKALDLINK

70 80

RRRDGYLFQL LRIADAHLDR VENTTVYYLV LDVQESDCSV

90 100 110 120

LSRKYWNDCE PPDSRRPSEI VIGQCKVIAT RHSHESQDLR

130 140 150 160

VIDFNCTTSS VSSALANTKD SPVLIDFFED TERYRKQANK

170 180 190 200

ALEK⁽¹⁾YKEEND DFASFRVDRI ERVARVRGGE GTGYFVDFSV

210 220 230 240

RNCPRHHFPR HPNVFGFCR⁽²⁾A DLFYDVEALD LESPKNLVIN

250 260 270 280

CEVFDPQEHE NINGVPPHLG HPFHWGGHER SSTTKPPFKP

290 300 310 320

HGSRDHHHPH KPHEHGPPPP PDERDHSHGP PLPQGPPPLL

330 340 350 360

PMSCSSCQHA TFGTNGAQRH SHNNNSSDLH PHKHHSHEQH

370 380 390 400

PHGHHPHAHH PHEHDTHRQH PHGHHPHGHH PHGHHPHGHH

410 420 430 440

PHGHHPHCHD FQDYGPCDPP PHNQGHCCHG HGPPPGHLRR

450 460 470 480

RGPGKGPRPF HCRQIGSVYR LPPLRKGEVL PLPEANPPSF

490 500 510 520

PLPHHKHPLK PDNQPFPQSV SESCPGKFKS GFPQVSMFFT

HTFPK

Protein Name
Uniprot
IPI
Gene Name

C-type lectin domain
P05452
IPI00009028.2
CLEC3B_Human

family 3, member B

(Tetranectin)

Trypsin fragments

1. EQQALQTVCLK
2. TFHEASEDCISR

10 20 30 40

MELWGAYLLL CLFSLLTQVT TEPPTQKPKK IVNAKKDVVN

50 60 70 80

TKMFEELKSR LDTLAQEVAL L¹KEQQALQTV CLKGTKVHMK

90 100 110 120

CFLALTQTK²T FHEASEDCIS RGGTLGTPQT GSENDALYEY

130 140 150 160

LRQSVGNEAE IWLGLNDMAA EGTWVDMTGA RIAYKNWETE

170 180 190 200

ITAQPDGGKT ENCAVLSGAA NGKWFDKRCR DQLPYICQFG

IV

Protein Name
Uniprot
IPI
Gene Name

Kallikrein-8
O60259-2
IPI00219892
KLK8_Human

isoform 2

Trypsin fragments

1. ENFPDTLNCAEVK

10 20 30 40

MGRPRPRAAK TWMFLLLLGG AWAGHSRAQE DKVLGGHECQ

50 60 70 80

PHSQPWQAAL FQGQQLLCGG VLVGGNWVLT AAHCKKPKYT

90 100 110 120

VRLGDHSLQN KDGPEQEIPV VQSIPHPCYN SSDVEDHNHD

130 140 150 160

LMLLQLRDQA SLGSKVKPIS LADHCTQPGQ KCTVSGWGTV

170 180 190 200

TSPR⁽³⁾ENFPDT LNCAEVKIFP QKKCEDAYPG QITDGMVCAG

210 220 230 240

SSKGADTCQG DSGGPLVCDG ALQGITSWGS DPCGRSDKPG

250 260

VYTNICRYLD WIKKIIGSKG

Protein Name
Uniprot
IPI
Gene Name

Complement Component
P07357
IPI00011252
C8A_Human

8 alpha

Trypsin fragments

1. AIDEDCSQYEPIPGSQK
2. LGSLGAACEQTQTEGAK
3. QAQCGQDFQCK

10 20 30 40

MFAVVFFILS LMTCQPGVTA QEKVNQRVRR AATPAAVTCQ

50 60 70 80

LSNWSEWTDC FPCQDKKYRH RSLLQPNKFG GTICSGDIWD

90 100 110 120

QASCSSSTTC VR⁽³⁾QAQCGQDF QCKETGRCLK RHLVCNGDQD

130 140 150 160

CLDGSDEDDC EDVR⁽¹⁾AIDEDC SQYEPIPGSQ KAALGYNILT

170 180 190 200

QEDAQSVYDA SYYGGQCETV YNGEWRELRY DSTCERLYYG

210 220 230 240

DDEKYFRKPY NFLKYHFEAL ADTGISSEFY DNANDLLSKV

250 260 270 280

KKDKSDSFGV TIGIGPAGSP LLVGVGVSHS QDTSFLNELN

290 300 310 320

WYNEKKFIFT RIFTKVQTAH FKMRKDDIML DEGMLQSLME

330 340 350 360

LPDQYNYGMY AKFINDYGTH YITSGSMGGI YEYILVIDKA

370 380 390 400

KMESLGITSR DITTCFGGSL GIQYEDKINV GGGLSGDHCK

410 420 430 440

KFGGGKTERA RKAMAVEDII SRVRGGSSGW SGGLAQNRST

450 460 470 480

ITYRSWGRSL KYNPVVIDFE MQPIHEVLRH TSLGPLEAKR

490 500 510 520

QNLRRALDQY LMEFNACRCG PCFNNGVPIL EGTSCRCQCR

530 540 550 560

⁽²⁾
LGSLGAACEQ TQTEGAKADG SWSCWSSWSV CRAGIQERRR

570 580

ECDNPAPQNG GASCPGRKVQ TQAC

Protein Name
Uniprot
IPI
Gene Name

Kallikrein-13
Q9UKR3
IPI00007726
KLK13_Human

Trypsin fragments

1. TLQCANIQLR
2. ITDNMLCAGTK

10 20 30 40

MWPLALVIAS LTLALSGGVS QESSKVLNTN GTSGFLPGGY

50 60 70 80

TCFPHSQPWQ AALLVQGRLL CGGVLVHPKW VLTAAHCLKE

90 100 110 120

GLKVYLGKHA LGRVEAGEQV REVVHSIPHP EYRRSPTHLM

130 140 150 160

HDHDIMLLEL QSPVQLTGYI QTLPLSHNNR LTPGTTCRVS

170 180 190 200

GWGTTTSPQV NYPR⁽¹⁾TLQCAN IQLRSDEECR QVYPGK⁽²⁾ITDN

210 220 230 240

MLCAGTKEGG KDSCEGDSGG PLVCNRTLYG IVSWGDFPCG

250 260 270

QPDRPGVYTR VSRYVLWIRE TIRKYETQQQ KWLKGPQ

Protein Name
Uniprot
IPI
Gene Name

Complement
P10643
IPI00296608
C7_Human

Component 7

Trypsin fragments

1. AASGTQNNVLR
2. DSCTLPASAEK

10 20 30 40

MKVISLFILV GFIGEFQSFS SASSPVNCQW DFYAPWSECN

50 60 70 80

GCTKTQTRRR SVAVYGQYGG QPCVGNAFET QSCEPTRGCP

90 100 110 120

TEEGCGERFR CFSGQCISKS LVCNGDSDCD EDSADEDRCE

130 140 150 160

DSERRPSCDI DKPPPNIELT GNGYNELTGQ FRNRVINTKS

170 180 190 200

FGGQCRKVFS GDGKDFYRLS GNVLSYTFQV KINNDFNYEF

210 220 230 240

YNSTWSYVKH TSTEHTSSSR KRSFFRSSSS SSRSYTSHTN

250 260 270 280

EIHKGKSYQL LVVENTVEVA QFINNNPEFL QLAEPFWKEL

290 300 310 320

SHLPSLYDYS AYRRLIDQYG THYLQSGSLG GEYRVLFYVD

330 340 350 360

SEKLKENDFN SVEEKKCKSS GWHFVVKFSS HGCKELENAL

370 380 390 400

K⁽¹⁾AASGTQNNV LRGEPFIRGG GAGFISGLSY LELDNPAGNK

410 420 430 440

RRYSAWAESV TNLPQVIKQK LTPLYELVKE VPCASVKKLY

450 460 470 480

LKWALEEYLD EFDPCHCRPC QNGGLATVEG THCLCHCKPY

490 500 510 520

TFGAACEQGV LVGNQAGGVD GGWSCWSSWS PCVQGKKTRS

530 540 550 560

RECNNPPPSG GGRSCVGETT ESTQCEDEEL EHLRLLEPHC

570 580 590 600

FPLSLVPTEF CPSPPALKDG FVQDEGTMFP VGKNVVYTCN

610 620 630 640

EGYSLIGNPV ARCGEDLRWL VGEMHCQKIA CVLPVLMDGI

650 660 670 680

QSHPQKPFYT VGEKVTVSCS GGMSLEGPSA FLCGSSLKWS

690 700 710 720

PEMKNARCVQ KENPLTQAVP KCQRWEKLQN SRCVCKMPYE

730 740 750 760

CGPSLDVCAQ DERSKRILPL TVCKMHVLHC QGRNYTLTGR

770 780 790 800

⁽²⁾
DSCTLPASAE KACGACPLWG KCDAESSKCV CREASECEEE

810 820 830 840

GFSICVEVNG KEQTMSECEA GALRCRGQSI SVTSIRPCAA

ETQ

Protein Name
Uniprot
IPI
Gene Name

Retinal
P00352
IPI00218914
ALDH1A1_Human

Dehydrogenase

Trypsin fragments

1. SSSGTPDLPVLLTDLK
2. YILGNPLTPGVTQGPQIDKEQYDK

10 20 30 40

M⁽¹⁾SSSGTPDLP VLLTDLKIQY TKIFINNEWH DSVSGKKFPV

50 60 70 80

FNPATEEELC QVEEGDKEDV DKAVKAARQA FQIGSPWRTM

90 100 110 120

DASERGRLLY KLADLIERDR LLLATMESMN GGKLYSNAYL

130 140 150 160

NDLAGCIKTL RYCAGWADKI QGRTIPIDGN FFTYTRHEPI

170 180 190 200

GVCGQIIPWN FPLVMLIWKI GPALSCGNTV VVKPAEQTPL

210 220 230 240

TALHVASLIK EAGFPPGVVN IVPGYGPTAG AAISSHMDID

250 260 270 280

KVAFTGSTEV GKLIKEAAGK SNLKRVTLEL GGKSPCIVLA

290 300 310 320

DADLDNAVEF AHHGVFYHQG QCCIAASRIF VEESIYDEFV

330 340 350 360

RRSVERAKK⁽²⁾Y ILGNPLTPGV TQGPQIDKEQ YDKILDLIES

370 380 390 400

GKKEGAKLEC GGGPWGNKGY FVQPTVFSNV TDEMRIAKEE

410 420 430 440

IFGPVQQIMK FKSLDDVIKR ANNTFYGLSA GVFTKDIDKA

450 460 470 480

ITISSALQAG TVWVNCYGVV SAQCPFGGFK MSGNGRELGE

490 500

YGFHEYTEVK TVTVKISQKN S

Protein Name
Uniprot
IPI
Gene Name

ApoLipoprotein L1
Q9UKR3
IPI00186903
APOL1_Human

isoform 2

Trypsin fragments

1. VTEPISAESGEQVER

10 20 30 40

MEGAALLRVS VLCIWMSALF LGVGVRAEEA GARVQQNVPS

50 60 70 80

GTDTGDPQSK PLGDWAAGTM DPESSIFIED AIKYFKEKVS

90 100 110 120

TQNLLLLLTD NEAWNGFVAA AELPRNEADE LRKALDNLAR

130 140 150 160

QMIMKDKNWH DKGQQYRNWF LKEFPRLKSE LEDNIRRLRA

170 180 190 200

LADGVQKVHK GTTIANVVSG SLSISSGILT LVGMGLAPFT

210 220 230 240

EGGSLVLLEP GMELGITAAL TGITSSTMDY GKKWWTQAQA

250 260 270 280

HDLVIKSLDK LKEVREFLGE NISNFLSLAG NTYQLTRGIG

290 300 310 320

KDIRALRRAR ANLQSVPHAS ASRPRVTEPI SAESGEQVER

330 340 350 360

VNEPSILEMS RGVKLTDVAP VSFFLVLDVV YLVYESKHLH

370 380 390

EGAKSETAEE LKKVAQELEE KLNILNNNYK ILQADQEL

Protein Name
Uniprot
IPI
Gene Name

Mucin 1
P15941-2
IPI00218163
Muc1_Human

isoform 2

Trypsin fragments

1. DISEMFLQIYK
2. QGGFLGLSNIK

10 20 30 40

MTPGTQSPFF LLLLLTVLTV VTGSGHASST PGGEKETSAT

50 60 70 80

QRSSVPSSTE KNAVSMTSSV LSSHSPGSGS STTQGQDVTL

90 100 110 120

APATEPASGS AATWGQDVTS VPVTRPALGS TTPPAHDVTS

130 140 150 160

APDNKPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

170 180 190 200

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

210 220 230 240

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

250 260 270 280

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

290 300 310 320

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

330 340 350 360

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

370 380 390 400

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

410 420 430 440

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

450 460 470 480

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

490 500 510 520

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

530 540 550 560

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

570 580 590 600

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

610 620 630 640

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

650 660 670 680

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

690 700 710 720

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

730 740 750 760

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

770 780 790 800

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

810 820 830 840

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

850 860 870 880

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

890 900 910 920

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

930 940 950 960

APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS

970 980 990 1000

ASGSASGSAS TLVHNGTSAR ATTTPASKST PFSIPSHHSD

1010 1020 1030 1040

TPTTLASHST KTDASSTHHS SVPPLTSSNH STSPQLSTGV

1050 1060 1070 1080

SFFFLSFHIS NLQFNSSLED PSTDYYQELQ R⁽¹⁾DISEMFLQI

1090 1100 1110 1120

YK⁽²⁾QGGFLGLS NIKFRPGSVV VQLTLAFREG TINVHDVETQ

1130 1140 1150 1160

FNQYKTEAAS RYNLTISDVS VSDVPFPFSA QSGAGVPGWG

1170 1180 1190 1200

IALLVLVCVL VALAIVYLIA LAVCQCRRKN YGQLDIFPAR

1210 1220 1230 1240

DTYHPMSEYP TYHTHGRYVP PSSTDRSPYE KVSAGNGGSS

1250

LSYTNPAVAA TSANL

Protein Name
Uniprot
IPI
Gene Name

Bleomycin
Q13867
IPI00219575
BLMH_Human

Hydrolase

Trypsin fragments

1. AQHVFQHAVPQEGKPITNQK
2. SSSGLNSEKVAALIQK

10 20 30 40

M⁽²⁾SSSGLNSEK VAALIQKLNS DPQFVLAQNV GTTHDLLDIC

50 60 70 80

LKRATVQR⁽¹⁾AQ HVFQHAVPQE GKPITNQKSS GRCWIFSCLN

90 100 110 120

VMRLPFMKKL NIEEFEFSQS YLFFWDKVER CYFFLSAFVD

130 140 150 160

TAQRKEPEDG RLVQFLLMNP ANDGGQWDML VNIVEKYGVI

170 180 190 200

PKKCFPESYT TEATRRMNDI LNHKMRERCI RLRNLVHSGA

210 220 230 240

TKGEISATQD VMMEEIFRVV CICLGNPPET FTWEYRDKDK

250 260 270 280

NYQKIGPITP LEFYREHVKP LFNMEDKICL VNDPRPQHKY

290 300 310 320

NKLYTVEYLS NMVGGRKTLY NNQPIDFLKK MVAASIKDGE

330 340 350 360

AVWFGCDVGK HFNSKLGLSD MNLYDHELVF GVSLKNMNKA

370 380 390 400

ERLTFGESLM THAMTFTAVS EKDDQDGAFT KWRVENSWGE

410 420 430 440

DHGHKGYLCM TDEWFSEYVY EVVVDRKHVP EEVLAVLEQE

450

PIILPAWDPM GALAE

Protein Name
Uniprot
IPI
Gene Name

Cornifin-B
P22528
IPI00304903
SPRR1B_Human

Trypsin fragments

1. QPCTPPPQLQQQQVK
2. VPEPCPSIVTPAPAQQK

10 20 30 40

MSSQQQK⁽¹⁾QPC TPPPQLQQQQ VKQPCQPPPQ EPCIPKTKEP

50 60 70 80

CHPKVPEPCH PKVPEPCQPK VPEPCHPK⁽²⁾VP EPCPSIVTPA

PAQQKTKQK

Protein Name
Uniprot
IPI
Gene Name

Plasminogen activator
P05120
IPI00007117
SERPINB2_Human

inhibitor-2

Trypsin fragments

1. GKIPNLLPEGSVDGDTR

10 20 30 40

MEDLCVANTL FALNLFKHLA KASPTQNLFL SPWSISSTMA

50 60 70 80

MVYMGSRGST EDQMAKVLQF NEVGANAVTP MTPENFTSCG

90 100 110 120

FMQQIQKGSY PDAILQAQAA DKIHSSFRSL SSAINASTGN

130 140 150 160

YLLESVNKLF GEKSASFREE YIRLCQKYYS SEPQAVDFLE

170 180 190 200

CAEEARKKIN SWVKTQTKGK IPNLLPEGSV DGDTRMVLVN

210 220 230 240

AVYFKGKWKT PFEKKLNGLY PFRVNSAQRT PVQMMYLREK

250 260 270 280

LNIGYIEDLK AQILELPYAG DVSMFLLLPD EIADVSTGLE

290 300 310 320

LLESEITYDK LNKWTSKDKM AEDEVEVYIP QFKLEEHYEL

330 340 350 360

RSILRSMGME DAFNKGRANF SGMSERNDLF LSEVFHQAMV

370 380 390 400

DVNEEGTEAA AGTGGVMTGR TGHGGPQFVA DHPFLFLIMA

410

KITNCILFFG RFSSP

Protein Name
Uniprot
IPI
Gene Name

Peroxiredoxin-6
P30041
IPI00220301
PRDX6_Human

Trypsin fragments

1. DINAYNCEEPTEK
2. NFDEILR

10 20 30 40

MPGGLLLGDV APNFEANTTV GRIRFHDFLG DSWGILFSHP

50 60 70 80

RDFTPVCTTE LGRAAKLAPE FAKRNVKLIA LSIDSVEDHL

90 100 110 120

AWSK⁽¹⁾DINAYN CEEPTEKLPF PIIDDRNREL AILLGMLDPA

130 140 150 160

EKDEKGMPVT ARVVFVFGPD KKLKLSILYP ATTGR⁽²⁾NFDEI

170 180 190 200

LRVVISLQLT AEKRVATPVD WKDGDSVMVL PTIPEEEAKK

210 220

LFPKGVFTKE LPSGKKYLRY TPQP

Protein Name
Uniprot
IPI
Gene Name

Complement
Q03591
IPI00011264
CFHR1_Human

factor-H

Trypsin Fragments

1. ITCTEEGWSPTPK
2. STDTSCVNPPTVQNAHILSR
3. TGESAEFVCK

10 20 30 40

MWLLVSVILI SRISSVGGEA TFCDFPKINH GILYDEEKYK

50 60 70 80

PFSQVPTGEV FYYSCEYNFV SPSKSFWTR⁽¹⁾I TCTEEGWSPT

90 100 110 120

PKCLRLCFFP FVENGRSESS GQTHLEGDTV QIICNTGYRL

130 140 150 160

QNNENNISCV ERGWSTPPKC R⁽²⁾STDTSCVNP PTVQNAHILS

170 180 190 200

RQMSKYPSGE RVRYECRSPY EMFGDEEVMC LNGNWTEPPQ

210 220 230 240

CKDSTGKCGP PPPIDNGDIT SFPLSVYAPA SSVEYQCQNL

250 260 270 280

YQLEGNKRIT CRNGQWSEPP KCLHPCVISR EIMENYNIAL

290 300 310 320

RWTAKQKLYL R⁽³⁾TGESAEFVC KRGYRLSSRS HTLRTTCWDG

330

KLEYPTCAKR

Protein Name
Uniprot
IPI
Gene Name

Isoform 1 of
P04217
IPI00022895
A1BG_Human

Alpha-1B-

glycoprotein

Trypsin Fragments

1. ATWSGAVLAGR
2. CLAPLEGAR
3. GVTFLLR
4. HQFLLTGDTQGR
5. LLELTGPK
6. SGLSTGWTQLSK

10 20 30 40

MSMLVVFLLL WGVTWGPVTE AAIFYETQPS LWAESESLLK

50 60 70 80

PLANVTLTCQ AHLETPDFQL FKNGVAQEPV HLDSPAIK⁽⁴⁾HQ

90 100 110 120

FLLTGDTQGR YRCR⁽⁸⁾SGLSTG WTQLSK⁽⁵⁾LLEL TGPKSLPAPW

130 140 150 160

LSMAPVSWIT PGLKTTAVCR GVLR⁽⁴⁾GVTFLL RREGDHEFLE

170 180 190 200

VPEAQEDVEA TFPVHQPGNY SCSYRTDGEG ALSEPSATVT

210 220 230 240

IEELAAPPPP VLMHHGESSQ VLHPGNKVTL TCVAPLSGVD

250 260 270 280

FQLRRGEKEL LVPRSSTSPD RIFFHLNAVA LGDGGHYTCR

290 300 310 320

YRLHDNQNGW SGDSAPVELI LSDETLPAPE FSPEPESGRA

330 340 350 360

LRLR⁽²⁾CLAPLE GARFALVRED RGGRRVHRFQ SPAGTEALFE

370 380 390 400

LHNISVADSA NYSCVYVDLK PPFGGSAPSE RLELHVDGPP

410 420 430 440

PRPQLR⁽¹⁾ATWS GAVLAGRDAV LRCEGPIPDV TFELLREGET

450 460 470 480

KAVKTVRTPG AAANLELIFV GPQHAGNYRC RYRSWVPHTF

490

ESELSDPVEL LVAES

Protein Name
Uniprot
IPI
Gene Name

Gamma-glutamyl
Q92820
IPI00023728
GGH_Human

hydrolase

Trypsin Fragments

1. NLDGISHAPNAVK

10 20 30 40

MASPGCLLCV LGLLLCGAAS LELSRPHGDT AKKPIIGILM

50 60 70 80

QKCRNKVMKN YGRYYIAASY VKYLESAGAR VVPVRLDLTE

90 100 110 120

KDYEILFKSI NGILFPGGSV DLRRSDYAKV AKIFYNLSIQ

130 140 150 160

SFDDGDYDPV WGTCLGFEEL SLLISGECLL TATDTVDVAM

170 180 190 200

PLNFTGGQLH SRMFQNFPTE LLLSLAVEPL TANFHKWSLS

210 220 230 240

VKNFTMNEKL KKFFNVLTTN TDGKIEFIST MEGYKYPVYG

250 260 270 280

VQWHPEKAPY EWKNLDGISH APNAVKTAFY LAEFFVNEAR

290 300 310

KNNHHFKSES EEERALIYQF SPIYTGHISS FQQCYIFD

Protein Name
Uniprot
IPI
Gene Name

Ezrin
P15311
IPI00843975
EZR_Human

Trypsin Fragments

1. ALQLEEER
2. APDFVFYAPR
3. ELSEQIQR
4. IALLEEAR
5. IGFPWSEIR
6. QRIDEFEAL
7. SGYLSSER
8. SQEQLAAELAEYTAK
9. VSAQEVRK

10 20 30 40

MPKPINVRVT TMDAELEFAI QPNTTGKQLF DQVVKTIGLR

50 60 70 80

EVWYFGLHYV DNKGFPTWLK LDKK⁽³⁾VSAQEV RKENPLQFKF

90 100 110 120

RAKFYPEDVA EELIQDITQK LFFLQVKEGI LSDEIYCPPE

130 140 150 160

TAVLLGSYAV QAKFGDYNKE VHK⁽⁷⁾SGYLSSE RLIPQRVMDQ

170 180 190 200

HKLTRDQWED RIQVWHAEHR GMLKDNAMLE YLKIAQDLEM

210 220 230 240

YGINYFEIKN KKGTDLWLGV DALGLNIYEK DDKLTPK⁽⁵⁾IGF

250 260 270 280

PWSEIRNISF NDKKFVIKPI DKK⁽²⁾APDFVFY APRLRINKRI

290 300 310 320

LQLCMGNHEL YMRRRKPDTI EVQQMKAQAR EEKHQKQLER

330 340 350 360

QQLETEKKRR ETVEREKEQM MREKEELMLR LQDYEEKTKK

370 380 390 400

AER⁽³⁾ELSEQIQ R⁽¹⁾ALQLEEERK RAQEEAERLE ADRMAALRAK

410 420 430 440

EELERQAVDQ IK⁽¹¹⁾SQEQLAAE LAEYTAK⁽⁴⁾IAL LEEARRRKED

450 460 470 480

EVEEWQHRAK EAQDDLVKTK EELHLVMTAP PPPPPPVYEP

490 500 510 520

VSYHVQESLQ DEGAEPTGYS AELSSEGIRD DRNEEKRITE

530 540 550 560

AEKNERVQRQ LLTLSSELSQ ARDENKRTHN DIIHNENMRQ

570 580

GRDKYKTLRQ IRQGNTK⁽⁶⁾QRI DEFEAL

Protein Name
Uniprot
IPI
Gene Name

Alpha-2-
P08697
IPI00879231
SERPINF2_Human

antiplasmin

Trypsin Fragments

1. LGNQEPGGQTALK

10 20 30 40

MALLWGLLVL SWSCLQGPCS VFSPVSAMEP LGRQLTSGPN

50 60 70 80

QEQVSPLTLL KLGNQEPGGQ TALKSPPGVC SRDPTPEQTH

90 100 110 120

RLARAMMAFT ADLFSLVAQT STCPNLILSP LSVALALSHL

130 140 150 160

ALGAQNHTLQ RLQQVLHAGS GPCLPHLLSR LCQDLGPGAF

170 180 190 200

RLAARMYLQK GFPIKEDFLE QSEQLFGAKP VSLTGKQEDD

210 220 230 240

LANINQWVKE ATEGKIQEFL SGLPEDTVLL LLNAIHFQGF

250 260 270 280

WRNKFDPSLT QRDSFHLDEQ FTVPVEMMQA RTYPLRWFLL

290 300 310 320

EQPEIQVAHF PFKNNMSFVV LVPTHFEWNV SQVLANLSWD

330 340 350 360

TLHPPLVWER PTKVRLPKLY LKHQMDLVAT LSQLGLQELF

370 380 390 400

QAPDLRGISE QSLVVSGVQH QSTLELSEVG VEAAAATSIA

410 420 430 440

MSRMSLSSFS VNRPFLFFIF EDTTGLPLFV GSVRNPNPSA

450 460 470 480

PRELKEQQDS PGNKDFLQSL KGFPRGDKLF GPDLKLVPPM

490

EEDYPQFGSP K

Protein Name
Uniprot
IPI
Gene Name

Hemopexin
P02790
IPI00022488
HPX_Human

Trypsin Fragments

1. DVRDYFMPCPGR
2. DYFMPCPGR
3. EVGTPHGIILDSVDAAFICPGSSR
4. GECQAEGVLFFQGDR
5. GEFVWK
6. GGYTLVSGYPK
7. LLQDEFPGIPSPLDAAVECHR
8. NFPSPVDAAFR
9. QGHNSVFLIK
10. SGAQATWTELPWPHEK
11. VDGALCMEK
12. WKNFPSPVDAAFR

10 20 30 40

MARVLGAPVA LGLWSLCWSL AIATPLPPTS AHGNVAEGET

50 60 70 80

KPDPDVTERC SDGWSFDATT LDDNGTMLFF K⁽⁵⁾GEFVWKSHK

90 100 110 120

WDRELISERW K⁽⁸⁾NFPSPVDAA FR⁽⁹⁾QGHNSVFL IKGDKVWVYP

130 140 150 160

PEKKEKGYPK ⁽⁷⁾LLQDEFPGIP SPLDAAVECH R⁽⁴⁾GECQAEGVL

170 180 190 200

FFQGDREWFW DLATGTMKER SWPAVGNCSS ALRWLGRYYC

210 220 230 240

FQGNQFLRFD PVRGEVPPRY PR⁽¹⁾DVR⁽²⁾DYFMP CPGRGHGHRN

250 260 270 280

GTGHGNSTHH GPEYMRCSPH LVLSALTSDN HGATYAFSGT

290 300 310 320

HYWRLDTSRD GWHSWPIAHQ WPQGPSAVDA AFSWEEKLYL

330 340 350 360

VQGTQVYVFL TK⁽⁶⁾GGYTLVSG YPKRLEK⁽³⁾EVG TPHGIILDSV

370 380 390 400

DAAFICPGSS RLHIMAGRRL WWLDLK⁽¹⁰⁾SGAQ ATWTELPWPH

410 420 430 440

EK
⁽¹¹⁾
VDGALCME KSLGPNSCSA NGPGLYLIHG PNLYCYSDVE

450 460

KLNAAKALPQ PQNVTSLLGC TH

Protein Name
Uniprot
IPI
Gene Name

Cysteine Rich
P54108
IPI00974055
Crisp3_Human

Secretory Protein 3

Trypsin Fragments

1. WANQCNYR

10 20 30 40

MTLFPVLLFL VAGLLPSFPA NEDKDPAFTA LLTTQTQVQR

50 60 70 80

EIVNKHNELR RAVSPPARNM LKMEWNKEAA ANAQKWANQC

90 100 110 120

NYRHSNPKDR MTSLKCGENL YMSSASSSWS QAIQSWFDEY

130 140 150 160

NDFDFGVGPK TPNAVVGHYT QVVWYSSYLV GCGNAYCPNQ

170 180 190 200

KVLKYYYVCQ YCPAGNWANR LYVPYEQGAP CASCPDNCDD

210 220 230 240

GLCTNGCKYE DLYSNCKSLK LTLTCKHQLV RDSCKASCNC

Protein Name
Uniprot
IPI
Gene Name

Carboxypeptidase A4
Q9UI42
IPI00008894
CP4A_Human

Trypsin Fragments

1. DPAITSILEK
2. SRNPGSSCIGADPNR
3. GASDNPCSEVYHGPHANSEVEVK
4. SVVDFIQK
5. NPGSSCIGADPNR

10 20 30 40

MRWILFIGAL IGSSICGQEK FFGDQVLRIN VRNGDEISKL

50 60 70 80

SQLVNSNNLK LNFWKSPSSF NRPVDVLVPS VSLQAFKSFL

90 100 110 120

RSQGLEYAVT IEDLQALLDN EDDEMQHNEG QERSSNNFNY

130 140 150 160

GAYHSLEAIY HEMDNIAADF PDLARRVKIG HSFENRPMYV

170 180 190 200

LKFSTGKGVR RPAVWLNAGI HSREWISQAT AIWTARKIVS

210 220 230 240

DYQR⁽¹⁾DPAITS ILEKMDIFLL PVANPDGYVY TQTQNRLWRK

250 260 270 280

TR⁽²⁾SR⁽⁵⁾NPGSSC IGADPNRNWN ASFAGK⁽³⁾GASD NPCSEVYHGP

290 300 310 320

HANSEVEVK
⁽⁴⁾
SVVDFIQKHGN FKGFIDLHSY SQLLMYPYGY

330 340 350 360

SVKKAPDAEE LDKVARLAAK ALASVSGTEY QVGPTCTTVY

370 380 390 400

PASGSSIDWA YDNGIKFAFT FELRDTGTYG FLLPANQIIP

410 420

TAEETWLGLK TIMEHVRDNL

Protein Name
Uniprot
IPI
Gene Name

N-acetylmuramyl-
Q96PD5-2
IPI00394992
PGLYRP2_Human

L-alanine amidase

Trypsin Fragments

1. GSQTQSHPDLGTEGCWDQLSAPR
2. TFTLLDPK

10 20 30 40

MAQGVLWILL GLLLWSDPGT ASLPLLMDSV IQALAELEQK

50 60 70 80

VPAAKTRHTA SAWLMSAPNS GPHNRLYHFL LGAWSLNATE

90 100 110 120

LDPCPLSPEL LGLTKEVARH DVREGKEYGV VLAPDGSTVA

130 140 150 160

VEPLLAGLEA GLQGRRVINL PLDSMAAPWE TGDTFPDVVA

170 180 190 200

IAPDVRATSS PGLRDGSPDV TTADIGANTP DATKGCPDVQ

210 220 230 240

ASLPDAKAKS PPTMVDSLLA VTLAGNLGLT FLR⁽¹⁾GSQTQSH

250 260 270 280

PDLGTEGCWD QLSAPR
⁽²⁾
TFTL LDPKASLLTM AFLNGALDGV

290 300 310 320

ILGDYLSRTP EPRPSLSHLL SQYYGAGVAR DPGFRSNFRR

330 340 350 360

QNGAALTSAS ILAQQVWGTL VLLQRLEPVH LQLQCMSQEQ

370 380 390 400

LAQVAANATK EFTEAFLGCP ATHPRCRWGA APYRGRPKLL

410 420 430 440

QLPLGFLYVH HTYVPAPPCT DFTRCAANMR SMQRYHQDTQ

450 460 470 480

GWGDIGYSFV VGSDGYVYEG RGWHWVGAHT LGHNSRGFGV

490 500 510 520

AIVGNYTAAL PTEAALRTVR DTLPSCAVRA GLLRPDYALL

530 540 550 560

GHRQLVRTDC PGDALFDLLR TWPHFTATVK PRPARSVSKR

570

SRREPPPRTL PATDLQ

Protein Name
Uniprot
Gene Name

Caspase-14
P31944
CASP14_Human

Trypsin Fragments

1. AREGSEEDLDALEHMFR
2. DPTAEQFQEELEK
3. FQQAIDSR
4. KTNPEIQSTLR
5. MAEAELVQEGK
6. RDPTAEQFQEELEK
7. RMAEAELVQEGK
8. SLEEEKYDMSGAR
9. TNPEIQSTLR
10. VYIIQACR

10 20 30 40

MSNPR⁽⁸⁾SLEEE KYDMSGARLA LILCVTK⁽¹⁾ARE GSEEDLDALE

50 60 70 80

HMFRQLRFES TMK⁽⁶⁾R⁽²⁾DPTAEQ FQEELEK⁽³⁾FQQ AIDSREDPVS

90 100 110 120

CAFVVLMAHG REGFLKGEDG EMVKLENLFE ALNNKNCQAL

130 140 150 160

RAKPKV⁽¹⁰⁾YIIQ ACRGEQRDPG ETVGGDEIVM VIKDSPQTIP

170 180 190 200

TYTDALHVYS TVEGYIAYRH DQKGSCFIQT LVDVFTKRKG

210 220 230 240

HILELLTEVT R⁽⁷⁾R⁽⁵⁾MAEAELVQ EGKAR⁽⁴⁾K⁽⁹⁾TNPE IQSTLRKRLY

LQ

Protein Name
Uniprot
IPI
Gene Name

Ig Kappa chain
P04207
IPI00385253
KV308_Human

V-III region POM

Trypsin Fragments

1. LLIYGASTR

10 20 30 40

MEAPAQLLFL LLLWLPDTTG SIVMTQSPAT LSVSPGERAT

50 60 70 80

LSCRASQSVS NNLAWYQQKP GQPPRLLIYG ASTRATGIPA

90 100 110 120

RFSGSGSGTE FTLTISRLQS EDFAVYYCQQ YNNWPPWTFG

QGTRVEIKR

Protein Name
Uniprot
IPI
Gene Name

Ig Kappa chain
P01624
IPI00387119
KV306_Human

V-III region POM

Trypsin Fragments

1. EIVMTQSPVTLSVSPGER

10 20 30 40

EIVMTQSPVT LSVSPGERAT LSCRASQSIS NSYLAWYQQK

50 60 70 80

PSGSPRLLIY GASTRATGIP ARFSGSGSGT EFTLTISSLQ

90 100

SEDFAVYYCQ QYNNWPPTFG QGTRVEIKR

Protein Name
Uniprot
IPI
Gene Name

Isoform 1
P02768-1
IPI00387119
ALB_Human

Serum Albumin

Trypsin Fragments

1. AACLLPK
2. AAFTECCQAADK
3. AAFTECCQAADKAACLLPK
4. ADDKETCFAEEGK
5. ADDKETCFAEEGKK
6. AEFAEVSK
7. AEFAEVSKLVTDLTK
8. ATKEQLK
9. ATKEQLKAVMDDFAAFVEK
11. AVMDDFAAFVEK
12. CASLQKFGER
13. CCAAADPHECYAK
14. CCKADDKETCFAEEGK
15. CCKHPEAK
16. CCTESLVNR
17. CCTESLVNRRPCFSALEVDETYVPK
18. DDNPNLPR
19. DAHKSEVAHR
20. DLGEENFK
21. DVCKNYAEAK
22. DVFLGMFLYEYAR
23. ECCEKPLLEK
24. EFNAETFTFHADICTLSEK
25. EFNAETFTFHADICTLSEKER
26. EQLKAVMDDFAAFVEK
27. ETCFAEEGK
28. ETCFAEEGKK
29. ETYGEMADCCAK
30. FKDLGEENFK
31. FPKAEFAEVSK
32. FQNALLVR
33. HPDYSVVLLLR
34. HPYFYAPELLFFAK
35. LAKTYETTLEK
36. LCTVATLR
37. LCTVATLRETYGEMADCCAK
38. LDELRDEGK
39. LDELRDEGKASSAK
40. LKCASLQK
41. LKECCEKPLLEK
42. LSQRFPK
43. LSQRFPKAEFAEVSK
44. LVAASQAALGL
45. LVNEVTEFAK
46. LVNEVTEFAKTCVADESAENCDK
47. LVRPEVDVMCTAFHDNEETFLK
48. LVRPEVDVMCTAFHDNEETFLKK
49. LVTDLTK
50. KLVAASQAALGL
51. KQTALVELVK
52. KVPQVSTPTLVEVSR
53. KYLYEIAR
54. MPCAEDYLSVVLNQLCVLHEK
55. NECFLQHK
56. NECFLQHKDDNPNLPR
57. NLGKVGSK
58. NYAEAK
59. NYAEAKDVFLGMFLYEYAR
60. PLVEEPQNLIK
61. QEPERNECFLQHK
62. QEPERNECFLQHKDDNPNLPR
63. QNCELFEQLGEYK
64. QNCELFEQLGEYKFQNALLVR
65. QTALVELVK
66. RHPDYSVVLLLR
67. RMPCAEDYLSVVLNQLCVLHEK
68. RPCFSALEVDETYVPK
69. SHCIAEVENDEMPADLPSLAADFVESK
70. SHCIAEVENDEMPADLPSLAADFVESKDVCKNYAEAK
71. SHCIAEVENDEMPADLPSLAADFVESKDVCK
72. SLHTLFGDK
73. SLHTLFGDKLCTVATLR
74. TCVADESAENCDK
75. TCVADESAENCDKSLHTLFGDK
76. TCVADESAENCDKSLHTLFGDKLCTVATLR
77. TPVSDRVTK
78. TYETTLEK
79. TYETTLEKCCAAADPHECYAK
80. VFDEFKPLVEEPQNLIK
81. VHTECCHGDLLECADDR
82. VHTECCHGDLLECADDRADLAK
83. VHTECCHGDLLECADDRADLAKYICENQDSISSK
84. VPQVSTPTLVEVSR
85. YICENQDSISSK
86. YICENQDSISSKLK
87. YLYEIAR
88. YLYEIARR
89. YKAAFTECCQAADK
90. YKAAFTECCQAADKAACLLPK

MKWVTFISLL FLFSSAYSRG VFRR⁽¹⁰⁾DAHKSE VAHR⁽³⁰⁾FK⁽²⁰⁾DLGE ENFKALVLIA

FAQYLQQCPFEDHVK⁽⁴⁵⁾LVNEV TEFAK^(74,75,76)TCVAD ESAENCDK^(72,73)SL HTLFGDK^(37,36)LCT

VATLR
⁽²³⁾
ETYGE MADCCAK^(61)(62)QEP ER^(55)(56)NECFLQHK ⁽¹⁸⁾DDNPNLPR^(47)(48)LV

RPEVDVMCTA FHDNEETFLK
⁽⁵³⁾
K
^(87,88)
YLYEIARR⁽²⁴⁾H PYFYAPELLF FAKR^(89,90)YK⁽²⁾⁽³⁾AAFT

ECCQAADK
⁽¹⁾
AA CLLPK
^(39)(38)
LDELR DEGKASSAKQ R⁽⁴⁹⁾LK⁽¹²⁾CASLQKFGERAFKAWAV

AR^(43)(48)LSQR⁽³¹⁾FPK⁽⁴⁾A EFAEV⁽⁷⁾SK⁽⁴⁹⁾LVT DLTK^(81,82,83)VHTECC HGDLLECADD

RADLAK
^(85,86)
YICE NQDSISSK⁽⁴¹⁾LK ⁽²³⁾ECCEKPLLEK ^(69,70,71)SHCIAEVEND EMPADLPSLA

ADFVESK
⁽²¹⁾
DVC K^(58)(59)NYAEAK⁽²²⁾DVFLGMFLYEYAR ⁽⁶⁸⁾R⁽³³⁾HPDYSVVLL

LR
⁽³⁵⁾
LAK
^(78,79)
TYETT LEK⁽¹²⁾CCAAADP HECYAK⁽⁸⁰⁾VFDE FK⁽⁴⁰⁾PLVEEPQN

LIK
^(63)(64)
QNCELFE QLGEYK
⁽³²⁾
FQNA LLVRYTK⁽⁵²⁾K⁽⁵⁴⁾VP QVSTPTLVEV SR⁽⁵⁷⁾NLGKVGSK

⁽¹⁵⁾
CCKHPEAK
⁽⁶⁷⁾R⁽⁶⁴⁾M PCAEDYLSVV LNQLCVLHEK ⁽⁷⁷⁾TPVSDRVTK^(17)(16)C CTESLVNR⁽⁶⁸⁾RP

CFSALEVDET YVPK
^(25)(24)
EFNAETFTF HADICTL

SEKERQIK⁽⁵¹⁾K⁽⁶⁵⁾Q TALVELVKHK PK^(10)(9)ATK⁽²⁶⁾EQLK⁽¹¹⁾A

VMDDFAAFVEK
⁽¹⁴⁾
CCK
⁽⁴⁾
ADDK
^(28)(27)
ET CFAEEGK
^(5)(56)
K
⁽⁴⁴⁾
LV

AASQAALGL

Protein Name
Uniprot
IPI
Gene Name

Isoform 1 of
P08603
IPI00029739
CFH_Human

Complement factor H

Trypsin Fragments

1. *AGEQVTYTCATYYK
2. CLHPCVISR
3. *DGWSAQPTCIK
4. *DTSCVNPPTVQNAYIVSR
5. *EFDHNSNIR
6. EIMENYNIALR
7. GDAVCTESGWR
8. GDAVCTESGWRPLPSCEEK
9. *IDVHLVPDR
10. *LSYTCEGGFR
11. IVSSAMEPDREYHFGQAVR
12. NTEILTGSWSDQTYPEGTQAIYK
13. RPYFPVAVGK
14. *SCDIPVFMNAR
15. SIDVACHPGYALPK
16. SLGNVIMVCR
17. *SSNLIILEEHLK
18. *SSQESYAHGTK
19. TGDEITYQCR
20. TGESVEFVCK
21. *TKNDFTWFK
22. TTCWDGKLEYPTCAK
23. *VSVLCQENYLIQEGEEITCKDGR
24. *WQSIPLCVEK
25. *WSSPPQCEGLPCK

10 20 30 40

MRLLAKIICL MLWAICVAED CNELPPRR⁽¹²⁾NT EILTGSWSDQ

50 60 70 80

TYPEGTQAIY KCRPGYR⁽¹⁶⁾SLG NVIMVCRKGE WVALNPLRKC

90 100 110 120

QKRPCGHPGD TPEGTFTLTG GNVFEYGVKA VYTCNEGYQL

130 140 150 160

LGEINYRECD TDGWTNDIPI CEVVKCLPVT APENGK⁽¹¹⁾IVSS

170 180 190 200

AMEPDREYHF GQAVRFVCNS GYKIEGDEEM HCSDDGFWSK

210 220 230 240

EKPKCVEISC KSPDVINGSP ISQKIIYKEN ERFQYKCNMG

250 260 270 280

YEYSER⁽⁷⁾GDAV CTESGWR⁽⁸⁾PLP SCEEKSCDNP YIPNGDYSPL

290 300 310 320

RIKHR⁽¹⁹⁾TGDEI TYQCRNGFYP ATRGNTAKCT STGWIPAPRC

330 340 350 360

TLKPCDYPDI KHGGLYHENM R⁽¹³⁾RPYFPVAVG KYYSYYCDEH

370 380 390 400

FETPSGSYWD HIHCTQDGWS PAVPCLRKCY FPYLENGYNQ

410 420 430 440

NYGRKFVQGK ⁽¹⁵⁾SIDVACHPGY ALPKAQTTVT CMENGWSPTP

450 460 470 480

RCIRVKTCSK SSIDIENGFI SESQYTYALK EKAKYQCKLG

490 500 510 520

YVTADGETSG SITCGK⁽³⁾DGWS AQPTCIK⁽¹⁴⁾SCD IPVFMNAR⁽²¹⁾TK

530 540 550 560

NDFTWFKLND TLDYECHDGY ESNTGSTTGS IVCGYNGWSD

570 580 590 600

LPICYERECE LPK⁽⁹⁾IDVHLVP DRKKDQYKVG EVLKFSCKPG

610 620 630 640

FTIVGPNSVQ CYHFGLSPDL PICKEQVQSC GPPPELLNGN

650 660 670 680

VKEKTKEEYG HSEVVEYYCN PRFLMKGPNK IQCVDGEWTT

690 700 710 720

LPVCIVEEST CGDIPELEHG WAQLSSPPYY YGDSVEFNCS

730 740 750 760

ESFTMIGHRS ITCIHGVWTQ LPQCVAIDKL KKCK⁽¹⁷⁾SSNLII

770 780 790 800

LEEHLKNKK⁽⁵⁾E FDHNSNIRYR CRGKEGWIHT VCINGRWDPE

810 820 830 840

VNCSMAQIQL CPPPPQIPNS HNMTTTLNYR DGEK⁽²³⁾VSVLCQ

850 860 870 880

ENYLIQEGEE ITCKDGR⁽²⁴⁾WQS IPLCVEKIPC SQPPQIEHGT

890 900 910 920

INSSR⁽¹⁸⁾SSQES YAHGTK⁽¹⁰⁾LSYT CEGGFRISEE NETTCYMGK⁽²⁵⁾W

930 940 950 960

SSPPQCEGLP CKSPPEISHG VVAHMSDSYQ YGEEVTYKCF

970 980 990 1000

EGFGIDGPAI AKCLGEKWSH PPSCIKTDCL SLPSFENAIP

1010 1020 1030 1040

MGEKKDVYK⁽¹⁾A GEQVTYTCAT YYKMDGASNV TCINSRWTGR

1050 1060 1070 1080

PTCR⁽⁴⁾DTSCVN PPTVQNAYIV SRQMSKYPSG ERVRYQCRSP

1090 1100 1110 1120

YEMFGDEEVM CLNGNWTEPP QCKDSTGKCG PPPPIDNGDI

1130 1140 1150 1160

TSFPLSVYAP ASSVEYQCQN LYQLEGNKRI TCRNGQWSEP

1170 1180 1190 1200

PK⁽²⁾CLHPCVIS R⁽⁶⁾EIMENYNIA LRNTAKQKLY SR⁽²⁰⁾TGESVEFV

1210 1220 1230

CKRGYRLSSR SHTLR⁽²²⁾TTCWD GKLEYPTCAK R

Protein Name: Isoform 1 of Sodium-dependent phosphate transport protein
Uniprot: O95436-1
IPI: IPI00007910
Gene Name: SLC34A2_Human

Trypsin Fragments

1. EAQGEVPASDSKTECTAL
2. VISQIAMNDEK

10 20 30 40

MAPWPELGDA QPNPDKYLEG AAGQQPTAPD KGKETNKTDN

50 60 70 80

TEAPVTKIEL LPSYSTATLI DEPTEVDDPW NLPTLQDSGI

90 100 110 120

KWSERDTKGK ILCFFQGIGR LILLLGFLYF FVCSLDILSS

130 140 150 160

AFQLVGGKMA GQFFSNSSIM SNPLLGLVIG VLVTVLVQSS

170 180 190 200

STSTSIVVSM VSSSLLTVRA AIPIIMGANI GTSITNTIVA

210 220 230 240

LMQVGDRSEF RRAFAGATVH DEFNWLSVLC LLPVEVATHY

250 260 270 280

LEIITQLIVE SFHFKNGEDA PDLLKVITKP FTKLIVQLDK

290 300 310 320

K⁽²⁾VISQIAMND EKAKNKSLVK IWCKTFTNKT QINVTVPSTA

330 340 350 360

NCTSPSLCWT DGIQNWTMKN VTYKENIAKC QHIFVNFHLP

370 380 390 400

DLAVGTILLI LSLLVLCGCL IMIVKILGSV LKGQVALVIK

410 420 430 440

KTINTDFPFP FAWLTGYLAI LVGAGMTFIV QSSSVFTSAL

450 460 470 480

TPLIGIGVIT IERAYPLTLG SNIGTTTTAI LAALASPGNA

490 500 510 520

LRSSLQIALC HFFFNISGIL LWYPIPFTRL PIRMAKGLGN

530 540 550 560

ISAKYRWFAV FYLIIFFFLI PLTVFGLSLA GWRVLVGVGV

570 580 590 600

PVVFIIILVL CLRLLQSRCP RVLPKKLQNW NFLPLWMRSL

610 620 630 640

KPWDAVVSKF TGCFQMRCCC CCRVCCRACC LLCDCPKCCR

650 660 670 680

CSKCCEDLEE AQEGQDVPVK APETFDNITI SR⁽¹⁾EAQGEVPA

690

SDSKTECTAL

Protein Name: Putative uncharacterized protein
Uniprot: (none listed)
IPI: IPI00152189
Gene Name: (none listed)

Trypsin Fragments

1. FSVLGSGLNR

MAWAPLLLTLLSLLTGSLSQPVLTQPPSASASLGASVTLTCTLSSGYSNYKVDWYQQRPG

KGPRFVMRVGTGGIVGSKGDGIPDRFSVLGSGLNRYLTIKNIQEEDESDYHCGADHGSGS

NFV
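
The relationship between the sequence above and its listed trypsin fragment can be checked with a short in silico digest. The following is a minimal sketch, assuming the commonly used rule that trypsin cleaves C-terminal to K or R except when the next residue is P. The function name tryptic_digest and the restriction to fully cleaved peptides are illustrative assumptions and are not part of the assay described in this application.

```python
import re

# Sequence of the putative uncharacterized protein (IPI00152189) as listed above.
SEQUENCE = (
    "MAWAPLLLTLLSLLTGSLSQPVLTQPPSASASLGASVTLTCTLSSGYSNYKVDWYQQRPG"
    "KGPRFVMRVGTGGIVGSKGDGIPDRFSVLGSGLNRYLTIKNIQEEDESDYHCGADHGSGS"
    "NFV"
)

# Fragment listed for this protein under "Trypsin Fragments".
LISTED_FRAGMENT = "FSVLGSGLNR"


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after every K or R that is not followed by P.

    Assumption: complete digestion, so peptides with missed cleavages
    are not generated by this sketch.
    """
    # Zero-width split: look behind for K/R, look ahead for anything but P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


peptides = tryptic_digest(SEQUENCE)

# True when the listed fragment is one of the fully cleaved peptides.
fragment_found = LISTED_FRAGMENT in peptides

# 1-based position of the first residue of the fragment in the full sequence.
fragment_start = SEQUENCE.find(LISTED_FRAGMENT) + 1

print(fragment_found, fragment_start)
```

Fragments in these appendices that retain an internal K or R (for example YLYEIARR) correspond to missed cleavages and would not be produced by this simplified digest; locating them requires a substring search against the full sequence instead.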

Protein Name: Ras-related protein Rab-30
Uniprot: Q15771
IPI: IPI00302030
Gene Name: RAB_30_Human

Trypsin Fragments

1. LQIWDTAGQER
2. SMEDYDFLFK

10 20 30 40

M⁽²⁾SMEDYDFLF KIVLIGNAGV GKTCLVRRFT QGLFPPGQGA

50 60 70 80

TIGVDFMIKT VEINGEKVK⁽¹⁾L QIWDTAGQER FRSITQSYYR

90 100 110 120

SANALILTYD ITCEESFRCL PEWLREIEQY ASNKVITVLV

130 140 150 160

GNKIDLAERR EVSQQRAEEF SEAQDMYYLE TSAKESDNVE

170 180 190 200

KLFLDLACRL ISEAPQNTLV NHVSSPLPGE GKSISYLTCC

NFN

APPENDIX II

Sequences shown to be downregulated in subjects with cancer:

Protein Name: Isoform 1 of growth arrest-specific protein 6
Uniprot: Q14393
IPI: IPI00412410
Gene Name: GAS6_Human

Trypsin Fragments

1. CEQVCVNSPGSYTCHCDGR
2. GQSEVSAAQLQER
3. IAVAGDLFQPER
4. MFSGTPVIR
5. MQCFSVTER
6. NSGFATCVQNLPDQCTPNPCDR

10 20 30 40

MAPSLSPGPA ALRRAPQLLL LLLAAECALA ALLPAREATQ

50 60 70 80

FLRPRQRRAF QVFEEAKQGH LERECVEELC SREEAREVFE

90 100 110 120

NDPETDYFYP RYLDCINKYG SPYTK⁽⁶⁾NSGFA TCVQNLPDQC

130 140 150 160

TPNPCDRKGT QACQDLMGNF FCLCKAGWGG RLCDKDVNEC

170 180 190 200

SQENGGCLQI CHNKPGSFHC SCHSGFELSS DGRTCQDIDE

210 220 230 240

CADSEACGEA RCKNLPGSYS CLCDEGFAYS SQEKACRDVD

250 260 270 280

ECLQGR⁽¹⁾CEQV CVNSPGSYTC HCDGRGGLKL SQDMDTCELE

290 300 310 320

AGWPCPRHRR DGSPAARPGR GAQGSRSEGH IPDRRGPRPW

330 340 350 360

QDILPCVPFS VAKSVKSLYL GR⁽⁴⁾MFSGTPVI RLRFKRLQPT

370 380 390 400

RLVAEFDFRT FDPEGILLFA GGHQDSTWIV LALRAGRLEL

410 420 430 440

QLRYNGVGRV TSSGPVINHG MWQTISVEEL ARNLVIKVNR

450 460 470 480

DAVMK⁽³⁾IAVAG DLFQPERGLY HLNLTVGGIP FHEKDLVQPI

490 500 510 520

NPRLDGCMRS WNWLNGEDTT IQETVKVNTR ⁽⁵⁾MQCFSVTERG

530 540 550 560

SFYPGSGFAF YSLDYMRTPL DVGTESTWEV EVVAHIRPAA

570 580 590 600

DTGVLFALWA PDLRAVPLSV ALVDYHSTKK LKKQLVVLAV

610 620 630 640

EHTALALMEI KVCDGQEHVV TVSLRDGEAT LEVDGTR⁽²⁾GQS

650 660 670 680

EVSAAQLQER LAVLERHLRS PVLTFAGGLP DVPVTSAPVT

690 700 710 720

AFYRGCMTLE VNRRLLDLDE AAYKHSDITA KSCPPVEPAA

Protein Name: Cathepsin L1
Uniprot: P07711
IPI: IPI00012887
Gene Name: CTSL1_Human

Trypsin Fragments

1. HSFTMAMNAFGDMTSEEFR
2. LYGMNEEGWR
3. NHCGIASAASYPV
4. NSWGEEWGMGGYVK

10 20 30 40

MNPTLILAAF CLGIASATLT FDHSLEAQWT KWKAMGNR⁽²⁾LY

50 60 70 80

GMNEEGWRRA VWEKNMKMIE LHNQEYREGK ⁽¹⁾HSFTMAMNAF

90 100 110 120

GDMTSEEFRQ VMNGFQNRKP RKGKVFQEPL FYEAPRSVDW

130 140 150 160

REKGYVTPVK NQGQCGSCWA FSATGALEGQ MFRKTGRLIS

170 180 190 200

LSEQNLVDCS GPQGNEGCNG GLMDYAFQYV QDNGGLDSEE

210 220 230 240

SYPYEATEES CKYNPKYSVA NDTGFVDIPK QEKALMKAVA

250 260 270 280

TVGPISVAID AGHESFLFYK EGIYFEPDCS SEDMDHGVLV

290 300 310 320

VGYGFESTES DNNKYWLVK⁽⁴⁾N SWGEEWGMGG YVKMAKDRR⁽³⁾N

330

HCGIASAASY PTV

Protein Name: Secreted frizzled-related protein 1
Uniprot: Q8N474
IPI: IPI00749245
Gene Name: SFRP1_Human

Trypsin Fragments

1. FYTKPPQCVDIPADLR
2. LCHNVGYK
3. LCHNVGYKK
4. MVLPNLLEHETMAEVK
5. PQGTTVCPPCDNELK
6. QQASSWVPLLNK
7. SEAIIEHLCASEFALR
8. SQYLLTAIHK

10 20 30 40

MGIGRSEGGR RGAALGVLLA LGAALLAVGS ASEYDYVSFQ

50 60 70 80

SDIGPYQSGR ⁽¹⁾FYTKPPQCVD IPADLR^(2,3)LCHN VGYKK⁽⁴⁾MVLPN

90 100 110 120

LLEHETMAEV K⁽⁶⁾QQASSWVPL LNKNCHAGTQ VFLCSLFAPV

130 140 150 160

CLDRPIYPCR WLCEAVRDSC EPVMQFFGFY WPEMLKCDKF

170 180 190 200

PEGDVCIAMT PPNATEASK⁽⁵⁾P QGTTVCPPCD NELK⁽⁷⁾SEAIIE

210 220 230 240

HLCASEFALR MKIKEVKKEN GDKKIVPKKK KPLKLGPIKK

250 260 270 280

KDLKKLVLYL KNGADCPCHQ LDNLSHHFLI MGRKVK⁽⁸⁾SQYL

290 300 310

LTAIHKWDKK NKEFKNFMKK MKNHECPTFQ SVFK

Protein Name: Bactericidal permeability-increasing protein
Uniprot: P17213
IPI: IPI00827847
Gene Name: BPI_Human

Trypsin Fragments

1. GLDYASQQGTAALQK
2. IKIPDYSDSFK

10 20 30 40

MRENMARGFC NAPRWASLMV LVAIGTAVTA AVNPGVVVRI

50 60 70 80

SQK⁽¹⁾GLDYASQ QGTAALQKEL KR⁽²⁾IKIPDYSD SFKIKHLGKG

90 100 110 120

HYSFYSMDIR EFQLPSSQIS MVPNVGLKFS ISNANIKISG

130 140 150 160

KWKAQKRFLK MSGNFDLSIE GMSISADLKL GSNPTSGKPT

170 180 190 200

ITCSSCSSHI NSVHVHISKS KVGWLIQLFH KKIESALRNK

210 220 230 240

MNSQVCEKVT NSVSSELQPY FQTLPVMTKI DSVAGINYGL

250 260 270 280

VAPPATTAET LDVQMKGEFY SENHHNPPPF APPVMEFPAA

290 300 310 320

HDRMVYLGLS DYFFNTAGLV YQEAGVLKMT LRDDMIPKES

330 340 350 360

KFRLTTKFFG TFLPEVAKKF PNMKIQIHVS ASTPPHLSVQ

370 380 390 400

PTGLTFYPAV DVQAFAVLPN SSLASLFLIG MHTTGSMEVS

410 420 430 440

AESNRLVGEL KLDRLLLELK HSNIGPFPVE LLQDIMNYIV

450 460 470 480

PILVLPRVNE KLQKGFPLPT PARVQLYNVV LQPHQNFLLF

GADVVYK

Protein Name: Chitinase domain-containing protein 1
Uniprot: Q9BWS9-2
IPI: IPI00306719
Gene Name: CHID1_Human

Trypsin Fragments

1. GLHIVPR
2. GLVVTDLK
3. NVLDSEDEIEELSK
4. SQFSDKPVQDR
5. YIQTLK

10 20 30 40

MRTLFNLLWL ALACSPVHTT LSKSDAKKAA SKTLLEK⁽⁴⁾SQF

50 60 70 80

SDKPVQDR⁽²⁾GL VVTDLKAESV VLEHRSYCSA KARDRHFAGD

90 100 110 120

VLGYVTPWNS HGYDVTKVFG SKFTQISPVW LQLKRRGREM

130 140 150 160

FEVTGLHDVD QGWMRAVRKH AK⁽¹⁾GLHIVPRL LFEDWTYDDF

170 180 190 200

R⁽³⁾NVLDSEDEI EELSKTVVQV AKNQHFDGFV VEVWNQLLSQ

210 220 230 240

KRVGLIHMLT HLAEALHQAR LLALLVIPPA ITPGTDQLGM

250 260 270 280

FTHKEFEQLA PVLDGFSLMT YDYSTAHQPG PNAPLSWVRA

290 300 310 320

CVQVLDPKSK WRSKILLGLN FYGMDYATSK DAREPVVGAR

330 340 350 360

⁽⁵⁾YIQTLKDHRP RMVWDSQASE HFFEYKKSRS GRHVVFYPTL

370 380 390

KSLQVRLELA RELGVGVSIW ELGQGLDYFY DLL

Protein Name: Moesin
Uniprot: P26038
IPI: IPI00219365
Gene Name: MSN_Human

Trypsin Fragments

1. ALELEQER
2. ALTSELANAR
3. AQMVQEDLEK
4. ESEAVEWQQK
5. ISQLEMAR
6. IGFPWSEIR

10 20 30 40

MPKTISVRVT TMDAELEFAI QPNTTGKQLF DQVVKTIGLR

50 60 70 80

EVWFFGLQYQ DTKGFSTWLK LNKKVTAQDV RKESPLLFKF

90 100 110 120

RAKFYPEDVS EELIQDITQR LFFLQVKEGI LNDDIYCPPE

130 140 150 160

TAVLLASYAV QSKYGDFNKE VHKSGYLAGD KLLPQRVLEQ

170 180 190 200

HKLNKDQWEE RIQVWHEEHR GMLREDAVLE YLKIAQDLEM

210 220 230 240

YGVNYFSIKN KKGSELWLGV DALGLNIYEQ NDRLTPK⁽⁶⁾IGF

250 260 270 280

PWSEIRNISF NDKKFVIKPI DKKAPDFVFY APRLRINKRI

290 300 310 320

LALCMGNHEL YMRRRKPDTI EVQQMKAQAR EEKHQKQMER

330 340 350 360

AMLENEKKKR EMAEKEKEKI EREKEELMER LKQIEEQTKK

370 380 390 400

AQQELEEQTR R⁽¹⁾ALELEQERK RAQSEAEKLA KERQEAEEAK

410 420 430 440

EALLQASRDQ KKTQEQLALE MAELTAR⁽⁵⁾ISQ LEMARQKK⁽⁴⁾ES

450 460 470 480

EAVEWQQK⁽³⁾AQ MVQEDLEKTR AELKTAMSTP HVAEPAENEQ

490 500 510 520

DEQDENGAEA SADLRADAMA KDRSEEERTT EAEKNERVQK

530 540 550 560

HLK⁽²⁾ALTSELA NARDESKKTA NDMIHAENMR LGRDKYKTLR

570

QIRQGNTKQR IDEFESM

Protein Name: Isoform 2 of Endoplasmic reticulum aminopeptidase 1
Uniprot: Q9NZ08-2
IPI: IPI00165949
Gene Name: ERAP1_Human

Trypsin Fragments

1. GACILNMLR
2. ILASTQFEPTAAR
3. SQIEFALCR

10 20 30 40

MVFLPLKWSL ATMSFLLSSL LALLTVSTPS WCQSTEASPK

50 60 70 80

RSDGTPFPWN KIRLPEYVIP VHYDLLIHAN LTTLTFWGTT

90 100 110 120

KVEITASQPT STIILHSHHL QISRATLRKG AGERLSEEPL

130 140 150 160

QVLEHPRQEQ IALLAPEPLL VGLPYTVVIH YAGNLSETFH

170 180 190 200

GFYKSTYRTK EGELR⁽²⁾ILAST QFEPTAARMA FPCFDEPAFK

210 220 230 240

ASFSIKIRRE PRHLAISNMP LVKSVTVAEG LIEDHFDVTV

250 260 270 280

KMSTYLVAFI ISDFESVSKI TKSGVKVSVY AVPDKINQAD

290 300 310 320

YALDAAVTLL EFYEDYFSIP YPLPKQDLAA IPDFQSGAME

330 340 350 360

NWGLTTYRES ALLFDAEKSS ASSKLGITMT VAHELAHQWF

370 380 390 400

GNLVTMEWWN DLWLNEGFAK FMEFVSVSVT HPELKVGDYP

410 420 430 440

FGKCFDAMEV DALNSSHPVS TPVENPAQIR EMFDDVSYDK

450 460 470 480

⁽¹⁾GACILNMLRE YLSADAFKSG IVQYLQKHSY KNTKNEDLWD

490 500 510 520

SMASICPTDG VKGMDGFCSR SQHSSSSSHW HQEGVDVKTM

530 540 550 560

MNTWTLQKGF PLITITVRGR NVHMKQEHYM KGSDGAPDTG

570 580 590 600

YLWHVPLTFI TSKSDMVHRF LLKTKTDVLI LPEEVEWIKF

610 620 630 640

NVGMNGYYIV HYEDDGWDSL TGLLKGTHTA VSSNDRASLI

650 660 670 680

NNAFQLVSIG KLSIEKALDL SLYLKHETEI MPVFQGLNEL

690 700 710 720

IPMYKLMEKR DMNEVETQFK AFLIRLLRDL IDKQTWTDEG

730 740 750 760

SVSERMLRSQ LLLLACVHNY QPCVQRAEGY FRKWKESNGN

770 780 790 800

LSLPVDVTLA VFAVGAQSTE GWDFLYSKYQ FSLSSTEK⁽³⁾SQ

810 820 830 840

IEFALCRTQN KEKLQWLLDE SFKGDKIKTQ EFPQILTLIG

850 860 870 880

RNPVGYPLAW QFLRKNWNKL VQKFELGSSS IAHMNMGTTN

890 900 910 920

QFSTRTRLEE VKGFFSSLKE NGSQLRCVQQ TIETIEENIG

930 940

WMDKNFDKIR VWLQSEKLER M

Protein Name: Isoform 1 of glutaminyl-peptide cyclotransferase
Uniprot: Q16769-1
IPI: IPI00003919
Gene Name: QPCT

Trypsin Fragments

1. MASTPHPPGAR
2. YPGSPGSYAAR

10 20 30 40

MAGGRHRRVV GTLHLLLLVA ALPWASRGVS PSASAWPEEK

50 60 70 80

NYHQPAILNS SALRQIAEGT SISEMWQNDL QPLLIER⁽²⁾YPG

90 100 110 120

SPGSYAARQH IMQRIQRLQA DWVLEIDTFL SQTPYGYRSF

130 140 150 160

SNIISTLNPT AKRHLVLACH YDSKYFSHWN NRVFVGATDS

170 180 190 200

AVPCAMMLEL ARALDKKLLS LKTVSDSKPD LSLQLIFFDG

210 220 230 240

EEAFLHWSPQ DSLYGSRHLA AK⁽¹⁾MASTPHPP GARGTSQLHG

250 260 270 280

MDLLVLLDLI GAPNPTFPNF FPNSARWFER LQAIEHELHE

290 300 310 320

LGLLKDHSLE GRYFQNYSYG GVIQDDHIPF LRRGVPVLHL

330 340 350 360

IPSPFPEVWH TMDDNEENLD ESTIDNLKNI LQVFVLEYLH

L

Protein Name: Isoform 1 of Attractin
Uniprot: O75882
IPI: IPI00027235
Gene Name: ATRN_Human

Trypsin Fragments

1. CTWLIEGQPNR
2. GDECQLCEVENR
3. GVKGDECQLCEVENR
4. LADDLYR
5. IMQSSQSMSK
6. LTGSSGFVTDGPGNYK
7. SCALDQNCQWEPR

10 20 30 40

MVAAAAATEA RLRRRTAATA ALAGRSGGPH WDWDVTRAGR

50 60 70 80

PGLGAGLRLP RLLSPPLRPR LLLLLLLLSP PLLLLLLPCE

90 100 110 120

AEAAAAAAAV SGSAAAEAKE CDRPCVNGGR CNPGTGQCVC

130 140 150 160

PAGWVGEQCQ HCGGRFR⁽⁶⁾LTG SSGFVTDGPG NYKYKTK⁽¹⁾CTW

170 180 190 200

LIEGQPNRIM RLRFNHFATE CSWDHLYVYD GDSIYAPLVA

210 220 230 240

AFSGLIVPER DGNETVPEVV ATSGYALLHF FSDAAYNLTG

250 260 270 280

FNITYSFDMC PNNCSGRGEC KISNSSDTVE CECSENWKGE

290 300 310 320

ACDIPHCTDN CGFPHRGICN SSDVRGCSCF SDWQGPGCSV

330 340 350 360

PVPANQSFWT REEYSNLKLP RASHKAVVNG NIMWVVGGYM

370 380 390 400

FNHSDYNMVL AYDLASREWL PLNRSVNNVV VRYGHSLALY

410 420 430 440

KDKIYMYGGK IDSTGNVTNE LRVFHIHNES WVLLTPKAKE

450 460 470 480

QYAVVGHSAH IVTLKNGRVV MLVIFGHCPL YGYISNVQEY

490 500 510 520

DLDKNTWSIL HTQGALVQGG YGHSSVYDHR TRALYVHGGY

530 540 550 560

KAFSANKYR⁽⁴⁾L ADDLYRYDVD TQMWTILKDS RFFRYLHTAV

570 580 590 600

IVSGTMLVFG GNTHNDTSMS HGAKCFSSDF MAYDIACDRW

610 620 630 640

SVLPRPDLHH DVNRFGHSAV LHNSTMYVFG GFNSLLLSDI

650 660 670 680

LVFTSEQCDA HRSEAACLAA GPGIRCVWNT GSSQCISWAL

690 700 710 720

ATDEQEEKLK SECFSKRTLD HDRCDQHTDC YSCTANTNDC

730 740 750 760

HWCNDHCVPR NHSCSEGQIS IFRYENCPKD NPMYYCNKKT

770 780 790 800

SCR⁽⁷⁾SCALDQN CQWEPRNQEC IALPENICGI GWHLVGNSCL

810 820 830 840

KITTAKENYD NAKLFCRNHN ALLASLTTQK KVEFVLKQLR

850 860 870 880

⁽⁵⁾IMQSSQSMSK LTLTPWVGLR KINVSYWCWE DMSPFTNSLL

890 900 910 920

QWMPSEPSDA GFCGILSEPS TRGLKAATCI NPLNGSVCER

930 940 950 960

PANHSAKQCR TPCALRTACG SCTSGSSECM WCSNMKQCVD

970 980 990 1000

SNAYVASFPF GQCMEWYTMS TCPPENCSGY CTCSHCLEQP

1010 1020 1030 1040

GCGWCTDPSN TGKGKCIEGS YKGPVKMPSQ APTGNFYPQP

1050 1060 1070 1080

LLNSSMCLED SRYNWSFIHC PACQCNGHSK CINQSICEKC

1090 1100 1110 1120

ENLTTGKHCE TCISGFYGDP TNGGKCQPCK CNGHASLCNT

1130 1140 1150 1160

NTGKCFCTTK ⁽³⁾GVK⁽²⁾GDECQLC EVENRYQGNP LRGTCYYTLL

1170 1180 1190 1200

IDYQFTFSLS QEDDRYYTAI NFVATPDEQN RDLDMFINAS

1210 1220 1230 1240

KNFNLNITWA ASFSAGTQAG EEMPVVSKTN IKEYKDSFSN

1250 1260 1270 1280

EKFDFRNHPN ITFFVYVSNF TWPIKIQIAF SQHSNFMDLV

1290 1300 1310 1320

QFFVTFFSCF LSLLLVAAVV WKIKQSCWAS RRREQLLREM

1330 1340 1350 1360

QQMASRPFAS VNVALETDEE PPDLIGGSIK TVPKPIALEP

1370 1380 1390 1400

CFGNKAAVLS VFVRLPRGLG GIPPPGQSGL AVASALVDIS

1410 1420

QQMPIVYKEK SGAVRNRKQQ PPAQPGTC

Protein Name: Uncharacterized protein
Uniprot: (none listed)
IPI: IPI00925547
Gene Name: LTF_Human

Trypsin Fragments

1. ADAVTLDGGFIYEAGLAPYK
2. ARVVWCAVGEQELR
3. ARVVWCAVGEQELRK
4. CAFSSQEPYFSYSGAFK
5. CFQWQR
6. CGLVPVLAENYK
7. CLAENAGDVAFVK
8. CLRDGAGDVAFIR
9. CSTSPLLEACEFLRK
10. CSTSPLLEACEFLR
11. CVPNSNER
12. CVPNSNERYYGYTGAFR
13. DCHLAR
14. DEYELLCPDNTR
15. DGAGDVAFIR
16. DGAGDVAFIRESTVFEDLSDEAERDEYELLCPDNTR
17. DLLFKDSAIGFSR
18. DLKLADFALLCLDGK
19. DLKLADFALLCLDGKR
20. DKSPKFQLFGSPSGQK
21. DSAIGFSR
22. DSAIGFSRVPPR
23. DSPIQCIQAIAENR
24. DSPIQCIQAIAENRADAVTLDGGFIYEAGLAPYK
25. DVTVLQNTDGNNNEAWAK
26. DVTVLQNTDGNNNEAWAKDIK
27. GEADAMSLDGGYVYTAGK
28. GGSFQLNELQGLK
29. GPPVSCIK
30. GPPVSCIKR
31. GQFPNLCR

MKLVFLVLLF LGALGLCLAG RRRSVQWCAV SQPEATK⁽⁵⁾CFQ WQRNMRKVR^(29,30)G

PPVSCIKR^(23,24)DS PIQCIQAIAE NR⁽¹⁾ADAVTLDG GFIYEAGLAP YKLRPVAAEV

YGTERQPRTH YYAVAVVKK⁽²⁸⁾G GSFQLNELQG LKSCHTGLRR TAGWNVPIGT

LRPFLNWTGP PEPIEAAVAR FFSASCVPGA DK⁽³¹⁾GQFPNLCR LCAGTGENK⁽⁴⁾C

AFSSQEPYFS YSGAFK⁽⁸⁾CLR^(15,16)D GAGDVAFIRE STVFEDLSDE AER⁽¹⁴⁾DEYELLC

PDNTRKPVDK FK⁽¹³⁾DCHLARVP SHAVVARSVN GKEDAIWNLL RQAQEKFGK⁽²⁰⁾D

KSPKFQLFGS PSGQK⁽¹⁷⁾DLLFK ^(21,22)DSAIGFSRVP PRIDSGLYLG SGYFTAIQNL

RKSEEEVAAR R^(2,3)ARVVWCAVG EQELRKCNQW SGLSEGSVTC SSASTTEDCI

ALK⁽²⁷⁾GEADAMS LDGGYVYTAG K⁽⁶⁾CGLVPVLAE NYKSQQSSDP DPNCVDRPVE

GYLAVAVVRR SDTSLTWNSV KGKKSCHTAV DRTAGWNIPM GLLFNQTGSC

KFDEYFSQSC APGSDPRSNL CALCIGDEQG ENK^(11,12)CVPNSNE RYYGYTGAFR

⁽⁷⁾CLAENAGDVA FVK^(25,26)DVTVLQN TDGNNNEAWA K^(18,19)DLKLADFAL LCLDGKRKPV

TEARSCHLAM APNHAVVSRM DKVERLKQVL LHQQAKFGRN GSDCPDKFCL

FQSETKNLLF NDNTECLARL HGKTTYEKYL GPQYVAGITN LKK^(9,10)CSTSPLL

EACEFLRK


CROSS REFERENCE TO RELATED APPLICATIONS

Provisional Applications (1)