Methods Of Detecting Cancer

Information

  • Patent Application
  • 20160003786
  • Publication Number
    20160003786
  • Date Filed
    May 08, 2015
  • Date Published
    January 07, 2016
Abstract
The present disclosure is directed toward methods and kits for detecting cancer, and in particular breast cancer, in a subject by measuring the levels of at least one of the identified markers, as compared to a control. The expression of the markers in Table 2A is increased in samples from subjects with cancer as compared to the expression level in subjects without cancer, and the expression of the markers in Table 2B is decreased in samples from subjects with cancer as compared to the expression level in subjects without cancer. The sample may be lacrimal secretions or eye wash fluid, saliva, or other biological fluids. The kits may include an eye wash kit, collection tubes and protease inhibitors, or protein stabilizers.
Description
BACKGROUND

1. Field of Invention


The present application encompasses proteins, and peptide fragments of those proteins produced by proteolytic digestion, that are useful for diagnosing or monitoring for the presence of cancer in an individual.


2. Description of the Related Art


Screening mammograms typically have a sensitivity of 75% and a specificity of around 98%, resulting in a false positive rate of roughly 5% per mammogram (Brown, Houn, Sickles, & Kessler, 1995; Kolb, Lichy, & Newhouse, 2002; Luftner & Possinger, 2002). Follow-up imaging to evaluate false positives costs the US over $4 B, with an additional $1.6 B for biopsies alone. In 2010, of the 1.6 M biopsies performed, as few as 16% (only 261,000) were found to have cancer (Grady, 2012). The answer to improving the diagnostic parameters of imaging can be found in pre- and post-imaging diagnostics, which focus on genetic and proteomic information and, more specifically, biomarkers (Armstrong, Handorf, Chen, & Bristol Demeter, 2013; Li, Zhang, Rosenzweig, Wang, & Chan, 2002).


Tissue and serum are commonly the most logical place to begin biomarker research; however, the large dynamic range of both media makes discovery quite difficult (Schiess, Wollscheid, & Aebersold, 2009). The answers may lie in less complex biological fluids, such as saliva and tears. The use of tears as a diagnostic medium is not a novel application, as the tear proteome has been extensively investigated previously (Böhm et al., 2012; 2011; Lebrecht, Boehm, Schmidt, Koelbl, & Grus, 2009a; Lebrecht et al., 2009b; Wu & Zhang, 2007). In this application, a quantitative assay for the detection of a panel of tear-based biomarkers in response to cancer by triple quadrupole LC mass spectrometry is proposed. From this quantitative information, the framework for a Clinical Laboratory Improvement Amendments (CLIA) protocol will be defined.


SUMMARY

Methods of determining whether a subject has cancer are provided herein. The methods include obtaining a sample from the subject and performing steps for or detecting the level of at least one of the markers provided in Table 2A or Table 2B in the sample. The subject is likely to have cancer if the levels of the markers of Table 2A are increased or if the markers in Table 2B are decreased as compared to the levels in a control sample lacking cancer. The sample is optionally lacrimal secretions, such as an ocular wash, saliva or other bodily fluid.


Kits for performing the methods described herein are also provided. The kits may comprise an eye wash solution and collection materials such as tubes. The tube for collection may comprise a protease inhibitor or other protein stabilizing agent.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a set of photographs of a NuPAGE gel showing the proteins collected from each of the pooled ocular wash samples. The lane numbers correspond to pool numbers, with the even numbers being breast cancer pools and the odd numbers being control pools.



FIG. 2 is a graph comparing the protein expression in cancer and control samples showing increased expression of several proteins in breast cancer samples as compared to controls based on peak intensity as determined by LC-MS/MS.



FIG. 3 is a graph comparing the protein expression in cancer and control samples showing decreased expression of several proteins in breast cancer samples as compared to controls based on peak intensity as determined by LC-MS/MS.





DETAILED DESCRIPTION

Provided herein are proteins and trypsin-produced polypeptides (as defined in Tables 2A and 2B in the Examples) which are shown in the Examples to increase or decrease in biological samples in response to the presence of breast cancer as compared to controls. The trypsin-produced sequences and the full-length amino acid sequences of the proteins identified as being up-regulated and down-regulated in cancer samples are provided in Appendix I and Appendix II, respectively. These proteins and peptides are biomarkers and will be used to determine the disease state of a patient or other subject.


Subjects include humans, domesticated animals such as cats, dogs, cows, pigs or other animals susceptible to cancer. A “patient” indicates a subject who is diagnosed with a disease or with cancer or being tested for having cancer. Thus subject and patient may be used interchangeably herein. The subjects may be suspected of having cancer, in particular breast cancer. The subjects may have an increased risk of developing breast cancer. For example, the subject may be at increased risk of cancer or suspected of having cancer because of a positive mammography result, detection of a lump in the breast, or a positive test for a gene known to increase the risk of cancer such as BRCA, or the subject may already have had a resection, biopsy or other procedure to remove the cancer. The subject may be undergoing or have previously undergone treatment for cancer and the methods and kits herein are used to monitor progression of treatment or alternatively to monitor for recurrence or spread of the cancer. The cancer may be detected as early as stage I or II cancer, but later stages will also be detected.


Also provided herein are methods and kits to collect ocular wash samples for use to determine the expression levels of the identified proteins or polypeptides in lacrimal secretions. In addition, the use of tubes for collection containing protease inhibitor or protein stabilizing agents is covered. The kits further contain buffers or reagents for the elution of breast cancer biomarkers from the eye. The design of devices to collect the applied saline solution from the corner of the exposed ocular surface as well as the packaging of this device together with saline and a pre-prepared sample collection tube are also disclosed.


The methods disclosed herein encompass the use of these breast cancer biomarkers, singly or in multiples, in a CLIA based protocol utilizing a triple quadrupole LC-MS platform, which will be carried out at a centralized laboratory testing facility. The ocular wash samples collected from individuals may be shipped to the testing facility in this embodiment. The identified proteins and their subsequent proteolytic fragments are used for quantitative analysis of diagnostic peptides produced in the triple quad. A threshold value or a relative or actual value in terms of polypeptide concentration directly relating to the polypeptides listed in Tables 2A and 2B can be defined or samples can be compared directly to non-cancerous controls. The quantitative information in report form could be provided to physicians to help in making decisions regarding the pathway of patient care. Physicians may base treatment decisions on these results and the final step may include administration of an appropriate anti-cancer therapeutic to the subject.


In an alternative embodiment, the polypeptides of Tables 2A and 2B may be detected by implementing binding agents (e.g. antibodies, peptoids, coated surfaces) and reagents that accommodate a binding interaction specific to these proteins to produce a reaction which can be quantitated based on production of a detectable signal such as fluorescence, color change, or UV absorbance. Implementing these components in a cartridge with a partnering reading instrument that could be used at point of care is also provided. Binding agents for these proteins and polypeptides may also be used for detection in a lateral flow device. Thus methods of detecting the level of protein expression in the samples using a binding partner such as an antibody may be used to detect the markers provided herein in an immunoassay.


The immunoassay typically includes contacting a test sample with an antibody that specifically binds to or otherwise recognizes a biomarker, and detecting the presence of a complex of the antibody bound to the biomarker in the sample. The immunoassay procedure may be selected from a wide variety of immunoassay procedures known to the art involving recognition of antibody/antigen complexes, including enzyme-linked immunosorbent assays (ELISA), radioimmunoassay (RIA), and Western blots, and use of multiplex assays, including use of antibody arrays, wherein several desired antibodies are placed on a support, such as a glass bead or plate, and reacted or otherwise contacted with the test sample. Such assays are well-known to the skilled artisan.


The detection of the biomarkers described herein in a sample may be performed in a variety of ways. In one embodiment, the method provides the reverse-transcription of complementary DNAs from mRNAs obtained from the sample. Fluorescent dye-labeled complementary RNAs may be transcribed from complementary DNAs which are then hybridized to the arrays of oligonucleotide probes. The fluorescent color generated by hybridization is read by machine, such as an Agilent Scanner and data are obtained and processed using software, such as Agilent Feature Extraction Software (9.1). Such array based methods include microarray analysis to develop a gene expression profile. As used herein, the term “gene expression profile” refers to the expression levels of mRNAs or proteins of a panel of genes in the subject. As used herein, the term “panel of diagnostic genes” refers to a panel of genes whose expression level can be relied on to diagnose or predict the status of the disease. Included in this panel of genes are those listed in Tables 2A and 2B, as well as any combination thereof, as provided herein. In other embodiments, complementary DNAs are reverse-transcribed from mRNAs obtained from the sample, amplified and simultaneously quantified by real-time PCR, thereby enabling both detection and quantification (as absolute number of copies or relative amount when normalized to DNA input or additional normalizing genes) of a specific gene product in the complementary DNA sample as well as the original mRNA sample.


The methods of this invention include detecting at least one biomarker. However, any number of biomarkers may be detected. It is preferred that at least two biomarkers are detected in the analysis. However, it is realized that three, four, or more, including all, of the biomarkers described herein may be utilized in the analysis. Thus, not only can one or more markers be detected, one to 40, preferably two to 40, two to 30, two to 20 biomarkers, two to 10 biomarkers, or some other combination, may be detected and analyzed as described herein. In addition, other biomarkers not herein described may be combined with any of the presently disclosed biomarkers to aid in the diagnosis of cancer. Moreover, any combination of the above biomarkers may be detected in accordance with the present invention.


The markers of Table 2A may be increased at least 2 fold, 4 fold, 5 fold, 8 fold, 10 fold or more relative to the level of the marker in the control sample. The markers of Table 2B are decreased at least 1.5 fold, 2 fold, 3 fold, 4 fold or more relative to the level of the marker in the control sample. The control sample may be a sample from a subject that does not have cancer, a pooled sample from subjects that do not have cancer or may be a control or baseline expression level known to be the average expression level of subjects without cancer.
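The comparison against a control level described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the choice of the smallest quoted thresholds (2 fold for Table 2A, 1.5 fold for Table 2B) as defaults are hypothetical and are not part of the disclosed assay.

```python
def classify_marker(sample_level, control_level, table, min_fold=None):
    """Return True if a marker level is consistent with cancer.

    table is "2A" (marker expected to increase) or "2B" (expected to decrease).
    min_fold defaults to the smallest thresholds quoted in the text
    (assumption: 2-fold for Table 2A, 1.5-fold for Table 2B).
    """
    if table not in ("2A", "2B"):
        raise ValueError("table must be '2A' or '2B'")
    if min_fold is None:
        min_fold = 2.0 if table == "2A" else 1.5

    if table == "2A":
        # No expression in the control: any detectable signal is an increase.
        if control_level == 0:
            return sample_level > 0
        return sample_level / control_level >= min_fold

    # Table 2B: the decrease is expressed as control / sample.
    if sample_level == 0:
        return control_level > 0
    return control_level / sample_level >= min_fold


# Hypothetical levels, in arbitrary intensity units.
print(classify_marker(5.2, 1.0, "2A"))  # increased at least 2-fold
print(classify_marker(1.0, 3.5, "2B"))  # decreased at least 1.5-fold
```

The control level passed in could be a single non-cancer sample, a pooled sample, or a known baseline average, as the paragraph above allows.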


Several terms are used throughout this disclosure and should be defined as commonly used in the art, or as specifically provided herein. As provided herein, mass spectrometry or MS refers to an analytical technique generating electrical or magnetic fields to determine the mass-to-charge ratio of peptides and chemical compounds in order to identify or determine peptide sequence and chemical structures. LC-MS/MS spectrometry refers to an analytical technique combining the separation capabilities of high performance liquid chromatography (HPLC) with the mass analysis of mass spectrometry. Triple quadrupole mass spectrometry refers to a tandem mass spectrometer with three ionizing chambers (Q1, Q2, & Q3). This technique allows for targeted detection of molecules of interest. Ion pairs refers to a parent peptide detected in Q1 in its doubly or triply charged form and a resulting y or b ion as generated by Q2 and detected in Q3 of a triple quadrupole mass spectrometry instrument. SIS internal peptide refers to a synthesized isotopically-labeled peptide with the same sequence as the peptide to be monitored in Q1 and used as an internal standard for reference to quantify the peptide of interest. The −y ion refers to an ion generated from the C-terminal of a peptide fragment. The −b ion refers to an ion generated from the N-terminal of a peptide fragment. Quantitative Ion refers to the selected highest intensity y or b ion used to determine the quantity of its parent protein in a biological sample. Qualitative Ion refers to ion/ions chosen to ensure the integrity of the Qualitative ion to selected protein of interest and labeled peptide to selected standards.
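To make the ion pair definition concrete, the following Python sketch computes the Q1 mass-to-charge ratio of a doubly or triply charged parent peptide and the Q3 values of its singly charged y and b ions from standard monoisotopic masses. The peptide used in the example is a hypothetical placeholder and is not one of the sequences in Appendix I; the function names are likewise illustrative, and unmodified residues are assumed.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per intact peptide or y ion
PROTON = 1.007276   # mass of each charge carrier


def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons (selected in Q1)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def b_ion_mz(peptide, n):
    """Singly charged b ion holding the first n residues (N-terminal side)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide[:n]) + PROTON


def y_ion_mz(peptide, n):
    """Singly charged y ion holding the last n residues (C-terminal side)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER + PROTON


def ion_pairs(peptide, charge=2):
    """List (label, Q1 m/z, Q3 m/z) for every y and b fragment of the peptide."""
    q1 = precursor_mz(peptide, charge)
    pairs = []
    for n in range(1, len(peptide)):
        pairs.append((f"y{n}", q1, y_ion_mz(peptide, n)))
        pairs.append((f"b{n}", q1, b_ion_mz(peptide, n)))
    return pairs


# Hypothetical peptide, for illustration only.
for label, q1, q3 in ion_pairs("PEPTIDEK", charge=2):
    print(f"{label}: Q1 {q1:.4f} -> Q3 {q3:.4f}")
```

In practice the quantitative ion would be picked from such a list by measured intensity, which this sketch does not model.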


CLIA refers to the Clinical Laboratory Improvement Amendments, which are federal regulatory standards that apply to all clinical laboratory testing performed on humans in the United States, except clinical trials and basic research (CLIA related Federal Register and Code of Federal Regulation Announcements). CLIA approved laboratory refers to a clinical lab which performs laboratory testing on human specimens for diagnosis, prevention, or treatment of disease or impairment and is approved and monitored by an FDA approved regulatory organization. CLIA waived test refers to a clinical laboratory test meeting specific criteria for risk, error and complexity as defined by the Food and Drug Administration (FDA).


Point-of-care device refers to an instrument or cartridge available at the location of patient and physician care containing binding agents to a biomarker, or series of biomarkers of interest, and can generate information on the presence, absence, and in some cases concentrations of detected biomarkers. Analyte refers to any measurable biomarker which can be protein, peptide, macromolecule, metabolite, small molecule, or autoantibody. Biological fluid as used herein refers to tears, whole blood, serum, urine, and saliva. Biomarker refers to any substance (e.g. protein, peptide, metabolite, polynucleotide sequence) whose concentration level changes in the body (e.g. increased or decreased) as a result of a disease or condition. Marker and biomarker may be used interchangeably herein.


Lateral flow test refers to a device used to measure the presence of an analyte in a biological fluid using porous paper or sintered polymer. ELISA refers to enzyme-linked immunosorbent assay, which utilizes antibodies to detect the presence and concentration of an analyte of interest. Diagnostic Panel refers to a group of molecules (e.g. proteins or peptides) whose combined concentrations are used to diagnose a disease state (e.g. cancer). A breast cancer marker refers to a molecule (e.g. protein, peptide, metabolite, polynucleotide sequence) whose concentration level in the body changes (e.g. is increased or decreased) as a result of breast cancer.


In addition to being useful to diagnose cancer and in particular breast cancer in a subject, the kits and methods provided herein may be used to monitor treatment or recurrence of cancer in an individual previously diagnosed with cancer. Thus if the levels of the markers in Table 2A begin to rise or the levels of the markers in Table 2B begin to decrease over time in the same subject after treatment, further chemotherapeutics targeting the cancer may be administered. The methods and kits may also be used to monitor the effectiveness of a chemotherapeutic treatment. In this alternative, the levels of the biomarkers in Table 2A would decrease over time if the treatment regime is effective and either would not change or may increase over time if the treatment regime is not effective in a single subject. The levels of the biomarkers in Table 2B would increase over time during treatment with a therapeutic that is effective and would either not change or decrease over time if the treatment regime is not effective in a single subject.
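The monitoring logic in the paragraph above can be written out as a small decision rule. The sketch below is a hedged illustration: the function name and the 10% tolerance used to decide that a level "would not change" are assumptions introduced here, and the comparison uses only the first and last measurement from a single subject.

```python
def treatment_appears_effective(levels, table, tolerance=0.10):
    """Judge a treatment regime from serial marker levels in one subject.

    levels: chronological measurements of a single marker (first value > 0).
    table: "2A" (marker increased in cancer) or "2B" (marker decreased).
    tolerance: hypothetical relative change treated as "no change".
    """
    if table not in ("2A", "2B"):
        raise ValueError("table must be '2A' or '2B'")
    if len(levels) < 2 or levels[0] <= 0:
        raise ValueError("need at least two measurements and a positive baseline")

    relative_change = (levels[-1] - levels[0]) / levels[0]
    if table == "2A":
        # Table 2A markers should fall when the treatment is effective.
        return relative_change < -tolerance
    # Table 2B markers should rise when the treatment is effective.
    return relative_change > tolerance


# Hypothetical serial measurements in arbitrary units.
print(treatment_appears_effective([8.0, 6.1, 3.9], "2A"))
print(treatment_appears_effective([2.0, 2.1, 2.0], "2B"))
```

A level that stays within the tolerance is reported as not effective, matching the text's statement that an unchanged level indicates an ineffective regime.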


Treating cancer includes, but is not limited to, reducing the number of cancer cells or the size of a tumor or mass in the subject, reducing progression of a cancer to a more aggressive form, reducing proliferation of cancer cells or reducing the speed of tumor growth, killing of cancer cells, reducing metastasis of cancer cells or reducing the likelihood of recurrence of a cancer in a subject. Treating a subject as used herein refers to any type of treatment that imparts a benefit to a subject afflicted with a disease or at risk of developing the disease, including improvement in the condition of the subject (e.g., in one or more symptoms), delay in the progression of the disease, delay the onset of symptoms or slow the progression of symptoms, etc.


The present disclosure is not limited to the specific details of construction, arrangement of components, or method steps set forth herein. The compositions and methods disclosed herein are capable of being made, practiced, used, carried out and/or formed in various ways that will be apparent to one of skill in the art in light of the disclosure that follows. The phraseology and terminology used herein is for the purpose of description only and should not be regarded as limiting to the scope of the claims. Ordinal indicators, such as first, second, and third, as used in the description and the claims to refer to various structures or method steps, are not meant to be construed to indicate any specific structures or steps, or any particular order or configuration to such structures or steps. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to facilitate the disclosure and does not imply any limitation on the scope of the disclosure unless otherwise claimed. No language in the specification, and no structures shown in the drawings, should be construed as indicating that any non-claimed element is essential to the practice of the disclosed subject matter. The use herein of the terms “including,” “comprising,” or “having,” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof, as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements.


Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this disclosure. Use of the word “about” to describe a particular recited amount or range of amounts is meant to indicate that values very near to the recited amount are included in that amount, such as values that could or naturally would be accounted for due to manufacturing tolerances, instrument and human error in forming measurements, and the like. All percentages referring to amounts are by weight unless indicated otherwise.


No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between any definitions and/or description found in the cited references.


The following examples are meant only to be illustrative and are not meant as limitations on the scope of the invention or of the appended claims.


EXAMPLES
Example 1
Methods for Collecting Ocular Wash Samples

This study was carried out under institutional review board approval and participants were recruited at two clinics based in Arkansas, The Breast Center and Highlands Oncology Group, as well as two clinics based in Washington, PeaceHealth Southwest and PeaceHealth Longview Surgery Center. The inclusion/exclusion criteria used by the clinics for patient selection are given in Table 1.









TABLE 1

Inclusion/Exclusion Criteria for participant selection

Inclusion criteria. Individuals who are:
  • Between 18 and 100 years of age
  • Presenting for a routine check up
  • Presenting for the evaluation of an abnormal exam or test (mammogram, ultrasound, MRI, PET, etc.)
  • Presenting for the evaluation of a palpable lump or mass
  • Presenting with a mass, pre or post biopsy, as long as a portion of the mass is remaining
  • Currently have or are in treatment for breast cancer

Exclusion criteria. Individuals who are:
  • <18 years of age or >100 years of age
  • Experiencing a concurrent eye infection or trauma
  • Currently experiencing acute conjunctivitis
  • Known to have abnormal production of tears (too much or too little)









Ocular wash samples were obtained by rinsing the exposed surface of the eye with Optics Laboratory single use Eye-Cept Rewetting drops. The single use dropper, selected to eliminate contamination, was used to apply approximately five drops of rewetting saline to the outside corner of the eye. After application, the solution naturally flowed across the surface of the eye and pooled in the inner corner/duct next to the nose. The solution was then collected by suction using a one mL tuberculin syringe, with no needle attached, and transferred to a pre-labeled 0.5 mL tube with an o-ring screw top cap. The optimal total volume from each collection is approximately 100 μL; however, actual volumes can vary. Samples were stored between −20° C. and −80° C. (depending on the freezer unit available) within two hours of collection.


Samples collected at participating clinics were retrieved by Ascendant personnel on a weekly basis and transferred on dry ice to Ascendant's laboratory facility. In the case of the Washington based clinics, samples were shipped to Ascendant on dry ice on a monthly basis.


Data collected from the participants included: sex, race, age, previous cancer history, family history of breast cancer, stage of current cancer (I, II, III, IV) tumor size, breast cancer subtype (Ductal Carcinoma In Situ, Invasive Ductal Carcinoma, Invasive Lobular Carcinoma, Lobular Carcinoma In Situ, and Unknown) and tumor grade. A spreadsheet was created to track data and stratify samples based on selected criteria.


Control samples were collected, using the procedure detailed above, from volunteers between the ages of 18 and 100 who reported they were cancer and mass free as per the inclusion criteria outlined in the IRB approved collection protocol. Exclusion criteria are the same as for the breast cancer patients. All control participants were recruited from the general population; consent and sample collection were performed by Ascendant Diagnostics personnel. Data collected from control participants included: sex, age, race, previous history of breast cancer, family history of breast cancer, and current medications.


All samples in the tear bank were stored at −80° C. and freeze-thaw cycles were limited to three, as protein degradation was observed after three freeze-thaw cycles. In some cases samples were aliquoted to minimize freeze-thaw cycles further.


Example 2
Methods for Preparation of Sample Pools for LC MS/MS

Eight pooled samples (four breast cancer pools and four control pools), each with a total volume of 300 μL, were assembled from banked tear samples for the purpose of label free quantitation using in-gel digestion. All breast cancer ocular wash samples used were taken from individuals with stage I & II breast cancer and were collected prior to treatment. Controls were age matched for accuracy of comparison.


To ensure sample integrity, MALDI-TOF data was collected on aliquots from each of the individual samples that were included in the pooled samples. Prior to MALDI testing, tear samples were purified using ZipTip C18. This procedure serves to remove any contaminants which may be present in the sample and to concentrate the proteins in order to increase ease of detection. A 15 μL aliquot was removed from the freezer and thawed at room temperature (˜22° C.) for 10 minutes. The protocol for ZipTip C18 was adapted from the user manual supplied by Millipore and a variable pipette with a total volume capacity of 10 μL was used for all sample preparations. The ZipTip C18 was equilibrated in a wetting solution of acetonitrile (ACN) (0.1% TFA) for 10 cycles (1 cycle involves aspirating 10 μL of solution into the tip and dispensing). Following equilibration, the tip was washed with ddH2O (0.1% TFA) for 10 cycles. The sample was then loaded for 10 cycles, followed by a wash with ddH2O (0.1% TFA) for 10 cycles. The load procedure, followed by the wash procedure, was carried out a total of five times to ensure maximum protein binding. Bound proteins were eluted in 5 μL of ACN (0.1% TFA) for 20 cycles into a clean tube. The ACN (0.1% TFA) was removed using an Eppendorf Vacufuge plus for 10 minutes at 45° C. Samples were then reconstituted in 5 μL ddH2O (0.1% TFA) and spotted onto a ground steel MALDI target. Each sample was spotted a total of three times at 1 μL each time, allowing complete drying of the spot before more material was added. After the final spotting was completely dry, 1 μL of a saturated solution of 40 mg of sinapinic acid matrix prepared in 1 mL of a 50:50 solution of ACN/ddH2O (0.1% TFA) was spotted onto each sample and all samples were allowed to dry completely on the bench top prior to data collection. One microliter of protein standard was added to several locations on the MALDI target as well. The protein standard was spotted only once, followed by addition of the sinapinic acid matrix used for the OW samples.


Data was collected on a Bruker Reflex III MALDI-TOF mass spectrometer in its linear positive mode, as linear mode increases the sensitivity. Acquisition of all spectra was performed both manually and automatically (user unbiased acquisition) using Bruker Daltonics flexControl software. For each spot, MALDI-TOF mass spectra were acquired at least three times, with a total of 200 laser shots accumulated for each run. Shot accumulation was programmed using a fuzzy logic operator to only consider spectra with S/N better than 20 between m/z 2,000 and 45,000. Sample integrity was evaluated by visual inspection of the generated MALDI-TOF spectrum. High mass peak splitting together with an increased quantity of low mass peaks was taken to indicate that protein degradation had occurred, and such samples were not used further.


Total protein content of each pool was determined using a bicinchoninic acid (BCA) protein assay kit with a 1:20 (v/v) ratio of standard and unknown to working reagent and an incubation time of 30 min at 37° C. To ensure reliable total protein content calculation, a series of dilutions was made for each sample (i.e. 1:2, 1:4, 1:6) and all dilutions were plated in triplicate. A standard curve using diluted albumin (2 mg/ml, 1.5 mg/ml, 1 mg/ml, 0.75 mg/ml, 0.5 mg/ml, 0.25 mg/ml, 0.125 mg/ml, 0.025 mg/ml and 0 mg/ml) was generated and blank subtraction was applied to all standards and unknowns. The protein concentration for each unknown was calculated using a four-parameter fit of the standard curve. Concentrations were multiplied by the dilution factor and averaged to give an accurate total protein content calculation. Assays were only considered valid if the coefficient of variation (% CV) was 15% or below.
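The back-calculation described above (blank subtraction, four-parameter curve, dilution factor, averaging, and the % CV check) might look like the following Python sketch. It assumes the four-parameter logistic form commonly used for such curves and takes the fitted parameters as given rather than fitting them; the parameter values, absorbances, and function names are all hypothetical, and applying the 15% limit to each set of triplicate wells is an assumption, since the text does not say which values the % CV was computed over.

```python
from statistics import mean, stdev


def four_pl(conc, a, b, c, d):
    """Four-parameter logistic curve: absorbance as a function of concentration.

    a = response at zero concentration, d = response at saturation,
    c = inflection point, b = slope factor.
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(absorbance, a, b, c, d):
    """Concentration that gives `absorbance` on the fitted curve."""
    return c * ((a - d) / (absorbance - d) - 1.0) ** (1.0 / b)


def percent_cv(values):
    """Coefficient of variation of replicate readings, in percent."""
    return 100.0 * stdev(values) / mean(values)


def total_protein(replicates_by_dilution, blank, params, max_cv=15.0):
    """Average undiluted concentration across all dilutions of one sample.

    replicates_by_dilution maps a dilution factor (2 for 1:2, and so on)
    to the raw absorbances of its triplicate wells.
    """
    estimates = []
    for dilution_factor, absorbances in replicates_by_dilution.items():
        if percent_cv(absorbances) > max_cv:
            raise ValueError(f"replicates at 1:{dilution_factor} exceed {max_cv}% CV")
        corrected = mean(absorbances) - blank           # blank subtraction
        diluted_conc = inverse_four_pl(corrected, *params)
        estimates.append(diluted_conc * dilution_factor)
    return mean(estimates)


# Hypothetical fitted parameters (a, b, c, d) and readings.
params = (0.0, 1.0, 1.0, 3.0)
wells = {2: [1.61, 1.60, 1.59], 4: [1.10, 1.11, 1.09]}
print(f"{total_protein(wells, blank=0.1, params=params):.2f} mg/ml")
```

Fitting the four parameters to the albumin standards would normally be done with a curve-fitting routine, which is left out here to keep the sketch dependency free.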


Using the total protein content determined by BCA, 25 μg of protein from each pool was loaded onto a NuPAGE Bis-Tris 4-12% gradient separation gel and run using methods standard for an individual skilled in the art, as shown in FIG. 1. Following separation of the ocular proteome, between 20 and 22 slices were cut for each lane and subjected to disulfide reduction using dithiothreitol, followed by sulfhydryl alkylation using iodoacetamide, and finally trypsin digestion. Specific slice counts for each sample were as follows: Lane 1=20 slices, Lane 2=21 slices, Lane 3=22 slices, Lane 4=21 slices, Lane 5=20 slices, Lane 6=20 slices, Lane 7=21 slices, Lane 8=21 slices.
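As a worked example of the loading step, the volume of a pool needed to deliver 25 μg follows directly from its BCA concentration, since 1 mg/ml equals 1 μg/μL. The short Python sketch below uses a hypothetical concentration; the function name and the check against the 300 μL pool volume from Example 2 are illustrative additions.

```python
def load_volume_ul(conc_mg_per_ml, target_ug=25.0, available_ul=300.0):
    """Volume (in microliters) of a pool that contains `target_ug` of protein."""
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    volume = target_ug / conc_mg_per_ml  # 1 mg/ml is the same as 1 ug/uL
    if volume > available_ul:
        raise ValueError("pool volume is too small to supply the target mass")
    return volume


# A hypothetical pool measured at 1.25 mg/ml needs 20 uL for a 25 ug load.
print(load_volume_ul(1.25))
```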


Example 3
Methods and Results for Label Free Quantitation by LC MS/MS

Twenty μL from each trypsin digestion reaction was loaded onto a nanoAcquity UPLC (Waters) and eluted using a gradient from 3-99% 0.1% formic acid, 75% acetonitrile over 30 minutes. A LTQ Orbitrap Velos (Thermo Scientific) was used for detection of the peptides produced by proteolytic cleavage. Raw data files from the LC-MS/MS analysis were uploaded into the MASCOT database for protein identification using the UniProtKB database, 2 ppm peptide mass tolerance, and 0.5 Da fragment mass tolerance. The output from MASCOT was then uploaded into the software packages Scaffold and MaxQuant for analysis.


Greater than 700 protein hits were identified using this method. In order to isolate potential biomarker candidates, peak intensities for each group (cancer and control) were averaged for each protein and fold change was determined with respect to cancer. In addition, a Student's t-test was applied to each protein, providing a p-value. All proteins with a fold change of greater than 1.5 and a p-value <0.05 were considered as possible biomarker candidates. P-values and fold changes were assessed on a case by case basis and some proteins with higher p-values were included in the candidate biomarkers list. The list was then narrowed based on biological relevance to breast cancer, other cancer subtypes, and cancer processes. The complete list of candidate biomarkers is given in Tables 2A and 2B and shown in graphic form in FIGS. 2 and 3.
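The initial screen described above (average per group, fold change, Student's t-test, cutoffs of 1.5 fold and p < 0.05) can be sketched as follows. This is an illustration only: it assumes a pooled-variance two-sample t-test on four cancer pools and four control pools, and instead of computing a p-value it compares the t statistic with 2.447, the two-tailed critical value for p = 0.05 at 6 degrees of freedom. The intensities and function names are hypothetical, and the case by case inclusion of higher p-values mentioned in the text is not modeled.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value for alpha = 0.05 with 6 degrees of freedom
# (four cancer pools plus four control pools, minus two).
T_CRIT_DF6 = 2.447


def student_t(group_a, group_b):
    """Pooled-variance (Student's) two-sample t statistic."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))


def screen_protein(cancer, control, min_fold=1.5):
    """Apply the fold-change and significance cutoffs to one protein."""
    if len(cancer) + len(control) - 2 != 6:
        raise ValueError("T_CRIT_DF6 only applies to 6 degrees of freedom")
    mean_cancer, mean_control = mean(cancer), mean(control)
    if mean_cancer >= mean_control:
        direction = "up"      # Table 2A style: fold = cancer / control
        fold = mean_cancer / mean_control if mean_control > 0 else float("inf")
    else:
        direction = "down"    # Table 2B style: fold = control / cancer
        fold = mean_control / mean_cancer if mean_cancer > 0 else float("inf")
    t_stat = student_t(cancer, control)
    candidate = fold > min_fold and abs(t_stat) > T_CRIT_DF6
    return {"direction": direction, "fold": fold, "t": t_stat, "candidate": candidate}


# Hypothetical peak intensities for one protein across the eight pools.
print(screen_protein(cancer=[10, 12, 11, 13], control=[4, 5, 6, 5]))
```

An infinite fold change here corresponds to the "No expression in control" entries of Table 2A.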









TABLE 2A

Biomarkers with an increased expression in cancer as compared to control samples.

Protein ID                          P-Value   Fold Change
CLEC3B                              0.067     No expression in control
KLK8                                0.07      No expression in control
C8A                                 0.149     No expression in control
HRC                                 0.17      No expression in control
KLK13                               0.178     No expression in control
C7                                  0.207     No expression in control
ALDH1A1                             0.24      No expression in control
APOL1                               0.32      No expression in control
MUC-1                               0.27      40.6
BLMH                                0.212     38.1
SPRR1B                              0.117     35.1
SERPINB2                            0.11      16.1
Putative uncharacterized protein    0.165     11.7
RAB-30                              0.153     11.3
C4A                                 0.099     9.6
PRDX6                               0.14      7.6
CFHR1                               0.169     7.4
A1BG                                0.11      7.2
GGH                                 0.14      7.1
EZR                                 0.066     6.3
SERPINF2                            0.16      5.9
HPX                                 0.1       5.5
CRISP3                              0.0238    5.2
CPA4                                0.14      4.8
PGLYRP2                             0.06      3.9
CASP14                              0.068     3.3
Ig Kappa Chain V-III region POM     0.001     2.6
ALB                                 0.014     2.4
CFH                                 0.042     2.1
SLC34A2                             0.105     29.3
















TABLE 2B

Biomarkers with a decrease in expression in cancer samples as compared to controls

Protein ID    P-value   Fold Change
GAS6          0.045     3.5
CTSL1         0.051     3.4
SFRP1         0.059     3.4
BPI           0.045     2.5
CHID1         0.0546    2.2
MSN           0.0545    2.06
ERAP1         0.014     1.6
QPCT          0.045     1.6
ATRN          0.062     1.6
LTF           0.051     1.5










To further confirm protein identity, the peptide sequences produced by trypsin digestion were mapped back to the original protein sequence. Trypsin products unique to particular proteins were noted, as these sequences have the potential to be used as diagnostic peptides as well as isotopically labeled standards in the final CLIA triple quadrupole mass spectrometry assay. The sequences of the trypsin products and the full-length protein markers identified in Tables 2A and 2B are provided in Appendix I and Appendix II, respectively.
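Mapping a tryptic peptide back to its parent protein amounts to a substring search that reports residue positions. The sketch below illustrates this with the Cornifin-B sequence and its two trypsin fragments as listed in Appendix I. The function names are illustrative, and the digestion rule used (cleave after K or R unless the next residue is P) is the common textbook rule, assumed here rather than taken from the source.

```python
import re

# Cornifin-B (SPRR1B) sequence as listed in Appendix I, in blocks of ten.
CORNIFIN_B = (
    "MSSQQQKQPC" "TPPPQLQQQQ" "VKQPCQPPPQ" "EPCIPKTKEP"
    "CHPKVPEPCH" "PKVPEPCQPK" "VPEPCHPKVP" "EPCPSIVTPA"
    "PAQQKTKQK"
)


def tryptic_digest(sequence):
    """Split a protein after each K or R, except when followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def locate(peptide, sequence):
    """Return 1-based (start, end) positions of every occurrence of peptide."""
    hits = []
    start = sequence.find(peptide)
    while start != -1:
        hits.append((start + 1, start + len(peptide)))
        start = sequence.find(peptide, start + 1)
    return hits


def is_unique(peptide, proteins):
    """True if the peptide occurs in exactly one protein of the mapping."""
    return sum(peptide in seq for seq in proteins.values()) == 1


for fragment in ("QPCTPPPQLQQQQVK", "VPEPCPSIVTPAPAQQK"):
    print(fragment, locate(fragment, CORNIFIN_B))
```

Peptides that contain an internal K or R (missed cleavages) do not appear in the output of `tryptic_digest`, but `locate` still finds them because it searches the intact sequence. A uniqueness check is only as good as the set of protein sequences it is given.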


Example 4
Methods for Schirmer Strip Collections and Processing

Institutional review board approval was obtained for the collection of tears using Schirmer strips. For collection, the rounded tip of the Schirmer strip was folded over at the 0 mm line, forming a lip. The folded portion was placed in the lower eyelid of the participant, who was asked to close the eye and keep it closed for 5 minutes. After 5 minutes, the strip was removed, placed in a sterile, pre-labeled 1.5 mL snap top tube, and stored at −20° C. or −80° C., depending on availability. Collection criteria stated that if the 35 mm mark was reached before 5 minutes had elapsed, the strip could be removed.


Data collected from participants included the following: age, sex, race, current use of birth control or hormone replacement therapy, ophthalmological infections, current or recent chemotherapy treatments, family history of cancer, genetic testing (BRCA1/2) if available, cancer stage, cancer type, hormone receptor status, size of mass, tumor grade, and previous history of cancer. A spreadsheet was constructed to house this information and allow for sample stratification based on desired characteristics. Sample total protein content was also entered into the database.
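Stratification of this kind can be pictured as filtering participant records on chosen fields. The following is a small sketch under stated assumptions: the record layout, field names, and sample values are hypothetical, since the source lists the data collected but not how the spreadsheet was organized.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # Illustrative subset of the collected fields; names are assumptions.
    sample_id: str
    age: int
    sex: str
    cancer_stage: Optional[str]  # None for a control participant
    hormone_therapy: bool
    total_protein_ug: float


def stratify(records, **criteria):
    """Return the records whose fields equal every requested value."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]


# Hypothetical records, not data from the study.
records = [
    Participant("S001", 54, "F", "II", False, 310.0),
    Participant("S002", 47, "F", None, True, 280.5),
    Participant("S003", 61, "F", "II", True, 295.2),
]

stage_two_no_hrt = stratify(records, cancer_stage="II", hormone_therapy=False)
print([r.sample_id for r in stage_two_no_hrt])
```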


To elute the proteins bound to the Schirmer strip, the strips were first diced and placed in a clean sterile 1.5 mL snap top tube. 200 μL of 1×PBS was added to the diced strip and the sample was incubated at 4° C. with mild shaking overnight. Following elution, the samples were spun briefly to collect the strip fragments at the bottom of the tube, and the supernatant was transferred to a new clean 1.5 mL snap top tube. Total protein content was determined using BCA assay, as described above, and the samples were stored at −80° C. until further use.


REFERENCES



  • Armstrong, K., Handorf, E. A., Chen, J., & Bristol Demeter, M. N. (2013). Breast cancer risk prediction and mammography biopsy decisions: a model-based study. American Journal of Preventive Medicine, 44(1), 15-22. doi:10.1016/j.amepre.2012.10.002

  • Böhm, D., Keller, K., Pieter, J., Boehm, N., Wolters, D., Siggelkow, W., et al. (2012). Comparison of tear protein levels in breast cancer patients and healthy controls using a de novo proteomic approach. Oncology Reports, 28(2), 429-438. doi: 10.3892/or.2012.1849

  • Böhm, D., Keller, K., Wehrwein, N., Lebrecht, A., Schmidt, M., Kölbl, H., & Grus, F.-H. (2011). Serum proteome profiling of primary breast cancer indicates a specific biomarker profile. Oncology Reports, 26(5), 1051-1056. doi:10.3892/or.2011.1420

  • Brown, M. L., Houn, F., Sickles, E. A., & Kessler, L. G. (1995). Screening Mammography in Community Practice: Positive Predictive. American Journal of Radiology, 165, 1373-1377.

  • Grady, D. (2012). Study of Breast Biopsies Finds Surgery Used Too Extensively. New York Times, 1-4.

  • Kolb, T., Lichy, J., & Newhouse, J. (2002). Comparison of the performance of screening mammography, physical examination, and breast US and evaluation of factors that influence them: an analysis of 27,825 patient evaluations. Radiology, 225(1), 165-175.

  • Lebrecht, A., Boehm, D., Schmidt, M., Koelbl, H., & Grus, F. H. (2009a). Surface-enhanced Laser Desorption/Ionisation Time-of-flight Mass Spectrometry to Detect Breast Cancer Markers in Tears and Serum. Cancer Genomics & Proteomics, 6(2), 75-83.

  • Lebrecht, A., Boehm, D., Schmidt, M., Koelbl, H., Schwirz, R. L., & Grus, F. H. (2009b). Diagnosis of breast cancer by tear proteomic pattern. Cancer Genomics & Proteomics, 6(3), 177-182.

  • Li, J., Zhang, Z., Rosenzweig, J., Wang, Y., & Chan, D. (2002). Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem, 48(8), 1296-1304.

  • Luftner, D., & Possinger, K. (2002). Nuclear matrix proteins as biomarkers for breast cancer. Expert Rev Mol Diagn, 2(1), 23-31. doi:10.1586/14737159.2.1.23

  • Schiess, R., Wollscheid, B., & Aebersold, R. (2009). Targeted proteomic strategy for clinical biomarker discovery. Molecular Oncology, 3(1), 33-44. doi:10.1016/j.molonc.2008.12.001

  • Wu, K., & Zhang, Y. (2007). Clinical application of tear proteomics: Present and future prospects. Proteomics. Clinical Applications, 1(9), 972-982. doi: 10.1002/prca.200700125



APPENDIX I

Sequences shown to be upregulated in subjects with cancer:

















Protein Name                          Uniprot     IPI             Gene Name
Complement C4A                        P0C0L4      IPI00032258     C4A_Human

Trypsin Fragments

 1. AEFQDALEK
 2. DHAVDLIQK
 3. DKGQAGLQR
 4. EMSGSPASGIPVK
 5. FACYYPR
 6. FGLLDEDGKK
 7. GHLFLQTDQPIYNPGQR
 8. GLCVATPVQLR
 9. GIQDEDGYR
10. GPEVQUVAHSPWLK
11. GSFEFPVGDAVSK
12. HLVPGAPFLLQALVR
13. LLATLCSAEVCQCAEGK
14. LNMGITDLQGLR
15. ITQVLHFTK
16. NVNFQK
17. QGSFQGGFR
18. SCGLHQLLR
19. VDFTLSSER
20. VDVQAGACEGK
21. VFALDQK
22. VGDTININIR
23. VLSLAQEQVGGSPEK
24. VTASDPLDTLGSEGALSPGGVASLLR
25. YLDKTEQWSTLPPETK










10         20         30         40


MRLLWGLIWA SSFFTLSLQK PRLLLFSPSV VHLGVPLSVG





50         60         70         80


VQLQDVPRGQ VVKGSVFLRN PSRNNVPCSP K(15)VDFTLSSER





90         100        110        120


DFALLSLQVP LKDAK(18)SCGLH QLLR(10)GPEVQL VAHSPWLKDS





130        140        150        160


LSRTTNIQGI NLLFSSRR(7)GH LFLQTDQPIY NPGQRVRYR(21)V





170        180        190        200



FALDQKMRPS TDTITVMVEN SHGLRVRKKE VYMPSSIFQD






210        220        230        240


DFVIPDISEP GTWKISARFS DGLESNSSTQ FEVKKYVLPN





250        260        270        280


FEVKITPGKP YILTVPGHLD EMQLDIQARY IYGKPVQGVA





290        300        310        320


YVR(6)FGLLDED GKKTFFRGLE SQTKLVNGQS HISLSK(2)AEFQ





330        340        350        360



DALEK(14)LNMGI TDLQGLRLYV AAAIIESPGG EMEEAELTSW






370        380        390        400


YFVSSPFSLD LSKTKR(12)HLVP GAPFLLQALV R(4)EMSGSPASG





410        420        430        440



IPVKVSATVS SPGSVPEVQD IQQNTDGSGQ VSIPIIIPQT






450        460        470        480


ISELQLSVSA GSPHPAIARL TVAAPPSGGP GFLSIERPDS





490        500        510        520


RPPR(22)VGDTLN LNLRAVGSGA TFSHYYMIL SRGQIVFMNR





530        540        550        560


EPKRTLTSVS VFVDHHLAPS FYFVAFYYHG DHPVANSLR(20)V





570        580        590        600



DVQAGACEGK LELSVDGAKQ YRNGESVKLH LETDSLALVA






610        620        630        640


LGALDTALYA AGSKSHKPLN MGKVFEAMNS YDLGCGPGGG





650        660        670        680


DSALQVFQAA GLAFSDGDQW TLSRKRLSCP KEKTTRKKR(16)N





690        700        710        720



VNFQKAINEK LGQYASPTAK RCCQDGVTRL PMMRSCEQRA






730        740        750        760


ARVQQPDCRE PFLSCCQFAE SLRKKSR(3)DKG QAGLQRALEI





770        780        790        800


LQEEDLIDED DIPVRSFFFE NWLWRVETVD RFQILTLWLP





810        820        830        840


DSLTTWEIHG LSLSKTK(8)GLC VATPVQLRVF REFHLHLRLP





850        860        870        880


MSVRRFEQLE LRPVLYNYLD KNLTVSVHVS PVEGLCLAGG





890        900        910        920


GGLAQQVLVP AGSARPVAFS VVPTAAAAVS LKVVAR(11)GSFE





930        940        950        960



FPVGDAVSKV LQIEKEGAIH REELVYELNP LDHRGRTLEI






970        980        990        1000


PGNSDPNMIP DGDFNSYVR(24)V TASDPLDTLG SEGALSPGGV





1010       1020       1030       1040



ASLLRLPRGC GEQTMIYLAP TLAASR(25)YLDK TEQWSTLPPE






1050       1060       1070       1080



TK(2)DHAVDLIQ KGYMRIQQFR KADGSYAAWL SRDSSTWLTA






1090       1100       1110       1120


FVLK(23)VLSLAQ EQVGGSPEKL QETSNWLLSQ QQADGSFQDP





1130       1140       1150       1160


CPVLDRSMQG GLVGNDETVA LTAFVTIALH HGLAVFQDEG





1170       1180       1190       1200


AEPLKQRVEA SISKANSFLG EKASAGLLGA HAAAITAYAL





1210       1220       1230       1240


TLTKAPVDLL GVAHNNLMAM AQETGDNLYW GSVTGSQSNA





1250       1260       1270       1280


VSPTPAPRNP SDPMPQAPAL WIETTAYALL HLLLHEGKAE





1290       1300       1310       1320


MADQASAWLT R(17)QGSFQGGFR STQDTVIALD ALSAYWIASH





1330       1340       1350       1360


TTEERGLNVT LSSTGRNGFK SHALQLNNRQ IRGLEEELQF





1370       1380       1390       1400


SLGSKINVKV GGNSKGTLKV LRTYNVLDMK NTTCQDLQIE





1410       1420       1430       1440


VTVKGHVEYT MEANEDYEDY EYDELPAKDD PDAPLQPVTP





1450       1460       1470       1480


LQLFEGRRNR RRREAPKVVE EQESRVHYTV CIWRNGKVGL





1490       1500       1510       1520


SGMAIADVTL LSGFHALRAD LEKLTSLSDR YVSHFETEGP





1530       1540       1550       1560


HVLLYFDSVP TSRECVGFEA VQEVPVGLVQ PASATLYDYY





1570       1580       1590       1600


NPERRCSVFY GAPSKSR(13)LLA TLCSAEVCQC AEGKCPRQRR





1610       1620       1630       1640


ALER(9)GLQDED GYRMK(5)FACYY PRVEYGFQVK VLREDSRAAF





1650       1660       1670       1680


RLFETK(18)ITQV LHFTKDVKAA ANQMRNFLVR ASCRLRLEPG





1690       1700       1710       1720


KEYLIMGLDG ATYDLEGHPQ YLLDSNSWIE EMPSERLCRS





1730       1740


TRQRAACAQL NDFLQEYGTQ GCQV













Protein Name                          Uniprot     IPI             Gene Name
Histidine Rich Protein                P04196      IPI00022371     HRG_Human

Trypsin fragments

 1. YKEENDDFASFR
 2. ADLFYDVEALDLESPK










10         20         30         40


MKALIAALLL ITLQYSCAVS PTDCSAVEPE AEKALDLINK





50         60         70         80


RRRDGYLFQL LRIADAHLDR VENTTVYYLV LDVQESDCSV





90         100        110        120


LSRKYWNDCE PPDSRRPSEI VIGQCKVIAT RHSHESQDLR





130        140        150        160


VIDFNCTTSS VSSALANTKD SPVLIDFFED TERYRKQANK





170        180        190        200


ALEK(1)YKEEND DFASFRVDRI ERVARVRGGE GTGYFVDFSV





210        220        230       240


RNCPRHHFPR HPNVFGFCR(2)A DLFYDVEALD LESPKNLVIN





250        260        270        280


CEVFDPQEHE NINGVPPHLG HPFHWGGHER SSTTKPPFKP 





290        300        310        320


HGSRDHHHPH KPHEHGPPPP PDERDHSHGP PLPQGPPPLL





330        340        350        360


PMSCSSCQHA TFGTNGAQRH SHNNNSSDLH PHKHHSHEQH





370        380        390        400


PHGHHPHAHH PHEHDTHRQH PHGHHPHGHH PHGHHPHGHH





410        420        430        440


PHGHHPHCHD FQDYGPCDPP PHNQGHCCHG HGPPPGHLRR





450        460        470        480


RGPGKGPRPF HCRQIGSVYR LPPLRKGEVL PLPEANPPSF





490        500        510        520


PLPHHKHPLK PDNQPFPQSV SESCPGKFKS GFPQVSMFFT





HTFPK













Protein Name                                                Uniprot     IPI              Gene Name
C-type lectin domain family 3, member B (Tetranectin)       P05452      IPI00009028.2    CLEC3B_Human

Trypsin fragments

 1. EQQALQTVCLK
 2. TFHEASEDCISR










10         20         30         40


MELWGAYLLL CLFSLLTQVT TEPPTQKPKK IVNAKKDVVN





50         60         70         80


TKMFEELKSR LDTLAQEVAL LK(1)EQQALQTV CLKGTKVHMK





90         100        110        120


CFLALTQTK(2)T FHEASEDCIS RGGTLGTPQT GSENDALYEY





130        140        150        160


LRQSVGNEAE IWLGLNDMAA EGTWVDMTGA RIAYKNWETE





170        180        190        200


ITAQPDGGKT ENCAVLSGAA NGKWFDKRCR DQLPYICQFG





IV













Protein Name                          Uniprot     IPI             Gene Name
Kallikrein-8 isoform 2                O60259-2    IPI00219892     KLK8_Human

Trypsin fragments

 1. ENFPDTLNCAEVK










10         20         30         40


MGRPRPRAAK TWMFLLLLGG AWAGHSRAQE DKVLGGHECQ





50         60         70         80


PHSQPWQAAL FQGQQLLCGG VLVGGNWVLT AAHCKKPKYT





90         100        110        120


VRLGDHSLQN KDGPEQEIPV VQSIPHPCYN SSDVEDHNHD





130        140        150        160


LMLLQLRDQA SLGSKVKPIS LADHCTQPGQ KCTVSGWGTV





170        180        190        200


TSPR(3)ENFPDT LNCAEVKIFP QKKCEDAYPG QITDGMVCAG





210        220        230        240


SSKGADTCQG DSGGPLVCDG ALQGITSWGS DPCGRSDKPG





250        260


VYTNICRYLD WIKKIIGSKG













Protein Name                          Uniprot     IPI             Gene Name
Complement Component 8 alpha          P07357      IPI00011252     C8A_Human

Trypsin fragments

 1. AIDEDCSQYEPIPGSQK
 2. LGSLGAACEQTQTEGAK
 3. QAQCGQDFQCK










10         20         30         40


MFAVVFFILS LMTCQPGVTA QEKVNQRVRR AATPAAVTCQ





50         60         70         80


LSNWSEWTDC FPCQDKKYRH RSLLQPNKFG GTICSGDIWD





90         100        110        120


QASCSSSTTC VR(3)QAQCGQDF QCKETGRCLK RHLVCNGDQD





130        140        150        160


CLDGSDEDDC EDVR(1)AIDEDC SQYEPIPGSQ KAALGYNILT





170        180        190        200


QEDAQSVYDA SYYGGQCETV YNGEWRELRY DSTCERLYYG





210        220        230        240


DDEKYFRKPY NFLKYHFEAL ADTGISSEFY DNANDLLSKV





250        260        270        280


KKDKSDSFGV TIGIGPAGSP LLVGVGVSHS QDTSFLNELN





290        300        310        320


WYNEKKFIFT RIFTKVQTAH FKMRKDDIML DEGMLQSLME





330        340        350        360


LPDQYNYGMY AKFINDYGTH YITSGSMGGI YEYILVIDKA





370        380        390        400


KMESLGITSR DITTCFGGSL GIQYEDKINV GGGLSGDHCK





410        420        430        440


KFGGGKTERA RKAMAVEDII SRVRGGSSGW SGGLAQNRST





450        460        470        480


ITYRSWGRSL KYNPVVIDFE MQPIHEVLRH TSLGPLEAKR





490        500        510        520


QNLRRALDQY LMEFNACRCG PCFNNGVPIL EGTSCRCQCR





530        540        550        560



(2)LGSLGAACEQ TQTEGAKADG SWSCWSSWSV CRAGIQERRR






570        580


ECDNPAPQNG GASCPGRKVQ TQAC













Protein Name                          Uniprot     IPI             Gene Name
Kallikrein-13                         Q9UKR3      IPI00007726     KLK13_Human

Trypsin fragments

 1. TLQCANIQLR
 2. ITDNMLCAGTK










10         20         30         40


MWPLALVIAS LTLALSGGVS QESSKVLNTN GTSGFLPGGY





50         60         70         80


TCFPHSQPWQ AALLVQGRLL CGGVLVHPKW VLTAAHCLKE





90         100        110        120


GLKVYLGKHA LGRVEAGEQV REVVHSIPHP EYRRSPTHLM





130        140        150        160


HDHDIMLLEL QSPVQLTGYI QTLPLSHNNR LTPGTTCRVS





170        180        190        200


GWGTTTSPQV NYPR(1)TLQCAN IQLRSDEECR QVYPGK(2)ITDN





210        220        230        240



MLCAGTKEGG KDSCEGDSGG PLVCNRTLYG IVSWGDFPCG






250        260        270


QPDRPGVYTR VSRYVLWIRE TIRKYETQQQ KWLKGPQ













Protein Name                          Uniprot     IPI             Gene Name
Complement Component 7                P10643      IPI00296608     C7_Human

Trypsin fragments

 1. AASGTQNNVLR
 2. DSCTLPASAEK










10         20         30         40


MKVISLFILV GFIGEFQSFS SASSPVNCQW DFYAPWSECN





50         60         70         80


GCTKTQTRRR SVAVYGQYGG QPCVGNAFET QSCEPTRGCP





90         100        110        120


TEEGCGERFR CFSGQCISKS LVCNGDSDCD EDSADEDRCE





130        140        150        160


DSERRPSCDI DKPPPNIELT GNGYNELTGQ FRNRVINTKS





170        180        190        200


FGGQCRKVFS GDGKDFYRLS GNVLSYTFQV KINNDFNYEF





210        220        230        240


YNSTWSYVKH TSTEHTSSSR KRSFFRSSSS SSRSYTSHTN





250        260        270        280


EIHKGKSYQL LVVENTVEVA QFINNNPEFL QLAEPFWKEL





290        300        310        320


SHLPSLYDYS AYRRLIDQYG THYLQSGSLG GEYRVLFYVD





330        340        350        360


SEKLKENDFN SVEEKKCKSS GWHFVVKFSS HGCKELENAL





370        380        390        400


K(1)AASGTQNNV LRGEPFIRGG GAGFISGLSY LELDNPAGNK





410        420        430        440


RRYSAWAESV TNLPQVIKQK LTPLYELVKE VPCASVKKLY





450        460        470        480


LKWALEEYLD EFDPCHCRPC QNGGLATVEG THCLCHCKPY





490        500        510        520


TFGAACEQGV LVGNQAGGVD GGWSCWSSWS PCVQGKKTRS





530        540        550        560


RECNNPPPSG GGRSCVGETT ESTQCEDEEL EHLRLLEPHC





570        580        590        600


FPLSLVPTEF CPSPPALKDG FVQDEGTMFP VGKNVVYTCN





610        620        630        640


EGYSLIGNPV ARCGEDLRWL VGEMHCQKIA CVLPVLMDGI





650        660        670        680


QSHPQKPFYT VGEKVTVSCS GGMSLEGPSA FLCGSSLKWS





690        700        710        720


PEMKNARCVQ KENPLTQAVP KCQRWEKLQN SRCVCKMPYE





730        740        750        760


CGPSLDVCAQ DERSKRILPL TVCKMHVLHC QGRNYTLTGR





770        780        790        800



(2)DSCTLPASAE KACGACPLWG KCDAESSKCV CREASECEEE






810        820        830        840


GFSICVEVNG KEQTMSECEA GALRCRGQSI SVTSIRPCAA





ETQ













Protein Name                          Uniprot     IPI             Gene Name
Retinal Dehydrogenase                 P00352      IPI00218914     ALDH1A1_Human

Trypsin fragments

 1. SSSGTPDLPVLLTDLK
 2. YILGNPLTPGVTQGPQIDKEQYDK










10         20         30         40


M(1)SSSGTPDLP VLLTDLKIQY TKIFINNEWH DSVSGKKFPV





50         60         70         80


FNPATEEELC QVEEGDKEDV DKAVKAARQA FQIGSPWRTM





90         100        110        120


DASERGRLLY KLADLIERDR LLLATMESMN GGKLYSNAYL





130        140        150        160


NDLAGCIKTL RYCAGWADKI QGRTIPIDGN FFTYTRHEPI





170        180        190        200


GVCGQIIPWN FPLVMLIWKI GPALSCGNTV VVKPAEQTPL





210        220        230        240


TALHVASLIK EAGFPPGVVN IVPGYGPTAG AAISSHMDID





250        260        270        280


KVAFTGSTEV GKLIKEAAGK SNLKRVTLEL GGKSPCIVLA





290        300        310        320


DADLDNAVEF AHHGVFYHQG QCCIAASRIF VEESIYDEFV





330        340        350        360


RRSVERAKK(2)Y ILGNPLTPGV TQGPQIDKEQ YDKILDLIES





370        380        390        400


GKKEGAKLEC GGGPWGNKGY FVQPTVFSNV TDEMRIAKEE





410        420        430        440


IFGPVQQIMK FKSLDDVIKR ANNTFYGLSA GVFTKDIDKA





450        460        470        480


ITISSALQAG TVWVNCYGVV SAQCPFGGFK MSGNGRELGE





490        500


YGFHEYTEVK TVTVKISQKN S













Protein Name                          Uniprot     IPI             Gene Name
ApoLipoprotein L1 isoform 2           Q9UKR3      IPI00186903     APOL1_Human

Trypsin fragments

 1. VTEPISAESGEQVER










10         20         30         40


MEGAALLRVS VLCIWMSALF LGVGVRAEEA GARVQQNVPS





50         60         70         80


GTDTGDPQSK PLGDWAAGTM DPESSIFIED AIKYFKEKVS





90         100        110        120


TQNLLLLLTD NEAWNGFVAA AELPRNEADE LRKALDNLAR





130        140        150        160


QMIMKDKNWH DKGQQYRNWF LKEFPRLKSE LEDNIRRLRA





170        180        190        200


LADGVQKVHK GTTIANVVSG SLSISSGILT LVGMGLAPFT





210        220        230        240


EGGSLVLLEP GMELGITAAL TGITSSTMDY GKKWWTQAQA





250        260        270        280


HDLVIKSLDK LKEVREFLGE NISNFLSLAG NTYQLTRGIG





290        300        310        320


KDIRALRRAR ANLQSVPHAS ASRPRVTEPI SAESGEQVER





330        340        350        360


VNEPSILEMS RGVKLTDVAP VSFFLVLDVV YLVYESKHLH





370        380        390


EGAKSETAEE LKKVAQELEE KLNILNNNYK ILQADQEL













Protein Name                          Uniprot     IPI             Gene Name
Mucin 1 isoform 2                     P15941-2    IPI00218163     Muc1_Human

Trypsin fragments

 1. DISEMFIQLYK
 2. QGGFLGLSNIK










10         20         30         40


MTPGTQSPFF LLLLLTVLTV VTGSGHASST PGGEKETSAT





50         60         70         80


QRSSVPSSTE KNAVSMTSSV LSSHSPGSGS STTQGQDVTL





90         100        110        120


APATEPASGS AATWGQDVTS VPVTRPALGS TTPPAHDVTS





130        140        150        160


APDNKPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





170        180        190        200


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





210        220        230        240


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





250        260        270        280


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS 





290        300        310        320


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





330        340        350        360


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





370        380        390        400


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





410        420        430        440


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





450        460        470        480


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





490        500        510        520


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





530        540        550        560


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





570        580        590        600


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





610        620        630        640


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





650        660        670        680


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





690        700        710        720


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





730        740        750        760


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





770        780        790        800


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





810        820        830        840


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





850        860        870        880


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





890        900        910        920


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





930        940        950        960


APDTRPAPGS TAPPAHGVTS APDTRPAPGS TAPPAHGVTS





970        980        990        1000


ASGSASGSAS TLVHNGTSAR ATTTPASKST PFSIPSHHSD





1010       1020       1030       1040


TPTTLASHST KTDASSTHHS SVPPLTSSNH STSPQLSTGV





1050       1060       1070       1080


SFFFLSFHIS NLQFNSSLED PSTDYYQELQ R(1)DISEMFLQI





1090       1100       1110       1120



YK(2)QGGFLGLS NIKFRPGSVV VQLTLAFREG TINVHDVETQ






1130       1140       1150       1160


FNQYKTEAAS RYNLTISDVS VSDVPFPFSA QSGAGVPGWG





1170       1180       1190       1200


IALLVLVCVL VALAIVYLIA LAVCQCRRKN YGQLDIFPAR





1210       1220       1230       1240


DTYHPMSEYP TYHTHGRYVP PSSTDRSPYE KVSAGNGGSS





1250


LSYTNPAVAA TSANL













Protein Name                          Uniprot     IPI             Gene Name
Bleomycin Hydrolase                   Q13867      IPI00219575     BLMH_Human

Trypsin fragments

 1. AQHVFQHAVPQEGKPITNQK
 2. SSSGLNSEKVAALIQK










10         20         30         40


M(2)SSSGLNSEK VAALIQKLNS DPQFVLAQNV GTTHDLLDIC





50         60         70         80


LKRATVQR(1)AQ HVFQHAVPQE GKPITNQKSS GRCWIFSCLN





90         100        110        120


VMRLPFMKKL NIEEFEFSQS YLFFWDKVER CYFFLSAFVD





130        140        150        160


TAQRKEPEDG RLVQFLLMNP ANDGGQWDML VNIVEKYGVI 





170        180        190        200


PKKCFPESYT TEATRRMNDI LNHKMRERCI RLRNLVHSGA





210        220        230        240


TKGEISATQD VMMEEIFRVV CICLGNPPET FTWEYRDKDK





250        260        270        280


NYQKIGPITP LEFYREHVKP LFNMEDKICL VNDPRPQHKY





290        300        310        320


NKLYTVEYLS NMVGGRKTLY NNQPIDFLKK MVAASIKDGE





330        340        350        360


AVWFGCDVGK HFNSKLGLSD MNLYDHELVF GVSLKNMNKA





370        380        390        400


ERLTFGESLM THAMTFTAVS EKDDQDGAFT KWRVENSWGE





410        420        430        440


DHGHKGYLCM TDEWFSEYVY EVVVDRKHVP EEVLAVLEQE





450


PIILPAWDPM GALAE













Protein Name                          Uniprot     IPI             Gene Name
Cornifin-B                            P22528      IPI00304903     SPRR1B_Human

Trypsin fragments

 1. QPCTPPPQLQQQQVK
 2. VPEPCPSIVTPAPAQQK










10         20         30         40


MSSQQQK(1)QPC TPPPQLQQQQ VKQPCQPPPQ EPCIPKTKEP





50         60         70         80


CHPKVPEPCH PKVPEPCQPK VPEPCHPK(2)VP EPCPSIVTPA






PAQQKTKQK














Protein Name                          Uniprot     IPI             Gene Name
Plasminogen activator inhibitor-2     P05120      IPI00007117     SERPINB2_Human

Trypsin fragments

 1. GKIPNLLPEGSVDGDTR










10         20         30         40


MEDLCVANTL FALNLFKHLA KASPTQNLFL SPWSISSTMA





50         60         70         80


MVYMGSRGST EDQMAKVLQF NEVGANAVTP MTPENFTSCG





90         100        110        120


FMQQIQKGSY PDAILQAQAA DKIHSSFRSL SSAINASTGN





130        140        150        160


YLLESVNKLF GEKSASFREE YIRLCQKYYS SEPQAVDFLE





170        180        190        200


CAEEARKKIN SWVKTQTKGK IPNLLPEGSV DGDTRMVLVN





210        220        230        240


AVYFKGKWKT PFEKKLNGLY PFRVNSAQRT PVQMMYLREK





250        260        270        280


LNIGYIEDLK AQILELPYAG DVSMFLLLPD EIADVSTGLE





290        300        310        320


LLESEITYDK LNKWTSKDKM AEDEVEVYIP QFKLEEHYEL





330        340        350        360


RSILRSMGME DAFNKGRANF SGMSERNDLF LSEVFHQAMV





370        380        390        400


DVNEEGTEAA AGTGGVMTGR TGHGGPQFVA DHPFLFLIMA





410


KITNCILFFG RFSSP













Protein Name                          Uniprot     IPI             Gene Name
Peroxiredoxin-6                       P30041      IPI00220301     PRDX6_Human

Trypsin fragments

 1. DINAYNCEEPTEK
 2. NFDEILR










10         20         30         40


MPGGLLLGDV APNFEANTTV GRIRFHDFLG DSWGILFSHP





50         60         70         80


RDFTPVCTTE LGRAAKLAPE FAKRNVKLIA LSIDSVEDHL





90         100        110        120


AWSK(1)DINAYN CEEPTEKLPF PIIDDRNREL AILLGMLDPA





130        140        150        160


EKDEKGMPVT ARVVFVFGPD KKLKLSILYP ATTGR(2)NFDEI





170        180        190        200



LRVVISLQLT AEKRVATPVD WKDGDSVMVL PTIPEEEAKK 






210        220


LFPKGVFTKE LPSGKKYLRY TPQP













Protein Name                          Uniprot     IPI             Gene Name
Complement factor-H                   Q03591      IPI00011264     CFHR1_Human

Trypsin Fragments

 1. ITCTEEGWSPTPK
 2. STDTSCVNPPTVQNAHILSR
 3. TGESAEFVCK










10         20         30         40


MWLLVSVILI SRISSVGGEA TFCDFPKINH GILYDEEKYK





50         60         70         80


PFSQVPTGEV FYYSCEYNFV SPSKSFWTR(1)I TCTEEGWSPT





90         100        110        120



PKCLRLCFFP FVENGRSESS GQTHLEGDTV QIICNTGYRL






130        140        150        160


QNNENNISCV ERGWSTPPKC R(2)STDTSCVNP PTVQNAHILS





170        180        190        200



RQMSKYPSGE RVRYECRSPY EMFGDEEVMC LNGNWTEPPQ






210        220        230        240


CKDSTGKCGP PPPIDNGDIT SFPLSVYAPA SSVEYQCQNL





250        260        270        280


YQLEGNKRIT CRNGQWSEPP KCLHPCVISR EIMENYNIAL





290        300        310        320


RWTAKQKLYL R(3)TGESAEFVC KRGYRLSSRS HTLRTTCWDG





330


KLEYPTCAKR













Protein Name                          Uniprot     IPI             Gene Name
Isoform 1 of Alpha-1B-glycoprotein    P04217      IPI00022895     A1BG_Human

Trypsin Fragments

 1. ATWSGAVLAGR
 2. CLAPLEGAR
 3. GVTFLLR
 4. HQFLLTGDTQGR
 5. LLELTGPK
 6. SGLSTGWTQLSK










10         20         30         40


MSMLVVFLLL WGVTWGPVTE AAIFYETQPS LWAESESLLK





50         60         70         80


PLANVTLTCQ AHLETPDFQL FKNGVAQEPV HLDSPAIK(4)HQ





90         100        110        120



FLLTGDTQGR YRCR(8)SGLSTG WTQLSK(5)LLEL TGPKSLPAPW






130        140        150        160


LSMAPVSWIT PGLKTTAVCR GVLR(4)GVTFLL RREGDHEFLE





170        180        190        200


VPEAQEDVEA TFPVHQPGNY SCSYRTDGEG ALSEPSATVT





210        220        230        240


IEELAAPPPP VLMHHGESSQ VLHPGNKVTL TCVAPLSGVD





250        260        270        280


FQLRRGEKEL LVPRSSTSPD RIFFHLNAVA LGDGGHYTCR





290        300        310        320


YRLHDNQNGW SGDSAPVELI LSDETLPAPE FSPEPESGRA





330        340        350        360


LRLR(2)CLAPLE GARFALVRED RGGRRVHRFQ SPAGTEALFE





370        380        390        400


LHNISVADSA NYSCVYVDLK PPFGGSAPSE RLELHVDGPP





410        420        430        440


PRPQLR(1)ATWS GAVLAGRDAV LRCEGPIPDV TFELLREGET





450        460        470        480


KAVKTVRTPG AAANLELIFV GPQHAGNYRC RYRSWVPHTF





490


ESELSDPVEL LVAES













Protein Name                          Uniprot     IPI             Gene Name
Gamma-glutamyl hydrolase              Q92820      IPI00023728     GGH_Human

Trypsin Fragments

 1. NLDGISHAPNAVK










10         20         30         40


MASPGCLLCV LGLLLCGAAS LELSRPHGDT AKKPIIGILM





50         60         70         80


QKCRNKVMKN YGRYYIAASY VKYLESAGAR VVPVRLDLTE





90         100        110        120


KDYEILFKSI NGILFPGGSV DLRRSDYAKV AKIFYNLSIQ





130        140        150        160


SFDDGDYDPV WGTCLGFEEL SLLISGECLL TATDTVDVAM





170        180        190        200


PLNFTGGQLH SRMFQNFPTE LLLSLAVEPL TANFHKWSLS





210        220        230        240


VKNFTMNEKL KKFFNVLTTN TDGKIEFIST MEGYKYPVYG





250        260        270        280


VQWHPEKAPY EWKNLDGISH APNAVKTAFY LAEFFVNEAR





290        300        310


KNNHHFKSES EEERALIYQF SPIYTGHISS FQQCYIFD













Protein Name                          Uniprot     IPI             Gene Name
Ezrin                                 P15311      IPI00843975     EZR_Human

Trypsin Fragments

 1. ALQLEEER
 2. APDFVFYAPR
 3. ELSEQIQR
 4. IALLEEAR
 5. IGFPWSEIR
 6. QRIDEFEAL
 7. SGYLSSER
 8. SQEQLAAELAEYTAK
 9. VSAQEVRK










10         20         30         40


MPKPINVRVT TMDAELEFAI QPNTTGKQLF DQVVKTIGLR





50         60         70         80


EVWYFGLHYV DNKGFPTWLK LDKK(3)VSAQEV RKENPLQFKF





90         100        110        120


RAKFYPEDVA EELIQDITQK LFFLQVKEGI LSDEIYCPPE





130        140        150        160


TAVLLGSYAV QAKFGDYNKE VHK(7)SGYLSSE RLIPQRVMDQ





170        180        190        200


HKLTRDQWED RIQVWHAEHR GMLKDNAMLE YLKIAQDLEM





210        220        230        240


YGINYFEIKN KKGTDLWLGV DALGLNIYEK DDKLTPK(5)IGF





250        260        270        280



PWSEIRNISF NDKKFVIKPI DKK(2)APDFVFY APRLRINKRI






290        300        310        320


LQLCMGNHEL YMRRRKPDTI EVQQMKAQAR EEKHQKQLER





330        340        350        360


QQLETEKKRR ETVEREKEQM MREKEELMLR LQDYEEKTKK





370        380        390        400


AER(3)ELSEQIQ R(1)ALQLEEERK RAQEEAERLE ADRMAALRAK





410        420        430        440


EELERQAVDQ IK(11)SQEQLAAE LAEYTAK(4)IAL LEEARRRKED





450        460        470        480


EVEEWQHRAK EAQDDLVKTK EELHLVMTAP PPPPPPVYEP





490        500        510        520


VSYHVQESLQ DEGAEPTGYS AELSSEGIRD DRNEEKRITE





530        540        550        560


AEKNERVQRQ LLTLSSELSQ ARDENKRTHN DIIHNENMRQ





570        580


GRDKYKTLRQ IRQGNTK(6)QRI DEFEAL













Protein Name                          Uniprot     IPI             Gene Name
Alpha-2-antiplasmin                   P08697      IPI00879231     SERPINF2_Human

Trypsin Fragments

 1. LGNQEPGGQTALK










10         20         30         40


MALLWGLLVL SWSCLQGPCS VFSPVSAMEP LGRQLTSGPN





50         60         70         80


QEQVSPLTLL KLGNQEPGGQ TALKSPPGVC SRDPTPEQTH





90         100        110        120


RLARAMMAFT ADLFSLVAQT STCPNLILSP LSVALALSHL





130        140        150        160


ALGAQNHTLQ RLQQVLHAGS GPCLPHLLSR LCQDLGPGAF





170        180        190        200


RLAARMYLQK GFPIKEDFLE QSEQLFGAKP VSLTGKQEDD





210        220        230        240


LANINQWVKE ATEGKIQEFL SGLPEDTVLL LLNAIHFQGF





250        260        270        280


WRNKFDPSLT QRDSFHLDEQ FTVPVEMMQA RTYPLRWFLL





290        300        310        320


EQPEIQVAHF PFKNNMSFVV LVPTHFEWNV SQVLANLSWD





330        340        350        360


TLHPPLVWER PTKVRLPKLY LKHQMDLVAT LSQLGLQELF





370        380        390        400


QAPDLRGISE QSLVVSGVQH QSTLELSEVG VEAAAATSIA





410        420        430        440


MSRMSLSSFS VNRPFLFFIF EDTTGLPLFV GSVRNPNPSA





450        460        470        480


PRELKEQQDS PGNKDFLQSL KGFPRGDKLF GPDLKLVPPM





490


EEDYPQFGSP K













Protein Name
Uniprot
IPI
Gene Name





Hemopexin
P02790
IPI00022488
HPX_Human










Trypsin Fragments












 1. DVRDYFMPCPGR
 2. DYFMPCPGR
 3. EVGTPHGIILDSVDAAFICPGSSR
 4. GECQAEGVLFFQGDR
 5. GEFVWK
 6. GGYTLVSGYPK
 7. LLQDEFPGIPSPIDAAVECHR
 8. NFPSPVDAAFR
 9. QGHNSVFLIK
10. SGAQATWTELPWPHEK
11. VDGALCMEK
12. WKNFPSPVDAAFR










10         20         30         40


MARVLGAPVA LGLWSLCWSL AIATPLPPTS AHGNVAEGET





50         60         70         80


KPDPDVTERC SDGWSFDATT LDDNGTMLFF K(5)GEFVWKSHK





90         100        110        120


WDRELISERW K(8)NFPSPVDAA FR(9)QGHNSVFL IKGDKVWVYP





130        140        150        160


PEKKEKGYPK (7)LLQDEFPGIP SPLDAAVECH R(4)GECQAEGVL





170        180        190        200



FFQGDREWFW DLATGTMKER SWPAVGNCSS ALRWLGRYYC






210        220        230        240


FQGNQFLRFD PVRGEVPPRY PR(1)DVR(2)DYFMP CPGRGHGHRN





250        260        270        280


GTGHGNSTHH GPEYMRCSPH LVLSALTSDN HGATYAFSGT





290        300        310        320


HYWRLDTSRD GWHSWPIAHQ WPQGPSAVDA AFSWEEKLYL





330        340        350        360


VQGTQVYVFL TK(6)GGYTLVSG YPKRLEK(3)EVG TPHGIILDSV





370        380        390        400



DAAFICPGSS RLHIMAGRRL WWLDLK(10)SGAQ ATWTELPWPH






410        420        430        440



EK(11)VDGALCME KSLGPNSCSA NGPGLYLIHG PNLYCYSDVE






450        460


KLNAAKALPQ PQNVTSLLGC TH













Protein Name
Uniprot
IPI
Gene Name





Cysteine Rich Secretory Protein 3
P54108
IPI00974055
Crisp3_Human










Trypsin Fragments












 1. WANQCNYR












10         20         30         40


MTLFPVLLFL VAGLLPSFPA NEDKDPAFTA LLTTQTQVQR





50         60         70         80


EIVNKHNELR RAVSPPARNM LKMEWNKEAA ANAQKWANQC





90         100        110        120



NYRHSNPKDR MTSLKCGENL YMSSASSSWS QAIQSWFDEY






130        140        150        160


NDFDFGVGPK TPNAVVGHYT QVVWYSSYLV GCGNAYCPNQ





170        180        190        200


KVLKYYYVCQ YCPAGNWANR LYVPYEQGAP CASCPDNCDD





210        220        230        240


GLCTNGCKYE DLYSNCKSLK LTLTCKHQLV RDSCKASCNC
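The listings in these appendices can be cross-checked mechanically: each tryptic fragment should occur in its parent sequence, start after a lysine (K) or arginine (R), and end in K or R unless it is the C-terminal peptide. The following is a minimal sketch of such a check, not part of the disclosed method. The helper names `clean_sequence` and `is_tryptic_fragment` are hypothetical, the rule used is the simple "cleave after K or R" rule (the proline exception and missed cleavages are ignored), and the example uses residues 41 to 120 of the Cysteine Rich Secretory Protein 3 listing above.

```python
import re


def clean_sequence(block: str) -> str:
    """Keep only residue letters: drops position rulers, spaces and (n) markers."""
    return re.sub(r"[^A-Z]", "", block)


def is_tryptic_fragment(sequence: str, fragment: str) -> bool:
    """Return True if `fragment` occurs in `sequence` between trypsin cleavage sites.

    Assumption: trypsin cleaves after K or R; no proline rule is applied.
    """
    start = sequence.find(fragment)
    while start != -1:
        end = start + len(fragment)
        # N-terminal side: protein start, or preceding residue is K/R.
        n_ok = start == 0 or sequence[start - 1] in "KR"
        # C-terminal side: protein end, or fragment itself ends in K/R.
        c_ok = end == len(sequence) or fragment[-1] in "KR"
        if n_ok and c_ok:
            return True
        start = sequence.find(fragment, start + 1)
    return False


# Residues 41-120 of the listing above, pasted with their position rulers.
crisp3_block = """
50         60         70         80
EIVNKHNELR RAVSPPARNM LKMEWNKEAA ANAQKWANQC
90         100        110        120
NYRHSNPKDR MTSLKCGENL YMSSASSSWS QAIQSWFDEY
"""

sequence = clean_sequence(crisp3_block)
print(is_tryptic_fragment(sequence, "WANQCNYR"))
```

The same two helpers can be applied to any other entry by pasting its ruler and sequence lines into the block string and testing each numbered fragment in turn.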













Protein Name
Uniprot
IPI
Gene Name





Carboxypeptidase A4
Q9UI42
IPI00008894
CPA4_Human










Trypsin Fragments












 1. DPAITSILEK
 2. SRNPGSSCIGADPNR
 3. GASDNPCSEVYHGPHANSEVEVK
 4. SVVDFIQK
 5. NPGSSCIGADPNR










10         20         30         40


MRWILFIGAL IGSSICGQEK FFGDQVLRIN VRNGDEISKL





50         60         70         80


SQLVNSNNLK LNFWKSPSSF NRPVDVLVPS VSLQAFKSFL





90         100        110        120


RSQGLEYAVT IEDLQALLDN EDDEMQHNEG QERSSNNFNY





130        140        150        160


GAYHSLEAIY HEMDNIAADF PDLARRVKIG HSFENRPMYV





170        180        190        200


LKFSTGKGVR RPAVWLNAGI HSREWISQAT AIWTARKIVS





210        220        230        240


DYQR(1)DPAITS ILEKMDIFLL PVANPDGYVY TQTQNRLWRK





250        260        270        280


TR(2)SR(5)NPGSSC IGADPNRNWN ASFAGK(3)GASD NPCSEVYHGP





290        300        310        320



HANSEVEVK(4)S VVDFIQKHGN FKGFIDLHSY SQLLMYPYGY






330        340        350        360


SVKKAPDAEE LDKVARLAAK ALASVSGTEY QVGPTCTTVY





370        380        390        400


PASGSSIDWA YDNGIKFAFT FELRDTGTYG FLLPANQIIP





410        420


TAEETWLGLK TIMEHVRDNL













Protein Name
Uniprot
IPI
Gene Name





N-acetylmuramyl-L-alanine amidase
Q96PD5-2
IPI00394992
PGLYRP2_Human










Trypsin Fragments












 1. GSQTQSHPDLGTEGCWDQLSAPR
 2. TFTLLDPK










10         20         30         40


MAQGVLWILL GLLLWSDPGT ASLPLLMDSV IQALAELEQK





50         60         70         80


VPAAKTRHTA SAWLMSAPNS GPHNRLYHFL LGAWSLNATE





90         100        110        120


LDPCPLSPEL LGLTKEVARH DVREGKEYGV VLAPDGSTVA





130        140        150        160


VEPLLAGLEA GLQGRRVINL PLDSMAAPWE TGDTFPDVVA





170        180        190        200


IAPDVRATSS PGLRDGSPDV TTADIGANTP DATKGCPDVQ





210        220        230        240


ASLPDAKAKS PPTMVDSLLA VTLAGNLGLT FLR(1)GSQTQSH





250        260        270        280



PDLGTEGCWD QLSAPR(2)TFTL LDPKASLLTM AFLNGALDGV






290        300        310        320


ILGDYLSRTP EPRPSLSHLL SQYYGAGVAR DPGFRSNFRR





330        340        350        360


QNGAALTSAS ILAQQVWGTL VLLQRLEPVH LQLQCMSQEQ





370        380        390        400


LAQVAANATK EFTEAFLGCP ATHPRCRWGA APYRGRPKLL





410        420        430        440


QLPLGFLYVH HTYVPAPPCT DFTRCAANMR SMQRYHQDTQ





450        460        470        480


GWGDIGYSFV VGSDGYVYEG RGWHWVGAHT LGHNSRGFGV





490        500        510        520


AIVGNYTAAL PTEAALRTVR DTLPSCAVRA GLLRPDYALL





530        540        550        560


GHRQLVRTDC PGDALFDLLR TWPHFTATVK PRPARSVSKR





570


SRREPPPRTL PATDLQ












Protein Name
Uniprot
Gene Name





Caspase-14
P31944
CASP14_Human










Trypsin Fragments












 1. AREGSEEDLDALEHMFR
 2. DPTAEQFQEELEK
 3. FQQAIDSR
 4. KTNPEIQSTLR
 5. MAEAELVQEGK
 6. RDPTAEQFQEELEK
 7. RMAEAELVQEGK
 8. SLEEEKYDMSGAR
 9. TNPEIQSTLR
10. VYIIQACR










10         20         30         40


MSNPR(8)SLEEE KYDMSGARLA LILCVTK(1)ARE GSEEDLDALE 





50         60         70         80



HMFRQLRFES TMK(6)R(2)DPTAEQFQEELEK(3)FQQ AIDSREDPVS






90         100        110        120


CAFVVLMAHG REGFLKGEDG EMVKLENLFE ALNNKNCQAL





130        140        150        160


RAKPK(10)VYIIQ ACRGEQRDPG ETVGGDEIVM VIKDSPQTIP





170        180        190        200


TYTDALHVYS TVEGYIAYRH DQKGSCFIQT LVDVFTKRKG





210        220        230        240


HILELLTEVT R(7)R(5)MAEAELVQEGKAR(4)K(9)TNPE IQSTLRKRLY





LQ













Protein Name
Uniprot
IPI
Gene Name





Ig Kappa chain V-III region POM
P04207
IPI00385253
KV308_Human










Trypsin Fragments












 1. LLIYGASTR












10         20         30         40


MEAPAQLLFL LLLWLPDTTG SIVMTQSPAT LSVSPGERAT





50         60         70         80


LSCRASQSVS NNLAWYQQKP GQPPRLLIYG ASTRATGIPA





90         100        110        120


RFSGSGSGTE FTLTISRLQS EDFAVYYCQQ YNNWPPWTFG





QGTRVEIKR













Protein Name
Uniprot
IPI
Gene Name





Ig Kappa chain V-III region POM
P01624
IPI00387119
KV306_Human










Trypsin Fragments












 1. EIVMTQSPVTLSVSPGER










10         20         30         40



EIVMTQSPVT LSVSPGERAT LSCRASQSIS NSYLAWYQQK






50         60         70         80


PSGSPRLLIY GASTRATGIP ARFSGSGSGT EFTLTISSLQ





90         100


SEDFAVYYCQ QYNNWPPTFG QGTRVEIKR













Protein Name
Uniprot
IPI
Gene Name





Isoform 1 Serum Albumin
P02768-1
IPI00387119
ALB_Human










Trypsin Fragments












 1. AACLLPK
 2. AAFTECCQAADK
 3. AAFTECCQAADKAACLLPK
 4. ADDKETCFAEEGK
 5. ADDKETCFAEEGKK
 6. AEFAEVSK
 7. AEFAEVSKLVTDLTK
 8. ATKEQIK
 9. ATKEQIKAVMDDFAAFVEK
11. AVMDDFAAFVEK
12. CASIQKFGER
13. CCAAADPHECYAK
14. CCKADDKETCFAEEGK
15. CCKHPEAK
16. CCTESLVNR
17. CCTESLVNRRPCFSALEVDETYVPK
18. DDNPNLPR
19. DAHKSEVAHR
20. DLGEENFK
21. DVCKNYAEAK
22. DVFLGMFLYEYAR
23. ECCEKPLLEK
24. EFNAETFTFHADICTLSEK
25. EFNAETFTFHADICTLSEKER
26. EQLKAVMDDFAAFVEK
27. ETCFAEGK
28. ETCFAEEGKK
29. ETYGEMADCCAK
30. FKDLGEENFK
31. FPKAEFAEVSK
32. FQNALLVR
33. HPDYSVVLLLR
34. HPYFYAPELLFFAK
35. LAKTYETTLEK
36. LCTVATLR
37. LCTVATLRETYGEMADCCAK
38. LDELRDEGK
39. LDELRDEGKASSAK
40. LKCASLQK
41. LKECCEKPLLEK
42. LSQRFPK
43. LFSQRFPKAEFAEVSK
44. LVAASQAALGL
45. LVNEVTEFAK
46. IVNEVTEFAKTCVADESAENCDK
47. LVRPEVDVMCTAFHDNEETFLK
48. LVRPEVDVMCTAFHDNEETFLKK
49. LVTDLTK
50. KLVAASQAALGL
51. KQTALVELVK
52. KVPQVSTPTLLVEVSR
53. KYLYEIAR
54. MPCAEDYILSVVILNQILCVILHEK
55. NECFIQHK
56. NECFLQHKDDNPNLPR
57. NIGKVGSK
58. NYAEAK
59. NYAEAKDVFIGMFIYEYAR
60. PLVEEPQNLIK
61. QEPERNECFLQHK
62. QEPERNECFLQHKDDNPNLPR
63. QNCELFEQLGEYK
64. QNCELFEQLGEYKFQNAIIVR
65. QTAIVEIVK
66. RHPDYSVVLLLR
67. RMPCAEDYLSVVLKNQLCVLHEK
68. RPCFSALEVDETYVPK
69. SHCIAEVENDEMPADLPSLAADFVESKDVCKNYAEAK
70. SHCIAEVENDEMPADLPSLAADFVESK
71. SHCIAEVENDEMPADLPSLAADFVESKDVCK
72. SLHTLFGDK
73. SLHTLFGDKLCTVATLR
74. TCVADESAENCDK
75. TCVADESAENCDKSLHTLFGDK
76. TCVADESAENCDKSLHTLFGDKLCTVATLR
77. TPVSDRVTK
78. TYETTLEK
79. TYETTIEKCCAAADPHECYAK
80. VFDEFKPLVEEPQNLIK
81. VHTECCHGDLLECADDR
82. VHTECCHGDLLECADDRADLAK
83. VHTECCHGDLLECADDRADLAKYICENQDSISSK
84. VPQVSTPTLVEVSR
85. YICENQDSISSK
86. YICENQDSISSKLK
87. YLYEIAR
88. YLYEIARR
89. YKAAFTECCQAADK
90. YKAAFTECCQAADKAACLLPK










MKWVTFISLL FLFSSAYSRG VFRR(10)DAHKSE VAHR(30)FK(20)DLGE ENFKALVLIA





FAQYLQQCPFEDHVK(45)LVNEV TEFAK(74,75,76)TCVAD ESAENCDK(72,73)SL HTLFGDK(37,36)LCT






VATLR(23)ETYGEMADCCAK(61)(62)QEPER(55)(56)NECFLQHK(18)DDNPNLPR(47)(48)LV

RPEVDVMCTA FHDNEETFLK(53)K(87,88)YLYEIARR(24)H PYFYAPELLF FAKR(89,90)YK(2)(3)AAFT

ECCQAADK(1)AA CLLPK(39)(38)LDELR DEGKASSAKQ R(49)LK(12)CASLQKFGERAFKAWAV

AR(43)(48)LSQR(31)FPK(4)A EFAEV(7)SK(49)LVT DLTK(81,82,83)VHTECC HGDLLECADD

RADLAK(85,86)YICENQDSISSK(41)LK(23)ECCEKPLLEK(69,70,71)SHCIAEVENDEMPADLPSLA

ADFVESK(21)DVCK(58)(59)NYAEAK(22)DVFLGMFLYEYAR(68)R(33)HPDYSVVLL

LR(35)LAK(78,79)TYETTLEK(12)CCAAADPHECYAK(80)VFDEFK(40)PLVEEPQN

LIK(63)(64)QNCELFE QLGEYK(32)FQNA LLVRYTK(52)K(54)VP QVSTPTLVEV SR(57)NLGKVGSK

(15)CCKHPEAK(67)R(64)M PCAEDYLSVV LNQLCVLHEK(77)TPVSDRVTK(17)(16)C CTESLVNR(68)RP

CFSALEVDET YVPK(25)(24)EFNAETFTF HADICTL

SEKERQIK(51)K(65)Q TALVELVKHK PK(10)(9)ATK(26)EQLK(11)A

VMDDFAAFVEK(14)CCK(4)ADDK(28)(27)ET CFAEEGK(5)(56)K(44)LV







AASQAALGL














Protein Name
Uniprot
IPI
Gene Name





Isoform 1 of Complement factor H
P08603
IPI00029739
CFH_Human










Trypsin Fragments












 1. *AGEQVTYTCATYYK
 2. CLHPCVISR
 3. *DGWSAQPTCIK
 4. *DTSCVNPPTVQNAYIVSR
 5. *EFDHNSNIR
 6. EIMENYNIALR
 7. GDAVCTESGWR
 8. GDAVCTESGWRPLPSCEEK
 9. *IDVHLVPDR
10. *LSYTCEGGFR
11. IVSSAMEPDREYHFGQAVR
12. NTEILTGSWSDQTYPEGTQAIYK
13. RPYFPVAVGK
14. *SCDIPVFMNAR
15. SIDVACHPGYALPK
16. SLGNVIMVCR
17. *SSNLIILEEHLK
18. *SSQESYAHGTK
19. TGDEITYQCR
20. TGESVEFVCK
21. *TKNDFTWFK
22. TTCWDGKLEYPTCAK
23. *VSVLCQENYLIQEGEELTCKDGR
24. *WQSIPLCVEK
25. *WSSPPQCEGLPCK










10         20         30         40


MRLLAKIICL MLWAICVAED CNELPPRR(12)NT EILTGSWSDQ





50         60         70         80



TYPEGTQAIY KCRPGYR(16)SLGNVIMVCRKGE WVALNPLRKC






90         100        110        120


QKRPCGHPGD TPEGTFTLTG GNVFEYGVKA VYTCNEGYQL





130        140        150        160


LGEINYRECD TDGWTNDIPI CEVVKCLPVT APENGK(11)IVSS





170        180        190        200



AMEPDREYHF GQAVRFVCNS GYKIEGDEEM HCSDDGFWSK






210        220        230        240


EKPKCVEISC KSPDVINGSP ISQKIIYKEN ERFQYKCNMG





250        260        270        280


YEYSER(7)GDAV CTESGWR(8)PLP SCEEKSCDNP YIPNGDYSPL





290        300        310        320


RIKHR(19)TGDEI TYQCRNGFYP ATRGNTAKCT STGWIPAPRC 





330        340        350        360


TLKPCDYPDI KHGGLYHENM R(13)RPYFPVAVG KYYSYYCDEH





370        380        390        400


FETPSGSYWD HIHCTQDGWS PAVPCLRKCY FPYLENGYNQ





410        420        430        440


NYGRKFVQGK (15)SIDVACHPGY ALPKAQTTVT CMENGWSPTP





450        460        470        480


RCIRVKTCSK SSIDIENGFI SESQYTYALK EKAKYQCKLG





490        500        510        520


YVTADGETSG SITCGK(3)DGWS AQPTCIK(14)SCD IPVFMNAR(21)TK





530        540        550        560



NDFTWFKLND TLDYECHDGY ESNTGSTTGS IVCGYNGWSD






570        580        590        600


LPICYERECE LPK(9)IDVHLVP DRKKDQYKVG EVLKFSCKPG





610        620        630        640


FTIVGPNSVQ CYHFGLSPDL PICKEQVQSC GPPPELLNGN





650        660        670        680


VKEKTKEEYG HSEVVEYYCN PRFLMKGPNK IQCVDGEWTT





690        700        710        720


LPVCIVEEST CGDIPELEHG WAQLSSPPYY YGDSVEFNCS





730        740        750        760


ESFTMIGHRS ITCIHGVWTQ LPQCVAIDKL KKCK(17)SSNLII





770        780        790        800



LEEHLKNKK(5)E FDHNSNIRYR CRGKEGWIHT VCINGRWDPE






810        820        830        840


VNCSMAQIQL CPPPPQIPNS HNMTTTLNYR DGEK(23)VSVLCQ





850        860        870        880



ENYLIQEGEE ITCKDGR(24)WQS IPLCVEKIPC SQPPQIEHGT






890        900        910        920


INSSR(18)SSQESYAHGTK(10)LSYT CEGGFRISEE NETTCYMGK(25)W





930        940        950        960



SSPPQCEGLP CKSPPEISHG VVAHMSDSYQ YGEEVTYKCF






970        980        990        1000


EGFGIDGPAI AKCLGEKWSH PPSCIKTDCL SLPSFENAIP





1010       1020       1030       1040


MGEKKDVYK(1)A GEQVTYTCAT YYKMDGASNV TCINSRWTGR





1050       1060       1070       1080


PTCR(4)DTSCVN PPTVQNAYIV SRQMSKYPSG ERVRYQCRSP





1090       1100       1110       1120


YEMFGDEEVM CLNGNWTEPP QCKDSTGKCG PPPPIDNGDI





1130       1140       1150       1160


TSFPLSVYAP ASSVEYQCQN LYQLEGNKRI TCRNGQWSEP





1170       1180       1190       1200


PK(2)CLHPCVIS R(6)EIMENYNIA LRNTAKQKLY SR(20)TGESVEFV





1210       1220       1230



CKRGYRLSSR SHTLR(22)TTCWD GKLEYPTCAK R














Protein Name
Uniprot
IPI
Gene Name





Isoform 1 of Sodium-dependent phosphate transport protein
O95436-1
IPI00007910
SLC34A2_Human










Trypsin Fragments












 1. EAQGEVPASDSKTECTAL
 2. VISQIAMNDEK










10         20         30         40


MAPWPELGDA QPNPDKYLEG AAGQQPTAPD KGKETNKTDN





50         60         70         80


TEAPVTKIEL LPSYSTATLI DEPTEVDDPW NLPTLQDSGI





90         100        110        120


KWSERDTKGK ILCFFQGIGR LILLLGFLYF FVCSLDILSS





130        140        150        160


AFQLVGGKMA GQFFSNSSIM SNPLLGLVIG VLVTVLVQSS





170        180        190        200


STSTSIVVSM VSSSLLTVRA AIPIIMGANI GTSITNTIVA





210        220        230        240


LMQVGDRSEF RRAFAGATVH DEFNWLSVLC LLPVEVATHY





250        260        270        280


LEIITQLIVE SFHFKNGEDA PDLLKVITKP FTKLIVQLDK





290        300        310        320


K(2)VISQIAMND EKAKNKSLVK IWCKTFTNKT QINVTVPSTA





330        340        350        360


NCTSPSLCWT DGIQNWTMKN VTYKENIAKC QHIFVNFHLP





370        380        390        400


DLAVGTILLI LSLLVLCGCL IMIVKILGSV LKGQVALVIK





410        420        430        440


KTINTDFPFP FAWLTGYLAI LVGAGMTFIV QSSSVFTSAL





450        460        470        480


TPLIGIGVIT IERAYPLTLG SNIGTTTTAI LAALASPGNA





490        500        510        520


LRSSLQIALC HFFFNISGIL LWYPIPFTRL PIRMAKGLGN





530        540        550        560


ISAKYRWFAV FYLIIFFFLI PLTVFGLSLA GWRVLVGVGV





570        580        590        600


PVVFIIILVL CLRLLQSRCP RVLPKKLQNW NFLPLWMRSL





610        620        630        640


KPWDAVVSKF TGCFQMRCCC CCRVCCRACC LLCDCPKCCR





650        660        670        680


CSKCCEDLEE AQEGQDVPVK APETFDNITI SR(1)EAQGEVPA





690



SDSKTECTAL














Protein Name
Uniprot
IPI
Gene Name





Putative uncharacterized protein

IPI00152189










Trypsin Fragments












 1. FSVLGSGLNR










MAWAPLLLTLLSLLTGSLSQPVLTQPPSASASLGASVTLTCTLSSGYSNYKVDWYQQRPG





KGPRFVMRVGTGGIVGSKGDGIPDRFSVLGSGLNRYLTIKNIQEEDESDYHCGADHGSGS





NFV













Protein Name
Uniprot
IPI
Gene Name





Ras Related Protein Rab-30
Q15771
IPI00302030
RAB30_Human










Trypsin Fragments












 1. LQIWDTAGQER
 2. SMEDYDFLFK










10         20         30         40


M(2)SMEDYDFLF KIVLIGNAGV GKTCLVRRFT QGLFPPGQGA





50         60         70         80


TIGVDFMIKT VEINGEKVK(1)L QIWDTAGQER FRSITQSYYR





90         100        110        120


SANALILTYD ITCEESFRCL PEWLREIEQY ASNKVITVLV





130        140        150        160


GNKIDLAERR EVSQQRAEEF SEAQDMYYLE TSAKESDNVE





170        180        190        200


KLFLDLACRL ISEAPQNTLV NHVSSPLPGE GKSISYLTCC 





NFN









APPENDIX II

Sequences shown to be down regulated in subjects with cancer:

















Protein Name
Uniprot
IPI
Gene Name





Isoform 1 of growth arrest-specific protein 6
Q14393
IPI00412410
GAS6_Human










Trypsin Fragments












 1. CEQVCVNSPGSYTCHCDGR
 2. GQSEVSAAQLQER
 3. IAVAGDLFQPER
 4. MFSGTPVIR
 5. MQCFSVTER
 6. NSGFATCVQNLPDQCTPNPCDR










10         20         30         40


MAPSLSPGPA ALRRAPQLLL LLLAAECALA ALLPAREATQ





50         60         70         80


FLRPRQRRAF QVFEEAKQGH LERECVEELC SREEAREVFE





90         100        110        120


NDPETDYFYP RYLDCINKYG SPYTK(6)NSGFA TCVQNLPDQC





130        140        150        160



TPNPCDRKGT QACQDLMGNF FCLCKAGWGG RLCDKDVNEC






170        180        190        200


SQENGGCLQI CHNKPGSFHC SCHSGFELSS DGRTCQDIDE





210        220        230        240


CADSEACGEA RCKNLPGSYS CLCDEGFAYS SQEKACRDVD





250        260        270        280


ECLQGR(1)CEQV CVNSPGSYTC HCDGRGGLKL SQDMDTCELE





290        300        310        320


AGWPCPRHRR DGSPAARPGR GAQGSRSEGH IPDRRGPRPW





330        340        350        360


QDILPCVPFS VAKSVKSLYL GR(4)MFSGTPVI RLRFKRLQPT





370        380        390        400


RLVAEFDFRT FDPEGILLFA GGHQDSTWIV LALRAGRLEL





410        420        430        440


QLRYNGVGRV TSSGPVINHG MWQTISVEEL ARNLVIKVNR





450        460        470        480


DAVMK(3)IAVAG DLFQPERGLY HLNLTVGGIP FHEKDLVQPI





490        500        510        520


NPRLDGCMRS WNWLNGEDTT IQETVKVNTR (5)MQCFSVTERG





530        540        550        560


SFYPGSGFAF YSLDYMRTPL DVGTESTWEV EVVAHIRPAA





570        580        590        600


DTGVLFALWA PDLRAVPLSV ALVDYHSTKK LKKQLVVLAV





610        620        630        640


EHTALALMEI KVCDGQEHVV TVSLRDGEAT LEVDGTR(2)GQS





650        660        670        680



EVSAAQLQER LAVLERHLRS PVLTFAGGLP DVPVTSAPVT






690        700        710        720


AFYRGCMTLE VNRRLLDLDE AAYKHSDITA KSCPPVEPAA













Protein Name
Uniprot
IPI
Gene Name





Cathepsin L1
P07711
IPI00012887
CTSL1_Human










Trypsin Fragments












 1. HSFTMAMNAFGDMTSEEFR
 2. LYGMNEEGWR
 3. NHCGIASAASYPV
 4. NSWGEEWGMGGYVK










10         20         30         40


MNPTLILAAF CLGIASATLT FDHSLEAQWT KWKAMGNR(2)LY





50         60         70         80



GMNEEGWRRA VWEKNMKMIE LHNQEYREGK (1)HSFTMAMNAF






90         100        110        120



GDMTSEEFRQ VMNGFQNRKP RKGKVFQEPL FYEAPRSVDW






130        140        150        160


REKGYVTPVK NQGQCGSCWA FSATGALEGQ MFRKTGRLIS





170        180        190        200


LSEQNLVDCS GPQGNEGCNG GLMDYAFQYV QDNGGLDSEE





210        220        230        240


SYPYEATEES CKYNPKYSVA NDTGFVDIPK QEKALMKAVA





250        260        270        280


TVGPISVAID AGHESFLFYK EGIYFEPDCS SEDMDHGVLV





290        300        310        320


VGYGFESTES DNNKYWLVK(4)N SWGEEWGMGG YVKMAKDRR(3)N





330



HCGIASAASY PTV














Protein Name
Uniprot
IPI
Gene Name





Secreted frizzled-related protein 1
Q8N474
IPI00749245
SFRP1_Human










Trypsin Fragments












 1. FYTKPPQCVDIPADLR
 2. LCHNVGYK
 3. LCHNVGYKK
 4. MVLPNLLEHETMAEVK
 5. PQGTTVCPPCDNELK
 6. QQASSWVPLLNK
 7. SEAIIEHLCASEFALR
 8. SQYLLTAIHK










10         20         30         40


MGIGRSEGGR RGAALGVLLA LGAALLAVGS ASEYDYVSFQ





50         60         70         80


SDIGPYQSGR (1)FYTKPPQCVDIPADLR(2,3)LCHN VGYKK(4)MVLPN





90         100        110        120



LLEHETMAEV K(6)QQASSWVPL LNKNCHAGTQ VFLCSLFAPV






130        140        150        160


CLDRPIYPCR WLCEAVRDSC EPVMQFFGFY WPEMLKCDKF





170        180        190        200


PEGDVCIAMT PPNATEASK(5)P QGTTVCPPCD NELK(7)SEAIIE





210        220        230        240



HLCASEFALR MKIKEVKKEN GDKKIVPKKK KPLKLGPIKK






250        260        270        280


KDLKKLVLYL KNGADCPCHQ LDNLSHHFLI MGRKVK(8)SQYL





290        300        310



LTAIHKWDKK NKEFKNFMKK MKNHECPTFQ SVFK














Protein Name
Uniprot
IPI
Gene Name





Bactericidal permeability-increasing protein
P17213
IPI00827847
BPI_Human










Trypsin Fragments












 1. GLDYASQQGTAALQK
 2. IKIPDYSDSFK










10         20         30         40


MRENMARGFC NAPRWASLMV LVAIGTAVTA AVNPGVVVRI





50         60         70         80


SQK(1)GLDYASQ QGTAALQKEL KR(2)IKIPDYSD SFKIKHLGKG





90         100        110        120


HYSFYSMDIR EFQLPSSQIS MVPNVGLKFS ISNANIKISG





130        140        150        160


KWKAQKRFLK MSGNFDLSIE GMSISADLKL GSNPTSGKPT





170        180        190        200


ITCSSCSSHI NSVHVHISKS KVGWLIQLFH KKIESALRNK





210        220        230        240


MNSQVCEKVT NSVSSELQPY FQTLPVMTKI DSVAGINYGL





250        260        270        280


VAPPATTAET LDVQMKGEFY SENHHNPPPF APPVMEFPAA





290        300        310        320


HDRMVYLGLS DYFFNTAGLV YQEAGVLKMT LRDDMIPKES





330        340        350        360


KFRLTTKFFG TFLPEVAKKF PNMKIQIHVS ASTPPHLSVQ





370        380        390        400


PTGLTFYPAV DVQAFAVLPN SSLASLFLIG MHTTGSMEVS





410        420        430        440


AESNRLVGEL KLDRLLLELK HSNIGPFPVE LLQDIMNYIV





450        460        470        480


PILVLPRVNE KLQKGFPLPT PARVQLYNVV LQPHQNFLLF





GADVVYK













Protein Name
Uniprot
IPI
Gene Name





Chitinase domain containing protein 1
Q9BWS9-2
IPI00306719
CHID1_Human










Trypsin Fragments












 1. GLHIVPR
 2. GLVVTDLK
 3. NVLDSEDEIEELSK
 4. SQFSDKPVQDR
 5. YIQTLK










10         20         30         40


MRTLFNLLWL ALACSPVHTT LSKSDAKKAA SKTLLEK(4)SQF





50         60         70         80



SDKPVQDR(2)GL VVTDLKAESV VLEHRSYCSA KARDRHFAGD






90         100        110        120


VLGYVTPWNS HGYDVTKVFG SKFTQISPVW LQLKRRGREM





130        140        150        160


FEVTGLHDVD QGWMRAVRKH AK(1)GLHIVPRL LFEDWTYDDF





170        180        190        200


R(3)NVLDSEDEI EELSKTVVQV AKNQHFDGFV VEVWNQLLSQ





210        220        230        240


KRVGLIHMLT HLAEALHQAR LLALLVIPPA ITPGTDQLGM





250        260        270        280


FTHKEFEQLA PVLDGFSLMT YDYSTAHQPG PNAPLSWVRA





290        300        310        320


CVQVLDPKSK WRSKILLGLN FYGMDYATSK DAREPVVGAR





330        340        350        360



(5)YIQTLKDHRP RMVWDSQASE HFFEYKKSRS GRHVVFYPTL






370        380        390


KSLQVRLELA RELGVGVSIW ELGQGLDYFY DLL













Protein Name
Uniprot
IPI
Gene Name





Moesin
P26038
IPI00219365
MSN_Human










Trypsin Fragments












 1. ALELEQER
 2. ALTSELANAR
 3. AQMVQEDLEK
 4. ESEAVEWQQK
 5. ISQLEMAR
 6. IGFPWSEIR










10         20         30         40


MPKTISVRVT TMDAELEFAI QPNTTGKQLF DQVVKTIGLR





50         60         70         80


EVWFFGLQYQ DTKGFSTWLK LNKKVTAQDV RKESPLLFKF





90         100        110        120


RAKFYPEDVS EELIQDITQR LFFLQVKEGI LNDDIYCPPE





130        140        150        160


TAVLLASYAV QSKYGDFNKE VHKSGYLAGD KLLPQRVLEQ





170        180        190        200


HKLNKDQWEE RIQVWHEEHR GMLREDAVLE YLKIAQDLEM





210        220        230        240


YGVNYFSIKN KKGSELWLGV DALGLNIYEQ NDRLTPK(6)IGF





250        260        270        280



PWSEIRNISF NDKKFVIKPI DKKAPDFVFY APRLRINKRI






290        300        310        320


LALCMGNHEL YMRRRKPDTI EVQQMKAQAR EEKHQKQMER





330        340        350        360


AMLENEKKKR EMAEKEKEKI EREKEELMER LKQIEEQTKK





370        380        390        400


AQQELEEQTR R(1)ALELEQERK RAQSEAEKLA KERQEAEEAK





410        420        430        440


EALLQASRDQ KKTQEQLALE MAELTAR(5)ISQ LEMARQKK(4)ES





450        460        470        480



EAVEWQQK(3)AQ MVQEDLEKTR AELKTAMSTP HVAEPAENEQ






490        500        510        520


DEQDENGAEA SADLRADAMA KDRSEEERTT EAEKNERVQK





530        540        550        560


HLK(2)ALTSELA NARDESKKTA NDMIHAENMR LGRDKYKTLR





570


QIRQGNTKQR IDEFESM













Protein Name
Uniprot
IPI
Gene Name





Isoform 2 of Endoplasmic reticulum aminopeptidase 1
Q9NZ08-2
IPI00165949
ERAP1_Human










Trypsin Fragments












 1. GACILNMLR
 2. ILASTQFEPTAAR
 3. SQIEFALCR










10         20         30         40


MVFLPLKWSL ATMSFLLSSL LALLTVSTPS WCQSTEASPK





50         60         70         80


RSDGTPFPWN KIRLPEYVIP VHYDLLIHAN LTTLTFWGTT





90         100        110        120


KVEITASQPT STIILHSHHL QISRATLRKG AGERLSEEPL





130        140        150        160


QVLEHPRQEQ IALLAPEPLL VGLPYTVVIH YAGNLSETFH 





170        180        190        200


GFYKSTYRTK EGELR(2)ILAST QFEPTAARMA FPCFDEPAFK





210        220        230        240


ASFSIKIRRE PRHLAISNMP LVKSVTVAEG LIEDHFDVTV





250        260        270        280


KMSTYLVAFI ISDFESVSKI TKSGVKVSVY AVPDKINQAD





290        300        310        320


YALDAAVTLL EFYEDYFSIP YPLPKQDLAA IPDFQSGAME





330        340        350        360


NWGLTTYRES ALLFDAEKSS ASSKLGITMT VAHELAHQWF





370        380        390        400


GNLVTMEWWN DLWLNEGFAK FMEFVSVSVT HPELKVGDYP





410        420        430        440


FGKCFDAMEV DALNSSHPVS TPVENPAQIR EMFDDVSYDK





450        460        470        480



(1)GACILNMLRE YLSADAFKSG IVQYLQKHSY KNTKNEDLWD






490        500        510        520


SMASICPTDG VKGMDGFCSR SQHSSSSSHW HQEGVDVKTM





530        540        550        560


MNTWTLQKGF PLITITVRGR NVHMKQEHYM KGSDGAPDTG





570        580        590        600


YLWHVPLTFI TSKSDMVHRF LLKTKTDVLI LPEEVEWIKF





610        620        630        640


NVGMNGYYIV HYEDDGWDSL TGLLKGTHTA VSSNDRASLI





650        660        670        680


NNAFQLVSIG KLSIEKALDL SLYLKHETEI MPVFQGLNEL





690        700        710        720


IPMYKLMEKR DMNEVETQFK AFLIRLLRDL IDKQTWTDEG





730        740        750        760


SVSERMLRSQ LLLLACVHNY QPCVQRAEGY FRKWKESNGN





770        780        790        800


LSLPVDVTLA VFAVGAQSTE GWDFLYSKYQ FSLSSTEK(3)SQ





810        820        830        840



IEFALCRTQN KEKLQWLLDE SFKGDKIKTQ EFPQILTLIG






850        860        870        880


RNPVGYPLAW QFLRKNWNKL VQKFELGSSS IAHMNMGTTN





890        900        910        920


QFSTRTRLEE VKGFFSSLKE NGSQLRCVQQ TIETIEENIG





930        940


WMDKNFDKIR VWLQSEKLER M













Protein Name
Uniprot
IPI
Gene Name





Isoform 1 of glutaminyl-peptide cyclotransferase
Q16769-1
IPI00003919
QPCT










Trypsin Fragments












 1. MASTPHPPGAR
 2. YPGSPGSYAAR










10         20         30         40


MAGGRHRRVV GTLHLLLLVA ALPWASRGVS PSASAWPEEK





50         60         70         80


NYHQPAILNS SALRQIAEGT SISEMWQNDL QPLLIER(2)YPG





90         100        110        120



SPGSYAARQH IMQRIQRLQA DWVLEIDTFL SQTPYGYRSF






130        140        150        160


SNIISTLNPT AKRHLVLACH YDSKYFSHWN NRVFVGATDS





170        180        190        200


AVPCAMMLEL ARALDKKLLS LKTVSDSKPD LSLQLIFFDG





210        220        230        240


EEAFLHWSPQ DSLYGSRHLA AK(1)MASTPHPP GARGTSQLHG





250        260        270        280


MDLLVLLDLI GAPNPTFPNF FPNSARWFER LQAIEHELHE





290        300        310        320


LGLLKDHSLE GRYFQNYSYG GVIQDDHIPF LRRGVPVLHL





330        340        350        360


IPSPFPEVWH TMDDNEENLD ESTIDNLKNI LQVFVLEYLH





L













Protein Name
Uniprot
IPI
Gene Name





Isoform 1 of Attractin
O75882
IPI00027235
ATRN_Human










Trypsin Fragments












 1. CTWLIEGQPNR
 2. GDECQLCEVENR
 3. GVKGDECQLCEVENR
 4. LADDLYR
 5. IMQSSQSMSK
 6. LTGSSGFVTDGPGNYK
 7. SCALDQNCQWEPR










10         20         30         40


MVAAAAATEA RLRRRTAATA ALAGRSGGPH WDWDVTRAGR





50         60         70         80


PGLGAGLRLP RLLSPPLRPR LLLLLLLLSP PLLLLLLPCE





90         100        110        120


AEAAAAAAAV SGSAAAEAKE CDRPCVNGGR CNPGTGQCVC





130        140        150        160


PAGWVGEQCQ HCGGRFR(6)LTG SSGFVTDGPG NYKYKTK(1)CTW





170        180        190        200



LIEGQPNRIM RLRFNHFATE CSWDHLYVYD GDSIYAPLVA






210        220        230        240


AFSGLIVPER DGNETVPEVV ATSGYALLHF FSDAAYNLTG





250        260        270        280


FNITYSFDMC PNNCSGRGEC KISNSSDTVE CECSENWKGE





290        300        310        320


ACDIPHCTDN CGFPHRGICN SSDVRGCSCF SDWQGPGCSV





330        340        350        360


PVPANQSFWT REEYSNLKLP RASHKAVVNG NIMWVVGGYM





370        380        390        400


FNHSDYNMVL AYDLASREWL PLNRSVNNVV VRYGHSLALY





410        420        430        440


KDKIYMYGGK IDSTGNVTNE LRVFHIHNES WVLLTPKAKE





450        460        470        480


QYAVVGHSAH IVTLKNGRVV MLVIFGHCPL YGYISNVQEY





490        500        510        520


DLDKNTWSIL HTQGALVQGG YGHSSVYDHR TRALYVHGGY





530        540        550        560


KAFSANKYR(4)L ADDLYRYDVD TQMWTILKDS RFFRYLHTAV





570        580        590        600


IVSGTMLVFG GNTHNDTSMS HGAKCFSSDF MAYDIACDRW





610        620        630        640


SVLPRPDLHH DVNRFGHSAV LHNSTMYVFG GFNSLLLSDI





650        660        670        680


LVFTSEQCDA HRSEAACLAA GPGIRCVWNT GSSQCISWAL





690        700        710        720


ATDEQEEKLK SECFSKRTLD HDRCDQHTDC YSCTANTNDC





730        740        750        760


HWCNDHCVPR NHSCSEGQIS IFRYENCPKD NPMYYCNKKT





770        780        790        800


SCR(7)SCALDQN CQWEPRNQEC IALPENICGI GWHLVGNSCL





810        820        830        840


KITTAKENYD NAKLFCRNHN ALLASLTTQK KVEFVLKQLR





850        860        870        880



(5)IMQSSQSMSK LTLTPWVGLR KINVSYWCWE DMSPFTNSLL






890        900        910        920


QWMPSEPSDA GFCGILSEPS TRGLKAATCI NPLNGSVCER





930        940        950        960


PANHSAKQCR TPCALRTACG SCTSGSSECM WCSNMKQCVD





970        980        990        1000


SNAYVASFPF GQCMEWYTMS TCPPENCSGY CTCSHCLEQP





1010       1020       1030       1040


GCGWCTDPSN TGKGKCIEGS YKGPVKMPSQ APTGNFYPQP





1050       1060       1070       1080


LLNSSMCLED SRYNWSFIHC PACQCNGHSK CINQSICEKC





1090       1100       1110       1120


ENLTTGKHCE TCISGFYGDP TNGGKCQPCK CNGHASLCNT 





1130       1140       1150       1160


NTGKCFCTTK (3)GVK(2)GDECQLC EVENRYQGNP LRGTCYYTLL





1170       1180       1190       1200


IDYQFTFSLS QEDDRYYTAI NFVATPDEQN RDLDMFINAS





1210       1220       1230       1240


KNFNLNITWA ASFSAGTQAG EEMPVVSKTN IKEYKDSFSN





1250       1260       1270       1280


EKFDFRNHPN ITFFVYVSNF TWPIKIQIAF SQHSNFMDLV





1290       1300       1310       1320


QFFVTFFSCF LSLLLVAAVV WKIKQSCWAS RRREQLLREM





1330       1340       1350       1360


QQMASRPFAS VNVALETDEE PPDLIGGSIK TVPKPIALEP





1370       1380       1390       1400


CFGNKAAVLS VFVRLPRGLG GIPPPGQSGL AVASALVDIS





1410       1420


QQMPIVYKEK SGAVRNRKQQ PPAQPGTC













Protein Name: Uncharacterized protein
Uniprot: (none given)
IPI: IPI00925547
Gene Name: LTF_Human










Trypsin Fragments












 1. ADAVTLDGGFIYEAGLAPYK
 2. ARVVWCAVGEQELR
 3. ARVVWCAVGEQELRK
 4. CAFSSQEPYFSYSGAFK
 5. CFQWQR
 6. CGLVPVLAENYK
 7. CLAENAGDVAFVK
 8. CLRDGAGDVAFIR
 9. CSTSPLLEACEFLRK
10. CSTSPLLEACEFLR
11. CVPNSNER
12. CVPNSNERYYGYTGAFR
13. DCHLAR
14. DEYELLCPDNTR
15. DGAGDVAFIR
16. DGAGDVAFIRESTVFEDLSDEAERDEYELLCPDNTR
17. DLLFKDSAIGFSR
18. DLKLADFALLCLDGK
19. DLKLADFALLCLDGKR
20. DKSPKFQLFGSPSGQK
21. DSAIGFSR
22. DSAIGFSRVPPR
23. DSPIQCIQAIAENR
24. DSPIQCIQAIAENRADAVTLDGGFIYEAGLAPYK
25. DVTVLQNTDGNNNEAWAK
26. DVTVLQNTDGNNNEAWAKDIK
27. GEADAMSLDGGYVYTAGK
28. GGSFQLNELQGLK
29. GPPVSCIK
30. GPPVSCIKR
31. GQFPNLCR











MKLVFLVLLF LGALGLCLAG RRRSVQWCA VSQPEATK(5)CF QWQRNMRK VR(29,30)GPPV






SCIKR(23,24)DS PIQCIQAIAE NR(1)ADAVTLDG GFIYEAGLAP YKLRPVAAE VYGTERQPR






THYYAV AVVKK(28)G GSFQLNELQG LKSCHTGLRR TAGWNVPIGT LRPFLNWTG PPEPIEAAV





ARFFSA SCVPGA DK(31)GQFPNLCR LCAGTGENK(8)C AFSSQEPYFS YSGAFK(8)CLR






(15,16)DGAGDVAFI RESTVFE DLSDE AER(14)DEYELL CPDNTRKPVDK FK(13)DCHLARVP






SHAVVARSV NGKEDAIWN LLRQAQE KFGK(20)D KSPKFQLFG SPSGQK(17)DLLFK






(21,22)DSAIGFSRVP PRIDSGLYL GSGYFTAIQ NLRKSEE EVAAR R(2,3)ARVVWCAV







GEQELRKCNQW SGLSEGSVTC SSASTTEDC IALK(23)GEADA MSLDGGY VYTAG K(6)CGLVPVLA







ENYKSQQSSDP DPNCVDRPVE GYLAVAVVR RSDTSLTWN SVKGKKS CHTAV DRTAGWNIP






MGLLFNQTGSC KFDEYFSQSC APGSDPRSN LCALCIGDE QGENK(11,12)CV PNSNE RYYGYTGAF






R(7)CLAENAGDVA FVK(25,26)DVTVLQN TDGNNNEAW AK(18,19)DLKLADF ALLCLDG KRKPV






TEARSCHLA MAPNHAVVSRM DKVERLKQVL LHQQAKFGR NGSDCPDKF CLFQSET KNLLF





NDNTECLAR LHGKTTYEKYL GPQYVAGITN LKK(9,10)CSTSPL LEACEFLRK
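
The trypsin fragments listed above can be related to the parent sequence with a short in-silico digest. The sketch below is illustrative only and is not taken from the application: it assumes the commonly used trypsin rule (cleavage on the C-terminal side of K or R, except when the next residue is P) and allows a hypothetical number of missed cleavages (`max_missed`), since several listed fragments, such as fragment 3, contain internal K or R residues. The function names are likewise hypothetical.

```python
def cleave(sequence):
    """Split a protein sequence at assumed tryptic sites.

    Assumed rule: cut after K or R unless the following residue is P.
    """
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        pieces.append(sequence[start:])
    return pieces


def tryptic_peptides(sequence, max_missed=2):
    """List peptides containing up to max_missed missed cleavage sites."""
    pieces = cleave(sequence)
    peptides = []
    for i in range(len(pieces)):
        last = min(i + max_missed, len(pieces) - 1)
        for j in range(i, last + 1):
            # Joining adjacent pieces models a skipped cleavage site.
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


# Short excerpt covering fragments 2 and 3 of the list above.
excerpt = "ARVVWCAVGEQELRK"
peptides = tryptic_peptides(excerpt, max_missed=2)
print("ARVVWCAVGEQELR" in peptides, "ARVVWCAVGEQELRK" in peptides)
```

Under these assumptions, fragments 2 and 3 both appear among the generated peptides, which is consistent with the list containing peptides that differ only by a missed cleavage.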








Claims
  • 1. A method of determining whether a subject has cancer comprising: obtaining a sample from the subject; detecting the level of at least one of the markers listed in Table 2A or Table 2B in the sample; and determining that the subject has or is likely to have cancer if the levels of the markers in Table 2A are increased or if the levels of the markers in Table 2B are decreased as compared to the levels in a control (non-cancer) sample.
  • 2. The method of claim 1, wherein the cancer is breast cancer.
  • 3. The method of any one of the preceding claims, wherein the sample is an ocular wash sample.
  • 4. The method of any one of the preceding claims, wherein the subject is a human.
  • 5. The method of any one of the preceding claims, wherein the markers of Table 2A are increased at least 2 fold, 4 fold, 5 fold, 8 fold, 10 fold or more relative to the level of the marker in the control sample.
  • 6. The method of any one of the preceding claims, wherein the markers of Table 2B are decreased at least 1.5 fold, 2 fold, 3 fold, 4 fold or more relative to the level of the marker in the control sample.
  • 7. The method of any one of the preceding claims, wherein a combination of markers is used to determine the likelihood of cancer in the subject.
  • 8. The method of any one of the preceding claims, wherein the level of the marker is detected by liquid chromatography-mass spectrometry.
  • 9. The method of any one of the preceding claims, wherein the level of the marker is detected by an antibody-based detection method.
  • 10. The method of any one of the preceding claims, wherein the level of the marker is detected by an mRNA detection method.
  • 11. The method of any one of the preceding claims, wherein the subject is suspected of having cancer.
  • 12. The method of any one of the preceding claims, wherein the subject is at increased risk for developing a cancer.
  • 13. The method of claim 2, wherein the subject has a palpable lump suspected of being cancerous.
  • 14. The method of any one of the preceding claims, wherein the cancer is detected as a stage II breast cancer.
  • 15. The method of any one of the preceding claims, wherein at least three markers from Table 2A and Table 2B are used in combination.
  • 16. A kit for performing the method of any one of the preceding claims comprising saline solution, a collection tube and a saline collection device.
  • 17. The kit of claim 16, further comprising an antibody capable of binding to at least one of the markers of Table 2A or Table 2B.
  • 18. The kit of claim 16, further comprising a primer pair capable of amplifying at least one of the markers of Table 2A or Table 2B.
  • 19. The kit of any one of claims 16-18, further comprising a protease inhibitor or other protein stabilizing agent.
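
The fold-change criteria recited in claims 5 and 6 can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the applicants' procedure: the function names, the `direction` labels, and the numeric levels are hypothetical, and it assumes that marker levels are positive quantities measured on the same scale in the subject and control samples.

```python
def fold_change(sample_level, control_level):
    """Ratio of the subject's marker level to the control level."""
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("levels must be positive to form a ratio")
    return sample_level / control_level


def meets_fold_threshold(sample_level, control_level, direction, min_fold):
    """Check a marker against a minimum fold increase or decrease.

    direction: "increased" for a Table 2A-type marker,
               "decreased" for a Table 2B-type marker (hypothetical labels).
    """
    ratio = fold_change(sample_level, control_level)
    if direction == "increased":
        return ratio >= min_fold
    if direction == "decreased":
        # A k-fold decrease means the control is k times the sample level.
        return 1.0 / ratio >= min_fold
    raise ValueError("direction must be 'increased' or 'decreased'")


# Illustrative values only: a 4-fold increase tested against a 2-fold minimum.
print(meets_fold_threshold(8.0, 2.0, "increased", 2.0))
# Illustrative values only: a 2-fold decrease tested against a 1.5-fold minimum.
print(meets_fold_threshold(1.0, 2.0, "decreased", 1.5))
```

Expressing a decrease as the inverse ratio keeps both thresholds on the same "at least k fold" scale used in the claims.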
CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of provisional patent application No. 61/991,061 filed on 9 May 2014, which is hereby incorporated by reference in its entirety.

Provisional Applications (1)
Number Date Country
61991061 May 2014 US