The present invention relates to the diagnosis, particularly differential diagnosis, of renal diseases.
The number of patients presenting with renal diseases has been increasing in recent years. Thus, renal diseases present an increasing problem to the health system. Many renal diseases are irreversible; therefore, an early diagnosis and/or a differential diagnosis of renal diseases is important. Early diagnosis and a therapy precisely tailored to each particular disease could reduce the number of patients requiring dialysis and could also reduce the high cardiovascular risk of the patients.
Currently, precise diagnosis and/or differential diagnosis relies mostly on kidney biopsies. Although biopsies serve as the current “gold standard” in renal diagnostics, biopsies have the disadvantage of being invasive and therefore being conducted only on selected patients.
Urine analysis is a different approach to diagnosing renal diseases. However, currently only a few parameters of urine are routinely measured, for example creatinine, urea, albumin, blood cells (such as leukocytes and erythrocytes), bacteria, sugar, urobilinogen, bilirubin and pH value. The diagnostic value of these analyses is limited, as they lack sufficient sensitivity and/or selectivity, particularly for differential diagnosis.
Several attempts have been made to analyze the proteins contained in urine.
V. Thongboonkerd et al. have used two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) in combination with matrix-assisted laser desorption ionization-time-of-flight (MALDI-TOF) mass spectrometry followed by mass fingerprinting to investigate normal human urinary proteins. A total of 67 protein forms of 47 unique proteins was identified (V. Thongboonkerd et al. (2002). Proteomic analysis of normal human urinary proteins isolated by acetone precipitation or ultracentrifugation. Kidney International, vol. 62, p. 1461-1469).
C. S. Spahr et al. have digested the proteins contained in urine samples with trypsin and identified 751 peptides from 124 proteins by means of liquid chromatography-tandem mass spectrometry (C. S. Spahr et al. (2001). Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. I. Profiling an unfractionated tryptic digest. Proteomics vol. 1, p. 93-107).
These studies relate only to healthy individuals. The studies have not addressed the question whether alterations in presence of urinary polypeptides can be used for diagnosis or differential diagnosis of renal diseases.
It has been proposed to use the presence or absence of polypeptides in urine for the diagnosis of membranous glomerulonephritis (MGN) (von Neuhoff et al. (2004). Mass Spectrometry for the Detection of Differentially Expressed Proteins: A Comparison of Surface-Enhanced Laser Desorption/Ionization and Capillary Electrophoresis Mass Spectrometry. Rapid Communications in Mass Spectrometry, vol. 18: 149-156). However, samples of only 8 patients were used in the study, which was mainly concerned with the comparison of different analysis methods. The actual diagnostic value of the markers has remained unclear.
Consequently, there is a need for fast and simple methods and means for diagnosis, particularly differential diagnosis, of renal diseases.
Accordingly, an object of certain embodiments of the invention is to provide methods and means for the diagnosis of renal diseases, particularly for differential diagnosis of renal diseases. It is a particular object of certain embodiments of the invention to provide methods and means for the diagnosis and/or differential diagnosis of IgA-nephropathy, which is the most common glomerulopathy.
According to a first aspect of the present invention, the problem is solved by the use of the presence of at least one polypeptide marker in a urine sample for the diagnosis, preferably the differential diagnosis, of a renal disease, wherein the polypeptide marker is selected from the group of polypeptide markers as shown in tables 1 to 22.
In the context of the present invention, it has been found that with the help of the polypeptide markers as shown in tables 1 to 22 it is possible to reliably diagnose or differentially diagnose, respectively, different renal diseases.
The present invention has numerous advantages compared to the state of the art. First, the presence of the polypeptide markers according to the invention can be determined in urine samples. Therefore, there is no need to take biopsies. Thus, the present invention allows a simplified and fast diagnosis of renal diseases, making it possible to screen patients regularly for the presence of renal diseases and to diagnose renal diseases at early stages. Furthermore, the polypeptide markers according to the invention can be used for differential diagnosis between different renal diseases. The high number of markers identified according to the present invention makes it possible to increase both specificity and sensitivity of diagnosis as compared to the use of only a single or a small number of markers. Also, the present invention provides methods which allow said polypeptide markers to be measured without the use of specific ligands such as antibodies or aptamers.
The polypeptide markers as shown in the tables have been identified by a method named capillary electrophoresis-mass spectrometry (CE-MS), which will be described further below. Furthermore, the method has been described in detail in von Neuhoff et al. (2004) (Mass Spectrometry for the Detection of Differentially Expressed Proteins: A Comparison of Surface-Enhanced Laser Desorption/Ionization and Capillary Electrophoresis/Mass Spectrometry. Rapid Communications in Mass Spectrometry, vol. 18: 149-156). Starting from the parameters defining the polypeptide markers, it is possible by methods known in the art to identify the sequence of the corresponding polypeptides and then to synthesize or produce the corresponding polypeptides, e.g. with the help of protein synthesis or expression of the corresponding gene in appropriate cells.
The markers are defined by their mass and their migration time in capillary electrophoresis (CE), particularly the mass and migration time obtained according to Example 1. It is known that CE migration times can vary, typically in the range of 5 minutes, more typically in the range of 3 minutes. However, the sequence of markers being eluted is typically the same or very similar for each CE system applied. The system can be calibrated by use of polypeptides which are present in almost any urine sample, e.g. by the polypeptides given in tables 23 or 24. Furthermore, the polypeptides given in SEQ ID NO: 1 to SEQ ID NO: 5 can serve for calibration.
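By way of illustration only, calibration against internal standards could be sketched as follows. The sketch assumes that a linear relation between observed and reference migration times is adequate, which the description does not state; the function names and all numeric values are hypothetical and are not taken from tables 23 or 24.

```python
def fit_linear_calibration(observed, reference):
    """Least-squares fit: reference is approximately slope * observed + intercept.

    observed  -- migration times (min) of the internal standards in this run
    reference -- migration times (min) of the same standards in the reference run
    """
    n = len(observed)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two paired standards")
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    if sxx == 0:
        raise ValueError("observed times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(time_min, slope, intercept):
    """Map an observed migration time onto the reference time scale."""
    return slope * time_min + intercept


# Hypothetical standards: this run is shifted by one minute against the reference.
observed_std = [21.0, 26.0, 31.0]
reference_std = [20.0, 25.0, 30.0]
slope, intercept = fit_linear_calibration(observed_std, reference_std)
calibrated = calibrate(28.0, slope, intercept)
```

A different calibration function (e.g. piecewise or higher order) may be more appropriate for a given CE system; the linear form is chosen here only for brevity.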
Variation of the masses between measurements or between different mass spectrometers is relatively small; typically it is in the range of plus or minus 0.05%.
In table 1, polypeptide markers are listed which are preferred for the discrimination between healthy individuals and individuals suffering from a renal disease, particularly from a glomerulonephritis or glomerulopathy.
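As an illustrative sketch, a measured signal could be assigned to a marker by checking both parameters against tolerance windows. The default tolerances below reuse the typical figures given above (plus or minus 0.05% for mass, 3 minutes for migration time); the function name and the example marker values are hypothetical.

```python
def matches_marker(mass_da, time_min, marker_mass_da, marker_time_min,
                   rel_mass_tol=0.0005, time_tol_min=3.0):
    """Return True if a measured signal falls within both tolerance windows.

    rel_mass_tol -- relative mass tolerance (0.0005 corresponds to 0.05%)
    time_tol_min -- absolute migration-time tolerance in minutes
    """
    mass_ok = abs(mass_da - marker_mass_da) <= rel_mass_tol * marker_mass_da
    time_ok = abs(time_min - marker_time_min) <= time_tol_min
    return mass_ok and time_ok


# Hypothetical marker at 2000.0 Da and 25.0 min (mass window: about 1 Da).
inside = matches_marker(2000.8, 26.5, 2000.0, 25.0)      # within both windows
mass_off = matches_marker(2001.5, 25.0, 2000.0, 25.0)    # mass outside window
time_off = matches_marker(2000.0, 29.0, 2000.0, 25.0)    # time outside window
```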
In table 2, polypeptide markers are listed, which are preferred for a discrimination between FSGS and the healthy condition.
In table 3, polypeptide markers are listed, which can be used for differential diagnosis between FSGS and MCD.
In table 4, polypeptide markers are listed, which are preferred for a differential diagnosis of FSGS and MGN.
In table 5, polypeptide markers are listed, which are preferred for a differential diagnosis between FSGS on the one hand, and MCD or MGN on the other hand.
In table 6, polypeptide markers are listed, which are preferred for diagnosis of MCD as compared to the healthy condition.
In table 7, polypeptide markers are listed, which are preferred for differential diagnosis between MCD and MGN.
In table 8, polypeptide markers are listed, which are preferred for differential diagnosis between MCD on the one hand, and FSGS or MGN on the other hand.
In table 9, polypeptide markers are listed, which are preferred for diagnosis of MGN as compared to the healthy condition.
In table 10, polypeptide markers are listed, which are preferred for differential diagnosis between MGN on the one hand, and FSGS or MCD on the other hand.
In table 11, polypeptide markers are listed, which are preferred for diagnosis of IgA-nephropathy or MGN as compared to the healthy condition.
In table 12, polypeptide markers are listed, which are preferred for diagnosis of IgA-nephropathy as compared to the healthy condition.
In table 13, polypeptide markers are listed, which are preferred for differential diagnosis between IgA-nephropathy and MGN.
In table 14, polypeptides are listed with their respective frequency in healthy, FSGS, MCD, and MGN patients.
In table 15, polypeptides are listed which have been used for differential diagnosis between healthy individuals and renal patients using support vector machines according to Example 1.
In table 16, polypeptides are listed which have been used for differential diagnosis between healthy, FSGS, MCD, and MGN patients using random forest analysis according to Example 1.
In table 17, polypeptides are listed which have been used for differential diagnosis between MCD and MGN patients using support vector machines according to Example 1.
In table 18, polypeptides are listed which have been used for differential diagnosis between MCD and FSGS patients using support vector machines according to Example 1.
In table 19, polypeptides are listed which have been used for differential diagnosis between MGN and FSGS patients using support vector machines according to Example 1.
In tables 20 and 21, polypeptides are listed which have been identified in von Neuhoff et al. (2004), which has been cited above.
In table 22, polypeptides are listed which can be used for diagnosis of diabetes and/or diabetic nephropathy.
In table 23, polypeptides are listed, which are preferred as internal standards to standardize the CE-time.
In table 24, polypeptides are listed, which are preferred as internal standards to standardize the CE-time if the pressure method (0.3 to 1 psi) according to Example 1 is used. These standards are e.g. preferred as internal standards in diagnosis of IgA-nephropathy.
In table 25, clinical data of renal patients are listed whose samples were used for identification of polypeptide markers according to Example 1. Abbreviations: CsA, Cyclosporin A; PS, prednisolone; +, frequent relapse; −, currently no immunosuppression; *, clinically unclear whether MCD or FSGS.
The polypeptide markers used according to the present invention can be identified and their presence can be measured in urine samples. Urine samples can be taken as known in the state of the art. Preferably, midstream urine is used in the context of the present invention.
The polypeptide markers used according to the present invention can be gene expression products such as proteins, peptides, and fragments or other degradation products of proteins or peptides. They can be modified by posttranslational modifications, e.g. by glycosylation, phosphorylation, alkylation or disulfide bond formation. It is known that fragments and degradation products can have a different diagnostic value and/or physiological role than the protein or peptide they have been derived from. For example, in different diseases, different proteolytic degradation products or fragments can be found. It is also considered to be within the scope of the present invention if the urine sample is pretreated to chemically modify the polypeptide markers contained in the urine and to measure these chemically modified polypeptide markers. The polypeptide markers according to the present invention have a molecular mass between 400 and 20000 Da, particularly between 700 and 14000 Da, more particularly between 800 and 11000 Da.
Preferred polypeptide markers according to the present invention are listed in tables 1 to 22, particularly in tables 1 to 21, more particularly in tables 1 to 13.
Preferred polypeptides for use as internal standards are listed in tables 23 to 24.
Preferred are also polypeptide markers which are listed in table 1, but not in table 14 and/or 15 and/or 16 and/or 17 and/or 18 and/or 19 and/or 20 and/or 21 and/or 22.
Preferred are also polypeptide markers which are listed in table 2, but not in table 14 and/or 15 and/or 16 and/or 18.
Preferred are also polypeptide markers which are listed in table 3, but not in table 14 and/or 16 and/or 18.
Preferred are also polypeptide markers which are listed in table 4, but not in table 14 and/or 16 and/or 19.
Preferred are also polypeptide markers which are listed in table 5, but not in table 14 and/or 16 and/or 18 and/or 19.
Preferred are also polypeptide markers which are listed in table 6, but not in table 14 and/or 16.
Preferred are also polypeptide markers which are listed in table 7, but not in table 14 and/or 16 and/or 17.
Preferred are also polypeptide markers which are listed in table 8, but not in table 14 and/or 16.
Preferred are also polypeptide markers which are listed in table 9, but not in table 14 and/or 16 and/or 20 and/or 21.
Preferred are also polypeptide markers which are listed in table 10, but not in table 14 and/or 16.
Preferred are also polypeptide markers which are listed in table 11, but not in table 14 and/or 16.
Renal disease according to the present invention relates to any kind of renal disease or kidney dysfunction known to the person skilled in the art, for example IgA-nephropathy, MGN (membranous glomerulonephritis), MCD (minimal-change disease), FSGS (focal-segmental glomerulosclerosis), or diabetic nephropathy. Particularly, renal disease relates to a glomerulopathy such as IgA-nephropathy, MGN, MCD, or FSGS. Even more particularly renal disease relates to IgA-nephropathy, MCD, or FSGS. Most particularly, renal disease relates to IgA-nephropathy.
The glomerulopathies are a subgroup of renal diseases. Glomerulopathies comprise several diseases of different etiology. Glomerulopathies are characterized by pathomorphological changes in malpighian corpuscles, glomerulus, and Bowman's capsule. As a consequence of these changes, further pathomorphological changes may appear in other parts of the nephron and interstitium.
IgA-nephropathy is also known as Berger-Nephritis. IgA-nephropathy is the most common glomerulopathy. It may be a specific, kidney-limited, form of purpura Schoenlein-Henoch (also known as anaphylactoid purpura) with increased plasma concentration of IgA. The histopathology includes all forms of glomerular lesions and deposits of IgA in the mesangium. Clinically, IgA-nephropathy presents as micro- and macro-hematuria. Therapy may be attempted with ACE inhibitors and omega-3 fatty acids. Progression of the disease occurs over the course of several years and includes transition into progressive renal insufficiency.
MGN is characterized by thickening of the basal membrane and granular subepithelial IgG deposits. MGN frequently becomes manifest between the ages of 40 and 50. It is frequently caused by medicaments, e.g. gold, D-penicillamine, or ACE inhibitors. Therapy of MGN may be attempted with glucocorticoids or cyclophosphamide. MGN is a nephrotic syndrome; a transition into progressive renal insufficiency may take several years.
MCD is also known as lipoid nephrosis. MCD is the most common cause of a nephrotic syndrome in children. The etiology of the disease is unknown. Histologically, no or only very discrete changes can be found. Therapy of MCD may include treatment with glucocorticoids, cyclosporin A, or cyclophosphamide. In children, the disease spontaneously heals in 90% of the cases, in adults in 50% of the cases. A transition into FSGS is possible.
FSGS is also known as IgM-nephropathy. FSGS is typically characterized by deposits of IgM and C3 in the mesangium. Clinically, it becomes manifest as a nephrotic syndrome. Therapy of FSGS may include treatment with glucocorticoids, cyclosporin A, or cyclophosphamide. Prognosis is poor and includes transition into progressive renal insufficiency.
Diabetic nephropathy is also known as diabetic glomerulosclerosis. Diabetic nephropathy is the most common cause for requirement of dialysis treatment.
In summary, it is evident that renal diseases include a variety of diseases which may show quite similar histology. However, etiology, treatment, and prognosis can be quite different for each disease. For example, IgA-nephropathy requires different treatment from any other glomerulopathy described above: In IgA-nephropathy, treatment with ACE inhibitors may be attempted, which would not be recommendable in the case of MGN. Therefore, fast and reliable diagnosis is of great importance for treatment.
In the context of the present invention, diagnosing or diagnosis means that, for an individual patient, the probability of having the respective disease is determined.
Diagnosis may also include confirming a preliminary diagnosis, particularly a preliminary diagnosis established by a different method.
Furthermore, in a preferred embodiment, diagnosis according to the present invention particularly relates to “differential diagnosis”. The term “differential diagnosis” relates to distinguishing between two different diseases, i.e. to determining for an individual patient the probability of having a certain first disease as compared to having a certain second disease. More particularly, differential diagnosis according to the present invention relates to distinguishing between at least two renal diseases chosen from the group consisting of IgA-nephropathy, MGN, MCD, FSGS, and diabetic nephropathy.
In another embodiment, the present invention relates to a method for the differential diagnosis of a renal disease, the method comprising:
Preferably, the individual probabilities according to step b) are as indicated in the tables.
The term “measuring” according to the present invention relates to determining the presence of a polypeptide or other substance of interest.
The decision whether a polypeptide marker is present or absent may depend on definition of a suitable threshold value. The threshold value can either be defined through the sensitivity of the method of measurement, or it can be defined at will. The threshold in the context of the present invention is 25 fmol/μl in a sample which has been injected into a mass spectrometer according to Example 1. However, this threshold may be the same when other methods are used. This threshold coincides with the detection threshold of a typical mass spectrometer. This threshold corresponds approximately to a concentration of the polypeptide marker in the urine sample of 50-5000 pmol/l. If different thresholds are to be used (e.g. when using another detection method), the corresponding probabilities may differ, but can easily be established by the person skilled in the art.
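A minimal sketch of such a presence/absence decision is given below. It assumes that an estimated concentration (in fmol/μl of the injected sample) is available for each marker and that a value equal to the threshold counts as present; both points, as well as the marker names and values, are illustrative assumptions.

```python
# Threshold stated in the description for a sample injected according to Example 1.
PRESENCE_THRESHOLD_FMOL_PER_UL = 25.0


def presence_pattern(concentrations, threshold=PRESENCE_THRESHOLD_FMOL_PER_UL):
    """Convert estimated concentrations into a present/absent pattern.

    concentrations -- dict mapping marker name to concentration in fmol/ul
    Assumption: a concentration equal to the threshold is treated as present.
    """
    return {marker: conc >= threshold for marker, conc in concentrations.items()}


# Hypothetical marker names and concentrations.
pattern = presence_pattern({"marker_a": 40.0, "marker_b": 10.0, "marker_c": 25.0})
```

If a different detection method or threshold is used, only the `threshold` argument changes; the probabilities against which the pattern is later compared would then have to be re-established for that threshold.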
The “disease patient” according to the present invention is suffering from a renal disease. Particularly, the disease is at least one from the group consisting of IgA-nephropathy, MGN, MCD, FSGS, and diabetic nephropathy.
The “control patient” can either be healthy or suffering from a disease different from the one the disease patient is suffering from, i.e. the control patient can either represent the healthy condition or a disease or group of diseases. Particularly, the represented disease is at least one from the group consisting of IgA-nephropathy, MGN, MCD, FSGS, and diabetic nephropathy.
Tables 1 to 14, 16, 20, 21, and 22 list the probability (also designated as “frequency”) of a given polypeptide marker being present in a urine sample of a healthy control patient or a control patient suffering from a certain disease. The discrimination factor indicates the difference between the probability of presence in the disease as compared to a given control condition. The discrimination factor can easily be calculated from the respective probabilities. The higher the discrimination factor, the better is the potential of the given marker to distinguish between the disease and the control condition. An absolute value of the discrimination factor of 0.40 or higher is preferred.
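The calculation can be sketched as follows. The sign convention (probability in the disease minus probability in the control condition) is an assumption, since only the absolute value is referred to above; the function names are hypothetical and the probabilities in the example are illustrative rather than taken from the tables.

```python
def discrimination_factor(p_disease, p_control):
    """Difference between the probabilities of presence (disease minus control)."""
    return p_disease - p_control


def is_preferred(p_disease, p_control, min_abs=0.40):
    """True if the absolute discrimination factor is at least min_abs.

    The value is rounded to suppress floating-point noise at the boundary.
    """
    return round(abs(discrimination_factor(p_disease, p_control)), 9) >= min_abs


# Illustrative probabilities only.
strong_positive = is_preferred(0.73, 0.00)   # marker mostly present in disease
strong_negative = is_preferred(0.10, 0.60)   # marker mostly absent in disease
weak = is_preferred(0.55, 0.30)              # below the preferred threshold
```

A negative factor is as useful as a positive one of the same magnitude: it simply means that absence, rather than presence, of the marker points towards the disease.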
The person skilled in the art is able to establish similar tables for the polypeptide markers by himself and/or to refine the data contained in the tables, e.g. based on further patient data and/or according to different thresholds for the presence of the polypeptide marker.
For diagnosis, the probability of the presence of the polypeptide marker in a disease patient is compared to the probability of the presence of this marker in a control patient, wherein the individual probabilities are as indicated in the tables. If the probability of the presence of this marker in a disease patient is higher than the probability of the presence of this marker in a control patient, then the presence of this marker in the sample is indicative that the patient from whom the sample originates has a higher probability of having the disease rather than the control condition. If the probability of the presence of this marker in a disease patient is lower than the probability of the presence of this marker in a control patient, then the absence of this marker in the sample is indicative that the patient from whom the sample originates has a higher probability of having the disease rather than the control condition.
For example, a given marker may have a probability of 73% of being present in a control representing IgA-nephropathy but a probability of 0% of being present in a control representing the healthy condition. If this marker is present in the sample, then the individual is diagnosed as having a 73% probability of suffering from IgA-nephropathy as compared to being healthy. If this marker is not present in the sample, then the individual is diagnosed as having a 73% probability of being healthy instead of suffering from IgA-nephropathy.
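The qualitative rule described in the two preceding paragraphs can be sketched for a single marker as follows. The function name is hypothetical, and returning `None` for equal probabilities is an added convention for a case that the description does not address.

```python
def indicates_disease(present, p_disease, p_control):
    """Apply the single-marker rule.

    Returns True if the observation points towards the disease, False if it
    points towards the control condition, and None if the marker is
    uninformative (equal probabilities in both conditions).
    """
    if p_disease == p_control:
        return None
    marker_more_frequent_in_disease = p_disease > p_control
    # Presence supports the condition in which the marker is more frequent;
    # absence supports the other condition.
    return present == marker_more_frequent_in_disease


# Example from the text: 73% in IgA-nephropathy, 0% in the healthy condition.
present_case = indicates_disease(True, 0.73, 0.0)    # points to the disease
absent_case = indicates_disease(False, 0.73, 0.0)    # points to the control
# Marker that is rarer in the disease: its absence points to the disease.
inverse_case = indicates_disease(False, 0.10, 0.80)
```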
Thus, diagnosis can be established according to statistical methods familiar to the person skilled in the art.
The invention can be carried out using only one of the polypeptide markers or using a plurality of the polypeptide markers. Preferably, presence of a plurality of polypeptide markers is measured. Preferably at least 3 of the markers, more preferably at least 10 of the markers, even more preferably at least 20, most preferred at least 50 of the markers according to the present invention are measured.
An advantage of the present invention is that it provides a multitude of suitable markers. Measuring a plurality of markers can increase both sensitivity and selectivity of diagnosis. Therefore, also markers which show low discrimination factors between the disease and control can be used for diagnosis if they are combined with other markers.
If a plurality of polypeptide markers is used, a “pattern” is generated which contains the information about the presence for each marker measured. This pattern can then be compared to the pattern of probabilities of presence of the polypeptide markers in a disease or control patient. Each table represents a pattern of probabilities of finding given polypeptide markers in certain disease and control patients.
Therefore, in a preferred embodiment, the present invention relates to a method for the differential diagnosis of a renal disease, the method comprising:
Preferably, the individual probability for the at least one polypeptide marker according to step b) is as indicated in the tables.
Comparison of the found pattern with the probability of finding the pattern in a disease or control patient can be performed according to statistical methods known in the art. Preferably, automated methods are employed, e.g. CART-analysis, random forest analysis, and support vector machines (SVM, see e.g. Xiong, M., et al. (2001). Biomarker identification by feature wrappers. Genome Research vol. 11, p. 1878-1887). Comparison can also be performed simultaneously for several different patterns and the probability of finding them.
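To illustrate the idea of comparing a measured pattern with several probability patterns, the following sketch uses a deliberately simple likelihood comparison that treats the markers as independent. This is not the CART, random forest, or SVM analysis named above, and the independence assumption is a simplification; condition names, marker names, and probabilities are hypothetical.

```python
import math


def log_likelihood(pattern, probabilities, eps=0.01):
    """Log-likelihood of a presence/absence pattern under one condition.

    pattern       -- dict marker -> bool (present or absent in the sample)
    probabilities -- dict marker -> probability of presence in the condition
    eps           -- clamp to avoid log(0) for tabulated values of 0% or 100%
    Assumption: markers are treated as statistically independent.
    """
    total = 0.0
    for marker, present in pattern.items():
        p = min(max(probabilities[marker], eps), 1.0 - eps)
        total += math.log(p if present else 1.0 - p)
    return total


def most_likely_condition(pattern, conditions):
    """Return the name of the condition under which the pattern is most likely."""
    return max(conditions, key=lambda name: log_likelihood(pattern, conditions[name]))


# Hypothetical probability patterns for two conditions.
conditions = {
    "condition_a": {"m1": 0.05, "m2": 0.90, "m3": 0.50},
    "condition_b": {"m1": 0.80, "m2": 0.20, "m3": 0.55},
}
measured = {"m1": True, "m2": False, "m3": True}
best = most_likely_condition(measured, conditions)
```

Because every marker contributes a term to the sum, markers with low individual discrimination still shift the result when combined with others.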
Thus, the measured pattern is typically compared to the probability of finding the pattern in at least two different conditions. An example for diagnosis and differential diagnosis of renal diseases according to this method is shown in
If necessary, the urine samples may be pre-treated before measurement of the polypeptide marker. Particularly, lipids, nucleic acids or polypeptides may be purified from the sample according to methods known in the art, including filtration, centrifugation, or extraction methods such as chloroform/phenol extraction.
Measuring the presence of a polypeptide marker can be done by any method known in the art.
Preferred methods include gas phase ion spectrometry, such as laser desorption/ionization mass spectrometry, surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) and CE-MS. These spectrometry methods allow the polypeptide markers to be measured without the need for ligands such as antibodies or aptamers.
Urine samples generally are highly complex, i.e. they contain numerous polypeptides. In case of high complexity, a spectrometric analysis becomes difficult. To reduce the complexity of the sample, the polypeptides contained in the sample may be separated by any suitable means, e.g. by electrophoretic separation, affinity-based separation, or separation based on ion exchange chromatography. Particular examples include gel electrophoresis, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), capillary electrophoresis, metal-affinity chromatography, immobilized metal-affinity chromatography (IMAC), affinity chromatography based on lectins, liquid chromatography, high pressure liquid chromatography (HPLC), reversed-phase HPLC, cation exchange chromatography, and selectively binding surfaces (such as the surfaces used in SELDI-TOF, see below).
2D-PAGE is commonly used for polypeptide separation and can be combined with mass spectrometry (MS) yielding identification of individual polypeptides. Over 1000 protein spots can be discerned with 2D-PAGE. However, each single spot must be analyzed separately by MS/MS for identification.
SELDI (surface enhanced laser desorption/ionization) time-of-flight mass spectrometry is currently applied in many fields of biomedical sciences.
In the SELDI system, the ProteinChip Arrays are the most important component. They are narrow metal strips carrying 8 or 16 spots in a row on the surface. Samples to be analyzed are directly applied to the spots, either as a standing drop or in volumes up to 500 μl, by using sample holders called “bioprocessors” as supporting units. They are placed onto the arrays during incubation and washing steps and removed again afterwards. The different types of arrays belong to two main series: chromatographic arrays, presenting hydrophobic, hydrophilic, cation-exchanging, anion-exchanging or immobilized metal ion affinity-surfaces, and preactivated arrays with chemical groups to allow the covalent coupling of proteins. Preferably, a chip with cation-exchange surfaces is used. As the ProteinChip Arrays do not only support the sample but specifically interact with the biomolecules, the composition of the analyte depends on the array type used and the washing conditions applied. This explains why the SELDI-process can be defined as a further development of the traditional MALDI (matrix assisted laser desorption/ionization)-technique. In the SELDI-process, only those polypeptides are measured that actually bind to the chip surface.
After binding of sample proteins, the energy absorbing matrix is applied to each spot. The matrix rapidly crystallizes and the analysis can start immediately.
The ProteinChip Arrays are placed into the ProteinChip Reader for analysis. The reader is a TOF (time-of-flight) mass spectrometer in which the proteins are desorbed and ionized with the help of a laser beam. As the crystallized proteins are equally distributed on the spot surface, the ionizing laser beam always hits a representative average of the molecules in the analyte, allowing quantitative calculations. After ionization, the proteins are accelerated by an electric field to fly down the flight tube, before reaching the detector. The flight time between the laser striking the array surface and the molecules reaching the detector at the end of the flight tube enables the system to accurately determine the mass of the protein species present in the sample (for more detailed information on the method see the following review: Merchant M and Weinberger S R (2000). Recent advancements in surface-enhanced laser desorption/ionization-time of flight mass spectrometry. Electrophoresis vol. 21, p. 1164-1177).
However, the most preferred method is CE-MS, in which capillary electrophoresis (CE) is coupled to mass spectrometry (MS). CE-MS has been described in detail elsewhere (see e.g. German patent application DE 100 21 737, and Kaiser, T., et al., Capillary Electrophoresis coupled mass spectrometry to establish polypeptide patterns in dialysis fluids. J Chromatogr A, vol. 1013, p. 157-171 (2003)).
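The relation between flight time and mass can be illustrated with an idealized calculation in which an ion of charge z is accelerated through a potential difference U and then drifts with constant velocity over a field-free length L, so that z·e·U = ½·m·v² and t = L/v. The sketch ignores the time spent in the acceleration region and any initial velocity spread; the acceleration voltage and tube length used in the example are hypothetical and do not describe any particular instrument.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge in coulomb
DALTON_KG = 1.66053906660e-27           # one dalton in kilogram


def flight_time_s(mass_da, charge, voltage_v, tube_length_m):
    """Idealized drift time of an ion in a field-free flight tube."""
    mass_kg = mass_da * DALTON_KG
    energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v
    velocity = math.sqrt(2.0 * energy_j / mass_kg)
    return tube_length_m / velocity


def mass_da_from_flight_time(time_s, charge, voltage_v, tube_length_m):
    """Invert the relation above: m = 2 z e U (t / L)^2."""
    energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v
    mass_kg = 2.0 * energy_j * (time_s / tube_length_m) ** 2
    return mass_kg / DALTON_KG


# Hypothetical instrument: 20 kV acceleration, 1 m flight tube, singly charged ion.
t_10k = flight_time_s(10000.0, 1, 20000.0, 1.0)
t_20k = flight_time_s(20000.0, 1, 20000.0, 1.0)
recovered = mass_da_from_flight_time(t_10k, 1, 20000.0, 1.0)
```

Since the flight time scales with the square root of m/z, doubling the mass lengthens the flight time by a factor of about 1.41 rather than 2.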
CE is known to the person skilled in the art. In brief, the sample is loaded onto an electrophoresis capillary and a voltage of up to 50 kV, typically up to 30 kV, is applied. Typical capillaries are fused silica capillaries, i.e. glass capillaries comprising an outer sheath as mechanical support and to improve mechanical flexibility, e.g. a sheath made of thermoplastic material. Typically, the capillary is untreated, i.e. it shows hydroxy-groups on its inside. However, the capillary may also be coated on the inside. E.g., hydrophobic coating can be used to improve discriminatory power. In addition to the voltage, also pressure may be applied, which is typically in the range of 0 to 1 psi. The pressure can also be applied or increased during the run.
To improve discriminatory power, a stacking protocol can also be applied when loading the sample: Before loading of the sample, a base is loaded, then the sample is loaded, then an acid. The principle is to capture the analyte ions between a base and an acid. If voltage is applied, the positively charged analyte ions move towards the base. There, they get negatively charged and move into the opposite direction towards the acid, where they get positively charged. This stacking repeats itself until acid and base are neutralized. Then, the separation starts from a well concentrated sample.
The sample is contained in an appropriate buffer in which polypeptides are soluble, e.g. phosphate buffer. For CE-MS coupling, it is preferred to use volatile solvents and to work under mostly salt-free conditions to avoid contamination of the MS. Examples comprise acetonitrile, isopropanol, methanol, and the like. The solvents can also be combined with water and a weak acid (e.g. 0.1% formic acid), the latter to protonate the analyte. The polypeptides in the sample are separated according to size and charge, which determine the run-time in the capillary. CE is characterized by high separating power and short time of analysis.
For subsequent MS analysis, either fractions collected from the CE can be analyzed as separate batches or, preferably, the CE system can be coupled via a suitable interface to the mass spectrometer to allow continuous flow analysis. Alternatively, the flow from the CE may be used to generate continuous “separation tracks”, which can be analyzed separately.
In the mass spectrometer, ions generated from the sample are analyzed according to the mass/charge (m/z) quotient. Using mass spectrometry, it is possible to routinely analyze 10 fmol (i.e. 0.1 ng of a 10 kDa polypeptide) with a precision of ±0.01%. Experimentally, it is possible to analyze even less than 0.1 fmol.
Any type of mass spectrometer can be used. In mass spectrometers, an ion-generating device is coupled with a suitable analyzer. For example, the electrospray ionization (ESI) interfaces are most commonly used to produce ions from liquid samples, whereas MALDI is most commonly used to produce ions from individually processed samples. Different kinds of analyzers are available, e.g. ion trap analyzers or time-of-flight (TOF) analyzers. Both ESI and MALDI can be combined with essentially all types of mass spectrometers, although ESI has usually been combined with ion traps, whereas MALDI has usually been combined with TOF.
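The figures quoted above can be checked with a short calculation. The function names are hypothetical; the only inputs are the values stated in the text (10 fmol, 10 kDa, a relative precision of 0.01%).

```python
def amount_to_mass_ng(amount_fmol, molar_mass_da):
    """Convert an amount of substance to a mass in nanograms.

    1 fmol = 1e-15 mol; a molecular mass of 1 Da corresponds to 1 g/mol;
    1 g = 1e9 ng.
    """
    return amount_fmol * 1e-15 * molar_mass_da * 1e9


def absolute_mass_precision_da(mass_da, relative_precision=0.0001):
    """Absolute mass uncertainty for a given relative precision (0.0001 = 0.01%)."""
    return mass_da * relative_precision


mass_ng = amount_to_mass_ng(10.0, 10000.0)            # about 0.1 ng
precision_da = absolute_mass_precision_da(10000.0)    # about 1 Da
```

For a 10 kDa polypeptide, a relative precision of ±0.01% therefore corresponds to roughly ±1 Da.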
A preferred CE-MS method according to the present invention includes capillary electrophoresis coupled online via ESI to a TOF analyzer.
The CE-MS technique permits measuring the presence of several hundred polypeptide markers simultaneously in a short time, in a small volume, and with high sensitivity. Once the presence of the polypeptide markers has been measured, a pattern of the measured polypeptide markers is generated and can be compared to a disease pattern by any of the methods described further above. However, in many cases it will be sufficient for diagnosis to measure only one or a limited number of the markers.
The polypeptide sequences can be determined according to methods well-known to the person skilled in the art (see e.g. C. S. Spahr et al. (2001). Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. I. Profiling an unfractionated tryptic digest, Proteomics vol. 1, p. 93-107).
Depending on the type of polypeptide marker, it is possible to measure its presence or absence by further means. For example, if the polypeptide is biologically active, its presence may be determined by cellular or enzymatic assays.
Presence of a polypeptide can also be determined by use of ligands binding to the polypeptide of interest. Binding according to the present invention includes both covalent and non-covalent binding.
A ligand according to the present invention can be any peptide, polypeptide, nucleic acid, or other substance binding to the polypeptide of interest. It is well known that polypeptides, if obtained or purified from the human or animal body, can be modified, e.g. by glycosylation. A suitable ligand according to the present invention may bind the polypeptide also via such sites.
Preferred ligands include antibodies, nucleic acids, peptides or polypeptides, and aptamers, e.g. nucleic acid or peptide aptamers. For many polypeptides, suitable ligands are commercially available. Furthermore, methods to generate suitable ligands are well-known in the art. For example, identification and production of suitable antibodies or aptamers is also offered by commercial suppliers.
The term “antibody” as used herein includes both polyclonal and monoclonal antibodies, as well as fragments thereof, such as Fv, Fab and F(ab)2 fragments that are capable of binding antigen or hapten.
Preferably, the ligand should bind specifically to the polypeptide to be measured. “Specific binding” according to the present invention means that the ligand should not bind substantially to (“cross-react” with) another polypeptide or substance present in the sample investigated. Preferably, the specifically bound protein or isoform should be bound with at least 3 times higher, more preferably at least 10 times higher and even more preferably at least 50 times higher affinity than any other relevant polypeptide.
Non-specific binding may be tolerable, particularly if the investigated peptide or polypeptide can still be distinguished and measured unequivocally, e.g. according to its size on a Western Blot, or by its relatively higher abundance in the sample.
A method for measuring the presence of a polypeptide of interest may comprise the steps of (a) contacting a polypeptide with a specifically binding ligand, (b) (optionally) removing non-bound ligand, (c) measuring the presence or amount of bound ligand.
Binding of the ligand can be measured by any method known in the art. First, binding of a ligand may be measured directly, e.g. by NMR or surface plasmon resonance. Second, if the ligand also serves as a substrate of an enzymatic activity of the peptide or polypeptide of interest, an enzymatic reaction product may be measured (e.g. the presence of a protease can be measured by measuring the amount of cleaved substrate, e.g. by Western Blot). Third, the ligand may be coupled covalently or non-covalently to a label allowing detection and measurement of the ligand.
Labeling may be done by direct or indirect methods. Direct labeling involves coupling of the label directly (covalently or non-covalently) to the ligand. Indirect labeling involves binding (covalently or non-covalently) of a secondary ligand to the first ligand. The secondary ligand should specifically bind to the first ligand. Said secondary ligand may be coupled with a suitable label and/or be the target (receptor) of a tertiary ligand binding to the secondary ligand. Secondary, tertiary or even higher order ligands are often used to increase the signal. Suitable secondary and higher order ligands may include antibodies, secondary antibodies, and the well-known streptavidin-biotin system (Vector Laboratories, Inc.).
The ligand or substrate may also be “tagged” with one or more tags as known in the art. Such tags may then be targets for higher order ligands. Suitable tags include biotin, digoxigenin, His-Tag, glutathione S-transferase, FLAG, GFP, myc-tag, influenza A virus haemagglutinin (HA), maltose binding protein, and the like. In the case of a peptide or polypeptide, the tag is preferably at the N-terminus and/or C-terminus.
Suitable labels are any labels detectable by an appropriate detection method. Typical labels include gold particles, latex beads, acridan ester, luminol, ruthenium, enzymatically active labels, radioactive labels, magnetic labels (e.g. “magnetic beads”, including paramagnetic and superparamagnetic labels), and fluorescent labels.
Enzymatically active labels include e.g. horseradish peroxidase, alkaline phosphatase, beta-galactosidase, luciferase, and derivatives thereof. Suitable substrates for detection include di-amino-benzidine (DAB), 3,3′,5,5′-tetramethylbenzidine, NBT-BCIP (4-nitro blue tetrazolium chloride and 5-bromo-4-chloro-3-indolyl-phosphate, available as ready-made stock solution from Roche Diagnostics), CDP-Star™ (Amersham Biosciences), ECF™ (Amersham Biosciences). A suitable enzyme-substrate combination may result in a colored reaction product, fluorescence or chemiluminescence, which can be measured according to methods known in the art.
Typical fluorescent labels include fluorescent proteins (such as GFP and its derivatives), Cy3, Cy5, Texas Red, Fluorescein, the Alexa dyes (e.g. Alexa 568), and quantum dots.
Typical radioactive labels include 35S, 125I, 32P, 33P, and the like.
Thus, suitable measurement methods according to the present invention also include precipitation (particularly immunoprecipitation), electrochemiluminescence (electro-generated chemiluminescence), RIA (radioimmunoassay), ELISA (enzyme-linked immunosorbent assay), sandwich enzyme immune tests, electrochemiluminescence sandwich immunoassays (ECLIA), dissociation-enhanced lanthanide fluoro immuno assay (DELFIA), scintillation proximity assay (SPA), turbidimetry, nephelometry, latex-enhanced turbidimetry or nephelometry, or solid phase immune tests. Further methods known in the art (such as gel electrophoresis, 2D gel electrophoresis, SDS polyacrylamide gel electrophoresis (SDS-PAGE), Western Blotting) can be used alone or in combination with labeling or other detection methods as described above.
The ligand may also be present on an array. Said array contains at least one additional ligand, which may be directed against a peptide, polypeptide or a nucleic acid of interest. Said additional ligand may also be directed against a peptide, polypeptide or a nucleic acid of no particular interest in the context of the present invention. Preferably, ligands for at least five, more preferably at least 10, even more preferably at least 20 polypeptide markers according to the present invention are contained on the array.
According to the present invention, the term “array” refers to a solid-phase or gel-like carrier upon which at least two compounds are attached or bound in one-, two- or three-dimensional arrangement. Such arrays (including “gene chips”, “protein chips”, antibody arrays and the like) are generally known to the person skilled in the art and typically generated on glass microscope slides, specially coated glass slides such as polycation-, nitrocellulose- or biotin-coated slides, cover slips, and membranes such as, for example, membranes based on nitrocellulose or nylon.
The array may include a bound ligand or at least two cells expressing each at least one ligand.
It is also contemplated to use “suspension arrays” as arrays according to the present invention (Nolan J P, Sklar L A. (2002). Suspension array technology: evolution of the flat-array paradigm. Trends Biotechnol. vol. 20(1), p. 9-12). In such suspension arrays, the carrier, e.g. a microbead or microsphere, is present in suspension. The array consists of different microbeads or microspheres, possibly labeled, carrying different ligands.
The invention further relates to a method of producing arrays as defined above, wherein at least one ligand is bound to the carrier material in addition to other ligands.
Methods of producing such arrays, for example based on solid-phase chemistry and photolabile protective groups, are generally known (U.S. Pat. No. 5,744,305). Such arrays can also be brought into contact with substances or substance libraries and tested for interaction, for example for binding or change of conformation. Therefore, arrays comprising a polypeptide marker according to the present invention may be used for identifying ligands binding specifically to said peptides or polypeptides.
To determine the sequence of a polypeptide, it should be purified to the highest level achievable. However, the polypeptide does not need to be completely isolated. For example, it is enough to have the polypeptide detectable as a Coomassie-stained band in a polyacrylamide gel. The corresponding gel piece can then be cut out and used for the next identification steps. After purification of the polypeptide, it can be enzymatically digested with trypsin and the molecular weights of the resulting fragments determined using any suitable method, for example mass spectrometry. Using mass spectrometry, each polypeptide displays a characteristic “fingerprint” of fragments allowing its identification by database searches. If the polypeptide to be identified is not present in the database, or if the researcher wants a closer characterization for any reason, the polypeptide fragments can also be sequenced according to methods known in the art.
CE-MS allows particularly easy determination of the polypeptide sequences. The capillary electrophoresis elution time for each marker is listed in the tables. Thus, it is possible to collect the fraction containing the polypeptide at relatively high purity. If a single fraction contains insufficient material, fractions of more than one experiment may be pooled.
Sequences of some of the polypeptide markers are listed as SEQ ID NO: 1 to 5. Their masses as measured by CE-MS and their respective sequences are as follows:
The invention is further illustrated by the following examples:
Participants:
After local Ethics Committee approval, informed consent was obtained from all participants. We examined a group of 57 healthy individuals with normal renal function in order to establish normal urinary protein patterns with CE-MS. In addition, we studied 44 patients with biopsy-proven minimal-change disease (n=16; MCD), membranous glomerulonephritis (n=18; MGN), and focal-segmental glomerulosclerosis (n=10; FSGS) (Table 1).
CE-MS Analysis:
Spot urine samples were collected from all participants in the morning after voiding the first urine. Samples were prepared as described in detail elsewhere (Wittke S, Fliser D, Haubitz M, et al: Determination of peptides and proteins in human urine with CE-MS—suitable tool for the establishment of new diagnostic markers. J Chromatogr A 1013:173-181, 2003). The CE-MS analysis was established as described previously (Kaiser T, Hermann A, Kielstein J T, et al: Capillary Electrophoresis coupled mass spectrometry to establish polypeptide patterns in dialysis fluids. J Chromatogr A 1013:157-171, 2003), using a Beckman Coulter P/ACE system coupled to a Mariner TOF mass spectrometer (ABI). CE capillaries were from Beckman, ID/OD 75/360 μm and 90 cm in length. The mobile phase contained 30% methanol and 0.5% formic acid in water. The same liquid was used for the sheath flow, which was applied at 2 μl/min. Sample injection was performed with pressure: 1 psi for 20 sec. Under these conditions about 100 nl of sample could be injected. For sample stacking, the following protocol was applied: injection of 1 M NH3 for 7 sec., injection of sample, injection of 2 M formic acid for 5 sec. The subsequent CE-MS run was performed at +30 kV with the following sequence of pressures: 40 min at 0 psi, 2 min at 0.1 psi, 2 min at 0.2 psi, 2 min at 0.3 psi, 2 min at 0.4 psi, 80 min at 0.5 psi. For diagnosis of IgA-nephropathy, the following pressure sequence was used: 40 min at 0.3 psi, 2 min at 0.4 psi, 2 min at 0.6 psi, 2 min at 0.8 psi, 80 min at 1 psi. After each run, the CE capillary was rinsed for 5 min with 0.1 M NaOH, followed by 5 min with water and 5 min with running buffer.
Statistical Analysis:
For discrimination between healthy subjects and different groups of patients with renal diseases we used the method of Random Forests and the corresponding S-Plus program version 6/2002 (Breiman L: Random Forests. http://oz.berkeley.edu/users/breiman/randomforest2001.pdf). In this procedure, a series of PP subsets of fixed size is selected randomly from all candidate PP. For each subset, a classification tree as described in the Classification and Regression Tree (CART) analysis is generated (Steinberg D, Colla P: CART—Classification and Regression trees. San Diego, Calif., Salford Systems 1997), resulting in a classification rule. The forest prediction is the unweighted plurality of class votes of the series of classification rules. Over-fitting is avoided owing to the large number of subset selections. The estimated generalisation error is unbiased owing to the method of “out of bag” (oob) estimation: each tree is grown on a bootstrap sample of cases of the learning sample, and the validation is estimated on the basis of those cases not selected in the bootstrap sample.
Further, discrimination between groups was also performed using support vector machines. This tool has the advantage of discriminating data in high dimensional parameter space. Its fast and stable algorithms showed good performance in the evaluation of clinical markers (Dieterle F, Muller-Hagedorn S, Liebich H M, Gauglitz G: Urinary nucleosides as potential tumor markers evaluated by learning vector quantization. Artif Intell Med 28:265-279, 2003) and in different areas of biological analysis such as DNA arrays (Brown M P, Grundy W N, Lin D, et al: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA 97:262-267, 2000).
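The procedure described above (random feature subsets, one tree per subset grown on a bootstrap sample, unweighted plurality vote, and out-of-bag error estimation) can be illustrated by the following minimal sketch. It is not the S-Plus program used in the study: for brevity each tree is reduced to a single-split "stump" rather than a full CART tree, and all function names, parameters and default values are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Best single-threshold split over the given feature subset.

    Simplification (assumption): a one-split stump stands in for a CART tree.
    """
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            left_class = Counter(left).most_common(1)[0][0]
            right_class = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_class for y in left)
                      + sum(y != right_class for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_class, right_class)
    if best is None:  # no split possible: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_class, right_class = stump
    return left_class if row[f] <= threshold else right_class


def random_forest_oob(rows, labels, n_trees=200, subset_size=2, seed=0):
    """Grow the forest and return it with the out-of-bag error estimate."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    oob_votes = [Counter() for _ in range(n)]
    forest = []
    for _ in range(n_trees):
        features = rng.sample(range(n_features), subset_size)  # random subset
        boot = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        stump = fit_stump([rows[i] for i in boot],
                          [labels[i] for i in boot], features)
        forest.append(stump)
        for i in set(range(n)) - set(boot):  # cases not in the bootstrap sample
            oob_votes[i][predict_stump(stump, rows[i])] += 1
    voted = [(i, v.most_common(1)[0][0]) for i, v in enumerate(oob_votes) if v]
    if not voted:
        return forest, float("nan")
    oob_error = sum(pred != labels[i] for i, pred in voted) / len(voted)
    return forest, oob_error


def predict_forest(forest, row):
    """Unweighted plurality of the class votes of all trees."""
    return Counter(predict_stump(s, row) for s in forest).most_common(1)[0][0]
```

In this sketch each row would hold the measured values of the candidate PP for one sample and each label the group of that sample; the out-of-bag error is computed only from votes of trees whose bootstrap sample did not contain the case in question.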
Normal Urinary Polypeptide Pattern Analysed with CE-MS:
A graphical depiction (contour plot) of a typical sample is presented in
The subsequent electronic data manipulation for one example is summarized in
The examination of urine obtained from healthy subjects led to the establishment of peaks defined by actual mass and CE-time of the PP detected, so-called peak lists, and contour plots for each individual. The individual peak lists were deposited in an MS-Access database and the probability of each of the PP to appear in a single sample was calculated. One hundred seventy-three PP were present in over 90% of the control samples examined. In addition, 156 PP were present in more than 75% of the samples, while a further 361 PP were found in over 50% of samples from the healthy individuals. Together, these 690 PP were found in more than 50% of all samples obtained from healthy subjects and were used to establish a “normal PP pattern”.
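The frequency calculation described above can be sketched as follows. This is an illustrative example only, assuming that each sample has already been reduced to a set of PP identifiers; the identifiers, function names and toy data are hypothetical and do not come from the study's database.

```python
from collections import Counter


def presence_frequencies(samples):
    """Fraction of samples in which each polypeptide (PP) was detected.

    samples: list of sets of PP identifiers, one set per sample.
    """
    counts = Counter()
    for detected in samples:
        counts.update(set(detected))
    n = len(samples)
    return {pp: c / n for pp, c in counts.items()}


def bin_by_frequency(freqs, thresholds=(0.90, 0.75, 0.50)):
    """Assign each PP to the highest threshold its frequency exceeds.

    thresholds must be given in descending order; PP at or below the
    lowest threshold are not assigned to any bin.
    """
    bins = {t: [] for t in thresholds}
    for pp, f in sorted(freqs.items()):
        for t in thresholds:
            if f > t:
                bins[t].append(pp)
                break
    return bins


def normal_pattern(bins):
    """All PP found in more than the lowest threshold of samples."""
    return sorted(pp for members in bins.values() for pp in members)


if __name__ == "__main__":
    # Toy data: four samples with hypothetical PP identifiers.
    samples = [{"pp_a", "pp_b", "pp_c"}, {"pp_a", "pp_b"},
               {"pp_a", "pp_b", "pp_c"}, {"pp_a", "pp_d"}]
    bins = bin_by_frequency(presence_frequencies(samples))
    print(bins)
    print(normal_pattern(bins))
```

The bins are mutually exclusive, mirroring the way the counts of 173, 156 and 361 PP add up to the 690 PP of the normal pattern.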
Urine from Patients with Renal Diseases Analyzed with CE-MS:
Data from the individual runs of 44 patients were sub-grouped in the three disease groups and analyzed. The values from these databases, representing typical PP patterns, were subsequently compared. Significant homology of the protein patterns present in urine samples from each patient group was found within the groups. Typical examples of urinary PP patterns from patients with MCD, FSGS, and MGN are shown in
Statistical analysis was applied to the CE-MS data to discriminate between healthy individuals and patients with renal disease. A list of 800 PP, present with more than 50% probability in either disease group, was chosen for Random Forest analysis. The correct classification rate for the discrimination between healthy subjects and renal patients was 96.5%, as shown in the following list:
After cross-validation a sensitivity of 81.3% and a specificity of 94.3% could be obtained. Discrimination of the disease groups was achieved in the learning sample. However, most likely due to the small number of FSGS patients, these could not be discriminated from MCD when applying cross-validation. Hence, FSGS and MCD were combined into one group. For the discrimination between healthy subjects, MCD/FSGS and MGN, four PP were selected by CART from the list to build a classification tree with five terminal nodes (table 15). The correct classification rate in the learning sample is 94.1%. After cross-validation it reduces to 84.3% (93.8% for healthy controls, 71.4% for MCD/FSGS and 92.9% for MGN).
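For reference, sensitivity and specificity as reported above are derived from the counts of correctly and incorrectly classified cases. The following minimal sketch shows the calculation; the function name, the label strings and the example counts used with it are assumptions for illustration and are not the study's data.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="patient"):
    """Return (sensitivity, specificity) for a two-class problem.

    sensitivity = true positives / all actual positives
    specificity = true negatives / all actual negatives
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

When the predictions passed to this function are those obtained for held-out cases during cross-validation, the resulting values correspond to cross-validated sensitivity and specificity.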
Alternatively, statistical analysis was performed using support vector machines on the same data; table 16 shows the PP that were employed in this analysis. Using these PP, the correct classification rate was 98.0% after complete cross-validation. Table 17 depicts the PP that were used to discriminate between MCD and MGN. Here the correct classification rate was 94.1% after complete cross-validation. Further, it was possible to separate patients with MCD from those with FSGS, and patients with MGN from those with FSGS, with (cross-validated) classification rates of 92.3% and 89.3%, respectively (tables 18 and 19). These results can be regarded as a first approach to using support vector machines to classify a limited number of patients. With increasing patient data, the classification is expected to improve further and become more stable. The results also indicate that, for stable classification, the number of applicable variables (polypeptides) depends on the number of cases (patients); hence an increase in the number of patients will allow even more PP to be used for classification.
The present application claims the benefit of U.S. Provisional Application No. 60/569,230 filed May 10, 2004, which is hereby incorporated herein in its entirety by reference.
Number | Date | Country
---|---|---
60569230 | May 2004 | US