1. Field of the Invention
The invention relates to a characteristic pattern of Complement C3 and related proteins that can be used in diagnosis, drug targeting, and therapeutic response monitoring of patients having Amyotrophic Lateral Sclerosis (ALS), patients with Parkinson's disease (PD), patients with ALS-like and PD-Like disorders that express symptoms similar to ALS and/or PD, and normal age-matched individuals. The method is based on the use of 2-dimensional (2D) gel electrophoresis to separate the complex mixture of proteins found in blood serum and the quantitation of a functionally related group of identified biomarkers, 3 forms of complement C3c and related biomarkers, to differentiate patients having ALS or PD from each other, from patients with ALS-like and PD-Like disorders, and from normal individuals, for the purpose of diagnosis, determination of disease burden, treatment response monitoring, and drug targeting.
2. Description of the Related Art
There is an urgent need for diagnostic tests for Lou Gehrig's disease (ALS), Parkinson's disease (PD), and “like” disorders. ALS is characterized by degeneration of both upper and lower motor neurons. The majority of patients die within 3-5 years from first symptom [1], usually from respiratory muscle failure. Most of the ALS cases occur in unrelated individuals, i.e. sporadic ALS (sALS). Inheritance is observed in about 10% of cases, i.e. familial ALS (fALS) [2]. In 10-20% of fALS cases, a mutation can be identified in the gene for superoxide dismutase 1 (SOD1), a ubiquitously expressed antioxidant protein [3-5], and this accounts for less than 2% of all ALS cases. Clearly, genetic testing is not applicable to the majority of ALS cases. In fact, from a clinical standpoint, patients with fALS and sALS are indistinguishable [6-9]. Presently, the diagnosis of ALS is symptom based and a definitive diagnostic test is not available. The usual diagnostic process consists of a full medical history and a comprehensive physical and neurological examination according to the El Escorial ALS diagnostic criteria [10, 11]. A complete evaluation may include an electromyogram (EMG) with nerve conduction studies (NCV), magnetic resonance imaging (MRI) of the brain and spinal cord, lumbar puncture (LP) with analysis of cerebrospinal fluid (CSF), a panel of blood tests, and muscle biopsy. Because the El Escorial criteria set was originally designed for research purposes, many clinicians find it to be somewhat cumbersome [10]. Researchers have searched for genetic susceptibility factors affecting cellular processes that influence the survival of motor neurons, including excitotoxicity [12, 13], oxidative stress [14], neurofilament abnormalities [15, 16], inflammation, immune-inflammation, growth factors, axonal transport, and other processes [17-19]. However, to date, no susceptibility factor has emerged to account for the majority of ALS cases.
PD results primarily from the death of dopaminergic neurons in the substantia nigra. The loss of dopamine production from these cells results in the primary symptoms of PD [20]. Although PD usually affects those above age 65, 15% of those diagnosed are under the age of 50. Studies indicate that physicians make an incorrect initial diagnosis of PD in 24% to 35% of cases [21]. Even general neurologists have difficulties in correctly identifying the disease [22, 23]. The diagnosis is based on a thorough physical and neurological examination. Blood tests and imaging studies (MRI of brain) are performed to rule out other conditions that have similar symptoms. There is currently no blood test or imaging study that can confirm the diagnosis of PD. The disease is treatable, but not curable [23]. Ongoing studies are attempting to identify ways of slowing the deterioration of the dopamine-producing cells [24, 25]. However, the symptoms of PD do not appear until about 80% of the dopamine-producing cells are already dead or impaired [26].
Proteomics is a new field of medical research wherein proteins are identified and linked to biological functions, including roles in a variety of disease states. With the completion of the mapping of the human genome, the identification of unique gene products, or proteins, has increased exponentially. In addition, molecular diagnostic testing for the presence of proteins already known to be involved in certain biological functions has progressed from research applications alone to use in disease screening and diagnosis for clinicians. However, proteomic testing for diagnostic purposes remains in its infancy. There is, however, a great deal of interest in using proteomics for the elucidation of potential disease biomarkers.
Detection of abnormalities in the genome of an individual can reveal the risk or potential risk for individuals to develop a disease. The transition from gene based risk to emergence of disease can be characterized as an expression of genomic abnormalities in the proteome. In fact, whether arising from genetic, environmental, or other factors, the appearance of abnormalities in the proteome signals the beginning of the process of cascading effects that can result in the deterioration of the health of the patient. Therefore, detection of proteomic abnormalities at an early stage is desired in order to allow for detection of disease processes either before the disease is established or in its earliest stages where treatment may be more effective.
Recent progress using a novel form of mass spectrometry called surface enhanced laser desorption and ionization time of flight (SELDI-TOF) for the testing of ovarian cancer and Alzheimer's disease has led to an increased interest in proteomics as a diagnostic tool [27, 28]. Furthermore, proteomics has been applied to the study of breast cancer through use of 2D gel electrophoresis and image analysis to study the development and progression of breast carcinoma in patients' breast ductal fluid specimens and in plasma from Alzheimer's disease patients [29, 30]. In the case of breast cancer, breast ductal fluid specimens were used to identify distinct protein expression patterns in bilateral matched pair ductal fluid samples of women with unilateral invasive breast carcinoma.
Detection of biomarkers is an active field of research. For example, U.S. Pat. No. 5,958,785 discloses a biomarker for detecting long-term or chronic alcohol consumption. The biomarker disclosed is a single biomarker and is identified as an alcohol-specific ethanol glycoconjugate. U.S. Pat. No. 6,124,108 discloses a biomarker for mustard chemical injury. The biomarker is a specific protein band detected through gel electrophoresis and the patent describes use of the biomarker to raise protective antibodies or in a kit to identify the presence or absence of the biomarker in individuals who may have been exposed to mustard poisoning. U.S. Pat. No. 6,326,209 B1 discloses measurement of total urinary 17 ketosteroid-sulfates as biomarkers of biological age. U.S. Pat. No. 6,693,177 B1 discloses a process for preparation of a single biomarker specific for O-acetylated sialic acid and useful for diagnosis and outcome monitoring in patients with lymphoblastic leukemia.
Two-dimensional (2D) gel electrophoresis has been used in research laboratories for biomarker discovery since the 1970s [31-38]. Recently, faster identification of proteins by in-gel digestion and mass spectrometry has enabled large-scale application of these techniques [39, 40], while the advent of bioinformatics is advancing proteomics towards diagnostics [41-43]. Comprehensive analyses of disease mechanisms and disease markers are now feasible through clinical proteomics [44].
Neurodegenerative diseases are among those devastating disorders for which the need for early diagnosis is most pressing. These include amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), Alzheimer's disease (AD), and many other neurological disorders. These central nervous system disorders are characterized by the progressive loss of neural and related tissues. Clearly, earlier identification of those with PD, ALS, and like disorders may help to optimize the management of these patients. The present invention involves demonstrated differences in the expression of Complement C3 and related proteins from 422 patients and normal individuals. The biochemical and mechanistic implications of these results, as well as their use combined with biostatistics, are also disclosed.
The present invention is a diagnostic assay for differentiating between patients having Amyotrophic Lateral Sclerosis (ALS), Parkinson's disease (PD), ALS-Like and/or PD-Like disorders, and normal controls. The method comprises collecting a biological sample from a patient having symptoms consistent with ALS and/or PD, determining the concentrations of up to 34 protein biomarkers identified as related to ALS, PD, ALS-Like, or PD-like disorders, and determining whether or not the patient has ALS, PD, or an ALS-Like, or PD-like disorder, based on a statistical analysis of the concentration in blood serum of the selected 34 protein biomarkers including the Complement C3c and related protein biomarkers.
One aspect of the present invention is a method for screening a patient for ALS, PD, ALS-Like, or PD-like disorders. The method includes: collecting a biological sample from a patient having symptoms consistent with ALS or PD, determining the concentrations of up to 34 protein biomarkers identified as related to ALS, PD, ALS-Like, or PD-like disorders, and determining whether or not the patient may have ALS, PD, or an ALS-Like, or PD-like disorder, based on a statistical analysis of the concentration in blood serum of the selected 34 protein biomarkers including the Complement C3c and related protein biomarkers. Another aspect of the present invention is a method for determining the burden of disease and/or monitoring the response to treatment of a patient with ALS, PD, ALS-Like, or PD-like disorders. The method includes: collecting a biological sample from a patient having symptoms consistent with ALS and/or PD, determining the concentration of up to 34 protein biomarkers identified as related to ALS, PD, ALS-Like, or PD-like disorders, and determining the burden of disease and/or response of the patient to treatment based on the concentration in blood serum of the selected 34 protein biomarkers including the Complement C3c and related protein biomarkers.
Another aspect of the present invention is a method for determining the biological mechanism of disease of a patient and/or the drug target of the patient for treatment of ALS, PD, ALS-Like, or PD-like disorders. The method includes: collecting a biological sample from a patient having symptoms consistent with ALS or PD, determining the concentration of up to 34 protein biomarkers identified as related to ALS, PD, ALS-Like, or PD-like disorders, and determining the mechanism of disease active in the patient and/or the drug target appropriate for treatment of the patient, based on the concentration in blood serum of the selected 34 protein biomarkers including the Complement C3c and related protein biomarkers.
The foregoing has outlined rather broadly several aspects of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and the specific embodiment disclosed might be readily utilized as a basis for modifying or redesigning the methods for carrying out the same purposes as the invention. It should be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Table 1: List of Complement C3 and related biomarkers identified by MALDI-TOF and LC-MS/MS of in-gel tryptic peptide digests, and characterized by 2D gel electrophoresis of the intact proteins.
Table 2: Alternative names and amino acid sequence of the parent Complement C3 full-length transcript, and its processing components with positions of the amino acid sequence of biomarkers C3c=spots 7311, 9311, 9312, and C3dg=spot 1511.
Table 3: cDNA sequence of full length parent Complement C3 indicating the beginning and end of coding regions of C3c and C3dg.
Table 4: Amino acid sequence of Similar to Complement C3, including its identical C3dg domain and its non-identical C3c like region.
Table 5: Shows the amino acid sequence identity in the C3dg domain between Complement C3 and Similar to C3.
Table 6: Comparison of serum levels of biomarkers of Complement C3c, Complement C3 related proteins: Factors H and Bb, and Pre-serum Amyloid Protein (SAP), between Normal individuals and patients with ALS, Parkinson's disease, and like disorders, with both averages and Single Variable Linear Discriminant Biostatistics shown.
Table 7: Linear discriminant biostatistical analysis of the 13 PD optimal biomarkers, PD vs. age-matched normal controls.
Table 8: Validation of blood serum biomarkers and test for PD vs. PD-Like and normal controls. Training and independent test sets.
Table 9: Validation of blood serum biomarkers and test for ALS vs. ALS-Like and normal controls. Training and independent test sets.
The present invention is a diagnostic assay for differentiating between patients having Amyotrophic Lateral Sclerosis (ALS), Parkinson's disease (PD), ALS-Like and/or PD-Like disorders, and normal controls. The method is based on the use of 2-dimensional (2D) gel electrophoresis to separate the complex mixture of proteins found in blood serum and the quantitation of a group of identified biomarkers to differentiate between patients having ALS, PD, ALS-Like and/or PD-Like disorders, and normal controls.
In the context of the present invention an “ALS-Like disorder or a PD-Like disorder” would include individuals with symptoms consistent with ALS or PD, including without limitation, for ALS, muscle weakness, numbness, slurred speech, and for PD, slowness of movements, muscle stiffness, tremor, rigidity, slow stooped gait, and difficulty with balance.
The ALS-Like conditions are divided into 6 main anatomical categories, as follows:
a. Cervical Disc Protrusion
b. Spinal Stenosis
c. Spinal Cord Tumor
d. Primary lateral sclerosis
a. Post Polio Syndrome
b. Spinal Muscular Atrophy
a. Guillain Barre Syndrome
b. Chronic Inflammatory Demyelinating Polyneuropathy (CIDP)
c. Other Causes of Motor Neuron Compromise
a. Myasthenia Gravis
b. Lambert-Eaton Myasthenic Syndrome (LEMS)
c. Toxins (Black Widow Spider)
a. Muscular Dystrophy
b. Inclusion Body Myositis (IBM)
c. Polymyositis
The PD-like disorders include movement disorders other than PD, including but not limited to:
4. Dementia with Lewy Bodies
10. Also included is Alzheimer's disease (AD), particularly in the early stages.
In the context of the present invention, the “protein expression profile” corresponds to the steady state level of the various proteins in biological samples that can be expressed quantitatively. These steady state levels are the result of the combination of all the factors that control protein concentration in a biological sample. These factors include but are not limited to: the rates of transcription of the genes encoding the hnRNAs; processing of the hnRNAs into mRNAs; the rates of splicing and the splicing variations during the processing of the hnRNAs into mRNAs, which govern the relative amounts of the protein sequence isoforms; the rates of processing of the various mRNAs by 3′-polyadenylation and 5′-capping; the rates of transport of the mRNAs to the sites of protein synthesis; the rate of translation of the mRNAs into the corresponding proteins; the rates of protein post-translational modifications, including but not limited to phosphorylation, nitrosylation, methylation, acetylation, glycosylation, poly-ADP-ribosylation, ubiquitinylation, and conjugation with ubiquitin-like proteins; the rates of protein turnover via the ubiquitin-proteasome system and via proteolytic processing of the parent protein into various active and inactive subcomponents; the rates of intracellular transport of the proteins among compartments, such as but not limited to the nucleus, the lysosomes, the Golgi, the membrane, and the mitochondrion; the rates of secretion of the proteins into the interstitial space; the rates of secretion-related protein processing; and the stability and rates of proteolytic processing and degradation of the proteins in the biological sample before and after the sample is taken from the patient.
In the context of the present invention, a “biomarker” corresponds to a protein or protein fragment present in a biological sample from a patient, wherein the quantity of the biomarker in the biological sample provides information about whether the patient exhibits an altered biological state such as ALS, PD or an ALS-Like or a PD-Like disorder.
A “control” or “normal” sample is a sample, preferably a serum sample, taken from an individual with no known disease, particularly without a neuromuscular disease. The method of the present invention is based on the quantification of specified proteins. Preferably the proteins are separated and identified by 2D gel electrophoresis. 2D gel electrophoresis has been used in research laboratories for biomarker discovery since the 1970s [31-35]. In the past, this method has been considered highly specialized, labor intensive and non-reproducible.
Only recently with the advent of integrated supplies, robotics, and software combined with bioinformatics has progression of this proteomics technique in the direction of diagnostics become feasible. The promise and utility of 2D gel electrophoresis is based on its ability to detect changes in protein expression and to discriminate protein isoforms that arise due to variations in amino acid sequence and/or post-synthetic protein modifications such as phosphorylation, nitrosylation, ubiquitination, conjugation with ubiquitin-Like proteins, acetylation, and glycosylation. These are important variables in cell regulatory processes involved in disease states.
There are few comparable alternatives to 2D gels for tracking changes in protein expression patterns related to disease progression. The introduction of high sensitivity fluorescent staining, digital image processing and computerized image analysis has greatly amplified and simplified the detection of unique species and the quantification of proteins. By using known protein standards as landmarks within each gel run, computerized analysis can detect unique differences in protein expression and modifications between two samples from the same individual or between several individuals.
Serum samples were prepared from blood acquired by venipuncture. The blood was allowed to clot at room temperature for 30-60 minutes and centrifuged at 1200×g for 15 minutes, and the separated serum was divided into aliquots and frozen at −40° C. or below until shipment. Samples were shipped on dry ice and were delivered within 24 hours of shipping.
Once the serum samples were received, logged in, and assigned a sample number; they were further processed in preparation for 2D gel electrophoresis. All samples were stored at −40° C. or below. When the serum samples were removed from storage, they were placed on ice for thawing and kept on ice for further processing.
To each 100 μl of sample, 100 μl of LB-2 buffer (5M urea, 2M Thiourea, 0.5% ASB-14, 0.25% CHAPS, 0.25% Tween-20, 5% glycerol, 100 mM DTT, 1× Protease inhibitors, and 1× Ampholyte pH 3-10) was added and the mixture vortexed. The sample was incubated at room temperature for about 5 minutes.
The proteins in the patient and control samples were separated using various techniques known in the art for separating proteins, techniques that include but are not limited to gel filtration chromatography, ion exchange chromatography, reverse phase chromatography, affinity chromatography, or any of the various centrifugation techniques well known in the art. In some cases, a combination of one or more chromatography or centrifugation steps may be combined via electrospray or nanospray with mass spectroscopy or tandem mass spectroscopy, or any protein separation technique that determines the pattern of proteins in a mixture either as a one-dimensional, two-dimensional, three-dimensional or multi-dimensional pattern or list of proteins present.
Preferably the protein profiles of the present invention are obtained by subjecting biological samples to two-dimensional (2D) gel electrophoresis to separate the proteins in the biological sample into a two-dimensional array of protein spots.
Two-dimensional gel electrophoresis is a useful technique for separating complex mixtures of proteins and can be performed using a variety of methods known in the art (see, e.g., U.S. Pat. Nos. 5,534,121; 6,398,933; and 6,855,554).
Preferably, the first dimensional gel is an isoelectric focusing gel and the second dimension gel is a denaturing polyacrylamide gradient gel.
Proteins are amphoteric, containing both positive and negative charges, and like all ampholytes exhibit the property that their charge depends on pH. At low pH (acidic conditions), proteins are positively charged, while at high pH (basic conditions) they are negatively charged. For every protein there is a pH at which the protein is uncharged, the protein's isoelectric point. When a charged molecule is placed in an electric field it will migrate towards the opposite charge.
In a pH gradient such as those used in the present invention, containing a reducing agent such as dithiothreitol (DTT), a protein will migrate to the point at which it reaches its isoelectric point and becomes uncharged. The uncharged protein will not migrate further and stops. Each protein will stop at its isoelectric point and the proteins can thus be separated according to their isoelectric points. In order to achieve optimal separation of proteins, various pH gradients may be used. For example, a very broad range of pH, from about 3 to 11 or 3 to 10 can be used, or a more narrow range, such as from pH 4 to 7 or 5 to 8 or 7 to 10 or 6 to 11 can be used. The choice of pH range is determined empirically and such determinations are within the skill of the ordinary practitioner and can be accomplished without undue experimentation.
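The dependence of net charge on pH described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed method: it assumes the Henderson-Hasselbalch relation applies independently to each ionizable group of an unfolded peptide, and the pKa values are approximate illustrative figures (published tables differ by several tenths of a pH unit). The function names are hypothetical.

```python
# Approximate pKa values for ionizable groups (illustrative assumptions only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of an unfolded peptide at a given pH."""
    # Basic groups contribute +1 when protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    # Acidic groups contribute -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Bisection search for the pH at which the net charge is zero.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged: pI lies at lower pH
    return (low + high) / 2.0


# Hypothetical peptides: one rich in acidic residues, one rich in basic residues.
acidic_pi = isoelectric_point("DDEEDDEE")
basic_pi = isoelectric_point("KKRRKKRR")
```

Under these assumptions the acidic peptide focuses at a low pH and the basic peptide at a high pH, which is the behavior the first dimension relies on. Real proteins deviate from this idealized model because folding and post-translational modifications shift the effective pKa values.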
In the second dimension, proteins are separated according to molecular weight by measuring mobility through a uniform or gradient polyacrylamide gel in the detergent sodium dodecyl sulfate (SDS). In the presence of SDS and a reducing agent such as dithiothreitol (DTT), the proteins act as though they are of uniform shape with the same charge to mass ratio. When the proteins are placed in an electric field, they migrate into and through the gel from one edge to the other. As the proteins migrate through the gel, individual proteins move at different speeds, with the smaller ones moving faster than the larger ones. This process is stopped when the fastest moving components reach the other side of the gel. At this point, the proteins are distributed across the gel with the higher molecular weight proteins near the origin and the low molecular weight proteins near the other side of the gel.
It is well known in the art that various concentration gradients of acrylamide may be used for such protein separations. For example, a gradient of from about 5% to 20% may be used in certain embodiments or any other gradient that achieves a satisfactory separation of proteins in the sample may be used. Other gradients would include but not be limited to from about 5 to 18%, 6 to 20%, 8 to 20%, 8 to 18%, 8 to 16%, 10 to 16%, or any range as determined by one of skill.
The end result of the 2D gel procedure is the separation of a complex mixture of proteins into a two dimensional array, a pattern of protein spots, based on the differences in their individual characteristics of isoelectric point and molecular weight.
Purified proteins having known characteristics are used as internal and external standards and as a calibrator for 2D gel electrophoresis. The standards consist of seven reduced, denatured proteins that can be run either as spiked internal standards or as external standards to test the ampholyte mixture and the reproducibility of the gels. A set mixture of proteins (the “standard mixture”) is used to determine pH gradients and molecular weights for the two dimensions of the electrophoresis operation. Table A lists the isoelectric point (pI) values and molecular weights for the proteins included in a particular example standard mixture.
In addition, standard mixtures such as Precision Plus Protein Standards (Bio-Rad Laboratories), a mixture of 10 recombinant proteins ranging from 10-250 kD, are typically added as external molecular weight standards for the second dimension, or the SDS-PAGE portion of the system. The Precision Plus Protein Standards have an r² value of the Rf vs. log molecular weight plot of >0.99.
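The Rf vs. log molecular weight calibration can be sketched as an ordinary least-squares fit. This is an illustration under stated assumptions rather than the procedure actually used: the intermediate molecular weights within the stated 10-250 kD range are assumed, the Rf values are invented for the example and are not measurements, and the helper names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept, returning r squared as well."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Assumed molecular weights (kD) for ten standards and invented Rf values.
standard_kd = [250, 150, 100, 75, 50, 37, 25, 20, 15, 10]
standard_rf = [0.08, 0.19, 0.28, 0.35, 0.44, 0.50, 0.59, 0.64, 0.70, 0.79]

# Fit log10(MW) as a linear function of relative mobility.
slope, intercept, r_squared = fit_line(
    standard_rf, [math.log10(kd) for kd in standard_kd]
)


def estimate_kd(rf: float) -> float:
    """Estimate the molecular weight (kD) of a spot from its Rf."""
    return 10 ** (slope * rf + intercept)
```

With a calibration of this kind, the apparent molecular weight of any spot follows from its measured mobility, and the r² value indicates how closely the standards follow the log-linear relationship.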
An appropriate amount of isoelectric focusing (IEF) loading buffer (LB-2), was added to the diluted serum sample, incubated at room temperature and vortexed periodically until the pellet was dissolved to visual clarity. The samples were centrifuged briefly before a protein assay was performed on the sample.
Approximately 100 μg of the serum proteins were suspended in a total volume of 184 μl of IEF loading buffer and 1 μl Bromophenol Blue. Each sample was loaded onto an 11 cm IEF strip (Bio-Rad Laboratories), pH 5-8, and overlaid with 1.5-3.0 ml of mineral oil to minimize the sample buffer evaporation. Using the PROTEAN® IEF Cell, an active rehydration was performed at 50V and 20° C. for 12-18 hours.
IEF strips were then transferred to a new tray and focused for 20 min at 250V followed by a linear voltage increase to 8000V over 2.5 hours. A final rapid focusing was performed at 8000V until 20,000 volt-hours were achieved. Running the IEF strip at 500V until the strips were removed finished the isoelectric focusing process.
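The duration of the final focusing step can be estimated from the volt-hour budget. The sketch below assumes that the 20,000 volt-hour target is cumulative over all three steps and that the linear ramp starts at 250V; neither point is stated explicitly above, and the variable names are illustrative.

```python
def ramp_volt_hours(start_v: float, end_v: float, hours: float) -> float:
    """Volt-hours accumulated during a linear voltage ramp (trapezoid area)."""
    return (start_v + end_v) / 2.0 * hours


hold_vh = 250 * (20 / 60)                  # 20 min at 250 V
ramp_vh = ramp_volt_hours(250, 8000, 2.5)  # linear increase to 8000 V over 2.5 h
target_vh = 20_000                         # assumed to be a cumulative total

# Volt-hours still to be delivered at 8000 V, and the time that takes.
remaining_vh = target_vh - hold_vh - ramp_vh
final_hours = remaining_vh / 8000
```

Under these assumptions the hold and ramp contribute roughly 83 and 10,312 volt-hours, leaving about 1.2 hours at 8000V. If the 20,000 volt-hours instead applies to the final step alone, that step would last 2.5 hours.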
Isoelectric focused strips were incubated on an orbital shaker for 15 min with equilibration buffer (2.5 ml buffer/strip). The equilibration buffer contained 6M urea, 2% SDS, 0.375M HCl, and 20% glycerol, as well as freshly added DTT to a final concentration of 30 mg/ml. An additional 15 min incubation of the IEF strips in the equilibration buffer was performed as before, except that freshly added iodoacetamide (C2H4INO) replaced the DTT at a final concentration of 40 mg/ml. The IPG strips were then removed from the tray using clean forceps and washed five times in a graduated cylinder containing the Bio-Rad Laboratories running buffer 1× Tris-Glycine-SDS.
The washed IEF strips were then laid on the surface of Bio-Rad pre-cast CRITERION SDS-gels 8-16%. The IEF strips were fixed in place on the gels by applying a low melting agarose. A second dimensional separation was applied at 200V for about one hour. After running, the gels were carefully removed and placed in a clean tray and washed twice for 20 minutes in 100 ml of pre-staining solution containing 10% methanol and 7% acetic acid.
Once the 2D gel patterns of the serum samples were obtained, the protein spots resolved in the gels were visualized with either a fluorescent or colored stain. In the preferred embodiment, the fluorescent dye SyproRuby™ (Bio-Rad Laboratories) was the stain. Once the protein spots had been stained, the gel was scanned by a digital fluorescent scanner or when visible dyes are employed, a digital visible light scanner, and a digital image of the protein spot pattern of the gel, i.e. a protein expression profile of the sample, was obtained.
The digital image of the scanned gel was processed using PDQuest™ (Bio-Rad Laboratories) image analysis software to first detect the proteins, locate the selected biomarkers, and then to quantitate the protein in each of the selected spots. The scanned image was cropped and filtered to eliminate artifacts using the image editing control. Individual cropped and filtered images were then placed in a matched set for comparison to other images and controls.
This process allowed quantitative and qualitative spot comparisons across gels and the determination of protein biomarker molecular weight and isoelectric point values. Multiple gel images were normalized to allow an accurate and reproducible comparison of spot quantities across two or more gels. The gels were normalized using the “total of all valid spots” method, in which valid spots are those detected and confirmed by the operator. The method rests on the observation that only a small percentage of the 1200 protein spots detected and verified change between serum samples, so that the total of all detected and verified spots is a good estimate to correct for any differences in the total amount of protein applied to each gel. The quantitative amounts of the selected biomarkers present in each sample were then exported for further analysis using statistical programs.
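The effect of normalizing to the total of all valid spots can be illustrated as follows. This is a minimal sketch, not the image analysis software's implementation: the spot quantities are invented, the parts-per-million scaling is an assumption, and the function name is hypothetical.

```python
def normalize_spots(spot_quantities, valid_spot_ids, scale=1_000_000):
    """Express each valid spot as parts per `scale` of the total valid-spot signal."""
    total = sum(spot_quantities[spot] for spot in valid_spot_ids)
    if total <= 0:
        raise ValueError("total valid spot quantity must be positive")
    return {spot: spot_quantities[spot] / total * scale for spot in valid_spot_ids}


# Invented raw quantities for the same sample run on two gels, where the
# second gel received twice the total protein load.
valid_ids = [7311, 9311, 9312, 1511]
gel_a = {7311: 1200.0, 9311: 800.0, 9312: 500.0, 1511: 2500.0}
gel_b = {spot: quantity * 2.0 for spot, quantity in gel_a.items()}

norm_a = normalize_spots(gel_a, valid_ids)
norm_b = normalize_spots(gel_b, valid_ids)
```

After normalization the two gels give the same value for every spot, so a difference in loading does not appear as a difference in biomarker concentration. The approach is only as good as the assumption that most spots do not change between samples.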
The 2D gel patterns of 162 serum samples collected from normal control subjects were compared with each other. The 162 normal control samples all gave similar 2D gel protein patterns. The normal protein expression pattern was then compared to the gel patterns obtained in serum samples of 186 patients diagnosed with ALS, 29 patients diagnosed with PD, 32 patients diagnosed with ALS-Like and 13 patients with PD-like disorders. The comparison of the protein expression pattern of normal controls and diseased patients identified at least 34 protein spots seen on 2D gels that differed in protein concentration.
Initially, the disease-specific means, standard deviations, medians, and interquartile ranges of the normalized concentrations of individual protein spots were used to select the biomarkers and to assess the statistical significance of concentration differences in the biomarkers between the control sera and the ALS, PD, ALS-like, and PD-Like disease sera. However, because of the number of biomarkers and patients in the studies, and the patient-to-patient differences, subsequent studies used multivariate statistical programs to select the biomarkers. Linear and quadratic single-variable discriminant function analysis was employed to determine the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of each biomarker individually, and a number of combinations of biomarkers, in multiple-variable discriminant functions, were used in determining the difference between normal serum and disease serum taken from patients diagnosed with ALS, PD, ALS-like, and PD-Like disorders. Both linear and quadratic discriminant analyses were used for statistical comparison and classification of samples.
The quantitative amount of the selected biomarkers present in each sample was analyzed using a biostatistical discriminant function. The concentrations for the set of selected biomarkers were entered into a discriminant biostatistical algorithm and the sample was classified as either Control, ALS, ALS-Like, PD or PD-Like disorder based on a comparison to a database of values collected from the individuals in the data set from which the discriminant function was derived.
The output of discriminant analysis is a classification table that permits the calculation of clinical sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). These terms are defined herein as follows: (1) the clinical sensitivity measured how often the test yielded positive results in diseased patients, for example, in the case of the present invention, patients with PD; (2) the clinical specificity measured how often the test gave negative results in non-diseased individuals, in this case non-PD patients with ALS, ALS-Like, PD-Like disorders, or normal controls; (3) the negative predictive value (NPV) measured the probability that the patient would not have the disease, in this case PD, and therefore have ALS, ALS-Like, PD-Like disorders, or be a normal control when values were restricted to all individuals who tested negative; and (4) the positive predictive value (PPV) measured the probability that the patient had the disease (e.g., PD) when values were restricted to those individuals who tested positive.
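The four measures defined above follow directly from the counts in a two-by-two classification table. The sketch below uses invented counts purely for illustration; they are not results from the study, and the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV and NPV from classification counts.

    tp: diseased patients who tested positive
    fn: diseased patients who tested negative
    fp: non-diseased individuals who tested positive
    tn: non-diseased individuals who tested negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among the diseased
        "specificity": tn / (tn + fp),  # negatives among the non-diseased
        "ppv": tp / (tp + fp),          # diseased among those testing positive
        "npv": tn / (tn + fn),          # non-diseased among those testing negative
    }


# Invented example: 50 diseased and 100 non-diseased individuals.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
```

Sensitivity and specificity are properties of the test itself, whereas PPV and NPV also depend on the proportion of diseased individuals in the group being tested.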
A standard discriminant function analysis was performed to determine the subset of biomarkers that would be most useful in differentiating individuals, for example those with PD, from those individuals with ALS, ALS-Like, or PD-Like disorders. Discriminant analysis has been well validated as a multivariate analysis procedure. Discriminant analysis identified sets of linearly independent functions that successfully classify individuals into a well-defined collection of groups. The statistical model used assumed a multivariate normal distribution for the set of biomarkers identified from each disease group.
Where xij represented the p-tuple vector of biomarkers from the ith patient in the jth group, j = 1, ..., g, and x̄j represented the p-tuple centroid of the jth group, made up of the mean biomarker values from the jth disease group, then S represented the estimate of the within group variance-covariance matrix. The discriminant function was then that set of linear functions determined by the vector a that maximizes the quantity:
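The expression itself does not survive in this text. For orientation only, a common textbook form of Fisher's criterion is given here as an assumption, not as the original formula; the symbols n_j (number of patients in group j), N (total number of patients), g (number of groups) and x̄ (overall centroid) are introduced for this illustration.

```latex
\[
  \frac{a' H a}{a' S a},
  \qquad
  H = \sum_{j=1}^{g} n_j\,(\bar{x}_j - \bar{x})(\bar{x}_j - \bar{x})',
  \qquad
  S = \frac{1}{N - g} \sum_{j=1}^{g} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)(x_{ij} - \bar{x}_j)'
\]
```

Under this assumed form, the numerator measures the separation of the group centroids along a, and the denominator measures the spread of patients within their own groups along a.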
The outcome of the discriminant analysis was a collection of m−1 linear functions of the biomarkers (m) that maximized the ability to separate individuals into disease groups. The vector a is the p-tuple vector which contained the coefficients that, when multiplied by an individual's biomarker serum concentrations, produced the linear discriminant function, or index, that was used to classify that individual. In general, if m biomarkers are used, a maximum of min(m−1, g−1) discriminant functions are determined, where g represented the number of groups.
Where a(k) represented the kth p-tuple discriminant function, the value of that discriminator for the ith patient in the jth group is a(k)′xij. Thus for each patient there are k such values computed, which are used in a classification analysis. The discriminant functions themselves are linearly independent (i.e., for each pair of discriminant functions a(k) and a(l) with k ≠ l, a(k)′a(l)=0). Thus, the m−1 discriminant functions provide incremental and non-redundant discriminant ability.
Identifying the discriminant function involved identifying the coefficients λi from the determinantal equation |H−λi(H+E)|=0, where H and E were the one-way analysis of variance hypothesis and error matrices, respectively. It is this computation that was provided by SAS™ statistical software. The SAS software program identified the collection of best discriminators using a forward entry procedure where the p-value to enter and the p-value to stay in the model were each 0.15.
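To make the idea of a coefficient vector and a classification index concrete, the following is a minimal sketch for the simplest case of two groups and two biomarkers. It uses the textbook two-group solution a = S⁻¹(x̄1 − x̄2) with a midpoint cutoff, which is an assumption made for illustration and not the SAS procedure used in the studies. All function names and concentration values are hypothetical.

```python
from statistics import mean


def centroid(rows):
    """Mean value of each of the two biomarkers within one group."""
    return [mean(r[k] for r in rows) for k in range(2)]


def pooled_covariance(groups):
    """Pooled within-group 2x2 variance-covariance matrix S."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows in groups:
        c = centroid(rows)
        for r in rows:
            d = [r[0] - c[0], r[1] - c[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]


def fisher_coefficients(group1, group2):
    """Return (a, cutoff) with a = S^-1 (centroid1 - centroid2)."""
    s = pooled_covariance([group1, group2])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    c1, c2 = centroid(group1), centroid(group2)
    diff = [c1[0] - c2[0], c1[1] - c2[1]]
    a = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff is the index value halfway between the two centroids.
    cutoff = 0.5 * (a[0] * (c1[0] + c2[0]) + a[1] * (c1[1] + c2[1]))
    return a, cutoff


def classify(x, a, cutoff):
    """Assign a sample by comparing its index a'x with the cutoff."""
    index = a[0] * x[0] + a[1] * x[1]
    return "group1" if index > cutoff else "group2"


# Hypothetical normalized concentrations (ppm) for two biomarkers.
group1 = [(900, 300), (1000, 350), (1100, 320), (950, 280)]
group2 = [(500, 310), (600, 290), (550, 330), (450, 300)]
a, cutoff = fisher_coefficients(group1, group2)
```

With more than two groups or more biomarkers, the same idea generalizes to several coefficient vectors, which is the case the surrounding text describes.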
While the discrimination procedure was fairly robust in the presence of mild departures from the normality assumption, it was very sensitive to the assumption of homogeneity of variance. This means that the variance-covariance matrices of the groups between which discrimination was sought must be equal. In this circumstance, these variance-covariance matrices can be pooled. However, in the situation where the variance-covariance matrices are not equal (multivariate heteroscedasticity), this pooling procedure is suboptimal. In this circumstance, the individual variance-covariance matrices have been used.
The use of the two separate within-group variance-covariance matrices is an important complication in the computation of discriminant functions. When the homoscedasticity assumption is appropriate, the within-group variance-covariance matrices can be pooled, producing a linear discriminant function. The use of the individual, unpooled within-group variance-covariance matrices produced a quadratic discriminant function (i.e., where the discriminant function is a function of the squares of the proteomic measures).
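The difference between pooling and not pooling can be seen with a single biomarker, where the variance-covariance matrix reduces to one variance per group. The sketch below assumes normally distributed values and equal prior probabilities. The function names and data are hypothetical and chosen so that one group is much more variable than the other.

```python
import math
from statistics import mean, variance


def pooled_variance(groups):
    """Within-group variance pooled over all groups."""
    num = sum((len(g) - 1) * variance(g) for g in groups)
    den = sum(len(g) - 1 for g in groups)
    return num / den


def classify(x, groups, quadratic):
    """Assign x to the group with the highest discriminant score.

    quadratic=False uses the pooled variance (score is linear in x);
    quadratic=True uses each group's own variance (score contains x squared).
    Equal prior probabilities are assumed, so the prior term is omitted.
    """
    pooled = pooled_variance(list(groups.values()))
    scores = {}
    for name, values in groups.items():
        m = mean(values)
        if quadratic:
            v = variance(values)
            scores[name] = -0.5 * math.log(v) - 0.5 * (x - m) ** 2 / v
        else:
            scores[name] = x * m / pooled - 0.5 * m * m / pooled
    return max(scores, key=scores.get)


# Hypothetical concentrations: controls are tightly clustered,
# the disease group is widely spread.
groups = {
    "control": [100, 102, 98, 101, 99],
    "disease": [60, 140, 90, 150, 110],
}
linear_call = classify(70, groups, quadratic=False)
quadratic_call = classify(70, groups, quadratic=True)
```

In this example a value of 70 lies nearer the control mean, so the pooled (linear) rule assigns it to the control group, while the unpooled (quadratic) rule recognizes that such a value is very unlikely under the narrow control distribution and assigns it to the disease group.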
Discriminant analysis was applied to the data set, from which the contribution of each individual biomarker was determined. The SAS™ statistical software program was then used to determine the linear combinations of biomarkers that provided an optimum classification of individuals into disease groups. Alternatively, the programmer can manually select different combinations of biomarkers to be incorporated into a linear or quadratic discriminant function to optimize the classification of individuals into disease groups.
Thirty-four (34) protein biomarkers were identified in the data set that individually and/or jointly discriminated ALS and PD patient samples from each other and from samples taken from patients with ALS-Like and/or PD-Like disorders, or normal controls. Individuals were independently diagnosed as having ALS, PD, ALS-Like or AD-Like disorders, by a qualified neurologist based on the current standard of care.
Representative samples from individuals with known cases of ALS, PD, ALS-Like, and PD-Like disorders, and normal individuals were run as positive and negative reference controls. Serum containing all of the selected biomarkers was also provided as a reference standard. A reference control was periodically run as an external standard and for tracking overall performance and reproducibility. In addition, 2D gel images from samples classified as ALS, PD, ALS-Like, and PD-Like disorders were used for reference. The spot locations for the selected biomarkers were illustrated in
The consistency and reproducibility of quantifying biomarkers using 2D-gel electrophoresis were characterized with samples run in replicate, and each set of replicate samples was analyzed as a group. The average percent Coefficient of Variation (% CV) was 11±7% for 10 biomarkers quantified from a single image scanned 10 times. The average % CV was 23% for a set of biomarkers quantified from 12 separately processed aliquots of the same sample. The biomarker concentrations for this group of biomarkers ranged from a low of 248 ppm to a high of 15,548 ppm normalized concentration of spot per total detected spots in the 2D gel.
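As an illustration of the reproducibility measure, the percent coefficient of variation is the sample standard deviation of the replicate measurements divided by their mean, times 100. The sketch below is a minimal example with hypothetical replicate values; the function names are assumptions.

```python
from statistics import mean, stdev


def percent_cv(replicates):
    """Percent coefficient of variation of replicate measurements."""
    return 100.0 * stdev(replicates) / mean(replicates)


def average_percent_cv(replicates_by_biomarker):
    """Average the % CV over several biomarkers."""
    return mean(percent_cv(v) for v in replicates_by_biomarker.values())


# Hypothetical replicate concentrations (ppm) for two biomarkers.
replicates = {
    "biomarker_A": [90.0, 100.0, 110.0],
    "biomarker_B": [380.0, 400.0, 420.0],
}
cv_a = percent_cv(replicates["biomarker_A"])
cv_avg = average_percent_cv(replicates)
```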
The protein concentrations employed in the discriminant function were relative values obtained by normalizing the intensity of each spot to all detected spots in the image. The linear range in protein concentrations was 0.5 to 1,000 ng per spot. The concentration of any given spot was the absolute amount of protein in that spot as measured by stain intensity, divided by the total protein loaded onto the gel and resolved into all detected spots. The total amount of protein loaded onto a gel was typically about 100 μg.
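The normalization described above can be sketched as dividing each spot's stain intensity by the summed intensity of all detected spots and expressing the result in parts per million. The spot labels reuse identifiers that appear elsewhere in this description, but the intensity values and the function name are hypothetical.

```python
def normalize_to_ppm(spot_intensities):
    """Express each spot as parts per million of the total detected intensity."""
    total = sum(spot_intensities.values())
    return {spot: 1e6 * value / total for spot, value in spot_intensities.items()}


# Hypothetical raw stain intensities; "all_other_spots" stands in for the
# remaining detected spots on the gel.
raw = {"1511": 250.0, "4411": 750.0, "all_other_spots": 999000.0}
ppm = normalize_to_ppm(raw)
```

Because every spot is scaled by the same total, the normalized values are relative concentrations and sum to one million across the gel.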
Serum is primarily composed of a highly conserved distribution of the most abundant proteins, such as albumin and immunoglobulin, which aids efforts to ensure the reproducibility and consistency of biomarker detection and quantitation. The selected biomarkers represented a minor fraction of the total serum protein. Therefore the concentration of the selected biomarkers varied significantly as a function of disease state without significantly shifting the overall distribution and concentration of the major serum proteins. Discriminant biostatistics was employed to establish the dynamic concentration range of the selected biomarkers useful in differentiating patients and controls.
The effect of multiple freeze/thaw cycles on protein stability and sample integrity was investigated. A serum sample was collected and aliquoted. One aliquot was processed without freezing, while other aliquots were frozen at −80° C. and thawed repetitively. A second set of serum samples was diluted into loading buffer and aliquoted. The second set of samples, similar to the first set, had one aliquot processed without freezing and other aliquots frozen at −80° C. and thawed repetitively.
Triplicate samples were processed as described. The scanned images of the 2D gels were analyzed, and the quantities of each of the 34 neurodegenerative biomarkers of interest were determined. The results illustrated that freezing and thawing either undiluted or diluted serum samples up to 10 times had no significant effect on the serum protein profile or on the abundance of the selected biomarkers.
In addition, sample deterioration was investigated over a one-year period. Twenty-one selected biomarkers were quantitated in control samples stored at −80° C. An aliquot of each control sample was processed several times each quarter, or each 3 month time period. The results demonstrated that there was no significant increase or decrease in the quantity of biomarker detected over a one-year time frame for samples stored at −80° C., beyond that which is typically observed for processing replicate samples.
Serum samples were obtained from 164 sporadic ALS, 22 familial ALS, 32 ALS-Like, 29 PD, and 13 PD-Like patients and 162 normal controls. Also analyzed were 136 samples of Alzheimer's disease (AD) patients, and 16 AD-like disorder patients, as additional disease controls. All individuals with symptoms of a neurodegenerative disorder were evaluated and diagnosed by a qualified neurologist.
The AD-Like disorders included individuals with the following conditions: Lewy body dementia, Multi-infarct dementia, Alcohol-related dementia, Post-radiation Encephalopathy, Seizures, Memory dysfunction, Semantic dementia, Chronic inflammatory demyelinating polyneuropathy (CIDP), Cerebrovascular accident (CVA) dementia, Thalamic CVA, Frontotemporal dementia (FTD) and Corticobasal ganglionic degeneration (CBGD).
The preferred embodiment used all 34 biomarkers of interest and combined disease-specific single variable biomarker statistics (means, medians, inter-quartile ranges, and single variable discriminant analysis) with multivariate discriminant analysis. A variety of different combinations of biomarkers were also tested and gave comparable statistical performance. One combination, the 13 PD optimal biomarkers, is specifically described herein; the others would be analyzed in a similar fashion.
As shown in
The 34 biomarkers were analyzed by excision and in-gel digestion with trypsin to produce tryptic peptides, which were then subjected to MALDI-TOF MS, LC-MS/MS, and database searches to identify the protein spots. The identities and electrophoretic characteristics of the Complement C3 and related biomarkers are summarized in Table 1 and their positions in the gels are shown in
As shown in
Also, shown in
The mean quantitative levels of all 34 biomarkers were tabulated for patients with ALS, PD, AD, as well as ALS-Like, PD-Like, and AD-Like disorders, and Normal subjects. As shown in
When the increases and/or decreases in 1511 (C3dg) from normal, 4411 (Complement Factor H short splice form), 7616 (Complement Factor Bb), and 3209 (Pre-serum Amyloid P) (see Table 6) are also taken into account, disease specific Complement C3 pathway differences are revealed (
As a statistical model, the results in
Single variable linear discriminant analysis was performed using the SAS™ statistical software, utilizing the concentrations of each of the 13 PD optimal biomarkers in the serum of a Training Set consisting of 29 PD patients and 104 age matched normal controls (Table 7). The performance of each individual biomarker acting alone to discriminate PD from normal controls ranged from 48%-95% Sensitivity, and 52%-97% Specificity. Biomarkers 1511 (C3dg, Sensitivity 95%, Specificity 72%), 4402 (Haptoglobin Like Protein, Sensitivity 86%, Specificity 76%) and 3314 (Apolipoprotein E3, Sensitivity 74%, Specificity 90%) were the best performing. Performance of the remaining 10 PD optimal biomarkers was heavily weighted in favor of either sensitivity or specificity, or was balanced but with lower percentages (Table 7).
However, when multivariate linear discriminant analysis was performed employing the concentrations of the group of 13 PD Optimal Biomarkers together, the cumulative performance of the 13 biomarkers (Bottom of Table 7) yielded a combined sensitivity of 86% and specificity of 99% in discriminating between PD and age matched normal controls. Furthermore, when all 34 biomarkers were employed in the multivariate linear discriminant analysis, 100% combined sensitivity and 100% combined specificity were obtained for the same discrimination of the same training set (Table 8). In addition, when the discriminant functions obtained with the training set were challenged with samples from independent groups of patients in test sets, the performance was validated for non-age matched normal controls, PD-Like, and AD disease controls (Table 8).
Similar results were obtained for ALS vs. ALS-Like patients' serum samples in a training set. Applying linear and quadratic discriminant analysis using the concentrations of the 34 biomarkers for discrimination of 145 ALS vs. 35 ALS-Like patients (Table 9), linear discriminant analysis yielded a sensitivity of 69% and a specificity of 88%, and quadratic discriminant analysis yielded 100% sensitivity and 100% specificity.
Usefulness of the Assays for ALS vs. ALS-Like Disorders and Normal and for PD vs. PD-Like Disorders and Normal.
Definitive diagnostic tests to confirm the diagnosis of patients with ALS or PD and distinguish them from patients with ALS-Like and/or PD-Like disorders, which display similar symptoms but have different treatment options and prognoses, have long been sought by clinicians in hopes of providing earlier treatment decisions and improved patient outcomes.
Presently, the diagnosis of ALS, PD, ALS-Like, and PD-Like disorders is a clinical one. There is no objective test that provides diagnostic accuracy. The usual diagnostic process consists of a full medical history, comprehensive physical and neurological examinations based on clinical criteria, and the results of neuro-imaging, conductivity, and analysis of cerebrospinal fluid. To date, no biochemical or genetic test is available to definitively diagnose ALS or PD, or to differentiate them from ALS-Like or PD-Like disorders.
The present invention provides an assay for differentiating ALS and/or PD patients from patients with ALS-Like and/or PD-Like disorders, and from normal individuals. The assay comprises the following steps: (1) collecting a serum sample from a patient; (2) running triplicate 2D gel electrophoreses of the patient sample; (3) staining the 2D gel; (4) creating a digital image of the 2D gel; (5) quantifying the protein concentration in selected protein spots on the 2D gel; and (6) performing a statistical analysis on the quantity of the selected proteins to determine the likelihood of the patient having ALS, PD or an ALS-Like, or PD-Like disorder.
While the methods have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods including the sequence of steps in the methods. Certain agents may be substituted by one of skill and similar results may be achieved, as will be appreciated by one of skill in the art. Such modifications or substitutions to the methods of the present invention are deemed to be within the spirit, scope and concept of the invention as defined by the disclosure and its claims.
The serum concentrations of the C3 and related biomarkers of the present invention are useful to assess the disease processes of inflammation, immune-inflammation, oxidative stress, and inclusion body formation that are integral to neurodegeneration. Elevation of C3c and C3d cleavage products was previously observed in tissues of patients with ALS and PD by immunohistochemical studies [45-47]. Activation of the complement proteins by the alternative pathway is safeguarded by spontaneous low-rate hydrolysis of the thioester in complement C3 and the resultant continuous supply of C3(H2O). Upon the Mg2+-dependent binding of factor B to C3(H2O), the activation of the proenzyme complex by factor D is triggered. The resultant cleavage of B causes the release of Ba, a 30 kD fragment, and the formation of C3(H2O)Bb(Mg), the initial C3 convertase, which is confined to the fluid phase. Bb, a 60 kD fragment, is the second fragment formed as a result of the cleavage of B. It is interesting that none of the previous studies were able to detect factor Bb in neurodegenerative samples. A significant reduction in the full-length factor B in the cerebrospinal fluid (CSF) of PD patients was recently reported [48], whereas we detected elevated levels of fragment Bb of Factor B in blood serum of patients with PD and not in serum of ALS patients. This elevation of Factor Bb in PD most probably arose, at least in part, from the elevated levels of Factor H that we also detected, which has been found to induce irreversible cleavage of C3 Convertase (C3bBb(Mg2+))[49], liberating C3b and factor Bb, and interrupting the Alternative Pathway of the downstream Complement activation cascade.
As a Complement regulatory protein, Factor H acts as a cofactor for the serine protease Factor I, which cleaves the α-chain of C3b at three sites to form iC3b, thus inhibiting the downstream Complement activation cascades. Finehout et al. [48] reported a significant reduction in full-length Factor H levels in the CSF of PD and MS patients when compared to normal CSF controls, whereas a cleavage product (Mwt ˜37 kD, pI 5.9) of the short spliced form of Factor H (Mwt 57.2 kD) is here disclosed as a serum biomarker that is significantly higher in the blood serum of PD and ALS patients when compared to normal serum. In addition, Pre-serum Amyloid Protein (SAP) was among the differences in serum disclosed here. It showed significant reduction in PD and familial ALS, but no difference in the more common sporadic ALS, when compared to controls. Generally, the 2-fold increased level of Complement C3dg, C3c, Factors H, and Bb, in PD over that of ALS may indicate a broader and higher level of abnormal Complement C3b processing and differential effects on the Complement cascade in PD than in ALS, possibly involving additional complement pathways in PD (
These differences in serum biomarkers in patients with the two diseases may prove useful, not only for differential diagnosis, but also for measuring burden of disease, drug targeting and monitoring of therapeutic interventions [58, 59]. For example, as reflected in the mechanisms assayable with these biomarkers, therapeutic approaches for solubilizing aggregated inclusion bodies, combined with anti-inflammatory drugs to attenuate the neurodegenerative response, could be a valuable approach for future neurodegenerative therapy. Monitoring serum biomarkers may be useful to follow the response to therapy and to provide objective, quantifiable measures, as well as for discovery of new targets for early therapy of these diseases.
PPKNGISTKL MNIFLKDSIT TWEILAVSMS DKKGICVADP FEVTVMQDFF IDLRLPYSVV
RNEQVEIRAV LYNYRQNQEL KVRVELLHNP AFCSLATTKR RHQQTVTIPP KSSLSVPYVI
VPLKTGLQEV EVKAAVYHHF ISDGVRKSLK VVPEGIRMNK TVAVRTLDPE RLGREGVQKE
REGVQKEDIP
PADLSDQVPD TESETRILLQ GTPVAQMTED AVDAERLKHL IVTPSGCGEQ
NMIGMTPTVI AVHYLDETEQ WEKFGLEKRQ GALELIKKGY TQQLAFRQPS SAFAAFVKRA
PSTWLTAYVV KVFSLAVNLI AIDSQVLCGA VKWLILEKQK PDGVFQEDAP VIHQEMIGGL
RNNNEKDMAL TAFVLISLQE AKDICEEQVN SLPGSITKAG DFLEANYMNL QRSYTVAIAG
YALAQMGRLK GPLLNKFLTT AKDKNRWEDP GKQLYNVEAT SYALLALLQL KDFDFVPPVV
This application claims priority to U.S. Provisional patent application Ser. No. 60/710,818 filed on Aug. 24, 2005 and entitled “Complement C3c and Related Protein Biomarkers in ALS, Parkinson's Disease, and their Like Disorders” by inventors Ira L. Goldknopf, et al. It also claims priority to U.S. Provisional patent application 60/742,462 filed on Dec. 12, 2005 and entitled “Remarkable Similarities and Differences in Serum Concentrations of 3 forms of Complement C3 and Related Protein Biomarkers among ALS, Parkinson's and “Like” Diseases—C3 and Related Biomarkers and Diagnosis” by inventors Ira L. Goldknopf, et al. It also claims priority to U.S. Provisional patent application 60/752,129 filed on Dec. 21, 2005 and entitled “2D Gel Blood Serum Biomarkers and Testes for Differential Diagnosis, Disease Burden and Drug Response Monitoring, and Drug Targeting for the Neurodegenerative Diseases” by inventors Ira L. Goldknopf, et al.
Number | Date | Country | |
---|---|---|---|
60710818 | Aug 2005 | US | |
60742462 | Dec 2005 | US | |
60752129 | Dec 2005 | US |