The invention is in the field of diagnostics for liver disease. More particularly it relates to new markers for Hepatitis C infection and for liver fibrosis.
Diagnosis of liver conditions due to virus, chemicals, drugs or metabolic disorders has been uncertain for many years. Biopsy is considered the gold standard, but it is not completely reliable in view of the lack of uniformity in distribution of diseased portions of the liver. It is also invasive and has a number of complications and risks. Others have attempted to find markers that would be monitored in serum or blood with varying degrees of success.
The human liver is an especially important target for damage by hepatitis. Liver biopsy is recommended in the management of patients with chronic Hepatitis C (CHC) to provide important information about fibrosis stage and disease prognosis. As an invasive procedure, liver biopsy is frequently accompanied by transient pain and may occasionally be associated with serious complications. The accuracy of liver biopsy in staging liver disease is limited by the size and quality of the samples and by sampling error.
In recent years, intensive research in the field of noninvasive tests of liver fibrosis has yielded a few laboratory markers that enable assessment of some aspects of the severity of Hepatitis C virus (HCV)-induced liver disease. For example, FibroTest™ combines six serum markers (alpha-2-macroglobulin, haptoglobin, apolipoprotein A1, gamma-glutamyl transpeptidase, alanine transaminase and total bilirubin) with the age and gender of the patient to generate a score that correlates with stage of fibrosis in patients with a variety of liver diseases. Platelet counts, AST/ALT ratio, and AST-platelet ratio index (APRI) have been reported as predictors of degree of fibrosis in CHC patients. In the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial, a model based on a combination of standard laboratory tests comprising platelet count, AST/ALT ratio, and INR (international normalized ratio of prothrombin time) predicted histological cirrhosis with high accuracy in 50% of patients with CHC.
Two proteins described below, PROC and RBP4, have been related to liver diseases. The plasma PROC level of patients with liver damage due to chronic alcohol consumption was reported to be decreased and to correlate with the clinical performance of the patients (Kloczko, J., et al., Haemostasis (1992) 22:340-344). Serum concentrations of RBP4 were reported to be significantly higher in obese children with non-alcoholic fatty liver disease (NAFLD) than in controls, and RBP4 was proposed as a serum marker of intrahepatic lipid content in obese children (Romanowska, A., et al., Acta Biochim Pol (2011) 58:35-38). A contradictory report showed that serum RBP4 levels did not differ between the steatosis group and controls, or between subgroups with high and normal ALT, indicating that serum RBP4 may not be a predictive factor in NAFLD (Cengiz, C., et al., Eur J Gastroenterol Hepatol (2010) 22:813-819).
All documents cited herein are incorporated by reference in their entirety.
Molecular signatures specific for the liver and/or specific for particular causes of liver damage would be very useful for experimental, clinical, and epidemiology studies of liver diseases. There remains a need for noninvasive techniques to provide information with respect to disease status and prognosis.
Through development of targeted proteomic assays utilizing Selected Reaction Monitoring (SRM) incorporating heavy-isotope doping of labeled matched peptides (Picotti, P., et al., Cell (2009) 138:795-806 and Picotti, P., et al., Nat Methods (2010) 7:43-46), and use of the Human SRMAtlas with optimized transitions associated with typically six different peptides for nearly all of the 20,300 human protein-coding genes, accurate quantitation of target proteins can be achieved for most human proteins present at levels that can be detected by targeted mass spectrometry. SRM proteomics have been employed to identify many liver-specific proteins and to characterize the critical progression of fibrosis of the liver to cirrhosis in CHC patients.
The invention thus resides in the identification of new markers that are capable of identifying individuals who are infected with hepatitis or exhibit chronic liver disease generally. The markers of the invention may also be used to establish the status of liver fibrosis in relation to a commonly recognized scale of severity, the Ishak score, which ranks severity of fibrosis based on histological criteria from a minimum of 0 to a maximum disease state of 6.
Thus, in one aspect, the invention is directed to a method to distinguish a subject infected with hepatitis virus from a non-infected subject or, more generally, a subject with an abnormal liver condition from a normal subject, which method comprises assessing the level of alpha-1-B glycoprotein (A1BG) or complement Factor H (CFH) or insulin-like growth factor binding protein acid labile subunit (IGFALS) or combinations thereof in the blood or fraction thereof of said subject, whereby a statistically significant increased level of A1BG or a statistically significant decreased level of CFH or IGFALS or combinations thereof as compared to normal individuals identifies said subject as infected with hepatitis or afflicted with an abnormal liver condition generally. Particularly important is the ability to detect infection with Hepatitis C.
While the example below develops the method and verifies it with respect to a cohort of patients infected with Hepatitis C which is a chronic condition, the markers of the invention listed above will also identify patients with acute liver conditions that are abnormal.
In another aspect, the same criteria as mentioned in the previous paragraph may be used to distinguish subjects with liver fibrosis or other abnormal liver condition from those not affected with this condition.
The invention is also directed to a panel of reagents for detecting the foregoing markers, A1BG, CFH, and IGFALS. The panel may contain reagents for detection of only one or two or three of these markers. Typical reagents would include, for example, antibodies or portions thereof, aptamers, and the like.
In still another aspect, the invention is directed to a method to determine severity of liver fibrosis in a subject exhibiting liver fibrosis, which method comprises determining the level of protein C (PROC) and/or retinol binding protein 4 (RBP4) in the blood or a fraction thereof of a subject, wherein statistically significant decreased levels of one or both of said PROC and RBP4 as compared to normal individuals correlate positively with the severity of fibrosis of the liver in said subject. The fibrosis may be due to hepatitis infection or may result from other causes, such as infections in general, toxic substances, genetic mutations or cancer. In addition, PROC and RBP4 levels may be added to the panel of markers, alongside the A1BG and/or CFH and/or IGFALS described above, to distinguish a subject having an abnormal liver condition from a subject who does not.
Reagents reactive with RBP4 and/or PROC are also included in the invention. These reagents may also be added to the panel for detection of A1BG, CFH and/or IGFALS.
The invention provides five new useful markers for liver conditions in individuals that permit simple blood tests for their determination. The diagnostic tests are conveniently done on a fraction of the blood such as serum or plasma. While the test is most useful in detecting disease in humans, it is also applicable to other warm-blooded animals such as livestock, household pets, and service animals such as horses. They may also be used in laboratory animals such as disease models. The availability of such tests, particularly in combination with other markers and/or using a multiplicity of the invention markers permits use of a simple test for liver conditions, optionally in combination with other indications that can be performed at the point of care. The availability of multiplexed assays for proteins using miniaturized devices makes these tests especially important in providing fast assays with high reliability. Thus, in addition specifically to identifying subjects infected with hepatitis or afflicted with other liver fibrotic conditions, these markers may be used in combination with markers for other diseases and conditions to provide a detailed assessment of the physiological state of an individual in a single multiplexed assay.
Classifying stages of liver fibrosis by biopsy has an accuracy of about 80% (Grigorescu, M., et al., J. Gastrointestin Liver Dis (2006) 15:149-159). Other studies have suggested that there can be up to a 33% error in the diagnosis of cirrhosis by biopsy (Afdhal, N. H., et al., Am J Gastroenterol (2004) 99:1160-1174). When the value of a biomarker is validated against biopsy, the biomarker is unlikely to show a discrimination power that exceeds that of biopsy. In fact, reported noninvasive methods intended for discriminating hepatic fibrosis rarely have an AUROC (accuracy) exceeding 0.8 to 0.9 (Ray, S, et al., Hepatitis C, Churchill, Livingstone, Elsevier, Philadelphia, Pa. 2009). Thus the results shown in the example below with respect to the invention methods and reagents are unexpected and surprising.
The biomarkers described hereinbelow represent a distinct improvement from the reliability of these prior art methods as demonstrated in the example below. The newly discovered markers, A1BG, CFH and IGFALS, may be used individually or in various combinations in panels to distinguish subjects with chronic liver conditions, such as liver infection or fibrosis, from normal controls. Addition of one or both of the markers PROC and RBP4 to the determination may increase the accuracy of the assay even more. The latter two proteins are particularly useful in determining the progression of liver conditions.
One of the important aspects of the invention relates to establishing the state of progression of liver fibrosis. This is generally ranked on the Ishak scale from 0 to 6. Mild fibrosis, which is representative of the early stage of disease, has a rating of 1. More severe or more progressed disease has higher ratings on the Ishak scale. Ishak 2 represents a more advanced stage of disease, for example.
In addition to being used independently, the protein markers of the invention can be used in conjunction with other indicators. For example, low platelet counts are associated with progression of fibrosis. Used alone, this would be non-determinative, since it is well known that low blood platelet levels (thrombocytopenia) have many causes other than liver disease, such as leukemia, some types of anemia, immune system malfunction, metabolic disorders, viral infections, toxic chemicals or medication side effects. These can lead to decreased platelet production or increased platelet destruction. Addition of liver-specific proteins to a diagnostic test panel increases the diagnostic relevance of platelet counts.
The markers of the invention have been identified by focusing on proteins known to be relatively specific for liver as compared to other organs as discerned from publicly available databases. Typically, proteins normally present at high levels in the serum or plasma sample are first eliminated from the sample and then the levels of diagnostic proteins assessed. A particularly useful method for detecting low levels of proteins in biological samples is selected reaction monitoring (SRM) targeted proteomics which was used to obtain the initial data. Sophisticated statistical analysis procedures were then employed to demonstrate that the five markers that are the subject of the present invention are reliable indicators of the conditions with which they are associated.
Once the nature of the markers for the conditions indicated has been identified, means for measuring their levels and determining the statistical significance of them as compared to levels in normal subjects are well known in the art. A multiplicity of techniques for measuring levels of protein is well established and includes, for example, various immunological based assays, mass spectra based assays and the like. Further, determination of a statistically significant difference in a test subject as compared to normal subjects employs statistical analysis well understood by practitioners. The values in normal subjects may be obtained from the literature, from comparable healthy individuals, or from averages of levels from a multiplicity of individuals. If necessary, the levels in normals may be matched for other, perhaps relevant factors such as age, gender, ethnic background and the like. All of this is well within the expertise of the ordinary artisan.
Protein analytes are good targets for developing antibodies or synthetic capture agents that can be integrated into microfluidic chips (Integrated Blood-Barcode Chip)—devices that have the potential to analyze large numbers of patient samples rapidly (in a few minutes), inexpensively, and in a highly multiplexed format (100s or even 1000s of different assays investigating many different diseases) employing blood from a pinprick. Such microfluidic devices are likely to constitute an important foundation for P4 (Predictive, Preventive, Personalized, and Participatory) Medicine with Point-of-Care Diagnosis.
As noted above, panels of reagents for detection of the markers useful in the methods discussed herein are also included in the invention. Thus, the invention includes panels of reagents for use in each of these methods.
In one embodiment, the reagents are antibodies specifically immunoreactive with the markers to be detected. “Antibodies” includes complete antibodies, fragments of the antibodies that are immunoreactive with the protein to be detected, as well as recombinantly produced forms such as single-chain antibodies. The antibodies may be monoclonal or polyclonal and include antibodies derived from any subject since immunogenicity is not an issue in such assays.
Also included in the invention are individual antibodies as described above for each of the five protein markers useful in the methods as well as aptamers that specifically bind these proteins.
The following example is presented to illustrate but not to limit the invention.
This example is a detailed description of identification of the present markers and of the manner in which they can be used to assess chronic conditions of the liver such as infection, liver fibrosis, and fibrosis progression in an individual.
By employing a liver-specific protein strategy (employing comprehensive transcriptomic databases) and targeted quantitative SRM proteomics technology, we have analyzed 38 liver-specific protein levels in sera of 17 healthy controls and of 38 HCV patients at Ishak fibrosis stages from 2 to 6. Two protein markers, protein C (PROC) and retinol binding protein 4 (RBP4) are present at lower levels in patients than in controls. With Area Under the Curve (AUC) statistical analyses, these two proteins distinguish fibrosis vs. cirrhosis among Chronic Hepatitis C (CHC) patients. Three proteins, A1BG, CFH and IGFALS, distinguish HCV-infected patients from healthy controls, with an individual AUROC score >0.96 for each marker.
Serum samples were obtained from patients who participated in the HALT-C Trial (Ghany, M. G., et al., Hepatology (2011) 54:1527-1537). This trial enrolled patients with CHC who had liver biopsies showing Ishak stages 2-6 (range 0-6) fibrosis at enrollment. Blood samples at enrollment were studied. Patient information at enrollment is listed in Table 1 at the end of this example.
Control sera from normal female and male donors ages 30-50 years were collected at FDA-regulated blood facilities and were non-reactive for HCV antibody (ProMedDx). Pooled plasma from 10 normal donors was obtained from Innovative (Novi, Mich.). Collection and use of control and patient samples were approved by institutional review boards. Samples were stored at −80° C.
Sample Preparation for SRM
To reduce the complexity of samples, the top 14 highly abundant proteins were depleted using an AKTA FPLC system (GE Healthcare, USA) coupled with a Seppro® IgY14 human LC2 depletion column (Sigma-Aldrich, USA). We observed significant sample-to-sample variations with the Seppro® IgY14 spin column. In contrast, the LC system coupled with an IgY14 human LC2 depletion column dramatically improved the reproducibility. All 40 HALT-C and 17 normal serum samples were processed similarly; about 95% of the total protein was depleted. Depleted sera were digested with trypsin and desalted with Oasis™ MCX cartridges (Waters, Milford, Mass.).
Building the Liver-Specific and Liver-Enriched Proteins List
We used a targeted approach focusing on organ-specific proteins to increase the likelihood of identifying protein biomarkers in blood that may reflect pathology of a particular organ. Our list of liver-specific or liver-enriched proteins (liver proteins) was created by mining multi-organ transcriptomic data generated through Massively Parallel Signature Sequencing (MPSS). The MPSS dataset contains transcriptomes of 34 pooled normal (Caucasian) human tissues (Lin, B., et al., PLoS One (2010) 5:e10210). Signatures with expression levels in liver either 5-fold higher than in any other organ or 2-fold greater than the sum of all other organs were selected as liver protein candidates. We also performed an organ-specific protein search with Gene Atlas Interface analysis. The databases searched were 3 datasets from NCBI-GEO (Gene Expression Omnibus) with a total of 180 human tissues from multiple donors (Ge, X., et al., Genomics (2005) 86:127-141; Roth, R. B., et al., Neurogenetics (2006) 7:67-80; and Su, A. I., et al., PNAS USA (2004) 101:6062-6067). We included 21 enzymes and other proteins used in clinical practice or previously reported as liver biomarker candidates. Unfortunately, of these 21 potential liver markers, only coagulation factor II (F2) was successfully detected by SRM.
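The two fold-change thresholds described above can be illustrated with a short sketch. It is not the pipeline used in the study: the function name, the dictionary layout, and the tissue names and values in the usage note are hypothetical, and treating the thresholds as inclusive (greater than or equal to) is an assumption.

```python
def is_liver_enriched(expression, liver_key="liver", fold_any=5.0, fold_sum=2.0):
    """Apply the two selection rules to one transcript signature.

    `expression` maps tissue name -> expression level (hypothetical layout).
    A signature qualifies if its liver level is at least `fold_any` times the
    level in every other tissue, or at least `fold_sum` times the summed level
    of all other tissues. Inclusive thresholds are an assumption.
    """
    liver = expression[liver_key]
    others = [level for tissue, level in expression.items() if tissue != liver_key]
    if liver <= 0 or not others:
        return False
    exceeds_each = liver >= fold_any * max(others)
    exceeds_sum = liver >= fold_sum * sum(others)
    return exceeds_each or exceeds_sum
```

For example, with made-up values, `is_liver_enriched({"liver": 500, "kidney": 120, "heart": 100})` fails the 5-fold rule (500 < 600) but passes the 2-fold-of-sum rule (500 >= 440), so the signature would be kept as a candidate.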
Peptide selection from the Liver Protein List
Two to three peptides were selected for each liver protein based on the sequence of individual liver proteins that were previously detected by mass spectrometry. Peptide selection criteria are as follows (Lange, V., et al., Mol Syst Biol (2008) 4:222): 1) length 8-20 amino acid residues; 2) no chemically unstable residues (single letter notation; M, NG, DG, QG, N-terminal N, and N-terminal Q); 3) LC-compatible; 4) avoid cysteine residue if possible; and 5) sequence specific for the target protein (e.g., proteotypic peptides). Peptides previously identified in PeptideAtlas (Deutsch, E. W., et al., EMBO Rep (2008) 9:429-434) were preferentially chosen.
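The sequence-based criteria above (1, 2 and 4) lend themselves to a simple filter, sketched here under stated assumptions. Criteria 3 (LC compatibility) and 5 (proteotypic specificity) cannot be decided from the sequence alone and are not modeled. The function name and the `allow_cysteine` switch are hypothetical, and the sequences in the usage note are invented for illustration.

```python
# Dipeptide motifs listed as chemically unstable in criterion 2.
UNSTABLE_MOTIFS = ("NG", "DG", "QG")


def passes_peptide_criteria(sequence, allow_cysteine=False):
    """Check a peptide (single-letter code) against criteria 1, 2 and 4.

    Cysteine is only to be avoided "if possible", so it is rejected by
    default and can be permitted with allow_cysteine=True (an assumption
    about how that preference might be applied).
    """
    seq = sequence.upper()
    if not 8 <= len(seq) <= 20:                        # criterion 1: length
        return False
    if "M" in seq:                                     # criterion 2: methionine
        return False
    if any(motif in seq for motif in UNSTABLE_MOTIFS):  # criterion 2: NG, DG, QG
        return False
    if seq[0] in ("N", "Q"):                           # criterion 2: N-terminal N or Q
        return False
    if "C" in seq and not allow_cysteine:              # criterion 4: cysteine
        return False
    return True
```

With invented sequences, `passes_peptide_criteria("LLEVPEGR")` is accepted, while `"LLENGEGR"` (contains NG) and `"NLEVPEGRAA"` (N-terminal N) are rejected.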
Mass Spectrometry and HPLC
All SRM analyses were performed on an Agilent 6460A triple quadrupole (QQQ) mass spectrometer with a ChipCube™ nanoelectrospray ionization source coupled with an Agilent 1200 nanoflow HPLC system. Serum samples were eluted over a 60-minute gradient with a 0.66% per minute acetonitrile slope in the presence of 0.1% formic acid using a large capacity Agilent HPLC chip (Cat #G4240-62101, 160 nL trap, 150 mm C18 column). Spray voltage was set at 1900 V. Scheduled SRM analyses were performed with 5 min retention time windows and an instrument cycle time of 2000±500 ms. Dwell times varied depending on the number of concurrent transitions; in all cases they were at least 10 ms.
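The relationship between cycle time, concurrent transitions and dwell time can be made concrete with a back-of-the-envelope sketch. It assumes the cycle time is divided evenly among concurrent transitions and ignores inter-transition overhead, which a real instrument method would account for; the function names are hypothetical.

```python
def dwell_time_ms(cycle_time_ms, concurrent_transitions):
    """Approximate dwell time per transition, assuming the cycle time is
    split evenly and inter-transition delays are negligible."""
    if concurrent_transitions < 1:
        raise ValueError("need at least one concurrent transition")
    return cycle_time_ms / concurrent_transitions


def max_concurrent_transitions(cycle_time_ms, min_dwell_ms=10.0):
    """Largest number of concurrent transitions that keeps the dwell time
    at or above min_dwell_ms under the same simplification."""
    return int(cycle_time_ms // min_dwell_ms)


def gradient_span_percent(slope_percent_per_min, duration_min):
    """Total change in acetonitrile percentage over a linear gradient."""
    return slope_percent_per_min * duration_min
```

Under these simplifications, a 2000 ms cycle and a 10 ms minimum dwell would allow at most 200 concurrent transitions within a retention time window, and a 0.66% per minute slope over 60 minutes spans about 39.6 percentage points of acetonitrile.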
Monitoring Liver-Specific Proteins in Blood by SRM
Crude unpurified peptide standards that correspond to the detected natural counterparts (light peptides) were synthesized with heavy isotopic Lysine (¹³C₆, ¹⁵N₂) or Arginine (¹³C₆, ¹⁵N₄) at the C-termini (heavy peptides) (Thermo-Fisher Scientific, Germany or Sigma-Aldrich, USA). Collision energies (CE) were determined using the default formula from Agilent (CE = 0.036 × precursor m/z − 4.80) and then optimized with 4 additional CE steps (±5 V, ±10 V). The best 4 transitions were selected. Detected heavy peptides were titrated at 6 concentrations in a normal human serum background to build a titration curve and to determine the proper amount of each peptide standard to spike in.
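As a worked illustration of the collision energy formula and the four additional optimization steps, the following sketch computes the default CE and the five candidate values for one precursor. The function names are hypothetical, and the precursor m/z in the usage note is an arbitrary example, not a value from the study.

```python
def default_collision_energy(precursor_mz, slope=0.036, intercept=-4.80):
    """Default CE in volts from the linear formula quoted in the text:
    CE = 0.036 * precursor m/z - 4.80."""
    return slope * precursor_mz + intercept


def ce_candidates(precursor_mz, offsets=(-10, -5, 0, 5, 10)):
    """Default CE plus the four additional steps (+/-5 V and +/-10 V),
    rounded to two decimals for readability."""
    base = default_collision_energy(precursor_mz)
    return [round(base + offset, 2) for offset in offsets]
```

For an arbitrary precursor at m/z 500, the default CE is 0.036 × 500 − 4.80 = 13.2 V, so `ce_candidates(500.0)` yields 3.2, 8.2, 13.2, 18.2 and 23.2 V; the transition signal would be compared across these five settings.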
SRM Data Analysis
All SRM data were processed using the Skyline Targeted Proteomics Environment (v1.1) (MacLean, B., et al., Bioinformatics (2010) 26:966-968). The setting of 0.055 Th match tolerance m/z was used. The default peak integration and Savitzky-Golay smoothing algorithm were applied. All data were manually inspected to ensure correct peak detection and accurate integration. Peptides with at least 3-fold signal-to-noise ratio were considered detectable. The total peak area and Light/Heavy ratio of each peptide were exported for statistical analysis.
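The two quantities used downstream, the detectability cutoff and the Light/Heavy ratio, can be sketched as follows. In the study these values came from Skyline exports; the functions below are hypothetical helpers, and the way signal and noise are measured (for example, peak height over local background) is an assumption not specified in the text.

```python
def light_to_heavy_ratio(light_area, heavy_area):
    """Ratio of the endogenous (light) peak area to the spiked-in
    heavy-labeled standard's peak area for the same peptide."""
    if heavy_area <= 0:
        raise ValueError("heavy peak area must be positive")
    return light_area / heavy_area


def is_detectable(signal, noise, min_signal_to_noise=3.0):
    """Apply the at-least-3-fold signal-to-noise criterion."""
    if noise <= 0:
        raise ValueError("noise must be positive")
    return signal / noise >= min_signal_to_noise
```

Because the heavy standard is spiked at a fixed amount, the Light/Heavy ratio gives a relative measure of the endogenous peptide that can be compared across samples.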
Statistical Analysis
The exported SRM results were analyzed using R scripts generated for this data set. The pre-processing selection included proteins that have no more than 30% missing data. A similar criterion was applied to samples. Missing data were handled using a k-nearest neighbor imputation algorithm (k=10) (Troyanskaya, O., et al., Bioinformatics (2001) 17:520-525). Repeated (duplicate) measurements for the same protein-peptide-m/z combination were averaged. Platelet level and gender were included as clinical predictors of liver damage. Regularization methods based on logistic regression were used to reduce overfitting. LASSO and Elastic Net penalties (Zou, H., et al., J. Roy. Statist. Soc. Ser. B (2005) 67:301-320) were applied. The optimal regularization parameter was chosen using the Area Under the Receiver-Operating-Characteristic (AUROC) curve as the criterion.
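The missing-data filter and the nearest-neighbor imputation step can be sketched in plain Python. This is a simplified stand-in, not the R scripts used in the study: it fills each gap with the unweighted mean of the k nearest rows, whereas the cited method uses a weighted average, and the data layout and function names are hypothetical.

```python
import math


def filter_by_missingness(matrix, max_missing=0.30):
    """Keep rows (e.g. proteins) whose fraction of missing values (None)
    does not exceed max_missing. `matrix` maps row name -> list of values."""
    return {
        name: values
        for name, values in matrix.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }


def knn_impute(matrix, k=10):
    """Fill each None with the mean of that column over the k nearest rows.

    Distance is the root-mean-square difference over columns observed in
    both rows. Unweighted averaging is a simplification of the cited method.
    """
    filled = {name: list(values) for name, values in matrix.items()}
    for name, row in matrix.items():
        missing_cols = [j for j, v in enumerate(row) if v is None]
        if not missing_cols:
            continue
        neighbours = []
        for other_name, other in matrix.items():
            if other_name == name:
                continue
            shared = [(a, b) for a, b in zip(row, other)
                      if a is not None and b is not None]
            if shared:
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                neighbours.append((dist, other_name))
        neighbours.sort()
        for j in missing_cols:
            donors = [matrix[n][j] for _, n in neighbours
                      if matrix[n][j] is not None][:k]
            if donors:
                filled[name][j] = sum(donors) / len(donors)
    return filled
```

The same filter can be applied to samples by transposing the matrix, mirroring the statement that a similar criterion was applied to samples.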
In order to obtain an approximately unbiased assessment of the performance of predictive signatures, tenfold cross validation was used to correct for the generally overoptimistic model building and signature optimization bias. An average over cross-validation runs is reported as the final optimal AUC characterizing the predicted performance of the biomarker signature. LASSO penalty was preferred for its ability to drop non-essential biomarkers from the signature by explicitly assigning them zero weights. All other analyses including calculation and graphics were generated by Prism 5 (GraphPad software, La Jolla, Calif., USA).
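The cross-validated AUROC described above can be illustrated with a minimal sketch. The AUROC is computed as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The `fit` callback is a hypothetical placeholder for whatever model is trained on each training split (in the study, a penalized logistic regression); the interleaved fold assignment and the skipping of folds that lack one class are assumptions made for brevity.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def kfold_indices(n_samples, k=10):
    """Deterministic interleaved split of range(n_samples) into k folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def cross_validated_auroc(samples, labels, fit, k=10):
    """Average held-out AUROC over k folds.

    `fit(train_samples, train_labels)` must return a function that maps a
    sample to a score (higher = more likely positive, label 1).
    """
    aucs = []
    for fold in kfold_indices(len(samples), k):
        held_out = set(fold)
        train_x = [s for i, s in enumerate(samples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        score = fit(train_x, train_y)
        pos = [score(samples[i]) for i in fold if labels[i] == 1]
        neg = [score(samples[i]) for i in fold if labels[i] == 0]
        if pos and neg:  # a fold lacking one class has no defined AUROC
            aucs.append(auroc(pos, neg))
    if not aucs:
        raise ValueError("no fold contained both classes")
    return sum(aucs) / len(aucs)
```

Because the scoring function is refit on each training split and evaluated only on held-out samples, the averaged value is less optimistic than an AUROC computed on the data used for fitting.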
Results
The Identification of Liver Proteins
Using strategies described above, we identified 109 liver-specific proteins. GeneCards® summarized each gene expression in normal and diseased human tissues by three categories: 1) mRNA expression data from GeneNote and GNF BioGPS, 2) UniGene electronic Northern, and 3) SAGE (Serial Analysis of Gene Expression). In combination with the 21 proteins used in clinical practice or reported as liver biomarker candidates, a list composed of 130 proteins was created.
Proteins Detected by SRM
After suitable peptides and transitions for each liver protein were selected as described above, we used control plasma and serum to determine how many liver proteins can be detected by SRM. From the 89 liver proteins previously observed in tandem MS experiments, we detected 100 peptides derived from 54 proteins in pooled control plasma (Table 2 at the end of this example). However, the HALT-C samples were in the form of serum, which is not ideal for mass spectrometry-based blood protein measurements because variation in proteolysis derived from the coagulation cascade results in decreased concentrations of proteins compared to plasma. To determine how many of the proteins detected in plasma can also be detected in serum, we performed the same SRM analyses against pooled healthy human serum samples. Altogether 38 proteins (represented by 65 peptides) were detected in sera with reasonable signal intensity, as shown in Table 3 at the end of this example.
Consistency and Accuracy of SRM Data
Duplicate SRM Runs Are Well Correlated
Duplicate runs were performed for each sample; technical variations between the two runs were generally small (
Protein Levels Measured by Multiple Features Are Consistent
When a protein level is measured by more than one feature (i.e., multiple peptides or same peptide with differently charged precursor ions), close agreement in quantification was observed; an example is the set of three features for the protein of A1BG (
Absolute Protein Levels in Sera Measured by SRM Are Similar to Prior Report
We did not aim to quantify absolute protein concentrations. The crude heavy peptides synthesized by rapid process peptide synthesis were not appropriate for absolute quantitation of protein levels in specimens due to their wide range of purity (~50% to 80%). In addition, the immunodepletion procedure during sample preparation induces additional variation in protein concentrations. Nevertheless, as summarized in Table 4, levels of five informative proteins in depleted control sera measured by SRM are close to concentrations reported in the published literature. The only exception is RBP4; the concentration in our SRM study is 10-fold lower than in the studies of Gahne, B., et al. (Hum Genet (1987) 76:111-115) and Polanski, M, et al. (Biomark Insights (2007) 1:1-48) and 100-fold lower than in the study of Farrah, et al. (Mol Cell Proteomics (2011) 10:M110 006353).
A1BG, CFH and IGFALS Distinguish Controls from HCV Patients
As shown in
An integrated analysis of A1BG, CFH and IGFALS or A1BG with either of the other two markers resulted in a perfect AUROC score of 1.0 with predicted sensitivity and specificity of 100% for discriminating healthy controls and HCV-infected patients (
PROC and RBP4 Levels in Serum can Further Classify Patients with Fibrosis
The concentrations of two proteins—PROC and RBP4—showed good correlation with disease severity. Serum concentrations of PROC and RBP4 decreased as liver disease progressed. Box-and-whisker plots and Student's t-test showed that each protein can distinguish different disease stages (
Student's t-test was used to compare serum concentrations of PROC and RBP4 between patients with earlier stages of fibrosis (Ishak 2-4) and patients with cirrhosis (Ishak 5-6). PROC appears to be a good marker to distinguish cirrhosis patients from those with fibrosis (p=0.004). RBP4 levels showed a similar decrease from normal controls to Ishak 5-6, but the difference between Ishak 2-4 and 5-6 was not statistically significant with outliers included (p=0.07; p=0.02 without outliers). AUROC scores for PROC were 0.77 for controls vs. patients, 0.75 for Ishak <5 vs. Ishak ≥5, and 0.83 for Ishak <5 vs. Ishak 6. AUROC scores for RBP4 were 0.79 for controls vs. patients, 0.68 for Ishak <5 vs. Ishak ≥5, and 0.80 for Ishak <5 vs. Ishak 6.
Ishak 6 Patients are Effectively Distinguished by PROC and RBP4
When we combined Ishak 2-4 or 2-5 patients together and compared against the most advanced Ishak 6 patients, more impressive differences between these groups were revealed with an AUROC of 0.83 for PROC and 0.80 for RBP4.
The discrimination power of the 5 proteins in terms of AUROC is summarized as follows:
Multivariate Analysis
Multivariate analysis of Ishak 2-4 vs. 5-6 patients using PROC and RBP4 proteins, gender, and platelets gave a cross-validated AUROC=0.89. PROC, RBP4, and platelets distinguish advanced stages of fibrosis patients (Ishak 5-6) from patients in earlier stages with an impressive sensitivity of 95% and specificity of 84% (
This application claims priority under 35 U.S.C. §119(e) to provisional U.S. application Ser. No. 61/565,383 filed 30 Nov. 2011. The contents of this document are incorporated herein by reference in their entirety.
This work was supported in part by grants from the National Institutes of Health. The U.S. government has certain rights in this invention.
Number | Date | Country
---|---|---
61565383 | Nov 2011 | US