The invention generally relates to biomarkers for fatty liver disease and methods based on the same biomarkers.
The prevalence of Nonalcoholic Fatty Liver Disease (NAFLD), which encompasses an entire histologic spectrum ranging from simple, benign hepatic steatosis to nonalcoholic steatohepatitis (NASH) characterized by lipid accumulation, inflammation, hepatocyte ballooning, and varying degrees of fibrosis, continues to increase in concert with the obesity epidemic. Despite increasing awareness of obesity-related liver disease, the pathogenesis of NAFLD and NASH is poorly understood and there are no FDA-approved therapies with NASH as an indication. Diagnosis of NASH remains complicated and with significant risk due to the requirement for an invasive liver biopsy. Therefore, identification of a profile of blood-based metabolite biomarkers able to diagnose and stage NAFLD in a patient with or suspected of having liver disease for prognostic purposes (i.e., at risk of progression to a more advanced liver disease stage) is a significant unmet medical need.
Fatty change in the liver results from excessive accumulation of lipids within hepatocytes. Fatty liver is the accumulation of triglycerides and other fats in the liver cells. Fatty liver disease can range from fatty liver alone (simple fatty liver, steatosis) to fatty liver associated with hepatic inflammation (steatohepatitis). Although having fat in the liver is not normal, by itself it probably causes little harm or permanent damage. Steatosis is generally believed to be a benign condition, with rare progression to chronic liver disease. In contrast, steatohepatitis may progress to liver fibrosis and cirrhosis, can be associated with hepatocellular carcinoma and may result in liver-related morbidity and mortality.
Steatosis can occur with the use of alcohol (alcohol-related fatty liver) or in the absence of alcohol (nonalcoholic fatty liver disease, NAFLD). Steatohepatitis may be related to alcohol-induced hepatic damage or may be unrelated to alcohol. If steatohepatitis is present but a history of alcohol use is not, the condition is termed nonalcoholic steatohepatitis (NASH).
In the absence of alcohol the main risk factors for simple fatty liver (NAFLD) and NASH are obesity, diabetes, and high triglyceride levels. In NASH, fat builds up in the liver and eventually causes scar tissue. This type of hepatitis appears to be associated with diabetes, protein malnutrition, obesity, coronary artery disease, and treatment with corticosteroid medications. Fibrosis or cirrhosis in the liver is present in 15-50% of patients with NASH. Approximately 30% of patients with fibrosis develop cirrhosis after 10 years.
Fatty liver disease is now the most common cause for elevated liver function tests in the United States. It is now probably the leading reason for mild elevations of transaminases. Steatosis affects approximately 25-35% of the general population. NAFLD is found in over 80% of patients who are obese. NASH affects 2 to 5 percent of Americans and has been detected in 1.2-9% of patients undergoing routine liver biopsy. Over 50% of patients undergoing bariatric surgery have NASH. The disease strikes males and females; early studies report >70% of cases were in females but recent studies report 50% of patients are females. Fatty liver occurs in all age groups. In the United States NASH is the most common liver disease among adolescents and is the third most common cause of chronic liver disease in adults (after hepatitis C and alcohol).
Both NASH and NAFLD are becoming more common, possibly because of the greater number of Americans with obesity. In the past 10 years, the rate of obesity has doubled in adults and tripled in children. Obesity also contributes to diabetes and high blood cholesterol, which can further complicate the health of someone with NASH. Diabetes and high blood cholesterol are also becoming more common among Americans.
NASH is usually a silent disease with few or no symptoms. Patients generally feel well in the early stages and only begin to have symptoms—such as fatigue, weight loss, and weakness—once the disease is more advanced or cirrhosis develops. The progression of NASH can take years, even decades. The process can stop and, in some cases, reverse on its own without specific therapy. Or NASH can slowly worsen, causing scarring or “fibrosis” to appear and accumulate in the liver. As fibrosis worsens, cirrhosis develops; the liver becomes seriously scarred, hardened, and unable to function normally. Not every person with NASH develops cirrhosis, but once serious scarring or cirrhosis is present, few treatments can halt the progression. A person with cirrhosis experiences fluid retention, muscle wasting, bleeding from the intestines, and liver failure. Liver transplantation is the only treatment for advanced cirrhosis with liver failure, and transplantation is increasingly performed in people with NASH. NASH ranks as one of the major causes of cirrhosis in America, behind hepatitis C and alcoholic liver disease.
NASH is usually first suspected in a person who is found to have elevations in liver tests that are included in routine blood test panels, such as alanine aminotransferase (ALT) or aspartate aminotransferase (AST). When further evaluation shows no apparent reason for liver disease (such as medications, viral hepatitis, or excessive use of alcohol) and when x-rays or imaging studies of the liver show fat, NASH is suspected. The only means of proving a diagnosis of NASH and separating it from simple fatty liver is a liver biopsy. A liver biopsy requires a needle to be inserted through the skin and the removal of a small piece of the liver. If the tissue shows fat without inflammation and damage, simple fatty liver or NAFLD is diagnosed. NASH is diagnosed when microscopic examination of the tissue shows fat along with inflammation and damage to liver cells. A biopsy is required to determine whether scar tissue has developed in the liver. Currently, no blood tests or scans can reliably provide this information. Therefore there exists a need for a less invasive diagnostic method (i.e. a method that would not require a biopsy).
In one aspect, the present disclosure provides methods of diagnosing or aiding in the diagnosis of liver disease in a subject, comprising: analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for liver disease in the sample, where the one or more biomarkers are selected from Tables 12, 2, 3, 4, 5, 7, 8, 10, 11, 14, 16 and/or 18 and comparing the level(s) of the one or more biomarkers in the sample to liver disease-positive and/or liver disease-negative reference levels of the one or more biomarkers in order to diagnose whether the subject has liver disease.
In another aspect, the present disclosure provides methods of diagnosing or aiding in the diagnosis of NASH in a subject, comprising: analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for NASH in the sample, where the one or more biomarkers are selected from Tables 7, 8, 10 and/or 11 and comparing the level(s) of the one or more biomarkers in the sample to NASH-positive and/or NASH-negative reference levels of the one or more biomarkers in order to diagnose whether the subject has NASH.
In a further aspect, the disclosure provides methods of diagnosing or aiding in the diagnosis of NAFLD in a subject, comprising: analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for NAFLD in the sample, where the one or more biomarkers are selected from Tables 2, 3, 4, 5, 7, 8, 10, and/or 11; and comparing the level(s) of the one or more biomarkers in the sample to NAFLD-positive and/or NAFLD-negative reference levels of the one or more biomarkers in order to diagnose whether the subject has NAFLD. In a feature of this aspect, the one or more biomarkers may be selected from the group consisting of 5-methylthioadenosine (5-MTA), glycine, serine, leucine, 4-methyl-2-oxopentanoate, 3-methyl-2-oxovalerate, valine, 3-methyl-2-oxobutyrate, 2-hydroxybutyrate, prolylproline, lanosterol, tauro-beta-muricholate, and deoxycholate.
In another aspect, the disclosure provides methods of distinguishing NASH from NAFLD in a subject, comprising analyzing a biological sample from said subject to determine the level(s) of the one or more biomarkers for NASH and/or NAFLD in the sample where the one or more biomarkers are selected from Tables 2, 3, 4, 5, 7, 8, 10, and/or 11 and comparing the level(s) of the one or more biomarkers in the sample to reference levels of the one or more biomarkers in order to distinguish NASH from NAFLD.
In another aspect, the disclosure provides methods of diagnosing or aiding in the diagnosis of liver fibrosis in a subject, comprising analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for fibrosis in the sample, where the one or more biomarkers are selected from Tables 12, 10, 11, 14, 16, and/or 18 and comparing the level(s) of the one or more biomarkers in the sample to fibrosis-positive and/or fibrosis-negative reference levels of the one or more biomarkers in order to diagnose whether the subject has fibrosis.
In another aspect, the disclosure provides methods of determining the stage of fibrosis of a subject having liver fibrosis, comprising analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 12, 10, 11, 14, 16 and/or 18, and comparing the level(s) of the one or more biomarkers in the sample to the liver fibrosis stage reference levels of the one or more biomarkers in order to determine the stage of the liver fibrosis.
In another embodiment, the disclosure provides methods of monitoring the progression/regression of liver disease in a subject, comprising analyzing a first biological sample from said subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 12, 2, 3, 4, 5, 7, 8, 10, 11, 14, 16, and/or 18 and the first sample is obtained from said subject at a first time point; analyzing a second biological sample from said subject to determine the level(s) of the one or more biomarkers, wherein the second sample is obtained from said subject at a second time point; and comparing the level(s) of one or more biomarkers in the first sample to the level(s) of the one or more biomarkers in the second sample in order to monitor the progression/regression of liver disease in the subject.
In a further embodiment, the disclosure provides methods of distinguishing less severe liver disease from more severe liver disease in a subject having liver disease, comprising analyzing a biological sample from said subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 12, 2, 3, 4, 5, 7, 8, 10, 11, 14, 16, and/or 18, and comparing the level(s) of the one or more biomarkers in the sample to less severe liver disease and/or more severe liver disease reference levels of the one or more biomarkers in order to determine the severity of the subject's liver disease.
In yet another aspect of the invention, a method of diagnosing or aiding in diagnosing whether a subject has liver disease comprises analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 19 and 20, and comparing the level(s) of the one or more biomarkers in the sample to liver disease-positive and/or liver disease-negative reference levels of the one or more biomarkers in order to diagnose whether the subject has liver disease.
In a feature of this aspect, the liver disease may be NASH and the one or more biomarkers may be selected from Table 19. In another feature of this aspect, the liver disease may be fibrosis and the one or more biomarkers may be selected from Table 20. In further features, the diagnosis may comprise distinguishing NASH from NAFLD or distinguishing NASH from fibrosis.
In a further aspect of the invention, a method of determining the fibrosis stage of a subject having liver fibrosis comprises analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Table 20, and comparing the level(s) of the one or more biomarkers in the sample to high stage liver fibrosis and/or low stage liver fibrosis reference levels of the one or more biomarkers in order to determine the stage of the liver fibrosis.
In an additional aspect of the invention, a method of monitoring progression/regression of liver disease in a subject comprises analyzing a first biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 19 and/or 20 and the first sample is obtained from the subject at a first time point; analyzing a second biological sample from a subject to determine the level(s) of the one or more biomarkers, wherein the second sample is obtained from the subject at a second time point; and comparing the level(s) of one or more biomarkers in the first sample to the level(s) of the one or more biomarkers in the second sample in order to monitor the progression/regression of liver disease in the subject.
In another aspect of the invention, a method of distinguishing less severe liver disease from more severe liver disease in a subject having liver disease comprises analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Tables 19 and/or 20, and comparing the level(s) of the one or more biomarkers in the sample to less severe liver disease and/or more severe liver disease reference levels of the one or more biomarkers in order to determine the severity of the subject's liver disease.
In yet another aspect, a method of aiding in distinguishing NASH from NAFLD in a subject having been diagnosed with a liver disease comprises analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Table 19, and comparing the level(s) of the one or more biomarkers in the sample to liver disease reference levels of the one or more biomarkers in order to distinguish between NASH and NAFLD in the subject.
In a further aspect, a method of aiding in distinguishing NASH from fibrosis in a subject having been diagnosed with a liver disease comprises analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease in the sample, wherein the one or more biomarkers are selected from Table 19 and/or 20, and comparing the level(s) of the one or more biomarkers in the sample to liver disease reference levels of the one or more biomarkers in order to distinguish between NASH and fibrosis in the subject.
In yet another embodiment, the disclosure provides methods of determining a Liver Disease Score.
Biomarkers of NASH, NAFLD, and fibrosis, methods for diagnosis (or aiding in the diagnosis) of NAFLD, NASH and/or fibrosis, methods of distinguishing between NAFLD and NASH, methods of classifying the stage of fibrosis, methods of determining the severity of liver disease or fibrosis, methods of monitoring progression/regression of NASH, NAFLD, and/or fibrosis, as well as other methods based on biomarkers of liver disease are described herein.
Prior to describing this invention in further detail, however, the following terms will first be defined.
“Biomarker” means a compound, preferably a metabolite, that is differentially present (i.e., increased or decreased) in a biological sample from a subject or a group of subjects having a first phenotype (e.g., having a disease) as compared to a biological sample from a subject or group of subjects having a second phenotype (e.g., not having the disease). A biomarker may be differentially present at any level, but is generally present at a level that is increased by at least 5%, by at least 10%, by at least 15%, by at least 20%, by at least 25%, by at least 30%, by at least 35%, by at least 40%, by at least 45%, by at least 50%, by at least 55%, by at least 60%, by at least 65%, by at least 70%, by at least 75%, by at least 80%, by at least 85%, by at least 90%, by at least 95%, by at least 100%, by at least 110%, by at least 120%, by at least 130%, by at least 140%, by at least 150%, or more; or is generally present at a level that is decreased by at least 5%, by at least 10%, by at least 15%, by at least 20%, by at least 25%, by at least 30%, by at least 35%, by at least 40%, by at least 45%, by at least 50%, by at least 55%, by at least 60%, by at least 65%, by at least 70%, by at least 75%, by at least 80%, by at least 85%, by at least 90%, by at least 95%, or by 100% (i.e., absent). A biomarker is preferably differentially present at a level that is statistically significant (i.e., a p-value less than 0.05 and/or a q-value of less than 0.10 as determined using either Welch's T-test or Wilcoxon's rank-sum Test).
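By way of illustration only, the two criteria in the definition above (a percentage change between groups and a significance threshold) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used to generate the Tables: the function names are hypothetical, the rank-sum p-value uses a normal approximation without tie correction, and no q-value (false discovery rate) is computed.

```python
from statistics import NormalDist, mean


def percent_change(case_levels, control_levels):
    """Percent difference of the case-group mean relative to the control-group mean."""
    control_mean = mean(control_levels)
    return 100.0 * (mean(case_levels) - control_mean) / control_mean


def rank_sum_p_value(case_levels, control_levels):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    pooled = sorted(
        [(v, 0) for v in case_levels] + [(v, 1) for v in control_levels]
    )
    # Assign 1-based ranks, giving tied values their average rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(case_levels), len(control_levels)
    # Rank sum of the case group, compared with its expectation under the null.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_differentially_present(case_levels, control_levels, alpha=0.05):
    """True when the p-value falls below the threshold named in the definition."""
    return rank_sum_p_value(case_levels, control_levels) < alpha
```

With made-up levels such as `[10, 11, 12, 13, 14, 15]` for one group and `[1, 2, 3, 4, 5, 6]` for the other, `is_differentially_present` returns `True`; for two identical groups the p-value is 1.0 and it returns `False`.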
The “level” of one or more biomarkers means the absolute or relative amount or concentration of the biomarker in the sample.
“Sample” or “biological sample” means biological material isolated from a subject. The biological sample may contain any biological material suitable for detecting the desired biomarkers, and may comprise cellular and/or non-cellular material from the subject. The sample can be isolated from any suitable biological fluid such as, for example, blood, blood plasma, blood serum, urine, or cerebral spinal fluid (CSF).
“Subject” means any animal, but is preferably a mammal, such as, for example, a human, monkey, non-human primate, mouse, or rabbit.
A “reference level” of a biomarker means a level of the biomarker that is indicative of a particular disease state, phenotype, or predisposition to developing a particular disease state or phenotype, or lack thereof, as well as combinations of disease states, phenotypes, or predisposition to developing a particular disease state or phenotype, or lack thereof. A “positive” reference level of a biomarker means a level that is indicative of a particular disease state or phenotype. A “negative” reference level of a biomarker means a level that is indicative of a lack of a particular disease state or phenotype. For example, a “NASH-positive reference level” of a biomarker means a level of a biomarker that is indicative of a positive diagnosis of NASH in a subject, and a “NASH-negative reference level” of a biomarker means a level of a biomarker that is indicative of a negative diagnosis of NASH in a subject. A “reference level” of a biomarker may be an absolute or relative amount or concentration of the biomarker, a presence or absence of the biomarker, a range of amount or concentration of the biomarker, a minimum and/or maximum amount or concentration of the biomarker, a mean amount or concentration of the biomarker, and/or a median amount or concentration of the biomarker; and, in addition, “reference levels” of combinations of biomarkers may also be ratios of absolute or relative amounts or concentrations of two or more biomarkers with respect to each other. 
Appropriate positive and negative reference levels of biomarkers for a particular disease state, phenotype, or lack thereof may be determined by measuring levels of desired biomarkers in one or more appropriate subjects, and such reference levels may be tailored to specific populations of subjects (e.g., a reference level may be age-matched or gender-matched so that comparisons may be made between biomarker levels in samples from subjects of a certain age or gender and reference levels for a particular disease state, phenotype, or lack thereof in a certain age or gender group). Such reference levels may also be tailored to specific techniques that are used to measure levels of biomarkers in biological samples (e.g., LC-MS, GC-MS, etc.), where the levels of biomarkers may differ based on the specific technique that is used.
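As a non-limiting illustration, the comparison of a measured level against positive and negative reference levels tailored to a population and a measurement technique might be sketched as follows. Everything here is an assumption made for the example: the reference levels are modeled as simple ranges, the numeric values are invented, and the names `ReferenceRange`, `REFERENCES`, and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    """Hypothetical reference level expressed as a closed range of biomarker levels."""
    low: float
    high: float

    def contains(self, level: float) -> bool:
        return self.low <= level <= self.high


def compare_to_references(level, positive, negative):
    """Report which reference range, if either, the measured level falls in."""
    in_positive = positive.contains(level)
    in_negative = negative.contains(level)
    if in_positive and not in_negative:
        return "positive"
    if in_negative and not in_positive:
        return "negative"
    return "indeterminate"


# Invented (positive, negative) ranges keyed by (gender, measurement technique).
REFERENCES = {
    ("female", "LC-MS"): (ReferenceRange(5.0, 9.0), ReferenceRange(1.0, 4.0)),
    ("male", "LC-MS"): (ReferenceRange(6.0, 10.0), ReferenceRange(1.5, 5.0)),
}


def classify(level, gender, technique):
    """Look up the matched reference ranges, then compare the level to them."""
    positive, negative = REFERENCES[(gender, technique)]
    return compare_to_references(level, positive, negative)
```

A level that falls in neither range, or in both, is reported as indeterminate rather than forced into one class, consistent with the results being used alongside other clinical information.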
“Non-biomarker compound” means a compound that is not differentially present in a biological sample from a subject or a group of subjects having a first phenotype (e.g., having a first disease) as compared to a biological sample from a subject or group of subjects having a second phenotype (e.g., not having the first disease). Such non-biomarker compounds may, however, be biomarkers in a biological sample from a subject or a group of subjects having a third phenotype (e.g., having a second disease) as compared to the first phenotype (e.g., having the first disease) or the second phenotype (e.g., not having the first disease).
“Metabolite”, or “small molecule”, means organic and inorganic molecules which are present in a cell. The term does not include large macromolecules, such as large proteins (e.g., proteins with molecular weights over 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, or 10,000), large nucleic acids (e.g., nucleic acids with molecular weights of over 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, or 10,000), or large polysaccharides (e.g., polysaccharides with molecular weights of over 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, or 10,000). The small molecules of the cell are generally found free in solution in the cytoplasm or in other organelles, such as the mitochondria, where they form a pool of intermediates which can be metabolized further or used to generate large molecules, called macromolecules. The term “small molecules” includes signaling molecules and intermediates in the chemical reactions that transform energy derived from food into usable forms. Examples of small molecules include sugars, fatty acids, amino acids, nucleotides, intermediates formed during cellular processes, and other small molecules found within the cell.
“Metabolic profile”, or “small molecule profile”, means a complete or partial inventory of small molecules within a targeted cell, tissue, organ, organism, or fraction thereof (e.g., cellular compartment). The inventory may include the quantity and/or type of small molecules present. The “small molecule profile” may be determined using a single technique or multiple different techniques.
“Metabolome” means all of the small molecules present in a given organism.
“Steatosis” refers to fatty liver disease without the presence of inflammation. The condition can occur with the use of alcohol or in the absence of alcohol use.
“Non-alcoholic fatty liver disease” (NAFLD) refers to fatty liver disease (steatosis) that occurs in subjects even in the absence of consumption of alcohol in amounts considered harmful to the liver.
“Steatohepatitis” refers to fatty liver disease that is associated with inflammation. Steatohepatitis can progress to cirrhosis and can be associated with hepatocellular carcinoma. The condition can occur with the use of alcohol or in the absence of alcohol use.
“Non-alcoholic steatohepatitis” (NASH) refers to steatohepatitis that occurs in subjects even in the absence of consumption of alcohol in amounts considered harmful to the liver. NASH can progress to cirrhosis and can be associated with hepatocellular carcinoma.
“Fibrosis” refers to the accumulation of extracellular matrix proteins in the liver as a result of ongoing inflammation. Fibrosis is classified histologically in a liver biopsy sample into five stages, 0-4. Stage 0 means no fibrosis, Stage 1 refers to mild fibrosis, Stage 2 refers to moderate fibrosis, Stage 3 refers to severe fibrosis, and Stage 4 refers to cirrhosis.
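For illustration, the five histological stages defined above can be represented as an ordered enumeration so that stages can be compared numerically. The class and function names are hypothetical and chosen only for this sketch.

```python
from enum import IntEnum


class FibrosisStage(IntEnum):
    """Histological fibrosis stages 0-4 as defined in the text above."""
    NONE = 0       # no fibrosis
    MILD = 1       # mild fibrosis
    MODERATE = 2   # moderate fibrosis
    SEVERE = 3     # severe fibrosis
    CIRRHOSIS = 4  # cirrhosis


def describe_stage(stage_number):
    """Return a short label such as 'Stage 3: severe' for a stage number 0-4."""
    stage = FibrosisStage(stage_number)  # raises ValueError outside 0-4
    return f"Stage {stage.value}: {stage.name.lower()}"
```

Because `IntEnum` members order like integers, a comparison such as `FibrosisStage.MILD < FibrosisStage.SEVERE` evaluates to `True`.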
“Liver disease”, as used herein, refers to NAFLD, NASH, fibrosis, and cirrhosis.
“NAFLD Activity Score” or “NAS” refers to a histological scoring system for NAFLD. The score is comprised of evaluation of changes in histological features such as steatosis, lobular inflammation, absence of lipogranulomas, and hepatocyte ballooning. Fibrosis is assessed independently of the NAS.
“Severity” of liver disease refers to the degree of liver disease on the spectrum of non-alcoholic liver disease activity, ranging from low severity disease associated with fat accumulation in the liver (NAFLD), with an increased severity associated with low levels of inflammation and/or fibrosis in addition to fat accumulation (i.e., borderline NASH), and a further increase in severity associated with higher levels of inflammation and fibrosis (i.e., NASH). Severity may be based on fibrosis stages or may also be assessed using the NAS.
With respect to the nomenclature for select fatty acid lipid metabolites used herein, fatty acids labeled with a prefix “CE”, “DAG”, “FFA”, “PC”, “PE”, “LPC”, “LPE”, “O-PC”, “P-PE”, “PI”, “SM”, “TAG”, “CER”, “DCER”, “LCER”, or “TL” refer to the indicated fatty acids present within cholesteryl esters, diacylglycerols (diglycerides), free fatty acids, phosphatidylcholines, phosphatidylethanolamines, lysophosphatidylcholines, lysophosphatidylethanolamines, 1-ether linked phosphatidylcholines, 1-vinyl ether linked phosphatidylethanolamines (plasmalogens), phosphoinositols, sphingomyelins, triacylglycerols (triglycerides), ceramides, dihydroceramides, lactoceramides, and total lipids, respectively, in a sample. In some embodiments, the indicated fatty acid components are quantified as a proportion of the total fatty acids within the lipid class indicated by the prefix. For example, the abbreviation “TL16:0” indicates the percentage of total lipid in the sample comprised of palmitic acid (16:0). The term “TLTL” or “Total Total Lipid” indicates the absolute amount (e.g., in nMoles per gram) of total lipid present in the sample. References to fatty acids without a prefix or other indication of a particular lipid class generally indicate fatty acids present within total lipids in a sample. The term “LC” following a prefix “CE”, “DAG”, “FFA”, “PC”, “PE”, “LPC”, “LPE”, “O-PC”, “P-PE”, “PI”, “SM”, “TAG”, “CER”, “DCER”, or “LCER” refers to the amount of the total lipid class indicated by the prefix in the sample (e.g., the concentration of lipids of that class expressed as nMoles per gram of serum or plasma).
For example, with respect to a measurement taken from plasma or serum, the abbreviation “PC 18:2n6” indicates the percentage of plasma or serum phosphatidylcholine comprised of linoleic acid (18:2n6), and the term “TGLC” indicates the absolute amount (e.g., in nMoles per gram) of triglyceride present in plasma or serum. For triacylglycerols, the metabolite name refers to the parent mass of the compound (e.g., TAG53:6-FA18:2 indicates that the metabolite is a triacylglycerol with attached fatty acids having 53 total carbons and 6 total double bonds; FA18:2 refers to the fragment identified on the mass spectrometer (i.e., one of the three fatty acids of the TAG in this example is 18:2)). “MUFA”, “PUFA”, and “SFA” refer to monounsaturated fatty acid, polyunsaturated fatty acid, and saturated fatty acid, respectively.
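By way of example only, the prefix nomenclature described above can be decoded mechanically. The following sketch assumes labels of the form prefix, optional space, carbons:double bonds, and an optional n-family suffix (e.g., “PC 18:2n6” or “TL16:0”); it does not handle the triacylglycerol parent-mass names or the “LC” class totals, and the function name `parse_fatty_acid_label` is hypothetical.

```python
import re

# Prefix-to-lipid-class mapping taken from the nomenclature paragraph above.
LIPID_CLASSES = {
    "CE": "cholesteryl esters",
    "DAG": "diacylglycerols",
    "FFA": "free fatty acids",
    "PC": "phosphatidylcholines",
    "PE": "phosphatidylethanolamines",
    "LPC": "lysophosphatidylcholines",
    "LPE": "lysophosphatidylethanolamines",
    "O-PC": "1-ether linked phosphatidylcholines",
    "P-PE": "1-vinyl ether linked phosphatidylethanolamines",
    "PI": "phosphoinositols",
    "SM": "sphingomyelins",
    "TAG": "triacylglycerols",
    "CER": "ceramides",
    "DCER": "dihydroceramides",
    "LCER": "lactoceramides",
    "TL": "total lipids",
}

_LABEL = re.compile(
    r"^(?P<prefix>[A-Z]+(?:-[A-Z]+)?)\s*"
    r"(?P<carbons>\d+):(?P<double_bonds>\d+)"
    r"(?:n(?P<family>\d+))?$"
)


def parse_fatty_acid_label(label):
    """Split a label such as 'PC 18:2n6' into lipid class and fatty acid parts."""
    match = _LABEL.match(label.strip())
    if match is None or match.group("prefix") not in LIPID_CLASSES:
        raise ValueError(f"unrecognized fatty acid label: {label!r}")
    family = match.group("family")
    return {
        "lipid_class": LIPID_CLASSES[match.group("prefix")],
        "carbons": int(match.group("carbons")),
        "double_bonds": int(match.group("double_bonds")),
        "n_family": int(family) if family is not None else None,
    }
```

For instance, `parse_fatty_acid_label("PC 18:2n6")` yields the phosphatidylcholine class with 18 carbons, 2 double bonds, and n-family 6, matching the linoleic acid example in the text.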
The NAFLD, NASH, and fibrosis biomarkers described herein were discovered using metabolomic profiling techniques. Such metabolomic profiling techniques are described in more detail in the Examples set forth below as well as in U.S. Pat. Nos. 7,005,255; 7,329,489; 7,550,258; 7,550,260; 7,553,616; 7,635,556; 7,682,783; 7,682,784; 7,910,301 and 7,947,453, the entire contents of which are hereby incorporated herein by reference.
Generally, metabolic profiles were determined for biological samples from human subjects diagnosed with NAFLD, NASH, or fibrosis as well as from one or more other groups of human subjects (e.g., control subjects not diagnosed with NAFLD, NASH, or fibrosis). The metabolic profile for biological samples from a subject having NAFLD, NASH, or fibrosis was compared to the metabolic profile for biological samples from the one or more other groups of subjects. Those molecules differentially present, including those molecules differentially present at a level that is statistically significant, in the metabolic profile of samples from subjects with NAFLD, NASH, or fibrosis as compared to another group (e.g., control subjects not diagnosed with NAFLD, NASH, or fibrosis) were identified as biomarkers to distinguish those groups. In addition, those molecules differentially present, including those molecules differentially present at a level that is statistically significant, in the metabolic profile of samples from subjects with NAFLD, NASH, or fibrosis as compared to another group were also identified as biomarkers to distinguish those groups.
The biomarkers are discussed in more detail herein. The biomarkers that were discovered correspond with the following group(s):
Biomarkers for distinguishing subjects having NAFLD vs. subjects not diagnosed with NAFLD (see Tables 2, 3, 4, 5);
Biomarkers for distinguishing subjects having NASH vs. subjects having NAFLD (see Tables 7, 8);
Biomarkers for distinguishing subjects having fibrosis vs. control subjects not having fibrosis (see Tables 10, 11, 12, 14, 16, 18, 20);
Biomarkers for distinguishing stages of fibrosis (see Tables 10, 11, 12, 14, 16, 18); and
Biomarkers for distinguishing subjects having NASH vs. control subjects not having NASH (see Table 20).
The identification of biomarkers for NAFLD, NASH, and fibrosis allows for the diagnosis of (or aiding in the diagnosis of) liver disease in subjects presenting with one or more symptoms consistent with the presence of liver disease and includes the initial diagnosis of liver disease in a subject not previously identified as having liver disease and diagnosis of recurrence of liver disease in a subject previously treated for liver disease. A method of diagnosing (or aiding in diagnosing) whether a subject has liver disease comprises (1) analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers of liver disease in the sample and (2) comparing the level(s) of the one or more biomarkers in the sample to liver disease-positive and/or liver disease-negative reference levels of the one or more biomarkers in order to diagnose (or aid in the diagnosis of) whether the subject has liver disease. The one or more biomarkers that are used are selected from Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 and combinations thereof. When such a method is used to aid in the diagnosis of liver disease, the results of the method may be used along with other methods (or the results thereof) useful in the clinical determination of whether a subject has liver disease.
Any suitable method may be used to analyze the biological sample in order to determine the level(s) of the one or more biomarkers in the sample. Suitable methods include chromatography (e.g., HPLC, gas chromatography, liquid chromatography), mass spectrometry (e.g., MS, MS-MS), enzyme-linked immunosorbent assay (ELISA), antibody linkage, other immunochemical techniques, and combinations thereof. Further, the level(s) of the one or more biomarkers may be measured indirectly, for example, by using an assay that measures the level of a compound (or compounds) that correlates with the level of the biomarker(s) that are desired to be measured.
The levels of one or more of the biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 including a combination of all of the biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 and combinations thereof or any fraction thereof, may be determined and used in methods of aiding in diagnosing whether a subject has liver disease. Determining levels of combinations of the biomarkers may allow greater sensitivity and specificity in diagnosing liver disease and aiding in the diagnosis of liver disease. For example, ratios of the levels of certain biomarkers (and non-biomarker compounds) in biological samples may allow greater sensitivity and specificity in diagnosing liver disease and aiding in the diagnosis of liver disease.
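As an illustrative sketch only, a ratio of two biomarker levels can be computed per sample and its sensitivity and specificity evaluated at a chosen cutoff. The marker names, levels, disease labels, and cutoff below are invented for the example and do not correspond to any biomarker in the Tables; the function names are likewise hypothetical.

```python
def biomarker_ratio(levels, numerator, denominator):
    """Ratio of two measured levels taken from one sample's level mapping."""
    if levels[denominator] == 0:
        raise ValueError("denominator level is zero")
    return levels[numerator] / levels[denominator]


def sensitivity_specificity(scores, has_disease, cutoff):
    """Treat score >= cutoff as a positive call; return (sensitivity, specificity).

    Assumes at least one subject with disease and one without.
    """
    pairs = list(zip(scores, has_disease))
    true_pos = sum(1 for s, d in pairs if d and s >= cutoff)
    false_neg = sum(1 for s, d in pairs if d and s < cutoff)
    true_neg = sum(1 for s, d in pairs if not d and s < cutoff)
    false_pos = sum(1 for s, d in pairs if not d and s >= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented levels for four subjects; the first two are labeled as having disease.
samples = [
    {"marker_a": 8.0, "marker_b": 2.0},
    {"marker_a": 6.0, "marker_b": 3.0},
    {"marker_a": 2.0, "marker_b": 4.0},
    {"marker_a": 3.0, "marker_b": 3.0},
]
has_disease = [True, True, False, False]
ratios = [biomarker_ratio(s, "marker_a", "marker_b") for s in samples]
```

In this toy data the ratios are 4.0, 2.0, 0.5, and 1.0, so a cutoff of 1.5 separates the two groups completely, whereas a cutoff of 3.0 misses one of the two disease-labeled subjects.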
In one example, the levels of one or more biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, and/or 11, and any combination thereof including a combination of all of the biomarkers may be determined in the methods of diagnosing or aiding in diagnosing whether a subject has NAFLD. For example, one or more of the following biomarkers may be used alone or in combination to diagnose or aid in diagnosing NAFLD: epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, 3-hydroxyisobutyrate, cyclo (L-phe-L-pro), 2-aminoadipate, 4-methyl-2-oxopentanoate, 2-hydroxybutyrate, prolylproline, and tauro-beta-muricholate. In another example, one or more additional biomarkers may optionally be selected from the group consisting of: isoleucine, glutamate, alpha-ketoglutarate, TL16:1n7 (16:1n7, palmitoleic acid), TL16:0 (16:0, palmitic acid), taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glycine, serine, leucine, deoxycholate, 3-methyl-2-oxovalerate, valine, 3-methyl-2-oxobutyrate, and lanosterol and may be used in combination with the one or more biomarkers.
In another example, the levels of one or more biomarkers in Tables 7, 8, 10, 11 and/or 20 and any combination thereof including a combination of all of the biomarkers may be determined in the methods of diagnosing or aiding in diagnosing whether a subject has NASH. For example, one or more of the following biomarkers may be used alone or in combination to diagnose or aid in diagnosing NASH: epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 5-methylthioadenosine (MTA), valylglycine, cyclo (L-phe-L-pro), fucose, taurine, gamma-glutamylhistidine, 3-hydroxyisobutyrate, CE(24:1), PE(P-16:0/14:1), LPC(14:0), SM(18:1), PE(15:0/22:4), FFA(20:0), LPC(12:0), LCER(26:0), LPE(14:1), PI(16:0/16:0), LPE(20:4), DCER(20:0), LCER(14:0), PE(15:0/18:4), PI(18:0/16:1), PE(16:0/22:2), PE(P-14:1/18:1), PC(16:0/14:1), PE(18:0/17:0), PE(P-16:0/18:0), PE(P-18:0/16:1), PE(O-18:0/18:0), CER(26:0), PE(16:0/16:0), LPE(18:4), and PE(O-18:0/14:1). One or more additional biomarkers may optionally be selected from the group consisting of: TL16:1n7 (16:1n7, palmitoleic acid), TL16:0 (16:0, palmitic acid), taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, LPE(18:2), LPE(20:3), PE(14:0/14:1), PC(14:0/22:4), PC(15:0/16:1), PC(20:0/14:1), PC(17:0/22:6), PE(15:0/18:3), PE(17:0/20:2), PE(18:2/20:2), PE(18:2/20:3), PC(18:1/22:6), PC(18:1/22:5), PC(14:0/18:4), SM(16:0), CE(24:0), PC(14:0/20:2), PC(14:0/20:3), PC(18:1/18:4), SM(18:0), PC(14:0/18:2), and PC(14:0/16:1).
In another example, the levels of one or more biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, and/or 20 may be determined in the methods of distinguishing NASH from NAFLD in a subject. For example, one or more of the following biomarkers may be used alone or in combination to distinguish NASH from NAFLD: epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, 3-hydroxyisobutyrate, cyclo (L-phe-L-pro), 2-aminoadipate, 4-methyl-2-oxopentanoate, 2-hydroxybutyrate, prolylproline, tauro-beta-muricholate, CE(24:1), PE(P-16:0/14:1), LPC(14:0), SM(18:1), PE(15:0/22:4), FFA(20:0), LPC(12:0), LCER(26:0), LPE(14:1), PI(16:0/16:0), LPE(20:4), DCER(20:0), LCER(14:0), PE(15:0/18:4), PI(18:0/16:1), PE(16:0/22:2), PE(P-14:1/18:1), PC(16:0/14:1), PE(18:0/17:0), PE(P-16:0/18:0), PE(P-18:0/16:1), PE(O-18:0/18:0), CER(26:0), PE(16:0/16:0), LPE(18:4), and PE(O-18:0/14:1). One or more additional biomarkers may optionally be selected from the group consisting of: isoleucine, glutamate, alpha-ketoglutarate, TL16:1n7 (16:1n7, palmitoleic acid), TL16:0 (16:0, palmitic acid), taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glycine, serine, leucine, deoxycholate, 3-methyl-2-oxovalerate, valine, 3-methyl-2-oxobutyrate, lanosterol, LPE(18:2), LPE(20:3), PE(14:0/14:1), PC(14:0/22:4), PC(15:0/16:1), PC(20:0/14:1), PC(17:0/22:6), PE(15:0/18:3), PE(17:0/20:2), PE(18:2/20:2), PE(18:2/20:3), PC(18:1/22:6), PC(18:1/22:5), PC(14:0/18:4), SM(16:0), CE(24:0), PC(14:0/20:2), PC(14:0/20:3), PC(18:1/18:4), SM(18:0), PC(14:0/18:2), and PC(14:0/16:1).
In another example, the levels of one or more biomarkers in Tables 10, 11, 12, 14, 16, 18, and/or 20 may be determined in the methods of diagnosing or aiding in diagnosing whether a subject has fibrosis. For example, one or more of the following biomarkers may be used alone or in combination to diagnose or aid in diagnosing whether a subject has fibrosis: glutarate (pentanedioate), epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 2-aminoheptanoate, 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, cyclo(L-phe-L-pro), CER(14:0), DCER(14:0), LPE(12:0), DCER(18:0), PE(18:0/22:2), PE(P-18:0/18:3), LPC(17:0), LPC(22:0), CER(18:1), LCER(22:0), PE(16:0/20:1), CE(15:0), PE(16:0/22:4), PE(O-18:0/20:2), LPC(20:0), LPE(24:0), PC(12:0/14:1), PE(17:0/22:2), SM(18:1), CER(16:0), LCER(24:0), PE(O-18:0/20:3), CE(17:0), PE(P-16:0/18:3), PE(P-16:0/16:1), LPE(14:1), FFA(24:0), PE(O-16:0/18:4), FFA(15:0), SM(14:0), LPC(20:2), PE(P-14:1/18:1), SM(24:1), PI(18:0/20:2), LPC(15:0), PE(O-18:0/18:1), PI(18:1/20:3), PE(16:0/16:1), DAG(18:1/20:3), X-19561, X-18889, X-21471, X-11871, and X-12850.
One or more additional biomarkers may optionally be selected from the group consisting of: taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, TL16:1n7 (16:1n7, palmitoleate), TL16:0 (16:0, palmitic acid), isoleucine, alpha-ketoglutarate, PE(18:2/20:2), PE(14:0/16:1), PE(14:0/14:1), PE(16:0/18:1), PE(18:1/18:1), PE(17:0/20:4), PE(14:0/20:5), PE(16:0/22:5), PE(18:2/20:3), PE(16:0/20:4), PE(14:0/18:2), PE(18:1/18:4), PE(15:0/22:6), PE(16:0/14:0), LPC(18:3), TAG55:7-FA20:3, TAG53:6-FA18:2, TAG55:7-FA20:4, TAG53:5-FA18:2, TAG53:7-FA18:3, TAG55:8-FA20:4, TAG53:5-FA18:1, TAG55:6-FA20:3, TAG57:9-FA22:6, TAG53:6-FA18:3, TAG55:6-FA18:1, TAG53:6-FA18:1, TAG53:4-FA18:1, TAG53:4-FA18:0, TAG51:4-FA16:0, TAG53:3-FA18:0, TAG51:3-FA16:0, TAG51:4-FA18:1, TAG56:5-FA20:4, TAG56:5-FA18:0, TAG56:4-FA20:4, PE(14:0/18:1), PC(14:0/18:4), PC(18:2/22:5), PC(20:0/22:5), SM(18:0), CE(18:0), PC(18:2/18:4), and PC(14:0/20:2).
In another example, the levels of one or more biomarkers in Tables 10, 11, 12, 14, 16, and/or 18 may be determined in the methods of determining the stage of fibrosis in a subject. For example, one or more of the following biomarkers may be used alone or in combination to diagnose or aid in diagnosing whether a subject has fibrosis: glutarate (pentanedioate), epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 2-aminoheptanoate, 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, and cyclo(L-phe-L-pro). One or more additional biomarkers may optionally be selected from the group consisting of: taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, TL16:1n7 (16:1n7, palmitoleate), TL16:0 (16:0, palmitic acid), isoleucine, and alpha-ketoglutarate.
After the level(s) of the one or more biomarkers in the sample are determined, the level(s) are compared to liver disease-positive and/or liver disease-negative reference levels to diagnose or aid in diagnosing whether the subject has liver disease. Levels of the one or more biomarkers in a sample matching the liver disease-positive reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of a diagnosis of liver disease in the subject. Levels of the one or more biomarkers in a sample matching the liver disease-negative reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of a diagnosis of no liver disease in the subject. In addition, levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to liver disease-negative reference levels are indicative of a diagnosis of liver disease in the subject. Levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to liver disease-positive reference levels are indicative of a diagnosis of no liver disease in the subject.
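By way of non-limiting illustration, the comparison of a measured level to liver disease-positive and liver disease-negative reference levels may be sketched as a range check. The class name, function name, and numeric ranges below are hypothetical and are not taken from the Tables; the sketch assumes that each reference level is expressed as a minimum and maximum for a single biomarker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    """Hypothetical reference range (minimum, maximum) for one biomarker."""
    low: float
    high: float

    def contains(self, level: float) -> bool:
        # A level "matches" the reference when it falls within the range.
        return self.low <= level <= self.high


def classify_level(level: float,
                   positive: ReferenceRange,
                   negative: ReferenceRange) -> str:
    """Compare one measured level to disease-positive and disease-negative ranges."""
    in_positive = positive.contains(level)
    in_negative = negative.contains(level)
    if in_positive and not in_negative:
        return "consistent with liver disease"
    if in_negative and not in_positive:
        return "consistent with no liver disease"
    # Matches both ranges or neither: the single comparison does not decide.
    return "indeterminate"


# Made-up ranges in arbitrary units, for illustration only.
positive_ref = ReferenceRange(low=2.0, high=5.0)
negative_ref = ReferenceRange(low=0.5, high=1.5)
print(classify_level(3.0, positive_ref, negative_ref))
```

A level falling in neither range is reported as indeterminate in this sketch; in practice such a result may be resolved using additional biomarkers or other clinical methods.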
The level(s) of the one or more biomarkers may be compared to liver disease-positive and/or liver disease-negative reference levels using various techniques, including a simple comparison (e.g., a manual comparison) of the level(s) of the one or more biomarkers in the biological sample to liver disease-positive and/or liver disease-negative reference levels. The level(s) of the one or more biomarkers in the biological sample may also be compared to liver disease-positive and/or liver disease-negative reference levels using one or more statistical analyses (e.g., t-test, Welch's T-test, Wilcoxon's rank sum test, Random Forest, T-score, Z-score) or using a mathematical model (e.g., algorithm, statistical model, mixed effects model).
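By way of non-limiting illustration, two of the statistical analyses named above (Z-score and Welch's T-test) may be sketched using only the Python standard library. The function names and sample values are hypothetical. The sketch computes the Welch test statistic and its approximate degrees of freedom only; deriving a p-value would require a t-distribution, which is not shown.

```python
import math
import statistics


def z_score(level, reference_levels):
    """Standardize one measured level against a set of reference levels."""
    mean = statistics.mean(reference_levels)
    sd = statistics.stdev(reference_levels)  # sample standard deviation
    return (level - mean) / sd


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two groups with possibly unequal variances."""
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    n_a, n_b = len(group_a), len(group_b)
    se_sq = var_a / n_a + var_b / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se_sq)
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Made-up levels in arbitrary units, for illustration only.
negative_reference = [1.0, 2.0, 3.0]
positive_reference = [4.0, 5.0, 6.0]
print(z_score(3.0, negative_reference))
print(welch_t(negative_reference, positive_reference))
```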
For example, a mathematical model comprising a single algorithm or multiple algorithms may be used to determine whether a subject has liver disease. A mathematical model may also be used to distinguish between types of liver disease (e.g., NASH and NAFLD) or between fibrosis stages. An exemplary mathematical model may use the measured levels of any number of biomarkers (for example, 2, 3, 5, 7, 9, etc.) from a subject to determine, using an algorithm or a series of algorithms based on mathematical relationships between the levels of the measured biomarkers, whether a subject has liver disease, whether liver disease is progressing or regressing in a subject, whether a subject has more advanced or less advanced liver disease, etc. In one example, the mathematical model is logistic regression modeling. In another example, the mathematical model is multiple logistic regression modeling.
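By way of non-limiting illustration, a multiple logistic regression model of the kind mentioned above may be applied to measured biomarker levels as sketched below. The intercept and coefficients are made-up values chosen for illustration and are not fitted to any data; the function name is hypothetical, and the biomarker names are used only as example variable labels.

```python
import math


def logistic_probability(levels, coefficients, intercept):
    """Return the modeled probability of liver disease from biomarker levels.

    levels and coefficients are dicts keyed by biomarker name; every
    biomarker named in coefficients must be present in levels.
    """
    linear = intercept + sum(
        coefficients[name] * levels[name] for name in coefficients
    )
    # Logistic (sigmoid) link maps the linear predictor to the range (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and levels (arbitrary units), illustration only.
example_coefficients = {"taurine": 0.8, "fucose": -0.4}
example_levels = {"taurine": 1.5, "fucose": 2.0}
probability = logistic_probability(example_levels, example_coefficients, intercept=-0.2)
print(round(probability, 3))
```

A probability cutoff (for example, 0.5) could then be used to assign the subject to a class, with the cutoff chosen to balance sensitivity and specificity.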
The results of the method may be used along with other methods (or the results thereof) useful in the diagnosis of liver disease in a subject. For example, the results of the method may provide an indication of patients who warrant invasive follow-up testing (e.g., liver biopsy) to confirm the diagnosis of NAFLD, NASH, fibrosis or cirrhosis.
In one aspect, the biomarkers provided herein can be used to provide a physician with a Liver Disease Score (e.g., NASH Score, NAFLD Score, Fibrosis Score) indicating the existence and/or severity of liver disease in a subject. The Score is based upon clinically significantly changed reference level(s) for a biomarker and/or combination of biomarkers. The reference level can be derived from an algorithm. The Score can be used to place the subject in a severity range of liver disease from normal (i.e. no liver disease) to severe. The Score can be used in multiple ways: for example, disease progression, regression, or remission can be monitored by periodic determination and monitoring of the Score; response to therapeutic intervention can be determined by monitoring the Score; and drug efficacy can be evaluated using the Score.
Methods for determining a subject's liver disease Score may be performed using one or more of the liver disease biomarkers identified in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 in a biological sample. The method may comprise comparing the level(s) of the one or more liver disease biomarkers in the sample to liver disease reference levels of the one or more biomarkers in order to determine the subject's liver disease score. The method may employ any number of markers selected from those listed in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20, including 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more markers. Multiple biomarkers may be correlated with liver disease, by any method, including statistical methods such as regression analysis.
After the level(s) of the one or more biomarker(s) are determined, the level(s) may be compared to liver disease reference level(s) or reference curves of the one or more biomarker(s) to determine a rating for each of the one or more biomarker(s) in the sample. The rating(s) may be aggregated using any algorithm to create a score, for example, a liver disease score, for the subject. The algorithm may take into account any factors relating to liver disease including the number of biomarkers, the correlation of the biomarkers to liver disease, etc.
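By way of non-limiting illustration, the rating and aggregation steps may be sketched as follows. The sketch assumes that each biomarker is rated by counting how many reference cutoffs its level meets or exceeds, and that ratings are aggregated as a weighted mean; both choices, the function names, the cutoffs, and the weights are hypothetical, since any aggregation algorithm may be used.

```python
from bisect import bisect_right


def rate_biomarker(level, cutoffs):
    """Rating = number of reference cutoffs the level meets or exceeds."""
    return bisect_right(sorted(cutoffs), level)


def aggregate_score(ratings, weights):
    """Weighted mean of per-biomarker ratings (one possible aggregation)."""
    total_weight = sum(weights[name] for name in ratings)
    weighted_sum = sum(ratings[name] * weights[name] for name in ratings)
    return weighted_sum / total_weight


# Made-up cutoffs, levels, and weights in arbitrary units, illustration only.
cutoffs = {"glutamate": [1.0, 2.0], "taurine": [0.5, 1.5]}
levels = {"glutamate": 2.4, "taurine": 0.9}
weights = {"glutamate": 2.0, "taurine": 1.0}

ratings = {name: rate_biomarker(levels[name], cutoffs[name]) for name in levels}
score = aggregate_score(ratings, weights)
print(ratings, round(score, 3))
```

The weights could reflect, for example, the strength of correlation of each biomarker with liver disease.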
In an embodiment, a mathematical model or formula containing one or more biomarkers as variables is established using regression analysis, e.g., multiple linear regressions. By way of non-limiting example, the developed formulas may include the following:
A + B(Biomarker1) + C(Biomarker2) + D(Biomarker3) + E(Biomarker4) = RScore
A + B*ln(Biomarker1) + C*ln(Biomarker2) + D*ln(Biomarker3) + E*ln(Biomarker4) = lnRScore
wherein A, B, C, D, E are constant numbers; Biomarker1, Biomarker2, Biomarker3, Biomarker4 are the measured values of the analyte (Biomarker) and RScore is the measure of liver disease presence or absence or severity.
The formulas may include one or more biomarkers as variables, such as 1, 2, 3, 4, 5, 10, 15, 20 or more biomarkers.
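By way of non-limiting illustration, the linear and log-linear formulas above may be evaluated as sketched below. The function names and all constants are hypothetical; the constants are not fitted values. The log-linear form yields lnRScore, which the sketch exponentiates to return RScore, and it requires that every measured level be greater than zero.

```python
import math


def r_score_linear(intercept, coefficients, levels):
    """RScore = A + B*Biomarker1 + C*Biomarker2 + ..."""
    return intercept + sum(c * x for c, x in zip(coefficients, levels))


def r_score_log(intercept, coefficients, levels):
    """lnRScore = A + B*ln(Biomarker1) + C*ln(Biomarker2) + ...

    Returns RScore by exponentiating lnRScore. Levels must be positive.
    """
    ln_r_score = intercept + sum(
        c * math.log(x) for c, x in zip(coefficients, levels)
    )
    return math.exp(ln_r_score)


# Made-up constants (A; B, C, D, E) and measured levels, illustration only.
A = 0.5
constants = [1.2, -0.7, 0.3, 0.9]
measured = [2.0, 1.5, 4.0, 0.8]
print(r_score_linear(A, constants, measured))
print(r_score_log(A, constants, measured))
```

The same functions accept any number of biomarkers, provided that the list of constants and the list of measured levels have the same length.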
The identification of biomarkers for liver disease also allows for monitoring progression/regression of liver disease in a subject. A method of monitoring the progression/regression of liver disease in a subject comprises (1) analyzing a first biological sample from a subject to determine the level(s) of one or more biomarkers for liver disease selected from Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20, the first sample obtained from the subject at a first time point, (2) analyzing a second biological sample from a subject to determine the level(s) of the one or more biomarkers, the second sample obtained from the subject at a second time point, and (3) comparing the level(s) of one or more biomarkers in the first sample to the level(s) of the one or more biomarkers in the second sample in order to monitor the progression/regression of liver disease in the subject. The results of the method are indicative of the course of liver disease (i.e., progression or regression, if any change) in the subject.
The levels of one or more of the biomarkers of Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 including a combination of all of the biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 and combinations thereof or any fraction thereof, may be determined and used in methods of monitoring the progression/regression of liver disease in a subject. For example, the level(s) of one biomarker, two or more biomarkers, three or more biomarkers, four or more biomarkers, five or more biomarkers, six or more biomarkers, seven or more biomarkers, eight or more biomarkers, nine or more biomarkers, ten or more biomarkers, etc., including a combination of all of the biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 or any fraction thereof, may be determined and used in methods of monitoring the progression/regression of liver disease of a subject.
In one example, the levels of one or more biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, and/or 11, may be determined in the methods of monitoring the progression/regression of NAFLD in a subject. For example, one or more of the following biomarkers may be used alone or in combination to monitor the progression/regression of NAFLD: epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, 3-hydroxyisobutyrate, cyclo (L-phe-L-pro), 2-aminoadipate, 4-methyl-2-oxopentanoate, 2-hydroxybutyrate, prolylproline, and tauro-beta-muricholate. One or more additional biomarkers may optionally be selected from the group consisting of: isoleucine, glutamate, alpha-ketoglutarate, TL16:1n7 (16:1n7, palmitoleic acid), TL16:0 (16:0, palmitic acid), taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glycine, serine, leucine, deoxycholate, 3-methyl-2-oxovalerate, valine, 3-methyl-2-oxobutyrate, and lanosterol.
In another example, the levels of one or more biomarkers in Tables 7, 8, 10, 11, and/or 20 and any combination thereof including a combination of all of the biomarkers may be determined in the methods of monitoring the progression/regression of NASH in a subject. For example, one or more of the following biomarkers may be used alone or in combination to monitor the progression/regression of NASH: epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 5-methylthioadenosine (MTA), valylglycine, cyclo (L-phe-L-pro), fucose, taurine, gamma-glutamylhistidine, 3-hydroxyisobutyrate, CE(24:1), PE(P-16:0/14:1), LPC(14:0), SM(18:1), PE(15:0/22:4), FFA(20:0), LPC(12:0), LCER(26:0), LPE(14:1), PI(16:0/16:0), LPE(20:4), DCER(20:0), LCER(14:0), PE(15:0/18:4), PI(18:0/16:1), PE(16:0/22:2), PE(P-14:1/18:1), PC(16:0/14:1), PE(18:0/17:0), PE(P-16:0/18:0), PE(P-18:0/16:1), PE(O-18:0/18:0), CER(26:0), PE(16:0/16:0), LPE(18:4), and PE(O-18:0/14:1). One or more additional biomarkers may optionally be selected from the group consisting of: TL16:1n7 (16:1n7, palmitoleic acid), TL16:0 (16:0, palmitic acid), taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, LPE(18:2), LPE(20:3), PE(14:0/14:1), PC(14:0/22:4), PC(15:0/16:1), PC(20:0/14:1), PC(17:0/22:6), PE(15:0/18:3), PE(17:0/20:2), PE(18:2/20:2), PE(18:2/20:3), PC(18:1/22:6), PC(18:1/22:5), PC(14:0/18:4), SM(16:0), CE(24:0), PC(14:0/20:2), PC(14:0/20:3), PC(18:1/18:4), SM(18:0), PC(14:0/18:2), and PC(14:0/16:1).
In another example, the levels of one or more biomarkers in Tables 10, 11, 12, 14, 16, 18, and/or 20 may be determined in the methods of monitoring the progression/regression of fibrosis in a subject. For example, one or more of the following biomarkers may be used alone or in combination to monitor progression/regression of fibrosis in a subject: glutarate (pentanedioate), epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 2-aminoheptanoate, 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, cyclo(L-phe-L-pro), CER(14:0), DCER(14:0), LPE(12:0), DCER(18:0), PE(18:0/22:2), PE(P-18:0/18:3), LPC(17:0), LPC(22:0), CER(18:1), LCER(22:0), PE(16:0/20:1), CE(15:0), PE(16:0/22:4), PE(O-18:0/20:2), LPC(20:0), LPE(24:0), PC(12:0/14:1), PE(17:0/22:2), SM(18:1), CER(16:0), LCER(24:0), PE(O-18:0/20:3), CE(17:0), PE(P-16:0/18:3), PE(P-16:0/16:1), LPE(14:1), FFA(24:0), PE(O-16:0/18:4), FFA(15:0), SM(14:0), LPC(20:2), PE(P-14:1/18:1), SM(24:1), PI(18:0/20:2), LPC(15:0), PE(O-18:0/18:1), PI(18:1/20:3), PE(16:0/16:1), DAG(18:1/20:3), X-19561, X-18889, X-21471, X-11871, and X-12850.
One or more additional biomarkers may optionally be selected from the group consisting of: taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, TL16:1n7 (16:1n7, palmitoleate), TL16:0 (16:0, palmitic acid), isoleucine, alpha-ketoglutarate, PE(18:2/20:2), PE(14:0/16:1), PE(14:0/14:1), PE(16:0/18:1), PE(18:1/18:1), PE(17:0/20:4), PE(14:0/20:5), PE(16:0/22:5), PE(18:2/20:3), PE(16:0/20:4), PE(14:0/18:2), PE(18:1/18:4), PE(15:0/22:6), PE(16:0/14:0), LPC(18:3), TAG55:7-FA20:3, TAG53:6-FA18:2, TAG55:7-FA20:4, TAG53:5-FA18:2, TAG53:7-FA18:3, TAG55:8-FA20:4, TAG53:5-FA18:1, TAG55:6-FA20:3, TAG57:9-FA22:6, TAG53:6-FA18:3, TAG55:6-FA18:1, TAG53:6-FA18:1, TAG53:4-FA18:1, TAG53:4-FA18:0, TAG51:4-FA16:0, TAG53:3-FA18:0, TAG51:3-FA16:0, TAG51:4-FA18:1, TAG56:5-FA20:4, TAG56:5-FA18:0, TAG56:4-FA20:4, PE(14:0/18:1), PC(14:0/18:4), PC(18:2/22:5), PC(20:0/22:5), SM(18:0), CE(18:0), PC(18:2/18:4), and PC(14:0/20:2).
The change (if any) in the level(s) of the one or more biomarkers over time may be indicative of progression or regression of liver disease in the subject. In order to characterize the course of liver disease in the subject, the level(s) of the one or more biomarkers in the first sample, the level(s) of the one or more biomarkers in the second sample, and/or the results of the comparison of the levels of the biomarkers in the first and second samples may be compared to liver disease-positive and liver disease-negative reference levels. If the comparisons indicate that the level(s) of the one or more biomarkers are increasing or decreasing over time (e.g., in the second sample as compared to the first sample) to become more similar to the liver disease-positive reference levels (or less similar to the liver disease-negative reference levels), then the results are indicative of liver disease progression. If the comparisons indicate that the level(s) of the one or more biomarkers are increasing or decreasing over time to become more similar to the liver disease-negative reference levels (or less similar to the liver disease-positive reference levels), then the results are indicative of liver disease regression.
In one embodiment, the assessment may be based on a liver disease Score (e.g., NASH Score, NAFLD Score, Fibrosis Score) which is indicative of liver disease in the subject and which can be monitored over time. By comparing the liver disease Score from a first time point sample to the liver disease Score from at least a second time point sample the progression or regression of liver disease can be determined. Such a method of monitoring the progression/regression of liver disease in a subject comprises (1) analyzing a first biological sample from a subject to determine a liver disease score for the first sample obtained from the subject at a first time point, (2) analyzing a second biological sample from a subject to determine a second liver disease score, the second sample obtained from the subject at a second time point, and (3) comparing the liver disease score in the first sample to the liver disease score in the second sample in order to monitor the progression/regression of liver disease in the subject.
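By way of non-limiting illustration, the comparison of liver disease Scores from two time points may be sketched as follows. The sketch assumes that a higher Score indicates more severe liver disease and that a change smaller than a chosen tolerance is treated as no change; the function name, the tolerance, and the example Scores are hypothetical.

```python
def classify_course(first_score, second_score, tolerance=0.0):
    """Classify the course of liver disease from Scores at two time points.

    Assumes a higher Score indicates more severe disease. Changes whose
    magnitude does not exceed the tolerance are reported as no change.
    """
    change = second_score - first_score
    if change > tolerance:
        return "progression"
    if change < -tolerance:
        return "regression"
    return "no change"


# Made-up Scores at a first and a second time point, illustration only.
print(classify_course(first_score=2.1, second_score=3.4, tolerance=0.5))
print(classify_course(first_score=3.4, second_score=2.1, tolerance=0.5))
print(classify_course(first_score=2.1, second_score=2.3, tolerance=0.5))
```

The tolerance could be set, for example, from the measurement variability of the underlying assay so that assay noise is not reported as a change in disease course.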
The biomarkers and algorithms described herein may guide or assist a physician in deciding a treatment path, for example, whether to implement procedures such as surgical procedures (e.g., full or partial nephrectomy), treat with drug therapy, or employ a watchful waiting approach.
As with the other methods described herein, the comparisons made in the methods of monitoring progression/regression of liver disease in a subject may be carried out using various techniques, including simple comparisons, one or more statistical analyses, mathematical models (algorithms) and combinations thereof.
The results of the method may be used along with other methods (or the results thereof) useful in the clinical monitoring of progression/regression of liver disease in a subject.
As described above in connection with methods of diagnosing (or aiding in the diagnosis of) liver disease, any suitable method may be used to analyze the biological samples in order to determine the level(s) of the one or more biomarkers in the samples. In addition, the level(s) of one or more biomarkers, including a combination of all of the biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 or any fraction thereof, may be determined and used in methods of monitoring progression/regression of liver disease in a subject.
Such methods could be conducted to monitor the course of liver disease in subjects having liver disease or could be used in subjects not having liver disease (e.g., subjects suspected of being predisposed to developing liver disease) in order to monitor levels of predisposition to liver disease.
C. Methods of Staging Liver Fibrosis
The identification of biomarkers for liver disease also allows for the determination of the liver fibrosis stage of a subject. A method of determining the stage of fibrosis comprises (1) analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers listed in Tables 10, 11, 12, 14, 16, and/or 18 in the sample and (2) comparing the level(s) of the one or more biomarkers in the sample to high stage fibrosis and/or low stage fibrosis reference levels of the one or more biomarkers in order to determine the stage of the subject's liver fibrosis. The results of the method may be used along with other methods (or the results thereof) useful in the clinical determination of the stage of a subject's liver disease. For example, the results of the method may provide an indication of patients who warrant invasive follow-up testing (e.g., liver biopsy) when a diagnosis of NAFLD or NASH is suspected based on the stage of liver fibrosis.
As described above in connection with methods of diagnosing (or aiding in the diagnosis of) liver disease, any suitable method may be used to analyze the biological sample in order to determine the level(s) of the one or more biomarkers in the sample.
The levels of one or more biomarkers listed in Tables 10, 11, 12, 14, 16, and/or 18 and combinations thereof may be determined in the methods of determining the stage of a subject's liver fibrosis. For example, the level(s) of one biomarker, two or more biomarkers, three or more biomarkers, four or more biomarkers, five or more biomarkers, six or more biomarkers, seven or more biomarkers, eight or more biomarkers, nine or more biomarkers, ten or more biomarkers, etc., including a combination of all of the biomarkers in Tables 10, 11, 12, 14, 16, and/or 18 or any fraction thereof, may be determined and used in methods of determining the stage of liver disease of a subject. For example, one or more of the following biomarkers may be used alone or in combination to diagnose or aid in diagnosing whether a subject has fibrosis: glutarate (pentanedioate), epiandrosterone sulfate, androsterone sulfate, I-urobilinogen, 16-hydroxypalmitate, fucose, taurine, 3-hydroxydecanoate, 3-hydroxyoctanoate, 16a-hydroxy DHEA 3-sulfate, dehydroisoandrosterone sulfate (DHEA-S), 2-aminoheptanoate, 5-methylthioadenosine (MTA), gamma-glutamylhistidine, valylglycine, and cyclo(L-phe-L-pro). One or more additional biomarkers may optionally be selected from the group consisting of: taurocholate, glycocholate, taurochenodeoxycholate, glycochenodeoxycholate, glutamate, TL16:1n7 (16:1n7, palmitoleate), TL16:0 (16:0, palmitic acid), isoleucine, and alpha-ketoglutarate.
After the level(s) of the one or more biomarkers in a sample are determined, the level(s) are compared to low stage liver fibrosis and/or high stage liver fibrosis reference levels in order to predict the stage of liver fibrosis of a subject. Levels of the one or more biomarkers in a sample matching the high stage liver fibrosis reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of the subject having high stage liver fibrosis. Levels of the one or more biomarkers in a sample matching the low stage liver fibrosis reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of the subject having low stage liver fibrosis. In addition, levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to low stage liver fibrosis reference levels are indicative of the subject not having low stage liver fibrosis. Levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to high stage liver fibrosis reference levels are indicative of the subject not having high stage liver fibrosis.
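By way of non-limiting illustration, the comparison of a measured level to low stage and high stage liver fibrosis reference levels may be sketched as a nearest-reference assignment. The sketch assumes that each stage is represented by a set of reference levels for a single biomarker and that similarity is measured as the distance to the reference mean in units of the reference standard deviation; this metric, the function name, and the reference values are hypothetical.

```python
import statistics


def nearest_stage(level, references):
    """Return the stage whose reference levels the measured level most resembles.

    references maps a stage label to a list of reference levels. Distance is
    the absolute difference from the reference mean, scaled by the reference
    sample standard deviation.
    """
    def scaled_distance(values):
        return abs(level - statistics.mean(values)) / statistics.stdev(values)

    return min(references, key=lambda stage: scaled_distance(references[stage]))


# Made-up reference levels in arbitrary units, illustration only.
fibrosis_references = {
    "low stage": [1.0, 1.2, 0.8],
    "high stage": [3.0, 3.4, 2.6],
}
print(nearest_stage(2.9, fibrosis_references))
print(nearest_stage(1.1, fibrosis_references))
```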
Studies were carried out to identify a set of biomarkers that can be used to determine the liver fibrosis stage of a subject. In another embodiment, the biomarkers provided herein can be used to provide a physician with a Fibrosis Score indicating the stage of liver fibrosis in a subject. The score is based upon clinically significantly changed reference level(s) for a biomarker and/or combination of biomarkers. The reference level can be derived from an algorithm. The Fibrosis Score can be used to determine the stage of liver fibrosis in a subject from normal (i.e. no liver fibrosis, Stage 0) to high stage liver fibrosis (i.e., Stage 3-4).
As with the methods described above, the level(s) of the one or more biomarkers may be compared to high stage liver fibrosis and/or low stage liver fibrosis reference levels using various techniques, including a simple comparison, one or more statistical analyses, and combinations thereof.
D. Methods of Distinguishing Less Severe Liver Disease from More Severe Liver Disease
The identification of biomarkers for liver disease also allows for the identification of biomarkers for distinguishing less severe liver disease from more severe liver disease. A method of distinguishing less severe liver disease from more severe liver disease in a subject having liver disease comprises (1) analyzing a biological sample from a subject to determine the level(s) of one or more biomarkers listed in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20 in the sample and (2) comparing the level(s) of the one or more biomarkers in the sample to less severe liver disease and/or more severe liver disease reference levels of the one or more biomarkers in order to determine the severity of the subject's liver disease. The results of the method may be used along with other methods (or the results thereof) useful in the clinical determination of the severity of a subject's liver disease.
As described above in connection with methods of diagnosing (or aiding in the diagnosis of) liver disease, any suitable method may be used to analyze the biological sample in order to determine the level(s) of the one or more biomarkers in the sample.
In one example, the levels of one or more biomarkers listed in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, 18, 19, and/or 20, and any combination thereof including a combination of all of the biomarkers may be determined in the methods of determining the severity of a subject's liver disease. In one example, NAFLD is liver disease of low severity, borderline NASH is liver disease of moderate severity, and NASH is liver disease of high severity. In another example, Stage 0 liver fibrosis is liver disease of low severity, Stage 1-2 liver fibrosis is liver disease of moderate severity, and Stage 3-4 fibrosis is liver disease of high severity. In another example, NASH is a liver disease of high severity, and non-NASH is a liver disease of low severity. In another example, fibrosis is a liver disease of high severity, and non-fibrosis is a liver disease of low severity. In another example, NAFLD is a liver disease of higher severity than non-NAFLD.
After the level(s) of the one or more biomarkers in the sample are determined, the level(s) are compared to less severe liver disease and/or more severe liver disease reference levels in order to determine the severity of liver disease of a subject. Levels of the one or more biomarkers in a sample matching the more severe liver disease reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of the subject having more severe liver disease. Levels of the one or more biomarkers in a sample matching the less severe liver disease reference levels (e.g., levels that are the same as the reference levels, substantially the same as the reference levels, above and/or below the minimum and/or maximum of the reference levels, and/or within the range of the reference levels) are indicative of the subject having less severe liver disease. In addition, levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to less severe liver disease reference levels are indicative of the subject not having less severe liver disease. Levels of the one or more biomarkers that are differentially present (especially at a level that is statistically significant) in the sample as compared to more severe liver disease reference levels are indicative of the subject not having more severe liver disease.
Studies were carried out to identify a set of biomarkers that can be used to distinguish less severe liver disease from more severe liver disease. In another embodiment, the biomarkers provided herein can be used to provide a physician with a liver disease Score indicating the severity of liver disease in a subject. The score is based upon clinically significantly changed reference level(s) for a biomarker and/or combination of biomarkers. The reference level can be derived from an algorithm. The liver disease Score can be used to determine the severity of liver disease in a subject from normal (i.e. no liver disease) to more severe liver disease.
As with the methods described above, the level(s) of the one or more biomarkers may be compared to more severe liver disease and/or less severe liver disease reference levels using various techniques, including a simple comparison, one or more statistical analyses, and combinations thereof.
As with the methods of diagnosing (or aiding in diagnosing) whether a subject has liver disease, the methods of determining the severity of liver disease of a subject may further comprise analyzing the biological sample to determine the level(s) of one or more non-biomarker compounds.
Other methods of using the biomarkers discussed herein are also contemplated. For example, the methods described in U.S. Pat. No. 7,005,255, U.S. Pat. No. 7,329,489, U.S. Pat. No. 7,553,616, U.S. Pat. No. 7,550,260, U.S. Pat. No. 7,550,258, U.S. Pat. No. 7,635,556, U.S. patent application Ser. No. 11/728,826, U.S. patent application Ser. No. 12/463,690 and U.S. patent application Ser. No. 12/182,828 may be conducted using a small molecule profile comprising one or more of the biomarkers disclosed herein.
In any of the methods listed herein, the biomarkers that are used may be selected from those biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, and/or 18 having p-values of less than 0.05. The biomarkers that are used in any of the methods described herein may also be selected from those biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, and/or 18 that are decreased in liver disease (as compared to the control) or that are decreased in high stage fibrosis (as compared to control or low stage fibrosis) or that are decreased in more severe liver disease (as compared to control or less severe liver disease) by at least 5%, by at least 10%, by at least 15%, by at least 20%, by at least 25%, by at least 30%, by at least 35%, by at least 40%, by at least 45%, by at least 50%, by at least 55%, by at least 60%, by at least 65%, by at least 70%, by at least 75%, by at least 80%, by at least 85%, by at least 90%, by at least 95%, or by 100% (i.e., absent); and/or those biomarkers in Tables 2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 16, and/or 18 that are increased in liver disease (as compared to the control) or that are increased in high stage fibrosis (as compared to control or low stage fibrosis) or that are increased in more severe liver disease (as compared to control or less severe liver disease) by at least 5%, by at least 10%, by at least 15%, by at least 20%, by at least 25%, by at least 30%, by at least 35%, by at least 40%, by at least 45%, by at least 50%, by at least 55%, by at least 60%, by at least 65%, by at least 70%, by at least 75%, by at least 80%, by at least 85%, by at least 90%, by at least 95%, by at least 100%, by at least 110%, by at least 120%, by at least 130%, by at least 140%, by at least 150%, or more.
The invention will be further explained by the following illustrative examples that are intended to be non-limiting.
Samples were prepared using the automated MicroLab STAR® system from Hamilton Company. Recovery standards were added prior to the first step in the extraction process for QC purposes. Sample preparation was conducted using a methanol extraction to remove the protein fraction while allowing maximum recovery of small molecules. The resulting extract was divided into five fractions: one for analysis by UPLC-MS/MS with positive ion mode electrospray ionization, one for analysis by UPLC-MS/MS with negative ion mode electrospray ionization, one for LC polar platform, one for analysis by GC-MS, and one sample was reserved for backup. Samples were placed briefly on a TurboVap® (Zymark) under nitrogen to remove the organic solvent. For LC, the samples were stored under nitrogen overnight. For GC, the samples were dried under vacuum overnight. Samples were then prepared for the appropriate instrument, either LC/MS or GC/MS.
LC/MS analysis used a Waters ACQUITY ultra-performance liquid chromatography (UPLC) system and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and Orbitrap mass analyzer operated at 35,000 mass resolution. The sample extract was dried and then reconstituted in acidic or basic LC-compatible solvents, each of which contained 8 or more injection standards at fixed concentrations to ensure injection and chromatographic consistency. One aliquot was analyzed using acidic positive ion optimized conditions and the other using basic negative ion optimized conditions in two independent injections using separate dedicated columns (Waters UPLC BEH C18-2.1×100 mm, 1.7 μm). Extracts reconstituted in acidic conditions were gradient eluted from a C18 column using water and methanol containing 0.1% formic acid. The basic extracts were similarly eluted from C18 using methanol and water containing 6.5 mM ammonium bicarbonate. The third aliquot was analyzed via negative ionization following elution from a HILIC column (Waters UPLC BEH Amide 2.1×150 mm, 1.7 μm) using a gradient consisting of water and acetonitrile with 10 mM ammonium formate. The MS analysis alternated between MS and data-dependent MS2 scans using dynamic exclusion, and the scan range was from 80-1000 m/z.
For GC/MS analysis, samples were re-dried under vacuum desiccation for a minimum of 24 hours prior to being derivatized under dried nitrogen using bistrimethyl-silyl-trifluoroacetamide (BSTFA). The GC column was a 20 m×0.18 mm ID, with 5% phenyl; 95% dimethylsilicone phase. The temperature ramp was from 60° to 340° C. in an 18 minute period. Samples were analyzed on a Thermo-Finnigan Trace DSQ fast-scanning single-quadrupole mass spectrometer using electron impact ionization at unit mass resolution. The instrument was tuned and calibrated for mass resolution and mass accuracy on a daily basis.
In some examples, lipids were extracted in the presence of authentic internal standards by the method of Folch et al. (J Biol Chem 226:497-509) using chloroform:methanol (2:1 v/v). Lipids were transesterified in 1% sulfuric acid in methanol in a sealed vial under a nitrogen atmosphere at 100° C. for 45 minutes. The resulting fatty acid methyl esters were extracted from the mixture with hexane containing 0.05% butylated hydroxytoluene and prepared for GC by sealing the hexane extracts under nitrogen. Fatty acid methyl esters were separated and quantified by capillary GC (Agilent Technologies 6890 Series GC) equipped with a 30 m DB 88 capillary column (Agilent Technologies) and a flame ionization detector. The absolute concentration of each lipid was determined by comparing the peak area to that of the internal standard.
In some examples, lipids were extracted from samples in methanol:dichloromethane in the presence of internal standards. The extracts were concentrated under nitrogen and reconstituted in 0.25 mL of 10 mM ammonium acetate in dichloromethane:methanol (50:50). The extracts were transferred to inserts and placed in vials for infusion-MS analysis, performed on a Shimadzu LC with nano PEEK tubing and a Sciex SelexION-5500 QTRAP. The samples were analyzed via both positive and negative mode electrospray. The 5500 QTRAP scan was performed in MRM mode with a total of more than 1,100 MRMs. Individual lipid species were quantified by taking the peak area ratios of target compounds and their assigned internal standards, then multiplying by the concentration of internal standard added to the sample. Lipid class concentrations were calculated from the sum of all molecular species within a class, and fatty acid compositions were determined by calculating the proportion of each class comprised by individual fatty acids.
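The quantification arithmetic described above can be illustrated with a minimal sketch. The function names, lipid species labels, and peak-area values below are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
from collections import defaultdict


def species_concentration(target_area, istd_area, istd_conc):
    """Peak-area ratio (target / internal standard) times the
    concentration of internal standard added to the sample."""
    return (target_area / istd_area) * istd_conc


def class_totals(species):
    """Sum species concentrations within each lipid class.

    `species` maps (lipid_class, species_name) -> concentration.
    """
    totals = defaultdict(float)
    for (lipid_class, _name), conc in species.items():
        totals[lipid_class] += conc
    return dict(totals)


# Hypothetical peak areas and internal-standard concentration.
conc_a = species_concentration(2000.0, 1000.0, 5.0)
conc_b = species_concentration(500.0, 1000.0, 5.0)
totals = class_totals({
    ("PC", "species A"): conc_a,
    ("PC", "species B"): conc_b,
})
```

Here the class total for the hypothetical class "PC" is simply the sum of the two species concentrations.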
For each biological matrix data set on each instrument (except for GC-FID), relative standard deviations (RSDs) of peak area were calculated for each internal standard to confirm extraction efficiency, instrument performance, column integrity, chromatography, and mass calibration. Several of these internal standards serve as retention index (RI) markers and were checked for retention time and alignment. Modified versions of the software accompanying the UPLC-MS and GC-MS systems were used for peak detection and integration. The output from this processing generated a list of m/z ratios, retention times and area under the curve values. Software specified criteria for peak detection including thresholds for signal to noise ratio, height and width.
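As an illustration of the RSD check described above, the following is a minimal sketch assuming the conventional definition of percent RSD (sample standard deviation divided by the mean, times 100). The function name and the peak-area values are hypothetical.

```python
import statistics


def relative_standard_deviation(values):
    """Percent RSD: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical peak areas of one internal standard across three injections.
rsd = relative_standard_deviation([98.0, 100.0, 102.0])
```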
The biological data sets, including QC samples, were chromatographically aligned based on a retention index that utilizes internal standards assigned a fixed RI value. The RI of the experimental peak is determined by assuming a linear fit between flanking RI markers whose values do not change. The benefit of the RI is that it corrects for retention time drifts that are caused by systematic errors such as sample pH and column age. Each compound's RI was designated based on the elution relationship with its two lateral retention markers. Using an in-house software package, integrated, aligned peaks were matched against an in-house library (a chemical library) of authentic standards and routinely detected unknown compounds, which is specific to the positive, negative or GC-MS data collection method employed. Matches were based on retention index values within 150 RI units of the prospective identification and experimental precursor mass match to the library authentic standard within 0.4 m/z for the LTQ and DSQ data. The experimental MS/MS was compared to the library spectra for the authentic standard and assigned forward and reverse scores. A perfect forward score would indicate that all ions in the experimental spectra were found in the library for the authentic standard at the correct ratios and a perfect reverse score would indicate that all authentic standard library ions were present in the experimental spectra and at correct ratios. The forward and reverse scores were compared and a MS/MS fragmentation spectral score was given for the proposed match. All matches were then manually reviewed by an analyst that approved or rejected each call based on the criteria above. However, manual review by an analyst is not required. In some embodiments the matching process is completely automated.
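The linear fit between flanking RI markers and the tolerance-based library match can be sketched as follows. This is an illustrative sketch only, not the in-house software: the function names and marker values are hypothetical, while the 150 RI unit and 0.4 m/z tolerances are the ones stated above.

```python
def retention_index(rt, marker_before, marker_after):
    """Linearly interpolate the RI of a peak eluting at `rt` between
    two flanking markers, each given as (retention_time, fixed_RI)."""
    (rt0, ri0), (rt1, ri1) = marker_before, marker_after
    if rt0 == rt1 or not (rt0 <= rt <= rt1):
        raise ValueError("peak must lie between two distinct markers")
    return ri0 + (rt - rt0) * (ri1 - ri0) / (rt1 - rt0)


def matches_library(ri, mz, lib_ri, lib_mz, ri_tol=150.0, mz_tol=0.4):
    """True when both the RI and the precursor mass fall within tolerance."""
    return abs(ri - lib_ri) <= ri_tol and abs(mz - lib_mz) <= mz_tol


# Hypothetical markers at 2.0 min (RI 2000) and 3.0 min (RI 3000).
ri = retention_index(2.5, (2.0, 2000.0), (3.0, 3000.0))
hit = matches_library(ri, 298.1, lib_ri=2550.0, lib_mz=298.3)
miss = matches_library(ri, 298.1, lib_ri=2800.0, lib_mz=298.3)
```

Because the marker RI values are fixed, a drift in absolute retention time shifts the peak and its flanking markers together and leaves the interpolated RI largely unchanged.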
Further details regarding a chemical library, a method for matching integrated aligned peaks for identification of named compounds and routinely detected unknown compounds, and computer-readable code for identifying small molecules in a sample may be found in U.S. Pat. No. 7,561,975, which is incorporated by reference herein in its entirety.
From the biological samples, aliquots of each of the individual samples were combined to make technical replicates, which were extracted as described above. Extracts of this pooled sample were injected six times for each data set on each instrument to assess process variability. As an additional quality control, five water aliquots were also extracted as part of the sample set on each instrument to serve as process blanks for artifact identification. All QC samples included the instrument internal standards to assess extraction efficiency, and instrument performance and to serve as retention index markers for ion identification. The standards were isotopically labeled or otherwise exogenous molecules chosen so as not to obstruct detection of intrinsic ions.
Missing values, if any, were imputed with the observed minimum for that particular compound. A mixed-effects model was used to analyze differences between the NAFLD and non-NAFLD groups, and correlations between metabolites and clinical parameters were also assessed with a mixed-effects model. Statistical analyses were performed on natural log-transformed data. Random forest (RF) analysis was carried out to determine the ability of the global biochemical profile to separate the NAFLD and non-NAFLD groups and to separate groups based on fibrosis stage. Logistic regression and area under the curve (AUC) were used to assess the performance of individual metabolite biomarkers and several clinical parameters for distinguishing NAFLD from non-NAFLD and for distinguishing fibrosis stage. Logistic regression with Chi-square analysis and AUC were used to assess the performance of individual metabolite biomarkers for distinguishing fibrosis from no fibrosis and NASH from no NASH. Multiple logistic regression modeling was performed to analyze the performance of combinations of multiple biomarkers (biomarker panels).
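The imputation and transformation steps can be sketched for a single compound as shown below. This is a minimal illustration that assumes missing values are represented as None; the function name and example values are hypothetical.

```python
import math


def impute_and_log(values):
    """Replace missing values (None) with the observed minimum for the
    compound, then apply the natural log transform."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("compound has no observed values")
    floor = min(observed)
    return [math.log(floor if v is None else v) for v in values]


# Hypothetical measurements of one compound across three samples.
transformed = impute_and_log([math.e, None, 1.0])
```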
Serum samples from 36 subjects with NAFLD (as defined by >5% steatosis by MRI imaging) and 118 subjects without NAFLD were analyzed using four global metabolic profiling mass spectrometry platforms, as well as the GC-FID analysis for fatty acids, cholesterol metabolism lipids, and Vitamin E. A total of 770 named metabolites were detected in the patient samples. Clinical parameters including Age, Gender, Race, Ethnicity, Height/Weight/Body mass index (BMI), Smoking history, Diabetes history, Glucose, Albumin, Bilirubin, Aspartate aminotransferase (AST), Alanine aminotransferase (ALT), Alkaline phosphatase, Total cholesterol, High-density lipoprotein cholesterol (HDL), Low-density lipoprotein cholesterol (LDL), Triglycerides, Ferritin, Gamma-glutamyl transferase (GGT), HBA1c, White blood cell (WBC) count, Hemoglobin (HGB), Hematocrit (HCT), Platelet count, Prothrombin time, International normalized ratio (INR), Insulin, and Hepatic imaging parameters including MRI Proton Density Fat Fraction (MRI PDFF) and MRE (Elastography) were provided for the subjects. Data from MRI PDFF were used in the clinical determination of NAFLD or non-NAFLD.
Random forest (RF) analysis was carried out to determine the ability of the global biochemical profile to separate the NAFLD and non-NAFLD groups. RF is an unbiased and supervised classification technique based on a large number of decision trees. Using the groupings of NAFLD and non-NAFLD, RF classification analysis based on the serum metabolic profile of the entire study cohort (n=154) differentiated the two groups with 83.1% accuracy. Using all named metabolites, 83.9% (99 of 118) non-NAFLD and 80.6% (29 of 36) NAFLD subjects were correctly classified for an overall predictive accuracy of 83.1%.
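The reported percentages follow directly from the classification counts given above. The short sketch below reproduces the arithmetic; the function and key names are hypothetical.

```python
def classification_summary(correct_neg, total_neg, correct_pos, total_pos):
    """Per-group and overall accuracy, in percent, from classification counts."""
    return {
        "non_nafld": 100.0 * correct_neg / total_neg,
        "nafld": 100.0 * correct_pos / total_pos,
        "overall": 100.0 * (correct_neg + correct_pos) / (total_neg + total_pos),
    }


# Counts from the text: 99 of 118 non-NAFLD and 29 of 36 NAFLD correct.
summary = classification_summary(99, 118, 29, 36)
rounded = {name: round(value, 1) for name, value in summary.items()}
```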
Logistic regression and area under the curve (AUC) were used to assess the performance of several of the clinical parameters for distinguishing NAFLD from non-NAFLD. The results are shown in Table 1. Since MRI PDFF was used to diagnose NAFLD in this patient cohort, the AUC for that parameter is 1.000.
Logistic regression models and area under the curve (AUC) were used to assess how well individual metabolites discriminated the NAFLD and non-NAFLD groups. Logistic regression analysis was performed using the measured values obtained for all 770 named metabolites that were detected in the sample. The metabolites with an AUC of >0.700 for distinguishing NAFLD from non-NAFLD patient samples are presented in Table 2.
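For a single metabolite, a one-variable logistic model is a monotone function of the measured level, so its AUC can be computed from the ranking of the raw values when higher levels indicate disease. The following is a minimal sketch under that assumption, using the pairwise (Mann-Whitney) definition of AUC; the function name and values are hypothetical, and this is not the statistical software used in the study.

```python
def auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive sample scores higher
    than a randomly chosen negative sample, counting ties as one half."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical metabolite levels in NAFLD (positive) and non-NAFLD samples.
example_auc = auc([3.0, 4.0, 5.0], [1.0, 2.0, 4.0])
passes_threshold = example_auc > 0.700
```

For a metabolite that is decreased in disease, the same calculation on negated values gives the corresponding AUC.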
Multiple logistic regression modeling was performed to analyze the performance of various combinations of biomarkers (“biomarker panels”). The leave-one-out cross-validation method was used to determine the number of variables (e.g., metabolite biomarkers) to include in the model. In this method, one sample is removed from the data set, the model is fit on the remaining data, and the fitted model is then used to predict the sample that was left out. The method provides an estimate of future performance. Here the clinical parameter MRI Proton Density Fat Fraction (MRI PDFF) was used to assess the change in the correlation as more variables are added to the model. As the number of compounds increases, the mean R² value for the correlation increases until an optimal number is reached, indicating that variable selection is more or less stable. In this analysis, models with at least 2 variables increased the correlation, and the correlation peaked at five variables.
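The leave-one-out procedure itself can be sketched independently of the model being fit. The example below substitutes a one-variable least-squares line for the multiple regression models used in the study, purely to keep the illustration self-contained; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(xs, ys):
    """Hold out each sample in turn, fit on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


# Hypothetical, perfectly linear data: every held-out point is recovered.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
preds = loocv_predictions(xs, ys)
```

Comparing the held-out predictions with the observed values, for models of increasing size, gives the kind of performance-versus-number-of-variables curve described above.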
In one example, multiple logistic regression modeling with 4 and 5 variable models was performed using the measured values obtained for 13 metabolite biomarkers for distinguishing patients with NAFLD from individuals without NAFLD. These biomarkers included glycine, serine, leucine, 4-methyl-2-oxopentanoate, 3-methyl-2-oxovalerate, valine, 3-methyl-2-oxobutyrate, 2-hydroxybutyrate, 5-methylthioadenosine, prolylproline, lanosterol, tauro-beta-muricholate, and deoxycholate. There were 715 4-variable models generated using the listed 13 metabolites. The AUC was >0.800 for 204 of these models. There were 1287 5-variable models generated using the 13 listed metabolites. The AUC was >0.800 for 493 of these models. Table 3 shows the 30 4-variable models having the highest AUC. Table 4 shows the top 30 5-variable models. Table 5 shows the 13 metabolites used in the 4- and 5-variable models and the prevalence (in percentage) of the given metabolite in the models with AUC >0.800. For example, 5-methylthioadenosine (MTA) was identified in 92.2% of all 204 4-variable models with an AUC>0.800 and in 93.5% of all 493 5-variable models with an AUC>0.800.
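The model counts above are the numbers of ways to choose 4 or 5 metabolites from the 13 listed. The sketch below enumerates the candidate panels and shows how a prevalence figure of the kind reported in Table 5 could be computed; the helper name `prevalence` is hypothetical, and the example applies it to all panels rather than to the AUC-filtered subset.

```python
import math
from itertools import combinations

metabolites = [
    "glycine", "serine", "leucine", "4-methyl-2-oxopentanoate",
    "3-methyl-2-oxovalerate", "valine", "3-methyl-2-oxobutyrate",
    "2-hydroxybutyrate", "5-methylthioadenosine", "prolylproline",
    "lanosterol", "tauro-beta-muricholate", "deoxycholate",
]

# Every distinct 4- and 5-metabolite panel.
four_variable = list(combinations(metabolites, 4))
five_variable = list(combinations(metabolites, 5))


def prevalence(metabolite, panels):
    """Percentage of the given panels that include the metabolite."""
    return 100.0 * sum(metabolite in panel for panel in panels) / len(panels)


# Across all 715 unfiltered panels each metabolite appears equally often.
baseline = prevalence("5-methylthioadenosine", four_variable)
```

Without any AUC filter each metabolite appears in 220 of the 715 4-variable panels (about 30.8%), so a prevalence above 90% among the models with AUC >0.800 indicates strong enrichment for that metabolite.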
Serum samples from 116 subjects with NASH, 18 subjects with NAFLD, and 18 subjects with borderline NASH were analyzed using four global metabolic profiling mass spectrometry platforms, as well as the GC-FID analysis for fatty acids, cholesterol metabolism lipids, and Vitamin E. All diagnoses were determined by a trained pathologist using histological analysis of patient biopsy samples. A total of 721 named metabolites were detected in the samples from this cohort. Clinical parameters including Age, Gender, Height/Weight/Body mass index (BMI), Diabetes history, Glucose, Insulin, HBA1c, Aspartate aminotransferase (AST), Alanine aminotransferase (ALT), Total cholesterol, High-density lipoprotein cholesterol (HDL), Low-density lipoprotein cholesterol (LDL), Triglycerides, Gamma-glutamyl transferase (GGT), Steatosis, Lobular Inflammation, Portal Inflammation, Ballooning, and NAFLD Activity Score (NAS) were provided for the subjects.
Logistic regression and area under the curve (AUC) were used to assess the performance of several of the clinical parameters for distinguishing NASH from borderline NASH and NAFLD. The results are shown in Table 6.
All 721 named metabolites were analyzed using a mixed effects model. Metabolites that were significantly altered (p<0.05, q<0.1) in the comparison of NASH to NAFLD samples are presented in Table 7. Other comparisons presented in Table 7 are Borderline (BL) NASH vs. NAFLD, and NASH vs. BL NASH. Table 7 includes, for each metabolite, the biochemical name of the metabolite, the internal identifier for the biomarker compound in the in-house chemical library of authentic standards (CompID), the fold change (FC) of the biomarker for each comparison, which is the ratio of the mean level of the biomarker in one sample type as compared to the mean level in a different sample type (e.g. NASH versus NAFLD), and the p-value determined in the statistical analysis of the data concerning the biomarkers.
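The fold change defined above is a ratio of group means. A minimal sketch follows; the function name and the metabolite levels are hypothetical.

```python
import statistics


def fold_change(group_a, group_b):
    """Ratio of the mean level in group A to the mean level in group B."""
    return statistics.mean(group_a) / statistics.mean(group_b)


# Hypothetical levels of one metabolite in two sample types.
fc = fold_change([2.0, 4.0], [1.0, 2.0, 3.0])
```

A value above 1 indicates a higher mean level in the first sample type, and a value below 1 indicates a lower mean level.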
Logistic regression models and area under the curve (AUC) were used to assess how well individual metabolites distinguished the NASH from borderline NASH and NAFLD groups. Logistic regression analysis was performed for all 721 named metabolites. Metabolites with an AUC of >0.620 for distinguishing NASH from borderline NASH and NAFLD patient samples are presented in Table 8. Metabolites in bold are significant with p<0.05, q<0.1 in NASH compared to NAFLD patient samples.
epiandrosterone sulfate
myristate (14:0)
androsterone sulfate
16-hydroxypalmitate
1-oleoylglycerophosphoethanolamine
3-hydroxyoctanoate
dehydroisoandrosterone sulfate
5-methylthioadenosine (MTA)
pregnanediol-3-glucuronide
valylglycine
cyclo(L-phe-L-pro)
TL16:1n7 (palmitoleic acid)
TL16:0 (palmitic acid)
isoleucylglycine
hypoxanthine
1-arachidoylglycerophosphocholine
2-oleoylglycerophosphocholine
xanthine
phenylalanylvaline
valylleucine
isoleucylvaline
5alpha-pregnan-3beta,20alpha-diol monosulfate
glycerol
4-androsten-3alpha,17alpha-diol monosulfate
hydroxybutyrylcarnitine
docosatrienoate (22:3n3)
5-dodecenoate (12:1n7)
10-heptadecenoate (17:1n7)
myristoleate (14:1n5)
3-hydroxybutyrate (BHBA)
malate
pregnenolone sulfate
glutarylcarnitine (C5)
caprate (10:0)
21-hydroxypregnenolone disulfate
1-margaroylglycerophosphocholine
N-acetylmethionine
carnitine
4-androsten-3beta,17beta-diol monosulfate
leucylglycine
1-eicosapentaenoylglycerophosphocholine
cyclohexanebutanoic acid
catechol sulfate
pregn steroid monosulfate
5alpha-androstan-3beta,17beta-diol monosulfate
Serum samples from 152 subjects with liver biopsy-diagnosed NASH or NAFLD were used in the analysis. All diagnoses were determined by a trained pathologist using histological analysis of patient biopsy samples. Patient samples were classified into three groups according to disease severity based on the fibrosis stage (stage 0, least severe; stage 1-2, moderate severity; stage 3-4, high severity). All samples were analyzed using four global metabolic profiling mass spectrometry platforms, as well as the GC-FID analysis for fatty acids, cholesterol metabolism lipids, and Vitamin E. A total of 721 named metabolites were detected in the sample cohort. Clinical parameters including Age, Gender, Height/Weight/Body mass index (BMI), Diabetes history, Glucose, Insulin, HBA1c, Aspartate aminotransferase (AST), Alanine aminotransferase (ALT), Total cholesterol, High-density lipoprotein cholesterol (HDL), Low-density lipoprotein cholesterol (LDL), Triglycerides, Gamma-glutamyl transferase (GGT), Steatosis, Lobular Inflammation, Portal Inflammation, Ballooning, and NAFLD Activity Score (NAS) were provided for the subjects.
Logistic regression and area under the curve (AUC) were used to assess the performance of several of the clinical parameters for distinguishing fibrosis stages 3-4 (high severity) from stages 1-2 (moderate severity) and stage 0 (low severity). The results are shown in Table 9.
The measured levels of the 721 named metabolites detected in the samples were analyzed using a mixed effects model. Metabolites that were significantly altered (p<0.05, q<0.1) in the comparison of Stage 3+4 (high severity) fibrosis to Stage 0 (low severity) fibrosis samples are presented in Table 10. Other comparisons presented in Table 10 are Stage 3+4 (high severity) vs. Stage 1+2 (moderate severity), and Stage 1+2 vs. Stage 0. Table 10 includes, for each metabolite, the biochemical name of the metabolite, the internal identifier for the biomarker compound in the in-house chemical library of authentic standards (CompID), the fold change (FC) of the biomarker for each comparison, which is the ratio of the mean level of that biomarker in one sample type as compared to the mean level in a different sample type, and the p-value determined in the statistical analysis of the data concerning the biomarkers.
Logistic regression models and area under the curve (AUC) were used to assess how well individual metabolites distinguished the stage 3-4 fibrosis from stage 1-2 and stage 0 fibrosis groups. Logistic regression analysis was performed on the measured values obtained for all 721 named metabolites detected in the samples.
Metabolites with an AUC of >0.620 for distinguishing stage 3-4 fibrosis from stage 1-2 and stage 0 fibrosis patient samples are presented in Table 11.
In another example, serum samples from 200 subjects spanning the spectrum of nonalcoholic fatty liver disease were analyzed. Patient samples were classified into five groups according to disease severity based on the fibrosis stage (stage 0, no fibrosis (N=12); stage 1, mild severity (N=38); stage 2, moderate severity (N=100); stage 3, high severity (N=42); stage 4, cirrhosis (N=8)). All samples were analyzed using four global metabolic profiling mass spectrometry platforms, as well as the GC-FID analysis for fatty acids, cholesterol metabolism lipids, and Vitamin E. A total of 790 named metabolites and 361 unnamed metabolites were detected in the sample cohort. Clinical parameters including Age, Gender, Race, Ethnicity, Height/Weight/Body mass index (BMI), Smoking history, Diabetes history, Steatosis, Fibrosis, Lobular Inflammation, Portal Inflammation, Hepatocellular ballooning, NAFLD Activity Score (NAS), Fasting glucose, Fasting insulin, Aspartate aminotransferase (AST), Alanine aminotransferase (ALT), Alkaline phosphatase, Total cholesterol, High-density lipoprotein cholesterol (HDL), Low-density lipoprotein cholesterol (LDL), Triglycerides, HBA1c, and Hemoglobin (HGB) were provided for the subjects.
The statistical significance and predictive performance of metabolites detected in the samples, used individually or in combinations, to stage fibrosis in these subjects were assessed using t-tests, AUC calculations, logistic regression and random forest analysis. For comparison, the performance of Age, Type 2 Diabetes, BMI, HDL Cholesterol, Gender, Fructose, and Past Alcohol Use, which are commonly measured clinical parameters, was also evaluated individually and in combinations. The results of these analyses are presented in this example. These results show that many metabolites alone have an AUC higher than that obtained using clinical parameters alone and, in some cases, outperformed combinations of clinical parameters. Further, our analyses identified combinations of metabolites that had better predictive performance than any of the combinations of clinical parameters.
The measured levels of the 1151 metabolites detected in the samples were analyzed using Welch's two-sample t-tests to compare the levels measured in samples collected from subjects with more severe fibrosis to the levels measured in samples collected from subjects with less severe fibrosis or no fibrosis. Metabolites detected in the study are presented in Table 12. Comparisons presented in Table 12 are Stage 2-4 vs. Stage 0-1, Stage 3-4 vs. Stage 1-2, Stage 3-4 vs. Stage 0-1, Stage 4 vs. Stage 0, Stage 3-4 vs. Stage 0, Stage 1-2 vs. Stage 0, Stage 3-4 vs. Stage 1-2, Stage 3-4 vs. Stage 2, and Stage 2 vs. Stage 0-1. Table 12 includes, for each metabolite, the biochemical name of the metabolite, the internal identifier for the biomarker compound in the in-house chemical library of authentic standards (CompID), the fold change (FC) of the biomarker for each comparison, which is the ratio of the mean level of that biomarker in one sample type as compared to the mean level in a different sample type, and the p-value determined in the statistical analysis of the data concerning the biomarkers. Fold change values in bold font indicate that the p-value for the given comparison was less than 0.05.
1.45
1.39
1.7
1.7
2.01
1.49
1.35
1.27
1.33
0.67
0.63
0.52
0.2
0.7
0.63
0.73
0.62
0.55
0.2
0.62
2
1.73
2.56
4.26
5.87
3.6
1.63
1.49
1.72
1.19
1.13
1.24
1.27
1.18
1.22
1.23
1.27
1.39
1.37
1.27
1.21
0.92
0.84
0.82
0.86
0.84
0.85
1.33
1.29
1.5
1.6
1.27
1.21
1.24
1.59
1.32
1.77
1.69
1.96
1.52
1.29
1.5
0.69
0.71
0.59
0.34
0.56
0.73
0.8
0.74
2.15
1.66
2.65
3.98
4.33
2.75
1.57
1.4
1.9
1.69
1.21
1.74
2.39
2.44
2.11
1.16
1.67
0.82
0.39
0.85
1.24
1.34
1.2
1.63
1.41
1.88
1.8
1.5
1.54
1.42
1.52
1.81
1.58
1.85
1.92
2.3
3.13
4.04
2.82
3.16
2.82
1.85
2.08
2.67
3.13
3.1
2.02
1.87
1.43
2.43
3.53
4.48
2.43
3.52
3.53
3.19
1.4
2.06
2.31
3.12
2.62
4.16
2.23
2.05
1.53
1.08
1.06
1.11
1.07
1.24
1.3
1.38
1.23
1.22
1.27
1.15
1.33
1.13
1.24
0.83
0.92
0.82
0.81
0.87
0.84
1.3
1.15
1.36
1.61
1.25
1.14
1.28
1.31
1.13
1.34
1.73
1.47
1.32
1.11
1.29
5.36
1.33
5.24
6.88
5.54
1.24
5.42
5.38
0.54
2.6
5.95
5.29
10.7
0.5
6.77
1.44
1.34
1.64
1.48
1.32
1.34
0.53
0.58
0.48
0.57
0.64
0.51
1.59
1.58
1.59
1.21
1.11
1.25
1.21
1.1
1.19
0.72
0.76
0.63
0.38
0.64
0.77
0.84
0.76
2.11
1.32
2.25
4.32
5.91
1.24
1.1
2.05
0.84
0.91
0.71
0.91
0.87
0.81
1.62
1.34
1.42
1.66
1.77
0.76
0.78
0.6
0.81
0.84
0.74
0.76
0.87
0.74
0.75
0.85
0.77
1.59
1.48
1.91
1.48
1.44
0.84
0.9
0.81
0.67
0.81
0.86
1.13
1.08
1.16
1.14
1.07
1.11
2.64
2.4
3.97
3.13
3.84
1.65
2.32
2
1.98
3.28
1.72
3.95
3.03
6.01
3.72
1.62
2.95
0.71
0.73
0.31
0.49
0.5
0.69
1.37
1.37
1.59
1.75
1.34
1.25
0.79
0.83
0.73
0.83
0.83
1.24
1.15
1.31
1.42
1.57
1.39
1.2
1.36
1.47
1.31
1.34
1.3
2.29
2.01
1.98
1.36
0.79
0.82
0.72
0.81
0.82
1.33
1.3
1.5
1.67
1.28
1.24
0.62
0.68
0.53
0.32
0.44
0.72
0.67
1.36
1.27
1.52
1.5
1.26
1.29
0.88
0.87
0.8
0.82
0.88
0.86
0.83
0.78
0.86
0.83
0.9
0.88
0.87
0.69
0.79
0.83
0.88
0.79
0.82
0.92
0.94
0.77
1.19
1.11
1.23
1.26
1.11
1.17
2.31
0.78
1.64
2.05
2.65
1.15
1.19
0.8
0.8
0.5
0.74
0.78
0.79
1.45
1.62
1.7
1.28
1.37
5.04
7.47
14.2
7.39
3.83
0.88
0.88
0.69
0.88
0.84
0.87
0.55
0.83
0.84
0.83
0.82
0.84
0.59
0.74
0.75
0.81
1.08
1.09
1.16
1.12
1.08
0.9
0.9
0.9
1.23
1.17
1.31
1.49
1.29
1.16
1.18
0.88
0.78
0.82
0.81
0.86
0.71
0.69
0.72
1.19
1.28
1.42
1.27
1.24
0.92
0.9
1.36
1.35
1.58
1.72
1.33
1.26
0.85
0.87
0.8
0.59
0.84
0.88
0.78
0.82
0.72
0.81
0.88
0.76
0.76
0.86
0.77
1.84
2.29
1.61
1.32
1.11
2.1
2.56
1.42
4.49
4.54
0.85
0.78
0.73
0.77
0.8
1.48
1.16
1.52
1.15
1.04
1.98
2.06
2.2
2.02
1.26
1.29
1.24
3.1
1.87
3.72
1.18
1.17
1.28
1.13
1.09
1.1
1.08
0.85
0.85
0.78
0.85
1.53
7.36
2.42
2.03
1.48
1.13
1.12
1.19
1.28
1.1
1.25
1.26
1.24
1.3
1.36
0.82
0.85
0.81
2.77
1.43
3
1.41
1.1
1.07
1.13
1.19
1.08
0.86
0.84
0.88
1.09
1.11
1.09
1.27
1.12
1.31
1.6
1.51
1.38
1.46
1.71
1.96
1.72
1.61
1.19
1.22
1.76
1.76
1.66
1.17
0.77
0.77
0.77
1.18
1.25
1.31
1.14
0.85
0.85
0.86
0.84
1.12
1.14
0.89
0.84
0.69
0.75
0.91
1.17
1.22
1.31
1.1
0.47
0.58
0.46
0.43
1.07
1.08
0.87
0.68
0.83
0.84
0.84
0.69
0.66
0.59
0.84
0.7
0.71
1.21
1.22
1.35
1.21
1.59
1.55
1.05
1.17
0.99
0.69
0.7
0.41
0.63
0.69
0.87
0.85
0.81
0.79
0.82
0.76
0.74
0.65
1.31
0.78
1.41
0.81
0.81
1.53
1.71
1.44
2.24
4.14
4.04
1.3
0.65
0.66
1.18
1.27
1.4
1.13
1.16
1.23
1.31
1.61
1.33
1.3
1.37
1.75
1.52
1.15
1.16
0.89
0.69
0.88
1.11
1.09
1.16
1.14
1.09
1.06
0.87
0.86
1.22
1.26
1.33
1.09
1.09
0.88
0.86
1.08
1.13
1.16
1.19
1.13
1.12
1.16
1.24
1.27
1.29
1.31
0.85
0.67
0.87
0.71
0.69
0.67
0.69
0.71
0.72
0.91
0.89
1.22
1.2
1.44
1.23
0.84
1.17
0.99
0.88
0.73
0.74
0.93
0.76
0.7
0.74
1.7
1.52
2.05
1.49
0.88
0.87
0.56
0.78
0.82
0.89
0.91
0.88
1.12
1.77
1.33
1.28
1.11
0.8
0.79
0.72
0.81
1.22
1.06
1.21
1.42
1.4
1.35
1.03
0.87
0.86
0.62
0.87
1.45
2.07
1.41
3.26
1.8
1.1
1.12
1.27
1.31
1.1
1.1
1.13
1.14
0.89
0.88
0.8
0.79
1.02
0.86
0.85
0.95
0.95
1.28
1.3
1.41
0.86
0.85
1.08
1.08
1.24
1.23
1.11
1.15
1.2
1.23
1.31
1.14
1.13
0.91
0.89
0.55
0.52
1.15
1.26
1.87
0.5
0.62
0.43
0.67
0.81
1.07
1.07
1.52
1.56
1.87
2.08
1.84
0.9
0.8
3.16
1.14
1.21
1.3
1.17
1.11
1.09
1.15
1.53
1.24
0.95
0.8
0.9
0.89
0.94
0.7
0.64
0.54
0.2
0.63
0.69
1.24
1.33
0.91
0.93
0.25
1.06
1.19
1.14
1.13
1.06
0.94
1.1
0.91
1.13
1.16
1.31
1.12
1.38
1.13
1.14
1.35
1.3
0.94
0.9
1.1
1.17
0.55
0.54
0.71
1.12
1.46
1.26
1.25
1.12
1.28
1.45
1.58
1.59
1.44
1.39
0.91
0.9
1.77
1.25
1.18
1.34
1.77
1.65
0.72
0.57
0.51
0.12
0.57
0.61
0.92
0.88
0.68
0.65
0.54
0.34
0.49
0.7
0.72
0.76
0.9
0.76
1.18
1.33
1.39
1.48
1.32
1.3
1.06
0.86
1.27
1.22
2.1
0.79
0.69
0.83
0.88
0.72
1.15
1.28
1.16
1.12
0.52
1.1
1.35
1.22
1.17
0.8
0.5
0.55
0.4
1.32
0.85
1.45
1.52
1.13
1.25
0.83
0.74
0.42
0.43
1.11
1.23
1.11
1.03
1.53
1.53
1.26
1.36
1.23
1.22
1.65
1.33
1.09
0.9
0.81
0.6
0.8
0.73
0.61
0.58
0.59
0.63
1.05
0.74
0.71
0.57
0.76
0.75
0.8
0.71
0.64
1.2
1.28
1.31
1.19
1.49
1.68
1.06
3.27
4.54
3.22
2.87
0.42
0.72
0.75
0.36
0.61
0.68
1.28
1.1
0.34
0.41
1.12
1.16
1.12
0.57
0.41
0.84
0.78
0.65
0.81
0.85
0.91
0.89
0.91
1.54
1.63
1.51
4.36
2.98
1.52
1.6
0.77
1.23
0.9
0.78
0.82
0.6
0.41
0.6
0.42
1.22
0.81
1.37
1.49
1.92
1.36
1.31
2.32
1.1
1.13
0.87
0.86
0.85
0.88
0.88
0.82
1.1
0.52
0.38
0.62
0.41
0.26
0.34
0.33
0.25
0.23
0.8
1.57
1.23
0.59
1.31
0.71
0.52
1.42
1.33
1.15
1.18
1.14
1.13
0.92
0.74
0.92
1.08
1.08
2.64
1.07
1.15
1.16
1.32
17.7
5.57
17.3
1.36
1.16
1.17
1.29
1.15
1.15
0.42
1.34
1.23
0.35
0.47
2.51
4.14
3.36
0.74
0.73
0.72
0.74
0.74
2.85
2.01
1.9
1.57
1.71
0.92
0.92
0.91
0.97
1.12
1.11
1.19
0.84
0.8
0.84
0.8
0.78
1.19
2.5
0.44
0.53
0.38
0.45
0.41
0.72
0.76
0.63
2.46
2.36
1.84
0.77
1.42
1.88
0.68
0.84
0.74
0.68
0.87
0.87
1.25
1.24
1.24
1.86
0.81
0.85
0.81
0.79
0.75
0.91
0.92
0.91
0.91
2.45
3.01
2.44
2.24
0.69
0.7
0.71
0.7
0.72
0.69
0.79
0.32
0.52
0.61
1.85
1.62
1.21
1.24
1.2
1.19
1.18
1.44
1.2
1.18
0.75
1.43
1.84
1.79
1.77
1.71
0.59
0.53
1.25
0.88
0.86
0.88
0.9
1.34
0.91
1.25
0.86
1.69
2.51
1.15
0.87
1.09
1.11
1.32
0.47
0.28
0.46
0.52
1.07
1.08
1.07
0.69
0.67
0.76
0.76
0.72
0.79
0.77
0.73
1.23
0.78
1.25
1.22
1.27
1
2.11
0.75
2.18
5.78
4.02
3.62
0.91
0.72
1.2
1.17
1.05
1.55
1.83
1.79
1.52
0.58
1.11
1.12
1.11
1.22
1.22
1.23
1.04
1.04
1.04
0.91
0.49
0.79
0.77
3.33
0.99
1.43
1.4
2.38
1.2
0.73
0.72
0.68
0.86
0.76
0.87
0.86
1.24
1.16
1.19
Distinguishing Fibrosis Stages 0-1 from Fibrosis Stages 2-4
To assess the performance of several commonly measured clinical parameters (Age, Type 2 Diabetes, BMI, HDL Cholesterol, Gender, Fructose, and Past Alcohol Use) for distinguishing fibrosis stages 0-1 samples from stages 2-4 samples, logistic regression and area under the curve (AUC) analyses were performed. The AUCs calculated for the individual clinical parameters ranged from 0.5079 for BMI to 0.6096 for Type 2 Diabetes. The data are shown in Table 13. A total of 127 combinations of these seven clinical parameters are possible, and all 127 possible combinatorial models using these clinical parameters were evaluated. The highest AUC obtained was 0.6663, and it was derived from a model that fit all seven clinical parameters.
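The count of 127 follows from taking every non-empty subset of seven parameters (2^7 - 1). The following is a minimal sketch of that enumeration only; the function name `all_nonempty_subsets` is a hypothetical helper, and the model fitting performed on each subset is not shown.

```python
from itertools import combinations

# The seven clinical parameters named in the text.
CLINICAL_PARAMETERS = [
    "Age", "Type 2 Diabetes", "BMI", "HDL Cholesterol",
    "Gender", "Fructose", "Past Alcohol Use",
]


def all_nonempty_subsets(items):
    """Return every non-empty subset of items, smallest subsets first."""
    items = list(items)
    subsets = []
    for size in range(1, len(items) + 1):
        subsets.extend(combinations(items, size))
    return subsets


# Each subset corresponds to one candidate combinatorial model.
candidate_models = all_nonempty_subsets(CLINICAL_PARAMETERS)
print(len(candidate_models))  # 2**7 - 1 = 127
```

The same enumeration applied to eight items gives 2^8 - 1 = 255 subsets.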
Logistic regression models and area under the curve (AUC) were also used to assess the performance of individual metabolites for distinguishing the fibrosis stage 0-1 samples from fibrosis stage 2-4 samples. Logistic regression analysis was performed on the measured values obtained for all 1151 metabolites detected in the samples. Metabolites with an AUC of >0.600 for distinguishing fibrosis stage 0-1 from fibrosis stage 2-4 patient samples were identified and are presented in Table 14. Of these, 114 metabolites have individual AUCs greater than the AUC of 0.6096 obtained for Type 2 Diabetes, the top clinical parameter. Further, eight metabolites, X-14662, ribose, I-urobilinogen, X-12850, malate, glutarate (pentanedioate), 2-aminoheptanoate, and X-15497, have an AUC greater than 0.6663, which is the AUC calculated from the best model using all 7 clinical parameters of Age, Type 2 Diabetes, BMI, HDL Cholesterol, Gender, Fructose, and Past Alcohol Use. The metabolites and data are listed in Table 14.
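As an illustration of what an individual AUC measures, the sketch below computes it directly from ranks. It relies on one assumption: for a logistic regression model with a single predictor, the fitted probability is a monotone function of that predictor, so the model's AUC equals the rank-based AUC of the raw values (or one minus it when the fitted coefficient is negative). The function name `rank_auc` and the sample values are hypothetical and are not data from the study.

```python
def rank_auc(scores, labels):
    """Probability that a randomly chosen positive sample scores higher
    than a randomly chosen negative sample (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample from each group")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical metabolite levels, for illustration only.
levels = [0.8, 1.5, 0.9, 1.6, 1.4, 1.0, 2.0, 1.3]
stage_2_to_4 = [0, 0, 0, 1, 1, 0, 1, 1]  # 1 = fibrosis stage 2-4

auc = rank_auc(levels, stage_2_to_4)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination between the groups, which is why values such as 0.5079 for BMI indicate a nearly uninformative parameter.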
A total of 255 combinations using X-14662, ribose, I-urobilinogen, X-12850, malate, glutarate (pentanedioate), 2-aminoheptanoate, and X-15497 (the eight metabolites with an AUC >0.6663) are possible and all 255 possible combinatorial models for separating fibrosis stage 0-1 from fibrosis stage 2-4 were evaluated. The AUCs that were calculated for each model resulting from fitting all possible model combinations of the eight metabolites range from 0.6523 to 0.7774 and the data are shown in
The metabolite biomarkers were also used to derive statistical models useful to classify the subjects according to fibrosis stage 0-1 or fibrosis stage 2-4 using Random Forest analysis. Random Forest results show that the samples were classified with 74% accuracy. The positive predictive value, which is the proportion of subjects that were truly positive (i.e., subjects with fibrosis stage 2-4) among all those classified as positive, was 84%. The “Out-of-Bag” (OOB) error rate, which gives an estimate of how accurately new observations can be predicted using the Random Forest model (e.g., whether a sample is from a subject with stage 0-1 fibrosis or stage 2-4 fibrosis), was 26% for this Random Forest. The model estimated that, when used on a new set of subjects, the identity of fibrosis stage 0-1 subjects could be predicted correctly 54% of the time and that of fibrosis stage 2-4 subjects 81% of the time.
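The quantities reported for the Random Forest classification (accuracy, error rate, predictive values, and per-group prediction rates) can all be derived from a two-by-two table of predicted versus true group. The sketch below shows those definitions. The function name `classification_summary` and the counts are hypothetical and are not the study's confusion matrix. The sketch also assumes that the error rate is the complement of accuracy, which is consistent with the reported 74% accuracy and 26% OOB error.

```python
def classification_summary(tp, fp, tn, fn):
    """Summarize a two-class confusion matrix.

    tp/fn: positive-group subjects classified as positive/negative.
    tn/fp: negative-group subjects classified as negative/positive.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        # Truly positive subjects among all those classified as positive.
        "ppv": tp / (tp + fp),
        # Truly negative subjects among all those classified as negative.
        "npv": tn / (tn + fn),
        # Fraction of each true group that is predicted correctly.
        "positive_group_correct": tp / (tp + fn),
        "negative_group_correct": tn / (tn + fp),
    }


# Hypothetical counts, for illustration only.
summary = classification_summary(tp=40, fp=10, tn=30, fn=20)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The two per-group rates can differ considerably even when overall accuracy is reasonable, as in the 54% and 81% figures reported for the two fibrosis groups.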
Based on the Random Forest variable selection procedures, the metabolites that are considered reliably significant for construction of a model or algorithm for predicting fibrosis stage 0-1 or stage 2-4 were identified and ranked by importance. The metabolites that are the most important for distinguishing the groups according to this analysis are ribose, X-14662, isoleucine, I-urobilinogen, glutarate (pentanedioate), X-12263, X-19561, 2-aminoheptanoate, X-18922, gamma-glutamylisoleucine, X-12850, 1-arachidonylglycerol, X-17145, maleate (cis-butenedioate), malate, X-21892, N-methylproline, X-12739, X-21474, threonate, X-11871, glutamate, X-15497, 1-stearoylglycerophosphoinositol, X-21659, 3-hydroxyoctanoate, 3-methylglutaconate, X-14302, X-12812, and fumarate. All but four of the metabolites identified by Random Forest analysis (X-21659, X-21474, 3-methylglutaconate, and X-12812) had individual AUC values greater than 0.6096, the AUC for the clinical parameter Type 2 Diabetes.
Distinguishing Fibrosis Stages 0-2 from Fibrosis Stages 3-4
The performance of the clinical parameters for distinguishing fibrosis stage 0-2 from stage 3-4 was assessed by logistic regression and area under the curve (AUC) analyses. The AUCs for the individual clinical parameters ranged from 0.5056 (Gender) to 0.6183 (Type 2 Diabetes), and the data are shown in Table 15. A total of 127 combinations of the seven clinical parameters are possible, and all 127 possible combinatorial models derived using these clinical parameters were evaluated. The highest AUC was 0.6686, and it was derived from a model that fit all seven clinical parameters.
Logistic regression models and area under the curve (AUC) were also used to assess how well individual metabolites distinguished the fibrosis stage 0-2 samples from fibrosis stage 3-4 samples. Logistic regression analysis was performed on the measured values obtained for all 1151 metabolites detected in the samples. Sixty-one metabolites have individual AUCs greater than the AUC of 0.6183 that was obtained for the top clinical parameter, Type 2 Diabetes. Three metabolites (gamma-tocopherol, taurocholate, and xylitol) have an individual AUC greater than 0.6686, the highest AUC that was calculated from a model obtained using all seven of the clinical parameters evaluated. The data are shown in Table 16. All possible combinatorial models for separating fibrosis stage 0-2 from fibrosis stage 3-4 using these three metabolites (gamma-tocopherol, taurocholate, and xylitol) were generated. The highest AUC, calculated using a model containing all three metabolites, was 0.7131, which is an improvement over the AUC of 0.6183 obtained using clinical parameters only.
The metabolite biomarkers were also used to derive statistical models to classify the subjects according to fibrosis stage 0-2 or fibrosis stage 3-4 using Random Forest analysis. The Random Forest results show that the samples were classified with 70% accuracy. The negative predictive value, which is the proportion of subjects that were truly negative (i.e., subjects with fibrosis stage 0-2) among all those classified as negative, was 79%. The “Out-of-Bag” (OOB) error rate, which gives an estimate of how accurately new observations can be predicted using the Random Forest model (e.g., whether a sample is from a subject with stage 0-2 fibrosis or stage 3-4 fibrosis), was 30%. The model estimated that, when used on a new set of subjects, the identity of fibrosis stage 0-2 subjects could be predicted correctly 81% of the time and that of fibrosis stage 3-4 subjects 36% of the time.
Based on the Random Forest variable selection procedures, the biomarker compounds that are considered reliably significant for construction of a model or algorithm for predicting fibrosis stage 0-2 or stage 3-4 were identified and ranked by importance. The biomarkers that are the most important for distinguishing the groups according to this analysis are 1,5-anhydroglucitol (1,5-AG), glycocholate, I-urobilinogen, cys-gly (oxidized), taurochenodeoxycholate, taurocholate, 16-hydroxypalmitate, xylitol, X-12812, gamma-tocopherol, X-12850, fructose, X-14662, glucose, X-17453, fucose, mannose, glycochenodeoxycholate, X-11871, palmitoyl-palmitoyl-glycerophosphocholine, X-14658, imidazole-propionate, X-12093, X-14302, 2-hydroxyglutarate, X-12263, cysteine-glutathione-disulfide, tartronate (hydroxymalonate), aspartylleucine, and glutarate (pentanedioate). All but four of the metabolites identified by Random Forest analysis (fructose, X-12093, 2-hydroxyglutarate, X-12263) had individual AUC values greater than 0.6183, the AUC for the clinical parameter Type 2 Diabetes.
Distinguishing Fibrosis Stages 0-1 from Fibrosis Stages 3-4
To assess the performance of the clinical parameters (Age, Type 2 Diabetes, BMI, HDL Cholesterol, Gender, Fructose, and Past Alcohol Use) for distinguishing fibrosis stages 0-1 from stages 3-4, logistic regression and area under the curve (AUC) analyses were performed. The AUCs for the individual clinical parameters ranged from 0.4939 (BMI) to 0.6698 (Type 2 Diabetes), and the data are presented in Table 17. A total of 127 combinations of these seven clinical parameters are possible, and all 127 possible combinatorial models using these clinical parameters were evaluated. The highest AUC was 0.7217, and it was derived from a model that fit all seven clinical parameters.
Logistic regression models and area under the curve (AUC) were also used to assess the performance of individual metabolites for distinguishing the fibrosis stage 0-1 samples from fibrosis stage 3-4 samples. Logistic regression analysis was performed on the measured values obtained for all 1151 metabolites detected in the samples. The analysis identified fifty-three metabolites with an individual AUC greater than 0.6689, which was the AUC for the top clinical parameter, Type 2 Diabetes. Seven metabolites (X-14662, I-urobilinogen, X-12850, glutarate (pentanedioate), xylitol, X-11871, and X-11537) had an AUC greater than 0.7217, which is the AUC calculated from the model using all seven clinical parameters of Age, Type 2 Diabetes, BMI, HDL Cholesterol, Gender, Fructose, and Past Alcohol Use. The data are shown in Table 18. All of the 127 possible combinatorial models for separating fibrosis stage 0-1 from fibrosis stage 3-4 using X-14662, I-urobilinogen, X-12850, glutarate (pentanedioate), xylitol, X-11871, and X-11537 (the seven metabolites with an AUC >0.7217) were generated. The AUCs calculated for the models resulting from fitting all possible combinations of the seven metabolites ranged from 0.7296 to 0.8788, and 89 of the models have an AUC greater than 0.8. The data are shown in
In another example, serum samples from 200 subjects spanning the spectrum of nonalcoholic fatty liver disease from NAFLD to NASH, including 181 subjects classified as having NASH and 19 subjects classified as not having NASH (i.e., the non-NASH subjects were classified as NAFLD or borderline NASH), were analyzed. Levels of metabolites, measured in μM, were determined in the samples using TRUEMASS complex lipid panel analysis.
The statistical significance and predictive performance of individual metabolites detected in the samples to determine the presence or absence of NASH in these subjects was assessed using logistic regression with Chi-square analysis and AUC calculations. Welch's two-sample t-tests were used to compare the metabolite levels in samples collected from subjects with NASH compared to the levels measured in samples collected from subjects without NASH. Logistic regression models and AUC assessed how well individual metabolites discriminated the NASH and non-NASH groups. Statistical analyses were performed using the measured values obtained for all lipid metabolites detected in the sample. The metabolites useful for distinguishing NASH from non-NASH patient samples are presented in Table 19. The Chi-square p-value is <0.1 and the AUC is >0.5 for all of the metabolites. Table 19 includes, for each metabolite, the lipid class of the metabolite, the metabolite name, the p-value determined in the logistic regression and Chi-square analysis of NASH samples compared to non-NASH samples, the AUC, and the direction of change (DOC) of the metabolite level in NASH samples compared to non-NASH samples.
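For reference, Welch's two-sample t-test compares two group means without assuming equal variances. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t` and the sample values are hypothetical and are not study data. Converting the statistic to a p-value requires the t distribution, which the standard library does not provide, so that step is omitted.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group
    # (sample variance uses the n - 1 denominator).
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical metabolite levels (in uM), for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 4), round(dof, 4))
```

The sign of the statistic indicates which group has the higher mean level, which corresponds to the direction of change reported for each metabolite.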
In another example, serum samples from 200 subjects spanning the spectrum of nonalcoholic fatty liver disease from NAFLD to fibrosis, including 150 subjects classified as having fibrosis and 50 subjects classified as not having fibrosis (i.e., the non-fibrosis subjects were classified as having NAFLD, borderline NASH, or NASH) were analyzed. Levels of metabolites, measured in μM, were determined in the samples using TRUEMASS complex lipid panel analysis.
The statistical significance and predictive performance of the individual metabolites detected in the samples to determine the presence or absence of fibrosis in these subjects were assessed using logistic regression with Chi-square analysis and AUC calculations. Welch's two-sample t-tests were used to compare the metabolite levels in samples collected from subjects with fibrosis to the levels measured in samples collected from subjects without fibrosis. Logistic regression models and AUC were used to assess how well individual metabolites discriminated the fibrosis and non-fibrosis groups. Logistic regression and Chi-square analyses were performed using the measured values obtained for all lipid metabolites detected in the sample. The metabolites useful for distinguishing fibrosis from non-fibrosis patient samples are presented in Table 20. The Chi-square p-value is <0.1 and the AUC is >0.5 for all of the metabolites. Table 20 includes, for each metabolite, the lipid class of the metabolite, the metabolite name, the p-value determined in the logistic regression and Chi-square analysis of fibrosis samples compared to non-fibrosis samples, the AUC, and the direction of change (DOC) of the metabolite level in fibrosis samples compared to non-fibrosis samples.
This application claims the benefit of U.S. Provisional Patent Application No. 62/081,903, filed Nov. 19, 2014, and U.S. Provisional Patent Application No. 62/141,494, filed Apr. 1, 2015, the entire contents of which are hereby incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US15/61215 | 11/18/2015 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62081903 | Nov 2014 | US | |
62141494 | Apr 2015 | US |