Nonalcoholic fatty liver disease (NAFLD) can be a cause of chronic liver disease, which can affect between 80 and 100 million individuals in the United States. This disease can be benign, aggressive, or harmful from a liver perspective and can be associated with cardiometabolic outcomes. In a nonalcoholic fatty liver, excess fat can accumulate in the liver cells. Such a buildup of fat in the liver can induce inflammation and damage to the liver, resulting in nonalcoholic steatohepatitis (NASH). NAFLD and NASH can lead to cirrhosis and hepatocellular carcinoma and can become indications for liver transplantation in adults and children. Currently, no approved pharmacologic treatment for NASH is available.
Certain existing methods can require multiple clinical tests to screen NAFLD/NASH patients. Furthermore, while certain tests can be ordered by liver specialists, the burden of the disease is not necessarily placed under the care of liver specialists. Accordingly, there remains a need for improved techniques that can identify patients at risk for NAFLD and NASH from data that can be readily and routinely acquired from patients to facilitate access to appropriate care.
The disclosed subject matter provides systems and methods for identifying nonalcoholic fatty liver disease (NAFLD) and nonalcoholic steatohepatitis (NASH) in patients using clinical data available in the electronic health record. An example system can include one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors. The storage media can store instructions to cause the system to select at least one patient with a risk indicator using an electronic health record (EHR) database, determine that the at least one patient fails to meet exclusion criteria, and display the at least one patient in response to the determination. In example embodiments, the disclosed risk factor can be associated with NAFLD and/or NASH. The risk factor can include demographic data (e.g., age, sex, etc.), diagnosis codes, procedure codes, laboratory measurements, medication history, pathology codes, radiology codes, or combinations thereof. For example, the risk factor can include patient data related to type 2 diabetes, obesity, abnormal liver enzymes, hyperlipidemia, hypertension, chronic nonalcoholic liver disease, nonalcoholic steatohepatitis, steatosis, cirrhosis, and combinations thereof.
In certain embodiments, the disclosed system can assess exclusion criteria for screening patients. The exclusion criteria can include demographic data, diagnosis codes, procedure codes, laboratory measurements, medication history, pathology codes, radiology codes, or combinations thereof. For example, the exclusion criteria can include patient data related to alcohol use/abuse, type 1 diabetes, viral hepatitis infection, HIV infection, age, or combinations thereof.
In certain embodiments, the disclosed system can be configured to verify hepatic steatosis of the at least one patient using a radiology report and/or a pathology report. In some embodiments, the disclosed radiology report can include an ultrasound report, a CT scan report, an MRI report, or combinations thereof.
In certain embodiments, the disclosed system can be further configured to determine that the patient receives a weight-loss surgery. The disclosed weight-loss surgery can include a laparoscopy procedure, a gastric restrictive procedure, a bariatric procedure, a bariatric revision, or combinations thereof.
In certain embodiments, the disclosed system can be further configured to determine that the at least one patient has an end-stage liver-related outcome. The end-stage liver-related outcome can include portal hypertension, hepatorenal syndrome, primary bacterial peritonitis, ascites, complications of transplanted liver, hepatic encephalopathy, cirrhosis, hepatocellular carcinoma, hepatopulmonary syndrome, hepatic failure, esophageal varices, esophagogastroduodenoscopy, or combinations thereof.
In certain embodiments, the disclosed system can perform a quality control by excluding a patient who has fewer than two risk factors or fewer than three occurrences of the risk factors.
In certain embodiments, an example method for diagnosing NAFLD/NASH patients can include selecting at least one patient with a risk indicator using an EHR database, determining that the at least one patient fails to meet exclusion criteria, and displaying the at least one patient in response to the determination. The risk indicator can be associated with NAFLD and/or NASH. In some embodiments, the example method can further include verifying hepatic steatosis of the at least one patient using a radiology report and/or a pathology report. In some embodiments, the example method can further include performing a quality control by excluding a patient who has fewer than two risk indicators or fewer than three occurrences of the risk indicator. In certain embodiments, the example method can further include determining that the at least one patient receives a weight-loss surgery. In some embodiments, the example method can further include determining that the at least one patient has an end-stage liver-related outcome.
Further features and advantages of the present disclosure will become apparent from the following detailed description taken in conjunction with the accompanying figures showing illustrative embodiments of the present disclosure, in which:
Throughout the figures, the same reference numerals and characters, unless otherwise stated, are used to denote like features, elements, components or portions of the illustrated embodiments. Moreover, while the present disclosure will now be described in detail with reference to the figures, it is done so in connection with the illustrative embodiments.
The disclosed subject matter provides techniques for diagnosing nonalcoholic fatty liver disease (NAFLD) and nonalcoholic steatohepatitis (NASH) in patients. The disclosed subject matter can assess various data that can be readily and routinely acquired from patients for predicting risks of NAFLD and NASH, thereby tailoring the need for additional clinical testing in certain risk populations.
As shown
In certain embodiments, the disclosed system can be configured to select at least one patient with a risk indicator 104. The risk indicator can be associated with a target disease or symptom. The target disease/symptom associated indicator can include a diagnosis code, a procedure code, a laboratory measurement, a medication history, a pathology code, a radiology code, demographic data and combinations thereof. For example, certain risk indicators can be associated with NAFLD and/or NASH. The NAFLD/NASH associated risk indicators can include patient data related to type 2 diabetes (e.g., hemoglobin A1C≥5.7), obesity (e.g., body mass index≥30), abnormal liver enzymes (e.g., alanine aminotransferase≥40), hyperlipidemia (e.g., total cholesterol≥200 or low-density lipoproteins≥130), hypertension, chronic nonalcoholic liver diseases, nonalcoholic steatohepatitis, steatosis, cirrhosis, or combinations thereof.
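The example thresholds above can be sketched as a simple rule check. The following Python sketch is illustrative only: the `Labs` container, its field names, and the indicator labels are hypothetical, and the units (hemoglobin A1C in percent, body mass index in kg/m^2, alanine aminotransferase in U/L, cholesterol in mg/dL) are assumptions, since the thresholds are stated without units.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Labs:
    """Hypothetical container for one patient's measurements (units assumed)."""
    hba1c: Optional[float] = None              # percent (assumed)
    bmi: Optional[float] = None                # kg/m^2 (assumed)
    alt: Optional[float] = None                # U/L (assumed)
    total_cholesterol: Optional[float] = None  # mg/dL (assumed)
    ldl: Optional[float] = None                # mg/dL (assumed)


def lab_risk_indicators(labs: Labs) -> Set[str]:
    """Return the risk indicators met, using the example cutoffs from the text."""
    found: Set[str] = set()
    if labs.hba1c is not None and labs.hba1c >= 5.7:
        found.add("type_2_diabetes")
    if labs.bmi is not None and labs.bmi >= 30:
        found.add("obesity")
    if labs.alt is not None and labs.alt >= 40:
        found.add("abnormal_liver_enzymes")
    # Hyperlipidemia: total cholesterol >= 200 or low-density lipoproteins >= 130.
    if (labs.total_cholesterol is not None and labs.total_cholesterol >= 200) or (
        labs.ldl is not None and labs.ldl >= 130
    ):
        found.add("hyperlipidemia")
    return found
```

Missing measurements are treated as "not met" rather than as errors, which is one possible design choice for sparse record data.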
In certain embodiments, the disclosed system can be configured to select the at least one patient using a database. The database can be public or private. For example, an exemplary system can obtain patient data (e.g., risk indicators) from an electronic health record (EHR) database. In some embodiments, the database can be private. The private database can include protected health information and may not be publicly available. In some embodiments, the disclosed database can be obtained from any medical centers, institutions, and/or hospitals.
In certain embodiments, the disclosed system can be configured to identify patients who meet exclusion criteria 105. The exclusion criteria can include a diagnosis code, a procedure code, a laboratory measurement, a medication history, a pathology code, a radiology code, demographic data and combinations thereof. For example, certain exclusion criteria can include patient data related to alcohol abuse, type 1 diabetes, viral hepatitis infection, HIV infection, age (e.g., ≤18), or combinations thereof. In some embodiments, the disclosed system can be configured to deselect/remove the patients who meet the exclusion criteria from the selected patients with the risk indicator 105.
In certain embodiments, the disclosed system can be configured to verify hepatic steatosis of the selected patients 106. Hepatic steatosis can be verified by histologic description based on pathologist review of liver biopsies contained within clinical reports or imaging modalities that incorporate signal detection that has been associated with the presence of intrahepatic fat. For example, increased echogenicity within an abdominal ultrasound report (with appropriate exclusion criteria) can be correlated with intrahepatic fat. In some embodiments, the verification process can be performed using a radiology report and/or a pathology report. For example, the radiology report can include an ultrasound report, a CT scan report, an MRI report, or combinations thereof. The pathology report can include reports obtained via liver biopsy for NASH, NAFLD, steatosis, steatohepatitis, fatty liver, or cirrhosis.
In certain embodiments, the disclosed system can be configured to perform a quality control process by excluding a patient who has fewer than two risk factors or fewer than three occurrences of a single risk indicator. Certain electronic health records can include errors that can range from data entry errors to incorrect code usage. To reduce the chance of errors and the false positive rate, the process can require patients to have at least two distinct risk factors (e.g., a diagnosis of hypertension and a diagnosis of obesity) or three occurrences of a single risk indicator (i.e., the patient was diagnosed with a risk indicator on 3 different medical visits).
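The select-then-exclude step described above can be sketched as a filter over patient records. This is a minimal sketch under stated assumptions: the record layout (`id`, `age`, `risk_indicators`, `conditions`), the condition labels, and the function names are hypothetical, and the age rule uses the example cutoff (≤18) given in the text.

```python
from typing import Dict, List

# Hypothetical labels for the example exclusion conditions named in the text.
EXCLUDING_CONDITIONS = {"alcohol_abuse", "type_1_diabetes", "viral_hepatitis", "hiv"}


def meets_exclusion(patient: Dict) -> bool:
    """True if the patient meets any one exclusion criterion."""
    too_young = patient["age"] <= 18  # example age cutoff from the text
    has_excluding_condition = bool(EXCLUDING_CONDITIONS & set(patient["conditions"]))
    return too_young or has_excluding_condition


def select_cohort(patients: List[Dict]) -> List[str]:
    """Keep patients with at least one risk indicator who meet no exclusion criterion."""
    return [
        p["id"]
        for p in patients
        if p["risk_indicators"] and not meets_exclusion(p)
    ]
```

A usage example with three made-up records:

```python
patients = [
    {"id": "A", "age": 54, "risk_indicators": ["obesity"], "conditions": []},
    {"id": "B", "age": 47, "risk_indicators": ["hypertension"], "conditions": ["hiv"]},
    {"id": "C", "age": 61, "risk_indicators": [], "conditions": []},
]
print(select_cohort(patients))  # only patient A has a risk indicator and no exclusion
```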
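The quality control rule can be sketched as a count over diagnosis events. In this illustrative Python sketch, each event is a hypothetical `(risk_factor, visit_id)` pair; the function name and event layout are assumptions and are not taken from the disclosure.

```python
from collections import Counter
from typing import Iterable, Tuple


def passes_qc(events: Iterable[Tuple[str, str]]) -> bool:
    """Pass if there are >= 2 distinct risk factors, or one risk factor
    recorded on >= 3 different visits.

    Each event is a (risk_factor, visit_id) pair; repeated entries of the
    same factor on the same visit are counted once.
    """
    unique_events = set(events)
    visits_per_factor = Counter(factor for factor, _visit in unique_events)
    has_two_factors = len(visits_per_factor) >= 2
    has_three_visits = any(n >= 3 for n in visits_per_factor.values())
    return has_two_factors or has_three_visits
```

Deduplicating on the visit identifier reflects the requirement that the three occurrences come from different medical visits.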
In certain embodiments, the disclosed system can be configured to identify patients with a weight-loss surgery 107. The identification of patients with a weight-loss surgery can be performed independently from portions of the method, and can be a continuation of an example illustrated in
In certain embodiments, the disclosed system can be configured to identify patients with an end-stage liver outcome 108. The end-stage liver outcome can include patient data related to Model for End Stage Liver Disease (MELD) score, portal hypertension, hepatorenal syndrome, primary bacterial peritonitis, ascites, complications of transplanted liver, hepatic encephalopathy, cirrhosis, hepatopulmonary syndrome, hepatic failure, esophageal varices, esophagogastroduodenoscopy, or combinations thereof. The identification of patients with an end-stage liver outcome 108 can be performed independently from other portions of the method, and can be a continuation of an example illustrated in
In certain embodiments, the MELD score can be calculated to stratify patients by expected mortality and to assess decompensated liver disease with regard to liver transplantation. The formula for calculating a MELD score can be:
MELD = 10 × (0.957 × ln(Creatinine) + 0.378 × ln(Bilirubin) + 1.12 × ln(INR)) + 6.43    (1)
For the calculation, laboratory measurements (e.g., creatinine, bilirubin, and INR) taken at least one year following weight-loss surgery for each patient can be extracted. The measurements (e.g., creatinine, bilirubin, and INR) can be taken within 30 days of each other, and the maximum value for each measurement type can be selected. MELD scores can then be calculated per patient using this information. Table 1 below lists the measurement codes used for the MELD score calculation.
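As a worked illustration, formula (1) and the maximum-value selection can be sketched in Python. The function names and the dictionary layout are hypothetical. The sketch applies the formula exactly as written and does not add the value bounds or rounding that some clinical calculators apply, since the text does not describe them.

```python
import math
from typing import Dict, List


def meld_score(creatinine: float, bilirubin: float, inr: float) -> float:
    """Formula (1): 10*(0.957*ln(Cr) + 0.378*ln(Bili) + 1.12*ln(INR)) + 6.43."""
    return 10 * (
        0.957 * math.log(creatinine)
        + 0.378 * math.log(bilirubin)
        + 1.12 * math.log(inr)
    ) + 6.43


def meld_from_window(measurements: Dict[str, List[float]]) -> float:
    """Score one patient from measurements taken within the same 30-day window,
    using the maximum value of each measurement type."""
    return meld_score(
        max(measurements["creatinine"]),
        max(measurements["bilirubin"]),
        max(measurements["inr"]),
    )
```

When all three inputs equal 1.0, every logarithm is zero and the score reduces to the constant term, 6.43.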
In certain embodiments, the disclosed system can be further configured to identify patients with advanced fibrosis. For example, a non-biopsied patient group can be scored using Fibrosis-4 (FIB-4), AST to Platelet Ratio Index (APRI), and NAFLD Fibrosis Score (NAFLD-FS) calculations to discern patients with advanced fibrosis. FIB-4, APRI, and NAFLD-FS can be obtained using the following metrics:
These noninvasive scoring techniques have been applied to chronic liver disease, including NAFLD, to assist with the determination of degrees of fibrosis based on commonly available clinical data.
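For illustration, the three scores can be sketched in Python using their commonly published forms. These formulas, coefficients, and units are assumptions drawn from the general literature rather than from this text, and the function and parameter names are hypothetical; AST and ALT are assumed in U/L, platelet count in 10^9/L, albumin in g/dL, and BMI in kg/m^2.

```python
import math


def fib4(age: float, ast: float, alt: float, platelets: float) -> float:
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age * ast) / (platelets * math.sqrt(alt))


def apri(ast: float, ast_upper_limit: float, platelets: float) -> float:
    """APRI = (AST / upper limit of normal AST) / platelets x 100."""
    return (ast / ast_upper_limit) / platelets * 100


def nafld_fs(age: float, bmi: float, diabetes_or_ifg: bool,
             ast: float, alt: float, platelets: float, albumin: float) -> float:
    """NAFLD Fibrosis Score in its commonly published linear form."""
    return (
        -1.675
        + 0.037 * age
        + 0.094 * bmi
        + 1.13 * (1 if diabetes_or_ifg else 0)
        + 0.99 * (ast / alt)
        - 0.013 * platelets
        - 0.66 * albumin
    )
```

For instance, `fib4(60, 80, 64, 150)` evaluates to (60 × 80) / (150 × 8) = 4.0.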
The presently disclosed subject matter will be better understood by reference to the following Example. The Example is provided as merely illustrative of the disclosed methods and systems, and should not be considered as a limitation in any way.
Among other features, the example illustrates the identification of patients with NAFLD and NASH within large electronic health record (EHR) databases for targeted intervention based on clinically relevant phenotypes.
This example considered the rapid identification of patients with NAFLD and NASH using EHRs from 6.4 million adult patients. Structured medical record data (diagnoses, medications, procedures, and demographics) were standardized by mapping to the Observational Medical Outcomes Partnership (OMOP) common data model and stored in MySQL. The example was semi-automated, guided by clinical validation and involved selecting patients with NAFLD risk indicators, removing patients meeting exclusion criteria, and machine confirmation of language indicators of hepatic steatosis. SQL queries were made on the structured data as follows.
First, NAFLD patients were identified using two criteria: presence of a NAFLD risk indicator or presence of a NAFLD diagnosis code. Patients only needed to be diagnosed with 1 risk indicator or NAFLD diagnosis code for cohort inclusion. NAFLD risk indicators include diagnosis of the following: type 2 diabetes (Table 2), obesity (Table 3), abnormal liver enzymes (Table 4), hyperlipidemia (Table 5), or hypertension (Table 6). Diagnosis codes used by the algorithm along with selection criteria for the NAFLD risk indicators are listed in Tables 2-6. Each table lists the OMOP name and code id along with the specific diagnostic code and code type. The criterion for inclusion for ICD 9/10 diagnoses was 1 diagnosis (dx). Laboratory measures (code type=LOINC) can list appropriate cutoffs for cohort inclusion. 833,379 patients with NAFLD risk indicators were identified. The NAFLD diagnosis codes used for patient selection are listed in Table 7. For the ICD 9/ICD 10 codes, patients with 1 diagnosis of the specified code were included in the cohort. For laboratory measurements (LOINC code), cutoff values for cohort inclusion are listed in these tables. 47,054 patients were identified with NAFLD diagnosis codes. 842,791 total unique patients were identified.
Following the identification of potential NAFLD patients, patients meeting specified exclusion criteria were removed. The exclusion criteria include demonstrated alcohol use, diagnosis of HIV, viral hepatitis, type 1 diabetes, and other contributing factors that can result in hepatic steatosis or abnormal liver biochemistries. Patients on medications associated with hepatic steatosis were also excluded. All patient exclusion criteria are listed in Tables 8-13. The exclusion criteria include the following: alcohol exclusions (Table 8), viral hepatitis exclusions (Table 9), HIV exclusions (Table 10), type 1 diabetes exclusions (Table 11), other excluding diagnoses (Table 12), and medication exclusions (Table 13). Patients meeting any one exclusion criterion were removed from the cohort. 217,969 patients were excluded from the cohort. Patients who tested positive for hepatitis and/or HIV (e.g., results reported as Positive, Reactive, Detected, Repeatedly Reactive, Confirmed, or Indicated) were excluded from the cohort. For tests assessing viral load, patients with values above the baseline for detection were excluded.
The application of the exclusions shown in Tables 8-13 produced a cohort of 624,822 potential NAFLD patients. Radiology and pathology reports (unstructured data) from 1980-2016 were used to verify hepatic steatosis in these patients. A regular expression entity-tagging approach was used to identify key words along with the usage context of these key terms. For example, the regular expression entity-tagging approach can start by finding similarities or patterns among textual data that can be then generalized to build regular expressions. In certain embodiments, the regular expression entity-tagging approach can start by supplying keyword patterns which can be then evaluated, transformed or modified until satisfying predefined terminology.
Table 14 lists various radiological modalities and the key words that were queried in the respective reports. Table 15 specifies the key terms used to identify hepatic steatosis from pathology reports obtained via liver biopsy. Hepatic steatosis was verified for 20,291 patients using this approach.
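A keyword-plus-context check of this kind can be sketched with Python's `re` module. The keyword list and the negation cues below are hypothetical stand-ins, not the terms of Tables 14 and 15, and the sentence-scoped negation test is a simplifying assumption rather than the approach actually used in the example.

```python
import re

# Hypothetical key terms; the actual terms are those listed in Tables 14 and 15.
STEATOSIS = re.compile(
    r"\b(hepatic steatosis|steatohepatitis|fatty liver|fatty infiltration|"
    r"increased echogenicity)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues, searched only shortly before the key term.
NEGATION = re.compile(r"\b(no|not|without|negative for)\b[^.;]{0,40}$", re.IGNORECASE)


def mentions_steatosis(report: str) -> bool:
    """True if a key term appears in a sentence without a preceding negation cue."""
    for match in STEATOSIS.finditer(report):
        # Usage context: text from the start of the current sentence to the term.
        sentence_start = max(
            report.rfind(".", 0, match.start()),
            report.rfind(";", 0, match.start()),
        ) + 1
        preceding = report[sentence_start:match.start()]
        if not NEGATION.search(preceding):
            return True
    return False
```

With this sketch, "No evidence of hepatic steatosis." is rejected because a negation cue precedes the term in the same sentence, whereas "No focal lesion. Fatty liver is present." is accepted because the cue sits in an earlier sentence.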
To reduce EHR diagnosis code errors, quality control (QC) measures were employed requiring patients to have ≥2 risk factors or at least three occurrences of a given risk factor diagnosis. From the 20,291 patients with verified hepatic steatosis, 4,231 patients who were under the age of 18 or who failed the QC check were removed from the cohort. This produced a final yield of 16,060 NAFLD patients with 170 of these patients having a biopsy-proven diagnosis of NASH, the advanced phenotype of NAFLD. NASH was verified through histologic confirmation from liver biopsies.
Clinical outcomes can be predicted by fibrosis stages. Liver biopsies are sensitive techniques for detecting fibrosis stages but can be underutilized due to their invasive nature. To identify patients with higher risk features for clinically significant outcomes, noninvasive scoring systems were used to stratify patients by fibrosis stages. Here, to identify additional patients who can be at risk for developing advanced fibrosis due to NAFLD, three common fibrosis scoring metrics were applied to the 15,890 patients without histology. These metrics include the Fibrosis-4 (FIB-4) calculation, the AST to Platelet Ratio Index (APRI) calculation, and the NAFLD Fibrosis score. Data required for these calculations were extracted from each patient's clinical records. For each required variable, the mean of all measures within 1 year of the date of verified hepatic steatosis was used. For example, given a patient with verified hepatic steatosis on Jun. 20, 2017, the ALT value used in the scoring metric was the mean of all available ALT measures from Jun. 20, 2016 to Jun. 20, 2018. R was used to calculate fibrosis scores for each of the 15,890 patients. Patients who exhibited a score suggestive of advanced fibrosis using at least two of the metrics were selected.
16,060 NAFLD patients were identified, with 170 having a biopsy-proven NASH diagnosis. Fibrosis scoring was performed on 15,890 patients without histology; 943 exhibited a score suggestive of advanced fibrosis (FIB-4>3.25, APRI>1.0, NAFLD FS>0.675) in ≥2 of the scoring metrics. Chart review of 100 random individuals verified 92 NAFLD patients as correctly identified by the algorithm, a positive predictive value of 92%.
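The averaging window and the two-of-three selection rule can be sketched as follows. The example itself used R; this Python sketch is only an illustration, its function names and data layout are hypothetical, and the cutoffs are the ones reported with the results of this example (FIB-4>3.25, APRI>1.0, NAFLD FS>0.675).

```python
from datetime import date
from statistics import mean
from typing import Iterable, Optional, Tuple


def shift_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def windowed_mean(measurements: Iterable[Tuple[date, float]],
                  index_date: date) -> Optional[float]:
    """Mean of all values dated within 1 year before or after index_date."""
    start, end = shift_years(index_date, -1), shift_years(index_date, 1)
    values = [value for day, value in measurements if start <= day <= end]
    return mean(values) if values else None


def suggests_advanced_fibrosis(fib4: Optional[float], apri: Optional[float],
                               nafld_fs: Optional[float]) -> bool:
    """True if at least two of the three scores exceed their cutoffs."""
    cutoffs = ((fib4, 3.25), (apri, 1.0), (nafld_fs, 0.675))
    exceeded = sum(1 for score, cutoff in cutoffs
                   if score is not None and score > cutoff)
    return exceeded >= 2
```

For a verification date of Jun. 20, 2017, `windowed_mean` keeps measurements dated Jun. 20, 2016 through Jun. 20, 2018 inclusive; treating a missing score as "not exceeded" is a design choice of this sketch.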
In sum, NASH patients at highest risk for progressing to end-stage liver disease were identified with data commonly found in the EHR. This work highlights the use of the disclosed semi-automated algorithm in identifying NAFLD and NASH with clinical sensitivity.
In addition to the various embodiments depicted and claimed, the disclosed subject matter is also directed to other embodiments having other combinations of the features disclosed and claimed herein. As such, the particular features presented herein can be combined with each other in other manners within the scope of the disclosed subject matter such that the disclosed subject matter includes any suitable combination of the features disclosed herein.
The foregoing description of specific embodiments of the disclosed subject matter has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosed subject matter to those embodiments disclosed.
It will be apparent to those skilled in the art that various modifications and variations can be made in the methods and systems of the disclosed subject matter without departing from the spirit or scope of the disclosed subject matter. Thus, it is intended that the disclosed subject matter include modifications and variations that are within the scope of the appended claims and their equivalents.
This application is a continuation of International Patent Application No. PCT/US2020/047947 filed Aug. 26, 2020, which claims priority to U.S. Provisional Application No. 62/891,748, which was filed on Aug. 26, 2019, the entire contents of which are incorporated by reference herein.
Number | Date | Country | |
---|---|---|---|
62891748 | Aug 2019 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/US2020/047947 | Aug 2020 | US |
Child | 17679707 | US |