This invention relates to a method of detecting and characterizing liver diseases in a subject by isolating and analyzing circulating epithelial cells (CECs).
Liquid biopsy refers to sampling cellular material that originated from a solid organ and has entered the bloodstream. Circulating epithelial cells (CECs) can be detected by liquid biopsy in the setting of localized cancer (Stott S L, et al. Sci Transl Med 2010; 2:25ra23; Lucci A, et al. Lancet Oncol 2012; 13:688-95) and even preneoplastic pancreatic lesions (Rhim A D, et al. Gastroenterology 2014; 146:647-51; Franses J W, et al. Oncologist 2017), suggesting that their presence is not exclusive to carcinogenesis.
Isolating CECs is a technological challenge due to their rarity in the bloodstream and the variable expression of antigens used for cell capture. For example, the EpCAM-dependent Veridex platform yielded hepatocellular carcinoma (HCC) CEC detection rates of only 35% and 410% in two independent studies (Kelley R K, et al. BMC Cancer 2015; 15:206; Sun Y F, et al. Hepatology 2013; 57:1458-68). To overcome this limitation, an antigen-agnostic cell sorting device called the iChip has been developed, which isolates CECs while preserving cell viability and high-quality RNA content. The iChip device has previously been combined with an RNA signature based on established liver-specific markers to create an assay for the enrichment and detection of CECs in HCC (Kalinich M, et al. Proc Natl Acad Sci USA 2017; 114:1123-1128).
Other approaches to non-invasive diagnosis of HCC have been unsuccessful in achieving a high detection rate. For example, a recent study has shown that detection of HCC by combining cell-free DNA and protein blood-based biomarkers yielded an accuracy of only 44% for predicting HCC, likely due to the lack of common recurrent mutations and specific protein markers inherent to HCC (see Cohen J D, et al. Science 2018).
Another challenge in the diagnosis of certain liver diseases by using a non-invasive method is that CECs may be present in two different diseases such that quantitative analysis of CECs may not provide information necessary to distinguish between the two diseases.
To date, there is no non-invasive blood-based method available for accurately detecting liver diseases such as HCC, or distinguishing between different liver diseases or between different stages of liver diseases in subjects with chronic liver disease (CLD).
Therefore, there is a need for a non-invasive method for detecting the presence of liver diseases such as HCC and determining stages of liver diseases in CLD patients with high accuracy.
The present invention is based, at least in part, on the discovery that hepatic CECs (hCECs) are not exclusive to carcinogenesis, but also can be present in subjects having non-cancer diseases or conditions such as chronic liver disease (CLD). Furthermore, the present invention is based, at least in part, on the discovery that the hCECs in subjects with CLD can be analyzed quantitatively or qualitatively to accurately detect the presence of cancer such as hepatocellular carcinoma (HCC) and/or to accurately characterize the different stages (e.g., early or late stages) of liver diseases or conditions such as liver fibrosis.
In one aspect, the present invention relates to methods of measuring expression levels of hepatocellular carcinoma (HCC) classifier genes in circulating epithelial cells (CECs) of subjects, where the HCC classifier genes include one or more of TESC, OSBP2, SLC6A8, SEPT5, F2RL3, E2F1, EZH2, CDC20, CCNA2, CCNB1, PLXNB3, CDC6, MYBL2, APOBEC3B, SPP1, AKR1B10, TOP2A, ASPM, SLC6A9, RECQL4, NUSAP1, PLVAP, FMO1, PDZK1IP1, and FBXO32.
In some embodiments, the HCC classifier genes consist of one or more of TESC, OSBP2, SLC6A8, SEPT5, F2RL3, E2F1, EZH2, CDC20, CCNA2, CCNB1, PLXNB3, CDC6, MYBL2, APOBEC3B, SPP1, AKR1B10, TOP2A, ASPM, SLC6A9, RECQL4, NUSAP1, PLVAP, FMO1, PDZK1IP1, and FBXO32.
In some embodiments, the HCC classifier genes consist of TESC, OSBP2, SLC6A8, SEPT5, F2RL3, E2F1, EZH2, CDC20, CCNA2, CCNB1, PLXNB3, CDC6, MYBL2, APOBEC3B, SPP1, AKR1B10, TOP2A, ASPM, SLC6A9, RECQL4, NUSAP1, PLVAP, FMO1, PDZK1IP1, and FBXO32.
In some embodiments, the HCC classifier genes also include one, two, three or more additional genes selected from the group consisting of ACTG2, ADM2, AFP, AGR2, ALDH3A1, ALPK3, AMIGO3, ANKRD65, ANLN, AP1M2, ARHGAP11A, ARHGEF39, ASF1B, ASPHD1, AURKA, AXIN2, BAIAP2L2, BEX2, C15orf48, C1orf106, C1QTNF3, C6orf223, CA12, CA9, CAMK2N2, CAP2, CBX2, CCDC170, CCDC28B, CCDC64, CCNE2, CCNF, CD109, CD34, CDC25A, CDC7, CDCA5, CDCA8, CDH13, CDK1, CDKN2A, CDKN2C, CDT1, CELF6, CENPF, CENPH, CENPL, CENPU, CENPW, CKB, CNNM1, COL15A1, COL4A5, COL7A1, COL9A2, CRIP3, CSPG4, CTNND2, CXorf36, CYP17A1, DLK1, DMKN, DSCC1, DTL, DUOX2, ECT2, EEF1A2, EFNA3, EPHB2, EPPK1, ETV4, FABP4, FAM111B, FAM3B, FAM83D, FANCD2, FANCI, FBXL18, FERMT1, FGF19, FLNC, FLVCR1, FOXD2-AS1, FOXM1, FXYD2, GABRE, GAL3ST1, GCNT3, GINS1, GJC1, GMNN, GNAZ, GOLGA2P7, GPC3, GPR64, GPSM1, HRCT1, IGF2BP2, IGSF1, IGSF3, IQGAP3, ITGA2, ITPKA, KIAA0101, KIF11, KIFC1, KIFC2, KNTC1, KRT23, LAMA3, LEF1, LGR5, LINC00152, LINGO1, LPL, LRRC1, LYPD1, MAD2L1, MAGED4, MAGED4B, MAPK12, MAPK8IP2, MAPT, MCM2, MDGA1, MDK, MFAP2, MISP, MKI67, MMP11, MNS1, MPZ, MSC, MSH5, MTMR11, MUC13, MUC5B, MYH4, NAALADL1, NAV3, NCAPG, NDUFA4L2, NEB, NKD1, NMB, NOTCH3, NOTUM, NPM2, NQO1, NRCAM, NT5DC2, NTS, OBSCN, OLFML2A, OLFML2B, PAQR4, PEG10, PI3, PLCE1, PLCH2, PLK1, PLXDC1, PODXL2, POLE2, PPAP2C, PRC1, PTGES, PTGFR, PTHLH, PTK7, PTP4A3, PTTG1, PYCR1, RACGAP1, RBM24, RHBG, RNF157, ROBO1, RP4-800G7.2, RPS6KL1, RRM2, S100A1, SCGN, SEPT5, SERPINA12, SEZ6L2, SFN, SGOL2, SLC22A11, SLC51B, SLC6A2, SNCG, SOAT2, SP5, SPARCL1, SPINK1, STIL, STK39, SULT1C2, TCF19, TDGF1, THY1, TK1, TMC5, TMEM132A, TMEM150B, TNFRSF19, TNFRSF25, TONSL, TPX2, TRIM16, TRIM16L, TRIM31, TRIM45, TTC39A, UBD, UBE2C, UBE2T, UGT2B11, USH1C, VSIG10L, WDR62, WDR76, and ZWINT.
In one aspect, the present invention relates to methods for detecting the presence of HCC in subjects having chronic liver diseases (CLDs), the methods including: (a) measuring expression levels of the HCC classifier genes described herein in CECs of the subjects; and (b) comparing the expression levels of the HCC classifier genes in the CECs of the subjects with reference expression levels of HCC classifier genes, thereby determining the presence of HCC.
In some embodiments, the expression levels of HCC classifier genes are used to calculate an HCC score, and the calculated HCC score is compared with a reference score, where the presence of HCC is determined based on the presence of an HCC score above the reference score.
In some embodiments, the HCC score is calculated using a random forest analysis.
In some embodiments, the expression levels of HCC classifier genes are compared with the reference expression levels of HCC classifier genes using a multivariate logistic regression modeling approach.
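By way of illustration only, the score-based comparison described above can be sketched in Python using a logistic model. The coefficients, intercept, expression values, and the reference score of 0.5 are hypothetical placeholders chosen for the example; they are not values disclosed herein, and a fitted model would supply its own parameters.

```python
import math

def hcc_score(expression, coefficients, intercept):
    """Return a logistic score in (0, 1) from per-gene expression values."""
    linear = intercept + sum(
        weight * expression[gene] for gene, weight in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))

def hcc_detected(score, reference_score):
    """HCC is called only when the score is above the reference score."""
    return score > reference_score

# Hypothetical model parameters and sample values (illustration only).
COEFFICIENTS = {"TOP2A": 0.8, "CDC20": 0.6, "SPP1": 0.4}
INTERCEPT = -3.0
REFERENCE_SCORE = 0.5
sample = {"TOP2A": 2.5, "CDC20": 1.5, "SPP1": 3.0}

score = hcc_score(sample, COEFFICIENTS, INTERCEPT)
print(round(score, 3), hcc_detected(score, REFERENCE_SCORE))
```

In this sketch a score exactly equal to the reference score is not treated as positive, consistent with the requirement that the HCC score be above the reference score.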
In some embodiments, the expression levels of HCC classifier genes in circulating epithelial cells (CECs) are measured by: (a) obtaining a sample including blood from the subject; (b) removing red blood cells, platelets, and plasma from the sample by size-based exclusion; (c) removing white blood cells (WBCs) from the sample by magnetophoresis; and (d) measuring the expression of a set of genes in the CECs using RNA-sequencing, qRT-PCR, RNA in situ hybridization, protein microarray, or mass spectrometry and protein profiling.
In some embodiments, the HCC being detected is an early stage HCC or a late stage HCC.
In some embodiments, the methods for detecting the presence of HCC in subjects having CLDs also include: (a) confirming or having confirmed the presence of HCC in the patient by ultrasound imaging, dynamic CT, MRI imaging, needle biopsy, and/or biopsy; and (b) if the presence of HCC in the patient is confirmed, treating or having the subject treated for HCC by surgical removal of the HCC tissue, radiofrequency ablation of the HCC tissue, embolization of the HCC tissue, chemotherapy, and/or cryotherapy.
In one aspect, the present invention relates to methods of monitoring subjects having CLD for development of HCC, the methods including: (a) detecting the presence of HCC in subjects having CLDs as described herein at an initial time point, and if the HCC score is below the reference score, then (b) performing the detection step at one or more subsequent time points. In some embodiments, the detection step is performed at one or more subsequent time points until the presence of HCC is determined. In some embodiments, the initial and each subsequent time point is about three months, six months, or a year apart.
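The monitoring logic described above can be sketched, purely as an illustration, as a loop over time points that stops at the first HCC score above the reference score. The function names, the six-month interval, the scores, and the reference score of 0.5 are assumptions made for the example.

```python
def monitor_for_hcc(scores_by_visit, reference_score):
    """Walk through visits in order; stop at the first score above the reference.

    Returns (visit_index, score) for the first positive visit, or None if
    every score stays at or below the reference score.
    """
    for visit_index, score in enumerate(scores_by_visit):
        if score > reference_score:
            return visit_index, score
    return None

def visit_schedule(interval_months, n_visits):
    """Months elapsed since the initial time point for each planned visit."""
    return [interval_months * i for i in range(n_visits)]

# Hypothetical HCC scores at an initial visit and three follow-up visits.
scores = [0.21, 0.34, 0.47, 0.62]
months = visit_schedule(6, len(scores))
result = monitor_for_hcc(scores, reference_score=0.5)
if result is not None:
    visit_index, score = result
    print(f"HCC score {score} exceeded the reference at month {months[visit_index]}")
```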
In one aspect, the present invention relates to methods of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs, the methods including: (a) detecting concentrations of CECs in blood samples of the subjects; (b) comparing the concentrations of CECs in the blood samples of the subjects with a reference value; (c) diagnosing those subjects that have concentrations of CECs in the blood samples that are below the reference value with early stage fibrosis; and (d) diagnosing those subjects that have concentrations of CECs in the blood samples that are above the reference value with late stage fibrosis.
In some embodiments, the subjects have hepatitis B. In some embodiments, the concentrations of CECs are measured by immunofluorescence. In some embodiments, the concentrations of CECs are measured by detecting glypican-3 (GPC3) and/or cytokeratins (CKs).
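As a minimal sketch of the comparison in steps (b) through (d) above, the following Python fragment converts a cell count into a concentration and compares it with a reference value. The counts, blood volume, and reference value are hypothetical, and the handling of a concentration exactly equal to the reference value (reported here as indeterminate) is an assumption, since that case is not addressed above.

```python
def cec_concentration(cec_count, blood_volume_ml):
    """CECs per millilitre of blood."""
    if blood_volume_ml <= 0:
        raise ValueError("blood volume must be positive")
    return cec_count / blood_volume_ml

def stage_fibrosis(concentration, reference_value):
    """Below the reference value: early stage; above it: late stage."""
    if concentration > reference_value:
        return "late stage fibrosis"
    if concentration < reference_value:
        return "early stage fibrosis"
    return "indeterminate"  # assumption: equality is not covered by the text

# Hypothetical measurements (illustration only).
REFERENCE_VALUE = 1.5  # CECs per mL, placeholder
concentration = cec_concentration(cec_count=30, blood_volume_ml=10)
print(concentration, stage_fibrosis(concentration, REFERENCE_VALUE))
```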
In one aspect, the present invention relates to methods of monitoring subjects having CLDs for development of advanced fibrosis, the method including: (a) performing a method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs described herein; and if the concentrations of CECs in the blood samples of the subjects are lower than the reference value, then (b) performing the method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs at one or more subsequent time points.
In some embodiments, the method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs is performed at one or more subsequent time points until the subject is diagnosed with late stage fibrosis. In some embodiments, the initial and each subsequent time point is about three months, six months, or a year apart.
In one aspect, the present invention relates to methods of monitoring a subject having CLD being treated to prevent the progression of fibrosis or HCC, the methods including: (a) performing a method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs, described herein; and if the concentration of CECs in the blood sample of the subject is lower than the reference value, then performing the method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs at one or more subsequent time points; and (b) performing a method of detecting the presence of HCC in subjects having CLDs, described herein, and if the HCC score is below the reference score, then performing the detection method at one or more subsequent time points.
In some embodiments, the method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs is performed at one or more subsequent time points until the subject is diagnosed with late stage fibrosis, and/or where the method of detecting the presence of HCC in subjects having CLDs is performed at one or more subsequent time points until the presence of HCC is determined. In some embodiments, the first initial and each subsequent time point for performing the method of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in subjects having CLDs or the method of detecting the presence of HCC in subjects having CLDs is about three months, six months, or a year apart, and the second initial and each subsequent time point is about three months, six months, or a year apart.
In some embodiments, the CECs in the subjects' blood are purified or enriched using microfluidic devices. In some embodiments, the microfluidic devices are iChip devices.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In addition, U.S. Patent Application US2016/0312298 A1 is specifically incorporated herein by reference in its entirety, and in some embodiments methods described herein can be used in conjunction with methods described in that application. In case of conflict, the present specification, including definitions, will control.
Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The present invention is based, at least in part, on the discovery that hCECs are not exclusive to carcinogenesis, but also can be present in subjects having non-cancer diseases or conditions such as chronic liver disease (CLD). Furthermore, the present invention is based, at least in part, on the discovery that the hCECs in subjects with CLD can be analyzed quantitatively or qualitatively to accurately detect the presence of cancer such as hepatocellular carcinoma (HCC) and/or to accurately characterize the stage (e.g., early or late stage) of a liver disease or liver condition such as liver fibrosis.
As demonstrated herein, cells from diseased livers circulating in the bloodstream (i.e., hCECs) are detected both quantitatively (e.g., by immunofluorescence) and qualitatively (e.g., gene expression profile or expression levels of HCC classifier genes) for use in diagnosis of HCC and CLD. Important applications of this liquid biopsy include detection or diagnosis of a liver disease or condition such as HCC, CLD etiology determination, liver fibrosis staging, and HCC surveillance or monitoring. The present invention can be applied to both diagnosis and monitoring of patients with liver conditions such as CLDs.
As used herein, the phrases “accurately diagnose” and “accurately detect” with respect to a disease or a condition refer to predicting the presence of the disease or the condition with a high degree of sensitivity (i.e., true positive rate or detecting a disease or a condition when the disease or the condition is present) or a high degree of specificity (i.e., true negative rate or not detecting a disease or a condition when the disease or the condition is not present). In some embodiments, the phrases “accurately diagnose” and “accurately detect” can also mean being able to detect the presence of a disease or a condition with a true positive rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%. In some embodiments, the phrases “accurately diagnose” and “accurately detect” can mean being able to detect the presence of a disease or a condition with a true negative rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%.
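For illustration, the true positive rate and true negative rate referred to above can be computed from the counts of a confusion matrix as follows. The counts and the 85% threshold are hypothetical values chosen for the example.

```python
def sensitivity(true_positives, false_negatives):
    """True positive rate: detected cases / all cases with the disease."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """True negative rate: correct negatives / all cases without the disease."""
    return true_negatives / (true_negatives + false_positives)

def meets_threshold(rate, minimum_rate):
    """Check a rate against a minimum such as 'at least about 85%'."""
    return rate >= minimum_rate

# Hypothetical cohort: 20 subjects with the disease, 50 without.
tpr = sensitivity(true_positives=17, false_negatives=3)
tnr = specificity(true_negatives=45, false_positives=5)
print(tpr, tnr, meets_threshold(tpr, 0.85), meets_threshold(tnr, 0.85))
```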
As used herein, the phrase “accurately distinguish” with respect to two diseases or conditions, can refer to detecting the presence of a first disease or a first condition with a high degree of sensitivity (i.e., detecting a first disease or condition when the first disease or condition is present, i.e., true positive rate) or a high degree of specificity (i.e., not detecting a first disease or condition when the first disease or condition is not present, i.e., true negative rate), regardless of whether the second disease or condition is also present or absent. In some embodiments, the phrase “accurately distinguish” can mean being able to detect the presence of a disease or a condition with a true positive rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%. In some embodiments, the phrase “accurately distinguish” can mean being able to detect the presence of a disease or a condition with a true negative rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%.
As used herein, the phrase “accurately distinguish” with respect to different stages of a disease or a condition can refer to detecting the presence of a particular stage of the disease (e.g., advanced fibrosis in liver) with a high degree of sensitivity (i.e., detecting the stage of a disease or condition when the disease or condition is present at that stage, i.e., true positive rate) or a high degree of specificity (i.e., not detecting a stage of a disease or condition when the disease or condition is not present at that stage, i.e., true negative rate) so that the particular stage of the condition or disease can be predicted. In some embodiments, the phrase “accurately distinguish” can mean being able to detect the presence of a stage of a disease or a condition with a true positive rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%. In some embodiments, the phrase “accurately distinguish” can mean being able to detect the presence of a stage of a disease or a condition with a true negative rate of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and at least about 99.9%.
As used herein, the term “circulating epithelial cells (CECs)” can refer to cells of epithelial origin that are shed from a tissue (e.g., diseased tissue, tumor tissue, or non-tumor tissue) and present in the blood, i.e., in circulation. Cell markers (e.g., marker genes) that can be used to identify and/or isolate CECs from other components of the blood are described below herein. In some embodiments, the CECs from a subject with a liver disease (e.g., HCC and/or CLD) are predominantly hepatic CECs (hCECs), for example, as determined by immunofluorescence staining of the CECs for markers expressed in hepatocytes (e.g., GPC3 and CKs).
As used herein, the term “chronic liver disease (CLD)” refers to a disease process of the liver involving progressive destruction and regeneration of the liver parenchyma. In some embodiments, CLD can lead to fibrosis or cirrhosis. In some other embodiments, CLD can result in complications such as portal hypertension (e.g., ascites, hypersplenism, and lower esophageal varices and rectal varices), hepatopulmonary syndrome, hepatorenal syndrome, encephalopathy, or HCC. CLD can also refer to disease of the liver which lasts over a period of six months, one year, two years, three years, four years, five years, or more than five years. CLD can be caused by hepatitis B, hepatitis C, cytomegalovirus, Epstein-Barr virus, yellow fever virus, alcoholic liver disease, and/or drug induced liver disease from methotrexate, amiodarone, nitrofurantoin, or acetaminophen. In other embodiments, CLD can be caused by non-alcoholic fatty liver disease, haemochromatosis, Wilson's disease, or autoimmune responses such as primary biliary cholangitis or primary sclerosing cholangitis.
As used herein, the term “monitoring” or “surveillance” refers to periodically assessing a subject or a patient (e.g., a subject who is at risk of developing a condition) for the presence of a disease or a condition. In some embodiments, the periodic assessment can occur about every day, about every other day, about once a week, about once every other week, about every month, about every 2 months, about every 3 months, about every 4 months, about every 5 months, about every 6 months, about every 7 months, about every 8 months, about every 9 months, about every year, about every 18 months, about every 2 years, about every 3 years, about every 4 years, about every 5 years, about every 6 years, about every 7 years, about every 8 years, about every 9 years, or about every 10 years. This recurring assessment of a subject or a patient for the presence of a disease or a condition can continue until (1) the disease or the condition is detected in the subject or the patient; (2) the patient is no longer at risk of developing the disease or the condition; (3) at the discretion of the subject receiving the monitoring or the person administering the monitoring; or (4) discontinuation of the recurring assessment is necessary due to other reasons. The interval with which a subject is assessed for the presence of a disease or a condition can be adjusted during the course of the monitoring.
As used herein, the term “ensemble learning method” refers to a supervised learning algorithm, such as random forest, that combines the predictions of multiple models and that can be trained and then used to make predictions.
As used herein, the term “hepatocellular carcinoma (HCC)” refers to a type of primary liver cancer prevalent in subjects with CLD. HCC can develop in patients with underlying cirrhotic liver disease of various etiologies, including patients with negative markers for HBV infection and who have HBV DNA integrated in the hepatocyte genome. Epidemiology, etiology, and carcinogenesis of HCC have been described in Ghouri Y A, et al., J Carcinog 2017; 16:1, which is incorporated by reference herein.
As used herein, the phrase “early stage HCC” can refer to HCC being within the Milan criteria. As used herein, the phrase “late stage HCC” can refer to HCC being outside of the Milan criteria. The Milan criteria require that the subject with HCC meet the following criteria: HCC being one lesion smaller than 5 cm or up to 3 lesions, each smaller than 3 cm; no extrahepatic manifestations; and no evidence of gross vascular invasion. In other words, “early stage HCC” meets all Milan criteria and “late stage HCC” does not meet all Milan criteria.
As used herein, the terms “early stage liver fibrosis” and “late stage liver fibrosis” refer to F1 or F2 stages, and F3 or F4 stages, respectively, as defined by the METAVIR classification.
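The Milan criteria as worded above can be expressed, for illustration only, as a simple check. The function names and example lesion sizes are hypothetical, and the strict "smaller than" comparisons follow the wording of this paragraph rather than any other formulation of the criteria.

```python
def within_milan(lesion_sizes_cm, extrahepatic, gross_vascular_invasion):
    """Return True when all Milan criteria, as worded above, are met."""
    if not lesion_sizes_cm:
        raise ValueError("at least one HCC lesion is required")
    if extrahepatic or gross_vascular_invasion:
        return False
    if len(lesion_sizes_cm) == 1:
        return lesion_sizes_cm[0] < 5.0
    if len(lesion_sizes_cm) <= 3:
        return all(size < 3.0 for size in lesion_sizes_cm)
    return False  # more than three lesions

def hcc_stage(lesion_sizes_cm, extrahepatic=False, gross_vascular_invasion=False):
    """Map the Milan check onto the early/late stage terms defined above."""
    if within_milan(lesion_sizes_cm, extrahepatic, gross_vascular_invasion):
        return "early stage HCC"
    return "late stage HCC"

# Hypothetical cases (illustration only).
print(hcc_stage([4.5]))                           # single lesion under 5 cm
print(hcc_stage([2.9, 2.0, 1.0]))                 # three lesions, each under 3 cm
print(hcc_stage([2.0], gross_vascular_invasion=True))
```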
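The grouping of METAVIR stages defined above can be written as a small lookup, shown here only as an illustration. The function name is hypothetical; stage F0 (no fibrosis) falls outside both defined groups, so the sketch rejects it rather than assigning it to either group.

```python
METAVIR_GROUPS = {
    "F1": "early stage liver fibrosis",
    "F2": "early stage liver fibrosis",
    "F3": "late stage liver fibrosis",
    "F4": "late stage liver fibrosis",
}

def fibrosis_group(metavir_stage):
    """Map a METAVIR fibrosis stage (F1 to F4) onto the early/late grouping."""
    stage = metavir_stage.strip().upper()
    if stage not in METAVIR_GROUPS:
        raise ValueError(f"stage {metavir_stage!r} is outside the F1-F4 grouping")
    return METAVIR_GROUPS[stage]

print(fibrosis_group("F2"), "|", fibrosis_group("f4"))
```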
The methods described herein can be used to accurately diagnose or predict the presence of cancer, e.g., HCC, in a patient with a non-cancerous disease condition, e.g., CLD, by detecting and analyzing expression of a set of genes in the CECs of the patient using a classifier that is based on an ensemble learning method such as a random forest classifier.
In some embodiments, hCECs from subjects with CLD (e.g., subjects with Hepatitis B or subjects who are infected with Hepatitis B Virus) can be analyzed (e.g., qualitatively) to accurately distinguish between subjects with and without HCC. In other embodiments, hCECs from subjects with CLD can be quantitatively measured to accurately distinguish between subjects with early stage liver fibrosis and subjects with late stage liver fibrosis.
As demonstrated herein, the presence of cancer, e.g., HCC, and the presence of non-cancer diseases or conditions, e.g., CLD, are associated with the increased presence of CECs. The increased presence of CECs is also associated with the previous presence of cancer (e.g., HCC) which was treated to result in no clinical evidence of disease (e.g., in HCC patients who underwent curative treatment and had no clinical evidence of the disease).
Thus, the methods can include the detection and analysis of a set of genes (e.g., HCC classifier genes) using a variety of statistical and computational prediction methods (e.g., an ensemble learning method such as a random forest classifier or a statistical method such as multivariable logistic regression), to detect the presence of a cancer, e.g., HCC.
The method can, in some embodiments, detect the presence of cancer at an early stage, which may otherwise be difficult to detect using a currently known method such as ultrasound imaging, dynamic CT, MRI imaging, needle biopsy, or biopsy.
In some embodiments, microfluidic devices (e.g., “lab-on-a-chip” devices or the iChip device) can be used in the present methods to separate, purify, enrich, or prepare CECs. Such devices have been successfully used for microfluidic flow cytometry, continuous size-based separation, chromatographic, or magnetophoretic separation. For example, the iChip device and various other embodiments of such devices, which are described in U.S. Patent Application US2016/0312298 A1 (incorporated herein by reference), can be used for separating hCECs from a mixture of cells, or preparing an enriched population of hCECs. In particular, such devices can be used for the isolation of hCECs from complex mixtures such as whole blood.
In some embodiments, the devices retain at least 75%, e.g., 80%, 90%, 95%, 98%, or 99% of the desired cells compared to the initial sample mixture, while enriching the population of desired cells by a factor of at least 100, e.g., by 1000, 10,000, 100,000, or even 1,000,000 relative to one or more non-desired cell types. In one example, a detection module can be in fluid communication with a separation or enrichment device. The detection module can operate using any method of detection disclosed herein, or other methods known in the art. For example, the detection module includes a microscope, a cell counter, a magnet, a biocavity laser (see, e.g., Gourley et al., J. Phys. D: Appl. Phys., 36: R228-R239 (2003)), a mass spectrometer, a PCR device, an RT-PCR device, a microarray, a device for performing RNA in situ hybridization, or a hyperspectral imaging system (see, e.g., Vo-Dinh et al., IEEE Eng. Med. Biol. Mag., 23:40-49 (2004)). In some embodiments, a computer terminal can be connected to the detection module. For instance, the detection module can detect a label that selectively binds to cells, proteins, or nucleic acids of interest, e.g., transcripts of HCC classifier genes or encoded proteins.
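The retention and enrichment figures given at the start of the preceding paragraph can be illustrated with a short calculation. The cell counts below are hypothetical, and the enrichment factor is computed here as the ratio of desired to non-desired cells after processing divided by the same ratio before processing, which is one common way to express enrichment and is an assumption of this sketch.

```python
def retention(target_out, target_in):
    """Fraction of desired cells recovered relative to the initial sample."""
    return target_out / target_in

def enrichment_factor(target_in, other_in, target_out, other_out):
    """Fold change in the desired:non-desired cell ratio after processing."""
    ratio_before = target_in / other_in
    ratio_after = target_out / other_out
    return ratio_after / ratio_before

# Hypothetical counts: desired cells (CECs) and non-desired cells (e.g., WBCs).
kept = retention(target_out=90, target_in=100)
fold = enrichment_factor(
    target_in=100, other_in=50_000_000, target_out=90, other_out=450
)
print(f"retention={kept:.0%}, enrichment={fold:,.0f}-fold")
print(kept >= 0.75 and fold >= 100)  # thresholds taken from the text above
```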
In some embodiments, the microfluidic system includes (i) a device for separation or enrichment of CECs (e.g., hCECs); (ii) a device for lysis of the enriched CECs; and (iii) a device for detection of gene transcripts (e.g., transcripts of HCC classifier genes) or encoded proteins.
In some embodiments, a population of CECs prepared using a microfluidic device as described herein is used for analysis of expression of gene transcripts or proteins using known molecular biological techniques, e.g., as described above and in Sambrook, Molecular Cloning: A Laboratory Manual, Third Edition (Cold Spring Harbor Laboratory Press; 3rd edition (Jan. 15, 2001)); and Short Protocols in Molecular Biology, Ausubel et al., eds. (Current Protocols; 5th edition (Nov. 5, 2002)).
In general, devices for detection and/or quantification of expression of classifier genes useful for cancer diagnosis or encoded proteins in an enriched population of CECs (e.g., CTCs) are described herein and can be used for the early detection of cancer, e.g., tumors of epithelial origin, e.g., early detection of liver, pancreatic, lung, breast, prostate, renal, ovarian or colon cancer.
As described herein, the phrase “differential expression analysis” can refer to performing computational or statistical analysis on expression level of individual genes (e.g., individual HCC classifier genes) and/or expression patterns of multiple genes (e.g., multiple HCC classifier genes) in a sample (e.g., cell, e.g., CEC, e.g., hCEC). The term “differential expression” can mean over-expression (expressing a gene at a higher level than the reference value) or under-expression (expressing a gene at a lower level than the reference value). In some embodiments, a differential expression analysis can compare the expression levels or patterns in a sample with a reference value (e.g., expression levels or patterns of one or more genes in a sample from a non-diseased counterpart cell or tissue). In other embodiments, the expression levels or patterns can be normalized to expression levels of one or more control genes, or may be quantified in a non-relative manner (e.g., transcript copies per volume or absolute copy number). The gene expression levels can be measured by any of the known methods, such as RNA-sequencing, qRT-PCR, RNA in situ hybridization, protein microarray, and/or mass spectrometry and protein profiling. Other known biochemical or molecular biology techniques can be used to detect the expression of genes. In some embodiments, RNA-sequencing and qRT-PCR are the preferred methods for measuring gene expression levels.
The differential expression analysis can be performed by any one of the known statistical or computational methods, for example, an ensemble learning method such as a random forest classifier or a statistical method such as multivariable logistic regression.
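A minimal sketch of comparing an expression level with a reference value is given below, using a log2 fold change. The pseudocount, the fold-change cutoff of 1.0 (a two-fold change), and the expression values are assumptions made for the example and are not parameters disclosed herein.

```python
import math

def log2_fold_change(sample_level, reference_level, pseudocount=1.0):
    """log2 ratio of sample to reference; the pseudocount avoids division by zero."""
    return math.log2((sample_level + pseudocount) / (reference_level + pseudocount))

def expression_call(fold_change, cutoff=1.0):
    """Label a gene as over- or under-expressed relative to the reference."""
    if fold_change >= cutoff:
        return "over-expressed"
    if fold_change <= -cutoff:
        return "under-expressed"
    return "not differentially expressed"

# Hypothetical normalized expression levels (illustration only).
up = log2_fold_change(sample_level=31.0, reference_level=7.0)
down = log2_fold_change(sample_level=1.0, reference_level=7.0)
print(up, expression_call(up))
print(down, expression_call(down))
```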
In one aspect, the present invention provides methods including measuring expression levels of hepatocellular carcinoma (HCC) classifier genes in circulating epithelial cells (CECs) of a subject. Overexpression of HCC classifier genes by the CECs of subjects was determined to be highly predictive of the presence of HCC in the subjects (see, e.g., Examples 1-4). In some embodiments, the HCC classifier genes include one, two, three, or more of (e.g., all of) TESC, OSBP2, SLC6A8, SEPT5, F2RL3, E2F1, EZH2, CDC20, CCNA2, CCNB1, PLXNB3, CDC6, MYBL2, APOBEC3B, SPP1, AKR1B10, TOP2A, ASPM, SLC6A9, RECQL4, NUSAP1, PLVAP, FMO1, PDZK1IP1, and FBXO32. In some embodiments, the HCC classifier genes can include all of TESC, OSBP2, SLC6A8, SEPT5, F2RL3, E2F1, EZH2, CDC20, CCNA2, CCNB1, PLXNB3, CDC6, MYBL2, APOBEC3B, SPP1, AKR1B10, TOP2A, ASPM, SLC6A9, RECQL4, NUSAP1, PLVAP, FMO1, and PDZK1IP1. In other embodiments, the HCC classifier genes can also include one or more other genes that are overexpressed in HCC, e.g., one or more of ACTG2, ADM2, AFP, AGR2, ALDH3A1, ALPK3, AMIGO3, ANKRD65, ANLN, AP1M2, ARHGAP11A, ARHGEF39, ASF1B, ASPHD1, AURKA, AXIN2, BAIAP2L2, BEX2, C15orf48, C1orf106, C1QTNF3, C6orf223, CA12, CA9, CAMK2N2, CAP2, CBX2, CCDC170, CCDC28B, CCDC64, CCNE2, CCNF, CD109, CD34, CDC25A, CDC7, CDCA5, CDCA8, CDH13, CDK1, CDKN2A, CDKN2C, CDT1, CELF6, CENPF, CENPH, CENPL, CENPU, CENPW, CKB, CNNM1, COL15A1, COL4A5, COL7A1, COL9A2, CRIP3, CSPG4, CTNND2, CXorf36, CYP17A1, DLK1, DMKN, DSCC1, DTL, DUOX2, ECT2, EEF1A2, EFNA3, EPHB2, EPPK1, ETV4, FABP4, FAM111B, FAM3B, FAM83D, FANCD2, FANCI, FBXL18, FERMT1, FGF19, FLNC, FLVCR1, FOXD2-AS1, FOXM1, FXYD2, GABRE, GAL3ST1, GCNT3, GINS1, GJC1, GMNN, GNAZ, GOLGA2P7, GPC3, GPR64, GPSM1, HRCT1, IGF2BP2, IGSF1, IGSF3, IQGAP3, ITGA2, ITPKA, KIAA0101, KIF11, KIFC1, KIFC2, KNTC1, KRT23, LAMA3, LEF1, LGR5, LINC00152, LINGO1, LPL, LRRC1, LYPD1, MAD2L1, MAGED4, MAGED4B, MAPK12, MAPK8IP2, MAPT, MCM2, MDGA1, MDK, MFAP2, MISP, MKI67, MMP11, MNS1, MPZ, MSC, MSH5, MTMR11, MUC13, MUC5B, MYH4, NAALADL1, NAV3, NCAPG, NDUFA4L2, NEB, NKD1, NMB, NOTCH3, NOTUM, NPM2, NQO1, NRCAM, NT5DC2, NTS, OBSCN, OLFML2A, OLFML2B, PAQR4, PEG10, PI3, PLCE1, PLCH2, PLK1, PLXDC1, PODXL2, POLE2, PPAP2C, PRC1, PTGES, PTGFR, PTHLH, PTK7, PTP4A3, PTTG1, PYCR1, RACGAP1, RBM24, RHBG, RNF157, ROBO1, RP4-800G7.2, RPS6KL1, RRM2, S100A1, SCGN, SEPT5, SERPINA12, SEZ6L2, SFN, SGOL2, SLC22A11, SLC51B, SLC6A2, SNCG, SOAT2, SP5, SPARCL1, SPINK1, STIL, STK39, SULT1C2, TCF19, TDGF1, THY1, TK1, TMC5, TMEM132A, TMEM150B, TNFRSF19, TNFRSF25, TONSL, TPX2, TRIM16, TRIM16L, TRIM31, TRIM45, TTC39A, UBD, UBE2C, UBE2T, UGT2B11, USH1C, VSIG10L, WDR62, WDR76, and ZWINT.
In another aspect, the present invention provides methods for detecting the presence of HCC in subjects having a chronic liver disease (CLD). The methods can include: (a) measuring expression levels of HCC classifier genes in CECs of a subject; and (b) comparing the expression levels of HCC classifier genes in the CECs of the subject with reference expression levels of HCC classifier genes, thereby determining the presence of HCC.
In another aspect, the present invention provides methods of monitoring subjects having CLD for development of HCC. The methods can include: (a) measuring expression levels of HCC classifier genes in CECs of a subject and comparing the expression levels of HCC classifier genes in the CECs of the subject with reference expression levels of HCC classifier genes at an initial time point; and if the expression levels of the HCC classifier genes are below the reference level, then (b) performing the step again at a subsequent time point, and optionally at additional time points, e.g., until the expression levels of HCC classifier genes are above the reference level. This assessment can be performed by first calculating an HCC score (e.g., the vote fraction from the random forest (RF) classifier) or another metric that indicates the degree of differential expression of HCC classifier genes in the subject's CECs, and comparing it with a reference score or other reference metric value.
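As a minimal sketch of the monitoring logic described above, the following Python function walks through HCC scores recorded at successive time points and reports the first one that exceeds the reference score. The function name, the example scores, and the reference score of 0.5 are hypothetical and are not taken from the Examples.

```python
from typing import Optional, Sequence


def first_time_point_above_reference(
    scores: Sequence[float], reference_score: float
) -> Optional[int]:
    """Return the index of the first time point whose HCC score exceeds the
    reference score, or None if no score has exceeded it yet."""
    for index, score in enumerate(scores):
        if score > reference_score:
            return index
    return None


# Hypothetical scores at an initial visit and three follow-up visits.
visit_scores = [0.12, 0.20, 0.35, 0.71]
flagged_visit = first_time_point_above_reference(visit_scores, reference_score=0.5)
```

A return value of None corresponds to the case in which measurement is simply repeated at the next scheduled time point.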
In another aspect, the present invention provides methods of distinguishing between the presence of early stage liver fibrosis and late stage liver fibrosis in a subject having CLD. The methods can include: (a) detecting a concentration of CECs in a blood sample of a subject; (b) comparing the concentration of CECs in the blood sample of the subject with a reference value; (c) diagnosing the subject with early stage fibrosis if the subject's blood concentration of CECs is below the reference value; and (d) diagnosing the subject with late stage fibrosis if the subject's blood concentration of CECs is above the reference value.
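The comparison in steps (b) through (d) can be illustrated with a short Python sketch. It assumes the CEC concentration is expressed as cells per mL of processed blood; the function names, the example count, and the reference value of 1.0 CEC/mL are hypothetical. The handling of a concentration exactly equal to the reference value is also an assumption, since the text only addresses values below or above it.

```python
def cec_concentration(cec_count: int, blood_volume_ml: float) -> float:
    """CECs per mL of processed blood."""
    if blood_volume_ml <= 0:
        raise ValueError("blood volume must be positive")
    return cec_count / blood_volume_ml


def stage_fibrosis(cec_per_ml: float, reference_per_ml: float) -> str:
    """Compare a CEC concentration with a reference value."""
    if cec_per_ml > reference_per_ml:
        return "late stage fibrosis"
    if cec_per_ml < reference_per_ml:
        return "early stage fibrosis"
    return "indeterminate"  # assumption: equality is not covered by the text


# Hypothetical sample: 12 CECs counted in 8 mL of processed blood.
concentration = cec_concentration(12, 8.0)
stage = stage_fibrosis(concentration, reference_per_ml=1.0)
```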
In another aspect, the present invention provides methods of monitoring a subject having CLD for development of advanced fibrosis. The methods can include: (a) detecting a concentration of CECs in a blood sample of a subject and comparing the blood CEC concentration to a reference value; and if the concentration of CECs in the blood sample of the subject is lower than the reference value, then (b) performing the same detection and comparison step at one or more subsequent time points, e.g., until the concentration of CECs in the blood sample of the subject is higher than the reference value.
In some embodiments, the expression levels of HCC classifier genes are used to calculate an HCC score, preferably using a random forest analysis, and the method includes comparing the HCC score with a reference score, wherein the presence of HCC is determined based on the presence of an HCC score above the reference score.
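To illustrate how a vote fraction from a random forest can serve as an HCC score, the sketch below models each tree as a single-split stump and computes the share of trees that vote for HCC. This is a simplified stand-in and not the trained classifier of the Examples: the stump thresholds, the expression values, and the reference score of 0.5 are all hypothetical.

```python
from typing import Callable, Dict, Sequence

Expression = Dict[str, float]
Tree = Callable[[Expression], bool]  # True means the tree votes "HCC"


def make_stump(gene: str, threshold: float) -> Tree:
    """Build a one-split tree that votes HCC when `gene` exceeds `threshold`."""
    return lambda expr: expr.get(gene, 0.0) > threshold


def hcc_score(expr: Expression, forest: Sequence[Tree]) -> float:
    """Vote fraction: the share of trees in the forest that vote HCC."""
    votes = sum(1 for tree in forest if tree(expr))
    return votes / len(forest)


# Hypothetical forest and thresholds on log2(1 + RPM) expression values.
forest = [
    make_stump("TOP2A", 1.0),
    make_stump("SPP1", 2.0),
    make_stump("AKR1B10", 1.5),
    make_stump("CDC20", 0.5),
]
sample = {"TOP2A": 2.3, "SPP1": 3.1, "AKR1B10": 0.2, "CDC20": 1.4}
score = hcc_score(sample, forest)
has_hcc = score > 0.5  # hypothetical reference score
```

Here three of the four stumps vote for HCC, so the score is 0.75 and exceeds the assumed reference score.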
In some embodiments, the expression levels of HCC classifier genes are compared with the reference expression levels of HCC classifier genes using a multivariate logistic regression modeling approach.
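As a rough sketch of what a multivariate logistic regression comparison could look like, the Python function below combines several expression values into one predicted probability. The coefficients, intercept, and expression values are hypothetical placeholders; in practice they would be fitted to reference data.

```python
import math
from typing import Dict


def logistic_probability(
    expr: Dict[str, float], coefficients: Dict[str, float], intercept: float
) -> float:
    """Predicted probability from a linear combination of expression values."""
    z = intercept + sum(
        weight * expr.get(gene, 0.0) for gene, weight in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted model and sample (log2-scale expression values).
coefficients = {"TOP2A": 0.8, "SPP1": 0.5}
probability = logistic_probability(
    {"TOP2A": 2.0, "SPP1": 1.0}, coefficients, intercept=-2.0
)
```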
In some embodiments, the expression levels of HCC classifier genes in circulating epithelial cells (CECs) are measured by: (a) obtaining a sample comprising blood from the subject; (b) removing red blood cells, platelets, and plasma from the sample by size-based exclusion; (c) removing white blood cells (WBCs) from the sample by magnetophoresis; and (d) measuring the expression of a set of genes in the CECs using RNA-sequencing, qRT-PCR, RNA in situ hybridization, protein microarray, or mass spectrometry and protein profiling.
In some embodiments, the HCC being detected is an early stage HCC or a late stage HCC.
In some embodiments, the method also includes (a) confirming or having confirmed the presence of HCC in the patient by ultrasound imaging, dynamic CT, MRI, needle biopsy, and/or other biopsy; and (b) if the presence of HCC in the patient is confirmed, treating or having the subject treated for HCC by surgical removal of the HCC tissue, radiofrequency ablation of the HCC tissue, embolization of the HCC tissue, chemotherapy, and/or cryotherapy.
In some embodiments, the initial and each subsequent time point for measuring and comparing the blood CEC concentration, or for measuring and comparing HCC classifier gene expression, are about three months, six months, or a year apart. In some embodiments, the subject has hepatitis B; in other embodiments, the subject does not have hepatitis B. In some embodiments, the concentration of CECs is measured by immunofluorescence. In some embodiments, the concentration of CECs is measured by detecting glypican-3 (GPC3) and/or cytokeratins (CKs).
Once a liver disease such as CLD or HCC is detected in a subject, the presence of the disease may be confirmed using other methods.
Diagnosis or Detection of HCC
HCC can be further confirmed or diagnosed by analyzing a blood sample using traditional methods, including a complete blood count (CBC), electrolytes, liver function tests (LFTs), coagulation studies (e.g., international normalized ratio (INR) and partial thromboplastin time (PTT)), and alpha-fetoprotein (AFP) determination.
Various imaging techniques can be used to diagnose HCC. For example, ultrasonography offers a relatively inexpensive method of screening without the cost of magnetic resonance imaging (MRI) or the exposure to radiation and potentially nephrotoxic contrast agents required for computed tomography (CT). Ultrasonography as a screening method is reported to have 60% sensitivity and 97% specificity in the cirrhotic population, and it has been demonstrated to be cost-effective. Because of this limited sensitivity, findings on ultrasound examination should be confirmed with further imaging studies and potentially biopsy.
HCC can be detected using CT imaging, preferably with early enhancement on the arterial phase with rapid washout of contrast on the portal venous phase of a three-phase contrast scan. HCC can also be detected using MRI.
HCC can be detected by biopsy, especially for subjects with HCCs that are larger than 2 cm with low levels of alpha-fetoprotein or in whom ablative treatment or transplant is contraindicated.
In patients with elevated AFP and consistent imaging characteristics, HCC can be treated presumptively without a biopsy. Patients preferably also undergo evaluation for extrahepatic disease (primarily pulmonary metastasis) with cross-sectional imaging, because extrahepatic disease would preclude curative locoregional therapy.
HCC can be treated using a number of methods known in the art, including liver transplantation; however, a limited supply of donor organs limits the availability of transplantation as an option for many subjects. HCC can also be treated using resection or radiofrequency ablation (RFA). Systemic therapy with sorafenib (or, if sorafenib fails, with regorafenib, nivolumab, or lenvatinib) can be used to bridge patients to transplant or to delay recurrence of HCC. In patients who experience a recurrence following resection or transplantation, aggressive surgical treatment appears to be associated with the best possible outcome.
HCC can be treated by transcatheter arterial chemoembolization, which selectively cannulates the feeding artery to the tumor and delivers high local doses of chemotherapy, including doxorubicin, cisplatin, or mitomycin C. To prevent systemic toxicity, the feeding artery is occluded with gel foam or coils to prevent flow.
HCC can be treated by chemotherapy; however, HCC is minimally responsive to systemic chemotherapy. For example, doxorubicin-based regimens, which appear to have the greatest efficacy, have response rates of 20-30% and a minimal impact on survival.
For patients with Child class C cirrhosis and contraindications for transplantation, HCC can be managed by focusing on pain control, ascites, edema, and portosystemic encephalopathy management.
HCC can be treated surgically. Presently, in view of the absence of effective chemotherapy and the insensitivity of HCC to radiotherapy, complete tumor extirpation is the only option for a long-term cure. Resection of the tumor by partial hepatectomy can be accomplished in a limited number of patients (generally <15-30%) due to the degree of underlying cirrhosis.
Chronic liver disease can include liver cirrhosis, which is characterized by fibrosis and the conversion of normal liver architecture into structurally abnormal nodules. The progression of liver injury to cirrhosis may occur over weeks to years. In addition to fibrosis, the complications of cirrhosis include, but are not limited to, portal hypertension, ascites, hepatorenal syndrome, and hepatic encephalopathy.
Liver cirrhosis can occur in hepatitis C, alcoholic liver disease, NASH, and hepatitis B. Hepatic fibrosis can occur due to alteration in the normally balanced processes of extracellular matrix production and degradation in the liver. In liver cirrhosis, stellate cells can become activated into collagen-forming cells by a variety of paracrine factors. Such factors can be released by hepatocytes, Kupffer cells, and sinusoidal endothelium following liver injury. For example, increased levels of the cytokine transforming growth factor beta1 (TGF-beta1) are observed in patients with chronic hepatitis C and those with cirrhosis. TGF-beta1, in turn, stimulates activated stellate cells to produce type I collagen.
Diagnosis of Liver Cirrhosis
Severity of liver cirrhosis is commonly assessed using the Child-Turcotte-Pugh (CTP) system, which scores the severity of cirrhosis from five clinical variables: encephalopathy, the presence and/or severity of ascites, blood levels of bilirubin and albumin, and prothrombin time.
Severity of liver cirrhosis can also be assessed using the Model for End-Stage Liver Disease (MELD) scoring system, which considers the number of times dialysis was needed, the blood levels of creatinine, bilirubin, and sodium, and prothrombin time.
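For illustration, a CTP score can be computed as in the Python sketch below. The cut-offs are the commonly cited Child-Pugh values and are not stated in this document; the function name, the use of prothrombin time prolongation in seconds, the 1-3 grading inputs for ascites and encephalopathy, and the example patient are assumptions.

```python
from typing import Tuple


def child_pugh(
    bilirubin_mg_dl: float,
    albumin_g_dl: float,
    pt_prolongation_s: float,
    ascites_grade: int,         # 1 = none, 2 = mild, 3 = moderate to severe
    encephalopathy_grade: int,  # 1 = none, 2 = grade 1-2, 3 = grade 3-4
) -> Tuple[int, str]:
    """Return (total points, class) using commonly cited cut-offs."""
    if ascites_grade not in (1, 2, 3) or encephalopathy_grade not in (1, 2, 3):
        raise ValueError("grades must be 1, 2, or 3")
    points = ascites_grade + encephalopathy_grade
    points += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
    points += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
    points += 1 if pt_prolongation_s < 4 else 2 if pt_prolongation_s <= 6 else 3
    ctp_class = "A" if points <= 6 else "B" if points <= 9 else "C"
    return points, ctp_class


# Hypothetical patient: each laboratory value falls in the middle band.
points, ctp_class = child_pugh(2.5, 3.0, 5.0, ascites_grade=2, encephalopathy_grade=1)
```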
Treatment of Liver Cirrhosis
Subjects with severe CLD (e.g., decompensated cirrhosis) can be treated using liver transplantation. Liver transplantation has a 1-year survival rate of 85-90% and a 5-year survival rate of higher than 70%. Quality of life after liver transplant is good or excellent in most cases. However, a limited supply of donor organs limits the availability of transplantation as an option for many subjects.
A number of therapies are available to prevent or delay the development of cirrhosis in subjects with CLD: prednisone and azathioprine for treating autoimmune hepatitis, interferon and other antiviral agents for treating hepatitis B and C, phlebotomy for hemochromatosis, ursodeoxycholic acid for primary biliary cirrhosis, and trientine and zinc for Wilson disease. NASH, an advanced form of nonalcoholic fatty liver disease (NAFLD), is being evaluated for treatment using allosteric acetyl-CoA carboxylase (ACC) inhibitors (e.g., NDI-010976/GS-0976), obeticholic acid (OCA), thiazolidinediones (e.g., pioglitazone, rosiglitazone, lobeglitazone, ciglitazone, darglitazone, englitazone, netoglitazone, rivoglitazone, troglitazone, balaglitazone), elafibranor (GFT505), the apoptosis signal-regulating kinase 1 (ASK1) inhibitor selonsertib, the dual CCR2/CCR5 inhibitor cenicriviroc (CVC, also TBR-652 or TAK-652), and vitamin E.
These therapies are less effective if chronic liver disease evolves into cirrhosis. Once cirrhosis develops, treatment is aimed at the management of complications arising from cirrhosis. For example, cirrhosis-related zinc deficiency can be treated with zinc sulfate at 220 mg orally twice daily to improve dysgeusia and to stimulate appetite. Furthermore, zinc is effective in the treatment of muscle cramps and is adjunctive therapy for hepatic encephalopathy. Pruritus in subjects with CLD (e.g., cholestatic liver diseases or hepatitis C) can be treated with cholestyramine, antihistamines (e.g., diphenhydramine, hydroxyzine), and ammonium lactate 12% skin cream (Lac-Hydrin); other options include ursodeoxycholic acid, doxepin, and rifampin. Naltrexone may be effective but is often poorly tolerated. Gabapentin is an unreliable therapy. Patients with severe pruritus may require institution of ultraviolet light therapy or plasmapheresis. Hypogonadism in male subjects with CLD can be treated with topical testosterone preparations. Osteoporosis in subjects with CLD (especially chronic cholestasis or primary biliary cirrhosis) can be treated with calcium and vitamin D supplements. In addition, patients with CLD can be vaccinated against hepatitis A.
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
The following materials and methods were used in the Examples set forth below.
Patient medical data were collected from the patient's electronic medical record with patient permission. A maximum of 20 mL of blood was obtained from a patient at any given blood draw in two 10-mL EDTA tubes, and approximately 8-15 mL of blood was processed per patient.
Microfluidic Purification of CECs from Whole Blood Using the iChip Device
Biotinylated primary antibodies against human CD45 (clone 2D1, R&D Systems, BAM1430) and human CD66b (AbD Serotec, 80H3) were spiked into whole blood (5-10 mL total volume) at 100 fg/WBC and 37.5 fg/WBC, respectively, and incubated with rocking at room temperature for 20 min. Dynabeads MyOne Streptavidin T1 (Life Technologies, 65602) magnetic beads were then added and incubated with rocking at room temperature for an additional 20 min. The total blood volume (5-10 mL) was then run on the iChip device as previously described (ref. 8).
Cells in an aliquot of the iChip device-processed blood samples were fixed with 2% paraformaldehyde for 10 min and then applied to glass slides via cytospin using a Shandon EZ Megafunnel (ThermoFisher A78710001) at 2000 rpm for 5 min. Slides were washed with PBS and blocked with 5% donkey serum+0.3% Triton-X in PBS for 1 hr at room temperature (RT). Primary antibodies (each at 1:50 dilution in PBS, 0.1% BSA, 0.3% Triton-X) against wide spectrum cytokeratin (WS CK, Abcam ab9377), glypican-3 (Abcam ab81263), and CD45 (Becton Dickinson 555480) were then added and incubated for 1 hr at RT. Secondary antibodies (each at 1:200 dilution in PBS, 0.1% BSA, 0.3% Triton-X) directed against each of the primary antibodies were then used for fluorescent labelling and incubated for 1 hr at RT protected from light: 1) cytokeratin: donkey anti-rabbit Alexa-647 (Jackson ImmunoResearch 711-605-152); 2) glypican-3: donkey anti-sheep Cy3 (Jackson ImmunoResearch 713-165-003); and 3) CD45: donkey anti-mouse Alexa-488 (Jackson ImmunoResearch 715-545-150). Cell nuclei were counterstained with DAPI (5 μg/mL in PBS, Life Technologies). Slides were mounted using ProLong Gold Antifade Reagent (Life Technologies). Stained cells were imaged by fluorescence microscopy (TiE or Eclipse 90i, Nikon) using the appropriate filter cubes for image acquisition and the BioView platform for automated image analysis. All candidate CECs detected were reviewed and scored based on intact morphology, localization of CEC markers (WS CK Alexa-647 and/or GPC3 Cy3) with DAPI nuclear counterstain, and absence of leukocyte markers (CD45 Alexa-488).
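Because the antibodies are dosed per white blood cell, the mass to add depends on the WBC count of each sample. The Python sketch below works through that arithmetic using the 100 fg/WBC and 37.5 fg/WBC doses stated above; the WBC count of 6 million cells per mL and the function name are assumptions made for the example.

```python
FG_PER_UG = 1e9  # femtograms per microgram


def antibody_mass_ug(
    wbc_per_ml: float, blood_volume_ml: float, dose_fg_per_wbc: float
) -> float:
    """Mass of antibody (in micrograms) needed for a per-WBC dose."""
    total_wbc = wbc_per_ml * blood_volume_ml
    return total_wbc * dose_fg_per_wbc / FG_PER_UG


# Assumed sample: 10 mL of whole blood at 6 million WBCs per mL.
cd45_ug = antibody_mass_ug(6e6, 10.0, dose_fg_per_wbc=100.0)
cd66b_ug = antibody_mass_ug(6e6, 10.0, dose_fg_per_wbc=37.5)
```

Under these assumptions the sample would call for about 6 μg of the CD45 antibody and 2.25 μg of the CD66b antibody.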
HepG2 cells were cultured following American Type Culture Collection-recommended culturing conditions. Individual cells were micropipetted using an Eppendorf TransferMan NK2 micromanipulator and introduced into 4 mL of blood from healthy donors, before processing through the iChip device.
The iChip device-processed blood sample aliquot was pelleted and flash frozen in RNAlater (Thermo-Fisher Scientific) at −80 °C. RNA was extracted (RNeasy Micro, Qiagen) and processed as follows for RNA-seq. Amplified cDNA was generated from RNA from each sample using the SMARTer Ultra Low Input RNA Kit (v3 or v4) for Sequencing (Clontech Laboratories) according to the manufacturer's protocol. Briefly, 1 μl of a 1:50,000 dilution of ERCC RNA Spike-In Mix (Life Technologies) was added to each sample. First-strand synthesis of RNA molecules was performed using the poly-dT-based 3′-SMART CDS primer II A followed by extension and template switching by the reverse transcriptase. The second strand synthesis and amplification PCR were run for 18 cycles, and the amplified cDNA was purified with a 1× Agencourt AMPure XP bead cleanup (Beckman Coulter). The Nextera® XT DNA Library Preparation kit (Illumina) was used for sample barcoding and fragmentation according to the manufacturer's protocol. 1 ng of amplified cDNA was used for the enzymatic tagmentation followed by 12 cycles of amplification and unique dual-index barcoding of individual libraries. PCR product was purified with a 1.8× Agencourt AMPure XP bead cleanup. The eluted cDNA libraries did not undergo the bead-based library normalization step in the Nextera XT protocol. Library validation and quantification were performed by quantitative PCR using the KAPA SYBR® FAST Universal qPCR Kit (Kapa Biosystems). The individual libraries were pooled at equal concentrations, and the pool concentration was determined using the KAPA SYBR® FAST Universal qPCR Kit. The pool of libraries was subsequently sequenced in three replicates on a HiSeq 2500 in Rapid Run Mode using a 2×100 base pair kit and a dual flow cell. The paired-end reads from the three sequencing runs were combined and aligned to the hg38 genome from http://genome.ucsc.edu using the STAR v2.4.0h aligner with default settings. Reads that did not map or mapped to multiple locations were discarded. Duplicate reads were marked using the MarkDuplicates tool in picard-tools-1.8.4 and were removed. The uniquely aligned reads were counted using htseq-count in the intersection-strict mode against the publicly available Homo_sapiens.GRCh38.79.gtf annotation table. Data were then imported into the R statistical programming language for analysis. All RNA-seq raw data have been submitted to NCBI GEO under accession GSE117623.
For a subset of HCC patients, the iChip device-processed blood sample was divided into two equal aliquots: one aliquot was pelleted and flash frozen as above; the second was flow sorted to isolate subtypes of contaminating white blood cells (monocytes, granulocytes, NK cells, cytotoxic T cells, helper T cells, and B cells). Cells were fixed with Cytofix (BD Biosciences 554655). The following antibodies were used: CD45 (Beckman Coulter IM0782U), CD56 (Beckman Coulter IM2073U), CD16 (Biolegend 360712), CD14 (Biolegend 301808), CD3 (Biolegend 317330), CD19 (Biolegend 302216), CD4 (Biolegend 300556), CD8 (Biolegend 301016), CD66b (Biolegend 305112). As described above, flow sorted cells were pelleted, flash frozen in RNAlater, and subjected to RNA-seq.
The RNA-seq raw data consisted of read counts for 59,074 transcripts on 64 CLD and 52 HCC samples. Of those, only samples with more than 250,000 total reads were kept, leaving 44 CLD and 39 HCC samples. In order to narrow the list of features in our data set to those with a higher likelihood of relevance for predicting HCC status, RNA-seq expression data were obtained from The Cancer Genome Atlas (TCGA) liver cancer project (LIHC), which contains expression counts for both normal liver and HCC tissue. A differential expression analysis was performed on this data set to identify transcripts overexpressed in HCC vs. normal liver tissue using the DESeq2 package (version 1.16.1) with Benjamini-Hochberg correction for multiple hypothesis testing in R. Using this analysis combined with RNA-seq data on bulk white blood cell (WBC) subsets obtained via flow sorting, a list was constructed of transcripts with an adjusted p-value <0.05, a log2 fold change >2, expression <50 reads per million (RPM) in the summed WBC subsets, and a mean expression in healthy liver tissue >0.5 RPM. This list was used to narrow the 59,074 features in the raw data set to a set of 248 transcripts more likely to be predictive of HCC.
The set of 248 transcripts was: ACTG2, ADM2, AFP, AGR2, AKR1B10, ALDH3A1, ALPK3, AMIGO3, ANKRD65, ANLN, AP1M2, APOBEC3B, ARHGAP11A, ARHGEF39, ASF1B, ASPHD1, ASPM, AURKA, AXIN2, BAIAP2L2, BEX2, C15orf48, C1orf106, C1QTNF3, C6orf223, CA12, CA9, CAMK2N2, CAP2, CBX2, CCDC170, CCDC28B, CCDC64, CCNA2, CCNB1, CCNE2, CCNF, CD109, CD34, CDC20, CDC25A, CDC6, CDC7, CDCA5, CDCA8, CDH13, CDK1, CDKN2A, CDKN2C, CDT1, CELF6, CENPF, CENPH, CENPL, CENPU, CENPW, CKB, CNNM1, COL15A1, COL4A5, COL7A1, COL9A2, CRIP3, CSPG4, CTNND2, CXorf36, CYP17A1, DLK1, DMKN, DSCC1, DTL, DUOX2, E2F1, ECT2, EEF1A2, EFNA3, EPHB2, EPPK1, ETV4, EZH2, F2RL3, FABP4, FAM111B, FAM3B, FAM83D, FANCD2, FANCI, FBXL18, FBXO32, FERMT1, FGF19, FLNC, FLVCR1, FMO1, FOXD2-AS1, FOXM1, FXYD2, GABRE, GAL3ST1, GCNT3, GINS1, GJC1, GMNN, GNAZ, GOLGA2P7, GPC3, GPR64, GPSM1, HRCT1, IGF2BP2, IGSF1, IGSF3, IQGAP3, ITGA2, ITPKA, KIAA0101, KIF11, KIFC1, KIFC2, KNTC1, KRT23, LAMA3, LEF1, LGR5, LINC00152, LINGO1, LPL, LRRC1, LYPD1, MAD2L1, MAGED4, MAGED4B, MAPK12, MAPK8IP2, MAPT, MCM2, MDGA1, MDK, MFAP2, MISP, MKI67, MMP11, MNS1, MPZ, MSC, MSH5, MTMR11, MUC13, MUC5B, MYBL2, MYH4, NAALADL1, NAV3, NCAPG, NDUFA4L2, NEB, NKD1, NMB, NOTCH3, NOTUM, NPM2, NQO1, NRCAM, NT5DC2, NTS, NUSAP1, OBSCN, OLFML2A, OLFML2B, OSBP2, PAQR4, PDZK1IP1, PEG10, PI3, PLCE1, PLCH2, PLK1, PLVAP, PLXDC1, PLXNB3, PODXL2, POLE2, PPAP2C, PRC1, PTGES, PTGFR, PTHLH, PTK7, PTP4A3, PTTG1, PYCR1, RACGAP1, RBM24, RECQL4, RHBG, RNF157, ROBO1, RP4-800G7.2, RPS6KL1, RRM2, S100A1, SCGN, SEPT5, SERPINA12, SEZ6L2, SFN, SGOL2, SLC22A11, SLC51B, SLC6A2, SLC6A8, SLC6A9, SNCG, SOAT2, SP5, SPARCL1, SPINK1, SPP1, STIL, STK39, SULT1C2, TCF19, TDGF1, TESC, THY1, TK1, TMC5, TMEM132A, TMEM150B, TNFRSF19, TNFRSF25, TONSL, TOP2A, TPX2, TRIM16, TRIM16L, TRIM31, TRIM45, TTC39A, UBD, UBE2C, UBE2T, UGT2B11, USH1C, VSIG10L, WDR62, WDR76, and ZWINT.
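The four selection criteria used to arrive at this transcript list can be expressed as a simple filter. The Python sketch below applies the stated cut-offs to per-transcript summary statistics; the class and function names, the placeholder transcript names, and all numeric values in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranscriptStats:
    name: str
    adjusted_p: float         # adjusted p-value, HCC vs. normal liver
    log2_fold_change: float   # HCC vs. normal liver
    wbc_rpm: float            # expression summed over the sorted WBC subsets
    healthy_liver_rpm: float  # mean expression in healthy liver tissue


def passes_filter(t: TranscriptStats) -> bool:
    """Apply the four cut-offs stated in the text."""
    return (
        t.adjusted_p < 0.05
        and t.log2_fold_change > 2
        and t.wbc_rpm < 50
        and t.healthy_liver_rpm > 0.5
    )


def select_features(stats: List[TranscriptStats]) -> List[str]:
    return [t.name for t in stats if passes_filter(t)]


# Hypothetical statistics for three placeholder transcripts.
candidates = [
    TranscriptStats("GENE_A", 0.001, 3.2, 4.0, 1.1),    # passes all four cut-offs
    TranscriptStats("GENE_B", 0.001, 3.5, 120.0, 2.0),  # too abundant in WBCs
    TranscriptStats("GENE_C", 0.20, 2.4, 1.0, 0.9),     # not significant
]
selected = select_features(candidates)
```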
The final data set used in all analyses consisted of log2 (1+RPM) for the 248 transcripts and 83 samples identified as described above. Ten iterations of 10-fold cross-validation were implemented in order to evaluate the performance of the classification algorithm, which is described step-by-step below:
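As a rough illustration of the data transform and resampling scheme named in this paragraph (and not of the classification algorithm itself), the Python sketch below computes log2(1+RPM) from a raw count and generates ten repeats of 10-fold splits over 83 samples. The function names, the random seed, and the way folds are assigned are assumptions.

```python
import math
import random
from typing import Iterator, List, Tuple


def log2_rpm(count: int, total_reads: int) -> float:
    """log2(1 + RPM), where RPM is reads per million reads in the sample."""
    rpm = count * 1_000_000 / total_reads
    return math.log2(1 + rpm)


def repeated_kfold(
    n_samples: int, n_folds: int = 10, n_repeats: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for each fold of each repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a new random partition for every repeat
        for fold in range(n_folds):
            test = order[fold::n_folds]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


splits = list(repeated_kfold(n_samples=83))
```

With 83 samples, each repeat holds out every sample exactly once, in test folds of 8 or 9 samples.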
CECs were first detected by immunofluorescence (IF). Blood samples were obtained from 10 healthy blood donors, 39 CLD patients who were undergoing routine clinical surveillance for HCC but had no evidence of HCC, 54 patients with HCC, and 10 HCC patients who underwent curative treatment and had no clinical evidence of disease (NED) (see Tables 1-4). The iChip device performed size-based exclusion of red blood cells, platelets, and plasma, followed by magnetophoresis of labelled white blood cells (WBCs) (as described in Ozkumur E, et al. Sci Transl Med 2013; 5:179ra47) (see
RNA-sequencing (RNA-seq) was performed to detect CECs. To determine the sensitivity of this approach, 0, 1, 3, 5, 10, or 50 HepG2 HCC cells were spiked into 4 mL of healthy donor blood and processed through the iChip device for RNA-seq. HepG2 specific gene expression was detectable in whole blood from a single cell (see
To show that CECs may phenotypically differ depending on the underlying disease state, gene expression profiling was performed to identify qualitative rather than quantitative differences between CECs in the setting of CLD versus HCC (see
The cross-validated classifier provided excellent separation between CLD and HCC samples, with a sensitivity (i.e., true positive rate) of 85% at a specificity (i.e., true negative rate) of 95% and with identification of both early and late stage HCC (by Milan criteria) (see
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This invention was made with Government support under Grant Nos. DK007191, EB012493, CA172738, and DK078772 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US19/50532 | 9/11/2019 | WO | 00 |
Number | Date | Country |
---|---|---|
62729787 | Sep 2018 | US |