The present invention relates to the field of pharmaceutical determination, and in particular to an integrated biomarker system for evaluating a risk of impaired fasting glucose (IFG) and type 2 diabetes mellitus (T2DM).
Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease. Impaired fasting glucose (IFG) is a type of prediabetes in which the fasting blood glucose lies between the normal value and the T2DM threshold. Generally, T2DM is an irreversible, lifelong disease, while IFG is reversible. The rate of conversion from IFG to diabetes mellitus may be reduced by strict diet control, more exercise and other lifestyle interventions. A national survey published in The New England Journal of Medicine by Professor Yang Wenying in 2007 shows that the number of diabetic patients in China is nearly 100 million. The Global Diabetes Report, issued by the World Health Organization for the first time in 2016, shows that about 500 million adults are in the prediabetic phase, but the diagnostic rate of prediabetes is low and most people do not yet know that they are prediabetic. The 1999 World Health Organization diagnostic criteria for IFG and T2DM are based on fasting blood glucose, but fasting blood glucose has reduced diagnostic sensitivity when a subject is about to develop IFG or T2DM. Therefore, it is crucial to explore biomarkers that improve diagnostic sensitivity for IFG and T2DM, which is of great significance for the early diagnosis of IFG and T2DM, the early intervention of IFG, and the prevention and control of T2DM.
Metabolites not only reflect changes in the genome and proteome but are also influenced by other factors, such as environmental factors and intestinal flora. Moreover, metabolites are more dynamic and therefore reflect changes in an organism more sensitively. Chinese patent CN104769434B discloses that the metabolites glycine, lysophosphatidyl choline and acetyl carnitine C2 may be used for identifying a tendency to develop T2DM in a subject. However, the biomarkers for the diagnosis of IFG and T2DM remain isolated and scattered. Most studies are based on single-center, non-targeted metabolomics and thus have low reproducibility, which makes it difficult to realize the clinical application value of a biomarker. In terms of systems biology, a plurality of metabolites are correlated with one another. Therefore, it is of practical application value to use a plurality of quantified metabolites as biomarkers for the diagnosis of IFG and T2DM. An integrated biomarker system is a characteristic change spectrum formed by integrating the biomarkers of a disease, and is a true composite response of the variation trends of important in vivo metabolites and bio-network association signals. However, no integrated biomarker system for IFG and T2DM patients has been studied or established up to now.
In view of this, the present invention is provided herein.
The present invention provides an integrated biomarker system for evaluating a risk of impaired fasting glucose (IFG) and type 2 diabetes mellitus (T2DM); the integrated biomarker system includes quantitative determination results of L-glutamine within a range of 2,000-160,000 ng/mL, L-valine within a range of 1,200-96,000 ng/mL, L-leucine within a range of 1,000-80,000 ng/mL, L-lysine within a range of 800-64,000 ng/mL, L-proline within a range of 800-64,000 ng/mL, L-phenylalanine within a range of 500-40,000 ng/mL, L-arginine within a range of 500-40,000 ng/mL, L-glutamic acid within a range of 500-40,000 ng/mL, L-isoleucine within a range of 300-24,000 ng/mL, L-methionine within a range of 250-20,000 ng/mL, L-carnitine within a range of 200-16,000 ng/mL, acetyl-L-carnitine within a range of 80-6,400 ng/mL, lysophosphatidyl choline (LPC (P-16:0)) within a range of 60-4,800 ng/mL, LPC (17:0) within a range of 60-4,800 ng/mL, LPC (14:0) within a range of 40-3,200 ng/mL and propionyl-L-carnitine within a range of 4-320 ng/mL in a sample.
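The ranges listed above can be captured in a small lookup, shown here as a minimal sketch rather than as part of the claimed system. Every upper bound in the list is 80 times its lower bound, so only the lower bounds are stored; the names `LOWER_NG_PER_ML`, `quantitation_range` and `out_of_range` are illustrative assumptions.

```python
# Lower bounds (ng/mL) copied from the ranges listed in the text.
LOWER_NG_PER_ML = {
    "L-glutamine": 2000,
    "L-valine": 1200,
    "L-leucine": 1000,
    "L-lysine": 800,
    "L-proline": 800,
    "L-phenylalanine": 500,
    "L-arginine": 500,
    "L-glutamic acid": 500,
    "L-isoleucine": 300,
    "L-methionine": 250,
    "L-carnitine": 200,
    "acetyl-L-carnitine": 80,
    "LPC (P-16:0)": 60,
    "LPC (17:0)": 60,
    "LPC (14:0)": 40,
    "propionyl-L-carnitine": 4,
}
RANGE_FOLD = 80  # each listed upper bound is 80 times its lower bound


def quantitation_range(metabolite):
    """Return (lower, upper) bounds in ng/mL for one metabolite."""
    low = LOWER_NG_PER_ML[metabolite]
    return low, low * RANGE_FOLD


def out_of_range(results):
    """Return the metabolites whose measured value lies outside its range."""
    flagged = []
    for name, value in results.items():
        low, high = quantitation_range(name)
        if not low <= value <= high:
            flagged.append(name)
    return flagged
```

A call such as `out_of_range({"L-valine": 5000, "propionyl-L-carnitine": 400})` would flag only propionyl-L-carnitine, since 400 ng/mL exceeds its 320 ng/mL upper bound.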
Further, the sample is subject serum.
Further, the quantitative determination results are obtained by using Cell Free Amino Acid Mix (20 AA), O-acetyl-L-carnitine hydrochloride (N-methyl-D3) and lysophosphatidyl choline (20:0) (eicosacarbonyl-12,12,13,13-D4) as isotope internal standards for analysis.
Further, the integrated biomarker system further includes a model built by a machine learning method.
Further, the machine learning method is eXtreme Gradient Boosting (XGBoost).
Compared with the prior art, the present invention has the following advantages:
The present invention discloses, for the first time, an integrated biomarker system for evaluating a risk of IFG and T2DM. The integrated biomarker system for IFG and T2DM established by the present invention from subject serum samples contains a group of correlated biomarkers on a biological network pathway. It reflects the overall metabolic characteristics of IFG and T2DM and avoids the incomplete picture of a disease that results from analyzing a single biomarker or separate biomarkers. The quantitative integrated biomarker system provided by the present invention is derived from real-world clinical samples in a multi-center clinical study and is therefore more representative, thus improving the potential clinical application value of disease biomarkers. Further, the targeted quantitative detection method established in the present invention has high sensitivity, strong specificity and good reproducibility, requires only a small sample volume, and is simple to operate.
To further describe the technical means adopted by the present invention to achieve its predetermined goals, and the results obtained, preferred examples are used below to describe the embodiments, technical solution and features of the present application in detail. Specific features, structures or characteristics in the examples described below may be combined in any appropriate form.
Main materials and sources selected and used in the following examples of the present invention are respectively as follows:
The L-glutamine (batch No.: V900419), L-valine (batch No.: 94619), L-leucine (batch No.: 61819), L-lysine (batch No.: 23128), L-proline (batch No.: 81709), L-phenylalanine (batch No.: 852465P), L-arginine (batch No.: 11009-25G-F), L-glutamic acid (batch No.: 95436), L-isoleucine (batch No.: I2752), L-methionine (batch No.: 64319-25G-F), lysophosphatidyl choline (LPC (P-16:0)) (batch No.: 852464P), LPC (17:0) (batch No.: 855676P), LPC (14:0) (batch No.: 855575P) and propionyl-L-carnitine (batch No.: 91275) used in the analysis were purchased from Sigma-Aldrich; L-carnitine (batch No.: DRE-C11045500) was purchased from Beijing J&K Scientific Co., Ltd.; acetyl-L-carnitine hydrochloride (batch No.: DST190510-049) was purchased from Chengdu Desite Biotechnology Co., Ltd.; the isotope-labelled Cell Free Amino Acid Mix (20 AA) (U-D, 98%) (batch No.: DLM-6819-PK), O-acetyl-L-carnitine hydrochloride (N-methyl-D3, 98%) (batch No.: DLM-754-0.05) and LPC (20:0) (eicosacarbonyl-12,12,13,13-D4, 98%) (batch No.: DLM-10520-0.001) were purchased from Cambridge Isotope Laboratories; ammonium acetate (batch No.: E057G140) was purchased from CNW Technologies GmbH. The instruments and consumables used were: an ultra-performance liquid chromatography (UPLC) Quadrupole-Orbitrap high-resolution accurate-mass spectrometer (Thermo Fisher Scientific, Q-Exactive); a UPLC triple quadrupole mass spectrometer (Thermo Fisher Scientific, TSQ-Altis); a refrigerated micro-centrifuge (Thermo Fisher Scientific, Heraeus Fresco 17); a multi-purpose vortex mixer (Scientific Industries, Vortex Genie 2); 5 mL serum separation tubes (Becton, Dickinson and Company, 367955); and reversed phase columns (Waters, ACQUITY BEH C18 and ACQUITY BEH HILIC).
The sample for the integrated biomarker system in the present invention is from subject serum.
Subjects were recruited from 5 clinical centers in Beijing, Zhengzhou and Kaifeng, and serum samples were collected. To eliminate dietary interference, all subject serum samples were collected between 7:00 and 9:00 a.m. after overnight fasting. Peripheral venous blood of the subjects was collected in 5 mL serum separation tubes. After standing for 30 min, the blood was centrifuged for 10 min at 1510 g and 4° C. in a refrigerated high-speed centrifuge; 200 µL of supernatant was then aliquoted into labelled 1.5 mL EP tubes and stored in a -80° C. freezer until analysis. In total, 1132 serum samples were collected and used for the subsequent analysis.
Appropriate amounts of the standards L-glutamine, L-valine, L-leucine, L-lysine, L-proline, L-isoleucine, L-methionine, L-phenylalanine, L-arginine, L-glutamic acid, L-carnitine and Cell Free Amino Acid Mix (20 AA) were weighed and placed in separate 10 mL volumetric flasks, then dissolved in and diluted to volume with 10% methanol aqueous solution to prepare stock solutions, in which L-glutamine had a concentration of 4000 µg/mL; L-valine, L-leucine, L-lysine, L-proline, L-isoleucine and L-methionine had a concentration of 2000 µg/mL; L-phenylalanine, L-arginine, L-glutamic acid and L-carnitine had a concentration of 1000 µg/mL; and 20 AA had a concentration of 1000 µg/mL.
Appropriate amounts of LPC (P-16:0), LPC (17:0), LPC (14:0), propionyl-L-carnitine and LPC (20:0) (eicosacarbonyl-12,12,13,13-D4, 98%) (LPC (20:0)-d4) were weighed, then dissolved in and diluted to volume with acetonitrile aqueous solution (1:1, v:v) to prepare stock solutions in which LPC (P-16:0), LPC (17:0), LPC (14:0), propionyl-L-carnitine and LPC (20:0)-d4 each had a concentration of 100 µg/mL.
Appropriate amounts of acetyl-L-carnitine and O-acetyl-L-carnitine hydrochloride (N-methyl-D3, 98%) (acetyl-L-carnitine-d3) were weighed, then dissolved in and diluted to volume with 4% hydrochloric acid aqueous solution to prepare stock solutions in which acetyl-L-carnitine had a concentration of 100 µg/mL and acetyl-L-carnitine-d3 had a concentration of 100 µg/mL.
The prepared stock solutions were stored in a 4° C. refrigerator until use.
Appropriate volumes of the prepared stock solutions of 20 AA, acetyl-L-carnitine-d3 and LPC (20:0)-d4 were accurately pipetted into a 500 mL volumetric flask and diluted to volume with acetonitrile-methanol (3:1, v:v) solution to prepare an acetonitrile-methanol protein precipitant working solution containing the internal standards 20 AA, acetyl-L-carnitine-d3 and LPC (20:0)-d4 at concentrations of 10 µg/mL, 500 ng/mL and 25 ng/mL, respectively.
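The stock volumes are not stated in the text, but they follow from the stated stock and working concentrations through the dilution relation C(stock) × V(stock) = C(target) × V(final). The following is a minimal sketch of that arithmetic; the function name `stock_volume_ml` and the dictionary layout are illustrative assumptions.

```python
def stock_volume_ml(stock_ng_per_ml, target_ng_per_ml, final_volume_ml):
    """Stock volume needed so that C_stock * V_stock = C_target * V_final."""
    if target_ng_per_ml > stock_ng_per_ml:
        raise ValueError("target concentration exceeds stock concentration")
    return target_ng_per_ml * final_volume_ml / stock_ng_per_ml


FINAL_VOLUME_ML = 500  # size of the volumetric flask stated in the text

# (stock concentration, working concentration), both in ng/mL, from the text
INTERNAL_STANDARDS = {
    "20 AA": (1_000_000, 10_000),             # 1000 ug/mL -> 10 ug/mL
    "acetyl-L-carnitine-d3": (100_000, 500),  # 100 ug/mL -> 500 ng/mL
    "LPC (20:0)-d4": (100_000, 25),           # 100 ug/mL -> 25 ng/mL
}

volumes_ml = {
    name: stock_volume_ml(stock, target, FINAL_VOLUME_ML)
    for name, (stock, target) in INTERNAL_STANDARDS.items()
}
```

Under these concentrations the sketch gives 5 mL of the 20 AA stock, 2.5 mL of the acetyl-L-carnitine-d3 stock and 0.125 mL of the LPC (20:0)-d4 stock per 500 mL of working solution.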
Because blank human serum is difficult to obtain, 1x phosphate buffer solution was used in place of blank serum as a blank control. An appropriate amount of the standard stock solutions was pipetted and serially diluted with 1x phosphate buffer solution to prepare standard curve working solutions at 7 concentration levels; QC samples at three concentrations (low, middle and high; LQC, MQC and HQC) were prepared and used for the subsequent quantitative analysis of the samples. The concentrations of the standard curve working solutions and QC samples are shown in Table 1.
Sample pretreatment: 10 µL of the prepared standard curve working solution or QC sample was accurately pipetted into a 1.5 mL centrifuge tube, 90 µL of serum sample was added for dilution, and the mixture was vortexed for 1 min; 300 µL of the acetonitrile-methanol protein precipitant working solution was added and the mixture was vortexed for 5 min; the mixture was then centrifuged for 10 min at 16,200 g and 4° C., and the supernatant was taken for the subsequent analysis.
Chromatographic conditions: a Waters ACQUITY BEH HILIC (100 mm × 2.1 mm, 1.7 µm) chromatographic column was used; mobile phase A was 0.1% formic acid aqueous solution containing 20 mmol/L ammonium acetate, and mobile phase B was acetonitrile containing 0.1% formic acid; the injection volume was 3 µL, the flow rate was 0.30 mL/min, and the column temperature was 40° C.; liquid phase elution procedure: mobile phase B started at 95% and was held for 2.0 min, decreased linearly to 60% at 4.0 min, was held for 6.0 min, then increased linearly to 95% within 0.2 min and was held for 1.8 min; the total analysis time was 12 min.
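The elution procedure can be read as a piecewise-linear gradient with breakpoints at 0, 2.0, 4.0, 10.0, 10.2 and 12.0 min. The sketch below encodes that reading so that the percentage of mobile phase B can be looked up at any time point; the names `GRADIENT` and `percent_b` are illustrative assumptions.

```python
# (time in min, % mobile phase B) breakpoints read from the elution procedure
GRADIENT = [
    (0.0, 95.0),   # start at 95% B
    (2.0, 95.0),   # held for 2.0 min
    (4.0, 60.0),   # linear decrease to 60% at 4.0 min
    (10.0, 60.0),  # held for 6.0 min
    (10.2, 95.0),  # linear increase to 95% within 0.2 min
    (12.0, 95.0),  # held for 1.8 min; total run time 12 min
]


def percent_b(t_min):
    """Linearly interpolate the percentage of mobile phase B at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-12 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min):
    """Mobile phase A makes up the remainder of the binary mixture."""
    return 100.0 - percent_b(t_min)
```

For example, halfway through the decreasing ramp (3.0 min) the sketch returns 77.5% B, and the segment durations sum to the stated 12 min.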
Mass spectrometry conditions: electrospray ionization was operated in positive ion mode (ESI+), and the monitoring mode was selected reaction monitoring. The spray voltage was 3.5 kV, the collision gas was high-purity nitrogen, and the auxiliary gas had a flow rate of 17 L/min; the ion transfer tube had a temperature of 325° C., and the vaporizer had a temperature of 320° C. The sheath gas had a flow rate of 20 L/min.
Six serum samples obtained in Example I were drawn at random and pretreated by the above pretreatment method; meanwhile, six pretreated blank controls and six pretreated 1x phosphate buffer solution samples were prepared, and all of the above samples were analyzed. The results are shown in
Results of the lower limit of quantitation (LLOQ), limit of detection (LOD), linearity and concentration range and precision are shown in Table 2. The metabolites show good linearity (correlation coefficient R value is greater than 0.99) within the prepared concentration range; the intra-day precision relative standard deviation (RSD) of the surveyed 6 batches of LQC, MQC and HQC is 2.08%-11.87%; and inter-day precision RSD is 1.68%-11.23%.
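The two statistics quoted above, the relative standard deviation of replicate QC measurements and the correlation coefficient of the calibration line, can be computed as in the following minimal sketch. The sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return stdev(values) / mean(values) * 100


def pearson_r(x, y):
    """Pearson correlation coefficient between concentration and response."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def is_linear(x, y, threshold=0.99):
    """Apply the R > 0.99 criterion stated in the text."""
    return pearson_r(x, y) > threshold
```

With hypothetical replicate values of 98, 100 and 102, `rsd_percent` gives 2%, which would fall inside the precision ranges reported above.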
Results of the accuracy, extraction recovery rate and matrix effect are shown in Table 3; the intra-day accuracy relative error (RE) of the LQC, MQC and HQC is -13.33% to 13.72%; the inter-day accuracy RE is -13.30% to 13.18%; the average extraction recovery rate of the 16 metabolites at the LQC and HQC sample concentrations is 68.68%-129.87%; and the average matrix effect is 74.54%-142.93%.
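The text does not give the formulas behind these three quantities. The sketch below uses definitions that are common in bioanalytical method validation and should be read as assumptions: RE compares the measured concentration with the nominal one, extraction recovery compares the response of a sample spiked before extraction with one spiked after extraction, and matrix effect compares the post-extraction spiked response with that of a neat solution.

```python
def relative_error_percent(measured, nominal):
    """Accuracy as relative error (%): deviation from the nominal value."""
    return (measured - nominal) / nominal * 100


def extraction_recovery_percent(pre_spiked_response, post_spiked_response):
    """Assumed definition: pre-extraction spike over post-extraction spike."""
    return pre_spiked_response / post_spiked_response * 100


def matrix_effect_percent(post_spiked_response, neat_response):
    """Assumed definition: post-extraction spike over neat standard solution."""
    return post_spiked_response / neat_response * 100
```

Under these definitions a value of 100% means complete recovery or no matrix effect, and values above 100% indicate signal enhancement.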
Results of the stability are shown in Table 4. When the metabolites at the LQC, MQC and HQC concentrations were kept in the autosampler for 24 h, the stability RSD was 0.85%-9.78%; when they were kept in a 4° C. refrigerator for 24 h, the stability RSD was 0.97%-10.20%; under a 5-fold dilution condition, the RSD was 0.60%-5.72%, indicating that 5-fold dilution did not affect the determination of metabolite content in the serum samples. In the residual effect test, the residuals of the 16 metabolites in the blank samples were less than 20% of the LLOQ.
The above results prove that the selectivity, LLOQ and LOD, linearity and concentration range, precision and accuracy, extraction recovery rate and matrix effect, stability, dilution effect and residual effect of the targeted detection method used in this present invention accord with the requirements of the quantitative analysis method of serum biological samples.
The method in Example III was used to determine the 1132 samples collected in Example I. Normal glucose tolerance (NGT), IFG, T2DM and hyperlipidemia samples were used to build a model.
The sample data set was randomly divided into a training set and a test set by a 70-30 holdout method; the training set (232 NGT, 314 IFG, 230 T2DM and 96 hyperlipidemia samples) was used for training the model, and the test set (80 NGT, 97 IFG, 113 T2DM and 50 hyperlipidemia samples) was used for testing the model.
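A 70-30 holdout split of this kind can be sketched as a single random shuffle followed by a cut. The function name `holdout_split`, the fixed seed and the rounding rule are illustrative assumptions; the text does not state how the random division was implemented or whether it was stratified by group.

```python
import random


def holdout_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples once and cut them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Each sample ends up in exactly one of the two sets, so the test set is never seen during training.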
After the data were extracted with TraceFinder software, metabolite differences were analyzed with the Kruskal-Wallis test, and differences among multiple groups were adjusted by Bonferroni correction; Origin 2019 software was used to plot the targeted metabolite content of the training set and the test set. As shown in Table 4, the serum concentrations of the 16 targeted metabolites differ significantly in the training set and the test set. Each single metabolite was subjected to receiver operating characteristic (ROC) curve analysis, and the area under the curve (AUC) was used for performance evaluation. The results are shown in Table 5; a single metabolite has poor evaluation performance for the four types of samples. In terms of systems biology, it is of higher value to use a plurality of associated metabolites as biomarkers for the evaluation of disease risk. Therefore, machine learning methods were used to establish an evaluation model of the IFG and T2DM integrated biomarker system with the 16 targeted metabolites.
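The group comparison described above can be illustrated with a standard-library sketch of the Kruskal-Wallis H statistic and a Bonferroni adjustment. This is not the software used in the study. The p-value helper uses the closed form of the chi-square upper tail for 3 degrees of freedom, so it applies only to the four-group case (NGT, IFG, T2DM, hyperlipidemia); all function names are illustrative assumptions.

```python
from collections import Counter
from math import erfc, exp, pi, sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


def p_value_four_groups(h):
    """Chi-square upper tail with 3 degrees of freedom (four groups only)."""
    return erfc(sqrt(h / 2)) + sqrt(2 * h / pi) * exp(-h / 2)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)
```

The number of tests passed to `bonferroni` depends on how many comparisons are made, which the text does not state.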
Further, to select a suitable method for building the evaluation model of the IFG and T2DM integrated biomarker system, AUC on the test set was used as the evaluation index to compare three machine learning methods: eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR) and Support Vector Machine (SVM). The results are shown in
To improve the specificity and sensitivity of the evaluation model, the metabolites were ranked by importance using Gini impurity, mutual information and analysis of variance, and the optimal metabolite subset was determined by an incremental feature selection strategy. The results are shown in
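An incremental feature selection strategy of the kind named above can be sketched as follows: features are added one at a time in ranked order, each growing subset is scored, and the best-scoring subset is kept. The scoring function `score_subset` is a hypothetical placeholder; in practice it would train and evaluate the model on the given subset, which is not reproduced here.

```python
def incremental_feature_selection(ranked_features, score_subset):
    """Grow the subset in ranked order and keep the best-scoring prefix.

    ranked_features: feature names ordered from most to least important.
    score_subset: callable taking a list of features and returning a score
                  (higher is better); hypothetical, supplied by the caller.
    """
    best_subset, best_score = [], float("-inf")
    current = []
    for feature in ranked_features:
        current.append(feature)
        score = score_subset(current)
        if score > best_score:  # ties keep the smaller, earlier subset
            best_subset, best_score = list(current), score
    return best_subset, best_score
```

Because only strictly better scores replace the current best, the sketch prefers the smallest subset among equally good ones.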
The test set was used to evaluate the performance of the model; AUC, accuracy, sensitivity, specificity, precision and F1 score were used for evaluation. The results are shown in Table 5.
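The evaluation indices named above can be computed from a binary confusion matrix and from predicted scores, as in the following minimal sketch. Treating each comparison as a binary problem (for example, T2DM versus NGT) is an assumption, the function names are illustrative, and the example counts mentioned afterwards are hypothetical rather than study data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F1 from counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive-negative pairs ranked correctly.

    A tied pair counts as half a correct ranking.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

With hypothetical counts of 30 true positives, 10 false positives, 50 true negatives and 10 false negatives, the sketch gives an accuracy of 0.80 and a sensitivity, precision and F1 score of 0.75 each.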
It can be seen from the data of Table 5 that the model has an accuracy of 85% for distinguishing T2DM from NGT, and accuracies of 75% and 89% for distinguishing T2DM from IFG and T2DM from hyperlipidemia, respectively. Therefore, the model may be used for evaluating the risk of NGT, IFG, T2DM and hyperlipidemia.
To visualize the integrated biomarker system of IFG and T2DM, the original data were normalized with the following formula: B(i) = (B(c) - B(min)) / (B(max) - B(min)) × 100, where B(i) is the value of the biomarker after normalization, B(c) is the concentration of the biomarker before normalization, and B(min) and B(max) are the minimum and maximum concentrations of the biomarker before normalization. After normalization, the mean value ± standard deviation (mean ± SD) of B(i) was calculated and used for plotting. The results are shown in
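The normalization B(i) = (B(c) - B(min)) / (B(max) - B(min)) × 100 and the mean ± SD summary can be sketched as below. The use of the sample standard deviation is an assumption, since the text does not specify the estimator, and the function names are illustrative.

```python
from statistics import mean, stdev


def normalize(concentrations):
    """Min-max scale one biomarker's concentrations to the range 0-100."""
    b_min, b_max = min(concentrations), max(concentrations)
    if b_max == b_min:
        raise ValueError("all concentrations are equal; cannot normalize")
    return [(b_c - b_min) / (b_max - b_min) * 100 for b_c in concentrations]


def mean_sd(values):
    """Mean and sample standard deviation used for plotting mean ± SD."""
    return mean(values), stdev(values)
```

For hypothetical concentrations of 200, 300 and 400 ng/mL, the normalized values are 0, 50 and 100, so every biomarker is plotted on the same 0-100 scale regardless of its absolute concentration.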
Furthermore, a schematic diagram of representative sample evaluation results is also presented, as shown in
What is described above are merely preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art based on the technical solution and improvement concept of the present invention within the technical scope disclosed herein shall be covered within the protection scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202110144115.8 | Feb 2021 | CN | national |
This application is the national phase entry of International Application No. PCT/CN2021/089772, filed on Apr. 26, 2021, which is based upon and claims priority to Chinese Patent Application No. 202110144115.8, filed on Feb. 03, 2021, the entire contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2021/089772 | 4/26/2021 | WO |