The present invention relates to salivary biomarkers for cancers, methods and devices for assaying the same, and methods for determining the salivary biomarkers for cancer. In particular, the present invention relates to salivary biomarkers to differentiate pancreatic cancer, intraductal papillary mucinous neoplasm (IPMN), breast cancer, and oral cancers from healthy controls, and methods and devices for assaying these biomarkers, and methods for determining these salivary biomarkers.
Treatment of pancreatic cancer, one of the most malignant cancers with a poor prognosis, remains difficult. The median survival time is less than one year for pancreatic cancer patients who do not undergo adjuvant therapies, such as chemotherapy and radiotherapy. Thus, detection of pancreatic cancer at an early stage is the only available way to improve the prognosis, which indicates the need to develop novel methods for detecting the cancer minimally or non-invasively using a biological sample (body fluid, etc.).
One of the present inventors previously proposed serum biomarkers to detect liver diseases in Patent Literatures 2 and 3.
Large-molecule biomarkers for early detection of pancreatic cancer using blood, serum, and plasma samples have been intensively developed (Patent Literatures 4 and 5). For example, carbohydrate antigen 19-9 (CA19-9) is already commonly used as a tumor marker to detect pancreatic cancers and biliary tract cancers as well as to evaluate the effects of chemotherapy. However, early detection of pancreatic cancer using this marker is difficult, and its accuracy for cancer screening is insufficient (Non-Patent Literature 1). In addition, CA19-9 levels do not increase in Lewis-negative patients even in the advanced stage. A pancreatic cancer associated antigen (DUPAN-2 antigen) and a carcinoembryonic antigen (CEA) are also used. However, DUPAN-2 shows low specificity because this marker increases not only in pancreatic cancer but also in biliary tract and liver cancers. CEA also shows low specificity and tests positive for other cancers of the digestive system, e.g. esophageal cancer and gastric cancer. Therefore, these markers are not specific to pancreatic cancer. Further, these two markers have not been widely used due to their cost.
Polyamines, such as spermine (Spermine), and acetylated polyamines, such as N8-acetylspermidine (N8-Acetylspermidine), N1-acetylspermidine (N1-Acetylspermidine), and N1-acetylspermine (N1-Acetylspermine), are known as metabolite biomarkers for various cancers in blood and urine (Non-Patent Literature 2). In the metabolic pathway, arginine is metabolized to ornithine, which is then metabolized through putrescine to polyamines. The synthesis of polyamines is relatively activated near the surface of tumor tissue, where oxygen is available, and is less activated under the hypoxic conditions in the center of the tumor. Despite these heterogeneous conditions within the tumor tissue, the overall concentration of polyamines in the tumor tissue increases, and a part of these metabolites is transferred to the blood vessels. For example, an increase in the concentration of spermidine in blood is known in patients with breast cancer, prostate cancer, and testicular tumors (Non-Patent Literature 1). A decrease in the concentrations of spermine and spermidine in blood has been reported for acute pancreatitis in animal experiments (Non-Patent Literature 3).
Conventional screening for protein markers in blood, serum, or plasma is insufficient for early detection of pancreatic cancer. Although blood-based tests are minimally invasive, professionals such as medical doctors and nurses are required to handle the syringe, so the frequency of testing is limited. In contrast, saliva offers definite advantages: collection is completely non-invasive and can be done anywhere, which makes frequent testing and self-sampling possible. For example, salivary biomarkers for lung cancer detection were proposed in Patent Literature 6. As mentioned, because early detection of pancreatic cancer using currently known biomarkers is difficult, frequent salivary testing is the only method to increase the possibility of detecting this cancer at earlier stages.
Detection of pancreatic cancer using mRNA profiles in saliva was proposed in Non-Patent Literature 4.
However, quantification of mRNA requires complex sample processing and the addition of an RNase inhibitor to saliva immediately after collection to prevent mRNA degradation. Because of the low reproducibility of microarray-based quantification of mRNA, quantitative PCR (qPCR) is usually used to validate a marker's quantified values. However, each qPCR assay can profile only one marker, which limits simultaneous quantification of multiple markers. For example, only 35 substances are quantitatively determined in Non-Patent Literature 4. Such a limited panel cannot capture a holistic view of salivary molecular characteristics; e.g. the overall variation of salivary concentration cannot be determined. Therefore, highly accurate prediction using only a few markers becomes difficult. In addition, the complexity of sample processing for qPCR may increase artificial noise levels. Therefore, simple methods for quantifying salivary molecules that minimize possible artificial noise are preferable. Taken together, what is required is not only the exploration of novel biomarkers but also the development of combination techniques for accurately detecting subjects with various cancers, such as pancreatic, breast, and oral cancers, and benign diseases including intraductal papillary mucinous neoplasm (IPMN).
The present invention addresses these problems. An object of the present invention is the early detection of cancer such as pancreatic cancer, breast cancer, and oral cancer using saliva.
The present inventors identified multiple metabolite biomarkers in saliva to discriminate patients with pancreatic cancers from healthy controls. Capillary electrophoresis-mass spectrometry (CE-MS) may be used to simultaneously quantify these metabolite markers. The inventors also developed combinations of these biomarkers to realize accurate discrimination. Although saliva samples should be collected carefully to reduce diurnal variation, it is difficult to eliminate such variation completely. Therefore, the inventors also found normalization metabolites for estimating the total concentration of the metabolites in saliva, and developed algorithms that combine metabolite markers and normalization metabolites for more accurate detection of subjects with pancreatic cancers.
Further, the inventors have found markers for breast cancer and oral cancer following similar procedures.
The present invention is based on the aforementioned research results. Salivary biomarkers and their combinations have the potential to solve the aforementioned problems.
Herein, salivary metabolite biomarkers and their combinations were developed to detect certain diseases, including pancreatic cancer, intraductal papillary mucinous neoplasm (IPMN), breast cancer, and oral cancer.
Absolute concentration and the combination of the following salivary metabolite biomarkers can be used for detecting patients with pancreatic disease: N-acetylputrescine (N-Acetylputrescine), adenosine (Adenosine), 3-phospho-D-glyceric acid (3PG), urea (Urea), o-acetylcarnitine (o-Acetylcarnitine), citric acid (Citrate), glycyl-glycine (Gly-Gly), 5-aminovaleric acid (5-Aminovalerate), 4-methyl 2-oxopentanoate (2-Oxoisopentanoate), malic acid (Malate), benzoate ester (Benzoate), fumaric acid (Fumarate), N-acetylaspartic acid (N-Acetylaspartate), inosine (Inosine), 3-methylhistidine (3-Methylhistidine), N1-acetylspermine (N1-Acetylspermine), creatine (Creatine), α-aminoadipic acid (alpha-Aminoadipate), phosphorylcholine (Phosphorylcholine), 2-hydroxypentanoate (2-Hydroxypentanoate), xanthine (Xanthine), succinic acid (Succinate), 6-phosphogluconic acid (6-Phosphogluconate), butanoic acid (Butanoate), homovanillic acid (Homovanillate), O-phosphoserine (O-Phosphoserine), trimethylamine-N-oxide (Trimethylamine N-oxide), piperidine (Piperidine), cystine (Cys), 2-isopropylmalic acid (2-Isopropylmalate), N8-acetylspermidine (N8-Acetylspermidine), N1-acetylspermidine (N1-Acetylspermidine), N-acetylneuraminic acid (N-Acetylneuraminate), glucosamine (Glucosamine), spermine (Spermine), agmatine (Agmatine), N-acetylhistamine (N-Acetylhistamine), methionine (Met), p-4-hydroxyphenylacetic acid (p-4-Hydroxyphenylacetate), N,N-dimethylglycine (N,N-Dimethylglycine), hypotaurine (Hypotaurine), glutamyl-glutamic acid (Glu-Glu), and N1,N12-diacetylspermine (N1,N12-Diacetylspermine).
Relative concentration, i.e. the absolute concentration divided by the concentration of the normalization metabolite, of the following salivary metabolite biomarkers can be used for detecting patients with pancreatic cancer or IPMN: N8-acetylspermidine (N8-Acetylspermidine), creatinine (Creatinine), spermine (Spermine), aspartic acid (Asp), N1-acetylspermidine (N1-Acetylspermidine), N1-acetylspermine (N1-Acetylspermine), cytidine (Cytidine), α-aminoadipic acid (alpha-Aminoadipate), cytosine (Cytosine), betaine (Betaine), urea (Urea), homovanillic acid (Homovanillate), N-acetylneuraminic acid (N-Acetylneuraminate), cystine (Cys), urocanic acid (Urocanate), fumaric acid (Fumarate), 1,3-diaminopropane (1,3-Diaminopropane), hypotaurine (Hypotaurine), nicotinic acid (Nicotinate), agmatine (Agmatine), valine (Val), 2-hydroxy-4-methylpentanoic acid (2-Hydroxy-4-methylpentanoate), alanyl-alanine (Ala-Ala), citric acid (Citrate), glucosamine (Glucosamine), carnosine (Carnosine), glycyl-glycine (Gly-Gly), 2-aminobutyric acid (2AB), arginine (Arg), N-acylglutamic acid (N-Acetylglutamate), glycerophosphoric acid (Glycerophosphate), phosphoenolpyruvic acid (PEP), isoleucine (Ile), adenosine (Adenosine), guanine (Guanine), dihydroxyacetonephosphoric acid (DHAP), and cadaverine (Cadaverine).
As an example combination, the absolute concentration of creatinine, N1-acetylspermidine, α-aminoadipic acid, N-acetylneuraminic acid, and 1,3-diaminopropane in saliva can be used for accurate pancreatic cancer detection. The prediction can be made by using another combination or changing the methodology of combination.
As the salivary biomarker for cancer used to detect breast cancer, the absolute concentration of the following substances or a combination thereof in saliva can be used: choline (Choline), 2-hydroxybutyric acid (2-Hydroxybutyrate), β-alanine (beta-Ala), 3-methylhistidine (3-Methylhistidine), α-aminobutyric acid (2AB), N-acetyl-β-alanine (N-Acetyl-beta-alanine), isethionic acid (Isethionate), N-acetylphenylalanine (N-Acetylphenylalanine), trimethyllysine (N6,N6,N6-Trimethyllysine), α-aminoadipic acid (alpha-Aminoadipate), creatine (Creatine), γ-butyrobetaine (gamma-Butyrobetaine), sarcosine (Sarcosine), pyruvic acid (Pyruvate), urocanic acid (Urocanate), piperidine (Piperidine), serine (Ser), homovanillic acid (Homovanillate), 5-oxoproline (5-Oxoproline), GABA (GABA), 5-aminovaleric acid (5-Aminovalerate), trimethylamine-N-oxide (Trimethylamine N-oxide), 2-hydroxyvaleric acid (2-Hydroxypentanoate), carnitine (Carnitine), isopropanolamine (Isopropanolamine), hypotaurine (Hypotaurine), lactic acid (Lactate), 2-hydroxy-4-methylpentanoic acid (2-Hydroxy-4-methylpentanoate), hydroxyproline (Hydroxyproline), butyric acid (Butanoate), adenine (Adenine), N6-acetyllysine (N-epsilon-Acetyllysine), 6-hydroxyhexanoic acid (6-Hydroxyhexanoate), propionic acid (Propionate), betaine (Betaine), N-acetylputrescine (N-Acetylputrescine), hypoxanthine (Hypoxanthine), crotonic acid (Crotonate), tryptophan (Trp), citrulline (Citrulline), glutamine (Gln), proline (Pro), 2-oxoisopentanoic acid (2-Oxoisopentanoate), 4-methylbenzoate (4-Methylbenzoate), 3-(4-hydroxyphenyl)propionic acid (3-(4-Hydroxyphenyl)propionate), cysteic acid (Cysteate), azelaic acid (Azelate), ribulose-5-phosphoric acid (Ru5P), pipecolinic acid (Pipecolate), phenylalanine (Phe), O-phosphoserine (O-Phosphoserine), malonic acid (Malonate), hexanoic acid (Hexanoate), and p-hydroxyphenylacetic acid (p-Hydroxyphenylacetate).
The aforementioned substances that are significant substances and that are not publicly known are indicated in Table 7 below.
A combination of β-alanine, N-acetylphenylalanine, and citrulline can be used as one example of a combination of salivary biomarkers for cancer to detect breast cancer. The prediction can be performed by using a different combination or changing the methodology of combination.
When a value is used in which the concentration of saliva was corrected, the following substances or a combination thereof may be used as a marker: choline (Choline), β-alanine (beta-Ala), 3-methylhistidine (3-Methylhistidine), α-aminobutyric acid (2AB), N-acetyl-β-alanine (N-Acetyl-beta-alanine), isethionic acid (Isethionate), N-acetylphenylalanine (N-Acetylphenylalanine), trimethyllysine (N6,N6,N6-Trimethyllysine), urocanic acid (Urocanate), piperidine (Piperidine), 5-aminovaleric acid (5-Aminovalerate), trimethylamine-N-oxide (Trimethylamine N-oxide), isopropanolamine (Isopropanolamine), hypotaurine (Hypotaurine), hydroxyproline (Hydroxyproline), N6-acetyllysine (N-epsilon-Acetyllysine), 6-hydroxyhexanoic acid (6-Hydroxyhexanoate), N-acetylputrescine (N-Acetylputrescine), azelaic acid (Azelate), dihydroxyacetonephosphoric acid (DHAP), glycolic acid (Glycolate), 4-methyl-2-oxopentanoic acid (4-Methyl-2-oxopentanoate), N-acetylaspartic acid (N-Acetylaspartate), glycerophosphoric acid (Glycerophosphate), 3-hydroxybutyric acid (3-Hydroxybutyrate), benzoic acid (Benzoate), adipic acid (Adipate), 2-isopropylmalate (2-Isopropylmalate), phosphorylcholine (Phosphorylcholine), N-acetylneuraminic acid (N-Acetylneuraminate), histamine (His), o-acetylcarnitine (o-Acetylcarnitine), N-acetylglucosamine 1-phosphate (N-Acetylglucosamine 1-phosphate), creatinine (Creatinine), arginine (Arg), and syringic acid (Syringate).
The aforementioned substances that are significant substances and that are not publicly known are indicated in Table 8 below.
A combination of N-acetylphenylalanine, N-acetylspermidine, and creatine can be used as one example of a combination of salivary biomarkers for cancer used to detect breast cancer. The prediction can be performed by using a different combination or changing the methodology of combination.
As the salivary biomarker for cancer used to detect oral cancer, the concentration of the following substances or a combination thereof in saliva can be used: glycyl-glycine (Gly-Gly), citrulline (Citrulline), γ-butyrobetaine (gamma-Butyrobetaine), 3-phenyllactate (3-Phenyllactate), butyric acid (Butanoate), hexanoic acid (Hexanoate), methionine (Met), hypoxanthine (Hypoxanthine), spermidine (Spermidine), tryptophan (Trp), aspartic acid (Asp), isopropanolamine (Isopropanolamine), alanyl-alanine (Ala-Ala), N,N-dimethylglycine (N,N-Dimethylglycine), N1-acetylspermidine (N1-Acetylspermidine), N1,N8-diacetylspermidine (N1,N8-Diacetylspermidine), N8-acetylspermidine (N8-Acetylspermidine), α-aminobutyric acid (2AB), trimethylamine-N-oxide (Trimethylamine N-oxide), N-acetylaspartic acid (N-Acetylaspartate), adenine (Adenine), 2-hydroxyvaleric acid (2-Hydroxypentanoate), putrescine (Putrescine (1,4-Butanediamine)), 3-phosphoglycerate (3PG), 3-phenylpropionic acid (3-Phenylpropionate), serine (Ser), 1-methylnicotinamide (1-Methylnicotinamide), 3-hydroxy-3-methylglutaric acid (3-Hydroxy-3-methylglutarate), guanine (Guanine), 3-(4-hydroxyphenyl)propionic acid (3-(4-Hydroxyphenyl)propionate), 4-methylbenzoate (4-Methylbenzoate), ribulose-5-phosphoric acid (Ru5P), α-aminoadipic acid (alpha-Aminoadipate), N6-acetyllysine (N-epsilon-Acetyllysine), glucosamine (Glucosamine), cystine (Cys), carnosine (Carnosine), urocanic acid (Urocanate), phenylalanine (Phe), 2-deoxyribose-1-phosphoric acid (2-Deoxyribose 1-phosphate), cytidine disodium 5′-monophosphate (CMP), p-hydroxyphenylacetic acid (p-Hydroxyphenylacetate), 3-hydroxybutyric acid (3-Hydroxybutyrate), N-acetylputrescine (N-Acetylputrescine), 7-methylguanine (7-Methylguanine), inosine (Inosine), lysine (Lys), dihydroxyacetonephosphoric acid (DHAP), 3-methylhistidine (3-Methylhistidine), carbamoylaspartic acid (Carbamoylaspartate), creatinine (Creatinine), N-methyl-2-pyrrolidone (1-Methyl-2-pyrrolidinone), pyruvic acid (Pyruvate), propionic acid (Propionate), 5-aminovaleric acid (5-Aminovalerate), N-acetylornithine (o-Acetylornithine), 5-oxoproline (5-Oxoproline), creatine (Creatine), homoserine (Homoserine), fumaric acid (Fumarate), glycine (Gly), and N1,N12-diacetylspermine (N1,N12-Diacetylspermine).
The aforementioned substances that are not publicly known are indicated in Table 9 below.
The present invention provides a method for assaying a salivary biomarker for cancer including the steps of: collecting a saliva sample; and detecting the aforementioned salivary biomarker for cancer in the collected saliva sample.
The present invention provides a device for assaying a salivary biomarker for cancer including means for collecting a saliva sample, and means for detecting the aforementioned salivary biomarker for cancer in the collected saliva sample.
The present invention further provides a method for determining a salivary biomarker for cancer including a procedure of performing ultrafiltration of a saliva sample, a procedure of comprehensively measuring ionic metabolites in the saliva sample after the ultrafiltration, and a procedure of selecting a substance having a high ability to distinguish a patient with a pancreatic disease from a healthy subject according to the concentrations of the measured metabolites.
Correlation of absolute concentration among multiple metabolites can be used for identifying a normalizing metabolite that can eliminate variation of overall concentrations in saliva.
A combination of the salivary biomarkers for cancer can be determined using a mathematical model.
According to the present invention, not only pancreatic cancer but also a pancreatic disease including IPMN and chronic pancreatitis, breast cancer, and oral cancer can be detected early using saliva that can be collected non-invasively and simply. In particular, a combination of polyamine with novel metabolite biomarkers makes a highly accurate prediction possible.
Hereinafter, an embodiment suitably implementing the present invention (hereinafter referred to as the embodiment) will be described in detail. The present invention is not limited to the following embodiments and Examples. In addition, constituents in the following embodiments and Examples include those that can be easily assumed by those skilled in the art, those that are substantially equivalent, and those falling within the scope of the so-called doctrine of equivalents. Further, the constituents disclosed in the following embodiments and Examples may be used in appropriate combination or by appropriate selection.
A procedure of determining a biomarker for a pancreatic disease will be described with reference to
A total of 199 salivary samples were collected from patients with pancreatic cancer at various stages, healthy subjects, and patients with intraductal papillary mucinous neoplasm (IPMN) and chronic pancreatitis. Table 1 lists subject characteristics, such as sex and age. None of the patients had undergone chemotherapy.
With respect to collection date
Collection is performed on a day other than a surgery day as much as possible.
With respect to diet
After 21:00 of the day before the collection, do not drink anything but water.
On the day of collection, do not eat breakfast.
Notes before collection of saliva on the day
Collect saliva between 8:30 and 11:00 AM, before breakfast.
Brush teeth without use of toothpaste 1 hour or more before the collection of saliva.
Do not strenuously exercise 1 hour before the collection of saliva.
Do not clean the inside of oral cavity (with a toothpick, etc.).
Do not smoke.
Do not drink anything but water.
Method for collecting saliva
The mouth is rinsed with water before the collection of saliva, and non-irritant mixed saliva is collected.
Only saliva that flows spontaneously, rather than saliva produced deliberately, is collected (sialemesis method). Alternatively, once saliva has accumulated in the mouth to some extent (after about 3 minutes), a straw is placed in the mouth and the saliva is allowed to flow into a tube (passive drool method). When the face is turned down and the saliva in the mouth is pushed into a vertically held straw, the saliva tends to flow spontaneously. However, when the saliva adheres to the middle of the straw and does not fall, it is pushed out with the breath (in this case, saliva is collected more easily by retaining it in the mouth to some extent and then pushing it into the tube all at once than by opening the mouth over the tube).
200 μL or more of saliva (as much as possible) is collected.
During the collection of saliva, the tube is placed on ice and kept at a low temperature as much as possible, and the collection is finished within 15 minutes (even when 200 μL of saliva is not collected, the collection is finished in 15 minutes).
Within 5 minutes after collection, the saliva is cryopreserved at −80° C. or on dry ice for storage. The tube and the straw used for collecting saliva are made of polypropylene.
A method for collecting saliva is not limited to the aforementioned method, and another method may be used.
400 μL of saliva sample is taken, placed in an ultrafiltration filter (molecular weight cutoff: 5,000 Da), and centrifuged at 4° C. and 9,100 g for 3.5 hours. 45 μL of the filtrate and 5 μL of an aqueous solution in which the concentration of each of methionine sulfone (Methionine sulfone), 2-morpholinoethanesulfonic acid (2-Morpholinoethanesulfonic acid), CSA (D-Camphor-10-sulfonic acid), 3-aminopyrrolidine (3-Aminopyrrolidine), and trimesic acid (Trimesate) is 2 mM are mixed to prepare 50 μL of a sample. Measurement was performed by the following method.
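For reference, the mixing step in the sample preparation above fixes both the final internal standard concentration and the dilution of the filtrate. A minimal sketch of that arithmetic follows; the function name `mix_sample` and the idea of reporting the filtrate fraction are illustrative assumptions, not part of the described procedure.

```python
def mix_sample(filtrate_ul=45.0, standard_ul=5.0, standard_mm=2.0):
    """Return (total volume in uL, internal standard conc. in uM, filtrate fraction)."""
    total_ul = filtrate_ul + standard_ul
    # Each internal standard is diluted from its stock concentration by the volume ratio.
    standard_um = standard_mm * 1000.0 * standard_ul / total_ul
    # Fraction of the final sample that is saliva filtrate.
    filtrate_fraction = filtrate_ul / total_ul
    return total_ul, standard_um, filtrate_fraction


total_ul, standard_um, fraction = mix_sample()
print(total_ul, standard_um, fraction)
```

Under these volumes each internal standard ends up at one tenth of its 2 mM stock concentration, and the filtrate makes up nine tenths of the measured sample.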
Ionic metabolites were identified and quantitatively determined from saliva by metabolome analysis using CE-MS.
A measurement method was performed in accordance with the method described in Non-Patent Literature 6. Hereinafter, the parameters will be described.
HPCE
Capillary: fused silica, 50 μm in inner diameter×100 cm in length
Buffer: 1 M formic acid (formate)
Voltage: positive, 30 kV
Injection: injection under a pressure of 50 mbar for 5 seconds (about 3 nL)
Washing before measurement: with 30 mM ammonium formate (Ammonium Formate) at a pH of 9.0 for 5 minutes, ultrapure water for 5 minutes, and buffer for 5 minutes
TOFMS
Polarity: positive
Capillary voltage: 4,000 V
Fragmentor voltage: 75 V
Skimmer voltage: 50 V
Drying gas: nitrogen (N2), 10 L/min
Drying gas temperature: 300° C.
Nebulizer gas pressure: 7 psig
Sheath liquid: 50% methanol/0.1 μM Hexakis (2,2-difluoroethoxy) phosphazene-containing water
Flow rate: 10 mL/min
Reference m/z: 2 methanol 13C isotope [M+H]+m/z 66.063061,
Hexakis(2,2-difluoroethoxy)phosphazene [M+H]+m/z 622.028963
HPCE
Capillary: COSMO (+), 50 μm in inner diameter×10.6 cm in length
Buffer: 50 mM ammonium acetate, pH: 8.5
Voltage: negative, 30 kV
Temperature: 20° C.
Injection: injection under a pressure of 50 mbar for 30 seconds (about 30 nL)
Washing before measurement: with 50 mM ammonium acetate at a pH of 3.4 for 2 minutes, and 50 mM ammonium acetate at a pH of 8.5 for 5 minutes
TOFMS
Polarity: negative
Capillary voltage: 3,500 V
Fragmentor voltage: 100 V
Skimmer voltage: 50 V
Drying gas: nitrogen (N2), 10 L/min
Drying gas temperature: 300° C.
Nebulizer gas pressure: 7 psig
Sheath liquid: 5 mM ammonium acetate and 50% methanol/0.1 μM Hexakis (2,2-difluoroethoxy) phosphazene-containing water
Flow rate: 10 mL/min
Reference m/z: 2 acetic acid 13C isotope [M−H]−m/z 120.038339, Hexakis(2,2-difluoroethoxy) phosphazene+acetic acid [M−H]− 680.035541
ESI needle: platinum
The anionic metabolite measurement may be performed before the cationic metabolite measurement.
Signals of substances whose values vary largely depending on the measurement day and signals of substances not derived from metabolites are removed.
From the measurement data, all peaks with a signal-to-noise ratio of 1.5 or more were first detected. Commercially available standard substances were measured before the saliva samples. Each peak was assigned a substance name by matching its mass-to-charge ratio (m/z) obtained by the mass spectrometer and its migration time to those of the standard substances. Thus, identification was performed. For quantitative determination, the area of each peak was divided by the peak area of the internal standard substance to correct for fluctuation in the measurement sensitivity of the mass spectrometer, giving the specific peak area ratio. The absolute concentration was calculated from the ratio of the specific peak area in the saliva sample to the specific peak area of the standard substance.
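The quantification described above can be sketched as follows. This is a minimal illustration that assumes a single-point calibration against a standard of known concentration; the function names, the argument `standard_conc_um`, and the peak areas are hypothetical and do not come from the measurements in this document.

```python
def relative_area(peak_area, internal_standard_area):
    """Specific peak area ratio: metabolite peak area / internal standard peak area."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return peak_area / internal_standard_area


def absolute_concentration(sample_area, sample_is_area,
                           standard_area, standard_is_area,
                           standard_conc_um):
    """Concentration from the ratio of sample to standard specific peak areas.

    Assumes a single-point calibration with a standard of known concentration.
    """
    sample_ratio = relative_area(sample_area, sample_is_area)
    standard_ratio = relative_area(standard_area, standard_is_area)
    return standard_conc_um * sample_ratio / standard_ratio


# Hypothetical peak areas, for illustration only.
conc = absolute_concentration(1200.0, 4000.0, 3000.0, 5000.0, 50.0)
print(conc)
```

Dividing by the internal standard area first means that a run-to-run change in detector sensitivity affects numerator and denominator alike and cancels out.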
Only a substance in which the peak can be detected in 30% or more cases (for example, three out of ten) of each group was selected.
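A minimal sketch of this detection-rate filter is shown below. It assumes that a missing peak is recorded as `None` and that the 30% criterion must hold in every group; both are assumptions made for illustration, as are the function names and the example values.

```python
def detected_fraction(values):
    """Fraction of cases with a detected peak; None marks a missing peak."""
    return sum(v is not None for v in values) / len(values)


def passes_detection_filter(groups, min_fraction=0.30):
    """True if the peak is detected in at least min_fraction of cases in every group."""
    return all(detected_fraction(vals) >= min_fraction for vals in groups.values())


# Hypothetical concentrations for one substance in two groups of ten cases.
groups = {
    "healthy": [1.2, None, None, 0.8, None, None, 1.1, None, None, None],
    "cancer": [2.0, 2.4, None, 1.9, None, 2.2, None, None, None, None],
}
keep = passes_detection_filter(groups)
print(keep)
```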
After a typical test (in this case, Mann-Whitney test) was performed, a P value was corrected using a false discovery rate (FDR) and a Q value was calculated. A substance having a significant difference of Q<0.05 was selected.
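The FDR correction step can be sketched as below. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, and the P values are hypothetical. The Mann-Whitney test itself is not reimplemented; the P values are taken as given.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (Q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that Q values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical P values for five substances.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
selected = [i for i, qi in enumerate(q) if qi < 0.05]
print(q, selected)
```

Note that a substance with P<0.05 can still fail the Q<0.05 criterion once the number of tested substances is taken into account.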
The substance selected by this procedure is a substance selected from N-acetylputrescine (N-Acetylputrescine), adenosine (Adenosine), 3-phospho-D-glyceric acid (3PG), urea (Urea), o-acetylcarnitine (o-Acetylcarnitine), citric acid (Citrate), glycyl-glycine (Gly-Gly), 5-aminovaleric acid (5-Aminovalerate), methyl 2-oxopentanoate (2-Oxoisopentanoate), malic acid (Malate), benzoate ester (Benzoate), fumaric acid (Fumarate), N-acetylaspartic acid (N-Acetylaspartate), inosine (Inosine), 3-methylhistidine (3-Methylhistidine), N1-acetylspermine (N1-Acetylspermine), creatine (Creatine), α-aminoadipic acid (alpha-Aminoadipate), phosphorylcholine (Phosphorylcholine), 2-hydroxypentanoate (2-Hydroxypentanoate), xanthine (Xanthine), succinic acid (Succinate), 6-phosphogluconic acid (6-Phosphogluconate), butanoic acid (Butanoate), homovanillic acid (Homovanillate), O-phosphoserine (O-Phosphoserine), trimethylamine-N-oxide (Trimethylamine N-oxide), piperidine (Piperidine), cystine (Cys), 2-isopropylmalic acid (2-Isopropylmalate), N8-acetylspermidine (N8-Acetylspermidine), N1-acetylspermidine (N1-Acetylspermidine), N-acetylneuraminic acid (N-Acetylneuraminate), glucosamine (Glucosamine), spermine (Spermine), agmatine (Agmatine), N-acetylhistamine (N-Acetylhistamine), methionine (Met), p-4-hydroxyphenylacetic acid (p-4-Hydroxyphenylacetate), N,N-dimethylglycine (N,N-Dimethylglycine), hypotaurine (Hypotaurine), glutamyl-glutamic acid (Glu-Glu), N1,N12-diacetylspermine (N1,N12-Diacetylspermine), and combinations thereof.
In all the samples measured (including healthy, breast cancer, oral cancer, IPMN, and pancreatic cancer), correlation values between the metabolites were exhaustively calculated using the determined quantitative values of the metabolites. Combinations of substances satisfying a Pearson correlation coefficient (R) of R≧0.8 were listed. Of a metabolite group in which the most substances correlated with each other, a substance that correlated with the most substances was selected.
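The selection of a normalizing substance can be sketched as follows. This is a simplified illustration: it counts, for each substance, how many other substances correlate with it at R≥0.8 and picks the one with the most partners, without first forming the correlated group explicitly. The function names and the concentration values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pick_normalizer(conc, threshold=0.8):
    """Return (substance with the most correlated partners, partner counts)."""
    names = list(conc)
    partners = {name: 0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(conc[a], conc[b]) >= threshold:
                partners[a] += 1
                partners[b] += 1
    return max(names, key=lambda name: partners[name]), partners


# Hypothetical concentrations of four substances across five samples.
conc = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "C": [0.9, 2.2, 2.9, 4.1, 5.2],
    "D": [5.0, 1.0, 4.0, 2.0, 3.0],
}
best, partners = pick_normalizer(conc)
print(best, partners)
```

A substance that rises and falls together with many others is a plausible proxy for the overall concentration of the saliva sample, which is why it is a candidate for normalization.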
9. Selection of a Substance Having a Statistically Significant Difference Among the Substances after Concentration Correction (Step 152 in
After the typical test (in this case, Mann-Whitney test) was performed using a value in which the concentration of each substance was corrected with the concentration of the substance selected at Step 142, a P value was corrected using the false discovery rate (FDR), and a Q value was calculated. A substance having a significant difference of Q<0.05 was selected.
A procedure of developing a mathematical model of distinguishing the subjects with pancreatic cancer from the healthy subjects will be then described with reference to
Using the marker selected at Step 150 or 152 in
ln(P/(1−P)) = b0 + b1x1 + b2x2 + b3x3 + . . . + bkxk (1)
is determined using k explanatory variables x1, x2, x3, . . . , and xk for a ratio P as a target variable.
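Equation (1) gives the log odds of P as a linear function of the variables, so P itself is obtained by inverting the logit. A minimal sketch follows; the coefficients and marker values are hypothetical and are not fitted values from this document.

```python
from math import exp, log


def logit(p):
    """Left-hand side of equation (1): ln(P / (1 - P))."""
    return log(p / (1.0 - p))


def risk(intercept, coefficients, x):
    """Solve equation (1) for P: P = 1 / (1 + exp(-(b0 + sum(bi * xi))))."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + exp(-z))


# Hypothetical intercept b0, coefficients b1..b3, and marker values x1..x3.
b0, b = -1.0, [0.8, -0.5, 1.2]
x = [1.0, 2.0, 0.5]
p = risk(b0, b, x)
print(p, logit(p))
```

Applying `logit` to the returned P recovers the linear predictor b0 + b1x1 + ... + bkxk, which is a quick consistency check on any fitted model.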
Specifically, a combination of the smallest independent variables that did not correlate with each other was selected at Step 210, for example, using a stepwise forward selection method of stepwise variable selection. A P value at which the variable was added was 0.05, a P value at which the variable was eliminated was 0.05, and a variable xi was selected.
At Step 220, the data were divided into learning data and evaluation data, and at Step 230, a model was formed from the learning data and evaluated using the evaluation data. In cross validation of Loop 1 in
At Step 240, receiver operating characteristic (ROC) analysis was performed using the selected model. The area under the ROC curve (AUC) and its 95% confidence interval (CI) were calculated, and the model was evaluated. On the ROC plot, a line Y=X+α (α is a constant) was drawn. As the value of α was decreased from 1 to 0, the value of α at which the line first touched the ROC curve was determined. Thus, an optimal cut-off value was determined.
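The ROC analysis and the cut-off search can be sketched as below. The line Y=X+α first touches the ROC curve at the point where sensitivity minus (1 − specificity) is largest, so the sketch simply maximizes that difference. The scores and labels are hypothetical, the function names are illustrative, and the 95% CI calculation is omitted.

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr, threshold); a case is positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0, float("inf"))]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos, t))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]))


def best_cutoff(points):
    """Threshold and alpha = tpr - fpr at the point where Y = X + alpha touches the curve."""
    fpr, tpr, threshold = max(points, key=lambda p: p[1] - p[0])
    return threshold, tpr - fpr


# Hypothetical model outputs (risk values) and true labels (1 = cancer, 0 = healthy).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
points = roc_points(scores, labels)
print(auc(points), best_cutoff(points))
```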
Next, the process proceeded to Step 250, and a model having the best accuracy as the result of cross validation was selected.
Herein, a stepwise method is used. The stepwise method includes three kinds: a forward selection method, a stepwise forward selection method, and a backward selection method. The threshold value may be adjusted, for example to P<0.05, and variables may be added. Therefore, the model having the best accuracy can be selected by forming a model many times at a larger loop 2 in
Specifically, for evaluation of the MLR model, values of risk of pancreatic cancer (PC) were calculated for saliva from subjects with breast cancer, oral cancer (CP), and IPMN. A group consisting of the healthy subjects (C) and the subjects with CP and IPMN was formed, and an AUC value for identifying pancreatic cancer against this group was calculated. The data were randomly divided into 10 parts, a model was formed using 90% of the data, and the model was evaluated with the remaining 10%. This operation was repeated 10 times so that every case was selected once for evaluation. Cross validation (CV) was then performed by collecting the evaluation data and calculating the AUC value.
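The 10-fold cross validation described above can be sketched as follows. The `fit` and `predict` callables stand in for the MLR model and are placeholders introduced for illustration; the toy data and the fixed random seed are likewise assumptions.

```python
import random


def k_fold_indices(n_cases, k=10, seed=0):
    """Randomly split case indices into k folds so each case is held out once."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_scores(cases, labels, fit, predict, k=10, seed=0):
    """Pool held-out predictions over k folds, in the original case order."""
    pooled = [None] * len(cases)
    for fold in k_fold_indices(len(cases), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(cases)) if i not in held_out]
        model = fit([cases[i] for i in train], [labels[i] for i in train])
        for i in fold:
            pooled[i] = predict(model, cases[i])
    return pooled


def fit_mean(xs, ys):
    """Placeholder model (an assumption): remember the training mean."""
    return sum(xs) / len(xs)


def predict_offset(mean, x):
    """Placeholder score: distance of the case above the training mean."""
    return x - mean


cases = [float(i) for i in range(20)]
labels = [0] * 10 + [1] * 10
pooled = cross_validated_scores(cases, labels, fit_mean, predict_offset)
folds = k_fold_indices(199)
print(len(folds), [len(f) for f in folds])
```

Because every case receives exactly one held-out score, the pooled scores can be passed to a single ROC analysis to obtain the cross-validated AUC.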
10. Results of Model of Distinguishing Pancreatic Cancer from Searched Substance
For example, for the variable selection, a correlation-based feature subset method (see M. A. Hall (1998). Correlation-based Feature Subset Selection for Machine Learning. Hamilton, New Zealand.), a relief method (see Marko Robnik-Sikonja, Igor Kononenko (1997). An adaptation of Relief for attribute estimation in regression. In: Fourteenth International Conference on Machine Learning, 296-304.), an SVM variable selection method (see I. Guyon, J. Weston, S. Barnhill, V. Vapnik (2002). Gene selection for cancer classification using support vector machines. Machine Learning. 46(1-3): 389-422.), or the like may be applied.
For the mathematical model, a machine learning method for classifying two groups may be applied. For example, Bayesian estimation (see Berger, James O. (1985). Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics (Second ed.). Springer-Verlag. ISBN 0-387-96098-8.), an artificial neural network (ANN) (see D. E. Rumelhart, G. E. Hinton, and R. J. Williams (1986). Learning representations by back-propagating errors. Nature, 323, 533-536.), a support vector machine (SVM) (see J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods-Support Vector Learning), an alternating decision tree (ADTree) (see Freund, Y., Mason, L. (1999). The alternating decision tree learning algorithm. In: Proceedings of the Sixteenth International Conference on Machine Learning, Bled, Slovenia, 124-133), a decision tree (see Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, Calif.), a PART model (see Eibe Frank, Ian H. Witten (1998). Generating Accurate Rule Sets Without Global Optimization. In: Fifteenth International Conference on Machine Learning, 144-151), random forest (see Breiman, Leo (2001). Random Forests. Machine Learning 45 (1): 5-32. doi: 10.1023/A:1010933404324.), partial least squares discriminant analysis (PLS-DA) (see Lindgren, F; Geladi, P; Wold, S (1993). The kernel algorithm for PLS. J. Chemometrics 7: 45-59. doi: 10.1002/cem.1180070104.), orthogonal PLS discriminant analysis (OPLS-DA) (see Trygg, J., & Wold, S. (2002). Orthogonal projections to latent structures (O-PLS). Journal of Chemometrics, 16(3), 119-128), or the like may be applied. A bootstrap method or a bagging method (see Breiman, Leo (1996). Bagging predictors. Machine Learning 24 (2): 123-140), in which a plurality of mathematical models are built by the two-group machine learning methods and prediction is performed using the average or the majority vote of their predicted values, may also be used. Further, separation may be performed using a principal component obtained by principal component analysis (PCA) (see Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417-441), which is an unsupervised learning method.
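The bagging scheme described above, in which several two-group classifiers are trained on bootstrap resamples and combined by majority vote, can be sketched as follows. This is only an illustration and not the procedure used in the Examples: the base learner (a one-variable threshold classifier), the function names `fit_stump`, `bagging_fit`, and `bagging_predict`, and the toy concentration values are all assumptions introduced here.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-variable threshold classifier (hypothetical base learner).

    The threshold is the midpoint between the two class means.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    if not pos or not neg:
        # Degenerate resample containing a single class: always predict it.
        only = 1 if pos else 0
        return lambda x: only
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2.0
    if mean_pos >= mean_neg:
        return lambda x: 1 if x > threshold else 0
    return lambda x: 1 if x < threshold else 0


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models base learners, each on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Return the class chosen by the majority of the models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (illustrative only): 0 = healthy, 1 = cancer.
xs = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
```

An odd number of models is used so that a two-class vote cannot end in a tie.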
Substances having a high ability of distinguishing the subjects with pancreatic cancer from the healthy subjects at Step 152 of the above section 9 are shown in Table 2.
In Table 2, the detection ratio is the proportion of cases in each group (the healthy subjects and the subjects with pancreatic cancer) in which a peak could be detected. The 95% CI denotes the 95% confidence interval.
The Mann-Whitney test, a non-parametric two-group test, was performed between the healthy subject group and the pancreatic cancer group. The p value of each metabolite was calculated and then corrected with the false discovery rate (FDR) to obtain a q value, which was used for the test.
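As an illustration of the two-group test named above, the following sketch computes the Mann-Whitney U statistic with midranks for ties and an approximate two-sided p value. It assumes the large-sample normal approximation without continuity correction, which may differ from the software actually used for Table 2; the function name and the sample values in the usage note are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic of sample a, approximate two-sided p value).
    Tied values receive midranks; no continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0  # all values identical: no evidence of a difference
    z = (u_a - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u_a, p
```

For example, `mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])` gives U = 0 for the first group, the most extreme separation possible. For groups this small an exact test would normally be preferred over the approximation.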
For evaluation of sensitivity and specificity of the substances that distinguish two groups of the subjects with pancreatic cancer (PC) and the healthy subjects (C), receiver operating characteristic (ROC) analysis was performed. The results are shown in
The area under the ROC curve in
Using the MLR model that can distinguish the healthy subjects and the subjects with pancreatic cancer, values calculated as risk of pancreatic cancer (PC) of the healthy subjects (C) and the subjects with pancreatic cancer (PC) as well as the patients with breast cancer, oral cancer (CP), and IPMN are shown in
Table 4 shows the AUC values used to evaluate the specificity and the generalizability of the MLR model.
Herein, CV denotes a value obtained by cross-validation.
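For reference, the area under the ROC curve equals the probability that a randomly chosen case from the positive group receives a higher score than a randomly chosen case from the negative group, with ties counted as one half. The sketch below computes the AUC in this way; the function name and the scores in the usage note are illustrative assumptions, not values from Table 4.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair in which both scores are equal contributes 0.5.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

With predicted risks of 0.8 and 0.3 for two cancer cases and 0.5 and 0.1 for two healthy cases, three of the four pairs are ranked correctly, so `roc_auc([0.8, 0.3], [0.5, 0.1])` is 0.75.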
In
In the present invention, the concentrations of ionic metabolites contained in saliva were simultaneously measured, and markers having a high ability of distinguishing the subjects with pancreatic cancer from the healthy subjects were selected. Further, a model having higher accuracy (sensitivity and specificity) as compared with a single substance could be developed by combining the markers.
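How a multiple logistic regression (MLR) model combines several markers into a single risk value can be sketched as below. The logistic form is standard, but the marker subset, the coefficients, and the intercept shown in the usage note are hypothetical placeholders and are not the fitted values of the model described in this specification.

```python
import math


def mlr_risk(concentrations, coefficients, intercept):
    """Return a risk value in [0, 1] from a logistic regression model.

    concentrations: dict of marker name -> measured concentration
    coefficients:   dict of marker name -> regression coefficient
    """
    z = intercept + sum(
        coefficients[name] * concentrations[name] for name in coefficients
    )
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

For instance, with placeholder coefficients `{"spermine": 1.5, "N1-acetylspermidine": 0.8}` and intercept `-2.0`, a sample with higher concentrations of both markers receives a higher risk value than a sample with lower concentrations, which is how the combined model ranks subjects.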
A problem in using saliva is that the concentrations of substances vary more widely in saliva than in blood. In this method, saliva was collected under standardized conditions with respect to the collection time and the dietary restriction before the collection. Even so, in some of the samples the concentrations of all the substances tended to be clearly high or clearly low.
The concentration in the subjects with pancreatic cancer (PC) is significantly higher than that in the healthy subjects (C), and as an indication exhibiting a risk of pancreatic cancer, a high total concentration in saliva itself may be used. However, some of the samples of C have a concentration higher than PC, and in contrast, some of the samples of PC have a concentration lower than C. Therefore, when the samples are simply considered for a risk, the accuracy is low (from results of ROC analysis between C and PC in data of
When only a sample having a concentration falling within a certain range except for the samples (the samples of C having higher concentration, and the samples of PC having lower concentration) is a target of a test, based on the whole concentration, examination should be omitted. Therefore, the fluctuation of the entire concentration was offset by performing normalization using a substance that has a high correlation with the entire metabolite concentration of the saliva and can be detected in all the samples by the method shown in
Among the marker candidates in Table 2, polyamines such as spermine and acetylated polyamines such as N8-acetylspermidine, N1-acetylspermidine, and N1-acetylspermine each reflect the state of the pancreatic tissues as it changes with cancer. However, in the case of spermine in urine, for example, only concentration correction with creatinine has been considered. Therefore, spermine cannot achieve the accuracy that a tumor marker measured in a blood test can achieve. Because polyamines in blood are taken up by erythrocytes (see Fu N N, Zhang H S, Ma M, Wang H. (2007) Quantification of polyamines in human erythrocytes using a new near-infrared cyanine 1-(epsilon-succinimidyl-hexanoate)-1′-methyl-3,3,3′,3′-tetramethyl-indocarbocyanine-5,5′-disulfonate potassium with CE-LIF detection. Electrophoresis. 28(5): 822-9), the amount of polyamines in a free state is extremely small, and the concentration thereof in urine is extremely low. Even when polyamines in blood and urine are measured in breast cancer, the highest reported concentration of spermidine is about 140 nM, and that of N-acetylspermidine is about 64 nM, both much lower than the concentrations in saliva (see Byun J A, Lee S H, Jung B H, Choi M H, Moon M H, Chung B C. (2008) Analysis of polyamines as carbamoyl derivatives in urine and serum by liquid chromatography-tandem mass spectrometry. Biomed Chromatogr. 22(1): 73-80). In addition, the quantitative determination of polyamines in erythrocytes requires a complicated step.
Therefore, the diagnosis method found in the present invention achieves a highly accurate prediction owing to the following three points: (i) use of saliva, in which the marker substances can be detected at high concentration, (ii) reduced variability between measurements owing to a simple sample treatment process, and (iii) use of a mathematical model that combines the markers. A difference in mRNA in saliva between the patients with pancreatic cancer and the healthy subjects is already known (Non-Patent Literature 4). However, the present invention is directed to metabolites, a class of molecules completely different from mRNA. The variation of metabolites in saliva associated with pancreatic cancer is also already known (Non-Patent Literature 5). However, the present invention uses as markers substances that are not disclosed in the known documents, and makes it possible to develop a mathematical model that eliminates the effect of the concentration variation specific to saliva and identifies pancreatic cancer with high sensitivity and specificity.
With respect to four groups of healthy subjects, chronic pancreatitis, IPMN, and pancreatic cancer, a distribution of risk of pancreatic cancer that is predicted by the MLR model shows that the model exhibits high specificity for pancreatic cancer (
In the Examples, capillary electrophoresis-mass spectrometry (CE-MS) is used to measure the concentrations of metabolites in saliva. However, high-performance liquid chromatography (LC), gas chromatography (GC), chip LC, or chip CE; GC-MS, LC-MS, or CE-MS, in which these are combined with a mass spectrometer (MS); measurement by MS alone; an NMR method; a method in which a metabolite is derivatized into a fluorescent or UV-absorbing substance and then measured; or an enzymatic method in which an antibody is produced and measurement is performed by ELISA may also be used. In other words, the measurement is not limited to any particular analytical method.
Next, a biomarker for breast cancer will be described.
The cases included healthy subjects (20 cases) and patients with breast cancer (90 cases). The patients with breast cancer included patients before initiation of treatment (37 cases) and patients who had been treated with chemotherapy, hormonotherapy, or the like. Among the breast cancer cases, one patient was male and the rest were female. Among the patients with breast cancer before initiation of treatment, eight cases were DCIS and 29 cases were invasive ductal carcinoma.
A method for collecting saliva, a method for measuring metabolites, and the like were the same as those used in the biomarker for pancreatic cancer.
In the variable selection performed during construction of the multiple logistic regression model (MLR model), only substances of Q 0.05 were used. In
Substances whose absolute concentration exhibits a statistically significant difference (p<0.05 in the Mann-Whitney test) between the patients with breast cancer and the healthy subjects are shown in Tables 7-1, 7-2, 7-3, and 7-4. The comparison was performed between the 20 cases of the healthy subjects and all 90 cases of the patients with breast cancer, including the patients before treatment who had received neither chemotherapy nor hormonotherapy (37 cases). A Q value was calculated by the false discovery rate (FDR).
A substance for which “7” or “8” is indicated in the column labeled “Publicly Known” is a known substance disclosed in Non-Patent Literature 7 or 8.
Next, a network diagram in which a line is drawn between metabolites exhibiting a correlation between metabolites in the patients with breast cancer before initiation of treatment (37 cases) and metabolites in the healthy subjects (20 cases) of R2>0.92 shown in
Substances whose relative concentration exhibits a statistically significant difference (p<0.05 in the Mann-Whitney test) between the patients with breast cancer and the healthy subjects are shown in Tables 8-1, 8-2, 8-3, and 8-4. To calculate the relative concentration, the concentration of each substance was divided by the concentration of glutamine, and the resulting value is dimensionless.
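The relative concentration described above is a simple ratio to glutamine. A minimal sketch is given below; the function name and the concentration values in the usage note are hypothetical and serve only to show the arithmetic.

```python
def relative_concentrations(sample, reference="glutamine"):
    """Divide every metabolite concentration by the reference concentration.

    sample: dict of metabolite name -> absolute concentration (same unit)
    Returns a dict of dimensionless ratios, excluding the reference itself.
    """
    ref = sample[reference]
    if ref <= 0:
        raise ValueError("reference concentration must be positive")
    return {
        name: value / ref for name, value in sample.items() if name != reference
    }
```

For a sample with 50.0 μM glutamine and 5.0 μM spermine, the relative concentration of spermine is 5.0 / 50.0 = 0.1. Because numerator and denominator share the same unit, a sample that is uniformly concentrated or diluted yields the same ratio.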
In this analysis, many of the metabolites contained in saliva were measured. To evaluate the significance of the difference for each substance, an independent statistical test (for example, the Mann-Whitney test) needs to be repeated for every substance. When the test is repeated at a significance level α of 0.05, the number of null hypotheses that are rejected by chance increases. Therefore, the P value was corrected by the false discovery rate (FDR) method (Storey, J. D., & Tibshirani, R. (2003). Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America, 100, 9440-9445), and a Q value was calculated. For example, when the Q value is 0.5, half of the null hypotheses rejected at P<0.05 are expected to be true null hypotheses, that is, false positives.
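To show how a multiple-testing correction of this kind operates, the sketch below implements the Benjamini-Hochberg adjustment. Note that this is a simpler, related FDR procedure and not the Storey-Tibshirani q-value method cited above, which additionally estimates the proportion of true null hypotheses; the function name and the P values in the usage note are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    For the p value with ascending rank r among m tests, the raw adjusted
    value is p * m / r; a running minimum taken from the largest p value
    downward keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For the four P values 0.01, 0.04, 0.03, and 0.005, the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, so all four would still pass a threshold of 0.05 in this small example.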
Even when the overall concentration of saliva is elevated, as in elderly patients, using glutamine, which correlates with the largest number of metabolites and can be detected in all of the samples, as a concentration-correcting marker eliminates the influence of concentration variations and makes it possible to distinguish a subject with cancer from a healthy subject. People aged 70 years or older tend to show an increased overall saliva concentration. Therefore, when the absolute concentration is used, constructing the model using only data from people under 70 years of age leads to a highly accurate separation.
A substance belonging to polyamines among substances that give a significant difference between the healthy subjects (C, n=20) and the patients with breast cancer (BC, all the cases including before treatment, n=90) is shown in
Examples (the top five substances with a smaller P value) of substances other than polyamines among the substances that give a significant difference between the healthy subjects (C, n=20) and the patients with breast cancer (BC, all the cases including before treatment, n=90) are shown in
Substances that give a significant difference (p<0.05) between the healthy subjects (C, 20 cases) and the patients with breast cancer (BC, 90 cases) regardless of the presence or absence of concentration correction are shown in
Next, a biomarker for oral cancer will be described.
Substances whose absolute concentrations in saliva differ between subjects with oral cancer and healthy subjects are shown in Tables 9-1 and 9-2. There were 20 healthy subjects and 20 patients with oral cancer. For the healthy subjects, saliva was collected 1.5 hours after eating; for the patients with cancer, saliva was collected twice, before eating (in a fasting state from the previous night) and 1.5 hours after eating. For each of these comparisons with the healthy subjects, a P value was calculated using the Mann-Whitney test, and a Q value was calculated using the false discovery rate (FDR). Substances of Q<0.05 were listed.
A substance for which “7” or “9” is indicated in the column labeled “Publicly Known” is a known substance disclosed in Non-Patent Literature 7 or 9.
Herein, the patients with oral cancer ranged from stage I to stage IVa and included oral squamous cell carcinoma (17 cases), malignant melanoma (2 cases), and adenoid cystic carcinoma (1 case). Spermine, spermidine, and acetylated spermine or spermidine were consistently higher in concentration in the oral cancer tissue samples obtained during surgery than in the healthy tissue in the vicinity of the oral cancer tissue.
For example, choline (the second substance from the top in the table) is a known substance disclosed in Non-Patent Literatures 7 and 9, and an increase in its concentration in saliva has been confirmed. However, oral cancer can be identified with high accuracy by a mathematical model that combines it with a plurality of novel markers, using the same procedure as for pancreatic cancer and breast cancer. Choline is increased in oral cancer but is not increased in breast cancer. Therefore, including it as a variable of the mathematical model allows the model to express the specific type of cancer.
The concentrations of metabolites in the cancer tissue sample obtained during surgery of oral cancer and the healthy tissue sample near the cancer tissue sample (herein, the concentration corrected with the weight of the tissue in μM/g is used) are shown in
A difference in the concentration of saliva between the patients with oral cancer and the healthy subjects (C) when a method of collecting saliva in the patients with oral cancer was changed is shown in
Table 10 shows the absolute concentrations of polyamines and hypoxanthine measured by liquid chromatography-mass spectrometry (LC-MS) in saliva collected during fasting (no food from 9:00 of the previous night and none on the collection day) from 17 healthy subjects, 21 patients with pancreatic cancer, 16 patients with breast cancer, and 20 patients with oral cancer. Because the number of cases was small, the P value for the difference in means was calculated using Student's t-test, a parametric test.
Table 10 shows the quantified values of the polyamines and of hypoxanthine, a metabolite other than the polyamines. Among the polyamines, when N1,N12-diacetylspermine was measured using CE-TOFMS, its peak overlapped the peak of another substance. When LC-qTOFMS was used, the two peaks could be measured separately. Herein, only the samples that were collected during fasting were used. The quantitative values determined for 17 cases of healthy subjects, 21 cases of pancreatic cancer, 18 cases of oral cancer, and 16 cases of breast cancer are given (the unit of the quantitative values is μM).
Saliva for LC-MS is treated as follows.
1) Saliva stored at −80° C. is thawed, and 30 μL thereof is added to 270 μL of a methanol and ammonium hydroxide solution adjusted to contain 2 μM 2-morpholinoethanesulfonic acid, and the mixture is stirred.
2) The mixture is centrifuged at 4° C. and 15,000 rpm for 10 minutes, and the entire upper layer is transferred to another tube.
3) The whole amount of the liquid is subjected to centrifugal concentration and then redissolved in 18 μL of 90% MeOH and 12 μL of borate buffer.
4) 5 μL of the liquid is used for LC-MS analysis, and 20 μL of the liquid is used for ELISA analysis.
5) In the LC-MS analysis, 10 μL of ultrapure water containing 4 μM methionine sulfone is added to 5 μL of the aforementioned solution, and the resulting dilution is used as the sample.
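The dilution arithmetic implied by steps 1) and 5) can be checked with the short sketch below. It assumes that volumes are simply additive; the helper names are hypothetical and introduced here only for illustration.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution of a sample after adding a diluent (additive volumes)."""
    return (sample_ul + diluent_ul) / sample_ul


def final_concentration(stock_um, stock_ul, total_ul):
    """Concentration (in μM) of a stock component after mixing to total_ul."""
    return stock_um * stock_ul / total_ul


# Step 1): 30 μL of saliva in 270 μL of solvent.
step1_fold = dilution_factor(30.0, 270.0)  # 10-fold

# Step 5): 5 μL of redissolved sample plus 10 μL of water that
# contains 4 μM methionine sulfone.
step5_fold = dilution_factor(5.0, 10.0)  # 3-fold
met_sulfone_um = final_concentration(4.0, 10.0, 15.0)  # about 2.67 μM
```

Under this assumption, the sample injected for LC-MS is a 3-fold dilution of the redissolved liquid, and methionine sulfone is present at about 2.67 μM.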
Measurement conditions of LC-MS are as follows.
LC system: Agilent Technologies 1290 Infinity
Mobile phase: Solvent A: water containing 1% formic acid; Solvent B: acetonitrile containing 0.1% formic acid
Flow rate: 0.5 mL/min
Stop time: 7 min
Post time: 3 min
Column: CAPCELL CORE PC (Shiseido: 2.1 mm×50 mm, 2.7 μm)
Column temp.: 50° C.
Injection volume: 1 μL
Gas temp: 350° C.
Gas flow: 13 L/min
Nebulizer gas: 55 psig
According to the present invention, when the concentration of saliva is corrected (normalized), using data analysis of a correlation network reduces the influence of the concentration. Even in saliva in which concentrations vary greatly, a subject with pancreatic cancer can be distinguished from a healthy subject. The present method makes prediction of chronic pancreatitis, IPMN, breast cancer, and oral cancer possible.
The range in which a test can be performed using the markers of the present invention is determined by the value of the concentration-correcting marker, which reflects the saliva concentration, and saliva whose overall concentration falls outside this range should be treated as an outlier. For saliva within the range, a patient with each cancer can be distinguished from a healthy subject by a mathematical model that combines the markers, using either absolute concentrations or corrected relative concentrations.
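The range check described above can be sketched as a simple partition of samples by the value of the concentration-correcting marker. Glutamine is used as the marker here because it is named as such elsewhere in this description; the function name and the numerical bounds in the usage note are hypothetical, since no specific limits are stated here.

```python
def split_by_marker_range(samples, lower, upper, marker="glutamine"):
    """Partition samples into (testable, outliers) by the marker value.

    samples: list of dicts of metabolite name -> concentration
    A sample is testable when lower <= sample[marker] <= upper.
    """
    testable, outliers = [], []
    for sample in samples:
        if lower <= sample[marker] <= upper:
            testable.append(sample)
        else:
            outliers.append(sample)
    return testable, outliers
```

For example, with placeholder bounds of 10 and 100, samples whose glutamine value is 5 or 250 would be set aside as outliers, and only the remaining samples would be passed to the mathematical model.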
Even with saliva whose concentration varies widely, pancreatic cancer, breast cancer, and oral cancer can be detected at an early stage and distinguished from healthy subjects.
Number | Date | Country | Kind |
---|---|---|---|
2013-223738 | Oct 2013 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2014/078671 | 10/28/2014 | WO | 00 |