The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Oct. 27, 2021, is named 56686-702_301_SL.txt and is 13,273 bytes in size.
This invention is in the field of diagnosis and in particular the diagnosis of irritable bowel syndrome (IBS).
Irritable bowel syndrome (IBS) is a common condition that affects the digestive system. Results from global epidemiological studies have shown that IBS is present in 3% to 30% of a population, with no common trend across different countries (1). Symptoms include cramps, bloating, diarrhoea and constipation and occur over a long time period, generally years. Disorders such as anxiety, major depression, and chronic fatigue syndrome are common among people with IBS. There is no known cure for IBS and treatment is generally carried out to improve symptoms. Treatment may include dietary changes, medication, probiotics, and/or counselling. Dietary measures that are commonly suggested as treatments include increasing soluble fiber intake, a gluten-free diet, or a short-term diet low in fermentable oligosaccharides, disaccharides, monosaccharides, and polyols (FODMAPs). The medication loperamide is used to help with diarrhea while laxatives are used to help with constipation. Antidepressants may improve overall symptoms and pain. Like most chronic non-communicable disorders, IBS appears to be heterogeneous (2). It ranges in severity from nuisance bowel disturbance to social disablement, accompanied by marked symptomatic heterogeneity (3). Although frequently considered a disorder of the brain-gut axis (4,5), it is unclear if IBS begins in the gut or in the brain or both. The occurrence of post-infectious IBS (6) suggests that a proportion of cases are initiated in the end-organ, albeit with susceptibility risk factors, some of which may be psychosocial. Advances in microbiome science, with emerging evidence for a modifying influence by the microbiota on neurodevelopment and perhaps on behaviour, have broadened the concept of the mind/body link to encompass the microbiota-gut-brain axis (7).
However, progress in understanding and treating IBS has been limited by the absence of reliable biomarkers and IBS is still defined by symptoms. Currently, gastrointestinal (GI) diseases such as IBS are standardised using the Rome criteria. Diagnosis of IBS using the Rome Criteria is based on whether the patient has symptoms which are associated with IBS. These criteria were established by a group of experts in functional gastrointestinal disorders, known as the Rome Consensus Commission, in order to develop and provide guidance in research. They have been updated in five separate editions, to make them more relevant outside of research, and useful in improving clinical trials (1,8). However, results from one study (1) have shown that the prevalence of IBS is dependent on which edition of the Rome criteria is applied; the later editions exhibited a lower prevalence of IBS amongst populations.
Other criteria used to diagnose IBS include the WONCA criteria, involving the exclusion of other organic diseases, and DSM (Diagnostic and Statistical Manual for Mental Disorders). Here, the analysis included before diagnosis is minimal, with specialist examination occurring only as an exception (1). Investigations have been carried out into gut microbiota alterations in patients with IBS compared to control (non-IBS) groups (9,10,11,12). Interaction of the microbiome with diet, antibiotics and enteric infections, all of which may be involved in IBS, is consistent with the hypothesis that microbiome alterations could activate or perpetuate pathophysiological mechanisms in the syndrome (13,14). Biomarkers have been found to be associated with IBS, which has provided more flexibility for defining subpopulations of IBS that are not based on clinical symptoms (1). However, robust microbiome signatures or biomarkers that separate IBS patients from controls and that help inform therapies are lacking, though signatures have been suggested for IBS severity (12). Furthermore, most microbiota studies to date have employed 16S rRNA profiling, and did not analyse bacterial metabolites.
The Rome criteria are also used to classify IBS subtypes (15). These subtypes are IBS-C, IBS-D and IBS-M. IBS-C is IBS with predominant constipation where stool types 1 and 2 (according to the Bristol stool chart) are present more than 25% of the time and stool types 6 and 7 are present less than 25% of the time. IBS-D is IBS with predominant diarrhoea where stool types 1 and 2 are present less than 25% of the time and stool types 6 and 7 more than 25% of the time. IBS-M is IBS where there is a mixture of IBS-C and IBS-D with stool types 1, 2, 6 and 7 present more than 25% of the time, and is known as IBS-mixed type. While these classifications can establish predominance of constipation over diarrhoea and diarrhoea over constipation, they are not very useful for long term treatment of IBS given the heterogeneous nature of the disease and the tendency of patients to move from one subtype classification to another within a given time period (16). The current approach has significant limitations, including failure to inform treatment of patients who alternate between subtypes, sometimes within days (17). A better understanding of this disease is required and, as with other gut-related illnesses, a change in gut microbiota can signal a change in disease pattern (18). Furthermore, the forms of diarrhoea or constipation can be diverse. Pharmaceutical agents designed to tackle polar opposite symptoms have the potential for severe unwanted adverse effects if prescribed for a patient who has been misclassified (19). Of interest are alterations in the microbiome of patients with IBS and what correlation, if any, there is with the symptoms of IBS. However, IBS subtypes (IBS-C, IBS-D, IBS-M) are not useful for distinguishing between the different microbiomes of patients diagnosed with IBS according to the Rome criteria.
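By way of non-limiting illustration only, the Bristol stool chart thresholds described above can be sketched in Python as follows. The function name, the input format (one stool type per recorded bowel movement) and the "unclassified" fallback are assumptions made for this sketch; IBS-M is read here as requiring both hard stools (types 1 and 2) and loose stools (types 6 and 7) to each exceed 25% of records, which is one interpretation of the wording above.

```python
def classify_ibs_subtype(stool_types):
    """Classify an IBS subtype from Bristol stool chart records.

    stool_types: list of integers 1-7, one per recorded bowel movement
    (hypothetical input format chosen for this sketch).
    """
    if not stool_types:
        raise ValueError("at least one stool record is required")

    n = len(stool_types)
    # Fraction of records that are hard stools (types 1 and 2)
    hard = sum(1 for t in stool_types if t in (1, 2)) / n
    # Fraction of records that are loose stools (types 6 and 7)
    loose = sum(1 for t in stool_types if t in (6, 7)) / n

    if hard > 0.25 and loose < 0.25:
        return "IBS-C"
    if loose > 0.25 and hard < 0.25:
        return "IBS-D"
    if hard > 0.25 and loose > 0.25:
        # Assumed reading of the mixed subtype: both exceed 25%
        return "IBS-M"
    return "unclassified"
```

Because the classification depends only on stool-form frequencies over the recording window, the same patient can receive a different label in a different window, which is consistent with the subtype instability noted above.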
There is a requirement for further and improved methods for diagnosing bowel disorders such as IBS, including the diagnosis of the various IBS subtypes.
The inventors have developed new and improved methods for diagnosing IBS. A comprehensive and detailed analysis of the microbiome, the metabolome and gene pathways in patients and control (non-IBS) individuals has allowed new indicators of disease to be identified. The invention therefore provides a method of diagnosing IBS in a patient comprising detecting: a bacterial strain of a taxon associated with IBS; a microbial gene involved in a pathway associated with IBS; and/or a metabolite associated with IBS. The inventors have also developed new and improved methods for stratification of patients with IBS. The invention therefore provides a method of classification of a patient with IBS to a subgroup based on the microbiome, comprising detecting: a bacterial strain of a taxon associated with an IBS subgroup and/or a metabolite associated with an IBS subgroup.
Bacterial Taxa as Predictive Features of IBS
The inventors have identified bacterial taxa that are predictive of IBS, as demonstrated in the examples. Accordingly, the invention provides methods for diagnosing IBS comprising detecting the presence of certain bacterial taxa. As detailed below, the bacterial taxa used in the invention may be defined with reference to 16S rRNA gene sequences, or the invention may use Linnaean taxonomy. Bacteria of either category of taxa may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics, metabolomics, or a combination of such techniques. Preferably, these methods comprise detecting bacteria (i.e. one or more bacterial strains) in a fecal sample from a patient. Alternatively, the bacteria may be detected from an oral sample, such as a swab. Generally, detecting a bacterial taxon associated with IBS in the methods of the invention comprises measuring the relative abundance of the bacteria in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacterial species which may include one or more of the following genera: Actinomyces, Oscillibacter, Paraprevotella, Lachnospiraceae, Erysipelotrichaceae and Coprococcus. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting a bacterial strain belonging to a genus selected from the group consisting of: Escherichia, Clostridium, Streptococcus, Parabacteroides, Turicibacter, Eubacterium, Bacteroides, Klebsiella, Pseudoflavonifractor, and Enterococcus. In a particular embodiment, the bacterial species is of the genus Actinomyces. In a particular embodiment, the bacterial species is of the genus Oscillibacter. In a particular embodiment, the bacterial species is of the genus Paraprevotella. In a particular embodiment, the bacterial species is of the genus Lachnospiraceae.
In a particular embodiment, the bacterial species is of the genus Erysipelotrichaceae. In a particular embodiment, the bacterial species is of the genus Coprococcus. In a particular embodiment, the bacterial species is of the genus Escherichia. In a particular embodiment, the bacterial species is of the genus Clostridium. In a particular embodiment, the bacterial species is of the genus Streptococcus. In a particular embodiment, the bacterial species is of the genus Parabacteroides. In a particular embodiment, the bacterial species is of the genus Turicibacter. In a particular embodiment, the bacterial species is of the genus Eubacterium. In a particular embodiment, the bacterial species is of the genus Bacteroides. In a particular embodiment, the bacterial species is of the genus Klebsiella. In a particular embodiment, the bacterial species is of the genus Pseudoflavonifractor. In a particular embodiment, the bacterial species is of the genus Enterococcus. In preferred embodiments, the method of the invention comprises detecting bacteria (i.e. one or more bacterial strains) of more than one of the genera listed in Table 1, such as detecting bacteria of Actinomyces, Oscillibacter, Paraprevotella, Lachnospiraceae, Erysipelotrichaceae and Coprococcus. In certain embodiments, the bacteria (i.e. one or more bacterial strains) may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics or metabolomics. In any such embodiments, detecting the bacteria comprises measuring the relative abundance of the bacteria in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. The examples demonstrate that such methods are particularly effective.
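As a minimal sketch of what measuring relative abundance against a reference value could look like in practice, the following Python converts per-taxon read counts to relative abundances and expresses each as a log2 ratio to a reference. The function names, the pseudocount and the example counts are hypothetical and chosen only for illustration; the specification does not prescribe this particular calculation.

```python
import math


def relative_abundance(counts):
    """Convert raw per-taxon counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample contains no counts")
    return {taxon: c / total for taxon, c in counts.items()}


def log2_ratio_to_reference(sample_counts, reference, pseudocount=1e-6):
    """Log2 ratio of sample relative abundance to a reference value.

    reference: dict mapping taxon -> reference relative abundance, for
    example from a control (non-IBS) sample. The pseudocount is an
    assumption of this sketch that avoids division by zero when a
    taxon is absent.
    """
    rel = relative_abundance(sample_counts)
    return {
        taxon: math.log2((rel.get(taxon, 0.0) + pseudocount) / (ref + pseudocount))
        for taxon, ref in reference.items()
    }


# Hypothetical counts for two of the genera named above
patient = {"Actinomyces": 40, "Coprococcus": 60}
control_reference = {"Actinomyces": 0.1, "Coprococcus": 0.9}
ratios = log2_ratio_to_reference(patient, control_reference)
# Positive values indicate enrichment relative to the reference,
# negative values indicate depletion.
```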
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial species selected from the following: Ruminococcus gnavus, Coprococcus catus, Barnesiella intestinihominis, Anaerotruncus colihominis, Eubacterium eligens, Clostridium symbiosum, Roseburia inulinivorans, Paraprevotella clara, Ruminococcus lactaris, Clostridium citroniae, Clostridium leptum, Ruminococcus bromii, Bacteroides thetaiotaomicron, Eubacterium biforme, Bifidobacterium adolescentis, Parabacteroides distasonis, Dialister invisus, Bacteroides faecis, Butyrivibrio crossotus, Clostridium nexile, Bacteroides cellulosilyticus, Pseudoflavonifractor capillosus, Streptococcus anginosus, Streptococcus sanguinis, Desulfovibrio desulfuricans and/or Clostridium ramosum. In certain embodiments, the method of the invention comprises detecting two or more species from the above list, such as at least 5, 10, 15, 20 or all of the species. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial strains that may be selected from the list consisting of Lachnospiraceae bacterium_3_1_46FAA, Lachnospiraceae bacterium_7_1_58FAA, Lachnospiraceae bacterium_1_4_56FAA, Lachnospiraceae bacterium_2_1_58FAA, Coprococcus sp_ART55_1, Alistipes sp_AP11 and/or Bacteroides sp_1_1_6, or corresponding strains, such as strains with a 16S rRNA gene sequence that is at least 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% identical to the 16S rRNA gene sequence of the reference bacterium. In certain embodiments, the method of the invention comprises detecting two or more bacteria from the above list, such as at least 3, 4, 5 or all of the bacteria. In any such embodiments, detecting the bacteria comprises measuring the relative abundance of the bacteria in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In certain embodiments, the bacteria (i.e. one or more bacterial strains) may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics or metabolomics.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial species selected from the following: Prevotella buccalis, Butyricicoccus pullicaecorum, Granulicatella elegans, Pseudoflavonifractor capillosus, Clostridium ramosum, Streptococcus sanguinis, Clostridium citroniae, Desulfovibrio desulfuricans, Haemophilus pittmaniae, Paraprevotella clara, Streptococcus anginosus, Anaerotruncus colihominis, Clostridium symbiosum, Mitsuokella multacida, Clostridium nexile, Lactobacillus fermentum, Eubacterium biforme, Clostridium leptum, Bacteroides pectinophilus, Coprococcus catus, Eubacterium eligens, Roseburia inulinivorans, Bacteroides faecis, Barnesiella intestinihominis, Bacteroides thetaiotaomicron, Ruminococcus bromii, Ruminococcus gnavus, Ruminococcus lactaris, Parabacteroides distasonis, Butyrivibrio crossotus, Bacteroides cellulosilyticus, Bifidobacterium adolescentis, and/or Dialister invisus. In certain embodiments, the method of the invention comprises detecting two or more species from the above list, such as at least 5, 10, 15, 20 or all of the species. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial strains that may be selected from the list consisting of Lachnospiraceae bacterium_2_1_58FAA, Lachnospiraceae bacterium_7_1_58FAA, Lachnospiraceae bacterium_1_4_56FAA, Lachnospiraceae bacterium_3_1_46FAA, Alistipes sp_AP11, Bacteroides sp_1_1_6, and/or Coprococcus sp_ART55_1, or corresponding strains, such as strains with a 16S rRNA gene sequence that is at least 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% identical to the 16S rRNA gene sequence of the reference bacterium. In certain embodiments, the method of the invention comprises detecting two or more bacteria from the above list, such as at least 3 or 4 or all of the bacteria. In any such embodiments, detecting the bacteria (i.e. one or more bacterial strains) comprises measuring the relative abundance of the bacteria in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In certain embodiments, the bacteria (i.e. one or more bacterial strains) may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics or metabolomics.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial strains belonging to an operational taxonomic unit (OTU) associated with IBS. As known in the art, an operational taxonomic unit (OTU) is an operational definition used to classify groups of closely related individuals. As used herein, an “OTU” is a group of organisms which are grouped by DNA sequence similarity of a specific taxonomic marker gene (49). In some embodiments, the specific taxonomic marker gene is the 16S rRNA gene. In some embodiments, the Ribosomal Database Project (RDP) taxonomic classifier is used to assign taxonomy to representative OTU sequences. For example, the sequence information in Table 12 can be used to classify whether bacteria (i.e. one or more bacterial strains) belong to the OTUs listed in Table 11. Bacteria having at least 97% sequence identity to the sequences in Table 12 belong to the corresponding OTUs in Table 11. In preferred embodiments, the OTU is selected from tables 1, 11 and/or 12. In any such embodiments, detecting the bacteria (i.e. one or more bacterial strains) comprises measuring the relative abundance of the bacteria in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In certain embodiments, the bacterial species belongs to a sequence-based taxon. In preferred embodiments, the sequence-based taxon is selected from tables 1-3.
In one embodiment, a bacterial species or strain predictive of IBS is more abundant in patients suffering from IBS. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial species or strain, wherein increased abundance is associated with IBS, and wherein the strain or species is selected from: Ruminococcus gnavus, Lachnospiraceae bacterium_3_1_46FAA, Lachnospiraceae bacterium_7_1_58FAA, Anaerotruncus colihominis, Lachnospiraceae bacterium_1_4_56FAA, Clostridium symbiosum, Clostridium citroniae, Lachnospiraceae bacterium_2_1_58FAA, Clostridium nexile, and/or Clostridium ramosum. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial species or strains which is more abundant in patients suffering from IBS. In certain embodiments, the method of the invention comprises detecting two or more species or strains from the above list, such as at least 5, 10, 15, 20 or all of the species.
In one embodiment, the bacterial species predictive of IBS is significantly more abundant in patients suffering from IBS. In a preferred embodiment, the bacterial species predictive of IBS that is significantly more abundant in patients suffering from IBS is Ruminococcus gnavus and/or Lachnospiraceae spp.
In one embodiment, a bacterial species or strain predictive of IBS is less abundant in patients suffering from IBS. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial species or strain, wherein decreased abundance is associated with IBS, and wherein the strain or species is selected from: Coprococcus catus, Barnesiella intestinihominis, Eubacterium eligens, Paraprevotella clara, Ruminococcus lactaris, Eubacterium biforme, and/or Coprococcus sp_ART55_1. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial species or strains which are less abundant in patients suffering from IBS.
In one embodiment, the bacterial species predictive of IBS is significantly less abundant in patients suffering from IBS. In a preferred embodiment, the bacterial species predictive of IBS that is significantly less abundant in patients suffering from IBS is Barnesiella intestinihominis and/or Coprococcus catus.
In a particular embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacterial taxa which are predictive of IBS selected from table 2. In certain embodiments, the bacterial taxa predictive of IBS are significantly more abundant in patients suffering from IBS, for example as shown in tables 2 and/or 3. In other embodiments, the bacterial taxa predictive of IBS are significantly less abundant in patients suffering from IBS, for example as shown in tables 2 and/or 3.
In one embodiment, a bacterial species or strain predictive of IBS is differentially abundant in patients suffering from IBS. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial species, wherein differential abundance is associated with IBS, and wherein the species is selected from: Ruminococcus gnavus, Clostridium bolteae, Anaerotruncus colihominis, Flavonifractor plautii, Clostridium clostridioforme, Clostridium hathewayi, Clostridium symbiosum, Ruminococcus torques, Alistipes senegalensis, Prevotella copri, Eggerthella lenta, Clostridium asparagiforme, Barnesiella intestinihominis, Clostridium citroniae, Eubacterium eligens, Clostridium ramosum, Coprococcus catus, Eubacterium biforme, Ruminococcus lactaris, Bacteroides massiliensis, Haemophilus parainfluenzae, Clostridium nexile, Clostridium innocuum, Bacteroides xylanisolvens, Oxalobacter formigenes, Alistipes putredinis, Paraprevotella clara and/or Odoribacter splanchnicus. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial strain, wherein differential abundance is associated with IBS, and wherein the strain is selected from: Clostridiales bacterium 1 7 47FAA, Lachnospiraceae bacterium 1 4 56FAA, Lachnospiraceae bacterium 5 1 57FAA, Lachnospiraceae bacterium 3 1 46FAA, Lachnospiraceae bacterium 7 1 58FAA, Coprococcus sp ART55 1, Lachnospiraceae bacterium 3 1 57FAA CT1, Lachnospiraceae bacterium 2 1 58FAA and/or Eubacterium sp 3 1 31. In certain embodiments, the bacteria (i.e. one or more bacterial strains) may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics or metabolomics.
In one embodiment, a bacterial species or strain predictive of IBS is differentially abundant in patients suffering from IBS. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial species, wherein differential abundance is associated with IBS, and wherein the species is selected from: Escherichia coli, Streptococcus anginosus, Parabacteroides johnsonii, Streptococcus gordonii, Clostridium bolteae, Turicibacter sanguinis, Paraprevotella xylaniphila, Streptococcus mutans, Bacteroides plebeius, Clostridium clostridioforme, Klebsiella pneumoniae, Clostridium hathewayi, Bacteroides fragilis, Prevotella disiens, Clostridium leptum, Pseudoflavonifractor capillosus, Bacteroides intestinalis, Enterococcus faecalis, Streptococcus infantis, Alistipes shahii, Clostridium asparagiforme, Clostridium symbiosum and/or Streptococcus sanguinis. In a particular embodiment, the method of the invention comprises measuring the abundance of a bacterial strain, wherein differential abundance is associated with IBS, and wherein the strain is selected from: Clostridiales bacterium 1 7 47FAA, Eubacterium sp 3 1 31, Lachnospiraceae bacterium 5 1 57FAA, Clostridiaceae bacterium JC118 and/or Lachnospiraceae bacterium 1 4 56FAA. In certain embodiments, the bacteria (i.e. one or more bacterial strains) may be detected using clade-specific bacterial genes, 16S sequences, transcriptomics or metabolomics.
In one embodiment, the fecal microbiota alpha diversity of patients with IBS is reduced. In one embodiment, the intra-individual microbiota diversity of patients with IBS is reduced. In one embodiment, the fecal microbiota alpha diversity of patients with IBS is significantly lower than non-IBS patients. In one embodiment, the intra-individual microbiota diversity of patients with IBS is significantly lower than non-IBS patients. In a further embodiment, the microbiota alpha diversity is not significantly different between IBS clinical subtypes.
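Alpha diversity can be quantified in several ways and the passage above does not name a specific index. Purely as an illustrative sketch, the following Python computes the Shannon index, a commonly used alpha diversity measure, from per-taxon counts in a single sample. The choice of the Shannon index, the function name and the example counts are assumptions of this sketch.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts.

    counts: iterable of non-negative per-taxon counts for one sample.
    """
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no counts")

    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical samples: an even community and a skewed community
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

# The even community has the higher index; a reduced index in a
# patient sample relative to controls would correspond to the
# reduced alpha diversity described above.
assert shannon_diversity(even_sample) > shannon_diversity(skewed_sample)
```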
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more bacterial strains belonging to an operational taxonomic unit (OTU) associated with IBS. In preferred embodiments, the OTU is selected from table 11. In one embodiment, the OTU associated with IBS is classified as belonging to the Firmicutes phylum. In a particular embodiment, the OTU associated with IBS is classified as belonging to the Clostridia class. In a particular embodiment, the OTU associated with IBS is classified as belonging to the Clostridiales order. In a particular embodiment, the OTU associated with IBS is classified as belonging to the Clostridiales Lachnospiraceae family or the Ruminococcaceae family. In a particular embodiment, the OTU associated with IBS is classified as belonging to the Butyricicoccus genus.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacterial strains belonging to one or more OTUs listed in Table 11. The sequences in Table 12 can be used to classify bacteria as belonging to the OTUs listed in Table 11. Bacteria (i.e. one or more bacterial strains) having at least 97% sequence identity to the sequences in Table 12 belong to the corresponding OTUs in Table 11. The alignment is across the length of the sequence. In both MetaPhlAn2 and HUMAnN2 runs, alignment for species composition is done using Bowtie 2. Bowtie 2 is run with the “very-sensitive” argument and the alignment performed is a “global alignment”.
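The 97% identity rule can be illustrated with a small, self-contained Python sketch. It is not the Bowtie 2 based workflow mentioned above: it uses a textbook Needleman-Wunsch global alignment with arbitrary scoring values, defines identity as matching columns divided by alignment length, and uses short made-up sequences and hypothetical OTU labels in place of the sequences of Table 12. All of these are assumptions made for illustration.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity over a Needleman-Wunsch global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting matches and columns
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                if a[i - 1] == b[j - 1]:
                    matches += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns if columns else 0.0


def assign_otu(query, otu_references, threshold=97.0):
    """Return the best-matching OTU label at or above the threshold, else None."""
    best_otu, best_identity = None, 0.0
    for otu_id, reference in otu_references.items():
        identity = global_identity(query, reference)
        if identity > best_identity:
            best_otu, best_identity = otu_id, identity
    return best_otu if best_identity >= threshold else None


# Hypothetical reference sequences standing in for Table 12 entries
references = {"OTU_x": "ACGTACGTAC", "OTU_y": "TTTTTTTTTT"}
```

The dynamic programming table grows with the product of the two sequence lengths, so this form is suited only to short marker-gene fragments; dedicated aligners are used for real sequencing data.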
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 1. In certain such embodiments, the bacteria is classified as belonging to the Lachnospiraceae family.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 2. In certain such embodiments, the bacteria is classified as belonging to the Firmicutes phylum.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 3. In certain such embodiments, the bacteria is classified as belonging to the Butyricicoccus genus.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 4. In certain such embodiments, the bacteria is classified as belonging to the Lachnospiraceae family.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 5. In certain such embodiments, the bacteria is classified as belonging to the Clostridiales order.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 6. In certain such embodiments, the bacteria is classified as belonging to the Ruminococcaceae family.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 7. In certain such embodiments, the bacteria is classified as belonging to the Ruminococcaceae family.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 8. In certain such embodiments, the bacteria is classified as belonging to the Firmicutes phylum.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 9. In certain such embodiments, the bacteria is classified as belonging to the Ruminococcaceae family.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting bacteria (i.e. one or more bacterial strains) having a 16S rRNA gene sequence at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to SEQ ID No: 10. In certain such embodiments, the bacteria is classified as belonging to the Lachnospiraceae family.
In preferred embodiments, the invention provides a method for diagnosing IBS, comprising detecting different bacteria (i.e. one or more bacterial strains) having 16S rRNA gene sequences at least 97% (e.g. 98%, 99%, 99.5% or 100%) identical to two or more of SEQ ID No:1-10, such as 5, 8, or all of SEQ ID No:1-10.
Alteration of Pathways as a Predictor of IBS
The inventors have identified that certain pathways are over- or underrepresented in the genomes of the microbiota of patients suffering from IBS. Therefore, the invention provides methods for diagnosing IBS based on the presence or abundance of genes, pathways, or bacteria carrying such genes. Methods of diagnosis comprising detecting genes involved in one or more of the pathways identified herein may be particularly useful for use with different populations of patients because different patient populations may have different microbiome populations.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting microbial genes involved in one or more of the pathways selected from the list in table 4. In certain embodiments, the presence, or increased abundance relative to a control (non-IBS) individual, of genes involved in a pathway recited in Table 4 is associated with IBS. In a preferred embodiment, the method comprises detecting genes involved in amino acid biosynthesis/degradation pathways. The data show that these pathways are significantly more abundant in patients with IBS. In a preferred embodiment, the method comprises detecting genes involved in the starch degradation V pathway. The data show that such genes are significantly more abundant in patients with IBS. In another embodiment, genes that are significantly more abundant in patients with IBS are associated with Lachnospiraceae and Ruminococcus species. In certain embodiments, the method of the invention comprises detecting genes involved in at least 2, 5, 10, 15, 20 or 30 of the pathways in table 4. In any such embodiments, detecting the genes comprises measuring the relative abundance of the genes, or bacteria carrying the genes in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In certain embodiments, the presence of the microbial genes is detected by detecting metabolites in the sample. In certain embodiments, the presence of the microbial genes is detected by detecting a taxon of bacteria known to carry the microbial genes.
In other embodiments, the absence, or decreased abundance relative to a control (non-IBS) individual, of genes involved in a pathway is associated with IBS, for example as shown in table 4. In a preferred embodiment, genes involved in galactose degradation, sulfate reduction, sulfate assimilation and cysteine biosynthesis pathways are detected. The data show that these pathways are significantly less abundant in patients with IBS. In a particular embodiment, pathways indicative of sulphur metabolism are less abundant in patients with IBS. In any such embodiments, detecting the genes comprises measuring the relative abundance of the genes, or bacteria carrying the genes in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In certain embodiments, methods comprising detecting the presence or absence or relative abundance of genes involved in a pathway comprise detecting nucleic acid sequences in a sample from the patient. Additionally or alternatively, the methods comprise detecting bacterial species known to carry the genes of the relevant pathway.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting the differential abundance of one or more pathways predictive of IBS relative to control (non-IBS) individuals. In a particular embodiment, the adenosine ribonucleotide de novo biosynthesis functional pathway is differentially abundant in IBS relative to control (non-IBS) individuals. In a preferred embodiment, the adenosine ribonucleotide de novo biosynthesis functional pathway is more abundant in IBS patients relative to control (non-IBS) individuals.
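The comparison of pathway abundance in a patient sample against a control sample, as described above, can be sketched as follows. This is a minimal illustration only: the function names, the pseudocount and the example counts are hypothetical and are not taken from the examples or from table 4. The direction of change in the toy data simply mirrors the description (starch degradation V higher, sulfate reduction lower).

```python
import math

def relative_abundance(counts):
    """Convert raw gene counts per pathway into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {pathway: n / total for pathway, n in counts.items()}

def log2_fold_change(patient, control, pseudocount=1e-6):
    """log2 ratio of patient to control relative abundance for each shared pathway."""
    p = relative_abundance(patient)
    c = relative_abundance(control)
    return {
        pathway: math.log2((p[pathway] + pseudocount) / (c[pathway] + pseudocount))
        for pathway in p.keys() & c.keys()
    }

# Hypothetical counts, for illustration only (not data from the examples)
patient = {"starch degradation V": 300, "sulfate reduction": 100, "other": 600}
control = {"starch degradation V": 150, "sulfate reduction": 200, "other": 650}
changes = log2_fold_change(patient, control)
```

A positive value in `changes` indicates a pathway that is more abundant in the patient sample than in the control, and a negative value indicates one that is less abundant. A fixed reference value could be substituted for the control sample in the same way.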
Alteration of Metabolomes as a Predictor of IBS
The inventors have identified metabolites that are associated with IBS and the invention provides methods for diagnosing IBS that comprise detecting such metabolites. Methods of diagnosis comprising detecting metabolites identified herein may be particularly useful with different populations of patients because different patient populations may have different microbiome populations, but there may be more uniformity in terms of detectable metabolites. Generally, detecting a metabolite associated with IBS in the methods of the invention comprises measuring the concentration of the metabolite in a sample or measuring changes in the concentration of a metabolite and optionally comparing the concentration to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In some embodiments, detecting a metabolite associated with IBS in the methods of the invention comprises measuring the concentration of a precursor of the metabolite and optionally comparing the concentration to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In some embodiments, detecting a metabolite associated with IBS in the methods of the invention comprises measuring the concentration of a breakdown product of the metabolite and optionally comparing the concentration to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In certain embodiments, the method comprises detecting a bacterial taxon known to produce a metabolite predictive of IBS.
Alteration of Urine Metabolomes as a Predictor of IBS
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting urine metabolites which may include one or more of the following: A 80987, Ala-Leu-Trp-Gly, Medicagenic acid 3-O-b-D-glucuronide and/or (−)-Epigallocatechin sulfate. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites selected from the list in table 5. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In other embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, and normalising the concentration relative to urine creatinine levels in each sample. In some embodiments, the method comprises detecting a precursor or breakdown product of the above metabolites. In one embodiment, machine learning is applied to urine metabolome data to diagnose IBS.
In a particular embodiment, the method comprises detecting adenosine, such as measuring the concentration of adenosine in a sample. The examples demonstrate that adenosine is more abundant in IBS patients relative to control (non-IBS) individuals. Thus, a level of adenosine that is increased relative to a healthy control is indicative of IBS.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that are differentially abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS). In one embodiment, the one or more urine metabolites that are differentially abundant in patients suffering from IBS are: N-Undecanoylglycine, Gamma-glutamyl-Cysteine, Alloathyriol, Trp-Ala-Pro, A 80987, Medicagenic acid 3-O-b-D-glucuronide, Ala-Leu-Trp-Gly, Butoctamide hydrogen succinate, (−)-Epicatechin sulfate, 1,4,5-Trimethyl-naphtalene, Tricetin 3′-methyl ether 7,5′-diglucuronide, Torasemide, (−)-Epigallocatechin sulfate, Dodecanedioylcarnitine, 1,6,7-Trimethylnaphthalene, Tetrahydrodipicolinate, Sumiki's acid, Silicic acid, Delphinidin 3-(6″-O-4-malyl-glucosyl)-5-glucoside, L-Arginine, Leucyl-Methionine, Phe-Gly-Gly-Ser, Gln-Met-Pro-Ser, Creatinine, Ala-Asn-Cys-Gly, 2-hydroxy-2-(hydroxymethyl)-2H-pyran-3(6H)-one, Thiethylperazine, 5-((2-iodoacetamido)ethyl)-1-aminonapthalene sulfate, dCTP, Isoleucyl-Proline, 3,4-Methylenesebacic acid, Dimethylallylpyrophosphate/Isopentenyl pyrophosphate, (4-Hydroxybenzoyl)choline, Diazoxide, 3,5-Di-O-galloyl-1,4-galactarolactone, 2-Hydroxypyridine, Decanoylcarnitine, Asp-Met-Asp-Pro, 3-Methyldioxyindole, (1S,3R,4S)-3,4-Dihydroxycyclohexane-1-carboxylate, Ala-Lys-Phe-Cys, 3-Indolehydracrylic acid, [FA (18:0)] N-(9Z-octadecenoyl)-taurine, Ferulic acid 4-sulfate, Urea, N-Carboxyacetyl-D-phenylalanine, 4-Methoxyphenylethanol sulfate, UDP-4-dehydro-6-deoxy-D-glucose, Linalyl formate, Demethyloleuropein, 5′-Guanosyl-methylene-triphosphate, Allyl nonanoate, 2-Phenylethyl octanoate, beta-Cellobiose, D-Galactopyranosyl-(1->3)-D-galactopyranosyl-(1->3)-L-arabinose, Cys-Phe-Phe-Gln, Hippuric acid, Cys-Pro-Pro-Tyr, Met-Met-Thr-Trp, methylphosphonate, 3′-Sialyllactosamine, 2,4,6-Octatriynoic acid, Delphinidin 3-O-3″,6″-O-dimalonylglucoside, L-Valine,
Met-Met-Cys, Cysteinyl-Cysteine, (all-E)-1,8,10-Heptadecatriene-4,6-diyne-3,12-diol, L-Lysine, Pivaloylcarnitine, Lenticin, Phenol glucuronide, Tyrosyl-Cysteine, Osmundalin, Tetrahydroaldosterone-3-glucuronide, N-Methylpyridinium, L-prolyl-L-proline, Glutarylcarnitine, [FA (15:4)] 6,8,10,12-pentadecatetraenal, Methyl bisnorbiotinyl ketone, Acetoin, LysoPC(18:2(9Z,12Z)), Hexyl 2-furoate, N-carbamoyl-L-glutamate, L-Homoserine, L-Asparagine, Tiglylcarnitine, Thymine, 3-hydroxypyridine, Menadiol disuccinate, 9-Decenoylcarnitine, Pyrocatechol sulfate, sedoheptulose anhydride, (+)-gamma-Hydroxy-L-homoarginine, Thioridazine, Cys-Glu-Glu-Glu, Marmesin rutinoside, L-Serine, L-Urobilinogen, Isobutyrylglycine, S-Adenosylhomocysteine, 2,3-dioctanoylglyceramide, 3-Methoxy-4-hydroxyphenylglycol glucuronide, sulfoethylcysteine, Hydroxyphenylacetylglycine, Pyrroline hydroxycarboxylic acid, 1-(alpha-Methyl-4-(2-methylpropyl)benzeneacetate)-beta-D-Glucopyranuronic acid, 2-Methylbutylacetate, N1-Methyl-4-pyridone-3-carboxamide, Cortolone-3-glucuronide, Asn-Cys-Gly, N6,N6,N6-Trimethyl-L-lysine, Benzylamine, 5-Hydroxy-L-tryptophan, Armillaric acid, Leucine/Isoleucine, 2-Butylbenzothiazole, D-Sedoheptulose 7-phosphate, [Fv Dimethoxy,methyl(9:1)] (2S)-5,7-Dimethoxy-3′,4′-methylenedioxyflavanone, Oxoadipic acid, Thr-Cys-Cys, Creatine, Hydroxybutyrylcarnitine, 5′-Dehydroadenosine, Phe-Thr-Val, dUDP, L-Glutamine and/or Kaempferol 3-(2″,3″-diacetyl-4″-p-coumaroylrhamnoside). In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites predictive of IBS. 
In one embodiment, the urine metabolite predictive of IBS is selected from: N-Undecanoylglycine, Gamma-glutamyl-Cysteine, Alloathyriol, Trp-Ala-Pro, A 80987, Medicagenic acid 3-O-b-D-glucuronide, Ala-Leu-Trp-Gly, Butoctamide hydrogen succinate, (−)-Epicatechin sulfate, 1,4,5-Trimethyl-naphtalene, Tricetin 3′-methyl ether 7,5′-diglucuronide, Torasemide, (−)-Epigallocatechin sulfate, Dodecanedioylcarnitine, 1,6,7-Trimethylnaphthalene, Tetrahydrodipicolinate, Sumiki's acid, Silicic acid, Delphinidin 3-(6″-O-4-malyl-glucosyl)-5-glucoside, L-Arginine, Leucyl-Methionine, Phe-Gly-Gly-Ser, Gln-Met-Pro-Ser, Creatinine, Ala-Asn-Cys-Gly, 2-hydroxy-2-(hydroxymethyl)-2H-pyran-3(6H)-one, Thiethylperazine, 5-((2-iodoacetamido)ethyl)-1-aminonapthalene sulfate, dCTP, Isoleucyl-Proline, 3,4-Methylenesebacic acid, Dimethylallylpyrophosphate/Isopentenyl pyrophosphate, (4-Hydroxybenzoyl)choline, Diazoxide, 3,5-Di-O-galloyl-1,4-galactarolactone, 2-Hydroxypyridine, Decanoylcarnitine, Asp-Met-Asp-Pro, 3-Methyldioxyindole, (1S,3R,4S)-3,4-Dihydroxycyclohexane-1-carboxylate, Ala-Lys-Phe-Cys, 3-Indolehydracrylic acid, [FA (18:0)] N-(9Z-octadecenoyl)-taurine, Ferulic acid 4-sulfate, Urea, N-Carboxyacetyl-D-phenylalanine, 4-Methoxyphenylethanol sulfate, UDP-4-dehydro-6-deoxy-D-glucose, Linalyl formate, Demethyloleuropein, 5′-Guanosyl-methylene-triphosphate, Allyl nonanoate, 2-Phenylethyl octanoate, beta-Cellobiose, D-Galactopyranosyl-(1->3)-D-galactopyranosyl-(1->3)-L-arabinose, Cys-Phe-Phe-Gln, Hippuric acid, Cys-Pro-Pro-Tyr, Met-Met-Thr-Trp, methylphosphonate, 3′-Sialyllactosamine, 2,4,6-Octatriynoic acid, Delphinidin 3-O-3″,6″-O-dimalonylglucoside, L-Valine, Met-Met-Cys, Cysteinyl-Cysteine, (all-E)-1,8,10-Heptadecatriene-4,6-diyne-3,12-diol, L-Lysine, Pivaloylcarnitine, Lenticin, Phenol glucuronide, Tyrosyl-Cysteine, Osmundalin, Tetrahydroaldosterone-3-glucuronide, N-Methylpyridinium, L-prolyl-L-proline, Glutarylcarnitine, [FA (15:4)] 6,8,10,12-pentadecatetraenal, Methyl bisnorbiotinyl ketone, Acetoin, LysoPC(18:2(9Z,12Z)), Hexyl 2-furoate, N-carbamoyl-L-glutamate, L-Homoserine, L-Asparagine, Tiglylcarnitine, Thymine, 3-hydroxypyridine, Menadiol disuccinate, 9-Decenoylcarnitine, Pyrocatechol sulfate, sedoheptulose anhydride, (+)-gamma-Hydroxy-L-homoarginine, Thioridazine, Cys-Glu-Glu-Glu, Marmesin rutinoside, L-Serine, L-Urobilinogen, Isobutyrylglycine, S-Adenosylhomocysteine, 2,3-dioctanoylglyceramide, 3-Methoxy-4-hydroxyphenylglycol glucuronide, sulfoethylcysteine, Hydroxyphenylacetylglycine, Pyrroline hydroxycarboxylic acid, 1-(alpha-Methyl-4-(2-methylpropyl)benzeneacetate)-beta-D-Glucopyranuronic acid, 2-Methylbutylacetate, N1-Methyl-4-pyridone-3-carboxamide, Cortolone-3-glucuronide, Asn-Cys-Gly, N6,N6,N6-Trimethyl-L-lysine, Benzylamine, 5-Hydroxy-L-tryptophan, Armillaric acid, Leucine/Isoleucine, 2-Butylbenzothiazole, D-Sedoheptulose 7-phosphate, [Fv Dimethoxy,methyl(9:1)] (2S)-5,7-Dimethoxy-3′,4′-methylenedioxyflavanone, Oxoadipic acid, Thr-Cys-Cys, Creatine, Hydroxybutyrylcarnitine, 5′-Dehydroadenosine, Phe-Thr-Val, dUDP, L-Glutamine and/or Kaempferol 3-(2″,3″-diacetyl-4″-p-coumaroylrhamnoside). In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting differential abundance of one or more urine metabolites selected from the list in table 6. In certain embodiments, the method of the invention comprises detecting 2, 5, 10, 15 or 20 or all of the metabolites from table 6. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In some embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, and normalising the concentration relative to urine creatinine levels in each sample.
In some embodiments, the method comprises detecting a precursor or breakdown product of the above metabolites.
In certain embodiments, the abundance of urine metabolites is significantly increased in patients with IBS, for example as shown in table 6. In one embodiment, the method comprises detecting metabolites involved in fatty acid oxidation and/or fatty acid metabolism, which are significantly more abundant in patients with IBS. In a preferred embodiment, N-Undecanoylglycine is detected, which is significantly more abundant in patients with IBS. In another preferred embodiment, Decanoylcarnitine is detected, which is significantly more abundant in patients with IBS.
In one embodiment, a urine metabolite predictive of IBS is more abundant in patients suffering from IBS compared to a healthy control. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that have been found to be predictive that a patient is suffering from IBS. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that are more abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS). In certain embodiments, the abundance of urine metabolites is increased in patients with IBS, for example as shown in table 6 and/or table 21b. In one embodiment, the one or more urine metabolites that are more abundant in patients suffering from IBS are: A 80987, Medicagenic acid 3-O-b-D-glucuronide, N-Undecanoylglycine, Ala-Leu-Trp-Gly, Gamma-glutamyl-Cysteine, Butoctamide hydrogen succinate, (−)-Epicatechin sulfate, 1,4,5-Trimethyl-naphtalene, Trp-Ala-Pro, Dodecanedioylcarnitine, 1,6,7-Trimethylnaphthalene, Sumiki's acid, Phe-Gly-Gly-Ser, 2-hydroxy-2-(hydroxymethyl)-2H-pyran-3(6H)-one, 5-((2-iodoacetamido)ethyl)-1-aminonapthalene sulfate, Thiethylperazine, dCTP, Dimethylallylpyrophosphate/Isopentenyl pyrophosphate, Asp-Met-Asp-Pro, 3,5-Di-O-galloyl-1,4-galactarolactone, Decanoylcarnitine, [FA (18:0)] N-(9Z-octadecenoyl)-taurine, UDP-4-dehydro-6-deoxy-D-glucose, Delphinidin 3-O-3″,6″-O-dimalonylglucoside, Osmundalin and/or Cysteinyl-Cysteine. In a preferred embodiment, one or more urine metabolites selected from: A 80987, Medicagenic acid 3-O-b-D-glucuronide, N-Undecanoylglycine, Ala-Leu-Trp-Gly, and/or Gamma-glutamyl-Cysteine are detected, which are more abundant in patients with IBS compared to healthy controls.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting an increase in abundance of one or more urine metabolites selected from the list in table 6 and/or table 21b. In certain embodiments, the method of the invention comprises detecting 2, 5, 10, 15 or 20 or all of the metabolites from table 6 and/or table 21b. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In some embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, and normalising the concentration relative to urine creatinine levels in each sample. In some embodiments, the method comprises detecting a precursor or breakdown product of the above metabolites. In a preferred embodiment, epicatechin sulfate is detected, which is more abundant in patients with IBS. In a preferred embodiment, medicagenic acid 3-O-b-D-glucuronide is detected, which is more abundant in patients with IBS.
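The step of detecting an increase in a given number of panel metabolites relative to reference values can be sketched as follows. This is an illustration under stated assumptions: the function name, the `min_ratio` parameter, the reference values and the sample concentrations are all hypothetical, and only the metabolite names are taken from the lists above.

```python
def elevated_metabolites(sample, reference, min_ratio=1.0):
    """Return the panel metabolites whose concentration exceeds reference * min_ratio."""
    elevated = []
    for name, ref_value in reference.items():
        value = sample.get(name)
        if value is None:
            continue  # metabolite was not measured in this sample
        if value > ref_value * min_ratio:
            elevated.append(name)
    return elevated

# Hypothetical reference values and sample concentrations (arbitrary units)
reference = {"N-Undecanoylglycine": 1.0, "Decanoylcarnitine": 2.0, "A 80987": 0.5}
sample = {"N-Undecanoylglycine": 1.8, "Decanoylcarnitine": 1.5, "A 80987": 0.9}

elevated = elevated_metabolites(sample, reference)
# Example criterion: at least 2 panel metabolites are increased
meets_threshold = len(elevated) >= 2
```

The same structure applies to metabolites that are decreased in IBS by reversing the comparison.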
In certain embodiments, the abundance of urine metabolites is significantly decreased in patients with IBS, for example as shown in table 6. In one embodiment, the method comprises detecting metabolites involved in the biosynthesis of nitric oxide, which are significantly less abundant in patients with IBS. In one embodiment, amino acids, for example L-arginine, are significantly less abundant in patients with IBS.
In one embodiment, a urine metabolite predictive of IBS is less abundant in patients suffering from IBS compared to a healthy control. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that have been found to be predictive that a patient is not suffering from IBS, i.e. that the patient is a healthy control. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that are less abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS). In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that are more abundant in healthy controls (i.e. from one or more subjects who do not suffer from IBS) compared to patients suffering from IBS. In certain embodiments, the abundance of urine metabolites is decreased in patients with IBS, for example as shown in table 6 and/or table 21a.
In one embodiment, the one or more urine metabolites that are less abundant in patients suffering from IBS are: Tricetin 3′-methyl ether 7,5′-diglucuronide, Alloathyriol, Torasemide, (−)-Epigallocatechin sulfate, Tetrahydrodipicolinate, Silicic acid, Delphinidin 3-(6″-O-4-malyl-glucosyl)-5-glucoside, Creatinine, L-Arginine, Leucyl-Methionine, Gln-Met-Pro-Ser, Ala-Asn-Cys-Gly, Isoleucyl-Proline, 3,4-Methylenesebacic acid, (4-Hydroxybenzoyl)choline, Diazoxide, (1S,3R,4S)-3,4-Dihydroxycyclohexane-1-carboxylate, 2-Hydroxypyridine, Ala-Lys-Phe-Cys, 3-Methyldioxyindole, N-Carboxyacetyl-D-phenylalanine, Urea, Ferulic acid 4-sulfate, 3-Indolehydracrylic acid, Demethyloleuropein, 5′-Guanosyl-methylene-triphosphate, Linalyl formate, 4-Methoxyphenylethanol sulfate, Allyl nonanoate, D-Galactopyranosyl-(1->3)-D-galactopyranosyl-(1->3)-L-arabinose, Met-Met-Thr-Trp, Cys-Pro-Pro-Tyr, methylphosphonate, 2-Phenylethyl octanoate, Hippuric acid, Glutarylcarnitine and/or Cys-Phe-Phe-Gln. In a preferred embodiment, one or more urine metabolites selected from: Tricetin 3′-methyl ether 7,5′-diglucuronide, Alloathyriol, Torasemide, (−)-Epigallocatechin sulfate and/or Tetrahydrodipicolinate are detected, which are less abundant in patients with IBS compared to healthy controls. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting a decrease in abundance of one or more urine metabolites selected from the list in table 6 and/or table 21a. In certain embodiments, the method of the invention comprises detecting 2, 5, 10, 15 or 20 or all of the metabolites from table 6 and/or table 21a. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. 
In some embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, and normalising the concentration relative to urine creatinine levels in each sample. In some embodiments, the method comprises detecting a precursor or breakdown product of the above metabolites.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more urine metabolites that are differentially abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS). In a preferred embodiment, the one or more urine metabolites that are differentially abundant in patients suffering from IBS are sulfate, glucuronide, carnitine, glycine and glutamine conjugates. In one embodiment, the method comprises detecting metabolites involved in phase 2 metabolism, which is upregulated in patients with IBS. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In other embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, and normalising the concentration relative to urine creatinine levels in each sample.
Alteration of Fecal Metabolomes as a Predictor of IBS
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites selected from: 3-deoxy-D-galactose, Tyrosine, I-Urobilin, Adenosine, Glu-Ile-Ile-Phe, 3,6-Dimethoxy-19-norpregna-1,3,5,7,9-pentaen-20-one, 2-Phenylpropionate, MG(20:3(8Z,11Z,14Z)/0:0/0:0), 1,2,3-Tris(1-ethoxyethoxy)propane, Staphyloxanthin, Hexoses, 20-hydroxy-E4-neuroprostane, Nonyl acetate, 3-Feruloyl-1,5-quinolactone, trans-2-Heptenal, Pyridoxamine, L-Arginine, Dodecanedioic acid, Ursodeoxycholic acid, 1-(Malonylamino)cyclopropanecarboxylic acid, Cortisone, 9,10,13-Trihydroxystearic acid, Glu-Ala-Gln-Ser, Quasiprotopanaxatriol, N-Methylindolo[3,2-b]-5alpha-cholest-2-ene, PG(20:0/22:1(11Z)), (−)-Epigallocatechin, 2-Methyl-3-ketovaleric acid, Secoeremopetasitolide B, PC(20:1(11Z)/P-16:0), Glu-Asp-Asp, N5-acetyl-N5-hydroxy-L-ornithine acid, Silicic acid, (1xi,3xi)-1,2,3,4-Tetrahydro-1-methyl-beta-carboline-3-carboxylic acid, PS(36:5), Chorismate, Isoamyl isovalerate, PA(0-36:4), PE(P-28:0) and/or gamma-Glutamyl-S-methylcysteinyl-beta-alanine. In certain embodiments, the method of the invention comprises detecting at least 2, 5, 10, 15 or 20 or all of these metabolites. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In one embodiment, the invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites selected from: L-Phenylalanine, Adenosine, MG(20:3(8Z,11Z,14Z)/0:0/0:0), L-Alanine, 3,6-Dimethoxy-19-norpregna-1,3,5,7,9-pentaen-20-one, Glu-Ile-Ile-Phe, Glu-Ala-Gln-Ser, 2,4,8-Eicosatrienoic acid isobutylamide, Piperidine, Staphyloxanthin, beta-Carotinal, Hexoses, Ile-Arg-Ile, 11-Deoxocucurbitacin I, 1-(Malonylamino)cyclopropanecarboxylic acid, PG(37:2), [PR] gamma-Carotene/beta,psi-Carotene, 20-hydroxy-E4-neuroprostane, Ethylphenyl acetate, Dodecanedioic acid, Ile-Lys-Cys-Gly, Tuberoside, D-galactal, 3,6-Dihydro-4-(4-methyl-3-pentenyl)-1,2-dithiin, demethylmenaquinone-6, L-Arginine, PC(o-16:1(9Z)/14:1(9Z)), Mesobilirubinogen, Traumatic acid, alpha-Tocopherol succinate, 3-Methylcrotonylglycine, (S)-(E)-8-(3,6-Dimethyl-2-heptenyl)-4′,5,7-trihydroxyflavanone, xi-7-Hydroxyhexadecanedioic acid, beta-Pinene, Leu-Ser-Ser-Tyr, Orotic acid, Heptane-1-thiol, Glu-Asp-Asp, LysoPE(18:2(9Z,12Z)/0:0), LysoPE(22:0/0:0), Creatine, Inosine, SM(d32:2), Arg-Leu-Val-Cys, PS(0-18:0/15:0), Pyridoxamine, N-Heptanoylglycine, Hematoporphyrin IX, 3beta,5beta-Ketotriol, 2-Phenylpropionate, trans-2-Heptenal, LysoPC(0:0/18:0), Linoleoyl ethanolamide, LysoPE(24:0/0:0), 2-Methyl-3-hydroxyvaleric acid, Quasiprotopanaxatriol, N-oleoyl isoleucine, (−)-(E)-1-(4-Hydroxyphenyl)-7-phenyl-6-hepten-3-ol, [FA hydroxy(4:0)] N-(3S-hydroxy-butanoyl)-homoserine lactone, Riboflavin cyclic-4′,5′-phosphate, Arg-Lys-Trp-Val, PC(20:1(11Z)/P-16:0), 3,5-Dihydroxybenzoic acid, Tyrosine, 2,3-Epoxymenaquinone, His-Met-Val-Val, PI(41:2), Phenol, 3,3′-Dithiobis[2-methylfuran], Ala-Leu-Trp-Pro, 1,2,3-Tris(1-ethoxyethoxy)propane, Vanilpyruvic acid, 2-Hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate, Secoeremopetasitolide B, 2-O-Benzoyl-D-glucose, Ile-Leu-Phe-Trp, (R)-lipoic acid, PA(20:4(5Z,8Z,11Z,14Z)e/2:0), PE(P-16:0e/0:0), Benzyl isobutyrate, Hexyl 2-furoate, Trp-Ala-Ser, LysoPC(15:0), 
4-Hydroxycrotonic acid, 3-Feruloyl-1,5-quinolactone, Furfuryl octanoate, PC(22:2(13Z,16Z)/15:0), (−)-1-Methylpropyl 1-propenyl disulphide, PC (36:6), Leucyl-Glycine, CE(16:2), Triterpenoid, Violaxanthin, [FA hydroxy(17:0)] heptadecanoic acid, 2-Hydroxyundecanoate, Chorismate, delta-Dodecalactone, 3-O-Protocatechuoylceanothic acid, PG(16:1(9Z)/16:1(9Z)), p-Cresol sulfate, Quercetin 3′-sulfate, PS(26:0)), Ala-Leu-Phe-Trp, L-Glutamic acid 5-phosphate, N,2,3-Trimethyl-2-(1-methylethyl)butanamide, Isoamyl isovalerate, n-Dodecane, PC(14:1(9Z)/14:1(9Z)), Lucyoside Q, Endomorphin-1, 3-Hydroxy-10′-apo-b,y-carotenal, Pyrroline hydroxycarboxylic acid, S-Propyl 1-propanesulfinothioate, N-Methylindolo[3,2-b]-5alpha-cholest-2-ene, Tocopheronic acid, 1-(2,4,6-Trimethoxyphenyl)-1,3-butanedione, Homogentisic acid, LysoPE(18:1(9Z)/0:0), N-stearoyl valine, trans-Carvone oxide, 1,1′-Thiobis-1-propanethiol, 2-(Ethylsulfonylmethyl)phenyl methylcarbamate, menaquinone-4, Benzeneacetamide-4-O-sulphate, N5-acetyl-N5-hydroxy-L-ornithine, Succinic acid, Asn-Lys-Val-Pro, LysoPC(14:1(9Z)), Phenol glucuronide, 2-methyl-Butanoic acid, 2-methylbutyl ester, 3-O-Caffeoyl-1-O-methylquinic acid, [FA hydroxy(24:0)] 3-hydroxy-tetracosanoic acid, N-(2-hydroxyhexadecanoyl)-sphinganine-1-phospho-(1′-myo-inositol), gamma-Dodecalactone, PA(22:1(11Z)/0:0), Butyl butyrate, TG(20:5(5Z,8Z,11Z,14Z,17Z)/18:1(9Z)/22:5(7Z,10Z,13Z,16Z,19Z))[iso6], Clausarinol, 4-Methyl-2-pentanone, Trigoneline, Arg-Val-Pro-Tyr, 2,3-Methylenesuccinic acid, Serinyl-Threonine, Lycoperoside D, Geraniol, 1-18:2-lysophosphatidylglycerol, omega-6-Hexadecalactone, Ambrettolide, gamma-Glutamyl-S-methylcysteinyl-beta-alanine, FA oxo(22:0), D-Ribose, LysoPC(17:0), PA(0-36:4), C19 Sphingosine-1-phosphate, 4-Hydroxy-5-(dihydroxyphenyl)-valeric acid-O-methyl-O-sulphate, PE(14:1(9Z)/14:0), Citronellyl tiglate, Ethyl methylphenylglycidate (isomer 1), N-Acetyl-leu-leu-tyr and/or PS(O-34:3). 
In certain embodiments, the method of the invention comprises detecting at least 2, 5, 10, 15 or 20 or all of these metabolites. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In a preferred embodiment, the method comprises detecting the fecal metabolite L-tyrosine. In a preferred embodiment, the method comprises detecting L-arginine. In a preferred embodiment, the method comprises detecting the bile acid ursodeoxycholic acid (UDCA). In a preferred embodiment, the method comprises detecting the bile pigment I-Urobilin. In a preferred embodiment, the method comprises detecting dodecanedioic acid. In a preferred embodiment, the method comprises detecting L-Phenylalanine. In a preferred embodiment, the method comprises detecting Adenosine. In a preferred embodiment, the method comprises detecting MG(20:3(8Z,11Z,14Z)/0:0/0:0). In a preferred embodiment, the method comprises detecting L-Alanine. In a preferred embodiment, the method comprises detecting 3,6-Dimethoxy-19-norpregna-1,3,5,7,9-pentaen-20-one.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites selected from the list in table 7. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites selected from the list in table 13. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value. In one embodiment, machine learning is applied to fecal metabolome data to diagnose IBS.
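As an illustration of applying machine learning to metabolome data, the sketch below trains a nearest-centroid classifier on labelled samples and classifies a new sample. The choice of a nearest-centroid classifier is an assumption made purely for illustration, since this passage does not specify an algorithm; the function names and the feature values are likewise hypothetical.

```python
import math

def centroid(rows):
    """Mean of each feature across a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(features, labels):
    """Compute one centroid per class label."""
    by_label = {}
    for row, label in zip(features, labels):
        by_label.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, row):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], row))

# Hypothetical scaled concentrations of two metabolites per sample
features = [[2.1, 0.4], [1.9, 0.6], [0.8, 1.5], [1.0, 1.7]]
labels = ["IBS", "IBS", "control", "control"]

model = fit(features, labels)
prediction = predict(model, [2.0, 0.5])
```

In practice the feature vector would hold one entry per measured metabolite, and any classifier trained on labelled IBS and control samples could take the place of the nearest-centroid rule.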
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites that are differentially abundant in patients suffering from IBS. In one embodiment, the one or more fecal metabolites that are differentially abundant in patients suffering from IBS are: 2-Phenylpropionate, 3-Buten-1-amine, Adenosine, I-Urobilin, 2,3-Epoxymenaquinone, [FA (22:5)] 4,7,10,13,16-Docosapentaynoic acid, 3,6-Dimethoxy-19-norpregna-1,3,5,7,9-pentaen-20-one, Cucurbitacin S, N-Heptanoylglycine, 11-Deoxocucurbitacin I, Staphyloxanthin, Piperidine, Leu-Ser-Ser-Tyr, L-Urobilin, L-Phenylalanine, Ala-Leu-Trp-Pro, 3-Feruloyl-1,5-quinolactone, PG(P-16:0/14:0), 3-deoxy-D-galactose, MG(20:3(8Z,11Z,14Z)/0:0/0:0), Mesobilirubinogen, L-Alanine, Tyrosine, PG(O-30:1), beta-Pinene, 2,4,8-Eicosatrienoic acid isobutylamide, Glutarylglycine, [PR] gamma-Carotene/beta,psi-Carotene, Neuromedin B (1-3), Heptane-1-thiol, Violaxanthin, Isolimonene, Ile-Lys-Cys-Gly, His-Met-Val-Val, Allyl caprylate, Hydroxyprolyl-Tryptophan, Dodecanedioic acid, 2-O-Benzoyl-D-glucose, 2-Ethylsuberic acid, D-Urobilin, 20-hydroxy-E4-neuroprostane, PG(O-31:1), Anigorufone, Nonyl acetate, L-Arginine, PG(P-32:1), Glu-Ala-Gln-Ser, PG(31:0), Cucurbitacin I, Arg-Lys-Phe-Val, Genipinic acid, Hexoses, Lys-Phe-Phe-Phe, PI(41:2), D-galactal, Traumatic acid, Adenine, PC(22:2(13Z,16Z)/15:0), 2-Phenylethyl beta-D-glucopyranoside, PG(37:2), Glycerol tributanoate, Arg-Leu-Pro-Arg, 2-O-p-Coumaroyl-D-glucose, 3,4-Dihydroxyphenyllactic acid methyl ester, PG(P-28:0), PG(34:0), L-Lysine, Ribitol, LysoPE(18:2(9Z,12Z)/0:0), PA(20:4(5Z,8Z,11Z,14Z)e/2:0), 5-Dehydroshikimate, Threoninyl-Isoleucine, L-Methionine, PS(26:0)), alpha-Pinene, Fenchene, Glu-Ile-Ile-Phe, Gln-Phe-Phe-Phe, Ursodeoxycholic acid, PC(34:2), 3,17-Androstanediol glucuronide, Pyridoxamine, [ST hydrox] (25R)-3alpha,7alpha-dihydroxy-5beta-cholestan-27-oyl taurine, PA(42:2), [FA (16:0)] 2-bromo-hexadecanal, 
3,6-Dihydro-4-(4-methyl-3-pentenyl)-1,2-dithiin, 3-Methylcrotonylglycine, xi-7-Hydroxyhexadecanedioic acid, Camphene, 2-Hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate, 7C-aglycone, 1-(3-Aminopropyl)-4-aminobutanal, Benzyl isobutyrate, (S)-(E)-8-(3,6-Dimethyl-2-heptenyl)-4′,5,7-trihydroxyflavanone, 1,3-di-(5Z,8Z,11Z,14Z,17Z-eicosapentaenoyl)-2-hydroxy-glycerol (d5), SM(d18:0/18:0), L-Homoserine, 17beta-(Acetylthio)estra-1,3,5(10)-trien-3-ol acetate, [ST (2:0)] 5beta-Chola-3,11-dien-24-oic Acid, PG(33:2), PE(22:4(7Z,10Z,13Z,16Z)/P-16:0), Protoporphyrinogen IX, alpha-Tocopherol succinate, Methyl (9Z)-6′-oxo-6,5′-diapo-6-carotenoate, PG(16:1(9Z)/16:1(9Z)), PC(o-22:1(13Z)/20:4(8Z,11Z,14Z,17Z)), PG(31:2), alpha-phellandrene, [PS (12:0/13:0)] 1-dodecanoyl-2-tridecanoyl-sn-glycero-3-phosphoserine (ammonium salt), Glu-Asp-Asp, PG(33:1), PA(0-20:0/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), [FA oxo(19:0)] 18-oxo-nonadecanoic acid, PG(16:1(9Z)/18:0), Leu-Val, demethylmenaquinone-6, PC(o-16:1(9Z)/14:1(9Z)), PG(P-32:0), (24E)-3beta,15alpha,22S-Triacetoxylanosta-7,9(11),24-trien-26-oic acid, PA(33:5), LysoPC(0:0/18:0), Ile-Arg-Ile, Lauryl acetate, Glu-Glu-Gly-Tyr, 3-(Methylthio)-1-propanol, (−)-(E)-1-(4-Hydroxyphenyl)-7-phenyl-6-hepten-3-ol, Dimethyl benzyl carbinyl butyrate and/or Methyl 2,3-dihydro-3,5-dihydroxy-2-oxo-3-indoleacetic acid. In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting differential abundance of one or more fecal metabolites selected from the list in table 8. In certain embodiments, the method of the invention comprises detecting at least 2, 5, 10, 15 or 20 or all of these metabolites. In some embodiments, the method comprises detecting a precursor or breakdown product of the above metabolites.
In certain embodiments, the abundance of metabolites is significantly increased in patients with IBS, for example as shown in table 8. In one embodiment, bile acids are significantly more abundant in patients with IBS. In a particular embodiment, [ST hydroxy] (25R)-3alpha,7alpha-dihydroxy-5beta-cholestan-27-oyl taurine is detected or measured; it is significantly more abundant in patients with IBS. In a particular embodiment, [ST (2:0)] 5beta-Chola-3,11-dien-24-oic acid is detected or measured; it is significantly more abundant in patients with IBS. In a particular embodiment, UDCA is detected or measured; it is significantly more abundant in patients with IBS. In another embodiment, amino acids, for example tyrosine and/or lysine, are significantly more abundant in patients with IBS. In particular embodiments, the method of the invention comprises detecting or quantifying the levels of tyrosine or lysine in a sample and diagnosing IBS. In certain embodiments, the abundance of metabolites is significantly decreased in patients with IBS, for example as shown in table 8.
In one embodiment, the present invention provides a method for diagnosing IBS, comprising detecting one or more fecal metabolites that are differentially abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS). In a preferred embodiment, the one or more fecal metabolites that are differentially abundant in patients suffering from IBS are sulfate, glucuronide, carnitine, glycine and glutamine conjugates. In one embodiment, the method comprises detecting metabolites involved in phase 2 metabolism, which is upregulated in patients with IBS. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In one embodiment, the present invention provides a method for diagnosing IBS-D (IBS associated with diarrhoea), comprising detecting one or more fecal metabolites that are differentially abundant in patients suffering from IBS-D. In one embodiment, bile acids are differentially abundant in patients with IBS-D. In one embodiment, total bile acid, secondary bile acids, sulphated bile acids, UDCA and/or conjugated bile acids are differentially abundant in patients with IBS-D. In a particular embodiment, total bile acid is differentially abundant in patients with IBS-D. In a particular embodiment, secondary bile acids are differentially abundant in patients with IBS-D. In a particular embodiment, sulphated bile acids are differentially abundant in patients with IBS-D. In a particular embodiment, UDCA is differentially abundant in patients with IBS-D. In a particular embodiment, conjugated bile acids are differentially abundant in patients with IBS-D. In any such embodiments, detecting the metabolite comprises measuring the concentration of the metabolite in a sample, for example the concentration relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
Methods of Detecting Urine Metabolites
GC/LC-MS
Metabolites may be detected by any suitable method known in the art. In one embodiment, urine metabolites that are differentially abundant in patients suffering from IBS compared to a healthy control (i.e. from one or more subjects who do not suffer from IBS) are detected using GC/LC-MS.
In a particular embodiment, GC/LC-MS is preferably used for detecting urine metabolites that are predictive of IBS. For urine metabolomics, the values of metabolites may be normalized with reference to urine creatinine levels in each sample.
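By way of illustration only, the creatinine normalization mentioned above may be sketched as follows. The function name, the metabolite names and the numeric values are hypothetical and are not taken from the examples; the sketch simply divides each metabolite value by the creatinine level measured in the same urine sample.

```python
def normalize_to_creatinine(metabolites, creatinine):
    """Divide each metabolite value by the creatinine level of the same urine sample."""
    if creatinine <= 0:
        raise ValueError("creatinine level must be positive")
    return {name: value / creatinine for name, value in metabolites.items()}


# Hypothetical intensities for one urine sample (arbitrary units).
sample = {"metabolite_A": 120.0, "metabolite_B": 30.0}
normalized = normalize_to_creatinine(sample, creatinine=4.0)
print(normalized)  # {'metabolite_A': 30.0, 'metabolite_B': 7.5}
```

Normalizing each sample by its own creatinine level is intended to reduce differences between samples that arise only from urine dilution.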
FAIMS (High Field Asymmetric Waveform Ion Mobility Spectrometry)
In one embodiment, urine metabolites that are differentially abundant in patients suffering from IBS are detected using FAIMS. In a particular embodiment, FAIMS is preferably used for detecting urine metabolites that are predictive of IBS. For urine metabolomics, the values of metabolites may be normalized with reference to urine creatinine levels in each sample.
Ion mobility spectrometry (IMS) is a well-known technique for analysing ion separation in the gaseous phase based on differences in ion mobilities under the influence of an electric field. Field Asymmetric Ion Mobility Spectrometry (FAIMS) is a specific example of an IMS technique that uses a high voltage asymmetric waveform at radio frequency combined with a static compensation voltage applied between two electrodes to separate ions at atmospheric pressure. Different ions pass through the electric fields to a detector at different compensation voltages. Thus, by varying the compensation voltage, a FAIMS analyser can detect the presence of different ions in the sample. The FAIMS instrument benefits from small size and lack of pumping requirements, allowing for portability as a standalone instrument. FAIMS is described in more detail in reference (20).
The FAIMS output consists of two modes: a positive mode (for positively charged ions) and a negative mode (for negatively charged ions). Each of these modes is made up of 51 dispersion fields (DFs), totaling 102 DFs taking both modes into account. Each DF is applied to the testing sample following the principle of linear sweep voltammetry, i.e. the compensation voltage is varied from a starting value to an end value, separated by 512 equally spaced voltages. The ion current value at each of the equally spaced voltages is measured. Each pair of compensation voltage and measured ion current can be referred to as a data point. Across all dispersion fields for both the positive and negative modes, there are 52224 data points.
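The data layout described above can be checked with a short sketch. The sweep limits used below (-6.0 and 6.0) are hypothetical placeholders, and the sketch assumes that the 512 equally spaced voltages include both the starting and the end value; only the counts of modes, dispersion fields and voltages are taken from the description.

```python
MODES = ("positive", "negative")
DISPERSION_FIELDS_PER_MODE = 51
VOLTAGE_STEPS = 512


def compensation_voltages(start, end, steps=VOLTAGE_STEPS):
    """Return `steps` equally spaced compensation voltages from start to end."""
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


# Hypothetical sweep limits; the actual values depend on the instrument settings.
voltages = compensation_voltages(-6.0, 6.0)

total_dfs = len(MODES) * DISPERSION_FIELDS_PER_MODE
total_points = total_dfs * len(voltages)
print(total_dfs, total_points)  # 102 52224
```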
Previous applications of FAIMS have used the method to study gastrointestinal toxicity, bile acid diarrhoea, and colorectal cancer. For example, PCT application WO 2016/038377 describes a method for diagnosing coeliac disease or bile acid diarrhoea by analysing the concentration of a signature compound in a body sample from a test subject using FAIMS and comparing this concentration with a reference for the concentration of the signature compound in an individual who does not suffer from the disease. An increase in the concentration of the signature compound in the body sample from the test subject compared to the reference suggests that the subject is suffering from the disease being screened for, or has a pre-disposition thereto, or provides a negative prognosis of the subject's condition.
In use, the FAIMS analyser is operated by running the device with air (no sample) and water, to clean the analyser. A urine sample is then introduced to obtain the signals. The FAIMS analyser is operated with water and then with air again before the next test sample is run. The signals from all of the dispersion fields are then aligned using cross-correlation.
In some embodiments, the method of diagnosing IBS of the present invention is a computer-implemented method. In a preferred embodiment, the computer-implemented method is a method for analysing a FAIMS profile of a urine sample to determine the presence or absence of IBS and/or classify the urine sample into an IBS subset. The method comprises:
Advantageously, by applying signal smoothing to the received signals, the raw signal strength is retained while reducing the ‘noise’ in the signal. By trimming the signal, noise is reduced, improving the quality of the output and reducing technical artefacts between runs caused by cross-contamination and carry-over signals.
Overall, the method retains more features for analysis compared to the prior art method, which, in the context of a diagnostic application, improves the capability to distinguish between populations and stratify subgroups within a population.
Preferably, pre-processing the obtained signals comprises all three steps of smoothing the signals, trimming off baseline noise from the signals, and aligning the signals in regions of interest.
Obtaining the FAIMS signal may comprise analysing the biological sample with a FAIMS system to produce a signal corresponding to the FAIMS profile of the biological sample.
Preferably, the signal smoothing is performed using a Savitzky-Golay filter, as described in Anal. Chem., 36(8), 1964, Savitzky A., Golay M J E. “Smoothing and Differentiation of Data by Simplified Least Squares Procedures”, pages 1627-1639 (21). Using a Savitzky-Golay filter is advantageous because it keeps the peak signal values intact, which can improve the accuracy of the classification. The signal smoothing may be applied to the dispersion fields of both positive and negative modes of the signal.
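As a minimal sketch of why a Savitzky-Golay filter tends to preserve peak values, the following compares it with a plain moving average on a synthetic peak. The window length (5 points) and polynomial order (quadratic) are assumptions chosen for illustration, since no filter parameters are specified here; the coefficients (-3, 12, 17, 12, -3)/35 are the standard 5-point quadratic smoothing coefficients. Leaving the two points at each edge unchanged is a simplification.

```python
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)  # 5-point quadratic smoothing weights
SG_NORM = 35.0


def savgol_smooth_5(signal):
    """Smooth with a 5-point quadratic Savitzky-Golay filter (edges left as-is)."""
    out = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return out


def moving_average_5(signal):
    """Smooth with a 5-point moving average (edges left as-is), for comparison."""
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(signal[i - 2:i + 3]) / 5.0
    return out


# Synthetic parabolic peak with a maximum of 10.0 at index 5.
signal = [10.0 - (x - 5) ** 2 for x in range(11)]
smoothed = savgol_smooth_5(signal)
averaged = moving_average_5(signal)
print(smoothed[5], averaged[5])  # 10.0 8.0
```

On this synthetic peak the Savitzky-Golay output keeps the maximum at 10.0, whereas the moving average lowers it to 8.0.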
The signal trimming may be performed using an optimised baseline cut-off. The signal alignment may be performed using cross correlation.
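The trimming and alignment steps may be illustrated by the sketch below. It assumes that trimming sets values below a cut-off to zero and that alignment shifts a signal by the lag that maximises its cross-correlation with a reference signal; the cut-off value, the maximum lag and the example signals are hypothetical, and how the cut-off is optimised is not shown.

```python
def trim_baseline(signal, cutoff):
    """Set values below the baseline cut-off to zero."""
    return [v if v >= cutoff else 0.0 for v in signal]


def best_lag(reference, signal, max_lag):
    """Return the lag (in points) that maximises the cross-correlation."""
    def xcorr(lag):
        total = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(signal):
                total += r * signal[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=xcorr)


def align(reference, signal, max_lag):
    """Shift the signal so that it best overlaps the reference (zero-padded)."""
    lag = best_lag(reference, signal, max_lag)
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


# Hypothetical signals: the raw signal is the reference peak shifted by two
# points, with low-level baseline noise added.
reference = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0]
raw = [0.2, 0.1, 0.3, 0.1, 1.0, 5.0, 1.0, 0.2]

trimmed = trim_baseline(raw, cutoff=0.5)
aligned = align(reference, trimmed, max_lag=3)
print(best_lag(reference, trimmed, max_lag=3), aligned)
```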
Selection of features from the signals may be performed using a linear regression model, for example LASSO. LASSO is described in more detail in Journal of the Royal Statistical Society, Series B, 58(1), 1996, R. Tibshirani, “Regression Shrinkage and Selection via the Lasso”, pages 267-288 (22).
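For illustration, LASSO feature selection can be sketched with a small coordinate-descent solver. This is a simplified stand-in written in plain Python and is not the implementation used in the examples; the data, the penalty value and the function names are hypothetical. Features whose coefficients are shrunk exactly to zero are discarded, and the remaining features are retained for classification.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Hypothetical data: only the first of three features drives the response.
X = [
    [1.0, 0.1, 1.0],
    [-1.0, 0.2, 1.0],
    [2.0, -0.1, -1.0],
    [-2.0, -0.2, -1.0],
    [0.5, 0.1, 1.0],
    [-0.5, -0.1, -1.0],
]
y = [2.0 * row[0] for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(selected)  # [0]
```

The penalty `lam` controls how many features survive: larger values shrink more coefficients to zero.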
The trained classifier is preferably a support vector machine. Alternatively, the classifier may be a random forest. In a preferred embodiment, the classifier is a random forest.
Integrative Analysis of Diet, Microbiome and Metabolome in IBS Patients
In certain embodiments, the invention provides a method of diagnosing IBS comprising one or more of i) detecting a bacterial species, for example as discussed above, ii) detecting genes involved in one or more of the pathways, for example as discussed above, iii) detecting metabolites, for example as discussed above. In any such embodiments, detecting the bacteria, gene or metabolite comprises measuring the abundance or concentration of said marker in a sample, for example relative to a corresponding sample from a control (non-IBS) individual or relative to a reference value.
In one embodiment, the invention provides a method of diagnosing IBS, comprising detecting the depletion of a bacterial species. In one embodiment, the depleted bacterial species is one or more of the following: Paraprevotella species, Bacteroides species, Barnesiella intestinihominis, Eubacterium eligens, Ruminococcus lactaris, Eubacterium biforme, Desulfovibrio desulfuricans, Coprococcus species and Eubacterium species. In certain embodiments, the method of the invention comprises detecting one or more of Paraprevotella species, Bacteroides species, Barnesiella intestinihominis, Eubacterium eligens, Ruminococcus lactaris, Eubacterium biforme, Desulfovibrio desulfuricans, Coprococcus species and Eubacterium species.
In one embodiment, the invention provides a method of diagnosing IBS, comprising detecting the differential utilisation of dietary components. In a particular embodiment, the invention provides a method of diagnosing IBS, comprising detecting the differential utilisation of a high protein diet.
In one embodiment, the invention provides a method of diagnosing IBS, comprising detecting higher levels of peptides and amino acids. In another embodiment, the invention provides a method of diagnosing IBS, comprising detecting increased levels of L-alanine, L-lysine, L-methionine, L-phenylalanine and/or tyrosine.
In one embodiment, the invention provides a method of diagnosing IBS, comprising detecting increased levels of bile acids. In a particular embodiment, the invention provides a method of diagnosing IBS, comprising detecting increased levels of UDCA, sulfolithocholylglycine and [ST hydroxy] (25R)-3alpha,7alpha-dihydroxy-5beta-cholestan-27-oyl taurine and/or Iurobilin.
In one embodiment, the invention provides a method of diagnosing IBS, comprising detecting increased levels of metabolites. In another embodiment, the invention provides a method of diagnosing IBS, comprising detecting increased levels of allantoin, cis-4-decenedioic acid, decanoylcarnitine and/or dodecanedioylcarnitine.
Diagnostic Methods
The inventors have developed new and improved methods for diagnosing IBS.
In preferred embodiments, the methods of the invention are for use in diagnosing a patient resident in Europe, such as Northern Europe, preferably Ireland, or a patient that has a European, Northern European or Irish diet. The examples demonstrate that the methods of the invention are particularly effective for such patients.
In certain embodiments of any aspect of the invention, the abundance of bacteria, genes or metabolites is assessed relative to control (non-IBS) individuals. In preferred embodiments, the abundance of urine metabolites is assessed relative to control (non-IBS) individuals. Such reference values may be generated using any technique established in the art.
In certain embodiments of any aspect of the invention, comparison to a corresponding sample from a control (non-IBS) individual is a comparison to a corresponding sample from a healthy individual.
Preferably the method of diagnosing IBS has a sensitivity of greater than 40% (e.g. greater than 45%, 50% or 52%, e.g. 53% or 58%) and a specificity of greater than 90% (e.g. greater than 93% or 95%, e.g. 96%).
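For clarity, sensitivity and specificity as referred to above can be computed from true and predicted classifications as sketched below. The labels and counts are hypothetical and chosen only to give round numbers; they are not results from the examples.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="IBS"):
    """Return (sensitivity, specificity) for a two-class prediction."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: 10 IBS patients (6 detected) and 25 controls (1 misclassified).
truth = ["IBS"] * 10 + ["Control"] * 25
predicted = ["IBS"] * 6 + ["Control"] * 4 + ["IBS"] * 1 + ["Control"] * 24

sensitivity, specificity = sensitivity_specificity(truth, predicted)
print(sensitivity, specificity)  # 0.6 0.96
```

In this hypothetical cohort the method would satisfy both stated thresholds, with a sensitivity above 40% and a specificity above 90%.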
In certain embodiments, the method of diagnosis is a method of monitoring the course of treatment for IBS.
In certain embodiments, the step of detecting the presence or abundance of bacteria, such as in a fecal sample, comprises a nucleic acid based quantification methodology, for example 16S rRNA gene amplicon sequencing. Methods for qualitative and quantitative determination of bacteria in a sample using 16S rRNA gene amplicon sequencing are described in the literature and will be known to a person skilled in the art. Other techniques may involve PCR, rtPCR, qPCR, high throughput sequencing, metatranscriptomic sequencing, or 16S rRNA analysis.
In alternative aspects of any embodiment of the invention, the invention provides a method for diagnosing the risk of developing IBS.
In any embodiment of the invention, modulated abundance of a bacterial strain, species, metabolite or gene pathway is indicative of IBS. In preferred embodiments, the abundance of the bacterial strain, species or OTU as a proportion of the total microbiota in the sample is measured to determine the relative abundance of the strain, species or OTU. In preferred embodiments, the concentration of a metabolite is measured, in particular a urine metabolite. In preferred embodiments, the abundance of bacterial strains carrying a gene pathway of interest as a proportion of the total microbiota in the sample is measured to determine the relative abundance of the strains, or concentrations of gene sequences are measured. Then, in such preferred embodiments, the relative abundance of the bacterium or OTU or the concentration of the metabolite or gene sequence in the sample is compared with the relative abundance or concentration in the same sample from a control (non-IBS) individual. A difference in relative abundance of the bacterium or OTU in the sample, e.g. a decrease or an increase, compared to the reference is a modulated relative abundance. As explained herein, detection of modulated abundance can also be performed in an absolute manner by comparing sample abundance values with absolute reference values. Therefore, the invention provides a method of determining IBS status in an individual comprising the step of assaying a biological sample from the individual for a relative abundance of one or more IBS-associated bacteria and/or a modulated concentration of a metabolite or gene pathway, wherein a modulated relative abundance of the bacteria or modulated concentration of a metabolite or gene pathway is indicative of IBS. 
Similarly, the invention provides a method of determining whether an individual has an increased risk of having IBS comprising the step of assaying a biological sample from the individual for a relative abundance of one or more IBS-associated oral bacteria or IBS-associated metabolites or gene pathways, wherein modulated relative abundance or concentration is indicative of an increased risk.
In any embodiment of the invention, detecting a bacterium may comprise detecting “modulated relative abundance”. As used herein, the term “modulated relative abundance” as applied to a bacterium or OTU in a sample from an individual should be understood to mean a difference in relative abundance of the bacterium or OTU in the sample compared with the relative abundance in the same sample from a control (non-IBS) individual (hereafter “reference relative abundance”). In one embodiment, the bacterium or OTU exhibits increased relative abundance compared to the reference relative abundance. In one embodiment, the bacterium or OTU exhibits decreased relative abundance compared to the reference relative abundance. Detection of modulated abundance can also be performed in an absolute manner by comparing sample abundance values with absolute reference values. In one embodiment, the reference abundance values are obtained from age and/or sex matched individuals. In one embodiment, the reference abundance values are obtained from individuals from the same population as the sample (e.g. Celtic origin, North African origin, Middle Eastern origin). Methods of isolating bacteria from oral and fecal samples are routine in the art and are further described below, as are methods for detecting abundance of bacteria. Any suitable method may be employed for isolating specific species or genera of bacteria, which methods will be apparent to a person skilled in the art. Any suitable method of detecting bacterial abundance may be employed, including agar plate quantification assays, fluorimetric sample quantification, qPCR, 16S rRNA gene amplicon sequencing, and dye-based metabolite depletion or metabolite production assays.
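The notion of modulated relative abundance may be illustrated by the following sketch, in which read counts for a sample are converted to proportions of the total and compared with reference relative abundances. The taxa, counts, reference values and the tolerance used to call a difference are all hypothetical; no particular threshold is prescribed here.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into proportions of the total."""
    total = sum(counts.values())
    return {taxon: count / total for taxon, count in counts.items()}


def modulation(sample_rel, reference_rel, taxon, tolerance=0.01):
    """Classify a taxon as increased, decreased or unchanged versus the reference."""
    difference = sample_rel[taxon] - reference_rel[taxon]
    if difference > tolerance:
        return "increased"
    if difference < -tolerance:
        return "decreased"
    return "unchanged"


# Hypothetical counts for one sample and hypothetical reference proportions.
sample_counts = {"Bacteroides": 200, "Ruminococcus": 300, "Other": 500}
reference_rel = {"Bacteroides": 0.35, "Ruminococcus": 0.30, "Other": 0.35}

sample_rel = relative_abundance(sample_counts)
for taxon in ("Bacteroides", "Ruminococcus"):
    print(taxon, modulation(sample_rel, reference_rel, taxon))
```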
Stratifying Patients
In certain embodiments, the methods of the invention are for use in stratifying patients according to the type of IBS that they are suffering from. In particular, in certain embodiments, the methods of the invention are for diagnosing a patient suffering from IBS as having a normal-like microbiota (i.e. a microbiota composition similar to the microbiota composition of a person without IBS), or an altered microbiota (i.e. a microbiota dissimilar to the microbiota of a person without IBS) (see Jeffery I B, O'Toole P W, Ohman L, Claesson M J, Deane J, Quigley E M, Simren M. 2012. “An irritable bowel syndrome subtype defined by species-specific alterations in fecal microbiota.” Gut 61:997-1006 (23)). Patients suffering from IBS with a normal-like microbiota may benefit from different treatments compared to patients with an altered microbiota, so the methods of the invention may result in more appropriate treatment strategies and better outcomes for patients. Therefore, in certain embodiments, the methods of the invention comprise developing and/or recommending a treatment plan for a patient based on their microbiota. IBS patients with normal-like microbiota may benefit from treatments known to ameliorate anxiety or depression. IBS patients with an altered microbiota may benefit from treatments able to instigate beneficial changes in the microbiota and/or address dysbiosis, such as live biotherapeutic products, in particular compositions comprising Blautia hydrogenotrophica (as described in WO2018109461). IBS patients with an altered microbiota may also benefit from diet adjustments, such as a FODMAP (fermentable oligo-, di-, monosaccharides and polyols) diet. Compositions comprising Blautia hydrogenotrophica are also effective for treating visceral hypersensitivity (as described in WO2017148596), which patients with normal-like microbiota may experience, so such compositions will also be useful for treating such patients.
In certain embodiments, the invention provides a method for stratifying patients suffering from IBS into subgroups based on their microbiome and/or metabolome. In a particular embodiment, the method of the invention comprises detecting one or more bacterial strains belonging to at least one genus selected from the group consisting of: Anaerostipes, Anaerotruncus, Anaerofilum, Bacteroides, Blautia, Eggerthella, Streptococcus, Gordonibacter, Holdemania, Ruminococcus, Veillonella, Akkermansia, Alistipes, Barnesiella, Butyricicoccus, Butyricimonas, Clostridium, Coprococcus, Faecalibacterium, Haemophilus, Howardella, Methanobrevibacter, Oscillibacter, Prevotella, Pseudoflavonifractor, Roseburia, Slackia, Sporobacter and Victivallis. In a particular embodiment, the method of the invention comprises detecting bacterial species which may belong to Clostridium clusters IV, XI or XVIII. In a particular embodiment, the method of the invention comprises detecting bacterial strains which may include one or more of the following species: Anaerostipes hadrus, Bacteroides ovatus, Bacteroides thetaiotaomicron, Clostridium asparagiforme, Clostridium bolteae, Clostridium hathewayi, Clostridium symbiosum, Coprococcus comes, Ruminococcus gnavus, Streptococcus salivarius, Ruminococcus torques, Alistipes senegalensis, Eubacterium eligens, Eubacterium siraeum, Faecalibacterium prausnitzii, Roseburia hominis, Haemophilus parainfluenzae, Ruminococcus callidus, Veillonella parvula and Coprococcus sp. ART55/1. In a particular embodiment, the method of the invention comprises detecting one or more of the following bacterial strains: Lachnospiraceae bacterium 3_1_46FAA, Lachnospiraceae bacterium 5_1_63FAA, Lachnospiraceae bacterium 7_1_58FAA and Lachnospiraceae bacterium 8_1_57FAA. In a particular embodiment, the method of the invention comprises detecting bacterial taxa selected from tables 17, 18, 19 and/or 20.
In certain embodiments, the method of the invention comprises detecting a metabolite associated with an IBS subgroup. In certain embodiments, the metabolite is detected in a fecal sample. In certain embodiments, the metabolite is detected in a urine sample.
In certain embodiments, the invention provides a method of assessing whether a patient suffering from IBS would benefit from a treatment able to instigate beneficial changes in the microbiota and/or address dysbiosis, such as a live biotherapeutic product. In a particular embodiment, the method of the invention comprises detecting one or more bacterial strains belonging to at least one genus selected from the group consisting of: Anaerostipes, Anaerotruncus, Anaerofilum, Bacteroides, Blautia, Eggerthella, Streptococcus, Gordonibacter, Holdemania, Ruminococcus, Veillonella, Akkermansia, Alistipes, Barnesiella, Butyricicoccus, Butyricimonas, Clostridium, Coprococcus, Faecalibacterium, Haemophilus, Howardella, Methanobrevibacter, Oscillibacter, Prevotella, Pseudoflavonifractor, Roseburia, Slackia, Sporobacter and Victivallis. In a particular embodiment, the method of the invention comprises detecting bacterial species which may belong to Clostridium clusters IV, XI or XVIII. In a particular embodiment, the method of the invention comprises detecting bacterial strains which may include one or more of the following species: Anaerostipes hadrus, Bacteroides ovatus, Bacteroides thetaiotaomicron, Clostridium asparagiforme, Clostridium bolteae, Clostridium hathewayi, Clostridium symbiosum, Coprococcus comes, Ruminococcus gnavus, Streptococcus salivarius, Ruminococcus torques, Alistipes senegalensis, Eubacterium eligens, Eubacterium siraeum, Faecalibacterium prausnitzii, Roseburia hominis, Haemophilus parainfluenzae, Ruminococcus callidus, Veillonella parvula and Coprococcus sp. ART55/1. In a particular embodiment, the method of the invention comprises detecting one or more of the following bacterial strains: Lachnospiraceae bacterium 3_1_46FAA, Lachnospiraceae bacterium 5_1_63FAA, Lachnospiraceae bacterium 7_1_58FAA and Lachnospiraceae bacterium 8_1_57FAA.
In a particular embodiment, the method of the invention comprises detecting bacterial taxa selected from tables 17, 18, 19 and/or 20. In certain embodiments, the method of the invention comprises detecting a metabolite associated with an IBS subgroup. In certain embodiments, the metabolite is detected in a fecal sample. In certain embodiments, the metabolite is detected in a urine sample.
In certain embodiments, the method of the invention comprises identifying a subgroup which is characterised by an altered microbiome and/or metabolome relative to healthy control subjects. In certain embodiments, the method of the invention comprises identifying a subgroup which is characterised by a microbiome and/or metabolome similar to healthy control subjects. In certain embodiments, the methods of the invention are for use in classifying a patient suffering from IBS into a subgroup based on their microbiome. In certain embodiments, the methods of the invention are for use in determining whether a patient suffering from IBS would benefit from a treatment able to instigate beneficial changes in the microbiota and/or address dysbiosis, such as live biotherapeutic products. In certain embodiments, it may be deemed that a patient suffering from IBS would benefit from a treatment able to instigate beneficial changes in the microbiota and/or address dysbiosis, such as live biotherapeutic products, if said patient is classified as belonging to a subgroup characterised by an altered microbiome and/or metabolome relative to healthy control subjects. In certain embodiments, it may be deemed that a patient suffering from IBS would not benefit from a treatment able to instigate changes in the microbiota and/or address dysbiosis, such as live biotherapeutic products, if said patient is classified as belonging to a subgroup characterised by a similar microbiome and/or metabolome to healthy control subjects.
Kits
The invention also provides kits comprising reagents for performing the methods of the invention, such as kits containing reagents for detecting one or more, such as two or more of the bacterial species, genes or metabolites set out above. As such, provided are kits that find use in practicing the subject methods of diagnosing IBS, as mentioned above. The kit may be configured to collect a biological sample, for example a urine sample or a fecal sample. In a preferred embodiment, the kit is configured to collect a urine sample. The individual may be suspected of having IBS. The individual may be suspected of being at increased risk of having IBS. A kit can comprise a sealable container configured to receive the biological sample. A kit can comprise polynucleotide primers. The polynucleotide primers may be configured for amplifying a 16S rRNA polynucleotide sequence from at least one IBS-associated bacterium to form an amplified 16S rRNA polynucleotide sequence. A kit may comprise a detecting reagent for detecting the amplified 16S rRNA sequence. A kit may comprise instructions for use.
Background & Aims: Diagnosis and stratification of irritable bowel syndrome (IBS) are based on symptoms and the exclusion of other diseases. Whether the pathogenesis begins centrally and/or at the end organ is unclear. Some patients have an alteration in their microbiota. Therefore, microbiome and metabolomic profiling was conducted to identify biomarkers for the condition.
To work toward an evidence-based stratification of patients with IBS, a metagenomic study of fecal samples was performed, along with metabolomic analyses of urine and faeces in patients with IBS (according to the Rome IV criteria) in comparison with controls. Microbiome and metabolomic signatures are evident in IBS but these are independent of the traditional clinical symptom-based subsets of IBS (IBS-D vs IBS-C, IBS-alternating or mixed).
Methods: 80 patients with IBS (Rome IV) and 65 non-IBS controls were enrolled. Anthropometric, medical and dietary information was collected, together with fecal and urine samples for microbiome and metabolomic analyses. Shotgun and 16S rRNA amplicon sequencing were performed on feces, and urine and fecal metabolites were analysed by gas chromatography (GC) and liquid chromatography (LC) mass spectrometry (MS).
Results: Differential connections between diet and the microbiome with alterations of the metabolome were evident in IBS. Microbiota composition and predicted microbiome function in patients with IBS differed significantly from those of controls, but these were independent of IBS-symptom subtypes. Fecal metabolomic profiles also differed significantly between IBS patients and controls and were discriminatory for the condition. The urine metabolome contained an array of predictive metabolites but was mainly dominated by dietary and medication-related metabolites.
Conclusion: Despite clinical heterogeneity, IBS can be identified by species-level, metagenomic and fecal metabolomic signatures which are independent of symptom-based subtypes of IBS. These findings are useful for diagnosing IBS and for developing precision therapeutics for IBS.
Materials and Methods
Subject recruitment: Eighty patients aged 16-70 years with IBS meeting the Rome IV criteria were recruited at Cork University Hospital. Clinical subtyping of the patients (15) was as follows: IBS with constipation (IBS-C), mixed IBS (IBS-M) or IBS with diarrhea (IBS-D). Sixty-five controls of the same age range and of the same ethnicity and geographic region were recruited. Descriptive statistics for the study population are presented in Table 10.
Exclusion criteria included the use of antibiotics within 6 weeks prior to study enrolment, other chronic illnesses including gastrointestinal diseases, severe psychiatric disease, and abdominal surgery other than hernia repair or appendectomy. Standard-of-care blood analysis was carried out on all participants if recent results were not available, and all subjects were tested by serology to exclude coeliac disease. The inclusion/exclusion criteria for the control population were the same as for the IBS population with the exception of having to fulfil the Rome IV criteria for IBS. Gastrointestinal (GI) symptom history, psychological symptoms, diet, medical history and medication data were collected on each participant (both IBS and controls) using the following questionnaires: Bristol Stool Score (BSS); Hospital Anxiety and Depression Scale (HADS) (24); Food Frequency Questionnaire (FFQ) (25). Ethical approval for the study was granted by the Cork Research Ethics Committee (protocol number: 4DC001) before commencing the study and all participants provided written informed consent to take part.
Sample collection: Fecal and urine samples were collected from all participants for microbiome and metabolomics profiling. Subjects collected a freshly voided fecal sample at home using a collection kit and brought the sample to the clinic that day, when a fresh urine sample was collected. Samples were kept at 4° C. until brought to the laboratory for storage at −80° C. which was within a few hours of the sample collection.
Microbiome profiling and metagenomics (16S amplicon sequencing): Genomic DNA was extracted and amplified from frozen fecal samples (0.25 g) using the method described by Brown et al. (26). The modifications from the methods described by Brown et al. (26) included bead beating tubes containing 0.5 g of 0.1 mm zirconia beads and 4×3.5 mm glass beads. Fecal samples were homogenised via bead beating for 3×60 s cycles and cooled on ice between each cycle. Genomic DNA was visualised on a 0.8% agarose gel and quantified using the SimpliNano Spectrometer (Biochrom™, US). The PCR master mix used 2× Phusion Taq High-Fidelity Mix (Thermo Scientific, Ireland) and 15 ng of DNA. The resulting PCR products were purified and quantified, and equimolar amounts of each amplicon were then pooled before being sent to the commercial supplier (GATC Biotech AG, Konstanz, Germany). Sequencing was performed by GATC Biotech on an Illumina MiSeq instrument using a 2×250 bp paired-end sequencing run.
Microbiome profiling and metagenomics (16S amplicon sequencing): Using the Qiagen DNeasy Blood & Tissue Kit and following the manufacturer's instructions, microbial DNA was extracted from 0.25 g of each of 144 frozen fecal samples (IBS: n=80; control: n=64). No fecal sample was available for one control subject. The 16S rRNA gene amplicon preparation and sequencing were carried out using the 16S Sequencing Library Preparation Nextera protocol developed by Illumina (San Diego, Calif., USA). 15 ng of each of the DNA fecal extracts was amplified using PCR and primers targeting the V3-V4 variable region of the 16S rRNA gene using the following gene-specific primers:
The amplicon size was 531 bp. The products were purified and forward and reverse barcodes were attached by a second round of adapter PCR.
Microbiome profiling and metagenomics—Shotgun sequencing: For shotgun sequencing, 1 μg (concentration>5 ng/μL) of high molecular weight DNA for each sample was sent to GATC Biotech, Germany for sequencing on Illumina HiSeq platform (HiSeq 2500) using 2×250 bp paired-end chemistry. This returned 2,714,158,144 raw reads (2,612,201,598 processed reads) of which 45.6% were mapped to an average of 222,945 gene families per sample with a mean count value of 8,924,302±2,569,353 per sample.
Bioinformatics analysis (16S amplicon sequencing): Miseq 16S sequencing data was returned for 144 subjects. Data generated for 3 samples (2 IBS and 1 control) were removed as the number of reads returned from sequencing was too low for analysis, leaving 141 samples (control: n=63, IBS n=78). Raw amplicon sequence data were merged and the reads trimmed using the flash methodology (27). The USEARCH pipeline was used to generate the OTU table (28). The UPARSE algorithm was used to cluster the sequences into OTUs at 97% similarity (29). UCHIME chimera removal algorithm was used with Chimeraslayer to remove chimeric sequences (30). The Ribosomal Database Project (RDP) taxonomic classifier was used to assign taxonomy to the representative OTU sequences (28) and microbiota compositional (abundance and diversity) information was generated.
Bioinformatics analysis (Shotgun metagenomic sequencing): For shotgun metagenomics, 6 control samples were excluded because the data did not pass QC or no sample was available (control: n=59; IBS: n=80). The number of raw read pairs obtained after sequencing varied from 5,247,013 to 21,280,723 (mean=9,763,159±2,408,048). Reads were processed in accordance with the Standard Operating Procedure of the Human Microbiome Project (HMP) Consortium (31). Metagenomic composition and functional profiles were generated using the HUMAnN2 pipeline (32). For each sample, multiple profiles were obtained, including: microbial composition profiles from clade-specific gene information (using MetaPhlAn2), gene family abundance, pathways stratified per organism, and total pathway coverage and abundance.
Machine learning: An in-house machine learning pipeline was applied to each datatype (16S, shotgun, and urine and fecal MS metabolomics) using a two-step approach applying Least Absolute Shrinkage and Selection Operator (LASSO) feature selection followed by Random Forest (RF) modelling (33). The models were implemented using R software version 3.4.0, using package glmnet version 2.0-10 for LASSO feature selection and R package randomForest version 4.6-12 (34).
Each variable consisted of data from 78 IBS patients and 64 controls. First, feature selection was performed using the LASSO algorithm to improve the accuracy and interpretability of the models by efficiently selecting the relevant features. This process was tuned by the parameter lambda, which was optimized for each dataset using a grid search. The training data was filtered to include only the features selected by the LASSO algorithm, and RF was then used for modelling, whereby 1500 trees were built. Both LASSO feature selection and RF modelling were performed using 10-fold cross validation (CV) repeated 10 times (10-fold, 10 repeats; R package caret version 6.0-76), which generated an internal 10-fold prediction yielding an optimal model that predicts the IBS or Control classification of samples. This 10-fold cross-validation procedure was repeated ten times and the average area under the curve (AUC), sensitivity and specificity were reported.
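The repeated cross-validation scheme described above can be sketched as follows. This is a minimal, standard-library illustration and not the R/caret pipeline itself: the function names and the `fit_predict` callback (which stands in for the LASSO feature selection and RF modelling steps) are hypothetical, and the stratified fold assignment is an assumption.

```python
import random
from statistics import mean

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)
    return folds

def auc(scores, labels, positive="IBS"):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    Requires at least one sample of each class."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_cv_auc(features, labels, fit_predict, k=10, repeats=10, seed=0):
    """Average test-fold AUC over `repeats` runs of k-fold cross-validation.
    fit_predict(train_X, train_y, test_X) is a hypothetical callback that
    trains a model and returns one score per test sample."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            scores = fit_predict(
                [features[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [features[i] for i in test_idx],
            )
            aucs.append(auc(scores, [labels[i] for i in test_idx]))
    return mean(aucs)
```

Sensitivity and specificity could be averaged over the folds in the same way as the AUC.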
Results
Microbiome Differs Between IBS and Controls but not Across IBS Clinical Subtypes
Microbiota profiling by 16S rRNA amplicon sequencing and Principal Co-Ordinate Analysis (PCoA) of the microbiota composition data confirmed that the microbiota of subjects with IBS was distinct from that of controls (
Machine learning was used to identify bacterial taxa predictive of IBS and control groups (
Machine learning (based on shotgun data) identified 6 genera predictive of IBS which included Lachnospiraceae, Oscillibacter and Coprococcus with an Area under the Curve (AUC) of 0.835 (sensitivity: 0.815 and specificity: 0.704; Table 1).
At the species level, 40 predictive features (AUC of 0.878; sensitivity: 0.894, specificity: 0.687; Table 2) were identified which included Ruminococcus gnavus and Lachnospiraceae spp which were significantly more abundant in IBS, while Barnesiella intestinihominis and Coprococcus catus were among taxa significantly less abundant in IBS based on pairwise comparison (Table 3). These alterations are consistent with previous studies (10-12), where the taxa that were significantly differentially abundant belonged to the Ruminococcaceae, Lachnospiraceae and Bacteroidetes families/genera.
Clinical subtypes of IBS did not separate in a PCoA of microbiota beta diversity derived from 16S profiling data (
Other pathways that were less abundant in the metagenome of subjects with IBS included galactose degradation, sulfate reduction, sulfate assimilation and cysteine biosynthesis, collectively indicative of a reduced sulphur metabolism in IBS. The genes encoding 12 pathways were more abundant in IBS subjects including those for starch degradation V. Of a total of 232 functional pathways that were significantly more abundant in the IBS group, 113 were associated with the Lachnospiraceae family or the Ruminococcus species.
Discussion
A species-level microbiome signature for IBS was identified that included some broad taxonomic groups (lower abundance of Bacteroides species, elevated levels of Lachnospiraceae and Ruminococcus spp.) as well as a list of 32 taxa whose collected abundance values could discriminate between IBS and controls. The ability to distinguish the microbiota of subjects with IBS from controls is superior to that of an earlier study based on a supervised split (10), or one which could not distinguish between control and IBS microbiota (12), but which also reported no statistical difference in the phenotypes of the IBS subjects and controls for rates of anxiety, depression, stool frequency and Bristol stool form. The relatively mild disease symptoms of this IBS cohort (12) may have confounded the identification of a microbiome signature. Supporting this, in a recent study of the gut microbiome in IBS and IBD, microbiome alterations were significantly associated with a physician-diagnosed IBS group but were fewer and of lower significance in the self-diagnosed IBS subgroup (36).
Materials and Methods
Subject recruitment and sample collection were carried out as described in Example 1.
Urine FAIMS: FAIMS analysis was performed using a protocol modified from that of Arasaradnam et al. (37) and described below. Any other appropriate method known in the art for detecting metabolites may be used in the methods of the invention. Frozen (−80° C.) urine samples were thawed overnight at 4° C., and 5 mL of each urine sample was aliquoted into a 20 mL glass vial and placed into an ATLAS sampler (Owlstone, UK) attached to the Lonestar FAIMS instrument (Owlstone, UK). The sample was heated to 40° C. and sequentially run three times.
Each sample run had a flow rate over the sample of 500 mL/min of clean dry air.
Further make-up air was added to create a total flow rate of 2.5 L/min. The FAIMS was scanned from 0 to 99% dispersion field (DF) in 51 steps and from +6 V to −6 V compensation voltage in 512 steps, and both positive and negative ions were detected to produce an untargeted volatile organic compound (VOC) profile for each sample. The signals for each sample at each DF were smoothed using the Savitzky-Golay filter (window size=9, degree=3). The signals were trimmed based on an optimized cut-off of 0.007 for positive mode and −0.007 for negative mode outputs, to obtain the region of interest and reduce the baseline noise. The trimmed signals at each DF were aligned by cross-correlation, using the mean signal as the reference, to make them comparable. Since the initial and higher DFs of the FAIMS signal were non-informative, only signals corresponding to the 17th to 42nd DFs of both the positive and negative modes were considered. These pre-processing steps were performed using customized programs developed in Python v. 2.7.11 with relevant packages (SciPy v. 1.1 and NumPy v. 1.15.2). To further reduce the complexity and retain informative data, kurtosis normality tests were performed on each feature vector, features with a raw p-value >0.1 were kept, and the final profile was generated for the statistical analyses.
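The smoothing, trimming and alignment steps above can be illustrated with a short standard-library sketch. It is not the customized program referred to in the text. The function names are hypothetical, the edge handling of the filter, the reading of "trimming" as finding the index range where the signal magnitude exceeds the cut-off, and the `max_lag` search window are all assumptions. The convolution weights are the standard Savitzky-Golay coefficients for a 9-point window and a cubic polynomial.

```python
# Standard Savitzky-Golay smoothing weights: 9-point window, polynomial degree 3.
SG_COEFFS_9_3 = [c / 231 for c in (-21, 14, 39, 54, 59, 54, 39, 14, -21)]

def savgol_9_3(signal):
    """Smooth a 1-D signal; the first and last 4 points are left unsmoothed
    (an assumption; library implementations usually fit the edges instead)."""
    half = len(SG_COEFFS_9_3) // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS_9_3, window))
    return out

def region_of_interest(signal, cutoff=0.007):
    """Return (start, stop) bounds of the span where |signal| exceeds the cut-off.
    Using the magnitude covers both the +0.007 and -0.007 cut-offs."""
    idx = [i for i, v in enumerate(signal) if abs(v) > cutoff]
    return (idx[0], idx[-1] + 1) if idx else (0, 0)

def best_lag(signal, reference, max_lag=20):
    """Lag (in samples) that maximises the cross-correlation with the reference."""
    def xcorr(lag):
        return sum(signal[i + lag] * reference[i]
                   for i in range(len(reference))
                   if 0 <= i + lag < len(signal))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def align(signal, reference, max_lag=20):
    """Shift the signal by the best lag, padding with zeros."""
    lag = best_lag(signal, reference, max_lag)
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]
```

In the procedure described, the reference would be the mean signal across samples at the same DF.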
Bioinformatics analysis of urine metabolome data (FAIMS): Each urine sample analysed using FAIMS yielded a profile with ca. 52,224 data points. A pooled profile containing these data points for each sample was generated for pre-processing, to reduce the noise, size, and complexity of the data.
Urine GC/LC MS: 5 mL samples of frozen urine were sent on dry ice to Metabolomic Discoveries (now Metabolon), Potsdam, Germany. Untargeted metabolomics analysis was performed using liquid chromatography (LC) and Solid Phase Microextraction (SPME) gas chromatography (GC) and metabolites were identified using electrospray ionization mass spectrometry (ESI-MS). Short chain fatty acids (SCFA) analysis was also performed by LC-tandem mass spectrometry.
For urine metabolomics, the values of metabolites were normalized with reference to urine creatinine levels in each sample.
Bioinformatics analysis of urine metabolome data (MS): Urine MS metabolomics data was returned for all IBS subjects (n=80) and all but 2 controls (n=63) as these did not pass QC or no sample was available. A total of 2,887 metabolites were returned from untargeted urine metabolomics analysis, of which 594 were identified. Only the identified features with peak values normalized by creatinine levels in urine (mg/dl) were considered for further analysis.
Machine learning: An in-house machine learning pipeline was applied to each datatype (in this example, urine MS metabolomics) using a two-step approach applying Least Absolute Shrinkage and Selection Operator (LASSO) feature selection followed by Random Forest (RF) modelling (38), as described in Example 1. The models were implemented using R software version 3.4.0, using package glmnet version 2.0-10 for LASSO feature selection and RF package randomForest version 4.6-12 (34). The ability of urine FAIMS metabolomics to differentiate between health classes was tested using support vector machines (SVM) with a linear kernel, using Python 2.7 and Scikit-Learn (v 0.19.2) (39). Features of the FAIMS profile were selected using the kurtosis normality test. These features were centered and scaled. The samples were split into training and test sets for 10-fold cross validation. Class weights were balanced. Other parameters were set to default. No supervised feature selection was used.
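The centering, scaling and class-weight balancing steps for the FAIMS features can be sketched as below. This is an illustration under assumptions rather than the code used: the function names are hypothetical, and the weight formula is the common "balanced" heuristic (number of samples divided by the number of classes times the class count), which the text does not confirm was the one applied.

```python
from collections import Counter
from statistics import mean, pstdev

def center_and_scale(rows):
    """Z-score each feature column (mean 0, unit population variance).
    Constant columns are only centred, to avoid division by zero."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed 'balanced' heuristic)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}
```

Within cross-validation, the column means and standard deviations would normally be estimated on the training fold only and then applied to the test fold.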
Results
Altered Urine Metabolomes in IBS
Metabolomic analysis was extended to all subjects, focusing initially on urine as a non-invasive test sample. Two methods were compared: High field asymmetric waveform ion mobility spectrometry (FAIMS) analysis for volatile organics, and both GC- and LC-MS.
The FAIMS technique did not identify discriminatory metabolites directly, but separated samples/subjects by characteristic plumes of ionized metabolites. In unsupervised analysis, FAIMS readily identified urine samples from controls and IBS (
GC/LC-MS analysis of the urine metabolome also separated IBS patients from controls (
Machine learning identified four urine metabolomics features predictive of IBS (AUC 0.999; sensitivity: 0.988, specificity: 1.000) which were reflective of dietary components (Table 5). Pairwise comparison of control and IBS urine metabolomes identified 127 differentially abundant features (Table 6). 89 urine metabolites were significantly less abundant in IBS subjects, including a number of amino acids such as L-arginine, a precursor for the biosynthesis of nitric oxide which is associated both with mucosal defence and with IBS pathophysiology (40). Another 38 metabolites were present at significantly higher levels in IBS, including an acylglycine (N-undecanoylglycine) and an acylcarnitine (decanoylcarnitine). Elevated levels of metabolites from these groups are associated with altered fatty acid oxidation/metabolism and disease (41,42,43).
Discussion
Urine metabolomics was highly discriminatory for IBS. The machine learning model showed that the compounds identified were predominantly diet- or medication-associated.
Materials and Methods
Subject recruitment and sample collection were carried out as described in Example 1.
Fecal GC/LC MS: 1 g samples of frozen feces were sent on dry ice to Metabolomic Discoveries (now Metabolon), Potsdam, Germany. For LC-MS, the samples were dried and resuspended to a final concentration of 10 mg per 400 μL before analysis. GC-MS and SCFA analysis were performed using wet samples. Untargeted metabolomics and SCFA analysis were carried out as described previously for urine MS metabolomics.
Bioinformatics analysis of fecal metabolome data: Fecal MS metabolomics data was returned for all IBS subjects (n=80) and all but 2 controls (n=63) as these did not pass QC or no sample was available. 2,933 metabolites were returned from untargeted fecal metabolomics analysis carried out by the service provider of which 753 were identified. Metabolites identified using LC-MS were not normalized, since the fecal samples were already normalized with dry weight (10 mg per 400 μL) during sample preparation. Metabolites identified using GC-MS were normalized with corresponding sample wet weights. Only the identified metabolites were considered for further analyses. Machine learning analysis was carried out as described previously for the urine metabolome. Summary statistics for all datasets were generated using the Wilcoxon rank sum test with q-value adjustment for multiple testing.
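The summary statistics step (rank sum test followed by multiple-testing adjustment) can be sketched as follows. This is a standard-library illustration, not the analysis code used: the function names are hypothetical, the p-value uses the normal approximation without tie or continuity correction, and the Benjamini-Hochberg procedure is used as an assumed stand-in for the q-value adjustment, whose exact method is not stated.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank sum p-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of the first group
    mu = n1 * (n1 + n2 + 1) / 2                # expected rank sum under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(pvalues):
    """Adjusted p-values controlling the false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):               # walk from largest to smallest p
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each metabolite would be tested once (IBS versus control values), and the resulting list of p-values adjusted together.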
Machine learning: An in-house machine learning pipeline was applied to each datatype (in this example, fecal MS metabolomics) using a two-step approach applying Least Absolute Shrinkage and Selection Operator (LASSO) feature selection followed by Random Forest (RF) modelling (38), as described in Example 1. The models were implemented using R software version 3.4.0, using package glmnet version 2.0-10 for LASSO feature selection and RF package randomForest version 4.6-12 (39).
Results
Altered Fecal Metabolomes in IBS
Analysis of the fecal metabolome by GC/LC-MS separated IBS patients from controls (
Machine learning applied to the shotgun species dataset produced a marginally better prediction model for IBS than the fecal metabolomic model (AUC 0.878, sensitivity 0.894 and specificity 0.687) based on 40 predictive species (Table 2). The adenosine ribonucleotide de novo biosynthesis functional pathway was significantly more abundant in 11 of the 32 predictive species which resonates with adenosine being the fourth highest ranked predictive metabolite for IBS.
Pairwise comparison analysis of metabolites identified 128 significantly differentially abundant features, including 77 which were significantly depleted in IBS (Table 8). 51 fecal metabolites were significantly more abundant, including tyrosine and lysine and three bile acids (BAs): [ST hydroxy] (25R)-3alpha,7alpha-dihydroxy-5beta-cholestan-27-oyl taurine; [ST (2:0)] 5beta-Chola-3,11-dien-24-oic acid; and UDCA, which is one of the predictive metabolites for IBS. BAs affect water absorption in the intestine and can lead to diarrhea (45).
The level of bile acid metabolites in the subgroups was analysed and a significant difference was observed in the IBS-D subtype for most bile acid categories (Total BAs, secondary BAs, sulphated BAs, UDCA and conjugated BAs) when compared to the control subjects as shown in Table 9a. These differences were associated with an altered functional potential, reflected by the ursodeoxycholate biosynthesis and glycocholate metabolism pathway gene abundances correlating with the secondary BAs, UDCA and total BA levels (Table 9b). Primary BAs and taurine:glycine conjugated BAs were not significantly different across the groups. Similar findings (in a smaller IBS/control cohort) were reported by Dior and colleagues (46) for secondary BAs, sulphated BAs and UDCA and taurine:glycine conjugated BAs.
Thus the differences in fecal microbiome composition and predicted function in IBS patients and controls are mirrored by differences in the measured metabolome in the two sample types.
Discussion
Here it is shown that the microbiome of patients with IBS is distinct from that of controls and this is reflected in fecal metabolome profiles. However, metagenome and metabolome configurations do not distinguish the so-called clinical subtypes of IBS (IBS-C, -D, -M).
The fecal metabolome correlated well with taxonomic and functional data for the microbiota.
Materials and Methods
Subject recruitment and sample collection were carried out as described in Example 1.
Fecal GC/LC MS: 1 g samples of frozen feces were sent on dry ice to Metabolomic Discoveries (now Metabolon), Potsdam, Germany. For LC-MS, the samples were dried and resuspended to a final concentration of 10 mg per 400 μL before analysis. GC-MS and SCFA analysis were performed using wet samples. Untargeted metabolomics and SCFA analysis were carried out as described previously for urine MS metabolomics.
Bioinformatics analysis of fecal metabolome data: Fecal MS metabolomics data was returned for all IBS subjects (n=80) and all but 2 controls (n=63) as these did not pass QC or no sample was available. 2,933 metabolites were returned from untargeted fecal metabolomics analysis carried out by the service provider of which 753 were identified. Metabolites identified using LC-MS were not normalized, since the fecal samples were already normalized with dry weight (10 mg per 400 μL) during sample preparation. Metabolites identified using GC-MS were normalized with corresponding sample wet weights. Only the identified metabolites were considered for further analyses. Machine learning analysis was carried out as described previously for the urine metabolome. Summary statistics for all datasets were generated using the Wilcoxon rank sum test with q-value adjustment for multiple testing.
Machine learning: An in-house machine learning pipeline was applied to the fecal metabolomic data. The machine learning pipeline used in this example is similar to the machine learning pipeline used in Examples 1 to 3, but comprised additional optimization and validation steps, using a two-step approach within a ten-fold cross-validation. Within each validation fold, Least Absolute Shrinkage and Selection Operator (LASSO) feature selection was carried out, followed by Random Forest (RF) modelling, and the optimised model was validated against the cross-validation test data, which is external to the cross-validation training subset.
The classified fecal metabolome sample profiles were log10 transformed before they were analysed in the machine learning pipeline. The transformed profiles were then used to classify the samples as IBS (80 samples) or Control (63 samples). The classified samples were then analysed in the machine learning pipeline.
After determination of the lambda (λ) range, the samples were assigned weights based on their class probabilities. The weights assigned to the training samples in this step were used in all subsequent applicable steps.
A LASSO algorithm substantially as described in Examples 1 to 3 was then applied to the weighted training samples. In this example, the LASSO algorithm used the previously calculated optimal lambda (λ) range, and used the Caret (version 6.0-84 in this example) and glmnet (version 2.0-18 in this example) packages. The ROC AUC (receiver operating characteristic, area under curve) metric was calculated using 10-fold internal cross validation, repeated 10 times. The feature coefficients identified by the optimized LASSO algorithm were extracted and features with non-zero coefficients were selected for further analysis. In
Following feature selection using LASSO, an optimized random forest classifier (with 1500 trees) was generated using the selected features, or all of the features, as determined by N. This optimised random forest classifier can be used to predict the external test fold. Random forest generation was performed using Caret (version 6.0-84) and internal cross validation, by tuning the ‘mtry’ parameter to maximise the ROC AUC metric. For tuning, if the number of selected features is greater than or equal to 5, mtry ranges from 1 to the square root of the number of selected features or else the range is from 1 to 6. The optimized random forest classifier was then applied to the test set and the performance of the classifier was calculated via the AUC, sensitivity, and specificity metrics.
Both LASSO feature selection and RF modelling were performed within a 10-fold cross validation (CV), which generated an internal 10-fold prediction model that predicts the IBS or control classification of samples. This 10-fold cross-validation procedure was repeated ten times and the average AUC, sensitivity and specificity are reported. The optimized model is then used to predict the cross-validation test subset, and final classifier performance metrics are calculated from across the ten folds of the cross-validation (AUC, Sensitivity and Specificity).
Results
Fecal Metabolome is Predictive of IBS
The optimized random forest classifier was investigated for its predictive ability to classify samples as IBS or Control. External validation was 10-fold cross validation. Internal validation was 10-fold cross validation, repeated 10 times.
The performance summary and feature details are shown in Table 13. Features selected by LASSO having coefficients less than zero are associated with IBS, while positive coefficients are associated with Controls. Overall, for 10 folds, the mean ROC AUC was 0.686 (±0.132). Sensitivity and specificity were 0.737 (±0.181) and 0.476 (±0.122), respectively. Accuracy was 0.622 (±0.095).
The classification threshold was also optimized to achieve maximum sensitivity and specificity using the pROC package (version 1.15.0) and the Youden J score. The optimized values obtained for sensitivity and specificity were 0.55 and 0.794, respectively. Thresholds were also optimized such that specificity was >=0.9. The optimized values thus obtained for sensitivity and specificity were 0.288 and 0.905, respectively, at a threshold equal to 0.689.
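The two threshold-selection rules above (maximising the Youden J score, and requiring a minimum specificity) can be sketched as follows. This is an illustration and not the pROC implementation: the function names are hypothetical, and candidate thresholds are taken from the observed scores themselves, which is an assumption.

```python
def sens_spec(scores, labels, threshold, positive="IBS"):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted_pos = s >= threshold
        if y == positive:
            tp += predicted_pos
            fn += not predicted_pos
        else:
            fp += predicted_pos
            tn += not predicted_pos
    return tp / (tp + fn), tn / (tn + fp)

def youden_threshold(scores, labels):
    """Threshold maximising J = sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sens_spec(scores, labels, t)) - 1)

def threshold_for_specificity(scores, labels, min_spec=0.9):
    """Lowest threshold meeting the specificity constraint.
    Specificity never decreases as the threshold rises, so the first
    qualifying threshold also has the highest sensitivity."""
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        if spec >= min_spec:
            return t, sens, spec
    return None
```

The second rule trades sensitivity for specificity, which matches the pattern in the reported values (sensitivity falls from 0.55 to 0.288 as specificity rises from 0.794 to 0.905).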
The analysis identified 158 metabolites predictive of IBS, which are listed in Table 13. Metabolites with the highest RF feature importance included L-Phenylalanine, Adenosine and MG(20:3(8Z,11Z,14Z)/0:0/0:0). Increased levels of phenylethylamine, which is involved in the key metabolism pathway of phenylalanine, were found in fecal extracts of IBS mice compared with healthy control mice (47), indicating a connection between fecal phenylalanine levels and IBS, which is consistent with the present findings. Other metabolites which were predictive of IBS included the amino acids L-alanine, L-arginine and tyrosine, as well as inosine, previously reported as a biomarker of IBS (along with adenosine). The identified metabolites also included dodecanedioic acid, which, as discussed in Example 3, is an indicator of fatty acid oxidation defects (32).
Discussion
Here it is shown that the fecal metabolome profile of patients with IBS is distinct from that of controls. This observation is consistent with the results obtained using a different machine learning pipeline, as described in Example 3.
Materials and Methods
Subject recruitment and sample collection were carried out as described in Example 1.
Co-abundance clustering: Clusters of co-abundant genes (CAGs) representing metagenomically-defined species variables were identified using gene family abundances. The generation of the gene family abundances is described in detail in Example 1, but for completeness is also detailed below.
Microbiome profiling and metagenomics: Genomic DNA was extracted and amplified from frozen fecal samples (0.25 g) using the method described by Brown et al. (26).
Microbiome profiling and metagenomics—Shotgun sequencing: Genomic DNA was extracted as described above. For shotgun sequencing, 1 μg (concentration>5 ng/μL) of high molecular weight DNA for each sample was sent to GATC Biotech, Germany for sequencing on Illumina HiSeq platform (HiSeq 2500) using 2×250 bp paired-end chemistry. This returned 2,714,158,144 raw reads (2,612,201,598 processed reads) of which 45.6% were mapped to an average of 222,945 gene families per sample with a mean count value of 8,924,302±2,569,353 per sample.
Bioinformatics analysis (16S amplicon sequencing): MiSeq 16S sequencing data was returned for 144 subjects. Data generated for 3 samples (2 IBS and 1 control) were removed as the number of reads returned from sequencing was too low for analysis, leaving 141 samples (control: n=63, IBS: n=78). Raw amplicon sequence data were merged and the reads trimmed using the FLASH methodology (27). The USEARCH pipeline was used to generate the OTU table (28). The UPARSE algorithm was used to cluster the sequences into OTUs at 97% similarity (29). The UCHIME chimera removal algorithm was used with ChimeraSlayer to remove chimeric sequences (30). The Ribosomal Database Project (RDP) taxonomic classifier was used to assign taxonomy to the representative OTU sequences (28) and microbiota compositional (abundance and diversity) information was generated.
Bioinformatics analysis (Shotgun metagenomic sequencing): For shotgun metagenomics, 6 control samples were excluded because the data did not pass QC or no sample was available (control: n=59; IBS: n=80). The number of raw read pairs obtained after sequencing varied from 5,247,013 to 21,280,723 (mean=9,763,159±2,408,048). Reads were processed in accordance with the Standard Operating Procedure of the Human Microbiome Project (HMP) Consortium (31). Metagenomic composition and functional profiles were generated using the HUMAnN2 pipeline (32). For each sample, multiple profiles were obtained, including: microbial composition profiles from clade-specific gene information (using MetaPhlAn2), gene family abundance, and pathway coverage and abundance.
After clusters of co-abundant genes representing metagenomically-defined species variables were identified from the gene family abundances, using the HUMAnN2 pipeline, a co-abundance analysis of the gene families was performed using a modified canopy clustering algorithm (Nielsen et al., 2014) (48). The canopy clustering algorithm was run with default parameters for 139 samples (IBS (80 samples) or Controls (59 samples)) using the relative abundance of 1,706,571 gene families (UniRef90 database) stratified by species using the HUMAnN2 methodology (Franzosa et al., 2018) (32).
The resulting gene family clusters were filtered to keep those where at least 90% of the cluster signal originated from more than three samples and contained more than two gene families. This was in order to remove clusters driven by outliers or with too few values, as recommended by Nielsen et al, 2014 (48). The clusters remaining after filtering were termed co-abundant groups or CAGs.
Abundance Indices of CAGs: The abundance indices of the CAGs were generated by Singular Value Decomposition (SVD) as implemented in Principal Component Analysis (PCA) using the dudi.pca command with default parameters (ade4 package in R, version 3.5.1). The first principal component was extracted as the index, and its directionality was corrected by comparing the index to the median CAG gene abundance using the Spearman correlation of all values within a CAG. CAGs returning a negative correlation were corrected by inverting the principal component values for that CAG. The principal component values were then scaled by subtracting the minimum value for a CAG from each CAG value.
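The orientation and scaling of the first principal component described above can be sketched as follows. The PCA itself is omitted; the sketch assumes the per-sample first-component scores and the per-sample median gene abundance of one CAG are already available. All function names are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks.
    Undefined (division by zero) if either input is constant."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

def cag_index(pc1_scores, median_gene_abundance):
    """Flip PC1 if it anti-correlates with the median abundance, then shift
    so that the smallest index value is zero."""
    rho = spearman(pc1_scores, median_gene_abundance)
    oriented = [-v for v in pc1_scores] if rho < 0 else list(pc1_scores)
    lowest = min(oriented)
    return [v - lowest for v in oriented]
```

The sign correction is needed because the sign of a principal component is arbitrary; after correction, a larger index corresponds to a higher abundance of the CAG in that sample.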
Assignment of Taxonomy to CAGs: As each CAG is composed of multiple gene families, taxonomy was assigned to a CAG by reporting the most common genera and species associated with the gene families in the CAGs, along with the percentage of the CAG that they composed. For CAGs where a genus or species represented greater than 60% of the gene families, a taxonomy was assigned.
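The assignment rule above amounts to a majority vote with a 60% threshold, sketched below. The function name is hypothetical, and counting unclassified gene families in the denominator is an assumption.

```python
from collections import Counter

def assign_taxonomy(gene_family_taxa, min_fraction=0.60):
    """Return (most common taxon, its fraction of the CAG, assigned taxon or None).
    gene_family_taxa holds one genus or species label per gene family in the CAG."""
    counts = Counter(gene_family_taxa)
    taxon, n = counts.most_common(1)[0]
    fraction = n / len(gene_family_taxa)
    assigned = taxon if fraction > min_fraction else None
    return taxon, fraction, assigned
```

For example, a CAG in which 7 of 10 gene families map to Escherichia would be assigned to that genus, whereas one with exactly 6 of 10 would be reported with its percentage but left unassigned.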
CAG results: After filtering for a minimum of 3 gene families per CAG, the strain level information (as represented by CAGs) within the shotgun dataset consisted of a total of 955 CAGs. The CAGs had a mean of 41.09 and a maximum of 3,174 gene families. The distribution of CAGs across samples was sparse, with the mean number of CAGs per sample at 31.86 (3.34% of all 955 CAGs) and the maximum number of CAGs observed in any sample at 80 (8.38% of CAGs). The CAG cluster profile obtained was used to calculate inter-sample correlation distance based on Kendall correlation. Principal coordinate analysis based on this beta-diversity metric showed a significant split between IBS and Controls (
Machine learning: The in-house machine learning pipeline described in Example 4 was applied to the CAG profiles, following preliminary multivariate analysis.
Results
CAG Cluster Profiles are Predictive of IBS (IBS v Control)
An informative way to reduce the complexity of metagenomic data while increasing biological signal is to assemble the reads into Co-abundant Gene groups or CAGs, representing strain-level variables and commonly referred to as metagenomic species. The optimized random forest classifier, generated using the CAG cluster profiles as input data, was investigated for its predictive ability to classify samples as IBS or Control. External validation was 10-fold CV, while internal validation for optimization was 10-fold CV repeated 10 times.
Analysis of these strain-level variables significantly differentiated IBS from controls, as shown in
The performance summary and feature details are described in Table 14. Features selected by LASSO having coefficients less than zero are associated with IBS, while positive coefficients are associated with Controls.
Machine learning applied to the metagenomic species (CAGs) dataset produced a prediction model for IBS based on 136 predictive features (Table 14). Overall, for 10 folds, the mean ROC AUC was 0.814 (±0.134). Sensitivity and specificity were 0.875 (±0.102) and 0.497 (±0.217), respectively. Accuracy was 0.713 (±0.134).
The classification threshold was optimized to achieve maximum sensitivity and specificity using the pROC package and the Youden J score. The optimized values obtained for sensitivity and specificity were 0.75 and 0.797, respectively. Thresholds were also optimized such that specificity was equal to or greater than (>=) 0.9. The optimized values thus obtained for sensitivity and specificity were 0.3875 and 0.915, respectively, at a threshold equal to 0.791.
Therefore, the analysis identified 136 CAGs predictive of IBS (Table 14). Taxonomic assignment of the CAGs was sparse, with the majority of features unclassified, but assigned features were broadly consistent with the species-level analysis. The CAGs to which taxonomy was assigned include those associated with the genera Escherichia, Clostridium and Streptococcus, amongst others. At the species level, predictive CAGs included those associated with Escherichia coli, Streptococcus anginosus, Parabacteroides johnsonii, Streptococcus gordonii, Clostridium bolteae, Turicibacter sanguinis and Paraprevotella xylamphila, amongst others. A number of CAGs associated with individual strains were also identified, including Clostridiales bacterium 1_7_47 FAA, Eubacterium sp 3_1_31, Lachnospiraceae bacterium 5_1_57 FAA and Clostridiaceae bacterium JC118.
Discussion
Here it is shown that the microbiome of patients with IBS is distinct from that of controls, and that machine learning can be applied to co-abundance clustering of genes to reliably detect IBS.
A strain-level microbiome signature for IBS comprising 136 metagenomic species was identified. The separation between the microbiota of IBS and controls by unsupervised analysis exceeds that of earlier reports (10, 12). The limitations of 16S amplicon datasets and the relatively mild disease symptoms may account for failure to identify a microbiome signature in one report (12). Moreover, microbiome alterations were significantly associated with physician-diagnosed IBS, but were less significant in self-reported Rome criteria IBS (36).
Background
The current approach to stratification of patients into clinical subtypes based on predominant symptoms has significant limitations. This Example uses microbiome profiling to stratify IBS patients into subgroups.
Materials and Methods
Subject recruitment: A total of 142 samples were used for the analyses. Patients were recruited through gastroenterology clinics at Cork University Hospital, advertisements in the hospital, GP practices and shopping centres, and emails to university staff. 80 patients with IBS satisfying the Rome III/IV criteria and the agreed inclusion/exclusion criteria, and 65 healthy controls, were selected. Not all samples were used for each analysis due to differing availability of sample-specific datasets (Table 15). For example, sequencing data from 3 samples were of too poor quality to include with data from the remaining 142 samples and so were removed from the analyses.
Microbiome profiling: The samples were sequenced using 16S rRNA amplicon sequencing as described in Example 1. The resulting table showed abundance measures for each taxa across all 142 samples. If OTUs were present in 30% or less of samples they were filtered from the table.
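The prevalence filter described above can be sketched as follows. This is a minimal illustration only: the function name, the dictionary layout of the table and the toy values are assumptions, and "present" is taken to mean an abundance greater than zero.

```python
def filter_by_prevalence(table, max_removed_fraction=0.30):
    """Keep taxa detected (abundance > 0) in more than 30% of samples.

    `table` maps a taxon name to its list of per-sample abundances
    (a hypothetical layout chosen for this sketch).
    """
    kept = {}
    for taxon, abundances in table.items():
        prevalence = sum(1 for a in abundances if a > 0) / len(abundances)
        # Taxa present in 30% or less of samples are dropped.
        if prevalence > max_removed_fraction:
            kept[taxon] = abundances
    return kept


# Hypothetical toy table: 3 OTUs across 10 samples.
toy_table = {
    "OTU_1": [5, 0, 3, 2, 0, 8, 1, 0, 4, 6],  # present in 7 of 10 samples
    "OTU_2": [0, 0, 1, 0, 0, 0, 2, 0, 0, 3],  # present in 3 of 10 samples
    "OTU_3": [0, 0, 0, 0, 0, 0, 0, 0, 0, 9],  # present in 1 of 10 samples
}
filtered = filter_by_prevalence(toy_table)
```

In this toy table only OTU_1 survives, because OTU_2 sits exactly on the 30% boundary and is therefore removed.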
Machine learning: Unsupervised learning was used to group the samples. A heatmap of the microbiome OTU table was generated, and hierarchical clustering was applied using the Ward2 method (to build the dendrogram) and the Canberra distance measure.
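For illustration, the Canberra distance between two sample profiles can be computed as in the sketch below. The function names are hypothetical, and terms where both values are zero are taken to contribute nothing, which is one common convention and an assumption here. The clustering step itself is not shown; the resulting matrix would be passed to a hierarchical clustering routine.

```python
def canberra(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over features.

    Features that are zero in both profiles are skipped (assumed convention).
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def distance_matrix(samples):
    """Symmetric matrix of pairwise Canberra distances between samples."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = canberra(samples[i], samples[j])
    return dist


# Hypothetical toy profiles (rows are samples, columns are OTU abundances).
toy_samples = [[1, 0, 3], [3, 0, 1], [1, 0, 3]]
toy_dist = distance_matrix(toy_samples)
```

Because each term is scaled by the sum of the two abundances, low-abundance OTUs weigh as much as dominant ones, which is one reason this measure is often chosen for sparse count tables.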
Results
Descriptive Analysis of Samples
Of 142 samples that were analysed, 64 samples were healthy controls with the remaining 78 samples being IBS. Out of the 78, a group of 29 was diagnosed as the IBS-C subtype, a group of 20 was diagnosed as the IBS-D subtype and a group of 29 was diagnosed as the IBS-M subtype.
Identification of Subtypes
The hierarchical clustering identified 4 clusters (
Discussion
Here it is shown that hierarchical clustering applied to microbiome data may be used to define phenotypically distinct subgroups within the IBS population.
Materials and Methods
Subjects: The same subjects were studied as in Example 6. The number of samples analysed in this Example is shown in Table 15.
Analysis of alpha diversity: The same OTU data was used as in Example 6. Observed species (richness) is a measure of diversity defined as the count of unique OTUs within a sample. Statistical analysis was performed using ANOVA.
Analysis of beta diversity: Principal Coordinate Analysis with Canberra distance was used to analyse the differences in diversity of 16S data across the three IBS subgroups. Statistical analysis was performed using Pairwise Permutational MANOVA (adonis function, vegan library in R). The following six pairwise comparisons were made:
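As a minimal sketch of the observed species metric, richness is simply the number of OTUs with a non-zero count in a sample. The function name and the toy counts below are assumptions made for illustration.

```python
def observed_richness(otu_counts):
    """Number of distinct OTUs detected (count > 0) in one sample."""
    return sum(1 for count in otu_counts if count > 0)


# Hypothetical per-sample OTU counts.
toy_counts = {
    "sample_1": [12, 0, 4, 0, 7],
    "sample_2": [0, 0, 1, 0, 0],
}
richness = {name: observed_richness(c) for name, c in toy_counts.items()}
```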
1. IBS-1 subgroup vs Healthy (significant).
2. IBS-1 subgroup vs IBS-2 subgroup (significant).
3. IBS-1 subgroup vs IBS-3 subgroup (significant).
4. IBS-2 subgroup vs IBS-3 subgroup (significant).
5. IBS-2 subgroup vs Healthy (significant).
6. IBS-3 subgroup vs Healthy (not significant).
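The six comparisons are all pairs that can be drawn from four groups. The sketch below enumerates them and shows a simplified permutational MANOVA on a precomputed distance matrix. It is a pure-Python stand-in written for illustration, not the adonis implementation used in the study; the function names, the toy distances, the number of permutations and the seed are all assumptions.

```python
import random
from itertools import combinations

# Four groups give six unordered pairs.
pairs = list(combinations(["IBS-1", "IBS-2", "IBS-3", "Healthy"], 2))


def pseudo_f(dist, labels):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for a, i in enumerate(members)
                         for j in members[a + 1:]) / len(members)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels, keep distances fixed
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical one-dimensional toy data for a single pairwise comparison.
points = [0, 1, 2, 3, 10, 11, 12, 13]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_labels = ["IBS-1"] * 4 + ["Healthy"] * 4
f_stat, p_value = permanova(toy_dist, toy_labels)
```

For a pairwise analysis, the distance matrix and labels would be subset to the two groups in each pair before calling the test, and the resulting p-values would normally be adjusted for multiple comparisons.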
Differential abundance analysis: Statistical analysis was carried out using the DESeq2 pipeline (R library: DESeq2). Differentially abundant taxa at the genus level were identified for the above six pairwise comparisons.
Results
Differences in Alpha Diversity Across Subgroups
Applying the subgroup stratification of Example 6 to the OTU table and analysing the alpha diversity using the observed species metric within each of the groups revealed significant differences between all 4 groups, as shown in
Principal Coordinate Analysis of Beta Diversity of 16S Data
An analysis of the beta diversity using Principal Coordinate Analysis with Canberra distance at genus level across the three IBS subgroups, the results of which are shown in
The results show that the IBS-3 subgroup can be considered to have a normal-like microbiota composition, as evidenced by its lack of separation from the healthy controls.
The results of Principal Coordinate Analysis for Examples 7-9 are summarised in Table 16.
Differential Abundance Analysis—Genus Level
The differentially abundant genera identified in this study are shown in Table 17. For the comparison of the IBS-1 subgroup to the Healthy group, there were in total 23 significant taxa, of which 6 were increased in abundance (adjusted p-value <0.05). For the IBS-2 subgroup vs the Healthy group, there were 13 significant taxa, of which 6 were increased in abundance (adjusted p-value <0.05), and the comparison of the IBS-3 subgroup to the Healthy group identified only 1 significant taxon (adjusted p-value <0.05), which was increased in abundance (Table 17). Notably, it was observed that Blautia and Eggerthella were increased in both altered IBS groups (IBS-1 and IBS-2 subgroups). Butyricicoccus, Coprococcus and Prevotella were decreased in both altered IBS groups. Veillonella was the only genus to be increased in the Normal-like IBS group (IBS-3 subgroup).
The IBS-1 and IBS-2 subgroups were also compared to the normal-like IBS-3 subgroup. The results are shown in Table 18. As expected, the genus-level changes in the IBS-1 and IBS-2 subgroups relative to the IBS-3 subgroup were similar to those seen for the IBS-1 and IBS-2 subgroups compared to the healthy controls (Table 17). As in the comparison to the Healthy group, both Blautia and Eggerthella were increased in abundance and Prevotella was decreased. Flavonifractor was also increased in abundance in both altered IBS groups when compared to the normal-like IBS group (IBS-3), which was not the case when comparing to the healthy group.
Discussion
Here it is shown that the IBS subgroups identified in Example 6 have distinct microbiome profiles. A number of differentially abundant genera were identified that are increased or decreased in particular subgroups. This may be informative for future stratification.
Materials and Methods
Subjects: The same subjects were studied as in Examples 6 and 7. The number of samples analysed in this Example is shown in Table 15.
Metagenome profiling: Samples were sequenced using shotgun sequencing as described in Example 1. Quality assessment of reads was carried out using FastQC and MultiQC. The HUMAnN2 pipeline (which includes MetaPhlAn2) was used to determine abundance measures for taxa at the species level. In brief, the output files from the HUMAnN2 pipeline showing the relative abundance for each taxon were merged into a single table of relative abundance values for each taxon across all samples. The number of counts associated with each relative abundance value can be inferred by multiplying that value by the total number of reads in the corresponding sample and taking the integer part of the result. The final output was then a count table for species-level taxa across all 142 samples. Again, taxa present in 30% or less of samples were removed from the table.
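The conversion from relative abundance to inferred counts can be sketched as below. This assumes relative abundances expressed as fractions between 0 and 1 (values reported as percentages would first need dividing by 100); the function name and the toy values are hypothetical.

```python
def relative_abundance_to_counts(rel_abundance, total_reads):
    """Multiply each relative abundance by the sample's read total and
    keep the integer part, as described in the text."""
    return {taxon: int(value * total_reads)
            for taxon, value in rel_abundance.items()}


# Hypothetical sample with 10,000 reads in total.
toy_sample = {"species_A": 0.5, "species_B": 0.25, "species_C": 0.00045}
toy_counts = relative_abundance_to_counts(toy_sample, total_reads=10000)
```

Taking the integer part truncates rather than rounds, so a taxon whose product falls below 1 read is recorded as absent in that sample.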
Analysis of beta diversity: Principal Coordinate Analysis was performed as described in Example 6.
Differential abundance analysis: Statistical analysis was carried out as described in Example 7. Differentially abundant taxa at the species level were identified for the same six pairwise comparisons.
Results
Principal Coordinate Analysis of Beta Diversity of Metagenomics Data
As shown in
Differential Abundance Analysis—Species Level
As in Example 7, an intersection matrix was used to portray the taxa between groups that had increased or decreased in abundance (Table 19). The matrix captured the differences between the IBS groups, showing the dissimilarities and similarities between each IBS group compared to the Healthy group in terms of significantly differentially abundant species. The fact that the normal-like IBS group is essentially the same as the healthy group in terms of species abundance is reflected in the absence of any species within the normal-like column of the intersection matrix (Table 19). For the altered IBS groups, Ruminococcus gnavus was increased in abundance in both IBS-1 and IBS-2 subgroups. Three different species of Clostridium were also increased in both altered IBS groups when compared to the Healthy group.
Using the same intersection matrix methodology, it was also investigated which species were significantly differentially abundant across the altered IBS groups (IBS-1 and IBS-2) when compared to the normal-like IBS group (IBS-3). The results are shown in Table 20. Notable differences were observed. Firstly, no species was found to be significantly differentially abundant between the IBS-1 subgroup and the IBS-3 subgroup. Secondly, in the IBS-2 subgroup compared to the IBS-3 subgroup there were only 4 species which were significantly differentially abundant. Amongst these, Ruminococcus gnavus and a Clostridium species showed significant increases in abundance. The comparison between the two altered IBS groups also revealed a low number of significantly differentially abundant species.
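One way to picture the intersection step is as a comparison of two result sets, keeping the taxa that change in the same direction in both. The sketch below uses placeholder taxon names and a hypothetical function name; it does not reproduce the contents of Table 19.

```python
def shared_direction(results_a, results_b):
    """Taxa that are significant in both comparisons with the same direction.

    Each argument maps a significant taxon to "increased" or "decreased".
    """
    return {taxon: direction
            for taxon, direction in results_a.items()
            if results_b.get(taxon) == direction}


# Placeholder results for two comparisons against the Healthy group.
ibs1_vs_healthy = {"taxon_A": "increased", "taxon_B": "decreased", "taxon_C": "increased"}
ibs2_vs_healthy = {"taxon_A": "increased", "taxon_B": "increased", "taxon_D": "decreased"}
shared = shared_direction(ibs1_vs_healthy, ibs2_vs_healthy)
```

Here only taxon_A is retained: taxon_B changes in opposite directions, and taxon_C and taxon_D are each significant in one comparison only.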
Discussion
Notably, the separation of altered IBS groups (IBS-1 and IBS-2) to the normal-like (IBS-3) and healthy subjects that was seen here (
This study also revealed that a number of species are significantly differentially abundant across the IBS subgroups, but not between the IBS-3 group and healthy subjects.
In summary, this study demonstrated that the IBS subgroups identified in Example 6 have distinct metagenomic profiles, which may be informative for future stratification.
Materials and Methods
Subjects: The same subjects were studied as in Examples 6-8. The number of samples analysed in this Example is shown in Table 15.
Metabolome profiling: LC/GC-MS was used to quantify urine and fecal metabolites in each sample, as described in Examples 2 and 3, respectively, except that SCFA analysis was not performed. The output measurement is a laser intensity and can be viewed in signal form as a peak on a spectrograph. Results from all samples were collated into a matrix of peak values for each metabolite detected across all 142 samples. Urine peak values were normalised to creatinine values. Faecal peak values were normalised to either dry weight of sample (LC) or wet weight of sample (GC).
Analysis of beta diversity: Principal Coordinate Analysis was performed as described in Example 6.
Results
Principal Coordinate Analysis of Beta Diversity of Fecal and Urine Metabolomics Data
Using the normalised peak value data from the metabolomic results and the stratification from Examples 6-8, the beta diversity between the altered IBS groups, the normal-like IBS group and the Healthy group was determined. The results of Principal Coordinate Analysis for fecal and urine metabolomics data are shown in
Discussion
Here it is shown that the IBS subgroups identified in Example 6 have distinct fecal metabolomic profiles. The results obtained for the urine metabolomics data differed from those obtained for the microbiome, metagenomics and fecal metabolomics data. This may be informative for future stratification.
Materials and Methods
Subject recruitment and sample collection were carried out as described in Example 1.
Urine FAIMS: FAIMS analysis was performed using a protocol modified from that of Arasaradnam et al. (37) and described below. Any other appropriate method known in the art for detecting metabolites may be used in the methods of the invention. Frozen (−80° C.) urine samples were thawed overnight at 4° C.; 5 mL of each urine sample was then aliquoted into a 20 mL glass vial and placed into an ATLAS sampler (Owlstone, UK) attached to the Lonestar FAIMS instrument (Owlstone, UK). The sample was heated to 40° C. and sequentially run three times.
Each sample run had a flow rate over the sample of 500 mL/min of clean dry air.
Further make-up air was added to create a total flow rate of 2.5 L/min. The FAIMS was scanned from 0 to 99% dispersion field (DF) in 51 steps and from +6 V to −6 V compensation voltage in 512 steps, and both positive and negative ions were detected to produce an untargeted volatile organic compound (VOC) profile for each sample. The signals for each sample at each DF were smoothed using the Savitzky-Golay filter (window size=9, degree=3). The signals were trimmed based on an optimized cut-off of 0.007 for positive mode and −0.007 for negative mode outputs, to obtain the region of interest and reduce the baseline noise. The trimmed signals at each DF were aligned using cross-correlation, with the mean signal as reference, to make them comparable. Since the initial DFs and the higher DFs of the FAIMS signal were non-informative, only signals corresponding to the 17th to the 42nd DF of both positive and negative modes were considered. These pre-processing steps were performed using customized programs developed in Python, v. 2.7.11, with relevant packages (Scipy v-1.1, and Numpy v-1.15.2). To further reduce the complexity and to retain informative data, kurtosis normality tests were performed on each feature vector, features with raw p-value >0.1 were considered, and a final profile was generated for various statistical analyses.
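The smoothing and alignment steps can be illustrated with the standard-library sketch below. It is not the customized program referred to above: the function names, the edge handling, the maximum lag and the toy signals are assumptions. The weights are the standard Savitzky-Golay convolution coefficients for a 9-point window and a cubic polynomial.

```python
# Savitzky-Golay convolution weights for window size 9, polynomial degree 3.
SG_WEIGHTS = [-21, 14, 39, 54, 59, 54, 39, 14, -21]
SG_NORM = 231.0


def smooth(signal):
    """Smooth interior points; the first and last 4 points are left unchanged
    (an edge-handling assumption made for this sketch)."""
    half = len(SG_WEIGHTS) // 2
    out = [float(v) for v in signal]
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM
    return out


def best_lag(reference, signal, max_lag=10):
    """Shift of `signal` that maximises its cross-correlation with `reference`."""
    def score(lag):
        return sum(reference[i] * signal[i + lag]
                   for i in range(len(reference))
                   if 0 <= i + lag < len(signal))
    return max(range(-max_lag, max_lag + 1), key=score)


def align(reference, signal, max_lag=10):
    """Shift `signal` by the best lag, padding with zeros at the edges."""
    lag = best_lag(reference, signal, max_lag)
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


# Hypothetical toy signals: the same peak, displaced by 3 positions.
reference = [0.0] * 30
reference[10] = 1.0
shifted = [0.0] * 30
shifted[13] = 1.0
aligned = align(reference, shifted)

# A cubic is reproduced exactly by a degree-3 Savitzky-Golay filter.
cubic = [x ** 3 - 2 * x for x in range(20)]
smoothed_cubic = smooth(cubic)
```

In practice the reference would be the mean signal across samples at a given DF, and each sample's signal would be shifted onto it before the feature matrix is assembled.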
Bioinformatics analysis of urine metabolome data (FAIMS): Each urine sample analysed using FAIMS yielded a profile with ca. 52,224 data points. A pooled profile containing these data points for each sample was generated for pre-processing, to reduce the noise, size, and complexity of the data.
Urine GC/LC MS: 5 mL samples of frozen urine were sent on dry ice to Metabolomic Discoveries (now Metabolon), Potsdam, Germany. Untargeted metabolomics analysis was performed using liquid chromatography (LC) and Solid Phase Microextraction (SPME) gas chromatography (GC) and metabolites were identified using electrospray ionization mass spectrometry (ESI-MS). Short chain fatty acids (SCFA) analysis was also performed by LC-tandem mass spectrometry.
For urine metabolomics, the values of metabolites were normalized with reference to urine creatinine levels in each sample.
Bioinformatics analysis of urine metabolome data (MS): Urine MS metabolomics data was returned for all IBS subjects (n=80) and for all but 2 controls (n=63); the 2 excluded control samples either did not pass QC or were not available. A total of 2,887 metabolites were returned from untargeted urine metabolomics analysis, of which 594 were identified. Only the identified features with peak values normalized by creatinine levels in urine (mg/dl) were considered for further analysis.
Machine learning: An in-house machine learning pipeline was applied to the urine metabolomic data. The machine learning pipeline used in this example is similar to the machine learning pipeline used in Examples 1 to 3, but comprised additional optimization and validation steps, using a two-step approach within a ten-fold cross-validation. Within each validation fold, Least Absolute Shrinkage and Selection Operator (LASSO) feature selection was carried out, followed by Random Forest (RF) modelling, and an optimised model was validated against the cross-validation test data, which is external to the cross-validation training subset.
The classified urine metabolome sample profiles were log10-transformed before they were analysed in the machine learning pipeline. The transformed profiles were then used to classify the samples as IBS (80 samples) or Control (63 samples). The classified samples were then analysed in the machine learning pipeline.
After determination of the lambda (λ) range, the samples were assigned weights based on their class probabilities. The weights assigned to the training samples in this step were used in all subsequent applicable steps.
A LASSO algorithm substantially as described in Examples 1 to 3 was then applied to the weighted training samples. In this example, the LASSO algorithm used the previously calculated optimal lambda (λ) range, and used the Caret (version 6.0-84 in this example) and glmnet (version 2.0-18 in this example) packages. The ROC AUC (receiver operating characteristic, area under curve) metric was calculated using 10-fold internal cross validation, repeated 10 times. The feature coefficients identified by the optimized LASSO algorithm were extracted and features with non-zero coefficients were selected for further analysis. In
Following feature selection using LASSO, an optimized random forest classifier (with 1500 trees) was generated using the selected features, or all of the features, as determined by N. This optimised random forest classifier can be used to predict the external test fold. Random forest generation was performed using Caret (version 6.0-84) and internal cross validation, by tuning the ‘mtry’ parameter to maximise the ROC AUC metric. For tuning, if the number of selected features is greater than or equal to 5, mtry ranges from 1 to the square root of the number of selected features or else the range is from 1 to 6. The optimized random forest classifier was then applied to the test set and the performance of the classifier was calculated via the AUC, sensitivity, and specificity metrics.
Both LASSO feature selection and RF modelling were performed within a 10-fold cross validation (CV), which generated an internal 10-fold prediction model that predicts the IBS or control classification of samples. This 10-fold cross-validation procedure was repeated ten times and the average AUC, sensitivity and specificity are reported. The optimized model is then used to predict the cross-validation test subset, and final classifier performance metrics are calculated from across the ten folds of the cross-validation (AUC, Sensitivity and Specificity).
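The three performance metrics reported for each test fold can be computed as in the sketch below. It is independent of the Caret-based pipeline described above; the function names, the 0.5 decision threshold and the toy labels and scores are assumptions made for illustration (1 = IBS, 0 = control).

```python
def sensitivity_specificity(labels, predictions):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean(values):
    """Average of per-fold metrics."""
    return sum(values) / len(values)


# Hypothetical predictions for one test fold.
fold_labels = [1, 1, 1, 0, 0]
fold_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
fold_predictions = [1 if s >= 0.5 else 0 for s in fold_scores]

auc = roc_auc(fold_labels, fold_scores)
sens, spec = sensitivity_specificity(fold_labels, fold_predictions)
```

The per-fold values would then be averaged over the ten folds, and over the ten repeats, to give the reported figures, for example with `mean([...])` applied to the list of fold AUCs.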
Results
Metabolomic analysis was extended to all subjects, focusing initially on urine as a non-invasive test sample. Two methods were compared: FAIMS analysis for volatile organics, and combined GC-/LC-MS. The FAIMS technique did not identify discriminatory metabolites directly, but separated samples/subjects by characteristic plumes of ionized metabolites. In unsupervised analysis, FAIMS readily identified urine samples from controls and IBS (
Machine learning identified urine metabolomics features that are predictive of IBS (AUC 1.000; sensitivity: 1.000, specificity: 0.97, see Table 21a and 21b). Features that were highly predictive included dietary components such as epicatechin sulfate and medicagenic acid 3-O-b-D-glucuronide but also an acylglycine (N-undecanoylglycine) and an acylcarnitine (decanoylcarnitine) (Table 21a and 21b). Pairwise comparison of control and IBS urine metabolomes identified 127 differentially abundant features (Table 6). Eighty-nine urine metabolites were significantly less abundant in IBS subjects, including a number of amino acids such as L-arginine, a precursor for the biosynthesis of nitric oxide which is associated both with mucosal defence and perhaps IBS pathophysiology. Another 38 metabolites were present at significantly higher levels in IBS, including an acylglycine (N-undecanoylglycine) and an acylcarnitine (decanoylcarnitine). Elevated levels of metabolites from these groups are associated with altered fatty acid oxidation/metabolism and disease.
Discussion
Although urine metabolomics was highly discriminatory for IBS, the machine learning analysis showed that the compounds identified were predominantly diet- or medication-associated. This observation is consistent with the results obtained using a different machine learning pipeline, as described in Example 2.
The findings of the current study have clinical implications. First, the microbiome and fecal metabolome, and the urine metabolome, offer objective biomarkers for IBS.
Second, the traditional Rome subtyping of IBS is not supported by differences in microbiome and metabolome and it may be time to look for an alternative basis for disease classification.
Third, while the results in no way detract from the concept of an altered brain-gut axis in IBS, they point toward disturbances of the diet-microbiome-metabolome axis which are consistent with the complaints of many patients and should inform the design of future therapeutic interventions in IBS.
The taxa, pathways and metabolites that distinguish IBS subjects from controls identified here may be targeted by a range of microbiota-directed therapies such as fecal transplants, antibiotics, probiotics or live biotherapeutics.
Fourth, hierarchical clustering can be used to identify distinct IBS subtypes with differing microbiomes and fecal metabolomes. Some subgroups have an altered microbiome and fecal metabolome, whilst one subgroup had a normal-like microbiome and fecal metabolome. The identification and characterisation of these subgroups as described herein may be informative for future stratification and treatment.
Current stratification into clinical subtypes of IBS should not form the basis for therapeutic decisions, because the altered microbiota (compared to control subjects) is similar in the subtypes, consistent with alternating between constipation and diarrheal forms in many patients. A more informative stratification would be achieved by fecal microbiota and metabolome profiling. The metagenomic and metabolomic signatures that distinguish IBS subjects from controls identified here may be targeted by these microbiota-directed therapies.
Actinomyces
Oscillibacter
Oscillibacter
Paraprevotella
Coprococcus
Paraprevotella
Coprococcus
Actinomyces
Ruminococcus gnavus
Clostridium bolteae
Clostridiales bacterium_1_7_47FAA
Anaerotruncus colihominis
Lachnospiraceae bacterium_1_4_56FAA
Flavonifractor plautii
Clostridium clostridioforme
Clostridium hathewayi
Clostridium symbiosum
Ruminococcus torques
Alistipes senegalensis
Prevotella copri
Eggerthella lenta
Lachnospiraceae bacterium_5_1_57FAA
Lachnospiraceae bacterium_3_1_46FAA
Clostridium asparagiforme
Barnesiella intestinihominis
Clostridium citroniae
Eubacterium eligens
Lachnospiraceae bacterium_7_1_58FAA
Coprococcus_sp_ART_551
Lachnospiraceae bacterium_3_1_57FAA_CT1
Clostridium ramosum
Coprococcus catus
Eubacterium biforme
Ruminococcus lactaris
Bacteroides massiliensis
Lachnospiraceae bacterium_2_1_58FAA
Haemophilus parainfluenzae
Clostridium nexile
Clostridium innocuum
Bacteroides xylanisolvens
Oxalobacter formigenes
Alistipes putredinis
Paraprevotella clara
Odoribacter splanchnicus
Eubacterium_sp_3_1_31
Butyricicoccus
Phylum | Class | Order | Family | Genus |
---|---|---|---|---|
Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | |
Firmicutes | | | | |
Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | Butyricicoccus |
Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | |
Firmicutes | Clostridia | Clostridiales | | |
Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | |
Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | |
Firmicutes | | | | |
Firmicutes | Clostridia | Clostridiales | Ruminococcaceae | |
Firmicutes | Clostridia | Clostridiales | Lachnospiraceae | |
Number | Date | Country | Kind |
---|---|---|---|
19167114.8 | Apr 2019 | EP | regional |
19167118.9 | Apr 2019 | EP | regional |
1909052.1 | Jun 2019 | GB | national |
1915143.0 | Oct 2019 | GB | national |
1915156.2 | Oct 2019 | GB | national |
This application is a continuation of International Application No. PCT/EP2020/059459, filed Apr. 2, 2020, which claims the benefit of European Application No. 19167114.8, filed Apr. 3, 2019, European Application No. 19167118.9, filed Apr. 3, 2019, Great Britain Application No. 1909052.1, filed Jun. 24, 2019, Great Britain Application No. 1915143.0, filed Oct. 18, 2019, and Great Britain Application No. 1915156.2, filed Oct. 18, 2019, all of which are hereby incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/EP2020/059459 | Apr 2020 | US |
Child | 17491563 | US |