The present invention relates to the medical field. In particular, it relates to an in vitro method for the diagnosis of viral infections, for selecting a therapy for a patient suffering from a viral infection and/or for monitoring the response of vaccinated patients to a viral vaccine.
There have been increasing efforts to find host biomarkers to identify viral infections in febrile children. The interest stems from the need to avoid the overuse of antibiotics, which is accelerating antimicrobial resistance worldwide and which the World Health Organization (WHO) has declared one of the greatest threats to human health.
In recent years, the use of host blood gene expression biomarkers derived from transcriptomic studies to distinguish phenotypically similar diseases has grown rapidly, as it has yielded promising results in scenarios where the available technology is uncertain or inefficient.
To date, several signatures for different infectious diseases have been described, but their implementation is still limited. For diagnostic tests based on RNA signatures to be translated into the clinical setting, the first step is to identify a small number of transcripts able to identify the disease in question with sufficient precision. The second requirement is to develop a fast and inexpensive method or protocol for measuring gene expression levels, such as qPCR or newly emerging technologies, which may hold the key to the introduction of transcriptomic biomarkers into mainstream clinical decision-making in the coming years.
Significant results are provided herein on the performance of a 2-transcript host RNA signature for discriminating viral infections, which has the potential to be used in mainstream clinical decision-making.
The present invention is focused on solving the above cited problems. After the study and analysis of transcriptome modifications, an in vitro method is provided herein for the diagnosis of viral infections, for selecting a therapy for a patient suffering from a viral infection and/or for monitoring the response of vaccinated patients to a viral vaccine.
The two-transcript signature proposed in the present invention is able to distinguish viral infections in a broad sense. Therefore, it can be used for the diagnosis of viral infections, for selecting a therapy for a patient suffering from a viral infection and/or for monitoring the response of vaccinated patients to a live attenuated viral vaccine.
The diagnostic signature assigns each patient a disease risk score, calculated by adding the log-transformed expression of both transcripts according to the formula:
Disease Risk Score=log(expression [ENSG00000273149])+log(expression [ENSG00000254680])
Lower scores imply viral assignment, whereas higher scores correspond to healthy assignment. The optimal threshold value is defined by Youden's J statistic, as the point of the ROC curve that maximizes the sum of sensitivity and specificity.
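The scoring and threshold selection described above can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the natural logarithm is assumed (the formula does not state a base), the expression values are hypothetical, and the function names are introduced here for illustration only. Expression values must be strictly positive for the logarithm to be defined.

```python
import math

def disease_risk_score(expr_1, expr_2):
    """Sum of the log expression of the two transcripts (natural log assumed)."""
    return math.log(expr_1) + math.log(expr_2)

def youden_threshold(scores, is_viral):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is called viral when its score is <= threshold, because
    lower scores imply viral assignment.
    """
    n_viral = sum(is_viral)
    n_healthy = len(is_viral) - n_viral
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, v in zip(scores, is_viral) if v and s <= t)
        tn = sum(1 for s, v in zip(scores, is_viral) if not v and s > t)
        j = tp / n_viral + tn / n_healthy - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j

# Hypothetical normalized expression values: (transcript 1, transcript 2, viral?)
samples = [(5.0, 8.0, True), (10.0, 6.0, True), (20.0, 9.0, True),
           (60.0, 40.0, False), (80.0, 30.0, False), (15.0, 20.0, False)]
scores = [disease_risk_score(a, b) for a, b, _ in samples]
labels = [v for _, _, v in samples]
threshold, j = youden_threshold(scores, labels)
```

In this toy data set the two groups do not overlap, so the selected threshold is the highest score among the viral samples and J equals 1. With real data the groups overlap and J is below 1.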
Although in a preferred embodiment the present invention refers to an RNA signature which comprises, in combination, SEQ ID NO: 1 (ENSG00000273149) and SEQ ID NO: 2 (ENSG00000254680), it is important to note that the present invention can also be carried out by using only one of the above cited RNAs. Thus, in a preferred embodiment, the present invention can be carried out by using SEQ ID NO: 1 (ENSG00000273149). Please refer to
The RNA transcriptomic signature of the invention is suitable for distinguishing vaccinated from unvaccinated children and children affected by community acquired Rotavirus. Consequently, this signature could be used to detect vaccine failures and prevent severe Rotavirus re-infections. However, surprisingly, the biomarkers and signature provided by the present invention are able to distinguish healthy controls from viral infections in a broad sense, including (non-exhaustive list): Bocavirus, Influenza, Metapneumovirus, Respiratory Syncytial virus and Varicella Zoster virus (see
According to the ROC curves shown in
The fact that the RNAs of SEQ ID NO: 1 and/or SEQ ID NO: 2 have been found to be differentially expressed between vaccinated or wild-type infected children and healthy controls, with high sensitivity, can be considered an unexpected and promising result. Accordingly, the above cited RNAs can be efficiently used for the diagnosis of viral infections, for selecting a therapy for a patient suffering from a viral infection and/or for monitoring the response of vaccinated patients to a viral vaccine.
In particular, Table 1 shows the AUC, sensitivity and specificity associated with the use of SEQ ID NO: 1 (ENSG00000273149) for the identification of viral or bacterial infections. As can be observed in Table 1, the use of SEQ ID NO: 1 for the identification of a variety of viral infections gives rise to an AUC higher than 0.9, with a sensitivity and specificity higher than 0.8. In contrast, the use of SEQ ID NO: 1 for the identification of bacterial infections gives rise to an AUC lower than 0.8, with a sensitivity and/or specificity lower than 0.8.
On the other hand, Table 2 shows the AUC, sensitivity and specificity associated with the use of the combination of SEQ ID NO: 1 (ENSG00000273149) and SEQ ID NO: 2 (ENSG00000254680) for the identification of viral or bacterial infections. As can be observed in Table 2, the use of SEQ ID NO: 1 and SEQ ID NO: 2 for the identification of a variety of viral infections gives rise to an AUC higher than 0.89, with a sensitivity and specificity higher than 0.8. In contrast, the use of SEQ ID NO: 1 and SEQ ID NO: 2 for the identification of bacterial infections gives rise to an AUC lower than 0.8, with a sensitivity and/or specificity lower than 0.8.
Particularly, the first embodiment of the present invention refers to an in vitro method for the diagnosis of viral infections in a patient which comprises determining the level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, in a biological sample obtained from the patient, wherein a reduced level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or the protein encoded thereof, as compared with the reference level determined in healthy control subjects, preferably as compared with a corresponding predetermined threshold level selected to provide a sensitivity and specificity of at least 0.8, is an indication that the patient is suffering from a viral infection.
The second embodiment of the present invention refers to an in vitro method for selecting a therapy for a patient which comprises determining the level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, in a biological sample obtained from the patient, wherein a reduced level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or the protein encoded thereof, as compared with the reference level determined in healthy control subjects, preferably as compared with a corresponding predetermined threshold level selected to provide a sensitivity and specificity of at least 0.8, is an indication that the patient is suffering from a viral infection and consequently a treatment with antibiotics can be discarded.
The third embodiment of the present invention refers to an in vitro method for monitoring the response of vaccinated patients to a viral vaccine which comprises determining the level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, in a biological sample obtained from the patient, wherein a reduced level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or the protein encoded thereof, as compared with the reference level determined in healthy control subjects, preferably as compared with a corresponding predetermined threshold level selected to provide a sensitivity and specificity of at least 0.8, is an indication that the patient is responding to the viral vaccine.
The fourth embodiment of the present invention refers to the in vitro use of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, for the diagnosis of a viral infection in a patient.
The fifth embodiment of the present invention refers to the in vitro use of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, for selecting a therapy for a patient with a viral infection.
The sixth embodiment of the present invention refers to the in vitro use of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, for monitoring the response of vaccinated patients to a viral vaccine.
The seventh embodiment of the present invention refers to the in vitro use of a kit comprising reagents for the determination of the level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, for the diagnosis of a viral infection, for selecting a therapy for a patient with a viral infection or for monitoring the response of vaccinated patients to a viral vaccine.
The eighth embodiment of the present invention refers to a method for treating a patient which comprises selecting a therapy by determining the level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or a protein encoded thereof, in a biological sample obtained from the patient, wherein a reduced level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or the protein encoded thereof, as compared with the reference level determined in healthy control subjects, is an indication that the patient is suffering from a viral infection and consequently a treatment with antibiotics can be discarded, and wherein a higher level of at least the RNA of SEQ ID NO: 1 and/or SEQ ID NO: 2, or the protein encoded thereof, as compared with the reference level determined in healthy control subjects, is an indication that the patient is not suffering from a viral infection and consequently a treatment with antibiotics might be recommended.
In a preferred embodiment, the viral infection detected and/or treated according to the present invention is caused by (non-exhaustive list): Rotavirus, Varicella, Bocavirus, Influenza, Metapneumovirus, Rhinovirus or Respiratory syncytial virus.
In a preferred embodiment, the viral vaccine that has been used to treat the patient is a vaccine for the prophylactic treatment of a viral infection caused by (non-exhaustive list): Rotavirus, Varicella, Bocavirus, Influenza, Metapneumovirus, Rhinovirus or Respiratory syncytial virus.
In a preferred embodiment, the present invention comprises determining the level of at least the RNA of SEQ ID NO: 1 in combination with the RNA of SEQ ID NO: 2, or proteins encoded thereof.
In a preferred embodiment, the present invention is carried out in a sample selected from the list: blood, serum, plasma or dermal fibroblasts.
For the purpose of the present invention the following terms are defined:
All researchers were trained in the study protocol for patient recruitment, sample processing and sample storage. The study was conducted in accordance with Good Clinical Practice. Written informed consent was obtained from a parent or legal guardian for each subject before study inclusion. The project was approved by the Ethical Committee of Clinical Investigation of Galicia (CEIC ref. 2012/301). Furthermore, this project followed the guidelines of the Declaration of Helsinki.
46 samples: 6 controls (roughly 7 months of age with all the vaccines of the Spanish calendar up to date), 14 vaccinated (roughly 7 months of age with all the vaccines of the Spanish calendar up to date plus 3 Rotateq® doses), 12 infected (with moderate and severe symptomatology) and 14 pre-vaccinated (children that had only received hepatitis B vaccine). 26 Western-European donors were prospectively recruited at the Hospital Clinico Universitario of Santiago de Compostela (Galicia; Spain) during the period 2013 to 2014. Blood samples were obtained from these children using a PAXgene RNA tube (PreAnalytiX GmbH). All children recruited (ages ranging from nearly 2 to 34 months, male/female ratio=0.77) had routine immunization up-to-date. In wild-type infected children the mean time elapsed from hospital admission to blood collection was three days, and in Rotavirus vaccinated children the blood sample was taken approximately a month after the last Rotateq® dose. There were no remarkable clinical features in the individuals recruited.
4 blood samples from patients with H7N9 infection (n=2) and healthy people (n=2). Samples were obtained from the NIH repository under accession number PRJNA230906.
6 samples of a human dermal fibroblast (HDF) cell line infected with different Varicella Zoster Virus (VZV) strains or vaccines (Suduvax® and Varivix®) (n=1 control, n=2 wildtype strains and n=3 vaccinated) were obtained from the NIH repository under accession number PRJNA497243.
A validation cohort of children affected by viral infections of different etiologies was prospectively collected at the Hospital Clinico Universitario of Santiago de Compostela (Galicia; Spain) during the period 2013 to 2014. It comprises 1 Bocavirus patient, 2 Influenza patients, 1 Metapneumovirus patient, 2 Rhinovirus patients, 4 Rotavirus patients and 36 respiratory syncytial virus patients.
32 samples (6 controls, 14 rotavirus-vaccinated children and 12 rotavirus-infected children with moderate or severe symptomatology) from Western-European donors were prospectively collected at the Hospital Clinico Universitario of Santiago de Compostela (Galicia; Spain) during the period 2013 to 2014. A blood sample was obtained from these children using a PAXgene RNA tube (PreAnalytiX GmbH). There were no remarkable clinical features in the individuals recruited.
The quality standards followed in the present study were previously described in [Salas, A. et al., 2016. Strong down-regulation of glycophorin genes: A host defense mechanism against rotavirus infection. Infection, Genetics and Evolution, 44, 403-411]. Briefly, Bioanalyzer 2100 and Qubit 2.0 were employed to evaluate the quality and the quantity of the collected RNA. We used the GLOBINclear™-Human Blood Globin Reduction Kit (Life Technologies; CA, USA) to eliminate globin mRNA and obtain a clearer signal from leukocyte mRNAs. The poly(A)+ mRNA fraction was isolated from total RNA, and cDNA libraries were obtained following Illumina's recommendations. An equimolar pooling of the libraries was performed before cluster generation using the cBot from Illumina. An Illumina HiSeq 2000 sequencer was used to sequence the pool of cDNA libraries using paired-end sequencing (100×2).
First, we performed a quality control of the raw data using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) and MultiQC to ensure that there were no problems or biases in the data that might affect the downstream analysis. Afterwards, the whole-transcriptome paired-end reads were mapped against the version of the human genome provided by Ensembl (version GRCh37) using the ultrafast universal RNA-seq aligner STAR. We also used STAR to count the number of reads mapping to each gene.
The next step was the normalization of the count data to reduce the systematic technical effects that may appear in the data, and therefore decrease the impact of technical bias on the final results. Many methods for normalizing RNA-seq data have been developed; however, a gold standard normalization method has not yet been established. For normalizing the data, we used the statistical software R v3.4.3 (http://www.r-project.org) and tried several methods: RPKM (reads per kilobase per million mapped reads), TMM as implemented in the edgeR package, CQN (conditional quantile normalization) from the tweeDEseq package and, finally, the DESeq2 method implemented in the package of the same name. All of the methods yielded virtually the same result, so we chose the normalization method included in the DESeq2 package, as this package was chosen for performing the downstream analysis.
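As an illustration of what count normalization does, the following Python sketch computes per-sample size factors in the spirit of the median-of-ratios approach associated with the DESeq2 package. It is a simplified sketch under stated assumptions, not the package's implementation: the function names and the toy count matrix are hypothetical, and only genes with nonzero counts in every sample are used to estimate the factors.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw read counts (one per sample).
    """
    n_samples = len(counts[0])
    # Keep genes with nonzero counts in every sample, so that the log of
    # the geometric mean across samples is defined.
    usable = [g for g in counts if all(c > 0 for c in g)]
    log_geo_means = [sum(math.log(c) for c in g) / n_samples for g in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(g[j]) - m for g, m in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    f = size_factors(counts)
    return [[c / fj for c, fj in zip(g, f)] for g in counts]

# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(raw)
normalized = normalize(raw)
```

In this toy example the second sample receives a size factor twice that of the first, so after normalization the first three genes have equal values in both samples, which is the intended removal of the sequencing-depth effect.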
Finally, we used the negative binomial distribution implemented in the DESeq2 package, together with the Surrogate Variable Analysis (SVA) method implemented in the sva R package, to estimate the differentially expressed genes (DEG) between vaccinated children and healthy controls and to minimize batch effects between sequencing runs. A generalized linear model was fitted in each cohort, a t statistic was calculated for each gene, and the P-values obtained were corrected for multiple testing using the Benjamini-Hochberg false discovery rate approach. We obtained 8997 differentially expressed genes in this comparison.
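The Benjamini-Hochberg correction mentioned above can be sketched in a few lines of Python. This is an illustrative stand-alone version with a hypothetical function name and made-up P-values; the study itself relied on R packages for this step.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, so that adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four genes.
raw_p = [0.001, 0.04, 0.03, 0.2]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With these made-up values only the first gene remains below an adjusted P-value of 0.05, which shows how genes that look significant individually can drop out after correction.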
We applied a known variable selection algorithm, elastic net, using the glmnet R package, to the genes differentially expressed (adjusted P<0.05) between vaccinated children and controls with a log2 fold change higher than two units. The parameters needed for the elastic net were estimated using 10-fold cross-validation, yielding an 18-transcript signature.
In order to determine a less complex signature, we looked for the most informative genes among those previously selected by the elastic net algorithm, using a machine learning approach: a single-hidden-layer neural network model fitted with the R package nnet. This yielded a two-transcript signature (SEQ ID NO: 1 and SEQ ID NO: 2):
Disease Risk Score=log(expression [ENSG00000273149])+log(expression [ENSG00000254680])
The performance of the proposed 2-transcript signature as a potential diagnostic tool was evaluated using Receiver Operating Characteristic (ROC) curves, which plot the true positive rate (TPR) against the false positive rate (FPR) at different threshold cut-points. ROC curves were built in R using the package pROC.
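For readers unfamiliar with ROC analysis, the following Python sketch shows how the points of a ROC curve and the area under it (AUC) can be derived from a list of scores. It is only an illustration with hypothetical scores and function names, not a reproduction of the pROC analysis; a sample is called viral when its score is at or below the threshold, consistent with lower scores implying viral assignment.

```python
def roc_points(scores, is_viral):
    """Return (FPR, TPR) pairs, one per candidate threshold.

    A sample is called viral when its score is <= threshold.
    """
    n_pos = sum(is_viral)
    n_neg = len(is_viral) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores)):
        tp = sum(1 for s, v in zip(scores, is_viral) if v and s <= t)
        fp = sum(1 for s, v in zip(scores, is_viral) if not v and s <= t)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical disease risk scores; True marks a viral sample.
example_scores = [3.7, 4.1, 5.2, 5.7, 7.8, 7.9]
example_labels = [True, True, False, True, False, False]
curve = roc_points(example_scores, example_labels)
area = auc(curve)
```

In this example one healthy sample scores lower than one viral sample, so 8 of the 9 viral/healthy pairs are ranked correctly and the AUC is 8/9.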
Finally, we performed an external validation with different external datasets to evaluate whether the discovered signal was specific to the live attenuated rotavirus contained in the vaccine or whether it was a viral signal in a broad sense, and to assess its accuracy in truly independent datasets.
Following the strategy represented in
In order to study the changes in the transcriptome of vaccinated children and children with community acquired Rotavirus, a large-scale expression screening was performed using an RNA-Seq approach. A comparison of gene expression between children with community acquired rotavirus and healthy controls indicated that a total of 9544 genes showed statistically significant differences, whereas 8997 genes showed statistically significant differences when comparing children vaccinated against rotavirus with controls.
It was examined whether patients clustered according to their disease status (viral infection, bacterial infection and healthy controls) when employing only one of the two genes of the DRS.
Boxplots were generated together with a one-dimensional scatter plot of closely-packed but non-overlapping points (
The diagnostic accuracy of the test to discriminate viral infection was evaluated using ROC analysis (
On the other hand,
For SEQ ID NO: 1 (ENSG00000273149), in all scenarios the ROC curve indicates that the accuracy of the test is very high (AUC>90%) when distinguishing viral infection from healthy controls. When comparing bacterial versus viral infection, the AUC almost reaches 80%, and when comparing bacterial infection versus controls it drops slightly to 76%.
Taken together, these results suggest that translating this viral signature into a clinically applicable test based on the determination of the level of SEQ ID NO: 1 or SEQ ID NO: 2 may be feasible. In particular, these results show that SEQ ID NO: 1 (ENSG00000273149) is the variable with the highest impact on the accuracy of the 2-transcript model.
While looking for biomarkers to distinguish vaccinated from unvaccinated children using a Lasso variable selection method followed by a neural network approach, an unexpected but promising result was found. The prediction model based on just two RNAs, Disease Risk Score=log(expression SEQ ID NO: 1)+log(expression SEQ ID NO: 2), can be efficiently used to perform viral diagnosis in a broad sense. This model was capable of accurately distinguishing between viral infections and healthy controls/bacterial disease in the samples provided in the present invention and in four external validation datasets: one from Spain including respiratory and intestinal viruses, one from China with influenza samples (PRJNA230906), one from Mexico (PRJNA285798) with Rotavirus and bacterial samples, and one composed of epithelial cells affected by varicella zoster virus (PRJNA497243) (see
It was examined whether patients clustered according to their disease status (viral infection, bacterial infection and healthy controls) when applying the DRS. Boxplots were generated together with a one-dimensional scatter plot of closely-packed but non-overlapping points (
The diagnostic accuracy of the test to discriminate viral infection was evaluated using ROC analysis (
Number | Date | Country | Kind |
---|---|---|---|
19382084.2 | Feb 2019 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2020/052907 | 2/5/2020 | WO | |