The present invention relates to a method for early determination of an active SARS-CoV2 infection and for predicting the infection outcome in an individual, based on identification of the subgenomic RNAs of the virus, and a kit for implementing said method.
Coronavirus SARS-CoV-2, the agent responsible for the disease COVID-19, has recently caused a pandemic with serious social and economic repercussions. In an endeavour to limit the effects of the pandemic, a great deal of effort has been expended on developing fast, reliable technologies designed to detect the virus at an early stage and reduce its spread.
One of the major challenges in the diagnosis of SARS-CoV-2 is associated with identification of asymptomatic carriers, who contribute significantly to the spread of the virus. Other difficulties lie in early identification of the disease and in the fact that often only its symptoms can be detected, which are similar to, and therefore liable to be mistaken for, those of other respiratory disorders or influenza.
The SARS-CoV-2 detection techniques currently used can be divided into three groups: (i) molecular methods able to detect viral RNA sequences, (ii) rapid diagnostic tests that detect viral antigens or the host's antibodies, and (iii) imaging techniques that detect lung changes.
In the molecular approach, technologies based on PCR and high-throughput sequencing are commonly used to amplify nucleic acids and detect the presence of the virus in respiratory samples.
At present, the available diagnostic kits are based on detection of SARS-CoV-2 genes that encode both structural proteins (e.g. spike protein [S], envelope protein [E], membrane protein [M] and nucleocapsid protein [N]) and non-structural proteins (Orf1b/RdRp).
CoV viruses contain the largest genomes (26-32 kb) of all the RNA virus families. Each viral transcript has a 5′-cap protective structure and a 3′ poly(A) tail (Lai and Stohlman, 1981; Yogo et al., 1977). On entry into the cell, the genomic RNA is translated to produce non-structural proteins (nsps) from two open reading frames (ORFs), ORF1a and ORF1b. The viral genome is also used as a replication and transcription template, mediated by nsp12 which harbours the activity of RNA-dependent RNA polymerase (RdRP) (Snijder et al., 2016; Sola et al., 2015). Negative-sense RNA intermediates are generated to serve as templates for positive-sense genomic RNA (gRNA) and subgenomic RNA (sgRNA) synthesis. All the structural and accessory proteins of the CoVs are translated from the sgRNAs. The gRNA is packed by the structural proteins to assemble progeny virions. Shorter subgenomic RNAs encode conserved structural proteins (spike protein [S], envelope protein [E], membrane protein [M] and nucleocapsid protein [N]), together with various accessory proteins. However, the ORFs have not yet been experimentally verified for expression. It is therefore not yet clear which accessory genes are actually expressed by this compact genome.
SARS-CoV-2 is known to have at least six accessory proteins (3a, 6, 7a, 7b, 8 and 10) according to the current annotation (GenBank: NC_045512.2). Taken together, SARS-CoV-2 expresses nine sgRNAs (S, 3a, E, M, 6, 7a, 7b, 8 and N) together with the gRNA (Cell. 2020; 181:914-921).
The functions of the sgRNAs are unclear, and some of them have been considered as parasites competing for viral proteins, hence their name of “defective interfering RNAs” (DI-RNA).
However, a certain association has been observed between the detection of sgRNAs and isolation of the virus in cell cultures, and in samples such as stool samples (J Infect 2020; S0163-4453(20)30753-2). N sgRNA (sgN) is considered to be the transcript most abundantly expressed during viral replication, followed by E sgRNA (sgE); sgE transcript levels are lower by about 1.5 Log10 (J Infect 2020; S0163-4453(20)30753-2).
Detection of more than one sgRNA could be used as a marker for viral replication, but further studies are required to confirm this hypothesis (Leung et al. Emerg Infect Dis 2020;26:2701-4). It has also been suggested that detection of subgenomic RNA of SARS-CoV-2 in diagnostic samples is not a valid indicator of infection and replication of the virus [Alexandersen S et al. “SARS-CoV-2 genomic and subgenomic RNAs in diagnostic samples are not an indicator of active replication”, Nature Communications volume 11, Article number: 6059 (2020)].
A method for early determination of the presence of SARS-CoV-2 infection, in the active stage and characterised by a high viral load, has now been found. The method according to the invention is based on detection of subgenomic RNA encoding nucleoprotein N (sgN) and subgenomic RNA encoding envelope protein (sgE), in a biological sample of the subject's cells or tissues, wherein positive detection of both subgenomic RNAs is indicative of an active infection state with high viral load.
The sample is generally taken from the subject's upper or lower respiratory tract with a nasal, oropharyngeal or nasopharyngeal swab. After collection, the sample is analysed for the presence of sgN and sgE using Real-Time PCR or Droplet Digital PCR (ddPCR). The subgenomic regions sgN and sgE can be detected together in the same RT-PCR or ddPCR reaction or in separate reactions starting with the same biological sample. The procedure comprises the following steps:
Real-Time PCR technology is known in the art and combines the PCR technique with the use of fluorescent “reporter” molecules to monitor the formation of products of amplification during each cycle of the PCR reaction. Amplification of the target DNA is obtained by means of repeated denaturing cycles followed by pairing of the primers and probes and polymerase-catalysed primer extension. DNA amplification is monitored in each PCR cycle by measuring the fluorescent signal produced, for example by non-specific dyes intercalated in double-stranded DNA or by sequence-specific probes consisting of oligonucleotides labelled with a fluorescent reporter that allows detection after hybridisation with the complementary target DNA. Dyes suitable for intercalation include SYBR® (Green I, Green II, Gold), LCGreen®, SYTO-(9, 13, 16, 60, 62, 64, 82), BOBO-3, POPO-3, BEBO, TO-PRO3, PicoGreen®, SYTOX Orange and other commercially available fluorescent dyes (fluorophores). The oligonucleotide probe is labelled with a fluorescent reporter (fluorophore) at one end and a quencher at the opposite end of the probe. The 5′ exonuclease activity of the polymerase cleaves the probe, releasing the reporter molecule, and resulting in increased intensity of fluorescence. Examples of fluorophores include 5- or 6-carboxyfluorescein (5- or 6-FAM), tetrachlorofluorescein (TET), hexachloro-6-carboxyfluorescein (HEX), 6-carboxy-4′,5′-dichloro-2′,7′-dimethoxyfluorescein succinimidyl ester (JOE), tetramethylrhodamine (TAMRA), 5-carboxytetramethylrhodamine (TAMRASE), carboxy-X-rhodamine (ROX) and 4-(dimethylaminoazo)benzene-4-carboxylic acid (DABCYL). Examples of quenchers include those belonging to the BHQ (Black Hole Quencher) family, NFQ-MGB (non-fluorescent quencher and minor groove binder), and QSY 7 or 21 carboxylic acid succinimidyl ester.
Droplet Digital PCR (ddPCR) technology is a method for conducting digital PCR based on water-oil emulsion droplet technology. A sample is partitioned into 20,000 droplets, and PCR amplification of the template molecules takes place in each droplet. ddPCR technology uses reagents and workflows similar to those used for the majority of the standard tests based on TaqMan probes. The very large partitioning of the samples is a key aspect of the ddPCR technique.
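How droplet partitioning leads to an absolute count can be illustrated with the Poisson correction generally used in digital PCR. The following is a minimal sketch only, not the code of any analysis software; the function name and the nominal droplet volume of 0.85 nL are assumptions introduced for illustration and are not taken from the present description.

```python
import math

def copies_per_microlitre(positive, total, droplet_volume_nl=0.85):
    """Estimate target concentration from droplet counts (Poisson correction).

    droplet_volume_nl is an assumed nominal droplet volume, not a value
    stated in the description.
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need total > 0 and 0 <= positive <= total")
    if positive == total:
        raise ValueError("all droplets positive: sample saturated, dilute and rerun")
    # Fraction of droplets without template; Poisson gives P(0) = exp(-lambda).
    negative_fraction = (total - positive) / total
    mean_copies_per_droplet = -math.log(negative_fraction)
    # Convert copies per droplet to copies per microlitre (1 uL = 1000 nL).
    return mean_copies_per_droplet * 1000.0 / droplet_volume_nl
```

The correction matters because a positive droplet may contain more than one template molecule, so counting positive droplets alone would underestimate the target at higher concentrations.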
In a preferred embodiment of the invention, the following are used to detect sgN with RT-PCR:
The same primers and probes as identified in (a) and (b) are preferably used to detect sgN with ddPCR.
In another preferred embodiment, detection of sgE by RT-PCR is conducted with the primer pair SEQ ID NO:6 and SEQ ID NO:7 and the probe SEQ ID NO:8, suitably labelled with a fluorophore-quencher pair, or the primer pair SEQ ID NO:9 and SEQ ID NO:10 together with the SYBR-Green dye. The same primers, SEQ ID NO:6 and SEQ ID NO:7, and the labelled probe SEQ ID NO:8, are preferably used to detect sgE with ddPCR.
The RT-PCR reaction, wherein the indicated primers and probe are used, is preferably conducted in a thermocycler under the following conditions: (i) heating and denaturing at 50° C. for 2 min and 95° C. for 10 min; followed by (ii) PCR stage at 95° C. for 15 sec and 60° C. for 1 min, repeated for 60 cycles; followed by (iii) melt curve stage at 95° C. for 15 sec, 60° C. for 1 min and 95° C. for 15 sec.
The ddPCR reaction, wherein the indicated primers and probe are used, is preferably conducted in a thermocycler under the following conditions: (i) 95° C. for 10 min, followed by (ii) 95° C. for 30 sec and 55° C. for 1 min, repeated for 45 cycles, followed by (iii) 98° C. for 10 min.
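The two thermal programs above can be written down as plain data, which makes them easy to compare and to check. The sketch below is illustrative only; the dictionary layout and the function name are assumptions, and the computed figure covers programmed hold times only, not ramping between temperatures.

```python
# Thermal programs as stated above: (temperature in deg C, hold time in seconds).
RT_PCR = {
    "initial": [(50, 120), (95, 600)],
    "cycle": [(95, 15), (60, 60)],
    "cycles": 60,
    "final": [(95, 15), (60, 60), (95, 15)],  # melt curve stage
}
DDPCR = {
    "initial": [(95, 600)],
    "cycle": [(95, 30), (55, 60)],
    "cycles": 45,
    "final": [(98, 600)],
}

def hold_time_minutes(program):
    """Sum of programmed hold times in minutes, ignoring ramping."""
    seconds = sum(hold for _, hold in program["initial"])
    seconds += program["cycles"] * sum(hold for _, hold in program["cycle"])
    seconds += sum(hold for _, hold in program["final"])
    return seconds / 60
```

With these values, `hold_time_minutes(RT_PCR)` gives 88.5 and `hold_time_minutes(DDPCR)` gives 87.5.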
In addition to the subgenomic RNAs for the genes of nucleoprotein N and envelope protein E, other genes or viral regions commonly used as target genes or sequences in commercial kits used to analyse nasal and/or oropharyngeal swab samples can also be detected. In particular, the target genes of the SARS-CoV2 virus are selected from gene E, gene N, RdRp, RNAse P and/or orf1ab. sgN, sgE and one or more of the other target genes can be detected simultaneously in a single RT-PCR reaction or by conducting a plurality of reactions in parallel or at different times, starting with the same biological sample.
The following sets of primers and probes are preferably used for detection of E and Orf1a genes (Arena F. et al., International Journal of Molecular Sciences 2021, 22, 1298, Table 2, pages 5 and 6):
the following primers and probe are preferably used for detection of RNAse P gene (https://www.who.int/docs/default-source/coronaviruse/whoinhouseassays.pdf):
the following primers and probe are preferably used for detection of RdRp gene (https://www.who.int/docs/default-source/coronaviruse/whoinhouseassays.pdf):
and the following primers and probe are preferably used for detection of N gene:
Detection of sgE serves as positive control for the analysis, and is indicative of SARS-CoV2 infection, even in the absence of sgN amplification. In particular, detection of sgE in the biological sample tested makes it possible to discriminate between SARS-CoV2 infection and infection with other respiratory viruses (FluA, FluB, RSV, etc.). Moreover, amplification of sgE in the absence of sgN indicates the presence of SARS-CoV-2 infection with a low viral load, whereas amplification of sgE together with sgN is indicative of infection with a high viral load.
Moreover, simultaneous amplification of a housekeeping gene such as beta globin can be performed as internal control of the reaction.
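The reading of the sgN and sgE results described above can be expressed as a small decision function. This is a sketch under stated assumptions: the function names are hypothetical, the Ct cutoff of 40 follows the convention used in the examples (Ct>40 treated as not detectable), and the label returned for sgE-negative samples is an assumption, since the description does not name a reading for that case.

```python
from typing import Optional

CT_CUTOFF = 40.0  # the examples treat Ct > 40 as "not detectable"

def detected(ct: Optional[float]) -> bool:
    """True if a Ct value was obtained and does not exceed the cutoff."""
    return ct is not None and ct <= CT_CUTOFF

def interpret(sgn_ct: Optional[float], sge_ct: Optional[float]) -> str:
    """Map sgN/sgE detection to the readings given in the description."""
    if detected(sge_ct) and detected(sgn_ct):
        return "infection, high viral load"
    if detected(sge_ct):
        return "infection, low viral load"
    # sgE acts as the positive control; this label is an assumption,
    # the description does not name a reading for sgE-negative samples.
    return "inconclusive"
```

For example, `interpret(None, 35.0)` returns the low viral load reading, because sgE amplified and sgN did not.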
Various evaluations can therefore be made on the basis of the result of amplification of the subgenomic RNAs sgN and sgE in the test sample:
These evaluations are particularly useful in cases wherein the amplification of the target genes tested for by the kits currently on the market give results that are difficult to interpret or poorly indicative of the actual presence of infection, for example by showing positivity to only one of the different genome targets amplified.
The method according to the invention can therefore be conveniently applied to the screening of patients infected with the SARS-CoV2 virus, to identify patients at an active stage of infection, who are therefore potentially contagious.
In addition, it was found that sgN is the first transcript that becomes undetectable, compared with sgE transcript, during the recovery in both hospitalized and isolated COVID19-affected patients, indicating that sgN loss is a predictive marker for lower SARS-CoV2 replication activity, thus being of importance for both monitoring the therapeutic response and alerting clinicians that the SARS-CoV2 negativization process is underway.
Accordingly, in a further embodiment the invention provides a method for predicting viral negativization and consequent recovery from SARS-CoV2 infection in a patient, which comprises determining the levels of sgN and sgE RNAs in a biological sample from said patient at different times from the initial positive test for SARS-CoV2 infection, wherein a decreased expression of sgN over time or its lack of detection while the levels of sgE remain substantially unchanged, indicates benign SARS-CoV-2 negativization in said patient.
Detection of sgN over time can be carried out with the methods and reagents herein disclosed, particularly using RT-PCR.
In a preferred embodiment, the patient is analysed for sgN expression at intervals of 3 to 7 days from the test indicating SARS-CoV2 positivity.
Overall, the experimental results show that sgN is a biomarker that is predictive of virus replication loss in patients infected with SARS-CoV-2. This makes it possible to reduce the risk of further infection by distinguishing, in terms of disease transmission and virus infectivity, patients with an active viral load from those who are in remission. Furthermore, sgN detection can be used during the follow-up of hospitalized and home-isolated COVID19-positive patients, to monitor their disease progression and therapeutic responses.
A further aspect of the invention relates to a kit for implementing the methods described herein. The kit comprises the primers and probes disclosed in Table 1 for detection of sgN and/or sgE RNAs in a biological sample and optionally the primers and probes herein disclosed for detecting other gene transcripts from E, N, RdRp, RNAse P and/or Orf1ab genes.
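The longitudinal criterion (sgN decreasing or becoming undetectable while sgE remains substantially unchanged) can be sketched as a comparison between the first and the last swab of a series. The function below is illustrative only. Its name and its numerical thresholds are assumptions: the description does not quantify "substantially unchanged" or the size of a relevant decrease, so both are exposed as parameters with placeholder defaults. A higher Ct corresponds to a lower transcript level, and `None` stands for no amplification.

```python
from typing import List, Optional

def _detected(ct: Optional[float], cutoff: float) -> bool:
    return ct is not None and ct <= cutoff

def negativization_flag(
    sgn_ct: List[Optional[float]],
    sge_ct: List[Optional[float]],
    sge_tolerance: float = 2.0,  # assumed meaning of "substantially unchanged"
    sgn_min_rise: float = 2.0,   # assumed minimum Ct rise counted as a decrease
    cutoff: float = 40.0,        # Ct above this is treated as not detected
) -> bool:
    """Return True if sgN fell or was lost while sgE stayed stable."""
    if len(sgn_ct) < 2 or len(sgn_ct) != len(sge_ct):
        raise ValueError("need two or more time points, same length for both series")
    first_n, last_n = sgn_ct[0], sgn_ct[-1]
    first_e, last_e = sge_ct[0], sge_ct[-1]
    sge_stable = (
        _detected(first_e, cutoff)
        and _detected(last_e, cutoff)
        and abs(last_e - first_e) <= sge_tolerance
    )
    sgn_lost = not _detected(last_n, cutoff)
    sgn_decreased = (
        _detected(first_n, cutoff)
        and _detected(last_n, cutoff)
        and last_n - first_n >= sgn_min_rise
    )
    return sge_stable and (sgn_lost or sgn_decreased)
```

For a patient sampled three times, `negativization_flag([28.0, 34.0, None], [30.0, 30.5, 31.0])` returns `True`, since sgN is lost at the last swab while sgE moves by only 1 Ct.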
The naso-oropharyngeal swab samples were taken with commercial flocked swabs collected in about 1 mL of universal transport medium (UTM; Copan, Brescia, Italy) and sent to our laboratory in containers at a controlled temperature within four hours of the sample being taken. A unique centralised unit for sample collection, consisting of qualified healthcare professionals trained to take oropharyngeal swabs, guaranteed the homogeneity of the sample-taking procedures. Our study was approved by the Ethics Committee of Federico II University (protocol no. 000576 of Oct. 4, 2020) and conducted in compliance with the Declaration of Helsinki.
All samples were extracted by an automated procedure on MagPurix instrumentation. In detail, a 200 μL volume was used to extract RNA in a fully automated system based on MAGPURIX VIRAL/PATHOGEN NUCLEIC ACIDS kit (Zinexts, marketed by Resnova, Italy) running on the MAGPURIX 24 instrument. MagPurix® CE-IVD reagent kits are designed to provide the maximum extraction quality, through optimised protocols. All the RNA was eluted in 50 μl of elution buffer supplied by the manufacturers.
The oligonucleotide sequence of the primers is described in Table 1. Total RNA was extracted from all positive samples. The sgRNA RT-PCR assay used the SuperScript IV VILO Master Mix (11756500, Invitrogen) according to the manufacturer's instructions. The sgRNA tests used a leader-specific primer, as well as primers and probes targeting sequences downstream of the start codons of genes E and N [10, 11]. In addition, SYBR Green technology was also used to detect said subgenomic transcripts with a different pair of primers. In detail, the reverse transcription products (cDNA) were amplified by quantitative PCR in real time, using a real-time PCR system (Quantstudio 5). The target genes were detected with a Brightgreen 2× qPCR Mastermix low-rox (# Mastermix-lr; ABM). These analyses were conducted with a PCR machine (Quantstudio 5) under the following conditions: Heating/denaturing step, 50° C. for 2 min, 95° C. for 10 min; PCR stage, 95° C. for 15 sec, 60° C. for 1 min (×60 cycles); Melt curve stage, 95° C. for 15 sec, 60° C. for 1 min; 95° C. for 15 sec. The SYBR primers are reported in Table 1.
The absolute quantification of SARS-CoV2-RNA was carried out by ddPCR using a two-step reaction: cDNA was synthesized with the SensiFAST cDNA Synthesis Kit (Bioline), using 2× ddPCR Supermix (no dUTP) (Bio-Rad). The QX200 droplet generator was used to generate the droplets by mixing the cDNA samples, 9 μM of forward and reverse primers, and 2.5 μM of probe with 70 μL of droplet formation oil. The amplification step was performed on the T100 thermocycler (Bio-Rad) under the following conditions: heating at 95° C. for 10 minutes, followed by 95° C. for 30 seconds and 55° C. for 1 minute, repeated for a total of 45 cycles (at a heating rate of 2° C./s), followed by 98° C. for 10 minutes. After PCR, the positive/negative droplets were analysed in the QX200 droplet reader (Bio-Rad), and QuantaSoft analysis software (Bio-Rad) was used to calculate the number of targets analysed.
The statistical analyses were conducted with the IBM SPSS® Statistics software package (IBM Company, New York, N.Y., USA) (IBM SPSS Statistics for Mac, version 26). The correlation matrix (Spearman's rho coefficient) was used to show the monotonic relationship between the diagnostic tests. P≤0.05 was considered statistically significant.
We analysed 48 RNA samples extracted from patients who tested positive for COVID19, with Ct values in the ranges 13.5-22.5 (n=26) and >22.5-40 (n=22). The target sgN was only detectable in the samples with Ct values close to or lower than 22.5. Conversely, sgE was always detected, independently of the Ct value range. These results were obtained with both real-time TaqMan technology and SYBR Green. Moreover, by means of high-resolution melting analysis (HRMA), it was possible to distinguish between the different targets amplified, excluding the false positives which could have been generated by off-target signals (
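For readers without access to SPSS, Spearman's rho can be reproduced with a few lines of code: it is the Pearson correlation computed on the ranks of the two series, with tied values sharing the mean of their ranks. The sketch below is a generic illustration, not the procedure run in the study, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length, two points or more")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5
```

Applied to the Ct values of two assays run on the same samples, a rho close to 1 indicates that the assays order the samples in the same way.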
Our ddPCR results (
We have developed a diagnostic kit based on a Taqman approach, that can detect expression levels of viral sgN, gene E, gene ORF1ab, and the human RNAse P gene. We compared the results obtained from 50 oro/nasopharyngeal swabs to those obtained using the “in-vitro diagnostic” (IVD) approved Allplex 2019-nCoV assay (Seegene; https://www.seegene.com/). These data show that these kits can identify with certainty the SARS-CoV-2-positive patients. Furthermore, we demonstrate that the new SARS-CoV-2 kit can identify ‘true negative’ COVID19-free people through analysis of an independent cohort of 12 samples.
The SARS-CoV-2 kit also identified viral sgN, gene E, and gene ORF1ab in SARS-CoV-2-positive bronchial aspirate specimens collected from hospitalized patients.
The SARS-CoV-2 kit was also evaluated for sensitivity (sgN, gene E, gene ORF1ab; 300,000 to 30 viral copies) and for sgN specificity (≥99.9%), with a hit rate of 95.0%. We tested the detection of SARS-CoV-2 sgN transcripts using the SARS-CoV-2 kit through the analysis of oro/nasopharyngeal swab samples from a cohort of 315 COVID19-positive Italian patients (in Coronet Laboratories based in Milan, Udine, Naples). The positivity of these patients to SARS-CoV-2 infection was confirmed through detection of viral gene E and gene ORF1ab in all of the samples. In contrast, the levels of sgN were not detectable (i.e., Ct>40) in 120 of these samples (38.1%). One-way analysis of variance (ANOVA) was used to determine that sgN expression was detected using the SARS-CoV-2 kit only in the samples that were characterized by Ct values<33.163 for viral gene E (P<0.0001;
We also compared sgE expression levels (using Taqman methodology) to sgN expression levels in 122 patients from a single Coronet center, as a subset of the full cohort (ASL Napoli3-sud; Data S4). These data showed that sgN and sgE were not detectable using the SARS-CoV-2 kit in 82.8% and 64.8% of this single-center cohort, respectively (
Taken together, these data indicated that both of the sgRNA transcripts (i.e., sgN, sgE) are independently detected only in those patients with higher viral loads, when the infection is expanding and rapidly progressing. Vice-versa, at lower viral loads, sgN was generally not detected (gene E Ct>33.16; gene Orf1b Ct>33.15; see
(A,B). Samples obtained from oro/nasopharyngeal swabs from COVID19-positive patients (N=315) were stratified into three groups according to the median Ct values of sgN (sgN Ct median=33.51), as detected through SARS-CoV-2 kit. The first group consisted of those samples where Ct for sgN was below the median value (i.e., Ct<30.51; n=99 samples; light grey). The second group was characterized by Ct values for sgN from the Ct median value (30.51) to 40.00 (n=96 samples; dark grey). The third group comprised samples where sgN was not detected (i.e., Ct>40; n=120 samples; grey). ANOVA was used through IBM SPSS Statistics to determine the cut-off for sgN detection. SgN was detected in the samples with viral E Ct values <33.163 (A) and ORF1ab Ct values <33.155 (B) (P<0.0001, for both). (C, D) Pie charts showing the proportions (%) of the oro/nasopharyngeal swab specimens where the levels of sgN (C) and sgE (D) were detectable (i.e., Ct values<40; dark grey) or not detectable (i.e., Ct values>40; grey), for the 122 COVID19-positive patients belonging to a single cohort (entire cohort, N=315). The data show no detectable levels of sgN and sgE in 82.8% (C; grey) and 64.8% (D; grey) of the patients, respectively. (E-H) ANOVA was used through IBM SPSS Statistics to determine the cut-off for sgN and sgE detection in the 122 oro/nasopharyngeal swabs from the single-cohort COVID19-positive patients. SgN was detected in the samples with viral E Ct values<33.41 (E) and ORF1ab Ct values<33.54 (G) (P<0.0001, for both). SgE was detected in the samples with viral E Ct values<34.06 (E) and ORF1ab Ct values<34.20 (G) (P<0.001, for both).
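The three-group stratification described in this legend can be sketched as follows. This is an illustration only: the function name is hypothetical, and taking the median over the samples in which sgN was detected is an assumption about how the grouping was done. `None` stands for no amplification.

```python
from statistics import median

def stratify_by_sgn(ct_values, cutoff=40.0):
    """Split sgN Ct values into below-median, median-to-cutoff and not detected."""
    found = [ct for ct in ct_values if ct is not None and ct <= cutoff]
    missing = [ct for ct in ct_values if ct is None or ct > cutoff]
    groups = {"below_median": [], "median_to_cutoff": [], "not_detected": missing}
    if found:
        # Assumption: the median is taken over detected samples only.
        m = median(found)
        groups["below_median"] = [ct for ct in found if ct < m]
        groups["median_to_cutoff"] = [ct for ct in found if ct >= m]
    return groups
```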
With the aim of monitoring viral replication and its potential failure, we undertook further analyses to determine how expression changes over time for the sgRNAs (i.e., sgN, sgE) and for the genes N, E, ORF1ab and RdRp that are expressed during SARS-CoV-2 infection. Here, we analyzed a cohort of oro/nasopharyngeal swabs collected from 16 COVID19-positive home-isolated patients at specific times (i.e., 3-day intervals from the first swab) until they reached a negative status for the SARS-CoV-2 genes, when possible (
In more detail, gene ORF1ab and RdRp were detected only in 10% of the patients 7 days from the first swab (
We then analyzed an independent cohort of six COVID19-affected patients hospitalized in an Intensive Care Unit. Here, the analysis monitored sgN, gene E, gene ORF1ab, gene N, and RdRp using longitudinal detection at 7-day intervals (0, 7, 14 days). SgN was detected on the first swab tests, and again in 67% of the patients after 7 days, and in 33% after 14 days (
Taken together, these data obtained through the analysis of two independent datasets of oro-pharyngeal swab tests from COVID19-affected patients (home-isolated, hospitalized) identified sgN as the first viral transcript to show decreased expression levels (to the ‘undetectable’ level) during their recovery from SARS-CoV-2 infection. Overall, loss of sgN detection preceded the benign SARS-CoV-2 negativization by 3 to 7 days from the first swab in home-isolated and hospitalized COVID19-positive patients, respectively.
(A) A cohort of oro/nasopharyngeal swabs was collected from home-isolated COVID19-positive patients and analyzed according to the scheduled times (i.e., at 3-day intervals from the first swab). Ten patients were followed up to 7 days from the first swab test; 6 patients were followed up to 3 days. (B) Pie charts showing the proportions (%) of positivity of the oro/nasopharyngeal samples to viral subgenomic sgN and sgE, and genomic N, E, ORF1ab, and RdRp at the different times (dark grey, first swab [n=16]; grey, second swab collected after 3 days [n=16]; light grey, third swab collected after 7 days [n=10]). SgE was detected in 8 oro/nasopharyngeal samples. SgN was detected in 44% of the samples after 3 days from the first swab. sgN, gene E and gene ORF1ab were measured using the SARS-CoV-2 kit. Gene N, gene E and gene ORF1ab were detected using the Allplex 2019-nCoV assay. SgE was evaluated by Taqman qPCR. (C) A cohort of oro/nasopharyngeal swabs collected from 6 hospitalized COVID19-positive patients was analyzed according to the scheduled times (i.e., 7-day intervals from the first swab). (D) Pie charts showing the proportions (%) of positivity of the oro/nasopharyngeal samples to viral subgenomic sgN, and genomic N, E, ORF1ab and RdRp at the different time points (dark grey, first swab [n=4]; grey, second swab collected after 7 days [n=4]; light grey, third swab collected after 14 days [n=4]). SgN was detected in 50% of the samples after 7 days from the first swab. sgN, gene E and gene ORF1ab were measured using the SARS-CoV-2 kit. N and ORF1ab were detected using the Allplex 2019-nCoV assay.
Number | Date | Country | Kind |
---|---|---|---|
102020000032351 | Dec 2020 | IT | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/087512 | 12/23/2021 | WO |