The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Mar. 3, 2022, is named P140814US02_Sequence_listing.txt and is 27,841 bytes in size.
The present invention relates to a method for predicting and monitoring the severity of COVID-19 disease following infection of a subject with the SARS-CoV-2 virus.
There is no specific and objective clinical test to determine or predict COVID-19 disease severity. Currently, clinicians use several generic readouts (e.g. blood oxygen saturation, interleukin-6 concentration) and their clinical judgement. Unique to COVID-19, there is a problem around timelines where any required tests need to be designed, developed and deployed rapidly.
Clinicians use a variety of generic readouts to try to determine and monitor disease severity. One of the most widely used measurements is blood oxygen saturation. Interleukin-6 concentration may also be used among other, more standard blood tests and clinical readouts. Prior to the making of the present invention, it is believed that there was no specific commercially available blood test for COVID-19 disease severity.
A recent paper by Messner et al (Cell Systems, vol. 11, pages 11-24 (2020)) describes the use of ultra-high-throughput clinical proteomics to reveal classifiers of COVID-19 infection. The report is focused on an early proteomics signature for COVID-19 disease classifiers and deals with the discovery stage of the protein signature. However, the report does not suggest any means to predict the severity of the COVID-19 disease in infected subjects.
The technique described by Messner et al is also an unbiased discovery mass spectrometry analysis used in the context of discovery only and not for prognosis of a patient's disease status. Another paper by Demichev et al (2020) reports a time-resolved proteomic and diagnostic assay for COVID-19 disease progression in which a broad group of protein markers is studied (doi.org/10.1101/2020.11.09.20228015).
The present invention provides a specific clinical test to classify and predict COVID-19 disease severity. The test uses a previously undisclosed combination of 31 proteins and 52 peptide sequences therefrom, where the peptide sequences relied on have not been previously disclosed and at least some of the proteins have not previously been associated with COVID-19 disease. The invention is a targeted proteomics blood test to predict and monitor COVID-19 disease severity. The blood test measures the absolute concentration of 52 peptides arising from 31 blood plasma proteins at the same time. The heavily multiplexed assay is definitive and can be accredited to existing regulatory standards to deploy in the clinic.
In accordance with a first aspect of the invention, there is provided a method for predicting and/or classifying the severity of COVID-19 disease in a subject, the method comprising:
The protease may be a serine protease (e.g. trypsin, chymotrypsin, thrombin, elastase, or subtilisin), a cysteine protease (e.g. papain, caspase-1, adenain, pyroglutamyl-peptidase I, sortase A, hepatitis C virus peptidase 2, sindbis virus-type nsP2 peptidase, dipeptidyl-peptidase VI, DeSI-1 peptidase, TEV protease, amidophosphoribosyltransferase precursor, gamma-glutamyl hydrolase, hedgehog protein, dmpA aminopeptidase), a threonine protease (e.g. ornithine acetyltransferase), an aspartic protease (e.g. pepsin, cathepsin D, cathepsin E, napsin-A, nepenthesin, presenilin, renin (chymosin)), a glutamic protease (e.g. scytalidoglutamic peptidase (eqolisin), aspergilloglutamic peptidase), a metalloprotease (e.g. an ADAM or a matrix metalloproteinase), or an asparagine peptide lyase.
A proteolytic peptide is therefore a peptide sequence from a protein which has been produced by the action of a protease cleaving a peptide bond between amino acids in the protein sequence. Such proteolytic peptides are therefore oligopeptides being formed of a number of amino acids. The proteolytic peptide may be from 5 to 30 amino acid residues in length, suitably 6 to 25 amino acids in length, 7 to 21 amino acids in length, or any of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 amino acids in length.
A proteolytic peptide prepared by the action of the protease trypsin on a protein may be referred to as a tryptic peptide and so on.
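By way of illustration only, the generation of tryptic peptides can be sketched in silico. The sketch below assumes the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) except where the next residue is proline (P), and keeps only peptides of 5 to 30 residues as described above. The function name and the toy sequence are hypothetical and are not taken from Table 1.

```python
import re
from typing import List


def tryptic_digest(protein_sequence: str, min_len: int = 5, max_len: int = 30) -> List[str]:
    """Return in-silico tryptic peptides whose length lies in [min_len, max_len].

    Assumed cleavage rule: cut after K or R unless the following residue is P.
    Missed cleavages are not modelled in this sketch.
    """
    # Zero-width split: position preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein_sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]


# Toy sequence for illustration only (not a real protein).
TOY_SEQUENCE = "GASPVTKLLNDEQRPAAGGVVLLKMR"
peptides = tryptic_digest(TOY_SEQUENCE)
print(peptides)
```

In this toy case the R followed by P is not cleaved, and the two-residue C-terminal fragment is discarded because it falls below the minimum length.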
The proteolytic peptide may suitably have a sequence as set out in Table 1.
The method of the invention may comprise assaying for the presence of up to all 31 proteins in Table 1 with respect to a proteolytic peptide thereof. In some embodiments, the method of the invention may comprise assaying for the presence of a proteolytic peptide of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or 31 proteins as shown in Table 1.
The methods of the invention therefore comprise the assaying for up to 52 proteolytic peptides of the 31 proteins shown in Table 1. In some embodiments, the method of the invention may comprise assaying for at least 5 to 10, 5 to 15, 5 to 20, 5 to 25, 5 to 30, 5 to 35, 5 to 40, 5 to 45 or 5 to 50 proteolytic peptides of Table 1. Preferably the number of proteolytic peptides assayed for is at least 7, at least 17, at least 29 or at least 41 proteolytic peptides as shown in Table 1.
References to proteolytic peptides as shown in Table 1 may alternatively refer to proteolytic peptides as shown in Table 6 or Table 9. The proteolytic peptide may suitably have a sequence as set out in Table 9 Cohort 1 and/or Table 9 Cohort 2.
Table 9 contains adjusted p values for cohort 1 and cohort 2. The lower the adjusted p value the more important the peptide is for stratification and outcome prediction. The method of the invention may suitably comprise assaying for proteolytic peptides from Table 9 with relatively low adjusted p values. The method may comprise assaying for the proteolytic peptide having the lowest adjusted p value from Table 9. The method may comprise assaying for the 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48 proteolytic peptides having the lowest adjusted p values from Table 9. The method may comprise assaying for at least the 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48 proteolytic peptides having the lowest adjusted p values from Table 9. The method may comprise assaying for at least the 2, 5, 10, 15, 20, 25, 30, 35, 40, or 45 proteolytic peptides having the lowest adjusted p values from Table 9.
The proteolytic peptides from Table 9 may be from cohort 1 and/or cohort 2. Where the proteolytic peptides are from cohort 1 the method may comprise assaying for the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 1. Where the proteolytic peptides are from cohort 2 the method may comprise assaying for the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 2. Where the proteolytic peptides are from cohort 1 and cohort 2 the method may comprise assaying for the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 1 and for the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 2. The method may comprise assaying for at least the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 1 and for at least the proteolytic peptide having the lowest adjusted p value from Table 9 cohort 2, optionally further comprising at least the 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48 proteolytic peptides having the lowest adjusted p values from Table 9 cohort 1 and/or optionally further comprising at least the 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48 proteolytic peptides having the lowest adjusted p values from Table 9 cohort 2. The skilled person will appreciate that, when the method comprises assaying for a proteolytic peptide on the basis of a relatively low adjusted p value in both Table 9 cohort 1 and Table 9 cohort 2, the method does not involve assessing the proteolytic peptide twice; rather, the presence of a relatively low adjusted p value in both Table 9 cohort 1 and Table 9 cohort 2 is a pointer to the skilled person to assess the proteolytic peptide once.
The method may comprise assaying for one or more proteolytic peptides having an adjusted p value from Table 9 cohort 1 and/or Table 9 cohort 2 which is at or below a threshold adjusted p value. The threshold adjusted p value may be 0.05, 0.01, 0.001, 0.0001, 0.00001, 1×10−6, 1×10−7, 1×10−8, 1×10−9, 1×10−10, 1×10−11, 1×10−12, 1×10−13 or 1×10−14. The threshold adjusted p value from Table 9 cohort 1 may be 0.05, 0.01, 0.001, 0.0001 or 0.00001. The threshold adjusted p value from Table 9 cohort 2 may be 0.05, 0.01, 0.001, 0.0001, 0.00001, 1×10−6, 1×10−7, 1×10−8, 1×10−9, 1×10−10, 1×10−11, 1×10−12, 1×10−13 or 1×10−14.
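The selection of proteolytic peptides by adjusted p value, whether by rank or by threshold, may be illustrated by the following minimal sketch. It also shows a peptide that qualifies in both cohorts being listed, and hence assessed, only once. The function names are hypothetical, and the peptide labels and p values are placeholders that are not taken from Table 9.

```python
def lowest_p_peptides(adjusted_p, n=None, threshold=None):
    """Rank peptides by ascending adjusted p value, then apply optional filters.

    adjusted_p: dict mapping peptide label -> adjusted p value.
    n:          keep only the n peptides with the lowest adjusted p values.
    threshold:  keep only peptides at or below this adjusted p value.
    """
    ranked = sorted(adjusted_p, key=adjusted_p.get)
    if threshold is not None:
        ranked = [pep for pep in ranked if adjusted_p[pep] <= threshold]
    if n is not None:
        ranked = ranked[:n]
    return ranked


def combined_panel(cohort1, cohort2, n=None, threshold=None):
    """Union of the selections from both cohorts, each peptide listed once."""
    panel = []
    selected = lowest_p_peptides(cohort1, n, threshold) + lowest_p_peptides(cohort2, n, threshold)
    for pep in selected:
        if pep not in panel:  # a peptide selected in both cohorts is assayed once
            panel.append(pep)
    return panel


# Placeholder values for illustration only (NOT the values of Table 9).
cohort1 = {"PEPTIDE_A": 1e-4, "PEPTIDE_B": 0.03, "PEPTIDE_C": 0.2}
cohort2 = {"PEPTIDE_A": 1e-9, "PEPTIDE_C": 1e-6, "PEPTIDE_D": 0.04}

print(combined_panel(cohort1, cohort2, n=2))
print(combined_panel(cohort1, cohort2, threshold=0.001))
```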
The method may therefore comprise:
The method may therefore comprise:
The method may comprise assaying for any combination of proteolytic peptides from Table 9 cohort 1 and/or Table 9 cohort 2.
The top-right panel of supplementary
The method may therefore comprise (ii) assaying the proteolytic digest of proteins of step (i) for the presence of one or more proteolytic peptides as shown in the top-right panel of Supplementary
The method may comprise assaying for any combination of proteolytic peptides from the top-right panel of supplementary
In one embodiment, the method of the invention may suitably comprise assaying for the seven proteolytic peptides in the following group of proteolytic peptides:
According to this aspect of the invention, one or more additional peptide sequences may also be assayed for from the peptides shown in Table 1 in addition to the peptide sequences shown above. Suitably, the peptide sequences assayed for may therefore be as shown in Figures 1, 2, 3 or 4.
Suitably, the peptide sequences assayed for may be as shown in Figures 7, 8 or 9.
Suitably, the peptide sequences assayed for may be as shown in Supplementary Figures 1, 2, 3 or 4.
In one embodiment, the method of the invention may suitably comprise assaying for the 32 proteolytic peptides that changed with the severity of the COVID-19 disease according to treatment escalation: i.e. from uninfected (WHO 0) to mild (WHO3), moderate (WHO 4, 5) and severely (WHO 6, 7) COVID-19 affected individuals (
In one embodiment, the method of the invention may suitably comprise assaying for the 33 proteolytic peptides that had a significant trend between patients according to the WHO ordinal outcome scale for clinical improvement with patients capturing a WHO score from relatively mild (WHO 3) to very severe cases (WHO 7) (Supplementary
The methods of the present invention use mass spectrometry to identify the concentrations of the peptides, for example proteolytic peptides having the sequences as set out in Table 1. In a mass spectrometry (MS) system for the analysis of proteolytic peptides in a sample, the proteolytic peptides are injected into the MS system where the proteolytic peptides are first detected as intact peptides and then subsequently fragmented into smaller pieces which may be termed peptide fragments. The methods of the present invention provide for the detection of up to 52 proteolytic peptides of the up to 31 proteins as shown in Table 1. The concentrations of the peptides are used to assign a subject to a given level of severity of COVID-19 disease with reference to a calibration curve of the concentrations for known reference proteolytic peptides. For example, one scale for determining the severity of COVID-19 disease in a patient is the WHO scale of COVID-19 disease severity. In one embodiment of the invention, the determination of the severity of COVID-19 disease may be made according to the WHO scale of COVID-19 disease severity. The scale of COVID-19 disease severity may therefore range from 0 to 10 using the WHO scale. Other scaling systems may also be used, based on the concentrations of the peptides assayed for in the sample with reference to a calibration curve based on known standard proteolytic peptides. The disease severity correlates to the absolute concentration of peptides in a sample from a subject. Suitably, the concentrations of the proteolytic peptides in a sample are used in a linear regression model to calculate a risk score for a patient to develop a severe disease, for example to assign a subject to a defined grade of COVID-19 severity according to the WHO scale. The overall risk score may therefore be generated using a statistical model taking proteolytic peptide concentration and patient age as the relevant inputs.
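Purely as an illustration of how a linear model might combine proteolytic peptide concentrations and patient age into a risk score, a minimal sketch follows. The function name, the coefficients, the intercept and the peptide labels are all hypothetical placeholders; fitted values would have to be obtained from reference data and are not given here.

```python
def risk_score(concentrations, age, coefficients, age_coefficient, intercept=0.0):
    """Linear risk score: intercept + age term + weighted peptide concentrations.

    concentrations: dict peptide label -> measured concentration.
    coefficients:   dict peptide label -> model weight (hypothetical values).
    """
    score = intercept + age_coefficient * age
    for peptide, concentration in concentrations.items():
        score += coefficients[peptide] * concentration
    return score


# Placeholder inputs for illustration only.
example_coefficients = {"PEPTIDE_A": 0.5, "PEPTIDE_B": -0.25}
example_concentrations = {"PEPTIDE_A": 2.0, "PEPTIDE_B": 4.0}

score = risk_score(example_concentrations, age=60,
                   coefficients=example_coefficients,
                   age_coefficient=0.25, intercept=1.0)
print(score)
```

A negative weight, as assumed here for one placeholder peptide, corresponds to a peptide whose concentration falls as severity rises.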
The methods of the invention may optionally further include the use of a statistical model which combines the concentration of proteolytic peptides determined according to the mass spectrometry analysis with the patient's age in order to determine an overall risk for COVID-19 disease severity.
The concentrations of the proteolytic peptides assayed for in the samples are therefore linked with the prediction and/or diagnosis of COVID-19 disease severity. Generally, the higher the dysregulation of proteolytic peptide concentration from a baseline control (either increased or decreased), the more severe the disease is or will be. The measurement of peptide concentration variability within a disease stage/group is also important for accurate prediction and/or diagnosis. The lower the variability the more accurate the prediction or diagnosis. The ANOVA p-value is therefore a proxy that takes both of these components into account and determines which peptides are critical for prediction and/or diagnosis.
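The way in which an ANOVA statistic reflects both the separation between disease groups and the variability within each group can be illustrated with a minimal one-way ANOVA sketch. Only the F statistic is computed here; the corresponding p value would be obtained from the F distribution, for example with a library routine such as scipy.stats.f_oneway. The function name and the concentration values are hypothetical placeholders.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements.

    F = (between-group sum of squares / (k - 1)) /
        (within-group sum of squares / (n - k))
    Assumes at least two groups and non-zero within-group variability.
    """
    all_values = [value for group in groups for value in group]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    ss_within = sum((value - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for value in group)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Placeholder concentrations for three severity groups (illustration only).
variable_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
tight_groups = [[1.9, 2.0, 2.1], [2.9, 3.0, 3.1], [5.9, 6.0, 6.1]]

f_variable = one_way_anova_f(variable_groups)
f_tight = one_way_anova_f(tight_groups)
print(f_variable, f_tight)
```

Both placeholder data sets have the same group means, so the larger F statistic of the second set arises only from its lower within-group variability, which corresponds to a lower p value.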
In the data presented herein in Table 4, the peptide EQHLFLPFSYK (SEQ ID No: 16) has the lowest p-value and a clear concentration increase in WHO7 patients vs. healthy or WHO3-WHO4 patients (
Reference peptides may be used to pre-configure the mass spectrometer prior to use in a method of the invention to detect and quantitate the concentration of peptides of Table 1 in a sample. The reference peptides also allow for the construction of calibration lines with each batch of samples tested in order to ensure robust results. Heavy isotope-labelled peptides may be used as internal standards to control analytical variability in each sample and also provide for calibration lines.
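As a hedged illustration of a calibration line with an internal standard, the sketch below fits a straight line to the ratio of the analyte (light) peak area to the heavy isotope-labelled internal standard peak area against known calibrator concentrations, and then back-calculates the concentration in a sample. The unweighted least-squares fit, the function names and all numerical values are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_areas(light_area, heavy_area, slope, intercept):
    """Back-calculate concentration from the light/heavy peak area ratio."""
    ratio = light_area / heavy_area
    return (ratio - intercept) / slope


# Placeholder calibrators: known concentrations and measured area ratios.
calibrator_concentrations = [1.0, 2.0, 4.0, 8.0]
calibrator_ratios = [0.5, 1.0, 2.0, 4.0]

slope, intercept = fit_line(calibrator_concentrations, calibrator_ratios)
sample_concentration = concentration_from_areas(3000.0, 2000.0, slope, intercept)
print(slope, intercept, sample_concentration)
```

Taking the ratio to the internal standard, rather than the raw analyte peak area, is what allows sample-to-sample analytical variability to cancel out.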
The methods of the invention may further account for whether the proteolytic peptides are shown herein to be up-regulated or down-regulated in patients. See for example the 11 up-regulated and 22 down-regulated proteolytic peptides described in Example 3. See also panel a of
Novel peptide sequences of proteins not previously associated with detection of COVID-19 disease include sequences from:
The WHO scale of COVID-19 disease severity lists the patient state and the clinical descriptor against a score as follows (Marshall et al., The Lancet Infectious Diseases, vol. 20, e192-e197 (2020)):
The methods of the present invention are able to stratify patients according to the above scale with respect to those patients requiring hospitalisation, i.e. having a WHO scale of COVID-19 disease severity score of 3 or above, compared to those patients who do not require hospitalisation, i.e. having a WHO scale of COVID-19 disease severity score of below 3. For any patient sample, therefore, the methods of the invention provide a means to differentiate between patients requiring hospitalisation and those who do not require such treatment.
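The stratification described above may be expressed as a simple rule, sketched below. The threshold of 3 is taken from the preceding paragraph; the function names and patient identifiers are hypothetical.

```python
HOSPITALISATION_THRESHOLD = 3  # per the text: a score of 3 or above


def requires_hospitalisation(who_score):
    """True if the WHO severity score is at or above the threshold."""
    return who_score >= HOSPITALISATION_THRESHOLD


def stratify(scores_by_patient):
    """Split patients into those requiring hospitalisation and those who do not."""
    groups = {"hospitalisation": [], "no_hospitalisation": []}
    for patient, score in scores_by_patient.items():
        key = "hospitalisation" if requires_hospitalisation(score) else "no_hospitalisation"
        groups[key].append(patient)
    return groups


# Placeholder patient identifiers and scores for illustration only.
example_groups = stratify({"P1": 0, "P2": 3, "P3": 6, "P4": 2})
print(example_groups)
```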
The methods of the invention therefore provide for the classification and prediction of COVID-19 severity. Clearly, the predictive aspect of the methods is of great value to patients and to health systems in terms of being able to prioritise utilisation of resources and assess the outcome of an infection.
Accordingly, the methods of the invention provide for the detection of a patient requiring therapy for the treatment of COVID-19. The patient may be symptomatic or asymptomatic with respect to infection by SARS-CoV-2 virus. The therapy may be for treatment of COVID-19 disease with a WHO severity score of 3 or above. The therapy may comprise a drug therapy, or oxygen therapy. The oxygen therapy may be non-invasive or invasive. The oxygen therapy may comprise mechanical ventilation of the patient.
The test of the invention may suitably be performed as a blood test. Such a test may be conducted by collection of venous blood or potentially by a finger-prick blood collection device or a bloodspot card or plasmaspot card. The method of the present invention may be suitably performed on a sample of blood plasma or serum. In one embodiment, the test is conducted on citrate plasma (e.g. plasma to which sodium citrate has been added), but different plasma sample additives (anticoagulants) may also be used, for example, K2 and K3 EDTA plasma tubes, heparin, potassium oxalate/sodium fluoride treated plasma tubes and others. Samples based on serum, whole blood (venous or peripheral) or bloodspot or plasma-spot samples, cerebrospinal fluid (CSF), interstitial fluid, lymph fluid, urine, faeces and/or tissue biopsies may also be used according to the invention.
Any of the tests described herein may be multiplexed with another test measuring proteins in a biological sample, e.g. blood plasma proteins, with the same technology platform. Thus, the test of the present invention may be a part of a larger test. If other disease severity tests emerge, the test of the present invention could be used to augment such other tests to enhance overall performance.
The test of the present invention not only provides for an assessment of the presence of COVID-19 disease but also allows for a prediction of the severity of disease. The test provides for prediction of the need for specific treatment options (i.e. mechanical ventilation). The test also predicts whether the patients are likely to survive or not if they have severe disease and does so on average 39 days before outcome.
As a sample type, plasma is also difficult to prepare for mass spectrometry applications. However, in the present invention samples based on plasma have been used successfully. The plasma sample preparation in the methods of the present invention avoids problems of the prior art (analytical signal suppression and variability) by selecting the optimal peptides for analysis with corresponding heavy isotope-labelled internal standards with digestion efficiency control tags and calibration curves. Without wishing to be bound by theory, plasma samples (specifically from venous blood) may be the most suitable form of sample for use in a method of the present invention. Other sources of plasma obtained from finger prick collection or bloodspot samples may also be used.
The same plasma proteins may also be measured using a different technology platform. ELISA, SIMOA, Olink, Western Blot, or other immunoassay platforms may be used to measure the same protein set. Aptamers (oligonucleotide or peptide molecules that bind to a specific target molecule) can be used instead of antibodies in similar assays.
The proteins from which the peptide sequences are derived are set out in Table 1 described herein. The method of the present invention may be a multiplexed assay. In contrast to earlier experimental uses of mass spectrometry in the study of COVID-19 disease in patients, the present invention suitably uses targeted mass spectrometry.
As set out herein, a different peptide set to that shown in Table 1 could also be measured from the same proteins as shown in Table 1 using the same targeted proteomics platform but using a different protease. The examples of the present invention described herein show one embodiment of a method of the invention, however different peptides from the same set of 31 proteins may be used. This includes peptides generated using the same protease as in this test (trypsin) or different proteases (LysC, GluC etc).
Typically, the mass spectrometry analysis of peptides according to a method of the present invention is liquid chromatography-targeted mass spectrometry (LC-MS) using triple quadrupole instruments, operated in timed multiple reaction monitoring (MRM) mode.
A different mass spectrometry platform could be used to measure the same set of analytes. Whilst triple quadrupole mass spectrometry platforms, operated in timed MRM mode, are preferred for this test, there could be other mass spectrometry platforms or other data acquisition modes used to design around this test. One example is high resolution mass spectrometry platforms (e.g. Sciex 6600, Thermo Orbitrap or Waters QTOF-type instruments) where “pseudo” MRM or PRM modes could be used for measurements. Triple quadrupole instruments are robust and provide reproducible data so may be a preferred technology.
A wide variety of chromatography systems can be coupled to a mass spectrometry platform. The test of the present invention may be suitably used with “standard flow” (ml/min) liquid chromatography systems but lower flow systems may be used—in a range of μl/min or nl/min.
The method may comprise a targeted LC-MRM or LC-SRM assay for use on a conventional triple-quadrupole mass spectrometer. The mass spectrometer may be running routine, typical chromatography. The flow rate may be about 800 μL/min.
Various peptide ionisation interface methods can be used prior to mass spectrometry analysis according to a method of the invention. The present invention may suitably use electrospray ionisation (ESI). However, the peptides can also be ionised using matrix-assisted laser desorption/ionization (MALDI) or desorption electrospray ionization (DESI) or atmospheric-pressure chemical ionization (APCI) or other ionisation methods.
The methods of the present invention therefore comprise the use of targeted proteomics. The technique comprises the quantification of specific, pre-selected proteins or proteolytic peptides from a given sample and requires a pre-existing understanding of disease biology to guide protein selection. The technique is therefore distinct from discovery proteomics, which seeks to gather information about all proteins and proteolytic peptides in a sample without pre-existing knowledge or hypotheses around disease biology. Internal standards or calibration lines for every protein or peptide of interest are not and cannot be used in discovery proteomics. Discovery proteomics is also conducted using different instrument operation modes, e.g. SWATH, HDMSE etc., compared to MRM or PRM in targeted proteomics. Data processing also uses a different approach to signal normalisation and quantification, in which absolute concentration cannot be provided. Discovery proteomics platforms lack the robustness and reproducibility of targeted proteomics platforms. Targeted and discovery proteomics are distinct to the extent that a team of scientists utilising discovery proteomics platforms is generally not able to develop a targeted proteomics biomarker test without specific knowledge and experience in targeted proteomics. This is due to the above-mentioned and other differences at every stage of the process, from initial concepts, sample preparation, data acquisition and processing to final test implementation in a clinical setting.
The methods of the invention are performed using a corresponding labelled reference peptide or labelled reference proteolytic peptides and may be configured to use any suitable internal standard on the same targeted proteomics platform. The methods of the invention may suitably be used with heavy isotope-labelled internal standards for accurate measurements. Examples of internal standards in the form of heavy isotope-labelled proteolytic peptides are shown in Table 1 with respect to the proteolytic peptides described therein. The internal standard may be added to the sample before a proteolytic digestion of the proteins in the sample has occurred in step (i) of the methods of the invention. Alternatively, the internal standard may be added to the sample after a proteolytic digestion of the proteins in the sample has occurred in step (i) of the methods of the invention. The mass spectrometer may be pre-configured using said internal standard. Suitable heavy-isotope labels are 13C, 15N and/or 2H.
The method may comprise predicting and/or classifying the severity of COVID-19 disease in a subject using a trained machine learning model. Such a machine learning model may be used to predict and/or classify the severity of COVID-19 disease in a subject based on data such as one or more of: proteolytic peptide profiles (including individual concentrations for each proteolytic peptide for each patient); clinical scores such as CCI, SOFA, APACHE II and ABCS; patient state and/or descriptor; patient WHO grade; and specific levels of at least one proteolytic peptide. The data may be separated based on the specific WHO scale of the patient.
The machine learning model may be created using various scripts, such as Python or R scripts, to create Support Vector Machine (SVM) models, for example. The machine learning model may therefore comprise an SVM model. The skilled person is familiar with SVM models, and as such, a specific implementation of SVM models will now be described only briefly.
The SVM models may be created using the known and freely-accessible package ClassyFire developed by the Wishart Research Group (http://classyfire.wishartlab.com/), and described in the publication: Djoumbou Feunang Y, Eisner R, Knox C, Chepelev L, Hastings J, Owen G, Fahy E, Steinbeck C, Subramanian S, Bolton E, Greiner R, and Wishart D S. ClassyFire: Automated Chemical Classification With A Comprehensive, Computable Taxonomy. Journal of Cheminformatics, 2016, 8:61. As the skilled person would understand, ClassyFire is a web-based application for automated structural classification of chemical entities. ClassyFire uses an SVM to create the machine learning models. As would be understood, an SVM can perform classification, regression and novelty detection in the creation of the model. Several such models may be created until the most accurate model is found. Validation of the models is achieved using a validation cohort to estimate the Matthews Correlation Coefficient (MCC) value and assess the accuracy of the prediction, as would be understood. These SVM models output accuracy percentages and MCC values after validation.
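For illustration, the accuracy and Matthews Correlation Coefficient reported after validation can be computed from predicted and actual class labels as in the following minimal sketch for a two-class problem. The function names and the labels are hypothetical placeholders, and the convention of returning 0.0 when the MCC denominator is zero is an assumption.

```python
import math


def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn) for binary labels given as 0/1 values."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(actual, predicted):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Placeholder validation labels: 1 = severe, 0 = not severe (illustration only).
actual_labels = [1, 1, 1, 0, 0, 0]
predicted_labels = [1, 1, 0, 0, 0, 1]
print(accuracy(actual_labels, predicted_labels),
      matthews_corrcoef(actual_labels, predicted_labels))
```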
The SVM models are trained based on training data. Such data includes “explanatory” data and “response” data for patients in which the severity of COVID-19 disease is already known. The explanatory data comprises all the data that is used to determine the severity of COVID-19 disease in a subject. For example, the explanatory data may be the levels of at least one proteolytic peptide for a particular subject, including the individual concentrations of proteolytic peptides as obtained from the sample. The response data comprises data indicating the actual severity of COVID-19 disease in a subject. The training data may therefore comprise a tab-delimited table with a training dataset of subjects as columns and proteolytic peptide levels as rows, and a tab-delimited table with a validation dataset of subjects as columns and proteolytic peptide levels as rows.
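A tab-delimited table of the kind described, with subjects as columns and proteolytic peptide levels as rows, could be read into one feature vector per subject as sketched below. The exact layout assumed here (a header row of subject identifiers and a first column of peptide labels) is an assumption, as are the function name and the placeholder values.

```python
import csv
import io


def read_peptide_table(text):
    """Parse a tab-delimited table into (peptide labels, per-subject vectors).

    Assumed layout: the first row holds subject identifiers after a corner
    cell, and each following row holds a peptide label then one level per
    subject.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    subjects = rows[0][1:]
    peptides = [row[0] for row in rows[1:]]
    features = {
        subject: [float(row[index + 1]) for row in rows[1:]]
        for index, subject in enumerate(subjects)
    }
    return peptides, features


# Placeholder table for illustration only.
EXAMPLE_TABLE = (
    "peptide\tS1\tS2\n"
    "PEPTIDE_A\t1.5\t2.5\n"
    "PEPTIDE_B\t0.5\t4.0\n"
)

peptide_labels, subject_features = read_peptide_table(EXAMPLE_TABLE)
print(peptide_labels, subject_features)
```

Transposing the table in this way yields one row of explanatory data per subject, which is the orientation most model-fitting routines expect.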
The method may therefore comprise measuring the levels of at least one proteolytic peptide and predicting and/or classifying the severity of COVID-19 disease, the method comprising inputting the levels of at least one proteolytic peptide into a trained machine learning algorithm, the trained machine learning algorithm being arranged to:
optionally wherein the trained machine learning model is a Support Vector Machine (SVM) model.
The trained machine learning algorithm may be trained based on training data, the training data comprising:
The methods of the present invention may comprise steps performed by a computer and involve equipment controlled by the computer. The step of assaying the proteolytic digest of proteins of step (i) for the presence of a proteolytic peptide may be performed by equipment controlled by the computer.
The invention also provides a computer-implemented method for predicting and/or classifying the severity of COVID-19 disease in a subject, which comprises receiving in a computer sample data representing the level of at least one proteolytic peptide in a sample obtained from a subject and executing software on the computer to compare the level of the at least one proteolytic peptide in the sample to a baseline control, wherein the difference between the level of the at least one proteolytic peptide and the baseline control is indicative of the severity of COVID-19 disease in the subject, and to output severity data representing the severity of COVID-19 disease in the subject on the basis of the comparison.
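The comparison of sample data with a baseline control may be sketched as follows. The sketch reports, for each peptide, the relative difference from the baseline, together with the mean absolute relative difference as one possible summary of dysregulation in either direction. The summary measure, the function name and the values are assumptions made for illustration; no mapping onto a severity grade is implied.

```python
def compare_to_baseline(sample_levels, baseline_levels):
    """Compare peptide levels in a sample with a baseline control.

    Both arguments are dicts of peptide label -> level; baseline levels are
    assumed to be non-zero. Returns the per-peptide relative differences and
    their mean absolute value (a hypothetical dysregulation index).
    """
    relative_differences = {
        peptide: (level - baseline_levels[peptide]) / baseline_levels[peptide]
        for peptide, level in sample_levels.items()
    }
    index = sum(abs(d) for d in relative_differences.values()) / len(relative_differences)
    return {"relative_differences": relative_differences,
            "dysregulation_index": index}


# Placeholder levels for illustration only.
example_sample = {"PEPTIDE_A": 3.0, "PEPTIDE_B": 1.0}
example_baseline = {"PEPTIDE_A": 2.0, "PEPTIDE_B": 2.0}

severity_data = compare_to_baseline(example_sample, example_baseline)
print(severity_data)
```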
The invention also provides a computer program comprising instructions which, when executed by a computer, cause the computer to carry out a computer implemented method of the invention.
It will be appreciated that the step of comparing the level of the at least one proteolytic peptide in the sample with a baseline control may be carried out on a different computer from a computer that initially receives data representing the at least one proteolytic peptide in the sample.
The invention also provides a computer apparatus for assessing the severity of COVID-19 disease in a subject, which comprises a first device incorporating a computer, a second computer and a communication channel between the first device and second computer for the transmission of data therebetween; wherein the first device is arranged to receive sample data representing the level of the at least one proteolytic peptide in a sample obtained from the subject and to transmit the sample data to the second computer via the communication channel, and the second computer is arranged to execute software to compare levels of the at least one proteolytic peptide in the sample to a baseline control to determine the severity of COVID-19 disease in the subject, wherein the difference between the level of the at least one proteolytic peptide and the baseline control is indicative of the severity of COVID-19 disease in the subject, and to output severity data representing the severity of COVID-19 disease in the subject on the basis of the comparison.
The second computer may be arranged to transmit the severity data to the first device via the communication channel, or to a third computer.
In some embodiments, the first device may incorporate mass spectrometry equipment or devices for measuring the level of at least one proteolytic peptide in a sample.
In accordance with a second aspect of the invention, there is provided a method for the treatment of a subject with COVID-19 disease, the method comprising:
Details of the methods of the second aspect of the invention are as for the first aspect as described above.
Suitable treatment of a subject suffering from COVID-19 disease will depend on the severity of the disease state of the subject (i.e. the patient). Treatment may comprise oxygen therapy, supply of oxygen by non-invasive ventilation (NIV) or high flow, intubation and mechanical ventilation, pO2/FiO2≥150 or SpO2/FiO2≥200, mechanical ventilation pO2/FiO2<150 (SpO2/FiO2<200) or vasopressors, or mechanical ventilation pO2/FiO2<150 and vasopressors, dialysis or extracorporeal membrane oxygenation (ECMO).
Patients at all levels of severity of disease may be treated with appropriate therapeutic agents. Suitable therapeutic agents may comprise, but are not limited to steroids, non-steroidal anti-inflammatory agents (NSAIDs) and/or anti-viral agents, and/or derivatives or salts thereof, antibodies, donor plasma, cells and/or products of cells.
Examples of suitable steroids are corticosteroids, such as dexamethasone and/or derivatives or salts thereof (e.g. dexamethasone sodium phosphate).
Examples of suitable non-steroidal anti-inflammatory agents (NSAIDs) are paracetamol (acetaminophen), ibuprofen, diclofenac and/or ketorolac, and/or salts or derivatives thereof.
Examples of suitable anti-viral agents are remdesivir and/or derivatives thereof.
Examples of suitable antibodies are tocilizumab, bamlanivimab, casirivimab, and imdevimab, or combinations thereof.
Examples of cell-based medicines or products thereof are mesenchymal stem cells (e.g. remestemcel-L) and the secretome of cultured amnion-derived epithelial cells (e.g. ST266).
The therapeutically active substance may be administered alone or in combination. Where therapeutically active substances are administered in combination, the administration may be separate, simultaneous or subsequent.
If the patient is characterized as being “Hospitalized Mild Disease” at WHO Score Level 4, then suitable therapy may comprise oxygenation, e.g. oxygen delivered by mask or nasal prongs.
If the patient is characterized as being “Hospitalised Severe Disease” at WHO Score Level 5, then suitable treatment may comprise non-invasive ventilation or high-flow oxygen, which may be supplemented by intubation and mechanical ventilation for patients at WHO Score Level 6.
If the patient is characterized as being “Hospitalised Severe Disease” at WHO Score Level 7, then suitable treatment may comprise ventilation and additional organ support, administration of vasopressors, renal replacement therapy (RRT) and extracorporeal membrane oxygenation (ECMO).
According to a third aspect of the present invention, there is provided a kit for predicting and/or classifying the severity of COVID-19 disease in a subject according to a method of the first aspect of the present invention, comprising:
Optionally, the kit may additionally comprise a biological collection device. The kit may also comprise instructions for use.
Suitable media may comprise one or more of the following selected from the group consisting of a peptide storage solvent (for example comprising aqueous acetonitrile, optionally comprising 50% acetonitrile/50% ddH2O), a denaturation buffer (for example comprising urea, ammonium bicarbonate and dithiothreitol (DTT), optionally comprising 8M Urea, 100 mM ammonium bicarbonate, 50 mM DTT); an alkylation agent (for example comprising 2-iodoacetamide (IAA), optionally comprising 100 mM IAA); salts (for example comprising ammonium bicarbonate, optionally comprising 100 mM ammonium bicarbonate), and/or a surrogate matrix (for example bovine serum albumin (BSA), optionally comprising 40 mg/ml bovine serum albumin in ddH2O).
Such a kit may include materials for use in mass spectrometry analysis of protein samples, e.g. protease(s), diluents and/or other media, optionally further comprising instructions for use of the kit in a method of the invention. The sample collection device may comprise additional preservatives and/or stabilisers and/or reference peptide standards, i.e. synthetic natural peptides or internal standards, such as heavy-isotope labelled peptides as described herein.
The kits of the invention may also be used to assess the efficacy of new therapeutics or vaccines in clinical or pre-clinical studies for the treatment of COVID-19 disease.
According to a fourth aspect of the invention, there is provided a pharmaceutical composition comprising a therapeutic agent for use in a method of treatment of a subject with COVID-19 disease, wherein the COVID-19 disease of the subject has been classified according to a method of the first aspect of the invention.
According to a fifth aspect of the invention, there is provided a kit for use in the treatment of a subject with COVID-19 disease according to a method according to the second aspect of the invention, comprising:
The test of the present invention advantageously provides an accurate measurement of COVID-19 disease severity. The test is set up on a robust, targeted mass spectrometry platform and uses synthetic reference and internal standards. The test allows rapid iteration and implementation of predictive protein signatures (e.g. for different patient groups/settings). The sample preparation method of the test is short, simple and can be manual or automated. The protein signature, which this test is based on, was discovered in thousands of proteomes from hundreds of patients with longitudinal follow up and across 3 cohorts which makes it particularly robust.
The present invention provides a test to assess and/or predict COVID-19 disease severity when patients arrive at hospital, including assessing and/or predicting COVID-19 disease severity throughout the time that the patient is being treated at hospital for COVID-19. The test can be used to plan bed occupancy and broader facilities, personnel and equipment use at hospital, including mechanical ventilator and supplementary oxygen use.
The test of the invention can be used to assess the effectiveness of current and future treatment regimens in individual patients at hospital or in primary care. The test can also be used to assess and/or predict COVID-19 disease severity when patients present at a GP surgery (in primary care) in order to direct patients to or away from hospitals.
The present invention provides a test to assess the efficacy of a COVID-19 therapeutic agent and/or vaccine in pre-clinical studies and/or clinical trials and/or post marketing authorisation pharmacovigilance. The COVID-19 therapeutic agent and/or vaccine can be newly developed or re-purposed to treat COVID-19, including convalescent plasma treatment and also different formulations and administration routes of the same therapeutic agent and/or vaccine.
The present invention provides a test to compare properties (e.g. virulence and resulting disease severity) of different or mutated strains (i.e. variants) of SARS-CoV-2 in pre-clinical, animal and human studies, including viral strains known to infect animals and/or humans. The test can be used as a reference method/benchmark for future disease severity tests, which includes cross-validation at test development stages. The test can be used for population/epidemiological studies to determine which sub-groups of the population develop milder or more severe COVID-19, including differences in pre-existing vaccination status in different populations, specifically patients who have previously received a dose of a BCG tuberculosis (TB) vaccine or newly developed COVID-19 vaccines.
Preferred features of the second and subsequent aspects of the invention are as for the first aspect mutatis mutandis.
In one embodiment of the invention, the method of the first aspect may comprise
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Reference is made to the following drawings in which:
Global healthcare systems continue to be challenged by the COVID-19 pandemic, and there is a need for clinical assays that can help to optimise resource allocation, support treatment decisions and accelerate the development and evaluation of new therapies. We developed a multiplexed proteomics assay and performed a stratification and prediction study on 2 cohorts of patients with COVID-19. The assay quantifies 50 peptides, derived from 30 known and newly introduced COVID-19 related protein markers, in a single measurement using analytical flow rate liquid chromatography and multiple reaction monitoring (LC-MRM), on equipment that is broadly available in routine laboratories. Technical analytical validation of the targeted, mass spectrometry-based peptide panel showed that it enables reproducible (i.e. inter-batch CV of 10.9%) absolute quantification of 47 peptides with high sensitivity (i.e. median LLOQ of 1.6 ng/ml) and accuracy (median 97.3%). Applied to two COVID-19 inpatient cohorts treated before and after dexamethasone became standard of care, composed of 30 and 164 patients, respectively, the assay reproducibly captured hallmarks of COVID-19 infection and severity, as it distinguished healthy subjects, mild, moderate and severe COVID-19. In the post dexamethasone cohort, the assay predicted survival with an accuracy of 0.83 (108/130), and death with an accuracy of 0.76 (26/34) in the median 2.5 weeks before the outcome, thereby outperforming compound clinical risk assessments such as SOFA, APACHE II, and ABCS scores. Disease severity and clinical outcomes of COVID-19 patients can be well stratified and predicted by this scalable and standardised protein panel assay that combines known and novel COVID-19 biomarkers. The prognostic value of the peptide panel assay should be prospectively assessed in larger patient cohorts for future support of clinical decisions, including evaluation of sample-flow in routine setting. 
The possibility to objectively classify COVID-19 severity can be helpful for monitoring of novel therapies, especially in early clinical trials.
The clinical presentation of COVID-19 is extremely diverse, ranging from asymptomatic infection to fatal disease, and can change rapidly. Timely assignment of the appropriate level of care substantially improves outcome in COVID-19. Objective and easy-to-apply tools to anticipate the patient's risk of deterioration, maximum disease severity and outcome based on validated biomarkers are fundamentally required, particularly in situations with overstrained healthcare resources. Risk assessment, as informed by established ICU outcome predictors, such as APACHE II or SOFA, as well as prognostic markers that depend on established clinical chemistry, have so far proven to be of limited predictive value in COVID-19. Moreover, the pandemic situation has prompted an unseen amount of repurposing attempts of existing antiviral and immunomodulatory drugs. However, small and rapidly conducted clinical trials can fail to yield reliable results with regard to clinical benefit and patient safety. Plasma and serum proteome studies have recently shifted to the centre stage to provide added value in COVID-19 patient classification and outcome prediction. Proteomic assays could overcome limitations surrounding the accuracy of early clinical testing. However, proteome analyses are thus far restricted to research settings, as the technology employed in discovery proteomics does not meet the requirements of technical stability and ease of implementation as required in the routine laboratory. In order to bridge the gap between research based discovery proteomics and the clinical application of a proteomic marker panel, we used proteomic datasets recorded in a deeply phenotyped COVID-19 patient cohort. We select a panel of 50 peptides derived from 30 proteins whose functions are associated with the COVID-19 host response and which can classify disease severity in COVID-19. 
Assembled on the basis of observational criteria, our panel contains a set of established clinical markers, but is also based on new protein markers that are not in use in the routine so far. We then develop and analytically validate a scalable and standardised proteomic panel assay that may be performed on instrumentation common in certified laboratories. Applying the assay to two independent cohorts, we demonstrate accurate disease classification, and show that the marker panel is prognostic about outcome. There is value in using the human plasma proteome in severity classification, risk assessment and outcome prediction in COVID-19. Missing so far is a translation of this research evidence into a routinely applicable assay. We show that a proteomic marker panel, which predicts survival in COVID-19 with high accuracy, can be used in routine laboratory testing. The described proteomic marker panel has the potential to substantially improve clinical risk assessment for patients with COVID-19 by translating discovery proteomics findings to patient care.
The invention will now be further described with reference to the Examples which are presented for the purposes of reference only and are not to be construed as limitations on the invention.
1. Venous blood was collected from patients suspected of having or diagnosed with COVID-19.
2. EDTA plasma was prepared from the collected blood samples using standard hospital pathology laboratory protocols.
3. Plasma samples were prepared for mass spectrometry analysis by spiking with heavy isotope-labelled peptides shown in Table 1 and digestion with trypsin to release tryptic peptides from plasma proteins for analysis on the targeted LC-MS/MS platform.
4. LC-MS/MS analysis was performed on the prepared plasma samples by measuring tryptic peptides. This occurred on a pre-configured targeted LC-MS/MS system which was “programmed” according to the specifications in Table 2 to detect and quantify a specific set of peptides in Table 1.
5. Tryptic peptide concentration in μg/ml or ng/ml was determined for all 52 analytes using heavy isotope-labelled internal standards and synthetic reference peptides at known concentration as shown in Table 1.
6. Tryptic peptide concentration with a patient's age was used as input in a linear regression statistical model to generate an overall risk score for each individual patient.
7. The test results were sent to a treating clinician in order to make a treatment decision, taking other clinical readouts and their judgment into account.
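Step 6 above can be illustrated with a minimal sketch of a linear risk score. The peptide names, coefficients, age coefficient and intercept below are hypothetical placeholders chosen for illustration only; they are not values from the described model, which would be fitted to cohort data.

```python
def risk_score(peptide_conc, age, coefficients, age_coefficient, intercept):
    """Linear combination of peptide concentrations and patient age.

    All coefficient values passed in are assumed to come from a
    previously fitted linear regression model (hypothetical here).
    """
    score = intercept + age_coefficient * age
    for peptide, concentration in peptide_conc.items():
        score += coefficients[peptide] * concentration
    return score


# Hypothetical model parameters and sample values (illustration only).
example_coefficients = {"PEPTIDE_A": 0.5, "PEPTIDE_B": -0.25}
example_sample = {"PEPTIDE_A": 4.0, "PEPTIDE_B": 8.0}

score = risk_score(
    example_sample,
    age=60,
    coefficients=example_coefficients,
    age_coefficient=0.01,
    intercept=1.0,
)
print(round(score, 3))
```

In such a sketch a positive coefficient raises the score as the peptide concentration increases, and a negative coefficient lowers it, which accommodates both up- and down-regulated markers.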
(i) solution preparation:
(ii) Internal standard Spike Solution (Peptide Mix A)
Final volume of pooled IS mix was 1.04 ml.
(iii) Calibration Curve Preparation
(iv) Calibration Sample Tryptic Digestion
Abstract
Global healthcare systems continue to be challenged by the COVID-19 pandemic, and there is a need for clinical assays that can both help to optimize resource allocation and be used to monitor clinical trials. Several recent studies reported a huge potential for plasma and serum proteomics in classifying COVID-19 disease severity and to provide accurate outcome prediction weeks in advance. However, these investigations are so far based on shotgun and discovery proteomic platforms using nanoflow chromatography and relative quantification, which are difficult to implement in clinical settings and routine laboratories. Here, we present an absolute quantitative proteomic panel assay for the assessment of COVID-19 disease severity and outcome. For the ease of implementation, the assay is based on analytical flow rate chromatography and multiple reaction monitoring (LC-MRM), and runs on equipment available in routine and CLIA laboratories. We demonstrate classification of severe COVID-19 patients according to treatment choices in two cohorts. Moreover, we show that the panel assay substantially outperforms established risk assessments such as the Charlson comorbidity Index, the SOFA score, and the Apache II in predicting COVID-19 outcome. Our study hence shows that the combination of discovery and targeted plasma proteomics based on analytical-flow rate chromatography has become a powerful platform that facilitates the rapid translation of proteomic findings into routinely applicable clinical assays.
Introduction
COVID-19 continues to challenge healthcare systems worldwide despite vaccination efforts and novel treatments. This is particularly apparent in areas with limited vaccine uptake or supply. Currently, the outlook remains uncertain even in countries with high vaccination rates as the immunity conferred by the vaccines appears to diminish over time (2-5) and SARS-CoV-2 variants with varying capacity to evade vaccine induced immunity continue to emerge (2, 6-8).
Biomarker tests that can classify patients, predict disease severity and are prognostic could help optimise resource allocation, specifically during a pandemic (9-11). Indeed, clinical manifestation of COVID-19 is highly variable, which creates challenges in timely clinical decision making. For instance, ‘happy hypoxia’ describes situations where COVID-19 patients report minor impairments, while molecular indicators, such as blood oxygen levels, indicate they are, in fact, severely ill (12). Furthermore, in situations when healthcare systems reach maximum capacity, prognostic markers could support difficult clinical decisions, for instance, to navigate through triaging situations (13). Prognostic and disease severity tests could further help to increase the likelihood of success and accelerate clinical trials, by improving the assessment of treatment efficacy of COVID-19 therapies or by stratifying patient populations that are to be included in the trials. Indeed, COVID-19 prompted the rapid repurposing and development of new treatments. However, during a pandemic, there is pressure to conduct trials in a timely manner. Furthermore, specifically in ICU settings, study cohorts are limited in size, and when underpowered, clinical trials can lead to false positive and false negative results (14, 15). So far, however, the reliability of several risk assessment scores conventionally used in ICU settings, such as the Acute Physiology And Chronic Health Evaluation (APACHE II) and Sequential Organ Failure Assessment (SOFA) scores, appears to be limited for COVID-19 cases (16). Moreover, combinations of generic clinical readouts, e.g. blood oxygen saturation, interleukin-6 concentration, have been considered for COVID-19 outcome prediction at various disease severity stages (Vatansever and Becer 2020); however, several predictive models based on clinical parameters and routine tests were reported to be vulnerable to bias and not suitable for the clinic (17,18).
Protein biomarker signatures from plasma present a promising alternative. Proteomic datasets have repeatedly been successful to classify and predict COVID-19 severity and outcome (9, 10, 16, 19-21). For instance, we recently presented a proteomics biomarker signature that stratified COVID-19 severity grades and successfully predicted disease progression and outcome, such as survival of hospitalised COVID-19 patients (13, 22, 23). Furthermore, specifically early in the pandemic, proteomics successfully characterized the antiviral host response, which greatly improved our understanding of the COVID-19 disease (9, 10, 16, 19-21, 23).
The success of shotgun proteomics in the clinical routine is so far however limited for technical and economic reasons. Most of the research platforms require a high degree of expert knowledge to use and they are also susceptible to interference and batch effects. Moreover, with only recent exceptions (15,22), most discovery proteomics platforms make use of nano-flow chromatography, which limits the throughput of proteomics while creating high maintenance efforts. The objective of this study was to translate selected biomarkers from this discovery based approach into a routinely applicable proteomics platform that could be deployed for clinical use within existing regulatory frameworks and on broadly available analytical instruments. Triple quadrupole mass spectrometers coupled to high flow liquid chromatography were the optimal choice for rapid test development and deployment as they are used in the clinic in other areas, for instance rare diseases, newborn screening and steroid hormone analysis (24-27). Further, they are widely available in large hospital laboratories, diagnostic laboratories and contract research organizations. Biomarker tests developed on this platform can be accredited to existing regulatory standards in GCP, ISO:17025, ISO:15189 and CLIA environments, standardised and transferred across different instruments, manufacturers and laboratories, and thus deployed at scale rapidly. Triple quadrupole mass spectrometry-based tests are cost effective to run at scale as the sample preparation is amenable to automation, consumables costs are typically <£5 per test and the instrument uptime is typically >95%. The tests also integrate with existing workflows at clinical and analytical laboratories.
We have mined discovery proteomics data from COVID-19 patients, and selected peptides that are informative about COVID-19 disease progression and severity. The biomarkers were chosen for i) being prognostic of remaining duration of hospitalisation, disease aggravation, or being differentially concentrated in plasma depending on the treatment escalation level, used as a measure of disease severity, ii) participating in biological processes that contribute to COVID-19 pathology, and iii) being technically and analytically suitable for the assay. The COVID-19 severity biomarkers chosen include proteins that function in inflammation (e.g. C-reactive protein), coagulation and vascular dysfunction (e.g. von Willebrand factor), complement cascade (e.g. Complement C1q subcomponent subunit C) and diverse biological processes detected to be altered by COVID-19 (e.g. Cystatin C). We then established a targeted proteomics assay utilising multiple reaction monitoring (MRM) data acquisition mode. Employing calibration curves with synthetic reference and stable isotope labelled internal standards, the assay provides absolute quantification of 50 surrogate tryptic peptides arising from 30 plasma proteins (Table 5). These include proteins that function in inflammation (e.g. C-reactive protein), coagulation, vascular dysfunction (e.g. von Willebrand factor), complement cascade (e.g. Complement C1q subcomponent subunit C) and other biological processes detected to be altered by COVID-19 (e.g. Cystatin C, which is a known marker of kidney dysfunction). We have validated the assay technically, and implemented it in two analytical laboratories employing two triple quadrupole LC-MS/MS platforms from different vendors. The assay was applied to two cohorts with a total of XX patients and XX samples.
We demonstrate that the presented MRM assay captures host response to SARS-CoV-2 and, based on it, classifies and predicts COVID-19 disease severity and is predictive about outcome in severely ill individuals. We found that the biomarker panel is predictive about survival weeks before outcome and outperforms several commonly used risk assessment scores.
Results
Peptide Selection and Optimisation of an MRM Based, Targeted COVID-19 Biomarker Assay.
To select peptides for a COVID-19 biomarker panel assay, we used shotgun plasma proteomic data recorded on a deeply phenotyped COVID-19 patient cohort treated at Charité Universitätsmedizin Berlin (13,22). On the basis of xx proteomes recorded for xx patients, we selected 50 proteotypic peptides that corresponded to 30 plasma proteins. These peptides were chosen based on their ability to predict the future worsening of COVID-19 over time, the remaining time in hospital for respective COVID-19 patients, which in turn serves as a treatment-insensitive proxy for COVID-19 severity (
To quantify the 50 proteotypic peptides, we obtained synthetic reference standards. For each peptide 2 synthetic standards were manufactured: 1) with a natural isotope distribution (‘native’) and 2) with a C-terminal stable isotope-labelled (SIL) amino acid (either [13C6,15N2]-lysine or [13C6,15N4]-arginine) to act as an internal standard. Internal standards contained a short tryptic tag to account for the digestion efficiency (Table 5). The native peptides were employed to optimise liquid chromatography-mass spectrometry (LC-MS/MS) data acquisition method and quality of the Q1/Q3 (MRM) transitions. To select optimal MRM transitions for each peptide, we first predicted the transitions (consisting of several precursor ion charge states and respective product ions) using Skyline v21.1.0.146 (30). We infused each native peptide solution into the LC-MS/MS system and selected 1 precursor ion per peptide with the highest relative intensity and 5 most abundant product ions for collision energy optimization which was also provided by Skyline. From these 5 product ions, we ultimately selected 2-5 experimentally optimised ion transitions per native peptide based on i) highest relative signal intensity, ii) optimal chromatographic peak shape and iii) absence of interfering signals. Product ions of <300 Da were excluded where possible to ensure specificity. Precursor and product ion-matched SIL internal standard transitions were included. Lastly, all selected transitions were combined into one dynamic MRM method, which was subsequently analytically validated. In the designed method, the most abundant transition for each peptide is used for quantification and 1-4 less abundant transitions are used as qualifiers (Tables 6 & 7). In order to establish the assay on a chromatographic system that could be run in a routine setting, we chose analytical flow-rate (800 μl/min) reversed phase chromatography using a 1290 Infinity II (Agilent) binary pump.
The selected 50 peptides were well distributed along an 8.6-minute linear gradient, and quantifiable with the chosen total runtime of 10 minutes (
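The assignment of quantifier and qualifier transitions described above can be sketched as follows. This is an illustrative simplification under stated assumptions: it ranks transitions by signal intensity only, whereas the selection described above also considered chromatographic peak shape and interfering signals. The ion labels, masses and intensities are hypothetical.

```python
def assign_transitions(transitions, min_product_mass=300.0, max_qualifiers=4):
    """Pick one quantifier and up to `max_qualifiers` qualifier transitions.

    `transitions` is a list of (label, product_mass, intensity) tuples.
    Product ions below `min_product_mass` are excluded where possible,
    i.e. only if at least one heavier product ion remains.
    """
    preferred = [t for t in transitions if t[1] >= min_product_mass]
    candidates = preferred if preferred else list(transitions)
    # Most abundant transition first.
    ranked = sorted(candidates, key=lambda t: t[2], reverse=True)
    quantifier = ranked[0]
    qualifiers = ranked[1:1 + max_qualifiers]
    return quantifier, qualifiers


# Hypothetical product ions for one peptide (illustration only).
example_transitions = [
    ("y3", 250.1, 9000.0),  # intense, but below the 300 Da cutoff
    ("y5", 560.3, 7000.0),
    ("y6", 673.4, 8000.0),
    ("b4", 410.2, 3000.0),
]

quantifier, qualifiers = assign_transitions(example_transitions)
print(quantifier[0], [q[0] for q in qualifiers])
```

In this example the most intense ion is skipped because it falls below the mass cutoff, so the next most abundant transition becomes the quantifier.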
Analytical Validation of the Peptide Panel MRM Assay.
To evaluate the general applicability of the method for a clinical assay we tested performance of the selected peptides/transitions with respect to intra- and inter-batch repeatability, linearity, limits of quantification, accuracy, and potential matrix effects.
First, we determined the intra- and inter-batch repeatability. We calculated the coefficient of variation (CV) between three independently prepared replicate calibration curves.
These were constructed from serial dilutions of native peptide standards in BSA, covering a concentration range of ~5×10⁵, and measured in technical pentuplicates (N=15). We used BSA as a surrogate matrix to test the analytical performance achieved on the standards in the absence of the endogenous plasma peptides (31) and achieved a median inter-batch CV of 10.9% across low (LLOQ), medium ((LLOQ+ULOQ)/2) and high (ULOQ) concentration points (Table 8). CV was determined from the response ratios, calculated by dividing native peptide area by internal standard peptide area. Additionally, we determined the limits of quantification as the highest and lowest concentration points on the linear calibration curve where the CV of the inter-batch repeatability was ≤20%. 37/50 peptides fulfilled the ≤20% CV cutoff requirement. Since analytical validation requirements for clinical assays are purpose and context dependent, and are influenced by the magnitude of change of target analyte levels in control versus disease samples, we subsequently expanded the CV cutoff to ≤40%. This enabled us to determine LOQ of 10 additional peptides and prevented missing values in downstream data analysis. Thus, we eventually determined limits of quantification and generated calibration curves for 47 peptides, with a median LLOQ of 1.6 ng/ml. Obtained calibration curves also showed excellent linearity within the determined LOQs (R²>0.99) and typically spanned 3-4 orders of magnitude (Table 8).
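The repeatability calculation described above can be sketched as follows. This is a minimal illustration, assuming the CV is computed with the sample standard deviation and that the limits of quantification are simply the lowest and highest calibration levels passing the CV cutoff; the replicate values are hypothetical.

```python
from statistics import mean, stdev


def response_ratio(native_area, internal_standard_area):
    """Native peptide peak area divided by internal standard peak area."""
    return native_area / internal_standard_area


def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation)."""
    return 100.0 * stdev(values) / mean(values)


def limits_of_quantification(ratios_by_concentration, cv_cutoff=20.0):
    """Return (LLOQ, ULOQ): lowest and highest levels with CV <= cutoff.

    Returns None if no calibration level passes the cutoff.
    """
    passing = [
        concentration
        for concentration, ratios in ratios_by_concentration.items()
        if percent_cv(ratios) <= cv_cutoff
    ]
    if not passing:
        return None
    return min(passing), max(passing)


# Hypothetical response ratios per calibration level in ng/ml.
example_ratios = {
    0.5: [0.010, 0.020, 0.015],   # CV about 33%, fails the 20% cutoff
    1.6: [0.030, 0.033, 0.031],
    100.0: [2.0, 2.1, 1.9],
    1000.0: [20.0, 21.0, 19.5],
}

print(limits_of_quantification(example_ratios))
print(limits_of_quantification(example_ratios, cv_cutoff=40.0))
```

Relaxing the cutoff from 20% to 40% extends the quantifiable range in this example, which mirrors how a wider cutoff allows limits of quantification to be assigned to additional peptides.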
As LC-MS measurements can suffer from matrix effects (32), we next evaluated how quantification in surrogate matrix (BSA) compares to human plasma. We compared the slopes obtained from calibration samples measured in commercial human plasma samples (manufacturer), with those measured in the surrogate matrix. 41/47 quantified peptide biomarkers that passed the above described validation showed no statistically significant matrix effects when comparing the slopes between matrices (P>0.05) as expected for an assay using SIL internal standards (33,34). 6 peptides (AADDTWEPFASGK (SEQ ID No: 35), ASDTAMYYCAR (SEQ ID No: 50), ATEHLSTLSEK (SEQ ID No: 25), EQLSLLDR (SEQ ID No: 12), GDVAFVK (SEQ ID No: 13), IADAHLDR (SEQ ID No: 29)) differed significantly and matrix factor (slope plasma/slope BSA×100%) was calculated. The matrix factor was within the acceptable +/−20% limits and these peptides were included in the final method.
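The matrix factor calculation described above can be sketched as follows. The slope values are hypothetical, and the interpretation of the +/−20% limits as a matrix factor between 80% and 120% is an assumption made for this illustration.

```python
def matrix_factor(slope_plasma, slope_surrogate):
    """Matrix factor in percent: slope in plasma / slope in BSA x 100."""
    return slope_plasma / slope_surrogate * 100.0


def within_limits(factor_percent, tolerance_percent=20.0):
    """True if the matrix factor lies within 100% +/- the tolerance.

    Reading the acceptance limits as 80-120% is an assumption.
    """
    return abs(factor_percent - 100.0) <= tolerance_percent


# Hypothetical calibration slopes for one peptide (illustration only).
factor = matrix_factor(slope_plasma=0.92, slope_surrogate=1.00)
print(round(factor, 1), within_limits(factor))
```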
To test if peptide quantities from actual patient samples would be covered within the (linear) range of the calibration curves, we performed absolute quantification in plasma samples obtained from COVID-19 patients (pooled COVID-19 patient samples of different WHO severity grades; see methods). Absolute quantification was performed by calculating the relative response ratio (peak area ratio of native peptide over its corresponding ISTD peptide) in pooled samples, followed by absolute concentration interpolation from the above described BSA calibration curve. The peptide concentrations obtained from patient samples were covered within the determined linear range of the assay and were measured with a median accuracy of 97.3% across low (LLOQ), medium ((LLOQ+ULOQ)/2) and high (ULOQ) concentration points (Methods; Table 8).
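The interpolation step described above can be sketched as follows. This is a minimal illustration assuming an unweighted least-squares linear calibration curve; the calibration points and peak areas are hypothetical, and a validated method may use a different regression weighting.

```python
def fit_line(concentrations, ratios):
    """Unweighted least-squares fit of ratio = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpolate_concentration(native_area, istd_area, slope, intercept):
    """Convert a native/ISTD peak area ratio into a concentration."""
    ratio = native_area / istd_area
    return (ratio - intercept) / slope


def percent_accuracy(measured, nominal):
    """Measured concentration as a percentage of the nominal value."""
    return 100.0 * measured / nominal


# Hypothetical calibration curve in surrogate matrix (ng/ml vs response ratio).
slope, intercept = fit_line([1.0, 10.0, 100.0, 1000.0], [0.02, 0.2, 2.0, 20.0])

# Hypothetical peak areas from a pooled plasma sample.
concentration = interpolate_concentration(5000.0, 10000.0, slope, intercept)
print(round(concentration, 2), round(percent_accuracy(concentration, 25.0), 1))
```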
Diagnostic Performance of the Peptide Panel Assay in an Early COVID-19 Cohort.
Next, we assessed how the absolute concentration of the selected potential biomarkers changed as a function of the COVID-19 treatment escalation level, a proxy for disease severity, as expressed by the WHO ordinal severity scale. To establish this relationship, we applied the MRM assay on a COVID-19 cohort that was also analysed with a discovery proteomics platform (23). This cohort was sampled as citrate plasma (in contrast to EDTA plasma used in the cohorts employed to select the peptides), and included healthy controls, as well as samples from patients hospitalized during the first wave of the pandemic between March 2020 and XX with mild to severe forms of the disease (23).
For robust and reliable sample preparation, we employed the recently presented procedure that enables tryptic digest and solid phase extraction in a semi-automated way (22). 40/50 peptides could be reliably detected and quantified on the Agilent 6495c LC-MS/MS platform (Table 9). The concentration of 32 of these both up- and down-regulated peptides changed with the severity of the COVID-19 disease according to treatment escalation: i.e. from uninfected (WHO 0) to mild (WHO 3), moderate (WHO 4, 5) and severely (WHO 6, 7) COVID-19 affected individuals (
Analytical Cross-Platform and Cross-Laboratory Validation.
To evaluate transferability of the assay, samples from the above described cohort were further employed for a cross-platform, cross-laboratory validation of the LC-MS/MS assay. Post-digested samples were analyzed on a triple-quadrupole platform from a different vendor, with independently optimised MRM transitions (Table 7), in a different laboratory. For the majority of selected peptides, we obtained an excellent correlation between the concentration measured in respective samples on both instruments (
Similarly, the absolute quantities obtained for individual peptides were highly similar, with some exceptions where there was a high correlation, but some discrepancies in the obtained absolute concentration, suggesting differences in the employed calibration curves (see for instance the peptide CNLLAEK (SEQ ID No: 27) in
Overall, as we detected and quantified the majority of peptides on both platforms with high precision, the peptides that differentiate between different stages of COVID-19 can be quantified on both platforms, demonstrating the general applicability of the assay independent of the employed analytical platform.
Severity Stratification and Prediction of Disease Progression in a Longitudinal COVID-19 Cohort.
After establishing that the assay successfully captured hallmarks of COVID-19 disease in a small patient cohort, we next measured a large, longitudinal collection of samples obtained during the second wave of the pandemic in Germany. This cohort was selected because i) it provided a large number of samples, and thus more statistical power to evaluate the potential of the MRM assay to stratify disease severity, and ii) the longitudinal nature of the study allowed us to assess the predictive value of the MRM panel. Reassuringly, despite the large number of samples acquired (n=655 including quality controls), split over three batches and measured over 10 days, variation was low and no significant batch effect was observed (
To evaluate the ability of the whole MRM panel to differentiate COVID-19 severity, we first analyzed the earliest sample obtained for each patient of this cohort (n=165). We were able to reliably detect and quantify 47/50 peptides, and the majority (33 peptides; 11 upregulated, 22 downregulated) showed a significant trend across patients along the WHO ordinal outcome scale for clinical improvement, with patients spanning WHO scores from relatively mild (WHO 3) to very severe (WHO 7) (Supplementary
To further explore the possibilities to classify COVID-19 severity with the MRM assay, we constructed a support vector machine trained to differentiate between three different treatment groups: WHO grade 3 (mild COVID-19, hospitalised, but no oxygen therapy necessary), WHO grade 4/5 (severe COVID-19, hospitalised, non-invasive oxygen therapy) and WHO grade 6/7 (severe COVID-19, hospitalised, mechanical ventilation).
The data were split into a training and a validation set in a cross-validated manner. The SVM successfully predicted the WHO grade in the validation set (
Finally, we examined the capability of the measured peptides for predictions. An outcome predictor (SVM) was trained in a cross-validated manner to differentiate patients who survived COVID-19, from patients who had a fatal disease (First sample taken for each patient, n=165, n_died=34) (
Plotting the predicted outcome with respect to the time until the death (Kaplan-Meier survival analysis,
To evaluate how well the predictor performs compared to clinically established scores, we determined the Sepsis-related Organ Failure Assessment score (SOFA), the Acute Physiology And Chronic Health Evaluation (APACHE II) and the Charlson Comorbidity Index (CCI), all of which are in clinical use. The proteomics predictor significantly outperformed all three clinical scores (AUC of the ROC curves). The SOFA score and the APACHE II score, which are directly linked to the severity of the patient's condition, performed best among the three established scores tested. For both SOFA and APACHE II scores, one can observe an early enrichment of critical cases who will die. However, because these scores capture only the current state of the disease, a future prediction, especially for non-critical patients, cannot be performed reliably.
In conclusion, we demonstrate the development of a fast, multiplexed, sensitive LC-MS peptide assay that captures hallmarks of COVID-19 disease severity and progression. The assay is easily translatable, sensitive, and specific, and thus well suited for rapid translation into a clinical test. Further, the fast and multiplexed nature of the presented method gives suitable throughput for large patient cohorts or routine diagnostic applications.
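The comparison by AUC can be illustrated with a minimal sketch. It assumes only that each predictor yields one numeric score per patient and that the outcome is binary; the function name `roc_auc` and the toy values are hypothetical and are not data from the study. The AUC is computed here as the fraction of (fatal, survived) pairs that a score ranks correctly, which is equivalent to the area under the ROC curve.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fatal outcome, 0 = survived.
outcome = [0, 0, 1, 1]
predictor_a = [0.1, 0.4, 0.35, 0.8]   # illustrative scores only
predictor_b = [0.2, 0.6, 0.3, 0.5]    # illustrative scores only

auc_a = roc_auc(outcome, predictor_a)
auc_b = roc_auc(outcome, predictor_b)
```

A predictor with the larger AUC separates fatal from non-fatal cases better over all decision thresholds; an AUC of 0.5 corresponds to chance-level ranking.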
Discussion
Novel infectious diseases such as COVID-19 that lack immediate treatment options can quickly challenge health systems on a global scale. Considering the enormous pressure on intensive care units, there is an urgent clinical need for assays that capture and monitor the individual response of patients. Such personalized tests can support an objective clinical decision making that otherwise depends on confounders like age, and could guide development of novel treatments. While a range of such assays and clinical scoring systems have been applied, they often rely on a limited number of biomarkers that do not necessarily capture all major features associated with complex diseases such as COVID-19.
Here, we presented how the combination of high-throughput proteomics for identification of biomarker signatures, followed by the development and validation of a targeted, clinically translatable and scalable LC-MS/MS based assay is a powerful strategy to rapidly transition from a discovery approach to a platform that can be deployed in a clinical setup.
Targeted mass spectrometry using multiple reaction monitoring (MRM) is a method of choice for quantification of multiplex or multiparametric marker panels in biofluids. This is because LC-MRM i) provides excellent sensitivity and specificity, ii) has the possibility to easily include internal standards that give the assay precision and enable control over potential matrix effects (33, 34), iii) facilitates absolute quantification, enabling cross-platform transferability (35, 36), and iv) has a large dynamic range (4 orders of magnitude of linear range in the presented assay) (37), which makes it possible to compare biomarkers with large abundance differences within one run, facilitating multiplexing of many biomarkers in parallel; further, it simplifies sample preparation as matrix depletion is not necessarily required (38). Given the simplicity, flexibility, and multiplexability of LC-MS based MRM assays, the initial panel of peptides can be large, increasing the chance of finding a reliable set of reproducible biomarkers based on hierarchical filtering and selection of the most suitable markers during establishment of the targeted assay (39,40). Finally, although LC-MRM assays require analytical expertise and are labor-intensive to set up, they are highly cost-effective to run. In our study, we also overcome a common limitation of proteomic assays for routine use: their dependency on low flow rate chromatography (Bian et al. 2020; Song et al. 2017; Gao et al. 2020). Exploiting the high sensitivity of contemporary triple-quadrupole mass spectrometers, we demonstrate the accurate quantification of the peptide panel using analytical flow rate chromatography, which is not only robust and fast but also available in typical clinical laboratories, greatly simplifying the application of our assay in routine use.
The developed biomarker panel includes 50 proteotypic peptides derived from 30 plasma proteins. The peptides were selected from discovery proteomic data (13, 22, 23, 28) for their ability to predict the patient stay in hospital and to indicate the likelihood of future worsening, and were found to change in abundance depending on the treatment escalation level according to the WHO ordinal scale, introduced in April 2020 (41) and used as a measure of COVID-19 disease severity. These proteins were all found to belong to biological processes important for the COVID-19 host response, such as the coagulation system, the complement cascade or metabolism. The assay is hence monitoring processes causal to the disease progression. In this study, we established the assay on two routine-laboratory compatible LC-MRM platforms, and performed an elaborate analytical validation procedure of the technical aspects of the assay. We demonstrate excellent reproducibility across two different mass spectrometric technologies, and reveal for each peptide the optimal measurement parameters, dynamic range, as well as sensitivity to matrix effects. As such, the assay could easily be translated into a multiplexed, high-throughput clinical assay that captures hallmarks of the COVID-19 pathophysiology.
We confirm in two independent patient cohorts that the panel assay captures the disease severity of individuals affected by COVID-19 and discriminates between treatment levels. Moreover, the assay is prognostic of the outcome for the most severely affected COVID-19 patients. Thus, the panel assay parameters could be used to assess the current state of the patient, help to monitor efficacy of novel treatments, or stratify patients based on their responsiveness to novel clinical interventions for COVID-19. Furthermore, the assay can be employed to predict the progression of COVID-19 disease, as exemplified by the performed prediction of disease outcome weeks into the future. Such knowledge can help guide clinical decision making and optimise hospital resource planning, specifically in critical situations of the pandemic that threaten the capacities of hospitals.
Methods
Patient Cohorts
Patient samples were collected as described previously (13, 22, 23, 28) as part of a prospective observational cohort study Pa-COVID-19 at Charité—Universitätsmedizin Berlin. The study protocol has been described in detail before (Kurth et al., 2020). The study is registered in the German and the WHO international registry for clinical studies (DRKS00021688).
Reagents and Peptide Standards
Synthetic reference peptides were from Pepmic (Suzhou, China). Native peptides were synthesised at ≥95% purity and internal standard peptides at ≥70% purity. Internal standards contained 4-6 amino acid tryptic tags mimicking the sequence in a corresponding human plasma protein and were labelled on C-terminal lysine (K) or arginine (R) amino acids with stable isotopes (K(U-13C6,15N2) or R(U-13C6,15N4)). Water was from Merck (LiChrosolv LC-MS grade; Cat #115333), acetonitrile was from Biosolve (LC-MS grade; Cat #012078), trypsin (Sequence grade; Cat #V511X) was from Promega, 1,4-Dithiothreitol (DTT; Cat #6908.2) was from Carl-Roth, iodoacetamide (IAA; Bioultra; Cat #11149) and urea (puriss. P.a., reag. Ph. Eur.; Cat #33247) were from Sigma-Aldrich, ammonium bicarbonate (Eluent additive for LC-MS; Cat #40867) and dimethyl sulfoxide (DMSO; Cat #41648) were from Fluka, formic acid (LC-MS Grade; Eluent additive for LC-MS; Cat #85178) was from Thermo Scientific™, bovine serum albumin (BSA) (Albumin Bovine Fraction V, Very Low Endotoxin, Fatty Acid-free; Cat #47299) was from Serva, and commercial human plasma samples (Human Source Plasma, LOT #20CILP1034) were from zenbio.
All peptide stock solutions were prepared at 1 mg/ml concentration in a 50:50 ddH2O:acetonitrile mix, except for STDYGIFQINSR (SEQ ID No: 22) and VEGTAFVIFGIQDGEQR (SEQ ID No: 51), where 200 μl of DMSO were added to solubilise the peptides at 5 mg/ml; these were then diluted to a small aliquot at 1 mg/ml with the 50:50 ddH2O:acetonitrile mix before each sample preparation. Internal standard mix was prepared by pooling 20 μl of each heavy isotope-labeled peptide, evaporating 200 μl of this mix to dryness and reconstituting in a denaturation buffer to the final concentration of 1.4 μg/ml for each peptide. Cassetted calibration curves were prepared by serial dilution of pooled native reference peptide standards as described in Analytical method validation. After serial dilution, these samples were treated identically to respective clinical samples.
Peptide Selection
Peptides were selected based on discovery LC-MS/MS data from our prior studies (Messner et al., 2020; Demichev et al., 2021), where we identified a protein biomarker signature predictive of COVID-19 outcome. We selected a total of 50 proteotypic reference peptides from 30 human plasma proteins (1-2 peptides per protein) that had the highest predictive power in previous studies. The predictive performance in the following statistical tests was used for selection: (i) prediction of the remaining time in hospital for mild patients, (ii) prediction of future worsening for patients of all severity grades and (iii) stratification of patients of different severity grades. For each protein, only peptides that showed a good signal in the SWATH data and were predicted to be suitable for MRM by the Peptide Analyzing Tool (Thermo) were selected. Each native reference peptide was unique to a corresponding protein within the human proteome in a Uniprot BLAST search. Isotope-labelled internal standards were designed based on the selected native reference peptides.
Sample Preparation
Samples were prepared as described previously (Messner et al., 2020) with minor modifications. Briefly, clinical samples and calibration lines in Cohort 1 were prepared as follows: 5 μl of plasma or serum sample were added to 55 μl of denaturation buffer, composed of 5 μl 8M Urea, 100 mM ammonium bicarbonate, 5 μl 50 mM dithiothreitol (DTT) and peptide internal standard mix. The samples were incubated for 1 h at room temperature before addition of 5 μl of 100 mM iodoacetamide (IAA). After a 30 min incubation at room temperature the samples were diluted with 340 μl of 100 mM ammonium bicarbonate and digested overnight with 23 μl of 0.1 μg/μl trypsin at 37° C. The digestion was quenched by adding 50 μl of 10% v/v formic acid. The resulting tryptic peptides were purified on a 96-well C18-based solid phase extraction (SPE) plate (BioPureSPE Macro 96-well, 100 mg PROTO C18, The Nest Group). The purified samples were resuspended in 120 μl of 0.1% formic acid and 20 μl were injected into the LC-MS/MS system.
Samples in cohort 2 were prepared as described above, with one modification. As these samples had already been prepared for a discovery proteomics study, internal standards were digested separately and added to pre-digested clinical and calibration line samples before their injection into the LC-MS/MS system. Quality control (QC) samples consisted of pooled commercial control and COVID-19 human plasma samples (as described in a previous publication (Messner et al. 2020)), and were prepared alongside clinical and calibration curve samples in each cohort.
The COVID-19 sample pools used for the analytical validation were generated by pooling 5 μl of patient plasma from cohort 2 according to their WHO treatment severity score. Only samples of patients that had not received dexamethasone were used.
Liquid Chromatography-Tandem Mass Spectrometry
Tryptic peptides were quantified on 2 liquid chromatography-triple quadrupole mass spectrometry (LC-MS/MS) platforms—7500 (Sciex) and 6495c (Agilent).
6495c (Agilent) LC-MS/MS Method
All clinical samples were analysed on the Agilent 6495c mass spectrometer, coupled to Agilent 1290 Infinity UHPLC system. Prior to MS analysis, samples were chromatographically separated on Agilent InfinityLab Poroshell 120 EC-C18 1.9 μm, 2.1×50 mm column heated to 45° C. and with a flow rate of 800 μl/min. Linear gradients employed were as follows (time, % of mobile phase B): 0 min, 3%; 1 min, 3%; 7.5 min, 35%; 8 min 98%; 8.5 min, 98%; 8.6 min, 3%; 10 min, 3% where mobile phase A & B are 0.1% formic acid in water and 0.1% formic acid in acetonitrile respectively.
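As a reading aid for the gradient table above, the following minimal sketch linearly interpolates the percentage of mobile phase B at any time point of the 10 min run. The breakpoints are copied from the method description; the function name `percent_b` and the piecewise-linear interpretation between breakpoints are assumptions made for illustration.

```python
# Gradient breakpoints (time in min, % mobile phase B) as listed for the 6495c method.
GRADIENT_6495C = [
    (0.0, 3.0), (1.0, 3.0), (7.5, 35.0), (8.0, 98.0),
    (8.5, 98.0), (8.6, 3.0), (10.0, 3.0),
]


def percent_b(t, gradient=GRADIENT_6495C):
    """Return % mobile phase B at time t (min), assuming linear segments."""
    if t < gradient[0][0] or t > gradient[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


mid_gradient = percent_b(4.25)   # halfway through the 1 to 7.5 min ramp
```

The same function can be applied to the 7500 (Sciex) method by passing its own list of breakpoints as `gradient`.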
The 6495c mass spectrometer was controlled by Agilent's MassHunter Workstation software (LC-MS/MS Data Acquisition for 6400 series Triple Quadrupole, Version 10.1) and was operated in positive electrospray ionisation mode with the following parameters: 3500 V capillary voltage (positive), 0 V nozzle voltage (positive), 12 L/min sheath gas flow at a temperature of 280° C., 17 L/min gas flow at a temperature of 170° C., 40 psi nebulizer gas flow, 166 V default fragmentor voltage, 5 V default cell accelerator potential. Samples were analysed in dynamic MRM mode with both quadrupoles operated in unit resolution. All other MRM parameters, including monitored transitions and scheduling, are provided in Table 6.
7500 (Sciex) LC-MS/MS Method
Samples from the Kubler cohort (cohort 2) were analysed on a SCIEX 7500 mass spectrometer coupled to an ExionLC AD UHPLC system (SCIEX, UK) in addition to the analysis on the Agilent platform. Prior to MS analysis, samples were chromatographically separated on a Phenomenex Luna Omega Polar 3 μm, 100×2.1 mm column heated to 40° C. and with a flow rate of 500 μl/min. Linear gradients employed were as follows (time, % of mobile phase B): 0 min, 3%; 1 min, 3%; 7.5 min, 30%; 8 min 95%; 8.5 min, 95%; 8.6 min, 3%; 10 min, 3% where mobile phase A & B are 0.1% formic acid in water and 0.1% formic acid in acetonitrile respectively.
The 7500 triple quadrupole mass spectrometer was operated in positive electrospray ionisation mode with the following ion source parameters: 1750 V ion spray voltage, 40 psi curtain gas, 40 psi ion source gas 1, 70 psi ion source gas 2 and 500° C. temperature. Samples were analysed in MRM mode with both quadrupoles operated in unit resolution. All other MRM parameters, including monitored transitions and scheduling, are provided in Table 7.
Establishment of the MRM Based Assay
The assay was first set up on the 6495C (Agilent) system. Preliminary transitions for the 50 selected putative biomarkers (consisting of several precursor ion charge states and respective product ions) were predicted by Skyline v21.1.0.146. The native peptide standard solution was then infused into LC-MS/MS system and 1 precursor ion per peptide with the highest relative intensity and 5 most abundant product ions were selected for collision energy optimisation as provided by Skyline. From these 5 product ions, 2-5 experimentally optimised ion transitions per native peptide were ultimately selected in the panel based on the following criteria: i) highest relative signal intensity, ii) optimal chromatographic peak shape and iii) absence of interfering signals. Product ions of <300 m/z were excluded where possible to ensure specificity. Precursor and product ion-matched ISTD transitions were also included. Lastly, all selected transitions were combined into one scheduled MRM method, which was subsequently analytically validated. In the designed method, the most abundant transition for each peptide was used for quantification and 1-4 less abundant transitions were used as qualifiers (data not shown). For analytical cross-platform and cross-laboratory validation, the assay was set up on the 7500 (SCIEX) system in parallel following that approach (data not shown).
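The intensity-based part of the transition selection described above can be sketched as follows. This is an illustrative simplification: it ranks product ions by signal intensity, excludes ions below 300 m/z where enough other ions remain, and assigns the most abundant ion as quantifier and the rest as qualifiers. Peak shape and interference (criteria ii and iii) require inspection of chromatograms and are not modelled. The function name and the example m/z and intensity values are hypothetical.

```python
def select_transitions(product_ions, min_mz=300.0, max_transitions=5,
                       min_transitions=2):
    """product_ions: list of (m/z, intensity). Returns (quantifier, qualifiers)."""
    if not product_ions:
        raise ValueError("no product ions supplied")
    ranked = sorted(product_ions, key=lambda ion: ion[1], reverse=True)
    chosen = [ion for ion in ranked if ion[0] >= min_mz]
    # "Where possible": fall back to low-m/z ions only if too few ions remain.
    if len(chosen) < min_transitions:
        low_mz = [ion for ion in ranked if ion[0] < min_mz]
        chosen += low_mz[: min_transitions - len(chosen)]
        chosen.sort(key=lambda ion: ion[1], reverse=True)
    chosen = chosen[:max_transitions]
    return chosen[0], chosen[1:]


# Hypothetical product ions (m/z, intensity) for one peptide.
ions = [(250.1, 9000), (480.3, 7000), (611.4, 12000), (724.4, 3000), (305.2, 500)]
quantifier, qualifiers = select_transitions(ions)
```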
Mass Spectrometry Data Processing
Mass spectrometry data processing was performed with vendor-specific software: MassHunter Quantitative Analysis, v0.1, Agilent Technologies and SCIEX OS Software v2.0.1. Peak selection and integration were manually assessed before exporting the peak area values to .csv for further analysis. Peptide absolute concentration was determined from calibration curves, constructed with native and heavy isotope-labelled synthetic reference standards. Of note, SIL internal standards for 5 corresponding native peptides could not be detected on the 6495C system. To quantify these native peptides, we used other, closely eluting SIL internal standards in the assay: AADDTWEPFASGK(U-13C6,15N2) (SEQ ID No: 87) was used for ASDTAMYYCAR (SEQ ID No: 50), GYSIFSYATK(U-13C6,15N2) (SEQ ID No: 73) for GSPAINVAVHVFR (SEQ ID No: 34) and WEMPFDPQDTHQSR (SEQ ID No: 11), ANRPFLVFIR(U-13C6,15N4) (SEQ ID No: 90) for LAELPADALGPLQR (SEQ ID No: 49) and VSASPLLYTLIEK(U-13C6,15N2) (SEQ ID No: 97) for VEGTAFVIFGIQDGEQR (SEQ ID No: 51). In addition, due to low signal intensity of pre-assigned quantifier transitions (transitions with matched precursor and product ions across native and SIL peptides), we chose other transitions with higher signal intensity for 4 SIL peptides, even if they did not match the fragmentation pattern of their respective native peptides. The transitions used for quantification on the 6495C and 7500 LC-MS/MS platforms are shown in Tables 6 and 7, respectively. Linear regression analysis of each calibration curve was performed in RStudio or Sciex OS (with 1/x weighting) and the respective peptide concentration in patient samples was expressed in ng/ml.
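The 1/x weighted linear regression and the back-calculation of concentrations can be sketched as follows. This is a minimal illustration, not the vendor or RStudio implementation used in the study. It assumes that the response is a single numeric value per calibration sample (for example a native to internal standard peak area ratio, which is an assumption here) and it omits the zero-concentration calibrator because the weight 1/x is undefined at x=0. Function names and numbers are hypothetical.

```python
def fit_weighted_line(conc, response):
    """Weighted least squares with weights 1/x; returns (slope, intercept)."""
    pts = [(x, y) for x, y in zip(conc, response) if x > 0]  # 1/x undefined at 0
    w = [1.0 / x for x, _ in pts]
    sw = sum(w)
    swx = sum(wi * x for wi, (x, _) in zip(w, pts))
    swy = sum(wi * y for wi, (_, y) in zip(w, pts))
    swxx = sum(wi * x * x for wi, (x, _) in zip(w, pts))
    swxy = sum(wi * x * y for wi, (x, y) in zip(w, pts))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return slope, intercept


def back_calculate(resp, slope, intercept):
    """Convert a measured response into a concentration using the fitted line."""
    return (resp - intercept) / slope


# Hypothetical calibrators (ng/ml) with an exactly linear response y = 2x + 1.
calib_conc = [0.0, 1.0, 2.0, 4.0, 8.0]
calib_resp = [1.0, 3.0, 5.0, 9.0, 17.0]
slope, intercept = fit_weighted_line(calib_conc, calib_resp)
unknown_conc = back_calculate(11.0, slope, intercept)
```

Weighting by 1/x gives the low-concentration calibrators more influence on the fit than an unweighted regression would, which matters for calibration ranges spanning several orders of magnitude.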
Analytical Method Validation
Analytical method validation was performed based on FDA Bioanalytical Method Validation criteria (42), assessing sensitivity, specificity, intra- and inter-repeatability, accuracy and matrix effects. A total of 5 independent calibration curves were prepared by serial dilution of native reference peptide standards in assay buffer (1), surrogate matrix (3) and pooled human plasma (1) across the final sample peptide concentration range of 0-1.63 μg/ml. Surrogate matrix (40 mg/ml BSA) calibration curves were prepared and analysed across 3 separate batches and all calibration curve samples were analysed in quintuplicate. Linear 1/x weighted calibration curves were obtained for all native peptide reference standards in order to check the linearity of the response. The lower limit of quantification (LLOQ) was defined as the lowest calibration sample on the linear curve with a CV≤20%. For ten peptides with low endogenous concentrations observed in a pooled clinical subset of samples, the LLOQ criterion was relaxed to CV≤40% to prevent missing values, as these peptides were highly differentially expressed between COVID-19 severity groups and were successfully quantified. The upper limit of quantification (ULOQ) was defined as the highest calibration sample on the linear curve with a CV≤20%. Accuracy was assessed by treating 1 of the 5 replicates in each calibration curve in the surrogate matrix as pseudo-unknown samples and quantifying with the curve generated from the remaining 4 replicates. The final accuracy was determined as the median of all calculated accuracies of each peptide. Matrix effects were measured by comparing the slopes of curves from calibration samples prepared in a BSA matrix and pooled human plasma. Here, an extra sum-of-squares F test was used for statistical comparison, with a p-value<0.05 indicating potential matrix effects.
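The CV-based limits of quantification can be illustrated with a short sketch. It assumes that replicate measurements are available per calibration level and that the CV is the sample standard deviation divided by the mean; whether the CV was computed on responses or on back-calculated concentrations is not specified above, so that choice is left open. The linearity requirement is not checked here. Function names and values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(replicates):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def quantification_limits(levels, max_cv=20.0):
    """levels: dict mapping nominal concentration -> list of replicate values.

    Returns (LLOQ, ULOQ) as the lowest and highest level with CV <= max_cv,
    or (None, None) if no level passes.
    """
    passing = sorted(c for c, reps in levels.items() if cv_percent(reps) <= max_cv)
    if not passing:
        return None, None
    return passing[0], passing[-1]


# Hypothetical replicate measurements at three calibration levels.
levels = {
    1.0: [0.7, 1.0, 1.3],        # CV of about 30 %
    10.0: [9.0, 10.0, 11.0],     # CV of 10 %
    100.0: [95.0, 100.0, 105.0], # CV of 5 %
}
strict_limits = quantification_limits(levels, max_cv=20.0)
relaxed_limits = quantification_limits(levels, max_cv=40.0)
```

In this toy example the relaxed 40% criterion lowers the LLOQ by one calibration level, which mirrors the rationale given above for the ten low-abundance peptides.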
Data Analysis and Visualisation, Statistics
Significance testing of the trend between absolute peptide concentrations and the ordinal classification as provided by the WHO treatment escalation scale (levels as indicated) was performed using Kendall's tau (KT) statistics as implemented in the “EnvStats v2.4.0” R package “kendallTrendTest” function. For cohort 1 the KT statistics was calculated as the trend of peptide quantities against the following WHO groups: 0, 3, 4, 5, 6, 7; for cohort 2, peptide quantities were tested against the following WHO groups: 3, 4, 5, 6, 7. A full summary of statistical test results is provided in Table 9. Multiple testing correction was performed by controlling for false discovery rate using the Benjamini-Hochberg procedure (43) as provided by the R package “stats v4.1.0”-“p.adjust” function. Principal component analysis was performed using the R function “prcomp” from the “stats 4.1.0” package and visualized using “ggplot2 v3.3.5”.
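For illustration, the Benjamini-Hochberg adjustment of a list of p-values can be written in a few lines. This is a sketch of the standard step-up procedure and is intended to mirror the behaviour of the R function named above; it is not the code used in the study, and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Process p-values from largest to smallest to enforce monotonicity.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position                 # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four peptides.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```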
Prediction of WHO Grade and Disease Outcome
Clinical scores (CCI [39], SOFA [40], APACHE II [41], and ABCS [22]) were extracted from the clinical information system or, where missing, manually calculated. CCI and APACHE II were determined at time of admission, while SOFA was calculated for time of sampling. ABCS was calculated for admission and time of sampling. For ABCS, up to two missing laboratory values (either lymphocytes, blood urea nitrogen (BUN) or aspartate aminotransferase (ASAT)) were imputed by using the median value of patients within the same maximum WHO severity group. Note that, due to this imputation, information leakage between training and test data cannot be excluded for the ABCS score models. For the prediction of the current WHO grade and for the outcome prediction, a Support Vector Machine was used as implemented in scikit-learn 0.23.2 (sklearn.svm.SVC) (44) using default parameters (rbf-kernel) and balanced class weights (class_weight=“balanced”). For one peptide (VSASPLLYTLIEK (SEQ ID No: 45)), negative values were present after external calibration. Those values were replaced by the minimal positive value of the respective peptide measured over all samples. Two peptides were removed (ASDTAMYYCAR (SEQ ID No: 50) and LVGGPMDASVEEEGVRR (SEQ ID No: 9)) as they were not reliably quantified, leaving 48 peptides for the analysis. For every patient, the first sample measured was selected (n=165). All patients with unknown WHO grade/outcome were excluded. All data were log2-transformed and scaled to 0 mean and 1 variance, with the scaling fitted on the training data (sklearn.preprocessing.StandardScaler). The model was trained and validated using a shuffled stratified 5-fold cross-validation (sklearn.model_selection.StratifiedKFold) to ensure that every split has a comparable case-to-control ratio and that every sample was used in 4 runs for training and in the remaining run for validating a model trained without this sample. For reproducibility, the seed was fixed to 42.
For models trained on severity scores, only samples for which the respective score was determined were included in model construction and testing.
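The preprocessing and fold assignment described above can be sketched with the Python standard library alone. This is an illustration of the logic only: the classifier itself (sklearn.svm.SVC) is omitted, the fold assignment does not reproduce the exact splits produced by scikit-learn for the same seed, and all function names and values are hypothetical. As a small extension of the description, zero values are also replaced, since the log2 transform is undefined at zero.

```python
import math
import random
from statistics import mean, pstdev


def replace_nonpositive(values):
    """Replace values <= 0 by the smallest positive value of the same peptide."""
    floor = min(v for v in values if v > 0)
    return [v if v > 0 else floor for v in values]


def fit_scaler(train_values):
    """Mean and population standard deviation, fitted on training data only."""
    return mean(train_values), pstdev(train_values)


def transform(values, mu, sigma):
    """Scale values using parameters fitted on the training data."""
    return [(v - mu) / sigma for v in values]


def stratified_folds(labels, n_splits=5, seed=42):
    """Assign each sample to one fold so that class ratios stay comparable."""
    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, i in enumerate(members):
            fold_of[i] = position % n_splits   # deal samples round-robin
    return fold_of


# Hypothetical concentrations of one peptide; one value is negative after calibration.
cleaned = replace_nonpositive([-1.0, 2.0, 8.0])
logged = [math.log2(v) for v in cleaned]
mu, sigma = fit_scaler(logged)
scaled = transform(logged, mu, sigma)

# Outcome labels with the class sizes reported above (34 fatal of 165 patients).
labels = [1] * 34 + [0] * 131
folds = stratified_folds(labels)
```

In each of the 5 runs, the samples of one fold serve as validation data and the scaler is fitted on the remaining four folds, so that no information from the validation samples enters the scaling.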
Decision function, ROC-Curve, accuracy, sensitivity and specificity were calculated using scikit-learn 0.23.2. For the Kaplan-Meier estimate lifelines 0.26.0 (45) was used. The data were divided in positive and negative predicted cases. For the true positive and false negative predicted cases, the days until death were included in the model. The samples for people who left the hospital alive were censored. Samples with missing time until outcome for patients who died were neglected. The confidence intervals were calculated using Greenwood's Exponential formula as implemented in lifelines 0.26.0 (alpha=0.05).
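The Kaplan-Meier estimate itself follows a simple product rule, sketched below without confidence intervals. The study used the lifelines package; this standalone function is only an illustration of the estimator, with deaths treated as events and patients discharged alive treated as censored. The function name and the toy data are hypothetical.

```python
def kaplan_meier(durations, event_observed):
    """Return [(time, survival probability)] at each time with at least one death."""
    data = sorted(zip(durations, event_observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed   # censored subjects leave the risk set without an event
    return curve


# Hypothetical follow-up in days; 1 = died, 0 = censored (left hospital alive).
days = [1, 2, 2, 3]
died = [1, 1, 0, 1]
km_curve = kaplan_meier(days, died)
```

Applied separately to the cases predicted positive and those predicted negative, this yields the two survival curves that are compared in the analysis.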
In addition, a predictor based on the extra-trees algorithm implemented in scikit-learn 0.23.2 (sklearn.ensemble.ExtraTreesClassifier) was evaluated. The same approach as described above was applied, except that the data were not log2-transformed or scaled, as this is not needed for a tree-based classifier. In addition, the maximal depth of the trees was set to 3 (max_depth=3) to limit overfitting due to the limited data set size.
Reference is made herein to the peptides and proteins of Table 1 which are as follows:
Table 2 shows experimentally optimised targeted LC-MS/MS conditions to monitor native peptides listed in Table 1. The data in Table 2 are based on AB Sciex triple quadrupole LC-MS/MS platforms but are equally transferable to platforms manufactured by other companies. In Table 2, the following abbreviations are used: EP (entrance potential), CE (collision energy), CXP (collision cell exit potential), DP (declustering potential), and RT (retention time).
Table 3 shows experimentally optimised targeted LC-MS/MS conditions to monitor heavy isotope-labelled peptide internal standards listed in Table 1. This is based on AB Sciex triple quadrupole LC-MS/MS platforms.
Table 4 shows peptides in the targeted LC-MS/MS assay ranked by ANOVA p-value. The top 17 peptides are essential for the COVID-19 assay as they show statistically significant concentration changes at different severity grades of COVID-19 disease. The top 17 peptides can be further ranked according to their p-value, where the smallest p-value indicates the most important contribution to COVID-19 disease severity classification and prediction. Other peptides in the table also contribute to the overall assay performance and can also be ranked based on their p-value.
Table 6 shows native and labelled peptide MRM transitions on the 6495c (Agilent) LC-MS/MS platform. Internal standard (labelled) peptide sequences are shown in their post-digestion form (without tryptic tags). In Table 6, the following abbreviation is used: CE (collision energy).
Table 7 shows native and labelled peptide MRM transitions on the 7500 (Sciex) LC-MS/MS platform. Internal standard (labelled) peptide sequences are shown in their post-digestion form (without tryptic tags). The following abbreviations are used in Table 7: EP (entrance potential), CE (collision energy), CXP (collision cell exit potential), DP (declustering potential), and RT (retention time).
Table 8 shows a summary of the analytical validation. The following abbreviations are used in Table 8: LLOQ (lower limit of quantification), ULOQ (upper limit of quantification), and CV (coefficient of variation).
This application claims priority to U.S. Provisional Application No. 63/156,291, filed Mar. 3, 2021, and to U.S. Provisional Application No. 63/283,787, filed Nov. 29, 2021. The disclosures of the prior applications are hereby incorporated by reference in their entireties.