The invention relates to the use of a polymerase chain reaction method (PCR method), especially for detection of the presence of a pathogen. The present invention further relates to the evaluation of qPCR measurements.
DNA strand segments in a substance to be tested, such as, for example, a serum or the like, are detected by carrying out PCR methods in automated systems. Said PCR systems make it possible to amplify and detect particular DNA strand segments to be detected, for example those which can be assigned to a pathogen. A PCR method generally comprises cyclic use of the steps of denaturation, annealing and elongation. In particular, the PCR process involves splitting of a DNA double strand into individual strands and making each of them complete again by attachment of nucleotides in order to reproduce the DNA strand segments in each cycle.
The qPCR method makes it possible to quantify the pathogen load detected using this process. To this end, at least some of the nucleotides are provided with fluorescent molecules which, upon binding to the individual strand of the DNA strand segment to be detected, activate a fluorescence property. After synthesis of the double strands, a fluorescence value dependent on the number of DNA strand segments generated can be determined after each cycle.
During amplification, a qPCR curve can then be determined from the measured fluorescence values (intensity values); this curve has a sigmoidal shape if the DNA strand segment to be detected is present in the substance to be tested. In practice, the measured qPCR curves may contain artifacts, and so multiple parallel measurements are generally carried out so that the qPCR curves can be evaluated more accurately by averaging the measurement values.
According to the invention, a method for carrying out a qPCR method as claimed in claim 1 and a device and a qPCR system as claimed in the alternative independent claims are provided.
Further embodiments are specified in the dependent claims.
According to a first aspect, a method for conducting a quantitative polymerase chain reaction (qPCR) method is provided, comprising the following steps:
The qPCR method comprises cyclic repetition of the steps of denaturation, annealing and elongation. In the denaturation step, the entire double-stranded DNA in the substance to be tested is split into two individual strands at a high temperature. In the annealing step, primers added to the substance bind to the individual strands; these primers specify the starting point of amplification of the DNA strand segments to be detected. In the elongation step, a complementary second DNA strand segment is synthesized from free nucleotides on the individual strands provided with the primer. After each of these cycles, the quantity of the DNA strand segments to be detected has thus ideally doubled.
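The ideal doubling per cycle described above can be written as a short calculation. The following is a minimal sketch in Python that assumes perfect doubling in every cycle; the function names are hypothetical and are not part of the described method.

```python
def ideal_copies(initial_copies: float, cycles: int) -> float:
    """Copies of the target segment after `cycles` ideal cycles (doubling each cycle)."""
    return initial_copies * 2 ** cycles


def initial_copies_from(copies_now: float, cycles: int) -> float:
    """Invert the ideal doubling to recover the starting quantity."""
    return copies_now / 2 ** cycles


# Example: 10 starting copies after 30 ideal cycles.
print(ideal_copies(10, 30))
```

Real reactions fall short of perfect doubling, so this relation is an upper bound rather than a measurement model.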
In the qPCR method, fluorescent molecules are incorporated as labels into the DNA strand segments to be detected, so that measuring the intensity of the fluorescence after each elongation step yields a time plot of the intensity values. The qPCR curve thus obtained comprises three distinct phases: a baseline, in which the fluorescent light emitted by incorporated labels is still indistinguishable from the background fluorescence; an exponential phase, in which the fluorescence intensity rises above the baseline, i.e., becomes visible, the doubling of the DNA strands in each cycle causing the fluorescence signal to rise exponentially in proportion to the quantity of the DNA strand segments to be detected; and a plateau phase, in which the reagents, i.e., the primer and the free nucleotides, are no longer present in the required concentration and no further doubling takes place.
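The three phases can be illustrated with an idealized curve. The sketch below assumes a four-parameter logistic function as a stand-in for the sigmoidal shape; the function name and all parameter values (baseline, plateau, midpoint, slope) are illustrative assumptions and are not taken from the description.

```python
import math


def sigmoid_curve(cycles, baseline=0.1, plateau=1.0, midpoint=25.0, slope=0.5):
    """Idealized qPCR curve: flat baseline, steep rise, then plateau."""
    return [
        baseline + (plateau - baseline) / (1.0 + math.exp(-slope * (c - midpoint)))
        for c in cycles
    ]


cycles = list(range(1, 41))  # 40 cycles, numbered from 1
curve = sigmoid_curve(cycles)

# Early cycles sit at the baseline, late cycles at the plateau.
print(round(curve[0], 3), round(curve[-1], 3))
```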
For the detection of a specified DNA strand segment, which can correspond to a pathogen for example, the so-called ct (cycle threshold) value is relevant. The ct value marks the start of the exponential phase. It is determined either by the qPCR curve exceeding a specific threshold, which has been defined for whichever DNA strand segment is to be detected and which is identical for all samples for that DNA strand segment, or mathematically from the second derivative of the qPCR curve in the exponential phase, in which case it corresponds to the intensity value of the steepest rise of the qPCR curve. If the ct value is known, the starting concentration of the DNA strand segment to be detected in the substance to be tested can be determined by back-calculation.
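Both ways of determining the ct value can be sketched on an idealized curve. The example below makes several assumptions: the threshold of 0.2 is arbitrary, the threshold crossing is linearly interpolated between cycles, and the second-derivative variant is read as the cycle with the largest discrete second difference, which is one common interpretation. All function names are hypothetical.

```python
import math


def ct_by_threshold(curve, threshold):
    """Fractional cycle of the first upward threshold crossing, or None if there is none."""
    for i in range(1, len(curve)):
        if curve[i - 1] < threshold <= curve[i]:
            frac = (threshold - curve[i - 1]) / (curve[i] - curve[i - 1])
            return i + frac  # index i - 1 holds cycle i (cycles start at 1)
    return None


def ct_by_second_derivative(curve):
    """Cycle at which the discrete second difference of the curve is largest."""
    if len(curve) < 3:
        raise ValueError("need at least three intensity values")
    d2 = [curve[i + 1] - 2 * curve[i] + curve[i - 1] for i in range(1, len(curve) - 1)]
    best = max(range(len(d2)), key=d2.__getitem__)
    return best + 2  # d2[0] belongs to index 1, i.e. cycle 2


# Idealized curve (assumed logistic shape, midpoint at cycle 25).
curve = [0.1 + 0.9 / (1.0 + math.exp(-0.5 * (c - 25))) for c in range(1, 41)]
ct_threshold = ct_by_threshold(curve, threshold=0.2)
ct_derivative = ct_by_second_derivative(curve)
print(ct_threshold, ct_derivative)
```

The two definitions generally give different cycle numbers for the same curve, so a system would use one of them consistently across all samples.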
In reality, measured qPCR curves are highly inaccurate and subject to considerable fluctuations. Baseline drift can occur, which refers to a rise of the background fluorescence over the measurement cycles. This means that the fluorescence signal rises even if no amplification is taking place. Further factors that adversely affect the accuracy of the qPCR curve can result, for example, from thermal noise, fluctuations or metering tolerances in the reagent concentration, and air pockets and artifacts in the fluorescence volume.
Conventional qPCR systems, firstly, apply software-based correction to the qPCR curves; secondly, a sample can be measured repeatedly under the same conditions and the resultant qPCR curves smoothed by averaging. However, this requires increased effort.
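The conventional averaging of repeated measurements amounts to a cycle-wise mean. A minimal sketch, assuming all replicate curves cover the same cycles; the function name is hypothetical.

```python
def average_curves(replicates):
    """Cycle-wise mean of several qPCR curves measured under the same conditions."""
    if not replicates:
        raise ValueError("at least one curve is required")
    length = len(replicates[0])
    if any(len(curve) != length for curve in replicates):
        raise ValueError("all curves must cover the same number of cycles")
    return [sum(values) / len(replicates) for values in zip(*replicates)]


# Three short replicate curves (made-up intensity values).
smoothed = average_curves([[0.10, 0.12, 0.40], [0.11, 0.10, 0.44], [0.09, 0.14, 0.42]])
print(smoothed)
```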
It is a concept of the above method to decide with the aid of a data-based classification model, especially in the form of an artificial neural network, especially a deep neural network or a recurrent neural network, especially an LSTM, or in the form of a support vector machine, whether a measured qPCR curve indicates that the DNA strand segment to be detected occurs or does not occur in the substance to be tested. Since measured qPCR curves are generally greatly affected by noise, use of the trained data-based classification model allows more reliable assessment of the qPCR curve shape, even in borderline cases, as to whether the DNA strand segment to be detected occurs or does not occur in the substance to be tested.
The classification model can have been trained to provide, depending on the qPCR curve, a classification result which indicates or does not indicate a presence of a DNA strand segment to be detected. In particular, it is possible to train the data-based classification model on labeled qPCR curves as training data, so that a measured qPCR curve can be appropriately classified. However, this requires extensive training of the classification model with a high number of data sets of manually classified or labeled qPCR curves, since no presuppositions are taken into account.
Furthermore, residual error plots between the measured qPCR curve and a parameterized presence function and a parameterized nonpresence function can be determined, wherein the classification model has been trained to provide, depending on at least one of the residual error plots, a classification result which indicates or does not indicate a presence of a DNA strand segment to be detected. Therefore, it is alternatively possible to fit the measured qPCR curve to parameterized ideal shapes of a qPCR curve in the case of a presence of a DNA strand segment to be detected (presence curve shape) in the substance to be tested and in the case of a nonpresence of a DNA strand segment to be detected (nonpresence curve shape) in the substance to be tested. This is done by parameterization of the presence curve shape and the nonpresence curve shape to the measured qPCR curve. The residual error plots between the parameterized presence curve shape and the parameterized nonpresence curve shape and the actually measured qPCR curve can then be evaluated with the aid of a trained data-based classification model. The classification model is trained to decide whether the respective curve fit is based on a parameterized ideal shape which comes closest to the measured shape of the qPCR curve. This means that, with the aid of the classification model, it is established whether the measured qPCR curve tends to correspond to a qPCR presence curve or qPCR nonpresence curve.
A presence of a DNA strand segment to be detected is established if the classification result based on the residual error plot from the parameterized presence function indicates the presence of the DNA strand segment to be detected. Accordingly, a nonpresence of the DNA strand segment to be detected can be established if the classification result based on the residual error plot from the parameterized nonpresence function indicates the nonpresence of the DNA strand segment to be detected.
The evaluation can be made by inferring a presence of the DNA strand segment to be detected if the classification result confirms a curve fit with the presence curve shape and does not confirm a curve fit with the nonpresence curve shape. Analogously, a nonpresence of the DNA strand segment to be detected is inferred if the classification result confirms a curve fit with the nonpresence curve shape and does not confirm a curve fit with the presence curve shape.
Whereas the first above-described variant avoids possible incorrect or inaccurate basic assumptions about the underlying typical curve shapes and can also depict unknown relationships and dynamics, the second above-described variant, owing to the available domain knowledge, can be used with a classification model which requires a less comprehensive training data set for its training.
Altogether, the use of a data-based classification model leads to a lower number of misclassifications compared to conventional methods.
It can be envisaged that the qPCR method is conducted by
if a presence of the DNA strand segment to be detected is established.
According to a further aspect, a device for conducting a quantitative polymerase chain reaction (qPCR) method is provided, wherein the device is designed to execute the following steps:
Embodiments will be more particularly elucidated below on the basis of the accompanying drawings, where:
In the denaturation step S1, the double-stranded DNA in a substance is broken up into two individual strands at a high temperature of, for example, above 90° C. In a subsequent annealing step S2, a so-called primer is bound to the individual strands at a particular DNA position marking the start of a DNA strand segment to be detected. Said primer represents the starting point of an amplification of the DNA strand segment. In an elongation step S3, the complementary DNA strand segment is synthesized on the individual strands from free nucleotides added to the substance, starting at the position marked by the primer, with the result that the previously split individual strands have been completed to form complete double strands at the end of the elongation step.
By providing the free nucleotides or the primer with fluorescent molecules which exhibit fluorescence properties only when bound to the DNA strand segment, it is possible to measure the intensity of the fluorescence following the elongation step S3. An intensity value is assigned to the measured intensity of the fluorescent light.
The method comprising steps S1 to S3 is executed cyclically and the intensity values are recorded in order to obtain a plot of intensity values as a qPCR curve.
The plot of intensity values ideally has the shape depicted in
In step S11, the qPCR measurement is carried out in order to receive intensity values in consecutive cycles of a qPCR measurement. The number of cycles for the qPCR measurement is about 30 to 60 cycles, preferably 40 cycles. A qPCR curve showing the intensity values (or values derived therefrom) against a cycle index is obtained.
In step S12, the intensity values of the qPCR curve are supplied to a trained classification model. The classification model is in the form of a data-based model, such as, for example, an SVM (support vector machine) or a deep neural network. Alternatively, the data-based classification model can also be formed with a neural network composed of temporal convolutional layers.
The classification model can have been trained with data sets of actually measured qPCR plots, each of which has been assigned a label indicating whether or not the data set (qPCR curve) corresponds to a measurement of a substance containing the DNA strand segment to be detected.
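Training on labeled curves can be illustrated with a deliberately simple stand-in. The sketch below is not the SVM or neural network named in the description: it uses a nearest-centroid classifier so that the example stays self-contained, and it trains on synthetic curves instead of actually measured ones. The curve generator, its noise and drift parameters, and all names are assumptions made for illustration.

```python
import math
import random


def make_curve(present, rng, n_cycles=40):
    """Synthetic curve: noisy, slightly drifting baseline, plus a sigmoid if present."""
    drift = rng.uniform(0.0, 0.002)
    midpoint = rng.uniform(18.0, 30.0)
    curve = []
    for c in range(1, n_cycles + 1):
        value = 0.1 + drift * c + rng.gauss(0.0, 0.01)
        if present:
            value += 0.9 / (1.0 + math.exp(-0.5 * (c - midpoint)))
        curve.append(value)
    return curve


def mean_curve(curves):
    """Cycle-wise mean of a list of curves."""
    return [sum(values) / len(curves) for values in zip(*curves)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(curves, labels):
    """Nearest-centroid 'training': one mean curve per label."""
    present = [c for c, y in zip(curves, labels) if y]
    absent = [c for c, y in zip(curves, labels) if not y]
    return mean_curve(present), mean_curve(absent)


def classify(model, curve):
    """True if the curve is closer to the 'present' centroid than to the 'absent' one."""
    present_centroid, absent_centroid = model
    return squared_distance(curve, present_centroid) < squared_distance(curve, absent_centroid)


rng = random.Random(0)  # fixed seed so the example is reproducible
labels = [i % 2 == 0 for i in range(200)]
curves = [make_curve(y, rng) for y in labels]

model = train(curves[:150], labels[:150])
hits = sum(classify(model, c) == y for c, y in zip(curves[150:], labels[150:]))
accuracy = hits / 50
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic classes here are easy to separate; the benefit of a more expressive model would show only on noisy, borderline curves of the kind the description is concerned with.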
In step S13, it is determined, according to the result of the classification by the classification model, whether the qPCR curve corresponds to a presence or a nonpresence of the DNA strand segment to be detected, i.e., whether the DNA strand segment to be detected is present in the substance or not.
In step S14, the PCR method is executed according to the classification result. In particular, the qPCR method can be conducted by signaling that a ct value is determinable and determining the ct value from the parameterized presence function, if a presence of the DNA strand segment to be detected is established.
In step S21, a qPCR method is used to carry out a qPCR measurement and to determine a qPCR curve through consecutive measurement of intensity values.
In step S22, a specified parametric nonpresence function is first parameterized by fitting the measured qPCR curve to the nonpresence function. For example, the nonpresence function can be a linear function, as depicted in
In step S23, the measured qPCR curve is fitted to a presence function. The presence function corresponds to a parameterized function which substantially corresponds to plot characteristics as depicted in
In step S24, residual error plots of the measured qPCR curve in relation to the parameterized presence function and in relation to the parameterized nonpresence function are determined.
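Steps S22 to S24 can be sketched as two curve fits followed by a subtraction. The example below assumes a linear nonpresence function fitted by ordinary least squares and a four-parameter logistic presence function fitted by a coarse grid search. The logistic form, the grid, the estimation of baseline and plateau from the first and last five values, and all function names are illustrative assumptions, not the fitting procedure of the described method.

```python
import math


def fit_line(cycles, values):
    """Ordinary least-squares fit of values = a + b * cycle; returns (a, b)."""
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, values))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def sigmoid(c, base, top, mid, slope):
    """Four-parameter logistic function used as the assumed presence function."""
    return base + (top - base) / (1.0 + math.exp(-slope * (c - mid)))


def fit_sigmoid(cycles, values):
    """Coarse grid search over midpoint and slope; returns (base, top, mid, slope)."""
    base = sum(values[:5]) / 5   # baseline estimated from the first five cycles
    top = sum(values[-5:]) / 5   # plateau estimated from the last five cycles
    best = None
    for mid in cycles:
        for slope in (0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0):
            sse = sum((y - sigmoid(c, base, top, mid, slope)) ** 2
                      for c, y in zip(cycles, values))
            if best is None or sse < best[0]:
                best = (sse, (base, top, mid, slope))
    return best[1]


def residuals(values, fitted):
    """Residual error plot: measured value minus fitted value, per cycle."""
    return [y - f for y, f in zip(values, fitted)]


cycles = list(range(1, 41))
measured = [sigmoid(c, 0.1, 1.0, 25.0, 0.5) for c in cycles]  # stand-in for a measurement

a, b = fit_line(cycles, measured)
nonpresence_fit = [a + b * c for c in cycles]
presence_params = fit_sigmoid(cycles, measured)
presence_fit = [sigmoid(c, *presence_params) for c in cycles]

nonpresence_residuals = residuals(measured, nonpresence_fit)
presence_residuals = residuals(measured, presence_fit)
```

For a curve with a genuine sigmoidal rise, the residuals against the linear function show a characteristic systematic pattern, while the residuals against the presence function stay close to zero; it is this difference in the residual error plots that a classification model can pick up.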
In step S25, the residual error plots are supplied to a data-based classification model. The classification model has been trained on the basis of training data which assign residual error plots to the corresponding presence function or nonpresence function. This means that the training data indicate whether or not a residual error plot from the parameterized presence function confirms the presence of the strand segment to be detected. Furthermore, the training data indicate whether or not the residual error plot from the parameterized nonpresence function confirms the nonpresence of the strand segment to be detected.
Accordingly, it is established in step S25 with the aid of the residual error plot that a measured qPCR curve indicates a presence of the DNA strand segment to be detected in the substance if the classification model confirms the residual error plot from the presence function. Analogously, it is established with the aid of the residual error plot that a measured qPCR curve indicates a nonpresence of the DNA strand segment to be detected in the substance if the classification model confirms the residual error plot from the nonpresence function. This means that, if the classification result confirms the parameterized presence function or parameterized nonpresence function underlying the respective residual error plot, it is established that the DNA strand segment to be detected is present or is not present, respectively.
If the classification result with regard to the residual error plot is contrary to the parameterized presence or nonpresence function underlying the residual error plot, it can be decided to discard the qPCR measurement, since it is not possible to make a clear decision about a presence or nonpresence of the DNA strand segment to be detected.
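The evaluation of the two classification results can be summarized as a small decision rule. The sketch below is one reading of the logic described above: it assumes that the classification model yields one Boolean confirmation per residual error plot and that every ambiguous combination leads to discarding the measurement. The names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    DISCARD = "discard"


def decide(presence_fit_confirmed: bool, nonpresence_fit_confirmed: bool) -> Outcome:
    """Combine the classification results for the two residual error plots."""
    if presence_fit_confirmed and not nonpresence_fit_confirmed:
        return Outcome.PRESENT
    if nonpresence_fit_confirmed and not presence_fit_confirmed:
        return Outcome.ABSENT
    # Both or neither fit confirmed: no clear decision is possible.
    return Outcome.DISCARD


print(decide(True, False), decide(False, True), decide(True, True))
```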
| Number | Date | Country | Kind |
|---|---|---|---|
| 10 2020 202 360.3 | Feb 2020 | DE | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2021/053656 | 2/15/2021 | WO | |