The present invention relates to a method for determining the presence of a specific nucleic acid according to the introduction of claim 1, a mathematical model for the detection of a specific nucleic acid within a sample and the use of the mathematical model for the decision whether a specific nucleic acid is present within a sample or not.
Among the number of different analytical methods that detect and quantify nucleic acids based upon the sequences contained in said nucleic acids, polymerase chain reaction (PCR) has become the most powerful and widespread technology, the principles of which are disclosed in the U.S. Pat. No. 4,683,195 and U.S. Pat. No. 4,683,102.
Among a plurality of possible applications of the PCR technique one important field is the detection of DNA sequences being responsible for serious medical defects or diagnosis of serious diseases like Hepatitis, AIDS, Human Papillomavirus (which can cause cervical cancer), Chlamydia trachomatis (which can lead to infertility in women) and the like. PCR technology has become an essential research and diagnostic tool for improving human health and quality of life. PCR technology allows scientists to take a specimen of genetic material, even from just one cell, copy its genetic sequence over and over again and generate a test sample sufficient to detect the presence or absence of specific DNA viruses or bacteria or any particular sequence of genetic material.
One very important specific field is the testing of blood in blood donation centres where within a short period thousands of samples of blood have to be tested in order to decide whether a specific series of blood may be used or has to be rejected. In particular for blood testing, it is important to have a quick, easy and absolutely reliable test in order to sort out any contaminated blood and to detect a specific DNA sequence which is responsible for a serious disease as e.g. mentioned above.
Therefore, it has been proposed to use labelled substances which can be added to the PCR mixture before amplification of the DNA and to be used to analyse PCR products during amplification. This concept of combining amplification with product analysis has become known as real time PCR which is disclosed e.g. within the WO/97 46707, WO/97 46712 and WO/97 46714. Furthermore, this technique is disclosed within the EP 0 543 942 as well as within the EP 1 041 158 and EP 1 059 523.
Specifically, fluorescent entities are used which are capable of indicating the presence of a specific nucleic acid and which are capable of providing a fluorescent signal related to the amount of specific nucleic acid present within the reaction mixture. In other words, the forming of further nucleic acid chains during the progress of the PCR can be visually followed due to the fluorescent entities.
A specific method in that respect is using so-called TaqMan probes which are short DNA fragments that anneal to a region located between the primer binding sites of the template DNA. The probes bear at different positions a reporter entity and a quencher entity. The polymerases in the PCR solution are able to break down the TaqMan probes during the doubling of the DNA template. In doing so, they free the quencher entity which then migrates away from the influence of the reporter. Hence the fluorescence of the reporter entity is measurable only if the polymerase has in fact copied the desired DNA strand. Each fluorescing molecule of reporter entity represents a DNA strand that has been formed. TaqMan probes can therefore be used to measure and determine the amount of specific DNA formed at any given time.
At present, for determining whether a specific nucleic acid is present within a sample by using so-called TaqMan probes, the change, preferably the increase, of fluorescence is measured and plotted versus time, preferably the number of cycles during the PCR. If the plotted measured points represent more or less a linear base line, the diagnosis usually is that there is no specific nucleic acid present within the solution. The testing, e.g. of a blood sample, is negative which means that no critical nucleic acid, i.e. DNA or RNA, representing e.g. Hepatitis, AIDS and the like is present. If a deviation of the increase of fluorescence from the linear base line is observed, which means that the curve does include a so-called elbow deviation, the diagnosis is positive, meaning the tested blood sample is contaminated.
But as the kinetics of the PCR reaction are quite complicated, the reaction results require special data analysis because the fluorescence signal level has no simple relation to the amount of input nucleic acid. In the method currently used for diagnosis, in particular of blood samples, some diagnosis results are judged to be negative which in fact might be positive.
Therefore, one subject of the present invention is to create a method for the detection of a specific nucleic acid in a sample, which method is easy to execute, can be completed within a relatively short period, is more reliable and is relatively cheap.
Proposed according to the present invention is a method linked to the wording of claim 1. According to the proposed method, a novel qualitative algorithm is proposed combining the two models of linear versus combined linear and sigmoid curves which are compared statistically. Therefore, by using the PCR technique a labelled substance is added to a sample to be tested containing a sequence complementary to a region of the nucleic acid to be determined to detect whether it is present or not within the mentioned sample. The mixture is maintained under conditions for amplification, e.g. by polymerase chain reaction, and the increase of a signal initiated by the labelled substance and/or the effect initiated by the labelled substance, due to the possible increase of the specific nucleic acid, is measured or determined. The measured increase of signal or effect is plotted against time, e.g. the cycles of the PCR, and the plotted results are analysed by using the mentioned combined regression model.
Compared with the state of the art, the proposed regression model takes into consideration any deflections or deviations of the measured results in relation to the regression model, which means that deflections or deviations of the particular fluorescence signals at each cycle are taken into consideration due to the kinetics of the PCR.
First, a mathematical regression analysis is made with the full data set. A quasi linear regression according to the following formula
f(x) = β1 + β2·x + β3·s(x)
with three regression coefficients is made multiple times. β1 is a constant, β2 is the linear slope and β3 is the size of the sigmoid-like function s(x). The trial function is a linear curve combined with a sigmoid curve with a constant (preset) slope d.
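For a fixed inflection point e and preset slope d, the trial function is linear in β1, β2 and β3, so the three coefficients follow from ordinary least squares. The following is a minimal sketch under stated assumptions: the logistic form of s(x), the helper names `sigmoid`, `solve3` and `fit`, and the synthetic data are all illustrative and are not taken from the specification.

```python
import math

def sigmoid(x, e, d):
    # ASSUMPTION: logistic shape with inflection point e and preset slope d.
    return 1.0 / (1.0 + math.exp(-d * (x - e)))

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def fit(cycles, signal, e, d):
    """Least-squares estimate of (beta1, beta2, beta3) for fixed e and d."""
    rows = [[1.0, x, sigmoid(x, e, d)] for x in cycles]
    # Normal equations (A^T A) beta = A^T y for the design matrix A = rows.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, signal)) for i in range(3)]
    return solve3(ata, aty)

# Synthetic, noise-free traces (illustrative values only).
cycles = list(range(1, 46))
growth = [0.5 + 0.01 * x + 2.0 * sigmoid(x, 30.0, 0.5) for x in cycles]
flat = [0.5 + 0.01 * x for x in cycles]

b1, b2, b3 = fit(cycles, growth, e=30.0, d=0.5)
f1, f2, f3 = fit(cycles, flat, e=30.0, d=0.5)
print(b3, f3)
```

In this synthetic example the fit recovers β3 close to 2 for the trace with growth and close to 0 for the straight line, which is the distinction the sigmoid coefficient is meant to capture.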
This regression is repeated with the inflection point e varying over a preset cycle number range (input parameter). The series of calculated regression coefficients β3 is used for further analysis. In the attached
For the linear and combined curve regression e.g. the following specific mathematical model is proposed:
If the term β3=0, then we have a classical linear regression, meaning that we have a straight line. In such a case the diagnosis is quite simple as we have no accelerated increase of the fluorescence and therefore the straight line is representing the basic fluorescence within the mixture. The slope increase may be caused e.g. by changes of the reagents used in amplification, e.g. the “mastermix”, changes of the pH-value, changes in temperature of the mixture, etc.
In such a case the diagnosis is simple as the result is negative. The null hypothesis β3=0 corresponds to no growth present which would be reported as “negative”. A positive result is indicated by a fluorescence increase starting at any of the amplification cycles which is above the fluorescence baseline. Taking e.g.
According to the present invention, it is now proposed to further take statistical methods, e.g. the t-test of the regression coefficient β3, into consideration. A statistical hypothesis test is made for the sigmoid coefficient β3. The above-mentioned null hypothesis is investigated with e.g. a t-test for the ratio between β3 and the standard error of β3. In this case the t-value t is a normalized deviation calculated as the quotient of the regression coefficient and its standard error.
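The t-value of β3 can be sketched as follows, repeating the fit for each candidate inflection point e. This is an illustration under assumptions, not the specification's implementation: the logistic s(x), the usual least-squares standard error (residual variance times the corresponding diagonal element of the inverse normal matrix), the helper names and the synthetic data with deterministic pseudo-noise are all hypothetical.

```python
import math

def sigmoid(x, e, d):
    # ASSUMPTION: logistic shape with inflection point e and preset slope d.
    return 1.0 / (1.0 + math.exp(-d * (x - e)))

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    z = [0.0] * 3
    for r in reversed(range(3)):
        z[r] = (m[r][3] - sum(m[r][c] * z[c] for c in range(r + 1, 3))) / m[r][r]
    return z

def beta3_t_value(cycles, signal, e, d):
    """Return (beta3, standard error of beta3, t-value) for fixed e and d."""
    rows = [[1.0, x, sigmoid(x, e, d)] for x in cycles]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, signal)) for i in range(3)]
    beta = solve3(ata, aty)
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, signal))
    dof = len(cycles) - 3                      # data points minus 3 coefficients
    var33 = solve3(ata, [0.0, 0.0, 1.0])[2]    # element (3,3) of (A^T A)^-1
    se = math.sqrt(rss / dof * var33)
    return beta[2], se, beta[2] / se

# Synthetic trace with small deterministic pseudo-noise (illustrative only).
cycles = list(range(1, 46))
signal = [0.5 + 0.01 * x + 2.0 * sigmoid(x, 30.0, 0.5) + 0.02 * math.sin(1.7 * x)
          for x in cycles]

# Repeat the regression over a preset range of candidate inflection points.
scan = {e: beta3_t_value(cycles, signal, float(e), d=0.5) for e in range(20, 41)}
for e in (25, 30, 35):
    print(e, scan[e])
```

In this synthetic example the t-value is far larger at the true inflection point (cycle 30) than at a mis-specified one such as cycle 25, because a wrong e inflates the residuals and hence the standard error.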
From the inverse Student t-distribution function a statistical false positive (statistical type I error) probability p can be calculated. A statistical type I error means that the null hypothesis is rejected by the method even though it is true in reality. Other common statistical significance criteria (adjusted R², SIC) lead to similar results. The most significant regression with varying inflection point is chosen based on the smallest t-value of each regression with varying inflection point e for the final result as shown in
In
To judge now whether the diagnosis is positive or negative, a cut-off value for the statistical false positive probability serves for POSITIVE/NEGATIVE discrimination. With this parameter the sensitivity/specificity of the algorithm can be adjusted.
The borderline between negative and positive is of course an empiric value for each specific application which has to be designated by the execution of a plurality of tests in advance.
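The step from a t-value to a POSITIVE/NEGATIVE call can be sketched as below. The assumptions are labelled in the code: a one-sided tail probability is used (a choice made here, not stated in the text), the tail is obtained by simple numerical integration of the Student t density rather than by a statistics library, and the cut-off of 1e-3 is a purely hypothetical placeholder for the empirically determined value.

```python
import math

def t_pdf(t, dof):
    """Density of the Student t-distribution with dof degrees of freedom."""
    c = math.exp(math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2))
    c /= math.sqrt(dof * math.pi)
    return c * (1 + t * t / dof) ** (-(dof + 1) / 2)

def one_sided_p(t, dof):
    """P(T >= t) under the null hypothesis beta3 = 0.

    ASSUMPTION: one-sided test. Computed as 0.5 minus the Simpson integral
    of the density over [0, |t|]; resolution is limited by cancellation for
    very large |t|, so this is a sketch, not a production routine.
    """
    a = abs(t)
    steps = 2 * max(1000, math.ceil(100 * a))   # even number of intervals
    h = a / steps
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    tail = max(0.0, 0.5 - total * h / 3)
    return tail if t >= 0 else 1.0 - tail

def classify(p, cutoff):
    """Report POSITIVE when the false positive probability is below the cut-off."""
    return "POSITIVE" if p < cutoff else "NEGATIVE"

CUTOFF = 1e-3  # HYPOTHETICAL value; the real one is set empirically per assay

for t in (0.8, 2.228, 12.0):
    p = one_sided_p(t, dof=10)
    print(t, p, classify(p, CUTOFF))
```

Lowering the cut-off makes the call more specific at the cost of sensitivity, and raising it does the opposite, which is the adjustment the text describes.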
Furthermore, in
Going back to
The main result of the algorithm is the discrimination between positive and negative. The main result according to the present invention is to judge whether a specific DNA sequence or nucleic acid to be determined within a sample is present or not. In the testing of new samples, the diagnosis whether a blood sample is contaminated by the HIV virus or not can be made easily, quickly and absolutely safely. Of course, the same diagnosis can be made in relation to other defects such as e.g. the nucleic acids representing Hepatitis B and other diseases as mentioned above. The calculated false positive (type I error) probability itself can also serve to estimate the safety of the result. Additionally, some optional estimates of characteristic curve parameters are extracted from this calculation. They might be used for R&D purposes and as possible additional consistency criteria.
Some slight adjustments to the “Sigmoid Regression” algorithm were made to improve the performance. Naturally, negative sigmoid parameters β3 are dropped. In case of no positive sigmoid parameter at all, NEGATIVE is reported. Signals more than 20 cycles after inflection are not used for regression to reduce the false detection of negatives with nonlinear drift. Since the algorithm uses all data points, it is robust to spikes. Therefore, no spike detection is required.
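These adjustments can be sketched as two small helpers. The function names, the tuple layout `(e, beta3, p)` and the cut-off are hypothetical, and selecting the candidate with the smallest false positive probability p is one reading of "most significant regression" adopted here as an assumption.

```python
def truncate_after_inflection(cycles, signal, e, max_after=20):
    """Keep only points up to max_after cycles past the inflection point e."""
    kept = [(x, y) for x, y in zip(cycles, signal) if x <= e + max_after]
    return [x for x, _ in kept], [y for _, y in kept]

def report(candidates, cutoff):
    """Final call from a list of (e, beta3, p) tuples, one per inflection point.

    Candidates with non-positive beta3 are dropped; if none remain, NEGATIVE
    is reported. ASSUMPTION: the remaining candidate with the smallest p is
    taken as the most significant regression.
    """
    positive = [c for c in candidates if c[1] > 0]
    if not positive:
        return "NEGATIVE", None
    best = min(positive, key=lambda c: c[2])
    return ("POSITIVE" if best[2] < cutoff else "NEGATIVE"), best

# Illustrative values only.
cycles = list(range(1, 46))
signal = [0.5 + 0.01 * x for x in cycles]
short_cycles, short_signal = truncate_after_inflection(cycles, signal, e=20)
print(len(short_cycles))

print(report([(25, -1.0, 1e-9), (30, -0.2, 0.3)], cutoff=1e-3))
print(report([(25, -1.0, 1e-9), (30, 1.5, 1e-6), (31, 1.4, 1e-4)], cutoff=1e-3))
```

Dropping negative β3 before the selection matters: a strongly significant downward sigmoid would otherwise win the comparison even though it does not represent growth.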
To get a visual impression of the power of the “Sigmoid Regression” algorithm according to the present invention, the attached graphics, shown in
In
The β3 value of the curve in
In
Comparing the two curves shown in
The β3 value in
In
Even if the t-test value is rather low due to the very high imprecision or the very low β3, the diagnosis would be considered as intermittent. Preceding studies lead to the cut-off value to decide on the reported result.
The “Sigmoid Regression” algorithm according to the present invention is the first one developed especially for qualitative detection. Because it uses all data points for the calculation, it is statistically well founded. However, it is still relatively simple to implement.
Initial algorithm comparison analysis has shown an average increase in sensitivity of two-fold to five-fold for low positive samples of five assays. This is reached without affecting specificity.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all the techniques and apparatus described above can be used in various combinations. All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.
Number | Date | Country | Kind |
---|---|---|---|
04023115.1 | Sep 2004 | EP | regional |