The present invention relates to estimating the initial size of a population of interest in a sample subjected to a succession of amplification reactions.
The present invention finds a particularly advantageous, but non-limiting, application in determining an initial quantity of nucleic acids in a sample subjected to a polymerase chain reaction (PCR) in real time. A technique of this type, known as “PCR quantification”, is used in particular for evaluating the number of copies of pathogenic agents (e.g. of the human immunodeficiency virus (HIV)) in a sample of body fluids taken from a patient, typically in the context of a medical checkup.
Reference is made to
It should be observed that for the initial PCR cycles (first and second portions), the population of interest increases in substantially exponential manner, whereas for the following cycles (third and fourth portions), other phenomena come into competition with growth in the population of interest, so that said growth is then damped up to the plateau stage PLA.
The document “Mathematics of quantitative kinetic PCR and the application of standard curves” by R. G. Rutledge and C. Côté, published in Nucleic Acids Research, 2003, Vol. 31, No. 16, discloses a method of estimating the unknown initial quantity of nucleic acids in a sample of interest by means of PCR. That method consists in using a plurality of samples having known initial quantities of nucleic acids, referred to as “standards”, in order to determine by interpolation the initial quantity of nucleic acids present in the sample of interest.
In general, the greater the initial quantity of nucleic acids in a sample, the sooner a detectable quantity of PCR product is obtained, i.e. the sooner a detectable quantity of emitted fluorescence is obtained. With reference to
Thus, such a Ct cycle, corresponding to the cycle at which the fluorescence measurements reach a fluorescence threshold THR (as shown in
Although that method is in widespread use, it nevertheless presents some drawbacks.
Firstly, it requires the use of a plurality of standard samples having respective known initial populations.
Secondly, the method depends on the judgment of the user, since the fluorescence threshold value, as selected by the user, has a direct influence on the values of the Ct cycles in the amplification curves, and consequently on the estimated values for the initial population size in the sample of interest. The threshold value also has an impact on the accuracy of the result, since accuracy is generally better if the threshold is selected to lie in the exponential growth stage EXP of the amplification curve. Nevertheless, in practice, it is difficult for the user to know whether the fluorescence threshold level THR that has been set does indeed correspond to the exponential stage of the curves, and does so for all of the samples (the standard samples and the sample of interest).
Finally, the method assumes without any verification that the population has the same amplification yield in the sample of interest and in all of the standard samples. Thus, if the sample of interest contains PCR inhibitors, as is typically the case, then its result will be falsely lowered.
It should thus be understood that the prior art technique depends on the fluorescence threshold THR as defined by the user. The value selected has an influence on the values of the Ct cycles and consequently on determining the initial quantity in the sample of interest. That is one of the reasons why a large amount of work has recently been undertaken to automate Ct cycle detection and make it reliable.
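By way of illustration only, the dependence of the Ct cycle on the user-selected threshold can be sketched in Python as follows. The function name `threshold_cycle`, the linear interpolation between cycles, and the curve values are assumptions made for this sketch and are not taken from the cited method.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the fractional cycle at which the curve first reaches `threshold`.

    fluorescence[n] is the reading after cycle n (index 0 = before cycling).
    Linear interpolation between neighbouring cycles is an assumption.
    """
    if fluorescence[0] >= threshold:
        return 0.0
    for n in range(1, len(fluorescence)):
        lo, hi = fluorescence[n - 1], fluorescence[n]
        if lo < threshold <= hi:
            return (n - 1) + (threshold - lo) / (hi - lo)
    return None  # the threshold is never reached


# Hypothetical curve: perfect doubling from 1 unit (illustrative values only).
curve = [2.0 ** n for n in range(11)]

ct_low = threshold_cycle(curve, 8.0)    # threshold set low on the curve
ct_high = threshold_cycle(curve, 64.0)  # threshold set higher on the same curve
```

The same curve thus yields two different Ct values depending solely on the threshold selected, which is the user dependence referred to above.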
The present invention seeks to improve the situation by proposing an approach that is completely different.
Firstly, the invention provides a computer-implemented method for quantifying, in absolute and/or relative manner, an initial population of nucleic acids in a sample of interest. The sample is subjected to a succession of applications of a reaction for amplifying the population of interest. In very general manner, this amplification may be undertaken by implementing successive PCR cycles, although any other amplification technique could also be used. Above all, it should be understood that the amplification needs merely to be defined by a reaction yield, as described below. During these successive amplification operations, experimental measurements are taken that are representative of a current population size, at least in the sample of interest. It will be understood that one or more measurements can be taken after or during each amplification reaction without loss of generality.
In a presently preferred definition of the invention, the method in the meaning of the invention comprises the following steps:
a) providing a model of the yield of the amplification reaction as a function of the succession of amplifications, said model comprising:
the first and second portions being united by a changeover region in which yield changes over between the constant and non-constant stages, said region having an amplification index corresponding substantially to the changeover;
b) using the yield model to express a relationship involving at least the changeover index and a parameter representative of the initial population size in the sample of interest;
c) determining at least the changeover index by comparison with the experimental measurements; and
d) in a subsequent or immediately following step, deducing therefrom the initial population size in the sample of interest.
Other advantages and characteristics of the invention appear on reading the following detailed description of an implementation given below by way of example with reference to the accompanying figures, in which:
Reference is made to
Firstly, it should be understood that
As mentioned above with reference to
The following two assumptions are made:
This decrease in yield may have a variety of explanations, in particular a degradation and/or a lack of PCR reagents (DNA polymerase, dNTPs, primers, etc.) and/or inhibition by the products that are made themselves.
It is assumed herein that the yield is initially constant and that it subsequently decreases. Nevertheless, it should be understood that the invention applies more generally to the context of yield:
In the context of reactions for amplifying the quantity of nucleic acids, it has been found that the yield often changes over from a constant stage to a non-constant stage. In the meaning of the invention, advantage is taken of this observation to deduce therefrom the initial quantity of nucleic acids, as described below in detail. For the moment, it is merely noted that the yield can also change over from a non-constant stage during early cycles to a subsequent constant stage. The present invention is equally applicable to such a circumstance. In general, it should therefore be understood that in the meaning of the invention, a changeover of yield between a constant stage and a non-constant stage is detected.
The objective is to find the initial size of the population that has been subjected to amplification. With reference to
In a completely different approach, the present invention instead makes use of nearly all of the points of the amplification curve in order to determine accurately a region CHO where the yield changes over between a constant stage and a non-constant stage, typically in present circumstances between the exponential stage EXP and the linear stage LIN. It will be understood that measurements are logically less affected by noise in this region CHO than in the background noise exit region since the region CHO occurs during later cycles. Furthermore, particularly because of the mathematical properties associated with yield, it is shown below that, most advantageously, the number of standards that need to be used for quantifying the initial size of the population of interest is smaller than the number of standards used in prior art quantification.
The relationship for associating the changeover region CHO with the initial size of the population of interest is briefly described below. The yield of an amplification reaction is given by:
N_{n+1} = N_n + E_n × N_n
in which:
Reformulating this relationship as a recurrence relationship, we obtain:
N_{n+1} = (1 + E_n)(1 + E_{n−1})(1 + E_{n−2}) … (1 + E_0) N_0
where N0 is the initial size of the population of interest. So long as the yield En is constant, it will be understood that the above relationship can be written more simply as follows:
N_{n+1} = N_0 × (1 + E_0)^{n+1}
where the index n+1 has not yet reached the changeover region CHO. While the yield is constant during the initial cycles, the following applies:
E_n = E_{n−1} = E_{n−2} = … = E_0
where E0 is the value of the yield during the constant stage. Nevertheless, when the index n+1 moves into the changeover region CHO, the relationship becomes:
N_{n+1} = N_0 × (1 + E_0)^{CEEP} × function(CEEP, n+1)
where:
It can thus be seen how it is possible to associate the changeover index CEEP and the initial size N0 of the population of interest. At this stage it can be understood that steps a) and b) of the above-defined method have already been implemented.
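The recurrence and its closed form during the constant stage can be checked numerically with the following minimal Python sketch. The function name `amplify`, the numerical values, and the particular decaying yields used after the constant stage are assumptions chosen purely for illustration.

```python
def amplify(n0, yields):
    """Apply N_{n+1} = N_n + E_n * N_n for each per-cycle yield in `yields`.

    Returns the list [N_0, N_1, ..., N_len(yields)].
    """
    sizes = [n0]
    for e in yields:
        sizes.append(sizes[-1] * (1.0 + e))
    return sizes


# Illustrative values (assumptions): 100 initial copies, a constant yield of
# 0.9 for 10 cycles, then an arbitrary decaying yield for 5 further cycles.
n0, e0, constant_cycles = 100.0, 0.9, 10
decaying = [e0 * 0.5 ** k for k in range(1, 6)]
sizes = amplify(n0, [e0] * constant_cycles + decaying)

# During the constant stage the recurrence collapses to N_0 * (1 + E_0)^n.
closed_form = n0 * (1.0 + e0) ** constant_cycles
```

Beyond the constant stage, each later size equals `closed_form` multiplied by the product of the subsequent factors (1 + E_n), which plays the role of the factor function(CEEP, n+1) in the relationship above.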
A first implementation consists in determining the changeover index CEEP experimentally and in correlating it with the initial size by regression by using a plurality of standard samples that are subjected to the same amplification treatment as the sample of interest. Under such circumstances, it will be understood that steps b) and c) of the above-defined method are merely interchanged since initially the changeover index CEEP (step c)) is determined experimentally, and subsequently the relationship between the index CEEP and the initial size N0 (step b)) is determined in order to end up with the initial size N0 (step d)).
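A regression of this kind can be sketched in Python as follows. The sketch assumes, as for a conventional standard curve, that the changeover index varies linearly with the base-10 logarithm of the initial size. That linearity, the function names `fit_line` and `initial_size_from_ceep`, and all numerical values are assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_size_from_ceep(ceep, slope, intercept):
    """Invert CEEP = slope * log10(N0) + intercept to recover N0."""
    return 10.0 ** ((ceep - intercept) / slope)


# Hypothetical standards: known initial sizes and their changeover indices.
standard_sizes = [1e3, 1e4, 1e5, 1e6]
standard_ceeps = [30.0, 26.5, 23.0, 19.5]

slope, intercept = fit_line([math.log10(s) for s in standard_sizes],
                            standard_ceeps)

# Changeover index measured on the sample of interest (hypothetical value).
sample_n0 = initial_size_from_ceep(24.75, slope, intercept)
```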
Before describing all of these steps in detail in the meaning of the first implementation, a method is described for determining the index CEEP on the basis of experimental measurements. In particular, it will be understood that this method of determining the index CEEP experimentally can be applied to another implementation that is different from the above-mentioned first implementation.
Returning to the relationship between the effectiveness En of a given cycle n and the current size of the population of interest in the same cycle Nn and in a subsequent cycle Nn+1, the effectiveness of the amplification can be expressed as follows:
E_n = (N_{n+1} / N_n) − 1
In certain circumstances, in particular when there is no need to take account of background noise BN in the measurements, it is possible to a first approximation to assume that the measurements are substantially proportional to the current size of the population of interest. Nevertheless, in practice, account will more often be taken of measurement drift, with corrected experimental measurements F′n being determined on the basis of direct measurements Fn as shown in
A prior step of processing the experimental measurements Fn is preferably applied, this step consisting in subtracting the background noise BN and subsequently in introducing compensation to take account of a non-zero measurement ε representative of the initial population size. In the example shown in
F′_n = F_n − BN + ε
where:
Although these steps of correcting for background noise are very advantageous in determining the changeover index CEEP, they may also be applied to any determination and quantification of the initial population size N0 whenever background noise is likely to falsify measurement of said population size N0. In this respect, these steps may constitute the subject matter of separate protection, where appropriate.
The corrected measurements F′n as obtained in this way are advantageously proportional to the current population sizes Nn in the samples of interest, such that the yield En can now be expressed directly as a function of measurement values (corrected as described above), by the following relationship:
E_n = (F′_{n+1} / F′_n) − 1
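The correction of the measurements and the resulting experimental yield can be sketched in Python as follows. The function names, the use of a constant background value, and the numerical readings are assumptions made for this sketch.

```python
def corrected_measurements(raw, background, epsilon):
    """Apply F'_n = F_n - BN + epsilon to every raw reading.

    `background` is taken as a single constant here; a per-cycle baseline
    would need one value per cycle (simplification made for brevity).
    """
    return [f - background + epsilon for f in raw]


def experimental_yields(corrected):
    """Return E_n = F'_{n+1} / F'_n - 1 for each pair of consecutive cycles."""
    return [corrected[n + 1] / corrected[n] - 1.0
            for n in range(len(corrected) - 1)]


# Hypothetical readings: a signal that doubles on every cycle, sitting on a
# background of 50 units. The background estimate (51) absorbs the initial
# signal, which the compensation epsilon = 1 then restores.
raw = [50.0 + 2.0 ** n for n in range(8)]
corrected = corrected_measurements(raw, background=51.0, epsilon=1.0)
yields = experimental_yields(corrected)
```

With these values every computed yield equals 1, whereas omitting the compensation would give a zero first value and make the first yield undefined.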
Thus, from the experimental measurements Fn of
In short, the experimental measurements are expressed in the form of an experimental variation in the effectiveness En of the kind shown in
At least in the most usual circumstance of amplification by PCR and measurement by fluorescence, the non-constant stage of yield is decreasing and corresponds to said second region presenting little noise (as shown in
These points NEG (
When yield presents a non-constant stage in which yield is decreasing and which follows a constant stage, as shown in
As described below in a subsequent implementation, it is possible, for each measurement point, to model the variation in its yield as though said point itself corresponded to the changeover index CEEP. In that implementation, if the yield E0 of the constant stage is estimated and the estimated value exceeds the above-mentioned predetermined value, then the point is considered as corresponding to the coarse index CG.
In general, a maximum yield has a value of 1 so it is possible to select the above-mentioned predetermined value as being equal to 1. Nevertheless, this can be varied, and, for example, provision can be made to set the predetermined value as corresponding to the mean yield E0 as evaluated over the initial reaction cycles.
Thereafter, the estimate of the amplification index CEEP in the changeover region, which may advantageously be a fractional value, is refined by working in the direction of increasing amplification index numbers, starting from the coarse index CG, and by detecting the amplification index for which the yield is approximately equal to the above-mentioned predetermined value. Thus, referring again to
In the above-mentioned first implementation, a plurality of standard samples are provided having respective known initial population sizes, and the succession of amplifications is applied thereto under substantially the same conditions as for the sample of interest. Their respective changeover indices are determined in accordance with above-described steps a), b), and c). In step d):
Thus, with reference to
This first implementation is thus quite similar to that of the prior art described with reference to
In an approach that is significantly different from this first implementation:
In a second implementation, this parameterized variation is representative of the current population size Nn in the sample of interest.
Typically, this parameterized variation can be drawn from an expression of the type given above:
N_{n+1} = N_0 × (1 + E_0)^{CEEP} × function(CEEP, n+1)
Thus, in addition to a parameter representing the changeover index CEEP, this variation makes use of a parameter representative of the initial population size N0 in the sample of interest.
Thereafter, in steps c) and d) of this second implementation, these two parameters CEEP and N0 are determined substantially together.
Previously, in step a), it is necessary to determine a model for the above-mentioned function function(CEEP, n+1).
Usually, for PCR quantification, a model is selected for the non-constant stage of the yield corresponding to a decreasing exponential having a decrease parameter β which is described in greater detail below. This decrease parameter β is then determined in step c), at least with the changeover index CEEP, by comparison with the experimental measurements.
Thus, in this second implementation, once the yield model En has been selected, it is applied to the general expression for the current population size Nn given by the above relationship. This provides a model for variation in the current population size Nn.
Nevertheless, unless the experimental measurements give the value for the current population size Nn directly (which is rarely true in practice at present), it is appropriate subsequently to model the experimental measurements Fn themselves, taking account of the subtracted background noise and the subsequent compensation ε as described above.
Thus, in a presently preferred implementation, the above-mentioned parameterized variation:
Thereafter, the measured value of the initial population size F0 is determined by comparing said parameterized variation Fn with the experimental measurements.
In order to perform this comparison, it is possible, for example, to adjust the parameters F0, E0, CEEP and the decrease parameter β in the model of the measurements Fn by using statistical correlations (typically the least squares method) applied to the raw experimental measurements. An example implementation is described in detail below.
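A least-squares comparison of this kind can be sketched in Python as follows. The yield model used here (constant E0 up to cycle CEEP−1, then E0 multiplied by a decreasing exponential of parameter β) is an assumption adopted only for this sketch and does not reproduce the model of the invention. The function names and parameter values are likewise hypothetical.

```python
import math


def yield_model(n, e0, beta, ceep):
    """Assumed yield: constant e0 up to cycle ceep - 1, then exponential decay."""
    if n <= ceep - 1:
        return e0
    return e0 * math.exp(-beta * (n - ceep + 1))


def fluorescence_model(f0, e0, beta, ceep, n_cycles):
    """Build [F_0, ..., F_n_cycles] from F_{n+1} = F_n * (1 + E_n)."""
    values = [f0]
    for n in range(n_cycles):
        values.append(values[-1] * (1.0 + yield_model(n, e0, beta, ceep)))
    return values


def sum_of_squares(params, measured):
    """Least-squares criterion between the model and measured F_1..F_N."""
    f0, e0, beta, ceep = params
    model = fluorescence_model(f0, e0, beta, ceep, len(measured))
    return sum((m - p) ** 2 for m, p in zip(measured, model[1:]))


# Synthetic "measurements" generated from known parameters (illustration only).
true_params = (0.01, 0.95, 0.3, 18.0)
measured = fluorescence_model(*true_params, 40)[1:]

exact = sum_of_squares(true_params, measured)
shifted = sum_of_squares((0.01, 0.95, 0.3, 20.0), measured)
```

The criterion vanishes for the generating parameters and is strictly positive when the changeover index is displaced. In practice a nonlinear least-squares routine would adjust the four parameters so as to minimize this criterion.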
Initially, variation is obtained for the measured and adjusted quantity of fluorescence as a function of the number of PCR cycles that have been applied, as shown for example in
In the example described, it will be understood that the amplification reaction is a PCR reaction in real time. The experimental measurement represents quantities of emitted fluorescence.
The fluorescence of cycle n after adjustment for background noise, as described above, is written Fn below. The theoretical initial fluorescence before the first cycle is written F0. The effectiveness of the PCR in cycle n is written En. The total number of cycles performed during the PCR reaction is written N.
By assumption, the fluorescence measured on each cycle n of the PCR reaction is defined by:
F_{n+1} ≈ F_n (1 + E_n) for all n ∈ {0, 1, 2, …, N−1}    (1)
with 0 ≤ E_n ≤ 1.
The effectiveness of the reaction on each cycle n is calculated as follows:
It should be observed that equation (1) is assumed to be true for n=0. Nevertheless, by definition, the initial fluorescence F0 is unknown. It is therefore not possible to calculate the effectiveness on the first cycle E0 directly from formula (2).
The following assumptions are preferably made:
Nevertheless, it is preferable to assume that variation in effectiveness obeys a model of the type including:
The cycle (CEEP−1) thus represents the last cycle (which may be a fraction) for which effectiveness continues to be constant.
It is then proposed to model the effectiveness of the reaction as follows:
where E0 and β are real parameters which are estimated using the amplification curve of
In a variant, some other selection may be preferred, e.g. from the models F1 to F3 given below, particularly depending on the type of nucleic acid that is to be quantified.
En=exp(−β(n−CEEP+1))−1 F1
En=exp(−μ(n−CEEP+1))α)−1 F2
En=α−exp(−μ(n−CEEP+1)α)−1 F3
Preferably, several sets of parameters are estimated in step c) for several candidate changeover cycles CEEP, and the minimum candidate cycle is selected for which the associated parameters maximize the statistical correlations that can be undertaken in step c), for each changeover cycle CEEP.
As mentioned above, expression (1) may also be written in the form:
Thus, by introducing the expression (3) for effectiveness into formula (4), a new model is obtained having four parameters (F0, E0, β, CEEP) for the adjusted emitted fluorescence Fn:
The initial size N0 of the population of interest, the effectiveness E0 of the reaction at n=0, the parameter β, and the changeover cycle CEEP are evaluated repeatedly for several cycle values in the vicinity of the changeover region CHO, in order to find a statistical correlation maximum that is achieved for a minimum cycle value, which is taken as the changeover cycle CEEP.
In this second implementation, it is preferred to model variation in the measured and adjusted quantities of fluorescence as a function of cycle number on the basis of the models of variation in effectiveness, and subsequently to carry out the correlations directly on the measured and adjusted quantities of fluorescence.
It should be observed that by adjusting the measured emitted fluorescence for background noise, an artificial adjustment is also made on the initial fluorescence F0. Thus, estimating the parameters of the effectiveness model on the basis of effectiveness measurements that are deduced from adjusted fluorescence measurements constitutes an additional source of error and it might be preferable to proceed in two stages as described below for the third implementation.
Nevertheless, the second implementation as described is simpler and adapts well to PCR quantification using fluorescence measurements. It is based on the real measurements of fluorescence F′n which correspond to the fluorescence measurements adjusted for drift in background noise together with compensation ε on said measurements. Once the background noise has been subtracted, we have a relationship of the following type:
F′_n = F_n + ε
where ε is a quantity that may or may not depend on cycle number n. It is preferably selected to be constant.
Under such circumstances, the measured and “adjusted” effectiveness also written E′n on cycle n is defined by:
The model of above relationship (5) thus becomes:
Under such circumstances, the effectiveness values E′n are approximated experimentally from the measurements so as to be able to set a minimum acceptable effectiveness threshold during the stage of decreasing effectiveness. A threshold cycle is thus determined beyond which the adjusted fluorescence measurements are not used for the purposes of the model (points NEG in
More generally, the value of the effectiveness threshold preferably lies in the range 0 to 0.5, and PCR having an effectiveness value below said threshold is potentially biased by uncontrolled inhibition phenomena.
In the example shown in
The main steps of this implementation can be summarized as follows, with reference to
In a start step 70, the measured values for quantities of fluorescence have been obtained and adjusted relative to background noise as a function of cycle number n, as shown in
In step 71, an approximation for effectiveness of the reaction in cycle n is calculated using above formula (2) for each of the cycles n=1, 2, . . . , (N−1).
In step 72, the minimum cycle Cs is determined for which the following two conditions are satisfied:
It is already possible to eliminate the points NEG for which effectiveness is less than Es.
In step 73, a model is formed for the curve of adjusted emitted fluorescence whose effectiveness is decreasing over the cycle range CEEP=(Cs−5) to Cs, using expression (8) in which it is assumed that compensation ε is given by ε=F′0:
Thereafter, test 74 is applied to the value Ê′0 estimated for E′0. So long as Ê′0 is less than 1, the value of the changeover cycle CEEP is decremented in step 75 by a step size P (which may be equal to 1) and step 73 is repeated, so as to approach the looked-for value of CEEP.
Thereafter, when the estimated effectiveness value exceeds the value 1 (arrow n on exiting the test 74), the value of the index CEEP is incremented by a step of size h (which may be a fraction smaller than unity) in step 76 and in step 77 fluorescence Fn is modeled in the same manner as in step 73. So long as the estimated effectiveness Ê′0 is greater than or equal to 1 in step 78, steps 76 to 78 are repeated. When the estimated effectiveness takes a value of less than 1, the estimated parameters (F̂′0, Ê′0, β̂′0, ĈEEP) are conserved in an end step 79.
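The two-phase search of steps 73 to 79 can be sketched in Python as follows. The callable `estimate_e0` is hypothetical: it stands for refitting the fluorescence model with a given changeover cycle and returning the estimated constant-stage effectiveness. The stub used to exercise the search, the starting value, and the safety limits are assumptions made for this sketch.

```python
def search_changeover(estimate_e0, start_ceep, coarse_step=1.0,
                      fine_step=0.1, max_fine_steps=1000):
    """Locate the changeover cycle by a coarse then a fine search.

    `estimate_e0(ceep)` is a hypothetical callable that refits the model
    for the given changeover cycle and returns the estimated effectiveness.
    """
    # Steps 73 to 75: decrement CEEP while the estimate stays below 1.
    coarse = start_ceep
    while estimate_e0(coarse) < 1.0:
        coarse -= coarse_step
        if coarse < 0:
            raise ValueError("estimated effectiveness never reached 1")
    # Steps 76 to 78: increment in fine steps while the estimate is >= 1.
    for k in range(1, max_fine_steps + 1):
        candidate = coarse + k * fine_step
        if estimate_e0(candidate) < 1.0:
            return candidate  # step 79: parameters for this cycle are kept
    raise ValueError("estimated effectiveness never fell below 1")


def stub_estimate(ceep):
    """Stand-in for the model fit: the estimate falls as CEEP increases."""
    return 1.0 + 0.1 * (20.35 - ceep)


# Start from Cs - 5 with a hypothetical threshold cycle Cs = 28.
ceep_hat = search_changeover(stub_estimate, start_ceep=23.0)
```

The fine candidates are computed as the coarse value plus a whole number of steps, rather than by repeated addition, in order to limit the accumulation of rounding errors.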
In this step, a value F̂′0 has finally been obtained that alone is representative of the initial population size N0 in the sample of interest. It is then possible to use at least one standard sample having a known population size N0st so as to determine in step 80 the initial population size N0 in the sample of interest.
For this purpose, a measured value of an initial population size F0st in a standard sample of known initial population N0st is obtained. Thereafter, the value of the initial population size N0 in the sample of interest is obtained by deriving a proportionality relationship between the measurement for the standard sample and its known initial population size, and applying that relationship to the measurement F′0 to obtain the actual initial population size N0.
In other words, in step 80 of
N_0 = F̂′_0 × (N_{0st} / F̂′_{0st})
which relies on the known initial population size N0st of the standard and on the ratio of the corrected and compensated fluorescences, estimated by adjusting the fluorescence model in the same manner for both the sample of interest and the standard sample.
It will thus be understood that a single standard ought to be sufficient for determining the initial size of the population of interest in the sample of interest, which is an advantage provided by the invention.
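The proportionality relationship with a single standard can be sketched in Python as follows. The function name and the numerical values are hypothetical and serve only to illustrate the calculation.

```python
def initial_size_single_standard(f0_sample, f0_standard, n0_standard):
    """Scale the fitted initial measurement of the sample by N0st / F0st."""
    return f0_sample * (n0_standard / f0_standard)


# Hypothetical fitted initial fluorescences and a standard of known size.
n0_sample = initial_size_single_standard(f0_sample=2.5e-4,
                                         f0_standard=1.0e-3,
                                         n0_standard=1.0e5)
```

In this example the fitted initial fluorescence of the sample is one quarter of that of the standard, so the estimated initial size is one quarter of the known size of the standard.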
Nevertheless, in a variant, and where necessary, provision could also be made to obtain respective measured values for initial population sizes F̂′0st in a plurality of standard samples having known initial population sizes N0st. Thereafter, a dependency relationship is established between the initial population sizes N0st of the standard samples and the respective measured values for their initial population sizes F̂′0st. Thereafter, after finding the measured value for the initial population size of the sample of interest F̂′0, the actual initial population size N0 of interest is determined by interpolation using the dependency relationship. It will be understood that this dependency relationship may also typically be a regression of the type shown in
Once use is made of one or more standards, provision can be made for one or more standard samples having respective known initial population sizes N0st to which the succession of amplification reactions is applied under substantially the same conditions as for the sample of interest. Thereafter, the measured values F̂′0st for their initial population sizes are determined by making comparisons of the parameterized variations with the experimental measurements, as for the sample of interest.
In other words, the same calculations are naturally applied concerning the measured and adjusted quantities of fluorescence both on the standard(s) and on the sample of interest. The quantity of fluorescence F̂′0st before the first cycle is estimated for the standard(s) using the same method as is used for determining F̂′0 for the sample of interest, as described above.
A third implementation, corresponding to a variant of the above-described second implementation consists overall in adjusting the model for the effectiveness En relative to the experimental measurements, and in subsequently injecting said adjusted effectiveness model into the model for the current population size Nn, or into the model for the measurement Fn. This third implementation can be summarized as follows.
The parameterized variation constructed in step b) is representative of yield, and in step c), experimental variation of the yield is determined on the basis of experimental measurements in order to compare the parameterized variations with the experimental variation. Thereafter, in order to obtain a parameter representative of the initial population size N0 the following steps are performed in step d):
d1) determining a second parameterized variation representative of the current population size Nn in the sample of interest, making use at least of the parameter representing the changeover index CEEP, and a parameter representative of the initial population size N0;
d2) applying to said second variation, a parameterized value for the changeover index CEEP as determined in step c); and
d3) adjusting at least the parameter representative of the initial population size N0 by direct comparison of the second variation with the experimental measurements.
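Steps d2) and d3) can be sketched in Python as follows for the case where the yield parameters have already been adjusted in step c). Since the measurement model is then linear in the initial value, the least-squares estimate has a closed form. The yield model (constant E0 up to cycle CEEP−1, then E0 multiplied by a decreasing exponential), the function names, and the numerical values are assumptions made for this sketch.

```python
import math


def growth_factors(e0, beta, ceep, n_cycles):
    """Cumulative products G_n = prod_{k<n} (1 + E_k), with G_0 = 1.

    The yield model is an assumption: constant e0 up to cycle ceep - 1,
    then e0 multiplied by a decreasing exponential of parameter beta.
    """
    factors = [1.0]
    for n in range(n_cycles):
        e_n = e0 if n <= ceep - 1 else e0 * math.exp(-beta * (n - ceep + 1))
        factors.append(factors[-1] * (1.0 + e_n))
    return factors


def fit_initial_value(measured, factors):
    """Least-squares F_0 for the linear model F_n = F_0 * G_n (n = 1..N)."""
    numerator = sum(f * g for f, g in zip(measured, factors[1:]))
    denominator = sum(g * g for g in factors[1:])
    return numerator / denominator


# Yield parameters assumed already adjusted in step c) (hypothetical values).
e0_hat, beta_hat, ceep_hat = 0.9, 0.25, 15.0
factors = growth_factors(e0_hat, beta_hat, ceep_hat, 30)

# Synthetic, noise-free measurements generated with an initial value of 0.02.
measured = [0.02 * g for g in factors[1:]]
f0_hat = fit_initial_value(measured, factors)
```

Keeping the yield parameters fixed in this second stage leaves a single unknown to adjust against the measurements, which is the simplification sought by proceeding in two stages.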
Advantageously the following are performed:
Finally, it should be understood that the presently preferred second implementation of
Naturally, the present invention is not limited to the embodiments described above by way of example, and it extends to other variants.
Thus, it will be understood that the present invention can also apply to relative quantification, in particular by PCR. In this application, as well as amplifying the population of interest, a reference population is also amplified either simultaneously in the same medium, or separately. Measurements are taken as follows:
The method can then continue by applying steps a), b), and c) to the reference population while step d) consists merely in determining a ratio between the respective initial sizes of the population of interest and of the reference population.
Relative quantification can be used for analyzing the expression of a gene of interest during the development of an organism. In order to correct in particular for variations in quantity and in quality between samples taken from the organism at different times, in addition to analyzing the target gene of interest, a reference gene is also analyzed that is known for having a level of expression that remains stable during development.
A final step then consists in comparing the ratios
between the various samples that have been taken.
In order to achieve the desired results, two strategies are possible.
The prior art strategy is based on detecting the threshold cycle Ct and it normally takes place as follows. For each sample taken at different instants t0, t1, t2, . . . , tn, the ratio
is determined, making use of at least one standard (i.e. a standard for which N0target and N0ref are known), which amounts to performing two successive absolute quantifications followed by calculating a ratio.
Another strategy that is particularly advantageous in the context of the invention consists in determining for each sample taken at different instants t0, t1, t2, . . . , tn the ratio:
directly by using the following formula:
In this second strategy, which in the end makes use only of the parameter F0, in combination with the technique of the invention, no standard sample is needed, which is particularly advantageous.
Reference is now made to
In the example described, provision is preferably made to take measurements of the quantities of fluorescence emitted on each cycle, both by the standard St and by the sample of interest ECH. To this end, a selected reagent is inserted into the wells and the samples are illuminated by a lamp (e.g. a halogen-tungsten lamp) in order to measure the respective quantities of fluorescence coming from the sample of interest and from the standard sample on each PCR cycle that is applied. In addition, an apparatus for detecting fluorescence comprises, for example, an objective lens 11 for collecting the light coming from the fluorescence, and photon counting means 10, e.g. a charge-coupled device (CCD) camera, and/or photomultipliers, in order to measure the fluorescence emitted on each PCR cycle from the sample of interest and from the standard. Thus, the fluorescence emitted by each well is advantageously focused by the lens 11 and then is preferably detected by a CCD camera 10 connected to an acquisition card 21, e.g. of the Personal Computer Memory Card International Association (PCMCIA) type provided in a central unit 20 of a computer.
The computer is then connected to the above-mentioned measuring means 10 to receive therefrom signals that are representative of the measured quantities of fluorescence detected on each PCR cycle, and to process these signals in order to determine an initial size for the population of interest prior to the first cycle, by implementing the method of the invention.
Typically, the processor unit comprises the following:
The computer may also have input members such as a keyboard 41 and/or a mouse 42 connected to the central unit 20.
Nevertheless, it should be understood that in the meaning of the invention the installation comprises overall:
For this purpose, a computer program product can be used for controlling the computer means. The program may be stored in a memory of the processor unit 20 or on a removable memory medium (CD-ROM etc.) and suitable for co-operating with the reader of the processor unit. The computer program in the meaning of the invention then contains instructions for implementing all or some of the steps of the method of the invention. For example, the algorithm of the program may be represented by a flow chart equivalent to the diagram of
Number | Date | Country | Kind |
---|---|---|---|
04 12471 | Nov 2004 | FR | national |
Number | Date | Country | |
---|---|---|---|
20060111883 A1 | May 2006 | US |