This disclosure relates generally to technology for determining whether a target is present in a sample.
An amplification curve obtained from a real-time (also known as quantitative) polymerase chain reaction (qPCR) experiment can be used to determine whether a target is present in a biological sample (e.g., a blood or food sample). In a typical qPCR experiment, fluorescence of a sample is measured after each thermal cycle of the experiment. The set of fluorescence values versus cycle number associated with a particular assay on a sample forms an amplification curve. Traditionally, an algorithm analyzes and/or a human reviews the amplification curve and, based on visual or other analysis of the curve's characteristics, determines whether the relevant sample amplified, which in turn indicates whether the target molecule was present within the sample. A typical algorithmic technique to determine that the relevant sample has amplified involves determining whether the associated amplification curve has crossed a threshold value that is either fixed or is calculated based on the characteristics of the amplification curve. If the threshold is crossed, the curve is determined to represent amplification; if the threshold is not crossed, the curve is determined to represent non-amplification.
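The threshold-based technique described above can be sketched in a few lines; the threshold value and curve data below are purely illustrative.

```python
def threshold_call(fluorescence, threshold):
    """Call a curve 'amplified' if its fluorescence crosses the threshold
    at any cycle; otherwise call it 'non-amplified'."""
    return any(value >= threshold for value in fluorescence)

# Hypothetical 10-cycle curves (fluorescence values are illustrative only):
flat_curve = [0.01, 0.01, 0.02, 0.01, 0.02, 0.02, 0.01, 0.02, 0.02, 0.02]
sigmoid_curve = [0.01, 0.02, 0.05, 0.12, 0.30, 0.55, 0.75, 0.85, 0.90, 0.92]

flat_amplified = threshold_call(flat_curve, 0.2)        # never crosses 0.2
sigmoid_amplified = threshold_call(sigmoid_curve, 0.2)  # crosses 0.2 at cycle 5
```

In practice the threshold may instead be computed from curve characteristics (e.g., baseline noise), but the crossing test itself is the same.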
Automated determination of amplification is important for increasing throughput of sample analysis, which in turn can both advance scientific research and improve provision of time-sensitive, clinically important information. Existing methods of automatically determining amplification have relied on combinations of techniques and parameters to improve accuracy. Machine learning techniques, including deep learning networks such as artificial neural networks, can help improve accuracy. Improved machine learning techniques are needed to further improve accuracy and reduce the instances in which human review is required. At the same time, it is important to effectively and optimally determine when human review is needed, and the criteria for triggering human review of amplification curves that have been evaluated by a computer system can vary by application context. Improving the overall call-rate accuracy of an amplification-calling machine learning system does not necessarily address the problem of knowing which individual automated amplification call results are most at risk of being wrong and therefore require human review.
Embodiments of the invention address aspects of this “which curve” problem by using an improved amplification calling machine learning system (in some embodiments, an improved artificial neural network) in parallel with a machine learning system (e.g., a deep learning network such as an artificial neural network) for evaluating the quality of an amplification curve. This can help identify potential problem curves. In some embodiments, alternative amplification calling algorithms are used to help determine which curves called by a machine-learning (e.g., neural network) amplification-calling system should be evaluated further for potential curve-quality issues that might require invalidating results.
In some embodiments, evidential learning techniques are used to extract call confidence data from both an amplification calling neural network and a curve-quality calling neural network. The confidence data can be used to further evaluate when a machine-generated amplification call should be reviewed by a human.
Some embodiments also provide improved amplification-calling and curve-quality calling networks based on particular types, variations, and arrangements of neural network elements, pre-processing and processing techniques, and/or engineered features used in processing the amplification curves from a qPCR assay. Further details of these embodiments are more fully disclosed herein.
While the invention is described with reference to the above drawings, the drawings are intended to be illustrative, and other embodiments are consistent with the spirit, and within the scope, of the invention.
The various embodiments now will be described more fully hereinafter with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, specific examples of practicing the embodiments. This specification may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this specification will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Among other things, this specification may be embodied as methods or devices. Accordingly, any of the various embodiments herein may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. The following specification is, therefore, not to be taken in a limiting sense.
Instructions for implementing amplification (“amp”) curve analysis system 102 reside in computer program product 104 which is stored in storage 105 and those instructions are executable by processor 106. When processor 106 is executing the instructions of computer program product 104, the instructions, or a portion thereof, are typically loaded into working memory 109 from which the instructions are readily accessed by processor 106. In the illustrated embodiment, computer program product 104 is stored in storage 105 or another non-transitory computer readable medium (which may include being distributed across media on different devices and different locations). In alternative embodiments, the storage medium is transitory.
In one embodiment, processor 106 in fact comprises multiple processors which may comprise additional working memories (additional processors and memories not individually illustrated) including a graphics processing unit (GPU) comprising at least thousands of arithmetic logic units supporting parallel computations on a large scale. GPUs are often utilized in deep learning applications because they can perform the relevant processing tasks more efficiently than can typical general-purpose processors (CPUs). Other embodiments comprise one or more specialized processing units comprising systolic arrays and/or other hardware arrangements that support efficient parallel processing. In some embodiments, such specialized hardware works in conjunction with a CPU and/or GPU to carry out the various processing described herein. In some embodiments, such specialized hardware comprises application specific integrated circuits and the like (which may refer to a portion of an integrated circuit that is application-specific), field programmable gate arrays and the like, or combinations thereof. In some embodiments, however, a processor such as processor 106 may be implemented as one or more general purpose processors (preferably having multiple cores) without necessarily departing from the spirit and scope of the present invention.
User device 107 includes a display 108 for displaying results of processing carried out by amp curve analysis system 102. In alternative embodiments, amp curve analysis system 102, or a portion thereof, may be stored in storage devices and executed by one or more processors residing on PCR instrument 101 and/or user device 107. Such alternatives do not depart from the scope of the invention.
Pre-processing block 201 is typically configured to receive one or more amplification curves corresponding to amplification data obtained from a 40-cycle or a 50-cycle qPCR assay. In alternative embodiments, amplification curves resulting from PCR experiments with other numbers of cycles can be processed by the illustrated embodiment using various techniques. The collection of discrete data points, for example, 40 fluorescence values for a 40-cycle PCR experiment (or 50 for a 50-cycle experiment) is referred to as a “curve” herein, even though it is not continuous data. At the same time, a best-fit continuous curve may or may not be fit to the data for easier visual display and/or analysis purposes. In any event, the term amplification “curve” herein will generally be used to refer to the set of discrete amplification data obtained and analyzed for an assay of a biological sample, unless another meaning is implied from the context.
Pre-processing block 201 processes amplification curves to generate pre-processed curves and engineered features. The term “engineered” features refers to various features obtained from pre-determined computations on an amplification curve and is distinguished from “learned” features that result from submitting amplification curves to a neural network or network portion (e.g., convolutional layers of a neural network). In some embodiments of the disclosure, engineered features can be obtained during pre-processing and submitted to neural networks along with the pre-processed amplification curves and can enhance learning speed and/or accuracy relative to submitting only the pre-processed curves themselves. In some embodiments of the disclosure, engineered features include one or more of curve derivatives (e.g., first, second, and/or other higher order derivatives), time series features, and other features, such as, for example, features from various PCR analysis algorithms including PCRedux features and/or Cycle Relative Threshold features, as further defined and explained elsewhere herein.
In addition to processing received amplification curves to obtain engineered features, pre-processing block 201 also processes the amplification curves to remove baseline signals, to elongate or shorten the curves to a uniform length (e.g., 40 cycles or 50 cycles), and to normalize the curves. These operations are used to generate the pre-processed amplification curves output by block 201.
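The length-uniformization step can be sketched as follows; the strategy of repeating the final fluorescence value to pad short curves is an assumption for illustration (the embodiment does not specify how curves are elongated).

```python
import numpy as np

def to_uniform_length(curve, target_len=50):
    """Truncate curves longer than target_len; pad shorter curves by
    repeating the final fluorescence value (an assumed padding strategy)."""
    curve = np.asarray(curve, dtype=float)
    if len(curve) >= target_len:
        return curve[:target_len]
    pad = np.full(target_len - len(curve), curve[-1])
    return np.concatenate([curve, pad])

forty_cycle = np.linspace(0.0, 1.0, 40)       # a 40-cycle curve
uniform = to_uniform_length(forty_cycle, 50)  # extended to 50 values
```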
Amplification (“amp”) calling neural network(s) 202, alternative amp calling algorithm(s) 204, and curve-quality calling neural network(s) 203 receive the pre-processed amplification curves. In the illustrated embodiment, amp calling neural network(s) 202 receive the engineered features from pre-processing block 201 and alternative amp-calling algorithms 204 receive a portion of the engineered features relevant for executing certain alternative amp calling algorithms. In alternative embodiments, curve-quality calling neural networks 203 also receive engineered features from a pre-processing block such as pre-processing block 201.
Amplification and curve-quality calling evaluation block 205 receives amplification call data from amp-calling neural network(s) 202. In the illustrated embodiment, the amplification call data includes class probabilities for at least an amplified class and a non-amplified class and further includes confidence data generated from evidential learning techniques. However, in alternative embodiments, confidence data is not generated and used. Block 205 also receives amplification call data from alternative amp-calling algorithm(s) 204. Block 205 receives curve-quality call data from curve-quality calling neural network(s) 203. In the illustrated embodiment, the curve-quality call data includes class probabilities for at least a “clean” curve class and an anomalous (“problem”) curve class and further includes confidence data generated from evidential learning techniques. However, in alternative embodiments, confidence data is not generated and used.
As will be further described below in the context of
In the illustrated embodiment, derivative-computing step 305 computes derivatives from an unnormalized amplification curve of 50 qPCR cycles. It computes both a 1st order derivative and a 2nd order derivative. In one example, the derivative computations are performed using a Savitzky-Golay filter and the following Savitzky-Golay parameters: Polynomial Order=3; and Window Length=9. The 1st and 2nd order derivatives may be normalized using max-normalization. The resulting normalized derivative values represent a portion of the engineered features output by pre-processing 3000.
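Using the stated Savitzky-Golay parameters, step 305 can be sketched with SciPy's savgol_filter; the synthetic sigmoid curve and the exact form of max-normalization (dividing by the maximum absolute value) are assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_derivatives(curve, window_length=9, polyorder=3):
    """Compute max-normalized 1st- and 2nd-order derivatives of an
    amplification curve with a Savitzky-Golay filter (step 305)."""
    d1 = savgol_filter(curve, window_length, polyorder, deriv=1)
    d2 = savgol_filter(curve, window_length, polyorder, deriv=2)
    return d1 / np.max(np.abs(d1)), d2 / np.max(np.abs(d2))

# Synthetic 50-cycle sigmoid-shaped amplification curve:
cycles = np.arange(50)
curve = 1.0 / (1.0 + np.exp(-(cycles - 25) / 3.0))
d1, d2 = sg_derivatives(curve)   # growth rate peaks near the inflection cycle
```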
Step 306 normalizes the amp curves using cycle threshold (“Ct”) values for a specific assay on a specific instrument. The normalization may be performed by dividing each value of the unnormalized curve by the Ct value corresponding to the assay from which the curve was obtained. In alternative embodiments, different values can be used for normalizing.
Normalized curves are used by step 308 to compute additional engineered features. In one embodiment, additional engineered features include, for example, one or more of the following: Features related to the Cycle Relative Threshold (“Crt”) algorithm, a version of which is described in patent publication US 2016/0110495 A1 of U.S. patent application Ser. No. 14/921,948 (“Crt patent publication”); features from PCRedux (see PCRedux: A Data Mining and Machine Learning Toolkit for qPCR Experiments at biorxiv.org/content/10.1101/2021.03.31.437921v1); time series features (e.g., shapelets, wavelets); and features associated with various anomalies (e.g., waterfall, bubble, wavy baseline, and creep). Both of the publications referenced above are hereby incorporated by reference in their entireties.
In one embodiment, one or more of the features from the Crt algorithm listed and described in Table 1 below are computed and used. Cross references to the above-referenced Crt patent publication, where applicable, are also provided below:
“AmpStatus” is a prediction result of the Crt algorithm. It can be obtained using software in commercially available products from Thermo Fisher Scientific including, among others, one or more of the following products: QuantStudio™ 12k Flex v1.5; QuantStudio™ 6 and 7 Flex v1.7.2; Relative Quantification v4.3 (Cloud App); Standard Curve v4.0 (Cloud App); and Presence Absence Analysis v1.9 (Cloud App).
In one embodiment, one or more of the PCRedux features listed in Table 2 are determined and used:
In one embodiment, step 308 normalizes the additional engineered features after computing them. In one embodiment, a quantile normalizer is used; a quantile normalizer transforms the features to follow a normal distribution. In one embodiment, 1000 quantiles are computed. A scaler used for normalizing is obtained from a chosen representative training set for a given model.
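Such a quantile normalizer can be sketched with scikit-learn's QuantileTransformer; the library choice and the synthetic training data are assumptions, as the embodiment specifies only a 1000-quantile normal-output transform fit on a representative training set.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
# Synthetic stand-in for engineered features: 5000 training curves x 4 features.
train_features = rng.exponential(size=(5000, 4))

# Fit the "scaler" on the representative training set only...
scaler = QuantileTransformer(n_quantiles=1000,
                             output_distribution="normal",
                             random_state=0)
scaler.fit(train_features)

# ...then reuse it to normalize the features of newly received curves.
new_features = rng.exponential(size=(10, 4))
normalized = scaler.transform(new_features)
```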
Step 307 outputs the normalized derivatives generated by step 305, the normalized curves generated by step 306, and the additional normalized engineered features generated by step 308. These values are output by, for example, pre-processing block 201 shown in
Structure 4000 comprises neural net (“artificial,” i.e., computerized, neural network implied throughout when referencing a “neural net” or “neural network” herein; other deep learning networks might be used in alternative embodiments, and “deep learning network” as used herein will also be assumed to imply “computerized deep learning network”) first portion 401, neural net second portion 402, and neural net third portion 403. In the illustrated embodiment, neural net second portion 402 receives (from pre-processing block 201 of
Additional engineered features are received by neural net first portion 401. In the illustrated embodiment, the additional engineered features include one or more of the features previously described in the context of step 308 of
Neural net first portion 401 and neural net second portion 402 process their respective inputs. Their respective outputs are joined by concatenation function 404, which provides the concatenated output as the input to neural net third portion 403. Neural net third portion 403 processes the concatenated output of first portion 401 and second portion 402 and generates, as output, call and confidence data. In an alternative embodiment, only amplification call data is determined and output by neural net third portion 403.
The above structures will now be described with further details regarding layer input/output sizes and other characteristics of a particular embodiment. The disclosed details have been found to work well for the purpose of the amplification curve analysis application described herein. But one skilled in the art will understand that many of these details can be varied from those referenced below without necessarily departing from the spirit and scope of the present disclosure.
In one embodiment, fully-connected layer 501 has 9-unit (1×9) input data and 16-unit (1×16) output data. Leaky ReLU layer 502 is an activation function operating on fully-connected layer 501's output; it provides 1×16 output to fully-connected layer 503 that, in turn, provides 1×16 output that is processed by Leaky ReLU layer 504 to provide 1×16 output to concatenation function 404.
Turning to further details of neural net second portion 402, separable convolution layer 505, in this embodiment, receives 3-channel data, each channel including 50 units, i.e., 3×50 data. Specifically, for a given cycle of the 50-cycle amp curve data, three values are provided as input to convolution layer 505: the pre-processed amp curve's normalized fluorescence value at that cycle, the first order derivative, and the second order derivative.
In the illustrated embodiment, separable convolution layer 505 uses 16 filters, each 3×3 in size. A stride of 2 is used and sufficient padding is applied to the input array to obtain the desired output feature map dimensions which, for separable convolution layer 505, are 16×26. In this embodiment, that means that the 3×50 input array is separated into three “sub” arrays, each being 1×50 in size. Similarly, each 3×3 filter is separated into three “sub” filters, each being 1×3 in size. Each respective sub-array is convolved with a respective sub-filter to create a respective 1D feature map. The resulting feature maps are “stacked” to provide a 2D feature map. A pointwise convolution is performed on the 2D feature map in which it is convolved with a 1×3 filter. This pointwise step outputs a 1×26 array, which is the same size feature map as would have resulted from a normal convolution of a 3×3 filter with a 3×50 input array, including necessary padding. With 16 filters used in the illustrated embodiment, the resulting output from separable convolutional layer 505 is a 16×26 feature map.
In alternative embodiments, more or fewer filters can be used in each of the separable convolutional layers described herein. The number of filters shown in the presently illustrated embodiment is preferred, but can be different in alternative embodiments without necessarily departing from the spirit and scope of the present disclosure. In the illustrated embodiment, separable convolution layer 505 uses a stride of 2 and uses replication padding. Replication padding makes a copy of a sequence in the input array to be padded, reverses it, and then uses the reversed sequence to pad on either end of the sequence. Padding allows the convolution process to create feature maps that have the desired length dimension. Alternative embodiments use different types of padding, e.g. “same” padding, without necessarily departing from the spirit and scope of the disclosure.
Other separable convolutional layers in the illustrated embodiment use the same filter size, stride, padding type, and depth-wise followed by point-wise convolution; those characteristics will be assumed and not repeated further below, though the characteristics can be varied in alternative embodiments.
The resulting 16×26 output from separable convolutional layer 505 is processed by a Leaky ReLU activation function, represented here by Leaky ReLU layer 506, and the results are provided as 16×26 data to separable convolution layer 507. Separable convolution layer 507 uses 8 filters. The resulting 8×13 output data is processed by Leaky ReLU layer 508, and the results are provided as 8×13 data to flattening layer 509. Flattening layer 509 converts the 8×13 data to a single-column (i.e., one-dimensional) array of length 104, and provides the resulting 1×104 output to concatenation operation 404.
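The feature-map lengths above follow from the standard 1-D convolution output-length formula. A quick check (the per-layer total padding amounts, 4 and 2, are assumptions chosen to reproduce the stated dimensions; the embodiment says only that "sufficient padding" is applied):

```python
def conv1d_out_len(n_in, kernel=3, stride=2, total_pad=0):
    """Output length of a 1-D convolution: floor((n + pad - k) / stride) + 1."""
    return (n_in + total_pad - kernel) // stride + 1

len_505 = conv1d_out_len(50, total_pad=4)   # separable conv layer 505: 50 -> 26
len_507 = conv1d_out_len(26, total_pad=2)   # separable conv layer 507: 26 -> 13
flat_len = 8 * len_507                      # flattening layer 509: 8 filters x 13
```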
The 1×16 output of Leaky ReLU 504 of the first portion 401 and the 1×104 output of the flattening layer 509 of the second portion 402 are concatenated at block 404 and the concatenated resulting 1×120 data is provided to fully connected layer 511 of the third portion 403.
Fully-connected layer 511 receives the 1×120 concatenated data and provides 1×16 output data, which is processed by Leaky ReLU layer 512, which in turn provides 1×16 output to fully-connected layer 513, a 2-node final layer that provides 2-unit output (the output dimension corresponding to the number of classifications for the illustrated embodiment, i.e., two classifications: amplified and non-amplified).
In the illustrated embodiment, the output of fully-connected layer 513 is processed by ReLU layer 514. The illustrated embodiment implements evidential learning. The output of ReLU layer 514 is used as an evidence vector by evidential processing block 515. Evidential processing block 515 uses evidential learning techniques to process the received evidence vector and obtain class probability determinations corresponding to amp and non-amp classifications for each amplification curve along with confidence measurements for the corresponding classification data. An example of the evidential learning techniques used by block 515 in the illustrated embodiment is described in Sensoy et al., “Evidential Deep Learning to Quantify Classification Uncertainty,” arXiv:1806.01768v3 [cs.LG] 31 Oct. 2018, https://arxiv.org/pdf/1806.01768.pdf, incorporated herein by reference in its entirety.
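A minimal sketch of the Dirichlet-based computation from Sensoy et al. that block 515 may perform (the specific evidence vectors below are illustrative): the ReLU evidence vector e yields Dirichlet parameters α = e + 1, class probabilities α/S with S = Σα, and an uncertainty mass K/S for K classes.

```python
import numpy as np

def evidential_outputs(evidence):
    """Map a non-negative evidence vector (ReLU output of the final layer)
    to class probabilities and an uncertainty mass, per Sensoy et al. (2018)."""
    alpha = np.asarray(evidence, dtype=float) + 1.0   # Dirichlet parameters
    strength = alpha.sum()                            # S = sum of alphas
    probs = alpha / strength                          # expected class probabilities
    uncertainty = len(alpha) / strength               # K / S
    return probs, uncertainty

# Strong evidence for "amplified" -> confident call, low uncertainty:
probs_strong, u_strong = evidential_outputs([40.0, 1.0])
# Weak evidence for both classes -> near-uniform probabilities, high uncertainty:
probs_weak, u_weak = evidential_outputs([0.2, 0.1])
```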
In an alternative embodiment, ReLU layer 514 and evidential processing block 515 are replaced by a softmax layer that simply converts the output of fully connected layer 513 to class probabilities and does not generate confidence data.
In the illustrated embodiment, the same pre-processed amp curves and selected engineered features (first and second order derivatives) submitted to neural net second portion 402 (see
Neural net first portion 601 can be understood as operating to extract learned features from the pre-processed amp curves (along with selected engineered features of those curves) that it receives. The optimized parameter values (e.g., convolutional filter values) used to extract those features from each curve are learned during training of the neural network. Second portion 602 can be understood as a classification network used to classify curves as “clean” or “problem.”
As shown, neural net first portion 601 comprises separable convolution layer 605, Leaky ReLU layer 606, separable convolution layer 607, Leaky ReLU layer 608, and flattening layer 609. In the illustrated embodiment, the more detailed characteristics (e.g., input/output data dimensions, number of filters, filter size, stride, padding type) of separable convolution layer 605, Leaky ReLU layer 606, separable convolution layer 607, Leaky ReLU layer 608, and flattening layer 609 are the same as previously described for, respectively, separable convolution layer 505, Leaky ReLU layer 506, separable convolution layer 507, Leaky ReLU layer 508, and flattening layer 509, and those descriptions are therefore not repeated here.
Thus, the input provided from flattening layer 609 to fully-connected layer 611 is 1×104 in size. Fully connected layer 611 outputs 1×16 data for processing by Leaky ReLU layer 612. The resulting 1×16 data is input to fully connected layer 613, which outputs 1×2 data to ReLU layer 614. The resulting evidence vector is processed by evidential processing block 615 in a manner similar to that described above in the context of evidential processing block 515. Evidential processing block 615 outputs class probabilities and corresponding confidence measurements regarding the following curve-quality classifications: “clean” or “problem.”
In an alternative embodiment, ReLU layer 614 and evidential processing block 615 are replaced by a softmax layer that simply converts the output of fully connected layer 613 to class probabilities and does not generate confidence data.
Processing 7000 begins at step 701. Step 702 receives amp calls from alternative amp calling algorithms such as one or more of those previously described (or from other alternative amp-calling algorithms). Step 703 receives amp calls from a neural network/machine-learning based algorithm such as, for example, neural network(s) 202 of
Step 710 determines if any curves corresponding to the currently analyzed well have CQI flags. If no, then processing proceeds to steps 712 and 716 directly and none of the tests associated with the current well being analyzed are invalidated. If yes, then processing proceeds to step 711 to determine if CQI invalidation is currently enabled. In some embodiments, this is a user-selected setting. In alternative embodiments, whether CQI invalidation is enabled can be automatically determined based on various user-selected or pre-determined factors and/or can be enabled by default. If CQI invalidation is not enabled, then the result of step 711 is no and processing proceeds to steps 712 and 716 and none of the tests associated with the current well being analyzed are invalidated. If CQI invalidation is enabled, then the result of step 711 is yes and processing proceeds to step 713 to determine whether calls only from individual tests associated with CQI flags in a given well should be invalidated or whether all calls for tests in the well should be invalidated. If the result of step 713 is “call,” then only test results associated with CQI flags are invalidated. If the result of step 713 is “well,” then all results for that well are invalidated, whether or not the individual test is associated with a CQI-flagged amplification curve.
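The decision logic of steps 710 through 713 might be sketched as follows; the function and parameter names are hypothetical.

```python
def invalidation_decisions(cqi_flags, cqi_invalidation_enabled, scope):
    """Return a per-test list of invalidation decisions for one well
    (steps 710-713). cqi_flags holds one boolean per test in the well
    (True = that test's curve carries a CQI flag); scope is "call" to
    invalidate only CQI-flagged tests or "well" to invalidate every
    test in the well when any flag is present."""
    if not any(cqi_flags) or not cqi_invalidation_enabled:
        return [False] * len(cqi_flags)   # steps 710/711: nothing invalidated
    if scope == "well":
        return [True] * len(cqi_flags)    # step 713 "well": invalidate all tests
    return list(cqi_flags)                # step 713 "call": flagged tests only

flags = [False, True, False]              # second test's curve is CQI-flagged
```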
Those skilled in the art will appreciate that, in various embodiments, different factors can be used to determine whether to invalidate all tests in a well or only ones associated with calls for CQI-flagged curves. In some embodiments, this determination can be made from available data based on user-set criteria. For example, a user might set criteria based on the number or percentage of CQI-flagged calls in a given well to determine whether or not to invalidate all calls in the well. Once the user-set criteria are in place, the result can be implemented automatically based on the call and CQI flag data. If confidence measurements are available, then a user could, in some embodiments, set criteria based on confidence scores meeting or not meeting certain thresholds at the level of individual calls and/or across all calls in the well. Conditional criteria based on various combinations of call and curve quality data could also be implemented. Various alternatives consistent with the present disclosure will be apparent to one skilled in the art.
As previously described in the context of
Structure 8000 comprises neural net first portion 801, neural net second portion 802, and neural net third portion 803. In the illustrated embodiment, neural net second portion 802 receives (from pre-processing block 201 of
Additional engineered features are received by neural net first portion 801. In the illustrated embodiment, the additional engineered features include one or more of the features previously described in the context of step 308 of
Neural net first portion 801 and neural net second portion 802 process their respective inputs. Their respective outputs are joined by concatenation function 804, which provides the concatenated output as the input to neural net third portion 803. Neural net third portion 803 processes the concatenated output of neural net first portion 801 and neural net second portion 802 and generates, as output, amplification call data. In one example, such call data comprises class probabilities (sometimes referred to herein as prediction scores) for one or more classifications such as amplified and/or not amplified.
As further shown in
In the illustrated embodiment, each dropout layer 923 in an FCwLD block uses a dropout rate of 0.5. However, other rates can be used without necessarily departing from the spirit and scope of the present disclosure. In the illustrated embodiment, separable convolution layers 903 and 905 operate in a manner similar to that described for separable convolution layers 505 and 507 in
In the illustrated embodiment, softmax layer 912 outputs class probabilities (prediction scores) for use by a computer user interface and for use by confidence processing block 805.
In an alternative embodiment, softmax layer 912 and confidence processing block 805 are replaced by a Leaky ReLU layer and evidential learning block as previously described that generate amplification call data and confidence data.
As shown, neural net first portion 1001 comprises separable convolution layer 1003, Leaky ReLU layer (activation function) 1004, separable convolution layer 1005, Leaky ReLU layer (activation function) 1006, and flattening layer 1007. Neural net second portion 1002 comprises FCwLD blocks 1008, 1009, and 1010 followed by fully connected layer 1011 and softmax layer 1012. Like FCwLD blocks 901 and 902, FCwLD blocks 1008, 1009, and 1010 each comprise a fully connected layer 921 followed by a Leaky ReLU layer 922, and a dropout layer 923.
In the illustrated embodiment, each dropout layer 923 in an FCwLD block uses a dropout rate of 0.5. However, other rates can be used without necessarily departing from the spirit and scope of the present disclosure. In the illustrated embodiment, separable convolution layers 1003 and 1005 operate in a manner similar to that described for separable convolution layers 605 and 607 in
In the illustrated embodiment, softmax layer 1012 outputs class probabilities (prediction scores) for use by a computer user interface and for use by confidence processing block 1025.
In an alternative embodiment, softmax layer 1012 and confidence processing block 1025 are replaced by a Leaky ReLU layer and an evidential learning block as previously described that generate curve-quality call data and confidence data.
Processing 1100 begins at step 1101, which computes ensemble confidence for each prediction score. As previously explained, some embodiments of the disclosure use an ensemble of deep learning networks that are similarly structured but are differently trained. In relevant embodiments using processing 1100, each deep learning amp-calling network 202 in an ensemble of such networks generates a prediction score for a particular amp curve. As one skilled in the art will appreciate, various statistical techniques can be used to generate a confidence metric using a set of data points drawn from an underlying population.
In some embodiments, the 95% confidence interval is computed for an ensemble prediction score using all the individual prediction scores generated by the amp-calling networks 202 in the ensemble and assuming a normal distribution. Then, the difference between the upper and lower bounds of that confidence interval is used as a confidence metric and empirical thresholds are used to assign and output an intuitive confidence level to the prediction score result (e.g., “low,” “medium,” or “high” confidence). For example, a confidence interval for an ensemble's prediction scores might have a lower bound (LB) of 0.75 and an upper bound (UB) of 0.85. The resulting confidence metric (UB-LB) would be 0.10. Using, for example, a threshold of 0.15 such that anything below 0.15 is “high” confidence, the process would assign a high confidence level to the corresponding amp prediction score of the ensemble as a whole (the ensemble's overall prediction score might, for example, be taken as the average of all the scores generated for the given curve by the amp-calling networks of the ensemble).
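One plausible reading of this computation is a 95% confidence interval on the ensemble's mean prediction score; the embodiment does not specify whether the interval is on the mean or on the score distribution itself, so the standard-error form below is an assumption, and the member scores are hypothetical.

```python
import numpy as np

def ensemble_confidence_metric(scores, z=1.96):
    """Return (ensemble score, UB - LB) for a 95% confidence interval on the
    mean of the ensemble members' prediction scores, assuming normality."""
    scores = np.asarray(scores, dtype=float)
    half_width = z * scores.std(ddof=1) / np.sqrt(len(scores))
    return scores.mean(), 2.0 * half_width    # metric = UB - LB

# Five hypothetical member scores for one amplification curve:
member_scores = [0.78, 0.80, 0.82, 0.79, 0.81]
ensemble_score, metric = ensemble_confidence_metric(member_scores)
# Tightly clustered member scores yield a small UB - LB metric, which would
# fall below a 0.15 "high" threshold in the example above.
```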
Step 1102 determines whether the computed confidence metric is below a first, lower threshold. If the result of step 1102 is yes, then processing proceeds to step 1103 which outputs a “high” confidence level. If the result of step 1102 is no, then processing proceeds to step 1104 which determines whether the confidence metric (e.g., UB-LB of the confidence interval) is below a next, higher threshold. If the result of step 1104 is yes, then processing proceeds to step 1105 which outputs a “medium” confidence level. If the result of step 1104 is no, then processing proceeds to step 1106 which outputs a “low” confidence level.
In one embodiment, the following thresholds are used: If the UB-LB of the 95% confidence interval is less than 0.25, then confidence is determined to be “high.” If UB-LB of the 95% confidence interval is equal to or greater than 0.25, but less than 0.6, then the confidence level is determined to be “medium.” If UB-LB of the 95% confidence interval is 0.6 or greater, then the confidence level is determined to be “low.” However, one skilled in the art will appreciate that appropriate thresholds for confidence determinations will depend on the data sets and context, and different thresholds might be used depending on the statistics of the underlying data set.
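The threshold logic of steps 1102-1106 might be sketched as follows, using the example thresholds of 0.25 and 0.6 given above. The function name and default parameters are illustrative only; as noted, appropriate thresholds depend on the data set and context.

```python
def confidence_level(metric, first_threshold=0.25, second_threshold=0.6):
    """Map a confidence metric (UB-LB of the confidence interval) to an
    intuitive confidence level using two empirical thresholds."""
    if metric < first_threshold:       # step 1102 -> step 1103
        return "high"
    if metric < second_threshold:      # step 1104 -> step 1105
        return "medium"
    return "low"                       # step 1104 -> step 1106
```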
Also, the thresholds used might change based on what Cq value is associated with the amplification curve being processed for prediction. For example, over some Cq ranges, a tighter range of prediction scores might be expected. As one example, if the Cq value corresponds to a cycle in the range of 12-20, then the amplified class probability would be expected to be consistently high (near 1) and a tighter confidence interval range (lower threshold for UB-LB) might be required (e.g., 0.2 or 0.15 instead of 0.25) in order to assign a “high” confidence value to the class probability (prediction score). Therefore, in some embodiments, a first (lower) threshold is dependent on Cq value. For example, a first threshold might be used at step 1102 that is lower than 0.25 (e.g., 0.15, 0.2) when the corresponding amplification curve has a relatively low Cq value (e.g., 12-20), and 0.25 or some other value higher than 0.2 might be used for curves with higher Cq values. In some embodiments, a second (higher) threshold is also dependent on Cq values.
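Cq-dependent selection of the first threshold might be sketched as follows. The Cq range of 12-20 and the values 0.15 and 0.25 follow the example above; the function name, the parameter defaults, and the treatment of curves without a Cq value are illustrative assumptions.

```python
def first_threshold_for_cq(cq, low_cq_range=(12, 20),
                           tight_threshold=0.15, default_threshold=0.25):
    """Select the lower ('high' confidence) threshold based on the Cq
    value of the amplification curve. Curves with low Cq values, where
    the amplified class probability is expected to be near 1, are held
    to a tighter confidence interval width."""
    if cq is not None and low_cq_range[0] <= cq <= low_cq_range[1]:
        return tight_threshold
    return default_threshold
```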
In some embodiments, processing 1100 (or similar processing) is used by block 1025 of
Although Cq values might be less relevant to assessing confidence of curve-quality predictions than they are to assessing confidence of amplification predictions, in some embodiments, other context-dependent factors might be used to modify thresholds for curve-quality confidence processing. As noted above, one skilled in the art will appreciate that appropriate thresholds for assigning confidence levels can be dependent on the context of a particular data set. The above thresholds are set forth merely by way of example.
As one skilled in the art will appreciate, in alternative embodiments, the confidence metric itself and/or other underlying statistics regarding the relevant dataset can be output through a computer user interface without necessarily assigning intuitive confidence values such as “low,” “medium,” or “high.” A sophisticated user might prefer to assess confidence based on the underlying confidence interval metrics rather than having the assessment generated automatically.
In the example, computer system 1200 may provide one or more of the components of an automated qPCR curve analysis system configured to implement one or more logic modules and artificial neural networks and associated components for a computer-implemented qPCR automated analysis system and associated interactive graphical user interface. Computer system 1200 executes instruction code contained in a computer program product 1260. Computer program product 1260 comprises executable code in an electronically readable medium that may instruct one or more computers such as computer system 1200 to perform processing that accomplishes the exemplary method steps performed by the embodiments referenced herein. The electronically readable medium may be any non-transitory medium that stores information electronically and may be accessed locally or remotely, for example, via a network connection. In alternative embodiments, the medium may be transitory. The medium may include a plurality of geographically dispersed media, each configured to store different parts of the executable code at different locations or at different times. The executable instruction code in an electronically readable medium directs the illustrated computer system 1200 to carry out various exemplary tasks described herein. The executable code for directing the carrying out of tasks described herein would typically be realized in software. However, it will be appreciated by those skilled in the art that computers or other electronic devices might utilize code realized in hardware to perform many or all of the identified tasks without departing from the present invention. Those skilled in the art will understand that many variations on executable code may be found that implement exemplary methods within the spirit and the scope of the present invention.
The code or a copy of the code contained in computer program product 1260 may reside in one or more persistent storage media (not separately shown) communicatively coupled to computer system 1200 for loading and storage in persistent storage device 1270 and/or memory 1210 for execution by processor 1220. Computer system 1200 also includes I/O subsystem 1230 and peripheral devices 1240. I/O subsystem 1230, peripheral devices 1240, processor 1220, memory 1210, and persistent storage device 1270 are coupled via bus 1250. Like persistent storage device 1270 and any other persistent storage that might contain computer program product 1260, memory 1210 is a non-transitory medium (even if implemented as a typical volatile computer memory device). Moreover, those skilled in the art will appreciate that in addition to storing computer program product 1260 for carrying out the processing described herein, memory 1210 and/or persistent storage device 1270 may be configured to store the various data elements referenced and illustrated herein.
Those skilled in the art will appreciate that computer system 1200 illustrates just one example of a system in which a computer program product in accordance with an embodiment of the present invention may be implemented. To cite but one example of an alternative embodiment, storage and execution of instructions contained in a computer program product in accordance with an embodiment of the present invention may be distributed over multiple computers, such as, for example, over the computers of a distributed computing network.
This application claims the benefit of U.S. Provisional Application 63/320,213 filed on Mar. 15, 2022. To the extent permitted in applicable jurisdictions, the entire contents of that application are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2023/015326 | 3/15/2023 | WO |

Number | Date | Country
---|---|---
63320213 | Mar 2022 | US