This invention concerns a method for measuring performance of a spectroscopy system. The invention has particular, but not exclusive, application to measuring the performance of a Raman spectroscopy system used to identify or quantify one or more components present in a sample from a known set of possible components, such as in a multiplex assay.
It is known to use Raman spectroscopy to identify components present in a sample. To enhance the Raman signal, surface enhanced resonance Raman scattering (SERRS) may be used. SERRS uses the principle that the molecules of the component to be identified are adsorbed on an active surface containing a chromophore having an electronic transition with a frequency near to (preferably within 150 nm of) the laser wavelength used to excite the plasmon on the enhancing substrate.
For a biological sample, to provide a sufficiently distinct chromophore for each type of molecule to be identified, the sample may be treated to attach different dyes to each type of molecule to be identified (e.g. different types of oligonucleotides). Examples of such techniques are described in WO09/022125 and US2006246460, which are incorporated herein by reference.
In one form, an assay is constructed to detect disease states by attaching the dye to an oligonucleotide built to complement a target nucleotide sequence (target sequence) known to be unique to the causative organism(s). The dye-labelled oligonucleotide is then introduced to a sample containing DNA fragments. If the target sequence is present in the sample, the dye-labelled oligonucleotide hybridises to it. By adding an oligonucleotide with biotin which also recognises the target sequence, it is possible to separate the DNA complex containing the target sequence and the dye using streptavidin coated magnetic beads, which attach to the complex via the biotin/streptavidin interaction. The dye sequence is then released and adsorbed onto silver or gold nanoparticles which, preferably when aggregated, act as the SERRS substrate for the dye, giving very strong signals from an aqueous environment. It is a characteristic of SERRS that the spectrum consists of a sharp set of lines almost always exclusively from the dye. This is because the Raman scattering surface enhancement factor for the dye is very high compared to the enhancement factor from the rest of the oligonucleotide, so other signals are very weak in comparison. These sharp lines are characteristic of the dye, giving in situ identification, and the sharp nature of the lines means that mixtures of dyes can be identified without separation. This enables the detection of multiple labels in one sample.
It is desirable to measure a level of performance that can be expected from such a multiplex assay. However, the complexity of the multiplex assay and the unknown nature of a sample mean that there is a wide range of factors that could affect the chances of correctly identifying or quantifying an analyte in the sample. It is desirable that a measure of performance of the multiplex assay takes into account this wide range of factors in order that the measure of performance fairly reflects the performance that is likely to be achieved by the user.
One way in which a performance of a system can be measured is to carry out multiple experiments in which factors that can change are deliberately varied and, from the results, a determination is made of the performance of the system. However, if the number of factors that can vary is large, if the factors are difficult to control or if one or more of the factors can vary over a large range, conducting experiments even approximately spanning all possible circumstances that can occur to obtain a representative measure of performance is a considerable challenge, if indeed it is possible at all.
According to a first aspect of the invention there is provided a method of measuring the performance of a spectroscopy system comprising
In this way, a measure of performance that takes into account possible variation can be determined within a reasonable time frame. In particular, it may be possible to collect a plurality of component spectra for each single component identifiable using the system that is representative of the spectral variability that may occur. These single spectra can then be used to simulate scenarios for which spectra have not been obtained, for example, variations in performance that occur with different concentrations and/or components and the performance of the system when more than one of the components is present in a sample. This approach may assume that the spectral response from a single component is linear with concentration and additive in the presence of spectral contributions from other components; that is, the Beer-Lambert law is obeyed by the system.
Many sample spectra may be simulated for any one potential sample in order to obtain a statistically significant measure of performance for that potential sample.
Accordingly, there may be a many-to-one correspondence between the sample spectra and potential sample(s). However, it may not be necessary to simulate multiple sample spectra for any one potential sample, such as if the measure of performance is for the overall system rather than specific to a particular potential sample.
The analyses may comprise qualifying the presence and/or absence of components based upon the sample spectrum. In such an embodiment, sample spectra may be simulated for different relative concentrations of the components but the analysis may only be concerned with whether or not a component can be identified. However, the analyses may comprise quantifying a concentration of components from the sample spectrum.
The measure of performance may comprise a measure of sensitivity and/or specificity of the system in identifying and/or quantifying one or more of the components. The measure of performance may comprise a limit of concentration of one or more of the components in a sample at which a minimum level of sensitivity is achieved for identifying and/or quantifying the one or more components.
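As a minimal sketch of how sensitivity and specificity might be tallied for one component over a set of simulated trials, the following assumes each trial is recorded as a pair of sets (components specified as present, components identified by the analysis). The function name, the trial format and the dye names are illustrative assumptions, not part of the described method.

```python
def sensitivity_specificity(trials, component):
    """Tally sensitivity and specificity for one component.

    trials: list of (present, identified) pairs, each a set of component names
    (hypothetical record format chosen for this sketch).
    """
    tp = fn = tn = fp = 0
    for present, identified in trials:
        if component in present:
            if component in identified:
                tp += 1  # present and correctly identified
            else:
                fn += 1  # present but missed
        else:
            if component in identified:
                fp += 1  # absent but wrongly reported
            else:
                tn += 1  # absent and correctly not reported
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Illustrative, made-up trial outcomes (not measured data).
trials = [
    ({"dyeA"}, {"dyeA"}),
    ({"dyeA", "dyeB"}, {"dyeB"}),
    ({"dyeB"}, {"dyeA", "dyeB"}),
    (set(), set()),
]
sens, spec = sensitivity_specificity(trials, "dyeA")
```

Repeating such a tally at each simulated concentration would give the data from which a limit of concentration for a minimum sensitivity could be read off.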
Each sample spectrum may be simulated for a specified quantity and/or quality of the characteristic of the corresponding potential sample, and generating a measure of performance may comprise a comparison of the measured quantity and/or quality with the specified quantity and/or quality. For example, a sample spectrum may be simulated for a specified component and/or concentration of the component in the potential sample and the analyses step identifies components and/or concentrations of components from the sample spectrum, the component and/or concentration identified in the analyses step being compared to the specified component and/or concentration to determine a measure of performance. Alternatively or additionally, the analyses step may obtain a measurement of a quality and/or quantity that was not specified/known in advance of simulating the sample spectrum, such as a distance between characteristic peaks.
The plurality of component spectra for each component may be obtained by performing experiments in accordance with an experimental design, such as fractional or full factorial design, in which factors identified as influencing a spectral response of the components are varied through a range of possibilities. These factors may include operator, time between preparation of a sample and measurement, instrument from which spectra are obtained, batch of the component(s) and/or batch of reagents, such as dye and/or colloid. The range in which factors may vary may be limited by system specifications, for example the system specification may require that the spectrum of a sample is obtained with a particular type, such as make, of instrument such that it is not necessary to obtain spectra from other makes of instruments and/or may require that the batch of components are obtained from a particular source such that it is not necessary to obtain spectra for components obtained from other sources.
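A full factorial design of the kind mentioned above can be enumerated as follows. This is a sketch only: the factor names and levels are invented for illustration, and the filter at the end shows how a system specification (here, a single permitted instrument) might narrow the runs that need to be performed.

```python
from itertools import product

# Hypothetical factors and levels; real ones would come from the assay design.
factors = {
    "operator": ["op1", "op2"],
    "instrument": ["inst1", "inst2"],
    "delay_min": [5, 30, 60],
}


def full_factorial(factors):
    """Return one dict per experimental run, covering every combination of levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


runs = full_factorial(factors)

# A specification restricting the assay to one instrument removes half the runs.
allowed_runs = [r for r in runs if r["instrument"] == "inst1"]
```

A fractional design would select a structured subset of `runs` rather than all of them.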
The plurality of component spectra obtained for each component may be obtained with the component at the same concentration, which may be a pre-determined reference concentration. However, these concentrations may vary between components, depending upon system specifications.
The analyses may comprise analysing each sample using a reference spectrum for each component to obtain the measured quantity and/or quality of the characteristic. The method may comprise selecting a reference spectrum for each component of the set of components, each reference spectrum selected from the plurality of component spectra for that component and differing from the component spectrum used to simulate the sample spectrum that is analysed using the reference spectrum. Different reference spectra may be selected for the analysis of different sample spectra. In this way, the measure of performance may take account of variations in reference spectra as well as variations in sample spectra. Use of different reference spectra to analyse a sample spectrum may produce different results. A pool of component spectra from which a reference spectrum may be selected may be limited by system requirements. For example, if it is a requirement when analysing an unknown sample that the reference spectra are obtained using the same spectroscopy apparatus as that used to obtain the spectrum from the unknown sample, the method of measuring performance may only select reference spectra from component spectra that have been collected using the same apparatus as the component spectrum/spectra used to simulate the sample spectrum. Similar restrictions on the reference spectra may apply to other factors such as temperature, time between measurements, etc whose variation may be limited between reference spectra and sample spectra, even though the plurality of component spectra capture a greater range of variations.
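The selection rule described above can be sketched as a filter over a spectrum library followed by a random choice. The library layout (dicts with `id`, `component` and `instrument` keys) and the function name are assumptions made for this example; the same-instrument restriction stands in for whatever system requirement applies.

```python
import random


def select_reference(library, component, sample_spectrum_id, rng):
    """Pick a reference spectrum for `component` that differs from the spectrum
    used to simulate the sample and was collected on the same instrument."""
    used = next(s for s in library if s["id"] == sample_spectrum_id)
    candidates = [
        s for s in library
        if s["component"] == component
        and s["id"] != sample_spectrum_id          # must differ from the simulated input
        and s["instrument"] == used["instrument"]  # assumed system requirement
    ]
    if not candidates:
        raise ValueError("no eligible reference spectrum for " + component)
    return rng.choice(candidates)


# Hypothetical library metadata (spectral data omitted for brevity).
library = [
    {"id": 0, "component": "dyeA", "instrument": "inst1"},
    {"id": 1, "component": "dyeA", "instrument": "inst1"},
    {"id": 2, "component": "dyeA", "instrument": "inst2"},
    {"id": 3, "component": "dyeB", "instrument": "inst1"},
]
reference = select_reference(library, "dyeA", 0, random.Random(0))
```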
The sample spectra may be simulated for different concentrations of the or each component in the potential sample.
The component spectra used to simulate each sample spectrum, and optionally the reference spectra used to analyse each simulated sample spectrum, may be randomly selected from an appropriate set of spectra. Such a procedure may be appropriate when the number of possible combinations of component spectra and reference spectra is so large as to make a more systematic approach unfeasible. An appropriate set of component spectra may be all of the component spectra for a component or may be a subset of these component spectra.
The or each component in the potential sample and/or a concentration of the or each component in the potential sample may be randomly selected based upon at least one probability distribution. Such a probability distribution may represent the likelihood of finding a component, combination of components and/or concentration of a component in a sample. The probability distribution may be a likelihood of a sample containing particular components, a particular component at various concentrations or combinations of particular components. Alternatively, a probability distribution may not be specific to particular components but may be a general likelihood of a sample containing a non-specific component of the set at various concentrations or a number of non-specific components in a combination. This alternative may be appropriate when the probability distributions are similar for different components/combinations such that there is little advantage in using separate distributions for different components/combinations or when there is insufficient information for component specific probability distributions. Biasing the random selection towards certain combinations using the probability distribution may ensure that the measure of performance is not biased by scenarios that are unlikely to occur in reality.
The analysis may be a multivariate analysis technique, such as a method based upon a direct classical least squares (DCLS) analysis.
The method may be for measuring the performance of a system using surface enhanced resonance Raman spectroscopy (SERRS), such as a SERRS multiplex assay, as described above. The components may be dyes to be used within the multiplex assay.
The method may form part of a method of selecting specifications of the system. Specifications of the system may be selected and/or modified based upon the measurements of performance. Such selection and/or modification of the specifications may alter the set of component spectra and/or potential samples that can be measured by the system. For example, in response to measurement of performance of an initial set of sample spectra, more stringent requirements may be placed on a factor that influences spectral response of the components, e.g. modifying the maximum time allowed between preparation of a sample and measurement, eliminating the need to simulate sample spectra using component spectra obtained under conditions falling outside of these new specifications and/or only for a subset of components.
Initially, the performance may be measured for potential samples comprising only one component, with an initial selection and/or modification of the specifications being made based upon these measurements, and later measurements of performance being based upon potential samples comprising multiple components. In this way, spectra and/or components may be eliminated before analysing multi-component samples, reducing the number of possible multi-component samples and, therefore, the processing that is required.
The invention also concerns a system, such as a multiplex assay, designed in accordance with the above method.
According to another aspect of the invention there is provided a data carrier having instructions thereon which, when executed by a processor, cause the processor to:
According to a further aspect of the invention there is provided apparatus for measuring the performance of a spectroscopy system, the apparatus comprising:—
The apparatus may comprise memory having stored thereon a library of the plurality of component spectra, the processor arranged to retrieve the spectra from the memory, as required. It will be understood that the step of retrieving a plurality of spectra from memory may be carried out before or during step b). For example, the relevant spectrum may be retrieved only once it has been selected for use as a reference spectrum or for simulating the sample spectrum.
According to a further aspect of the invention there is provided a method of designing a multiplex assay that uses spectroscopy to identify analytes comprising selecting a specification for the multiplex assay, simulating spectra representative of possible variation of factors within the specification, calculating a measure of performance based upon the simulated spectra and modifying the specification based upon the measure of performance.
According to another aspect of the invention there is provided a method of measuring the performance of a spectroscopy system comprising
Referring to
In this embodiment, each spectrum is obtained from a test sample containing the component at a reference concentration. The reference concentration may be selected to be a target or expected concentration of the component in an unknown sample to be analysed. For example, the reference concentration may be an expected concentration of an analyte in a patient sample or a target concentration to be achieved by an amplification of the analyte, such as by PCR.
There is no lower or upper limit on the number of spectra, but each set should encompass factors effecting variation in the spectroscopy signal. In this embodiment, the spectra are filtered to remove spectra where the spectroscopy signal is either significantly weaker or stronger than the average. This is intended to remove outlying spectra, which are not representative of the variation to be expected.
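One simple way to filter out spectra whose signal is much weaker or stronger than the average is sketched below. The choice of summed intensity as the measure of signal strength and the 50 % tolerance around the mean are assumptions for illustration; the text does not state which criterion is used.

```python
def filter_by_signal_strength(spectra, tolerance=0.5):
    """Keep spectra whose summed intensity lies within `tolerance` (as a fraction)
    of the mean summed intensity. Both the metric and the threshold are assumed."""
    strengths = [sum(s) for s in spectra]
    mean = sum(strengths) / len(strengths)
    return [
        s for s, strength in zip(spectra, strengths)
        if abs(strength - mean) <= tolerance * mean
    ]


# Toy two-point "spectra": four typical, one unusually strong, one unusually weak.
spectra = [[5, 5], [6, 5], [4, 5], [5, 5], [10, 10], [1, 1]]
kept = filter_by_signal_strength(spectra)
```

Because a mean is itself pulled by extreme values, a median-based criterion might be preferred where a few very strong outliers are expected.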
A set of “blank” spectra (not shown) are also obtained for samples containing no components of interest. For example, a spectrum may be obtained from a sample comprising the support substrate only.
In a second step, an analysis technique is used to determine whether the correct component(s) can be identified from a simulated spectrum for a plurality of potential samples. In this embodiment, the analysis technique is a method based upon Direct Classical Least Squares (DCLS) fitting of reference spectra to a sample spectrum, as described below with reference to
To obtain a statistically significant data set, sample spectra are repeatedly simulated and analysed for a potential sample by:—
These steps are carried out for a number of different potential samples. The table of
Referring to
wherein Ck is the concentration relative to its reference concentration for each component k in the potential sample and B is the selected blank spectrum. In this way, the resultant sample spectrum 201 comprises a blank contribution 201a that is representative of that which would be found in a signal from a typical sample.
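The simulation described here, in which each selected component spectrum is scaled by its relative concentration Ck and summed with the blank spectrum B, can be sketched as follows. Spectra are represented as plain lists of intensities on a common frequency axis; the function name and the toy values are illustrative assumptions.

```python
def simulate_sample_spectrum(component_spectra, concentrations, blank):
    """Return sum_k C_k * S_k + B, assuming a linear, additive response.

    component_spectra: dict name -> list of intensities at the reference concentration
    concentrations:    dict name -> concentration relative to the reference (C_k)
    blank:             list of intensities for the selected blank spectrum (B)
    """
    sample = [float(b) for b in blank]
    for name, c_k in concentrations.items():
        spectrum = component_spectra[name]
        if len(spectrum) != len(sample):
            raise ValueError("spectra must share the same frequency axis")
        for i, intensity in enumerate(spectrum):
            sample[i] += c_k * intensity
    return sample


# Toy spectra (made-up values) for two dyes and a flat blank.
component_spectra = {"dyeA": [0, 10, 0, 2], "dyeB": [4, 0, 6, 0]}
blank = [1, 1, 1, 1]
sample = simulate_sample_spectrum(component_spectra, {"dyeA": 0.5, "dyeB": 2.0}, blank)
```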
In a third step one or more measures of performance are calculated from the data set. Referring to
Alternatively or additionally, the measure of performance may be an estimated number of true positives (sensitivity) for each specified sample.
The measure(s) of performance may be used to select/modify specifications for a system. For example, a component may be selected for use in a multiplex assay based upon whether or not its sensitivity, limit of concentration and/or specificity is above a predefined level.
Other measurements of performance may be used for selecting/modifying other specifications of the system. For example, measures of performance may be determined for different time periods between adding a colloid to a sample and taking a measurement. A cut-off point could then be determined at which a delay between adding the colloid and taking a measurement reduces sensitivity, specificity, a limit of concentration or other statistical measure of performance below an acceptable level. Such analysis may be carried out for other factors that may affect the spectroscopy signal, such as source of components/reagents, relative concentration of reagents and/or type of spectroscopy apparatus on which spectra are collected.
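A cut-off of this kind could be located with a short scan over the tested delays, as in the sketch below. The sensitivity figures are invented for illustration only, and the rule used (the longest delay up to which every tested delay still meets the threshold) is one assumed interpretation of the cut-off.

```python
def delay_cutoff(sensitivity_by_delay, min_sensitivity):
    """Return the longest tested delay for which this and all shorter tested
    delays meet `min_sensitivity`, or None if even the shortest delay fails."""
    cutoff = None
    for delay in sorted(sensitivity_by_delay):
        if sensitivity_by_delay[delay] < min_sensitivity:
            break
        cutoff = delay
    return cutoff


# Hypothetical sensitivities keyed by delay in minutes (not measured data).
measured = {5: 0.99, 15: 0.98, 30: 0.96, 60: 0.91, 120: 0.80}
cutoff = delay_cutoff(measured, 0.95)
```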
In a further embodiment, the sample spectra that are simulated may be based upon probability distributions used to select the components and concentrations for the components. To simulate a sample spectrum, firstly a random selection is made on whether or not each component of the set is present based upon a probability distribution. A simple form of such a selection comprises using data on probabilities that one, two, three, etc components are present in a sample to randomly select how many components are present and then randomly selecting, based on an equal probability for each component, that number of components from the set of components. Secondly, a random selection is made of the concentration of each component in the sample relative to its reference concentration. Again, a simple form of such a selection is to use a single probability distribution for concentration of a component to randomly select the concentrations of all the components that have been selected as present. A sample spectrum is then simulated using a randomly selected spectrum for each component selected as present scaled for the chosen concentration.
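The simple two-stage selection described above can be sketched with the standard library's weighted sampling. All probabilities, concentration levels and names below are illustrative assumptions; a discrete set of concentration levels is used only to keep the example short.

```python
import random


def draw_potential_sample(components, count_probs, conc_levels, conc_probs, rng):
    """Randomly choose which components are present and at what relative concentration.

    count_probs: dict mapping number of components present -> probability
    conc_levels / conc_probs: a single shared distribution of relative concentrations
    """
    counts = list(count_probs)
    n = rng.choices(counts, weights=[count_probs[c] for c in counts], k=1)[0]
    present = rng.sample(components, n)  # equal probability for each component
    levels = rng.choices(conc_levels, weights=conc_probs, k=n)
    return dict(zip(present, levels))


# Hypothetical distributions for a four-dye set.
components = ["dyeA", "dyeB", "dyeC", "dyeD"]
count_probs = {1: 0.6, 2: 0.3, 3: 0.1}
conc_levels = [0.1, 0.5, 1.0, 2.0]
conc_probs = [0.1, 0.3, 0.4, 0.2]

potential_sample = draw_potential_sample(
    components, count_probs, conc_levels, conc_probs, random.Random(0)
)
```

The returned mapping of component to relative concentration is what a simulation step would then scale the randomly selected component spectra by.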
Such a simple form for randomly selecting the component spectra and concentration used to simulate the sample spectrum may be used because of lack of information on probabilities for individual components or because the likelihood of a component being present in a sample and/or likelihood of a component having a particular concentration relative to its reference concentration is the same or similar for all the components of the set and for all combinations of components. Where there is a significant variation between components, individual probability data may be used for each component. Such data may comprise data on the likelihood of each component being found on its own as well as the likelihood of the component being found in the presence of other components. For example, one component may be used as a control in a multiplex assay and therefore, have a very high chance of being present, or two components may have a zero or very low chance of being found together if they are naturally mutually exclusive. Different probability distributions for concentrations of the components may also be used.
The use of probability distributions for biasing selections could also be extended to the selection of a component spectrum for a particular component. For example, a random selection of a component spectrum may be made based upon a probability distribution for variation in a particular factor for which component spectra were collected, such as the time between adding a colloid to the sample and measurement, where it may be more likely that the time between falls at a central time between two extremes. Selection of the component spectrum may be biased towards selecting a component spectrum obtained for a more likely value of the factor.
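Biasing the choice of component spectrum towards more likely factor values might look like the sketch below, where each spectrum carries the delay at which it was collected and a relative likelihood is assigned to each delay. The likelihood values, key names and function name are assumptions for illustration.

```python
import random
from collections import Counter


def pick_component_spectrum(spectra, likelihood, rng):
    """Choose one spectrum, weighted by the likelihood of its collection delay."""
    weights = [likelihood[s["delay_min"]] for s in spectra]
    return rng.choices(spectra, weights=weights, k=1)[0]


# Hypothetical library entries and relative likelihoods peaking at the central delay.
spectra = [{"id": i, "delay_min": d} for i, d in enumerate([0, 15, 30, 45, 60])]
likelihood = {0: 0.1, 15: 0.5, 30: 1.0, 45: 0.5, 60: 0.1}

rng = random.Random(1)
counts = Counter(
    pick_component_spectrum(spectra, likelihood, rng)["delay_min"] for _ in range(2000)
)
```

Over many draws, spectra collected near the central delay are selected far more often than those at the extremes, so the simulated sample spectra lean towards typical operating conditions.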
Such a method of forming sample spectra may provide a set of sample spectra that are more representative of that which would have been obtained through experiment/in use. A measurement of performance calculated from the set may be more representative of that to be expected because it is not unduly distorted by over-representation of sample spectra determined for improbable potential samples.
The invention has particular application to measuring the performance of a multiplex assay in which SERRS is used to identify the components. However, the method of the invention could be used to measure the performance of other spectroscopy systems and, in particular, other Raman based systems, in which a reference spectrum is used to identify an unknown component from that unknown component's spectrum.
Referring to
where i represents the spectral frequency index. This results in a series of linear equations which can be solved directly by matrix inversion for the component concentrations Ck.
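Equation (1) itself is not reproduced in this text. A conventional DCLS objective consistent with the description is given here as an assumption, with $S_i$ (the sample spectrum at frequency index $i$) and $R_{k,i}$ (the reference spectrum of component $k$) as assumed symbols:

$$\min_{\{C_k\}} \; \sum_i \Big( S_i - \sum_k C_k R_{k,i} \Big)^2$$

Setting the derivative with respect to each $C_j$ to zero would give one linear equation per component,

$$\sum_k C_k \sum_i R_{j,i} R_{k,i} = \sum_i R_{j,i} S_i ,$$

which is the kind of linear system that can be solved by matrix inversion for the concentrations $C_k$.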
An iterative process is carried out in which Equation (1) is resolved for each candidate component using the selected reference spectrum, steps 103 to 108.
In step 103, for each candidate component, equation (1) is minimised for the component's reference spectrum together with any component reference spectra that have already been selected in a previous iteration. A measure of goodness of fit is calculated for the resolved components relative to the simulated spectrum.
The measure of goodness of fit can be a measure of lack of fit (LoF) given by:—
This measure of lack of fit is compared to a previous measure of LoF calculated for the selected component reference spectra before the addition of the candidate component reference spectrum to determine an improvement to the measure of LoF resulting from the addition.
The improvement in the LoF, LIpr, is calculated as a proportional improvement in the LoF:—
where Lold is the LoF value calculated for the selected component reference spectra before the inclusion of the candidate component reference spectrum and Lnew is the LoF value calculated for the selected component reference spectra including the candidate component reference spectrum.
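The expressions for the LoF and for LIpr are not reproduced in this text. One plausible pair of definitions, stated as an assumption and using the assumed symbols $S_i$ for the sample spectrum and $R_{k,i}$ for the reference spectrum of component $k$, is a residual norm relative to the signal norm and a proportional change in that value:

$$\mathrm{LoF} = \sqrt{\frac{\sum_i \big( S_i - \sum_k C_k R_{k,i} \big)^2}{\sum_i S_i^2}}, \qquad L_{Ipr} = \frac{L_{old} - L_{new}}{L_{old}}$$

Under these definitions a model with no reference spectra has LoF equal to 1, and LIpr is the fraction of the previous lack of fit removed by adding the candidate.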
In step 104, the candidate component reference spectra resolved as having a negative concentration are removed from further consideration in the iteration (but not subsequent iterations).
In step 105, the improvements in the LoF, LIpr, for the remaining candidate component reference spectra are compared and the candidate component reference spectrum associated with the greatest improvement in the LoF becomes the leading candidate component reference spectrum for inclusion in the final form of the model.
A check 106 is made to determine whether the improvement in the LoF resulting from addition of the leading candidate component reference spectrum is above a preset limit. If the improvement to the LoF, LIpr, for the leading candidate component reference spectrum is above the preset limit, it is selected 107 as a component reference spectrum that is present in the final form of the model. The process 103 to 107 is then repeated for the remaining unselected component reference spectra.
If the improvement to the LoF, LIpr, for the leading component reference spectrum is below the preset limit, then the method is terminated and the final form of the model, comprising the model resolved for the component reference spectra selected up to that point, is output. The final form of the model will typically comprise a subset of the set of predetermined component reference spectra, these spectra being those of most significance as measured by lack of fit.
A determination can be made of components present in the sample based upon whether the reference spectrum corresponding to a component is included in the final form of the model.
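The iterative selection of steps 103 to 108 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the referenced implementation: the LoF is taken to be the residual norm divided by the signal norm, the improvement is taken to be (Lold - Lnew) / Lold, the preset limit of 0.1 is arbitrary, and all function names and spectra are invented.

```python
def solve_least_squares(columns, target):
    """Solve the normal equations (A^T A) c = A^T s by Gauss-Jordan elimination.
    Fails with ZeroDivisionError if the reference spectra are linearly dependent."""
    n = len(columns)
    m = [[sum(a * b for a, b in zip(columns[r], columns[c])) for c in range(n)]
         + [sum(a * s for a, s in zip(columns[r], target))]
         for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def lack_of_fit(columns, coeffs, target):
    """Assumed LoF: residual norm relative to signal norm (1.0 for an empty model)."""
    resid = [s - sum(c * col[i] for c, col in zip(coeffs, columns))
             for i, s in enumerate(target)]
    return (sum(r * r for r in resid) / sum(s * s for s in target)) ** 0.5


def identify_components(references, sample, min_improvement=0.1):
    """Add reference spectra one at a time while the LoF improvement exceeds the limit."""
    selected = []
    l_old = lack_of_fit([], [], sample)
    while l_old > 1e-12:  # guard: nothing left to improve on an exact fit
        best = None
        for name in references:
            if name in selected:
                continue
            cols = [references[n] for n in selected + [name]]
            coeffs = solve_least_squares(cols, sample)      # step 103
            if coeffs[-1] < 0:                              # step 104
                continue
            l_new = lack_of_fit(cols, coeffs, sample)
            improvement = (l_old - l_new) / l_old
            if best is None or improvement > best[1]:       # step 105
                best = (name, improvement, l_new)
        if best is None or best[1] <= min_improvement:      # step 106
            break
        selected.append(best[0])                            # step 107
        l_old = best[2]
    return selected


# Toy reference spectra; the sample is dyeA + 0.5 * dyeB plus an unmodelled background.
references = {
    "dyeA": [0, 10, 0, 2, 0, 0],
    "dyeB": [4, 0, 6, 0, 0, 0],
    "dyeC": [0, 0, 3, 0, 8, 0],
}
sample = [2, 10, 3, 2, 0, 1]
found = identify_components(references, sample)
```

In this toy case the third dye contributes nothing to the fit, so it is left out of the final model and only the first two dyes are reported as present.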
Further details regarding the above method and of the preferred apparatus for conducting this method can be found in UK patent “Spectroscopic apparatus and methods for determining components present in a sample” application number EP11250530.0, filed on 16 May 2011.
Number | Date | Country | Kind |
---|---|---|---|
12163397.8 | Apr 2012 | GB | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2013/050864 | 4/2/2013 | WO | 00 |