The present invention relates to methods for obtaining a reaction volume having a predetermined copy number of a known nucleic acid molecule therein. Such a reaction volume may be useful as a control for nucleic acid amplification reactions. Aspects of the invention relate to a nucleic acid molecule that may be useful in such methods. Further aspects of the invention are described herein.
Molecular Diagnostic Testing is the process by which the presence (or absence) of a certain DNA sequence (or other molecular species, such as a protein) is confirmed with high sensitivity and specificity. Sensitivity is the ability to detect the analyte even when very few copies of it are physically present. Specificity is the ability to reliably detect the analyte without erroneously detecting any other species present, which may be only marginally distinct from the analyte of interest. The desirable combination of sensitivity and specificity is achieved when a rare species is reliably detected in a test sample in which many other very similar molecular species are present. High sensitivity and specificity are often hard to achieve, and are critical considerations in judging whether a technology is fit for purpose. A false negative may be generated by a test that has low sensitivity (regardless of how specific the test is), whereas a false positive may result from a test that has low specificity, even where a highly sensitive test is applied to a test sample from which the analyte is absent.
Molecular Diagnostic Testing is frequently employed during medical investigations, and the consequences of a false negative result, or a false positive result, can be detrimental to clinical decision-making and the selection of appropriate treatment for a patient.
It should be the goal of any sufficiently optimised molecular diagnostic test to return results that are reliable and reproducible, performing within set parameters. Such assurance is often obtained through the analysis of control samples concurrently with the test samples. These control samples are of known content, and therefore the outcome of the molecular tests performed upon these controls is predictable. Failure to generate the anticipated result from known controls, be they positive or negative, invalidates any results generated from concurrently run test samples. Good controls are deliberately designed to be more prone to failure than the actual test sample, so that a positive result generated from a test sample is credible if the positive control has also generated a positive result. A good positive control is, by design, more likely to fail under standard test conditions than a genuinely positive test sample. This may be because, for example, the control contains fewer copies of the analyte than are present in the test sample, placing greater demands on sensitivity for the control than for the test sample. The provision of a meaningful control regime is key to the assembly of a well-designed and reliable molecular diagnostic test.
In the field of molecular diagnostics, there is a need to accurately test DNA analytes present in very low copy number, which may be as low as a single molecule of DNA, in a test sample that may be as complicated as a human genome. In situations where the test analyte (DNA sequence) is present at low single digit copies, and there may be a large number of other (similar) DNA species present, there is a need to provide a low copy number positive control that will validate the generation of a test sample positive result and, to some extent, validate the generation of a test sample negative result.
Controls, and specifically the means of generation disclosed here, are of broad utility to demonstrate the sensitivity of many amplification-based reactions and may, for example, be of utility in forensic science, demonstrating the efficiency of detection of a particular forensic target using a particular forensic DNA amplification technology.
Many other applications will also be enabled by the provision of controls of known copy number, especially single molecules. The greatest application of the invention is anticipated in the field of medical investigation, for example in the detection of blood stream infections or in the analysis of somatic mutations driving oncogenesis.
Among the objects of aspects of the invention is the provision of an artificial and manipulable control DNA analyte that can be co-amplified with a test sample, employing test components (DNA primers, for example) that are identical to those that interrogate the test sample for the presence of the test analyte. Because the control DNA analyte is amplified at the same time as the test analyte (if present), the detection of the control DNA analyte provides a reliable verification that the test, as assembled, was capable of amplification of the test analyte, even where that test analyte is itself only present at very low copy number, and perhaps as low as a single copy. In addition, the sensitivity of the system provides some degree of verification of a negative test result: if the artificially introduced single copy control is detected with great certainty, the characteristics of the test design give some degree of confidence that a negative test sample result is indeed due to there being fewer than one molecule of the test analyte in the test sample, and that the test sample was therefore truly devoid of the test analyte.
According to a first aspect of the invention, there is provided a nucleic acid cassette comprising a nucleic acid molecule comprising a first region and a second region, wherein the second region is flanked by first and second primer binding sites to allow amplification across the second region, and wherein a selectively cleavable region is located between the first and second regions, with the selectively cleavable region being flanked by third and fourth primer binding sites to allow amplification across the selectively cleavable region.
As will be apparent from the detailed description below, this nucleic acid cassette allows a reaction volume to be prepared having a predetermined copy number of a nucleic acid molecule comprising the first region therein. In brief, though, the second region acts as a reporter to indicate the presence of the entire nucleic acid molecule (also termed a nucleic acid cassette); the selectively cleavable region can be cleaved to separate the first region from the second region, leaving an isolated copy of the first region. The second region may be termed herein a reporter region. The first region may be termed herein a mimic region (as it is intended to mimic a test sequence).
The first region may be flanked by fifth and sixth primer binding sites to allow amplification across the first region. In a preferred embodiment, the fifth and sixth primer binding sites correspond to primer binding sites flanking a desired test nucleic acid sequence. “Correspond to” means that a primer capable of binding to a given primer binding site in the nucleic acid molecule will also be capable of binding to the corresponding primer binding site in the test nucleic acid sequence. This means that the same primers may be used in an assay to amplify both the test sequence and the first (mimic) region. Hence the first region can act as a control to confirm that the nucleic acid amplification has been successful. In certain embodiments the fifth and sixth primer binding sites are identical to the corresponding primer binding sites, although in other embodiments the fifth and sixth primer binding sites are non-identical (for example, they may differ by 1, 2, 3, 4, 5, or more nucleotides) to the corresponding primer binding sites. Where the primer binding sites are non-identical, this reduces the efficiency of the nucleic acid amplification of the first region, which may be desirable when the first region is being used as a control.
The selectively cleavable region may be a native nuclease binding site; for example, a restriction enzyme site (which should not be found in the cassette other than in the cleavable region(s)). Synthetic nucleases, such as metallic complexes, or engineered nuclease activities such as CRISPR Cas9 and derivatives could also achieve directed strand breakage. Alternatively, the selectively cleavable region may be chemically cleavable or photo-cleavable, or a combination of the two; it may comprise modified nucleotides or a non-nucleotide moiety which can be selectively cleaved. Examples include o-nitrobenzyl modifications, which render the nucleic acid photolabile, and a 7-nitro-indole modification, which permits light activation and subsequent mild alkaline or thermal cleavage.
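The requirement that a restriction enzyme site should occur only in the cleavable region can be checked computationally when designing a cassette. The following is a minimal sketch only: the sequences are hypothetical toy sequences, the EcoRI recognition sequence (GAATTC) is used purely as an example, and the function names are illustrative rather than part of the invention.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_site(seq: str, site: str) -> int:
    """Count occurrences of a recognition site on either strand of seq."""
    seq, site = seq.upper(), site.upper()
    # A palindromic site equals its own reverse complement, so the set has one entry.
    targets = {site, reverse_complement(site)}
    return sum(
        1
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] in targets
    )


# Hypothetical toy cassette: mimic region, EcoRI site, reporter region.
mimic = "ATGCCGTACGTTAGC"
reporter = "CGTTAACCGGATCCA"
cassette = mimic + "GAATTC" + reporter

site_count = count_site(cassette, "GAATTC")
print("site occurs once:", site_count == 1)
```

A design in which the count is greater than one would be cleaved at an unintended position, so such a candidate sequence would be revised before use.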
The cassette may comprise multiple reporter regions, each flanked by primer binding sites. Each reporter region may be separated from its neighbour by a selectively cleavable region. The multiple reporter regions are preferably identical, as are the multiple selectively cleavable regions. In certain embodiments, however, the multiple reporter regions may be different.
The cassette may comprise multiple mimic regions. Each mimic region may be flanked by primer binding sites. In certain embodiments, the multiple mimic regions are different, while in others the multiple mimic regions are identical. Similarly, the multiple primer binding sites may be different, such that each mimic region may be amplified with a different primer pair.
Where the cassette comprises multiple reporter and/or mimic regions, these need not be in any particular order. For example, all reporter regions may be grouped together, or they may alternate with mimic regions, or any other order may be used.
The nucleic acid molecule may be linear, or it may be circular, such as a plasmid.
The mimic region may be selected to possess certain desired properties; for example, length, G/C composition, absence of repeat sequences, and/or capacity to form secondary structures. The desired properties may vary depending on the desired use of the nucleic acid molecule; for example, when used as a control, the length of the mimic region may be selected so as to be similar but not necessarily identical to the desired test analyte sequence, allowing both to be readily distinguished if both are amplified in an assay. The desired properties may also be selected so as to mimic the test analyte region, such that conditions which allow amplification of the control will also be expected to allow amplification of the test analyte sequence.
The reporter region may comprise modified nucleotides; for example, certain nucleotides may be labelled with a detectable label. This may aid its use as a reporter. Alternatively the reporter region need not be labelled, and can be detected in some other way; for example, by the use of dyes which detect dsDNA.
The nucleic acid molecule is preferably DNA.
The nucleic acid molecule may be immobilised on a solid support. The solid support may be a bead, a membrane, an adsorbent surface, or the like.
Also provided is a solid support having a nucleic acid cassette as herein defined immobilised thereon.
A further aspect of the invention provides a method for obtaining a predetermined copy number of a known nucleic acid molecule, the method comprising the steps of:
In embodiments where the nucleic acid cassette comprises fifth and sixth primer binding sites flanking the first region, then the nucleic acid molecule in the plurality of volumes in (e) will also comprise the primer binding sites.
The plurality of reaction mix volumes of (a) may be prepared as a water-in-oil emulsion of an aqueous solution comprising the cassette. Each emulsion droplet represents a reaction volume. The emulsion is preferably prepared by combining an aqueous solution comprising the cassette with oil. The emulsion may be achieved by mixing, sonication, or injection of aqueous solution into oil. Injection is preferred as it allows greater control over droplet volumes. The droplets may be nanolitre, picolitre, or femtolitre scale volumes. The emulsion may contain a detergent, to help keep the droplets separate.
When preparing an emulsion, the desired copy number can be achieved by using a relatively low concentration of the cassette in the aqueous solution. The concentration can be selected such that this results in a statistical distribution within the volumes such that the majority of volumes will include no cassette, while at least some include the desired copy number of cassettes (and some may include more than the desired copy number).
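The statistical distribution referred to above can be illustrated numerically. The sketch below assumes, as an illustration only, that cassettes partition independently and at random into equal volumes, so that the number per volume follows a Poisson distribution; the text does not itself name a particular distribution, and the mean loading of 0.1 cassettes per volume is a hypothetical value.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cassettes in a volume when the mean is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(mean: float, desired: int = 1) -> dict:
    """Fractions of volumes that are empty, hold the desired number, or hold more."""
    empty = poisson_pmf(0, mean)
    wanted = poisson_pmf(desired, mean)
    # Everything above the desired copy number: 1 minus P(0..desired).
    excess = 1.0 - sum(poisson_pmf(k, mean) for k in range(desired + 1))
    return {"empty": empty, "desired": wanted, "excess": excess}


# Hypothetical loading: on average 0.1 cassettes per droplet.
fractions = occupancy(0.1)
for name, value in fractions.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions roughly 90% of volumes are empty, about 9% hold a single cassette, and fewer than 0.5% hold more than one, which matches the qualitative picture described above.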
Preferably, the desired copy number of cassettes is one.
The desired copy number of the nucleic acid molecule may be the same as the desired copy number of cassettes. Preferably this is one, and this may be achieved by use of a cassette having a single first region. Alternatively, the nucleic acid molecule copy number may be greater than the copy number of cassettes; copy numbers which are a multiple of the copy number of cassettes may be achieved using cassettes having more than one identical first region. For example, where the cassettes have two first regions, and the cassette copy number is one, then the copy number of the nucleic acid molecule will be two.
Preferably the majority of reaction mix volumes in (a) include no copies of the cassette. By “at least some” of the volumes containing the desired copy number of the cassette, it is meant that at least 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.1%, 0.01%, or 0.001% of the volumes contain the desired copy number.
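As an illustration of how the two preferences above interact, the following sketch estimates the mean loading needed for a chosen percentage of volumes to hold exactly one cassette, and checks that the majority of volumes remain empty. It assumes Poisson-distributed loading, which is an assumption of this example rather than a statement in the text; the function name is hypothetical.

```python
import math


def mean_for_single_fraction(target: float, tol: float = 1e-12) -> float:
    """Smallest mean loading at which a fraction `target` of volumes holds one cassette.

    Solves mean * exp(-mean) = target by bisection on [0, 1], where the
    left-hand side is increasing and reaches its maximum of exp(-1).
    """
    if not 0.0 < target < math.exp(-1):
        raise ValueError("target must lie between 0 and exp(-1)")
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if mid * math.exp(-mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


for percent in (10, 1, 0.1):
    mean = mean_for_single_fraction(percent / 100.0)
    mostly_empty = math.exp(-mean) > 0.5  # majority of volumes hold no cassette
    print(f"{percent}% singles: mean loading {mean:.5f}, majority empty: {mostly_empty}")
```

Under the Poisson assumption, the majority of volumes are empty whenever the mean loading is below ln 2 (about 0.69), so each of the percentages listed above is compatible with that preference.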
One or more, and preferably all, of the combining steps may be carried out by fusion of two or more reaction mix volumes. These volumes may be in the form of droplets, for example, water-in-oil droplets. Fusion of droplets can be carried out using microfluidics systems, with known techniques. For example, suitable techniques include forced merging within a microchannel (described in Hung et al, Lab Chip, 2006, 6, 174-178) or collision between droplets through Electrowetting-on-Dielectric (EWOD) (described in Fan et al, Lab Chip, 2009, 9(9), 1236-1242). In alternative embodiments, the reaction mix volumes may be contained within a reaction vessel, for example, a multiwell plate. In these embodiments, reaction mixes may be successively added to a reaction vessel in order to combine them. Either approach allows for nucleic acid amplification to be carried out in situ, for example by thermal cycling reactions.
Nucleic acid amplification preferably comprises polymerase chain reaction (PCR) amplification. Necessary reagents for nucleic acid amplification may be combined with the reaction mix volumes at each step, or may be contained within the initial reaction mix volume, such that only the necessary primers are added at each step. Necessary reagents may include DNA polymerase enzyme, buffers, dNTPs, and so forth. The skilled person will be aware of how to perform nucleic acid amplification, and which reagents are necessary.
The nucleic acid amplifications may be labelled. This allows ready determination of whether amplification has occurred or not. Labelling may be performed with a dye, eg, a fluorescent dye, to quantitate the amount of nucleic acid in the reaction mix. For example, a dye may be used which binds to dsDNA (such as SYBR green). Different amplifications may use different dyes, to allow each amplification to be distinguished. Alternatively labelled nucleotides may be incorporated into each amplification.
Step (d) may further comprise quantitating the amplified reporter region, and discarding those reaction mixes where the amount of amplified nucleic acid is above and/or below a predetermined threshold. For example, this may be used to indicate that the starting copy number of the cassette was greater than the desired copy number—in such a case, the starting copy number of the reporter region will be higher than anticipated, and the final amount of amplified reporter will also be higher than anticipated.
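The thresholding described above can be sketched as a simple filter. The signal values, units, thresholds, and droplet identifiers below are all hypothetical; in practice the thresholds would be set by calibration, and the relationship between starting copy number and final signal would depend on the amplification and detection chemistry used.

```python
def select_droplets(signals: dict, lower: float, upper: float) -> list:
    """Return ids of reaction mixes whose reporter signal lies within [lower, upper]."""
    return [
        droplet_id
        for droplet_id, signal in signals.items()
        if lower <= signal <= upper
    ]


# Hypothetical reporter signals in arbitrary units, normalised so that a
# single starting cassette is expected to give a signal near 1.0.
signals = {"d1": 0.02, "d2": 1.05, "d3": 2.10, "d4": 0.97, "d5": 0.00}

# Below 0.5: treated as empty. Above 1.5: treated as more than one cassette.
kept = select_droplets(signals, lower=0.5, upper=1.5)
print(kept)
```

In this toy data set, the reaction mixes with near-zero signal and the one with roughly double the expected signal are discarded, leaving only those consistent with a single starting cassette.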
The method may further comprise the step of binding the nucleic acid molecules comprising the first region contained in the volumes of (e) to a solid support. For example, the support may be a microbead, a membrane, an adsorbent material, etc. Each volume may be bound to a separate solid support, or a separate region of a solid support. In this way, a single (or a desired copy number) nucleic acid molecule may be bound to a solid support. The method may further comprise the step of combining the solid support with a reaction mix volume containing a test nucleic acid, and allowing the test nucleic acid to hybridise to the bound nucleic acid(s) on the solid support. In this way the bound nucleic acid molecule (of known copy number) may be used to isolate a test nucleic acid containing complementary sequences. Importantly, this will be isolated at known copy number. Alternatively, the nucleic acid molecules comprising the first region may be adsorbed into a solid support (for example, a hydrophilic and lipophobic material, eg, a cellulose-based filter paper). The use of a hydrophilic and lipophobic material allows the aqueous phase of a water-in-oil droplet containing the nucleic acid to be adsorbed (preferably in its entirety) into the material, while the oil phase is not adsorbed, and can be removed.
The method may further comprise the step of combining the volumes obtained in (e) with a reaction mix volume containing a test nucleic acid and a primer pair which will bind to the fifth and sixth primer binding sites; performing nucleic acid amplification on the combined reaction mix; and determining whether nucleic acid amplification has taken place of i) the nucleic acid molecules comprising the first region; and/or ii) a portion of the test nucleic acid. The first region thereby acts as a control to indicate that the amplification reaction is proceeding.
The method may further comprise the step of combining the volumes obtained in (e) with an additional reaction mix having different physical properties. For example, the additional reaction mix may have a greater volume, a greater viscosity, or both.
A further aspect of the invention provides a method for performing a nucleic acid assay to detect the presence of a test nucleic acid in a sample, the test nucleic acid being flanked by fifth and sixth primer binding sites corresponding to fifth and sixth primer binding sites flanking a first region of a nucleic acid molecule as defined above; the method comprising:
A yet further aspect of the invention provides a solid support comprising a hydrophilic and lipophobic material having a nucleic acid molecule of known copy number adsorbed thereon.
A still further aspect of the invention provides a method for isolating a target nucleic acid molecule in a known copy number, the method comprising the steps of:
The solid support having a nucleic acid molecule of known copy number attached thereto may be prepared as described herein, and as described with reference to the preceding aspects of the invention. The nucleic acid molecule of known copy number may be prepared as described with reference to the preceding aspects of the invention.
The nucleic acid molecule of known copy number may be double stranded or single stranded. Where double stranded, the method may further comprise the step of denaturing the double stranded nucleic acid molecule to allow hybridisation. Further, where double stranded, preferably only a single strand of the double stranded molecule is attached to the solid support.
The complementary portions of the molecules are preferably at least 10, 15, 20, 25, 30, 35, or 40 nucleotides in length. The complementary portions are preferably at least 85%, 90%, 95%, 97%, 99%, or 100% complementary.
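The length and percentage complementarity preferences above can be evaluated for a candidate pair of sequences. The sketch below is illustrative only: it assumes two strands of equal length aligned end to end without gaps, uses a hypothetical 20-nucleotide sequence, and the function names and default thresholds are chosen for the example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of positions that base-pair when the strands are antiparallel.

    Both strands are written 5'->3' and must be the same length.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch compares equal-length portions only")
    partner = reverse_complement(strand_b)
    matches = sum(a == b for a, b in zip(strand_a.upper(), partner))
    return 100.0 * matches / len(strand_a)


def meets_preference(strand_a, strand_b, min_length=20, min_percent=95.0) -> bool:
    """Check a pair against a chosen minimum length and complementarity."""
    return (
        len(strand_a) >= min_length
        and percent_complementarity(strand_a, strand_b) >= min_percent
    )


# Hypothetical 20-nucleotide portion of the bound molecule.
bound = "ATGCGTACCGGATTACGCTA"
target_perfect = reverse_complement(bound)
# Introduce a single mismatch at one position of the target.
swap = "A" if target_perfect[5] != "A" else "C"
target_one_mismatch = target_perfect[:5] + swap + target_perfect[6:]

print(percent_complementarity(bound, target_perfect))
print(percent_complementarity(bound, target_one_mismatch))
```

A single mismatch in a 20-nucleotide portion gives 95% complementarity, so such a pair would satisfy a 95% preference but not a 97% one.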
Preferably at least a portion of the nucleic acid molecule of known copy number is not complementary to a corresponding portion of the target nucleic acid molecule. That is, both molecules have complementary portions, and the portions of the molecules adjacent these are not complementary, such that the molecules will not hybridise at the non-complementary portions. The non-complementary portion of the nucleic acid molecule of known copy number is preferably at the end of the molecule which is not attached to the solid support; preferably this is the 3′ end. There may also or instead be a further non-complementary portion of the molecule at the end which is attached to the solid support; this portion can be incorporated into any amplification products generated from the nucleic acid molecule of known copy number, and may be used for example to incorporate specific sequence tags or further binding sites.
Preferably the known copy number is 1.
Preferably the solution comprises multiple copies of the target nucleic acid molecule; more preferably there are significantly more copies of the target nucleic acid molecule than the known copy number.
The solid support is preferably a polymer bead.
Preferably a plurality of solid supports are provided, each having a nucleic acid molecule of known copy number attached thereto.
The method may further comprise the step d) delivering the solid support and the target nucleic acid molecule to a reaction vessel or a reaction volume. The reaction vessel may be a well. Preferably the well is dimensioned so as to be capable of receiving only a single solid support. The method may further or alternatively comprise the step e) extending the captured target nucleic acid molecule by a polymerisation reaction using the nucleic acid molecule of known copy number as a template, thereby incorporating additional sequence into the captured target nucleic acid molecule. The additional sequence is preferably a sequence which is not naturally found adjacent the captured target; this may be used, for example, to incorporate known primer binding sites or universal primer sites into the target molecule. The method may still further comprise the step f) amplifying at least a portion of the target nucleic acid molecule to provide a plurality of copies of the amplified portion.
The present invention is intended to permit the generation of a reaction mix or reaction volume containing a controlled copy number (usually one) of a known and defined nucleic acid sequence. This volume may be used as a control in a molecular diagnostic assay. Although the invention is primarily described herein in terms of providing an artificial control sample, it will be appreciated that there are other fields where it may be desirable to generate a sample containing a known copy number of a known nucleic acid; the invention is thus not limited to generation of control sequences.
The sensitivity and specificity of DNA testing technologies have increased in recent years, such that it is now possible to detect a single molecule of a test analyte DNA sequence by a number of means, including clonal amplification of that test analyte DNA sequence and mass detection of the amplified product by, for example, Next Generation Sequence analysis.
However, the detection of single molecules (or very low single digit copies) is challenging. Any sufficiently developed assay must have very high sensitivity and very high specificity, but frequently maximising one of these compromises the other. In the investigation of blood stream infection (BSI) for example, it may be clinically significant (and a patient may be very sick) where there is as little as 1 colony forming unit (CFU) of the pathogen per millilitre of whole blood. It is necessary to rapidly identify the pathogen and perhaps antimicrobial resistance (AMR) genes that might confer resistance to specific classes of antibiotic. With a minimal blood draw of 10 ml and acceptance that preparation of DNA from the pathogen will incur loss of a proportion of that DNA, it is not unreasonable to think that a molecular diagnostic test will have to be capable of accurately detecting low single digit copies of the target analyte in a background of large quantities (both physical and sequence-content) of ‘contaminating’ DNA, and reporting a result that can be relied upon by the clinician when selecting appropriate therapy.
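The arithmetic behind this expectation can be made explicit. The sketch below uses the 1 CFU per millilitre and 10 ml figures given above; the recovery fraction of 30%, the assumption of one target copy per CFU, and the assumption that each copy is recovered independently are hypothetical values chosen purely for illustration.

```python
def expected_copies(cfu_per_ml, draw_ml, recovery, copies_per_cfu=1):
    """Expected number of target copies reaching the assay."""
    return cfu_per_ml * draw_ml * copies_per_cfu * recovery


def probability_none_recovered(cfu_per_ml, draw_ml, recovery, copies_per_cfu=1):
    """Chance that every copy is lost, treating each copy as independently recovered."""
    copies_in_draw = round(cfu_per_ml * draw_ml * copies_per_cfu)
    return (1.0 - recovery) ** copies_in_draw


# 1 CFU/ml and a 10 ml draw are taken from the text; 30% recovery is hypothetical.
copies = expected_copies(cfu_per_ml=1, draw_ml=10, recovery=0.3)
none = probability_none_recovered(cfu_per_ml=1, draw_ml=10, recovery=0.3)
print(f"expected copies: {copies:.1f}, chance of none: {none:.3f}")
```

With these illustrative figures the assay would be presented with about three copies on average, and in roughly 3% of cases with none at all, which is why detection at low single digit copy number is required.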
The use of controls, run concomitantly with the test sample, has long been an accepted means of giving confidence that a diagnostic test is performing within the expected validation parameters. This is of particular relevance in demonstrating that the assay remains sensitive at the challenging level of only a few molecules in the test sample, but until now there has been no simple and reliable method that enables control samples to be generated to accurately reflect the very low copy number that might be encountered in a test sample.
The invention disclosed herein enables the provision of an artificial test sample that reliably mimics the very low copy number and the sequence context of a test analyte. Provision of such a control, which is co-amplified along with the test analyte, enables the performance of the assay to be confirmed, and the veracity of any test result generated to be confirmed. Even in situations where there is a negative test result, the veracity of this as a ‘true negative’ is confirmed to a greater extent where the control, run at a level of just a single molecule, does generate a convincing positive result.
Although amplification of very small quantities of DNA is prone to stochastic variability during early cycles, running a control that generates an output using the same amplification primer sequences as the test analyte, in the same tube as the test analyte, gives effective ‘normalisation’ of many experimental variables that might otherwise affect the efficiency of the test assay. Such variables as absolute concentration of reagents, pipetting errors and temperature fluctuations, plasticware and operator variables are automatically controlled for when the control and the test are run in the same reaction vessel. These variables can confound the usefulness of a control, if performed in a separate reaction vessel from the test assay. Other factors that can be designed to ensure that control and test sequences amplify with the same efficiency are for example the length of the amplicon, as there is a selective bias to amplify short amplicons with greater efficiency. The ‘GC’ content of the intervening sequence may also impact how efficiently sequences of identical length are amplified. These and other factors might be designed deliberately into the control, where it may be beneficial to make that control slightly ‘harder to amplify’, and therefore more prone to failure, as any good control should be. If designed to be as functionally equivalent as possible, then the primer binding sites, the length and the GC content and distribution may be normalised, together with other sequence considerations, such as runs of homopolymer and capacity to form secondary structures.
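The sequence properties mentioned above (length, GC content, and runs of homopolymer) can be compared between a test analyte and a candidate control. The following is a minimal sketch with hypothetical sequences and illustrative function names; capacity to form secondary structures is not assessed here, as that would require a dedicated folding tool.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max(len(list(run)) for _, run in groupby(seq.upper()))


def profile(seq: str) -> dict:
    """Summarise the properties that may influence amplification efficiency."""
    return {
        "length": len(seq),
        "gc_fraction": round(gc_fraction(seq), 3),
        "longest_homopolymer": longest_homopolymer(seq),
    }


# Hypothetical sequences for illustration only.
test_analyte = "ATGGCGTACGCTAGCTAGGCTTAACG"
analyte_mimic = "ATGGCGTTCGCTAGGTAGGCTTAAAACGGA"

print("test analyte:", profile(test_analyte))
print("analyte mimic:", profile(analyte_mimic))
```

Comparing the two profiles shows whether a candidate control is closely matched to the test analyte, or deliberately slightly longer or otherwise ‘harder to amplify’, as the design requires.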
A detailed description of the invention is now given, for illustrative purposes only. In this example, the detection of a blood stream infection pathogen is used as an exemplar, in which it might be expected that any pathogen nucleic acids indicative of infection would likely be present at very low (single digit) copy number. The example also envisages the use of microdroplets as reaction volumes, and to contain the nucleic acid cassette; such microdroplets are amenable to manipulation and combination by using microfluidics techniques. Combination of a first droplet with additional reagents can be achieved relatively straightforwardly by fusion of two droplets. Again, however, the invention is not limited to the use of microdroplets.
In order to provide a control for the diagnostic detection of the test analyte 101, an artificial nucleic acid construct is provided. This artificial construct is called the CONTROL CASSETTE 200, as shown in
Importantly, the susceptible site 203 is located between the analyte mimic 201 and the reporter 202. The susceptible site 203 could be, for example, a restriction endonuclease cleavage recognition sequence. The sequences of the analyte mimic 201 and the reporter 202 are not necessarily derived from any naturally occurring sequence, and therefore may be any optimally designed sequence. However, typically the analyte mimic 201 will be selected to have at least some similarity (for example, length, G/C content, etc) to the test analyte 101, but equally to be distinguishable therefrom when both sequences are amplified.
Whilst the DNA sequences of the analyte mimic 201 and reporter 202 are unconstrained in their selection, the DNA sequence of the susceptible site 203 and the sequences flanking all three of the regions 201, 202 and 203 are constrained such that specific amplification primers can be provided to hybridise to these sites.
Primers 301 and 302 are designed to amplify the reporter 202, and primers 303 and 304 are designed to amplify through the susceptible site 203 provided the nucleic acid cassette is intact. In certain embodiments, one or other or both of primers 303 and 304 may include 5′ tails of non-template nucleotides, to aid strand displacement of these primers when annealed. However, this is not believed to be essential, and in preferred embodiments the primers do not include tails.
Primers 102 and 103 are selected so as to allow hybridisation to both the cassette and the test sample, and are therefore based on DNA sequences found in the test sample. However, primers 301, 302, 303 and 304 are not necessarily derived from any naturally occurring sequences and can be freely optimised. In particular, primers 301, 302, 303, and 304 can be designed such that there is no (or minimal) unwanted hybridisation between these primers and the target sites of primers 102 and 103, or indeed these primers and their respective non-target sites. This design strategy reduces the chances of unintended amplification of analyte mimic 201; specifically, the primers should not be able to hybridise anywhere which could result in the unwanted duplication of the mimic region 201 since this could clearly compromise the ability to produce a single copy of that region in the final product. Likewise, the sequence of the mimic region 201 can also be designed so as to reduce or eliminate unwanted hybridisation. It will be understood, therefore, that only the sequences of the primers 102 and 103, and their respective binding sites, are constrained by the desired target sequence; other sequences may in certain embodiments be freely designed in order to optimise performance. In certain embodiments, the respective melting temperatures of a primer to its target may also or instead be designed to arrive at a preferred range of melting temperatures of different primer:target hybridisations. For example, primers 301 and 302 may have a lower melting temperature than primers 303 and 304. Such an arrangement may permit a further check against amplification of an unwanted region during performance of the present methods.
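The melting temperature arrangement described above can be explored with a rough estimate. The sketch below uses the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a coarse approximation suited to short oligonucleotides and is not stated in the text; more accurate nearest-neighbour methods would normally be used in practice. The primer sequences are hypothetical and are labelled with the reference numerals 301 to 304 solely for illustration.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C using the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical primer sequences; not derived from any real design.
reporter_primers = {"301": "ATTGCATTAGCTAT", "302": "TAGCTTAACGATTA"}
cleavage_site_primers = {"303": "GCGGCACCGTGCAG", "304": "CCGTGGCAGCGGTC"}

reporter_tms = {name: wallace_tm(seq) for name, seq in reporter_primers.items()}
cleavage_tms = {name: wallace_tm(seq) for name, seq in cleavage_site_primers.items()}

# Check the example arrangement: primers 301/302 melt lower than 303/304.
arrangement_ok = max(reporter_tms.values()) < min(cleavage_tms.values())
print(reporter_tms, cleavage_tms, arrangement_ok)
```

A design tool could apply a check of this kind to every candidate primer set, rejecting sets in which the estimated melting temperatures of the two primer pairs overlap.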
This invention provides a method whereby a single copy (or other known copy number) of the analyte mimic 201 may be delivered to a reaction volume, and which may therefore be used as a control in a diagnostic assay where the presence of the test analyte 101 is being assessed. The first step of this process is to distribute the control cassette 200 construct into isolated volumes, such as by formation of an emulsion of ‘water-in-oil’ droplets or other small volume reaction chambers. The format of ‘water-in-oil’ droplets will be used as an exemplar in
A known concentration of the control cassette 200 is prepared in aqueous solution, such that when combined with oil, small volume (nano, pico or femtolitre range) droplets are created. The concentration of the control cassette in the initial solution is sufficiently low that the overwhelming majority of the ‘water-in-oil’ droplets formed will be entirely devoid of the control cassette 200; these droplets are identified in
The concentration of the initial solution, and the volume of oil used, may be varied to achieve a desired final distribution of the cassette in the droplets.
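The effect of the initial concentration on droplet occupancy can be sketched numerically. The example below assumes that cassette molecules partition into droplets according to a Poisson distribution; the concentration and droplet volume used are illustrative values only.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cassette copies in a droplet with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mean_copies_per_droplet(copies_per_microlitre: float,
                            droplet_volume_picolitres: float) -> float:
    """Mean occupancy: concentration times droplet volume (1 uL = 1e6 pL)."""
    return copies_per_microlitre * droplet_volume_picolitres * 1e-6


def occupancy(lam: float) -> dict:
    """Fractions of empty, single-copy and multi-copy droplets."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {
        "empty": empty,
        "single": single,
        "multiple": 1.0 - empty - single,
        # Chance that a droplet holding any cassette holds exactly one.
        "single_given_occupied": single / (1.0 - empty),
    }


# Illustrative values (assumptions): 100 copies per microlitre, 100 pL droplets.
lam = mean_copies_per_droplet(100.0, 100.0)
stats = occupancy(lam)
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```

Under these assumed values more than 99% of droplets are empty, and more than 99% of the occupied droplets hold a single copy. Lowering the concentration further raises the single-copy fraction among occupied droplets at the cost of more empty droplets to screen.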
There are a number of means by which ‘water-in-oil’ droplets can be formed, including vigorous mixing, sonication or direct injection of the aqueous solution through a narrow constriction into the oil. This last method is preferred as it gives the greatest control over the uniformity of droplet volume. Whichever method is chosen, the final result of the process is a very large number of droplets, kept separate from one another (through, for example, the inclusion of a detergent within the aqueous solution).
The mixed population of droplets 401 and 402 must now be individually identified and separated, which is achieved by sequential utilisation of the susceptible site 203 and the reporter 202 that are linked to the analyte mimic 201 on the control cassette 200.
If the 401/402 population were to be analysed directly by, for example, amplifying the reporter 202, then this would risk unwanted amplification of the mimic 201 which is still on the same molecule as the reporter 202. For example, one possible method of confirming the presence of the reporter 202 would be to combine the individual droplets of the 401/402 mixture with a separate species of droplet enclosing biochemistry including primers 301 and 302, and subjecting the combined droplets to an amplification reaction. The presence of the reporter 202 could then be confirmed, through (for example) monitoring the accumulation of the reporter 202 amplicon. This can be achieved by including a dye that increases in fluorescence in the presence of increasing accumulation of dsDNA. However, if this is done on the intact cassette, there is the potential that the primer 302 would extend through the reporter 202, through the binding site for the primer 301, through the binding site for primer 304, through the intact susceptible site 203 and its primer site 303 and on through the region represented by the analyte mimic 201, and its associated primer binding sites 102 and 103. This undesirable duplication would mean that the analyte mimic 201 would no longer be present as a single copy within the droplet. This undesirable copying of the analyte mimic 201 would be linear (as opposed to exponential) and would only duplicate the analyte mimic 201 on one strand. Duplication of the analyte mimic 201 must therefore be prevented by introducing a physical blockage to the passage of the actively extending primer 302.
The nature of the blockade that would prevent the passage of the extending primer 302 could be, for example;
However, systems that rely on the hybridisation of blocking molecules cannot be guaranteed to be 100% effective, and systems that rely on the direct inclusion of abasic sites, HEG, or other chemical blockers mean the control DNA is no longer ‘natural’, and is less amenable to future manipulation, as might be necessary to introduce new control sequences for new targeted test analytes or to generate copies of the sequence for manufacturing purposes.
The present invention therefore makes use of a method whereby the mimic 201 and reporter 202 are physically separated prior to detection of the reporter 202. This ensures that amplification of the reporter 202 does not inadvertently also bring about the risk of amplification of the mimic 201. The most attractive and effective means of preventing the extension of reverse primer 302 through analyte mimic 201 during the polymerase-driven identification of the reporter 202 would be to physically break the covalent linkage of the analyte mimic 201 and reporter 202, necessarily after the intact control cassette 200 has been delivered to a specific droplet 401. This physical breakage can be achieved by the inclusion of the susceptible site 203 between analyte mimic 201 and the reporter 202 (but confirmed as being absent from any other regions within the control cassette 200). Restriction enzyme digestion of the cassette 200 using an enzyme which recognises or cuts at the susceptible site 203 will cleave the cassette 200 into two parts. As the susceptible site 203 is flanked by DNA sequences that can support hybridisation of oligonucleotide primers 303 and 304, then amplification across this region can be used to confirm whether cleavage has taken place or not. See
Initially, in order to determine which of the droplets within the mixture of 401 and 402 droplets do indeed contain the control cassette 200, each one of these droplets must be treated as if it does contain this construct and attempts to inactivate the susceptible site 203 must be made (by restriction digestion, for example). The droplets contained within mixture 401 and 402 are therefore individually merged with a further species of droplet, 701, that contains the necessary biochemistry to effect inactivation (cleavage) of the susceptible site 203, as shown in
In order to discriminate droplets 702, 703 and 704, polymerisation across the susceptible site 203 with primers 303 and 304 is effected (see
Note that it is preferable to initially assess the inactivation of the susceptible site 203, rather than identifying the (relatively small number of) droplets that contained the reporter 202 region first. This is because assessing reporter 202 followed by susceptible site 203 would necessitate discrimination of a positive fluorescent result building on a previously positive fluorescent result. Although this would not be impossible, it is clearer, and therefore preferred, to have a positive result (from assessment of reporter 202) generated subsequent to a negative result (from failure to amplify across the susceptible site 203). This strategy, however, demands assessment of a very great number of species 704 droplets, the majority of which are devoid of the control cassette 200.
In certain embodiments of the invention, however, it may be possible to assess the reporter 202 first, if two different dyes (for example, fluorescing at different wavelengths) were used to assess the reporter 202 and susceptible site 203. This might be achieved for example by using TaqMan probes, which generate fluorescence after cleavage of a probe during Taq polymerase extension. TaqMan probes could be targeted against the reporter (Wavelength 1 positive result indicative of presence) and then the susceptible site 203 (Wavelength 2 failure to generate a positive response indicative of successful cleavage of the susceptible site 203).
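The two-wavelength readout described above amounts to a simple decision rule, sketched below. The signal values, the single shared threshold and the names used are hypothetical; a real instrument would need channel-specific calibration.

```python
from enum import Enum


class Call(Enum):
    HARVEST = "reporter present, susceptible site cleaved"
    DISCARD_INTACT = "reporter present, susceptible site intact"
    DISCARD_EMPTY = "no reporter detected"


def classify(reporter_signal: float, site_signal: float,
             threshold: float = 1.0) -> Call:
    """Classify a droplet from two fluorescence channels.

    reporter_signal: Wavelength 1 (probe against reporter 202).
    site_signal:     Wavelength 2 (probe spanning susceptible site 203).
    threshold:       hypothetical cut-off applied to both channels.
    """
    if reporter_signal < threshold:
        return Call.DISCARD_EMPTY
    if site_signal >= threshold:
        # Amplification across the site means cleavage failed.
        return Call.DISCARD_INTACT
    return Call.HARVEST


# Hypothetical readings: (Wavelength 1, Wavelength 2)
readings = [(5.0, 0.1), (5.0, 4.2), (0.1, 0.1)]
for w1, w2 in readings:
    print(w1, w2, classify(w1, w2).name)
```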
Droplets 803 and 804, which failed to amplify across the susceptible site 203, must now be discriminated through assessment of the presence of the reporter 202. These droplets are individually merged with droplet species 901 (see
Once merged, droplets 902 and 903 are thermal cycled under conditions that will amplify the reporter 202 region, if present. Only droplet 902 will demonstrate this amplification and generate an increase in fluorescence as the SYBR Green dye binds to the amplicon generated.
Droplet 902 has therefore been generated from a progenitor droplet that has sequentially;
This droplet 902 is therefore confidently determined to include a nucleic acid molecule comprising the analyte mimic 201 and flanking primer binding sites, which has been separated from the reporter 202 by cleavage of the susceptible site 203. This droplet will contain a single copy of the analyte mimic 201 (in the majority of cases), due to the low starting concentration and distribution within the original droplets of the control cassette 200, and will also contain the detritus of the various analyses used to confirm its identity. Provided that this droplet can be manipulated into a diagnostic assay that seeks to confirm the presence of the test analyte 101 by virtue of amplification with primers 102 and 103, and that the biochemistry carried over in droplet 902 does not interfere with the performance of primers 102 and 103, droplet 902 provides a single copy of the analyte mimic 201, amenable to amplification by primers 102 and 103.
An overview of the flow of droplets, mergers and rejections is given in
After thermal cycling, the droplet 902 is identified (within the black diamond) as being fluorescent due to reporter amplification, and is harvested as the final product of the process. Droplet 903, which remains non-fluorescent and therefore devoid of the reporter 202, is discarded to waste.
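The sequential flow of the preferred scheme (cleave, test for amplification across the susceptible site 203, then test for the reporter 202) can be summarised as a minimal sketch. The `Droplet` class and its fields are hypothetical modelling conveniences, and each amplification is idealised as giving a clean yes/no fluorescent result.

```python
from dataclasses import dataclass


@dataclass
class Droplet:
    has_cassette: bool          # did the droplet receive control cassette 200?
    site_cleaved: bool = False  # was susceptible site 203 successfully cut?


def site_amplifies(d: Droplet) -> bool:
    """Primers 303/304 only amplify across an intact susceptible site."""
    return d.has_cassette and not d.site_cleaved


def reporter_amplifies(d: Droplet) -> bool:
    """Primers 301/302 amplify reporter 202 whenever it is present."""
    return d.has_cassette


def sort_droplet(d: Droplet) -> str:
    # Stage 1: fluorescence here means the site is intact, so reject.
    if site_amplifies(d):
        return "discard: susceptible site 203 intact"
    # Stage 2: among the remaining droplets, fluorescence marks the product.
    if reporter_amplifies(d):
        return "harvest (droplet 902)"
    return "discard: no reporter 202 (droplet 903)"


population = [
    Droplet(has_cassette=False),
    Droplet(has_cassette=True, site_cleaved=True),
    Droplet(has_cassette=True, site_cleaved=False),
]
for d in population:
    print(d, "->", sort_droplet(d))
```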
As a final demonstration of the effectiveness of this scheme, provision of a TaqMan probe designed to hybridise to the analyte mimic 201 sequence, together with biochemistry to amplify this region (primers 102 and 103), can be used to distinguish droplet 902 from droplet 903: only species 902 should support a positive TaqMan reaction, with the fluorescent emission of the TaqMan reaction necessarily being in a different channel to that of the dye already used to confirm the presence of the reporter 202 (SYBR Green, for example). This confirmatory test is depicted in
During manufacture, no TaqMan assay will be performed to confirm the presence of the analyte mimic 201, so the analyte mimic 201 will remain as a single molecule. Modulating the physical nature of droplet 902 is desirable in order to enable it to be manipulated, such that its contents are easily introduced into a diagnostic assay.
Further possible manipulations might include desiccation of the droplets to form rehydrateable pellets, with each pellet containing just a single representation of the analyte mimic 201. It might be beneficial to remove certain (or all) of the detritus of previous analyses from the final desired single molecule droplet, in case these components interfere with the ultimate molecular diagnostic test, although in preferred embodiments, these carry-over components are tolerated in the final diagnostic assay and need not be removed.
In other embodiments, the single copy nucleic acid could be bound to a solid support (rather than being simply adsorbed); for example, the solid support could be a polystyrene bead, a derivatised glass surface, etc. This may allow alternative uses of the nucleic acid, such as acting as a nucleic acid probe. In certain embodiments of the invention, such solid supports may be useful in isolating a single copy of another nucleic acid to which the single copy nucleic acid may hybridise. The single copy nucleic acid can act as a molecular “hook” or “fishing rod” when placed in a reaction mix containing a desired target; the target will hybridise to the hook, while the solid support allows the hybridised nucleic acids to be manipulated thereafter. Further details of the use of a single copy nucleic acid as a “fishing rod” are described elsewhere in this document, with reference to
The diagnostic assay could be performed in a number of reaction vessels (as opposed to droplet-based water-in-oil), but is disclosed here as being a PCR amplification based system: the assay would require the combination of extracted test sample (Template DNA), the biochemistry required to amplify the test analyte 101, and the analyte mimic (in one of at least three formats disclosed here). This combination is represented diagrammatically in
Droplet 902, ‘modified’ droplet 1102 or matrix 1104 will each harbour a single copy of the analyte mimic 201. When one (and only one) of these species is combined with test template 1201 and the biochemistry 1202 to support amplification of the test analyte 101/analyte mimic 201, the reaction vessel 1203 will be capable of simultaneous amplification of the test analyte 101 (grey hatched box) and the analyte mimic 201 (black box). Subsequent detection/discrimination of the amplified species by any suitable means allows the anticipated positive control analyte mimic 201 to inform the significance of a positive or negative result generated from the test analyte 101.
After amplification within the vessel 1203 by, for example, thermal cycling, the vessel may contain multiple copies of the test analyte 101 and the analyte mimic 201. These amplicon species will have common sequence at their terminal ends (by virtue of having both been generated through the extension of primers 102 and 103), but will have distinct core sequences.
Irrespective of the presence or absence of the test analyte 101 in the vessel 1203, the analyte mimic 201 is confidently present and should be amplified to give multiple copies in the vessel 1301. If the test analyte 101 is also present this will amplify, but if it is absent, it will of course fail to amplify. The comparison of the number of copies of the analyte mimic 201 amplicon and the test analyte 101 may give some impression of the relative abundance of the test analyte 101 in the test sample, but this comparison will be at best semi-quantitative.
Enhancements of the system are envisaged that allow the system to be used to control for more than one test analyte in a multiplex molecular diagnostic test analysis.
Total digestion of this cassette 1400 at both the susceptible sites 203 is necessary to demonstrate that duplication of one or other or both of the analyte mimics 201 and 1401 has been prevented. Detection of an intact susceptible site 203 post-inactivation (by amplification across the site 203) cannot reveal which of the susceptible sites 203 has remained intact since they are identical. Regardless, an intact site 203 will result in that droplet being discarded. It may be that in certain embodiments only the susceptible site 203 located between the analyte mimic 1401 and the reporter 202 is absolutely required to prevent duplication of the analyte mimics during analysis of the reporter 202 (and hence the two analyte mimics 201, 1401 are not separated by a susceptible site, and will not be separated in the final droplet). However, for certainty, and to prevent confusion of amplicons where there are primers amplifying both analyte mimics, it is preferable to position a susceptible site 203 between the analyte mimics 201 and 1401.
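The requirement for total digestion at both susceptible sites can be illustrated with a minimal in-silico sketch. All sequences below are hypothetical placeholders. The EcoRI recognition sequence (GAATTC, cut after the first G) is used purely as an example of a susceptible site; this document does not specify a particular enzyme.

```python
SITE = "GAATTC"          # example recognition sequence (assumption)
CUT_OFFSET = 1           # cut position within the site, after the first base

# Hypothetical element sequences, for illustration only.
MIMIC_201 = "ACGTTGCAAC"
MIMIC_1401 = "TTGACCGTAA"
REPORTER_202 = "CCATGGTTAC"

cassette = MIMIC_201 + SITE + MIMIC_1401 + SITE + REPORTER_202


def digest(seq: str, site: str = SITE, cut_offset: int = CUT_OFFSET) -> list:
    """Cut one strand of seq at every occurrence of site; return the fragments."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def elements_in(fragment: str) -> list:
    """Names of the functional elements carried by a fragment."""
    named = {"201": MIMIC_201, "1401": MIMIC_1401, "202": REPORTER_202}
    return [name for name, s in named.items() if s in fragment]


# The site must occur only where intended (twice in this design).
site_count = cassette.count(SITE)
fragments = digest(cassette)
for frag in fragments:
    print(frag, elements_in(frag))
```

After complete digestion each fragment carries exactly one element, so amplification of the reporter 202 cannot read through into either analyte mimic.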
A further enhancement of the system (shown in
Comparison of the fixed-point (as opposed to end-point) responses from different representations of droplet 401 will enable the distribution of responses to be considered. This offers the potential for a secondary separation that removes any overlap in the response: droplets giving the highest responses, together with any falling in a grey area, can be eliminated.
A final enhancement of the system, shown in
The above detailed description demonstrates the flexibility and reliability of the invention in providing a single copy of a mimic of the analyte in a format that enables the detection of the analyte to be assessed.
Although the illustrated embodiments have been described from the perspective of preparation of droplets, the use of droplets per se may not be required, as there exist methods of carrying out ‘digital PCR’ that rely on very small reaction chambers (nanolitre wells; e.g. Wafergen or Life Technologies QuantStudio DX) to which additional components could be sequentially added. For example, after the Poisson distribution of the original diluted control cassette 200 into the wells of a digital PCR chip, the susceptible site 203 inactivation biochemistry, the susceptible site 203 amplification biochemistry and the reporter 202 amplification biochemistry could be introduced to these wells. This is not preferred, however, as it is envisaged that recovery of the entire contents of a well, after identification of those wells that contained a single copy of the control cassette, could be very difficult. Nevertheless, this approach may be useful where such recovery is not needed and further reactions are performed in the same well. This sequential addition scheme may also be used to demonstrate that the serial biochemistries are performing as anticipated.
Another potentially desirable enhancement is that, rather than assessing the degree of amplification of the reporter 202 sequence after a limited and quantifiable level of amplification, the individual droplets are amplified in a device that allows massively parallel amplification and simultaneous real-time PCR assessment, which may allow more reliable quantification of the presence of the reporter 202 region.
As noted above, single copy nucleic acid sequences may be attached to beads or other solid supports in order to be used as molecular “hooks” or “fishing rods”. The following section describes this in more detail.
Beyond the use as a sensitive control system, one attractive application of the ability to reliably isolate a single molecule of nucleic acid is the potential to use this (in its single stranded form) as a species-specific molecular ‘fishing hook’. If linked to a solid surface, for example a polymer bead, then the specific hybridization capacity of the single linked DNA sequence enables recovery of a second single DNA strand (bearing the complementary sequence) from a solution containing a plurality of DNA strands. The plurality of DNA strands may all bear the complementary sequence, or only a proportion of the plurality of DNA strands may bear the complementary sequence. Ideally, in order to maximize the potential for the bead-linked single molecule to encounter and capture a complementary DNA strand within a reasonable timeframe, the number of DNA strands in solution bearing the complementary sequence will be in massive excess compared to the single molecule linked to the bead. The complementary sequences can be designed to most advantageously allow high specificity, with little or no potential for cross-talk hybridization of the bead-linked capture sequence with any other DNA strand sequence that may exist within the plurality of DNA strands. Once captured, the DNA strand recovered from solution can then be manipulated as a non-covalently attached ‘passenger’ on the polymer bead.
The above system may be advantageously employed to seed the geographically-separated, clonal amplification of individual molecules (second single DNA strands) as the initiating step of an NGS reaction. Having delivered just one non-covalently attached passenger DNA strand to a specific geographical region, multiple (clonal) copies of identical sequence can be generated. Synchronous NGS interrogation of the bases of these copies maximizes the signal output generated at each individual base position of the DNA strand.
Delivery of a single copy of the DNA strand to be sequenced is ensured by virtue of the capacity to capture just one DNA strand onto each bead, and the geometric capacity to accommodate just a single polymer bead at each discrete geographic location. For example, after capture of a single DNA strand from a plurality of DNA strands, the bead harbouring the DNA strand may be delivered to a discrete well structure, where the well has dimensions sufficiently large to accommodate a single bead, but insufficiently large to accommodate more than one bead. This geometric limitation ensures that there will only ever be a single bead loaded to the well, and it follows only a single passenger DNA strand loaded per well. For example, such a system is described in WO2014/013263, to DNA Electronics Ltd. Reference is made particularly to page 15, line 25 to page 16, line 12; and to page 39, line 6 to page 43, line 15. These passages describe methods and systems for obtaining a limited number of beads per well, and for using such beads complexed with nucleic acids for sequence amplification and/or sequencing.
It is possible that non-specific adsorption of DNA strands from the plurality of DNA strands in solution onto the surface of the polymer bead may confound the aspiration to ensure that just a single strand (passenger) of DNA is delivered to each of the geographically distinct regions of the sequencing system. This potential may be minimized through stringent post-capture washing and/or selection of polymer bead/surface chemistry treatments that limit any such non-specific adsorption. Furthermore, it is possible to discriminate between legitimately captured DNA strands (hybridization captured) and non-specifically adsorbed sequences, by virtue of only the former having a potentially DNA polymerase-extendable, hybridized 3′-OH end; the capture sequence attached to the bead may be specifically designed to include a non-hybridizable (i.e. not employed in capture) element that, upon extension of a legitimately captured DNA strand 3′ end, will drive DNA polymerase mediated incorporation of the complementary sequence to the non-hybridizable element onto the 3′ end of the captured DNA strand. The legitimately captured DNA strand is unique in its capacity to incorporate this additional sequence, which can subsequently be utilized as an essential element of the clonal amplification strategy (below).
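The discrimination between hybridization-captured and non-specifically adsorbed strands can be sketched at the sequence level. The example assumes a bead-linked oligonucleotide laid out 5′-[non-hybridizable element][capture region]-3′, so that a strand whose 3′ end is complementary to the capture region can be extended across the non-hybridizable element. This orientation, the function names and all sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_if_captured(strand: str, capture_region: str, tag: str):
    """Return the 3'-extended strand if it is hybridization-captured, else None.

    strand:         candidate passenger strand, 5' to 3'.
    capture_region: 3' part of the bead-linked oligo used for capture.
    tag:            5' non-hybridizable element of the bead-linked oligo.
    """
    # A legitimately captured strand ends (3') in the reverse complement
    # of the capture region, giving a polymerase-extendable 3'-OH end.
    if not strand.endswith(reverse_complement(capture_region)):
        return None
    # Extension copies the tag, appending its reverse complement.
    return strand + reverse_complement(tag)


# Hypothetical sequences, for illustration only.
TAG = "GGATTC"
CAPTURE = "ACGTACGTAA"

captured = "CCCCTTTT" + reverse_complement(CAPTURE)
adsorbed = "CCCCTTTTGGGG"

print(extend_if_captured(captured, CAPTURE, TAG))
print(extend_if_captured(adsorbed, CAPTURE, TAG))
```

Only the captured strand gains the additional 3′ sequence, which is what would later distinguish it in the clonal amplification step.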
The system detailed above is now illuminated using diagrammatic representations.
Once the Control Cassette is cleaved by restriction digestion (for example) and confirmed through the amplification of Reporter 202, the single molecule component 201 released may resemble the DNA element presented in
Number | Date | Country | Kind |
---|---|---|---|
1520883.8 | Nov 2015 | GB | national |
1605055.1 | Mar 2016 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2016/053649 | 11/23/2016 | WO | 00 |