This application claims the priority benefits of European Patent Application No. 19306765.9, filed Dec. 23, 2019, and U.S. patent application Ser. No. 17/013,222, filed Sep. 4, 2020, the contents of each of which are hereby incorporated by reference in their entirety.
The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 186142000641SEQLIST.TXT, date recorded: Dec. 14, 2020, size: 3 KB).
The present application is related to multiplex digital polymerase chain reaction (PCR) assays (such as multiplex drop-off dPCR assays), methods and systems, including methods for assessing microsatellite instability (MSI) and genome-editing products.
Digital polymerase chain reaction (dPCR) is a powerful and sensitive method that can be used to detect rare mutations in nucleic acid samples. Digital PCR assays using allele-specific fluorescent TAQMAN™ probes have been developed to detect somatic mutations in biomarker genes. In a conventional dPCR assay, a wildtype probe recognizing the wildtype allele and a mutant probe recognizing a specific mutant allele are used. Upon hybridization of a wildtype probe or a mutant probe to an amplicon in a dPCR partition, the probe releases its fluorophore through the exonuclease activity of a DNA polymerase. The released fluorophore from a wildtype probe is detected via a fluorescence detection channel that is distinct from the released fluorophore from a mutant probe. Such assays require a dPCR instrument with R detection channels to detect R mutations at one or more genetic loci. In contrast, drop-off assays allow the quantification of any number of mutations occurring at a mutation hotspot by using two probes in a dPCR reaction: a drop-off probe that recognizes a wildtype sequence at the mutation hotspot, and a reference probe that recognizes a sequence at a low-mutation region on the same amplicon. See, Decraene C. et al., Clinical Chemistry 64(2): 317-328 (2017). In a drop-off assay, two detection channels are required to detect any number of mutations at a single genetic locus.
Because dPCR instruments have limited fluorescence detection channels, there is a need to increase the multiplex levels of dPCR assays for different mutations and at different genetic loci. Such assays are especially useful in clinical and other applications involving samples that are limited in quantity. Robust methods for quantification of different genetic species based on data from multiplexed dPCR assays are also needed.
The present application provides methods, apparatus, systems and compositions for detection and/or quantification of wildtype and mutant sequences at a plurality of target regions in nucleic acid samples using multiplex dPCR assays, such as multiplex drop-off dPCR assays. The assays described herein can be used to assess microsatellite instability (MSI) and detect genome-editing products (e.g., CRISPR-Cas system-edited products).
One aspect of the present application provides a method for quantification of wildtype and/or mutant sequences at a plurality of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions each comprises:
In some embodiments according to any one of the methods described above, the plurality of probe sets are R number of probe pairs, wherein a first probe pair of the R number of probe pairs comprises:
wherein Ĉ(mi) is indicative of the estimated concentration of mutant sequences at the target region corresponding to the i-th probe pair in the sample,
wherein v is indicative of volume of a partition, and
wherein P̂(mi) is indicative of the mutant probability that a given partition contains a mutant sequence at the target region corresponding to the i-th probe pair in the sample. In some embodiments, the method further comprises determining a confidence interval and/or an uncertainty measure associated with the estimated concentration of the mutant sequences at the target region corresponding to the i-th probe pair in the sample.
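The expression relating Ĉ(mi), v and P̂(mi) is not reproduced in this text. As an illustration only, the sketch below assumes the standard Poisson partitioning model used in digital PCR, in which the mean number of copies per partition is −ln(1 − P̂) and the concentration is that mean divided by the partition volume. The function names, the Wald-type interval and the parameter `n_informative` (the number of partitions entering the probability estimate) are assumptions made for this example, not terms of the application.

```python
import math


def concentration_from_probability(p_hat: float, partition_volume: float) -> float:
    """Poisson-based concentration estimate (copies per unit volume).

    Assumes target molecules are distributed among partitions at random,
    so that P(partition contains >= 1 copy) = 1 - exp(-C * v).
    """
    if not 0.0 <= p_hat < 1.0:
        raise ValueError("p_hat must lie in [0, 1)")
    if partition_volume <= 0.0:
        raise ValueError("partition_volume must be positive")
    return -math.log(1.0 - p_hat) / partition_volume


def concentration_interval(p_hat: float, n_informative: int,
                           partition_volume: float, z: float = 1.96):
    """Approximate interval: Wald interval on p_hat, mapped through the
    Poisson transform. One possible uncertainty measure among several."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_informative)
    low = max(p_hat - z * se, 0.0)
    high = min(p_hat + z * se, 1.0 - 1e-12)  # keep the logarithm finite
    return (concentration_from_probability(low, partition_volume),
            concentration_from_probability(high, partition_volume))
```

For example, with a hypothetical partition volume of 0.001 µL and a probability of 1 − e^(−0.5), the point estimate is 500 copies per µL.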
In some embodiments according to any one of the methods described above, wherein the plurality of probe sets are R number of probe pairs, the method further comprises calculating a wildtype probability that a given partition contains a wildtype sequence at the target region corresponding to the i-th probe pair, wherein the wildtype probability is based on the mutant probability corresponding to the i-th probe pair and the mutant probability corresponding to the (i+1)-th probe pair, wherein the (i+1)-th probe pair refers to the first probe pair if (e.g., when) i=R. In some embodiments, the wildtype probability is calculated according to:
wherein P̂(wi) is indicative of the wildtype probability that a given partition contains a wildtype sequence at the target region corresponding to the i-th probe pair, wherein ni,(i+1) is indicative of a count of one or more partitions that each produces a positive signal via the Xi detection channel, a positive signal via the Xi+1 detection channel, and negative signals via any other of the detection channels X1-XR;
wherein ni,(i+1) refers to nR,1 if (e.g., when) i=R;
wherein n0 is indicative of a count of one or more partitions that each produces negative signals via all the detection channels X1-XR;
wherein ni is indicative of a count of one or more partitions that each produces a positive signal via the Xi detection channel and negative signals via any other of the detection channels X1-XR; wherein ni+1 is indicative of a count of one or more partitions that each produces a positive signal via the Xi+1 detection channel and negative signals via any other of the detection channels X1-XR;
wherein ni+1 refers to n1 if (e.g., when) i=R,
wherein P̂(mi) is indicative of the mutant probability that a given partition contains a mutant sequence at the target region corresponding to the i-th probe pair,
wherein P̂(mi+1) is indicative of the mutant probability that a given partition contains a mutant sequence at the target region corresponding to the (i+1)-th probe pair, and
wherein P̂(mi+1) refers to P̂(m1) if (e.g., when) i=R. In some embodiments, the method further comprises determining an estimated concentration of the wildtype sequence at the target region corresponding to the i-th probe pair in the sample based on the wildtype probability. In some embodiments, the estimated concentration of the wildtype sequences at the target region corresponding to the i-th probe pair in the sample is determined according to:
wherein Ĉ(wi) is indicative of the estimated concentration of wildtype sequences at the target region corresponding to the i-th probe pair in the sample,
wherein v is indicative of volume of a partition, and
wherein P̂(wi) is indicative of the wildtype probability that a given partition contains a wildtype sequence at the target region corresponding to the i-th probe pair in the sample. In some embodiments, the method further comprises determining a confidence interval and/or an uncertainty measure associated with the estimated concentration of the wildtype sequence at the target region corresponding to the i-th probe pair in the sample.
In some embodiments according to any one of the methods described above, wherein the plurality of probe sets are R number of probe pairs, the method further comprises adjusting the concentration of nucleic acid molecules in the sample based on a count of partitions that each produces a positive signal via three or more of the detection channels X1-XR, wherein: (i) if (e.g., when) the count is larger than a pre-determined value, the adjusting is decreasing the concentration of the nucleic acid molecules in the sample by diluting the sample; or (ii) if (e.g., when) the count is smaller than a pre-determined value, the adjusting is increasing the concentration of the nucleic acid molecules in the sample by concentrating the sample. In some embodiments, the method further comprises determining a quality control measure by comparing a count of partitions that each produces a positive signal via each of the detection channels X1-XR with an estimated count, wherein the estimated count is based on counts of partitions other than the count of partitions that each produces a positive signal via each of the detection channels X1-XR.
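The dilution and concentration rule described above can be sketched as a simple decision on the count of partitions that are positive in three or more channels. This is a minimal illustration: the representation of partition counts as a dictionary keyed by channel patterns, the function names, and the use of separate `upper` and `lower` thresholds are assumptions made for the example; the text only refers to "a pre-determined value".

```python
from typing import Dict, Tuple


def count_multi_positive(counts: Dict[Tuple[int, ...], int],
                         min_channels: int = 3) -> int:
    """Sum the counts of partitions positive in at least `min_channels` channels.

    `counts` maps a channel pattern such as (1, 1, 0) to a partition count,
    where 1 marks a positive signal in the corresponding channel.
    """
    return sum(n for pattern, n in counts.items() if sum(pattern) >= min_channels)


def suggest_adjustment(counts: Dict[Tuple[int, ...], int],
                       upper: int, lower: int) -> str:
    """Return 'dilute', 'concentrate' or 'keep' (hypothetical labels)."""
    k = count_multi_positive(counts)
    if k > upper:
        return "dilute"        # too many crowded partitions
    if k < lower:
        return "concentrate"   # sample is sparser than the chosen threshold
    return "keep"
```

A usage example with made-up counts: `suggest_adjustment({(0, 0, 0): 9000, (1, 1, 1): 100}, upper=50, lower=5)` returns `"dilute"`.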
In some embodiments according to any one of the methods described above, wherein the plurality of probe sets are R number of probe pairs, R is between 2 and 6, such as 3.
In some embodiments according to any one of the methods described above, wherein the plurality of probe sets are three probe pairs, the method comprises obtaining a first count (n100) of one or more partitions that each produces a positive signal via the detection channel X1, a negative signal via the detection channel X2, and a negative signal via the detection channel X3; obtaining a second count (n000) of one or more partitions that each produces negative signals on all of the detection channels X1-X3; and calculating a mutant probability (P̂(m1)) that a given partition contains a mutant sequence at the target region corresponding to the first probe pair, wherein the mutant probability is based on a ratio between the first count (n100) and a sum of the first count (n100) and the second count (n000). In some embodiments, the method further comprises determining an estimated concentration Ĉ(m1) of the mutant sequences at the target region corresponding to the first probe pair in the sample based on the mutant probability P̂(m1). In some embodiments, the method further comprises determining a confidence interval and/or an uncertainty measure associated with the estimated concentration Ĉ(m1) in the sample.
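The ratio described above, n100 / (n100 + n000), can be computed directly from per-partition channel calls. The sketch below is illustrative only: the input format (one tuple of 0/1 calls per partition, ordered X1, X2, X3) and the function name are assumptions for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

Pattern = Tuple[int, int, int]  # (X1, X2, X3) calls, 1 = positive


def mutant_probability_first_pair(calls: Iterable[Pattern]) -> float:
    """Estimate the mutant probability for the first probe pair as
    n100 / (n100 + n000), ignoring every other channel pattern."""
    counts = Counter(calls)
    n100 = counts[(1, 0, 0)]  # positive in X1 only
    n000 = counts[(0, 0, 0)]  # negative in all three channels
    if n100 + n000 == 0:
        raise ValueError("no partitions with pattern 100 or 000")
    return n100 / (n100 + n000)
```

Restricting the ratio to the 100 and 000 patterns means that partitions carrying signal in X2 or X3 do not enter the estimate.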
In some embodiments according to any one of the methods described above, wherein the plurality of probe sets are three probe pairs, the method further comprises calculating a wildtype probability (P̂(w1)) that a given partition contains a wildtype sequence at the target region corresponding to the first probe pair in the sample, wherein the wildtype probability is calculated based on P̂(m1). In some embodiments, the wildtype probability is calculated based on P̂(m1) and P̂(m2). In some embodiments, the wildtype probability (P̂(w1)) is determined based on
wherein n110 is indicative of a count of one or more partitions that each produces a positive signal via the first detection channel, a positive signal via the second detection channel, and a negative signal via the third detection channel,
wherein n010 is indicative of a count of one or more partitions that each produces a negative signal via the first detection channel, a positive signal via the second detection channel, and a negative signal via the third detection channel, and
wherein P̂(m2) is indicative of a probability that a given partition contains a mutant sequence at the target region corresponding to the second probe pair.
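The expression for the wildtype probability is not reproduced in this text, so the following is not presented as the formula of the application. It is one estimator that uses the quantities listed above (n110, n010, n100, n000, and the two mutant probabilities), derived under two assumptions made for this example: species are distributed among partitions independently, and a wildtype sequence at the first target region is positive in X1 and X2 while mutants at the first and second target regions are positive in X1 only and X2 only, respectively. Under those assumptions, among partitions negative in X3, the empty fraction n000 / (n000 + n100 + n010 + n110) equals (1 − P(w1))(1 − P(m1))(1 − P(m2)), which can be solved for P(w1).

```python
def wildtype_probability_first_pair(n110: int, n100: int, n010: int, n000: int,
                                    p_m1: float, p_m2: float) -> float:
    """Illustrative estimator of P(w1) under an independence assumption.

    Solves  n000 / total = (1 - P(w1)) * (1 - p_m1) * (1 - p_m2)
    where total counts partitions with no signal in the third channel.
    """
    total = n000 + n100 + n010 + n110
    if total == 0:
        raise ValueError("no partitions negative in the third channel")
    if not (0.0 <= p_m1 < 1.0 and 0.0 <= p_m2 < 1.0):
        raise ValueError("mutant probabilities must lie in [0, 1)")
    absent_w1 = n000 / ((1.0 - p_m1) * (1.0 - p_m2) * total)
    return 1.0 - absent_w1
```

As a consistency check with idealized counts (n000 = 5400, n100 = 600, n010 = 1800, n110 = 2200), the mutant probabilities are 600/6000 = 0.1 and 1800/7200 = 0.25, and the function returns 0.2.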
In some embodiments according to any one of the methods described above, substantially all partitions each further comprises:
In some embodiments according to any one of the methods described above, each of the reference probes and the drop-off probes has a single detectable label. In some embodiments, the reference labels and drop-off labels are fluorophores. In some embodiments, one or more different detection channels have different excitation wavelength ranges and/or different emission wavelength ranges. In some embodiments, one or more different detection channels share the same excitation and/or emission wavelength ranges, but are associated with different fluorescence intensities. In some embodiments, probe sets corresponding to different target regions within a gene of interest comprise drop-off probes having drop-off labels associated with different detection channels that share the same excitation and/or emission wavelength ranges, wherein the drop-off probes are detected at different fluorescence intensities with respect to each other. In some embodiments, the reference labels and drop-off labels are selected from the group consisting of fluorescein, FAM, YAKIMA YELLOW®, Cy3, HEX, VIC, ROX, CY5, CY5.5, ALEXA FLUOR® 647, ALEXA FLUOR® 488, and Quasar705. In some embodiments, wherein the plurality of probe sets are three probe pairs, the first reference label, the second reference label and the third reference label are selected from the group consisting of Cy3, FAM and Cy5, or wherein the first reference label, the second reference label and the third reference label are selected from the group consisting of FAM, HEX and Cy5.
In some embodiments according to any one of the methods described above, the target regions are mutation hotspot regions in one or more genes selected from the group consisting of EGFR, NRAS, KRAS, ESR1, and BRAF.
In some embodiments according to any one of the methods described above, each partition further comprises:
In some embodiments according to any one of the methods described above, the amplicons are about 100 to about 200 nucleotides long. In some embodiments, the reference regions are not associated with single nucleotide polymorphisms.
In some embodiments according to any one of the methods described above, the method further comprises forming a plurality of partitions having a pre-determined volume.
In some embodiments according to any one of the methods described above, the nucleic acid molecules are genomic DNA molecules, tumor DNA molecules, or cDNA molecules. In some embodiments, the method further comprises extracting the nucleic acid molecules from a biological sample. In some embodiments, the nucleic acid molecules are obtained from a formalin-fixed, paraffin-embedded (FFPE) sample, or a liquid biopsy sample. In some embodiments, the method comprises fragmenting nucleic acid molecules in the biological sample to provide the sample comprising nucleic acid molecules.
In some embodiments according to any one of the methods described above, the plurality of target regions are microsatellite sequence loci.
In some embodiments according to any one of the methods described above, the nucleic acid molecules are genomic DNA in a sample of cells, wherein the cells have been contacted with a site-specific genome-editing reagent configured to cleave target sites in the plurality of target regions, and wherein the mutant sequences are non-homologous end joining (NHEJ) edited sequences at the plurality of target regions. In some embodiments, the site-specific genome-editing reagent comprises a Cas nuclease, a TALEN, or a Zinc-finger nuclease. In some embodiments, the method further comprises contacting the cells with the site-specific genome-editing reagent.
In some embodiments according to any one of the methods described above, each probe set of the plurality of probe sets further comprises:
One aspect of the present application provides a method for quantification of mutations at a plurality of microsatellite sequence loci in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions each comprises:
a plurality of primer sets corresponding to the plurality of microsatellite sequence loci, wherein each primer set of the plurality of primer sets comprises:
One aspect of the present application provides a method for quantification of unmodified, homology directed repair (HDR)-edited, and/or non-homologous end joining (NHEJ)-edited sequences at a plurality of target regions in nucleic acid molecules from a sample of cells, wherein the cells have been contacted with a site-specific genome-editing reagent and HDR template nucleic acids comprising HDR replacement sequences, wherein the site-specific genome-editing reagent is configured to cleave target sites in the plurality of target regions,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions each comprises:
a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
thereby providing quantification of unmodified, HDR-edited, and/or NHEJ-edited sequences at the plurality of target regions in the sample. In some embodiments, the plurality of probe sets are (R−1) number of probe triplets,
wherein a first probe triplet of the (R−1) number of probe triplets comprises:
One aspect of the present application provides a method for quantification of wildtype and/or allelic sequences at R number of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions each comprises R number of probe triplets corresponding to the R number of target regions,
wherein a first probe triplet of the R number of probe triplets comprises:
Also provided are compositions, systems, kits and articles of manufacture for any one of the methods described above.
The present application provides multiplex digital polymerase chain reaction (dPCR) assays, such as multiplex drop-off dPCR assays, that can detect wildtype and mutant sequences at R different genetic loci using fewer than two times R number of detection channels, for example, by using reference probes and drop-off probes sharing overlapping sets of labels. The assays described herein may be used to assess microsatellite instability (MSI) and genome-editing products.
In some embodiments, the multiplex drop-off dPCR assays use R probe pairs each comprising a reference probe comprising a reference label and a drop-off probe comprising a drop-off label, in which the reference label and the drop-off label in each probe pair are detectable via different detection channels. In some embodiments, the set of the drop-off labels and the set of the reference labels used in the probe pairs are circular permutations with respect to each other, which allows detection of 2R number of genetic species (i.e., wildtype and mutant at each genetic locus) via only R number of different detection channels. Additionally, each drop-off probe is capable of detecting all mutation sequences associated with its respective target genetic locus, thereby increasing the multiplex level of the dPCR assay in terms of detectable mutations per assay.
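To make the label-sharing scheme concrete, the sketch below lists the channel signature expected for each genetic species when the drop-off labels are a one-step rotation of the reference labels. It assumes, for illustration, that the reference probe of the i-th pair is read in channel Xi and its drop-off probe in channel Xi+1 (wrapping to X1 after XR), that a wildtype sequence binds both probes, and that a mutant sequence binds only the reference probe. Channel indices are 0-based in the code, and the function name is hypothetical.

```python
from typing import Dict, FrozenSet


def species_signatures(r: int) -> Dict[str, FrozenSet[int]]:
    """Map each species to the set of channels in which it is positive.

    Reference label of pair i -> channel i; drop-off label of pair i ->
    channel (i + 1) % r, i.e. a one-step circular shift (an assumption).
    """
    signatures: Dict[str, FrozenSet[int]] = {}
    for i in range(r):
        ref_channel = i
        dropoff_channel = (i + 1) % r
        signatures[f"wildtype_{i + 1}"] = frozenset({ref_channel, dropoff_channel})
        signatures[f"mutant_{i + 1}"] = frozenset({ref_channel})  # drop-off probe lost
    return signatures


if __name__ == "__main__":
    for name, channels in species_signatures(3).items():
        print(name, sorted(channels))
```

With R = 3 the six species give six different signatures: three single-channel patterns for the mutants and three two-channel patterns for the wildtypes, using only three channels.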
In some embodiments, the multiplex drop-off dPCR assay uses R−1 probe triplets each comprising a reference probe comprising a reference label, a drop-off probe (e.g., an NHEJ drop-off probe) comprising a drop-off label, and an allele-specific probe (e.g., a HDR probe) comprising an allele-specific label, in which the reference label, the drop-off label and the allele-specific label in each probe triplet are detectable via different detection channels. In some embodiments, the set of the reference labels, the set of the drop-off labels, and the set of the allele-specific labels used in the probe triplets are permutations with respect to each other (e.g., as shown in
In some embodiments, the multiplex dPCR assays use R probe sets each comprising a reference probe comprising a reference label, a first allele-specific probe comprising a first allele-specific label, and a second allele-specific probe comprising a second allele-specific label, wherein the second allele-specific probe hybridizes to the same allelic sequence or its complementary sequence as the first allele-specific probe or the first allele-specific probe and the second allele-specific probe hybridize to two different portions of the same allelic sequence or complementary sequences thereof, wherein the reference label and the first allele-specific label in each probe set are detectable via the same detection channel, and the reference label and the second allele-specific label in each probe set are detectable via different detection channels. In some embodiments, the set of the second allele-specific labels and the set of the reference labels used in the probe sets are circular permutations with respect to each other, which allows detection of 2R number of genetic species (i.e., wildtype and allelic sequences at each genetic locus) via only R number of different detection channels. The multiplex dPCR assays may be used for detecting multiple copy number variants (CNVs), assessing multiple allelic frequencies (MAF) and determining multiple variant allele fractions (VAF) in a sample.
The multiplex dPCR assays and methods described herein can be used in a variety of applications, such as detection of microsatellite mutations, and quantification of site-specific genome-edited products. Also provided are compositions, systems, methods of diagnosis, methods of treatment, methods of screening, kits and articles of manufacture.
The terms “polynucleotide” and “nucleic acid” are used interchangeably herein to refer to a polymer of nucleotides of any length, and include DNA and RNA. The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a polymer by DNA or RNA polymerase. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and their analogs. Nucleic acids may be single-stranded, double-stranded, or in more highly aggregated hybridization forms, and may include chemical modifications. “Polynucleotide” or “nucleic acid” may also be used herein to refer to the sequence encoded by the nucleic acid, including the sense strand (i.e., coding strand) sequence and anti-sense strand (i.e., non-coding strand) sequence in a double-stranded nucleic acid molecule.
An “oligonucleotide,” as used herein, generally refers to a short, generally single-stranded, generally synthetic, polynucleotide that is generally, but not necessarily, no more than about 200 nucleotides in length. The terms “oligonucleotide” and “polynucleotide” are not mutually exclusive. The description above for polynucleotides is equally and fully applicable to oligonucleotides.
The term “sample” as used herein refers to a sample that can be subject to the methods described herein with or without pre-processing such as nucleic acid extraction, fragmentation, dilution/concentration, or other pre-treatment. The sample may be a biological sample, or obtained by processing or manipulating a biological sample. In some embodiments, the sample is ready for loading onto a digital PCR instrument for analysis. In some embodiments, the sample has been diluted from a biological sample.
A “probe” refers to a molecule (e.g., a protein, nucleic acid, aptamer, etc.) that specifically interacts with or specifically binds to, and thus detects, a target polynucleotide. Non-limiting examples of molecules that specifically interact with or specifically bind to a target polynucleotide include nucleic acids (e.g., oligonucleotides), proteins (e.g., antibodies, transcription factors, zinc finger proteins, non-antibody protein scaffolds, etc.), and aptamers. Generally, a probe is labeled with a detectable label. The probe can indicate the presence or level of the target polynucleotide by either an increase or decrease in signal from the detectable label. In some embodiments, the probes detect the target polynucleotide in an amplification reaction by being digested by the 5′ to 3′ exonuclease activity of a DNA dependent DNA polymerase.
As used herein, “set,” “pair” and “triplet” each refer to an ordered list of members. These terms correspond to “list”, “couple” and “triple”, respectively, in mathematics. For example, each probe pair may have the order {reference probe, drop-off probe}; each probe triplet may have the order {reference probe, drop-off probe, AS probe}; and the set of labels for reference probes may have the order {reference label of probe set 1, reference label of probe set 2, . . . reference label of probe set R}.
The terms “label” and “detectable label” are used interchangeably herein to refer to an agent detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include fluorescent dyes (fluorophores), luminescent agents, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, 32P and other isotopes, and haptens. The term includes combinations of single labeling agents, e.g., a combination of fluorophores that provides a unique detectable signature, e.g., at a particular wavelength or combination of wavelengths. Any method known in the art for conjugating a label to a desired agent may be employed, e.g., using methods described in Hermanson, Bioconjugate Techniques 1996, Academic Press, Inc., San Diego.
The term “circular permutation” refers to the act of rearranging members (e.g., labels) of a set (e.g., a set of reference probes, and a set of drop-off probes) in a circular manner to generate another set having the same members without changing the relative positions of the members, e.g., by moving the final element of a linear arrangement of the members in the set to its front. Two circular permutations are equivalent if one can be rotated into the other (that is, cycled without changing the relative positions of the elements). Each set of n members has (n−1)! circular permutations. For example, a circular permutation of the set {fluorophore 1, fluorophore 2, fluorophore 3} can be {fluorophore 2, fluorophore 3, fluorophore 1}, or {fluorophore 3, fluorophore 1, fluorophore 2}.
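The rotation described above (moving the final element of a linear arrangement to the front, or equivalently cycling the list) can be sketched as follows. The function names are chosen for this example; the code enumerates the rotations of one ordered list, which is the sense used in the fluorophore example.

```python
from typing import List, Sequence


def rotations(members: Sequence[str]) -> List[List[str]]:
    """Return every cyclic rotation of `members`, starting with the original order."""
    items = list(members)
    return [items[k:] + items[:k] for k in range(len(items))]


def is_rotation_of(a: Sequence[str], b: Sequence[str]) -> bool:
    """True if `b` can be obtained from `a` by cycling, i.e. without
    changing the relative positions of the members."""
    return len(a) == len(b) and list(b) in rotations(a)
```

For the set {fluorophore 1, fluorophore 2, fluorophore 3}, `rotations` yields the original order plus the two rearrangements given in the example above.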
The term “permutation” refers to the act of rearranging the members (e.g., labels) of a set (e.g., a set of reference probes, a set of drop-off probes and a set of allele-specific probes) into a sequence or order. For example,
A “primer” is generally a short single-stranded polynucleotide, generally with a free 3′-OH group, that binds to a target nucleic acid by hybridizing with a target sequence, and thereafter promotes polymerization of a polynucleotide complementary to the target nucleic acid. Primers can be of a variety of lengths and are often less than 50 nucleotides in length, for example, 12-30 nucleotides in length. Primers can be DNA, RNA, or a chimera of DNA and RNA portions. In some cases, primers can include one or more modified or non-natural nucleotide bases.
A nucleic acid sequence is “complementary” to another nucleic acid when at least two contiguous bases of, e.g., a first nucleic acid or a primer, can combine in an antiparallel association or hybridize with at least a subsequence of a second nucleic acid to form a duplex. In some embodiments, complementary refers to hydrogen-bonded base pair formation preferences between the nucleotide bases G, A, T, C and U, such that when two given polynucleotides or nucleotide sequences anneal to each other, A pairs with T and G pairs with C in DNA, and G pairs with C and A pairs with U in RNA.
A first nucleic acid sequence “corresponding to” a second nucleic acid sequence is a sequence that is identical to or complementary to the second nucleic acid sequence or a portion of the second nucleic acid sequence, or comprises the second nucleic acid sequence or its complementary sequence. When a second nucleic acid sequence contains a unique feature, such as a mutation, a nucleic acid sequence “corresponding to” the second nucleic acid sequence comprises a sequence having the unique feature or a complement thereof.
“Hybridization” and “annealing” as used interchangeably herein refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. A nucleic acid, or a portion thereof, “hybridizes” to another nucleic acid under conditions such that non-specific hybridization is minimal at a defined temperature in a physiological buffer (e.g., pH 6-9, 25-150 mM chloride salt). In some embodiments, the defined temperature at which specific hybridization occurs is room temperature. In some embodiments, the defined temperature at which specific hybridization occurs is higher than room temperature. In some embodiments, the defined temperature at which specific hybridization occurs is at least about 37, 40, 42, 45, 50, 55, 60, 65, 70, 75, or 80° C. In some embodiments, the defined temperature at which specific hybridization occurs is, or is about, 37, 40, 42, 45, 50, 55, 60, 65, 70, 75, or 80° C.
“Target region,” “target locus” or “target genetic locus”, as used herein, refers to a unique genomic location that defines the position of an individual nucleic acid sequence of interest, including one or more contiguous nucleotides. In some embodiments, a region or locus is a single nucleotide position of interest. In some embodiments, a region or locus is at least about any of 2, 3, 5, 10, 15, 20, 25 contiguous nucleotides. A gene may contain multiple target regions of interest. The sequence at a target region may refer to the sequence of either the sense strand or the anti-sense strand sequence of the target region, and may be a wildtype sequence or a mutant sequence. In some embodiments, the target region is associated with one or more variant sequences. In some embodiments, the target region is susceptible to mutation, and is associated with one or more mutant sequences.
“Reference region” used herein refers to a unique genomic location that defines a nucleic acid sequence (i.e., “reference sequence”) that is associated with a wildtype sequence. In some embodiments, the reference region is not associated with mutations or variations. Reference regions and reference sequences can be selected and validated by a skilled person in the art. Different reference regions may be needed for different target regions. In some embodiments, the target region and the reference region corresponding to a probe set are overlapping or identical.
“Target fragment” used herein refers to the fragment of a nucleic acid molecule that is amplified by a primer set corresponding to the respective target region. “Reference fragment” used herein refers to the fragment of a nucleic acid molecule that is amplified by a primer set corresponding to the respective reference region. In some embodiments, the target fragment includes the target region (e.g., mutation hotspot, microsatellite sequence locus or target genetic locus in a genomic DNA that is subject to site-specific genome-editing) and an adjacent reference region upstream or downstream to the target region, which provides sequence that the reference probe can hybridize to. In this situation, the target fragment and the reference fragment are the same. In some embodiments, the target region and the reference region are located in different fragments (i.e., target fragment and reference fragment respectively) that are amplified with separate pairs of primers. As used herein, a target fragment may also refer to amplicons of the target fragment, and a reference fragment may also refer to amplicons of the reference fragment.
As used herein, “adjacent to” a target region refers to a region that may partially overlap with the target region or lie outside the target region in a target fragment or amplicon thereof.
“Allele”, as used herein, refers to one of several alternative forms of a gene or DNA sequence at a specific genomic location (locus). In humans, at each autosomal locus an individual possesses two alleles, one inherited from the father and one from the mother. “Allelic sequence” as used herein refers to the sequence of a specific allele. An allelic sequence may be longer than the sequence of an allele-specific (AS) probe, or shorter than the sequence of an AS probe. An AS probe hybridizes to its corresponding allelic sequence, or a portion thereof.
“Mutant sequence” and “variant sequence” as used interchangeably herein, refer to any sequence alteration in a sequence of interest in comparison to a reference sequence. “Wildtype sequence” and “reference sequence” are used interchangeably herein, to refer to a sequence to which one wishes to compare a sequence of interest, for example, a sequence corresponding to the dominant allele of a gene, or an unmodified sequence of a genetic locus. Mutant sequences include, but are not limited to, insertions, deletions, and substitutions, including single nucleotide changes, and alterations of more than one nucleotide in a sequence.
“Mutation hotspots” refer to genetic loci that are known to have naturally-occurring mutations, for example, in a diseased tissue or a diseased state. As used herein, the term “single nucleotide variant,” or “SNV” for short, refers to the alteration of a single nucleotide at a specific position in a genomic sequence. When alternative alleles occur in a population at appreciable frequency (e.g., at least 1% in a population), a SNV is also known as “single nucleotide polymorphism” or “SNP”.
As used herein, “specific” when used in the context of a primer specific for a target nucleic acid or a probe specific for a target nucleic acid refers to a level of complementarity between the primer and the target such that there exists an annealing temperature at which the primer or probe will anneal to and mediate amplification of the target nucleic acid and will not anneal to or mediate amplification of non-target sequences present in a sample.
“Amplification” as used herein, generally refers to the process of producing two or more copies of a desired sequence. Components of an amplification reaction may include, but are not limited to, for example, primers, a polynucleotide template, polymerase, nucleotides, dNTPs and the like.
“Polymerase chain reaction” or “PCR” refers to a method whereby a specific segment or subsequence of a target double-stranded DNA is amplified in a geometric progression. PCR is well known to those of skill in the art; see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds, 1990. Exemplary PCR reaction conditions typically comprise either two or three step cycles. Two-step cycles have a denaturation step followed by a hybridization/elongation step. Three step cycles comprise a denaturation step followed by a hybridization step followed by a separate elongation step. Also contemplated herein are polymerase chain reactions that are carried out without thermal cycling, including, but not limited to, isothermal PCR and loop-mediated isothermal amplification (LAMP).
An “amplicon” refers to a nucleic acid fragment that is formed as a product of a PCR amplification reaction and is a copy of a portion of a particular target nucleic acid, e.g., a target fragment comprising a target region as illustrated in
As used herein, “digital PCR” refers to a PCR assay in which a sample is separated into a large number of partitions and PCR reactions are carried out in each partition. Signal from each partition is detected to allow quantification of nucleic acids by statistical analysis. See, e.g., Sykes et al., 1992 Quantitation of targets for PCR by use of limiting dilution. BioTechniques 13, 444-449; Vogelstein and Kinzler 1999 Digital PCR. Proc Natl Acad Sci USA, 96:9236-9241; and Pohl and Shih 2004 Principle and applications of digital PCR. Expert Rev Mol Diagn, 4:41-47; see also, Monya Baker 2012 Nature Methods 9, 541-544.
As used herein, the term “partitioning” or “partitioned” refers to separating a sample into a plurality of portions, or “partitions.” Partitions are generally physical, such that a sample in one partition does not, or does not substantially, mix with a sample in an adjacent partition. Partitions can be solid or fluid. In some embodiments, a partition is a solid partition, e.g., a microwell. In some embodiments, a partition is a fluid partition, e.g., a droplet. In some embodiments, a fluid partition (e.g., a droplet) is the result of a mixture of immiscible fluids (e.g., water and oil). In some embodiments, a fluid partition (e.g., a droplet) is an aqueous droplet that is surrounded by an immiscible carrier fluid (e.g., oil). As used herein, “substantially all partitions” refer to at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or more of the total number of partitions.
A “microsatellite sequence locus” refers to a region of genomic DNA that contains short, repetitive sequence elements of one to seven, such as one to five, or one to four basepairs in length. Each sequence repeated at least once within a microsatellite locus is referred to herein as a “repeat unit.” Each microsatellite locus typically comprises at least seven repeat units, such as at least ten repeat units, or at least twenty repeat units.
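By way of non-limiting illustration, the definition above can be expressed as a short search for tandem repeats. The following is a minimal sketch only: the function name `find_microsatellites` and its default parameters (repeat units of one to seven bases, at least seven repeat units) are assumptions chosen to mirror the typical values stated above, and the sketch is not a validated microsatellite caller.

```python
import re

def find_microsatellites(seq, max_unit=7, min_repeats=7):
    """Return (start, repeat_unit, repeat_count) for perfect tandem repeats.

    max_unit and min_repeats are illustrative defaults taken from the
    typical values in the definition; they are assumptions, not limits.
    """
    # Group 1 is the repeat unit (1..max_unit bases, shortest tried first);
    # the backreference requires at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Hypothetical sequences used only to exercise the sketch.
mono = find_microsatellites("GGC" + "A" * 10 + "TTG")
di = find_microsatellites("TT" + "CA" * 8 + "GG")
```

Only perfect (uninterrupted) repeats are reported by this sketch; interrupted repeats would need a more tolerant approach.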
A “site-specific genome-editing reagent” refers to a component or set of components that can be used for site-specific genome editing. Generally, such a reagent contains a targeting module and a nuclease module. Exemplary targeting modules include nucleic acids, e.g., guide RNAs, such as those utilized in CRISPR/Cas systems. Alternatively, the targeting module can be, or be derived from, a transcription factor domain, or a TAL effector DNA binding domain. For example, a zinc-finger domain can be employed as a targeting moiety. Exemplary nuclease modules include, but are not limited to, a type IIS restriction endonuclease (e.g., FokI), a Cas nuclease (e.g., Cas9), or a derivative thereof. In some cases, the site-specific genome-editing reagent utilizes a combination of a guide RNA, a “dead” Cas nuclease, and a type IIS restriction endonuclease. Other variations are known in the art. Generally, site-specific genome-editing reagents target a genomic region and induce a double stranded cut (“cleave”) into the DNA within the target region. Repair of the cut can proceed via two alternative pathways. In non-homologous end joining (NHEJ), the cut ends of a DNA strand are directly ligated without the need for a homologous template nucleic acid. NHEJ can lead to addition, deletion, and/or substitution of one or more nucleotides at the repair site, and the resulting sequences are referred to herein as “NHEJ-edited sequences.” In homology directed repair (HDR), the cleaved ends of a DNA strand are repaired by polymerization from a homologous template nucleic acid, and the resulting sequence is referred to herein as an “HDR-edited sequence.”
The terms “individual” or “subject” are used interchangeably herein to refer to an animal; for example, a mammal. In some examples, an “individual” or “subject” refers to an individual or subject in need of treatment for a disease or disorder.
It is understood that embodiments of the invention described herein include “consisting” and/or “consisting essentially of” embodiments.
Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X”. For example, a value of about X may be within (i.e., ±) 10%, 5%, 2%, 1% or less of X.
As used herein, reference to “not” a value or parameter generally means and describes “other than” a value or parameter.
As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
Where a range of values is provided, it is to be understood that each intervening value between the upper and lower limit of that range, and any other stated or intervening value in that stated range, is encompassed within the scope of the present disclosure. Where the stated range includes upper or lower limits, ranges excluding either of those included limits are also included in the present disclosure.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination. All combinations of the embodiments pertaining to the multiplex dPCR assays (e.g., multiplex drop-off dPCR assays) and methods are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all subcombinations of the multiplex dPCR assays (e.g., multiplex drop-off dPCR assays) and methods listed in the embodiments describing such variables are also specifically embraced by the present invention and are disclosed herein just as if each and every such sub-combination of the multiplex dPCR assays (e.g., multiplex drop-off dPCR assays) and methods was individually and explicitly disclosed herein.
The present application provides methods of quantifying reference (e.g., wildtype) sequences and/or variant sequences (e.g., mutant sequences) at two or more target regions in a nucleic acid sample using any of the probe set designs described in the “Probe sets” subsection, which include reference probes and drop-off probes that have overlapping sets of labels, and reference probes and allele-specific probes that have overlapping sets of labels. The methods described herein are useful as multiplex drop-off digital PCR assays.
Multiplex Drop-Off dPCR Methods
In some embodiments, there is provided a method for quantification of wildtype and/or mutant sequences at a plurality of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or mutant sequences at R number of target regions in a sample comprising nucleic acid molecules,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample,
wherein substantially all partitions (e.g., all partitions) each comprises R number of probe pairs corresponding to the R number of target regions,
wherein a first probe pair of the R number of probe pairs comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or mutant sequences at three target regions in a sample comprising nucleic acid molecules,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample,
wherein substantially all partitions (e.g., all partitions) each comprises three probe pairs corresponding to the three target regions, wherein the three probe pairs comprise:
a first probe pair comprising:
In some embodiments, there is provided a method for quantification of wildtype, mutant and/or allelic sequences at a plurality of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
In some embodiments, there is provided a method for quantification of wildtype, mutant, and/or allelic sequences at (R−1) number of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises (R−1) number of probe triplets corresponding to the (R−1) number of target regions,
wherein a first probe triplet of the (R−1) number of probe triplets comprises:
The methods described herein may further comprise one or more steps of forming the partitions, amplification, and/or sample preparation, etc., as described in the “Digital PCR” subsection below. In some embodiments, the method further comprises forming the plurality of partitions. In some embodiments, the method further comprises distributing a composition comprising the nucleic acid molecules and the plurality of probe sets into the plurality of partitions. In some embodiments, the method further comprises amplifying the nucleic acid molecules in the plurality of partitions using a plurality of primer sets corresponding to the plurality of target regions. In some embodiments, substantially all partitions (e.g., all partitions) each comprises a plurality of primer pairs corresponding to the plurality of target regions, wherein each primer pair comprises a forward oligonucleotide primer and a reverse oligonucleotide primer suitable for amplifying a target fragment comprising a target region corresponding to the primer set and the reference region corresponding to the target region. In some embodiments, the method further comprises distributing a composition comprising the nucleic acid molecules, the plurality of probe sets, and optionally the plurality of primer sets into the plurality of partitions.
The methods may further comprise data analysis steps as described in the subsections “Embodiment Employing Three (3) Probe Pairs,” “Embodiment Employing R number of probe pairs,” and “Embodiment Employing (R−1) number of Probe Triplets.” In some embodiments, said quantification comprises providing estimated concentrations of the wildtype sequences and/or mutant sequences at the plurality of target regions in the sample. In some embodiments, said quantification comprises providing confidence intervals of the estimated concentrations of the wildtype sequences and/or mutant sequences at the plurality of target regions in the sample. In some embodiments, said quantification comprises providing an uncertainty measure of the wildtype sequences and/or mutant sequences at the plurality of target regions. The confidence intervals and/or uncertainty measures may be at any given confidence level, such as at about any one of 80%, 85%, 90%, 95%, 98%, 99%, or higher confidence level. The quantification of the wildtype sequences and/or mutant sequences at the plurality of target regions in the sample may further be converted to quantification of the wildtype sequences and/or mutant sequences at the plurality of target regions in a biological sample, from which the sample is derived, for example, by multiplying with a dilution factor, if the sample is prepared by diluting the biological sample.
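By way of non-limiting illustration, an estimated concentration with a confidence interval may be obtained from partition counts using the standard Poisson correction for digital PCR. The following is a minimal sketch under stated assumptions: the function name `poisson_concentration`, the partition volume, and the normal-approximation interval (z = 1.96 for an approximately 95% level) are illustrative choices and are not asserted to be the data analysis of the subsections referenced above.

```python
import math

def poisson_concentration(n_positive, n_total, partition_volume_ul,
                          dilution_factor=1.0, z=1.96):
    """Return (estimate, lower, upper) in copies per microliter.

    Assumes molecules are Poisson-distributed among equal-volume
    partitions. z=1.96 gives an approximate 95% interval (assumption).
    """
    if not 0 < n_positive < n_total:
        raise ValueError("need at least one positive and one negative partition")
    p = n_positive / n_total
    lam = -math.log(1.0 - p)  # mean copies per partition
    # Delta-method standard error of lam from the binomial error of p.
    se = math.sqrt(p / (n_total * (1.0 - p)))
    # Convert to the biological sample by multiplying with the dilution factor.
    scale = dilution_factor / partition_volume_ul
    return (lam * scale, max(lam - z * se, 0.0) * scale, (lam + z * se) * scale)

# Hypothetical counts: 5,000 positive partitions out of 20,000,
# each assumed to hold 0.001 microliter.
estimate, lower, upper = poisson_concentration(5000, 20000, 0.001)
diluted = poisson_concentration(5000, 20000, 0.001, dilution_factor=10.0)
```

The dilution factor is applied as a simple multiplier, consistent with the conversion to the biological sample described above.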
The methods described herein may further be multiplexed with conventional allele-specific dPCR assays by including allele-specific probes with labels detectable via detection channels that are distinct from those used in the plurality of probe sets (including probe pairs and probe triplets). An allele-specific (“AS”) probe hybridizes to a specific allelic sequence, including wildtype sequence, mutant sequence, or SNP at a target region of interest. An AS probe is designed to confer its ability to bind properly to a specific allelic sequence at a target region, while preventing hybridization of the AS probe in the presence of any other sequence at the target region. In some embodiments, each AS probe is also used together with a dark probe to increase stringency of the assay by binding to a wildtype sequence of the allele, but not the allelic sequence associated with the AS probe.
In some embodiments, the probes (e.g., reference probes, drop-off probes, AS probes) each has a single detectable label. In some embodiments, the labels (e.g., reference labels, drop-off labels, AS labels) are fluorophores. In some embodiments, different detection channels have different excitation wavelength ranges and/or different emission wavelength ranges.
The methods described herein that distinguish probes based on the excitation and/or emission wavelengths or wavelength ranges associated with different fluorophores may further be combined with multiplexing methods that distinguish probes having the same fluorophores but relying on different fluorescence intensities. In some embodiments, one or more detection channels are associated with different excitation wavelengths or wavelength ranges, and/or emission wavelengths or wavelength ranges, and one or more detection channels are associated with different fluorescence intensities. In some embodiments, probe sets that correspond to different target regions within the same gene of interest are labeled with the same sets of fluorophores, which are detected via different fluorescence intensities. In some embodiments, the probe sets corresponding to different target regions within a gene of interest comprise reference probes having reference labels detectable via detection channels that share the same excitation and/or emission wavelengths or wavelength ranges, wherein the reference probes are detected at different fluorescence intensities with respect to each other; drop-off probes having drop-off labels detectable via different detection channels that share the same excitation and/or emission wavelengths or wavelength ranges, wherein the drop-off probes are detected at different fluorescence intensities with respect to each other; and/or AS probes having AS labels detectable via different detection channels that share the same excitation and/or emission wavelengths or wavelength ranges, and wherein the AS probes are detected at different fluorescence intensities with respect to each other.
The methods described herein use a plurality of probe sets for detecting wildtype and mutant sequences at a plurality of target regions. Each probe set may comprise 2, 3, 4, 5, 6, or more probes. In some embodiments, a plurality of probe sets is a plurality of probe pairs each comprising a reference probe that always hybridizes to target fragments, and a drop-off probe that hybridizes to target fragments comprising wildtype sequences at the target region. In some embodiments, a plurality of probe sets is a plurality of probe triplets, each comprising a reference probe, a drop-off probe, and an allele-specific (“AS”) probe that hybridizes to a specific allelic sequence at the target region. Any of the probe sets (including probe pairs and probe triplets) described herein may be used together with one or more “standalone” AS probes that are not part of the plurality of probe sets, e.g., the one or more AS probes may hybridize to target regions that are different from the target regions corresponding to the plurality of the probe sets. In some embodiments, at least 1, 2, 3, 4, 5, 6 or more standalone AS probes are used. Each probe comprises a detectable label.
In some embodiments, a plurality of probe sets is a plurality of probe triplets comprising a reference probe, a first AS probe and a second AS probe, wherein the first AS probe and the second AS probe hybridize to the same specific allelic sequence or its complementary sequence or different portions of the same specific allelic sequence. In some embodiments, the allelic sequence is at a junction of two repeats of a mutant gene associated with CNV at the target region. In some embodiments, the reference probe and one of the first AS probe and the second AS probe in each probe triplet have the same detectable label, and the other one of the first AS probe and the second AS probe has a different detectable label that can be detected via a different detection channel from that of the detectable label of the reference probe. In some embodiments, a plurality of probe sets is a plurality of probe pairs comprising a reference probe having a single detectable label and an AS probe with two detectable labels, in which one of the two detectable labels of the AS probe is the same as the detectable label of the reference probe, and the other detectable label of the AS probe can be detected via a different detection channel from that for the detectable label of the reference probe. The first AS probe and the second AS probe in the plurality of probe triplets or the AS probe with two detectable labels in the plurality of probe pairs are referred herein as “dual-labeled AS probes.”
In some embodiments, the AS probe(s) hybridize to a mutant sequence at a target region and the reference probe hybridizes to a wildtype sequence at the same target region or a reference region that overlaps with the target region.
In some embodiments, the probes (e.g., reference probe, drop-off probe, or AS probe) are oligonucleotide probes. Exemplary probes include oligonucleotide primers having hairpin structures with a fluorescent molecule held in proximity to a fluorescent quencher until forced apart by primer extension, e.g., Whitcombe et al., Nature Biotechnology, 17: 804-807 (1999) (AMPLIFLUOR™, hairpin primers). Exemplary probes may alternatively comprise an oligonucleotide attached to a fluorophore and a fluorescence quencher, wherein the fluorophore and quencher are in proximity until the oligonucleotide specifically binds to an amplification product, e.g., Gelfand et al., U.S. Pat. No. 5,210,015 (TAQMAN™ PCR probes); Nazarenko et al., Nucleic Acids Research, 25: 2516-2521 (1997) (“scorpion probes”); and Tyagi et al., Nature Biotechnology, 16: 49-53 (1998) (“molecular beacons”). Such probes may be used to measure the total amount of reaction product at the completion of a reaction or to measure the generation of amplification product during an amplification reaction. In some embodiments, the probes (e.g., reference probe, drop-off probe, or AS probe) are TAQMAN™ probes.
The probes (e.g., reference probe, drop-off probe, or AS probe) of the same probe set described herein hybridize within the same target fragments or amplicons thereof. However, standalone AS probes that are not part of the plurality of probe sets may or may not hybridize within the same target fragments or amplicons thereof as any one of the probes in the plurality of probe sets. The probes are designed according to the well-established practice in the art to minimize PCR artifact and to specifically hybridize to the target sequences. Specificity of hybridization of the oligonucleotide probes to target fragments or amplicons thereof can be achieved by altering the length of the probe, the GC content, or the amplification and/or detection conditions (e.g., temperature, salt content, etc.).
In some embodiments, the probes have a nucleotide sequence length of about 10 to about 50. In some embodiments, the probes have a nucleotide sequence length of about any one of 15-40, 25-50, 15-35, 20-40, 30-50 or 30-40. The probes may further comprise modifications that increase the specificity of the probes to their target sequences, e.g., by increasing the melting temperature (Tm) of the probe and stabilizing probe-target hybrids. In some embodiments, one or more probes include a minor groove binder (MGB) moiety at their 3′ end. In some embodiments, the probes comprise a chemically modified nucleotide, such as a Locked Nucleic Acid (LNA).
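By way of non-limiting illustration, the probe length and GC content mentioned above can be summarized for a candidate probe sequence as follows. This is a minimal sketch: the function name `probe_summary` and the example sequence are hypothetical, and the default bounds simply restate the range of about 10 to about 50 nucleotides without modeling melting temperature or modifications such as MGB or LNA.

```python
def probe_summary(seq, min_len=10, max_len=50):
    """Return length, GC fraction, and a length check for a candidate probe.

    min_len and max_len restate the 'about 10 to about 50' range above;
    treating them as hard bounds is an assumption of this sketch.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty probe sequence")
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    return {
        "length": len(seq),
        "gc_fraction": gc_fraction,
        "length_ok": min_len <= len(seq) <= max_len,
    }

# Hypothetical 20-nucleotide sequence used only to exercise the sketch.
summary = probe_summary("ACGT" * 5)
```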
The drop-off probe hybridizes to a wildtype sequence at a target region in a target fragment of a nucleic acid molecule or amplicon thereof. The target region may be a mutation hotspot in a gene of interest, including a microsatellite sequence locus. The target region may alternatively be a genomic locus edited by a site-specific genome-editing reagent (e.g., CRISPR-Cas). Preferably a drop-off probe covers the full wildtype sequence at a target region and extends further a few nucleotides on each extremity (typically 1 to 10 nucleotides, such as 2 to 8, 2 to 6, 2 to 5, or 2 to 4) to confer both its ability to bind properly and the resulting destabilization in case of a mutant sequence. In other words, the probe size is designed to confer its ability to bind properly to the wildtype sequence at the target region, while preventing hybridization of the drop-off probe in the presence of a mutation at the target region.
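By way of non-limiting illustration, the span of a drop-off probe that covers the full target region and extends a few nucleotides on each extremity can be computed as follows. This is a minimal sketch: the function name `drop_off_probe_span`, the 0-based end-exclusive coordinate convention, and the symmetric flank are assumptions for illustration only.

```python
def drop_off_probe_span(target_start, target_end, flank=3, amplicon_length=None):
    """Return (start, end) of a probe covering the target plus flanks.

    Coordinates are 0-based and end-exclusive within the amplicon
    (an assumed convention). flank is the number of nucleotides added
    on each extremity, restricted here to the typical 1 to 10 range.
    """
    if not 1 <= flank <= 10:
        raise ValueError("flank outside the typical 1 to 10 nucleotide range")
    start = target_start - flank
    end = target_end + flank
    if start < 0 or (amplicon_length is not None and end > amplicon_length):
        raise ValueError("probe would extend beyond the amplicon")
    return start, end

# Hypothetical target region at positions 40..55 of a 120-nt amplicon.
span = drop_off_probe_span(40, 55, flank=3, amplicon_length=120)
```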
The reference probe hybridizes to a wildtype sequence at a reference region. The reference region is a region associated with low mutation or single nucleotide polymorphism frequency. In some embodiments, the reference probe hybridizes at a reference region that is located on a different fragment or amplicon than the AS probe. In some embodiments, the reference probe hybridizes at a reference region that is adjacent to a target region located on the same target fragment of a nucleic acid molecule or amplicon thereof.
For drop-off assays, the reference probe hybridizes to a wildtype sequence at an adjacent reference region in a target fragment of a nucleic acid molecule or amplicon thereof. The reference region is upstream or downstream of the target region, and does not overlap with the target region. A reference probe is designed to confer its ability to bind to substantially all target fragments or amplicons thereof that are associated with its respective target region, regardless of the mutation status at the target region.
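By way of non-limiting illustration, the logic of a single reference probe and drop-off probe pair can be sketched as a classification of partitions. The sketch assumes that positive/negative calls have already been made for each probe, that the two probes of the pair are read in distinct channels, and that each partition holds at most one target fragment; the function names are hypothetical. It does not model the overlapping label sets used across multiple probe sets.

```python
from collections import Counter

def classify_partition(ref_positive, drop_off_positive):
    """Classify one partition from its reference and drop-off calls."""
    if ref_positive and drop_off_positive:
        return "wildtype"    # both probes hybridized
    if ref_positive:
        return "mutant"      # drop-off probe failed to hybridize
    if drop_off_positive:
        return "unexpected"  # drop-off signal without reference signal
    return "empty"           # no target fragment detected

def tally(calls):
    """Count partition classes from (ref_positive, drop_off_positive) pairs."""
    return Counter(classify_partition(ref, drop) for ref, drop in calls)

# Hypothetical calls for four partitions.
counts = tally([(True, True), (True, False), (False, False), (True, False)])
```

Under the single-copy assumption, a partition holding both a wildtype and a mutant fragment would be counted as wildtype, which is one reason a statistical correction is generally applied in practice.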
An allele-specific (AS) probe in a probe set hybridizes to a specific allelic sequence at the target region that the corresponding drop-off probe and the corresponding reference probe hybridize to. Each standalone AS probe that is not part of the plurality of probe sets hybridizes to a specific allelic sequence at a target region that may or may not overlap with a target region corresponding to any one probe set of the plurality of probe sets. The AS probes may be used to detect specific sequences (e.g., wildtype sequence, mutant sequence, SNP, or amplification) in a gene of interest, or HDR-edited sequences at a target genomic locus edited by a site-specific genome-editing reagent (e.g., CRISPR-Cas). An AS probe is designed to confer its ability to bind properly to a specific allelic sequence at a target region, while preventing hybridization of the AS probe in the presence of any other sequence at the target region. In some embodiments, the AS probe has a single detectable label. In some embodiments, the AS probe has two different detectable labels. In some embodiments, two AS probes having different detectable labels are used to detect a specific allelic sequence at a target region.
In some embodiments, a probe set comprising an AS probe further comprises a dark probe that binds to a wildtype sequence of the allele, but not the allelic sequence associated with the AS probe. In some embodiments, an AS probe that is not part of the plurality of probe sets is used in combination with a dark probe that binds to a wildtype sequence of the allele, but not the allelic sequence associated with the AS probe. The dark probe can increase the stringency of the assay by decreasing erroneous signal provided by binding of the AS probe to the wildtype target genetic locus. Typically, the dark probe is designed to contain a non-extendible 3′ end. An exemplary non-extendible 3′ end includes, but is not limited to, a 3′ terminal phosphate. Alternative non-extendible 3′ ends include, but are not limited to, those disclosed in, e.g., international patent application publication No. WO 2013/026027.
In addition, since dPCR is performed as an endpoint reaction (PCR is run to completion before measuring fluorescence), having single or close to single (e.g., 2, 3, 4, 5, 6 copies etc., for example, as in a Poisson distribution of target molecules into partitions with each partition containing 0, 1, 2, 3, 4, 5 or more copies of target molecules) target molecules in isolation allows multiplexing based on probe intensity (Zhong, Bhattacharya, et al., 2011 Multiplex digital PCR: breaking the one target per color barrier of quantitative PCR. Lab Chip, 11:2167-2174). For example, by adding the target-specific fluorescent assay at a limiting concentration, a compartment with a first target will be PCR-positive, but with a limited brightness at PCR endpoint. To count a second target type, a different target-specific probe with the same “color” (i.e., with the same fluorophore) is added at a different concentration. A compartment with the second target will have a brighter signal at PCR endpoint than a compartment with the first target, providing separate clouds and thus enabling separate counts for each target. Thus, combinations of both different color probes and different concentrations of probes can be used to multiplex at higher levels. Alternatively, different primer concentrations may be used for different target fragments in order to result in different signal intensity for different probe sets, thereby allowing multiplexing based on probe intensity.
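By way of non-limiting illustration, intensity-based multiplexing within a single detection channel can be sketched as assigning each partition to a cloud using two amplitude thresholds. This is a minimal sketch: the function name `classify_by_intensity`, the threshold values, and the amplitudes are hypothetical, and in practice thresholds would be set from the observed clusters.

```python
def classify_by_intensity(amplitude, negative_max, low_max):
    """Assign one partition to a cloud from its endpoint amplitude.

    negative_max and low_max are assumed, user-supplied thresholds.
    The first target uses a limiting probe concentration and is dimmer;
    the second target shares the fluorophore and is brighter.
    """
    if negative_max >= low_max:
        raise ValueError("thresholds must satisfy negative_max < low_max")
    if amplitude <= negative_max:
        return "negative"
    if amplitude <= low_max:
        return "target_1"
    return "target_2"

# Hypothetical endpoint amplitudes in arbitrary fluorescence units.
amplitudes = [800, 4200, 9100, 950, 8700]
calls = [classify_by_intensity(a, negative_max=2000, low_max=6000)
         for a in amplitudes]
```

Partitions containing both targets are not handled by this sketch; they would typically form an additional, brighter cloud requiring its own threshold.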
In some embodiments, the probes (e.g., reference probe, drop-off probe, or AS probe) are detectably labeled with a fluorophore which can be selected, for example, from the group consisting of FAM (5- or 6-carboxyfluorescein), VIC, NED, Fluorescein, FITC, IRD-700/800, Cy3, Cy5, Cy3.5, Cy5.5, HEX, TET (5-tetrachloro-fluorescein), TAMRA, JOE, ROX, BODIPY TMR, Oregon Green, Rhodamine Green, Rhodamine Red, ALEXA FLUOR®, PET®, BIOSEARCH BLUE™, MARINA BLUE, BOTHELL BLUE, ALEXA FLUOR® 350, FAM™, SYBR® Green 1, EvaGreen™, ALEXA FLUOR® 488, JOE™, VIC™, HEX™, TET™, CAL FLUOR® Gold 540, YAKIMA YELLOW®, ROX™, CAL FLUOR® Red 610, Cy3.5™, TEXAS RED®, ALEXA FLUOR® 568, Cy5™, QUASAR™ 670, LIGHTCYCLER RED640®, ALEXA FLUOR® 633, QUASAR™ 705, LIGHTCYCLER RED705®, ALEXA FLUOR® 680, SYTO9, LC GREEN®, LC GREEN® Plus+, and EVAGREEN™. In some embodiments, the reference labels, drop-off labels and/or AS labels are selected from the group consisting of fluorescein, FAM, YAKIMA YELLOW®, Cy3, HEX, VIC, ROX, Cy5, Cy5.5, ALEXA FLUOR® 647, ALEXA FLUOR® 448, and Quasar705. In some embodiments, the reference labels, drop-off labels and/or AS labels are selected from the group consisting of Cy3, FAM and Cy5. In some embodiments, the reference labels, drop-off labels and/or AS labels are selected from the group consisting of FAM, HEX and Cy5.
In some embodiments, each fluorophore is detected via a detection channel having a characteristic excitation range and a characteristic emission range. In some embodiments, different fluorophores used in the probe sets have non-overlapping excitation wavelength ranges and/or non-overlapping emission wavelength ranges. TABLE 1 below shows exemplary detection channels and compatible fluorophores that are useful in a method with three probe pairs.
The methods described herein use a total number of detection channels that is fewer than the total number of probes, including reference probes, drop-off probes and AS probes. In some embodiments, the plurality of probe sets is a plurality of probe pairs (e.g., a reference probe and a drop-off probe in each probe pair, or a reference probe and an AS probe with two different labels in each probe pair), and the total number of detection channels is fewer than two times the total number of probe pairs. In some embodiments, wherein R number of probe pairs are used, and R is 2 or more, the total number of detection channels is R, R+1, R+2, . . . or 2R−1. In some embodiments, the total number of detection channels is equal to the total number of probe pairs. In some embodiments, the plurality of probe sets is a plurality of probe triplets (e.g., a reference probe, a drop-off probe and an AS probe in each probe triplet), and the total number of detection channels is fewer than three times the total number of probe triplets. In some embodiments, wherein R number of probe triplets are used, and R is 2 or more, the total number of detection channels is R+1, R+2, R+3, . . . or 3R−1. In some embodiments, the total number of detection channels is equal to the total number of probe triplets plus 1. In some embodiments, the plurality of probe sets is a plurality of probe triplets (e.g., a reference probe, a first AS probe and a second AS probe that hybridize to the same allelic sequence in each probe triplet), and the total number of detection channels is fewer than two times the total number of probe triplets. In some embodiments, wherein R number of probe triplets are used, and R is 2 or more, the total number of detection channels is R, R+1, R+2, . . . or 2R−1. In some embodiments, the total number of detection channels is equal to the total number of probe triplets.
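By way of non-limiting illustration, the permitted totals of detection channels recited above can be enumerated for a given R. This is a minimal sketch: the function name `allowed_channel_counts` and the design labels are hypothetical names introduced only for this example.

```python
def allowed_channel_counts(r, design):
    """List the total numbers of detection channels recited for R probe sets.

    design is one of (hypothetical labels):
      "pairs"                - reference + drop-off (or dual-labeled AS) pairs
      "triplets_drop_off_as" - reference + drop-off + AS triplets
      "triplets_dual_as"     - reference + two AS probes for the same allele
    """
    if r < 2:
        raise ValueError("R is 2 or more")
    if design == "pairs":
        low, high = r, 2 * r - 1          # fewer than 2R channels
    elif design == "triplets_drop_off_as":
        low, high = r + 1, 3 * r - 1      # fewer than 3R channels
    elif design == "triplets_dual_as":
        low, high = r, 2 * r - 1          # fewer than 2R channels
    else:
        raise ValueError("unknown design: %s" % design)
    return list(range(low, high + 1))

# Example: three probe pairs can be read with 3, 4, or 5 channels.
pairs_3 = allowed_channel_counts(3, "pairs")
triplets_2 = allowed_channel_counts(2, "triplets_drop_off_as")
```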
The quencher may be an internal quencher or a quencher located at the 3′ end of the probe. Typical quenchers include, but are not limited to, tetramethylrhodamine, TAMRA, BLACK HOLE QUENCHER® (BHQ; e.g., BHQ-1, BHQ-2, BHQ-3), and nonfluorescent quencher (NFQ). Hydrolysis probes usable according to the invention are well known in the field. In some embodiments, hydrolysis probes have a fluorophore covalently attached to the 5′-end of the oligonucleotide probe, and a quencher. The quencher molecule quenches the fluorescence emitted by the fluorophore when excited by a light source, typically via FRET (Förster Resonance Energy Transfer). As long as the fluorophore and the quencher are in proximity, quenching inhibits any fluorescence signals. In some embodiments, the probes (e.g., reference probe, drop-off probe, or AS probe) are designed such that they anneal within the target fragments amplified by a specific set of primers. As the DNA polymerase (e.g., Taq polymerase) extends the primer and synthesizes the nascent strand, the 5′-3′ exonuclease activity inherent in the DNA polymerase separates the 5′ reporter from the 3′ quencher, which provides a fluorescent signal that is proportional to the amplicon yield.
In addition, as discussed above, it is possible to multiplex the probes based on probe intensity, e.g., by varying probe concentrations and/or primer concentrations. See, Zhong, Bhattacharya, et al., 2011 Multiplex digital PCR: breaking the one target per color barrier of quantitative PCR. Lab Chip, 11:2167-2174. Thus, combinations of using overlapping sets of labels for the reference probes, drop-off probes and AS probes in the plurality of probe sets (such as permutation of labels among the different types of probes) and different concentrations of probes and/or primers can be used to multiplex at higher levels.
The methods described herein can be carried out in a digital PCR format, where substantially all partitions contain either 0, 1, or close to 1 target molecule.
In some embodiments, each partition contains 0 or 1 target molecule. In some embodiments, the plurality of partitions has a Poisson distribution of the target molecules, wherein each partition has 0, 1, 2, 3, 4, 5 or more target molecules, and wherein the average number of target molecules per partition is close to 1. In some embodiments, the average number of target molecules per partition is about any one of 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2.0. For example, an optimal condition for the assays described herein may have a Poisson distribution of target molecules among the plurality of partitions with an average number of target molecules being about 1.6. In some embodiments, about 20% of the partitions each contain 0 target molecules; about 32.3% of the partitions each contain 1 target molecule; about 25.8% of the partitions each contain 2 target molecules; about 13.8% of the partitions each contain 3 target molecules; about 5.5% of the partitions each contain 4 target molecules; about 1.8% of the partitions each contain 5 target molecules; about 0.47% of the partitions each contain 6 target molecules; and about 0.12% of the partitions each contain 7 or more target molecules. In some embodiments, the target molecules referred to in this paragraph are target molecules having a sequence of interest at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having a wildtype sequence at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having a mutant sequence or mutant sequences at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having either a wildtype or mutant sequence(s) at a target region.
In some embodiments, no more than about any one of 95%, 90%, 85%, 80%, 75%, 70%, 65%, or 60% of the plurality of partitions are occupied by one or more target molecules. In some embodiments, about any one of 60%-95%, 60%-70%, 70%-80%, 80%-90%, 70%-90%, 75%-85%, 76%-84%, 77%-83%, 78%-82%, or 79%-81% of the plurality of partitions are occupied by one or more target molecules. In some embodiments, about 80% of the plurality of partitions are occupied by one or more target molecules. In some embodiments, at least about any one of 5%, 10%, 15%, 20%, 25%, 30%, 35% or 40% of the plurality of partitions each have 0 target molecules. In some embodiments, about any one of 5%-40%, 10%-20%, 20%-30%, 30%-40%, 10%-30%, 15%-25%, 16%-24%, 17%-23%, 18%-22%, or 19%-21% of the plurality of partitions each have 0 target molecules. In some embodiments, the target molecules referred to in this paragraph are target molecules having a sequence of interest at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having a wildtype sequence at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having a mutant sequence or mutant sequences at a target region. In some embodiments, the target molecules referred to in this paragraph are target molecules having either a wildtype or mutant sequence(s) at a target region.
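As a non-limiting illustration, the partition occupancy expected for a given average number of target molecules per partition can be computed from the Poisson probability mass function. The sketch below uses only the Python standard library; the function name `poisson_partition_fractions` is a hypothetical name chosen for this example. With an average of 1.6 it yields fractions that closely track the percentages recited above.

```python
import math


def poisson_partition_fractions(mean_per_partition, max_count=6):
    """Return (fractions, tail) for a Poisson-loaded set of partitions.

    fractions[k] is the expected fraction of partitions holding exactly
    k target molecules for k = 0..max_count; tail is the expected
    fraction holding more than max_count target molecules.
    """
    fractions = [
        math.exp(-mean_per_partition) * mean_per_partition ** k / math.factorial(k)
        for k in range(max_count + 1)
    ]
    tail = 1.0 - sum(fractions)
    return fractions, tail


fractions, tail = poisson_partition_fractions(1.6)
for k, fraction in enumerate(fractions):
    print(f"{k} target molecules: {100 * fraction:.2f}% of partitions")
print(f"7 or more target molecules: {100 * tail:.2f}% of partitions")
```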
Because different sequences of interest (e.g., different alleles) at a target region may be present at different frequencies, in some embodiments, a first dPCR is carried out to detect a first sequence of interest with a first distribution of target molecules among the plurality of partitions, and a second dPCR is carried out to detect a second sequence of interest with a second distribution of target molecules among the plurality of partitions, e.g., by diluting the sample and redistributing the diluted sample among the plurality of partitions.
Techniques available for digital PCR include PCR amplification on a microfluidic chip (Warren et al., 2006 Transcription factor profiling in individual hematopoietic progenitors by digital RT-PCR. Proc Natl Acad Sci USA 103, 17807-17812; Ottesen et al., 2006 Microfluidic digital PCR enables multigene analysis of individual environmental bacteria. Science 314, 1464-1467; Fan and Quake 2007 Detection of aneuploidy with digital polymerase chain reaction. Anal Chem 79, 7576-7579). Other systems involve separation onto microarrays (Morrison et al., 2006 Nanoliter high-throughput quantitative PCR. Nucleic Acids Res 34, e123) or spinning microfluidic discs (Sundberg et al., 2010 Spinning disk platform for microfluidic digital polymerase chain reaction. Anal Chem 82, 1546-1550) and droplet techniques based on oil-water emulsions (Hindson, Benjamin et al., 2011 High-Throughput Droplet Digital PCR System for Absolute Quantitation of DNA Copy Number. Analytical Chemistry 83 (22): 8604-8610; J. Madic et al. 2016, Three-Color crystal digital PCR, Biomolecular Detection and Quantification, 10: 34-36). Typically, digital PCR is selected from DROPLET DIGITAL™ PCR (ddPCR), CRYSTAL DIGITAL™ PCR, chamber (e.g., microwell-based) digital PCR, BEAMing (beads, emulsion, amplification, and magnetic) based digital PCR, and microfluidic chip-based digital PCR. In some embodiments, the dPCR is DROPLET DIGITAL™ PCR. In some embodiments, the dPCR is CRYSTAL DIGITAL™ PCR.
Examples of suitable digital PCR systems include the NAICA™ CRYSTAL DIGITAL™ PCR system by Stilla Technologies, which partitions samples into 25,000-30,000 nanoliter-sized droplets; QX100™ DROPLET DIGITAL™ PCR System by Bio-Rad, which partitions samples containing nucleic acid template into 20,000 nanoliter-sized droplets; and the RAINDROP™ digital PCR system by RainDance, which partitions samples containing nucleic acid template into 1,000,000 to 10,000,000 picoliter-sized droplets. Droplet PCR systems have been described, for example, in U.S. Ser. No. 10/501,789B2, the contents of which are incorporated herein by reference in their entirety.
In a typical digital PCR experiment, a PCR solution is made similarly to a classical TaqMan probe assay, which typically comprises the DNA sample, fluorescence-quencher probes (i.e., hydrolysis probes), primers, and a PCR master mix, which generally contains DNA polymerase, dNTPs, MgCl2, and reaction buffers at optimal concentrations. The PCR solution is then randomly distributed into discrete (i.e., individual) partitions, such that some contain no target DNA and others contain one or more target DNA copies, e.g., an average of about 1.6 target DNA copies per partition. The partitions are individually amplified to the terminal plateau phase of PCR (or end-point) and then read for fluorescence, to determine the fraction of positive partitions.
If the partitions are of uniform volume, the number of target DNA molecules present may be calculated from the fraction of positive end-point reactions using Poisson statistics, according to the following equation:
λ=−ln(1−p)
wherein λ is the average number of target DNA molecules per partition (i.e., replicate reaction) and p is the fraction of positive end-point reactions. From λ, together with the volume of each partition and the total number of partitions analyzed, an estimate of the absolute target DNA concentration is calculated.
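As a non-limiting illustration, the Poisson calculation above can be sketched in Python. The function names, the example partition counts and the example partition volume of 0.5 nL are assumptions made for this sketch only and are not taken from any particular instrument.

```python
import math


def copies_per_partition(n_positive, n_total):
    """Apply lambda = -ln(1 - p), where p is the fraction of positive partitions."""
    if n_total <= 0 or not 0 <= n_positive < n_total:
        # p = 1 (every partition positive) gives no finite estimate.
        raise ValueError("need 0 <= n_positive < n_total")
    p = n_positive / n_total
    return -math.log(1.0 - p)


def concentration_copies_per_ul(n_positive, n_total, partition_volume_nl):
    """Estimate target concentration in the partitioned reaction (copies per microliter)."""
    lam = copies_per_partition(n_positive, n_total)
    return lam / (partition_volume_nl * 1e-3)  # 1 nL = 1e-3 microliter


def total_copies_analyzed(n_positive, n_total):
    """Estimate the number of target copies across all analyzed partitions."""
    return copies_per_partition(n_positive, n_total) * n_total


# Hypothetical run: 20,000 positive partitions out of 25,000, 0.5 nL each.
lam = copies_per_partition(20_000, 25_000)
conc = concentration_copies_per_ul(20_000, 25_000, 0.5)
print(f"lambda = {lam:.3f} copies per partition, {conc:.0f} copies per microliter")
```

In this hypothetical run 80% of the partitions are positive, so λ is about 1.6, consistent with the loading discussed above.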
The methods described herein use multiple probe sets in which different types of probes share overlapping sets (e.g., circular permutation sets, or permutations as shown in
The nucleic acid molecules in each partition may be subject to amplification. Each partition may comprise a plurality of primer sets corresponding to the plurality of target regions. In some embodiments, one or more primer sets each further comprises a forward primer and a reverse primer for amplifying the reference region corresponding to the target region. Each primer pair in a primer set comprises a forward primer and a reverse primer. In some embodiments, the forward primer and the reverse primer are oligonucleotide primers that anneal to opposite strands of a nucleic acid molecule and that flank the target region and the reference region (i.e., target fragment). The primer set allows production of an amplicon specific to the target fragment during the PCR reaction. The corresponding probe sets can thus hybridize to the amplicons.
In some embodiments, substantially all partitions (e.g., all partitions) each comprise (a) a plurality of primer sets corresponding to the plurality of target regions, and (b) a DNA-dependent DNA polymerase; wherein each primer set of the plurality of primer sets comprises a forward oligonucleotide primer and a reverse oligonucleotide primer suitable for amplifying a target fragment comprising a target region corresponding to the primer set and the reference region corresponding to the target region; wherein the method comprises amplifying the target fragments from the nucleic acid molecules in the plurality of partitions; and wherein the method comprises detecting hybridization of the reference probes, the drop-off probes and/or the AS probes to amplicons of the target fragments. In some embodiments, the method comprises detecting hybridization of the reference probes and the drop-off probes to amplicons of the target fragments.
In some embodiments, substantially all partitions (e.g., all partitions) each comprise (a) a plurality of primer sets corresponding to the plurality of target regions and reference regions, and (b) a DNA-dependent DNA polymerase; wherein each primer set of the plurality of primer sets comprises a forward oligonucleotide primer and a reverse oligonucleotide primer suitable for amplifying a target fragment comprising a target region corresponding to the primer set; wherein each primer set of the plurality of primer sets comprises a forward oligonucleotide primer and a reverse oligonucleotide primer suitable for amplifying a reference fragment comprising a reference region corresponding to the target region; wherein the method comprises amplifying the target fragments from the nucleic acid molecules and the reference fragments from the nucleic acid molecules in the plurality of partitions; and wherein the method comprises detecting hybridization of the reference probes to amplicons of the reference fragments, and detecting hybridization of the AS probes (e.g., the first AS probe and the second AS probe) to amplicons of the target fragments.
The primers may be of any suitable length and GC content. In some embodiments, the plurality of primer sets can be designed using available computer programs such that upon amplification the resulting amplicons are predicted to have the same melting temperature.
The primers are designed to provide an amplicon having a suitable length so that an amplicon is long enough to allow hybridization of the respective reference probe, drop-off probe (and AS probe in some experiments), but at the same time, the amplicon is sufficiently short to avoid excessive nonspecific binding by any of the probes in the reaction mixture. In some embodiments, the amplicons are at least about any one of 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 basepairs long. In some embodiments, the amplicons are no more than about any one of 500, 450, 400, 350, 300, 250, 200, 150, or 100 basepairs long. In some embodiments, the amplicons are about any one of 100-500, 100-400, 100-300, 100-250, 100-200, 150-250, or 150-300 basepairs long.
Each partition may comprise a polymerase, which is an enzyme that performs template-directed synthesis of polynucleotides, e.g., DNA and/or RNA. The term “polymerase” encompasses both the full-length polypeptide and a domain that has polymerase activity. DNA polymerases are well known to those skilled in the art, including but not limited to DNA polymerases isolated or derived from Pyrococcus furiosus, Thermococcus litoralis, and Thermotoga maritima, or modified versions thereof. Additional examples of commercially available polymerase enzymes include, but are not limited to: Klenow fragment (New England Biolabs Inc.), Taq DNA polymerase (QIAGEN), 9°N™ DNA polymerase (New England Biolabs Inc.), Deep Vent™ DNA polymerase (New England Biolabs Inc.), Manta DNA polymerase (Enzymatics), Bst DNA polymerase (New England Biolabs Inc.), and phi29 DNA polymerase (New England Biolabs Inc.). In some embodiments, the polymerase is a DNA-dependent polymerase. In some embodiments, the polymerase is an RNA-dependent polymerase, such as reverse transcriptase.
A droplet supports PCR amplification of template molecule(s) using homogenous assay chemistries and workflows similar to those widely used for real-time PCR applications (Hindson et al., 2011, Anal. Chem. 83:8604-8610; Pinheiro et al., 2012, Anal. Chem. 84: 1003-1011). Once droplets are generated, they can be transferred to a PCR plate and emulsified PCR reactions can be run on a thermal cycler using a classical PCR program. Alternatively, droplets generated on a Sapphire chip of the NAICA™ system can be subject to thermal cycling using a classical PCR program. Thermal cycling is performed to endpoint.
To circumvent the technical challenges associated with the amplification of low-complexity sequences, such as microsatellite sequences, the annealing temperature and/or extension time of the amplification step may be increased. For example, a typical annealing temperature is 55° C., and for microsatellite loci detection, the annealing temperature may be increased by 3 to 15° C.
The PCR data collection step is typically performed using an optical detector (for example, the NAICA™ PRISM 3 system by Stilla, or the Bio-Rad QX-100 droplet reader). A detection system having a suitable number of detection channels, e.g., a three-color detection system, is used.
The partitions described herein can be in any suitable format. Microwell plates, capillaries, oil emulsion, and arrays of miniaturized chambers with nucleic acid binding surfaces can be used to partition the samples in distinct partitions or droplets. Thus, digital PCR as used herein includes a variety of formats, including chamber digital PCR, DROPLET DIGITAL™ PCR (ddPCR), CRYSTAL DIGITAL™ PCR, BEAMing (beads, emulsion, amplification, and magnetic)-based digital PCR, and microfluidic chip-based digital PCR.
Samples can be partitioned into a plurality of mixture partitions. The use of partitioning can be advantageous to reduce background amplification, reduce amplification bias, increase throughput, provide absolute or relative quantitative detection, or a combination thereof. Partitions can include any of a number of types of partitions, including solid partitions (e.g., wells or tubes) or fluid partitions (e.g., aqueous droplets within an oil phase). In some embodiments, the partitions are droplets. In some embodiments, the partitions are microwells. In some embodiments, the partitions are two-dimensional monolayers of droplets in microchambers. Methods and compositions for partitioning a sample are described, for example, in published patent applications WO 2010/036352, US 2010/0173394, US2011/0092373, US2011/0092376, U.S. Ser. No. 10/501,789B2, the entire content of each of which is incorporated by reference herein.
In some cases, samples are partitioned and detection reagents (e.g., probes, enzyme, etc.) are incorporated into the partitioned samples. In other cases, samples are contacted with detection reagents (e.g., probes, enzyme, etc.) and the sample is then partitioned. In some embodiments, reagents such as probes, primers, buffers, enzymes, substrates, nucleotides, salts, etc. are mixed together prior to partitioning, and then the sample is partitioned. In some cases, the sample is partitioned shortly after mixing reagents together so that substantially all, or the majority, of reactions (e.g., DNA amplification, DNA cleavage, etc.) occur after partitioning. In other cases, the reagents are mixed at a temperature at which reactions proceed slowly, or not at all, the sample is then partitioned, and the reaction temperature is adjusted to allow the reaction to proceed. For example, the reagents can be combined on ice, at less than 5° C., or at about 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 20-25, 25-30, or 30-35° C. In general, one of skill in the art will know how to select a temperature at which the one or more reactions are inhibited. In some cases, a combination of temperature and time is utilized to avoid substantial reaction prior to partitioning. In some embodiments, reagents and sample can be mixed using one or more hot start enzymes, such as a hot start DNA-dependent DNA polymerase. Thus, sample and one or more of buffers, salts, nucleotides, probes, labels, enzymes, etc. can be mixed and then partitioned. Subsequently, the reaction catalyzed by the hot start enzyme can be initiated by heating the mixture partitions to activate the one or more hot start enzymes.
In some embodiments, sample and reagents (e.g., one or more of buffers, salts, nucleotides, probes, labels, enzymes, etc.) can be mixed together without one or more reagents necessary to initiate an intended reaction (e.g., DNA amplification). The mixture can then be partitioned into a set of first partition mixtures and then the one or more essential reagents can be provided by fusing the set of first partition mixtures with a set of second partition mixtures that provide the essential reagent. In some embodiments, the essential reagent can be added to the first partition mixtures without forming second partition mixtures. For example, the essential reagent can diffuse into the set of first partition mixture water-in-oil droplets. As another example, the missing reagent can be directed to a set of microchannels, which contain the set of first partition mixtures.
In some embodiments, the sample is partitioned into a plurality of droplets. In some embodiments, a droplet comprises an emulsion composition, i.e., a mixture of immiscible fluids (e.g., water and oil). In some embodiments, a droplet is an aqueous droplet that is surrounded by an immiscible carrier fluid (e.g., oil). In some embodiments, a droplet is an oil droplet that is surrounded by an immiscible carrier fluid (e.g., an aqueous solution). In some embodiments, the droplets described herein are relatively stable and have minimal coalescence between two or more droplets. In some embodiments, less than 0.0001%, 0.0005%, 0.001%, 0.005%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% of droplets generated from a sample coalesce with other droplets. The emulsions can also have limited flocculation, a process by which the dispersed phase comes out of suspension in flakes.
In some embodiments, the droplets are formed by flowing an oil phase against an aqueous sample comprising nucleic acid molecules to be detected. In some embodiments, the droplets are formed by flowing an aqueous sample through microchannels comprising wall portions that diverge to detach a droplet of the aqueous sample under the effect of surface tension of the solution into a storage zone with an oil-phase carrier fluid in a microfluidic device. The oil phase can comprise a fluorinated base oil, which can additionally be stabilized by combination with a fluorinated surfactant such as a perfluorinated polyether. In some embodiments, the oil phase further comprises an additive for tuning the oil properties, such as vapor pressure, viscosity, or surface tension.
In some embodiments, the emulsion is formulated to produce highly monodisperse droplets having a liquid-like interfacial film that can be converted by heating into microcapsules having a solid like interfacial film; such microcapsules can behave as bioreactors able to retain their contents through an incubation period. The conversion to microcapsule form can occur upon heating. For example, such conversion can occur at a temperature of greater than about 40°, 50°, 60°, 70°, 80°, 90°, or 95° C. During the heating process, a fluid or mineral oil overlay can be used to prevent evaporation. Excess continuous phase oil can be removed prior to heating, or not. The microcapsules can be resistant to coalescence and/or flocculation across a wide range of thermal and mechanical processing. In some embodiments, these capsules are useful for storage or transport of partition mixtures. For example, a sample can be collected at one location, partitioned into droplets containing enzymes, buffers, probes, and/or primers, optionally one or more amplification reactions can be performed, the partitions can then be heated to perform microencapsulation, and the microcapsules can be stored or transported for further analysis.
In some embodiments, the sample is partitioned into at least 500 partitions, at least 1000 partitions, at least 2000 partitions, at least 3000 partitions, at least 4000 partitions, at least 5000 partitions, at least 6000 partitions, at least 7000 partitions, at least 8000 partitions, at least 10,000 partitions, at least 15,000 partitions, at least 20,000 partitions, at least 30,000 partitions, at least 40,000 partitions, at least 50,000 partitions, at least 60,000 partitions, at least 70,000 partitions, at least 80,000 partitions, at least 90,000 partitions, at least 100,000 partitions, at least 200,000 partitions, at least 300,000 partitions, at least 400,000 partitions, at least 500,000 partitions, at least 600,000 partitions, at least 700,000 partitions, at least 800,000 partitions, at least 900,000 partitions, at least 1,000,000 partitions, at least 2,000,000 partitions, at least 3,000,000 partitions, at least 4,000,000 partitions, at least 5,000,000 partitions, at least 10,000,000 partitions, at least 20,000,000 partitions, at least 30,000,000 partitions, at least 40,000,000 partitions, at least 50,000,000 partitions, at least 60,000,000 partitions, at least 70,000,000 partitions, at least 80,000,000 partitions, at least 90,000,000 partitions, at least 100,000,000 partitions, at least 150,000,000 partitions, or at least 200,000,000 partitions.
In some embodiments, the NAICA™ dPCR platform is used to carry out the methods described herein. In some embodiments, the Sapphire chip of the NAICA™ dPCR platform is used to partition the sample. Typically, a Sapphire chip contains 4 microchambers, each with a 2-dimensional monolayer of droplets. In some embodiments, data from dPCR reactions in droplets from different microchambers in a Sapphire chip is combined to provide quantification of the genetic species (e.g., wildtype and mutant sequences at a plurality of target regions) that the method is designed to detect.
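One possible way of combining data from several microchambers is sketched below for illustration only. The sketch assumes that the microchambers were loaded with the same reaction mixture and contain droplets of comparable volume, so that positive and total droplet counts can simply be summed before applying Poisson statistics; this pooling scheme, the function name `pooled_copies_per_partition` and the example counts are assumptions of the sketch rather than a description of any instrument software.

```python
import math


def pooled_copies_per_partition(chamber_counts):
    """Pool droplet counts from several microchambers and estimate lambda.

    chamber_counts is a sequence of (n_positive, n_total) tuples, one per
    microchamber, for a single genetic species of interest.
    """
    counts = list(chamber_counts)
    total_positive = sum(n_positive for n_positive, _ in counts)
    total_droplets = sum(n_total for _, n_total in counts)
    if total_droplets <= 0 or total_positive >= total_droplets:
        raise ValueError("need at least one negative droplet in the pooled data")
    pooled_fraction = total_positive / total_droplets
    return -math.log(1.0 - pooled_fraction)


# Hypothetical counts from the 4 microchambers of one chip.
example_counts = [(2100, 26000), (1950, 25500), (2050, 27000), (2000, 26500)]
print(f"pooled lambda = {pooled_copies_per_partition(example_counts):.4f}")
```

Pooling in this way increases the number of droplets analyzed, which narrows the statistical uncertainty of the estimate for rare species.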
In some embodiments, the sample is partitioned into a sufficient number of partitions such that at least a majority of partitions has no more than 1-5 target regions or amplicons thereof (e.g., no more than about 1, 2, 3, 4, or 5 target regions or amplicons thereof). In some embodiments, on average about 0.5, 1, 2, 3, 4, or 5 target regions or amplicons thereof are present in each partition. In some embodiments, no more than about any one of 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, or less of all partitions each contains at least 1 target region or amplicon thereof. In some embodiments, at least one partition contains no target regions or amplicons thereof (the partition is “empty”). In some embodiments, at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 22%, 25%, 30%, or 40% of the partitions contain no target regions or amplicons thereof. Generally, partitions can contain an excess of enzyme, probes, and primers such that each mixture partition is likely to successfully amplify any target regions present in the partition.
In some embodiments, the droplets that are generated are substantially uniform in shape, size and/or volume. For example, in some embodiments, the droplets are substantially uniform in average diameter. In some embodiments, the droplets that are generated have an average diameter of about 0.001 microns, about 0.005 microns, about 0.01 microns, about 0.05 microns, about 0.1 microns, about 0.5 microns, about 1 micron, about 5 microns, about 10 microns, about 20 microns, about 30 microns, about 40 microns, about 50 microns, about 60 microns, about 70 microns, about 80 microns, about 90 microns, about 100 microns, about 150 microns, about 200 microns, about 300 microns, about 400 microns, about 500 microns, about 600 microns, about 700 microns, about 800 microns, about 900 microns, or about 1000 microns. In some embodiments, the droplets that are generated have an average diameter of less than about 1000 microns, less than about 900 microns, less than about 800 microns, less than about 700 microns, less than about 600 microns, less than about 500 microns, less than about 400 microns, less than about 300 microns, less than about 200 microns, less than about 100 microns, less than about 50 microns, or less than about 25 microns. In some embodiments, the droplets that are generated are non-uniform in shape and/or size.
In some embodiments, the droplets that are generated are substantially uniform in volume. For example, the standard deviation of droplet volume can be less than about 1 picoliter, 5 picoliters, 10 picoliters, 100 picoliters, 1 nL, or less than about 10 nL. In some cases, the standard deviation of droplet volume can be less than about 10-25% of the average droplet volume. In some embodiments, the droplets that are generated have an average volume of about 0.001 nL, about 0.005 nL, about 0.01 nL, about 0.02 nL, about 0.03 nL, about 0.04 nL, about 0.05 nL, about 0.06 nL, about 0.07 nL, about 0.08 nL, about 0.09 nL, about 0.1 nL, about 0.2 nL, about 0.3 nL, about 0.4 nL, about 0.5 nL, about 0.6 nL, about 0.7 nL, about 0.8 nL, about 0.9 nL, about 1 nL, about 1.5 nL, about 2 nL, about 2.5 nL, about 3 nL, about 3.5 nL, about 4 nL, about 4.5 nL, about 5 nL, about 5.5 nL, about 6 nL, about 6.5 nL, about 7 nL, about 7.5 nL, about 8 nL, about 8.5 nL, about 9 nL, about 9.5 nL, about 10 nL, about 11 nL, about 12 nL, about 13 nL, about 14 nL, about 15 nL, about 16 nL, about 17 nL, about 18 nL, about 19 nL, about 20 nL, about 25 nL, about 30 nL, about 35 nL, about 40 nL, about 45 nL, or about 50 nL.
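For illustration only, droplet diameters and droplet volumes can be related by treating each droplet as a sphere, which is a simplifying assumption of this sketch (droplets confined in a microchamber may deviate from a spherical shape). The function names below are hypothetical. The second function expresses the standard deviation of droplet volume as a fraction of the average volume, which can be compared against a threshold such as 10-25%.

```python
import math
import statistics


def droplet_volume_nl(diameter_um):
    """Volume in nanoliters of a spherical droplet with the given diameter in microns."""
    volume_um3 = math.pi / 6.0 * diameter_um ** 3
    return volume_um3 / 1e6  # 1 nL = 1e6 cubic microns


def relative_volume_spread(diameters_um):
    """Sample standard deviation of droplet volume divided by the mean volume."""
    volumes = [droplet_volume_nl(d) for d in diameters_um]
    return statistics.stdev(volumes) / statistics.mean(volumes)


# A 100 micron droplet holds roughly half a nanoliter under the spherical assumption.
print(f"{droplet_volume_nl(100):.3f} nL")

# Hypothetical measured diameters, in microns.
measured = [98.0, 100.0, 101.0, 99.5, 100.5]
print(f"relative spread of volume = {relative_volume_spread(measured):.3f}")
```

Because volume scales with the cube of the diameter, a small relative spread in diameter corresponds to roughly three times that relative spread in volume.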
The sample analyzed by the methods in the application contains nucleic acid molecules. In some embodiments, the nucleic acid molecules are DNA molecules, such as genomic DNA, or DNA obtained from reverse transcription of RNA (e.g., cDNA). The genomic DNA may be chromosomal DNA, DNA originating from a tumor (i.e., tumor genomic DNA), fetal DNA, or a genomic DNA subject to site-specific genome editing. In some embodiments, the nucleic acid molecules are cell-free DNA (cfDNA), such as circulating DNA, for example, circulating tumor DNA, or cell-free fetal DNA.
In some embodiments, the nucleic acid molecules are RNA molecules. In such cases, the sample may be further subjected to a reverse transcription step.
The methods described herein may further comprise one or more of sample preparation steps, including, but not limited to, obtaining a biological sample from an individual, extraction of nucleic acid molecules from a biological sample, fragmenting nucleic acid molecules, and diluting nucleic acid molecules.
The sample may be prepared from a biological sample, such as a biological fluid or a biological tissue. Examples of biological fluids include urine, blood, plasma, serum, saliva, semen, stool, sputum, cerebrospinal fluid, tears, mucus, pancreatic juice, gastric juice, amniotic fluid, and serous fluids such as pericardial fluid, pleural fluid or peritoneal fluid.
Biological tissues are aggregates of cells, usually of a particular kind, together with their intercellular substance, that form one of the structural materials of a human, animal, plant, bacterial, fungal or viral structure, including connective, epithelial, muscle and nerve tissues. Examples of biological tissues also include organs, tumor tissue, lymph nodes, arteries and disseminated cell(s). The tissue can be fresh, freshly frozen, or fixed, such as formalin-fixed paraffin-embedded (FFPE) tissues. The biological sample can be obtained by any means, for example via a surgical procedure, such as a biopsy, or by a less invasive method, including, but not limited to, abrasion or fine needle aspiration.
In some embodiments, the biological sample is selected from the group consisting of tumor tissue, disseminated cells, feces, blood cells, blood plasma, serum, lymph nodes, urine, saliva, semen, stool, sputum, cerebrospinal fluid, tears, mucus, pancreatic juice, gastric juice, amniotic fluid, and serous fluids. In some embodiments, the method further comprises extracting nucleic acid molecules from a biological sample.
In some embodiments, the sample comprises microbes such as bacterial cells, archaeal cells, and/or yeast cells, or nucleic acids derived from the microbes. In some embodiments, the sample comprises viruses, or nucleic acids derived from viruses. In some embodiments, the sample comprises one or more pathogens, or nucleic acids derived from the pathogens.
In some embodiments, the sample is derived from an animal, such as a pet, a farm animal, or a model animal, e.g., a mammal. In some embodiments, the sample comprises animal cells, such as cells from a primary cell culture or a cell line. In some embodiments, the sample is derived from a plant, such as a crop or a model plant, including genetically modified (GM) plants and genetically edited (GE) plants. In some embodiments, the sample comprises genetically engineered cells, such as genome-engineered plant cells or animal cells. In some embodiments, the sample comprises nucleic acids derived from one or more cells.
In some embodiments, the sample is an environmental sample. In some embodiments, the sample is obtained from sewage water.
In some embodiments, the nucleic acid molecules in the sample have a low molecular weight. For example, the nucleic acid molecules may be no more than about any one of 1000, 900, 800, 700, 600, 500, 400, 300, or 200 nucleotides long. In some embodiments, the method further comprises fragmenting high molecular weight nucleic acid molecules (e.g., chromosomal DNA) into nucleic acid molecules of suitable size, for example, by sonication or restriction digestion.
In some embodiments, the concentration of the nucleic acid molecules in a sample is adjusted, e.g., by dilution of the sample or by concentrating the sample (e.g., by dialysis, or by lyophilization and reconstitution), to provide a suitable concentration for dPCR. In some embodiments, the method is carried out with a first sample, and the concentration of the nucleic acid molecules in the sample is adjusted based on a count of partitions that each produces a positive signal via three or more detection channels, wherein if (e.g., when) the count is larger than a pre-determined value, the adjusting is decreasing the concentration of the nucleic acid molecules in the sample by diluting the sample; or wherein if (e.g., when) the count is smaller than a pre-determined value, the adjusting is increasing the concentration of the nucleic acid molecules in the sample by concentrating the sample. In some embodiments, the dilution factor or the concentration factor is based on the count of partitions that each produces a positive signal via three or more detection channels. In some embodiments, the concentration of the nucleic acid molecules in the sample is adjusted based on the estimated concentration of the wildtype sequence, the estimated concentration of the mutant sequences, or the estimated concentration of the specific allelic sequences at one or more of the plurality of target regions in the sample. In some embodiments, the method is repeated with the sample diluted to one or more concentrations in order to provide optimal concentrations for accurate quantification of different genetic species (e.g., wildtype, mutant and/or allelic sequences) at different target regions. In some embodiments, the concentration of a genetic species in a sample is at least about any one of 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10, 15, 20, 25, 30, 35, 50, 75, 100, 150, 200, 250, 500, 1000, 2000, 5000, 7500, 10000, 15000, 20000, 100000 or more copies per μL. 
In some embodiments, the methods are designed to detect mutations at different target regions that occur at comparable frequencies, such as frequencies that differ by no more than about 100×, 50×, 20×, 10×, 5×, 2×, or less.
In some embodiments, the method comprises determining a quality control measure based on a count of partitions that each produces a positive signal via each of the detection channels. In some embodiments, the quality control measure is determined by comparing a count of partitions that each produces a positive signal via each of the detection channels with an estimated count, wherein the estimated count is based on counts of partitions other than the count of partitions that each produces a positive signal via each of the detection channels X1-XR. In some embodiments, the estimated count is based on counts of partitions that each produces a positive signal via one of the detection channels and negative signals via each of the other detection channels. For example, the probability of partitions that each produces a positive signal via each of the detection channels can be estimated as a product of each probability of partitions that each produces a positive signal via one of the detection channels and negative signals via each of the other detection channels.
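By way of a non-limiting illustration, the comparison described above can be sketched as follows. The function names, the use of the total partition count as the denominator of each fraction, and the reporting of the comparison as an observed-to-expected ratio are assumptions made for this sketch only and are not prescribed by the method.

```python
from math import prod


def expected_all_positive(single_positive_counts, total_partitions):
    """Estimate the count of partitions positive via every channel.

    Assumption: the all-positive probability is approximated by the product
    of the single-positive fractions, as stated in the paragraph above.
    """
    if total_partitions <= 0:
        raise ValueError("total_partitions must be positive")
    fractions = [n / total_partitions for n in single_positive_counts]
    return total_partitions * prod(fractions)


def qc_ratio(observed_all_positive, single_positive_counts, total_partitions):
    """Return observed / expected all-positive count, or None if expected is 0."""
    expected = expected_all_positive(single_positive_counts, total_partitions)
    if expected == 0:
        return None
    return observed_all_positive / expected
```

A ratio far from 1 would suggest that the observed count of all-positive partitions departs from what the single-positive counts predict; any acceptance threshold would have to be chosen by the user.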
In some embodiments, the sample is obtained from an individual. In some embodiments, the individual is a mammal, such as a primate, e.g., a human. In some embodiments, the primate is a monkey or an ape. The subject can be male or female and can be any suitable age, including infant, juvenile, adolescent, adult, and geriatric subjects. In some embodiments, the subject is a non-primate mammal, such as a rodent.
In some embodiments, the individual has a cancer, is in remission of a cancer, or is at risk of suffering from a cancer notably based on family history. In some embodiments, the individual has familial tumor predisposition.
In some embodiments, the individual is suffering from a cancer, is in remission from a cancer, or has a familial cancer predisposition. In some embodiments, the individual is suffering from or is at risk of suffering from a disease caused by mutations in mismatch repair (MMR) genes, such as Constitutional mismatch repair deficiency syndrome (CMMRD syndrome) or Lynch syndrome.
The cancer may be a solid cancer or a “liquid tumor” such as cancers affecting the blood, bone marrow and lymphoid system, also known as tumors of the hematopoietic and lymphoid tissues, which notably include leukemia and lymphoma. Liquid tumors include, for example, acute myelogenous leukemia (AML), chronic myelogenous leukemia (CML), acute lymphocytic leukemia (ALL), and chronic lymphocytic leukemia (CLL), as well as various lymphomas such as mantle cell lymphoma or non-Hodgkin's lymphoma (NHL).
Solid cancers include cancers affecting one of the organs selected from the group consisting of colon, rectum, skin, endometrium, lung (including non-small cell lung carcinoma), uterus, bones (such as Osteosarcoma, Chondrosarcomas, Ewing's sarcoma, Fibrosarcomas, Giant cell tumors, Adamantinomas, and Chordomas), liver, kidney, esophagus, stomach, bladder, pancreas, cervix, brain (such as Meningiomas, Glioblastomas, Lower-Grade Astrocytomas, Oligodendrocytomas, Pituitary Tumors, Schwannomas, and Metastatic brain cancers), ovary, breast, head and neck region, testis, prostate and the thyroid gland.
In process 700, some blocks are, optionally, combined, the order of some blocks is, optionally, changed, and some blocks are, optionally, omitted. In some examples, additional steps may be performed in combination with the process 700. Accordingly, the operations as illustrated (and described in greater detail below) are exemplary by nature and, as such, should not be viewed as limiting.
In some embodiments, one or more variables of process 700 can be obtained via a drop-off digital PCR process in which three (3) probe pairs corresponding to the three target regions are employed. Each probe pair of the three probe pairs comprises a drop-off probe comprising a drop-off label and an oligonucleotide drop-off sequence complementary to a wildtype sequence, and not to the mutant sequence(s), at a target region corresponding to the respective probe pair, and a reference probe comprising a reference label and an oligonucleotide reference sequence complementary to a wildtype sequence at an adjacent reference region upstream or downstream to the target region corresponding to the respective probe pair.
In some embodiments, a reference label and a drop-off label of each probe pair of the three probe pairs are detectable via different detection channels. In an exemplary scheme shown in TABLE 2, each of the three probe pairs has a reference label and a drop-off label detectable via different detection channels (i.e., blue vs. green; green vs. red; red vs. blue). In some embodiments, the reference labels of the three probe pairs are detectable via different detection channels with respect to each other, and the drop-off labels of the three probe pairs are detectable via different detection channels with respect to each other. In the exemplary scheme shown in TABLE 2, the reference labels of the three probe pairs are detectable via a blue detection channel, a green detection channel, and a red detection channel, respectively. Further, the drop-off labels of the three probe pairs are detectable via a green detection channel, a red detection channel, and a blue detection channel, respectively.
The exemplary scheme in TABLE 2 employs circular permutation. Specifically, the detection channel for one of the three reference labels is also the detection channel for one of the three drop-off labels. For example, the detection channel for the drop-off label corresponding to the first probe pair (i.e., the green detection channel) is the same as the detection channel for the reference label corresponding to the second probe pair. Further, the number of detection channels (i.e., 3) is the same as the number of probe pairs (i.e., 3).
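By way of a non-limiting illustration, the circular permutation described above can be sketched as follows. The function name and the use of 1-based channel indices (e.g., 1 for blue, 2 for green, 3 for red) are assumptions made for this sketch only.

```python
def circular_scheme(num_pairs):
    """Return (probe pair, reference channel, drop-off channel) tuples.

    Probe pair i uses channel i for its reference label and the next channel,
    wrapping around to channel 1, for its drop-off label.
    """
    if num_pairs < 2:
        raise ValueError("at least two probe pairs are needed")
    return [(i, i, i % num_pairs + 1) for i in range(1, num_pairs + 1)]


for pair, reference, drop_off in circular_scheme(3):
    print(f"probe pair {pair}: reference X{reference}, drop-off X{drop_off}")
```

For three probe pairs the sketch assigns X1/X2, X2/X3, and X3/X1, which corresponds to the blue/green, green/red, and red/blue assignment described for TABLE 2.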
It should be appreciated, however, that the scheme in TABLE 2 is merely exemplary and circular permutation is not required for performing process 700. For example, the drop-off label corresponding to the third probe pair can be yellow rather than blue. In some embodiments, the total number of detection channels can be the same as or fewer than twice the number of probe pairs (i.e., 6).
In a digital drop-off PCR process, a sample comprising nucleic acid molecules is distributed among a plurality of partitions, and substantially all partitions each comprise the three probe pairs. Hybridization of reference probes of the three probe pairs to nucleic acid molecules or amplicons thereof comprising wildtype sequences at the reference regions in the plurality of partitions can be detected. Furthermore, hybridization of drop-off probes of the three probe pairs to nucleic acid molecules or amplicons thereof comprising wildtype sequences at the target regions in the plurality of partitions can be detected. Process 700 can then be performed to provide quantification of wildtype and/or mutant sequences at the three target regions in the sample.
Process 700 is based on the assumption of independence of the partition encapsulation of nucleic acid molecules containing target regions 1, 2, and 3. Indeed, despite the fact that in the exemplary scheme in TABLE 2, due to the biomolecular design, there is no independence of the partition encapsulation of fluorophores Blue, Green and Red, there is nonetheless independence of the partition encapsulation of nucleic acid molecules containing target regions 1, 2 and 3.
Process 700 is described below in accordance with the following notations:
At block 702, a system (e.g., one or more electronic devices) determines a mutant probability (P̂(m1)) that a given partition contains a mutant sequence at the target region corresponding to the first probe pair.
In some embodiments, block 702 includes block 704 and block 706. At block 704, the system obtains a first count (n100) of one or more partitions that each produces a positive signal via the detection channel X1, a negative signal via the detection channel X2, and a negative signal via the detection channel X3.
Further, at block 706, the system obtains a second count (n000) of one or more partitions that each produces negative signals on all of the detection channels X1-X3.
In some embodiments, the system calculates P̂(m1) based on a ratio between the first count (n100) and a sum of the first count (n100) and the second count (n000), as shown below.
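By way of a non-limiting illustration, the ratio described above can be computed as follows. The function and parameter names are hypothetical and are introduced for this sketch only.

```python
def mutant_probability(n_single_positive, n_all_negative):
    """Ratio of the first count to the sum of the first and second counts.

    n_single_positive: partitions positive via X1 only (n100 in the text).
    n_all_negative: partitions negative via every channel (n000 in the text).
    """
    total = n_single_positive + n_all_negative
    if total == 0:
        raise ValueError("both counts are zero; the ratio is undefined")
    return n_single_positive / total
```

For instance, 50 single-positive partitions and 9,950 fully negative partitions would give a mutant probability of 50/10,000, i.e., 0.005.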
In some embodiments, the first count and/or the second count is zero. For example, if a mutant sequence is absent, there will be no single-positive partitions; the exemplary method is then unable to detect the mutant sequence(s), but it is able to provide an upper limit for the real concentration. In some embodiments, if the mutant (or another target) is excessively concentrated, there will be no fully negative partitions; the exemplary method is then able to detect its presence and provide a lower limit for its real concentration.
The formula above is derived as follows:
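A derivation consistent with the independence assumption stated above can be sketched as follows. The symbols N (total number of partitions) and S (the set of all wildtype and mutant species at the three target regions) are introduced here for illustration only. Under the scheme in TABLE 2, a partition that is positive via X1 only contains a mutant sequence at the first target region and none of the other species.

```latex
\begin{align*}
E[n_{100}] &= N\,P(m_1)\prod_{s \in S,\ s \neq m_1}\bigl(1-P(s)\bigr), &
E[n_{000}] &= N\prod_{s \in S}\bigl(1-P(s)\bigr),\\
\frac{n_{100}}{n_{000}} &\approx \frac{P(m_1)}{1-P(m_1)}
\quad\Longrightarrow\quad
\hat{P}(m_1) = \frac{n_{100}}{n_{100}+n_{000}}.
\end{align*}
```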
In some embodiments, the system can calculate P̂(m2) and P̂(m3) in a similar manner. For example:
In some embodiments, at block 710, the system determines an estimated concentration Ĉ(m1) of the mutant sequence(s) at the target region corresponding to the first probe pair in the sample based on the mutant probability P̂(m1) in the sample. For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
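By way of a non-limiting illustration, a Poisson-based conversion from a per-partition probability to a concentration can be sketched as follows. The sketch assumes the standard digital PCR relation C = −ln(1 − P)/v, where v is the partition volume; the parameter `partition_volume_ul` and the function name are assumptions introduced for this sketch and are not taken from the text above.

```python
import math


def poisson_concentration(probability, partition_volume_ul):
    """Copies per microliter implied by a per-partition occupancy probability.

    Assumes copies are Poisson-distributed across partitions, so that the
    mean number of copies per partition is -ln(1 - probability).
    """
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in [0, 1)")
    if partition_volume_ul <= 0:
        raise ValueError("partition volume must be positive")
    mean_copies_per_partition = -math.log1p(-probability)
    return mean_copies_per_partition / partition_volume_ul
```

A probability of exactly 1 is rejected because, as noted above, the absence of fully negative partitions only permits a lower limit on the concentration.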
In some embodiments, the system determines a confidence interval and/or an uncertainty measure associated with the estimated concentration Ĉ(m1) in the sample. For example, the confidence interval and uncertainty at 95% confidence level can be calculated as follows:
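By way of a non-limiting illustration, one common way to obtain such an interval is a normal-approximation (Wald) interval on the estimated probability, mapped to concentration through the Poisson relation. This particular choice, the effective sample size (the sum of the first and second counts), the parameter `partition_volume_ul`, and the function name are all assumptions made for this sketch; other interval constructions may be used.

```python
import math


def concentration_interval(n_single, n_negative, partition_volume_ul, z=1.96):
    """Return (lower, estimate, upper) concentration in copies per microliter.

    Assumption: a Wald interval on p = n_single / (n_single + n_negative),
    with z = 1.96 for a 95% confidence level.
    """
    n = n_single + n_negative
    if n == 0:
        raise ValueError("both counts are zero")
    if n_negative == 0:
        raise ValueError("no fully negative partitions; only a lower limit exists")
    p = n_single / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    p_low = max(0.0, p - half_width)
    p_high = min(p + half_width, 1.0 - 1e-12)  # keep the logarithm defined

    def to_concentration(q):
        return -math.log1p(-q) / partition_volume_ul

    return to_concentration(p_low), to_concentration(p), to_concentration(p_high)
```

The half-width of the returned interval, taken relative to the point estimate, is one possible uncertainty measure.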
In some embodiments, the uncertainty measure refers to uncertainty of the digital PCR method and can be calculated as provided below. One of ordinary skill in the art should appreciate that other types of uncertainty (e.g., sample taking, sample handling, processing) may be factored in.
In some embodiments, the system can calculate Ĉ(m2), Ĉ(m3), Ĉmax(m2), Ĉmax(m3), Ĉmin(m2), Ĉmin(m3), Û(m2) and Û(m3) in a similar manner.
At block 712, the system determines a wildtype probability (P̂(w1)) that a given partition contains a wildtype sequence at the target region corresponding to the first probe pair. In some embodiments, the wildtype probability is based on P̂(m1). In some embodiments, the wildtype probability is calculated based on P̂(m1) and P̂(m2).
In some embodiments, the system calculates P̂(w1) in accordance with the following formula:
The formula is derived as follows:
Thus,
and the calculation of P̂(w1) can be derived accordingly.
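One way to reconstruct such a derivation under the independence assumption and the scheme in TABLE 2 is sketched below. The counts n010 and n110 (partitions positive via X2 only, and via X1 and X2 only, respectively) and the symbols N and K are notations assumed for this sketch by analogy with n100 and n000. Partitions that are negative via X3 contain no mutant sequence at the third target region and no wildtype sequence at the second or third target region; among them, the fully negative partitions additionally contain no wildtype sequence at the first target region and no mutant sequence at the first or second target region.

```latex
\begin{align*}
n_{000}+n_{100}+n_{010}+n_{110} &\approx K,
\qquad K = N\,(1-P(m_3))(1-P(w_2))(1-P(w_3)),\\
n_{000} &\approx K\,(1-P(w_1))(1-P(m_1))(1-P(m_2)),\\
\hat{P}(w_1) &= 1-\frac{n_{000}}
{(1-\hat{P}(m_1))\,(1-\hat{P}(m_2))\,(n_{000}+n_{100}+n_{010}+n_{110})}.
\end{align*}
```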
In some embodiments, at block 720, the system determines an estimated concentration Ĉ(w1) of the wildtype sequences at the target region corresponding to the first probe pair in the sample based on the wildtype probability P̂(w1). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
In some embodiments, if the wildtype concentrations obtained for each of the probe pairs are expected to be the same, a more robust wildtype concentration estimate can be obtained by averaging the three estimated values.
In some embodiments, the system can calculate P̂(w2) and P̂(w3), Ĉ(w2) and Ĉ(w3) in a similar manner.
In some embodiments, the confidence interval, including Ĉmax(wi) and Ĉmin(wi), and the uncertainty measure (Û(wi)) at 95% confidence level can be derived from the variance of P̂(wi), where i is 1, 2, or 3.
The embodiments described in this section are applicable to dPCR methods using three sets of probes comprising dual-labeled AS probes (e.g., each probe set contains a reference probe and an AS probe with two detectable labels, or each probe set contains a reference probe, a first AS probe and a second AS probe). An exemplary scheme for a dual-labeled AS assay capable of detecting six genetic species using three fluorophores is shown in TABLE 4 below.
In process 800, some blocks are, optionally, combined, the order of some blocks is, optionally, changed, and some blocks are, optionally, omitted. In some examples, additional steps may be performed in combination with the process 800. Accordingly, the operations as illustrated (and described in greater detail below) are exemplary by nature and, as such, should not be viewed as limiting.
In some embodiments, one or more variables of process 800 can be obtained via a drop-off digital PCR process in which a R number of probe pairs corresponding to the R number of target regions are employed. Each probe pair of the R probe pairs comprises a drop-off probe comprising a drop-off label and an oligonucleotide drop-off sequence complementary to a wildtype sequence, and not to the mutant sequence(s), at a target region corresponding to the respective probe pair, and a reference probe comprising a reference label and an oligonucleotide reference sequence complementary to a wildtype sequence at an adjacent reference region upstream or downstream to the target region corresponding to the respective probe pair.
In some embodiments, a reference label and a drop-off label of each probe pair of the plurality of probe pairs are detectable via different detection channels. In an exemplary scheme shown in TABLE 5, each probe pair of the plurality of probe pairs has a reference label and a drop-off label detectable via different detection channels (i.e., Xi v. Xi+1). In some embodiments, the reference labels of the plurality of probe pairs are detectable via different detection channels with respect to each other, and the drop-off labels of the plurality of probe pairs are detectable via different detection channels with respect to each other. In the exemplary scheme shown in TABLE 5, the reference labels of the plurality of probe pairs are detectable via X1-XR, respectively. Further, the drop-off labels of the plurality of probe pairs are detectable via X2-XR and X1, respectively.
The exemplary scheme in TABLE 5 employs circular permutation. Specifically, the detection channel for one of the plurality of reference labels is also the detection channel for one of the plurality of drop-off labels. For example, the detection channel for the drop-off label corresponding to the probe pair i (i.e., Xi+1) is the same as the detection channel for the reference label corresponding to the probe pair i+1. Further, the number of detection channels (i.e., R) is the same as the number of probe pairs (i.e., R).
It should be appreciated, however, that the scheme in TABLE 5 is merely exemplary and circular permutation is not required for performing process 800. In some embodiments, the total number of detection channels can be the same as or fewer than twice the number of probe pairs (i.e., 2R).
In a multiplex drop-off digital PCR process, a sample comprising nucleic acid molecules is distributed among a plurality of partitions, and substantially all partitions each comprise the R number of probe pairs. Hybridization of reference probes of the plurality of probe pairs to nucleic acid molecules or amplicons thereof comprising wildtype sequences at the reference regions in each partition can be detected via each of the detection channels X1-XR. Further, hybridization of drop-off probes of the plurality of probe pairs to nucleic acid molecules or amplicons thereof comprising wildtype sequences in each partition can be detected via each of the detection channels X1-XR. Process 800 can then be performed to provide quantification of wildtype and/or mutant sequences at the R target regions in the sample.
Process 800 is based on the assumption of independence of the partition encapsulation of nucleic acid molecules containing target regions 1 to R. Indeed, despite the fact that in the exemplary scheme in TABLE 5, due to the biomolecular design, there is no independence of the partition encapsulation of fluorophores X1 to XR, there is nonetheless independence of the partition encapsulation of nucleic acid molecules containing target regions 1 to R.
Process 800 is described below in accordance with the following notations:
At block 802, a system (e.g., one or more electronic devices) determines a mutant probability (P̂(mi)) that a given partition contains mutant sequence(s) at the target region corresponding to the i-th probe pair.
In some embodiments, block 802 includes block 804 and block 806. At block 804, the system obtains a first count of one or more partitions that each produces a positive signal via the i-th detection channel and negative signals via any other of the detection channels X1-XR. Further, at block 806, the system obtains a second count of one or more partitions that each produces negative signals via all of the detection channels X1-XR.
In some embodiments, the system calculates P̂(mi) based on a ratio between the first count (ni) and a sum of the first count (ni) and the second count (n0), as shown below.
In some embodiments, the first count and/or the second count is zero for reasons discussed above.
The formula above is derived as follows:
In some embodiments, at block 810, the system determines an estimated concentration Ĉ(mi) of the mutant sequence(s) at the target region corresponding to the i-th probe pair in the sample based on the mutant probability P̂(mi). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
In some embodiments, the system determines a confidence interval and/or an uncertainty measure associated with the estimated concentration Ĉ(mi) in the sample. For example, the confidence interval and uncertainty at 95% confidence level can be calculated as follows:
At block 812, the system determines a wildtype probability (P̂(wi)) that a given partition contains a wildtype sequence at the target region corresponding to the i-th probe pair. In some embodiments, the wildtype probability is based on P̂(mi). In some embodiments, the wildtype probability is calculated based on P̂(mi) and P̂(mi+1).
In some embodiments, the system calculates P̂(wi) in accordance with the following formula:
The formula is derived as follows:
Thus, by equating the two above-referenced formulas, the calculation of P̂(wi) can be derived accordingly:
And subsequently:
In some embodiments, at block 820, the system determines an estimated concentration Ĉ(wi) of the wildtype sequences at the target region corresponding to the i-th probe pair in the sample based on the wildtype probability P̂(wi). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
In some embodiments, if the wildtype concentrations are expected to be the same, a more robust wildtype concentration estimate can be obtained by averaging the R estimated values.
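By way of a non-limiting illustration, the wildtype probability of block 812 can be sketched for a circular scheme such as that of TABLE 5 as follows. The representation of partition counts as a dictionary keyed by the set of positive channel indices, the function name, and the closed-form expression (obtained here from the independence assumption, with the mutant probabilities expressed through the single-positive and fully negative counts) are assumptions made for this sketch. The sketch also assumes R of at least 3, since with two channels both wildtype species light the same pair of channels.

```python
def wildtype_probability(counts, i, num_channels):
    """Estimate the wildtype probability for probe pair i (1-based).

    counts maps a frozenset of positive channel indices to a partition count,
    e.g. counts[frozenset({1, 2})] is the count positive via X1 and X2 only.
    """
    if num_channels < 3:
        raise ValueError("this sketch assumes at least three channels")
    j = i % num_channels + 1  # drop-off channel of pair i, wrapping around
    n0 = counts.get(frozenset(), 0)
    ni = counts.get(frozenset({i}), 0)
    nj = counts.get(frozenset({j}), 0)
    nij = counts.get(frozenset({i, j}), 0)
    denominator = n0 * (n0 + ni + nj + nij)
    if denominator == 0:
        raise ValueError("no fully negative partitions; estimate undefined")
    # Equivalent to 1 - n0 / ((1 - P(mi)) * (1 - P(mj)) * (n0 + ni + nj + nij))
    # with P(mi) = ni / (ni + n0) and P(mj) = nj / (nj + n0).
    return 1.0 - (n0 + ni) * (n0 + nj) / denominator


example = {frozenset(): 6840, frozenset({1}): 760,
           frozenset({2}): 360, frozenset({1, 2}): 2040}
print(wildtype_probability(example, 1, 3))
```

The example counts were constructed from a wildtype probability of 0.2 and mutant probabilities of 0.1 and 0.05, so the sketch should recover a value close to 0.2.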
In some embodiments, the confidence interval and the uncertainty measure at 95% confidence level can be derived from the variance of P̂(wi), noted as Var(P̂(wi)), as follows:
where the variance of P̂(wi) can itself be derived from
from
and from the variance of X and from the variance of A.
Knowing that:
With this biomolecular design, the higher the wildtype concentration, the higher the uncertainty of the mutant concentration.
In some embodiments, R is an integer between 2 and 6.
The embodiments described in this section are applicable to dPCR methods using R sets of probes comprising dual-labeled AS probes (e.g., each probe set contains a reference probe and an AS probe with two detectable labels, or each probe set contains a reference probe, a first AS probe and a second AS probe). An exemplary scheme for a dual-labeled AS assay capable of detecting 2R number of genetic species using R fluorophores is shown in TABLE 7 below.
Process 900 can be performed, for example, using one or more electronic devices implementing a software platform, by one or more human users, or any combination thereof. In some examples, process 900 can be performed using a client-server system, and the blocks of process 900 can be divided up in any manner between the server and a client device. In other examples, the blocks of process 900 can be divided up between the server and multiple client devices. Thus, while portions of process 900 are described herein as being performed by particular devices of a client-server system, it will be appreciated that process 900 is not so limited. In other examples, process 900 can be performed using only a client device or only multiple client devices.
In process 900, some blocks are, optionally, combined, the order of some blocks is, optionally, changed, and some blocks are, optionally, omitted. In some examples, additional steps may be performed in combination with the process 900. Accordingly, the operations as illustrated (and described in greater detail below) are exemplary by nature and, as such, should not be viewed as limiting.
In some embodiments, one or more variables of process 900 can be obtained via a drop-off digital PCR process in which a (R−1) number of probe triplets corresponding to the (R−1) number of target regions are employed. Each probe triplet of the (R−1) number of probe triplets comprises: a HDR probe comprising a HDR label and an oligonucleotide HDR sequence complementary to a HDR replacement sequence inserted at a target region corresponding to the respective probe triplet; an NHEJ drop-off probe comprising an NHEJ drop-off label and an oligonucleotide drop-off sequence complementary to a wildtype sequence of the target region corresponding to the respective probe triplet, wherein the drop-off sequence does not hybridize to NHEJ-edited mutant sequence(s) at the target region corresponding to the respective probe triplet; and a reference probe comprising a reference label and an oligonucleotide reference sequence complementary to a wildtype sequence at an adjacent reference region upstream or downstream to the target region corresponding to the respective probe triplet.
In some embodiments, a HDR label, an NHEJ drop-off label, and a reference label of each probe triplet of the plurality of probe triplets are detectable via different detection channels. In an exemplary scheme shown in TABLE 8, each probe triplet of the plurality of probe triplets has a HDR label, an NHEJ drop-off label, and a reference label detectable via different detection channels (i.e., Xi v. Xi+1 v. Xi+2). In some embodiments, the reference labels of the plurality of probe triplets are detectable via different detection channels with respect to each other, the HDR labels of the plurality of probe triplets are detectable via different detection channels with respect to each other, and the NHEJ drop-off labels of the plurality of probe triplets are detectable via different detection channels with respect to each other. In the exemplary scheme shown in TABLE 8, the reference labels of the (R−1) number of probe triplets are detectable via X1-XR-2 and XR respectively. Further, the NHEJ drop-off labels of the (R−1) number of probe triplets are detectable via X2-XR-1 and XR-1 respectively. The HDR labels of the (R−1) number of probe triplets are detectable via X1, X3-XR respectively.
The exemplary scheme in TABLE 8 employs permutation of labels in the probe triplets.
It should be appreciated, however, that the scheme TABLE 8 is merely exemplary and permutation as shown in
In a multiplex drop-off digital PCR process, a sample comprising nucleic acid molecules is distributed among a plurality of partitions, and substantially all partitions each comprise the (R−1) number of probe triplets. Hybridization of reference probes of the (R−1) number of probe triplets to nucleic acid molecules or amplicons thereof comprising wildtype sequences at the reference regions in each partition can be detected via each of the detection channels X1-XR-2 and XR. Hybridization of NHEJ drop-off probes of the (R−1) number of probe triplets to nucleic acid molecules or amplicons thereof comprising the wildtype sequences at the target regions in each partition can be detected via each of the detection channels X2-XR-1. Further, hybridization of HDR probes of the (R−1) number of probe triplets to nucleic acid molecules or amplicons thereof comprising HDR-edited sequences (i.e., HDR replacement sequences) at the target regions in each partition can be detected via each of the detection channels X1, and X3-XR. In an alternative setup, hybridization of HDR probes of the (R−1) number of probe triplets to nucleic acid molecules or amplicons thereof comprising HDR-edited sequences (i.e., HDR replacement sequences) at the target regions in each partition can be detected via each of the detection channels X2-XR-1. Further, hybridization of NHEJ drop-off probes of the (R−1) number of probe triplets to nucleic acid molecules or amplicons thereof comprising wildtype sequences at the target regions in each partition can be detected via each of the detection channels X1, and X3-XR. Process 900 can then be performed to provide quantification of unmodified, NHEJ-edited and/or HDR-edited sequences at the (R−1) target regions in the sample.
Process 900 is based on the assumption of independence of the partition encapsulation of all types of sequences at the target regions 1 to R−1. Indeed, despite the fact that in the exemplary scheme in TABLE 8, due to the biomolecular design, there is no independence of the partition encapsulation of fluorophores X1 to XR, there is nonetheless independence of the partition encapsulation of nucleic acid molecules containing target regions 1 to R−1.
Process 900 is described below in accordance with the following notations:
Process 900 depicts an exemplary process for calculating an NHEJ-edited probability (P̂(ri)), an unmodified probability (P̂(mi)), and an HDR-edited probability (P̂(wi)) corresponding to the i-th probe triplet if 1≤i≤R−2, in accordance with some embodiments.
At block 902, a system (e.g., one or more electronic devices) calculates an NHEJ-edited probability (P̂(ri)) that a given partition contains NHEJ-edited sequence(s) at the target region corresponding to the i-th probe triplet.
In some embodiments, block 902 includes block 904 and block 906. At block 904, the system obtains a first count (ni) of one or more partitions that each produces a positive signal via the Xi detection channel and negative signals via any other of the detection channels X1-XR. At block 906, the system obtains a second count (n0) of one or more partitions that each produces negative signals via all of the detection channels X1-XR. In some embodiments, the first count and/or the second count is zero.
In some embodiments, the system calculates P̂(ri) based on a ratio between the first count and a sum of the first count and the second count, as shown below:
The formula above is derived as follows:
In some embodiments, the system determines an estimated concentration Ĉ(ri) of NHEJ-edited sequence(s) at target region number i in the sample based on the NHEJ-edited probability P̂(ri). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
In some embodiments, the system determines a confidence interval and/or an uncertainty measure associated with the estimated concentration Ĉ(ri) in the sample. For example, the confidence interval and uncertainty at 95% confidence level can be calculated as follows:
At block 908, the system calculates an unmodified probability (P̂(mi)) that a given partition contains a wildtype sequence at the target region corresponding to the i-th probe triplet.
In some embodiments, block 908 includes blocks 910 and 912. At block 910, the system obtains a third count of one or more partitions that each produces a positive signal via the Xi detection channel, a positive signal via the Xi+1 detection channel and negative signals via any other of the detection channels X1-XR. At block 912, the system obtains a fourth count of one or more partitions that each produces negative signals via one or more of the detection channels X1-XR except for the Xi detection channel and the Xi+1 detection channel. In some embodiments, the fourth count is calculated as n0+ni+ni+1+ni,(i+1). In some embodiments, the first count, the second count, the third count and/or the fourth count is zero.
In some embodiments, the system calculates P̂(mi) in accordance with the following formula for i<R−1:
The formula above is derived as follows:
By equating the two formulas above, the formula for P̂(mi) can be derived accordingly.
P̂(mi) is maximized (or minimized) when P̂(ri) and P̂(ri+1) are minimized (or maximized).
In some embodiments, the system determines an estimated concentration Ĉ(mi) of unmodified sequence at target region number i in the sample based on the unmodified probability P̂(mi). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
At block 914, the system calculates a HDR-edited probability (P̂(wi)) that a given partition contains a HDR-edited sequence at the target region corresponding to the i-th probe triplet. In some embodiments, block 914 includes blocks 916 and 918.
At block 916, the system obtains a fifth count of one or more partitions that each produces a positive signal via the Xi detection channel, a positive signal via the Xi+2 detection channel and negative signals via any other of the detection channels X1-XR. At block 918, the system obtains a sixth count of one or more partitions that each produces negative signals via the detection channels X1-XR except for the Xi detection channel and the Xi+2 detection channel. In some embodiments, the sixth count is calculated as n0+ni+ni+2+ni,(i+2). In some embodiments, the fifth count and/or the sixth count is zero.
In some embodiments, P̂(wi) is calculated in accordance with the following formula, for i<R−1:
The formula above is derived as follows:
Further:
In some embodiments, the system determines an estimated concentration Ĉ(wi) of HDR-edited sequence at target region number i in the sample based on the HDR-edited probability P̂(wi). For example, the system can calculate the estimated concentration based on Poisson's Law in accordance with the following formula:
As described above, process 900 depicts an exemplary process for calculating an NHEJ-edited probability (P̂(ri)), an unmodified probability (P̂(mi)), and an HDR-edited probability (P̂(wi)) corresponding to the i-th probe triplet if 1≤i≤R−2, in accordance with some embodiments.
If i=R−1, the following formulas are used. The formulas are derived in a similar manner as those described with reference to
Indeed, by definition, “rR−1” is an impossible event, so P̂(rR−1)=0.
Although described in the context of detection of CRISPR-Cas genome-edited sequences, the above analysis is generally applicable to methods for detection of genome editing using any site-specific genome-editing reagent that edits genomic DNA via NHEJ or HDR-mediated repair of cleaved genomic DNA. Furthermore, the above embodiment of detection of site-specific genome-editing products is also generally applicable to any of the methods described herein for detection of wildtype, mutant and/or allelic sequences at (R−1) number of target regions using (R−1) number of probe triplets, in which the wildtype sequences correspond to the unmodified sequences in the CRISPR embodiment, the mutant sequences correspond to the NHEJ-edited sequences in the CRISPR embodiment, and the allelic sequences correspond to the HDR-edited sequences in the CRISPR embodiment.
Multiplex dPCR Methods with Dual-Labelled Allele-Specific Probes
The present application further provides multiplex dPCR methods that do not use drop-off probes, but use the same concept of circular permutation of labels in the probe sets in order to allow high-order multiplexing of dPCR assays. In some embodiments, the method uses a plurality of probe sets, wherein each probe set comprises a reference probe and a dual-labelled allele-specific (AS) probe or AS probe pair, wherein the dual-labelled AS probe or AS probe pair has a first detectable label that can be detected via the same detection channel as the detectable label of the reference probe, and a second detectable label that can be detected via a different detection channel from that of the detectable label of the reference probe. Each probe set and its associated primers allow quantification of a target species (i.e., allelic sequence), such as a mutation (e.g., SNP, insertion, deletion, etc.) or a copy number variation (CNV), with respect to a reference species, thereby allowing detection and quantification of the target species in a sample.
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences (e.g., rare allele or CNV) at a plurality of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences (e.g., rare allele or CNV) at R number of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises R number of probe triplets corresponding to the R number of target regions,
wherein a first probe triplet of the R number of probe triplets comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences at three target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises three probe triplets corresponding to the three target regions, wherein the three probe triplets comprise:
a first probe triplet comprising:
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences (e.g., rare allele or CNV) at a plurality of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences (e.g., rare allele or CNV) at R number of target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises R number of probe pairs corresponding to the R number of target regions,
wherein a first probe pair of the R number of probe pairs comprises:
In some embodiments, there is provided a method for quantification of wildtype and/or allelic sequences at three target regions in a sample comprising nucleic acid molecules, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, wherein substantially all partitions (e.g., all partitions) each comprises three probe pairs corresponding to the three target regions, wherein the three probe pairs comprise:
a first probe pair comprising:
The skilled person will know the different practical ways to implement such an approach, as well as the working conditions (concentration of each of the probes, etc.). In particular, illustrations of dual-labelled probes configurations may be found, for example, in U.S. Pat. No. 9,222,128 at
In some embodiments, a probe set comprises dual-labelled AS probe(s) and a reference probe that hybridize to overlapping regions, including 100% identical regions in a nucleic acid molecule or amplicons thereof. The AS probe(s) hybridize to a mutant sequence at a target region. The reference probe hybridizes to a wildtype sequence at a reference region, which can be identical to the target region as shown in
In
The calculations presented herein for a multiplex drop-off dPCR method are directly transferrable to a method using probe sets comprising dual-labelled AS probes (referred to herein as a “dual-labelled AS assay”), for example with the following assignments:
In an exemplary multiplex assay for detecting CNV of three genes, three sets of probes and primers are designed, one for each target gene. Each probe set may include a triplet of a reference probe, a first AS probe and a second AS probe. Each set of primers includes a first pair of forward and reverse primers for amplifying reference fragments containing the reference region, and a second pair of forward and reverse primers for amplifying target fragments containing the target region. The first AS probe and the second AS probe hybridize to an allelic sequence comprising a portion of the respective gene. For example, the first AS probe and the second AS probe may hybridize to a portion of a repeated sequence in a mutant gene associated with CNV. The template nucleic acid may be fractionated either physically or by restriction enzyme digestion in the preparation step so that each repeat sequence is cut from the next and partitioned into individual droplets. In other examples, the first AS probe and the second AS probe may hybridize to a junction of two repeats of a mutant gene associated with CNV, but not to sequences in a wildtype gene. Other AS probe pairs capable of detecting CNVs may also be used. The reference probe and the AS probes may be TAQMAN™ probes, with the reference probe and the first AS probe labeled with the same fluorophore and the second AS probe labeled with a different fluorophore, which are detectable via different fluorescence detection channels. For a nucleic acid containing CNV of a target gene, the reference probe hybridizes to amplicons of the reference fragment, and the first AS probe and the second AS probe hybridize to amplicons of the target fragment, thereby resulting in positive signals in the fluorescence channels that correspond to both the reference probe and the second AS probe.
For a nucleic acid containing wildtype sequences of the target gene, only the reference probe hybridizes to the amplicons of the reference fragment, thereby resulting in positive signals in the fluorescence channel that corresponds to only the reference probe, but no signal in the fluorescence channel that corresponds to the second AS probe. The signals from dPCR droplets can be plotted in three dimensions. Space segments corresponding to different clusters of signals are determined, and the number of droplets in each space segment is counted. The counts are used to estimate the concentration of each wildtype and CNV population.
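The counting step can be sketched as follows, under the simplifying assumption that each space segment is delimited by a single amplitude threshold per fluorescence channel. In practice the segments may be determined by other clustering approaches. The threshold values, data layout and function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence


def count_clusters(amplitudes: Iterable[Sequence[float]],
                   thresholds: Sequence[float]) -> Counter:
    """Count droplets per cluster.

    Each cluster key is a tuple of booleans, one per channel, that is True
    when the droplet amplitude exceeds that channel's threshold.
    """
    counts: Counter = Counter()
    for droplet in amplitudes:
        if len(droplet) != len(thresholds):
            raise ValueError("one amplitude per channel is required")
        key = tuple(value > limit for value, limit in zip(droplet, thresholds))
        counts[key] += 1
    return counts
```

With three channels, the key (True, True, False) collects droplets positive in the first two channels only, and (False, False, False) collects the negative droplets.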
A skilled person in the art would readily appreciate that features and embodiments of multiplex drop-off dPCR methods described herein can be applied to multiplex dPCR methods with dual-labelled AS probes mutatis mutandis according to the assignments above, including, but not limited to, features described in the “Probe sets” and “Digital PCR” sections above and various applications, systems, kits and articles of manufacture in Sections III and IV below.
In the context of the dual-labelled AS assay, which uses two AS probes each having a different detectable label, the mathematical uncertainty is the lowest for the species labelled with two probes. In such assays, it may be preferable to use the dual-labelled AS probe pair to detect target species (e.g., CNV or rare alleles) in order to minimize uncertainty. In some embodiments, where it is desirable to have the lowest mathematical uncertainty attached to the reference sequences in the context of an assay having dual-labeled probes, a pair of dual-labeled reference probes may be used to detect the reference sequences, and a single-label allele-specific probe may be used to detect target species (e.g., CNV or rare alleles) with one probe. The above assignments of species in paragraph [00265], which make use of the calculations presented for the multiplex drop-off assay, are therefore exchanged in this context, with refi corresponding to wi and targeti corresponding to mi.
Although the mathematical uncertainty is lowest with the dual-labelled probe(s), in certain assays, such as MAF assays (e.g., using probe sets of
Any one of the dual-labelled AS probe assays and methods described herein may be used together with the drop-off assays to simultaneously measure one or more drop-offs and to calculate one or more alleles (e.g., CNVs) as described above. For example, in some embodiments, the method uses a first plurality of probe sets each comprising a reference probe and a drop-off probe, and a second plurality of probe sets each comprising a reference probe, a first AS probe and a second AS probe. In some embodiments, the method uses a first plurality of probe sets each comprising a reference probe and a drop-off probe, and a second plurality of probe sets each comprising a reference probe and a dual-labeled AS probe. In some embodiments, the method further uses one or more standalone AS probes. The set of labels for the plurality of probe sets are permutated to reduce the number of detection channels required for detecting the various genetic species.
In a first example, a multiplex dPCR assay for detecting 2 CNV sequences and 1 drop-off sequence is designed using 3 fluorescence channels represented by 1, 2, 3 that can detect 3 corresponding fluorophores: 1, 2, 3. The two CNVs can be detected using 2 sets of primers and 2 sets of AS probe pairs to quantify their respective targets and 2 sets of primers and 2 sets of single probes to quantify their respective reference. The drop-off is quantified using one set of primers and one probe pair, one probe from said pair being specific to the drop-off sequence and the other probe from said pair being specific to the reference sequence. TABLE 10 below shows an exemplary scheme for this assay. A similar configuration may be used to simultaneously quantify 2 rare allele sequences at two different genetic loci (corresponding to CNV1 and CNV2), and 1 drop-off sequence.
In a second example, a multiplex dPCR assay for detecting 2 drop-off sequences and 1 CNV sequence is designed using 3 fluorescence channels represented by 1, 2, 3 that can detect 3 corresponding fluorophores: 1, 2, 3. TABLE 11 below shows an exemplary scheme for this assay. A similar configuration may be used to simultaneously quantify 2 drop-off sequences and 1 rare allele sequence (corresponding to CNV).
The methods and multiplex dPCR assays (e.g., multiplex drop-off dPCR assays) described herein are useful in a variety of applications, including treatment, diagnosis and genome editing. Because of their sensitivity and multiplexing capacity, the methods described herein are particularly useful for detection of predictive mutation biomarkers (e.g., microsatellite instability) in DNA samples containing very low concentrations of target DNA, or for detection of rare NHEJ or HDR-edited sequences at target genomic loci by site-specific genome-editing reagents (e.g., CRISPR/Cas).
In some embodiments, there is provided a method of diagnosing a disease or condition in an individual, wherein the disease or condition is associated with mutations at a plurality of target genetic loci, comprising detecting mutant sequences at the plurality of target genetic loci in a sample of the individual using any one of the methods described in the “Multiplex dPCR methods” section. In some embodiments, the disease or condition is cancer. In some embodiments, the target genetic loci are selected from the group consisting of EGFR, KRAS, NRAS, ESR1 and BRAF. In some embodiments, the target genetic loci are microsatellite sequence loci. In some embodiments, the target genetic loci are associated with rare alleles or CNVs.
In some embodiments, there is provided a method for prognosis of a disease or condition in an individual, wherein the disease or condition is associated with mutations at a plurality of target genetic loci, comprising detecting mutant sequences at the plurality of target genetic loci in a sample of the individual using any one of the methods described in the “Multiplex dPCR methods” section. In some embodiments, the disease or condition is cancer. In some embodiments, the target genetic loci are selected from the group consisting of EGFR, KRAS, NRAS, ESR1 and BRAF. In some embodiments, the target genetic loci are microsatellite sequence loci. In some embodiments, the target genetic loci are associated with rare alleles or CNVs.
In some embodiments, there is provided a method for predicting the efficacy of a treatment in an individual having a disease or condition, wherein the efficacy of the treatment is associated with mutations at a plurality of target genetic loci, comprising detecting mutant sequences at the plurality of target genetic loci in a sample of the individual using any one of the methods described in the “Multiplex dPCR methods” section. In some embodiments, the disease or condition is cancer. In some embodiments, the target genetic loci are microsatellite sequence loci. In some embodiments, the target genetic loci are selected from the group consisting of EGFR, KRAS, NRAS, ESR1 and BRAF. In some embodiments, the treatment is an immunotherapy, such as an immune checkpoint modulator. In some embodiments, the target genetic loci are associated with rare alleles or CNVs.
In some embodiments, there is provided a method for treating a disease or condition in an individual, wherein the disease or condition is associated with mutations at a plurality of target genetic loci, comprising detecting mutant sequences at the plurality of target genetic loci in a sample of the individual using any one of the methods described in the “Multiplex dPCR methods” section, and administering to the individual an effective amount of a therapeutic agent, if mutant sequences are detected. In some embodiments, the disease or condition is cancer. In some embodiments, the target genetic loci are microsatellite sequence loci. In some embodiments, the target genetic loci are selected from the group consisting of EGFR, KRAS, NRAS, ESR1 and BRAF. In some embodiments, the therapeutic agent is an immunotherapeutic agent, such as an immune checkpoint modulator. In some embodiments, the target genetic loci are associated with rare alleles or CNVs.
In some embodiments, there is provided a method for monitoring an individual diagnosed with a disease or condition associated with a plurality of target genetic loci, comprising detecting mutant sequences at the plurality of target genetic loci in a first sample obtained from the individual at a first time point using any one of the methods described in the “Multiplex dPCR methods” section, detecting mutant sequences at the plurality of target genetic loci in a second sample obtained from the individual at a second time point using the method, and comparing the estimated concentrations of the mutant sequences at one or more of the plurality of target genetic loci in the first sample versus that in the second sample. In some embodiments, the disease or condition is cancer. In some embodiments, the target genetic loci are microsatellite sequence loci. In some embodiments, the target genetic loci are selected from the group consisting of EGFR, KRAS, NRAS, ESR1 and BRAF. In some embodiments, the first time point is before the individual receives a treatment. In some embodiments, the second time point is after the individual receives a treatment. In some embodiments, the treatment is an immunotherapeutic agent, such as an immune checkpoint modulator. In some embodiments, the target genetic loci are associated with rare alleles or CNVs.
The methods described herein are useful for detecting mutations that occur at mutation hotspots in the genome. Because drop-off probes are used to detect the mutations, any mutant sequence at a specific target region can be detected, thereby allowing detection of low-frequency mutations that have similar functional impact on the gene product.
The methods described herein are useful for detecting microsatellite instability (MSI) in a sample of an individual having cancer or at risk of having cancer.
In some embodiments, there is provided a method for quantification of mutations at a plurality of microsatellite sequence loci in a sample comprising nucleic acid molecules,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions (e.g., all partitions) each comprises:
Microsatellite instability (MSI) is the condition of genetic hypermutability (predisposition to mutation) that results from impaired DNA mismatch repair (MMR). The presence of MSI represents phenotypic evidence that MMR is not functioning normally. Mutations at a microsatellite locus typically include deletion(s), addition(s) or substitution(s) of at least one repeat unit. Typically, MSI results in a change in length at a microsatellite locus, due to addition(s) or, most frequently, deletion(s).
Many microsatellite sequence loci are known. The ability to detect microsatellite expansion mutations at multiple microsatellite sequence loci in a single assay increases the sensitivity of MSI detection. Exemplary microsatellite sequence loci have been described in Bacher et al., 2004 Disease Markers 20, 237-250, as well as in Hause et al., 2016 Nat Medicine November 22(11):1342-1350. In some embodiments, the target microsatellite sequence loci (or microsatellite markers) are selected from microsatellites found to be highly associated with MSI positive tumors, based on their frequency of instability in colon, endometrial, rectal and stomach adenocarcinomas. In some embodiments, the target microsatellite sequence loci are located in regions frequently amplified in tumors (e.g. chr8q region of the human genome). In some embodiments, the target microsatellite sequence loci are selected from the group comprising BAT-25, BAT-26, BAT-34c4, BAT-40, NR21, NR24, MONO-27, D2S123, D5S346, D17S250, ACVR2A, DEFB105A, DEFB105B, RNF43, DOCK3, GTF2IP1, LOC100093631, PIP5K1A, MSH3, TRIM43B, PPFIA1 and TDRD1. In some embodiments, the target microsatellite sequence loci are selected among the Bethesda panel, which comprises BAT-25, BAT-26, D2S123, D5S346 and D17S250.
Mononucleotide repeat loci have been shown to be very susceptible to alteration in tumors with dysfunctional DNA mismatch repair systems (Parsons, 1995 supra), making such loci particularly useful for the detection of cancer and other diseases associated with dysfunctional DNA mismatch repair systems, such that mononucleotide MSI markers may be preferred.
In some embodiments, the microsatellite sequence loci are short microsatellite sequences (typically comprising 8 to 30, 8 to 25, 8 to 20, 8 to 15, or 8 to 12 nucleotides) such as the target microsatellite sequence locus exemplified in the group consisting of D2S123, D5S346, D17S250, ACVR2A, DEFB105A, DEFB105B, RNF43, DOCK3, GTF2IP1, LOC100093631, PIP5K1A, MSH3, TRIM43B, PPFIA1 and TDRD1.
The MSI detection methods can be routinely performed on biological samples (or nucleic acid samples derived from biological samples), such as blood samples, plasma samples, urine, or fecal samples. Mutant allele frequency determined using any one of the methods described herein can be compared with a control mutated allele frequency obtained from a control DNA sample. The control DNA sample may be a wildtype sample or a sample of a cell line derived from a subject diagnosed with a MSI positive tumor or with a disease associated with a mutation in the DNA mismatch repair, at a prior time point, during the time-course of the disease and/or during the time course of the treatment.
In some embodiments, a cancer (or a tumor) associated with MSI is also named a MSI positive cancer (or a MSI positive tumor) and relates to a cancer (or tumor) wherein the genomic tumor DNA exhibits at least one mutation in a microsatellite sequence locus. MSI has thus been associated with a great variety of cancers such as but not limited to colorectal cancers, gastric cancer, endometrium cancer, ovarian cancer, urinary tract cancer, brain cancer, and breast cancer. MSI is most prevalent as the consequence of colon cancers. Additionally, MSI is associated with the Constitutional mismatch repair deficiency syndrome (CMMRD syndrome) or the Lynch syndrome. Therefore, detection of mutant sequences at one or more microsatellite sequence loci according to the methods described herein can be used in the diagnosis of cancers which are associated with impaired DNA mismatch repair, e.g., MSI positive cancers, or in the diagnosis of familial tumor predisposition in an individual.
The MSI phenotype of the cancer (i.e. positive or negative) has important implications in cancer prognosis and rational planning of treatment (Boland and Goel, Gastroenterology 2010). Therefore, even in the case of cancers with low MSI positive prevalence, it remains of high relevance to identify whether the patient is suffering from a MSI positive tumor or a MSI negative tumor. The method of the present invention can be used in the prognosis of various cancers. Identification of a MSI positive cancer is generally associated with a better prognosis.
The present application also relates to a method for predicting the efficacy of a treatment. Reports have shown for example that colorectal cancer patients with MMR deficiency have better responses to immunotherapy by PD-1 immune checkpoint blockade and show improved progression-free survival. Therefore, identification of patients suffering from cancer associated with MSI (i.e. MSI positive cancer or tumor) is of high clinical relevance for selection of an appropriate therapeutic strategy. In some embodiments, the treatment is immunotherapy. Immunotherapy includes but is not limited to immune checkpoint modulators (i.e. inhibitors and/or agonists), monoclonal antibodies, and cancer vaccines. In some embodiments, the treatment comprises administration of immune checkpoint modulators such as anti-PD-1 and/or anti-PD-L1 inhibitors. In some embodiments, immunotherapy is administered to the subject if mutant sequences at one or more microsatellite sequence loci in nucleic acid molecules from a sample are detected.
The methods for detecting microsatellite instability may also be used for monitoring of an individual diagnosed with a tumor associated with impaired DNA mismatch repair. In some embodiments, said monitoring is performed during the time course of the treatment. The method may also be used for the monitoring of cancer relapse in a subject having suffered from a tumor associated with impaired DNA mismatch repair. In an individual having suffered from a tumor associated with impaired DNA mismatch repair, detection of microsatellite instability in circulating tumor DNA may be indicative of a relapse.
The methods described herein are useful for detecting unmodified (i.e., wildtype) and mutant (e.g., NHEJ or HDR-edited) sequences at a plurality of target genomic regions in a cell that is subject to site-specific genome editing.
In some embodiments, there is provided a method for quantification of unmodified and/or NHEJ-edited sequences at a plurality of target regions in a sample comprising nucleic acid molecules from cells, wherein the cells have been contacted with a site-specific genome-editing reagent, wherein the site-specific genome-editing reagent is configured to cleave target sites in the plurality of target regions, wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions (e.g., all partitions) each comprises:
In some embodiments, there is provided a method for quantification of unmodified, homology directed repair (HDR)-edited, and/or non-homologous end joining (NHEJ)-edited sequences at a plurality of target regions in a sample comprising nucleic acid molecules from cells, wherein the cells have been contacted with a site-specific genome-editing reagent and HDR template nucleic acids comprising HDR replacement sequences, wherein the site-specific genome-editing reagent is configured to cleave target sites in the plurality of target regions,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions (e.g., all partitions) each comprises: a plurality of probe sets corresponding to the plurality of target regions, wherein each probe set of the plurality of probe sets comprises:
In some embodiments, there is provided a method for quantification of unmodified, homology directed repair (HDR)-edited, and/or non-homologous end joining (NHEJ)-edited sequences at (R−1) number of target regions in a sample comprising nucleic acid molecules from cells, wherein the cells have been contacted with a site-specific genome-editing reagent and HDR template nucleic acids comprising HDR replacement sequences, wherein the site-specific genome-editing reagent is configured to cleave target sites in the plurality of target regions,
wherein the nucleic acid molecules are distributed among a plurality of partitions of the sample, and wherein substantially all partitions (e.g., all partitions) each comprises (R−1) number of probe triplets corresponding to the (R−1) number of target regions,
wherein a first probe triplet of the plurality of probe triplets comprises:
In some embodiments, the method described herein is carried out in a dPCR assay, such as a CRYSTAL DIGITAL™ PCR assay. In some embodiments, the site-specific genome-editing reagent comprises a Cas nuclease, a TALEN, or a Zinc-finger nuclease. In some embodiments, the method further comprises contacting the cells with the site-specific genome-editing reagent.
In some embodiments, there is provided a method for identifying an optimized condition for genome editing of a cell, comprising: a) performing site specific genome editing of a plurality of cells under a first set of conditions to provide a first sample comprising nucleic acid from genome-edited cells; b) performing site specific genome editing of a plurality of cells under a second set of conditions to provide a second sample comprising nucleic acid from genome-edited cells; c) using any one of the methods described in the “Detection of genome-editing” section to quantify NHEJ-edited sequences and/or HDR-edited sequences at a plurality of target genomic regions in the first and second samples to determine a genome editing efficiency for the first and second set of conditions; and d) comparing the genome editing efficiency of the first and second set of conditions, thereby identifying an optimized set of conditions that provide a higher genome editing efficiency. In some embodiments, different sets of conditions comprise different site-specific genome editing reagents (e.g., different Cas and/or different gRNA), different target genomic loci, different delivery methods, and/or different concentrations of site-specific genome editing reagents. In some embodiments, editing is performed under a third, fourth, fifth, sixth, etc. number of conditions. In some cases, the higher efficiency of HDR editing is identified as the optimized condition for genome editing. In some cases, the higher efficiency of NHEJ editing is identified as the optimized condition for genome editing. In some cases, the higher ratio of the efficiency of NHEJ to HDR editing is identified as the optimized condition for genome editing. In some cases, the higher ratio of the efficiency of HDR to NHEJ editing is identified as the optimized condition for genome editing.
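The comparison in step d) can be sketched as follows. The sketch assumes, purely for illustration, that genome editing efficiency at a target region is the edited fraction (NHEJ + HDR) / (unmodified + NHEJ + HDR) of the estimated concentrations, averaged over the target regions. This definition, the class and function names and any example values are hypothetical and are not stated in the present disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LocusConcentrations:
    """Estimated concentrations at one target region (same unit for all fields)."""
    unmodified: float
    nhej: float
    hdr: float


def editing_efficiency(loci: List[LocusConcentrations]) -> float:
    """Mean edited fraction across target regions (assumed definition)."""
    if not loci:
        raise ValueError("at least one target region is required")
    fractions = []
    for locus in loci:
        total = locus.unmodified + locus.nhej + locus.hdr
        if total <= 0.0:
            raise ValueError("each target region needs a positive total concentration")
        fractions.append((locus.nhej + locus.hdr) / total)
    return sum(fractions) / len(fractions)


def best_condition(conditions: Dict[str, List[LocusConcentrations]]) -> str:
    """Return the name of the condition with the highest editing efficiency."""
    return max(conditions, key=lambda name: editing_efficiency(conditions[name]))
```

Replacing the numerator with the HDR or NHEJ concentration alone, or with their ratio, would give the alternative optimization criteria mentioned above.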
The methods can be used to determine efficacy of genome editing at a plurality of genomic loci in a sample, for identifying optimal conditions for genome editing, or to guide enrichment of populations of cells for genome editing products (e.g., by sub-selection). For example, genome-editing conditions can be optimized to decrease, or increase, the type or amount of NHEJ mutations in comparison to HDR mutations at a plurality of target genomic regions. As another example, genome editing conditions can be optimized to increase the efficiency of editing, thus allowing the use of a low concentration or activity of genome editing reagent without unduly reducing the amount of editing achieved. Usage of a low concentration or activity of genome editing reagent may be useful for reducing off-target editing events.
In some embodiments, the sample is a sample of cells, a sample of genomes extracted from a sample of cells, or fragments thereof. The cells, or genomes extracted from cells, can be contacted with site-specific genome-editing reagents under conditions suitable for the genome editing of a plurality of target genomic regions. In some cases, the cells or genomes are contacted with a plurality of HDR replacement nucleic acids to introduce a pre-determined HDR mutation into the genome. The genome editing reagents can contain one or more nucleases that introduce double-strand breaks into DNA.
Site-specific genome editing reagents are known in the art. Generally, such reagents target a genomic region and induce a double-stranded cut in the DNA within the target region. Repair of the cut can proceed via two alternative pathways. In non-homologous end joining (NHEJ), the cut ends of a DNA strand are directly ligated without the need for a homologous template nucleic acid. NHEJ can lead to the addition, deletion, or substitution, or a combination thereof, of one or more nucleotides at the repair site. In homology directed repair (HDR), the cut ends of a DNA strand are repaired by polymerization from a homologous template nucleic acid. Thus, the original sequence is replaced with the sequence of the template. The homologous template nucleic acid can be provided by homologous sequences elsewhere in the genome (sister chromatids, homologous chromosomes, or repeated regions on the same or different chromosomes). Alternatively, an exogenous template nucleic acid can be introduced to obtain a specific HDR mutation.
As another example, a genome-editing reagent containing an obligate heterodimer nuclease can be used to reduce off-target mutations. Such a genome-editing reagent can be designed to only generate double stranded breaks when obligate heterodimer nucleases are formed at the target genomic region by site-specific recruitment of each monomer component to an adjacent target half-site. Exemplary obligate heterodimer nucleases include, but are not limited to, those described in U.S. patent application Ser. No. 13/812,857. The targeting function can be provided by a nuclease defective Cas9 (dCas9) and appropriate guide RNAs, a pair of TALENs, or any other nucleic acid sequence specific targeting method.
The NHEJ drop-off probes are designed according to the type of the site-specific genome-editing reagent used. For example, if the genome-editing reagent is a Cas9 nuclease and a guide RNA, cut sites are generally 3-5 base pairs directly upstream of a protospacer adjacent motif (PAM). The PAM generally consists of the sequence NGG, although some other PAM sequences can be utilized, such as NGA or NAG. Thus, cut sites can be, for instance, either [5′-20 nt target-NGG-3′] or [5′-CCN-20 nt target-3′]. When the target site is 5′-20 nt target-NGG, the predicted cut-site is approximately 3-5 base pairs upstream of the 5′ end of the NGG PAM. In such cases, the NHEJ drop-off probe can be designed to hybridize to a subregion containing this predicted cut-site.
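As a non-limiting sketch of the design rule above, the following Python snippet locates NGG PAMs on one strand of a sequence and reports the window lying 3 to 5 bases upstream of the 5′ end of each PAM, then checks whether a candidate probe footprint contains that window. The function names, the example sequence and the probe coordinates are hypothetical; only the [5′-20 nt target-NGG-3′] orientation is handled, and the opposite-strand (CCN) case would require scanning the reverse complement.

```python
import re


def predicted_cut_windows(seq: str, protospacer_len: int = 20):
    """List (pam_start, window_start, window_end) for each NGG PAM on this strand.

    Coordinates are 0-based. seq[window_start:window_end] holds the three bases
    located 5, 4 and 3 positions upstream of the first base of the PAM.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs (e.g. within "AGGG") are all found.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough upstream sequence for a full-length target
        hits.append((pam_start, pam_start - 5, pam_start - 2))
    return hits


def probe_covers_window(probe_start: int, probe_end: int, window) -> bool:
    """True if the half-open probe footprint [probe_start, probe_end) contains the window."""
    _, window_start, window_end = window
    return probe_start <= window_start and window_end <= probe_end


# Made-up example: a 20 nt target followed by a TGG PAM and flanking bases.
example = "GATTACAGATTACAGATTAC" + "TGG" + "AAAA"
windows = predicted_cut_windows(example)
covered = probe_covers_window(5, 20, windows[0])
```

In practice, probe placement would also depend on melting temperature and amplicon constraints, which this sketch does not model.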
As another example, the genome editing reagent can be a pair of guide RNAs targeted to sites adjacent to PAM sequences on opposite strands of the target genomic region, each guide RNA complexed with a nuclease defective, or dead, Cas9 nuclease (dCas9) that is fused to a monomer of an obligate heterodimer of a type IIS restriction nuclease (e.g., FokI). In such cases, the cut site is generally from 12 to 21 base pairs between the adjacent PAM sequences on the opposite strands of the targeted genomic region. Thus, the NHEJ drop-off probe can be designed to hybridize to a sub-region containing a predicted cut site from 12 to 21 base pairs between the adjacent PAM sequences. Similar rules can be utilized to design NHEJ drop-off probes for other genome editing reagents.
In some embodiments, the NHEJ drop-off probe is sensitive to (i.e., detects) both HDR mutations and NHEJ mutations. For example, the genome-editing reagent can include an exogenous HDR template nucleic acid. The template nucleic acid can be used as a template to repair a region encompassing, or within, the double strand breaks introduced by the genome-editing reagent. Thus, any mutations present in the HDR template nucleic acid relative to the wildtype genome will be introduced. When the HDR site is proximal to, or at, the target cut site, the NHEJ drop-off probe can hybridize to both potential NHEJ edit sites and the potential HDR edit site. In such cases, the NHEJ drop-off probe can detect both HDR mutations and NHEJ mutations by failing to hybridize to target genomic regions containing such mutations. In some embodiments, NHEJ mutations and HDR mutations are distinguished by including an HDR probe in each probe set. If the NHEJ drop-off probe detects a mutation (HDR or NHEJ), and the HDR probe does not, then the mutation can be classified as NHEJ. Conversely, if the NHEJ probe detects a mutation and the HDR probe also detects a mutation, then the mutation can be classified as HDR.
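The classification logic described above can be sketched, purely for illustration, as a small Python function. The sketch assumes that a positive reference signal marks the presence of the amplicon, that the NHEJ drop-off probe "detects" a mutation by losing its signal, and that the HDR probe gives a positive signal on the HDR-edited sequence; the function name and the category labels are hypothetical. Partitions containing more than one template molecule are only flagged, not resolved.

```python
def classify_partition(reference_pos: bool, dropoff_pos: bool, hdr_pos: bool) -> str:
    """Classify one partition from its three probe signals (illustrative sketch)."""
    if not reference_pos:
        return "no_target"   # no amplicon from this locus detected
    if dropoff_pos and hdr_pos:
        return "mixed"       # likely wildtype and HDR templates in the same partition
    if dropoff_pos:
        return "wildtype"    # drop-off probe hybridized: no edit at the cut site
    # Drop-off signal lost: a mutation (HDR or NHEJ) is present at the cut site.
    return "HDR" if hdr_pos else "NHEJ"


# Enumerate the signal combinations for a partition that contains the amplicon.
calls = {
    (dropoff, hdr): classify_partition(True, dropoff, hdr)
    for dropoff in (True, False)
    for hdr in (True, False)
}
```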
In some embodiments, the site-specific genome-editing reagent induces double-strand breaks in DNA within the cells. In some embodiments, the site-specific genome-editing reagent comprises a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a Cas protein, a Cre recombinase, a Hin recombinase, or a Flp recombinase. In some embodiments, the site-specific genome-editing reagent comprises a fusion protein that combines homing endonucleases with the modular DNA binding domains of TALENs (megaTAL). For example, megaTAL may be delivered as a protein or alternatively, an mRNA encoding a megaTAL protein is delivered to the cells. In some embodiments, the site-specific genome-editing reagent comprises one or more RNA molecules, such as a sgRNA, a crRNA, or a crRNA and a tracrRNA. In some embodiments, the site-specific genome-editing reagent is a ribonucleoprotein (RNP), and the RNP comprises a Cas protein and a sgRNA or a crRNA and a tracrRNA.
Non-limiting descriptions relating to gene editing (including HDR repair templates) using the CRISPR-Cas system are discussed in Ran et al. (2013) Nat Protoc. 2013 November; 8(11): 2281-2308, the entire content of which is incorporated herein by reference. Embodiments involving repair templates are not limited to those comprising the CRISPR-Cas system. Various aspects of the CRISPR-Cas system are known in the art. Non-limiting aspects of this system are described, e.g., in U.S. Pat. No. 9,023,649, issued May 5, 2015; U.S. Pat. No. 9,074,199, issued Jul. 7, 2015; U.S. Pat. No. 8,697,359, issued Apr. 15, 2014; U.S. Pat. No. 8,932,814, issued Jan. 13, 2015; PCT International Patent Application Publication No. WO 2015/071474, published Aug. 27, 2015; Cho et al., (2013) Nature Biotechnology Vol 31 No 3 pp 230-232 (including supplementary information); and Jinek et al., (2012) Science Vol 337 No 6096 pp 816-821, the entire contents of each of which are incorporated herein by reference.
Non-limiting examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, homologs thereof, or modified versions thereof. These enzymes are known; for example, the amino acid sequence of S. pyogenes Cas9 protein may be found in the SwissProt database under accession number Q99ZW2 and in the NCBI database under accession number Q99ZW2.1. UniProt database accession numbers A0A0G4DEU5 and CDJ55032 provide another example of a Cas9 protein amino acid sequence. Another non-limiting example is a Streptococcus thermophilus Cas9 protein, the amino acid sequence of which may be found in the UniProt database under accession number Q03JI6.1. In some embodiments, the unmodified CRISPR enzyme has DNA cleavage activity, such as Cas9. In certain embodiments, the CRISPR enzyme is Cas9, and may be Cas9 from S. pyogenes or S. pneumoniae. In various embodiments, the CRISPR enzyme directs cleavage of both strands at the location of a target sequence.
In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. In some embodiments, the degree of complementarity is 100%.
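For illustration, a degree of complementarity can be computed as in the following Python sketch. It is a simplification: it assumes the guide and the target strand are of equal length, both written 5′ to 3′, and paired antiparallel without gaps, whereas the text above contemplates optimal alignment with a suitable alignment algorithm. The function name and example sequences are hypothetical.

```python
# Watson-Crick partner of each guide base, expressed in the DNA alphabet.
_PARTNER = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide bases that pair with the target in an ungapped antiparallel duplex."""
    guide = guide.upper()
    target = target.upper().replace("U", "T")
    if not guide or len(guide) != len(target):
        raise ValueError("guide and target must be non-empty and of equal length")
    # Base i of the guide (5'->3') pairs with base n-1-i of the target (5'->3').
    paired = sum(_PARTNER[g] == t for g, t in zip(guide, reversed(target)))
    return 100.0 * paired / len(guide)


full = percent_complementarity("ACGU", "ACGT")     # every position pairs
partial = percent_complementarity("ACGU", "ACGA")  # one mismatch out of four
```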
Gene editing nucleases, including ZFN, have been described in Bhakta, M. et al., Genome Research 23:530-538; 2013, and Beerli, R. et al., Proc. Natl. Acad. Sci v. 95 pp 14628-14633; 1998. TAL has been described in Cermak, T. et al., Nucleic Acids Research 2011, v. 39, no. 12, Miller, J. et al., Nature Biotechnology vol. 29 no. 2; 2011, Christian, M. et al., Genetics 186:757-761; 2010, Deng, D. et al., Science 2012: v. 335 p. 720, and Boch, J. et al., Science 2009: v. 326 p. 1509, the entire content of each of which is incorporated herein by reference. Additionally, Cre has been described in Chevalier, B. et al., Nucleic Acids Research 2001, v. 29 no. 18, the entire content of which is incorporated herein by reference. MegaTAL has been described in Sather, B. et al., Sci Transl Med 7(307) 2015, Ibarra, G. et al., Molecular Therapy-Nucleic Acids (2016) 5, e352, Osborn, M. et al., Molecular Therapy v. 24 no. 3, 570-581 (2016); Wang, Y. et al., Nucleic Acid Research 2014; v. 42, 6463-6475; and Gaj, T. et al., Cold Spring Harbor Perspectives in Biology 2015, each of which is incorporated herein by reference.
In some embodiments, the cells are primary cells, cell lines, or immortalized cells. For example, the cells may include mesenchymal stem cells (MSCs), lung cells, neuronal cells, fibroblasts, human umbilical vein (HUVEC) cells, human embryonic kidney (HEK) cells, primary or immortalized hematopoietic stem cells (HSCs), T cells, natural killer (NK) cells, cytokine-induced killer (CIK) cells, human cord blood CD34+ cells, and B cells. Non-limiting examples of T cells may include CD8+ or CD4+ T cells. In some aspects, the CD8+ subpopulation of the CD3+ T cells is used. CD8+ T cells may be purified from the PBMC population by positive isolation using anti-CD8 beads. In some embodiments, primary NK cells are isolated from PBMCs, or NK cell lines, e.g., NK92, may be used. Cell types also include cells that have previously been modified, for example T cells, NK cells and MSCs, to enhance their therapeutic efficacy. For example: T cells or NK cells that express chimeric antigen receptors (CAR T cells, CAR NK cells, respectively); T cells that express a modified T cell receptor (TCR); or engineered MSCs.
Also provided are apparatus, devices, systems, compositions, kits, and articles of manufacture for any one of the methods of quantification, multiplex dPCR assays (e.g., multiplex drop-off dPCR assays), methods of treatment, and methods of diagnosis described herein.
Input device 1020 can be any suitable device that provides input, such as a touch screen, keyboard or keypad, mouse, or voice-recognition device. Output device 1030 can be any suitable device that provides output, such as a touch screen, haptics device, or speaker.
Storage 1040 can be any suitable device that provides storage, such as an electrical, magnetic or optical memory including a RAM, cache, hard drive, or removable storage disk. Communication device 1060 can include any suitable device capable of transmitting and receiving signals over a network, such as a network interface chip or device. The components of the computer can be connected in any suitable manner, such as via a physical bus or wirelessly.
Software 1050, which can be stored in storage 1040 and executed by processor 1010, can include, for example, the programming that embodies the functionality of the present disclosure (e.g., as embodied in the devices as described above).
Software 1050 can also be stored and/or transported within any non-transitory computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as those described above, that can fetch instructions associated with the software from the instruction execution system, apparatus, or device and execute the instructions. In the context of this disclosure, a computer-readable storage medium can be any medium, such as storage 1040, that can contain or store programming for use by or in connection with an instruction execution system, apparatus, or device.
Software 1050 can also be propagated within any transport medium for use by or in connection with an instruction execution system, apparatus, or device, such as those described above, that can fetch instructions associated with the software from the instruction execution system, apparatus, or device and execute the instructions. In the context of this disclosure, a transport medium can be any medium that can communicate, propagate or transport programming for use by or in connection with an instruction execution system, apparatus, or device. The transport readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic or infrared wired or wireless propagation medium.
Device 1000 may be connected to a network, which can be any suitable type of interconnected communication system. The network can implement any suitable communications protocol and can be secured by any suitable security protocol. The network can comprise network links of any suitable arrangement that can implement the transmission and reception of network signals, such as wireless network connections, T1 or T3 lines, cable networks, DSL, or telephone lines.
Device 1000 can implement any operating system suitable for operating on the network. Software 1050 can be written in any suitable programming language, such as C, C++, Java or Python. In various embodiments, application software embodying the functionality of the present disclosure can be deployed in different configurations, such as in a client/server arrangement or through a Web browser as a Web-based application or Web service, for example.
The exemplary embodiments and examples below are intended to be purely exemplary of the invention and should therefore not be considered to limit the invention in any way. The following exemplary embodiments, examples and detailed description are offered by way of illustration and not by way of limitation.
The invention provides the following embodiments:
In clinical settings, a set of predictive genetic markers is routinely monitored for diagnosis and to track therapy efficacy. For example, in non-small cell lung cancer the presence of deletions in the epidermal growth factor receptor (EGFR) exon 19 confers sensitivity to first generation tyrosine kinase inhibitors. Moreover, in colorectal carcinoma, KRAS and NRAS proto-oncogene mutations are strong indicators of resistance to anti-EGFR antibodies.
A triplex drop-off dPCR assay was developed to detect wildtype and mutant sequences at the G13 locus (e.g., G13D) of KRAS, the Q61 locus (e.g., Q61K) of NRAS and the E19 locus (e.g., E19 deletion) of EGFR. Previously, drop-off dPCR assays have been developed to detect individual pairs of wildtype and mutant sequences at each genetic locus in individual assays. The ability to multiplex three mutation hotspots in a single dPCR assay greatly facilitates detection of predictive biomarkers such as KRAS, NRAS and EGFR in samples from cancer patients. The triplex drop-off assay may be used on any dPCR system with at least three fluorescence channels, such as the NAICA™ System (Stilla Technologies).
A DNA sample was prepared by mixing three mutant DNA species (i.e., G13D KRAS, Q61K NRAS and E19 deletion EGFR) at different concentrations with wildtype DNA. The sample was subjected to the triplex drop-off dPCR assay with primers, reference and drop-off probes as shown in TABLE 12. The experiment was carried out according to the methods described below.
The following materials were used:
The following reagents were used:
Primers and probes were designed using three distinct fluorophores that can be detected using three channels of detection. Three drop-off assays, each comprising a forward and a reverse primer and a probe pair consisting of a reference probe and a drop-off probe, were designed. Each drop-off assay targeted a genetic region (“target genetic locus”) known to host genetic alterations, including single mutations, insertions, and deletions. The primers were designed to generate an amplicon as short as possible so that the assay would be suitable for the analysis of fragmented DNA template. Both reference probes and drop-off probes were designed to anneal within the region delimited by the forward and the reverse primer. The reference probe was designed to anneal to amplicons with either wildtype or mutant sequences at the target genetic locus. For example, the reference probe was designed to anneal to a sequence upstream or downstream of the target genetic locus, and such sequence did not include a single nucleotide polymorphism (SNP) site. The drop-off probe was designed to anneal only to the wildtype sequence at the target genetic locus. The drop-off probe annealing site includes the region where genetic alterations of interest occur. The reference probes and the drop-off probes could additionally contain modified nucleotides (e.g., locked nucleic acids) that increase the specificity of the probes.
Each reference probe was labeled with a reference fluorophore. Each drop-off probe was labeled with a drop-off fluorophore that was distinct from the reference fluorophore in the same probe pair. The three reference probes were designed to have distinct fluorophores. Each combination of reference/drop-off fluorophores corresponding to each probe pair was unique, and the set of reference fluorophores and the set of drop-off fluorophores were designed to be circular permutations with respect to each other. For example, the three probe pairs were labeled with fluorophores in the following manner:
General considerations: Special caution was exercised to prevent DNA carry-over contamination, which could lead to false positives. All reagents and plastic consumables were sterile, DNA- and DNAse-free, and of molecular biology grade. Gloves were worn while handling reagents, materials, and equipment. Commercially available DNA decontamination solutions were used to clean all surfaces dedicated to protocol handling. Areas dedicated to sample extraction, digital PCR mixture assembly, and digital PCR amplification were separated. As PCR products are the leading cause of contamination, reagents and consumables brought into the post-amplification area must not be reintroduced into the pre-amplification areas.
DNA template preparation, quantification and quality monitoring: The DNA template was prepared as follows. High quality DNA or cDNA templates suitable for PCR amplification were obtained using a standard phenol/chloroform extraction method or commercial extraction kits. The DNA quantity and quality obtained were assessed prior to digital PCR amplification and control samples were first assessed to ensure compatibility with the NAICA™ System. High molecular weight DNA was fragmented using sonication or restriction enzyme digestion. The DNA templates were quantified using a NANODROP™ ND-3300 Spectrophotometer, a QUBIT™ fluorometer or using real-time PCR quantification. DNA quantities within the detection range of the NAICA™ System were used: e.g., 0.0165 to 1650 ng of human DNA per digital PCR reaction for the sapphire chip (equivalent to 5 to 500 000 copies of DNA per 25 μL reaction). A DNA template is deemed to be of high quality when a 260 nm/280 nm absorbance ratio of ˜1.8-2 and a 260 nm/230 nm absorbance ratio in the range of 2.0-2.2 are obtained.
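The input-range and quality criteria above can be checked with a short Python sketch such as the following. The conversion factor of 3.3 pg per haploid human genome copy is an assumption inferred from the stated equivalence (0.0165 ng corresponding to 5 copies); the function names are hypothetical, and the absorbance thresholds are taken literally from the text even though the 260/280 range is given there as approximate.

```python
# Assumed mass of one haploid human genome copy, inferred from 0.0165 ng ~ 5 copies.
PG_PER_HUMAN_GENOME_COPY = 3.3


def ng_to_copies(nanograms: float) -> float:
    """Convert a mass of human genomic DNA (ng) to an approximate copy number."""
    return nanograms * 1000.0 / PG_PER_HUMAN_GENOME_COPY


def within_input_range(nanograms: float, low: float = 0.0165, high: float = 1650.0) -> bool:
    """True if the DNA input lies within the stated per-reaction range for the chip."""
    return low <= nanograms <= high


def is_high_quality(a260_280: float, a260_230: float) -> bool:
    """Apply the stated absorbance-ratio criteria for a high quality template."""
    return 1.8 <= a260_280 <= 2.0 and 2.0 <= a260_230 <= 2.2


copies_low = ng_to_copies(0.0165)
copies_high = ng_to_copies(1650.0)
sample_ok = within_input_range(10.0) and is_high_quality(1.85, 2.1)
```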
Set-up of dPCR reactions: The digital PCR reaction was set up as follows. Reagents were thawed on ice and mixed thoroughly before use. In the pre-PCR “clean room”, the primer pool was prepared by mixing all of the forward and the reverse primers at 100 μM in a stock solution to obtain 20 μM working solutions. Individual 20 μM working solutions for each probe were prepared, along with a 1 μM fluorescein working solution. In a clean 1.5 mL microcentrifuge tube, the digital PCR reaction mix was prepared by assembling the reagents (5X PERFECTA™ ToughMix, Fluorescein, Primers, Probes) in a 25 μL final volume using the following final concentrations: 1X PERFECTA™ ToughMix, 100 nM fluorescein, 500-1000 nM each primer pair and 250-750 nM each probe. The digital PCR reaction mix was dispensed in separate microcentrifuge tubes and the lids were closed. Opening only one tube at a time to avoid cross-contamination, the DNA template was added individually to each reaction tube. The lids were closed and the tubes were vortexed thoroughly. The microcentrifuge tubes were briefly centrifuged to collect the entire volume at the bottom of each tube. The white caps of the inlet ports of the sapphire chip were removed and 25 μL of the reaction mixture was gently pipetted over the oil phase in each inlet port, being extremely careful not to introduce air bubbles. A tall PCR-ready white cap was placed on each inlet port. Care was taken to avoid preparing too many chips at a time to prevent evaporation of the oil contained within the chips. The chips were subsequently placed on the thermal plate of the NAICA™ Geode. The lid of the Geode was closed and the following thermocycling program was run:
Data acquisition and analysis of test samples: Following digital PCR amplification and Geode depressurization, the sapphire chips were placed in the NAICA™ 6-color prototype reader and the reading program was launched. After the reading step, the data was analyzed using the CRYSTAL™ Miner software. Briefly, the numbers of positive and negative droplets in each channel of detection corresponding to the labels in the reference and drop-off probes were counted. The mutant and wildtype target concentrations were calculated according to the formula in the “Embodiment Employing Three (3) Probe Pairs” section.
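As general background to how droplet counts translate into a concentration, the following Python sketch applies the standard single-target Poisson estimate used in digital PCR. It is not the multi-probe formula of the “Embodiment Employing Three (3) Probe Pairs” section; the function name, the droplet counts and the droplet volume are hypothetical values chosen only for illustration.

```python
import math


def poisson_concentration(n_negative: int, n_total: int, droplet_volume_ul: float) -> float:
    """Estimate target copies per microliter from the fraction of negative droplets.

    Under Poisson loading, the mean number of copies per droplet is
    -ln(n_negative / n_total); dividing by the droplet volume gives copies per microliter.
    """
    if n_total <= 0 or not 0 < n_negative <= n_total:
        raise ValueError("need 0 < n_negative <= n_total")
    if droplet_volume_ul <= 0:
        raise ValueError("droplet volume must be positive")
    copies_per_droplet = -math.log(n_negative / n_total)
    return copies_per_droplet / droplet_volume_ul


# Hypothetical counts: half of 10,000 droplets are negative, assumed 0.001 uL per droplet.
estimate = poisson_concentration(n_negative=5000, n_total=10000, droplet_volume_ul=0.001)
```

The estimate is undefined when no negative droplets remain, which is why the sketch rejects a zero count of negative droplets instead of returning a value.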
Number | Date | Country | Kind |
---|---|---|---|
19306765.9 | Dec 2019 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2020/087704 | 12/22/2020 | WO |