A Sequence Listing has been submitted in an ASCII text file named “19594” created on May 10, 2020, consisting of 34,802 bytes, the entire content of which is herein incorporated by reference.
The invention provides compositions and methods for accurately and specifically amplifying sequences to allow for accurate and specific detection of mutations, such as in disease-associated genes and alleles, thus distinguishing true nucleotide variants over random nucleotide sequencing errors. In one embodiment, this is accomplished, prior to sequencing, by using a combination of director and driver in a combination of two PCR reactions. Thus, in one embodiment, this is accomplished by first amplifying a nucleotide sequence of interest to introduce a director that creates a specific target for a second amplification in which a driver, which specifically hybridizes to the director, drives the directionality of the second amplification. The amplification produces an amplicon of a sense or antisense strand of a double-stranded nucleotide sequence of interest. The amplicon may optionally contain universal sequences and/or index sequences that facilitate subsequent sequencing of the amplicon, such as using sequencing-by-synthesis. The reagents of the first and second amplification steps may be combined in a single reaction mixture.
Genetic testing of disease relies on accurately and specifically detecting disease-associated mutations and/or disease-associated alleles, particularly hypervariable alleles, such as those in the HLA DQA and HLA DQB genes associated with celiac disease. Testing for celiac disease is one of the highest-volume assays, and it is costly and time-consuming, requiring a PCR, a gel, hybridization and washing of beads, and reading on instruments. Thus, even small cost savings in this assay would lead to large overall savings.
Current genetic testing for celiac disease can rule out the disease and can indicate that an individual is at risk of developing it, but cannot, alone, directly diagnose it.
Thus, there remains a need for compositions and methods for accurately and specifically amplifying sequences so that disease-associated mutations and/or disease-associated alleles may be accurately and specifically sequenced for disease detection.
The invention provides compositions and methods for accurately and specifically amplifying sequences to allow for accurate and specific detection of mutations, such as in disease-associated genes and alleles, thus distinguishing true nucleotide variants over random nucleotide sequencing errors. In one embodiment, this is accomplished, prior to sequencing, by using a combination of director and driver in a combination of two PCR reactions. Thus, in one embodiment, this is accomplished by first amplifying a nucleotide sequence of interest to introduce a director that creates a specific target for a second amplification in which a driver, which specifically hybridizes to the director's complementary strand product, drives the directionality of the second amplification. The amplification produces an amplicon of a sense or antisense strand of a double-stranded nucleotide sequence of interest. The amplicon may optionally contain universal sequences and/or index sequences that facilitate subsequent sequencing of the amplicon, such as using sequencing-by-synthesis. The reagents of the first and second amplification steps may be combined in a single reaction mixture.
Thus, in one embodiment, the invention provides a method for amplifying a target polynucleotide sequence, comprising A) contacting a double-stranded target polynucleotide sequence with i) a first primer modified by having an insertion of a first director, and ii) a second primer, said contacting is under conditions sufficient for amplifying said double-stranded target polynucleotide sequence to produce a first plurality of amplicons comprising a first single-stranded amplicon that comprises at least one single strand of said double-stranded target polynucleotide sequence, said at least one single strand having an insertion of said first director in either its 3′ or 5′ terminal regions, and B) contacting said at least one single strand with i) a third primer fused at its 3′ end to a first driver, and ii) a fourth primer, wherein said first driver has the same sequence as said first director, wherein said contacting is under conditions sufficient for amplifying said at least one single strand produced in step A) to produce a second plurality of amplicons comprising a second single-stranded amplicon having at its 3′ end at least one single strand containing an insertion of said first director in either its 3′ or 5′ terminal regions. In one embodiment, the 5′ end of said first driver is fused to a universal adapter, and wherein said second single-stranded amplicon comprises said universal adapter sequence fused to its 5′ end.
Thus, in another embodiment, the invention provides a method for amplifying a target polynucleotide sequence, comprising A) contacting a double-stranded target polynucleotide sequence with i) a first primer modified by having an insertion of a first director, and ii) a second primer modified by having an insertion of a second director, said contacting is under conditions sufficient for amplifying said double-stranded target polynucleotide sequence to produce a first plurality of amplicons comprising a first single-stranded amplicon that comprises at least one single strand of said double-stranded target polynucleotide sequence, said at least one single strand having an insertion of said first director in either its 3′ or 5′ terminal regions and an insertion of said complement of said second director in its other terminal region, and B) contacting said at least one single strand with i) a third primer fused at its 3′ end to a first driver, and ii) a fourth primer fused at its 3′ end to a second driver, wherein said first driver has the same sequence as said first director, and wherein said second driver has the same sequence as said second director, and wherein said contacting is under conditions sufficient for amplifying said at least one single strand produced in step A) to produce a second plurality of amplicons comprising a second single-stranded amplicon having at its 3′ end said at least one single strand containing an insertion of said first director in either its 3′ or 5′ terminal regions, and containing an insertion of said second director in its other terminal region. In one embodiment, the 5′ end of said first driver is fused to a universal adapter, or the 5′ end of said second driver is fused to a complement of said universal adapter sequence, and wherein said second single-stranded amplicon comprises said universal adapter sequence fused to its 5′ end. In one embodiment, said contacting of steps A) and B) is in a single reaction mixture. 
In one embodiment, said first primer and said third primer are forward primers, and said second primer and said fourth primer are reverse primers. In one embodiment, said first primer and said third primer are reverse primers, and said second primer and said fourth primer are forward primers. In one embodiment, said second single-stranded amplicon contains fused in operable combination from the 5′ end to the 3′ end, a) said universal adapter sequence, b) an insertion of said first director in either its 3′ or 5′ terminal regions, c) said nucleotide sequence of interest, d) an insertion of said second director in its other terminal region, and e) an optional unique index sequence. In another embodiment, said second single-stranded amplicon contains fused in operable combination from the 5′ end to the 3′ end, a) an optional unique index sequence, b) an insertion of said first director in either its 3′ or 5′ terminal regions, c) said nucleotide sequence of interest, d) an insertion of said second director in its other terminal region, and e) said universal adapter sequence. In one embodiment, the 5′ end of said first driver of said third primer is fused to the complement of a unique index sequence and the 5′ end of said second driver of said fourth primer is fused to said universal adapter sequence. In one embodiment, said second single-stranded amplicon is fused at its 3′ end to said unique index sequence, and fused at its 5′ end to said universal adapter sequence. In one embodiment, the 5′ end of said first driver of said third primer is fused to the complement of said universal adapter sequence and the 5′ end of said second driver of said fourth primer is fused to a unique index sequence. In one embodiment, said second single-stranded amplicon is fused at its 5′ end to said unique index sequence, and fused at its 3′ end to said universal adapter sequence. 
In one embodiment, said second single-stranded amplicon contains an insertion in its 5′ terminal region of a sense strand of said unique index sequence, and an insertion in its 3′ terminal region of a sense strand of said universal adapter sequence. In one embodiment, the method further comprises sequencing said single-stranded amplicon comprised in said second plurality of amplicons. In one embodiment, said conditions in steps A) and B) are sufficient to produce said second single-stranded amplicon at a higher efficiency than in the absence of one or more of said director, said first driver, and said second driver. In one embodiment, said conditions in steps A) and/or B) comprise denaturing said double-stranded target polynucleotide sequence into a single-stranded sense sequence and single-stranded antisense sequence. The contacting step need not be with a double-stranded target polynucleotide sequence, but may be with a single-stranded target derived from a double-stranded target polynucleotide sequence. In one embodiment, said conditions sufficient for amplifying said double-stranded target polynucleotide sequence to produce a first plurality of amplicons comprise one or more of polymerase chain reaction (PCR) amplification, isothermal amplifications, denaturing said double-stranded target polynucleotide sequence, annealing one or more of said first and second primers to said double-stranded target polynucleotide sequence, and extending the annealed primer. In one embodiment, said conditions sufficient for amplifying said at least one single strand produced in step A) to produce a second plurality of amplicons comprise one or more of polymerase chain reaction (PCR) amplification, isothermal amplifications, denaturing said double-stranded target polynucleotide sequence, annealing one or more of said third and fourth primers to said at least one single strand produced in step A), and extending the annealed primer. 
In one embodiment, the concentration of one or both of said first primer and said first primer is 25% or less than the concentration of one or both of said second primer and said second primer. In one embodiment, said amplifying of one or both of steps A) and B) comprises 50 or fewer PCR cycles. In one embodiment, said target polynucleotide sequence comprises genomic DNA. In one embodiment, said genomic DNA comprises a variable sequence of an allele. In one embodiment, said second single-stranded amplicon is an amplified sense strand of said double-stranded target polynucleotide sequence or an amplified antisense strand of said double-stranded target polynucleotide sequence.
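By way of illustration only, the two amplification steps described above may be modeled on plain character strings, as in the following Python sketch. The sketch is not the claimed method. The function names, the example sequences, the position of the director within the first primer, and the use of exact string matching in place of hybridization are all assumptions made for the illustration. It shows a first amplification that introduces a director into the sense strand, and a second amplification in which a driver having the same sequence as the director, fused at its 5′ end to a placeholder adapter, copies only strands that carry the director.

```python
"""Toy string model of the director/driver two-step amplification (illustrative only)."""

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def first_amplification(sense: str, site_len: int, director: str, offset: int) -> str:
    """Step A: copy the sense strand with a first primer carrying a director insertion.

    Assumption: the first primer matches the first `site_len` bases of the sense
    strand, with the director inserted `offset` bases from the primer's 5' end.
    """
    core = sense[:site_len]
    first_primer = core[:offset] + director + core[offset:]
    return first_primer + sense[site_len:]


def second_amplification(amplicon: str, driver: str, adapter: str) -> str:
    """Step B: prime with adapter + driver, where the driver equals the director.

    The driver anneals to the director's complement on the opposite strand, so
    only strands that acquired the director in step A are copied.
    """
    antisense = revcomp(amplicon)
    if antisense.count(revcomp(driver)) != 1:
        raise ValueError("driver needs exactly one annealing site in this toy model")
    start = amplicon.index(driver)
    third_primer = adapter + driver  # the driver sits at the 3' end of the primer
    return third_primer + amplicon[start + len(driver):]


# Hypothetical example sequences, chosen only so the director is absent from the target.
SENSE = "ACGTTGCAAGGCTTACCGGATCAAGTCCGTA"
DIRECTOR = "GAGAG"
ADAPTER = "TTCCTTCC"  # placeholder, not a real universal adapter sequence

step_a = first_amplification(SENSE, site_len=10, director=DIRECTOR, offset=6)
step_b = second_amplification(step_a, driver=DIRECTOR, adapter=ADAPTER)
print(step_a)
print(step_b)
```

In this model, passing the unmodified sense strand to `second_amplification` raises an error, which reflects the idea that the driver has no annealing site on a strand lacking the director.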
The invention also provides a method for amplifying a target polynucleotide sequence, comprising contacting i) a sample comprising a plurality of double-stranded target polynucleotide sequences comprising a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and comprising a first portion and a second portion, ii) first primer comprising a first sequence that is complementary to said first portion of said first single-stranded polynucleotide sequence, said first sequence is modified by having an insertion of a director, wherein said director is not complementary either to said first portion of said first single-stranded polynucleotide sequence or to said second portion of said second single-stranded polynucleotide sequence, iii) second primer comprising a second sequence that is complementary to said second portion of said second single-stranded polynucleotide sequence, said second sequence is modified by having an insertion of a second director, wherein neither the first nor second said directors is complementary either to said first portion of said first single-stranded polynucleotide sequence or to said second portion of said second single-stranded polynucleotide sequence, iv) third primer fused at its 3′ end to a first driver, and v) fourth primer fused at its 3′ end to a second driver, wherein said first driver has the same sequence as said first director and said second driver has the same sequence as said second director, and wherein said contacting is under conditions sufficient for hybridizing said first driver with said complement of said first director and second driver with said complement of said second director, and for amplifying said plurality of target polynucleotide sequences to produce a) a first plurality of amplicons comprising a first single-stranded amplicon that comprises i) said first single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal 
regions and an insertion of said second director in its other terminal region, or ii) said second single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal regions and an insertion of said second director in its other terminal region, and b) a second plurality of amplicons comprising a second single-stranded amplicon having at its 3′ end either i) said first single-stranded polynucleotide sequence containing an insertion of said director in either its 3′ or 5′ terminal regions, and containing an insertion of said second director in its other terminal region, or ii) said second single-stranded polynucleotide sequence containing an insertion of said director in either its 3′ or 5′ terminal regions, and containing an insertion of said second director in its other terminal region. In one embodiment, the 5′ end of either said first driver or said second driver is fused to a universal adapter sequence. In one embodiment, said second single-stranded amplicon is an amplified sense strand of said double-stranded target polynucleotide sequence or an amplified antisense strand of said double-stranded target polynucleotide sequence. In one embodiment, said conditions comprise denaturing said double-stranded target polynucleotide sequence into a single-stranded sense sequence and single-stranded antisense sequence. The contacting step need not be with a double-stranded target polynucleotide sequence, but may be with a single-stranded target derived from a double-stranded target polynucleotide sequence. In one embodiment, said conditions sufficient for hybridizing said first driver with the complement of said first director and said second driver with the complement of said second director comprise using a lower temperature than a temperature used to denature said double-stranded target polynucleotide sequence. 
In one embodiment, said conditions sufficient for amplifying said plurality of target polynucleotide sequences comprise one or more of polymerase chain reaction (PCR) amplification, isothermal amplifications, denaturing said double-stranded target polynucleotide sequence, annealing one or more of said first, second, third and fourth primers to said double-stranded target polynucleotide sequence, and extending the annealed primer. In one embodiment, said contacting of steps c) and e) is in a single reaction mixture. In one embodiment, said first primer and said third primer are forward primers, and said second primer and said fourth primer are reverse primers. In one embodiment, said first primer and said third primer are reverse primers, and said second primer and said fourth primer are forward primers. In one embodiment, said second single-stranded amplicon contains, fused in operable combination from the 5′ end to the 3′ end, universal adapter sequence, and said first single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal regions and an insertion of said complement of said director in its other terminal region, or said second single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal regions and an insertion of said complement of said director in its other terminal region. In one embodiment, the 5′ end of said first driver of said third primer is fused to a unique index sequence, and the 5′ end of said second driver of said fourth primer is fused to universal adapter sequence. In one embodiment, said second single-stranded amplicon is fused at its 3′ end to said unique index sequence, and fused at its 5′ end to universal adapter sequence.
In one embodiment, said second single-stranded amplicon contains an insertion in its 3′ terminal region of a sense strand of said unique index sequence, and an insertion in its 5′ terminal region of a sense strand of universal adapter sequence. In one embodiment, said second single-stranded amplicon contains, fused in operable combination from the 5′ end to the 3′ end, i) universal adapter sequence, ii) said first single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal regions and an insertion of said complement of said director in its other terminal region, or said second single-stranded polynucleotide sequence having an insertion of said director in either its 3′ or 5′ terminal regions and an insertion of said complement of said director in its other terminal region, and iii) said unique index sequence. In one embodiment, said unique index sequence is comprised in an index adapter. In one embodiment, the 5′ end of said first driver of said third primer is fused to universal adapter sequence, and the 5′ end of said second driver of said fourth primer is fused to a unique index sequence. In one embodiment, said second single-stranded amplicon is fused at its 5′ end to said unique index sequence, and fused at its 3′ end to universal adapter sequence. In one embodiment, said second single-stranded amplicon contains an insertion in its 5′ terminal region of a sense strand of said unique index sequence, and an insertion in its 3′ terminal region of a sense strand of universal adapter sequence. In one embodiment, said unique index sequence is comprised in an index adapter.
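For illustration only, the 5′ to 3′ arrangement of universal adapter sequence, director-bearing target sequence, and unique index sequence described above may be sketched in Python as follows. The function names, the sample names, and all sequences are hypothetical placeholders. The sketch assumes error-free reads and exact matching, which real sequencing data would not satisfy. It assembles an amplicon in the stated order and then recovers the sample identity from the index at the 3′ end.

```python
from typing import Dict, Tuple


def assemble_amplicon(adapter: str, insert: str, index: str) -> str:
    """Join the elements 5'->3': universal adapter, director-bearing insert, index."""
    return adapter + insert + index


def split_amplicon(read: str, adapter: str, indices: Dict[str, str]) -> Tuple[str, str]:
    """Return (sample name, insert) for a read laid out as adapter + insert + index."""
    if not read.startswith(adapter):
        raise ValueError("read does not begin with the universal adapter")
    for sample, index in indices.items():
        if read.endswith(index):
            return sample, read[len(adapter):len(read) - len(index)]
    raise ValueError("no known index at the 3' end of the read")


# Hypothetical placeholder sequences, not real adapter or index sequences.
ADAPTER = "TTCCTTCC"
INDICES = {"sample_1": "ACGTAC", "sample_2": "TGCATG"}
INSERT = "GAGAGCAAGGCTTACC"

read = assemble_amplicon(ADAPTER, INSERT, INDICES["sample_2"])
print(split_amplicon(read, ADAPTER, INDICES))
```

The reversed arrangement, with the index at the 5′ end and the adapter at the 3′ end, could be modeled by swapping the two end checks.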
The invention additionally provides a method for amplifying a sense strand of a target polynucleotide sequence, comprising, contacting i) a sample comprising a plurality of target polynucleotide sequences comprising sense and antisense strands, ii) first forward primer comprising a first sequence that is complementary to the sense strand of a first portion of said target polynucleotide sequences, said first sequence is modified by having an insertion of a director, iii) first reverse primer comprising a second sequence that is complementary to the antisense strand of a second portion of said target polynucleotide sequences, said second sequence is modified by having an insertion of a second director, wherein neither first nor second director is complementary either to said first portion or to said second portion of said target polynucleotide sequences, iv) second forward primer fused at its 3′ end to a first driver, v) second reverse primer fused at its 3′ end to a second driver, wherein one of said first driver and said second driver has the same sequence as said director, and the other of said first driver and said second driver has the same sequence as the second director, and wherein said contacting is under conditions sufficient for a) hybridizing said plurality of target polynucleotide sequences with the complement of said first forward primer and said first reverse primer, b) amplifying said plurality of target polynucleotide sequences to produce a first plurality of amplicons comprising a first sense strand amplicon comprising said target polynucleotide sequences having an insertion of said forward strand director in its 5′ terminal region and having an insertion of complement to said reverse strand director in its 3′ terminal region, c) contacting said second forward primer with said first sense strand amplicon, d) hybridizing said reverse strand driver with the complement of said reverse strand director of said first sense strand amplicon, e) contacting 
said second reverse primer with said first antisense strand amplicon, f) hybridizing said forward strand driver with said complement to said forward strand director of said first antisense strand amplicon, and g) amplifying said first sense strand amplicon and said first antisense strand amplicon to produce a second plurality of amplicons comprising a single-stranded amplicon that contains, fused in operable combination from the 5′ end to the 3′ end, said sense strand universal adapter sequence, and said target polynucleotide sequence having an insertion of said forward strand director in its 5′ terminal region and having an insertion of said complement to said reverse strand director in its 3′ terminal region. In one embodiment, the 5′ end of either said first driver or said second driver is fused to a universal adapter sequence. In one embodiment, said contacting of steps c) and e) is in a single reaction mixture. In one embodiment, said second forward primer comprises a complement to a unique index sequence fused at its 3′ end to said driver, and wherein said single-stranded amplicon contains at its 3′ end a sense strand of said unique index sequence.
The invention also provides a reaction mixture for amplifying a double-stranded target polynucleotide sequence that contains a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and contains a first portion and a second portion, said reaction mixture comprising a) first primer comprising a first sequence that is complementary to said first portion of said first single-stranded polynucleotide sequence, said first sequence is modified by having an insertion of a director, b) second primer comprising a second sequence that is complementary to said second portion of said second single-stranded polynucleotide sequence, said second sequence is modified by having an insertion of a complement of a second director, wherein neither said first director nor said second director is complementary either to said first portion of said first single-stranded polynucleotide sequence or to said second portion of said second single-stranded polynucleotide sequence, c) third primer fused at its 3′ end to a first driver, and d) fourth primer fused at its 3′ end to a second driver, wherein said first driver has the same sequence as said first director, and said second driver has the same sequence as said second director. In one embodiment, said reaction mixture comprises said first sequence hybridized along a portion of its length to said first portion of said first single-stranded polynucleotide sequence, and said second sequence hybridized along a portion of its length to said second portion of said second single-stranded polynucleotide sequence. In one embodiment, said reaction mixture comprises said first driver hybridized to the complement of said first director, and said second driver hybridized to said complement of said second director.
In one embodiment, said reaction mixture comprises one or both of the 5′ end of said first driver of said third primer is fused to a unique index sequence, and the 5′ end of said second driver of said fourth primer is fused to said universal adapter sequence. In one embodiment, the 5′ end of said first driver of said third primer is fused to a unique index sequence. In one embodiment, the 5′ end of said second driver of said fourth primer is fused to said universal adapter sequence. In one embodiment, the 5′ end of said first driver of said third primer is fused to said universal adapter sequence. In one embodiment, the 5′ end of said second driver of said fourth primer is fused to a unique index sequence.
The invention further provides a reaction mixture for amplifying a double-stranded target polynucleotide sequence that contains a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and contains a first portion and a second portion, said reaction mixture comprising a) first primer comprising a first sequence that is complementary to said first portion of said first single-stranded polynucleotide sequence, said first sequence is modified by having an insertion of a director, and said first sequence hybridized along a portion of its length to said first portion of said first single-stranded polynucleotide sequence, b) second primer comprising a second sequence that is complementary to said second portion of said second single-stranded polynucleotide sequence, said second sequence is modified by having an insertion of a second director, wherein neither said first director nor said second director is complementary either to said first portion of said first single-stranded polynucleotide sequence or to said second portion of said second single-stranded polynucleotide sequence, and said second sequence hybridized along a portion of its length to said second portion of said second single-stranded polynucleotide sequence, c) third primer fused at its 3′ end to a first driver, and d) fourth primer fused at its 3′ end to a second driver, wherein said first driver has the same sequence as said first director, and said second driver has the same sequence as said second director.
Also provided herein is a reaction mixture for amplifying a double-stranded target polynucleotide sequence that contains a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and contains a first portion and a second portion, said reaction mixture comprising a) first primer comprising a first sequence that is complementary to said first portion of said first single-stranded polynucleotide sequence, said first sequence is modified by having an insertion of a director, b) second primer comprising a second sequence that is complementary to said second portion of said second single-stranded polynucleotide sequence, said second sequence is modified by having an insertion of a second director, wherein neither said first director nor said second director is complementary either to said first portion of said first single-stranded polynucleotide sequence or to said second portion of said second single-stranded polynucleotide sequence, c) third primer fused at its 3′ end to a first driver, and d) fourth primer fused at its 3′ end to a second driver, wherein said first driver has the same sequence as said first director, and said second driver has the same sequence as said second director, and wherein said first driver is hybridized to the complement of said first director, and said second driver is hybridized to said complement of said second director.
The invention also provides a kit for amplifying a double-stranded target polynucleotide sequence, said kit comprising any one or more of the reaction mixtures described herein.
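As a non-limiting illustration, the sequence relationships recited for the reaction mixtures above (neither director being complementary to the first or second portion, and each driver having the same sequence as its director) may be checked with a short Python sketch such as the following. The function name `check_primer_set` and all sequences are hypothetical. Complementarity is approximated here by an exact reverse-complement substring match, which is an assumption of the sketch; it is informative only for directors of several nucleotides and does not model partial hybridization.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def check_primer_set(
    first_portion: str,
    second_portion: str,
    first_director: str,
    second_director: str,
    first_driver: str,
    second_driver: str,
) -> List[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    portions = {"first portion": first_portion, "second portion": second_portion}
    directors = {"first director": first_director, "second director": second_director}
    for d_name, director in directors.items():
        for p_name, portion in portions.items():
            # Exact-match stand-in for "complementary to" (assumption of this sketch).
            if revcomp(director) in portion:
                problems.append(f"{d_name} is complementary to part of the {p_name}")
    if first_driver != first_director:
        problems.append("first driver differs from first director")
    if second_driver != second_director:
        problems.append("second driver differs from second director")
    return problems


# Hypothetical example sequences.
FIRST_PORTION = "ACGTTGCAAG"
SECOND_PORTION = "TACGGACTTG"

ok = check_primer_set(FIRST_PORTION, SECOND_PORTION, "GAGAG", "TCTCT", "GAGAG", "TCTCT")
bad = check_primer_set(FIRST_PORTION, SECOND_PORTION, "CAACG", "TCTCT", "CAACG", "TCTCT")
print(ok)
print(bad)
```

A check of this kind could be run on candidate directors before primer synthesis, alongside whatever secondary-structure screening is preferred.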
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
To facilitate understanding of the invention, a number of terms are defined below.
The term “in a single reaction mixture” when in reference to contacting reagents (such as primers, nucleotide bases, target polynucleotide sequence template, enzymes) of two or more reactions (such as a first reaction of site specific PCR amplification using site specific primers, and a second reaction of PCR amplification using index primers and universal primers) means combining the reagents of the two or more reactions without temporally waiting for, and/or without providing, conditions (e.g., thermal cycling for PCR amplification, such as temperature for denaturing double-stranded nucleotide sequences into single-stranded sequences, temperature for hybridization of sequences, etc.) that are sufficient for any of the two or more reactions to begin and/or to be completed. The term contacting “in a single reaction mixture” includes sequentially and/or substantially simultaneously adding the reagents to a single vessel or receptacle that allows physical contact between the reagents.
“Plurality” refers to a population of two or more different polynucleotides or other referenced molecules. Accordingly, unless expressly stated otherwise, the term “plurality” is used synonymously with “population.” A plurality includes 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more different members of the population. A plurality also can include 200, 300, 400, 500, 1000, 5000, 10000, 50000, 1×10⁵, 2×10⁵, 3×10⁵, 4×10⁵, 5×10⁵, 6×10⁵, 7×10⁵, 8×10⁵, 9×10⁵, 1×10⁶, 2×10⁶, 3×10⁶, 4×10⁶, 5×10⁶, 6×10⁶, 7×10⁶, 8×10⁶, 9×10⁶ or 1×10⁷, or more different members. A plurality includes all integer numbers in between the above exemplary population numbers.
“Target polynucleotide” means a polynucleotide of interest that is the object of an analysis or action. “Target polynucleotide” includes members of a plurality of target polynucleotide sequences having the same sequence. The analysis or action includes subjecting the polynucleotide to copying, amplification, sequencing and/or other procedure for nucleic acid interrogation. A target polynucleotide is exemplified by a portion of a gene, director, driver sequences, adapter sequences, and/or index sequences.
“Director” refers to at least one nucleotide that is inserted within a PCR forward primer and/or reverse primer, and that is not complementary to, and therefore does not hybridize with, a target polynucleotide that is desired to be amplified using the PCR forward and reverse primers, which are complementary to portions of the target polynucleotide, and which contain an insertion of the director. Directors include a single nucleotide (referred to as “director nucleotide”) as well as a nucleotide sequence (referred to as “director nucleotide sequence” or “director sequence”) of at least one nucleotide. The nature of the nucleotide and/or nucleotides in the director nucleotide and in the director sequence is not limited to any particular nucleotide and/or sequence, so long as the director nucleotide and/or sequence is not complementary to, and therefore does not hybridize with, the target polynucleotide in the vicinity of the forward primers and reverse primers that contain the director, and that are used for amplification of the target polynucleotide. It may be desirable to design directors to avoid secondary structures such as hairpins, homodimers, and heterodimers, and to avoid amplifying sequences that are similar to the target polynucleotide. Directors may be double-stranded, or single-stranded such as reverse strand director sequences and forward strand director sequences produced by PCR amplification (
“Driver” refers to at least one nucleotide that is equal in length to, and hybridizes with, the complementary strand generated by PCR amplification of a director. Drivers include a single nucleotide (referred to as “driver nucleotide”) as well as nucleotide sequence (referred to as “driver nucleotide sequence” or “driver sequence”) of at least two nucleotides. Drivers may be double-stranded, or single-stranded such as reverse strand driver sequences and forward strand driver sequences produced by PCR amplification (
“Index” and “indexed” polynucleotide sequence means a unique nucleotide sequence that is distinguishable from the sequence of other indices as well as the sequence of other nucleotide sequences within polynucleotides contained within a sample. An index sequence is useful as a barcode where different members of the same molecular species can contain the same index sequence and where different species within a population of different polynucleotides can have different unique indices. An index sequence can be a random or a specifically designed nucleotide sequence. An index sequence can be of any desired sequence length so long as it is of sufficient length to be a unique nucleotide sequence within a plurality of indices in a population and/or within a plurality of polynucleotides that are being analyzed or interrogated. A nucleotide index is useful, for example, to be attached to a target polynucleotide to tag or mark a particular species for identifying all members of the tagged species within a population. Methods for designing and making index sequences are known in the art (Illumina TruSeq Adapters Demystified Rev. A, © 2011 Tufts University Core Facility), and U.S. Pat. No. 9,926,598. Index polynucleotide sequences are exemplified in
“Universal polynucleotide sequence” is a sequence that enables amplification of any target polynucleotide of known or unknown sequence that has been modified to enable amplification with the universal primers. In one embodiment, such amplification produces an amplified target polynucleotide containing a “universal” sequence, such as a universal adapter sequence, at the target polynucleotides' 3′ and/or 5′ ends. The attachment of universal known ends to a library of DNA fragments by ligation allows the amplification of a large variety of different sequences in a single amplification reaction. The sequences of the known sequence portion of the nucleic acid template can be designed such that type 2s restriction enzymes bind to the known region, and cut into the unknown region of the amplified template. Universal primers are known in the art and exemplified by Illumina's Sequences S1 and S2 which, in combination, direct amplification of a template by solid-phase bridging amplification reaction. The template to be amplified must itself comprise (when viewed as a single strand) at the 3′ end a sequence capable of hybridizing to sequence S1 in the forward primers and at the 5′ end a sequence the complement of which is capable of hybridizing to sequence S2 in the reverse primer. Methods for designing and making universal sequences are known in the art (Illumina TruSeq Adapters Demystified Rev. A, © 2011 Tufts University Core Facility), and U.S. Pat. No. 8,765,381. Universal polynucleotide sequences are exemplified in
“Adapter” or “adaptor” or a “linker” is a short, chemically synthesized, single-stranded or double-stranded oligonucleotide that can be ligated to the 3′ and/or 5′ ends of other DNA or RNA molecules. Adapters containing specific sequences designed to interact with next-generation-sequencing (NGS) platforms (such as the surface of the flow-cell or beads) may be ligated to one or both of the 3′ and 5′ ends of target polynucleotides prior to sequencing. For example, adapters include “indexed adapters” and “universal adapters.” The primary function of both indexed adapters and universal adapters is to allow any DNA sequence to bind to a flowcell for next generation sequencing (NGS), and to allow for PCR enrichment of only adapter ligated DNA sequences for cluster generation (such as either on a MiSeq flowcell or on an Ion Torrent bead). Next generation sequencing (NGS) does not require indexed adapters and could be done exclusively with universal adapters. However, doing so would limit any run to only one sample. The addition of indexes unique to each sample allows for the mixing of two or more samples, for sequencing to occur, and for results to be analyzed after the sequencing is complete. The structure of adapters is dictated by the sequencing platform.
“Indexed adapters” (also referred to as “index adapters”) contain index polynucleotide sequences, and are known in the art as exemplified by TruSeq Indexed Adapter: 5′ P*GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCC 3′ (SEQ ID NO: 1). Indexed adapters allow for indexing or “barcoding” of samples so multiple DNA libraries can be mixed together into one sequencing lane (known as multiplexing). Methods for designing and making index adapters are known in the art (Illumina TruSeq Adapters Demystified Rev. A, © 2011 Tufts University Core Facility).
“Universal adapters” contain universal polynucleotide sequences, and are known in the art as exemplified by TruSeq Universal Adapter: 5′AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T 3′. (SEQ ID NO: 2). The stars (*) in the above TruSeq Indexed Adapter and TruSeq Universal Adapter indicate a phosphorothioate bond between the last C and T to prevent cleaving off the last T that is needed for annealing the overhang. The phosphate group on the indexed adapter is required to ligate the adapter to the DNA fragment. The NNNNNN in the above exemplary TruSeq Indexed Adapter represents the “barcode.” The last 12 bases are complementary if the Indexed Adapter is reversed. Methods for designing and making universal adapters are known in the art (Illumina TruSeq Adapters Demystified Rev. A, © 2011 Tufts University Core Facility).
A “primer” sequence is a short single-stranded DNA that hybridizes to a target polynucleotide sequence, and serves as a starting point for synthesis of a complementary strand of the target polynucleotide sequence. “PCR primer” is a primer used in a “polymerase chain reaction” (“PCR”). Design principles for PCR primers are known in the art, including primer length, specificity to the target polynucleotide sequence, melting temperature (Tm) value, annealing temperature (Ta), freedom of strong secondary structures and self-complementarity, and GC content.
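Two of the design parameters listed above can be estimated with simple arithmetic. The following Python sketch computes GC content and an approximate melting temperature using the Wallace rule (about 2 °C per A or T and 4 °C per G or C), a common rule of thumb for short oligonucleotides that is assumed here for illustration and is not taken from this disclosure. The function names and the example primer are hypothetical.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C).

    This rule of thumb is an assumption for illustration only; it is
    reasonable for short primers and ignores salt and concentration effects.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-nucleotide primer, not a sequence from this disclosure.
example_primer = "ATGCATGCATGCATGCATGC"
print(gc_content(example_primer), wallace_tm(example_primer))
```

More accurate nearest-neighbor models exist, and a production primer design would typically rely on them rather than on this estimate.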
“Target specific” and “site specific” when used in reference to a primer or other oligonucleotide sequence is intended to mean a primer or other oligonucleotide sequence that includes a nucleotide sequence that is complementary to, and that specifically and selectively hybridizes (i.e., anneals) to, at least a portion of a target polynucleotide sequence. Target specific primers include forward and reverse primers, universal primers, index primers, sequencing primers, and the like.
“Forward primer” is a primer sequence that hybridizes to the sense strand of a DNA sequence of interest. In contrast, a “reverse primer” hybridizes to the anti-sense strand of the DNA sequence of interest.
“Universal primer” sequences refer to a primer sequence that is complementary to, and that specifically and selectively hybridizes (i.e., anneals) to, a universal polynucleotide sequence.
“Index primer” and “indexed primer” sequences interchangeably refer to a primer sequence that is complementary to, and that specifically and selectively hybridizes (i.e., anneals) to, an index polynucleotide sequence.
“Insert,” “insertion,” and grammatical equivalents refer to a change in a polynucleotide sequence that results in the addition of one or more nucleotides. For example, a PCR primer that is complementary along its entire length to a region of a target polynucleotide sequence may be modified by insertion of a director (e.g., director nucleotide or director sequence) that is not complementary to the region of the target polynucleotide, and the presence of the director within the modified PCR primer therefore results in hybridization of only a portion (located at either the 3′ end or 5′ end of the inserted director) of the modified PCR primer to the region of the target polynucleotide sequence.
A “sense strand” and “coding strand” interchangeably refer to a segment within double-stranded DNA that runs from 5′ to 3′, and that is complementary to the “antisense strand” (i.e., “template strand”) of DNA, which runs from 3′ to 5′.
“Complement” and “complementary” when in reference to a sequence of interest interchangeably refer to a nucleic acid sequence that can form a double-stranded structure with the sequence of interest by matching base pairs. For example, the complementary sequence to 5′-G-T-A-C-3′ is 3′-C-A-T-G-5′. In one embodiment, PCR primers are 100% complementary along their entire length to a region of a target polynucleotide. In another embodiment, a PCR primer that is complementary along its entire length to a region of a target polynucleotide sequence may be modified by insertion of a director (e.g., director nucleotide or director sequence) that is not complementary to the region of the target polynucleotide, and the presence of the director within the modified PCR primer therefore results in complementarity of only a portion (located at either the 3′ end or 5′ end of the inserted director) of the modified PCR primer to the region of the target polynucleotide sequence.
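The base-pairing example above (5′-G-T-A-C-3′ pairing with 3′-C-A-T-G-5′) can be reproduced with a minimal Python sketch. The function names are hypothetical, and the sketch assumes unambiguous DNA bases (A, C, G, T) only.

```python
# Translation table for Watson-Crick base pairing of DNA bases.
_PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Complement of seq, written in the opposite (3'->5') orientation."""
    return seq.upper().translate(_PAIRS)


def reverse_complement(seq: str) -> str:
    """Complement of seq, rewritten in the conventional 5'->3' orientation."""
    return complement(seq)[::-1]


print(complement("GTAC"))          # complementary strand read 3'->5'
print(reverse_complement("AAGC"))  # complementary strand read 5'->3'
```

Because sequences are conventionally written 5′ to 3′, the reverse complement is usually the form needed when checking whether a primer can anneal to a given strand.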
“Amplification” refers to making copies of polynucleotide sequences of interest. Amplification methods include both thermocycling (such as “polymerase chain reaction” (“PCR”) amplification) and isothermal amplifications (such as described in application number WO07107710), using a commercially available Solexa/Illumina cluster station as described in PCT/US/2007/014649. The cluster station is essentially a hotplate and a fluidics system for controlled delivery of reagents to a flowcell.
“Amplicon” is a nucleotide sequence that is the source and/or product of amplification or replication events. It can be formed artificially, using various methods including polymerase chain reactions (PCR) or ligase chain reactions (LCR), or naturally through gene duplication.
“Variable” sequence refers to a segment of a chromosome characterized by variation in the number of tandem repeats at one or more loci. In some embodiments, a “variable” sequence is a “hypervariable” sequence, which refers to a segment of a chromosome characterized by considerable variation in the number of tandem repeats at one or more loci. Repeats in the hypervariable region are highly polymorphic. A hypervariable locus refers to a locus with many alleles; especially those whose variation is due to variable numbers of tandem repeats. A hypervariable region (HVR) refers to a chromosomal segment characterized by multiple alleles within a population for a single genetic locus.
“Polymerase chain reaction” (“PCR”) is a method for making copies of a specific DNA segment using repeated thermal PCR cycles. “PCR cycle” refers to a combination of denaturing a double-stranded template DNA by heating to separate it into two single strands, annealing the DNA primers to the template DNA by lowering the temperature, and extending the new DNA strand by the Taq polymerase enzyme and by raising the temperature.
“Hybridizing” and grammatical equivalents refer to a process by which single-stranded DNA or RNA molecules anneal to complementary single-stranded DNA or RNA through base pairing. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) is impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the melting temperature (Tm) of the formed hybrid, and the G:C ratio within the nucleic acids. Conditions for hybridizing DNA molecules, such as primers and target DNA polynucleotides are known in the art, and exemplified herein.
“Operable combination” and “operably linked” when in reference to the relationship between nucleic acid sequences refers to fusing the sequences in frame such that they perform their intended function. For example, operably linking a promoter sequence to a nucleotide sequence of interest refers to fusing the promoter sequence and the nucleotide sequence of interest in a manner such that the promoter sequence is capable of directing the transcription of the nucleotide sequence of interest and/or the synthesis of a polypeptide encoded by the nucleotide sequence of interest.
“Fuse,” “fusion,” and grammatical equivalents when made in reference to a first and second nucleotide sequences refer to the linkage of the first and second nucleotide sequences via phosphodiester bonds. Fusion of a first and second nucleotide sequences may be direct or indirect. “Direct” fusion refers to the absence of intervening nucleotides between the first and second nucleotide sequences. “Indirect” fusion refers to the presence of one or more nucleotides between the first and second nucleotide sequences. For example, the term “index sequence fused at its 3′ end to a driver” refers to an index sequence that is fused, directly or indirectly, at its 3′ end to a driver.
The terms “3′ end” and “5′ end” when in reference to a nucleotide sequence refer to the terminal nucleotide base that is located at, respectively, the 3′ terminal and 5′ terminal of the nucleotide sequence.
The terms “3′ terminal region” and “5′ terminal region” when in reference to a nucleotide sequence refer to a portion of the nucleotide sequence that is approximately a third of the length of the nucleotide sequence and that spans and includes, respectively, the 3′ end and 5′ end of the nucleotide sequence.
“Efficiency” when in reference to an amplicon refers to the percentage of total reads of the amplicon. “Higher efficiency” refers to an increase in the percentage of total reads, exemplified by an increase of at least 0.1 fold (i.e., 10%), including an increase of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, and 30 fold, and exemplified by an increase from 0.1 fold (i.e., 10%) to 100 fold (i.e., 10,000%), including from 0.1 to 90 fold, 1 to 90 fold, 1 to 80 fold, 1 to 70 fold, 1 to 60 fold, 1 to 50 fold, 1 to 40 fold, 1 to 30 fold, 1 to 29 fold, 1 to 28 fold, 1 to 27 fold, 1 to 26 fold, 1 to 25 fold, 1 to 24 fold, 1 to 23 fold, 1 to 22 fold, 1 to 21 fold, 1 to 20 fold, 1 to 19 fold, 1 to 18 fold, 1 to 17 fold, 1 to 16 fold, 1 to 15 fold, 1 to 14 fold, 1 to 13 fold, 1 to 12 fold, 1 to 11 fold, 1 to 10 fold, 1 to 9 fold, 1 to 8 fold, 1 to 7 fold, 1 to 6 fold, 1 to 5 fold, 1 to 4 fold, 1 to 3 fold, and 1 to 2 fold, as exemplified by a 27 fold (i.e., 2,700%) increase shown in Example 3, Table 3, and
The terms “higher,” “greater,” and grammatical equivalents (including “increase,” “elevate,” “raise,” etc.) when in reference to the level of any molecule (e.g., amplicon, nucleic acid sequence, amino acid sequence, etc.) and/or phenomenon (e.g., amplification of a nucleotide sequence, expression of a gene, etc.), specificity of binding of two molecules (e.g., binding of a director to a driver), in a first sample relative to a second sample, mean that the quantity of the molecule and/or phenomenon in the first sample is higher than in the second sample by any amount that is statistically significant using any art-accepted statistical method of analysis.
“Kit” is used in reference to a combination of reagents and other materials. A kit may include reagents such as buffering agents, nucleic acid stabilizing reagents, protein stabilizing reagents, and/or testing containers. In one embodiment, the kit further comprises instructions for using the reagents, such as for amplification of a target polynucleotide sequence, exemplified by, but not limited to, instruction in Example 2.
The invention provides compositions and methods for accurately and specifically amplifying sequences to allow for accurate and specific detection of mutations, such as in disease-associated genes and alleles, thus distinguishing true nucleotide variants over random nucleotide sequencing errors. In one embodiment, this is accomplished, prior to sequencing, by using a combination of director and driver in a combination of two PCR reactions. Thus, in one embodiment, this is accomplished by first amplifying a nucleotide sequence of interest to introduce a director that creates a specific target for a second amplification in which a driver, which specifically hybridizes to the director, drives the directionality of the second amplification. The amplification produces an amplicon of a sense or antisense strand of a double-stranded nucleotide sequence of interest. The amplicon may optionally contain universal sequences and/or index sequences that facilitate subsequent sequencing of the amplicon, such as using sequencing-by-synthesis. The reagents of the first and second amplification steps may be combined in a single reaction mixture.
Thus, in one embodiment, the invention's methods include two steps that, optionally, occur together in the same reaction vessel in order. First, a PCR occurs where the PCR primers include a director that creates a specific target. Second, the universal adapter and optional index adapter are added using primers that include a driver that specifically binds to the director. If the index adapter's driver finds the index adapter's specific director it will base pair and extend. If the index adapter's driver finds the universal adapter's specific director it will not base pair and will not extend. Conversely, if the universal adapter's driver finds the universal adapter's specific director it will base pair and extend. If the universal adapter's driver finds the index adapter's specific director it will not base pair and will not extend. Without the specificity of the combination of a driver and director the methods are very inefficient.
In one embodiment, the forward PCR primers 1 (director 1) and 3 (driver 1) directly incorporate a director/driver to the sense strand. The reverse PCR primers 2 (director 2) and 4 (driver 2) directly incorporate a director/driver to the antisense strand. Through PCR, the sense strand is copied as a complementary antisense strand and the antisense strand is copied as a complementary sense strand. As a final product, the sense strand ultimately has a direct 5′ incorporation of the PCR 1 and 3 driver/director and an indirect 3′ incorporation of the complement of the PCR 2 and 4 driver/director. As a final product, the antisense strand ultimately has a direct 5′ incorporation of the PCR 2 and 4 driver/director and an indirect 3′ incorporation of the complement of the PCR 1 and 3 driver/director. Because director 1 and driver 1 have the same sequence (and director 2 has the same sequence as driver 2) the driver does not directly hybridize with the director. Instead, it hybridizes with the transcribed, complementary strand produced as primer 1 is amplified. The first director of the first primer and second driver of the second primer are unique from one another. They cannot be equal and cannot be complementary to one another. Their uniqueness from one another is what creates specificity for the second PCR and is what gives directionality to the system.
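The specificity described above can be illustrated with a toy Python model. It is a simplified sketch, not the disclosed assay: all sequences are hypothetical, the target-specific portions of the primers are omitted, and annealing is modeled as an exact match between a driver and the complement of a director. The names `can_anneal`, `DIRECTOR_1`, `DIRECTOR_2`, and `INSERT` are assumptions made for illustration.

```python
# Toy model of director/driver specificity. All sequences are hypothetical.
_PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complementary strand of seq, written 5'->3'."""
    return seq.upper().translate(_PAIRS)[::-1]


def can_anneal(driver: str, strand: str) -> bool:
    """True if strand (5'->3') contains the exact complement of the driver."""
    return reverse_complement(driver) in strand.upper()


DIRECTOR_1 = "AAGGAAGG"  # carried by primer 1 (forward)
DIRECTOR_2 = "CCTACCTA"  # carried by primer 2 (reverse)
DRIVER_1 = DIRECTOR_1    # driver 1 has the same sequence as director 1
DRIVER_2 = DIRECTOR_2    # driver 2 has the same sequence as director 2
INSERT = "TGTGTGTG"      # stand-in for the amplified target region

# First-round product: the sense strand carries director 1 at its 5' end
# and the complement of director 2 at its 3' end.
SENSE = DIRECTOR_1 + INSERT + reverse_complement(DIRECTOR_2)
ANTISENSE = reverse_complement(SENSE)

for name, driver in (("driver 1", DRIVER_1), ("driver 2", DRIVER_2)):
    print(name,
          "| anneals to sense:", can_anneal(driver, SENSE),
          "| anneals to antisense:", can_anneal(driver, ANTISENSE))
```

In this model each driver finds a binding site on only one strand, and only at the site created by copying its own director, which is the source of directionality in the second amplification.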
One advantage of the invention's methods is that they produce sequenceable amplicons at a higher efficiency compared to amplification methods that omit using the invention's combination of director and driver. For example, while the total amplicon count is similar without the use of the invention's combination of driver and director, the majority of products cannot be sequenced or otherwise interfere with sequencing of the sequenceable amplicon.
A further advantage of the invention's methods is that they accurately and specifically produce amplicons, thus enabling accurate and specific sequencing and detection of variable and hypervariable alleles. For example, genomic sequences amplified using the invention's methods and compositions may be subjected to sequencing, thus enabling diagnosis of disease. In one embodiment, the invention's methods produced sequences that successfully diagnosed patients as having celiac disease by confirming a patient as having DQB1:03/DQB1:06 alleles, and another patient having DQA1:02/DQA1:01:03 alleles (Example 2), which are different from the types that are traditionally tested.
Another advantage of the invention's methods is that their use to detect particular genes does not require amplifying and sequencing an entire disease-associated gene (such as celiac disease HLA DQA and HLA DQB genes), but may be accomplished by amplifying and sequencing only the hypervariable regions that define the specific disease-associated alleles. NGS carries strand level information and all variants found within a strand can be understood in a cis/trans context. While not sufficient by SSO or even Sanger, by using a few targeted regions of, for example, less than 300 bp, it is contemplated that alleles (such as celiac disease HLA DQA and HLA DQB alleles) can be determined down to the second or third field.
Yet another advantage of the invention's methods is that they can be performed by combining the reagents for more than one PCR step in a single reaction vessel, thereby reducing time for technician involvement, reducing the likelihood for transcription errors, and reducing the likelihood for secondary PCR contamination.
An additional advantage of the invention's methods is that they use lower primer concentrations and fewer amplification cycles to drive amplification reactions to completion, thus reducing cost and time.
One characteristic of the invention's methods is that they generate an amplicon of either a sense strand or antisense strand of the target polynucleotide. The generated amplicon is fused to a universal adapter and index adapter, and is thus complementary to the flowcell surface and is retained during the cluster generation step for subsequent sequencing (
In some embodiments, the invention's methods target a selected region of a gene with sufficient read depth, such that data provided by one strand (such as the sense strand or antisense strand, and more preferably the sense strand) that is generated by the invention's methods provides adequate information regarding the actual sequence. In some embodiments, the range of read depths is dependent on instrument, assay complexity, and number of samples or runs. As an example, 40,000 samples could be run on one MiSeq run and still get 100 fold coverage. Even without full optimization, depths measured are in the 2,000-20,000 range. In one embodiment, the read depth is from 20 to 1,000, including from 20 to 900, 20 to 800, 20 to 700, 20 to 600, 20 to 500, 20 to 400, 20 to 300, 20 to 200, 20 to 100, and 20 to 50. For example, in some embodiments, the read depth is 100 for germline variant samples, and at least 500 for somatic (such as cancer) variant samples.
The invention's methods were developed and applied to the exemplary HLA genes associated with celiac disease. Celiac disease is a long-term autoimmune disorder that primarily affects the small intestine. Celiac disease is “permissive” with the right HLA type. Without the right HLA type celiac disease doesn't occur, but with the right type it can occur. HLA Class II molecules are heterodimers containing an alpha and beta chain, with each chain encoded by a different gene. There are several different HLA Class II molecules including HLA-DP, HLA-DQ, and HLA-DR, as well as many different pseudogenes. All HLA genes including both Class I and Class II are encoded on chromosome 6.
Current genetic testing for celiac disease can rule out celiac disease, can indicate an individual is at risk to develop celiac disease, but cannot, alone, directly diagnose celiac disease. Almost all people (95%) with celiac disease have either the variant HLA-DQ2 allele or (less commonly) the HLA-DQ8 allele. However, about 20-30% of people without celiac disease have also inherited either of these alleles. There are three different celiac disease permissive HLA-DQ types, either DQ2.5, DQ2.2, or DQ8 (in order of celiac disease-associated frequency). This nomenclature indicates a heterodimer containing specific alpha and beta chains. DQ2.5 contains DQB1*0201 and DQA1*0501. DQ2.2 contains DQB1*0202 and DQA1*0201. DQ8 contains DQB1*0302 and DQA1*0301. Traditionally, genetic testing for celiac disease aims to detect the presence of the following alleles:
Additionally, current celiac disease testing reports HLA-DQ typing at a low resolution for all patients, not just celiac disease permissive patients. While on the one hand clinicians need to confirm whether any of the above types are present, there is still a desire to report types not on this list. This is one reason why there are six PCR products, more than necessary to prove a simple positive/negative result, in the invention's assay design.
Initial attempts in developing the invention's methods and compositions were not completely optimized. One initial strategy in developing the invention's methods (Example 1) attempted to sequence a few targeted regions of less than 300 bp of the hypervariable regions that define specific alleles in the celiac disease HLA DQA and HLA DQB genes, by using universal tags to randomly add the index adapter or universal adapter to either the forward or reverse strand. However, this strategy did not succeed (Example 1).
The invention's subsequent successful methods for amplification of an exemplary sense strand of a double-stranded target polynucleotide are exemplified herein in Examples 2 and 3,
The invention's methods may be used for amplification of only a sense strand or only an antisense strand of a double-stranded target polynucleotide sequence into a sequenceable sequence. Thus, in one embodiment, the invention provides a method for amplifying a target polynucleotide sequence, comprising two amplification steps.
The first amplification comprises contacting a double-stranded target polynucleotide sequence with i) a first PCR primer that is modified by having an insertion of a director, and ii) a second primer modified by having an insertion of a second director, the contacting is under conditions sufficient for amplifying the double-stranded target polynucleotide sequence to produce a first plurality of amplicons comprising a first single-stranded amplicon that comprises at least one single strand of the double-stranded target polynucleotide sequence, the at least one single strand having an insertion of the director in either its 3′ or 5′ terminal regions and an insertion of the complement of the second director in its other terminal region.
The second amplification step comprises contacting the at least one single strand with i) a third primer fused at its 3′ end to a first driver, and ii) a fourth primer fused at its 3′ end to a second driver, wherein the first driver has the same sequence as the first director, and the second driver has the same sequence as the second director, wherein the 5′ end of either the first driver or the second driver is fused to a universal adapter sequence, and wherein the contacting is under conditions sufficient for amplifying the at least one single strand to produce a second plurality of amplicons comprising a second single-stranded amplicon having the universal adapter sequence fused to its 5′ end, and comprising at its 3′ end the at least one single strand containing an insertion of the director in either its 3′ or 5′ terminal regions, and containing an insertion of the complement of the director in its other terminal region.
In one embodiment, it may be desirable to carry out the two amplification steps in a single reaction mixture.
In some embodiments, where it may be desirable to amplify a sense strand of double-stranded target polynucleotide sequence, the first primer and the third primer are forward primers, and the second primer and the fourth primer are reverse primers (
In some embodiments, where it may be desirable to amplify an antisense strand of double-stranded target polynucleotide sequence, the first primer and the third primer are reverse primers, and the second primer and the fourth primer are forward primers.
In one embodiment, where the sense strand of double-stranded target polynucleotide sequence is amplified, the invention's method generates a second single-stranded amplicon that contains, fused in operable combination from the 5′ end to the 3′ end, a) the universal adapter sequence, and b) the at least one single strand having an insertion of the director in either its 3′ or 5′ terminal regions and an insertion of the complement of the director in its other terminal region.
In a particular embodiment, where it is desirable to include an index sequence in the amplified sense strand, the 5′ end of the first driver of the third primer is fused to a unique index sequence, and the 5′ end of the second driver of the fourth primer is fused to the universal adapter sequence. This produces a second single-stranded amplicon fused at its 3′ end to the unique index sequence, and fused at its 5′ end to the universal adapter sequence (
In a particular embodiment, where it is desirable to include an index sequence in the amplified antisense strand, the 5′ end of the first driver of the third primer is fused to the universal adapter sequence, and the 5′ end of the second driver of the fourth primer is fused to a unique index sequence. In a particular embodiment, the second single-stranded amplicon is fused at its 5′ end to the unique index sequence, and fused at its 3′ end to the universal adapter sequence. In a further embodiment, the second single-stranded amplicon contains an insertion in its 5′ terminal region of a sense strand of the unique index sequence, and an insertion in its 3′ terminal region of a sense strand of the universal adapter sequence. In a particular embodiment, the unique index sequence is comprised in an index adapter.
While not necessary, it may be desirable to contact the reagents of the first and second amplification steps in a single reaction mixture (Example 2 and 3).
It may be desirable to sequence the single-stranded amplicon comprised in the second plurality of amplicons, such as using sequencing-by-synthesis (SBS), using methods known in the art (U.S. Pat. No. 8,765,381).
One advantage of the invention's methods is that they produce the second single-stranded amplicon at a higher efficiency than in the absence of one or more of the director, the first driver, and the second driver. Data herein demonstrate that inclusion of the director and driver increased the efficiency by 27 fold (i.e., 2,700%) (Example 3, Table 3, and
Another advantage of the invention's methods is that they may be accomplished using a lower concentration of one or both of the first primer and the second primer than the concentration of one or both of the third primer and the fourth primer. The concentration of one or both of the first primer and the second primer is 25% or less, 24% or less, 23% or less, 22% or less, 21% or less, 20% or less, 19% or less, 18% or less, 17% or less, 16% or less, 15% or less, 14% or less, 13% or less, 12% or less, 11% or less, or 10% or less of the concentration of one or both of the third primer and the fourth primer, including from 10% to 25%, 10% to 24%, 10% to 23%, 10% to 22%, 10% to 21%, 10% to 20%, 10% to 19%, 10% to 18%, 10% to 17%, 10% to 16%, and 10% to 15% of the concentration of one or both of the third primer and the fourth primer. Data herein demonstrate the successful use of ⅙th (i.e., 16%) the concentration of the site specific PCR primer, compared to the concentration of either the index primer or universal primer, to exhaust reagents and drive the PCR reactions to completion.
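The relative primer concentration discussed above is a simple ratio, sketched below in Python. The function names are hypothetical, the concentrations are unitless placeholders, and the 10% to 25% bounds are taken from the range recited in the preceding paragraph. One sixth works out to roughly 16.7%, in line with the approximately 16% figure quoted above.

```python
def relative_concentration_percent(site_specific_conc: float,
                                   adapter_primer_conc: float) -> float:
    """Site-specific primer concentration as a percentage of the
    index or universal primer concentration (same units for both)."""
    if adapter_primer_conc <= 0:
        raise ValueError("adapter primer concentration must be positive")
    return 100.0 * site_specific_conc / adapter_primer_conc


def within_recited_range(percent: float,
                         low: float = 10.0, high: float = 25.0) -> bool:
    """True if the percentage falls within the recited 10% to 25% range."""
    return low <= percent <= high


# Placeholder values giving the one-sixth ratio; units are arbitrary.
pct = relative_concentration_percent(1.0, 6.0)
print(round(pct, 1), within_recited_range(pct))
```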
A further advantage of the invention's methods is that they may be accomplished using fewer than 100 cycles, including fewer than each of 90, 80, 70, 60, 50, 45, 40, 35, 30, 25, and 20 cycles. Data herein demonstrate successful use of 50 PCR cycles (Examples 2 and 3).
In a particular embodiment, the target polynucleotide sequence comprises genomic DNA, exemplified by, but not limited to, the DQB1 gene and/or DQA1 gene. In a particular embodiment, the genomic DNA is not fragmented. In a further embodiment, the genomic DNA comprises a variable and/or hypervariable sequence of an allele. While not intending to limit the invention to any particular variable and/or hypervariable allele, in one embodiment, the hypervariable allele comprises one or more of DQ2.5 allele, DQ2.2 allele, DQ8 allele, DQB1 02:01 allele, DQB1 02:02 allele, DQB1 03:02 allele, DQA1 05:01 allele, DQA1 02:01 allele, and DQA1 03:01 allele (Table 1). In one embodiment, the hypervariable allele comprises one or more of KIR2DL1 allele, KIR2DL2 allele, KIR2DL3 allele, and KIR2DL4 allele. In one embodiment, hypervariable alleles comprise one or more of. In one embodiment, variable alleles include CYP2D6*2 allele, CYP2D6*3 allele, and CYP2D6*4 allele.
The invention's methods may be used for amplification of a sense strand or an antisense strand of a target polynucleotide. Thus, in one embodiment, the invention provides a method for amplifying target polynucleotide sequences, comprising contacting i) a sample comprising a plurality of double-stranded target polynucleotide sequences comprising a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and comprising a first portion and a second portion, ii) a first primer (such as a first forward primer) comprising a first sequence that is complementary to the first portion of the first single-stranded polynucleotide sequence (such as the sense strand of a first portion of the target polynucleotide sequences), wherein the first sequence is modified by having an insertion of a director, iii) a second primer (such as a first reverse primer) comprising a second sequence that is complementary to the second portion of the second single-stranded polynucleotide sequence (such as the antisense strand of a second portion of the target polynucleotide sequences), wherein the second sequence is modified by having an insertion of a complement of the second director, wherein the director is not complementary either to the first portion of the first single-stranded polynucleotide sequence or to the second portion of the second single-stranded polynucleotide sequence, iv) a third primer (such as a second forward primer) fused at its 3′ end to a first driver, and v) a fourth primer (such as a second reverse primer) fused at its 3′ end to a second driver, wherein one of the first driver and the second driver has the same sequence as the director, and the other of the first driver and the second driver has the same sequence as the second director, wherein the 5′ end of either the first driver or the second driver is fused to a universal adapter sequence, and wherein the contacting step is under conditions sufficient for hybridizing the director either with the first driver or with the complement of the second driver, and for amplifying the plurality of target polynucleotide sequences to produce a) a first plurality of amplicons comprising a first single-stranded amplicon that comprises i) the first single-stranded polynucleotide sequence (such as a sense strand) having an insertion of the first director in either its 3′ or 5′ terminal regions and an insertion of the complement of the second director in its other terminal region, or ii) the second single-stranded polynucleotide sequence (such as an antisense strand) having an insertion of the second director in either its 3′ or 5′ terminal regions and an insertion of the complement of the first director in its other terminal region, and b) a second plurality of amplicons comprising a second single-stranded amplicon having the universal adapter sequence fused to its 5′ end, and comprising at its 3′ end either i) the first single-stranded polynucleotide sequence (such as a sense strand) containing an insertion of the first director in either its 3′ or 5′ terminal regions, and containing an insertion of the complement of the second director in its other terminal region, or ii) the second single-stranded polynucleotide sequence (such as an antisense strand) containing an insertion of the second director in either its 3′ or 5′ terminal regions, and containing an insertion of the complement of the first director in its other terminal region.
In one embodiment, it may be desirable to carry out the contacting steps c) and e) in a single reaction mixture.
In some embodiments, where it may be desirable to amplify a sense strand of double-stranded target polynucleotide sequence, the first primer and the third primer are forward primers, and the second primer and the fourth primer are reverse primers (see the accompanying figures).
In some embodiments, where it may be desirable to amplify an antisense strand of double-stranded target polynucleotide sequence, the first primer and the third primer are reverse primers, and the second primer and the fourth primer are forward primers.
In one embodiment, the invention's method generates a second single-stranded amplicon that contains, fused in operable combination from the 5′ end to the 3′ end, a) the universal adapter sequence, and b) the first single-stranded polynucleotide sequence (such as sense strand) having an insertion of the first director in either its 3′ or 5′ terminal regions and an insertion of the complement of the second director in its other terminal region, or the second single-stranded polynucleotide sequence (such as antisense strand) having an insertion of the second director in either its 3′ or 5′ terminal regions and an insertion of the complement of the first director in its other terminal region.
In a particular embodiment, where it is desirable to include an index sequence in the amplified sense strand, the 5′ end of the first driver of the third primer is fused to a complement of the unique index sequence, and the 5′ end of the second driver of the fourth primer is fused to the universal adapter. This produces a second single-stranded amplicon fused at its 3′ end to the unique index sequence, and fused at its 5′ end to the universal adapter sequence (see the accompanying figures).
In a particular embodiment, where it is desirable to include an index sequence in the amplified antisense strand, the 5′ end of the first driver of the third primer is fused to the complement of the universal adapter sequence, and the 5′ end of the second driver of the fourth primer is fused to a unique index sequence. In one embodiment, the second single-stranded amplicon is fused at its 5′ end to the unique index sequence, and fused at its 3′ end to the universal adapter sequence. In a particular embodiment, the second single-stranded amplicon contains an insertion in its 5′ terminal region of a sense strand of the unique index sequence, and an insertion in its 3′ terminal region of a sense strand of the universal adapter sequence. In a particular embodiment, the unique index sequence is comprised in an index adapter. The invention's methods are exemplified by amplification of a sense strand of a target polynucleotide (see the accompanying figures).
In one embodiment, it may be desirable to carry out the two amplification steps in a single reaction mixture.
In a particular embodiment, where it is desirable to include an index sequence in the amplified sense strand, the second reverse primer comprises a complement to a unique index sequence fused at its 3′ end to the driver, and the single-stranded amplicon contains at its 3′ end a sense strand of the unique index sequence. In a particular embodiment, the unique index sequence is comprised in an index adapter.
The invention further provides reaction mixtures for amplifying a double-stranded target polynucleotide sequence that contains a first single-stranded polynucleotide sequence and a second single-stranded polynucleotide sequence, and contains a first portion and a second portion, the reaction mixture comprising a) a first primer (such as a first forward primer) comprising a first sequence that is complementary to the first portion of the first single-stranded polynucleotide sequence (such as the antisense strand of a first portion of the target polynucleotide sequences), wherein the first sequence is modified by having an insertion of a first director, b) a second primer (such as a first reverse primer) comprising a second sequence that is complementary to the second portion of the second single-stranded polynucleotide sequence (such as the sense strand of a second portion of the target polynucleotide sequences), wherein the second sequence is modified by having an insertion of a second director, wherein neither director is complementary either to the first portion of the first single-stranded polynucleotide sequence or to the second portion of the second single-stranded polynucleotide sequence, c) a third primer fused at its 3′ end to a first driver, and d) a fourth primer fused at its 3′ end to a second driver, wherein the first driver has the same sequence as the first director, and the second driver has the same sequence as the second director.
In a particular embodiment, where it is desirable to include an index sequence or a universal adapter sequence in the amplified sense strand, one or both of the following apply in the reaction mixture: the 5′ end of the first driver of the third primer is fused to a unique index sequence, and the 5′ end of the second driver of the fourth primer is fused to the universal adapter sequence.
In a further embodiment where it is desired to amplify the sense strand of a target polynucleotide sequence, the 5′ end of the second driver of the fourth primer is fused to the universal adapter sequence. Optionally, the 5′ end of the first driver of the third primer is fused to a unique index sequence.
In a further embodiment where it is desired to amplify the antisense strand of a target polynucleotide sequence, the 5′ end of the first driver of the third primer is fused to the universal adapter sequence. Optionally, the 5′ end of the second driver of the fourth primer is fused to a unique index sequence.
The invention further provides kits for amplifying double-stranded target polynucleotide sequence, the kits comprising any one or more of the reaction mixtures herein.
The following examples serve to illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.
One initial strategy in developing the invention's methods (Example 1) attempted to sequence a few targeted regions of <300 bp of the hypervariable regions that define specific alleles in the celiac disease HLA DQA and HLA DQB genes, by using universal tags to randomly add the index adapter or universal adapter to either the forward or reverse strand. A schematic of the initial strategy is shown in the accompanying figures.
As can be seen, because of the random nature of the universal adapter and index adapter extensions, this method generated several undesired products. Additionally, while much simpler than other library preparation methods, it still requires four separate steps. Furthermore, presumably due to the number of product species, a qPCR using the KAPA kit showed that the amount of sequenceable product was very limited. Also, as PCR optimization was performed, we were unable to find sites within DQB1 that were both specific to DQB1 and free of polymorphisms that would lead to allelic dropout. Even the use of multiple primers for each target, including degenerate bases specific to each possible known variant at a given base, continued to lead to allelic dropout.
Because the initial strategy to sequence targeted regions of the hypervariable regions that define specific alleles in the celiac disease HLA DQA and HLA DQB genes could not be optimized, alternative methods were carried out in which the specificity requirement was lifted. This allowed multiple genes, such as HLA-DQB2, to be amplified as long as allelic dropout was reduced compared to the initial strategy of Example 1. Because the specificity requirement was removed, this new method uses bioinformatics to eliminate the off-target reads. To resolve all of the non-sequenceable products, to reduce the number of steps, and to increase the efficiency of the reaction, we modified the universal tags by adding a 3 bp change to the PCR primers and to the 3′ end of the universal adapter and index primers. This forces directional specificity on the universal adapter and index adapter incorporation. The invention's method is exemplified by the amplification of the sense strand of the target polynucleotide sequence, as shown in the accompanying figures.
This invention's method can be performed in one step where the PCR primers, index sequences, and universal adapter sequences are all added into the same reaction mixture, substantially simultaneously. This reduces tech time, reduces the chances for transcription errors, and reduces the chances for secondary PCR contamination.
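The directional effect of the 3 bp change can be sketched with a toy model. This is a simplified illustration, not the sequences or the mechanism as actually implemented: the shared tag and the two 3-base additions below are hypothetical, and the model simply assumes that an adapter primer extends only from an amplicon end whose tag-plus-addition sequence matches its own 3′ end exactly.

```python
# Toy model of directional adapter incorporation. All sequences are hypothetical.
UNIVERSAL_TAG = "ACGTTGCA"  # hypothetical tag shared by both amplicon ends


def allowed_pairings(end_additions, primer_additions):
    """List (primer, amplicon end) pairs whose tag-plus-addition sequences match exactly."""
    pairs = []
    for primer, driver in primer_additions.items():
        for end, director in end_additions.items():
            if UNIVERSAL_TAG + driver == UNIVERSAL_TAG + director:
                pairs.append((primer, end))
    return sorted(pairs)


# No added bases: either adapter primer can extend from either end (random incorporation).
undirected = allowed_pairings(
    {"forward end": "", "reverse end": ""},
    {"universal adapter": "", "index adapter": ""},
)

# A distinct 3-base addition on each end restricts each adapter primer to one end.
directed = allowed_pairings(
    {"forward end": "TCA", "reverse end": "GAT"},
    {"universal adapter": "TCA", "index adapter": "GAT"},
)

print(len(undirected), "pairings without the 3 bp change")
print(len(directed), "pairings with the 3 bp change:", directed)
```

In this model, the undirected case allows all four primer and end combinations, while the directed case leaves only the two intended ones.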
While there are technically four species present, two of these are precursor molecules to the final product. We have driven the reaction to the final product by using the PCR primers at ⅙th the concentration of the index adapter and universal adapter, and by using 50 cycles to exhaust reagents and drive the reaction to completion. Quantitation by KAPA kit confirmed that the amount of sequenceable product was high, with the last library generating 473 nM product. In some embodiments, it may be desired to reduce this concentration, for example by changing the PCR primer concentrations and/or number of PCR cycles. However, by guaranteeing that all reactions have gone beyond log-linear amplification and into the plateau phase, the need for normalization should be reduced or even eliminated.
A) Design of Index Adapter and Universal Adapter
The index adapter and universal adapter were designed using methods known in the art (Illumina TruSeq Adapters Demystified Rev. A, © 2011 Tufts University Core Facility) as follows: (SEQ ID NOS: 3, 4, 5, 6):
Paying attention only to the sense 5′ to 3′ strands in the image above, the region “left of insert” is provided in the invention's method by the forward primer that incorporates a universal primer that the Illumina adapter extends from in the sense orientation. The “right of insert” region referred to in the above image is provided by a reverse primer incorporating a universal primer that the index adapter extends from in the anti-sense orientation. During amplification, the sense strand is synthesized from the antisense template, providing the completed upper strand, as shown in the accompanying figures.
Because the index is oriented in a 5′ to 3′ direction away from the insert, it cannot be directly incorporated by extension but instead is indirectly added by the extension of the sense strand into the index primer. As a result, the extended product is the reverse complement of the index primer. This means the index sequence reported to the MiSeq is the reverse complement of the index sequence contained in the index primer.
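The reverse-complement relationship described above can be illustrated with a short sketch. The six-base index used here is a hypothetical example and is not one of the sequences of the sequence listing.

```python
# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


index_in_primer = "ATCACG"  # hypothetical index as written in the index primer
index_in_product = reverse_complement(index_in_primer)
print(index_in_primer, "->", index_in_product)  # prints "ATCACG -> CGTGAT"
```

Applying the function twice returns the original sequence, which is a convenient check when converting between the primer orientation and the orientation of the extended product.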
B) Exemplary Amplification of a Target Sense Strand
Sequences used in the exemplary amplification are shown in the accompanying figures.
The overall method uses four DQB1 amplicons and two DQA1 amplicons, each interrogating hypervariable regions for low resolution typing. The following is a more detailed discussion of the methodology and data for PCR1, then a more general discussion of the remaining PCRs.
In Example 2, an exemplary three base pair (bp) driver sequence was successfully used to direct the index sequence and adapter sequence to their respective locations. This Example addresses whether other lengths of driver sequence may be used in the invention's methods.
To do this, six different driver sequence lengths (0, 1, 2, 3, 6, and 15 bp) were designed for each of the various PCRs, universal adapters, and index sequences. Sequences were selected to not interfere with either Illumina or gene-specific regions, to avoid long stretches of the same base in a row, to have limited G/C ratios in order to not radically affect melting temperature, and to not cause secondary structures such as hairpin folding. The sequences used are shown in the accompanying figures.
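The selection criteria listed above can be sketched as a simple screen. This is a minimal illustration under stated assumptions, not the procedure actually used to choose the driver sequences: the thresholds (`max_gc`, `max_run`, `stem`, `min_loop`), the function names, and the candidate sequences are all hypothetical, and the hairpin check is a crude search for a short self-complementary stem.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    longest = run = 0
    prev = None
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def has_hairpin_stem(seq, stem=4, min_loop=3):
    """Crude check: a stem-length window whose reverse complement occurs
    downstream after at least min_loop intervening bases."""
    seq = seq.upper()
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        window = seq[i:i + stem]
        if reverse_complement(window) in seq[i + stem + min_loop:]:
            return True
    return False


def passes_screen(seq, max_gc=0.6, max_run=2, reserved=()):
    """Apply the hypothetical thresholds; `reserved` holds sequences to avoid."""
    seq = seq.upper()
    if gc_fraction(seq) > max_gc or longest_homopolymer(seq) > max_run:
        return False
    if has_hairpin_stem(seq):
        return False
    # Reject candidates found in a reserved sequence on either strand.
    return not any(seq in r.upper() or reverse_complement(seq) in r.upper()
                   for r in reserved)


for candidate in ["TCA", "GGGCCC", "AAAT"]:  # hypothetical candidates
    print(candidate, passes_screen(candidate))
```

For very short drivers the reserved-sequence check is strict, since a 1 to 3 base sequence occurs by chance in most longer sequences; a practical screen would likely apply it only to the longer designs.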
The PCR primers, index sequences, and universal adapter sequences were used as described above in Example 2. Samples from five patients were tested with each of the 0, 1, 2, 3, 6, and 15 bp drivers. After the one-step library preparation method was complete, equal volumes of each sample were pooled, purified by AMPure, quantitated by qPCR (Kapa Library Quant Kit), diluted to 9 pM, diluted with PhiX, and sequenced on a MiSeq using a 300-cycle kit.
The total number of reads associated with each PCR target region was determined using FastQC's Overrepresented Sequences tool. The percentage of total flowcell reads that each patient received is displayed in Table 3 and plotted in the accompanying figures.
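The dilution to 9 pM follows the usual C1V1 = C2V2 relationship and can be sketched as below. The function names and the volumes are hypothetical, and the 473 nM stock value is borrowed from the library quantitated in Example 2 purely as an example input; it is not stated to be the concentration of the pools described here.

```python
def dilution_factor(stock_nm, target_pm):
    """Fold dilution needed to take a stock in nM down to a target in pM."""
    if stock_nm <= 0 or target_pm <= 0:
        raise ValueError("concentrations must be positive")
    return stock_nm * 1000.0 / target_pm  # 1 nM = 1000 pM


def stock_volume_ul(stock_nm, target_pm, final_volume_ul):
    """Stock volume (in microliters) needed for a given final volume (C1V1 = C2V2)."""
    return final_volume_ul / dilution_factor(stock_nm, target_pm)


factor = dilution_factor(473.0, 9.0)
print(f"fold dilution: {factor:.0f}")
print(f"stock per 1000 uL from a 1 nM intermediate: {stock_volume_ul(1.0, 9.0, 1000.0):.1f} uL")
```

Because a single-step dilution of this size would require an impractically small stock volume, the sketch also shows the volume needed from a hypothetical 1 nM intermediate dilution.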
Each and every publication and patent mentioned in the above specification is herein incorporated by reference in its entirety for all purposes. Various modifications and variations of the described methods and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific embodiments, the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the art and in fields related thereto are intended to be within the scope of the following claims.
This application claims priority under 35 U.S.C. § 119(e) to co-pending U.S. Provisional Patent Application Ser. No. 62/798,163, filed Jan. 29, 2019, incorporated by reference.
Number | Date | Country
---|---|---
62798163 | Jan 2019 | US