The invention herein disclosed provides for devices and methods that can regulate the time at which an individual polymer in a mixture is acted upon by another compound, for example, an enzyme. The invention is of particular use in the fields of molecular biology, structural biology, cell biology, molecular switches, molecular circuits, and molecular computational devices, and the manufacture thereof. The invention also relates to methods of using the compositions to diagnose whether a subject is susceptible to cancer, autoimmune diseases, cell cycle disorders, or other disorders.
The invention relates to the field of compositions, methods, and apparatus for characterizing polynucleotides and other polymers.
Determining the nucleotide sequence of DNA and RNA in a rapid manner is a major goal of researchers in biotechnology, especially for projects seeking to obtain the sequence of entire genomes of organisms. In addition, rapidly determining the sequence of a polynucleotide is important for identifying genetic mutations and polymorphisms in individuals and populations of individuals.
Nanopore sequencing is one method of rapidly determining the sequence of polynucleotide molecules. Nanopore sequencing is based on physically sensing the individual nucleotides (or physical changes in the environment of the nucleotides, for example, an electric current) within an individual polynucleotide (for example, DNA and RNA) as it traverses a nanopore aperture. In principle, the sequence of a polynucleotide can be determined from a single molecule. However, in practice, it is preferred that a polynucleotide sequence be determined from a statistical average of data obtained from multiple passages of the same molecule or the passage of multiple molecules having the same polynucleotide sequence. The use of membrane channels to characterize polynucleotides as the molecules pass through the small ion channels has been studied by Kasianowicz et al. (Proc. Natl. Acad. Sci. USA. 93:13770-13773, 1996, incorporated herein by reference) by using an electric field to force single stranded RNA and DNA molecules through a 1.5 nanometer diameter nanopore aperture (for example, an ion channel) in a lipid bilayer membrane. The diameter of the nanopore aperture permitted only a single strand of a polynucleotide to traverse the nanopore aperture at any given time. As the polynucleotide traversed the nanopore aperture, the polynucleotide partially blocked the nanopore aperture, resulting in a transient decrease of ionic current. Since the duration of the decrease in current is directly proportional to the length of the polynucleotide, Kasianowicz et al. (1996) were able to determine experimentally lengths of polynucleotides by measuring changes in the ionic current.
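The relationship between blockade duration and polynucleotide length described above can be illustrated with a minimal sketch. It is not the procedure used by Kasianowicz et al.; it is a simple threshold detector written for illustration. The function name `find_blockades`, the 50% threshold, the 100 kHz sampling rate, and the example current values are all assumptions that do not appear in the text.

```python
from typing import List, Tuple


def find_blockades(
    current_pa: List[float],
    open_pore_pa: float,
    threshold_fraction: float = 0.5,    # assumed: blocked if below 50% of open-pore current
    sample_rate_hz: float = 100_000.0,  # assumed sampling rate, not taken from the text
) -> List[Tuple[int, float]]:
    """Return (start_index, duration_in_seconds) for each transient current decrease."""
    threshold = threshold_fraction * open_pore_pa
    events: List[Tuple[int, float]] = []
    start = None
    for i, value in enumerate(current_pa):
        blocked = value < threshold
        if blocked and start is None:
            start = i  # a blockade begins at this sample
        elif not blocked and start is not None:
            events.append((start, (i - start) / sample_rate_hz))
            start = None  # the pore is open again
    if start is not None:
        # the trace ended while the pore was still blocked
        events.append((start, (len(current_pa) - start) / sample_rate_hz))
    return events


# Hypothetical trace: open pore at 120 pA, two blockades at 15 pA,
# the second lasting twice as many samples as the first.
trace = [120.0] * 5 + [15.0] * 10 + [120.0] * 5 + [15.0] * 20 + [120.0] * 5
events = find_blockades(trace, open_pore_pa=120.0)
```

Under the stated proportionality, the second event in this hypothetical trace would correspond to a polynucleotide about twice as long as the first, provided both translocate at the same rate.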
Baldarelli et al. (U.S. Pat. No. 6,015,714) and Church et al. (U.S. Pat. No. 5,795,782) describe the use of nanopores to characterize polynucleotides including DNA and RNA molecules on a monomer by monomer basis. In particular, Baldarelli et al. characterized and sequenced the polynucleotides by passing a polynucleotide through the nanopore aperture. The nanopore aperture is imbedded in a structure or an interface, which separates two media. As the polynucleotide passes through the nanopore aperture, the polynucleotide alters an ionic current by blocking the nanopore aperture. As the individual nucleotides pass through the nanopore aperture, each base/nucleotide alters the ionic current in a manner that allows the identification of the nucleotide transiently blocking the nanopore aperture, thereby allowing one to characterize the nucleotide composition of the polynucleotide and perhaps determine the nucleotide sequence of the polynucleotide.
One disadvantage of previous nanopore analysis techniques is the difficulty of controlling the rate at which the target polynucleotide is analyzed. As described by Kasianowicz, et al. (1996), nanopore analysis is a useful method for performing length determinations of polynucleotides. However, the translocation rate is nucleotide composition dependent and can range from 10^5 to 10^7 nucleotides per second under the measurement conditions outlined by Kasianowicz et al. (1996). Therefore, the correlation between any given polynucleotide's length and its translocation time is not straightforward. It is also anticipated that a higher degree of resolution with regard to both the composition and spatial relationship between nucleotide units within a polynucleotide can be obtained if the translocation rate is substantially reduced.
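The effect of this rate spread on timing can be made concrete with a short arithmetic sketch, taking the quoted range as 10^5 to 10^7 nucleotides per second. The function names and the 100-nucleotide example strand are illustrative assumptions.

```python
def dwell_time_per_nucleotide_us(rate_nt_per_s: float) -> float:
    """Average time one nucleotide spends in the aperture, in microseconds."""
    if rate_nt_per_s <= 0:
        raise ValueError("translocation rate must be positive")
    return 1e6 / rate_nt_per_s


def translocation_time_us(length_nt: int, rate_nt_per_s: float) -> float:
    """Total passage time for a strand of length_nt nucleotides, in microseconds."""
    return length_nt * dwell_time_per_nucleotide_us(rate_nt_per_s)


slow_dwell = dwell_time_per_nucleotide_us(1e5)   # 10 microseconds per nucleotide
fast_dwell = dwell_time_per_nucleotide_us(1e7)   # 0.1 microseconds per nucleotide

# A hypothetical 100-nucleotide strand at each end of the range.
slow_total = translocation_time_us(100, 1e5)
fast_total = translocation_time_us(100, 1e7)
```

For the same strand length, the two ends of the range give passage times that differ by a factor of 100, which is why translocation time alone is an ambiguous measure of length and why sub-microsecond dwell times leave little room to resolve individual nucleotides.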
Recently, the properties of DNA or RNA molecules bound to nucleic acid processing enzymes have been analyzed at a nanopore orifice. The complexes studied include those of single-stranded DNA with Escherichia coli Exonuclease 1 (Hornblower, B.; Coombs, A.; Whitaker, R. D.; Kolomeisky, A.; Picone, S. J.; Meller, A.; Akeson, M. Nat. Methods. 2007, 4, 315-317), RNA with the bacteriophage phi8 ATPase (Astier, Y.; Kainov, D. E.; Bayley, H.; Tuma, R.; Howorka, S. Chemphyschem. 2007, 8, 2189-2194), and primer/template DNA substrates bound to the 3′-5′-exonuclease deficient versions of two A-family DNA polymerases, the Klenow fragment of E. coli DNA polymerase (KF(exo-)) and bacteriophage T7 DNA polymerase (T7DNAP(exo-)) (Benner, S.; Chen, R. J.; Wilson, N. A.; Abu-Shumays, R.; Hurt, N.; Lieberman, K. R.; Deamer, D. W.; Dunbar, W. B.; Akeson, M. Nat. Nanotechnol. 2007, 2, 718-724; Cockroft, S. L.; Chu, J.; Amorin, M.; Ghadiri, M. R. J. Am. Chem. Soc. 2008, 130, 818-820; Gyarfas, B.; Olasagasti, F.; Benner, S.; Garalde, D.; Lieberman, K. R.; Akeson, M. ACS. Nano. 2009, 3, 1457-1466; Hurt, N.; Wang, H.; Akeson, M.; Lieberman, K. R. J. Am. Chem. Soc. 2009, 131, 3772-3778; Wilson, N. A.; Abu-Shumays, R.; Gyarfas, B.; Wang, H.; Lieberman, K. R.; Akeson, M.; Dunbar, W. B. ACS. Nano. 2009, 3, 995-1003). We have demonstrated that T7DNAP(exo-) could replicate and advance a DNA template held in the α-hemolysin (α-HL) nanopore against an 80 mV applied potential (Olasagasti, F.; Lieberman, K. R.; Benner, S.; Cherf, G. M.; Dahl, J. M.; Deamer, D. W.; Akeson, M., Nat. Nanotechnol. 2010, advance online publication, doi:10.1038/nnano.2010.2177). However, due to the low stability of the T7DNAP(exo-)-DNA complex under load, diminished signal to noise ratio at 80 mV potential, and the high turnover rate of the polymerase, it was difficult to detect ionic current steps that reported more than three sequential nucleotide additions during replication.
International Patent Application No. PCT/US2008/004467 and related U.S. patent application Ser. Nos. 12/080,684 and 12/459,059 disclose a number of technologies that comprise α-hemolysin nanopores coupled with several exemplary DNA polymerases that may be used with the technologies disclosed herein.
There is currently a need to provide compositions and methods that can be used in characterization of polymers, including polynucleotides and polypeptides, as well as diagnosis and prognosis of diseases and disorders. There is also a need in the art to provide systems and methods that can detect single nucleotides in a timeframe that can be used to distinguish not only between individual nucleotides in a polynucleotide but also the chemical characteristics of the individual nucleotide. In particular there is also a need to provide compositions that are tolerant in vitro to elevated concentrations of salts.
The inventors have surprisingly demonstrated that Phi29 DNA polymerase acts like a molecular brake controlling the movement of a polynucleotide through a pore along the field resulting from an applied voltage. The polymerase is surprisingly capable of controlling the movement of a polynucleotide through a pore in three modes, namely the polymerase mode, the exonuclease mode and the unzipping mode. The polymerase and exonuclease modes are based on the normal activity of the enzyme. When both polymerase and exonuclease activity are inhibited, Phi29 DNA polymerase surprisingly unzips dsDNA when pulled through a nanopore by a strong applied field. This has been termed unzipping mode. Unzipping mode implies that the dsDNA is unzipped above or through the enzyme and, importantly, that the applied field supplies the force required to disrupt the interactions of both strands with the enzyme and to overcome the hydrogen bonds between the hybridised strands. Herein we describe how Phi29 DNA polymerase can act as a molecular brake for a polynucleotide, enabling sufficient controlled movement through a nanopore for sequencing.
Accordingly, the invention provides a method of sequencing a target polynucleotide, comprising:
The method is preferably carried out in one of three modes as follows:
The invention also provides a method of forming a sensor for sequencing a target polynucleotide, comprising: (a) contacting a pore with a Phi29 DNA polymerase in the presence of the target polynucleotide; and (b) applying a voltage across the pore to form a complex between the pore and the polymerase; and thereby forming a sensor for sequencing the target polynucleotide.
The invention also provides a method of increasing the rate of activity of a Phi29 DNA polymerase, comprising: (a) contacting the Phi29 DNA polymerase with a pore in the presence of a polynucleotide; and (b) applying a voltage across the pore to form a complex between the pore and the polymerase; and thereby increasing the rate of activity of a Phi29 DNA polymerase.
The invention also provides use of a Phi29 DNA polymerase to control the movement of a target polynucleotide through a pore.
The invention also provides a kit for sequencing a target polynucleotide comprising (a) a pore and (b) a Phi29 DNA polymerase.
The invention also provides an analysis apparatus for sequencing target polynucleotides in a sample, comprising a plurality of pores and a plurality of Phi29 DNA polymerases.
The invention also provides a system for determining the nucleotide sequence of a polynucleotide in a sample, the system comprising an electrical source, an anode, a cathode, a cis chamber, a trans chamber, wherein the cis and the trans chambers are separated by a thin film, the thin film having a plurality of apertures (pores), wherein each aperture (pore) is between about 0.25 nm and about 4 nm in diameter, a conducting solvent, a processive DNA modifying enzyme, a plurality of dNTP molecules, and a metal ion co-factor.
In one embodiment, the system further comprises at least one species of ddNTP molecule. In another embodiment, the system further comprises an ammeter. In one preferred embodiment, the aperture diameter is about 2 nm. In another embodiment, the conducting solvent is an aqueous solvent. In an alternative embodiment the conducting solvent is a non-aqueous solvent. In another embodiment, the processive DNA modifying enzyme is a DNA polymerase. In another embodiment, the processive DNA modifying enzyme is tolerant to at least 0.6 M monovalent salt. In another embodiment, the concentration of the monovalent salt is at saturation. In another embodiment, the concentration of the monovalent salt is between 0.6 M and at saturation. In another embodiment, the processive DNA modifying enzyme is isolated from a mesophile or a virus naturally infecting a mesophile. In another embodiment, the processive DNA modifying enzyme is isolated from a halophile or a virus naturally infecting a halophile. In another embodiment, the processive DNA modifying enzyme is isolated from an extreme halophile or a virus naturally infecting an extreme halophile.
In another embodiment, the processive DNA modifying enzyme is selected from a bacterium from the group consisting of Haloferax, Halogeometricum, Halococcus, Haloterrigena, Halorubrum, Haloarcula, Halobacterium, Salinivibrio costicola, Halomonas elongata, Halomonas israelensis, Salinibacter ruber, Dunaliella salina, Staphylococcus aureus, Actinopolyspora halophila, Marinococcus halophilus, and S. costicola. In another embodiment, the processive DNA modifying enzyme is selected from the group consisting of phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, and His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CP1 DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof.
In a preferred embodiment, the processive DNA modifying enzyme is phi29 DNA polymerase.
In another embodiment, the processive DNA modifying enzyme is from a moderate halophile, wherein the moderate halophile is selected from the group consisting of Pseudomonas, Flavobacterium, Spirochaeta, Salinivibrio, Arhodomonas, and Dichotomicrobium.
In another embodiment, the invention provides an apparatus for determining the nucleotide sequence of a polynucleotide in a sample, the apparatus comprising an electrical source, an anode, a cathode, a cis chamber, a trans chamber, wherein the cis and the trans chambers are separated by a thin film, the thin film having a plurality of apertures (pores), wherein each aperture (pore) is between about 0.25 nm and about 4 nm in diameter, a conducting solvent, a processive DNA modifying enzyme, a plurality of dNTP molecules, and a metal ion co-factor.
In one embodiment, the apparatus further comprises at least one species of ddNTP molecule. In another embodiment, the apparatus further comprises an ammeter. In one preferred embodiment, the aperture diameter is about 2 nm. In another embodiment, the conducting solvent is an aqueous solvent. In an alternative embodiment the conducting solvent is a non-aqueous solvent. In another embodiment, the processive DNA modifying enzyme is a DNA polymerase. In another embodiment, the processive DNA modifying enzyme is tolerant to at least 0.6 M monovalent salt. In another embodiment, the concentration of the monovalent salt is at saturation. In another embodiment, the concentration of the monovalent salt is between 0.6 M and at saturation. In another embodiment, the processive DNA modifying enzyme is isolated from a mesophile or a virus naturally infecting a mesophile. In another embodiment, the processive DNA modifying enzyme is isolated from a halophile or a virus naturally infecting a halophile. In another embodiment, the processive DNA modifying enzyme is isolated from an extreme halophile or a virus naturally infecting an extreme halophile.
In another embodiment, the processive DNA modifying enzyme is selected from a bacterium from the group consisting of Haloferax, Halogeometricum, Halococcus, Haloterrigena, Halorubrum, Haloarcula, Halobacterium, Salinivibrio costicola, Halomonas elongata, Halomonas israelensis, Salinibacter ruber, Dunaliella salina, Staphylococcus aureus, Actinopolyspora halophila, Marinococcus halophilus, and S. costicola. In another embodiment, the processive DNA modifying enzyme is selected from the group consisting of phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, and His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CP1 DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof.
In a preferred embodiment, the processive DNA modifying enzyme is phi29 DNA polymerase.
In another embodiment, the processive DNA modifying enzyme is from a moderate halophile, wherein the moderate halophile is selected from the group consisting of Pseudomonas, Flavobacterium, Spirochaeta, Salinivibrio, Arhodomonas, and Dichotomicrobium.
In yet another embodiment, the invention provides a device for determining the nucleotide sequence of a polynucleotide in a sample, the device comprising an electrical source, an anode, a cathode, a cis chamber, a trans chamber, wherein the cis and the trans chambers are separated by a thin film, the thin film having a plurality of apertures (pores), wherein each aperture (pore) is between about 0.25 nm and about 4 nm in diameter, a conducting solvent, a processive DNA modifying enzyme, a plurality of dNTP molecules, and a metal ion co-factor.
In one embodiment, the device further comprises at least one species of ddNTP molecule. In another embodiment, the device further comprises an ammeter. In one preferred embodiment, the aperture diameter is about 2 nm. In another embodiment, the conducting solvent is an aqueous solvent. In an alternative embodiment the conducting solvent is a non-aqueous solvent. In another embodiment, the processive DNA modifying enzyme is a DNA polymerase. In another embodiment, the processive DNA modifying enzyme is tolerant to at least 0.6 M monovalent salt. In another embodiment, the concentration of the monovalent salt is at saturation. In another embodiment, the concentration of the monovalent salt is between 0.6 M and at saturation. In another embodiment, the processive DNA modifying enzyme is isolated from a mesophile or a virus naturally infecting a mesophile. In another embodiment, the processive DNA modifying enzyme is isolated from a halophile or a virus naturally infecting a halophile. In another embodiment, the processive DNA modifying enzyme is isolated from an extreme halophile or a virus naturally infecting an extreme halophile.
In another embodiment, the processive DNA modifying enzyme is selected from a bacterium from the group consisting of Haloferax, Halogeometricum, Halococcus, Haloterrigena, Halorubrum, Haloarcula, Halobacterium, Salinivibrio costicola, Halomonas elongata, Halomonas israelensis, Salinibacter ruber, Dunaliella salina, Staphylococcus aureus, Actinopolyspora halophila, Marinococcus halophilus, and S. costicola. In another embodiment, the processive DNA modifying enzyme is selected from the group consisting of phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, and His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CP1 DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof.
In a preferred embodiment, the processive DNA modifying enzyme is phi29 DNA polymerase.
In another embodiment, the processive DNA modifying enzyme is from a moderate halophile, wherein the moderate halophile is selected from the group consisting of Pseudomonas, Flavobacterium, Spirochaeta, Salinivibrio, Arhodomonas, and Dichotomicrobium.
In another embodiment, the invention provides a method for determining the nucleotide sequence of a polynucleotide in a sample, the method comprising the steps of: providing two separate adjacent chambers comprising a liquid medium, an interface between the two chambers, the interface having an aperture (pore) so dimensioned as to allow sequential monomer-by-monomer passage from the cis-side of the channel to the trans-side of the channel of only one polynucleotide strand at a time; providing a processive DNA-modifying enzyme having binding activity for a polynucleotide; providing a polynucleotide in a sample, wherein a portion of the polynucleotide is double-stranded and a portion is single-stranded; introducing the polynucleotide into one of the two chambers; introducing the enzyme into the same chamber; allowing the processive DNA-modifying enzyme to bind to the polynucleotide; applying a potential difference between the two chambers, thereby creating a first polarity, the first polarity causing the single-stranded portion of the polynucleotide to transpose through the aperture (pore) to the trans-side; introducing the enzyme into the same chamber; allowing the enzyme to bind to the polynucleotide; measuring the electrical current through the channel thereby detecting a nucleotide base in the polynucleotide; decreasing the potential difference a first time; allowing the single-stranded portion of the polynucleotide to transpose through the aperture; measuring the change in electrical current; increasing the potential difference; measuring the electrical current through the channel, thereby detecting a particular nucleotide base positioned at the aperture (pore); repeating any one of the steps, thereby determining the nucleotide sequence of the polynucleotide. In one embodiment the method further comprises a step of adding at least one species of ddNTP molecule. In another embodiment the method further comprises wherein the aperture diameter is about 2 nm. 
In another embodiment the method further comprises wherein the liquid medium is an aqueous solvent. In another embodiment the method further comprises wherein the processive DNA modifying enzyme is a DNA polymerase. In another embodiment the method further comprises wherein the processive DNA modifying enzyme is tolerant to at least 0.6 M salt. In another embodiment, the concentration of the monovalent salt is at saturation. In another embodiment, the concentration of the monovalent salt is between 0.6 M and at saturation. In another embodiment the method further comprises wherein the processive DNA modifying enzyme is isolated from a mesophile or a virus naturally infecting a mesophile. In another embodiment the method further comprises wherein the processive DNA modifying enzyme is isolated from a mesophile, a halophile, or an extreme halophile or a virus naturally infecting a mesophile, a halophile, or an extreme halophile.
In another embodiment, the processive DNA modifying enzyme is selected from a bacterium from the group consisting of Haloferax, Halogeometricum, Halococcus, Haloterrigena, Halorubrum, Haloarcula, Halobacterium, Salinivibrio costicola, Halomonas elongata, Halomonas israelensis, Salinibacter ruber, Dunaliella salina, Staphylococcus aureus, Actinopolyspora halophila, Marinococcus halophilus, and S. costicola. In another embodiment, the processive DNA modifying enzyme is selected from the group consisting of phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, and His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CP1 DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof.
In a preferred embodiment, the processive DNA modifying enzyme is phi29 DNA polymerase.
In another embodiment, the processive DNA modifying enzyme is from a moderate halophile, wherein the moderate halophile is selected from the group consisting of Pseudomonas, Flavobacterium, Spirochaeta, Salinivibrio, Arhodomonas, and Dichotomicrobium.
In another embodiment, the invention provides a method of sequencing a polynucleotide, the method comprising a step of including an oligomer, the oligomer comprising at least one abasic nucleotide species. In a preferred embodiment, the oligomer comprises more than one abasic nucleotide. In a more preferred embodiment, the oligomer comprises at least five abasic nucleotides. In an alternative embodiment, the method further comprises a step of including an oligomer comprising a C3 (CPG) spacer.
In another embodiment, the invention provides a method for sequencing a polynucleotide, the method further comprising a step of including a registry oligomer.
In another embodiment, the invention provides a method for sequencing a polynucleotide, the method further comprising a step of including a blocking oligomer. In a preferred embodiment, the blocking oligomer comprises at least 15 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 20 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 25 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 30 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 35 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 40 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 45 nucleotides. In another preferred embodiment, the blocking oligomer comprises at least 50 nucleotides. In an alternative embodiment the blocking oligomer is selected from the group consisting of a 10-mer, a 15-mer, a 20-mer, a 25-mer, a 30-mer, a 31-mer, a 32-mer, a 33-mer, a 34-mer, a 35-mer, a 36-mer, a 37-mer, a 38-mer, a 39-mer, a 40-mer, a 50-mer, or any number of nucleotides therebetween. It may also be desirable to provide a blocking oligomer having more than 50 nucleotides.
In another embodiment the invention provides a method of sequencing a polynucleotide, wherein the polynucleotide has a size in the range of between 10 nucleotides and 50,000 nucleotides. The number of nucleotides in the polynucleotide can be 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 750, 1,000, 1,500, 2,000, 2,500, 3,000, 3,500, 4,000, 4,500, 5,000, 10,000, 15,000, 20,000, 30,000, 40,000, 50,000 or any number therebetween. It may also be desirable to sequence polynucleotides in excess of 50,000 nucleotides.
The invention provides thin film devices, systems, and methods for using the same. The subject devices or systems comprise cis and trans chambers connected by an electrical communication means. The cis and trans chambers are separated by a thin film comprising at least one pore or channel. In one preferred embodiment, the thin film comprises a compound having a hydrophobic domain and a hydrophilic domain. In a more preferred embodiment, the thin film comprises a phospholipid. The devices or systems further comprise a means for applying an electric field between the cis and the trans chambers. The pore or channel is shaped and sized having dimensions suitable for passaging a polymer. In one preferred embodiment the pore or channel accommodates a part but not all of the polymer. In one other preferred embodiment, the polymer is a polynucleotide. In an alternative preferred embodiment, the polymer is a polypeptide. Other polymers provided by the invention include polypeptides, phospholipids, polysaccharides, and polyketides.
In one embodiment, the thin film further comprises a compound having a binding affinity for the polymer. In one preferred embodiment the binding affinity (Ka) is at least 10^6 l/mole. In a more preferred embodiment the Ka is at least 10^8 l/mole. In yet another preferred embodiment the compound is adjacent to at least one pore. In a more preferred embodiment the compound is a channel. In a yet more preferred embodiment the channel has biological activity. In a most preferred embodiment, the compound comprises the pore.
In another embodiment the pore is sized and shaped to allow passage of an activator, wherein the activator is selected from the group consisting of ATP, NAD+, NADP+, diacylglycerol, phosphatidylserine, eicosanoids, retinoic acid, calciferol, ascorbic acid, neuropeptides, enkephalins, endorphins, 4-aminobutyrate (GABA), 5-hydroxytryptamine (5-HT), catecholamines, acetyl CoA, S-adenosylmethionine, and any other biological activator.
In yet another embodiment the pore is sized and shaped to allow passage of a cofactor, wherein the cofactor is selected from the group consisting of Mg2+, Mn2+, Ca2+, ATP, NAD+, NADP+, and any other biological cofactor.
In a preferred embodiment the pore or channel is a pore molecule or a channel molecule and comprises a biological molecule, or a synthetic modified molecule, or altered biological molecule, or a combination thereof. Such biological molecules are, for example, but not limited to, an ion channel, a nucleoside channel, a peptide channel, a sugar transporter, a synaptic channel, a transmembrane receptor, such as GPCRs and the like, a nuclear pore, synthetic variants, chimeric variants, or the like. In one preferred embodiment the biological molecule is α-hemolysin.
In an alternative, the compound comprises non-enzyme biological activity. The compound having non-enzyme biological activity can be, for example, but not limited to, proteins, peptides, antibodies, antigens, nucleic acids, peptide nucleic acids (PNAs), locked nucleic acids (LNAs), morpholinos, sugars, lipids, glycophosphoinositols, lipopolysaccharides or the like. The compound can have antigenic activity. The compound can have selective binding properties whereby the polymer binds to the compound under a particular controlled environmental condition, but not when the environmental conditions are changed. Such conditions can be, for example, but not limited to, change in [H+], change in environmental temperature, change in stringency, change in hydrophobicity, change in hydrophilicity, or the like.
In another embodiment, the invention provides a compound, wherein the compound further comprises a linker molecule, the linker molecule selected from the group consisting of a thiol group, a sulfide group, a phosphate group, a sulfate group, a cyano group, a piperidine group, an Fmoc group, and a Boc group.
In one embodiment the thin film comprises a plurality of pores. In one embodiment the device comprises a plurality of electrodes.
In another embodiment, the invention provides a method for controlling binding of an enzyme to a partially double-stranded polynucleotide complex, the method comprising: providing two separate, adjacent pools of a medium and an interface between the two pools, the interface having a channel so dimensioned as to allow sequential monomer-by-monomer passage from one pool to the other pool of only one polynucleotide at a time; providing an enzyme having binding activity to a partially double-stranded polynucleotide complex; providing a polynucleotide complex comprising a first polynucleotide and a second polynucleotide, wherein a portion of the polynucleotide complex is double-stranded, and wherein the first polynucleotide further comprises a moiety that is incompatible with the second polynucleotide; introducing the polynucleotide complex into one of the two pools; introducing the enzyme into one of the two pools; applying a potential difference between the two pools, thereby creating a first polarity; reversing the potential difference a first time, thereby creating a second polarity; reversing the potential difference a second time to create the first polarity, thereby controlling the binding of the enzyme to the partially double-stranded polynucleotide complex. In a preferred embodiment, the medium is electrically conductive. In a more preferred embodiment, the medium is an aqueous solution. In a preferred embodiment, the moiety is selected from the group consisting of a peptide nucleic acid, a 2′-O-methyl group, a fluorescent compound, a derivatized nucleotide, and a nucleotide isomer. In another preferred embodiment, the method further comprises the steps of measuring the electrical current between the two pools; comparing the electrical current value obtained the first time the first polarity was induced with the electrical current value obtained the second time the first polarity was induced.
In another preferred embodiment the method further comprises the steps of measuring the electrical current between the two pools; comparing the electrical current value obtained at the first time the first polarity was induced with the electrical current value obtained at a later time. In a more preferred embodiment, the enzyme is selected from the group consisting of DNA polymerase, RNA polymerase, endonuclease, exonuclease, DNA ligase, DNase, uracil-DNA glycosidase, kinase, phosphatase, methylase, and acetylase. In another alternative embodiment, the method further comprises the steps of providing at least one reagent that initiates enzyme activity; introducing the reagent to the pool comprising the polynucleotide complex; and incubating the pool at a suitable temperature. In a more preferred embodiment, the reagent is selected from the group consisting of a deoxyribonucleotide and a cofactor. In a yet more preferred embodiment, the deoxyribonucleotide is introduced into the pool prior to introducing the cofactor. In another still more preferred embodiment, the cofactor is selected from the group consisting of Mg2+, Mn2+, Ca2+, ATP, NAD+, and NADP+. In one embodiment the enzyme is introduced into the same pool as the polynucleotide. In an alternative embodiment, the enzyme is introduced into the opposite pool.
The invention herein disclosed provides for devices and methods that can regulate the rate at which an individual polymer in a mixture is acted upon by another compound, for example, an enzyme. The devices and methods are also used to determine the nucleotide base sequence of a polynucleotide. The invention is of particular use in the fields of molecular biology, structural biology, cell biology, molecular switches, molecular circuits, and molecular computational devices, and the manufacture thereof.
In one embodiment the nanopore system can control binding of a molecule to a polymer at a rate of between about 5 Hz and about 2000 Hz. The nanopore system can control binding of a molecule to a polymer at, for example, about 5 Hz, 10 Hz, 15 Hz, 20 Hz, 25 Hz, 30 Hz, 35 Hz, 40 Hz, 45 Hz, 50 Hz, 55 Hz, 60 Hz, 65 Hz, 70 Hz, 75 Hz, 80 Hz, 85 Hz, 90 Hz, 95 Hz, 100 Hz, 110 Hz, 120 Hz, 125 Hz, 130 Hz, 140 Hz, 150 Hz, 160 Hz, 170 Hz, 175 Hz, 180 Hz, 190 Hz, 200 Hz, 250 Hz, 300 Hz, 350 Hz, 400 Hz, 450 Hz, 500 Hz, 550 Hz, 600 Hz, 700 Hz, 750 Hz, 800 Hz, 850 Hz, 900 Hz, 950 Hz, 1000 Hz, 1125 Hz, 1150 Hz, 1175 Hz, 1200 Hz, 1250 Hz, 1300 Hz, 1350 Hz, 1400 Hz, 1450 Hz, 1500 Hz, 1550 Hz, 1600 Hz, 1700 Hz, 1750 Hz, 1800 Hz, 1850 Hz, 1900 Hz, 1950 Hz, or 2000 Hz. In a preferred embodiment, the nanopore system can control binding of a molecule to a polymer at a rate of between about 25 Hz and about 250 Hz. In a more preferred embodiment the nanopore system can control binding of a molecule to a polymer at a rate of between about 45 Hz and about 120 Hz. In a most preferred embodiment the nanopore system can control binding of a molecule to a polymer at a rate of about 50 Hz.
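For orientation, a control rate stated in Hz can be converted into the duration of one control cycle. The following is a minimal sketch only, assuming that one cycle corresponds to one controlled binding event; the function name `control_period_ms` is hypothetical and is not part of the disclosed apparatus.

```python
def control_period_ms(rate_hz: float) -> float:
    """Return the duration of one control cycle, in milliseconds.

    Assumption (not stated in the disclosure): a rate of N Hz means
    N controlled binding events per second, one per cycle.
    """
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_hz


# Lower bound, most preferred value, and upper bound of the stated range.
for rate in (5, 50, 2000):
    print(f"{rate} Hz -> {control_period_ms(rate)} ms per cycle")
```

Under this assumption, the most preferred rate of about 50 Hz corresponds to roughly 20 ms per cycle.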
The invention also provides thin film devices and methods for using the same. The subject devices comprise cis and trans chambers connected by an electrical communication means. The cis and trans chambers are separated by a thin film comprising at least one pore or channel. In one preferred embodiment, the thin film comprises a first compound having a hydrophobic domain and a hydrophilic domain. In a more preferred embodiment, the thin film comprises a phospholipid. The devices further comprise a means for applying an electric field between the cis and the trans chambers. The pore or channel is shaped and sized having dimensions suitable for passaging a polymer. In one preferred embodiment the pore or channel accommodates a part but not all of the polymer. In another preferred embodiment the pore or channel accommodates a monomer part of the polymer but not a dimer part of the polymer. In one other preferred embodiment, the polymer is a polynucleotide. In an alternative preferred embodiment, the polymer is a polypeptide. Other polymers provided by the invention include polypeptides, phospholipids, polysaccharides, and polyketides.
In one embodiment, the thin film further comprises a second compound having a binding affinity for the polymer. In one preferred embodiment the binding affinity (Ka) is at least 10^6 l/mole. In a more preferred embodiment the Ka is at least 10^8 l/mole. In yet another preferred embodiment the compound is adjacent to at least one pore. In a more preferred embodiment the compound is a channel. In a yet more preferred embodiment the channel has biological activity. In a most preferred embodiment, the compound comprises the pore.
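An association constant can be restated as a dissociation constant, which is often easier to compare with working concentrations. The following is a minimal sketch, assuming simple 1:1 binding so that Kd = 1/Ka, and reading the two thresholds above as 10^6 and 10^8 l/mole; the function name is hypothetical.

```python
def dissociation_constant_molar(ka_l_per_mol: float) -> float:
    """Convert an association constant (l/mole) to a dissociation
    constant (mole/l), assuming 1:1 binding so that Kd = 1 / Ka."""
    if ka_l_per_mol <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_l_per_mol


# The two thresholds as read above (an assumption about the garbled source).
for ka in (1e6, 1e8):
    kd_nm = dissociation_constant_molar(ka) * 1e9  # mole/l -> nmole/l
    print(f"Ka = {ka:.0e} l/mole -> Kd = {kd_nm:.0f} nM")
```

On this reading, a Ka of at least 10^6 l/mole corresponds to a Kd of at most about 1 micromolar, and a Ka of at least 10^8 l/mole to a Kd of at most about 10 nanomolar.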
In one embodiment the second compound comprises enzyme activity. The enzyme activity can be, for example, but not limited to, enzyme activity of proteases, kinases, phosphatases, hydrolases, oxidoreductases, isomerases, transferases, methylases, acetylases, ligases, lyases, and the like. In a more preferred embodiment the enzyme activity can be enzyme activity of DNA polymerase, RNA polymerase, endonuclease, exonuclease, DNA ligase, DNase, uracil-DNA glycosidase, kinase, phosphatase, methylase, acetylase, or the like.
In one preferred embodiment, the DNA polymerase is isolated from a halophile microorganism. In an alternative preferred embodiment, the DNA polymerase is a naturally-occurring variant of the DNA polymerase isolated from a halophile microorganism. In an alternative preferred embodiment, the DNA polymerase is a synthetic variant of the DNA polymerase isolated from a halophile microorganism. In yet another alternative preferred embodiment, the DNA polymerase is a synthetic composition having the enzyme properties of the DNA polymerase isolated from a halophile microorganism or alternatively, a naturally-occurring variant of the DNA polymerase isolated from a halophile microorganism. In a more preferred embodiment, the halophile microorganism is an extreme halophile microorganism. In another more preferred embodiment the halophile microorganism is a moderate halophile microorganism.
In another preferred embodiment, the halophile microorganism thrives under environmental conditions selected from the group consisting of temperature equal to or greater than 50° C., pressure equal to or greater than 200 kPa, pH equal to or lower than 6.5, pH equal to or greater than 7.5, and salinity equal to or greater than 0.5M. For example, the temperature can be 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., 85° C., 90° C., 95° C., or 99° C. In another example, the pH can be 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.5, 8.0, 8.5, 9.0, or 9.5.
In an alternative preferred embodiment, the DNA polymerase is isolated from a virus that can infect a halophile microorganism. In an alternative preferred embodiment, the DNA polymerase is a naturally-occurring variant of the DNA polymerase isolated from a virus that can infect a halophile microorganism. In an alternative preferred embodiment, the DNA polymerase is a synthetic variant of the DNA polymerase isolated from a virus that can infect a halophile microorganism. In yet another alternative preferred embodiment, the DNA polymerase is a synthetic composition having the enzyme properties of the DNA polymerase isolated from a virus that can infect a halophile microorganism or alternatively, a naturally-occurring variant of the DNA polymerase isolated from a virus that can infect a halophile microorganism. In a more preferred embodiment, the halophile microorganism is an extreme halophile microorganism. In an alternative embodiment, the virus that can infect a halophile microorganism is infective under environmental conditions selected from the group consisting of temperature equal to or greater than 50° C., pressure equal to or greater than 200 kPa, pH equal to or lower than 6.5, pH equal to or greater than 7.5, and salinity equal to or greater than 0.5M. For example, the temperature can be 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., 85° C., 90° C., 95° C., or 99° C. In another example, the pH can be 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.5, 8.0, 8.5, 9.0, or 9.5.
The second compound can have selective binding properties whereby the polymer binds to the second compound under a particular controlled environmental condition, but not when the environmental conditions are changed. Such conditions can be, for example, but not limited to, change in [H+], change in environmental temperature, change in stringency, change in hydrophobicity, change in hydrophilicity, or the like.
In another embodiment the pore is sized and shaped to allow passage of an activator, wherein the activator is selected from the group consisting of ATP, NAD+, NADP+, diacylglycerol, phosphatidylserine, eicosanoids, retinoic acid, calciferol, ascorbic acid, neuropeptides, enkephalins, endorphins, 4-aminobutyrate (GABA), 5-hydroxytryptamine (5-HT), catecholamines, acetyl CoA, S-adenosylmethionine, and any other biological activator.
In yet another embodiment the pore is sized and shaped to allow passage of a cofactor, wherein the cofactor is selected from the group consisting of Mg2+, Mn2+, Ca2+, ATP, NAD+, NADP+, and any other biological cofactor.
In a preferred embodiment the pore or channel comprises a biological molecule, or a synthetic modified or altered biological molecule. Such biological molecules are, for example, but not limited to, an ion channel, a nucleoside channel, a peptide channel, a sugar transporter, a synaptic channel, a transmembrane receptor, such as GPCRs and the like, a nuclear pore, or the like.
In an alternative, the second compound comprises non-enzyme biological activity. The second compound having non-enzyme biological activity can be, for example, but not limited to, proteins, peptides, antibodies, antigens, nucleic acids, peptide nucleic acids (PNAs), locked nucleic acids (LNAs), morpholinos, sugars, lipids, glycophosphoinositols, lipopolysaccharides or the like.
In another embodiment, the invention provides a third compound, wherein the third compound further comprises a linker molecule, the linker molecule selected from the group consisting of a thiol group, a sulfide group, a phosphate group, a sulfate group, a cyano group, a piperidine group, an Fmoc group, and a Boc group.
In one embodiment the thin film comprises a plurality of pores. In one embodiment the device comprises a plurality of electrodes.
The invention also contemplates a method of binding phi29 DNA polymerase (DNAP) to single-stranded DNA (ss-DNA) and thereby reducing the rate at which the ss-DNA traverses an α-hemolysin nanopore under a 180 mV applied potential. In a preferred embodiment, single-stranded DNA threads through the phi29 DNAP and α-hemolysin nanopore at a rate near one nucleotide per 1-100 ms. In another embodiment, the rate is one nucleotide per 100-1000 ms.
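To give a sense of scale, the per-nucleotide dwell times above translate into total traversal times for a strand of a given length. The following is a minimal sketch, assuming a constant dwell time per nucleotide; the function name and the 100-nucleotide strand length are hypothetical illustrations and are not taken from the disclosure.

```python
def traversal_time_s(n_nucleotides: int, ms_per_nucleotide: float) -> float:
    """Return the time, in seconds, for a strand to traverse the pore,
    assuming every nucleotide takes the same time to pass."""
    if n_nucleotides < 0 or ms_per_nucleotide < 0:
        raise ValueError("inputs must be non-negative")
    return n_nucleotides * ms_per_nucleotide / 1000.0


# Hypothetical 100-nucleotide strand at the bounds of the stated ranges.
for dwell_ms in (1, 100, 1000):
    print(f"{dwell_ms} ms per nucleotide -> {traversal_time_s(100, dwell_ms)} s")
```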
The invention also contemplates a method of using the primer DNA 5′ terminus to protect the template 3′ terminus from digestion by DNA polymerases (DNAP).
The invention also contemplates a method of covalently bonding a C3 (CPG) spacer, followed by an abasic residue on the 3′-terminus and preventing exonucleolytic digestion of the DNA.
The invention also contemplates a method of protecting the primer DNA strand from phi29 DNAP function by binding a modified DNA oligomer adjacent to the primer template junction. In a preferred embodiment, phi29 binds at the oligomer 5′-terminus and capture of this complex on an α-Hemolysin nanopore with 180 mV applied potential removes the oligomer and places phi29 at the primer terminus, after which DNA replication can take place.
The invention also contemplates a method of using a registry oligomer, preferably a modified DNA oligomer, to control where phi29 DNAP binds and sits on the ss-DNA. Capture of these DNAP-DNA complexes on an α-Hemolysin nanopore using a 180 mV applied potential removes the oligomer and allows the ss-DNA to translocate through phi29 DNAP and the α-Hemolysin.
The invention also contemplates a method wherein phi29 DNAP-bound dsDNA unzips in a nanopore by applied voltage (180 mV). In a preferred embodiment, voltage reduction allows re-zipping of the DNA. Restoring the voltage unzips the DNA again and this allows movement of the DNA back and forth through the nanopore.
The invention also contemplates using a blocking oligomer binding at the DNA primer/transcript junction whereby the oligomer is stripped off when captured on a nanopore, and the DNA is subsequently activated for ratcheting through the nanopore.
The invention also contemplates using shorter blocking oligomers and decreasing the time required to strip the blocking oligomer off the DNA. In a preferred embodiment, this allows faster activation of DNA molecules for replication on the nanopore, which increases the throughput of the nanopore for sequencing applications.
The invention also contemplates a method of sequencing a polynucleotide, the method comprising a step of determining the noise level in a signal, the noise level being representative of the identity of the nucleotide inducing the signal compared with the previous nucleotide inducing a signal and the subsequent nucleotide inducing a signal. In a preferred embodiment, the signal is a change in current measured between the two adjacent pools. In a more preferred embodiment, the noise level measured is greater for a nucleotide when the previous nucleotide and/or the subsequent nucleotide are a different nucleotide.
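The noise comparison described above can be illustrated with a short sketch. The disclosure does not specify how the noise level is computed; the example below assumes, purely for illustration, that the current trace has already been divided into per-nucleotide segments and that noise is taken as the population standard deviation of the current within each segment. The function name, the segment boundaries, and the current values are all hypothetical.

```python
from statistics import pstdev


def segment_noise(current_pa, boundaries):
    """Return the noise level of each segment of a current trace.

    current_pa: sequence of current samples (assumed to be in pA).
    boundaries: sorted sample indices; segment k spans
                boundaries[k] up to (not including) boundaries[k + 1].
    Noise is taken here as the population standard deviation, which is
    an assumption and not a method stated in the disclosure.
    """
    return [
        pstdev(current_pa[start:end])
        for start, end in zip(boundaries, boundaries[1:])
    ]


# Made-up trace with two segments of four samples each.
trace = [20.0, 20.2, 19.8, 20.0, 25.0, 27.0, 23.0, 25.0]
noise = segment_noise(trace, [0, 4, 8])
print(noise)
```

In this made-up trace the second segment is noisier than the first; under the preferred embodiment above, such a difference would be weighed together with the neighboring segments when assigning nucleotide identity.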
The invention also contemplates a method of sequencing a polynucleotide, the method comprising the step of including a dNTP at a lower concentration than the other dNTPs, thereby reducing the rate of reaction of the DNAP. In one embodiment, the dNTP is at about one order of magnitude lower in concentration than the other dNTPs. In a more preferred embodiment, the dNTP is at about two orders of magnitude lower in concentration than the other dNTPs.
SEQ ID NO: 1 shows the polynucleotide sequence encoding one subunit of α-hemolysin-E111N/K147N (α-HL-NN; Stoddart et al., PNAS, 2009; 106(19): 7702-7707).
SEQ ID NO: 2 shows the amino acid sequence of one subunit of α-HL-NN.
SEQ ID NO: 3 shows the codon optimised polynucleotide sequence encoding the Phi29 DNA polymerase.
SEQ ID NO: 4 shows the amino acid sequence of the Phi29 DNA polymerase. SEQ ID NOs.: 5 to 28 are the synthetic polynucleotide sequences (templates, oligomers, and blocking oligomers) used in the Examples.
The embodiments disclosed in this document are illustrative and exemplary and are not meant to limit the invention. Other embodiments can be utilized and structural changes can be made without departing from the scope of the claims of the present invention. All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a nanopore” includes a plurality of such nanopores, and a reference to “a signal” is a reference to one or more signals and equivalents thereof, and so forth.
By “polynucleotide” is meant DNA or RNA, including any naturally occurring, synthetic, or modified nucleotide. Nucleotides include, but are not limited to, ATP, dATP, CTP, dCTP, GTP, dGTP, UTP, TTP, dUTP, 5-methyl-CTP, 5-methyl-dCTP, ITP, dITP, 2-amino-adenosine-TP, 2-amino-deoxyadenosine-TP, 2-thiothymidine triphosphate, pyrrolo-pyrimidine triphosphate, 2-thiocytidine as well as the alphathiotriphosphates for all of the above, and 2′-O-methyl-ribonucleotide triphosphates for all the above bases. Modified bases include, but are not limited to, 5-Br-UTP, 5-Br-dUTP, 5-F-UTP, 5-F-dUTP, 5-propynyl dCTP, and 5-propynyl-dUTP.
By “transport property” is meant a property measurable during polymer movement with respect to a nanopore. The transport property may be, for example, a function of the solvent, the polymer, a label on the polymer, other solutes (for example, ions), or an interaction between the nanopore and the solvent or polymer.
A “hairpin structure” is defined as an oligonucleotide having a nucleotide sequence that is about 6 to about 10,000 nucleotides in length, the first half of which nucleotide sequence is at least partially complementary to the second part thereof, thereby causing the polynucleotide to fold onto itself, forming a secondary hairpin structure.
“Identity” or “similarity” refers to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases “percent identity” and “% identity” refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences. “Sequence similarity” refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value therebetween. Identity or similarity can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical or matching nucleotides at positions shared by the polynucleotide sequences. A degree of identity of polypeptide sequences is a function of the number of identical amino acids at positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at positions shared by the polypeptide sequences.
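The position-by-position comparison described above can be sketched as follows. This is a minimal illustration only, assuming the two sequences have already been aligned to equal length with "-" marking gaps, and counting only positions shared by both sequences; the function name and the gap convention are hypothetical and do not represent any particular alignment program.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Return the percentage of shared aligned positions at which the
    two sequences carry the same nucleotide base or amino acid."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only positions occupied in both sequences (no gap in either).
    shared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not shared:
        return 0.0
    matches = sum(1 for a, b in shared if a == b)
    return 100.0 * matches / len(shared)


print(percent_identity("ACGT", "ACGA"))
```

Other conventions, for example counting gapped positions as mismatches, give different values for the same alignment, so the convention used should be stated alongside any reported percentage.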
The term “incompatible” refers to the chemical property of a molecule whereby two molecules or portions thereof cannot interact with one another, physically, chemically, or both. For example, a portion of a polymer comprising nucleotides can be incompatible with a portion of a polymer comprising nucleotides and another chemical moiety, such as for example, a peptide nucleic acid, a 2′-O-methyl group, a fluorescent compound, a derivatized nucleotide, a nucleotide isomer, or the like. In another example, a portion of a polymer comprising amino acid residues can be incompatible with a portion of a polymer comprising amino acid residues and another chemical moiety, such as, for example, a sulfate group, a phosphate group, an acetyl group, a cyano group, a piperidine group, a fluorescent group, a sialic acid group, a mannose group, or the like.
“Alignment” refers to a number of DNA or amino acid sequences aligned by lengthwise comparison so that components in common (such as nucleotide bases or amino acid residues) may be visually and readily identified. The fraction or percentage of components in common is related to the homology or identity between the sequences. Alignments may be used to identify conserved domains and relatedness within these domains. An alignment may suitably be determined by means of computer programs known in the art, such as MACVECTOR software (1999) (Accelrys, Inc., San Diego, Calif.).
The terms “highly stringent” or “highly stringent condition” refer to conditions that permit hybridization of DNA strands whose sequences are highly complementary, wherein these same conditions exclude hybridization of significantly mismatched DNAs. Polynucleotide sequences capable of hybridizing under stringent conditions with the polynucleotides of the present invention may be, for example, variants of the disclosed polynucleotide sequences, including allelic or splice variants, or sequences that encode orthologs or paralogs of presently disclosed polypeptides. Polynucleotide hybridization methods are disclosed in detail by Kashima et al. (1985) Nature 313: 402-404, and Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (“Sambrook”); and by Haymes et al., “Nucleic Acid Hybridization: A Practical Approach”, IRL Press, Washington, D.C. (1985), which references are incorporated herein by reference.
In general, stringency is determined by the incubation temperature, ionic strength of the solution, and concentration of denaturing agents (for example, formamide) used in a hybridization and washing procedure (for a more detailed description of establishing and determining stringency, see below). The degree to which two nucleic acids hybridize under various conditions of stringency is correlated with the extent of their similarity. Thus, similar polynucleotide sequences from a variety of sources, such as within an organism's genome (as in the case of paralogs) or from another organism (as in the case of orthologs) that may perform similar functions can be isolated on the basis of their ability to hybridize with known peptide-encoding sequences. Numerous variations are possible in the conditions and means by which polynucleotide hybridization can be performed to isolate sequences having similarity to sequences known in the art and are not limited to those explicitly disclosed herein. Such an approach may be used to isolate polynucleotide sequences having various degrees of similarity with disclosed sequences, such as, for example, sequences having 60% identity, or more preferably greater than about 70% identity, most preferably 72% or greater identity with disclosed sequences, the resulting sequence having biological activity.
The invention provides a method of sequencing a target polynucleotide. The method comprises (a) contacting the target polynucleotide with a pore and a Phi29 DNA polymerase such that the polymerase controls the movement of the target polynucleotide through the pore and a proportion of the nucleotides in the target polynucleotide interacts with the pore, and (b) measuring the current passing through the pore during each interaction and thereby determining the sequence of the target polynucleotide. Steps (a) and (b) are carried out with a voltage applied across the pore. The target polynucleotide is therefore sequenced using Strand Sequencing.
As discussed above, the Phi29 DNA polymerase acts like a molecular brake controlling the movement of the polynucleotide through the pore along the field resulting from the applied voltage. The method has several advantages. For instance, the target polynucleotide moves through the pore at a rate that is commercially viable yet allows effective sequencing. The method may also be carried out in one of three preferred ways based on the three modes of the Phi29 DNA polymerase. These are discussed in more detail below. Each way includes a method of proof reading the sequence.
The method of the invention is for sequencing a polynucleotide. A polynucleotide, such as a nucleic acid, is a macromolecule comprising two or more nucleotides. The polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides can be naturally occurring or artificial. The nucleotide can be oxidized or methylated. A nucleotide typically contains a nucleobase, a sugar and at least one phosphate group. The nucleobase is typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines and more specifically adenine, guanine, thymine, uracil and cytosine. The sugar is typically a pentose sugar. Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The nucleotide is typically a ribonucleotide or deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate or triphosphate. Phosphates may be attached on the 5′ or 3′ side of a nucleotide.
Nucleotides include, but are not limited to, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP), deoxyuridine triphosphate (dUTP), deoxycytidine monophosphate (dCMP), deoxycytidine diphosphate (dCDP) and deoxycytidine triphosphate (dCTP). The nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP.
A nucleotide may contain a sugar and at least one phosphate group (that is, lack a nucleobase).
The polynucleotide may be single stranded or double stranded. At least a portion of the polynucleotide is preferably double stranded.
The polynucleotide can be a nucleic acid, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The target polynucleotide can comprise one strand of RNA hybridized to one strand of DNA. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers with nucleotide side chains.
The whole or only part of the target nucleic acid sequence may be sequenced using this method. The target polynucleotide can be any length. For example, the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400 or at least 500 nucleotide pairs in length. The polynucleotide can be 1000 or more nucleotide pairs, 5000 or more nucleotide pairs in length or 100000 or more nucleotide pairs in length.
The target polynucleotide is present in any suitable sample. The invention is typically carried out on a sample that is known to contain or suspected to contain the target polynucleotide. Alternatively, the invention may be carried out on a sample to confirm the identity of one or more target polynucleotides whose presence in the sample is known or expected.
The sample may be a biological sample. The invention may be carried out in vitro on a sample obtained from or extracted from any organism or microorganism. The organism or microorganism is typically prokaryotic or eukaryotic and typically belongs to one of the five kingdoms: plantae, animalia, fungi, monera and protista. The invention may be carried out in vitro on a sample obtained from or extracted from any virus. The sample is preferably a fluid sample. The sample typically comprises a body fluid of the patient. The sample may be urine, lymph, saliva, mucus or amniotic fluid but is preferably blood, plasma or serum. Typically, the sample is human in origin, but alternatively it may be from another mammal, such as a commercially farmed animal, for example a horse, cattle, sheep or pig, or a pet such as a cat or dog. Alternatively, a sample of plant origin is typically obtained from a commercial crop, such as a cereal, legume, fruit or vegetable, for example wheat, barley, oats, canola, maize, soya, rice, bananas, apples, tomatoes, potatoes, grapes, tobacco, beans, lentils, sugar cane, cocoa, cotton.
The sample may be a non-biological sample. The non-biological sample is preferably a fluid sample. Examples of a non-biological sample include surgical fluids, water such as drinking water, seawater or river water, and reagents for laboratory tests.
The sample is typically processed prior to being assayed, for example by centrifugation or by passage through a membrane that filters out unwanted molecules or cells, such as red blood cells. The sample may be measured immediately upon being taken. The sample may also be typically stored prior to assay, preferably below −70° C.
A transmembrane pore is a structure that permits hydrated ions driven by an applied potential to flow from one side of the membrane to the other side of the membrane.
Any membrane may be used in accordance with the invention. Suitable membranes are well-known in the art. The membrane is preferably an amphiphilic layer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties.
The membrane is preferably a lipid bilayer. Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies. For example, lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording. Alternatively, lipid bilayers can be used as biosensors to detect the presence of a range of substances. The lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, a planar lipid bilayer, a supported bilayer or a liposome. The lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in International Application No. PCT/GB08/000,563 (published as WO 2008/102121), International Application No. PCT/GB08/004,127 (published as WO 2009/077734) and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
Methods for forming lipid bilayers are known in the art. Suitable methods are disclosed in the Example. Lipid bilayers are commonly formed by the method of Montal and Mueller (Proc. Natl. Acad. Sci. USA., 1972; 69: 3561-3566), in which a lipid monolayer is carried on an aqueous solution/air interface past either side of an aperture which is perpendicular to that interface.
The method of Montal & Mueller is popular because it is a cost-effective and relatively straightforward method of forming good quality lipid bilayers that are suitable for protein pore insertion. Other common methods of bilayer formation include tip-dipping, painting bilayers and patch-clamping of liposome bilayers.
In a preferred embodiment, the lipid bilayer is formed as described in International Application No. PCT/GB08/004,127 (published as WO 2009/077734). Advantageously in this method, the lipid bilayer is formed from dried lipids. In a most preferred embodiment, the lipid bilayer is formed across an opening as described in WO2009/077734 (PCT/GB08/004,127).
In another preferred embodiment, the membrane is a solid state layer. A solid state layer is not of biological origin. In other words, a solid state layer is not derived from or isolated from a biological environment such as an organism or cell, or a synthetically manufactured version of a biologically available structure. Solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si3N4, Al2O3, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses. The solid state layer may be formed from graphene. Suitable graphene layers are disclosed in International Application No. PCT/US2008/010637 (published as WO 2009/035647). The solid state layer may be formed from silicon, silicon nitride, or graphene. The solid state layer may further comprise a solid state pore or a plurality of such pores. The solid state layer or pore may further comprise a linker group compound that is attached by covalent bond. A DNA polymerase may be attached to a solid state layer or solid state pore using a suitable linker group.
The method is typically carried out using (i) an artificial bilayer comprising a pore, (ii) an isolated, naturally-occurring lipid bilayer comprising a pore, or (iii) a cell having a pore inserted therein. The method is preferably carried out using an artificial lipid bilayer. The bilayer may comprise other transmembrane and/or intramembrane proteins as well as other molecules in addition to the pore. Suitable apparatus and conditions are discussed below with reference to the sequencing embodiments of the invention. The method of the invention is typically carried out in vitro.
The transmembrane pore is preferably a transmembrane protein pore. A transmembrane protein pore is a polypeptide or a collection of polypeptides that permits hydrated ions, such as analyte, to flow from one side of a membrane to the other side of the membrane. In the present invention, the transmembrane protein pore is capable of forming a pore that permits hydrated ions driven by an applied potential to flow from one side of the membrane to the other. The transmembrane protein pore preferably permits analyte such as nucleotides to flow from one side of the membrane, such as a lipid bilayer, to the other. The transmembrane protein pore allows a polynucleotide, such as DNA or RNA, to be moved through the pore.
The transmembrane protein pore may be a monomer or an oligomer. The pore is preferably made up of several repeating subunits, such as 6, 7 or 8 subunits. The pore is more preferably a heptameric or octameric pore.
The transmembrane protein pore typically comprises a barrel or channel through which the ions may flow. The subunits of the pore typically surround a central axis and contribute strands to a transmembrane β barrel or channel or a transmembrane α-helix bundle or channel.
The barrel or channel of the transmembrane protein pore typically comprises amino acids that facilitate interaction with analyte, such as nucleotides, polynucleotides or nucleic acids. These amino acids are preferably located near a constriction of the barrel or channel. The transmembrane protein pore typically comprises one or more positively charged amino acids, such as arginine, lysine or histidine, or aromatic amino acids, such as tyrosine or tryptophan. These amino acids typically facilitate the interaction between the pore and nucleotides, polynucleotides or nucleic acids.
Transmembrane protein pores for use in accordance with the invention can be derived from β-barrel pores or α-helix bundle pores. β-barrel pores comprise a barrel or channel that is formed from β-strands. Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin, anthrax toxin and leukocidins, and outer membrane proteins/porins of bacteria, such as Mycobacterium smegmatis porin (Msp), for example MspA, outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP). α-helix bundle pores comprise a barrel or channel that is formed from α-helices. Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and α outer membrane proteins, such as WZA and ClyA toxin. The transmembrane pore may be derived from Msp or from α-hemolysin (α-HL).
To overcome the limitations disclosed above (low stability of the T7DNAP(exo-)-DNA complex under load, diminished signal to noise ratio at 80 mV potential, and the high turnover rate of the polymerase), we examined other DNA-modifying enzymes whose structural and functional properties might facilitate processive catalysis when positioned at a nanopore orifice. An attractive candidate was the bacteriophage phi29 DNA polymerase (phi29 DNAP) (Blanco, L.; Salas, M. J. Biol. Chem. 1996, 271, 8509-8512; Salas, M.; Blanco, L.; Lázaro, J. M.; de Vega, M. IUBMB Life 2008, 60, 82-85). This DNA-dependent DNA replicase from the B family of DNA polymerases contains both 5′-3′ polymerase and 3′-5′ exonuclease functions within a single ˜66.5 kDa protein chain. Following an initial protein-primed stage that ensures the integrity of the ends of the bacteriophage phi29 linear chromosome, phi29 DNAP transitions to a DNA-primed stage and replicates the entire 19.2 kilobase bacteriophage genome without the need for accessory proteins such as sliding clamps or helicases (Salas, M. Annu. Rev. Biochem. 1991, 60, 39-71). This highly processive polymerase can catalyze the replication of at least 70 kilobases of DNA in vitro following a single binding event to a DNA-primed substrate (Blanco, L.; Bernad, A.; Lázaro, J. M.; Martin, G.; Garmendia, C.; Salas, M. J. Biol. Chem. 1989, 264, 8935-8940).
Crystal structures of phi29 DNAP revealed the structural basis of this remarkable processivity. The polymerase domain of phi29 DNAP shares the conserved architecture of palm, fingers and thumb sub-domains that resembles a partially open right hand. In addition, a 32 amino acid beta-hairpin insert that is unique to protein-primed DNA polymerases, together with the palm and thumb sub-domains, encircles the primer-template DNA, suggesting that this structure enhances processivity in a manner similar to that achieved by sliding clamp proteins (Johnson, A.; O'Donnell, M. Annu. Rev. Biochem. 2005, 74, 283-315). This same beta-hairpin also forms part of a tunnel that surrounds the downstream template DNA. These features indicate that the beta hairpin insert contributes to both the strong DNA binding and processivity of phi29 DNAP. Consistent with this prediction, deletion of the beta-hairpin results in a mutant phi29 DNAP that displays distributive DNA synthesis activity rather than the processive activity of the wild-type enzyme and a markedly diminished binding affinity for primer-template duplex DNA (Rodriguez, I.; Lázaro, J. M.; Blanco, L.; Kamtekar, S.; Berman, A. J.; Wang, J.; Steitz, T. A.; Salas, M.; de Vega, M. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6407-6412).
Experiments using optical tweezers have shown that phi29 DNAP can advance several hundred nucleotides along a template against applied loads of up to ˜37 pN, suggesting that this enzyme could replicate a DNA template held atop the nanopore. Here we show that phi29 DNAP-DNA complexes are three to four orders of magnitude more stable than KF(exo-)-DNA complexes when captured in an electric field across the α-HL nanopore. DNA substrates in captured complexes were activated for replication by exploiting the 3′-5′ exonuclease activity of wild-type phi29 DNAP to excise a 3′-H terminal residue, yielding a primer strand 3′-OH. In the presence of deoxynucleoside triphosphates (dNTPs), DNA synthesis was initiated, allowing real time detection of numerous sequential nucleotide additions that was limited only by the length of the DNA template.
We have observed processive DNA synthesis on a nanopore in an electric field: phi29 DNAP-DNA complexes remained associated with the nanopore orifice and readily catalyzed sequential nucleotide additions under 180 mV applied potential. This is in sharp contrast to T7DNAP(exo-), which was difficult to retain atop the pore for sequential additions even at lower voltages (see Olasagasti et al. 2010 supra).
The tenacious binding of phi29 DNAP to DNA is highlighted by the different pathways by which this polymerase and KF(exo-) dissociate from DNA atop the nanopore, under conditions that do not permit exonucleolytic degradation of the DNA by phi29 DNAP. While the bond between KF and DNA can be pulled apart at 180 mV within a few milliseconds (
We exploited three features of the phi29 DNAP 3′-5′ exonuclease in this study. First, we found that a 3′-H terminated DNA substrate was degraded more slowly in bulk phase than a 3′-OH terminated substrate (
To our knowledge, this is the first demonstration of discrimination against 3′-H terminated DNA substrates by the 3′-5′ exonuclease activity of phi29 DNAP.
This feature provided protection in the bulk phase against both degradation and ddNMP excision-dependent initiation of primer extension of DNA substrate molecules. This protection in turn afforded a window following the addition of Mg2+ to the nanopore chamber during which we could capture numerous phi29 DNAP-DNA complexes in series in which the primer terminus was intact.
Second, we used the phi29 DNAP exonuclease activity to excise the ddNMP terminus of the DNA substrate in complexes while they were held atop the pore in an electric field. In the presence of dNTPs, the polymerization reaction is highly favored over processive degradation. Therefore, excision of the ddCMP residue to yield a primer strand 3′-OH permitted the subsequent initiation of synthesis from a defined DNA template position.
The excision of the ddNMP terminus may be accelerated in complexes by the electric field force atop the pore, as we observed that the time from complex capture to the initiation of synthesis decreased when voltage was increased (not shown). This voltage-promoted excision would nonetheless differ from the processive exonucleolytic regime induced under conditions of high template tension in optical tweezers experiments, in which processive exonucleolytic cleavage dominated even in the presence of dNTPs. In contrast, while the initiation of synthesis required excision of the ddCMP residue, the polymerization reaction dominated in the nanopore experiments (
Maintenance of a significant pool of intact, unextended DNA substrate in the bulk phase due to the slow exonucleolytic removal of a ddNMP primer terminus allowed us to examine phi29 DNAP-catalyzed synthesis in the nanopore under relatively simple conditions. Nonetheless, due to concerns regarding the slow change in the state of the DNA molecules and potential dNTP substrate depletion in the bulk phase over time, this strategy puts constraints on the time frame in which experiments can be conducted. The use of a more robust means of protecting DNA substrate molecules in the bulk phase, such as the blocking oligomers recently employed with KF(exo-) and T7DNAP(exo-), will extend the utility of this enzyme for both DNA sequencing applications and mechanistic studies of polymerase function using the nanopore.
Third, we used the exonuclease domain to systematically move the DNA strand through the nanopore by excising nucleotides at the 3′ end of the priming strand (
This arrangement and set of biochemical reactions is particularly useful for polynucleotide sequencing because the sequence read of each individual nucleotide can be repeated to confirm the base call, and because the reactions can be performed in a time frame in which useful data are generated.
The results of this study demonstrate that phi29 DNAP has properties ideally suited for moving long strands of DNA through nanoscale pores at a rate that is compatible with reliable base detection and identification. In this study we used only chemically synthesized DNA templates, yet the number of sequential nucleotide additions catalyzed by a single enzyme molecule that could be observed was limited only by DNA template length. Features within current traces, such as the ionic current flicker within binary complex events that can predict ternary complex amplitude (
Here, we describe nanopore analysis of up to 500 DNA templates in single file order using modifications of a blocking oligomer strategy herein described. These optimized blocking oligomers promote pre-loading of phi29 DNAP onto target DNA, while simultaneously protecting the DNA substrate against replication and exonucleolysis in bulk phase for at least five hours. These DNA molecules are activated for replication only when captured on the nanopore.
We have used blocking oligomers to regulate ssDNA movement through the nanopore catalyzed by phi29 DNAP. Improvements were 1) increased protection of p/t DNA from replication and digestion in bulk phase, 2) faster activation of p/t DNA for replication on the nanopore, and 3) forward and reverse ratcheting of DNA through the nanopore. Overall, these improvements increased the throughput of the nanopore for sequencing application to ˜130 analyzed DNA molecules per hour on a single nanopore, and increased the allowable nanopore experiment time to at least 5 hours. In addition, each molecule was analyzed twice by forward and reverse ratcheting through the nanopore. Coupling this method of strand control with a nanopore that can resolve individual nucleotides could potentially allow for sequencing and re-sequencing of the same DNA strands in a nanopore.
Single-channel thin film devices and methods for using the same are provided. The subject devices comprise cis and trans chambers connected by an electrical communication means. At the cis end of the electrical communication means is a horizontal conical aperture sealed with a thin film that includes a single nanopore or channel. The devices further include a means for applying an electric field between the cis and trans chambers. The subject devices find use in applications in which the ionic current through a nanopore or channel is monitored, where such applications include the characterization of naturally occurring ion channels, the characterization of polymeric compounds, and the like.
In particular, the invention provides a novel system comprising a nanopore positioned between the cis and trans chambers and a DNA polymerase isolated from a mesophile, a halophile, or an extreme halophile microorganism. In one preferred embodiment, the DNA polymerase isolated from the mesophile prokaryote is phi29 DNAP protein. In another preferred embodiment, the DNA polymerase comprises a 5′-3′ polymerase and a 3′-5′ exonuclease. In a more preferred embodiment, the halophile microorganism is an extreme halophile microorganism. In the alternative, the DNA polymerase is isolated from a virus that can infect a mesophile, a halophile, or an extreme halophile microorganism.
The DNA polymerase may be active in low salt concentrations, for example less than 0.5M salt, or under high-salt concentrations, for example, at least about 0.5 M, at least about 0.6 M, at least about 1 M, at least about 1.5 M, at least about 2 M, at least about 2.5 M, at least about 3 M, at least about 3.5 M, at least about 4 M, at least about 4.5 M, at least about 5 M, at least about 5.5 M, and at saturation.
The invention also provides a DNA polymerase that may also be active for a significantly longer time than that of a Klenow (exo-) fragment under similar conditions. In one example, the DNA polymerase of the invention can be active for up to 40 seconds compared with a few milliseconds using Klenow (exo-) fragment. This ˜10,000-fold increase in activity is clearly an unexpectedly superior result that would not have been predicted by the prior art in any combination, including T7 DNA polymerase, which is known to be highly processive in bulk phase when bound to thioredoxin but which rapidly dissociates when captured on a nanopore (Olasagasti, F.; Lieberman, K. R.; Benner, S.; Cherf, G. M.; Dahl, J. M.; Deamer, D. W.; Akeson, M., Nat. Nanotechnol. 2010, advance online publication, doi:10.1038/nnano.2010.177). The invention provides a DNA polymerase that may be active for 40 seconds, for 60 seconds, for 120 seconds, for 5 minutes, for 10 minutes, for 15 minutes, for 20 minutes, for 30 minutes, for 45 minutes, for 60 minutes, for 1.5 hours, for 2 hours, for 4 hours, for 8 hours, for 12 hours, for 16 hours, for 20 hours, for 24 hours, for several days, or for several weeks, including more than one month, or even indefinitely. One additional advantage of the invention is that in some instances or circumstances, it is not necessary to provide a step of waiting for a reaction to occur.
In one embodiment, the DNA polymerase activity results in a terminal cascade, a series of discrete ionic current steps.
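As an illustration only, a series of discrete ionic current steps could be picked out of a sampled current trace along the following lines. This is a minimal sketch, not the analysis method of the invention: the function name `find_current_levels`, the tolerance parameter, and the example amplitudes are all hypothetical.

```python
from typing import List, Sequence


def find_current_levels(trace: Sequence[float], tolerance: float) -> List[float]:
    """Split a current trace into plateaus and return the mean of each plateau.

    A new plateau starts whenever a sample differs from the running mean of the
    current plateau by more than `tolerance` (a hypothetical, user-chosen value).
    """
    if not trace:
        return []
    levels: List[float] = []
    segment = [trace[0]]
    for sample in trace[1:]:
        mean = sum(segment) / len(segment)
        if abs(sample - mean) > tolerance:
            # The sample left the current plateau: close it and start a new one.
            levels.append(mean)
            segment = [sample]
        else:
            segment.append(sample)
    levels.append(sum(segment) / len(segment))
    return levels


# Hypothetical trace (arbitrary units) with three plateaus.
example_trace = [30.0, 30.0, 30.0, 25.0, 25.0, 25.0, 20.0, 20.0]
print(find_current_levels(example_trace, tolerance=2.0))
```

A real trace would contain noise and would likely call for filtering before segmentation; the sketch only shows the idea of reducing a trace to a sequence of discrete levels.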
(1) A nanopore device can be used to monitor the turnover of enzymes such as exonucleases and polymerases, which have important applications in DNA sequencing.
(2) A nanopore device can function as a biosensor to monitor the interaction between soluble substances such as enzyme substrates or signaling molecules. Examples include blood components such as glucose, uric acid and urea, hormones such as steroids and cytokines, and pharmaceutical agents that exert their function by binding to receptor molecules.
(3) A nanopore device can monitor in real time the function of important biological structures such as ribosomes, and perform this operation with a single functional unit.
(4) Various scientific and industrial applications exist in which it would be advantageous to use a DNA polymerase that functions efficiently at high salt concentrations. In sequencing, GC compressions can be resolved by using high salt concentrations. In nanopore sequencing, high salt concentration boosts the signal to noise ratio for ionic-current-based nanopore measurements. Salt tolerant DNA polymerases may be found among members of the extreme halophiles, in which salt tolerance is achieved not by exclusion of monovalent ions from the cytosol, but by adapting intracellular machinery to function in elevated salt. As an example of salt tolerance among members of the extreme halophiles, malate dehydrogenase from the archaeal halophile Haloarcula marismortui incorporates a salt-adaptive strategy where the high ionic concentration from the environment is not only tolerated but is incorporated within the protein. Sodium (or potassium) and chloride ions are found incorporated within the molecule itself. When considering viruses that infect extreme halophiles, not only are proteins of the viral capsid exposed directly to the environment, but the proteins of the replication machinery must operate effectively within the elevated salt environment of their archaeal host.
The high salt tolerance of these DNA polymerases may be very useful for various applications in which high salt concentration is an advantage. For example, the polymerases are useful for sequencing in which they provide better resolution of GC-rich compressions. Additionally the polymerases are useful for nanopore sequencing where a high salt concentration will boost the signal to noise ratio for ionic-current-based nanopore measurements.
(A) We have also found that DNA polymerase enzymes with a 3′-5′ exonuclease can digest DNA in the 3′ to 5′ direction, starting from the 3′ terminus. We have found that covalently bonding a C3 (CPG) spacer, followed by an abasic residue, on the 3′-terminus prevents exonucleolytic digestion of the DNA.
(B) Phi29 DNAP-bound dsDNA unzips in a nanopore by applied voltage (180 mV). Voltage reduction allows re-zipping of the DNA. Restoring the voltage unzips the DNA again and this allows movement of the DNA back and forth through the nanopore.
(C) We have found that when a blocking oligomer binds at the DNA p/t junction, the oligomer is stripped off when the complex is captured on a nanopore, and the DNA is subsequently activated for ratcheting through the nanopore. Using shorter blocking oligomers decreases the time required to strip the blocking oligomer off the DNA. This allows us to activate DNA molecules for replication on the nanopore faster, which increases the throughput of the nanopore for sequencing applications.
(D) Noise in a current trace can help identify neighboring monomers along a polymer strand.
(E) Controlled DNA delivery through a nanopore: this is exemplified in Example XVI and in
Single-channel thin film devices and methods for using the same are provided. The subject devices comprise a mixed-signal semiconductor wafer, at least one electrochemical layer, the electrochemical layer comprising a semiconductor material, such as silicon dioxide or the like, wherein the semiconductor material further comprises a surface modifier, such as a hydrocarbon, wherein the electrochemical layer defines a plurality of orifices, the orifices comprising a chamber and a neck and wherein the chamber of the orifices co-localize with a first metal composition of the mixed-signal semiconductor wafer, wherein a portion of the orifice is plugged with a second metal, for example, silver, wherein the second metal is in electronic communication with the first metal, and wherein the orifice further comprises a thin film, such as a phospholipid bilayer, the thin film forming a solvent-impermeable seal at the neck of the orifice, the thin film further comprising a pore, and wherein the orifice encloses an aqueous phase and a gas phase. In a preferred embodiment the metallization layer comprises a metal, or metal alloy, such as, but not limited to, nickel, gold, copper, and aluminum.
Pores for use in accordance with the invention can be β-barrel pores or α-helix bundle pores. β-barrel pores comprise a barrel or channel that is formed from β-sheets. Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin and leukocidins, and outer membrane proteins/porins of bacteria, such as Mycobacterium smegmatis porin A (MspA), MspB, MspC, MspD, outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP). α-helix bundle pores comprise a barrel or channel that is formed from α-helices. Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and outer membrane proteins, such as E. coli Wza and ClyA toxin. Other useful pore proteins may include the NNN-RRK mutant of the MspA monomer that includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
Methods are known in the art for inserting subunits into membranes, such as lipid bilayers. For example, subunits may be suspended in a purified form in a solution containing a lipid bilayer such that it diffuses to the lipid bilayer and is inserted by binding to the lipid bilayer and assembling into a functional state. Alternatively, subunits may be directly inserted into the membrane using the “pick and place” method described in M. A. Holden, H. Bayley. J. Am. Chem. Soc. 2005, 127, 6502-6503 and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
The concentration of pore molecule or channel molecule is sufficient to form a single channel in any of the thin films or bilayers in approximately, for example, fifteen minutes. The time to form such channels can be, for example, between one-half minute and one hour, for example, about one-half minute, one minute, two minutes, three minutes, four minutes, five minutes, seven minutes, ten minutes, fifteen minutes, twenty minutes, twenty five minutes, thirty minutes, thirty five minutes, forty minutes, forty five minutes, fifty minutes, fifty five minutes, sixty minutes, or any time therebetween. The time for formation can be altered by an operator through several factors or parameters, for example, increasing or decreasing the ambient or incubation temperature, increasing or decreasing the concentration of salt in the second solution or first solution, placing a potential difference between the first solution and the second solution that attracts the pore or channel molecule towards the thin film or bilayer, or other methods known to those of skill in the art. The finite state machine can detect and/or sense formation of a single channel in its corresponding bilayer by reacting to the flow of current (ions) through the circuit, the circuit comprising the macroscopic electrode, the second solution, the single nanopore or channel molecule, the first solution, and the metal electrode for any given array element.
Formation of biological channels is a stochastic process. Once a single channel has formed in a given array element bilayer, it is preferred that the chance of a second channel forming therein is reduced or, preferably, eliminated. The probability of second channel insertion can be modulated with the applied potential, that is, the potential difference across the bilayer. Upon sensing a single channel, the finite state machine adjusts the potential on the metal electrode to decrease the possibility of second channel insertion into the same bilayer.
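The per-element control logic described above can be sketched, purely for illustration, as a two-state machine. The class and state names, the current threshold, and both potential values below are hypothetical placeholders; the description does not specify them, nor does it specify the direction or size of the potential adjustment.

```python
from enum import Enum, auto


class ChannelState(Enum):
    WAITING_FOR_CHANNEL = auto()
    SINGLE_CHANNEL = auto()


class ArrayElementController:
    """Hypothetical finite state machine for one array element.

    All numeric defaults are arbitrary placeholders, not values from the text.
    """

    def __init__(self, insertion_mv: float = 120.0, holding_mv: float = 20.0,
                 threshold_pa: float = 50.0) -> None:
        self.insertion_mv = insertion_mv  # potential applied while awaiting a channel
        self.holding_mv = holding_mv      # potential applied once one channel is sensed
        self.threshold_pa = threshold_pa  # current taken to indicate an open channel
        self.state = ChannelState.WAITING_FOR_CHANNEL

    def step(self, current_pa: float) -> float:
        """Consume one current reading and return the potential to apply next."""
        if (self.state is ChannelState.WAITING_FOR_CHANNEL
                and abs(current_pa) >= self.threshold_pa):
            # Ionic current through the circuit signals that a channel has formed.
            self.state = ChannelState.SINGLE_CHANNEL
        if self.state is ChannelState.SINGLE_CHANNEL:
            return self.holding_mv
        return self.insertion_mv


controller = ArrayElementController()
for reading in (0.0, 1.5, 75.0, 74.0):
    print(reading, controller.step(reading), controller.state.name)
```

Keeping one small state machine per array element means each bilayer can be switched to its adjusted potential independently, as soon as its own channel is sensed.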
In an alternative embodiment, each array element may comprise a gold electrode surrounding the orifice. This gold electrode may serve to activate chemical reagents using reduction or oxidation reactions and that can act specifically at the location of a specific orifice.
The nanopore system can be created using state-of-the-art, commercially available 65 nm process technology (for example, from Taiwan Semiconductor Manufacturing Company, Taiwan). A 600×600 array of nanopores can perform 360,000 biochemical reaction and detection/sensing steps at a rate of 1000 Hz. This may enable sequencing of polynucleotides, for example, to proceed at a rate of 360 million bases per second per 1 cm×1 cm die cut from the semiconductor wafer.
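The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes, as an upper bound, that every array element yields one base per reaction and detection step; the function name is illustrative only.

```python
def array_throughput(rows: int, cols: int, step_rate_hz: float) -> float:
    """Return steps per second for a rows x cols array, one step per element per cycle."""
    return rows * cols * step_rate_hz


elements = 600 * 600                                  # array elements per die
bases_per_second = array_throughput(600, 600, 1000.0)  # assumes one base per step
print(f"{elements:,} elements, {bases_per_second:,.0f} bases per second")
```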
Exemplary means for applying an electric field between the cis- and trans-chambers are, for example, electrodes comprising an immersed anode and an immersed cathode, that are connected to a voltage source. Such electrodes can be made from, for example silver chloride, or any other compound having similar physical and/or chemical properties.
Time-dependent transport properties of the nanopore aperture may be measured by any suitable technique. The transport properties may be a function of the medium used to transport the polynucleotide, solutes (for example, ions) in the liquid, the polynucleotide (for example, chemical structure of the monomers), or labels on the polynucleotide. Exemplary transport properties include current, conductance, resistance, capacitance, charge, concentration, optical properties (for example, fluorescence and Raman scattering), and chemical structure. Desirably, the transport property is current.
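Several of the listed electrical properties are related through Ohm's law, so a measured current at a known applied potential also gives conductance and resistance. The following minimal sketch shows the unit bookkeeping; the 36 pA reading is a made-up example value, and the function names are illustrative.

```python
def conductance_ns(current_pa: float, potential_mv: float) -> float:
    """Conductance in nanosiemens: G = I / V, and pA / mV is numerically nS."""
    if potential_mv == 0:
        raise ValueError("applied potential must be non-zero")
    return current_pa / potential_mv


def resistance_gohm(current_pa: float, potential_mv: float) -> float:
    """Resistance in gigaohms: R = V / I, and mV / pA is numerically GΩ."""
    if current_pa == 0:
        raise ValueError("current must be non-zero")
    return potential_mv / current_pa


# Hypothetical reading: 36 pA at an applied potential of 180 mV.
print(conductance_ns(36.0, 180.0), "nS")
print(resistance_gohm(36.0, 180.0), "GΩ")
```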
Exemplary means for detecting the current between the cis and the trans chambers have been described in WO 00/79257, U.S. Pat. Nos. 6,46,594, 6,673,615, 6,627,067, 6,464,842, 6,362,002, 6,267,872, 6,015,714, and 5,795,782 and U.S. Publication Nos. 2004/0121525, 2003/0104428, and 2003/0104428, and can include, but are not limited to, electrodes directly associated with the channel or pore at or near the pore aperture, electrodes placed within the cis and the trans chambers, and insulated glass micro-electrodes. The electrodes may be capable of, but not limited to, detecting ionic current differences across the two chambers or electron tunneling currents across the pore aperture or channel aperture. In another embodiment, the transport property is electron flow across the diameter of the aperture, which may be monitored by electrodes disposed adjacent to or abutting on the nanopore circumference. Such electrodes can be attached to an Axopatch 200B amplifier for amplifying a signal.
Applications and/or uses of the invention disclosed herein may include, but not be limited to the following:
Polynucleotides homologous to other polynucleotides may be identified by hybridization to each other under stringent or under highly stringent conditions. Single-stranded polynucleotides hybridize when they associate based on a variety of well characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. The stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc. present in both the hybridization and wash solutions and incubations (and number thereof), as described in more detail in the references cited above.
Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to polynucleotides and fragments thereof under various conditions of stringency (for example, in Wahl and Berger (1987) Methods Enzymol. 152: 399-407, and Kimmel (1987) Methods Enzymol. 152: 507-511). Estimates of homology are provided by either DNA-DNA or DNA-RNA hybridization under conditions of stringency as is well understood by those skilled in the art (Hames and Higgins, Editors (1985) Nucleic Acid Hybridisation: A Practical Approach, IRL Press, Oxford, U.K.). Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions.
In one embodiment, the invention may be used to perform sequence analysis of polynucleotides. The analyses have an advantage over the prior art and the current art in that a single analysis may be performed at a single site, thereby resulting in considerable cost savings for reagents, substrates, reporter molecules, and the like. Of additional import is the rapidity of the sequencing reaction and the signal generated, thereby resulting in an improvement over the prior art.
Other methods for sequencing nucleic acids are well known in the art and may be used to practice any of the embodiments of the invention. These methods employ enzymes such as the Klenow fragment of DNA polymerase I, SEQUENASE, Taq DNA polymerase and thermostable T7 DNA polymerase (Amersham Pharmacia Biotech, Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Gaithersburg Md.). Preferably, sequence preparation is automated with machines such as the HYDRA microdispenser (Robbins Scientific, Sunnyvale Calif.), MICROLAB 2200 system (Hamilton, Reno Nev.), and the DNA ENGINE thermal cycler (PTC200; MJ Research, Watertown Mass.). Machines used for sequencing include the ABI PRISM 3700, 377 or 373 DNA sequencing systems (PE Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Pharmacia Biotech), and the like. The sequences may be analyzed using a variety of algorithms that are well known in the art and described in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7) and Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853).
Shotgun sequencing is used to generate more sequence from cloned inserts derived from multiple sources. Shotgun sequencing methods are well known in the art and use thermostable DNA polymerases, heat-labile DNA polymerases, and primers chosen from representative regions flanking the polynucleotide molecules of interest. Incomplete assembled sequences are inspected for identity using various algorithms or programs such as CONSED (Gordon (1998) Genome Res. 8: 195-202) that are well known in the art. Contaminating sequences including vector or chimeric sequences or deleted sequences can be removed or restored, respectively, organizing the incomplete assembled sequences into finished sequences.
The sequences of the invention may be extended using various PCR-based methods known in the art. For example, the XL-PCR kit (PE Biosystems), nested primers, and commercially available cDNA or genomic DNA libraries may be used to extend the polynucleotide sequence. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to a target molecule at temperatures from about 55° C. to about 68° C. When extending a sequence to recover regulatory elements, it is preferable to use genomic, rather than cDNA libraries.
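The length and GC-content criteria above are simple to check programmatically. The following is a minimal sketch, not a substitute for primer analysis software: the function names and the example sequence are hypothetical, and the annealing temperature window is not checked because the text does not specify how that temperature is to be estimated.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(seq: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check the stated length (about 22 to 30 nt) and GC (about 50% or more) criteria."""
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Hypothetical 24-nucleotide primer used only to exercise the check.
candidate = "ATGCGCGTACGCTAGCGGATCCGA"
print(len(candidate), gc_fraction(candidate), meets_primer_criteria(candidate))
```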
Use of Polynucleotides with the Invention
A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid, amino acid, and antibody assays. Synthesis of labeled molecules may be achieved using Promega (Madison Wis.) or Amersham Pharmacia Biotech kits for incorporation of a labeled nucleotide such as 32P-dCTP, Cy3-dCTP or Cy5-dCTP or amino acid such as 35S-methionine. Nucleotides and amino acids may be directly labeled with a variety of substances including fluorescent, chemiluminescent, or chromogenic agents, and the like, by chemical conjugation to amines, thiols and other groups present in the molecules using reagents such as BODIPY or FITC (Molecular Probes, Eugene Oreg.).
The polynucleotides, fragments, oligonucleotides, complementary RNA and DNA molecules, and PNAs may be used to detect and quantify altered gene expression; absence, presence, or excess expression of mRNAs; or to monitor mRNA levels during therapeutic intervention. Conditions, diseases or disorders associated with altered expression include idiopathic pulmonary arterial hypertension, secondary pulmonary hypertension, a cell proliferative disorder, particularly anaplastic oligodendroglioma, astrocytoma, oligoastrocytoma, glioblastoma, meningioma, ganglioneuroma, neuronal neoplasm, multiple sclerosis, Huntington's disease, breast adenocarcinoma, prostate adenocarcinoma, stomach adenocarcinoma, metastasizing neuroendocrine carcinoma, nonproliferative fibrocystic and proliferative fibrocystic breast disease, gallbladder cholecystitis and cholelithiasis, osteoarthritis, and rheumatoid arthritis; acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, benign prostatic hyperplasia, bronchitis, Chediak-Higashi syndrome, cholecystitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, chronic granulomatous diseases, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polycystic ovary syndrome, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, severe combined immunodeficiency disease (SCID), Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, hemodialysis, extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infection; a disorder of prolactin production, infertility, including tubal disease, ovulatory defects, and endometriosis, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, an ectopic pregnancy, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a disruption of spermatogenesis, abnormal sperm physiology, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, gynecomastia; actinic keratosis, arteriosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, primary thrombocythemia, complications of cancer, cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In another aspect, the polynucleotide of the invention.
The polynucleotides, fragments, oligonucleotides, complementary RNA and DNA molecules, and PNAs, or fragments thereof, may be used to detect and quantify altered gene expression; absence, presence, or excess expression of mRNAs; or to monitor mRNA levels during therapeutic intervention. Disorders associated with altered expression include akathisia, Alzheimer's disease, amnesia, amyotrophic lateral sclerosis, ataxias, bipolar disorder, catatonia, cerebral palsy, cerebrovascular disease, Creutzfeldt-Jakob disease, dementia, depression, Down's syndrome, tardive dyskinesia, dystonias, epilepsy, Huntington's disease, multiple sclerosis, muscular dystrophy, neuralgias, neurofibromatosis, neuropathies, Parkinson's disease, Pick's disease, retinitis pigmentosa, schizophrenia, seasonal affective disorder, senile dementia, stroke, Tourette's syndrome, and cancers including adenocarcinomas, melanomas, and teratocarcinomas, particularly of the brain. These cDNAs can also be utilized as markers of treatment efficacy against the diseases noted above and other brain disorders, conditions, and diseases over a period ranging from several days to months. The diagnostic assay may use hybridization or amplification technology to compare gene expression in a biological sample from a patient to standard samples in order to detect altered gene expression. Qualitative or quantitative methods for this comparison are well known in the art.
For example, the polynucleotide or probe may be labeled by standard methods and added to a biological sample from a patient under conditions for the formation of hybridization complexes. After an incubation period, the sample is washed and the amount of label (or signal) associated with hybridization complexes is quantified and compared with a standard value. If the amount of label in the patient sample is significantly altered in comparison to the standard value, then the presence of the associated condition, disease or disorder is indicated.
In order to provide a basis for the diagnosis of a condition, disease or disorder associated with gene expression, a normal or standard expression profile is established. This may be accomplished by combining a biological sample taken from normal subjects, either animal or human, with a probe under conditions for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained using normal subjects with values from an experiment in which a known amount of a substantially purified target sequence is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a particular condition, disease, or disorder. Deviation from standard values toward those associated with a particular condition is used to diagnose that condition.
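The comparison of a patient value against standard values can be illustrated with a minimal sketch. It assumes, purely for illustration, that "significantly altered" is judged by the number of standard deviations separating the patient signal from the mean of the normal-subject signals; the function names and the cutoff of 2.0 are hypothetical and are not taken from the specification.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_signals, patient_signal):
    """Return how far the patient signal lies from the normal mean,
    expressed in sample standard deviations of the normal signals."""
    mu = mean(normal_signals)
    sigma = stdev(normal_signals)
    if sigma == 0:
        raise ValueError("normal signals show no variation")
    return (patient_signal - mu) / sigma


def is_altered(normal_signals, patient_signal, z_cutoff=2.0):
    """Flag the patient signal as altered when it deviates from the
    standard by more than z_cutoff (an assumed, illustrative cutoff)."""
    return abs(deviation_from_standard(normal_signals, patient_signal)) > z_cutoff


# Illustrative label intensities (arbitrary units) from normal subjects.
normal = [10.0, 11.0, 9.0, 10.0, 10.0]
print(is_altered(normal, 13.0))
print(is_altered(normal, 10.5))
```

In practice the cutoff and the statistical test would be chosen for the particular assay; the sketch only shows the shape of the comparison.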
Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies and in clinical trials or to monitor the treatment of an individual patient. Once the presence of a condition is established and a treatment protocol is initiated, diagnostic assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate the level that is observed in a normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.
The polynucleotide or a fragment thereof may be used to purify a ligand from a sample. A method for using a polynucleotide or a fragment thereof to purify a ligand would involve combining the polynucleotide or a fragment thereof with a sample under conditions to allow specific binding, detecting specific binding, recovering the bound protein, and using an appropriate agent to separate the polynucleotide from the purified ligand.
In additional embodiments, the polynucleotides may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of polynucleotides that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.
The invention also contemplates variants of the processive DNA polymerase. Such variants may have increased or decreased binding affinity for DNA. Such variants may also have increased or decreased rates of reaction. For example, in the KF, the reactive tyrosine residue may be substituted by, for example, tryptophan.
Amino acid substitutions may be made to a peptide sequence, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions. Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties or similar side-chain volume. The amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality or charge to the amino acids they replace. Alternatively, the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid. Conservative amino acid changes are well-known in the art and may be selected in accordance with the properties of the 20 main amino acids as defined in Table 1 below. Where amino acids have similar polarity, this can also be determined by reference to the hydropathy scale for amino acid side chains in Table 2.
Conservative substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 3 when it is desired to maintain the activity of the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as conservative substitutions.
Similar substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 4 when it is desired to maintain the activity of the protein. Table 4 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as structural and functional substitutions. For example, a residue in column 1 of Table 4 may be substituted with a residue in column 2; in addition, a residue in column 2 of Table 4 may be substituted with the residue of column 1.
Substitutions that are less conservative than those in Table 2 can be selected by picking residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, for example, seryl or threonyl, is substituted for (or by) a hydrophobic residue, for example, leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, for example, lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, for example, glutamyl or aspartyl; or (d) a residue having a bulky side chain, for example, phenylalanine, is substituted for (or by) one not having a side chain, for example, glycine.
The transmembrane protein pore is also preferably derived from α-hemolysin (α-HL). The wild type α-HL pore is formed of seven identical monomers or subunits (i.e. it is heptameric). The sequence of one monomer or subunit of α-hemolysin-NN is shown in SEQ ID NO: 2. The transmembrane protein pore preferably comprises seven monomers each comprising the sequence shown in SEQ ID NO: 2 or a variant thereof. Amino acids 1, 7 to 21, 31 to 34, 45 to 51, 63 to 66, 72, 92 to 97, 104 to 111, 124 to 136, 149 to 153, 160 to 164, 173 to 206, 210 to 213, 217, 218, 223 to 228, 236 to 242, 262 to 265, 272 to 274, 287 to 290 and 294 of SEQ ID NO: 2 form loop regions. Residues 113 and 147 of SEQ ID NO: 2 form part of a constriction of the barrel or channel of α-HL.
In such embodiments, a pore comprising seven proteins or monomers each comprising the sequence shown in SEQ ID NO: 2 or a variant thereof is preferably used in the method of the invention. The seven proteins may be the same (homoheptamer) or different (heteroheptamer).
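As an aid to reading the residue list, the loop regions can be written as inclusive ranges and queried by position. The following is only an illustrative sketch: the ranges are transcribed from the preceding paragraph (the eighth range is taken to be 104 to 111), and the names `LOOP_RANGES` and `is_loop_residue` are hypothetical.

```python
# Inclusive residue ranges of SEQ ID NO: 2 described as loop regions
# (transcribed from the text; single residues are written as (n, n)).
LOOP_RANGES = [
    (1, 1), (7, 21), (31, 34), (45, 51), (63, 66), (72, 72), (92, 97),
    (104, 111), (124, 136), (149, 153), (160, 164), (173, 206),
    (210, 213), (217, 217), (218, 218), (223, 228), (236, 242),
    (262, 265), (272, 274), (287, 290), (294, 294),
]

# Residues described as forming part of the constriction of the barrel.
CONSTRICTION_RESIDUES = {113, 147}


def is_loop_residue(position):
    """Return True if the 1-based position falls in a listed loop region."""
    return any(start <= position <= end for start, end in LOOP_RANGES)


for pos in (8, 113, 239):
    print(pos, is_loop_residue(pos), pos in CONSTRICTION_RESIDUES)
```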
A variant of SEQ ID NO: 2 is a protein that has an amino acid sequence which varies from that of SEQ ID NO: 2 and which retains its pore forming ability. The ability of a variant to form a pore can be assayed using any method known in the art. For instance, the variant may be inserted into a lipid bilayer along with other appropriate subunits and its ability to oligomerise to form a pore may be determined. Methods are known in the art for inserting subunits into membranes, such as lipid bilayers. Suitable methods are discussed above.
The variant may include modifications that facilitate covalent attachment to or interaction with the Phi29 DNA polymerase. The variant preferably comprises one or more reactive cysteine residues that facilitate attachment to the nucleic acid binding protein. For instance, the variant may include a cysteine at one or more of positions 8, 9, 17, 18, 19, 44, 45, 50, 51, 237, 239 and 287 and/or on the amino or carboxy terminus of SEQ ID NO: 2. Preferred variants comprise a substitution of the residue at position 8, 9, 17, 237, 239 and 287 of SEQ ID NO: 2 with cysteine (A8C, T9C, N17C, K237C, S239C or E287C). The variant is preferably any one of the variants described in International Application No. PCT/GB09/001,690 (published as WO 2010/004273), PCT/GB09/001,679 (published as WO 2010/004265) or PCT/GB10/000133 (published as WO 2010/086603).
The variant may also include modifications that facilitate any interaction with nucleotides.
The variant may be a naturally occurring variant which is expressed naturally by an organism, for instance by a Staphylococcus bacterium. Alternatively, the variant may be expressed in vitro or recombinantly by a bacterium such as Escherichia coli. Variants also include non-naturally occurring variants produced by recombinant technology. Over the entire length of the amino acid sequence of SEQ ID NO: 2, a variant will preferably be at least 50% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 2 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid identity over a stretch of 200 or more, for example 230, 250, 270 or 280 or more, contiguous amino acids (“hard homology”). Homology can be determined as discussed above.
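The identity thresholds above can be illustrated with a short calculation. This is a simplified sketch that assumes the two sequences have already been aligned to equal length with `-` marking gaps, and that identity is counted over the residues of the reference sequence; it does not reproduce the homology method referred to in the specification, and the function names are hypothetical.

```python
def percent_identity(ref_aligned, var_aligned):
    """Percent of reference residues matched by the variant.

    Both inputs must be pre-aligned strings of equal length,
    using '-' for gap positions (an assumption of this sketch).
    """
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in ref_aligned if r != "-")
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    matches = sum(
        1 for r, v in zip(ref_aligned, var_aligned) if r == v and r != "-"
    )
    return 100.0 * matches / ref_len


def meets_identity_threshold(ref_aligned, var_aligned, minimum=50.0):
    """Check a variant against a minimum percent identity (default 50%)."""
    return percent_identity(ref_aligned, var_aligned) >= minimum


# Toy ten-residue example: one mismatch at the last position.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV", minimum=95.0))
```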
Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 2 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions. Conservative substitutions may be made as discussed above.
One or more amino acid residues of the amino acid sequence of SEQ ID NO: 2 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 residues may be deleted, or more.
Variants may be fragments of SEQ ID NO: 2. Such fragments retain pore-forming activity. Fragments may be at least 50, 100, 200 or 250 amino acids in length. A fragment preferably comprises the pore-forming domain of SEQ ID NO: 2. Fragments typically include residues 119, 121, 135, 113 and 139 of SEQ ID NO: 2.
One or more amino acids may be alternatively or additionally added to the polypeptides described above. An extension may be provided at the amino terminus or carboxy terminus of the amino acid sequence of SEQ ID NO: 2 or a variant or fragment thereof. The extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids. A carrier protein may be fused to a pore or variant.
As discussed above, a variant of SEQ ID NO: 2 is a subunit that has an amino acid sequence which varies from that of SEQ ID NO: 2 and which retains its ability to form a pore. A variant typically contains the regions of SEQ ID NO: 2 that are responsible for pore formation. The pore forming ability of α-HL, which contains a β-barrel, is provided by β-strands in each subunit. A variant of SEQ ID NO: 2 typically comprises the regions in SEQ ID NO: 2 that form β-strands. The amino acids of SEQ ID NO: 2 that form β-strands are discussed above. One or more modifications can be made to the regions of SEQ ID NO: 2 that form β-strands as long as the resulting variant retains its ability to form a pore. Specific modifications that can be made to the β-strand regions of SEQ ID NO: 2 are discussed above.
A variant of SEQ ID NO: 2 preferably includes one or more modifications, such as substitutions, additions or deletions, within its α-helices and/or loop regions. Amino acids that form α-helices and loops are discussed above.
The variant may be modified to assist its identification or purification as discussed above.
In some embodiments, the transmembrane protein pore is chemically modified. The pore can be chemically modified in any way and at any site. The transmembrane protein pore is preferably chemically modified by attachment of a molecule to one or more cysteines (cysteine linkage), attachment of a molecule to one or more lysines, attachment of a molecule to one or more non-natural amino acids, enzyme modification of an epitope or modification of a terminus. Suitable methods for carrying out such modifications are well-known in the art. The transmembrane protein pore may be chemically modified by the attachment of any molecule. For instance, the pore may be chemically modified by attachment of a dye or a fluorophore.
Any number of the monomers in the pore may be chemically modified. One or more, such as 2, 3, 4, 5, 6, 7, 8, 9 or 10, of the monomers is preferably chemically modified as discussed above.
The reactivity of cysteine residues may be enhanced by modification of the adjacent residues. For instance, the basic groups of flanking arginine, histidine or lysine residues will change the pKa of the cysteine's thiol group to that of the more reactive S− group. The reactivity of cysteine residues may be protected by thiol protective groups such as dTNB. These may be reacted with one or more cysteine residues of the pore before a linker is attached.
The molecule (with which the pore is chemically modified) may be attached directly to the pore or attached via a linker as disclosed in International Application Nos. PCT/GB09/001,690 (published as WO 2010/004273), PCT/GB09/001,679 (published as WO 2010/004265) or PCT/GB10/000,133 (published as WO 2010/086603).
Any Phi29 DNA polymerase may be used in accordance with the invention. The Phi29 DNA polymerase preferably comprises the sequence shown in SEQ ID NO: 4 or a variant thereof. Wild-type Phi29 DNA polymerase has polymerase and exonuclease activity. It may also unzip double stranded polynucleotides under the correct conditions. Hence, the enzyme may work in three modes. This is discussed in more detail below. A variant of SEQ ID NO: 4 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 4 and which retains polynucleotide binding activity. The variant must work in at least one of the three modes discussed below. Preferably, the variant works in all three modes. The variant may include modifications that facilitate handling of the polynucleotide and/or facilitate its activity at high salt concentrations and/or room temperature. The variant may include Fidelity Systems' TOPO modification, which improves enzyme salt tolerance.
Over the entire length of the amino acid sequence of SEQ ID NO: 4, a variant will preferably be at least 40% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 4 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid identity over a stretch of 200 or more, for example 230, 250, 270 or 280 or more, contiguous amino acids (“hard homology”). Homology is determined as described below. The variant may differ from the wild-type sequence in any of the ways discussed below with reference to SEQ ID NO: 2. The polymerase may be covalently attached to the pore.
These methods are possible because transmembrane protein pores can be used to differentiate nucleotides of similar structure on the basis of the different effects they have on the current passing through the pore. Individual nucleotides can be identified at the single molecule level from their current amplitude when they interact with the pore. The nucleotide is present in the pore if the current flows through the pore in a manner specific for the nucleotide (i.e. if a distinctive current associated with the nucleotide is detected flowing through the pore). Successive identification of the nucleotides in a target polynucleotide allows the sequence of the polynucleotide to be determined. As discussed above, this is Strand Sequencing.
During the interaction between a nucleotide in the single stranded polynucleotide and the pore, the nucleotide affects the current flowing through the pore in a manner specific for that nucleotide. For example, a particular nucleotide will reduce the current flowing through the pore for a particular mean time period and to a particular extent. In other words, the current flowing through the pore is distinctive for a particular nucleotide. Control experiments may be carried out to determine the effect a particular nucleotide has on the current flowing through the pore. Results from carrying out the method of the invention on a test sample can then be compared with those derived from such a control experiment in order to determine the sequence of the target polynucleotide.
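The comparison of test-sample currents with control-experiment results can be sketched as a nearest-level assignment. The sketch assumes, for illustration only, that each nucleotide is characterized by a single mean current from a control experiment and that amplitude alone is used (the mean time period mentioned above is ignored); the current values and names are hypothetical, not measured data.

```python
def call_nucleotide(measured_pa, control_levels_pa):
    """Return the nucleotide whose control current is closest to the
    measured current (both in picoamperes)."""
    return min(
        control_levels_pa,
        key=lambda nt: abs(control_levels_pa[nt] - measured_pa),
    )


def call_sequence(measured_levels_pa, control_levels_pa):
    """Assign a nucleotide to each successive measured current level."""
    return "".join(
        call_nucleotide(level, control_levels_pa) for level in measured_levels_pa
    )


# Hypothetical control levels, chosen only to make the example readable.
controls = {"A": 52.0, "C": 47.5, "G": 44.0, "T": 40.0}
print(call_sequence([51.2, 40.6, 44.3, 47.0], controls))
```

A practical implementation would also account for noise and for overlapping current distributions; the sketch shows only the comparison step.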
The sequencing methods may be carried out using any apparatus that is suitable for investigating a membrane/pore system in which a pore is inserted into a membrane. The method may be carried out using any apparatus that is suitable for transmembrane pore sensing. For example, the apparatus comprises a chamber comprising an aqueous solution and a barrier that separates the chamber into two sections. The barrier has an aperture in which the membrane containing the pore is formed.
The sequencing methods may be carried out using the apparatus described in International Application No. PCT/GB08/000,562.
The methods of the invention involve measuring the current passing through the pore during interaction with the nucleotide(s). Therefore the apparatus also comprises an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore. The methods may be carried out using a patch clamp or a voltage clamp. The methods preferably involve the use of a voltage clamp.
The sequencing methods of the invention involve the measuring of a current passing through the pore during interaction with the nucleotide. Suitable conditions for measuring ionic currents through transmembrane protein pores are known in the art and disclosed in the Example. The method is typically carried out with a voltage applied across the membrane and pore. The voltage used is typically from −400 mV to +400 mV. The voltage used is preferably in a range having a lower limit selected from −400 mV, −300 mV, −200 mV, −150 mV, −100 mV, −50 mV, −20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV. The voltage used is more preferably in the range 100 mV to 240 mV and most preferably in the range of 160 mV to 240 mV. It is possible to increase discrimination between different nucleotides by a pore by using an increased applied potential.
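The nested voltage ranges above can be summarized as a simple lookup. This is an illustrative sketch only; the function name and the tier labels are hypothetical, while the numeric limits are those stated in the preceding paragraph.

```python
def voltage_tier(applied_mv):
    """Classify an applied voltage (in mV) against the stated ranges,
    checking the narrowest range first."""
    if 160 <= applied_mv <= 240:
        return "most preferred"
    if 100 <= applied_mv <= 240:
        return "more preferred"
    if -400 <= applied_mv <= 400:
        return "typical"
    return "outside stated range"


for mv in (180, 120, -50, 500):
    print(mv, voltage_tier(mv))
```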
The sequencing methods are typically carried out in the presence of any alkali metal chloride salt. In the exemplary apparatus discussed above, the salt is present in the aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl) or caesium chloride (CsCl) is typically used. KCl is preferred. The salt concentration is typically from 0.1 to 2.5M, from 0.3 to 1.9M, from 0.5 to 1.8M, from 0.7 to 1.7M, from 0.9 to 1.6M or from 1M to 1.4M. The salt concentration is preferably from 150 mM to 1M. In some alternative embodiments, it may be desirable to include salt at saturating concentrations. Phi29 DNA polymerase surprisingly works under high salt concentrations. The salt concentration is preferably at least 0.3M, such as at least 0.4M or 0.5 M. High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations. Lower salt concentrations may be used if nucleotide detection is carried out in the presence of an enzyme.
The methods are typically carried out in the presence of a buffer. In the exemplary apparatus discussed above, the buffer is present in the aqueous solution in the chamber. Any buffer may be used in the method of the invention. Typically, the buffer is HEPES. Another suitable buffer is Tris-HCl buffer. The methods are typically carried out at a pH of from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5. The pH used is preferably about 7.5.
The methods may be carried out at from 0° C. to 100° C., from 15° C. to 95° C., from 16° C. to 90° C., from 17° C. to 85° C., from 18° C. to 80° C., 19° C. to 70° C., or from 20° C. to 60° C. The methods are typically carried out at room temperature. The methods are optionally carried out at a temperature that supports enzyme function, such as about 37° C.
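The salt, pH and temperature ranges given above can be collected into a single check of proposed run conditions. The sketch below uses only the broadest range stated for each parameter; it is an illustration under that assumption, it does not cover the alternative embodiments that use saturating salt, and the names `RunConditions` and `out_of_range` are hypothetical.

```python
from dataclasses import dataclass

# Broadest (low, high) ranges stated in the text, inclusive.
BROADEST_RANGES = {
    "salt_molar": (0.1, 2.5),
    "ph": (4.0, 12.0),
    "temperature_c": (0.0, 100.0),
}


@dataclass
class RunConditions:
    salt_molar: float     # alkali metal chloride concentration, in M
    ph: float             # pH of the buffered solution
    temperature_c: float  # temperature, in degrees Celsius


def out_of_range(conditions):
    """Return the names of parameters outside the broadest stated ranges."""
    problems = []
    for name, (low, high) in BROADEST_RANGES.items():
        value = getattr(conditions, name)
        if not low <= value <= high:
            problems.append(name)
    return problems


print(out_of_range(RunConditions(salt_molar=0.3, ph=7.5, temperature_c=37.0)))
print(out_of_range(RunConditions(salt_molar=3.0, ph=7.5, temperature_c=120.0)))
```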
As mentioned above, good nucleotide discrimination can be achieved at low salt concentrations if the temperature is increased. In addition to increasing the solution temperature, there are a number of other strategies that can be employed to increase the conductance of the solution, while maintaining conditions that are suitable for enzyme activity. One such strategy is to use the lipid bilayer to divide two different concentrations of salt solution, a low concentration of salt on the enzyme side and a higher concentration on the opposite side. One example of this approach is to use 200 mM of KCl on the cis side of the membrane and 500 mM KCl in the trans chamber. At these conditions, the conductance through the pore is expected to be roughly equivalent to 400 mM KCl under normal conditions, and the enzyme only experiences 200 mM if placed on the cis side. Another possible benefit of using asymmetric salt conditions is the osmotic gradient induced across the pore. This net flow of water could be used to pull nucleotides into the pore for detection. A similar effect can be achieved using a neutral osmolyte, such as sucrose, glycerol or PEG. Another possibility is to use a solution with relatively low levels of KCl and rely on an additional charge carrying species that is less disruptive to enzyme activity.
The target polynucleotide being analysed can be combined with known protecting chemistries to protect the polynucleotide from being acted upon by the binding protein while in the bulk solution. The pore can then be used to remove the protecting chemistry. This can be achieved either by using protecting groups that are unhybridised by the pore, binding protein or enzyme under an applied potential (WO 2008/124107) or by using protecting chemistries that are removed by the binding protein or enzyme when held in close proximity to the pore (J. Am. Chem. Soc. 2010 Dec. 22; 132(50):17961-72).
When the target polynucleotide is contacted with a Phi29 DNA polymerase and pore, the target polynucleotide firstly forms a complex with the Phi29 DNA polymerase. When the voltage is applied across the pore, the target polynucleotide/Phi29 DNA polymerase complex forms a complex with the pore and controls the movement of the polynucleotide through the pore.
As discussed above, wild-type Phi29 DNA polymerase has polymerase and exonuclease activity. It may also unzip double stranded polynucleotides under the correct conditions. Hence, the enzyme may work in three modes. The method may be carried out in one of three preferred ways based on the three modes of the Phi29 DNA polymerase. Each way includes a method of proof reading the sequence. First, the method is preferably carried out using the Phi29 DNA polymerase as a polymerase. In this embodiment, steps (a) and (b) are carried out in the presence of free nucleotides and an enzyme cofactor such that the polymerase moves the target sequence through the pore against the field resulting from the applied voltage. The target sequence moves in the 5′ to 3′ direction. The free nucleotides may be one or more of any of the individual nucleotides discussed above. The enzyme cofactor is a factor that allows the Phi29 DNA polymerase to function either as a polymerase or an exonuclease. The enzyme cofactor is preferably a divalent metal cation. The divalent metal cation is preferably Mg2+, Mn2+, Ca2+ or Co2+. The enzyme cofactor is most preferably Mg2+. The method preferably further comprises (c) removing the free nucleotides such that the polymerase moves the target sequence through the pore with the field resulting from the applied voltage (i.e. in the 3′ to 5′ direction) and a proportion of the nucleotides in the target sequence interacts with the pore and (d) measuring the current passing through the pore during each interaction and thereby proof reading the sequence of the target sequence obtained in step (b), wherein steps (c) and (d) are also carried out with a voltage applied across the pore.
Second, the method is preferably carried out using the Phi29 DNA polymerase as an exonuclease. In this embodiment, steps (a) and (b) are carried out in the absence of free nucleotides and the presence of an enzyme cofactor such that the polymerase moves the target sequence through the pore with the field resulting from the applied voltage. The target sequence moves in the 3′ to 5′ direction. The method preferably further comprises (c) adding free nucleotides such that the polymerase moves the target sequence through the pore against the field resulting from the applied voltage (i.e. in the 5′ to 3′ direction) and a proportion of the nucleotides in the target sequence interacts with the pore and (d) measuring the current passing through the pore during each interaction and thereby proof reading the sequence of the target sequence obtained in step (b), wherein steps (c) and (d) are also carried out with a voltage applied across the pore.
Third, the method is preferably carried out using the Phi29 DNA polymerase in unzipping mode. In this embodiment, steps (a) and (b) are carried out in the absence of free nucleotides and the absence of an enzyme cofactor such that the polymerase controls the movement of the target sequence through the pore with the field resulting from the applied voltage (as it is unzipped). In this embodiment, the polymerase acts like a brake preventing the target sequence from moving through the pore too quickly under the influence of the applied voltage. The method preferably further comprises (c) lowering the voltage applied across the pore such that the target sequence moves through the pore in the opposite direction to that in steps (a) and (b) (i.e. as it re-anneals) and a proportion of the nucleotides in the target sequence interacts with the pore and (d) measuring the current passing through the pore during each interaction and thereby proof reading the sequence of the target sequence obtained in step (b), wherein steps (c) and (d) are also carried out with a voltage applied across the pore.
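The three modes described above differ only in whether free nucleotides and an enzyme cofactor are present during steps (a) and (b). The relationship can be summarized in a small lookup, given here as an illustrative sketch; the names `Mode` and `phi29_mode` are hypothetical, and the combination of free nucleotides without cofactor is treated as undefined because it is not described above.

```python
from typing import NamedTuple, Optional


class Mode(NamedTuple):
    name: str
    movement_relative_to_field: str    # "against" or "with" the applied field
    strand_direction: Optional[str]    # None where the text states no direction


def phi29_mode(free_nucleotides: bool, cofactor: bool) -> Mode:
    """Map the reagents present in steps (a) and (b) to the enzyme mode."""
    if free_nucleotides and cofactor:
        return Mode("polymerase", "against", "5' to 3'")
    if not free_nucleotides and cofactor:
        return Mode("exonuclease", "with", "3' to 5'")
    if not free_nucleotides and not cofactor:
        return Mode("unzipping", "with", None)
    raise ValueError("free nucleotides without cofactor is not a described mode")


print(phi29_mode(free_nucleotides=True, cofactor=True))
print(phi29_mode(free_nucleotides=False, cofactor=False))
```

The proof-reading steps (c) and (d) then reverse the movement: by removing free nucleotides in the first mode, adding them in the second, and lowering the voltage in the third.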
The method of the invention preferably involves a pore derived from MspA and a Phi29 DNA polymerase. The Phi29 DNA polymerase preferably separates a double stranded target polynucleotide and controls the movement of the resulting single stranded polynucleotide through the pore. This embodiment has three unexpected advantages. First, the target polynucleotide moves through the pore at a rate that is commercially viable yet allows effective sequencing. The target polynucleotide moves through the Msp pore more quickly than it does through a hemolysin pore. Second, an increased current range is observed as the polynucleotide moves through the pore allowing the sequence to be determined more easily. Third, a decreased current variance is observed when the specific pore and polymerase are used together thereby increasing the signal-to-noise ratio.
The invention also provides a method of forming a sensor for sequencing a target polynucleotide. The method comprises contacting a pore with a Phi29 DNA polymerase in the presence of the target polynucleotide. A voltage is then applied across the pore to form a complex between the pore and the polymerase. This complex is a sensor for sequencing the target polynucleotide. The method preferably comprises contacting a pore derived from Msp with a Phi29 DNA polymerase in the presence of the target nucleic acid sequence and applying a voltage across the pore to form a complex between the pore and the polymerase. Any of the embodiments discussed above with reference to the sequencing method of the invention equally apply to this method.
The invention further provides a method of increasing the rate of activity of a Phi29 DNA polymerase. The method comprises contacting the Phi29 DNA polymerase with a pore in the presence of a polynucleotide. A voltage is applied across the pore to form a complex between the pore and the polymerase and this increases the rate of activity of the Phi29 DNA polymerase. The method preferably comprises contacting the Phi29 DNA polymerase with a pore derived from Msp in the presence of a nucleic acid sequence and applying a voltage across the pore to form a complex between the pore and the polymerase. Any of the embodiments discussed above with reference to the sequencing method of the invention equally apply to this method.
The present invention also provides kits for sequencing a target polynucleotide. The kits comprise (a) a pore and (b) a Phi29 DNA polymerase. Any of the embodiments discussed above with reference to the sequencing method of the invention equally apply to the kits.
The kit may further comprise the components of a membrane, such as the phospholipids needed to form a lipid bilayer.
The kits of the invention may additionally comprise one or more other reagents or instruments which enable any of the embodiments mentioned above to be carried out. Such reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means to obtain a sample from a subject (such as a vessel or an instrument comprising a needle), means to amplify and/or express polynucleotides, a membrane as defined above or voltage or patch clamp apparatus. Reagents may be present in the kit in a dry state such that a fluid sample resuspends the reagents. The kit may also, optionally, comprise instructions to enable the kit to be used in the method of the invention or details regarding which patients the method may be used for. The kit may, optionally, comprise nucleotides.
The invention also provides an apparatus for sequencing a target polynucleotide. The apparatus comprises a plurality of pores and a plurality of Phi29 DNA polymerases. The apparatus preferably further comprises instructions for carrying out the sequencing method of the invention. The apparatus may be any conventional apparatus for polynucleotide analysis, such as an array or a chip. Any of the embodiments discussed above with reference to the methods of the invention are equally applicable to the apparatus of the invention.
The apparatus is preferably set up to carry out the sequencing method of the invention.
The apparatus preferably comprises:
The invention will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention and not as limitations.
Herein are described several examples to demonstrate the capability of measuring macromolecules and polyanions or polycations.
The D355A, E357A exonuclease-deficient KF (100,000 U ml−1; specific activity 20,000 U mg−1) was from New England Biolabs. Wild-type phi29 DNAP (833,000 U ml−1; specific activity 83,000 U mg−1) was from Enzymatics. DNA oligonucleotides were synthesized at Stanford University Protein and Nucleic Acid Facility and purified by denaturing PAGE.
A 67 mer, 14 base-pair hairpin DNA substrate labeled with 6-FAM at its 5′ end was self-annealed by incubation at 90° C. for four minutes, followed by rapid cooling in ice water. Reactions were conducted with 1 μM annealed hairpin and 0.75 μM phi29 DNAP(exo+) in 10 mM K-Hepes, pH 8.0, 0.3 M KCl, 1 mM EDTA, 1 mM DTT with MgCl2 added to 10 mM when indicated, and dNTPs added at the concentrations indicated. Reactions were incubated at room temperature for the indicated times and were terminated by the addition of buffer-saturated phenol. Following extraction and ethanol precipitation, reaction products were dissolved in 7 M urea, 0.1×TBE and resolved by denaturing electrophoresis on gels containing 18% acrylamide:bisacrylamide (19:1), 7 M urea, 1×TBE. Extension products were visualized on a UVP Gel Documentation system using a Sybr Gold filter. Band intensities were quantified using ImageJ software (NIH).
The nanopore device and insertion of a single α-HL nanopore into a lipid bilayer have been described. Ionic current flux through the α-HL nanopore was measured using an integrating patch clamp amplifier (Axopatch 200B, Molecular Devices) in voltage clamp mode. Data were sampled using an analog-to-digital converter (Digidata 1440A, Molecular Devices) at 100 kHz in whole-cell configuration and filtered at 5 kHz using a low-pass Bessel filter. For voltage clamped experiments, current blockades were measured at the voltages specified in each figure (trans-positive). Experiments were conducted at 23±0.2° C. in buffer containing 10 mM K-Hepes pH 8.0, 1 mM EDTA, 1 mM DTT, 0.3 M or 0.6 M KCl as indicated, and 10 mM MgCl2 where indicated. DNA hairpin substrates were annealed prior to each experiment by heating at 95° C. for 3 minutes and rapidly cooling in an ice bath to prevent intermolecular hybridization.
Active voltage control of DNAP-DNA complexes atop the nanopore was achieved using finite state machine (FSM) logic, which was programmed with LabVIEW software (Version 8, National Instruments) and implemented on an FPGA system (PCI-7831R, National Instruments), as described previously (Benner et al. 2000 supra; Wilson et al. 2009 supra). Details of the FSM logic applied in the experiments shown in
Dwell time and amplitudes for KF(exo-)-DNA binary complexes were quantified using software developed in our laboratory that detects and quantifies the dwell time and amplitude of EBS and terminal current steps of capture events. Current blockades for phi29 DNAP complexes were quantified using Clampfit 10.2 software (Axon Instruments). Dominant IEBS values for phi29 binary and ternary complexes were obtained by using Clampfit software to determine the peaks of all-points amplitude histograms measured for 1 to 5 second windows in the initial segment of capture events.
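The peak-finding step described above can be illustrated with a minimal Python sketch. It is not the Clampfit procedure itself: it simply bins every sample in a window into an all-points histogram and reports the centre of the most populated bin. The function name `dominant_amplitude` and the 0.5 pA bin width are assumptions made for illustration only.

```python
import math
from collections import Counter


def dominant_amplitude(samples_pa, bin_width_pa=0.5):
    """Return the centre (in pA) of the most populated all-points histogram bin.

    samples_pa   -- current samples from one analysis window, in pA
    bin_width_pa -- histogram bin width in pA (assumed value, not from the text)
    """
    if not samples_pa:
        raise ValueError("at least one sample is required")
    # Assign every sample to an integer bin index.
    counts = Counter(math.floor(s / bin_width_pa) for s in samples_pa)
    # Pick the fullest bin; ties are resolved in favour of the lower amplitude.
    best_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return (best_bin + 0.5) * bin_width_pa
```

In use, the samples passed in would be those recorded in the 1 to 5 second window at the start of a capture event.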
To perform nanopore experiments, a single α-HL nanopore is inserted in a lipid bilayer separating two chambers (termed cis and trans) containing buffer solution, and a patch-clamp amplifier applies voltage and measures ionic current (
Binary complexes between phi29 DNAP and DNA substrates can be formed in the absence of the divalent cations required for both 5′-3′ polymerase and 3′-5′ exonuclease activity. When phi29 DNAP-DNA binary complexes were formed with the hairpin substrate in
This model suggests that the interaction between phi29 DNAP and the DNA is strong enough that the DNA secondary structure unzips due to the force pulling on the template strand before the bond between phi29 DNAP and DNA can be broken. It furthermore predicts that reducing the applied voltage during the terminal cascade could allow the DNA duplex to re-anneal while associated with the enzyme and thus reset the phi29 DNAP-DNA complex to its original position on the DNA template strand, indicated by a return to the ˜35 pA state. To test this prediction, we compared the ability of complexes captured in the presence or absence of Mg2+ to recover their original EBS amplitude at 180 mV following a controlled voltage drop. A prerequisite for this comparison is a means to ensure that DNA molecules captured in the presence of Mg2+ are intact, so that the nanopore assay compares their fate only after capture. Thus exonucleolytic cleavage of the primer strand in the bulk phase must be minimized during the course of the experiment.
We tested whether a 3′-H terminus on the DNA substrate inhibited the rate of 3′-5′ exonucleolytic cleavage by phi29 DNAP, in a gel assay comparing degradation of two 67 mer 5′-6-FAM labeled hairpin substrates (
In this experiment, upon capture of a phi29 DNAP-DNA complex at 180 mV, a finite state machine (FSM, see Example IV) monitored ionic current in real time until the downward current steps of the terminal cascade were detected (
Importantly, the dominant amplitude during the 70 mV intervals was ˜10.2 pA, with occasional deflections to ˜8.5 pA, measurably above the 6.8 pA value determined for unbound DNA at 70 mV in a control experiment (FIG. S2). This indicates that the phi29 DNAP complex remained atop the nanopore orifice without dissociating throughout the lower voltage interval, consistent with a model in which hairpin unzipping at 180 mV and refolding at 70 mV occurs when associated with phi29 DNAP atop the pore.
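The voltage-control cycle used in this refolding experiment can be sketched as a two-state machine in Python. This is only an illustrative model, not the LabVIEW/FPGA logic used in the experiments: the 180 mV and 70 mV levels come from the text, while the class name `RefoldFSM`, the 30 pA cascade threshold, and the length of the low-voltage interval are hypothetical values chosen for the example.

```python
from dataclasses import dataclass

HOLD_MV = 180    # capture/probing voltage stated in the text
REFOLD_MV = 70   # reduced voltage stated in the text


@dataclass
class RefoldFSM:
    cascade_threshold_pa: float = 30.0  # assumed: a drop below this marks the terminal cascade
    refold_samples: int = 3             # assumed duration of the 70 mV interval, in samples
    voltage_mv: int = HOLD_MV
    _remaining: int = 0

    def step(self, current_pa: float) -> int:
        """Consume one current sample and return the voltage to apply next."""
        if self.voltage_mv == HOLD_MV:
            # At 180 mV, watch for the downward steps of the terminal cascade.
            if current_pa < self.cascade_threshold_pa:
                self.voltage_mv = REFOLD_MV
                self._remaining = self.refold_samples
        else:
            # At 70 mV, count down the refolding interval, then restore 180 mV.
            self._remaining -= 1
            if self._remaining <= 0:
                self.voltage_mv = HOLD_MV
        return self.voltage_mv
```

Feeding the machine one sample at a time yields the voltage command for the next sample, so a trace that falls from about 35 pA into the cascade triggers a temporary drop to 70 mV followed by a return to 180 mV.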
When the refolding experiment was performed in the presence of 10 mM Mg2+, 16 complexes out of 24 captured in the first 12.5 minutes after the addition of Mg2+ had the ˜35 pA IEBS level indicating they were formed with intact DNA substrate molecules (
Our strategy for detecting DNA synthesis catalyzed by polymerase-DNA complexes held atop the nanopore employs monitoring changes in ionic current as a block of abasic residues in the template strand is drawn into and through the nanopore lumen in single nucleotide increments when the polymerase advances along the template. This approach permits the recognition of sequential Angstrom-scale movements driven by the enzyme.
As a prelude to DNA replication experiments with phi29 DNAP, we established a reference map that related IEBS to the position of a 5 abasic block within the template strand of DNA hairpin substrates (
The IEBS maps for phi29 DNAP binary complexes (blue dots) and ternary complexes (red dots) are shown in
At all positions within the map, IEBS for the binary and ternary complexes were offset from one another. The direction and the scale of the offset depended in part on the position along the map. For example, at position (i) (
The results of the mapping experiments permit a prediction based upon the model proposed for the molecular events that give rise to the terminal cascade (
Results from our laboratory have shown that advance of a DNA template in the α-HL nanopore could be detected at single nucleotide precision during replication by T7DNAP(exo-). However, for the majority of complexes with this enzyme only one or two nucleotide addition cycles could be monitored. To determine if phi29 DNAP was more efficient at catalyzing sequential nucleotide additions on the nanopore, we measured phi29 DNAP-driven displacement of synthetic DNA substrate molecules bearing 5 abasic inserts in their template strands. The map in
The experiment in
Because dGTP can slow the rate of ddCMP excision due to formation of ternary complexes (
In initial nanopore replication experiments under these conditions (
In contrast, when the experiment was conducted in the presence of 20 μM each dATP, dCTP, dTTP and 5 μM dGTP a different ionic current pattern resulted, characterized by a peak at 35.4 pA (
In the gel experiment shown in
We measured the extent of primer extension for the 5′-6-FAM, 3′-H terminated hairpin in the presence of 100 μM each of dGTP, dCTP, dTTP and dATP as a function of time (
To test the model proposed for the ionic current signatures observed in the replication experiment in
When phi29 DNAP complexes formed with this DNA substrate were captured under both of these conditions, an initial period of several seconds occurred during which the dominant current amplitude was ˜31 pA, with oscillations to ˜27 pA (
Experiments using optical tweezers have shown that the rate of replication catalyzed by phi29 DNAP is slowed by tension on the template at forces between ˜20 and ˜37 pN. This result predicts that the rate of phi29 DNAP replication would be influenced by the voltage applied across the nanopore. However, the voltage regime where this would occur is not known.
In anticipation of replicating natural DNA templates in the nanopore, we measured phi29 DNAP-dependent replication of a longer segment within a synthetic DNA hairpin substrate. This hairpin substrate had a starting abasic configuration of 5ab(50,54), and up to 50 nucleotides can be added before the enzyme reaches the abasic block (
We speculated that this oscillating signature corresponds to complexes captured with the ddCMP terminus intact, prior to the ddCMP excision reaction that permits synthesis to ensue, because (i) a similar pattern invariably occurred between capture and synthesis for each successful replication reaction that subsequently traversed the abasic 35.4 pA peak in the experiments shown in
We therefore used the end of this oscillating state as a start point to approximate the time required for phi29 DNAP to traverse the ˜50 nt template segment. We measured from a small but reproducible current dip that occurred just after the oscillation ended (left blue arrow in
Surprisingly for this mesophilic polymerase, replication of the 5ab(50,54) substrate by phi29 DNAP was also detectable in buffer containing 0.6 M KCl (
a: This trace shows six average current levels (i-vi) associated with movement of a DNA strand bearing abasic residues through the alpha-HL pore controlled by phi29 DNA polymerase. The peak-to-peak noise in current level iv is significantly greater than noise in all other levels. This is caused by motion of the template around position iv which probes neighboring positions iii and v. The currents associated with positions iii and v differ markedly from that at position iv; thus the noise around iv is greater, predicting the identity of its neighbors.
b: This trace shows current differences due to strand displacement by ˜3-5 angstrom as a DNA template bearing abasic residues is displaced within phi29 DNA polymerase. In panel (i) absent substrates, the dominant current is 31 pA with current deflections (noise) to about 34 pA, i.e. predicting that the next dominant state caused by strand displacement relative to the sensor will be 34 pA. In panel (ii), the next position (about 3-5 angstrom away from the first) is stabilized by substrates at the predicted 34 pA level. Occasional downward noise spikes to 31 pA confirm the identity of the monomer or monomers that previously occupied the sensor. iii) At a different position along the template strand bound to phi29 DNA polymerase, the dominant current (absent substrates) is 31 pA. Noise deflections to ˜25 pA predict the current that will dominate when the strand is stabilized one nucleotide (˜3-5 angstrom). In the presence of substrates (panel iv), the ˜25 pA level is stabilized confirming the prediction in (iii). Occasional noise spikes from 25 pA to 31 pA in (iv) confirm the identity of the prior monomer or monomers in the sensor.
c: This trace shows replication and attendant 1 nt movement of a DNA template in the nanopore catalyzed by phi29 DNA polymerase. A single abasic reporter in the DNA template causes a large current dynamic range. Here catalysis occurred in the presence of 100 μM each of dATP, dCTP, dTTP, but only 1 μM dGTP. Distinct flicker between some states is due to 3-5 angstrom (1 nt) displacement of the template strand as a dC monomer within the template reaches the catalytic domain of phi29 DNA polymerase but fails to incorporate a dG nucleotide thus returning to the prior state. As in (b), flicker predicts the next stable amplitude. This is highlighted at positions i, ii, and iii. Note at these positions the flicker is asymmetric around the current mean.
d: This trace shows that our ability to predict subsequent ionic current amplitudes is valid for an all DNA template. In this case catalysis occurred in the presence of 100 μM each of dATP, dCTP, dGTP, but only 1 μM dTTP. Flicker from 23 pA to 22 pA at (i) occurs as phi29 DNA polymerase attempts to add dT opposite a templating dA in the catalytic domain. Failure to add dT causes the template to regress to its prior state (1 nt away) under a 180 mV load. Eventually (ii, red arrow) the dT is added, stabilizing the current at 22 pA thus allowing the template to advance further.
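The idea in panels (b) through (d), that asymmetric flicker around a dominant level predicts the next stable amplitude, can be expressed as a short Python sketch. It is a simplified illustration rather than the analysis used for the traces: the median stands in for the dominant current, and the function name `predict_next_level` and the 1 pA deflection tolerance are assumptions.

```python
from statistics import mean, median


def predict_next_level(samples_pa, tolerance_pa=1.0):
    """Return (dominant level, predicted next level) in pA for one current segment.

    Deflections further than tolerance_pa (assumed value) from the dominant
    level are treated as flicker; the more frequent direction of flicker is
    taken as the prediction. Returns None as the prediction if no flicker occurs.
    """
    dominant = median(samples_pa)
    up = [s for s in samples_pa if s - dominant > tolerance_pa]
    down = [s for s in samples_pa if dominant - s > tolerance_pa]
    if not up and not down:
        return dominant, None
    side = up if len(up) >= len(down) else down
    return dominant, mean(side)
```

For a segment dominated by 31 pA with occasional deflections near 34 pA, the function reports 31 pA as the dominant level and roughly 34 pA as the predicted next level, mirroring panel (b)(i).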
We have found that binding phi29 DNA polymerase (DNAP) to single-stranded DNA (ss-DNA) dramatically reduces the rate at which the ss-DNA traverses an α-Hemolysin nanopore under a 180 mV applied potential. Single-stranded DNA threads through the phi29 DNAP and α-Hemolysin nanopore at a rate near one nucleotide per 1-100 ms.
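To put this rate in terms of recorded data, the following Python sketch computes how many samples would be collected per nucleotide. It assumes the 100 kHz sampling rate given in the nanopore methods above also applies to these measurements, which the text does not state; the function name `samples_per_nucleotide` is hypothetical.

```python
def samples_per_nucleotide(sampling_rate_hz: int, dwell_ms: int) -> int:
    """Number of digitized samples recorded while one nucleotide occupies the sensor."""
    return sampling_rate_hz * dwell_ms // 1000


# Assumed acquisition rate: 100 kHz (from the nanopore methods section).
fastest = samples_per_nucleotide(100_000, 1)    # 1 ms per nucleotide
slowest = samples_per_nucleotide(100_000, 100)  # 100 ms per nucleotide
```

Under that assumption, each nucleotide would be represented by between 100 and 10,000 samples before filtering.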
a shows a typical nanopore having a potential difference across the membrane.
We have found that we can protect the primer DNA strand from phi29 DNAP function by binding a modified DNA oligomer adjacent to the primer template junction. Phi29 binds at the oligomer 5′-terminus and capture of this complex on an α-Hemolysin nanopore with 180 mV applied potential removes the oligomer and places phi29 at the primer terminus, after which DNA replication can take place.
a illustrates a typical oligonucleotide binding to the target DNA; ‘X’ represents abasic nucleotides, ‘S’ represents the C3 (CPG) spacer.
We have found that phi29 DNAP can bind and move along ss-DNA. We use a registry oligomer, a modified DNA oligomer, to control where phi29 DNAP binds and sits on the ss-DNA. Capture of these DNAP-DNA complexes on an α-Hemolysin nanopore using a 180 mV applied potential removes the oligomer and allows the ss-DNA to translocate through phi29 DNAP and the α-Hemolysin.
a illustrates a typical registry oligonucleotide binding to the target DNA; ‘X’ represents abasic nucleotides.
We have found that DNA polymerase enzymes with a 3′-5′ exonuclease can digest the 3′ terminus of template DNA. The method uses the primer DNA 5′ terminus to protect the template 3′ terminus from digestion by DNA polymerases (DNAP).
In this experiment, we sequenced a short (˜20 mer) segment of a modified DNA template using enzyme-directed DNA synthesis through a nanopore. Here we captured primer/template (p/t) DNA bound by phi29 DNAP on the nanopore with a 180 mV applied voltage (
Capture of the DNA-enzyme complex reduces the ionic current through the nanopore from 60 pA to ˜24 pA (
Thirty three discrete ionic current amplitudes (plotted in
When 100 nM of a single dNTP (for example, dTTP) is added to the nanopore reaction along with 100 μM all other dNTPs (dATP, dCTP, and dGTP), addition of each dTTP will take roughly 1,000 times longer than the addition of any other dNTP. Therefore the reported ionic current signal will stall at each position from 0-16 in
d-g summarize a set of four experiments where either 100 nM dTTP (
b illustrates features of blocking oligomers designed for use with phi29 DNAP. The DNA substrate is a 23-mer primer annealed to a synthetic 79-mer DNA template. To protect the DNA primer from phi29 DNAP-dependent extension and digestion in bulk phase, a blocking oligomer is annealed immediately adjacent to the DNA primer/template junction. Each blocking oligomer includes a ˜25 nt complement to the template strand (
c illustrates protection of DNA primers using these blocking oligomers. DNA substrates were incubated in nanopore buffer (+10 mM Mg2+) for 5 hours at 23° C. Products were subsequently analyzed by polyacrylamide gel electrophoresis. Absent blocking oligomers, phi29 DNAP digested the DNA primer strands (−dNTP, lane 3) or extended them (+dNTP, lane 4). In contrast, when protected by either of the blocking oligomer constructs, the primer strands were not digested (−dNTP, lanes 6 and 9), nor extended (+dNTP, lanes 7 and 10) by phi29 DNAP.
Our next objective was to remove the blocking oligomer from each DNA template captured in the nanopore. We initially considered a proven strategy wherein active voltage control is used to unzip the blocking oligomer from the DNA template, followed by voltage polarity reversal to drive the newly exposed DNA primer-template junction into the cis well and ‘fish’ for a polymerase. We discovered, however, that active control was unnecessary for this application: When phi29 DNAP was added to the nanopore bath, it formed stable complexes with the DNA substrates in bulk phase that nonetheless could not be enzymatically modified due to the presence of blocking oligomers (
We took advantage of this discovery to pre-bind and then activate DNA substrates at the nanopore and then measure attendant replication of individual polymerase-bound DNA templates (
An ionic current trace typical of 200 replication events from a representative experiment is shown in
Our model to explain this pattern is comprised of five successive stages illustrated in
This model makes two predictions. First, because traversal of the second 35 pA ionic current peak would require DNA replication, this process should stall in the absence of key dNTP substrates. Results consistent with this prediction are described in the supplement (
From these experiments, we inferred that phi29 DNAP can be used to control forward and reverse ratcheting of individual DNA templates through the α-HL pore at single nucleotide precision. However, for nanopore DNA sequencing, it is necessary to determine the error rate of this control process. In other words, when a single DNA molecule is ratcheted through the pore, what is the probability that correct nucleotide registry is lost due to backsliding (examining the same nucleotide position more than once) or due to skipping forward (missing a nucleotide position)? To address this question, we used a standard amplitude map to test the accuracy of the 200 translocation events (
We next determined how often each of the standardized amplitude positions was skipped or repeated as the other 190 DNA templates traversed back and forth through the α-HL pore (
Lastly, we modified the blocking oligomer to increase throughput. This was achieved by reducing the binding sequence of the blocking oligomer from 25 to 15 nucleotides (see
The FPGA/FSM nanopore system can also be used for other enzyme studies. Applying voltage ramps upon capture of DNA/enzyme complexes can produce data to calculate bond energy landscapes using voltage force spectroscopy. Also, DNA's interaction with the pore can be characterized using feedback control of the applied voltage. Regulation of enzyme catalysis can be achieved by applying tension to DNA occupying the pore, counteracting the enzyme's processive force.
Blood samples (2-3 ml) are collected from patients via the pulmonary catheter and stored in EDTA-containing tubes at −80° C. until use. Genomic DNA is extracted from the blood samples using a DNA isolation kit according to the manufacturer's instruction (PUREGENE, Gentra Systems, Minneapolis Minn.). DNA purity is measured as the ratio of the absorbance at 260 and 280 nm (1 cm lightpath; A260/A280) measured with a Beckman spectrophotometer.
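The purity measurement described above reduces to a single ratio, shown here as a minimal Python sketch. The function name `a260_a280_ratio` is hypothetical, and the text does not state an acceptance threshold; a ratio of about 1.8 is the value conventionally associated with pure DNA.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 absorbance ratio for a 1 cm light path."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280
```

For example, readings of 0.90 at 260 nm and 0.50 at 280 nm give a ratio of 1.8.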
A region of a gene from a patient's DNA sample is amplified by PCR using the primers specifically designed for the region. The PCR products are sequenced using methods as disclosed above. SNPs identified in the sequence traces are verified using Phred/Phrap/Consed software and compared with known SNPs deposited in the NCBI SNP databank.
A cDNA library is constructed using RNA isolated from mammalian tissue. The frozen tissue is homogenized and lysed using a POLYTRON homogenizer (Brinkmann Instruments, Westbury N.J.) in guanidinium isothiocyanate solution. The lysates are centrifuged over a 5.7 M CsCl cushion using a SW28 rotor in an L8-70M Ultracentrifuge (Beckman Coulter, Fullerton Calif.) for 18 hours at 25,000 rpm at ambient temperature. The RNA is extracted with acid phenol, pH 4.7, precipitated using 0.3 M sodium acetate and 2.5 volumes of ethanol, resuspended in RNAse-free water, and treated with DNAse at 37° C. RNA extraction and precipitation are repeated as before. The mRNA is isolated with the OLIGOTEX kit (Qiagen, Chatsworth Calif.) and used to construct the cDNA library.
The mRNA is handled according to the recommended protocols in the SUPERSCRIPT plasmid system (Invitrogen). The cDNAs are fractionated on a SEPHAROSE CL4B column (APB), and those cDNAs exceeding 400 bp are ligated into an expression plasmid. The plasmid is subsequently transformed into DH5α competent cells (Invitrogen).
Nucleic acids are isolated from a biological source and applied to a substrate for standard hybridization protocols by one of the following methods. A mixture of target nucleic acids, a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1×TAE [Tris-acetate-ethylenediamine tetraacetic acid (EDTA)] running buffer and transferred to a nylon membrane by capillary transfer using 20×saline sodium citrate (SSC). Alternatively, the target nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library. Target nucleic acids are arranged on a substrate by one of the following methods. In the first method, bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane. The membrane is placed on bacterial growth medium, LB agar containing carbenicillin, and incubated at 37° C. for 16 hours. Bacterial colonies are denatured, neutralized, and digested with proteinase K. Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane.
In the second method, target nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert. Amplified target nucleic acids are purified using SEPHACRYL-400 beads (Amersham Pharmacia Biotech). Purified target nucleic acids are robotically arrayed onto a glass microscope slide (Corning Science Products, Corning N.Y.). The slide is previously coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis Mo.) and cured at 110° C. The arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene).
cDNA probes are made from mRNA templates. Five micrograms of mRNA is mixed with 1 μg random primer (Life Technologies), incubated at 70° C. for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 μl of 1×first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42° C. for 1-2 hours. After incubation, the probe is diluted with 42 μl dH2O, heated to 95° C. for 3 minutes, and cooled on ice. mRNA in the probe is removed by alkaline degradation. The probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 MicroColumn (Amersham Pharmacia Biotech). Probes can be labeled with fluorescent markers, Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia Biotech), in place of the radionucleotide, [32P]dCTP.
Hybridization is carried out at 65° C. in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65° C. for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probes. After incubation at 65° C. for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C. To detect signal produced by a radiolabeled probe hybridized on a membrane, the substrate is exposed to a PHOSPHORIMAGER cassette (Amersham Pharmacia Biotech), and the image is analyzed using IMAGEQUANT data analysis software (Amersham Pharmacia Biotech). To detect signals produced by a fluorescent probe hybridized on a microarray, the substrate is examined by confocal laser microscopy, and images are collected and analyzed using gene expression analysis software.
Molecules complementary to the polynucleotide, or a fragment thereof, are used to detect, decrease, or inhibit gene expression. Although use of oligonucleotides comprising from about 15 to about 30 base pairs is described, the same procedure is used with larger or smaller fragments or their derivatives (for example, peptide nucleic acids, PNAs). Oligonucleotides are designed using OLIGO 4.06 primer analysis software (National Biosciences) and SEQ ID NOs: 1-163. To inhibit transcription by preventing a transcription factor binding to a promoter, a complementary oligonucleotide is designed to bind to the most unique 5′ sequence, most preferably between about 500 to 10 nucleotides before the initiation codon of the open reading frame. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the mammalian protein.
A conjugate comprising a complex of polynucleotide and a binding protein thereof is purified using polyacrylamide gel electrophoresis and used to immunize mice or rabbits. Antibodies are produced using the protocols below. Rabbits are immunized with the complex in complete Freund's adjuvant. Immunizations are repeated at intervals thereafter in incomplete Freund's adjuvant. After a minimum of seven weeks for mouse or twelve weeks for rabbit, antisera are drawn and tested for antipeptide activity. Testing involves binding the peptide to plastic, blocking with 1% bovine serum albumin, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. Methods well known in the art are used to determine antibody titer and the amount of complex formation.
The polynucleotide, or fragments thereof, are labeled with 32P-dCTP, Cy3-dCTP, or Cy5-dCTP (Amersham Pharmacia Biotech), or with BODIPY or FITC (Molecular Probes, Eugene Oreg.), respectively. Similarly, the conjugate comprising a complex of polynucleotide and a binding protein thereof can be labeled with radionuclide or fluorescent probes. Libraries of candidate molecules or compounds previously arranged on a substrate are incubated in the presence of labeled polynucleotide or protein. After incubation under conditions for either a polynucleotide or amino acid molecule, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the ligand is identified. Data obtained using different concentrations of the polynucleotide or protein are used to calculate affinity between the labeled polynucleotide or protein and the bound molecule.
Those skilled in the art will appreciate that various adaptations and modifications of the just-described embodiments can be configured without departing from the scope and spirit of the invention. Other suitable techniques and methods known in the art can be applied in numerous specific modalities by one skilled in the art and in light of the description of the present invention described herein. Therefore, it is to be understood that the invention can be practiced other than as specifically described herein.
The above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The present application claims priority to and benefits of U.S. Provisional Patent Application Ser. No. 61/402,903 entitled “Control of DNA Movement in a Nanopore at One Nucleotide Precision by a Processive Enzyme”, filed 7 Sep. 2010, U.S. Provisional Patent Application Ser. No. 61/574,237 entitled “Methods for Sequencing Single-Stranded Polynucleotides on A Nanopore”, filed 30 Jul. 2011, U.S. Provisional Patent Application Ser. No. 61/574,238 entitled “DNA Primer that Protects DNA Template 3′ Terminus from Exonuclease Digestion”, filed 30 Jul. 2011, U.S. Provisional Patent Application Ser. No. 61/574,236 entitled “Protection of DNA 3′ Termini From Exonucleolytic Digestion Using Abasic DNA and a C3 (CPG) Spacer”, filed 7 Sep. 2010, U.S. Provisional Patent Application Ser. No. 61/574,240 entitled “Activation of Individual DNA Molecules For DNA Replication By Phi29 DNAP Using a Blocking Oligomer and a Protein Nanopore”, filed 30 Jul. 2011, U.S. Provisional Patent Application Ser. No. 61/574,239 entitled “Control of Phi29 DNAP Binding Location Along a ss-DNA Substrate Using a Registry Oligomer”, filed 30 Jul. 2011, U.S. Provisional Patent Application Ser. No. 61/574,235 entitled “Re-Reading DNA Sequence in a Nanopore Using Voltage-Controlled Unzipping and Re-Zipping of the DNA Duplex”, filed 30 Jul. 2011, and U.S. Provisional Patent Application Ser. No. 61/574,233 entitled “Shorter Blocking Oligomers Allowing Faster Activation of DNA for Ratcheting Through a Nanopore Using a DNA Polymerase Enzyme”, filed 30 Jul. 2011, which are herein incorporated by reference in their entirety for all purposes.
This invention was made partly using funds from the National Human Genome Research Institute grant number 5RC2NG00553-02. The US Federal Government has certain rights to this invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US11/01552 | 9/7/2011 | WO | 00 | 9/17/2013 |
Number | Date | Country | |
---|---|---|---|
61574240 | Jul 2011 | US | |
61574239 | Jul 2011 | US | |
61574238 | Jul 2011 | US | |
61574237 | Jul 2011 | US | |
61574236 | Jul 2011 | US | |
61574235 | Jul 2011 | US | |
61574233 | Jul 2011 | US | |
61402903 | Sep 2010 | US |