Incorporated by reference in its entirety herein is a computer-readable nucleotide/amino acid sequence listing submitted concurrently herewith and identified as follows: One 3,054 Byte ASCII (Text) file named “2020-11-16_38875-254_SQL.txt,” created on Nov. 16, 2020.
Electrical readout of motions that underlie functions of a native protein might enable many new types of analytical measurement without labeling. For example, monitoring functional fluctuations of an enzyme would provide a rapid and simple way of screening candidate drug molecules. Monitoring the fluctuations of proteins that process biopolymers would reveal information about their composition and conformation.
Electrical readout of enzyme function was demonstrated by Choi et al. (Choi, Moody et al. 2012), who showed that telegraph noise, induced in a carbon nanotube field effect transistor, reflected the functional motion of the enzyme lysozyme when acting on its substrate, peptidoglycan. It was realized that monitoring the fluctuations of processive enzymes, such as DNA polymerase, might thus give a method for sequencing DNA. One example was given in a controversial paper, in which the Huang group (Chen, Lee et al. 2013) claimed to measure electrical fluctuations in a polymerase as nucleotides were incorporated into an extending chain, the signals reporting the sequence of the template being extended with high accuracy. The paper was subsequently retracted (Nature Nanotechnology 8, 452-458 (2013); published online 5 May 2013; corrected after print 11 Jul. 2013 and 28 Aug. 2013; retracted after print 3 Jun. 2015) but illustrates what might be possible if the structural fluctuations of a protein could be monitored by an electrical readout. More significantly, a working realization of this proposal was demonstrated around the same time by the Collins group, who used a carbon nanotube field effect transistor to which a polymerase was tethered (Olsen, Choi et al. 2013). The signals consisted of telegraph noise that was shown to be associated with the opening and closing of the polymerase as nucleotides were incorporated. Importantly, the characteristics of the noise reflected the specific nucleotide that was being incorporated, opening the way to electrical single-molecule readout of DNA sequences.
In Olsen, Choi et al. 2013, fluctuations of the protein were detected indirectly via the electric field fluctuations they generate, the field fluctuations being sensed by a field effect transistor channel in close proximity to the polymerase or lysozyme.
Clearly, it would be desirable to make a more direct electrical connection to the enzyme under test. We have developed a technology called recognition tunneling and have used recognition molecules to bind a protein to at least one of a pair of closely spaced electrodes (Zhang, Song et al. 2017). This approach is illustrated in
Citation of any reference in this section is not to be construed as an admission that such reference is prior art to the present disclosure.
The present disclosure relates to devices, systems and methods for direct electrical measurement of protein activity. In some embodiments, a device is provided, the device comprising: a first and a second electrode, the first and second electrode being separated by a gap; a protein attached to one or both electrodes; wherein the first electrode and the second electrode are configured for contact with a sample to be analyzed.
In some embodiments, the protein is attached to one electrode. In other embodiments, the protein is attached to two electrodes.
In some embodiments, the device further comprises an insulating dielectric layer disposed within the gap.
In some embodiments, the protein is selected from the group consisting of a polymerase, a nuclease, a proteasome, a glycopeptidase, a glycosidase, a kinase, and an endonuclease.
In some embodiments, the protein is attached to one electrode. In other embodiments, the protein is attached to both electrodes.
In some embodiments, the protein is attached to the electrode via a linker.
In some embodiments, the device comprises: a first and a second electrode, the first and second electrode being separated by a gap; a protein attached to one or both electrodes; wherein a current fluctuation is produced when the protein interacts with a chemical entity.
In some embodiments, a device is provided, the device comprising:
wherein the first electrode, the insulating dielectric layer, the second electrode and passivation layer have an opening formed therethrough.
In some embodiments, the device comprises: a first and a second electrode, the first and second electrode being separated by a gap; a protein attached to one or both electrodes; wherein the first electrode and the second electrode have an opening formed therethrough.
In some embodiments, the device comprises: a first and a second electrode, the first and second electrode being co-planar and separated by a gap; a protein attached to one or both electrodes; wherein the first electrode and the second electrode are configured for contact with a sample to be analyzed.
In some embodiments, the device further comprises an insulating dielectric layer disposed within the gap.
In some embodiments, the protein is attached to one electrode. In some aspects of this embodiment, the protein is a polymerase. In some aspects of this embodiment, the polymerase is attached to the electrode via a linker. In some aspects, the polymerase is a biotinylated polymerase. In some aspects, the polymerase is a biotinylated polymerase and is attached to the electrode via streptavidin.
In some embodiments of the device, the first and/or second electrode comprise a metal selected from the group consisting of gold, platinum, palladium, and ruthenium. In some embodiments, the metal is palladium.
In some embodiments, the gap has a width of about 1.0 nm to about 20.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 10.0 nm. In some embodiments, the gap has a width of about 2.0 nm to about 10.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 7.5 nm. In some embodiments, the gap has a width of about 1.0 nm to about 5.0 nm. In some embodiments, the gap has a width of about 4.0 nm to about 5.0 nm. In some embodiments, the gap has a width of about 5.0 nm to about 6.0 nm.
In some embodiments, the device can be used to detect a single molecule.
In some embodiments, a system is provided, the system comprising a device as described herein; a means for introducing a chemical entity that is capable of interacting with the protein; a means for applying a bias between the first and second electrode of value; and a means for monitoring fluctuations that occur as a chemical entity interacts with the protein.
In some embodiments, the bias is between 1 mV and 50 mV.
In some embodiments, the bias is between 1 mV and 100 mV.
In some embodiments, a method is provided, the method comprising (a) providing a system as described herein; (b) contacting the protein with a chemical entity; (c) applying a bias between the first and second electrode of value such that spontaneous fluctuations of the current between the electrodes do not occur; (d) detecting fluctuations that occur as the chemical entity interacts with the protein.
The methods of the disclosure can be used to detect the activity of a single protein molecule. The methods can also be used to sequence a biopolymer. The methods can also be used in drug screening assays. Advantageously, the methods require no labels or special chemistries.
The methods of sequencing a biopolymer provide for long reads (>10 kb), and the polymerase runs at the speed of native polymerase (100 nt/s).
The devices of the disclosure have simple device geometries, which allows for easy scale up.
The present disclosure relates to an array, system and method for sequencing a biopolymer by direct electrical measurements on a single processive protein.
In one embodiment, the present disclosure provides an array for sequencing a biopolymer comprising: an arrangement of a plurality of devices, as described herein. In one aspect of this embodiment, the array is for sequencing DNA.
The present disclosure provides a system for direct measurement of protein activity. The system comprises: (a) an array as described herein; (b) optionally a means for introducing and removing a solution to the array; (c) a means for applying a bias between the first and second electrode; and (d) a means for monitoring the current generated between the first and second electrodes. In one aspect of this embodiment, the system is for direct measurement of polymerase activity.
The present disclosure also provides a method for sequencing a biopolymer. In one embodiment, the method is for sequencing DNA and comprises: (a) introducing a solution comprising a DNA template to a system as described herein; (b) measuring a first current generated when a bias is applied to a system as described herein; (c) introducing a solution comprising a dNTP to the system under conditions that allow for incorporation of the dNTP complementary to the DNA template; (d) measuring a second current generated in step (c); (e) removing the solution comprising unincorporated dNTP; (f) repeating steps (c) through (e) with each of the remaining three types of dNTPs not used in step (c); (g) repeating steps (c) through (f); wherein the DNA is sequenced from the generated current signals.
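The cyclic introduction of one dNTP type at a time can be illustrated with a minimal Python sketch. It is a simulation under stated assumptions, not the disclosed measurement: `simulate_flow` is a hypothetical stand-in for the current measurement, and it assumes that the signal reveals how many nucleotides were incorporated during a flow, which the text does not specify. The flow order `"ACGT"` and the function names are likewise illustrative.

```python
from typing import List

# Watson-Crick complements used to decide which dNTP the template accepts next.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flow(template: str, position: int, dntp: str) -> int:
    """Hypothetical stand-in for the current measurement.

    Returns how many consecutive template bases, starting at `position`,
    are complementary to the dNTP in the current flow (0 if none).
    """
    count = 0
    while (position + count < len(template)
           and COMPLEMENT[template[position + count]] == dntp):
        count += 1
    return count


def sequence_by_flows(template: str, flow_order: str = "ACGT",
                      max_cycles: int = 100) -> str:
    """Introduce one dNTP type at a time, cycle after cycle.

    Each flow that yields an incorporation signal appends the corresponding
    base(s); the result lists the bases in order of incorporation.
    """
    position = 0
    incorporated: List[str] = []
    for _ in range(max_cycles):
        for dntp in flow_order:
            n = simulate_flow(template, position, dntp)
            incorporated.append(dntp * n)
            position += n
        if position >= len(template):
            break
    return "".join(incorporated)
```

For the template `"TTACG"`, `sequence_by_flows` returns `"AATGC"`, the bases complementary to the template in the order in which they were incorporated.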
In another embodiment, the method comprises: (a) introducing a solution comprising a DNA template to a system as described herein; (b) measuring a first current generated when a bias is applied to a system as described herein; (c) introducing a solution comprising at least two types of dNTPs to the system under conditions that allow for incorporation of the dNTP complementary to the DNA template, wherein the types of dNTPs are present in the solution at different concentrations; (d) measuring a second current generated in step (c); (e) removing the solution comprising the unincorporated dNTPs; (f) repeating steps (c) through (e) with the remaining types of dNTPs not used in step (c); (g) repeating steps (c) through (f); wherein the DNA is sequenced from the generated current signals.
The disclosure includes at least the following:
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as those commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. The materials, methods and examples are illustrative only, and are not intended to be limiting. All publications, patents and other documents mentioned herein are incorporated by reference in their entirety.
Throughout this specification, the word “comprise” or variations such as “comprises” or “comprising” will be understood to imply the inclusion of a stated integer or groups of integers but not the exclusion of any other integer or group of integers.
The term “a” or “an” may mean more than one of an item.
The terms “and” and “or” may refer to either the conjunctive or disjunctive and mean “and/or”.
The term “about” means within plus or minus 10% of a stated value. For example, “about 100” would refer to any number between 90 and 110.
The term “nucleotide” refers to a base-sugar-phosphate combination and includes ribonucleoside triphosphates ATP, UTP, CTP, GTP and deoxyribonucleoside triphosphates such as dATP, dCTP, dITP, dUTP, dGTP, dTTP, or derivatives thereof.
Device and System for Direct Measurement of Protein Activity
The present disclosure provides a device for direct measurement of protein activity. In one embodiment, the device comprises a first and a second electrode, the first and second electrode being separated by a gap; and a protein attached to one or both electrodes; wherein the first electrode and the second electrode have an opening formed therethrough.
In another embodiment, the device comprises a first and a second electrode, the first and second electrode being separated by a gap; and a protein attached to one or both electrodes.
In some embodiments, the device comprises:
wherein the first electrode, the insulating dielectric layer, the second electrode and passivation layer have an opening formed therethrough.
In some embodiments, the device comprises:
wherein the passivation layer has an opening formed therethrough.
In some embodiments, the device comprises:
wherein the first electrode and the second electrode are configured for contact with a sample to be analyzed.
In embodiments in which the electrodes are planar, the device advantageously does not require a dielectric layer. Devices requiring dielectric layers can suffer from drawbacks. Dielectric layers require adhesion layers to adhere to the electrodes. These adhesion layers can oxidize upon exposure to air, which, in effect, increases the size of the gap between the electrodes. To compensate for this effect, the dielectric layer can be made thinner. However, a thin dielectric layer is susceptible to pinholes, which can be difficult to eliminate.
In each of the device embodiments described herein, the protein is selected from the group consisting of a polymerase, a nuclease, a proteasome, a glycopeptidase, a glycosidase, a kinase and an endonuclease.
The protein can be attached to one electrode directly or indirectly. In some embodiments, the protein is attached to the electrode via a linker. In some embodiments, the protein is attached to the electrode indirectly via interactions with a ligand attached to the electrode. In some embodiments, the protein is modified to incorporate a ligand-binding site.
In one embodiment, the device comprises: a first and a second electrode, the first and second electrode being separated by a gap; a polymerase attached to one or both electrodes; wherein the first electrode and the second electrode have an opening formed therethrough.
In some embodiments, the polymerase is attached to one electrode. In some aspects of this embodiment, the polymerase is attached to the electrode via a linker. In some aspects, the polymerase is a biotinylated polymerase. In some aspects, the polymerase is a biotinylated polymerase and is attached to the electrode via streptavidin.
In each of the device embodiments described herein, the first and/or second electrode comprise a metal selected from the group consisting of gold, platinum, palladium, and ruthenium. In some embodiments, the metal is palladium.
In some embodiments, the gap has a width of about 1.0 nm to about 20.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 10.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 7.5 nm. In some embodiments, the gap has a width of about 1.0 nm to about 5.0 nm. In some embodiments, the gap has a width of about 4.0 nm to about 5.0 nm.
In some embodiments, the device can be used to detect a single molecule.
The present disclosure also provides a system for direct measurement of protein activity. The system comprises a device as described herein; a means for introducing a chemical entity that is capable of interacting with the protein; a means for applying a bias between the first and second electrode; and a means for monitoring the current generated between the first and second electrodes as the chemical entity interacts with the protein.
In each of the system embodiments described herein, the protein is selected from the group consisting of a polymerase, a nuclease, a proteasome, a glycopeptidase, a glycosidase, a kinase and an endonuclease. In one embodiment, the protein is a polymerase.
When the protein is a polymerase, the polymerase is attached to one electrode, or preferably both electrodes. In some aspects of this embodiment, the polymerase is attached to the electrodes via one or more linkers. In some aspects, the polymerase is a biotinylated polymerase. In some aspects, the polymerase is a biotinylated polymerase and is attached to the electrode via streptavidin.
In each of the system embodiments described herein, the first and/or second electrode comprise a metal selected from the group consisting of gold, platinum, palladium, and ruthenium. In some embodiments, the metal is palladium.
In some embodiments, the gap has a width of about 1.0 nm to about 20.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 10.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 7.5 nm. In some embodiments, the gap has a width of about 1.0 nm to about 5.0 nm. In some embodiments, the gap has a width of about 4.0 nm to about 5.0 nm.
The basis of the present disclosure lies in a remarkably unexpected and very recent observation about the behavior of a protein in a large (approximately 4.5 nm) gap when the protein is strongly tethered to two electrodes as described above. We find that below the critical bias voltage previously reported for the onset of telegraph noise signals, a simple linear (Ohmic) response is found. This is completely unexpected because proteins are believed to be molecular solids in which the mode of electron transport should only be tunneling. However, tunneling cannot account for the large currents with linear current-voltages observed when proteins are tethered in the manner described in
A larger collection of measurements reveals a more complex distribution of conductances as shown in
No conductance was observed when electrodes were exposed to the control molecules listed, showing that specific chemical tethering of the protein to the electrodes is required for electronic conductance to be observed.
In the case of the three antibodies, two binding sites are available, one at each of the two binding domains, separated by 13 nm. As a consequence, a second, higher conductance peak is observed in the distributions for these three molecules. This yields the second peak conductance listed for these molecules in Table 1. Consequently, high conductance can be obtained over long distances (13 nm) if proteins are chemically tethered to both electrodes.
This threshold for the onset of voltage-driven fluctuations of about 100 mV has been found for a number of proteins studied to date. Thus, by operating the junction below this threshold for the onset of spontaneous fluctuations (i.e., V<VC in
Protein fluctuations open up additional channels for electron transport. Thus, when a protein is biased below VC but stimulated by introducing a substrate molecule, large current fluctuations can occur. An example of the current signals induced by protein fluctuations is shown in
The signals shown in
Methods of Making a Device of the Disclosure
A device of the disclosure can be readily fabricated by depositing a layer of a noble metal such as Au, Pt or Pd onto a silicon, glass or sapphire wafer (or other dielectric substrate), then depositing a thin (typically 1 nm) layer of a reactive metal for adhesion (such as chrome or titanium), and then a layer of 1 to 10, or 1 to 20 or 1 to 50 nm of the noble metal. This bottom electrode is then covered with an insulating dielectric layer, preferably alumina, though other oxides such as SiO2 or hafnium oxide can be used. The layer should be between 2 and 10 nm in thickness. A 2 nm layer can be deposited by coating the bottom noble metal electrode with 1 to 1.5 nm of aluminum, and allowing it to oxidize in air, thereby producing a 2 to 3 nm thick layer of Al2O3. If a greater thickness of dielectric is required, further Al2O3 can be added by atomic layer deposition with water/trimethylaluminum cycles as is well known in the art.
A second noble metal electrode is then deposited, again using a thin adhesion layer (chrome or titanium) but of a maximum of 1 nm so as not to alter significantly the gap presented at the edge of the device where this adhesion layer will oxidize.
Finally, a passivation layer is placed on top of the top electrode. This can be alumina, SiO2, hafnium oxide or a resist material such as PMMA or SU8.
In order to make a cavity small enough to ensure that the exposed electrode area is such that only one polymerase is attached, a small opening is then made using Reactive Ion Etching (RIE) as is well known in the art. This opening may be between 10 and 500 nm in diameter with about 50 nm preferred. The depth of the opening should be large enough so that it cuts through the passivation layer, the top electrode, the dielectric layer separating the electrodes, and into the bottom electrode.
A second way to limit the amount of exposed electrode area is to make one of the electrodes (top or bottom, with top preferred) a thin wire of 50 to 100 nm in width. The RIE opening can then be much larger (e.g., micron sized permitting conventional lithography) because the exposed electrode will be limited by the small width of the electrode.
A third way is to control the functionalization chemistry by controlling the amount of time that the junction is exposed to polymerase molecules and/or the concentration of polymerase. The loading of each junction can be tested by monitoring the telegraph noise that is induced by contact fluctuations when the applied bias is above 100 mV. The presence of two signal levels indicates that just a single molecule is trapped. The presence of three levels indicates that two molecules are trapped, and so on. In this way the concentration and exposure time can be adjusted to achieve an ideal Poisson loading wherein about 30% of the sites are singly occupied.
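The loading statistics can be illustrated with a short Python sketch. It assumes that the number of molecules per junction follows a Poisson distribution whose mean is set by the concentration and exposure time; the function names are illustrative, and treating a single signal level as an empty junction is an assumption not stated in the text.

```python
import math
from typing import Dict


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k molecules at a site for a given mean loading."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def occupancy_fractions(mean: float) -> Dict[str, float]:
    """Fractions of sites that are empty, singly occupied, or multiply occupied."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def molecules_from_levels(n_levels: int) -> int:
    """Map telegraph-noise levels to trapped molecules.

    Two levels indicate one molecule, three levels indicate two, and so on.
    One level (no switching) is taken here to mean an empty junction,
    which is an assumption.
    """
    if n_levels < 1:
        raise ValueError("at least one signal level is required")
    return n_levels - 1
```

With a mean of 0.5 molecules per site, about 30% of sites are singly occupied and about 9% hold more than one molecule. The single-occupancy fraction peaks at about 37% for a mean of 1.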
After cleaning with an oxygen plasma, the exposed area of the electrodes in the opening can be functionalized with protein. This may be either directly, using the native thiols on the surface of the protein, or via chemical modifications that attach sulfhydryl groups to the protein, or indirectly, by attaching a thiolated streptavidin and then capturing a biotinylated protein.
A second approach to making the device is to form the two electrodes in the same plane with a small gap between them. This can be done by opening a trench across a single wire using e-beam lithography and lift-off or reactive ion etching (RIE) as is well known in the art. Other approaches are to use helium ion milling, or angled deposition of metal over a step edge so that a gap is naturally formed. The electrode pair is then covered with a passivation layer and an opening formed by RIE such that the electrode gap is exposed. The electrodes can then be functionalized as described above.
The width of the exposed electrodes is important, because the devices described here generally rely on connecting to just one molecule. In the case of extracting sequence information, this requirement of a single molecule signal is particularly important. If the electrodes are not much wider than a single molecule (5 to 15 nm) then attachment of multiple molecules across the gap is not possible. However, reliable functionalization and fabrication of such small electrodes is very difficult. In practice, we have found that electrodes of up to 100 nm width are unlikely to capture more than one protein molecule. In particular, when the probability of binding in the desired (bridging) configuration is small, it may even be desirable to have even wider electrodes, of 200, 300, 400, 500 or even 1000 nm width. Protein molecules that are bound to just one electrode, rather than bridging the pair of electrodes, will contribute relatively small amounts of current.
Methods of Attaching a Polymerase to the Electrodes
When the protein is a polymerase, it should be a polymerase with high processivity, such as the phi29 polymerase, and its exonuclease function should be disabled as described below.
The wild-type (WT) polymerase requires modification to (a) remove its exonuclease activity and (b) add a chemical attachment point if so desired. This modification is achieved by recombinant DNA using an E. coli expression system to produce the modified polymerase.
Exonuclease activity of phi29 requires the following acidic amino acids: D12, E14, D66 and D169. Mutating any one of these will eliminate the exonuclease activity, and we have mutated D12 and E14 to alanine.
The clone we used has both the his-tag and the avitag (for biotinylation) at the N-terminus, i.e., His-Avitag-Phi29. As a result, the following sequence was added to the N-terminus of the enzyme:
The six histidine residues are the his tag (used for purification of the desired enzyme product) and the GLNDIFEAQKIEWHE is the Avitag. The biotin is attached to the K in the avitag by the biotin ligase BirA (Avidity, Lansing, Mich.). Activity assays show that the biotinylated enzyme attached to streptavidin is still active (
Another useful and unexpected feature of the present disclosure is that both streptavidin and polymerase are conductive proteins, so, as we show below, the polymerase can be attached to the electrode indirectly. First a streptavidin, modified with thiols, is attached to the electrode, and then biotinylated polymerase introduced. This binds to the streptavidin, providing a conductive path to the electrode.
The polymerase can also be modified at the C-terminus by the same recombinant methods. In addition to the avitag, other peptide-based binding tags can be used such as GST tags, Myc tags and His tags. These tags can all be incorporated at either the N- or C-terminus of the polymerase. Incorporation at the C terminus places the tag site close to the site at which a primed template is captured, so oligoalanine or glycine-glycine-serine spacer sequences can be incorporated between the C terminus and the tag to reduce interference with the template capture activity of the polymerase.
The same technology can also be used to attach other proteins whose activity is to be monitored using the methods of the present disclosure (such as kinases, proteases or molecules that process glycans).
In addition to modification at the N- or C-termini, there are seven cysteines in phi29. None are disulfide bonded, so all have the potential for forming disulfide bonds, offering additional sites for attachment to electrodes. Based on the structure of phi29 with template and primer, C448, C106 and C22 are the most surface exposed and look like good candidates for either biotinylation through maleimide or direct attachment to heavy metals that bind sulfhydryl groups. The problem is how to control specificity. We tried mutating out all but one cysteine, but the result was insoluble protein. However, up to four may be removed without affecting solubility, leaving C448, C106 and C22 as targets for attachment points.
Although the present disclosure works with just one chemical attachment site to one electrode, the second contact being made by physical contact between the protein and the metal, it is desirable to make two chemically well-defined contacts in a manner that spans the gap between the two electrodes. The C terminus is separated from the N terminus by a distance of ~5 nm, and if biotinylation via an avitag is used, attachment to the same streptavidin molecule by both the N- and C-termini of the same polymerase is improbable. Thus, with both electrodes functionalized with thio-streptavidin, there is an opportunity for bridging structures to form in which the N terminus is connected to one electrode and the C terminus to the second electrode.
Another approach is to use two attachment points that are widely spaced, but in an inactive region of the protein. In the case of phi29 polymerase, where the exonuclease domain has been disabled by mutations of D12 and E14, two points in the exonuclease domain spaced by >5 nm are found between G111 and K112, and between E279 and D280. Accordingly, with the Avitag sequence GLNDIFEAQKIEWHE inserted at these two points, and the Avitag lysine biotinylated by BirA, the polymerase can be bound across a pair of electrodes as shown in
Making Deterministic Contacts Between Polymerases and Electrodes by Adding ‘Conducting Whiskers’ to the Polymerase.
At the N terminus add recombinantly:
The terminal C is the cysteine for attachment to the first electrode.
The his tag is for protein extraction and purification.
The 61-amino acid sequence following SS is the sequence of the pilus protein from Geobacter sulfurreducens, which acts as a metallic wire.
At the C terminus add recombinantly:
The 61-amino acid sequence following AA is the sequence of the pilus protein from Geobacter sulfurreducens.
The terminal C is the cysteine for attachment to the first electrode.
Conductivity of the Complex
Key to the present disclosure is that good electronic conductance can be obtained through the polymerase.
Conductivity Changes with Conformation
Dynamic Monitoring of Protein Conformation
Methods of Use of the Devices and Systems
The present disclosure provides methods for sequencing a biopolymer. We illustrate this for the specific case of a nucleic acid chain being extended by a polymerase in
In use, once the device is prepared, it should be rinsed and then exposed to the primed template DNA to be sequenced. This DNA is prepared from the sample to be sequenced by ligating hairpin primers, as is well known in the art. A buffer solution comprising the four dNTPs and Mg2+ should be introduced to the device. The dNTPs are present in the buffer solution in about equal concentration. In one embodiment, the concentrations of dNTPs are about equal to the saturation concentration of template-bound polymerase, at the saturation concentration of template-bound polymerase in a second embodiment, and above the saturation concentration in a third embodiment. When the concentrations of dNTPs are at or above the saturation concentration, the polymerase runs fast (i.e., 100 nucleotide incorporations per second).
When the buffer solution comprising the dNTPs is introduced to the device, a polymerization reaction will be initiated and the captured template will be copied to the primer, producing a series of current spikes. Each spike (or cluster of spikes) occurs as each new nucleotide is incorporated into the primer, and the characteristics of each spike (duration, amplitude, shape) are used to decode the identity of the nucleotide being incorporated. A typical sequencing speed at saturation concentration (>30 μM) of nucleotides is about 100 nucleotides per second. The saturation concentration of template is about 10 nM, with a new template incorporated almost immediately after completion of the previous template. Each molecule will continue to turn over template so long as there are templates available in solution. Therefore, one molecule can sequence continuously for as long as the device is operated.
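Extraction of spike characteristics from a recorded current trace can be sketched as follows. This is a minimal illustration that assumes a simple fixed threshold above a known baseline; the disclosure does not specify a detection algorithm, and the names `Spike` and `find_spikes` and their parameters are hypothetical. Mapping the extracted features to a nucleotide identity is not attempted.

```python
from typing import List, NamedTuple, Optional, Sequence


class Spike(NamedTuple):
    start_s: float      # time at which the current first exceeds the threshold
    duration_s: float   # time spent above the threshold
    amplitude: float    # peak current above baseline, in the units of the trace


def find_spikes(current: Sequence[float], sample_interval_s: float,
                baseline: float, threshold: float) -> List[Spike]:
    """Return one Spike for each run of samples exceeding baseline + threshold."""
    spikes: List[Spike] = []
    start: Optional[int] = None
    peak = 0.0
    for i, value in enumerate(current):
        excess = value - baseline
        if excess > threshold:
            if start is None:
                start, peak = i, excess      # a new spike begins
            else:
                peak = max(peak, excess)     # still inside the same spike
        elif start is not None:
            spikes.append(Spike(start * sample_interval_s,
                                (i - start) * sample_interval_s, peak))
            start = None
    if start is not None:                    # trace ended during a spike
        spikes.append(Spike(start * sample_interval_s,
                            (len(current) - start) * sample_interval_s, peak))
    return spikes
```

A fixed threshold is the simplest choice. A practical analysis would likely need baseline tracking and noise filtering, which are omitted here.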
The device geometry is extremely simple with no need to separate fluidic compartments for each junction, so one junction would occupy only about one square micron. Allowing for interconnects, isolation and on-chip processing electronics, a single reading device could readily be fitted into an area of 100 microns by 100 microns, so an active chip area of 1 cm² would accommodate 10,000 devices. A chip with 10,000 junctions operated for 1 hour would sequence an entire human genome (10,000 × 3,600 × 100 = 3.6×10⁹). A denser device geometry or a larger chip would accommodate even a significant fraction of inactive devices and still permit genome-scale sequencing on one small device in times of an hour or less.
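The throughput estimate can be checked with a few lines of Python. The function names are illustrative, and the calculation counts raw incorporations only: it assumes every device is active and reads continuously, and it ignores coverage and read overlap.

```python
def devices_per_chip(chip_side_um: int, device_side_um: int) -> int:
    """Number of square device cells that tile a square chip."""
    per_side = chip_side_um // device_side_um
    return per_side * per_side


def bases_read(n_devices: int, seconds: int, nt_per_second: int) -> int:
    """Total nucleotides incorporated across all devices in the given time."""
    return n_devices * seconds * nt_per_second
```

A 1 cm (10,000 micron) chip side with 100 micron device cells gives `devices_per_chip(10_000, 100) == 10_000`, and `bases_read(10_000, 3_600, 100)` gives 3,600,000,000 nucleotides in one hour, matching the figure in the text.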
The preceding example illustrates the sequencing of nucleic acid polymers, but the same approach can be applied using other enzymes that process polymers. For example, current fluctuations in an exonuclease will reflect the composition of the nucleic acid it is degrading. The same would be true of proteasomes that digest peptides. An example is the proteasome 20S CP, and proteasomes like this could likely be used for single molecule peptide sequencing by incorporating them into the system of
Similar enzymes, called glycosidases, exist for digesting glycans. The incorporation of a glycosidase into the device of
In yet another embodiment, the present disclosure provides a method for detecting kinase activity. In this embodiment, a kinase is incorporated into the device of
In all of these methods, the use of a simple junction (as opposed to a FET structure) both greatly simplifies manufacture and enables scale-up to large parallel arrays of devices. The device of the disclosure may be prepared in massively parallel fabrication using methods for scalable fabrication of junction devices, as described below.
Arrays and Systems for Sequencing DNA or Other Polymers
The present disclosure provides an array for sequencing biopolymers using any of the enzymes that interact processively with molecular templates such as a nuclease, a proteasome, a glycopeptidase, a glycosidase, a kinase and an endonuclease. The embodiments below illustrate an array for sequencing DNA using a polymerase. It should be understood that any processive enzyme can be substituted for the polymerase in the arrays.
The array comprises an arrangement of a plurality of devices. The devices used in the arrays of the present disclosure include the following.
In one embodiment, the device comprises a first and a second electrode, the first and second electrode being separated by a gap; and a polymerase attached to one or both electrodes; wherein the first electrode and the second electrode have an opening formed therethrough.
In another embodiment the device comprises a first and a second electrode, the first and second electrode being separated by a gap; and a polymerase attached to both the first and second electrode.
In some embodiments, the device comprises:
wherein the first electrode, the insulating dielectric layer, the second electrode and passivation layer have an opening formed therethrough.
In some embodiments, the device comprises:
wherein the passivation layer has an opening formed therethrough.
In some embodiments, the device comprises:
wherein the first electrode and the second electrodes are configured for contact with a sample to be analyzed.
In each of the device embodiments described herein, the first and/or second electrode comprises a metal selected from the group consisting of gold, platinum, palladium, and ruthenium. In some embodiments, the metal is palladium.
In some embodiments, the gap has a width of about 1.0 nm to about 20.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 10.0 nm. In some embodiments, the gap has a width of about 1.0 nm to about 7.5 nm. In some embodiments, the gap has a width of about 1.0 nm to about 5.0 nm. In some embodiments, the gap has a width of about 4.0 nm to about 5.0 nm.
The array of devices can be arranged in any suitable manner, e.g., in a grid.
In some embodiments, the array comprises polymerase molecules bound with template DNA. Such templates can be made by ligating genomic DNA fragments (generated by sonication, for example) with primer sequences containing a nick for binding by polymerase, as is well known in the art. The result is a library of templates spanning an entire genome, if needed. Each template will then randomly bind one polymerase in the array.
Referring now to
The present disclosure also provides a system of arrays for direct measurement of polymerase activity. The system comprises an array as described herein; optionally a means for introducing and removing a solution to the array; a means for applying a bias between the first and second electrode; and a means for monitoring the current generated between the first and second electrodes.
Referring back to
Methods for DNA Sequencing Using a System of Arrays
The present disclosure also provides a method for DNA sequencing using a system of arrays as described herein.
The sequencing proceeds by introducing a solution comprising one nucleotide triphosphate (e.g., one from among dATP, dGTP, dCTP, dTTP) together with magnesium to the array. Each polymerase bound with a complementary nucleotide will generate a signal, which is read by the unique pair of electrodes to which each polymerase is bound. For example, if the added nucleotide is dCTP, then every template bound at a G base will incorporate a C into the extending chain, generating a signal, whereas the other sequences will generate distinctly different signals. For example, a polymerase presented with a non-matching base will generate a brief burst of signal as the mismatched base is captured, but the signal train will terminate prematurely as the mismatch is rejected and the process of polymerization and translocation is aborted. In contrast, incorporation of a matching base results in a much longer train of pulses as the process of incorporation and translocation is completed. As the second step, the array is rinsed to remove excess nucleotide and a solution comprising the next nucleotide is introduced (for example dATP, so that all sites with a T now generate a signal). The cycle is continued until all four dNTPs have been cycled through the device, after which the cycle is repeated. This cycling can be repeated until all the template DNA is exhausted, thus generating sequence data for the entire library of fragments.
It will be recognized that this approach has two major advantages over current sequencing strategies that use cycling of dNTPs. One is that, by utilizing a single-molecule readout at a time scale faster than the base-incorporation rate of a polymerase, it now becomes straightforward to count repeats of the same base. So the sequence AAAAA would give 5 distinct bursts of signal in the presence of dTTP, and so on. The second is that, in contrast to known optical readout schemes, the length of template that is read is not constrained by distance from the mounting substrate, so that the potential read length is as high as the processivity of the polymerase (10 kb for phi29).
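The cyclic scheme and the counting of repeated bases can be illustrated with a minimal sketch. It assumes an idealised polymerase that incorporates every matching base in each flow and produces exactly one countable burst per incorporation; mismatch bursts and noise are ignored. The function names, the flow order and the example template are hypothetical, and the template is written in the order in which the polymerase encounters its bases.

```python
from typing import List, Tuple

# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template: str, flow_order: str = "CATG",
                   max_cycles: int = 100) -> List[Tuple[str, int]]:
    """Return (nucleotide, burst_count) for each flow over one template.

    In each flow a single dNTP is present; the count is the number of
    consecutive template positions that pair with that nucleotide.
    """
    pos = 0
    flows: List[Tuple[str, int]] = []
    for _ in range(max_cycles):
        for nt in flow_order:
            count = 0
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                count += 1
                pos += 1
            flows.append((nt, count))
            if pos == len(template):
                return flows
    return flows


def decode_flows(flows: List[Tuple[str, int]]) -> Tuple[str, str]:
    """Rebuild the synthesized strand and the template it was copied from."""
    synthesized = "".join(nt * count for nt, count in flows)
    template = "".join(COMPLEMENT[nt] for nt in synthesized)
    return synthesized, template


flows = simulate_flows("GAAAAATC")
synthesized, recovered = decode_flows(flows)
print(flows)
print(synthesized, recovered)
```

In this sketch the run of five A bases appears as a count of five in a single dTTP flow, which is the repeat-counting advantage described above.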
In another embodiment, the solution comprises more than one dNTP, with the dNTPs present in different concentrations. For example, the solution comprises 1 mM dATP, 100 μM dGTP, 10 μM dCTP and 0.1 μM dTTP. In this embodiment, the polymerases in the array would generate signals continuously as each template is extended. At points where T is present in the template DNA, the signal of incorporation would follow the previous burst of telegraph noise rapidly (generally within 10 ms). The template DNAs in which the next base was a C would show a more delayed burst of telegraph noise because of the slower arrival of dGTP owing to its lower concentration. Similarly, G bases in the template would be preceded by a longer delay because of the even lower concentration of dCTP. The longest delays would precede A bases because the concentration of dTTP is lowest.
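A minimal sketch of how such delays might be mapped to template bases is given below. It rests on assumptions not stated in the disclosure: that the mean waiting time before a burst is inversely proportional to the concentration of the required dNTP, and that the 10 ms figure above can serve as the reference delay at 1 mM. The names and the nearest-neighbour rule on a logarithmic scale are hypothetical. Individual waiting times are stochastic, so a practical decoder would need calibration data and a statistical model rather than this simple rule.

```python
import math
from typing import Iterable

# Example concentrations from the text, in micromolar.
CONC_UM = {"dATP": 1000.0, "dGTP": 100.0, "dCTP": 10.0, "dTTP": 0.1}

# Template base that pairs with each incoming dNTP.
TEMPLATE_BASE = {"dATP": "T", "dGTP": "C", "dCTP": "G", "dTTP": "A"}

# Assumed calibration point: ~10 ms mean delay at 1 mM (hypothetical).
REF_DELAY_S = 0.010
REF_CONC_UM = 1000.0


def expected_delay_s(conc_um: float) -> float:
    """Assumed mean delay, inversely proportional to concentration."""
    return REF_DELAY_S * REF_CONC_UM / conc_um


def call_template_base(delay_s: float) -> str:
    """Pick the dNTP whose expected delay is closest on a log scale."""
    best = min(
        CONC_UM,
        key=lambda d: abs(math.log(delay_s) - math.log(expected_delay_s(CONC_UM[d]))),
    )
    return TEMPLATE_BASE[best]


def call_sequence(delays_s: Iterable[float]) -> str:
    """Convert a series of inter-burst delays (seconds) to template bases."""
    return "".join(call_template_base(d) for d in delays_s)


# Hypothetical inter-burst delays, in seconds, for four successive bursts.
called = call_sequence([0.008, 0.12, 0.9, 80.0])
print(called)
```

Under these assumptions the expected delays are spaced by factors of ten or more, which is what makes a logarithmic comparison a natural choice for the illustration.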
In yet another embodiment, the two approaches can be combined by using two rinse cycles, with one pair of nucleotides in unequal concentrations in the first cycle and the other two nucleotides, also in unequal concentrations, in the second cycle.
While the preceding section describes methods of sequencing DNA using a system of arrays comprising a polymerase, the system of arrays can be easily modified to sequence other polymers as well.
Although the invention has been described and illustrated in the foregoing illustrative embodiments, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the invention can be made without departing from the spirit and scope of the invention. Features of the disclosed embodiments can be combined and rearranged in various ways. All publications, patents and other documents mentioned herein are incorporated by reference in their entirety.
The following references are hereby incorporated by reference in their entireties:
This invention was made with government support under HG910080 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US19/32707 | 5/16/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62673080 | May 2018 | US | |
62682991 | Jun 2018 | US | |
62812312 | Mar 2019 | US |