Embodiments of the present disclosure are directed to systems, methods, devices, and compositions of matter for sequencing molecules. More specifically, the present disclosure includes embodiments where a polysaccharide or other heterogeneous polymer concatenated with a nucleic acid polymer is captured by a primer on a polymerase tethered to a bead trapped by a nanopore, where the polymer may be sequenced/identified.
Carbohydrates, particularly those glycosylating proteins and lipids (glycans), play an essential role in biological processes at all levels, such as protein folding, cell adhesion, signal transduction, pathogen recognition, and immune responses. On the other hand, the aberrant glycosylation of proteins is associated with oncogenic transformation. Over 50% of all human proteins are glycosylated. A glycome, that is, a complete collection of glycans and glycoconjugates in a cell or organism, is diverse (e.g., 1.92×10¹¹ possible hexasaccharides formed mainly from ten of the most abundant mammalian monosaccharides) and dynamic (i.e., variation of glycoforms of proteins at different developmental stages of a cell).
Currently, mass spectrometry is the most powerful analytical technique for structural glycomics. Since many carbohydrates are epimers, anomers, and regioisomers, mass spectrometry is unable to identify those sharing a molecular weight without additional chemical steps. The problem has been addressed by combining ion-mobility spectrometry, which uses collision cross-sections to separate isomers, with mass spectrometry (IM-MS), but IM-MS cannot resolve closely related epimers because they have almost identical collision cross-sections.
Emerging nanotechnologies (e.g., nanopores for analyzing oligosaccharides) offer a promising alternative for glycomics. In US20150144506, herein incorporated by reference, an electron tunneling technique is introduced which is configured to, among other things, identify carbohydrates electronically at a single-molecule level. Some of the disclosed embodiments may be capable of analyzing nanomolar (nM) concentrations in volumes of a few microliters, using less than a picomole of sample. In some embodiments, the number of individual molecules in each subset of a population of coexisting isomers is counted, and the count can be quantitative over more than four orders of magnitude of concentration. For example, in some embodiments, the technique can resolve epimers that are not well separated by ion mobility, and can detect glycosylation of a peptide.
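As a rough check on the sample quantities mentioned above, the amount of material is simply concentration multiplied by volume. The following sketch is illustrative only: the 100 nM concentration and 5 microliter volume are assumed example values, not measured data, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def sample_amount(concentration_molar: float, volume_liters: float):
    """Return (moles, molecules) for a solution of given concentration and volume."""
    moles = concentration_molar * volume_liters
    return moles, moles * AVOGADRO

# Assumed example: 100 nM analyte in 5 microliters.
moles, molecules = sample_amount(100e-9, 5e-6)
print(f"{moles * 1e12:.2f} pmol, about {molecules:.2e} molecules")
```

With these assumed values the sample amounts to half a picomole, which is consistent with the statement that less than a picomole of sample may suffice.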
Recently, we have shown that some embodiments can identify common biological mono- and di-saccharides (see, e.g., Electronic Single Molecule Identification of Carbohydrate Isomers by Recognition Tunneling, arxiv.org/abs/1601.04221), herein incorporated by reference. However, the method may only identify one molecular species at a time, so solving the combinatorially complex problem of reading the sequence of sugars in a linear polymer is very challenging.
Oligosaccharide molecules, such as glycosaminoglycans, are generally charged, and thus, can be pulled through a nanopore using an electric field. However, they are very small, requiring a very small (one nanometer diameter) nanopore to ensure that each sugar residue passes the reading element in turn. Their small size also means that they move very rapidly in an electric field because they present a small friction to the surrounding water. Thus, even if they could be passed through a constriction small enough to ensure that only one sugar residue at a time lies in the reading region of the device, they would spend too little time in the reading region to generate a signal that could be read. This is because tunneling signals are typically picoamps, so millisecond data acquisition times are needed for typical device capacitances of a few pF.
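The time-scale argument can be illustrated with a first-order estimate. The sketch below assumes, as an illustration that is not stated in this disclosure, that the readout is limited by a single RC time constant formed by a gigaohm-scale current-to-voltage conversion resistance and the device capacitance. The 1 gigaohm and 2 pF values are assumed examples.

```python
def rc_time_constant(resistance_ohm: float, capacitance_farad: float) -> float:
    """First-order response time tau = R * C, in seconds."""
    return resistance_ohm * capacitance_farad

# Assumed values (not from the disclosure):
R_CONVERSION = 1e9   # ohms; converts 1 pA into 1 mV
C_DEVICE = 2e-12     # farads; "a few pF"

tau = rc_time_constant(R_CONVERSION, C_DEVICE)
print(f"tau = {tau * 1e3:.1f} ms")
```

Under these assumptions the response time is on the order of milliseconds, so a residue that crosses the reading region much faster than this would not be resolved.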
The same problem has been addressed in the case of DNA sequencing, using a DNA polymerase to both clamp the DNA and to regulate the speed with which it can be pulled through a nanopore. However, currently, no equivalent of a DNA polymerase is known to exist for oligosaccharides.
Some embodiments of the current disclosure introduce a device that uses a DNA polymerase to regulate the motion of an oligosaccharide, as well as to hold it in place so that it can be captured in a reading junction embedded in a pore that is much larger than the diameter of the sugar molecule. Such embodiments enable the use of larger pores to identify oligosaccharides and the like, addressing the difficulty in manufacturing small (nm-diameter) pores.
Some of the disclosed embodiments may be used in association with the embodiments disclosed in U.S. Pat. No. 9,395,352 (Lindsay et al.), herein incorporated by reference in its entirety, especially the molecule sequencing/identification system embodiments disclosed therein and, in some cases, the system recited in claim 1 thereof.
In some embodiments, an apparatus for sequencing a heteropolymer is provided and may include: (a) a substrate, (b) a pair of electrodes proximate to or within the constriction and separated by a gap of between 0.5 to 10 nm, (c) a constriction arranged within the substrate and configured with a size and operatively arranged with the gap such that a heteropolymer molecule to be sequenced passes through the constriction, (d) means for reading an electrical signal characteristic of the molecule from the pair of electrodes as the heteropolymer molecule passes through the constriction and becomes electrically connected with the electrodes, (e) a bead having a size that is greater than a size of the constriction, (f) a DNA-binding protein attached to the bead, and (g) a DNA polymer bound to the DNA-binding protein and configured to bind with a heteropolymer for sequencing by the apparatus. In some embodiments, the heteropolymer is not a nucleic acid.
The above noted embodiments are further clarified, and/or may further include one and/or another of the following feature(s)/functionality(ies):
In some embodiments, a method for preparing a heteropolymer for sequencing is provided and may include attaching a DNA-binding protein to a bead, the bead having a size greater than a size of a constriction of a sequencing apparatus, binding a DNA polymer to the DNA-binding protein, and binding a heteropolymer to the DNA polymer.
In some embodiments, a method for sequencing a heteropolymer in a sequencing apparatus having a constriction is provided and may include: (a) attaching a DNA-binding protein to a bead, the bead including a size greater than a size of a constriction of a sequencing apparatus, the sequencing apparatus further including a substrate, the constriction arranged within the substrate and configured with a size and operatively arranged with a pair of electrodes separated by a gap of between 0.5 to 10 nm such that a heteropolymer molecule to be sequenced passes through the constriction, reading means for reading an electrical signal characteristic of a heteropolymer molecule being sequenced from the pair of electrodes as the molecule being sequenced becomes electrically connected to the electrodes; (b) binding a DNA polymer to the DNA-binding protein; (c) binding a heteropolymer for sequencing to the DNA polymer; (d) arranging the bead to a first side of the constriction; and (e) sequencing the heteropolymer by reading the electrical signals thereof as the heteropolymer passes through the constriction.
In some embodiments, the present disclosure also provides a method for regulating the speed of a heteropolymer passing through a constriction in a sequencing apparatus. The method comprises: (a) attaching a DNA-binding protein to a bead, the bead including a size greater than a size of a constriction of a sequencing apparatus; (b) binding a DNA polymer to the DNA-binding protein; (c) binding a heteropolymer for sequencing by the sequencing apparatus to the DNA polymer; (d) arranging the bead to a first side of the constriction of the sequencing apparatus, wherein the first side of the constriction is in fluid communication with a reservoir having free nucleotides; and (e) regulating a speed of the heteropolymer for sequencing through the constriction by varying a concentration of the free nucleotides in the reservoir. In some embodiments, the concentration of the free nucleotides is increased such that the heteropolymer for sequencing increases speed through the constriction.
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
The term “and/or” is used in this disclosure to mean either “and” or “or” unless indicated otherwise.
As used herein, the term “heteropolymer” refers to a polymer having at least two monomer units, and where at least one monomeric unit differs from the other monomeric units in the polymer. In some embodiments, the heteropolymer is the molecule to be sequenced.
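The definition above can be expressed as a simple predicate. This is a minimal sketch for illustration; representing monomers as strings and the function name `is_heteropolymer` are assumptions made here and are not part of the disclosure.

```python
from typing import Sequence

def is_heteropolymer(monomers: Sequence[str]) -> bool:
    """True if the chain has at least two units and they are not all identical."""
    return len(monomers) >= 2 and len(set(monomers)) >= 2

print(is_heteropolymer(["Glc", "Gal", "Glc"]))  # mixed residues
print(is_heteropolymer(["Glc", "Glc", "Glc"]))  # all residues identical
```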
As used herein, the term “peptide” refers to a short polypeptide, e.g., one that typically contains less than about 50 amino acids and more typically less than about 30 amino acids. The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.
As used herein, the term “bead” can include any object. The bead can be in any shape or form. For example, the bead can be a sphere, a cube, a rod, a star, or any irregular shape.
The term “comprising” as used herein is synonymous with “including” or “containing”, and is inclusive or open-ended and does not exclude additional, unrecited members, elements or method steps. By “consisting of” is meant including, and limited to, whatever follows the phrase “consisting of.” Thus, the phrase “consisting of” indicates that the listed elements are required or mandatory, and that no other elements may be present. By “consisting essentially of” is meant including any elements listed after the phrase, and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase “consisting essentially of” indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending upon whether or not they materially affect the activity or action of the listed elements.
Prior DNA translocation control is shown in
The single stranded tail (5) is pulled into a nanopore (7) using an electric field. In this case, the pore is a protein pore small enough in diameter to only pass a single-stranded region. Referring to ii in
Referring to iii in
In prior disclosures, we have described a device for reading the identity of individual molecules based on recognition tunneling (e.g., see US20100084276 hereby incorporated by reference). Referring to
However, in an unexpected development, we have found that DNA molecules are readily trapped by the recognition molecules (27) even if the diameter of the opening (d) is much greater than the diameter of the DNA. For example, signals have been obtained with openings as big as 40 nm with single stranded DNA of diameter less than 2 nm. Thus, any fluctuation that causes the molecule to be read to become bonded to the recognition molecules (27) tends to hold the polymer chain against the wall as it passes through the pore.
In
In one aspect, the present disclosure relates to an apparatus for sequencing a heteropolymer. The apparatus can include: (a) a substrate, (b) a pair of electrodes proximate to or within the constriction and separated by a gap of between 0.5 to 10 nm, (c) a constriction arranged within the substrate and configured with a size and operatively arranged with the gap such that a heteropolymer molecule to be sequenced passes through the constriction, (d) means for reading an electrical signal characteristic of the molecule from the pair of electrodes as the heteropolymer molecule passes through the constriction and becomes electrically connected with the electrodes, (e) a bead having a size that is greater than a size of the constriction, (f) a DNA-binding protein attached to the bead, and (g) a DNA polymer bound to the DNA-binding protein and configured to bind with a heteropolymer for sequencing by the apparatus. In some embodiments, the heteropolymer is not a nucleic acid. In some embodiments, the heteropolymer is selected from the group consisting of an oligosaccharide, a polysaccharide, a peptide, a protein, and a glycoprotein. The heteropolymer can be either charged or uncharged. In some embodiments, the DNA-binding protein is a DNA polymerase. The means for reading an electrical signal can be any electronic device capable of reading an electrical signal.
In another aspect, the present disclosure relates to a method for preparing a heteropolymer for sequencing. The method can include attaching a DNA-binding protein to a bead, the bead having a size greater than a size of a constriction of a sequencing apparatus, binding a DNA polymer to the DNA-binding protein, and binding a heteropolymer to the DNA polymer.
In another aspect, the present disclosure relates to a method for sequencing a heteropolymer in a sequencing apparatus having a constriction. The method can include: (a) attaching a DNA-binding protein to a bead, the bead including a size greater than a size of a constriction of a sequencing apparatus, the sequencing apparatus further including a substrate, the constriction arranged within the substrate and configured with a size and operatively arranged with a pair of electrodes separated by a gap of between 0.5 to 10 nm such that a heteropolymer molecule to be sequenced passes through the constriction, reading means for reading an electrical signal characteristic of a heteropolymer molecule being sequenced from the pair of electrodes as the molecule being sequenced becomes electrically connected to the electrodes; (b) binding a DNA polymer to the DNA-binding protein; (c) binding a heteropolymer for sequencing to the DNA polymer; (d) arranging the bead to a first side of the constriction; and (e) sequencing the heteropolymer by reading the electrical signals thereof as the heteropolymer passes through the constriction.
In another aspect, the present disclosure relates to a method for regulating the speed of a heteropolymer passing through a constriction in a sequencing apparatus. The method can include: (a) attaching a DNA-binding protein to a bead, the bead including a size greater than a size of a constriction of a sequencing apparatus; (b) binding a DNA polymer to the DNA-binding protein; (c) binding a heteropolymer for sequencing by the sequencing apparatus to the DNA polymer; (d) arranging the bead to a first side of the constriction of the sequencing apparatus, wherein the first side of the constriction is in fluid communication with a reservoir having free nucleotides; and (e) regulating a speed of the heteropolymer for sequencing through the constriction by varying a concentration of the free nucleotides in the reservoir.
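One way to picture step (e) is with a saturating rate law. The following sketch assumes Michaelis-Menten-type kinetics for the polymerase, which this disclosure does not specify. The maximum rate `v_max` and the half-saturation concentration `k_m` are hypothetical placeholder values chosen only to show the trend.

```python
def translocation_rate(dntp_conc_um: float, v_max: float = 100.0, k_m: float = 10.0) -> float:
    """Polymerase stepping rate (nucleotides/s) under an assumed Michaelis-Menten law.

    dntp_conc_um: free nucleotide concentration in micromolar.
    v_max, k_m:   hypothetical parameters, not taken from the disclosure.
    """
    if dntp_conc_um < 0:
        raise ValueError("concentration must be non-negative")
    return v_max * dntp_conc_um / (k_m + dntp_conc_um)

# Rate rises with concentration and saturates toward v_max.
for conc in (1.0, 10.0, 100.0):
    print(f"{conc:6.1f} uM -> {translocation_rate(conc):5.1f} nt/s")
```

In this model, raising the free nucleotide concentration in the reservoir increases the rate at which the tethered heteropolymer advances, up to a limit set by the enzyme.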
In some embodiments, the apparatus includes a recognition tunneling junction, such as those described below.
A general scheme of some of the embodiments is shown in
The metal electrodes 42a and 42b can include palladium, gold, platinum, or a combination thereof. The lower support membrane 41 can include a dielectric, such as silicon nitride, silicon dioxide, or another semiconductor or metal oxide. The lower support membrane 41 can be in contact with a first fluid reservoir. The top dielectric layer 44 can include a dielectric such as silicon nitride, silicon dioxide, or another semiconductor or metal oxide. The top dielectric layer 44 serves to isolate the top electrode 42a from a fluid (e.g., an aqueous electrolyte) in a second fluid reservoir. The fluid can serve as a transport medium for the molecules to be analyzed. The first and second fluid reservoirs can be in fluidic communication through the pore 45.
The lower support membrane 41 can have a thickness of about 5 nm to about 500 nm, about 10 nm to about 400 nm, about 20 nm to about 300 nm, about 20 nm to about 200 nm, or about 20 nm to about 100 nm. The metal electrodes 42a and 42b can each have a thickness of about 1 nm to about 20 nm, about 1 nm to about 15 nm, or about 1 nm to about 10 nm. The thin dielectric layer 43 can have a thickness of about 0.5 nm to about 10 nm, about 1 nm to about 5 nm, or about 1 nm to about 3 nm. The top dielectric layer 44 can have a thickness of about 5 nm to about 500 nm, about 10 nm to about 400 nm, about 20 nm to about 300 nm, about 20 nm to about 200 nm, or about 20 nm to about 100 nm. The pore 45 can have a diameter of about 2 nm to about 50 nm, about 5 nm to about 40 nm, or about 5 nm to about 30 nm.
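The dimension ranges above lend themselves to a simple design check. The sketch below is a hypothetical helper, not part of the disclosed apparatus. For simplicity it treats the broadest "about" range for each element as a hard bound, which is an assumption, and the example design values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Broadest ranges recited above, in nanometers: (minimum, maximum).
RANGES_NM = {
    "support_membrane": (5.0, 500.0),   # lower support membrane 41
    "electrode": (1.0, 20.0),           # metal electrodes 42a and 42b
    "thin_dielectric": (0.5, 10.0),     # thin dielectric layer 43
    "top_dielectric": (5.0, 500.0),     # top dielectric layer 44
    "pore_diameter": (2.0, 50.0),       # pore 45
}

@dataclass
class JunctionGeometry:
    support_membrane: float
    electrode: float
    thin_dielectric: float
    top_dielectric: float
    pore_diameter: float

    def out_of_range(self) -> List[str]:
        """Names of dimensions that fall outside the broadest recited ranges."""
        return [name for name, (lo, hi) in RANGES_NM.items()
                if not lo <= getattr(self, name) <= hi]

# Hypothetical example design (all values in nm).
design = JunctionGeometry(50.0, 10.0, 2.0, 50.0, 20.0)
print(design.out_of_range())
```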
In some embodiments, a molecular motor (47) is attached to a bead 46 that is larger in size than the pore 45, thus attaching the motor 47 to the top electrode 42a via the top dielectric layer 44 once the bead 46 is pulled into the pore 45 by means of an attached charged molecule. In some embodiments, the bead 46 can be larger in diameter than the pore 45 by at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 15%, or at least 20%.
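The size relationship between the bead and the pore can be checked with a short calculation. This is a minimal sketch with hypothetical function names. The 24 nm bead and 20 nm pore are assumed example values.

```python
def bead_oversize_percent(bead_diameter_nm: float, pore_diameter_nm: float) -> float:
    """Percentage by which the bead diameter exceeds the pore diameter."""
    if pore_diameter_nm <= 0:
        raise ValueError("pore diameter must be positive")
    return 100.0 * (bead_diameter_nm - pore_diameter_nm) / pore_diameter_nm

def is_mechanically_trapped(bead_diameter_nm: float, pore_diameter_nm: float,
                            min_oversize_percent: float = 1.0) -> bool:
    """True if the bead exceeds the pore diameter by at least the chosen margin."""
    return bead_oversize_percent(bead_diameter_nm, pore_diameter_nm) >= min_oversize_percent

# Assumed example: a 24 nm bead at the mouth of a 20 nm pore.
print(bead_oversize_percent(24.0, 20.0))
print(is_mechanically_trapped(24.0, 20.0, min_oversize_percent=20.0))
```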
In some embodiments, a bead of somewhat smaller diameter can still be trapped at the opening of the device using a chemical approach. For example, if the opening in the top dielectric layer 44 in
In some embodiments, the molecular motor 47 may be a DNA polymerase attached to a double stranded DNA 48 at a double-single strand junction. The single stranded tail 50 that protrudes from the polymerase 47 is attached at its end 51 to the molecule to be sequenced (dashed line 49). In the event that the molecule to be sequenced is uncharged, it can also be ligated at its far end to a second piece of DNA 52 which will serve as a charged thread to pull the molecule 49 through the pore by means of electrophoresis. For example, the first and second fluid reservoirs can each include a reference electrode. By applying a voltage between these reference electrodes having a polarity opposite to that of the second piece of DNA 52, electrophoresis would pull the molecule 49 through the pore.
Examples of DNA polymerases include, but are not limited to, DNA polymerase I, DNA polymerase II, DNA polymerase III, DNA polymerase IV, DNA polymerase V, polymerase β, polymerase λ, polymerase σ, polymerase μ, polymerase α, polymerase δ, polymerase ε, polymerase η, polymerase ι, polymerase κ, polymerase Rev1, polymerase ζ, telomerase, polymerase γ, polymerase θ, polymerase ν, reverse transcriptase, T4 DNA polymerase, T7 DNA polymerase, and ϕ29 DNA polymerase.
DNA-binding proteins include transcription factors which modulate the process of transcription, various polymerases, nucleases which cleave DNA molecules, and histones which are involved in chromosome packaging and transcription in the cell nucleus. DNA-binding proteins can incorporate such domains as the zinc finger, the helix-turn-helix, and the leucine zipper (among many others) that facilitate binding to nucleic acid. There are also more unusual examples such as transcription activator like effectors. Examples of DNA-binding proteins include, but are not limited to, c-myb, AAF, abd-A, Abd-B, ABF-2, ABF1, ACE2, ACF, ADA2, ADA3, Adf-1, Adf-2a, ADR1, AEF-1, AF-2, AFP1, AGIE-BP1, AhR, AIC3, AIC4, AID2, AIIN3, ALF1B, alpha-1, alpha-CP1, alpha-CP2a, alpha-CP2b, alpha-factor, alpha-PAL, alpha2uNF1, alpha2uNF3, alphaA-CRYBP1, alphaH2-alphaH3, alphaMHCBF1, aMEF-2, AML1, AnCF, ANF, ANF-2, Antp, AP-1, AP-2, AP-3, AP-5, APETALA1, APETALA3, AR, ARG RI, ARG RII, Arnt, AS-C T3, AS321, ASF-1, ASH-1, ASH-3b, ASP, AT-13P2, ATBF1-A, ATF, ATF-1, ATF-3, ATF-3deltaZIP, ATF-adelta, ATF-like, Athb-1, Athb-2, Axial, abaA, ABF-1, Ac, ADA-NF1, ADD1, Adf-2b, AF-1, AG, AIC2, AIC5, ALF1A, alpha-CBF, alpha-CP2a, alpha-CP2b, alpha-IRP, alpha2uNF2, alphaH0, AmdR, AMT1, ANF-1, Ap, AP-3, AP-4, APETALA2, aRA, ARG RIII, ARP-1, Ase, ASH-3a, AT-BP1, ATBF1-B, ATF-2, ATF-a, ATF/CREB, Ato, B factor, B″, B-Myc, B-TFIID, band I factor, BAP, Bcd, BCFI, Bcl-3, beta-1, BETA1, BETA2, BF-1, BGP1, BmFTZ-F1, BP1, BR-C Z1, BR-C Z2, BR-C Z4, Brachyury, BRF1, BrlA, Brn-3a, Brn-4, Brn-5, BUF1, BUF2, B-Myb, BAF1, BAS1, BCFII, beta-factor, BETA3, BLyF, BP2, BR-C Z3, brahma, byr3, c-abl, c-Ets-1, c-Ets-2, c-Fos, c-Jun, c-Maf, c-myb, c-Myc, c-Qin, c-Rel, C/EBP, C/EBPalpha, C/EBPbeta, C/EBPdelta, C/EBPepsilon, C/EBPgamma, C1, CAC-binding protein, CACCC-binding factor, Cactus, Cad, CAD1, CAP, CArG box-binding protein, CAUP, CBF, CBP, CBTF, CCAAT-binding factor, CCBF, CCF, CCK-1a, CCK-1b, CD28RC, CDC10, Cdc68, CDF, cdk2, 
CDP, Cdx-1, Cdx-2, Cdx-3, CEBF, CEH-18, CeMyoD, CF1, Cf1a, CF2-I, CF2-II, CF2-III, CFF, CG-1, CHOP-10, Chox-2.7, CIIIB1, Clox, Cnc, CoMP1, core-binding factor, CoS, COUP, COUP-TF, CP1, CP1A, CP1B, CP2, CPBP, CPC1, CPE binding protein, CPRF-1, CPRF-2, CPRF-3, CRE-BP1, CRE-BP2, CRE-BP3, CRE-BPa, CreA, CREB, CREB-2, CREBomega, CREMalpha, CREMbeta, CREMdelta, CREMepsilon, CREMgamma, CREMtaualpha, CRF, CSBP-1, CTCF, CTF, CUP2, Cut, Cux, Cx, cyclin A, CYS3, D-MEF2, Da, DAL82, DAP, DAT1, DBF-A, DBF4, DBP, DBSF, dCREB, dDP, dE2F, DEF, Delilah, delta factor, deltaCREB, deltaE1, deltaEF1, deltaMax, DENF, DEP, DF-1, Dfd, dFRA, dioxin receptor, dJRA, Dl, Dll, Dlx, DM-SSRP1, DMLP1, DP-1, Dpn, Dr1, DRTF, DSC1, DSP1, DSXF, DSXM, DTF, E, E1A, E2, E2BP, E2F, E2F-BF, E2F-I, E4, E47, E4BP4, E4F, E4TF2, E7, E74, E75, EBF, EBF1, EBNA, EBP, EBP40, EC, ECF, ECH, EcR, eE-TF, EF-1A, EF-C, EF1, EFgamma, Egr, eH-TF, EIIa, EivF, EKLF, Elf-1, Elg, Elk-1, ELP, Elt-2, EmBP-1, embryo DNA binding protein, Emc, EMF, Ems, Emx, En, ENH-binding protein, ENKTF-1, epsilonF1, ER, Erg, Esc, ETF, Eve, Evi, Evx, Exd, Ey, f(alpha-epsilon), F-ACT1, f-EBP, F2F, factor 1-3, factor B1, factor B2, factor delta, factor I, FBF-A1, Fbf1, FKBP59, Fkh, FlbD, Flh, Fli-1, FLV-1, Fos-B, Fra-2, FraI, FRG Y1, FRG Y2, FTS, Ftz, Ftz-F1, G factor, G6 factor, GA-BF, GABP, GADD 153, GAF, GAGA factor, GAL4, GAL80, gamma-factor, gammaCAAT, gammaCAC, gammaOBP, GATA-1, GATA-2, GATA-3, GBF, GC1, GCF, GCN4, GCR1, GE1, GEBF-I, GF1, GFI, Gfi-1, GFII, GHF-5, GL1, Glass, GLO, GM-PBP-1, GP, GR, GRF-1, Gsb, Gsbn, Gsc, Gt, GT-1, Gtx, H, H16, H1lTF1, H2Babp1, H2RIIBP, H2TF1, H4TF-1, HAC1, HAP1, Hb, HBLF, HBP-1, HCM1, heat-induced factor, HEB, HEF-1B, HEF-1T, HEF-4C, HEN1, HES-1, HIF-1, HiNF-A, HIP1, HIV-EP2, Hlf, HMBI, HNF-1, HNF-3, Hox11, HOXA1, HOXA10, HOXA10PL2, HOXA11, HOXA2, HOXA3, HOXA4, HOXA5, HOXA7, HOXA9, HOXB1, HOXB3, HOXB4, HOXB5, HOXB6, HOXB7, HOXB8, HOXB9, HOXC5, HOXC6, HOXC8, HOXD1, HOXD10, HOXD11, HOXD12, HOXD13, HOXD4,
HOXD8, HOXD9, HP1 site factor, Hp55, Hp65, HrpF, HSE-binding protein, HSF1, HSF2, HSF24, HSF3, HSF30, HSF8, hsp56, Hsp90, HST, HSTF, I-POU, IBF, IBP-1, ICER, ICP4, ICSBP, Id1, Id2, Id3, Id4, IE1, EBP1, IEFga, IF1, IF2, IFNEX, IgPE-1, IK-1, IkappaB, IL-1 RF, IL-6 RE-BP, IL-6 RF, ILF, ILRF-A, IME1, INO2, INSAF, IPF1, IRBP, IRE-ABP, IREBF-1, IRF-1, ISGF-1, Isl-1, ISRF, ITF, IUF-1, Ixr1, JRF, Jun-D, JunB, JunD, K-2, kappaY factor, kBF-A, KBF1, KBF2, KBP-1, KER-1, Ker1, KN1, Kni, Knox3, Kr, kreisler, KRF-1, Krox-20, Krox-24, Ku autoantigen, KUP, Lab, LAC9, LBP, Lc, LCR-F1, LEF-1, LEF-1S, LEU3, LF-A1, LF-B1, LF-C, LF-H3beta, LH-2, Lim-1, Lim-3, lin-11, lin-31, lin-32, LIP, LIT-1, LKLF, Lmx-1, LRF-1, LSF, LSIRF-2, LVa, LVb-binding factor, LVc, LyF-1, Lyl-1, M factor, M-Twist, M1, m3, Mab-18, MAC1, Mad, MAF, MafB, MafF, MafG, MafK, Mal63, MAPF1, MAPF2, MASH-1, MASH-2, mat-Mc, mat-Pc, MATa1, MATalpha1, MATalpha2, MATH-1, MATH-2, Max1, MAZ, MBF-1, MBP-1, MBP-2, MCBF, MCM1, MDBP, MEB-1, Mec-3, MECA, mediating factor, MEF-2, MEF-2C, MEF-2D, MEF1, MEP-1, Meso1, MF3, Mi, MIF, MIG1, MLP, MNB1a, MNF1, MOK-2, MP4, MPBF, MR, MRF4, MSN2, MSN4, Msx-1, Msx-2, MTF-1, mtTF1, muEBP-B, muEBP-C2, MUF1, MUF2, Mxi1, Myef-2, Myf-3, Myf-4, Myf-5, Myf-6, Myn, MyoD, myogenin, MZF-1, N-Myc, N-Oct-2, N-Oct-3, N-Oct-4, N-Oct-5, Nau, NBF, NC1, NeP1, Net, NeuroD, neurogenin, NF III-a, NF-1, NF-4FA, NF-AT, NF-BA1, NF-CLE0a, NF-D, NF-E, NF-E1b, NF-E2, NF-EM5, NF-GMa, NF-H1, NF-IL-2A, NF-InsE1, NF-kappaB, NF-lambda2, NF-MHCIIA, NF-muE1, NF-muNR, NF-S, NF-TNF, NF-U1, NF-W1, NF-X, NF-Y, NF-Zc, NFalpha1, NFAT-1, NFbetaA, NFdeltaE3A, NFdeltaE4A, NFe, NFE-6, NFH3-1, NFH3-2, NFH3-3, NFH3-4, NGFI-B, NGFI-C, NHP, Nil-2-a, NIP, NIT2, Nkx-2.5, NLS1, NMH7, NP-III, NP-IV, NP-TCII, NP-Va, NRDI, NRF-1, NRF-2, Nrf1, Nrf2, NRL, NRSF form 1, NTF, NUC-1, Nur77, OBF, OBP, OCA-B, OCSTF, Oct-1, Oct-10, Oct-11, Oct-2, Oct-2.1, Oct-2.3, Oct-4, Oct-5, Oct-6, Oct-7, Oct-8, Oct-9, Oct-B2, Oct-R, Octa-factor, octamer-binding
factor, Odd, Olf-1, Opaque-2, Otd, Otx1, Otx2, Ovo, P, P1, p107, p130, p28 modulator, p300, p38erg, p40x, p45, p49erg, p53, p55, p55erg, p58, p65delta, p67, PAB1, PacC, Pap1, Paraxis, Pax-1, Pax-2, Pax-3, Pax-5, Pax-6, Pax-7, Pax-8, Pb, Pbx-1a, Pbx-1b, PC, PC2, PC4, PC5, Pcr1, PCRE1, PCT1, PDM-1, PDM-2, PEA1, PEB1, PEBP2, PEBP5, Pep-1, PF1, PGA4, PHD1, PHO2, PHO4, PHO80, Phox-2, Pit-1, PO-B, pointedP1, Pou2, PPAR, PPUR, PPYR, PR, PR A, Prd, PrDI-BF1, PREB, Prh protein a, protein b, protein c, protein d, PRP, PSE1, PTF, Pu box binding factor, PU.1, PUB1, PuF, PUF-I, Pur factor, PUT3, pX, qa-1F, QBP, R, R1, R2, RAd-1, RAF, RAP1, RAR, Rb, RBP-Jkappa, RBP60, RC1, RC2, REB1, RelA, RelB, repressor of CAR1 expression, REX-1, RF-Y, RF1, RFX, RGM1, RIM1, RLM1, RME1, Ro, RORalpha, Rox1, RPF1, RPGalpha, RREB-1, RRF1, RSRFC4, runt, RVF, RXR-alpha, RXR-beta, RXR-beta2, RXR-gamma, S-CREM, S-CREMbeta, S8, SAP-1a, SAP1, SBF, Sc, SCBPalpha, SCD1/BP, SCM-inducible factor, Scr, Sd, Sdc-1, SEF-1, SF-1, SF-2, SF-3, SF-A, SGC1, SGF-1, SGF-2, SGF-3, SGF-4, SIF, SIII, Sim, SIN1, Skn-1, SKO1, Slp1, Sn, SNP1, SNF5, SNAPC43, Sox-18, Sox-2, Sox-4, Sox-5, Sox-9, Sox-LZ, Sp1, spE2F, Sph factor, Spi-B, Sprm-1, SRB10, SREBP, SRF, SRY, SSDBP-1, ssDBP-2, SSRP1, STAF-50, STAT, STAT1, STAT2, STAT3, STAT4, STAT5, STAT6, STC, STD1, Ste11, Ste12, Ste4, STM, Su(f), SUM-1, SWI1, SWI4, SWI5, SWI6, SWP, T-Ag, t-Pou2, T3R, TAB, all TAFs including subunits, Tal-1, TAR factor, tat, Tax, TBF1, TBP, TCF, TDEF, TEA1, TEC1, TEF, tel, Tf-LF1, TFE3, all TFII related proteins, TBA1a, TGGCA-binding protein, TGT3, Th1, TIF1, TIN-1, TIP, Tll, TMF, TR2, Tra-1, TRAP, TREB-1, TREB-2, TREB-3, TREF1, TREF2, Tsh, TTF-1, TTF-2, Ttk69k, TTP, Ttx, TUBF, Twi, TxREBP, TyBF, UBP-1, Ubx, UCRB, UCRF-L, UF1-H3beta, UFA, UFB, UHF-1, UME6, Unc-86, URF, URSF, URTF, USF, USF2, v-ErbA, v-Ets, v-Fos, v-Jun, v-Maf, v-Myb, v-Myc, v-Qin, v-Rel, Vab-3, vaccinia virus DNA-binding protein, Vav, VBP, VDR, VETF, vHNF-1, VITF, Vmw65, Vp1, Vp16, Whn,
WT1, X-box binding protein, X-Twist, X2BP, XBP-1, XBP-2, XBP-3, XF1, XF2, XFD-1, XFD-3, xMEF-2, XPF-1, XrpFI, W, XX, yan, YB-1, YEB3, YEBP, Yi, YPF1, YY1, ZAP, ZEM1, ZEM2/3, Zen-1, Zen-2, Zeste, ZF1, ZF2, Zfh-1, Zfh-2, Zfp-35, ZID, Zmhox1a, and Zta.
In some embodiments, the DNA-binding protein is a helicase. In some embodiments, the DNA-binding protein is an endonuclease. In some embodiments, the DNA-binding protein is a DNA repair protein.
In some embodiments, referring to
In some embodiments, referring to
One of skill in the art will appreciate that incorporating an abasic strand into the construct (as shown in
In some embodiments, rolling circle amplification (RCA) may be exploited. Referring to
The molecule to be sequenced 84 may be attached to the tether by a covalent linkage of the kind described below. In the presence of nucleotides, the double stranded region is extended until the polymerase reaches the 5′ end of the primer. At this point, the polymerase can push the synthesized strand off the circle at a rate that depends on the concentration of free nucleotides, continuing the amplification. This can allow the molecule to be sequenced 84 to be pulled down into the reading junction where its sequence can be read. Once again, the molecule to be sequenced can be attached to a nucleic acid ‘thread molecule’ if its charge is insufficient, as shown in
Some of the embodiments have been described in the context of a layered tunnel junction with a pore running through the layers. However, the same principles can apply to a tunnel junction in which the electrodes lie opposite one another in a plane, separated by a small gap that forms a tunnel junction. In this case, the constriction that can be used to transport the molecules to the junction would be a narrow channel lying across the junction. The mouth of the constriction would then serve as a point to trap the bead (46) so that the motion of the polymer down the channel could be controlled as described above.
A component of some of the embodiments includes a method for tethering the molecule to be sequenced to the 5′ or 3′ end of DNA. We have described a method whereby peptide chains can be reliably attached to DNA at their N-terminus (Biswas, Song et al. 2015), thus allowing peptides to be sequenced via the characteristic signals produced by their amino acid residues (Zhao, Ashcroft et al. 2014) if they are pulled through the tunnel junction in the manner outlined in some of the embodiments of the present disclosure. The contents of these references are incorporated by reference in their entireties.
In the present disclosure, we also describe a method for attaching oligosaccharides to a DNA molecule. Referring to
Any and all references to publications or other documents, including but not limited to, patents, patent applications, articles, webpages, books, etc., presented in the present application, are herein incorporated by reference in their entirety.
Example embodiments of the devices, systems and methods have been described herein. These embodiments have been described for illustrative purposes only and are not limiting. Other embodiments are possible and are covered by the disclosure, which will be apparent from the teachings contained herein. Thus, the breadth and scope of the disclosure should not be limited by any of the above-described embodiments, but should be defined only in accordance with claims supported by the present disclosure and their equivalents. Moreover, embodiments of the subject disclosure may include methods, systems and devices that include any and all elements from any other disclosed methods, systems, and devices, including any and all elements corresponding to sequencing molecules and the preparation of such molecules for sequencing. In other words, elements from one or another disclosed embodiment may be interchangeable with elements from other disclosed embodiments. In addition, one or more features/elements of disclosed embodiments may be removed and still result in patentable subject matter (thus resulting in yet more embodiments of the subject disclosure). Correspondingly, some embodiments of the present disclosure may be patentably distinct from one and/or another reference by specifically lacking one or more elements/features. In other words, claims to certain embodiments may contain a negative limitation to specifically exclude one or more elements/features, resulting in embodiments which are patentably distinct from the prior art which includes such features/elements.
Apweiler, R., et al. (1999). “On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database.” Biochimica et Biophysica Acta 1473: 4-8.
Biswas, S., et al. (2015). “Click Addition of a DNA Thread to the N-Termini of Peptides for Their Translocation through Solid-State Nanopores.” ACS Nano 9 (10): 9652-9664.
Hart, G. W. and R. J. Copeland (2010). “Glycomics hits the big time.” Cell 143 (5): 672-676.
Hofmann, J., et al. (2015). “Identification of carbohydrate anomers using ion mobility-mass spectrometry.” Nature 526 (7572): 241-244.
Kawai, T. and S. Akira (2009). “The roles of TLRs, RLRs and NLRs in pathogen recognition.” International Immunology 21 (4): 317-337.
Manrao, E. A., et al. (2012). “Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase.” Nat Biotechnol 30 (4): 349-353.
Nagy, G. and N. L. Pohl (2015). “Monosaccharide identification as a first step toward de novo carbohydrate sequencing: mass spectrometry strategy for the identification and differentiation of diastereomeric and enantiomeric pentose isomers.” Analytical Chemistry 87 (8): 4566-4571.
Ohtsubo, K. and J. D. Marth (2006). “Glycosylation in cellular mechanisms of health and disease.” Cell 126 (5): 855-867.
Parodi, A. J. (2000). “Protein glucosylation and its role in protein folding.” Annu Rev Biochem 69: 69-93.
Pinho, S. S. and C. A. Reis (2015). “Glycosylation in cancer: mechanisms and clinical implications.” Nature Reviews: Cancer 15 (9): 540-555.
Zhang, X. L. (2006). “Roles of glycans and glycopeptides in immune system and immune-related diseases.” Curr Med Chem 13 (10): 1141-1147.
Zhao, Y., et al. (2014). “Single-molecule spectroscopy of amino acids and peptides by recognition tunnelling.” Nature Nanotechnology 9: 466-473.
Zhao, Y. Y., et al. (2008). “Functional roles of N-glycans in cell signaling and cell adhesion in cancer.” Cancer Science 99 (7): 1304-1310.
This application claims priority to and the benefit of U.S. Ser. No. 62/400,530, filed Sep. 27, 2016, the contents of which are incorporated herein by reference in their entireties.
This invention was made with government support under R01 HG006323 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2017/053561 | 9/26/2017 | WO | 00 |
Number | Date | Country |
---|---|---|
62400530 | Sep 2016 | US |