A Sequence Listing associated with this application is being filed concurrently herewith in XML format and is hereby incorporated by reference into the present specification. The XML file containing the Sequence Listing is titled “Sequence_Listing.xml”, was created on Nov. 6, 2024, and is 7866 bytes in size.
The present disclosure relates to a method for maintaining nanopore sequencing speed, and further provides a method for sequencing a double-stranded target polynucleotide and a kit for sequencing a double-stranded target polynucleotide.
With the continuous development of nanotechnology, nanopore sequencing technology, which can read nucleic acid sequences at the single-molecule level, has developed rapidly. Compared with next-generation sequencing technology, it has clear advantages in portability, sequencing read length and sequencing speed. In 1996, scientists used the alpha-hemolysin protein to identify different bases for the first time (Kasianowicz, John J., et al. Proceedings of the National Academy of Sciences 93.24 (1996): 13770-13773.), which pioneered DNA sequencing based on the principle of nanopore detection.
Oxford Nanopore Technologies, a British company, has launched a number of sequencers with different throughputs. The principle is as follows: transmembrane proteins with nanometer-scale diameters are inserted into a polymer membrane to form nanopore channels in the membrane. An electrode and conductive buffer are placed on each side of the membrane. When a voltage is applied, ions flow through the transmembrane proteins and generate a current. The double-stranded DNA of a sequencing library is unwound into single-stranded DNA by a helicase, and the single-stranded DNA passes through the transmembrane protein under the electric field force, producing characteristic current signals. These signals are analyzed by an algorithm to read the sequence. The helicase uses the energy generated by the hydrolysis of adenosine triphosphate (ATP) or guanosine triphosphate (GTP) to move along the nucleic acid backbone in a specific direction (5′-3′ or 3′-5′) and unwind the hydrogen-bonded double-stranded DNA into single-stranded DNA.
The inventors of the present disclosure have found a problem in the above sequencing process: as sequencing progresses, the sequencing speed of the nanopore decreases. This decrease affects sequencing throughput and sequencing accuracy. Therefore, to maintain high throughput and high accuracy throughout the sequencing process, a method that can maintain nanopore sequencing speed is needed.
To solve the above problem, the inventors of the present disclosure found through numerous experiments that the nanopore sequencing speed is related to the composition of the conductive buffer on one side or both sides of the membrane. Further, through numerous experiments and repeated explorations, the inventors realized for the first time that simultaneously controlling the amounts of ADP and ATP in the above sequencing process is the key to controlling the sequencing rate, and thus completed the present disclosure.
Therefore, the present disclosure provides a method for sequencing a double-stranded target polynucleotide, including:
In the method herein, different nucleotides (e.g., A, T, C and G) have different effects on the current passing through the transmembrane pore, and the transmembrane pore can distinguish the nucleotides with similar structures. Thus, in some embodiments, individual nucleotides can be identified at the single molecule level according to the amplitude of current or the duration of the interaction when each nucleotide interacts with the transmembrane pore. If a characteristic current associated with the nucleotide is detected flowing through the pore, then the type (e.g., A, T, C and G) of the nucleotide passing through the pore can be determined, that is, the sequencing is realized. By continuously identifying the nucleotides in the target polynucleotide, the sequence of the target polynucleotide can be estimated or determined.
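As a toy illustration of the identification principle described above, the following Python sketch assigns each measured current level to the nucleotide whose reference level is closest. The reference levels, the function name `call_bases`, and the simplification of one current level per nucleotide are assumptions made purely for illustration; they are not values or methods taken from the present disclosure.

```python
# Hypothetical reference current levels in picoamperes (illustrative only;
# not measured values from the present disclosure).
REFERENCE_LEVELS_PA = {"A": 52.0, "C": 47.0, "G": 58.0, "T": 41.0}


def call_bases(current_levels_pa):
    """Assign each measured level to the nucleotide with the nearest reference level."""
    sequence = []
    for level in current_levels_pa:
        # Pick the base whose reference level has the smallest absolute distance.
        base = min(
            REFERENCE_LEVELS_PA,
            key=lambda b: abs(REFERENCE_LEVELS_PA[b] - level),
        )
        sequence.append(base)
    return "".join(sequence)


if __name__ == "__main__":
    measured = [51.2, 41.9, 57.1, 46.5]  # hypothetical measurements
    print(call_bases(measured))
```

A practical signal-analysis algorithm would also need to handle noise, dwell time, and the influence of neighboring nucleotides, which this sketch deliberately ignores.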
In some embodiments, the method further includes measuring the current passing through the pore during the interaction with the nucleotide. Therefore, in some embodiments, the method also provides a circuit capable of applying an electric potential and measuring an electrical signal across the membrane and the pore. In some embodiments, the method also provides a patch clamp or voltage clamp.
In the method herein, the double strands of the double-stranded target polynucleotide are separated by the helicase to form the single-stranded target polynucleotide. In some embodiments, the helicase uses the energy generated by the hydrolysis of adenosine triphosphate (ATP) or guanosine triphosphate (GTP) to move along the nucleic acid backbone in a specific direction (5′-3′ or 3′-5′) to unwind the hydrogen-bonded double-stranded DNA into single-stranded DNA.
In some embodiments, the helicase controls the movement of the single-stranded target polynucleotide along the field generated by the applied voltage to pass through the pore. In some embodiments, the helicase acts as a brake, preventing the single-stranded target polynucleotide from moving through the pore too quickly under the influence of the applied voltage. In some embodiments, the method further includes: (d) reducing the voltage applied across the pore so that the single-stranded target polynucleotide moves through the pore in a direction opposite to that in step (b), and a portion of nucleotides in the polynucleotide interact with the pore to measure the current passing through the pore during each interaction, thereby performing a correction read on the sequence of the target polynucleotide obtained in step (c).
In some embodiments, the single-stranded target polynucleotide may interact with the pore on either side of the membrane. The single-stranded target polynucleotide may interact with the pore at any position in any manner.
In some embodiments, the ATP-generating enzyme is selected from: pyruvate kinase, acetate kinase, creatine phosphokinase, serine kinase, threonine kinase, tyrosine kinase, FoF1-ATPase, polyphosphate kinase, nucleoside diphosphate kinase, or any combination thereof.
In some embodiments, the ATP-generating enzyme is a pyruvate kinase, and the ATP-generating substrate is phosphoenolpyruvate. In some embodiments, the phosphoenolpyruvate is 5 mM and the pyruvate kinase is 0.02 U/mL. In some embodiments, the phosphoenolpyruvate is 5 mM and the pyruvate kinase is 0.2 U/mL. In some embodiments, the phosphoenolpyruvate is 5 mM and the pyruvate kinase is 2 U/mL.
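To put the enzyme amounts above in perspective, the following sketch converts an enzyme activity in U/mL into an upper-bound turnover rate and estimates the shortest time in which 5 mM substrate could be consumed. It assumes the conventional unit definition (1 U converts 1 micromole of substrate per minute under the enzyme's assay conditions) and assumes the enzyme runs at that full rate throughout, which overestimates the real rate because regeneration is also limited by how much ADP is available. The function names are hypothetical.

```python
def max_regeneration_rate_mM_per_min(enzyme_u_per_ml):
    """Upper-bound turnover rate, assuming 1 U = 1 micromole of substrate per minute.

    1 micromole per mL equals 1 mM, so U/mL maps directly onto mM/min.
    """
    return enzyme_u_per_ml * 1.0


def min_depletion_time_min(substrate_mM, enzyme_u_per_ml):
    """Shortest possible time to consume the substrate at the upper-bound rate."""
    return substrate_mM / max_regeneration_rate_mM_per_min(enzyme_u_per_ml)


if __name__ == "__main__":
    # The three pyruvate kinase activities recited above, each with 5 mM substrate.
    for units_per_ml in (0.02, 0.2, 2.0):
        minutes = min_depletion_time_min(5.0, units_per_ml)
        print(f"{units_per_ml} U/mL: at least {minutes:.1f} min to consume 5 mM")
```

Because the actual consumption rate follows the rate at which the helicase produces ADP, these figures are lower bounds on how long the substrate lasts, not predictions.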
In some embodiments, the ATP-generating enzyme is the acetate kinase, and the ATP-generating substrate is lithium potassium acetyl phosphate. In some embodiments, the lithium potassium acetyl phosphate is 5 mM and the acetate kinase is 0.02 U/mL.
In some embodiments, the ATP-generating enzyme is the creatine phosphokinase, and the ATP-generating substrate is creatine phosphate disodium.
In some embodiments, the ATP-generating enzyme is the FoF1-ATPase, and the ATP-generating substrate is inorganic phosphate.
In some embodiments, the ATP-generating enzyme is the creatine phosphokinase, and the ATP-generating substrate is creatine phosphate.
In some embodiments, the ATP-generating enzyme is the polyphosphate kinase, and the ATP-generating substrate is polyphosphate.
In some embodiments, the ATP-generating enzyme is the nucleoside diphosphate kinase, and the ATP-generating substrate is nucleoside triphosphate.
The amounts of the ATP-generating enzyme and the ATP-generating substrate are not limited to the specific concentrations and amounts used in the embodiments, as long as the concentration of ATP can be kept relatively constant during the sequencing process so that the sequencing rate remains stable. During the sequencing process, the substrate is consumed under the action of the ATP-generating enzyme, so its amount gradually decreases. Those skilled in the art are able to adjust the concentrations and amounts of the ATP-generating enzyme and the ATP-generating substrate in a sequencing system according to the experimental purpose.
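The behavior described above (ATP held roughly constant while the substrate is gradually consumed) can be illustrated with a minimal kinetic sketch. The model below is an assumption for illustration only: it uses simple mass-action rates, a hypothetical starting ATP concentration, and hypothetical rate constants, none of which come from the present disclosure.

```python
def simulate_atp(minutes, atp_mM, substrate_mM, k_hydrolysis, k_regen, dt=0.01):
    """Euler integration of a toy two-reaction model.

    Hydrolysis (helicase):  ATP -> ADP              rate = k_hydrolysis * ATP
    Regeneration (enzyme):  ADP + substrate -> ATP  rate = k_regen * ADP * substrate
    All rate constants are hypothetical.
    """
    adp_mM = 0.0
    steps = int(round(minutes / dt))
    for _ in range(steps):
        hydrolysis = k_hydrolysis * atp_mM * dt
        regeneration = k_regen * adp_mM * substrate_mM * dt
        # Never regenerate more than the available ADP or substrate.
        regeneration = min(regeneration, substrate_mM, adp_mM)
        atp_mM += regeneration - hydrolysis
        adp_mM += hydrolysis - regeneration
        substrate_mM -= regeneration
    return atp_mM, adp_mM, substrate_mM


if __name__ == "__main__":
    # Hypothetical conditions: 2 mM ATP, 5 mM substrate, 120 minutes.
    with_regen = simulate_atp(120, 2.0, 5.0, k_hydrolysis=0.01, k_regen=1.0)
    without_regen = simulate_atp(120, 2.0, 5.0, k_hydrolysis=0.01, k_regen=0.0)
    print("with regeneration (ATP, ADP, substrate):", with_regen)
    print("without regeneration (ATP, ADP, substrate):", without_regen)
```

Under these assumed constants, the run with regeneration keeps ATP close to its starting value and ADP low while the substrate declines, whereas the run without regeneration shows ATP falling and ADP accumulating.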
In some embodiments, the helicase is selected from Dda, UvrD, Rep, RecQ, PcrA, eIF4A, NS3, gp41, T7gp4, or any combination thereof.
In some embodiments, the helicase is further linked to an additional polypeptide, the additional polypeptide being selected from a label, an enzyme cleavage site, a signal peptide or leading peptide, a detectable marker, or any combination thereof.
In some embodiments, the helicase is a wild type Dda or a mutant thereof.
In some embodiments, the helicase has an amino acid sequence as set forth in SEQ ID NO: 2.
In some embodiments, the transmembrane pore is a transmembrane protein pore or a transmembrane solid-state pore.
In some embodiments, the transmembrane protein pore is selected from hemolysin, MspA, MspB, MspC, MspD, Frac, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, phi29 connector, T7 connector, GspD, InvG, or any combination thereof.
In some embodiments, the transmembrane pore is further linked to an additional polypeptide, the additional polypeptide being selected from a label, an enzyme cleavage site, a signal peptide or leading peptide, a detectable marker, or any combination thereof.
In some embodiments, the transmembrane pore is a wild type CsgG protein or a mutant thereof. In some embodiments, the transmembrane pore has an amino acid sequence as set forth in SEQ ID NO: 1.
In some embodiments, the membrane is an amphiphilic layer (e.g., phospholipid bilayer) or a polymer membrane (e.g., di-block and tri-block).
The method described above can be performed using any suitable membrane. In some embodiments, the membrane is a phospholipid bilayer, wherein the transmembrane pore is inserted into the phospholipid bilayer.
In some embodiments, the method is generally performed using: (i) an artificial bilayer containing pores, (ii) an isolated naturally-occurring lipid bilayer containing pores, or (iii) a cell having pores inserted therein. The method is preferably performed using an artificial bilayer (e.g., an artificial phospholipid bilayer).
In some embodiments, the double-stranded target polynucleotide is a double-stranded DNA and/or a double-stranded DNA-RNA hybrid.
In some embodiments, the double-stranded target polynucleotide is naturally occurring and/or artificially synthesized.
In some embodiments, the double-stranded target polynucleotide is obtained from biological samples extracted from viruses, prokaryotes (e.g., bacteria), eukaryotes (e.g., plants (e.g., cereals, legumes, fruits or vegetables)), mammals (e.g., horses, cattle, mice or humans), or any combination thereof.
In some embodiments, the double strands of the double-stranded target polynucleotide are linked by a bridging part at or near one end of the target polynucleotide; and the bridging part is selected from a polymer linker, a chemical linker, a polynucleotide or a polypeptide.
In some embodiments, the double-stranded target polynucleotide is circular or linear.
In some embodiments, all or only a portion of the double-stranded target polynucleotide may be sequenced using the method described above. The double-stranded target polynucleotide may be of any length. For example, the double-stranded target polynucleotide may be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotide pairs in length. The double-stranded target polynucleotide may be 1,000 or more nucleotide pairs, 5,000 or more nucleotide pairs, or 100,000 or more nucleotide pairs in length. The double-stranded target polynucleotide may be naturally occurring or artificially synthesized. For example, the method can be used to verify the sequence of an artificially synthesized oligonucleotide. The method is usually performed in vitro.
In some embodiments, the double-stranded target polynucleotide includes at least one single-stranded overhang (e.g., a 5′ overhang and/or a 3′ overhang); the single-stranded overhang includes a leader sequence; and the leader sequence guides a nucleic acid strand linked thereto into the pore.
In some embodiments, in step (b), the double-stranded target polynucleotide is contacted with the helicase to form a complex, and then the complex is contacted with the transmembrane pore.
In some embodiments, the method described above is generally performed in the presence of a reagent for nanopore sequencing.
In some embodiments, in step (b), the reagent for nanopore sequencing is contacted with the helicase.
In some embodiments, the reagent for nanopore sequencing is selected from ATP, inorganic salts (e.g., chloride salt, e.g., sodium chloride, potassium chloride or lithium chloride), buffers (HEPES and/or Tris-HCl), EDTA, metal ions (e.g., Mn2+, Mg2+, Co+, Zn2+, Cu2+, Cu+, Ni+, Fe2+ or Fe3+), or any combination thereof.
In some embodiments, the concentration of the chloride salt is saturated.
In some embodiments, the concentration of the chloride salt is 0.1 M to 2.5 M, 0.3 M to 1.9 M, 0.5 M to 1.8 M, 0.7 M to 1.7 M, 0.9 M to 1.6 M or 1 M to 1.4 M.
In some embodiments, the method described above is performed at a pH of 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. In some embodiments, the method described above is performed at a pH of 7.5.
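For readers setting up a run, the following sketch checks which of the chloride concentration ranges and pH ranges recited above contain a given condition. The ranges are copied from the text in the order recited; the helper name `matching_ranges` and the example values (0.47 M chloride, pH 8.0) are illustrative assumptions.

```python
# Ranges recited in the text above, in the order recited.
CHLORIDE_RANGES_M = [
    (0.1, 2.5), (0.3, 1.9), (0.5, 1.8), (0.7, 1.7), (0.9, 1.6), (1.0, 1.4),
]
PH_RANGES = [
    (4.0, 12.0), (4.5, 10.0), (5.0, 9.0), (5.5, 8.8),
    (6.0, 8.7), (7.0, 8.8), (7.5, 8.5),
]


def matching_ranges(value, ranges):
    """Return every recited (low, high) range that contains the value, inclusive."""
    return [(low, high) for low, high in ranges if low <= value <= high]


if __name__ == "__main__":
    # Hypothetical example condition: 0.47 M chloride at pH 8.0.
    print("chloride:", matching_ranges(0.47, CHLORIDE_RANGES_M))
    print("pH:", matching_ranges(8.0, PH_RANGES))
```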
In another aspect, the present disclosure provides a kit, including: a membrane containing a transmembrane pore, a helicase, an ATP-generating enzyme and an ATP-generating substrate; and optionally, the kit further includes a reagent for nanopore sequencing.
In some embodiments, the reagent for nanopore sequencing is selected from ATP, inorganic salts (e.g., chloride salt, e.g., sodium chloride, potassium chloride or lithium chloride), buffers (HEPES and/or Tris-HCl), EDTA, metal ions (e.g., Mn2+, Mg2+, Co+, Zn2+, Cu2+, Cu+, Ni+, Fe2+ or Fe3+), or any combination thereof.
In some embodiments, the ATP-generating enzyme is selected from: pyruvate kinase, acetate kinase, creatine phosphokinase, serine kinase, threonine kinase, tyrosine kinase, FoF1-ATPase, polyphosphate kinase, nucleoside diphosphate kinase, or any combination thereof.
In some embodiments, the kit includes: a membrane containing a transmembrane pore; a helicase; ATP; and (I) pyruvate kinase and phosphoenolpyruvate, (II) acetate kinase and lithium potassium acetyl phosphate, (III) creatine phosphokinase and creatine phosphate, (IV) FoF1-ATPase and inorganic phosphate, (V) polyphosphate kinase and polyphosphate, (VI) nucleoside diphosphate kinase and nucleoside triphosphate, or any combination of (I) to (VI).
In some embodiments, the kit is used to sequence the double-stranded target polynucleotide.
In some embodiments, the helicase is selected from Dda, UvrD, Rep, RecQ, PcrA, eIF4A, NS3, gp41, T7gp4, or any combination thereof.
In some embodiments, the helicase is further linked to an additional polypeptide, the additional polypeptide being selected from a label, an enzyme cleavage site, a signal peptide or leading peptide, a detectable marker, or any combination thereof.
In some embodiments, the helicase is a wild type Dda or a mutant thereof.
In some embodiments, the helicase has an amino acid sequence as set forth in SEQ ID NO: 2.
In some embodiments, the transmembrane pore is a transmembrane protein pore or a transmembrane solid-state pore. In some embodiments, the transmembrane protein pore is selected from hemolysin, MspA, MspB, MspC, MspD, Frac, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, phi29 connector, T7 connector, GspD, InvG, or any combination thereof.
In some embodiments, the transmembrane pore is further linked to an additional polypeptide, the additional polypeptide being selected from a label, an enzyme cleavage site, a signal peptide or leading peptide, a detectable marker, or any combination thereof.
In some embodiments, the transmembrane pore is a wild type CsgG protein or a mutant thereof. In some embodiments, the transmembrane pore has an amino acid sequence as set forth in SEQ ID NO: 1.
In some embodiments, the membrane is an amphiphilic layer (e.g., phospholipid bilayer) or a polymer membrane (e.g., di-block and tri-block).
The present disclosure further provides use of the above kit in the preparation of a sequencing device for sequencing the double-stranded target polynucleotide.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as those commonly understood by those of ordinary skill in the art to which the present disclosure belongs. All patents, applications and other publications mentioned herein are hereby incorporated by reference in their entirety. In the event of a conflict or inconsistency between the definitions set forth herein and the definitions set forth in patents, applications, and other publications incorporated herein by reference, the definitions set forth herein shall control.
As used herein, the term “polynucleotide” is a macromolecule containing one, two or more than two nucleotides. The polynucleotide can include any combination of any nucleotide, which may be naturally occurring or artificial. The nucleotide generally contains a nucleobase, a sugar and at least one phosphate group.
As used herein, the term “transmembrane pore” is a structure that allows hydrated ions driven by an applied electric potential to flow from one side of the membrane to the other side of the membrane, and the transmembrane pore is generally inserted into the membrane (e.g., lipid bilayer). The transmembrane pore is preferably a transmembrane protein pore. The “transmembrane protein pore” is a polypeptide or a collection of polypeptides that allows hydrated ions to flow from one side of the membrane to the other side of the membrane. The transmembrane protein pore allows the polynucleotide to move and pass through the pore. The transmembrane protein pore generally includes a barrel or channel through which ions can flow. The transmembrane protein pore generally includes amino acids that facilitate the interaction with a target nucleotide, and these amino acids are preferably located near a constriction of the barrel or channel.
As used herein, the term “movement/move” refers to the single-stranded target polynucleotide moving from one side of the pore to the other side. The movement of the single-stranded polynucleotide through the pore may be driven by an electric potential, by enzymatic action, or by both. The movement may be unidirectional or bidirectional.
As used herein, the term “ATP-generating enzyme” refers to an enzyme that can catalyze the substrates in a system to generate ATP.
As used herein, the term “ATP-generating substrate” refers to a substance that can be converted into ATP, or used to generate ATP, in the presence of the ATP-generating enzyme. In some embodiments, the ATP-generating substrate is capable of reacting with ADP to generate ATP under the catalysis of the ATP-generating enzyme.
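For three of the enzyme and substrate pairs named in the embodiments, the corresponding reactions can be written out as follows. These are the commonly described textbook reactions, shown here only as an illustration of the definition above and not quoted from the present disclosure; counter-ions such as lithium, potassium or sodium are omitted.

```latex
\begin{align*}
\text{ADP} + \text{phosphoenolpyruvate} &\xrightarrow{\text{pyruvate kinase}} \text{ATP} + \text{pyruvate} \\
\text{ADP} + \text{acetyl phosphate} &\xrightarrow{\text{acetate kinase}} \text{ATP} + \text{acetate} \\
\text{ADP} + \text{creatine phosphate} &\xrightarrow{\text{creatine phosphokinase}} \text{ATP} + \text{creatine}
\end{align*}
```

In each case one ADP is consumed for every ATP formed, so the reaction lowers the amount of ADP and raises the amount of ATP at the same time.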
The present disclosure provides a method capable of consistently keeping sequencing speed stable during the sequencing process, and further maintaining high throughput and high accuracy of sequencing. Compared with other methods (for example, overall change of the sequencing buffer, or addition of ATP), the method and the kit of the present disclosure can keep the concentration of ATP relatively constant during the sequencing process, so that the sequencing rate can be better kept stable and unchanged.
The information on some of the sequences involved in the present disclosure is shown in Table 1 below.
The present disclosure is described by reference to the following examples which are intended to illustrate the present disclosure by examples (not to limit the present disclosure).
Unless otherwise specified, the experimental methods of molecular biology used in the present disclosure are performed basically with reference to the methods described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., eds., Comprehensive Guide to Molecular Biology Experiments, 3rd edition, John Wiley & Sons, Inc., 1995. Those skilled in the art will understand that the examples describe the present disclosure by way of illustration and are not intended to limit the protection scope claimed by the present disclosure.
The inventors of the present disclosure explored the sequencing system from multiple angles and unexpectedly discovered that adding ADP to the sequencing buffer of the system reduces the sequencing rate.
Specifically, a CsgG transmembrane protein mutant (the amino acid sequence of the wild type is shown in SEQ ID NO: 1, and the mutation sites are Y51A/F56Q/R97W/R192D) was used as the nanopore, and a nanopore detection method was used for detection. After a single CsgG transmembrane protein was inserted into the phospholipid bilayer, a buffer (470 mM KCl, 25 mM HEPES, and 1 mM EDTA, at pH 8.0) without fuel (ATP) was flowed through the system to remove excess transmembrane protein. A constructed pUC57 sequencing library (SEQ ID NO: 3) including T4 Dda helicase (the amino acid sequence of the wild type is shown in SEQ ID NO: 2, and the mutation sites are E94C/C109A/C136A/A360C) and the sequencing buffer were pre-mixed and then added to the nanopore experimental system. Sequencing through the CsgG nanopore was detected at a voltage of 0.18 V. Wherein:
The experimental results are shown in
Based on the above example, the applicant has further explored the sequencing system, i.e., adding components to the sequencing buffer of the system to reduce the amount of ADP generated by the system during the sequencing process, while increasing the amount of ATP in the system during the sequencing process.
Specifically, a CsgG transmembrane protein mutant (the amino acid sequence of the wild type is shown in SEQ ID NO: 1, and the mutation sites are Y51A/F56Q/R97W/R192D) was used as the nanopore, and a nanopore detection method was used for detection. After a single CsgG transmembrane protein was inserted into the phospholipid bilayer, a buffer (470 mM KCl, 25 mM HEPES, and 1 mM EDTA, at pH 8.0) without fuel (ATP) was flowed through the system to remove excess transmembrane protein. A constructed pUC57 sequencing library (SEQ ID NO: 3) including T4 Dda helicase (the amino acid sequence of the wild type is shown in SEQ ID NO: 2, and the mutation sites are E94C/C109A/C136A/A360C) and the sequencing buffer were pre-mixed and then added to the nanopore experimental system. Sequencing through the CsgG nanopore was detected at a voltage of 0.18 V. Wherein:
Three concentrations were set in the experimental groups:
The experimental results of the experimental groups 1, 2 and 3 are shown in
Moreover, the experimental results in
The experimental conditions and steps of the present disclosure are the same as those of Example 2, except that:
The experimental results (
The experimental conditions and steps of the present disclosure are the same as those of Example 2, except that:
The experimental results (
Although the specific embodiments of the present disclosure have been described in detail, those skilled in the art will understand that various modifications and changes may be made to the details based on all the teachings published, and these changes are all within the protection scope of the present disclosure. The entire scope of the present disclosure is defined by the appended claims and any equivalents thereof.
This application is a continuation of International Application No. PCT/CN2022/095500, filed on May 27, 2022, the content of which is incorporated herein by reference in its entirety.
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2022/095500 | May 2022 | WO |
Child | 18960408 | | US |