The Sequence Listing associated with this application is provided in xml format in lieu of a paper copy, and is hereby incorporated by reference into the specification. The name of the xml file containing the Sequence Listing is P37013 seq list.xml. The xml file is 27,648 bytes, was created on Oct. 31, 2023, and is being submitted electronically via Patent Center.
Embodiments described herein relate to nanopore-based sequencing methods and systems. In particular, methods and systems include methods to generate consensus reads using Sequencing By eXpansion (SBX™).
Nanopore membrane devices having pore sizes on the order of one nanometer in internal diameter have shown promise in rapid nucleotide sequencing. When a voltage potential is applied across a nanopore immersed in a conducting fluid, a small ion current attributed to the conduction of ions across the nanopore can exist. The size of the current is sensitive to the pore size and to which molecule is in the nanopore. The molecule can be a particular reporter code corresponding to a particular nucleotide, thereby allowing detection of a nucleotide at a particular position of a nucleic acid. A voltage or other signal in a circuit including the nanopore can be measured (e.g., at an integrating capacitor) as a way of measuring the resistance of the molecule, thereby allowing detection of which molecule is in the nanopore.
A nanopore based sequencing chip may be used for DNA sequencing. A nanopore based sequencing chip can incorporate a large number of sensor cells configured as an array. For example, an array of one million cells may include 1000 rows by 1000 columns of cells.
The signals that are measured can vary from chip to chip and from cell to cell of a same chip due to manufacturing variability. Therefore, it can be difficult to determine the correct molecule, which may be or correspond to the correct nucleotide in a particular nucleic acid or other polymer in a cell. In addition, other time dependent non-idealities in the measured signals can lead to inaccuracies. And, because these circuits employ biochemical circuit elements, e.g., lipid bilayers, nanopores, etc., the variability in the electrical characteristics can be much higher than for traditional semiconductor circuits. Further, sequencing processes are stochastic in nature, and thus variability can occur across a wide variety of systems, including sequencing devices not using nanopores.
Accordingly, improved characterization techniques are desired to improve the accuracy and stability of sequencing processes.
Embodiments described herein involve methods of sequencing and improvements to the sequencing of surrogate polymers encoded with nucleic acid information in a nanopore. The surrogate polymer (also referred to herein as an “Xpandomer” polymer) is formed by template directed synthesis that preserves the original genetic information of the target nucleic acid, while also increasing linear separation of the individual elements of the sequence data. The surrogate polymer is formed from a template nucleic acid molecule. A surrogate polymer includes multiple units. Each unit includes a reporter code portion or portions. The reporter codes correspond to the different nucleotides (e.g., A, T, C, G). The reporter codes generate different electrical signals in the nanopore and therefore allow identification of the nucleotide sequence. Each unit includes a translocation control element (TCE). To pass through the nanopore, the TCE requires a higher voltage to be applied as compared to the baseline voltage for driving the rest of the unit through the nanopore. Surrogate polymers can be passed forward and backward through a nanopore several times to allow for multiple reads. Surrogate polymers include one leader segment, which may get stuck in a membrane on one side of the nanopore when backing the surrogate polymers out of the nanopore. Embodiments described herein address these stuck surrogate polymers.
In order to allow for multiple reads on the surrogate polymers, a technique of processive consensus can be applied. The surrogate polymer may be moved a few units forward (e.g., 30) and then fewer units backward (e.g., 25) so that some of the same reporter codes are identified again. This method allows for multiple reads of the same reporter codes. The surrogate polymer eventually passes through the nanopore in the forward direction. Periodically, higher clearing voltages may be applied to clear any stuck surrogate polymer in the nanopore.
Clearing voltages may be applied more frequently but in a targeted manner. Cells (i.e., wells) where the surrogate polymer is not determined to be stuck within the nanopore may be deactivated before the clearing voltage is applied. These techniques allow molecules of any length to be sequenced at their full length with a depth of greater than one.
In view of the foregoing, an aspect of the present disclosure is a method for sequencing a target nucleic acid molecule, the method comprising applying a first number of voltage pulses at a first level across a nanopore to displace a compound a first distance in a first direction through the nanopore, the compound created from the target nucleic acid molecule, wherein the compound comprises a plurality of units, each unit of the plurality of units comprises one type of reporter element of a plurality of types of reporter elements, each type of reporter element corresponds to an identity of a nucleotide in the target nucleic acid molecule, and applying the first number of voltage pulses passes a first subset of the plurality of units through the nanopore; detecting, in the nanopore, the types of reporter elements in the first subset; applying a second number of voltage pulses at a second level across the nanopore to displace the compound a second distance in a second direction through the nanopore, wherein the first direction is opposite the second direction, the voltage pulses of the first number of voltage pulses have an opposite polarity as the voltage pulses of the second number of voltage pulses, the second distance is less than the first distance, and the second number is less than the first number; applying a third number of voltage pulses at a third level across the nanopore to displace the compound a third distance in the first direction through the nanopore, wherein applying the third number of voltage pulses passes a second subset of the plurality of units through the nanopore, the second subset and the first subset comprise some of the same units, the second subset comprises units not in the first subset, and the third distance is greater than the second distance; and detecting, in the nanopore, the types of reporter elements in the second subset.
In some embodiments, the method further comprises applying a clearing voltage at a fourth level across the nanopore to pass the compound entirely out of the nanopore, wherein the fourth level is greater than the first level, the second level, and the third level.
In some embodiments, the compound is a first compound of a plurality of compounds, the plurality of compounds is created from a plurality of target nucleic acid molecules, the nanopore is a first nanopore of a plurality of nanopores, and each compound of the plurality of compounds is in one nanopore of the plurality of nanopores, the method further comprising applying the first number of voltage pulses at the first level, the second number of voltage pulses at the second level, and the third number of voltage pulses at the third level to the plurality of nanopores.
In some embodiments, the method further comprises determining a plurality of sequences of the plurality of target nucleic acid molecules.
In some embodiments, the size distribution of the plurality of sequences has a mode greater than 300 nt.
In some embodiments, the method further comprises applying the first number of voltage pulses at the first level, the second number of voltage pulses at the second level, and the third number of voltage pulses at the third level to the plurality of nanopores; determining that a first portion of the plurality of compounds is being displaced in a first portion of the plurality of nanopores by the first number of voltage pulses, the second number of voltage pulses, or the third number of voltage pulses; and applying a clearing voltage at a fourth level across each nanopore of a second portion of the plurality of nanopores to pass a second portion of the plurality of compounds entirely out of the respective nanopore of the plurality of nanopores, wherein the fourth level is greater than the first level, the second level, and the third level, and the second portion of the plurality of nanopores does not comprise nanopores in the first portion of the plurality of nanopores.
In some embodiments, the method further comprises determining a sequence of the target nucleic acid molecule.
In some embodiments, determining the sequence of the target nucleic acid molecule comprises, for one or more units in both the first subset and the second subset, detecting the same type of reporter element.
In some embodiments, the method further comprises passing the compound entirely out of the nanopore.
In some embodiments, passing the compound entirely out of the nanopore occurs during the applying of the third number of voltage pulses.
In some embodiments, each unit of the plurality of units comprises a translocation control element, applying the first number of voltage pulses passes a first number of translocation control elements through the nanopore, and the first number of voltage pulses equals the first number of translocation control elements.
In some embodiments, the method further comprises applying a voltage at a fourth level across the nanopore to displace the compound a fourth distance in the first direction through the nanopore in between voltage pulses of the first number of voltage pulses, wherein the voltage at the fourth level is the same polarity as the voltage pulses of the first number of voltage pulses, the fourth level is less than the first level, and the compound after being displaced the fourth distance has a translocation control element in the nanopore.
In some embodiments, the second level is greater than the first level.
In some embodiments, the method further comprises measuring signal values for a nanopore having a voltage applied across the nanopore when reporter elements in the first subset of the plurality of units are in the nanopore; and determining, using the signal values, the types of reporter elements in the first subset, thereby determining the identities of nucleotides in the target nucleic acid molecule.
In some embodiments, the first subset of the plurality of units comprises 30 or more units.
In some embodiments, the third level is equal to the first level.
In some embodiments, the target nucleic acid molecule is longer than 200 nt.
In some embodiments, the first number of voltage pulses is 30 or more.
In some embodiments, the first number of voltage pulses may exceed the second number of voltage pulses by 5 or more.
Another aspect of the present disclosure is a computer product comprising a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform the aforementioned method and its embodiments.
Another aspect of the present disclosure is a system comprising the computer product, and one or more processors for executing instructions stored on the computer readable medium.
Further aspects of the present disclosure are a system comprising means for performing any of the above methods, a system comprising one or more processors configured to perform any of the above methods, and a system comprising modules that respectively perform the steps of any of the above methods.
A better understanding of the nature and advantages of embodiments of the present invention may be gained with reference to the following detailed description and the accompanying drawings.
Embodiments described herein include protocols for nanopore sequencing, including the Sequencing By eXpansion (SBX™) protocol. Protocols include a “lead” portion of the molecule that has the property that it enters the pore easily in the forward direction but has a high barrier to translocating through the pore in the reverse direction. Additionally, embodiments include a modified bright and dark period applied voltage pattern that is designed to electrically trap surrogate polymer (e.g., Xpandomer) molecules in the nanopore and position molecules in a controlled way so as to allow for multiple passes or reads on the same surrogate polymer molecule. Furthermore, embodiments include applying higher voltage pulses and/or longer cycle times on periodic bright/dark cycles to finally clear the molecule from the pore. Some embodiments include employing selective application of higher voltage pulses to clear molecules only from specific pores in each global pulse application period. Advantages include enrichment of compound molecular trace events (i.e., subreads from the same molecule in a series of bright periods). Other advantages may include increases in pore occupancy and raw base call throughput. Furthermore, protocols may result in higher accuracy multipass reads, as compared to single pass reads. Embodiments may include running with much shorter bright periods without permanently cutting off reads of long surrogate polymer molecules. This may allow for practical experimental conditions that lead to short bright period decay time constants.
Additionally, embodiments may include the ability to dynamically elect to spend more time on surrogate polymers from specific UMI (unique molecular identifier) Molecular Families and spend less time on surrogate polymers from other UMI Molecular Families. UMIs may be added to sample nucleic acids during the sample prep phase. Every nucleic acid fragment from a particular sample will have the same UMI. Different samples will have different UMIs. The UMIs allow for pooling different samples together for sequencing while identifying the sample of each nucleic acid molecule. During sequencing, one may determine that sequences from a particular UMI family (i.e., sample) may need additional sequencing. For example, the confidence of certain base calls may be low due to a relatively high number of different base calls for the same position. Fragments from these samples may be sequenced extra times using the methods described herein.
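As a rough illustration of the dynamic selection described above, the following Python sketch groups base calls by UMI and flags families in which the calls at some position disagree too often. The function name, the data layout (reads within a family already aligned and of equal length), and the 0.8 agreement threshold are assumptions made only for this illustration and are not taken from the present disclosure.

```python
from collections import Counter, defaultdict

def families_needing_more_reads(reads, min_agreement=0.8):
    """Return the UMIs whose reads disagree too often at some position.

    reads: iterable of (umi, sequence) pairs. Sequences within a family are
           assumed to be aligned and of equal length (a simplification).
    min_agreement: hypothetical threshold on the fraction of reads that must
           share the most common base call at every position.
    """
    by_umi = defaultdict(list)
    for umi, seq in reads:
        by_umi[umi].append(seq)

    flagged = []
    for umi, seqs in by_umi.items():
        for column in zip(*seqs):  # one column per base position
            top_count = Counter(column).most_common(1)[0][1]
            if top_count / len(column) < min_agreement:
                flagged.append(umi)  # low-confidence position found
                break
    return sorted(flagged)

# Hypothetical example data: family UMI-B disagrees at its second position.
reads = [
    ("UMI-A", "ACGT"), ("UMI-A", "ACGT"), ("UMI-A", "ACGT"),
    ("UMI-B", "ACGT"), ("UMI-B", "AGGT"), ("UMI-B", "ATGT"),
]
print(families_needing_more_reads(reads))
```

Families returned by such a check could then be given additional passes, while families with consistent calls could be released sooner.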
Embodiments may be applied to sequencing by expansion (SBX) using nanopores. Sequencing by expansion is described in WO 2020/236526 A1, “Translocation control elements, reporter codes, and further means for translocation control for use in nanopore sequencing,” filed May 14, 2020, and U.S. Pat. No. 7,939,259 B2, “High throughput nucleic acid sequencing by expansion,” filed Jun. 19, 2008, the entire contents of both of which are incorporated herein by reference for all purposes.
The sequencing by expansion protocol is based on the polymerization of highly modified, non-natural nucleotide analogs referred to as “XNTPs”. In general terms, SBX uses biochemical polymerization to transcribe the sequence of a DNA template onto a measurable polymer called an “Xpandomer.” The transcribed sequence is encoded along the Xpandomer backbone in reporters that are separated by ~10 nm and are designed for high signal-to-noise, well-differentiated responses. These differences provide significant performance enhancements in sequence read efficiency and accuracy of Xpandomers relative to natural DNA.
Section 102 shows primer-directed Xpandomer synthesis. XNTP 104 is illustrated in the “constrained configuration”, characteristic of the XNTP substrates and the daughter strand products of template-dependent polymerization. The constrained configuration of polymerized XNTPs is the precursor to the expanded configuration (XNTP 108), as found in Xpandomer products. Section 106 illustrates cleaving to expand the Xpandomer. The transition from the constrained configuration to the expanded configuration occurs upon scission of the P—N bond of the phosphoramidate within the primary backbone of the daughter strand.
During assembly, the monomeric XNTP substrates (XATP, XCTP, XGTP and XTTP) are polymerized on the extendable terminus of a nascent daughter strand by a process of template-directed polymerization using single-stranded template as a guide. Generally, this process is initiated from a primer and proceeds in the 5′ to 3′ direction. Generally, a DNA polymerase 110 or other polymerase is used to form the daughter strand, and conditions are selected so that a complementary copy of the template strand is obtained. After the daughter strand is synthesized, the coupled SSRTs form the constrained Xpandomer that further forms the daughter strand. SSRTs in the daughter strand have the “constrained configuration” of the XNTP substrates. The constrained configuration of the SSRT is the precursor to the expanded configuration, as found in the Xpandomer product.
In this example, once synthesis and expansion are complete, each monomeric XNTP unit 112 in the Xpandomer contains two reporter codes 116a and 116b with a Reporter Code “level” corresponding to the base type it encodes, and a Translocation Control Element (TCE) 120. The TCE controls the rate of Xpandomer translocation through a nanopore through a combination of sterics, electrorepulsion, and/or preferential interaction with the nanopore. The resistance of the TCE to the driving force of the ion current when positioned at the pore aperture and the consequent increase in applied voltage (i.e., the voltage pulse) necessary to overcome the arrest and resume translocation, can be customized by modulating various properties of the TCE, (and in some embodiments, the reporter codes and other elements of the SSRT) e.g., the bulk, length, and/or charge density.
Brancher 124 is the branched structure that terminates the TCE and links to reporter codes 116a and 116b. Enhancers 128a and 128b may aid in polymerase incorporation. Nucleotide 132 is attached to enhancer 128b and may include a cleavable linker 136. Cleavable linker 136 may be a photocleavable linker. Cleavable linker 136 may be cleaved to result in the expansion shown in section 106.
In certain embodiments, TCEs are polymers produced by solid-phase synthesis using the phosphoramidite method with suitable monomeric building blocks that terminate with a branched structure (i.e., the “brancher”). Branched phosphoramidites include both symmetrical and asymmetrical branchers. In one embodiment, the TCE brancher 210 is a symmetrical branching CED phosphoramidite, wherein each arm of the brancher is linked to a reporter code. Exemplary symmetrical chemical branchers include 1,2,3-O-tris-(phosphodiester)propane, 1,3-bis-(5-O-phosphodiester-pentylamido)-2-O-phosphodiester-propane, and 1,4,7-O-tris-(phosphodiester)-heptane.
A UV chromophore 212 may be attached to the end of TCE 208. UV chromophore 212 may allow for visualization or quantification. Spacers 216a and 216b are attached to reporter codes 204a and 204b, respectively. Spacers 216a and 216b may be polyethylene glycol (PEG) units, which may modulate the length traversed in a pore. Enhancers 220a and 220b may be attached to spacers 216a and 216b. Enhancers 220a and 220b may be positively charged spermine that facilitates polymerase incorporation. Nucleotide 224 may be attached to enhancers 220a and 220b. Nucleotide 224 may include a triphosphoramidate diester. The structural elements of XNTP 200 may be adjusted for improving measurement within a nanopore.
During the “bright period,” Xpandomer molecules capture and begin to translocate through the nanopore due to a combination of both baseline and TCE applied voltage pulses. Baseline voltages are sufficient to read the tag code at each XNTP position, and the short, higher voltage TCE pulses are designed to overcome the energetic barrier associated with a TCE. Ideally, each TCE pulse results in translocation past a single TCE barrier, thus moving the Xpandomer further into the pore in the forward direction by an amount of one “base” position.
During typical operation, applied voltage patterns are designed so that there are a fixed number of TCE pulses during each bright period, which cause the Xpandomer to translate in the “forward” direction by a number of bases, corresponding to the number of TCE pulses, or until the Xpandomer fully translocates, and is released into the fluidic “trans” chamber below the membrane.
During typical operation, an Xpandomer molecule may not fully translocate prior to the end of a single bright period. This may happen due to the molecule capturing late in the bright period and having an Xpandomer length with more base positions than there are TCE pulses remaining in the bright period. A molecule can get stuck while attempting to translocate in the forward direction for a variety of reasons. There may be a base position which has a defect (such as a failed cleavage event) which makes it impossible or very difficult for the molecule to translocate past that point. In such circumstances, and for other reasons, an Xpandomer may not be able to fully translocate during the bright period, regardless of the number of TCE pulses in a bright period. In such situations, it may be observed that a number of base positions in the beginning of the read are sequenced and generate the expected signal levels, until the defective position is reached. The last tag code level located just before the defect can then be observed for the remainder of the bright period. In order that pores do not remain permanently clogged, a large negative voltage may be applied for some fraction of time in the dark period in order to clear out any stuck molecules by driving them hard in the reverse direction.
Signal 520 shows molecule 3 during a bright period. The event shows that molecule 3 gets stuck in the pore and does not clear over several cycles (dark periods 524, 528, 532 and signals 536, 540, and 544 in bright periods). Eventually, the molecule clears in a dark cycle, as indicated by the change from signal 548a to signal 548b (when molecule 3 clears). This event may result from properties of the Xpandomer's leader segment, which create difficulty in the leader translocating in the reverse direction.
Xpandomer molecules can be designed with properties in the leader portion of the Xpandomer that cause the leader to behave differently in the forward and reverse directions. During the bright period (forward direction), the leader may have characteristics that allow the leader to be captured into the pore from the cis side with relatively high capture rates under reasonably applied voltages. Following capture, but still during the same bright period, the leader may protrude from the underside of the pore (trans side of membrane), as TCE pulses cause the molecule to process steadily through the pore.
During the dark period (reverse direction), if a molecule is still in the pore when a dark period begins, the molecule should begin to translocate in the reverse direction under a negative applied voltage. Once the Xpandomer molecule has almost fully reversed its position (i.e., it has almost fully backed out), the leader may remain on the trans side of the barrel. At this point, the desired property of the leader is that the leader has a high energetic barrier to entry into the barrel from the trans side, and thus is highly resistant to translocation through the barrel in the trans to cis direction (reverse direction).
Strategies for leveraging asymmetric leader behavior (with respect to direction through the pore) are described herein. These strategies may increase accuracy through reducing the number of Xpandomers that get stuck in a pore.
A. Full Forward and Full Reverse with Same Voltage
During the dark period (e.g., diagram 616), the applied voltage is reversed, and the Xpandomer translocates in the upward direction. The same voltage is applied in the dark period as the bright period, but the voltage has an opposite polarity. Leader 608 may have difficulty going through the membrane and/or the pore in the opposite direction.
The distribution of SM3T durations may be exponential. The distribution may show many shorter lengths, reflecting Xpandomers getting stuck in the nanopore. The mode of durations should equal 1.
B. Full Forward and Full Reverse with Increased Reverse Voltage
The shape of the distribution of SM3T durations may be changed from the distribution with
An Xpandomer 912 may enter the nanopore late in bright period 916. There may be more bases (e.g., reporter elements) remaining after the bright period is over. With the dark period, Xpandomer 912 moves completely out of the nanopore.
Xpandomer 924 may arrive in the pore early in bright period 928. The molecule may have more bases than there are pulses remaining in the bright period. Thus, Xpandomer 924 does not fully translocate through. Almost exactly as many bases are read as there are pulses in the bright period.
In step 1008, the Xpandomer is captured. A high voltage (e.g., TCE voltage) is applied for a longer period of time (e.g., 0.1 to tens of milliseconds instead of 8 μs). A captured molecule 1012 with a blocker moves to an end so that further translocation is stopped by the blocker. Also shown in step 1008 is a molecule 1016 with no blocker that is captured. Molecule 1016 may include fragmented molecules. The molecule with no blocker may quickly pass through the nanopore.
In step 1020, Xpandomer 1012 is moved in the reverse direction to back out Xpandomer 1012 from the nanopore. The TCE voltages are applied in the reverse direction. Sequencing information is obtained as the molecule is backing out. All captured molecules are expected to be positioned at the molecule's end (or at an uncleaved position). Molecules are pulsed in the reverse direction and data is acquired as the molecule is being backed out. The bright periods may be as long as the longest expected molecules in the sample. For example, for a ctDNA assay, the longest expected molecules may be 350 bp, corresponding to 350 pulses or 350 ms for 1 ms interpulse durations.
In step 1030, an optional recharge may be performed before the next capture step. A dark period may be applied. The dark voltage may help recharge the electrode if the cumulative voltages from steps 1 and 2 are not balanced. During this period, no data is acquired. If phased array mode is being run, then this dark period might be the duration of steps 1008 and 1020, allowing for the other half of the chip (phase) to complete those two steps.
The leader may be modified so that it does not translocate easily when moving through the pore starting on the vestibule (i.e., membrane) side. Xpandomers may need to be captured reasonably well when entering from the vestibule side. High voltage may help capture the Xpandomers. Membranes may be resilient to high voltage for longer times than with other strategies (e.g., with 8 μs pulse durations). A block may be added to the end of the Xpandomer.
The benefits of this strategy may include filtering out a fraction of fragmented molecules. Additionally, the read starts may be synchronized. Furthermore, pore insertion may be from the cis side. For example, a well may be filled with a nanopore solution. The well then can be covered with a membrane, and the nanopore may be inserted from the cis side.
Strategies may include reading the same XNTPs multiple times in a nanopore and applying voltages to clear stuck Xpandomers. These strategies may increase accuracy through reducing the number of Xpandomers that get stuck in a pore and/or through repeating reads of XNTPs through the nanopore.
A strategy for sequencing a molecule may not involve reading a molecule all the way from start to end each pass. A part of the molecule may move through the nanopore for a subread, rather than the whole molecule moving through the nanopore for a complete read. Additionally, the molecule may move forward and then backward to make many short overlapping passes in a processive way. Bright and dark period durations may be shortened compared to other strategies. The number of high voltage TCE pulses may be less than the number of TCEs in an Xpandomer. The number of TCE pulses in bright periods would be greater than the number of reverse TCE pulses in dark periods.
As an example, the Xpandomer length distribution may have a peak around 350 bp. The bright period may include a duration of 30 TCE pulses. The dark period may equal the bright period in total duration but with an applied voltage pattern that contains 25 TCE pulses in the reverse direction. Having equal durations of bright periods and dark periods allows balancing charge on the electrode, regenerating and/or resetting the electrode. The protocol may include additional periods that include “clearing voltages” to periodically remove the occasional stuck Xpandomer.
During the bright period, the Xpandomer moves in the downward direction as illustrated. The Xpandomer is sequenced during the bright period. Each pulse should correspond to a read. Capture of the Xpandomer may occur at any time during a bright period. During the dark period, the Xpandomer moves in the reverse (upward) direction. No sequencing is performed during the reverse direction. If the leader backs into the vestibule, the Xpandomer may get stuck. The meta-periods with clearing voltages may dislodge the Xpandomer. The meta-periods are not shown in
After a sufficient number of cycles, a molecule (e.g., Xpandomer 1124) exits the pore in the forward direction.
As an example, each half cycle may be 30 ms at 1 ms interpulse spacing. A full cycle may take 60 ms. The total time spent on a molecule may be 1200 ms. A total of 577 raw bases may be read in 1.2 s total time or 0.6 s bright time. With the exception of 20 bases at either end, the 116 bp molecule is sequenced at 6× depth of coverage.
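The depth arithmetic in this example can be checked with a short, idealized simulation. The Python sketch below assumes that every TCE pulse moves the molecule exactly one unit, that capture occurs at the start of a bright period, and that no clearing periods intervene; the function name and parameters are illustrative only. Under these assumptions the interior depth equals forward/(forward − backward), which is 30/5 = 6 for the values used above. Raw base totals from this idealization need not match the figures quoted above, which may reflect conditions not modeled here.

```python
def processive_coverage(length, forward=30, backward=25):
    """Simulate processive consensus on a molecule of `length` units.

    Each cycle reads up to `forward` units in the bright period, then backs
    up `backward` units in the dark period (no reads taken in reverse).
    Returns (per-position read depth, number of bright periods used).
    """
    if not 0 <= backward < forward:
        raise ValueError("need 0 <= backward < forward for net forward motion")
    depth = [0] * length
    start = 0      # index of the first unit read in the current bright period
    cycles = 0
    while start < length:
        end = min(start + forward, length)
        for i in range(start, end):
            depth[i] += 1
        cycles += 1
        if end == length:        # molecule exits in the forward direction
            break
        start = end - backward   # dark period backs the molecule up
    return depth, cycles

depth, cycles = processive_coverage(116, forward=30, backward=25)
print(max(depth))  # interior depth: 6 under these idealized assumptions
```

Positions near either end of the molecule are covered by fewer bright periods than interior positions, which is why the ends are excluded from the stated depth.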
A special bright period and dark period may be executed once, twice, or more each meta-period. The special bright and dark periods may have higher applied voltage and/or be a longer duration. In
Normal bright periods (e.g., bright periods 1312a and 1312b) may have an AC modified period much shorter than 1 or 2 seconds. The number of pulses in the bright period may be equal to the median of the input Xpandomer fragment length distribution. Normal dark periods (e.g., period 1316a and period 1316b) may have a high voltage initially in the period to quickly drive the molecule in reverse back to the leader position.
The protocol of processive consensus described with
By the end of the dark period prior to each meta-period (i.e., clearing period), a decision is made for every active cell in the array on whether the clearing period should be experienced by each cell or whether the cell should be temporarily deactivated during assertion of the global clearing period. A deactivation mask may be updated in preparation for the clearing period.
During the bright and dark clearing periods, the deactivation mask is applied, and high positive and negative voltages are applied to global chip lines. Many or most cells may be temporarily deactivated and thus electrically isolated such that they do not experience a clearing voltage in either the bright or dark clearing period, or both.
In one embodiment, ~1 ms may elapse to update the deactivation mask for all cells on the Nanopore Sequencer chip. The mask update time may be followed by the bright and/or dark clearing periods. The clearing period itself may last anywhere from one millisecond to tens of milliseconds. Following the clearing period, another ~1 ms duration of time may be allocated to re-activate cells with a second deactivation mask update.
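The per-cell decision and mask update described above might be sketched as follows. This Python example is illustrative only: the state labels ("stuck", "translocating", "empty") and the function names are hypothetical, and the disclosure does not specify how the mask is represented on the chip. The sketch simply marks every cell not determined to be stuck for deactivation, so that only stuck cells experience the global clearing voltage.

```python
def build_deactivation_mask(cell_states):
    """Return a 2D mask with True for each cell to deactivate before clearing.

    cell_states: 2D list of strings, one per cell in the array. The labels
    are hypothetical names used only in this sketch. Cells labeled "stuck"
    stay active; all other cells are deactivated.
    """
    return [[state != "stuck" for state in row] for row in cell_states]

def cells_to_clear(mask):
    """List the (row, column) positions that remain active during clearing."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, deactivated in enumerate(row)
        if not deactivated
    ]

# Hypothetical 2 x 3 array of cells.
states = [
    ["translocating", "stuck", "empty"],
    ["empty", "translocating", "stuck"],
]
mask = build_deactivation_mask(states)
print(cells_to_clear(mask))
```

After the clearing period, a second mask with every entry set to False would correspond to re-activating all cells.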
Both processive consensus applied to all cells and applied to specific cells may include individual bright periods that are much shorter than what is required to traverse the full length of target molecules. Additionally, the subreads produced in consecutive bright periods may overlap at the ends. This leads to a protocol where molecules of any length can be sequenced at their full length with a depth of greater than one. This method is an improvement over the strategy described with
In some embodiments, reducing the number of times the sequences are read may be desired. For example, molecules may be read at single-depth coverage or slightly greater (e.g., an average depth of between 1 and 2). To achieve this, the dark period can be set so that there are only a few TCE pulses in the reverse direction relative to the number of TCE pulses in the forward direction in the bright period.
Aiming for a depth of coverage slightly greater than 1 with the processive consensus voltage pattern results in a shorter bright period duration than sequencing an entire molecule in a single bright period.
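As a non-limiting illustration, the relationship between forward pulses, reverse pulses, and depth of coverage may be sketched with a simple simulation. The function `simulate_coverage` is hypothetical. The sketch assumes an idealized model in which each forward pulse reads exactly one unit and each reverse pulse backs the compound up by exactly one unit.

```python
def simulate_coverage(n_units, forward, reverse):
    """Count how many times each unit is read under alternating pulse trains.

    Idealized model (an assumption of this sketch): every forward pulse
    reads one unit, every reverse pulse moves the compound back one unit,
    and the compound never backs up past its first unit.
    """
    if forward <= reverse:
        raise ValueError("forward must exceed reverse for a net advance")
    reads = [0] * n_units
    pos = 0  # index of the next unit to be read
    while pos < n_units:
        end = min(pos + forward, n_units)   # bright period: read units pos..end-1
        for i in range(pos, end):
            reads[i] += 1
        if end == n_units:                  # compound has passed out of the pore
            break
        pos = max(end - reverse, 0)         # dark period: back up by `reverse` units
    return reads


# Few reverse pulses relative to forward pulses: depth slightly above 1.
low_depth = simulate_coverage(n_units=100, forward=30, reverse=3)
average_depth = sum(low_depth) / len(low_depth)
```

Under this model, interior units are read roughly `forward / (forward - reverse)` times, so a small number of reverse pulses keeps the average depth between 1 and 2 while each bright period remains much shorter than the full length of the molecule.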
Short AC modulation periods (i.e., bright periods) result in electrochemical and circuit related advantages. For example, because of the way sequencing is performed with a wet analog circuit, voltage across the membrane/pore decays during a bright period. Dark periods are used to “recharge” both an electrochemical battery in the volume of the well and an electrostatic working electrode capacitor at the bottom of the well. The rate of this bright period decay is governed in part by the size of the working electrode capacitor and also in part by the concentration and volume of electrochemically active redox species within the well. There are some practical advantages to lowering the concentration of electrochemically active redox species used during sequencing, but doing so results in a shorter bright period decay time constant. One way to deal with a shorter bright period decay time constant is simply to shorten the duration of the bright period itself. Processive consensus allows for shortening the bright period duration substantially while not permanently cutting off reads for Xpandomer molecules with lengths longer than the number of TCE pulses in the bright period.
At block 1610, a first number of voltage pulses at a first level may be applied across a nanopore to displace a compound a first distance in a first direction through the nanopore. The compound may be created from the target nucleic acid molecule. The compound may be a surrogate polymer or an Xpandomer, as described herein and in WO 2020/236526 A1 and U.S. Pat. No. 7,939,259 B2, the entire contents of both of which are incorporated herein by reference for all purposes. Prior to applying the first number of voltage pulses, the compound may be captured by the nanopore. The compound may include a plurality of units. A unit may be similar to the unit shown in
The first direction may be in the same direction that the compound moved through the nanopore when the compound was initially captured. For example, the compound may have a leader portion. The leader portion may be captured in the nanopore. The first direction may be in a way where the leader moves farther away from the nanopore.
Applying the first number of voltage pulses may pass a first subset of the plurality of units through the nanopore. The first number of voltage pulses may be the voltage pulses in the bright period. The number of voltage pulses may correspond to the number of units in the plurality of units passed through the nanopore. Each unit of the plurality of units may include a translocation control element. Applying the first number of voltage pulses may pass a first number of translocation control elements through the nanopore. The first number of voltage pulses may be equal to the first number of translocation control elements.
The first subset of the plurality of units may include 10 to 20, 20 to 30, 30 to 40, 40 to 50, 50 to 60, 60 to 70, 70 to 80, 80 to 90, 90 to 100, or more than 100 units. The first number of voltage pulses may include 10 to 20, 20 to 30, 30 to 40, 40 to 50, 50 to 60, 60 to 70, 70 to 80, 80 to 90, 90 to 100, or more than 100 pulses.
A baseline level of voltage across the nanopore may be applied to move the units through the nanopore for sequencing in between pulses of the first number of voltage pulses. This lower level of voltage may move the units through the nanopore but may not be enough to move the translocation control element through the nanopore. A voltage at a baseline level may be applied across the nanopore to displace the compound a distance in the first direction through the nanopore. The voltage at the baseline level may be the same polarity as the voltage pulses of the first number of voltage pulses. The baseline level may be less than the first level. The compound after being displaced the distance may have a translocation control element in the nanopore.
As such, in an example, at block 1610, the following actions may be taken: Apply a first number of voltage pulses at a first level across a nanopore to displace a compound a first distance in a first direction through the nanopore, where the compound comprises a plurality of units, each unit of the plurality of units comprises one type of reporter element of a plurality of types of reporter elements, and applying the first number of voltage pulses passes a first subset of the plurality of units through the nanopore.
At block 1620, the types of reporter elements in the first subset may be detected. Signal values may be measured for a nanopore having a voltage applied across the nanopore when reporter elements in the first subset of the plurality of units are in the nanopore. The types of reporter elements in the first subset are determined using the signal values. As a result of the correspondence of the types of reporter elements and the nucleotides, the identities of the nucleotides in the target nucleic acid molecules are also determined.
As such, in an example, at block 1620, the following actions may be taken: Detect, in the nanopore, the types of reporter elements in the first subset.
At block 1630, a second number of voltage pulses at a second level may be applied across the nanopore to displace the compound a second distance in a second direction through the nanopore. The second number of voltage pulses may be voltage pulses in a dark period. The first direction is opposite the second direction. The voltage pulses of the first number of voltage pulses have a polarity opposite to that of the voltage pulses of the second number of voltage pulses. The second distance may be less than the first distance. The second number may be less than the first number.
The first number of voltage pulses may exceed the second number of voltage pulses by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to 15, 15 to 20, 20 to 25, 25 to 30, 30 to 40, 40 to 50, or greater than 50. In some embodiments, the first number of voltage pulses may equal the second number of voltage pulses. Subsequent cycles of pulses may then be unequal so that the pulses end up advancing the compound through the nanopore.
As such, in an example, at block 1630, the following actions may be taken: Apply a second number of voltage pulses at a second level across the nanopore to displace the compound a second distance in a second direction through the nanopore.
At block 1640, a third number of voltage pulses at a third level may be applied across the nanopore to displace the compound a third distance in the first direction through the nanopore. The third number of voltage pulses may be voltage pulses in another bright period. Applying the third number of voltage pulses passes a second subset of the plurality of units through the nanopore. The second subset and the first subset may include some of the same units. The second subset may include units not in the first subset. The third distance may be greater than the second distance. The third level may equal the first level. The third number of voltage pulses may be the same as or different from the first number of voltage pulses.
As such, in an example, at block 1640, the following actions may be taken: Apply a third number of voltage pulses at a third level across the nanopore to displace the compound a third distance in the first direction through the nanopore, where applying the third number of voltage pulses passes a second subset of the plurality of units through the nanopore.
At block 1650, the types of reporter elements in the second subset may be detected. A sequence of the target nucleic acid molecule may be determined. The sequence may be determined from the order of the types of reporter elements detected. For one or more units in both the first subset and the second subset, the same type of reporter element may be detected. For example, the same reporter elements may be detected as described in techniques involving processive consensus. In some embodiments, reporter elements in certain units may be detected 2, 3, 4, 5, 6, 7, 8, 9, 10, or more times. The type of reporter element at a unit may be determined to be the type that is most frequently detected.
As such, in an example, at block 1650, the following actions may be taken: Detect, in the nanopore, the types of reporter elements in the second subset.
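As a non-limiting illustration, determining the type of reporter element at a unit as the most frequently detected type may be sketched as a majority vote. The function `consensus_calls` and the single-letter type labels are hypothetical. The sketch assumes that detections from overlapping subsets have already been assigned to unit positions, and it breaks ties in favor of the type detected first, which is a choice of this sketch.

```python
from collections import Counter


def consensus_calls(detections):
    """Pick the most frequently detected reporter type for each unit.

    `detections` maps a unit index to the list of reporter types detected
    for that unit across overlapping bright periods. Ties go to the type
    that was detected first (an assumption of this sketch).
    """
    return {
        unit: Counter(calls).most_common(1)[0][0]
        for unit, calls in detections.items()
        if calls  # skip units that were never detected
    }


detections = {
    0: ["A", "A", "C"],       # read three times, one discordant call
    1: ["G"],                 # read once
    2: ["T", "C", "T", "T"],  # read four times
}
calls = consensus_calls(detections)
```

Here the discordant single detections at units 0 and 2 are outvoted, and the unit that was read only once keeps its single call.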
In embodiments, process 1600 may further include passing the compound entirely out of the nanopore. The compound may pass out of the nanopore during a number of voltage pulses with the same polarity as the voltage pulses of the first number of voltage pulses. For instance, the compound may pass out of the nanopore during a bright period. The compound may pass entirely out of the nanopore during the applying of the third number of voltage pulses.
Additional numbers of voltage pulses of alternating polarities may be applied. For example, a fourth number of voltage pulses may displace the compound in the second direction. A fifth number of voltage pulses may then displace the compound in the first direction. These cycles of voltage pulses may continue until the compound passes out of the nanopore.
In some embodiments, a clearing voltage at a fourth level may be applied across the nanopore to pass the compound entirely out of the nanopore. The fourth level may be greater than the first level, the second level, and the third level. In some embodiments, the clearing voltage may be applied for a longer duration than the duration of the first number, second number, and third number of pulses.
In some embodiments, several nanopores (e.g., in an array) may be used to sequence several nucleic acid molecules. The compound may be a first compound of a plurality of compounds. The plurality of compounds may be created from a plurality of target nucleic acid molecules. The nanopore may be a first nanopore of a plurality of nanopores. Each compound of the plurality of compounds may be in one nanopore of the plurality of nanopores. The first number of voltage pulses at the first level, the second number of voltage pulses at the second level, and the third number of voltage pulses at the third level may be applied to the plurality of nanopores. The pulses may be applied in the same order as for the first compound. Reporter elements in each compound of the plurality of compounds may be detected. A plurality of sequences of the plurality of nucleic acid molecules may be determined. The size distribution of the plurality of sequences may have a mode greater than 300 nt. The mode may be 200 to 300 nt, 300 to 400 nt, 400 to 500 nt, or greater than 500 nt.
In some embodiments, clearing voltages may be applied to certain nanopores and not other nanopores, similar to the process described in section . The first number of voltage pulses at the first level, the second number of voltage pulses at the second level, and the third number of voltage pulses at the third level may be applied to the plurality of nanopores. A first portion of the plurality of compounds may be determined to be displaced in a first portion of the plurality of nanopores by the first number of voltage pulses, the second number of voltage pulses, or the third number of voltage pulses. The first portion of the plurality of compounds may not be stuck in the nanopore. A clearing voltage at a fourth level may be applied across each nanopore of a second portion of the plurality of nanopores. The applied clearing voltage may pass a second portion of the plurality of compounds entirely out of the respective nanopore of the plurality of nanopores. The fourth level may be greater than the first level, the second level, and the third level. In some embodiments, the clearing voltage may be applied for a longer duration than the duration of any of the voltage pulses. The second portion of the plurality of nanopores may not include nanopores in the first portion of the plurality of nanopores.
In some embodiments, the first portion of the plurality of compounds may be determined to not be displaced in a first portion of the plurality of nanopores by the first number of voltage pulses, the second number of voltage pulses, or the third number of voltage pulses. The first portion of the plurality of compounds may be stuck in the nanopore. A clearing voltage at a fourth level may be applied across each nanopore of the first portion of the plurality of nanopores.
Compounds may be determined to be stuck when there is no characteristic change in the measured electrical signal after a translocation voltage pulse is applied. When a compound moves from one reporter element to the next, there is a characteristic change in the measured electrical signal. When a compound is stuck at a particular reporter element, the change in the electrical signal expected after applying the translocation voltage pulse does not occur. A sufficiently high sampling rate may be used to distinguish between stuck compounds and compounds that are moving but have consecutive reporter elements of the same type.
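As a non-limiting illustration, one way to flag a stuck compound from sampled signal values may be sketched as follows. The function names, the `min_excursion` threshold, and the `max_missed` count are hypothetical. The sketch assumes that, at a sufficiently high sampling rate, a moving compound produces a transient excursion from the pre-pulse level after each translocation pulse, even when consecutive reporter elements are of the same type.

```python
def moved_during_pulse(samples, baseline, min_excursion):
    """True if any sample after a pulse departs from the pre-pulse level.

    `samples` are signal values taken at a high sampling rate right after
    one translocation pulse; `baseline` is the level before that pulse.
    """
    return any(abs(s - baseline) >= min_excursion for s in samples)


def is_stuck(pulse_windows, baselines, min_excursion, max_missed):
    """Flag a compound as stuck after `max_missed` consecutive pulses
    that produce no characteristic change in the measured signal."""
    missed = 0
    for samples, baseline in zip(pulse_windows, baselines):
        if moved_during_pulse(samples, baseline, min_excursion):
            missed = 0          # movement seen, reset the counter
        else:
            missed += 1
            if missed >= max_missed:
                return True
    return False


# Hypothetical signal values: one pulse with a clear transient, then two without.
windows = [[1.0, 1.6, 1.0], [1.0, 1.02, 1.0], [1.0, 1.01, 1.0]]
baselines = [1.0, 1.0, 1.0]
stuck = is_stuck(windows, baselines, min_excursion=0.3, max_missed=2)
```

Requiring several consecutive missed changes before flagging a compound is a design choice of this sketch, intended to reduce false positives from a single noisy pulse.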
Process 1600 may include additional implementations, such as any single implementation or any combination of implementations described herein and/or in connection with one or more other processes described elsewhere herein.
Although
Logic system 1703 may be, or may include, a computer system, ASIC, microprocessor, etc. It may also include or be coupled with a display (e.g., monitor, LED display, etc.) and a user input device (e.g., mouse, keyboard, buttons, etc.). Logic system 1703 and the other components may be part of a stand-alone or network connected computer system, or they may be directly attached to or incorporated in a device (e.g., a sequencing device) that includes detector 1702 and/or sample holder 1701. Logic system 1703 may also include software that executes in a processor 1720. Logic system 1703 may include a computer readable medium storing instructions for controlling system 1700 to perform any of the methods described herein. For example, logic system 1703 can provide commands to a system that includes sample holder 1701 such that sequencing or other physical operations are performed. Such physical operations can be performed in a particular order, e.g., with reagents being added and removed in a particular order. Such physical operations may be performed by a robotics system, e.g., including a robotic arm, as may be used to obtain a sample and perform an assay.
Any of the computer systems mentioned herein may utilize any suitable number of subsystems. Examples of such subsystems are shown in
The subsystems shown in
A computer system can include a plurality of the same components or subsystems, e.g., connected together by external interface 81, by an internal interface, or via removable storage devices that can be connected and removed from one component to another component. In some embodiments, computer systems, subsystems, or apparatuses can communicate over a network. In such instances, one computer can be considered a client and another computer a server, where each can be part of a same computer system. A client and a server can each include multiple systems, subsystems, or components.
Aspects of embodiments can be implemented in the form of control logic using hardware circuitry (e.g., an application specific integrated circuit or field programmable gate array) and/or using computer software with a generally programmable processor in a modular or integrated manner. As used herein, a processor can include a single-core processor, multi-core processor on a same integrated chip, or multiple processing units on a single circuit board or networked, as well as dedicated hardware. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will know and appreciate other ways and/or methods to implement embodiments of the present invention using hardware and a combination of hardware and software.
Any of the software components or functions described in this application may be implemented as software code to be executed by a processor using any suitable computer language such as, for example, Java, C, C++, C#, Objective-C, Swift, or scripting language such as Perl or Python using, for example, conventional or object-oriented techniques. The software code may be stored as a series of instructions or commands on a computer readable medium for storage and/or transmission. A suitable non-transitory computer readable medium can include random access memory (RAM), a read only memory (ROM), a magnetic medium such as a hard-drive or a floppy disk, or an optical medium such as a compact disk (CD) or DVD (digital versatile disk) or Blu-ray disk, flash memory, and the like. The computer readable medium may be any combination of such storage or transmission devices.
Such programs may also be encoded and transmitted using carrier signals adapted for transmission via wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet. As such, a computer readable medium may be created using a data signal encoded with such programs. Computer readable media encoded with the program code may be packaged with a compatible device or provided separately from other devices (e.g., via Internet download). Any such computer readable medium may reside on or within a single computer product (e.g., a hard drive, a CD, or an entire computer system), and may be present on or within different computer products within a system or network. A computer system may include a monitor, printer, or other suitable display for providing any of the results mentioned herein to a user.
Any of the methods described herein may be totally or partially performed with a computer system including one or more processors, which can be configured to perform the steps. Thus, embodiments can be directed to computer systems configured to perform the steps of any of the methods described herein, potentially with different components performing a respective step or a respective group of steps. Although presented as numbered steps, steps of methods herein can be performed at a same time or at different times or in a different order. Additionally, portions of these steps may be used with portions of other steps from other methods. Also, all or portions of a step may be optional. Additionally, any of the steps of any of the methods can be performed with modules, units, circuits, or other means of a system for performing these steps.
The specific details of particular embodiments may be combined in any suitable manner without departing from the spirit and scope of embodiments of the invention. However, other embodiments of the invention may be directed to specific embodiments relating to each individual aspect, or specific combinations of these individual aspects.
The above description of example embodiments of the present disclosure has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise form described, and many modifications and variations are possible in light of the teaching above.
A recitation of “a”, “an”, or “the” is intended to mean “one or more” unless specifically indicated to the contrary. The use of “or” is intended to mean an “inclusive or,” and not an “exclusive or” unless specifically indicated to the contrary. Reference to a “first” component does not necessarily require that a second component be provided. Moreover, reference to a “first” or a “second” component does not limit the referenced component to a particular location unless expressly stated. The term “based on” is intended to mean “based at least in part on.”
All patents, patent applications, publications, and descriptions mentioned herein are incorporated by reference in their entirety for all purposes. None is admitted to be prior art.
This application is a continuation of International Patent Application No. PCT/EP2023/077003, filed Sep. 29, 2023, which claims the benefit of U.S. Provisional Patent Application No. 63/412,774, filed Oct. 3, 2022, which are hereby incorporated by reference in their entirety.
Number | Date | Country
---|---|---
63412774 | Oct 2022 | US

Relation | Number | Date | Country
---|---|---|---
Parent | PCT/EP2023/077003 | Sep 2023 | WO
Child | 19097390 | | US