The invention relates to the field of polymer separation. More particularly, the invention relates to the separation of polymer molecules of different sizes.
Techniques for separation of polymer molecules on the basis of their size are well known in the art. For example, polynucleotides or polypeptides may be separated via gel-based electrophoresis techniques, which involve gel matrices comprising, for example, agarose or polyacrylamide. In the case of DNA sequencing, polynucleotides may be separated with a resolution as fine as a single polymer unit (nucleotide).
End Labelled Free Solution Electrophoresis (ELFSE) provides a means of separating DNA with free solution electrophoresis, eliminating the need for gels and polymer solutions. In free solution electrophoresis, DNA is normally free-draining and all fragments elute at the same time. In contrast, ELFSE often uses uncharged label molecules attached to each DNA fragment in order to render the electrophoretic mobility of the DNA fragments size-dependent. Methods for ELFSE are disclosed, for example, in U.S. Pat. Nos. 5,470,705, 5,514,543, 5,580,732, 5,624,800, 5,703,222, 5,777,096, 5,807,682, and 5,989,871, all of which are incorporated herein by reference. Many types and variations of end labels are known in the art, as described in the aforementioned patents, as well as U.S. patent publication US2006/0177840 published May 1, 2006, which is also incorporated herein by reference.
The nature of the end labels (also known as ‘drag-tags’) can vary significantly. Typically, an end label refers to any chemical moiety that may be attached to or near to an end of a polymeric compound to increase the drag of the complex during free solution electrophoresis, wherein the drag is caused by hydrodynamic friction. It is desirable to use end labels that induce a significant amount of hydrodynamic friction, since this may improve the ELFSE process. For example, end labels with significant hydrodynamic friction may permit greater separation of a larger range of polymer molecule sizes. When applied to DNA sequencing methods, this may translate into greater nucleotide resolution and/or increased read lengths.
In specific examples, a drag tag may comprise a peptide or a polypeptoid comprising up to or more than 100, preferably up to 200, more preferably up to or more than 300 polymer units. If required, the drag-tags or end labels may be uncharged such that they merely act to cause drag upon the charged polymeric compound during motion through a liquid substance.
There is a general desire in the art to produce end labels that are simple to manufacture, simple to attach to a polymer molecule, and which cause a significant degree of hydrodynamic drag in solution (when the end labeled polymer molecule is subjected for example to electrophoresis, optionally with an electroosmotic flow). However, the mechanisms that give rise to relative increases in hydrodynamic drag are poorly understood. It follows that there remains a need to develop further improved end labels and corresponding methods for polymer separation by optimization of the properties of the end labels. For example, there remains a need to develop methods for DNA sequencing via ELFSE with increased nucleotide sequence resolution and sequence read length. There is also a need to develop improved design rules to help optimize hydrodynamic drag properties of end labels.
It is an object of the invention, at least in preferred embodiments, to provide a method for separating polymer molecules on the basis of their size.
It is another object of the invention, at least in preferred embodiments, to provide a method for sequencing DNA.
In one aspect the invention provides an end label suitable for attachment at or near to an end of a polymer molecule, so as to increase the hydrodynamic drag of the polymer molecule during motion through a liquid substance such as during electrophoresis, with or without the presence of an electroosmotic flow, the end label comprising:
(1) a backbone such as a substantially linear backbone;
(2) at least one branch arm extending from the backbone at branch point(s) therein, the branch arm(s) selected from at least one of the group consisting of:
Preferably, the backbone comprises from 20-10000 monomer units.
Preferably, the branch arms of (2a) comprise from 2-1000 branch arms, each comprising from 5-10000 monomer units. Preferably, the branch arms of (2b) comprise from 2-1000 branch arms, each comprising from 5-10000 monomer units. Each backbone and/or each branch arm may be charged or uncharged.
Preferably, the substantially linear backbone and each branch arm each comprise monomer units. More preferably, the end label comprises from 30-500 monomer units. Preferably, the end label is a polypeptide and/or polypeptoid, and the monomer units comprise natural and/or non-natural amino acids.
Preferably, the at least one branch arm each extending from a corresponding branch point at or near at least one end of the linear backbone, comprises two branch arms each extending from an opposite end of the substantially linear backbone.
Preferably, the plurality of branch arms extending from the linear backbone at branch points therein are substantially equally spaced along the linear backbone, each branch arm having a length substantially equal to every other branch arm, and substantially equal to a length of said linear backbone between consecutive branch points.
Preferably, the at least one branch arm each extending from a corresponding branch point at or near at least one end of the linear backbone, each including iterative branching comprising at least two further branch arm extensions to each branch arm, each extension extending at or near an end of each previous extension closer to the substantially linear backbone.
In another aspect of the invention there is provided a method for constructing an end label for attachment to a polymer molecule to increase the hydrodynamic drag of the molecule through a liquid such as during electrophoresis or electroosmotic flow, the method comprising the steps of:
(1) synthesizing a substantially linear backbone comprising a plurality of monomer units; and
(2) synthesizing at least one branch arm extending from the backbone at branch point(s) therein, the branch arm(s) selected from at least one of the group consisting of:
Preferably, the monomer units are natural and/or unnatural amino acids, said end label comprising a polypeptide and/or a polypeptoid.
In another aspect the invention provides a plurality of covalently modified polymer molecules having more than one length, suitable for separation via ELFSE, each comprising a substantially linear sequence of monomer units, and having covalently attached to at least one end thereof an end label of the invention. Preferably, each polymer molecule in the plurality of covalently modified polymer molecules comprises ssDNA, derived from at least one DNA sequencing reaction.
In another aspect of the invention there is provided a method for sequencing a section of a DNA molecule, the method comprising the steps of:
(a) synthesizing a first plurality of ssDNA molecules each comprising a sequence identical to at least a portion at or near the 5′ end of said section of DNA, said ssDNA molecules having substantially identical 5′ ends but having variable lengths, the length of each ssDNA molecule corresponding to a specific adenine base in said section of DNA;
(b) synthesizing a second plurality of ssDNA molecules each comprising a sequence identical to at least a portion at or near the 5′ end of said section of DNA, said ssDNA molecules having substantially identical 5′ ends but having variable lengths, the length of each ssDNA molecule corresponding to a specific cytosine base in said section of DNA;
(c) synthesizing a third plurality of ssDNA molecules each comprising a sequence identical to at least a portion at or near the 5′ end of said section of DNA, said ssDNA molecules having substantially identical 5′ ends but having variable lengths, the length of each ssDNA molecule corresponding to a specific guanine base in said section of DNA;
(d) synthesizing a fourth plurality of ssDNA molecules each comprising a sequence identical to at least a portion at or near the 5′ end of said section of DNA, said ssDNA molecules having substantially identical 5′ ends but having variable lengths, the length of each ssDNA molecule corresponding to a specific thymine base in said section of DNA;
(e) attaching an end label of claim 1 at or near at least one end of said ssDNA molecules to generate end-labeled ssDNAs;
(f) subjecting each plurality of end-labeled ssDNA molecules to free-solution electrophoresis; and
(g) identifying the nucleotide sequence of the section of DNA in accordance with the relative electrophoretic mobilities of the end labeled ssDNAs in each plurality of ssDNAs;
wherein any of steps (a), (b), (c), and (d) may be performed in any order or simultaneously;
whereby each end label imparts increased hydrodynamic friction to at least one end of each end-labeled ssDNAs thereby to facilitate separation of the end-labeled ssDNAs according to their electrophoretic mobility.
Preferably, the section of DNA comprises less than 2000 nucleotides, more preferably less than 500 nucleotides, more preferably less than 100 nucleotides.
In another aspect the invention also provides a DNA sequencing kit comprising the end label of claim 1, together with at least one other component for a DNA sequencing reaction.
Polymeric compounds, such as polypeptides and polynucleotides, are routinely subject to modification. Chemical synthesis or enzymatic modification can enable the covalent attachment of artificial moieties to selected units of the polymeric compound. Desirable properties may be conferred by such modification, allowing the polymeric molecules to be manipulated more easily. In the case of DNA, enzymes are commercially available for modifying the 5′ or 3′ ends of a length of ssDNA, for example to phosphorylate or dephosphorylate the DNA. In another example, biotinylated DNA may be formed wherein the biotin moiety is located at or close to an end of the DNA, such that streptavidin may be bound to the biotin as required. Tags such as fluorescent moieties may also be attached to polynucleotides for the purposes of conducting DNA sequencing, for example using an ABI Prism™ sequencer or other equivalent sequencing apparatus that utilizes fluorimetric analysis.
In the framework of the classical blob theory of End-Labeled Free Solution Electrophoresis (ELFSE) of ssDNA and other polymer molecules, and based on recent experimental data with linear and branched polymeric labels (or drag-tags), the present invention provides design principles for the optimal type of branching that would give, for a given total number of monomers, the highest effective frictional drag, for example for ssDNA sequencing purposes. The hydrodynamic radii of the linear and branched labels are calculated using standard models such as the freely jointed chain model and the Kratky-Porod worm-like chain model.
To separate DNA fragments by free solution electrophoresis is an impossible task [1], unless the free-draining DNA polymer is modified at the molecular level, e.g. by conjugation with an uncharged “drag” molecule that can change its hydrodynamic friction without affecting its total charge [2]. The charge-to-friction ratio then becomes a function of the DNA chain length and free solution electrophoresis becomes possible. The conjugation method has been applied successfully to separate ssDNA fragments up to a maximum length of ≈120 bases, with single monomer resolution [3]. The method has been called End-Labeled Free Solution Electrophoresis (ELFSE) [4]. The key parameter of the ELFSE method is clearly the effective hydrodynamic friction provided by the drag-tag. To successfully apply ELFSE for ssDNA sequencing, we need to maximize the value of this fundamental property [5-7]. Given all the constraints that ELFSE is working under, this is not an easy task. For instance, simply increasing the length of a water-soluble and neutral linear polymeric drag-tag is not as easy as it seems because the drag-tags must also remain perfectly monodisperse (i.e., to within one monomer). An alternative, recently proposed by Haynes et al. (Bioconjugate Chem 2005, 16, 929-938), is to use branched polymers with a fixed architecture. The present invention extends this work significantly to provide ways to optimize such a branched structure for ELFSE.
Since the theory of ELFSE is rather well documented [8, 9], we will use the classical model (described in Example 1) to analyze the recent experimental data (presented in Examples 2 and 3) with linear and branched labels (Haynes et al.). In order to compute the effective hydrodynamic radius of the various branched labels, we will first use the freely jointed chain (FJC) model and the equations derived by Teraoka [10] for branched FJC polymers. As we shall see, the predicted friction coefficients will be too small to explain the experimental data (Examples 4-8). This indicates that a more detailed treatment based on a branched worm-like chain (WLC) model is necessary. We develop two such models in Section 5: in the first case, we take into account the finite persistence length of the polymer, but we disregard the branching points; in the second approach, we also consider the effect of the branching points.
If it is in fact possible to predict the hydrodynamic radius of branched drag-tags, what remains is a constrained optimization problem that can be phrased in the following way: What is the best strategy to distribute a set number of monomers onto primary (or even secondary) side chains such that we maximize the drag-tag's effective ELFSE friction coefficient? Corresponding aspects of the invention, as well as general branching strategies for drag-tags for use in ELFSE, will also be shown.
In preferred embodiments the invention encompasses an end label suitable for attachment at or near to an end of a polymer molecule, so as to increase the hydrodynamic drag of the polymer molecule during motion through a liquid substance such as during electrophoresis, with or without the presence of an electroosmotic flow, the end label comprising:
(1) a backbone such as a substantially linear backbone;
(2) at least one branch arm extending from the backbone at branch point(s) therein, the branch arm(s) selected from at least one of the group consisting of:
Preferably, the backbone comprises from 20-10000 monomer units.
Preferably, the branch arms of (2a) comprise from 2-1000 branch arms, each comprising from 5-10000 monomer units. Preferably, the branch arms of (2b) comprise from 2-1000 branch arms, each comprising from 5-10000 monomer units. Each backbone and/or each branch arm may be charged or uncharged. However, the number of monomer units in the backbone or any branch arm, or the number of branch arms, may vary even further in accordance with the end labels of the present invention, providing the desired attributes for the end label of superior levels of hydrodynamic drag are exhibited.
In more preferred embodiments each end label comprises from 30-500 monomer units. The monomer units may be derived from or form any polypeptide and/or polypeptoid, and the monomer units may comprise natural and/or non-natural amino acids.
Particularly preferred configurations, positions, and lengths for the branch arms, which provide particularly increased levels of hydrodynamic drag, will be apparent from the present discussion.
In further preferred embodiments the invention provides methods for constructing an end label for attachment to a polymer molecule to increase the hydrodynamic drag of the molecule through a liquid such as during electrophoresis or electroosmotic flow, the methods comprising the steps of:
(1) synthesizing a substantially linear backbone comprising a plurality of monomer units; and
(2) synthesizing at least one branch arm extending from the backbone at branch point(s) therein, the branch arm(s) selected from at least one of the group consisting of:
In still further embodiments the invention provides a plurality of covalently modified polymer molecules having more than one length, suitable for separation via ELFSE, each comprising a substantially linear sequence of monomer units, and having covalently attached to at least one end thereof an end label of the invention. Preferably, each polymer molecule in the plurality of covalently modified polymer molecules comprises ssDNA, derived from at least one DNA sequencing reaction.
Particularly preferred embodiments of the invention provide a method for sequencing a section of a DNA molecule, the method comprising the steps of:
conducting a sequencing reaction for a length of DNA using labelled chain terminator nucleotides to form ssDNAs;
attaching an end label of the invention at or near at least one end of said ssDNA molecules to generate end-labeled ssDNAs;
subjecting each plurality of end-labeled ssDNA molecules to free-solution electrophoresis; and
identifying the nucleotide sequence of the section of DNA in accordance with the relative electrophoretic mobilities of the end labeled ssDNAs in each plurality of ssDNAs;
whereby each end label imparts increased hydrodynamic friction to at least one end of each end-labeled ssDNAs thereby to facilitate separation of the end-labeled ssDNAs according to their electrophoretic mobility.
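The identification step can be illustrated with a minimal sketch. It assumes that each plurality of end-labeled ssDNAs is summarized as a list of fragment lengths per terminating base, and that the mobility has the form μ = μ0·Mc/(Mc + α) discussed in the examples that follow. The function names, the `ladders` layout, and the numerical values are hypothetical and serve only as an illustration.

```python
def mobility(m_c, alpha, mu0=1.0):
    """Mobility of a fragment of m_c bases carrying a drag-tag of friction alpha."""
    return mu0 * m_c / (m_c + alpha)


def read_sequence(ladders, alpha):
    """Order all fragments by mobility and read off the terminating bases.

    ladders maps each base ("A", "C", "G", "T") to the fragment lengths
    observed in the corresponding plurality of ssDNAs (hypothetical layout).
    """
    peaks = [
        (mobility(length, alpha), base)
        for base, lengths in ladders.items()
        for length in lengths
    ]
    # In this model the mobility grows with fragment length, so ascending
    # mobility corresponds to ascending position along the sequenced section.
    peaks.sort()
    return "".join(base for _, base in peaks)


# Hypothetical fragment ladders for a six-base section
example_ladders = {"A": [1, 4], "C": [2], "G": [3, 6], "T": [5]}
sequence = read_sequence(example_ladders, alpha=7.9)
```

Because the assumed mobility increases monotonically with fragment length, sorting by mobility is equivalent to sorting by length; a real analysis would start from measured elution times rather than known lengths.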
In preferred embodiments such methods may permit sequencing of up to or even more than a read length of 2000 nucleotides.
The following examples illustrate preferred embodiments of the invention, and are in no way intended to be limiting to the invention disclosed and claimed herein:
Meagher et al. [2] have recently reviewed the evolution of ELFSE over the last decade, including the theoretical concepts used to analyze experimental data and the technological progress still needed to develop a competitive ELFSE-based sequencing method. Although the exact conformation of the composite ssDNA/drag-tag molecule is in principle important for deriving accurate ELFSE theories, we shall assume in the following that there is no physical segregation of the ssDNA and the label. We shall also assume that the label is not deformed, which means that the hybrid molecule is globally a random coil of effective hydrodynamic blobs. Previous studies indicated that these two assumptions can indeed explain currently available data. For the sake of completeness, we now review the corresponding theoretical arguments.
The electrophoretic mobility μ of a block copolymer consisting of a linear chain of Mc charged monomers linked to a linear chain of Mn uncharged but otherwise identical monomers has been shown [11-13] to be given by the following relation:
$$\mu = \mu_0\,\frac{M_c}{M_c + M_n} \qquad (1)$$
where μ0 is the free electrophoretic mobility of the charged polymer, i.e., its mobility without the drag-tag. This equation neglects the correction due to the effects of the ends of the molecule [9].
Arguing that uncharged label monomers are not always equivalent to ssDNA monomers from a hydrodynamic point of view, and therefore that a non-uniform weighted average of the monomers' mobilities should be used, McCormick et al. [8] developed the blob theory of ELFSE, which gives:
$$\mu = \mu_0\,\frac{M_c}{M_c + \alpha_1 M_u} \qquad (2a)$$
where Mu is the number of uncharged label monomers and
$$\alpha_1 = \frac{b_u\, b_{Ku}}{b_c\, b_{Kc}} \qquad (2b)$$
is a dimensionless parameter, bu and bc are the monomer sizes of the uncharged and charged monomers, respectively, and bKu and bKc are the corresponding Kuhn lengths. The Kuhn statistical segment length is a measure of polymer stiffness, and can be calculated from the local structure of the chain. It can be defined by $b_K \equiv R^2/R_{max}$, where R is the chain's end-to-end distance and Rmax is its maximum value. Eq. (2b) was derived using this definition and assuming that both polymer chains are much longer than their Kuhn lengths. Note that for a perfectly flexible molecule (such as a FJC), one has $b_K = b$, $R_{max} = Mb$ and $R^2 = Mb^2$. The definition of the Kuhn length means that a stiff polymer can be treated as a FJC made of $N_K = Mb/b_K$ segments of length bK.
The parameter α1 in Eq. (2b) is a relative friction coefficient and has no dimensions. In fact, α≡α1Mu is the number of ssDNA monomers required to form a molecule with a hydrodynamic radius equal to the hydrodynamic radius of the Mu label monomers. Since ssDNA is generally stiffer than the polymers used as drag-tags, α1 is often much smaller than unity.
For an elution length L, the elution time of a labeled ssDNA fragment is given by:
$$t = \frac{L}{\mu E} = \frac{L}{\mu_0 E}\left(1 + \frac{\alpha_1 M_u}{M_c}\right) \qquad (3)$$
where E is the applied electric field. From Eq. (3), the total effective friction coefficient α=α1Mu specific to a drag-tag can be obtained simply from the slope of a plot of the reduced elution time t/(L/μ0E) vs. the inverse of the number of charged monomers, 1/Mc.
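The slope analysis can be sketched numerically. The example below assumes that the reduced elution time has the form 1 + α/Mc, with α = α1Mu, and recovers α by an ordinary least-squares fit against 1/Mc. The fragment lengths and the noise-free data are hypothetical and are generated only to exercise the fit.

```python
def reduced_elution_time(m_c, alpha):
    """Reduced elution time t / (L / (mu0 * E)), assumed equal to 1 + alpha / Mc."""
    return 1.0 + alpha / m_c


def fit_alpha(m_c_values, reduced_times):
    """Least-squares slope of the reduced elution time plotted against 1/Mc."""
    xs = [1.0 / m for m in m_c_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(reduced_times) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, reduced_times))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical, noise-free data generated with alpha = 7.9
lengths = [20, 30, 50, 80, 120]
times = [reduced_elution_time(m, 7.9) for m in lengths]
alpha_fit = fit_alpha(lengths, times)
```

With real data the intercept of the same fit should be close to 1, which provides a simple consistency check on the measured free-solution mobility.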
Haynes et al. first measured the electrophoretic migration times of unconjugated “free” DNA and of DNA conjugated to a linear drag-tag with Mu=30 monomers. Using the equations derived in the previous section it is easy to compute the value of α1 (or of the total effective drag coefficient α=α1Mu) from the two elution times thus measured. These authors repeated the experiments using Mc=20 as well as Mc=30 base ssDNA primers (Table 1). We note that both ssDNA molecules give the same result α=7.9 (equivalent to an effective drag coefficient of α1=0.26 per uncharged drag-tag monomer). Equation (2b) can then be used to estimate the Kuhn length of this polymeric drag-tag: with bc=0.43 nm and bKc=3 nm for ssDNA, and bu=0.43 nm (estimated from the chemical structure), we obtain bKu≈0.78 nm.
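The arithmetic behind these estimates can be checked with a short sketch. It assumes that the ratio of the conjugated to the free elution time equals 1 + α/Mc, and that α1 = (bu·bKu)/(bc·bKc), so that bKu = α1·bc·bKc/bu. The elution times used below are hypothetical; the other values are those quoted in the text.

```python
def alpha_from_times(t_conjugated, t_free, m_c):
    """Total drag coefficient from two elution times, assuming t_conj / t_free = 1 + alpha / Mc."""
    return m_c * (t_conjugated / t_free - 1.0)


def label_kuhn_length(alpha1, b_u, b_c, b_kc):
    """Kuhn length of the drag-tag, assuming alpha1 = (b_u * b_Ku) / (b_c * b_Kc)."""
    return alpha1 * b_c * b_kc / b_u


# Hypothetical elution times (arbitrary units) for a 20-base primer
alpha = alpha_from_times(t_conjugated=13.95, t_free=10.0, m_c=20)

# Per-monomer coefficient for the 30-monomer linear drag-tag, rounded as in the text
alpha1 = round(7.9 / 30, 2)

# Values quoted in the text: b_u = b_c = 0.43 nm, b_Kc = 3 nm
b_ku = label_kuhn_length(alpha1, b_u=0.43, b_c=0.43, b_kc=3.0)
```

Under these assumptions the sketch returns α1 = 0.26 and a drag-tag Kuhn length of about 0.78 nm.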
In order to increase the effect of the drag-tag on the resolving power of ELFSE, one must build drag-tags with very large effective friction coefficients. Haynes et al. examined the role that branching could play in this process. To that end, they added branches to their initial Mu=30 linear drag-tag. Using the equations of Example 2, they found that the apparent value of α increases roughly linearly with the molecular size of the branched label and that the two ssDNA primers give slightly different values of α (see Table 1). Both of these results are surprising, and in selected embodiments the invention examines the physics that is relevant in the case of branched drag-tags.
The terminology in Table 1 refers to a series of polypeptoid drag-tags based on a fixed thirty-residue “backbone” with branches forming stable amide bonds with the amino side-chains on the backbone.
As mentioned previously, Haynes et al. analyzed their data using the theory for linear labels (i.e., Eq. 2a). This theory applies in the case where the blob construction [8] is valid. However, it is not clear that this can be directly applied to the branched label.
In order to generalize the ELFSE theory to the case of branched labels, the inventors have determined how such hybrid molecules will be represented by blobs of identical hydrodynamic radii. There are two obvious ways to do this.
First, one can use the approach previously used for the bulky streptavidin label [2]: the whole label is seen as one uncharged blob (with a hydrodynamic radius RH), and the ssDNA molecule is subdivided into blobs with the same size RH (see
The second approach, shown in
In this expression, mu≅Nb+N1 is known from the chemical structure while mc=mc(rH(mu)) must be computed. However, our drag-tags are so small that any Gaussian approximation would necessarily fail for the small blob model. We will thus focus our attention on the big blob model in subsequent examples. Note that in the Gaussian limit, and without excluded volume effects, one should have $R_H^2 \cong \alpha\, r_H^2$, and the two models should give the same answer.
In Examples 5-7 we will examine the hydrodynamic radii of branched polymers in order to estimate the radius RH of the whole label treated as a large blob (as shown in
Kirkwood's approximation can be used to calculate the friction coefficient, or the Stokes hydrodynamic radius RH, of macromolecules:
$$\frac{1}{R_H} = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j \neq i}\left\langle \frac{1}{R_{ij}} \right\rangle \qquad (5)$$
where Rij is the distance between monomers i and j (the double sum is taken over all pairs of monomers). The simplest macromolecule is a linear chain of monomers with no correlation between the directions of the different bond vectors; this is generally called the Freely-Jointed Chain model (FJC). The mean-square distance between the i-th and the j-th segments of a FJC chain is given by:
$$\langle R_{ij}^2 \rangle = |i-j|\,b^2 \qquad (6)$$
Using the preaveraged Eq. (6), the definition Eq. (5), and taking the sum over the pairs of monomers, we obtain the hydrodynamic radius for a linear freely-jointed chain polymer:
$$R_H = \frac{3}{8}\sqrt{\frac{\pi}{6}}\; b\,N^{1/2} = \frac{3}{8}\sqrt{\frac{\pi\, b\, L}{6}} \qquad (7)$$
where L=Nb is the contour length of the linear FJC molecule. We note that $R_H \sim N^{1/2}$, a standard result for random-walk models.
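The passage from the Kirkwood double sum to the closed-form FJC result can be checked numerically. The sketch below assumes Gaussian statistics for each pair of monomers, so that the average of 1/Rij equals sqrt(6/(π|i−j|))/b, and compares the discrete sum with the continuum expression RH = (3/8)·sqrt(π/6)·b·N^(1/2). The function names are illustrative.

```python
import math


def rh_fjc_kirkwood(n, b):
    """Hydrodynamic radius from the discrete Kirkwood double sum for a linear FJC.

    Assumes Gaussian pair statistics: <1/R_ij> = sqrt(6 / (pi * |i - j|)) / b.
    There are 2 * (n - k) ordered pairs of monomers separated by k bonds.
    """
    pair_sum = sum((n - k) / math.sqrt(k) for k in range(1, n))
    inv_rh = 2.0 * math.sqrt(6.0 / math.pi) * pair_sum / (b * n * n)
    return 1.0 / inv_rh


def rh_fjc_continuum(n, b):
    """Continuum limit of the same sum: R_H = (3/8) * sqrt(pi/6) * b * sqrt(N)."""
    return 0.375 * math.sqrt(math.pi / 6.0) * b * math.sqrt(n)


# Ratio of the two estimates for a long chain with b = 0.43 nm
ratio = rh_fjc_kirkwood(10000, 0.43) / rh_fjc_continuum(10000, 0.43)
```

The two estimates converge as N grows; for short chains the discrete sum gives a somewhat larger radius, which is one reason why closed-form expressions should be applied with care to small drag-tags.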
In a paper on the calibration of retention volumes in size exclusion chromatography, Teraoka recently derived an analytical expression for the hydrodynamic radius of a FJC branched polymer without excluded volume interactions. We shall use Teraoka's approach as this represents the simplest possible way to understand the hydrodynamic properties of branched drag-tags. In our case, the total number of monomers is given by N=a(N1+Nb)+2Nb′−Nb, where a is the number of arms along the backbone, N1 is the number of bonds on each side chain, Nb is the number of bonds between the branched arms along the main backbone, and Nb′ is the number of bonds at each of the two ends of the molecule (see
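$$R_H^{-1} = C_A R_{H,A}^{-1} + C_B R_{H,B}^{-1} + C_C R_{H,C}^{-1} + C_D R_{H,D}^{-1} \qquad (8)$$
where
$$r = N_1/N_b, \qquad (8a)$$
$$C_A = a N_1^2/N^2, \qquad (8b)$$
$$C_B = a(a-1) N_1^2/N^2, \qquad (8c)$$
$$C_C = \left((a-1)N_b + 2N_b'\right)^2/N^2, \qquad (8d)$$
$$C_D = 2aN_1\left((a-1)N_b + 2N_b'\right)/N^2, \qquad (8e)$$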
and
The above expressions are simple functions of the various lengths, the number a of arms and the ratio r. We note that Eq. 8 reduces to Eq. 7 when N1→0.
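The bookkeeping behind Eqs. (8b)-(8e) can be verified with a short sketch. It assumes that the total number of bonds is N = a(N1+Nb) + 2Nb′ − Nb and checks that the four weights CA to CD sum to one. The parameter values (a = 5, Nb = 6, Nb′ = 3, with N1 = 4 or 8) are illustrative, and the function names are hypothetical.

```python
def total_bonds(a, n1, nb, nb_end):
    """Total number of bonds: a arms of n1 bonds, (a - 1) backbone spacers of nb bonds, two ends of nb_end bonds."""
    return a * (n1 + nb) + 2 * nb_end - nb


def teraoka_weights(a, n1, nb, nb_end):
    """Weights C_A, C_B, C_C, C_D of Eqs. (8b)-(8e)."""
    n = total_bonds(a, n1, nb, nb_end)
    backbone = (a - 1) * nb + 2 * nb_end  # bonds along the main backbone
    c_a = a * n1 ** 2 / n ** 2                 # pairs within the same arm
    c_b = a * (a - 1) * n1 ** 2 / n ** 2       # pairs on two different arms
    c_c = backbone ** 2 / n ** 2               # pairs within the backbone
    c_d = 2 * a * n1 * backbone / n ** 2       # arm-backbone pairs
    return c_a, c_b, c_c, c_d


n_small = total_bonds(5, 4, 6, 3)
n_large = total_bonds(5, 8, 6, 3)
weight_sum = sum(teraoka_weights(5, 4, 6, 3))
```

The weights add up to (a·N1 + backbone)²/N², which equals one because a·N1 plus the backbone bonds is the total N; setting N1 = 0 leaves only the backbone term, consistent with the linear-chain limit.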
In Section 3 we calculated the backbone monomer size from the bond lengths (see
For the tetramer label we have the parameters: N1=4, a=5, Nb=6, Nb′=3, and therefore, by substituting these values into Eqs. (8) we obtain RH(50)=0.70 nm. For the octamer-branched label the parameter N1=8 is different and we obtain RH(70)=0.80 nm instead. In order to make comparisons with the experiments, we need to convert the hydrodynamic radii into the corresponding effective drag-tag friction coefficients α. To this end, we need a model for ssDNA.
If we first assume that the ssDNA is also a flexible FJC polymer with a monomer size bc=0.43 nm, we can use Eq. 7 to obtain
$$\alpha = \frac{128}{3\pi}\left(\frac{R_H}{b_c}\right)^2 \qquad (9)$$
where we simply wrote N=α in Eq. 7. For the tetramer and octamer labels, we then obtain α(50)=36 and α(70)=47. These values are far too large, and they are also meaningless since one must take into account the stiffness of the ssDNA, which is a much stiffer polymer.
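The two numbers quoted above can be reproduced with a short calculation. The sketch assumes the linear FJC relation RH = (3/8)·sqrt(π/6)·b·N^(1/2), inverted for N = α with b = bc, which gives α = (128/(3π))·(RH/bc)². The function name is illustrative.

```python
import math


def alpha_fjc_ssdna(r_h, b_c=0.43):
    """Number of FJC ssDNA monomers (size b_c, in nm) whose coil has hydrodynamic radius r_h (nm).

    Assumes R_H = (3/8) * sqrt(pi/6) * b_c * sqrt(alpha), solved for alpha.
    """
    return (128.0 / (3.0 * math.pi)) * (r_h / b_c) ** 2


alpha_tetramer = alpha_fjc_ssdna(0.70)  # label with 50 bonds
alpha_octamer = alpha_fjc_ssdna(0.80)   # label with 70 bonds
```

Rounded to the nearest integer, these evaluate to 36 and 47, in line with the values stated in the text.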
For a sufficiently long linear stiff molecule (such as ssDNA), the radius of gyration of the coil is given by:
$$R_g^2 = \frac{L\, b_{Kc}}{6} \qquad (10)$$
where L=Mcbc is the contour length of the polymer, bc is the monomer length and bKc is the Kuhn length. Note that this expression neglects the effects of the excluded volume interactions. The relation between the radius of gyration Rg and the hydrodynamic radius RH is approximately [14]:
$$R_H \approx \tfrac{2}{3}\, R_g \qquad (11)$$
From Eqs. (10) and (11), we can thus write the hydrodynamic radius of the linear polymer:
$$R_H \approx \frac{2}{3}\sqrt{\frac{M_c\, b_c\, b_{Kc}}{6}} \qquad (12)$$
Using this expression, we can replace Eq. (9) by the more realistic relation
$$\alpha = \frac{27}{2}\,\frac{R_H^2}{b_c\, b_{Kc}} \qquad (13)$$
Using the values mentioned before for ssDNA, we obtain α(50)=5.1 and α(70)=6.7. These predicted values are now too small (by about a factor of 2.5) and show that a FJC model is not a sufficient model for the drag-tags. We thus need to take into account the stiffness of the drag-tags as well.
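These values follow from a one-line conversion, sketched below. It assumes Rg² = Mc·bc·bKc/6 and RH ≈ (2/3)·Rg for the stiff ssDNA coil, so that the number of ssDNA monomers matching a given radius is α = (27/2)·RH²/(bc·bKc). The function name is illustrative.

```python
def alpha_stiff_ssdna(r_h, b_c=0.43, b_kc=3.0):
    """Number of stiff ssDNA monomers whose coil has hydrodynamic radius r_h (all lengths in nm).

    Assumes R_g**2 = M_c * b_c * b_Kc / 6 and R_H = (2/3) * R_g.
    """
    return 13.5 * r_h ** 2 / (b_c * b_kc)


alpha_tetramer = alpha_stiff_ssdna(0.70)  # FJC label radius for the 50-bond label
alpha_octamer = alpha_stiff_ssdna(0.80)   # FJC label radius for the 70-bond label
```

Rounded to one decimal place, the two results are 5.1 and 6.7, as quoted above.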
Again, we will apply our generalization of Teraoka's theory to predict the α values of the drag-tags, but this time we will take into account the finite stiffness of the label using a simple, “0th order” approximation. As derived in Section 3.1, the molecular properties of the linear drag-tag are bu=0.43 nm and bKu=0.78 nm. For all practical purposes, a sufficiently long stiff polymer of contour length L can be considered as a FJC if we use bKu as the monomer size and NK=L/bKu as the number of monomers. Therefore, a simple way to take into account the finite flexibility of the drag-tag segments (backbone and arms) is to use Eq. (8) with the renormalized values
$$N_1 \to N_1\,\frac{b_u}{b_{Ku}}, \qquad N_b \to N_b\,\frac{b_u}{b_{Ku}}, \qquad N_b' \to N_b'\,\frac{b_u}{b_{Ku}}$$
while the number of arms a is kept constant and the monomer size is increased to bKu. The calculations are straightforward and we now obtain RH(50)=0.95 nm and RH(70)=1.10 nm for the tetramer and octamer labels, respectively. The corresponding α values are then calculated using Eq. (13), and we obtain α(50)=9.44 and α(70)=12.7. These predicted values are in much better agreement with the experiments than the results derived in Section 4.3, but we still under-predict the actual value of α by about 40%.
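The renormalization and the resulting conversion can be sketched as follows. The example assumes that each segment of N bonds is replaced by NK = N·bu/bKu Kuhn segments, and that the radii are converted with α = (27/2)·RH²/(bc·bKc), using bc = 0.43 nm and bKc = 3 nm. The radii are taken from the text rather than recomputed, and the function names are illustrative.

```python
def kuhn_segments(n_bonds, b_u=0.43, b_ku=0.78):
    """Number of Kuhn segments equivalent to n_bonds monomers: N_K = L / b_Ku with L = n_bonds * b_u."""
    return n_bonds * b_u / b_ku


def alpha_from_radius(r_h, b_c=0.43, b_kc=3.0):
    """Equivalent number of stiff ssDNA monomers, assuming alpha = 13.5 * R_H**2 / (b_c * b_Kc)."""
    return 13.5 * r_h ** 2 / (b_c * b_kc)


# Renormalized segment lengths for the tetramer label (N1 = 4, Nb = 6, Nb' = 3)
arm, spacer, end = (kuhn_segments(n) for n in (4, 6, 3))

# Radii quoted in the text for the stiffness-corrected labels
alpha_tetramer = alpha_from_radius(0.95)
alpha_octamer = alpha_from_radius(1.10)
```

Each arm of the tetramer label then corresponds to only about two Kuhn segments, which illustrates why an approximation that is strictly valid for long segments is being pushed to its limit here.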
Several reasons for this discrepancy can be proposed. For instance, our 0th order approach to the drag-tag stiffness, described in this subsection, is strictly valid only for very long polymer segments, which is not the case here since Nb and N1 are rather small (our approach actually underestimates the effect of stiffness). This critical weakness of the theory presented so far will be examined in Section 5. Other effects, neglected in this paper, will be discussed in Section 7.
In this and subsequent examples the inventors improve upon the approach presented in Example 8 to take into account the stiffness of the label in a more appropriate way. First and foremost we note that the hydrodynamic radius of a macromolecule, given by Eq. (5), can be easily calculated for any given branching architecture if the average inverse distance $\langle R_{ij}^{-1} \rangle$ between any two monomers i and j is known. In the absence of excluded volume interactions, any two monomers i and j are in fact the end monomers of a linear chain. Therefore, an improved theory necessarily starts with a better approximation for the average $\langle R_{ij}^{-1} \rangle$, which means a better approximation for the end-to-end distance of a linear worm-like polymer chain. We discuss this subject in Section 5.1. However, we do note that along the linear chain starting at the ith monomer and ending at the jth monomer, there might be side-chains and grafting points; such junctions may obviously have an impact on the usual chain statistics. This problem is discussed below.
We begin by reviewing the classical theory of the Kratky-Porod chain model for a linear worm-like polymer (the backbone), and then expand this theory to allow for the presence of side-chains attached to the linear chain.
The average squared end-to-end distance of a polymer chain made of N segments can be written as follows [14]:
$$\langle R^2 \rangle = \sum_{i=1}^{N}\sum_{j=1}^{N} \langle \vec{r}_i \cdot \vec{r}_j \rangle = b^2 \sum_{i=1}^{N}\sum_{j=1}^{N} \langle \cos\theta_{ij} \rangle \qquad (14)$$
where $\vec{r}_i$ and $\vec{r}_j$ are bond vectors, b is the bond length (which is assumed constant), $\theta_{ij}$ is the angle between bonds i and j, defined by $\cos\theta_{ij} = \vec{r}_i \cdot \vec{r}_j / b^2$, and the average is taken over all possible chain conformations.
To account for the limited flexibility of real polymers we can assume that each segment is allowed to rotate freely about the preceding one, while the bond angle θ between any two consecutive segments is maintained constant. This is the well-known Kratky-Porod model, or the worm-like chain (WLC) model. The average cosine of the angle between any two arbitrary segments i and j can thus be written as follows:
$$\langle \cos\theta_{ij} \rangle = (\cos\theta)^{|j-i|} = \exp\left(-\frac{|j-i|\, b}{l_p}\right) \qquad (15)$$
where lp is the persistence length of the chain (note that the Kuhn length bK of the chain is defined as being equal to 2lp). Using Eqs. (14) and (15), and changing the summation over bonds into an integral over the contour length of the chain, the mean square end-to-end distance R² can be rewritten as follows:
$$R^2 = 2\, l_p R_{max}\left[1 - \frac{l_p}{R_{max}}\left(1 - e^{-R_{max}/l_p}\right)\right] \qquad (16)$$
where Rmax=Nb is the maximum end-to-end distance of the actual polymer (or the chain's contour length). The two well-known limits of Eq. (16) are the ideal FJC limit $R^2 \approx b_K R_{max}$, which applies when $R_{max} \gg l_p$, and the rod-like limit $R^2 \approx R_{max}^2$, valid when $R_{max} \ll l_p$. We used the result of the first limit in Section 4.4; however, our chains are really in the intermediate regime where the approximation $R^2 \approx b_K R_{max}$ underestimates the mean chain end-to-end distance.
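The two limits of the worm-like chain expression can be checked with a minimal sketch. It assumes the standard Kratky-Porod form R² = 2·lp·Rmax·[1 − (lp/Rmax)(1 − exp(−Rmax/lp))]; the function name and the test lengths are illustrative.

```python
import math


def wlc_mean_square_distance(contour_length, l_p):
    """Mean-square end-to-end distance of a worm-like chain (Kratky-Porod form)."""
    x = contour_length / l_p
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very short chains
    return 2.0 * l_p * contour_length * (1.0 - (-math.expm1(-x)) / x)


l_p = 0.39  # nm, persistence length taken as half of b_Ku = 0.78 nm

# Long-chain (FJC) limit: R^2 approaches b_K * R_max = 2 * l_p * R_max
long_chain = 1000.0 * l_p
fjc_ratio = wlc_mean_square_distance(long_chain, l_p) / (2.0 * l_p * long_chain)

# Short-chain (rod) limit: R^2 approaches R_max ** 2
short_chain = 0.001 * l_p
rod_ratio = wlc_mean_square_distance(short_chain, l_p) / short_chain ** 2
```

Both ratios tend to one in their respective limits, while segments only a few persistence lengths long fall between the two regimes and require the full expression.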
In order to use this chain model to calculate the hydrodynamic radius of a branched polymer, we have to compute the inverse end-to-end distance between any two monomers i and j. From Eq. (16), we can write:
Together with a knowledge of the properties of the branching points, Eqs. (17) and (5) allow us to compute the hydrodynamic radius of branched drag-tags.
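As a rough illustration of how an inverse-distance average feeds into a hydrodynamic radius, the sketch below treats the simplest case of a linear chain. It rests on two assumptions that are ours and are not confirmed by this section: that Eq. (5) has the usual Kirkwood double-sum form, 1/RH = (1/N^2) Σ ⟨1/R_ij⟩ over pairs i ≠ j, and that ⟨1/R⟩ can be closed with the Gaussian estimate sqrt(6/(π⟨R²⟩)), which need not coincide with Eq. (17). All function names are hypothetical.

```python
import math


def wlc_mean_square_distance(contour_length, persistence_length):
    """Standard Kratky-Porod <R^2> for contour length L and persistence length lp."""
    L, lp = contour_length, persistence_length
    return 2.0 * lp * L + 2.0 * lp**2 * math.expm1(-L / lp)


def mean_inverse_distance(contour_length, persistence_length):
    """ASSUMPTION: Gaussian closure <1/R> ~ sqrt(6 / (pi * <R^2>))."""
    r2 = wlc_mean_square_distance(contour_length, persistence_length)
    return math.sqrt(6.0 / (math.pi * r2))


def hydrodynamic_radius_linear(n_monomers, bond_length, persistence_length):
    """ASSUMPTION: Kirkwood form 1/R_H = (1/N^2) * sum over i != j of <1/R_ij>.

    For a linear chain, <1/R_ij> depends only on the separation s = |i - j|,
    and there are (N - s) unordered pairs at each separation.
    """
    n = n_monomers
    if n < 2:
        raise ValueError("need at least two monomers")
    total = 0.0
    for s in range(1, n):
        # Factor 2 counts both (i, j) and (j, i).
        total += 2.0 * (n - s) * mean_inverse_distance(s * bond_length,
                                                       persistence_length)
    return n * n / total


# Illustrative call with b = 0.43 nm and lp = 0.39 nm from the text.
print(hydrodynamic_radius_linear(30, 0.43, 0.39))
```

For a branched label the same double sum would run over all monomer pairs, with the separation measured along the path that connects them; that bookkeeping is what Appendix A handles and is not attempted here.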
First, we disregard the branching points and consider that the sequence of monomers between monomers i and j always forms a continuous WLC satisfying Eq. (17). This is the simplest way to improve upon Section 4.4. The details of the calculations are presented in Appendix A.
Using these equations it is possible to obtain a new numerical estimate for the effective friction coefficient of the two labels studied by Haynes et al.; we obtain RH(50)=1.11 nm and RH(70)=1.24 nm for the tetramer and octamer labels, respectively. From Eq. (13), the corresponding α-values are α(50)=12.92 and α(70)=16.04. This is a major improvement upon the results obtained previously. This simple calculation demonstrates very clearly the importance of taking into account the stiffness of the drag-tag molecule. We note that the persistence length of the labels has been taken as lp=½bKu=0.39 nm (see Section 3.1).
To properly calculate the hydrodynamic radius of branched polymers we need to evaluate the distance between any two monomers. When there is no branching point between the two monomers, Eq. (16) can be used. We propose here to improve the Kratky-Porod equation (16) and derive an expression for the end-to-end distance between two monomers in the case where we have branching points between them.
We start with a description of the branching points. In
We assume that the linear polymer starting at the A monomer and ending at the B monomer is made of independent Kratky-Porod segments linked together at the branching points. For the case of a single given branching point,
The average $\langle \vec{r}_i \cdot \vec{r}_j \rangle$ in Eq. (14) must be calculated differently if we have branching points. If there is one branching point between the ith and jth bond vectors (
$$\langle \vec{r}_i \cdot \vec{r}_j \rangle = b^2 (\cos\theta)^{|j-i-1|} \cos\delta \quad (18)$$
where δ is the angle between the last bond vector of the 1st KP chain and the first bond vector of the 2nd KP chain. Similarly, if there are two branching points we use the expression:
$$\langle \vec{r}_i \cdot \vec{r}_j \rangle = b^2 (\cos\theta)^{|j-i-2|} \cos^2\delta \quad (19)$$
We derive now the mean square end-to-end distance of a linear chain with one or two branching points. For just one such connection, found at monomer n1 (see
The 2nd and 3rd terms in Eq. (20) are the statistical properties of the 1st KP and the 2nd KP chains, while the 4th and the 5th terms are the projections of the 1st KP and the 2nd KP chains onto the bond vector $\vec{r}_n$
for the two KP sub-chains, we obtain:
For a linear chain with two branching points we obtain a similar result:
where {right arrow over (r)}n
If we assume that δ=θ, i.e. there are no branching points along linear chains, or equivalently all the bond angles are the same, both Eqs. (20) and (23) reduce to Eq. (16). However, if we assume that δ≠θ we obtain chains with larger or smaller hydrodynamic radii. Together with Eq. (5), these equations allow us to compute the hydrodynamic radius of any type of branched drag-tag. The end result will now be a function of the angle δ.
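The reduction to the unbranched result when δ=θ can be verified by summing the bond correlations directly rather than through the closed-form expressions. The sketch below is a discrete illustration only: it sums Eq. (14) with the correlations of Eqs. (18) and (19), and it extends them to any number of branching points by the same pattern, which is our assumption. The function name, the junction positions, and the choice cos θ = exp(−b/lp) are likewise illustrative.

```python
import math


def mean_square_distance(n_bonds, junctions, cos_theta, cos_delta, b=1.0):
    """<R^2> of a path of n_bonds bonds made of KP sub-chains.

    junctions lists bond indices p (1 <= p < n_bonds) at which the angle
    between bond p-1 and bond p is delta instead of theta.
    """
    total = 0.0
    for i in range(n_bonds):
        for j in range(n_bonds):
            lo, hi = min(i, j), max(i, j)
            # Number of branching points crossed when going from bond lo to bond hi.
            m = sum(1 for p in junctions if lo < p <= hi)
            # m = 1 reproduces Eq. (18), m = 2 reproduces Eq. (19);
            # larger m is an assumed extension of the same pattern.
            total += b * b * cos_theta ** (hi - lo - m) * cos_delta ** m
    return total


# ASSUMPTION: bond angle tied to the persistence length by cos(theta) = exp(-b/lp).
cos_theta = math.exp(-0.43 / 0.39)

plain = mean_square_distance(20, [], cos_theta, cos_theta)
same_angle = mean_square_distance(20, [10], cos_theta, cos_theta)
straighter = mean_square_distance(20, [10], cos_theta, 0.9)
print(plain, same_angle, straighter)
```

With δ=θ the junction has no effect, and a junction that is straighter than the regular bond angle lengthens the chain, in line with the statement that δ≠θ yields larger or smaller hydrodynamic radii.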
We show in
The problem of optimizing the architecture of an ELFSE label cannot be approached solely by an experimental trial-and-error method because of the difficulties in the chemical synthesis of large macromolecules. Moreover, these drag-tags, either linear or branched, must have very specific properties (uncharged, hydrophilic, monodisperse, etc.). It is therefore essential to find design principles for the optimal type of branching which would provide the largest effective friction coefficient α for a given number of monomers.
We first examine the hydrodynamic radii of a linear polymer and of a branched polymer (with the architecture shown in
From
A somewhat similar situation is encountered if the stiffness of the polymer is taken into account.
Quantitatively, our results explain the somewhat surprising data in Table 1. The fact that the effective friction coefficient α increases almost linearly with the total molecular size of the branched labels is actually due to the fact that the arms are rather short. The situation would be quite different for long arms, as we shall see in the next section.
Although linear polymers are preferable, branched polymers offer practical advantages because of the possibility of synthesizing larger monodisperse molecules in a simple, stepwise way. Our results suggest two branching strategies. First, adding two very long arms near the ends of the backbone molecule can add a large amount of friction with very little loss when compared to having all the monomers forming a single linear chain (see bottom curve in
We now compare the hydrodynamic radii of the tetramer and octamer labels (as derived from the experimental data) with the hydrodynamic radii predicted for an optimal label of the same type of branching. Again, we keep the total number of monomers N fixed at either 50 (tetramer) or 70 (octamer). We use the persistence length lp=0.39 nm, determined in Example 2, and the WLC model presented in Example 11 (i.e., we assume that the branching angle δ=θ since we showed in Example 12 that its value has very little impact on the final result). With these numbers and the relevant equations, it is possible to compute the hydrodynamic radius of all possible combinations giving the same value of N when only Nb′ is kept fixed. The results are shown in
The curves show two interesting regimes, already mentioned in Example 13. First, the largest hydrodynamic radii are obtained on the left when we have only two short arms (a=2, N1=2). This is not surprising because we already know that the maximum value of RH is always found for the linear polymer (a=0). For the 50-mer label, the best set of parameters (N1=2, a=2 and Nb=40—
On the other hand, we see that some of the curves are going up for large values of N1 (i.e., for long arms). This corresponds to the second case mentioned before: a few long arms, preferably situated near the ends of the molecule, also provide a quasi-linear chain with a potentially large drag coefficient. In the N=70 case, for instance, a 3-branch polymer with N1=16 monomers per branch (a hexamer) and a distance of Nb=8 monomers between the arms would produce a slightly higher friction coefficient than the octamer used experimentally (α(70)=16.49 vs α(70)=16.04). Obviously, the curves would go up even further for larger values of N. Although the values obtained with this strategy are slightly lower than those obtained with the first strategy, this is a much better approach since it avoids the synthesis of extremely long and monodisperse backbones. Instead, one can use moderately long building blocks, such as hexamers in this example, together with a moderately long backbone (30 mers in total) and only a few branching points (2 or 3).
Table 1 shows an interesting effect when branched labels are used: the apparent value of α appears to increase slightly when a larger DNA primer is used to pull it through the electrophoretic system. This effect is of order 10% for the tetramer and 5% for the octamer. However, no such effect was reported for the linear labels of size N=30. We suspect that at least two different phenomena could explain this second-order effect, and it is not possible to distinguish between them with the current state of the theory and the very restricted amount of experimental data presently available.
First, it is known that there are corrections to the electrophoretic mobilities related to the end effects [8,9]; for linear labels it has been shown that this may slightly increase the apparent friction coefficient of a drag-tag as the size of the DNA increases. Unfortunately, there is no end-effect theory that would apply to branched labels, and therefore no further quantitative insight can be gained in this direction.
Second, the fact that the branched drag-tags are bulky may also induce some steric segregation between the DNA and the labels; the standard ELFSE theory used here assumes that the hybrid molecule can be treated as a coherent sequence of blobs forming a single random coil. The case of the segregated label has yet to be studied theoretically [11], but it is likely that hybrid molecules would segregate to different extents for different molecular sizes. Segregation is also directly related to excluded volume interactions and electrostatic interactions, effects that we did not consider in this study.
Since it is quite difficult to produce long, monodisperse linear polymer chains to be used as drag-tags for ELFSE, Haynes et al. recently proposed building branched drag-tags from various monodisperse building blocks (shorter linear chains that can be attached together). Since a branched object is necessarily more compact, one would instinctively conclude that this approach would lose in terms of performance although it may gain in ease of preparation. Surprisingly, the experimental results of Haynes et al. actually showed an almost linear increase in the value of the effective drag coefficient α with the molecular weight of the label.
In this application the inventors present three models for uniform comb-like branched polymers: the FJC model, the worm-like chain model, and finally a modified WLC model that takes into account the properties of the branching points. For all three models, the underlying theory used to calculate the friction properties is based on the work of Teraoka [10]. Comparing the predictions of these three models with the measured α values, we saw that the FJC model gave values about 50% lower than the experimental ones, while for the WLC model the agreement was within a few percent (the modified WLC provided little improvement). We also speculate that the small dependence of α upon the size of the DNA could be explained by two possible phenomena that we neglected in this work (namely, end effects [8] and steric segregation).
Based on our results herein and on polymer theory, the inventors deduce three different approaches to optimizing the architecture of polymer labels for ELFSE:
In calculating the hydrodynamic radius of a branched polymer we follow the formalism of Teraoka [10], except that now the pre-averages are calculated using Eq. (17). The hydrodynamic radius is written as follows:
$$R_H^{-1} = C_A u_A(N_1, l_p) + C_B u_B(N_1, a, l_p) + C_C u_C(N_b, N_b', a, l_p) + C_D u_D(N_1, N_b, N_b', a, l_p) \quad (A1)$$
where the coefficients CA, CB, CC, and CD were defined in Section 4.2, and the functions uA through uD are given by:
where b=0.43 nm is the monomer size of the label.
The present invention, at least in selected embodiments, provides design principles for branched polymers for use as polymeric end labels (or drag tags) in End-Labeled Free Solution Electrophoresis (ELFSE) of DNA. The optimal branching provides high potential frictional drag for a given number of monomers (or molecular weight). The invention also provides design principles for the design of cationic labels that have an increased effective frictional drag effect for ELFSE.
Deduced approaches towards optimizing the architecture and composition of polymer labels for ELFSE in accordance with the teachings of the present invention:
The hydrodynamic radii of the linear and branched labels (all neutral) were calculated using standard models such as the freely jointed chain (FJC) model and the Kratky-Porod worm-like chain (WLC) model. Based on comparisons of the theory with the experimental data, the inventors propose that the design of new branched labels should use either side chains whose length is comparable to or greater than the distance between the branching points, or longer branches (preferably two longer branches) located near the ends of the molecule's backbone. The theoretical calculations were based on three major models for branched polymers: the FJC, the WLC, and a modified WLC that takes into account the properties of the branching points. The first of these models is based on the work of Teraoka while the others are new theories put forward by the authors. Comparing the predictions of these three models with the experimental results, it was determined that the FJC model under-predicted the friction values by about 50%. The WLC model and the modified WLC model provided close agreement with the experimental results.
Hydrodynamic Radii for Branched Freely Jointed Chains
The method is based on a constrained optimization procedure for the hydrodynamic radii of branched labels. The total number of monomers N=a(N1+Nb)+2Nb′−Nb is kept constant and all the other parameters are varied: a, the number of arms along the backbone; N1, the number of bonds on each side chain; Nb, the number of bonds between the branching points along the main backbone; and Nb′, the number of bonds at each of the two ends of the molecule. The inventors selected those parameters that give high hydrodynamic friction.
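The search space of this constrained optimization can be listed explicitly. The sketch below only enumerates the architectures compatible with a fixed N; it does not rank them, since that requires the hydrodynamic radius expressions of Appendix A. The function names are hypothetical, the restriction to a ≥ 2 arms is a simplification of ours, and the end length Nb′=3 is not stated in the text: it is the value that the constraint implies for the parameter sets quoted earlier (N1=2, a=2, Nb=40 for N=50, and a=3, N1=16, Nb=8 for N=70).

```python
def total_monomers(a, n1, nb, nb_end):
    """Constraint from the text: N = a*(N1 + Nb) + 2*Nb' - Nb."""
    return a * (n1 + nb) + 2 * nb_end - nb


def enumerate_architectures(n_total, nb_end):
    """List all (a, N1, Nb) with a >= 2 arms that give exactly n_total monomers."""
    found = []
    for a in range(2, n_total + 1):
        for n1 in range(1, n_total + 1):
            # Rearranged constraint: (a - 1) * Nb = N - 2*Nb' - a*N1.
            remainder = n_total - 2 * nb_end - a * n1
            if remainder <= 0:
                break  # longer arms only make the remainder smaller
            if remainder % (a - 1) == 0:
                found.append((a, n1, remainder // (a - 1)))
    return found


# Nb' = 3 is inferred from the quoted parameter sets, not given in the text.
candidates_50 = enumerate_architectures(50, 3)
candidates_70 = enumerate_architectures(70, 3)
print(len(candidates_50), len(candidates_70))
```

Each candidate tuple could then be passed to a hydrodynamic radius routine, keeping the architectures that give the highest friction.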
Estimating α: WLC Polymers
The unsatisfying results obtained with the FJC indicated that the stiffness of the drag-tag molecule must be taken into account in a proper way. Simply rescaling the number of monomers by arranging them into equivalent Kuhn blobs is not sufficient.
Application of the Theory for Branched Labels
To properly calculate the hydrodynamic radius of branched polymers the distance between any two monomers has been carefully considered. Derivation of a Kratky-Porod-like equation led to a new expression for the end-to-end distance between two monomers in the case where branching points between them exist. Therefore, based on comparisons of the theory with the experimental data, the design of new branched labels should use either side chains whose length is comparable to or greater than the distance between the branching points or longer branches (preferably two longer branches) located near the ends of the molecule's backbone for optimized separation. In the latter case, we further suggest that the process can be used iteratively, i.e., a single branching point near the other end of each branch can be added, and a new branch attached at that position. This process can in principle be continued until the desired value of α is reached (for example see
While the invention has been described with reference to particular preferred embodiments thereof, it will be apparent to those skilled in the art upon a reading and understanding of the foregoing that numerous methods for polymer molecule modification and separation, as well as corresponding end labels for their separation, other than the specific embodiments illustrated are attainable, which nonetheless lie within the spirit and scope of the present invention. It is intended to include all such methods and apparatuses, and equivalents thereof within the scope of the appended claims.
This application claims the priority right of prior U.S. patent application No. 60/783,034 filed Mar. 17, 2006 by applicants herein.
Number | Date | Country
---|---|---
60783034 | Mar 2006 | US