This invention concerns a method for determining the size distribution of a mixture of particles by means of Taylor dispersion, and an associated system, the method including the following steps:
Below, ‘particle’ refers to any molecule in solution and/or particles in suspension in the mixture.
In this document, a species includes all particles characterized by the same size, e.g., the same hydrodynamic radius. A species is thus associated with a ‘particle size’ value.
In this field, the ‘deconvolution’ of a Taylor signal refers to the processing of the experimental Taylor signal leading to the determination of the hydrodynamic radius of each of the species forming the mixture and the determination of the concentration of each of these species.
The international application published under no. WO 2010 009907 A1 discloses a method of the aforementioned type, the analysis step of which implements various deconvolution algorithms for an experimental Taylor signal. However, these algorithms may only be used in the specific case of a binary mixture, i.e., a mixture of two species. Accordingly, these known algorithms do not allow for the analysis of any desired sample, but only of samples of which it is known in advance that they result from the mixture of two species.
In practice, it is currently considered impossible to solve the general problem of deconvoluting the experimental Taylor signal of a sample of any given mixture of species.
The invention thus seeks to alleviate this problem by proposing, in particular, a method for real-time analysis of an experimental Taylor signal of a sample of any given mixture.
To this end, the invention concerns a method for determining the size distribution of a mixture of molecule or particle species including the following steps:
characterized in that the step of analysing an experimental Taylor signal Ŝ(t) of a sample of the mixture consists of searching for an amplitude distribution P(G(c)) that allows the experimental Taylor signal Ŝ(t) to be broken down into a sum of Gaussian functions by means of the equation:
$$\hat{S}(t) \equiv \int_0^\infty P(G(c))\, G(c)^{c/2} \exp\!\left[-(t-t_0)^2\, G(c)^{c}\right] dG(c)$$
where
t is a variable upon which the experimental Taylor signal depends and t0 is a value of the variable t common to the various Gaussian functions and corresponding to the peak of the experimental Taylor signal Ŝ(t);
G(c) is a characteristic parameter of a Gaussian function of amplitude P(G(c)) and is associated:
for c=1, with the diffusion coefficient D of a species according to the relation $G(1) = 12D/(R_c^2 t_0)$,
for c=−1, with the hydrodynamic radius Rh of a species according to the relation
and
for c=−1/df=−(1+a)/3, with the molar mass M of a species according to the relation
where kB is the Boltzmann constant, T is the absolute temperature, expressed in kelvins, at which the experiment is conducted, η is the viscosity of the eluent used, Rc is the internal radius of the capillary used, Na is Avogadro's number, and K and a are the Mark-Houwink coefficients,
by implementing a constrained regularization algorithm consisting of minimising a cost function Hα including at least one constraint term associated with a constraint that must be observed by the amplitude distribution P(G(c)) that is the solution of the foregoing equation, whereby the minimization is carried out on an interval of interest of the values of the parameter G(c).
According to specific embodiments, the method includes one or more of the following characteristics, taken alone or in all combinations technically possible:
$$H_\alpha = \chi^2 + \alpha^2 \Delta^2$$
$$S'(t) = \sum_{m=1}^{N} c_m P(G_m) \sqrt{G_m}\, \exp\!\left[-(t-t_0)^2 G_m\right],$$ and
$$\chi^2 = \sum_{k=1}^{L} \left(S'(t_k) - \hat{S}(t_k)\right)^2$$
where the experimental Taylor signal Ŝ(t) and the reconstructed function S′(t) are sampled over time, whereby each sample is indexed by an integer k varying between 1 and L;
$$\Delta^2 = \sum_{m=2}^{N-1} \left[P(G_{m-1}) - 2P(G_m) + P(G_{m+1})\right]^2;$$
where β and γ are respectively the average and the standard deviation of the logarithm of the parameter G of a log-normal distribution, followed by,
$$G_{min} = \exp\!\left(\beta - k\sqrt{2}\,\gamma\right) \text{ and } G_{max} = \exp\!\left(\beta + k\sqrt{2}\,\gamma\right);$$
relative to the variable $x = (t-t_0)^2$, and
with αmin=0.1; αmax=3,
for c=1: $G_{min} = \tau_{max}^{-2}$, $G_{max} = \tau_{min}^{-2}$,
for c=−1: $G_{min} = \tau_{min}^{2}$, $G_{max} = \tau_{max}^{2}$,
for c=−1/df=−(1+a)/3:
then
$$G_{min} = \exp\!\left(\beta - k\sqrt{2}\,\gamma\right) \text{ and } G_{max} = \exp\!\left(\beta + k\sqrt{2}\,\gamma\right).$$
The invention also concerns a data storage medium including instructions for the execution of a method for determining the hydrodynamic radius, diffusion coefficient, or molar mass distribution of a mixture of molecule or particle species as defined above when the instructions are executed by a computer.
The invention lastly concerns a system for determining the hydrodynamic radius, diffusion coefficient, or molar mass distribution of a mixture of molecule or particle species including a computer, whereby the computer is programmed to execute a method for determining the hydrodynamic radius, diffusion coefficient, or molar mass distribution of a mixture of molecule or particle species as defined above.
Other characteristics and advantages of the invention will become apparent from the following detailed description, provided by way of example and by reference to the attached drawings, in which:
By reference to
The experimental device 3 includes, as is known, a capillary 6.
The experimental device 3 includes, in the vicinity of one end of the capillary 6, an injection section 7, and, in the vicinity of the other end of the capillary 6, a detection section 9.
The injection section 7 includes means 11 for injection into the capillary 6 of a sample of the mixture to be analyzed. The injection section 7 also includes means to allow an eluent to flow inside the capillary 6 from the injection section 7 to the detection section 9. These flow means are shown schematically in
The detection section 9 is optical. It is equipped with an optical cell including a light source S and an optical system 15 suited to cause the rays of light emitted by the source S to converge on a narrow portion of the capillary 6. Along the optical axis of the optical system 15, but opposite the lighted side of the capillary 6, the cell includes a CCD, diode array, or photomultiplier sensor 17 suited to collect the light that has passed through the capillary 6 and to generate a detection signal corresponding to the light collected. The sensor 17 is electrically connected to an electronic card 19 for pre-processing and digitising the detection signal generated by the sensor 17. The card 19 outputs a digital measurement signal that is time-dependent. This measurement signal is referred to as the ‘experimental Taylor signal’ or Taylorgram. It is indicated by the notation Ŝ(t) in the following. It depends on the time t.
The experimental Taylor signal Ŝ(t) is sampled at a predetermined temporal frequency, such that the sampling points tk (k=1, . . . , L) are at regular intervals. The experimental Taylor signal Ŝ(t) thus consists of a set of L data pairs (tk, Ŝ(tk)).
In one variant, the detection section may be of another type, e.g., a conductivity, mass spectrometry, fluorescence (laser-induced, if applicable), electrochemical, or light scattering detector, or more generally any type of detector used in capillary electrophoresis. In particular, the Taylor signal need not be a time signal (the Taylor peak passing in front of a narrow sensor), but may instead be a spatial signal (instantaneous capture of the Taylor peak by an extended sensor). In this case, the variable t does not represent time, but the position along the capillary.
The analysis device 5 consists of a computer including an input/output interface 21 to which the electronic card 19 of the experimental device 3 is connected.
The computer further includes a memory 23, such as a RAM and/or ROM, as well as a processing unit 25, such as a microprocessor. The computer also includes human-machine interface means, indicated by the number 27 in
The experimental Taylor signal Ŝ(t) is processed by a software application, the instructions of which are stored in the memory 23 and executed by the processing unit 25. This software is shown schematically in
In the known manner, assuming that i) the contribution of the diffusion along the axis of the capillary 6 to the dispersion at the level of the peak of the signal is negligible, ii) the injection time of the sample into the capillary 6 is sufficiently short (typically, the injected volume is less than 1% of the volume of the capillary), and iii) the detection device is sensitive to the mass of the molecules, the real Taylor signal S(t) of a monodispersed sample, i.e., a sample having only one species and accordingly characterized by a single hydrodynamic radius value Rh or diffusion coefficient D, is modelled by a Gaussian function:
$$S(t) = C M \rho \sqrt{D}\, \exp\!\left[-(t-t_0)^2\, \frac{12D}{R_c^2 t_0}\right] + B \qquad (1)$$
where:
C is an instrumental constant,
M is the molar mass of the species,
ρ is the molar concentration of the species,
Rc is the internal radius of the capillary,
t0 is the moment corresponding to the peak of the Taylor signal, and
B is an offset constituting a measurement artifact (this term will be omitted in the following for clarity and because it is taken into account in the constrained regularization method).
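As an illustration of the single-species model of equation (1), the signal can be evaluated numerically. The following is only a minimal sketch: the function name `taylor_signal`, the lumped prefactor `amplitude` (standing for the product CMρ) and all numerical values are assumptions chosen for the example and do not come from this description.

```python
import math

def taylor_signal(t, t0, D, Rc, amplitude=1.0, offset=0.0):
    """Gaussian Taylor signal of a single species, after equation (1).

    amplitude is a hypothetical lumped prefactor standing for C*M*rho;
    offset plays the role of the constant B.
    """
    G = 12.0 * D / (Rc ** 2 * t0)  # width parameter of the Gaussian
    return amplitude * math.sqrt(D) * math.exp(-((t - t0) ** 2) * G) + offset

# Illustrative values only (SI units): D in m^2/s, Rc in m, times in s.
D, Rc, t0 = 1.0e-10, 25.0e-6, 300.0
peak = taylor_signal(t0, t0, D, Rc)
left = taylor_signal(t0 - 5.0, t0, D, Rc)
right = taylor_signal(t0 + 5.0, t0, D, Rc)
```

With these values the signal is maximal at t0 and symmetric about it, as expected for a Gaussian centred on t0.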
It should be emphasized that assumption iii) depends on the nature of the sensor used in the detection section, and that the use of another type of sensor results in modifications to the equations shown herein in a manner known to persons skilled in the art.
The real Taylor signal S(t) of a polydispersed sample, i.e., a sample including several species, is modelled as the sum of the contributions of each of the species. Thus, assuming a mixture including a continuum of species, equation (1) is generalized by a continuous sum of Gaussian functions according to:
$$S(t) = \int_0^\infty C\, M(D)\, \rho(D)\, \sqrt{D}\, \exp\!\left[-(t-t_0)^2\, \frac{12D}{R_c^2 t_0}\right] dD \qquad (2)$$
In equation (2), the Gaussian functions are all centred on the same reference time t0.
The parameter $G = 12D/(R_c^2 t_0)$ is introduced, so that equation (2) becomes:
$$S(t) = \int_0^\infty P(G)\, \sqrt{G}\, \exp\!\left[-(t-t_0)^2 G\right] dG \qquad (3)$$
where P(G) is referred to in the following as the ‘amplitude distribution’ of the Gaussian functions of parameter G.
A value of the parameter G is associated, via the diffusion coefficient D, with a species. For example, the Stokes-Einstein-Sutherland formula allows for the association of a value of the parameter G with the species characterized by the hydrodynamic radius Rh, according to the relation:
$$G = \frac{12D}{R_c^2 t_0} = \frac{2 k_B T}{\pi \eta R_c^2 t_0 R_h} \qquad (4)$$
where:
kB is the Boltzmann constant,
T is the absolute temperature, expressed in kelvins, at which the experiment is carried out, and
η is the viscosity of the eluent used.
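The link between hydrodynamic radius, diffusion coefficient and the parameter G can be sketched numerically. The example below assumes the standard Stokes-Einstein relation D = kB·T/(6πηRh) together with G = 12D/(Rc²t0); the function names and the numerical values (a 5 nm particle in water at about 25 °C) are hypothetical and serve only as an illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def diffusion_from_radius(Rh, T, eta):
    """Stokes-Einstein relation: D = kB*T / (6*pi*eta*Rh)."""
    return K_B * T / (6.0 * math.pi * eta * Rh)

def G_from_diffusion(D, Rc, t0):
    """Parameter G = 12*D / (Rc^2 * t0)."""
    return 12.0 * D / (Rc ** 2 * t0)

def radius_from_G(G, T, eta, Rc, t0):
    """Invert the two relations above to recover Rh from G."""
    D = G * Rc ** 2 * t0 / 12.0
    return K_B * T / (6.0 * math.pi * eta * D)

# Illustrative values only (SI units).
T, eta, Rc, t0 = 298.15, 0.89e-3, 25.0e-6, 300.0
D = diffusion_from_radius(5.0e-9, T, eta)
G = G_from_diffusion(D, Rc, t0)
Rh_back = radius_from_G(G, T, eta, Rc, t0)
```

Because G is inversely proportional to Rh, large particles correspond to small G values and therefore to broad Gaussian contributions.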
Thus, in equation (3), each Gaussian function $P(G)\sqrt{G}\exp[-(t-t_0)^2 G]$ represents the contribution of one species to the total amplitude of the real Taylor signal S(t). The amplitude P(G) of each Gaussian function depends directly on the concentration of the corresponding species in the mixture.
The function of the software 31 is to determine the distribution P(G) that is the solution of the following equation, which corresponds to equation (3) when the real Taylor signal S(t) is replaced with the experimental Taylor signal Ŝ(t):
$$\hat{S}(t) \equiv \int_0^\infty P(G)\, \sqrt{G}\, \exp\!\left[-(t-t_0)^2 G\right] dG \qquad (5)$$
Like any measurement device, the detection section 9 introduces a systematic measurement error such that the experimental Taylor signal Ŝ(t) is not exactly equal to the real Taylor signal S(t).
The solution of equation (5) therefore yields several distributions, each of which is a solution to within the measurement error. In other words, solving equation (5) identifies several families of Gaussian functions, each of which sums to a reconstructed Taylor signal $S'(t) = \int_0^\infty P(G)\sqrt{G}\exp[-(t-t_0)^2 G]\, dG$ that fits the experimental Taylor signal Ŝ(t), with the fitting criterion taking into account the measurement error introduced by the detection section.
However, amongst the various distributions P(G) that are solutions to the equation (5), only some have physical significance. It is such a ‘physical’ solution that the software 31 is suited to determine.
To solve this problem, the software 31 uses an algorithm that implements a constrained regularization method.
This algorithm is based on the following equation, which results from a discretization of equation (5) relative to the parameter G:
$$\hat{S}(t) \equiv \sum_{m=1}^{N} c_m P(G_m)\, \sqrt{G_m}\, \exp\!\left[-(t-t_0)^2 G_m\right] \qquad (6)$$
where the interval of the values of interest of the parameter G, between the predetermined minimum Gmin=G0 and maximum Gmax=GN, is subdivided into N sub-intervals, indexed by the integer m and of length cm. Preferably, the various sub-intervals have the same length:
$$c_m = \frac{G_{max} - G_{min}}{N}$$
The limits are chosen such that the interval on which equation (6) is discretized extends beyond the interval on which the distribution P(G) is non-zero. Accordingly, P(G1)=P(GN)=0.
The unknowns of equation (6) are the amplitudes P(Gm), m=1, . . . , N.
In principle, equation (6) is solved by fitting the reconstructed Taylor signal $S'(t) = \sum_{m=1}^{N} c_m P(G_m)\sqrt{G_m}\exp[-(t-t_0)^2 G_m]$ to the experimental Taylor signal Ŝ(t). That is, for any time tk, S′(tk) must be as near as possible to Ŝ(tk).
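The discretized sum of Gaussian functions and its least-squares distance to the measured signal can be sketched as follows. This is a minimal illustration only: the function names, the small grid of G values, the sub-interval lengths `c` and the toy amplitudes are all assumptions made for the example.

```python
import math

def reconstructed_signal(t, t0, G_grid, P, c):
    """Sum of Gaussians S'(t) = sum_m c_m P(G_m) sqrt(G_m) exp[-(t-t0)^2 G_m]."""
    return sum(c_m * p * math.sqrt(g) * math.exp(-((t - t0) ** 2) * g)
               for g, p, c_m in zip(G_grid, P, c))

def chi_square(times, S_exp, t0, G_grid, P, c):
    """Least-squares distance between S'(t_k) and the measured S_exp(t_k)."""
    return sum((reconstructed_signal(t, t0, G_grid, P, c) - s) ** 2
               for t, s in zip(times, S_exp))

# Toy data: a synthetic "experimental" signal built from known amplitudes.
t0 = 0.0
times = [float(k) for k in range(-20, 21)]
G_grid = [0.01 * (m + 1) for m in range(5)]
c = [0.01] * 5
P_true = [0.0, 1.0, 2.0, 1.0, 0.0]
S_exp = [reconstructed_signal(t, t0, G_grid, P_true, c) for t in times]

misfit_true = chi_square(times, S_exp, t0, G_grid, P_true, c)
misfit_flat = chi_square(times, S_exp, t0, G_grid, [1.0] * 5, c)
```

The distance vanishes for the amplitudes that generated the data and is strictly positive for a different trial distribution.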
In order to obtain robust results that are physically significant, however, it is necessary to take into account all information available on the amplitudes P(Gm) during the adjustment process so as to reject all solutions that are not physically acceptable.
To this end, the constrained regularization algorithm solves equation (6) by minimising a cost function that depends on the N unknowns P(Gm) and translates the information available on the amplitudes P(Gm) into constraints.
In the current embodiment, the cost function Hα takes the following form:
$$H_\alpha = \chi^2 + \alpha \Delta^2 \qquad (7)$$
It includes a first term χ² corresponding to a ‘distance’ between the experimental Taylor signal Ŝ(tk) and a reconstructed Taylor signal S′(tk).
For example, this first term is a distance of the ‘least squares’ type:
$$\chi^2 = \sum_{k=1}^{L} \left(S'(t_k) - \hat{S}(t_k)\right)^2 \qquad (8)$$
Another distance measure may be used, in particular, one that, in the foregoing sum, weights each term by a coefficient wk inversely proportional to the noise affecting the measurement taken at the instant tk.
The cost function Hα includes a second term Δ2, referred to as the constraint term, expressing a constraint that penalises the amplitudes P(Gm) that have no physical significance.
For example:
$$\Delta^2 = \sum_{m=2}^{N-1} \left[P(G_{m-1}) - 2P(G_m) + P(G_{m+1})\right]^2 \qquad (9)$$
In this example, the constraint term is the sum of the squares of the discrete second derivative of the distribution P(G). It expresses a regularity constraint: amplitudes P(Gm) that vary too rapidly relative to their neighbours, P(Gm−1) or P(Gm+1), are penalised.
Another example of a regularity constraint is the following:
$$\Delta^2 = \sum_{m=3}^{N-1} \left[P(G_{m-2}) - 3P(G_{m-1}) + 3P(G_m) - P(G_{m+1})\right]^2 \qquad (10)$$
This constraint term is the sum of the squares of the discrete third derivative of the distribution P(G). The generalization to a regularity constraint based on the n-th derivative (n ≥ 1) is immediate and can be carried out by a person skilled in the art.
In the following, it is assumed that the regularity constraint used is that of the second derivative of the distribution P(G), eq. (9).
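The regularity terms based on the second, third or n-th discrete derivative can all be computed by repeated finite differencing. The sketch below is an illustration only; the function name `difference_penalty` and the test vectors are hypothetical.

```python
def difference_penalty(P, order=2):
    """Sum of squared n-th finite differences of the amplitudes P.

    order=2 gives the second-derivative regularity term,
    order=3 the third-derivative variant.
    """
    diff = list(P)
    for _ in range(order):
        # Each pass replaces the sequence by its successive differences.
        diff = [b - a for a, b in zip(diff, diff[1:])]
    return sum(d * d for d in diff)

spike = [0.0, 0.0, 1.0, 0.0, 0.0]   # isolated spike: heavily penalised
ramp = [0.0, 1.0, 2.0, 3.0, 4.0]    # straight line: no second-derivative cost
penalty_spike = difference_penalty(spike)
penalty_ramp = difference_penalty(ramp)
```

An isolated spike is penalised while a smoothly varying distribution is not, which is the behaviour the regularity constraint is meant to enforce.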
The first and second terms of the cost function Hα have relative contributions that may be adapted by selecting the value of a coefficient α, a Lagrange coefficient. This coefficient sets the weight of the constraint term relative to the distance term. If α is very small, the constraint term is negligible. In this case, the minimization of the cost function yields the same result as a simple fit to the experimental data. For values of α that are too large, on the other hand, a significant cost is assigned to the constraint on the P(G), and the algorithm will reject the solutions that do not observe the constraint, at the risk of retaining a solution that does not properly fit the experimental data.
Supplemental constraints that can be expressed as linear equalities or inequalities in the P(Gm) are imposed directly during the search for the minimum of the cost function, by restricting the search to the corresponding sub-regions of the space of the P(Gm).
For example, the constraint that the amplitudes are positive, P(Gm) ≥ 0 ∀m, is imposed by minimising the cost function only on the half space of the positive amplitudes.
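One simple way to minimise a cost of the form χ² + αΔ² while keeping every amplitude non-negative is projected gradient descent with a backtracking step. The sketch below uses that technique purely as an illustration; the description does not specify which solver is used, and the names `cost`, `gradient`, `minimise`, the kernel matrix `A` (whose entries stand for c_m √G_m exp[−(t_k−t0)² G_m]) and the toy data are all assumptions.

```python
import math

def cost(P, A, S, alpha):
    """chi^2 + alpha * Delta^2 (second-difference penalty) for amplitudes P."""
    chi2 = sum((sum(a * p for a, p in zip(row, P)) - s) ** 2
               for row, s in zip(A, S))
    delta2 = sum((P[m - 1] - 2.0 * P[m] + P[m + 1]) ** 2
                 for m in range(1, len(P) - 1))
    return chi2 + alpha * delta2

def gradient(P, A, S, alpha):
    """Gradient of the cost with respect to each amplitude."""
    n = len(P)
    resid = [sum(a * p for a, p in zip(row, P)) - s for row, s in zip(A, S)]
    grad = [2.0 * sum(A[k][m] * resid[k] for k in range(len(S)))
            for m in range(n)]
    for m in range(1, n - 1):
        d = P[m - 1] - 2.0 * P[m] + P[m + 1]
        grad[m - 1] += 2.0 * alpha * d
        grad[m] -= 4.0 * alpha * d
        grad[m + 1] += 2.0 * alpha * d
    return grad

def minimise(A, S, alpha, n_iter=300):
    """Projected gradient descent: step, then clip negative amplitudes to 0."""
    P = [0.0] * len(A[0])
    step = 1.0
    for _ in range(n_iter):
        g = gradient(P, A, S, alpha)
        h = cost(P, A, S, alpha)
        accepted = False
        while step > 1e-12:
            trial = [max(0.0, p - step * gi) for p, gi in zip(P, g)]
            if cost(trial, A, S, alpha) < h:  # accept only a strict decrease
                accepted = True
                break
            step *= 0.5
        if not accepted:
            break
        P = trial
        step *= 2.0  # try a bolder step at the next iteration
    return P

# Toy problem (hypothetical values): synthetic signal from known amplitudes.
t0 = 0.0
times = [float(k) for k in range(-20, 21)]
G_grid = [0.01 * (m + 1) for m in range(8)]
A = [[0.01 * math.sqrt(g) * math.exp(-((t - t0) ** 2) * g) for g in G_grid]
     for t in times]
P_true = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0, 0.0]
S = [sum(a * p for a, p in zip(row, P_true)) for row in A]
alpha = 1e-5
P_fit = minimise(A, S, alpha)
```

Restricting the search to the non-negative half space is done here by the clipping step; a dedicated non-negative least-squares or quadratic programming solver would be a natural alternative in practice.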
For example, if the value of an average ⟨G⟩ of the parameter G has been determined, a supplemental constraint on the amplitudes P(Gm) that are solutions of equation (6) is that they reproduce the previously determined average ⟨G⟩ to within a deviation ε. This constraint is also expressed as linear inequalities in the P(Gm):
$$\langle G \rangle - \varepsilon \le \sum_{m=1}^{N} P(G_m)\, G_m \le \langle G \rangle + \varepsilon \qquad (11)$$
with, for example, ε/⟨G⟩ = 5%.
Equation (11) may easily be generalized to types of average other than the arithmetical average:
$$\langle G \rangle = \sum_{m=1}^{N} P(G_m)\, G_m \qquad (12)$$
for example the averages GT and GΓ, which will be introduced below.
A crucial point is the choice of the coefficient α of the cost function Hα in equation 7. Two strategies may be used to select the coefficient α:
If the standard deviation of the noise σk is not known, an estimated value σest may be determined from the mean deviation between the experimental data and the best possible fit obtained without taking the constraint into account, i.e., that obtained for α=0:
where S′(tk)(α=0) is the value of the k-th point of the Taylor signal reconstructed from the amplitudes P(Gm) obtained by minimising only the first term of the cost function, Hα=0.
Once the noise has been estimated, α is selected such that the normalized value χnorm2 of the distance term χ2, calculated by replacing σk with σest in equation 12, does not exceed the number of degrees of freedom: ν=L−N.
A second strategy consists of selecting a value of α that gives equal weight to the distance term and the constraint term.
In this case, the value of α that is retained is the one for which, once the cost function Hα has been minimized, the result is χ2=αΔ2.
In practice, α is selected by scanning a wide range of values of the coefficient. For each value of α, all amplitudes P(Gm) that minimise the cost function Hα are determined. The corresponding values of χnorm2, Hα and P(Gm) are recorded. Amongst all of the trials, the Lagrange coefficient α0 having the greatest value such that χnorm2≦ν is finally retained.
In order to improve the efficiency of the digital search, the values of α are first scanned on a trial grid with large steps, and then the value of α is refined using a finer grid.
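The coarse-then-fine scan of α can be sketched as follows. This is a hedged illustration, not the actual implementation: `chi2_norm` is a hypothetical callable that would, in a real analysis, minimise the cost function for the given α and return the normalized distance term; here a toy increasing function stands in for it. The logarithmic grids, their bounds and the assumption that the normalized distance grows with α are choices made for the example.

```python
def log_grid(lo, hi, n):
    """n logarithmically spaced values from lo to hi (inclusive)."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

def largest_admissible(chi2_norm, nu, alphas):
    """Largest alpha on the grid whose normalized chi-square does not exceed nu."""
    admissible = [a for a in alphas if chi2_norm(a) <= nu]
    return max(admissible) if admissible else None

def scan_alpha(chi2_norm, nu, lo=1e-6, hi=1e6, n=13):
    """Coarse scan over [lo, hi], then a finer scan just above the coarse result."""
    coarse = log_grid(lo, hi, n)
    a0 = largest_admissible(chi2_norm, nu, coarse)
    if a0 is None or a0 == coarse[-1]:
        return a0
    upper = coarse[coarse.index(a0) + 1]
    return largest_admissible(chi2_norm, nu, log_grid(a0, upper, n))

# Toy stand-in for the real fit: the misfit grows as the constraint
# is weighted more heavily.
def toy_chi2_norm(alpha):
    return 50.0 + alpha

alpha0 = scan_alpha(toy_chi2_norm, nu=80.0)
```

In this toy case the threshold is crossed at α = 30; the coarse grid brackets it between 10 and 100 and the fine grid then returns the largest admissible value just below the threshold.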
Below, two averages of G are presented that may be obtained directly from the experimental Taylor signal.
The T average of the parameter G is defined as follows:
and the Γ average of the parameter G is defined as follows:
where cm are those defined in relation to equation (6).
The purpose of determining these averages is twofold:
GT and GΓ may be calculated respectively based on the temporal variance of the experimental Taylor signal and based on the cumulative approach described below.
It is shown that:
Thus, the T average of the parameter G is accessible by integrating the experimental Taylor signal.
With G=12D/(Rc2t0), the T average of the parameter D is given by:
In this part, the breakdown of the experimental Taylor signal is described in the case of a sample that is moderately polydispersed.
It is assumed that the size distribution is discrete. Equation (2) then becomes:
$$S(t) = C' \sum_{i=1}^{N} \rho_i M_i \sqrt{D_i}\, \exp\!\left[-(t-t_0)^2\, \frac{12 D_i}{R_c^2 t_0}\right] \qquad (16)$$
where ρi is the molar concentration of the i-th species in the mixture, and Mi and Di are respectively its molar mass and diffusion coefficient.
It is useful to ‘normalise’ the Taylor signal relative to the height of its peak by introducing:
$$s(t) = \frac{S(t)}{S(t_0)} = \sum_{i=1}^{N} f_i \exp\!\left[-(t-t_0)^2 G_i\right] \qquad (17)$$
where:
$G_i = 12 D_i/(R_c^2 t_0)$, and
$f_i = \rho_i M_i \sqrt{D_i} \,/\, \sum_{j=1}^{N} \rho_j M_j \sqrt{D_j}$ is the relative contribution of the i-th species to the Taylor signal. It should be noted that fi depends on the diffusion coefficient of the i-th species.
The Γ average of the parameter G is then expressed as follows:
$$\langle G \rangle_\Gamma = \sum_{i=1}^{N} f_i G_i \qquad (18)$$
By positing $G_i = \langle G \rangle_\Gamma + \delta G_i$, with $\langle \delta G \rangle_\Gamma = 0$, equation (17) becomes:
$$s(t) = \exp\!\left[-(t-t_0)^2 \langle G \rangle_\Gamma\right] \sum_{i=1}^{N} f_i \exp\!\left[-(t-t_0)^2\, \delta G_i\right] \qquad (19)$$
which is the product of a Gaussian function and correction terms.
If $(t-t_0)^2\, \delta G_i \ll 1$, i.e., near the peak of the Taylor signal, the Taylor expansion of the second factor of equation (19) gives:
$$\exp\!\left[-(t-t_0)^2\, \delta G_i\right] = 1 - (t-t_0)^2\, \delta G_i + \tfrac{1}{2}\left[(t-t_0)^2\, \delta G_i\right]^2 + \dots \qquad (20)$$
which leads, in equation (19), using $\sum_{i=1}^{N} f_i = 1$ and $\sum_{i=1}^{N} f_i\, \delta G_i = 0$, to:
$$s(t) = \exp\!\left[-(t-t_0)^2 \langle G \rangle_\Gamma\right] \left(1 + \tfrac{1}{2}(t-t_0)^4 \langle \delta G^2 \rangle_\Gamma + \dots\right) \qquad (21)$$
Where the size distribution is not too broad (i.e., a slightly polydispersed sample), it is possible to express the normalized Taylor signal s(t) as the sum of a Gaussian function (as in the case of a monodispersed sample) and correction terms (to take into account the deviation from the monodispersed case).
Taking the logarithm of expression (21) and carrying out a further expansion, the following is obtained:
$$\ln[s(t)] = -(t-t_0)^2 \langle G \rangle_\Gamma + \tfrac{1}{2}(t-t_0)^4 \langle \delta G^2 \rangle_\Gamma + \dots \qquad (22)$$
Equation (22) is the desired cumulant expansion. The coefficients $\Gamma_1 = \langle G \rangle_\Gamma$ and $\Gamma_2 = \langle \delta G^2 \rangle_\Gamma$ are the first- and second-order cumulants of this expansion.
They may be obtained by adjusting a second-order polynomial of the variable (t−t0)2 to the function ln[s(t)].
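The polynomial fit that yields the two cumulants can be sketched with a small least-squares solve. The example assumes the form ln s = −Γ1·x + ½·Γ2·x² with x = (t−t0)² and no constant term (since s equals 1 at the peak); the function name `fit_cumulants` and the synthetic data are hypothetical.

```python
def fit_cumulants(x, y):
    """Least-squares fit of y = a*x + b*x^2, returning (Gamma1, Gamma2).

    With y = ln s and x = (t - t0)^2, Gamma1 = -a and Gamma2 = 2*b.
    """
    sxx = sum(v ** 2 for v in x)
    sx3 = sum(v ** 3 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sxy = sum(v * w for v, w in zip(x, y))
    sx2y = sum(v * v * w for v, w in zip(x, y))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = sxx * sx4 - sx3 ** 2
    a = (sxy * sx4 - sx2y * sx3) / det
    b = (sxx * sx2y - sx3 * sxy) / det
    return -a, 2.0 * b

# Synthetic data generated from known cumulants (illustrative values).
gamma1_true, gamma2_true = 0.02, 4.0e-5
x = [(0.5 * k) ** 2 for k in range(1, 21)]
y = [-gamma1_true * v + 0.5 * gamma2_true * v * v for v in x]
gamma1, gamma2 = fit_cumulants(x, y)
polydispersity = gamma2 / gamma1 ** 2  # ratio of 2nd cumulant to 1st squared
```

On real data the fit would be restricted to points near the peak, where the expansion is valid.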
The first cumulant also provides access to the Γ average of the diffusion coefficient:
The Γ average is different to the aforementioned T average because the diffusion coefficient D appears there at different powers. These two averages contain different information on the distribution P(G). They allow for the addition of two constraints in the cost function to be minimized.
The second cumulant is linked to the Γ average of the variance of the distribution of the diffusion coefficients, providing an estimate of the polydispersity of the sample. More specifically, the ratio of the second cumulant divided by the square of the first cumulant gives:
In the constrained regularization adjustment procedure, the choice of the interval of the values G on which the distribution P(G) is sought is an important factor.
In fact, the number N of points used in the discretization of the equation (6) cannot be too large; otherwise, the calculation time for the adjustment will be too long.
Furthermore, N must be significantly smaller than the number L of digitization points of the experimental Taylor signal.
Typical values of N are in the range of 50-200.
These considerations show that the interval [Gmin, Gmax] (Gmin=G0 and Gmax=GN) must be chosen carefully.
On the one hand, the interval [Gmin, Gmax] must be wider than the interval on which the distribution P(G) is non-zero, in order to avoid artifacts due to the truncation of this distribution.
On the other hand, if the interval on which the distribution P(G) is non-zero is too narrow a sub-interval of [Gmin, Gmax], the details of the distribution P(G) will be poorly resolved after discretization.
It is therefore essential to define an automatic procedure for determining Gmin and Gmax, so that the user does not waste time selecting the limits of the interval by trial and error.
We propose three possible approaches to determine Gmin and Gmax. The first two approaches are based on the calculation of the equivalent log-normal distribution, whilst the third is empirical and based on the representation of ln[s(t)] as a function of (t−t0)², in the same system of axes as that used for the cumulant expansion.
Equivalent Log-Normal Distribution
A log-normal distribution often allows for a highly accurate description of the size distribution of a polymer or particle sample:
$$PDF(G) = \frac{1}{G\, \gamma \sqrt{2\pi}} \exp\!\left[-\frac{(\ln G - \beta)^2}{2\gamma^2}\right]$$
where:
PDF(G) is the probability density that the particles of the sample will have a G value between G and G+dG; and
β and γ are respectively the average and the standard deviation of the logarithm of the parameter G, ln G.
Although the log-normal distribution may be a poor model for more complex mixtures, the determination of the equivalent log-normal distribution of any mixture is useful to estimate the interval of the values of the parameter G on which the distribution P(G) that is the solution of equation (6) is to be sought.
The probability density PDF (G) depends only on the parameters β and γ. It is possible to determine these two parameters from GT and GΓ.
The definition is:
By replacing equation (27) in equations (28) and (29), the following is obtained:
It is also possible to obtain the equivalent log-normal distribution from first- and second-order cumulants according to the following relations:
In conclusion, it is possible to determine the log-normal distribution either from GT and GΓ according to equations (30) and (31) or from Γ1 and Γ2 using equations (32) and (33).
Gmin and Gmax may be estimated by replacing the distribution P(G) with an equivalent log-normal distribution.
The objective is for the interval [Gmin, Gmax] to cover a significant fraction of the log-normal distribution equivalent to the experimental Taylor signal. This fraction of the distribution is given by:
$$\Delta Q_G = Q_G(G_{max}) - Q_G(G_{min}) \qquad (32)$$
where QG(G) is the cumulative probability defined by:
$$Q_G(G) = \int_0^{G} PDF(G')\, dG' \qquad (33)$$
Furthermore, it is preferable for the interval [QG(Gmin), QG(Gmax)] to be distributed symmetrically relative to the median value QG=½. Thus:
$$Q_G(G_{min}) = 1 - Q_G(G_{max}) \qquad (34)$$
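In the context of this assumption, equation (32) yields:
$$\Delta Q_G = \mathrm{erf}\!\left(\frac{\ln G_{max} - \beta}{\sqrt{2}\, \gamma}\right) \qquad (35)$$
where erf is the error function known to persons skilled in the art.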
This results in:
$$\ln G_{max} = \beta + k\sqrt{2}\, \gamma, \quad \text{i.e.,} \quad G_{max} = \exp\!\left(\beta + k\sqrt{2}\, \gamma\right) \qquad (36)$$
and
$$\ln G_{min} = \beta - k\sqrt{2}\, \gamma, \quad \text{i.e.,} \quad G_{min} = \exp\!\left(\beta - k\sqrt{2}\, \gamma\right) \qquad (37)$$
with $k = \mathrm{erf}^{-1}(\Delta Q_G)$, where $\mathrm{erf}^{-1}$ is the inverse error function.
For example, if ΔQG=99.53%, then k=2, or if ΔQG=99.998%, then k=3.
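The bounds derived from the equivalent log-normal distribution are straightforward to compute once β, γ and the covered fraction ΔQG are known. The sketch below is illustrative only: the function names are hypothetical, the numerical values are arbitrary, and the inverse error function is obtained from the standard normal quantile function of the Python standard library, using erf⁻¹(y) = Φ⁻¹((1+y)/2)/√2.

```python
import math
from statistics import NormalDist

def inverse_erf(y):
    """Inverse error function via the standard normal quantile function."""
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / math.sqrt(2.0)

def g_interval(beta, gamma, delta_q):
    """(Gmin, Gmax) covering a fraction delta_q of a log-normal distribution."""
    k = inverse_erf(delta_q)
    half_width = k * math.sqrt(2.0) * gamma
    return math.exp(beta - half_width), math.exp(beta + half_width)

# Consistency check: a covered fraction of erf(2) should give k = 2.
k_check = inverse_erf(math.erf(2.0))

# Illustrative parameters: median G of 0.01 and a log-width of 0.5.
g_min, g_max = g_interval(beta=math.log(0.01), gamma=0.5, delta_q=0.9953)
```

The interval is symmetric about β on a logarithmic scale, so the product Gmin·Gmax equals exp(2β).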
For the sake of simplicity, the following notation will be used: x=(t−t0)2.
For a monodispersed sample, ln[s(t)] as a function of x is a straight line, the slope of which gives the diffusion coefficient of the species of the sample. That is:
Thus, ∂lns/∂x does not depend on x, i.e., it is not time-dependent.
On the other hand, for a polydispersed sample, the plot of ln[s(t)] against x is curved, and its local slope is given by the derivative ∂lns/∂x, which is proportional to the parameter G.
Based on equation (40), it is assumed that Gmin is linked to the minimum of the local slope of ln s (in absolute value), because the species with low diffusion coefficients correspond to low G values, and thus to a slow decay of the Taylor signal over time. Accordingly, it is assumed that:
where the minimum of |∂lns/∂x| is sought on a suitable interval of x, to be determined empirically, and where bmin is a numerical coefficient, also to be determined empirically.
By studying a large number of Taylor signals of samples of all kinds, we have found that the suitable interval of x is that for which the signal s(t) decreases by two decades. This corresponds to an interval in time between the time t0 corresponding to the peak of s(t) and the time t1 such that s(t1)=S(t1)/S(t0)=0.01.
Likewise, it is considered that:
where the maximum of |∂lns/∂x| is sought on the same interval of x.
In analysing an experimental Taylor signal, it is simpler to estimate the characteristic decrease time of s(t). Defining the decrease time as τ=G−1/2, minimum and maximum decrease times are defined based on the aforementioned relations, according to:
By studying a large number of Taylor signals of samples of all kinds, we have found that the following values of the parameters αmin and αmax bracket the desired minimum and maximum values:
$$\alpha_{min} = 0.1;\ \alpha_{max} = 3, \quad \text{i.e.,} \quad b_{min} = 1/9;\ b_{max} = 100 \qquad (43)$$
Lastly, the values Gmin, Gmax to use in order to fit the data are calculated according to:
$$G_{min} = \tau_{max}^{-2} \qquad (44)$$
$$G_{max} = \tau_{min}^{-2} \qquad (45)$$
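The empirical approach can be sketched numerically under one explicit assumption: that the bounds are obtained as Gmin = bmin·min|∂ln s/∂x| and Gmax = bmax·max|∂ln s/∂x|, with the extrema taken over the two decades of decay after the peak. These two relations are inferred from the surrounding text and are not reproduced from it; the function name, the finite-difference estimate of the slope and the synthetic signal are likewise hypothetical.

```python
import math

def empirical_g_interval(times, s, t0, b_min=1.0 / 9.0, b_max=100.0):
    """Estimate (Gmin, Gmax) from the local slope of ln s versus x = (t-t0)^2.

    Only points after the peak, down to s = 0.01 (two decades), are used.
    """
    pts = sorted(((t - t0) ** 2, math.log(v))
                 for t, v in zip(times, s) if t >= t0 and v >= 0.01)
    # Finite-difference estimate of |d ln s / dx| between neighbouring points.
    slopes = [abs((y2 - y1) / (x2 - x1))
              for (x1, y1), (x2, y2) in zip(pts, pts[1:]) if x2 > x1]
    return b_min * min(slopes), b_max * max(slopes)

# Synthetic monodisperse signal: ln s is a straight line of slope -G_true.
t0, G_true = 100.0, 0.01
times = [t0 + 0.5 * k for k in range(61)]
s = [math.exp(-G_true * (t - t0) ** 2) for t in times]
g_min, g_max = empirical_g_interval(times, s, t0)
```

For a monodisperse signal the local slope is constant, so the interval simply extends from G/9 to 100·G around the single G value; on noisy data the slope estimate would need smoothing before taking extrema.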
Once the distribution of the amplitudes P(G) has been obtained, the distribution of the amplitudes PD(D) according to the diffusion coefficient D, can be easily calculated.
The following equation links the probability distribution Py of the stochastic variable y to the probability distribution Px of the stochastic variable x, where x is a function of y:
with G=12D/(Rc2t0), the following is obtained:
In this expression, an integral was introduced to the denominator because the distribution P(G) is not necessarily normalized.
It is often desirable to express the polydispersity of the sample in terms of the amplitude distribution according to the hydrodynamic radius Rh, or according to the molar mass M.
These two distributions may be calculated based on the distribution P(G) using equation (46) and the following transformation rules:
Equation (48) uses the Stokes-Einstein relation
$$D = \frac{k_B T}{6 \pi \eta R_h}$$
where kB is the Boltzmann constant, T the absolute temperature, and η the viscosity of the eluent.
Equation (49) uses the Einstein equation for the viscosity of a diluted suspension and the Mark-Houwink equation linking the intrinsic viscosity [η] to the molar mass according to the relation:
$$[\eta] = K M^{a} \qquad (50)$$
where K and a are the Mark-Houwink coefficients.
The following relation, which gives the hydrodynamic radius as a function of molar mass, may also be used:
$$R_h = \left(\frac{3 K}{10 \pi N_a}\right)^{1/3} M^{1/d_f}$$
where Na is Avogadro's number and df=3/(1+a) is the fractal dimension of the object (e.g., df=3 for an ordinary compact object, 2 for a statistical polymer, and 5/3 for a polymer in a good solvent).
Equations (48) and (49) show that G is non-linear in Rh and M, respectively, whilst G is simply proportional to D.
Due to this non-linearity, the transformation of the distribution according to the parameter G identified as the solution of equation (6) results in a distribution according to the parameter Rh, or the parameter M, which does not necessarily observe the constraint of the cost function, in particular the regularity constraint. In most cases, the transform results in fact in the presence of non-physical peaks or oscillations in the distribution according to the parameter Rh, or the parameter M.
This is why, in a variant of the method described in detail above, the method consists of directly seeking the distribution according to the parameter Rh, or the parameter M, which observes the constraint(s) and allows for the correct reproduction of the experimental data, by constrained regularization.
To this end, the experimental Taylor signal is decomposed over a family of Gaussian functions of a suitably adapted parameter. Equation (5) is thus generalized in the form:
$$\hat{S}(t) \equiv \int_0^\infty P(G(c))\, G(c)^{c/2} \exp\!\left[-(t-t_0)^2\, G(c)^{c}\right] dG(c) \qquad (52)$$
where the three following cases are considered:
1) c=1: to be used when seeking the amplitude distribution according to D;
2) c=−1: to be used when seeking the amplitude distribution according to Rh;
3) c=−1/df=−(1+a)/3: to be used when seeking the amplitude distribution according to M;
Pnorm(G(c)) is the distribution P(G(c)) , properly normalized:
In case 1), G(1)=G. Equation (52) then reduces to equation (5). The amplitude distribution PD(D) is determined based on the amplitude distribution P(G(1)) using the following equation:
For case 2),
The implementation of the constrained regularization algorithm results in the determination of the amplitude distribution P(G(−1)). The distribution PR(Rh) is then determined based on Pnorm(G(−1)) according to the relation:
Lastly, for case 3),
The implementation of the constrained regularization algorithm results in the determination of the amplitude distribution P(G(−(1+a)/3)). The distribution PM(M) is then determined by means of Pnorm(G(−(1+a)/3)) according to the following relation:
The manner of selecting the interval [G(c),min, G(c),max] on which the distribution P(G(c)) is to be sought is similar to that described above. In particular, τmin and τmax are calculated according to equations (43) and (44). Lastly, the values G(c),min and G(c),max are determined as follows:
The method for determining the size distribution of a mixture of particles will now be described by reference to
The method includes a first step 100 of injecting a sample to be analyzed into the injection section 7 of the experimental device 3.
Then, in step 110, after actuating the means 13 for introducing and circulating an eluent inside the capillary 6, the sample injected is transported from the injection section 7 to the detection section 9 of the experimental device 3. The experimental conditions (nature of the eluent, flow speed of the eluent, transport distance separating the injection section from the detection section, temperature, internal radius of the capillary, etc.) are adapted so that a Taylor dispersion phenomenon detectable in the detection section 9 will occur. In the experimental examples below, precise experimental conditions are indicated.
In step 120, the sample transported by the eluent passes through the optical cell of the detection section 9. The sensor 17 then generates an electrical measurement signal characteristic of the Taylor dispersion occurring in the sample.
In step 130, the detection signal generated by the sensor 17 is processed by the electronic card 19 so as to deliver a digitized experimental Taylor signal Ŝ(t).
In step 140, the experimental Taylor signal Ŝ(t) is acquired by the computer 5.
It is then analyzed (step 200) by running the software 31 in order to determine a size distribution. The software 31 carries out the following elementary steps.
In step 142, a first adapted menu is presented to the user so that the user may select the parameter according to which the constrained regularization method is to be carried out. The user may thus choose either the diffusion coefficient D (case 1, c=1), the hydrodynamic radius (case 2, c=−1), or the molar mass M (case 3, c=−1/df). The user is also asked to choose the number N for the discretization of the distribution sought. In the following, for simplicity, it is assumed that the user chooses the diffusion coefficient D and that the parameter to be taken into account is the parameter G.
In step 144, a second adapted menu is presented to the user so that the user may select the number and nature of the constraints to be taken into account in seeking the distribution that is the solution of equation (5). The proposed constraints to select are, e.g.:
In the following, it is assumed that the first constraint is implemented via a Lagrange multiplier in the cost function, whilst the other constraints are implemented directly by appropriately limiting the space of the amplitudes P(Gm) in which an extremum of the cost function is sought.
In step 146, the experimental Taylor signal Ŝ(t) is broken down into cumulants. More specifically, the normalized Taylor signal s(t)=Ŝ(t)/Ŝ(t0) is first determined, then its logarithm ln[s(t)] is calculated. Lastly, a second-degree polynomial in the variable (t−t0)² is fitted to the function ln[s(t)]. The first- and second-order cumulants Γ1 and Γ2 are then determined. Γ1 allows, in particular, for a measurement of the Γ average of the parameter G. Additionally, the T average of the Taylor signal is measured.
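The cumulant breakdown of step 146 may be sketched as an ordinary least-squares fit of a second-degree polynomial in x=(t−t0)² to ln[s(t)]. The sketch below is illustrative only. It assumes the convention ln[s(t)] ≈ a0 − Γ1·x + (Γ2/2)·x², which may differ from the exact form of the equations referred to in this description by a sign or constant factor; the function name `fit_cumulants` is hypothetical.

```python
import math

def fit_cumulants(times, signal, t0, s0):
    """Fit ln[s(t)] = a0 + a1*x + a2*x**2 with x = (t - t0)**2.

    Returns (gamma1, gamma2) under the ASSUMED convention
    gamma1 = -a1 and gamma2 = 2*a2. s0 is the signal value at t0.
    """
    xs = [(t - t0) ** 2 for t in times]
    ys = [math.log(s / s0) for s in signal]
    # Build the 3x3 normal equations, augmented with the right-hand side.
    pw = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[pw[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    a0, a1, a2 = (m[i][3] / m[i][i] for i in range(3))
    return -a1, 2.0 * a2
```

For a monodisperse sample the taylorgram is a single Gaussian, so the fit is expected to return Γ1 equal to the parameter G of that species and Γ2 close to zero.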
In step 148, the limits Gmin and Gmax of the interval of the values of the parameter G on which the distribution is sought are calculated based on the equivalent log-normal distribution determined from the values of the first- and second-order cumulants Γ1 and Γ2 obtained in step 146, using equations (32) and (33) followed by (38) and (39).
In step 150, the cost function Hα is developed from the first constraint selected by the user in step 144. The constraint term associated with the constraint selected is read from the memory of the computer 5.
In step 152, the discretized expression of the cost function Hα is obtained by subdividing the interval [Gmin, Gmax] determined in step 148 into N sub-intervals.
In step 154, for each value of the Lagrange coefficient α in a group of test values, the minimum of the cost function Hα is determined. To take into account strict constraints according to which the amplitudes are positive and must result in predetermined T and Γ averages with a predetermined deviation, the minimum of the cost function is sought exclusively on the appropriate subspace of the amplitudes P(Gm) that satisfy these strict constraints.
In step 156, the statistical error ν is determined, and the optimal value α0 of the Lagrange coefficient α is selected as the test value which, in step 154, resulted in the distance term χ² closest to this statistical error ν while remaining below it.
In step 158, the group of distributions P(G) sought is the group of those that minimise the cost function Hα for the optimal value α0 of the Lagrange coefficient determined in step 156.
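The selection logic of steps 150 to 156 may be illustrated as follows. This is a minimal sketch and not the implementation of the software 31. It assumes that the cost function takes the form Hα = χ² + α × (constraint term) and that the regularity constraint is measured by the sum of squared second differences of the discretized amplitudes; both forms, and all function names, are assumptions made for illustration. The minimisation itself is not shown.

```python
def regularity_penalty(amplitudes):
    """ASSUMED regularity term: sum of squared second differences of P(G_m)."""
    return sum(
        (amplitudes[m - 1] - 2.0 * amplitudes[m] + amplitudes[m + 1]) ** 2
        for m in range(1, len(amplitudes) - 1)
    )

def cost_function(chi2, alpha, amplitudes):
    """ASSUMED form of the cost function: distance term plus weighted constraint."""
    return chi2 + alpha * regularity_penalty(amplitudes)

def select_alpha(chi2_by_alpha, nu):
    """Pick the test value of alpha whose chi2 is closest to nu from below.

    chi2_by_alpha maps each test value of alpha to the distance term chi2
    reached at the minimum of the cost function for that alpha.
    Returns None when no test value gives chi2 below nu.
    """
    below = {a: c for a, c in chi2_by_alpha.items() if c < nu}
    if not below:
        return None
    return max(below, key=below.get)
```

With test values giving χ² of 0.8, 0.95 and 1.3 for a statistical error ν of 1.0, the sketch retains the coefficient that produced 0.95.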
In step 160, the value related to the size of the particles of the mixture is calculated based on the distribution P(G) obtained in step 158.
Lastly, in step 162, by means of an appropriate transformation, the distributions according to the hydrodynamic radius or molar mass are calculated based on the distribution P(G) obtained in step 158.
If applicable, the various distributions calculated are displayed on the screen of the computer 5. The software 31 includes ‘tools’ allowing the user to carry out the desired calculations on the calculated distributions.
The software 31 thus includes means suited for the execution of each of the steps of the analysis of the experimental Taylor signal.
In one variant, the limits Gmin and Gmax of the interval on which the distribution P(G) is sought are calculated empirically. This consists of determining the normalized Taylor signal s(t)=Ŝ(t)/Ŝ(t0), obtaining its logarithm, and then calculating the derivative |∂ln s/∂x|. The limits of the interval of interest of the parameter G are finally deduced using equations (43) and (44), followed by equations (46) and (47).
In yet another variant, independent of the preceding variant, the T average of the parameter G, GT, is calculated by integrating the experimental Taylor signal Ŝ(t) using equation (16), and the Γ average of the parameter G, GΓ, is calculated based on the determination of the first-order cumulant resulting from the breakdown of the experimental Taylor signal into cumulants. The limits Gmin and Gmax of the interval on which the size distribution is to be sought are calculated based on the T and Γ averages of the parameter G according to equations (30), (31), and (38), (39). This variant of the method also allows for a constraint term based on one or the other of these averages to be integrated into the cost function.
Adjustment by constrained regularization may be advantageously implemented using several experimental Taylor signals obtained by repeating identical experiments on a group of samples of a single mixture.
Although each repetition may be analyzed independently, and the amplitude distributions obtained may be averaged, it has been found to be more robust to accumulate the various individual Taylor signals into a single global Taylor signal including a number of experimental points equal to the sum of the experimental points of each individual Taylor signal. The amplitude distribution is then sought on the global Taylor signal by applying the constrained regularization algorithm.
This results in an amplitude distribution P(G) that most closely observes the constraint imposed, e.g., the distribution P(G) is more regular. This also allows the uncertainties and imprecisions affecting the acquisition of the individual Taylor signals to be taken into account.
During this operation, if the reference time t0 is not strictly identical from one experimental Taylor signal to another, the time coordinates are translated such that all experimental Taylor signals have exactly the same reference time t0.
The correction of the baseline, followed by the normalization of each experimental Taylor signal, may also be necessary before processing.
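The accumulation of several individual Taylor signals into a single global Taylor signal may be sketched as follows. The sketch is illustrative only: it assumes that the baseline of each signal has already been corrected, that each signal is normalized by its maximum value, and that the reference time t0 of each run is known. The function name `merge_taylor_signals` and the data layout are hypothetical.

```python
def merge_taylor_signals(runs, t0_ref):
    """Accumulate several taylorgrams into one global signal.

    runs   : list of (times, values, t0) tuples, one per repeated experiment;
             baseline correction is ASSUMED to have been applied already
    t0_ref : common reference time onto which every run is translated
    Returns a time-sorted list of (time, normalized value) points whose
    length is the sum of the numbers of points of the individual runs.
    """
    merged = []
    for times, values, t0 in runs:
        peak = max(values)    # normalization by the peak height (assumption)
        shift = t0_ref - t0   # translation so that all runs share the same t0
        merged.extend((t + shift, v / peak) for t, v in zip(times, values))
    merged.sort()
    return merged
```

The global signal so obtained may then be passed to the constrained regularization algorithm in the same way as a single experimental Taylor signal.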
The software 31 includes a menu allowing users to process several experimental Taylor signals before analysing the global Taylor signal thus obtained.
The method just discussed allows the size distribution of a mixture of species, as well as the concentrations of these species in the mixture, to be obtained automatically and in real time, no matter what the polydispersity of the sample analyzed is, i.e., the number of species included in this sample and the respective concentrations thereof.
The fields of application of the device and method described above include the size characterization of polymers, colloids, latex nanomaterials, emulsions, liposomes, vesicles, and molecules or biomolecules in general. One important field of application is the study of the stability/degradation/aggregation of proteins for the pharmaceuticals industry.
The advantages of the characterization of a sample by means of the Taylor dispersion phenomenon are known to persons skilled in the art: Low volume of the sample to be injected into the capillary, no need to calibrate the experimental device, use of an extremely simple experimental device, technique that is particularly well adapted to size measurements of particles smaller than a few nanometres, a signal that is generally sensitive to mass concentration, etc.
Bare silica capillary: Rc=50 μm with a distance between the injection and detection sections of 30 cm.
Temperature: T=293 K.
Eluent: Sodium borate buffer 80 mM, pH 9.2.
Viscosity of the eluent: η=8.9×10⁻⁴ Pa·s.
Sample: Polystyrene sulphonate (PSS) 0.5 g/l.
Injection: 0.3 psi (20 mbar), 9 s, i.e., an injected volume of 8 nl (total capillary volume 589 nl).
UV detection at a wavelength of 200 nm.
where Mw is the weight-average molar mass, Mp the molar mass at the apex of the chromatographic peak, and Mw/Mn the polydispersity index. The average molar masses and the characteristics of the distribution were given by the supplier, who determines them by size-exclusion chromatography (SEC) with calibration using polymer standards of the same chemical nature (PSS).
Furthermore, only the left part of the experimental Taylor signal is shown. Taylorgrams are generally symmetrical. However, where adsorption to the capillary surface occurs, the right part of the signal, corresponding to the times following the time t0, may not be exactly symmetrical to the left part of the signal, corresponding to the times preceding the time t0. In this case, it is preferable to focus the processing of the data on the left part of the experimental Taylor signal. Advantageously, the method described above only takes into account the left part of the signal in order to limit the influence of these possible parasitic phenomena.
adjustment between the logarithm of the normalized experimental Taylor signal Ŝ(t)(Data) and the cumulant development (Fit).
Table 1 shows the various averages of the diffusion coefficient D obtained, on the one hand, directly by breakdown into cumulants (Γ average, column 2) or by integrating the taylorgram (T average, column 3), and, on the other hand, by running the software 31 (Γ average, column 4, and T average, column 5). It should be noted that, in this example, the averages measured directly on the experimental Taylor signal are not used to constrain the deconvolution of the signal, and only a loose regularity constraint and a strict positivity constraint were used.
DΓ (a) | DT (b) | DΓ (c) | DT (c)
a based on the breakdown into cumulants (equation 25)
b based on the integration of the taylorgram (equation 17)
c based on the distribution given by the software 31
Overall, the results show great coherence, and the software 31 results in a solution in which the T and Γ averages (columns 4 and 5) are close to the experimental values (columns 2 and 3).
Table 2 shows the values of τmin and τmax determined based on the various proposed approaches, i.e., the empirical approach (columns 4 and 5), breakdown into cumulants (columns 6 and 7) based on the cumulants Γ1 and Γ2 (columns 2 and 3), and the approach using the T and Γ averages of the diffusion coefficient (columns 8 and 9, based on the first-order cumulant and the integration of the taylorgram).
a empirical approach based on equations (43) and (44)
b based on the cumulant breakdown (equations (32-33), (38-39), and (46-47))
c based on the T and Γ averages of the diffusion coefficient (equations (17), (25), (30-31), (38-39), and (46-47)).
The orders of magnitude of τmin and τmax are highly consistent no matter what method is considered.
Table 3 compares the average hydrodynamic radius values obtained by breakdown into cumulants (Γ average, column 2), by running the software 31 to determine the distribution P(G) followed by integration to obtain the Γ and T averages (columns 3 and 6), by the reference method of size-exclusion chromatography for the average in question (columns 4 and 7), and by direct integration of the taylorgram over the entire signal (column 5). For the same average (columns 2-4, on the one hand, and columns 5-7, on the other hand), the results are homogeneous for all samples considered.
This shows high consistency for all samples for each group of averages in question.
a obtained based on the distribution obtained by SEC.
b obtained by integrating the taylorgram (based on the variance of the taylorgram).
c obtained by integrating the weight distribution of the diffusion coefficients obtained by the software 31 (minimisation on D).
d obtained based on the weight distribution of the hydrodynamic radii originating from the SEC.
Experimentally, the peak time t0 of the experimental Taylor signal Ŝ(t) is not known with precision due to the measurement noise.
The time at peak t0 affects both a cumulant analysis and the determination of the size distribution obtained by constrained regularization.
Additionally, the cumulant method is based on a series expansion for (t−t0)→0. From an experimental standpoint, the range of times t used for the analysis must be chosen carefully: if the interval is very small, the result will be substantially affected by measurement noise. If, on the other hand, too wide an interval is considered, the contribution of the higher-order terms in (t−t0), which are ignored in the cumulant method, will be significant.
A step for determining a peak time t0, as well as an optimal range of times suitable for a cumulant analysis, is described below.
In a first sub-step, a first estimate t0,guess of the peak time is obtained, e.g., by considering the time for which Ŝ(t) is at its maximum or by adjusting the peak of Ŝ(t) by means of a parabolic or Gaussian function.
In a second sub-step, a list of N peak times to be tested t0,i is established, with the natural integer i varying between 1 and N, where the times t0,i are around t0,guess and regularly spaced by a constant time increment, with:
$$t_{0,1} < t_{0,2} < \dots < t_{0,N};$$
$$t_{0,i+1} = t_{0,i} + dt; \text{ and}$$
$$t_{0,1} = t_{0,\mathrm{guess}} - \Delta t, \qquad t_{0,N} = t_{0,\mathrm{guess}} + \Delta t;$$
where
dt is the time increment between two consecutive tested peak times; and
Δt is a time interval typically on the order of t0,guess/50.
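The list of peak times to be tested in the second sub-step may be generated as sketched below. The sketch is illustrative only; the function name `candidate_peak_times` is hypothetical, and the default half-width Δt = t0,guess/50 is taken from the order of magnitude indicated above.

```python
def candidate_peak_times(t0_guess, n, delta_t=None):
    """Return n regularly spaced peak times centred on t0_guess.

    The first and last values are t0_guess - delta_t and t0_guess + delta_t,
    so the constant increment is dt = 2 * delta_t / (n - 1). Requires n >= 2.
    """
    if delta_t is None:
        delta_t = t0_guess / 50.0  # typical order of magnitude of the half-width
    dt = 2.0 * delta_t / (n - 1)
    return [t0_guess - delta_t + i * dt for i in range(n)]
```

For example, with t0,guess = 100 s and N = 5, the tested peak times are 98, 99, 100, 101 and 102 s.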
In a third sub-step, for each of the peak times to be tested t0,i, a series of cumulant analyses is carried out taking into account various ranges of time of differing lengths.
The time range is defined, e.g., by a cutoff level of the signal Ŝ(t). For example, for a cutoff level of 0.1, the range of times t considered is that for which Ŝ(t)>0.1×Ŝ(t0,i). The values of the first and second cumulants resulting from the fit on each range of time are noted.
In a fourth sub-step, the optimal peak time t0 is determined as being between the peak times for which the first cumulant Γ1 diverges towards positive values when the cutoff level increases, and those for which the first cumulant Γ1 diverges towards negative values when the cutoff level increases.
When the curve of the first cumulant Γ1 as a function of the cutoff level is plotted for each peak time to be tested t0,i, the optimal peak time t0 is that for which the curve is located between the upward concave curves and the downward concave curves. This curve varies less than the others.
The choice of optimal peak time is made by visual analysis of the aforementioned graphs or automatically, e.g., based on the sign of the second numerical derivative of the first cumulant Γ1 as a function of the cutoff level.
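One possible way of automating this choice is sketched below. It is an illustrative assumption rather than the procedure of the software 31: the curvature of each curve Γ1(cutoff) is estimated by the mean second finite difference, the cutoff levels are assumed to be regularly spaced, and the peak time retained is the one whose curve has the smallest absolute curvature, i.e. the curve lying at the transition between positive and negative curvature. The function names are hypothetical.

```python
def mean_curvature(values):
    """Mean second finite difference of a curve sampled at regular steps.

    Requires at least three samples. A positive result indicates an upward
    concave curve, a negative result a downward concave curve.
    """
    second = [
        values[i - 1] - 2.0 * values[i] + values[i + 1]
        for i in range(1, len(values) - 1)
    ]
    return sum(second) / len(second)

def pick_peak_time(gamma1_curves):
    """Select the tested peak time whose curve is closest to zero curvature.

    gamma1_curves maps each tested peak time t0_i to the list of first
    cumulant values obtained at increasing, regularly spaced cutoff levels.
    """
    return min(
        gamma1_curves,
        key=lambda t0: abs(mean_curvature(gamma1_curves[t0])),
    )
```

In practice, the result of such an automatic selection may be checked against the visual analysis of the graphs.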
Alternatively or optionally, it is possible to do the same with the second cumulant Γ2 and/or the square of the ratio of the second cumulant to the first cumulant. By doing so simultaneously for the first cumulant, the second cumulant, and the square of the ratio of the second cumulant to the first cumulant, the choice of peak time may be made more reliable.
In a fifth sub-step, the optimal cutoff value is determined as the highest value reached before the data show a significant deviation from their general tendency due to the influence of measurement noise at very high cutoff levels.
Number | Date | Country | Kind |
---|---|---|---|
1256050 | Jun 2012 | FR | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2013/063432 | 6/26/2013 | WO | 00 |