The present invention relates to a model function fitting device and a model function fitting method for fitting a model function to a chromatogram.
Various model functions shown in Non-Patent Document 1 are suggested for a quantitative analysis and a qualitative analysis of a waveform measured by a chromatograph. In regard to application to a peak separation algorithm, a model function is required to be able to fit a measured waveform with high accuracy and is required to be less likely to take a shape different from a peak waveform of a chromatogram with respect to any parameter. In order to meet these requirements, an EMG function and a BEMG function described in Non-Patent Document 2, for example, are used.
By using the EMG function or the BEMG function, it is possible to separate peaks from many measured waveforms. However, if there is a model function with which fitting can be performed with higher accuracy, it is highly convenient for a user.
An object of the present invention is to provide a model function with which fitting can be performed with high accuracy.
A model function fitting device according to one aspect of the present invention includes an acquirer that acquires a chromatogram, and a fitter that fits a model function to the chromatogram, while applying, to the model function, a constraint that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions being located at both sides of the first portion and being approximatable to a linear function.
With the present invention, it is possible to provide a model function with which fitting can be performed with high accuracy.
A model function fitting device and a model function fitting method according to embodiments of the present invention will now be described with reference to the attached drawings.
The model function fitting device 1 of the present embodiment is constituted by a personal computer. As shown in
The CPU 11 controls the model function fitting device 1 as a whole. The RAM 12 is used as a work area for execution of a program by the CPU 11. Various data, a program and the like are stored in the ROM 13. The operation unit 14 receives an input operation performed by a user. The operation unit 14 includes a keyboard, a mouse, etc. The display 15 displays information such as a result of fitting. The storage device 16 is a storage medium such as a hard disc. A program P1 and the measurement data MD are stored in the storage device 16. The program P1 executes a process of acquiring a chromatogram and a process of fitting a model function to a chromatogram. The communication interface 17 is an interface that communicates with another computer through wireless or wired communication. The device interface 18 is an interface that accesses a storage medium 19 such as a CD, a DVD or a semiconductor memory.
The acquirer 21 receives the measurement data MD. The acquirer 21 receives the measurement data MD from another computer, an analysis device or the like via the communication interface 17, for example. Alternatively, the acquirer 21 receives the measurement data MD stored in the storage medium 19 via the device interface 18.
The fitter 22 executes the process of fitting a model function to a chromatogram. The fitter 22 of the present embodiment fits a model function to a chromatogram, while applying a constraint to the model function that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions being at both sides of the first portion and being approximatable to a linear function.
The outputter 24 causes the display 15 to display a result of fitting performed by the fitter 22 and information in regard to a fitted model function.
The program P1 is stored in the storage device 16, by way of example. In another embodiment, the program P1 may be provided in the form of being stored in the storage medium 19. The CPU 11 may access the storage medium 19 via the device interface 18 and may store the program P1 stored in the storage medium 19, in the storage device 16 or the ROM 13. Alternatively, the CPU 11 may access the storage medium 19 via the device interface 18 and may execute the program P1 stored in the storage medium 19.
Next, a model function fitting method according to a first embodiment will be described.
Although an effective constraint can be applied to the model function by considering, as the model function described by a logarithmic scale, the sequence L[t] in which a second order differential is non-positive, the sequence L[t] has a large number of parameters. That is, in a case in which the sequence L[t] is used as the model function, because “the number of parameters of the model function” = “the number of data points of the measurement data MD,” the stability of optimization calculation is degraded. Further, in a case in which the sequence L[t] is used as the model function, the model function may take a shape unlike that of a chromatogram because of overfitting to the measurement data MD.
As such, preferably, the fitter 22 fits the model function to the chromatogram C1 by using a Generalized Additive Model (GAM). In the present embodiment, a smoothing spline model is used as a Generalized Additive Model. That is, the sequence L[t] in which a second order differential is non-positive is applied as the model function described by a logarithmic scale, and the Generalized Additive Model is applied to the model function by smoothing spline. In the present specification, a method of applying a Generalized Additive Model to the constraint that a second order differential is non-positive is referred to as a DGAM.
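The constraint that a second order differential of the model function described by a logarithmic scale is non-positive can be illustrated with a short check. The following is a minimal sketch only: the function name is_log_concave, the tolerance and the two test waveforms are assumptions introduced for illustration and are not part of the embodiment.

```python
import numpy as np

def is_log_concave(intensity, tol=1e-9):
    """Return True if the second-order difference of log(intensity) is non-positive."""
    log_curve = np.log(intensity)
    second_diff = np.diff(log_curve, n=2)   # discrete second order differential
    return bool(np.all(second_diff <= tol))

t = np.linspace(-5.0, 5.0, 201)
gaussian = np.exp(-t**2)          # log scale: -t^2, convex upward everywhere
lorentzian = 1.0 / (1.0 + t**2)   # log scale: -log(1 + t^2), not convex upward in the tails

print(is_log_concave(gaussian))
print(is_log_concave(lorentzian))
```

A Gaussian peak satisfies the constraint, whereas a peak with heavier tails does not, which is the kind of shape the constraint excludes.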
In this manner, in the present embodiment, the fitter 22 fits the model function to the chromatogram utilizing the Generalized Additive Model. In the Generalized Additive Model, parameters are arranged in a chronological order. It is possible to add a restriction of a convex function while performing smoothing by applying the constraint that a second order differential of a parameter is non-positive. Further, as described above, as compared to a case in which the sequence L[t] is used as the model function, it is possible to reduce the number of parameters and reduce an amount of calculation. In a case in which a smoothing spline model is utilized, the number of parameters can be reduced to the number of peaks of splines. In a case in which a least squares method is used, the number of parameters is not a major issue. However, in a case in which Bayesian inference is performed, for regression, using a Markov chain Monte Carlo method (MCMC), a reduction in number of parameters is a great advantage in calculation.
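The reduction in the number of parameters can be sketched as follows. This sketch does not use a smoothing spline; as a simplifying assumption it substitutes piecewise-linear interpolation between a small number of knots, and it enforces the non-positive second order differential by construction. The function name concave_log_curve and the parameters theta, v0, s0 and knots are hypothetical.

```python
import numpy as np

def concave_log_curve(theta, v0, s0, knots, t):
    """Build a log-scale curve on grid t that is convex upward by construction.

    theta : K-2 unconstrained values; -exp(theta) are the non-positive slope changes
    v0    : value of the curve at the first knot
    s0    : slope of the first segment
    knots : K increasing knot positions (an assumption of this sketch)
    """
    h = np.diff(knots)                                    # K-1 segment widths
    slopes = s0 + np.concatenate(([0.0], np.cumsum(-np.exp(theta))))  # non-increasing
    values = v0 + np.concatenate(([0.0], np.cumsum(slopes * h)))      # K knot values
    return np.interp(t, knots, values)                    # piecewise-linear interpolation

knots = np.linspace(-4.0, 4.0, 9)     # 9 knots
theta = np.zeros(7)                   # 7 shape parameters
t = np.linspace(-4.0, 4.0, 401)       # 401 data points
log_curve = concave_log_curve(theta, v0=-8.0, s0=4.0, knots=knots, t=t)
model = np.exp(log_curve)             # model function on the linear scale

n_params = theta.size + 2             # theta plus v0 and s0
print(n_params, t.size)
```

In this sketch the curve is described by 9 parameters instead of one parameter per data point, and any value of theta yields a curve whose second order difference is non-positive, so an optimizer or an MCMC sampler can work on unconstrained parameters.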
In regard to fitting of a model function, a unimodal restriction is used in a peak separation algorithm such as MCR-ALS as a restriction method of suppressing an error. The DGAM in the present embodiment can apply a strong constraint to a model function even compared with a unimodal restriction.
By using the DGAM of the present embodiment for calculating an area of one or a plurality of peaks included in a chromatogram, it is possible to accurately perform a quantitative analysis and a qualitative analysis of a sample. In a case in which a model function is used for a chromatogram separation algorithm, its accuracy of approximation is important. In management of impurities of a pharmaceutical product, it is necessary to manage an impurity peak of a very minute amount (0.05%, for example) compared to a main component peak. In such application, as a matter of course, an error of a model function used for fitting is required to be smaller than 0.05%. However, a model function such as the BEMG has an error of about 0.1% as described with reference to
Next, a model function fitting method according to a second embodiment will be described. The formula 1 is a formula expressing a model function exp(g (x, a, b)) according to the second embodiment. In the formula 1, x is a retention time obtained when a peak position and a peak width are normalized. That is, letting a peak position be u and letting a peak width be s, (x−u)/s is input to x. Further, in the formula 1, a, b are tailing parameters.
In the formula 1, as shown in
Further, a model function exp(h(x, a, b)) that is normalized in regard to a peak position, a peak height and a peak width is expressed by the formula 2. In the formula 2, β is a beta function.
The EMLC function of the present embodiment has less collinearity of parameters, and can enhance the efficiency of Bayesian inference and optimization.
The simulation shown in
How to obtain the EMLC function expressed by the formula 1 and the normalized EMLC function expressed by the formula 2 will be described. In f(x, a, b) expressed by the formula 3, a constant is added to the formula obtained when a sigmoid function is multiplied by a constant.
The result of integration of f(x, a, b) expressed by the formula 3 is g(x, a, b) expressed by the formula 4. That is, the formula 4 expresses g(x, a, b) in the formula 1. That is, g(x, a, b) is the logarithmic function of the EMLC function and has an upwardly convex shape.
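The relationship between f(x, a, b) and g(x, a, b) can be sketched numerically. Since the formulas 3 and 4 are not reproduced here, the specific forms below are assumptions: f is taken as a − (a + b)·sigmoid(x), which is a sigmoid function multiplied by a constant with a constant added, and g is taken as its antiderivative a·x − (a + b)·log(1 + exp(x)).

```python
import numpy as np

def f(x, a, b):
    """Assumed form: constant multiple of a sigmoid plus a constant.

    Tends to +a as x -> -inf and to -b as x -> +inf.
    """
    sigmoid = 0.5 * (1.0 + np.tanh(0.5 * x))   # overflow-free sigmoid
    return a - (a + b) * sigmoid

def g(x, a, b):
    """Antiderivative of the assumed f, evaluated without overflow."""
    return a * x - (a + b) * np.logaddexp(0.0, x)   # logaddexp(0, x) = log(1 + exp(x))

a, b = 2.0, 0.5
x = np.linspace(-30.0, 30.0, 6001)

# The numerical derivative of g should reproduce f.
max_derivative_error = float(np.max(np.abs(np.gradient(g(x, a, b), x) - f(x, a, b))))

# g is convex upward: its second-order difference is non-positive.
convex_upward = bool(np.all(np.diff(g(x, a, b), n=2) <= 1e-9))

# Far from the peak the slope is constant, so g is approximately linear there.
left_slope, right_slope = float(f(-30.0, a, b)), float(f(30.0, a, b))
print(max_derivative_error, convex_upward, left_slope, right_slope)
```

Under these assumptions, g is approximately quadratic around its maximum and approximately linear with slopes a and −b on the two sides, which corresponds to the first portion and the second portions of the constraint.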
It is more preferable that a tailing parameter or a leading parameter does not have a significant influence on a peak feature such as a peak position, a peak height, a peak area or a peak width. Such conditions reduce collinearity in fitting to a chromatogram having a standard tailing shape or a standard leading shape. Therefore, since a peak position and a peak height of exp(g(x, a, b)) are analytically obtained, it is desirable to use a function gg(x, a, b) with the normalized peak position and the normalized peak height. The formula 5 expresses the function gg(x, a, b) obtained when g(x, a, b) is normalized. Instead of a peak position or a peak height, another peak feature such as the center of gravity may be used for normalization.
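Normalization of the peak position and the peak height can be sketched as follows. The form of g is an assumption (a·x − (a + b)·log(1 + exp(x))), because the formulas 4 and 5 are not reproduced here; under that assumption the peak position is log(a/b) and the logarithm of the peak height has a closed form, both of which are used below.

```python
import math
import numpy as np

def g(x, a, b):
    """Assumed log-scale model function (not taken from the formula 4)."""
    return a * x - (a + b) * np.logaddexp(0.0, x)

def gg(x, a, b):
    """Shift g so that the peak is at x = 0 and the log-scale peak height is 0."""
    x_peak = math.log(a / b)   # root of g'(x) = a - (a + b) * sigmoid(x)
    g_peak = a * math.log(a) + b * math.log(b) - (a + b) * math.log(a + b)
    return g(x + x_peak, a, b) - g_peak

x = np.linspace(-10.0, 10.0, 2001)
for a, b in [(2.0, 0.5), (1.0, 1.0), (0.5, 3.0)]:
    curve = gg(x, a, b)
    print(a, b, x[np.argmax(curve)], float(gg(0.0, a, b)))
```

Whatever values the tailing parameters a and b take in this sketch, the maximum stays at x = 0 with height exp(0) = 1, so these parameters no longer move the peak position or the peak height.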
It has been found empirically that an area Ns(a, b) of the function gg(x, a, b) is approximated by the formula 6.
In the formula 5, correction is made by multiplication of x by Ns(a, b). Thus, a peak width is corrected so that an area is constant for normalization of a peak shape. The function after normalization is expressed by the formula 7. Instead of an area, correction may be made using a formula that is empirically obtained in regard to a half value width or a formula obtained by machine learning. The formula 7 expresses h(x, a, b) in the formula 2. More preferably, in order to prevent a loss of significance at the time of floating-point arithmetic, log β obtained when a beta function β is modified can be used as shown in the formula 6.
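The width correction by the area can be sketched as follows. The forms used here are assumptions rather than the formulas 5 to 7: g is taken as a·x − (a + b)·log(1 + exp(x)), for which the area of exp(g) equals the beta function of a and b exactly, and the correction factor is taken to be the area of the height-normalized peak. The logarithm of the beta function is computed with math.lgamma to stay in the logarithmic domain.

```python
import math
import numpy as np

def g(x, a, b):
    """Assumed log-scale model function."""
    return a * x - (a + b) * np.logaddexp(0.0, x)

def log_peak_height(a, b):
    return a * math.log(a) + b * math.log(b) - (a + b) * math.log(a + b)

def gg(x, a, b):
    """g with normalized peak position and peak height."""
    return g(x + math.log(a / b), a, b) - log_peak_height(a, b)

def log_beta(a, b):
    """log of the beta function, computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_area(a, b):
    """log of the area of exp(gg) under the assumed form."""
    return log_beta(a, b) - log_peak_height(a, b)

def h(x, a, b):
    """Rescale x by the area so that exp(h) has unit area."""
    return gg(x * math.exp(log_area(a, b)), a, b)

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

a, b = 2.0, 0.5
x_wide = np.linspace(-60.0, 120.0, 18001)
numeric_area = trapezoid(np.exp(gg(x_wide, a, b)), x_wide)
closed_form_area = math.exp(log_area(a, b))

x = np.linspace(-20.0, 40.0, 30001)
normalized_area = trapezoid(np.exp(h(x, a, b)), x)
print(numeric_area, closed_form_area, normalized_area)
```

In this sketch the numerically integrated area agrees with the closed form, and the rescaled function has an area of 1 regardless of the tailing parameters.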
In a case in which a model function such as the EMG or the BEMG is used for function optimization or Bayesian inference, a peak position or a peak width varies depending on the magnitude of a tailing parameter, for example. With these model functions, since the relationship between parameters is strong, there is a problem that a function cannot be fitted to a target shape unless a plurality of parameters are largely changed, although the change amount of a square error is small. In particular, in a case in which the strength of the relationship between parameters is different depending on the state of the parameters, it is difficult to set a momentum term in optimization based on a gradient method, etc. It is empirically known that, when a method using a derivative of a model function such as function optimization or Bayesian inference is used with respect to the BEMG function in a case in which two peaks of a chromatogram are adjacent to each other, or a case in which a noise is large, etc., the BEMG function is likely to fall into a local solution. In contrast, the EMLC function in the present embodiment is convenient for function optimization or Bayesian inference because the collinearity between parameters is suppressed.
Further, with a model function such as the EMG or the BEMG, parameters are not expressed in a manner that is easy for a human to interpret, such as a peak position or a half value width. Therefore, there is a problem that it is difficult for a user to understand a model function. In contrast, the EMLC function of the present embodiment is modified by normalization into a format that is easy for the user to understand. Further, a model function such as the EMG or the BEMG includes multiplication of exp and erfc. In the calculation of a peak tail portion, a value close to 0 and a value close to ∞ are multiplied, and a minute value is obtained as a result. Therefore, since degradation of accuracy such as a loss of significance occurs, it is necessary to prepare a function separately for a tail portion, and calculation is difficult. In contrast, by normalization as described above, the EMLC function of the present embodiment can prevent a loss of significance caused by calculation.
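The numerical difficulty of multiplying a very large value by a very small value, and the benefit of evaluating the model function on a logarithmic scale, can be sketched as follows. The example does not use the EMG or the BEMG; as an assumption it uses the form exp(a·x) / (1 + exp(x))^(a + b), evaluated once directly and once through its logarithm.

```python
import numpy as np

def model_naive(x, a, b):
    """Direct evaluation: numerator and denominator both overflow for large x."""
    with np.errstate(over="ignore", invalid="ignore"):
        return np.exp(a * x) / (1.0 + np.exp(x)) ** (a + b)

def model_log_domain(x, a, b):
    """Evaluate the logarithm first, then exponentiate once."""
    return np.exp(a * x - (a + b) * np.logaddexp(0.0, x))

a, b = 2.0, 0.5
x = np.array([0.0, 50.0, 800.0])
naive = model_naive(x, a, b)
stable = model_log_domain(x, a, b)
print(naive)
print(stable)
```

In the tail at x = 800 the direct evaluation yields an undefined value because infinity is divided by infinity, whereas the logarithmic evaluation returns a small finite value; at moderate x the two agree.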
Next, a model function fitting method according to a third embodiment will be described.
It is difficult to perform accurate fitting for such a chromatogram C2 using the method described in the first and second embodiments. Therefore, the fitter 22 fits a function, that is obtained when a conversion function smoother than an exponential function is applied to a function having the non-positive second order differential (hereinafter referred to as an original function), as a model function, to the chromatogram C2.
The fitter 22 uses a composite function of an exponential function and a gamma correction function, for example, as a conversion function to be applied to an original function. Letting an original function be B(t), letting a gamma correction function be G, and letting an exponential function be exp, the conversion function is expressed by exp(G(*)). The formula 8 is an example of the gamma correction function used for the conversion function.
In the formula (8), a parameter q is a constant that is equal to or larger than 0. The larger the value of the parameter q is, the smaller a chromatogram intensity is in a region in which effect of gamma correction can be obtained. The parameter p is usually a value equal to or smaller than 1, and the range of positive value allowing a deviation is set. A parameter r is a parameter for adjusting a peak width.
In
As another example of a conversion function, a polynomial can be used. The formula 9 is an example in which a polynomial is used as a conversion function Q. The value of an original function is input to x in the formula 9. For example, in a case in which the original function B(t) = −t^2, a chromatogram to which a conversion function is applied is obtained when −t^2 is input to x in the formula 9.
In
A general formula as shown in the formula 10, for example, can be used for a conversion function Q(x) using a polynomial. That is, the conversion function Q(x) is expressed by a function including an n-th order polynomial in a denominator.
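A conversion function having a polynomial in a denominator can be sketched as follows. Since the formulas 9 and 10 are not reproduced here, the specific polynomial is an assumption: the denominator 1 + y + y^2/2 with y = −x is the truncated series of exp(y), so that Q(x) decays more slowly than exp(x) for x ≤ 0. The function name conversion_q and the coefficient tuple are hypothetical.

```python
import numpy as np

def conversion_q(x, coeffs=(1.0, 1.0, 0.5)):
    """Q(x) = 1 / sum_k coeffs[k] * (-x)^k for x <= 0 (assumed example form)."""
    y = -np.asarray(x, dtype=float)                      # y >= 0 when x <= 0
    denom = sum(c * y**k for k, c in enumerate(coeffs))  # n-th order polynomial
    return 1.0 / denom

t = np.linspace(-6.0, 6.0, 241)
original = -t**2                          # original function B(t) = -t^2
chrom_exp = np.exp(original)              # conversion by an exponential function
chrom_poly = conversion_q(original)       # conversion by the polynomial form

second_diff_exp = np.diff(np.log(chrom_exp), n=2)
second_diff_poly = np.diff(np.log(chrom_poly), n=2)
print(second_diff_exp.max(), second_diff_poly.max())
```

With the exponential function, the second order difference on the logarithmic scale is non-positive everywhere. With the polynomial form of this sketch, it becomes positive in the tails while the peak remains unimodal, which is the partial deviation from the constraint that the conversion function is intended to allow.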
As another example of a conversion function, a cosh function can also be used. The formula 11 is an example in which a cosh function is used as a conversion function Q. In the formula 11, u is a parameter for adjusting a peak width. In this manner, the conversion function Q(x) is expressed by a function including a cosh function in a denominator.
In
As described above, a gamma correction function, a polynomial or a cosh function is utilized as an example of a conversion function. These functions are examples, and a monotonic function having a gentler slope than that of an exponential function can be used as a conversion function.
Next, a model function fitting method according to a fourth embodiment will be described. Similarly to the third embodiment, although the model function according to the fourth embodiment has a constraint that the model function described by a logarithmic scale is convex upward in many portions, a deviation from the constraint is allowed in some areas. In the fourth embodiment, the deviation from the constraint is allowed by distortion of time. The fitter 22 fits a model function to a chromatogram by applying a GAM model to a time distortion function.
A function that distorts time t is expressed by m(t). For example, when time distortion by m(t) is applied to a chromatogram expressed by exp(−t^2), the chromatogram is expressed by exp(−m(t)^2). For example, a logarithmic chromatogram LC9 as shown in
Here, the constraint that a second order derivative is non-positive may be applied to a logarithmic chromatogram to which the time distortion function m(t) is applied, or may be applied to the time distortion function m(t). It is considered that the time distortion function m(t) is applied to a chromatogram expressed by exp(−t^2), for example. In this case, a logarithmic chromatogram is −m(t)^2. The constraint that a second order derivative of the logarithmic chromatogram −m(t)^2 is non-positive is expressed by the formula 12.
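As a worked step, and assuming only that m(t) is twice differentiable, the chain rule shows what the constraint on the logarithmic chromatogram requires of m(t). The expression below is derived here for illustration and is not a reproduction of the formula 12.

```latex
\frac{d^{2}}{dt^{2}}\left[-m(t)^{2}\right]
  = -2\left[m'(t)^{2} + m(t)\,m''(t)\right] \le 0
\quad\Longleftrightarrow\quad
m'(t)^{2} + m(t)\,m''(t) \ge 0
```

The condition couples m(t), its first order derivative and its second order derivative at every point, which is consistent with a constraint of this kind being costly to enforce directly during optimization.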
Although the constraint expressed by the formula 12 may be implemented as an optimization algorithm, it increases an amount of calculation. As such, in order to reduce the amount of calculation, it is considered to apply a constraint to the time distortion function m(t). In regard to the time distortion function m(t), the farther an area is from the center of a peak, the smaller the slope, that is, the smaller the value of a first order derivative of m(t). By using this constraint, it is possible to apply the constraint equivalent to a unimodal restriction. This constraint is expressed by the formula 13 with successively arranged feature points as tn.
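The constraint on the slope of the time distortion function can be sketched as follows. The bounds of 0 and 1, the use of finite differences between successive feature points, and the example function m(t) = arcsinh(t) are assumptions introduced for illustration; the formula 13 itself is not reproduced here.

```python
import numpy as np

def satisfies_slope_constraint(t_feat, m_feat, center, lower=0.0, upper=1.0):
    """Check that the slope of m between feature points lies in [lower, upper]
    and does not increase with distance from the peak center."""
    slopes = np.diff(m_feat) / np.diff(t_feat)
    mids = 0.5 * (t_feat[1:] + t_feat[:-1])
    left = slopes[mids < center]      # should not decrease toward the center
    right = slopes[mids >= center]    # should not increase away from the center
    in_range = np.all((slopes >= lower) & (slopes <= upper))
    monotone = np.all(np.diff(left) >= 0) and np.all(np.diff(right) <= 0)
    return bool(in_range and monotone)

t_feat = np.linspace(-5.0, 5.0, 21)            # feature points t_n
m_good = np.arcsinh(t_feat)                    # slope 1/sqrt(1 + t^2) shrinks away from 0
m_bad = t_feat + 0.5 * np.sin(3.0 * t_feat)    # slope oscillates outside [0, 1]

ok_good = satisfies_slope_constraint(t_feat, m_good, center=0.0)
ok_bad = satisfies_slope_constraint(t_feat, m_bad, center=0.0)

chrom = np.exp(-m_good**2)                     # distorted chromatogram exp(-m(t)^2)
print(ok_good, ok_bad, t_feat[np.argmax(chrom)])
```

Because the accepted m(t) in this sketch is monotonic and passes through 0 at the peak center, exp(−m(t)^2) has a single maximum, so the slope constraint acts like a unimodal restriction while only first differences of m need to be checked.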
Although the lower limit is 0 for the formula 13, the lower limit of a model function of an actual chromatogram or a function using the GAM model satisfying a condition that a second order derivative is non-positive is not 0, and is empirically within a certain range of values. Therefore, the lower limit may be set to an empirically obtained value that is larger than 0 and smaller than 1. Further, in a case in which the GAM model is used, only a function defined by a spline can be expressed. Therefore, a minute systematic error remains. Therefore, even in a case in which fitting is simply performed on a waveform without tailing, vibration of coefficient may occur as shown in
In this manner, although the model function of the fourth embodiment has the constraint that a model function described by a logarithmic scale is convex upward in many portions, a deviation from the constraint is allowed in some areas by distortion of time of an original function. This enables fitting of a model function with higher accuracy.
Next, in the step S2, the fitter 22 fits a model function to a chromatogram, while applying a constraint to the model function that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions that are located at both sides of the first portion and are approximatable to a linear function. In the step S2, in the first embodiment, a model function to which a constraint that a second order differential is non-positive is applied is used. In the step S2, in the second embodiment, a function obtained by integrating the sum of a constant and a constant multiple of a sigmoid function is used as the model function described by a logarithmic scale.
Next, in the step S12, the fitter 22 fits a model function to a chromatogram, while applying a constraint to the model function that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions that are located at both sides of the first portion and are approximatable to a linear function. In the step S12, in the third embodiment, the fitter 22 fits a model function, obtained when a conversion function smoother than an exponential function is applied to an original function having the non-positive second order differential, to a chromatogram. In the step S12, in the fourth embodiment, the fitter 22 fits a model function, being allowed to deviate from a constraint that a second order differential is non-positive by distortion of time with respect to an original function having the non-positive second order differential, to a chromatogram.
In the first embodiment, a smoothing spline model is utilized as a Generalized Additive Model. As a modified example of the first embodiment, a Bezier function, a Gaussian function or the like can also be utilized instead of a spline.
In the first embodiment, the method of applying a Generalized Additive Model (DGAM) to a constraint that a second order differential is non-positive is used. As the modified example, the EMLC function in the second embodiment may be used as an initial value when the DGAM is used. Although having a relatively large number of parameters, the DGAM can apply an effective constraint in an initial state by using the EMLC function as an initial value.
In the third embodiment, a gamma correction function, a polynomial or a cosh function is used as a conversion function, by way of example. However, a sum, a product or a composite function of these functions may be used as a conversion function. In the third and fourth embodiments, the method of allowing a partial deviation from the constraint that a second order derivative of a logarithmic chromatogram is non-positive is described. As another method, this constraint may be relaxed by a direct and empirical method. For example, a parameter for allowing a deviation may be set empirically by a user, or the sum of positive values or the exponentiation of the positive values may be set as a penalty term for solving an optimization problem.
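The relaxation by a penalty term can be sketched as follows. The function names, the weight and the combination with a squared error are assumptions introduced for illustration; the sketch only shows how the positive part of the second order difference of a logarithmic chromatogram can be summed, optionally after exponentiation, and added to an objective.

```python
import numpy as np

def convexity_penalty(log_curve, power=1.0):
    """Sum of the positive parts of the second-order difference, raised to `power`."""
    second_diff = np.diff(log_curve, n=2)
    positive_part = np.clip(second_diff, 0.0, None)   # zero where the constraint holds
    return float(np.sum(positive_part ** power))

def penalized_objective(log_curve, data, weight=10.0, power=1.0):
    """Squared error plus a weighted penalty (hypothetical objective)."""
    squared_error = float(np.sum((np.exp(log_curve) - data) ** 2))
    return squared_error + weight * convexity_penalty(log_curve, power)

t = np.linspace(-5.0, 5.0, 201)
log_gaussian = -t**2                     # satisfies the constraint everywhere
log_lorentzian = -np.log(1.0 + t**2)     # violates the constraint in the tails

print(convexity_penalty(log_gaussian))
print(convexity_penalty(log_lorentzian))
print(convexity_penalty(log_lorentzian, power=2.0))
```

A curve that satisfies the constraint incurs no penalty, whereas a curve that deviates is penalized in proportion to the size of the deviation, so the weight controls how much deviation the optimization tolerates.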
It will be appreciated by those skilled in the art that the exemplary embodiments described above are illustrative of the following aspects.
A model function fitting device according to one aspect includes an acquirer that acquires a chromatogram, and a fitter that fits a model function to the chromatogram, while applying, to the model function, a constraint that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions being located at both sides of the first portion and being approximatable to a linear function.
It is possible to perform fitting with high accuracy.
The model function fitting device according to item 1, wherein the fitter may apply, to the model function, a constraint that a second order differential of the logarithmic function is non-positive.
Even in a case in which the model function has a large number of parameters, it is possible to apply an effective constraint to the model function.
The model function fitting device according to item 2, wherein the fitter may fit the model function to the chromatogram by using a Generalized Additive Model.
It is possible to reduce the number of parameters of the model function and enhance the stability of optimization calculation.
The model function fitting device according to item 2 or 3, may be used for calculation of an area of one peak or a plurality of peaks.
It is possible to perform a quantitative analysis and a qualitative analysis on measurement data with high accuracy.
The model function fitting device according to item 2 or 3, wherein as an initial value of the model function, a function may be used, the function being obtained by integration of a function obtained by addition by constant of a function obtained by constant multiplication of a sigmoid function as the model function described by a logarithmic scale.
Even in a case in which the model function has a large number of parameters, it is possible to apply an effective constraint to the initial value of the model function.
The model function fitting device according to item 1, wherein the fitter may use a function, the function being obtained by integration of a function obtained by addition by constant of a function obtained by constant multiplication of a sigmoid function as the model function described by a logarithmic scale.
It is possible to apply an effective constraint for fitting a model function to a chromatogram.
The model function fitting device according to item 6, wherein the model function may be a function with a normalized peak height and a normalized peak position.
The format of the model function is easy for a user to understand, and it is easy for the user to handle the model function.
The model function fitting device according to item 7, wherein the model function may be a function that corrects a peak width using a formula including a beta function and an exponential function.
The format of the model function is easy for a user to understand, and it is easy for the user to handle the model function.
The model function fitting device (1) according to item 1, wherein the fitter (22) may fit the model function to the chromatogram (C2), the model function being obtained when a conversion function having a gentler slope than that of an exponential function is applied to an original function having a second order differential that is non-positive.
It is possible to perform fitting with high accuracy by partially allowing a deviation from a constraint that a second order derivative of a logarithmic chromatogram is non-positive.
The model function fitting device (1) according to item 9, wherein the conversion function may include a composite function of a gamma correction function and an exponential function.
With the gamma correction function, it is possible to allow a deviation from a constraint that a second order derivative is non-positive.
The model function fitting device (1) according to item 9, wherein the conversion function may include a function having an n-th order polynomial in a denominator.
With the conversion function including a polynomial, it is possible to allow a deviation from a constraint that a second order derivative is non-positive.
The model function fitting device (1) according to item 9, wherein the conversion function may include a function having a cosh function in a denominator.
With the conversion function including a cosh function, it is possible to allow a deviation from a constraint that a second order derivative is non-positive.
The model function fitting device (1) according to item 1, wherein the fitter (22) may fit the model function to the chromatogram, the model function being allowed to deviate from a constraint that a second order differential is non-positive by distortion of time with respect to an original function having a non-positive second order differential.
It is possible to perform fitting with high accuracy by partially allowing a deviation from a constraint that a second order derivative of a logarithmic chromatogram is non-positive.
A model function fitting method according to another aspect includes the steps of acquiring a chromatogram, and fitting a model function to the chromatogram, while applying, to the model function, a constraint that the model function described by a logarithmic scale has a first portion being approximatable to a quadratic function and second portions being located at both sides of the first portion and being approximatable to a linear function.
It is possible to perform fitting with high accuracy.
The model function fitting method according to item 14, wherein the step of fitting may include applying, to the model function, a constraint that a second order differential of the logarithmic function is non-positive.
Even in a case in which the model function has a large number of parameters, it is possible to apply an effective constraint to the model function.
The model function fitting method according to item 14, wherein the step of fitting may include using a function, the function being obtained by integration of a function obtained by addition by constant of a function obtained by constant multiplication of a sigmoid function as the model function described by a logarithmic scale.
It is possible to apply an effective constraint for fitting a model function to a chromatogram.
The model function fitting method according to item 14, wherein the step of fitting (S12) may include fitting the model function to the chromatogram (C2), the model function being obtained when a conversion function having a gentler slope than that of an exponential function is applied to an original function having a non-positive second order differential.
It is possible to perform fitting with high accuracy by partially allowing a deviation from a constraint that a second order derivative of a logarithmic chromatogram is non-positive.
The model function fitting method according to item 14, wherein the step of fitting (S12) includes fitting the model function to the chromatogram (C2), the model function being allowed to deviate from a constraint that a second order differential is non-positive by distortion of time with respect to an original function having a non-positive second order differential.
It is possible to perform fitting with high accuracy by partially allowing a deviation from a constraint that a second order derivative of a logarithmic chromatogram is non-positive.
Number | Date | Country | Kind |
---|---|---|---|
2021-101606 | Jun 2021 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2022/024399 | 6/17/2022 | WO |