This application is a National Stage of International Application No. PCT/JP2015/054068 filed Feb. 16, 2015, the contents of which are incorporated herein by reference in their entirety.
The present invention relates to a method for estimating the magnitude of a noise component (noise level) in a chromatogram, spectrum or other kinds of measurement data, as well as a device for processing such measurement data and a program for processing measurement data.
One type of device for analyzing components contained in a liquid sample is the liquid chromatograph. In a liquid chromatograph, a liquid sample is carried by a stream of mobile phase and introduced into a column. The components in the sample are temporally separated within the column and subsequently detected with a detector, such as an absorptiometer, to create a chromatogram. Each component is identified from the position of a peak on the chromatogram, and the concentration of that component is determined from the height or area of that peak (for example, see Patent Literature 1).
A chromatogram obtained through a measurement normally contains a noise component in addition to the peak component. The magnitude of the peak component changes with the elution of the various components contained in the liquid sample. On the other hand, the magnitude of the noise component fluctuates due to various factors. Since it is impossible to identify all of those fluctuating factors and calculate the magnitude of the noise component, the task of removing a noise component from a chromatogram to obtain a peak component has conventionally been performed by approximating the noise component by white noise and fitting it to the measurement data, or by estimating the noise component for the entire chromatogram from a portion which is considered to include no peak in the chromatogram obtained by a measurement.
Patent Literature 1: JP 7-98270 A
Patent Literature 2: JP 2006-163614 A
As noted earlier, the noise component in an actual chromatogram fluctuates due to various factors. Therefore, the approximation by white noise may be insufficient to estimate the magnitude of the noise component with high accuracy. In the case of estimating the noise component from a portion of a measured chromatogram, it is preferable to deduce the portion in the chromatogram which contains absolutely no peak component. However, this is a difficult task, and therefore, it is difficult to estimate the magnitude of the noise component with high accuracy.
Although the previous description is concerned with the case of a chromatogram, similar problems can also occur in various other kinds of measurement data which contain a peak component and noise component, such as an optical spectrum in spectrophotometry or a mass spectrum in mass spectrometry.
The problem to be solved by the present invention is to provide a noise level estimation method, measurement data processing device, and program for processing measurement data by which the magnitude of the noise component (noise level) contained in measurement data, such as a chromatogram or spectrum, can be estimated with high accuracy.
The first aspect of the present invention developed for solving the previously described problem is a method for estimating a magnitude of a noise component from measurement data containing a peak component and noise component obtained by measuring an intensity of a signal which changes with respect to a predetermined physical quantity, the method including:
a) performing a time-frequency analysis on the measurement data to obtain, for each of a plurality of predetermined frequencies, waveform data representing a change in the intensity of the frequency component concerned in the aforementioned signal with respect to the predetermined physical quantity;
b) dividing the waveform data of each of the plurality of predetermined frequencies into a plurality of segments so that each section where positive values successively occur and each section where negative values successively occur in the direction of the change in the physical quantity are defined as one segment, or so that each section between a local maximum and a local minimum neighboring each other in the direction of the change in the physical quantity is defined as one segment;
c) determining the magnitude of each of the plurality of segments in the waveform data of each of the plurality of predetermined frequencies;
d) creating, for the waveform data of each of the plurality of predetermined frequencies, a selected segment group by excluding a segment whose magnitude exceeds a predetermined reference value from the plurality of segments in the waveform data; and
e) determining a noise level of each of the plurality of predetermined frequency components by calculating a statistical value of the magnitudes of the segments included in the selected segment group.
For example, the predetermined physical quantity is time, wavelength or mass-to-charge ratio, while the measurement data is, for example, a chromatogram, optical spectrum or mass spectrum.
The time-frequency analysis is an analytical technique used in such fields as image processing. Specifically, the continuous wavelet transform, discrete wavelet transform, filter bank and other techniques are commonly known (for example, see Patent Literature 2). In the case of using the continuous wavelet transform, waveform data are acquired at a plurality of continuous frequencies, from which a set of waveform data to be used is extracted at each of a plurality of relevant frequencies. In the case of using the filter bank or discrete wavelet transform, a plurality of relevant frequencies are previously set, and a set of waveform data is acquired at each of those frequencies. The plurality of predetermined frequencies may be set by analysis operators each time, or a plurality of standard frequencies may be previously set.
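As a rough illustration of the filter-bank variant, the following Python sketch filters a signal with a Ricker (Mexican-hat) kernel at several widths and returns one set of waveform data per width. The choice of kernel, the names `ricker`, `band_waveform` and `filter_bank`, the edge handling and the synthetic chromatogram are all assumptions made for this example; no particular wavelet or filter is prescribed here.

```python
import math

def ricker(width):
    """Ricker (Mexican-hat) kernel sampled at integer offsets; `width` is in samples."""
    half_len = int(5 * width)
    kernel = []
    for k in range(-half_len, half_len + 1):
        x = k / width
        kernel.append((1.0 - x * x) * math.exp(-0.5 * x * x))
    # Remove the residual mean so that a constant baseline maps to zero.
    mean = sum(kernel) / len(kernel)
    return [v - mean for v in kernel]

def band_waveform(signal, width):
    """Filter `signal` with the kernel; edges are handled by repeating the end samples."""
    kernel = ricker(width)
    half = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = min(max(i + j - half, 0), last)
            acc += signal[idx] * weight
        out.append(acc)
    return out

def filter_bank(signal, widths):
    """Map each width to its waveform; a smaller width corresponds to a higher frequency."""
    return {w: band_waveform(signal, w) for w in widths}

# Synthetic chromatogram (hypothetical): flat baseline plus one Gaussian peak at sample 50.
chromatogram = [2.0 + math.exp(-0.5 * ((i - 50) / 4.0) ** 2) for i in range(101)]
waveforms = filter_bank(chromatogram, widths=[2, 4])
```

Because the kernel has zero mean, each resulting waveform oscillates around zero, which is what makes a division into positive and negative sections meaningful.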
The use of the term “time-frequency analysis” does not mean that the predetermined physical quantity is limited to time. The method according to the present invention is also applicable in an analysis of other kinds of data, such as optical spectrum data in which the predetermined physical quantity is wavelength, or mass spectrum data in which the predetermined physical quantity is mass-to-charge ratio.
The “magnitude” of a segment can be determined, for example, from the area or height of the segment. The “predetermined reference value” may be, for example, the average+Nσ of the magnitudes of the segments (where N is a positive integer, and σ is the unbiased standard deviation), or the median+M×MAD (median absolute deviation) of the magnitudes of the segments (where M is a positive integer). The reference value may also be determined from the distribution of the magnitudes of the segments included in the same set of waveform data, by finding the range which includes a specific proportion (e.g. 90%) of those segments and designating the upper limit of that range as the reference value. The “statistical value” of the magnitudes of the segments included in the selected segment group may be, for example, the average or median of the magnitudes of the segments forming the selected segment group.
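The three kinds of reference value and the final statistical value can be sketched in Python as follows. The function names, the default values of N and M, and the sample magnitudes are hypothetical and serve only to illustrate the arithmetic.

```python
import math
import statistics

def reference_mean_sigma(magnitudes, n=3):
    """average + N*sigma, where sigma is the unbiased (n-1) standard deviation."""
    return statistics.mean(magnitudes) + n * statistics.stdev(magnitudes)

def reference_median_mad(magnitudes, m=3):
    """median + M*MAD, where MAD is the median absolute deviation."""
    med = statistics.median(magnitudes)
    mad = statistics.median([abs(v - med) for v in magnitudes])
    return med + m * mad

def reference_quantile(magnitudes, proportion=0.9):
    """Upper limit of the range that holds `proportion` of the segments."""
    ordered = sorted(magnitudes)
    k = max(1, math.ceil(proportion * len(ordered)))
    return ordered[k - 1]

def select_segments(magnitudes, reference):
    """Keep only magnitudes that do not exceed the reference value."""
    return [v for v in magnitudes if v <= reference]

def noise_level(selected, use_median=False):
    """Statistical value of the selected segment group (average or median)."""
    return statistics.median(selected) if use_median else statistics.mean(selected)

# Nine noise-like segments and one large segment that stems from a peak.
mags = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 1.0, 0.9, 25.0]
selected = select_segments(mags, reference_median_mad(mags, m=3))
level = noise_level(selected)
```

With these sample values, a single large segment inflates σ so strongly that average+3σ would not exclude it, whereas average+2σ or median+3×MAD does. This is one reason why N or M may need to be chosen with the distribution of the magnitudes in mind.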
In the noise level estimation method according to the present invention, a segment whose magnitude exceeds the predetermined reference value is regarded as a segment originating from the peak component and excluded from the plurality of segments included in the waveform data. Therefore, the noise level of the measurement data can be estimated with high accuracy.
The noise level in a chromatogram is not always at the same level. For example, in a gradient analysis, the noise level may fluctuate due to the temporal change in the mixture ratio of the solutions constituting the mobile phase. The noise level may also fluctuate due to a change in the temperature around the device during the measurement of a chromatogram. In these types of measurement data, an increase in the noise level causes the segment to be larger in magnitude, so that a segment originating from the noise component may be inappropriately excluded.
Accordingly, the noise level estimation method may preferably include:
normalizing the magnitude of each of the plurality of segments before creating the selected segment group, using index data concerning a change in the magnitude of the noise component in the direction of the change in the physical quantity in the measurement data.
For example, a set of data showing the change in the mixture ratio of the solutions in a gradient analysis, or a set of data showing the temperature change during the acquisition of the measurement data, may be used as the index data. By using, as the index data, those data which are expected to affect the rise and fall of the noise level, it is possible to more correctly create the selected segment group and estimate the noise level with a higher level of accuracy.
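One possible form of this normalization is sketched below in Python. It assumes that the index data has already been converted into a relative noise scale factor for every sample, and that each segment is divided by the mean scale factor over its span; both the conversion and the division rule are assumptions of this sketch, as are the names `normalize_magnitudes` and `noise_scale`.

```python
def normalize_magnitudes(segments, magnitudes, noise_scale):
    """Divide each segment magnitude by the mean noise scale over the segment.

    segments    -- list of (start, end) half-open sample index ranges
    magnitudes  -- raw area or height of each segment
    noise_scale -- per-sample relative noise factor derived from the index data
                   (hypothetical; how to derive it is left open here)
    """
    normalized = []
    for (start, end), magnitude in zip(segments, magnitudes):
        local = noise_scale[start:end]
        scale = sum(local) / len(local)
        normalized.append(magnitude / scale)
    return normalized

# The second half of the run is assumed to be twice as noisy as the first half.
segments = [(0, 2), (2, 4)]
raw = [2.0, 4.0]
scale = [1.0, 1.0, 2.0, 2.0]
print(normalize_magnitudes(segments, raw, scale))  # -> [2.0, 2.0]
```

After the division, both segments have the same value, so the later segment is no longer at risk of being excluded merely because the noise level rose.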
The noise level estimation method according to the present invention may be configured so that:
the selected segment group is created by excluding a segment which is located at a position corresponding to the aforementioned excluded segment in the direction of the change in the predetermined physical quantity and which belongs to the waveform data at a lower frequency than the frequency of the waveform data to which the aforementioned excluded segment belongs.
The power spectrum of the peak component (a) shown in
The noise level estimation method according to the present invention may further include:
comparing the noise levels at the plurality of predetermined frequencies with each other, and correcting the noise levels so that the noise level at a lower frequency becomes equal to or higher than the noise level at a higher frequency.
In a measurement of a signal intensity which changes with respect to a predetermined physical quantity, a detector which includes an electrical circuit having a capacitor or an analogue-to-digital (A/D) converter responding with a predetermined time constant is normally used. It is commonly known that a capacitor which accumulates electric charges for a predetermined period of time, or an A/D converter which responds with a predetermined time constant, acts like a low-pass filter and decreases the signal within a high frequency range. Therefore, the noise level in a signal acquired through such a detector tends to increase from higher to lower frequencies. Accordingly, by correcting the noise level in a manner to reflect such a tendency, the noise level can be even more accurately determined.
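A minimal Python sketch of such a correction is given below. It assumes one specific rule, namely raising each lower-frequency value to the largest value found at any higher frequency; the actual correction rule is not fixed by the description, and the name `enforce_monotone` and the sample values are hypothetical.

```python
def enforce_monotone(levels_by_freq):
    """Return corrected noise levels that never decrease toward lower frequencies.

    levels_by_freq -- dict mapping frequency to the estimated noise level
    """
    corrected = {}
    running_max = float("-inf")
    # Walk from the highest frequency down to the lowest.
    for freq in sorted(levels_by_freq, reverse=True):
        running_max = max(running_max, levels_by_freq[freq])
        corrected[freq] = running_max
    return corrected

estimated = {100: 1.0, 50: 0.8, 25: 1.5}
print(enforce_monotone(estimated))  # -> {100: 1.0, 50: 1.0, 25: 1.5}
```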
The second aspect of the present invention developed for solving the previously described problem is a measurement data processing device used for estimating a magnitude of a noise component from measurement data containing a peak component and noise component obtained by measuring an intensity of a signal which changes with respect to a predetermined physical quantity, the device including:
a) a time-frequency analyzer for performing a time-frequency analysis on the measurement data to obtain, for each of a plurality of predetermined frequencies, waveform data representing a change in the intensity of the frequency component concerned in the aforementioned signal with respect to the predetermined physical quantity;
b) a segment divider for dividing the waveform data of each of the plurality of predetermined frequencies into a plurality of segments so that each section where positive values successively occur and each section where negative values successively occur in the direction of the change in the physical quantity are defined as one segment, or so that each section between a local maximum and a local minimum neighboring each other in the direction of the change in the physical quantity is defined as one segment;
c) a segment value calculator for determining the magnitude of each of the plurality of segments in the waveform data of each of the plurality of predetermined frequency components;
d) a selected segment group creator for creating, for the waveform data of each of the plurality of predetermined frequency components, a selected segment group by excluding a segment whose magnitude exceeds a predetermined reference value from the plurality of segments in the waveform data; and
e) a noise level calculator for determining a noise level of each of the plurality of predetermined frequency components by calculating a statistical value of the magnitudes of the segments included in the selected segment group.
The third aspect of the present invention developed for solving the previously described problem is a program for processing measurement data used for estimating a magnitude of a noise component from measurement data containing a peak component and noise component obtained by measuring an intensity of a signal which changes with respect to a predetermined physical quantity, the program characterized by making a computer function as the measurement data processing device according to the second aspect of the present invention.
With the noise level estimation method, measurement data processing device or program for processing measurement data according to the present invention, the magnitude of a noise component (noise level) in a chromatogram, spectrum or other kinds of measurement data can be estimated with high accuracy.
Embodiments of the noise level estimation method, measurement data processing device, and program for processing measurement data according to the present invention are hereinafter described with reference to the attached drawings. The following embodiments deal with the case of estimating a noise level which is the magnitude of a noise component contained in a chromatogram acquired using a liquid chromatograph.
In the storage section 16, index data which has been prepared along with the acquisition of the chromatogram data is stored. For example, the index data is a set of data recording a temporal change of a parameter which affects the rise and fall of the noise level, such as the temporal change in the solution mixture ratio during a gradient analysis or the temporal change in the ambient temperature inside the measurement room. An OS (operating system) and a program 18 for processing measurement data are also stored in the storage section 16. Executing the program 18 for processing measurement data makes the CPU 11 function as a time-frequency analyzer 18a, segment divider 18b, segment value calculator 18c, selected segment group creator 18d, noise level calculator 18e and noise level corrector 18f, all of which will be described later.
The noise level estimation method using the measurement data processing device 10 of the present embodiment is hereinafter described with reference to the flowchart of
Initially, based on a determination of the analyzing frequencies by the user and a command to initiate the analysis, the time-frequency analyzer 18a performs a time-frequency analysis on the measurement data and obtains, for each of the frequencies specified by the user, a set of waveform data representing the temporal change in the intensity of the frequency component concerned in the chromatogram (Step S1).
Subsequently, the segment divider 18b divides each of the plurality of sets of waveform data obtained at the plurality of frequencies by the time-frequency analysis into a plurality of segments so that each period of time where positive values successively occur in the time-axis direction and each period of time where negative values successively occur are defined as one segment (Step S2).
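The division in Step S2 can be sketched in Python as follows. The handling of samples that are exactly zero (treated as separators belonging to no segment) and the names `split_by_sign` and `segment_area` are assumptions of this example.

```python
def split_by_sign(waveform):
    """Return (start, end) half-open index ranges of consecutive same-sign samples."""
    segments = []
    start = None
    prev_sign = 0
    for i, value in enumerate(waveform):
        sign = (value > 0) - (value < 0)  # +1, -1 or 0
        if sign != prev_sign:
            if prev_sign != 0:
                segments.append((start, i))  # close the run that just ended
            start = i
            prev_sign = sign
    if prev_sign != 0:
        segments.append((start, len(waveform)))
    return segments

def segment_area(waveform, segment):
    """Area of a segment, taken here as the sum of absolute sample values."""
    start, end = segment
    return sum(abs(v) for v in waveform[start:end])

waveform = [1, 2, -1, -3, 0, 2]
segments = split_by_sign(waveform)
areas = [segment_area(waveform, s) for s in segments]
print(segments)  # -> [(0, 2), (2, 4), (5, 6)]
print(areas)     # -> [3, 4, 2]
```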
Next, the segment value calculator 18c calculates the area of each of the segments of the waveform data obtained at the highest frequency. Then, it normalizes the area values based on the index data, stored in the storage section 16, to obtain segment values (Step S3). In other words, by using the temporal change in the noise factor recorded in the index data, the segment value calculator 18c calculates segment values which are free of the rise and fall of the noise level due to the noise factor. Subsequently, the selected segment group creator 18d calculates the average value and unbiased standard deviation σ of the segment values of the plurality of segments belonging to the same waveform data, and creates a selected segment group by excluding each segment whose segment value exceeds the average+Nσ (where N is a positive integer) from the segments constituting the waveform data concerned (
In the previously described example, the area of each segment is normalized to obtain the segment value, and each segment whose segment value exceeds the average+Nσ is excluded. It is also possible to use the height in place of the area and/or to exclude each segment whose area or height exceeds the median+M×MAD (median absolute deviation, where M is a positive integer). As for N or M in the aforementioned formulae, a suitable value can be used for each set of measurement data taking into account the distribution of the segment values.
As noted earlier, if a peak is present in a chromatogram, the area or height of a segment in the waveform data obtained by the time-frequency analysis increases. Accordingly, by Step S4, peak components can be excluded from the waveform data.
If there is any segment excluded by the selected segment group creator 18d (YES in Step S5), other segments located at the same position in the time-axis direction in the waveform data at the lower frequencies are also excluded (Step S6). If there is no segment excluded by the selected segment group creator 18d (NO in Step S5), the segment value calculator 18c once more calculates the segment value of each of the plurality of segments in the waveform data at the next highest frequency, i.e. the set of waveform data at the highest frequency among the sets of waveform data which remain unprocessed (Step S3). Then, a selected segment group is created in the previously described manner (Step S4), and if there is any segment excluded (YES in Step S5), other segments located at the same position in the time-axis direction in the waveform data at the lower frequencies are excluded (Step S6; see
As just described, if there is a peak component at a certain frequency, there is certainly a peak component within the frequency range lower than that frequency. Accordingly, by performing Step S6, a selected segment group from which peak components are more assuredly removed can be created.
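The propagation in Step S6 can be sketched in Python as shown below. The sketch interprets "the same position in the time-axis direction" as any overlap between index ranges, and it takes the segments excluded by the reference value at each frequency as a given input; both points, along with the names `overlaps` and `propagate_exclusions`, are assumptions of this example.

```python
def overlaps(a, b):
    """True if half-open index ranges a and b share at least one sample."""
    return a[0] < b[1] and b[0] < a[1]

def propagate_exclusions(segments_by_freq, excluded_by_freq):
    """Return the kept segments per frequency after cross-frequency exclusion.

    segments_by_freq -- dict: frequency -> list of (start, end) segments
    excluded_by_freq -- dict: frequency -> segments that exceeded the reference value
    """
    kept = {}
    carried = []  # excluded spans inherited from all higher frequencies
    for freq in sorted(segments_by_freq, reverse=True):
        blocked = carried + excluded_by_freq.get(freq, [])
        kept[freq] = [s for s in segments_by_freq[freq]
                      if not any(overlaps(s, b) for b in blocked)]
        carried = blocked
    return kept

segments_by_freq = {
    100: [(0, 5), (5, 10), (10, 15), (15, 20)],
    50: [(0, 4), (4, 12), (12, 20)],
    25: [(0, 10), (10, 20)],
}
excluded_by_freq = {100: [(5, 10)]}  # a peak was detected at the highest frequency
kept = propagate_exclusions(segments_by_freq, excluded_by_freq)
```

In this example the peak found between samples 5 and 10 at the highest frequency also removes the segments (4, 12) and (0, 10) at the two lower frequencies.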
As a result of sequentially creating the selected segment groups in descending order of the frequency, when the selected segment groups for all sets of waveform data have been created (YES in Step S7), the noise level calculator 18e calculates, for each of the plurality of frequencies, the noise level from the average value of the areas of the segments included in the selected segment group (Step S8).
As noted earlier, the capacitor or A/D converter included in a commonly used detector functions like a low-pass filter. Therefore, in a set of measurement data obtained through such a detector, signals within a high frequency range are relatively decreased. Accordingly, in order to reflect such a tendency, the noise level corrector 18f determines whether or not the noise level at lower frequencies is equal to or higher than the noise level at higher frequencies, and if not (NO in Step S9), the noise level corrector 18f corrects the calculated values of the noise level (Step S10) and determines the noise level at each frequency (Step S11).
It should be noted that Steps S5, S6, S9 and S10 are additional steps for calculating the noise level with high accuracy and are dispensable for the present invention. The normalization of the segment areas using the index data only needs to be performed when necessary; this process may be omitted in the case of a chromatogram obtained under fixed conditions (i.e. when it is possible to consider that there is no specific factor causing a temporal change in the noise level). That is to say, it is possible to independently create a selected segment group from the waveform data at each frequency, calculate the average or median of the segment values of the segments constituting the selected segment group, and directly adopt the calculated value as the noise level.
Although the previous embodiment is concerned with the case of processing a chromatogram obtained with a liquid chromatograph, the described method can also be used to determine the noise level in a chromatogram acquired with a gas chromatograph and in various other kinds of measurement data, such as an optical spectrum obtained through a spectrometric measurement or a mass spectrum obtained through mass spectrometry.
Additionally, as opposed to the previous embodiment in which waveform data are divided so that each period of time where positive values successively occur and each period of time where negative values successively occur are defined as one segment, the waveform data may also be divided so that each period of time between a local maximum and a local minimum neighboring each other is defined as one segment.
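This alternative division can be sketched in Python as follows. The sketch detects an extremum wherever the slope changes sign and ignores flat plateaus, and it measures a segment by the height difference between its two end points; these choices and the names `split_by_extrema` and `segment_height` are assumptions of this example.

```python
def split_by_extrema(waveform):
    """Return (start, end) index pairs, each running from one local extremum to the next."""
    extrema = []
    for i in range(1, len(waveform) - 1):
        left = waveform[i] - waveform[i - 1]
        right = waveform[i + 1] - waveform[i]
        if left * right < 0:  # slope changes sign: local maximum or minimum
            extrema.append(i)
    return list(zip(extrema, extrema[1:]))

def segment_height(waveform, segment):
    """Height of a segment: absolute difference between its two extrema."""
    start, end = segment
    return abs(waveform[end] - waveform[start])

waveform = [0, 2, 1, 3, 0, 1]
segments = split_by_extrema(waveform)
heights = [segment_height(waveform, s) for s in segments]
print(segments)  # -> [(1, 2), (2, 3), (3, 4)]
print(heights)   # -> [1, 2, 3]
```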
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2015/054068 | 2/16/2015 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2016/132422 | 8/25/2016 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
4837726 | Hunkapiller | Jun 1989 | A |
5966684 | Richardson | Oct 1999 | A |
6112161 | Dryden | Aug 2000 | A |
6449584 | Bertrand | Sep 2002 | B1 |
7340375 | Patenaud et al. | Mar 2008 | B1 |
7409298 | Andreev | Aug 2008 | B2 |
7657387 | Malek | Feb 2010 | B2 |
7983852 | Wright | Jul 2011 | B2 |
8027743 | Johnston | Sep 2011 | B1 |
8530828 | Ivosev | Sep 2013 | B2 |
9177559 | Stephenson | Nov 2015 | B2 |
10359404 | Kozawa | Jul 2019 | B2 |
20030009091 | Edgar, Jr. | Jan 2003 | A1 |
20050261838 | Andreev | Nov 2005 | A1 |
20050285023 | Liu | Dec 2005 | A1 |
20060241491 | Bosch-Charpenay | Oct 2006 | A1 |
20060241937 | Ma | Oct 2006 | A1 |
20070143319 | Malek | Jun 2007 | A1 |
20080270083 | Lange | Oct 2008 | A1 |
20090199620 | Kawana | Aug 2009 | A1 |
20100030562 | Yoshizawa | Feb 2010 | A1 |
20100208902 | Yoshizawa | Aug 2010 | A1 |
20100215191 | Yoshizawa | Aug 2010 | A1 |
20100283785 | Satulovsky | Nov 2010 | A1 |
20110054804 | Pfaff | Mar 2011 | A1 |
20120089344 | Wright | Apr 2012 | A1 |
20130087701 | Ivosev | Apr 2013 | A1 |
20130131998 | Wright | May 2013 | A1 |
20130311110 | Aizikov | Nov 2013 | A1 |
20140079248 | Short | Mar 2014 | A1 |
20140142894 | Chang | May 2014 | A1 |
20140177868 | Jensen | Jun 2014 | A1 |
20140180682 | Shi | Jun 2014 | A1 |
20140252218 | Wright | Sep 2014 | A1 |
20150287422 | Short | Oct 2015 | A1 |
20160224830 | Noda | Aug 2016 | A1 |
20170336370 | Noda | Nov 2017 | A1 |
20180011067 | Kozawa | Jan 2018 | A1 |
Number | Date | Country |
---|---|---|
0 296 781 | Dec 1988 | EP |
62291562 | Dec 1987 | JP |
07-098270 | Apr 1995 | JP |
11153588 | Jun 1999 | JP |
2006-163614 | Jun 2006 | JP |
2009008582 | Jan 2009 | JP |
2009204397 | Sep 2009 | JP |
2012177568 | Sep 2012 | JP |
2014137350 | Jul 2014 | JP |
2015200532 | Nov 2015 | JP |
Entry |
---|
Liland et al; “Optimal Choice of Baseline Correction for Multivariate Calibration of Spectra”; Applied Spectroscopy, vol. 64, No. 9, 2010. (Year: 2010). |
Zhang et al; “Multiscale peak alignment for chromatographic datasets”; Journal of Chromatography A, vol. 1223, Feb. 3, 2012, pp. 93-106. (Year: 2012). |
Machine Translation for JP2009008582 (Year: 2009). |
Machine Translation for JP2009204397 (Year: 2009). |
Machine Translation for JP2012177568 (Year: 2012). |
Machine Translation for JP2014137350 (Year: 2014). |
Machine Translation for JP2015200532 (Year: 2015). |
Machine Translation for JPH11153588 (Year: 1999). |
Machine Translation for JPS62291562 (Year: 1987). |
Wikipedia Entry about normalization (i.e. Normalization Wiki) (snapshot taken on Dec. 19, 2014 using Wayback Machine; https://web.archive.org/web/20141219154944/https://en.wikipedia.org/wiki/Normalization_(statistics).) (Year: 2014). |
Li-Thiao-Te, Sebastien & Schwikowski, Benno (2012). Feature Detection with Controlled Error Rates in LC/MS Images. Journal of Computational Biology, 19(4). https://doi.org/10.1089/cmb.2009.0125 (Year: 2012). |
Payne et al., “A Signal Filtering Method for Improved Quantification and Noise Discrimination in Fourier Transform Ion Cyclotron Resonance Mass Spectrometry-Based Metabolomics Data”, American Society for Mass Spectrometry, Jun. 1, 2009, vol. 20, No. 6, XP026494910, pp. 1087-1095. |
Communication dated Oct. 30, 2017 from the European Patent Office in counterpart Application No. 15882527.3. |
Written Opinion dated Apr. 28, 2015 in application No. PCT/JP2015/054068. |
International Search Report of PCT/JP2015/054068 dated Apr. 28, 2015 English. |
Number | Date | Country | |
---|---|---|---|
20180003683 A1 | Jan 2018 | US |