This application is a 371 application of PCT/JP2018/005551 having an international filing date of Feb. 16, 2018, which claims priority to JP2017-034419 filed Feb. 27, 2017, the entire content of each of which is incorporated herein by reference.
The present invention relates to a measurement analysis mode using a chemical sensor, or more specifically, to a method of identifying a sample by using a chemical sensor provided with multiple channels. The present invention also relates to a device configured to identify a sample based on the aforementioned identification method.
Nowadays, various devices are connected to one another through networks and allowed to mutually exchange massive data. Given the situation, cloud computing, big data analyses, the Internet of things (IoT), and so forth have been drawing attention as new services and systems based on this technology. Sensors are extremely important pieces of hardware for utilizing these new techniques. Among other things, chemical sensors designed to analyze liquid and gas samples are in high demand in the fields of food, safety, environment, and so forth. Meanwhile, advances in MEMS technologies have brought micro chemical sensor elements into realization. Accordingly, it is expected that a mobile terminal or the like equipped with such a micro chemical sensor will be able to automatically conduct or allow anybody to easily conduct a measurement of a sample and enable various analyses by combining data thus obtained with the new IT techniques mentioned above.
Flow rate control of a sample is often a problem in the measurement using the chemical sensor. In an ordinary measurement using the sensor, a sample is introduced into the sensor element while controlling its flow rate by using a pump, a mass flow controller, and the like and the sample is identified by analyzing signals obtained as a consequence. Since this measurement method requires installation of the pump and the mass flow controller in a measurement device, a measurement system cannot be reduced in size as a whole even though the sensor element is very small. On the other hand, there is also a measurement method in which the flow rate to introduce the sample is just monitored without controlling it, and the sample is identified based on a correspondence between the signal and a change in flow rate to introduce the sample with time. Nevertheless, even in this case, it is still necessary to monitor the flow rate and a component such as a flowmeter for monitoring the flow rate needs to be installed in a flow passage.
An object of the present invention is to provide a novel analysis method that enables identification of a sample without controlling or monitoring a change in sample introduction with time when a measurement using a chemical sensor takes place.
According to an aspect of the present invention, there is provided a sample identification method using a chemical sensor, comprising providing a sample to be identified to a chemical sensor having a plurality of channels as inputs in accordance with identical first functions that vary with time, and thus obtaining a group of outputs for the sample to be identified including a plurality of time-varying outputs from the plurality of channels; providing a control sample as an input in accordance with identical second functions that vary with time to a chemical sensor having a plurality of channels, which chemical sensor is identical to or has the same characteristics as the formerly mentioned chemical sensor, and obtaining a group of outputs for the control sample including a plurality of time-varying outputs from the plurality of channels; performing a first comparison or a second comparison, the first comparison being obtaining relationships between the group of the outputs for the sample to be identified and the group of the outputs for the control sample for each of the corresponding channels, and then comparing the relationships between the plurality of channels, and the second comparison being obtaining relationships between the outputs corresponding to the channels within each of the group of the outputs for the sample to be identified and the group of the outputs for the control sample, and then comparing the thus obtained relationships between the outputs for the sample to be identified and the outputs for the control sample; and identifying the sample to be identified and the control sample based on a result of the first or the second comparison.
Here, in each of the channels of the chemical sensor, the output from the channel may be describable in a separative form of a multiplication or an addition of an input to the channel and a transfer function (h) of the channel.
Meanwhile, each of the outputs for the sample to be identified and the outputs for the control sample may be expressed by the following formula (A) using a transfer function of the corresponding one of the plurality of channels,
yq,c(t)=hq,c(t)*xq(t) (A),
(where xq(t) represents any of the first function and the second function expressed as a time function, yq,c(t) represents the outputs for the sample to be identified and the outputs for the control sample expressed as a time function, hq,c(t) represents the transfer function expressed as a time function, the suffix q indicates discrimination of the control sample and the sample to be identified, c indicates a channel number in a range from 1 to C, and * represents a convolution operation). Moreover, each of the first and second comparisons may be a comparison where, with respect to simultaneous expressions including C polynomials concerning the sample to be identified and C polynomials concerning the control sample obtained by transforming the formula (A) so that the formula (A) is expressed in polynomials, an identification is made whether or not the transfer function in the formula (A) concerning the sample to be identified is the same as the transfer function in the formula (A) concerning the control sample.
In the meantime, the polynomials of the transform so that the formula (A) is expressed in polynomials may be expressed by a multiplication formula (B) in the form of
Y=HX (B),
where X and Y are the input to and the output from the channel respectively, and H is a variable or a constant corresponding to the transfer function between the input and the output.
Meanwhile, the first comparison may comprise obtaining ratios between each pair of the corresponding channels among the C polynomial expressions in the form of the formula (B) concerning the sample to be identified and the C polynomial expressions in the form of the formula (B) concerning the control sample; and comparing the obtained ratios among the multiple channels.
On the other hand, the second comparison may comprise obtaining first ratios between the C polynomial expressions in the form of the formula (B) concerning the sample to be identified; obtaining second ratios between the C polynomial expressions in the form of the formula (B) concerning the control sample; and comparing the first ratios and the second ratios between the corresponding channels.
Meanwhile, the transform of the formula (A) so that the formula (A) is expressed in polynomials may be a transform of the convolution operation in the formula (A) into a multiplication of matrices or vectors.
Alternatively, the transform from the formula (A) so that the formula (A) is expressed in polynomials may be a transform of the formula (A) from a function in the time domain into a function in the frequency domain.
Meanwhile, the first function and the second function may be determined independently of each other.
In the meantime, at least one of the first function and the second function may be a random function.
Meanwhile, the outputs for the sample to be identified and the outputs for the control sample may be subjected to time discretization.
According to another aspect of the present invention, there is provided a sample identification device which includes multiple chemical sensors; and an information processing device connected to the chemical sensors. Here, the sample identification device conducts any one of the above-mentioned sample identification methods using the chemical sensors.
According to the present invention, the same input is measured with the sensor provided with the multiple channels having different characteristics. Thus, it is possible to perform identification based on features of the sample (a chemical species, a concentration, a temperature, and the like of the sample) without controlling or monitoring a change in sample introduction with time. To be more precise, the present invention makes it possible to evaluate whether or not an unknown sample matches a known sample by comparing measurement data obtained from the unknown sample with data of the known sample measured in advance.
An aspect of the present invention provides a mode of identifying a sample in which, when the sample is introduced into a chemical sensor provided with multiple channels having different characteristics, the sample is identified by performing an analysis based on responses from the respective channels. In this way, there is provided an identification mode based on features of the sample (a chemical species, a concentration, a temperature, and the like of the sample) without controlling or monitoring a change in sample introduction with time.
The present invention is configured to analyze a sample by a measurement using a chemical sensor. The chemical sensor is a broad concept which signifies a sensor designed to identify and detect various molecules and ions existing in a gas phase, a liquid phase, and the like. As an example of the chemical sensor, this specification will pick up and describe a membrane-type surface stress sensor (MSS), which is one of nanomechanical sensors designed to detect very small expansions or contractions of a membrane attributed to adsorption of a chemical substance representing a particular chemical species and to convert the expansions or contractions into electrical signals. Nevertheless, the chemical sensor represents the concept based on the detection target and an operating principle, a structure, and the like thereof do not matter. For instance, various other principles are applied as the operating principles of chemical sensors, including those that utilize various chemical reactions, those that utilize electrochemical phenomena, those that utilize interactions between a semiconductor element and various substances present in the vicinity thereof, those that utilize biological functions such as enzymes, and so forth. A sensor body usable in the present invention is not limited to any particular structures, operating principles, and the like as long as the sensor body is designed to show a response of some sort to a sample.
<Theoretical Backgrounds>
When the sample is identified by using the chemical sensor provided with the multiple channels as described above, the identification method falls roughly into two conceivable ways. Now, let us consider a case where a measurement is performed on a sample (also referred to as a sample to be measured) u, of which chemical species, concentration, temperature, and the like are unknown, by using a chemical sensor provided with C pieces of channels so as to evaluate whether or not the sample u matches a known sample g that is measured in advance. As shown in
The above-described identification of the sample is feasible when an output (y) can be described in the form of either a multiplication or an addition of an input (x) and a transfer function (h) thereof separately from the other outputs. Here, if it is possible to describe the form of the multiplication, the multiplication is transformed into the addition by taking the logarithm, and vice versa. Therefore, it is sufficient just by explaining one of these forms. Accordingly, the case of the multiplication will be described in the following. A system in which a relationship between the input and the output is linear will be considered as an example. First, an inflow amount of the sample g into a sensor element is defined as xg(t) and a sensor signal yg,c(t) is assumed to be obtained in the channel c as a consequence. Note that t represents the time. Here, assuming that xg(t) and yg,c(t) have linearity, the sensor signal yg,c(t) can be expressed by the following convolution while using a time transfer function hg,c(t):
[MATH. 1]
$$y_{g,c}(t)=\int_{0}^{t}h_{g,c}(\tau)\,x_{g}(t-\tau)\,d\tau\equiv h_{g,c}(t)*x_{g}(t)\qquad(1)$$
In the convolution integral, the integration interval is usually set from −∞ to +∞. However, according to the law of causality, xg(t) in the future after the time t will not have an effect on the current yg,c(t). Accordingly, τ<0 is excluded from the integration interval. In the meantime, the measurement is performed for a period that is sufficiently longer than the time required for transfer of the signal from xg(t) to yg,c(t), and the effect of the inflow amount xg(t) in the past before the measurement start time t=0 on the current yg,c(t) is assumed to be negligible. Thus, t−τ<0, or more specifically, τ>t is excluded from the integration interval. As a consequence, the integration interval can be defined as [0, t]. This time transfer function hg,c(t) does not depend on the inflow amount xg(t) of the sample g as long as the system is linear, but it does vary depending on the features of the sample.
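The causal convolution described here can be sketched numerically as a Riemann sum over 0 ≤ τ ≤ t. The following is a minimal illustration only: the exponential-decay response `h`, the step `dt`, and the random input are hypothetical values chosen for the example and are not taken from this specification.

```python
import random


def causal_convolution(h, x, dt):
    """Riemann-sum approximation of y(t) = integral_0^t h(tau) x(t - tau) dtau.

    h and x are lists sampled every dt seconds starting at t = 0.
    Only lags 0 <= tau <= t contribute, which encodes causality and the
    assumption that inputs before t = 0 are negligible.
    """
    y = []
    for p in range(len(x)):
        acc = 0.0
        for j in range(min(p, len(h) - 1) + 1):
            acc += h[j] * x[p - j]
        y.append(acc * dt)
    return y


# Hypothetical channel response (exponential decay) and random inflow.
dt = 0.05
h = [2.0 * (0.8 ** j) for j in range(40)]
rng = random.Random(0)
x = [rng.uniform(0.0, 100.0) for _ in range(200)]

y = causal_convolution(h, x, dt)

# Linearity check: doubling the input doubles the output, while h is unchanged.
y2 = causal_convolution(h, [2.0 * v for v in x], dt)
```

The linearity check mirrors the statement that the transfer function does not depend on the inflow amount as long as the system is linear.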
Next, this will be considered with an expression in terms of a frequency domain. The input xg(t), the output yg,c(t), and the time transfer function hg,c(t) in the case of considering with the time domain can be expressed as Xg(f), Yg,c(f), and Hg,c(f), respectively, as functions of the frequency f by conducting the Fourier transform or the Laplace transform. In this instance, Yg,c(f) can be described in the form of a multiplication by using Xg(f) and the frequency transfer function Hg,c(f) as:
[MATH. 2]
$$Y_{g,c}(f)=H_{g,c}(f)\,X_{g}(f)\qquad(2)$$
As with the time transfer function, the frequency transfer function Hg,c(f) also varies depending on the features of the sample. Accordingly, the sample can be identified by obtaining Hg,c(f) by the measurement. Note that the formula (1) and the formula (2) are the expressions in terms of the time domain and the frequency domain, respectively, and are mathematically equivalent to each other.
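The equivalence between the time-domain and frequency-domain expressions can be checked numerically for finite sequences. The sketch below uses a naive discrete Fourier transform and zero-pads all sequences to the full length of the convolution output, in which case the product relation holds exactly; for a real measurement truncated at the edges of the observation window it would hold only approximately. The sequences `h` and `x` are arbitrary random values used for illustration.

```python
import cmath
import random


def dft(seq, n):
    """Naive O(n^2) discrete Fourier transform of seq zero-padded to length n."""
    padded = list(seq) + [0.0] * (n - len(seq))
    return [
        sum(padded[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def convolve(h, x):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out


rng = random.Random(1)
h = [rng.gauss(0.0, 1.0) for _ in range(8)]    # stand-in transfer function
x = [rng.gauss(0.0, 1.0) for _ in range(25)]   # stand-in input
y = convolve(h, x)
n = len(y)  # pad to the full output length so that Y = H X holds exactly

Y, H, X = dft(y, n), dft(h, n), dft(x, n)
max_err = max(abs(Y[k] - H[k] * X[k]) for k in range(n))
```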
Here, the measurement with multiple channels will be considered. According to the formula (1), when the sample g is inputted at xg(t), the signals yg,1(t), yg,2(t), . . . , yg,C(t) to be obtained in the channels c=1, 2, . . . , C can be expressed by using the time transfer functions hg,1(t), hg,2(t), . . . , hg,C(t) applicable to the respective channels as:
[MATH. 3]
$$y_{g,c}(t)=h_{g,c}(t)*x_{g}(t),\quad c=1,2,\ldots,C\qquad(3)$$
Next, a case of inputting a different sample u into the same sensor will be considered. If the input in this case is xu(t) while a signal obtained in each channel c is yu,c(t) and the time transfer function applicable to each channel c is hu,c(t), then the following expression is obtained as with the formula (3):
[MATH. 4]
$$y_{u,c}(t)=h_{u,c}(t)*x_{u}(t),\quad c=1,2,\ldots,C\qquad(4)$$
Let us consider the same concept in light of the frequency domain. According to the formula (2), each signal Yg,c(f) to be obtained in each channel c in response to the input Xg(f) of the sample g can be expressed by using the frequency transfer function Hg,c(f) applicable to each channel as:
[MATH. 5]
$$Y_{g,c}(f)=H_{g,c}(f)\,X_{g}(f),\quad c=1,2,\ldots,C\qquad(5)$$
Meanwhile, if a different sample u is inputted to the same sensor at Xu(f), each signal Yu,c(f) obtained in each channel c can be expressed by using the frequency transfer function Hu,c(f) applicable to each channel c concerning the sample u as:
[MATH. 6]
$$Y_{u,c}(f)=H_{u,c}(f)\,X_{u}(f),\quad c=1,2,\ldots,C\qquad(6)$$
Now, assuming that u is the unknown sample that is expected to be identified and g is the measurement sample for training, let us consider a way to evaluate whether or not u matches g by using only yu,c(t) and yg,c(t) or using only Yu,c(f) and Yg,c(f) but without using the input xq(t) or Xq(f).
Here, it is to be noted that the act of not using the input function for obtaining the training data and not using any of the inputs xq(t) and Xq(f) for obtaining the test data is equivalent to a situation where information indicating what these functions are is not used in the processing for identifying the sample. This is indicated more specifically in the following evaluation methods 1 to 3. This means that these two functions may be different from each other or may happen to be the same, or in other words, these functions may be determined independently and separately from each other.
<Evaluation Method 1: Analysis in Time Domain>
An evaluation in the time domain will be conducted in accordance with the analysis mode (
where h′g,c(j×Δt)=hg,c(j×Δt)×Δt, c=1, 2, . . . , C, p×Δt=t, and i=1, 2, . . . , 2I-1. In this instance, the following holds true for each channel:
therefore:
holds true on condition that:
It is possible to assume that xg is substantially regular because elements therein are statistically random. Likewise, the following holds true for the measurement of the unknown sample u:
therefore:
holds true on condition that:
It is possible to assume that xu is substantially regular because elements therein are statistically random.
If u=g, then the respective channels satisfy the following:
Therefore, the following is derived from the formula (9) and the formula (12):
Accordingly, the following holds true:
The rightmost side of the formula (16) depends only on the xu and xg (i=1, 2, . . . , 2I-1) but does not depend on the channels. Accordingly, the values Rug,c are equal among all the channels.
On the other hand, the following holds true if u≠g:
Therefore, the values Rug,c are not equal among many of the channels. Accordingly, it is possible to determine whether or not the unknown sample u is the training sample g without controlling or monitoring the input xq(t) but instead by comparing the values Rug,c among the respective channels.
Here, in the above-described example, the transform of the above-described formula (1) into the polynomial expressions in the case of time discretization of the continuous function was expressed as the multiplication of matrices. It is to be noted, however, that the expressions do not always have to be composed of the matrices but may be expressed as a multiplication of vectors instead.
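To illustrate how a time-discretized convolution turns into a multiplication of a matrix and a vector, the sketch below builds a lower-triangular Toeplitz matrix from the sampled input and multiplies it by the scaled transfer function h′ = h × Δt. This is a generic textbook layout and is not claimed to reproduce the exact matrix arrangement (indices up to 2I−1) used in the formulae of this evaluation method; the numerical values are arbitrary.

```python
def toeplitz_input_matrix(x, n_taps):
    """Rows are time steps p, columns are lags j: X[p][j] = x[p - j], or 0 if p < j."""
    return [
        [x[p - j] if p >= j else 0.0 for j in range(n_taps)]
        for p in range(len(x))
    ]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]   # arbitrary sampled input
h_scaled = [0.5, 0.25, 0.125]        # plays the role of h' = h * dt

X = toeplitz_input_matrix(x, len(h_scaled))
y = matvec(X, h_scaled)              # y[p] = sum_j h'[j] * x[p - j]
```

Each entry of `y` equals the discrete causal convolution at that time step, so the convolution operation has become an ordinary product.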
<Evaluation Method 2: Analysis in Frequency Domain (No. 1)>
An evaluation in the frequency domain will be conducted in accordance with the analysis mode (
Accordingly, the following is derived from the formulae (5) and (6):
where Cug(f)=Xg(f)/Xu(f). Here, assuming that:
$$C_{ug}(f)=X_{g}(f)/X_{u}(f)=A_{ug}(f)\,e^{i\theta_{ug}(f)}$$
the formula (19) can also be notated as:
where Aug(f)=|Xg(f)/Xu(f)|. In this case, absolute values of both sides in the formula (20) satisfy the following:
Accordingly, the following holds true:
[MATH. 22]
$$\log|Y_{u,1}(f)|-\log|Y_{g,1}(f)|=\log|Y_{u,2}(f)|-\log|Y_{g,2}(f)|=\cdots=\log|Y_{u,C}(f)|-\log|Y_{g,C}(f)|=-\log A_{ug}(f)\qquad(22)$$
Meanwhile, phase components of both sides in the formula (20) satisfy the following:
[MATH. 23]
$$\arg Y_{u,1}(f)-\arg Y_{g,1}(f)=\arg Y_{u,2}(f)-\arg Y_{g,2}(f)=\cdots=\arg Y_{u,C}(f)-\arg Y_{g,C}(f)=-\theta_{ug}(f)\qquad(23)$$
On the other hand, the following holds true if u≠g:
which leads to:
whereby the equations of the formulae (22) and (23) do not hold true any longer.
Accordingly, it is possible to evaluate whether or not the unknown sample u is the same as the training sample g without controlling or monitoring the input Xq(f) but instead by comparing log|Yu,c(f)|−log|Yg,c (f)| and argYu,c(f)−argYg,c (f) among the channels, respectively, based on the measurement of the unknown sample u and the measurement of the training sample g.
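The channel comparison of this evaluation method can be tried on simulated data. In the sketch below, the per-channel responses (`decay(...)`), the random inputs, and the chosen frequency bin `k` are all hypothetical; the inputs are used only to simulate the sensor outputs and are never passed to the comparison itself. The phase is compared through the complex ratio relative to the first channel, which is one way (an assumption of this sketch) to avoid 2π wrap-around when subtracting arguments.

```python
import cmath
import math
import random


def convolve(h, x):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out


def dft_bin(seq, n, k):
    """k-th coefficient of the length-n DFT of seq (implicitly zero-padded)."""
    return sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(seq))


def output_ratios(y_u, y_g, n, k):
    """Y_u,c(f_k) / Y_g,c(f_k) for every channel c, from outputs only."""
    return [dft_bin(yu, n, k) / dft_bin(yg, n, k) for yu, yg in zip(y_u, y_g)]


def log_magnitude_spread(ratios):
    """Spread across channels of log|Y_u,c| - log|Y_g,c|."""
    logs = [math.log(abs(r)) for r in ratios]
    return max(logs) - min(logs)


def phase_spread(ratios):
    """Largest phase deviation from channel 1 of arg Y_u,c - arg Y_g,c."""
    return max(abs(cmath.phase(r / ratios[0])) for r in ratios)


def decay(rate, n_taps=20):
    """Hypothetical channel response: geometric decay."""
    return [rate ** j for j in range(n_taps)]


rng = random.Random(2)
x_g = [rng.uniform(0.0, 100.0) for _ in range(30)]   # training input (unknown to the analysis)
x_u = [rng.uniform(0.0, 100.0) for _ in range(30)]   # test input (unknown to the analysis)
h_g = [decay(0.5), decay(0.7), decay(0.9)]           # three channels, sample g
h_other = [decay(0.6), decay(0.8), decay(0.4)]       # three channels, a different sample

n, k = 20 + 30 - 1, 3
y_g = [convolve(h, x_g) for h in h_g]
y_same = [convolve(h, x_u) for h in h_g]
y_diff = [convolve(h, x_u) for h in h_other]

same = output_ratios(y_same, y_g, n, k)
diff = output_ratios(y_diff, y_g, n, k)
```

With these stand-in values, `log_magnitude_spread(same)` and `phase_spread(same)` are zero up to rounding, while `log_magnitude_spread(diff)` is clearly nonzero.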
<Evaluation Method 3: Analysis Based on Frequency Domain (No. 2)>
An evaluation in the frequency domain will be conducted in accordance with the analysis mode (
[MATH. 26]
$$X_{g}(f)=Y_{g,1}(f)/H_{g,1}(f)=Y_{g,2}(f)/H_{g,2}(f)=\cdots=Y_{g,C}(f)/H_{g,C}(f)\qquad(26),$$
the following holds true for the arbitrary channels m and n:
[MATH. 27]
$$Y_{g,m}(f)/Y_{g,n}(f)=H_{g,m}(f)/H_{g,n}(f)=K'_{g,mn}(f)\qquad(27)$$
Likewise, regarding the measurement of the unknown sample u, the following is derived from the formula (6):
[MATH. 28]
$$Y_{u,m}(f)/Y_{u,n}(f)=H_{u,m}(f)/H_{u,n}(f)=K'_{u,mn}(f)\qquad(28)$$
If the sample is the same, the frequency transfer function Hq,c(f) is invariant with any input Xq(f). Accordingly, K′u,mn(f)=K′g,mn(f) holds true when u=g while K′u,mn(f)≠K′g,mn(f) holds true when u≠g. As a consequence, it is possible to determine whether or not the unknown sample u is the same as the training sample g without controlling or monitoring the input Xq(f) but instead by comparing the values K′u,mn(f) obtained by the measurement of the unknown sample u with the values K′g,mn(f) obtained in advance in the measurement for training.
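The invariance of the channel ratio K′ with respect to the input can likewise be checked on simulated data. The sketch below is a minimal illustration under assumed conditions: the geometric-decay responses, the random inputs, the channel pair (m, n), and the frequency bin `k` are hypothetical choices, not values from this specification.

```python
import cmath
import random


def convolve(h, x):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out


def dft_bin(seq, n, k):
    """k-th coefficient of the length-n DFT of seq (implicitly zero-padded)."""
    return sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(seq))


def channel_ratio(outputs, m, n_ch, n, k):
    """K'_{mn}(f_k) = Y_m(f_k) / Y_n(f_k), computed from sensor outputs only."""
    return dft_bin(outputs[m], n, k) / dft_bin(outputs[n_ch], n, k)


def decay(rate, n_taps=20):
    """Hypothetical channel response: geometric decay."""
    return [rate ** j for j in range(n_taps)]


rng = random.Random(3)
x_train = [rng.uniform(0.0, 100.0) for _ in range(30)]
x_test = [rng.uniform(0.0, 100.0) for _ in range(30)]
h_g = [decay(0.5), decay(0.7), decay(0.9)]       # sample g, three channels
h_other = [decay(0.6), decay(0.8), decay(0.4)]   # a different sample

n, k = 20 + 30 - 1, 3
K_train = channel_ratio([convolve(h, x_train) for h in h_g], 0, 2, n, k)
K_same = channel_ratio([convolve(h, x_test) for h in h_g], 0, 2, n, k)
K_other = channel_ratio([convolve(h, x_test) for h in h_other], 0, 2, n, k)
```

Here `K_same` agrees with `K_train` even though the two inputs differ, whereas `K_other` does not.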
As described above, when the sample is identified by using the multiple chemical sensors, there are two conceivable ways, namely, the analysis mode (
Note that the input function for obtaining the training data and the input function for obtaining the test data are not limited to particular functions as long as the functions can satisfy the theoretical explanation described above. It is possible to achieve the analyses theoretically when each function is a function containing various frequency components or a time-varying function. Accordingly, such a function may for example be any of a function that varies randomly with time, a function in which frequency components are distributed to a predetermined range, and the like. Those functions may be generated in some way as needs arise, or may be such functions that seem to be random but are actually preset such as patterns stored in a memory or the like.
In the meantime, the above-described evaluation method 1 is designed to perform the processing on the outputs from the chemical sensors subjected to the time discretization. However, the evaluation method is not limited to the foregoing. Even in a case of conducting an evaluation in accordance with a method other than the evaluation method 1, such an evaluation is thought to be accomplished in fact by using an information processing device that applies a digital computer in most cases. Therefore, it is also possible to conduct the analysis based on the formula (1) or on the formula (2) obtained by the time discretization of results of the outputs from the chemical sensors even in the case of the evaluation method other than the evaluation method 1. Modes for performing the information processing by way of the time discretization, instruments used therefor, and the like are matters known to those skilled in the art and specific explanations thereof will be omitted.
Furthermore, it is to be noted that a sample identification device according to the present invention does not always require a pump for providing the sample to the chemical sensors, instruments for controlling flow rates thereof, and so forth. For example, a source of generation of the sample may provide a flow of the sample by itself in accordance with an appropriate input function. Meanwhile, any of a source of the sample and the chemical sensors may be held at a proper position by hand instead of completely fixing a positional relationship therebetween with a fixture or the like. Moreover, the input function may further be optimized for example by moving the hand or the sample when appropriate. It should be understood that these measures are sufficiently practical. Accordingly, the minimum required elements of the sample identification device according to the present invention consist of the chemical sensors and the information processing device that receives the outputs therefrom and conducts analyses.
The present invention will be described further in detail below based on examples. However, it is needless to say that the present invention shall not be limited only to these examples. For instance, a membrane-type surface stress sensor (MSS) (Patent Literature 1 and Non-patent Literature 1) is used as an example of the chemical sensor in the following description. However, it is also possible to use chemical sensors of other types depending on the situation.
In this example, the measurement was conducted by randomly changing the flow rate of the MFC1 10. In principle, the frequency transfer function can be obtained by applying an impulse (a pulse having an infinitely small temporal width and an infinitely large height) as an input and observing the response thereto. Nonetheless, it is difficult to apply such an impulse with an actual experimental system. There is also the problem that it is difficult to extract only the response to the impulse from an output signal if the output contains noise. Accordingly, in this example, the frequency transfer function was obtained by applying white noise as the input instead of the impulse. White noise has a constant power across the entire frequency range. It is therefore possible to seek the frequency transfer function by evaluating the response to the white noise. In the actual experimental system, the input-controllable frequency range is limited. Hence, according to the Nyquist theorem, only the responses at frequencies up to half of the frequency at which the input is switched are significant for the analysis. In this example, the flow rate of the MFC1 10 was randomly changed every second (1 Hz) within a range from 0 to 100 sccm as shown in
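A random flow profile of the kind described in this example can be sketched as a piecewise-constant sequence. The function name, the uniform distribution over the 0 to 100 sccm range, the seed, and the 20 Hz grid on which the profile is tabulated are assumptions made for illustration; the text only states that the flow rate was changed randomly every second within that range.

```python
import random


def random_flow_profile(duration_s=120, switch_hz=1.0, sample_hz=20,
                        lo=0.0, hi=100.0, seed=0):
    """Piecewise-constant random flow rate (sccm), re-drawn every 1/switch_hz seconds.

    The uniform draw between lo and hi is an assumption of this sketch.
    """
    rng = random.Random(seed)
    samples_per_step = int(round(sample_hz / switch_hz))
    n_steps = int(round(duration_s * switch_hz))
    profile = []
    for _ in range(n_steps):
        level = rng.uniform(lo, hi)
        profile.extend([level] * samples_per_step)
    return profile


flow = random_flow_profile()
```

Because the level changes once per second, such a profile carries useful frequency content only up to about 0.5 Hz.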
This analysis was conducted based on <Evaluation method 1> in <Theoretical backgrounds>. The measurement data corresponding to 20 Hz and 120 seconds were divided into K=2400/(2I) sessions and then K−1 sessions (measurement number k=1, 2, . . . , K−1) were used as the training data while the last session (k=K) was used as the test data. The formulae (16) and (17) enable identification of a gas by evaluating whether or not the values Rug,c are equal to one another. In the case where the determination as to whether or not the values Rug,c are equal to one another is made based on the magnitude of variance of the values Rug,c across channels of (i, j) elements, even if structures of Rug,c are not similar in general, the structures are possibly determined to be similar if the respective elements of Rug,c are small in general due to combinations of scales of yg,c(t) and yu,c(t). To avoid this, the determination as to whether or not the values Rug,c are equal to one another is made based on differences in logarithm and sign among the respective element values. If u=g, then:
[MATH. 29]
$$\log|R_{ug,1}(i,j)|=\log|R_{ug,2}(i,j)|=\cdots=\log|R_{ug,C}(i,j)|\qquad(29)$$; and
[MATH. 30]
$$\operatorname{sign}(R_{ug,1}(i,j))=\operatorname{sign}(R_{ug,2}(i,j))=\cdots=\operatorname{sign}(R_{ug,C}(i,j))\qquad(30)$$
are derived from the formula (16). Here, Rug,c(i, j) denotes the (i, j) element of Rug,c, and sign(x) is the sign of x, which takes the value −1, 0, or 1 when x<0, x=0, or x>0, respectively. In the meantime, when u≠g, the values log|Rug,c(i,j)| and sign(Rug,c(i, j)) do not match among the channels for many of the elements Rug,c(i, j).
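The element-wise comparison expressed by the formulae (29) and (30) can be sketched as follows. The function name and the small matrices are hypothetical; in an actual measurement the equalities hold only approximately because of noise, which is why a statistical treatment is needed rather than an exact test like this one.

```python
import math


def sign(v):
    """Return -1, 0, or 1 for negative, zero, or positive v."""
    return (v > 0) - (v < 0)


def compare_across_channels(R_by_channel):
    """Compare matrices R_c (one per channel, same shape) element by element.

    Returns (largest spread of log|R_c(i,j)| across channels,
             fraction of elements whose sign agrees in all channels).
    Assumes no element is exactly zero, since log(0) is undefined.
    """
    n_rows = len(R_by_channel[0])
    n_cols = len(R_by_channel[0][0])
    max_log_spread = 0.0
    sign_matches = 0
    for i in range(n_rows):
        for j in range(n_cols):
            vals = [R[i][j] for R in R_by_channel]
            logs = [math.log(abs(v)) for v in vals]
            max_log_spread = max(max_log_spread, max(logs) - min(logs))
            if len({sign(v) for v in vals}) == 1:
                sign_matches += 1
    return max_log_spread, sign_matches / (n_rows * n_cols)


# Hypothetical 2x2 matrices for three channels.
R_same = [[[2.0, -0.5], [0.1, 4.0]]] * 3
R_diff = [
    [[2.0, -0.5], [0.1, 4.0]],
    [[2.0, 0.5], [0.3, 4.0]],
    [[1.0, -0.5], [0.1, -4.0]],
]

result_same = compare_across_channels(R_same)
result_diff = compare_across_channels(R_diff)
```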
Here, regarding a value Rug,ck to be obtained from the test data and the training data of the measurement number k, values LRug,k(i, j) and SRug,k(i, j) are defined as follows:
If u=g, then the values Rug,k(i, j) have averages LRug,k and SRug,k in common irrespective of the value of c. Here, values log|Rug,ck (i,j)| and sign(Rug,ck(i,j)) are assumed to be in conformity to common distributions N(LRug,k(i,j), σLg,c2(i,j)) and N(SRug,k(i,j), σSg,c2(i,j)), respectively, regardless of the value k on condition that N(μ, σ2) is a normal distribution with the mean μ and the variance σ2. Meanwhile, assuming that Rg,ck1k2 is a matrix to be obtained by conducting a calculation similar to the case of Rug,c in the formula (16) by using K pieces of the measurement values yg,ck1 and yg,ck2 concerning the respective channels, values σLg,c2(i,j) and σSg,c2(i,j) are defined as follows:
on condition that:
Accordingly, in the comparison between the test data yu,c(t) and the training data yg,ck(t), a consistent probability L(u, g) with a substance g across all the channels c, all the training data (the measurement number k), and respective transfer matrix elements (i, j) is defined as:
A logarithmic likelihood LL(u, g) obtained by taking the logarithm of the formula (37) is defined as:
As a consequence, it is possible to evaluate the consistent probability of the training data with each gas type by calculating the rightmost side of the formula (38) regarding each piece of the test data.
Calculations of LL(u, g) were conducted in accordance with the above discussion. While tables of data used for calculations in other analysis examples (analysis examples 2 and 3) are quoted near the end of this specification, this analysis example involved an enormous number of data pieces running into several millions, and the quotation thereof was therefore omitted. As for the data preprocessing, high-frequency noise was removed by using a finite impulse response (FIR) low-pass filter set to a cut-off frequency of 0.5 Hz. Results in the case of setting I=10 and K=120 are shown in Table 1. Note that actual values of LL(u, g) are given by multiplying the values in the table by 10^7. The case that brings about the largest value of LL(u, g) in each row is indicated in bold. As for four types of the solvents, namely, water, ethanol, benzene, and ethyl acetate, the pieces of the training data that brought about the largest likelihood coincided with those of the test data. In other words, the identification of those samples was successful. As for hexane, the piece of the training data that brought about the largest likelihood turned out to be that of benzene, which was incorrect. Nonetheless, the piece of the training data representing hexane as the correct answer showed the second largest likelihood. However, as for THF, the logarithmic likelihood of the piece of training data representing THF as the correct answer was the fifth largest and showed poor accuracy. A possible reason why the analysis in the time domain based on the concept in
−1.6227
−1.5981
−1.5283
−1.5336
−1.5821
−1.5506
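The preprocessing mentioned for this analysis example, an FIR low-pass filter with a cut-off frequency of 0.5 Hz applied to data sampled at 20 Hz, could be sketched as a windowed-sinc design. The number of taps (101) and the Hamming window are assumptions of this sketch; the specification does not state how the filter was designed.

```python
import cmath
import math


def lowpass_fir(cutoff_hz, sample_hz, n_taps=101):
    """Windowed-sinc low-pass FIR taps (Hamming window), normalized to unit DC gain."""
    fc = cutoff_hz / sample_hz           # cut-off in cycles per sample
    mid = (n_taps - 1) / 2.0
    taps = []
    for n in range(n_taps):
        t = n - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [v / total for v in taps]


def apply_fir(taps, signal):
    """Causal FIR filtering; the output has the same length as the input."""
    out = []
    for p in range(len(signal)):
        acc = 0.0
        for j in range(min(p, len(taps) - 1) + 1):
            acc += taps[j] * signal[p - j]
        out.append(acc)
    return out


def gain(taps, freq_hz, sample_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq_hz * n / sample_hz)
                   for n, v in enumerate(taps)))


taps = lowpass_fir(cutoff_hz=0.5, sample_hz=20.0)
filtered = apply_fir(taps, [1.0] * 300)   # a constant signal passes unchanged once the filter fills
```

A longer filter would give a sharper transition around 0.5 Hz at the cost of a longer start-up transient.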
This analysis was conducted based on <Evaluation method 2> in <Theoretical backgrounds>. The measurement data corresponding to 120 seconds were divided into six sessions (K=6), and then five sessions (the measurement number k=1, 2, 3, 4, 5) were used as the training data while the last session (k=6) was used as the test data. First, the measurement data were subjected to the Fourier transform to seek the frequency components of the sensor signal. In this example, each session corresponds to a measurement for 20 seconds at the sampling frequency of 20 Hz (400 data points). Accordingly, the frequency characteristics are obtained at 0.05-Hz intervals. Since the flow rate is randomly changed every second (1 Hz) in this example, the frequency components up to 0.5 Hz, which is half of 1 Hz, are useful in the analysis according to the Nyquist theorem. As mentioned earlier, the input-controllable frequency range is limited in the actual experimental system. In this example, the system could hardly follow an attempt to switch the flow rate at a frequency higher than 1 Hz due to a restriction in responsiveness of the MFCs and the like. As a consequence, ten components at 0.05, 0.1, 0.15, . . . , and 0.5 Hz are used in the analysis. Hence, Yq,c(f) is obtained as a 10-dimensional complex vector per channel in each measurement.
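The session split and the resulting frequency grid follow directly from the numbers given in this paragraph. The sketch below reproduces that arithmetic; the helper name `split_sessions` and the placeholder record are illustrative only.

```python
def split_sessions(samples, n_sessions):
    """Split a record into n_sessions consecutive, equally sized sessions."""
    size = len(samples) // n_sessions
    return [samples[i * size:(i + 1) * size] for i in range(n_sessions)]


sample_hz = 20.0
record = list(range(2400))            # placeholder for 120 s of one channel's signal
sessions = split_sessions(record, 6)  # K = 6
training, test = sessions[:-1], sessions[-1]

session_s = len(test) / sample_hz     # duration of one session in seconds
df = 1.0 / session_s                  # frequency resolution of the Fourier transform
switch_hz = 1.0                       # the flow rate changes once per second
n_bins = int(round((switch_hz / 2.0) / df))   # components up to half of 1 Hz
freqs = [k / session_s for k in range(1, n_bins + 1)]
```

The zero-frequency component is left out here, matching the ten components from 0.05 Hz to 0.5 Hz listed in the text.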
Based on the above, let us consider a case of identifying the gas type of the test data by comparing Fourier components Yu,c(f) of the test data obtained based on Evaluation method 2 with Fourier components Yg,ck(f) of the training data. Here, the gas is identified based on a probability defined as
$$A_{ug}(f)\,e^{i\theta_{ug}(f)}$$
that is derived from the formula (20).
Based on the formulae (22) and (23), the values $\log|Y_{u,c}(f)|-\log|Y_{g,c}(f)|$ and $\arg Y_{u,c}(f)-\arg Y_{g,c}(f)$ take the same values in all the channels when u=g. Accordingly, based on $Y_{u,c}(f)$ of the test data with the channel c obtained by the measurement and $Y_{g,c}^{k}(f)$ of the k-th piece of the training data, values $\hat{L}_{ug,k}(f)$ and $\hat{\theta}_{ug,k}(f)$ are defined respectively as:
Here, the values $\log|Y_{u,c}(f)|-\log|Y_{g,c}(f)|$ and $\arg Y_{u,c}(f)-\arg Y_{g,c}(f)$ are assumed to be in conformity to common distributions $N(\hat{L}_{ug,k}(f), \sigma_{Lg,c}^{2}(f))$ and $N(\hat{\theta}_{ug,k}(f), \sigma_{\theta g,c}^{2}(f))$, respectively. Here, if $Y_{u,c}(f)$ is assumed to be in conformity to a population distribution similar to $Y_{g,c}^{k}(f)$, then variances of $\log|Y_{u,c}(f)|$ and $\arg Y_{u,c}(f)$ are thought to be equivalent to variances $s_{Lg,c}^{2}(f)$ and $s_{\theta g,c}^{2}(f)$ of $\log|Y_{g,c}^{k}(f)|$ and $\arg Y_{g,c}^{k}(f)$. Accordingly, values $\sigma_{Lg,c}^{2}(f)$ and $\sigma_{\theta g,c}^{2}(f)$ are defined as:
[MATH. 41]
$\sigma_{Lg,c}^{2}(f)=2s_{Lg,c}^{2}(f)$ (41); and
[MATH. 42]
$\sigma_{\theta g,c}^{2}(f)=2s_{\theta g,c}^{2}(f)$ (42)
on condition that:
where log|Yg,c(f)| and argYg,c(f) are defined as:
respectively.
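The factor of 2 in the formulae (41) and (42) is consistent with the following reading, which is offered as an interpretation rather than as part of the original derivation. If the log-magnitudes of the test data and of the training data are treated as independent quantities sharing the same variance, the variance of their difference is the sum of the two variances. Independence is an assumption made for this sketch; the same argument would apply to the phase.

```latex
\begin{align}
\operatorname{Var}\bigl[\log|Y_{u,c}(f)| - \log|Y_{g,c}^{k}(f)|\bigr]
  &= \operatorname{Var}\bigl[\log|Y_{u,c}(f)|\bigr]
   + \operatorname{Var}\bigl[\log|Y_{g,c}^{k}(f)|\bigr] \\
  &= s_{Lg,c}^{2}(f) + s_{Lg,c}^{2}(f)
   = 2\,s_{Lg,c}^{2}(f)
   = \sigma_{Lg,c}^{2}(f)
\end{align}
```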
Accordingly, a probability that $Y_{u,c}(f)$ of the test data matches the training data $Y_{g,c}^{k}(f)$, notated as
$P\left(A_{ug}(f)e^{i\theta_{ug}(f)}\right)$
is defined as:
Therefore, a probability L(u, g) that the unknown sample u matches the substance g across all the channels c, all the training data (the measurement number k), and all the frequency components f (from the lowest frequency $f_L$ to the highest frequency $f_H$) is obtained by:
The logarithmic likelihood LL(u, g) obtained by taking the logarithm of the formula (48) is obtained by:
As a consequence, it is possible to evaluate the probability that the test data match the training data of each gas type by calculating the rightmost side of the formula (49) regarding each piece of the test data.
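The calculation of LL(u, g) for Evaluation method 2 can be sketched as below. Several points are assumptions made only for this illustration, because the defining formulae are not reproduced in the text above: $\hat{L}_{ug,k}(f)$ and $\hat{\theta}_{ug,k}(f)$ are taken as averages over the channels, $s^2$ is taken as the sample variance over the training sessions, each probability is taken as a normal density, and phase differences are simply wrapped to the principal value. The names `log_normal_pdf` and `log_likelihood_method2` are hypothetical.

```python
import cmath
import math
from statistics import mean, variance


def log_normal_pdf(x, mu, var):
    """Logarithm of the normal density N(mu, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def log_likelihood_method2(test, training):
    """Sketch of LL(u, g) for Evaluation method 2 (assumed form).

    test[c][f]        : complex Fourier component of the test data
    training[k][c][f] : complex Fourier component of the k-th training session
    Requires at least two training sessions with non-identical values.
    """
    n_channels, n_freqs = len(test), len(test[0])
    ll = 0.0
    for f in range(n_freqs):
        # sigma^2 = 2 s^2 as in (41) and (42); s^2 is assumed here to be
        # the sample variance over the training sessions k.
        var_log = [2.0 * variance([math.log(abs(s[c][f])) for s in training])
                   for c in range(n_channels)]
        var_arg = [2.0 * variance([cmath.phase(s[c][f]) for s in training])
                   for c in range(n_channels)]
        for session in training:
            ratios = [test[c][f] / session[c][f] for c in range(n_channels)]
            d_log = [math.log(abs(r)) for r in ratios]  # log|Yu| - log|Yg|
            d_arg = [cmath.phase(r) for r in ratios]    # arg Yu - arg Yg (wrapped)
            # Assumed estimators: averages of the differences over the channels.
            l_hat, theta_hat = mean(d_log), mean(d_arg)
            for c in range(n_channels):
                ll += log_normal_pdf(d_log[c], l_hat, var_log[c])
                ll += log_normal_pdf(d_arg[c], theta_hat, var_arg[c])
    return ll
```

Under these assumptions, the function would be evaluated once per candidate gas g, and the candidate giving the largest value would be taken as the identification result.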
Results are shown in Table 2. If needed, please refer to the tables of data used for the calculations, which are given near the end of this specification together with the tables of data for Analysis example 3. The case that brings about the largest value of LL(u, g) in each row is indicated in bold. For five of the solvents, namely, water, ethanol, hexane, ethyl acetate, and THF, the training data that brought about the largest likelihood coincided with the solvent of the test data. In other words, the identification of those samples was successful. The training data that brought about the largest likelihood was incorrect only in the case of benzene, for which it turned out to be that of ethanol. Nonetheless, the training data representing benzene as the correct answer showed the second largest likelihood. Thus, the analysis method based on Evaluation method 2 showed the possibility of identifying the gases.
TABLE 2 (values of LL(u, g) recoverable from the original table layout): −915.1, −897.2, −1377.5, −607.9, −677.9
This analysis was conducted based on <Evaluation method 3> in <Theoretical backgrounds>. The measurement data corresponding to 120 seconds were divided into six sessions (K=6); five sessions (the measurement number k=1, 2, 3, 4, 5) were used as the training data while the last session (k=6) was used as the test data. First, the measurement data were subjected to the Fourier transform to obtain the frequency components of the sensor signal. In this analysis example as well, ten components at 0.05, 0.1, 0.15, . . . , 0.5 Hz are used in the analysis as with <Analysis example 2>. Hence, $Y_{q,c}(f)$ is obtained as a 10-dimensional complex vector per channel in each measurement.
Being a complex number, $K'_{q,mn}(f)$ can be expressed as:
[MATH. 50]
$K'_{q,mn}(f)=r_{q,mn}(f)\exp[i\theta_{q,mn}(f)]$ (50)
In this case, regarding the training data, values $r_{g,mn}^{k}(f)$ and $\theta_{g,mn}^{k}(f)$ are assumed to be in conformity to common distributions $N(\hat{r}_{g,mn}(f), \sigma_{r}$ …
Here, if $K'_{u,mn}(f)$ is assumed to be in conformity to a population distribution similar to $K'^{k}_{g,mn}(f)$, then variances of $r_{u,mn}(f)$ and $\theta_{u,mn}(f)$ are thought to be equivalent to variances $s_{r}$ …
[MATH. 53]
σr
[MATH. 54]
σθ
on condition that:
Accordingly, a probability that $K'_{u,mn}(f)$ obtained from the test data matches the training data $K'^{k}_{g,mn}(f)$ (the measurement number k), notated as
$P\left(K'_{u,mn}(f)=K'^{k}_{g,mn}(f)\right)$
is defined as:
Therefore, the probability L(u, g) that u matches the substance g across all combinations of the channels (m, n, m<n), all the training data (the measurement number k), and all the frequencies f (from the lowest frequency $f_L$ to the highest frequency $f_H$) is obtained by:
The logarithmic likelihood LL(u, g) obtained by taking the logarithm of the formula (58) is obtained by:
As a consequence, it is possible to evaluate the probability that the unknown sample u matches the training data of each gas type by calculating the rightmost side of the formula (59) regarding each piece of the test data.
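A corresponding sketch for Evaluation method 3 is given below. It rests on assumptions that are not confirmed by the text above: $K'_{q,mn}(f)$ is taken to be the ratio of the Fourier components of the channels m and n, the variances are taken as twice the sample variance over the training sessions by analogy with the formulae (41) and (42), each probability is taken as a zero-mean normal density of the difference between the test value and the k-th training value, and phase differences are not unwrapped. The names `channel_ratios`, `log_likelihood_method3`, and `VAR_FLOOR` are hypothetical; the variance floor is a safeguard added only for this illustration.

```python
import cmath
import math
from itertools import combinations
from statistics import variance

VAR_FLOOR = 1e-12  # hypothetical safeguard against a zero sample variance


def log_normal_pdf(x, mu, var):
    """Logarithm of the normal density N(mu, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def channel_ratios(session):
    """Return {(m, n): [K'(f), ...]} for m < n, assuming K' = Y_m / Y_n."""
    n_freqs = len(session[0])
    return {
        (m, n): [session[m][f] / session[n][f] for f in range(n_freqs)]
        for m, n in combinations(range(len(session)), 2)
    }


def log_likelihood_method3(test, training):
    """Sketch of LL(u, g) for Evaluation method 3 (assumed form).

    test[c][f] and training[k][c][f] are complex Fourier components.
    """
    k_test = channel_ratios(test)
    k_train = [channel_ratios(s) for s in training]
    ll = 0.0
    for pair, test_values in k_test.items():
        for f, k_u in enumerate(test_values):
            # Polar form of K' as in (50): magnitude r and phase theta.
            r_g = [abs(kt[pair][f]) for kt in k_train]
            th_g = [cmath.phase(kt[pair][f]) for kt in k_train]
            var_r = max(2.0 * variance(r_g), VAR_FLOOR)
            var_th = max(2.0 * variance(th_g), VAR_FLOOR)
            for r_k, th_k in zip(r_g, th_g):
                ll += log_normal_pdf(abs(k_u) - r_k, 0.0, var_r)
                ll += log_normal_pdf(cmath.phase(k_u) - th_k, 0.0, var_th)
    return ll
```

Because the ratio between two channels does not depend on the input under the assumed definition, the comparison in this sketch involves only quantities specific to the sample, which is in line with the identification being possible without monitoring the flow rate.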
Results are shown in Table 3. If needed, please refer to the tables of data used for the calculations that are quoted near the end of this specification together with the tables of data for the analysis example 2. The case that brings about the largest value of LL(u, g) in each row was indicated in bold. Regarding all the solvents, the pieces of the training data that brought about the largest likelihood coincided with those of the test data. In other words, the identification of those samples was successful with the best accuracy among the analyses conducted in the examples. A reason why <Analysis example 3> based on the concept of
$P\left(A_{ug}(f)e^{i\theta_{ug}(f)}\right)$
used for the evaluations in <Analysis example 2> based on the concept of
$P\left(K'_{u,mn}(f)=K'^{k}_{g,mn}(f)\right)$
used for the evaluations in <Analysis example 3> based on the concept of
TABLE 3 (values of LL(u, g) recoverable from the original table layout): −1192.1, −920.6, −3753.3, −800.6, −1621.8, −1077.6
<Data in Calculation Processes>
The complete tables of intermediate data used for calculating the logarithmic likelihood LL from the actual measurement data in Analysis examples 2 and 3 are shown below. In each of Analysis example 2 and Analysis example 3, the data for each of the measured solvents were organized into one table. It is to be noted, however, that each Table is extremely long in the horizontal direction and is therefore divided into four segments in the horizontal direction and listed accordingly. A code appearing on the upper left of and below each Table is a Table identification code formed from [one-digit number (an analysis example number)]−[an abbreviation indicating the solvent]−[one-digit number (a Table segment number)]. The Table segment number indicates which segment of the original Table the relevant segmented Table corresponds to, counted from the left end. For example, a segmented Table with the Table identification code “2-H2O-4” is the rightmost of the four segmented Tables obtained from the original Table organizing the data on water in Analysis example 2.
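The structure of the Table identification code can be illustrated with a small parser. This is a sketch only: the names `TableId` and `parse_table_id` are hypothetical, and the only solvent abbreviation taken from the text is "H2O".

```python
import re
from typing import NamedTuple


class TableId(NamedTuple):
    analysis_example: int  # one-digit analysis example number
    solvent: str           # abbreviation indicating the solvent
    segment: int           # Table segment number, counted from the left end


# Accept both the ASCII hyphen and the minus sign (U+2212) as separators.
_TABLE_ID = re.compile(r"^(\d)[-\u2212](.+)[-\u2212](\d)$")


def parse_table_id(code: str) -> TableId:
    """Split a Table identification code into its three parts."""
    match = _TABLE_ID.match(code)
    if match is None:
        raise ValueError(f"not a Table identification code: {code!r}")
    example, solvent, segment = match.groups()
    if not 1 <= int(segment) <= 4:
        raise ValueError(f"segment number must be 1 to 4: {code!r}")
    return TableId(int(example), solvent, int(segment))


if __name__ == "__main__":
    print(parse_table_id("2-H2O-4"))
```

For the code "2-H2O-4", the parsed segment number 4 identifies the rightmost of the four segments of the Table on water in Analysis example 2.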
It has been discussed earlier that the sample identification device according to the present invention did not always require a pump for providing the sample to the chemical sensor, instruments for controlling flow rates thereof, and so forth. Moreover, it has also been discussed in general terms that any of a source of the sample and the chemical sensor might be held at a proper position by hand instead of completely fixing a positional relationship therebetween with a fixture or the like, and that the input function might be optimized for example by moving the hand or the sample when appropriate, and furthermore, that these measures were sufficiently practical. Now, a specific example of a mode of generating the input function by a manual operation will be described.
While the vapor obtained by the evaporation of the liquid was provided as the sample to the chemical sensor in the aspect shown in
In the meantime, the input functions generated by the aforementioned manual operation preferably comply with the principle of the sample identification method of the present invention described earlier, for example, by including the frequency components in a certain range. How to move the hand in order to generate an appropriate function can be learned relatively easily by carrying out actual measurements several times. Alternatively, it is also easily possible to assist an untrained operator by judging whether or not a manual operation is inappropriate based on a change in the signals outputted from the chemical sensor during the manual operation, and by giving a warning to the operator based on a result of the judgment.
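One conceivable way of judging a manual operation from the sensor signal alone is sketched below. It is an illustration under assumptions rather than a method described in this specification: the function name `weak_frequencies`, the relative threshold of 5%, and the criterion itself (each analysis frequency must carry at least a given fraction of the largest component) are all hypothetical.

```python
import cmath
import math


def weak_frequencies(signal, fs_hz, n_components=10, rel_threshold=0.05):
    """Return the analysis frequencies (Hz) whose component is too weak.

    A frequency is reported when the magnitude of its DFT coefficient is
    below rel_threshold times the largest magnitude among the lowest
    n_components non-zero bins (hypothetical criterion).
    """
    n = len(signal)
    mean_value = sum(signal) / n  # remove the constant offset first
    magnitudes = {}
    for j in range(1, n_components + 1):
        coeff = sum(
            (x - mean_value) * cmath.exp(-2j * math.pi * j * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes[j * fs_hz / n] = abs(coeff)
    peak = max(magnitudes.values())
    if peak == 0.0:
        return sorted(magnitudes)  # flat signal: every frequency is missing
    return sorted(f for f, m in magnitudes.items() if m < rel_threshold * peak)


if __name__ == "__main__":
    # Synthetic signal (illustrative only): the hand moves at 0.25 Hz only.
    fs = 20.0
    signal = [math.sin(2 * math.pi * 0.25 * t / fs) for t in range(400)]
    missing = weak_frequencies(signal, fs)
    if missing:
        print("Warning: vary the hand movement; weak components at", missing)
```

A warning of this kind could prompt the operator to vary the speed of the hand movement until the output contains components over the whole frequency range used in the analysis.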
As described above, according to the present invention, in a measurement of various samples such as a gas measurement using a chemical sensor provided with multiple channels having different characteristics, it is possible to identify a sample only from responses of sensor signals without controlling or monitoring a change in sample introduction with time. Thus, it is not necessary to provide components such as pumps, mass flow controllers, flowmeters, and the like for controlling and monitoring the change in sample introduction with time, so that significant reduction in size of a measurement system can be realized. Meanwhile, the analysis method provided by the present invention is designed to identify an unknown sample by comparing measurement data of the unknown sample with measurement data of a known sample. Accordingly, the accuracy of identification improves as the amount of data of known samples (the training data) increases. As a consequence, it is possible to identify the sample with high accuracy by establishing an environment in which a large amount of training data is accessible through a network, for example by saving the training data on cloud storage.
Therefore, according to the present invention, it is possible to identify a sample only by using a simple and small measurement system that does not need to control or monitor introduction of the sample. Moreover, accumulation of a large amount of training data and access to a database thereof are enabled in combination with cloud computing and the like, whereby accuracy of identification can be dramatically improved. Thus, the present invention is expected to be applied to a wide range of fields including food, safety, environment, medical care, and the like.
Number | Date | Country | Kind |
---|---|---|---|
JP2017-034419 | Feb 2017 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2018/005551 | 2/16/2018 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2018/155344 | 8/30/2018 | WO | A |
Number | Date | Country | |
---|---|---|---|
20190391122 A1 | Dec 2019 | US |