This invention relates to gas phase chemical analysis using absorption spectroscopy.
There are a variety of measurement approaches in which the measured absorption spectrum of a gas sample is used to quantify the concentration of one or more gas phase compounds. Examples of these measurement approaches include Fourier Transform Infrared Spectroscopy (FTIR) as well as a host of laser-based techniques such as Tunable Diode Laser Absorption Spectroscopy (TDLAS), Integrated Cavity Output Spectroscopy (ICOS), and Cavity Ring Down Spectroscopy (CRDS), among many others. These techniques generally all rely on the same basic analysis process: an absorption spectrum of the gas sample is measured, the measured spectrum is fit by a least squares optimization to a model containing a spectral model function for each absorbing compound present, and the fitted amplitudes of the model functions are reported as the compound concentrations.
When only one absorbing compound is present, this process is rather simple. When two or more compounds are present, the reported concentrations can have systematic errors that are often referred to as “crosstalk” between the species. We define crosstalk as an error in the measurement of a given analyte gas due to the presence of one or more other compounds. In general, crosstalk will occur whenever the actual absorption spectrum of a given compound does not match what is in the model. In these instances, it is important to note that there can be reported errors not only in the compound for which the model function is erroneous, but also in the other compounds present in the fit. The least squares optimization will attempt to distribute the error in the one model function among the reported concentrations of the other compounds, in such a way that the overall data-model mismatch is minimized. For example, for a gas sample containing two gases, CO2 and NH3, an error in the CO2 model function will lead to a reporting error in both compounds. The error in the reporting of CO2 is a nonlinearity or inaccuracy that we will not address here; however, the error in the reporting of NH3 falls under the category of crosstalk (“crosstalk of CO2 onto NH3”) and is the topic of this application.
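To make this process concrete, the following minimal sketch fits a synthetic two-compound spectrum as a linear combination of per-compound model functions and reports the fitted amplitudes as concentrations. The frequencies, line shapes, and concentrations are entirely hypothetical; this is an illustration of the analysis pattern, not a description of any particular instrument.

```python
# Minimal sketch of the basic analysis process described above.
# Frequencies, line shapes, and concentrations are hypothetical.
import numpy as np

nu = np.linspace(6547.0, 6549.0, 200)  # frequency axis (cm^-1, assumed)

def line(center, width=0.05):
    """Hypothetical Gaussian model function (absorption per unit concentration)."""
    return np.exp(-((nu - center) / width) ** 2)

model_co2 = line(6547.6)  # model function for CO2
model_nh3 = line(6548.4)  # model function for NH3

# Synthetic "measured" spectrum: 400 ppm CO2 + 10 ppb NH3, plus noise.
rng = np.random.default_rng(1)
measured = 400e-6 * model_co2 + 10e-9 * model_nh3
measured += 1e-9 * rng.normal(size=nu.size)

# Least squares fit: each column of A is one compound's model function,
# and the fitted amplitudes are the reported concentrations.
A = np.column_stack([model_co2, model_nh3])
conc, *_ = np.linalg.lstsq(A, measured, rcond=None)
print(f"CO2: {conc[0] * 1e6:.1f} ppm, NH3: {conc[1] * 1e9:.1f} ppb")
```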
There can be a variety of sources of this model function error, including errors in the underlying spectral parameters (line strengths, widths, and positions), unmodeled dependence of the spectrum on temperature or pressure, and the unmodeled influence of other compounds present in the gas sample.
Generally, the ideal solution is to correct the model in such a way that it correctly incorporates the variable in question, whether that variable is another compound, temperature, pressure, etc. In practice, however, this is not always easy to accomplish, as it may require extensive experimentation to determine the correct spectral parameters, or computing the correct spectrum might be computationally intensive.
A practical approach that is commonly employed is to perform experiments in which the parameters of concern are varied, and the effect of the variation is observed in the reported output concentrations. We return to the concrete example of the instrument that measures NH3 and CO2, where there is an error in the CO2 model function. An experiment is performed in which the NH3 concentration in the gas sample is held fixed, and the CO2 is varied. Due to the error in the CO2 model function, the NH3 is mis-reported. A simple approach is to create a correction term for NH3 that is a Taylor series in CO2 concentration, which will correct the observed crosstalk of CO2 onto NH3.
This conventional approach has several advantages: it is simple to implement, it requires only a small number of correction coefficients, and applying the correction adds negligible computational cost.
This approach can be extended in a straightforward manner to crosstalk from more than one species, or to nonlinear responses as a function of concentration, or even crosstalk where the errors are proportional to the product of the concentrations of two gases, simply by adding more terms to the crosstalk correction. For example, the correction to NH3 due to errors in the CO2 and H2O model functions could be expressed in the following way, where [X] denotes the concentration of species X:

Δ[NH3] = a_1[CO2] + a_2[CO2]^2 + a_3[CO2][H2O] + b_1[H2O] + b_2[H2O]^2
This is a second-order correction which has linear and quadratic terms in the concentrations of CO2 and H2O as well as a bilinear term that is proportional to the product of the two gas concentrations. The coefficients a_i and b_i are determined from an experiment in which the concentrations of the two gases CO2 and H2O are varied while the NH3 concentration is held fixed.
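As an illustration, such coefficients might be determined by ordinary least squares on the crosstalk experiment data. In the sketch below, the experimental arrays and the magnitudes of the synthetic reporting errors are assumptions for illustration only:

```python
# Sketch: determining the legacy correction coefficients by least squares.
# The experiment data (co2, h2o, nh3_err) are synthetic stand-ins for a
# real crosstalk experiment in which NH3 is held fixed.
import numpy as np

rng = np.random.default_rng(2)
co2 = rng.uniform(300e-6, 2000e-6, 50)   # varied CO2 concentrations
h2o = rng.uniform(0.005, 0.030, 50)      # varied H2O concentrations
# Observed NH3 reporting error (reported minus true), synthetic here:
nh3_err = 2e-6 * co2 + 5e-5 * co2 * h2o + 1e-11 * rng.normal(size=50)

# Design matrix with linear, quadratic, and bilinear terms, matching the
# correction form given above.
M = np.column_stack([co2, co2**2, co2 * h2o, h2o, h2o**2])
coeffs, *_ = np.linalg.lstsq(M, nh3_err, rcond=None)  # a1, a2, a3, b1, b2

# In operation, the corrected NH3 reading subtracts the predicted crosstalk:
#   NH3_corr = NH3_raw - M_row @ coeffs
```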
This is a generally effective approach, and it is by far the most common method for correcting crosstalk in spectroscopic instruments. It is, however, not without drawbacks that limit its utility in certain situations: in particular, the correction coefficients are valid only for the specific spectral scan and the specific set of free fit parameters that were in use when the coefficients were determined.
What this means in a practical sense is that, once the correction coefficients are determined, the spectral scan and the components of the fit cannot be changed. This restriction is typically not important for most applications in which laser-based spectrometers are employed: the spectral scan and the free parameters in the fit are normally defined during the manufacturing process and are not changed after shipment.
However, there are applications in which it would be advantageous to change either the data points collected during the spectral scan or the list of free parameters present in the fit. Often, one may want to change these in tandem: adding another compound to the optimization might be accompanied by changes to the spectral scan to optimize the measurement of that other compound. Using the paradigm above, the instrument would have to be tested separately in two modes of operation: one with the original scan and set of free parameters, and one with the new scan and new set of free parameters. For each mode, there would need to be a unique set of crosstalk correction coefficients derived from the analysis of a carefully controlled experiment that is typically performed only when the spectrometer is built, in a specially designed test station. If one would like to add another free parameter to the optimization, the instrument would need to be sent back to the laboratory to perform the requisite crosstalk experiment and derive the appropriate crosstalk correction coefficients.
To avoid this complication, we have created a novel approach to crosstalk correction. Rather than correct the crosstalk effect on the outputs of the least squares optimization, we instead correct the input spectra directly, prior to fitting. We perform the same crosstalk experiment as described above, but with the spectrometer operating in a special mode in which it is collecting data at all candidate spectral points. All future spectral scans will be constructed out of a subset of these spectral points. We then determine correction factors on a per-frequency basis, in which the parameter that we are correcting is the measured spectrometer absorption at each frequency in the set of candidate spectral points. These correction factors take the following form:

Δα(ν_k) = a_1(ν_k)[CO2] + a_2(ν_k)[CO2]^2 + a_3(ν_k)[CO2][H2O] + b_1(ν_k)[H2O] + b_2(ν_k)[H2O]^2

where the corrected absorption at each frequency is α_corr(ν_k) = α_meas(ν_k) − Δα(ν_k).
In other words, we have a system of crosstalk correction factors (one set at each frequency ν_k), where each set of correction factors is a Taylor expansion of the absorption measured at that frequency as a function of the crosstalk parameters (in the example, the concentrations of CO2 and H2O).
To implement these crosstalk corrections, for each frequency measured in the spectrum, the absorption is corrected on the basis of the Taylor expansion prior to the final fit of the data. It is important to note that every parameter in the Taylor expansions must be available as a determined variable prior to this final fit. This can be accomplished, for example, with a pre-fit in which those compounds' concentrations are determined. Clearly, these early estimates of the concentrations might themselves have crosstalk present in their results. This is acceptable, in the sense that the crosstalk corrections can absorb those errors into the overall crosstalk correction. All that matters is that we have repeatable observables that we can use in each Taylor expansion at each frequency.
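A minimal sketch of this correction step follows. The function names, data layout, and the particular predictor terms are assumptions; the per-frequency coefficients would come from the crosstalk experiment described above:

```python
# Sketch of the per-frequency correction step (names and shapes assumed).
import numpy as np

def correct_spectrum(nu, absorbance, coeffs, predictors):
    """Subtract the per-frequency Taylor-expansion correction before fitting.

    nu:         (N,) measured frequencies (a subset of the candidate points)
    absorbance: (N,) measured absorption at those frequencies
    coeffs:     dict mapping each candidate frequency to its array of
                correction coefficients, one per predictor term
    predictors: (p,) predictor values, e.g. [CO2, CO2**2, CO2*H2O, H2O,
                H2O**2], obtained from a pre-fit or other sensors
    """
    corrected = absorbance.copy()
    for k, f in enumerate(nu):
        corrected[k] -= coeffs[f] @ predictors  # Taylor correction at this point
    return corrected

# The corrected spectrum is then passed to the final least squares fit.
```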
The fact that we are correcting the absorption on a per frequency basis means that any spectrum that is a subset of the spectrum used to create the coefficients can be properly corrected. And the fact that we are correcting the absorption prior to fitting means that we can add or remove parameters from the optimization without adversely affecting the efficacy of the crosstalk correction. These changes to the spectral scan and/or the optimization can occur in the field, in real time, while maintaining the crosstalk correction.
This flexibility comes at the cost of additional software bookkeeping to manage a set of crosstalk coefficients for each of the candidate frequencies (which can number in the hundreds or thousands), and some computational overhead from calculating the corrections on the fly for every measured spectrum.
One application of this approach is especially attractive, which is the measurement of trace levels of volatile organic compounds (VOCs) in ambient air. VOCs number in the thousands, and typically have smoothly varying model functions that are not orthogonal to one another. Measuring all possible spectral points and fitting those spectral points with all the compounds simultaneously is sometimes possible, but generally delivers very poor precision and is very slow. Rather, it is better to tailor the frequency points scanned and the suite of compounds to match the sample composition, so that optimized performance can be achieved. Complicating the analysis are the common atmospheric constituents: the primary three optical absorbers (CO2, H2O, and CH4) and the inert compounds (typically N2, O2, and Ar), which do not absorb directly but affect the spectra of the other common optical absorbers. Because these three common absorbers are ubiquitous and absorb so strongly in the near-infrared compared to the absorption of the trace VOCs, it is commonly necessary to perform crosstalk correction. Using the legacy approach of correcting concentrations, one would have to know the mixture of VOCs present in the sample a priori, which is unfortunately not always practical in real applications. However, by applying this novel per-frequency crosstalk correction, one can change the spectral scan and the fit parameters at will, leading to optimized measurement performance without suffering from crosstalk from the common atmospheric species.
In the above discussion, we focused on gas concentrations as the parameters to be used in the Taylor expansion. However, the approach is not restricted to gas concentrations; it can be generalized to any parameter that can be measured either with spectroscopy or with another sensor (e.g., a thermistor), or supplied externally as information about the gas sample, the spectrometer, or the environment in which the gas sample was drawn or the spectrometer is housed.
Section A describes general principles relating to embodiments of the invention, and section B considers some examples in greater detail.
As indicated above, an exemplary embodiment of the invention is a method including: performing optical absorption spectroscopy on a gas sample to provide a raw spectrum having data points at two or more distinct frequencies; determining values of one or more environment variables; correcting each data point of the raw spectrum with a per-frequency correction that is a polynomial in the one or more environment variables, with per-frequency coefficients determined from a prior calibration, to provide a corrected spectrum; and fitting the corrected spectrum to determine one or more analyte concentrations.
The difference between this approach and the conventional approach described above is shown schematically in the accompanying figures. In the conventional approach, the raw spectrum is fitted first, and the crosstalk correction is applied to the concentrations that result from the fit. In contrast, in the present approach the raw spectrum is corrected on a per-frequency basis before any fitting is performed. After the corrected spectrum is determined, a curve 304 is fitted through the data points of the corrected spectrum.
The two or more distinct frequencies are preferably selected from a predetermined frequency list. In such cases the prior calibration preferably includes determining per-frequency coefficients for all frequencies in the predetermined frequency list. In such cases, it is further preferred to delete any data point of the raw spectrum having a frequency difference from a nearest frequency of the predetermined frequency list above a predetermined error threshold. The reason for this is that such data points are presumed to be in error (e.g., from an instrument error or the like), and so their deletion is likely to improve results.
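A short sketch of this pruning step follows; the tolerance value and array-based interface are assumptions for illustration:

```python
# Sketch of pruning raw data points that fall too far from the
# predetermined frequency list (tolerance value assumed).
import numpy as np

def prune_points(nu_raw, y_raw, nu_list, tol=1e-4):
    """Keep only points within `tol` of some entry of the frequency list."""
    nearest = np.abs(nu_raw[:, None] - np.asarray(nu_list)[None, :]).min(axis=1)
    keep = nearest <= tol
    return nu_raw[keep], y_raw[keep]
```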
The one or more environment variables can include, but are not limited to: pressure, temperature, concentrations of one or more absorbing gas species, concentrations of one or more non-absorbing gas species, relative isotope abundance of one or more absorbing gas species, H2O concentration, CO2 concentration, and CH4 concentration.
Values of the one or more environment variables can be obtained in various ways, such as: spectroscopic measurement, sensor measurement, and specified environment conditions.
The prior calibration can include determining the per-frequency coefficients according to a method selected from the group consisting of: calibration measurement and spectral modeling. In other words, any suitable combination of measurement and modeling can be used to determine the per-frequency coefficients.
Suitable optical spectroscopy instruments include, but are not limited to: cavity ring-down spectroscopy (CRDS) instruments and cavity enhanced absorption spectroscopy instruments.
The terms of the polynomial in one or more environment variables can be a selected subset chosen from a master list of correction terms, where this selection of the subset is a data-driven selection. Input for this data-driven selection can come from spectral models and the raw spectrum. The data-driven selection can include performing a Least Absolute Shrinkage and Selection Operator (LASSO) analysis to determine the selected subset on a per-frequency basis.
This example considers LASSO regression in the per-frequency crosstalk problem in more detail. The use of LASSO regression allows one to drop uninformative predictors from a regression problem automatically. This can reduce overfitting, improve coefficient interpretability, and reduce the variance of the coefficient estimates.
To start, it will be helpful to revisit the formulation of the per-frequency crosstalk problem. Suppose we measure room air in our analyzer: we always find H2O, CO2, CH4 (the big three) at fairly large concentrations. We want to remove those contributions, so that our spectrum is only representative of the trace absorbers that we are interested in; the resulting spectrum is called the partial_fit (i.e., “partial_fit” here is the “corrected spectrum” above).
Since our models for the big three are not perfect, the removal does not always work well. Given the imperfections in the big-three models and the relatively high concentrations of those species, interaction effects occur, meaning that each effect cannot be studied separately; this is the above-described crosstalk problem.
For example, if our models were perfect, we would have partial_fit=0 at all frequencies in this case (i.e., no analytes present, perfect removal of spectral interference). That's not what we see in practice. Not only is partial_fit nontrivial compared to typical trace species absorption, but the partial_fit spectra have very different characters at different big three concentrations.
We want to correct the partial_fit and bring it to zero. In this example, our corrections are functions of the concentrations of H2O, CO2, CH4, HDO, and N2 (linear terms) and their interactions H2O^2, H2O·CO2, H2O·HDO, … (quadratic or bilinear terms). Higher order terms and/or additional species (both chemical and isotopic) can be included. (Here HDO is water with one hydrogen atom replaced with deuterium.)
It is convenient to refer to these terms as predictors. These predictors are polynomial terms in the environment variables (here the environment variables are concentrations of H2O, CO2, CH4, HDO, N2, also referred to as base species in the following). More generally, as indicated above, environment variables aren't limited to concentrations and can include any relevant parameters.
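As one possible implementation, the degree-2 predictor expansion can be generated with scikit-learn's PolynomialFeatures. The concentration values below are placeholders:

```python
# Sketch: generating the degree-2 predictor expansion with scikit-learn.
# The concentration values are placeholders.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One row per measured spectrum; columns are the base species.
base = np.array([[0.02, 450e-6, 1.9e-6, 6e-6, 0.78]])  # H2O, CO2, CH4, HDO, N2
poly = PolynomialFeatures(degree=2, include_bias=False)
X = poly.fit_transform(base)  # linear terms, squares, and bilinear products
print(poly.get_feature_names_out(["H2O", "CO2", "CH4", "HDO", "N2"]))
```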
Absorption at each frequency can be impacted differently by the value of these predictors, for spectroscopic reasons (such as the presence of absorption lines for some compounds and/or others): this means that each frequency can and should be analyzed separately (i.e., per-frequency).
For each frequency in the scheme of frequencies Ω, we can use a linear regression model to correct for these crosstalk effects:

y = BX + ε

where:

y: the partial_fit values recorded at each frequency. y ∈ ℝ^N, where N is the number of points in the spectrum.

X: the values of the predictors. They are “concentrations” (quotes because they are concentrations or products of concentrations), and as such they are independent of frequency. X ∈ ℝ^p, where p is the number of predictors.

B: the per-frequency crosstalk correction coefficients. B ∈ ℝ^(N×p), where N is the number of spectral points and p is the number of predictors. Thus we have a per-frequency correction, because the row of coefficients β_ν is, in general, different at each frequency ν.

ε: the residual error. ε ∈ ℝ^N, where N is the number of points in the spectrum.

The values of B are the goal of the crosstalk correction. At each frequency ν, we would like to have one coefficient for every predictor, so that when we multiply the predictor values (“concentrations”) by their crosstalk correction coefficients, we get a spectral correction BX that will bring the partial_fit y to zero (or as close to zero as we can).

Suppose we have B available to us, we measure a spectrum/partial_fit y whose value at frequency ν is y_ν, and we know our concentrations of H2O, CO2, CH4, HDO, and N2 (environment variables). The correction process is then straightforward: form the predictor vector X from the measured concentrations (including the quadratic and bilinear terms), compute the predicted crosstalk BX, and subtract it from the measured partial_fit to obtain y_corrected = y − BX.
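A minimal sketch of this correction process follows, with placeholder data standing in for a measured spectrum and an experimentally determined B:

```python
# Sketch of the correction process with placeholder data; in practice B
# comes from the crosstalk experiment and X from measured concentrations.
import numpy as np

N, p = 300, 20                       # spectral points, predictor terms
rng = np.random.default_rng(3)
B = 1e-9 * rng.normal(size=(N, p))   # per-frequency correction coefficients
X = rng.uniform(size=p)              # predictor values ("concentrations")
y = 1e-8 * rng.normal(size=N)        # measured partial_fit spectrum

y_corrected = y - B @ X              # subtract predicted crosstalk per frequency
```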
A slope term can be added to remove time trends from the crosstalk coefficients. In such cases, a column can be added to X whose values are the time deltas from the first experiment time, Δt_n = t_n − t_0 for n = 0, …, N. This can be expressed in seconds, or hours, or any other time scale. In the crosstalk problem this is often not important, as predictors are typically scaled and centered (more on this later).
Various issues can affect how well the correction works; the most important are the design of the crosstalk experiment and the choice of predictors, discussed in turn below.
The crosstalk experiment entails changing the concentration of one of the base species while keeping the other ones fixed. The base species concentration should span the range of reasonable values expected to be encountered in real-life applications: this will ensure that our linear model will never be operated outside of the tested bounds (i.e., it is preferred to avoid extrapolating).
As indicated earlier, having inadequate predictors can lead to unsatisfactory performance. We often identify the proper predictors by thinking about which predictors are physically reasonable based on our understanding of spectroscopy.
However, as we keep adding base species and higher order predictors to our model, it becomes harder and harder to “guess” which parameters would be helpful for the correction. Adding more and more predictors can be risky: it can lead to overfitting, it inflates the variance of the coefficient estimates, and it makes the resulting model harder to interpret.
These considerations motivate applying LASSO regression in this context.
The linear least squares (LLS) approach can be expressed, for each frequency, as

β̂ = argmin_{β ∈ ℝ^p} ‖y − Xβ‖₂²

where y collects the partial_fit values measured at that frequency over the samples of the crosstalk experiment, X is the corresponding matrix of predictor values, and p identifies the number of predictors.
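For illustration, the per-frequency LLS solution can be computed for all frequencies at once with a single least squares solve. The data below are synthetic, with shapes and magnitudes assumed:

```python
# Sketch: plain LLS solution for the per-frequency coefficients, solved for
# all frequencies at once (synthetic data; shapes assumed).
import numpy as np

n_samples, N, p = 500, 300, 20  # experiment samples, frequencies, predictors
rng = np.random.default_rng(4)
X = rng.normal(size=(n_samples, p))              # predictor values per sample
B_true = 1e-9 * rng.normal(size=(p, N))
Y = X @ B_true + 1e-10 * rng.normal(size=(n_samples, N))  # partial_fits

# lstsq solves one least squares problem per column of Y, i.e., one per
# frequency; B_hat.T is the N-by-p matrix B of the text.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)    # shape (p, N)
```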
Possible issues include: with many predictors, LLS tends to overfit the calibration data; the coefficient estimates can have high variance, especially when predictors are correlated (as concentrations and their products inevitably are); and no coefficient is ever set exactly to zero, so uninformative predictors are retained in the model.
LASSO adds an L1 constraint on the values of the coefficients, modifying the objective to

β̂ = argmin_β ‖y − Xβ‖₂² + λ‖β‖₁

The geometry of this constraint has the effect of shrinking some of these coefficients exactly to zero, leading to a sparse solution when λ is sufficiently large.
We prefer a sparse solution because it will tend to set to zero the values of coefficients of predictors that don't have any predictive ability (they have little to no value in determining a suitable correction for our problem) and it makes our solution easier to interpret. Moreover, it can reduce the variance on the predicted values.
The minimization problem above has one hyper-parameter λ. The higher the value of λ, the higher the strength of the constraint that we are imposing, i.e., the more terms are going to be set to zero.
A preferred way to determine λ is with cross-validation. Cross-validation is based on creating training and test datasets by splitting the original dataset K-fold (typically 5 to 10). We designate 1 fold as the test data and the remaining K−1 folds as training data: we train a family of models (one for each candidate λ value) on the training data, and record the prediction error, MSE (mean square error) or RMSE (root mean square error), evaluated on the test data (a proxy for the generalization error on unseen data). We then repeat this K times (each of the K folds acts once as test data).
For each λ we will then have K error estimates, and we can plot a curve of the prediction error (and its standard error) as a function of λ. We prefer to select the largest λ (strongest penalty) whose mean error is within one standard error of the minimum error: this is commonly known as the “one standard error rule”. This ensures that we are using the simplest model that gives an error within the error uncertainty of the best model.
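A sketch of this procedure at a single frequency is given below, using scikit-learn's LassoCV on synthetic data (scikit-learn calls λ "alpha"); the one-standard-error selection is computed from the cross-validation error path:

```python
# Sketch: cross-validated LASSO at one frequency with the one-standard-error
# rule, on synthetic data. scikit-learn calls lambda "alpha".
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 15))                 # scaled/centered predictors
beta = np.zeros(15)
beta[:3] = [0.5, -0.2, 0.1]                    # only 3 informative predictors
y = X @ beta + 0.05 * rng.normal(size=200)     # partial_fit at one frequency

cv_fit = LassoCV(cv=5, n_alphas=100).fit(X, y)
mse = cv_fit.mse_path_.mean(axis=1)            # mean CV error per alpha
se = cv_fit.mse_path_.std(axis=1) / np.sqrt(5) # its standard error

# One-standard-error rule: the largest alpha (strongest penalty) whose mean
# error is within one SE of the minimum; alphas_ is in decreasing order, so
# the first qualifying index is the largest such alpha.
i_min = mse.argmin()
alpha_1se = cv_fit.alphas_[np.argmax(mse <= mse[i_min] + se[i_min])]
sparse_fit = Lasso(alpha=alpha_1se).fit(X, y)
print("selected predictors:", np.flatnonzero(sparse_fit.coef_))
```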