The disclosures herein relate generally to spectroscopy systems and methods. More particularly, in some examples, the disclosure relates to systems and methods for analyzing unknown sample compositions through optical emission spectroscopy.
Optical emission spectroscopy (OES) is an imaging and analytical technique used to determine the constituents (e.g., elemental composition) of a broad range of samples. In particular, OES techniques excite a sample and measure the light that the sample emits as a result of the excitation. The emitted light is spatially dispersed such that the different wavelengths of the emitted light may be measured, resulting in a measured spectrum. A spectrum that is measured during OES typically comprises multiple sets of atomic emission lines that can be used to determine the constituent elements and/or compounds of the sample. The atomic emission lines are a result of energy released by electrons of the constituent elements as they move from higher to lower electron energy levels of a particular atom. Each atomic element has a unique set of atomic emission lines. Thus, the optical emission spectrum measured during OES may be used to qualitatively identify the constituents of a sample based on a predicted pattern of energy released by the electrons of each constituent element or compound.
In addition to including multiple constituent elements and/or compounds, each constituent of a sample may have a wide range of concentrations. During OES, the amount of light emitted by a particular element (e.g., the amount of light associated with a particular set of atomic emission lines) correlates with the concentration of that particular element within the sample. Thus, the optical emission spectrum obtained during OES can be used to quantitatively determine the concentration of each constituent element within the sample.
Aspects of the disclosure relate to techniques for analyzing unknown sample compositions using a prediction model based on optical emission spectra.
The disclosure provides, for example, a method for estimating an unknown sample composition. The sample composition may include desired analytes along with interfering elements or compounds that may cause spectral interference. Through the use of a prediction model, the method allows for a more accurate estimation of the sample composition (e.g., identities and concentrations of the constituent elements or compounds).
In at least one method, a first emission spectra may be received from a storage, an external computing system, or from one or more detectors of a spectroscopy system. The first emission spectra may correspond to a training sample comprising a plurality of pure elements of known concentrations. One or more processors of a computing system may determine, based on the first emission spectra corresponding to the training sample, a plurality of spectral regions corresponding to the plurality of pure elements of known concentration. The computing system may further determine, for each spectral region corresponding to each pure element of a known concentration, one or more features associated with a signature peak of the spectral region. Thereafter, the computing system may form, for each spectral region corresponding to each pure element of the known concentration, a feature vector comprising the one or more features associated with the signature peak of the spectral region. The computing system may associate the feature vector with the known concentration of the pure element corresponding to the spectral region. Also or alternatively, the computing system may form matrices comprising, for each spectral region, a value based on the feature vector and the known concentration of the pure element corresponding to the spectral region. The computing system may train, based on the associated feature vectors or the matrices, a prediction model to predict unknown concentrations of a plurality of constituents of an unknown sample. A second emission spectra may be received, from one or more detectors of a spectroscopy system. The second emission spectra may correspond to an unknown sample comprising a plurality of constituents of unknown concentrations. The computing system may generate, based on the application of the trained prediction model, a concentration for each of the constituents of the unknown sample.
In certain examples, generating the concentration for each of the constituents of the unknown sample may comprise determining, based on the second emission spectra, a plurality of spectral regions corresponding to the plurality of the constituents of the unknown concentrations. The computing system may determine, for each spectral region corresponding to each of the plurality of constituents of the unknown concentration, one or more features associated with a signature peak of the spectral region corresponding to the constituent of the unknown concentration. The computing system may form, for each spectral region corresponding to each of the plurality of the constituents of the unknown concentration, a feature vector comprising the one or more features associated with the signature peak of the spectral region corresponding to the constituent of the unknown concentration. Furthermore, for each spectral region corresponding to each of the plurality of the constituents of the unknown concentration, the computing system may apply the trained prediction model to the feature vector.
The emission filter has a field of view over which it is illuminated by the emission spectrum. The locations of the emission filter's field of view may be characterized using, e.g., (x, y) position coordinates. For each position within the field of view, the emission spectrum may illuminate the emission filter at a respective angle of incidence. The angle of incidence may influence the measured intensity of the transmission spectrum from the emission filter at that position.
The disclosures below also provide a system that may include, for example, one or more emission filters of an imaging modality that provides emission spectra; and a computing device storing instructions for analyzing unknown sample compositions through spectroscopy using a prediction model. When executed by one or more processors of the computing device, the instructions may cause the computing device to: receive, from the one or more emission filters, first emission spectra corresponding to a training sample comprising a plurality of pure elements of known concentrations; determine, based on the first emission spectra corresponding to the training sample, a plurality of spectral regions corresponding to the plurality of pure elements of known concentration; determine, for each spectral region corresponding to each pure element of a known concentration, one or more features associated with a signature peak of the spectral region; form, for each spectral region corresponding to each pure element of the known concentration, a feature vector comprising the one or more features associated with the signature peak of the spectral region; associate, for each spectral region corresponding to each pure element of the known concentration, the feature vector with the known concentration of the pure element corresponding to the spectral region; train, based on the associated feature vectors, a prediction model to predict unknown concentrations of a plurality of constituents of an unknown sample; receive, from the one or more emission filters, second emission spectra corresponding to the unknown sample comprising a plurality of constituents of unknown concentrations; and generate, based on the application of the trained prediction model, a concentration for each of the constituents of the unknown sample.
In some aspects, the imaging modality may be one or more of inductively coupled plasma optical emission spectrometry (ICP-OES), infrared spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, or ultraviolet (UV) spectroscopy.
It should be appreciated that aspects of the various examples described herein may be combined with and/or substituted for aspects of other examples (e.g., elements of claims depending from one independent claim may be used to further specify implementations of other independent claims). Other features and advantages of the disclosure will be apparent from the following figures, detailed description, and the claims.
The objects and features of the disclosure can be better understood with reference to the drawings described below, and the claims. In the drawings, like numerals are used to indicate like parts throughout the various views.
The inventors have recognized and appreciated that determining the constituents (e.g., elements, compounds, etc.) of a chemical sample by OES is often challenging due to spectral interference, which may result when atomic emission lines from a first element (sometimes referred to as an interfering constituent) within the sample overlap with an atomic emission line of interest from a second element (sometimes referred to as the desired analyte). Spectral interference can be a result of the spectral richness of atomic emission line spectra. Interfering constituents of a sample can form unwanted lines that overlap (directly or partially) with a desired analyte's atomic emission lines in an OES spectrum. Thus, spectral interference may complicate both qualitative and quantitative determinations of the constituents of a sample.
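By way of non-limiting illustration, the overlap described above can be sketched numerically. The following minimal Python sketch models two emission lines as Gaussian profiles and sums their contributions at the wavelength of the desired analyte. The Gaussian line shape and every wavelength, width, and amplitude below are assumptions chosen for illustration; none of them is taken from the disclosure.

```python
import math

def gaussian_line(wavelength_nm, center_nm, amplitude, sigma_nm):
    """Intensity of one emission line, modeled (as an assumption) as a Gaussian."""
    z = (wavelength_nm - center_nm) / sigma_nm
    return amplitude * math.exp(-0.5 * z * z)

def measured_intensity(wavelength_nm, lines):
    """Sum the contributions of every line at a single wavelength."""
    return sum(gaussian_line(wavelength_nm, c, a, s) for c, a, s in lines)

# Hypothetical lines: (center in nm, amplitude, sigma in nm).
analyte_line = (400.00, 100.0, 0.02)
interfering_line = (400.03, 60.0, 0.02)

# Reading at the analyte wavelength without and with the interfering line.
alone = measured_intensity(400.00, [analyte_line])
with_interference = measured_intensity(400.00, [analyte_line, interfering_line])
excess = with_interference - alone
```

In this hypothetical case the interfering line inflates the reading at the analyte wavelength by roughly a fifth, which would bias a concentration estimate that relies on that reading alone.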
Conventional techniques of interference correction involve knowing, a priori, various properties or characteristics (e.g., an identity, a sample composition, etc.) of the unknown sample being analyzed. For example, one conventional technique first determines the composition of the interfering constituent of the unknown sample before performing the OES analysis for a desired analyte. Another conventional technique of spectral interference correction uses reference spectra for blank samples and samples having several standard concentrations of known elements. The inventors have recognized and appreciated that these conventional techniques of spectral interference correction are labor-intensive, time-inefficient and error prone, because various steps of these techniques must be repeated multiple times for each unknown sample being analyzed. Accordingly, the inventors have developed techniques that correct for spectral interference that are simpler and less prone to error than these conventional techniques.
Various implementations of the present disclosure address one or more of the challenges described above. For example, the present disclosure may describe systems, methods, devices, and apparatuses for analyzing unknown sample compositions using a prediction model based on optical emission spectra.
Described herein are techniques for analyzing unknown sample compositions through OES using a prediction model based on optical emission spectra. In some embodiments, the use of the prediction model overcomes the need to determine the compositions of interfering constituents of the sample to correct for spectral interference resulting from the interfering constituents. Removing the need for determining the compositions of interfering constituents reduces the number of steps necessary, thereby saving time and cost and reducing the chance of error. For example, in at least one embodiment, a user of the above-described systems, methods, apparatuses, and devices may only input a “blank” sample and an unknown sample, as will be described below, without the need to know the identities and/or compositions of the interfering constituents in advance or perform additional analyses to determine the identities and/or compositions of the interfering constituents. In various embodiments of the present disclosure, the expected emission spectra of pure elements may be sufficient to allow one to determine what constituent elements and/or compounds make up a sample and at what concentrations. These embodiments disclosed herein may allow users to avoid the laborious and time-intensive steps of conventional techniques of interference correction.
The electric source 102 may be any device that receives electricity (e.g., via a power outlet) and uses the electricity to discharge and/or direct energy toward the sample 110. For example, as shown in
In the example OES system 100 shown in
In the example OES system 100 in
As the discharge 108 strikes the sample 110, the discharge 108 may excite, heat, vaporize, and/or otherwise energize at least some of the sample 110, e.g., constituent elements and compounds at the surface of the sample that comes in contact with the discharged energy. For example, a discharged spark, arc, or flame may cause a small part of the sample to turn into plasma 112, e.g., as shown in
The electrons of the constituents of the plasma 112 may become excited as a result of the energy transferred from the electric source 102 via the discharge 108. As may be known to those having ordinary skill in the art, electrons may be typically situated around an atom in orbits of varying energy levels. Excitation of an electron may cause the electron to move to an orbit corresponding to a higher energy level. The movement of the electron to the orbit corresponding to the higher energy level may leave a vacancy in the orbit corresponding to the former (lower) energy level. This vacancy may cause the atom to become unstable. To stabilize the atom, an electron that is different from or the same as the electron that moved to the higher energy level may move back to the former (lower) energy level having the vacancy. As electrons move from an orbit corresponding to a higher energy level to an orbit corresponding to a lower energy level, energy may be released. The energy release may be in the form of a light or optical emission. Given that each constituent element or compound (e.g., constituent 110A and constituent 110B) may be defined or characterized by a unique atomic structure with unique electron orbits corresponding to electron energy levels, energy released as a result of the above-described process may produce optical emissions of a fixed wavelength or energy of radiation. For example, since differences between two or more energy levels of atomic orbits for a constituent element or compound may already be known, each transition (e.g., an electron moving down from a higher energy level to a lower energy level) may produce a specific optical emission line of a fixed wavelength or energy of radiation.
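By way of non-limiting illustration, the relation between a transition and the wavelength of its emission line can be sketched with the standard photon relation, wavelength = hc / energy. The function name and the energy levels below are hypothetical and serve only to show the arithmetic.

```python
# Planck constant times the speed of light, expressed in eV*nm (rounded).
HC_EV_NM = 1239.841984

def emission_wavelength_nm(upper_level_ev, lower_level_ev):
    """Wavelength of the light released when an electron drops between two levels."""
    energy_ev = upper_level_ev - lower_level_ev
    if energy_ev <= 0:
        raise ValueError("upper level must lie above the lower level")
    return HC_EV_NM / energy_ev

# Hypothetical energy levels, for illustration only.
wavelength = emission_wavelength_nm(upper_level_ev=2.1, lower_level_ev=0.0)
```

With these assumed levels, the 2.1 eV transition corresponds to an emission line near 590 nm. A larger energy difference yields a shorter wavelength.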
As the amounts of energy released through this process are discrete and dependent on the electron orbits that are characteristic of an element or compound, a user may be able to identify an element or compound based on the specific pattern of the amounts of energy released (e.g., an optical emission spectrum). For example, some elements, which typically have electrons at orbits corresponding to very high energy levels, may release very high energy of a specific amount as a result of electrons moving down energy levels. Furthermore, some constituent elements or compounds (e.g., metallic elements) emit optical emissions of many wavelengths after interacting with the discharge 108, thus resulting in an optical emission spectrum of many emission lines. This may be because of a plurality of electron transitions between various combinations of energy levels in the atomic structure of the metallic element, thus emitting energy radiation of many wavelengths. Thus, the specific pattern of energy that is released may be a result of predictable movement of electrons between the atomic orbits that a constituent element or compound is known to have.
The light released by the constituent element or compound through the above-described process may be referred to as “emitted light,” “optical emission,” “emission spectrum,” or “atomic emission.” The emitted light may be of discrete wavelengths and may therefore form distinct optical emission lines depending on the element or compound releasing it.
It is expected that a sample 110 may comprise a variety of constituent elements or compounds (e.g., constituents 110A and 110B). Furthermore, it is expected that an atom of an element or compound may have a plurality of atomic orbits for its electrons, each atomic orbit corresponding to a different energy level. For example, atoms of constituent 110A may have a set of atomic orbits that are distinct from those of the atoms of constituent 110B. This variety may cause the constituent elements or compounds to release a plurality of amounts of energy, which may form emitted energy (e.g., light) of multiple wavelengths, which may be directed to flow through a diffraction grating device 116. For simplicity, the emitted energy (e.g., light) of multiple wavelengths, which stems from the sample and is directed to the diffraction grating device 116, can be referred to as “multiple emission lines” 114. As shown in
The diffraction grating device 116 may separate the incoming multiple emission lines 114, e.g., based on their wavelengths or wavelength ranges. Different amounts of energy released by the constituent atoms as a result of electrons dropping from a higher energy level to a lower energy level in the constituent atoms may correspond with different wavelengths or wavelength ranges of optical emission. As discussed previously, the specific amount of energy released may depend on the atomic structure, e.g., the energy levels of the electron orbits of the atom. Also, as previously discussed, each constituent element and/or compound may have its own unique atomic structure. Thus, by separating the incoming multiple emission lines by wavelengths and/or wavelength ranges, the diffraction grating device 116 may separate the multiple emission lines based on the element and/or compound associated with an optical emission. As shown in
Detector devices 120A-120B can measure the intensity of each wavelength-specific emission line. The detector 120 may be a light-sensitive device that transforms received light into image data. For example, detector 120 may be a charge-coupled device (CCD) detector. A CCD detector, or other like detectors, may include various detector elements that may build up a charge based on the intensity of light. In some aspects of the present disclosure, other detectors of electromagnetic radiation may be used, e.g., photomultiplier tubes, photodiodes, avalanche photodiodes, etc. The intensity measured from the individual optical emission lines 118 may be proportional to the concentration of the element in the sample. Furthermore, from the emission lines, the detector devices 120A-120B may collect the individual emission line's peak signals. Collectively or individually, the detector devices 120A-120B may process the received signals to generate a spectrum showing light intensity peaks as a function of wavelength. Also or alternatively, as shown in
The computer system 124 may determine or acquire the measured intensities from the detectors 120A-120B and process this data to determine the composition of the sample 110 using methods described herein. A user interface of the computer system 124 can be used to display, print or store the measured intensities, wavelengths, and/or estimated sample compositions. Furthermore, the computer system 124 may store and run applications for learning (e.g., via machine learning, convolutional neural networks, etc.) from training samples to better predict the composition of a testing sample. More detail regarding the computer system 124 may be described in conjunction with
It is to be noted that while
Still referring to
The computing system 250 may be programmed with specific instructions to perform the various image processing operations described herein. The computing system 250 can be, for example, a specially-programmed embedded computer, a personal computer such as a laptop or desktop computer, or another type of computer, that is capable of running the software, issuing suitable control commands, and/or recording information in real-time. The computing system 250 may include, be connected to, or be communicatively linked to a display 256 for reporting information to an operator of the instrument (e.g., displaying an optical emission spectrum, peak intensities, peak wavelengths, signature peaks, fitted curves, sample composition predictions, interfering element composition predictions, indicia of spectral interference, etc.), an input device 251 (e.g., keyboard, mouse, interface with optical imaging system, etc.) for enabling the operator to enter information and commands, and/or a printer 258 for providing a print-out, or permanent record, of measurements made by the system and for printing images. Some commands entered at the keyboard may enable a user to perform certain data processing tasks. In some implementations, data acquisition and data processing are automated and require little or no user input after initializing the system.
The computing system 250 may comprise one or more processors 263, which may execute instructions of a computer program to perform any of the functions described herein. The instructions may be stored in a tangible, non-transitory computer medium, such as a read-only memory (ROM) 252, random access memory (RAM) 253, removable media 254 (e.g., a USB drive, a compact disk (CD), a digital versatile disk (DVD)), and/or in any other type of computer-readable medium or memory (collectively referred to as “electronic storage medium”). Instructions may also be stored in an attached (or internal) hard drive 259 or other types of storage media. The computing system 250 may comprise one or more output devices, such as a display device 256 (e.g., to view generated images, optical emission spectra, fitted curves, signature peaks, etc.) and a printer 258, and may comprise one or more output device controllers 255, such as an image processor for performing operations described herein. One or more user input devices 251 may comprise a remote control, a keyboard, a mouse, a touch screen (which may be integrated with the display device 256), etc. The computing system 250 may also comprise one or more network interfaces, such as a network input/output (I/O) interface 257 (e.g., a network card) to communicate with an external network 270. The network I/O interface 257 may be a wired interface (e.g., electrical, RF (via coax), optical (via fiber)), a wireless interface, or a combination of the two. The network I/O interface 257 may comprise a modem configured to communicate via the external network 270. The external network may comprise, for example, local area network, a network provider's wireless, coaxial, fiber, or hybrid fiber/coaxial distribution system (e.g., a DOCSIS network), or any other desired network.
One or more of the elements of the computing system 250 may be implemented as software or a combination of hardware and software. Modifications may be made to add, remove, combine, divide, etc. components of the computing system 250. Additionally, the elements shown in
Computing system 250 may further include one or more applications 260 that may include stored programs, code, or instructions for running via processor 263. For example, applications 260 may include tools for performing various machine learning operations (e.g., machine learning (ML) tools 261), which may include training prediction models for determining the sample composition and spectral interference from an unknown sample. The applications 260 may further include an application for detecting one or more signature peaks from an optical emission spectrum and identifying a known element or compound from the at least one signature peak (element ID tool 262). The applications 260 may rely on image processing of the optical emission spectrum received from the spectroscopy system 210. For example, the application may analyze an image obtained after an optical emission spectrum has undergone image processing externally (e.g., at another computing system or at another component of computing system 250). Also or alternatively, the applications 260 may use the processor 263 to perform image processing on the optical emission spectrum to perform further analysis.
Referring now to the training phase 300A of method 300, step 302A may include receiving a training dataset comprising emission spectra of a known sample composition. A sample composition may be known if the identities and concentrations of the constituent elements or compounds making up at least a significant part of the sample are known. The training dataset may thus include, as its domain, emission spectra for a plurality of sample compositions. Each emission spectrum may include, for example, spectral regions characterized by a signature peak. The signature peak may be associated with a known element or compound. The training dataset may also include, as its range, the identity and concentration of constituent elements and compounds of each of a plurality of samples. In some implementations, the training dataset may be stored in the computing system (e.g., within hard drive 259), and/or may be periodically updated via information regarding other known sample compositions and their respective emission spectra. ML tools 261 may be used to update and/or train a prediction model using the steps described herein. In some aspects, the training dataset may be supplied to the computing system, e.g., from external computing systems, servers, or libraries. In some embodiments, the received optical emission spectra may be digitized and analyzed via an image processor, e.g., for subsequent steps discussed herein. Furthermore, the emission spectra and their respective known sample composition information (e.g., identities of sample constituents and their respective concentrations) may be stored, e.g., within data structures in ML tools 261 of applications 260 and/or within hard drive 259.
At step 302B, the computing system may identify or determine spectral regions from the received emission spectra. The spectral regions may be based on regions of an optical emission spectrum that involve a signature peak. For example, an optical emission spectrum may include a curve that fluctuates in intensity across a wavelength range. A signature peak may appear as a marked increase in intensity around a specific wavelength or wavelength range. The signature peak may contrast with other regions of the curve where the fluctuations are relatively minor.
In some implementations, spectral regions and/or the signature peak of the spectral region may be identified using image processing techniques. For example, one such technique may involve searching for intensity values corresponding to an optical emission curve that exceed and/or fall below a predetermined threshold. Another technique may involve detecting a local or global maximum of the optical emission curve, and a corresponding local minimum on either side of the detected maximum. Based on the span of a signature peak, the optical emission curve may be cropped so that the spectral region comprises at least the wavelength range spanning the signature peak. It is contemplated that the spectral region may include peaks other than the signature peak (e.g., where the signature peak is flanked by one or more peaks of lesser intensity). In such examples, the optical emission curve may be cropped so that the spectral region comprises at least the wavelength range spanning the signature peak and minor peaks adjacent to the signature peak.
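By way of non-limiting illustration, the threshold test, the search for a maximum flanked by a minimum on either side, and the cropping described above can be combined as in the following minimal Python sketch. The function name, the threshold value, and the sample spectrum are hypothetical.

```python
def find_spectral_regions(intensities, threshold):
    """Return (left, peak, right) index triples for peaks above a threshold.

    A peak is a local maximum whose intensity exceeds `threshold`; the region
    extends outward to the nearest local minimum on each side of the peak.
    """
    regions = []
    n = len(intensities)
    for i in range(1, n - 1):
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if not is_local_max or intensities[i] <= threshold:
            continue
        left = i
        while left > 0 and intensities[left - 1] < intensities[left]:
            left -= 1
        right = i
        while right < n - 1 and intensities[right + 1] < intensities[right]:
            right += 1
        regions.append((left, i, right))
    return regions

# Hypothetical spectrum sampled at evenly spaced wavelengths.
wavelengths_nm = [400.00 + 0.01 * k for k in range(11)]
intensities = [2, 3, 2, 5, 40, 90, 35, 6, 3, 4, 3]

regions = find_spectral_regions(intensities, threshold=20)
left, peak, right = regions[0]
# Crop the curve so the region spans the whole signature peak.
cropped = list(zip(wavelengths_nm[left:right + 1], intensities[left:right + 1]))
```

In this sketch the minor fluctuations at either end of the spectrum are local maxima but fall below the threshold, so only the single signature peak is returned. Widening the crop to include adjacent minor peaks would require merging neighboring regions, which is not shown.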
In some embodiments, a signature peak may be characteristic of a constituent element or compound. For example, the wavelength or wavelength range of the signature peak may indicate the amount of energy released as a result of electron transitions across energy levels in an atomic structure, as was previously discussed in conjunction with
For each identified or determined spectral region of each emission spectrum, the computing system may determine one or more features of the spectral region that are predictive of constituent concentrations (e.g., step 304). For example, the one or more features may include, but are not limited to, the raw data points of the spectral region or a signature peak of spectral region 306A, a local or a global maximum of a fitted curve over the spectral region or over the signature peak of the spectral region (e.g., curve fit max 306B), or an area under a fitted curve of the spectral region or an area under a fitted curve of the signature peak of the spectral region (curve fit area 306C). Other features that may be predictive of constituent concentrations may include, for example, data associated with a peak other than a signature peak of a spectral region. For example, such features may include the raw data points of the peak different from the signature peak of the spectral region, a local or a global maximum of a fitted curve over the peak different from the signature peak of the spectral region, or an area under a fitted curve of the peak different from the signature peak of the spectral region.
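By way of non-limiting illustration, the three example features named above (raw data points, curve fit max, and curve fit area) can be computed as in the following minimal Python sketch. The sketch assumes a Gaussian line shape and fits it through the highest sample and its two neighbors; that choice of curve, the function name, and the sample data are assumptions for illustration and are not the fitting method of the disclosure.

```python
import math

def gaussian_peak_features(wavelengths_nm, intensities):
    """Compute three candidate features for one spectral region.

    Fits a Gaussian through the highest sample and its two neighbors.
    Assumes evenly spaced samples, positive intensities, and a peak that
    does not sit at either end of the region.
    """
    k = max(range(len(intensities)), key=intensities.__getitem__)
    step = wavelengths_nm[k + 1] - wavelengths_nm[k]
    log_lo, log_mid, log_hi = (math.log(v) for v in intensities[k - 1:k + 2])

    # Parabola through the three log-intensities: a*x^2 + b*x + log_mid,
    # where x is the wavelength offset from the highest sample.
    a = (log_hi + log_lo - 2.0 * log_mid) / (2.0 * step ** 2)
    b = (log_hi - log_lo) / (2.0 * step)

    sigma = math.sqrt(-1.0 / (2.0 * a))
    fit_max = math.exp(log_mid - b * b / (4.0 * a))
    fit_area = fit_max * sigma * math.sqrt(2.0 * math.pi)
    return {
        "raw_points": list(intensities),
        "curve_fit_max": fit_max,
        "curve_fit_area": fit_area,
    }

# Hypothetical region: a noiseless Gaussian line sampled every 0.01 nm.
wavelengths_nm = [399.98 + 0.01 * i for i in range(5)]
intensities = [50.0 * math.exp(-0.5 * ((w - 400.003) / 0.01) ** 2)
               for w in wavelengths_nm]
features = gaussian_peak_features(wavelengths_nm, intensities)
```

Because the true line center in this example falls between two samples, the curve fit max is higher than the largest raw data point, which is one reason a fitted feature may be more predictive of concentration than the raw maximum.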
In some implementations, step 304 may include determining features of the spectral region that are most predictive of the constituent concentrations by testing a plurality of data points from the spectral region. As discussed above, the actual constituent element or compound can be identified based on the signature peak of a spectral region, and the concentration may already be known because the training dataset may use known sample compositions. The testing of various data points may involve processing the data points using a convolutional neural network, which may involve assigning feature weights to the various data points of the spectral region based on how well they predict the constituent concentration.
The determined or identified features of the spectral region that are predictive of the constituent concentrations may not necessarily be the same for each constituent element or compound. For example, the training dataset may indicate that a curve fit max of a signature peak corresponding to element X is most predictive of element X's concentration in a sample, whereas the training dataset could indicate that a curve fit area of a signature peak corresponding to element Y is most predictive of element Y's concentration in a sample. In other implementations, the identified features of a spectral region that are predictive of the constituent concentrations may be the same for various constituent elements or compounds. The identified or determined features may be saved in hard drive 259, e.g., by the ML Tool application 261. The saved features can be retrieved in the testing phase to determine the most predictive features of an unknown sample's optical emission to determine the unknown sample's composition.
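By way of non-limiting illustration, a per-element choice of the most predictive feature can be sketched by ranking candidate features by how closely their values track the known concentrations. The following minimal Python sketch uses the Pearson correlation coefficient as a simple stand-in for that ranking; it is not the convolutional neural network weighting mentioned above, and the function names and training values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def most_predictive_feature(feature_values, concentrations):
    """Pick the feature name whose values track concentration most closely."""
    return max(feature_values,
               key=lambda name: abs(pearson(feature_values[name], concentrations)))

# Hypothetical training values for "element X" across four known samples.
concentrations_ppm = [1.0, 2.0, 4.0, 8.0]
element_x_features = {
    "curve_fit_max": [10.1, 19.8, 40.3, 79.9],
    "curve_fit_area": [1.4, 1.9, 4.6, 7.1],
}
best = most_predictive_feature(element_x_features, concentrations_ppm)
```

For these assumed values the curve fit max is selected for element X. Running the same function on the training values of another element could select a different feature, consistent with the per-element behavior described above.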
After identifying features of the spectral region that are predictive of concentrations of the various constituents of the sample, the values for these features can be determined, calculated and/or measured.
At step 308, the computing system may generate a training matrix, based on the values of the determined features that are predictive of the constituent concentrations, and the respective constituent concentrations. In some implementations, a training matrix may be generated for each known sample composition and its respective emission spectrum. Also or alternatively, the known sample compositions and the values of the determined features may be arranged in a form other than a matrix. For example, the values of the determined features that are predictive of the constituent concentrations may be arranged as a feature vector. The feature vector may assign or provide variables for weights based on how well a feature predicts the constituent concentration.
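By way of non-limiting illustration, one possible arrangement of the training matrix is a list of rows, one per known sample, with one column per determined feature, paired with a vector of the known concentrations. The record layout, the function name, and the values in the following minimal Python sketch are hypothetical.

```python
def build_training_matrix(samples, feature_names):
    """Return (X, y): one row of feature values and one concentration per sample."""
    X = [[sample["features"][name] for name in feature_names] for sample in samples]
    y = [sample["concentration"] for sample in samples]
    return X, y

# Hypothetical records for one element measured in three known samples.
samples = [
    {"features": {"curve_fit_max": 10.0, "curve_fit_area": 20.0}, "concentration": 1.0},
    {"features": {"curve_fit_max": 20.0, "curve_fit_area": 40.0}, "concentration": 2.0},
    {"features": {"curve_fit_max": 40.0, "curve_fit_area": 80.0}, "concentration": 4.0},
]
feature_names = ["curve_fit_max", "curve_fit_area"]

X, y = build_training_matrix(samples, feature_names)
```

Each row of X is also a feature vector in the sense used above, so the same structure serves either arrangement.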
At step 310, the computing system may train a prediction model, e.g., using the training matrix. The training may rely on one or more machine learning algorithms or statistical methods to determine a mathematical relation between values for the determined features predictive of the constituent's concentration, and the constituent's actual concentration. For example, the training may include using partial least squares regression to determine the mathematical relation between values for the determined features predictive of the constituent's concentration, and the constituent's actual concentration. In at least one embodiment, values for various features of the spectral region may be provided for a variety of constituents of known concentrations in a matrix. The features most predictive of the constituent's concentration as well as the mathematical relationships (e.g., slope) may be determined through the matrix.
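By way of non-limiting illustration, the following minimal Python sketch fits a partial least squares regression with a single response and a single latent component, which is the simplest case of the technique named above. A practical implementation would typically use several components and an established numerical library; the function names and the training values are hypothetical.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 regression; return (x_means, y_mean, coefs)."""
    n, m = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: the feature-space direction of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores along that direction, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coefs = [wj * q for wj in w]
    return x_means, y_mean, coefs

def predict_concentration(model, features):
    """Apply the fitted relation to one feature vector."""
    x_means, y_mean, coefs = model
    return y_mean + sum(c * (f - mu) for c, f, mu in zip(coefs, features, x_means))

# Hypothetical training matrix: columns are curve fit max and curve fit area.
X = [[10.0, 20.0], [20.0, 40.0], [40.0, 80.0]]
y = [1.0, 2.0, 4.0]  # known concentrations
model = fit_pls1_one_component(X, y)
estimate = predict_concentration(model, [30.0, 60.0])
```

The two feature columns in this example are deliberately collinear, as peak height and peak area of the same line often are. Partial least squares handles such collinearity by projecting the features onto a latent direction before regressing, whereas an ordinary least squares fit on the same matrix would be ill-conditioned.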
Step 312 may include storing the trained prediction model in an electronic storage medium. For example, learned relationships between the concentrations of various elements and compounds and the features from the spectral regions of the respective elements and compounds can be stored by the ML Tool application 261 on hard drive 259. Since each element or compound, given its unique atomic structure, provides unique optical emissions as a result of electron transitions, each element or compound can be identified by the wavelength or wavelength range of a unique signature peak. Depending on the spectral regions identified from a testing sample in subsequent steps, the mathematical relationship between features of a spectral region and the concentration can be obtained from the trained prediction model stored in step 312.
Referring now to the testing phase 300B of method 300, step 314A may include receiving testing data comprising an emission spectrum of an unknown sample composition. In contrast to the training dataset, which may comprise emission spectra from multiple samples of known sample composition, the testing data may comprise an emission spectrum for a sample for which knowledge of the sample's composition is desired. Furthermore, the sample corresponding to the testing data may include, along with analytes of interest, interfering elements and compounds that could cause spectral interference. As discussed previously, traditional methods of spectral interference correction may have involved knowing at least the composition (e.g., identities, concentration, etc.) of the interfering elements and compounds. Through the use of a prediction model from training phase 300A, various embodiments of the present disclosure may overcome the need to know the composition of the interfering elements and compounds. Furthermore, the various embodiments provide a more accurate determination of a sample's composition that compensates for the detrimental effects caused by spectral interference, by analyzing the sample's optical emission using the prediction model.
At step 314B, the computing system may identify or determine spectral regions from the received emission spectra. The spectral regions may be based on regions of an optical emission spectrum that involve a signature peak. For example, an optical emission spectrum may include a curve that fluctuates in intensity across a wavelength range. A signature peak would involve a markedly drastic increase in intensity around a specific wavelength or wavelength range. The signature peak may contrast with other regions of the curve where the fluctuations are relatively minor.
In some implementations, spectral regions and/or the signature peak of the spectral region may be identified using image processing techniques. For example, one such technique may involve searching for intensity values corresponding to an optical emission curve that exceed and/or fall below a predetermined threshold. Another technique may involve detecting a local or global maximum of the optical emission curve, and a corresponding local minimum on either side of the detected maximum. Based on the span of the signature peak, the optical emission curve may be cropped so that the spectral region comprises at least the wavelength range spanning the signature peak. It is contemplated that the spectral region may include peaks other than the signature peak (e.g., where the signature peak is flanked by one or more peaks of lesser intensity). In such examples, the optical emission curve may be cropped so that the spectral region comprises at least the wavelength range spanning the signature peak and minor peaks adjacent to the signature peak.
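One way the threshold and local-extremum techniques above could be combined is sketched below. The function name `find_peak_regions`, the sample intensities, and the threshold value are all hypothetical, and the walk to the nearest local minimum on each side is only one possible way to decide where to crop.

```python
def find_peak_regions(intensities, threshold):
    """Return (start, peak, end) index triples for peaks at or above threshold.

    start and end are the nearest local minima on either side of the peak,
    so intensities[start:end + 1] is the cropped spectral region.
    """
    regions = []
    n = len(intensities)
    for i in range(1, n - 1):
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if not is_local_max or intensities[i] < threshold:
            continue
        # Walk downhill to the left until the curve stops decreasing.
        left = i
        while left > 0 and intensities[left - 1] < intensities[left]:
            left -= 1
        # Walk downhill to the right until the curve stops decreasing.
        right = i
        while right < n - 1 and intensities[right + 1] < intensities[right]:
            right += 1
        regions.append((left, i, right))
    return regions


# Hypothetical spectrum sampled at 1 nm steps from 667 nm to 675 nm.
wavelengths = [667, 668, 669, 670, 671, 672, 673, 674, 675]
intensities = [5, 7, 6, 40, 900, 35, 8, 9, 6]

regions = find_peak_regions(intensities, threshold=100)
start, peak, end = regions[0]
cropped_wavelengths = wavelengths[start:end + 1]
```

In this sketch the small bumps at 668 nm and 674 nm are local maxima but fall below the threshold, so only the peak at 671 nm is kept and the region is cropped to 669 nm through 673 nm.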
It is to be appreciated that a signature peak may be characteristic of a constituent element or compound. For example, the wavelength or wavelength range of the signature peak may indicate the amount of energy released as a result of electron transitions across energy levels in an atomic structure, as was previously discussed in conjunction with
For each identified or determined spectral region of each emission spectrum, the computing system may determine one or more features of the spectral region that are predictive of constituent concentrations (e.g., step 316). However, since the constituent concentrations of the testing data are unknown, the features may be determined from the training phase 300A. Based on the identified or determined spectral regions (e.g., from step 314B), and the identified constituent elements or compounds from the signature peaks of the spectral regions (e.g., from step 314C), the computing system may retrieve features determined to be predictive of the identified constituent's concentration, e.g., from step 304.
As previously discussed, the one or more features may include, but are not limited to, the raw data points of the spectral region or a signature peak of spectral region 318A, a local or a global maximum of a fitted curve over the spectral region or over the signature peak of the spectral region (e.g., curve fit max 318B), or an area under a fitted curve of the spectral region or an area under a fitted curve of the signature peak of the spectral region (curve fit area 318C). Also as previously discussed, other features that may be predictive of constituent concentrations may include, for example, data associated with a peak other than a signature peak of a spectral region.
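The three feature types named above can be illustrated with a short sketch. The description does not specify the curve-fitting method, so the choices here are assumptions: the curve fit max is approximated by the vertex of a parabola through the highest point and its two neighbors (which presumes evenly spaced wavelengths), and the curve fit area is approximated by the trapezoidal rule over the raw points. The function name `peak_features` and the dictionary keys are hypothetical.

```python
def peak_features(wavelengths, intensities):
    """Compute raw points, an approximate fitted maximum, and an approximate area."""
    raw = list(intensities)
    k = max(range(len(raw)), key=raw.__getitem__)  # index of the highest point

    fit_max = raw[k]
    if 0 < k < len(raw) - 1:
        y0, y1, y2 = raw[k - 1], raw[k], raw[k + 1]
        denom = y0 - 2 * y1 + y2
        if denom != 0:
            # Vertex of the parabola through three equally spaced points.
            offset = 0.5 * (y0 - y2) / denom
            fit_max = y1 - 0.25 * (y0 - y2) * offset

    # Trapezoidal rule over the cropped spectral region.
    area = sum(
        0.5 * (raw[i] + raw[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(raw) - 1)
    )
    return {"raw": raw, "curve_fit_max": fit_max, "curve_fit_area": area}


# Hypothetical, symmetric peak sampled at 1 nm steps.
features = peak_features([1, 2, 3, 4, 5], [0, 3, 4, 3, 0])
```

A fitted maximum can exceed every raw sample when the true apex falls between two sampled wavelengths, which is one reason a fitted feature may predict concentration better than the raw peak value.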
It is contemplated that the computing system may already have a stored list of predictive features to use based on the constituent element and/or compound identified from the signature peaks of the emission spectra of the unknown sample. As discussed previously in conjunction with the training phase 300A, the stored list of predictive features may be determined based on training from a variety of known sample compositions, which happen to have the constituent elements and compounds identified in the testing dataset.
Based on the features of the spectral region that are prescribed (based on the results of the training phase 300A) to be most predictive of the constituent concentrations, the values for these features can be calculated and/or measured.
The computing system may then use the prediction model trained in the training phase to estimate or predict the constituent concentrations of the unknown sample composition using the determined features from step 316. In at least one implementation, the computing system may generate feature vectors and/or a testing matrix, based on the values of the determined features that are predictive of the constituent concentrations (e.g., as in step 320). Since the constituent concentrations for the testing data are unknown, the feature vectors may be inputted into the prediction model stored in step 312 to estimate the constituent concentrations of the testing data (e.g., as in step 324). For example, weights may be assigned to the feature vector based on how well a feature predicts the constituent concentration. Mathematical relationships learned from the prediction model may thus be applied to the feature vector to calculate the concentration of constituent elements or compounds of the unknown sample that is being tested.
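The application of learned relationships to a feature vector can be sketched as follows. This is an illustration under stated assumptions rather than the disclosed model: each feature is assumed to have a stored concentration-to-intensity ratio, the per-feature estimates are combined by a weighted average, and all weights are set equal. The names `predict_concentration`, `lithium_ratios`, and `equal_weights` are hypothetical. The ratios are derived from the Lithium intensities quoted in this description for a 0.02 ppm sample, and the feature vector uses the intensities quoted for a 0.5 ppm sample.

```python
def predict_concentration(feature_vector, ratios, weights):
    """Weighted average of the per-feature concentration estimates."""
    total_weight = sum(weights[name] for name in feature_vector)
    weighted_sum = sum(
        weights[name] * ratios[name] * value
        for name, value in feature_vector.items()
    )
    return weighted_sum / total_weight


# Concentration-to-intensity ratios (ppm per intensity unit) per signature peak.
lithium_ratios = {
    "460nm": 0.02 / 268,
    "610nm": 0.02 / 152521,
    "671nm": 0.02 / 152521,
}
# Assumption: every feature is weighted equally.
equal_weights = {"460nm": 1.0, "610nm": 1.0, "671nm": 1.0}

# Feature vector: peak intensities measured at the three signature peaks.
feature_vector = {"460nm": 6712, "610nm": 3813022, "671nm": 3813022}

estimate_ppm = predict_concentration(feature_vector, lithium_ratios, equal_weights)
```

With these inputs the estimate lands close to 0.5 ppm. In practice the weights would reflect how well each feature predicted concentration during training rather than being equal.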
Also or alternatively, the determined features may be used to form a testing matrix where the constituent concentrations are yet unknown. An example of the testing matrix is shown and explained in conjunction with
It is contemplated that the constituent elements or compounds whose identities and concentrations are determined or estimated using the steps presented in testing phase 300B may include desired analytes as well as interfering elements or compounds. As discussed previously, in traditional methods, the interfering elements or compounds may typically affect accurate determination of a sample composition by causing spectral interference. However, by relying on a prediction model trained through the use of a plurality of known sample compositions and their respective optical emission spectra, various embodiments of the present disclosure overcome the problems caused by spectral interference.
Referring now to
The rows 402 of the matrix may each correspond to a different feature of the spectral region of the emission spectra of the Lithium samples. The different features may be tested for their ability to predict a constituent's concentration using a training matrix such as training matrix 400A. For example, the feature tested in the first row is the maximum point of a signature peak of the spectral region (“peak intensity of a signature peak”) corresponding to the two Lithium samples. The feature tested in the second row is an area under the fitted curve of the signature peak of the spectral region (“curve fit area”) corresponding to the two Lithium samples. Other rows may be added to correspond with other features of the spectral region of a constituent (e.g., as in Lithium as shown in matrix 400A). Thus, different features may be tested for their ability to predict a constituent's concentration.
The columns 404 of the matrix may each correspond to a different concentration of a constituent that has been identified or hypothesized as being within a sample composition. Thus, in the example training matrix 400, the columns correspond to a Lithium sample at a concentration 0.2 ppm and a Lithium sample at a concentration of 1 ppm. As explained previously, a wavelength or wavelength range of a signature peak may be used to identify or hypothesize the existence of a specific element or compound in a sample composition. Within each row of each constituent, the constituent's concentration may be used in the determination of a mathematical relationship between the concentration and a feature of the spectral region, e.g., to determine the feature that is most predictive of the constituent's concentration. For each entry of the training matrix, the constituent's concentration is shown to be listed above a value corresponding to a feature of the spectral region. While a constituent's concentration may be known for sample compositions being used in the training dataset, the constituent's concentration may not be known for the sample composition being tested. Thus, as will be shown in the testing matrix used in
As shown in training matrix of
Referring now to
As shown in marker 418, the feature which is being measured at these specific wavelengths is the peak intensity. However, a value for a peak intensity (or any other feature) at one or more of these specific wavelengths may be insignificant, for example, because the value may not satisfy a threshold. As shown by marker 420, values that do not meet a threshold may be rendered null (e.g., “0”) in order to discard the peak, amplification, and/or minor fluctuation at the wavelengths corresponding to each of those values as background noise.
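The nulling of sub-threshold values can be expressed in a few lines. In this sketch, "satisfying a threshold" is assumed to mean being greater than or equal to it; the function name `suppress_background`, the sample values, and the threshold of 100 are hypothetical.

```python
def suppress_background(values, threshold):
    """Replace values below the threshold with 0 so they are treated as noise."""
    return [value if value >= threshold else 0 for value in values]


# Hypothetical peak intensities measured at four wavelengths.
measured = [12, 268, 3, 152521]
cleaned = suppress_background(measured, threshold=100)
```

Here the first and third entries are discarded as background noise while the two larger intensities are kept unchanged.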
Nevertheless, as shown by marker 422, the values of the peak intensities for some wavelengths of the emission spectra may be sufficiently high (e.g., because the values satisfy a threshold). For Lithium, these peak intensities may be centered at or near wavelengths of 460 nm, 610 nm, and 671 nm, as shown by marker 416. The peaks corresponding to these wavelengths may be identified as the signature peaks for Lithium. As previously discussed, each known constituent element or compound may be identifiable by its signature peaks. The training matrix 418 may indicate the values of a feature (e.g., peak intensity values) at these signature peaks for the emission spectrum of each sample (e.g., Lithium samples at 0.02 ppm, 0.1 ppm, 0.2 ppm, 0.5 ppm, and 1 ppm, respectively). Thus, the intensity values for the signature peaks for a Lithium sample at 0.5 ppm may include 6,712 at 460 nm, 3,813,022 at 610 nm, and 3,813,022 at 671 nm.
The training matrix (e.g., matrix 418) may be used to determine a prediction model (e.g., a machine learning algorithm, a mathematical relation, etc.) between the concentration of a constituent (e.g., Lithium) and the value of a feature of the emission spectra (e.g., intensity value at the signature peaks). As shown in the matrix 418, there may be a linear relationship between the concentration of Lithium in a sample and the intensity value for each of the signature peaks of the emission spectra of the sample. For example, the intensity value for the signature peak at the wavelength of 460 nm is 268 for a Lithium sample of 0.02 ppm concentration. For a Lithium sample of 0.1 ppm, the signature peak at the wavelength of 460 nm may yield an intensity value of 1342. In both examples, for signature peaks at or near the 460 nm wavelength, the ratio of the concentration to the intensity value is approximately 7.5×10⁻⁵. As can be seen from matrix 418, this ratio of the concentration of Lithium to the intensity value of a signature peak at the 460 nm wavelength may also apply for other concentrations of Lithium. For the signature peaks at the 610 nm and 671 nm wavelengths, the intensity value may be the same for the Lithium sample of 0.02 ppm concentration, e.g., at approximately 152521, as shown in
Thus the training matrix shown in
As noted above, the constituent elements or compounds of an unknown sample may be identified by their signature peaks on the emission spectra of the unknown sample. Thus, training matrices, their corresponding training data, and their resulting prediction models may vary for each compound or element. For example, while
Referring now to
There may also be a linear relationship between the concentration of Calcium and the intensity value for each of the signature peaks. For example, the intensity value for a signature peak at or near the 393 nm wavelength is 630 for a Calcium sample of 0.02 ppm concentration. For a signature peak at or near the 393 nm wavelength, the intensity value is 3150 for a Calcium sample of 0.1 ppm concentration. For both concentrations at or near the 393 nm wavelength, the ratio of the concentration to the intensity value is approximately 3.2×10⁻⁵. The ratio may also apply for the signature peaks of other concentrations of Calcium at or near the 393 nm wavelength. For the signature peaks at the wavelength of 397 nm, the intensity value is also 630 for the Calcium sample of 0.02 ppm concentration and also 3150 for the Calcium sample of 0.1 ppm concentration. Thus, for both examples at or near this wavelength, the ratio of the concentration to the intensity value may remain approximately 3.2×10⁻⁵, and may also apply for the other concentrations of Calcium at these wavelengths. Thus the training matrix shown in
In some embodiments the process of determining which feature is most predictive for determining a sample composition (e.g., as explained in relation to
For each of the signature peaks of each of the spectral regions corresponding to the identified constituents, the peak intensity (and/or other features of the signature peak) can be measured. Thus, as shown in
As shown by marker 464, the signature peak for Lithium at wavelength of 460 nm yields an intensity of 6712. Based on the prediction model for Lithium at a signature peak at or near the wavelength of 460 nm, as discussed in conjunction with
It is contemplated that methods, systems, and processes described herein encompass variations and adaptations developed using information from the examples described herein. While the disclosures have been particularly shown and described with reference to example implementations, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of claimed subject matter.
Throughout the description, where systems and compositions are described as having, including, or comprising specific components, or where processes and methods are described as having, including, or comprising specific steps, it is contemplated that, additionally, there are systems and compositions of the present disclosure that consist essentially of, or consist of, the recited components, and that there are processes and methods of the present disclosure that consist essentially of, or consist of, the recited processing steps.