The present invention relates to spectral image analysis and, in particular, to methods for spectral image analysis by exploiting spatial simplicity.
Several full-spectrum imaging techniques have been introduced in recent years that promise to provide rapid and comprehensive chemical characterization of complex samples. These spectroscopic imaging techniques include Electron Probe Microanalysis (EPMA), Scanning Electron Microscopy (SEM) with attached Energy Dispersive X-Ray Spectrometer (EDX), X-Ray Fluorescence (XRF), Electron Energy Loss spectroscopy (EELS), Particle Induced X-ray Emission (PIXE), Auger Electron Spectroscopy (AES), gamma-ray spectroscopy, Secondary Ion Mass Spectroscopy (SIMS), X-Ray Photoelectron Spectroscopy (XPS), Infrared Spectroscopy (IR), Raman Spectroscopy, Magnetic Resonance Imaging (MRI) scans, Computerized Axial Tomography (CAT) scans, IR reflectometry, Mass Spectrometry (MS), multidimensional chromatographic/spectroscopic techniques, hyperspectral remote imaging sensors, etc. These new spectroscopic imaging systems enable the collection of a complete spectrum at each point in a 1-, 2- or 3-dimensional spatial array. It is not uncommon that these spectral image data sets comprise tens of thousands of individual spectra, or more.
One of the remaining obstacles to adopting these techniques for routine use is the difficulty of reducing the vast quantities of raw spectral data to meaningful chemical information. Multivariate factor analysis techniques have proven effective for extracting the essential chemical information from high dimensional spectral image data sets into a limited number of components that describe the spectral characteristics and spatial distributions of the chemical species comprising the sample. In mathematical terms, given an m-pixel×n-channel matrix of spectral data D, we wish to approximate D by the matrix factorization
D≅CST (1)
Here, C is an m-pixel×p-component matrix describing the distribution of the pure components at each spatial location and S is an n-channel×p-component matrix representation of the pure-component spectra. In the typical case that p<<m and n, the factorization in Eq. (1) accomplishes a large reduction in the dimensionality of the data set with the goal of optimally separating factors embodying real chemical information from those describing only noise.
It is well known, however, that factor-based methods suffer from a “rotational ambiguity.” Given any invertible p×p transformation matrix R, D can be equally well expressed as
D≅CST=(CR)(R−1ST)={tilde over (C)}{tilde over (S)}T (2)
That is, an infinite number of factor pairs {tilde over (C)} and {tilde over (S)} will provide equally good fits to the data. The key to deriving relatively unique factors, then, is to select those factor solutions that satisfy additional optimization criteria. Thus, physically inspired constraints are often employed to derive relatively unique factor models that make the pure components more easily interpretable. In general, one would expect that the extent to which these criteria or constraints actually reflect the physical reality of a given sample, the higher the fidelity and reliability of the derived components.
Principal Component Analysis (PCA), used either by itself or to preprocess data, is the most ubiquitous tool of factor analysis. The constraints imposed by PCA are that the spectral and concentration factors must contain orthogonal components and that the components serially maximize the variance in the data that each accounts for. Neither constraint has any basis in physical reality; thus, the factors obtained via PCA are abstract and not easily interpreted. Alternating Least Squares-based Multivariate Curve Resolution (MCR-ALS) is another common factorization method used for spectral image analysis. This technique may force spectra and concentrations to be non-negative, for instance, yielding more physically realistic pure components. There are many cases, however, in which those constraints are not effective and where alternative approaches may provide new analytical insights.
For many cases of practical importance, imaged samples are “simple” in the sense that they consist of relatively discrete chemical phases. That is, at any given location, only one or a few of the chemical species comprising the entire sample have non-zero concentrations. In the limiting case that each location has only a single chemical component having non-zero concentration, the sample is said to be “perfectly simple.” The methods of the present invention exploit this simplicity in the spatial domain to make the resulting factor models more realistic. Therefore, more physically accurate and interpretable spectral and abundance components can be extracted from spectral images that have spatially simple structure.
The present invention is directed to methods for spectral image analysis that exploit spatial simplicity. A method comprises providing a matrix of spectral image data; factoring the data matrix into the product of a matrix of orthogonal scores vectors and a matrix of orthonormal loading vectors using RCA; determining a rotation matrix; rotating the spatial-domain loading matrix to provide a simplified rotated loading matrix; and inverse rotating the scores matrix to provide estimates of the pure-component spectra. The rotation matrix can be determined using an Orthomax procedure. In particular, rotation of the orthonormal spatial factors arising from PCA can be effective for estimating physically acceptable and easily interpretable pure-component spectra and abundances for spectral images that approximate perfectly simple structure. Furthermore, the rotated factor solutions can be refined to provide pure components that satisfy physically prescribed constraints, such as, non-negativity of spectra and abundances. For example, the rotated loading matrix and the rotated scores matrix can be further refined using constrained least squares, constrained alternating least squares, or non-negative matrix factorization approaches.
Alternatively, spatial simplicity can be imposed as an additional constraint during MCR-ALS procedures. Accordingly, a method for spectral image analysis comprises providing a matrix of spectral image data; obtaining an initial estimate for a spectral shapes matrix; selecting elements of zero abundance in the concentration matrix; obtaining a least squares estimate for the concentration matrix, subject to the zero-equality constraints at the zero abundance matrix elements in addition to standard constraints such as non-negativity at the remaining abundance matrix elements; and obtaining a least squares estimate for spectral shapes matrix, subject to constraints. Either the number or location of the elements to have zero abundance can be selected. These steps can be iterated for a predetermined number of iteration loops or until an acceptable level of convergence is achieved.
The accompanying drawings, which are incorporated in and form part of the specification, illustrate the present invention and, together with the description, describe the invention. In the drawings, like elements are referred to by like numbers.
a) shows the estimated pure-component spectra and abundance maps of the simulated braze interface obtained using a MCR-ALS method.
a) shows a simulation wherein the composition at each pixel is an equal combination of two of the four components comprising the entire sample.
a) shows a simulation wherein the composition at each pixel is an equal combination of two of the four components comprising the entire sample.
a) shows the estimated pure-component spectra and abundance maps of the MEMS device obtained by the MCR-ALS method.
In general, multivariate spectral analysis for chemical characterization of a sample can include: determining the number of chemical species (pure elements and chemical phases or alloys) that comprise an inhomogeneous mixture being imaged; extracting the spectra of these “pure” components (elements or phases); quantifying the amount or concentration of each component present in the sample; and mapping the spatial distribution of these components across the sample.
In general, an image can comprise any arbitrary, multidimensional array of points. The image can include a spatial dimension, such as lines, traditional 2D images, or 3D volumes; or a temporal dimension, such as a time series of images. For example, in the case of a gas chromatography/mass spectroscopy (GC/MS) image, the dimension is a separation coordinate. The multivariate image analysis techniques will be described herein in reference to a spatial dimension. However, it is understood that the techniques can be applied also to non-spatial images, such as those comprising a time series or chromatographic coordinate.
Multivariate spectral analysis can be performed on a full spectrum image that can be represented as a two-dimensional data matrix D. The two-dimensional data matrix D can be obtained by unfolding a measured multidimensional spectral data set. For example, the multidimensional spectra data set D can be a data cube that comprises a 1D spectrum at each pixel on a 2D spatial grid corresponding to the (X,Y) coordinates of the pixel's location. The 2D data matrix D enables the easy and efficient use of standard linear algebra and matrix operations.
The data matrix D can be factored into the product of two matrices, C and ST, according to Eq. (1), where D has dimensions of m×n, and m is the number of pixels and n is the number of spectral channels. The matrix C is a concentration matrix, which is related to the concentration of the chemical phases (e.g., a matrix representing a map of the pure-component abundances) and has dimensions of m×p, where p is the number of pure components. The matrix S is a spectral shapes matrix, which contains information about the spectral shapes of the pure chemical components (e.g., a matrix of the pure-component spectra). S has dimensions n×p.
The factorization of Eq. (1) can be accomplished by an image analysis of D. A number of prior image analysis methods are described in U.S. Pat. Nos. 6,584,413 and 6,675,106 to Keenan and Kotula, which are incorporated herein by reference. A spectral-image factorization method based on PCA has also been described. See M. R. Keenan and P. G. Kotula, Surf. Interface Anal. 36, 203 (2004), which is incorporated herein by reference.
If desired, the measured data matrix D can be weighted, depending on the type of experiment being analyzed, and depending on the properties of the noise or background signal generated during data acquisition. Weighting is generally used whenever the properties of the noise are not uniform throughout the measurement space (i.e., heteroscedastic noise). This is particularly true in the case of “counting” experiments in which the noise is characterized as following a Poisson probability distribution in which the magnitude of the uncertainty varies with the magnitude of the signal. For multi-channel data acquisition the noise is not characterized by a single probability distribution, but rather, by a distribution whose parameters will, in principle, differ from channel to channel and from pixel to pixel within a channel. Additionally, heteroscedasticity can also arise from other effects, e.g., non-uniform detector responses, or mathematical transformations applied to the data (e.g., taking logarithms).
Weighting is also useful when there is a large disparity between the total number of counts (e.g., observations) arising from different elements or phases (e.g., a sample comprising a small amount of contaminant located on top of a substrate made of a single material). Weighting, therefore, is useful for accurately identifying minor phases, trace elements, or subtle gradients in composition across a sample. By properly accounting for experimental noise characteristics, chemically relevant features having small numbers of counts can become more significant, in a least squares sense, than large magnitude noise associated with major spectroscopic features.
Weighting can also be used to account for data “outliers”. Data outliers can include malfunctioning energy channels or pixel elements in a detector array. For example, a dead (i.e., inoperative) energy channel (e.g., zero signal) can be effectively removed from the data matrix D by assigning a sufficiently small weight. Alternatively, for detectors that use a CCD pixel array, an occurrence of a “hot” (i.e., saturated) or dead pixel can be weighted in a similar fashion, namely, by assigning a sufficiently small weight.
Therefore, data matrix D can be weighted to create a weighted data matrix {tilde over (D)}, according to
{tilde over (D)}=GDH (3)
where G is a pre-multiply weighting matrix having dimensions m×m; and H is a post-multiply weighting matrix having dimensions n×n. In general, G is used to weight the row-space of D, and H is used to weight the column-space of D. Obviously, the weighting matrices G and H can be the identity matrix if no weighting is desired.
The matrix G can be used to account for unequal variance in the observations from pixel to pixel, independent of channel number (i.e., energy or wavelength). For example, weighting by G could be used in the case where there are hot or dead pixels, or if there was an array detector where different detector elements had different noise properties. Weighting by G also could be used in the situation where there were much higher total counts in some pixels as compared to others, caused, for example, by unequal dwell times.
The matrix H can account for unequal variance in the observations from channel to channel, independent of pixel location. For example, weighting by H could be used when the noise level varies from one energy channel to the next. For X-ray detection in spectral analysis, the noise will be higher on average in channels that have greater signal on average due to the Poisson nature of a counting experiment.
Spectral image analysis of the weighted data matrix {tilde over (D)} yields {tilde over (C)}=GC, which is the weighted concentration matrix, and {tilde over (S)}T=STH, which is the weighted spectral shapes matrix. The corresponding unweighted concentrations and spectra can be recovered as C=G−1{tilde over (C)} and ST={tilde over (S)}TH−1, respectively. For simplicity, the analysis of the unweighted data matrix D will be described hereinafter, although it will be understood that the method of the present invention can also be applied to the weighted data matrix.
PCA is one of the core techniques of multivariate statistical analysis and it has been employed in numerous and diverse applications including dimensional reduction, data compression, exploratory data analysis, factor analysis, pattern recognition, classification, and multivariate calibration. In particular, PCA can be used in the analysis of hyperspectral image data. See P. Geladi and H. Grahn, Multivariate Image Analysis, Wiley, Chinchester, UK (1996).
The goal of PCA is to extract the useful information in a high-dimension data set into a lower dimension subspace. From a geometric point of view, PCA begins by finding that single direction in the n-dimensional space that best describes the location of the data. The vector describing that direction is the first principal component. Once found, a second direction, orthogonal to the first, is determined that best accounts for the variation in the data that is orthogonal to the first. This is the second principal component. The process continues with each new principal component maximally accounting for the variation in the data that is orthogonal to all preceding components. The first few principal components will contain the chemical information of interest. If there are p such principal components, the remaining n−p principal components are assumed to describe experimental noise or error. Limiting further analysis to the p-dimensional subspace defined by the first p principal components provides the desired dimensional reduction and data compression.
In matrix terms, PCA is concerned with factoring a data matrix D into the product of two other matrices, a scores matrix T whose columns are mutually orthogonal and a matrix P of orthonormal loading vectors, according to
D=TPT (4)
To take a spectroscopic example, the loading vectors P describe the spectral characteristics of the chemical constituents of a sample and the scores T are related to their concentrations. The data set is then seen to be described as a linear combination of basis spectra. An alternative point of view can be obtained by performing PCA on the transposed data matrix DT:
DT=
In this case, the orthonormal loading vectors
The close connection between these two points of view can be made apparent by considering singular value decomposition (SVD) of the data, a method that is often used to compute PCA. SVD performs the factorization
D=UΣVT (6)
If D is an m×n matrix, then U and V are m×m and n×n orthogonal matrices, respectively, and Σ is an m×n diagonal matrix containing the singular values along the diagonal, ordered by decreasing size. The right singular vectors V provide abstract representations of the spectra of the individual chemical components (e.g., elements or phases). The min(m, n) singular values are related to the amount of variance in the data that is accounted for by the corresponding principal components. Specifically, the ith singular value is equal to the square root of the variance accounted for by the ith principal component. The diagonal form indicates that the transformed data are uncorrelated. By decomposing the data into a set of uncorrelated factors of decreasing statistical significance, data compression can be accomplished by selecting those factors having the greatest statistical significance and discarding the rest as noise or error.
SVD has the useful property that the space spanned by the first p columns of V represents, in a least squares sense, the best rank-p approximation to the space spanned by the rows of D. The remaining n−p columns of V represent experimental noise or error and can be discarded. Thus, the scores and loading matrices can be truncated, or compressed, to contain only those vectors corresponding to significant singular values. Letting Vp be the matrix whose columns are the first p columns of V, a p-component PCA model can then be computed, according to
where the columns of Pp are mutually orthonormal and those of Tp are mutually orthogonal. Similarly, the space spanned by the first p columns of U represents, in a least squares sense, the best rank-p approximation to the space spanned by the columns of D. The remaining m−p columns of U represent experimental noise or error and can be discarded. Thus, an alternative PCA factorization of D is
where, again, the columns of
DT=VΣUT=(VΣ)UT=
In the following discussion, the bars will be dropped for clarity and the loading matrix P will represent vectors that are orthonormal in the spatial domain, unless otherwise noted.
The conclusion that arises from the foregoing discussion is that there are two equivalent and equally valid ways to describe the data matrix D in terms of a PCA model. In the picture provided by Eq. (7), the orthonormal basis is spectral in nature, whereas, the model in Eq. (8) has an orthonormal image basis. The difference between the two points of view is subtle but has major implications. These relate to how properties of orthogonal and orthonormal (i.e., orthogonal and normalized to unit length) vectors differ upon rotational transformation, as will be described below.
While the matrices T and P include all of the information contained in the pure components, they do not do so in a chemically recognizable form. Therefore, a single principal component will not, in general, represent either a pure element or a multi-element phase, but rather, a linear combination of such elements or phases. In other words, there may not be a one-to-one correspondence between a selected principal component and a particular chemical phase or pure element. For example, physically admissible concentrations must be non-negative, and the pure-component spectra must be greater than the background signal, whereas a general principal component analysis need not be thus constrained. The principal components produced by PCA often have negative values, which presents a spectrum that is difficult for the practicing analyst to understand. Additionally, a major disadvantage of PCA is that the principal components are constrained to be orthogonal, while any chemical phase that contains overlapping spectral peaks or non-specific spectral backgrounds will necessarily be non-orthogonal. Therefore, subsequent post-processing of results obtained from PCA is useful for transforming the abstract principal components into physically meaningful pure components.
Transformation or “rotation” of PCA loading vectors has been commonly employed, particularly in the psychometric community, to obtain more readily interpretable factors. See H. H. Harman, Modern Factor Analysis, 3rd Revised ed., The University of Chicago Press, Chicago (1976); M. W. Browne, Multivariate Behavioral Research 36, 111 (2001); and M. Forina et al., J. Chemometrics 3, 115 (1988). The general goal of such procedures is to “rotate the factors to simple structure.” Thus, given a matrix of loading vectors P, a rotation matrix R is sought such that the rotated loadings {tilde over (P)}=PR are in some sense simpler and hopefully easier to interpret that the original loadings. As is amply demonstrated by the references, many definitions of “simple structure” have been proposed in the literature, and many algorithms are available to determine R such that the simplicity of {tilde over (P)} is maximized. At a very basic level, however, simplicity just means that there are many zeros in the matrix of loading vectors {tilde over (P)}. A matrix that has perfectly simple structure is one that has one and only one non-zero entry in every row of the matrix.
In the area of spectral image analysis, rotation of the spectral principal components has been used to obtain simpler spectra that may be easier to interpret. Conversely, the present invention uses rotation of the spatial-domain loading vectors to obtain better, more interpretable components in the spectral domain. Accordingly, a rotation matrix R can be determined, for instance, by the Varimax procedure, that maximizes the simplicity of {tilde over (P)}. The inverse rotation can then be applied to the scores matrix T to yield a factor model that fits the original data equally as well as the PCA model. Assuming R is an orthogonal matrix (i.e., a true rotation), then
D=TPT=TRRTPT=(TR)(PR)T={tilde over (T)}{tilde over (P)}T (spectral basis) (10)
D=PTT=PRRTTT=(PR)(TR)T={tilde over (P)}{tilde over (T)}T (spatial basis) (11)
Interestingly, while the columns of the rotated loading matrix {tilde over (P)}, like those of the original loading matrix P, are mutually orthonormal, the columns of {tilde over (T)} have lost the orthogonality present in the original scores matrix T. The implication of this result is that rotation allows the spatial and spectral orthogonality imposed by PCA to be relaxed in either the spatial domain through Eq. (10) or in the spectral domain via Eq. (11).
The key observation, now, is that while spectra are rarely orthogonal, that is, pure-component spectra typically overlap either one another or non-specific backgrounds, samples are often relatively orthogonal in a spatial sense. That is, in a sample with discrete chemical phases X and Y, and assuming boundary pixels make up a small fraction of the total, any particular pixel is likely to represent either phase X or phase Y, but not both. This observation suggests that the factorization approach indicated by Eq. (11) will be effective for the spectral image analysis of a large and important class of analytical problems having spatially simple structure (i.e., substantially comprising pure pixels).
In
The Cu/Ni diffusion couple does not represent a sample with perfectly simple structure. The presence of mixed composition pixels in the diffusion zone introduces a bias into the results. For instance, the centroids of the abundance distributions in the PC2 vs PC1 plot in
A simulation can be used to show that no bias is introduced in a sample that does, in fact, possess perfectly simple structure. In
In
However, the factorization approach of Eq. (11) yields improved results. In
As demonstrated by the foregoing discussion regarding spatially simple samples, excellent spectral estimates can be achieved by first rotating spatial-domain loading vectors to simple structure and then applying the same rotation (for the case of an orthogonal R) to the scores matrix. The pure-component concentration estimates, on the other hand, are still constrained to be orthogonal, which can introduce bias into the results if the samples are not perfectly simple. Frequently, these biases can be compensated by imposing additional constraints, non-negativity for instance, on the derived components subsequent to factor rotation.
Return now to the Cu/Ni diffusion-couple sample shown in
In
Further exclusion of bias requires that refinements be made to both the rotated spectral and spatial components. This refinement can be accomplished using algorithms that impose additional constraints. Typically, the refinement requires an iterative algorithm. Small changes in the spatial-domain components will induce changes in the spectral-domain estimates, which induce additional changes in the spatial components, and so on. For example, the refinement can be accomplished using an alternating least squares algorithm. Alternatively, a particularly suitable family of algorithms for the case of non-negative concentrations and spectra is the maximum-likelihood-based, non-negative matrix factorization (NNMF) method, which has been further elaborated for Poisson data. See D. D. Lee and H. S. Seung, Nature 401, 708 (1999), and P. Sajda et al., Proc. of the SPIE, Wavelets: Applications in Signal and Image Processing X 5207, 321 (2003).
While these NNMF algorithms tend to be slower than MCR-ALS, they do not introduce constraint-induced bias as quickly as do constrained-least-squares-based approaches. See U.S. patent application Ser. No. 10/794,538 to Keenan, which is incorporated herein by reference. However, since these algorithms are being used simply as a refinement step, the number of required iterations is small and performance is acceptable. For Poisson data, Sajda et al.'s algorithm can be applied to the rotated factors and spectral data after unweighting (i.e., in physical space). In
In
D=ABT (12)
where the data factor matrix A comprises spatial information and the data factor B comprises spectral information. At step 20, the data matrix D can be weighted to create a weighted data matrix, according to Eq. (3). For ease of description, the resulting weighted data matrix {tilde over (D)} can be assigned to data matrix D (as indicated by the ← arrow). At step 30, the data matrix D can be factored using PCA and the resulting components truncated to provide Pf and Tf.
In
For the case that the data matrix is represented in factored form, at step 32, the generalized symmetric eigenvalue problem can then be solved, according to
(BTB)(ATA)V=λV (13)
At steps 33 and 34, the PCA model for D can then be computed according to
P=AV (14)
and TT is obtained as the solution to the set of linear equations:
VTT=BT (15)
The PCA will yield a very large number of principal components. In general, the number of principal components representing noise greatly outnumbers the chemically relevant principal components. Therefore, the f most significant components (e.g., corresponding to the f most significant eigenvalues or singular values) are selected at step 35. It is not necessary to know exactly how many significant components f are needed to represent the chemical information, only that the number of f most significant components selected equals or exceeds the number of chemical pure components p in the sample (i.e., f>p). See R. Bro and C. Andersson, Chemometrics and Intell. Lab. Syst. 42, 105 (1998). Accordingly, at step 35, the loading and scores matrices P and T can be truncated to matrices Pf and Tf by retaining only those vectors corresponding to the f most significant components.
Step 35 generates a factored form of the original data matrix, irrespective of whether the full data matrix D or a factored form of the data matrix, A and B, was provided at step 20. It may be desirable, based on performance considerations, to use this factor model in the subsequent refinement steps. In this case, Pf and Tf can be copied to A and B, respectively, in step 36, and stored for future use. The original truncated principal component matrices, Pf and Tf, are returned at step 30.
Returning now to
While the rotated loading matrix is maximally simple, the signs of the rotated principal components suffer a sign ambiguity. In the approximately perfectly simple case, the elements of a given score vector (i.e., a column of Tp) will be either predominantly negative or predominantly positive. Since we desire non-negative spectral components, the columns of Tp that are predominantly negative along with the corresponding columns of Pp are multiplied by −1 at step 61.
If no refinement of the rotated PCA solution is necessary, the loading vectors matrix can be normalized and assigned to the concentration matrix (i.e., C←PP) and the scores matrix can be normalized and assigned to the spectral shapes matrix (i.e., S←TP) and these matrices can be interpreted at step 90.
Alternatively, the rotated PCA solution can be further refined. Only the rotated loading matrix can be refined, or both the loading and scores matrices can be refined. The rotated PCA solution can be refined either prior or subsequent to unweighting. At step 70, the rotated PCA solution can be unweighted prior to refinement by recovering the unweighted loading matrix and unweighted scores matrix using the inverse of the pre- and post-multiply weighting matrices, according to Pp←G−1Pp and Tp←H−1Tp. Obviously, the inverse weighting matrices G−1 and H−1 can be the identity matrix, if the data matrix was not weighted at step 20. At step 75, the unweighted PCA factors Pp and Tp can then be refined using one of the methods described in
In
Step 63 shows a constrained-least-squares approach to refine either the unweighted rotated loading matrix (i.e., at step 75), or the weighted rotated loading matrix (i.e., at step 80), using the data matrix D. The refined loading matrix Pp can be computed from the data and the feasible estimate for the scores matrix Tp by least squares, subject to constraints, according to
A common constraint is non-negativity (i.e., Pp≧0). The refined PCA solution can then be normalized and assigned to the concentration matrix (i.e., C←Pp) and the spectral shapes matrix (i.e., S←TP) at step 75 or 80. If the resulting concentration and spectral shapes matrices C and S are weighted, they can be unweighted at step 85.
Alternatively, at step 64, if the data matrix is represented as an arbitrary factor model (i.e., as ABT) the refined rotated loading matrix can be computed, subject to constraints, according to
If the rotated PCA factors have been unweighted at step 70, a refined factorization including refined estimates of both Pp and Tp can be computed, in general, using an alternating sequence of conditional refinements with the data matrix D, according to step 65. One specific refinement algorithm is based on constrained alternating least squares. Given an initial feasible estimate for the scores matrix Tp, an updated Pp can be computed, subject to constraints, according to
Given this updated Pp, an updated Tp can be computed, subject to constraints, according to
The updated PCA factors can be normalized and Eqs. (18) and (19) can be iterated for a predetermined number of iteration loops or until an acceptable level of convergence is achieved. The refined PCA solution can then be assigned to the concentration and spectral shapes matrices at step 75. A similar refined PCA factorization can be computed if the data matrix is represented as an arbitrary factor model.
A preferred refinement algorithm, at step 65, for the case of Poisson data is the constrained non-negative matrix factorization algorithm of Sajda et al. In
At step 65A, the truncated, rotated PCA factors are used to generate a prediction of the data, which is assigned to a matrix Q (i.e., Q=PPTpT). The matrix Q can then be thresholded, to disallow values below a certain positive value, such as a noise floor (i.e., Q←thresh(Q)). For example, all values below the noise floor can be set equal to a small positive value. The prediction can then be compared with the actual data by taking the element by element quotient
Note that quotient matrix Q will be a matrix of ones, at this point, for the case that the prediction exactly matches the data.
At step 65B, the loading matrix Pp can be updated according to
Pp←Pp×[Q(Tpdiag(1nTTp)−1] (20)
where x represents element-wise multiplication. As a given predicted data element deviates from the actual data element, so does the corresponding element of Q differ from one. Thus, elements whose predictions differ most from the actual data experience relatively larger magnitude adjustments during updating.
At step 65C, an updated quotient matrix Q is computed, using the updated Pp.
At step 65D, an updated scores matrix Tp can be computed using the updated matrix Q, according to
Tp←Tp×[(diag(1mTPp)−1PpT)Q]T (21)
The updated PCA factors can be normalized and steps 65A-D can be iterated for a predetermined number of iteration loops or until an acceptable level of convergence is achieved. The refined PCA solution can then be assigned to the unweighted concentration and spectral shapes matrices C and S at step 75. This approach was used to refine the estimated pure-component spectra in
Steps 66 and 67 show alternative refinement approaches based on the NNMF algorithm of Lee and Seung. This method iteratively updates the non-negative rotated PCA factors under the assumption of uniform Gaussian noise. For heteroscedastic data, these algorithms are appropriately applied in the weighted space.
At step 66, a refined weighted PCA factorization can be computed using the data matrix D. Given an initial feasible estimate for the scores matrix Tp, the loading matrix Pp can be updated, according to
Given this updated Pp, an updated Tp can be computed, according to
The updated PCA factors can be normalized and Eqs. (22) and (23) can be iterated for a predetermined number of iteration loops or until an acceptable level of convergence is achieved. The refined PCA solution can then be assigned to a weighted concentration matrix (i.e., C←PP) and a weighted spectral shapes matrix (i.e., S←TP) at step 80. These weighted matrices can be unweighted at step 85, according to C←G−1C and S←H−1S. This approach was used to refine the estimated pure-component spectra in
Alternatively, at step 67, a refined weighted PCA factorization can be computed using a pre-factored data matrix ABT. Given Tp, an updated Pp can be computed, according to
With this updated Pp, an updated Tp can be computed, according to
The updated PCA factors can be normalized and Eqs. (24) and (25) can be iterated for a predetermined number of iteration loops or an acceptable level of convergence is achieved. The refined PCA solution can then be assigned to a weighted concentration matrix (i.e., C←Pp) and a weighted spectral shapes matrix (i.e., S←Tp) at step 80. These weighted matrices can be unweighted at step 85, according to C←G−1C and S←H−1S. This approach was used to refine the estimated pure-component spectra in
The above description provides methods to analyze imaged samples that are spatially simple and consist of relatively discrete chemical phases. That is, at most pixel locations, it is likely that only one chemical species has a markedly non-zero concentration. PCA loading vector rotation techniques are not effective for analyzing imaged samples having substantial numbers of mixed pixels comprising more than one chemical species. Below is described a method for spectral image analysis of such mixed samples by simplicity-constrained MCR-ALS.
In
In
In
It is interesting to note in
In
At step 140, the concentration matrix C is estimated using spatial simplicity constraints in addition to other standard constraints, such as non-negativity. As indicated by the dotted box 140, step 140A of selecting the number and locations of zeros in the concentration matrix C and step 140B of solving the constrained problem can be performed sequentially, or these steps can be interleaved. Further, there are many approaches that can be used to select the number and locations of zero concentration and solving the constrained problem.
In
This unconstrained solution can be used to estimate the number and locations of zero concentration. Several approaches can be used to select the numbers and locations of the zero matrix elements in the unconstrained solution for the concentration matrix C, including absolute or relative thresholding, fixing the number of zeros, and zeroing the abundance elements globally or on a per component basis. At step 143, the number and locations for the abundance elements in C that are selected to be zero can be represented by a logical matrix V. At step 144, the matrix C is made feasible by setting any remaining negative matrix elements to zero. After identifying the particular abundance elements in matrix C that should be zero, a constrained least squares estimate for C is obtained at step 145:
subject to zero-equality constraints at the particular locations indicated by the logical matrix V, together with other constraints, such as non-negativity, applied to the full matrix. The combined application of equality and inequality constraints (e.g., non-negativity) can be accomplished using the constrained ALS algorithms described in U.S. Pat. No. 7,451,173 to Van Benthem and Keenan, which is incorporated herein by reference.
Returning to
The estimates for C and S can be normalized and steps 140 and 150 can be iterated for a predetermined number of iteration loops or until an acceptable level of convergence is achieved. At step 160, the converged solution can be unweighted, according to C←G−1C and S←H−1S.
Two approaches are evident for selecting the number and locations of elements in the concentration matrix C, at step 143, that will be constrained to equal zero. First, abundance estimates that are smaller than a threshold value can be constrained to equal zero. The threshold can be an absolute fixed abundance level. Alternatively, the threshold can be relative, for example, component abundances that are less than a certain fraction of the total abundance in a given pixel are constrained to equal zero. The threshold can be global, that is, a single value that applies equally to all components, or a different threshold can be applied to each component. In
The second approach simply sets a fixed number of elements to zero. The number of zeros can be fixed globally, or can be set individually for each component. In
Obviously, the number and/or locations of the zeros may vary from iteration to iteration of the MCR-ALS algorithm. As described above, a convenient method for selecting the locations of the zeros uses an unconstrained estimate of the abundance matrix. That is, given the current estimate of the spectral shapes matrix S, an unconstrained estimate of C is obtained by classical least squares at step 142. For the thresholding method, locations of the zeros are identified as those whose unconstrained estimates are smaller than a threshold value. Alternatively, if the absolute number of zeros is to be fixed, the locations of the zeros are chosen to correspond with the fixed number of smallest unconstrained abundance estimates. Since the first step in the constrained least squares algorithm of Van Benthem and Keenan is to obtain an unconstrained solution for C, minor modifications of that algorithm allow the equality-constraint matrices to be computed on-the-fly with little performance penalty.
The total number of zero abundances in the six-block simulation shown in
In the foregoing discussion, the numbers of zeros were selected by visual inspection. Standard image processing techniques can also be used to select the threshold/fixed number of zeros in an automated way, thus enabling an adaptive simplicity-constrained MCR-ALS algorithm. In
This histogram processing approach can also be applied iteratively. In
A relative thresholding approach to selecting the numbers and locations of the abundance elements to be constrained to equal zero can be described with reference to the analysis presented in
a) shows the pure-component spectra and abundance maps, as estimated by standard non-negativity constrained MCR-ALS for 3 components. The estimated spectra have non-physical shapes and are difficult to interpret. The physical significance of the third (bottom) component is difficult to discern since appreciable intensity is found at most pixels.
The present invention has been described as methods for spectral image analysis by exploiting spatial simplicity. It will be understood that the above description is merely illustrative of the applications of the principles of the present invention, the scope of which is to be determined by the claims viewed in light of the specification. Other variants and modifications of the invention will be apparent to those of skill in the art.
This is a divisional application of application Ser. No. 11/233,223, filed Sep. 22, 2005, now U.S. Pat. No. 7,725,517, which claims the benefit of U.S. Provisional Application No. 60/614,867, filed Sep. 29, 2004, both of which are incorporated herein by reference.
This invention was made with Government support under contract no. DE-AC04-94AL85000 awarded by the U.S. Department of Energy to Sandia Corporation. The Government has certain rights in the invention.
Number | Name | Date | Kind |
---|---|---|---|
6377907 | Waclawski | Apr 2002 | B1 |
6584413 | Keenan et al. | Jun 2003 | B1 |
6675106 | Keenan et al. | Jan 2004 | B1 |
7127095 | El Fakhri et al. | Oct 2006 | B2 |
7212664 | Lee et al. | May 2007 | B2 |
Number | Date | Country | |
---|---|---|---|
60614867 | Sep 2004 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 11233223 | Sep 2005 | US |
Child | 12702934 | US |