This disclosure relates to image processing, including object recognition.
Reconstructing a digital image and/or digital signal from measurement data is generally a linear inverse problem. A variety of techniques have been used for image estimation, including iterative approaches and deep learning approaches. Conventional techniques using either of these approaches suffer from several disadvantages. In conventional iterative imaging systems, disadvantages include the limitation to a narrow set of signal priors (statistics), slow convergence, and an inability to perform real-time imaging.
For example, the Iterative Shrinkage and Thresholding Algorithm (ISTA) restricts the prior distribution of the desired signal to the Laplace distribution, which does not capture desired statistical properties including self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients, and it is slow to converge.
The Compressive Sampling Matching Pursuit (CoSaMP) algorithm likewise restricts the prior distribution of the desired signal to the Laplace distribution, which does not capture desired statistical properties including self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients. Its success is highly dependent on signal sparsity, an assumption that is not accurate for many signals and images.
The l1 least squares (l1-LS) method restricts the prior distribution of the desired signal to the Laplace distribution, which does not capture desired statistical properties including self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients. It can also be slow to converge.
The Bayesian Compressive Sensing (BCS) algorithm restricts the prior distribution of the desired signal to the Student's t distribution, which does not capture desired statistical properties including self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients. It enforces higher levels of signal sparsity, which is not accurate for many signals and images, and it is slow to converge.
The Hierarchical Bayesian Maximum a Posteriori (HB-MAP) algorithm uses a compound Gaussian prior distribution, with the scale variable fixed to a log-normal distribution, to iteratively solve a linear inverse problem with particular application in tomographic imaging. This method outperforms comparative iterative estimation approaches, with Compound Gaussian Least Squares (below) performing comparably, but it is computationally intensive and slow to converge.
The Compound Gaussian Least Squares (CG-LS) algorithm uses a compound Gaussian prior distribution, with the scale variable fixed to a log-normal distribution, to iteratively and alternately solve a linear inverse problem with particular application in tomographic imaging. This method outperforms comparative iterative estimation approaches, with HB-MAP performing comparably, but it is slow to converge.
In conventional deep learning imaging systems, disadvantages include the reliance on significant amounts of imaging training data and the black-box nature of these systems. For example, Reconstruction Network (ReconNet) uses a standard convolutional neural network to solve a linear inverse problem with particular application in compressive sensing. In using standard convolutional layers, this method is a black-box process where there is little understanding as to what each layer in the neural network is representing. The method does not incorporate any prior information about the signals of interest. These factors consequently require ReconNet to be trained on a significant number of training samples to be accurate.
The Iterative Shrinkage and Thresholding Algorithm Network (ISTA-Net) structures a deep neural network around the iterative shrinkage and thresholding algorithm for solving linear inverse problems with particular application in compressive sensing. The proximal operator evaluation in ISTA is replaced by a sequence of convolutional layers in ISTA-Net. ISTA-Net restricts the prior distribution of the desired signal to the Laplace distribution, which does not capture desired statistical properties including self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients. The method performs well when trained on a significant amount of training data, but it stagnates when few training samples are available.
The Memory Augmented Deep Unfolded Network (MADUN) structures a deep neural network around the iterative shrinkage and thresholding algorithm for solving linear inverse problems with particular application in compressive sensing. The proximal operator evaluation in ISTA is replaced by a convolutional long short-term memory network in MADUN. This method does not incorporate any prior information about the signals of interest and instead learns this information from data provided to the network. Consequently, MADUN requires a significant number of training samples to train the network to sufficient accuracy. A significant component of this method, the long short-term memory networks, is a black-box process where the precise relationship between the signal and the network layer outputs is unclear.
The accompanying drawings, which are incorporated in and constitute part of the specification, illustrate embodiments of the disclosure and, together with the general description given above and the detailed descriptions of embodiments given below, serve to explain the principles of the present disclosure. In the drawings:
an example deep neural network with an input layer L0 ∈ ℝ^3, a first hidden layer L1 ∈ ℝ^5, a second hidden layer L2 ∈ ℝ^4, a third hidden layer L3 ∈ ℝ^2, and an output layer L4 ∈ ℝ^5 in accordance with an embodiment of the present disclosure;
Features and advantages of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.
In the following description, numerous specific details are set forth to provide a thorough understanding of the disclosure. However, it will be apparent to those skilled in the art that the disclosure, including structures, systems, and methods, may be practiced without these specific details. The description and representation herein are the common means used by those experienced or skilled in the art to most effectively convey the substance of their work to others skilled in the art. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the disclosure.
References in the specification to “one embodiment,” “an embodiment,” “an exemplary embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to understand that such description(s) can affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
Embodiments of the present disclosure use a deep neural network for image estimation using a (learned) compound Gaussian prior. For example, embodiments of the present disclosure use an unrolled deep network that solves linear inverse problems with particular application in tomographic imaging and image compressive sensing. Systems and methods in accordance with embodiments of the present disclosure result in image reconstructions with a higher similarity index than those produced by conventional methods. Image reconstructions enabled by embodiments of the present disclosure are useful in a variety of applications, including radar, sonar, medical, and tomographic imaging systems.
Embodiments of the present disclosure provide a deep neural network, Deep Regularized Compound Gaussian Network (DR-CG-Net), that is formed from applying algorithm unrolling to a generalized Compound Gaussian Least Squares (CG-LS) iterative algorithm. The generalized CG-LS algorithm incorporates a compound Gaussian prior, which decomposes a signal of interest into the product of a scale variable component and a Gaussian component. The compound Gaussian prior is a generalization of the assumed Laplace prior and Student's t distribution and has been shown empirically to better capture statistical properties of signals of interest. Additionally, the generalized CG-LS algorithm does not fix the scale variable distribution and instead retains an implicit distribution that can be specified on an application-specific basis.
In an embodiment, DR-CG-Net learns the scale variable distribution from the training data and is the first deep neural network to incorporate learning a prior distribution while still constraining to a specific class of priors. In an embodiment, DR-CG-Net constrains to the powerful class of compound Gaussian priors. In doing so, DR-CG-Net maintains the useful impact of the compound Gaussian prior, but has greater flexibility and learning capacities within this class of priors to represent signals.
Systems and methods according to embodiments of the present disclosure significantly outperform comparative methods, or perform comparably to them, in tomographic imaging and compressive sensing when only a small amount of training data is available. Additionally, when larger amounts of training data are available, systems and methods according to embodiments of the present disclosure outperform conventional tomographic imaging techniques.
It is an aspect of the present disclosure to provide a method for reconstructing an image. The method includes accessing measurement data with a computer system, where the measurement data have been acquired from a subject using an imaging system. A neural network is also accessed with the computer system, where the neural network has been trained on training data to reconstruct an image from data consistent with the measurement data. The neural network includes an unrolled iterative reconstruction algorithm that implements a compound Gaussian prior. A reconstructed image is generated from the measurement data by inputting the measurement data to the neural network using the computer system, thereby generating the reconstructed image from an output of the neural network. The reconstructed image may then be output with the computer system.
It is another aspect of the present disclosure to provide a method for training a neural network to reconstruct an image from measurement data. The method includes accessing training data with a computer system, where the training data include matched pairs of measurement data and sparse domain coefficients. A neural network is trained on the training data, using the computer system, to learn parameters of the neural network. The neural network includes an unrolled iterative reconstruction algorithm implementing a compound Gaussian prior. The neural network and the learned parameters are stored with the computer system as a trained neural network.
Described here are systems and methods for reconstructing images and/or signals using a deep learning model that incorporates a compound Gaussian prior. As a non-limiting example, the deep learning model can be implemented as a deep neural network that incorporates a broad compound Gaussian prior distribution for the signal-of-interest (e.g., the image or signal to be reconstructed) into a deep learning framework for solving general linear inverse problems. In some implementations, the disclosed systems and methods may be referred to as a “CG-Net.” Unlike previous unrolling approaches, the systems and methods described in the present disclosure do not need to replace any part of the original optimization of the iterative reconstruction framework with a neural network structure that must be learned from scratch, which provides for a better interpretability of the CG-Net.
It is an aspect of the systems and methods described in the present disclosure to provide a deep learning-based reconstruction algorithm that incorporates a compound Gaussian prior. The compound Gaussian prior may be a generalization of the Laplace prior and/or Student's t-distribution. Advantageously, the compound Gaussian prior can better capture statistical properties of the signals of interest. Accordingly, by incorporating a compound Gaussian prior into the reconstruction techniques described in the present disclosure, an improvement in reconstructed signal quality can be achieved relative to previous deep learning-based reconstruction techniques, even when only a small number of training data samples is available. Moreover, the incorporation of the compound Gaussian prior in the disclosed systems and methods provides a more computationally efficient implementation that is faster and easier to implement in a deep neural network structure. Accordingly, the functioning of the computer system implementing the reconstruction techniques described in the present disclosure is also improved by way of the reduced computational burden on the limited resources of the computer system (e.g., reduced memory burden, reduced computational complexity).
Image reconstruction is often formulated as an underdetermined linear inverse problem. In other words, the forward measurement model may be formulated as:

y = Ψx + ν,   (1)

where x ∈ ℝ^n is a vectorized, n = N×N, image observed through a measurement matrix Ψ ∈ ℝ^{m×n}, with additive white noise, ν ∈ ℝ^m, and where y ∈ ℝ^m are the underdetermined linear measurements. Typically, m ≪ n, implying that obtaining the signal or image, x, given the measurements, y, does not produce a unique solution. However, exploiting the sparsity of images under a change of basis makes it possible to uniquely reconstruct an image with measurements sampled below the Nyquist frequency. Letting Φ ∈ ℝ^{n×n} be a sparsity-inducing transformation, such as the discrete Fourier transform or a wavelet dictionary, letting c ∈ ℝ^n be the sparse coefficients of x, and defining A = ΨΦ, Eqn. (1) can be rewritten as:

y = Ac + ν.   (2)
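To make the notation concrete, the following sketch builds the measurement model of Eqn. (2) numerically. The random Gaussian Ψ and the DCT basis used for Φ are illustrative stand-ins only, not operators required by the present disclosure.

```python
import numpy as np
from scipy.fftpack import idct

# Illustrative construction of y = Psi x + nu = A c + nu (Eqn. (2)); the
# matrices here are stand-ins, not the operators of any particular embodiment.
rng = np.random.default_rng(0)
n, m = 64, 32                                    # signal length and number of measurements, m << n

Psi = rng.standard_normal((m, n)) / np.sqrt(m)   # measurement matrix
Phi = idct(np.eye(n), norm="ortho", axis=0)      # sparsity transform: x = Phi @ c
A = Psi @ Phi                                    # combined operator A = Psi Phi

c = np.zeros(n)
c[rng.choice(n, size=5, replace=False)] = rng.standard_normal(5)   # sparse coefficients
nu = 0.01 * rng.standard_normal(m)               # additive white noise
y = A @ c + nu                                   # underdetermined linear measurements
```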
Now, the inverse estimation problem to Eqn. (2) aims to recover the sparse coefficients, c, from the measurements, y, corresponding to an observation method, Ψ, and sparsity transform, Φ, chosen a priori. Often, image estimation is solved by an iterative algorithm, which involves the minimization of an objective function and may include subspace projections between minimization steps. Many iterative algorithms for image estimation have been developed, including Iterative Shrinkage and Thresholding, Basis Pursuit, Bayesian Compressive Sensing (BCS), and Compressive Sampling Matching Pursuit (CoSaMP).
One useful way to formulate inverse problems in imaging is by Bayesian estimation. In particular, consider the maximum a posteriori (MAP) estimation of c from Eqn. (2):

c* = argmax_c p(c|y) = argmin_c {−log p(y|c) − log p(c)},

which can, equivalently, be viewed as a regularized least squares (RLS) optimization:

c* = argmin_c ||y − Ac||_2^2 + R(c).

As the regularization satisfies R(c) ∝ −log(p(c)), the choice of prior density, p(c), of the sparse coefficients, c, is a useful component for incorporating domain-level knowledge into the image reconstruction problem.
With that in mind, the systems and methods described in the present disclosure make use of a compound Gaussian prior. Sparse coefficients of natural images exhibit self-similarity, heavy-tailed marginal distributions, and self-reinforcement among local coefficients. Such properties are not encompassed by the generalized Gaussian prior typically assumed for the image sparsity coefficients. Instead, a class of densities known as compound Gaussian (“CG”) densities, or Gaussian scale mixtures, better captures the statistical properties of sparse coefficients of natural images and images from other modalities, such as radar. A useful formulation of the CG prior lies in modeling the sparse coefficients of images as the Hadamard product:
c = z ⊙ u,

such that z = h(x), where h: ℝ → ℝ is a componentwise, positive, nonlinear function, x follows a multi-scale Gaussian tree process, u ~ N(0, Σ_u), and u and z are independent random variables. The CG prior subsumes many well-known distributions, including the generalized Gaussian, allowing an interpretation of the CG prior as a generalization of compressive sensing work.
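For illustration only, a sample from a compound Gaussian prior can be drawn as follows. The i.i.d. Gaussian x and identity covariance for u are simplifying assumptions of this sketch; the disclosure models x with a multi-scale Gaussian tree process and u with a general covariance Σ_u.

```python
import numpy as np

# Draw one sample c = z ⊙ u from a compound Gaussian (Gaussian scale mixture)
# prior with z = h(x) = sqrt(exp(x / alpha)).  Simplifications: x is i.i.d.
# Gaussian and u ~ N(0, I), rather than the richer models in the disclosure.
rng = np.random.default_rng(1)
n, alpha = 64, 1.0

x = rng.standard_normal(n)              # scale-driving Gaussian variable
z = np.sqrt(np.exp(x / alpha))          # positive, componentwise nonlinearity h(x)
u = rng.standard_normal(n)              # Gaussian component
c = z * u                               # Hadamard product: compound Gaussian sample
```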
Deep neural networks (DNN) generally include three main layer types: input, hidden, and output. An example DNN, in accordance with an embodiment of the present disclosure, has an input layer L0 ∈ ℝ^3, a first hidden layer L1 ∈ ℝ^5, a second hidden layer L2 ∈ ℝ^4, a third hidden layer L3 ∈ ℝ^2, and an output layer L4 ∈ ℝ^5. The input layer, L0, is assigned the input data to the network, such as the measurements, y, from Eqn. (2). The output layer, L4, returns the desired output from the network, when provided an input, such as the wavelet coefficients, c, from Eqn. (2). Finally, the hidden layers, L1, L2, and L3, are transformations of the input data that produce the desired output. In general, L_i ∈ ℝ^{d_i}.
Layers of a DNN are connected by functions, ƒ_i, parameterized by θ_i, such that

L_{i+1} = ƒ_i(L_i; θ_i),

where L_i is network layer i and the network contains k hidden layers. For a standard, fully connected, neural network (NN),

ƒ_i(x; θ_i) = σ(W_i x + b_i),

where x ∈ ℝ^n is an input vector, W_i ∈ ℝ^{m×n} is a weight matrix, b_i ∈ ℝ^m is a bias vector, and σ: ℝ → ℝ is a componentwise activation function. Standard activation functions include the sigmoid, Rectified Linear Unit (ReLU), and hyperbolic tangent. In networks described in the present disclosure, the input may include noisy Radon transform measurements, or other medical imaging or other imaging data, y_i ∈ ℝ^m, from Eqn. (2), and the output layer returns the estimated wavelet coefficients, denoted by c(y_i; Θ) ∈ ℝ^n, where Θ = [θ_0, . . . , θ_k] are the network parameters.
Using a training dataset, 𝒟 = {(y_i, c_i): i = 1, 2, . . . , M}, the network parameters, Θ = [θ_0, . . . , θ_k], may be learned by optimizing a loss or error function, ℒ(Θ), between the network outputs, c(y_i; Θ), and the actual wavelet coefficients, c_i. Common loss functions for image reconstruction neural networks include mean-squared error (MSE), normalized mean-squared error (NMSE), peak signal-to-noise ratio (PSNR), SSIM, or linear combinations of these.
In some implementations, algorithm unrolling may be used when constructing the deep learning-based machine learning models described in the present disclosure. Algorithm unrolling structures a DNN based upon an iterative image estimation algorithm (IIEA). Algorithm unrolling places the operations from each step j of the IIEA as the function ƒj defining the layers in a DNN. Then the parameters, θj, on each step j of the IIEA parameterize ƒj in the DNN. By training the unrolled DNN, each θj is learned, which optimizes the IIEA, to produce improved image estimates. Algorithm unrolling can provide significant performance improvements in image reconstruction while offering simple interpretability of the network layers.
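As a toy illustration of algorithm unrolling (using a simple ISTA recursion rather than the CG-LS algorithm unrolled in the present disclosure), each iteration below plays the role of one network layer, and the per-layer step size and threshold stand in for the parameters θ_j that would be learned during training.

```python
import numpy as np

def soft_threshold(v, t):
    """Componentwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_ista(y, A, layer_params):
    """Toy unrolled ISTA: layer_params = [(eta_1, tau_1), ..., (eta_K, tau_K)],
    one (step size, threshold) pair per layer.  In an unrolled network these
    per-layer parameters would be learned from training data."""
    c = np.zeros(A.shape[1])
    for eta, tau in layer_params:                              # layer j = iteration j
        c = soft_threshold(c - eta * (A.T @ (A @ c - y)), tau)
    return c
```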
It is an aspect of the systems and methods described in the present disclosure to provide an iterative image estimation algorithm, referred to as compound Gaussian least squares (CG-LS), for solving Eqn. (2) with a general sensing matrix Ψ, under the assumption of a global compound Gaussian prior. Informed by the statistical representation of image coefficients through the compound Gaussian prior, the CG-LS iterative image reconstruction algorithm, which enforces the compound Gaussian prior, is used as the reconstruction framework for developing the CG-Net deep learning reconstruction model. CG-LS is based on an RLS optimization where a regularization term is chosen to enforce the CG prior. Furthermore, the CG-LS algorithm is unrolled into a DNN that may be referred to as CG-Net. For training CG-Net, images are estimated from Radon transform measurements, or other medical imaging or other imaging data types, as these data types underlie many imaging applications, including microwave, acoustic, and medical imaging modalities.
An example of the CG-LS iterative reconstruction algorithm is now described in more detail. Consider

c* = argmin_c ||y − Ac||_2^2 + R(c),

the regularized least squares estimate of c from Eqn. (2). The regularization function, R: ℝ^n → ℝ, is determined by the prior enforced on c. For the compound Gaussian prior, we decompose c = z ⊙ u where z = h(x) = √(exp(x/α)), for Gaussian random vectors u and x, and we aim to estimate z and u. Let h be the component-wise, invertible, nonlinear function in the CG prior and ƒ = h⁻¹ the inverse nonlinearity. Hence, the following regularization may be selected, as an example:

R(z, u) = λ||u||_2^2 + μ||ƒ(z)||_2^2,

to enforce normality of u and x = ƒ(z), and the regularized least squares estimate [(z*)^T (u*)^T]^T may be considered to be given as the solution to:

argmin_{z,u} F(z, u) = argmin_{z,u} ||y − A(z ⊙ u)||_2^2 + λ||u||_2^2 + μ||ƒ(z)||_2^2.   (9)
Based on the component-wise nonlinearity h(x) = √(exp(x/α)), the inverse is also a component-wise function and may be given by ƒ(z) = 2α ln(z). The regularization, R(z, u), may approximate the log prior from a MAP estimation while still capturing the desired statistical properties for the prior. Empirically, this approximation is effective for the underlying problem of estimating image wavelet coefficients, or other sparse domain coefficients.
For a vector w ∈ ℝ^n, D{w} = diag(w) can be defined and A_w = A D{w}. Due to the explicit joint estimation in Eqn. (9), optimization may be performed by block coordinate descent, which on iteration k may be given by the following:

z_k = argmin_z ||y − A_{u_{k−1}} z||_2^2 + μ||ƒ(z)||_2^2,   (10)

u_k = argmin_u ||y − A_{z_k} u||_2^2 + λ||u||_2^2.   (11)
Algorithm 1 below details pseudocode for the CG-LS estimate of u and z.
Note that Eqn. (11) is a Tikhonov regularization problem with a solution given by:

u_k = (A_{z_k}^T A_{z_k} + λI)^{−1} A_{z_k}^T y.   (12)
As a non-limiting example, Eqn. (10) can be solved using damped Newton descent, which uses the gradient and Hessian of Eqn. (10), given respectively by:
For clarity, “step” may be used to refer to an update of z by damped Newton descent (or other techniques) and “iteration” to refer to an update of both u and z in CG-LS. It can be shown that Eqn. (10), with ƒ(z) = h⁻¹(z) = 2α ln(z), is a convex cost function so long as z ∈ (0, e)^n. Hence, in some examples, a nonlinear thresholding operator may be applied to z after each Newton descent step, such as the following component-wise operator:
where ε is a small positive real number. Thus, z on Newton descent step j+1 of iteration k, denoted by z_k^{j+1}, is obtained by applying this thresholding operator to the damped Newton update of z_k^j, where η_k^j is a step size often chosen via a backtracking or Armijo line search, or the like.
As a non-limiting example, a convergence criterion for CG-LS can be implemented as follows: Let δ be a small positive real number and define the Newton decrement λ(z; u, y) ∈ ℝ as

λ(z; u, y) = (∇_z F(z; u, y)^T [∇_z^2 F(z; u, y)]^{−1} ∇_z F(z; u, y))^{1/2},

where F denotes the cost function in Eqn. (10).
The Newton decrement serves as a convergence metric for Newton descent. Thus, before each Newton descent step, that is, between lines 5 and 6 in Algorithm 1, a check can be made whether λ(z_k^j; u_{k−1}, y) ≤ δ. When this holds, the Newton descent steps are exited and the algorithm proceeds with the next u estimate, u_k, given in line 12 of Algorithm 1. Once λ(z_k^0; u_{k−1}, y) ≤ δ, that is, Newton descent is unnecessary because the error tolerance has been reached, it can be said that CG-LS has converged and the algorithm can be terminated. If no k ∈ {1, 2, . . . , K} exists such that λ(z_k^0; u_{k−1}, y) ≤ δ, then CG-LS did not converge.
In addition to Algorithm 1, the CG-LS algorithm may also be implemented with Algorithm 2 illustrated below.
(In Algorithm 2, the initialization is z_0 = ℛ_{a,b}(A^T y) and u_0 = (A_{z_0}^T A_{z_0} + λI)^{−1} A_{z_0}^T y.)
In this example, steps are performed using a steepest descent instead of damped Newton descent. As described above, CG-LS is an iterative reconstruction algorithm that approximately solves the optimization in Eqn. (9).
The cost function is a regularized least squares cost function where, as given by the CG prior, the sparse coefficients are decomposed as c = z ⊙ u and the regularization may be taken to be R(c) = R(u, z) = λ||u||_2^2 + μ||ƒ(z)||_2^2 to enforce normality of u and of x = ƒ(z), a Gaussian tree process, as desired from the CG prior.
As the optimization in Eqn. (10) cannot be solved analytically, in some examples a steepest descent approach can be used to iteratively and approximately solve Eqn. (10). Given a norm ||·|| on ℝ^n and a differentiable function, g(x): ℝ^n → ℝ, the steepest descent direction, d: ℝ^n → ℝ^n, is defined as

d(x) = ||∇g(x)||_* · argmin{∇g(x)^T v : ||v|| = 1},

where ||·||_* is the dual norm given by ||w||_* = max_{||v||=1} w^T v. For instance, the Euclidean norm produces d(x) = −∇g(x). Additionally, each steepest descent step may be scaled by a step size, η = η(x), determined by a backtracking line search. That is, given two user-chosen parameters τ ∈ (0, ½] and β ∈ (0, 1), the step size is chosen to be η = β^r, where r = r(x) is the minimum, non-negative integer such that

g(x + β^r d(x)) ≤ g(x) + τ β^r ∇g(x)^T d(x).   (20)

In some examples, r can be determined by incrementing from zero until Eqn. (20) is first satisfied. With an initial guess, x_0, the sequence {x_j}_{j=1}^∞ generated by steepest descent is given by x_j = x_{j−1} + η(j) d(x_{j−1}), where η(j) depends on the previous step x_{j−1} as η(j) = β^{r(x_{j−1})}.
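A generic implementation of this backtracking rule is sketched below; the parameter values and the quadratic test function are illustrative only, and the specific cost and norms used in CG-LS are not assumed.

```python
import numpy as np

def backtracking_step(g, grad_g, x, d, tau=0.5, beta=0.5, r_max=50):
    """Backtracking line search: return eta = beta**r for the smallest
    non-negative integer r satisfying the condition of Eqn. (20),
    g(x + beta**r d) <= g(x) + tau * beta**r * grad_g(x)^T d."""
    gx, slope = g(x), grad_g(x) @ d
    for r in range(r_max + 1):
        eta = beta ** r
        if g(x + eta * d) <= gx + tau * eta * slope:
            return eta
    return beta ** r_max                      # fallback if the condition is never met

# Example: one Euclidean steepest descent step on a simple quadratic.
g = lambda x: 0.5 * x @ x
grad_g = lambda x: x
x = np.array([3.0, -2.0])
d = -grad_g(x)                                # steepest descent direction for the 2-norm
x_next = x + backtracking_step(g, grad_g, x, d) * d
```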
Now, when applying steepest descent to the cost function in Eqn. (10), z_k^j can be referred to as the estimate of z on steepest descent step j of iteration k. For generality, a different norm may define each steepest descent step, as is the case in Newton descent for a convex cost function with non-constant Hessian. The descent direction corresponding to norm ||·||^{(k,j)}, with dual norm ||·||_*^{(k,j)}, for steepest descent step j of iteration k can be denoted as d_k^j = d_k^j(z). Furthermore, the backtracking line search step size for steepest descent step j of iteration k can be denoted as η_k^{(j)} = β^{r(z_k^{j−1})}. Note, d_k^j = d_k^j(z) = d_k^j(z; u_{k−1}, y), as d_k^j is parameterized by u_{k−1} and y. Let J be the maximum number of steepest descent steps; then, for notation, z_k^J = z_k = z_{k+1}^0, as described above.
As described above, Eqn. (11) has the Tikhonov solution in Eqn. (12). Next, the initial estimate of z can be defined as z_0 = ℛ_{a,b}(A^T y), where the modified ReLU function, ℛ_{a,b}, is defined in terms of ReLU(x) = max{0, x} and is applied elementwise to A^T y. ℛ_{a,b} is a projection operator onto the interval [min{a, b}, max{a, b}]. This eliminates negative values, as z should have positive components, and limits the maximum values in the initial z estimate. The initial u estimate, denoted as u_0, is given by Eqn. (12).
Finally, the gradient, ∇_z F(u, z), and a user-chosen parameter δ > 0 determine convergence of CG-LS. On each steepest descent step j of iteration k, a check is made whether ||∇_z F(u_{k−1}, z_k^{j−1})||_*^{(k,j)} < δ. When this holds, the steepest descent steps are exited, taking z_k = z_k^{j−1}. Once ||∇_z F(u_{k−1}, z_k^0)||_*^{(k,1)} < δ, CG-LS has converged and the estimates u_{k−1} and z_{k−1} are returned. Otherwise, CG-LS may terminate after a user-chosen maximum number of iterations, K, or some other user-defined stopping condition.
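For illustration, a heavily simplified sketch of the CG-LS alternation is given below. It assumes ƒ(z) = ln(z), uses a fixed gradient step size in place of the backtracking line search, substitutes simple clipping for the modified ReLU projection ℛ_{a,b}, and omits the convergence tests, so it should be read as a schematic of the alternation rather than the disclosed algorithm.

```python
import numpy as np

def cg_ls_sketch(y, A, lam=0.3, mu=2.0, K=20, J=5, eta=1e-3, eps=1e-6):
    """Simplified CG-LS alternation under the assumptions noted above:
    J gradient steps on z followed by a Tikhonov update of u per iteration."""
    n = A.shape[1]

    def tikhonov(z):
        A_z = A * z                                  # A D{z}
        return np.linalg.solve(A_z.T @ A_z + lam * np.eye(n), A_z.T @ y)

    z = np.clip(A.T @ y, eps, None)                  # crude positive initialization of z
    u = tikhonov(z)
    for _ in range(K):
        for _ in range(J):                           # descent steps on z (f(z) = ln z)
            resid = A @ (z * u) - y
            grad = 2.0 * u * (A.T @ resid) + 2.0 * mu * np.log(z) / z
            z = np.clip(z - eta * grad, eps, None)   # keep z strictly positive
        u = tikhonov(z)                              # Tikhonov u update, as in Eqn. (12)
    return z * u                                     # estimated coefficients c = z ⊙ u
```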
In some implementations, the CG-LS algorithms described above may be generalized. The generalized CG-LS (G-CG-LS) algorithm implements a regularized least squares optimization with a regularization including a Gaussian term and an implicitly defined term, which together enforce a compound Gaussian prior. For example, image coefficients can be decomposed according to the CG prior, c = z ⊙ u, and a MAP estimate of the scale variable, z, and Gaussian vector, u, can be considered from Eqn. (2). That is, the following cost function can be defined:
and the following estimate, which is equivalent to the MAP estimate, may be considered:
where 𝒵 ⊆ [0, ∞)^n is the convex domain of ℛ. Note that P_u ∝ Σ_u, for Σ_u the covariance of u, and ℛ(z) ∝ −log(p_z(z)), where p_z is the prior density for the scale variable. The G-CG-LS algorithm, given in Algorithm 3 below, is an iterative algorithm that approximately solves Eqn. (23) through block coordinate descent.
(In Algorithm 3, the initialization is z_0 = ℙ_{[0,b]}(A^T y) and u_0 = 𝒯(z_0), and each iteration k updates the scale variable estimate, z_k, and then sets u_k = 𝒯(z_k).)
For a vector v, A_v = A diag(v) can be defined, where diag(v) is the diagonal matrix formed by placing the entries of v on the diagonal. Now, the following can be defined:

and letting g(z, u) be a scale variable update method such that:

At least two possibilities for g are described below. Note that 𝒯(z) = argmin_u F(u, z) is a Tikhonov solution. Then, on iteration k of G-CG-LS, the following estimates can be determined:
Note that, in practice, the inverse in Eqn. (24) may not be calculated explicitly and may instead be solved as a system of linear equations. Furthermore, the computational time to calculate Eqn. (24) can be reduced by using the following Woodbury matrix identity:
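Because the exact form of Eqn. (24) is not reproduced above, the sketch below only illustrates the general computational benefit for a generic Tikhonov-type solve of the form (P⁻¹ + AᵀA)⁻¹Aᵀy: the Woodbury identity replaces an n×n inverse with an m×m solve, which is advantageous when m ≪ n.

```python
import numpy as np

# Generic Woodbury acceleration of a Tikhonov-type solve (illustrative only):
#   (P^{-1} + A^T A)^{-1} = P - P A^T (I_m + A P A^T)^{-1} A P.
rng = np.random.default_rng(2)
m, n = 32, 256
A = rng.standard_normal((m, n))
P = np.diag(rng.uniform(0.5, 2.0, n))        # example covariance (diagonal for simplicity)
y = rng.standard_normal(m)
b = A.T @ y

u_direct = np.linalg.solve(np.linalg.inv(P) + A.T @ A, b)                  # n x n solve
AP = A @ P
u_woodbury = P @ b - AP.T @ np.linalg.solve(np.eye(m) + AP @ A.T, AP @ b)  # m x m solve

assert np.allclose(u_direct, u_woodbury)
```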
Additionally, for a convex set 𝒞, let ℙ_𝒞 be the unique projection onto 𝒞. Initial estimates can be defined as z_0 = ℙ_{[a,b]}(A^T y) and u_0 = 𝒯(z_0). Let ReLU(x) = max{x, 0} be the rectified linear unit activation function. It is noted that ℙ_{[0,∞)} ≡ ReLU. Applying ℙ_{[a,b]} to A^T y eliminates negative values and limits the maximum values in the initial z estimate.
As mentioned above, at least two example implementations for updating the scale variable in the G-CG-LS algorithm are described. Each of these example methods is described as using a step size η > 0.
In a first example, a projected gradient descent (“PGD”) method can be used. In the PGD approach, the following updates are performed:
In a second example, an iterative shrinkage and thresholding (“ISTA”) method can be used. In the ISTA approach, the following updates are performed:
where proxƒ is the proximal operator of a function ƒ and is well-defined for convex ƒ. For a non-smooth, convex function, ƒ, the proximal operator is an optimization tool as fixed points of proxƒ minimize ƒ. The general ISTA method, which is equivalent to a proximal gradient descent method, is an optimization method for the sum of a convex, differentiable function and a convex, non-smooth function. From the use of the proximal operator on the non-smooth piece, fixed points of ISTA are optimality points of the original sum of functions.
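The two update styles can be sketched as follows with simple stand-in regularizers (a smooth squared-ℓ2 penalty for the PGD case and an ℓ1 penalty for the ISTA case); neither stand-in is the implicitly defined regularization ℛ of the disclosure, whose gradient or proximal operator is ultimately learned by a subnetwork.

```python
import numpy as np

def grad_data_fidelity(z, u, A, y):
    """Gradient in z of the data fidelity term ||y - A(z * u)||_2^2."""
    return 2.0 * u * (A.T @ (A @ (z * u) - y))

def pgd_update(z, u, A, y, eta, mu=1.0):
    """PGD with an illustrative smooth regularizer mu * ||z||_2^2:
    gradient step on the full objective, then projection onto [0, inf)^n."""
    g = grad_data_fidelity(z, u, A, y) + 2.0 * mu * z
    return np.maximum(z - eta * g, 0.0)            # projection = ReLU

def ista_update(z, u, A, y, eta, mu=1.0):
    """ISTA with an illustrative non-smooth regularizer mu * ||z||_1:
    gradient step on the smooth data fidelity, then prox of eta * mu * ||.||_1."""
    v = z - eta * grad_data_fidelity(z, u, A, y)
    return np.sign(v) * np.maximum(np.abs(v) - eta * mu, 0.0)   # soft-thresholding
```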
As described above, a deep neural network, CG-Net, may be constructed by applying algorithm unrolling to the CG-LS algorithm, such as the algorithm summarized in Algorithm 1. As a non-limiting example, the CG-Net model may have a structure built around two layer-connecting functions, ƒ_u: ℝ^n × ℝ^m → ℝ^n and ƒ_z: ℝ^n × ℝ^n × ℝ^m → ℝ^n, which may be defined by:
Thus, ƒ_z(z_k^j, u_{k−1}, y) corresponds to updating z on descent step j+1 of iteration k in CG-LS, and ƒ_u(z_{k−1}, y) corresponds to updating u on iteration k of CG-LS. In some other examples, when developing the CG-Net model from Algorithm 2, the operation for updating z using steepest descent may include g_k^j(z, u, y) ≡ g_k^j(z, u, y; a_k^j, b_k^j, η_k^{(j)}, d_k^j), which is parameterized by a descent vector, d_k^j, a step size, η_k^{(j)}, and modified ReLU parameters a_k^j and b_k^j, corresponding to updating z as:
In CG-LS, the step size, η_k^{(j)}, may be found by a backtracking line search. In CG-Net, the application of the modified ReLU activation function, ℛ_{a,b}, at each steepest descent step serves to guarantee that the next step is not too large and stays within the domain of interest for z.
The network blocks in a CG-Net may be arranged as follows. CG-Net contains K+1 estimation layers for u, KJ descent step layers for z, one input layer, one output layer, and one initialization layer. Thus, CG-Net contains K(J+1)+4 layers in total.
In CG-Net, the function ƒ_u = ƒ_u(z, y; λ_k), connecting layers Z_k^J and U_k for every k = 1, 2, . . . , K, is parameterized by a positive regularization scalar, λ_k. In CG-LS, a fixed constant λ_k = λ is taken for all k, where λ is given in Eqn. (11), but for training the CG-Net, different constants, learned by the DNN, may be allowed in each update of u (e.g., at each layer updating u). In some examples, the initialization layer, Z_0, may also be parameterized by two positive real numbers a_0 > z_min and b_0 > z_min, which may be applied through the modified ReLU function, ℛ_{a_0,b_0}.
Next, parameters of the function ƒ_z may depend on the choice of nonlinearity h for the CG prior. In the example described above, h(x) = √(exp(x/α)) was chosen, such that h⁻¹(z) = 2α ln(z), and therefore ƒ_z(z, u, y) in Eqn. (34) is given by Eqn. (35), where κ = 4μα² and w_1: (0, ∞) → ℝ and w_2: (0, ∞) → ℝ are component-wise functions. Hence, ƒ_z = ƒ_z(z, u, y; η_k^j, κ_k^j), which connects the two layers U_k and Z_k^j to layer Z_k^{j+1} for every k = 1, 2, . . . , K and every j = 0, 1, . . . , J−1, is parameterized by η_k^j, a descent step size, and κ_k^j, a scaled product of a regularization scalar, μ, and the CG prior constant, α. Again, to simplify notation, the following are denoted: Z_k^0 = Z_{k−1}^J and Z_0 = Z_1^0.
In CG-LS, a fixed constant κ_k^j = 4μα² is taken for all k and j, but for training the CG-Net, different constants, learned by the DNN, are allowed for each update step of z. To increase trainability of the network, the descent step size η_k^j, which may be chosen by a backtracking line search in CG-LS, is allowed to be a matrix in ℝ^{n×n} instead of a single constant. Choosing η_k^j to be a matrix provides CG-Net with flexibility to update a coordinate of z as a function of all coordinates of the search direction (H(z; u))^{−1} v(z; u, y) instead of as a function of a single coordinate. Such flexibility may increase the rate at which the descent steps are able to traverse the optimization landscape of the problem in Eqn. (10). Nevertheless, if a single constant step size is optimal, then CG-Net can simply learn η_k^j as a scaled identity matrix. Alternatively, some η_k^j may be constrained to a specific structure, such as a diagonal matrix. Therefore, the CG-Net parameters Θ may be:
At a maximum, when every η_k^j is a full matrix, CG-Net contains K(J(n²+1)+1) parameters, although this number will be smaller when certain η_k^j have specific structure constraints.
To learn the CG-Net parameters, Θ, a loss function involving the structural similarity index measure (“SSIM”) function may be used, such as the SSIM-based loss in Eqn. (36), evaluated over a batch of training data points. This cost function may be optimized through adaptive moment estimation (“ADAM”), which is a stochastic gradient-based optimizer. Other common optimization methods include stochastic gradient descent, RMSprop, and Adadelta. The gradient with respect to Θ used in ADAM is calculated via backpropagation through the network. Backpropagation calculations are easily and quickly implemented with automatic differentiation, which is readily available in most machine learning libraries, such as TensorFlow.
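A minimal TensorFlow sketch of such a training step is shown below. The 1 − SSIM form of the loss and the placeholder `model` are assumptions for illustration; Eqn. (36) and the actual CG-Net architecture are not reproduced here.

```python
import tensorflow as tf

# Placeholder network mapping measurement vectors to 32x32 coefficient images;
# this stands in for CG-Net and is not the disclosed architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32 * 32),
    tf.keras.layers.Reshape((32, 32, 1)),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(y_batch, target_images):
    with tf.GradientTape() as tape:
        recon = model(y_batch, training=True)
        # Assumed SSIM-based loss: 1 - mean SSIM between reconstruction and target.
        loss = 1.0 - tf.reduce_mean(tf.image.ssim(recon, target_images, max_val=1.0))
    grads = tape.gradient(loss, model.trainable_variables)       # backpropagation via autodiff
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```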
In some examples, for each Z_k^j layer, defined by Eqn. (35), the steepest descent vector d_k^j(z; u) = −B_k^j ∇_z F(z; u) may be implemented for a positive definite matrix B_k^j that will be learned in CG-Net. In this example, the descent vector is the steepest descent based upon the quadratic norm ||·||_{(B_k^j)^{−1}}. A projection, P_∈(L), onto the set of positive definite matrices may be defined such that P_∈(L) can be viewed as the closest symmetric, real-valued matrix, with minimum eigenvalue of ∈, to (L + L^T)/2, as measured by the Frobenius norm. B_k^j can be enforced to be positive definite by learning a lower triangular matrix L_k^j and setting B_k^j = P_∈(L_k^j). Therefore, CG-Net layer Z_k^j may be parameterized by a lower triangular matrix L_k^j defining the steepest descent vector.
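One way to realize P_∈ consistent with this description, i.e., eigenvalue clipping of the symmetric part, is sketched below (the exact disclosed construction is not reproduced).

```python
import numpy as np

def project_positive_definite(L, eps=1e-3):
    """Sketch of P_eps(L): symmetrize to (L + L^T)/2, then clip eigenvalues
    from below at eps, yielding the Frobenius-closest symmetric matrix with
    minimum eigenvalue eps."""
    S = 0.5 * (L + L.T)
    vals, vecs = np.linalg.eigh(S)                 # eigendecomposition of the symmetric part
    return (vecs * np.maximum(vals, eps)) @ vecs.T

# A learned lower triangular matrix L_kj would be mapped to B_kj = P_eps(L_kj).
L = np.tril(np.random.default_rng(3).standard_normal((4, 4)))
B = project_positive_definite(L)
assert np.all(np.linalg.eigvalsh(B) >= 1e-3 - 1e-9)
```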
Additionally, as ∇_z F(u, z) depends on the regularization scalar, μ, layer Z_k^j is parameterized by a regularization scalar, μ_k^j. Furthermore, layer Z_k^j may be parameterized by the step size, η_k^{(j)}, which can be taken to be a diagonal matrix. That is, instead of learning a single constant to scale the steepest descent vector, d_k^j, a different constant can be learned to scale each component of d_k^j separately. Finally, layer Z_k^j learns positive real numbers a_k^j > z_min and b_k^j > z_min, which may be applied through the modified ReLU activation function, ℛ_{a_k^j, b_k^j}, in Eqn. (35).
Fixing a small real-valued ∈ > 0, to ensure λ_k > 0 and a_0, b_0, a_k^j, b_k^j > z_min in implementation, max{λ_k, ∈} may be used in place of λ_k, max{a_0, z_min + ∈} may be used in place of a_0, and similarly for b_0, a_k^j, and b_k^j.
In total, in this example implementation, CG-Net has a number of parameters that depends on the image size, n. The CG-Net parameters are trained by minimizing a loss function involving the SSIM image quality metric, such as Eqn. (36) above, evaluated over a batch of data points. This cost function may be optimized as described above. The gradient with respect to Θ for ADAM may be calculated via backpropagation through the network.
In some embodiments, a deep regularized implementation of the CG-Net (“DR-CG-Net”) described above may be implemented. In these instances, a deep neural network with an end-to-end structure is used, and DR-CG-Net learns the scale variable regularization, ℛ, through a subnetwork representing either ∇ℛ or prox_ℛ.
Let 𝒩_k^{(j)}: ℝ^n → ℝ^n, for k = 1, 2, . . . , K and j = 1, 2, . . . , J, be a subnetwork. That is, each 𝒩_k^{(j)} is a collection of layers mapping from ℝ^n to ℝ^n. For example, convolutional layers may be used, making each 𝒩_k^{(j)} subnetwork a CNN. Now, the intermediate scale variable mapping, g_k^{(j)}, can be defined as:
where r_k^{(j)} is the mapping of the data fidelity gradient descent step of F(u, z) in z, for a step size η_k^{(j)}. Note that g_k^{(j)} corresponds to the update methods described above, where 𝒩_k^{(j)} replaces ∇ℛ and prox_{ηℛ} in PGD or ISTA, respectively. In some implementations, ℙ_{[0,∞)} ≡ ReLU may be applied to ensure all intermediate z estimates in DR-CG-Net maintain positive entries. Finally, recall the Tikhonov update defined in Eqn. (24), and let 𝒯_k(z) = 𝒯(z; P_k) for a covariance matrix, P_k.
Each layer k of DR-CG-Net includes a complete scale variable mapping, 𝒵_k, which includes a composition of the J scale variable updates g_k^{(1)}, . . . , g_k^{(J)}, followed by an estimation layer, U_k, for the Gaussian variable. Mathematically, the DR-CG-Net blocks may be detailed as follows. The kth complete scale variable mapping, 𝒵_k, contains:

g_k^{(j)} = g_k^{(j)}(g_k^{(j−1)}, U_{k−1}) is the intermediate scale variable mapping analogous to z_k^j on line 5 in Algorithm 3;

r_k^{(j)} = r_k^{(j)}(U_{k−1}, g_k^{(j−1)}) is the data fidelity gradient step; and

O = U_K ⊙ 𝒵_K is the estimated signal coefficients produced by DR-CG-Net.
Note, to simplify notation, let g_k^{(0)} = g_{k−1}^{(J)}. Assume each subnetwork, 𝒩_k^{(j)}, uses D layers; then DR-CG-Net contains K+1 estimation layers for u, KJ(D+1) layers for updating z, one input layer, one output layer, and one initialization layer. Thus, DR-CG-Net is a DNN with K(J(D+1)+1)+4 layers.
For every k = 0, 1, . . . , K, the layer U_k is parameterized by a covariance matrix, P_k. To reduce the number of parameters learned by the network, and for consistency in P_k representing the covariance matrix of u, DR-CG-Net learns a single covariance matrix P and constrains P_1 = . . . = P_K = P. Furthermore, the possibility of a structured covariance matrix can be considered, where P is either a scaled identity, diagonal, tridiagonal, or full matrix. Imposing a covariance structure may be desirable or advantageous, for example, to ensure only local reinforcement.
To ensure P is a covariance matrix, i.e. symmetric and positive definite, for ∈>0 a small fixed real number, one of the following structures can be imposed:
In the scaled identity case, only a constant λ is learned. In the diagonal case, a vector λ = [λ_i]_{i=1}^n is learned. In the tridiagonal case, two vectors λ_1 ∈ ℝ^n and λ_2 ∈ ℝ^{n−1} are learned, and L_tri is the lower triangular matrix formed by placing λ_1 on the diagonal and λ_2 on the first subdiagonal. Finally, in the full case, a lower triangular matrix, L, is learned.
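The structured choices above can be sketched as follows. The M Mᵀ + ∈I style of construction used here is one common way to guarantee symmetry and positive definiteness and is an assumption of this sketch; the exact parameterization of the disclosure is not reproduced.

```python
import numpy as np

def build_covariance(structure, params, n, eps=1e-3):
    """Illustrative structured covariance constructions for P (assumed forms)."""
    if structure == "scaled_identity":
        (lam,) = params                              # single learned constant
        return (lam ** 2 + eps) * np.eye(n)
    if structure == "diagonal":
        (lam,) = params                              # learned vector of length n
        return np.diag(lam ** 2 + eps)
    if structure == "tridiagonal":
        lam1, lam2 = params                          # diagonal and first subdiagonal
        L_tri = np.diag(lam1) + np.diag(lam2, k=-1)  # lower bidiagonal L_tri
        return L_tri @ L_tri.T + eps * np.eye(n)     # L_tri L_tri^T is tridiagonal
    if structure == "full":
        (L,) = params                                # learned lower triangular matrix
        return L @ L.T + eps * np.eye(n)
    raise ValueError(f"unknown structure: {structure}")
```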
Next, each 𝒵_k is parameterized by {η_k^{(1)}, . . . , η_k^{(J)}}, a collection of step sizes, as η_k^{(j)} parameterizes the data fidelity gradient step, r_k^{(j)}, and by the parameters of the subnetworks 𝒩_k^{(1)}, . . . , 𝒩_k^{(J)}. For a fixed constant γ_max > 0 and fixed z and u, the following is taken:

and the network learns δ_k^{(j)}, a real-valued parameter. This corresponds to performing a normalized step in the data fidelity gradient step. Normalizing the step size provides numerical stability by preventing the data fidelity gradient step from being too large and counterproductive.
In some implementations, 𝒩_k^{(j)} can be a CNN of depth D using ReLU activation functions. That is, layer d includes ƒ_d convolutions, i.e., filter channels, using a kernel size of k_d × k_d with unit stride and a bias matrix added to each filter channel. Note, zero padding may be applied to each filter channel of the input such that the output from each filter channel is the same size as the input. Furthermore, ƒ_D = 1 can be taken, such that for X ∈ ℝ^{n×n}, 𝒩_k^{(j)}(X) ∈ ℝ^{n×n}. Next, for x ∈ ℝ^{n²}, mat(x) ∈ ℝ^{n×n} may be defined to be x reshaped into an n×n matrix and, for X ∈ ℝ^{n×n}, vec(X) may be defined to be X reshaped into a vector of size n² such that vec(mat(x)) = x. Then each subnetwork may be applied to a vectorized scale variable by reshaping, i.e., by computing vec(𝒩_k^{(j)}(mat(·))).
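A TensorFlow/Keras sketch of one such subnetwork is shown below; the depth, filter counts, kernel sizes, and the choice to omit the activation on the final single-channel layer are illustrative assumptions rather than the disclosed configuration.

```python
import tensorflow as tf

def make_subnetwork(depth=3, filters=(32, 32, 1), kernels=(3, 3, 3)):
    """Depth-D CNN with 'same' zero padding and unit stride, so each filter
    channel's output matches the input size; the final layer has one channel
    (f_D = 1).  ReLU activations are used on the hidden layers."""
    net = tf.keras.Sequential()
    for d in range(depth):
        net.add(tf.keras.layers.Conv2D(
            filters[d], kernels[d], strides=1, padding="same",
            activation="relu" if d < depth - 1 else None, use_bias=True))
    return net

# Applying the subnetwork to a vectorized scale variable via reshaping:
#   z_img = tf.reshape(z_vec, (1, N, N, 1)); out_vec = tf.reshape(net(z_img), (-1,))
```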
For W_{k,d}^{(j)} and B_{k,d}^{(j)} being the convolutional kernels and biases of 𝒩_k^{(j)}, respectively, the DR-CG-Net parameters are the covariance matrix P, the step size parameters δ_k^{(j)}, and the subnetwork kernels and biases W_{k,d}^{(j)} and B_{k,d}^{(j)}. The DR-CG-Net parameters can be trained by minimizing the mean absolute error loss function. Namely, for a batch of data points, the mean absolute error loss function is the average, over the batch, of the ℓ_1 distance between the network outputs and the corresponding true coefficients.
This cost function can be optimized through adaptive moment estimation (Adam), which is a stochastic, gradient-based optimizer. Other common optimization methods include stochastic gradient descent, RMSprop, and Adadelta. The gradient with respect to Θ for Adam is calculated via backpropagation through the network, which can be implemented with automatic differentiation using TensorFlow.
Thus, using the powerful CG class of densities to represent image coefficients, a deep neural network named DR-CG-Net for solving linear inverse problems is provided. This deep neural network is produced by applying algorithm unrolling to the generalized CG-LS algorithm described above, which enforces the CG prior. The generalized CG-LS algorithm allows for problem-specific choices of the scale variable distribution, which subsequently is learned through the unrolled deep neural network. Hence, DR-CG-Net has the flexibility to learn the prior while still being constrained to the class of compound Gaussian distributions.
The method includes accessing measurement data with a computer system, as indicated at step 502. Accessing the measurement data may include retrieving such data from a memory or other suitable data storage device or medium. Additionally or alternatively, accessing the measurement data may include acquiring such data with an imaging system and transferring or otherwise communicating the data to the computer system, which may be a part of the imaging system.
In one non-limiting example, the measurement data may include medical imaging data acquired with a medical imaging system. For instance, the medical imaging system may include an x-ray imaging system (e.g., a computed tomography (“CT”) system), a magnetic resonance imaging (“MRI”) system, a positron emission tomography (“PET”) system, an ultrasound imaging system, or the like. In these instances, the measurement data may include the respective measurement data acquired by the medical imaging system. As a non-limiting example, the measurement data may include x-ray attenuation and/or photon count measurements acquired using a CT system. As another example, the measurement data may include k-space data acquired with an MRI system.
In other examples, the measurement data may include other measurement data acquired using other imaging or sensing modalities, including microwave, acoustic, optical, or other modalities.
A trained neural network (or other suitable machine learning algorithm) is then accessed with the computer system, as indicated at step 504. Accessing the trained neural network may include accessing network parameters (e.g., weights, biases, or both) that have been optimized or otherwise estimated by training the neural network on training data. In some instances, retrieving the neural network can also include retrieving, constructing, or otherwise accessing the particular neural network architecture to be implemented. For instance, data pertaining to the layers in the neural network architecture (e.g., number of layers, type of layers, ordering of layers, connections between layers, hyperparameters for layers) may be retrieved, selected, constructed, or otherwise accessed.
In general, the neural network is trained, or has been trained, on training data in order to reconstruct a signal and/or image from measurement data.
The measurement data are then input to the one or more trained neural networks, generating one or more reconstructed images as an output, as indicated at step 506. In some implementations, as described above, sparse coefficients, c, are output from the output layer of the CG-Net reconstruction network. In these instances, the reconstructed image may be recovered from the sparse coefficients, for example, by using the appropriate sparse domain transformation. For example, when the sparse domain is the wavelet domain, the CG-Net reconstruction network may output wavelet coefficients. By applying a wavelet transformation to these wavelet coefficients, the reconstructed image can be recovered.
Display and/or Store Output
The image(s) generated by inputting the measurement data to the trained neural network(s) can then be displayed to a user, stored for later use or further processing, or both, as indicated at step 508.
The method includes accessing training data with a computer system, as indicated at step 602. Accessing the training data may include retrieving such data from a memory or other suitable data storage device or medium. Alternatively, accessing the training data may include acquiring such data with an imaging system and transferring or otherwise communicating the data to the computer system.
In general, the training data can include measurement data acquired with an imaging system and example images reconstructed from the measurement data. The method can include assembling training data from measurement data and/or images reconstructed from measurement data using a computer system. This step may include assembling the training data into an appropriate data structure on which the neural network or other machine learning algorithm can be trained.
One or more neural networks (or other suitable machine learning algorithms) are trained on the training data, as indicated at step 604. In general, the neural network can be trained by optimizing network parameters (e.g., weights, biases, or both) based on minimizing a loss function. As one non-limiting example, the loss function may be the SSIM-based loss function described above.
Training a neural network may include initializing the neural network, such as by computing, estimating, or otherwise selecting initial network parameters (e.g., weights, biases, or both). During training, an artificial neural network receives the inputs for a training example and generates an output using the bias for each node, and the connections between each node and the corresponding weights. For instance, training data can be input to the initialized neural network, generating output as an intermediate reconstructed image. The artificial neural network then compares the generated output with the actual output of the training example in order to evaluate the quality of the intermediate reconstructed image. For instance, the intermediate reconstructed image can be passed to a loss function to compute an error. The current neural network can then be updated based on the calculated error (e.g., using backpropagation methods based on the calculated error). For instance, the current neural network can be updated by updating the network parameters (e.g., weights, biases, or both) in order to minimize the loss according to the loss function. The training continues until a training condition is met. The training condition may correspond to, for example, a predetermined number of training examples being used, a minimum accuracy threshold being reached during training and validation, a predetermined number of validation iterations being completed, and the like. When the training condition has been met (e.g., by determining whether an error threshold or other stopping criterion has been satisfied), the current neural network and its associated network parameters represent the trained neural network. Different types of training processes can be used to adjust the bias values and the weights of the node connections based on the training examples. The training processes may include, for example, gradient descent, Newton's method, conjugate gradient, quasi-Newton, Levenberg-Marquardt, among others.
The artificial neural network can be constructed or otherwise trained using the techniques described above in more detail. In addition to learning the neural network parameters described above, in some implementations the CG-Net may be trained on training data to additionally or alternatively learn the inverse of the nonlinear function in the CG prior (i.e., ƒ(z)) by approximating it with a sub-network embedded inside of CG-Net. Such an extension of CG-Net may expand its applicability by no longer requiring a user-specified function ƒ, as well as by providing CG-Net with a greater learning capacity.
The one or more trained neural networks are then stored for later use, as indicated at step 606. Storing the neural network(s) may include storing network parameters (e.g., weights, biases, or both), which have been computed or otherwise estimated by training the neural network(s) on the training data. Storing the trained neural network(s) may also include storing the particular neural network architecture to be implemented. For instance, data pertaining to the layers in the neural network architecture (e.g., number of layers, type of layers, ordering of layers, connections between layers, hyperparameters for layers) may be stored.
In an example study, the CG-LS algorithm in Algorithm 2 was tested using gradient descent, which may be denoted by gCG-LS, and Newton descent, which may be denoted by nCG-LS, as the steepest descent method for updating z. As an example, 32×32 images from the CIFAR10 image dataset and 64×64 images from the CalTech101 image dataset may be used. Each image was converted to a single-channel grayscale image, scaled down by the maximum pixel value, and vectorized. A Radon transform, at a specified number of angles, was performed on each image, to which white noise was added, producing noisy measurements, y, at a specified signal-to-noise ratio (“SNR”). Finally, a biorthogonal wavelet transformation was applied to each image to produce the sparsity coefficients, c. That is, Ψ is a matrix representation of a Radon transform and Φ is a biorthogonal wavelet transformation.
For all simulations in this example, the following were used: ƒ(z) = ln(z), μ = 2, K = 1000, J = 1, and δ = 10⁻⁶. For measurements at an SNR of 60 dB and 40 dB, λ = 0.3 and λ = 2 were taken, respectively. Using nCG-LS required the Hessian of the cost function in Eqn. (9) with respect to z to be positive definite, which, for ƒ(z) = ln(z), is guaranteed when z ∈ (0, e)^n. A local minimizer can be contained in [1, e)^n under sufficient scaling of the input data. Therefore, in each nCG-LS test using 32×32 or 64×64 image measurements, the input measurement was scaled by a factor, chosen empirically, of e⁻⁴ or e⁻⁶, respectively. Additionally, an eigendecomposition on the Hessian of the cost function in Eqn. (9) with respect to z was used to find the closest positive semi-definite matrix, which was then used in the Newton descent step. Alternatively, the modified ReLU function ℛ_{1,e} may be applied at each z update to ensure the Hessian of the cost function in Eqn. (9) with respect to z is positive semi-definite. Finally, ℛ_{a,b} = ℛ_{1,e} was chosen for nCG-LS, whereas a different choice of ℛ_{a,b} was used for gCG-LS.
In a non-limiting example, given a sensing matrix, Ψ, a sparsity transformation, Φ, and a noise level in SNR, a set of training and testing measurement-coefficient pairs, (y, c), can be created as described above and then used to train and evaluate a CG-Net created by applying algorithm unrolling to Algorithm 2. For network size, CG-Net running on 32×32 or 64×64 image measurements used (K, J) = (20, 1) or (K, J) = (5, 1), respectively. The network sizes were chosen empirically such that the time to complete one image reconstruction was reasonably quick while still producing excellent reconstructions on a validation set of test images. Letting ƒ(z) = ln(z), the parameters were initialized with a_0 = 1, b_0 = exp(2), all μ_k^j = 2, all b_k^j = exp(3), and every L_k^j = I. Note, for CG-Net running on 32×32 images, the diagonal and sub-diagonal in L_k^j were learned, constraining B_k^j to be a tridiagonal matrix. Finally, λ_k = 0.3 was initialized for the 60 dB SNR noise level and λ_k = 2 was initialized for all other noise levels. Each CG-Net was trained for 20 epochs, using a learning rate of 10⁻³, with early stopping implemented when the model overfit a validation dataset.
To further highlight the applicability of CG-Net, alternative sensing matrices, Ψ, and sparsity transformations, Φ, may be considered other than those mentioned above. For example, in general, Ψ ∈ ℝ^{m×n} can be considered as a Gaussian matrix, where m/n is the sampling ratio. A discrete cosine transformation (“DCT”) dictionary may also be considered as an alternative sparsity transformation.
Additionally or alternatively, in some embodiments, the computing device 950 can communicate information about data received from the data source 902 to a server 952 over a communication network 954, which can execute at least a portion of the CG-Net image reconstruction system 904. In such embodiments, the server 952 can return information to the computing device 950 (and/or any other suitable computing device) indicative of an output of the CG-Net image reconstruction system 904.
In some embodiments, computing device 950 and/or server 952 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, and so on. The computing device 950 and/or server 952 can also reconstruct images from the data.
In some embodiments, data source 902 can be any suitable source of data (e.g., measurement data, images reconstructed from measurement data, processed image data), such as an imaging system, another computing device (e.g., a server storing measurement data, images reconstructed from measurement data, processed image data), and so on. In some embodiments, data source 902 can be local to computing device 950. For example, data source 902 can be incorporated with computing device 950 (e.g., computing device 950 can be configured as part of a device for measuring, recording, estimating, acquiring, or otherwise collecting or storing data). As another example, data source 902 can be connected to computing device 950 by a cable, a direct wireless link, and so on. Additionally or alternatively, in some embodiments, data source 902 can be located locally and/or remotely from computing device 950, and can communicate data to computing device 950 (and/or server 952) via a communication network (e.g., communication network 954).
In some embodiments, communication network 954 can be any suitable communication network or combination of communication networks. For example, communication network 954 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, WiMAX, etc.), other types of wireless network, a wired network, and so on. In some embodiments, communication network 954 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), any other suitable type of network, or any suitable combination of networks. Communications links between the devices described herein can each be any suitable communications link or combination of links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, and the like.
As shown in the drawings, in some embodiments, computing device 950 can include a processor 1002, a display 1004, one or more inputs 1006, one or more communications systems 1008, and/or memory 1010.
In some embodiments, communications systems 1008 can include any suitable hardware, firmware, and/or software for communicating information over communication network 954 and/or any other suitable communication networks. For example, communications systems 1008 can include one or more transceivers, one or more communication chips and/or chip sets, and so on. In a more particular example, communications systems 1008 can include hardware, firmware, and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and so on.
In some embodiments, memory 1010 can include any suitable storage device or devices that can be used to store instructions, values, data, or the like, that can be used, for example, by processor 1002 to present content using display 1004, to communicate with server 952 via communications system(s) 1008, and so on. Memory 1010 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 1010 can include random-access memory (“RAM”), read-only memory (“ROM”), electrically programmable ROM (“EPROM”), electrically erasable ROM (“EEPROM”), other forms of volatile memory, other forms of non-volatile memory, one or more forms of semi-volatile memory, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and so on. In some embodiments, memory 1010 can have encoded thereon, or otherwise stored therein, a computer program for controlling operation of computing device 950. In such embodiments, processor 1002 can execute at least a portion of the computer program to present content (e.g., images, user interfaces, graphics, tables), receive content from server 952, transmit information to server 952, and so on. For example, the processor 1002 and the memory 1010 can be configured to perform the methods described herein (e.g., Algorithm 1 described above, Algorithm 2 described above, the CG-Net reconstruction network described above).
In some embodiments, server 952 can include a processor 1012, a display 1014, one or more inputs 1016, one or more communications systems 1018, and/or memory 1020. In some embodiments, processor 1012 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, and so on. In some embodiments, display 1014 can include any suitable display devices, such as an LCD screen, LED display, OLED display, electrophoretic display, a computer monitor, a touchscreen, a television, and so on. In some embodiments, inputs 1016 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, and so on.
In some embodiments, communications systems 1018 can include any suitable hardware, firmware, and/or software for communicating information over communication network 954 and/or any other suitable communication networks. For example, communications systems 1018 can include one or more transceivers, one or more communication chips and/or chip sets, and so on. In a more particular example, communications systems 1018 can include hardware, firmware, and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and so on.
In some embodiments, memory 1020 can include any suitable storage device or devices that can be used to store instructions, values, data, or the like, that can be used, for example, by processor 1012 to present content using display 1014, to communicate with one or more computing devices 950, and so on. Memory 1020 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 1020 can include RAM, ROM, EPROM, EEPROM, other types of volatile memory, other types of non-volatile memory, one or more types of semi-volatile memory, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and so on. In some embodiments, memory 1020 can have encoded thereon a server program for controlling operation of server 952. In such embodiments, processor 1012 can execute at least a portion of the server program to transmit information and/or content (e.g., data, images, a user interface) to one or more computing devices 950, receive information and/or content from one or more computing devices 950, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone), and so on.
In some embodiments, the server 952 is configured to perform the methods described in the present disclosure. For example, the processor 1012 and memory 1020 can be configured to perform the methods described herein (e.g., Algorithm 1 described above, Algorithm 2 described above, the CG-Net reconstruction network shown in
In some embodiments, data source 902 can include a processor 1022, one or more data acquisition systems 1024, one or more communications systems 1026, and/or memory 1028. In some embodiments, processor 1022 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, and so on. In some embodiments, the one or more data acquisition systems 1024 are generally configured to acquire data, images, or both, and can include an imaging system. Additionally or alternatively, in some embodiments, the one or more data acquisition systems 1024 can include any suitable hardware, firmware, and/or software for coupling to and/or controlling operations of an imaging system. In some embodiments, one or more portions of the data acquisition system(s) 1024 can be removable and/or replaceable.
Note that, although not shown, data source 902 can include any suitable inputs and/or outputs. For example, data source 902 can include input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, a trackpad, a trackball, and so on. As another example, data source 902 can include any suitable display devices, such as an LCD screen, an LED display, an OLED display, an electrophoretic display, a computer monitor, a touchscreen, a television, etc., one or more speakers, and so on.
In some embodiments, communications systems 1026 can include any suitable hardware, firmware, and/or software for communicating information to computing device 950 (and, in some embodiments, over communication network 954 and/or any other suitable communication networks). For example, communications systems 1026 can include one or more transceivers, one or more communication chips and/or chip sets, and so on. In a more particular example, communications systems 1026 can include hardware, firmware, and/or software that can be used to establish a wired connection using any suitable port and/or communication standard (e.g., VGA, DVI video, USB, RS-232, etc.), a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and so on.
In some embodiments, memory 1028 can include any suitable storage device or devices that can be used to store instructions, values, data, or the like, that can be used, for example, by processor 1022 to control the one or more data acquisition systems 1024, and/or receive data from the one or more data acquisition systems 1024; to generate images from data; present content (e.g., data, images, a user interface) using a display; communicate with one or more computing devices 950; and so on. Memory 1028 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 1028 can include RAM, ROM, EPROM, EEPROM, other types of volatile memory, other types of non-volatile memory, one or more types of semi-volatile memory, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and so on. In some embodiments, memory 1028 can have encoded thereon, or otherwise stored therein, a program for controlling operation of data source 902. In such embodiments, processor 1022 can execute at least a portion of the program to generate images, transmit information and/or content (e.g., data, images, a user interface) to one or more computing devices 950, receive information and/or content from one or more computing devices 950, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, etc.), and so on.
In some embodiments, any suitable computer-readable media can be used for storing instructions for performing the functions and/or processes described herein. For example, in some embodiments, computer-readable media can be transitory or non-transitory. For example, non-transitory computer-readable media can include media such as magnetic media (e.g., hard disks, floppy disks), optical media (e.g., compact discs, digital video discs, Blu-ray discs), semiconductor media (e.g., RAM, flash memory, EPROM, EEPROM), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media. As another example, transitory computer-readable media can include signals on networks, in wires, conductors, optical fibers, circuits, or any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
As used herein in the context of computer implementation, unless otherwise specified or limited, the terms “component,” “system,” “module,” “framework,” and the like are intended to encompass part or all of computer-related systems that include hardware, software, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a processor device, a process being executed (or executable) by a processor device, an object, an executable, a thread of execution, a computer program, or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components (or system, module, and so on) may reside within a process or thread of execution, may be localized on one computer, may be distributed between two or more computers or other processor devices, or may be included within another component (or system, module, and so on).
In some implementations, devices or systems disclosed herein can be utilized or installed using methods embodying aspects of the disclosure. Correspondingly, description herein of particular features, capabilities, or intended purposes of a device or system is generally intended to inherently include disclosure of a method of using such features for the intended purposes, a method of implementing such capabilities, and a method of installing disclosed (or otherwise known) components to support these purposes or capabilities. Similarly, unless otherwise indicated or limited, discussion herein of any method of manufacturing or using a particular device or system, including installing the device or system, is intended to inherently include disclosure, as embodiments of the disclosure, of the utilized features and implemented capabilities of such device or system.
The present disclosure has described one or more preferred embodiments, and it should be appreciated that many equivalents, alternatives, variations, and modifications, aside from those expressly stated, are possible and within the scope of the invention.
In an embodiment, source 1104, receiver 1108, and/or image estimation device 1110 can be implemented using a single device or multiple devices. In an embodiment, source 1104, receiver 1108, and/or image estimation device 1110 can be implemented as standalone devices or integrated into one or more host devices. For example, in an embodiment, source 1104, receiver 1108, and image estimation device 1110 are all integrated into a single host device. In an embodiment, image estimation device 1110 is implemented as a standalone device. For example, in an embodiment, image estimation device 1110 is integrated into a host platform, such as a building, a vehicle, a ship, or an unmanned aerial vehicle (UAV). In an embodiment, both receiver 1108 and image estimation device 1110 are integrated into the same host platform.
In an embodiment, image estimation device 1110 includes communications system 1112, a memory 1114, a processor 1116, and a display 1118. In an embodiment, communications system 1112 receives the waveforms collected by receiver 1108, optionally stores them in memory 1114, and sends them to processor 1116. As shown in
In an embodiment, image estimation device 1110 recovers an image x = Φc from undersampled measurements y by recovering c in the equation:
y = Ψx + ν = ΨΦc + ν = Ac + ν   (48)
In Equation (48), Φ represents a dictionary used to represent the image (e.g., wavelets, discrete cosine), Ψ represents a measurement matrix (e.g., representing the sensor), A = ΨΦ represents an effective measurement matrix, and ν represents additive noise. For example, in an embodiment, image estimation sub-system 1124 of image estimation device 1110 uses waveforms collected by receiver 1108 to recover an image x = Φc from undersampled measurements y by recovering c in Equation (48). In an embodiment, image estimation device 1110 represents c via a compound Gaussian (CG) prior (which subsumes many commonly used priors in image estimation, e.g., Laplace, Student's t) according to the equation:
c = z ⊙ u   (49)
In Equation (49), u is a Gaussian random vector and z is a positive, non-Gaussian random vector. In an embodiment, the method estimates z and u by minimizing the cost function:
∥y − A(z ⊙ u)∥₂² + ∥u∥₂² + R(z)   (50)
In Equation (50), the term ∥y − A(z ⊙ u)∥₂² encourages the estimates to fit the observed measurements, the term ∥u∥₂² encourages u to be Gaussian, and the term R(z) imposes a prior distribution on z. In an embodiment, image estimation device 1110 estimates z and u (to in turn estimate c) by minimizing Equation (50). In an embodiment, image estimation device 1110 uses a CG-Net deep neural network (DNN) imaging subsystem obtained by unrolling the imaging method in Equation (50) with R(z) = μ∥log z∥₂². In an embodiment, the optimization landscape is learned by the DNN. In an embodiment, image estimation device 1110 uses a DR-CG-Net DNN imaging subsystem obtained by unrolling the imaging method in Equation (50). In an embodiment, the prior distribution for z is learned by learning R(z).
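By way of a concrete, hedged illustration of Equations (48)-(50) (and not of the CG-Net or DR-CG-Net subsystems themselves), the following Python sketch builds an effective measurement matrix A = ΨΦ from an assumed discrete cosine dictionary and a random Gaussian measurement matrix, synthesizes compound Gaussian coefficients c = z ⊙ u, and estimates z and u by plain alternating gradient descent on the cost in Equation (50) with R(z) = μ∥log z∥₂². The dimensions, noise level, step size, iteration count, initialization, and the use of NumPy/SciPy are illustrative assumptions rather than parameters taken from this disclosure.

# Minimal sketch of Equations (48)-(50): compound Gaussian recovery by
# alternating gradient descent. Dimensions, step size, and noise level are
# assumptions; this is not the unrolled CG-Net / DR-CG-Net of this disclosure.
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
n, m = 128, 64                                   # signal length, number of measurements

Phi = idct(np.eye(n), axis=0, norm="ortho")      # DCT dictionary (x = Phi c)
Psi = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
A = Psi @ Phi                                    # effective measurement matrix A = Psi Phi

# Synthesize compound Gaussian coefficients c = z * u and measurements y (Eq. 48).
u_true = rng.standard_normal(n)
z_true = np.exp(0.5 * rng.standard_normal(n))    # positive, non-Gaussian scale variable
c_true = z_true * u_true
y = A @ c_true + 0.01 * rng.standard_normal(m)

mu, step, iters = 0.1, 1e-3, 3000                # R(z) weight, step size, iterations (assumed)

def cost(z, u):
    """Equation (50) with R(z) = mu * ||log z||_2^2."""
    r = y - A @ (z * u)
    return r @ r + u @ u + mu * np.sum(np.log(z) ** 2)

u = np.zeros(n)                                  # estimate of the Gaussian component
z = np.ones(n)                                   # estimate of the positive scale variable
for _ in range(iters):
    r = y - A @ (z * u)
    u -= step * (-2.0 * z * (A.T @ r) + 2.0 * u)                    # gradient step in u
    r = y - A @ (z * u)
    z -= step * (-2.0 * u * (A.T @ r) + 2.0 * mu * np.log(z) / z)   # gradient step in z
    z = np.clip(z, 1e-3, None)                   # keep the scale variable positive

c_hat = z * u
x_hat = Phi @ c_hat                              # reconstructed image/signal x = Phi c
print("final cost:", cost(z, u))
print("relative error in c:", np.linalg.norm(c_hat - c_true) / np.linalg.norm(c_true))

In the CG-Net and DR-CG-Net subsystems described above, each iteration of such a descent scheme corresponds to a layer of the unrolled network, with the step behavior (and, in DR-CG-Net, the regularizer R(z) itself) learned from training data rather than fixed by hand as in this sketch.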
It is to be appreciated that the Detailed Description, and not the Abstract, is intended to be used to interpret the claims. The Abstract may set forth one or more but not all exemplary embodiments of the present disclosure as contemplated by the inventor(s), and thus, is not intended to limit the present disclosure and the appended claims in any way.
The present disclosure has been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
The foregoing description of the specific embodiments will so fully reveal the general nature of the disclosure that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present disclosure. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.
Any representative signal processing functions described herein can be implemented using computer processors, computer logic, application specific integrated circuits (ASIC), digital signal processors, etc., as will be understood by those skilled in the art based on the discussion given herein. Accordingly, any processor that performs the signal processing functions described herein is within the scope and spirit of the present disclosure.
The above systems and methods may be implemented using a computer program executing on a machine, using a computer program product, or using a tangible and/or non-transitory computer-readable medium having stored instructions. For example, the functions described herein could be embodied by computer program instructions that are executed by a computer processor or any one of the hardware devices listed above. The computer program instructions cause the processor to perform the signal processing functions described herein. The computer program instructions (e.g., software) can be stored in a tangible non-transitory computer usable medium, computer program medium, or any storage medium that can be accessed by a computer or processor. Such media include a memory device such as a RAM or ROM, or another type of computer storage medium such as a computer disk or CD-ROM. Accordingly, any tangible non-transitory computer storage medium having computer program code that causes a processor to perform the signal processing functions described herein is within the scope and spirit of the present disclosure.
While various embodiments of the present disclosure have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the disclosure. Thus, the breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments.
This application claims the benefit of U.S. Provisional Patent Application No. 63/502,582, filed on May 16, 2023, and U.S. Provisional Patent Application No. 63/513,686, filed on Jul. 14, 2023, both of which are incorporated by reference herein in their entireties.
The United States Government has ownership rights in this invention. Licensing inquiries may be directed to Office of Technology Transfer at US Naval Research Laboratory, Code 1004, Washington, DC 20375, USA; +1.202.767.7230; nrltechtran@us.navy.mil, referencing Navy Case Number 211588-US2.
Number | Date | Country
63/502,582 | May 2023 | US
63/513,686 | Jul 2023 | US