The subject matter disclosed herein relates to tomographic reconstruction, and in particular to the use of deep learning techniques to accelerate iterative reconstruction approaches.
Non-invasive imaging technologies allow images of the internal structures or features of a patient/object to be obtained without performing an invasive procedure on the patient/object. In particular, such non-invasive imaging technologies rely on various physical principles (such as the differential transmission of X-rays through the target volume, the reflection of acoustic waves within the volume, the paramagnetic properties of different tissues and materials within the volume, the breakdown of targeted radionuclides within the body, and so forth) to acquire data and to construct images or otherwise represent the observed internal features of the patient/object.
All reconstruction algorithms are subject to various trade-offs, such as between computational efficiency, patient dose, scanning speed, image quality, and artifacts. Therefore, there is a need for reconstruction techniques that may provide improved benefits, such as increased reconstruction efficiency or speed, while still achieving good image quality or allowing a low patient dose.
In one embodiment, a neural network training method is provided. In accordance with this method, a plurality of sets of scan data are acquired. An iterative reconstruction of each set of scan data is performed to generate one or more input images and one or more target images for each set of scan data. The one or more input images correspond to lower iteration steps or earlier convergence status of the iterative reconstruction than the one or more target images. A neural network is trained to generate a trained neural network by providing the one or more input images and corresponding one or more target images for each set of scan data to the neural network.
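By way of illustration only, the pairing of early-iteration inputs with a late-iteration target may be sketched as follows. This is a minimal sketch under stated assumptions: the system matrix `A`, the simulated measurements, the choice of iterations 4, 5, and 6 as inputs and iteration 200 as the target, and the use of a simple Landweber (gradient) iteration in place of a full iterative reconstruction are all hypothetical stand-ins, not the disclosed implementation.

```python
import numpy as np

def landweber_iterates(A, y, n_iters, step):
    """Run a simple Landweber (gradient) iteration and keep every iterate."""
    x = np.zeros(A.shape[1])
    iterates = []
    for _ in range(n_iters):
        x = x + step * A.T @ (y - A @ x)  # gradient step on 0.5 * ||A x - y||^2
        iterates.append(x.copy())
    return iterates

def make_training_pair(iterates, input_iters, target_iter):
    """Stack early iterates as the input; use one late iterate as the target.

    Iteration numbers are 1-based.
    """
    inputs = np.stack([iterates[k - 1] for k in input_iters])
    target = iterates[target_iter - 1]
    return inputs, target

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 16))             # hypothetical system matrix
step = 1.0 / np.linalg.norm(A, 2) ** 2    # step of 1/L keeps the iteration stable

pairs = []
for _ in range(5):                        # five simulated sets of "scan data"
    x_true = rng.random(16)
    y = A @ x_true + 0.01 * rng.normal(size=40)
    iterates = landweber_iterates(A, y, n_iters=200, step=step)
    pairs.append(make_training_pair(iterates, input_iters=(4, 5, 6), target_iter=200))
```

Each entry of `pairs` holds a stack of three early estimates and the corresponding late estimate, which is the form in which the input images and target images would be presented to a network during training.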
In another embodiment, an iterative reconstruction method is provided. In accordance with this method, a set of scan data is acquired. An initial reconstruction of the set of scan data is performed to generate one or more initial images. The one or more initial images are provided to a trained neural network as inputs. A predicted image or a predicted update is received as an output of the trained neural network. An iterative reconstruction algorithm is initialized using the predicted image or an image generated using the predicted update. The iterative reconstruction algorithm is run for a plurality of steps to generate an output image.
In a further embodiment, an imaging system is provided. In accordance with this embodiment, the imaging system includes: a data acquisition system configured to acquire a set of scan data from one or more scan components; a processing component configured to execute one or more stored processor-executable routines; and a memory storing the one or more executable routines. The one or more executable routines, when executed by the processing component, cause acts to be performed comprising: performing an initial reconstruction of the set of scan data to generate one or more initial images; providing the one or more initial images to a trained neural network as inputs; receiving a predicted image or a predicted update as an output of the trained neural network; initializing an iterative reconstruction algorithm using the predicted image or an image generated using the predicted update; and running the iterative reconstruction algorithm for a plurality of steps to generate an output image.
These and other features, aspects, and advantages of the present invention will become better understood when the following detailed description is read with reference to the accompanying drawings in which like characters represent like parts throughout the drawings, wherein:
One or more specific embodiments will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
While aspects of the following discussion are provided in the context of medical imaging, it should be appreciated that the present techniques are not limited to such medical contexts. Indeed, the provision of examples and explanations in such a medical context is only to facilitate explanation by providing instances of real-world implementations and applications. However, the present approaches may also be utilized in other contexts, such as tomographic image reconstruction for industrial Computed Tomography (CT) used in non-destructive inspection of manufactured parts or goods (i.e., quality control or quality review applications), and/or the non-invasive inspection of packages, boxes, luggage, and so forth (i.e., security or screening applications). Moreover, the present techniques are applicable to a wide array of image-domain based optimization problems using iterative algorithms. For example, the present techniques may be used to accelerate the iterative algorithms used in image processing and analysis, such as image denoising/smoothing, non-rigid image registration, image enhancement, and so forth. In general, the present approaches may be desirable in any imaging or screening context or image processing field where the final image is the result of optimizing a cost function for which iterative algorithms are employed.
Furthermore, while the following discussion focuses on standard images or image volumes, it should be understood that the same approach can also be applied to sets of images or image volumes corresponding to different aspects of the scan. For example, spectral CT produces a set of images, including monochromatic images at different energies as well as basis material decomposition images. As another example, dynamic CT or PET produces a set of images at different time points. At every iteration of the iterative reconstruction, two or more images are estimated and updated. Hence, the current invention equally applies to these sets of images, where the inputs to the neural network are multiple sets of images and the prediction is also a set of images. For instance, the input may be monochromatic CT images at 60 keV and 100 keV for iteration numbers 4, 5 and 6, while the output may be monochromatic CT images at 60 keV and 100 keV for iteration number 200.
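The spectral example above may be visualized as a channel-stacking step. The following is a minimal sketch, assuming that each image is a 2D array and that images are stacked along a leading channel axis ordered by iteration and then by energy; the dictionary layout, the 8×8 image size, and the random placeholder images are hypothetical and stand in for real reconstructions.

```python
import numpy as np

def stack_spectral_inputs(iterates, energies, input_iters):
    """Stack images as channels, ordered by iteration and then by energy.

    `iterates[(energy_keV, iteration)]` is assumed to hold a 2D image.
    """
    return np.stack([iterates[(e, k)] for k in input_iters for e in energies])

H, W = 8, 8
energies = (60, 100)        # keV, as in the example above
input_iters = (4, 5, 6)     # early iterations used as network input
target_iter = 200           # late iteration used as network target

rng = np.random.default_rng(0)
# Placeholder images standing in for real monochromatic reconstructions
iterates = {(e, k): rng.random((H, W))
            for e in energies for k in (*input_iters, target_iter)}

x = stack_spectral_inputs(iterates, energies, input_iters)      # shape (6, 8, 8)
t = np.stack([iterates[(e, target_iter)] for e in energies])    # shape (2, 8, 8)
```

Under this layout, two energies at three iterations yield six input channels, and the prediction has two channels, one per energy.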
Further, though CT and positron emission tomography (PET) examples are primarily provided herein, it should be understood that the present approach may be used in other imaging modality contexts that may employ iterative image reconstruction techniques. For instance, the presently described approach may also be suitable for use with other types of X-ray tomographic scanners and/or may also be applied to image reconstruction in non-X-ray imaging contexts including, but not limited to, reconstruction of single-photon emission computed tomography (SPECT) images using Bayesian regularized reconstruction of data (e.g., penalized image reconstruction) and/or magnetic resonance (MR) image reconstruction.
In the most general sense an image, as discussed herein, can comprise any array of parameters to be estimated, and iterative reconstruction can comprise any iterative estimation process of these parameters. Hence, another possible application of the proposed approach is to accelerate the training of a neural network, where the network parameters make up the image and are iteratively updated. The network parameters may comprise weights at each node as well as activation energy thresholds. The network is trained iteratively and hence a deep learning method can be applied to estimate the parameters of this other neural network.
With respect to iterative reconstruction, these reconstruction techniques (in contrast to analytical methods) may be desirable for a variety of reasons. Iterative reconstruction algorithms can offer advantages in terms of modeling (and compensating for) the physics of the scan acquisition, modeling the statistics of the measurements to improve the image quality and incorporating prior information. For example, such iterative reconstruction methods may be based on discrete imaging models and may realistically model the system optics, scan geometry, physical effects, and noise statistics. Prior information may be incorporated into the iterative reconstruction using Markov random field neighborhood regularization, Gaussian mixture priors, dictionary learning techniques, and so forth.
As a result, iterative reconstruction techniques often achieve superior image quality, though at relatively high computational cost. For example, model-based iterative reconstruction (MBIR) for CT imaging is a reconstruction technique which iteratively estimates the spatial distribution and values of attenuation coefficients of an image volume from measurements. MBIR is based on an optimization problem whereby a reconstructed image volume is calculated by maximizing or minimizing an objective function containing both data fitting and regularizer terms which in combination control the trade-off between data fidelity and image quality. The data fitting (i.e., data fidelity) term penalizes the error between estimated data obtained from reconstructed images and the acquired data according to an accurate model that takes the noise into consideration. The regularizer term uses prior knowledge of the image (e.g., attenuation coefficients that are similar within a small neighborhood) to reduce possible artifacts, such as streaks and noise. Therefore, MBIR is tolerant to noise and performs well even in low-dose situations. Penalized image reconstruction for other modalities, such as PET, SPECT, and MR, follows a similar principle. The trade-off, however, is that such iterative reconstruction approaches are computationally intensive and may be relatively time consuming, particularly in comparison to analytical reconstruction approaches.
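To make the structure of such an objective function concrete, the following minimal sketch minimizes a weighted least-squares data-fit term plus a quadratic neighborhood (roughness) penalty on a 1D signal using plain gradient descent. It is an illustration under stated assumptions rather than an MBIR implementation: the random matrix `A`, the uniform statistical weights `w`, the first-difference regularizer, and the value of `beta` are all hypothetical choices.

```python
import numpy as np

def first_difference(n):
    """(n-1) x n matrix D with (D x)[j] = x[j+1] - x[j]."""
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def cost(x, A, y, w, D, beta):
    """Data-fit term plus regularizer term."""
    r = A @ x - y
    return 0.5 * np.sum(w * r ** 2) + 0.5 * beta * np.sum((D @ x) ** 2)

def gradient(x, A, y, w, D, beta):
    return A.T @ (w * (A @ x - y)) + beta * D.T @ (D @ x)

rng = np.random.default_rng(0)
n = 32
A = rng.random((64, n))                            # hypothetical forward model
x_true = np.repeat([0.2, 1.0, 0.5, 0.8], n // 4)   # piecewise-constant "object"
y = A @ x_true + 0.05 * rng.normal(size=64)        # noisy measurements
w = np.ones(64)                                    # statistical weights (uniform here)
D = first_difference(n)
beta = 0.5                                         # regularization strength

# Step size 1/L, where L is the largest eigenvalue of the Hessian
hessian = A.T @ (w[:, None] * A) + beta * D.T @ D
step = 1.0 / np.linalg.norm(hessian, 2)

x = np.zeros(n)
costs = []
for _ in range(200):
    costs.append(cost(x, A, y, w, D, beta))
    x = x - step * gradient(x, A, y, w, D, beta)
```

The list `costs` traces the convergence curve of this toy problem; the many small steps needed to flatten that curve are what the acceleration approach described herein aims to skip.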
With the preceding introductory comments in mind, the approaches described herein utilize deep learning techniques to accelerate iterative reconstruction of images, such as CT, PET, SPECT, and MR images. As discussed herein, deep learning techniques (which may also be known as deep machine learning, hierarchical learning, or deep structured learning) are a branch of machine learning techniques that employ mathematical representations of data and artificial neural networks for learning. By way of example, deep learning approaches may be characterized by their use of one or more algorithms to extract or model high level abstractions of a type of data of interest. This may be accomplished using one or more processing layers, with each layer typically corresponding to a different level of abstraction and, therefore, potentially employing or utilizing different aspects of the initial data or outputs of a preceding layer (i.e., a hierarchy or cascade of layers) as the target of the processes or algorithms of a given layer. In an image processing or reconstruction context, this may be characterized as different layers corresponding to the different feature levels or resolution in the data. Processing may therefore proceed hierarchically, i.e., earlier or higher level layers may correspond to higher level or larger features, followed by layers that derive lower level or finer features from the higher level features. In practice, each layer may employ one or more linear and/or non-linear transforms to process the input data to an output data representation for the layer.
As discussed herein, as part of the initial training of deep learning processes to solve a particular problem, training data sets may be employed that have known initial values (e.g., input images) and known or desired values for one or both of the final output (e.g., target images or image updates) of the deep learning process or for individual layers of the deep learning process (assuming a multi-layer algorithmic implementation). In this manner, the deep learning algorithms may process (either in a supervised or guided manner or in an unsupervised or unguided manner) the known or training data sets until the mathematical relationships between the initial data and desired output(s) are discerned and/or the mathematical relationships between the inputs and outputs of each layer are discerned and characterized. Similarly, separate validation data sets may be employed in which both input and desired target values are known, but only the initial values are supplied to the trained deep learning algorithms, with the outputs of the deep learning algorithm then being compared to the known target values to validate the prior training and/or to prevent over-training.
By way of visualization,
The loss or error function 62 measures the difference between the network output (i.e., Ipred) and the training target (i.e., IN) (see
To facilitate explanation of the present iterative reconstruction acceleration using deep learning techniques, the present disclosure primarily discusses these approaches in the context of a CT or PET system. However, it should be understood that the following discussion may also be applicable to other image modalities and systems including, but not limited to, SPECT, magnetic resonance imaging (MRI), as well as to non-medical contexts or any context where iterated reconstruction steps are employed to reconstruct an image. Moreover, the same principle and similar approaches are applicable to image processing problems where an iterative algorithm is used to optimize a cost function to generate the final desired image.
With this in mind, an example of an imaging system 110 (i.e., a scanner) is depicted in
In the depicted example, the collimator 114 shapes or limits a beam of X-rays 116 that passes into a region in which a patient/object 118 is positioned. In the depicted example, the X-rays 116 are collimated to be a cone-shaped beam, i.e., a cone-beam, that passes through the imaged volume. A portion of the X-ray radiation 120 passes through or around the patient/object 118 (or other subject of interest) and impacts a detector array, represented generally at reference numeral 122. Detector elements of the array produce electrical signals that represent the intensity of the incident X-rays 120. These signals are acquired and processed to reconstruct images of the features within the patient/object 118.
Source 112 is controlled by a system controller 124, which furnishes both power and control signals for CT examination sequences, including acquisition of two-dimensional localizer or scout images used to identify anatomy of interest within the patient/object for subsequent scan protocols. In the depicted embodiment, the system controller 124 controls the source 112 via an X-ray controller 126, which may be a component of the system controller 124. In such an embodiment, the X-ray controller 126 may be configured to provide power and timing signals to the X-ray source 112.
Moreover, the detector 122 is coupled to the system controller 124, which controls acquisition of the signals generated in the detector 122. In the depicted embodiment, the system controller 124 acquires the signals generated by the detector using a data acquisition system 128. The data acquisition system 128 receives data collected by readout electronics of the detector 122. The data acquisition system 128 may receive sampled analog signals from the detector 122 and convert the data to digital signals for subsequent processing by a processor 130 discussed below. Alternatively, in other embodiments the analog-to-digital conversion may be performed by circuitry provided on the detector 122 itself. The system controller 124 may also execute various signal processing and filtration functions with regard to the acquired image signals, such as for initial adjustment of dynamic ranges, interleaving of digital image data, and so forth.
In the embodiment illustrated in
The linear positioning subsystem 134 may enable the patient/object 118, or more specifically a table supporting the patient, to be displaced within the bore of the CT system 110, such as in the z-direction relative to rotation of the gantry. Thus, the table may be linearly moved (in a continuous or step-wise fashion) within the gantry to generate images of particular areas of the patient 118. In the depicted embodiment, the system controller 124 controls the movement of the rotational subsystem 132 and/or the linear positioning subsystem 134 via a motor controller 136.
In general, system controller 124 commands operation of the imaging system 110 (such as via the operation of the source 112, detector 122, and positioning systems described above) to execute examination protocols and to process acquired data. For example, the system controller 124, via the systems and controllers noted above, may rotate a gantry supporting the source 112 and detector 122 about a subject of interest so that X-ray attenuation data may be obtained at one or more views relative to the subject. In the present context, system controller 124 may also include signal processing circuitry, associated memory circuitry for storing programs and routines executed by the computer (such as routines for executing accelerated image processing or reconstruction techniques described herein), as well as configuration parameters, image data, and so forth.
In the depicted embodiment, the image signals acquired and processed by the system controller 124 are provided to a processing component 130 for reconstruction of images in accordance with the presently disclosed algorithms. The processing component 130 may be one or more general or application-specific microprocessors. The data collected by the data acquisition system 128 may be transmitted to the processing component 130 directly or after storage in a memory 138. Any type of memory suitable for storing data might be utilized by such an exemplary system 110. For example, the memory 138 may include one or more optical, magnetic, and/or solid state memory storage structures. Moreover, the memory 138 may be located at the acquisition system site and/or may include remote storage devices for storing data, processing parameters, and/or routines for image reconstruction, as described below.
The processing component 130 may be configured to receive commands and scanning parameters from an operator via an operator workstation 140, typically equipped with a keyboard and/or other input devices. An operator may control the system 110 via the operator workstation 140. Thus, the operator may observe the reconstructed images and/or otherwise operate the system 110 using the operator workstation 140. For example, a display 142 coupled to the operator workstation 140 may be utilized to observe the reconstructed images and to control imaging. Additionally, the images may also be printed by a printer 144 which may be coupled to the operator workstation 140.
Further, the processing component 130 and operator workstation 140 may be coupled to other output devices, which may include standard or special purpose computer monitors and associated processing circuitry. One or more operator workstations 140 may be further linked in the system for outputting system parameters, requesting examinations, viewing images, and so forth. In general, displays, printers, workstations, and similar devices supplied within the system may be local to the data acquisition components, or may be remote from these components, such as elsewhere within an institution or hospital, or in an entirely different location, linked to the image acquisition system via one or more configurable networks, such as the Internet, virtual private networks, and so forth.
It should be further noted that the operator workstation 140 may also be coupled to a picture archiving and communications system (PACS) 146. PACS 146 may in turn be coupled to a remote client 148, radiology department information system (RIS), hospital information system (HIS) or to an internal or external network, so that others at different locations may gain access to the raw or processed image data.
While the preceding discussion has treated the various exemplary components of the imaging system 110 separately, these various components may be provided within a common platform or in interconnected platforms. For example, the processing component 130, memory 138, and operator workstation 140 may be provided collectively as a general or special purpose computer or workstation configured to operate in accordance with the aspects of the present disclosure. In such embodiments, the general or special purpose computer may be provided as a separate component with respect to the data acquisition components of the system 110 or may be provided in a common platform with such components. Likewise, the system controller 124 may be provided as part of such a computer or workstation or as part of a separate system dedicated to image acquisition.
The system of
The estimated image may then be used in the iterative reconstruction so as to effectively move ahead that many steps in the reconstruction process, without performing the intervening iterative reconstruction steps. While the present approach may be used to effectively skip from the beginning of the iterative reconstruction to the final image, in practice it may instead be useful to apply the approach one or more times during reconstruction so as to jump ahead in a more discrete and controlled manner, allowing the reconstruction algorithms to make adjustments as needed throughout the process. For example, the present approach may be applied after a certain number of iterative reconstruction steps to jump ahead 50, 100, 500, or 1000 steps, and then allow the conventional reconstruction steps to proceed as usual, thereby saving the computational time associated with the number of steps skipped. Alternatively, the present approach may instead be applied multiple times (e.g., 2, 3, 4, 5, 10) over the course of the reconstruction to jump ahead some number of steps (e.g., 25, 50, 100, 500) with each application, and then allow the reconstruction to proceed after each application for some number of steps so that the reconstruction process may make any needed corrections or adjustments. In such uses, different instances or applications of the deep learning acceleration during a single reconstruction may jump ahead different numbers of steps. Further, in such uses, the deep learning algorithm employed at different stages of the iterative reconstruction process may be differently trained (i.e., a different algorithm) to account for the respective stage of the reconstruction process. In such an example, new data (i.e., in clinical or diagnostic use, the patient data) may be reconstructed up to the point, or in a manner, corresponding to what the deep learning acceleration algorithms were trained to receive as inputs.
Alternatively, in other instances the same algorithm may be employed regardless of the stage of the reconstruction process.
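The alternation between conventional iterations and learned jumps described above may be sketched as a simple control loop. In the following minimal sketch, the function name `accelerated_reconstruction`, the schedule format, and the toy least-squares problem are hypothetical. In particular, `predict` is only a stand-in for a trained neural network: it performs an exact line search along the most recent update direction, which is a different technique chosen here solely so that the example is self-contained and deterministic.

```python
import numpy as np

def accelerated_reconstruction(x0, iterate, predict, schedule, final_steps):
    """Alternate blocks of conventional iterations with predicted jumps.

    schedule: number of conventional steps to run before each jump.
    predict:  callable mapping the most recent iterates to a new estimate.
    """
    x, history = x0, [x0]
    for n_steps in schedule:
        for _ in range(n_steps):
            x = iterate(x)
            history.append(x)
        x = predict(history[-3:])     # jump ahead using up to three recent iterates
        history.append(x)
    for _ in range(final_steps):      # let the algorithm correct the prediction
        x = iterate(x)
        history.append(x)
    return x, history

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 20))         # hypothetical system matrix
y = A @ rng.random(20)                # noiseless simulated measurements
step = 1.0 / np.linalg.norm(A, 2) ** 2

def cost(x):
    return 0.5 * np.sum((A @ x - y) ** 2)

def iterate(x):
    """One conventional step (plain gradient descent on the cost)."""
    return x - step * A.T @ (A @ x - y)

def predict(recent):
    """Stand-in for a trained network: exact line search along the last update."""
    d = recent[-1] - recent[-2]
    Ad = A @ d
    denom = Ad @ Ad
    if denom == 0.0:
        return recent[-1]
    gamma = -(Ad @ (A @ recent[-1] - y)) / denom
    return recent[-1] + gamma * d

x_acc, history = accelerated_reconstruction(
    np.zeros(20), iterate, predict, schedule=[5, 5], final_steps=10)
costs = [cost(x) for x in history]
```

With a trained model, `predict` would be replaced by the network inference, and a differently trained model could be supplied for each entry of the schedule or the same model could be reused throughout.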
An example of this concept is shown in
A simple example of the above discussion is shown in
To further illustrate the present concepts,
The preceding has generally described the use of deep learning approaches for estimating a predicted image that may be used to reinitialize a reconstruction process at a later stage. Alternatively, one could instead predict the update to the previous estimate to define a current estimate. That is, IN=Im3+ΔIpred, where the deep learning model learns to predict ΔIpred from {Im1, Im2, Im3}, with reference to
With the preceding in mind, and turning to
Turning to
In some implementations, some or all of the inputs used to train the deep learning model could be difference images between an early iteration image (or patch) and another corresponding early iteration image or patch (e.g., {Im1, Im2-Im1, Im3, etc.} or {Im1, Im2, Im3-Im2, etc.}). Likewise, in some instances the inputs could include image feature descriptors, such as gradients and edges, obtained from the early estimates {Im1, Im2, Im3}. In some cases, hyper-parameters used by penalized iterative algorithms, such as the prior weight β, or a transformation of the hyper-parameters and data such as κ (Kappa) used in a PET image reconstruction which combines the hyper-parameter and data dependency, could also be part of the input to the network. Further, in certain embodiments some or all of the early iteration image estimates (or difference images, or feature descriptors) generated for training may be of reduced size to speed up the network training process. In such reduced size implementations, the network prediction, when applied in a non-training context (i.e., a clinical or diagnostic context) can be scaled up to correspond to a regular size reconstruction.
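The difference-image inputs, feature descriptors, and predicted-update formulation (IN=Im3+ΔIpred) discussed above may be sketched as follows. This is a minimal sketch assuming 2D images held as NumPy arrays; the particular channel selection, the use of a gradient-magnitude map as the feature descriptor, and the random placeholder images are hypothetical choices for illustration.

```python
import numpy as np

def build_inputs(im1, im2, im3):
    """Assemble input channels from three early estimates.

    Channels: first estimate, two successive difference images, and a
    gradient-magnitude descriptor of the latest estimate.
    """
    gy, gx = np.gradient(im3)         # derivatives along rows and columns
    grad_mag = np.hypot(gx, gy)
    return np.stack([im1, im2 - im1, im3 - im2, grad_mag])

def residual_target(im3, i_n):
    """Training target when the model predicts an update rather than an image."""
    return i_n - im3

def apply_update(im3, delta_pred):
    """Form the current estimate from the previous estimate and the update."""
    return im3 + delta_pred

rng = np.random.default_rng(0)
# Placeholder images standing in for iterates m1 < m2 < m3 and a late iterate N
im1, im2, im3, i_n = (rng.random((16, 16)) for _ in range(4))

inputs = build_inputs(im1, im2, im3)      # shape (4, 16, 16)
delta = residual_target(im3, i_n)         # what the model would learn to predict
restored = apply_update(im3, delta)       # equals i_n when the prediction is exact
```

Predicting the update rather than the image itself means the model only has to represent the change between Im3 and IN, which is typically smaller in magnitude than the image.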
Turning back to
In one implementation, the weights and biases of the trained neural network 230 are updated using backpropagation by optimization algorithms such as stochastic gradient descent, Adam, AdaGrad, RMSProp, or transferred and optimized from a pre-trained neural network 222. The hyper-parameters of the trained network 230, such as the number of layers (54, 58A, 58B, 60, etc.) and the number of neurons 56 for each layer, the number of convolutional kernels and the size of these kernels in the case of a convolutional neural network, and the hyper-parameters for the optimization algorithms used to update the neural network during training, can be chosen through random or grid search, optimization algorithms, or simple trial and error. Techniques like dropout may be used to avoid overfitting of the network 230. In some implementations, validation data sets were used to validate the trained neural network. Turning to
Turning back to training, a further example is explained with reference to
With the preceding in mind, it is possible to train a series of neural networks in this fashion, such as different models or networks for different stages of an iterative reconstruction process. For example, a first neural network may be trained by using early estimates (Im1, Im2, and Im3) as input and an estimate IN1 at iteration number N1 as the first target image where N1>m3. A second neural network may then be trained using IN1 as an input and an estimate IN2 at iteration N2 as the target image, where N2>N1. These sub-networks can be cascaded to form one deeper neural network and serve to pre-train and initialize a final, deeper network, so that the final network can achieve faster convergence and better performance.
In one implementation of such an embodiment, the hidden layers (i.e., layers 58A, 58B, and so forth) in the proposed deep networks can be pre-trained layer-by-layer by leveraging the intermediate iterates of a conventional iterative algorithm. For example, consider an L-layer network that takes IK and IN as the training input and training target, respectively. Pre-training of the S-th layer of the network could take IK+S−1 as the input, and IK+S (where (K+S)<N) as the training target.
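The index bookkeeping for such layer-by-layer pre-training may be sketched as follows. The function name `pretraining_pairs` and the example values K=20, L=3, and N=200 are hypothetical; the sketch only enumerates which iterate serves as input and which as target for each layer, under the stated constraint (K+S)<N.

```python
def pretraining_pairs(K, L, N):
    """Return (input_iteration, target_iteration) for layers S = 1..L.

    Layer S is pre-trained to map iterate K+S-1 to iterate K+S,
    which requires (K + S) < N.
    """
    pairs = []
    for S in range(1, L + 1):
        if K + S >= N:
            raise ValueError(f"layer {S}: target iteration {K + S} must be below N={N}")
        pairs.append((K + S - 1, K + S))
    return pairs

pairs = pretraining_pairs(K=20, L=3, N=200)
# pairs is [(20, 21), (21, 22), (22, 23)]
```

Each layer is thus initialized to mimic a single conventional iteration, after which the full network may be fine-tuned end to end with IK as the input and IN as the target.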
It may also be appreciated that while the preceding discussion suggests the use of iteratively reconstructed images as inputs to a trained deep learning model to accelerate an iterative reconstruction process, in practice it may be possible to use images reconstructed using other algorithms as inputs to the trained deep learning model. By way of example, in such implementations the input images to the trained neural network may be reconstructed using approaches that are faster than the iterative reconstruction under consideration, such as analytical reconstruction approaches (e.g., Feldkamp-Davis-Kress (FDK) or filtered back projection (FBP)) or other fast iterative reconstruction methods such as ordered subset expectation maximization (OSEM). In essence, images obtained from other algorithms correspond to some points on the convergence curve of the iterative reconstruction method under consideration.
While the preceding outline the underpinnings and variations on the present approach,
With respect to network training, the network employed was defined as having three convolutional layers with a rectified linear unit (ReLU) activation function and without POOL layers. Layer 1 was defined as a 3×3 kernel with 36 filters; Layer 2 was defined as a 3×3 kernel with 16 filters; and Layer 3 was defined as a 5×5 kernel with 1 filter. Images from the 20th iteration were used as input images for training while images from the 200th iteration were used as target images. As noted in the preceding sections, it would also have been possible to use images from multiple iterations as inputs (e.g., the 10th, 20th, and 30th iterations) and/or to use neighboring slices as part of the input data. Thus, an input image corresponding to a 20th iteration image is fed to a deep learning model (embodied as a neural network) so as to allow the model to learn how to estimate a 200th iteration image from the input image. The resulting deep learning network was therefore trained to receive as input images (here 2D slices) run to the 20th iteration using BSREM and to output predicted images (here 2D slices) that have a cost function value corresponding to that of an image that would be generated by BSREM with an iteration number larger than the initial 20 iterations. BSREM reconstruction was then further applied using the predicted images as an initialization.
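The layer configuration described above may be sketched as a forward pass in plain NumPy. Only the kernel sizes and filter counts come from the description; the single-channel input, zero padding that preserves the image size, the per-filter bias terms, the random weight initialization, and the omission of the ReLU after the final layer are assumptions made for this sketch, and the weights are untrained.

```python
import numpy as np

LAYERS = [(3, 36), (3, 16), (5, 1)]   # (kernel size, number of filters)

def init_params(layers, in_channels=1, seed=0):
    """Randomly initialize weights and zero biases (untrained placeholder values)."""
    rng = np.random.default_rng(seed)
    params, c_in = [], in_channels
    for k, c_out in layers:
        w = rng.normal(scale=np.sqrt(2.0 / (k * k * c_in)), size=(c_out, c_in, k, k))
        params.append((w, np.zeros(c_out)))
        c_in = c_out
    return params

def conv2d_same(x, w, b):
    """Cross-correlation with zero padding. x: (C_in, H, W); w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    out = np.zeros((c_out, H, W))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + H, j:j + W]               # shifted view, (C_in, H, W)
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out + b[:, None, None]

def forward(x, params):
    """Apply the layers in order, with ReLU after all but the last layer (assumed)."""
    for idx, (w, b) in enumerate(params):
        x = conv2d_same(x, w, b)
        if idx < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

params = init_params(LAYERS)
n_params = sum(w.size + b.size for w, b in params)        # 360 + 5200 + 401

image_20 = np.random.default_rng(1).random((1, 32, 32))   # placeholder 20th-iteration slice
pred = forward(image_20, params)                          # same spatial size as the input
```

Under these assumptions the network has 5,961 trainable parameters, and because no pooling is used the predicted slice has the same dimensions as the input slice, so it can be passed directly back to the iterative algorithm as an initialization.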
Turning to
As previously noted, in the iterative reconstruction context, the purpose of iterative steps is to maximize or minimize the cost function. Therefore, acceleration provided by the present approach may be evaluated in terms of cost function changes (i.e., was acceleration gained in terms of the cost function). In this particular study, the goal is to maximize a cost/objective function. This analysis is shown for the present study in
As shown in the plotted graph, BSREM was run for 20 iterations and deep learning based acceleration was used to obtain predicted images which were then used to re-initialize the BSREM algorithm in the deep learning accelerated portion of the study. As may be observed, deep learning based acceleration produced a substantial jump in the cost function, reflecting a much faster convergence. Indeed, not only was a large jump in the objective function observed, but the deep learning based acceleration effectively moved reconstruction onto a faster convergence track. In this study, even after 200 iterations of BSREM on the un-accelerated track (toward the upper right end of the curve), a cost function level is still not reached that is comparable to what was seen on the accelerated track in only 30 iterations. This indicates an acceleration of roughly a factor of seven or more (200 versus 30 iterations) using deep learning based acceleration for this study.
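The comparison of convergence tracks described above may be expressed as a small calculation on two objective-function curves. The following is a minimal sketch for a maximization problem; the function names are hypothetical and the two curves are synthetic exponentials chosen purely for illustration, not data from the study.

```python
import numpy as np

def iterations_to_reach(costs, level):
    """First 1-based iteration whose objective value is >= level, or None."""
    hits = np.nonzero(np.asarray(costs) >= level)[0]
    return int(hits[0]) + 1 if hits.size else None

def acceleration_factor(baseline, accelerated, level):
    """Return (factor, is_lower_bound) for reaching `level` on each track.

    If the baseline never reaches the level, its length gives a lower bound.
    """
    n_acc = iterations_to_reach(accelerated, level)
    if n_acc is None:
        return None, False
    n_base = iterations_to_reach(baseline, level)
    if n_base is None:
        return len(baseline) / n_acc, True
    return n_base / n_acc, False

k = np.arange(1, 201)
baseline = 1.0 - np.exp(-k / 100.0)      # synthetic slow convergence track
accelerated = 1.0 - np.exp(-k / 10.0)    # synthetic fast convergence track

level = accelerated[29]                  # objective value at accelerated iteration 30
factor, is_lower_bound = acceleration_factor(baseline, accelerated, level)
```

For these synthetic curves the baseline never reaches the chosen level within 200 iterations, so the reported factor (200/30, about 6.7) is only a lower bound on the speed-up.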
Technical effects of the invention include utilizing deep learning techniques to accelerate iterative reconstruction of images, such as CT, PET, SPECT, and MR images. In particular, projection (or other) data acquired by an imaging system may be iteratively reconstructed using deep learning approaches to accelerate the reconstruction processing. The present approach utilizes deep learning techniques so as to provide a better initialization to one or more steps of the numerical iterative reconstruction algorithm by learning a trajectory of convergence from estimates at different levels of convergence so that it can reach the maximum or minimum of a cost function faster. In essence, the present approach may be construed as taking one or more images at one or more stages of the iterative reconstruction and using trained deep learning algorithms to obtain an estimate of what the image will look like for some number of iterative reconstruction steps in the future (e.g., 10, 50, 100, 200, 500 steps, and so forth). The estimated image may then be used in the iterative reconstruction so as to effectively move ahead that many steps in the reconstruction process, without performing the intervening iterative reconstruction steps.
This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal languages of the claims.