The present invention relates generally to medical diagnostic imaging. More specifically, it relates to techniques for image-to-image translation of diagnostic images using deep learning.
In recent years, deep learning has begun to play an important role in image-to-image translation tasks. In radiological imaging, deep learning has made significant contributions to a variety of applications, including but not limited to quantitative MR parametric mapping and water-fat separation. A conventional deep learning-based imaging model, however, is typically trained to learn the physical model from the input radiological images alone, without consideration of the values of the imaging parameters.
We describe here a technique that incorporates the values of imaging parameters as additional input to a deep learning-based radiological imaging model. Thus, not only are radiological images used as input to the deep neural network, but also image maps of the values of critical imaging parameters at every pixel are incorporated into the input to the deep neural network. The inventors have discovered and demonstrated that explicit incorporation of such a priori knowledge as additional input into the network improves prediction accuracy, particularly when flexible imaging parameter values are adopted for data acquisition. Previously, the values of imaging parameters have been used for loss calculation in self-supervised learning, but never as input to the network used for image translation.
Thus, in one aspect, the invention provides a method for diagnostic imaging comprising: performing a diagnostic imaging scan using predetermined image acquisition parameters prescribed in an imaging protocol to produce diagnostic images; and generating translated diagnostic images from the predetermined image acquisition parameters and from the diagnostic images using a deep neural network. The translated diagnostic images are generated by applying both the predetermined image acquisition parameters and the diagnostic images as input to an input layer of the deep neural network. The predetermined image acquisition parameters are input to the deep neural network in the form of parameter image maps with imaging parameter values at each pixel of the parameter image maps. The translated diagnostic images are produced as output from an output layer of the deep neural network.
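For illustration only, the following is a minimal sketch of how scalar acquisition parameter values might be broadcast into per-pixel parameter image maps and concatenated with the diagnostic images as channels of the network input. The sketch assumes a PyTorch implementation; the function and variable names are illustrative and not part of any particular embodiment.

```python
import torch

def build_network_input(images, param_values):
    """Stack diagnostic images and per-pixel parameter maps as input channels.

    images: tensor of shape (batch, n_images, H, W), e.g. T1-weighted images.
    param_values: scalar acquisition parameters (e.g. flip angles in degrees);
                  each is broadcast to a constant H x W map, one channel per parameter.
    """
    batch, _, height, width = images.shape
    param_maps = [
        torch.full((batch, 1, height, width), float(v),
                   dtype=images.dtype, device=images.device)
        for v in param_values
    ]
    # The network input layer sees both the images and the parameter image maps.
    return torch.cat([images] + param_maps, dim=1)

# Example: two T1-weighted VFA images acquired at nominal flip angles of 5 and 30 degrees.
vfa_images = torch.randn(1, 2, 256, 256)
net_input = build_network_input(vfa_images, param_values=[5.0, 30.0])  # shape (1, 4, 256, 256)
```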
In one embodiment, the diagnostic imaging is magnetic resonance imaging, the diagnostic images are T1-weighted images acquired using variable flip angles with or without a B1 map, the predetermined image acquisition parameters comprise the variable flip angles, and the translated diagnostic images comprise a T1 map. The T1-weighted images may be acquired with distinct flip angles. The translated diagnostic images may comprise an uncompensated T1 map. In this case, the nominal variable flip angles are provided as additional input to the neural network. Alternatively, the translated diagnostic images comprise a compensated T1 map that takes into account B1 inhomogeneity. In this case, the predetermined image acquisition parameters (i.e., the nominal flip angles) are combined with a B1 map to produce actual variable flip angles that are input into the neural network in the form of the nominal flip angles modulated by the B1 map. The translated diagnostic images may also comprise a ρ map.
In other embodiments, the diagnostic imaging is chemical shift encoded magnetic resonance imaging (MRI) using dual echo image acquisition; the diagnostic images are in-phase and out-of-phase complex MRI images, the predetermined image acquisition parameters are echo times, and the translated diagnostic images are water and fat images.
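As an illustrative sketch only, assuming a PyTorch implementation and illustrative variable names, the complex dual-echo images and their echo times might be assembled into network input channels as follows, with the echo times broadcast to constant per-pixel parameter maps:

```python
import torch

def water_fat_network_input(echo1, echo2, te1_ms, te2_ms):
    """Assemble network input from dual-echo complex images and their echo times.

    echo1, echo2: complex tensors of shape (batch, H, W) (out-of-phase and in-phase echoes).
    te1_ms, te2_ms: echo times in milliseconds, broadcast to constant parameter maps.
    """
    batch, height, width = echo1.shape
    channels = [echo1.real, echo1.imag, echo2.real, echo2.imag]
    channels += [torch.full((batch, height, width), float(te), dtype=echo1.real.dtype)
                 for te in (te1_ms, te2_ms)]
    return torch.stack(channels, dim=1)  # shape (batch, 6, H, W)
```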
In yet other embodiments, the diagnostic images may be modified look-locker imaging based T1-weighted images, multi-echo T2 or T2* weighted images, continuous wave T1ρ weighted images, or adiabatic T1ρ weighted images. The predetermined image acquisition parameters may be inversion times, echo times, spin-lock times, or number of adiabatic inversion recovery pulses. The translated diagnostic images may comprise a T1 map, a T2 or R2 map, a T2* or R2* map, or a T1ρ map.
The deep neural network may be a convolutional network, attention convolutional network, pure attention network, or generative adversarial network.
In some embodiments, the deep neural network may be trained using training diagnostic images and corresponding translated images generated using a conventional MR image processing technique, such as least squares fitting for generating quantitative parametric maps, or a projected power approach for generating water and fat images.
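For illustration, a minimal sketch of such a conventional least squares fit for variable flip angle (VFA) T1 mapping is given below, based on the standard linearized spoiled gradient echo signal model; the NumPy implementation and function name are illustrative assumptions rather than a prescribed implementation.

```python
import numpy as np

def vfa_t1_least_squares(signals, flip_angles_deg, tr_ms):
    """Per-pixel linear least squares VFA T1 fit, used to make reference parametric maps.

    signals: array of shape (n_flip_angles, H, W) of T1-weighted SPGR magnitudes.
    flip_angles_deg: flip angles, one scalar per acquisition.
    tr_ms: repetition time in ms.
    Uses the linearized SPGR model  S/sin(a) = E1 * S/tan(a) + rho*(1 - E1),  E1 = exp(-TR/T1).
    """
    alphas = np.deg2rad(np.asarray(flip_angles_deg)).reshape(-1, 1, 1)
    y = signals / np.sin(alphas)                 # S / sin(alpha)
    x = signals / np.tan(alphas)                 # S / tan(alpha)
    # Per-pixel slope of y against x (closed form for simple linear regression).
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    slope = ((x - x_mean) * (y - y_mean)).sum(axis=0) / ((x - x_mean) ** 2).sum(axis=0)
    e1 = np.clip(slope, 1e-6, 1 - 1e-6)
    t1_map = -tr_ms / np.log(e1)                 # T1 in ms
    rho_map = (y_mean - e1 * x_mean) / (1.0 - e1)
    return t1_map, rho_map
```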
In other embodiments, the deep neural network is trained using training diagnostic images via a self-supervised learning technique comprising inputting the training diagnostic images to the deep neural network to produce estimated translated images as output, generating synthetic images from the estimated translated images using a model-based calculation, and computing a loss function by comparing the synthetic images to the training diagnostic images. In this way, reference translated images are no longer needed.
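For illustration, the following is a minimal sketch of one such self-supervised training step for VFA T1 mapping, assuming a PyTorch implementation, a multi-output network that predicts T1 and ρ maps, and an L1 loss; the names and the loss choice are illustrative assumptions.

```python
import torch

def spgr_signal(t1_map, rho_map, flip_angle_maps_deg, tr_ms):
    """Synthesize T1-weighted SPGR images from predicted T1 and rho maps (physics model)."""
    e1 = torch.exp(-tr_ms / t1_map.clamp(min=1.0))
    alphas = torch.deg2rad(flip_angle_maps_deg)
    return rho_map * torch.sin(alphas) * (1 - e1) / (1 - e1 * torch.cos(alphas))

def self_supervised_step(network, vfa_images, flip_angle_maps_deg, tr_ms, optimizer):
    """One self-supervised update: predict maps, re-synthesize the inputs, compare."""
    optimizer.zero_grad()
    net_input = torch.cat([vfa_images, flip_angle_maps_deg], dim=1)
    t1_pred, rho_pred = network(net_input)                       # assumed multi-output network
    synthetic = spgr_signal(t1_pred, rho_pred, flip_angle_maps_deg, tr_ms)
    loss = torch.nn.functional.l1_loss(synthetic, vfa_images)    # no reference T1 map needed
    loss.backward()
    optimizer.step()
    return loss.item()
```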
Results demonstrating the performance of embodiments of the present invention are shown in
A schematic diagram illustrating a method for diagnostic imaging according to an embodiment of the present invention is shown in
The diagnostic images 100 and corresponding imaging parameters 102 are both applied to an input layer of a deep neural network 104. The predetermined image acquisition parameters 102 are input to the deep neural network in the form of parameter image maps with imaging parameter values at each pixel of the parameter image maps. For example, the predetermined image acquisition parameters may be the values of the variable flip angles at each pixel. More generally, MR image acquisition parameters are variables in pulse sequences that determine how radiofrequency (RF) pulses are applied so as to achieve certain image contrast, signal-to-noise ratio, acquisition time, and/or resolution in the corresponding MR images. Examples of MR imaging parameters include echo time (TE), repetition time (TR), inversion time, flip angle, and echo train length.
The deep neural network 104 generates as output at an output layer translated diagnostic images 106 from the predetermined image acquisition parameters 102 and from the diagnostic images 100 that were input to the deep neural network. For example, the translated diagnostic images output from the network 104 may be a T1 map generated from T1-weighted images and flip angle acquisition parameters input to the network. More generally, the network 104 is trained to perform image-to-image translation, which is the process of computationally transforming images acquired using given image acquisition parameters into the image(s) that would otherwise have been derived using a conventional processing technique (e.g., least squares fitting, projected power approach). In the present invention, however, the deep learning-based image-to-image translation is performed by supplementing the input images with imaging parameters.
There are different ways to integrate imaging parameters as network input. As an example, the translated diagnostic images 106 output from the network 104 may comprise an uncompensated T1 map or a compensated T1 map that takes into account B1 inhomogeneity. In the former case, the predetermined image acquisition parameters 102, i.e., the nominal flip angles, are directly included as input. In the latter case, the flip angle acquisition parameters are combined with a B1 map to produce actual variable flip angles 102, and the actual variable flip angles are input into the neural network 104 in the form of the nominal flip angles modulated by the B1 map.
This technique can be applied in various radiological imaging modalities, such as MRI, CT, or ultrasound. As an illustrative example, we describe this technique in the context of MRI for quantitative T1 mapping and water-fat separation. We also compare the technique with existing methods that do not use imaging parameters as supplemental input to a deep neural network.
As shown in
and smoothed via a 3D Gaussian kernel.
Also shown in
Similarly,
In the baseline model, two VFA images 226 are combined with the B1 map 228 and input to a deep learning network 230 to generate a compensated T1 map 232. It is significant that this can be performed with only two VFA images as input. In contrast with the baseline method, the present deep learning model 238 predicts the translated images 240 from the same two VFA images 234 supplemented with additional imaging parameter maps 236 that provide the actual flip angles (5° and 30°) at every pixel. The values of the nominal flip angles specified by the imaging protocol (i.e., 5° and 30°) are incorporated into the network input in the form of actual flip angles, where the actual flip angle is the nominal flip angle modulated by the B1 map, as given by α = α_nominal · B1. The model derives the translated images 240 from the two VFA images 234 as well as the images 236 that reflect the actual flip angles at every pixel. Of note, imaging parameters can be combined with other a priori information (e.g., a B1 map) and used as network input.
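For illustration, a minimal sketch of forming such actual flip angle maps and assembling the compensated-model input is given below, assuming a PyTorch implementation; the names are illustrative.

```python
import torch

def actual_flip_angle_maps(nominal_flip_angles_deg, b1_map):
    """Form per-pixel actual flip angle maps  alpha = alpha_nominal * B1  for network input.

    nominal_flip_angles_deg: protocol flip angles, e.g. (5.0, 30.0).
    b1_map: tensor of shape (batch, 1, H, W) with relative transmit field (1.0 = nominal).
    """
    maps = [float(a) * b1_map for a in nominal_flip_angles_deg]
    return torch.cat(maps, dim=1)  # shape (batch, n_angles, H, W)

# Compensated-model input: two VFA images plus the two actual flip angle maps.
vfa = torch.randn(1, 2, 256, 256)
b1 = torch.ones(1, 1, 256, 256)
net_input = torch.cat([vfa, actual_flip_angle_maps((5.0, 30.0), b1)], dim=1)
```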
Two deep learning-based water-fat separation models are compared in
The networks 214, 238, 254 may be trained with diagnostic images and corresponding ground truth translated images (e.g., quantitative parametric maps) that are generated using conventional techniques. For example, the ground truth T1 maps 204 may be generated from the VFA images 200 using least squares fitting 202. Preferably, however, the networks are trained using a self-supervised learning method developed by the inventors, which does not require computation of ground truth translated images (e.g., parametric maps). Even when ground truth maps are not used in training, the T1 maps predicted from two VFA images have high fidelity to the ground truth maps. This training approach is illustrated in
The ρ map, which is used in image synthesis, may be generated together with the T1 map using a multi-output deep neural network, where different parametric maps are predicted using parallel subnets with distinct encoder and decoder paths. Alternatively, only the T1 map is predicted, and the ρ map is calculated from the predicted T1 map and an input T1-weighted image (based on the physics model) in every iteration.
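For illustration, the latter alternative might be sketched as follows, assuming a PyTorch implementation and the spoiled gradient echo signal model; the names and the small stabilizing constant are illustrative assumptions.

```python
import torch

def rho_from_t1(signal, t1_pred, flip_angle_map_deg, tr_ms):
    """Derive a rho map from one input T1-weighted image and the predicted T1 map
    by inverting the SPGR signal model (evaluated in every training iteration)."""
    e1 = torch.exp(-tr_ms / t1_pred.clamp(min=1.0))
    alpha = torch.deg2rad(flip_angle_map_deg)
    return signal * (1 - e1 * torch.cos(alpha)) / (torch.sin(alpha) * (1 - e1) + 1e-8)
```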
More specifically, the value at a position of the attention map is determined by two factors. One is the relevance between the signals at the current position i and another position j, defined by an embedded Gaussian function f(X_i, X_j) = exp((W_f·X_i)^T (W_g·X_j)). The other is a representation of the feature value at the other position j, given by a linear function h(X_j) = W_h·X_j. Here, W_f, W_g, and W_h are weight matrices (implemented as 1×1 convolutions), whose optimal values are identified by the model in training. Within each attention layer, a shortcut connection is established to include local features as well. The contributions of local and non-local information are balanced by a scale parameter a, whose value is obtained in training.
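For illustration, a minimal PyTorch sketch of such an attention layer is shown below; the class name, the channel-reduction factor, and the initialization of the scale parameter are illustrative assumptions rather than a prescribed implementation.

```python
import torch
import torch.nn as nn

class AttentionLayer(nn.Module):
    """Self-attention layer with embedded-Gaussian similarity, a shortcut connection,
    and a learned scale balancing local and non-local contributions (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        reduced = max(channels // 8, 1)                          # illustrative reduction factor
        self.w_f = nn.Conv2d(channels, reduced, kernel_size=1)   # W_f
        self.w_g = nn.Conv2d(channels, reduced, kernel_size=1)   # W_g
        self.w_h = nn.Conv2d(channels, channels, kernel_size=1)  # W_h
        self.scale = nn.Parameter(torch.zeros(1))                # learned balance parameter

    def forward(self, x):
        b, c, h, w = x.shape
        f = self.w_f(x).flatten(2)                 # (b, c', h*w) embeddings at position i
        g = self.w_g(x).flatten(2)                 # (b, c', h*w) embeddings at position j
        # Embedded Gaussian: softmax over exp(f_i . g_j) yields the attention map.
        attention = torch.softmax(torch.bmm(f.transpose(1, 2), g), dim=-1)  # (b, h*w, h*w)
        values = self.w_h(x).flatten(2)            # (b, c, h*w) feature representation at j
        non_local = torch.bmm(values, attention.transpose(1, 2)).reshape(b, c, h, w)
        return x + self.scale * non_local          # shortcut keeps the local features
```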
To simultaneously predict T1 and ρ maps, a multi-output deep neural network may be constructed. The network has parallel subnets with distinct encoder-decoder paths for the generation of individual parametric maps. Each subnet has the network architecture as described in
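For illustration, such a multi-output arrangement might be sketched as follows, assuming a PyTorch implementation; simple_subnet is an illustrative placeholder for the encoder-decoder subnet described above.

```python
import torch
import torch.nn as nn

def simple_subnet(in_channels):
    """Placeholder encoder-decoder path; a real subnet would follow the architecture above."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, kernel_size=3, padding=1),
    )

class MultiOutputNet(nn.Module):
    """Parallel subnets with distinct encoder-decoder paths, one per parametric map."""
    def __init__(self, in_channels, make_subnet=simple_subnet):
        super().__init__()
        self.t1_subnet = make_subnet(in_channels)    # path dedicated to the T1 map
        self.rho_subnet = make_subnet(in_channels)   # path dedicated to the rho map

    def forward(self, x):
        return self.t1_subnet(x), self.rho_subnet(x)
```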
Results demonstrating the performance of embodiments of the present invention are shown in
Significantly, a priori information of the critical imaging parameters is incorporated as additional network input. In fact, this is a new way to make use of a priori information in any deep learning-based medical imaging model. While a medical imaging model can be established without including imaging parameters, explicit provision of such a priori information is expected, and has been demonstrated, to improve the performance of the system. Imaging parameters can be incorporated in different ways, either contributing as independent images (as in uncompensated T1 mapping) or combining with other a priori information to form new images (as in compensated T1 mapping). The mechanism can be applied in supervised or self-supervised learning models, for MR imaging and beyond.
This technique is not limited to the illustrative examples discussed here. Beyond VFA-based T1 mapping, the proposed method can be extended to a variety of quantitative parametric mapping applications, such as inversion recovery based T1 mapping, T2 or T2* mapping, R2 or R2* mapping, and T1ρ mapping.
This technique also can be applied in various radiological imaging facilities (e.g., MRI, CT, ultrasound) or image guided therapeutic facilities (e.g., radiation therapy treatment system).
For example, in variable flip angle imaging based T1 mapping, flip angles can be included; in modified look-locker imaging based T1 mapping, inversion times can be included; in multi-echo T2 or T2* mapping, echo times can be included; in continuous wave T1ρ mapping, spin-lock times can be included; in adiabatic T1ρ mapping, the number of adiabatic inversion recovery pulses can be included. In chemical shift encoded water-fat separation, the echo times or the difference between the echo times can be included.
This technique can be easily combined with any deep neural network architecture (e.g., convolutional network, attention convolutional network, pure attention network, or generative adversarial network). It can be applied to different radiological imaging modalities. The diagnostic imaging scan may use various imaging techniques to acquire input images (the choice of imaging technique depends on imaging modality, application, MR pulse sequence, etc.).
The values of various imaging parameters (e.g., flip angles, echo times, inversion times, spin-lock time, number of adiabatic inversion recovery pulses) may be used as network input (the choice of imaging parameters depends on imaging modality, application, MR pulse sequence, etc.). The values of imaging parameters may be used either as individual images or as images that incorporate other a priori information (e.g., a B1 map). Deep neural networks with different architectures may be used. Supervised, unsupervised, or self-supervised learning models may be used.
This application claims priority from U.S. Provisional Patent Application No. 63/450,225 filed Mar. 6, 2023, which is incorporated herein by reference.
This invention was made with Government support under contract DK117354, EB009690, EB026136 awarded by the National Institutes of Health. The Government has certain rights in the invention.