This application is based on and claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2022-0097571, filed on Aug. 4, 2022, in the Korean Intellectual Property Office, the disclosure of which is incorporated by reference herein in its entirety.
The disclosure relates to a method and a device for converting a medical image, and more particularly, to a method and a device for converting a non-contrast image into a contrast image by using a deep learning network.
The disclosure was supported by the “Critical Care Patient Specialized Big Data Construction and AI-based CDSS Development” project hosted by Seoul National University Hospital (Task identification number: HI21C1074, Assignment number: HI21C1074050021).
In order to distinguish lesions more clearly during diagnosis or treatment, computed tomography (CT) or magnetic resonance imaging (MRI) scans are performed after administering a contrast medium. A medical image captured with a contrast medium has high tissue contrast, and thus lesions may be clearly distinguished. However, contrast media are nephrotoxic. For example, the gadolinium contrast media used in MRI are more nephrotoxic than the iodinated contrast media used in CT scans, and thus may not be used when renal function is reduced. Therefore, elderly patients or patients with impaired renal function have no choice but to undergo non-contrast imaging.
Provided are an image conversion method and device capable of generating and outputting a contrast image from a non-contrast image by using a deep learning network.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments of the disclosure.
According to an embodiment, an image conversion method includes inputting a non-contrast image to a deep learning network, and generating and outputting a contrast image through the deep learning network, wherein the deep learning network is trained with learning data including one or more contrast learning images and one or more non-contrast learning images.
According to an embodiment, an image conversion device includes an input unit configured to input a non-contrast image to a deep learning network, and a conversion unit configured to generate and output a contrast image through the deep learning network, wherein the deep learning network is trained with learning data including one or more contrast learning images and one or more non-contrast learning images.
The above and other aspects, features, and advantages of certain embodiments of the disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present embodiments may have different forms and should not be construed as being limited to the descriptions set forth herein. Accordingly, the embodiments are merely described below, by referring to the figures, to explain aspects of the present description. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
Hereinafter, an image conversion method and device according to an embodiment will be described in detail with reference to the accompanying drawings.
The deep learning network 200 may be trained by using learning data including a dataset of the non-contrast learning images 210 and the contrast learning images 220, which serve as the ground truth. The number of datasets included in the learning data may vary according to the embodiment. When the non-contrast learning image 210 is received, the deep learning network 200 outputs the prediction image 230, compares the prediction image 230 with the contrast learning image 220 of the learning data, and adjusts its internal parameters so that the error between the prediction image 230 and the contrast learning image 220 is minimized.
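The following is a minimal sketch of this training loop, assuming a PyTorch-style framework; the function name, the use of a pixel-wise L1 loss, and the optimizer interface are illustrative assumptions, since the disclosure does not specify a particular framework or loss function.

```python
import torch.nn.functional as F

def train_step(model, optimizer, noncontrast_batch, contrast_batch):
    """One training iteration for the deep learning network 200 (sketch).

    noncontrast_batch plays the role of the non-contrast learning
    image 210, and contrast_batch the ground-truth contrast image 220."""
    optimizer.zero_grad()
    prediction = model(noncontrast_batch)         # prediction image 230
    loss = F.l1_loss(prediction, contrast_batch)  # error vs. ground truth 220
    loss.backward()                               # adjust internal parameters
    optimizer.step()                              # so that the error is minimized
    return loss.item()
```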
The non-contrast learning image 210 and the contrast learning image 220 of the learning data may be images captured directly through CT or MRI, or images processed for learning. For example, non-contrast images and contrast images captured directly from a plurality of patients may be used as the learning data in the embodiment. The contrast learning image 220 of the learning data may be a T1-weighted MRI image captured after administering a contrast medium.
Contrast images captured after administering the contrast medium may show different degrees of contrast enhancement over time for each patient. For example, the degree of contrast enhancement in a contrast image of a first patient and that in a contrast image of a second patient, both captured at the same time point after administration of the contrast medium, may differ from each other. When contrast images with such varying degrees of enhancement are used as learning data, the learning performance of the deep learning network is degraded.
In order to solve this problem, the deep learning network 200 may use a maximum intensity projection (MIP) image 220, obtained by processing a plurality of contrast learning images, as the learning data, instead of using the contrast learning images as they are. The MIP image 220 is an image composed of the pixels having the highest brightness among the pixels of the plurality of contrast learning images, which is described again below.
In the embodiment, to aid understanding, some pixels of the first to third contrast learning images 300, 310, and 320 are shown in different shapes. For example, the pixels 302, 304, and 306 of the first contrast learning image 300 are displayed as triangles, the pixels 312, 314, and 316 of the second contrast learning image 310 are displayed as squares, and the pixels 322, 324, and 326 of the third contrast learning image 320 are displayed as stars.
When the brightness value of the pixel 302 of the first contrast learning image 300 is the greatest among the pixels 302, 312, and 322 of the first to third contrast learning images 300, 310, and 320 at a first position, the image conversion device 100 selects the brightness value of the pixel 302 of the first contrast learning image 300 as the brightness value of the pixel 332 of the MIP image 330 at the first position. In the same way, when the brightness value of the pixel 314 of the second contrast learning image 310 is the greatest among the pixels 304, 314, and 324 of the first to third contrast learning images 300, 310, and 320 at a second position, the image conversion device 100 selects the brightness value of the pixel 314 of the second contrast learning image 310 as the brightness value of the corresponding pixel of the MIP image 330 at the second position. In other words, when the plurality of contrast learning images 300, 310, and 320 are all projected onto one plane, the image conversion device 100 selects the maximum of the brightness values of the pixels at each overlapping position as the brightness value of the corresponding pixel of the MIP image 330.
For convenience of explanation, the embodiment shows an example of generating the MIP image 330 from three contrast learning images 300, 310, and 320, but the number of contrast learning images used to generate the MIP image 330 may vary according to the embodiment and may be, for example, two, or four or more.
As an embodiment, the plurality of contrast learning images 300, 310, and 320 used to generate the MIP image 330 may be a plurality of contrast images captured over a certain period of time after administration of the contrast medium. For example, the image conversion device 100 may select T1-weighted MRI images captured after administering the contrast medium as the plurality of contrast learning images 300, 310, and 320 used to generate the MIP image 330. As another example, T2 and FLAIR images may additionally be used, but in this case, the image acquisition time becomes long, which is burdensome to the patient.
A part (or a pixel) that appears brighter in the non-contrast image 400 than in the contrast image 410 may exist. Therefore, when the MIP image 420 is generated by using the non-contrast image 400 together with the contrast images and the deep learning network 200 is trained with it, a contrast image (i.e., a prediction image) having clearer contrast may be obtained.
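The MIP construction described above can be sketched in a few lines of NumPy; the function name and the assumption that the images are co-registered 2D arrays of equal shape are illustrative.

```python
import numpy as np

def build_mip(contrast_images, noncontrast_image=None):
    """Pixel-wise maximum intensity projection (sketch).

    contrast_images: sequence of 2D arrays (e.g., the contrast learning
    images 300, 310, and 320) registered to the same coordinates.
    If noncontrast_image is given, it joins the stack so that pixels
    appearing brighter without contrast are preserved (cf. the MIP
    image 420 generated together with the non-contrast image 400)."""
    stack = list(contrast_images)
    if noncontrast_image is not None:
        stack.append(noncontrast_image)
    # For every pixel position, keep the maximum brightness value.
    return np.max(np.stack(stack, axis=0), axis=0)
```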
The image conversion device 100 obtains a prediction image 740 by inputting a non-contrast learning image 720 of the learning data to the U-Net 700. The image conversion device 100 obtains a first result 750 by inputting the prediction image 740 output by the U-Net 700 to the VGG network 710. In addition, the image conversion device 100 obtains a second result 760 by inputting a contrast learning image 730, which is the ground truth of the learning data, to the VGG network 710. That is, the prediction image 740 and the contrast learning image 730 are input to the same network, the VGG network 710.
The image conversion device 100 determines an error 770 between the first result 750 and the second result 760, obtained by inputting the prediction image 740 and the contrast learning image 730 to the VGG network 710, respectively, and trains the U-Net 700 so that the error 770 is minimized. The image conversion device 100 may obtain a contrast image (i.e., the prediction image 740) by inputting a non-contrast image to the U-Net 700 for which training is complete. As an embodiment, the contrast learning image 730 of the learning data may be the MIP image described above.
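This feature-space comparison is commonly implemented as a perceptual loss, sketched below. The choice of a VGG16 backbone truncated after its first 16 layers is an assumption; the disclosure states only that both images pass through the same VGG network and that the error between the two results is minimized.

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class VGGFeatureLoss(nn.Module):
    """Error 770 between the first result 750 and the second result 760 (sketch)."""

    def __init__(self):
        super().__init__()
        # The VGG network 710 stays fixed; only the U-Net 700 is trained.
        self.vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)

    def forward(self, prediction, ground_truth):
        # Pretrained VGG expects 3-channel input; replicate grayscale images.
        if prediction.shape[1] == 1:
            prediction = prediction.repeat(1, 3, 1, 1)
            ground_truth = ground_truth.repeat(1, 3, 1, 1)
        return F.l1_loss(self.vgg(prediction), self.vgg(ground_truth))
```

Minimizing this loss drives the U-Net output toward the ground truth in feature space rather than in pixel space alone.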
The auto encoder 810 of the embodiment uses diffusion-weighted imaging (DWI) images among the MRI images as learning data. When training of the auto encoder 810 is completed, features of the middle layer of the auto encoder 810 network are input to the expanding path 804 of the U-Net 800 (820). That is, features are extracted from the middlemost (bottleneck) layer of the auto encoder 810 and input to the expanding path 804 of the U-Net 800.
The data format of the features extracted from the auto encoder 810 may differ from the data format of the features received by the expanding path 804 of the U-Net 800, and thus the features extracted from the auto encoder 810 need to be transformed to fit the data format of the U-Net 800. For example, when the features extracted from the middle part of the auto encoder 810 are in the form of a 9*16 array and the features at the input end of the expanding path 804 of the U-Net 800 are in the form of a 12*12 array, the size of the features extracted from the auto encoder 810 is converted from 9*16 to 12*12.
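A sketch of this shape adaptation follows, assuming the features are 2D maps in a batch × channel × height × width layout; bilinear resampling is only one plausible choice of transform.

```python
import torch
import torch.nn.functional as F

def adapt_features(autoencoder_features, target_hw=(12, 12)):
    """Resize bottleneck features of the auto encoder 810 (e.g., 9*16)
    to the spatial size expected by the expanding path 804 (e.g., 12*12)."""
    return F.interpolate(autoencoder_features, size=target_hw,
                         mode="bilinear", align_corners=False)

# Example: a 9*16 feature map becomes 12*12 before entering the U-Net 800.
features = torch.randn(1, 64, 9, 16)
print(adapt_features(features).shape)  # torch.Size([1, 64, 12, 12])
```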
The U-Net 800 performs learning with the features of the auto encoder 810 added to the expanding path 804, thereby also learning the contrast features included in the DWI image, and thus a more accurate contrast image may be generated.
As an embodiment, the learning data input to the U-Net 800 may include a dataset of a plurality of non-contrast learning images and one contrast learning image, as described above.
The deep learning network for converting the non-contrast image into the contrast image may be generated and trained in various ways. As an embodiment, the image conversion device 100 may select a plurality of contrast images captured after administering a contrast medium as the contrast learning images, generate an MIP image composed of the pixels having the greatest brightness value among the pixels of the plurality of contrast learning images, and train the deep learning network by using learning data including a non-contrast learning image and the MIP image. Examples in this regard are described above.
As an embodiment, the image conversion device 100 may implement the deep learning network as a U-Net. The image conversion device 100 may train the U-Net by using learning data including a dataset of a plurality of non-contrast learning images and one contrast learning image. An example in this regard is described above.
The training unit 1000 trains the deep learning network 1030. The training unit 1000 may train the deep learning network 1030 with learning data including one or more contrast learning images and one or more non-contrast learning images. As an embodiment, in order to improve the performance of the deep learning network 1030, the training unit 1000 may use the MIP image described above.
The input unit 1010 inputs a non-contrast image to the deep learning network 1030. For example, the non-contrast image may be in the digital imaging and communications in medicine (DICOM) data format.
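A sketch of reading such an input is given below, assuming the pydicom library; the normalization to [0, 1] is an illustrative preprocessing choice not specified by the disclosure.

```python
import numpy as np
import pydicom

def load_noncontrast_image(path):
    """Read a DICOM file and scale its pixel data to [0, 1] before it
    is passed to the deep learning network 1030 (sketch)."""
    dataset = pydicom.dcmread(path)
    pixels = dataset.pixel_array.astype(np.float32)
    pixels -= pixels.min()
    if pixels.max() > 0:
        pixels /= pixels.max()
    return pixels
```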
The conversion unit 1020 converts the non-contrast image into a contrast image by using the deep learning network 1030 and outputs the contrast image.
The disclosure may also be implemented as computer-readable program code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. In addition, the computer-readable recording medium may be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.
According to an embodiment, a non-contrast image captured without administration of a contrast medium may be converted into a contrast image through a deep learning network. According to an embodiment, the performance of the deep learning network may be improved by using, as the learning data, an MIP image generated by processing a plurality of contrast images instead of the contrast images as they are. As an embodiment, the performance of the deep learning network may be improved by implementing the deep learning network as a U-Net and using a VGG network or an auto encoder in training the U-Net.
It should be understood that embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other embodiments. While one or more embodiments have been described with reference to the figures, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the following claims.