METHOD FOR COMPUTED TOMOGRAPHY IMAGING AND RECONSTRUCTION BASED ON LEARNING

Information

  • Patent Application
  • 20250014237
  • Publication Number
    20250014237
  • Date Filed
    July 19, 2024
  • Date Published
    January 09, 2025
Abstract
A method for computed tomography imaging and reconstruction based on learning, which measures the scene density distribution in an illumination multiplexing manner. The light source(s) for imaging emit(s) light according to the intensity obtained by pre-learning, and the light from different directions is absorbed and attenuated by the scene and reaches a sensor. The measured values are calculated and reconstructed to obtain the density information of the scene. The illumination intensity and reconstruction algorithm are learned by a neural network. In this method, the CT imaging process is modeled as a linear fully connected layer, and the weight corresponds to the illumination intensity of the light source for imaging; the reconstruction algorithm is modeled as a nonlinear neural network, which can be optimized according to the characteristics of scanning geometry.
Description
TECHNICAL FIELD

The present disclosure relates to the technical field of Computed Tomography (CT), and in particular, to an imaging and reconstruction method.


BACKGROUND

Computed tomography (CT) is an important research direction in the fields of medical imaging, computer vision, and graphics. This technology reconstructs the internal structure of an object by measuring the amount of light absorbed by the scene in different directions. It has a wide range of application scenarios, such as medical imaging, industrial monitoring, aviation security inspection, and cultural relics protection.


In particular, it is of great scientific and application value to apply tomography technology to dynamic scenes. For example, mechanical inspection and medical diagnosis both require three-dimensional reconstruction of high-speed dynamic scenes. However, extending traditional CT to dynamic scenes faces a critical challenge: since high-quality reconstruction results are usually based on dense sampling from different directions, when the scene changes rapidly, this dense sampling must be completed in a short time to avoid afterimage (motion) artifacts in the reconstruction and to ensure the reconstruction quality. Dynamic-scene-oriented CT therefore requires a sampling ability far exceeding that of traditional methods.


Over the past decades, various studies have proposed different algorithms for CT acquisition and reconstruction of dynamic scenes. For a specific dynamic phenomenon, the properties of the observed scene can be exploited in the solution. For example, Chen et al. proposed a scanning reconstruction algorithm (Chen Guang-Hong, Theriault-Lauzier Pascal, Tang Jie, Nett Brian, Leng Shuai, Zambelli Joseph, Qi Zhihua, Bevins Nicholas, Raval Amish, Reeder Scott. 2011. Time-resolved interventional cardiac C-arm cone-beam CT: An application of the PICCS algorithm. IEEE Transactions on Medical Imaging. 31, 4, 907-923). However, such methods are limited to specific scene characteristics and lack universality. Some studies aim to improve sampling speed by reducing the number of point light source samples and making strong prior assumptions for reconstruction. However, the dependence on these assumptions limits the applicability of the methods to certain scenarios (Zang Guangming, Idoughi Ramzi, Wang Congli, Bennett Anthony, Du Jianguo, Skeen Scott, Roberts William L., Wonka Peter, Heidrich Wolfgang. 2020. TomoFluid: reconstructing dynamic fluid from sparse view videos. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1870-1879). Therefore, it is urgent to propose a computed tomography acquisition and reconstruction algorithm for general dynamic scenes.


SUMMARY

The present disclosure aims to solve the problems that existing computed tomography scanning methods have insufficient acquisition ability in dynamic scenes and require strong prior assumptions for reconstruction, and provides a method that greatly improves the acquisition ability and is oriented to general scenes.


In order to achieve the above object, the present disclosure provides a method for computed tomography imaging and reconstruction based on learning, which measures the scene density distribution in an illumination multiplexing manner, including the following steps:

    • (1) Generating training data, including: acquiring parameters of a scanning device, including spatial positions of light sources and sensors, and acquiring readings of all sensors by iteratively turning on each light source at maximum intensity in an empty scene; and generating, by using the parameters, computed tomography (CT) images produced when a single light source emits light that passes through a scene and reaches the sensors, as the training data.
    • (2) Training a neural network according to the training data generated in the step (1). The neural network has the following characteristics:
    • a, the CT images of all light sources of the scanning device are taken as an input;
    • b, a corresponding density field is taken as an output.
    • c, a first layer of the neural network is a linear fully connected layer, and a parameter matrix of the linear fully connected layer is obtained by the following formula:







$$W_l = f_w(W_{raw})$$





where Wraw represents a parameter to be trained; Wl corresponds to an illumination intensity matrix during imaging, with a size of k×ns, ns represents a number of light sources of the scanning device, and k represents a number of samples; and fw represents a mapping, and is configured for transforming Wraw, so that the generated illumination intensity matrix Wl corresponds to a possible illumination intensity of the light sources;

    • d, a second layer and subsequent layers are nonlinear neural networks, and the output of a last layer is a density field reconstruction result; and after training is completed, the illumination intensity matrix Wl of the first linear fully connected layer is extracted.
    • (3) emitting, by the light sources of the scanning device, light line by line according to the illumination intensity matrix extracted in the step (2), irradiating a target scene in turn, and obtaining, by the sensors, a measurement value matrix M with a size of k×nd, where nd represents a number of the sensors; calculating a reconstructed density field by taking M as an output of the first linear fully connected layer of the neural network; measuring, based on the pre-learned illumination intensity matrix and in an illumination multiplexing manner, a scene density distribution; reconstructing, based on the acquired measurements, the scene density distribution for dynamic scenes; and generating, based on the scene density distribution, CT images for dynamic scenes.


Further, in the step (1), the method of generating the CT images further includes: randomly placing several objects with different densities in an effective area of the scene, and generating the CT images based on a selected ray model according to positions of the light sources and sensors obtained by calibration.


Further, the ray model is a linear absorption model, with an equation as follows:






$$I = e^{-K \times_3 x} \odot \tilde{I}$$






where I represents a matrix composed of the CT images of different light sources, with a size of ns×nd; an element Iij in I represents a reading of the jth sensor when the ith light source emits light at maximum intensity in a given density field; x represents a vector obtained after the density field is discretized into voxels, with a length equal to the number of voxels nv; K represents the Radon transform represented by a third-order tensor; ×3 represents the mode-3 product of the tensor and the vector; ⊙ represents an element-by-element multiplication between matrices; and Ĩ represents I when the density field is 0 everywhere.
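As an illustration only, a minimal NumPy sketch of this linear absorption model is given below. It assumes that the tensor K is stored with shape (ns, nd, nv), so that the mode-3 product with x is a contraction over the voxel axis; the function and variable names are hypothetical and not part of the disclosure.

```python
import numpy as np

def linear_absorption_model(K, x, I_empty):
    """Compute I = exp(-K x_3 x) (.) I~ for one density field.

    K       : Radon transform as a third-order tensor, shape (ns, nd, nv)
    x       : discretized density field, shape (nv,)
    I_empty : empty-scene readings I~, shape (ns, nd)
    """
    line_integrals = np.tensordot(K, x, axes=([2], [0]))  # mode-3 product over the voxel axis
    return np.exp(-line_integrals) * I_empty              # element-by-element multiplication
```

Each entry of the returned matrix is the reading of the jth sensor when only the ith light source is on at maximum intensity.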


Further, in the step (2), a relationship between the linear fully connected layer and the input is as follows:






$$M = W_l I$$





where I represents a matrix composed of the CT images of different light sources, with a size of ns×nd.


Further, in the step (2), the neural network for reconstruction is expressed as follows:







$$x_{nn} = f_{recon}(D_{nn}) = f_{recon}(f_{nn}(M))$$






where M is mapped into a sinogram Dnn by fnn, and a density field reconstruction result xnn is obtained by using a computed tomography reconstruction method frecon.


Further, the computed tomography reconstruction method is realized by using a filtered back projection method.


Further, in the step (2), a loss function ℒ used for training is expressed as follows:







$$\mathcal{L} = \lambda_r \mathcal{L}_r + \lambda_p \mathcal{L}_p$$

$$\mathcal{L}_p = g_w(W_l)$$





where ℒr is used to evaluate a density field reconstruction quality, ℒp is used to allow the illumination intensity to have a specific property, gw represents a function adopted for evaluating the property of the illumination intensity, and λr and λp are used to balance the weights between different loss functions.


Further, the loss function ℒr for evaluating the density field reconstruction quality can be expressed as:








$$\mathcal{L}_r = \lVert x_{nn} - x \rVert_2$$





Further, the function gw actually adopted for evaluating the property of the illumination intensity can be designed according to different scenes; the following two design methods are given, but the design is not limited thereto:


I, for a dynamic scene requiring high-speed scanning, gw(Wl)=−Σ|Wl|, so that the values of Wl tend to be binary.


II, for a scene requiring low-dose scanning, gw(Wl)=Σ|Wl|, so that the value of Wl tends to be minimized.
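As an illustration, the two design choices above can be written as simple penalty functions; this is a minimal sketch assuming the illumination intensity matrix is a NumPy array, and the function names are hypothetical.

```python
import numpy as np

def g_w_high_speed(W_l):
    # Design I: reward large intensities (high-speed scanning); entries of W_l are pushed toward their bounds
    return -np.sum(np.abs(W_l))

def g_w_low_dose(W_l):
    # Design II: penalize total intensity (low-dose scanning); entries of W_l are pushed toward zero
    return np.sum(np.abs(W_l))
```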


The present disclosure has the beneficial effects that a computed tomography imaging and reconstruction method based on illumination multiplexing is obtained through learning. Compared with traditional computed tomography methods for dynamic scenes, which are mainly designed for setups using point light sources, the method according to embodiments of the present disclosure can sample the whole ray space, thereby greatly improving the acquisition efficiency. With the method implemented by a processor, the density field can be reconstructed without strong prior assumptions. Therefore, the method is suitable for general scenes. Compared with traditional illumination multiplexing imaging methods, in the method according to embodiments of the present disclosure the processor can use a neural network to learn the illumination intensity and the reconstruction algorithm, thereby greatly reducing the required number of inputs and achieving high-quality reconstruction. The proposed computed tomography method can acquire and reconstruct dynamic scenes, which cannot be achieved by traditional methods under the same hardware conditions. In addition, this method is not limited to a specific scanning device, and thus has great general applicability.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a schematic flow chart of illumination multiplexing imaging reconstruction based on learning;



FIG. 2 is a side view of a scanning device used in an embodiment;



FIG. 3 is a top view and a visual layout of an acquisition module of the scanning device used in an embodiment;



FIG. 4 is a CT image of a specific light source generated as training data in an embodiment;



FIG. 5 is a CT image of a specific light source in an empty scene in an embodiment; and



FIG. 6 is a schematic diagram of the network structure in an embodiment.





DESCRIPTION OF EMBODIMENTS

In order to make the purpose, technical solution, and advantages of the present disclosure clearer, the present disclosure will be further described in detail below with reference to the attached drawings and specific embodiments.


The present disclosure provides a method for computed tomography imaging and reconstruction based on learning, which measures the scene density distribution in an illumination multiplexing manner. The steps are shown in FIG. 1, and are explained in detail as follows:

    • (1) Training data generation: parameters of a scanning device are acquired, together with readings of all sensors obtained by iteratively turning on each light source at maximum intensity in an empty scene. The parameters include spatial positions of the light sources and sensors. These parameters are used to generate, as the training data, the CT images produced when a single light source emits light that passes through a scene and reaches the sensors.


In an embodiment, the scanning device includes at least two light sources and at least one sensor, each sensor can receive light from a plurality of light sources at the same time, and the intensities of all light sources are adjustable; the target scene lies in part or all of the light paths formed between the light sources and the sensors.


In one embodiment, the scene is detected by visible light, and FIG. 2 shows a side view of the scanning device used. Optical fibers transmit the light emitted by an LED array to an acquisition module, where it serves as the light sources. After receiving the light, a receiver part of the acquisition module further uses optical fibers to transmit the light to the sensor array, and a camera above takes photos. FIG. 3 shows the top view and visual layout of the acquisition modules. In this embodiment, 24 acquisition modules are connected in a ring in turn, numbered #0, #1, . . . , #23, and each acquisition module has 10 rows and 16 columns, with odd columns as light sources and even columns as sensors. The size of the effective acquisition area is 32 mm×128 mm×128 mm, and the corresponding voxel resolution is 32×128×128. FIG. 4 and FIG. 5 are CT images generated by using a single light source in the scanning device when objects are present and when the scene is empty, respectively.
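For concreteness, the counts implied by this layout can be tallied as follows; this is a small illustrative sketch that only restates the geometry of this embodiment, and the variable names are ours, not the disclosure's.

```python
n_modules = 24            # acquisition modules #0 ... #23 connected in a ring
rows, cols = 10, 16       # each module has 10 rows and 16 columns
src_cols = cols // 2      # odd columns are light sources
det_cols = cols // 2      # even columns are sensors

ns = n_modules * rows * src_cols   # 1920 light sources
nd = n_modules * rows * det_cols   # 1920 sensors
nv = 32 * 128 * 128                # 524288 voxels (resolution 32 x 128 x 128)
print(ns, nd, nv)
```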


Different ray models can be selected to generate CT images. In this embodiment, a linear absorption model is selected, but it is not limited to this. The formula of the linear absorption model is as follows:






$$I = e^{-K \times_3 x} \odot \tilde{I}$$






where I is a matrix composed of the CT images of different light sources, with a size of ns×nd; an element Iij in I is a reading of the jth sensor when the ith light source emits light at maximum intensity in a given density field; x is a vector obtained after the density field is discretized into voxels, with a length equal to the number of voxels nv; K is the Radon transform represented by a third-order tensor; ×3 is the mode-3 product of the tensor and the vector; ⊙ is an element-by-element multiplication between matrices; and Ĩ is I when the density field is 0 everywhere. The element Ĩij in Ĩ is a reading of the jth sensor when the ith light source emits light at maximum intensity in the empty scene.
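A minimal sketch of the training-data generation is given below. The disclosure only states that several objects with different densities are placed randomly in the effective area; the assumption that the objects are axis-aligned boxes, as well as all names, is ours and purely illustrative.

```python
import numpy as np

def random_density_field(shape=(32, 128, 128), n_objects=5, rng=None):
    """Randomly place several box-shaped objects with different densities (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(shape)
    for _ in range(n_objects):
        size = rng.integers(4, 24, size=3)                               # random box size per axis
        corner = [rng.integers(0, s - d + 1) for s, d in zip(shape, size)]
        region = tuple(slice(c, c + d) for c, d in zip(corner, size))
        x[region] = rng.uniform(0.1, 1.0)                                # one random density per object
    return x.reshape(-1)                                                 # voxel vector of length nv

# One training sample: with the calibrated K and empty-scene readings I_empty,
# I = np.exp(-np.tensordot(K, random_density_field(), axes=([2], [0]))) * I_empty
```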

    • (2) The neural network is trained according to the training data generated in step (1). The neural network used in this embodiment is shown in FIG. 6, but it is not limited thereto, and the black arrow in the figure indicates the convolution layer unless otherwise specified. The specific description of the neural network used is as follows:
    • a, the CT images of all light sources of the scanning device are taken as an input.
    • b, a corresponding density field is taken as an output.
    • c, a first layer of the neural network is a linear fully connected layer, and a parameter matrix of the linear fully connected layer is obtained by the following equation:







$$W_l = f_w(W_{raw})$$





where Wraw is a parameter to be trained; Wl corresponds to an illumination intensity matrix during imaging, with a size of k×ns, ns is a number of light sources of the scanning device, and k is a number of samples; fw is a mapping, which is used for transforming Wraw, so that the generated illumination intensity matrix Wl can correspond to a possible illumination intensity of the light sources; in this embodiment, a Sigmoid function is selected for fw to transform Wraw to [0,1], so that the elements of Wl have practical physical significance; and the relationship between the linear fully connected layer and the input is as follows:






$$M = W_l I$$





where M represents a measured value matrix of the sensors; a code sketch of this first layer is given after item d below.

    • d, a second layer and subsequent layers are nonlinear neural networks, and the output of a last layer is a density field reconstruction result. After training is completed, the illumination intensity matrix Wl of the first linear fully connected layer is extracted.
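The first linear fully connected layer with the Sigmoid mapping fw can be written as the following minimal PyTorch-style sketch. PyTorch is assumed, the class and variable names are hypothetical, and the subsequent nonlinear layers are left out; this is not the disclosure's implementation.

```python
import torch
import torch.nn as nn

class MultiplexingLayer(nn.Module):
    """Linear fully connected first layer whose weights are valid illumination intensities."""

    def __init__(self, k, ns):
        super().__init__()
        self.W_raw = nn.Parameter(torch.randn(k, ns))  # W_raw: the parameters actually trained

    @property
    def W_l(self):
        # f_w = Sigmoid maps W_raw into [0, 1], so every entry is a feasible intensity
        return torch.sigmoid(self.W_raw)

    def forward(self, I):
        # I: CT images of all light sources, shape (ns, nd); returns M = W_l I, shape (k, nd)
        return self.W_l @ I

# After training, the learned illumination intensity matrix is extracted as, e.g.,
# intensities = layer.W_l.detach().cpu().numpy()
```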


In an embodiment, the neural network used for reconstruction is expressed as follows:







$$x_{nn} = f_{recon}(D_{nn}) = f_{recon}(f_{nn}(M))$$






where M is mapped into a sinogram Dnn by fnn, and a density field reconstruction result xnn is then obtained by using a computed tomography reconstruction method frecon.


In this embodiment, the measured value matrix is divided into 24 sub-matrices according to the acquisition modules, and the sub-matrix of the acquisition module with serial number Gi is shifted to the left by Gi×8 pixels along its rows, so that all sub-matrices share the same parameterized form and the rotation invariance of the scanning geometry in this embodiment can be exploited. The shifted measurement matrix is processed by 24 decoding networks with shared parameters to obtain the sinogram of each acquisition module; after each module's shift is moved back to its original position, the results are assembled into the reconstructed sinogram. Finally, the density field is reconstructed by the differentiable 3D-FBPNet. It should be noted that the network structure shown here is the technical solution of one embodiment of the present disclosure, not a limitation.
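The per-module shift and its inverse can be sketched as below. This is an assumption-laden illustration: we assume the 24 sub-matrices are kept as a list of tensors whose last dimension is the pixel axis and that the shift is circular (so it can be undone), which the disclosure does not state explicitly; the function names are hypothetical.

```python
import torch

def align_modules(M_sub, shift_per_module=8):
    """Shift the sub-matrix of module G_i left by G_i * 8 pixels so that all modules
    share one parameterized decoder (exploits the rotational symmetry of the ring)."""
    return [torch.roll(m, shifts=-i * shift_per_module, dims=-1) for i, m in enumerate(M_sub)]

def restore_modules(D_sub, shift_per_module=8):
    """Move each per-module sinogram back to its original position before assembly."""
    return [torch.roll(d, shifts=i * shift_per_module, dims=-1) for i, d in enumerate(D_sub)]
```

The shared-parameter decoding network is then applied to each aligned sub-matrix, and the restored sinograms are assembled before the differentiable filtered back projection step.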







The loss function ℒ for training can be expressed as follows:

$$\mathcal{L} = \lambda_r \mathcal{L}_r + \lambda_p \mathcal{L}_p$$

$$\mathcal{L}_r = \lVert x_{nn} - x \rVert_2$$

$$\mathcal{L}_p = g_w(W_l)$$





where ℒr is used to evaluate a density field reconstruction quality, ℒp is used to allow the illumination intensity to have a specific property, gw represents a function adopted for evaluating the property of the illumination intensity, and λr and λp are used to balance the weights between different loss functions. λr and λp are preferably 1.0 and 1e−5.
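A minimal PyTorch sketch of this loss is given below; gw is passed in as a callable so either design from the summary can be used, and the names are illustrative rather than the disclosure's.

```python
import torch

def training_loss(x_nn, x_gt, W_l, g_w, lambda_r=1.0, lambda_p=1e-5):
    loss_r = torch.norm(x_nn - x_gt, p=2)  # reconstruction quality term L_r
    loss_p = g_w(W_l)                      # illumination property term L_p = g_w(W_l)
    return lambda_r * loss_r + lambda_p * loss_p

# For high-speed scanning (as below): g_w = lambda W: -W.abs().sum()
```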


In order to improve the scanning speed, the illumination intensity should be encouraged to increase, therefore gw is preferably:








$$g_w(W_l) = -\sum \lvert W_l \rvert$$










    • (3) The light sources of the scanning device emit light line by line according to the illumination intensity matrix extracted in step (2). The target scene is irradiated in turn, and the measured value matrix M, with a size of k×nd (nd being the number of sensors), is obtained through the sensors. The reconstructed density field is calculated by taking M as the output of the first linear fully connected layer of the neural network.
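A sketch of this acquisition-and-reconstruction step follows. It is illustrative only: emit_row stands in for the hardware interface that drives the light sources and returns sensor readings, and f_nn and f_recon are placeholders for the trained layers after the first linear layer; none of these names come from the disclosure.

```python
import torch

def scan_and_reconstruct(W_l, emit_row, f_nn, f_recon):
    """Emit light line by line according to W_l, collect readings, and reconstruct.

    W_l      : learned illumination intensity matrix, shape (k, ns)
    emit_row : callable driving the light sources with one intensity row and returning
               the nd sensor readings (a stand-in for the hardware interface)
    f_nn     : trained nonlinear layers mapping measurements to a sinogram D_nn
    f_recon  : computed tomography reconstruction, e.g. differentiable filtered back projection
    """
    M = torch.stack([emit_row(w) for w in W_l])  # measured value matrix, shape (k, nd)
    D_nn = f_nn(M)                               # M is treated as the output of the first linear layer
    return f_recon(D_nn)                         # reconstructed density field
```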





Corresponding to the embodiment of the method for computed tomography imaging and reconstruction based on learning, the present disclosure further provides an embodiment of a learning-based Computed Tomography imaging and reconstruction apparatus.


The learning-based Computed Tomography imaging and reconstruction apparatus provided by the embodiment of the present disclosure comprises a memory and one or more processors. The memory stores executable codes, and when the executable codes are executed by the processors, the method for computed tomography imaging and reconstruction based on learning in the embodiment is implemented.


The embodiment of the learning-based Computed Tomography imaging and reconstruction apparatus of the present disclosure can be applied to any device with data processing capability, such as a computer. The embodiment of the apparatus can be realized by software, or by hardware, or by a combination of hardware and software. Taking software implementation as an example, as a logical device, the apparatus is formed by the processor of the device with data processing capability reading the corresponding computer program instructions from the non-volatile memory into the memory. At the hardware level, in addition to the processor, memory, network interface, and non-volatile memory, any device with data processing capability where the apparatus in the embodiment is located may further include other hardware according to the actual function of the device, which will not be described here again.


The embodiment of the present disclosure further provides a computer-readable storage medium, on which a program is stored, which, when executed by a processor, implements the method for computed tomography imaging and reconstruction based on learning in the above embodiment.


The computer-readable storage medium can be an internal storage unit of any device with data processing capability as described in any of the previous embodiments, such as a hard disk or a memory. The computer-readable storage medium can further be an external storage device of any device with data processing capability, such as a plug-in hard disk, Smart Media Card (SMC), SD card, Flash Card and the like provided on the device. Further, the computer-readable storage medium can further include both internal storage units and external storage devices of any device with data processing capability. The computer-readable storage medium is used for storing the computer program and other programs and data required by any device with data processing capability, and can further be used for temporarily storing data that has been output or will be output. The above is only the preferred embodiment of one or more embodiments of this specification, and it is not intended to limit one or more embodiments of this specification. Any modification, equivalent substitution, improvement and the like made within the spirit and principle of one or more embodiments of this specification shall be included in the protection scope of one or more embodiments of this specification.

Claims
  • 1. A method for computed tomography imaging and reconstruction based on learning, wherein the method is implemented by at least one processor, instructions stored in a memory cause the at least one processor to perform the method, and the method comprises the following steps: step (1) generating training data, comprising: acquiring parameters of a scanning device, wherein the parameters comprise spatial positions of light sources and sensors, and readings of all sensors by iteratively turning on all light sources with a maximum intensity in an empty scene; and generating, by using the parameters, computed tomography (CT) images produced when a single light source emits light that passes through a scene and reaches the sensors, as the training data; step (2) training a neural network according to the training data generated in the step (1), wherein the neural network has the following characteristics: a, the CT images of all the light sources of the scanning device are taken as an input; b, a corresponding density field is taken as an output; c, a first layer of the neural network is a linear fully connected layer, and a parameter matrix of the linear fully connected layer is obtained by a following equation:
  • 2. The method according to claim 1, wherein said generating the CT images in the step (1) further comprises: randomly placing several objects with different densities in an effective area of the scene, and generating the CT images based on a selected ray model according to positions of the light sources and sensors obtained by calibration.
  • 3. The method according to claim 2, wherein the ray model is a linear absorption model, with an equation as follows:
  • 4. The method according to claim 1, wherein in the step (2), a relationship between the linear fully connected layer and the input is expressed by a following equation:
  • 5. The method according to claim 1, wherein in the step (2), the neural network for reconstruction is expressed as follows:
  • 6. The method according to claim 5, wherein the computed tomography reconstruction method is realized by using a filtered back projection method.
  • 7. The method according to claim 1, wherein in the step (2), a loss function for training is expressed as follows:
  • 8. The method according to claim 7, wherein a loss function ℒr is calculated by ℒr=∥xnn−x∥2, where x represents a vector obtained after the density field is discretized into voxels, and xnn represents a density field reconstruction result.
  • 9. The method according to claim 7, wherein in a dynamic scene requiring high-speed scanning, gw(Wl)=−Σ|Wl|, so that a value of Wl tends to be binary.
  • 10. The method according to claim 7, wherein in a scene requiring low-dose scanning, gw(Wl)=Σ|Wl|, so that a value of Wl tends to be minimized.
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of International Application No. PCT/CN2023/102613, filed on Jun. 27, 2023, the contents of which are incorporated herein by reference in their entirety.

Continuations (1)
  • Parent: PCT/CN2023/102613, Jun 2023, WO
  • Child: 18777615, US