The present disclosure relates to a system and method for medical image analysis, and especially relates to a system and method for analyzing a medical image using a sequential learning model with uncertainty estimation.
Sequential machine learning models have become essential tools for modeling complex sequential correlations across different domains, including medical image analysis. Although these sequential learning models, especially state-of-the-art deep learning models such as recurrent neural networks (RNNs), can accurately convert one natural language into another or predict a sequence of physiological parameters at a plurality of anatomical positions, one of the challenges with such sequential learning models is explainability: they are often treated as black boxes and are hard to decipher.
Although increasing the complexity of learning models may improve their predictive ability for various complex problems in real practice, more complex models often mean less transparency/explainability. This raises great concerns about their real-world deployment, especially in fields with significant implications such as healthcare.
One way to avoid this problem is to use simpler and more explainable models such as Gaussian processes and linear regression. However, doing so significantly limits model performance.
Therefore, there is an unmet need to improve sequential learning models, especially those intended to model complex functions in the technical field of healthcare.
The present disclosure is provided to solve the above-mentioned problems existing in the prior art. The disclosed system and method analyze images, such as medical images, using an improved sequential learning model that provides not only a sequence of accurate predictions but also the corresponding uncertainty estimations. With the improved transparency and explainability of the sequential learning model, the disclosed system and method can improve the accuracy of the prediction process (e.g., for predicting physiological-related parameters) when analyzing a medical image.
According to a first aspect of the present disclosure, a method for predicting physiological-related parameters based on a medical image is provided. The method may include receiving a medical image acquired by an image acquisition device and predicting, by a processor, a sequence of physiological-related parameters at a sequence of positions while simultaneously estimating an uncertainty level of the predicted sequence of physiological-related parameters from the medical image by using a sequential learning model. The sequential learning model is trained to minimize a loss function associated with the uncertainty level.
According to a second aspect of the present disclosure, a system for predicting physiological-related parameters based on a medical image is provided. The system includes a communication interface configured to receive a medical image acquired by an image acquisition device. The system may also include a processor configured to predict a sequence of physiological-related parameters at a sequence of positions and simultaneously estimate an uncertainty level of the predicted sequence of physiological-related parameters from the medical image by using a sequential learning model. The sequential learning model is trained to minimize a loss function associated with the uncertainty level.
According to a third aspect of the present disclosure, a non-transitory computer storage medium having computer-executable instructions stored thereon is provided, wherein the computer-executable instructions, when executed by a processor, perform a method for predicting physiological-related parameters based on a medical image. The method may include receiving a medical image acquired by an image acquisition device and predicting a sequence of physiological-related parameters at a sequence of positions while simultaneously estimating an uncertainty level of the predicted sequence of physiological-related parameters from the medical image by using a sequential learning model. The sequential learning model is trained to minimize a loss function associated with the uncertainty level.
The disclosed systems and methods provide not only a sequence of accurate predictions but also the corresponding uncertainty estimations, simultaneously, by using sequential learning model(s), so as to improve the transparency and explainability of the sequential learning model and of the process of predicting physiological-related parameters from a medical image using such a model.
It should be understood that the foregoing general description and the following detailed description are only exemplary and illustrative and are not intended to limit the claimed invention.
In the drawings that are not necessarily drawn to scale, similar reference numerals may describe similar components in different views. Similar reference numerals with letter suffixes or different letter suffixes may indicate different examples of similar components. The drawings generally show various embodiments by way of example and not limitation, and together with the description and claims, are used to explain the disclosed embodiments. Such embodiments are illustrative and are not intended to be exhaustive or exclusive embodiments of the method, system, or non-transitory computer-readable medium having instructions for implementing the method thereon.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the drawings.
Consistent throughout this disclosure, a physiological-related parameter may be any form of parameter associated with a physiological status of a subject (e.g., a human or an animal). As an example, the physiological-related parameter may take the form of class labels, continuous variables (parameter measurements), segmentation masks, etc. In addition, the physiological status may be any one of a physiological functional status, a physiological anatomical status (e.g., belonging to which body site, which organ, which tissue), lesion (e.g., vessel plaque, myocardial bridge, hemangioma, etc.) or non-lesion, with or without foreign matter (such as an implant, stent, cannula, catheter, or guide wire), etc., or a combination thereof. As another example, the physiological-related parameter may be at least one vessel functional parameter out of blood pressure, blood velocity, blood flow rate, wall-surface shear stress, fractional flow reserve (FFR), index of microcirculatory resistance (IMR), and instantaneous wave-free ratio (iFR), and/or its change relative to an adjacent position, at one or more positions along the centerline of the vessel. Consistent throughout this disclosure, a sequence may include one or more elements and may be distributed in the temporal or spatial domain. As an example, a sequence of positions may include a single position, multiple positions distributed in a tree structure, or multiple positions at a sequence of timing points, etc.
At step 102, a sequential learning model may be applied to the medical image, e.g., to the extracted sequence of image patches or feature vectors at a sequence of centerline points, to predict a sequence of physiological-related parameters Y at a sequence of positions and meanwhile estimate an uncertainty level thereof. That is, apart from predicting the output sequence Y=(y1, y2, . . . , yT), its uncertainty level is also estimated. Furthermore, the sequential learning model may be a single sequential learning model that performs the dual functions: predicting Y and estimating its uncertainty level. For example, the single sequential learning model may have two branches of outputs, where one outputs the sequence of physiological-related parameters while the other outputs the uncertainty level thereof. The output sequence of physiological-related parameters may be a single prediction value, or multiple values in spatial/temporal sequences, tree structures, or other data structures. In some embodiments, the output sequence Y may include any one of class labels, continuous physiological parameters, or segmentation masks at the sequence of positions, or a combination thereof. For example, in the task of predicting physiological parameters for the vessel centerline points from cardiovascular images, the output sequence Y may include any one of vessel physiological-functional status, blood pressure, pressure drop, blood velocity, blood flow rate, wall shear stress, fractional flow reserve (FFR), FFR change between adjacent vessel centerline points, instantaneous wave-free ratio (iFR), and iFR change between adjacent vessel centerline points, or a combination thereof. In some embodiments, the uncertainty level may measure the uncertainty of the whole sequence, of partial segments, or of several locations/points in the sequence. In some embodiments, the uncertainty level may measure the uncertainty of just a single prediction. In some embodiments, the uncertainty level may take various forms, such as, but not limited to, the variance or quartiles of a distribution, a conditional probability, etc. In some embodiments, the sequential learning model may include RNN models such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks, as well as transformers.
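For illustration only, the following is a minimal sketch of a single sequential learning model with two output branches, written in Python/PyTorch. The class name, feature dimensions, and head structure are assumptions made for this sketch and are not mandated by the disclosure.

```python
import torch
import torch.nn as nn

class TwoBranchSequenceModel(nn.Module):
    """Sketch of a single sequential model with two output branches:
    one branch predicts the physiological-related parameter at each position,
    the other estimates the corresponding uncertainty level."""

    def __init__(self, feat_dim=64, hidden_dim=128):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.param_head = nn.Linear(2 * hidden_dim, 1)    # predicted parameter (e.g., FFR) per position
        self.uncert_head = nn.Linear(2 * hidden_dim, 1)   # uncertainty (e.g., log-variance) per position

    def forward(self, x):
        # x: (batch, T, feat_dim) features extracted at a sequence of centerline points
        h, _ = self.rnn(x)
        y = self.param_head(h).squeeze(-1)         # (batch, T) predicted sequence Y
        log_var = self.uncert_head(h).squeeze(-1)  # (batch, T) uncertainty level of Y
        return y, log_var
```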
With the automatic analysis method described above, both the sequence of predictions and the corresponding uncertainty level may be obtained from the medical image.
The uncertainty level may be extremely valuable for making reliable decisions and improving model performance. For example, the uncertainty level may be used to allocate limited human review resources efficiently, as described below.
Specifically, the human resources can be divided into a smaller group of human experts 204a and a larger group of additional human experts 204b (which contains more experts than the human experts 204a). Human experts 204a are allocated to evaluate easier cases that have high-confidence predictions, while additional experts 204b are allocated to hard cases that have low-confidence predictions. If the human experts, e.g., the human experts 204a or the additional experts 204b, confirm the prediction result generated by machine learning system 202, the final prediction 206 is generated. Note that the feedback 205 regarding the evaluation from the human experts 204 may be provided back to the machine learning system 202 to refine the predictions. This machine-human interaction may be repeated for several iterations to maximally leverage the power of machine learning system 202. Limited human resources for evaluating predictions are one of the bottlenecks for utilization of the machine learning system 202 in medical image analysis and computer-assisted diagnosis. By efficiently allocating human resources to evaluation based on the certainty level of the model-generated prediction result, the limited human resources may focus on the low-confidence predictions, so as to improve the accuracy of the final prediction and the efficiency of machine learning system 202.
The sequential learning model may be trained to estimate the uncertainty level simultaneously with the prediction. In some embodiments, the estimation of the uncertainty level adds a Bayesian aspect to the sequential prediction problem, which may be used to integrate prior knowledge in the inference stage for predicting the sequence of physiological-related parameters at a sequence of positions. The integration of uncertainty estimation in the single sequential learning model during the training stage improves the accuracy of the prediction and the transparency and explainability during the inference stage.
In some embodiments, the predicted sequence of physiological-related parameters and the estimated uncertainty level thereof may be displayed in association with each other (i.e., in an associated manner) on the display, so as to enable a user to make a further decision. Particularly, in view of the prediction result and its uncertainty level, the user may decide to directly approve the prediction result (e.g., if the uncertainty level is lower than a first threshold level), add evaluation comments to the prediction result and transfer it to his/her superior with a higher professional level for double check (e.g., if the uncertainty level is higher than the first threshold level but lower than a second threshold level), or discard the prediction result (e.g., if the uncertainty level is higher than the second threshold level).
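As a hypothetical illustration of this threshold-based decision flow (the function name and threshold values below are assumptions, not part of the disclosure):

```python
def triage_prediction(uncertainty, first_threshold=0.05, second_threshold=0.2):
    """Route a prediction based on its estimated uncertainty level (thresholds are illustrative)."""
    if uncertainty < first_threshold:
        return "approve"    # confident prediction: approve directly
    elif uncertainty < second_threshold:
        return "escalate"   # add evaluation comments and forward for double check
    else:
        return "discard"    # too uncertain: discard the prediction result
```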
There are various methods to model uncertainty estimation in the sequential learning model. In some embodiments, the uncertainty may be modeled by explicitly modeling the conditional probability. The uncertainty estimation takes two major sources of uncertainty into consideration. One major source originates from the input data X. This type of uncertainty may be due to noisy measurements or limited training data. As a result, the trained learning model may not be confident for unseen test data outside the scope of the distribution of the training data. Another major source originates from the model specification. For instance, a complex sequential learning model such as an RNN may easily overfit the noise in the training data. As a result, sequential learning models trained with different initializations may yield totally different prediction results. In some embodiments, sequential learning models may be provided separately for the prediction of physiological-related parameters and for the estimation of the uncertainty level thereof based on the common input. In some embodiments, a single sequential learning model may include two functional branches for (1) the prediction of physiological-related parameters and (2) the estimation of the uncertainty level thereof. Accordingly, two outputs are provided by the single sequential learning model: one for the prediction of physiological-related parameters and the other for the estimation of the uncertainty level thereof.
As an example, in case that the physiological-related parameters are class labels, the method further includes predicting a sequence of class labels together with a sequence of conditional probabilities at the sequence of positions by using the single sequential learning model. The uncertainty level of the predicted sequence of class labels may then be estimated based on the sequence of conditional probabilities at the sequence of positions.
In some embodiments, in case that the physiological-related parameters are continuous physiological parameters, such as FFR, the sequential learning model 402 may output both a sequence of mean physiological parameters and a sequence of variances at the sequence of positions.
The sequences of mean physiological parameters μ(y1, y2, . . . , yT) 404a and the variances σ2 (y1, y2, . . . , yT) 404b may be used to approximate and determine the conditional probability assuming a Gaussian probability distribution. Under this assumption, the predicted sequences of mean physiological parameters at the sequence of positions, μ(y1, y2, . . . , yT), may be output directly as the predicted sequence of physiological-related parameters; and the uncertainty level of the predicted sequence of physiological-related parameters may be determined based on the sequences of variances at the sequence of positions, σ2 (y1, y2, . . . , yT). During the testing stage, the sequential learning model 402 may yield larger variances for uncertain predictions, and vice versa. The uncertainties, together with the predictions, may then be used for further decision making.
In some embodiments, the sequential learning model 402 may be trained in consideration of the uncertainty level in the loss function.
For example, in case that the physiological-related parameters to be predicted are continuous physiological parameters, the loss function may include a squared L2-norm loss term L based on the sequence of variances at the sequence of positions, σ2(y1, y2, . . . , yT), and the divergence between the sequence of mean physiological parameters μ(y1, y2, . . . , yT) and the sequence of ground-truth physiological parameters Ŷ=(ŷ1, ŷ2, . . . , ŷT). Particularly, the loss term L may be defined by equation (1):
During the training stage, L may be minimized by either minimizing the difference between the prediction result Y and the ground truth Ŷ or maximizing the variance σ2(Y|X).
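As a rough illustration only (the exact form of equation (1) is not reproduced here), one common variance-weighted formulation of such a loss term might look like the sketch below; the added log-variance term is an assumption of this sketch, included as is common in practice to keep the predicted variance from growing without bound.

```python
import torch

def variance_weighted_loss(mu, log_var, y_true):
    """mu, log_var, y_true: tensors of shape (batch, T).
    Squared error between prediction and ground truth, down-weighted by the
    predicted variance; the log-variance term penalizes overly large variances."""
    var = torch.exp(log_var)
    return ((y_true - mu) ** 2 / var + log_var).mean()
```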
In case that the physiological-related parameters are class labels, the conditional probability may be directly used in the loss function to train the sequential learning model. In some embodiments, the prediction process may be treated as a K-class classification task, and the loss function may be defined to use the predicted sequence of conditional probabilities at the sequence of positions, so as to minimize the divergence between the conditional probability and the corresponding label. As an example, the loss function may be defined by equation (2):
L = −Σ_{t=1}^{T} Σ_{k=1}^{K} y_{t,k} log(p_{t,k}|X),  equation (2)
where yt,k represents the label for the kth class at the position t, and pt,k|X represents the conditional probability of the kth class at the position t given the input X. The loss function as defined by equation (2) is designed to minimize the divergence between the conditional probability pt,k|X and the label yt,k. During the testing stage, the conditional probability pt,k|X output by the sequential learning model may be used directly as the uncertainty estimate for the kth class.
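A minimal sketch of this classification setting is shown below, assuming per-position logits from the sequential model; the cross-entropy corresponds to the divergence minimized by equation (2), and the softmax probability of the predicted class is read out as the per-position certainty estimate. The function names and tensor shapes are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def sequence_classification_loss(logits, labels):
    """logits: (batch, T, K) per-position class scores; labels: (batch, T) class indices.
    Cross-entropy drives the conditional probability p_{t,k|X} toward the label y_{t,k}."""
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))

def per_position_confidence(logits):
    """Use the softmax probability of the predicted class at each position as the certainty estimate."""
    probs = F.softmax(logits, dim=-1)
    confidence, predicted_class = probs.max(dim=-1)
    return predicted_class, confidence
```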
In some embodiments, the conditional probability p(Y|X) may be implicitly modeled using latent variables Z, which may capture the uncertainties in the prediction process. The uncertainty regarding the prediction Y may then be approximated by sampling the latent variables Z. Specifically, the input sequence X may be passed through the sequential learning model K times, each time with a different sample of Z, to obtain K predictions y_t^{(1)}, . . . , y_t^{(K)} at each position t. The prediction variance may be calculated by equation (3):

Var(y_t) = (1/K) Σ_{k=1}^{K} (y_t^{(k)} − ȳ_t)^2,  equation (3)

where Var(y_t) represents the prediction variance of y_t, and the sample mean ȳ_t may be calculated by equation (4):

ȳ_t = (1/K) Σ_{k=1}^{K} y_t^{(k)},  equation (4)

Then, the sequence of sample means of the physiological-related parameters predicted K times, i.e., (ȳ_1, . . . , ȳ_T), may be determined as the predicted sequence of physiological-related parameters Y. Further, the sequence of variances of the physiological-related parameters predicted K times, i.e., (Var(y_1), . . . , Var(y_T)), may be determined, and the uncertainty level may be estimated based on the determined sequence of variances.
In some embodiments, the sequential learning model may be an RNN model, and the above sampling process may be implemented by applying dropout to the latent variables Z of the RNN.
In the testing stage, the variance of the prediction 604, Y=(y1, y2, . . . , yT), may be approximated by sampling Z. Specifically, for the input sequence X=(x1, x2, . . . , xT), K predictions may be obtained by randomly dropping out Z K times. As an example, the RNN unit 602 may be an LSTM or GRU unit. The output layers 603 may be a sequence of fully connected layers or other networks. In some embodiments, the crossed-out elements in the latent variables may be randomly selected and set to 0 during the inference stage.
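As a rough illustration of this sampling procedure (the model interface and the number of samples K are assumptions of this sketch), dropout is kept active at inference time, and the sample mean and variance over K stochastic forward passes give the prediction and its uncertainty:

```python
import torch

def mc_dropout_predict(model, x, k=20):
    """Approximate the predictive mean and variance of Y by K stochastic forward
    passes in which elements of the latent variables Z are randomly dropped."""
    model.train()  # keep dropout active during inference
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(k)], dim=0)  # (K, batch, T)
    mean_y = samples.mean(dim=0)                # sample mean over the K predictions
    var_y = samples.var(dim=0, unbiased=False)  # sample variance over the K predictions
    return mean_y, var_y
```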
In some embodiments, the parameter prediction and uncertainty estimation device 900c may be a dedicated computer or a general-purpose computer. For example, the parameter prediction and uncertainty estimation device 900c may be a computer customized for a hospital to perform image acquisition and image processing tasks, or it may be a server in the cloud.
The parameter prediction and uncertainty estimation device 900c may include at least one processor 903 configured to perform the method for predicting physiological-related parameters based on a medical image according to any embodiment of the present disclosure. The processor 903 may be configured to receive a medical image acquired by an image acquisition device 900b. The processor 903 may be further configured to predict a sequence of physiological-related parameters at a sequence of positions and simultaneously estimate an uncertainty level thereof from the medical image, by using a sequential learning model. The details of the method performed by the processor 903 are disclosed above and will not be repeated herein.
In some embodiments, the processor 903 may be a processing device including one or more general-purpose processing devices, such as a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), and so on. More specifically, the processor may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, or a processor running other instruction sets or a combination of instruction sets. The processor 903 may also be one or more dedicated processing devices, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), a system on chip (SoC), and so on.
The parameter prediction and uncertainty estimation device 900c may further include a storage 901, which may be configured to load or store the trained sequential learning model or an image prediction and estimation program according to any one or more embodiments of the present disclosure. The image prediction and estimation program, when executed by the processor 903, may perform the method for predicting physiological-related parameters based on a medical image according to any embodiment of the present disclosure.
The storage 901 may be a non-transitory computer-readable medium, such as a read-only memory (ROM), a random access memory (RAM), a phase-change random access memory (PRAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), an electrically erasable programmable read-only memory (EEPROM), other types of random access memory, a flash disk or other forms of flash memory, a cache, a register, a static memory, a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical storage, a cassette tape or other magnetic storage device, or any other possible non-transitory medium for storing information or instructions accessible by a computer device and the like. When executed by the processor 903, the instructions stored on the storage 901 can perform the method for predicting physiological-related parameters based on a medical image according to any embodiment of the present disclosure.
Although the model training device 900a and the parameter prediction and uncertainty estimation device 900c are shown as independent devices, in some embodiments they may be implemented on a single device, as discussed further below.
In some embodiments, the parameter prediction and uncertainty estimation device 900c may further include a memory 902 configured to load the sequential learning model according to any one or more embodiments of the present disclosure from, for example, the storage 901, or to temporarily store intermediate data generated during the prediction and estimation procedure performed using the sequential learning model. The processor 903 may be communicatively coupled to the memory 902 and configured to execute executable instructions stored thereon to execute the method for predicting physiological-related parameters based on a medical image according to any one of the embodiments of the present disclosure.
In some embodiments, the memory 902 may store intermediate information generated in the training stage or the inference stage, such as feature information, physiological-related parameters at individual positions, variances thereof, prediction results obtained for each sampling of Z, the learning model parameters for each sampling of Z, each loss term value generated while executing the computer program, and the like. In some embodiments, the memory 902 may store computer-executable instructions, such as one or more image processing programs. In some embodiments, the sequential learning model, as well as each portion, layer, neuron (such as the latent random variables Z), and element thereof, may be implemented as applications stored in the storage 901. These applications may be loaded into the memory 902 and then executed by the processor 903 to realize the corresponding processing.
In some embodiments, the memory 902 may be a non-transitory computer-readable medium for storing information or instructions that can be accessed and executed by computer equipment and the like, such as a read-only memory (ROM), a random access memory (RAM), a phase-change random access memory (PRAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), an electrically erasable programmable read-only memory (EEPROM), other types of random access memory, a flash disk or other forms of flash memory, a cache, a register, a static memory, or any other possible medium.
In some embodiments, the parameter prediction and uncertainty estimation device 900c may further include a communication interface 904 used for receiving the medical image acquired by the image acquisition device 900b. In some embodiments, the communication interface 904 may include any one of a network adapter, a cable connector, a serial connector, a USB connector, a parallel connector, a high-speed data transmission adapter (such as optical fiber, USB 3.0, Thunderbolt, etc.), a wireless network adapter (such as a WiFi adapter), a telecommunication (such as 3G, 4G/LTE, 5G, etc.) adapter, and so on.
The parameter prediction and uncertainty estimation device 900c may be connected to the model training device 900a, the image acquisition device 900b, and other components via the communication interface 904. In some embodiments, the communication interface 904 may be configured to receive the trained sequential learning model from the model training device 900a, and may also be configured to receive the medical image from the image acquisition device 900b.
In some embodiments, the image acquisition device 900b may include any one of general CT, general MRI, functional magnetic resonance imaging (such as fMRI, DCE-MRI, and diffusion MRI), cone-beam computed tomography (CBCT), positron emission tomography (PET), single photon emission computed tomography (SPECT), X-ray imaging, optical coherence tomography (OCT), fluorescence imaging, ultrasound imaging, radiation field imaging, and the like.
In some embodiments, the model training device 900a may be configured to train a sequential learning model, and send the trained sequential learning model to the parameter prediction and uncertainty estimation device 900c, to predict physiological-related parameters and simultaneously estimate the uncertainty level based on a medical image using the trained sequential learning model. In some embodiments, the model training device 900a and the parameter prediction and uncertainty estimation device 900c may be implemented by a single computer or processor.
In some embodiments, the model training device 900a may be implemented using hardware specially programmed by software that performs the training processing. For example, the model training device 900a may include a processor and a non-transitory computer-readable medium similar to those of the parameter prediction and uncertainty estimation device 900c. The processor implements the training by executing the executable instructions of the training process stored in the computer-readable medium. The model training device 900a may also include input and output interfaces to communicate with the training database, a network, and/or a user interface. The user interface may be used to select the set of training data, adjust one or more parameters of the training process, and select or modify a framework of the learning model.
Another aspect of the present disclosure is to provide a non-transitory computer-readable medium storing instructions thereon that, when executed, cause one or more processors to perform the disclosed methods. The computer-readable medium may include volatile or nonvolatile, magnetic, semiconductor-based, tape-based, optical, removable, non-removable, or other types of computer-readable media or computer-readable storage devices. For example, the computer-readable medium may be a storage device or a storage module in which computer instructions are stored, as disclosed. In some embodiments, the computer-readable medium may be a magnetic disk or a flash drive on which computer instructions are stored.
Those skilled in the art may make various modifications and changes to the disclosed method, device, and system. In view of the description and practice of the disclosed system and related methods, other embodiments will be apparent to those skilled in the art.
It is intended that the description and examples are to be regarded as exemplary only, with the true scope being indicated by the appended claims and their equivalents.
This application is based on and claims the benefit of priority of U.S. Provisional Application No. 63/134,172, filed on Jan. 5, 2021, which is incorporated herein by reference in its entirety.