This disclosure relates generally to medical imaging radiomics.
Medical imaging includes Computed Tomography (CT), Magnetic Resonance Imaging (MRI), x-rays, ultrasound, microscopy, and other techniques. Medical imaging may produce two-dimensional images or three-dimensional volumes constructed from multiple two-dimensional images. Such three-dimensional volumes may be sliced in any of a variety of ways to obtain two-dimensional images.
Radiomics, or medical imaging biomarkers, is an active area of research and development. Radiomics models have been applied in a wide range of diagnostics, classification tasks, and disease scoring. Radiomics can make radiology workflows more efficient, for example, by reducing errors. Further, radiomics can highlight important features and provide additional information in challenging diagnostic cases. Radiomics has been extensively investigated in oncology diagnostics and is finding applications in other diseases in a wide range of organ systems. Radiomics can significantly boost the utility of imaging studies by using quantitative image features to infer underlying tumor biology and predict patient outcomes. For lung imaging in particular, radiomics has demonstrated utility in cancer diagnosis, prognosis, and precision medicine, as well as in the evaluation of a wide range of lung diseases including COPD, asthma, and tuberculosis.
According to various embodiments, a method of radiomics standardization for patient scan data obtained by a particular imaging machine is presented. The method includes acquiring, using the particular imaging machine, the patient scan data; obtaining unstandardized radiomics for the patient scan data; recovering standardized radiomics for the patient scan data based on at least: the patient scan data, the unstandardized radiomics for the patient scan data, and calibration phantom data for the particular machine obtained using at least one calibration phantom; and outputting the standardized radiomics.
Various optional features of the above embodiments include the following. The particular imaging machine can include at least one of: an x-ray machine, a computed tomography machine, a magnetic resonance imaging machine, or an ultrasound machine. The patient scan data can include a two-dimensional slice of a three-dimensional volume constructed from raw patient scan data. The patient scan data can include raw patient scan data. The method can include: providing the patient scan data and the calibration phantom data to a trained image property predictor; and obtaining noise and resolution characteristics for the particular machine from the trained image property predictor; wherein the recovering the standardized radiomics comprises recovering the standardized radiomics based on the patient scan data, the unstandardized radiomics for the patient scan data, and the noise and resolution characteristics. The recovering the standardized radiomics can include: providing the patient scan data, the unstandardized radiomics for the patient scan data, and the calibration phantom data for the particular machine to a machine learning model trained using a training corpus comprising radiomics in association with example scan data and calibration phantom data, whereby the machine learning model provides the standardized radiomics. The recovering can include: deblurring an image corresponding to the patient scan data to produce a deblurred image; determining radiomics for the deblurred image; determining radiomics for noise of the deblurred image; and deconvolving the radiomics for the deblurred image with the radiomics for the noise of the deblurred image.
The recovering can include: passing an image corresponding to the patient scan data to a first machine learning model trained to deblur images to obtain a deblurred image; computing radiomics for the deblurred image; and passing the radiomics for the deblurred image to a second machine learning model trained to remove noise, whereby the standardized radiomics are obtained. The standardized radiomics can comprise a grey-level co-occurrence matrix. The outputting can include causing the standardized radiomics to be input to a radiomics model for clinical decision making.
According to various embodiments, a system for radiomics standardization for patient scan data obtained by a particular imaging machine is presented. The system includes at least one electronic processor that executes instructions to perform operations comprising: acquiring the patient scan data produced by the particular imaging machine; obtaining unstandardized radiomics for the patient scan data; recovering standardized radiomics for the patient scan data based on at least: the patient scan data, the unstandardized radiomics for the patient scan data, and calibration phantom data for the particular machine obtained using at least one calibration phantom; and outputting the standardized radiomics.
Various optional features of the above embodiments include the following. The particular imaging machine can include at least one of: an x-ray machine, a computed tomography machine, a magnetic resonance imaging machine, or an ultrasound machine. The patient scan data can include a two-dimensional slice of a three-dimensional volume constructed from raw patient scan data. The patient scan data can include raw patient scan data. The operations can further include: providing the patient scan data and the calibration phantom data to a trained image property predictor; and obtaining noise and resolution characteristics for the particular machine from the trained image property predictor; wherein the recovering the standardized radiomics comprises recovering the standardized radiomics based on the patient scan data, the unstandardized radiomics for the patient scan data, and the noise and resolution characteristics. The recovering the standardized radiomics can include: providing the patient scan data, the unstandardized radiomics for the patient scan data, and the calibration phantom data for the particular machine to a machine learning model trained using a training corpus comprising radiomics in association with example scan data and calibration phantom data, whereby the machine learning model provides the standardized radiomics. The recovering can include: deblurring an image corresponding to the patient scan data to produce a deblurred image; determining radiomics for the deblurred image; determining radiomics for noise of the deblurred image; and deconvolving the radiomics for the deblurred image with the radiomics for the noise of the deblurred image.
The recovering can include: passing an image corresponding to the patient scan data to a first machine learning model trained to deblur images to obtain a deblurred image; computing radiomics for the deblurred image; and passing the radiomics for the deblurred image to a second machine learning model trained to remove noise, whereby the standardized radiomics are obtained. The standardized radiomics can comprise a grey-level co-occurrence matrix. The outputting can include causing the standardized radiomics to be input to a radiomics model for clinical decision making.
Various features of the embodiments can be more fully appreciated, as the same become better understood with reference to the following detailed description of the embodiments when considered in connection with the accompanying figures, in which:
Reference will now be made in detail to example implementations, illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. In the following description, reference is made to the accompanying drawings that form a part thereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the invention. The following description is, therefore, merely exemplary.
Despite the potential of radiomics, it is widely acknowledged that a major challenge to clinical usage is the robustness and repeatability of radiomics models. Such concerns arise from variability in radiomics values from each step in the imaging chain including: (1) data collection from different imaging systems and protocols, (2) lack of standardization in image formation and processing, and (3) lack of standardization in radiomics computation and reporting of such models. The latter two can be resolved through a concerted effort in the research community and have in fact motivated several initiatives and guidelines to standardize definitions, methodologies, and reporting. The first, however, reflects a fundamental technical challenge, as radiomics values are inherently affected by the quality of the image data, which is in turn affected by acquisition techniques, reconstruction parameters, and scanner specifications.
Radiomics relies on medical image data, which contain not only variability due to noise but also differing biases (e.g., different spatial resolutions) induced by the use of hardware from different vendors, different acquisition protocols, and different data processing as part of image formation or post-processing. The problem of image-chain-based variability in radiomics features is common to all imaging modalities that have a large variety of scanners, acquisition protocols, and post-processing. Complex noise and bias characteristics have become particularly exaggerated with the increased use of sophisticated data processing schemes, e.g., sparse acquisitions and compressed sensing in MRI, model-based iterative methods in computed tomography and nuclear imaging, and machine learning methods in all modalities. Variability in image data is well known in x-ray computed tomography. Even with conventional linear processing (e.g., filtered backprojection), noise properties of the image are patient and protocol dependent, and non-stationary (e.g., varying contrast with kVp, increased noise with larger patients, etc.). With model-based iterative reconstruction and machine learning methods, image properties can also have significant dependencies in spatial resolution, often characterized as contrast-dependent and space-variant.
One school of thought is that the radiomics model itself can handle variability, that is, given sufficient training data of different varieties, the model can learn to handle the difference in image quality and radiomics features. However, in a real-world clinical setting, there are numerous scanner-specific, institution-specific, or even radiologist-specific imaging and reconstruction protocols. Curating sufficient training data for each case and retraining the model to handle such variability is not feasible.
Several studies have evaluated the effect of mixing imaging conditions on radiomics model performance with limited success, e.g., using smooth and sharp filtered backprojection kernels. The performance of the radiomics models was found to decrease significantly, and the authors have advised against mixing imaging conditions. Numerous studies have emerged that call attention to the dependence of radiomics feature variability on imaging conditions and highlight such variability as a major challenge to radiomics research.
Attempts to solve these issues tend to fall into one of three categories: (1) harmonizing acquisition and processing protocols, (2) identification of radiomics that are robust across different protocols and imaging conditions, and (3) normalization of radiomics to a standard imaging protocol. With a diversity of vendors and imaging hardware, the first has not been achieved, and the second limits the number of available features. Many normalization strategies (the third category) have been proposed but, thus far, it is unclear whether they can account for the full range of imaging systems and protocol diversity.
Some embodiments provide a solution by treating radiomics computation as an additional step in the imaging chain. Some embodiments utilize an end-to-end prediction framework that relates how each imaging parameter affects radiomics values. Some embodiments can not only predict radiomics from arbitrary imaging conditions, but also invert the model and normalize values to a standard protocol, thereby achieving robust and repeatable radiomics.
These and other features and advantages are shown and described in reference to the figures as set forth presently.
Usage of such end-to-end models permits embodiments to not only predict radiomics variability as a function of imaging condition, but also to standardize radiomics values to a common baseline via inversion of the models. Thus, predictive models are innovative in that they have direct application in the optimization and design of imaging protocols best suited for estimation of specific radiomics. For example, what is good for a radiomics model in terms of performance may be mismatched with what is good for general radiologists' performance. Inverted (that is, recovery) models provide a concrete and mathematically rigorous way to estimate the underlying tissue radiomics, thereby providing a common foundation across data from different sources (vendor, protocol, etc.). Thus, some embodiments utilize a novel paradigm for standardization in which the underlying radiomics themselves are estimated.
There are some distinct advantages in defining standardization as a radiomics estimation problem, as opposed to image standardization, including: (1) The parameter space for radiomics estimation is much smaller than that of the joint image denoising/blur deconvolution problem. This generally represents a better conditioned inversion. (2) Image correction methods focus on providing the underlying true image, whereas radiomics capture only specific features (e.g., textures of a certain scale and/or directionality). Thus, some embodiments do not include solving the more difficult problem of estimating the true image (e.g., deblurring/denoising in all directions) and instead focus on the problem of the radiomics themselves. (3) Treating the problem as a radiomics estimation problem permits modeling of the prior distribution of radiomics, e.g., known sparsity in a gray-level co-occurrence matrix, etc. Lastly, focusing on end-to-end radiomics estimation formalizes the quantitation problem. If there are sub-resolution “signatures” of specific disease processes (which cannot be seen visually), modeling and estimation of the underlying radiomics may demonstrate concretely where those signatures arise.
Embodiments may utilize models that are modular and general, and that can encompass combinations of hardware specifications, as well as both linear and nonlinear reconstruction and processing algorithms. The resolution and noise can either be derived from a fully analytic model based on known system parameters, or obtained from empirical measurement of one or more calibration phantoms. Moreover, the prediction framework may be inverted to use measured radiomics in the blurred and noisy data, $R(T,S)$, to obtain the ground truth radiomics values or radiomics values of a standardized imaging protocol, $R(T_0,S_0)$. Thus, some embodiments include explicit estimation of the underlying radiomics through rigorous modeling of system dependencies that lead to variability, e.g., in noise and resolution. Modeling such dependencies allows for aggregation from disparate data sources with improved radiomics quantitation.
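As a rough illustration of the prediction direction described above, the following Python sketch simulates a scan under one assumed imaging condition and computes a few histogram-based radiomics from it. It assumes a linear, shift-invariant system with a Gaussian modulation transfer function standing in for the resolution T and white Gaussian noise standing in for S. The function names (`gaussian_mtf`, `simulate_scan`, `histogram_radiomics`) and all parameter values are hypothetical and are not taken from this disclosure.

```python
import numpy as np


def gaussian_mtf(shape, sigma):
    """Assumed Gaussian MTF (resolution T) sampled on the DFT frequency grid."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))


def simulate_scan(mu, mtf, noise_std, rng):
    """Blur the object mu (circular convolution via FFT), then add white noise (S)."""
    blurred = np.real(np.fft.ifft2(np.fft.fft2(mu) * mtf))
    return blurred + rng.normal(0.0, noise_std, size=mu.shape)


def histogram_radiomics(image, bins=32, value_range=(0.0, 1.0)):
    """A few first-order (histogram-based) features; values outside the range are ignored."""
    hist, _ = np.histogram(image, bins=bins, range=value_range)
    p = hist / max(hist.sum(), 1)
    nonzero = p[p > 0]
    return {
        "mean": float(np.mean(image)),
        "variance": float(np.var(image)),
        "entropy": float(-(nonzero * np.log2(nonzero)).sum()),
    }


# Compare radiomics of a synthetic "true" object with those of its simulated scan.
rng = np.random.default_rng(0)
mu = rng.random((64, 64))
scan = simulate_scan(mu, gaussian_mtf(mu.shape, sigma=1.5), noise_std=0.05, rng=rng)
r_true = histogram_radiomics(mu)    # plays the role of R(T0, S0)
r_scan = histogram_radiomics(scan)  # plays the role of R(T, S)
```

Comparing `r_true` with `r_scan` over a sweep of `sigma` and `noise_std` gives a simple picture of how such features could vary with imaging condition under these assumptions.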
Imaging systems may be linear, weakly nonlinear, or highly nonlinear.
Examples of linear systems include standard filtered backprojection, with noise and resolution characterized by noise-power spectrum and modulation transfer function. Such systems can be considered linear and shift-invariant. Some radiomics computations are also linear (e.g., those based on linear decompositions, such as Fourier, Hadamard, or wavelet transforms, etc.). Computed tomography systems employing locally linearizable model-based reconstructions (like quadratically penalized-likelihood) may also be considered linear. Certain classes of radiomics (e.g., GLCM and histogram-based methods) can also be described by linear operations on either the underlying images or on the radiomics themselves.
Examples of weakly nonlinear systems include model-based iterative reconstruction, which can be modeled as locally linear.
Examples of highly nonlinear systems include those that utilize deep learning, e.g., for denoising.
In the following, embodiments are disclosed for use with each of these types of imaging systems. Note that the selection of the appropriate embodiment for a given situation may depend on prior knowledge of the imaging system and whether radiomics computations can be analytically modeled.
Models 202 and 204 are applicable where the imaging system is at least locally linear and noise is stationary. System blur may be shift invariant or involve simple shift variance as a result of focal spot blur. The image noise S and resolution T can be obtained from existing linear systems models given knowledge of the system parameters, or from phantom measurements of noise and resolution. The inputs for the prediction model 202 and the recovery model 204 may therefore include known system blur T (shift variant or invariant) and noise S (magnitude and correlation), as well as the (local) patient scan ($\hat{\mu}$) from which radiomics features (R) are calculated. According to some embodiments, raw scan data y may be substituted for the patient scan $\hat{\mu}$.
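For the phantom-measurement route, one common descriptor of stationary noise is the noise-power spectrum mentioned above. The sketch below estimates a two-dimensional noise-power spectrum from a stack of noise-only regions of interest, using the usual periodogram-averaging estimate. The function names, the region-of-interest layout, and the pixel sizes are assumptions made for illustration; the disclosure does not prescribe this particular estimator.

```python
import numpy as np


def noise_power_spectrum(noise_rois, pixel_size=(1.0, 1.0)):
    """Estimate a 2D NPS from noise-only ROIs with shape (n_roi, ny, nx)."""
    rois = np.asarray(noise_rois, dtype=float)
    _, ny, nx = rois.shape
    dy, dx = pixel_size
    # Remove each ROI's mean so the zero-frequency bin reflects noise only.
    detrended = rois - rois.mean(axis=(1, 2), keepdims=True)
    power = np.abs(np.fft.fft2(detrended, axes=(1, 2))) ** 2
    # Average the periodograms over ROIs and apply the usual normalization.
    return (dx * dy) / (nx * ny) * power.mean(axis=0)


def noise_variance_from_nps(nps, pixel_size=(1.0, 1.0)):
    """Integrate the NPS over frequency, which should give back the noise variance."""
    ny, nx = nps.shape
    dy, dx = pixel_size
    return nps.sum() / (nx * dx * ny * dy)
```

The integral check in `noise_variance_from_nps` is a convenient sanity test of the normalization: the result should match the average pixel variance of the mean-subtracted regions.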
For example, the Gray-Level Co-Occurrence Matrix (GLCM) is a category of radiomics metrics that can be modeled analytically. For linear imaging systems where the resolution and noise are known, the radiomics recovery procedure may be implemented as described below in reference to
The GLCM may be recovered from an original image using recovery model 322 as follows. The effect of blur on the GLCM is potentially complex and depends on the underlying image ($\hat{\mu}$). One can attempt to recover the original image using a deblurring operation to yield deblurred image 314. Note that this process is noise magnifying and results in the following form: $h^{-1} ** \hat{\mu} = \mu + h^{-1} ** n$, where $h$ denotes the system blur, $n$ the image noise, and $**$ two-dimensional convolution. Note further that deblurring increases noise in the deblurred image and broadens the GLCM 316 of the deblurred image 314. However, additive Gaussian noise has the effect of convolving the original GLCM with the GLCM of that additive noise. Thus, one can model the noise in the deblurred image 314, form the GLCM 318 of the noise of the deblurred image, and deblur the GLCM directly by deconvolving the GLCM 316 of the deblurred image 314 with the GLCM 318 of the noise of the deblurred image to recover the target GLCM 320. Such deconvolutions may be implemented in a variety of ways including classical Fourier methods, iterative model-based deconvolution, or neural network based techniques. This process is illustrated in
While noisy and blurry measurements yield different GLCM results than the original target data, embodiments can largely recover the GLCM. This processing is noise magnifying; however, the approach as shown and described in reference to
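The GLCM deconvolution step can be sketched with a classical Fourier method. The example below is a minimal illustration under several simplifying assumptions that are not part of the disclosure: GLCMs are normalized to sum to one, gray levels are integers that wrap around (so convolution is circular), and the noise is white and zero-mean, so its GLCM is an outer product of a one-dimensional distribution. In practice the noise of a deblurred image is correlated, and its GLCM would have to be modeled accordingly. All function names and the regularization constant `eps` are hypothetical.

```python
import numpy as np


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for a non-negative pixel offset."""
    a = image[: image.shape[0] - dy, : image.shape[1] - dx]
    b = image[dy:, dx:]
    counts = np.zeros((levels, levels))
    np.add.at(counts, (a.ravel(), b.ravel()), 1.0)
    return counts / counts.sum()


def white_noise_glcm(levels, sigma):
    """GLCM of assumed white, zero-mean Gaussian noise on a wrapped gray-level axis."""
    k = np.arange(levels)
    d = np.minimum(k, levels - k)  # circular distance from zero
    p = np.exp(-0.5 * (d / sigma) ** 2)
    p /= p.sum()
    return np.outer(p, p)


def convolve_glcm(signal_glcm, noise_glcm):
    """Forward model: additive noise circularly convolves the signal GLCM."""
    spectrum = np.fft.fft2(signal_glcm) * np.fft.fft2(noise_glcm)
    return np.real(np.fft.ifft2(spectrum))


def deconvolve_glcm(observed_glcm, noise_glcm, eps=1e-12):
    """Regularized (Wiener-style) Fourier deconvolution of the noise GLCM."""
    h = np.fft.fft2(noise_glcm)
    g = np.fft.fft2(observed_glcm)
    return np.real(np.fft.ifft2(g * np.conj(h) / (np.abs(h) ** 2 + eps)))
```

With measured rather than synthetic GLCMs, the observed matrix carries sampling error, so a larger `eps` or one of the iterative or learned deconvolution approaches mentioned above would likely be needed to keep the inversion stable.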
For weakly non-linear imaging systems, e.g., with possible noise and/or resolution data dependence and/or that can be modeled as locally linear as considered in reference to
This scenario permits application of the radiomics prediction and recovery framework in situations where closed-form inputs for T, S may not be directly specified. For example, this may arise in imaging systems with proprietary (“black box”) processing, or in scenarios with more complex noise and resolution properties including contrast dependence of the point spread function, etc.
Models 602 and 604 directly provide the calibration phantom ground truth, phantom scan, and the patient scan into the prediction/recovery framework. Thus, prediction model 602 and recovery model 604 combine all of these inputs and process them appropriately to model/return the radiomics to a particular image quality level. This may be achieved using a general neural network model for both prediction and recovery. Such a network is trained with radiomics for both truth/standardized image quality inputs as well as calibration (texture) phantoms with known baseline and scans on particular systems, protocols, and other system characteristics.
Some embodiments characterize the performance of highly nonlinear systems using perturbation response/specific stimuli of interest rather than traditional metrics of resolution. To probe these effects, some embodiments insert texture stimuli of a calibration phantom in order to characterize the system response to relevant textures. These stimuli reveal both signal transfer in the conventional sense, and radiomics transfer (of the underlying biological “signal”). In particular, some embodiments use high resolution texture stimuli that are representative of different lung patterns (e.g., normal, fibrotic, ground glass, honeycomb) but that are distinct from the set of features used for validation. Examples of such texture stimuli are shown and described below in reference to
Using pairs of “known” input textures and “corrupted” output textures permits the development of a learned transfer model (e.g., image property predictors 406, 408) that captures the more complex image properties and the particular effects on each stimulus. The trained transfer models can then be integrated within a greater recovery model that seeks not an inversion of the similarly corrupted computed tomography data to “clean/true” computed tomography data, but instead seeks to recover the underlying radiomics (a potentially much easier inversion). There are several ways to integrate the transfer model including: (1) explicit integration as an untrained layer within the recovery network, (2) providing the input-output pairs themselves as auxiliary inputs, which requires the recovery network to learn both the transfer model and the radiomics inversion, and (3) integration of not the forward transfer model, but instead the inverse transfer model, again, as an untrained layer in a larger recovery network. The third option effectively embeds image inversion (“corrupt” → “clean”) within a larger network that can handle the inversion difficulties as they relate to providing accurate radiomics.
As shown,
A range of structures and stimuli may be present in a physical phantom. For flexibility throughout the development effort, some embodiments utilize a modular phantom design, where specific target stimuli inserts may be placed throughout a larger anthropomorphic phantom. Such a modular design allows for: (1) establishment of ground truth for the inserts via a computed tomography scan, (2) refinement and/or alteration of phantoms (e.g., with new inserts), and (3) re-use of the larger anthropomorphic bulk of the phantom, e.g., between calibration and validation.
A number of phantom construction techniques may be used, e.g., traditional manufacturing methods and additive manufacturing techniques. A summary of such techniques appears in the below Table.
3D-printed phantoms may be subject to unknowns in the printing process that require characterization. Thus, according to some embodiments, fabricated texture inserts may be scanned using a high-resolution microCT to establish ground truth. To accommodate contrast differences due to varying technique (e.g., kVp) between microCT and the computed tomography targets, uniform samples may also be scanned and linear fitting may be applied to find ground truth at the computed tomography technique. Print-to-print variability of targets may be quantified by measuring the variability (e.g., standard deviation) of radiomics of the microCT to determine if ground truth values need to be print-specific (e.g., for multiple copies of an insert in a phantom, or in different copies of the same stimulus across collaborating sites).
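The linear fitting and variability steps above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and a first-order polynomial fit between mean values of the uniform samples at the two techniques is assumed to be an adequate contrast mapping.

```python
import numpy as np


def fit_contrast_map(microct_means, ct_means):
    """Fit ct = slope * microct + intercept from uniform-sample mean values."""
    slope, intercept = np.polyfit(microct_means, ct_means, deg=1)
    return float(slope), float(intercept)


def apply_contrast_map(microct_image, slope, intercept):
    """Map a microCT texture image to values expected at the CT technique."""
    return slope * np.asarray(microct_image, dtype=float) + intercept


def print_to_print_variability(radiomics_per_print):
    """Sample standard deviation of each radiomics feature across printed copies.

    radiomics_per_print has shape (n_prints, n_features).
    """
    return np.std(np.asarray(radiomics_per_print, dtype=float), axis=0, ddof=1)
```

A large per-feature standard deviation relative to the feature's expected range would suggest keeping print-specific ground truth values rather than a single shared value per insert design.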
At 802, method 800 acquires patient scan data. The patient scan data may be raw scan data or a slice of a volume or image reconstruction. The patient scan data may be acquired by scanning a patient, or by accessing electronically stored data from a prior patient scan.
At 804, method 800 obtains unstandardized radiomics for the patient scan data acquired at 802. The unstandardized radiomics may be obtained by applying any radiomics technique to all or part of the patient scan data. By way of non-limiting example, the radiomics may be based on stochastic measures of image content, e.g., those derived from histograms and GLCMs.
At 806, method 800 determines standardized radiomics for the patient scan data. The standardized radiomics may be standardized in the sense that they may be effectively unaffected by the particular imaging machine used to produce them. The standardized radiomics may be determined using one or more of radiomics recovery model 204, 404, or 604. The standardized radiomics may thus be based on at least the patient scan data (e.g., $\hat{\mu}$ or y) and the unstandardized radiomics for the patient scan data (e.g., $R(T,S)$). According to embodiments that utilize radiomics recovery models 404 or 604, the standardized radiomics may be further based on analytical models of the imaging system and/or calibration phantom data for the particular imaging machine obtained using at least one calibration phantom, e.g., as shown and described in reference to
At 808, method 800 outputs the standardized radiomics. Method 800 may output the standardized radiomics by displaying on a computer monitor or other display device. Alternatively, or in addition, method 800 may output the standardized radiomics by delivering them over a computer network, e.g., via email. Alternatively, or in addition, method 800 may output the standardized radiomics to radiomics models for clinical decision making.
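The four steps of method 800 can be summarized as a small pipeline skeleton. The sketch below is illustrative only: `run_method_800` and its callable parameters are hypothetical names, and each callable stands in for whichever acquisition, radiomics, recovery, and output mechanism a given embodiment uses.

```python
from typing import Any, Callable, List, Optional


def run_method_800(
    acquire_scan: Callable[[], Any],
    compute_radiomics: Callable[[Any], Any],
    recover_radiomics: Callable[[Any, Any, Optional[Any]], Any],
    outputs: List[Callable[[Any], None]],
    phantom_data: Optional[Any] = None,
) -> Any:
    """Run steps 802-808 with caller-supplied components."""
    scan = acquire_scan()                     # 802: new scan or stored data
    unstandardized = compute_radiomics(scan)  # 804: e.g., histogram or GLCM features
    # 806: recovery uses the scan, its radiomics, and optional phantom data
    standardized = recover_radiomics(scan, unstandardized, phantom_data)
    for sink in outputs:                      # 808: display, network, downstream model
        sink(standardized)
    return standardized
```

Passing the recovery step as a callable keeps the skeleton independent of which recovery model (analytic or learned) is used.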
Processors 910 are communicatively coupled to random access memory 914 operating under control of or in conjunction with an operating system. The processors 910 in embodiments may be included in one or more servers, clusters, or other computers or hardware resources, or may be implemented using cloud-based resources. The operating system may be, for example, a distribution of the Linux™ operating system, the Unix™ operating system, or other open-source or proprietary operating system or platform. Processors 910 may communicate with data store 912, such as a hard drive or drive array, to access or store program instructions and other data. Processors 910 may, in general, be programmed or configured to execute control logic and control operations to implement methods disclosed herein, e.g., method 800 of
Certain embodiments can be performed using a computer program or set of programs. The computer programs can exist in a variety of forms both active and inactive. For example, the computer programs can exist as software program(s) comprised of program instructions in source code, object code, executable code or other formats; firmware program(s); or hardware description language (HDL) files. Any of the above can be embodied on a transitory or non-transitory computer readable medium, which includes storage devices and signals, in compressed or uncompressed form. Exemplary computer readable storage devices include conventional computer system RAM (random access memory), ROM (read-only memory), EPROM (erasable, programmable ROM), EEPROM (electrically erasable, programmable ROM), and magnetic or optical disks or tapes.
While the invention has been described with reference to the exemplary embodiments thereof, those skilled in the art will be able to make various modifications to the described embodiments without departing from the true spirit and scope. The terms and descriptions used herein are set forth by way of illustration only and are not meant as limitations. In particular, although the method has been described by examples, the steps of the method can be performed in a different order than illustrated or simultaneously. Those skilled in the art will recognize that these and other variations are possible within the spirit and scope as defined in the following claims and their equivalents.
This application claims the benefit of, and priority to, U.S. Provisional Patent Application No. 63/121,646, entitled, “Radiomics Standardization,” and filed Dec. 4, 2020, which is hereby incorporated by reference in its entirety.
This invention was made with government support under grant CA219608 awarded by the National Institutes of Health (NIH). The government has certain rights in this invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/061729 | 12/3/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63121646 | Dec 2020 | US |